Matt Upson

Yo no soy marinero

map_df()

I’m using Hadley’s purrr package more and more, and it’s beginning to change the way I program in R, much like dplyr did.

map() is a great function, and one of its incarnations that I really like is map_df(). This applies a function to each element of a list and then binds the resulting dataframes together (assuming they can be combined). Through its .id argument, it can also add a column to the final dataframe that takes the names of the elements of the list.

Here’s a simple example:
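A minimal sketch along these lines, assuming six named samples of 100 draws each. The list names, means, sample size, seed, and bin count are illustrative choices rather than the original values. It uses tibble() in place of data_frame(), which recent versions of tibble deprecate, and map_df() needs dplyr installed for the row-binding step:

```r
library(purrr)
library(tibble)
library(ggplot2)

set.seed(1)  # illustrative seed, only so the sketch is reproducible

# Six named samples drawn from normal distributions.
# The names (a to f) and the means are hypothetical.
dists <- list(
  a = rnorm(100, mean = 0),
  b = rnorm(100, mean = 1),
  c = rnorm(100, mean = 2),
  d = rnorm(100, mean = 3),
  e = rnorm(100, mean = 4),
  f = rnorm(100, mean = 5)
)

# Turn each vector into a one-column dataframe, then bind the rows.
# .id = "dist" stores each list element's name in a column called dist.
long <- map_df(dists, ~ tibble(x = .x), .id = "dist")

# One histogram per distribution, one panel each.
p <- ggplot(long, aes(x = x)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ dist)

print(p)
```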

So what’s going on here?

• First I create a list of six samples drawn from univariate normal distributions using rnorm().
• Passing this to map_df() with the function (.f) argument as data_frame(x = .x) converts each of these vectors of values into a dataframe, naming the column of values x.
• map_df() essentially does a bind_rows() and outputs a single long dataframe, adding a new variable dist (set through the .id argument) which takes the names of the elements of the list.
• Finally this is passed to ggplot(), which creates histograms with geom_histogram() and facets them into six panels with facet_wrap().

It’s a very simple function, but nonetheless very useful. This is a fairly contrived example, but I find myself using these map() functions a lot recently, especially when training models and working with lists or dataframes full of dataframes.
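As a rough illustration of the model-training case, here is a sketch that fits one linear model per dataframe in a list and collects the coefficients into a single long dataframe. The choice of the built-in mtcars data, the split by cyl, and the mpg ~ wt formula are all assumptions made for the example:

```r
library(purrr)
library(tibble)

# A named list of dataframes, one per cylinder count (names are "4", "6", "8").
by_cyl <- split(mtcars, mtcars$cyl)

# Fit a linear model to each dataframe and return its coefficients as a
# small dataframe; map_df() binds these together, and .id = "cyl" records
# which list element each row came from.
coefs <- map_df(
  by_cyl,
  ~ {
    fit <- lm(mpg ~ wt, data = .x)
    tibble(term = names(coef(fit)), estimate = unname(coef(fit)))
  },
  .id = "cyl"
)

print(coefs)
```

The result has one row per model term and per group, which makes it easy to pass straight on to dplyr or ggplot2.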