never-look-up-tidyr’s-pivot-wider-or-pivot-longer-again!

Never look up tidyr’s pivot_wider or pivot_longer again!

Love tidyr but can never remember exactly how pivot_wider and pivot_longer work? With RStudio code snippets, you can write a snippet once and always have a fill-in-the-blank code template at your fingertips.




Never look up tidyr’s pivot_wider or pivot_longer again!

Thinkstock

Many tidyverse users turn to the tidyr R package for reshaping data. But I’ve seen people say they can’t remember exactly how its pivot_wider() and pivot_longer() functions work. Luckily, there’s an easy solution: RStudio code snippets. Write a snippet once, and what’s basically a fill-in-the-blank form will always be at your fingertips.

From wide to long

To go from wide to long with tidyr, use the pivot_longer() function. It has the following syntax:

pivot_longer(mydata, 

cols, # columns that should pivot from wide to long (unquoted)

names_to, # name of the new category column as a quoted string

values_to # name of the new value column as a quoted string

)

For a simple example, let’s look at the well-known mtcars data set, which has a wide format. It doesn’t have a column category at all — models are row names but not in their own variable — so I used the tibble package’s handy rownames_to_column() function to add a Model column from the row names:

library(tibble)

library(tidyr)

mtcars <- tibble::rownames_to_column(mtcars, "Model")
head(mtcars)
              Model  mpg cyl disp  hp drat    wt  qsec vs am gear carb
1         Mazda RX4 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2     Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3        Datsun 710 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4    Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5 Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
6           Valiant 18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

To convert mtcars into “tidy” or long format, all the columns starting from mpg to the last one (carb) should shift into just two new columns — one for category and another for value:

mtcars_long <- pivot_longer(mtcars,

cols = mpg:carb,

names_to = "Category",

values_to = "Value"

)

Below is code for a reusable RStudio code snippet that generates a fill-in-the-blank code explainer for pivot_longer() whenever you invoke it. This is not R code; it’s snippet code to generate R code:

snippet plonger

pivot_longer(${1:mydf},

cols = ${2:columns to pivot long},

names_to = "${3:desired name for category column}",

values_to = "${4:desired name for value column}"

)

All the items enclosed with ${} are snippet variables. If you don’t know how RStudio code snippets work, you can watch this tutorial video:

You can add the snippets code to your RStudio R code snippets file with the following R command:

usethis::edit_rstudio_snippets()

That opens the snippets file within RStudio. You can then copy and paste — or type — code into that file. Note that every line under the first line within the snippet and the name of the snippet (plonger) must start with a tab.

Dropdown list with the name of a code snippet when starting to type the snippet name Sharon Machlis, IDG

Start typing the name of a snippet to select and use it.

To use the snippet, start typing plonger and the snippet name will appear as a choice to select and use. Watch the video embedded at the very top of this article if you want to see the pivot_longer() snippet in action. 

Added bonus: The snippet already includes quotation marks where they’re needed, so you don’t have to add them.

From long to wide

To reshape the other way, use pivot_wider():

pivot_wider(mydata, 

id_cols, # optional vector of columns you do not want affected

names_from, # category column(s) to pivot from long to wide

values_from # value columns(s) that hold data for each category column

names_sep # optional string separator for category-value columns

)

For this demo I’ll use the us_rent_income data frame, which has data by US state for median annual income and median monthly rent, plus margins of error for each.

  GEOID NAME    variable estimate   moe
              
1 01    Alabama income      24476   136
2 01    Alabama rent          747     3
3 02    Alaska  income      32940   508
4 02    Alaska  rent         1200    13
5 04    Arizona income      27517   148
6 04    Arizona rent          972     4

For a more human-readable version, I’d want income and rent to each have their own columns: income, income margin of error, rent, and rent margin of error. 

Code for that:

rent_income_wide <- pivot_wider(us_rent_income,

# id_cols = optional vector of unaffected columns,

names_from = c(variable),

values_from = c(estimate, moe),

names_sep = "_"

)

The RStudio code snippet below will generate reusable fill-in-the-blank code and explanations for each pivot_wider() argument.

snippet pwider

pivot_wider(${1:mydf},

# id_cols = ${2:optional vector of unaffected columns},

names_from = c(${3:category column(s) to pivot wide}),

values_from = c(${4:value column(s) that hold data for each category column}),

names_sep = "_"

)

You can download a text file with both pivot snippets below. Right-click and save the link as a file on your system; don’t copy and paste from the file displayed in a browser.

download

Copy and paste this code into your RStudio R code snippets file Sharon Machlis

For more R tips, head to the InfoWorld Do More With R page.

Sharon Machlis is Executive Editor, Data & Analytics at IDG, where she works on data analysis and in-house editor tools in addition to writing and editing. Her book Practical R for Mass Communication and Journalism was published in December 2018.

Copyright © 2021 IDG Communications, Inc.