tidyselect and tidyeval cheat sheet

How to refer to columns when programming with dplyr.

The tidyverse functions offer two ways to refer to columns in data frames:

  1. tidyselect, which is used, for example, by select, across, and pivot_longer,
  2. tidyeval (a.k.a. data-masking), which is used, for example, by arrange, filter, mutate, and summarize.

The "Programming with dplyr" vignette gives a useful overview of both methods, and more details can be found in the documentation of the rlang and tidyselect packages.

Here, I want to give a condensed summary of how to select columns if your input variables are character vectors, quoted expressions (created with vars), or data-mask function arguments. Note that this post is written for dplyr version 1.1.2; these semantics have changed in the past, and the patterns shown here might not be the recommended ones in future versions.

I will use a subset of the mtcars dataset as example data:

library(tidyverse)
df <- mtcars %>%
  rownames_to_column("name") %>%
  select(name, mpg, cyl) %>%
  slice(1:3)
df
##            name  mpg cyl
## 1     Mazda RX4 21.0   6
## 2 Mazda RX4 Wag 21.0   6
## 3    Datsun 710 22.8   4

Character vectors

tidyselect

char_vec <- c("mpg", "cyl")

df %>% select(all_of(char_vec))
df %>% pivot_longer(all_of(char_vec), names_to = "feature", values_to = "value")

tidyeval

df %>% mutate(cyl_plus_10 = .data$cyl + 10)
df %>% mutate(cyl_plus_10 = !!rlang::sym(char_vec[2]) + 10)
df %>% mutate("{char_vec[2]}_plus_10" := !!rlang::sym(char_vec[2]) + 10)
df %>% mutate(across(all_of(char_vec), \(x) x * 2))
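
When the column name is stored in a string, the .data pronoun can also be subset with [[ instead of $, which avoids the detour via rlang::sym. The following is a minimal sketch of that variant, assuming only dplyr is attached; the data frame is rebuilt by hand so the snippet is self-contained, and the result names (res_mutate, res_filter, res_arrange) are only illustrative.

```r
library(dplyr)

# Same three rows as the example data, rebuilt without mtcars
df <- data.frame(
  name = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710"),
  mpg  = c(21.0, 21.0, 22.8),
  cyl  = c(6, 6, 4)
)
char_vec <- c("mpg", "cyl")

# .data[[...]] looks up the column whose name is held in the string
res_mutate  <- df %>% mutate(cyl_plus_10 = .data[[char_vec[2]]] + 10)
res_filter  <- df %>% filter(.data[[char_vec[2]]] > 4)
res_arrange <- df %>% arrange(desc(.data[[char_vec[1]]]))
```

The same pattern should carry over to the other data-masking verbs, for example summarize and group_by.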

Quoted expressions

tidyselect

quoted_expr <- vars(mpg, cyl)

df %>% select(!!!quoted_expr)
df %>% pivot_longer(all_of(map_chr(quoted_expr, rlang::as_name)), 
                    names_to = "feature", values_to = "value")

tidyeval

df %>% mutate(cyl_plus_10 = .data$cyl + 10)
df %>% mutate(cyl_plus_10 = !!quoted_expr[[2]] + 10)
df %>% mutate("{as_label(quoted_expr[[2]])}_plus_10" := !!quoted_expr[[2]] + 10)
df %>% mutate(across(all_of(map_chr(quoted_expr, rlang::as_name)), \(x) x * 2))
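
The list of quoted expressions does not have to come from vars. As a hedged sketch, the snippet below builds it with rlang::quos instead, on the assumption that both produce the same kind of quosure list, and splices it into one tidyselect verb and one data-masking verb. The hand-built data frame and the result names are only illustrative.

```r
library(dplyr)

df <- data.frame(
  name = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710"),
  mpg  = c(21.0, 21.0, 22.8),
  cyl  = c(6, 6, 4)
)

# A list of quosures, built without vars()
quoted_expr <- rlang::quos(mpg, cyl)

# Splice into a tidyselect verb ...
res_select <- df %>% select(!!!quoted_expr)

# ... and into a data-masking verb
res_distinct <- df %>% distinct(!!!quoted_expr)

# Convert the quosures to column names with base R instead of purrr
col_names <- unname(vapply(quoted_expr, rlang::as_name, character(1)))
res_across <- df %>% mutate(across(all_of(col_names), \(x) x * 2))
```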

Data-mask function arguments

tidyselect

fnc1 <- function(arg) df %>% select({{arg}})
fnc1(mpg)
fnc2 <- function(...) df %>% select(...)
fnc2(mpg, cyl)
fnc3 <- function(args) df %>% select(!!! args)
fnc3(vars(mpg, cyl))

tidyeval

fnc1 <- function(arg) df %>% mutate(arg_plus_10 = {{arg}} + 10)
fnc1(mpg)
fnc2 <- function(arg) df %>% mutate("{{arg}}_plus_10" := {{arg}} + 10)
fnc2(mpg)
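
If a function argument is meant to be a column selection rather than a computation, it can be embraced inside across, so that the caller may use any tidyselect syntax. This is a small sketch of that idea; double_cols is a hypothetical helper, and the data frame is rebuilt by hand to keep the snippet self-contained.

```r
library(dplyr)

df <- data.frame(
  name = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710"),
  mpg  = c(21.0, 21.0, 22.8),
  cyl  = c(6, 6, 4)
)

# Hypothetical helper: `cols` is forwarded as a tidyselect specification
double_cols <- function(data, cols) {
  data %>% mutate(across({{ cols }}, \(x) x * 2))
}

res_bare   <- double_cols(df, c(mpg, cyl))           # bare column names
res_helper <- double_cols(df, where(is.numeric))     # selection helper
res_chr    <- double_cols(df, all_of(c("mpg", "cyl"))) # character vector
```

All three calls select the same two columns here, so they should give the same result.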

For more advanced cases, such as forwarding data-masked arguments to a tidyselect context like across or pivot_longer, we can use the transmute-as-bridge pattern:

datamask_to_names <- function(data, ...){
  inputs <- transmute(data, ...)
  names(inputs)
}
fnc3 <- function(...) df %>% mutate(across(all_of(datamask_to_names(df, ...)), \(x) x * 2))
fnc3(mpg, cyl)
fnc4 <- function(args) df %>% mutate(across(all_of(datamask_to_names(df, !!!args)), \(x) x * 2))
fnc4(vars(mpg, cyl))
fnc5 <- function(args) df %>% pivot_longer(all_of(datamask_to_names(df, !!!args)), 
                                           names_to = "feature", values_to = "value")
fnc5(vars(mpg, cyl))
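
To make the intermediate step of the bridge visible, the sketch below first inspects what datamask_to_names returns and then wraps the pattern in a function that takes the data frame as an argument instead of relying on the global df. The helper double_selected and the hand-built data frame are assumptions made for illustration.

```r
library(dplyr)

df <- data.frame(
  name = c("Mazda RX4", "Mazda RX4 Wag", "Datsun 710"),
  mpg  = c(21.0, 21.0, 22.8),
  cyl  = c(6, 6, 4)
)

# Bridge: evaluate the data-masked inputs, keep only the resulting column names
datamask_to_names <- function(data, ...) {
  inputs <- transmute(data, ...)
  names(inputs)
}

# The bridge turns bare column names into a character vector
nms <- datamask_to_names(df, mpg, cyl)

# Hypothetical wrapper that passes the data frame explicitly
double_selected <- function(data, ...) {
  cols <- datamask_to_names(data, ...)
  data %>% mutate(across(all_of(cols), \(x) x * 2))
}

res <- double_selected(df, mpg, cyl)
```

Passing the data frame explicitly keeps the function usable on inputs other than df.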

Session Info

sessionInfo()
## R version 4.3.0 (2023-04-21)
## Platform: x86_64-apple-darwin20 (64-bit)
## Running under: macOS Big Sur 11.7.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: Europe/Berlin
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] lubridate_1.9.2 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.2    
##  [5] purrr_1.0.1     readr_2.1.4     tidyr_1.3.0     tibble_3.2.1   
##  [9] ggplot2_3.4.4   tidyverse_2.0.0
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.3     jsonlite_1.8.7   compiler_4.3.0   tidyselect_1.2.0
##  [5] jquerylib_0.1.4  scales_1.2.1     yaml_2.3.7       fastmap_1.1.1   
##  [9] R6_2.5.1         generics_0.1.3   knitr_1.43       bookdown_0.34   
## [13] munsell_0.5.0    tzdb_0.4.0       bslib_0.4.2      pillar_1.9.0    
## [17] rlang_1.1.1      utf8_1.2.3       stringi_1.7.12   cachem_1.0.8    
## [21] xfun_0.39        sass_0.4.6       timechange_0.2.0 cli_3.6.1       
## [25] withr_2.5.0      magrittr_2.0.3   digest_0.6.31    grid_4.3.0      
## [29] rstudioapi_0.14  hms_1.1.3        lifecycle_1.0.3  vctrs_0.6.2     
## [33] evaluate_0.21    glue_1.6.2       blogdown_1.17    fansi_1.0.4     
## [37] colorspace_2.1-0 rmarkdown_2.22   tools_4.3.0      pkgconfig_2.0.3 
## [41] htmltools_0.5.5