Ways to Recode Variables in R



There are a lot of ways to recode variables in R; so many, in fact, that this overview can't possibly cover them all. However, this guide will attempt to cover most of the options available in base R, as well as a brief overview of dplyr.

Topics include:

  1. ifelse
  2. match
  3. `[<-` (e.g., named vector look-ups)
  4. gsub
  5. dplyr::case_when
  6. dplyr::recode
  7. Interactive Recoding Function


We'll use nycflights13::flights from Hadley Wickham, given that it has a good mix of character and numeric variables and, while not a small data set, is also not so large as to make experimenting cumbersome. We'll also use the auxiliary nycflights13::airlines data set.

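library(nycflights13)  # flights and airlines data sets
library(dplyr)         # glimpse(), case_when(), recode(), %>%

df <- flights
glimpse(df)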
## Observations: 336,776
## Variables: 19
## $ year           <int> 2013, 2013, 2013, 2013, 2013, 2013, 2013, 2013,...
## $ month          <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ day            <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
## $ dep_time       <int> 517, 533, 542, 544, 554, 554, 555, 557, 557, 55...
## $ sched_dep_time <int> 515, 529, 540, 545, 600, 558, 600, 600, 600, 60...
## $ dep_delay      <dbl> 2, 4, 2, -1, -6, -4, -5, -3, -3, -2, -2, -2, -2...
## $ arr_time       <int> 830, 850, 923, 1004, 812, 740, 913, 709, 838, 7...
## $ sched_arr_time <int> 819, 830, 850, 1022, 837, 728, 854, 723, 846, 7...
## $ arr_delay      <dbl> 11, 20, 33, -18, -25, 12, 19, -14, -8, 8, -2, -...
## $ carrier        <chr> "UA", "UA", "AA", "B6", "DL", "UA", "B6", "EV",...
## $ flight         <int> 1545, 1714, 1141, 725, 461, 1696, 507, 5708, 79...
## $ tailnum        <chr> "N14228", "N24211", "N619AA", "N804JB", "N668DN...
## $ origin         <chr> "EWR", "LGA", "JFK", "JFK", "LGA", "EWR", "EWR"...
## $ dest           <chr> "IAH", "IAH", "MIA", "BQN", "ATL", "ORD", "FLL"...
## $ air_time       <dbl> 227, 227, 160, 183, 116, 150, 158, 53, 140, 138...
## $ distance       <dbl> 1400, 1416, 1089, 1576, 762, 719, 1065, 229, 94...
## $ hour           <dbl> 5, 5, 5, 5, 6, 5, 6, 6, 6, 6, 6, 6, 6, 6, 6, 5,...
## $ minute         <dbl> 15, 29, 40, 45, 0, 58, 0, 0, 0, 0, 0, 0, 0, 0, ...
## $ time_hour      <dttm> 2013-01-01 05:00:00, 2013-01-01 05:00:00, 2013...
airlines <- airlines
glimpse(airlines)
## Observations: 16
## Variables: 2
## $ carrier <chr> "9E", "AA", "AS", "B6", "DL", "EV", "F9", "FL", "HA", ...
## $ name    <chr> "Endeavor Air Inc.", "American Airlines Inc.", "Alaska...


What are the options? Most people start off along the conditionals route. Let's recode departure delay to a categorical variable that indicates whether there was a delay or not.

df$dep_delay_cat <- ifelse(df$dep_delay>0,'delay','no delay')

##    delay no delay
##   128432   200089
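As a minimal sketch on a made-up vector (the values below are illustrative, not taken from flights), ifelse works element by element and passes NA through, which is why the two counts above sum to fewer than 336,776 rows:

```r
# Toy departure delays in minutes (hypothetical values), including a missing one
delays <- c(5, -3, 0, NA, 42)

# ifelse() tests each element; an NA in the test yields NA in the result
delay_cat <- ifelse(delays > 0, 'delay', 'no delay')
delay_cat

# useNA = 'ifany' makes table() report the missing values as well
table(delay_cat, useNA = 'ifany')
```

Whether NA should stay NA or get its own category depends on the analysis, so it is worth checking before you tabulate.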

You can nest ifelse for multiple conditions, but it gets confusing very quickly...

df$dep_delay_cat <- ifelse(df$dep_delay>0,'late',
                    ifelse(df$dep_delay<0,'early','on-time'))

##   early    late on-time
##  183575  128432   16514

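The code below won't work, as `if` is not vectorized: it only looks at the first element of the condition (and from R 4.2.0 onward, a condition of length greater than one is an error rather than the warning shown here)...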

df$dep_delay_cat2 <- `if`(df$dep_delay>0,'delay','no delay')
## Warning in if (df$dep_delay > 0) "delay" else "no delay": the condition has
## length > 1 and only the first element will be used
## [1] "delay" "delay" "delay" "delay" "delay"

If you want to do something along the lines of the above, use Vectorize, but this is ugly, so it's better to use ifelse...

recode_func <- Vectorize(function(x) {
    if (is.na(x)) {x <- NA} # Need to account for NA!
    else if (x>0) {x <- 'delay'}
    else {x <- 'no delay'}
    x  # return the recoded value
})

df$dep_delay_cat2 <- recode_func(df$dep_delay)

## [1] "delay"    "delay"    "delay"    "no delay" "no delay"
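For a sense of what Vectorize is doing, here is a sketch on a toy vector (made-up values; label_delay is a hypothetical name). Vectorize wraps a scalar function in mapply, so the function is called once per element, which is also why it tends to be slower than ifelse on a column the size of dep_delay:

```r
# A scalar function: `if` only ever sees a single value here
label_delay <- function(x) {
  if (is.na(x)) NA_character_
  else if (x > 0) 'delay'
  else 'no delay'
}

# Vectorize() returns a new function that applies label_delay to each element
label_delay_v <- Vectorize(label_delay)

delays <- c(2, -1, NA, 0)   # hypothetical values
via_vectorize <- label_delay_v(delays)
via_ifelse    <- ifelse(delays > 0, 'delay', 'no delay')

# Both approaches should agree element by element
identical(via_vectorize, via_ifelse)
```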


So far, while helpful, the covered functions are all less than graceful when it comes to recoding a larger number of values. This is where R's built-in vectorized subsetting operations come in extremely handy. There are a number of different ways to do this.


First, despite the somewhat perplexing documentation (see ?match), match will reliably allow you to recode a single column or vector based on a key-value pair from another data structure.

df$carrier_full <- airlines$name[match(df$carrier,airlines$carrier)]

## [1] "United Air Lines Inc."  "United Air Lines Inc."
## [3] "American Airlines Inc." "JetBlue Airways"
## [5] "Delta Air Lines Inc."
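It may help to see what match returns on its own. In this sketch (toy data, not the real airlines table), match gives the position of each code in the key column, and those positions are then used to index the value column:

```r
# Toy look-up table (hypothetical rows standing in for airlines)
key   <- c('AA', 'DL', 'UA')
value <- c('American', 'Delta', 'United')

codes <- c('UA', 'AA', 'ZZ', 'UA')   # 'ZZ' has no entry in key

# Position of each code within key; NA where there is no match
pos <- match(codes, key)
pos

# Indexing by those positions performs the recode
recoded <- value[pos]
recoded
```

Codes without a match come back as NA rather than raising an error, so a quick check with anyNA is cheap insurance.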

Named Look-up Table/Subset Method

Using a named vector as a look-up table is equally powerful (and my preferred method).

airline_lookup <- setNames(airlines$name,airlines$carrier)

df$carrier_full <- airline_lookup[df$carrier]    # See also '?replace'

## [1] "United Air Lines Inc."  "United Air Lines Inc."
## [3] "American Airlines Inc." "JetBlue Airways"
## [5] "Delta Air Lines Inc."
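The mechanics, sketched on toy data (the codes and names below are made up): indexing a named vector by a character vector looks up each element by name. The result carries the look-up names along, so unname is an optional extra step if you would rather store a plain vector:

```r
# Names are the old values, elements are the new values
lookup <- setNames(c('American', 'Delta', 'United'), c('AA', 'DL', 'UA'))

codes <- c('UA', 'AA', 'UA')   # hypothetical carrier codes

# Character indexing matches each code against names(lookup)
recoded <- lookup[codes]
names(recoded)

# Drop the names if a plain character vector is preferred
plain <- unname(recoded)
plain
```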

Note that the look-up table needs an entry for each unique value you want to recode; otherwise, you will get missing values.

##                      UA                      UA                    <NA>
## "United Air Lines Inc." "United Air Lines Inc."                      NA
##                      B6                      DL
##       "JetBlue Airways"  "Delta Air Lines Inc."
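One way to guard against this, sketched on toy data (the codes are made up, and falling back to the original code is just one possible choice), is to fill the gaps after the look-up:

```r
lookup <- setNames(c('Delta', 'United'), c('DL', 'UA'))
codes  <- c('UA', 'AA', 'DL')   # 'AA' is missing from lookup

recoded <- unname(lookup[codes])
recoded

# Which codes were not found?
not_found <- unique(codes[is.na(recoded)])
not_found

# Keep the original code wherever the look-up came back empty
filled <- ifelse(is.na(recoded), codes, recoded)
filled
```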

On the other hand, it's perfectly acceptable to have more values/names than are in whatever object you want to recode. In fact, you could conceivably recode every character vector in a data frame using this, should you want to. I also regularly use it within an interactive function to quickly recode objects with a limited number of values. Scroll to the end of the page to see examples of both...

String Substitution (`gsub`)

Although it doesn't save much time, you can use R's native functions for working with strings (or stringr, stringi, etc.) for quick replacements as well...
Note that `gsub` is but one of a number of string functions.

head(gsub('LGA','LaGuardia',df$origin))  # According to many people, 'POS' might be a better substitution...
## [1] "EWR"       "LaGuardia" "JFK"       "JFK"       "LaGuardia" "EWR"
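One thing to watch for: gsub replaces the pattern wherever it occurs, including inside longer strings. A sketch with made-up values, where anchoring the pattern with ^ and $ limits the replacement to exact matches:

```r
origin <- c('LGA', 'JFK', 'LGA-2')   # 'LGA-2' is a hypothetical value

# Unanchored: also rewrites the matching part of 'LGA-2'
loose <- gsub('LGA', 'LaGuardia', origin)
loose

# Anchored: only elements that are exactly 'LGA' are replaced
exact <- gsub('^LGA$', 'LaGuardia', origin)
exact
```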

`dplyr` Functions

There are a number of useful functions from third-party packages as well. While a number of other recoding tutorials mention car::recode from the 'car' package, I choose not to, as 'dplyr' has both dplyr::recode and dplyr::case_when, which I'll briefly discuss.


'case_when' is a good choice when you want to recode based on multiple logical conditions. Although the formula interface is straightforward, the function is not trivial. As the function documentation notes,

'Like an if statement, the arguments are evaluated in order, so you must proceed from the most specific to the most general'

The following not only doesn't follow that rule, but also has a few other problems...

# No No

df <- df %>% mutate(angry = case_when(
    dep_delay == 0 ~ 'on-time',
    dep_delay > 60 ~ 'wayyy too ^$%#ing late',
    dep_delay > 20 ~ 'wayyy late',
    dep_delay > 0 ~ 'late',
    dep_delay < 20 ~ 'wayyy early',
    dep_delay < 0 ~ 'early'))

##                   late                on-time            wayyy early
##                  66799                  16514                 183575
##             wayyy late wayyy too ^$%#ing late
##                  35052                  26581

Instead, do this, as it will capture each unique condition. Note that it may take some trial and error... (at least it did for me, but that may be because I'm an idiot...)

df<- df %>% mutate(angry = case_when(
    dep_delay == 0 ~ 'on-time',
    dep_delay < -20 ~ 'wayyy early',
    dep_delay < 0  ~ 'early',
    dep_delay > 20 &   dep_delay < 60  ~ 'wayyy late',
    dep_delay >= 60 ~ 'wayyy too ^$%#ing late',
    dep_delay > 0 ~ 'late'))

##                  early                   late                on-time
##                 183534                  66799                  16514
##            wayyy early             wayyy late wayyy too ^$%#ing late
##                     41                  34574                  27059
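To see why the order matters, here is a sketch on a toy vector (made-up delays and labels; dplyr is assumed to be installed). case_when returns the value for the first condition that is TRUE, so a general condition listed early shadows any more specific one below it:

```r
library(dplyr)

delay <- c(-30, -5, 0, 10, 45, 90, NA)   # hypothetical values

# General condition first: 'late' swallows everything above zero
general_first <- case_when(
  delay > 0  ~ 'late',
  delay > 60 ~ 'very late',   # never reached
  TRUE       ~ 'other'        # catch-all, also picks up NA
)

# Specific condition first: each value lands in the intended bin
specific_first <- case_when(
  delay > 60 ~ 'very late',
  delay > 0  ~ 'late',
  TRUE       ~ 'other'
)

data.frame(delay, general_first, specific_first)
```

Without the TRUE catch-all, values that match no condition (including NA) come back as NA, which is what happens to the missing delays in the examples above.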


Now let's pretend someone tells you to make your coding scheme more professional and instead of telling them to go $%^# themselves you decide to oblige... Because, whatever...

df$dep_delay_cat2 <- recode(df$angry,
    `on-time` = 'Departed On-Schedule',
    `wayyy early` = 'More than 20 Minutes Ahead of Schedule',
    `early` = 'Ahead of Schedule',
    `wayyy too ^$%#ing late` = 'More than 60 Minutes Behind Schedule',
    `wayyy late` = 'Between 20 and 60 Minutes Behind Schedule',
    .default = 'Behind Schedule')

# Note the '.default' option (as well as other options...)
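A small sketch of how the pieces of recode fit together, on a made-up vector (dplyr is assumed to be installed). Values without a replacement fall through to .default, and missing values are handled separately by .missing. Newer dplyr releases mark recode as superseded in favour of case_match, so treat this as an illustration of the call used above rather than a recommendation:

```r
library(dplyr)

status <- c('early', 'late', 'on-time', NA)   # hypothetical values

recoded <- recode(status,
    `on-time` = 'Departed On-Schedule',   # backticks for non-syntactic names
    early     = 'Ahead of Schedule',
    .default  = 'Behind Schedule',        # anything not named above
    .missing  = 'Unknown')                # NA values
recoded
```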

Recoding En Masse

To demonstrate this, let's upset some nerds and recode every character variable for the dplyr::starwars data set.

# Full data set also includes double and list columns..

  sw <- dplyr::starwars[sapply(dplyr::starwars, is.character)]

  # Note, that I purposefully stay with base R here...

  unique_vals <- unlist(sapply(sw, unique))

  # Multiple NA so we'll just drop them to keep them as is

  unique_vals <- unique_vals[!is.na(unique_vals)]

  # Look-up: names are the original values, values are labels like 'name1'

  wrong_names <- setNames(names(unique_vals), unique_vals)

  wrong_sw <- data.frame(lapply(sw, function(x) {

      x <- wrong_names[x]

  }), stringsAsFactors = FALSE)

  # Now...

                name   hair_color   skin_color   eye_color     gender   homeworld   species
Luke Skywalker  name1  hair_color1  skin_color1  skin_color21  gender1  homeworld1  species1
C-3PO           name2  NA           skin_color2  skin_color23  NA       homeworld1  species2
R2-D2           name3  NA           skin_color3  skin_color20  NA       homeworld2  species2
Darth Vader     name4  hair_color3  hair_color9  skin_color23  gender1  homeworld1  species1
Leia Organa     name5  hair_color4  skin_color5  hair_color4   gender3  homeworld3  species1
Owen Lars       name6  hair_color5  skin_color5  skin_color21  gender1  homeworld1  species1

# And previously..

name            hair_color   skin_color   eye_color  gender  homeworld  species
Luke Skywalker  blond        fair         blue       male    Tatooine   Human
C-3PO           NA           gold         yellow     NA      Tatooine   Droid
R2-D2           NA           white, blue  red        NA      Naboo      Droid
Darth Vader     none         white        yellow     male    Tatooine   Human
Leia Organa     brown        light        brown      female  Alderaan   Human
Owen Lars       brown, grey  light        blue       male    Tatooine   Human

# This is a lot more useful if you have actual values in mind that you want to recode to...
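As a sketch of that more useful case, here is the same lapply pattern applied to a toy data frame with one shared look-up vector (all values and replacements below are made up, and recode_column is a hypothetical helper). Values without an entry are left as they are:

```r
# Toy data frame (hypothetical values)
people <- data.frame(
  hair = c('blond', 'brown', NA),
  eyes = c('blue', 'brown', 'hazel'),
  stringsAsFactors = FALSE
)

# One look-up vector shared by every column
lookup <- c(blond = 'light', brown = 'dark', blue = 'light')

recode_column <- function(x) {
  new <- unname(lookup[x])
  # Fall back to the original value where there is no entry
  ifelse(is.na(new), x, new)
}

# people[] <- keeps the data frame structure while replacing every column
people[] <- lapply(people, recode_column)
people
```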

Interactive Recoding Function

In terms of the function I mentioned above, I use (a variant of) the following to quickly change either a few column values or column names. As the day wears on and I get increasingly tired, elegant subsetting operations become more difficult, so it's nice to fall back upon something I wrote when I was feeling more fresh. Let's call it i_recode, short for interactive recode (VERY ORIGINAL).

i_recode <- function(x) {

    vec <- as.character(x)
    temp <- unique(vec)   # select.list() expects a character vector
    old_vector <- select.list(temp, multiple = TRUE)
    new_vector <- character(length(old_vector))
    for (i in seq_along(old_vector)) {
        cat("\nOld Value:", shQuote(old_vector[[i]]), "\n\n")
        new_vector[[i]] <- readline(prompt = "Please Enter New Value: ")
    }

    # Named look-up: names are the old values, values are the replacements
    new_vector <- setNames(new_vector, old_vector)

    vec[which(vec %in% old_vector)] <- new_vector[vec[which(vec %in% old_vector)]]

    vec
}
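The prompts make the function awkward to script or test. A non-interactive sketch of the same recoding step, where the old and new values are passed in as arguments (the name recode_values and its arguments are hypothetical, not part of the original function):

```r
recode_values <- function(x, old, new) {
  stopifnot(length(old) == length(new))
  vec <- as.character(x)
  # Named look-up: names are the old values, elements are the replacements
  lookup <- setNames(new, old)
  hit <- vec %in% old
  vec[hit] <- lookup[vec[hit]]
  vec
}

result <- recode_values(c('EWR', 'LGA', 'JFK', 'LGA'),
                        old = c('LGA', 'JFK'),
                        new = c('LaGuardia', 'Kennedy'))
result
```

Values that are not listed in old pass through unchanged, which mirrors what happens when you only pick a few items in the interactive version.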




