In this set of exercises, we will practice filtering and rearranging the order of rows.
As in the previous set of data wrangling exercises, we first need to load the tidyverse package(s) and import the data. For these exercises, it is also advisable to keep the codebook for the data set open.
library(tidyverse)
gp_covid <- read_csv2("./data/ZA5667_v1-1-0.csv")
Using base R, let’s create a new data set named gp_covid_married that only contains data from respondents who reported being married.
The variable you need is marstat, and the value indicating that the respondent is married is 1. Remember that there are two options in base R for filtering rows (the same ones as for selecting columns).
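If you want to remind yourself of the two base R options before working on the real data, here is a minimal sketch on a made-up toy data frame. Only the column name marstat is taken from the exercise; the object names and values are hypothetical.

```r
# Toy data; the values are invented for illustration only
toy <- data.frame(
  id      = 1:5,
  marstat = c(1, 2, 1, 3, NA)
)

# Option 1: indexing with [ ]; which() keeps only rows where the test is TRUE,
# so rows with a missing marstat are not returned as NA rows
married_brackets <- toy[which(toy$marstat == 1), ]

# Option 2: subset(), which also drops rows where the condition is NA
married_subset <- subset(toy, marstat == 1)
```

Both objects should contain the same rows; which of the two options you prefer is largely a matter of taste.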
Now use the dplyr function for filtering rows: Create an object named gp_covid_afd_voters that only contains respondents who report that they intend to vote in the next German federal election and that they intend to vote for the right-wing populist party AfD (Alternative fuer Deutschland).
The variables you need are intention_to_vote and choice_of_party, and the values we want to filter for are 2 (Yes) and 6 (AfD), respectively.
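As a rough sketch of how two conditions can be combined in dplyr, the following uses a small made-up tibble. The variable names follow the exercise, while the toy values and the object names toy and afd_toy are assumptions for illustration.

```r
library(dplyr)

# Toy data; the values are invented for illustration only
toy <- tibble(
  id                = 1:5,
  intention_to_vote = c(2, 2, 1, 2, NA),
  choice_of_party   = c(6, 3, 6, 6, 6)
)

# Conditions separated by a comma are combined with a logical AND;
# rows where a condition evaluates to NA are dropped
afd_toy <- toy %>%
  filter(intention_to_vote == 2, choice_of_party == 6)
```

Writing `intention_to_vote == 2 & choice_of_party == 6` inside filter() should give the same result.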
Using dplyr, create another subset of cases called gp_covid_middle_aged that only includes respondents aged 36 to 50.
The variable you need is age_cat, and the values of that variable we are looking for are 4 to 6. You can use the helper function between() here (remember that the values you provide to this function are inclusive).
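To see the inclusive behaviour of between() in action, here is a minimal sketch with invented data. The column name age_cat comes from the exercise; everything else is hypothetical.

```r
library(dplyr)

# Toy data; the values are invented for illustration only
toy <- tibble(
  id      = 1:7,
  age_cat = c(2, 4, 5, 6, 7, 3, 4)
)

# between(x, left, right) is shorthand for x >= left & x <= right,
# so both boundary values (4 and 6) are kept
middle_toy <- toy %>%
  filter(between(age_cat, 4, 6))
```

In this toy example, the rows with age_cat values of 2, 3, and 7 are removed, while the boundary values 4 and 6 remain.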
Use base R for this task: Sort the gp_covid data set in descending order of the household variable. You can overwrite the original gp_covid object for this task. Have a look at the resulting data frame to check if your code worked.
You can use the base R function order() here. You can check your result using head(). To limit the amount of output, you can subset columns using [ ] within the head() command (household is the 13th variable in the data set, so you could, e.g., subset columns 6:13).
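The general pattern for sorting with order() might look like the following sketch. It uses a made-up data frame with only three columns, so the column positions differ from those in the real data set; the object names are hypothetical.

```r
# Toy data; the values are invented for illustration only
toy <- data.frame(
  id        = 1:5,
  sex       = c(1, 2, 2, 1, 2),
  household = c(2, 4, 1, 3, 5)
)

# order() returns the row positions in sorted order;
# decreasing = TRUE sorts from the largest to the smallest value
toy_sorted <- toy[order(toy$household, decreasing = TRUE), ]

# Check the result, limiting the output to columns 2 and 3
head(toy_sorted[, 2:3])
```

Note that the row indices go before the comma in [ ], so order() has to be placed there as well.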
Now use the dplyr package. To restore the original order of the gp_covid data set, sort it in ascending order of the id variable. As for the previous task, check whether your code worked, but this time using a (short) pipe chain and a dplyr function for catching a glimpse of your data.
The dplyr function you are looking for is in another castle… Just kidding (and apologies for the silly “Super Mario” reference here… that’s what happens when you work with pipes more than a plumber does), it’s arrange().
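For reference, a short pipe chain with arrange() could be sketched as follows. The toy tibble and its values are made up; only the variable names id and household come from the exercises.

```r
library(dplyr)

# Toy data in scrambled order; the values are invented for illustration only
toy <- tibble(
  id        = c(3, 1, 2),
  household = c(1, 4, 2)
)

# arrange() sorts in ascending order by default;
# wrapping a variable in desc() would sort in descending order instead
toy <- toy %>%
  arrange(id)

# A short pipe chain for a quick look at the result
toy %>%
  glimpse()
```

glimpse() prints one line per column, which makes it easy to check the first few values of id.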