In this set of exercises, we will practice filtering and rearranging the order of rows.

As in the previous set of data wrangling exercises, we first need to load the tidyverse package(s) and import the data. For these exercises, too, it is advisable to have the codebook for the data set open.

library(tidyverse)

gp_covid <- read_csv2("./data/ZA5667_v1-1-0.csv")

1

As a first exercise, using base R, let’s create a new data set named gp_covid_married that only contains data from respondents who reported being married.
The variable representing marital status is named marstat and the value indicating that the respondent is married is 1. Remember that there are 2 options in base R for filtering rows (the same ones as for selecting columns).
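To illustrate the two base R approaches without giving away the exercise on the real data, here is a minimal sketch on a made-up data frame. The object `toy` and its values are invented for illustration; only the column name `marstat` and the code 1 for "married" come from the exercise.

```r
# Toy data frame with made-up values; only the column name mirrors the exercise.
toy <- data.frame(
  id = 1:6,
  marstat = c(1, 2, 1, NA, 3, 1)
)

# Option 1: logical indexing with [ ].
# which() keeps only positions where the condition is TRUE, so rows with a
# missing marstat do not turn into all-NA rows in the result.
toy_married_idx <- toy[which(toy$marstat == 1), ]

# Option 2: subset(), which drops rows where the condition evaluates to NA.
toy_married_sub <- subset(toy, marstat == 1)

toy_married_idx
toy_married_sub
```

Both options should return the same three rows here. Note that plain `toy[toy$marstat == 1, ]` without `which()` would also return a row of NAs for the respondent with a missing value.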

2

Now, let’s use the dplyr function for filtering rows: Create an object named gp_covid_afd_voters that only contains respondents who report that they intend to vote in the next German federal election and that they intend to vote for the right-wing populist party AfD (Alternative fuer Deutschland).
The names of the variables we need here are intention_to_vote and choice_of_party and the values we want to filter for are 2 (Yes), and 6 (AfD), respectively.
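As a rough sketch of how combining two conditions in `filter()` works, the following uses a small made-up tibble rather than the real data. The object names `toy` and `toy_afd` and all values are assumptions for illustration; the variable names and codes are the ones given in the exercise.

```r
library(dplyr)

# Toy tibble with made-up values; variable names and codes follow the exercise.
toy <- tibble(
  id = 1:6,
  intention_to_vote = c(2, 2, 1, 2, NA, 2),
  choice_of_party   = c(6, 1, 6, 6, 6, NA)
)

# Conditions separated by a comma are combined with a logical AND.
# Rows where either condition evaluates to NA are dropped.
toy_afd <- toy %>%
  filter(intention_to_vote == 2, choice_of_party == 6)

toy_afd
```

Writing `intention_to_vote == 2 & choice_of_party == 6` inside `filter()` should give the same result; the comma is simply the more common dplyr idiom.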

3

Using the same function from dplyr, create another subset of cases called gp_covid_middle_aged that only includes respondents aged 36 to 50.
The variable we need for this is called age_cat and the values of that variable we are looking for are 4 to 6. You can use the helper function between() here (remember that the values you provide to this function are inclusive).
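The inclusive behaviour of `between()` can be checked on a small made-up example. The tibble `toy` and its values are invented; only `age_cat` and the range 4 to 6 are taken from the exercise.

```r
library(dplyr)

# Toy tibble with made-up age categories, including one missing value.
toy <- tibble(
  id = 1:8,
  age_cat = c(1, 3, 4, 5, 6, 7, 10, NA)
)

# between(x, left, right) is shorthand for x >= left & x <= right,
# so both boundary values (4 and 6) are kept.
toy_middle <- toy %>%
  filter(between(age_cat, 4, 6))

toy_middle
```

In this sketch, the rows with `age_cat` equal to 4, 5, and 6 remain, while the row with a missing value is dropped.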

4

Let’s briefly turn back to base R for this task: Sort the gp_covid data set in descending order of the household variable. You can overwrite the original gp_covid object for this task. Have a look at the resulting data frame to check if your code worked.
You need the base R function order() here. You can check your result using head(). To limit the amount of output, you can subset columns using [ ] within the head() command (household is the 13th variable in the data set, so you could, e.g., subset columns 6:13).
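A minimal sketch of sorting with `order()` on a made-up data frame might look as follows. The object `toy` and its values are assumptions for illustration; the real data set has more columns, which is why the hint suggests subsetting by position.

```r
# Toy data frame with made-up values, including one missing household size.
toy <- data.frame(
  id = 1:5,
  household = c(2, 5, 1, NA, 3)
)

# order() returns the row positions in sorted order; using them as the row
# index rearranges the data frame. decreasing = TRUE sorts from high to low,
# and missing values are placed last by default.
toy_sorted <- toy[order(toy$household, decreasing = TRUE), ]

# Inspect the first rows, restricted to the columns of interest.
head(toy_sorted[, c("id", "household")], n = 3)
```

Wrapping the variable in a minus sign, as in `order(-toy$household)`, is an alternative for numeric variables.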

5

Let’s rearrange the order of rows again, this time using a function from the dplyr package. To restore the original order of the gp_covid data set, sort in ascending order of the id variable. As in the previous task, check whether your code worked, but this time use a (short) pipe chain and a dplyr function for catching a glimpse of your data.
The dplyr function you are looking for is in another castle… Just kidding (and apologies for the silly “Super Mario” reference here… that’s what happens when you work with pipes more than a plumber does), it’s arrange().
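For illustration, here is a small sketch of `arrange()` followed by a short pipe chain with `glimpse()`, again on a made-up tibble. The object `toy` and its values are invented for this example.

```r
library(dplyr)

# Toy tibble with made-up values, deliberately out of order by id.
toy <- tibble(
  id = c(3, 1, 2),
  household = c(5, 2, 1)
)

# arrange() sorts in ascending order by default;
# wrapping a variable in desc() would reverse that.
toy <- toy %>%
  arrange(id)

# A short pipe chain to check the result.
toy %>%
  glimpse()
```

After this, the rows should appear in the order id 1, 2, 3, with the household values moved along with them.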