In this set of exercises, we will practice filtering rows and rearranging their order.
As in the previous set of data wrangling exercises, we first need to load the tidyverse
package(s) and import the data. For these exercises, it is also advisable to keep the codebook for the data set open.
library(tidyverse)
gp_covid <- read_csv2("./data/ZA5667_v1-1-0.csv")
Using base R, let’s create a new data set named gp_covid_married that only contains data from respondents who reported being married.
The variable that holds marital status is called marstat, and the value indicating that the respondent is married is 1. Remember that there are two options in base R for filtering rows (the same ones as for selecting columns).
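A minimal sketch of base R row filtering on a small made-up data frame. The object toy and all of its values are invented for illustration; only the variable name marstat and the code 1 come from the exercise. It also assumes that the two options meant here are bracket indexing and subset().

```r
# Toy data standing in for gp_covid (all values are made up)
toy <- data.frame(
  id = 1:5,
  marstat = c(1, 2, 1, 3, NA)
)

# Option 1: indexing with [ ]; which() keeps only rows where the test is TRUE,
# so rows with a missing marstat do not turn into all-NA rows
toy_married_brackets <- toy[which(toy$marstat == 1), ]

# Option 2: subset(), which also drops rows where the condition is NA
toy_married_subset <- subset(toy, marstat == 1)

nrow(toy_married_brackets)
nrow(toy_married_subset)
```

Without which(), the comparison toy$marstat == 1 yields NA for missing values, and bracket indexing would then return a row filled with NA for each of them.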
Now use the dplyr function for filtering rows: Create an object named gp_covid_afd_voters that only contains respondents who report that they intend to vote in the next German federal election and that they intend to vote for the right-wing populist party AfD (Alternative fuer Deutschland).
The variables we need are called intention_to_vote and choice_of_party, and the values we want to filter for are 2 (Yes) and 6 (AfD), respectively.
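A minimal sketch of combining two conditions in dplyr's filter(), again on made-up data. The tibble toy and its values are hypothetical; the variable names and the codes 2 and 6 are taken from the hint above.

```r
library(dplyr)

# Toy data standing in for gp_covid (all values are made up)
toy <- tibble(
  id = 1:5,
  intention_to_vote = c(2, 2, 1, 2, NA),
  choice_of_party = c(6, 1, 6, 6, 6)
)

# Conditions separated by a comma are combined with a logical AND;
# rows where a condition evaluates to NA are dropped
toy_afd_voters <- toy %>%
  filter(intention_to_vote == 2, choice_of_party == 6)

toy_afd_voters
```

Writing the two conditions with & instead of a comma should give the same result.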
Again using dplyr, create another subset of cases called gp_covid_middle_aged that only includes respondents aged 36 to 50.
The variable we need is called age_cat, and the values of that variable we are looking for are 4 to 6. You can use the helper function between() here (remember that the values you provide to this function are inclusive).
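A minimal sketch of between() inside filter(), using a made-up tibble (toy and its values are hypothetical; only age_cat and the range 4 to 6 come from the hint above).

```r
library(dplyr)

# Toy data standing in for gp_covid (all values are made up)
toy <- tibble(
  id = 1:6,
  age_cat = c(3, 4, 5, 6, 7, NA)
)

# between(x, left, right) is shorthand for x >= left & x <= right,
# so both boundary values (4 and 6) are kept
toy_middle_aged <- toy %>%
  filter(between(age_cat, 4, 6))

toy_middle_aged
```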
Let’s use base R for this task: Sort the gp_covid data set in descending order of the household variable. You can overwrite the original gp_covid object for this task. Have a look at the resulting data frame to check if your code worked.
You need the base R function order() here. You can check your result using head(). To limit the amount of output, you can subset columns using [ ] within the head() command (household is the 13th variable in the data set, so you could, e.g., subset columns 6:13).
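A minimal sketch of sorting with order() and checking the result with head(). The data frame toy is made up and has only three columns, so the column subset 2:3 used here is a stand-in for the 6:13 suggested above.

```r
# Toy data standing in for gp_covid (all values are made up)
toy <- data.frame(
  id = 1:5,
  sex = c(1, 2, 2, 1, 2),
  household = c(2, 5, 1, 3, 4)
)

# order() returns the row positions that sort the vector;
# decreasing = TRUE sorts from the largest to the smallest value
toy_sorted <- toy[order(toy$household, decreasing = TRUE), ]

# Look at the first rows, limited to a subset of columns
head(toy_sorted[, 2:3], n = 3)
```

Assigning the result back to the same object name (here it would be toy) is what overwriting the original object amounts to.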
Now let’s use a function from the dplyr package. To restore the original order of the gp_covid data set, sort in ascending order of the id variable. As in the previous task, check whether your code worked, but this time using a (short) pipe chain and a dplyr function for catching a glimpse of your data.
The dplyr function you are looking for is in another castle… Just kidding (and apologies for the silly “Super Mario” reference here… that’s what happens when you work with pipes more than a plumber does), it’s arrange().
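A minimal sketch of arrange() followed by a short pipe chain into glimpse(), on a made-up tibble whose rows start out shuffled (toy and its values are hypothetical; only the variable names id and household come from the exercises).

```r
library(dplyr)

# Toy data standing in for a gp_covid that was sorted by household
# (all values are made up)
toy <- tibble(
  id = c(2, 5, 4, 1, 3),
  household = c(5, 4, 3, 2, 1)
)

# arrange() sorts in ascending order by default;
# wrapping a column in desc() would reverse the direction
toy_restored <- toy %>%
  arrange(id)

# Short pipe chain to check the result
toy_restored %>%
  glimpse()
```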