Exercise 1_2_3: Statistical Software Files

In this set of exercises, we will work with files from statistical software. The first tasks are about importing data, while the later ones are about labelling and exporting.

1

Import the .sav version of the data from the GESIS Panel Special Survey on the Coronavirus SARS-CoV-2 Outbreak in Germany.

Clues

You need the haven package for this. The file should be stored in the data folder.

solution

library(haven)

gp_covid <-
  read_spss("./data/ZA5667_v1-1-0.sav")

Unlike in flat files, such as CSV, the variables now have labels.
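To see what "having labels" means in practice, here is a minimal sketch using a small toy data frame rather than the GESIS file. The object names (`toy`, `sav_path`, `csv_path`) and the values are made up for illustration. It assumes that haven stores a variable label in the `label` attribute of a column, which is then lost when the same data pass through a CSV file.

```r
library(haven)

# Toy data standing in for the real file (hypothetical values)
toy <- data.frame(sex = c(1, 2, 2))
attr(toy$sex, "label") <- "Geschlecht"

# Round trip through a temporary SPSS file: the label should survive
sav_path <- tempfile(fileext = ".sav")
write_sav(toy, sav_path)
toy_spss <- read_spss(sav_path)
sav_label <- attr(toy_spss$sex, "label", exact = TRUE)

# Round trip through a temporary CSV file: only the values are stored
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)
toy_csv <- read.csv(csv_path)
csv_label <- attr(toy_csv$sex, "label", exact = TRUE)

print(sav_label)
print(csv_label)
```

The second `print()` shows that nothing is attached to the column read from CSV, which is why the label functions used in the following tasks are only useful for data imported from statistical software files (or labelled by hand).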

2

Print the labels of the first ten variables in the data set.

Clues

You can use a function from the sjlabelled package for this. Remember that you can use [ ] to subset columns/variables (we only want to print the labels for the first ten variables).

solution

library(sjlabelled)

get_label(gp_covid[1:10])

##                                za_number                                  version                                      doi 
##              "Studiennummer des Archivs" "Versionskennung und -datum des Archivs"        "Digital Object Identifier (doi)" 
##                                       id                                   cohort                                      sex 
##                           "Befragten-ID"                   "Rekrutierungskohorte"                             "Geschlecht" 
##                                  age_cat                            education_cat                        intention_to_vote 
##                   "Alter, kategorisiert"                 "Bildung, kategorisiert"               "Sonntagsfrage (gbzc011a)" 
##                          choice_of_party 
##         "Sonntagsfrage Wahlentscheidung"

Unfortunately, the labels are all in German. Imagine you are an education researcher working on a publication in English, and you are interested in the variable education_cat. In that case, you may want to translate its variable label into English.

3

Change the variable label of education_cat from “Bildung, kategorisiert” to “Education, categorized”.

Clues

You can, again, use a function from sjlabelled for this.

solution

gp_covid$education_cat <- 
  set_label(
    gp_covid$education_cat, 
    label = "Education, categorized"
  )

get_label(gp_covid$education_cat)

## [1] "Education, categorized"
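If more than one label needs translating, the same set_label() call can be repeated over a lookup table. The following is only a sketch on toy data: the data frame `toy`, the lookup vector `translations`, and the English wording of the labels are assumptions made for illustration, not part of the GESIS data.

```r
library(sjlabelled)

# Toy data with German variable labels (hypothetical values)
toy <- data.frame(sex = c(1, 2, 2), education_cat = c(2, 3, 1))
toy$sex <- set_label(toy$sex, label = "Geschlecht")
toy$education_cat <- set_label(toy$education_cat, label = "Bildung, kategorisiert")

# Hypothetical lookup table: variable name -> English label
translations <- c(
  sex = "Sex",
  education_cat = "Education, categorized"
)

# Overwrite each label in turn
for (var in names(translations)) {
  toy[[var]] <- set_label(toy[[var]], label = translations[[var]])
}

new_labels <- get_label(toy)
print(new_labels)
```

Keeping the translations in one named vector makes it easy to review them in a single place before they are applied.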

Your collaborators ask you to share the data once you have changed the labels. Unfortunately, they use neither R nor SPSS and, hence, ask you to export your data as a Stata file.

4

Export your data as a Stata file.

Clues

The haven package provides a function for writing Stata files. Its name and usage mirror those of the corresponding haven function for importing data in this format.

solution

write_dta(gp_covid, "gesis_panel_corona_fancy_panels_final_final.dta")
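Before sending the file to collaborators, it can be worth checking that the translated label actually ends up in the Stata file. A minimal sketch of such a check, again on toy data and with a temporary file instead of the real export path (`toy` and `dta_path` are made-up names), uses haven's write_dta() and read_dta():

```r
library(haven)

# Toy data with an English variable label (hypothetical values)
toy <- data.frame(education_cat = c(2, 3, 1))
attr(toy$education_cat, "label") <- "Education, categorized"

# Write to a temporary Stata file and read it back in
dta_path <- tempfile(fileext = ".dta")
write_dta(toy, dta_path)
toy_stata <- read_dta(dta_path)

# The variable label should have survived the round trip
dta_label <- attr(toy_stata$education_cat, "label", exact = TRUE)
print(dta_label)
```

The same kind of round trip could be run on the real export, but a full check of value labels and missing values is beyond this sketch.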

Johannes Breuer, Stefan Jünger

Introduction to R for Data Analysis