About the Example Data File

 

Most of the examples use a data file called mpg. It contains 234 observations and 11 variables, a subset of the fuel economy data that the EPA makes available at http://fueleconomy.gov. The subset includes only models that had a new release every year between 1999 and 2008, which was used as a proxy for the popularity of the car. The data set is included in the R package ggplot2.

Details (11 variables)

  1. manufacturer.
  2. model.
  3. displ. engine displacement, in litres
  4. year.
  5. cyl. number of cylinders
  6. trans. type of transmission
  7. drv. f = front-wheel drive, r = rear-wheel drive, 4 = four-wheel drive
  8. cty. city miles per gallon
  9. hwy. highway miles per gallon
  10. fl. fuel type
  11. class.
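
To confirm the dimensions and variable names listed above, the data set can be inspected directly. This is a minimal sketch that assumes ggplot2 is installed; the name mpg_raw is only an illustrative choice and is not used elsewhere in these examples.

```r
# Take the copy of mpg that ships with ggplot2 (assumes the package is installed)
mpg_raw <- ggplot2::mpg

dim(mpg_raw)    # number of rows and columns: 234 observations, 11 variables
names(mpg_raw)  # variable names, in the order of the list above
str(mpg_raw)    # type and first few values of each variable
```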

In addition, two new binary variables are created for analysis:

  1. cyl_six. cyl_six = 1 if cyl >= 6, and cyl_six = 0 if cyl < 6
  2. drv_front. drv_front = 1 if drv = 'f', and drv_front = 0 if drv is not 'f'

You can get the mpg file as a SAS data file by clicking here. If you use R, you can simply use the following code to generate the data set.

library(ggplot2)
# Copy the mpg file into global environment
mpg <- mpg

# Create new variable cyl_six: 1 if six or more cylinders, 0 otherwise
mpg$cyl_six <- ifelse(mpg$cyl >= 6, 1, 0)

# Create new variable drv_front: 1 if front-wheel drive, 0 otherwise
mpg$drv_front <- ifelse(mpg$drv == 'f', 1, 0)

# Make the columns available by name (e.g. cyl instead of mpg$cyl)
attach(mpg)
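
As a quick sanity check on the recoding above, each original variable can be cross-tabulated against its new indicator. The sketch below works on a separate copy (the name mpg_check is hypothetical) so that it runs on its own, and it assumes ggplot2 is installed.

```r
# Work on a separate copy so the check is self-contained
mpg_check <- ggplot2::mpg
mpg_check$cyl_six   <- ifelse(mpg_check$cyl >= 6, 1, 0)
mpg_check$drv_front <- ifelse(mpg_check$drv == "f", 1, 0)

# Each level of the original variable should fall in exactly one column
table(mpg_check$cyl, mpg_check$cyl_six, dnn = c("cyl", "cyl_six"))
table(mpg_check$drv, mpg_check$drv_front, dnn = c("drv", "drv_front"))

# Stop with an error if any observation was left uncoded
stopifnot(!anyNA(mpg_check$cyl_six), !anyNA(mpg_check$drv_front))
```

If a level of cyl or drv showed counts in both columns, or the final check failed, the recoding would need another look.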
