class: title-slide, center, middle
background-image: url(images/ouaerial.jpeg)
background-size: cover

# .large[.fancy[Introduction to R]]

## .fancy[Ani Ruhil]

---

# .salt[.fancy[Agenda]]

- Install R and RStudio
    - First install the latest version of ![](./images/Rsvgsm.png) from [here](https://cloud.r-project.org)
    - Then install the latest version of ![](./images/Rstudiosm.png) from [here](https://www.rstudio.com/products/rstudio/download/)
    - Install Rtools (for Windows) from [here](https://cran.r-project.org/bin/windows/Rtools/index.html) ... useful if you have to compile packages, which you will have to do from time to time when the package (or its latest version) is not on CRAN
- Test installation
- Install some packages
- Understand how R Markdown works
- Read data in various formats
- Basic data processing and saving
- Fun with leaflet

???

- Go slow and make sure everyone is able to knit
- Minimize panic and keep the environment light

---

## Understand your RStudio Environment

<center><img src = "images/rstudiopanes.png" width = "700px"></center>

???

(1) The Console ...

(2) Knitting and [Code Chunk options](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)

---

### ... key options ...
`Console` = This is where commands are issued to R, either by typing and hitting enter or by running commands from a script (like your R Markdown file)

`Environment` = stores and shows you all the objects created

`History` = shows you a running list of all commands issued to R

`Connections` = shows you any databases/servers you are connected to and also allows you to initiate a new connection

`Files` = shows you files and folders in your current working directory, and you can move up/down in the folder hierarchy

`Plots` = shows you all plots that have been generated

`Packages` = shows you installed packages

`Help` = allows you to get the help pages by typing in keywords

`Viewer` = shows you any "live" documents running on the server

`Knit` = allows you to generate html/pdf/word documents from a script

`Insert` = allows you to insert a vanilla R chunk. You can (and should) give a unique name to each code chunk so that you can easily diagnose which chunk is not working

`Run` = allows you to run lines/chunks

Customize the `detachable` panes via `Tools -> Global Options...`

You also have a spellchecker; use it

---

## Installing packages

Now we install some packages via `Tools -> Install Packages...` and update packages via `Tools -> Check for Package Updates...`<sup>1</sup>

```r
install.packages(
  c("devtools", "reshape2", "lubridate", "car", "Hmisc", "gapminder",
    "leaflet", "DT", "data.table", "htmltools", "scales", "ggridges",
    "here", "knitr", "kableExtra", "haven", "readr", "readxl",
    "ggplot2", "namer")
)
```

Other packages will be installed as needed

.footnote[[1] It is a good idea to update packages on a regular basis but note that every now and then something might break with an update. When this happens check the package's source, usually on `GitHub`, for solutions.]

???

- Make sure they install `devtools`

---

## Rprojects

(1) Create a folder called `ouir`

(2) Inside the `ouir` folder create a subfolder called `data`.
The folder structure will now be as shown below

```
ouir/
  ├── session01.Rmd
  ├── session02.Rmd
  ├── session03.Rmd
  └── data/
       ├── some data file
       └── another data file
```

All data you download or create go into the `data` folder. All R code files reside in the `ouir` folder. Open the Rmd file I sent you, **Module01.Rmd**, and save it in the `ouir` folder. Save the data I sent you to the **data** folder.

(3) Now create a `project` via `File -> New Project` and choose `Existing Directory`. Browse to the `ouir` folder and click `Create Project`. RStudio will restart and when it does you will be in the project folder and will see a file called `ouir.Rproj`

???

- Point out that every time they start working they can click on `ouir.Rproj` and everything should work seamlessly unless something breaks

---

## R Markdown files

- Go to `New File -> R Markdown ...` and enter `My First Rmd File` as the title and your `name` as the author.

<img src="images/Rmd.png" width="280" style="display: block; margin: auto;" />

- Click `OK`.
- Now `File -> Save As..` and save it as `testing.Rmd` in the **ouir** folder
- Click this button: ![](./images/knit.png)

> You may see a message that says some packages need to be installed/updated. Allow these to be installed/updated.

???

- Emphasize the importance of the YAML header (`YAML Ain't Markup Language`)
- Urge patience again since some packages may have to be installed more than once, perhaps via `devtools`, and some may not have admin rights (the horror, the horror!!)
- Show them how to knit to Word and to PDF
- Tell them you will show them how to generate a slide-deck later, if anyone is interested

---

class: inverse, center, top

.pull-left[
As the document knits, watch for error messages

<center><img src = "images/simpsons.gif"></center>
]

.pull-right[
... if all goes well ...
<img src="images/img01.png" width="70%" style="display: block; margin: auto;" />
]

---

## .fat[.fancy[Creating pdf output]]

If you need to create PDF documents, then you will need a working LaTeX setup on your machine. There are other ways to set up a LaTeX system but the easiest might be to run the following code:

```r
install.packages('tinytex')
tinytex::install_tinytex()
# to uninstall TinyTeX, run tinytex::uninstall_tinytex()
```

Post-install, restart RStudio and click `Knit to PDF`

---

### Specific R Markdown code block commands

**Golden Rule:** Unique name for each chunk (no whitespace in name). Forgot? Use the `namer` package

```r
library(namer)
name_chunks("myfilename.Rmd")
```

`eval` = If FALSE, knitr will not run the code in the code chunk.

`include` = If FALSE, knitr will run the chunk but not include the chunk in the final document.

`echo` = If FALSE, knitr will not display the code in the code chunk above its results in the final document.

`error` = If FALSE, knitting stops at the first error; if TRUE, the error message is shown in the document and knitting continues.

`message` = If FALSE, knitr will not display any messages generated by the code.

`warning` = If FALSE, knitr will not display any warning messages generated by the code.

`cache` = If TRUE, knitr caches the results to reuse in future knits until the code chunk is altered.

`dev` = The R function name that will be used as a graphical device to record plots, e.g. dev='CairoPDF'.

`dpi` = A number for knitr to use as the dots per inch (dpi) in graphics (when applicable).

`fig.align` = 'center', 'left', 'right' alignment in the knit document

`fig.height` = height of the figure (in inches)

`fig.width` = width of the figure (in inches)

`out.height, out.width` = The height and width to scale plots to in the final output.
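Each of these options can be set in an individual chunk header, or once for the whole document from a setup chunk. The sketch below shows the document-wide route with `knitr::opts_chunk$set()`; it assumes `knitr` is installed, and the particular values are illustrative assumptions rather than settings used in these slides.

```r
# Illustrative values only: assumptions for this sketch, not the slides' settings.
library(knitr)

opts_chunk$set(
  echo = TRUE,           # show the code above its results
  message = FALSE,       # hide messages (e.g., package start-up notes)
  warning = FALSE,       # hide warnings
  cache = FALSE,         # rerun every chunk on each knit
  fig.align = "center",  # center figures in the knit document
  fig.width = 6,         # figure width in inches
  fig.height = 4,        # figure height in inches
  dpi = 150              # resolution for raster graphics
)

# Inspect some of the defaults now in effect
opts_chunk$get(c("echo", "fig.width", "fig.height"))
```

Any option given in a specific chunk header overrides these document-wide defaults for that chunk only.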
Other options can be found in [the cheatsheet available here](https://www.rstudio.com/wp-content/uploads/2015/03/rmarkdown-reference.pdf)

---

class: inverse, middle, center

# .salt[.fancy[Working with data]]

---

Data will generally mirror one of the following types ... integer, numeric/double, character, logical, date, or a factor

```r
library(tibble)
library(lubridate)
# tibble() replaces the deprecated data_frame()
tibble(
  variable1 = c(1L, 2L, 3L, 4L),
  variable2 = c(2.1, 3.4, 5.6, 7.8),
  variable4 = c("Low", "Medium", "High", "Missing"),
  variable5 = c(TRUE, FALSE, FALSE, TRUE),
  variable6 = ymd(c("2017-05-23", "1776/07/04", "1983-05/31", "1908/04-01")),
  variable7 = as.factor(c("Male", "Female", "Trans", "Trans"))
)
```

```
## # A tibble: 4 x 6
##   variable1 variable2 variable4 variable5 variable6  variable7
##       <int>     <dbl> <chr>     <lgl>     <date>     <fct>
## 1         1       2.1 Low       TRUE      2017-05-23 Male
## 2         2       3.4 Medium    FALSE     1776-07-04 Female
## 3         3       5.6 High      FALSE     1983-05-31 Trans
## 4         4       7.8 Missing   TRUE      1908-04-01 Trans
```

A `date` variable has a very specific meaning for R; the data point must reflect a year, a month, and a day before it is deemed a `valid date format`

You can convert from most formats to another format `(but with care)`

---

## Reading data

Make sure you have the following data-sets in the **data** folder. If you don't then the commands that follow will not work.

We start by reading a simple `comma-separated values` (CSV) format file and then a `tab-delimited` format file.

```r
library(here) # loaded once per session
df.csv <- read.csv("data/ImportDataCSV.csv", sep=",", header=TRUE) # note sep = ","
df.tab <- read.csv("data/ImportDataTAB.txt", sep="\t", header=TRUE) # note sep = "\t"
```

If the files were read then `Environment` should show objects called `df.csv` and `df.tab`.
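The same check can be run from the console. The sketch below is only an illustration: `check_data_files()` is a hypothetical helper written for this example (it is not part of the slides or of any package), while `file.exists()` and `exists()` are base R. It reports whether the two files can be found from the current working directory and whether the two objects were created.

```r
# Hypothetical helper (not part of the original slides): report which of
# the expected files can be found from the current working directory.
check_data_files <- function(paths) {
  found <- file.exists(paths)
  names(found) <- paths
  found
}

# TRUE means the file is where read.csv() expects it
check_data_files(c("data/ImportDataCSV.csv", "data/ImportDataTAB.txt"))

# TRUE means the object was created in this session
sapply(c("df.csv", "df.tab"), exists)
```

A `FALSE` in the first result points to a file or folder problem; a `FALSE` only in the second means the files are in place but the read commands have not run successfully.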
If you don't see these then check the following:

- Make sure you have the csv/txt files in your **data** folder
- Make sure the folder has been correctly named (no blank spaces before or after, all lowercase, etc)
- Make sure the data folder is inside **ouir**

.fancy[.heat[Note the assignment operator... ]] `df.tab <- read.csv(...)` `\(=\)` `df.tab = read.csv(...)` but I prefer `<-`

???

- Point out the importance of setting the data path to `data/filename.ext`

---

.heat[.fancy[Excel]] files can be read via the `readxl` package

```r
library(readxl)
df.xls <- read_excel("data/ImportDataXLS.xls")
df.xlsx <- read_excel("data/ImportDataXLSX.xlsx")
```

.heat[.fancy[SPSS, Stata, SAS]] files can be read via the `haven` package

```r
library(haven)
df.stata <- read_stata("data/ImportDataStata.dta")
df.sas <- read_sas("data/ImportDataSAS.sas7bdat")
df.spss <- read_sav("data/ImportDataSPSS.sav")
```

---

.heat[.fancy[Fixed-width]] files: It is also common to encounter fixed-width files where the raw data are stored without any gaps between successive variables. However, these files will come with documentation that will tell you where each variable starts and ends, along with other details about each variable.

<center><img src = "./images/fwftxt.png" width = "200px"></center>

```r
df.fw <- read.fwf("data/fwfdata.txt", widths = c(4, 9, 2, 4), header = FALSE,
                  col.names = c("Name", "Month", "Day", "Year"))
```

Notice we need `widths = c()` and `col.names = c()`

* `widths` specifies how many slots each variable/field occupies
* `col.names` indicates the names to be assigned to each variable/field

Now an example of an even larger fixed-width file

1. Download the BRFSS data from [here](https://www.cdc.gov/brfss/annual_data/2017/files/LLCP2017ASC.zip)
2. Extract the ASCII data file and place it in your data directory
3. Now copy-and-paste the code I sent you via Slack into a single R code-chunk
4. 
Run the code chunk

---

class: inverse, middle, center

![](images/magic.gif)

---

## Reading Files from the Web

It is possible to specify the full web-path for a file and read it in, rather than storing a local copy. This is often useful when the file is updated by the source (Census Bureau, Bureau of Labor Statistics, Bureau of Economic Analysis, etc.)

```r
fpe <- read.table("http://data.princeton.edu/wws509/datasets/effort.dat")
test <- read.table("https://stats.idre.ucla.edu/stat/data/test.txt", header = TRUE)
test.csv <- read.csv("https://stats.idre.ucla.edu/stat/data/test.csv", header = TRUE)
library(foreign)
hsb2.spss <- read.spss("https://stats.idre.ucla.edu/stat/data/hsb2.sav")
df.hsb2.spss <- as.data.frame(hsb2.spss)
```

- `hsb2.spss` was read with the `foreign` package<sup>2</sup>, an alternative to `haven`
- `foreign` calls `read.spss` while `haven` calls `read_spss`

.footnote[[2] The `foreign` package will also read Stata and other formats. I end up defaulting to `haven` now. There are other packages for reading SPSS, SAS, etc. files ... `sas7bdat`, `rio`, `data.table`, `xlsx`, `XLConnect`, `gdata` and others. ]

???
- Point out that they must have an internet connection or else the file won't be read
- Remind them that if the source file's URL changes the file may not be read, but it is easy to check if a broken URL is the source of the error by using a browser

---

## Reading compressed files

```r
temp <- tempfile()
# Build the URL with paste0() so no line break ends up inside the string
download.file(
  paste0("ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/",
         "Datasets/NVSS/bridgepop/2016/pcen_v2016_y1016.sas7bdat.zip"),
  temp,
  mode = "wb" # binary mode keeps the zip file intact on Windows
)
oursasdata <- haven::read_sas(unz(temp, "pcen_v2016_y1016.sas7bdat"))
unlink(temp)
```

You can save your data in a format that R will recognize, giving it the `RData` or `rdata` extension

```r
save(oursasdata, file = "data/oursasdata.RData")
save(oursasdata, file = "data/oursasdata.rdata")
```

Check your **data** directory to confirm both files are present

---

name: section
class: top, inverse

---

name: yourturn
template: section

.left-column[
# <i class = "fas fa-edit"></i><br>.fancy[Your turn]
]

---

name: yourturn1
template: yourturn

.right-column[
.large[(1) Read in the CSV data found [here](https://raw.githubusercontent.com/words-sdsc/coursera/master/big-data-2/csv/census.csv)

(2) Read in the Excel data found [here](https://github.com/vera-institute/incarceration_trends/blob/master/incarceration_trends.xlsx)
]

.heat[.fancy[Make sure you save both files with the .RData extension]]
]

.left-column[
## .fancy[.saltinline[10:00 minutes]]
]

```{=html}
<div class="countdown blink-colon noupdate-15" id="timer_5fde928c" style="right:33%;bottom:0;" data-audio="true" data-warnwhen="0">
<code class="countdown-time"><span class="countdown-digits minutes">10</span><span class="countdown-digits colon">:</span><span class="countdown-digits seconds">00</span></code>
</div>
```

---

## Minimal example of data processing

Working with the **hsb2** data: 200 students from the High School and Beyond study

```r
hsb2 <- read.table('https://stats.idre.ucla.edu/stat/data/hsb2.csv', header = TRUE, sep = ",")
```

- `female` = (0/1)
- `race` = (1=hispanic 2=asian
3=african-amer 4=white) - `ses` = socioeconomic status (1=low 2=middle 3=high) - `schtyp` = type of school (1=public 2=private) - `prog` = type of program (1=general 2=academic 3=vocational) - `read` = standardized reading score - `write` = standardized writing score - `math` = standardized math score - `science` = standardized science score - `socst` = standardized social studies score --- ```{=html} <div id="htmlwidget-a0725a91bfe6aa9f75a7" style="width:100%;height:auto;" class="datatables html-widget"></div> <script type="application/json" data-for="htmlwidget-a0725a91bfe6aa9f75a7">{"x":{"filter":"none","data":[[70,121,86,141,172,113,50,11,84,48,75,60,95,104,38,115,76,195,114,85,167,143,41,20,12,53,154,178,196,29,126,103,192,150,199,144,200,80,16,153,176,177,168,40,62,169,49,136,189,7,27,128,21,183,132,15,67,22,185,9,181,170,134,108,197,140,171,107,81,18,155,97,68,157,56,5,159,123,164,14,127,165,174,3,58,146,102,117,133,94,24,149,82,8,129,173,57,100,1,194,88,99,47,120,166,65,101,89,54,180,162,4,131,125,34,106,130,93,163,37,35,87,73,151,44,152,105,28,91,45,116,33,66,72,77,61,190,42,2,55,19,90,142,17,122,191,83,182,6,46,43,96,138,10,71,139,110,148,109,39,147,74,198,161,112,69,156,111,186,98,119,13,51,26,36,135,59,78,64,63,79,193,92,160,32,23,158,25,188,52,124,175,184,30,179,31,145,187,118,137],[0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],[4,4,4,4,4,4,3,1,4,3,4,4,4,4,3,4,4,4,4,4,4,4,3,1,1,3,4,4,4,2,4,4,4,4,4,4,4,4,1,4,4,4,4,3,4,4,3,4,4,1,2,4,1,4,4,1,4,1,4,1,4,4,4,4,4,4,4,4,4,1,4,4,4,4,4,1,4,4,4,1,4,4,4,1,4,4,4,4,4,4,2,4,4,1,4,4,4,4,1,4,4,4,3,4,4,4,4,4,3,4,4,1,4,4,1,4,4,4,4,3,1,4,4,4,3,4,4,2,4,3,4,2,4,4,4,4,4,3,1,3,1
,4,4,1,4,4,4,4,1,3,3,4,4,1,4,4,4,4,4,3,4,4,4,4,4,4,4,4,4,4,4,1,3,2,3,4,4,4,4,4,4,4,4,4,2,2,4,2,4,3,4,4,4,2,4,2,4,4,4,4],[1,2,3,3,2,2,2,2,2,2,2,2,3,3,1,1,3,2,3,2,2,2,2,3,2,2,3,2,3,1,2,3,3,2,3,3,2,3,1,2,2,2,2,1,3,1,3,2,2,2,2,3,2,2,2,3,1,2,2,2,2,3,1,2,3,2,2,1,1,2,2,3,2,2,2,1,3,3,2,3,3,1,2,1,2,3,3,3,2,3,2,1,3,1,1,1,2,3,1,3,3,3,1,3,2,2,3,1,1,3,2,1,3,1,3,2,3,3,1,1,1,2,2,2,1,3,2,2,3,1,2,1,2,2,1,3,2,2,2,2,1,3,2,2,2,3,2,2,1,1,1,3,2,2,2,2,2,2,2,3,1,2,3,1,2,1,2,1,2,1,1,2,3,3,1,1,2,2,3,1,2,2,3,2,3,1,2,2,3,1,1,3,2,3,2,2,2,2,2,3],[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,2,2,1,1,1,2,1,2,1,2,1,1,1,2,2,1,1,1,1,1,1,2,1,1,1,1,2,1,1,1,1,2,1,2,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,1,1,2,2,1,1,1,1,2,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,2,1,1,1,1,1,2,1,2,1,1,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,2,1,1,1,1,1,1,1,1,1,1,1,1,2,1,1,1,1,1,1,2,1,1,2,2,1,2,2,1,2,1,1],[1,3,1,3,2,2,1,2,1,2,3,2,2,2,2,1,2,1,2,1,1,3,2,2,3,3,2,3,2,1,1,2,2,3,2,1,2,2,3,3,2,2,2,1,1,1,3,2,2,2,2,2,1,2,2,3,3,3,2,3,2,2,1,1,2,3,2,3,2,3,1,2,2,1,3,2,2,1,3,2,2,3,2,2,3,2,2,3,3,2,2,1,2,2,1,1,2,2,3,2,2,1,2,2,2,2,2,3,1,2,3,2,2,2,2,3,1,2,2,3,1,1,2,3,3,2,2,1,3,3,2,2,3,3,2,2,2,3,3,2,1,2,3,2,2,2,3,2,2,2,2,2,3,1,1,2,3,3,1,2,2,2,2,2,2,3,2,1,2,3,1,3,1,2,1,2,2,2,3,1,2,2,1,2,3,2,1,1,2,2,3,1,3,2,2,1,3,1,1,2]],"container":"<table class=\"display\">\n <thead>\n <tr>\n <th>id<\/th>\n <th>female<\/th>\n <th>race<\/th>\n <th>ses<\/th>\n <th>schtyp<\/th>\n <th>prog<\/th>\n <\/tr>\n <\/thead>\n<\/table>","options":{"fillContainer":false,"searching":false,"pageLength":5,"columnDefs":[{"className":"dt-right","targets":[0,1,2,3,4,5]}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":[],"jsHooks":[]}</script> ``` .fancy[.heat[Notice the absence of value labels]] --- ## Value labels in base R There are no label values for the qualitative/categorical variables (female, race, ses, schtyp, and prog) so we create these.<sup>3</sup> ```r 
hsb2$female.f <- factor(hsb2$female, levels = c(0, 1), labels = c("Male", "Female")) hsb2$race.f <- factor(hsb2$race, levels = c(1:4), labels = c("Hispanic", "Asian", "African American", "White")) hsb2$ses.f <- factor(hsb2$ses, levels = c(1:3), labels = c("Low", "Middle", "High")) hsb2$schtyp.f <- factor(hsb2$schtyp, levels = c(1:2), labels = c("Public", "Private")) hsb2$prog.f <- factor(hsb2$prog, levels = c(1:3), labels = c("General", "Academic", "Vocational")) ``` .footnote[[3] This is just a quick run through with creating value labels; we will cover this in more detail in a later module. ] --- ```{=html} <div id="htmlwidget-2d9f22d2f6d245f560a5" style="width:100%;height:auto;" class="datatables html-widget"></div> <script type="application/json" data-for="htmlwidget-2d9f22d2f6d245f560a5">{"x":{"filter":"none","data":[[70,121,86,141,172,113,50,11,84,48,75,60,95,104,38,115,76,195,114,85,167,143,41,20,12,53,154,178,196,29,126,103,192,150,199,144,200,80,16,153,176,177,168,40,62,169,49,136,189,7,27,128,21,183,132,15,67,22,185,9,181,170,134,108,197,140,171,107,81,18,155,97,68,157,56,5,159,123,164,14,127,165,174,3,58,146,102,117,133,94,24,149,82,8,129,173,57,100,1,194,88,99,47,120,166,65,101,89,54,180,162,4,131,125,34,106,130,93,163,37,35,87,73,151,44,152,105,28,91,45,116,33,66,72,77,61,190,42,2,55,19,90,142,17,122,191,83,182,6,46,43,96,138,10,71,139,110,148,109,39,147,74,198,161,112,69,156,111,186,98,119,13,51,26,36,135,59,78,64,63,79,193,92,160,32,23,158,25,188,52,124,175,184,30,179,31,145,187,118,137],["Male","Female","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male",
"Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Male","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female","Female"],["White","White","White","White","White","White","African American","Hispanic","White","African American","White","White","White","White","African American","White","White","White","White","White","White","White","African American","Hispanic","Hispanic","African American","White","White","White","Asian","White","White","White","White","White","White","White","White","Hispanic","White","White","White","White","African American","White","White","African 
American","White","White","Hispanic","Asian","White","Hispanic","White","White","Hispanic","White","Hispanic","White","Hispanic","White","White","White","White","White","White","White","White","White","Hispanic","White","White","White","White","White","Hispanic","White","White","White","Hispanic","White","White","White","Hispanic","White","White","White","White","White","White","Asian","White","White","Hispanic","White","White","White","White","Hispanic","White","White","White","African American","White","White","White","White","White","African American","White","White","Hispanic","White","White","Hispanic","White","White","White","White","African American","Hispanic","White","White","White","African American","White","White","Asian","White","African American","White","Asian","White","White","White","White","White","African American","Hispanic","African American","Hispanic","White","White","Hispanic","White","White","White","White","Hispanic","African American","African American","White","White","Hispanic","White","White","White","White","White","African American","White","White","White","White","White","White","White","White","White","White","White","Hispanic","African American","Asian","African American","White","White","White","White","White","White","White","White","White","Asian","Asian","White","Asian","White","African 
American","White","White","White","Asian","White","Asian","White","White","White","White"],["Low","Middle","High","High","Middle","Middle","Middle","Middle","Middle","Middle","Middle","Middle","High","High","Low","Low","High","Middle","High","Middle","Middle","Middle","Middle","High","Middle","Middle","High","Middle","High","Low","Middle","High","High","Middle","High","High","Middle","High","Low","Middle","Middle","Middle","Middle","Low","High","Low","High","Middle","Middle","Middle","Middle","High","Middle","Middle","Middle","High","Low","Middle","Middle","Middle","Middle","High","Low","Middle","High","Middle","Middle","Low","Low","Middle","Middle","High","Middle","Middle","Middle","Low","High","High","Middle","High","High","Low","Middle","Low","Middle","High","High","High","Middle","High","Middle","Low","High","Low","Low","Low","Middle","High","Low","High","High","High","Low","High","Middle","Middle","High","Low","Low","High","Middle","Low","High","Low","High","Middle","High","High","Low","Low","Low","Middle","Middle","Middle","Low","High","Middle","Middle","High","Low","Middle","Low","Middle","Middle","Low","High","Middle","Middle","Middle","Middle","Low","High","Middle","Middle","Middle","High","Middle","Middle","Low","Low","Low","High","Middle","Middle","Middle","Middle","Middle","Middle","Middle","High","Low","Middle","High","Low","Middle","Low","Middle","Low","Middle","Low","Low","Middle","High","High","Low","Low","Middle","Middle","High","Low","Middle","Middle","High","Middle","High","Low","Middle","Middle","High","Low","Low","High","Middle","High","Middle","Middle","Middle","Middle","Middle","High"],["Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Private","Public","Public","Public","Private","Public","Private","Public","Private","Public","Public","Publi
c","Private","Private","Public","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Private","Public","Public","Public","Public","Private","Public","Private","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Private","Private","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Public","Public","Private","Public","Public","Public","Public","Public","Private","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Public","Private","Public","Public","Public","Public","Public","Public","Private","Public","Public","Private","Private","Public","Private","Private","Public","Private","Public","Public"],["General","Vocational","General","Vocational","Academic","Academic","General","Academic","General","Academic","Vocational","Academic","Academic","Academic","Academic","General","Academic","General","Academic","General","General","Vocational","Academic","Academic","Vocational","Vocational","Academic","Vocational","Academic","General","General","Academic","Academic","Vocational","Academic","General","Academic","Academic","Vocational","Vocational","Academic","Academic","Academic","General","General","General","Vocational","Aca
demic","Academic","Academic","Academic","Academic","General","Academic","Academic","Vocational","Vocational","Vocational","Academic","Vocational","Academic","Academic","General","General","Academic","Vocational","Academic","Vocational","Academic","Vocational","General","Academic","Academic","General","Vocational","Academic","Academic","General","Vocational","Academic","Academic","Vocational","Academic","Academic","Vocational","Academic","Academic","Vocational","Vocational","Academic","Academic","General","Academic","Academic","General","General","Academic","Academic","Vocational","Academic","Academic","General","Academic","Academic","Academic","Academic","Academic","Vocational","General","Academic","Vocational","Academic","Academic","Academic","Academic","Vocational","General","Academic","Academic","Vocational","General","General","Academic","Vocational","Vocational","Academic","Academic","General","Vocational","Vocational","Academic","Academic","Vocational","Vocational","Academic","Academic","Academic","Vocational","Vocational","Academic","General","Academic","Vocational","Academic","Academic","Academic","Vocational","Academic","Academic","Academic","Academic","Academic","Vocational","General","General","Academic","Vocational","Vocational","General","Academic","Academic","Academic","Academic","Academic","Academic","Vocational","Academic","General","Academic","Vocational","General","Vocational","General","Academic","General","Academic","Academic","Academic","Vocational","General","Academic","Academic","General","Academic","Vocational","Academic","General","General","Academic","Academic","Vocational","General","Vocational","Academic","Academic","General","Vocational","General","General","Academic"]],"container":"<table class=\"display\">\n <thead>\n <tr>\n <th>id<\/th>\n <th>female.f<\/th>\n <th>race.f<\/th>\n <th>ses.f<\/th>\n <th>schtyp.f<\/th>\n <th>prog.f<\/th>\n <\/tr>\n 
<\/thead>\n<\/table>","options":{"searching":false,"pageLength":5,"columnDefs":[{"className":"dt-right","targets":0}],"order":[],"autoWidth":false,"orderClasses":false,"lengthMenu":[5,10,25,50,100]}},"evals":[],"jsHooks":[]}</script>
```

### .salt[.fancy[save your work!!]]

Having added labels to the factors in __hsb2__ we can now save the data for later use.

```r
save(hsb2, file = "data/hsb2.RData")
```

---

name: yourturn2
template: yourturn

.right-column[
.large[(1) Read in the three SAS data-sets sent to you via `Slack`

(2) In `xclass`, create value labels for `online_flag`

  - 0 = Not Online
  - 1 = Online]
]

.left-column[
## .fancy[.saltinline[10:00 minutes]]
]

.right-column[

]

---

## Saving data, objects, and workspaces

.pull-left[
You save your data via `save(dataname, file = "filepath/filename.RData")` or `save(dataname, file = "filepath/filename.rdata")`

```r
data(mtcars)
save(mtcars, file = "data/mtcars.RData")
*rm(list = ls()) # To clear the Environment
load("data/mtcars.RData")
```
]

.pull-right[
You can also save multiple data sets in a single file as follows:

```r
data(mtcars)
library(ggplot2)
data(diamonds)
*save(mtcars, diamonds, file = "data/mydata.RData")
rm(list = ls()) # To clear the Environment
load("data/mydata.RData")
```
]

---

If you want to save just a single `object` from the environment and then load it in a later session, maybe with a different name, then you should use `saveRDS()` and `readRDS()`

```r
data(mtcars)
saveRDS(mtcars, file = "data/mydata.RDS")
rm(list = ls()) # To clear the Environment
ourdata = readRDS("data/mydata.RDS")
```

If instead you did the following, the data will be loaded under its `original name`, `mtcars`, even though you assigned the result to `ourdata`. `load()` only returns the names of the objects it restored, so `ourdata` holds the character string "mtcars" rather than the data

```r
data(mtcars)
save(mtcars, file = "data/mtcars.RData")
rm(list = ls()) # To clear the Environment
ourdata = load("data/mtcars.RData") # Note ourdata is listed as "mtcars"
```

If you want to save everything you have done in the work session you can via `save.image()`

```r
save.image(file =
"mywork_jan182018.RData") ``` - The next time you start RStudio this image will be automatically loaded - Useful if you have a lot of R code you have written and various objects generated and do not want to start from scratch the next time around. ??? Let them know that if not in a project and they try to close RStudio after some code has been run, they will be prompted to save (or not) the `workspace` and they should say "no" --- class: inverse, center, middle # .salt[.fancy[Some useful housekeeping commands]] --- `summary(dataname)` will give you a snapshot of your data `glimpse(dataname)` does the same if you are using the `tidyverse` library `dim(dataname)` will give you the dimensions of the data frame `str(dataname)` will give you the structure of the data frame ... each variable's type and other details `names(dataname)` will give you the names of all columns as well as the column position (i.e., number) `head(dataname, x)` will give you the first `\(x\)` rows of the data frame `tail(dataname, x)` will give you the last `\(x\)` rows of the data frame `clean_names(dataname)` from the `janitor` package will clean up messy column names (i.e., ensuring that all column names are lowercase and have no blank spaces, etc) --- name: yourturn3 template: yourturn .right-column[ .large[Run each of the following commands on `xstudent` - names() - summary() - glimpse() # tidyverse must be loaded for this to work to get a feel for the output] ] .left-column[ ## .fancy[.saltinline[10:00 minutes]] ] --- ## Calculating some basic statistics `mean(dataname$varname, na.rm = TRUE)` will give you the mean of a variable `median(dataname$varname, na.rm = TRUE)` will give you the median of a variable `sd(dataname$varname, na.rm = TRUE)` will give you the standard deviation of a variable `var(dataname$varname, na.rm = TRUE)` will give you the variance of a variable `min(dataname$varname, na.rm = TRUE)` will give you the minimum of a variable `max(dataname$varname, na.rm = TRUE)` will 
give you the maximum of a variable

`quantile(dataname$varname, probs = c(0.25, 0.75), na.rm = TRUE)` will give you the first and third quartiles of a variable

`scale(dataname$varname)` will give you the z-scores of a variable; `scale()` has no `na.rm` argument and skips missing values on its own when it computes the mean and standard deviation

Note that .heatinline[na.rm = TRUE] drops all cases with missing values before calculating quantities of interest. If you forget this switch and the variable has missing values, most of these functions will return `NA`, and `quantile()` will stop with an error message

---
class: inverse, center, middle

# .salt[.fancy[Mapping in R with leaflet]]

<img src="images/elaineyes.gif" width="350px" />

---

`leaflet` is an R package that wraps Leaflet, an easy-to-learn JavaScript library for generating interactive maps

```r
library(leaflet)         # core mapping functions
library(leaflet.extras)  # provides setMapWidgetStyle()
library(widgetframe)     # provides frameWidget()

leaflet() %>%
  # center the map on the given coordinates
  setView(lat = 39.322577, lng = -82.106336, zoom = 14) %>%
  # add the default base map tiles
  addTiles() %>%
  setMapWidgetStyle() %>%
  # embed the map in a frame 275 pixels tall
  frameWidget(height = '275')
```

- `setView()` centers the map at the given lat/lng
- `zoom =` sets the zoom level (larger values zoom in closer)

---

...
drop a pin on Building 21

```r
leaflet() %>%
  setView(lat = 39.322577, lng = -82.106336, zoom = 15) %>%
  addMarkers(lat = 39.319984, lng = -82.107084, popup = c("The Ridges, Building 21")) %>%
  addTiles() %>%
  setMapWidgetStyle() %>%
  frameWidget(height = '325')
```

---
class: right, middle

## .left[.heat[.fancy[Questions??]]]

<img class="circle" src="https://github.com/aniruhil.png" width="175px"/>

# Find me at...

[@aruhil](http://twitter.com/aruhil)

[aniruhil.org](https://aniruhil.org)

[ruhil@ohio.edu](mailto:ruhil@ohio.edu)
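
---

## Backup: what `na.rm = TRUE` does

A minimal sketch of the missing-value behavior described on the basic statistics slide. The vector `scores` and its values are made up for illustration and are not part of the workshop data. It compares `mean()` with and without `na.rm = TRUE`, and checks that a z-score built by hand from `mean()` and `sd()` agrees with `scale()`.

```r
# Hypothetical toy data with one missing value (not from the workshop files)
scores <- c(10, 20, NA, 40)

# Without na.rm the missing value propagates to the result
mean_with_na <- mean(scores)               # NA

# With na.rm = TRUE the missing value is dropped first
mean_no_na <- mean(scores, na.rm = TRUE)   # mean of 10, 20 and 40

# z-scores by hand: subtract the mean, divide by the standard deviation
z_manual <- (scores - mean_no_na) / sd(scores, na.rm = TRUE)

# scale() returns a one-column matrix; as.numeric() turns it back into a vector
z_scale <- as.numeric(scale(scores))

# Both versions keep NA in the third position and agree elsewhere
all.equal(z_manual, z_scale)
```

The same pattern should apply to any numeric column, for example `dataname$varname` in place of `scores`.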