Below is a list of contributors to this blog.
Name | Role | Bio |
---|---|---|
Joseph Rickert | Ambassador at Large | Joseph is RStudio’s “Ambassador at Large” for all things R and the chief editor of the R Views blog. He works with the rest of the RStudio team and the R Consortium to promote open source activities, the R language and the R Community. Joseph also represents RStudio on the R Consortium board of directors. |
Mine Çetinkaya-Rundel | Data Scientist and Professional Educator | Mine is Professional Educator at RStudio and Assistant Professor of the Practice at Duke University. Her work focuses on innovation in statistics pedagogy, with an emphasis on computation, reproducible research, open-source education, and student-centered learning. She is the author of three open-source introductory statistics textbooks as part of the OpenIntro project and teaches the popular Statistics with R MOOC on Coursera. |
Jonathan Regenstein | Enterprise Advocate | Jonathan studied International Relations as an undergraduate at Harvard, worked in finance at JP Morgan and then did graduate work in Political Economy at Emory University before joining RStudio. |
Sean Lopp | Solutions Engineer | Sean leads teams to create useful, enjoyable products. Before RStudio, he worked as a data scientist on alternative vehicle models at NREL and on infant sleep dynamics; he originally studied mathematics. He lives outside Denver, CO and skis and bikes with his family. |
Nathan Stephens | Director of Solutions Engineering | Nathan has a background in analytic solutions and consulting. He has experience building data science teams, architecting analytic infrastructure, and delivering innovative data products. He is a longtime user of R. |
Edgar Ruiz | Solutions Engineer | Edgar has a background in deploying enterprise reporting and business intelligence solutions. He is the author of multiple articles and blog posts sharing analytics insights and server infrastructure for data science. Recently, Edgar authored the “Data Science on Spark using sparklyr” cheat sheet. |
James Blair | Solutions Engineer | James holds a master’s degree in data science from the University of the Pacific and works as a solutions engineer. His past consulting work centered around helping businesses derive insight from data assets by leveraging R. Outside of R and data science, James’s interests include spending time with his wife and children, cooking, camping, cycling, racquetball, and exquisite food. Also, he never turns down a funnel cake. |
Andrie de Vries | Solutions Engineer | Andrie started using R in 2009 for market research statistics. He is a regular contributor to StackOverflow and co-author of “R for Dummies”. He contributed several R packages to CRAN, including miniCRAN, checkpoint, ggdendro, sss, and surveydata, and regularly speaks at industry events and R user groups. He is a qualified yoga teacher, and continues to study yoga therapy annually in Chennai, India. |
Greg Wilson | Data Scientist and Professional Educator | Greg Wilson has been a programmer, a teacher, and an author, and is now combining all three roles as a data scientist and professional educator at RStudio. He was the co-founder of Software Carpentry, a non-profit organization that teaches basic computing skills to researchers, and co-editor of “Beautiful Code”, “Making Software”, and “The Architecture of Open Source Applications”. In his spare time, he writes children’s books and is learning to play the cello. |
Max Kuhn | Software Engineer | Max Kuhn is a software engineer at RStudio. He is currently working on improving R’s modeling capabilities. He was a Director of Nonclinical Statistics at Pfizer Global R&D in Connecticut. He applied models in the pharmaceutical and diagnostic industries for over 18 years. Max has a Ph.D. in Biostatistics. He is the author of a number of R packages for techniques in machine learning and reproducible research and is an Associate Editor for the Journal of Statistical Software. He and Kjell Johnson wrote the book Applied Predictive Modeling, which won the American Statistical Association’s Ziegel award for the best book reviewed in Technometrics in 2015. Their new book, Feature Engineering and Selection, was released in 2019. |
Alex Gold | Solutions Engineer | Alex is a longtime data nerd who worked on economic policy research, electoral politics, and healthcare at various times. He enjoys cooking and practicing martial arts and handstands in his spare time. |
Cole Arendt | Solutions Engineer | Cole is a solutions engineer and has a background in mathematics and big data. He has architected and managed analytic frameworks for reporting that use R, SAS, and Tableau. He has a diverse set of skills and interests that include soccer, philosophy, economics, and open source software. Cole lives in Raleigh, NC with his wife and two young children. |
Garrett Grolemund | Data Scientist and Professional Educator | Garrett specializes in teaching, data science, and teaching data science. He has a PhD in Statistics, wrote the popular lubridate package, invented the RStudio cheatsheets, and has (co)authored three books: Hands-On Programming with R, R for Data Science, and R Markdown: The Definitive Guide. |
Hadrien Dykiel | Customer Success Representative | Hadrien is an avid adventurer who grew up training horses. He now continues to ride on weekends along with trail running and working towards his private pilot’s license. He fell in love with data science and R after college and has worked in various industries including tech and insurance. He is now a member of the customer success team at RStudio. |
Kelly O’Briant | Solutions Engineer | Kelly O’Briant is a solutions engineer at RStudio interested in configuration and workflow management with a passion for R administration. |
Brian Law | Customer Success | Brian helps people improve their work lives by getting more out of the RStudio tools. He has a Ph.D. in Political Science, a love of the Detroit Red Wings, and several raincoats that are necessary in the Pacific Northwest. |
Once you’ve gotten started learning R, you can expand your skills by exploring many of the specialized capabilities of R. Here are 6 of the most common areas that people who already have some experience in R find particularly rewarding to learn.
Grab some cheat sheets. No one can possibly remember all the functions and arguments for every R package, which is why cheat sheets were invented. RStudio publishes a free collection of cheat sheets for the most popular R features and packages to help jog your memory. If you decide you’d like to collect them all, you can clone the cheat sheet GitHub repository.
Learn to get help. Everyone gets stuck. Learning where and how to ask for R help is a powerful skill to hone. The Tidyverse site offers some expert advice for how to help others help you. One package you’ll grow to love is the reprex package for creating reproducible R code examples. Read through the reprex articles, such as Magic reprex and Using datapasta with reprex, which feature loads of animated GIFs to illustrate the steps. Where to ask for help? The RStudio Community is a warm and welcoming online discussion forum to ask (and answer!) any questions about using R.
Improve your visualizations. You may already know how to create a basic plot using ggplot2, but can you build one that makes your audience go “Wow”? You can start by expanding your knowledge of the Grammar of Graphics and ggplot2 by reading Hadley Wickham’s (2016) book, ggplot2: Elegant Graphics for Data Analysis. Paper and Kindle versions of the second edition are available on Amazon. The third edition is in progress and can be viewed for free online, with the source files on GitHub. If you’d like Hadley to personally explain his philosophy of using ggplot2 in his data science work, check out his talk from OpenVisConf 2017, The Role of Visualization in Exploratory Data Analysis. Bookmark the updated R Graphics Cookbook by Winston Chang (2018) too; it is filled with recipes that tackle specific ggplot2 problems.
Develop interactive applications with htmlwidgets and Shiny. One concrete way to communicate your analyses better is to make your visualizations interactive. You can learn how to add browser-based interactivity to your graphics with just a few lines of code at www.htmlwidgets.org. If your interactivity depends on R code that has to run on a server, learn how to write Shiny applications at shiny.rstudio.com, or follow along as Wickham (2020) writes the new Mastering Shiny book. Both approaches can be integrated with R Markdown to create polished interactive dashboards using the flexdashboard package.
Simplify your model explorations with tidymodels. Much of data science involves modeling, but each modeling package seems to invent its own interface and arguments. Enter tidymodels, a meta-package for modeling and analysis that shares the underlying design philosophy, grammar, and data structures of the tidyverse. If you have previously used caret for a uniform modeling interface, the tidymodels package parsnip is its more up-to-date successor. While this project is still under development, it promises to dramatically simplify model exploration. RStudio’s Edgar Ruiz wrote up A Gentle Introduction to tidymodels to get you started.
Explore other specialized packages. R attracts data scientists because of its more than 13,000 packages that address nearly every use case. If you’re interested in genomics, you’ll want to spend some time learning the Bioconductor collection of packages. If you’re working with Big Data on Spark clusters, check out sparklyr. If you want to dive into finance, you’ll probably want to start with quantmod. To find out what packages you should explore, we recommend some of the topic-based package catalogs such as Awesome R or the CRAN task views.
Bryan, Jennifer, Jim Hester, David Robinson, and Hadley Wickham. 2019. Reprex: Prepare Reproducible Example Code via the Clipboard. https://CRAN.R-project.org/package=reprex.
Chang, Winston. 2018. R Graphics Cookbook: Practical Recipes for Visualizing Data. O’Reilly Media. https://r-graphics.org/.
Chang, Winston, Joe Cheng, JJ Allaire, Yihui Xie, and Jonathan McPherson. 2019. Shiny: Web Application Framework for R. https://CRAN.R-project.org/package=shiny.
Iannone, Richard, JJ Allaire, and Barbara Borges. 2018. Flexdashboard: R Markdown Format for Flexible Dashboards. https://CRAN.R-project.org/package=flexdashboard.
Kuhn, Max, and Davis Vaughan. 2018. Parsnip: A Common API to Modeling and Analysis Functions. https://CRAN.R-project.org/package=parsnip.
Kuhn, Max, and Hadley Wickham. 2018. Tidymodels: Easily Install and Load the ’Tidymodels’ Packages. https://CRAN.R-project.org/package=tidymodels.
Ryan, Jeffrey A., and Joshua M. Ulrich. 2018. Quantmod: Quantitative Financial Modelling Framework. https://CRAN.R-project.org/package=quantmod.
Wickham, Hadley. 2016. Ggplot2: Elegant Graphics for Data Analysis. Springer. https://ggplot2-book.org/.
———. 2020. Mastering Shiny. O’Reilly Media. https://mastering-shiny.org/.
Wickham, Hadley, Winston Chang, Lionel Henry, Thomas Lin Pedersen, Kohske Takahashi, Claus Wilke, Kara Woo, and Hiroaki Yutani. 2019. Ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics.