Training

Take this course

Schedule an on-site training at your location, or suggest a public workshop.


Overview

Are you interested in better understanding your data, and not so interested in mastering a programming language? Have you tried learning R from a book or website, but have been discouraged? If so, this is the course for you. We assume that you’ve never programmed before (although some experience doesn’t hurt), and we teach you the best tools to help you analyze your data.

You won’t be a master programmer by the end of this two-day course, but through immersion you will have learned the basics of R’s syntax and grammar, and you’ll have started building an effective R vocabulary for visualizing, transforming, and modeling data. You will learn how to load, save, and transform data as well as how to write functions, generate beautiful graphs, and fit basic statistical models to your data. We’ll give you a theoretical framework to help you understand the process of data analysis, but our focus is on practical tools that you can use as soon as you get back from the course.

All techniques are motivated by real problems, and you’ll be exposed to a number of real datasets throughout the course. We alternate brief lectures with hands-on practice: you’ll get plenty of experience actually using R (not just hearing about it!), and there’s plenty of help available if you get stuck. The course concludes with a 90-minute data analysis project. You can use this as an opportunity to start using R with your data, or work on answering some of our questions about a dataset.

This tried-and-true course has been taken by over 200 students, from biologists to humanists, many of whom had never programmed before. It teaches the basic skills needed by anyone seriously interested in data.

What will you learn?

Practical skills for visualizing, transforming, and modeling data in R. During this two-day course, you will learn how to explore and understand data as well as how to do basic programming in R.

Day 1

An Introduction to R and data analysis - R is more than just a programming language. R is a statistical software application in its own right, an environment for interactive data analysis, and a community of passionate users. This orientation to the R language will help you get up and running.

  • How to download and update R and RStudio
  • How to find resources and help for R
  • Stages of data analysis
  • Best practices of data analysis

Visualizing data - R is well known for its beautiful graphics. R packages, like ggplot2, provide an expressive and logical language for building clear and effective data visualizations.

  • Visualize the distribution of a variable
  • Explore and plot relationships between variables
  • Display very large data sets through graphs without over-plotting
  • Use best practices for Exploratory Data Analysis in R code

Working with data - R is a programming language with a purpose: to analyze data. Learning how R stores and handles data will help you apply R to any data source.

  • Load different data formats into R
  • Work with factors in R
  • Clean poorly formatted data
  • Save your data

Manipulating data - R’s methods for data manipulation make it easy and fast to extract information from data sets and to prepare raw data for analysis.

  • Subset, transform, summarize, and reorder data sets
  • Perform targeted, groupwise operations on data
  • Join multiple data sets together

Day 2

Programming in R - Many people use R as an application, a sort of statistical calculator, but R is also a programming language. Once you learn to program in R, you will be a more versatile and capable data analyst. You’ll learn to write code that provides the precise solutions you are looking for.

  • Create an if-else statement
  • Write and optimize "for" and "while" loops in R
  • Use best practices for programming in R

R functions - Functions allow you to save your code for later or to share it with other R users. Knowing how to write a function will also streamline your workflow. Functions give code a more efficient structure that avoids duplication and aids debugging.

  • Organize a problem into a series of functions
  • Write a function in R
  • Apply best practices for writing functions in R

Simulation in R - Simulating data provides a way to test hypotheses and discover the uncertainty in your estimates.

  • Generate random numbers in R
  • Visualize uncertainty with bootstrapping in R
  • Construct a confidence interval with bootstrapping in R
  • Test a hypothesis with a permutation test in R

Modeling in R - R excels at statistical analysis and modeling, but its methods for modeling may seem unintuitive at first.

  • Write a formula in R
  • Fit a model to data in R
  • Compare models
  • Explore data sets with models

Who should take this course?

This class will be a good fit for you if you are just starting with R, or if you have dabbled in R and wish to improve your skills. No prior experience with R or data science is required.

If you are already an R pro, consider taking our Advanced R programming or Package development courses.

What's included?

All participants who register for Intro to data science with R will receive lifetime access to all the slides, exercises, data sets, and R scripts used in the course.

What should I bring?

You need a laptop with the latest version of R installed. Naturally, we recommend the RStudio IDE, but it's not required: bring the R environment that makes you the most productive.
