EmbRaceR: Data Overview with R
The second course in the EmbRaceR series teaches attendees how to perform an initial data overview. A serious data science project always starts with a data overview: you need to understand how the values are stored and become familiar with how those values are distributed. For a good understanding, you need both graphs and numbers. R is a powerful language for all data science tasks, including data overview. In this session, you will learn about datasets, cases, variables, and types of variables. You will gain a basic understanding of a distribution through introductory statistics for discrete variables and descriptive statistics for continuous variables. The course also covers presenting data through graphs. Finally, you will learn about important statistical terms such as sampling, confidence level, and confidence interval.