Introduction to Statistical and Data Science Programming
This course introduces programming for statistics and data science in both R and Python. It starts with basic programming concepts, including variables, data types, data structures (lists, arrays, dictionaries, data frames, etc.), and selection and repetition structures, then moves on to functions and packages, and finishes with data manipulation, exploration, and visualization in both languages.
Textbook References
Python Crash Course, 2nd Edition: A Hands-On, Project-Based Introduction to Programming https://ehmatthes.github.io/pcc_2e/regular_index/
Python Data Science Handbook https://github.com/jakevdp/PythonDataScienceHandbook
R for Data Science https://r4ds.had.co.nz/
Learning R: A Step-by-Step Function Guide to Data Analysis 1st Edition https://cran.r-project.org/web/packages/learningr/
List of topics:
Part 1: Programming in Python
Intro to Python
Variables, Simple Data Types, Lists
If Statements, User Input, Dictionaries, While Loops
Functions, Files
Introduction to NumPy
Data Manipulation with Pandas
Visualization with Matplotlib
Part 2: Programming in R
Intro to R, Data Types and Structures (Vectors, Matrices, Arrays, Lists, Data Frames)
Creating and Calling Functions, If Statements, and Basic Looping
Installing R packages and Importing data
Data visualization: ggplot2 and workflow basics
Data transformation using tidyverse
Workflow: scripts and exploratory data analysis
Workflow: Projects and Tibbles
Data Import and Tidy Data
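To give a flavor of the early Part 1 topics (variables, lists, dictionaries, if statements, while loops, and functions), here is a minimal sketch. It is illustrative only and is not taken from the course materials; the student names, scores, and pass mark of 65 are made-up assumptions.

```python
# Illustrative only: names, scores, and the pass mark are hypothetical.

def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# A dictionary mapping student names to lists of scores.
scores = {"ana": [82, 90, 78], "ben": [55, 61, 70]}

# Repetition (for loop) with selection (conditional expression) inside it.
results = {}
for name, marks in scores.items():
    average = mean(marks)
    results[name] = "pass" if average >= 65 else "fail"

# A while loop: count how many halvings bring 100 below 1.
value, steps = 100.0, 0
while value >= 1:
    value /= 2
    steps += 1

print(results)
print(steps)
```

The same ideas (a named function, a keyed collection, a loop with a condition) carry over to R in Part 2, where lists and data frames play the role of the dictionary here.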