About Me

Hi! I’m Elizabeth Hance and I love data. After getting my master’s degree in computational mathematics, I started this page as a way to document what I’ve learned about data science. This site is a work in progress, and the topics explored here do not show the full depth and breadth of my knowledge. The explanations are meant to be brief, with simple examples. Feel free to reach out to me using the contact info below!

My dog, Mia

Topics Covered

Content for these topics was gathered from a variety of sources, including DataCamp’s Data Scientist with R Career Track, the Springer book An Introduction to Statistical Learning with Applications in R, Wikipedia, various Medium articles, other online blogs, and my graduate coursework.

Modeling:
- Regression
- Classification
- Clustering
- Gradient Boosting
Validation/Deployment:
- Model Validation
- Model Deployment
Additional Topics:
- Databases
- Cloud Computing
- Shell/Git

Also see Additional_Topics for resources not covered on this page, including:

Causal inference notes
Notes from a deep learning Coursera course
Exploring interactions in a GAM
Brief notes from a pandas tutorial

R Packages

Most of the examples are written in R, so here are some packages that are frequently used:

All cheat sheets

tidyverse
- dplyr - data manipulation
- ggplot2 - data visualization
- tidyr - to tidy/clean data
- readr - read rectangular data
- stringr - working with strings
- haven - for SPSS, Stata, and SAS data
- lubridate - working with dates
- readxl - read Excel files
- purrr - functional programming tools
- tibble - modern data frames
- forcats - working with factors
data.table - fread: to read rectangular data
broom - tidy the output of R modeling functions
DBI - database connections
- odbc
sqldf - use SQL to manipulate a dataframe
RMarkdown
