data sources
company
web
survey
customer
logistics
finance
data access
public record
open API (application programming interface)
data type
qualitative data
describes with words
quantitative data
measures with numbers
continuous data (any value within a range) vs. discrete data (specific, countable values)
structured vs. unstructured
data storage
what?
where?
how?
structured tables
relational database
SQL query
unstructured document
document database
NoSQL query
how to access later?
query
relational database
SQL query
document database
NoSQL query
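example: the two access paths above, as a minimal sketch. The table, the field names and the find() helper are made up for illustration. The relational side uses Python's built-in sqlite3 module. The document side uses a plain list of dicts as a stand-in for a document database, since a real one would offer its own query API.

```python
import sqlite3

# Structured data: rows in a relational table, accessed with an SQL query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "Berlin"), (2, "Ben", "Paris"), (3, "Cleo", "Berlin")],
)
sql_result = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customers WHERE city = ? ORDER BY id", ("Berlin",)
    )
]
conn.close()

# Unstructured data: documents whose fields may differ from one to the next.
# This list is only a stand-in for a document database (assumption).
documents = [
    {"_id": 1, "name": "Ada", "city": "Berlin", "tags": ["vip"]},
    {"_id": 2, "name": "Ben", "city": "Paris"},
    {"_id": 3, "name": "Cleo", "city": "Berlin", "notes": "prefers email"},
]


def find(docs, criteria):
    """Hypothetical helper: return documents matching every key/value in criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


doc_result = [d["name"] for d in find(documents, {"city": "Berlin"})]
```

both queries ask the same question (who is in Berlin?). The SQL version relies on a fixed schema, while the document version tolerates missing or extra fields.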
data pipeline
ETL
extract raw data
transform to database schema
load data into database
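example: the three ETL steps, as a minimal sketch. The CSV content, the orders table and the function names are assumptions chosen for illustration. An in-memory SQLite database stands in for the target database.

```python
import csv
import io
import sqlite3

# Raw input as it might arrive from a source system (made-up sample data).
RAW_CSV = """order_id,customer,amount
1,Ada,19.90
2,Ben,5.00
"""


def extract(text):
    """Extract: read the raw data as a list of string-valued dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: convert each raw row to the types the schema expects."""
    return [
        (int(r["order_id"]), r["customer"].strip(), float(r["amount"]))
        for r in rows
    ]


def load(conn, records):
    """Load: write the transformed records into the database table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
```

keeping each step in its own function makes it possible to test or replace one step without touching the others.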
tidy data
make proper table
observations in rows
features in columns
delete duplicates
create unique id
unify
data types
spellings
deal with missing and faulty values
replace
drop
keep
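example: the tidy-data steps above, as a minimal sketch in plain Python. The sample rows, the field names and the rule that a negative age counts as faulty are all assumptions for illustration. A real data set would need its own rules.

```python
# Made-up raw rows: one observation per dict, one feature per key.
raw_rows = [
    {"name": "Ada ", "city": "berlin", "age": "36"},
    {"name": "Ada", "city": "Berlin", "age": "36"},   # duplicate once unified
    {"name": "Ben", "city": "PARIS", "age": ""},      # missing value
    {"name": "Cleo", "city": "Berlin", "age": "-5"},  # faulty value
]


def unify(row):
    """Unify spellings (whitespace, capitalisation) and data types (age as int)."""
    age_text = row["age"].strip()
    return {
        "name": row["name"].strip(),
        "city": row["city"].strip().title(),
        "age": int(age_text) if age_text else None,
    }


def tidy(rows, default_age=None):
    """Delete duplicates, handle missing and faulty values, add a unique id."""
    seen = set()
    result = []
    for row in map(unify, rows):
        key = (row["name"], row["city"], row["age"])
        if key in seen:
            continue  # delete duplicates
        seen.add(key)
        if row["age"] is not None and row["age"] < 0:
            continue  # drop rows with a faulty value
        if row["age"] is None:
            # replace the missing value; with default_age=None it is kept as missing
            row["age"] = default_age
        result.append({"id": len(result) + 1, **row})  # create unique id
    return result


tidy_rows = tidy(raw_rows)
```

calling tidy(raw_rows, default_age=0) would replace the missing age with 0 instead of keeping it as missing. Which of replace, drop or keep is appropriate depends on the data and the later analysis.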