Cindi Thompson, Chris Wensel, Reynold Xin, Greg Rokita
John Carnahan, Rahul Pathak, Ali Ghodsi, Mark Madsen
To encourage participation, only a session summary will be transcribed.
Topic introduction
- Business Data Strategy
- Relationship between Business Data Strategy and Technical Data Strategy
Technical Data Strategy Requirements
- Easy ingestion of any data by anyone
- Unified streaming and batch processing
- Easy transition from POC to Prod
- DWH as a property of the system (no loading or integration required)
- DWH for future data, not only historical data
- Ease of building and deploying machine learning models
- Cost efficiency, minimal operational cost
The above may or may not be applicable to a given company, as no single Technical Data Strategy fits every company.
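As a rough illustration of the "unified streaming and batch processing" requirement, the sketch below runs one piece of transformation logic over both a bounded batch and a stream-like source. It was not part of the session; the names `enrich` and `event_stream` and the event fields are hypothetical, and a real system would use a processing engine rather than plain Python iterables.

```python
from typing import Dict, Iterable, Iterator


def enrich(events: Iterable[Dict]) -> Iterator[Dict]:
    """Business logic written once and shared by the batch and streaming paths."""
    for event in events:
        # Derive a dollar amount from the raw cents value (hypothetical field names).
        yield {**event, "amount_usd": round(event["amount_cents"] / 100, 2)}


def event_stream() -> Iterator[Dict]:
    """Stand-in for a streaming source: yields events one at a time."""
    for cents in (1250, 399):
        yield {"amount_cents": cents}


# Batch path: a bounded, in-memory collection.
batch_result = list(enrich([{"amount_cents": 1250}, {"amount_cents": 399}]))

# Streaming path: the same function consumes events lazily as they arrive.
stream_result = list(enrich(event_stream()))
```

Because `enrich` accepts any iterable, the same code serves both paths, which is the property the requirement asks for.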
Value of data
- Differences between data and information
- Data Network Effect
- Infonomics
Deriving information from data
- Data on its own is not valuable
- Information ultimately needs to translate into knowledge
- RDF is very useful but very time-consuming to curate
- Will machine learning be a scalable successor to RDF?
- The certainty of expert-curated content needs to be replaced with an ML probability-based model to make curation scalable
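One way to picture the shift from certain, expert-curated facts to probability-based ones is to attach a confidence score to each subject-predicate-object statement and accept only those above a threshold. This is a minimal sketch, not something presented in the session: the `Triple` class, the `accept` function, the 0.8 threshold, and all sample facts and scores are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Triple:
    subject: str
    predicate: str
    obj: str
    confidence: float = 1.0  # expert-curated facts are treated as certain


def accept(triples: List[Triple], threshold: float) -> List[Triple]:
    """Keep only statements whose confidence meets the threshold."""
    return [t for t in triples if t.confidence >= threshold]


# Hypothetical expert-curated statement (certain by default).
curated = [Triple("Acme", "locatedIn", "Berlin")]

# Hypothetical model-extracted statements with made-up probabilities.
predicted = [
    Triple("Acme", "industry", "Retail", 0.92),
    Triple("Acme", "foundedIn", "1999", 0.41),
]

knowledge = accept(curated + predicted, threshold=0.8)
```

The threshold makes the trade-off explicit: lowering it admits more machine-derived statements at the cost of certainty, while raising it approaches the curated-only case.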
Data Strategy: Architecture