working-with-kaggle-data

Example

https://www.kaggle.com/competitions/house-prices-advanced-regression-techniques

https://www.kaggle.com/code/gusthema/house-prices-prediction-using-tfdf


    !pip install tensorflow_decision_forests

    train_file_path = "./train.csv"

"By default the Random Forest Model is configured to train classification tasks. Since this is a regression problem, we will specify the type of the task (tfdf.keras.Task.REGRESSION) as a parameter here. "

https://www.tensorflow.org/decision_forests/api_docs/python/tfdf/keras/RandomForestModel 
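
A minimal sketch of what the quoted note describes: passing tfdf.keras.Task.REGRESSION both to the dataset conversion and to the model. The function name train_regression_forest is hypothetical, and the column names "SalePrice" (label) and "Id" (row identifier) are assumptions about the House Prices CSV rather than something stated in these notes.

```python
def train_regression_forest(csv_path="./train.csv", label="SalePrice"):
    """Train a TF-DF random forest as a regression model on a CSV file."""
    # Imported lazily so this definition loads even where the packages
    # (pandas, tensorflow_decision_forests) are not installed yet.
    import pandas as pd
    import tensorflow_decision_forests as tfdf

    df = pd.read_csv(csv_path)
    # Assumption: "Id" is only a row identifier, so it is dropped if present.
    df = df.drop(columns=["Id"], errors="ignore")

    # The task is specified for the dataset conversion and for the model;
    # without it, the model defaults to classification.
    train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(
        df, label=label, task=tfdf.keras.Task.REGRESSION
    )
    model = tfdf.keras.RandomForestModel(task=tfdf.keras.Task.REGRESSION)
    model.fit(train_ds)
    return model
```

Usage would look like model = train_regression_forest(train_file_path), after which model.make_inspector().evaluation() can be used to inspect the evaluation recorded during training.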

https://developers.google.com/machine-learning/decision-forests/out-of-bag
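
To get a feel for why out-of-bag evaluation works, here is a small standard-library sketch (not TF-DF code). It assumes each tree trains on a bootstrap sample, meaning N draws with replacement from N examples, and measures how many examples a single tree never sees. Those left-out examples are the ones available to evaluate that tree. The function name out_of_bag_fraction is made up for illustration.

```python
import random


def out_of_bag_fraction(num_examples, seed=0):
    """Fraction of examples that one bootstrap sample never draws."""
    rng = random.Random(seed)
    # Bootstrap: draw num_examples indices with replacement.
    drawn = {rng.randrange(num_examples) for _ in range(num_examples)}
    # Examples that were never drawn are "out of bag" for this tree.
    return 1 - len(drawn) / num_examples


fraction = out_of_bag_fraction(10_000)
print(f"out-of-bag fraction for one tree: {fraction:.3f}")
```

For large N the fraction approaches 1/e, roughly 37%, so every tree leaves a sizeable held-out set without a separate validation split.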


Discover 


Open the spreadsheet

https://docs.google.com/spreadsheets/d/1h9XHoZLzsjKQfjVhG4ayYWk7-EJOXOS1Kz9V2zRBNVM/edit?usp=sharing

The dataset on Kaggle

https://www.kaggle.com/code/pratik1120/penguin-dataset-eda-classification-and-clustering

Recall 

(Treat year as the classification label and observe the predictions.)

Note that the Conf. is calculated by

https://ydf.readthedocs.io/en/latest/tutorial/classification/


(see mass example)


Review problem framing

https://developers.google.com/machine-learning/problem-framing/ml-framing