The second model for predicting the Rotten Tomatoes rating of a given show on a streaming service is a decision tree model. The decision tree shown to the left has a max depth of 3. It has a mean squared error (MSE) of 315.25 on the training data and 303.62 on the testing data (both values can be found in the Jupyter notebook). The MSE alone cannot tell us whether this max depth is ideal without comparing it against other max depths, but the small gap between the training and testing MSE does suggest the model is not overfit.
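The fit described above can be sketched as follows. Since the streaming-service dataset lives in the notebook, synthetic placeholder features and ratings stand in here (an assumption); only the depth-3 tree and the train/test MSE computation mirror the actual workflow.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Placeholder data standing in for the show features and Rotten Tomatoes ratings.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 4))
y = 50 + 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 5, size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Depth-3 decision tree regressor, as in the model shown in the figure.
tree = DecisionTreeRegressor(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

train_mse = mean_squared_error(y_train, tree.predict(X_train))
test_mse = mean_squared_error(y_test, tree.predict(X_test))
print(f"train MSE: {train_mse:.2f}, test MSE: {test_mse:.2f}")
```

Comparing `train_mse` against `test_mse`, as the report does, is the quick check for overfitting: a large gap with a much lower training error would signal an overfit tree.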
Now we will see whether changing the max depth of the decision tree to either 2 or 10 gives a better fit.
The decision tree model to the left has a max depth of 2. The purpose of choosing this depth is to check whether the depth-3 model is overfit. This model, however, showed that shrinking the max depth actually makes the predictions worse: the training MSE rises to 445.80 and the testing MSE to 450.87, a sign of underfitting.
The decision tree model to the left has a max depth of 10. This model has a training MSE of 311.42 and a testing MSE of 313.98. Because the two errors are nearly equal, this depth appears to fit the data well without overfitting.
Of these three cases, the best decision tree model is the one with max depth 10. It beats the max depth of 2 by a large margin in MSE. Compared with the max depth of 3, its training MSE is slightly lower and its train/test gap is much smaller, though its testing MSE is slightly higher. An interesting follow-up would be to train the model with an even higher max depth, such as 16, to see how much the MSE changes.