Conclusions that can be drawn from this analysis:
Impact of location IDs (PULocationID and DOLocationID) on trip times: Using these two columns, I analyzed the relationship between pickup and drop-off locations and trip times. This helps identify whether certain locations are associated with longer or shorter trips, which may be useful for planning and optimizing taxi routes or for understanding traffic patterns in different areas of New York City.
Influence of trip distance on trip times: I also explored how trip distance (in miles) affects trip times. This analysis can show whether longer trips tend to take more time and whether there are anomalies where short trips take disproportionately long. This information helps in understanding the factors that affect trip times and may be useful for predicting trip durations accurately.
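As an illustration of the location comparison described above, the following is a minimal sketch using pandas. The rows are made-up toy values, and the `trip_minutes` column name is an assumption (the duration would normally be derived from the pickup and drop-off timestamps); only `PULocationID` and `DOLocationID` come from the dataset itself.

```python
import pandas as pd

# Toy rows standing in for the real trip records (all values are made up).
# "trip_minutes" is a hypothetical column holding the trip duration in minutes.
trips = pd.DataFrame({
    "PULocationID": [132, 132, 132, 161, 161, 237],
    "DOLocationID": [230, 230, 48, 237, 237, 161],
    "trip_minutes": [55.0, 61.0, 48.0, 9.0, 11.0, 8.0],
})

# Median duration and number of trips for each pickup/drop-off pair,
# sorted so the slowest pairs appear first.
pair_stats = (
    trips.groupby(["PULocationID", "DOLocationID"])["trip_minutes"]
    .agg(median_minutes="median", n_trips="count")
    .reset_index()
    .sort_values("median_minutes", ascending=False)
)
print(pair_stats)
```

The median is used here rather than the mean because a handful of extreme durations could otherwise dominate a pair with few trips.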
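One possible way to check the distance relationship and surface anomalous trips is sketched below. The data is invented for illustration, `trip_minutes` is an assumed column name, and the 3 mph cutoff is an arbitrary threshold chosen for the example, not a value taken from this analysis.

```python
import pandas as pd

# Toy data (made up): the last row is a short trip that took very long.
trips = pd.DataFrame({
    "trip_distance": [0.8, 2.5, 5.0, 10.0, 17.5, 1.0],   # miles
    "trip_minutes": [6.0, 14.0, 22.0, 35.0, 50.0, 45.0],  # hypothetical column
})

# Pearson correlation between distance and duration.
corr = trips["trip_distance"].corr(trips["trip_minutes"])

# Average speed in miles per hour; very low values flag short trips
# that took an unusually long time.
trips["mph"] = trips["trip_distance"] / (trips["trip_minutes"] / 60.0)
slow_trips = trips[trips["mph"] < 3.0]  # 3 mph is an illustrative cutoff

print(f"distance/duration correlation: {corr:.2f}")
print(slow_trips)
```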
Based on these results, it can be concluded that the selected features are useful in predicting the trip time for new trips and that the SVM can be an effective tool for this task. However, further analysis may be required to improve the accuracy of the model and to identify any potential limitations or biases in the data.
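For readers who want to reproduce this kind of model, a minimal sketch with scikit-learn's `SVR` follows. It is not the exact pipeline used in this analysis: the data is synthetic, `trip_minutes` is an assumed target column, and the one-hot encoding of location IDs, the RBF kernel, and the `C` and `epsilon` values are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVR

# Synthetic trips (not real data): duration grows with distance plus noise.
rng = np.random.default_rng(0)
n = 300
trips = pd.DataFrame({
    "PULocationID": rng.choice([132, 161, 237], size=n),
    "DOLocationID": rng.choice([48, 230, 237], size=n),
    "trip_distance": rng.uniform(0.5, 20.0, size=n),
})
trips["trip_minutes"] = (
    5.0 + 3.0 * trips["trip_distance"] + rng.normal(0.0, 2.0, size=n)
)

features = ["PULocationID", "DOLocationID", "trip_distance"]
X_train, X_test, y_train, y_test = train_test_split(
    trips[features], trips["trip_minutes"], test_size=0.25, random_state=0
)

# Location IDs are categories, so they are one-hot encoded (an assumption);
# the distance is standardized because SVMs are sensitive to feature scale.
preprocess = ColumnTransformer([
    ("ids", OneHotEncoder(handle_unknown="ignore"), ["PULocationID", "DOLocationID"]),
    ("dist", StandardScaler(), ["trip_distance"]),
])
model = make_pipeline(preprocess, SVR(kernel="rbf", C=10.0, epsilon=0.5))
model.fit(X_train, y_train)

pred = model.predict(X_test)
mae = mean_absolute_error(y_test, pred)
print(f"MAE on held-out trips: {mae:.2f} minutes")
```

Evaluating on a held-out split, as done here, is one simple way to look for the limitations mentioned above before relying on the model for new trips.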
Reference: https://medium.com/@haonanzhong/new-york-city-taxi-data-analysis-286e08b174a1
Passengers picked up at Newark Airport, JFK, and Flushing Meadows Corona Park are more likely to give higher tips. On the drop-off side, passengers dropped off at Great Kills and Oakwood on Staten Island also tend to tip more, which reflects Staten Island’s status as one of the most well-off boroughs. However, since some zones have only a few trips, these statistics may be skewed by small sample sizes.
Overall, analyzing NYC taxi data with an SVM to predict trip times and tip amounts can provide valuable insights into the factors that affect trip durations and can inform decision-making in transportation planning, resource allocation, and related areas.
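The small-sample caveat can be handled by ignoring zones with too few trips before ranking them. The sketch below shows one way to do this with pandas; the rows are made up, and `MIN_TRIPS = 3` is an arbitrary illustrative threshold that would be far larger on real data.

```python
import pandas as pd

# Toy data (made up): zone 99 has a single trip with a very large tip.
trips = pd.DataFrame({
    "PULocationID": [1, 1, 1, 132, 132, 132, 132, 99],
    "tip_amount": [12.0, 15.0, 10.0, 8.0, 9.5, 7.0, 10.5, 40.0],
})

MIN_TRIPS = 3  # illustrative minimum number of trips per zone

# Mean tip and trip count per pickup zone.
zone_tips = trips.groupby("PULocationID")["tip_amount"].agg(
    mean_tip="mean", n_trips="count"
)

# Keep only zones with enough trips, then rank by mean tip.
reliable = zone_tips[zone_tips["n_trips"] >= MIN_TRIPS].sort_values(
    "mean_tip", ascending=False
)
print(reliable)
```

Without the filter, zone 99 would rank first on the strength of one trip, which is exactly the distortion described above.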