This dataset supports building predictive models that estimate the total amount a traveler pays for a taxi journey. The training set contains the target variable 'total_amount' along with the features listed below, and participants are challenged to predict that target as accurately as possible.
The dataset comprises the following columns, each describing one aspect of a taxi ride:
total_amount: The total amount paid by the traveler for the taxi ride.
VendorID: An identifier for taxi vendors.
tpep_pickup_datetime and tpep_dropoff_datetime: Timestamps indicating pickup and drop-off times.
passenger_count: The number of passengers during the ride.
trip_distance: The distance traveled during the trip.
RatecodeID: Rate code for the ride.
store_and_fwd_flag: A flag indicating whether the trip record was held in the vehicle's memory before being sent to the vendor (store and forward).
PULocationID and DOLocationID: Pickup and drop-off location identifiers.
payment_type: Payment type used for the ride.
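The two timestamp columns are not directly usable as numeric inputs, so a common step is to derive a trip duration from them. The following is a minimal sketch of that idea, not necessarily the approach used in this project. The timestamp format, the helper name trip_duration_minutes, and the example row are assumptions made for illustration.

```python
from datetime import datetime

# Assumed timestamp layout; adjust if the data uses a different format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def trip_duration_minutes(pickup: str, dropoff: str) -> float:
    """Return the elapsed time between pickup and drop-off in minutes."""
    start = datetime.strptime(pickup, TIMESTAMP_FORMAT)
    end = datetime.strptime(dropoff, TIMESTAMP_FORMAT)
    return (end - start).total_seconds() / 60.0


# Hypothetical row, made up for illustration; not taken from the dataset.
row = {
    "tpep_pickup_datetime": "2023-06-28 17:20:21",
    "tpep_dropoff_datetime": "2023-06-28 17:35:51",
}

duration = trip_duration_minutes(
    row["tpep_pickup_datetime"], row["tpep_dropoff_datetime"]
)
print(duration)
```

A derived duration like this can then sit alongside trip_distance and passenger_count as a numeric feature.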
Models Used:
Multiple Linear Regression
SGD Regressor
k-Neighbors Regressor
Decision Tree Regressor
Random Forest Regressor
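The five models listed above can be compared side by side with scikit-learn. The sketch below is illustrative only: it runs on synthetic data generated in the script rather than on the real dataset, and the hyperparameters, the scaling step for the SGD and k-Neighbors models, and the choice of R^2 as the metric are all assumptions, not settings taken from this project.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real features and target (assumption:
# the fare grows roughly linearly with distance, plus noise).
rng = np.random.default_rng(0)
trip_distance = rng.uniform(0.5, 20.0, size=500)
passenger_count = rng.integers(1, 5, size=500)
X = np.column_stack([trip_distance, passenger_count])
y = 3.0 + 2.5 * trip_distance + rng.normal(0.0, 1.0, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# SGD and k-Neighbors are sensitive to feature scale, so they are
# wrapped in a pipeline that standardizes the inputs first.
models = {
    "Multiple Linear Regression": LinearRegression(),
    "SGD Regressor": make_pipeline(StandardScaler(), SGDRegressor(random_state=0)),
    "k-Neighbors Regressor": make_pipeline(
        StandardScaler(), KNeighborsRegressor(n_neighbors=5)
    ),
    "Decision Tree Regressor": DecisionTreeRegressor(random_state=0),
    "Random Forest Regressor": RandomForestRegressor(
        n_estimators=50, random_state=0
    ),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

To run this on the actual data, X and y would be replaced with the prepared feature matrix and the 'total_amount' column.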