Hotel Booking

Using Python

In this project, we analyze hotel booking data and give both hotels (the city hotel and the resort hotel) actionable recommendations for reducing their cancellation rates and improving revenue, based on the insights gained from the analysis.


ABOUT DATA

The hotel booking dataset is a collection of information related to hotel bookings made by customers. It contains 119391 rows and 32 columns of data.

The dataset includes information such as hotel type, location, booking dates, room type, number of adults and children, and whether the booking was cancelled or not.

Some of the notable columns used in this analysis include is_canceled, reservation_status_date, adr, and market_segment.

This data can be used for various analyses and predictions related to hotel bookings, such as predicting the likelihood of a booking being cancelled or identifying trends in seasonal booking patterns.

PROJECT BEGINNING

IMPORT and LOAD DATA

EDA and DATA CLEANING

This Python snippet converts the reservation_status_date column of the pandas DataFrame df to datetime format using the pd.to_datetime() function.

Here's how the code works:
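A minimal sketch of the conversion described above. The three sample rows are hypothetical stand-ins for the real dataset, which the project loads from file.

```python
import pandas as pd

# Hypothetical sample rows; the real project loads the full booking dataset.
df = pd.DataFrame({
    "reservation_status_date": ["2015-07-01", "2016-01-15", "2017-08-31"],
    "is_canceled": [0, 1, 0],
})

# Convert the text column to datetime so date parts (month, year) can be
# extracted and the column can be compared against date ranges later.
df["reservation_status_date"] = pd.to_datetime(df["reservation_status_date"])

print(df.dtypes)
```

After the conversion, expressions such as df["reservation_status_date"].dt.month work on the column.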

Here we use a for loop to list the unique values of each column with the object datatype.
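One way such a loop could look, as a sketch on hypothetical sample data. Selecting the columns with select_dtypes is an assumption; the original code may pick the object columns differently.

```python
import pandas as pd

# Hypothetical sample rows. The text columns are forced to object dtype so
# the sketch behaves the same across pandas versions.
df = pd.DataFrame({
    "hotel": ["Resort Hotel", "City Hotel", "City Hotel"],
    "meal": ["BB", "HB", "BB"],
    "adr": [75.0, 120.5, 98.0],
}).astype({"hotel": object, "meal": object})

unique_values = {}
# Loop over the object-dtype columns only; numeric columns such as adr are skipped.
for col in df.select_dtypes(include="object").columns:
    unique_values[col] = df[col].unique()
    print(col)
    print(unique_values[col])
    print("-" * 50)
```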

This code snippet counts the missing values in each column of the pandas DataFrame df.

It is a quick way to check whether the dataset contains missing values and to choose a strategy for handling them, such as imputing them or dropping the affected rows or columns.
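A short sketch of the missing-value count, run on hypothetical sample rows rather than the real dataset:

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows with a few gaps.
df = pd.DataFrame({
    "country": ["PRT", None, "GBR", "PRT"],
    "agent": [9.0, np.nan, np.nan, 240.0],
    "adr": [75.0, 120.5, 98.0, 60.0],
})

# isnull() marks each missing cell as True; sum() then counts them per column.
missing = df.isnull().sum()
print(missing)
```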

How drop and dropna work:
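The difference between the two can be sketched as follows. The sample data is hypothetical, and the choice of agent and company as the columns to drop is an assumption made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows: agent and company are mostly empty.
df = pd.DataFrame({
    "country": ["PRT", None, "GBR", "PRT"],
    "agent": [9.0, np.nan, np.nan, 240.0],
    "company": [np.nan, np.nan, np.nan, 40.0],
    "adr": [75.0, 120.5, 98.0, 60.0],
})

# drop removes whole columns (or rows) by label, regardless of their content.
df = df.drop(columns=["agent", "company"])

# dropna removes every remaining row that contains at least one missing value.
df = df.dropna()

print(df)
```

Dropping the sparse columns first keeps more rows, because dropna would otherwise discard every row with a gap in those columns.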

BoxPlot :


Reservation Cancel Percentage:

Resort and City Hotel : 


Resort and City Hotel : 

This code creates a count plot of reservation status (canceled or not) by month. It first adds a 'month' column to the DataFrame by extracting the month from the 'reservation_status_date' column, then creates a figure with a size of 16x8. The count plot puts 'month' on the x-axis and uses 'is_canceled' as the hue, so each month shows separate counts of canceled and not-canceled reservations from 'df'. The remaining lines add a title, axis labels, and a legend that shows which color represents 'Not Canceled' and 'Canceled'.
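The counts behind such a plot can be sketched as below. This version swaps in pd.crosstab and pandas' built-in bar plotting for a Seaborn count plot, so it is an equivalent view of the same counts rather than the original code. The sample rows, title, and axis labels are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "reservation_status_date": pd.to_datetime(
        ["2016-01-05", "2016-01-20", "2016-01-28", "2016-07-03", "2016-07-19"]
    ),
    "is_canceled": [1, 1, 0, 0, 0],
})

# Extract the month number from the datetime column.
df["month"] = df["reservation_status_date"].dt.month

# Count reservations per month, split by cancellation status (columns 0 and 1).
counts = pd.crosstab(df["month"], df["is_canceled"])

# Grouped bars: one pair of bars per month, like a count plot with a hue.
ax = counts.plot(kind="bar", figsize=(16, 8))
ax.set_title("Reservation status per month")
ax.set_xlabel("month")
ax.set_ylabel("number of reservations")
ax.legend(["Not Canceled", "Canceled"])
```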

This code creates a bar plot of the Average Daily Rate (ADR) per month for canceled reservations. It first creates a figure with a size of 15x8 and adds a title, then draws a bar plot with the Seaborn package, with 'month' on the x-axis and 'adr' on the y-axis. The data comes from 'df', filtered to canceled reservations with the boolean condition df['is_canceled']==1, grouped by month with the 'adr' column summed, and passed through reset_index() to turn month from an index back into a column. Because the values are summed, each bar shows the total ADR of canceled reservations in that month rather than a mean.
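The data preparation behind this plot can be sketched with pandas alone. The sample rows are hypothetical, and the variable name canceled_adr is introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample rows; month is assumed to have been extracted already.
df = pd.DataFrame({
    "month": [1, 1, 7, 7, 7],
    "adr": [100.0, 80.0, 150.0, 120.0, 90.0],
    "is_canceled": [1, 1, 1, 0, 1],
})

canceled_adr = (
    df[df["is_canceled"] == 1]   # keep canceled reservations only
    .groupby("month")[["adr"]]   # one group per month
    .sum()                       # total ADR per month (a sum, not a mean)
    .reset_index()               # turn month from index back into a column
)

print(canceled_adr)
# The resulting frame is what would be handed to a bar plot,
# with "month" on the x-axis and "adr" on the y-axis.
```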

In summary, the code creates a pie chart of the share of canceled reservations among the ten countries with the most cancellations.
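A minimal sketch of such a pie chart, assuming the top ten are ranked by the number of canceled reservations and that the country column is named country. The sample rows and the chart title are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; the real data has many more countries.
df = pd.DataFrame({
    "country": ["PRT", "PRT", "PRT", "GBR", "GBR", "ESP", "PRT", "FRA"],
    "is_canceled": [1, 1, 1, 1, 1, 1, 0, 0],
})

canceled = df[df["is_canceled"] == 1]

# value_counts sorts descending, so head(10) keeps the ten largest counts.
top_10 = canceled["country"].value_counts().head(10)

fig, ax = plt.subplots(figsize=(8, 8))
# autopct prints each slice's share of the plotted total as a percentage.
ax.pie(top_10, labels=top_10.index, autopct="%.2f")
ax.set_title("Top 10 countries with canceled reservations")
```

Note that the percentages on the slices are shares of the ten plotted countries combined, not of all cancellations.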


This line of code calculates the share of reservations for each unique value in the 'market_segment' column of the DataFrame 'df'. The value_counts() function counts how often each unique value occurs, and the normalize=True parameter returns those counts as fractions of the total number of reservations (values between 0 and 1, which can be multiplied by 100 to get percentages).
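A short sketch of that call on hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample: 5 online bookings, 3 group bookings, 2 direct bookings.
df = pd.DataFrame({
    "market_segment": ["Online TA"] * 5 + ["Groups"] * 3 + ["Direct"] * 2,
})

# normalize=True divides each count by the total number of rows.
shares = df["market_segment"].value_counts(normalize=True)

print(shares)        # fractions between 0 and 1
print(shares * 100)  # the same shares as percentages
```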

This code creates one DataFrame for canceled reservations and one for not-canceled reservations, and calculates the average daily rate (ADR) per reservation status date in each. It then plots a line graph of the ADR over time for both groups, with the reservation status date on the x-axis and the average daily rate on the y-axis. The graph makes it easy to compare the ADR trends of canceled and not-canceled reservations and to spot patterns.
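The two per-day averages could be computed along these lines. The sample rows are hypothetical, and the DataFrame names cancelled_df_adr and not_cancelled_df_adr are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample rows covering two days.
df = pd.DataFrame({
    "reservation_status_date": pd.to_datetime([
        "2016-03-01", "2016-03-01", "2016-03-02",
        "2016-03-01", "2016-03-02", "2016-03-02",
    ]),
    "adr": [100.0, 120.0, 90.0, 80.0, 70.0, 90.0],
    "is_canceled": [1, 1, 1, 0, 0, 0],
})


def daily_mean_adr(frame):
    """Mean ADR per reservation status date, sorted by date."""
    return (
        frame.groupby("reservation_status_date")[["adr"]]
        .mean()
        .reset_index()
        .sort_values("reservation_status_date")
    )


cancelled_df_adr = daily_mean_adr(df[df["is_canceled"] == 1])
not_cancelled_df_adr = daily_mean_adr(df[df["is_canceled"] == 0])

print(cancelled_df_adr)
print(not_cancelled_df_adr)
```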

To summarize, these two lines of code filter two separate DataFrames on the reservation_status_date column, keeping only the rows whose date falls within a specific range. The resulting DataFrames contain only the reservations that fall within that time frame, one for canceled and one for not-canceled reservations.
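A sketch of that kind of date-range filter. The start and end dates, the sample rows, and the helper name filter_by_date are all hypothetical; the original range is not stated in the text.

```python
import pandas as pd

# Hypothetical per-day ADR frames for the two groups.
dates = pd.to_datetime(["2015-12-31", "2016-06-15", "2017-08-31", "2017-10-01"])
cancelled_df_adr = pd.DataFrame(
    {"reservation_status_date": dates, "adr": [90.0, 100.0, 110.0, 95.0]}
)
not_cancelled_df_adr = pd.DataFrame(
    {"reservation_status_date": dates, "adr": [85.0, 95.0, 105.0, 90.0]}
)

# Hypothetical range boundaries (exclusive on both ends).
start = pd.Timestamp("2016-01-01")
end = pd.Timestamp("2017-09-01")


def filter_by_date(frame, start, end):
    """Keep only the rows whose date lies strictly between start and end."""
    in_range = (frame["reservation_status_date"] > start) & (
        frame["reservation_status_date"] < end
    )
    return frame[in_range]


cancelled_df_adr = filter_by_date(cancelled_df_adr, start, end)
not_cancelled_df_adr = filter_by_date(not_cancelled_df_adr, start, end)

print(cancelled_df_adr)
```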

In the second code block:

Overall, this code generates a line plot that shows the ADR over time for cancelled and not-cancelled reservations. The x-axis represents the dates of the reservations, and the y-axis represents the ADR for each group of reservations. The legend shows which line represents which group of reservations.
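A minimal sketch of such a line plot using matplotlib. The two small DataFrames, the figure size, the title, and the axis labels are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical per-day average ADR for each group.
dates = pd.to_datetime(["2016-03-01", "2016-03-02", "2016-03-03"])
not_cancelled_df_adr = pd.DataFrame(
    {"reservation_status_date": dates, "adr": [80.0, 85.0, 82.0]}
)
cancelled_df_adr = pd.DataFrame(
    {"reservation_status_date": dates, "adr": [110.0, 90.0, 105.0]}
)

fig, ax = plt.subplots(figsize=(20, 6))
ax.set_title("Average Daily Rate")

# One line per group; the label is what the legend displays.
ax.plot(
    not_cancelled_df_adr["reservation_status_date"],
    not_cancelled_df_adr["adr"],
    label="not cancelled",
)
ax.plot(
    cancelled_df_adr["reservation_status_date"],
    cancelled_df_adr["adr"],
    label="cancelled",
)

ax.set_xlabel("reservation status date")
ax.set_ylabel("average daily rate")
ax.legend()
```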


INSIGHTS

1. Cancellation rates rise as the price does. To prevent cancellations, hotels could revise their pricing strategies and lower the rates for specific hotels based on location. They could also offer discounts to customers.

2. The ratio of cancellations to non-cancellations is higher for the resort hotel than for the city hotel, so hotels should offer a reasonable discount on room prices on weekends and holidays.

3. Cancellations are highest in January, so hotels could run campaigns or marketing with a reasonable budget in that month to increase their revenue.

4. Hotels could also improve the quality of their hotels and services, mainly in Portugal, to reduce the cancellation rate.

CONCLUSION

To prevent rising cancellation rates, hotels can consider adjusting their pricing strategies, offering discounts during peak times, and improving the quality of their hotels and services. These measures can improve customer satisfaction and increase revenue. 

HELPING MATERIALS

All project materials can be viewed and downloaded here.