INTRODUCTION
It's no secret that in today's fast-paced world, leveraging data-driven tactics to boost productivity and enhance business operations is essential to helping companies handle complexity and competition. In order to achieve this, Splendor Hotel Groups have recently recruited me as a Business Intelligence Analyst to help make sense of their data to see how I can use my skills to reveal important insights buried within. My job is to perform a thorough analysis of a large dataset that includes financial metrics, guest profiles, distribution networks, and detailed information about bookings. Through the use of my analytical skills, we hope to derive significant insights that will guide both operational and overall improvements, helping SHG to continue its unmatched hospitality.
BUSINESS OBJECTIVES
The first step in the process is to understand the business objectives. This critical initial phase sets the foundation for the entire analysis by providing a clear roadmap and context. The following key questions were posed by the business:
Booking Patterns:
What is the trend in booking patterns over time, and are there specific seasons or months with increased booking activity?
Customer Behavior Analysis:
Which distribution channels contribute the most to bookings, and how does the average daily rate (ADR) differ across these channels?
Can we identify any patterns in the distribution of guests based on their country of origin, and how does this impact revenue?
Cancellation Analysis:
What factors are most strongly correlated with cancellations, and can we predict potential cancellations based on certain variables?
How does the revenue loss from cancellations compare across different customer segments and distribution channels?
Revenue Optimization:
What is the overall revenue trend, and are there specific customer segments or countries contributing significantly to revenue?
Can we identify optimal pricing strategies based on the Average Daily Rate (ADR) for different customer types and distribution channels?
Geographical Analysis:
How does the distribution of guests vary across different countries, and are there specific countries that should be targeted for marketing efforts?
Operational Efficiency:
What is the average length of stay for guests, and how does it differ based on booking channels or customer types?
Are there patterns in check-out dates that can inform staffing and resource allocation strategies?
Impact of Deposit Types:
How does the presence or absence of a deposit impact the likelihood of cancellations and revenue generation?
Can we identify any patterns in the use of deposit types across different customer segments?
Analysis of Corporate Bookings:
What is the proportion of corporate bookings, and how does their Average Daily Rate (ADR) compare to other customer types?
Are there specific trends or patterns related to corporate bookings that can inform business strategies?
Time-to-Event Analysis:
How does the time between booking and arrival date (lead time) affect revenue and the likelihood of cancellations?
Are there specific lead time ranges that are associated with higher customer satisfaction or revenue?
Comparison of Online and Offline Travel Agents:
What is the revenue contribution of online travel agents compared to offline travel agents?
How do cancellation rates and revenue vary between bookings made through online and offline travel agents?
Using my trusty tools (SQL Server and Tableau), I will do my best to probe the data to uncover the answers to these questions.
DATA COLLECTION AND PROCESSING
The data for this project was provided by Uchenna Splendor in this google drive folder.
The dataset contains the following columns:
Booking ID: Unique identifier for each booking.
Hotel: Type or name of the hotel within the Splendor Hotel Group.
Booking Date: Date when the booking was made.
Arrival Date: Date when the guests are scheduled to arrive.
Lead Time: Number of days between the booking date and arrival date.
Nights: Number of nights the guests are booked to stay.
Guests: Number of guests included in the booking.
Distribution Channel: The channel through which the booking was made (e.g., Direct, Online Travel Agent, Offline Travel Agent).
Customer Type: Type of customer making the booking (e.g., Transient, Corporate).
Country: Country of origin of the guests.
Deposit Type: Whether a deposit was made for the booking (e.g., No Deposit, Deposit).
Avg Daily Rate: Average daily rate for the booking.
Status: Status of the booking (e.g., Check-Out, Canceled).
Status Update: Date of the last status update for the booking.
Canceled (0/1): Binary indicator of whether the booking was canceled (1 if canceled, 0 if not canceled).
Revenue: Revenue generated from the booking.
Revenue Loss: Loss in revenue if the booking was canceled (negative value if the booking wasn't canceled).
For the purpose of analysis and initial insights derivation, SQL Server was used. The vast library of functions, along with its robust querying capabilities, enabled efficient data retrieval and manipulation and this easily made it tool of first choice for this project.
To begin this analysis, I simply imported the data into SQL Server using the Import Data option in SQL Server Management Studio.
DATA CLEANING
At this point, I had my data ready, but I had to make sure the data was clean before beginning my analysis. This is to ensure that the analysis performed will derive accurate insights as well as reliable results.
First, I took a look at the entire data table as well as the unique values in key columns. This step is important because it allows me to identify and rectify errors and inconsistencies in the data, such as spelling mistakes, inaccurately inputted values, or discrepancies in data formats.
Upon doing this, I discovered that I had to correct spellings in the Status column. I also discovered that there were null values in the Country column. To rectify this, I simply set the null values in the Country column to ‘Unknown’ instead of deleting the rows making sure that I can retain all of my data.
When viewing the table, I observed that the Booking Date, Arrival Date, and Status Update columns were in DATETIME2 format, so, I converted them to DATE format. I also observed that there were 2 separate columns for revenue - Revenue and Revenue Loss – which give the revenue gained and revenue lost. I decided to combine these columns since they ultimately represent revenue and would make my table more compact.
Next, I checked for missing values and found no missing values and I also checked for invalid values in the Lead Time, Nights, Avg Daily Rate, Guests, and Status columns to ensure that the data is free from anomalies or inaccuracies that could potentially impact the analysis. Upon doing that, I found that there was only one case (Booking ID 14970) where guests showed up and stayed and revenue was negative. I had to correct that because this was clearly an error.
Finally, I checked for duplicate values to ensure the dataset's integrity and prevent any potential distortions in the analysis. This concluded the data cleaning phase of this analysis.
DATA ANALYSIS AND INSIGHT GENERATION
Now, it was time to answer the business questions using SQL Server’s querying capabilities. I made use of several tools like:
Aggregate functions e.g. COUNT()and SUM()
Window functions e.g. ROW_NUMBER()
Date functions e.g. YEAR() and MONTH()
Common Table Expressions (CTE)
Logical functions e.g. CASE WHEN
Mathematical functions e.g. ROUND()
Joins
String Concatenation
Note: You can find the entire SQL query used for the data cleaning and analysis here.
BUSINESS DASHBOARD CREATION
Finally, I created the business intelligence dashboard using Tableau where I made use of the following features:
Parameter actions to create filters
Navigation buttons to move between dashboard pages
Custom shapes and icons
BANs creation for easy-to-understand KPIs
Color selection
Calculated fields e.g. conditional logic, date functions
BUSINESS INSIGHTS
Following the analysis, the following insights were uncovered from the data:
1. 2016 had the highest number of bookings on record with a total of 58543 bookings while 2013 had only 1 booking despite the earliest booking being in June 2013. It is notable that 2017 had fewer bookings than 2016 most likely because the last data on record was in August of the same year.
2. May and June had the lowest number of bookings with 7853 and 6063 bookings respectively while January and February had the highest number of bookings with 16688 and 13468 bookings respectively.
It was also discovered that the transient customer type had the highest frequency in the months with the highest and lowest bookings.
3. The Contract customer type had the highest lead time (143 days) while the Group customer type had the lowest lead time (55 days) among the customer types that we have definite data for.
4. The most common distribution channel utilized by guests was the Online Travel Agent channel, used by 62% of the guests while the least commonly utilized channel is the Corporate channel used by only 6% of the guests (Undefined refers to the distribution channel that was not recorded in the dataset).
5. Among these channels, the Direct distribution channel had the highest Average Daily Rate (ADR) with $72.67 while Online Travel Agent had the least ADR among the defined channels with $25.64.
6. 8 out of the top 10 countries of origin of the guests are European with Portugal having the highest total number of guests
7. Among the various customer types, the Transient category incurred the highest cost for the hotel, resulting in a loss of -$8,138,113.10, whereas the Group customer type had the lowest impact with a loss of -$17,325.19. The Online Travel Agent channel proved to be the most expensive for the hotel, leading to a loss of -$8,744,453.83. Conversely, the hotel recouped some funds from the Offline Travel Agent channel, totaling $257,189.61, likely attributed to the absence of refunds for offline bookings.
8. From 2013 to 2017, the hotel group accumulated a revenue of $29,600,725.04 from guests but incurred a loss of -$13,122,900.09 in the course of operations. Based on this analysis, the cumulative profit from 2013 to 2017 amounted to $16,477,824.95, constituting 55.67% of the company's total revenue during that period.
9. Regarding revenue generation, Transient guests recorded the highest total revenue, amounting to $22,696,813.88. Simultaneously, they incurred the highest total loss, standing at -$11,502,290.03, but also attained the highest total profit at $11,194,523.85.
10. Regarding average profit, the Offline Travel Agent channel demonstrated the highest figure at $245.22, while the Online Travel Agent channel reported the lowest at $87.38. In terms of average revenue, the Direct channel emerged with the highest at $302.69, whereas the Corporate channel recorded the lowest at $137.56.
11. The average length of stay for guests at the hotel is 3 nights. Examining various distribution channels, it is observed that Corporate guests have the shortest average length of stay at 2 nights, while guests booking through offline travel agents experience the longest average length of stay at 4 nights.
Among various customer types, the Contract customer type exhibited the lengthiest average stay at 6 nights, while the remaining three guest types shared a similar average length of stay at 3 nights each. Notably, Contract guests utilizing the offline travel agent channel surpassed this average, boasting an extended average stay of 8 nights. Conversely, several guest types and distribution channels recorded the shortest average length of stay at 2 nights.
12. In instances where the guests were not required to pay deposits, the cancellation rate stood at 71.62%. With refundable deposits, the cancellation rate rose to 77.78%, whereas non-refundable deposits resulted in a minimal 0.64% cancellation rate. Notably, non-refundable deposits proved advantageous, yielding a total profit of $3,604,337.03 for the hotel, while other cancellations incurred a total loss of $13,104,900.09.
13. Among the total customer population, 64.23% are transient guests who did not make any deposits, followed by transient party guests without deposits accounting for 19.98%, and transient guests with non-refundable deposits making up 10.81%
14. Out of transient guests who didn't pay a deposit, 44.48% attended, while 15.52% of transient-party guests without deposits and 2.36% of contract guests without deposits attended. Conversely, only 0.08% of transient-party guests who paid a non-refundable deposit attended.
15. The Corporate distribution channel exhibited the lowest representation, with only 5.59% of guests utilizing this channel, totalling 6,677 guests. The average daily rate for the corporate distribution channel was $45.48.
RECOMMENDATIONS
Building on the insights gained through the analysis of the Splendor Hotel Group's dataset, these are the recommendations that I will offer to improve the hotel's performance and refine its operational efficiency:
The stark contrast in booking numbers between the months of May and June versus January and February indicates the need for targeted marketing efforts during the former to boost occupancy rates during the months with lower bookings. For example, the company can introduce special promotions and discounts during the low booking months to incentivize potential guests as well as creating packages that include additional perks such as a complimentary breakfast.
Given that the transient customer type dominated both the months with the highest and lowest bookings suggests the importance of tailoring marketing and promotional activities to this customer segment throughout the year. Strategies should focus on enticing transient guests, considering their significant impact on booking patterns.
The data indicates that the Contract customer type has the highest lead time at 143 days, suggesting that the hotel should implement measures to capitalize on longer planning periods for this customer type. This could involve targeted marketing campaigns, early booking incentives, and personalized offerings to attract and engage customers well in advance of their stay. Meanwhile, the Group customer type, with a shorter lead time of 55 days, requires a more agile approach in marketing and service delivery. Special promotions and last-minute deals can be emphasized to capture the attention of this segment and encourage swift reservations.
In terms of distribution channels, the hotel should explore ways to leverage the popularity of the Online Travel Agent channel, which was used by 62% of guests. For instance, the company can engage in negotiations with Online Travel Agents to secure more favourable terms, such as increased commission rates as well as investing in digital marketing efforts to enhance the hotel's online visibility through the agents in order to reach a wider audience.
Examining the cost implications, the hotel should reassess its strategy regarding Transient guests, considering this customer segment resulted in the highest losses, especially through the Online Travel Agent channel. A significant factor contributing to losses for Transient guests is related to deposits. Given that a substantial number of Transient guests do not currently pay any deposits, this has led to a notable loss of -$9,905,281.40 for the company. To mitigate this issue, the company could implement a mandatory deposit requirement for Transient guests. Introducing such a policy would serve to minimize losses in the event of cancellations and contribute to overall financial stability.
Analyzing the performance of different distribution channels, the hotel should explore ways to enhance the profitability of the Offline Travel Agent channel, which exhibited the highest average profit at $245.22. The hotel should introduce strategies such as negotiating mutually beneficial terms and incentives to foster long-term collaborations for Offline Travel Agents. Simultaneously, efforts should be made to improve the Online Travel Agent channel's average profit, currently at the lowest with $87.38.
Understanding the average length of stay is crucial for resource allocation and service planning. The hotel should consider promoting longer stays, especially among Corporate guests who currently have the shortest average length of stay at 2 nights. This could include improving the current conditions of the hotel, including events and attractions that cater to the needs of the Corporate guests as well as any other relevant strategies.
CONCLUSION
This project was not only enjoyable but also incredibly exciting. The opportunity to work with a dataset mirroring a real-life business scenario and provide insights through analysis conducted in SQL Server is truly gratifying.
Take a peek at the Tableau dashboard: