Data mining is the process of extracting meaningful information from large volumes of digitally stored data. It is commonly used to analyze huge amounts of data and to extract unforeseen results from them. Data mining techniques are applied in a wide variety of disciplines and fields, such as airline digital flight analysis, tourism and hospitality, customer relationship management in marketing, and medical disease prediction.
Advances in information and software technology have made it possible to collect and store large amounts of data in a short time. Databases give users quick access to knowledge and provide large data sets, but analyzing the thousands of records they hold requires information technology. In the process of knowledge discovery, data analysis is an important step for obtaining previously unknown, hidden, meaningful, and useful patterns from large-scale databases. Data mining techniques make it possible to uncover previously unknown relationships and correlations between variables and to identify future trends and possibilities.
Digital flight analysis is the analysis of flight data to identify, assess, and address operational risks. It can be used effectively to support a range of airworthiness and operational safety tasks. Digital flight analysis can help improve flight crew performance, operating procedures, flight training, air traffic control procedures, aircraft maintenance, airline scheduling, and other areas.
Flight delay has been the subject of several studies in recent years. As demand for air travel has increased, so have the effects of flight delay, which has become one of the major issues in the airline industry. It has negative impacts, mostly economic, on airlines, airports, and passengers. The impact is not limited to passengers; it also includes operational costs as well as fines and penalties from the Federal Aviation Administration. The impacts of flight delay are likely to worsen in the future due to increasing air traffic congestion, the growth of commercial airlines, and the rising number of passengers per year. While flight delays are likely to persist due to unavoidable factors such as weather and unpredictable maintenance, we seek to identify the critical operational factors responsible for delays and to create a predictive algorithm that forecasts both whether a flight will be delayed and how long the delay will last.
Three predictive models (Random Tree, Decision Tree, and Logistic Regression) are used for binomial prediction of flight arrival delays. Clustering models (X-means and K-means) are then applied, and a clustering-based classification enhancement model is used to further enhance the data model. Another three predictive models (Linear Regression, Support Vector Machine (SVM), and Neural Network) are used for continuous prediction of the duration of departure delays. Each model is then tested, and the models are compared on their accuracy. The predictive models developed in this study can lead to better management decisions and allow for effective flight scheduling. In addition, the significant factors they highlight can give insight into the root causes of aircraft delays.
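The text does not spell out how the clustering-based enhancement works. One common reading, shown here only as a minimal sketch, is to cluster the flights first and then append each flight's cluster index as an extra input feature for the classifier. The feature pair (scheduled departure hour, delay of the previous leg in minutes), the function names, and the sample values are all hypothetical, and the first k points seed the centroids purely to keep the example deterministic.

```python
from math import dist


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iterations=20):
    """Plain K-means; the first k points are used as initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centroids)
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids


def add_cluster_feature(points, k):
    """Append each point's cluster index as one extra feature."""
    centroids = kmeans(points, k)
    labels = assign(points, centroids)
    return [tuple(p) + (lab,) for p, lab in zip(points, labels)]


# Hypothetical flights: (scheduled departure hour, previous-leg delay in minutes)
flights = [(6, 0), (18, 40), (7, 5), (19, 55)]
enriched = add_cluster_feature(flights, k=2)
```

Each tuple in `enriched` carries the original features plus a cluster index, so a classifier such as a decision tree could split on cluster membership in addition to the raw values. Whether this matches the enhancement model used in the study is an assumption.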
We want to improve the punctuality of our airline without increasing scheduled flight time. When a delay to the schedule does occur, the customer will be informed of the flight status.
Airline delays are the proverbial thorn in the side of any traveler. They can have a huge impact on travelers and cause a lot of anxiety. Often there are connecting flights to consider, not to mention scheduled events such as weddings, meetings, and reunions. Even a relatively small delay can wreak havoc on travel plans, so passengers all over the world are eager to avoid them. If we ignore this problem, more resources will be needed to handle its cascading effects, and we may lose many customers, which could result in lost revenue, lost business, and further damage to our reputation for quality.
We will use data mining techniques to predict whether a flight will be delayed and, at the same time, to predict the duration of the departure delay, in order to help us improve the punctuality of our airline and its schedule.
In response to the problems stated above, we decided to develop predictive models with the following objectives:
1. To predict the flight arrival delay: This model is developed to inform customers of the flight status and delay time earlier, so that they can rearrange their plans and reduce the cost and impact of the delay. This solution can greatly help rebuild trust between passengers and the airline company. A company that is careless about its passengers will lose customer loyalty, revenue, and eventually the business.
2. To predict the duration of departure delay: We would like to notify customers of the accurate flight delay time. Accurate prediction of the departure delay duration is essential for the airline company to manage the next flight within the delay time or to consider other options for providing better service to passengers. Acting quickly will protect the airline company’s reputation from damage. Early preparation is better than meaningless waiting. This model will therefore help the company make better management decisions that allow for effective flight scheduling.
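The two objectives correspond to two different prediction targets: a yes/no label for arrival delay and a number of minutes for departure delay. The sketch below illustrates how each target might be derived and scored. The 15-minute threshold is an assumption (the text does not state one), mean absolute error is used here only as an illustrative stand-in for how a duration model could be scored, and all sample values are made up.

```python
# Assumption: a flight counts as "delayed" when it arrives 15 or more
# minutes late; the text does not state a threshold.
DELAY_THRESHOLD_MIN = 15


def is_delayed(arrival_delay_min, threshold=DELAY_THRESHOLD_MIN):
    """Binary target for objective 1 (arrival delay: yes or no)."""
    return arrival_delay_min >= threshold


def accuracy(actual, predicted):
    """Share of predictions that match the actual labels."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def mean_absolute_error(actual, predicted):
    """Average size of the error, in minutes, for objective 2."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Objective 1: made-up arrival delays in minutes and made-up model output
arrival_delays = [3, 42, 15, -5]
actual_labels = [is_delayed(m) for m in arrival_delays]
predicted_labels = [False, True, False, False]
classification_accuracy = accuracy(actual_labels, predicted_labels)

# Objective 2: made-up departure delay durations in minutes
actual_durations = [10, 30, 0]
predicted_durations = [20, 30, 5]
duration_error = mean_absolute_error(actual_durations, predicted_durations)
```

Keeping the two targets separate makes it clear why different model families are compared for each objective: classifiers are judged on how often the label is right, while regressors are judged on how far off the predicted minutes are.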