Dataset 1 :
Attributes :
Dataset 2 :
Attributes :
Figure 1 : Sum of arrival delay (minutes), grouped by aircraft.
The diagram above shows that aircraft 9MMUA and 9MMUD each made 30 flights, yet their total arrival delays differ: 12576 min and 7714 min respectively. Aircraft 9MSSE made 93 flights but has a total arrival delay of -735 min, meaning that on balance it arrived ahead of schedule. We can therefore conclude that 9MMUA is more prone to delay than 9MMUD, and that 9MSSE is the aircraft least prone to delay.
Figure 2 : Average arrival delay (minutes), grouped by aircraft.
The diagram above shows that 9MMUA has the highest average arrival delay, followed by 9MMUD, at 419.2 min and 257.1 min respectively. The lowest average arrival delay belongs to 9MSSE, at -7.9 min. We can therefore conclude that 9MMUA is the aircraft most prone to delay among all the aircraft, and 9MSSE the least.
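The averages reported for Figure 2 follow directly from the sums and flight counts quoted for Figure 1. A minimal check in Python, using only the three aircraft and the figures mentioned in the text (the dictionary layout and variable names are illustrative and not part of the original analysis):

```python
# Sum of arrival delay (minutes) and number of flights per aircraft,
# as quoted in the discussion of Figure 1.
totals = {
    "9MMUA": (12576, 30),
    "9MMUD": (7714, 30),
    "9MSSE": (-735, 93),
}

# Average arrival delay = total arrival delay / number of flights.
averages = {
    tail: round(total / count, 1)
    for tail, (total, count) in totals.items()
}

for tail, avg in averages.items():
    print(f"{tail}: {avg} min per flight")
```

The rounded results match the 419.2, 257.1 and -7.9 minutes read from the diagram.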
Figure 3 : Sum of arrival delay (minutes), grouped by departure airport.
The diagram shows that most flights depart from Kuala Lumpur International Airport (KLIA), which has the highest count of system data IDs of any airport. The second highest is Kota Kinabalu International Airport (KKIA), with a count of 460. Because KLIA is the most common departure airport in this dataset, it also has the highest sum of arrival delay.
Figure 4 : Average arrival delay (minutes), grouped by departure airport.
Figure 3 shows that most flights depart from Kuala Lumpur International Airport (IATA: KUL), followed by Kota Kinabalu International Airport (IATA: BKI). When we look at the average arrival delay instead, Figure 4 shows that the highest average belongs to Chongqing Jiangbei International Airport (IATA: CKG), followed by Kansai International Airport (IATA: KIX). Although KUL has 1820 delay records, its average arrival delay is lower than that of CKG, with an average of 18.4 min. We can therefore conclude that, although KUL is the most common departure airport in this dataset, flights departing from CKG have the highest average arrival delay.
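The contrast between Figures 3 and 4 comes from grouping the same records by departure airport and then comparing the sum with the mean. The sketch below shows the idea with a handful of invented records; the delay values are hypothetical and chosen only so that the airport with the most flights has the largest total but not the largest average.

```python
from collections import defaultdict

# Hypothetical (departure airport, arrival delay in minutes) records,
# invented for illustration; they are not taken from the dataset.
flights = [
    ("KUL", 10), ("KUL", 25), ("KUL", -5), ("KUL", 30), ("KUL", 20),
    ("BKI", 15), ("BKI", 5),
    ("CKG", 60),
]

# Collect the delays of each airport.
delays_by_airport = defaultdict(list)
for airport, delay in flights:
    delays_by_airport[airport].append(delay)

# Count, sum and mean of arrival delay per departure airport.
summary = {
    airport: {
        "count": len(delays),
        "sum": sum(delays),
        "mean": sum(delays) / len(delays),
    }
    for airport, delays in delays_by_airport.items()
}

largest_total = max(summary, key=lambda a: summary[a]["sum"])
largest_mean = max(summary, key=lambda a: summary[a]["mean"])
print("Largest total delay:", largest_total)
print("Largest average delay:", largest_mean)
```

With these toy numbers the largest total belongs to the airport with the most records, while the largest mean belongs to an airport with a single, heavily delayed flight. This is why the sum alone is a poor indicator of how delay-prone an airport is.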
Figure 5 : Average arrival delay (minutes), grouped by two departure airports and latest tail number.
In this diagram, Kuala Lumpur International Airport (KLIA) and Kota Kinabalu International Airport (KKIA) have been selected as the departure airports. Among flights departing from these two airports, 9MMUA is the aircraft most prone to delay.
Figure 6 : Sum of arrival delay (minutes), grouped by flight date.
Figure 7 : Average arrival delay (minutes), grouped by flight date.
The two diagrams above (Figures 6 and 7) show that the sum and the average of arrival delay follow almost the same pattern. Flights between the 4th and the 7th are more likely to be delayed than flights on other dates, and arrival delay peaks on 6 January 2018.
Figure 8 : Average arrival delay (minutes), grouped by taxi-out time.
The diagram shows that flights with a taxi-out time between 38 and 80 are more likely to be delayed. Flights with a taxi-out time between 0 and 8 have the lowest average arrival delay and are therefore the least likely to be delayed.
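One way to compare ranges of taxi-out time, as the paragraph above does, is to put the taxi-out times into fixed-width bins and average the arrival delay within each bin. The sketch below assumes a bin width of 10, which the report does not specify, and uses invented records purely for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (taxi-out time, arrival delay in minutes) pairs,
# invented for illustration; they are not taken from the dataset.
records = [(5, -3), (7, 1), (12, 4), (18, 6), (40, 35), (55, 50), (79, 42)]

BIN_WIDTH = 10  # assumed bin width; the report does not state one

# Assign each record to the bin that starts at a multiple of BIN_WIDTH.
delays_by_bin = defaultdict(list)
for taxi_out, delay in records:
    bin_start = (taxi_out // BIN_WIDTH) * BIN_WIDTH
    delays_by_bin[bin_start].append(delay)

# Average arrival delay per taxi-out bin, ordered by bin start.
avg_delay_by_bin = {
    bin_start: mean(delays)
    for bin_start, delays in sorted(delays_by_bin.items())
}

for bin_start, avg in avg_delay_by_bin.items():
    print(f"{bin_start}-{bin_start + BIN_WIDTH - 1}: {avg:.1f} min")
```

Bins that contain very few flights give unstable averages, so the count per bin is worth inspecting alongside the mean.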
Figure 9 : Average delay time (minutes) for each delay code, grouped by two departure airports.
KUL and BKI were selected for this diagram because they have the highest counts of system data IDs of all the departure airports in this dataset. The diagram shows that both airports share the same main problem: miscellaneous delay has the highest average delay time at both. For KUL, the factors that most affect departure time are air traffic control, operation, miscellaneous and technical problems, while for BKI the most common factors are miscellaneous, technical, air traffic control and handling.
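A ranking of delay types per airport, like the one described above, can be obtained by averaging the delay minutes of each delay type and sorting the averages in descending order. The sketch below uses invented values and only a few of the delay types; the nested-dictionary layout is an assumption about how the data might be held and does not reflect the dataset's actual columns.

```python
from statistics import mean

# Hypothetical delay minutes per flight, by delay type and departure airport.
# Values are invented for illustration; they are not taken from the dataset.
delay_minutes = {
    "KUL": {
        "Miscellaneous": [0, 40, 20],
        "Air Traffic Control": [10, 0, 20],
        "Weather": [0, 0, 3],
    },
    "BKI": {
        "Miscellaneous": [30, 0, 30],
        "Technical": [0, 25, 5],
        "Weather": [0, 0, 0],
    },
}

# For each airport, list (delay type, average minutes) from highest to lowest.
ranking = {
    airport: sorted(
        ((code, mean(values)) for code, values in codes.items()),
        key=lambda item: item[1],
        reverse=True,
    )
    for airport, codes in delay_minutes.items()
}

for airport, ranked in ranking.items():
    print(airport, ranked)
```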
Figure 10 : Count of delay records against delay time (Air Traffic Control).
Figure 11 : Count of delay records against delay time (Damage/Failure).
Figure 12 : Count of delay records against delay time (Handling).
Figure 13 : Count of delay records against delay time (Internal).
Figure 14 : Count of delay records against delay time (Miscellaneous).
Figure 15 : Count of delay records against delay time (Operation).
Figure 16 : Count of delay records against delay time (Passenger).
Figure 17 : Count of delay records against delay time (Technical).
Figure 18 : Count of delay records against delay time (Weather).
Table 1 : List of every delay type and its counts.
Figures 10 to 18 show that, for every delay type, a delay time of 0 accounts for approximately 90% of the dataset. As Table 1 shows, every delay type has about 4400 to 5600 records in which the delay time is 0. We can therefore conclude that the dataset is imbalanced: for each delay type, only about 10% of the records carry a non-zero delay that can be used in the data mining task, so the results will be somewhat biased and less accurate.
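The share of zero values for each delay type can be measured directly before any modelling. The sketch below uses two short, invented columns; `zero_share` is a hypothetical helper name, and the values are not taken from the dataset.

```python
# Hypothetical delay-minute columns, invented for illustration;
# they are not taken from the dataset.
delay_columns = {
    "Weather": [0, 0, 0, 0, 0, 0, 0, 0, 0, 12],
    "Technical": [0, 0, 0, 0, 0, 0, 0, 0, 45, 30],
}


def zero_share(values):
    """Return the fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Fraction of zero-delay records per delay type.
shares = {name: zero_share(values) for name, values in delay_columns.items()}

# Records with a non-zero delay, i.e. the part that carries information.
non_zero = {
    name: [v for v in values if v != 0]
    for name, values in delay_columns.items()
}

for name, share in shares.items():
    print(f"{name}: {share:.0%} zeros, {len(non_zero[name])} non-zero records")
```

A zero share close to 90%, as reported above, means that a model could score well simply by always predicting no delay, so the non-zero records deserve separate attention.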