Kaggle Airline Delay & Cancellation Analysis (SQL)

Analysis of Domestic Airline Cancellation and Delays (2009-2018)

Motivation:

The global aviation industry was one of the largest industries by revenue in 2020. According to a Forbes report, if aviation were a country, it would be the world’s 20th largest by GDP. It supports nearly $2.7 trillion in world economic activity every year (3.6% of global gross domestic product) [1]. According to ICAO’s preliminary compilation of annual global statistics, the total number of passengers carried on scheduled services rose to 4.3 billion in 2018, 6.4 per cent higher than the previous year, while the number of departures reached 37.8 million, a 3.5 per cent increase.

According to AviationOutlook, overall worldwide revenue rose from $754 billion in 2017 to $824 billion in 2018 (+9.4% growth), with North America contributing a significant share.

North America contributed 22.4 per cent of world traffic in 2018 [2]. That year, North America’s RPK (revenue passenger-kilometres) was 39.9 per cent of total RPK, posting growth of 5.2 per cent.

The aviation industry plays a major role in global commerce, not just in tourism. According to Investopedia, the money that comes from business travelers outweighs that from passengers flying for leisure or personal reasons: on some flights, business passengers represent 75% of an airline’s profits.

While the aviation industry is growing rapidly year over year, its losses remain high. One of the major contributors to these losses is the delays and cancellations that occur every hour. Flight delays and cancellations, whether minor or major, cost both airports and airlines thousands to millions of dollars in revenue annually.

Thus, for my project, I chose a Kaggle dataset that provides data points on delays and cancellations for the period 2009 through 2018. Because the dataset is very recent, it gives a realistic picture of the industry's recent performance.