The first step in any data analysis is to load the data, understand it, and produce initial plots that show its shape. At this stage it is a good idea to look for trends, so that a lack of understanding of the data does not bias the later analysis. For this purpose, we carry out the following exploratory data analysis.
To start the analysis of the different polluting gases, we look at how their concentrations changed over the period we are investigating (2016 to February 2020). Interactive Chart 1 shows this evolution for 5 stations (one in each tab).
For each air quality station, we can see the evolution of the pollutants it measures. If we focus on the measurements from the Plaza del Carmen station, which is the one inside Madrid Central, the pollutant with the highest concentration is NOx, while the rest follow a similar distribution over time.
This plot gives us more insight into which gases are measured where, and the typical amounts measured. Both NO2 and PM2.5 seem to be problems that should be taken into account in the city of Madrid.
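As a rough illustration of how a per-station pollutant profile like the one in the chart could be prepared, here is a minimal pandas sketch. The column names (`date`, `station`, `pollutant`, `concentration`), the helper `monthly_profile` and all the sample values are assumptions made up for this example; they are not the real dataset.

```python
import pandas as pd

# Made-up sample readings; column names and values are assumptions.
air = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-01-05", "2019-01-20", "2019-02-05", "2019-01-05", "2019-02-05"]
    ),
    "station": ["Plaza del Carmen"] * 5,
    "pollutant": ["NOx", "NOx", "NOx", "NO2", "NO2"],
    "concentration": [80.0, 100.0, 70.0, 30.0, 40.0],
})

def monthly_profile(df, station):
    """Monthly mean concentration per pollutant for one station (hypothetical helper)."""
    subset = df[df["station"] == station]
    subset = subset.assign(month=subset["date"].dt.to_period("M"))
    # Rows: months, columns: pollutants
    return (
        subset.groupby(["month", "pollutant"])["concentration"]
        .mean()
        .unstack("pollutant")
    )

profile = monthly_profile(air, "Plaza del Carmen")
print(profile)

# Pollutant with the highest average monthly concentration
dominant = profile.mean().idxmax()
print(dominant)
```

A wide table like `profile` (one column per pollutant) can be passed directly to most plotting libraries to draw one line per gas.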
We start the traffic analysis by comparing the traffic intensity in each district and at each traffic measurement point, to get a more detailed overview of how busy the districts are. Interactive Map X shows the average intensity per district and at its traffic points, from 2016 to February 2020.
From the map, we can see that Centro is not the busiest district. We do not know yet whether this is thanks to the measures or whether it is simply not a very busy district. What we do know, and what may also be influenced by the measures, is that the surrounding districts (such as Arganzuela, Retiro or Moncloa) are some of the busiest in the city, so it will be interesting to see whether there is a border effect because of Madrid Central.
If we look at the traffic measurement points individually instead of at districts as a whole, we observe an expected result: traffic intensity on the main roads of Madrid is higher than on the secondary roads. The ring roads of Madrid can be traced by following the yellow points.
We can also see that, inside Madrid Central, the busiest road is Gran Vía, which is the main commercial road in the district. We also see a lot of traffic surrounding Madrid Central, and it will again be interesting to investigate whether this is normal traffic flow or a consequence of the measures.
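The averages behind a map like this one could be computed along the following lines. This is only a sketch: the column names (`district`, `point_id`, `intensity`), the helper `average_intensity` and the numbers are hypothetical and chosen for illustration.

```python
import pandas as pd

# Made-up sample; column names and values are assumptions.
traffic = pd.DataFrame({
    "date": pd.to_datetime(
        ["2019-01-01", "2019-01-01", "2019-01-02", "2019-01-02"]
    ),
    "district": ["Centro", "Retiro", "Centro", "Retiro"],
    "point_id": [1001, 2001, 1001, 2001],
    "intensity": [300.0, 500.0, 340.0, 460.0],
})

def average_intensity(df, by):
    """Mean intensity per group, sorted from busiest to quietest (hypothetical helper)."""
    return df.groupby(by)["intensity"].mean().sort_values(ascending=False)

# One value per district (the shading of each district on the map)
per_district = average_intensity(traffic, "district")
# One value per measurement point (the colored points on the map)
per_point = average_intensity(traffic, ["district", "point_id"])

print(per_district)
print(per_point)
```

Keeping the grouping key as a parameter means the same function serves both levels of detail, district and measurement point.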
An overall average is not enough for a time series analysis, so we next look at the traffic intensity per district per day (from 2016 to February 2020), in order to see how traffic evolves over time.
With 21 districts, the chart becomes a bit overwhelming. But if we display only the district or districts that we want to inspect or compare, we can follow the evolution of the traffic intensity over the days.
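A daily series per district, with the option of keeping only a few districts, might be built as in the sketch below. The column names (`timestamp`, `district`, `intensity`), the helper `daily_by_district` and the sample readings are assumptions for illustration only.

```python
import pandas as pd

# Made-up sample readings; column names and values are assumptions.
readings = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2019-05-01 08:00", "2019-05-01 18:00",
        "2019-05-02 08:00", "2019-05-02 18:00",
        "2019-05-01 08:00", "2019-05-02 08:00",
    ]),
    "district": ["Centro", "Centro", "Centro", "Centro", "Retiro", "Retiro"],
    "intensity": [200.0, 400.0, 100.0, 300.0, 600.0, 500.0],
})

def daily_by_district(df):
    """Mean intensity per day and district, one column per district (hypothetical helper)."""
    with_day = df.assign(day=df["timestamp"].dt.normalize())
    return (
        with_day.groupby(["day", "district"])["intensity"]
        .mean()
        .unstack("district")
    )

daily = daily_by_district(readings)
# Keep only the districts we want to compare, to avoid an overcrowded chart
selected = daily[["Centro"]]

print(daily)
print(selected)
```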
We see again that Centro is not the district with the most traffic, and that almost all the districts follow the same trends. On a day with less traffic, the decrease appears in all the districts, in proportion to each district's own intensity. We suspect that the days on which traffic increases or decreases in all districts are public holidays or days of some celebration. Let's find out whether those days correspond to specific dates.
Just as we thought, almost all the peaks and valleys coincide with public holidays in Madrid, which gives us an outline of the traffic behavior. We also observe that the Madrid Central area follows the same average trend as the whole city of Madrid.
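One simple way to check a suspicion like this is to label each day as a holiday or a regular day and compare the averages. The sketch below assumes a daily series indexed by date; the intensity values are invented, the holiday list is a short illustrative subset (New Year's Day and San Isidro), and `holiday_summary` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

# Made-up daily city-wide intensities; the values are assumptions.
daily_city = pd.Series(
    [900.0, 2000.0, 2100.0, 1000.0, 1900.0],
    index=pd.to_datetime(
        ["2019-01-01", "2019-01-02", "2019-05-14", "2019-05-15", "2019-05-16"]
    ),
    name="intensity",
)

# Illustrative subset of holidays: New Year's Day and San Isidro (15 May)
holidays = pd.to_datetime(["2019-01-01", "2019-05-15"])

def holiday_summary(series, holiday_dates):
    """Mean intensity on holidays versus regular days (hypothetical helper)."""
    labels = np.where(series.index.isin(holiday_dates), "holiday", "regular")
    return series.groupby(labels).mean()

summary = holiday_summary(daily_city, holidays)
print(summary)
```

With real data, a complete holiday calendar for the city would replace the two dates used here.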
Now that we have seen the whole picture of how traffic intensity evolved during the period we are investigating, we focus on a yearly, monthly and weekly comparison between the Madrid Central area and the city of Madrid.
From this first plot we see that in 2016 and 2017 the average daily traffic intensity was higher in Madrid Central than in the rest of the city, but this changed in 2018. From 2018 onwards, the Madrid Central average decreases and falls below the city average.
Regarding the average daily traffic intensity by month, it is more or less the same for Madrid Central and the city. Still, it is worth noting that for some months the Madrid Central average is a bit higher.
This happens, for example, during the summer holidays (June, July, August, September) or the Easter holidays (March-April, depending on the year), which could be related to more tourists visiting the city center (the Madrid Central area).
Lastly, the weekly plot shows that Madrid Central has a higher average daily traffic intensity than the rest of the city during the weekends.
It also shows that, throughout the week, the average increases progressively in both cases.
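The yearly, monthly and weekly comparisons described above all follow the same pattern: derive a calendar key from the date, then average per key and zone. A minimal sketch follows; the column names (`date`, `zone`, `intensity`), the helper `average_by_period` and every number are assumptions invented for the example and do not reproduce the real results.

```python
import pandas as pd

# Made-up daily averages for two zones; names and values are assumptions.
daily = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-03-06", "2017-03-11", "2018-07-02", "2018-07-07"] * 2
    ),
    "zone": ["Madrid Central"] * 4 + ["City"] * 4,
    "intensity": [500.0, 450.0, 400.0, 380.0, 480.0, 350.0, 470.0, 340.0],
})

def average_by_period(df, period):
    """Mean daily intensity per calendar period and zone (hypothetical helper)."""
    extractors = {
        "year": lambda d: d.dt.year,
        "month": lambda d: d.dt.month,
        "weekday": lambda d: d.dt.dayofweek,  # Monday=0 ... Sunday=6
    }
    keyed = df.assign(**{period: extractors[period](df["date"])})
    return keyed.groupby([period, "zone"])["intensity"].mean().unstack("zone")

yearly = average_by_period(daily, "year")
monthly = average_by_period(daily, "month")
weekly = average_by_period(daily, "weekday")

print(yearly)
print(monthly)
print(weekly)
```

Each result has one row per period and one column per zone, so the two zones can be plotted side by side.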
Now we analyze how traffic behaves across the months of the year inside the Madrid Central area. Interactive Chart X shows that in summer (June, July, August) the traffic intensity decreases a lot, especially in August. We expected this, as well as a small decrease at Christmastime (December, January): during holiday periods, people tend to leave the city center.
With this plot, we are interested in seeing how traffic evolves over a year, and in comparing the different years.
We can also see that 2018 and 2019 have a lower average monthly traffic intensity than 2016 and 2017.
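A month-by-year table is a convenient shape for this kind of comparison, since each year becomes one line on the chart. The sketch below uses invented values and assumed column names (`date`, `intensity`); `month_by_year` is a hypothetical helper.

```python
import pandas as pd

# Made-up Madrid Central daily values; names and numbers are assumptions.
central = pd.DataFrame({
    "date": pd.to_datetime(
        ["2017-07-10", "2017-08-10", "2019-07-10", "2019-08-10"]
    ),
    "intensity": [400.0, 300.0, 350.0, 250.0],
})

def month_by_year(df):
    """Mean intensity with months as rows and years as columns (hypothetical helper)."""
    keyed = df.assign(month=df["date"].dt.month, year=df["date"].dt.year)
    return keyed.pivot_table(
        index="month", columns="year", values="intensity", aggfunc="mean"
    )

table = month_by_year(central)
# Average of the monthly values, one number per year
yearly_mean = table.mean()

print(table)
print(yearly_mean)
```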
Now that we have finished the exploratory analysis, we can follow it up with a more in-depth analysis to draw relevant conclusions and discover how effective Madrid Central was.