INFERRING FACTORS THAT MAY AFFECT DEPENDENCY ON PUBLIC TAXI RIDERSHIP
A study on Chicago taxi ridership pre- and post-pandemic as per community areas
By Aishwarya Mohgaonkar, Dishaddra Poddar | Under the guidance of Dr. Clio Andris
Georgia Institute of Technology

Introduction:
The global pandemic affected cities and people in many ways, including a drastic change in travel within and outside of cities. This study is based in the city of Chicago, Illinois, and focuses on the shift in dependency on local taxi services during the pandemic. It aims to draw inferences at the community-area level to understand why some places depend on taxi rides differently from others. Alongside the effect of the pandemic, rideshare companies like Uber and Lyft have greatly decreased the value of taxi medallions (Source: https://www.nbcchicago.com/news/local/chicago-taxi-industry-hit-hard-amid-pandemic-ride-share-trip-increases/2502335/), further reducing the demand for local taxi rides. Given these factors, was the shift in taxi dependency and ridership uniform across the city? If not, what other factors drive this dependency?
There are 77 Community Areas in Chicago
Significance of Study
With the surge in dependency on taxis, Uber and Lyft, it is imperative to study the dependency on and usage of public transit in order to assess its accessibility, especially with respect to affordability and preference. This study also aims to identify areas that have a negative difference in taxi ridership (that is, a rise in pick-ups from 2019 to 2020), especially during a pandemic, when most people tend to avoid public transport and resort to private vehicles. This would help uncover the socio-economic reasons behind such a pattern of taxi dependency.
Data Source: https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew
Data Owner: City of Chicago
Research Process:
1. We used RStudio to read in and study the large dataset on Chicago’s taxi pick-ups and drop-offs for every month of 2019 and 2020. We limited ourselves to pick-up locations only, to understand the drop or rise in pick-ups per community area as an effect of the global pandemic. We then extracted the month from each timestamp and grouped the trips by community area. Below are the results after plotting the pick-up locations for each year on the map of Chicago.
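The month extraction and grouping in step 1 can be sketched as follows. The study itself was carried out in R; this is only a minimal Python illustration. The column names and the timestamp format are assumptions about the portal's CSV export, and the sample rows are made up.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Assumed column names and timestamp format of the CSV export (not confirmed by the study).
TIMESTAMP_COL = "Trip Start Timestamp"
AREA_COL = "Pickup Community Area"
TIMESTAMP_FMT = "%m/%d/%Y %I:%M:%S %p"


def monthly_pickups(rows):
    """Count pick-ups per (community area, year, month)."""
    counts = Counter()
    for row in rows:
        area = (row.get(AREA_COL) or "").strip()
        if not area:
            continue  # skip trips with no recorded pick-up community area
        started = datetime.strptime(row[TIMESTAMP_COL], TIMESTAMP_FMT)
        counts[(int(float(area)), started.year, started.month)] += 1
    return counts


# Tiny made-up sample standing in for the real export.
sample = io.StringIO(
    "Trip Start Timestamp,Pickup Community Area\n"
    "01/15/2019 08:30:00 AM,8\n"
    "01/20/2019 09:00:00 PM,8\n"
    "04/02/2020 07:45:00 AM,8\n"
    "04/03/2020 10:15:00 AM,\n"
)
pickups = monthly_pickups(csv.DictReader(sample))
```

Reading rows one at a time keeps memory use low, which matters for a dataset of this size.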
2. To study and compare the effect of the pandemic on ridership, we first merged the data for the two years and then derived the difference in rides (specifically pick-ups) per community area between 2019 (pre-pandemic) and 2020 (peak pandemic). Next, we performed spatial joins in GIS to relate these differences to CTA railway stations. This helped us study whether there are any notable patterns between the differences in ridership and the extent of public transit.
Note: The data is normalised to avoid any drastic results due to other measures such as difference in population.
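A minimal sketch of the year-over-year difference is given below. The text does not state how the data was normalised, so dividing by population (pick-ups per 1,000 residents) is an assumption, as are the function name and the made-up figures. The difference is taken as 2019 minus 2020, so a negative value means ridership rose.

```python
def ridership_difference(pickups_2019, pickups_2020, population):
    """Return the 2019-minus-2020 pick-up difference per 1,000 residents.

    Per-capita normalisation is an assumption; the study only says the
    data was normalised.
    """
    diff = {}
    for area in pickups_2019.keys() | pickups_2020.keys():
        if not population.get(area):
            continue  # no population figure available, so no normalised value
        raw = pickups_2019.get(area, 0) - pickups_2020.get(area, 0)
        diff[area] = 1000 * raw / population[area]
    return diff


# Hypothetical community areas "A" and "B" with made-up counts.
pickups_2019 = {"A": 5000, "B": 300}
pickups_2020 = {"A": 1000, "B": 360}
population = {"A": 80000, "B": 20000}

diff = ridership_difference(pickups_2019, pickups_2020, population)
```

In this made-up example, area "A" shows a large positive difference (a drop in pick-ups) while area "B" shows a small negative one (a rise).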
Observations
The majority of the community areas with a significant drop in ridership in 2020 are well connected to transit. On the other hand, areas with an insignificant change in ridership (areas in the south) are not.
2.1 Performing the ANOVA Test to determine accuracy of interdependencies
ANOVA is a statistical test for estimating how a quantitative dependent variable changes according to the levels of one or more categorical independent variables. ANOVA tests whether there is a difference in the means of the groups at each level of the independent variable (Source: https://www.scribbr.com/statistics/anova-in-r/). The most important takeaway is the p-value (indicated in red in the table below). P-values determine whether the hypothesis test results are statistically significant (https://statisticsbyjim.com/hypothesis-testing/interpreting-p-values/). Low p-values indicate strong evidence against the null hypothesis.
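To make the test concrete, here is a minimal sketch of the one-way ANOVA F statistic computed by hand. The grouping (community areas with and without a rail station) and the values are illustrative assumptions, not the study's data. The study ran its test in R.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    n_total = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up normalised ridership differences for two hypothetical groups.
with_rail = [40, 50, 60]
without_rail = [0, 5, 10]

f_stat, df_between, df_within = one_way_anova_f([with_rail, without_rail])
# The p-value is the upper tail of the F distribution at f_stat with
# (df_between, df_within) degrees of freedom; a statistics library is
# needed for that step.
```

A large F statistic means the group means differ by more than the spread within each group would suggest, which is what produces a small p-value.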
The low p-value means that a difference this large would be very unlikely if the null hypothesis were true, so the null hypothesis is rejected.
Thesis 1: There is a statistically significant relationship between dependency on cab rides and the availability of public transport systems.
Follow-up question: What additional factors might explain the small change in ridership in some areas?

3. Drawing inferences on the impacts of socio-economic attributes on ridership differences (Hardship Index)
3.1 A Hardship Index is a tool to compare economic and financial stresses between communities. It can be used to paint a comprehensive picture of a city or country that may need interventions that could help uplift these communities economically. The U.S. Census Bureau’s American Community Survey is a national monthly survey that produces demographic, socioeconomic, employment, income, education and behavioural estimates for households and individuals. About 3.54 million addresses are sampled each year nationwide (Source: https://greatcities.uic.edu/wp-content/uploads/2016/07/GCI-Hardship-Index-Fact-SheetV2.pdf).

Hardship Index in Chicago
The economic hardship score is the median of six variables that have been standardized on a scale from 0 to 100. The six variables include:
1. Unemployment
2. Education
3. Per capita income
4. Crowded housing
5. Dependency
Higher hardship index scores indicate worse economic conditions (Source: https://greatcities.uic.edu/wp-content/uploads/2016/07/GCI-Hardship-Index-Fact-SheetV2.pdf).
Source: https://greatcities.uic.edu/wp-content/uploads/2016/07/GCI-Hardship-Index-Fact-SheetV2.pdf
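The score described above (a median of variables standardised to a 0 to 100 scale) can be sketched as follows. Min-max scaling and the inversion of per capita income, so that a higher score always means more hardship, are assumptions about how the standardisation is done. The variable names and values are made up, and only the variables named in the text are used.

```python
from statistics import median


def standardise(values, invert=False):
    """Min-max scale values to 0-100 (assumed method); optionally flip the scale."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # no variation between areas
    scaled = [100 * (v - lo) / (hi - lo) for v in values]
    return [100 - s for s in scaled] if invert else scaled


def hardship_scores(indicators, inverted=("per_capita_income",)):
    """Return one score per area: the median of its standardised indicators."""
    columns = {
        name: standardise(values, invert=name in inverted)
        for name, values in indicators.items()
    }
    n_areas = len(next(iter(columns.values())))
    return [median(col[i] for col in columns.values()) for i in range(n_areas)]


# Made-up indicator values for three hypothetical community areas.
indicators = {
    "unemployment": [5, 10, 20],
    "education": [5, 15, 25],
    "per_capita_income": [60000, 30000, 15000],
    "crowded_housing": [1, 3, 6],
    "dependency": [25, 35, 45],
}
scores = hardship_scores(indicators)
```

Using the median rather than the mean keeps a single extreme indicator from dominating an area's score.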
4. Next, to visualise and test whether Chicago’s hardship indices are connected to the difference in ridership, we carried out another round of spatial joins to study the interdependency.
Note: The data is normalised to avoid any drastic results due to other measures such as difference in population.
Observations
A distinct pattern emerged when hardship indices were compared with the ridership difference across community areas. The generated map shows that community areas with a small change in ridership (darker areas on the map) tended to have higher hardship indices, whereas community areas with a large change in ridership (lighter areas on the map) had relatively low hardship indices.
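One simple way to quantify the pattern described above is a Pearson correlation between the hardship index and the ridership difference. This is an illustrative sketch on made-up values, not the test used in the study, which relied on ANOVA.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values for five hypothetical community areas.
hardship = [10, 30, 50, 70, 90]
ridership_diff = [48, 41, 30, 22, 9]  # 2019 minus 2020, normalised

r = pearson_r(hardship, ridership_diff)
```

A strongly negative coefficient would match the observation: the higher the hardship index, the smaller the change in ridership.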
4.1. Performing an ANOVA Test
To check the above observation and the relationship between the Hardship Index and the difference in ridership, the ANOVA test was performed again. Below is the table derived from the test.
Again, the low p-value means that a relationship this strong would be very unlikely if the null hypothesis were true.
Thesis 2: The relationship between the Hardship Index and the dependency on taxi rides is highly statistically significant.
INFERENCES
In this study we develop a methodology to understand preferential patterns of taxi ridership in community areas by focusing on two significantly different years and analysing their levels of accommodation or change. By studying patterns during a natural calamity like a global pandemic, we thought that the privileges or under-privileges of communities could be better understood in terms of proportions of taxi ridership. We systematically studied large-scale data on pick-up locations within the city of Chicago, focusing on public taxis. By deriving the difference in ridership per community area, we were able to identify areas whose ridership behaved no differently during the pandemic.
One would think that a global pandemic would change the way the general public chose to travel, and that the inherent reaction would be to keep away from public transport as much as possible. However, certain areas did not show a decline but rather a rise in taxi ridership dependency (the difference was a negative value), which drove us to explore two major factors that may have been the reason for this.
1. The noticeable lack of public transport networks in these areas can be one reason for a rise in taxi dependency. Communities that are not well connected to public transport may be compelled to depend on taxis for their commute.
2. Additionally, the strong and direct relationship between the Hardship Index and the difference in ridership speaks a lot about the helplessness of these communities. The fact that these community areas consistently have high hardship indices may mean that the people of these communities cannot afford to stay at home and need to move around the city to work and earn a living.
The study tells us a lot about the vulnerability of some communities, especially during a public health crisis. We think that this analysis can help the city take steps towards better, more holistic physical and public infrastructure improvement.