Big Aggregate Data | 20.303 Urban Analysis


20.016 Urban Analysis

BIG, AGGREGATE DATA

Tanjong Pagar, Arab Street and Jalan Besar

Clifford Mario Kosasih (1000294)
Goh Pei Xuan (1000286)
Kevin Josiah Neo Jun Hao (1000133)
Oor Eiffel (1000293)
Sharon Ho Jia Jia (100091)


TABLE OF CONTENTS

1. INTRODUCTION
2. RESEARCH QUESTION
3. LITERATURE REVIEW
4. METHODOLOGY
5. HYPOTHESES
6. MAPS
7. BIG, AGGREGATE DATA
8. ANALYSIS
9. CONCLUSION
10. APPLICATION AND DESIGN RECOMMENDATIONS


1. INTRODUCTION

The rapid rise in the use of digital devices and geo-tagging technology has affected the fields of human geography, social science and urban planning in recent decades. The ease of accessing large amounts of data from diverse samples of people has reduced the need for manual surveys to collect data. This experiment capitalizes on data mining to predict and analyze collective human behavior and activities in three cafe districts in Singapore: Tanjong Pagar, Arab Street and Jalan Besar. Using Foursquare check-in data and Instagram post data, we compare the temporal variation of human activity in each area, and examine how that collective activity relates to the diversity of the area and to its proximity and accessibility to tourist destinations.

2. RESEARCH QUESTION

We are interested in understanding the spatio-temporal patterns of human activity generated from social media in different cafe districts, with respect to their urban attributes and characteristics. The three sites that we have chosen, Tanjong Pagar, Arab Street and Jalan Besar, are dedicated commercial districts filled with eating places such as restaurants and cafes.



3. LITERATURE REVIEW

Utilising location based services for the purpose of organising public life and planning urban spaces is increasingly becoming a reality with the availability and use of such services. Ahas R., Ülar M. (2005) discuss the implications and possibilities of communication technology. They state that, through social positioning, the use of real-time data would allow a better understanding of the results and implications of policy decisions. This is made more significant by the increasing precision of mobile positioning, which allows more accurate studies of patterns in the space-time movement of society.

Using check-ins from location sharing services to model patterns of mobility requires extracting data from which temporal and geographic characteristics can be analysed. Cheng, Z., Caverlee, J., Lee, K., & Sui, D. Z. (n.d.) have highlighted a clear process of formatting data and filtering noise from check-in collections to prepare information for analysis. They also analysed users' spatial activity using the radius of gyration, which indicates the spatial extent of activity, as one way of making the available metadata more informative.

To properly utilise big data as a research method for understanding urban settings, it is essential to process information in relation to hypotheses about the urban environment, leading to quantifiable results that provide relevant understanding and grounds for urban intervention. Chen and Zhang (2012) demonstrate the relation of social media patterns to culture, community and food diversity.



4. METHODOLOGY

We used both Foursquare and Instagram data for this experiment. The Foursquare data was provided, and we used it to obtain the Instagram data. Using a given Foursquare location ID, we can retrieve the corresponding Instagram location ID, which can then be used to track the number of media items (images and videos) posted at that location, along with attributes such as the number of likes, the number of comments, the hashtags and the media itself.

Foursquare ID: 4eaa97b68b8180c90b73fa03

Key in: https://api.instagram.com/v1/locations/search?foursquare_v2_id=4eaa97b68b8180c90b73fa03

Output:

    "latitude": 1.302992956,
    "id": "9339214",
    "longitude": 103.859214062,
    "name": "Hara Village Restaurant"

Key in: https://api.instagram.com/v1/locations/9339214/media/recent

Output (excerpt):

    "attribution": null,
    "tags": [
        "thecurrypuffincident"
    ],
    "type": "image",
    "location": {
        "latitude": 1.302992956,
        "name": "Hara Village Restaurant",
        "longitude": 103.859214062,
        "id": 9339214
    },
    "comments": {
        "count": 0,
        "data": []
    },
    "filter": "X-Pro II",
    "created_time": "1388222018",
    "link": "https://instagram.com/p/idbm4JPCrX/",
    "likes": {
        "count": 0,
        "data": []
    },
    "images": {
        "low_resolution": {
            "url": "https://scontent.cdninstagram.com/hphotos-xfp1/t51.2885-15/s306x306/e15/1515649_190096791186808_1241163523_n.jpg",
            "width": 306,
            "height": 306
        },
    "users_in_photo": [],
    "caption": {
        "created_time": "1388222018",
        "text": "very crispy but thick crusty cp.\n#thecurrypuffincident",
        "from": {
            "username": "iotz",
            "profile_picture": "https://igcdn-photos-d-a.akamaihd.net/hphotos-ak-xaf1/t51.2885-19/10860232_787456637956747_1693700934_a.jpg",
            "id": "2338849",
            "full_name": "Lifei"
        },
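The two-step lookup described above can be sketched in Python. This is only an illustration under stated assumptions: the endpoint paths are copied from the URLs quoted in the text, any authentication parameters the service may have required are not shown in the text and are left out, and the helper names (`location_search_url`, `recent_media_url`, `summarise_media`) are hypothetical. No network request is made; the sketch only builds the URLs and reads the fields of one record trimmed from the sample output.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.instagram.com/v1"  # endpoint root as quoted in the text


def location_search_url(foursquare_id: str) -> str:
    """Build the URL that maps a Foursquare venue ID to an Instagram location."""
    query = urlencode({"foursquare_v2_id": foursquare_id})
    return f"{API_ROOT}/locations/search?{query}"


def recent_media_url(instagram_location_id: str) -> str:
    """Build the URL that lists recent media posted at an Instagram location."""
    return f"{API_ROOT}/locations/{instagram_location_id}/media/recent"


def summarise_media(item: dict) -> dict:
    """Keep only the attributes used in this study from one media record."""
    return {
        "type": item.get("type"),
        "tags": item.get("tags", []),
        "likes": item.get("likes", {}).get("count", 0),
        "comments": item.get("comments", {}).get("count", 0),
        "created_time": int(item["created_time"]),  # Unix time in seconds
        "link": item.get("link"),
    }


# One record, trimmed from the sample output quoted in the text.
sample = {
    "type": "image",
    "tags": ["thecurrypuffincident"],
    "comments": {"count": 0, "data": []},
    "likes": {"count": 0, "data": []},
    "created_time": "1388222018",
    "link": "https://instagram.com/p/idbm4JPCrX/",
}

print(location_search_url("4eaa97b68b8180c90b73fa03"))
print(recent_media_url("9339214"))
print(summarise_media(sample))
```

Keeping the URL construction separate from the parsing means the same `summarise_media` step could be applied to every record returned for a location, whatever client is used to fetch it.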



5. HYPOTHESES

Our first hypothesis states that cafe districts in close proximity to tourist destinations have a lower user to check-in ratio. A lower user to check-in ratio, that is, fewer check-ins per unique user, would mean that the check-ins to that place are more unique (fewer instances of repeated check-ins by people who frequent the place). This rests on the assumption that no single user accounts for a disproportionate share of the repeated check-ins to a location, and that every user has an equal chance of making repeated check-ins to a particular location.

Our second hypothesis states that cafe districts with a larger quantity and diversity of food options have a higher number of check-ins. More food choices attract more consumers to the area, as they recognise these areas as food districts and tend to go there more frequently for their meals. This would in turn raise the chance of people visiting other nearby attractions before or after their meal and possibly checking in there too, resulting in more total check-ins to the whole cafe district.

Our third hypothesis states that food outlets in closer proximity to transportation nodes have higher patronage. Greater accessibility makes it easier for customers to get to these food outlets. It also increases the number of passers-by, creating a higher chance of them patronizing the outlet.
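As a small worked example of the first hypothesis, the sketch below computes a user to check-in ratio per category. It assumes the ratio is the number of check-ins divided by the number of unique users in each category, which the report does not state as a formula; the function name and the check-in records are hypothetical and are not taken from the study data.

```python
from collections import defaultdict


def checkins_per_user(checkins):
    """Return check-ins per unique user for each category.

    `checkins` is an iterable of (user_id, category) pairs, one per check-in.
    """
    totals = defaultdict(int)  # category -> number of check-ins
    users = defaultdict(set)   # category -> set of distinct users
    for user_id, category in checkins:
        totals[category] += 1
        users[category].add(user_id)
    return {cat: totals[cat] / len(users[cat]) for cat in totals}


# Hypothetical check-ins: three one-off visitors to food venues,
# and one office worker who checks in repeatedly.
records = [
    ("u1", "Food"), ("u2", "Food"), ("u3", "Food"),
    ("u4", "Professional"), ("u4", "Professional"),
    ("u4", "Professional"), ("u5", "Professional"),
]

ratios = checkins_per_user(records)
print(ratios)  # {'Food': 1.0, 'Professional': 2.0}
```

Under this reading, a value of 1 means every check-in came from a different user, and larger values indicate more repeat visits.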



6. MAPS

Site            Points   Food   Instagram media posts   Instagram food-related media posts
Tanjong Pagar   3316     1111   10160                   4701
Arab Street     2270     563    6408                    2267
Jalan Besar     1001     224    2413                    648


7. BIG, AGGREGATE DATA

Instagram posts in cafe districts (13 April - 19 April 2015)

Instagram posts count: http://cdb.io/1OuoLe8 and http://cdb.io/1D2Tc4k

Tanjong Pagar Map: http://cdb.io/1ItF4WS
Arab Street Map: http://cdb.io/1ItF1KL
Jalan Besar Map: http://cdb.io/1ItEWGL
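A daily post count of the kind plotted for this week can be derived from the `created_time` field of each media record. The sketch below is one possible way to do it, not necessarily the procedure used for the linked maps: it assumes `created_time` is a Unix timestamp in seconds and that days are counted in Singapore time (UTC+8). The function name is hypothetical. The first timestamp is the one quoted in the methodology sample, which falls outside the study week; the second is made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

SGT = timezone(timedelta(hours=8))  # Singapore time, UTC+8 (assumed for binning)


def posts_per_day(created_times, tz=SGT):
    """Count posts per local calendar day from Unix timestamps in seconds."""
    days = Counter()
    for ts in created_times:
        day = datetime.fromtimestamp(int(ts), tz).date().isoformat()
        days[day] += 1
    # ISO dates sort chronologically as plain strings.
    return dict(sorted(days.items()))


# Sample timestamp from the methodology, plus a hypothetical post one day later.
counts = posts_per_day(["1388222018", 1388222018 + 86400])
print(counts)  # {'2013-12-28': 1, '2013-12-29': 1}
```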



8. ANALYSIS

RELATIONSHIP BETWEEN PROXIMITY TO TOURIST DESTINATION AND USER TO CHECK-IN RATIO

[Figures: user to check-in ratio by category (axis 0 to 7) for Jalan Besar, Tanjong Pagar and Arab Street]

Among our 3 sites, we identified Arab Street as the cafe district located nearest to a tourist destination, Kampong Glam. A lower user to check-in ratio would mean that there are more unique check-ins to the area. Comparing the 3 graphs, we calculated that Arab Street has the lowest total user to check-in ratio at 22.3, compared with 28.9 for Jalan Besar and 35.0 for Tanjong Pagar. This supports our hypothesis: users checking in to Arab Street are more unique, as they are more likely to be tourists visiting the nearby Kampong Glam. Tourists are normally first-time visitors and are unlikely to make repeated trips to the same location, as they are only here for a short period and move on to other tourist attractions. Furthermore, the only peak in the Arab Street graph belongs to the Professional category (offices). It makes sense for the user to check-in ratio to be higher there, as people working in the area are likely to make multiple check-ins to the same location, their workplace.


RELATIONSHIP BETWEEN DIVERSITY OF FOOD OPTIONS AND NUMBER OF CHECK-INS

Diversity is defined by the number of different types of food options, as well as the quantity of each option. In this analysis, the diversity index is calculated using the Simpson's Index of Diversity method [1], which takes into account the number of different food outlet types as well as the relative abundance of each food option. An area with a higher diversity of food options is one that has more choices, with each choice present in similar quantity. A higher index value represents greater food diversity in the cafe district.

Simpson's Index of Diversity (D) formula, in the form given by the cited source:

$$ D = 1 - \frac{\sum n(n-1)}{N(N-1)} $$

where
n: quantity of each food option
N: total number of food outlets
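To make the index concrete, here is a minimal sketch of the calculation. It assumes the form of Simpson's Index of Diversity given by the cited source, D = 1 - Σ n(n-1) / (N(N-1)), with n the number of outlets of each food type and N the total number of outlets. The function name and the outlet counts are hypothetical and do not come from the study data.

```python
def simpson_diversity(counts):
    """Simpson's Index of Diversity: 1 - sum(n(n-1)) / (N(N-1)).

    `counts` holds the number of outlets of each food type.
    """
    total = sum(counts)
    if total < 2:
        raise ValueError("need at least two outlets to compute the index")
    return 1 - sum(n * (n - 1) for n in counts) / (total * (total - 1))


# Hypothetical districts with the same 20 outlets and 4 food types.
even = simpson_diversity([5, 5, 5, 5])     # outlets spread evenly
skewed = simpson_diversity([17, 1, 1, 1])  # one type dominates

print(round(even, 3), round(skewed, 3))  # 0.789 0.284
```

Both hypothetical districts offer four food types, but the evenly spread one scores higher, which is why the index reflects relative abundance and not just the number of options.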

Looking at the food diversity indices of the 3 cafe districts, Arab Street has the highest food diversity index, followed by Tanjong Pagar and lastly Jalan Besar. However, this does not show a clear correlation between the diversity of food options and the total number of check-ins to the districts. Jalan Besar is consistent with the hypothesis, having both the lowest food diversity index and the lowest number of check-ins. Arab Street, however, has the highest food diversity index but not the most check-ins, which belong to Tanjong Pagar. This could be due to the sheer number of attractions in Tanjong Pagar, which is able to accommodate more people, thus resulting in the greatest number of check-ins to the cafe district.

[1] Simpson’s Diversity Index. (2013, May 13). Retrieved April 19, 2015, from http://geographyfieldwork.com/Simpson’sDiversityIndex.htm



However, there is a strong correlation (R² = 1) between the food diversity index and the percentage of check-ins to food attractions in each cafe district. The greater the food diversity, the larger the proportion of total check-ins due to food. This suggests that a wider variety of food options attracts proportionately more people to the district for food, as the area becomes better known for its food options.
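The R² value quoted above can be reproduced with a few lines of Python once the three pairs of values are known. The report does not list the underlying numbers, so the values below are hypothetical placeholders chosen to lie exactly on a line; the function name is also hypothetical. R² is computed here as the square of the Pearson correlation, which equals the coefficient of determination for a simple linear fit.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical placeholders, one pair per district (not the study's figures).
diversity_index = [0.60, 0.70, 0.80]
food_checkin_share = [20.0, 25.0, 30.0]  # percent of check-ins at food venues

print(round(r_squared(diversity_index, food_checkin_share), 6))  # 1.0
```

Each district contributes a single point, so the fit rests on three observations.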

An interesting observation during the analysis was the disparity between the results from Instagram and those from Foursquare. The proportion of check-ins to food attractions was much higher on Instagram than on Foursquare. This could be due to the nature of the two platforms: Instagram is more visual than Foursquare, and pictures of what a food outlet offers are more enticing on a social media platform. It could also be due to the higher usage of Instagram [2] compared to Foursquare in Singapore.

[2] Social Media in Singapore 2014 [Infographic] | Social Media Statistics. (n.d.). Retrieved April 19, 2015, from http://www.hashmeta.com/social-media-singapore-infographic/



FOOD AND TRANSPORT (JALAN BESAR)

[Map annotations: Specialty Food Outlets; Hipster Cafes + Anchors]

Mapping the food clusters against a heat map of bus check-in locations shows little correlation between the areas of highest accessibility and those of higher patronage.

Instead, competitive clustering of specialty food outlets increases patronage in the area. Clustering of cafes facilitates cafe hopping, where patrons visit one cafe after another in the vicinity, which might also be a main contributor to the areas of higher patronage in Jalan Besar.



FOOD AND TRANSPORT (ARAB STREET)

[Map annotation: Rich Culture]

In the vicinity of Arab Street, the area of highest patronage does not coincide with a specific transportation node, but it lies at the centre of several transportation nodes, making it highly accessible. The densest cluster is along Bussorah Street, where the cuisine reflects the rich culture of the area.



FOOD AND TRANSPORT (TANJONG PAGAR)

[Map annotation: Diversity]

Patronage in the Tanjong Pagar area is generally well distributed with both food and transportation nodes dispersed in the area. A slightly higher patronage is observed along Smith Street, which is a small food street with a variety of international cuisine.
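The comparisons in these three sections were made visually on the maps. As a complementary numerical check, proximity to a transportation node could be measured as the great-circle distance from each food outlet to its nearest node. The sketch below uses the haversine formula and is only an illustration: the function names and the node coordinates are hypothetical, and the outlet coordinates are the ones quoted for the sample location in the methodology.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_node_distance(outlet, nodes):
    """Distance in metres from an outlet to its closest transportation node."""
    return min(haversine_m(outlet[0], outlet[1], lat, lon) for lat, lon in nodes)


# Outlet coordinates from the methodology sample; node coordinates are made up.
outlet = (1.302992956, 103.859214062)
hypothetical_nodes = [(1.3030, 103.8600), (1.3100, 103.8550)]

print(round(nearest_node_distance(outlet, hypothetical_nodes)), "metres")
```

Pairing this distance with each outlet's check-in count would allow the third hypothesis to be tested with a correlation instead of by eye.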



9. CONCLUSION

Cafe districts in close proximity to tourist destination districts have a lower user to check-in ratio. In our data, closer proximity of a cafe district to a tourist destination district is associated with a lower user to check-in ratio. This illustrates the extent of uniqueness of users in such cafe districts and the implied gravity of tourist attractions as anchors in districts.

Cafe districts with a larger quantity and diversity of food options have a higher number of check-ins. This was supported only to a limited extent: there is a positive correlation between a larger diversity of food options and a higher proportion of check-ins to food attractions in the same district. However, this does not apply to the general patronage of the cafe district, as there are other attractor points present in these areas.

Food outlets in closer proximity to transportation nodes do not necessarily have higher patronage. This contradicts our initial hypothesis that more accessible food outlets have a greater number of check-ins. From the data collected, we observe that the location of the food outlet within the cafe district has a greater impact on patronage, due to the positive externalities of business clustering. Additionally, customers may have walked or taken private transport to the area, which is less influenced by the position of transportation nodes in the district.

Our experiment utilised data gathered from Instagram and Foursquare, which may not have captured the entire demographic of visitors to these cafe districts, especially those of the older generation. However, it highlights the key features of a particular location and is a representation of the patronage trends in these cafe districts. Since the younger generation visits these areas more regularly, it is assumed that they form the majority of the customers in the area. Check-in data could also be gathered from other forms of social media, such as Facebook and Twitter, to get a more comprehensive picture of the check-in trends in these districts.



10. DESIGN RECOMMENDATIONS

As stated by Ahas R., Ülar M. (2005), Social Positioning Methods (SPM) have the capacity to influence the development of the urban environment through planning.

1. Advertising potential of cafes in cafe districts closer to tourist destinations
Our studies have indicated a correlation between lower user to check-in ratios and proximity to tourist destinations. This indicates the uniqueness of human traffic at such locations, which could be better serviced in the urban setting. More advertising platforms could be introduced at these locations to reach a more diverse spread of users, based on the implied diversity of human traffic present in such areas.

2. Wayfinding methods using tourist destinations as primary nodes and food outlets as secondary nodes
Informal directional instructions for wayfinding in neighbourhoods near tourist destinations might increasingly utilise food outlets as secondary placemakers to the primary placemakers of key tourist destinations. These might increasingly be used in signage and directional pamphlets to guide potential clientele and the public to and around specific businesses in the location.

3. Business strategies for businesses in clusters near tourist destinations
Collaborative business strategies could be executed by cafes located in such clusters near tourist destinations. An urban identity could be forged from the existing demographic diversity of users around tourist destinations, creating a stronger link within the urban neighbourhood and a more cohesive and integrated urban environment.



11. BIBLIOGRAPHY

Ahas R., Ülar M., 2005, Location based services-new challenges for planning and public administration?, Futures, 37:6 547-561

Cheng, Z., Caverlee, J., Lee, K., & Sui, D. Z. (n.d.). Exploring millions of footprints in location sharing services. ICWSM, 2011

Eagle, N., & Pentland, S. (2007). Eigenbehaviors: Identifying Structure in Routine. Behavioral Ecology and Sociobiology.

González, C. M., & Barabási, A.-L. (2008). Understanding individual human mobility patterns. Nature, 453, 779–782.

Liu, Liu (2014) C-IMAGE : city cognitive mapping through geo-tagged photos. MIT Thesis.

Sevtsuk, A., Ratti, C., 2010, Does Urban Mobility Have a Daily Routine? Learning from the Aggregate Data of Mobile Networks, Journal of Urban Technology, Volume: 17, Issue: 1, Pages: 41-60.

Simpson’s Diversity Index. (2013, May 13). Retrieved April 19, 2015, from http://geographyfieldwork.com/Simpson’sDiversityIndex.htm

Social Media in Singapore 2014 [Infographic] | Social Media Statistics. (n.d.). Retrieved April 19, 2015, from http://www.hashmeta.com/social-media-singapore-infographic/


