AUGMENTED DECISION-MAKING PROCESS IN URBAN PLANNING: TRADITIONAL SPATIAL ANALYSIS WITH REAL TIME API HOMELESSNESS IN SF
Physical Built Environment
Traditional Spatial Analysis
Twitter API Real Time Data
AUGMENTED DECISION-MAKING PROCESS IN URBAN PLANNING
People’s Opinion/ Feedback
HOMELESSNESS IN SF
Homelessness has long been one of the most striking problems in San Francisco.
PART 1: PEOPLE | WHERE DO HOMELESS PEOPLE MOSTLY ENCAMP? (traditional spatial analysis, GIS)
PART 2: RESOURCE | WHERE SHOULD FUTURE RESOURCES GO? (traditional spatial analysis, GIS)
PART 3: OPINION | WHAT AREAS ARE MORE FRIENDLY TO HOMELESS PEOPLE? (new method, Twitter API)
PEOPLE: where homeless people are
RESOURCE: current resource distribution
OPINION: friendly areas
GAP: what areas should be allocated more resources (Gap Analysis: what areas should be improved)
DONATIONS / VOLUNTEER: where donations or volunteer help could be obtained
PART 1 PEOPLE | WHERE DO HOMELESS PEOPLE MOSTLY ENCAMP?
Data Source: 311 cases of SF (1 GB)
311 data: ‘Homeless’ - ‘Individual Issues’
Cases of people calling the police to move homeless people from their neighborhood
10,000 cases in this category
Method:
- clean 311 data
- GIS ‘display XY data’ to map the coordinates
Encampment
311 data: ‘Encampment’ - people/items
Cases of people reporting encampment items they found
40,000 cases in this category
Method:
- clean 311 data
- GIS ‘display XY data’ to map the coordinates
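The two method steps for the 311 data (clean, then map the coordinates) could be sketched outside GIS roughly as follows. This is only an illustration: the column names `Category`, `Latitude`, and `Longitude` are assumptions, not the headers of the actual 311 export, and the sample rows are made up.

```python
import csv
import io

# Hypothetical column names; the real 311 export may use different headers.
CATEGORY_COL = "Category"
LAT_COL = "Latitude"
LON_COL = "Longitude"


def load_points(csv_file, category):
    """Return (lon, lat) tuples for rows of one category with usable coordinates."""
    points = []
    for row in csv.DictReader(csv_file):
        if (row.get(CATEGORY_COL) or "").strip() != category:
            continue
        try:
            lat = float(row[LAT_COL])
            lon = float(row[LON_COL])
        except (KeyError, TypeError, ValueError):
            continue  # cleaning step: drop rows with missing or malformed coordinates
        if lat == 0.0 and lon == 0.0:
            continue  # cleaning step: treat (0, 0) as a missing location
        points.append((lon, lat))
    return points


# Tiny made-up sample standing in for the 1 GB export
sample = io.StringIO(
    "Category,Latitude,Longitude\n"
    "Encampment,37.7810,-122.4131\n"
    "Encampment,,\n"
    "Graffiti,37.7600,-122.4200\n"
    "Encampment,37.7765,-122.4170\n"
)
encampments = load_points(sample, "Encampment")
print(len(encampments), "encampment points with coordinates")
```

The resulting (lon, lat) pairs play the role of the X and Y fields that ‘display XY data’ expects.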
Encampment Heat Map
GIS - ‘Spatial Analysis’, ‘Point Density’
Tenderloin, south of SOMA, Civic Center, and Nob Hill are the neighborhoods with the most encampments.
Legend: Homeless Encampment Density, 70.18 - 97.14, 97.15 - 160.08, 160.09 - 307.05, 307.06 - 650.26, 650.27 - 1,451.66, 1,451.67 - 3,323.01, 3,323.02 - 7,692.78, 7,692.79 - 17,896.58
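As a rough picture of what a point-density surface computes, the sketch below counts the points inside a circular neighborhood around each grid cell center and divides by the neighborhood area. The grid layout, the radius, and the sample points are assumptions for illustration; they are not the settings used for the heat map.

```python
import math


def point_density(points, x_min, y_min, cell, n_cols, n_rows, radius):
    """Points within `radius` of each cell center, divided by the circle's area."""
    area = math.pi * radius ** 2
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            cx = x_min + (c + 0.5) * cell  # cell center, x
            cy = y_min + (r + 0.5) * cell  # cell center, y
            count = sum(
                1 for (x, y) in points if math.hypot(x - cx, y - cy) <= radius
            )
            row.append(count / area)
        grid.append(row)
    return grid


# Made-up points in projected units (for example miles)
demo_points = [(0.5, 0.5), (0.6, 0.4), (2.5, 2.5)]
density = point_density(demo_points, x_min=0.0, y_min=0.0, cell=1.0,
                        n_cols=3, n_rows=3, radius=0.5)
for row in density:
    print([round(v, 2) for v in row])
```

Cells with many nearby reports get high values, which is what the darkest legend classes represent.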
PART 2 RESOURCE | WHAT AREAS SHOULD WE TARGET FOR FUTURE IMPROVEMENTS?
DATA SOURCE: Link-SF
Link-SF is San Francisco’s first mobile-optimized website that connects homeless and low-income residents with critical and life-saving resources nearby. It is designed and funded by St. Anthony Foundation and Zendesk.
+ food
+ hygiene (shower, laundry)
+ housing (shelter, homeless projects, church)
+ medical (clinics, therapy center, etc.)
+ technology (internet, computer, career training)
METHOD GIS:
1. MAP THE POINTS: coordinates or street locator
2. LAY BUFFERS: 3-mile maximum walking distance
3. ASSIGN SCORES: nearer, higher score (1 to 12)
4. OVERLAY LAYERS: calculate weighted sum score
5. SELECT AREAS: based on the percentile you want
FOOD RESOURCES
Churches, community centers, and NGOs that give out free meals daily or on certain days each week. Eligibility requirements (gender, age, etc.) were not taken into account.
GIS - ‘display XY data’, latitude, longitude
Map layer: food_homeless
Labeled providers: Treasure Island Homeless Development Initiative; St. Peter & Paul Catholic Church; Salvation Army Chinatown; Mary Elizabeth Inn; Kimochi Nutrition and Hot Meals Program & Senior Center; Glide Memorial Church; Booker T. Washington Community Svce Ctr.; St. Andrew Missionary Baptist Church; Project Open Hand; Curry Senior Center; Jones Memorial United Methodist Church; Fraternite Notre Dame; Economic Opportunity Council (EOC) - Commodity Food Program; City Team Ministries; Korean American Senior Service Supplemental Food Program; First Friendship Institutional Baptist Church; Haight Ashbury Food Program - Food Service Center; Martin De Porres House of Hospitality; Women's Building; Mission Neighborhood Center; Fill up America; Salvation Army, Mission; St. Aidan's; Catholic Charities - 30th St. Senior Center; Meals on Wheels of San Francisco, Inc.; St. Paul Tabernacle Baptist; Providence Foundation of SF; Our Lady of Lourdes; Bayview TLC Family Resource Center; United Council of Human Services - Oakdale; United Council of Human Services - Jennings; Church of God Prophecy; Catholic Charities - OMI Senior Center; Temple United Methodist Church; YMCA: OMI Family Resource Center
HOMELESS-RELIABLE RESOURCES
GIS - ‘display XY data’, latitude, longitude
+ food + hygiene + medical + housing + technology
Map layers: food_homeless, hygiene_homeless, medical_homeless, housing_homeless, technology_homeless
LAY BUFFERS
GIS - ‘Euclidean distance’, maximum distance 3 miles (roughly 1-hour walking)
Legend (food_homeless): distance values in ten equal classes, from 0 - 1,584 up to 14,257 - 15,840
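The buffer step can be read as: for every cell, take the straight-line distance to the nearest resource point and discard anything beyond 3 miles. A minimal sketch, assuming coordinates in feet (3 miles = 15,840 ft, which matches the top of the legend) and made-up resource locations:

```python
import math

MAX_DIST_FT = 3 * 5280  # 3 miles = 15,840 ft


def distance_to_nearest(cell_xy, resources, max_dist=MAX_DIST_FT):
    """Straight-line distance to the nearest resource, or None beyond max_dist."""
    nearest = min(
        math.hypot(cell_xy[0] - x, cell_xy[1] - y) for (x, y) in resources
    )
    return nearest if nearest <= max_dist else None  # None stands in for NoData


# Made-up resource points in a projected coordinate system (feet)
food_points = [(0.0, 0.0), (10000.0, 0.0)]
near = distance_to_nearest((3000.0, 4000.0), food_points)
far = distance_to_nearest((0.0, 20000.0), food_points)
print(near, far)
```

Running this per cell for each of the five resource layers gives the five distance rasters used in the following steps.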
ASSIGN SCORES
GIS - ‘Reclassify’: reverse the values so that nearer areas get higher scores
Legend (Reclass_food): values 1 to 12
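One way to picture the reversed reclassification is to cut the 0 to 15,840 distance range into 12 bands and give the nearest band the score 12 and the farthest the score 1. The equal-width bands and the score of 0 outside the buffer are assumptions made for this sketch; the break values actually used in ‘Reclassify’ are not given on the slide.

```python
def reclassify(distance, max_dist=15840.0, n_classes=12):
    """Map a distance to a score from n_classes (nearest) down to 1 (farthest).

    Cells outside the buffer (None or beyond max_dist) get 0.
    Equal-width bands are an assumption, not the slide's documented breaks.
    """
    if distance is None or distance > max_dist:
        return 0
    band_width = max_dist / n_classes
    band = min(int(distance / band_width), n_classes - 1)
    return n_classes - band


for d in (0.0, 1320.0, 7000.0, 15840.0, None):
    print(d, reclassify(d))
```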
OVERLAY LAYERS
Each layer: points, then buffers (3 miles), then scores (reclassify 0-12)
Weights:
food 0.35
hygiene 0.35
medical 0.1
housing 0.1
technology 0.1
SUMMED SCORE
GIS - ‘Raster Calculator’
SUM = food*0.35 + hygiene*0.35 + medical*0.1 + housing*0.1 + technology*0.1
Legend: Distance Score, High: 12, Low: 1
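For a single cell, the weighted sum works out as in the sketch below. The weights are the ones on the slide; the example cell scores are made up.

```python
# Layer weights from the slide
WEIGHTS = {
    "food": 0.35,
    "hygiene": 0.35,
    "medical": 0.1,
    "housing": 0.1,
    "technology": 0.1,
}


def summed_score(scores):
    """Weighted sum of the five reclassified layer scores for one cell."""
    return sum(WEIGHTS[layer] * scores[layer] for layer in WEIGHTS)


# Made-up scores for one cell
cell = {"food": 12, "hygiene": 10, "medical": 6, "housing": 4, "technology": 2}
total = summed_score(cell)
print(round(total, 2))
print("8+ area:", total >= 8, "| 10+ area:", total >= 10)
```

Because the weights add up to 1, the summed score stays on the same 0-12 scale as the individual layers, which is why thresholds such as 8+ and 10+ can be applied directly.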
AREAS WITH 8+ SCORE (RANGE 0-12)
GIS - ‘Raster Calculator’
Top 1/3 areas with good connection to the resources
Legend: Distance Score (8+), High: 12, Low: 0
AREAS WITH 10+ SCORE (RANGE 0-12)
GIS - ‘Raster Calculator’
Top 1/6 areas with good connection to the resources
Legend: Distance Score (10+), High: 12, Low: 0
OVERLAY THE HOMELESS ENCAMPMENT DENSITY MAP WITH THE RESOURCE SCORE MAP
With the 8+ score buffer, the Outer Mission, Outer Richmond, and Cow Hollow districts are not covered by the resources.
Legend: Distance Score (8+), High: 12, Low: 0; Homeless Encampment Density, 70.18 to 17,896.58
OVERLAY THE HOMELESS ENCAMPMENT DENSITY MAP WITH THE RESOURCE SCORE MAP
With the 10+ score buffer, the Mission and South Beach districts are not covered by the resources.
Legend: Distance Score (10+), High: 12, Low: 0; Homeless Encampment Density, 70.18 to 17,896.58
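The visual overlay amounts to looking for cells where encampment density is present but the resource score falls below the chosen threshold. A small sketch of that gap test, with made-up grids; using the lowest legend class (70.18) as the density cut-off is an assumption for illustration.

```python
def find_gaps(density, score, density_min, score_min):
    """Return (row, col) of cells with density >= density_min and score < score_min."""
    gaps = []
    for r, (d_row, s_row) in enumerate(zip(density, score)):
        for c, (d, s) in enumerate(zip(d_row, s_row)):
            if d >= density_min and s < score_min:
                gaps.append((r, c))
    return gaps


# Made-up 2x2 rasters: encampment density and summed resource score
density_grid = [[0.0, 650.3], [97.2, 3323.0]]
score_grid = [[11, 9], [6, 12]]

gaps_8 = find_gaps(density_grid, score_grid, density_min=70.18, score_min=8)
gaps_10 = find_gaps(density_grid, score_grid, density_min=70.18, score_min=10)
print(gaps_8, gaps_10)
```

Raising the threshold from 8 to 10 can only add cells to the gap list, which mirrors how more districts appear uncovered on the 10+ map.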
PRIORITIZED AREAS FOR FUTURE DEVELOPMENT
PHASE 1: URGENT
PHASE 2: FUTURE IMPROVEMENT
Areas circled in red should be prioritized for more resources in the future. Areas circled in white should also be assigned more resources if possible. NGOs or the SF government should prioritize these areas for future improvement.
Map labels: COW HOLLOW, EMBARCADERO, INNER RICHMOND, SOUTH BEACH, OUTER RICHMOND, OUTER SUNSET, MISSION DISTRICT, BALBOA PARK, OUTER MISSION, CROCKER-AMAZON
PART 3 OPINION | WHICH PART OF THE CITY IS MORE FRIENDLY TO HOMELESS PEOPLE?
RESEARCH QUESTION
PUBLIC ATTITUDE ON HOMELESSNESS ACROSS DIFFERENT AREAS
‘Friendly’ Areas
Prioritize areas for future marketing to attract donations and volunteer help
1. COLLECTING PUBLIC OPINION
2. SENTIMENT ANALYSIS
3. PUBLIC ATTITUDE MAP
1.
COLLECTING PUBLIC OPINION
TWITTER APIs
Twitter Firehose API: access to the full Twitter dataset, 100% of historical tweets that match the search criteria
Twitter Streaming API: data pushed by Twitter; a sample of tweets as they occur, 1%-40% of tweets in real time
Twitter Search API: pulls data that already exists based on ‘search criteria’; 180 requests per 15-minute period, up to 100 tweets per request, limited to the last 7 days
Geo-tagged tweets
- point map of each tweet
- analyze each tweet
- create a point-density map
Indirect Approach
SCRAPE DATA: 20,000 tweets in past 7 days
REMOVE DUPLICATES (tweets_to_csv)
CLEANED DATA: 1600 tweets
GEO-CODED DATA: 669 tweets
WITH COORDINATES: 3 tweets (3 / 1600)
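The REMOVE DUPLICATES step could look something like the sketch below. It assumes each scraped tweet is a dict with an `id` field and that a duplicate means the same tweet id seen more than once; the slides do not say which rule was actually used.

```python
def remove_duplicates(tweets):
    """Keep the first occurrence of each tweet id, preserving order."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet["id"] not in seen:
            seen.add(tweet["id"])
            unique.append(tweet)
    return unique


# Made-up scrape result with one repeated tweet
scraped = [
    {"id": 101, "text": "first"},
    {"id": 102, "text": "second"},
    {"id": 101, "text": "first"},
]
cleaned = remove_duplicates(scraped)
print(len(scraped), "->", len(cleaned))
```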
1. NOT ENOUGH GEOCODED DATA
3 / 1600 - about 0.2% geo-tagged tweets
HOME, TRANSIT, WORK
geocode - Returns tweets by users located within a given radius of the given latitude/longitude. The location is preferentially taking from the Geotagging API, but will fall back to their Twitter profile.
USING MULTIPLE OVERLAPPING CIRCLES TO COVER SAN FRANCISCO
Each square is 1 mile by 1 mile. Get the coordinates of each circle’s center.
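A sketch of how such circle centers could be generated: a 1-mile grid where only every other node is used (a checkerboard), so that circles with a 1-mile radius overlap and leave no holes. The starting corner, the grid size, and the 69-miles-per-degree conversion are assumptions for illustration, not the values used to build the actual circles.

```python
import math

MILES_PER_DEG_LAT = 69.0  # approximate length of one degree of latitude


def circle_centers(lat_north, lon_west, n_rows, n_cols, step_mi=1.0):
    """Checkerboard of (lat, lon) centers on a grid with step_mi spacing."""
    dlat = step_mi / MILES_PER_DEG_LAT
    mid_lat = lat_north - dlat * (n_rows - 1) / 2
    # A degree of longitude shrinks with the cosine of the latitude
    dlon = step_mi / (MILES_PER_DEG_LAT * math.cos(math.radians(mid_lat)))
    centers = []
    for col in range(n_cols):
        for row in range(n_rows):
            if (row + col) % 2 == 0:  # keep every other grid node
                centers.append((lat_north - row * dlat, lon_west + col * dlon))
    return centers


def geocode_query(lat, lon, radius="1mi"):
    """Format a 'lat,lon,radius' string for the search geocode parameter."""
    return f"{lat:.6f},{lon:.6f},{radius}"


demo_centers = circle_centers(37.8245, -122.5213, n_rows=3, n_cols=3)
for lat, lon in demo_centers:
    print(geocode_query(lat, lon))
```

Each string can then be passed as the geocode argument of one search, one circle at a time.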
latlng = '37.7810111621, -122.429969541'  # center of one of the circles (approximately)
# Set a search distance
radius = '1mi'
# See tweepy API reference for format specifications
geocode_query = latlng + ',' + radius
# set output file location
file_name = 'data/0807_JSON/homeless_0809.json'
# set threshold number of Tweets. Note that it's possible
# to get more than one
t_max = 50000
import jsonpickle
import pandas as pd
import tweepy

# Written for tweepy 3.x (api.search, tweepy.TweepError).
# `api` (an authenticated tweepy.API) and `parse_tweet` are defined elsewhere.
def get_tweets(geo, out_file,
               search_term='"homeless" OR "encampment" OR "street people" OR "hobo" OR "hobos"',
               tweet_per_query=100, tweet_max=150,
               since_id=None, max_id=-1, write=False):
    tweet_count = 0
    rows = []
    while tweet_count < tweet_max:
        # Build the query; add the paging bounds only when they are set
        params = dict(q=search_term, count=tweet_per_query,
                      geocode=geo, tweet_mode='extended')
        if max_id > 0:
            params['max_id'] = str(max_id - 1)
        if since_id:
            params['since_id'] = since_id
        try:
            new_tweets = api.search(**params)
        except tweepy.TweepError as e:
            # Just exit if any error
            print("Error : " + str(e))
            break
        if not new_tweets:
            print("No more tweets found")
            break
        for tweet in new_tweets:
            rows.append(parse_tweet(tweet))
            if write:
                # Append, so that earlier tweets are not overwritten
                with open(out_file, 'a') as f:
                    f.write(jsonpickle.encode(tweet._json, unpicklable=False) + '\n')
        # Page backwards from the oldest tweet returned so far
        max_id = new_tweets[-1].id
        tweet_count += len(new_tweets)
    print(f"Downloaded {tweet_count} tweets.")
    return pd.DataFrame(rows)
TEST: WORD FREQUENCY
TWEETS, FILTERED DATA, WORD FREQUENCY, LIST OF FREQUENTLY USED WORDS
- feelings are not indicated by words such as ‘hate’ or ‘like’
- frequently used words don’t really indicate any inclination
- the same word, e.g. ‘hate’, could express completely opposite feelings
2.
SENTIMENT ANALYSIS
NLP (Natural Language Processing)
NLTK
TextBlob
TextBlob is a Python (2 and 3) library for processing textual data. It provides a consistent API for diving into common natural language processing (NLP) tasks. It is built on top of NLTK and offers an easier user interface.
- spelling correction
- language translation
- word frequencies
- sentiment analysis
polarity: [-1, 1], from negative (-1) to positive (1)
subjectivity: [0, 1], from objective (0) to subjective (1)

from textblob import TextBlob

def get_sentiment(text):
    sentence = TextBlob(text)
    if sentence.sentiment.polarity > 0:
        return 'positive'
    elif sentence.sentiment.polarity == 0:
        return 'neutral'
    else:
        return 'negative'

def get_subjectivity(text):
    sentence = TextBlob(text)
    if sentence.sentiment.subjectivity < 0.5:
        return 'objective'
    else:
        return 'subjective'
3.
SCRAPING RESULTS AND MAPS

Circle centers
Name  FID  POINT_X      POINT_Y
A5    16   -122.52129   37.76647403
B4    19   -122.50313   37.78090553
B6    18   -122.50311   37.75200207
B8    17   -122.50309   37.72286008
C3    23   -122.48474   37.79551306
C5    22   -122.48481   37.76643085
C7    21   -122.48479   37.73746771
C9    20   -122.48478   37.70856402
D2    24   -122.46657   37.80993887
D4    10   -122.46655   37.78102362
D6    15   -122.46656   37.75201287
D8    14   -122.46648   37.72299007
E3    9    -122.44823   37.79545858
E5    11   -122.44829   37.76650749
E7    12   -122.44831   37.73741324
E9    13   -122.44822   37.70842611
F2    8    -122.42987   37.80995024
F4    7    -122.42997   37.78101116
F6    6    -122.4299    37.75190505
F8    5    -122.42993   37.72298951
G3    2    -122.4117    37.79544045
G5    3    -122.4117    37.76633442
G7    4    -122.41166   37.73751429
G9    25   -122.41179   37.70838418
H2    29   -122.39349   37.80998608
H4    28   -122.39351   37.78096357
H6    31   -122.39352   37.75193735
H8    27   -122.39355   37.72291811
I1    30   -122.37517   37.82446824
I5    33   -122.37523   37.76648039
I7    32   -122.37526   37.73744464
I9    26   -122.37529   37.70836574
K2    0    -122.35681   37.80990318
K8    1    -122.35703   37.72295456

Tweets per circle (circles not listed returned 0 tweets on both days)
              7-Aug                              8-Aug
Name          Pos  Neu  Neg  Subj  Obj  Sum      Pos  Neu  Neg  Subj  Obj  Sum
E3            1    1    0    1     1    2        1    1    0    1     1    2
E5            35   24   21   35    45   80       36   23   21   37    43   80
E9            1    0    0    0     1    1        1    0    0    0     1    1
F4            41   28   28   42    55   97       42   23   30   46    49   95
F6            3    0    0    1     2    3        3    0    0    1     2    3
F8            1    0    0    0     1    1        1    0    0    0     1    1
G3            2    0    0    2     0    2        2    0    1    2     1    3
G5            21   1    10   22    10   32       21   4    11   23    13   36
G7            0    0    0    0     0    0        1    0    0    1     0    1
H4            16   0    3    16    3    19       16   3    4    16    7    23
H8            0    0    0    0     0    0        1    0    0    1     0    1
I7            0    0    0    0     0    0        1    0    0    1     0    1
Whole-SF query  686  466  331  617   866  1483     842  447  377  769   897  1666

Sum over all circles: 237 tweets (7-Aug), 247 tweets (8-Aug)
PUBLIC AWARENESS OF HOMELESSNESS
The number of tweets reflects different levels of public awareness of the homeless issue. However, the total number of tweets collected through these smaller circles is far lower than the number collected with a single query over the whole SF range.
PUBLIC AWARENESS COMPARED WITH HOMELESS DENSITY
There is more public awareness where people are more exposed to the problem.
PUBLIC OPINION TOWARDS HOMELESSNESS
Run each tweet through the sentiment analysis and compute the percentage of ‘positive’, ‘neutral’, and ‘negative’ tweets among all tweets.
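The percentage step can be sketched as below. The counts in the example (35 positive, 24 neutral, 21 negative) are those of one circle in the 7-Aug results; the function name and the label list are illustrative.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def sentiment_shares(labels):
    """Percentage of each sentiment label among all labelled tweets."""
    total = len(labels)
    if total == 0:
        return {}
    counts = Counter(labels)
    return {label: 100.0 * counts[label] / total for label in LABELS}


# One circle on 7-Aug: 35 positive, 24 neutral, 21 negative (80 tweets)
circle_labels = ["positive"] * 35 + ["neutral"] * 24 + ["negative"] * 21
shares = sentiment_shares(circle_labels)
print(shares)
```

Comparing these shares across circles, rather than raw counts, keeps busy areas from dominating the attitude map.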
‘FRIENDLY AREAS’
Among the areas where the public is more aware of the issue, SOMA and South Beach are the relatively friendly areas, with a positive attitude. Tenderloin and Lower Heights have a share of positive tweets about 10% lower. It is also worth mentioning that SOMA has the highest percentage of negative tweets.
CONCLUSION OF THE HOMELESS RESEARCH
PART 1 | WHERE DO THE HOMELESS ENCAMP?
Tenderloin, south of SOMA, Civic Center, and Nob Hill are the neighborhoods with the most encampments.
PART 2 | HOW GOOD IS EACH AREA IN TERMS OF EASY ACCESS TO RELIABLE LIVING RESOURCES FOR THE HOMELESS?
Outer Richmond, Outer Sunset, Cow Hollow, Balboa Park, and Outer Mission urgently need more resources. South Beach and Mission District form the second tier of neighborhoods for additional resources.
PART 3 | WHERE SHOULD FUTURE MARKETING BE DIRECTED?
South Beach is the friendliest area in SF toward homeless people, and the public there is well aware of the issue. Where homeless density is high, public awareness of the issue is generally higher. Areas that get enough exposure to the problem but are not right at its center seem to have the friendliest people and are most likely to be the places for raising donations and getting help.
LIMITATIONS
1. Dramatic shrinkage of data when the circle radius decreases: 1666 tweets for the whole of San Francisco, CA vs. 247 tweets from the smaller circles
geocode: Returns tweets by users located within a given radius of the given latitude/longitude. The location is preferentially taking from the Geotagging API, but will fall back to their Twitter profile.
The method is better for larger-scale analysis.
2. Limited access to only the past 7 days’ data
Current collection: Aug 7, 8, 9
Plan A: scrape over a longer period, collecting once a week to get enough data (Aug 7, 14, 21, 28, ...)
Plan B: buy access to Twitter Firehose
3. Imperfection of NLP
Example tweets and the label the sentiment analysis assigned:
“The shame and stigma around drug use and being homeless is layered and intense,” - POSITIVE
“Thank you, @CCofChicago, for your efforts to provide Chicago’s homeless population with resources to clean up/be comfortable.” - POSITIVE
“RT @eparillon: SF big business: do something about homelessness! SF: ok, let’s raise revenue to house people SFBB: no, not like that!!!…” - POSITIVE
TAKEAWAYS
1. GET UNBIASED OPINIONS FROM REAL, ORDINARY PEOPLE
- #### hashtag
- public hearing
Both collect opinions from those who already have an opinion about the issue. Try a neutral query such as q = “Market St”.
2. Each component could be used separately, for collecting data or for analyzing sentiment
COLLECTING PUBLIC OPINION
SENTIMENT ANALYSIS
Physical Built Environment
Traditional Spatial Analysis
Twitter API Sentiment Analysis
AUGMENTED DECISION-MAKING PROCESS IN URBAN PLANNING
People’s Opinion/ Feedback
Thank you!