The First User-Flocksourced Bus Intelligence System in the World by Albert Ching

By Albert M. L. Ching MIT DUSP MCP 2012 Masters Thesis In Collaboration with Stephen J. Kennedy Advised by P. Chris Zegras with the gracious help of Zia Wadud, Paul Barter & Eran Ben-Joseph Inspired by Muntasir Mamun and Kewkradong team in Dhaka as well as all the entrepreneurs promoting sustainable transport in developed Asia

Pairs well with a bottle of Hitachino white ale and Stephen J. Kennedy’s Big Data-Information-Knowledge Continuum

Thesis Defense DRAFT, 1

Abstract

Dhaka may be the slowest city in the world. For the 5 million+ bus commuters in Dhaka, this fact might be better left unverified, since most have no other option for traversing one of the most dynamic, congested, vulnerable and enjoyable cities in the world. Data today, however, has become destiny for a city, and the rise of mobile networks and smartphones has made it technically possible to collect data cheaply, scalably and quickly in almost any corner of our shared planet, no matter what its speed. This possibility not only presents blistering opportunity (like the rise of the first iterative city) but also raises a few key questions that may delay a more measurable, experimental and responsive future for cities.

Key Research Questions

While smartphones can be designed to collect vast swaths of data, can flocks of people be organized and incentivized to collect data for a targeted period of time and place? If not all data in a city can be collected by flocks, can a sampled set be useful, especially if certain behaviors are predictable? To address these questions, a team of MIT students and an amazing Dhaka-based partner developed and deployed an experimental data collection technique called flocksourcing, or guided crowdsourcing, on buses in Dhaka. Armed with data collection apps, a flock of 8 paid volunteers in one week were able to gather an unbelievable amount of data on a system with hardly any. By doing so, the flock presented the possibility that sampled, real-time data can provide an effective amount of intelligence to improve sustainable transport in Dhaka.


Chapter 1

The Rise of the Iterative City
A short history of planning and the role of technology
Performance-based planning enabled by recent breakthroughs in low-cost, real-time spatial measurement technologies
Masterplanning → Simulation → Iteration
Key questions

Chapter 2

Motorization and the Mobile Opportunity
Iron law and the mobile opportunity in South and SE Asia
Mobile-driven intelligence and its potential uses in sustainable transport
Landscape of mobile + transport mash-ups in developing Asia

Chapter 3

Introducing Flocksourcing
Real-time urban data collection techniques: Ubiquitous sensing, Crowdsourcing
Predictability of mobility
Introducing Flocksourcing
Flocksourcing workflow
Experimental design

Chapter 4

A Thousand Surveys in One Week
Organizing flocks in Dhaka and Boston
Monitoring data collection in real-time
Data authentication and flock bias
What matters to users
Sampled set of data for bus crowding, travel times, and routes
Self-organizing flock

Appendix

Dhaka, the First Iterative City


A short history of planning & the role of technology
Performance-based planning enabled by recent breakthroughs in low-cost, real-time spatial measurement technologies
Masterplanning → Simulation → Iteration
Key questions


A short history of planning & the role of technology

The future of cities is no longer held in one big plan but in a thousand little, measured strokes. In just a half century, our approach to managing our beloved and growing cities has been shifting rapidly: out of experience, out of necessity, and out of possibility.

Since the early days of city planning and pioneers like Le Corbusier, Moses and Burnham, theory and intuition were the guiding forces that shaped the spatial configuration of cities from Boston to Chandigarh. Originating from the traditions of art and architecture, beauty and ideals, often in the eyes of a privileged few beholders, were the ultimate benchmark for what a good city was. Unlike a solitary building, however, the scale of a city and its impact on a diverse set of actors set in motion a rebuttal to modernist planning led by Jane Jacobs and others, who asserted that cities were not museum pieces to be ogled from afar; rather, they were complex networks that could be neither understood nor controlled by a privileged few. Communities and citizens were the key agents responsible for the vitality of cities. They gave cities the character that not only drew people to them but also catalyzed meaningful living.

The debate between planning from the top versus the bottom has been complicated in the most recent decade by a pressing need to conserve an ever decreasing amount of resources on our shared planet. Individual freedom and happiness in a city can no longer be viewed without the backdrops of climate change, congestion and energy. Present-day city planners almost have to play a divine role, simultaneously optimizing the livelihoods of millions of diverse actors and the use of limited and decreasing resources.

This optimization challenge has been made even more complex by the unprecedented growth of cities. In 1800, Beijing was the only city in the world with a population greater than 1 million. Today, 120 cities in China alone have populations greater than 1 million and 30 cities around the world have urban populations greater than 10 million. In Dhaka, the focus of our study and the bustling mega-capital of Bangladesh, the population has skyrocketed from 1.5 million in 1981 to 7 million in 1991 to 10 million in 2001 to an estimated 18 million today (and a projected 23 million by 2020).

“Solving traffic jams with more or bigger highways is like putting out a fire with gasoline.” Enrique Peñalosa, Colombian politician and former mayor of Bogotá from Changing Course in Urban Transport: An Illustrated Guide (2011)

The increasing complexity of cities made our first toolkit of theory and intuition a bit less useful in practice. Building a highway to decrease congestion, for example, seems to be a logical decision since it increases supply -- but studies have repeatedly shown that traffic usually increases as more people switch to cars from shared modes. (The intuition, along with some self-interested political motives, has been so strong that politicians from Jakarta to Bangalore have not been discouraged from rampant highway building in the face of paralyzing congestion.) The conversion of auto-rickshaws in Dhaka from diesel to cleaner compressed natural gas (CNG) seemed like a huge win for air quality in 2007, but it also encouraged private car ownership by significantly lowering the cost of fuel (CNG is about a quarter to a fifth the cost of petrol in Dhaka, and most cars are retrofitted to run on it). It is these unintended and unexpected consequences, resulting from the myriad interconnected relationships Jane Jacobs warned about, that prompted the need for a new approach to understanding and managing our cities, especially in more complicated subfields like urban transport.

The answer to this dilemma, of course, was technology. The first significant breakthrough occurred in the 1960s with the rise of supercomputing and the advent of the microprocessor, which helped democratize scalable computation. Instead of pen and paper or the abacus, supercomputing enabled humans to do complex calculations scalably and often simultaneously. One unintended consequence of this cheap computing power was a new analytical approach to complex systems called simulation, in which mathematical models were constructed to mirror a system found in nature or in society. As luck would have it, one of the first applications of simulation was in the field of transportation, where the complex flow of traffic was first modeled. Simulation and city planning eventually became joined in cultural lore with the highly successful launch of SimCity in 1989.

While simulation has become a handy tool for better understanding some of the complex interrelationships within cities, it has some limitations. The first is auditing the models themselves, which have often grown so complex that hours of explanation are required to grasp the myriad inputs and outputs. The second is calibrating the model closely enough to reality that its predictions about real-life outcomes are reliable. Data for calibration is often gathered with slow, analog methods that do not keep pace with the dynamic nature of the systems being modeled. The upshot is a tool that may be good at explaining the past but limited in evaluating the present and forecasting the future. Was there another approach, enabled by new technologies on the horizon, that could handle the increasing complexity of our cities?

Performance-based planning enabled by recent breakthroughs in low-cost, real-time spatial measurement technologies

“Policy initiatives in developed and developing cities should be viewed as experiments that, if carefully designed and measured, can help support the creation of an integrated, predictive theory and a new science of performance-based planning. ... Ideally, by coupling general goals (such as lower carbon emissions) to actionable policies and measurable indicators of social satisfaction, successes and failures can be assessed and corrected for, guiding development of theory and creating better solutions.” - Howard Silverman (2010) in Luis Bettencourt and Geoffrey West: Cities as Complex Systems


Perhaps the biggest gap in city planning today is the lack of understanding of exactly which initiatives are actually working and at what level of impact. The early pioneers like Le Corbusier never had to audit the cities they built to judge whether they were actually working as planned. Even today, urban designers at prestigious firms like Sasaki Associates judge success based on whether a project was built (and whether they were consequently paid) rather than whether what they built was better than the alternatives. In our new backdrop of limited resources, it is no longer good enough to suspect that a project is net positive; it is critical to know precisely by how much, and who wins and who loses.

For a long time, however, it was difficult to plan based on performance. Not only was it difficult to understand which metrics were important, but it was even more difficult to know whether the appropriate metrics could be measured at all, at an affordable cost, and in time for decisions. Fast forward to 2000: with the rise of Google and of Big Data, the era of performance-based design and decision-making suddenly began to reveal itself. For generations the advertising industry was akin to city planning, spending billions of dollars on campaigns that it suspected worked but could not easily validate with increased sales. Then came Google and its performance-based advertising system, which could link a company’s online ad spend to the revenue that a campaign generated. If managed correctly, a company could optimize its online ad spend to maximize profits, something that had been a shot in the dark before. Not only was advertising subject to performance metrics, but so was everything else, including the design of the homepage, the color of the ads, and even whom to hire. The biases of human intuition could be neutralized by data that was suddenly easy to collect, analyze and respond to.

For cities, though, despite the new tools to process, store, analyze and utilize Big Data, the possibility of a performance-based approach was still limited by the ability to collect data in a living, breathing and often chaotic urban space. A few new developments were still required to quickly and broadly sense the city, send that data to a remote location and then process it in a scalable way. Those developments, fortunately, have come in the past few years.


In the wink of an eye, a new technological revolution has launched around cheap real-time spatial measurement technologies, the amalgamation of four simultaneous technical breakthroughs:
●(1) Smartphones and other distributed sensing devices that have the ability to precisely capture time, space and other dimensions
●(2) Extensive, robust and affordable mobile data networks (in many developing nations like India and Bangladesh, mobile data networks have leapfrogged over Wi-Fi)
●(3) Clouds, or online databases that can reside anywhere (but usually reside in politically stable and disaster-resilient cheap land parcels)
●(4) Machine learning, or algorithms that leverage scalable computing to quickly analyze and make predictions from large sets of data

Not only are these technologies powerful in cheaply and quickly gathering urban data, but they are largely accessible to almost every city on the planet. Smartphones, which still retail at a 10x premium over a regular mobile ($175 in Dhaka vs. $15), are expected to become affordable quickly, with price points for similar Chinese-made devices (called multimedia devices in Dhaka) as low as $30-$50 apiece. Mobile data networks in less developed contexts like India and Bangladesh boast some of the most competitive markets and consequently the lowest prices in the world. Cloud services are now ubiquitous and accessible from anywhere, and machine learning requires some startup technical expertise but can be scaled to any type of large dataset.

While cheap measurement does indeed enable performance-based planning, what it really enables is a new approach to cities entirely - one based not just on measurement to evaluate past initiatives but on measurement that encourages future experimentation. For the new generation of planners in this world of large-scale urban problems, what seems to be needed is not a few grand solutions but a million measured but innovative little strokes.


Masterplanning → Simulation → Iteration

At its core, an iterative approach to cities acknowledges that predicting the result of a particular initiative beforehand is difficult. Local conditions, politics, and sheer complexity can confound even the best predictions. Rather than spend the bulk of the blood, sweat and tears debating a proposal before it is even launched, an iterative approach would start with the best first idea (rather than the perfect first idea), launch, measure and repeat. One can imagine that, instead of community meetings where residents lambaste city planners about how increased density will kill their livelihoods, residents could experience an experiment live and feel for themselves how it would really affect them.

This idea of city experimentation is not new and is gaining favor as a way to implement or at least test out new ideas. In New York’s Times Square, closing lanes off to cars was sold as a temporary experiment, and it worked out so successfully that it has become a permanent fixture in the innovative New York landscape. What is new is the ability to quickly and easily measure a wide (but not necessarily complete) range of metrics at a powerful level of precision. New York’s closure of lanes to cars may not just be a popular idea but one that increased pedestrian satisfaction by 10% and travel times for traffic through Times Square by an average of only 5 minutes.

A different capacity would be required for cities to iterate effectively. The first step might be the creation of a Department of Urban Experimentation (which Stephen and I would like to run some day), which might include specialists in measurement and data analysis, computer scientists and visualizers, as well as designers of modular and dynamic infrastructure. With the help of such a team, the mayor could monitor an ongoing slate of experiments in his or her city from the comfort of his or her office, much like an MIT scientist running experiments in a lab.

Key questions

Before we pack our bags and head into a world of iterative cities, there are some outstanding questions that we should acknowledge and address before the proverbial bus reaches the station. These questions fall into two categories: ethical and technical.

The first one that city planners ask is: who decides? While data and measurement are helpful, they can also be manipulated to serve predetermined ends. The internet has terabytes of tutorials on how to lie with data and graphs, so while measurement can be a means of understanding our cities more precisely, we must also ask who is measuring and for what purpose. Even with well-intentioned mayors and governments, there is another difficult challenge: how to make tradeoffs when an experiment benefits some but not others? How to decide between lower citizen happiness today and 5% lower greenhouse gases tomorrow? If the results of measurement fall mostly in gray areas, is it better to do nothing in those circumstances?

Perhaps the most important question, though, and the one this author feels is most critical to our discussion today, is one that has become increasingly relevant in this new era of smarter (and potentially iterative) cities. That question is: smart cities for whom? Those that need it the most, like the rapidly urbanizing cities of South and Southeast Asia such as Dhaka and Jakarta, which will potentially contribute the most to the global challenges of motorization and climate change? Or those that can afford the services of Cisco and IBM, cities like Singapore and Rio, which expect to spend $100 billion in the next 5 years?

That question is closely linked to the technical ones. Can this measurement be done cheaply using existing resources - subsets of smartphones, mobile data networks, organized groups of people - and can that data then be visualized and represented in a way that is useful for less technical users, operators and regulators?

It is these questions that form the basis for the work that Stephen Kennedy, Muntasir Mamun and the team at Kewkradong Bangladesh have begun on buses in Dhaka. We believe that it is entirely possible that Dhaka, often rated one of the poorest, most congested and most vulnerable spots on the planet, could become the world’s first iterative city (ahead of its richer peers). It is that possibility that we strive for.


Iron law and the mobile opportunity in South and SE Asia
Mobile-driven intelligence and its potential uses in sustainable transport
Landscape of mobile + transport mash-ups in developing Asia


Iron law and the mobile opportunity in South and SE Asia

Can owning a cell phone replace the desire to own a car? In the rapidly developing cities of South and Southeast Asia, this may be the billion-person question.

On one hand lies a story of motorization that started one hundred years ago in the United States with Henry Ford, the Model T and, later, the Interstate Highway Act. The mass production of cars not only launched a leading industry in American manufacturing but also began a mass transformation of cities, in which cars became a central part of both the infrastructure and the culture. The car became the epitome of freedom - freedom to go anywhere at any time - and owning one became synonymous with coming of age. Affordable cars enabled people to live far beyond where they could walk, bike or ride transit. Sixty-mile daily commutes became possible within our fixed travel-time budgets, and suburban sprawl became the common typology of how Americans settled into their cities.

But alas, these hundred years of joyriding may have come to an abrupt end with the sudden realization that our shared planet may not be able to sustain car-centric lifestyles for the 7 billion and growing people who inhabit it. Congestion, especially in densely populated cities, has become one of the leading sources of frustration for the urban dweller. Declining energy reserves and rising greenhouse gas emissions, of which transportation accounts for about a fifth, have made us seriously rethink our cherished but inefficient and polluting mode of transport, the private car.

In the cities of the US, Europe and other highly developed regions, this story is beginning to reverse itself -- but with infrastructure already built, unwinding our past commitment to the car is a slow process. In the rapidly developing cities of Asia, the story of motorization is only beginning. Rather than learn from our mistakes, developing Asians share our (old) aspirations.


Source: ACNielsen (2005)

A whopping $23 billion was spent in 2011 alone (including 5 Super Bowl commercials) by auto manufacturers to cultivate this aspiration globally, a figure made more astonishing by the paltry amounts spent to promote public transit. (When this author searched online for public transit advertising budgets, he could only find references to ways to advertise on, not for, public transit.) This aspiration, coupled with new economic freedom, is leading to what has become almost an inevitability in city development: with rising incomes comes increased motorization. This Iron Law, as stated by Jinhua Zhao from the University of British Columbia, is evidenced not just within developing Asia but across the world. (The lone exceptions are the dense city-states of Hong Kong and Singapore.)


Source: Motorization and income data from World Bank, population from the UN population division

Acharya & Morichi (2007) note that car ownership usually takes off at an income level of $5,000 per capita, and Paul Barter (1999) suggests that once cities reach 10-20% car ownership, they are “locked-into” cars for the foreseeable future. Not only are the major players in developing Asia - China, India, Indonesia, Bangladesh and Pakistan - approaching these inflection points, but in major cities like Beijing, Bangalore and Jakarta, where congestion is paralyzing, it may already be too late to reverse course without significant pain. As the authors of Changing Course in Urban Transport: An Illustrated Guide (2011) note, “Urban transport in Asia is in crisis.”

Because of the sheer size of Asia’s developing economies, the global impact of their expected motorization is significant. According to Bill Ford, the great-grandson of Henry Ford, the number of cars on the road is expected to increase dramatically from 800 million today to 2-4 billion by 2050, with most of the growth happening in Asia due to increased prosperity and population.


Due to higher population densities, the tipping point of gridlock congestion has occurred much sooner in Asian cities than in their peers, which has resulted in several attempts to manage motorization.

Source: Compiled from primary research and GIZ guides on sustainable transport

From a political perspective, restricting car use has been a difficult proposition on both a philosophical and a capacity level. While global warming is looming and Asia is arguably more vulnerable to its potential effects than any other continent, developing Asian countries have contributed less than their fair share to it. Dhaka, which has been rated the world’s most vulnerable city to climate change, simultaneously has one of the lowest levels of carbon emissions per capita. To restrict car use is to restrict freedom, something that is the antithesis of development. Furthermore, the most effective car-limiting measures, like congestion pricing, deterrent parking fees and controlled land use, require a level of control largely absent from most developing cities in Asia.

The alternative is to improve automobile substitutes, which seems to be where most sustainable transport effort has focused. Specifically, under the ‘Avoid, Shift, Improve’ framework, it has led to a singular focus on investments in large-scale public transit projects, most notably Metros and Bus Rapid Transit (BRT). While these investments are helpful, the former has usually been difficult to execute and the latter has become the singular savior for cities, but one that has proven not to be as financially sustainable as once thought. The most sustainable modes of transport, e.g. cycle rickshaws, bicycles and walking, have been largely left out of the conversation. On the current trajectory, despite the determined efforts of a cadre of global thinkers and practitioners, the outlook for sustainable transport in developing Asia looks bleak.

The divine gift

India is an ancient society. For many years, only few people had knowledge. It was blood by chance. The mobile phone is a godsend . . . [and] information can break the stranglehold of the ovarian lottery sealed in India’s old hierarchies and shackles. - Sachin Pilot, India Minister of Communications and Information Technology, Remarks at Mobile Empowerment Conference in Delhi, August 2011

Source: Motorization data from World Bank, mobile penetration data from various sources compiled by Wikipedia

In the same way that the car was the epitome of freedom in the 1950s, the mobile phone, with its instant access to not just local but the world’s information, is the most important empowerment tool ever invented. It may also be the most successful product ever invented, leapfrogging ahead of cars and penetrating every corner of the planet within a seemingly short timespan. In some places like Hong Kong and Singapore, mobile phones greatly outnumber people, and in less motorized countries like Bangladesh, Pakistan and India, mobile phones have become an astounding 60x more ubiquitous than motorized vehicles in just a few years.

In the context of sustainable transport, this development may indeed be a gift from above, since mobile phones are conduits for both receiving and creating valuable urban information. For users, that means instant access to shared transport timings, routes, and on-demand vehicles. For suppliers, it means better management and matching of fleets to real-time demand. Unlike physical transport infrastructure, the mobile information infrastructure is already being built for other purposes and can be retrofitted on top of traditional transport infrastructure. While the infostructures are available, how can this new virtual highway be used to support sustainable transport in places that are at the tipping point of motorization?

Mobile-driven intelligence and its potential uses in sustainable transport

1 Marketing

With $20 billion being spent annually to promote the American dream of car ownership, the first barrier to a more sustainable transport future is simply making shared public modes attractive or, in sophisticated marketing speak, sexier to users. Buses, trains, taxis, and bicycle-shares need not just to market their services to increase awareness but also, especially for captive riders, to find compelling ways to break the feeling that cars are the ultimate aspiration. While sustainable transport modes will never match the amount of money spent to market cars, targeted investments can still be effective. For modes of transport perceived to be backwards, like the cycle rickshaw, cheap retrofits like a QR code sticker can help change perceptions (as one did with the mayor of Patiala in north India, who gleefully started scanning them all with his smartphone). Similar retrofits can be applied to other modes, like bicycles and walkways, which are thought to be for the poor.

Source: Primary research from author’s fieldwork in developing Asia, Summer 2011

The interactivity between the mobile phone, which for the new generation is a symbol of freedom, and a sustainable mode of transport is a powerful combination.

2 (Real-time) user services

In Dhaka, riding in a private car is almost always a better experience than trying to force oneself onto a bus. There is no wait, no question about getting a seat, no yelling at the driver for a stop and no acrobatics required to alight from a moving vehicle. Information can, surprisingly, help narrow this gap in user experience - and in a way that requires little additional infrastructure. The main shortcomings of shared transport like buses in Dhaka are the additional time it takes to catch a ride and the lack of investment in comfort. Information provided, ideally in real-time (or almost real-time), to the end customer when he or she needs it, via a personal mobile phone, a shared bus company tablet, or even the newspaper, may help to reduce the most painful differences between the two experiences.

Source: Primary research from author

Some of the more innovative real-time user services that this author observed in developing Asia involved retrofitting existing fleets of paratransit vehicles, like cycle rickshaws in Punjab and ojeks (motorcycle taxis) in Jakarta, into on-demand modes.


Source: Primary research from author’s fieldwork in developing Asia, Summer 2011

Research in more developed contexts like Boston indicates that mobile-driven intelligence has the potential not only to level the playing field but also to make shared transport a better experience than private cars. A study conducted by Latitude Research + Next American City (2011) revealed that technology applied to transit can unlock feelings of autonomy and community and enable productivity beyond that of a car.

Source: Latitude Research (2011) in collaboration with Next American City and Locately. Tech for Transit: Designing the Future System


3 (Real-time) operator services

Information that shows users the wait time for the next bus may be counterproductive if those wait times are consistently long. While the information is good for a user’s peace of mind and allows him or her to grab a cup of chai during the wait, it does not change the time at which the bus arrives. Only the bus operator can. The difficulty of managing a distributed fleet has been greatly eased by the advent of GPS and other location-based technologies. Bus operators from Singapore to New York have retrofitted public buses with sophisticated systems, usually called Automatic Vehicle Location (AVL) systems, that cost $8,000-$20,000 per bus. In Dhaka, vehicle tracking has become a popular service provided by both mobile operators like Grameenphone and car manufacturers like NITS to deter theft. To the best of this author’s knowledge, that available technology has not yet been adopted by bus operators and often requires custom hardware.

Source: Primary research from author’s fieldwork in developing Asia, Summer 2011

Instead, most bus operators do not know exactly where their buses are at a given moment. Employees who usually sit at the origin and destination of a given route (next to the chai wallah, of course) take manual logs of buses in a physical logbook, mostly to note unusual incidents, e.g. the disappearance of a bus. Even in Kuala Lumpur, a relatively wealthy context, taxi bookings are tracked on individual pieces of paper, more as a last resort of record-keeping. In Dhaka, this lack of fleet visibility makes it nearly impossible to coordinate bus timings within a single company, let alone between multiple operators. The upshot is unpredictable waiting times for buses and terrible experiences for customers.


4 Responsive city

With a population that is suspected to grow by several thousand each day, the first and often most difficult task for the Dhaka City Corporation is to keep an accurate count of how many people are in the city so that limited resources can be allocated appropriately. In places as dynamic as Dhaka, which is expected to grow by another 5 million to 23 million by 2020, the value of simply monitoring the city cannot be discounted. Late last year, Dhaka became the first city in the world to split into two, with officials lamenting the difficulty of serving 18 million and growing customers from one city corporation.

While monitoring of city performance is usually conducted by the government, there is a trend towards third parties, enabled by new technologies, playing that role. In Boston, the MBTA, which operates the city’s subway, produces a monthly scorecard that showcases the metrics it uses to evaluate its own performance. With real-time transit data made publicly available, citizen developers are exposing performance to the public in real-time using whatever metrics they can define.

The power of outside parties to monitor cannot be overstated when the government has little capacity to keep track of its own citizens. In the last Dhaka masterplan, whole slum areas in Old Dhaka were left out of the plan because counting the millions of people would have gone beyond the capacity of the government. In contrast, in Kibera, the largest slum in the world, an innovative citizen-run project called Map Kibera has not only provided the first detailed map of the informal settlement but has also led to partnerships between the government and slum leaders, who are now an extension of the government itself - due simply to data. It is this potential of information to empower citizens that may ultimately help make cities like Dhaka more responsive.

Landscape of mobile + transport mash-ups in developing Asia

While the potential for mobiles to support sustainable transport is evident, at what scale and what level of sustainability are these mash-ups happening in developing Asia? Guided by this question, in the summer of 2011, I visited 11 cities in South and Southeast Asia - Singapore, Bangkok, Dhaka, Delhi, Chandigarh, Fazilka, Mumbai, Bengaluru, Jakarta, Kuala Lumpur, and Hong Kong - to witness first-hand what types of experimentation were happening, whether they were scalable and sustainable, and what the barriers were to accelerating that experimentation. In the absence of formal government and even nongovernmental instruction, the catalysts for these mash-ups were an amazingly dedicated group of social entrepreneurs, most of them driven by the desire to simply make their beloved cities better places to live.

(Top Left) Sanjeev Garv and Atul Jain from Delhi Cycles; (Top Middle) HR Murali from Bangalore Bicycle Sharing; (Top Right) Sundar Raman, Aneth Guru, and Sandeep Bhaskar from Ideophone; (Bottom Left) Anthony Tan from MyTeksi; (Bottom Middle) Nadiem Makarim from GO-Jek; (Bottom Right) Navdeep Asija from Fazilka Eco-cabs

Source: Primary research from author’s fieldwork, Summer 2011

Anthony Tan and Hooi Ling Tan created MyTeksi, an on-demand platform for taxi operators in Kuala Lumpur, to reduce the unsafe conditions for women in taxis. Navdeep Asija and the Graduates Welfare Association Fazilka launched Fazilka Eco-cabs, an organized on-demand fleet of cycle rickshaws, simply to provide a safe and reliable way for his mother and other aged persons to get to the market. Sundar Raman, Aneth Guru, and Sandeep Bhaskar of Ideophone, a mobile app company focused on improving the commute experience, developed Suruk, a mobile app that tracks auto-rickshaw fares to fight back against the deplorable fare-gouging by drivers in Bengaluru.

Source: Primary research from author’s fieldwork, Summer 2011

Together they formed a constellation of recent experiments (almost all less than two years old) that were happening in almost every corner of the region without official support. Most experiments were centered around the more profitable but less environmentally friendly modes of cars and auto-taxis. Most attempted to solve the problem of on-demand paratransit, from auto-taxis to auto-rickshaws to motorcycle taxis and cycle rickshaws. Since the cost of learning how to set up and integrate intelligence is non-trivial, there were more experiments happening in cities with larger agglomerations of techies like Bengaluru, which is a global hub for technology development. What was rare were experiments that improved public transit, and buses in particular, and experiments in Dhaka itself, a place known more as a hub of global clothing manufacturing than of software outsourcing. Since most experiments seemed to have had positive impacts, intended and unintended, there was a more difficult question about how to accelerate that experimentation. Was it to support the entrepreneurs directly through financing, through technology (which was mostly customized for single uses), or through marketing their successes globally? Could an outside research team from a leading technical university work in partnership to seed some experiments in places where they would otherwise take a few years to begin? It was based on this research that we decided to focus on seeding intelligence for buses in Dhaka, in hopes that we could spark experimentation both in Dhaka and on buses throughout the region.

Real-time urban data collection techniques: Ubiquitous sensing, Crowdsourcing
Predictability of mobility
Introducing Flocksourcing
Flocksourcing workflow
Experimental design

Dhaka has 9,000 mostly privately run buses that are almost always overcrowded and among the slowest in the world. What is the best way to gather intelligence on them?

Real-time urban data collection techniques: Ubiquitous sensing, Crowdsourcing

The two most popular real-time urban data collection techniques are on opposite sides of the spectrum. On one end is ubiquitous sensing, an approach that leverages the new location-sensing capability of portable devices like smartphones and tablets to try to capture every piece of data at all times, like a massive fish net. The approach takes advantage of the fact that this type of data collection is now relatively cheap (just the cost of hardware and connectivity), but it may produce more data than needed, which has a real cost to store and to mine. For bus intelligence, the approach usually involves placing an automatic vehicle location (AVL) unit on each vehicle (which costs $8,000-$20,000 per bus but which can now be replaced by a smartphone for $200 per bus). For the 9,000 large buses in Dhaka, that cost is non-trivial (about $1.8 million at the smartphone price, and $72-$180 million with conventional AVL units), although perhaps not cost-prohibitive, especially for new services like the Bus Rapid Transit (BRT) service that is about to be rolled out. Since ubiquitous sensing is a passive form of data collection, the metrics it captures are usually limited to those that require little human interface, like bus location and time.

On the other side is crowdsourcing, a participatory approach that attempts to leverage the generosity of strangers, their time and their hardware to gather data for some purpose. In the urban context, participatory mapping via tools like OpenStreetMap has been a successful example of this type of approach. In the bus mobility space, researchers at Carnegie Mellon developed a tool called Tiramisu in 2011 to crowdsource bus location and fullness as well as problems and positive experiences within the Pittsburgh bus system. One of the chief challenges of crowdsourcing transit information, as noted by this pioneering research team, is motivating users. They note that while many crowdsourcing models work because users are altruistic, they find that to be rare in the case of transit users (Steinfeld et al. 2011). Furthermore, in the case of Dhaka, the current lack of smartphone users might make this approach especially difficult.

Faced with these two options and considering the context of buses in Dhaka, some interesting questions arise: Do you need all the data, all the time to provide valuable information to users, operators and regulators? If not, what is the minimum level that you need? What is the best technique to gather that data?

Predictability of mobility

"Despite our deep-rooted desire for change and spontaneity, our daily mobility is, in fact, characterized by a deeply-rooted regularity." - Song, Qu, Blumm and Barabasi (2010), "Limits of Predictability in Human Mobility," Science.

One of the most significant discoveries of the ubiquitous sensing era has been new insights on human mobility patterns based on location data from cell phone use. Until now, there was an assumption that, unlike their animal counterparts, humans had a unique freedom to move where they wanted and not just towards watering holes or food sources. However, contrary to that belief, Song, Qu, Blumm and Barabasi (2010) have found that an astonishing 93% of our mobility patterns are predictable, mostly between home and work. If human mobility is characterized by regularity, it may not be a stretch to predict that the buses that serve human movements also operate with a similar level of regularity. In more developed contexts, buses are intended to operate on tight schedules or headways. In Dhaka, bus schedules are uncoordinated between different operators but collectively should aspire to schedules that maximize profits and passengers. What this may suggest is that continuous and ubiquitous data may not be necessary to predict everything from bus timings to speeds to crowding. A representative sampled subset may be sufficient for most applications.

Introducing Flocksourcing

Despite our best endeavors, keeping people interested in anything these days is difficult. Movies are usually less than two hours, Earth Hour is an hour each year, and popular YouTube videos are less than 5 minutes. Urban planners often assume that everyone wakes up in the morning wanting to save the planet, only to be disappointed when they don’t use the revolving doors or mindlessly dump recyclable cans into the trash (Don’t they care about the polar bears?!). In an age of short attention spans, what people respond well to are not universal ubiquitous suggestions but clearly-defined and achievable missions.

The idea behind flocksourcing is to arm a flock, or a guided crowd, with well-designed sensors to collect a lot of data for a specified place in a limited period of time. If mobility is fairly predictable, then that flocksourced data, which is simply a snapshot of particular buses at a particular time and place, could be fairly representative of the overall system for those buses. Ensuring that a sample is representative of the whole requires two potentially difficult elements in this experimental data collection technique: (1) near flawless organization of the flock and (2) the minimization of flock bias. Fortunately, in Dhaka, there is a tradition of organizing significantly large groups of people -- in the thousands -- for everything from shutting down the roads for a political protest to cleaning up a beach. Our partner in this project, Muntasir Mamun, and his team from Kewkradong have previously led 1,000-person beach clean-ups in an adjacent city to Dhaka for the International Coastal Cleanup. The second concern is a bit more difficult to manage but can be mitigated with good training and monitored closely with real-time data tracking. One of the dangers of flocksourcing for data in a place that has very little (Dhaka does not have a bus map) is that what is measured may be assumed to be the whole truth even if it is just a snapshot. The best insurance against this is more data from a diverse set of sources and flocks with different interests and intentions. If the data is consistent across these sources, then collectively there will be better intelligence on the system in general.

One potentially significant advantage of using active flocks to collect data is the capacity building and advocacy that it potentially creates within the flock. Akin to the critical mappers in Kibera, these flock members become the first-hand witnesses to what they are measuring. In the case of buses in Dhaka, measuring them for a week may change their own perception of them. In addition, they will likely be a great resource for new ideas to improve the user experience, especially if they are frequent bus riders themselves. In a city that has been rated the most vulnerable in the world to climate change and flooding, having a rapid data collection battalion may prove to be an amazing resource, especially if the mobile data networks are working (which may be one of the few things that do during disasters). While flocksourcing was not fully intended to be context specific, it was designed with Dhaka in mind and is likely to be a more valuable strategy in the very places that have both a high availability of human resources and less capital to invest in full-blown smart city infrastructures.

If one attempted to map all the urban data collection techniques together, one might get the diagram below. The suggestion is that, all else equal (which it is not), different data collection techniques are appropriate for different types of data. Analog approaches seem to be losing their importance entirely in this new paradigm since even lengthy surveys can be better streamlined using mobile and web-based interfaces. While ubiquitous sensing seems like the golden child of measurement, it cannot measure that which cannot be automatically sensed by devices, so human perceptions and even measures like crowding are outside its capacity. Crowdsourcing can tap into human perceptions but may not be reliable if a large sample of data is required. Flocksourcing falls in between: it simply scales and targets a crowdsourced effort towards metrics that need to be both quantitatively and qualitatively precise. In an era of balancing human with environmental needs, this is of paramount importance.

Flocksourcing workflow

The above workflow diagram was the result of a month’s work of technical testing and rapid iteration with the team in Dhaka in January of 2012. The design principles that guided this workflow were simple: technically feasible, affordable, and simple to understand and deploy. The last principle was perhaps the most important, especially when operating in an environment with very little exposure to smartphones (none of our partners in Dhaka had ever had one before our engagement) and where English is not the primary language. Bangla, also known as Bengali and also spoken in the Indian state of West Bengal, is the official language and simultaneously the main reason Bangladesh fought for independence from Pakistan.

Technical feasibility

One of our first tests was to see if the same technology that works in Boston would work in Dhaka, a place where smartphones and mobile data networks were relatively new. Specifically, we wanted to answer three questions:

● How accurate is the location-sensing technology?
● How fast was the mobile data network, i.e. how long would it take to upload data from phones into the cloud?
● What were the limitations of locally available hardware?

Location accuracy

Location accuracy in Dhaka based on field testing in January 2012

Location-sensing on smartphones today usually leverages a combination of cell tower triangulation, GPS, and Wi-Fi technologies. Cell tower triangulation is the least accurate and depends on the spacing of the towers, which should be quite tight in an urban area like Dhaka; GPS is usually accurate to 3-5 meters but only outdoors; and Wi-Fi is the most accurate and is robust indoors but only available when there is significant coverage. Location sensors today are usually programmed to algorithmically optimize the location based upon the best available method; in the case of the location-sensing feature we tapped into in our mobile application, it appears that most location points were associated with Wi-Fi hot spots. Wi-Fi hot spots usually reside in locations off the road, so the sensed location is usually slightly off from the actual location of the bus. A rough estimate is that most points are within 15 meters of where they should be, but that figure would need to be verified. Nevertheless, the location accuracy was sufficiently good as a starting point and is assumed to get better in the future.
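
One way the 15-meter estimate could be verified is to pair each sensed point with a surveyed reference point on the road and compute the great-circle distance between them. The sketch below does this with the haversine formula. The function names and the sample coordinates are illustrative assumptions made up for this example; they are not data from the field test.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def share_within(pairs, threshold_m=15.0):
    """Fraction of (sensed, reference) pairs whose offset is within threshold_m."""
    if not pairs:
        raise ValueError("need at least one pair")
    hits = sum(1 for sensed, ref in pairs if haversine_m(*sensed, *ref) <= threshold_m)
    return hits / len(pairs)

# Hypothetical pairs of (sensed point, reference point), invented for illustration
pairs = [
    ((23.7800, 90.4000), (23.7800, 90.4001)),  # roughly 10 m east-west offset
    ((23.7800, 90.4000), (23.7803, 90.4000)),  # roughly 33 m north-south offset
]
print(share_within(pairs))  # 0.5
```

With real reference points (for example, surveyed bus stops), the same calculation would show what share of sensed locations actually falls within the 15-meter estimate.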

Mobile data network speed

Perhaps one of the worst experiences in the Western world is purchasing a cell phone plan. Terrible customer service, locked phones, ridiculous contracts, and just an icky feeling like you are being swindled somehow by fine print. In Dhaka, the experience is almost entirely the opposite. Since the mobile phone industry is quite competitive, there are no contracts, only SIM cards that can be topped up with even the smallest amounts. Since the banking system is largely unavailable to most of the general population, mobile payments have thrived. Enrolling in a mobile data plan in Dhaka simply requires going to any local shop owner, sometimes a chai wallah, paying about 300 Taka ($4 USD), and giving your mobile number and service provider. The local shop owner will log your payment manually in a book and within a few minutes you will get an SMS confirming the payment. The experience is comparatively amazing.

Armed with a mobile data plan, it was important to test how long it would take to upload data to the cloud from the phones. In Dhaka, the current mobile data networks operate on 2G to 2.5G. There has been some resistance towards upgrading the mobile data network to 3G, although there is currently some testing for 4G, which may leapfrog 3G. The data packets that were sent from the phones were quite small - just a few lines of text data. In the field, uploads took 2-3 seconds on average, with occasional longer uploads. After testing different upload intervals, we determined that sending location data every 60 seconds was the optimal balance between gathering enough data and not interrupting the data collection process itself.

Locally available hardware

(Left) Samsung Galaxy Y, not available in Boston; (Right) Symphony Multimedia Phone

Due to customs, cost, and sustainability reasons, we wanted to use phones that could be purchased locally instead of importing them from abroad. The best Android-powered phone available was the Samsung Galaxy Y, a phone designed specifically for India and Bangladesh with their lower price points (13,000 Taka or $175 USD inclusive of taxes, customs fees, etc.). These phones were launched in Dhaka only recently, in January, and are available at the main technology market in Agargaon. The main challenge with the phone itself was the smaller touch interface, about 30% smaller than that of the Nexus One, which was originally used to develop the mobile apps. The rest of the specs, including the location and time sensing, worked the same. One internet-enabled phone that was quite popular at a significantly lower price point was what is called a multimedia phone, made by a Chinese company called Symphony. These phones do not run the standard Android or iPhone operating system but do have links to YouTube, Google Maps and other web applications at a sixth of the cost or less ($30). While not fully functional for our purposes, they do indicate how fast prices for these smartish phones are plummeting, putting increasing pressure on the rest of the market and increasing adoption faster than anyone’s most optimistic predictions.

Affordability

Most of the smart city efforts developed by big companies like Cisco and IBM rely on building a whole new infostructure to integrate different parts of the city, from water to energy to traffic systems. For a city like Dhaka, this investment is cost prohibitive, especially given that the physical infrastructures for water, energy and roads are not in good shape. As such, the intention of the flocksourcing workflow was to minimize costs by leveraging resources that are relatively cheap in Dhaka: mobile data networks and organized groups of people.
Mobile data networks in almost any city, including Dhaka, are already receiving significant investment from the private sector and can be used as the major thoroughfare for intelligence. Rather than relying on completely closed end-to-end systems, people, who are relatively cheap in the 18+ million city of Dhaka, can serve as the last-mile connector for collecting data.

The upshot is a cost structure that requires some upfront investment for hardware but relatively minimal ongoing investment. The only real ongoing costs are the cost of each flock member, which can be as low as $10-$15 per day per person, and the cost of mobile data, which is $4 for every GB of data. Data hosting is free using Google Fusion Tables, and the apps are built on MIT App Inventor, which is also free and open to the public. One of the biggest advantages of MIT App Inventor is that it requires no coding, so the technical development and maintenance effort is minimal. In fact, this author learned mobile phone programming in a few months on the MIT App Inventor service.
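
The cost structure can be tallied in a few lines. The sketch below uses only figures quoted in this document (the $175 phone, the $10-$15 daily stipend, $4 per GB of data, and the 9,000-bus fleet with $200 smartphones or $8,000-$20,000 AVL units). The function names and the assumption of 1 GB of data per phone are illustrative, not measured values.

```python
def flock_cost(members, days, stipend_per_day, phone_price=175.0,
               data_gb_per_phone=1.0, price_per_gb=4.0):
    """Return (upfront, ongoing) cost in USD for one flocksourcing campaign."""
    upfront = members * phone_price                      # one phone per member
    stipends = members * days * stipend_per_day          # daily pay
    data = members * data_gb_per_phone * price_per_gb    # assumed data use
    return upfront, stipends + data

def fleet_hardware_cost(buses, unit_price):
    """Cost in USD of putting one tracking device on every bus."""
    return buses * unit_price

# A flock of 8 for 7 days at the low end of the stipend range
upfront, ongoing = flock_cost(members=8, days=7, stipend_per_day=10.0)
print(upfront, ongoing)                      # 1400.0 592.0

# Instrumenting the whole fleet instead
print(fleet_hardware_cost(9_000, 200))       # 1800000
print(fleet_hardware_cost(9_000, 8_000))     # 72000000
print(fleet_hardware_cost(9_000, 20_000))    # 180000000
```

Under these assumptions, a one-week flock costs on the order of $2,000 including phones, which are reusable for later campaigns.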

(Left) Interface design for MIT App Inventor; (Right) Functionality design using blocks for MIT App Inventor

Simplicity

How do you design an intuitive and engaging data collection app to be used on excessively crowded buses for non-English speaking users with little familiarity with smartphones? By launching often and iterating.

(Left) Morshed Alam counting passengers and tracking the Falgun bus to Uttara in January 2012; (Top Right) Inside view of uncrowded Falgun bus; (Bottom Right) One of 15 iterations of Share My Bus Dhaka mobile Android app

Perhaps the most difficult part of the workflow was designing the mobile applications to be workable in the context of riding a bus in Dhaka, which can be quite a chaotic experience. This includes running to catch the bus, shoving one’s way through the back, paying while aboard, shouting for stops, and then alighting safely from a still moving vehicle. Instead of trying to do research ahead of time on best practices in a distinct cultural context light years away in Dhaka, it was much easier to launch a few apps and rapidly iterate while in the field. The Share My Bus Dhaka app that was eventually used was iteration 15 of that app. The app itself was designed to be legible for someone coming from a non-English background, so icons substituted for words as much as possible. The interface was designed so each action could be completed as quickly and intuitively as possible. The bus details from the previous trip were recorded, passenger counting buttons were made larger, and the survey was designed so it could be completed like a flashcard game in about a minute. These design changes were all made based on feedback from our team in the field.

Share My Bus Dhaka mobile Android app modules: (Left) Bus details; (Middle) Passenger counting; (Right) Rider survey

Surprisingly, even though we were not there, the team in Dhaka had minimal problems with the app in practice, despite only a brief 3-hour training for a new set of 8 volunteers before deploying in the field. We did, however, have multiple problems with users in Boston, from downloading the app to using it properly. This may suggest a need for better training or custom apps for different local contexts. We are hoping that it is the former rather than the latter.

Experimental design

Key Research Questions

Share My Bus Dhaka and Boston were developed in parallel and are essentially the same app except for differences in bus and city information that is preloaded onto the app.

Locations of where 1,014 surveys were conducted onboard buses in Dhaka

When we first launched these experiments, we had no idea how they would turn out. The results surpassed even our wildest imaginations.

While smartphones can be designed to collect vast swaths of data, can flocks of people be organized and incentivized to collect accurate data for a targeted period of time and place?

Organizing flocks in Boston

While we intended to flocksource both Dhaka and Boston, it was much harder to organize and incentivize a semi-volunteer team in Boston than it was for our partners in Dhaka. The main experiment designers, Stephen Kennedy and I, were based in Boston, recruiting and training volunteers from MIT’s grad school of urban planning and transportation. While the advertisement below garnered some interest among a very busy student population, with 8 signups (inclusive of 3 enthusiastic volunteers for whom we procured phones and a data plan), it was very difficult to motivate the flock to download the app and then use it while they were aboard buses. Of the 4 surveys and 20 rides, almost all were completed by the author himself.

Organizing flocks in Dhaka

In Dhaka, the results were so prolific that they were almost unbelievable. The original plan was to start data collection in Dhaka a week earlier, but due to a near unprecedented hartal, or political protest by an opposition party, the government decided to ban not just private vehicles but also buses from the roads. As soon as the coast was relatively clear from a rider safety perspective, our partner Muntasir and his team from Kewkradong organized the group of flocksourcers through their extensive social networks at relatively short notice for a three-hour training in their head office. The Kewkradong team was a perfect fit for this experiment in part because they are not new to organizing large numbers of people for a common mission, leading the annual 1,000-person International Coastal Cleanup at Cox’s Bazaar in southeastern Bangladesh. Paying the riders a stipend of $10 per day was enough to motivate what became an all-star cast of riders, including one female. Six of the 8 Dhaka flocksourcers conducted 100+ surveys on the buses in a 7-day period, resulting in an astounding 1,014 surveys for the week. Even more astonishingly, 3 flocksourcers rode the bus for over 2,000 minutes, or 33+ hours, in that 7-day period. For this author, an hour on a bus in Dhaka is not an easy experience, so this was especially impressive.
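
The productivity figures above reduce to simple per-person rates. The short sketch below reworks them from the totals quoted in this section; the variable names are illustrative, and the 2,000 minutes is a lower bound that applies only to the three most active flocksourcers.

```python
surveys, members, days = 1_014, 8, 7
ride_minutes = 2_000  # lower bound for each of the three most active members

per_member = surveys / members            # surveys per member for the week
per_member_day = per_member / days        # surveys per member per day
ride_hours = ride_minutes / 60            # hours on the bus for the week
ride_hours_per_day = ride_hours / days    # hours on the bus per day

print(round(per_member, 2))          # 126.75
print(round(per_member_day, 2))      # 18.11
print(round(ride_hours, 2))          # 33.33
print(round(ride_hours_per_day, 2))  # 4.76
```

In other words, the average flocksourcer completed about 18 surveys a day, and the most active ones spent close to five hours a day on the bus.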

Monitoring data collection in real-time

One of the biggest advantages of mobile data collection techniques is that they can be monitored almost instantaneously from anywhere. In this case, with the help of a few dashboards, we could monitor and analyze the massive data collection efforts that were happening in Dhaka from 12,500 kilometers and worlds away in Boston. Thanks to great proactive communication by our partners in Dhaka, we were able to provide feedback in the middle of the data collection to make sure the efforts were proceeding properly.

Data authentication

Individual location traces of 8 Dhaka bus flocksourcers on March 19, 2012

The traditional analog methods of data collection for mobility measurement (especially in Dhaka) are usually completed on paper among a distributed group of people. One big challenge of an analog approach is verifying that the data that is collected can be trusted. Just one person who falsifies data or who isn’t paying attention when counting can corrupt an entire data set. Since digital tools take advantage of sensors that can capture data with a level of precision beyond human senses, they provided an extra layer of data that could help validate the data that was collected (and an easier way of eliminating data that was found to be corrupt). Without asking our partners, we could observe from the individual location traces (see previous page) exactly where the volunteers were going during the day - the top two on the left were riding line 36 to Mirpur, the right four were riding line 27 to Uttara and the other two were riding both lines. Unlike survey data, location data is really, really time-consuming to falsify. In fact, it would likely take longer to falsify location data than to go out and collect it.

Flock bias

Perhaps the most surprising result of the Dhaka flocksourced data was the number of surveys that were conducted in that time period. A mind-boggling 1,014 surveys of current bus riders were conducted onboard extremely crowded buses in one week by 8 people. Our local partner Muntasir Mamun set a target of 6 surveys per ride, with rides averaging about 1.5 hours.

“Flock bias” in surveys collected by Dhaka flocksourcers

One of the key challenges of flocksourcing outlined earlier is the potential for bias caused by the composition of the flock. One control against natural flock bias may be more data - and with the unexpectedly large number of surveys collected, that bias should be reduced. Nevertheless, the flock we employed was composed of young, educated, relatively tech-savvy volunteers. (In many ways, this may be the ideal group of flocksourcers in any context, based on their ability to efficiently navigate the technology and the complex and sometimes dangerous bus system.) Since there is a general lack of official or up-to-date data in Dhaka, it is difficult to estimate the exact magnitude of this bias, but it seems that the population of bus riders surveyed may have been younger and more tech-savvy than the overall population. In general though, the flock bias seemed to be otherwise innocuous.

If not all data in a city can be collected by flocks, can a sampled set be useful, especially if certain behaviors are predictable?

One of the design principles of flocksourcing is to collect as little data as possible to still be significantly useful. Understanding that tipping point will help minimize costs and make data collection more affordable for cities that need this intelligence the most. The amount of data that is needed is determined by the predictability of the system and the value of that data to users. Most observers including frequent riders like our partners from Kewkradong assert that not even god could predict the mobility patterns of the uncoordinated bus system in Dhaka. (If you watch One Minute in Dhaka, the video montage of one minute of traffic across various parts of Dhaka by Stephen Kennedy, you will quickly see why). The predictability of the bus system in Dhaka is dependent on its variability (how much the system changes each day) and on how much that variability repeats itself (how much those changes repeat themselves by time of day or day of week). It used to be the case that the more variable a metric was, the more unpredictable it was. However, with the advance of tools like machine learning, patterns that used to be difficult to observe can now be identified and tweaked by computers in almost real-time. The more data that is collected, the better the predictions usually are.
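
The distinction between variability and repeating variability can be made concrete with a small calculation. The sketch below compares the overall spread of a metric with the spread that remains after grouping observations by hour of day; if most of the variation is explained by the hour, the metric is variable but still predictable. The passenger counts are invented for illustration and the function name is hypothetical. This is one possible way to frame the question, not the analysis used in this thesis.

```python
from collections import defaultdict
from statistics import pvariance

def explained_by_hour(observations):
    """Share of total variance explained by hour-of-day grouping (0 to 1).

    observations: list of (hour, value) pairs.
    """
    values = [v for _, v in observations]
    total = pvariance(values)
    if total == 0:
        return 1.0  # no variation at all: trivially predictable
    groups = defaultdict(list)
    for hour, value in observations:
        groups[hour].append(value)
    # Variance left over inside each hour, weighted by group size
    within = sum(len(g) * pvariance(g) for g in groups.values()) / len(values)
    return 1 - within / total

# Invented counts: crowded at 8:00, quieter at 14:00, on three different days
repeating = [(8, 60), (8, 64), (8, 62), (14, 30), (14, 28), (14, 32)]
# Invented counts with the same hours but no repeating pattern
erratic = [(8, 10), (8, 50), (14, 10), (14, 50)]

print(explained_by_hour(repeating) > 0.98)  # True
print(explained_by_hour(erratic))           # 0.0
```

A share close to 1 suggests that a sampled week could stand in for other weeks; a share close to 0 suggests that more continuous data collection would be needed.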

On the other side is how valuable the data is in the specific context of users. In the United States, bus wait times are so important that NextBus, one of the leading providers of bus arrival predictions, promises its time-conscious commuters an accuracy of one minute for waits of less than 5 minutes and less than 2 minutes for waits of less than 10 minutes. In Dhaka, wait times may not be the most important data for users, and therefore that level of accuracy and precision may not be necessary for that metric to be useful. These dimensions of the data matter because they imply what type of real-time data collection technique is the most appropriate to provide that specific service. Data that is higher in value but unpredictable, like bus arrival times, may require ubiquitous sensing (if that data can be collected directly through automated sensors). Data that is lower in value and more predictable, like bus route mapping, could be serviced by crowdsourcing. It is the large blue area in between where flocksourcing may be the most effective, all else equal, and understanding flock productivity and data predictability will help determine the appropriate flock size.

While flocksourced data has potentially a wide variety of uses, for simplicity’s sake, we limit our analysis to one of its more interesting applications: providing user-centric, real-time info to bus riders. Due to their public purpose, most shared transport providers have difficulty differentiating service between customers. Every bus rider in Dhaka is, and is almost mandated to be, treated roughly the same: he or she pays roughly the same fare, jumps to get on the bus, pushes his or her way to a seat, shouts at the driver to get off and then alights from a moving vehicle in the middle of a busy street. (This is of course the opposite of car ownership, where everything can be customized from colors to seat temperature.) Information has an ability to be customized to the individual - and specifically to his or her custom travel behavior, e.g. daily route, timings and potentially gender, age, and one day even beverage preference. Before one can deploy a menu of (real-time) user services, it’s helpful to first understand who the customers are, what makes them happy and how much they value different pieces of information that could improve their experience on buses in Dhaka. Until recently, this information was hard to come by.

Based on 1,014 onboard surveys of current Dhaka bus riders and 10,000 counts by Dhaka flocksourcers


While bus crowding is clearly an indicator that users of Dhaka's buses would find useful, there are two others that we also add to the list: (1) travel time, which is one of several time-based metrics (including wait time) that have proven valuable in other contexts, and (2) routing, which is usually one of the first pieces of information users ask for (and that providers like Google Maps license). Users in Dhaka probably value travel time data highly, as they do in other contexts, since people tend to optimize towards similar daily travel time budgets no matter where they live. Based on our survey data, the average one-way commute time for bus riders in Dhaka was a little over an hour (1 hour and 10 minutes), with some commuters enduring one-way commutes of more than 2.5 hours. In terms of navigation, despite the complexity of the network, there is currently no bus map of Dhaka (at least as far as we know), with the best available version being this one produced by My Digonto. Fortunately for Bangla-speaking bus riders in Dhaka, there is a strong collective intelligence on every street corner, so bus routing information (e.g. which bus to take) is usually not that difficult to come by.

(1) How much useable data was collected by our flock of 8 over 7 days? (2) How predictable is the bus metric?

1 Bus crowding (Data collected: Bus Line 36, 2,000 counts, 500 on selected Wi-Fi spots)


2 Bus travel times (Bus Line 36, 132 one-way trips, 24 trips with both stops 1 and 9)


3 Bus routes (Bus Line 36, 132 one-way trips)


Without proper ground truth and with only a limited one-week snapshot of data, it is difficult to determine exactly how useful the samples that our flock collected would be for predicting conditions on the bus lines that were measured. Ultimately, usefulness in the mobile internet age is determined by use, and even limited subsets of data may be extremely helpful in a context where very little exists, especially if paired with the right visualization output.

That being said, it is likely that the bus crowding data collected by our flock was sufficient. The next step would be to fill in the gaps in time where crowding information is not available using predictive machine learning algorithms, which would improve with additional, similarly sized samples of data. Since a travel time observation requires a bus to send data from both a particular start and a particular end location, there were fewer sample points from which to construct a useable timetable. Aside from employing a bigger flock, which could be cost effective for bus companies if they wanted to use smartphones as automatic vehicle locators (AVLs), a technique called geofencing, where closely located points are clustered together, may help increase the usable sample size as well. There also seemed to be significant variation in the travel time data itself, much of which could not easily be explained by time of day or day of week. Although bus routes seem like they would be fixed, we observed some variation in bus line 36 in our Dhaka sample due to a regional cricket tournament that was happening during our collection period. In theory, it should take only a couple of complete rides along a route to map it, but making the map useable would require broader coverage of lines, something that would be interesting to trial in a subsequent round of data collection.
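One way the geofencing idea could work is sketched below: each GPS ping is snapped to the nearest stop within a fixed radius, so that pings recorded near a stop, not only exactly at it, count towards a travel time observation. This is a minimal illustration under stated assumptions, not the method used in the study: the stop names and coordinates, the 150 m default radius and the function names are all hypothetical.

```python
import math
from typing import Dict, List, Optional, Tuple

EARTH_RADIUS_M = 6_371_000.0
Stops = Dict[str, Tuple[float, float]]    # stop id -> (lat, lon)
Ping = Tuple[float, float, float]         # (timestamp in seconds, lat, lon)


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_stop(lat: float, lon: float, stops: Stops, radius_m: float) -> Optional[str]:
    """Return the closest stop whose geofence contains the point, else None."""
    best_id, best_d = None, radius_m
    for stop_id, (slat, slon) in stops.items():
        d = haversine_m(lat, lon, slat, slon)
        if d <= best_d:
            best_id, best_d = stop_id, d
    return best_id


def travel_time_s(pings: List[Ping], stops: Stops, origin: str,
                  destination: str, radius_m: float = 150.0) -> Optional[float]:
    """Seconds from the last ping inside the origin fence to the first later
    ping inside the destination fence, for one time-ordered trip."""
    depart = None
    for t, lat, lon in pings:
        stop = nearest_stop(lat, lon, stops, radius_m)
        if stop == origin:
            depart = t
        elif stop == destination and depart is not None:
            return t - depart
    return None


if __name__ == "__main__":
    # Hypothetical stop locations and one hypothetical trip.
    stops = {"stop_1": (23.7500, 90.4000), "stop_9": (23.8000, 90.4000)}
    trip = [(0, 23.7505, 90.4001), (60, 23.7508, 90.4000),
            (600, 23.7750, 90.4000), (1500, 23.7992, 90.4003)]
    print(travel_time_s(trip, stops, "stop_1", "stop_9"))
```

A wider radius turns more trips into usable start-to-end observations, at the cost of blurring exactly where each trip is considered to begin and end, so the radius would need to be tuned against real stop spacing.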


Self-organizing flocks

In nature, flocks form not by the leadership of one individual but by the collective benefit of the entire flock. Flocks of geese fly in a V because of the collective uplift they experience in their wings. Flocks of sheep, as well as schools of fish, stay together for their collective safety from predators. In a similar spirit, small flocks of captive but smartphone-enabled bus riders in Dhaka (and any city in the world, for that matter) could, in their own self-interest, organize to collect data that would improve their collective experience. If crowding has everything to do with happiness in Dhaka, a flock could be tasked to count passengers in real time, helping its members and others avoid the busiest buses and simultaneously helping operators and regulators make the appropriate adjustments to avoid painfully crowded commutes. What we've discovered in this first step is that the real-time data collection technology works in Dhaka, that even a small flock operating for a short time can collect an unbelievable amount of intelligence, and that this intelligence can be used to improve a more sustainable mode of transport. It would not take much additional investment, whether by government, bus operators or riders themselves, to continue this collaborative production of a mobility infostructure in Dhaka. It is our hope that this simple possibility of a low-cost, smart Dhaka bus system inspires the city to enhance its commitment towards sustainable mobility at a moment when that seems like the last thing on anyone's mind*.

*As I write this sentence, Dhaka is a city paralyzed by the fourth nationwide strike in 8 days. Buses, cycle rickshaws and auto-rickshaws are usually the only vehicles allowed on the road during these hartal, or political protest, days. Our own data collection effort in March was delayed by a week due to an even bigger opposition strike that, for the first time, shut down the bus system itself. In a strange way, these days of political unrest usher in an unintended car-free day, something I experienced on my first visit to Dhaka, where 4 of the 6 days I was there were car-free, congestion-free and pollution-free.


All the following images were created with the data that was collected from the flocksourcing of Dhaka buses in March 2012. The hope is that they provide inspiration for continued iterations for the future.


Dhaka seems to need bigger buses.


To be continued . . .

The Urban Launchpad is a social mission-driven company aspiring to accelerate experimentation and innovation in cities through rapid prototyping and performance measurement on an urban scale.


