The Ethics of Big Data
Balancing Opportunity & Responsibility
Nick Meyers, GISP (StreetLight Data)
Whit Blanton, FAICP (Forward Pinellas)
September 2023
Ethics of Big Data
• What is Big Data?
• Benefits of Big Data
• Ethical Concerns
• Balancing Opportunity & Responsibility
• Q&A
• Ownership
• Individuals own their own data
• Privacy
• If data transactions occur, all reasonable effort needs to be made to preserve privacy
• Transaction transparency
• If an individual's personal data is used, they should have transparent access to the algorithm design
• Inferences and predictions using technologies such as algorithms
• Currency
• Individuals should be aware of financial transactions resulting from the use of their personal data and the scale of these transactions
• Consent
• We must obtain informed and explicitly expressed consent for uses of personal data
• Openness
• Aggregate data sets should be freely available
• Framework
• Aspirational Principles
• Codes of Conduct
• Advisory Opinions & Process
• Applies to AICP members but reflects broadly on the planning profession
• Updated in 2022 to add emphasis to equity, diversity and inclusion, and provide more clarity
1. People who participate in the planning process shall continuously pursue and faithfully serve the public interest;
2. People who participate in the planning process shall do so with integrity;
3. People who participate in the planning process shall work to achieve economic, social and racial equity;
4. People who participate in the planning process shall safeguard the public trust; and
5. Practicing planners shall improve planning knowledge and increase understanding of planning activities.
• Public Interest
• Pay special attention to the interrelatedness of decisions and their unintended consequences
• Plan with Integrity and Safeguard the Public Trust
• Deal fairly with all participants in the planning process.
• Provide timely, adequate, clear, accessible, and accurate information on planning issues
• Respect the rights of all persons and groups and do not discriminate against or harass others
• Incorporate equity principles and strategies… Develop metrics and track plan implementation over time to measure and report progress
• Improve Planning Knowledge & Increase Public Understanding
• Contribute to the development of, and respect for, our profession by improving knowledge and techniques, and sharing the results of experience and research
• Mandated behavior for members of AICP
• Defines procedures for enforcement
• Quality and Integrity of Practice
• Conflicts of Interest
• Improper Influence/Abuse of Position
• Honesty and Fair Dealing
• Responsibility to Employer
• Discrimination/Harassment
• Bringing a Charge/Lack of Cooperation with Ethics Office
• Quality and Integrity of Practice
• We shall not deliberately fail to provide adequate, timely, clear and accurate information on planning issues
• We shall not direct or pressure other professionals to make analyses or reach findings not supported by available evidence
• Honesty and Fair Dealing
• We shall not disclose or use to our advantage…information gained in a professional relationship…that we should recognize as confidential because its disclosure could result in detriment to the client or employer
• Discrimination
• We shall not commit or ignore an act of discrimination or harassment
• Big Data offers advantages for analyzing and tracking performance measures that benefit planning decision-making
• Artificial Intelligence (AI) has powerful predictive analytics that should be balanced with serving the public interest and ensuring equitable outcomes
• Data source(s) disclosure and protection of privacy are paramount
• Clarify limitations of Big Data – what it does and doesn’t tell us
• Big Data complements, but does not substitute for, understanding the lived experience of people affected by planning decisions
“Big data is a combination of structured, semi-structured and unstructured data collected by organizations that can be mined for information and used in machine learning projects, predictive modeling and other advanced analytics applications.”
Source: Tech Target (https://www.techtarget.com/searchdatamanagement/definition/bigdata?Offer=abMeterCharCount_var2)
Source: Hurree Ltd
• Volume: Amount of data
• Velocity: Speed data is generated, managed and processed.
• Variety: Diversity and range of types and formats
• Veracity: Quality and trustworthiness of the data
• Value: Business value of the data
2. Engagement
3. Measurable Outcomes
“Zone Activity” for pedestrian trips across Broward County in 2021:
• Pedestrian Volumes that identify trip activity hotspots
• Trip attributes that provide context to that activity
Home in on, compare, and contrast the types of travel occurring at specific sites.
“Origin-Destination” for all vehicular trips across Broward County in 2021:
• Identify origin-destination pairs with high-density trip activity and compare to existing infrastructure
• Trip attributes help to identify trips that are candidates for mode shift
Use up-to-date regional patterns of vehicular travel to ensure alternative modes are serving current demand
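As a minimal sketch of the origin-destination ranking described above, the Python below counts trips per origin-destination pair and lists the busiest pairs. The trip records, zone names, and the `top_od_pairs` helper are hypothetical and for illustration only; they do not represent any vendor's data format or method.

```python
from collections import Counter

# Hypothetical trip records as (origin_zone, destination_zone) tuples.
# Zone names are illustrative placeholders, not real analysis zones.
trips = [
    ("Zone A", "Zone B"),
    ("Zone A", "Zone B"),
    ("Zone B", "Zone C"),
    ("Zone A", "Zone B"),
    ("Zone C", "Zone A"),
    ("Zone B", "Zone C"),
]


def top_od_pairs(trip_records, n=3):
    """Count trips per origin-destination pair and return the n busiest pairs."""
    counts = Counter(trip_records)
    return counts.most_common(n)


busiest = top_od_pairs(trips, n=2)
for (origin, destination), count in busiest:
    print(f"{origin} -> {destination}: {count} trips")
```

The ranked pairs could then be compared against existing transit or bicycle facilities to see where high demand lacks an alternative mode.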
“Trips to/from pre-set geography” for all vehicular trips starting in Broward County Justice40 tracts in 2021:
• Identify where trips are going from underserved communities
• Compare trip attributes to regional statistics to assess equity with existing infrastructure
Ensure that complete corridors are serving the needs of those who need them most and those who have been historically neglected
Source: Exquisite Imagination
Source: https://www.bleepingcomputer.com/news/security/discordioconfirms-breach-after-hacker-steals-data-of-760k-users/
• Data Breaches
• Raw Data – Accuracy – Aggregation
• Discrimination
• Data Management
• PII (Personally Identifiable Information)
• Risk of identity theft
• Proper Storage and query techniques
• Don’t collect and store this information if not needed
• Individual movements
• Keeping individual privacy and autonomy
• Aggregation
• De-identification
• PII is removed or obscured so that the information cannot be used to identify an individual
• Movement data and level of granularity
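The aggregation, de-identification, and granularity points above can be illustrated with a small sketch. It assumes hypothetical raw records of `(device_id, latitude, longitude)`, coarsens coordinates to a grid, reports only counts of distinct devices per cell, and suppresses cells below a minimum group size. The threshold and rounding precision are assumed values chosen for illustration, not recommended standards.

```python
MIN_GROUP_SIZE = 3  # assumed suppression threshold, for illustration only
GRID_DECIMALS = 2   # assumed granularity: round coordinates to 2 decimal places

# Hypothetical raw records: (device_id, latitude, longitude).
raw_points = [
    ("dev-1", 26.1224, -80.1373),
    ("dev-2", 26.1231, -80.1368),
    ("dev-3", 26.1219, -80.1402),
    ("dev-4", 26.2034, -80.2511),
]


def aggregate_points(points, decimals=GRID_DECIMALS, min_size=MIN_GROUP_SIZE):
    """Return distinct-device counts per coarse grid cell.

    Device IDs are used only to count distinct devices and never appear in
    the output. Cells with fewer than min_size devices are dropped.
    """
    devices_by_cell = {}
    for device_id, lat, lon in points:
        cell = (round(lat, decimals), round(lon, decimals))
        devices_by_cell.setdefault(cell, set()).add(device_id)
    return {
        cell: len(devices)
        for cell, devices in devices_by_cell.items()
        if len(devices) >= min_size
    }


cell_counts = aggregate_points(raw_points)
print(cell_counts)
```

Coarser cells and higher thresholds give stronger privacy protection at the cost of spatial detail, which is the granularity trade-off noted in the list above.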
Focused on developing and recommending concepts of right and wrong uses of Big Data
- Emphasis on protection of personal data.
- Different types of data consent
- Ethical and moral code of conduct for Big Data use
- e.g., represent customers accurately and put fairness above analytics
CLEAN
PATTERNIZE
CONTEXTUALIZE
AGGREGATE
• Potential supplier questionnaires & evaluations
• Review Privacy Policies, Terms of Use
• Set minimum baseline – limited subsets, demonstrated use of de-identification, hashing, encryption
• Contractual reps & warranties
• Regular check-ins, yearly reassessments, ongoing engagement on potential changes to data feeds in response to or in preparation for changes in law & best practices
• Re-hash incoming data
• Multiple levels of security
• Break location records into pieces for contribution to probability distributions
• Automated privacy & coverage checks:
✓ what is the use case?
✓ type of data used?
✓ sample size?
✓ land use?
✓ time period?
• Multi-step, multi-layered de-identification/anonymization
• Training
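Two of the steps listed above, re-hashing incoming data and an automated sample-size check, might look roughly like the following. This is a sketch under stated assumptions, not a description of any company's actual pipeline: the key, the threshold, and the function names `rehash_id` and `passes_sample_check` are hypothetical. Keyed hashing on its own is pseudonymization rather than full anonymization, which is why it is only one layer among several.

```python
import hashlib
import hmac

SECRET_KEY = b"replace-with-a-managed-secret"  # hypothetical key, never hard-code in practice
MIN_SAMPLE_SIZE = 30  # assumed minimum number of distinct devices


def rehash_id(supplier_id: str, key: bytes = SECRET_KEY) -> str:
    """Replace a supplier-provided identifier with a keyed SHA-256 digest."""
    return hmac.new(key, supplier_id.encode("utf-8"), hashlib.sha256).hexdigest()


def passes_sample_check(device_ids, min_size: int = MIN_SAMPLE_SIZE) -> bool:
    """Return True only if the analysis covers enough distinct devices."""
    return len(set(device_ids)) >= min_size


# Hypothetical incoming identifiers from a supplier feed.
incoming = [f"supplier-device-{i}" for i in range(40)]
rehashed = [rehash_id(device_id) for device_id in incoming]

print(passes_sample_check(rehashed))       # 40 distinct devices
print(passes_sample_check(rehashed[:10]))  # only 10 distinct devices
```

A real check would also weigh the use case, data type, land use, and time period listed above; sample size is simply the easiest of those to show in a few lines.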
Aggregate metrics about composite groups of people – it’s not about individuals
• Contractual safeguards including NO Re-identification
• Put end users on notice in Web App that they have important obligations to protect location data
• Encourage customers to get involved
– Civic Privacy Leaders Network, The Future of Privacy Forum (FPF), use best practices
Copyright © 2023 StreetLight Data, Inc.
• Established guidelines for our data suppliers
• Promote responsible data practices and ensure the data they provide meets our high standards for privacy protection.
• Normalization
• Collection, sample size, validation, representative sample
• Movement of an aggregate group
• Analyze mobility patterns – not the movement of individuals.
• Training & Education
• Customer Care Team (onboarding, support, training)
• Workshops, webinars, conferences, and our annual user summit
What’s Good?
- Quickly gain vast amounts of data (In ways not feasible with traditional data sources)
- High level scans, needs assessments, engagement
- Making wise & prudent financial choices
- Improved understanding of the actual need for a project
- Ability to conduct ROI or benefit-cost analysis
- After Studies – Measuring successful outcomes
What’s Bad?
- Not reviewing results
- Relying too heavily on a single number or value
- Understand potential Bias
- Know context for using data
- Not having a process or strategy on how to leverage Big Data
- Including equity principles and necessary training
- Using Big Data to eliminate steps from the planning process
- engagement, outreach, qualitative information, life experiences
• It’s a Tool!
• Balance opportunity with responsibility
• We are all in this together