Why Do You Know So Much About Me? Privacy in the Digital Age
Not talking about surveillance
Not talking about the government
But rather: the voluntary disclosure of personal information to private institutions
We say one thing. I want my privacy. We do something else. Here’s my data. Take what you want. (just give me my stuff)
43% of online users claim that they are likely to read the privacy policy of a website before buying anything
What Privacy Statements Say
26% actually consulted the privacy policy. Even odder, there was no difference between privacy fundamentalists, pragmatists, and the unconcerned
71% want to control who can access their personal information
75% have supplied: • First name • Last name • E‐mail • Street address
50% have supplied: • Phone number • Birthday • Credit card information
“You have zero privacy. Get over it” Scott McNealy Former CEO Sun Microsystems
“If you have something you don’t want anyone to know, maybe you shouldn’t be doing it in the first place.” Eric Schmidt Former Google CEO
“People have gotten more comfortable not only sharing more information, but more openly and with more people.” Mark Zuckerberg Facebook CEO
What do you think privacy is?
Privacy is….? Secrecy, Concealment, Seclusion, Solitude, Confidentiality, Anonymity, Prejudicial Information, Personally Identifiable Information (PII), Whatever you want it to be
Privacy is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others.
Viewed in terms of the relation of the individual to social participation, privacy is the voluntary and temporary withdrawal of a person from a general society into a condition of anonymity or reserve.
Privacy is the ability of an individual or group to seclude themselves or information about themselves and thereby reveal themselves selectively.
Privacy in Colonial America
• Find an open field to talk
• Sneak off into the woods
• No privacy indoors
• Churches encouraged neighbors to snoop on each other
Privacy in the 1800s
• Long‐distance communication by telegraph
• Letters
• Concern about invasive press
• Snooping discouraged
• Gossip, word of mouth
Privacy from 1900 ‐ 1965
• First bugging device
• Search of electronic conversations constitutional
• Telephone communications over wires
• Cold War prompts government to increase surveillance of civilians without their knowledge
Privacy from 1965 ‐ 1990
• Watergate scandal
• Personal computers
• Public‐key encryption invented
• Internet emerged
• Sensationalist journalism
Privacy from 1990 ‐ 2001
• No privacy for public figures
• Wireless communication
• Cameras
• Satellites
• Confusion over who owns content on computer networks
Privacy After September 11th
• Private customer information divulged to federal authorities hunting for terrorists or criminals
• Airport searches
• Polls in the US indicated that people think that the 1st Amendment of the US Constitution might go too far
Total Information Awareness Post 9/11 project to:
[Create] enormous computer databases to gather and store the personal information of everyone in the United States, including personal emails, social network analysis, credit card records, phone calls, medical records, and numerous other sources, without any requirement for a search warrant. Additionally, the program included funding for biometric surveillance technologies that could identify and track individuals using surveillance cameras and other methods.
Television & Privacy
• 1992 brought the launch of Reality Television, where everyone’s lives became public consumption
• This brought about shows about people:
  • Living together in homes and on islands
  • Families struggling with personal issues
  • Celebrities’ private issues made public
  • People showing off their stupidity to win money and fame
• In short, Reality TV took the privacy discussion to a new level
Privacy Today
• YouTube has ended all forms of personal privacy
• Bloggers have made their personal lives (and those of their friends and acquaintances) topics of discussion for the entire world
• And then came social networks…
• We are comfortable sharing our lives and thoughts instantly with thousands of people, close friends and strangers alike
Ways Technology Threatens Privacy Phishing
Cloud Computing
Malware & Spyware
Electronic Medical Data
Social Networking sites
Public Wi‐Fi
Photo & Video Sharing
Retail Loyalty Cards
Web History
Workplace Computers
Targeted Advertising & Cookies
Cell Phones
Why Has Privacy Changed?
• Curiosity
• Convenience
• The Internet and evolving technology
• Social trends
• Desire to relate & share with others
• Identity
• Fame
• Posterity
The primary business model of today’s most successful corporations is the monetization of the mass collection, correlation & analysis of individuals’ private data
Private Info Monetized
• Acxiom – 750 billion pieces of information, or 1,500 facts on ½ billion people. Correlates “consumer” info from signups, surveys, and magazine subscriptions. USD 1.38 billion turnover for FY2008
• Colligent – Actionable consumer research derived from social networks
• Rapleaf – 450 million social network profiles. Submit a request and aggregated social network profiles are returned within a day
• Phorm – Uses “behavioral keywords” (keywords derived from a combination of search terms, URLs and even contextual page analysis over time) to find the right users
How Does It Affect Us?
White’s Taxonomy of Online Privacy Invasion
• Web Request
• Cross Site Tracking
• Rich Browser Environments
• Application Data
• Aggregation, Correlation & Meta‐Data
Taxonomy – Web Request
• A single web request
• An image on a website
• One webpage is made up of multiple requests
What They Can Find Out
• Location (latitude, longitude, city, country)
• Language
• Operating system & browser
• What site you came from
• ISP
• Have you been here before?
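The items above can be read straight off one request. The sketch below is an illustration only: the request text, the host names, and the `describe_request` helper are hypothetical, and the IP address comes from a range reserved for documentation. It parses a hand-written HTTP request and lists what the receiving server learns before any login takes place.

```python
# Hypothetical raw request, written out by hand for illustration.
RAW_REQUEST = (
    "GET /images/logo.png HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/115.0\r\n"
    "Accept-Language: en-US,en;q=0.9\r\n"
    "Referer: https://search.example.org/?q=privacy\r\n"
    "Cookie: visitor_id=abc123\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw request into its request line and a dict of headers."""
    lines = raw.split("\r\n")
    headers = {}
    for line in lines[1:]:
        if not line:  # a blank line ends the header section
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers


def describe_request(raw, client_ip):
    """Return what a server can read off a single request."""
    request_line, headers = parse_request(raw)
    return {
        "ip_address": client_ip,  # basis for ISP and rough location lookups
        "resource": request_line.split()[1],
        "language": headers.get("accept-language", "").split(",")[0],
        "os_and_browser": headers.get("user-agent"),
        "came_from": headers.get("referer"),
        "been_here_before": "cookie" in headers,
    }


info = describe_request(RAW_REQUEST, "203.0.113.7")
for field, value in info.items():
    print(f"{field}: {value}")
```

Turning the IP address into a city or an ISP name needs an external lookup database, which this sketch leaves out.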
Taxonomy – Cross Site Tracking
• Using cookies to track across computers and affiliated sites
• Cookie is stored on your computer and sent with every request
• Cookies usually associated with login details
What They Can Find Out
• Who you are
• What sites you visit
• Behavioral profiles
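A small simulation may make the mechanism concrete. Everything here is hypothetical (the `Tracker` and `Browser` classes, the `tracker.example` domain, the page names). It is a sketch of the general idea, not any real tracking network's implementation. One third-party server is embedded on several sites, and the browser returns that server's cookie on every visit.

```python
import itertools
from collections import defaultdict


class Tracker:
    """Hypothetical third-party server whose image is embedded on many sites."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.profiles = defaultdict(list)  # cookie id -> pages it was seen on

    def serve(self, cookie, embedding_page):
        if cookie is None:  # first contact: hand out a new identifier
            cookie = f"uid-{next(self._ids)}"
        self.profiles[cookie].append(embedding_page)
        return cookie  # sent back to the browser as a cookie


class Browser:
    def __init__(self):
        self.cookie_jar = {}  # keyed by the domain that set the cookie

    def visit(self, page, tracker, tracker_domain="tracker.example"):
        # The page embeds a resource from the tracker, so the browser
        # sends along whatever cookie it already holds for that domain.
        sent = self.cookie_jar.get(tracker_domain)
        self.cookie_jar[tracker_domain] = tracker.serve(sent, page)


tracker = Tracker()
alice = Browser()
for page in ["news.example/politics", "shop.example/shoes", "health.example/symptoms"]:
    alice.visit(page, tracker)

bob = Browser()
bob.visit("news.example/sports", tracker)

for uid, pages in tracker.profiles.items():
    print(uid, pages)
```

The three sites never exchange data with each other. The behavioral profile builds up on the tracker alone, because the same identifier comes back with every embedded request.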
Taxonomy – Rich Browser Environments
Rich Web 2.0 Technologies
• JavaScript/AJAX
• Flash/Silverlight
What They Can Find Out
• Browser history
• Clipboard data
• Key presses
• Visual stimulus
• Browser plugins
• Desktop display preferences
Taxonomy – Application Data
• Rich information inputs
• Structured & unstructured data
• Search requests
• E‐mails
• Calendar items
• Instant message communications
What They Can Find Out
• Who you are
• Who your friends are
• What you’re doing on Sunday
• Your interests
Taxonomy – Aggregation, Correlation & Meta‐Data
Combining the previous levels
• Meta‐Data – includes interactions with applications
• Aggregation – combining the information from various sources
• Correlation – normalizing entities across sources
Provides information you may not be aware of
What They Can Find Out
• Social networks
• Behavioral profiles
• Psychological profiles
• Deep databases
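As a rough illustration of correlation and aggregation, the sketch below merges records from three invented sources into one profile. The source names, the fields, and the choice of a lower-cased e-mail address as the matching key are all assumptions made for this example. Real data brokers presumably match on many more attributes.

```python
from collections import defaultdict

# Invented records from three unrelated sources; the same person appears
# in each one with the e-mail address written slightly differently.
SOURCES = {
    "newsletter_signup": [{"email": "Pat.Lee@Example.com", "name": "Pat Lee"}],
    "loyalty_card": [{"email": "pat.lee@example.com ", "purchases": ["running shoes"]}],
    "survey": [{"email": "PAT.LEE@EXAMPLE.COM", "age_range": "35-44"}],
}


def normalize_email(address):
    """Correlation step: map spelling variants onto one entity key."""
    return address.strip().lower()


def aggregate(sources):
    """Aggregation step: merge every source's fields under the shared key."""
    profiles = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            key = normalize_email(record["email"])
            for field, value in record.items():
                if field != "email":
                    profiles[key][field] = value
            profiles[key].setdefault("seen_in", []).append(source_name)
    return dict(profiles)


profiles = aggregate(SOURCES)
print(profiles)
```

No single source holds the name, the purchases, and the age range together. The combined profile exists only after the join, which is why aggregation can reveal information you did not knowingly give to any one party.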
How Does Information Get Revealed?
By ISPs
• ISPs always know your IP address and the IP address with which you are communicating
• ISPs are capable of observing unencrypted data passing between you and the Internet, but not properly encrypted data
• They are usually prevented from doing so by social pressure and law
By E‐Mail
• May be inappropriately spread by the original receiver
• May be intercepted
• May be legally viewed or disclosed by service providers or authorities
By Discussion Groups
• There is no barrier for unsolicited messages or emails within a mailing list or online discussion group
• Any member of the list or group could collect and distribute your email address and information you post
By Internet Browsers
• Most web browsers can save some forms of personal data, such as browsing history, cookies, web form entries and passwords
• You may accidentally reveal such information when using a browser on a public computer or someone else's computer
By Search Engines
• Search engines have and use the ability to track each one of your searches by IP address, search terms and time of day
How Do We Know ‐ AOL
• Aug 7, 2006 ‐ AOL apologized for releasing search log data on subscribers that had been intended for use with the company's newly launched research site.
• Almost two weeks before that, AOL had quietly released roughly twenty million search records from 658,000 users on their new AOL Research site.
• The data includes a number assigned to the anonymous user, the search term, the date and time of the search, and the website(s) visited as a result of the search.
• The NY Times was able to identify several users by cross‐referencing with phonebooks and public records
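The sketch below shows why a number in place of a name offers weak protection. The log rows are invented for illustration and only mimic the fields the slide describes (user number, search term, date and time). They are not taken from the AOL release.

```python
from collections import defaultdict

# Invented rows: (anonymous user number, search term, date and time).
SEARCH_LOG = [
    (1001, "plumbers in springfield", "2006-03-01 09:12"),
    (2002, "cheap flights to denver", "2006-03-01 09:15"),
    (1001, "homes for sale elm street springfield", "2006-03-02 18:40"),
    (1001, "pat lee family reunion", "2006-03-04 20:05"),
]


def searches_by_user(log):
    """Group search terms under each pseudonymous user number."""
    grouped = defaultdict(list)
    for user_number, term, _timestamp in log:
        grouped[user_number].append(term)
    return dict(grouped)


for user_number, terms in searches_by_user(SEARCH_LOG).items():
    print(user_number, terms)
```

Because the same number is attached to every query, one user's searches can be read together. In this made-up example a town, a street, and a surname all appear under user 1001, and that is the kind of combination that can be checked against a phonebook or public records.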
How Do We Know – Department of Justice
• Jan 2006, the US Dept of Justice issued a subpoena asking popular search engines to provide a "random sampling" of 1 million IP addresses that used the search engine, and a random sampling of 1 million search queries submitted over a one‐week period.
• The government wanted the information to defend a child pornography law.
• Microsoft, Yahoo and AOL complied with the request, while Google fought the subpoena.
How Do We Know ‐ Google
• Google collects massive amounts of user data
• Gmail has a machine reading email to improve the relevance of advertisements displayed
• Google Street View ‐ public/private property & people captured in images
• Search histories are kept for two years and identified via a cookie
By Indirect Marketing
• Web bugs ‐ a graphic (in a website or a graphic‐enabled email) that can confirm when the message or web page is viewed and record the IP address of the viewer
• Third party cookies ‐ a web page may contain images or other components stored on servers in other domains. Cookies that are set during retrieval of these components are called third‐party cookies.
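A web bug can be sketched in a few lines. The domain `tracker.example`, the parameter names `r` and `c`, and both helper functions are hypothetical and chosen only for illustration. The sender embeds a one-pixel image whose URL is unique to the recipient, and the server records who fetched it.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def make_web_bug(recipient_id, campaign):
    """Build the invisible image tag placed in an email or web page."""
    query = urlencode({"r": recipient_id, "c": campaign})
    return (
        f'<img src="https://tracker.example/pixel.gif?{query}" '
        'width="1" height="1" alt="">'
    )


def log_view(requested_url, viewer_ip):
    """What the tracking server can record when the image is fetched."""
    params = parse_qs(urlparse(requested_url).query)
    return {
        "recipient": params["r"][0],
        "campaign": params["c"][0],
        "ip": viewer_ip,
    }


print(make_web_bug("user-42", "spring sale"))
print(log_view("https://tracker.example/pixel.gif?r=user-42&c=spring+sale", "198.51.100.9"))
```

The image itself carries no information. The act of requesting it confirms that the message was opened and exposes the viewer's IP address. Many mail clients offer to block remote images by default for this reason.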
What Are Cookies?
• Cookies are data packets sent by a server to a web client and then sent back unchanged by the client each time it accesses that server
• Cookies are used for authenticating, session tracking and maintaining specific information about users, such as site preferences or the contents of their electronic shopping carts
• Cookies are only data, not programs or viruses
• There are two types of cookies ‐ persistent and non‐persistent
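The round trip can be sketched with Python's standard `http.cookies` module. The cookie names, the values, and the `is_persistent` helper are made up for this example. It treats a cookie with a `Max-Age` or `Expires` attribute as persistent and one without either as a non-persistent (session) cookie.

```python
from http.cookies import SimpleCookie

# Server side: build the cookies to send to the client.
response = SimpleCookie()
response["session_id"] = "abc123"  # no expiry: gone when the browser closes
response["language"] = "en"
response["language"]["max-age"] = 60 * 60 * 24 * 365  # kept for a year

print(response.output())  # one "Set-Cookie:" header line per cookie


def is_persistent(morsel):
    """A cookie with Max-Age or Expires outlives the browser session."""
    return bool(morsel["max-age"] or morsel["expires"])


# Client side: the browser sends the name=value pairs back unchanged
# in a single "Cookie:" header on later requests to the same server.
request = SimpleCookie()
request.load("session_id=abc123; language=en")

for name, morsel in request.items():
    print(name, "=", morsel.value)
```

The cookie is plain text in both directions. The server can only read back what it stored earlier, which fits the point that cookies are data rather than programs.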
Why Don’t We Like Cookies?
• Cookies can be hijacked and modified by attackers
• Cookies can be used to track browsing behavior, so some people feel they are being tagged
By Direct Marketing
• Direct marketing is a sales pitch targeted to a person based on previous consumer choices. It is common these days
• Many companies also sell your information to others or share it with them. This sharing with other businesses can be done rapidly and cheaply
By Instant Messaging
• Your IM conversation can be saved onto a computer even if only one person agrees
• Workplace IM can be monitored by your employer
• SPIM ‐ Spam distributed in IM
By Employers
• 76% of employers monitor employees' website connections
• 65% use technology to block connections to banned websites
• 55% monitor email
By Cybercrime
• Spyware takes advantage of security holes to attack the browser, forcing itself to be downloaded and installed so it can gather information without your knowledge
• Phishing occurs when criminals lure the victim into providing financial data to an insecure website
• Pharming occurs when criminals plant programs in the victim's computer which redirect the victim from legitimate websites to scam look‐alike sites
Facebook “Privacy”