DeepWeb


10 THINGS YOU SHOULD KNOW ABOUT THE DEEP WEB


INTRODUCTION

The Internet is used so frequently and heavily in our society that many people believe we have discovered all it has to offer. In reality, a large part of what defines “the web” is something we have limited access to and limited understanding of. The phrase “there’s more than meets the eye” does not limit itself to the physical world. As we grow with technology, it’s important that we understand its true potential to manipulate, inform, communicate with, and influence us. We must also recognize the true size of the internet, its capacity for growth, and how it can be used for both positive and negative outcomes. Consider your relationship with technology as you learn ten new things about the deep web.


Intro

What is the Deep Web

Web Crawlers

Accessing the Deep Web

SilkRoad/Crime

Currency/Anonymity

Activism/Politics

The Future of the Deep Web

Conclusion


WHAT IS THE DEEP WEB

The deep web is the section of the World Wide Web that cannot be discovered with commonly used search engines such as Google, Bing, and Yahoo. It includes password-protected sites, dynamic pages, and encrypted networks. In other words, when you surf the Web, you are really just scratching the surface of the web’s total content. Far below that there are tens of trillions of pages that most people have never seen. This includes everything from boring statistics, to human body parts for sale, to underground drug and crime rings.

Though the Deep Web is little understood and rarely brought to the public’s attention, the concept is quite simple. Think about what happens when you search for anything on the internet. To give you results, Google, Yahoo, and Bing constantly index pages so they know what to show when something is typed into their search bar. What they fail to capture are dynamic pages and standalone pages: the ones generated when you ask an online database a question, or pages with a unique address that nothing else on the internet links to. Take, for example, the results of a query on the Census Bureau site. When web-searching software arrives at a database, it typically cannot follow links into the deeper content behind the search box, because that data is unique to the query used to find it. Google and others also don’t capture pages behind private networks or standalone pages that connect to nothing at all. These are all part of the Deep Web, and because they are so hidden, this information is often poorly understood.


WEB CRAWLERS

A Web crawler is an Internet bot that systematically browses the World Wide Web, generally for the purpose of indexing it. A Web crawler may also be referred to as a Web spider, an ant, or an automatic indexer. A crawler starts with a list of URLs to visit, called the seeds, chosen by the search engine running the software. As the crawler visits these URLs, it identifies all the hyperlinks in each page and adds them to the list of pages still to visit, called the crawl frontier. URLs from the frontier are visited according to a set of policies and rules put in place by the search engine. This is one reason that search results differ between search engines: Google and Bing, for example, produce different results for the same search in part because they crawl and prioritize pages differently.

If the crawler is archiving websites, it copies and saves the information as it goes. Such archives are usually stored so that they can be viewed, read, and navigated as they were on the live web, but are preserved as ‘snapshots’.

Everything a crawler can reach this way is known as the Surface Web: the websites we use every day and can access with a standard browser and search engine. These are sites like Facebook, Reddit, and Wikipedia. The deep web is what web crawlers cannot find, because of where the pages exist or the way they are linked, which is why it’s often described as hiding under the surface of the web.
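The seed list and crawl frontier described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TOY_WEB` dictionary, the example.org addresses, and the `toy_fetch_links` helper are all hypothetical stand-ins for real pages and real downloading, and actual search engines use far more elaborate visiting policies than the simple first-in, first-out order shown here.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "https://example.org/": ["https://example.org/about", "https://example.org/search"],
    "https://example.org/about": ["https://example.org/"],
    "https://example.org/search": [],  # a search form: results exist only after a query
    "https://example.org/search?q=census": ["https://example.org/record/42"],
    "https://example.org/record/42": [],
    "https://example.org/unlinked": [],  # standalone page that nothing links to
}


def toy_fetch_links(url):
    """Stand-in for downloading a page and extracting its hyperlinks."""
    return TOY_WEB.get(url, [])


def crawl(seeds, fetch_links, max_pages=100):
    """Visit the seeds, then every link discovered from them, in FIFO order."""
    frontier = deque(seeds)  # the crawl frontier: URLs waiting to be visited
    seen = set(seeds)        # avoids queueing the same URL twice
    visited = []             # URLs in the order they were indexed
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


surface = crawl(["https://example.org/"], toy_fetch_links)
deep = sorted(set(TOY_WEB) - set(surface))
print("indexed:", surface)
print("never reached:", deep)
```

In this toy setup the crawler indexes only the three pages connected by links. The query result, the record behind it, and the unlinked page are never reached, which is the sense in which such pages sit below the surface.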


ACCESSING THE DEEP WEB

When learning about the deep web, many people’s first question is “how do I access it?” There are many different ways of accessing it. Some you use every day without noticing that you have reached a different part of the web; others require special software or unique technology. The first way to access the deep web is by using sites that have database search functions built in. This is the most common and familiar part of the deep web, and people use it day to day for a variety of purposes. The vast majority of the Deep Web holds pages with valuable information. A 2001 report estimated that 54% of websites are databases. Among the world’s largest are those of the U.S. National Oceanic and Atmospheric Administration, NASA, the Patent and Trademark Office, and the Securities and Exchange Commission’s EDGAR search system, all of which are public databases. The next batch consists of pages kept private by companies that charge a fee to see them, like the government documents on LexisNexis and Westlaw or the academic journals on Elsevier.


Another 13% of pages lie hidden because they’re only found on an intranet. These are internal networks, at corporations or universities, that give access to message boards, personnel files, or industrial control panels that can flip a light switch or control local machinery at these locations.


Then there’s Tor (The Onion Router), the darkest corner of the Internet. Tor hosts a collection of secret websites (ending in .onion) that require special software to access. People use Tor so that their Web activity can’t be traced. It runs on a relay system that bounces traffic among different Tor-enabled computers around the world. This makes your web activity nearly impossible to trace and gives you access to a huge collection of deep web sites.
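To give a feel for how a relay system can hide where traffic comes from, here is a toy Python sketch of layered ("onion") wrapping. It is not how Tor is actually implemented: the relay names, the shared `keys` table, and the SHA-256 based XOR cipher are all assumptions made for illustration, and the cipher is deliberately simplistic and not secure. The point is only that each relay can remove one layer and learn the next hop, nothing more.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream from SHA-256 in counter mode (NOT secure)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; applying it twice with the same key undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(message: str, route, keys) -> bytes:
    """The sender builds the onion from the inside out, one layer per relay."""
    packet = json.dumps({"next": None, "payload": message}).encode()
    packet = xor_cipher(keys[route[-1]], packet)
    for here, nxt in zip(reversed(route[:-1]), reversed(route[1:])):
        layer = json.dumps({"next": nxt, "payload": packet.hex()}).encode()
        packet = xor_cipher(keys[here], layer)
    return packet


def peel(relay: str, packet: bytes, keys):
    """A relay removes its own layer and learns only the next hop."""
    layer = json.loads(xor_cipher(keys[relay], packet).decode())
    return layer["next"], layer["payload"]


# Hypothetical relays and keys, for illustration only.
keys = {"guard": b"k1", "middle": b"k2", "exit": b"k3"}
route = ["guard", "middle", "exit"]
packet = wrap("hello hidden service", route, keys)

hop, hops_seen = route[0], []
while hop is not None:
    hops_seen.append(hop)
    hop, payload = peel(hop, packet, keys)
    if hop is not None:
        packet = bytes.fromhex(payload)  # still wrapped for the next relay

print(hops_seen, payload)
```

In this sketch the first relay sees only that something should be passed to the middle relay, and only the last relay recovers the message, without knowing who originally sent it.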



SILK ROAD

Due to the level of anonymity on the deep web, it comes as no surprise that a high level of criminal activity is attached to it. The scale of that activity is often underestimated, especially by law enforcement agencies in the case of internet black markets such as the Silk Road. Silk Road was an online market. As part of the Deep Web, it operated as a Tor hidden service, so that users could browse it anonymously and securely without potential traffic monitoring. The website launched in February 2011. The name “Silk Road” comes from a historical network of trade routes, started during the Han Dynasty, between Europe, India, China, and many other countries on the Afro-Eurasian landmass. Silk Road was operated by “Dread Pirate Roberts”, a pseudonym of the man who founded and ran the site. As of March 2013, the site had 10,000 products for sale by vendors, 70% of which were drugs considered contraband in most jurisdictions. There were also legal goods and services for sale, such as apparel, art, books, cigarettes, erotica, jewelry, and writing services. Buyers could leave reviews of sellers’ products on the site and in an associated forum, where crowdsourcing provided information about the best sellers and worst scammers. A sister site, called “The Armory”, sold weapons (primarily guns) during 2012, but was shut down due to a lack of demand.



CRIME

Beyond the Silk Road, the deep web is a haven for other kinds of crime. It has been known to house sites advertising child pornography, assassins, money laundering, fake identification, and illegal immigration services. In 2013, the FBI became involved with many of the criminal activities happening on the deep web, leading to the takedown of Silk Road and many smaller crime rings using the hidden web. An article written for Time Magazine stated, “Silk Road presents a double conundrum. It’s a blueprint for criminals the world over at a time when FBI resources are stretched thin and political will to empower government snooping has cratered. And it has created a regulatory headache in figuring how to deal with whole new currencies, tax havens and virtual online markets.” The issue with crime on the deep web is that it exists in one big puzzle of broken links, hidden browsers, and untraceable money. It is almost impossible for the FBI to locate all these sources of crime on the web, and so crime on the deep web will continue.


E-CURRENCY

Cryptocurrencies are not exclusive to deep web activity, but they play a significant role in the way money is moved around the deep web and in keeping deep web industries vibrant and anonymous. Cryptocurrencies are a form of money that exists solely on the internet, the most popular being Bitcoin. Currencies like Bitcoin use cryptography to create units of digital value and to record who owns them. Bitcoin and other cryptocurrencies can be mined by having powerful computers solve computational puzzles; each solution is rewarded with new bitcoins. Cryptocurrencies work the same as regular currency in that their value can fluctuate with the market and they can be exchanged for other forms of currency, and in the case of the deep web, they can be traded under the table. This allows sites such as the Silk Road, internet black markets, and other currency-run operations to continue on the deep web with minimal tracking.
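The idea of mining by solving math problems can be illustrated with a simplified proof-of-work puzzle. The sketch below is an assumption-laden toy, not Bitcoin’s actual scheme: the function names, the text-based “block”, and the rule that a hash must start with a few zero digits are chosen for illustration only. It does show the basic shape of the work: guess numbers until a hash meets a target, which is slow to find but fast for anyone else to check.

```python
import hashlib


def block_hash(data: str, nonce: int) -> str:
    """Hash the block contents together with a candidate nonce."""
    return hashlib.sha256(f"{data}:{nonce}".encode()).hexdigest()


def mine(data: str, difficulty: int = 3) -> int:
    """Try nonces in order until the hash starts with `difficulty` zero digits."""
    target = "0" * difficulty
    nonce = 0
    while not block_hash(data, nonce).startswith(target):
        nonce += 1
    return nonce


def verify(data: str, nonce: int, difficulty: int = 3) -> bool:
    """Checking a claimed solution takes a single hash."""
    return block_hash(data, nonce).startswith("0" * difficulty)


# Hypothetical transaction text used as the block contents.
data = "alice pays bob 1 coin"
nonce = mine(data)
print(nonce, block_hash(data, nonce))
print("valid:", verify(data, nonce))
```

Raising `difficulty` by one makes the search about sixteen times longer on average, which hints at why mining requires powerful computers.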


ANONYMITY

Many of the services and forms of communication that the Deep Web offers hinge on anonymity. Our use of everyday services on the surface web, like Wikipedia, is known to those tracing web usage across multiple IP addresses. This is why it would be impossible to create a site like the Silk Road on the surface web. The usage of our everyday sites is far too traceable by law enforcement, private institutions, and everyday users. The Deep Web does not have these issues because something is hidden at each stage of use. When users access Tor, their location and IP address are hidden behind IP proxies. The exchange of money is hidden because Bitcoin and other e-currencies can be hidden in encrypted files, disguised as another form of media, or exchanged physically. When it comes to the Deep Web, every user is indistinguishable from the next, which is what allows it to thrive.


ACTIVISM & POLITICS AND THE DEEP WEB


Groups such as Anonymous are the expression of a phenomenon called “hacktivism”, which refers to the use of computers and networks to engage in social protest and to promote political ideologies. Hacktivists attack systems and architectures with both legal and illegal tools to manifest their dissent through denial-of-service attacks, information theft, data breaches, web site defacements, typosquatting, and other methods of digital sabotage. Hacktivists are currently one of the most active elements of the Deep Web, and many consider them the equivalent of today’s Carbonari uprisings. Hacktivists are interested in the Deep Web not only because they need a secure communication platform, but also because they are highly active in this cyberspace.

We must distinguish two different approaches to participating in the Dark Web. Hacktivists may browse the hidden space to gather information, the “passive mode”, or operate in “active mode” by conducting cyber operations similar to the ones carried out on the ordinary web. In a time when secrecy, communication, and anonymity are crucial for change and activism, the deep web serves as a haven for those looking to make a change. The deep web is also used by intelligence analysts to study the political situation of foreign countries, thanks to powerful analysis tools such as Tor Metrics, a project that aggregates all kinds of interesting data about the Tor network and visualizes it in graphs and reports.

For example, by analyzing the number of connections to the Tor network over time, it was possible to discover that the Ethiopian Telecommunication Corporation, the country’s sole telecommunications provider, had deployed Deep Packet Inspection (DPI) of all Internet traffic for testing purposes. Using the metrics, it was possible to identify the introduction of the filtering system, as shown in the Tor Metrics graphs. The deployment of such monitoring systems is usually associated with repressive central governments that are interested in persecuting their opponents.


THE FUTURE OF THE DEEP WEB

The deep web is a powerful entity of technology and communication, so it comes as no surprise that many different institutions, both government-run and private, want to find ways to access it. But will this change the deep web that we know today? An article published by Motherboard reported the following.

In early 2013, DARPA (Defense Advanced Research Projects Agency) called for proposals to create a next-gen search engine to “revolutionize the discovery, organization and presentation of search results.” The agency laid out what it sees as the shortcomings of search today. It doesn’t crawl sites that aren’t indexed, only organizes results in a list of links, and requires entering the exact right text to get the results you’re looking for. Most importantly, it’s centralized. Search today is a one-size-fits-all product. Instead, DARPA wants a system that can tailor searches to focus on a specific topic or realm of the internet. It would automate the process, continuously crawling the web for a mission-specific subject, and would leverage image recognition and natural language technology to find content beyond plugging in certain keywords. It would also drastically expand the scope of what is indexed, to include “link discovery and inference of obfuscated links, discovery of deep content such as source code and comments, discovery of dark web content, hidden services, etc,” according to the project report. The idea is to eventually use the personalized indexing to comb through the hoards of information that are in the public domain but currently not indexed. But first, the military would focus on hunting down human traffickers and the modern-day slave trade that lives largely on the web in forums, chats, advertisements, job postings, and hidden services. It’s also eyeing the counterfeit goods, missing people, and found data realms.

This makes many ask whether, if the deep web becomes discoverable, we are losing a precious mode of anonymous and free communication. The deep web is defined by its state of undiscoverability, and we may see a day when software is powerful and smart enough to discover what is going on below the surface.


CONCLUSION

In a society where every aspect of our lives is posted publicly on the web, we may be forced to communicate in a way that hides from the surface. The Deep Web may not be pertinent to everyone’s life. Many people spend their entire lives using the internet without being aware that the Deep Web exists. While it may remain a haven for criminals and hackers, it also allows people to communicate and trade without a governmental or private entity looking over their shoulder, and that has value in a world where everything exists publicly. The deep web exists as an important reminder that in every aspect of our society,



Neal, Meghan. “DARPA’s Building a New Search Engine to Crawl the Deep Web.” Motherboard. 1 Jan. 2014. Web. 29 Oct. 2014.

Raghavan, Sriram, and Hector Garcia-Molina. “Crawling the Hidden Web.” 27th International Conference on Very Large Data Bases (VLDB 2001). Computer Science Department.

Grossman, Lev. “The Secret Web: Where Drugs, Porn and Murder Live Online.” Time. Time Magazine, 1 Jan. 2013. Web. 29 Oct. 2014.

“The Deep Web: Surfacing Hidden Value.” http://www.completeplanet.com/Tutorials/DeepWeb/.


