Masterseeks businessplan


BUSINESSPLAN

INVESTMENT OPPORTUNITY IN THE FUTURE GLOBAL B2B SEARCH ENGINE


The information contained herein has been provided on a confidential basis solely for the use of the person to whom it has been delivered. It may not be reproduced, forwarded or provided to any other person. The distribution of this information does not constitute or form an offer to participate in any investment.


CONTENT
• Executive summary
• Market survey and Masterseek's positioning
• Masterseek's value proposition towards users
• Masterseek's value proposition towards advertisers
• Masterseek's value proposition towards partners
• Masterseek's organization and management
• Masterseek's development
• Financial key figures
• Implementation plan


EXECUTIVE SUMMARY

Market survey and Masterseek's positioning

As the search engine environment has matured, the next challenge is clearly transparency and in-depth relevance. The most obvious gap concerns business-to-business (B2B) searches, where services such as Google and Yahoo! offer only limited value to companies searching for business-related issues and services. Masterseek will bridge this gap by creating the most powerful B2B search service in the world, free of user subscriptions and based on a clear concept and superior crawler technology.

The main source of income is link and keyword subscriptions from companies listed in the database. With a basic link and keyword subscription costing only around USD 300 per annum, the barrier to sales should be low or even very low.

There are only a few players in the market for company information, and none has positioned itself as a supplier of global, detailed business-to-business information. All suppliers in the market leave room for an improved value proposition, although on different parameters.

Masterseek's value proposition

With its value proposition, Masterseek will acquire a large part of the online marketing market, which is very large and is expected to double over the next five years. Masterseek will deliver a unique value proposition to two target groups: the users and the advertisers.


Masterseek will provide users of company information with a new and attractive value proposition on five core parameters:
• Width: To deliver the largest database of company information, both in the number of companies and in the number of countries covered.
• Depth: To deliver the most detailed and categorized database of company information.
• Updating: To deliver a database of company information that is always up to date.
• Precision: To deliver information that matches the search criteria.
• Cost: To deliver a wide range of services for free.

Masterseek's value proposition is supported by specially developed web crawlers, data mining software and advanced search algorithms, which guarantee users the most effective information search.

Masterseek will use its competences in search word optimization and its technology to meet the advertisers' key criteria:
• A large target group that is relevant to the company's products.
• Effective search that positions advertisers where users are most inclined to buy.
• Low costs that are directly related to the number of clicks to the advertiser's website (pay-per-click).

Masterseek has defined a number of products that are particularly attractive measured against the advertisers' criteria:
• Sponsor links: targeted directly at the relevant "users" based on their search criteria.
• MasterListing: company memberships that give the opportunity to add further search words and detailed information about the company.
• Datablocks: extracts from Masterseek's database, targeted according to the specifications of the inquirer.


Masterseek's organization and management

Masterseek's management has strong competences in the relevant areas: building and running companies and new business areas, search engine optimization, data mining, application development, sales, marketing and investor relations. The day-to-day management is supported by a strong supervisory board and an advisory board with experience in managing internet companies, internet-based partnership agreements and strategy.

Financial development

Significant growth in both turnover and EBIT is expected over the six-year period covered by the business plan. Costs constitute a relatively small fraction of turnover, which underlines the attractiveness of Masterseek's business model. Based on traditional valuation methods, Masterseek's net present value is estimated at USD 400-600 million.

Contact us: Masterseek Corp. 82, Wall Street 10005, New York, USA

Claus Jakobsen President & CEO. Cell: (+45) 28998899

Rasmus Refer Founder Cell (+45) 20300606


MARKET ANALYSIS SHOWS A NEED FOR AN EFFECTIVE TOOL FOR GENERATING GLOBAL AND PRECISE COMPANY INFORMATION

Increasing need for globally available and precise company information
• Increasing outsourcing
• Increased globalisation
• Greater demand for new sales channels
• Increased need for manageability

Only a few players in the market, none of which has positioned itself as a supplier of global, detailed B2B information
• Most search engines cover the globe but require a lot of further adaptation to deliver precise B2B information
• The existing suppliers of directory services are either national or regional in their geographical coverage and are restricted to basic information

Compared with other suppliers of company information, the value proposition can be significantly improved
Market analysis shows possible improvement on the parameters:
• Width of company information
• Depth of company information
• Updating
• Precision
• Cost

Conclusion: a need for an effective tool for generating global and precise company information


DEMAND FOR GLOBALLY AVAILABLE AND PRECISE COMPANY INFORMATION IS INCREASING

• Increasing outsourcing - Technological innovation has led to increased complexity in modern production methods. To meet increasing competition on price and quality, companies increasingly use outsourcing as a strategic tool.
• Increased globalisation - More and more companies operate in the global market in order to rationalize their production. The internet has contributed considerably to making cross-border trade easier.
• Greater demand for new sales channels - Companies still face demands and expectations for growth and market position, which is why a growing number of them constantly have to establish new sales channels, including extending their dealerships abroad.
• Increased demand for manageability - Today the internet consists of billions of web pages with information on companies, public institutions, private persons and much more. To utilize the potential of the internet, easy and quick access to the relevant information is crucial.

Consequences:
• Increasing need for globally available company information
• The internet offers an opportunity to cover this need
• However, this requires an information forum where search hits are relevant and precise


ONLY A FEW PLAYERS IN THE MARKET, NONE OF WHICH HAS POSITIONED ITSELF AS A SUPPLIER OF GLOBAL, DETAILED B2B INFORMATION

[Figure: Differentiation in comparison to existing search methods. Positioning matrix with geographic coverage (national/regional to international/global) on one axis and level of information (primary B2B information to detailed B2B information) on the other, with the positioning opportunity marked.]

• Most search engines cover the globe but require a lot of further adaptation to deliver precise B2B information
• The existing suppliers of directory services are either national or regional in their geographical coverage and are restricted to basic information
• Business opportunity in the market for global business-to-business directories


COMPARED TO OTHER SUPPLIERS OF COMPANY INFORMATION, THE VALUE PROPOSITION CAN BE IMPROVED (1 OF 2)

ESTIMATES

Hoovers
• Width of company information: 14 million companies in the US and UK
• Depth of company information: Basic information and financial information. No second-layer information (e.g. specific product numbers)
• Updating of information: Manual updating based on individual delivery from authorities and companies

Kompass
• Width of company information: 1.9 million companies worldwide
• Depth of company information: Basic information. No second-layer information
• Updating of information: Manual updating based on individual delivery from companies

Europages
• Width of company information: 550,000 companies in Europe
• Depth of company information: Basic information. No second-layer information
• Updating of information: Manual updating based on individual delivery from companies

Eniro/WLW
• Width of company information: 450,000 companies in Scandinavia (Europe)
• Depth of company information: Basic information. No second-layer information
• Updating of information: Manual updating based on individual delivery from companies

D&B
• Width of company information: 90 million companies worldwide
• Depth of company information: Basic information and financial information. No second-layer information
• Updating of information: Manual updating based on individual delivery from authorities and companies

Assessment:
• Width: Only D&B covers more companies worldwide; however, it is more expensive and more difficult to use
• Depth: Existing directories deliver only basic information. No one delivers content directly from the companies' websites
• Updating: An increased updating rate will provide greater validity and quality of information


COMPARED TO OTHER SUPPLIERS OF COMPANY INFORMATION, THE VALUE PROPOSITION CAN BE IMPROVED (2 OF 2)

ESTIMATES

Hoovers
• Precision of information: Difficult to find precise product information
• Cost: Payment for largely all information
• Total score: Good depth, width in the US and UK, and updating; however, poor precision and high cost

Kompass
• Precision of information: Easy to find precise product information
• Cost: Partly payment for information
• Total score: Geographic limits as well as less depth and slow updating. Good precision, however

Europages
• Precision of information: Fairly easy to find precise product information
• Cost: Free search
• Total score: Geographic limits as well as less depth and slow updating. Free user access, however

Eniro/WLW
• Precision of information: Fairly easy to find precise product information
• Cost: Free search
• Total score: Geographic limits as well as less depth and slow updating. Free user access, however

D&B
• Precision of information: Difficult to find precise product information
• Cost: Payment for all information
• Total score: Good depth, width and updating; however, poor precision and high cost

Assessment:
• Precision: There is a need for precise company and product information that can be found easily and intuitively
• Cost: Europages and WLW offer free basic information; however, they only cover European companies
• Total score: All competitors leave room for improvement, although on different parameters



THE BASIS FOR MASTERSEEK'S VISION IS THE IDENTIFIED MARKET NEED

A review of the market points to a need that is not being covered...
• Increasing need for globally available and precise company information
• Only a few players in the market, none of which has positioned itself as a supplier of global, detailed B2B information
• Compared with other suppliers of company information, the value proposition can be significantly improved
• Result: a need for an effective tool for generating global and precise company information

...which Masterseek can fill by achieving its vision

Vision:
• Through advanced and specially developed software, Masterseek will obtain and index company and product information from all companies in the world, thus developing the largest and leading global business-to-business directory
• Masterseek will create the most profitable business-to-business directory via a unique concept that gives all companies the opportunity to present and market their company, its products and services locally as well as globally


MASTERSEEK WILL DELIVER A UNIQUE VALUE PROPOSITION TO TWO TARGET GROUPS - THE USERS AND THE ADVERTISERS

The target group: "The users"
Who:
• Small, medium-sized and large companies
• Importers / exporters
• Public industrial development boards
• Organizers of expos
Utilization:
• To find suppliers
• To find products
• To research / monitor the competition
• To analyze investment opportunities
Criteria:
• Width of company information
• Depth of company information
• Updating
• Precision
• Cost

The target group: "The advertisers"
Who:
• Small, medium-sized and large companies
• Exporters
Utilization:
• To expand outlets
• To sell products
• To acquire extensive company information to be used in, for instance, direct marketing
Criteria:
• Size and relevance of the target group
• Effectiveness
• Costs

Both target groups are addressed by Masterseek's unique value proposition.



MASTERSEEK WILL CAPTURE A CONSIDERABLE PART OF THE ADVERTISING MARKET FOR ONLINE MARKETING WITH AN ATTRACTIVE VALUE PROPOSITION; THIS MARKET IS FAIRLY LARGE AND IS EXPECTED TO GROW FURTHER OVER THE NEXT FIVE YEARS

Online marketing is expected to constitute 10% of the total American marketing budgets for 2006...

...and the global market for online marketing is expected to double over the next five years

[Charts: online marketing as a percentage of marketing budgets, and market development 2004-2010. ESTIMATES. CAGR: 12%]

Source: Jupiter Research, Kelsey Group
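The growth claim can be sanity-checked with a few lines of arithmetic. The sketch below assumes that the 12% CAGR and the 2004-2010 range shown in the charts both describe the global online marketing market; the function names are illustrative only.

```python
import math


def growth_multiple(cagr: float, years: int) -> float:
    """Return the factor by which a market grows at a constant annual rate."""
    return (1.0 + cagr) ** years


def doubling_time(cagr: float) -> float:
    """Return the number of years needed to double at a constant annual rate."""
    return math.log(2.0) / math.log(1.0 + cagr)


CAGR = 0.12  # from the chart label; assumed to describe the global market

for years in (5, 6):
    print(f"{years} years at {CAGR:.0%}: x{growth_multiple(CAGR, years):.2f}")
print(f"Doubling time: {doubling_time(CAGR):.1f} years")
```

Under these assumptions the market grows by a factor of about 1.76 in five years and about 1.97 in six, so a doubling fits the six-year span 2004-2010 more closely than a strict five-year horizon.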


MASTERSEEK WILL GIVE THE USERS ACCESS TO A NEW AND ATTRACTIVE VALUE PROPOSITION

Existing search methods for company information do not meet the users' need for both precise width and updated depth...

Search engines are unfocused and lack depth in company information
• Search engines such as Google and Yahoo deliver large quantities of information and width. However, the information is often confusing and does not focus on specific areas such as company information
• Inquirers spend a lot of time finding an often insufficient quantity of relevant company information

Directories lack width and continuous updating of depth information
• Directories such as Kompass, Europages, WLW and Hoovers have relatively great depth but a limited width of company information (0.4-1.9 million companies)
• This is because these directories mainly limit themselves to parts of Europe or the US

...which is met by Masterseek's technology

Masterseek has great depth
• Masterseek's software and webcrawler constantly and automatically update the database with a large volume of new data from the companies' websites
• Furthermore, Masterseek has affiliate agreements guaranteeing additional and relevant information on the individual company

Masterseek has focused width
• On Masterseek you can immediately find the companies and products you want, regardless of line of business and country
• Over 22 million companies from 75 countries are indexed in Masterseek version 1.0 and divided into 50,000 categories that together provide more than 100 million products and services
• Performing a search on Masterseek is simple, manageable and easy

Masterseek will offer a new and attractive value proposition
• Width: To deliver the largest database of company information, both in the number of companies and in the number of countries covered
• Depth: To deliver the most detailed and categorized database of company information
• Updating: To deliver a database of company information that is always up to date
• Precision: To deliver information that matches the search criteria
• Cost: To deliver a wide range of services for free


MASTERSEEK'S VALUE PROPOSITION IS SUPPORTED BY SPECIALLY DEVELOPED WEBCRAWLERS, DATAMINING SOFTWARE AND ADVANCED SEARCH ALGORITHMS

[Diagram: data flows into and out of Masterseek's database]

Inputs to Masterseek's database:
• Websites, via Masterseek's intelligent webcrawling
• Data supply from affiliate partners (e.g. news, annual accounts etc.)
• Data supply from companies (e.g. changing contact information or new employees etc.)

Masterseek's database:
• Advanced search algorithm
• Data mining
• Categorizing

Online users
Searching:
• Products
• Categories
• Names
• Regions
• Etc.
Result:
• Manageable information
• Precise information
• All types of information
• Updated information
• Relevant active links

Masterseek's intelligent and specially developed software guarantees support of the value proposition:
1 WIDTH
2 DEPTH
3 UPDATING
4 PRECISION
5 NO USER COSTS



MASTERSEEK'S STRENGTH ON WIDTH, DEPTH AND UPDATING PROVIDES THE USERS WITH THE LARGEST ACCESS TO RELEVANT COMPANY INFORMATION COST EFFECTIVELY

Guaranteeing: 1 WIDTH, 2 DEPTH, 3 UPDATING

Masterseek is strong when it comes to width, depth and updating
• Over the last 7 years Masterseek has developed webcrawler software that automatically and constantly searches company websites in 23 different languages. The information is then indexed in Masterseek's database so that company information and products are easily and quickly obtainable
• At the same time, thanks to its state-of-the-art terabyte server park, Masterseek gives companies free access to update their basic information on Masterseek, and affiliate partners' information is automatically and constantly updated
• Other directories collect most of their information via telemarketing and direct mail, a process that is slow as well as costly. Masterseek has more sources of readily available information than its competitors
• Masterseek's method provides a large quantity of nuanced information in a cost-effective way. That is why Masterseek has the largest information supply as well as the lowest data collection costs compared to the competitors

Data sources:
• Websites: Masterseek's intelligent webcrawling
• Data supply from clients and affiliate partners (news, changing contact information, changing key employees, etc.)

"Today's search engines actually reduce the chance of finding the right pieces of information. Masterseek is a tool that counters this trend. Masterseek could be a very useful tool for Siemens."

Corporate Communications Executive


MASTERSEEK'S PRECISION GUARANTEES THE USERS THE MOST EFFECTIVE INFORMATION SEARCH

Guaranteeing: 4 PRECISION

The point of departure is the opportunity for advanced search
Search words:
• Flat screen
• OEM products
• Taiwan

It is processed in Masterseek's own software and server park...
• Strong and scalable server park from HP
• Intelligent software utilizing search words, IP addresses and country codes
• Intelligent algorithms (master rank)
• Cache memory remembering previous search attempts
• Masterseek's database: advanced search algorithm, data mining, categorizing

...guaranteeing a precise and quick search
• Quicker data requests than the closest competitors
• No irrelevant information
• Sorting by relevance
• User-friendly experience
• Easy access to subpages that give further detail on the required information

"You only have to take a look at the site and see how quick it is. There isn't anything there that you don't need. The categories, the companies, the news, well, everything a company would like to see. And it's easy to find. Very credible."

E-business executive
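The search steps on this slide (search words plus a country, irrelevant hits dropped, results sorted by relevance, previous searches cached) can be illustrated with a toy sketch. Everything below is an assumption for illustration: the sample companies are invented, the score is a simple count of matching search words, and it does not represent Masterseek's "master rank" algorithm, which the plan does not describe.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Company:
    name: str
    country: str
    keywords: frozenset


# Hypothetical sample records; not taken from Masterseek's database.
COMPANIES = (
    Company("Alpha Displays", "TW", frozenset({"flat screen", "oem products"})),
    Company("Beta Panels", "TW", frozenset({"flat screen"})),
    Company("Gamma Screens", "DE", frozenset({"flat screen", "oem products"})),
    Company("Delta Foods", "TW", frozenset({"canned fruit"})),
)


@lru_cache(maxsize=1024)  # remembers the results of previous identical searches
def search(terms: tuple, country: str) -> tuple:
    """Return the names of matching companies, best match first."""
    wanted = {term.lower() for term in terms}
    scored = []
    for company in COMPANIES:
        if company.country != country:
            continue  # country filter
        score = len(wanted & company.keywords)
        if score > 0:  # drop irrelevant hits entirely
            scored.append((score, company.name))
    # Sort by descending score, then by name for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return tuple(name for _, name in scored)


print(search(("Flat screen", "OEM products"), "TW"))
```

With the sample data, the search for "Flat screen" and "OEM products" in Taiwan returns the company matching both words first and the company matching one word second; the German company and the unrelated Taiwanese company are excluded.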


MASTERSEEK'S USER ACCESS DIFFERENTIATES BY PROVIDING FREE ACCESS TO ATTRACTIVE COMPANY INFORMATION

Guaranteeing: 5 NO USER COSTS

[Figure: positioning matrix. INFORMATION axis: from basic value proposition within companies to attractive value proposition within company information. COST axis: from fee based to free.]

Masterseek provides the best cost/benefit ratio to potential users

Online users
Searching:
• Products
• Categories
• Names
• Regions
• Etc.
Result:
• Manageable information
• Precise information
• All types of information
• Updated information
• Relevant active links


MASTERSEEK'S VALUE PROPOSITION MUST MEET THE ADVERTISERS' KEY CRITERIA

Key criteria

Size and relevance of the target group
• A considerable volume of relevant traffic from the target group is crucial for a directory's ability to generate turnover
• Masterseek's unique value proposition towards users is a strong foundation for creating a large target group
• Masterseek will use its competence in search word optimization to generate traffic from Google, Yahoo and MSN
• Masterseek will create massive PR and marketing to generate traffic
• Online marketing in an internet B2B forum is particularly attractive: companies can target campaigns by choosing the terms and search words that are relevant to their products, thereby reaching a clearly defined and relevant target group

Effectiveness
• The users of Masterseek have a defined need (searching) and are open to marketing; users of B2B search are more likely to make a purchase than ordinary inquirers
• Masterseek provides easy and precise information, which minimizes search time and strengthens the value proposition towards the users
• It is important that companies get the opportunity to raise their profile attractively
• Masterseek is an interactive medium, which is why there is an opportunity for immediate "action"

Costs
• The most common income model for online advertising is PPC (pay per click). For Masterseek this means that the advertiser only pays when a person has performed a search on the company's search words and then "clicked" on the company's link
• The companies only pay for the number of visits they want
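The pay-per-click principle described under Costs can be sketched as a small calculation: the advertiser is charged per click, and never for more visits than it has ordered. The price per click and the visit cap below are hypothetical figures chosen for illustration; the plan does not state Masterseek's actual click prices.

```python
def ppc_charge(clicks: int, price_per_click: float, max_paid_clicks: int) -> float:
    """Charge per click, but never for more clicks than the advertiser ordered."""
    billable = min(clicks, max_paid_clicks)
    return billable * price_per_click


# Hypothetical figures for illustration only.
PRICE_PER_CLICK = 0.50   # USD, assumed
MAX_PAID_CLICKS = 1000   # visits the advertiser is willing to pay for, assumed

print(ppc_charge(0, PRICE_PER_CLICK, MAX_PAID_CLICKS))     # no clicks, no cost
print(ppc_charge(400, PRICE_PER_CLICK, MAX_PAID_CLICKS))   # below the cap
print(ppc_charge(2500, PRICE_PER_CLICK, MAX_PAID_CLICKS))  # capped at 1000 clicks
```

The cap models the statement that companies only pay for the number of visits they want: clicks beyond the ordered volume add no further cost in this sketch.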


MASTERSEEK HAS DEFINED A NUMBER OF PRODUCTS THAT ARE PARTICULARLY ATTRACTIVE IN COMPARISON TO THE CRITERIA OF THE ADVERTISERS

Masterseek's products, assessed against the key criteria of the advertisers

Sponsor links
Direct links to advertisers. Differentiated pricing depends on placement.
Size and relevance of the target group:
• Masterseek has a global target group
• Sponsor links are targeted directly at the relevant "users" according to their search criteria
Effectiveness:
• Masterseek provides effective marketing: it generates "clicks" from users who have a defined need for the respective company and provides the opportunity for immediate "action"/approach
• Sponsor links have a prominent place on Masterseek
Costs:
• With pricing that corresponds to the price level of the less attractive competitors, Masterseek is particularly attractive from the advertiser's perspective

My Masterseek profile (see further details in the appendix)
• Masterseek Basic: The company is listed and is guaranteed good exposure based on keywords and on additional listing of products. USD 299 per 12-month subscription
• Masterseek Advanced: The company is listed and is guaranteed good exposure based on keywords and on additional listing of products, company news, vacant positions etc. USD 449 per 12-month subscription
Size and relevance of the target group:
• Masterseek has a global target group
• Members can add significantly more search words, meaning that more people and more relevant target groups will find them
Effectiveness:
• The opportunity to present the company in a very attractive way compared to the competing directories
• Easy and manageable format for "the users"
• The opportunity to target the profile via access to user statistics
Costs:
• With pricing that corresponds to the price level of the less attractive competitors, Masterseek is particularly attractive from the advertiser's perspective



MASTERSEEK HAS DEFINED A NUMBER OF PRODUCTS THAT ARE PARTICULARLY ATTRACTIVE IN COMPARISON TO THE CRITERIA OF THE ADVERTISERS

Masterseek's products, assessed against the key criteria of the advertisers

Data blocks
Delivery of specifically adjusted data blocks and services which are matched and presented according to the client's request. Price per unit.
Size and relevance of the target group:
• The opportunity to define a global target group
• The opportunity to define your own target group very precisely by using product codes
Effectiveness:
• In-depth data
• One legal entity and with that the opportunity to run all data transversely
• Company indices guarantee optimal targeting and effectiveness
• Constant updating guarantees validity and effectiveness when using data
Costs:
• With pricing that corresponds to the price level of the less attractive competitors, Masterseek is particularly attractive

Banner Ads
Direct links to advertisers. Differentiated pricing depends on placement and popularity of the chosen search terms.
Size and relevance of the target group:
• Masterseek has a global target group
• Banner ads are targeted directly at the relevant "users" according to their search criteria
Effectiveness:
• Masterseek provides effective marketing: it generates "clicks" from users who have a defined need for the respective company and provides the opportunity for immediate "action"/approach
• Few banner ads per page will provide great visibility on Masterseek
Costs:
• With pricing that corresponds to the price level of the less attractive competitors, Masterseek is particularly attractive from the advertiser's perspective


IN THE LONG RUN MASTERSEEK HAS THE OPPORTUNITY FOR INTRODUCING PRODUCTS THAT WILL GENERATE CONSIDERABLE EXTRA TURNOVER

Masterseek's products, assessed against the key criteria of the advertisers

Web link
In the long run it will be possible to introduce PPC (pay per click) for clicks on the web link of the individual companies.
Size and relevance of the target group:
• Masterseek has a global target group
Effectiveness:
• With web link the search result appears objective, thus lowering user barriers
• It is easier for "the users" to click on a web link, which clearly differentiates these companies from those that do not provide a web link
Costs:
• Since more web links can be placed per page, it is possible to offer a competitive price per "click"

It is a strategic decision not to introduce this product at the launch of Masterseek, in order to:
• Guarantee a credible profile as an objective search method (search results not dependent on advertising)
• Guarantee that all companies have a web link in the beginning (it is not possible to make deals with all companies in Masterseek in the short term)


MASTERSEEK'S VALUE PROPOSITION TOWARDS PARTNERS

Masterseek's audience and users are at-work professionals who come from a wide range of industries. The target user will most likely be interested in a range of business-related information and services.

Services
A partnership with Masterseek can direct targeted users and potential buyers to specific suppliers of services such as: company information of greater depth, credit information, online travel booking, online purchase of IT services and hardware, recruitment services, stock listing and financial services, online marketing and advertising, and many more.

Content
Masterseek also offers partnerships to other portals with business-to-business oriented content, giving the partner high-quality dynamic content that is highly relevant to the targeted users and audience.

Sales
The wide range of proven online advertising concepts gives sales partners an interesting opportunity. Both Masterseek membership profiles and text link advertisements are in demand among professional marketers because of the highly defined target group and audience on Masterseek. Sales partnerships can be formed with professional SEO companies and consultants as well as with core sales companies.


MASTERSEEK'S ORGANIZATION HAS STRONG COMPETENCES IN ALL AREAS AND HAS LEADING SUPPLIERS AS SHAREHOLDERS

[Organization chart]

Board of Directors: appointments are pending

Advisory Board:
• Christian Bahl
• Thor Høberg-Petersen

Management team:
• Claus Jakobsen, COO
• Rasmus Refer, CTO
• Martin Ohrt, CFO

Functions:
• IT & Development: Rasmus Refer
• Sales & Marketing
• General Management: Robert Perz
• PR: outsourced, Burson Marsteller

Key suppliers to Masterseek:
• Fujitsu-Siemens, supplier
• Rackhosting
• Jay.Net, search technology
• 11Design, CVI and design



MASTERSEEK’S MANAGEMENT HAS STRONG COMPETENCES WITHIN CONSTRUCTION AND OPERATION OF NEW BUSINESS AREAS, SEARCH ENGINE OPTIMISING AND IT IS SUPPLEMENTED BY A STRONG NETWORK

Advisory Board (name and short CV)

Christian Bahl
• Executive MBA in Strategic Management, 2005
• Sales manager and co-owner, Jobindex (1997 - today). Has established partnerships with companies such as Altavista, Eniro, TV2, Tele2, MSN, Yahoo, Jubii, TvDanmark, etc.
• Advisory board member for Kultunaut

Thor Høberg-Petersen
• MBA, Finance and Strategy, Yale School of Management, Yale University, 2000
• BeautyJungle.com, Strategic Planner, 1999
• Associate, McKinsey & Company, 2000 - 2001
• Zacco A/S, Executive Assistant, 2002 -


... AND MASTERSEEK'S ORGANISATION HAS VAST EXPERIENCE AND STRONG COMPETENCES WITHIN ALL CORE AREAS

Management Team (name and short CV)

Claus Jakobsen, President & CEO
• 1980-1986: IT consultant in Burroughs Data Systems (later Unisys). Focus on project and production management
• 1987-1991: Senior at Tandem Computers. Focus on major projects in manufacturing
• 1991-1994: Sales of SAS Data. Focus on Facility Management of Operations and networking tasks
• 1994-1995: Executive Officer of Unisource Business Networks. Focus on the establishment of the Telia-owned company in Denmark
• 1995-1998: CEO of Telia Denmark. Focus on building the organization and customer base in Denmark
• 1998-2001: Managing Director of Tele1 Europe (later Song Networks). Focus on building the organization, customer base and fire in Denmark
• 2001-2002: Executive Vice President of Songs Nordic leadership based in Stockholm. Focus on the realization of synergies in production unit PR

Martin Ohrt, Chief Financial Officer
• Regional manager, Denmark at Freetrax A/S
• Label manager at Sony Music Denmark


MASTERSEEK'S STRATEGIC DEVELOPMENT SHOULD MAKE A STOCK EXCHANGE LISTING POSSIBLE EARLY 2009

Q2 2008
Central strategic goals:
• Launch of masterseek.com as a unique tool for company information
• Uploading of affiliate information is commenced
• Sale of Masterseek services can commence: sponsor links, banner advertisements and membership subscriptions
Central operational goals at the end of the period:
• 45 million companies and 25 million company websites uploaded
• Visitors: 50,000/day
• Employees: 5
• Focus countries with respect to sales and marketing: Scandinavia

Q3 2008
Central strategic goals:
• Job portal, news section and product B2B auction will be open to the audience
• Partnerships implemented (e.g. sales, content, revenue partnerships)
Central operational goals at the end of the period:
• 50 million companies and 30 million company websites uploaded
• Visitors: 80,000/day
• Employees: 8
• Extra focus countries: the Netherlands, Germany and the UK

Q4 2008
Central strategic goals:
• Established brand name within global search for company information
• Masterseek ready for stock exchange listing
Central operational goals at the end of the period:
• 55 million companies and 35 million company websites uploaded
• Visitors: 120,000/day
• Employees: 12
• Extra focus countries: the US, the remainder of the EU and chosen Asian countries

Outcome: ready for potential IPO.

Key success factors (KSFs)

Internal KSFs
• Technical solution
• Securing and attracting key employees
• Sufficient capital

External KSFs
• Traffic generation to masterseek.com
• Efficient sale of Masterseek services
• Strong partnerships


MASTERSEEK IN STRONG POSITION IN RELATION TO INTERNAL SUCCESS FACTORS

Internal KSF: Technical solution
Activities
• Database development
• Production environment development
• Web crawler software development
• Beta version test
• Scalability and flexibility
• Continuous improvement
Comment
• User tests held with positive feedback
• Masterseek.com is fully implemented by the end of 1st Quarter 2006
• The technical solution is optimized continuously with respect to feedback and Masterseek’s growth
• Masterseek has a state-of-the-art server park and a co-operation agreement with HP

Internal KSF: Securing and attracting key employees
Activities
• Defining work functions
• Development of attractive incentive structure
• Recruitment of people with the necessary competences
• Continuous HR focus
Comment
• Existing strong organisation (cf. section on the organisation)
• Stock option program for key employees worked out by the end of 2005
• HR responsible in the organisation

Internal KSF: Sufficient capital
Activities
• 1st round of financing
• 2nd round of financing
Comment
• 1st round of financing in place
• 2nd investment round is initiated in Scandinavia, Switzerland and Germany
• Preliminary positive feedback from several private Scandinavian investors



ORGANISATIONAL DEVELOPMENT

1. Organisational challenges
The evolution of Masterseek in the short term (2008) must by definition stem from the existing Danish platform, even though the holding company Masterseek Corp. is a US company. Whichever funding model is achieved, this fact cannot be changed before some time into 2009.

Throughout 2008 Masterseek will suffer from a chronic lack of critical mass and middle management capacity, regardless of financial reserves. It will simply not be possible to identify, hire and run in a larger management structure as quickly as desirable. To compensate for this the company will depend on outsourcing and rely on a strong top management structure.

Particular attention has to be given to discipline in the relationship between the daily management and the semi-active shareholders. Clear command lines and authority will be established to deal with that challenge. The vision is to appoint a strong externally recruited Chairman of the Supervisory Board, supplemented by a working Vice Chairman who establishes the bridge between the daily management and the Supervisory Board/main shareholders. In the event that the funding route involves a VC investor taking a substantial stake, it is envisaged that this VC investor will introduce the incoming Chairman.

Already by the latter half of 2008 it may be relevant to appoint an international capacity as CEO, based in the USA or the UK and supported by an assistant, to accelerate the internationalization of Masterseek in collaboration with the management team in Copenhagen.


ORGANISATIONAL DEVELOPMENT

2. Strategy for sales
To recruit, build, run in and manage a sales team with the capacity to quickly penetrate several countries simultaneously is simply not feasible in the near term. This task will be outsourced. Talks with the pan-European telemarketing organisation Ranger are in progress. It is not clear what share of the sales revenue will have to be ceded to Ranger for their services, but to be conservative the entire first-year revenue has been set aside as commission. A draft agreement with Ranger should be in place for consideration before mid-May 2008.

3. Strategy for strategic partnerships & top level external negotiations
This will be the responsibility of the working Vice Chairman, supported by the Chairman and the Copenhagen-based General Manager. Longer term this work may be transferred to the future US/UK-based CEO, which may also involve the relocation of the working Vice Chairman if the role is not entirely replaced by a new set-up.


ORGANISATIONAL DEVELOPMENT

4. IT operations & programming
In the short term the company must hire 2-3 database programmers. In addition, arrangements have been agreed with the Danish IT company ID Solutions for support of both .net applications and server operations. Ad hoc arrangements will also be made for front end design and applications.

5. Sales support
Before sales start, 1 to 2 customer support staff will have to be in place, fully trained to handle customer inquiries.

6. Accounting
A full-time accountant with strong spreadsheet and documentation capacity will be hired in the short term. Until that person is in place and run in, the firm's auditors will supply ad hoc assistance when required.


MASTERSEEK TAKES IMPORTANT STEPS TO ACHIEVE A STRONG POSITION IN RELATION TO THE EXTERNAL SUCCESS FACTORS*

External KSF: Traffic generation to Masterseek.com
Activities
• Search word optimising on Google, Yahoo, MSN (cf. next page)
• Cost efficient banner advertising with, among others, affiliate partners
• PR in international newspapers, magazines and TV stations such as CNN and CNBC
• Submission to other relevant search registers with, for instance, authorities, libraries, etc.
• Strong exposure at fairs and conferences
Comment
• Masterseek has core competences within search word optimising for Google, Yahoo, MSN etc. (cf. next page)
• Search engine optimising takes place via company names, concepts and search words from Masterseek’s database, whereby a link is established directly to the relevant page at Masterseek
• Masterseek has employed the renowned international agency Burson Marsteller
• The management group has strong ties to CNN, CNBC, Doubleclick, etc.

External KSF: Efficient sale of Masterseek services (sponsor links, subscriptions, blocks and banners)
Activities
• Telephone sale
• Direct mails
• Sale through partners
• Presentation on Masterseek.com
• Online advertising and ordering
• Fairs and conferences
• PR
Comment
• Efficient internal sales force in place, with a strong track record in similar tasks
• Dialogue about co-operation with leading telemarketing companies in focus countries has been initiated
• Huge focus on ”search functionality on the Internet”, which will be made use of through vigorous focus on press releases, sponsoring, events, etc.
• 200 chosen companies are offered free membership for a year in order to establish powerful case stories

External KSF: Strong partnerships
Activities
• Identification of the best partners
• Entering into attractive partnership agreements
Comment
• Contract already made with Reuters
• Present negotiations with other strong partners, incl. D&B

* Besides the information in this business plan, we are currently working on an in-depth marketing plan, incl. a plan for the launch of Masterseek.com, which will constitute approx. 12 % of all the 2006 costs



AS CENTRAL SOURCE OF TRAFFIC GENERATION MASTERSEEK WILL MAKE USE OF A CORE COMPETENCE WITHIN SEARCH WORD OPTIMISING TO ACHIEVE TOP PLACES ON LEADING SEARCH ENGINES

Masterseek will achieve top places on the majority of search words within “International business search”...

…which is achieved by means of in-depth insight into the 4 core parameters of search word optimising and the data code for leading search engines

Relevance (PageRank)
Relevance is calculated based on how many other sites link to the individual website. Masterseek has access to a huge network of other websites where Masterseek will be submitted, thus acquiring high relevance.

Link structure
Search engines find and index information on a website based on the coding of the underlying link structure (i.e. number and naming of links). Masterseek builds a link structure based on seven years of experience, which ensures an optimized score with respect to link structure.

Text contents and Metatags
Websites are awarded a higher score the better the text contents and metatags match the search word or concept. Masterseek is constructed in such a manner that all words in the database, including product and company names, are implemented automatically in the text contents and the metatags.


MASTERSEEK’S UNIQUE COMPETENCES WITHIN SEARCH OPTIMISING WILL CONTRIBUTE TO ENSURING THAT TRAFFIC COMES TO MASTERSEEK.COM

Estimated traffic to MasterSeek per month from search word optimising

Number of Internet users in countries with languages covered by Masterseek*     743 million
Number of clicks per top place per month**                                      340
Top places per month***                                                         x 75.000
Clicks (= number of user sessions) per month to MasterSeek                      25.5 million per month

* Source: Internetworldstats.com
** Average based on search engine optimising carried out for D & B on Google, Yahoo and MSN, weighted with the number of inhabitants in the covered countries. Result of search engine optimising for Denmark: 0.5. Assumptions for covered countries: Europe: 0.50; the US, Australia and Japan: 10 % lower; China and South America: 30 % lower
*** Based on the search engines’ algorithms, 75,000 to 100,000 top places per language can be achieved.
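
The traffic estimate above is a simple product of two of the stated figures. The following minimal sketch reproduces that arithmetic; the function and parameter names are illustrative and not taken from the plan.

```python
def monthly_sessions(clicks_per_top_place: int, top_places: int) -> int:
    """Estimated user sessions per month from search word optimising.

    Multiplies the average clicks per top place per month by the
    number of top places held, as in the estimate above.
    """
    return clicks_per_top_place * top_places


# Figures stated in the plan: 340 clicks per top place, 75.000 top places.
sessions = monthly_sessions(clicks_per_top_place=340, top_places=75_000)
print(f"{sessions / 1_000_000:.1f} million sessions per month")
```

With the stated inputs the product equals the 25.5 million sessions per month used as the year 1 driver.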



THE STARTING POINT OF THE FINANCIAL FIGURES IS ESTIMATES FOR THE INCOME DRIVERS, WHICH ARE REALISTIC COMPARED TO RELEVANT BENCHMARKS

Income source: Sponsorlinks
Driver                                      Year 1    Year 3    Year 5
User sessions (million per month)             25.5      52.0     105.5
Number of page views per session                 6         6         6
Number of sponsor links per page view            5         5         5
Click rate for sponsor links                 0.003     0.003     0.003
Income per click, USD                          0.3       0.3       0.3
Comment
• 25.5 mill./month in year 1 is conservative since it is primarily based on traffic from search optimization. Masterseek also expects considerable traffic from e.g. PR, affiliate promotion and banner ads
• As a benchmark, Yahoo has 72.000.000.000 page views each month
• Benchmark search engines have 5-8 page views per user session (e.g. Google and MSN)

Income source: MasterListing
Driver                                      Year 1    Year 3    Year 5
Number of advanced subscribers               6.600    17.500    20.000
Number of elite subscribers                  4.400    10.500    12.000
Subscription prices advanced/elite, USD    150/250   150/250   150/250
Comment
• Kompass is estimated to have 80,000 paying subscribers on their site, equivalent to 3-5% of the companies

Income source: Data-Blocks
Driver                                      Year 1    Year 3    Year 5
Number of sold data blocks                      40        58        83
Average price per data block, USD           20.000    20.000    20.000
Comment
• Thanks to Masterseek’s technique, software and construction, data blocks can be extracted with higher quality and at lower price points than those of competing directories
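
The sponsor link drivers listed above can be combined into a monthly income figure. The sketch below assumes that income is simply the product of the five drivers; the plan does not state its exact formula, so both the formula and the function name are assumptions made for illustration.

```python
def sponsor_link_income_per_month(
    sessions: float,
    page_views_per_session: float,
    links_per_page_view: float,
    click_rate: float,
    income_per_click_usd: float,
) -> float:
    """Monthly sponsor link income in USD (assumed multiplicative model)."""
    link_impressions = sessions * page_views_per_session * links_per_page_view
    clicks = link_impressions * click_rate
    return clicks * income_per_click_usd


# Year 1 drivers as stated in the plan.
year1 = sponsor_link_income_per_month(
    sessions=25.5e6,
    page_views_per_session=6,
    links_per_page_view=5,
    click_rate=0.003,
    income_per_click_usd=0.3,
)
print(f"USD {year1:,.0f} per month")
```

Because every driver enters as a factor, a change in any single driver (for example the click rate in the sensitivity scenarios) scales the result proportionally.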



THE STARTING POINT OF THE FINANCIAL FIGURES IS ESTIMATES FOR THE INCOME DRIVERS, WHICH ARE REALISTIC COMPARED TO RELEVANT BENCHMARKS

Income source: Affiliate agreements
Driver                                           Year 1    Year 3    Year 5
Number of affiliate links per page view             0.5       0.5       1.0
Click rate                                        0.003     0.003     0.003
Income per click, USD                              0.15      0.15      0.15
Comment
• Based on Jupiter research studies of similar affiliate programs

Income source: Company web links
Driver                                           Year 1    Year 3    Year 5
Number of web links per page view                     0         0      0.06
Click rate                                            0         0     0.003
Income per click, USD                                 0         0       0.2
Comment
• The users are very interested in clicking on a web link for a company which they themselves have been searching actively for
• Kompass offers this service with success

Income source: Banner advertisements
Driver                                           Year 1    Year 3    Year 5
Number of banner advertisements per page view         0         0         2
Click rate                                            0         0         0
Income per click, USD                                 0         0       0.2
Comment
• Based on collected information from adbrite.com and DoubleClick


THE STARTING POINT OF THE FINANCIAL FIGURES IS ESTIMATES FOR THE INCOME DRIVERS, WHICH ARE REALISTIC COMPARED TO RELEVANT BENCHMARKS

[Figure: two bar charts showing Turnover and EBIT in USD 1.000 for Year 1 to Year 6]



TURNOVER INCREASES CONSTANTLY OVER THE PERIOD, AND COSTS CONSTITUTE A SHRINKING FRACTION OF THE TURNOVER, WHICH EMPHASIZES THE ATTRACTIVENESS OF THE BUSINESS PLAN

USD 1.000                          Year 1    Year 2    Year 3    Year 4    Year 5    Year 6
Turnover                            1.249    22.290    29.295    45.596    74.626   117.144
Sales costs*                        1.753     2.160     1.943     3.215     5.505     8.873
Variable costs in total             1.177     9.515     7.331     6.853     9.390    13.005
Contribution margin                    71    12.776    21.964    38.742    65.236   104.139
Marketing costs                     1.900     1.450       960       915       960     1.180
Salaries                            2.030     2.590     2.818    13.125     3.424     4.285
Administration costs                   95       175       340       400       450       510
Accommodation costs                   135       290       340       362       406       433
Other costs incl. depreciations       402       498       575       672       765       900
Fixed costs in total                4.811     5.125     5.183     5.658     6.231     7.585
Depreciation**                          7        14        21        28        35        42
EBIT                              (4.747)     7.637    16.760    33.056    58.970    96.512
Financial costs***                      0         0         0         0         0         0
Result before taxes               (4.747)     7.637    16.760    33.056    58.970    96.512
Tax****                                 0         0         0         0         0         0
Result after taxes                (4.747)     7.637    16.760    33.056    58.970    96.512

* Sales costs fall from 15% to 8% in year 3, when the sales setup is expected to be optimized
** The server park is delivered by HP
*** Operations are expected to be financed by the company capital
**** Masterseek's legal home state is Nevada, which results in a tax rate of 0 percent
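
The subtotals in the statement above can be cross-checked from its own rows. The sketch below assumes that the contribution margin equals turnover minus total variable costs, and that EBIT equals contribution margin minus total fixed costs minus depreciation; these relationships are inferred from the figures rather than stated in the plan. All amounts are in USD 1.000, written without the thousands separator.

```python
# Rows copied from the statement (USD 1.000), Year 1 to Year 6.
turnover            = [1249, 22290, 29295, 45596, 74626, 117144]
variable_costs      = [1177, 9515, 7331, 6853, 9390, 13005]
contribution_margin = [71, 12776, 21964, 38742, 65236, 104139]
fixed_costs         = [4811, 5125, 5183, 5658, 6231, 7585]
depreciation        = [7, 14, 21, 28, 35, 42]
ebit                = [-4747, 7637, 16760, 33056, 58970, 96512]

# Assumed relationship 1: contribution margin = turnover - variable costs.
# A tolerance of 1 allows for rounding in the published figures.
margin_ok = [
    abs((t - v) - cm) <= 1
    for t, v, cm in zip(turnover, variable_costs, contribution_margin)
]

# Assumed relationship 2: EBIT = contribution margin - fixed costs - depreciation.
ebit_ok = [
    cm - f - d == e
    for cm, f, d, e in zip(contribution_margin, fixed_costs, depreciation, ebit)
]

print("contribution margin consistent:", all(margin_ok))
print("EBIT consistent:", all(ebit_ok))
```

Both checks pass for all six years, which also confirms that the year 1 EBIT is a loss.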


THE LIQUIDITY BUDGET SHOWS POSITIVE YEARLY LIQUIDITY EFFECTS DURING THE PERIOD

USD 1.000                                Year 1    Year 2    Year 3    Year 4    Year 5    Year 6
The result of the period before tax     (4.747)     7.637    16.760    33.056    58.970    96.512
Depreciations                                 7        14        21        28        35        42
Financing from operation                (4.740)     7.650    16.781    33.084    59.005    96.554
Change, creditors                           499       721     (177)   (0.002)       259       414
Changes, trade debtors                    (208)   (3.507)   (1.167)   (2.717)   (4.838)   (7.086)
Operational liquidity in total          (4.449)     4.865    15.436    30.367    54.426    89.881
Investments in material fixed assets         70        70        70        70        70        70
Paid tax                                      0         0         0         0         0         0
The liquidity effect of the period      (4.519)     4.795    15.366    30.297    54.356    89.812
Accumulated liquidity effect            (4.519)       275    15.641    45.938   100.294   190.106


SENSITIVITY ANALYSIS IS BUILT ON CHANGES IN RELATION TO BASE CASE

Possible changes in relation to base case

Worst case (Estimated probability: 10 %)
• User sessions will be 100.000/day the first 2 years, and will rise in year 3 to the level of the base case year 1
• Sales costs in the first three years are doubled compared to the base case budget
• No memberships and 50% fewer data blocks are sold in the first 3 years
• An extra 1 mill. USD will be spent on a launch campaign
• The realized click rate is 0.002 instead of 0.003

Base case (Estimated probability: 75 %)
• As budgeted

Best case (Estimated probability: 15 %)
• User sessions are 50% higher than in the base case, and the growth rate for user sessions is also 50% higher than in the base case
• The fraction of companies buying web links is doubled
• The realized click rate is 0.005 instead of 0.003


WORST CASE SCENARIO SHOWS LOWER TURNOVER IN YEAR 1, WHICH IS COUNTERED BY CONSIDERABLY HIGHER TURNOVER IN THE BEST CASE SCENARIO

[Figure: turnover in USD 1.000 for the worst case, base case and best case]

[Figure: EBIT in USD 1.000 for the worst case, base case and best case]


TO MAKE A COMPANY VALUATION THREE SCENARIOS AND TWO DIFFERENT VALUATION METHODS ARE EMPLOYED

Scenarios
• Worst case (probability: 10%)
• Base case (probability: 75%)
• Best case (probability: 15%)

Valuation methods
• Discounted Cash Flow (DCF): DCF estimates the value on the basis of the company’s expected ability to generate cash flow
• Multiple: a multiple can be used to estimate the value of the company in relation to its turnover and income by looking at what similar companies are sold for



BASED ON THE TRADITIONAL VALUATION METHODS THE COMPANY’S PRESENT VALUE IS ESTIMATED TO BE BETWEEN 400 AND 600 MILL. USD

The company’s value interval is based on various valuation methods (USD 1.000)

I. DCF valuation
Base case: WACC 12%, terminal growth rate 2% (75% probability)        537.500
Worst case: WACC 12%, terminal growth rate 2% (10% probability)       255.400
Best case: WACC 12%, terminal growth rate 2% (15% probability)      3.822.678

II. Multiple valuation (year 6, WACC 12%)
Base case: EBIT x 10,0                                                453.163

Total interval when assessing the company’s present value            400.000 to 600.000

• The company’s value lies in the interval between 400 and 600 million USD depending on which valuation method carries the most weight
• Investors should compare their own risk profile with the applied WACC of 12%
• Investors should consider the probabilities of base, worst and best case
• Investors should consider the multiple in relation to whether exit is wished in year 6 or later
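
The DCF method named above, with a WACC and a terminal growth rate, is commonly implemented as discounted yearly cash flows plus a discounted terminal value. The sketch below uses the Gordon growth formula for the terminal value; this is a standard textbook form and an assumption here, since the plan does not disclose its valuation model or the cash flows it used. The function name and the sample cash flow are hypothetical.

```python
def dcf_value(cash_flows: list[float], wacc: float, terminal_growth: float) -> float:
    """Present value of yearly cash flows plus a Gordon-growth terminal value.

    cash_flows[0] is received at the end of year 1, and so on. The terminal
    value is based on the last cash flow grown by `terminal_growth`.
    """
    if wacc <= terminal_growth:
        raise ValueError("WACC must exceed the terminal growth rate")
    present_value = sum(
        cf / (1 + wacc) ** year for year, cf in enumerate(cash_flows, start=1)
    )
    terminal_value = cash_flows[-1] * (1 + terminal_growth) / (wacc - terminal_growth)
    return present_value + terminal_value / (1 + wacc) ** len(cash_flows)


# Hypothetical single cash flow of 100, with the plan's WACC of 12%
# and terminal growth rate of 2%.
value = dcf_value([100.0], wacc=0.12, terminal_growth=0.02)
print(round(value, 2))
```

With a small spread between WACC and terminal growth, most of the value sits in the terminal term, which is why the comparison of the applied WACC with the investor's own risk profile matters.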


IMPLEMENTATION PLAN

Quarters covered: 07Q4, 08Q1, 08Q2, 08Q3, 08Q4, 09Q1, 09Q2, 09Q3, 09Q4, 10Q1, 10Q2, 10Q3, 10Q4

Internal activities
• Technical implementation
• 2. investor round
• Masterseek.com on air
• Affiliate information uploaded on masterseek.com
• Continuous technical set-up optimising and development in relation to growth and feedback

External activities
• Data block sale
• Preparation for sale of remaining services
• Launching campaign
• Sale of remaining Masterseek services
• Stock exchange introduction

Geographical expansion
• Scandinavia
• Germany, the Netherlands, the UK
• Remaining EU, the US and chosen Asian countries


MASTERSEEK’S TECHNOLOGICAL CONSTRUCTION ENSURES THAT MASTERSEEK CAN BE SCALED FINANCIALLY IN A RATIONAL MANNER

Front end
• Today MasterSeek’s front end system consists of 15 HP dual Xeon servers which are placed behind a redundant Alteon 3408 setup
• Alteon 3408 is a leading load balancer for this type of application. In its current set-up it can handle some 100,000 simultaneous sessions per second, and it can easily be scaled as need be
• Alteon 3408 balances the load between the various HTTP front ends and determines the optimum response time, disconnects equipment, etc.
• The front end will service all HTTP requests and it can be scaled without limit to meet future needs, too

Back end
• MS-SQL back ends consist of traditional servers in multiple cluster setups with redundant disc arrays to handle raw data
• The clusters will be divided into different categories of high-load queries and types of content as the system is being used

Scalability
• The use of standard equipment and applications ensures that the system can handle the huge amount of data and the very many simultaneous users and queries
• The system is fully scalable with respect to front end and back end
• The system can easily be expanded in step with increasing needs, and the use of standard equipment ensures that the incremental costs of the expansion are small

APPENDIX 1



ELABORATION OF MASTERSEEK’S ”MY MASTERSEEK” MEMBERSHIPS ILLUSTRATES PROMINENT PROFILING POSSIBILITIES AND DEGREE OF DETAIL

APPENDIX 2

Membership levels: My Masterseek - Free, My Masterseek - Basic, My Masterseek - Advanced

Feature groups, in order of increased value:

Insurance of data validity
• Companies without websites or whose websites’ construction prevents automatic validation are ensured manual addition

Increased chance of being found
• The chance of being found is multiplied with the number of products, services, press releases and activities which the company itself adds to its company registration
• Contact info, Management, News, Press releases, Products, Customers, References, Partners

Further statistical information
• Comp. profile, Ownership, Organisation, Employees, Event calendar, Financial info, Technology, Special offers

Further multimedia information
• Company logo
• Presentations (ppt. and video)
• Photos
• Audio files
• Animations

Interactive services
• SMS messages
• E-mail messages
• Regular, extensive information about traffic



APPENDIX

WHITEPAPER DATA MINING MasterSeek 2008, Denmark


CONTENT
Tools & Technologies
Components
Goals
Crawler
The Sorter
URL mining/Validation Diagram
Information mining from existing URLs Diagram


Tools & Technologies
Below is a preliminary list of the tools involved at the different stages of the data mining & validation process.
1. PERL 5.8 (Practical Extraction and Report Language), used at the Dispatcher/Crawler stage.
2. Microsoft .Net Development Tools version 1.1, at the Sorter stage.
3. REGEXP (Regular Expressions Matchmaking), at all stages of the process.
4. DBMS.

Components
The following section describes the different components involved in the data mining process. The components integrate with each other via the database, in order to maintain a traceable connection and to secure data transmission. The components cover different functionality such as data mining, data validation and first level URL search. The 3 components are:
1. Dispatcher
2. Crawler
3. Sorter

Goals
Goals are specified for the different parts/processes. The main goal is to keep Masterseek’s company database updated; it currently holds information on 22 million companies and is growing.
1. Find the URLs (website addresses) and crawl the required information/properties of the 11 million companies for which Masterseek does not have URL information in its database.
2. At specified time intervals, crawl, match and update the information of all those companies whose URL addresses are in the Masterseek database.


Dispatcher
Responsible for dispatching work load to the crawler component. The Dispatcher ensures scalability by dispatching work load according to the available work stations, enabling the data mining process to spread the work load over the entire Masterseek network (including office machines, whose CPUs are used while the machines are idle).

The dispatcher uses a queue model which has been implemented successfully in mainframe systems during the last 25 years. The model is based on many queues; each queue has a defined batch of initiators, which are actually responsible for dispatching the tasks. The number of initiators varies from one queue to the other, and the time interval is also defined for each queue, enabling maximum flexibility for prioritizing, setting the time frame for site scanning and dividing the work load over the available resources.

Crawler
The crawler extracts URL content using site map extraction with country/locale specific low level RegExp technology. To optimize performance the text analysis will not be exact, and a high error percentage is accepted. Before crawling, the crawling process consults the crawling Meta-Data database in order to receive manual crawling information and feedback from the sorter on data quality. After a few dozen crawling iterations the result of this process will be an accurate list of properties and their respective URLs, yielding the most efficient crawler with high accuracy.

Sites change once in a while, which is why we crawl them over and over again. Such changes can be limited to the data inside the URL, for example a changed telephone number or address. The other type of change is a structure change in which URLs are replaced. This change will cause the sorter to report errors back to the crawler, and the learning process will start over again.

The performance of the crawler is important, but such performance is obtained by crawling optimization. It is crucial to obtain a flexible crawler and to use a high level language such as Perl, which has sophisticated text parsing functionality.
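
The queue model described for the Dispatcher can be illustrated with a small, single-threaded sketch. The whitepaper names Perl for this stage; Python is used here only for illustration. The class name, field names and the round-robin assignment to work stations are all assumptions, since the text only states that each queue has its own number of initiators and its own time interval.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkQueue:
    """One dispatcher queue (hypothetical structure)."""
    name: str
    initiators: int   # how many tasks may be started per dispatch round
    interval_s: int   # seconds between dispatch rounds for this queue
    tasks: deque = field(default_factory=deque)


def dispatch_round(queue: WorkQueue, workstations: list[str]) -> list[tuple[str, str]]:
    """Start up to `initiators` tasks, spread round-robin over the work stations."""
    started = []
    for slot in range(min(queue.initiators, len(queue.tasks))):
        task = queue.tasks.popleft()
        started.append((workstations[slot % len(workstations)], task))
    return started


# Example: a re-crawl queue with two initiators and an hourly interval.
recrawl = WorkQueue("recrawl", initiators=2, interval_s=3600,
                    tasks=deque(["a.dk", "b.dk", "c.dk"]))
print(dispatch_round(recrawl, ["ws1", "ws2", "ws3"]))
```

Giving a high-priority queue more initiators or a shorter interval is how such a model lets some sites be scanned more often than others.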


Crawling techniques
There is no such thing as the perfect crawler, since every site has its own unique structure. The goal of a good crawler is to deal with the majority of the sites and to flag irregularities in the site structure that indicate a potential crawling error. The crawler's mission is to gather the most relevant data on the company according to the defined properties, and to pass the information to the second process, the fine sorting, or the SORTER.

The Site Map
The site map is one of the more efficient techniques, due to its clear tree structure presentation and clear node definition. It allows us to scan the entire site very quickly and locate the relevant information. Here is an example of a typical site map: (affinity.com)

In the site map we can find a clear link to the searched data, using a limited vocabulary of keywords.


Building the Site Map
When the site map isn’t supplied, it is crucial to build one. We don’t need the entire site map, just the more relevant areas of the site. Building such a site map requires a general page content analysis, in order to understand what the page is about. Such an analysis is possible with special keywords and Regular Expressions (RegExp); RegExp is a powerful technique for finding text structures. Using the features of the RegExp technology to the utmost will provide a powerful crawling technique.

Generally, a description of a URL's content resides in the web page’s Meta Data Tags, such as the meta TITLE, meta DESCRIPTION and meta KEYWORDS tags, which point out what the page contains. It is also possible that a web page does not contain Meta Data Tags. In that case we have to make a smart analysis to find the theme of the contents using RegExp and extract our required properties/attributes. As a further optimization, downloading only the first X characters will enable us to analyze the page content. This optimization technique will allow us to scan more pages with greater speed and achieve higher quality in our company database.

Here is an example from http://www.aag-gummi.dk/
<META NAME="keywords" CONTENT="bærelejer, cellegummi, cellegummilister, ekstruderede profiler, entrémåtter, epdm, fendere, gummi/metal, gummidug, gummifødder, gummilister, gummimembraner, gummistropper, gummitætninger, moosgummi, naturgummi, neopren, nitril, oringsnor, pakning, polyurethan, silicone, sneplovskær, stænklapper, støbte emner, teknisk gummi, udstansning, aag">
This Meta-Data can be used for categorization using RegExp; it is basically a summary of the company's products.
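
The meta tag analysis described above, including the idea of reading only the first X characters of a page, can be sketched as follows. The whitepaper names Perl for the crawler; Python is used here only for illustration. The regular expression, the function names and the 4096-character default are assumptions, and the sample tag is a shortened version of the aag-gummi.dk example.

```python
import re

# Matches <meta name="..." content="..."> for the three tags named in the text.
# Assumes the attribute order name, content and double-quoted values.
META_TAG = re.compile(
    r'<meta\s+name="(title|description|keywords)"\s+content="([^"]*)"',
    re.IGNORECASE,
)


def extract_meta(html: str, max_chars: int = 4096) -> dict[str, str]:
    """Return the meta tags found in the first `max_chars` characters."""
    found = {}
    for name, content in META_TAG.findall(html[:max_chars]):
        found[name.lower()] = content.strip()
    return found


def keywords(html: str, max_chars: int = 4096) -> list[str]:
    """Split the KEYWORDS tag into a list usable for categorization."""
    raw = extract_meta(html, max_chars).get("keywords", "")
    return [word.strip() for word in raw.split(",") if word.strip()]


page = '<html><head><META NAME="keywords" CONTENT="epdm, neopren, nitril, teknisk gummi"></head>'
print(keywords(page))
```

Pages that place their meta tags beyond the character limit, or that have none, return an empty result and would fall through to the content analysis described in the text.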


The Sorter
The Sorter uses country/locale specific information and sophisticated RegExp technology to extract the company attributes. It demands a high performance server with high caching capabilities, an implementation written in a low-level language, and a scalable solution, in order to handle the data stream from the crawler.


Regular Expression: content analysis based on country/locale formats
RegExp is a very powerful technology for text analysis; it allows us to identify data structures. For example:

Telephone number
+45-46-90-21-89 or +45 4690 2189 or +4546902189 or +45-4690-2189 or 0045-4690-2189 or 46-90-21-89
All of the formats have a few common attributes that match Denmark’s telephone number system:
• It will start with + and 8 digits after and the number length will be max 12 chars: *(1)[+, ]45*(12)[0-9, ,-]
• It will start with 00 and 8 digits after and the number length will be max 12 chars: *(3)[0, ]45*(12)[0-9, ,-]
• It will have 6 digits and number length will be max 9 chars: *(9)[0-9, ,-]
Using country/locale specific patterns we can match almost every telephone number format and identify it as a telephone number.

Address format
Address formats are more variable and therefore more difficult to identify, but again country/locale specific information makes it easier. For a specific country we can identify an address by the town name; using a list of town names for each country will give us high accuracy in identifying addresses. Regarding the address format, we need to explore the various address formats used in each country in order to build a RegExp pattern for each and every country.

Combination of techniques
Sites differ in structure and presentation type; therefore, as mentioned earlier, it is impossible to create the perfect crawler. The goal is to create a crawler that will succeed with more than 60% of the sites. Achieving this goal will require a combination of algorithms that try to ‘solve the problem’ from different angles; using the feedback from the sorter, we can adjust the matching algorithm to the different sites.
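
The Danish telephone formats listed above can be matched with a single pattern in a mainstream regular expression dialect. The sketch below is one possible pattern and is not the whitepaper's own notation: it assumes an optional +45 or 0045 prefix followed by eight digits in groups of two, with optional space or hyphen separators. The pattern and function names are illustrative.

```python
import re

# Optional country prefix (+45 or 0045), then four pairs of digits,
# each pair optionally preceded by a space or hyphen.
DK_PHONE = re.compile(
    r"(?<!\d)(?:(?:\+|00)45[ -]?)?\d{2}(?:[ -]?\d{2}){3}(?!\d)"
)


def find_dk_phones(text: str) -> list[str]:
    """Find Danish phone numbers and normalize them to +45XXXXXXXX."""
    numbers = []
    for match in DK_PHONE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        numbers.append("+45" + digits[-8:])  # keep the 8-digit national number
    return numbers


samples = [
    "+45-46-90-21-89", "+45 4690 2189", "+4546902189",
    "+45-4690-2189", "0045-4690-2189", "46-90-21-89",
]
for sample in samples:
    print(sample, "->", find_dk_phones(sample))
```

Normalizing every match to one canonical form makes it straightforward to compare a crawled number with the value already stored for the company, which is the kind of matching the Sorter is described as performing.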
