BUSINESSPLAN
INVESTMENT OPPORTUNITY IN THE FUTURE GLOBAL B2B SEARCH ENGINE
The information contained herein has been provided on a confidential basis solely for the use of the person to whom it has been delivered. It may not be reproduced, forwarded or provided to any other person. The distribution of this information does not constitute or form an offer to participate in any investment.
CONTENT
• Executive summary
• Market survey and Masterseek's positioning
• Masterseek's value proposition towards users
• Masterseek's value proposition towards advertisers
• Masterseek’s value proposition towards partners
• Masterseek's organization and management
• Masterseek's development
• Financial key figures
• Implementation plan
EXECUTIVE SUMMARY
Market survey and Masterseek's positioning
As the search engine environment has matured, the next challenge is clearly transparency and in-depth relevance. The most obvious gap concerns business-to-business (B2B) searches, where services such as Google and Yahoo! have only very limited value for companies searching for business-related issues and services. The Masterseek solution will bridge this gap by creating the most powerful B2B search service in the world, free of user subscriptions and based on a clear concept and superior crawler technology.
The main income source is link and keyword subscriptions from companies listed in the database. With a basic link and keyword subscription costing only around USD 300 per annum, the barrier to a sale should be low.
There are only a few players on the market for company information, and none has positioned itself as a supplier of global, detailed business-to-business information. All suppliers on the market leave room for an improved value proposition, although on different parameters.
Masterseek's value proposition
With its value proposition, Masterseek will acquire a large part of the online marketing market, which is very large and is expected to double over the next five years. Masterseek will deliver a unique value proposition to two target groups: the users and the advertisers.
Masterseek will provide the users of company information with a new and attractive value proposition on five core parameters:
• Width: To deliver the largest database of company information, both with regards to the number of companies and the number of countries covered.
• Depth: To deliver the most detailed and categorized database of company information.
• Updating: To deliver a database of company information which is always updated.
• Precision: To deliver information that matches the search criteria.
• Cost: To deliver a wide range of services for free.
Masterseek's value proposition is supported by specially developed web crawlers, data mining software and advanced search algorithms, which guarantee the users the most effective information search.
Masterseek will use its competences in search word optimization and its technology to meet the key criteria of the advertisers:
• A large target group that is relevant to the company's products
• Effectiveness of the search, ensuring that advertisers are positioned where the users are most inclined to make a purchase
• Low costs that are directly related to the number of clicks to the advertiser's website (pay-per-click)
Masterseek have defined a number of products that are particularly attractive when measured against the criteria of the advertisers:
• Sponsor links: targeted directly at the relevant "users" based on their search criteria
• MasterListing: company memberships that give the opportunity to add further search words and detailed information about the company
• Datablocks: extracts from Masterseek's database, targeted according to the specifications of the inquirer
Masterseek's organization and management
Masterseek's management has strong competences in the relevant areas: building and running companies and new business areas, search engine optimization, data mining, application development, sales, marketing and investor relations. The day-to-day management is supported by a strong supervisory board and an advisory board with experience in managing internet companies, internet-based partnership agreements and strategy.
Financial development
Significant growth in both turnover and EBIT is expected over the six-year period covered by the business plan. Costs constitute a relatively small fraction of turnover, which underlines the attractiveness of Masterseek's business model. Based on traditional valuation methods, Masterseek's net present value is estimated to be in the range of USD 400-600 million.
Contact us: Masterseek Corp. 82, Wall Street 10005, New York, USA
Claus Jakobsen President & CEO. Cell: (+45) 28998899
Rasmus Refer Founder. Cell: (+45) 20300606
MARKET ANALYSIS SHOWS A NEED FOR AN EFFECTIVE TOOL FOR GENERATING GLOBAL AND PRECISE COMPANY INFORMATION
Increasing need for globally available and precise company information
• Increasing outsourcing
• Increased globalisation
• Greater demand for new sales channels
• Increased need for manageability
Only a few players on the market, none of which have positioned themselves as a supplier of global detailed B2B information
• Most search engines cover globally but require a lot of further adaptation to deliver precise B2B information
• The existing suppliers of directory services are either national or regional in their geographical coverage and are restricted to basic information
Compared with other suppliers of company information, the value proposition can be significantly improved
Market analysis shows possible improvement on the parameters:
• Width of company information
• Depth of company information
• Updating
• Precision
• Cost
Together these point to a need for an effective tool for generating global and precise company information
DEMAND FOR GLOBALLY AVAILABLE AND PRECISE COMPANY INFORMATION IS INCREASING
• Increasing outsourcing: Technological innovation has led to increased complexity in modern production methods. To meet increasing competition on price and quality, companies increasingly use outsourcing as a strategic tool.
• Increased globalisation: More and more companies operate on the global market in order to rationalize their production. The internet has contributed considerably to making trade across borders easier.
• Greater demand for new sales channels: Companies still face demands and expectations regarding their growth and market position, which is why a growing number of them constantly have to establish new sales channels, including extending their dealerships abroad.
• Increased demand for manageability: Today the internet consists of billions of webpages with information on companies, public institutions, private persons and much more. To utilize the potential of the internet, easy and quick access to the relevant information is crucial.
As a result:
• There is an increasing need for globally available company information.
• The internet offers an opportunity to cover this need.
• However, this requires an information forum where search hits are relevant and precise.
ONLY A FEW PLAYERS IN THE MARKET, AND NONE HAVE POSITIONED THEMSELVES AS A SUPPLIER OF GLOBAL DETAILED B2B INFORMATION
[Figure: Differentiation in comparison to existing search methods. Axes: geographical coverage (national/regional to international/global) and level of information (primary B2B information to detailed B2B information). A "Positioning opportunity" is marked on the map.]
• Most search engines cover globally but require a lot of further adaptation to deliver precise B2B information
• The existing suppliers of directory services are either national or regional in their geographical coverage and are restricted to basic information
• Business opportunity on the market for global business-to-business directories
COMPARED TO OTHER SUPPLIERS OF COMPANY INFORMATION VALUE PROPOSITION CAN BE IMPROVED (1 OF 2)
ESTIMATES
Hoovers
• Width of company information: 14 million companies in the US and UK
• Depth of company information: Basic information and financial information. No second layer information (e.g. specific product numbers)
• Updating information: Manual updating based on individual delivery from authorities and companies
Kompass
• Width of company information: 1.9 million companies worldwide
• Depth of company information: Basic information. No second layer information
• Updating information: Manual updating based on individual delivery from companies
Europages
• Width of company information: 550,000 companies in Europe
• Depth of company information: Basic information. No second layer information
• Updating information: Manual updating based on individual delivery from companies
Eniro/WLW
• Width of company information: 450,000 companies in Scandinavia (Europe)
• Depth of company information: Basic information. No second layer information
• Updating information: Manual updating based on individual delivery from companies
D&B
• Width of company information: 90 million companies worldwide
• Depth of company information: Basic information and financial information. No second layer information
• Updating information: Manual updating based on individual delivery from authorities and companies
Observations
• Width: Only D&B cover more companies worldwide; however, they are more expensive and more difficult to use
• Depth: Existing directories deliver only basic information. No one delivers content directly from the companies' websites
• Updating: An increased updating rate will provide greater validity and quality of information
COMPARED TO OTHER SUPPLIERS OF COMPANY INFORMATION VALUE PROPOSITION CAN BE IMPROVED (2 OF 2)
ESTIMATES
Hoovers
• Precision of information: Difficult to find precise product information
• Cost: Payment for largely all information
• Total score: Good depth, width in the US and UK and updating; however, poor precision and high cost
Kompass
• Precision of information: Easy to find precise product information
• Cost: Partly payment for information
• Total score: Geographic limits as well as less depth and slow updating. Good precision, however
Europages
• Precision of information: Fairly easy to find precise product information
• Cost: Free search
• Total score: Geographic limits as well as less depth and slow updating. Free user access, however
Eniro/WLW
• Precision of information: Fairly easy to find precise product information
• Cost: Free search
• Total score: Geographic limits as well as less depth and slow updating. Free user access, however
D&B
• Precision of information: Difficult to find precise product information
• Cost: Payment for all information
• Total score: Good depth, width and updating; however, poor precision and high cost
Observations
• Precision: There is a need for precise company and product information that can be found easily and intuitively
• Cost: Europages and WLW have free basic information; however, they only cover European companies
• Total score: All competitors leave room for improvement, although with regards to different parameters
THE BASIS FOR MASTERSEEK'S VISION IS THE IDENTIFIED MARKET NEED
A review of the market points to a need that is not being covered...
• Increasing need for globally available and precise company information
• Only a few players on the market, none of which have positioned themselves as a supplier of global detailed B2B information
• Compared with other suppliers of company information, the value proposition can be significantly improved
• A need for an effective tool for generating global and precise company information
...which Masterseek can fill by achieving their vision
Vision:
• Through advanced and specially developed software, Masterseek will obtain and index company and product information from all companies in the world, thus developing the largest global business-to-business directory
• Masterseek will create the most profitable business-to-business directory via a unique concept that gives all companies the opportunity to present and market themselves, their products and their services locally as well as globally
MASTERSEEK WILL DELIVER A UNIQUE VALUE PROPOSITION TO TWO TARGET GROUPS - THE USERS AND THE ADVERTISERS
The target group: "The users"
Who:
• Small, medium sized and large companies
• Importers / exporters
• Public industrial development boards
• Organizers of Expos
Utilization:
• To find suppliers
• To find products
• To research / monitor the competition
• To analyze investment opportunities
Criteria:
• Width of company information
• Depth of company information
• Updating
• Precision
• Cost
The target group: "The advertisers"
Who:
• Small, medium sized and large companies
• Exporters
Utilization:
• To expand outlets
• To sell products
• To acquire extensive company information to be used in, for instance, direct marketing
Criteria:
• Size and relevance of the target group
• Effectiveness
• Costs
Unique value proposition
MASTERSEEK WILL CAPTURE A CONSIDERABLE PART OF THE ADVERTISING MARKET FOR ONLINE MARKETING WITH ATTRACTIVE VALUE PROPOSITION; THIS MARKET IS FAIRLY LARGE AND IS EXPECTED TO GROW FURTHER OVER THE NEXT FIVE YEARS
Online marketing is expected to constitute 10% of the total American marketing budgets for 2006...
...and the global market for online marketing is expected to double over the next five years
[Figure: two charts marked as estimates. Recoverable labels: Percentage; CAGR: 12%; 2004 to 2010.]
Source: Jupiter Research, Kelsey Group
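As a rough consistency check on these figures, a compound annual growth rate (CAGR) can be converted into a total growth multiple. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the 12% CAGR applies uniformly to every year of the 2004 to 2010 span shown in the chart.

```python
import math


def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor after compounding `cagr` for `years` years."""
    return (1.0 + cagr) ** years


def doubling_time(cagr: float) -> float:
    """Number of years a quantity growing at `cagr` needs to double."""
    return math.log(2.0) / math.log(1.0 + cagr)


# Assumption: the 12% CAGR from the chart holds for each year from 2004 to 2010.
print(round(growth_multiple(0.12, 2010 - 2004), 2))  # six years of compounding
print(round(growth_multiple(0.12, 5), 2))            # five years of compounding
print(round(doubling_time(0.12), 1))                 # years needed to double
```

Under this assumption, six years at 12% comes close to a doubling (a factor of about 1.97), while five years gives a factor of about 1.76. The doubling claim therefore fits the 2004 to 2010 span of the chart better than a strict five-year horizon.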
MASTERSEEK WILL GIVE THE USERS ACCESS TO A NEW AND ATTRACTIVE VALUE PROPOSITION
Existing search methods for company information do not meet the users' need for both precise width and updated depth...
Search engines are unfocused and lack depth within company information
• Search engines such as Google and Yahoo deliver large quantities of information and width. However, the information is often confusing and does not focus on specific areas such as company information
• Inquirers spend a lot of time finding an often insufficient quantity of relevant company information
Directories lack width and continuous updating of depth information
• Directories such as Kompass, Europages, WLW and Hoovers have relatively great depth but, at the same time, a limited width of company information (0.4 MM - 1.9 MM)
• This is because these directories mainly limit themselves to parts of Europe or the US
...which is met by Masterseek's technology
Masterseek have great depth
• Masterseek's software and webcrawler constantly and automatically update the database with a large volume of new data from the companies' websites
• Furthermore, Masterseek have affiliate agreements guaranteeing additional and relevant information on the individual company
Masterseek have focused width
• On Masterseek you can find the companies and products you want immediately, regardless of line of business and country
• Over 22 million companies from 75 countries are indexed in Masterseek version 1.0 and divided into 50,000 categories that together provide more than 100 million products and services
• Performing a search on Masterseek is simple, manageable and easy
Masterseek will offer a new and attractive value proposition
• Width: To deliver the largest database of company information, both with regards to the number of companies and the number of countries covered
• Depth: To deliver the most detailed and categorized database of company information
• Updating: To deliver a database of company information which is always updated
• Precision: To deliver information that matches the search criteria
• Cost: To deliver a wide range of services for free
MASTERSEEK'S VALUE PROPOSITION IS SUPPORTED BY SPECIALLY DEVELOPED WEBCRAWLERS, DATAMINING SOFTWARE AND ADVANCED SEARCH ALGORITHMS
Masterseek's intelligent and specially developed software guarantees support of the value proposition
[Figure: data flow from data sources through Masterseek’s database to the online users]
Data sources:
• Websites, via Masterseek's intelligent webcrawling
• Data supply from affiliate partners (e.g. news, annual accounts etc.)
• Data supply from companies (e.g. changing contact information or new employees etc.)
Masterseek’s database:
• Advanced search algorithm
• Data mining
• Categorizing
Online users:
• Searching: products, categories, names, regions, etc.
• Result: manageable information, precise information, all types of information, updated information, relevant active links
Guaranteeing:
1. Width
2. Depth
3. Updating
4. Precision
5. No user costs
MASTERSEEK'S STRENGTH ON WIDTH, DEPTH AND UPDATING PROVIDES THE USERS WITH THE LARGEST ACCESS TO RELEVANT COMPANY INFORMATION COST EFFECTIVELY
Guaranteeing: 1. Width, 2. Depth, 3. Updating
Masterseek is strong when it comes to width, depth and updating
• Over the last 7 years Masterseek have developed webcrawler software that automatically and constantly searches company websites in 23 different languages. The information is then indexed in Masterseek's database so that company information and products can be obtained easily and quickly
• At the same time, thanks to its state-of-the-art terabyte server park, Masterseek provide companies with free access to update their basic information on Masterseek, and affiliate partners' information is automatically and constantly updated
Data supply from clients and affiliate partners (news, changing contact information, changing key employees, etc.)
• Other directories collect most of their information via telemarketing and direct mail. This process is slow as well as costly. Masterseek have considerably more readily available information than the competitors
Websites: Masterseek's intelligent webcrawling
• This method provides a large quantity of nuanced information in a cost-effective way. That is why Masterseek have the largest information supply as well as the lowest costs for data collection compared to the competitors
"Today’s search engines actually reduce the chance of finding the right pieces of information. Masterseek is a tool that counters this trend. Masterseek could be a very useful tool for Siemens."
Corporate Communications Executive
MASTERSEEK'S PRECISION GUARANTEES THE USERS THE MOST EFFECTIVE INFORMATION SEARCH
Guaranteeing: 4. Precision
The point of departure is the opportunity for advanced search
Search words, for example:
• Flat screen
• OEM products
• Taiwan
The search is processed in Masterseek's own software and server park...
• Strong and scalable server park from HP
• Intelligent software utilizing search words, IP addresses and country codes
• Intelligent algorithms (master rank)
• Cache memory remembering previous search attempts
• Masterseek’s database: advanced search algorithm, data mining, categorizing
...guaranteeing a precise and quick search
• Quicker data request than the closest competitors
• No irrelevant information
• Sorting by relevance
• User-friendly experience
• Easy access to subpages that further detail the required information
"You only have to take a look at the site and see how quick it is. There isn't anything there that you don't need. The categories, the companies, the news, well, everything a company would like to see. And it's easy to find. Very credible."
E-business executive
MASTERSEEK'S USER ACCESS DIFFERENTIATES BY PROVIDING FREE ACCESS TO ATTRACTIVE COMPANY INFORMATION
Guaranteeing: 5. No user costs
[Figure: positioning matrix. Cost axis: fee based to free. Information axis: basic value proposition within companies to attractive value proposition within company information.]
Masterseek provide the best cost/benefit ratio to potential users
Online users:
• Searching: products, categories, names, regions, etc.
• Result: manageable information, precise information, all types of information, updated information, relevant active links
MASTERSEEK'S VALUE PROPOSITION MUST MEET THE ADVERTISERS' KEY CRITERIA
Key criteria
Size and relevance of the target group
• A considerable volume of relevant traffic from the target group is crucial for a directory's ability to generate turnover
• Masterseek's unique value proposition towards users is a strong foundation for creating a large target group
• Masterseek will utilize their competence in optimizing search words to generate traffic from Google, Yahoo and MSN
• Masterseek will create massive PR and marketing to generate traffic
• Online marketing in an internet B2B forum is particularly attractive: companies can target campaigns by choosing the terms and search words that are relevant to their products, thereby reaching a clearly defined and relevant target group
Effectiveness
• The users of Masterseek have a defined need (searching) and are open to marketing; users of B2B search are more likely to make a purchase than ordinary inquirers
• Masterseek provide easy and precise information, thus minimizing the search time and strengthening the value proposition towards the users
• It is important that companies get the opportunity to raise their profile attractively
• Masterseek is an interactive medium, which is why there is an opportunity for immediate "action"
Costs
• The most common income model for online advertising is PPC (pay per click). For Masterseek this means that an advertiser only pays when a user has searched on the company's search words and then "clicked" on the company's link
• The companies only pay for the number of visits they want
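The pay-per-click principle described above can be sketched in a few lines. This is a minimal illustration, not Masterseek's billing logic: the price per click, the click cap and all names are hypothetical assumptions, since the plan does not state them.

```python
from dataclasses import dataclass


@dataclass
class PpcCampaign:
    """Hypothetical campaign: the advertiser sets a price and a cap on paid visits."""
    price_per_click: float  # assumed price in USD, not a figure from the plan
    max_clicks: int         # the number of visits the advertiser wants to pay for


def billable_clicks(campaign: PpcCampaign, clicks: int) -> int:
    """Only clicks on the advertiser's link are billed, up to the chosen cap."""
    return min(max(clicks, 0), campaign.max_clicks)


def invoice(campaign: PpcCampaign, searches: int, clicks: int) -> float:
    """Searches without a click cost nothing, so `searches` is deliberately unused."""
    return billable_clicks(campaign, clicks) * campaign.price_per_click


campaign = PpcCampaign(price_per_click=0.50, max_clicks=1000)
print(invoice(campaign, searches=20000, clicks=1200))  # billed for the 1000 capped clicks
print(invoice(campaign, searches=20000, clicks=0))     # searches alone are free
```

The cap models the statement that companies only pay for the number of visits they want. A real system would also have to decide what happens to the listing once the cap is reached, which the plan leaves open.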
MASTERSEEK HAVE DEFINED A NUMBER OF PRODUCTS THAT ARE PARTICULARLY ATTRACTIVE IN COMPARISON TO THE CRITERIA OF THE ADVERTISERS (1 OF 2)
Masterseek's products, measured against the key criteria of the advertisers
Sponsor links
Direct links to advertisers. Differentiated pricing depends on placement
Size and relevance of the target group
• Masterseek have a global target group
• Sponsor links are targeted directly at the relevant "users" according to their search criteria
Effectiveness
• Masterseek provide effective marketing: Masterseek generate "clicks" from users that have a defined need for the respective company and provide the opportunity for immediate "action"/approach
• Sponsor links have a prominent place on Masterseek
Costs
• With pricing that corresponds to the price level of less attractive competitors, Masterseek is particularly attractive from the advertiser’s perspective
My Masterseek profile (see further details in the appendix)
• Masterseek Basic: The company is listed and is guaranteed good exposure based on keywords and on additional listing of products. USD 299 per 12-month subscription
• Masterseek Advanced: The company is listed and is guaranteed good exposure based on keywords and on additional listing of products, company news, vacant positions etc. USD 449 per 12-month subscription
Size and relevance of the target group
• Masterseek have a global target group
• By being a member, you can add significantly more search words, meaning that more people and more relevant target groups will find you
Effectiveness
• The opportunity to present the company in a very attractive way compared to the competing directories
• Easy and manageable format for "the users"
• The opportunity for targeting the profile via access to user statistics
Costs
• With pricing that corresponds to the price level of less attractive competitors, Masterseek is particularly attractive from the advertiser’s perspective
MASTERSEEK HAVE DEFINED A NUMBER OF PRODUCTS THAT ARE PARTICULARLY ATTRACTIVE IN COMPARISON TO THE CRITERIA OF THE ADVERTISERS (2 OF 2)
Masterseek's products, measured against the key criteria of the advertisers
Data blocks
Delivery of specifically adjusted data blocks and services which are matched and presented according to the client's request. Price per unit
Size and relevance of the target group
• The opportunity to define a global target group
• The opportunity to define your own target group very precisely by using product codes
Effectiveness
• In-depth data
• One legal entity, and with that the opportunity to run all data transversely
• Company indices guarantee optimal targeting and effectiveness
• Constant updating guarantees validity and effectiveness when using data
Costs
• With pricing that corresponds to the price level of less attractive competitors, Masterseek is particularly attractive
Banner Ads
Direct links to advertisers. Differentiated pricing depends on placement and popularity of the chosen search terms
Size and relevance of the target group
• Masterseek have a global target group
• Banner ads are targeted directly at the relevant "users" according to their search criteria
Effectiveness
• Masterseek provide effective marketing: Masterseek generate "clicks" from users that have a defined need for the respective company and provide the opportunity for immediate "action"/approach
• Few banner ads per page will provide great visibility on Masterseek
Costs
• With pricing that corresponds to the price level of less attractive competitors, Masterseek is particularly attractive from the advertiser’s perspective
IN THE LONG RUN MASTERSEEK HAS THE OPPORTUNITY FOR INTRODUCING PRODUCTS THAT WILL GENERATE CONSIDERABLE EXTRA TURNOVER
Masterseek's products, measured against the key criteria of the advertisers
Web link
In the long run it will be possible to introduce PPC (pay per click) for clicks on the web links of the individual companies
Size and relevance of the target group
• Masterseek have a global target group
Effectiveness
• With a web link the search result appears objective, thus lowering user barriers
• It is easier for "the users" to click on a web link, which gives a clear differentiation compared to companies that do not provide a web link
Costs
• Since there is an opportunity for placing more web links per page, it is possible to provide a competitive price per "click"
It is a strategic decision not to introduce this product at the launch of Masterseek, in order to:
• Guarantee a credible profile as an objective search method (search results not dependent on advertising)
• Guarantee that all companies have a web link in the beginning (it is not possible to make deals with all companies in Masterseek in the short term)
MASTERSEEK'S VALUE PROPOSITION TOWARDS PARTNERS
Masterseek’s audience and users are at-work professionals who come from a wide range of industries. The target user will most likely be interested in a range of business-related information and services.
Services
A partnership with Masterseek can direct targeted users and potential buyers to specific suppliers of services such as company information of greater depth, credit information, online travel booking, online purchase of IT services and hardware, recruitment services, stock listing and financial services, online marketing and advertising, and many more.
Content
Masterseek also offers partnerships to other portals with business-to-business oriented content, giving the partner high-quality dynamic content that is highly relevant to the targeted users and audience.
Sales
The wide range of proven online advertising concepts gives sales partners an interesting opportunity. Both Masterseek membership profiles and text link advertisements are in demand among professional marketers because of the highly defined target group and audience on Masterseek. A sales partnership can be entered into with professional SEO companies and consultants as well as with core sales companies.
MASTERSEEK'S ORGANIZATION HAS STRONG COMPETENCES IN ALL AREAS AND HAS LEADING SUPPLIERS AS SHAREHOLDERS
[Figure: organization chart]
• Board of Directors: Appointments are pending
• Advisory Board: Christian Bahl, Thor Høberg-Petersen
• Management team: Claus Jakobsen, COO; Rasmus Refer, CTO; Martin Ohrt, CFO
• IT & Development: Rasmus Refer
• Sales & Marketing
• General Management: Robert Perz
• PR: Outsourced, Burson Marsteller
Key suppliers to Masterseek
• Fujitsu-Siemens, supplier
• Rackhosting
• Jay.Net, search technology
• 11Design, CVI and design
MASTERSEEK’S MANAGEMENT HAS STRONG COMPETENCES WITHIN CONSTRUCTION AND OPERATION OF NEW BUSINESS AREAS AND SEARCH ENGINE OPTIMISING, AND IT IS SUPPLEMENTED BY A STRONG NETWORK
Advisory Board (name and short CV)
Christian Bahl
• Executive MBA in Strategic Management, 2005
• Sales manager and co-owner, Jobindex (1997 – today). Has established partnerships with companies such as Altavista, Eniro, TV2, Tele2, MSN, Yahoo, Jubii, TvDanmark, etc.
• Advisory board member for Kultunaut
Thor Høberg-Petersen
• MBA, Finance and Strategy, Yale School of Management, Yale University, 2000
• BeautyJungle.com, Strategic Planner, 1999
• Associate, McKinsey & Company, 2000 – 2001
• Zacco A/S, Executive Assistant, 2002 -
... AND MASTERSEEK’S ORGANISATION HAS VAST EXPERIENCE AND STRONG COMPETENCES WITHIN ALL CORE AREAS
Management Team (name and short CV)
Claus Jakobsen, President & CEO
• 1980-1986: IT consultant in Burroughs Data Systems (later Unisys). Focus on project and production management
• 1987-1991: Senior at Tandem Computers. Focus on major projects in manufacturing
• 1991-1994: Sales of SAS Data. Focus on Facility Management of Operations and networking tasks
• 1994-1995: Executive Officer of Unisource Business Networks. Focus on the establishment of the Telia-owned company in Denmark
• 1995-1998: CEO of Telia Denmark. Focus on building the organization and customer base in Denmark
• 1998-2001: Managing Director of Tele1 Europe (later Song Networks). Focus on building the organization, customer base and fire in Denmark
• 2001-2002: Executive Vice President of Songs Nordic leadership based in Stockholm. Focus on the realization of synergies in production unit PR
Martin Ohrt, Chief Financial Officer
• Regional manager, Denmark, at Freetrax A/S
• Label manager at Sony Music Denmark
MASTERSEEK’S STRATEGIC DEVELOPMENT SHOULD MAKE A STOCK EXCHANGE LISTING POSSIBLE EARLY 2009
2nd quarter 2008
Central strategic goals:
• Launch of masterseek.com as a unique tool for company information
• Uploading of affiliate information is commenced
• Sale of Masterseek services can commence: sponsor links, banner advertisements and membership subscriptions
Central operational goals at the end of the period:
• 45 million companies and 25 million company websites uploaded
• Visitors: 50,000/day
• Employees: 5
• Focus countries with respect to sales and marketing: Scandinavia
3rd quarter 2008
Central strategic goals:
• Job portal, news section and product B2B auction will be open to the audience
• Partnerships implemented (e.g. sales, content, revenue partnerships)
Central operational goals at the end of the period:
• 50 million companies and 30 million company websites uploaded
• Visitors: 80,000/day
• Employees: 8
• Extra focus countries: The Netherlands, Germany and the UK
4th quarter 2008
Central strategic goals:
• Established brand name within global search for company information
• Masterseek ready for stock exchange listing
Central operational goals at the end of the period:
• 55 million companies and 35 million company websites uploaded
• Visitors: 120,000/day
• Employees: 12
• Extra focus countries: The US, the remainder of the EU and chosen Asian countries
Ready for potential IPO.
Key success factors (KSFs)
Internal KSFs:
• Technical solution
• Securing and attracting key employees
• Sufficient capital
External KSFs:
• Traffic generation to masterseek.com
• Efficient sale of Masterseek services
• Strong partnerships
MASTERSEEK IS IN A STRONG POSITION IN RELATION TO THE INTERNAL SUCCESS FACTORS
Internal KSF: Technical solution
Activities
• Database development
• Production environment development
• Web crawler software development
• Beta version test
• Scalability and flexibility
• Continuous improvement
Comment
• User tests held with positive feedback
• Masterseek.com is fully implemented by the end of 1st Quarter 2006
• The technical solution is optimized continuously with respect to feedback and Masterseek’s growth
• Masterseek has a state-of-the-art server park and a co-operation agreement with HP
Internal KSF: Securing and attracting key employees
Activities
• Defining work functions
• Development of an attractive incentive structure
• Recruitment of people with the necessary competences
• Continuous HR focus
Comment
• Existing strong organisation (cf. the section on the organization)
• Stock option program for key employees worked out by the end of 2005
• HR responsible in the organization
Internal KSF: Sufficient capital
Activities
• 1st round of financing
• 2nd round of financing
Comment
• 1st round of financing in place
• 2nd investment round is initiated in Scandinavia, Switzerland and Germany
• Preliminary positive feedback from several private Scandinavian investors
ORGANISATIONAL DEVELOPMENT
1. Organisational challenges
The evolution of Masterseek in the short term (2008) must by definition stem from the existing Danish platform, even though the holding company, Masterseek Corp., is a US company. Whichever funding model is achieved, this cannot be changed before some time into 2009.
Throughout 2008 Masterseek will suffer from a chronic lack of critical mass and middle management capacity, regardless of financial reserves. It will simply not be possible to identify, hire and run in a larger management structure as quickly as desirable. To compensate, the company will depend on outsourcing and rely on a strong top management structure.
Particular attention has to be given to discipline in the relationship between the daily management and the semi-active shareholders. Clear command lines and authority will be established to deal with that challenge. The vision is to appoint a strong, externally recruited Chairman of the Supervisory Board, supplemented by a working Vice Chairman who forms the bridge between the daily management and the Supervisory Board/main shareholders. If the funding route involves a VC investor taking a substantial stake, it is envisaged that this VC investor will introduce the incoming Chairman.
As early as the latter half of 2008 it may be relevant to appoint an international capacity as CEO, based in the USA or the UK and supported by an assistant, to accelerate the internationalization of Masterseek in collaboration with the management team in Copenhagen.
ORGANISATIONAL DEVELOPMENT
2. Strategy for sales
To recruit, build, run in and manage a sales team with the capacity to penetrate several countries quickly and simultaneously is simply not feasible in the near term. This task will be outsourced. Talks with the pan-European telemarketing organisation Ranger are in progress. It is not clear what share of the sales revenue will have to be ceded to Ranger for their services, but to be conservative the entire first year revenue has been set aside as commission. A draft agreement with Ranger should be in place for consideration before mid May 2008.
3. Strategy for strategic partnerships & top level external negotiations
This will be the responsibility of the working Vice Chairman, supported by the Chairman and the Copenhagen based General Manager. In the longer term this work may be transferred to the future US/UK based CEO, which may also involve relocating the working Vice Chairman, unless the role is replaced entirely by a new set-up.
ORGANISATIONAL DEVELOPMENT
4. IT operations & programming
In the short term the company must hire 2-3 database programmers. In addition, arrangements have been agreed with the Danish IT company ID Solutions for support of both .net applications and server operations. Ad hoc arrangements will also be made for front end design and applications.
5. Sales support
Before sales start, 1 to 2 customer support staff will have to be in place and fully trained to handle customer inquiries.
6. Accounting
A full time accountant with strong spreadsheet and documentation skills will be hired in the short term. Until that person is in place and run in, the firm's auditors will supply ad hoc assistance when required.
MASTERSEEK TAKES IMPORTANT STEPS TO ACHIEVE A STRONG POSITION IN RELATION TO THE EXTERNAL SUCCESS FACTORS*
External KSF: Traffic generation to Masterseek.com
Activities
• Search word optimising on Google, Yahoo, MSN (cf. next page)
• Cost efficient banner advertising with, among others, affiliate partners
• PR in international newspapers, magazines and TV stations such as CNN and CNBC
• Submission to other relevant search registers with, for instance, authorities, libraries, etc.
• Strong exposure at fairs and conferences
Comment
• Masterseek has core competences within search word optimising for Google, Yahoo, MSN etc. (cf. next page)
• Search engine optimising takes place via company names, concepts and search words from Masterseek’s database, whereby a link is established directly to the relevant page on Masterseek
• Masterseek has engaged the renowned international firm Burson Marsteller
• The management group has strong ties to CNN, CNBC, Doubleclick, etc.
External KSF: Efficient sale of Masterseek services (sponsor links, subscriptions, blocks and banners)
Activities
• Telephone sale
• Direct mails
• Sale through partners
• Presentation on Masterseek.com
• Online advertising and ordering
• Fairs and conferences
• PR
Comment
• Efficient internal sales force in place, with a strong track record in similar tasks
• Dialogue about co-operation with leading telemarketing companies in focus countries has been initiated
• Huge focus on ”search functionality on the Internet”, which will be exploited through vigorous use of press releases, sponsoring, events, etc.
• 200 chosen companies are offered free membership for a year in order to establish powerful case stories
External KSF: Strong partnerships
Activities
• Identification of the best partners
• Entering into attractive partnership agreements
Comment
• Contract already made with Reuters
• Present negotiations with other strong partners, incl. D&B
* Besides the information in this business plan, we are currently working on an in-depth marketing plan, incl. a plan for the launch of Masterseek.com, which will constitute approx. 12 % of all the 2006 costs
AS A CENTRAL SOURCE OF TRAFFIC GENERATION MASTERSEEK WILL MAKE USE OF A CORE COMPETENCE WITHIN SEARCH WORD OPTIMISING TO ACHIEVE TOP PLACES ON LEADING SEARCH ENGINES
Masterseek will achieve top places on the majority of search words within “International business search”, which is achieved by means of in-depth insight into the 4 core parameters of search word optimising and the data code for leading search engines.
Relevance (PageRank)
Relevance is calculated based on how many other sites link to the individual website. Masterseek has access to a huge network of other websites to which Masterseek will be submitted, thus acquiring high relevance.
Link structure
Search engines find and index information on a website based on the coding of the underlying link structure (i.e. number and naming of links). Masterseek builds a link structure based on seven years of experience, which ensures an optimized score with respect to link structure.
Text contents and Metatags
Websites are awarded a higher score the better the text contents and metatags match the search word or concept. Masterseek is constructed in such a manner that all words in the database, including product and company names, are implemented automatically in the text contents and the metatags.
MASTERSEEK’S UNIQUE COMPETENCES WITHIN SEARCH OPTIMISING WILL CONTRIBUTE TO ENSURING THAT TRAFFIC COMES TO MASTERSEEK.COM
Estimated traffic to MasterSeek per month from search word optimising
• Number of Internet users in countries with languages covered by Masterseek*: 743 million
• Number of clicks per top place per month**: 340
• Top places per month***: 75.000
• Clicks (= number of user sessions) per month to MasterSeek: 340 x 75.000 = 25.5 million per month
* Source: Internetworldstats.com
** Average based on search engine optimising carried out for D&B on Google, Yahoo and MSN, weighted with the number of inhabitants in the covered countries. Result of search engine optimising for Denmark: 0.5. Assumptions for covered countries: Europe: 0.50; the US, Australia and Japan: 10 % lower; China and South America: 30 % lower
*** Based on the search engines’ algorithms, 75,000 to 100,000 top places per language can be achieved.
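The arithmetic behind the monthly traffic estimate can be sketched as follows. This is only an illustration using the two figures printed on the slide (340 clicks per top place and 75.000 top places); the function and constant names are hypothetical and not part of the plan.

```python
# Figures as printed on the slide; the names are illustrative only.
CLICKS_PER_TOP_PLACE_PER_MONTH = 340
TOP_PLACES = 75_000  # lower end of the stated 75,000 to 100,000 range


def monthly_sessions(clicks_per_top_place: int, top_places: int) -> int:
    """User sessions per month generated by search word optimising."""
    return clicks_per_top_place * top_places


sessions = monthly_sessions(CLICKS_PER_TOP_PLACE_PER_MONTH, TOP_PLACES)
print(f"{sessions / 1e6:.1f} million user sessions per month")
```

With the upper end of the range (100,000 top places) the same calculation would give a correspondingly higher figure.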
THE STARTING POINT OF THE FINANCIAL FIGURES IS ESTIMATES FOR THE INCOME DRIVERS, WHICH ARE REALISTIC COMPARED TO RELEVANT BENCHMARKS
Income sources and their drivers (values for Year 1 / Year 3 / Year 5)
Sponsorlinks
• User sessions (million per month): 25.5 / 52.0 / 105.5
• Number of page views per session: 6 / 6 / 6
• Number of sponsor links per page view: 5 / 5 / 5
• Click rate for sponsor links: 0.003 / 0.003 / 0.003
• Income per click, USD: 0.3 / 0.3 / 0.3
Comment
• 25.5 million/month in year 1 is conservative since it is primarily based on traffic from search optimization. Masterseek also expects considerable traffic from e.g. PR, affiliate promotion and banner ads
• As a benchmark Yahoo has 72.000.000.000 pageviews each month
• Benchmark search engines have 5-8 pageviews per user session (e.g. Google and MSN)
MasterListing
• Number of advanced subscribers: 6.600 / 17.500 / 20.000
• Number of elite subscribers: 4.400 / 10.500 / 12.000
• Subscription prices advanced/elite, USD: 150/250 in all years
Comment
• Kompass is estimated to have 80,000 paying subscribers on their site, equivalent to 3-5% of the companies
Data-Blocks
• Number of sold data blocks: 40 / 58 / 83
• Average price per data block, USD: 20.000 / 20.000 / 20.000
Comment
• Thanks to Masterseek’s technique, software and construction, data blocks can be extracted with higher quality and at lower price points than those of competing directories
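As a rough illustration of how the drivers above could combine into income, the sketch below multiplies them out for Year 1. The plan does not state its formulas, so the multiplication chain, the reading of the click rate as clicks per displayed link, the annual subscription prices and all function names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SponsorLinkDrivers:
    sessions_per_month: float      # user sessions per month
    page_views_per_session: float
    links_per_page_view: float
    click_rate: float              # assumed: clicks per displayed link
    income_per_click_usd: float


def sponsor_link_income_per_year(d: SponsorLinkDrivers) -> float:
    """Assumed formula: displayed links x click rate x income per click x 12 months."""
    displayed_links = d.sessions_per_month * d.page_views_per_session * d.links_per_page_view
    return displayed_links * d.click_rate * d.income_per_click_usd * 12


def master_listing_income_per_year(advanced: int, elite: int,
                                   price_advanced: int = 150,
                                   price_elite: int = 250) -> int:
    """Assumes the USD 150/250 prices are charged once per subscriber per year."""
    return advanced * price_advanced + elite * price_elite


def data_block_income(blocks_sold: int, average_price_usd: int) -> int:
    return blocks_sold * average_price_usd


# Year 1 drivers as listed in the table.
year1 = SponsorLinkDrivers(25.5e6, 6, 5, 0.003, 0.3)
print(f"Sponsorlinks:  USD {sponsor_link_income_per_year(year1):,.0f}")
print(f"MasterListing: USD {master_listing_income_per_year(6_600, 4_400):,}")
print(f"Data-Blocks:   USD {data_block_income(40, 20_000):,}")
```

These products are not reconciled with the turnover lines of the budget, which may rest on additional assumptions not shown in the driver table.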
THE STARTING POINT OF THE FINANCIAL FIGURES IS ESTIMATES FOR THE INCOME DRIVERS, WHICH ARE REALISTIC COMPARED TO RELEVANT BENCHMARKS
Income sources and their drivers (values for Year 1 / Year 3 / Year 5)
Affiliate agreements
• Number of affiliate links per page view: 0.5 / 0.5 / 1.0
• Click rate: 0.003 / 0.003 / 0.003
• Income per click, USD: 0.15 / 0.15 / 0.15
Comment
• Based on Jupiter research studies of similar affiliate programs
Company web links
• Number of web links per page view: 0 / 0 / 0.06
• Click rate: 0 / 0 / 0.003
• Income per click, USD: 0 / 0 / 0.2
Comment
• The users are very interested in clicking on a web link for a company which they themselves have been searching actively for
• Kompass offers this service with success
Banner advertisements
• Number of banner advertisements per pageview: 0 / 0 / 2
• Click rate: 0 / 0 / 0
• Income per click, USD: 0 / 0 / 0.2
Comment
• Based on collected information from adbrite.com and DoubleClick
THE STARTING POINT OF THE FINANCIAL FIGURES IS ESTIMATES FOR THE INCOME DRIVERS, WHICH ARE REALISTIC COMPARED TO RELEVANT BENCHMARKS
[Figure: two bar charts, Turnover and EBIT, in USD 1.000 for Year 1 to Year 6]
TURNOVER INCREASES CONSTANTLY OVER THE PERIOD, AND COSTS CONSTITUTE A SMALLER FRACTION OF THE TURNOVER, WHICH EMPHASIZES THE ATTRACTIVENESS OF THE BUSINESS PLAN
USD 1.000                            Year 1    Year 2    Year 3    Year 4    Year 5    Year 6
Turnover                              1.249    22.290    29.295    45.596    74.626   117.144
Sales costs*                          1.753     2.160     1.943     3.215     5.505     8.873
Variable costs in total               1.177     9.515     7.331     6.853     9.390    13.005
Contribution margin                      71    12.776    21.964    38.742    65.236   104.139
Marketing costs                       1.900     1.450       960       915       960     1.180
Salaries                              2.030     2.590     2.818    13.125     3.424     4.285
Administration costs                     95       175       340       400       450       510
Accommodation costs                     135       290       340       362       406       433
Other costs incl. depreciations         402       498       575       672       765       900
Fixed costs in total                  4.811     5.125     5.183     5.658     6.231     7.585
Depreciation**                            7        14        21        28        35        42
EBIT                                (4.747)     7.637    16.760    33.056    58.970    96.512
Financial costs***                        0         0         0         0         0         0
Result before taxes                 (4.747)     7.637    16.760    33.056    58.970    96.512
Tax****                                   0         0         0         0         0         0
Result after taxes                  (4.747)     7.637    16.760    33.056    58.970    96.512
* Sales costs fall from 15% to 8% in year 3, where the sales set-up is expected to be optimized
** The server park is delivered by HP
*** Operations are expected to be financed by the company capital
**** Masterseek's legal home state is Nevada, which results in a tax rate of 0%
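The relationship between the rows of the budget can be checked with a few lines of arithmetic. The sketch below assumes that EBIT equals turnover minus total variable costs, total fixed costs and depreciation; that reading is inferred from the printed figures and is not stated in the plan. All amounts are in USD 1.000 as printed.

```python
# Rows copied from the budget, in USD 1.000 (Year 1 to Year 6).
turnover = [1249, 22290, 29295, 45596, 74626, 117144]
variable_costs_total = [1177, 9515, 7331, 6853, 9390, 13005]
fixed_costs_total = [4811, 5125, 5183, 5658, 6231, 7585]
depreciation = [7, 14, 21, 28, 35, 42]
printed_ebit = [-4747, 7637, 16760, 33056, 58970, 96512]


def ebit_series(turnover, variable, fixed, depreciation):
    """Assumed relationship: EBIT = turnover - variable - fixed - depreciation."""
    return [t - v - f - d
            for t, v, f, d in zip(turnover, variable, fixed, depreciation)]


computed_ebit = ebit_series(turnover, variable_costs_total,
                            fixed_costs_total, depreciation)

for year, (computed, printed) in enumerate(zip(computed_ebit, printed_ebit), start=1):
    # Differences of 1 are rounding effects from figures stated in thousands.
    print(f"Year {year}: computed {computed}, printed {printed}")
```

Under this assumption the computed values agree with the printed EBIT row to within rounding of one thousand USD.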
THE LIQUIDITY BUDGET SHOWS POSITIVE YEARLY LIQUIDITY EFFECTS DURING THE PERIOD
USD 1.000                                Year 1    Year 2    Year 3    Year 4    Year 5    Year 6
The result of the period before tax     (4.747)     7.637    16.760    33.056    58.970    96.512
Depreciations                                 7        14        21        28        35        42
Financing from operation                (4.740)     7.650    16.781    33.084    59.005    96.554
Change, creditors                           499       721     (177)   (0.002)       259       414
Changes, trade debtors                    (208)   (3.507)   (1.167)   (2.717)   (4.838)   (7.086)
Operational liquidity in total          (4.449)     4.865    15.436    30.367    54.426    89.881
Investments in material fixed assets         70        70        70        70        70        70
Paid tax                                      0         0         0         0         0         0
The liquidity effect of the period      (4.519)     4.795    15.366    30.297    54.356    89.812
Accumulated liquidity effect            (4.519)       275    15.641    45.938   100.294   190.106
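The accumulated liquidity effect can be retraced from the operational liquidity row. The sketch below assumes that the liquidity effect of a period is operational liquidity minus investments in material fixed assets and paid tax, and that the accumulated effect is the running sum; both relationships are inferred from the printed figures rather than stated in the plan. Amounts are in USD 1.000.

```python
from itertools import accumulate

# Rows copied from the liquidity budget, in USD 1.000 (Year 1 to Year 6).
operational_liquidity = [-4449, 4865, 15436, 30367, 54426, 89881]
investments = [70, 70, 70, 70, 70, 70]
paid_tax = [0, 0, 0, 0, 0, 0]
printed_accumulated = [-4519, 275, 15641, 45938, 100294, 190106]

# Assumed relationship: period effect = operational liquidity - investments - tax.
period_effect = [o - i - t
                 for o, i, t in zip(operational_liquidity, investments, paid_tax)]

# Running total of the yearly effects.
accumulated = list(accumulate(period_effect))

for year, (value, printed) in enumerate(zip(accumulated, printed_accumulated), start=1):
    # Differences of 1 are rounding effects from figures stated in thousands.
    print(f"Year {year}: accumulated {value}, printed {printed}")
```

The running sum reproduces the printed accumulated row to within rounding of one thousand USD.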
SENSITIVITY ANALYSIS IS BUILT ON CHANGES IN RELATION TO BASE CASE
Possible changes in relation to base case
Worst case (Estimated probability: 10 %)
• User sessions will be 100.000/day the first 2 years, and will rise in year 3 to the level of the base case year 1
• The first three years sales costs are doubled compared to the base case budget
• 0 memberships and 50% fewer data blocks will be sold the first 3 years
• An extra 1 mill. USD will be spent on a launch campaign
• The realized click rate is 0.002 instead of 0.003
Base case (Estimated probability: 75 %)
• As budgeted
Best case (Estimated probability: 15 %)
• User sessions are 50% higher than in the base case, and the growth rate for user sessions is also 50% higher than in the base case
• The fraction of companies buying web links is doubled
• The realized click rate is 0.005 instead of 0.003
WORST CASE SCENARIO SHOWS LOWER TURNOVER IN YEAR 1, WHICH IS COUNTERED BY CONSIDERABLY HIGHER TURNOVER IN THE BEST CASE SCENARIO
[Figure: Turnover in USD 1.000 for the worst case, base case and best case]
WORST CASE SCENARIO SHOWS LOWER TURNOVER IN YEAR 1, WHICH IS COUNTERED BY CONSIDERABLY HIGHER TURNOVER IN THE BEST CASE SCENARIO
[Figure: EBIT in USD 1.000 for the worst case, base case and best case]
TO MAKE A COMPANY VALUATION THREE SCENARIOS AND TWO DIFFERENT VALUATION METHODS ARE EMPLOYED
Scenarios
• Worst case (probability: 10%)
• Base case (probability: 75%)
• Best case (probability: 15%)
Valuation methods
• Discounted Cash Flow (DCF): DCF estimates the value on the basis of the company’s expected ability to generate cash flow
• Multiple: a multiple can be used to estimate the value of the company in relation to its turnover and income by looking at what similar companies are sold for
BASED ON THE TRADITIONAL VALUATION METHODS THE COMPANY’S PRESENT VALUE IS ESTIMATED TO BE BETWEEN 400 AND 600 MILL. USD
The company’s value interval is based on various valuation methods (USD 1,000)
I. DCF valuation
• Base case: WACC 12%, terminal growth rate 2% (75% probability): 537.500
• Worst case: WACC 12%, terminal growth rate 2% (10% probability): 255.400
• Best case: WACC 12%, terminal growth rate 2% (15% probability): 3.822.678
II. Multiple valuation (year 6, WACC 12%)
• Base case: EBIT x 10,0: 453.163
Total interval when assessing the company’s present value: 400.000 to 600.000
Comments
• The company’s value lies in the interval between 400 and 600 million USD, depending on which valuation method carries the most weight
• Investors should compare their own risk profile with the applied WACC of 12%
• Investors should consider the probabilities of the base, worst and best case
• Investors should consider the multiple in relation to whether exit is wished in year 6 or later
IMPLEMENTATION PLAN
[Figure: timeline of activities by quarter, from 07Q4 to 10Q4]
Internal activities
• Technical implementation
• 2nd investor round
• Masterseek.com on air
• Affiliate information uploaded on masterseek.com
• Continuous optimising and development of the technical set-up in relation to growth and feedback
External activities
• Data block sale
• Preparation for sale of remaining services
• Launching campaign
• Sale of remaining Masterseek services
• Stock exchange introduction
Geographical expansion
• Scandinavia
• Germany, the Netherlands, the UK
• Remaining EU, the US and chosen Asian countries
MASTERSEEK’S TECHNOLOGICAL CONSTRUCTION ENSURES THAT MASTERSEEK CAN BE SCALED FINANCIALLY IN A RATIONAL MANNER
Front end
• Today MasterSeek’s front end system consists of 15 HP dual Xeon servers which are placed behind a redundant Alteon 3408 setup
• Alteon 3408 is a leading load balancer for this type of application; it can handle some 100,000 simultaneous sessions per second in its current set-up and can easily be scaled as need be
• Alteon 3408 balances the load between the various HTTP front ends and determines the optimum response time, disconnects equipment, etc.
• The front end will service all HTTP requests and can be scaled without limit to meet future needs
Back end
• MS-SQL back ends consist of traditional servers in multiple cluster setups with redundant disc arrays to handle raw data
• The clusters will be divided into different categories of high-load queries and types of content as the system is being used
Summary
• The use of standard equipment and applications ensures that the system can handle the huge amount of data and the very many simultaneous users and queries
• The system is fully scalable with respect to front end and back end
• The system can easily be expanded in step with increasing needs, and the use of standard equipment ensures that the incremental costs of the expansion are small
APPENDIX 1
ELABORATION OF MASTERSEEK’S ”MY MASTERSEEK” MEMBERSHIPS ILLUSTRATES PROMINENT PROFILING POSSIBILITIES AND DEGREE OF DETAIL
APPENDIX 2
Membership levels
• My Masterseek - Free
• My Masterseek - Basic
• My Masterseek - Advanced
Elements of increased value across the membership levels
• Insurance of data validity
• Increased chance of being found
• Further statistical information
• Further multimedia information: company logo, presentations (ppt. and video), photos, audio files, animations
• Interactive services: SMS messages, e-mail messages, regular and extensive information about traffic
Company information fields
• Contact info, Management, News, Press releases, Products, Customers, References, Partners
• Comp. profile, Ownership, Organisation, Employees, Event calendar, Financial info, Technology, Special offers
Notes
• Companies without websites, or whose websites’ construction prevents automatic validation, are ensured manual addition
• The chance of being found is multiplied with the number of products, services, press releases and activities which the company itself adds to its company registration
APPENDIX
WHITEPAPER DATA MINING MasterSeek 2008, Denmark
CONTENT
• Tools & Technologies
• Components
• Goals
• Crawler
• The Sorter
• URL mining/Validation Diagram
• Information mining from existing URLs Diagram
Tools & Technologies
Below is a preliminary list of tools involved at different stages of the data mining & validation process.
1. PERL 5.8 (Practical Extraction and Report Language), used at the Dispatcher/Crawler stage.
2. Microsoft .Net Development Tools version 1.1, at the Sorter stage.
3. REGEXP (Regular Expressions matching), at all stages of the process.
4. DBMS.
Components
The following section describes the different components involved in the data mining process. The components integrate with each other via the database, in order to maintain a traceable connection and to secure data transmission. The components cover different functionality such as data mining, data validation and first level URL search. The 3 components are:
1. Dispatcher
2. Crawler
3. Sorter
Goals
Goals are specified for the different parts/processes. The main goal is to keep Masterseek’s company database up to date; it currently holds information on 22 million companies and is growing.
1. Find the URL (website address) and crawl the required information/properties of the 11 million companies for which Masterseek does not have URL information in its database.
2. At specified time intervals, crawl, match and update the information of all those companies whose URL addresses are in the Masterseek database.
Dispatcher
The Dispatcher is responsible for dispatching work load to the crawler component. It ensures scalability by dispatching work load according to the available work stations, which enables the data mining process to spread the work load over the entire Masterseek network (including office machines, using their CPU while they are hibernating).
The dispatcher uses a queue model which has been implemented successfully in mainframe systems during the last 25 years. The model is based on many queues; each queue has a defined batch of initiators, which are actually responsible for dispatching the tasks. The number of initiators varies from one queue to the other, and the time interval is also defined for each queue. This gives maximum flexibility for prioritizing, for setting the time frame for site scanning and for dividing the work load over the available resources.
Crawler
The crawler uses site map extraction with country/locale specific low level RegExp technology to extract URL content. To optimize performance the text analysis will not be accurate, and a high error percentage is accepted at this stage. Before crawling, the crawling process consults the crawling Meta-Data database in order to receive manual crawling information and feedback from the sorter on data quality. After a few dozen crawling iterations, the result of this process will be an accurate list of properties and their respective URLs, yielding the most efficient crawler with high accuracy.
Sites change once in a while, which is why we crawl them over and over again. Such changes may affect only the data inside the URL, for example when the telephone number or address has changed. The other type of change is a structural change with replacement of URLs. This change will cause the sorter to report errors back to the crawler, and the learning process will start over again.
The performance of the crawler is important, but such performance is obtained by crawling optimization. It is crucial to obtain a flexible crawler and to use a high level language such as Perl, which has sophisticated text parsing functionality.
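The queue model described for the Dispatcher can be sketched as follows. This is a minimal illustration in Python rather than the Perl named in the tool list, and every name in it (CrawlQueue, dispatch, the work station and site names) is hypothetical; it only shows how per-queue initiator counts and time intervals can prioritize work over the available work stations.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CrawlQueue:
    name: str
    initiators: int   # how many tasks this queue may start per run
    interval: int     # the queue runs every `interval` ticks
    tasks: deque = field(default_factory=deque)


def dispatch(queues, workstations, tick):
    """Return (workstation, task) pairs started at this tick.

    Queues are visited in priority order; each queue that is due starts
    at most `initiators` tasks while free work stations remain.
    """
    free = deque(workstations)
    assignments = []
    for queue in queues:
        if tick % queue.interval != 0:
            continue  # this queue is not due at this tick
        for _ in range(queue.initiators):
            if not queue.tasks or not free:
                break
            assignments.append((free.popleft(), queue.tasks.popleft()))
    return assignments


# Hypothetical set-up: new URLs are crawled every tick, refreshes every third tick.
new_urls = CrawlQueue("new-urls", initiators=2, interval=1,
                      tasks=deque(["a.example", "b.example", "c.example"]))
refresh = CrawlQueue("refresh", initiators=1, interval=3,
                     tasks=deque(["d.example"]))
stations = ["ws-1", "ws-2", "ws-3"]

first_run = dispatch([new_urls, refresh], stations, tick=0)
second_run = dispatch([new_urls, refresh], stations, tick=1)
print(first_run)
print(second_run)
```

The sketch treats every work station as free at each tick; a real dispatcher would also have to track which stations are still busy.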
Crawling techniques
There is no such thing as the perfect crawler, since every site has its own unique structure. The goal of a good crawler is to deal with the majority of sites and to flag irregularities in the site structure that indicate a potential crawling error. The crawler's mission is to gather the most relevant data on the company according to the defined properties, and to pass the information on to the second process, the fine sorting or SORTER.
The Site Map
The site map is one of the more efficient techniques, due to its clear tree structure and clear node definition. It allows us to scan the entire site very quickly and locate the relevant information. Here is an example of a typical site map: (affinity.com)
In the site map we can find a clear link to the searched data, using a limited vocabulary of keywords.
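Locating the relevant links in a site map with a limited keyword vocabulary could look roughly like this. The sketch uses Python's standard html.parser instead of the Perl/RegExp tooling named in the whitepaper, and the keyword list, class name and sample markup are all assumptions made for illustration.

```python
from html.parser import HTMLParser

# Hypothetical vocabulary of link texts the crawler is interested in.
KEYWORDS = ("contact", "about", "products", "management", "news")


class SiteMapLinks(HTMLParser):
    """Collects (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def relevant_links(html):
    """Map each keyword to the link whose text mentions it."""
    parser = SiteMapLinks()
    parser.feed(html)
    return {keyword: href
            for href, text in parser.links
            for keyword in KEYWORDS
            if keyword in text.lower()}


sample = ('<ul><li><a href="/about.html">About us</a></li>'
          '<li><a href="/contact.html">Contact</a></li>'
          '<li><a href="/jobs.html">Careers</a></li></ul>')
found = relevant_links(sample)
print(found)
```

Links whose text matches none of the keywords, such as the careers page in the sample, are simply ignored.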
Building the Site Map
When a site map isn’t supplied, it is crucial to build one. We don’t need the entire site map, just the more relevant areas of the site. Building such a site map requires a general analysis of the page content in order to understand what the page is about. Such an analysis is possible with special keywords and regular expressions (RegExp), a powerful technique for finding text structures. Making full use of the features of RegExp technology provides a powerful crawling technique.
Generally, the description of a URL’s content resides in the web page’s Meta Data Tags, such as the meta TITLE, meta DESCRIPTION and meta KEYWORDS tags, which point out what the page contains. It is also possible that a web page does not contain Meta Data Tags. In that case we have to make a smart analysis to find the theme of the contents using RegExp and extract the required properties/attributes. As a further optimization, downloading only the first X characters is enough to analyze the page content. This technique allows us to scan more pages with greater speed and achieve higher quality in our company database.
Here is an example from http://www.aag-gummi.dk/
<META NAME="keywords" CONTENT="bærelejer, cellegummi, cellegummilister, ekstruderede profiler, entrémåtter, epdm, fendere, gummi/metal, gummidug, gummifødder, gummilister, gummimembraner, gummistropper, gummitætninger, moosgummi, naturgummi, neopren, nitril, oringsnor, pakning, polyurethan, silicone, sneplovskær, stænklapper, støbte emner, teknisk gummi, udstansning, aag">
This Meta-Data can be used for categorization using RegExp; it is basically a summary of the company’s products.
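Reading the keywords from the first X characters of a page might be sketched as below. The example again uses Python's standard html.parser rather than the tools named in the whitepaper; the 4096 character limit, the function name and the shortened sample tag are assumptions for illustration only.

```python
from html.parser import HTMLParser


class MetaTags(HTMLParser):
    """Collects the name/content pairs of meta tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag and attribute names, so <META NAME=...> works.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name, content = attributes.get("name"), attributes.get("content")
        if name and content:
            self.meta[name.lower()] = content


def page_keywords(html, max_chars=4096):
    """Return the meta keywords found in the first `max_chars` characters."""
    parser = MetaTags()
    parser.feed(html[:max_chars])
    raw = parser.meta.get("keywords", "")
    return [keyword.strip() for keyword in raw.split(",") if keyword.strip()]


# Shortened version of the keywords tag quoted above.
sample = ('<html><head><META NAME="keywords" '
          'CONTENT="cellegummi, epdm, teknisk gummi, aag"></head>')
keywords = page_keywords(sample)
print(keywords)
```

Pages without a keywords tag yield an empty list, which is the case where the text-based theme analysis mentioned above would take over.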
The Sorter
The Sorter uses country/locale specific information and sophisticated RegExp technology to extract the company attributes. It demands a high performance server with high caching capabilities, software written in a low-level language, and a scalable solution in order to handle the data stream from the crawler.
Regular Expression: Content analysis based on country/locale formats
RegExp is a very powerful technology in text analysis; it allows us to identify data structures, for example:
Telephone number
+45-46-90-21-89 or +45 4690 2189 or +4546902189 or +45-4690-2189 or 0045-4690-2189 or 46-90-21-89
All of the formats have a few common attributes that match Denmark’s telephone number system:
• It will start with + and 8 digits after and the number length will be max 12 chars: *(1)[+, ]45*(12)[0-9, ,-]
• It will start with 00 and 8 digits after and the number length will be max 12 chars: *(3)[0, ]45*(12)[0-9, ,-]
• It will have 6 digits and number length will be max 9 chars: *(9)[0-9, ,-]
Using country/locale specific patterns we can match almost every telephone number format and identify it as a telephone number.
Address format
Address formats are more variable and therefore more difficult to identify, but again country/locale specific information makes it easier. For a specific country we can identify an address by the town name; using a list of town names for each country gives high accuracy in identifying addresses. Regarding the address format itself, we need to explore the various formats used in each country in order to build a RegExp pattern for each and every country.
Combination of techniques
Sites differ in structure and presentation; therefore, as mentioned earlier, it is impossible to create the perfect crawler. The goal is to create a crawler that will succeed with more than 60% of the sites. Achieving this goal will require a combination of algorithms that try to ‘solve the problem’ from different angles; using the feedback from the sorter, we can adjust the matching algorithm to the different sites.
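A country specific pattern for the Danish telephone formats listed above could be sketched as follows. The pattern is written in standard regular expression syntax and is an assumption, not the notation or the pattern used in the whitepaper; it presumes an optional +45 or 0045 prefix followed by 8 digits that may be separated by spaces or hyphens.

```python
import re

# Assumed pattern: optional "+45"/"0045" prefix, then 8 digits with
# optional space or hyphen separators, not embedded in a longer digit run.
DK_PHONE = re.compile(r"(?<!\d)(?:(?:\+|00)45[ -]?)?(?:\d[ -]?){7}\d(?!\d)")


def find_danish_phones(text):
    """Return all matched numbers normalized to the form +45XXXXXXXX."""
    numbers = []
    for match in DK_PHONE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        numbers.append("+45" + digits[-8:])  # keep the 8 national digits
    return numbers


EXAMPLES = [
    "+45-46-90-21-89",
    "+45 4690 2189",
    "+4546902189",
    "+45-4690-2189",
    "0045-4690-2189",
    "46-90-21-89",
]

for example in EXAMPLES:
    print(example, "->", find_danish_phones(example))
```

All six formats from the text normalize to the same number, which is what allows the sorter to compare a freshly crawled value with the one already stored in the database.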