KnowledgeWatch

Looking Forward:
Web Search and Information Mining

A KnowledgeWatch White Paper

Background

Today's fast-paced business environment leaves knowledge workers very little time to keep up with vital new information from their industry, or to locate new knowledge and tactics that may help with their jobs.

And yet never before has so much information been so easily available in digital form to help you and your organization stay competitive in a fast-changing business world. You know there are news and events you need to be aware of, and your enterprise needs to adapt to changes in its competitive environment.

Finding a needle in a haystack.

Sadly, the information industry has progressed with tools that are not much better than a large rake for finding needles in a haystack, all while the haystacks grow bigger and harder to search. If you needed to find a needle lost in a haystack, you wouldn't use a rake. If you could, you would use a special magnet that automatically, quickly and effortlessly attracts that needle for you.

The "needles" in this metaphor are the key pieces of information you need to consistently improve how you do your job, or the breaking industry news that may help you and your organization thwart competitors and gain sales. The KnowledgeWatch mission is to provide you with solutions that automatically mine and attract key information to you from the enormous "haystack" of information.

At KnowledgeWatch we're focused on engineering and delivering service solutions making information easy to absorb and less costly for you and your enterprise to find.


Public Web Search Engines Serve a Role - A Limited Role

We believe public Web search engines serve a role, but a limited one. That role is perhaps strongest for individuals surfing the Web for personal use. For enterprise knowledge workers, the role of public Web search engines is not as strong as once thought. As alternatives to these free-of-charge services come onto the scene, search engines will most likely continue to be used by individual surfers. Enterprise knowledge workers' use of public search engines, however, will be supplanted by better, more cost-effective alternatives.

Public Web search engines have taken an important place within the Web's information architecture. To name just eight reasonably well-known examples:

Google        Yahoo
MSN Search    Lycos
AltaVista     AOL Search
AllTheWeb     Ask.com


These public search engines offer a very broad view of millions of information points on the Web. They deliver meaningful results, at no monetary cost, when a Web user wants to search for ad hoc information: finding a map to a theater, finding a place in the community that sells a particular commodity, or researching a student's homework assignment.

Search engine experts report that public search engines crawl and index less than 50% of the content actually stored on the Web. The non-indexed portion is known as the "hidden Web"; much of it sits in private databases that require login permission for access. Search engine users never see this hidden content.

Public Web Search Engines

The public engines are problematic, however, precisely because they are free: they rely on advertising sales to sustain their business models. Because they depend on advertising sales, they tend to rank search results using methods that are not disclosed. Results from searches are generated by a proprietary ranking "black box".

(It's worth noting that a large part of the Web developer community earns its livelihood by advising Web site owners on how to understand, "get tuned" for and "work well" with public search engines.)


Using an Advertising Results Methodology


And results? Boy, do the public search engines give you results. Most of the time they produce millions of them. How many of us have read through even 30% of the results from a Web search? We bet that 99% of searchers never look at more than a few score. Public Web search engines are not focused. As a result, a searcher receives unfocused results and bears significant costs from potentially missed information.

This point becomes clear when examining these actual results from searching Google for the following terms:


Terms Searched:         Results Fetched:
automotive engines      23,200,100 items
ethanol                 32,900,000 items
drive train             91,010,000 items
tires                   73,100,000 items


Who has the time to look at more than the first twenty or thirty items? Web search engines are an applicable tool in some settings, but they should not be considered serious information providers for very busy knowledge workers, who could otherwise save 95+% of their search time and a dramatic amount of manpower cost.

A Look Forward: More Powerful Search Tools and Machines - Like The Information Mining Engine

Clearly, in order to save time and money, knowledge workers need a more powerful model to work with. KnowledgeWatch would like to raise five questions to provoke some thought on an alternative search methodology, one leading toward a better, easier-to-use and less costly tool for business knowledge workers.


1. Why do we rely so heavily on advertising business models to find things on the Web?

Public search engines are free and openly accessible, so they require a means of financial support, and advertising is well suited to those services. However, advertising is not a perfect companion to relevant search methodology. This is particularly true for business use of Web search. An enterprise's methods for search and information mining should afford a more effective strategy than a "free approach". Perhaps it would be better to pay a reasonable subscription for an open search and information mining service that can retrieve very focused information from a set of sources known to the user.

2. Why can't we use multiple sets of topics that pertain to our information needs?

Except through special application programming interfaces, public search engines offer a one-topic-at-a-time, manually entered approach to Web information search. It's time a more powerful, effective approach was available. One-time, manual, ad hoc search may be suitable for public search engines, but suppose knowledge workers have a set of information topics they need to have "watched" on an ongoing basis. And suppose that set of topics starts out small and grows over time. It would be nice to be able to add to your information targets whenever necessary.
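To make the idea of a growing, multi-topic "watch list" concrete, here is a minimal sketch in Python. It is an illustration only, not a description of how any KnowledgeWatch product works: the class name `TopicWatchList`, its methods and the sample headline are all hypothetical, and a topic is assumed to match when every one of its single-word keywords appears in the text.

```python
import re


class TopicWatchList:
    """A set of watched topics that can grow over time (hypothetical design)."""

    def __init__(self):
        # topic name -> set of lowercase single-word keywords
        self._topics = {}

    def add_topic(self, name, keywords):
        """Add a topic, or replace its keywords, at any time."""
        self._topics[name] = {k.lower() for k in keywords}

    def matching_topics(self, text):
        """Return the names of topics whose keywords all occur in the text."""
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        return sorted(name for name, kws in self._topics.items() if kws <= words)


watch = TopicWatchList()
watch.add_topic("ethanol fuel", ["ethanol"])
watch.add_topic("drive train", ["drive", "train"])

headline = "New drive train design cuts ethanol consumption"
print(watch.matching_topics(headline))

# The list can grow later without disturbing the existing topics.
watch.add_topic("tires", ["tires"])
print(watch.matching_topics("Winter tires recalled"))
```

Every incoming item is checked against all topics at once, so adding a topic costs the user one line rather than one more manual search.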

3. Couldn't we have an automated approach that eliminates manpower work and time in a useful way?

Unless you use a private search service, automation with public search services is limited to e-mail alerts. Interrupting knowledge workers with regularly arriving e-mail alerts is not an efficient way to save time and manpower. Perhaps notification options should be scaled according to the importance of the information retrieved, so that one topic triggers an immediate notification while other topics are reported daily or weekly.

4. Would we be better off finding information at selected locations or content providers?

Of course we would. It's impractical to manually read and study hundreds of thousands of search results. We generally know where there's good information. What we need is an automated way to manage the large "haystack" of information from known, quantified sources. Perhaps some of these sources would come from the "hidden Web" sections the public search engines never get to see.

5. Couldn't we better leverage the new Web phenomenon of Really Simple Syndication (RSS)?

RSS is an open Web standard for publishing a large body of information on the Web. Other than some blogs, public search engines do not index RSS feeds. Why is this so? Is it because there is no way to harness advertising through RSS? Is it because RSS does not incorporate advertising directly?
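Because RSS is an open, XML-based format, a feed can be read with ordinary tools. The following is a minimal sketch, assuming an RSS 2.0 feed and using only Python's standard library. The feed content, the example.com addresses and the function name `items_mentioning` are made up for illustration; a real service would fetch feeds over the network and handle malformed input.

```python
import xml.etree.ElementTree as ET

# A tiny, hypothetical RSS 2.0 feed used in place of a real download.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Industry News</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>Ethanol blends gain ground</title>
      <link>http://example.com/ethanol</link>
      <description>Refiners expand ethanol capacity.</description>
    </item>
    <item>
      <title>Tire prices rise</title>
      <link>http://example.com/tires</link>
      <description>Rubber costs push tire prices up.</description>
    </item>
  </channel>
</rss>"""


def items_mentioning(feed_xml, term):
    """Return (title, link) for each feed item whose title or description contains term."""
    root = ET.fromstring(feed_xml)
    term = term.lower()
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        link = item.findtext("link", default="")
        if term in f"{title} {description}".lower():
            hits.append((title, link))
    return hits


print(items_mentioning(SAMPLE_FEED, "ethanol"))
```

Each item in a feed already carries a title, a link and a description, which is what makes feeds convenient targets for automated, topic-focused mining.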



Major Property                           Information Mining Engine    Public Web Search Engine
Advertising Business Model?              No                           Yes
Can use multiple sets of topics?         Yes                          Programmer is needed
Works continuously and automatically?    Yes                          No
Selected content sources?                Yes                          No
Indexes and searches RSS files?          Yes                          No


KnowledgeWatch Overview

KnowledgeWatch - Automated Information Mining for Enterprises


KnowledgeWatch was founded to provide business workers across industries with solutions that harvest available digital information pertaining to their focused business needs. The following properties are standard with KnowledgeWatch solutions:

1) Automated Information Mining

2) User-defined Topic Mining

3) Targeted Content Sources

4) User Controlled Operation

Together, these four properties deliver advantages to both the enterprise and the individual worker, and produce unique results in Web search and information mining work. KnowledgeWatch has combined these four properties to solve Web search and information mining problems with an innovative and effective approach. By combining them, KnowledgeWatch delivers a significant return on investment for its subscribing customers.

We're building and delivering standard services and solutions like KnowledgeWatch TIME - The Enterprise Information Mining Engine. In addition, we tailor our solutions for implementation by our business partners for their enterprise customers in various industry settings such as Advertising & Public Relations, Aerospace & Defense, Automotive, Biotechnology, Computer, Government, Healthcare and Telecommunications.

Following are descriptions of the major properties of KnowledgeWatch solutions.

Automated Information Mining


Automated information mining provides a continuous, around-the-clock search and "watch" for user-targeted information. KnowledgeWatch works for you even when you're not working.



User-defined Topic Mining



User-defined topic mining lets knowledge workers define several information topics that KnowledgeWatch watches on their behalf. The user can set up several areas for ongoing search and information mining, which then continue while the user is accomplishing other work.

This property can save the enterprise significant manpower. It also delivers results that would normally have passed by, probably outside the worker's view.

One significant result of this approach is that the user can configure, independently for each topic, how results are reported. The options are:

1. Going online at any time to see the results

2. Having e-mails delivered immediately when information on a topic is found

3. Delivering a daily summary e-mail

4. Delivering a weekly summary e-mail

5. Aggregating results into a user's personal information bank for later retrieval
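The per-topic reporting options listed above can be pictured as a small routing table. The sketch below is a hypothetical illustration in Python, not KnowledgeWatch's actual configuration format: the `Delivery` names, the `route` function and the sample topics are assumptions, and topics without a stated preference are assumed to fall back to online viewing.

```python
from collections import defaultdict
from enum import Enum


class Delivery(Enum):
    """The five reporting options from the list above (names are illustrative)."""
    ONLINE = "view online at any time"
    IMMEDIATE = "e-mail immediately"
    DAILY = "daily summary e-mail"
    WEEKLY = "weekly summary e-mail"
    ARCHIVE = "personal information bank"


def route(findings, preferences):
    """Group (topic, headline) findings by each topic's chosen delivery option."""
    queues = defaultdict(list)
    for topic, headline in findings:
        # Assumed default: topics with no preference are only shown online.
        queues[preferences.get(topic, Delivery.ONLINE)].append(headline)
    return dict(queues)


preferences = {
    "competitor pricing": Delivery.IMMEDIATE,
    "ethanol": Delivery.WEEKLY,
}
findings = [
    ("ethanol", "Ethanol blends gain ground"),
    ("competitor pricing", "Rival cuts prices"),
    ("tires", "Tire prices rise"),
]
queues = route(findings, preferences)
for option, headlines in queues.items():
    print(option.value, "->", headlines)
```

Keeping the preference on the topic rather than on the user is what lets an urgent topic interrupt immediately while less urgent ones wait for a summary.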


User's Targeted Content Sources


Knowledge workers often know where they might find important information on subjects that interest them. KnowledgeWatch gives the user a way to identify the content sources that its search and information mining functions will retrieve from.

This is a very important capability that diverges dramatically from the way public search engines operate. The information that is mined is focused very closely on the interests of the knowledge worker. This capability allows a level of user focus and time saving that other approaches do not offer.

Research Services Available
KnowledgeWatch offers a staffed research service that assists customers in identifying potential sources of content for their individual information mining operations. We are familiar with the Web and know about sources that customers may not be familiar with. In addition to providing a standard services solution, we can do research that complements the knowledge worker's information search requirements.




User Controlled Operation



User Controlled Operation allows a user to modify KnowledgeWatch configuration at any time.  This extends the usefulness and value of our standard service and helps the user to accomplish a self-tailored approach to focus in on just the information content that is most meaningful to them.

Integration Services Available
Additionally, as an integration service, KnowledgeWatch will integrate additional file formats that do not conform to the standard Web conventions our standard services already work with.


We're focused on engineering and delivering service solutions making information easy to absorb and less costly for you and your enterprise to find.


Together, these KnowledgeWatch elements make the information gathering process astonishingly low in manpower cost. Our objective is to deliver technology services and solutions that provide the "automated magnets" that will find "the needles in the haystack" that you need to know about.


KnowledgeWatch makes available to qualified prospective customers trial use programs for TIME - The Enterprise Information Mining Engine.  Please call KnowledgeWatch at 248-427-0726 to discuss or visit our Web site at www.knowledgewatch.com to request a trial use program.
