Publication number: USRE42870 E1
Publication type: Grant
Application number: US 12/325,881
Publication date: 25 Oct 2011
Filing date: 1 Dec 2008
Priority date: 4 Oct 2000
Fee status: Paid
Also published as: US7330850
Publication number: 12325881, 325881, US RE42870 E1, US RE42870E1, US-E1-RE42870, USRE42870 E1, USRE42870E1
Inventors: John C. Seibel, Yu Feng, Robert L. Foster
Original Assignee: Dafineais Protocol Data B.V., Llc
Text mining system for web-based business intelligence applied to web site server logs
US RE42870 E1
Abstract
A text mining system for collecting business intelligence about a client, as well as for identifying prospective customers of the client, for use in a lead generation system accessible by the client via the Internet. The text mining system has various components, including a data acquisition process that extracts textual data from Internet web sites, including their logs, content, processes, and transactions. The system compares log data to content and process data, and relates the results of the comparison to transaction data. This permits the system to provide aggregate cluster data representing statistics useful for customer lead generation.
Claims(15)
1. A text mining system for providing data representing Internet activities of a visitor to a web site of a business enterprise, comprising:
a data acquisition process, operable to:
extract visitor identification data from a server log of the web site, wherein the visitor identification data identifies a visitor to the web site at a known time;
aggregate the visitor identification data with visitor purchase data to provide aggregated visitor data that represents whether a purchase was made from the website by the visitor at or near the known time;
extract text documents from Internet-wide text sources, the Internet-wide text sources selected from the group of: newsgroups, discussion forums, and mailing lists to provide visitor related documents; and
extract predictive statistics from the aggregated visitor data to provide extracted predictive statistics;
a server, operable to:
receive one or more queries, wherein each query of the one or more queries represents a request for information about the visitor and the visitor related documents; and
provide responses to the one or more queries based on the received one or more queries and the aggregated visitor data, the extracted predictive statistics, and the visitor related documents;
wherein the server is accessible via a web browser over the Internet.
2. The system of claim 1, wherein the text mining server is further operable to generate and store information maps representing the aggregated visitor data and the visitor related documents.
3. A text mining method for providing data representing Internet activities of a visitor to a website of a business enterprise, comprising:
extracting visitor identification data from a server log of the web site, the data identifying a visitor to the website at a known time;
aggregating the visitor identification data with visitor purchase data to provide aggregated visitor data that represents whether a purchase was made from the website by the visitor at or near the same time; and
extracting text documents from Internet-wide text sources other than the website, the Internet-wide text sources selected from the group of: newsgroups, discussion forums, and mailing lists to provide visitor related documents;
extracting predictive statistics based on said extracting visitor identification data, said aggregating, and said extracting the text documents to provide extracted predictive statistics;
receiving one or more queries, wherein each query of the one or more queries represents a request for information about the visitor and the visitor related documents;
generating results based on the one or more queries and the aggregated visitor data, the extracted predictive statistics, and the visitor related documents; and
storing the generated results.
4. The method of claim 3, further comprising:
generating and storing information maps representing the aggregated visitor data and the visitor related documents.
5. A method, comprising:
extracting visitor identification data from a server log of a website of an e-commerce client, wherein the visitor identification data identifies a visitor to the website at a known time;
aggregating the visitor identification data with information related to at least one of web data or processes occurring at or near the known time to generate aggregated visitor data;
determining whether the visitor purchased a product at or near the known time;
storing information regarding the visitor and activity of the visitor based on said determining, operable to be provided to the e-commerce client; and
extracting predictive statistics from the information regarding the visitor and the activity of the visitor to provide extracted predictive statistics, wherein the extracted predictive statistics are operable to be provided to the e-commerce client.
6. The method of claim 5, wherein the predictive statistics comprise one or more of:
profitable aggregation clusters;
least profitable aggregation clusters;
mean aggregation clusters; and
dropped transaction aggregation clusters.
7. The method of claim 5, further comprising:
overlaying the statistics with one or more of transaction, survey, or user-entered demographics and preferences.
8. A method, comprising:
extracting first information comprising visitor identification data from a server log of a website, wherein the visitor identification data identifies a visitor to the website at a known time;
determining second information related to at least one of web data or processes corresponding to the visitor identification data and occurring at or near the known time, wherein the second information corresponds to activity of the visitor;
determining third information regarding whether the visitor purchased a product at or near the known time;
extracting predictive statistics based on the first information, the second information, and the third information to provide extracted predictive statistics;
storing the first information comprising the visitor identification data, the second information comprising the activity of the visitor, the third information regarding visitor purchase in a memory, and the extracted predictive statistics;
wherein the first, second, third information, and extracted predictive statistics are useable to evaluate the website.
9. The method of claim 8, further comprising:
providing the first, second and third information to a requesting entity associated with the website;
the requesting entity adjusting content of the website based on the first, second and third information.
10. A system, comprising:
a server log that stores information regarding visitors to a website;
at least one first server operable to:
extract visitor identification data from the server log of the website, wherein the visitor identification data identifies a visitor to the website at a known time;
determine first information related to at least one of web data or processes corresponding to the visitor identification data and occurring at or near the known time;
determine whether the visitor purchased a product at or near the known time based on the visitor identification data and the first information; and
extract predictive statistics based on information regarding the visitor and activity of the visitor to provide extracted predictive statistics;
wherein the at least one first server comprises a memory operable to store the visitor identification data, the first information, the extracted predictive statistics, and information regarding whether the visitor purchased a product at the known time;
wherein the at least one first server comprises a web server interface accessible by a client web browser to provide statistics regarding visitors to the website.
11. A computer readable memory medium storing program instructions executable by a processor to:
extract first information comprising visitor identification data from a server log of a website, wherein the visitor identification data identifies a visitor to the website at a known time;
determine second information related to at least one of web data or processes corresponding to the visitor identification data and occurring at or near the known time, wherein the second information corresponds to activity of the visitor;
determine third information regarding whether the visitor purchased a product at or near the known time;
store the first information comprising the visitor identification data, the second information comprising the activity of the visitor, and the third information regarding visitor purchase in a memory;
wherein the first, second, and third information are useable to evaluate the website; and
extract predictive statistics from at least one of the first, second, or third information to provide extracted predictive statistics, wherein the extracted predictive statistics are usable to evaluate the website.
12. The method of claim 11, wherein the predictive statistics comprise one or more of:
profitable aggregation clusters;
least profitable aggregation clusters;
mean aggregation clusters; or
dropped transaction aggregation clusters.
13. The method of claim 11, wherein the program instructions are further executable to:
overlay the statistics with one or more of transaction, survey, or user-entered demographics and preferences.
14. The method of claim 11, wherein the program instructions are further executable to:
extract unstructured text documents from unstructured Internet sources other than the website to provide visitor related documents.
15. The method of claim 11, wherein the unstructured Internet sources comprise one or more of:
newsgroups;
discussion forums; and
mailing lists.
Description
RELATED PATENT APPLICATIONS

This application claims the benefit of U.S. Provisional Application No. 60/238,094, filed Oct. 4, 2000 and entitled “Server Log File System Utilizing Text mining Methodologies and Technologies”. The present patent application and the following patent application are conversions from the foregoing provisional filing: U.S. Pat. No. 7,043,531 entitled “Web-Based Customer Lead Generator System with Pre-Emptive Profiling” and filed Oct. 4, 2001.

This patent application is related to the following pending applications: patent application Ser. No. 09/862,832 entitled “Web-Based Customer Lead Generator System” and filed May 21, 2001; patent application Ser. No. 09/865,802 entitled “Database Server System for Web-Based Business Intelligence” and filed May 24, 2001; patent application Ser. No. 09/865,804 entitled “Data Mining System for Web-Based Business Intelligence” and filed May 24, 2001; patent application Ser. No. 09/865,735 entitled “Text Mining System for Web-Based Business Intelligence” and filed May 24, 2001; patent application Ser. No. 09/862,814 entitled “Web-Based Customer Prospects Harvester System” and filed May 21, 2001; patent application Ser. No. 09/865,805 entitled “Text Indexing System for Web-Based Business Intelligence” and filed May 24, 2001.

TECHNICAL FIELD OF THE INVENTION

This invention relates to electronic commerce, and more particularly to business intelligence software tools for acquiring leads for prospective customers, using Internet data sources.

BACKGROUND OF THE INVENTION

Most small and medium sized companies face similar challenges in developing successful marketing and sales campaigns. These challenges include locating qualified prospects who are making immediate buying decisions. It is desirable to personalize marketing and sales information to match those prospects, and to deliver the marketing and sales information in a timely and compelling manner. Other challenges are to assess current customers to determine which customer profile produces the highest net revenue, then to use those profiles to maximize prospecting results. Further challenges are to monitor the sales cycle for opportunities and inefficiencies, and to relate those findings to net revenue numbers.

Today's corporations are experiencing exponential growth to the extent that the volume and variety of business information collected and accumulated is overwhelming. Further, this information is found in disparate locations and formats. Finally, even if the individual databases and information sources are successfully tapped, the output and reports may be little more than spreadsheets, pie charts and bar charts that do not directly relate the exposed business intelligence to the company's processes, expenses, and net revenues.

With the growth of the Internet, one trend in developing marketing and sales campaigns is to gather customer information by accessing Internet data sources. Internet data intelligence and data mining products face specific challenges. First, they tend to be designed for use by technicians, and are not flexible or intuitive in their operation; second, the technologies behind the various engines are changing rapidly to take advantage of advances in hardware and software; and finally, the results of their harvesting and mining are not typically related to a specific department's goals and objectives.

SUMMARY OF THE INVENTION

One aspect of the invention is a text mining system for collecting business intelligence about a client, as well as for identifying prospective customers of the client. The text mining system may be used in a lead generation system accessible by the client via the Internet.

The text mining system has various components, including a data acquisition process that extracts textual data from Internet web sites, including their logs, content, processes, and transactions. The system compares log data to content and process data, and relates the results of the comparison to transaction data. This permits the system to provide aggregate cluster data representing statistics useful for customer lead generation.
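The comparison of log data with transaction data can be illustrated with a minimal sketch. It assumes the server log uses the Common Log Format and that a visitor is identified by the requesting address; the sample lines, the transaction records, and the 15-minute window standing in for "at or near the known time" are hypothetical and are not taken from the specification.

```python
from datetime import datetime, timedelta

# Hypothetical log lines in Common Log Format (an assumed format).
LOG_LINES = [
    '203.0.113.5 - - [04/Oct/2000:10:00:00 +0000] "GET /products/widget HTTP/1.0" 200 512',
    '198.51.100.7 - - [04/Oct/2000:10:05:00 +0000] "GET /products/gadget HTTP/1.0" 200 734',
]

# Hypothetical transaction records: (visitor address, purchase time, amount).
TRANSACTIONS = [("203.0.113.5", datetime(2000, 10, 4, 10, 3), 49.95)]


def parse_log_line(line):
    """Return (visitor, timestamp, path) from one Common Log Format line."""
    visitor = line.split(" ", 1)[0]
    stamp = line[line.index("[") + 1 : line.index("]")]
    # Drop the timezone offset; the sketch treats all times as naive.
    when = datetime.strptime(stamp.split(" ")[0], "%d/%b/%Y:%H:%M:%S")
    path = line.split('"')[1].split(" ")[1]
    return visitor, when, path


def aggregate(log_lines, transactions, window=timedelta(minutes=15)):
    """Flag each visit with whether a purchase occurred within the window."""
    rows = []
    for line in log_lines:
        visitor, when, path = parse_log_line(line)
        purchased = any(
            v == visitor and abs(t - when) <= window for v, t, _ in transactions
        )
        rows.append(
            {"visitor": visitor, "time": when, "path": path, "purchased": purchased}
        )
    return rows


visits = aggregate(LOG_LINES, TRANSACTIONS)
```

Rows of this shape could then be grouped, for example by page or by purchase outcome, to produce the kind of aggregate cluster statistics the summary describes.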

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the operating environment for a web based lead generator system in accordance with the invention.

FIG. 2 illustrates the various functional elements of the lead generator system.

FIG. 3 illustrates the various data sources and a first embodiment of the prospects harvester.

FIGS. 4 and 5 illustrate a database server system, which may be used within the lead generation system of FIGS. 1 and 2.

FIGS. 6 and 7 illustrate a data mining system, which may be used within the lead generation system of FIGS. 1 and 2.

FIGS. 8 and 9 illustrate a text mining system, which may be used within the lead generation system of FIGS. 1 and 2.

FIG. 10 illustrates a text mining system, similar to that of FIG. 8, applied to web site server logs.

DETAILED DESCRIPTION OF THE INVENTION

Lead Generator System Overview

FIG. 1 illustrates the operating environment for a web-based customer lead generation system 10 in accordance with the invention. System 10 is in communication, via the Internet, with unstructured data sources 11, an administrator 12, client systems 13, reverse look-up sources 14, and client applications 15.

The users of system 10 may be any business entity that desires to conduct more effective marketing campaigns. These users may be direct marketers who wish to maximize the effectiveness of direct sales calls, or e-commerce web sites that wish to build audiences.

In general, system 10 may be described as a web-based Application Service Provider (ASP) data collection tool. The general purpose of system 10 is to analyze a client's marketing and sales cycle in order to reveal inefficiencies and opportunities, then to relate those discoveries to net revenue estimates. Part of the latter process is proactively harvesting prequalified leads from external and internal data sources. As explained below, system 10 implements an automated process of vertical industry intelligence building that involves automated reverse lookup of contact information using an email address and key phrase highlighting based on business rules and search criteria.

More specifically, system 10 performs the following tasks:

    • Uses client-provided criteria to search Internet postings for prospects who are discussing products or services that are related to the client's business offerings
    • Selects those prospects matching the client's criteria
    • Pushes the harvested prospect contact information to the client, with a link to the original document that verifies the prospect's interest
    • Automatically opens or generates personalized sales scripts and direct marketing materials that appeal to the prospects' stated or implied interests
    • Examines internal sales and marketing materials, and by applying data and text mining analytical tools, generates profiles of the client's most profitable customers
    • Cross-references and matches the customer profiles with harvested leads to facilitate more efficient harvesting and sales presentations
    • In the audience building environment, requests permission to contact the prospect to offer discounts on services or products that are directly or indirectly related to the conversation topic, or to direct the prospect to a commerce source.
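The first three tasks in the list above can be sketched as a simple phrase match over harvested postings. This is an illustration only: the posting records, field names, and key phrases are hypothetical, and plain substring matching stands in for whatever search and recognition technology system 10 actually uses.

```python
def find_prospects(postings, key_phrases):
    """Return a lead record for each posting that mentions a client key phrase."""
    leads = []
    for post in postings:
        text = post["text"].lower()
        matched = [phrase for phrase in key_phrases if phrase.lower() in text]
        if matched:
            leads.append(
                {
                    "email": post["author"],   # contact information for the prospect
                    "source": post["url"],     # link to the verifying document
                    "matched": matched,        # phrases that could be highlighted
                }
            )
    return leads


# Hypothetical postings harvested from a discussion forum.
POSTINGS = [
    {
        "author": "pat@example.com",
        "url": "http://forum.example/1",
        "text": "Looking for a laser printer for a small office.",
    },
    {
        "author": "lee@example.com",
        "url": "http://forum.example/2",
        "text": "Anyone tried the new hiking trail?",
    },
]

leads = find_prospects(POSTINGS, ["laser printer", "toner"])
```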

System 10 provides open access to its web site. A firewall (not shown) is used to prevent access to client records and the entire database server. Further details of system security are discussed below in connection with FIG. 5.

Consistent with the ASP architecture of system 10, interactions between client system 13 and system 10 will typically be by means of Internet access, such as by a web portal. Authorized client personnel will be able to create and modify profiles that will be used to search designated web sites and other selected sources for relevant prospects.

Client system 13 may be any computer station or network of computers having data communication to lead generator system 10. Each client system 13 is programmed such that each client has the following capabilities: a master user account and multiple sub user accounts; a user activity log in the system database; the ability to customize and personalize the workspace; configurable, tiered user access; online signup, configuration and modification; sales territory configuration and representation; goals and target establishment; and online reporting comparing goals to target (e.g., expense/revenue; budget/actual).

Administration system 12 performs such tasks as account activation, security administration, performance monitoring and reporting, assignment of master user id and licensing limits (user seats, access, etc.), billing limits and profile, account termination and lockout, and a help system and client communication.

System 10 interfaces with various client applications 15. For example, system 10 may interface with commercially available enterprise resource planning (ERP), sales force automation (SFA), call center, e-commerce, data warehousing, and custom and legacy applications.

Lead Generator System Architecture

FIG. 2 illustrates the various functional elements of lead generator system 10. In the embodiment of FIG. 2, the above described functions of system 10 are partitioned between two distinct processes.

A prospects harvester process 21 uses a combination of external data sources, client internal data sources and user-parameter extraction interfaces, in conjunction with a search, recognition and retrieval system, to harvest contact information from the web and return it to a staging data base 22. In general, process 21 collects business intelligence data from both inside the client's organization and outside the organization. The information collected can be either structured data as in corporate databases/spreadsheet files or unstructured data as in textual files.

Process 21 may be further programmed to validate and enhance the data, utilizing a system of lookup, reverse lookup and comparative methodologies that maximize the value of the contact information. Process 21 may be used to elicit the prospect's permission to be contacted. The prospect's name and email address are linked to and delivered with ancillary information to facilitate both a more efficient sales call and a tailored e-commerce sales process. The related information may include the prospect's email address, Web site address and other contact information. In addition, prospects are linked to timely documents on the Internet that verify and highlight the reason(s) that they are in fact a viable prospect. For example, process 21 may link the contact data, via the Internet, to a related document wherein the contact's comments and questions verify the high level value of the contact to the user of this system (the client).

A profiles generation process 25 analyzes the user's in-house files and records related to the user's existing customers to identify and group those customers into profile categories based on the customer's buying patterns and purchasing volumes. The patterns and purchasing volumes of the existing customers are overlaid on the salient contact information previously harvested to allow the aggregation of the revenue-based leads into prioritized demand generation sets. Process 25 uses an analysis engine and both data and text mining engines to mine a company's internal client records, digital voice records, accounting records, contact management information and other internal files. It creates a profile of the most profitable customers, reveals additional prospecting opportunities, and enables sales cycle improvements. Profiles include items such as purchasing criteria, buying cycles and trends, cross-selling and up-selling opportunities, and effort to expense/revenue correlations. The resulting profiles are then overlaid on the data obtained by process 21 to facilitate more accurate revenue projections and to enhance the sales and marketing process. The client may add certain value judgments (rankings) in a table that is linked to a unique lead id that can subsequently be analyzed by data mining or OLAP analytical tools. The results are stored in the deliverable database 24.
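As a rough illustration of grouping existing customers into profile categories by purchasing volume, the sketch below totals purchases per customer and assigns a category. The category names, the threshold, and the sample records are hypothetical; the specification does not state how categories are defined.

```python
from collections import defaultdict


def build_profiles(purchases, high_volume=1000.0):
    """Assign each customer a profile category from total purchase volume.

    The high_volume threshold is an assumed value used only for illustration.
    """
    totals = defaultdict(float)
    for customer, amount in purchases:
        totals[customer] += amount
    return {
        customer: "high_volume" if total >= high_volume else "standard"
        for customer, total in totals.items()
    }


# Hypothetical purchase records: (customer, amount).
PURCHASES = [("acme", 700.0), ("acme", 450.0), ("globex", 120.0)]
profiles = build_profiles(PURCHASES)
```

Profiles of this kind could then be overlaid on harvested leads, for example by ranking leads that resemble the high-volume group first.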

Profiles generation process 25 can be used to create a user (client) profiles database 26, which stores profiles of the client and its customers. As explained below, this database 26 may be accessed during various data and text mining processes to better identify prospective customers of the client.

Web server 29 provides the interface between the client systems 13 and the lead generation system 10. As explained below, it may route different types of requests to different sub processes within system 10. The various web servers described below in connection with FIGS. 4-10 may be implemented as separate servers in communication with a front end server 29. Alternatively, the server functions could be integrated or partitioned in other ways.

Data Sources

FIG. 3 provides additional detail of the data sources of FIGS. 1 and 2. Access to data sources may be provided by various text mining tools, such as by the crawler process 31 or 41 of FIGS. 3 and 4.

One data source is newsgroups, such as USENET. To access discussion documents from USENET newsgroups, the crawler process uses the NNTP protocol to talk to a USENET news server such as “news.giganews.com.” Because most news servers archive news articles only for a limited period (giganews.com archives news articles for two weeks), it is necessary for the iNet Crawler to incrementally download and archive these newsgroups periodically in a scheduled sequence. This aspect of crawler process 31 is controlled by user-specified parameters such as news server name, IP address, newsgroup name and download frequency.
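The incremental download can be pictured as remembering the highest article number already archived and copying only newer articles on each scheduled run. The sketch below is not an NNTP client: a plain dictionary stands in for the articles the news server currently retains, and the function and variable names are hypothetical.

```python
def download_new_articles(server_articles, archive, last_seen):
    """Copy articles numbered above last_seen into the local archive.

    server_articles maps article number -> text and is a stand-in for the
    server's current retention window. Returns the new high-water mark.
    """
    new_numbers = sorted(n for n in server_articles if n > last_seen)
    for number in new_numbers:
        archive[number] = server_articles[number]
    return new_numbers[-1] if new_numbers else last_seen


archive = {}
last_seen = 0

# First scheduled run: the server retains articles 1 through 3.
last_seen = download_new_articles({1: "a", 2: "b", 3: "c"}, archive, last_seen)

# Later run: article 1 has expired on the server and article 4 is new.
last_seen = download_new_articles({2: "b", 3: "c", 4: "d"}, archive, last_seen)
```

After the second run the local archive still holds article 1 even though the server no longer does, which is the reason for archiving on a schedule shorter than the server's retention period.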

Another data source is web-based discussion forums. The crawler process follows the hyperlinks on a web-based discussion forum, traverses these links to user- or design-specified depths, and subsequently accesses and retrieves discussion documents. Unless the discussion documents are archived historically on the web site, the crawler process will download and archive a copy of each of the individual documents in a file repository. If the discussion forum is membership-based, the crawler process will act on behalf of the authorized user to log on to the site automatically in order to retrieve documents. This function of the crawler process is controlled by user-specified parameters such as a discussion forum's URL, starting page, the number of traversal levels and crawling frequency.
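A depth-limited traversal of this kind is commonly written as a breadth-first search. The following sketch assumes a caller-supplied `fetch` function and uses an in-memory dictionary of pages in place of real HTTP requests; the page contents and URLs are hypothetical, and login handling is omitted.

```python
from collections import deque
from html.parser import HTMLParser


class LinkParser(HTMLParser):
    """Collect the href value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href" and value)


def crawl(fetch, start_url, max_depth):
    """Retrieve pages reachable from start_url within max_depth link hops."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    documents = {}
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page could not be retrieved
        documents[url] = html
        if depth == max_depth:
            continue  # archive the page but do not follow its links
        parser = LinkParser()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return documents


# Hypothetical forum pages keyed by URL; SITE.get plays the role of fetch.
SITE = {
    "/forum": '<a href="/t/1">one</a> <a href="/t/2">two</a>',
    "/t/1": '<a href="/t/1/reply">reply</a>',
    "/t/2": "no links",
    "/t/1/reply": "deep page",
}

docs = crawl(SITE.get, "/forum", max_depth=1)
```

With one traversal level, the starting page and the two threads it links to are retrieved, while the reply page two hops away is not.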

A third data source is Internet-based or facilitated mailing lists wherein individuals send to a centralized location emails that are then viewed and/or responded to by members of a particular group. Once a suitable list has been identified a subscription request is initiated. Once approved, these emails are sent to a mail server where they are downloaded, stored in system 10 and then processed in a fashion similar to documents harvested from other sources. The system stores in a database the filters, original URL and approval information to ensure only authorized messages are actually processed by system 10.
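One way to ensure that only authorized messages are processed is to check each incoming email against the stored approval information. The sketch below assumes that approved lists are recognized by their `List-Id` header; that choice, the list name, and the sample messages are hypothetical and are not described in the specification.

```python
from email import message_from_string

# Hypothetical approval records: list identifiers whose subscription was approved.
APPROVED_LISTS = {"widgets-l.example.org"}


def is_authorized(raw_message):
    """Return True if the message carries the List-Id of an approved list."""
    message = message_from_string(raw_message)
    list_id = message.get("List-Id", "")
    return any(name in list_id for name in APPROVED_LISTS)


approved = (
    "List-Id: <widgets-l.example.org>\n"
    "From: a@example.com\n"
    "\n"
    "Which widget fits best?"
)
unknown = "From: b@example.com\n\nUnsolicited message"

results = [is_authorized(approved), is_authorized(unknown)]
```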

A fourth data source is corporations' internal documents. These internal documents may include sales notes, customer support notes and knowledge bases. The crawler process accesses corporations' internal documents from their Intranet through the Unix/Windows file system or, alternatively, accesses internal documents residing in databases through an ODBC connection. If internal documents are password-protected, crawler process 31 acts on behalf of the authorized user to log on to the file systems or databases and subsequently retrieve documents. This function of the crawler process is controlled by user-specified parameters such as directory path and database ODBC path, starting file id and ending file id, and access frequency. Other internal sources are customer information, sales records, accounting records, and digitally recorded correspondence such as e-mail files or digital voice records.

A fifth data source is web pages from Internet web sites. This function of the crawler process is similar to the functionality associated with web-discussion-forums. Searches are controlled by user-specified parameters such as web site URL, starting page, the number of traversal levels and crawling frequency.

Database Server System

FIGS. 4 and 5 illustrate a database server system 41, which may be used within system 10 of FIGS. 1 and 2. FIG. 4 illustrates the elements of system 41 and FIG. 5 is a data flow diagram. Specifically, system 41 could be used to implement the profiles generation process 25, which collects profile data about the client.

The input data 42 can be the client's sales data, customer-contact data, customer purchase data, account data, etc. Various data sources for customer data can be contact management software packages such as ACT, MarketForce, Goldmine, and Remedy. Various data sources for accounting data are Great Plains, Solomon and other accounting packages typically found in small and medium-sized businesses. If the client has ERP (enterprise resource planning) systems (such as JD Edwards, PeopleSoft and SAP) installed, the customer and accounting data will be extracted from the ERP customer and accounting modules. This data is typically structured and stored in flat files or relational databases.

System 41 is typically an OLAP (On-line analytic processing) type server-based system. It has five major components. A data acquisition component 41a collects and extracts data from different data sources, applying appropriate transformation, aggregation and cleansing to the data collected. This component consists of predefined data conversions to accomplish most commonly used data transformations, for as many different types of data sources as possible. For data sources not covered by these predefined conversions, custom conversions need to be developed. The tools for data acquisition may be commercially available tools, such as Data Junction, ETI*EXTRACT, or equivalents. Open standards and APIs will permit employing the tool that affords the most efficient data acquisition and migration based on the organizational architecture.

Data mart 41b captures and stores an enterprise's sales information. The sales data collected from data acquisition component 41a are “sliced and diced” into multidimensional tables by time dimension, region dimension, product dimension and customer dimension, etc. The general design of the data mart follows data warehouse/data mart Star-Schema methodology. The total number of dimension tables and fact tables will vary from customer to customer, but data mart 41b is designed to accommodate the data collected from the majority of commonly used software packages such as PeopleSoft or Great Plains.
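A star schema places one fact table at the center and joins it to dimension tables for slicing. The sketch below uses an in-memory SQLite database purely for illustration; the table names, columns, and figures are hypothetical, and only two of the dimensions mentioned above (region and time) are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables (hypothetical layout).
    CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    -- Fact table keyed by the dimensions.
    CREATE TABLE fact_sales (region_id INTEGER, time_id INTEGER, revenue REAL);

    INSERT INTO dim_region VALUES (1, 'East'), (2, 'West');
    INSERT INTO dim_time VALUES (1, 2000, 3), (2, 2000, 4);
    INSERT INTO fact_sales VALUES (1, 1, 100.0), (1, 2, 150.0), (2, 2, 80.0);
    """
)

# "Slice and dice": total revenue by region and quarter.
rows = conn.execute(
    """
    SELECT r.name, t.quarter, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_region AS r ON r.region_id = f.region_id
    JOIN dim_time AS t ON t.time_id = f.time_id
    GROUP BY r.name, t.quarter
    ORDER BY r.name, t.quarter
    """
).fetchall()
```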

Various commercially available software packages, such as Cognos, Brio, and Informatica, may be used to design and deploy data mart 41b. The data mart can reside in DB2, Oracle, Sybase, MS SQL server, P.SQL or a similar database application. Data mart 41b stores sales and accounting fact and dimension tables that will accommodate the data extracted from the majority of industry accounting and customer contact software packages.

A Predefined Query Repository Component 41c is the central storage for predefined queries. These predefined queries are parameterized macros/business rules that extract information from fact tables or dimension tables in the data mart 41b. The results of these queries are delivered as business charts (such as bar charts or pie charts) in a web browser environment to the end users. Charts in the same category are bound to the same predefined query using different parameters (e.g., quarterly revenue charts are all associated with the same predefined quarterly revenue query; the parameters passed are the specific region, the specific year and the specific quarter). These queries are stored in either flat file format or as a text field in a relational database.
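A parameterized predefined query can be sketched as a named SQL string with placeholders that are filled in per chart request. The example below uses SQLite named parameters; the repository dictionary, the single `sales` table, and the figures are hypothetical simplifications of the data mart.

```python
import sqlite3

# Hypothetical query repository: query name -> parameterized SQL text.
QUERY_REPOSITORY = {
    "quarterly_revenue": (
        "SELECT SUM(revenue) FROM sales "
        "WHERE region = :region AND year = :year AND quarter = :quarter"
    ),
}


def run_predefined(conn, name, **params):
    """Run the named predefined query with the supplied parameters."""
    return conn.execute(QUERY_REPOSITORY[name], params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, year INTEGER, quarter INTEGER, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("East", 2000, 3, 100.0), ("East", 2000, 4, 150.0), ("West", 2000, 4, 80.0)],
)

# The same query serves different charts through different parameters.
east_q4 = run_predefined(conn, "quarterly_revenue", region="East", year=2000, quarter=4)
west_q4 = run_predefined(conn, "quarterly_revenue", region="West", year=2000, quarter=4)
```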

A Business Intelligence Charts Repository Component 41d serves two purposes in the database server system 41. A first purpose is to improve the performance of the chart retrieval process. The chart repository 41d captures and stores the most frequently visited charts in a central location. When an end user requests a chart, system 41 first queries the chart repository 41d to see if there is an existing chart. If there is a preexisting chart, server 41e pulls that chart directly from the repository. If there is no preexisting chart, server 41e runs the corresponding predefined query from the query repository 41c in order to extract data from data mart 41b and subsequently feed the data to the requested chart. A second purpose is to allow chart sharing, collaboration and distribution among the end users. Because charts are treated as objects in the chart repository, users can bookmark a chart just like bookmarking a regular URL in a web browser. They can also send and receive charts as an email attachment. In addition, users may log on to system 41 to collaboratively make decisions from different physical locations. These users can also place comments on an existing chart for collaboration.
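The repository-first lookup described above behaves like a cache in front of the query repository. The sketch below is a simplification: it stores every generated chart rather than only the most frequently visited ones, represents a chart as a plain dictionary, and uses a hypothetical `run_query` callable in place of the real query path.

```python
class ChartServer:
    """Serve charts from a repository, running a query only on a miss."""

    def __init__(self, run_query):
        self.run_query = run_query      # stand-in for the predefined query path
        self.chart_repository = {}      # stand-in for chart repository 41d
        self.queries_run = 0

    def get_chart(self, query_name, **params):
        # Sort the parameters so that argument order does not affect the key.
        key = (query_name, tuple(sorted(params.items())))
        if key not in self.chart_repository:
            self.queries_run += 1
            data = self.run_query(query_name, **params)
            self.chart_repository[key] = {
                "query": query_name,
                "params": params,
                "data": data,
            }
        return self.chart_repository[key]


def fake_query(name, **params):
    """Hypothetical query runner returning fixed data."""
    return [("Q4", 150.0)]


server = ChartServer(fake_query)
first = server.get_chart("quarterly_revenue", region="East", quarter=4)
second = server.get_chart("quarterly_revenue", quarter=4, region="East")
```

Because each stored chart has a stable key, the same key could also serve as the bookmarkable identifier mentioned for chart sharing.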

Another component of system 41 is the Web Server component 41e, which has a number of subcomponents. A web server subcomponent (such as Microsoft IIS or Apache server or any other commercially available web server) serves HTTP requests. A database server subcomponent (such as Tango, Cold Fusion or PHP) provides database drill-down functionality. An application server subcomponent routes different information requests to the appropriate servers. For example, sales revenue chart requests will be routed to the database system 41; customer profile requests will be routed to a Data Mining server, and competition information requests will be routed to a Text Mining server. The latter two systems are discussed below. Another subcomponent of server 41e is the chart server, which receives requests from the application server. It either runs queries against data mart 41b, using query repository 41c, or retrieves charts from chart repository 41d.
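The routing performed by the application server subcomponent amounts to a lookup from request type to destination server. The sketch below shows only that lookup; the request-type strings and server labels are hypothetical names chosen to mirror the three examples in the text.

```python
# Hypothetical routing table mirroring the examples in the text.
ROUTES = {
    "sales_revenue_chart": "database_server",
    "customer_profile": "data_mining_server",
    "competition_information": "text_mining_server",
}


def route(request_type):
    """Return the name of the server that should handle the request type."""
    try:
        return ROUTES[request_type]
    except KeyError:
        raise ValueError(f"no server registered for {request_type!r}") from None


destinations = [route(kind) for kind in ("sales_revenue_chart", "customer_profile")]
```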

As output 43, database server system 41 delivers business intelligence about an organization's sales performance as charts over the Internet or corporate Intranet. Users can pick and choose charts by regions, by quarters, by products, by companies and even by different chart styles. Users can drill down on these charts to reveal the underlying data sources, and to get detailed information charts or detailed raw data. All charts are drill-down enabled, allowing users to navigate and explore information either vertically or horizontally. Pie charts, bar charts, map views and data views are delivered via the Internet or Intranet.

As an example of operation of system 41, gross revenue analysis of worldwide sales may be contained in predefined queries that are stored in the query repository 41c. Gross revenue queries accept region and/or time period as parameters and extract data from the Data Mart 41b and send them to the web server 41e. Web server 41e transforms the raw data into charts and publishes them on the web.
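A predefined query that accepts region and/or time period as optional parameters might look like the sketch below. It uses Python's standard `sqlite3` module purely for illustration; the `sales` table, its columns, and the function name `gross_revenue` are assumptions, not details from the description.

```python
import sqlite3

# Hypothetical predefined query: either parameter may be omitted (NULL) to mean "all".
GROSS_REVENUE_QUERY = """
    SELECT region, SUM(amount) AS gross_revenue
    FROM sales
    WHERE (:region IS NULL OR region = :region)
      AND (:period IS NULL OR period = :period)
    GROUP BY region
    ORDER BY region
"""


def gross_revenue(conn: sqlite3.Connection, region=None, period=None):
    """Run the predefined query and return (region, gross_revenue) rows."""
    cursor = conn.execute(GROSS_REVENUE_QUERY, {"region": region, "period": period})
    return cursor.fetchall()
```

The rows returned here correspond to the raw data that the web server would then transform into a chart.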

Data Mining System

FIGS. 6 and 7 illustrate a data mining system 61, which may be used within system 10 of FIGS. 1 and 2.

FIG. 6 illustrates the elements of system 61 and FIG. 7 is a data flow diagram. Specifically, system 61 could be used to implement the profiles process 25, which collects profile data about the client.

Data sources 62 for system 61 are the Data Mart 41b, e.g., data from the tables that reside in Data Mart 41b, as well as data collected from marketing campaigns or sales promotions.

For data coming from the Data Mart 41b, data acquisition process 61a between Data Mining base 61b and Data Mart 41b extracts/transfers and formats/transforms data from tables in the Data Mart 41b into Data Mining base 61b. For data collected from sales and marketing events, data acquisition process 61a may be used to extract and transform this kind of data and store it in the Data Mining base 61b.

Data Mining base 61b is the central data store for data mining system 61. The data it stores is specifically prepared and formatted for data mining purposes. The Data Mining base 61b is a separate data repository from the Data Mart 41b, even though some of the data it stores is extracted from the Data Mart's tables. The Data Mining base 61b can reside in DB2, Oracle, Sybase, MS SQL Server, P.SQL or a similar database application.

Chart repository 61d contains data mining outputs. The most frequently used decision tree charts are stored in the chart repository 61d for rapid retrieval.

Customer purchasing behavior analysis is accomplished by using predefined Data Mining models that are stored in a model repository 61e. Unlike the predefined queries of system 41, these predefined models are industry-specific and business-specific models that address a particular business problem. Third party data mining tools such as IBM Intelligent Miner and Clementine, and various integrated development environments (IDEs) may be used to explore and develop these data mining models until the results are satisfactory. Then the models are exported from the IDE into standalone modules (in C or C++) and integrated into model repository 61e by using data mining APIs.

Data mining server 61c supplies data for the models, using data from Data Mining base 61b. FIG. 7 illustrates the data paths and functions associated with server 61c. Various tools and applications that may be used to implement server 61c include VDI, EspressChart, and a data mining GUI.

The outputs of server 61c may include various options, such as decision trees, RuleSets, and charts.

By default, all the outputs have drill-down capability to allow users to interactively navigate and explore information in either a vertical or horizontal direction. Views may also be varied, such as by influencing factor. For example, in bar charts, bars may represent factors that influence customer purchasing (decision-making) or purchasing behavior. The height of the bars may represent the impact on the actual customer purchase amount, so that the higher the bar is, the more important the influencing factor is to customers' purchasing behavior.

Decision trees offer a unique way to deliver business intelligence on customers' purchasing behavior. A decision tree consists of tree nodes, paths and node notations. Each individual node in a decision tree represents an influencing factor. A path is the route from the root node (uppermost level) to any other node in the tree. Each path represents a unique purchasing behavior that leads to a particular group of customers with an average purchase amount. This provides a quick and easy way for on-line users to identify where the valued customers are and what the most important factors are when customers are making purchase decisions. This also facilitates tailored marketing campaigns and delivery of sales presentations that focus on the product features or functions that matter most to a particular customer group.

RuleSets are plain-English descriptions of the decision tree. A single rule in the RuleSet is associated with a particular path in the decision tree. Rules that lead to the same destination node are grouped into a RuleSet. RuleSet views allow users to look at the same information presented in a decision tree from a different angle.

When users drill down deep enough on any chart, they will reach the last drill-down level, which is the data view. A data view is a table view of the underlying data that supports the data mining results. Data Views are dynamically linked with Data Mining base 61b and Data Mart 41b through web server 61f.
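The relationship between decision-tree paths and plain-English rules can be illustrated with a short sketch. It is a simplification under stated assumptions: the `Node` class and `path_rules` function are hypothetical, each node carries the condition on the branch leading to it, and one rule is emitted per root-to-leaf path ending in a customer group with an average purchase amount.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    condition: str                      # influencing factor on the branch into this node ("" for the root)
    avg_purchase: float = 0.0           # average purchase amount; meaningful on leaf nodes only
    children: List["Node"] = field(default_factory=list)


def path_rules(node: Node, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Walk every root-to-leaf path and describe it as a plain-English rule."""
    if not node.children:
        premise = " AND ".join(conditions) or "always"
        return [f"IF {premise} THEN average purchase = {node.avg_purchase:.2f}"]
    found: List[str] = []
    for child in node.children:
        found.extend(path_rules(child, conditions + (child.condition,)))
    return found
```

For a tree that splits first on region and then on repeat-buyer status, each resulting rule reads as a conjunction of the factors along one path, which is the "different angle" on the same information that a RuleSet view provides.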

Web server 61f, which may be the same as database server 41e, provides Internet access to the output of mining server 61c. Existing outputs may be directly accessed from storage in charts repository 61d. Alternatively, requests may be directed to models repository 61e. Consistent with the application service architecture of lead generation system 10, access by the client to web server 61f is via the Internet and the client's web browser.

Text Mining System

FIGS. 8 and 9 illustrate a text mining system 81, which may be used within system 10 of FIGS. 1 and 2. FIG. 8 illustrates the elements of system 81 and FIG. 9 is a data flow diagram. As indicated in FIG. 8, the source data 82 for system 81 may be either external or internal data sources. Thus, system 81 may be used to implement both the prospects system and profiles system of FIG. 2.

The source data 82 for text mining system 81 falls into two main categories, which can be mined to provide business intelligence. Internal documents contain business information about sales, marketing, and human resources. External sources consist primarily of the public domain in the Internet. Newsgroups, discussion forums, mailing lists and general web sites provide information on technology trends, competitive information, and customer concerns.

More specifically, the source data 82 for text mining system 81 is from five major sources. Web Sites: on-line discussion groups, forums and general web sites. Internet News Group: Internet newsgroups for special interests such as alt.ecommerce and microsoft.software.interdev. For some of the active newsgroups, hundreds of news articles may be harvested on a weekly basis. Internet Mailing Lists: mailing lists for special interests, such as an e-commerce mailing list, a company product support mailing list or an Internet marketing mailing list. For some of the active mailing lists, hundreds of news articles will be harvested on a weekly basis. Corporate textual files: internal documents such as emails, customer support notes, sales notes, and digital voice records.

For data acquisition 81a from web sites, user-interactive web crawlers are used to collect textual information. Users can specify the URLs, the depth and the frequency of web crawling. The information gathered by the web crawlers is stored in a central repository, the text archive 81b. For data acquisition from newsgroups, a news collector contacts the news server to download news articles, transform them into an HTML format, and deposit them in text archive 81b. Users can specify the newsgroup names, the frequency of downloads and the display format of the news articles to the news collector. For data acquisition from Internet mailing lists, a mailing list collector automatically receives, sorts and formats email messages from the subscribed mailing lists and deposits them into text archive 81b. Users can specify the mailing list names and addresses and the display format of the mail messages. For data acquisition from client text files, internal documents are sorted, collected and stored in the Text Archive 81b. The files stored in Text Archive 81b can be either physical copies or dynamic pointers to the original files.
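The user-specified crawl depth can be illustrated with a breadth-first traversal that stops following links once the depth limit is reached. This sketch is an assumption about how such a crawler could work, not the crawler used by system 81; `crawl` and the injected `fetch_links` callable (which would download a page and return its outgoing links) are hypothetical names, and no network access is performed here.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl(
    start_urls: Iterable[str],
    max_depth: int,
    fetch_links: Callable[[str], Iterable[str]],
) -> List[str]:
    """Visit pages breadth-first, following links at most max_depth levels from the start URLs."""
    starts = list(dict.fromkeys(start_urls))       # drop duplicate start URLs, keep order
    seen = set(starts)
    queue = deque((url, 0) for url in starts)
    visited: List[str] = []
    while queue:
        url, depth = queue.popleft()
        visited.append(url)                         # a real crawler would archive the page text here
        if depth >= max_depth:
            continue                                # depth limit reached: do not follow links
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return visited
```

Injecting `fetch_links` keeps the traversal logic separate from page downloading, so the depth handling can be exercised against an in-memory link graph.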

The Text Archive 81b is the central data store for all the textual information for mining. The textual information it stores is specially formatted and indexed for text mining purposes. The Text Archive 81b supports a wide variety of file formats, such as plain text, HTML, MS Word and Acrobat.

Text Mining Server 81c operates on the Text Archive 81b. Tools and applications used by server 81c may include ThemeScape and a Text Mining GUI 81c. A repository 81d stores text mining outputs. Web server 81e is the front end interface to the client system 13, permitting the client to access database 81b, using an on-line search executed by server 81c or server 81e.

The outputs of system 81 may include various options. Map views and simple query views may be delivered over the Internet or Intranet. By default, all the outputs have drill-down capability to allow users to reach the original documents. HTML links will be retained to permit further lateral or horizontal navigation. Keywords will be highlighted or otherwise pointed to in order to facilitate rapid location of the relevant areas of text when a document is located through a keyword search. For example, Map Views are the outputs produced by ThemeScape. Textual information is presented on a topological map on which similar “themes” are grouped together to form “mountains.” On-line users can search or drill down on the map to get the original files. Simple query views are similar to the interfaces of most Internet search engines (such as Yahoo, Excite and HotBot). They allow on-line users to query the Text Archive 81b for keywords or key phrases, or to search different groups of textual information collected over time.
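A simple query view with keyword highlighting can be sketched as below. The in-memory dictionary standing in for the Text Archive, the function names, and the `**` highlight markers are all assumptions made for illustration; an actual archive would be indexed rather than scanned.

```python
import re
from typing import Dict


def highlight(text: str, keyword: str, before: str = "**", after: str = "**") -> str:
    """Wrap every case-insensitive occurrence of keyword in highlight markers."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return pattern.sub(lambda match: f"{before}{match.group(0)}{after}", text)


def search(archive: Dict[str, str], keyword: str) -> Dict[str, str]:
    """Return matching documents, keyed by name, with the keyword highlighted."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    needle = keyword.lower()
    return {
        name: highlight(body, keyword)
        for name, body in archive.items()
        if needle in body.lower()
    }
```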

A typical user session using text-mining system 81 might proceed through the following steps. It is assumed that the user is connected to server 81e via the Internet and a web browser, as illustrated in FIG. 1. In the example of this description, server 81e is in communication with server 81c, which is implemented using ThemeScape software.

    • 1. Compile a list of data sources (Newsgroups, Discussion Groups, etc.).
    • 2. Start ThemeScape Publisher or comparable application.
    • 3. Select “File”.
    • 4. Select “Map Manager” or comparable function.
    • 5. Verify that server and email blocks are correctly set. If not, insert proper information.
    • 6. Enter password.
    • 7. Press “Connect” button.
    • 8. Select “New”.
    • 9. Enter a name for the new map.
    • 10. If duplicating another map's settings, use drop down box to select the map name.
    • 11. Select “Next”.
    • 12. Select “Add Source”.
    • 13. Enter a Source Description.
    • 14. Source Type remains “World Wide Web (WWW)”.
    • 15. Enter the URL to the site to be mined.
    • 16. Add additional URLs, if desired.
    • 17. Set “Harvest Depth.” Parameters range from 1 level to 20 levels.
    • 18. Set “Filters” if appropriate. These include Extensions, Inclusions, Exclusions, Document Length and Rations.
    • 19. Set Advanced Settings, if appropriate. These include Parsing Settings, Harvest Paths, Domains, and Security and their sub-settings.
    • 20. Repeat steps 14 through 20 for each additional URL to be mined.
    • 21. Select “Advanced Settings” if desired. These include Summarization Settings, Stopwords, and Punctuation.
    • 22. Select “Finish” once ready to harvest the sites.
    • 23. The software downloads and mines (collectively known as harvesting) the documents and creates a topographical map.
    • 24. Once the map has been created, it can be opened and searched.

Text Mining Applied to Web Site Server Logs

The text mining concepts discussed above in connection with text mining system 81 can be applied to web site server logs.

FIG. 10 illustrates a text mining system 101 applied to web site server logs. Text mining system 101 is programmed to aggregate unstructured factual and contextual log entries for comparison to related content pages and processes occurring at the moment indicated by the log entry. This aggregated intelligence is then related to consummated and incomplete purchase transactions. Various predictive statistics are then extracted. These statistics include the most profitable aggregation clusters, the least profitable, the mean aggregation clusters, and dropped transaction aggregation clusters. The various aggregation clusters may be overlaid on transaction, survey, and user-entered demographics and preferences.
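One way to picture the aggregation described above is to match each log entry to any transaction by the same visitor at or near the logged time, and then accumulate statistics per cluster. The sketch below is an assumption-laden illustration, not the method of system 101: it treats the requested content page as the cluster key, uses a fixed time window, and all class, field and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class LogEntry:
    visitor: str
    timestamp: int      # seconds since epoch
    page: str           # content page requested at that moment


@dataclass(frozen=True)
class Transaction:
    visitor: str
    timestamp: int
    amount: float
    completed: bool     # False marks a dropped (incomplete) transaction


def cluster_stats(
    entries: Iterable[LogEntry],
    transactions: List[Transaction],
    window: int = 300,
) -> Dict[str, Dict[str, float]]:
    """Aggregate visits, completed revenue and dropped transactions per page cluster."""
    stats = defaultdict(lambda: {"visits": 0, "revenue": 0.0, "dropped": 0})
    for entry in entries:
        bucket = stats[entry.page]
        bucket["visits"] += 1
        for tx in transactions:
            same_visitor = tx.visitor == entry.visitor
            near_in_time = abs(tx.timestamp - entry.timestamp) <= window
            if same_visitor and near_in_time:
                if tx.completed:
                    bucket["revenue"] += tx.amount
                else:
                    bucket["dropped"] += 1
    return dict(stats)


def most_profitable(stats: Dict[str, Dict[str, float]]) -> str:
    """Return the cluster key with the highest completed revenue."""
    return max(stats, key=lambda page: stats[page]["revenue"])
```

The least profitable and dropped-transaction clusters could be read from the same table by taking the minimum revenue or the maximum dropped count.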

OTHER EMBODIMENTS

Although the present invention has been described in detail, it should be understood that various changes, substitutions, and alterations can be made hereto without departing from the spirit and scope of the invention as defined by the appended claims.

US2002003829916 Jan 200128 Mar 2002Uri ZernikInterface for presenting information
US2002004613823 Apr 200118 Apr 2002Brian FitzpatrickMethod and system for electronically selecting, modifying, and operating a motivation or recognition program
US2002004962226 Apr 200125 Apr 2002Lettich Anthony R.Vertical systems and methods for providing shipping and logistics services, operations and products to an industry
US2002007298212 Dec 200013 Jun 2002Shazam Entertainment Ltd.Method and system for interacting with a user in an experiential environment
US200200730585 Jan 200113 Jun 2002Oren KremerMethod and apparatus for providing web site preview information
US2002008306727 Sep 200127 Jun 2002Pablo TamayoEnterprise web mining system and method
US2002008738729 Dec 20004 Jul 2002James CalverLead generator method and system
US200201077018 Jun 20018 Aug 2002Batty Robert L.Systems and methods for metering content on the internet
US2002011636231 Oct 200122 Aug 2002Hui LiReal time business process analysis method and apparatus
US2002011648416 Feb 200122 Aug 2002Gemini Networks, Inc.System, method, and computer program product for supporting multiple service providers with a trouble ticket capability
US2002012395714 Aug 20015 Sep 2002Burt NotariusMethod and apparatus for marketing and communicating in the wine/spirits industry
US200201438705 Jan 20013 Oct 2002Overthehedge.Net, Inc.Method and system for providing interactive content over a network
US2002016168517 Aug 200131 Oct 2002Michael DwinnellBroadcasting information and providing data access over the internet to investors and managers on demand
US2002017816626 Mar 200128 Nov 2002Direct411.ComKnowledge by go business model
US2003000943019 Nov 19989 Jan 2003Chad BurkeySystem, method and article of manufacture for advanced information gathering for targetted activities
US200300288963 Aug 20016 Feb 2003Swart William D.Video and digital multimedia aggregator remote content crawler
US2003004084510 May 200227 Feb 2003Spool Peter R.Business management system and method for a deregulated electric power market using customer circles aggregation
US2003006580523 May 20023 Apr 2003Barnes Melvin L.System, method, and computer program product for providing location based services and mobile e-commerce
US2003008392228 Aug 20021 May 2003Wendy ReedSystems and methods for managing critical interactions between an organization and customers
US2003012050223 Apr 200226 Jun 2003Robb Terence AlanApplication infrastructure platform (AIP)
US2003013997512 Dec 200224 Jul 2003Perkowski Thomas J.Method of and system for managing and serving consumer-product related information on the world wide web (WWW) using universal product numbers (UPNS) and electronic data interchange (EDI) processes
US200302257369 Dec 20024 Dec 2003Reuven BakalashEnterprise-wide resource planning (ERP) system with integrated data aggregation engine
US2004000288728 Jun 20021 Jan 2004Fliess Kevin V.Presenting skills distribution data for a business enterprise
US2005002161128 Jun 200427 Jan 2005Knapp John R.Apparatus for distributing content objects to a personalized access point of a user over a network-based environment and method
US200500442801 Oct 200424 Feb 2005Teleshuttle Technologies, LlcSoftware and method that enables selection of one of a plurality of online service providers
US2005013794622 Dec 200323 Jun 2005Schaub Thomas M.Use of separate rib ledgers in a computerized enterprise resource planning system
US2006001313428 Jun 200519 Jan 2006Neuse Douglas MSystem and method for performing capacity planning for enterprise applications
US2006001542415 Jul 200419 Jan 2006Augusta Systems, Inc.Management method, system and product for enterprise environmental programs
EP1118952A211 Jan 200125 Jul 2001Hewlett-Packard CompanySystem, method and computer program product for providing a remote support service
EP1162558A16 Jun 200012 Dec 2001Carlo CamilliThe internet global flea market newspaper
EP1555626A214 Jan 200520 Jul 2005Microsoft CorporationImage-based indexing and retrieval of text documents
Non-Patent Citations
Reference
1. 80-20 Software, "End Email and File Chaos," 80-20 Retriever Enterprise Edition, 4 pages, 2003.
2. Adomavicius et al., "Using Data Mining Methods to Build Customer Profiles", IEEE 2001 computer, pp. 74-82, 2001.
3. Adomavicius, et al., "Using Data Mining Methods to Build Customer Profiles," IEEE 2001, Computer, pp. 74-82.
4. An Insuma GmbH White Paper, "OASIS Distributed Search Engine," pp. 1-11, no date.
5. An InsumaGmbH White Paper, "OASIS Distributed Search Engine," pp. 1-11, no date.
6. Andreas Geyer-Schultz et al., "A Customer Purchase Incidence Model Applied to Recommender Services" WEBKDD 2001 Mining Log data across all customer touch points, third international workshop, p. 1-11, Aug. 26, 2001.
7. Beantree, "Enterprise Business Application Architecture" Enterprise Business Components Whitepaper, 5 pages, Sep. 1999.
8. Beantree, "Enterprise Business Application Architecture," Enterprise Business Components, Whitepaper, Sep. 1999, 5 pages.
9. Delen et al., "An Integrated Toolkit for Enterprise Modeling and Analysis", Proceedings of the 1999 winter Simulation Conference, pp. 289-297, 1999.
10. Delen, et al., "An Integrated Toolkit for Enterprise Modeling and Analysis," Proceedings of the 1999 Winter Simulation Conference, pp. 289-297, 1999.
11. Elprin, Nick et al., "An Analysis of Database-Driven Mail Servers," LISA XVII, pp. 15-22, Oct. 26-31, 2003.
12. Elprin, Nick et al., An Analysis of Database-Driven Mail Servers, LISA XVII, pp. 15-22, 2003.
13. Gravano, Luis et al., "GIOSS: text-source discovery over the Internet", ACM Transactions on Database Systems (TODS), vol. 24, Issue 2, Jun. 1999, pp. 229-264.
14. Gravano, Luis et al., "GIOSS: text-source discovery over the Internet," ACM Transactions on Database Systems (TODS), vol. 24, Issue 2, Jun. 1999, pp. 229-264.
15. Griffin et al., "Enterprise Customer Relationship Management", DM review, 15 pages, Dec. 1999.
16. Griffin, et al., "Enterprise Customer Relationship Management," DM review, 15 pages, Dec. 1999.
17. Grobelnik, et al., "Text mining as integration of several related research areas: report on KDD's workshop on text mining 2000," Dec. 2000, ACM-SIGKDD Explorations, vol. 2, issue 2, pp. 99-102.
18. Grobelnik, et al., "Text mining as integration of several related research areas: report on KDD's workshop on text mining 2000," Dec. 2000, ACM—SIGKDD Explorations, vol. 2, issue 2, pp. 99-102.
19. *Joshen Dorre et al., Text Mining: finding Nuggets in Mountains of textual Data, ACM, 1999, 398-401.
20. Joshen Dorre, et al., "Text Mining: finding Nuggets in Mountains of textual Data," ACM, 1999, 398-401.
21. Journyx and IBM team to deliver enterprise project and time tracking software, article, 3 pages, Apr. 5, 1999.
22. Journyx and IBM team to deliver enterprise project and time tracking software, article, Apr. 5, 1999, 3 pages.
23. Key Building Blocks for Knowledge Management Solutions, "IBM Intelligent Miner for Text" 2 pages, 1999.
24. Key Building Blocks for Knowledge Management Solutions, "IBM Intelligent Miner for Text," 2 pages, 1999.
25. Lee et al., "An enterprise intelligence system integrating WWW intranet resource" IEEE Xplore Release 1.8, pp. 28-35 with abstract, 1999.
26. Lee, et al., "An enterprise intelligence system integrating WWW intranet resource," IEEE Xplore Release 1.8, pp. 28-35 with abstract, 1999.
27. Letter of Express Abandonment in U.S. Appl. No. 09/865,804 dated Feb. 9, 2006, 1 page.
28. M. Kitayama, R. Matsubara and Y. Izui; "Application of data mining to customer profile analysis in the power electric industry," IEEE Power Engineering Society Winter Meeting, vol. 1, Jan. 2002, pp. 632-634.
29. Mathur, Srita, "Creating Unique Customer Experiences: The New Business Model of Cross-Enterprise Integration" IEEE Xplore Release 1.8, pp. 76-81 with abstract, 2000.
30. Mathur, Srita, "Creating Unique Customer Experiences: The New Business Model of Cross-Enterprise Integration," IEEE Xplore Release 1.8, pp. 76-81 with abstract, 2000.
31. Mouri, T. et al., "Extracting new topic contents from hidden web sites", International Conference on Information Technology: Coding and Computing 2004, pp. 314-319.
32. Murtagh, Fionn, "Distributed Information Search and Retrieval for Astronomical Resource Discovery and Data Mining", Library and Information Services in Astronomy III, ASP Conference Series, vol. 153, 1998, pp. 51-60.
33. Murtagh, Fionn, "Distributed Information Search and Retrieval for Astronomical Resource Discovery and Data Mining," Library and Information Services in Astronomy III, ASP Conference Series, vol. 153, 1998, pp. 51-60.
34. Official Action in U.S. Appl. No. 09/862,832 issued Jan. 21, 2005, 10 pages.
35. Official Action in U.S. Appl. No. 09/865,804 issued Aug. 1, 2005, 18 pages.
36. Official Action in U.S. Appl. No. 09/865,804 issued Aug. 4, 2004, 13 pages.
37. Official Action in U.S. Appl. No. 09/865,804 issued Feb. 23, 2004, 10 pages.
38. Official Action in U.S. Appl. No. 09/865,804 issued Nov. 30, 2004, 16 pages.
39. Official Action in U.S. Appl. No. 11/415,017 issued Feb. 1, 2007, 14 pages.
40. Optio Software, Inc., News: Optio Software and Syntax.net Reseller Partnership Offers a Robust Solution to Provider and Deliver Customized Documents to Support E-Business and Extend the Reach of the Global Enterprise, 2 pages, Dec. 20, 1999.
41. P. Markellou, I Mousourouli, S. Sirmakessis and A. Tsakalidis, "Personalized E-commerce Recommendations," IEEE Conference on e-Business Engineering, Oct. 2005, pp. 245-252.
42. Parkhomenko et al., "Personalization Using Hybrid Data Mining Approaches in E-Business Applications", Amer. assoc. for Artificial Intelligence, 7 pages, 2002.
43. Parkhomenko, et al., "Personalization Using Hybrid Data Mining Approaches in E-business Applications," American Association for Artificial Intelligence, 5 pages, 2002.
44. Paul Dean, "Browsable OLAP Apps on SQL Server Analysis Services," Intelligent Enterprise Magazine, product review, 5 pages, May 7, 2001.
45. Paul Dean, "Browsable OLAP Apps on SQL Server Analysis Services," Intelligent Enterprise Magazine, product review, May 7, 2001, 5 pages.
46. Pending U.S. Appl. No. 09/862,814 entitled "Web-Based Customer Prospects Harvester System" filed by Seibel, et al., filed May 21, 2001.
47. Pending U.S. Appl. No. 09/862,832 entitled "Web-Based Customer Lead Generator System" filed by Seibel, et al., filed May 21, 2001.
48. Pending U.S. Appl. No. 09/865,735 entitled "Text Mining System for Web-Based Business Intelligence" filed by Seibel, et al., filed May 24, 2001.
49. Pending U.S. Appl. No. 09/865,802 entitled "Database Server System for Web-Based Business Intelligence" filed by Seibel, et al., filed May 24, 2001.
50. Pending U.S. Appl. No. 09/865,804 entitled "Data Mining System for Web-Based Business Intelligence" filed by Seibel, et al., filed May 24, 2001.
51. Pending U.S. Appl. No. 09/865,805 entitled "Text Indexing System for Web-Based Business Intelligence" filed by Seibel, et al., filed May 24, 2001.
52. Pervasive Solution Sheet "Harvesting Unstructured Data", 5 pages, 2003.
53. Preliminary Amendment in U.S. Appl. No. 12/325,909 dated Sep. 25, 2009, 6 pages.
54. Preliminary Amendment in U.S. Appl. No. 12/651,451 dated Jan. 1, 2010, 4 pages.
55. *Raymond Kosala et al., Web Mining Research: A Survey, 2000, ACM-SIGKDD Explorations, vol. 2, 1-15.
56. *Raymond Kosala et al., Web Mining Research: A Survey, 2000, ACM—SIGKDD Explorations, vol. 2, 1-15.
57. Raymond Kosala, et al., "Web Mining Research: A Survey," 2000, ACM-SIGKDD Explorations, vol. 2, 1-15.
58. Raymond Kosala, et al., "Web Mining Research: A Survey," 2000, ACM—SIGKDD Explorations, vol. 2, 1-15.
59. S. Fong and S. Chan; "Mining online users' access records for web business intelligence", IEEE International Conference on Data Mining, Dec. 2002, pp. 759-762.
60. Schwartz, Michael F. et al., "Applying an information gathering architecture to Netfind: a white pages tool for a changing and growing internet", IEEE/ACM Transactions on Networking (TON), vol. 2, Issue 5, Oct. 1994, pp. 426-439.
61. Schwartz, Michael F. et al., "Applying an Information Gathering Architecture to Netfind: A White Pages Tool for a Changing and Growing Internet," IEEE/ACM Transactions on Networking (TON), vol. 2, Issue 5, Oct. 1994, pp. 426-439.
62. Supplemental Preliminary Amendment in U.S. Appl. No. 12/325,909 dated May 26, 2010, 13 pages.
63. Supplemental Preliminary Amendment in U.S. Appl. No. 12/325,909 dated Nov. 30, 2009, 15 pages.
64. Supplemental Preliminary Amendment in U.S. Appl. No. 12/651,451 dated May 25, 2010, 14 pages.
65. T. Puschmann and R. Alt, "Enterprise Application Integration-the Case of the Robert Bosch Group," 34th Annual Hawaii International Conference on System Science; Jan. 2001, pp. 1-10.
66. T. Puschmann and R. Alt, "Enterprise Application Integration—the Case of the Robert Bosch Group," 34th Annual Hawaii International Conference on System Science; Jan. 2001, pp. 1-10.
67. *Text mining as integration of several related research areas: report on KDD's workshop on text mining, 2000, ACM-SIGKDD Explorations, vol. 2, 1-99102.
68. *Text mining as integration of several related research areas: report on KDD's workshop on text mining, 2000, ACM—SIGKDD Explorations, vol. 2, 1-99102.
69. *U.S. Appl. No. 60/200,338.
70. Warlick, David, "Searching the Internet: Part III", Raw Materials for the Mind: Teaching & Learning in Information & Technology Rich Schools, ISBN 0-9667432-0-2, Mar. 18, 1999.
71. Warlick, David, "Searching the Internet: Part III," Raw Materials for the Mind: Teaching & Learning in Information & Technology Rich Schools, ISBN 0-9667432-0-2, Mar. 18, 1999, 4 pages.
72. Watson, Ian, "A Case Based Reasoning Application for Engineering Sales Support Using Introspective Reasoning," 2000 American Association for Artificial Intelligence, 6 pages, 2000.
73. Weiss, Gary M., "Data Mining in Telecommunications", 13 pages, no date.
74. Weiss, Gary M., "Data Mining in Telecommunications", Department of Computer and Information Science, Fordham University, 13 pages, no date.
75. Wood, David, "Metadata Searches of Unstructured Textual Content," Tucana Plugged in Software white Paper, 4 pages, Sep. 26, 2002.
76. Wood, David, "Metadata Searches of Unstructured Textual Content," Tucana: Plugged in Software White Paper, 4 pages, Sep. 26, 2002.
Referenced by
Citing Patent | Filing date | Publication date | Applicant | Title
US8204901 * | | 19 Jun 2012 | International Business Machines Corporation | Generating query predicates for processing multidimensional data
US9230229 | 18 Oct 2013 | 5 Jan 2016 | Sap Se | Predicting levels of influence
US9305285 * | 1 Nov 2013 | 5 Apr 2016 | Datasphere Technologies, Inc. | Heads-up display for improving on-line efficiency with a browser
US20110055149 * | 2 Sep 2009 | 3 Mar 2011 | International Business Machines Corporation | Generating query predicates for olap processing
US20140289394 * | 13 Dec 2012 | 25 Sep 2014 | Peking University Founder Group Co., Ltd | Method of and system for collecting network data
US20150112755 * | 18 Oct 2013 | 23 Apr 2015 | Sap Ag | Automated Identification and Evaluation of Business Opportunity Prospects
Classifications
U.S. Classification: 707/769, 707/805, 707/776
International Classification: G06F7/00, G06F17/30
Cooperative Classification: Y10S707/99937, Y10S707/99936, Y10S707/99934, G06F17/30616, G06F17/3089
European Classification: G06F17/30T1E, G06F17/30W7
Legal Events
Date | Code | Event | Description
15 Dec 2008 | AS | Assignment
Owner name: DAFINEAIS PROTOCOL DATA B.V., LLC, DELAWARE
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:REACHFORCE, INC.;REEL/FRAME:021976/0417
Effective date: 20081001
3 Jun 2009 | AS | Assignment
Owner name: INETPROFIT, INC., TEXAS
Free format text: CONFIRMATORY ASSIGNMENT;ASSIGNOR:SEIBEL, JOHN C.;REEL/FRAME:022774/0264
Effective date: 20081112
Owner name: INETPROFIT, INC., TEXAS
Free format text: CONFIRMATORY ASSIGNMENT;ASSIGNOR:FOSTER, ROBERT L.;REEL/FRAME:022774/0272
Effective date: 20081112
Owner name: INETPROFIT, INC., TEXAS
Free format text: CONFIRMATORY ASSIGNMENT;ASSIGNOR:FENG, YU;REEL/FRAME:022774/0278
Effective date: 20081103
Owner name: INETPROFIT, INC., TEXAS
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SEIBEL, JOHN C.;FENG, YU;FOSTER, ROBERT L.;REEL/FRAME:022771/0826
Effective date: 20011025
Owner name: REACHFORCE, INC., TEXAS
Free format text: CHANGE OF NAME;ASSIGNOR:INETPROFIT, INC.;REEL/FRAME:022774/0285
Effective date: 20050825
3 Jul 2012 | CC | Certificate of correction
11 Feb 2015 | AS | Assignment
Owner name: SQUARE 1 BANK, NORTH CAROLINA
Free format text: SECURITY INTEREST;ASSIGNOR:REACHFORCE, INC.;REEL/FRAME:034935/0726
Effective date: 20150206
28 Jul 2015 | FPAY | Fee payment
Year of fee payment: 8
11 Jan 2016 | AS | Assignment
Owner name: CALLAHAN CELLULAR L.L.C., DELAWARE
Free format text: MERGER;ASSIGNOR:DAFINEAIS PROTOCOL DATA B.V., LLC;REEL/FRAME:037477/0744
Effective date: 20150826
20 Jul 2016 | AS | Assignment
Owner name: REACHFORCE, INC., TEXAS
Free format text: RELEASE BY SECURED PARTY;ASSIGNOR:PACIFIC WESTERN BANK (AS SUCCESSOR IN INTEREST BY MERGER TO SQUARE 1 BANK);REEL/FRAME:039198/0189
Effective date: 20160713