US20080155056A1 - Technique for Maintaining and Managing Dynamic Web Pages Stored in a System Cache and Referenced Objects Cached in Other Data Stores


Info

Publication number
US20080155056A1
Authority
US
United States
Prior art keywords
web page
cached
cache
deleting
objects
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/967,525
Inventor
Melvin Richard Zimowski
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US11/967,525
Publication of US20080155056A1
Abandoned (current legal status)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/568 Storing data temporarily at an intermediate stage, e.g. caching
    • H04L67/5682 Policies or rules for updating, deleting or replacing the stored data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services

Definitions

  • This invention relates in general to computer-implemented systems, and, in particular, to maintaining and managing dynamic web pages and objects referenced by the web pages.
  • the Internet computer network is a collection of computer networks that exchange information via the Transmission Control Protocol/Internet Protocol (“TCP/IP”) protocol suite.
  • the World Wide Web (i.e., the “WWW” or the “Web”) is a hypertext information and communication system used on the Internet computer network with data communications operating according to a client/server model.
  • a user of a Web browser at a Web client computer will request data stored in data sources from a Web server computer, at which Web server software resides.
  • the Web server software interacts with other computer programs that use interfaces to connect to these data sources, for example, a database managed by a Database Management System (“DBMS”), or uses the interfaces directly to access these data sources.
  • These computer programs residing at the Web server computer transmit the requested data to the client computer in worldwide web documents referred to as web pages.
  • the data can be of many different types of information, including database data, images, video clips, or audio tracks.
  • Web pages can be static web pages (i.e. web pages with fixed content that are pre-generated long before the Web client request is issued) or dynamic web pages (i.e., web pages whose content is dynamically generated at the time the web client request is processed).
  • Dynamic web pages are typically expensive to generate because they contain data that must be obtained dynamically at web servers from either local or remote data sources. For this reason, web server caches are often used to store dynamic Web pages that are requested by multiple users. These caches are of finite size and have limits on the number of web pages they can contain. Further, the need for retaining individual pages within the cache varies over time. Thus, without proper maintenance, the storage allocated to the cache may become completely used up or the content of the cache may become outdated.
  • a web page may and often does contain hypertext links to objects stored in data stores (e.g., a local file system) that can be accessed by the web server that serves the web page.
  • hypertext links permit the objects that they reference to be made available to a user at a web browser as an integral part of the web page that contains them.
  • the web page appears incomplete if it is served to a user and the objects referenced through these hypertext links cannot be materialized as required by the web page or user.
  • the present invention discloses a method, apparatus, and article of manufacture for managing data stored in a data storage device connected to a computer.
  • a web page is to be cached.
  • the web page references other objects.
  • the referenced objects are stored in one or more data stores.
  • the web page is cached.
  • the cached web page and the referenced objects are managed in a coordinated fashion to ensure the display of a complete web page.
  • FIG. 1 schematically illustrates the hardware environment of a preferred embodiment of the present invention, and more particularly, illustrates a typical distributed computer system using the Internet;
  • FIG. 2 is a flow chart illustrating the steps performed in accordance with an embodiment of the present invention.
  • One embodiment of the present invention provides a management system for maintaining and managing dynamic web pages and objects referenced in the web pages.
  • the preferred embodiment of the invention provides techniques for managing the contents of a dynamic web page system cache and related data stores containing objects that the web pages reference while also ensuring the completeness of cached web pages that are subsequently reused for display at browsers.
  • the preferred embodiment of the invention uses a DBMS (e.g., DB2® from International Business Machines Corporation) to cache dynamic web pages and to track dependencies that the dynamic web pages have on large objects cached in the local UNIX file system (in particular, the Hierarchical File System available under OS/390® UNIX System Services).
  • the preferred embodiment of the invention provides tools for managing the contents of the dynamic web page cache and the large object cache, while also ensuring that the management is performed in such a way as to guarantee the completeness of any cached web pages that are displayed at a browser.
  • FIG. 1 schematically illustrates the hardware environment of a preferred embodiment of the present invention, and more particularly, illustrates a typical distributed computer system using the Internet 100 to connect Web client computers 102 executing Web browsers to a Web server computer 104 executing Web server software and other computer programs that connect the server system 104 to data sources 106 .
  • a typical combination of resources may include client computers 102 that are personal computers or workstations, and a web server computer 104 that is a personal computer, workstation, minicomputer, or mainframe. These systems are coupled to one another by various networks, including LANs, WANs, SNA networks, and the Internet.
  • a Web client computer 102 typically executes a Web browser and is coupled to a Web server computer 104 executing Web server software.
  • the Web browser is typically a program such as Microsoft's Internet Explorer® or Netscape Navigator®.
  • the Web server software is typically a program such as IBM's HTTP Server or other WWW server software.
  • the software executing on the Web server uses a data source interface and, possibly, other computer programs, for connecting to the data sources 106 .
  • the software executing on the Web server may also include a cache management system 110 .
  • the client computer 102 is bi-directionally coupled with the Web server computer 104 over a line or via a wireless system.
  • the Web server computer 104 is bi-directionally coupled with data sources 106 .
  • the data source interface permits the software executing on the Web Server to be connected to a Database Management System (DBMS), which supports access to a data source 106 by executing DBMS software.
  • the DBMS may be located on the same server as the Web server computer 104 or may be located on a separate machine.
  • the data sources 106 may be geographically distributed.
  • the software executing on the Web server translates the request received from a Web browser into one or more statements (e.g., a macro file or a COBOL program) that can be processed to retrieve data from data sources 106 .
  • Web pages can be HTML web pages or XML documents.
  • a cached page is static, as the content of a cached page reflects the state of data stores and business logic at the time the web page was created. Subsequent changes to the data stores and business logic do not affect the content of the cached page.
  • Caching directives are used to specify, among other things, the web pages to be cached. A number of factors affect whether web pages should be cached. More specifically, a page should be cached when the page is repeatedly requested by users and when the content of the page changes infrequently. A page should not be cached when the processing associated with the generation of the web page makes changes to data sources. If a cached web page is used to respond to a user's request, the processing logic associated with the generation of the cached page is not executed and no changes are made to the data sources.
  • the caching directives are processed for each cache management system user request.
  • the caching directives are processed once per web server address space at the time the first management system request is assigned to a worker thread associated with that address space.
  • Stored procedures are provided by the cache management system 110 for managing the contents of the cache.
  • authorized Web server administrators configure the cache management system 110 to cache web pages by adding caching directives to the management system initialization file (db2www.ini).
  • the DTW_CACHE_PAGE directive is used to specify web pages that are to be cached by the cache management system 110 . If the management system initialization file does not contain a DTW_CACHE_PAGE directive, then no web pages are cached.
  • the following is the syntax for the DTW_CACHE_PAGE directive:
  • file_name_spec refers to the specification of one or all blocks within a macro file using the fully qualified name of the macro file.
  • a macro file is an installation provided application that the cache management system 110 executes to generate one or more web pages.
  • DTW_CACHE_PAGE /u/USRND01/macros/custqord.d2w/* specifies the caching of all web pages created by the execution of any block in macro custqord.d2w in the directory /u/USRND01/macros.
  • path_template_spec refers to the specification of blocks within macro files using a path template for one or more directories containing macro files.
  • a path template contains the suffix /*.
  • the management system caches all web pages created by the execution of blocks in macro files contained within the directory or directories that match the path template.
  • the term lifetime refers to the minimum number of seconds that a cached web page is valid.
  • usage_scope specifies the degree to which the reuse of the web page is restricted. Reuse is granted or denied based on the authority of the userid associated with the request. Usage_scope can have a value of PUBLIC or PRIVATE.
  • PUBLIC means that the cached web page should be served (i.e., returned to a user) when the user request matches the cache key (discussed in further detail below), the cached page is valid, and the userid associated with the web server thread or process processing the request is authorized to execute the macro that generated the page.
  • PRIVATE means that the cached web page should be served when the user request matches the cache key (discussed in further detail below), the cached page is valid, and the userid associated with the web server thread or process processing the request is the same as the userid that was associated with the web server thread or process that cached the web page.
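The PUBLIC and PRIVATE reuse rules above can be sketched as a single decision function. This is a minimal illustration, not the management system's actual logic: the `CachedPage` record, the `may_serve` function, and the `is_authorized_for_macro` callback are hypothetical names, the numeric scope values follow the USAGE_SCOPE column values given in the source (1 for PUBLIC, 2 for PRIVATE), and "valid" is assumed to mean that the expiration time has not yet been reached.

```python
from dataclasses import dataclass
from typing import Callable

PUBLIC = 1   # USAGE_SCOPE value for PUBLIC, per the source
PRIVATE = 2  # USAGE_SCOPE value for PRIVATE, per the source


@dataclass
class CachedPage:
    """Hypothetical in-memory view of one cached page."""
    key: str                # cache key of the request that created the page
    creator: str            # userid that cached the page
    usage_scope: int        # PUBLIC or PRIVATE
    expiration_time: float  # seconds since some epoch (assumed representation)


def may_serve(page: CachedPage, request_key: str, request_userid: str,
              now: float,
              is_authorized_for_macro: Callable[[str], bool]) -> bool:
    """Return True if the cached page may be returned for this request."""
    if request_key != page.key:
        return False                      # request does not match the cache key
    if now >= page.expiration_time:
        return False                      # assumed meaning of "page is valid"
    if page.usage_scope == PUBLIC:
        # PUBLIC: any userid authorized to execute the generating macro
        return is_authorized_for_macro(request_userid)
    # PRIVATE: only the userid that cached the page
    return request_userid == page.creator
```

In this sketch the authorization check is passed in as a callback because the source does not say how macro execution authority is determined.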
  • the caching directive can be specified multiple times. Namely, a different caching directive (i.e., DTW_CACHE_PAGE directive) can be specified for each file_name_spec or path_template_spec value. When the caching directives conflict with each other, the first directive specified takes precedence.
  • a cached page is reused for a request when the URL, the form data, and the query string of the request match the URL, form data, and query string of the request that caused the web page to be cached. Examples of caching directives are shown below:
  • the cache management system caches the Web pages generated when it executes the output block in the macro main.d2w, located in the /u/USER1/macros directory.
  • the Web pages have PUBLIC scope, and remain valid for at least 1 hour.
  • the cache management system caches any Web pages the cache management system generates when it executes any block in the macro main.d2w, located in the /u/USER1/macros directory.
  • the Web pages have PUBLIC scope, and remain valid for at least 30 minutes.
  • the cache management system caches any Web pages the cache management system generates when it executes any block in any macro located in the /u/USER1/macros directory or any of its subdirectories.
  • the Web pages have PRIVATE scope, and remain valid for at least 1 hour.
  • the cache management system caches all Web pages that the cache management system generates.
  • the Web pages have PUBLIC scope, and remain valid for at least 1 hour.
  • the cache management system caches the following: (1) All Web pages generated from any block in any macro located in the /u/USER1/macros/main/ directory. The Web pages have PUBLIC scope and remain valid for at least 30 minutes. (2) All Web pages generated by the daily_news.d2w macro in the directory /u/USER1/macros/special/. These Web pages have PUBLIC scope and remain valid for at least 12 hours. (3) All Web pages generated by the employee_stats.d2w macro in the directory /u/USER1/macros/special/. These Web pages have PRIVATE scope and remain valid for at least 1 hour.
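The rule that the first directive specified takes precedence can be sketched as a first-match lookup. This is an illustration under stated assumptions: the `Directive` tuple, `spec_matches`, and `select_directive` are hypothetical names, directives are modeled as already-parsed values rather than initialization-file text, and a spec ending in /* is assumed to match any request path under that prefix.

```python
from typing import List, NamedTuple, Optional


class Directive(NamedTuple):
    """Hypothetical parsed form of one DTW_CACHE_PAGE directive."""
    spec: str         # file_name_spec or path_template_spec
    lifetime: int     # minimum number of seconds the cached page is valid
    usage_scope: str  # "PUBLIC" or "PRIVATE"


def spec_matches(spec: str, request_path: str) -> bool:
    """Assumed matching rule: a trailing /* matches everything below the prefix."""
    if spec.endswith("/*"):
        return request_path.startswith(spec[:-1])  # keep the trailing "/"
    return request_path == spec


def select_directive(directives: List[Directive],
                     request_path: str) -> Optional[Directive]:
    """Return the first directive, in the order specified, that matches."""
    for directive in directives:
        if spec_matches(directive.spec, request_path):
            return directive
    return None  # no matching directive: the page is not cached


directives = [
    Directive("/u/USER1/macros/special/*", 43200, "PUBLIC"),
    Directive("/u/USER1/macros/*", 3600, "PRIVATE"),
]
chosen = select_directive(directives,
                          "/u/USER1/macros/special/daily_news.d2w/report")
```

Both directives match the request path in this usage example, and the first one listed is the one selected.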
  • the table used to cache the web pages must be set up. The following steps are necessary to set up the table used to cache the web pages:
  • web pages can be cached:
  • the actual cache key for the cached dynamic web page consists of path information, macro name, HTML or XML block name, plus the query string plus the form data (if present) that caused the dynamic web page to be generated.
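A minimal sketch of how such a key might be assembled follows. The source lists the components of the key but not how they are concatenated, so the separators used here are assumptions, as is the `build_keys` helper itself. The sketch also derives the 250-character indexed prefix and an integer identifier that the source stores alongside the key; the source does not say how the identifier is computed, so CRC-32 is used purely as a stand-in.

```python
import zlib

INDEXED_KEY_CHARS = 250  # INDEXED_KEY is CHAR(250) in the source


def build_keys(path_info: str, macro_name: str, block_name: str,
               query_string: str = "", form_data: str = "") -> dict:
    """Build the actual key, its indexed prefix, and a derived identifier."""
    # Assumed layout: path/macro/block, then optional query string and form data.
    actual_key = "/".join([path_info.rstrip("/"), macro_name, block_name])
    if query_string:
        actual_key += "?" + query_string
    if form_data:
        actual_key += "&" + form_data

    # Stand-in for the unspecified ID derivation: CRC-32 folded into the
    # signed 32-bit range so it fits an INTEGER column.
    crc = zlib.crc32(actual_key.encode("utf-8"))
    signed_id = crc - 2**32 if crc >= 2**31 else crc

    return {
        "actual_key": actual_key,
        "indexed_key": actual_key[:INDEXED_KEY_CHARS],
        "id": signed_id,
    }
```

Two requests with the same components produce the same key, so a later request can be matched against the key stored with the cached page.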
  • a web page may be defined as a set of HTML tags or an XML (Extensible Mark-up Language) document and the objects (e.g., DB2 large objects or LOBs) referenced by the web page using hypertext links.
  • The contents of a cached web page reflect the state of data stores and business logic at the time the cached web page was created. Thus, a cached web page is static.
  • a cached web page remains in the cache until it is automatically deleted upon expiration of its lifetime (i.e., lifetime value plus creation time), when automatic management is enabled, or until it is deleted by an authorized web server administrator.
  • When serving a cached web page, the management system maintains the consistency that existed at the time the cached web page was created.
  • the removal of a referenced object from its data store e.g. the local UNIX file system
  • the removal of a dynamic web page from the management system cache invalidates referenced objects stored in the referenced object data store.
  • a web page is not returned to a user at a browser until all LOBs referenced in the web page are successfully placed in the referenced object data store. Additionally, the web page is not cached until all LOBs referenced in the web page are successfully placed in the referenced object data store.
  • any LOBs referenced by that web page are removed from the referenced object data store.
  • any web page that references the LOB is removed from the dynamic Web page cache and any other LOBs referenced by that web page are removed from the referenced object data store.
  • the cached web page is always removed before any dependent LOBs are removed from the referenced object data store.
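The coordination rules above (store LOBs before caching the page, remove the page before its LOBs, and remove referencing pages when a LOB is removed) can be sketched with in-memory dictionaries. The `CoordinatedCache` class and its method names are hypothetical; the source uses a DB2 table and a file system directory rather than dictionaries. The sketch assumes that no LOB is shared between two cached pages.

```python
class CoordinatedCache:
    """Illustrative page cache plus referenced-object store."""

    def __init__(self):
        self.pages = {}  # page key -> page content
        self.lobs = {}   # LOB filename -> LOB bytes
        self.deps = {}   # page key -> set of LOB filenames the page references
        self.log = []    # order of removals, for inspection

    def cache_page(self, key, content, lobs):
        # All referenced LOBs are stored first; only then is the page cached.
        for name, data in lobs.items():
            self.lobs[name] = data
        self.pages[key] = content
        self.deps[key] = set(lobs)

    def delete_page(self, key):
        if key not in self.pages:
            return
        # The page is always removed before its dependent LOBs.
        del self.pages[key]
        self.log.append(("page", key))
        for name in sorted(self.deps.pop(key, set())):
            if name in self.lobs:
                del self.lobs[name]
                self.log.append(("lob", name))

    def delete_lob(self, name):
        # Every page referencing the LOB goes first, along with its other LOBs.
        referencing = [k for k, names in self.deps.items() if name in names]
        for key in referencing:
            self.delete_page(key)
        if name in self.lobs:  # LOB not referenced by any cached page
            del self.lobs[name]
            self.log.append(("lob", name))
```

Removing a single LOB in this sketch therefore never leaves behind a cached page that would display incompletely.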
  • ASCII encoded web pages are cached in a data store for reuse by the management system.
  • referenced objects are large objects (LOBs).
  • Caching directives specify the web pages to be cached and the minimum number of seconds the cached page is valid.
  • the key of a cached page consists of path information, macro name, HTML or XML block name, plus query string plus form data (if present). Cached web pages are classified by usage scope: PUBLIC or PRIVATE.
  • the DB2 dynamic web page cache table is: SYSIBM.DTWCACHEDPAGES.
  • the index key for a cached page in the dynamic web page cache table is: INDEXED_KEY CHAR(250), which is the first 250 characters of path information, macro name, HTML or XML block name, query string, and form data.
  • An identifier is specified by: ID INTEGER, which is an identifier derived from path information, macro name, HTML or XML block name, query string, and form data.
  • an index to the dynamic web page cache table is provided that consists of both the column for the INDEXED_KEY and the column for the ID.
  • ACTUAL_KEY VARCHAR(4000) which is comprised of the path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page.
  • CREATOR CHAR(8) which is the userid associated with a request that created the cached page.
  • the creation timestamp is: CREATION_TIME TIMESTAMP, which specifies the date and time of the creation of the cached page. This date and time is the same as the CREATION_TIME for any LOBs that the web page references.
  • the expiration timestamp is: EXPIRATION_TIME TIMESTAMP, which specifies the date and time of expiration of the cached page (value of CREATION_TIME+lifetime value from DTW_CACHE_PAGE directive).
  • a second index for the dynamic web page cache table is provided that consists of the column for EXPIRATION_TIME.
  • the indexed column (i.e., EXPIRATION_TIME) of the second index is used to efficiently identify the cached web pages that have expired.
  • the size is: SIZE INTEGER, which specifies the size of the cached page in bytes.
  • the usage scope is: USAGE_SCOPE SMALLINT; a value of 1 means that the page has a PUBLIC usage scope and a value of 2 means that the page has a PRIVATE usage scope.
  • the ordinal position of the segment is: ORDINAL_POSITION SMALLINT, which specifies the ordinal position of the web page segment within the complete cached page.
  • the dynamic web page segment is: PAGE_SEGMENT VARCHAR(28100) FOR BIT DATA, which is the ASCII encoded web page segment.
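Because PAGE_SEGMENT holds at most 28100 bytes, a larger page has to be stored as several rows ordered by ORDINAL_POSITION. A minimal sketch of splitting and reassembling a page follows. The helper names are hypothetical, and numbering segments from 1 is an assumption, since the source does not state the starting ordinal.

```python
SEGMENT_BYTES = 28100  # PAGE_SEGMENT is VARCHAR(28100) FOR BIT DATA in the source


def split_page(page_text: str) -> list:
    """Split an ASCII page into (ordinal_position, segment_bytes) pairs."""
    data = page_text.encode("ascii")
    return [
        (offset // SEGMENT_BYTES + 1, data[offset:offset + SEGMENT_BYTES])
        for offset in range(0, len(data), SEGMENT_BYTES)
    ]


def join_page(segments: list) -> str:
    """Reassemble a page from its segments, in ordinal order."""
    ordered = sorted(segments, key=lambda pair: pair[0])
    return b"".join(segment for _, segment in ordered).decode("ascii")


page = "x" * 60000
segments = split_page(page)  # three segments: 28100, 28100, and 3800 bytes
```

Sorting by ordinal position before joining means the rows can be read back in any order.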
  • a dynamic web page/LOB dependency table must be created and several stored procedures must be installed before the cache management system 110 can manage cached web pages and LOBs.
  • the steps necessary to create the web page dependency table are outlined below:
  • the dynamic Web page/LOB dependency table contains information about the LOB files stored in HFS and about the relationship that these files may have, if any, to Web pages stored in the dynamic Web page cache.
  • the DB2 web page/LOB dependency table is: SYSIBM.DTWCACHEDEPS.
  • the indexed key for a cached page in the DB2 web page/LOB dependency table is: INDEXED_KEY CHAR(250), which specifies the first 250 characters of path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page.
  • the identifier is: ID INTEGER, which specifies an identifier derived from path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page.
  • An index to the DB2 web page/LOB dependency table is provided that consists of both the column for the INDEXED_KEY and the column for the ID.
  • the actual key of the cached page is: ACTUAL_KEY VARCHAR(4000), which comprises the path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page.
  • the fully qualified HFS filename for the LOB is: FILENAME VARCHAR(1024).
  • the creation timestamp is: CREATION_TIME TIMESTAMP, which specifies the date and time of the creation of the LOB. This date and time is the same as the CREATION_TIME for the web page that references the LOB.
  • the expiration timestamp is: EXPIRATION_TIME TIMESTAMP, which specifies the date and time for expiration of the LOB (value of CREATION_TIME+DTW_LOB_LIFETIME configuration value, when dynamic web page caching is not in use; value of CREATION_TIME+max(DTW_LOB_LIFETIME configuration value, lifetime value from DTW_CACHE_PAGE directive) when dynamic web page caching is in use).
  • a second index for the DB2 web page/LOB dependency table is provided that consists of the column for EXPIRATION_TIME.
  • the indexed column (i.e., EXPIRATION_TIME) of the second index is used to efficiently identify objects (referenced by a web page) that have expired.
  • the size is: SIZE INTEGER, which specifies the size of the LOB in bytes.
  • SYSIBM.DTWCACHEDPAGES is the dynamic Web page cache table. This table contains the cached Web pages and information about the cached Web pages.
  • EXPIRATION_TIME (TIMESTAMP). Expiration timestamp: date and time for expiration of the cached page (value of CREATION_TIME + lifetime value from DTW_CACHE_PAGE directive).
  • SIZE (INTEGER). Size: size of the cached page in bytes.
  • USAGE_SCOPE (SMALLINT). Usage scope: a value of 1 means that the page has a PUBLIC usage scope and a value of 2 means that the page has a PRIVATE usage scope.
  • ORDINAL_POSITION (SMALLINT). Ordinal position of segment: the ordinal position of the Web page segment within the complete cached page.
  • PAGE_SEGMENT (VARCHAR(28100) FOR BIT DATA). Dynamic Web page segment: the ASCII encoded Web page segment.
  • SYSIBM.DTWCACHEDEPS is the Web page dependency table. This table contains information about the LOBs referenced by Web pages.
  • EXPIRATION_TIME (TIMESTAMP). Expiration timestamp: date and time of the expiration of the LOB (value of CREATION_TIME + DTW_LOB_LIFETIME configuration value when dynamic web page caching is not in use; value of CREATION_TIME + max(DTW_LOB_LIFETIME configuration value, lifetime value from DTW_CACHE_PAGE directive) when dynamic web page caching is in use).
  • SIZE (INTEGER). Size: size of the LOB in bytes.
  • LOBs referenced in web pages are stored as files in a single HFS directory.
  • the name of the directory is specified by the HTML_PATH configuration statement of the management system initialization file.
  • the settings of two configuration variables control the manner in which LOBs and cached web pages are automatically managed.
  • the configuration variable DTW_LOB_LIFETIME specifies the minimum number of seconds LOBs are available. The default is 0 seconds. When dynamic web page caching is in use, the minimum number of seconds a LOB is available is the larger of the value of DTW_LOB_LIFETIME and the lifetime value specified in the caching directive for the dynamic web page that references the LOB.
  • the configuration variable DTW_CACHE_MANAGEMENT_INTERVAL specifies a minimum number of seconds between successive automatic cache management attempts. The default is 0 seconds. A value of 0 means that the cache management system 110 will not perform automatic cache management.
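The effect of these two configuration variables can be sketched as follows. The function names are hypothetical and `datetime` values stand in for DB2 timestamps; only the arithmetic (the larger of the two lifetimes when caching is in use, and a value of 0 disabling automatic management) comes from the source.

```python
from datetime import datetime, timedelta
from typing import Optional


def lob_expiration(creation_time: datetime, dtw_lob_lifetime: int = 0,
                   page_lifetime: Optional[int] = None) -> datetime:
    """Expiration time of a LOB.

    page_lifetime is None when dynamic web page caching is not in use;
    otherwise it is the lifetime from the caching directive of the page
    that references the LOB.
    """
    if page_lifetime is None:
        seconds = dtw_lob_lifetime
    else:
        seconds = max(dtw_lob_lifetime, page_lifetime)
    return creation_time + timedelta(seconds=seconds)


def management_due(last_run: datetime, now: datetime,
                   interval: int = 0) -> bool:
    """True if an automatic cache management attempt may start now."""
    if interval == 0:
        return False  # 0 means automatic cache management is never performed
    return (now - last_run).total_seconds() >= interval


created = datetime(1999, 3, 23, 14, 0, 0)
# Page lifetime of 1 hour outlasts a 10-minute DTW_LOB_LIFETIME.
expires = lob_expiration(created, dtw_lob_lifetime=600, page_lifetime=3600)
```

Taking the maximum keeps a LOB available for at least as long as the cached page that references it remains valid.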
  • Cached web pages and LOBs are automatically managed by the management system using DB2 stored procedures. Additionally, a management system-provided macro permits Web server administrators to manage LOBs and cached web pages in more sophisticated ways using a stored procedure.
  • the management system provides a macro that allows a system administrator to execute a management system provided stored procedure to perform cache management. Use of this macro is referred to as more advanced cache management.
  • the management system-provided macro for the more advanced management of the caches permits the user to perform the following types of management.
  • First, a cleanup operation may be requested that deletes all cached dynamic web pages and LOBs with expiration timestamps that precede the current timestamp.
  • Second, web pages may be deleted from the dynamic web page cache based on system administrator provided input, such as macro file and block name template values, timestamp values or a combination of name and timestamp values (e.g., delete web pages where key of cached page is like X and creation timestamp is less than some provided timestamp value Y). All cached LOBs that are referenced by the deleted web pages are also deleted.
  • LOBs from the LOB cache may be deleted based on a timestamp value (e.g., delete LOBs where creation timestamp is greater than some provided timestamp value Y). All cached dynamic web pages that contain hypertext references to the LOBs are first deleted.
  • the macro for more advanced management of dynamic web pages and LOBs allows a system administrator to specify the following:
  • the macro for more advanced management of LOBs also allows a user to specify the following:
  • the macro for more advanced management is named manage_cache.d2w.
  • c. Optionally type a string in the Enter the ACTUAL_KEY filter field that matches any part of the ACTUAL_KEY for the Web pages to be deleted. This string acts as a filter for selecting the cached Web pages that Net.Data deletes.
  • the string can contain up to 250 characters. For example, when the string /netdata/macros/my_macro.d2w/report is entered, Net.Data deletes all cached Web pages that have an ACTUAL_KEY value containing this string.
  • d. Optionally click on the Starting CREATION_TIME check box and enter a timestamp value.
  • Net.Data deletes all cached Web pages that have creation times greater than or equal to this timestamp value, and that have creation times less than or equal to the Ending CREATION_TIME, if specified. If no Ending CREATION_TIME value is specified, then Net.Data deletes all cached Web pages that have creation times greater than or equal to the Starting CREATION_TIME value. For example, when the following timestamp is entered:
  • Net.Data deletes all cached Web pages that were created on or after 2:00 PM on Mar. 23, 1999 up to and including the value of Ending CREATION_TIME, if specified.
  • Net.Data deletes all cached Web pages that have creation times less than or equal to this timestamp value, and that have creation times greater than or equal to the Starting CREATION_TIME, if specified. If no Starting CREATION_TIME value is specified, then Net.Data deletes all cached Web pages that have creation times less than or equal to the Ending CREATION_TIME value. For example, when the following timestamp is entered:
  • Net.Data deletes all cached Web pages that were created on or before 11:59:59 PM on Mar. 23, 1999, starting with the value of Starting CREATION_TIME, if specified.
  • Net.Data deletes all LOBs that were created on or after 2:00 PM on Mar. 23, 1999, up to and including the value of Ending CREATION_TIME, if specified.
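The selection criteria described above (an optional ACTUAL_KEY substring filter and optional, inclusive Starting and Ending CREATION_TIME bounds) can be sketched as a filter over cached-page records. The `select_for_deletion` helper and the dictionary layout are hypothetical. The source does not state what happens when no criterion is supplied; this sketch then selects every page.

```python
from datetime import datetime
from typing import List, Optional


def select_for_deletion(pages: List[dict],
                        key_filter: Optional[str] = None,
                        start: Optional[datetime] = None,
                        end: Optional[datetime] = None) -> List[dict]:
    """Return the cached pages that satisfy every supplied criterion."""
    selected = []
    for page in pages:
        # The filter matches any part of the ACTUAL_KEY.
        if key_filter is not None and key_filter not in page["actual_key"]:
            continue
        # Both time bounds are inclusive.
        if start is not None and page["creation_time"] < start:
            continue
        if end is not None and page["creation_time"] > end:
            continue
        selected.append(page)
    return selected


pages = [
    {"actual_key": "/netdata/macros/my_macro.d2w/report?x=1",
     "creation_time": datetime(1999, 3, 23, 13, 0, 0)},
    {"actual_key": "/netdata/macros/my_macro.d2w/report?x=2",
     "creation_time": datetime(1999, 3, 23, 14, 0, 0)},
    {"actual_key": "/netdata/macros/other.d2w/report",
     "creation_time": datetime(1999, 3, 23, 15, 0, 0)},
    {"actual_key": "/netdata/macros/my_macro.d2w/report?x=3",
     "creation_time": datetime(1999, 3, 24, 0, 0, 0)},
]
to_delete = select_for_deletion(
    pages,
    key_filter="/netdata/macros/my_macro.d2w/report",
    start=datetime(1999, 3, 23, 14, 0, 0),
    end=datetime(1999, 3, 23, 23, 59, 59),
)
```

In a full system each selected page would then be deleted together with the LOBs it references, page first.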
  • the userid(s) associated with the threads or processes that execute management system requests must have the EXECUTE privilege for the stored procedure used to add dynamic web pages to the cache and to add dependency information to the dynamic web page/LOB dependency table. Additionally, the userid(s) associated with the threads or processes that execute management system requests must have the authority to add files to the HTML_PATH directory.
  • the userid(s) associated with the threads or processes that execute the management system requests must also have the EXECUTE privilege for the stored procedure used to automatically manage the dynamic web page and LOB caches.
  • the userid(s) used for more advanced management of LOBs and cached web pages must have the EXECUTE privilege for the stored procedure used to manually manage the dynamic web page and LOB caches.
  • An administrative userid is typically used to create the HTML_PATH directory and prepare the stored procedures for execution.
  • the administrative userid must have the INSERT, SELECT, and DELETE privileges on the web cache table and on the dynamic web page/LOB dependency table.
  • the preferred embodiment of the present invention provides a technique for managing the contents of a dynamic web page system cache and related data stores containing objects that the web pages reference while also ensuring the completeness of cached web pages that are subsequently reused for display at browsers.
  • the content may be managed by expiration times or in more advanced ways.
  • FIG. 2 is a flow chart illustrating the steps performed in accordance with an embodiment of the present invention.
  • Block 200 represents an exemplary embodiment of the present invention receiving a request to generate a dynamic web page. According to the exemplary embodiment, it is determined whether the requested web page is cached, as represented by Block 201 . If the requested web page is not cached, the data is retrieved and placed in a dynamically generated web page, as represented by Block 202 . The retrieved data may be linked to other stored data. Block 203 represents the determination as to whether the web page should be cached. Block 204 represents the present invention caching the retrieved data and the linked data. If the requested web page is cached, the web page is retrieved from the cache, as represented by Block 205 . The present invention then manages the cached data, as represented by Block 206 .
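The flow of FIG. 2 can be sketched as one request handler. This is an illustration only: `handle_request` and its callback parameters are hypothetical, a dictionary stands in for the cache, and the sketch assumes that the management step of Block 206 runs after both the cache-hit and the cache-miss path, which the description leaves open.

```python
from typing import Callable, Dict


def handle_request(key: str,
                   cache: Dict[str, str],
                   should_cache: Callable[[str], bool],
                   generate_page: Callable[[str], str],
                   manage_cache: Callable[[Dict[str, str]], None]) -> str:
    """Serve one request for a dynamic web page (Block 200)."""
    page = cache.get(key)            # Block 201: is the page cached?
    if page is None:
        page = generate_page(key)    # Block 202: retrieve data, build the page
        if should_cache(key):        # Block 203: should the page be cached?
            cache[key] = page        # Block 204: cache the page
    # Otherwise the page came from the cache (Block 205).
    manage_cache(cache)              # Block 206: manage the cached data
    return page


generated = []
managed = []
cache: Dict[str, str] = {}


def generate(key: str) -> str:
    generated.append(key)            # record each expensive generation
    return "<html>" + key + "</html>"


first = handle_request("a", cache, lambda k: True, generate,
                       lambda c: managed.append(len(c)))
second = handle_request("a", cache, lambda k: True, generate,
                        lambda c: managed.append(len(c)))
```

The second call is answered from the cache, so the page is generated only once.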
  • any type of computer such as a mainframe, minicomputer, or personal computer, or computer configuration, such as a timesharing mainframe, local area network, or standalone personal computer, could be used with the present invention.

Abstract

An apparatus for managing data stored in a data storage device connected to a computer. In accordance with the present invention, it is determined that a web page is to be cached. The web page references other objects. The referenced objects are stored in one or more data stores. The web page is cached. The cached web page and the referenced objects are managed in a coordinated fashion to ensure the display of a complete Web page.

Description

    PROVISIONAL APPLICATION
  • This is a continuation of application Ser. No. 09/602,412 filed Jun. 23, 2000, which claims benefit of Provisional Application No. 60/140,711 filed Jun. 24, 1999 entitled “TECHNIQUE FOR MAINTAINING AND MANAGING DYNAMIC WEB PAGES STORED IN A SYSTEM CACHE AND REFERENCED OBJECTS CACHED IN OTHER DATA STORES,” by Mel Zimowski. The entire disclosures of these prior applications are hereby incorporated by reference in their entirety.
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention
  • This invention relates in general to computer-implemented systems, and, in particular, to maintaining and managing dynamic web pages and objects referenced by the web pages.
  • 2. Description of Related Art
  • The Internet computer network is a collection of computer networks that exchange information via the Transmission Control Protocol/Internet Protocol (“TCP/IP”) protocol suite. Currently, the use of the Internet computer network for commercial and non-commercial uses is exploding. Via its networks, the Internet computer network enables users in different locations to access information stored in data sources (e.g., databases) on servers distributed across these networks.
  • The World Wide Web (i.e., the “WWW” or the “Web”) is a hypertext information and communication system used on the Internet computer network with data communications operating according to a client/server model. Typically, a user of a Web browser at a Web client computer will request data stored in data sources from a Web server computer, at which Web server software resides. The Web server software interacts with other computer programs that use interfaces to connect to these data sources, for example, a database managed by a Database Management System (“DBMS”), or uses the interfaces directly to access these data sources. These computer programs residing at the Web server computer transmit the requested data to the client computer in worldwide web documents referred to as web pages. The data can be of many different types of information, including database data, images, video clips, or audio tracks.
  • Web pages can be static web pages (i.e., web pages with fixed content that are pre-generated long before the Web client request is issued) or dynamic web pages (i.e., web pages whose content is dynamically generated at the time the Web client request is processed).
  • Dynamic web pages are typically expensive to generate because they contain data that must be obtained dynamically at web servers from either local or remote data sources. For this reason, web server caches are often used to store dynamic Web pages that are requested by multiple users. These caches are of finite size and have limits on the number of web pages they can contain. Further, the need for retaining individual pages within the cache varies over time. Thus, without proper maintenance, the storage allocated to the cache may become completely used up or the content of the cache may become outdated.
  • A web page may and often does contain hypertext links to objects stored in data stores (e.g., a local file system) that can be accessed by the web server that serves the web page. These hypertext links permit the objects that they reference to be made available to a user at a web browser as an integral part of the web page that contains them. Thus, the web page appears incomplete if it is served to a user and the objects referenced through these hypertext links cannot be materialized as required by the web page or user.
  • When web pages are generated dynamically, it is not possible to predetermine the objects that will be referenced within these web pages using hypertext links. In addition, when dynamic web pages are stored in a system cache for reuse, the relationship between the web page and the objects that it references is no longer momentary. Thus, unless the contents of the system cache and the data stores containing the objects referenced in the web page are properly managed, incomplete web pages could easily be displayed at web browsers. Accordingly, the providers of web sites need to manage, in a coordinated fashion, the contents of a dynamic web page system cache and the contents of the data stores containing objects that are referenced by hypertext links within those dynamic web pages.
  • Thus, there is a need in the art for improved techniques for maintaining and managing dynamic web pages stored in system caches and objects referenced by the web pages.
  • SUMMARY OF THE INVENTION
  • To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for managing data stored in a data storage device connected to a computer.
  • In accordance with the present invention, it is determined that a web page is to be cached. The web page references other objects. The referenced objects are stored in one or more data stores. The web page is cached. The cached web page and the referenced objects are managed in a coordinated fashion to ensure the display of a complete web page.
  • BRIEF DESCRIPTION OF THE DRAWING
  • Referring now to the drawings in which like reference numbers represent corresponding parts throughout:
  • FIG. 1 schematically illustrates the hardware environment of a preferred embodiment of the present invention, and more particularly, illustrates a typical distributed computer system using the Internet; and
  • FIG. 2 is a flow chart illustrating the steps performed in accordance with an embodiment of the present invention.
  • DETAILED DESCRIPTION
  • In the following description, reference is made to the accompanying drawings which form a part hereof, and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. It is to be understood that other embodiments may be utilized and structural changes may be made without departing from the scope of the present invention.
  • A. Overview
  • One embodiment of the present invention provides a management system for maintaining and managing dynamic web pages and objects referenced in the web pages. The preferred embodiment of the invention provides techniques for managing the contents of a dynamic web page system cache and related data stores containing objects that the web pages reference while also ensuring the completeness of cached web pages that are subsequently reused for display at browsers.
  • The preferred embodiment of the invention uses a DBMS (e.g., DB2® from International Business Machines Corporation) to cache dynamic web pages and to track dependencies that the dynamic web pages have on large objects cached in the local UNIX file system (in particular, the Hierarchical File System available under OS/390® UNIX System Services). The preferred embodiment of the invention provides tools for managing the contents of the dynamic web page cache and the large object cache, while also ensuring that the management is performed in such a way as to guarantee the completeness of any cached web pages that are displayed at a browser.
  • B. Hardware Environment
  • FIG. 1 schematically illustrates the hardware environment of a preferred embodiment of the present invention, and more particularly, illustrates a typical distributed computer system using the Internet 100 to connect Web client computers 102 executing Web browsers to a Web server computer 104 executing Web server software and other computer programs that connect the server system 104 to data sources 106. A typical combination of resources may include client computers 102 that are personal computers or workstations, and a web server computer 104 that is a personal computer, workstation, minicomputer, or mainframe. These systems are coupled to one another by various networks, including LANs, WANs, SNA networks, and the Internet.
  • A Web client computer 102 typically executes a Web browser and is coupled to a Web server computer 104 executing Web server software. The Web browser is typically a program such as Microsoft's Internet Explorer® or Netscape Navigator®. The Web server software is typically a program such as IBM's HTTP Server or other WWW server software. The software executing on the Web server uses a data source interface and, possibly, other computer programs, for connecting to the data sources 106. The software executing on the Web server may also include a cache management system 110. The client computer 102 is bi-directionally coupled with the Web server computer 104 over a line or via a wireless system. In turn, the Web server computer 104 is bi-directionally coupled with data sources 106.
  • The data source interface permits the software executing on the Web Server to be connected to a Database Management System (DBMS), which supports access to a data source 106 by executing DBMS software. The DBMS may be located on the same server as the Web server computer 104 or may be located on a separate machine. The data sources 106 may be geographically distributed. The software executing on the Web server translates the request received from a Web browser into one or more statements (e.g., a macro file or a COBOL program) that can be processed to retrieve data from data sources 106.
  • Those skilled in the art will recognize many modifications may be made to this configuration without departing from the scope of the present invention.
  • C. Dynamic Web Page Caching Overview
  • 1. Overview of Design
  • The caching of dynamic web pages enhances the Web server's ability to quickly serve web pages containing dynamic content. In the preferred embodiment, ASCII encoded web pages are cached in a database for reuse by the cache management system 110. This eliminates the costs associated with reconstructing a dynamic web page that is requested by multiple users. Web pages can be HTML web pages or XML documents.
  • A cached page is static, as the content of a cached page reflects the state of data stores and business logic at the time the web page was created. Subsequent changes to the data stores and business logic do not affect the content of the cached page.
  • Caching directives are used to specify, among other things, the web pages to be cached. A number of factors affect whether web pages should be cached. More specifically, a page should be cached when the page is repeatedly requested by users and when the content of the page changes infrequently. A page should not be cached when the processing associated with the generation of the web page makes changes to data sources. If a cached web page is used to respond to a user's request, the processing logic associated with the generation of the cached page is not executed and no changes are made to the data sources.
  • When the cache management system 110 (also referred to as the management system) is configured as a CGI application, the caching directives are processed for each cache management system user request. When the cache management system 110 is configured as a Web server plugin or as a servlet, the caching directives are processed once per web server address space at the time the first management system request is assigned to a worker thread associated with that address space. For additional information about worker threads, refer to pending U.S. patent application Ser. No. 09/104,879, filed Jun. 25, 1998 by M. Zimowski, S. Greenspan, P. Livecchi, and J. Aman, entitled “METHOD AND SYSTEM FOR MANAGING CONNECTIONS TO A DATABASE MANAGEMENT SYSTEM”.
  • Stored procedures are provided by the cache management system 110 for managing the contents of the cache.
  • 2. Web Page Caching Directives
  • In the preferred embodiment, authorized Web server administrators configure the cache management system 110 to cache web pages by adding caching directives to the management system initialization file (db2www.ini).
  • In particular, the DTW_CACHE_PAGE directive is used to specify web pages that are to be cached by the cache management system 110. If the management system initialization file does not contain a DTW_CACHE_PAGE directive, then no web pages are cached. The following is the syntax for the DTW_CACHE_PAGE directive:
  • DTW_CACHE_PAGE [=] file_name_spec|path_template_spec lifetime usage_scope
  • The term file_name_spec refers to the specification of one or all blocks within a macro file using the fully qualified name of the macro file. A macro file is an installation provided application that the cache management system 110 executes to generate one or more web pages. A block is a subsection of a macro that is capable of generating a specific web page. For example, DTW_CACHE_PAGE=/u/USRND01/macros/custqord.d2w/Output specifies the caching of the web page created by the execution of the block Output in macro custqord.d2w in the directory /u/USRND01/macros. In the following additional example, DTW_CACHE_PAGE=/u/USRND01/macros/custqord.d2w/* specifies the caching of all web pages created by the execution of any block in macro custqord.d2w in the directory /u/USRND01/macros.
  • The term path_template_spec refers to the specification of blocks within macro files using a path template for one or more directories containing macro files. A path template contains the suffix /*. The management system caches all web pages created by the execution of blocks in macro files contained within the directory or directories that match the path template. For example, DTW_CACHE_PAGE=/u/USRND01/macros/* specifies the caching of web pages created by the execution of blocks in all macros contained within the directory /u/USRND01/macros and any subdirectory of /u/USRND01/macros. In the following additional example, DTW_CACHE_PAGE=/* specifies the caching of all web pages created by all management system macros.
  • The term lifetime refers to the minimum number of seconds that a cached web page is valid.
  • The term usage_scope specifies the degree to which the reuse of the web page is restricted. Reuse is granted or denied based on the authority of the userid associated with the request. Usage_scope can have a value of PUBLIC or PRIVATE. PUBLIC means that the cached web page should be served (i.e., returned to a user) when the user request matches the cache key (discussed in further detail below), the cached page is valid, and the userid associated with the web server thread or process processing the request is authorized to execute the macro that generated the page. PRIVATE means that the cached web page should be served when the user request matches the cache key (discussed in further detail below), the cached page is valid, and the userid associated with the web server thread or process processing the request is the same as the userid that was associated with the web server thread or process that cached the web page.
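  • The PUBLIC and PRIVATE reuse rules can be illustrated with a minimal Python sketch. The names CachedPage, may_serve, and can_execute_macro are hypothetical and are not part of the management system. The sketch also assumes that a cached page is "valid" while the current time precedes its expiration time, and that the caller has already determined whether the requesting userid is authorized to execute the macro.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class CachedPage:
    key: str                   # cache key of the request that created the page
    creator: str               # userid associated with the caching request
    expiration_time: datetime  # creation time plus lifetime
    usage_scope: str           # "PUBLIC" or "PRIVATE"


def may_serve(page: CachedPage, request_key: str, userid: str,
              now: datetime, can_execute_macro: bool) -> bool:
    """Return True if the cached page may be returned for this request."""
    if request_key != page.key:
        return False
    # Assumption: "valid" means the expiration time has not yet passed.
    if now >= page.expiration_time:
        return False
    if page.usage_scope == "PUBLIC":
        # Any userid authorized to execute the generating macro may reuse it.
        return can_execute_macro
    if page.usage_scope == "PRIVATE":
        # Only the userid that cached the page may reuse it.
        return userid == page.creator
    return False


# Example: a PRIVATE page is reusable only by the userid that cached it.
now = datetime(2000, 6, 23, 12, 0, 0)
page = CachedPage("key", "USER1", now + timedelta(hours=1), "PRIVATE")
print(may_serve(page, "key", "USER1", now, can_execute_macro=True))  # True
print(may_serve(page, "key", "USER2", now, can_execute_macro=True))  # False
```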
  • The caching directive can be specified multiple times. Namely, a different caching directive (i.e., DTW_CACHE_PAGE directive) can be specified for each file_name_spec or path_template_spec value. When the caching directives conflict with each other, the first directive specified takes precedence.
  • A cached page is reused for a request when the URL, the form data, and the query string of the request match the URL, form data, and query string of the request that caused the web page to be cached. Examples of caching directives are shown below:
  • EXAMPLE 1
  • Specifies the caching of any Web pages generated when the cache management system executes a particular HTML block in the specified macro
  • DTW_CACHE_PAGE /u/USER1/macros/main.d2w/output 3600 PUBLIC
  • In this example, the cache management system caches the Web pages generated when it executes the output block in the macro main.d2w, located in the /u/USER1/macros directory. The Web pages have PUBLIC scope, and remain valid for at least 1 hour.
  • EXAMPLE 2
  • Specifies the caching of any Web pages generated when the cache management system executes any block in the specified macro
  • DTW_CACHE_PAGE /u/USER1/macros/main.d2w/* 1800 PUBLIC
  • In this example, the cache management system caches any Web pages the cache management system generates when it executes any block in the macro main.d2w, located in the /u/USER1/macros directory. The Web pages have PUBLIC scope, and remain valid for at least 30 minutes.
  • EXAMPLE 3
  • Specifies the caching of any Web pages generated when the cache management system executes blocks in macros located in one or more directories
  • DTW_CACHE_PAGE /u/USER1/macros/* 3600 PRIVATE
  • In this example, the cache management system caches any Web pages the cache management system generates when it executes any block in any macro located in the /u/USER1/macros directory or any of its subdirectories. The Web pages have PRIVATE scope, and remain valid for at least 1 hour.
  • EXAMPLE 4
  • Specifies the caching of any Web page generated by all macros
  • DTW_CACHE_PAGE /* 3600 PUBLIC
  • In this example, the cache management system caches all Web pages that the cache management system generates. The Web pages have PUBLIC scope, and remain valid for at least 1 hour.
  • EXAMPLE 5
  • Specifies multiple Web page caching directives
  • DTW_CACHE_PAGE /u/USER1/macros/main/* 1800 PUBLIC
  • DTW_CACHE_PAGE /u/USER1/macros/special/daily_news.d2w/* 43200 PUBLIC
  • DTW_CACHE_PAGE /u/USER1/macros/special/employee_stats.d2w/* 3600 PRIVATE
  • In this example, the cache management system caches the following: (1) All Web pages generated from any block in any macro located in the /u/USER1/macros/main/ directory. The Web pages have PUBLIC scope and remain valid for at least 30 minutes. (2) All Web pages generated by the daily_news.d2w macro in the directory /u/USER1/macros/special/. These Web pages have PUBLIC scope and remain valid for at least 12 hours. (3) All Web pages generated by the employee_stats.d2w macro in the directory /u/USER1/macros/special/. These Web pages have PRIVATE scope and remain valid for at least 1 hour.
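  • The directive syntax and the first-directive-wins precedence rule described above can be sketched in Python as follows. This is an illustration only: CacheDirective, parse_directive, and find_directive are hypothetical names, and the sketch assumes that a specification ending in /* matches any request path beginning with the text before the asterisk, that any other specification must match exactly, and that matching is case-sensitive. The request path used here is assumed to be the directory, macro file name, and block name joined by slashes.

```python
from dataclasses import dataclass
from typing import List, Optional

DIRECTIVE_NAME = "DTW_CACHE_PAGE"


@dataclass(frozen=True)
class CacheDirective:
    spec: str         # file_name_spec or path_template_spec
    lifetime: int     # minimum number of seconds the cached page is valid
    usage_scope: str  # "PUBLIC" or "PRIVATE"


def parse_directive(line: str) -> CacheDirective:
    """Parse 'DTW_CACHE_PAGE [=] spec lifetime usage_scope'."""
    text = line.strip()
    if not text.startswith(DIRECTIVE_NAME):
        raise ValueError("not a DTW_CACHE_PAGE directive: " + line)
    rest = text[len(DIRECTIVE_NAME):].lstrip()
    if rest.startswith("="):          # the equals sign is optional
        rest = rest[1:]
    spec, lifetime, scope = rest.split()
    if scope not in ("PUBLIC", "PRIVATE"):
        raise ValueError("usage_scope must be PUBLIC or PRIVATE")
    return CacheDirective(spec, int(lifetime), scope)


def matches(spec: str, request_path: str) -> bool:
    # Assumption: a trailing "/*" acts as a prefix match on the request path.
    if spec.endswith("/*"):
        return request_path.startswith(spec[:-1])
    return spec == request_path


def find_directive(directives: List[CacheDirective],
                   request_path: str) -> Optional[CacheDirective]:
    """Return the first matching directive, or None if the page is not cached."""
    for directive in directives:
        if matches(directive.spec, request_path):
            return directive
    return None
```

  • Because find_directive returns on the first match, a broad path template listed before a more specific file name specification would shadow it, which is one way to read the statement that the first directive specified takes precedence.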
  • After the web pages to be cached have been specified, the table used to cache the web pages must be set up. The following steps are necessary to set up the table used to cache the web pages:
    • a. Create the Web page cache table, SYSIBM.DTWCACHEDPAGES.
    • b. Define the stored procedure used to insert the cached pages into SYSIBM.DTWCACHEDPAGES to the DBMS.
    • c. Prepare the stored procedure for execution using a user ID with INSERT, SELECT, and DELETE privileges on SYSIBM.DTWCACHEDPAGES. The user IDs associated with the requests that cache pages must have the EXECUTE privilege for the stored procedure.
  • After the above three steps are completed, web pages can be cached.
  • 3. Web Page Cache Key
  • The actual cache key for the cached dynamic web page consists of path information, macro name, HTML or XML block name, plus the query string plus the form data (if present) that caused the dynamic web page to be generated.
  • D. Management System Dynamic Web Page Cache and Referenced Object Data Store Consistency
  • 1. Management System Objectives Regarding Cache Consistency
  • A web page may be defined as a set of HTML tags or an XML (Extensible Mark-up Language) document and the objects (e.g., DB2 large objects or LOBs) referenced by the web page using hypertext links.
  • The contents of a cached web page reflect the state of data stores and business logic at the time the cached web page was created. Thus, a cached web page is static.
  • A cached web page remains in the cache until it is automatically deleted upon expiration of its lifetime (i.e., creation time plus lifetime value), when automatic management is enabled, or until it is deleted by an authorized web server administrator.
  • The following are the objectives that the management system is designed to achieve regarding dynamic web page cache and referenced object consistency. When serving a cached web page, the management system maintains the consistency that existed at the time the cached web page was created. The removal of a referenced object from its data store (e.g., the local UNIX file system) invalidates the cached dynamic web page that references the object. The removal of a dynamic web page from the management system cache invalidates referenced objects stored in the referenced object data store.
  • A web page is not returned to a user at a browser until all LOBs referenced in the web page are successfully placed in the referenced object data store. Additionally, the web page is not cached until all LOBs referenced in the web page are successfully placed in the referenced object data store. When a web page is removed from the dynamic Web page cache, any LOBs referenced by that web page are removed from the referenced object data store. When a LOB is removed from the referenced object data store, any web page that references the LOB is removed from the dynamic Web page cache and any other LOBs referenced by that web page are removed from the referenced object data store. The cached web page is always removed before any dependent LOBs are removed from the referenced object data store.
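  • The removal rules in the preceding paragraph can be modeled with a small in-memory Python sketch. CoordinatedCache and its methods are hypothetical names, and plain dictionaries stand in for the dynamic Web page cache and the referenced object data store. The sketch additionally assumes that a LOB could be shared by more than one cached page, in which case every page that references a removed LOB is removed as well.

```python
from typing import Dict, Set, Tuple


class CoordinatedCache:
    def __init__(self) -> None:
        self.pages: Dict[str, str] = {}           # cache key -> page text
        self.page_lobs: Dict[str, Set[str]] = {}  # cache key -> LOB filenames
        self.lob_store: Dict[str, bytes] = {}     # filename -> LOB contents

    def cache_page(self, key: str, page: str, lobs: Dict[str, bytes]) -> None:
        # All referenced LOBs are placed first. If placing one raised an
        # error, the lines below would never run and the page would not
        # be cached.
        for filename, data in lobs.items():
            self.lob_store[filename] = data
        self.pages[key] = page
        self.page_lobs[key] = set(lobs)

    def _closure(self, keys: Set[str], lobs: Set[str]) -> Tuple[Set[str], Set[str]]:
        # Expand to every page that references an affected LOB and every
        # LOB referenced by an affected page.
        keys, lobs = set(keys), set(lobs)
        changed = True
        while changed:
            changed = False
            for key, names in self.page_lobs.items():
                if key in keys:
                    if not names <= lobs:
                        lobs |= names
                        changed = True
                elif names & lobs:
                    keys.add(key)
                    changed = True
        return keys, lobs

    def _remove(self, keys: Set[str], lobs: Set[str]) -> None:
        keys, lobs = self._closure(keys, lobs)
        for key in keys:             # cached pages are always removed first
            self.pages.pop(key, None)
            self.page_lobs.pop(key, None)
        for filename in lobs:        # then the dependent LOBs
            self.lob_store.pop(filename, None)

    def remove_page(self, key: str) -> None:
        self._remove({key}, set())

    def remove_lob(self, filename: str) -> None:
        self._remove(set(), {filename})
```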
  • 2. The Management System Dynamic Web Page Cache
  • ASCII encoded web pages are cached in a data store for reuse by the management system. In the preferred embodiment, referenced objects are large objects (LOBs). Caching directives specify the web pages to be cached and the minimum number of seconds the cached page is valid. The key of a cached page consists of path information, macro name, HTML or XML block name, plus query string plus form data (if present). Cached web pages are classified by usage scope: PUBLIC or PRIVATE.
  • The DB2 dynamic web page cache table is: SYSIBM.DTWCACHEDPAGES. The index key for a cached page in the dynamic web page cache table is: INDEXED_KEY CHAR(250), which is the first 250 characters of path information, macro name, HTML or XML block name, query string, and form data. An identifier is specified by: ID INTEGER, which is an identifier derived from path information, macro name, HTML or XML block name, query string, and form data. To provide fast access to the cache, an index to the dynamic web page cache table is provided that consists of both the column for the INDEXED_KEY and the column for the ID.
  • The actual key of a cached page is: ACTUAL_KEY VARCHAR(4000), which is comprised of the path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page.
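  • The relationship among ACTUAL_KEY, INDEXED_KEY, and ID can be sketched in Python as follows. The specification does not state how the key components are joined or how the ID is derived, so the separator characters and the use of a CRC-32 checksum below are assumptions made purely for illustration; CacheKey and build_cache_key are hypothetical names.

```python
import zlib
from typing import NamedTuple

INDEXED_KEY_CHARS = 250      # INDEXED_KEY CHAR(250)
ACTUAL_KEY_MAX_CHARS = 4000  # ACTUAL_KEY VARCHAR(4000)


class CacheKey(NamedTuple):
    indexed_key: str  # first 250 characters of the actual key
    id: int           # identifier derived from the same components
    actual_key: str   # full key used for the final comparison


def build_cache_key(path_info: str, macro_name: str, block_name: str,
                    query_string: str = "", form_data: str = "") -> CacheKey:
    # Assumption: components are joined with "/", "?" and "&".
    actual_key = f"{path_info}/{macro_name}/{block_name}?{query_string}&{form_data}"
    if len(actual_key) > ACTUAL_KEY_MAX_CHARS:
        raise ValueError("actual key does not fit in VARCHAR(4000)")
    # Assumption: CRC-32 folded into the signed 32-bit range of an INTEGER.
    crc = zlib.crc32(actual_key.encode("utf-8"))
    signed_id = crc - 2**32 if crc >= 2**31 else crc
    return CacheKey(actual_key[:INDEXED_KEY_CHARS], signed_id, actual_key)
```

  • Under these assumptions, a lookup would first narrow the candidates using the index on INDEXED_KEY and ID and then compare ACTUAL_KEY, since two long keys can share the same first 250 characters.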
  • The userid of the creator is: CREATOR CHAR(8), which is the userid associated with a request that created the cached page. The creation timestamp is: CREATION_TIME TIMESTAMP, which specifies the date and time of the creation of the cached page. This date and time is the same as the CREATION_TIME for any LOBs that the web page references.
  • The expiration timestamp is: EXPIRATION_TIME TIMESTAMP, which specifies the date and time of expiration of the cached page (value of CREATION_TIME+lifetime value from DTW_CACHE_PAGE directive). A second index for the dynamic web page cache table is provided that consists of the column for EXPIRATION_TIME. The indexed column (i.e., EXPIRATION_TIME) of the second index is used to efficiently identify the cached web pages that have expired.
  • The size is: SIZE INTEGER, which specifies the size of the cached page in bytes. The usage scope is: USAGE_SCOPE SMALLINT; a value of 1 means that the page has a PUBLIC usage scope and a value of 2 means that the page has a PRIVATE usage scope. The ordinal position of the segment is: ORDINAL_POSITION SMALLINT, which specifies the ordinal position of the web page segment within the complete cached page. The dynamic web page segment is: PAGE_SEGMENT VARCHAR(28100) FOR BIT DATA, which is the ASCII encoded web page segment.
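  • Because each PAGE_SEGMENT holds at most 28100 bytes, a page larger than that would span several rows distinguished by ORDINAL_POSITION. A minimal Python sketch of splitting and reassembling a page follows. The function names are hypothetical, and numbering segments from 1 is an assumption, since the specification does not state the starting ordinal.

```python
from typing import List, Tuple

SEGMENT_BYTES = 28100  # capacity of PAGE_SEGMENT VARCHAR(28100) FOR BIT DATA


def split_page(page: str) -> List[Tuple[int, bytes]]:
    """Return (ordinal_position, page_segment) pairs for an ASCII page."""
    data = page.encode("ascii")
    offsets = range(0, len(data), SEGMENT_BYTES)
    # Assumption: ordinal positions start at 1.
    return [(position, data[offset:offset + SEGMENT_BYTES])
            for position, offset in enumerate(offsets, start=1)]


def join_page(segments: List[Tuple[int, bytes]]) -> str:
    """Rebuild the complete page from its segments in ordinal order."""
    ordered = sorted(segments, key=lambda segment: segment[0])
    return b"".join(data for _, data in ordered).decode("ascii")
```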
  • 3. Dynamic Web Page/LOB Dependency Table
  • A dynamic web page/LOB dependency table must be created and several stored procedures must be installed before the cache management system 110 can manage cached web pages and LOBs. The steps necessary to create the web page dependency table are outlined below:
  • a. The dynamic Web page/LOB dependency table contains information about the LOB files stored in HFS and about the relationship that these files may have, if any, to Web pages stored in the dynamic Web page cache.
  • (i) Create the dynamic Web Page/LOB dependency table, SYSIBM.DTWCACHEDEPS.
  • (ii) Define the stored procedure used to insert the dependency information into SYSIBM.DTWCACHEDEPS to the DBMS.
  • (iii) Prepare the stored procedure for execution using a user ID with INSERT, SELECT, and DELETE privileges on SYSIBM.DTWCACHEDEPS. The user IDs associated with the requests that retrieve LOBs must have the EXECUTE privilege for the stored procedure.
  • b. Create the stored procedure that performs automatic management of the Web page cache and LOBs.
  • (i) Define the stored procedure to the DBMS.
  • (ii) Prepare the stored procedure for execution using a user ID with the DELETE privilege on SYSIBM.DTWCACHEDPAGES and SYSIBM.DTWCACHEDEPS. The user IDs that execute macros must have the EXECUTE privilege for the stored procedure.
  • c. Create the stored procedure that is used for more advanced management of the Web page cache and LOBs.
  • (i) Define the stored procedure to the DBMS.
  • (ii) Prepare the stored procedure for execution using a userid with the DELETE privilege on SYSIBM.DTWCACHEDPAGES and on SYSIBM.DTWCACHEDEPS. The userids that execute the management system-provided macro must have the EXECUTE privilege for the stored procedure.
  • 4. Dynamic Web Page/LOB Dependency Table Details
  • The DB2 web page/LOB dependency table is: SYSIBM.DTWCACHEDEPS. The indexed key for a cached page in the DB2 web page/LOB dependency table is: INDEXED_KEY CHAR(250), which specifies the first 250 characters of path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page. The identifier is: ID INTEGER, which specifies an identifier derived from path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page. An index to the DB2 web page/LOB dependency table is provided that consists of both the column for the INDEXED_KEY and the column for the ID.
  • The actual key of the cached page is: ACTUAL_KEY VARCHAR(4000), which specifies the path information, macro name, HTML or XML block name, query string, and form data for the request that created the cached page. The fully qualified HFS filename for the LOB is: FILENAME VARCHAR(1024).
  • The creation timestamp is: CREATION_TIME TIMESTAMP, which specifies the date and time of the creation of the LOB. This date and time is the same as the CREATION_TIME for the web page that references the LOB.
  • The expiration timestamp is: EXPIRATION_TIME TIMESTAMP, which specifies the date and time for expiration of the LOB (value of CREATION_TIME+DTW_LOB_LIFETIME configuration value when dynamic web page caching is not in use; value of CREATION_TIME+max(DTW_LOB_LIFETIME configuration value, lifetime value from DTW_CACHE_PAGE directive) when dynamic web page caching is in use). A second index for the DB2 web page/LOB dependency table is provided that consists of the column for EXPIRATION_TIME. The indexed column (i.e., EXPIRATION_TIME) of the second index is used to efficiently identify objects (referenced by a web page) that have expired.
  • The size is: SIZE INTEGER, which specifies the size of the LOB in bytes.
  • The SYSIBM.DTWCACHEDPAGES and the SYSIBM.DTWCACHEDEPS DB2 tables are shown below. SYSIBM.DTWCACHEDPAGES is the dynamic Web page cache table. This table contains the cached Web pages and information about the cached Web pages.
  • TABLE 1
    SYSIBM.DTWCACHEDPAGES
    Column: Datatype | Description
    INDEXED_KEY: CHAR(250) | Indexed key for cached page: the first 250 characters of the actual key
    ID: INTEGER | Identifier: an identifier derived from the key
    ACTUAL_KEY: VARCHAR(4000) | The actual key of the cached page: the path information, macro name, HTML or XML block name, query string, and form data of the request that generated the page.
    CREATOR: CHAR(8) | User ID of creator: user ID associated with the request that created the cached page.
    CREATION_TIME: TIMESTAMP | Creation timestamp: date and time of creation of cached page. It is the same as the CREATION_TIME for any LOBs that the Web page references.
    EXPIRATION_TIME: TIMESTAMP | Expiration timestamp: date and time for expiration of cached page (value of CREATION_TIME + lifetime value from DTW_CACHE_PAGE directive)
    SIZE: INTEGER | Size: size of cached page in bytes
    USAGE_SCOPE: SMALLINT | Usage scope: a value of 1 means that the page has a PUBLIC usage scope and a value of 2 means that the page has a PRIVATE usage scope
    ORDINAL_POSITION: SMALLINT | Ordinal position of segment: the ordinal position of the Web page segment within the complete cached page
    PAGE_SEGMENT: VARCHAR(28100) FOR BIT DATA | Dynamic Web page segment: the ASCII encoded Web page segment

    SYSIBM.DTWCACHEDEPS is the Web page dependency table. This table contains information about the LOBs referenced by Web pages.
  • TABLE 2
    SYSIBM.DTWCACHEDEPS
    Column: Datatype | Description
    INDEXED_KEY: CHAR(250) | Indexed key for Web page: first 250 characters of the actual key
    ID: INTEGER | Identifier: an identifier derived from the actual key
    ACTUAL_KEY: VARCHAR(4000) | The actual key of the cached page: the path information, macro name, HTML or XML block name, query string, and form data of the request that generated the page.
    FILENAME: VARCHAR(1024) | Fully qualified HFS filename for the LOB
    CREATION_TIME: TIMESTAMP | Creation timestamp: date and time of the creation of the LOB. It is the same as the CREATION_TIME for the Web page that references this LOB.
    EXPIRATION_TIME: TIMESTAMP | Expiration timestamp: date and time of the expiration of the LOB (value of CREATION_TIME + DTW_LOB_LIFETIME configuration value when dynamic web page caching is not in use; value of CREATION_TIME + max(DTW_LOB_LIFETIME configuration value, lifetime value from DTW_CACHE_PAGE directive) when dynamic web page caching is in use)
    SIZE: INTEGER | Size: size of the LOB in bytes
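  • For readers who wish to experiment with the two table layouts, the following Python sketch creates comparable tables and indexes in an in-memory SQLite database. SQLite is used here only as a convenient stand-in for DB2 and is not part of the described embodiment. The SYSIBM qualifier is omitted, BLOB replaces VARCHAR(28100) FOR BIT DATA, and the index names and the function create_cache_tables are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE DTWCACHEDPAGES (
    INDEXED_KEY      CHAR(250),
    ID               INTEGER,
    ACTUAL_KEY       VARCHAR(4000),
    CREATOR          CHAR(8),
    CREATION_TIME    TIMESTAMP,
    EXPIRATION_TIME  TIMESTAMP,
    SIZE             INTEGER,
    USAGE_SCOPE      SMALLINT,   -- 1 = PUBLIC, 2 = PRIVATE
    ORDINAL_POSITION SMALLINT,
    PAGE_SEGMENT     BLOB        -- stand-in for VARCHAR(28100) FOR BIT DATA
);
-- First index: fast access by indexed key and identifier.
CREATE INDEX PAGES_KEY_IX ON DTWCACHEDPAGES (INDEXED_KEY, ID);
-- Second index: efficient identification of expired pages.
CREATE INDEX PAGES_EXP_IX ON DTWCACHEDPAGES (EXPIRATION_TIME);

CREATE TABLE DTWCACHEDEPS (
    INDEXED_KEY      CHAR(250),
    ID               INTEGER,
    ACTUAL_KEY       VARCHAR(4000),
    FILENAME         VARCHAR(1024),
    CREATION_TIME    TIMESTAMP,
    EXPIRATION_TIME  TIMESTAMP,
    SIZE             INTEGER
);
CREATE INDEX DEPS_KEY_IX ON DTWCACHEDEPS (INDEXED_KEY, ID);
CREATE INDEX DEPS_EXP_IX ON DTWCACHEDEPS (EXPIRATION_TIME);
"""


def create_cache_tables(conn: sqlite3.Connection) -> None:
    """Create both cache tables and their indexes on the given connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_cache_tables(conn)
```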
  • 5. Managing LOBs and Cached Web Pages
  • LOBs referenced in web pages are stored as files in a single HFS directory. The name of the directory is specified by the HTML_PATH configuration statement of the management system initialization file.
  • The settings of two configuration variables control the manner in which LOBs and cached web pages are automatically managed. The configuration variable DTW_LOB_LIFETIME specifies the minimum number of seconds LOBs are available. The default is 0 seconds. When dynamic web page caching is in use, the minimum number of seconds a LOB is available is the larger of the value of DTW_LOB_LIFETIME and the lifetime value specified in the caching directive for the dynamic web page that references the LOB. The configuration variable DTW_CACHE_MANAGEMENT_INTERVAL specifies a minimum number of seconds between successive automatic cache management attempts. The default is 0 seconds. A value of 0 means that the cache management system 110 will not perform automatic cache management.
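  • The effect of the two configuration variables can be expressed as a short Python sketch. The function names lob_expiration and management_due are hypothetical, and passing page_lifetime=None is simply this sketch's way of representing the case in which dynamic web page caching is not in use.

```python
from datetime import datetime, timedelta
from typing import Optional


def lob_expiration(creation_time: datetime, dtw_lob_lifetime: int = 0,
                   page_lifetime: Optional[int] = None) -> datetime:
    """Expiration time of a LOB; lifetimes are given in seconds."""
    if page_lifetime is None:
        lifetime = dtw_lob_lifetime  # caching not in use
    else:
        # Caching in use: the larger of the two lifetimes applies.
        lifetime = max(dtw_lob_lifetime, page_lifetime)
    return creation_time + timedelta(seconds=lifetime)


def management_due(last_attempt: Optional[datetime], now: datetime,
                   interval: int = 0) -> bool:
    """Whether an automatic cache management attempt may start now.

    interval corresponds to DTW_CACHE_MANAGEMENT_INTERVAL; the default of 0
    means automatic cache management is never performed.
    """
    if interval == 0:
        return False
    if last_attempt is None:
        return True
    return (now - last_attempt).total_seconds() >= interval
```

  • For example, with DTW_LOB_LIFETIME set to 600 and a caching directive lifetime of 3600, a LOB referenced by the cached page would remain available for at least one hour after its creation.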
  • Cached web pages and LOBs are automatically managed by the management system using DB2 stored procedures. Additionally, a management system-provided macro permits Web server administrators to manage LOBs and cached web pages in more sophisticated ways using a stored procedure.
  • 6. Automatic Cache Management
  • When automatic cache management occurs, all cached dynamic web pages and LOBs with expiration timestamps that precede the current timestamp are automatically deleted.
  • The following are the logical steps for automatic cache management:
      • # Delete cache table entries where expiration timestamp < current timestamp
      • # Select dependency table entries where expiration timestamp < current timestamp
      • # For each selected dependency table entry:
        • < Delete HFS file identified by HFS filename for LOB
        • < Delete dependency table entry
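  • The logical steps above can be sketched as follows. This is an illustration only: it uses Python with SQLite in place of DB2 stored procedures, and the table names web_cache and lob_dependency are assumptions. The column names follow the dependency table described earlier, and timestamps are assumed to be stored as sortable strings of the form YYYY-MM-DD HH:MM:SS.

```python
import os
import sqlite3


def automatic_cache_management(conn: sqlite3.Connection, now: str) -> int:
    """Delete expired cached pages and LOBs; return the number of LOB entries removed.

    Table names (web_cache, lob_dependency) are hypothetical.
    """
    cur = conn.cursor()
    # Step 1: delete cache table entries whose expiration precedes `now`.
    cur.execute("DELETE FROM web_cache WHERE expiration_time < ?", (now,))
    # Step 2: select dependency table entries whose expiration precedes `now`.
    cur.execute(
        "SELECT id, filename FROM lob_dependency WHERE expiration_time < ?",
        (now,),
    )
    expired = cur.fetchall()
    # Step 3: for each selected entry, delete the file, then the table entry.
    for entry_id, filename in expired:
        try:
            os.remove(filename)
        except FileNotFoundError:
            pass  # Assumption: a file that is already gone is not an error.
        cur.execute(
            "DELETE FROM lob_dependency WHERE id = ? AND filename = ?",
            (entry_id, filename),
        )
    conn.commit()
    return len(expired)
```

  • Deleting the cache table entries before the files means that a cached page is never left pointing at a LOB file that has already been removed.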
  • 7. More Advanced Cache Management
  • The management system provides a macro that allows a system administrator to execute a management system provided stored procedure to perform cache management. Use of this macro is referred to as more advanced cache management.
  • When more advanced cache management occurs, all cached dynamic web pages and LOBs with expiration timestamps that precede the current timestamp are automatically deleted.
  • The management system-provided macro for the more advanced management of the caches permits the user to perform the following types of management. First, a cleanup operation may be requested that deletes all cached dynamic web pages and LOBs with expiration timestamps that precede the current timestamp. Second, web pages may be deleted from the dynamic web page cache based on system administrator provided input, such as macro file and block name template values, timestamp values or a combination of name and timestamp values (e.g., delete web pages where key of cached page is like X and creation timestamp is less than some provided timestamp value Y). All cached LOBs that are referenced by the deleted web pages are also deleted. Third, LOBs from the LOB cache may be deleted based on a timestamp value (e.g., delete LOBs where creation timestamp is greater than some provided timestamp value Y). All cached dynamic web pages that contain hypertext references to the LOBs are first deleted.
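  • The third type of management, in which the referencing web pages are deleted before the LOBs themselves, might look like the following sketch. It is written in Python with SQLite standing in for DB2; the table names web_cache and lob_dependency and the function name are assumptions, and the sketch assumes that each dependency table row pairs the ACTUAL_KEY of a cached page with the FILENAME of one LOB that the page references.

```python
import os
import sqlite3
from typing import List, Tuple


def delete_lobs_created_since(conn: sqlite3.Connection, start_creation: str) -> int:
    """Delete LOBs created at or after start_creation; return how many were selected.

    Table names (web_cache, lob_dependency) are hypothetical.
    """
    cur = conn.cursor()
    cur.execute(
        "SELECT actual_key, filename FROM lob_dependency WHERE creation_time >= ?",
        (start_creation,),
    )
    selected: List[Tuple[str, str]] = cur.fetchall()
    # First, delete every cached page that references a selected LOB.
    for actual_key, _filename in selected:
        cur.execute("DELETE FROM web_cache WHERE actual_key = ?", (actual_key,))
    # Only then delete the LOB files and their dependency table entries.
    for actual_key, filename in selected:
        if os.path.exists(filename):
            os.remove(filename)
        cur.execute(
            "DELETE FROM lob_dependency WHERE actual_key = ? AND filename = ?",
            (actual_key, filename),
        )
    conn.commit()
    return len(selected)
```

  • The sketch stops at the ordering rule. It does not remove other LOBs that the deleted pages reference, which a complete implementation would also need to handle.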
  • 8. Macro for More Advanced Cache Management
  • More specifically, the macro for more advanced management of dynamic web pages and LOBs allows a system administrator to specify the following:
      • # The deletion of all cached dynamic web pages and LOBs with expiration timestamps that precede the current timestamp.
      • # A macro and block template value: all dynamic web pages are deleted where ACTUAL_KEY LIKE %value%.
      • # A starting creation timestamp value: all dynamic web pages are deleted where the creation timestamp is greater than or equal to the starting creation timestamp.
      • # An ending creation timestamp value: all dynamic web pages are deleted where the creation timestamp is less than or equal to the ending creation timestamp.
      • # Both a starting and an ending creation timestamp value: all dynamic web pages are deleted where the creation timestamp is greater than or equal to the starting creation timestamp and less than or equal to the ending creation timestamp.
      • # A macro and block template value and a starting creation timestamp value: all dynamic web pages are deleted where ACTUAL_KEY LIKE %value% and where the creation timestamp is greater than or equal to the starting creation timestamp.
      • # A macro and block template value and an ending creation timestamp value: all dynamic web pages are deleted where ACTUAL_KEY LIKE %value% and where the creation timestamp is less than or equal to the ending creation timestamp.
      • # A macro and block template value and both a starting and an ending creation timestamp value: all dynamic web pages are deleted where ACTUAL_KEY LIKE %value% and where the creation timestamp is greater than or equal to the starting creation timestamp and less than or equal to the ending creation timestamp.
  • The macro for more advanced management of LOBs also allows a user to specify the following:
      • # A starting creation timestamp value: all LOBs are deleted where the creation timestamp is greater than or equal to the starting creation timestamp.
      • # An ending creation timestamp value: all LOBs are deleted where the creation timestamp is less than or equal to the ending creation timestamp.
      • # Both a starting and an ending creation timestamp value: all LOBs are deleted where the creation timestamp is greater than or equal to the starting creation timestamp and less than or equal to the ending creation timestamp.
  • The macro for more advanced management is named manage_cache.d2w.
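  • The combinations of selection criteria listed above for dynamic web pages reduce to one optional LIKE filter and two optional timestamp bounds. The following Python sketch shows one way such a selection could be assembled as a parameterized SQL WHERE clause. The function name and the lowercase column names are illustrative assumptions and do not describe the contents of manage_cache.d2w.

```python
from typing import List, Optional, Tuple


def page_delete_criteria(
    key_filter: Optional[str] = None,
    start_creation: Optional[str] = None,
    end_creation: Optional[str] = None,
) -> Tuple[str, List[str]]:
    """Build a WHERE clause and parameter list for selecting cached pages."""
    clauses: List[str] = []
    params: List[str] = []
    if key_filter:
        # Matches any part of ACTUAL_KEY, as in ACTUAL_KEY LIKE %value%.
        clauses.append("actual_key LIKE ?")
        params.append(f"%{key_filter}%")
    if start_creation:
        clauses.append("creation_time >= ?")
        params.append(start_creation)
    if end_creation:
        clauses.append("creation_time <= ?")
        params.append(end_creation)
    if not clauses:
        # With no criteria, only expired entries are selected (the cleanup case).
        clauses.append("expiration_time < CURRENT_TIMESTAMP")
    return " AND ".join(clauses), params
```

  • Passing the values as parameters rather than concatenating them into the statement keeps administrator input out of the SQL text. Note that any % or _ characters in the filter would still act as LIKE wildcards in this sketch.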
  • EXAMPLE 1
  • To delete selected dynamic Web pages and related large objects that have been cached:
  • a. Invoke the BEGIN HTML block of the manage_cache.d2w macro.
  • b. Click on the Delete selected dynamic Web pages and related large object files choice.
  • This choice lets you specify a filter and timestamp values for the cached Web pages you want to delete. All expired cached Web pages and LOBs as well as all LOBs referenced by the deleted Web pages are also deleted.
  • c. Optionally type a string in the Enter the ACTUAL_KEY filter field that matches any part of the ACTUAL_KEY for the Web pages to be deleted. This string acts as a filter for selecting the cached Web pages that Net.Data deletes. The string can contain up to 250 characters. For example, when the following string is entered:
    /netdata/macros/my_macro.d2w/report
    Net.Data deletes all cached Web pages that have an ACTUAL_KEY value containing this string.
  • d. Optionally click on the Starting CREATION_TIME check box and enter a timestamp value. Net.Data deletes all cached Web pages that have creation times greater than or equal to this timestamp value, and that have creation times less than or equal to the Ending CREATION_TIME, if specified. If no Ending CREATION_TIME value is specified, then Net.Data deletes all cached Web pages that have creation times greater than or equal to the Starting CREATION_TIME value. For example, when the following timestamp is entered:
  • Starting CREATION_TIME:
  • Year 1999 Month 03 Day 23 Hour 14 Minute 00 Second 00
  • Net.Data deletes all cached Web pages that were created on or after 2:00 PM on Mar. 23, 1999 up to and including the value of Ending CREATION_TIME, if specified.
  • e. Optionally click on the Ending CREATION_TIME check box and enter a timestamp value. Net.Data deletes all cached Web pages that have creation times less than or equal to this timestamp value, and that have creation times greater than or equal to the Starting CREATION_TIME, if specified. If no Starting CREATION_TIME value is specified, then Net.Data deletes all cached Web pages that have creation times less than or equal to the Ending CREATION_TIME value. For example, when the following timestamp is entered:
  • Ending CREATION_TIME:
  • Year 1999 Month 03 Day 23 Hour 23 Minute 59 Second 59
  • Net.Data deletes all cached Web pages that were created on or before 11:59:59 PM on Mar. 23, 1999, starting with the value of Starting CREATION_TIME, if specified.
  • If the Enter the ACTUAL_KEY filter field is empty and neither of the check boxes is checked, Net.Data deletes only expired cached Web pages and LOBs.
  • f. Click the EXECUTE push button to proceed, or select Back to the beginning to return to the main page and cancel your request.
  • EXAMPLE 2
  • To delete large objects in HFS and related Web pages that have been cached:
      • (i) Invoke the BEGIN HTML block of the manage_cache.d2w macro.
      • (ii) Click on the Delete selected large object files and related dynamic Web pages choice.
        • This choice lets you specify timestamp values for the LOBs you want to delete. All expired cached Web pages and LOBs as well as all LOBs referenced by deleted Web pages are also deleted.
      • (iii) Optionally click on the Starting CREATION_TIME check box and enter a timestamp value. Net.Data deletes all LOB files that have creation times greater than or equal to this timestamp value, and that have creation times less than or equal to the Ending CREATION_TIME, if specified. If no Ending CREATION_TIME value is specified, then Net.Data deletes all LOB files that have creation times greater than or equal to the Starting CREATION_TIME value. For example, when the following timestamp is entered:
        • Starting CREATION_TIME:
        • Year 1999 Month 03 Day 23 Hour 14 Minute 00 Second 00
  • Net.Data deletes all LOBs that were created on or after 2:00 PM on Mar. 23, 1999, up to and including the value of Ending CREATION_TIME, if specified.
      • (iv) Optionally click on the Ending CREATION_TIME check box and enter a timestamp value. Net.Data deletes all LOB files that have creation times less than or equal to this timestamp value, and that have creation times greater than or equal to the Starting CREATION_TIME, if specified. If no Starting CREATION_TIME value is specified, then Net.Data deletes all LOB files that have creation times less than or equal to the Ending CREATION_TIME value. For example, when the following timestamp is entered:
        • Ending CREATION_TIME:
        • Year 1999 Month 03 Day 23 Hour 23 Minute 59 Second 59
        • Net.Data deletes all LOB files that were created on or before 11:59:59 PM on Mar. 23, 1999, starting with the value of Starting CREATION_TIME, if specified.
        • If neither of the check boxes is checked, Net.Data deletes only expired LOB files and cached Web pages.
      • (v) Click the EXECUTE push button to proceed, or select Back to the beginning to return to the main page and cancel your request.
  • 9. Security Considerations and Guidelines
  • The userid(s) associated with the threads or processes that execute management system requests must have the EXECUTE privilege for the stored procedure used to add dynamic web pages to the cache and to add dependency information to the dynamic web page/LOB dependency table. Additionally, the userid(s) associated with the threads or processes that execute management system requests must have the authority to add files to the HTML_PATH directory.
  • The userid(s) associated with the threads or processes that execute the management system requests must also have the EXECUTE privilege for the stored procedure used to automatically manage the dynamic web page and LOB caches. The userid(s) used for more advanced management of LOBs and cached web pages must have the EXECUTE privilege for the stored procedure used to manually manage the dynamic web page and LOB caches.
  • An administrative userid is typically used to create the HTML_PATH directory and prepare the stored procedures for execution. The administrative userid must have the INSERT, SELECT, and DELETE privileges on the web cache table and on the dynamic web page/LOB dependency table.
  • In summary, the preferred embodiment of the present invention provides a technique for managing the contents of a dynamic web page system cache and related data stores containing objects that the web pages reference while also ensuring the completeness of cached web pages that are subsequently reused for display at browsers. The content may be managed by expiration times or in more advanced ways. FIG. 2 is a flow chart illustrating the steps performed in accordance with an embodiment of the present invention.
  • Block 200 represents an exemplary embodiment of the present invention receiving a request to generate a dynamic web page. According to the exemplary embodiment, it is determined whether the requested web page is cached, as represented by Block 201. If the requested web page is not cached, the data is retrieved and placed in a dynamically generated web page, as represented by Block 202. The retrieved data may be linked to other stored data. Block 203 represents the determination as to whether the web page should be cached. Block 204 represents the present invention caching the retrieved data and the linked data. If the requested web page is cached, the web page is retrieved from the cache, as represented by Block 205. The present invention then manages the cached data, as represented by Block 206.
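  • The request path through Blocks 200 to 205 can be summarized in a short Python sketch. The in-memory dictionary and the two callables are hypothetical stand-ins for the cache table, the page generator, and the caching directive check. Block 206 (cache management) runs separately and is not shown.

```python
from typing import Callable, Dict


def handle_request(
    key: str,
    cache: Dict[str, str],
    build_page: Callable[[str], str],
    should_cache: Callable[[str], bool],
) -> str:
    """Serve a page from the cache, or construct it and optionally cache it."""
    if key in cache:          # Block 201: is the requested page cached?
        return cache[key]     # Block 205: retrieve the page from the cache
    page = build_page(key)    # Block 202: retrieve data and generate the page
    if should_cache(key):     # Block 203: should the page be cached?
        cache[key] = page     # Block 204: cache the page
    return page
```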
  • CONCLUSION
  • This concludes the description of the preferred embodiment of the invention. The following describes some alternative embodiments for accomplishing the present invention. For example, any type of computer, such as a mainframe, minicomputer, or personal computer, or computer configuration, such as a timesharing mainframe, local area network, or standalone personal computer, could be used with the present invention.
  • The foregoing description of the preferred embodiment of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form disclosed. Many modifications and variations are possible in light of the above teaching. It is intended that the scope of the invention be limited not by this detailed description, but rather by the claims appended hereto.

Claims (24)

1. An apparatus for processing a request that requires the dynamic generation of a web page, the apparatus comprising:
a computer; and
one or more programs, performed by the computer, for:
determining whether to respond to a request for a web page by retrieving the web page from a cache or by constructing the web page;
if it is determined that the request is to be responded to by constructing the web page,
retrieving data and placing data in the web page, wherein the data is linked to other objects,
determining that the web page is to be cached, wherein the web page references the other objects;
storing the referenced objects in one or more data stores; and
caching the web page in the cache;
if it is determined that the request is to be responded to by retrieving the web page from the cache, retrieving the web page from the cache;
automatically managing the cached web page and the referenced objects to ensure the display of a complete web page; and
when one or more of the referenced objects is deleted, deleting the web page from the cache,
wherein a system initialization file comprises at least one caching directive which is used in determining whether to cache the constructed web page.
2. The apparatus of claim 1, further comprising, when the web page is deleted from the cache, deleting the referenced objects.
3. The apparatus of claim 1, further comprising, prior to determining that a web page is to be cached, one or more computer programs, performed by the computer, for retrieving data and placing the data in a dynamically generated web page, wherein the data is linked to other stored objects.
4. The apparatus of claim 3, wherein managing the cached data comprises one or more computer programs, performed by the computer, for receiving a request from an administrator to delete the retrieved data based on an administrator-provided input, and deleting the retrieved data based on the administrator-provided input.
5. The apparatus of claim 3, wherein managing the cached data comprises one or more computer programs, performed by the computer, for receiving a request from an administrator to delete the linked objects based on a second user provided input, and deleting the linked objects based on the administrator-provided input.
6. The apparatus of claim 1, further comprising one or more computer programs, performed by the computer, for processing a caching directive that specifies whether the web page should be cached.
7. The apparatus of claim 1, further comprising one or more computer programs, performed by the computer, for associating an expiration timestamp with the web page, wherein the expiration timestamp defines a time period in which the cached web page is valid.
8. The apparatus of claim 7, wherein managing the cached web page and referenced objects further comprises one or more computer programs, performed by the computer, for automatically deleting the web page and the referenced objects when the expiration timestamp precedes a current timestamp.
9. The apparatus of claim 8, wherein deleting further comprises one or more computer programs, performed by the computer, for first, deleting the web page and second, deleting the referenced objects.
10. The apparatus of claim 7, wherein managing the web page and referenced objects comprises one or more computer programs, performed by the computer, for receiving a request from an administrator to delete all cached web pages according to some administrator-specified selection criteria, and deleting all cached web pages and referenced objects that satisfy the administrator-specified selection criteria.
11. The apparatus of claim 10, wherein deleting further comprises one or more computer programs, performed by the computer, for first, deleting the web page and second, deleting the referenced objects.
12. The apparatus of claim 1, wherein at least one of the referenced objects is not stored in said cache.
13. An article of manufacture comprising a computer program carrier readable by a computer and embodying one or more instructions executable by the computer to perform method steps for responding to a request for a web page, comprising:
determining whether to respond to a request for a web page by retrieving the web page from a cache or by constructing the web page;
if it is determined that the request is to be responded to by constructing the web page,
retrieving data and placing data in the web page, wherein the data is linked to other objects,
determining that the web page is to be cached, wherein the web page references the other objects;
storing the referenced objects in one or more data stores; and
caching the web page in the cache;
if it is determined that the request is to be responded to by retrieving the web page from the cache, retrieving the web page from the cache;
automatically managing the cached web page and the referenced objects to ensure the display of a complete web page; and
when one or more of the referenced objects is deleted, deleting the web page from the cache,
wherein a system initialization file comprises at least one caching directive which is used in determining whether to cache the constructed web page.
14. The article of manufacture of claim 13, further comprising, when the web page is deleted, deleting the referenced objects.
15. The article of manufacture of claim 13, further comprising, prior to determining that a web page is to be cached:
retrieving data and placing the data in a dynamically generated web page, wherein the data is linked to other stored objects.
16. The article of manufacture of claim 15, wherein managing the cached web page and referenced objects comprises the steps of:
receiving a request from an administrator to delete the retrieved data based on an administrator-provided input; and
deleting the retrieved data based on the administrator-provided input.
17. The article of manufacture of claim 15, wherein managing the web page and referenced objects comprises the steps of:
receiving a request from an administrator to delete the linked objects based on an administrator-provided input; and
deleting the linked objects based on the administrator-provided input.
18. The article of manufacture of claim 13, further comprising processing a caching directive that specifies whether the web page should be cached.
19. The article of manufacture of claim 13, further comprising associating an expiration timestamp with the web page, wherein the expiration timestamp defines a time period in which the cached web page is valid.
20. The article of manufacture of claim 19, wherein managing the cached web page and referenced objects further comprises automatically deleting the web page and the referenced objects when the expiration timestamp precedes a current timestamp.
21. The article of manufacture of claim 20, wherein deleting further comprises first, deleting the web page and second, deleting the referenced objects.
22. The article of manufacture of claim 19, wherein managing the cached web page and referenced objects comprises the steps of:
receiving a request from an administrator to delete all cached web pages according to some administrator-specified selection criteria; and
deleting all cached web pages and referenced objects that satisfy the administrator-specified selection criteria.
23. The article of manufacture of claim 22, wherein deleting further comprises first, deleting the web page and second, deleting the referenced objects.
24. The article of manufacture of claim 13, wherein at least one of the referenced objects is not stored in said cache.
US11/967,525 1999-06-24 2007-12-31 Technique for Maintaining and Managing Dynamic Web Pages Stored in a System Cache and Referenced Objects Cached in Other Data Stores Abandoned US20080155056A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/967,525 US20080155056A1 (en) 1999-06-24 2007-12-31 Technique for Maintaining and Managing Dynamic Web Pages Stored in a System Cache and Referenced Objects Cached in Other Data Stores

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14071199P 1999-06-24 1999-06-24
US09/602,412 US7343412B1 (en) 1999-06-24 2000-06-23 Method for maintaining and managing dynamic web pages stored in a system cache and referenced objects cached in other data stores
US11/967,525 US20080155056A1 (en) 1999-06-24 2007-12-31 Technique for Maintaining and Managing Dynamic Web Pages Stored in a System Cache and Referenced Objects Cached in Other Data Stores

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/602,412 Continuation US7343412B1 (en) 1999-06-24 2000-06-23 Method for maintaining and managing dynamic web pages stored in a system cache and referenced objects cached in other data stores

Publications (1)

Publication Number Publication Date
US20080155056A1 true US20080155056A1 (en) 2008-06-26

Family

ID=39155450

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/602,412 Expired - Fee Related US7343412B1 (en) 1999-06-24 2000-06-23 Method for maintaining and managing dynamic web pages stored in a system cache and referenced objects cached in other data stores
US11/967,525 Abandoned US20080155056A1 (en) 1999-06-24 2007-12-31 Technique for Maintaining and Managing Dynamic Web Pages Stored in a System Cache and Referenced Objects Cached in Other Data Stores

Country Status (1)

Country Link
US (2) US7343412B1 (en)

Patent Citations (30)

Publication number Priority date Publication date Assignee Title
US6298373B1 (en) * 1996-08-26 2001-10-02 Microsoft Corporation Local service provider for pull based intelligent caching system
US5961602A (en) * 1997-02-10 1999-10-05 International Business Machines Corporation Method for optimizing off-peak caching of web data
US6421733B1 (en) * 1997-03-25 2002-07-16 Intel Corporation System for dynamically transcoding data transmitted between computers
US6182122B1 (en) * 1997-03-26 2001-01-30 International Business Machines Corporation Precaching data at an intermediate server based on historical data requests by users of the intermediate server
US6173322B1 (en) * 1997-06-05 2001-01-09 Silicon Graphics, Inc. Network request distribution based on static rules and dynamic performance data
US6269403B1 (en) * 1997-06-30 2001-07-31 Microsoft Corporation Browser and publisher for multimedia object storage, retrieval and transfer
US5991810A (en) * 1997-08-01 1999-11-23 Novell, Inc. User name authentication for gateway clients accessing a proxy cache server
US6026413A (en) * 1997-08-01 2000-02-15 International Business Machines Corporation Determining how changes to underlying data affect cached objects
US20010034814A1 (en) * 1997-08-21 2001-10-25 Michael D. Rosenzweig Caching web resources using varied replacement sttrategies and storage
US6327598B1 (en) * 1997-11-24 2001-12-04 International Business Machines Corporation Removing a filled-out form from a non-interactive web browser cache to an interactive web browser cache
US20020178232A1 (en) * 1997-12-10 2002-11-28 Xavier Ferguson Method of background downloading of information from a computer network
US6141759A (en) * 1997-12-10 2000-10-31 Bmc Software, Inc. System and architecture for distributing, monitoring, and managing information requests on a computer network
US6298356B1 (en) * 1998-01-16 2001-10-02 Aspect Communications Corp. Methods and apparatus for enabling dynamic resource collaboration
US6119153A (en) * 1998-04-27 2000-09-12 Microsoft Corporation Accessing content via installable data sources
US6314492B1 (en) * 1998-05-27 2001-11-06 International Business Machines Corporation System and method for server control of client cache
US6185608B1 (en) * 1998-06-12 2001-02-06 International Business Machines Corporation Caching dynamic web pages
US6334145B1 (en) * 1998-06-30 2001-12-25 International Business Machines Corporation Method of storing and classifying selectable web page links and sublinks thereof to a predetermined depth in response to a single user input
US6453342B1 (en) * 1998-12-03 2002-09-17 International Business Machines Corporation Method and apparatus for selective caching and cleaning of history pages for web browsers
US6345292B1 (en) * 1998-12-03 2002-02-05 Microsoft Corporation Web page rendering architecture
US7100106B1 (en) * 1998-12-14 2006-08-29 Microsoft Corporation Mirroring operations performed on linked files and folders
US6522738B1 (en) * 1998-12-16 2003-02-18 Nortel Networks Limited Web site content control via the telephone
US6351767B1 (en) * 1999-01-25 2002-02-26 International Business Machines Corporation Method and system for automatically caching dynamic content based on a cacheability determination
US6408360B1 (en) * 1999-01-25 2002-06-18 International Business Machines Corporation Cache override control in an apparatus for caching dynamic content
US6272598B1 (en) * 1999-03-22 2001-08-07 Hewlett-Packard Company Web cache performance by applying different replacement policies to the web cache
US6578078B1 (en) * 1999-04-02 2003-06-10 Microsoft Corporation Method for preserving referential integrity within web sites
US6542967B1 (en) * 1999-04-12 2003-04-01 Novell, Inc. Cache object store
US6526580B2 (en) * 1999-04-16 2003-02-25 Digeo, Inc. Broadband data broadcasting service
US20020056100A1 (en) * 1999-04-16 2002-05-09 Tsutomu Shimomura A broadband data broadcasting service
US6601090B1 (en) * 1999-06-25 2003-07-29 Nortel Networks Limited System and method for servicing internet object accessess from a coupled intranet
US20020198993A1 (en) * 2000-09-01 2002-12-26 Ncr Corporation Downloading and uploading data in information networks

Cited By (14)

Publication number Priority date Publication date Assignee Title
US7594001B1 (en) * 2001-07-06 2009-09-22 Microsoft Corporation Partial page output caching
US9026578B2 (en) 2004-05-14 2015-05-05 Microsoft Corporation Systems and methods for persisting data between web pages
US9436572B2 (en) 2007-07-19 2016-09-06 Ebay Inc. Method and system to detect a cached web page
US20090024801A1 (en) * 2007-07-19 2009-01-22 Ebay Inc. Method and system to detect a cached web page
US8745164B2 (en) 2007-07-19 2014-06-03 Ebay Inc. Method and system to detect a cached web page
US20090150435A1 (en) * 2007-12-08 2009-06-11 International Business Machines Corporation Dynamic updating of personal web page
US20120233199A1 (en) * 2011-03-10 2012-09-13 Jenkins Jeffrey R Intelligent Web Caching
US20130086323A1 (en) * 2011-09-30 2013-04-04 Oracle International Corporation Efficient cache management in a cluster
US9112922B2 (en) * 2012-08-28 2015-08-18 Vantrix Corporation Method and system for self-tuning cache management
US20140068196A1 (en) * 2012-08-28 2014-03-06 Louis Benoit Method and system for self-tuning cache management
US9811470B2 (en) 2012-08-28 2017-11-07 Vantrix Corporation Method and system for self-tuning cache management
US20140122575A1 (en) * 2012-10-29 2014-05-01 Fujitsu Limited Tunnel communication system
EP3220607A4 (en) * 2014-11-11 2018-08-22 Alibaba Group Holding Limited Service data processing method, device and system
US10642907B2 (en) 2014-11-11 2020-05-05 Alibaba Group Holding Limited Processing service data

Also Published As

Publication number Publication date
US7343412B1 (en) 2008-03-11

Similar Documents

Publication Publication Date Title
US7343412B1 (en) Method for maintaining and managing dynamic web pages stored in a system cache and referenced objects cached in other data stores
US6832222B1 (en) Technique for ensuring authorized access to the content of dynamic web pages stored in a system cache
US6804674B2 (en) Scalable Content management system and method of using the same
US7167874B2 (en) System and method for command line administration of project spaces using XML objects
US7523173B2 (en) System and method for web page acquisition
US5649185A (en) Method and means for providing access to a library of digitized documents and images
US8458163B2 (en) System and method for enabling website owner to manage crawl rate in a website indexing system
US6920455B1 (en) Mechanism and method for managing service-specified data in a profile service
US8255430B2 (en) Shared namespace for storage clusters
CA2330664C (en) A co-presence data retrieval system
US11436201B2 (en) Network accessible file server
US7801850B2 (en) System of and method for transparent management of data objects in containers across distributed heterogenous resources
US6564218B1 (en) Method of checking the validity of a set of digital information, and a method and an apparatus for retrieving digital information from an information source
US20170124111A1 (en) System And Method For Synchronizing File Systems With Large Namespaces
US8156227B2 (en) System and method for managing multiple domain names for a website in a website indexing system
US11442902B2 (en) Shard-level synchronization of cloud-based data store and local file system with dynamic sharding
US11640374B2 (en) Shard-level synchronization of cloud-based data store and local file systems
WO2017223265A1 (en) Shard-level synchronization of cloud-based data store and local file systems
US8533226B1 (en) System and method for verifying and revoking ownership rights with respect to a website in a website indexing system
US6519610B1 (en) Distributed reference links for a distributed directory server system
US7917609B2 (en) Method and apparatus for managing lightweight directory access protocol information
JP2004013258A (en) Information filtering system
US7596564B1 (en) Method and system for cache management of a cache including dynamically-generated content
US20120078862A1 (en) Hybrid off-peak and just-in-time integration
JPH11338796A (en) Information distribution system

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation. Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION