WO2002025499A1

WO2002025499A1 - Method for extracting and storing records of data backup activity from a plurality of backup devices

Info

Publication number: WO2002025499A1
Application number: PCT/US2001/029435
Authority: WO
Inventors: Cory Bear; Liam Scanlan
Original assignee: Bocada, Inc.
Priority date: 2000-09-19
Filing date: 2001-09-19
Publication date: 2002-03-28
Also published as: EP1330722A1; AU2001291169A1; AU2001292862A1; EP1330722A4; WO2002025498A1; AU2001292863A1; WO2002025462A1

Abstract

A method and system for requesting, cross-referencing, extracting and storing records of data backup activity (s3) by using a software component that interfaces to a plurality of data backup software devices is disclosed. A method of storing automated request for records of data backup activity schedules (s6) is disclosed; invoking request through a component that interfaces to a plurality of data backup software devices; receiving records of data backup activity from said component; making alterations to said records of data backup activity and inserting subsets of said records of data backup activity into a table related to said central data table (s9).

Description

METHOD FOR EXTRACTING AND STORING RECORDS OF DATA BACKUP ACTIVITY FROM A PLURALITY OF BACKUP DEVICES

Cross-Reference to Related Applications

Patent Application of Cory Bear and Liam Scanlan, serial number 09/665,270, entitled "EXTENSIBLE METHOD FOR OBTAINING AN HISTORICAL RECORD OF DATA BACKUP ACTIVITY (AND ERRORS) AND CONVERTING SAME INTO CANONICAL FORMAT;" and Patent Application of Liam Scanlan and Cory Bear, serial number 09/665,269 entitled "METHOD FOR VISUALIZING DATA BACKUP ACTIVITY FROM A PLURALITY OF BACKUP DEVICES" which are incorporated herein by reference.

Field of the Invention

The present invention is related generally to electronic/software data backup and more particularly to simultaneous and seamless examination of such data backup activity performed across a plurality of data backup software programs.

Background of the Invention

Most data backup software devices in use today provide for the repeated, regular electronic transfer, over a network, of data from the point at which it is in regular use to a medium, such as magnetic tape, for the purposes of securing a fall-back situation should damage occur to the original data. Included in the list of such software programs are programs that work on relatively small amounts of data, sometimes on a one-computer-to-one-tape-drive basis, and others that work on very large amounts of data, with banks of tape drives that are used to back up data from potentially thousands of computers connected to a network. Mostly, these data backup software products use what is known as a "client/server" model. In the context of data backup, this means that there is one computer (the "server") that controls and manages the actual data backup activity, and other computers (the "clients") that get backed up by the "server". In this scenario, the data backup tape drives are usually connected directly to the backup "server". There is also usually more than one backup server, each of which is responsible for the backup of data of numerous clients.

A central function of the activity of data backup is the ability to "restore" data in the case of damage to the data that is in use. The backup server computer usually controls this restore process. Understandably, the time it takes to recover data, and the confidence that the data recovery process will succeed, are two critical aspects of the data backup and restore function as a whole. Disk drive capacities and data volumes, and consequently the volumes of data to be backed up, have historically been increasing at a greater rate than the backup server speed, tape drive capacity and network bandwidth have been increasing to handle them. Accordingly, new technologies have been added to help. Such new technologies include fiber-optic cables (for fast data transfer across the network), faster chips, tape drives that handle more tapes, faster tape drives, "Storage Area Networks" and so on. The activity of data backup has become more and more critical, as the importance of the data has increased. At the advent of the desktop "revolution", that is, when people first started using personal computers (PCs), almost every piece of important data was still stored on one, single computer, possibly a mainframe or a minicomputer. As the numbers and types of computers proliferated, particularly on the desktop, and as the purposes for which these desktops were used made the data on such computers increasingly valuable, many different products designed to back up data were created and put into the marketplace. Now, there are some 50 or more data backup products in use by organizations and private individuals.

Generally, but not always, such data backup software devices (products) have a reputation for being difficult to use. When there is an exception to this, the data backup software product often has other, perhaps related, limitations (e.g. the amount of data it can back up is small). Not all data backup software devices perform the same function. Thus, it is frequently necessary to have two or more different types of data backup software programs in use within the same organization, especially in large organizations. Anecdotally, one company has as many as 17 different data backup software devices in use somewhere in their organization. This is referred to as fragmentation. In large organizations, it has become necessary to hire expensive expertise to manage such large data backup and restore services. The more varied their data backup devices, the more expensive this becomes. Also, for large organizations, it has become increasingly likely that scheduled data backup activities will fail. Because of the extra complexity of running a variety of data backup software devices, and because of the sheer number of data backup activities that need to take place regularly, failed data backups often go unnoticed in a sea of less-relevant data backup information.

An additional problem is that beyond a certain number of hours, perhaps minutes, if identifying a failed data backup takes too long, then it often becomes too late for meaningful corrective action to be taken. As a result, large organizations often take an expensive "best guess" approach. Anecdotally, the level of confidence that large organizations live with regarding data backup success is said to be about 80%. In other words, it is expected that no more than 4 out of 5 data backups will be successful. Almost every large organization will relate experiences where data was lost because they mistakenly believed the data had been backed up.

In the marketplace today there are several data backup reporting products available. Each works with only one data backup software device. There are no known patents relating to either of these two products.

1. Legato GEMS Reporter™, which provides trend analysis and text-based failures analysis. This product works with Legato NetWorker. It is built to handle up to approximately 4 or 5 average-sized backup servers.

2. Veritas Advanced Reporter™ 3.2 from Veritas is similar to GEMS Reporter.

3. SAMS Vantage™ provides statistical reports from data backup activity of Computer Associates' ArcServeIT product.

Accordingly, an OPEN relational database is required to enable the cross-referencing of historical data backup activity across a plurality of data backup software devices. This is because to examine the data, 3rd party reporting/querying tools are generally used, and such tools generally only work with OPEN relational databases.

Summary of the Invention

In accordance with the present invention, an automated software device is provided for the extraction of historical records of data backup activity from a plurality of data backup software devices, and the storing of those records in a general-purpose relational database.

Accordingly, besides the objects and advantages stated in our above patent, several objects and advantages of the present invention are:

(a) Provides the ability to automate, requiring little or no further intervention, the regular and scheduled extraction of historical records of data backup activity from a plurality of data backup software devices.

(b) Consolidates and cross-references the historical records of data backup activity into a single database, thereby providing the ability to create reports on data backup activity from a plurality of different data backup software devices.

(c) Exposes those consolidated and cross-referenced historical records of data backup activity in an open relational database, thereby enabling users of the invention to use 3rd party reporting/querying products to create their own, perhaps unique reports.

(d) Provides such consolidation of historical records of data backup activity without the need to install additional software devices at the source of those data records.

Brief Description of the Drawings

Figure 1: A flowchart describing how historical records of data backup activity are requested and received from the Database Update Service (DUS), and then stored in the database.

Figure 2: a list of the fields in the zrequests table.

Figure 3: A list of the fields returned by the embedded BX showing where each field flows to, where processes p1, p2, p3 and p4 are applied to some of the fields. It also illustrates which fields in the backups table record those data are inserted into. It also shows a list of fields in the servers table (ST) with an indication of the field ST3 being used.

Figure 4: A list of the fields returned by the embedded BX showing where each field flows, where process p2 is being applied to field CT4. It also shows which fields in the clients table (CT) those fields are inserted into.

Figure 5: A list of the fields returned by the embedded BX showing where each field flows. It also shows which fields in the validtargets table (VT) those fields are inserted into.

Figure 6: A list of the fields returned by the embedded BX showing where each field flows. It also shows which fields in the backupproducts table (PT) those fields are inserted into.

Figure 7: A list of the fields returned by the embedded OCX (BX) showing where each field flows to. It also shows which fields in the levels table (LT) those fields are inserted into.

Detailed Description of a Preferred Embodiment

Terminology

"Batches" of data

In this document we refer to "batches" of data. This term means a collection of one or more records of the same format each containing an identical set of fields, each field having a different form, purpose and content from other fields in the same record, and each field containing potentially different data from the same field in other records in the batch.

"SQL" (Structured Query Language)

In this document there are several references to, and examples of, SQL statements that are routinely used in the Database Update Service (DUS) program. SQL, or "Structured Query Language" is a decades old "English language-like" computer language invented by IBM Corporation to facilitate the insertion, manipulation and retrieval of data in a database.

Typical SQL statements begin with words such as INSERT, or SELECT, or DELETE.

Using SQL statements, it is possible to "join" two (or more than two) related tables of data together with the intention of retrieving a batch of data that contains data from each of the tables. That is called a "JOIN", and examples of JOINs can be seen in this document.

There are several "versions" of SQL, but most of the versions are so similar that they are almost indistinguishable from one another. In the preferred embodiment, only the version known as ANSI SQL (American National Standards Institute SQL) is used.

Relational Database

A relational database allows the definition of data structures, storage and retrieval operations and integrity constraints. In such a database the data and relations between them are organized in tables. A table is a collection of records and each record in a table contains the same fields. Certain fields may be designated as "keys", which means that searches for specific values of that field will use indexing to speed them up.

Open Relational Database

An Open Relational Database is a relational database that is accessible using data analysis tools generally available on the market, for example, Crystal Reports™. In this embodiment,

RDB

The term RDB, an acronym for Relational Database, is used throughout this document to represent the underlying source of data for reports described in this embodiment. The RDB contains historical records relating to backup activity across a plurality of backup engines. In the preferred embodiment, the RDB resides in an implementation of Microsoft SQL Server™ (described above).

Backup

The term Backup means the actual transfer of data that is in regular use, usually across a network, to a data storage medium, such as a magnetic tape, for the purposes of retrieval at a later date, should the data in regular use become damaged.

Backup Engine

The term Backup Engine means any software product or program that is used for the purposes of Backup described in the previous paragraph. Examples include Legato NetWorker™, Veritas BackupExec™ and BakBone NetVault™.

BX

This term is used throughout this document to denote a software component that provides an interface to a plurality of backup engines. By interface, it is meant the ability to request and receive historical records of backup activity from those backup engines. This software component is not part of this invention, but is described in detail in the accompanying patent application referenced at the beginning of this application. In the preferred embodiment described in this document, BX behaves as described in that patent application.

With reference to Figure 1, there is a program called Database Update Service (referred to hereafter as "DUS" in this document) running "in the background" on the computer. Running in the background allows DUS to proceed with its tasks while someone may be using the computer to perform other tasks (for example, creating a spreadsheet). This is possible because DUS does not need any direct user intervention in order to operate. In the preferred embodiment, it runs as a Windows NT Service.

Every several seconds, the program "wakes up" (see Figure 1 S1), and checks if the hour has changed from the last time it checked it a few seconds ago (see Figure 1 S2), in other words, has the current time on the computer passed to a new hour of the day. The program gets the hour by using this short piece of standard Delphi program code:

FormatDateTime('hh', now).

Every time it is checked, it is stored in memory so that it can be compared for any change with the next time the hour is retrieved a few seconds later.
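The hour comparison described above can be sketched as follows. This is a minimal illustration in Python rather than the Delphi of the preferred embodiment; the class name HourWatcher and its method are hypothetical, and the "%H" format is used here as a rough analogue of the 'hh' format string.

```python
from datetime import datetime


class HourWatcher:
    """Remembers the last hour seen and reports when it changes (hypothetical helper)."""

    def __init__(self, now: datetime):
        # Keep the hour in memory as a two-digit string, e.g. "06".
        self.last_hour = now.strftime("%H")

    def hour_changed(self, now: datetime) -> bool:
        # Compare the current hour with the stored one, then store the new value
        # so that the next wake-up compares against it.
        current = now.strftime("%H")
        changed = current != self.last_hour
        self.last_hour = current
        return changed


watcher = HourWatcher(datetime(2000, 8, 24, 6, 59, 57))
same_hour = watcher.hour_changed(datetime(2000, 8, 24, 6, 59, 59))
new_hour = watcher.hour_changed(datetime(2000, 8, 24, 7, 0, 6))
```

In this sketch the caller supplies the clock reading, which keeps the comparison easy to exercise without waiting for a real hour boundary.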

If the hour has changed

First, all records in the table RT (see Figure 2) whose status field RT5 (see Figure 2) has the value "Finished" are deleted using a SQL statement understood by any person skilled in the art. Then DUS scans the servers table ST (see Figure 3), looking at field ST4 (see Figure 3).

In the preferred embodiment, for each hour of the day there is a digit in field ST4 (see Figure 3). Hour 0 (any time between midnight and just before 1AM) is represented by the first digit; 3AM is represented by the 4th digit, and so on. Typically, the string might look like this:

000100000000000000000010

Note that there are 24 digits, each of which can be either 0 or 1.

In the above example, the program would automatically create a request for update at 3AM and at 10PM every day because the 4th and 23rd digits are set to "1".

If a particular hour's digit is "1", then it means that a refresh must be requested S3 (see Figure 1) at that hour. This is done by programmatically inserting a record into the table RT (see Figure 2) for that server/backup product combination.
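The 24-digit schedule string can be decoded as in the following sketch. The function names refresh_due and scheduled_hours are hypothetical and are not taken from the DUS program; the validation step is likewise an assumption added for illustration.

```python
def refresh_due(refresh_string: str, hour: int) -> bool:
    """Return True if the digit for the given hour (0-23) is '1'."""
    if len(refresh_string) != 24 or set(refresh_string) - {"0", "1"}:
        raise ValueError("expected exactly 24 digits, each 0 or 1")
    # Hour 0 is the first digit, so the hour is used directly as a 0-based index.
    return refresh_string[hour] == "1"


def scheduled_hours(refresh_string: str) -> list:
    """List every hour of the day at which a refresh would be requested."""
    return [hour for hour in range(24) if refresh_due(refresh_string, hour)]


hours = scheduled_hours("000100000000000000000010")  # 3AM and 10PM
```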

In the preferred embodiment, for optimization, a SQL statement is created that links directly to field ST4 (see Figure 3) and if a given hour is 1, then a request is inserted. This is what the SQL statement will typically look like:

INSERT INTO zrequests (request, reference, backupproductname, requestorname, status, taskid, requestdatetime)

SELECT 'Refresh', servers.servername, servers.backupproductname, 'SYSTEM (scheduled)', 'Waiting', 0, '8/24/00 07:00:06'

FROM servers

WHERE SUBSTRING(refreshstring, 7, 1)='1'

The SQL syntax above is standard ANSI SQL and can be understood by anyone with average experience of the SQL language.

Checks for any refresh requests

The next thing DUS checks is whether there is at least one refresh request in the table RT (see Figure 2) whose status field RT5 (see Figure 2) is set to "Waiting".
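The scheduled insertion and the subsequent check for waiting requests can be tried out with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for the SQL Server database of the preferred embodiment, so substr() replaces SUBSTRING. The function names, the second server row and the choice of "hour plus one" as the substring position (following the statement that hour 0 is the first digit) are assumptions made for illustration.

```python
import sqlite3


def queue_scheduled_refreshes(conn, hour, request_time):
    """Insert a 'Waiting' request for every server whose digit for this hour is 1."""
    conn.execute(
        """
        INSERT INTO zrequests (request, reference, backupproductname,
                               requestorname, status, taskid, requestdatetime)
        SELECT 'Refresh', servername, backupproductname,
               'SYSTEM (scheduled)', 'Waiting', 0, ?
        FROM servers
        WHERE substr(refreshstring, ?, 1) = '1'
        """,
        (request_time, hour + 1),  # substr() positions start at 1
    )


def first_waiting_request(conn):
    """Return (reference, backupproductname) of the oldest waiting request, or None."""
    return conn.execute(
        "SELECT reference, backupproductname FROM zrequests "
        "WHERE status = 'Waiting' ORDER BY rowid LIMIT 1"
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE servers (servername TEXT, backupproductname TEXT, refreshstring TEXT);
    CREATE TABLE zrequests (request TEXT, reference TEXT, backupproductname TEXT,
                            requestorname TEXT, status TEXT, taskid INTEGER,
                            requestdatetime TEXT);
    """
)
conn.execute(
    "INSERT INTO servers VALUES (?, ?, ?)",
    ("skylab.backupreport.com", "NetWorker", "000100000000000000000010"),
)
# Hypothetical second server with no scheduled refreshes at all.
conn.execute(
    "INSERT INTO servers VALUES (?, ?, ?)",
    ("mir.example.com", "BackupExec", "0" * 24),
)

queue_scheduled_refreshes(conn, 3, "8/24/00 03:00:06")
waiting = first_waiting_request(conn)
```

Passing the hour and timestamp as parameters, rather than building the statement text piece by piece, is a design choice of this sketch and not something the source prescribes.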

Sends refresh request to the BX

If there is at least one record in the table RT, then the first request it finds is taken, and a request is made of the BX for specific historical records of data backup activity.

In the preferred embodiment, to economize on resources, the request specifies a "from date", in other words, the date on which (and after which) historical records of data backup activity from the server should be retrieved. This "from date" is taken from field ST5 (see Figure 3). How the request to BX is made is detailed in the next paragraph.

This request to the BX for historical records of data backup activity is very simple, and takes the form of BX.RequestBackupLogs, which requires as arguments the name of the server, the name of the data backup product and the date since which historical records of data backup activity are to be retrieved. The BX, to acknowledge its receipt of the request, returns a field RT6 (see Figure 2). It also "flags" that request as being "in progress" by changing that request record's status field RT5 to "in progress".

Waits for BX to get historical records of data backup activity and return them

The program then waits until BX returns the results of that request before doing anything else.

When the results are returned, they come in the form of a "batch" of records, each having an identical format to one another. Alternatively, there may be an error (such as a network failure) or there may be no historical records of data backup activity to return. In either case, DUS changes the field RT5 (see Figure 2) to the value "Finished" for the record whose taskid is the one DUS just processed.

In the preferred embodiment, changing a request in field RT5 (see Figure 2) in table RT (see Figure 2) to "Finished" is done by constructing a SQL Statement that changes the record whose taskid is the one that was supplied by BX and that DUS has just finished processing. The typical SQL statement to do this looks like this:

UPDATE zrequests SET status='Finished' WHERE taskid=1001

Cycles through batch of records from BX

DUS then cycles through each record in the batch, and for each of those records, prepares a record for insertion into each of 5 different tables, BT, CT, VT, PT, LT, in the database, illustrated in Figures 3, 4, 5, 6 and 7, respectively. Each table is of a different structure, each with a different purpose.

The program inserts records into database

For each of the 5 table insertions, an industry-standard "insert" SQL statement is created, which is exemplified in this preferred embodiment:

INSERT INTO backups (backupdatetime, servername, clientname, clientfqhostname, clientnickname, targetname, backupproductname, backuplevel, backupcanonicallevel, backupdatetimelocal, backupbytes, backupfilecount, backuperrorcount, dayofweek, hourofday, backupvolume)

VALUES ('8/19/00 23:31:01', 'skylab.backupreport.com', 'skylab.backupreport.com', 'skylab.backupreport.com', 'skylab', 'c:\Financial', 'NetWorker', 'incr', 'Incremental', '8/19/00 23:31:01', 14728, 13, -1, 'Sat', '23', 'tape-020')

This SQL INSERT statement is immediately understandable by anyone skilled in the art. It is constructed piece by piece to contain each of the fields and respective field contents required to add a single, unique record to the backups table. When the SQL statement is prepared, it is then "sent" to the SQL Server to be executed. For each record that the BX provides, the program creates and sends a new SQL Statement.

Insertion into the backups table BT:

These are the steps DUS takes to prepare each SQL Statement for the table BT (see Figure 3):

1. Field BX1 (see Figure 3), the name of the computer that did the actual data backup, is placed directly into field BT2 (see Figure 3).

2. Field BX2 (see Figure 3), the name of the computer from which the backup server took data and backed it up, is placed directly into field BT3 (see Figure 3).

3. Field BX3 (see Figure 3), the complete name, that is, including the domain name that it belongs to, of the computer from which the backup server took data and backed it up, is placed directly into field BT4 (see Figure 3).

4. Field BX3 (see Figure 3) is taken again, and all text to the right of the first period in it, including that period, is stripped off (process p2, figure 3) and what is left is placed in field BT5 (see Figure 3). This gives what is often referred to as the "nickname" or "short name".

5. Field BX4 (see Figure 3), the name of the object that was backed up, example: c:\tempfiles, is taken, and placed directly into field BT6 (see Figure 3).

6. Field BX5 (see Figure 3), the name of the data backup software product that did the backup, is placed directly into field BT7 (see Figure 3).

7. Field BX6 (see Figure 3), the name of the level of backup (for example "incr" meaning "incremental") that took place, is placed directly into field BT8 (see Figure 3).

8. Field BX7 (see Figure 3), the canonical level name, which is the "generalized" name computed by the BX to mean the same kind of backup level regardless of which data backup software product did the backup of data, is placed directly into field BT9 (see Figure 3).

9. Field BX8 (see Figure 3), the backup date and time as it occurred in the time zone in which the backup occurred, is placed directly into field BT10 (see Figure 3). This date and time field does not get changed.

10. Field BX8 (see Figure 3) is taken again, and from it is calculated an adjusted date and time to produce the date and time of the database at the time of the backup (process p1 in figure 3), and the result is placed directly into field BT1 (see Figure 3). For example, if the backup was performed on a server in New York at 9PM, but the database is stored in San Francisco, the calculated time would be three hours less than what was taken from field BX8 (see Figure 3), i.e., the result would be 6PM. The hours difference used for this calculation is taken from field ST3 (see Figure 3) in the servers table ST (see Figure 3) from the record specified by field BX1 (see Figure 3) and field BX5 (see Figure 3).

11. In the preferred embodiment, the contents of field ST3 (see Figure 3) were originally set when the program user added a backup server to the database, and specified the New York time zone.

12. Field BX8 (see Figure 3) is taken again, and applying the Delphi language program code FormatDateTime('ddd' ...), the result gives the 3-letter abbreviation for the day of week that that date falls on (process p3, in figure 3), for example "Tue", "Sat", etc. The result is placed into field BT14 (see Figure 3). In the preferred embodiment, the invention uses Delphi as the software tool/language. It could just as easily have been written in another language like "BASIC" or "C++". Delphi was thought to be the optimal tool for us to use in this invention.

13. Field BX8 (see Figure 3) is taken again, and by standard Delphi program code, taking the 11th and 12th digits, the hour of the day, in military time format, is obtained (process p4). That hour is placed in field BT15 (see Figure 3).

14. Field BX9 (see Figure 3) is taken, and placed directly into field BT11 (see Figure 3).

15. Field BX10 (see Figure 3) is taken, and placed directly into field BT12 (see Figure 3).

16. Field BX11 (see Figure 3) is taken, and placed directly into field BT13 (see Figure 3).
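The four derived fields in the steps above (processes p1 to p4) can be sketched as follows. This is an illustrative Python version, not the Delphi code of the preferred embodiment. All function names are hypothetical, and the sign convention of hours_offset (negative when the database lies to the west of the backup server) is an assumption.

```python
from datetime import datetime, timedelta

# Fixed English abbreviations, so the result does not depend on the system locale.
DAY_ABBREVIATIONS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def database_time(local: datetime, hours_offset: int) -> datetime:
    """p1: shift the backup's local time into the database's time zone."""
    return local + timedelta(hours=hours_offset)


def short_name(fq_host_name: str) -> str:
    """p2: strip everything from the first period onward (the 'nickname')."""
    return fq_host_name.split(".", 1)[0]


def day_of_week(local: datetime) -> str:
    """p3: three-letter abbreviation of the day the backup took place."""
    return DAY_ABBREVIATIONS[local.weekday()]


def hour_of_day(local: datetime) -> str:
    """p4: two-digit hour in 24-hour ('military') format."""
    return local.strftime("%H")


backup_time = datetime(2000, 8, 19, 23, 31, 1)
nickname = short_name("skylab.backupreport.com")
weekday = day_of_week(backup_time)
hour = hour_of_day(backup_time)
# New York 9PM seen from a database in San Francisco, three hours behind.
adjusted = database_time(datetime(2000, 8, 19, 21, 0, 0), -3)
```

The values computed for the sample record agree with the 'skylab', 'Sat' and '23' entries in the example INSERT statement given earlier.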

When all field contents are thus placed in their respective fields in the SQL Statement, the SQL Statement is sent to the SQL Server for execution on the database. If the record already exists, that is, if that specific data backup activity record was already received from BX, the insertion will not occur. This is because the table was designed to allow only unique records. By unique, it is meant that only one record may exist with a specific combination of the following fields: BT1, BT2, BT3, BT6, BT7 in the table BT (see Figure 3).

When all records in the batch from the BX are inserted (or attempted to be inserted), the lastupdated field (ST5, in figure 3) is updated with the latest backup date found in the batch. This allows the next request to narrow down what it asks for by only asking for the "newest" historical records of data backup activity. (See section "Sends refresh request to the BX" earlier in this document).
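The uniqueness rule and the lastupdated bookkeeping can be sketched with Python's built-in sqlite3 module standing in for SQL Server. Several points here are assumptions made for illustration: the reduced set of columns, the store_batch function, the SQLite-specific INSERT OR IGNORE form (the source only says that the insertion "will not occur"), and the ISO-style timestamps, which are used so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE backups (
        backupdatetime TEXT, servername TEXT, clientname TEXT,
        targetname TEXT, backupproductname TEXT, backupbytes INTEGER,
        UNIQUE (backupdatetime, servername, clientname,
                targetname, backupproductname)
    );
    CREATE TABLE servers (servername TEXT, backupproductname TEXT, lastupdated TEXT);
    INSERT INTO servers VALUES ('skylab.backupreport.com', 'NetWorker', NULL);
    """
)


def store_batch(conn, server, product, batch):
    """Insert each record once, then remember the newest backup date in the batch."""
    inserted = 0
    for rec in batch:
        cur = conn.execute(
            "INSERT OR IGNORE INTO backups VALUES (:backupdatetime, :servername, "
            ":clientname, :targetname, :backupproductname, :backupbytes)",
            rec,
        )
        inserted += cur.rowcount  # 0 when the unique constraint rejected the row
    if batch:
        latest = max(rec["backupdatetime"] for rec in batch)
        conn.execute(
            "UPDATE servers SET lastupdated = ? "
            "WHERE servername = ? AND backupproductname = ?",
            (latest, server, product),
        )
    return inserted


def make_record(when, target, size):
    # Hypothetical helper that fills in the fields shared by the sample records.
    return {
        "backupdatetime": when, "servername": "skylab.backupreport.com",
        "clientname": "skylab.backupreport.com", "targetname": target,
        "backupproductname": "NetWorker", "backupbytes": size,
    }


batch = [
    make_record("2000-08-19 23:31:01", "c:\\Financial", 14728),
    make_record("2000-08-20 23:30:12", "c:\\Financial", 2048),
]
first = store_batch(conn, "skylab.backupreport.com", "NetWorker", batch)
second = store_batch(conn, "skylab.backupreport.com", "NetWorker", batch)
last_updated = conn.execute("SELECT lastupdated FROM servers").fetchone()[0]
```

Sending the same batch twice leaves the table unchanged, which is the behaviour the unique combination of fields is meant to guarantee.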

Insertion into the clients table CT (figure 4):

These are the steps DUS takes to prepare each SQL Statement for the table CT:

1. Field BX1 (see Figure 4), the name of the computer that did the actual backup of data, is placed directly into field CT2 (see Figure 4).

2. Field BX2 (see Figure 4), the name of the computer from which the backup server took data and backed it up, is placed directly into field CT1 (see Figure 4).

3. Field BX3 (see Figure 4), the complete name, that is, including the domain name that it belongs to, of the computer from which the backup server took data and backed it up, is placed directly into field CT3 (see Figure 4).

4. Field BX3 (see Figure 4) is taken again, and all text to the right of the first period in it, including that period, is stripped off (process p2) and what's left is placed in field CT4 (see Figure 4). This gives what is often referred to as the "nickname" or "short name".

5. The text "Default" is placed into the field CT5 (see Figure 4).

When all field contents are thus placed in their respective fields in the SQL Statement, the SQL Statement is sent to the SQL Server for execution on the database. If the record already exists, that is, if there is already a record in table CT (see Figure 4) for that combination of fields CT1, CT2, CT3 (all in Figure 4), the insertion will not occur. This is because the table was designed to allow only unique records. By unique, it is meant that only one record may exist with a specific combination of the following fields: CT1, CT2, CT3 in the table CT (see Figure 4).

Insertion into the validtargets table VT:

These are the steps DUS takes to prepare each SQL Statement for the table VT (in Figure 5):

1. Field BX1 (see Figure 5), the name of the computer that did the actual backup of data, is placed directly into field VT1 (see Figure 5).

2. Field BX2 (see Figure 5), the name of the computer from which the backup server took data and backed it up, is placed directly into field VT2 (see Figure 5).

3. Field BX3 (see Figure 5), the complete name, that is, including the domain name that it belongs to, of the computer from which the backup server took data and backed it up, is placed directly into field VT3 (see Figure 5).

4. Field BX3 (see Figure 5) is taken again, and all text to the right of the first period in it, including that period, is stripped off (process p2 in figure 5) and what is left is placed in field VT4 (see Figure 5). This gives what is often referred to as the "nickname" or "short name".

5. Field BX4 (see Figure 5), the name of the object that was backed up, example: c:\tempfiles, is taken, and placed directly into field VT5 (see Figure 5).

When all field contents are thus placed in their respective fields in the SQL Statement, the SQL Statement is sent to the SQL Server for execution on the database. If the record already exists, that is, if there is already a record in table VT (see Figure 5) for that combination of fields VT1, VT2, VT3, VT5, all in Figure 5, the insertion will not occur. This is because the table was designed to allow only unique target records. By unique, it is meant that only one record may exist with a specific combination of the following fields: VT1, VT2, VT3, VT5 in the table VT (see Figure 5).

Insertion into the backupproducts table PT (in Figure 6):

These are the steps DUS takes to prepare each SQL Statement for the table PT (see Figure 6):

1. Field BX5 (see Figure 6), the name of the data backup software product that did the backup of data, is placed directly into field PT1 (see Figure 6).

When all field contents are thus placed in their respective fields in the SQL Statement, the SQL Statement is sent to the SQL Server for execution on the database. If the record already exists, that is, if there is already a record in table PT (see Figure 6) for the value in field PT1 (see Figure 6), the insertion will not occur. This is because the table was designed to allow only unique data backup product name records. By unique, it is meant that only one record may exist with a given value for field PT1 (see Figure 6) in the table PT (see Figure 6).

Insertion into the levels table LT (in Figure 7):

These are the steps DUS takes to prepare each SQL Statement for the table LT (see Figure 7):

• Field BX5 (see Figure 7), the name of the data backup software product that did the backup of data, is placed directly into field LT1 (see Figure 7).

• Field BX6 (see Figure 7), the piece of text that describes the data backup level, for example "incremental", used by the data backup software product that did the backup of data, is placed directly into field LT2 (see Figure 7).

• Field BX7 (see Figure 7), the equivalent generic backup level name, is placed directly into field LT3 (see Figure 7).
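The rows destined for the backupproducts table PT and the levels table LT can be derived from a batch as in the sketch below. The function lookup_rows and the sample level names other than "incr" and "Incremental" are hypothetical. Duplicates are filtered in memory here purely for illustration, whereas the description above relies on the tables themselves to refuse repeated records.

```python
def lookup_rows(batch):
    """Collect distinct product names (PT) and level mappings (LT) from a batch."""
    products, levels = [], []
    for rec in batch:
        product = rec["backupproductname"]
        # (product, proprietary level name, canonical level name)
        level = (product, rec["backuplevel"], rec["backupcanonicallevel"])
        if product not in products:
            products.append(product)
        if level not in levels:
            levels.append(level)
    return products, levels


batch = [
    {"backupproductname": "NetWorker", "backuplevel": "incr",
     "backupcanonicallevel": "Incremental"},
    {"backupproductname": "NetWorker", "backuplevel": "incr",
     "backupcanonicallevel": "Incremental"},
    # Hypothetical level names, for illustration only.
    {"backupproductname": "NetWorker", "backuplevel": "full",
     "backupcanonicallevel": "Full"},
]
products, levels = lookup_rows(batch)
```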

After the batch of records is inserted

When DUS has cycled through all of the records supplied by the BX and inserted them into the database as described above, DUS then changes the field RT5 (see figure 2) to the value "Finished" for the record whose taskid is the one DUS just processed.

In the preferred embodiment, changing a request in RT (see figure 2) to "Finished" is done by constructing a SQL Statement that changes the record whose taskid is the one that was supplied by BX and that DUS has just finished processing. The typical SQL statement to do this looks like this:

UPDATE zrequests SET status='Finished' WHERE taskid=1001
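The life cycle of a request record ("Waiting", then "in progress", then "Finished", then deleted at the next change of hour) can be sketched as follows, again with Python's sqlite3 module standing in for SQL Server. The function names, the reduced zrequests columns and the use of rowid to pick out the request are assumptions. The taskid 1001 is simply the value from the example statement above, and the call to BX is not modelled at all.

```python
import sqlite3


def mark_in_progress(conn, rowid, taskid):
    """Record the taskid returned by BX and flag the request as in progress."""
    conn.execute(
        "UPDATE zrequests SET status = 'in progress', taskid = ? WHERE rowid = ?",
        (taskid, rowid),
    )


def mark_finished(conn, taskid):
    """Flag the request that was just processed as finished."""
    conn.execute("UPDATE zrequests SET status = 'Finished' WHERE taskid = ?", (taskid,))


def purge_finished(conn):
    """Delete finished requests, as done whenever the hour changes."""
    conn.execute("DELETE FROM zrequests WHERE status = 'Finished'")


def current_status(conn, rowid):
    return conn.execute(
        "SELECT status FROM zrequests WHERE rowid = ?", (rowid,)
    ).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE zrequests (reference TEXT, backupproductname TEXT, "
    "status TEXT, taskid INTEGER)"
)
rowid = conn.execute(
    "INSERT INTO zrequests VALUES ('skylab.backupreport.com', 'NetWorker', 'Waiting', 0)"
).lastrowid

history = [current_status(conn, rowid)]
mark_in_progress(conn, rowid, 1001)
history.append(current_status(conn, rowid))
mark_finished(conn, 1001)
history.append(current_status(conn, rowid))
purge_finished(conn)
remaining = conn.execute("SELECT COUNT(*) FROM zrequests").fetchone()[0]
```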

Date of last update stored

The field ST5 (see figure 3) is updated with the highest date/time value that was received in the batch of records that came from BX. This is referred to by the program the next time it makes a refresh request to the BX for the same server. (This is described in detail in the section "Sends refresh request to the BX", earlier in this document).

Alternative Embodiments

Although in the preferred embodiment an embedded software component is used to communicate with the Backup Engines, an alternative embodiment would be to communicate with the Backup Engines directly.

ADVANTAGES:

1. Flexibility. Because historical records of data backup activity from a plurality of data backup software devices is stored in a single, consolidated, cross-referenced, open relational database, there exists a new opportunity to the user of the invention: the ability to create their own, "custom" reports using industry-standard report writing devices such as Crystal Reports™, Microsoft Access™ or one of dozens of others available in the marketplace. It also allows the user of the invention to set the times of the day at which DUS is to request new historical records of data backup activity.

2. Automation: A software device that can be set up once, and can run with little or no further attention as it continues to extract and make available historical records of data backup activity from a plurality of data backup software devices.

3. Transparency: By isolating the data backup software devices "behind" BX in the fashion described, and thus giving them the same generic interface, it allows the program code to be written only once knowing that it will work for all other data backup software devices that are added to the

4. Cost Savings: By removing the need to have a technical understanding of a plurality of data backup software devices, a significant reduction in the cost of expertise is attained over the expertise required when not using the invention.

5. Minimum Impact: The invention can be used without installing any software on backup servers, or interfering with those installations in any way.

Conclusion, Ramifications and Scope

We have provided a solution to the problem of lack of adequate data backup reporting solutions, in particular for a plurality of data backup devices.

While the foregoing has been with reference to a particular embodiment of the invention, it will be appreciated by those skilled in the art that changes in this embodiment may be made without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims.

Claims


1. A method for extracting records of data backup activity from a plurality of data backup devices comprising: providing a computer network with data backup activity performed by a plurality of data backup products each with its own clients and servers, providing a host computer interfaced to said network, providing said host computer is running a software device or program that includes said method as one of its software components, providing for each data backup product a backup engine plug-in that will obtain records of data backup activity from that data backup product, providing records of data backup activity expressed as a canonical backup log containing backup job records including:
i. a date and time that a data backup attempt or operation took place, and
ii. a proprietary name of the data backup client, and
iii. a fully qualified host name of the data backup client, and
iv. a number of bytes that were backed up or a default value, and
v. a number of files or objects that were backed up or a default value, and
vi. a proprietary data backup level name or a default value, and
vii. a canonical data backup level name or a default value, and
viii. a description of where the information in the data backup job record was obtained, and
ix. a number of seconds that elapsed during the data backup operation or a default value, and
x. a number of errors or a default value, and
xi. a date and time the data backup will expire or a default value, and
xii. a logical target name, and
xiii. a media label of the storage media the data backup was written to,
providing a request for the said canonical backup activity log from a said software device or program, whereby said method will make available to said software device or program said canonical backup activity log in response to the asynchronous requests made by said software device or program.

2. A method and system of inserting records of data backup activity from claim 1 into a general purpose database comprising the following steps: providing a general purpose database, construction of a uniformly formatted record for insertion into one or more data tables containing records of data backup activity, execution of a process to add said record to one or more data tables containing records of data backup activity, construction of a uniformly formatted record for insertion into one or more data tables containing backup client records, execution of a process to add said record to one or more data tables containing backup client records, construction of a uniformly formatted record for insertion into one or more data tables containing backup target records, execution of a process to add said record to one or more data tables containing backup target records, construction of a uniformly formatted record for insertion into one or more data tables containing data backup device name records, execution of a process to add said record to one or more data tables containing data backup device name records, construction of a uniformly formatted record for insertion into one or more data tables containing backup level records, executing a process to add said record to one or more data tables containing backup level records, executing scheduled requests for new records of data backup activity, executing operator initiated requests for new records of data backup activity, whereby a general purpose database is made available for cross referencing and analysis of backup activity using third party tools.

3. A method for automated generation of refresh requests for records of data backup activity from data backup software devices comprising: the method and system from claim 2, providing a data table of request record data, providing said request table containing a record of the name of a backup data software program, providing said table containing a record of the name of the backup server on which the said backup program is running, providing said table containing a record of data relating to the specific data backup software program, providing said table containing data that are needed to communicate a request for records of data backup activity from said server such as: i. passwords, and ii. user name, whereby the invention will invoke the transmission of requests for data relating to new data backup activities.

4. An automatic refresh request generation system for a data backup system, comprising: a computer network with data backup activity performed by a plurality of data backup products wherein each data backup product generates records relating to the data backup activity; a host computer connected to the computer network that executes a canonical backup device; and the canonical backup device further comprising a backup engine plug-in that will obtain records of data backup activity from each data backup product and a refresh request unit that operates in the background of the host computer, the refresh request unit further comprising a timer that determines if the hour of the current day has changed, means for checking an automatic request determiner to determine if a refresh request for the particular server is automatically initiated and means for initiating a refresh request if the refresh request has been initiated.

5. The system of Claim 4, wherein the refresh request unit further comprises means for deleting completed refresh requests.

6. The system of Claim 5, wherein the deletion means further comprises means for scanning a status field in a request table of a database for a completed value and means for deleting the record having the completed value.

7. The system of Claim 4, wherein the automatic request determiner further comprises a database table having an indicator to indicate when a refresh request is to be initiated.

8. The system of Claim 1, wherein the indicator comprises a string of bits wherein each bit corresponds to a particular hour in a day and wherein the automatic request determiner further comprises means for checking a bit corresponding to a particular hour of a day when the particular hour of the day occurs.

9. The system of Claim 4, wherein the initiating means further comprises means for writing data into a database table in order to initiate a refresh operation.

10. The system of Claim 9, wherein the writing means further comprises means for changing a status field in a request database table to waiting in order to initiate a refresh request.

11. An automatic refresh request generation device for a data backup system having a plurality of data backup products, each data backup product connected to a computer network and each data backup product generating records relating to the data backup activity for the data backup product, the device comprising: a backup engine plug-in that will obtain records of data backup activity from each data backup product; and a refresh request unit that operates in the background of the device, the refresh request unit further comprising a timer that determines if the hour of the current day has changed, means for checking an automatic request determiner to determine if a refresh request for the particular server is automatically initiated and means for initiating a refresh request if the refresh request has been initiated.

12. The device of Claim 11, wherein the refresh request unit further comprises means for deleting completed refresh requests.

13. The device of Claim 12, wherein the deletion means further comprises means for scanning a status field in a request table of a database for a completed value and means for deleting the record having the completed value.

14. The device of Claim 11, wherein the automatic request determiner further comprises a database table having an indicator to indicate when a refresh request is to be initiated.

15. The device of Claim 14, wherein the indicator comprises a string of bits wherein each bit corresponds to a particular hour in a day and wherein the automatic request determiner further comprises means for checking a bit corresponding to a particular hour of a day when the particular hour of the day occurs.

16. The device of Claim 11, wherein the initiating means further comprises means for writing data into a database table in order to initiate a refresh operation.

17. The device of Claim 16, wherein the writing means further comprises means for changing a status field in a request database table to waiting in order to initiate a refresh request.

18. An automatic refresh request generation method for a data backup system having a plurality of data backup products, each data backup product connected to a computer network and each data backup product generating records relating to the data backup activity for the data backup product, the method comprising: obtaining records of data backup activity from each data backup product; and automatically requesting the refresh of a particular data backup product, comprising determining if the hour of the current day has changed, checking an automatic request determiner to determine if a refresh request for the particular server is automatically initiated and initiating a refresh request if the refresh request has been initiated.

19. The method of Claim 18, wherein requesting the refresh further comprises deleting completed refresh requests.

20. The method of Claim 19, wherein the deletion further comprises scanning a status field in a request table of a database for a completed value and deleting the record having the completed value.

21. The method of Claim 18, wherein the automatic request determining further comprises a database table having an indicator to indicate when a refresh request is to be initiated.

22. The method of Claim 21, wherein the indicator comprises a string of bits wherein each bit corresponds to a particular hour in a day and wherein the automatic request determiner further comprises checking a bit corresponding to a particular hour of a day when the particular hour of the day occurs.

23. The method of Claim 18, wherein the initiating further comprises writing data into a database table in order to initiate a refresh operation.

24. The method of Claim 23, wherein the writing further comprises changing a status field in a request database table to waiting in order to initiate a refresh request.

25. A system for inserting records from one or more data backup products into a canonical database, comprising: a computer network with data backup activity performed by a plurality of data backup products wherein each data backup product generates records relating to the data backup activity; a host computer connected to the computer network that executes a canonical backup device; and the canonical backup device further comprising a backup engine plug-in that will obtain records of data backup activity from each data backup product and a database insertion unit that inserts the data from each record of a data backup product into a canonical database, the database insertion unit further comprising means for inserting data from the record into a data backup portion, means for inserting data from the record into a backup client portion, means for inserting data from the record into a backup target portion, means for inserting data from the record into a backup device name portion and means for inserting data from the record into a backup level portion.

26. The system of Claim 25, wherein inserting means further comprises means for inserting the data from the record into one or more data tables containing data backup records, means for inserting data from the record into one or more data tables containing backup client records, means for inserting data from the record into one or more data tables containing backup target records, means for inserting data from the record into one or more data tables containing backup device name records and means for inserting data from the record into one or more data tables containing backup level records.

27. The system of Claim 25, wherein the database insertion unit further comprises means for constructing a uniformly formatted record for insertion into one or more data tables.

28. The system of Claim 25, wherein the canonical backup device further comprises a refresh request unit that operates in the background of the host computer, the refresh request unit further comprising a timer that determines if the hour of the current day has changed, means for checking an automatic request determiner to determine if a refresh request for the particular server is automatically initiated and means for initiating a refresh request if the refresh request has been initiated.

29. The system of Claim 28, wherein the refresh request unit further comprises means for deleting completed refresh requests.

30. The system of Claim 29, wherein the deletion means further comprises means for scanning a status field in a request table of a database for a completed value and means for deleting the record having the completed value.

31. The system of Claim 28, wherein the automatic request determiner further comprises a database table having an indicator to indicate when a refresh request is to be initiated.

32. The system of Claim 31, wherein the indicator comprises a string of bits wherein each bit corresponds to a particular hour in a day and wherein the automatic request determiner further comprises means for checking a bit corresponding to a particular hour of a day when the particular hour of the day occurs.

33. The system of Claim 28, wherein the initiating means further comprises means for writing data into a database table in order to initiate a refresh operation.

34. The system of Claim 33, wherein the writing means further comprises means for changing a status field in a request database table to waiting in order to initiate a refresh request.

35. A device for inserting records from one or more data backup products into a canonical database, each data backup product connected to a computer network and each data backup product generating records relating to the data backup activity for the data backup product, the device comprising: a backup engine plug-in that will obtain records of data backup activity from each data backup product; and a database insertion unit that inserts the data from each record of a data backup product into a canonical database, the database insertion unit further comprising means for inserting data from the record into a data backup portion, means for inserting data from the record into a backup client portion, means for inserting data from the record into a backup target portion, means for inserting data from the record into a backup device name portion and means for inserting data from the record into a backup level portion.

36. The device of Claim 35, wherein inserting means further comprises means for inserting the data from the record into one or more data tables containing data backup records, means for inserting data from the record into one or more data tables containing backup client records, means for inserting data from the record into one or more data tables containing backup target records, means for inserting data from the record into one or more data tables containing backup device name records and means for inserting data from the record into one or more data tables containing backup level records.

37. The device of Claim 35, wherein the database insertion unit further comprises means for constructing a uniformly formatted record for insertion into one or more data tables.

38. The device of Claim 35, wherein the canonical backup device further comprises a refresh request unit that operates in the background of the host computer, the refresh request unit further comprising a timer that determines if the hour of the current day has changed, means for checking an automatic request determiner to determine if a refresh request for the particular server is automatically initiated and means for initiating a refresh request if the refresh request has been initiated.

39. The device of Claim 38, wherein the refresh request unit further comprises means for deleting completed refresh requests.

40. The device of Claim 39, wherein the deletion means further comprises means for scanning a status field in a request table of a database for a completed value and means for deleting the record having the completed value.

41. The device of Claim 38, wherein the automatic request determiner further comprises a database table having an indicator to indicate when a refresh request is to be initiated.

42. The device of Claim 41, wherein the indicator comprises a string of bits wherein each bit corresponds to a particular hour in a day and wherein the automatic request determiner further comprises means for checking a bit corresponding to a particular hour of a day when the particular hour of the day occurs.

43. The device of Claim 38, wherein the initiating means further comprises means for writing data into a database table in order to initiate a refresh operation.

44. The device of Claim 43, wherein the writing means further comprises means for changing a status field in a request database table to waiting in order to initiate a refresh request.

45. A method for inserting records from one or more data backup products into a canonical database, each data backup product connected to a computer network and each data backup product generating records relating to the data backup activity for the data backup product, the method comprising: obtaining records of data backup activity from each data backup product; and inserting the data from each record of a data backup product into a canonical database, the database insertion further comprising inserting data from the record into a data backup portion, inserting data from the record into a backup client portion, inserting data from the record into a backup target portion, inserting data from the record into a backup device name portion and inserting data from the record into a backup level portion.

46. The method of Claim 45, wherein inserting further comprises inserting the data from the record into one or more data tables containing data backup records, inserting data from the record into one or more data tables containing backup client records, inserting data from the record into one or more data tables containing backup target records, inserting data from the record into one or more data tables containing backup device name records and inserting data from the record into one or more data tables containing backup level records.

47. The method of Claim 45, wherein the database insertion further comprises constructing a uniformly formatted record for insertion into one or more data tables.

48. The method of Claim 45, wherein the canonical backup device further comprises a refresh request unit that operates in the background of the host computer, the refresh request unit further comprising a timer that determines if the hour of the current day has changed, means for checking an automatic request determiner to determine if a refresh request for the particular server is automatically initiated and means for initiating a refresh request if the refresh request has been initiated.

49. A system for inserting records from a plurality of data backup products having different formats into a canonical database, the system comprising: a computer network with data backup activity performed by a plurality of data backup products wherein each data backup product generates records relating to the data backup activity; a host computer connected to the computer network that executes a canonical backup device; and the canonical backup device further comprising a backup engine plug-in that will obtain records of data backup activity from each data backup product and a database insertion unit that inserts the data from each record of a data backup product into a canonical database, the database insertion unit further comprising means for inserting data from the record into a data backup table having a backup time and date field, a server name field, a client name field, a client host name field, a client nickname field, a target name field, a backup product name field, a backup level field, a backup canonical name field, a backup local date and time field, a backup bytes field, a backup file number field, a backup error count field, a days of week field, an hour of the day field and a backup volume field, means for inserting data from the record into a backup client table having a client name field, a server name field, a client host name field, a client nickname field and a client owner field, means for inserting data from the record into a backup target table having a server name field, a client name field, a client host name field, a client nickname field, a target name field, an ignore field and an ignore until field, means for inserting data from the record into a backup device name table having a backup product name field and means for inserting data from the record into a backup level table having a backup product name field, a backup level field and a backup canonical level field.

50. The system of Claim 49, wherein each record of each backup product includes one or more of a server name field, a client name field, a client host name field, a logical target field, an engine name field, a level name field, a canonical level field, a backup date field, a byte count field, a file count field, an error count field and a media label field.

51. A device for inserting records from one or more data backup products into a canonical database, each data backup product connected to a computer network and each data backup product generating records relating to the data backup activity for the data backup product, the device comprising: a backup engine plug-in that will obtain records of data backup activity from each data backup product; and a database insertion unit that inserts the data from each record of a data backup product into a canonical database, the database insertion unit further comprising means for inserting data from the record into a data backup table having a backup time and date field, a server name field, a client name field, a client host name field, a client nickname field, a target name field, a backup product name field, a backup level field, a backup canonical name field, a backup local date and time field, a backup bytes field, a backup file number field, a backup error count field, a days of week field, an hour of the day field and a backup volume field, means for inserting data from the record into a backup client table having a client name field, a server name field, a client host name field, a client nickname field and a client owner field, means for inserting data from the record into a backup target table having a server name field, a client name field, a client host name field, a client nickname field, a target name field, an ignore field and an ignore until field, means for inserting data from the record into a backup device name table having a backup product name field and means for inserting data from the record into a backup level table having a backup product name field, a backup level field and a backup canonical level field.

52. The device of Claim 51, wherein each record of each backup product includes one or more of a server name field, a client name field, a client host name field, a logical target field, an engine name field, a level name field, a canonical level field, a backup date field, a byte count field, a file count field, an error count field and a media label field.

53. A method for inserting records from one or more data backup products into a canonical database, each data backup product connected to a computer network and each data backup product generating records relating to the data backup activity for the data backup product, the method comprising: obtaining records of data backup activity from each data backup product; and inserting the data from each record of a data backup product into a canonical database, the database insertion further comprising inserting data from the record into a data backup table having a backup time and date field, a server name field, a client name field, a client host name field, a client nickname field, a target name field, a backup product name field, a backup level field, a backup canonical name field, a backup local date and time field, a backup bytes field, a backup file number field, a backup error count field, a days of week field, an hour of the day field and a backup volume field, inserting data from the record into a backup client table having a client name field, a server name field, a client host name field, a client nickname field and a client owner field, inserting data from the record into a backup target table having a server name field, a client name field, a client host name field, a client nickname field, a target name field, an ignore field and an ignore until field, inserting data from the record into a backup device name table having a backup product name field and inserting data from the record into a backup level table having a backup product name field, a backup level field and a backup canonical level field.

54. The system of Claim 53, wherein each record of each backup product includes one or more of a server name field, a client name field, a client host name field, a logical target field, an engine name field, a level name field, a canonical level field, a backup date field, a byte count field, a file count field, an error count field and a media label field.
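For illustration only, the hour-indexed indicator and hour-change timer recited in the claims above (for example Claims 15 and 22) might be sketched in Python as shown here. The representation of the indicator as a 24-character string of "0" and "1" with index 0 standing for midnight, and the names `refresh_due` and `HourWatcher`, are assumptions made for the example and are not taken from the specification.

```python
from datetime import datetime

def refresh_due(schedule_bits, hour):
    """Return True when the bit for the given hour (0-23) is set."""
    if len(schedule_bits) != 24 or not 0 <= hour <= 23:
        raise ValueError("expected 24 bits and an hour from 0 to 23")
    return schedule_bits[hour] == "1"

class HourWatcher:
    """Remembers the last hour seen so a check runs once per hour change."""

    def __init__(self):
        self.last_hour = None

    def hour_changed(self, now):
        # The first call always reports a change (an assumption).
        changed = now.hour != self.last_hour
        self.last_hour = now.hour
        return changed

# Hypothetical schedule: request a refresh only at 09:00.
schedule = "0" * 9 + "1" + "0" * 14
watcher = HourWatcher()
now = datetime(2001, 9, 19, 9, 0)
should_request = watcher.hour_changed(now) and refresh_due(schedule, now.hour)
```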