Method, System, and Computer Program Product for Managing Storage Resources
Background of the Invention
Field of the Invention
The invention relates generally to the field of storage area networks, and more particularly to the managing of storage resources in storage area networks.
Related Art
Traditional approaches exist for managing the integrity of storage devices in computer networks. These approaches generally involve the combined use of various hardware monitoring devices and software analysis applications. Administrators typically install a new or modified monitoring device or analysis application for every addition or modification to the systems of their storage area network (SAN).
Additionally, administrators are faced with the often impossible task of achieving compatibility and functionality among the various systems in use in their SAN. For instance, a problem of maintaining the operational efficiency, as well as general health, including but not limited to data integrity and device stability, exists for administrators of SAN environments. For example, one device in the SAN may conflict with another device in the SAN. Therefore, in view of the above, what is needed is a system, method and computer program product for managing the storage resources of a storage area network. Furthermore, what is needed is a system, method and computer program product for monitoring and analyzing capacity, inventory and performance information concerning the general status of a data storage network to facilitate detection and correction of system irregularities, such as device failures. Still further, what is needed is a system, method and computer program
product for notifying users of system irregularities and reporting current system conditions and allowing users to provide instructions to a system, method and computer program product able to update the storage resources of a storage area network.
Summary of the Invention
The present invention is directed to a method, system, and computer program product for managing the storage resources of a storage area network. Service agents query devices of the SAN for status information. In one embodiment, service agents use Simple Network Management Protocol (SNMP) to query the devices. The service agents forward the status information they receive to a storage resource manager (SRM). In one embodiment, a communications module of the SRM handles the status information, which can also be sent using SNMP. The SRM parses the status information according to parameters (device type, device number, version, current temperature, etc.) and, based on that parsing, matches the current status information with previously received status information. In one embodiment, a correlation module provides this functionality. The SRM sends notifications to users when a parameter does not meet a predefined rule. The SRM further translates the parameters of the status information to provide additional information for analysis. In one embodiment, the further translation is accomplished by one or more capacity management modules, and inventory and performance management modules. The system allows the user to provide instructions to alter the operation of devices based on information reported to the user by the system. The one or more devices are part of the SAN. In a further aspect of the invention, a graphical user interface (GUI) is provided by the SRM so that users can create login accounts for accessing components of the SAN. Further, the GUI provides both system and device level diagrams in a tree format illustrating device and event relationships.
Additionally, the GUI provides users with the ability to search for specific kinds and types of components of the SAN.
In a further aspect of the invention, the SRM enables a user to indicate code changes such as, but not limited to software upgrades, which can be obtained by the SRM from sources such as the SAN, the internet, or the global
Internet. The SRM is then able to download these code changes to the devices of the SAN, thereby upgrading them uniformly.
Further aspects of the invention, and further features and benefits thereof, are described below. The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention.
Brief Description of the Figures
In the drawings, like reference numbers indicate identical or functionally similar elements. Additionally, the left-most digit(s) of a reference number identifies the drawing in which the reference number first appears.
FIG. 1 illustrates a block diagram of an example storage resource manager in a storage area network configuration, according to embodiments of the invention;
FIG. 2 shows a flowchart providing detailed operational steps of the service agents, according to an embodiment of the present invention;
FIG. 3 shows a flowchart providing detailed operational steps of the storage resource manager, according to an embodiment of the invention;
FIG. 4 shows a flowchart providing detailed operational steps of a routine for updating storage resources, according to an embodiment of the invention;
FIG. 5 illustrates a block diagram of an example graphical user interface of a storage resource manager, according to an embodiment of the invention;
FIG. 6 illustrates another block diagram of an example graphical user interface of a storage resource manager, according to another embodiment of the invention;
FIG. 7 illustrates an event trace flowchart of the storage resource manager, according to embodiments of the present invention;
FIG. 8 illustrates an example data communication network, according to an embodiment of the present invention;
FIG. 9 shows a simplified five-layered communication model, based on an Open System Interconnection (OSI) reference model; and
FIG. 10 shows an example of a computer system for implementing the present invention.
The present invention will now be described with reference to the accompanying drawings.
Detailed Description of the Preferred Embodiments
Overview
The present invention is directed to a method, system, and computer program product for managing the storage resources of a storage area network (SAN). The invention manages the storage resources of a SAN by monitoring the current operating conditions of the devices of the SAN. The invention also allows for those operating conditions to be altered, thereby allowing dynamic management of devices in the SAN. According to the present invention, information is recorded over time so that capacity, inventory, and performance of the SAN can also be determined.
It is noted that embodiments of the invention are discussed hereafter in FIG. 1, with reference to the flowcharts of FIGS. 2 - 4 and 7. Graphical user
interface (GUI) diagrams of FIGS. 5 and 6 are also mentioned. FIGS. 2 - 7 are discussed in further detail after a discussion of an example SAN environment in FIGS. 8 and 9. After the detailed discussions of FIGS. 2 - 7, an example computer system is described with respect to FIG. 10. FIG. 1 illustrates a block diagram of an example storage resource manager in a storage area network configuration, according to embodiments of the invention. Storage resource manager (SRM) 102 includes communications module 104, database 106, notification module 108, correlation module 110, reporting module 112, capacity management module 114, direct remote access manager module 116, inventory and performance management module 118, and control logic module 120.
FIG. 1 further illustrates modules of the SRM which operate external to the SRM 102 in the illustrated embodiment. Service agents 124 and direct remote access console (DRAC) 126 are shown within storage area network (SAN) 122. A storage area network (SAN) is a sub-network of shared storage devices. While service agents 124 and DRAC 126 are shown external to SRM 102, this is but one embodiment. One skilled in the relevant art, based on the teachings described herein, could implement modules 124 and 126 within SRM 102 (for example, through the use of conduits allowing for the sending of status information to service agents 124 within SRM 102). It is noted that FIG. 2, described in detail below, discusses a routine for service agents 124, according to an embodiment of the invention.
In one embodiment, the service agents of the present invention can be implemented as device modules and device chains as described in detail in a commonly-owned U.S. Patent Application No. (to be assigned), (Attorney Docket
No. 1942.0070000), entitled "Method, System, and Computer Program Product for a Data Propagation Platform and Applications of Same."
In one embodiment, communications module 104 receives status information from service agents 124 within storage area network (SAN) 122 via the Simple Network Management Protocol (SNMP). In one embodiment, service agents 124
and DRAC 126 are implemented in a customized switch at the network entry into the SAN. As such, modules 124 and 126 are able to use their network links (either to a local SRM or over the Internet) to transmit status information to communications module 104. Communications module 104 receives the status information sent by service agents 124. In one embodiment, communications module 104 includes the functionality of an SNMP server, such that it is able to both send and receive messages in SNMP.
Database 106 receives status information from the communications module 104 and stores it for further processing. In one embodiment, database 106 includes a structured query language (SQL) server with stored procedures
(SPs) or rules for translating the status information into a format readily readable by other modules of the invention. FIGS. 3, 4, and 7 describe these features in greater detail below.
As discussed above, correlation module 110 receives status information from communications module 104 and parses the status information according to the type of information as well as the specific device to which the information relates. Correlation module 110 also accesses the stored information of database 106. In one embodiment, correlation module 110 matches status information with stored information according to the SPs of database 106. Notification module 108 provides alert management functionality for the
SRM 102. Notification module 108 accesses stored notification rules. In one embodiment, the notification rules are stored within notification module 108. In another embodiment, the notification rules are stored in database 106. The notification rules provide specific values for parameters of the status information. For instance, a notification rule can state that the fan speed of all the fans of all devices being monitored must be greater than 2000 revolutions per minute (rpm). If a fan's speed is lower than 2000 rpm then notification module 108 sends a notification to defined user devices 128.
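For illustration only, the fan-speed rule described above can be sketched in Python as follows. The names NotificationRule, violations, and the "fan_rpm" and "device" keys are assumptions made for this sketch and do not appear in the specification; the sketch treats a reading of exactly 2000 rpm as a violation because the rule requires the speed to be greater than 2000 rpm.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class NotificationRule:
    """Hypothetical rule: a parameter name plus a predicate it must satisfy."""
    parameter: str
    is_ok: Callable[[float], bool]
    description: str


def violations(status: dict, rules: list) -> list:
    """Return one message per rule whose parameter is present and fails."""
    messages = []
    for rule in rules:
        value = status.get(rule.parameter)
        if value is not None and not rule.is_ok(value):
            device = status.get("device", "unknown")
            messages.append(f"{device}: {rule.description} (got {value})")
    return messages


# The fan-speed rule from the text: speed must be greater than 2000 rpm.
FAN_RULE = NotificationRule(
    "fan_rpm", lambda rpm: rpm > 2000, "fan speed must exceed 2000 rpm"
)

# A hypothetical status report with a slow fan produces one alert message.
alerts = violations({"device": "disk-array-820", "fan_rpm": 1850}, [FAN_RULE])
```

In such a sketch, each returned message would be handed to whatever mechanism delivers notifications to user devices 128; that delivery step is outside the scope of the example.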
User devices 128 include handheld devices 130, phones 132, mobile devices 134, personal computers (PCs) 136, pagers 138, and personal digital
assistants (PDAs) 140. Description of these user devices 128 is provided for convenience only. It is not intended that the invention be limited to application with these devices. In fact, after reading the following description, it will become apparent to a person skilled in the relevant art how to implement the invention with alternative devices known now or developed in the future.
Reporting module 112 provides a graphical user interface (GUI) through which users can access both the stored information of database 106 and the current status information received by communications module 104. FIGS. 5 and 6 show example GUIs according to embodiments of the invention. In a further embodiment, reporting module 112 generates detailed analyses according to the information generated by capacity management module 114 and inventory and performance management module 118.
Capacity management module 114 determines the overall SAN capacity. In one embodiment, the SAN capacity includes network bandwidth and utilization determination. In other embodiments, electrical consumption rates (power) are also measured. The resulting data is hereinafter referred to as capacity information. Additionally, in an embodiment, capacity management module 114 is able to determine changes in the SAN's capacity over time by comparing current status information with stored information. Inventory and performance management module 118 determines the presence or absence of devices from the SAN, as well as the performance levels of those devices. In one embodiment, an absent device is given a null performance level. In another embodiment, an absent device is not evaluated for performance levels. In yet another embodiment, the absence of a device triggers a notification rule, described above, and recent performance levels are included in current reports to aid a user in understanding the absence of the device.
Similarly to capacity management module 114, inventory and performance management module 118 is capable of accessing stored information from database 106 and current status information received by communications module 104.
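As a minimal sketch of how changes in capacity over time might be derived by comparing current status information with stored information, consider the following Python example. The metric names "used_gb" and "total_gb" and the functions utilization and capacity_delta are hypothetical and are used here only to make the comparison concrete.

```python
def utilization(used_gb: float, total_gb: float) -> float:
    """Fraction of capacity in use; total must be positive."""
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    return used_gb / total_gb


def capacity_delta(stored: dict, current: dict) -> dict:
    """Per-metric change (current minus stored) for metrics present in both."""
    return {key: current[key] - stored[key] for key in current if key in stored}


# Hypothetical stored and current capacity readings for one SAN.
stored_reading = {"used_gb": 400.0, "total_gb": 1000.0}
current_reading = {"used_gb": 550.0, "total_gb": 1000.0}

delta = capacity_delta(stored_reading, current_reading)
current_utilization = utilization(
    current_reading["used_gb"], current_reading["total_gb"]
)
```

Metrics that appear in only one of the two readings are left out of the result, which is one possible design choice; an implementation could equally report them as newly added or missing.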
Control logic module 120 enables the modules of SRM 102 to utilize the ports and memory of a personal computer (PC). In one embodiment, the modules of the SRM 102 are implemented in software, and control logic module 120 fulfills the duties of an operating system capable of supporting the functionality described herein.
In another embodiment, the modules of the SRM 102 are implemented in hardware and control logic module 120 provides power regulation and port arbitration functionality. In yet another embodiment, the modules of SRM 102 are implemented in a combination of hardware and software, and control logic module 120 provides all of the above-described functionality.
Description in these terms is provided for convenience only. It is not intended that the invention be limited to application in this embodiment. In fact, after reading the following description, it will become apparent to a person skilled in the relevant art how to implement the invention in alternative environments known now or developed in the future. Further detailed embodiments of the elements of SRM 102 are discussed below.
Terminology related to the present invention is described in the following subsection. Next, an example storage area network environment is described, in which the present invention may be applied. Detailed embodiments of the routines of the SRM 102 of the present invention are presented in the following subsection, followed by exemplary graphical user interfaces of the storage resource manager. Finally, an exemplary computer system in which the present invention can be implemented is described.
Terminology
To more clearly delineate the present invention, an effort is made throughout the specification to adhere to the following term definitions as consistently as possible.
Arbitrated Loop       A shared 100 MBps Fibre Channel transport supporting up to 126 devices and 1 fabric attachment.
Fabric                One or more Fibre Channel switches in a networked topology.
GBIC                  Gigabit interface converter; a removable transceiver module for Fibre Channel and Gigabit Ethernet physical-layer transport.
GLM                   Gigabit link module; a semipermanent transceiver that incorporates serializing/deserializing functions.
HBA                   Host bus adapter; an interface between a server or workstation bus and a Fibre Channel network.
Hub                   In Fibre Channel, a wiring concentrator that collapses a loop topology into a physical star topology.
Initiator             On a Fibre Channel network, typically a server or a workstation that initiates transactions to disk or tape targets.
JBOD                  Just a bunch of disks; typically configured as an Arbitrated Loop segment in a single chassis.
LAN                   Local area network; a network linking multiple devices in a single geographical location.
Point-to-point        A dedicated Fibre Channel connection between two devices.
Private loop          A free-standing Arbitrated Loop with no fabric attachment.
Private loop device   An Arbitrated Loop device that does not support fabric login.
Public loop           An Arbitrated Loop attached to a fabric switch.
Public loop device    An Arbitrated Loop device that supports fabric login and services.
RAID                  Redundant Array of Independent Disks.
SCSI                  Small Computer Systems Interface; both a protocol for transmitting large blocks of data and a parallel bus architecture.
SCSI-3                A SCSI standard that defines transmission of SCSI protocol over serial links.
Storage               Any device used to store data; typically, magnetic disk media or tape.
Switch                A device providing full bandwidth per port and high-speed routing of data via link-level addressing.
Target                Typically a disk array or a tape subsystem on a Fibre Channel network.
Topology              The physical or logical arrangement of devices in a networked configuration.
WAN                   Wide area network; a network linking geographically remote sites.
Example Storage Area Network Environment
In a preferred embodiment, the present invention is applicable to storage area networks. As discussed above, a storage area network (SAN) is a high-speed sub-network of shared storage devices. A SAN operates to provide access to the shared storage devices for all servers on a local area network (LAN), wide area network (WAN), or other network coupled to the SAN.
It is noted that SAN attached storage (SAS) elements can connect directly to the SAN, and provide file, database, block, or other types of data access services. SAS elements that provide such file access services are commonly called Network Attached Storage, or NAS devices. NAS devices can be coupled to the SAN, either directly or through their own network configuration. A SAN configuration potentially provides an entire pool of available storage to each network server, eliminating the conventional dedicated connection between server and disk. Furthermore, because a server's mass data storage requirements are fulfilled by the SAN, the server's processing power is largely conserved for the handling of applications rather than the handling of data requests.
FIG. 8 illustrates an example data communication network 800, according to an embodiment of the present invention. Network 800 includes a variety of devices which support communication between many different entities, including
businesses, universities, individuals, government, and financial institutions. As shown in FIG. 8, a communication network, or combination of networks, interconnects the elements of network 800. Network 800 supports many different types of communication links implemented in a variety of architectures. Network 800 may be considered to be an example of a storage area network that is applicable to the present invention. Network 800 comprises a pool of storage devices, including disk arrays 820, 822, 824, 828, 830, and 832. Network 800 provides access to this pool of storage devices to hosts/servers comprised by or coupled to network 800. Network 800 may be configured as point-to-point, arbitrated loop, or fabric topologies, or combinations thereof.
Network 800 comprises a switch 812. Switches, such as switch 812, typically filter and forward packets between LAN segments. Switch 812 may be an Ethernet switch, fast-Ethernet switch, or another type of switching device known to persons skilled in the relevant art(s). In other examples, switch 812 may be replaced by a router or a hub. A router generally moves data from one local segment to another, and to the telecommunications carrier, such as AT&T, Inc. or WorldCom, Inc., for remote sites. A hub is a common connection point for devices in a network. Suitable hubs include passive hubs, intelligent hubs, and switching hubs, and other hub types known to persons skilled in the relevant art(s).
Various types of terminal equipment and devices may interface with network 800. For example, a personal computer 802, a workstation 804, a printer 806, a laptop mobile device 808, and a handheld mobile device 810 interface with network 800 via switch 812. Further types of terminal equipment and devices that may interface with network 800 may include local area network (LAN) connections (e.g., other switches, routers, or hubs), personal computers with modems, content servers of multi-media, audio, video, and other information, pocket organizers, Personal Data Assistants (PDAs), cellular phones, Wireless Application Protocol (WAP) phones, and set-top boxes. These and additional types of terminal equipment and devices, and ways to interface them with
network 800, will be known by persons skilled in the relevant art(s) from the teachings herein.
Network 800 includes one or more hosts or servers. For example, network 800 comprises server 814 and server 816. Servers 814 and 816 provide devices 802, 804, 806, 808, and 810 with network resources via switch 812.
Servers 814 and 816 are typically computer systems that process end-user requests for data and/or applications. In one example configuration, servers 814 and 816 provide redundant services. In another example configuration, server 814 and server 816 provide different services and thus share the processing load needed to serve the requirements of devices 802, 804, 806, 808, and 810. In further example configurations, one or both of servers 814 and 816 are connected to the Internet, and thus server 814 and/or server 816 may provide Internet access to network 800. One or both of servers 814 and 816 may be Windows NT servers or UNIX servers, or other servers known to persons skilled in the relevant art(s). In FIG. 8, appliance 818 is connected to servers 814 and 816, and to disk arrays 820, 822, and 824. Preferably, appliance 818 has a fibre channel switch or other high-speed device used to allow servers 814 and 816 access to data stored on connected storage devices, such as disk arrays 820, 822, and 824. Further fibre channel switches may be cascaded with appliance 818 to allow for the expansion of the SAN, with additional storage devices, servers, and other devices. As shown in example network 800 of FIG. 8, appliance 818 is also connected to a hub 826.
Hub 826 is connected to disk arrays 828, 830, and 832. Preferably, hub 826 is a fibre channel hub or other device used to allow servers 814 and 816 access to data stored on connected storage devices, such as disk arrays 828, 830, and 832. Further fibre channel hubs may be cascaded with hub 826 to allow for expansion of the SAN, with additional storage devices, servers, and other devices. In an example configuration for network 800, hub 826 is an arbitrated loop hub. In such an example, disk arrays 828, 830, and 832 are organized in a ring or loop topology, which is collapsed into a physical star configuration by hub 826. Hub
826 allows the loop to circumvent a disabled or disconnected device while maintaining operation.
Disk arrays 820, 822, 824, 828, 830, and 832 are storage devices providing data and application resources to servers 814 and 816 through appliance 818 and hub 826. As shown in FIG. 8, the storage of network 800 is principally accessed by servers 814 and 816 through appliance 818. The storage devices may be fibre channel-ready devices, or SCSI (Small Computer Systems Interface) compatible devices. Fibre channel-to-SCSI bridges may be used to allow SCSI devices to interface with fibre channel hubs and switches, and other fibre channel-ready devices. One or more of disk arrays 820, 822, 824, 828, 830, and 832 may instead be alternative types of storage devices, including tape systems, JBODs (Just a Bunch Of Disks), floppy disk drives, optical disk drives, and other related storage drive types.
The topology or architecture of network 800 will depend on the requirements of the particular application, and on the advantages offered by the chosen topology. One or more hubs 826 and/or one or more appliances 818 may be interconnected in any number of combinations to increase network capacity. Disk arrays 820, 822, 824, 828, 830, and 832, or fewer or more disk arrays as required, may be coupled to network 800 via these hubs 826 and appliances 818. The SAN appliance or device as described elsewhere herein may be inserted into network 800, according to embodiments of the present invention. For example, appliance 818 may be augmented by other SAN appliances to provide improved connectivity between the storage device networking (disk arrays 820, 822, 824, 828, 830, and 832), the user devices (elements 802, 804, 806, 808, and 810) and servers 814 and 816, and to provide the additional functionality of the appliance 818 of the present invention described elsewhere herein.
Communication over a communication network, such as shown in network 800 of FIG. 8, is carried out through different layers. FIG. 9 shows a simplified five-layered communication model, based on the Open System
Interconnection (OSI) reference model. As shown in FIG. 9, this model includes an application layer 908, a transport layer 910, a network layer 920, a data link layer 930, and a physical layer 940. As would be apparent to persons skilled in the relevant art(s), any number of different layers and network protocols may be used as required by a particular application.
Application layer 908 provides functionality for the different tools and information services which are used to access information over the communications network. Example tools used to access information over a network include, but are not limited to, Telnet log-in service 901, IRC chat 902, Web service 903, and SMTP (Simple Mail Transfer Protocol) electronic mail service 906. Web service 903 allows access to HTTP documents 904, and FTP (File Transfer Protocol) and Gopher files 905. Secure Socket Layer (SSL) is an optional protocol used to encrypt communications between a Web browser and Web server.
Transport layer 910 provides transmission control functionality using protocols, such as TCP, UDP, SPX, and others, that add information for acknowledgments that blocks of the file have been received.
Network layer 920 provides routing functionality by adding network addressing information using protocols such as IP, IPX, and others, that enable data transfer over the network.
Data link layer 930 provides information about the type of media on which the data was originated, such as Ethernet, token ring, or fiber distributed data interface (FDDI), and others.
Physical layer 940 provides encoding to place the data on the physical transport, such as twisted pair wire, copper wire, fiber optic cable, coaxial cable, and others.
Description of this example environment in these terms is provided for convenience only. It is not intended that the invention be limited to application in this example environment. In fact, after reading the description herein, it will become apparent to persons skilled in the relevant art(s) how to implement the
invention in alternative environments. Further details on designing, configuring, and operating storage area networks are provided in Tom Clark, "Designing Storage Area Networks: A Practical Reference for Implementing Fibre Channel SANs" (1999).
Storage Resource Manager Embodiments
The methods for the storage resource manager of the present invention are now described in more detail. These method embodiments are routines that are described herein for illustrative purposes, and are not limiting. In particular, the present invention as described herein can be achieved using many orderings of the steps described herein.
Furthermore, the method of the present invention as described herein can be implemented in a computer system, application-specific box, or other device. In an embodiment, the present invention may be implemented in a SAN appliance, which provides for an interface between host servers and storage. Such SAN appliances include the SANLink™ appliance, developed by
StorageApps Inc., located in Bridgewater, New Jersey.
Storage resource manager 102 provides the capability to monitor and analyze information concerning the general health of a data storage network to facilitate detection and correction of system irregularities, such as component failures.
As discussed above, the SRM 102 of the invention monitors and records information regarding performance of the overall storage network (system level) as well as individual components of the network (device level). The SRM 102 automatically and continuously gathers hardware and software configuration information, performance information, as well as operational events data, for purposes of reporting and storing that information to a database, such as database 106. In one embodiment, database 106 is a centralized database that stores
information regarding the storage area network, which can be analyzed in support of management decisions affecting the network.
Furthermore, the SRM 102 of the invention analyzes network information based on a knowledge-base of historical data (stored information) on networks (e.g., cause-and-effect data). In one embodiment, the SRM 102 of the invention is programmed to alert the user to possible future problems. This analytical capability facilitates intelligent and informed management decisions that improve the architecture and performance of the overall network. Storage resource management provides a number of desirable storage functions, including the following:
Notification (Alert Management):
The SRM 102 of the present invention detects errors or failures. The SRM 102 also detects trends that may lead to or signify the likely occurrence of an error or failure. In embodiments, notification module 108 determines if a condition or device parameter signifies an error or possible error, based on notification rules. In one embodiment, upon detection of an actual or potential problem, the notification module assigns a severity rating to the problem based on the degree of the deviation from the standard supplied by the rule. Depending on the severity of the error, the SRM can initiate the proper remedial response according to embodiments described with respect to FIGS. 4 and 7.
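One possible way to assign a severity rating from the degree of deviation is sketched below in Python. The function name severity, the rating labels, and the 10% and 50% bands are assumptions chosen purely for illustration; the specification does not define particular ratings or thresholds. The sketch assumes a rule that supplies a minimum value, such as a minimum fan speed.

```python
def severity(value: float, minimum: float) -> str:
    """Rate how far a reading falls short of a required minimum.

    The bands below are illustrative assumptions, not values from the text.
    """
    if minimum <= 0:
        raise ValueError("minimum must be positive")
    if value > minimum:
        return "ok"
    # Relative shortfall: 0.0 at the minimum, 1.0 when the reading is zero.
    shortfall = (minimum - value) / minimum
    if shortfall < 0.10:
        return "warning"
    if shortfall < 0.50:
        return "major"
    return "critical"


# Hypothetical fan readings rated against a 2000 rpm minimum.
ratings = [severity(rpm, 2000) for rpm in (2100, 1900, 1500, 0)]
```

A rating computed this way could then be used to select among remedial responses, with larger deviations mapped to more urgent handling.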
As described above, the SRM 102 of the present invention provides both real-time information and information that is stored for later analysis. The SRM 102 regularly polls the various components of the storage area network, and measures performance criteria such as data throughput. In one embodiment, service agents 124 provide this polling and measuring. The detection of any error or failure by the SRM 102 triggers an immediate notification to user devices, and thus users themselves. In one embodiment, the information can be reported in real time to users via user devices 128 having Internet or other telephonic connectivity.
In another embodiment, general system or network configuration information is sent to database 106 at regular fixed intervals, where the information is stored for subsequent analysis by reporting module 112.
In one embodiment, status information is transmitted to the database 106 via a direct connection or the Internet over SMTP/e-mail. The communications module 104 provides built-in mail processing functionality that allows users to log remotely the status of service calls into the database 106 via user devices 128. In a further embodiment, the SRM 102 allows users to disable temporarily specific notifications for a period of time.
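The temporary disabling of specific notifications could, for example, be tracked with an expiry time per notification, as in the following Python sketch. The class NotificationMuter and its method names are hypothetical, and the current time is passed in explicitly so that the behavior is easy to reason about.

```python
from datetime import datetime, timedelta


class NotificationMuter:
    """Hypothetical helper that tracks until when each notification is disabled."""

    def __init__(self) -> None:
        self._muted_until = {}

    def mute(self, rule_name: str, duration: timedelta, now: datetime) -> None:
        """Disable the named notification from `now` for `duration`."""
        self._muted_until[rule_name] = now + duration

    def is_muted(self, rule_name: str, now: datetime) -> bool:
        """True while the mute period for the named notification has not expired."""
        until = self._muted_until.get(rule_name)
        return until is not None and now < until


# A user disables fan-speed notifications for two hours starting at noon.
muter = NotificationMuter()
noon = datetime(2000, 1, 1, 12, 0)
muter.mute("fan_speed", timedelta(hours=2), now=noon)
```

Under this sketch, a notification raised at 1:00 pm would be suppressed, while one raised at 2:00 pm or later would be delivered again without any further user action.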
Capacity Management:
In one embodiment, the capacity management module of the present invention can assist users in conducting "what if" scenarios to plan for future growth of the storage area network.
Inventory (Asset Management):
The SRM 102 of the invention has detailed information about the entire storage area network and is able to determine when changes made in the network (e.g., added components and upgrades) occur by changes in the status information when compared with stored information. Thus, by recording such changes in the database 106, the inventory and performance management module 118 of SRM 102 is able to inform users of the current and historical inventory of the network in terms of its configuration and composition.
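The comparison of current status information with stored information to detect inventory changes can be sketched as follows. This Python example is illustrative only; the function inventory_changes, the device identifiers, and the "firmware" attribute are assumptions and are not taken from the specification.

```python
def inventory_changes(stored: dict, current: dict) -> dict:
    """Classify devices as added, removed, or changed between two snapshots.

    Each snapshot maps a device identifier to a dict of its reported attributes.
    """
    stored_ids, current_ids = set(stored), set(current)
    return {
        "added": sorted(current_ids - stored_ids),
        "removed": sorted(stored_ids - current_ids),
        "changed": sorted(
            dev for dev in stored_ids & current_ids if stored[dev] != current[dev]
        ),
    }


# Hypothetical snapshots: one device upgraded, one removed, one added.
stored_snapshot = {
    "hub-826": {"firmware": "1.0"},
    "array-828": {"firmware": "2.1"},
}
current_snapshot = {
    "hub-826": {"firmware": "1.1"},
    "array-830": {"firmware": "2.1"},
}
changes = inventory_changes(stored_snapshot, current_snapshot)
```

Recording each such result alongside a timestamp would yield the kind of historical inventory described above.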
Performance Management:
In embodiments, the SRM 102 of the invention correlates and associates performance events with certain known problems. This cause-and-effect capability allows the system to detect problematic trends, and therefore to recognize and diagnose actual and potential problems. In one embodiment, inventory and performance management module 118 provides this functionality. In another embodiment, module 118 is able to forward performance information to reporting module 112 for delivery to user devices 128.
The invention also takes advantage of the versatility of the Internet and the user-friendliness of web pages. In embodiments, information gathered by the SRM 102 of the present invention from the storage area network can be made directly available to user devices 128 via the Internet in an easy to use and understandable format. For example, a user can access the status information and receive notifications through an authorized web page via a mobile device 130. Through a dynamic web-based GUI, the user can search the SAN for information on a component-by-component basis using a graphical network representation described below with respect to system tree 518 of FIG. 5, and device tree 618 of FIG. 6.
Referring to FIG. 2, a flowchart describing a routine 200 for service agents according to an embodiment of the present invention, is shown. In step 202, service agents 124 monitor the storage area network.
Typically, servers, storage devices, and network devices are able to report on their current operating conditions. While the protocols, formats, and content of their reports differ from device to device, the information is available. In one embodiment, the service agents are made up of one or more service agent threads which are designed to monitor the specific protocol, format and content of a given device. In all, the number of service agents used at a given moment depends largely on the number and kinds of devices being utilized in the SAN.
In step 204, service agents 124 obtain device information from the devices available on the SAN. In one embodiment, service agents can be instructed to ignore certain devices.
In step 206, service agents 124 obtain SAN information from the SAN appliance. In one embodiment, the service agents 124 are executed on the SAN appliance, which can serve as switch 818 in FIG. 8, thus performing as the point-of-entry for the SAN.
In another embodiment, the service agents 124 are executed on the SRM 102 and connect to the SAN to obtain the SAN and device information.
In step 208, service agents 124 determine status information based on the obtained device information and SAN information. In embodiments, the service agents 124 combine the device and SAN information and tabulate the information so that it can be transmitted via a standard protocol.
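Steps 204 through 208 can be sketched, for purposes of illustration only, as a merge of per-device readings with SAN-level information into flat records. The function name `build_status_records`, the field names, and the use of JSON as the serialized form are assumptions made for this example; the description does not prescribe a record layout.

```python
import json


def build_status_records(device_info, san_info, ignored=()):
    """Merge per-device readings with SAN-level data into flat records."""
    records = []
    for device_id in sorted(device_info):
        if device_id in ignored:
            continue  # service agents can be instructed to ignore devices
        record = {"device_id": device_id, "san_id": san_info["san_id"]}
        record.update(device_info[device_id])
        records.append(record)
    return records


# Hypothetical readings obtained in steps 204 and 206.
device_info = {
    "disk-01": {"type": "storage", "temperature_c": 41},
    "srv-07": {"type": "server", "temperature_c": 55},
}
san_info = {"san_id": "san-a"}

records = build_status_records(device_info, san_info, ignored={"srv-07"})
payload = json.dumps(records)  # tabulated form, ready to be transmitted
```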
In step 210, service agents 124 forward status information. In one embodiment, service agents 124 forward the status information to the communications module 104 of SRM 102.
Referring to FIG. 3, a flowchart describing a routine for the storage resource manager, according to an embodiment of the invention, is shown.
In step 302, SRM 102 receives status information from service agents 124.
In step 304, SRM 102 accesses stored information from database 106. In one embodiment, current status information is added to the stored information so that other modules of SRM 102 can access the information directly from database 106. In another embodiment, the modules of SRM 102 are able to access database 106 for the stored information, as well as communications module 104 for current status information.
In step 306, SRM 102 correlates the status information and the stored information. In one embodiment, correlation module 110 performs parsing operations to align the parameters of the information according to defined categories. For example, system temperatures are aligned so that they can easily be checked against a notification rule to see whether a notification must be sent (that is, whether a system has a temperature which is outside of acceptable limits).
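The alignment of parameters by category can be sketched as follows. This is an illustrative sketch only, not the parsing performed by correlation module 110; the function names, the field names, and the temperature limits are assumptions chosen for the example.

```python
from collections import defaultdict


def align_by_parameter(records):
    """Group readings so that each parameter maps to {device_id: value}."""
    aligned = defaultdict(dict)
    for record in records:
        device_id = record["device_id"]
        for name, value in record.items():
            if name != "device_id":
                aligned[name][device_id] = value
    return dict(aligned)


def out_of_limits(values, low, high):
    """Return the devices whose value lies outside the range [low, high]."""
    return sorted(d for d, v in values.items() if not low <= v <= high)


# Hypothetical status records for two devices.
records = [
    {"device_id": "disk-01", "temperature_c": 41, "fan_rpm": 3200},
    {"device_id": "srv-07", "temperature_c": 78, "fan_rpm": 2900},
]
aligned = align_by_parameter(records)
too_hot = out_of_limits(aligned["temperature_c"], low=5, high=70)
```

Once the temperatures of all systems sit side by side under one key, a single range check identifies every system that requires a notification.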
In step 308, SRM 102 determines capacity information. In one embodiment, the capacity information includes projected limits on the number of devices for a given network, projected bandwidth requirements, and projected resource utilization levels.
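One simple way to arrive at a projected resource utilization level is a linear extrapolation of past usage, sketched below. The description does not state how projections are computed, so the linear growth assumption, the function name `periods_until_full`, and the sample figures are all hypothetical.

```python
def periods_until_full(used_history, capacity):
    """Estimate how many sampling periods remain before capacity is reached.

    Uses the average growth per period over the history (a simple linear
    assumption). Returns None when usage is flat or shrinking.
    """
    if len(used_history) < 2:
        raise ValueError("need at least two samples")
    growth = (used_history[-1] - used_history[0]) / (len(used_history) - 1)
    if growth <= 0:
        return None
    remaining = capacity - used_history[-1]
    return max(remaining, 0) / growth


# Hypothetical monthly usage, in gigabytes, for one storage device.
history = [400, 450, 500, 550]
baseline = periods_until_full(history, capacity=1000)
# "What if" scenario: the same usage trend after a capacity upgrade.
upgraded = periods_until_full(history, capacity=2000)
```

Running the same history against different capacities is one form a "what if" scenario of the kind mentioned under capacity management could take.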
In step 310, SRM 102 determines inventory and performance information by comparing status information to stored information and providing variances and changes.
In step 312, SRM 102 accesses notification rules that define values for which a notification is required to be sent to selected user devices 128. In embodiments, notification rules track network and device status, as well as data status information. For example, several notification rules are based on values for: central processing unit (CPU) temperature, motherboard temperature, fan speed, network utilization, hard disk storage capacity, operating system type, and hardware configuration. These examples are not intended to limit the application of notification rules. Alternate notification rules will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. For instance, one can provide a notification rule where a change in the amount of physical memory on a device triggers an alert (from a rule matching values for hardware configuration).
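Notification rules of this kind can be sketched as named predicates evaluated against current and stored status information. The sketch below is illustrative only; the rule names, the thresholds, and the dictionary keys are assumptions and are not taken from the description.

```python
def evaluate_rules(rules, current, stored):
    """Return (rule name, device id) pairs for every rule that fires."""
    fired = []
    for device_id, status in sorted(current.items()):
        previous = stored.get(device_id, {})
        for name, check in rules:
            if check(status, previous):
                fired.append((name, device_id))
    return fired


# Hypothetical rules; thresholds and keys are chosen for illustration only.
rules = [
    ("cpu_temperature_high", lambda now, before: now.get("cpu_temp_c", 0) > 85),
    ("fan_stopped", lambda now, before: now.get("fan_rpm", 1) == 0),
    # Hardware configuration rule: the amount of physical memory changed.
    ("memory_changed",
     lambda now, before: "memory_mb" in before
     and now.get("memory_mb") != before["memory_mb"]),
]

stored = {"srv-07": {"memory_mb": 4096}}
current = {"srv-07": {"cpu_temp_c": 60, "fan_rpm": 3000, "memory_mb": 2048}}
notifications = evaluate_rules(rules, current, stored)
```

Here only the memory rule fires, because the stored record shows 4096 MB while the current status reports 2048 MB.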
In step 314, SRM 102 determines required notifications based on notification rules.
In step 316, SRM 102 forwards required notifications to selected user devices 128. In step 318, SRM 102 accesses reporting rules that define the format and associations of the various devices of the SAN.
In step 320, SRM 102 determines required reports based on the reporting rules. The following list includes some example reporting rules that are not intended to limit the invention but to show the invention's functionality and ability to provide accurate information.
In one embodiment, reporting rules are run against all incoming notifications. More specifically, each reporting rule is applied by running its stored procedure on the notifications until all reports are applied or the list of alertable notifications reaches 0. If there are any notifications left after the reporting rules are applied, then the remaining are forwarded in a report.
In an embodiment, new reporting rules can be added to the invention. Reporting rules can also be disabled. The following list identifies several example reporting rules:
Alertable Notification Reporting Rule - Notifications that are specified as not alertable are removed. This reporting rule is set at the component type and severity level. If a notification comes in with a component type and severity level that is specified as not being alertable, then the notification is removed.
Disabled Notification Reporting Rule - Notifications can be temporarily disabled at the appliance and component level. An entry can be made in a table to disable notifications concerning a component in a device for a set number of hours. In one embodiment, when this reporting rule is applied to the incoming notification, the notifications that fall within the disable period are removed.
Outdated Notification Reporting Rule - Notifications are removed if the date of the notification is older than the device's setup date. This is meant to remove notifications that the SRM received before the most recent setup date (where the device's configuration may have been altered).
Duplicate Notification Reporting Rule - Duplicate notifications are removed.
Duplicate Alert Interval Reporting Rule - There is a global setting per company which allows someone to set the time interval in which the system will not send duplicate reports. This reporting rule removes notifications that fall within the interval specified.
Sensor Override Reporting Rule - Sensor notifications are reported out based on sensor override threshold values set by the user. If a fan sensor warning comes in from a device and the sensor override threshold table indicates that the fan speed falls within the user's override setting, no report is forwarded.
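The successive application of reporting rules to incoming notifications can be sketched as a chain of filters. The description refers to stored procedures; the Python functions below merely stand in for them, and the record fields, the `ctx` settings dictionary, and the sample data are assumptions made for illustration. Only four of the rules listed above are shown.

```python
from datetime import datetime, timedelta


def alertable_rule(notes, ctx):
    """Drop notifications whose component type and severity are not alertable."""
    return [n for n in notes
            if (n["component"], n["severity"]) not in ctx["not_alertable"]]


def disabled_rule(notes, ctx):
    """Drop notifications that fall within a disable period."""
    def is_disabled(n):
        until = ctx["disabled_until"].get((n["device"], n["component"]))
        return until is not None and n["time"] <= until
    return [n for n in notes if not is_disabled(n)]


def outdated_rule(notes, ctx):
    """Drop notifications dated before the device's setup date."""
    return [n for n in notes if n["time"] >= ctx["setup_date"][n["device"]]]


def duplicate_rule(notes, ctx):
    """Keep only the first notification per device, component and severity."""
    seen, kept = set(), []
    for n in notes:
        key = (n["device"], n["component"], n["severity"])
        if key not in seen:
            seen.add(key)
            kept.append(n)
    return kept


def apply_reporting_rules(notes, rules, ctx):
    """Apply each rule in turn, stopping early once no notifications remain."""
    for rule in rules:
        if not notes:
            break
        notes = rule(notes, ctx)
    return notes


now = datetime(2001, 1, 10, 12, 0)  # arbitrary reference time
ctx = {
    "not_alertable": {("fan", "info")},
    "disabled_until": {("disk-01", "temp"): now + timedelta(hours=4)},
    "setup_date": {"disk-01": now - timedelta(days=1),
                   "srv-07": now - timedelta(days=1)},
}
incoming = [
    {"device": "srv-07", "component": "fan", "severity": "info", "time": now},
    {"device": "disk-01", "component": "temp", "severity": "warning", "time": now},
    {"device": "srv-07", "component": "psu", "severity": "critical",
     "time": now - timedelta(days=3)},
    {"device": "srv-07", "component": "cpu", "severity": "critical", "time": now},
    {"device": "srv-07", "component": "cpu", "severity": "critical", "time": now},
]
rules = [alertable_rule, disabled_rule, outdated_rule, duplicate_rule]
report = apply_reporting_rules(incoming, rules, ctx)
```

Because each rule has the same signature, adding a new reporting rule or disabling an existing one amounts to editing the `rules` list.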
Moreover, some examples of the additional reporting rules and combinations thereof are: 1) the ability to disable individual alerts; 2) the ability to allow users to subscribe to alerts; 3) state driven alerts (rules of this type would suppress things like hardware failures until the state of the hardware changes); 4) placing thresholds on alerts so a user would get an alert after a certain number of occurrences of the alert; 5) alert correlation - ability for a combination of alerts to be reported as something different than a series of alerts (for example, a fan speed of zero along with a power supply voltage of zero and no network presence from the device would lead to a single report to a user that the device appears to be disabled or without power and not several alerts about the various notifications received).
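The alert correlation example can be sketched as follows: when all three symptoms are present for a device, they are replaced by a single report line. The notification kind names and the function `correlate_alerts` are hypothetical and serve only to illustrate the idea.

```python
def correlate_alerts(notifications):
    """Collapse related notifications per device into single report lines."""
    by_device = {}
    for device, kind in notifications:
        by_device.setdefault(device, set()).add(kind)

    # Combination that points to one underlying cause (names are assumed).
    power_loss = {"fan_speed_zero", "psu_voltage_zero", "no_network_presence"}

    reports = []
    for device in sorted(by_device):
        kinds = by_device[device]
        if power_loss <= kinds:  # all three symptoms are present
            reports.append(f"{device}: appears to be disabled or without power")
            kinds = kinds - power_loss
        reports.extend(f"{device}: {kind}" for kind in sorted(kinds))
    return reports


notifications = [
    ("srv-07", "fan_speed_zero"),
    ("srv-07", "psu_voltage_zero"),
    ("srv-07", "no_network_presence"),
    ("disk-01", "temperature_high"),
]
reports = correlate_alerts(notifications)
```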
In step 322, SRM 102 forwards required reports to selected user devices 128. In one embodiment, the required reports are forwarded to a Web site or similarly functioning component of reporting module 112, where users can use a user device 128 to access the reports.
Referring to FIG. 4, a flowchart describing a routine for updating storage resources, according to an embodiment of the invention, is shown.
In step 402, SRM 102 receives updates from a user device 128. In one embodiment, the updates include instructions for the altering of the configuration of the devices in the SAN. In another embodiment, the updates include instructions for the addition of new devices to the SAN. In yet another embodiment, the updates include instructions that tell the SRM 102 to obtain update information (that is, code changes, software upgrades, patches, and/or other fixes).
In step 404, SRM 102 revises stored information and rules to reflect the updates received.
In step 406, SRM 102 obtains update information. In one embodiment, the update information is obtained from other Web sites on the global Internet. In another embodiment, the update information is obtained from a device on the SAN.
In step 408, SRM 102 forwards updates. In one embodiment, the communications module 104 of the SRM 102 communicates the operating changes to the devices of the SAN. In another embodiment, the communications module 104 utilizes the service agents 124 to communicate with the devices of the SAN. In yet another embodiment, the direct remote access manager module (DRAM module) 116 is used to communicate the updates directly to the DRAC 126. The DRAC 126 therefore provides the altered operating and configuration information to the devices of the SAN.
In step 410, SRM 102 forwards update information to the devices of the SAN. In embodiments similar to those of step 408, the SRM 102 provides the update information obtained in step 406 to the devices of the SAN.
It is noted that FIG. 7 provides an alternative embodiment of FIGS. 3 and 4. Referring to FIG. 7, an event trace flowchart 700 of the storage resource manager, according to embodiments of the present invention, is shown.
Storage area network (SAN) 702, storage resource manager (SRM) 704, and user 706 are delineated in order to emphasize, in this embodiment, the separation of the steps described in FIGS. 3 and 4.
In step 708, service agents forward status information from the SAN 702 to the communications module of SRM 704.
In step 710, SRM 704, similar to SRM 102, receives the status information.
In step 712, SRM 704 correlates the status information and stored information. In step 714, SRM 704 determines the capacity, inventory and performance information of the SAN 702.
In step 716, SRM 704 determines notifications.
In step 718, user 706 receives notifications. In step 720, SRM 704 determines reports. In step 722, user 706 receives reports.
In step 724, user 706 forwards updates to SRM 704. In embodiments, the communication in steps 718, 722, and 724 between the SRM 704 and user 706 takes place via a communications module of SRM 704 or other control logic module of SRM 704 and a user device. In step 726, SRM 704 revises stored information to reflect the updates.
In step 728, SRM 704 sends the updates to the service agents or DRAC via the communications module or DRAM module of the SRM 704.
In step 730, service agents or DRAC receive updates.
In step 732, SRM 704 obtains update information. In step 734, SRM 704 sends the update information to the service agents or DRAC via the communications module or DRAM module of the SRM 704.
In step 736, service agents or DRAC receive update information.
The embodiments for the storage resource manager of the present invention described above are provided for purposes of illustration. These embodiments are not intended to limit the invention. Alternate embodiments, differing slightly or substantially from those described herein, will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein.
Graphical User Interface Embodiments of the Storage Resource Manager
The invention may be implemented in any communication network, including LANs, WANs, and the Internet. As described above, the reports of SRM 102 can be web pages. The reporting module 112, as well as the web pages generated by it, are independent of the service agents 124. Therefore, the reporting module 112 is able to provide the GUI of the present invention regardless of the operations of the service agents 124. For example, if the SAN becomes disconnected and no status information is available, then the reporting module 112 will still be able to provide that information or rather the lack of current status information, as well as any stored information available from database 106.
Exemplary GUIs of the invention are now discussed in greater detail. Referring to FIG. 5, a block diagram of an example graphical user interface screen 500 of a storage resource manager, according to an embodiment of the invention, is shown.
Screen 500 includes a header panel 502, a system tree panel 518, and a database selector panel 520. Screen 500 is used to select and view the devices of a SAN. Screen 500 is also used to review the configuration settings and operating parameters of the devices of the SAN.
Header 502 includes a home link 504, an accounts link 506, a reports link 508, a tree link 510, a downloads link 512, a search link 514, and a log out link 516. Home link 504 provides a button or hotspot the activation of which displays the main title screen of the Web site. Accounts link 506 provides a button to a screen (not shown) displaying a listing of users who have accounts (login and passwords) in the SRM 102. Reports link 508 provides a button to display a Web page listing the available reports. Tree link 510 provides a button to display the available system trees of panel 518. Downloads link 512 provides a button to display the current update information available to the SRM 102. Search link 514 provides a button to display a search screen that allows a user to search for and find records/stored information about a specific device. Log out link 516 provides a button to terminate a user's current session with the GUI.
Panel 518 provides a listing of all the systems known to the invention. Panel 518 displays the dependencies of the systems with other systems. One or more SANs can be displayed in panel 518. In one embodiment, selection of a specific system in panel 518 updates panel 520 to show the database which stores information for that system. Thus, database selector panel 520 displays information about itself, as well as selection fields 522 which detail the current and historical settings and parameters of a given number of devices within the selected system. Panel 520 provides specific information about the system selected in panel 518. In one embodiment, panel 520 contains information about a specific appliance 818, when appliance 818 is selected in panel 518. Panel 520 contains current status information about appliance 818, as well as additional links to other parsed status information. For example, an alerts link can be activated such that a listing of notifications is displayed in panel 520. As described herein, the notifications result from variances of parameters in current status information from either the stated parameters of a notification rule or the stored parameters for that device. Additional links contained in panel 520 can be: a note link, which links to a notes field for adding notes about the device for future users; a history link; and a software link.
Referring to FIG. 6, another block diagram of an example graphical user interface of a storage resource manager, according to another embodiment of the invention, is shown.
Screen 600 includes a header panel 602, a device tree panel 618, and a device identity information panel 620. Screen 600 is used to select and view the devices of a SAN. Screen 600 is also used to review the configuration settings and operating parameters of the devices of the SAN.
Header 602 includes a home link 604, an accounts link 606, a reports link 608, a tree link 610, a downloads link 612, a search link 614, and a log out link 616. Home link 604 provides a button or hotspot the activation of which displays the main title screen of the Web site. Accounts link 606 provides a button to a screen (not shown) displaying a listing of users who have accounts (login and passwords) in the SRM 102. Reports link 608 provides a button to display a Web page listing the available reports. Tree link 610 provides a button to display the available device trees of panel 618. Downloads link 612 provides a button to display the current update information available to the devices managed by SRM 102. Search link 614 provides a button to display a search screen that allows a user to search for and find records/stored information about a specific device. Log out link 616 provides a button to terminate a user's current session with the GUI.
Panel 618 provides a listing of all the devices and their dependencies to other devices within the SAN. One or more SANs can be displayed in panel 618. In one embodiment, selection of a specific device in panel 618 updates panel 620 to show the device which has information stored about it. Thus, device identity information panel 620 displays information about itself, as well as device parameters 622 which detail the current and historical settings and parameters of a given number of devices within the selected system.
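A tree listing of devices and their dependencies, such as might populate panel 518 or panel 618, can be sketched as an indented rendering of a parent-to-children mapping. The mapping, the device names, and the function `render_tree` are assumptions made for this illustration; the description does not specify how the trees are stored or drawn.

```python
def render_tree(children, root, depth=0):
    """Return indented lines for `root` and everything that depends on it."""
    lines = ["  " * depth + root]
    for child in sorted(children.get(root, [])):
        lines.extend(render_tree(children, child, depth + 1))
    return lines


# Hypothetical dependency data: a SAN, its switch, and two attached devices.
children = {
    "san-a": ["switch-01"],
    "switch-01": ["srv-07", "disk-01"],
}
tree_lines = render_tree(children, "san-a")
print("\n".join(tree_lines))
```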
In alternative embodiments, the storage resource manager can be configured to operate in a semi-automated or fully automated fashion, where storage resources are managed by the operation of a computer algorithm. For instance, a user is able to designate certain notifications as requiring the immediate shutdown of a given device, or the given device can be shut down automatically.
Further, the computer algorithm can then re-allocate the remaining devices to provide the services provided by the now shutdown device.
Alternatively, a previously inactive device can be activated to provide the services for the shutdown device. Further ways of managing storage resources will be known to persons skilled in the relevant art(s) from the teachings herein.
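A minimal sketch of such automated handling is given below. It is one possible algorithm and not the one used by the invention: it assumes each device carries a state and a list of services, and it prefers activating a standby device over loading an already active one. All names and data are hypothetical.

```python
def handle_notification(kind, device, devices, shutdown_kinds):
    """Shut down `device` if `kind` requires it, then reassign its services.

    Returns the name of the device that took over the services, or None
    when no shutdown was required or no replacement device was available.
    """
    if kind not in shutdown_kinds:
        return None
    devices[device]["state"] = "shutdown"
    services = devices[device]["services"]
    devices[device]["services"] = []
    # Prefer a previously inactive (standby) device, else any active one.
    for wanted in ("standby", "active"):
        for name in sorted(devices):
            if devices[name]["state"] == wanted:
                devices[name]["state"] = "active"
                devices[name]["services"].extend(services)
                return name
    return None


# Hypothetical SAN with one active device and one inactive spare.
devices = {
    "disk-01": {"state": "active", "services": ["volume-a"]},
    "disk-02": {"state": "standby", "services": []},
}
target = handle_notification(
    "temperature_critical", "disk-01", devices, {"temperature_critical"}
)
```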
It will be known to persons skilled in the relevant art(s) from the teachings herein that the invention is adaptable to additional or fewer GUI elements, additional or fewer links and/or fields. Description in terms of these elements is provided for convenience only. It is not intended that the invention be limited to application in this example GUI. In fact, after reading this description, it will become apparent to persons skilled in the relevant arts how to implement the invention in alternative graphical user interfaces known now or developed in the future.
Example Computer System
An example of a computer system 1040 is shown in FIG. 10. The computer system 1040 represents any single or multi-processor computer. In conjunction, single-threaded and multi-threaded applications can be used.
Unified or distributed memory systems can be used. Computer system 1040, or portions thereof, may be used to implement the present invention. For example, the storage resource manager of the present invention may comprise software running on a computer system such as computer system 1040.
In one example, the storage resource manager of the present invention is implemented in a multi-platform (platform independent) programming language such as JAVA 1.1, programming language/structured query language (PL/SQL), hyper-text mark-up language (HTML), practical extraction report language (PERL), common gateway interface/structured query language (CGI/SQL) or the like. Java™-enabled and JavaScript™-enabled browsers are used, such as Netscape™, Hot Java™, and Microsoft™ Explorer™ browsers. Active content Web pages can be used. Such active content Web pages can include Java™ applets or ActiveX™ controls, or any other active content technology developed now or in the future. The present invention, however, is not intended to be limited to Java™, JavaScript™, or their enabled browsers, and can be implemented in any programming language and browser, developed now or in the future, as would be apparent to a person skilled in the art given this description.
In another example, the storage resource manager of the present invention may be implemented using a high-level programming language (e.g., C++) and applications written for the Microsoft Windows™ environment. It will be apparent to persons skilled in the relevant art(s) how to implement the invention in alternative embodiments from the teachings herein.
Computer system 1040 includes one or more processors, such as processor 1044. One or more processors 1044 can execute software implementing routines described above, such as shown in flowchart 400. Each processor 1044 is connected to a communication infrastructure 1042 (e.g., a communications bus, cross-bar, or network). Various software embodiments are described in terms of this exemplary computer system. After reading this description, it will become apparent to a person skilled in the relevant art how to implement the invention using other computer systems and/or computer architectures.
Computer system 1040 can include a display interface 1002 that forwards graphics, text, and other data from the communication infrastructure 1042 (or from a frame buffer not shown) for display on the display unit 1030.
Computer system 1040 also includes a main memory 1046, preferably random access memory (RAM), and can also include a secondary memory 1048.
The secondary memory 1048 can include, for example, a hard disk drive 1050 and/or a removable storage drive 1052, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive 1052 reads from and/or writes to a removable storage unit 1054 in a well known manner. Removable storage unit 1054 represents a floppy disk, magnetic tape, optical disk, etc., which is read by and written to by removable storage drive 1052. As will be appreciated, the removable storage unit 1054 includes a computer usable storage medium having stored therein computer software and/or data. In alternative embodiments, secondary memory 1048 may include other similar means for allowing computer programs or other instructions to be loaded into computer system 1040. Such means can include, for example, a removable storage unit 1062 and an interface 1060. Examples can include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an EPROM, or PROM) and associated socket, and other removable storage units 1062 and interfaces 1060 which allow software and data to be transferred from the removable storage unit 1062 to computer system 1040.
Computer system 1040 can also include a communications interface 1064. Communications interface 1064 allows software and data to be transferred between computer system 1040 and external devices via communications path 1066. Examples of communications interface 1064 can include a modem, a network interface (such as Ethernet card), a communications port, interfaces described above, etc. Software and data transferred via communications interface 1064 are in the form of signals which can be electronic, electromagnetic, optical or other signals capable of being received by communications interface 1064, via communications path 1066. Note that communications interface 1064 provides a means by which computer system 1040 can interface to a network such as the Internet. The present invention can be implemented using software running (that is, executing) in an environment similar to that described above with respect to FIG. 8. In this document, the term "computer program product" is used to generally refer to removable storage unit 1054, a hard disk installed in hard disk drive 1050, or a carrier wave carrying software over a communication path 1066 (wireless link or cable) to communication interface 1064. A computer useable medium can include magnetic media, optical media, or other recordable media, or media that transmits a carrier wave or other signal. These computer program products are means for providing software to computer system 1040.
Computer programs (also called computer control logic) are stored in main memory 1046 and/or secondary memory 1048. Computer programs can also be received via communications interface 1064. Such computer programs, when executed, enable the computer system 1040 to perform the features of the present invention as discussed herein. In particular, the computer programs, when executed, enable the processor 1044 to perform features of the present invention. Accordingly, such computer programs represent controllers of the computer system 1040.
The present invention can be implemented as control logic in software, firmware, hardware or any combination thereof. In an embodiment where the invention is implemented using software, the software may be stored in a computer program product and loaded into computer system 1040 using removable storage drive 1052, hard disk drive 1050, or interface 1060. Alternatively, the computer program product may be downloaded to computer system 1040 over communications path 1066. The control logic (software), when executed by the one or more processors 1044, causes the processor(s) 1044 to perform functions of the invention as described herein.
In another embodiment, the invention is implemented primarily in firmware and/or hardware using, for example, hardware components such as application specific integrated circuits (ASICs). Implementation of a hardware state machine so as to perform the functions described herein will be apparent to persons skilled in the relevant art(s) from the teachings herein.
Conclusion
While various embodiments of the present invention have been described above, it should be understood that they have been presented by way of example only, and not limitation. It will be apparent to persons skilled in the relevant art that various changes in form and detail can be made therein without departing from the spirit and scope of the invention. Thus, the breadth and scope of the present invention should not be limited by any of the above-described exemplary embodiments, but should be defined only in accordance with the following claims and their equivalents.