System and method for distribution of network file accesses over network storage devices

No: 6654795
Application no: 09513146
Filed date: 2000-02-25
Issue date: 2003-11-25

Abstract:
The present invention is a system and method supporting efficient distribution of file access requests across one or more storage device systems. A typical system according to the present invention includes a gateway system, one or more indexing systems and one or more storage device systems. File access requests are initially received by the gateway system. The gateway system forwards a received file access request to a selected indexing system; the gateway system maintains information concerning the connection between the originator of the request and the selected indexing system. The selected indexing system determines the locations at which the requested file is stored among the storage device systems and selects one of these locations. The selected indexing system forwards the file access request to the storage device system indicated by the selected location, which responds by breaking the requested file into data packets and transmitting the data packets to the request originator. When the first of the data packets passes through the gateway, the gateway updates its information concerning the connection by replacing the information regarding the selected indexing system with information regarding the storage device system indicated by the selected location.

Claims:
What is claimed is:

1.
A system for distribution of network file accesses over storage devices, comprising: (a) one or more storage device systems containing files, wherein each storage device system is capable of performing the steps of: (i) receiving a file access request; (ii) responsive to the receipt of the file access request, generating a file data packet; and (iii) outputting the generated file data packet; and (b) one or more indexing systems, wherein each indexing system is capable of performing the steps of: (i) receiving a file access request comprising a file identifier; (ii) locating a storage device system containing a file corresponding to the file identifier in the received file access request; and (iii) forwarding the received file access request to the located storage device system; and (c) a gateway system for performing the steps of: (i) establishing a connection between a client computer and a selected indexing system; (ii) forwarding a file access request received from the client computer to the selected indexing system; and (iii) upon receiving the generated file data packet from a selected storage device system, changing the connection between the client computer and the selected indexing system to a connection between the client computer and the located storage device system.

2. The system of claim 1, and further comprising a communication network connecting the one or more storage device systems, the gateway system and the one or more indexing systems.

3. The system of claim 1, wherein the gateway system, the one or more indexing systems and the one or more storage devices are connected on a single physical network.

4. The system of claim 1, and further comprising a private communication network connecting the one or more storage device systems and the one or more indexing systems.

5.
The system of claim 4, wherein the selected indexing system performs the step of forwarding the file access request to the located storage device system via the private communication network.

6. The system of claim 1, wherein the gateway system comprises a storage device capable of storing one or more connection objects each representing a connection between a client computer and a specific indexing system or a specific storage device.

7. The system of claim 6, wherein the gateway system establishes the connection between the client computer and a selected indexing system by performing the steps of: (A) receiving a connection initiation packet from the client computer; (B) selecting an indexing system from the one or more indexing systems; and (C) creating a connection object representing the connection between the client computer and the selected indexing system in the gateway system's storage device.

8. The system of claim 7, wherein the gateway system performs the step of selecting an indexing system by performing the step comprising of selecting an indexing system based upon current usage of the one or more indexing systems.

9. The system of claim 7, wherein the gateway system performs the step of selecting an indexing system by performing the step comprising of selecting an indexing system based upon a circular queuing algorithm.

10. The system of claim 6, wherein the gateway system forwards a file access request received from the client computer to the selected indexing system by performing the steps of: (A) receiving from the client computer a file access request comprising a destination address and a source address; (B) identifying the connection object in the gateway system's storage device corresponding to the received file access request; and (C) transmitting the received file access request to the selected indexing system indicated by the identified connection object.

11.
The system of claim 10, wherein the gateway system forwards a file access request received from the client computer to the selected indexing system by performing the further step comprising of: (D) modifying the destination address of the received file access request to a destination address of the selected indexing system.

12. The system of claim 11, wherein the gateway system forwards a file access request further comprising a checksum received from the client computer to the selected indexing system by performing the further step comprising of: (E) updating the checksum of the received file access request to conform with the modified destination address.

13. The system of claim 1, wherein each indexing system locates a storage device system containing a file corresponding to the file identifier in the received file access request by performing the steps of: (A) determining a set of storage device systems from the one or more storage device systems containing a file corresponding to the file identifier in the received file access request; and (B) selecting a storage device system from the determined set.

14. The system of claim 13, wherein the indexing system performs the step of selecting a storage device system by performing the step comprising of selecting a storage device system from the determined set based upon current usage of the storage device systems in the determined set.

15.
A method operable by a gateway system for distribution of a network file access received from a client computer to a storage device system via an indexing system, comprising the steps of: (a) establishing a connection between the client computer and the indexing system; (b) forwarding a file access request comprising a file identifier received from the client computer to the indexing system; (c) upon receiving a file data packet from a storage device system, changing the connection between the client computer and the indexing system to a connection between the client computer and the storage device system.

16. The method of claim 15, and further comprising the steps of (d) receiving by the indexing system the file access request; (e) locating a storage device system containing a file corresponding to the file identifier in the received file access request; and (f) forwarding the received file access request to the located storage device system.

17. The method of claim 16, and further comprising the step of (g) receiving a file access request by the located storage device system; (h) responsive to the receipt of the file access request, generating a file data packet by the located storage device system; and (i) outputting by the located storage device system the generated file data packet.

18. The method of claim 15, and further comprising the steps of (d) receiving a file access request by a storage device system; (e) responsive to the receipt of the file access request, generating a file data packet by the storage device system; and (f) outputting by the storage device system the generated file data packet.

19.
A computer-readable, digital storage device containing instructions, that upon execution by a processor in a gateway system, cause the processor to distribute a network file access received from a client computer to a storage device system via an indexing system, by performing the steps comprising of: (a) establishing a connection between the client computer and the indexing system; (b) forwarding a file access request comprising a file identifier received from the client computer to the indexing system; (c) upon receiving a file data packet from a storage device system, changing the connection between the client computer and the indexing system to a connection between the client computer and the storage device system.

Text:

BACKGROUND OF INVENTION

1. Field of Invention

The invention relates to a system and method for distributing file accesses over a network of storage device systems. More specifically, the invention relates to a network translation and organizational system and method that receives file access requests and selectively distributes such requests to an appropriate connected storage device system.

2. Description of Related Art

Over the past several years, the Internet has experienced explosive growth. A significant portion of this growth relates to the expanded use of the World Wide Web (the Web). The Web is a group of computers on the Internet providing a distributed hypermedia framework for presenting and viewing multimedia documents. Web pages may contain a variety of multimedia elements and links to other Web pages. Pages are generally constructed using the Hypertext Markup Language (HTML), although other document formatting standards may play a role, such as Cascading Style Sheets (CSS) and Extensible Markup Language (XML) and its progeny, such as the Synchronized Multimedia Integration Language (SMIL) and the Resource Description Framework (RDF). Once a page is created, the page resides on a Web server system, or Web site.
A particular Web site may host a variety of pages. Client computers can access pages residing on Web sites using a variety of commonly available browser software packages such as Internet Explorer (Microsoft), Netscape (Netscape) or other similar products. The browser software and the server system communicate with each other using the Hypertext Transfer Protocol (HTTP). The client issues a request for a particular resource on the Web using a Uniform Resource Identifier (URI); typically in the case of an HTML Web page, the URI will be a Uniform Resource Locator (URL). A URL specifically identifies a particular resource such as a Web page on the Web. The URL will indicate the particular computer on the Web on which the desired Web page resides, as well as the location of the desired Web page on that computer.

Browser software on a client computer generates an HTTP request under a variety of user triggered circumstances such as entering a target URL, selecting a link in a currently viewed page, selecting an item from the browser's history list or pressing the home button. In addition, HTTP requests are generated automatically when other discrete resources are included within a retrieved Web page; for instance, a second HTTP request would be generated for requesting and retrieving an image embedded within a page retrieved from an initial user triggered HTTP request.

When the Web server indicated in the URL receives the request, the server parses the location information in the URL. Servers utilize a recursive hierarchical directory structure for storing Web available resources. The parsed location information from the request serves as a map by which the server locates the requested resource. The server formats an HTTP response including the requested resource and forwards the response to the requesting client.

With the tremendous proliferation of Web usage, the volume of requests has overburdened single Web servers. As seen in This solution has several disadvantages.
First, all Web servers must redundantly store all pages for the single logical Web site. Second, the load balancing server must deal with the flow through for all communication between the client and the selected server in the cluster.

The same content management issues also occur within the context of multimedia content servers having a set of file servers and a set of network attached storage systems. Even in load balancing systems, communications must repeatedly flow through the file servers rather than directly between the client and the data storage systems.

The prior art systems do not support the efficient distribution of multiple simultaneous file requests to a single logical server. Further, they do not provide for establishment of direct communication paths between clients and data storage systems. Finally, current systems usually require redundant storage of all data on each data storage system. The system according to the present invention addresses these disadvantages.

SUMMARY OF THE INVENTION

The present invention is directed to a system and method for distributing file access requests over connected storage device systems. A system according to the present invention will include one or more storage device systems, one or more indexing systems and a gateway system. When an incoming file (data set) access request is received at the gateway system, the gateway system establishes a connection between the computer originating the file access request and an indexing system and forwards the file access request to the indexing system. The indexing system receives the file access request, locates a storage system that contains the file indicated in the file access request and forwards the file access request to the located storage system. The located storage system receives the request and generates a data packet containing file data directed to the computer originating the file access request.
As the data packet passes through the gateway system, the gateway system updates the connection between the request originating computer and the indexing system to a connection between the request originating computer and the located storage system.

The above and other objects and advantages of the present invention will become more readily apparent when reference is made to the following description, taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

DETAILED DESCRIPTION OF THE INVENTION

A preferred embodiment of the invention is now described in detail. Referring to the drawings, like numbers indicate like parts throughout the views. As used in the description herein and throughout the claims that follow, the meaning of "a," "an," and "the" includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of "in" includes "in" and "on" unless the context clearly dictates otherwise.

In a preferred embodiment, the present invention includes a gateway system, one or more storage device systems and one or more indexing systems. A typical environment is displayed in FIG. A gateway system The system includes an indexing subsystem The union of all files stored within the storage subsystem constitutes a single information space for the server environment The gateway system, all indexing systems and all storage device systems communicate via ethernet The gateway system User software running a client computer The various system components process the received file access request as seen in FIG. The gateway system

The selection process may be random, based on system load of available indexing systems or, in a preferred embodiment, based on a circular queue of indexing systems. In the circular queue approach, all indexing systems are part of the queue.
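The gateway behavior described here (circular-queue selection of an indexing system, then rewriting the stored connection once a storage device system starts answering) can be illustrated with a minimal Python sketch. The names `Gateway`, `Connection`, `open_connection` and `redirect`, and the use of plain strings as system addresses, are assumptions made for this illustration and do not come from the patent; packet forwarding itself is omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical connection object kept by the gateway."""
    client: tuple  # (client address, client port)
    target: str    # indexing system or storage device system serving the client


class Gateway:
    def __init__(self, indexing_systems):
        # Every indexing system is a member of the circular queue.
        self.queue = deque(indexing_systems)
        self.connections = {}

    def select_indexing_system(self):
        # Take the system at the head, then advance the head by one position,
        # so systems are chosen on a least-recently-selected basis.
        selected = self.queue[0]
        self.queue.rotate(-1)
        return selected

    def open_connection(self, client):
        # Record which indexing system was chosen for this client.
        conn = Connection(client, self.select_indexing_system())
        self.connections[client] = conn
        return conn

    def redirect(self, client, storage_system):
        # When the first data packet from a storage device system passes
        # through, point the existing connection at that storage system.
        self.connections[client].target = storage_system
```

A `deque` is only one way to obtain circular behavior; a static array with a wrapping head index would serve equally well.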
When an incoming file access request is received, the indexing system at the head of the queue is selected, and the head of the queue is advanced to the next position in the queue. Since the queue is circular, indexing systems will be selected on a least recently selected basis. A variety of methods for implementing circular queues such as linked-list or static array are known to those skilled in the art, and any such method could be used in implementing this selection process (see, e.g., W. Ford and W. Topp, Once the indexing system is selected, the gateway system

In this embodiment, all file requests are assumed to be directed to a single virtual server having a single logical IP address and port. Where the server environment After establishing the connection to the selected indexing system In a further embodiment where the gateway system Upon receipt of the file access request forwarded by the gateway system

In a preferred embodiment, the search on the indexing systems is accomplished via a hashing scheme. Hashing is a general search technique known to those skilled in the art (see, e.g., Ford, id., pp. 799-814). This technique, however, has not been applied in the context of determination of file locations in a distributed information space. The hashing scheme in this preferred embodiment may utilize any standard hashing function. The input to the hashing function would be the string representing the file identifier contained in the file access request. In the case of a URL, this identifier information would be the full path of the desired resource on the target server. The result of applying the hashing function to the file identifier would yield an integer value. The integer value would be used as an index into an array of buckets where each bucket contains one or more records correlating file identifiers with file locations. The records in the bucket would be searched to locate the record corresponding to the hashed file identifier.
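As a rough illustration of this lookup, the following Python sketch hashes a file identifier to an integer, uses it as an index into an array of buckets, and searches the bucket's records for the matching identifier. The class name `FileIndex`, the choice of CRC-32 as the hashing function, and the string location names are assumptions for the example; the text only requires "any standard hashing function".

```python
import zlib


class FileIndex:
    """Hypothetical index mapping file identifiers to storage locations."""

    def __init__(self, num_buckets=64):
        # Each bucket holds (file identifier, list of locations) records.
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, file_id):
        # CRC-32 is an assumed stand-in for any standard hashing function;
        # the resulting integer is reduced to an index into the bucket array.
        index = zlib.crc32(file_id.encode("utf-8")) % len(self.buckets)
        return self.buckets[index]

    def add_location(self, file_id, location):
        bucket = self._bucket(file_id)
        for record_id, locations in bucket:
            if record_id == file_id:
                if location not in locations:
                    locations.append(location)
                return
        bucket.append((file_id, [location]))

    def locate(self, file_id):
        # Search the bucket's records for the one matching the identifier.
        for record_id, locations in self._bucket(file_id):
            if record_id == file_id:
                return list(locations)
        return []
```

For instance, after `index.add_location("/dir/file.htm", "store-1")`, a call to `index.locate("/dir/file.htm")` returns the locations recorded for that path, and an unknown identifier yields an empty list.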
In a preferred embodiment, each bucket would constitute a linked-list of records correlating file identifiers with file locations and locating the particular record would be accomplished by linear traversal of the linked list. In other embodiments, the buckets could store the records in an ordered tree structure for more efficient searching using a depth-first technique. In yet other embodiments, the buckets could be organized as a second level hash table utilizing a different hashing function. In this embodiment, the integer resulting from applying the hashing function would again serve as an index into an array of buckets containing records of the type described above. The organization of these buckets could utilize any appropriate structure such as those previously described with respect to the first level hashing table. It will be understood by those skilled in the art that other organization structures for the top level or lower level buckets may be interchangeably utilized within the scope of the present invention.

In a further embodiment, the hashing scheme might include one or more layers of hashing where each layer utilizes the same or differing hash functions. In this embodiment, the hashing function or functions may be applied to various substrings within the file identifier. For example, each level of hashing could be performed with respect to a set of a predetermined number of characters. If six characters were selected as the predetermined number, the first six characters of the identifier would be used for the first level of hashing, the next six would be used for the second level, and so forth.

In another embodiment, where the file identifier includes a string-based path specification with individual directories in the path indicated by particular delimiters, two levels of hashing might be performed. The first level hashing would be applied to the entirety of the identifier except the portion after the final path delimiter.
The second level hashing would be applied to the portion of the identifier after the final path delimiter. For example, if http://www.somesite.com/dir/subdir/subsubdir/file.htm were the file identifier, the first level hashing would be applied to the /dir/subdir/subsubdir/ portion of the identifier, and the second level hashing would be applied to the file.htm portion of the identifier. As with the buckets described above, a variety of organizational structures for the buckets could be used interchangeably within the scope of the present invention.

Once the locations of the requested file are determined, a particular location is selected (step Once a location for the file has been determined, the access request is forwarded to the storage device system The sequence could occur in accordance with the sequencing described in Information Sciences Institute, "The Transmission Control Protocol," Jon Postel, ed., Request For Comments (RFC) 793, September 1981 (available at http://www.ietf.org/rfc/rfc0793.txt), which is expressly incorporated herein in its entirety. The maintenance of the transmission control block for handling the transmissions would include variables SND.UNA, SND.NXT, SND.WND, SND.UP, SND.WL1, SND.WL2, ISS, RCV.NXT, RCV.WND, RCV.UP, IRS as described therein.

In a preferred embodiment, each indexing system will also track usage of each file in the information space by monitoring all file access requests forwarded by itself and other indexing systems. Each indexing system would monitor the forwarded file accesses issued by all indexing systems over the private ethernet The usage information in the table may be an absolute number of requests, a request rate, or other suitable measure as would be known to those skilled in the art.
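The per-file usage table could be kept as simply as in the Python sketch below, which stores an absolute request count per file identifier and derives a request rate from it. The class name `UsageTable`, the use of seconds as the time unit, and the decision to measure the rate from the moment tracking began are assumptions for illustration only.

```python
from collections import Counter


class UsageTable:
    """Hypothetical usage table kept by one indexing system."""

    def __init__(self, started_at):
        # Time (in seconds) at which this table started counting.
        self.started_at = started_at
        # Absolute number of forwarded requests seen per file identifier.
        self.counts = Counter()

    def record(self, file_id):
        # Called for every forwarded file access request that is observed,
        # whether issued by this indexing system or by another one.
        self.counts[file_id] += 1

    def request_rate(self, file_id, now):
        # Requests per second since tracking began; 0.0 if no time has passed.
        elapsed = now - self.started_at
        if elapsed <= 0:
            return 0.0
        return self.counts[file_id] / elapsed
```

Keeping the raw count and computing the rate on demand means either measure mentioned in the text can be read from the same table.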
If the usage information is a request rate, the request rate may be based upon either a fixed time interval such as a specified number of days, an absolute time frame such as the time since the gateway system was last restarted or a relative time frame such as the time since the particular file was first placed in its first location among the storage device systems.

In a preferred embodiment, indexing systems would run the private ethernet The control messages occurring on such a network would include:
1. Connection redirection requests (as described above);
2. Copy requests: issued by an indexing system directing a storage device system to copy a data set (file) to a second storage device system;
3. Delete requests: issued by an indexing system directing a storage device system to delete a data set (file); and
4. Move requests: issued by an indexing system directing a storage device system to move a data set (file) from itself to a second storage device system.

When an indexing system determines that a particular data set (file) is in high demand, a copy request may issue to spread the usage across additional storage device systems. Once one indexing system issues such a copy request, other indexing systems monitor the request and refrain from making a similar request unless warranted by further usage demands. Upon completion of the copy, all indexing systems update the set of locations associated with the file subject to the copy request.

When an indexing system determines that a particular data set (file) is not being accessed often enough to justify the number of times that it is stored across the storage device systems, the indexing system may issue a delete request to one of the storage device systems containing the particular data set. Once one indexing system issues such a delete request, other indexing systems monitor the request and refrain from making a similar request unless the single requested deletion does not sufficiently correct the situation.
Upon completion of the deletion, all indexing systems update the set of locations associated with the file subject to the delete request.

When an indexing system determines that a particular storage device system is being selected significantly more than others, the indexing system may issue a move request with respect to particular data sets stored. This request would issue in response to a load imbalance among storage device systems. Particular data sets may be targeted for moves to other storage device systems to more evenly balance the dynamic load across the storage device systems. A move is the functional equivalent of a copy and an implicit delete. The selected storage device system

The embodiments described above are given as illustrative examples only. It will be readily appreciated that many deviations may be made from the specific embodiment disclosed in this specification without departing from the invention. Accordingly, the scope of the invention is to be determined by the claims below rather than being limited to the specifically described embodiment above.
