Database for electronic design automation applications
Application no.: 09689167
Filed: 2000-10-11
A database for storing chip design information comprises a plurality of parallel arrays for storing a particular class of information. The union of related entries commencing at a given array index across the one or more parallel arrays of a particular class forms a structure for a given instance within a class. Between classes, individual records in an array may cross-reference, through an array index, records in other arrays. The inherent sequential nature of records stored in the array may be used as linking information, thus avoiding the requirement of storing linking pointers in memory. Rather than storing all of the coordinate or spatial information for a given shape, only the offset information from the preceding shape may be stored, with the assumption that the second shape starts at the ending point of the first shape. Certain default values or characteristics for information within the array records can be assumed unless overridden by an indicator in the array record. Allocation of storage space for data entries may be adaptively managed based on the size of the data to be stored, with allocation size being determined by the largest value of the stored entries, or a header code for the data entry indicating the number of bytes. The data header of each class may include a pointer indicating the position in memory of a main data header, which in turn contains pointers to the positions in memory of the other classes, allowing instances in a class to refer to related instances in the other classes through an integer index number without requiring the use of other pointers.
Class 707 (Data processing: database and file management or data structures), subclass 101 (Manipulating data structure, e.g., compression, compaction, compilation)
Class 716 (Data processing: design and analysis of circuit or semiconductor mask), subclass 11 (Layout editor, e.g., updating)
What is claimed is:
1. A method for managing information in a database used for automated electronic circuit design, the method comprising the steps of: storing a plurality of groups of one or more parallel arrays in a computer-readable medium, each group of one or more parallel arrays corresponding to a class of data items, wherein each data item within a class comprises one entry from each of the one or more parallel arrays within the class at the same index location across the one or more parallel arrays; storing a master database header comprising a plurality of class pointers, each class pointer identifying the location of one of said groups of one or more parallel arrays; and storing a plurality of class data headers, one class data header for each group of one or more parallel arrays, each class data header comprising a pointer identifying the location of said master database header.
2. The method of claim 1, wherein each group of one or more parallel arrays corresponds to a component type used in an electronic circuit design.
3. The method of claim 1, further comprising the step of expanding the size of an existing class by storing an additional parallel array to one of the groups of one or more parallel arrays, said additional parallel array containing an entry for each existing data item in the group.
4. The method of claim 1, wherein each of said class data headers comprises an allocation field indicating a number of array entries that have been allocated, and a use field indicating a number of array entries that are currently in use for storing data items.
5. The method of claim 4, wherein each of said class data headers comprises an unused entry index field indicating the index of the first unused data item location in the group of one or more parallel arrays, and wherein each unused data item location in the group except for the last unused data item location stores a continuation index link identifying the next unused data item location in the group.
6. The method of claim 1, wherein a data association across groups of the one or more parallel arrays is carried out by storing in a location of a first array in a first group an index of an associated data item in the one or more parallel arrays of a second group, wherein the method further comprises the step of accessing said associated data item upon reading said location of the first array in the first group by using the pointer in the class data header of the first array to locate the master database header, using one of the class pointers in the master database header to locate said second group, and using the index of the associated item to locate the associated data item in the second group.
7. The method of claim 1, wherein at least one of the one or more arrays in at least one of said groups comprises an intelligent integer array.
8. The method of claim 7, wherein said intelligent integer array has a width corresponding to a number of bytes, and wherein said number of bytes is adaptively determined based on the size of stored data elements in said intelligent integer array.
9. The method of claim 8, further comprising the step of expanding the width of said intelligent integer array when a data element is too large to store in the intelligent integer array based upon its current width.
10. The method of claim 7, wherein said intelligent integer array comprises a header field for one or more corresponding data entries, the header field indicating the number of bytes for its corresponding data entry.
11. The method of claim 10, wherein said intelligent integer array stores shape objects used in routing a circuit design, each shape object being stored in an adaptively selected number of bytes depending upon its type.
12. The method of claim 7, wherein said intelligent integer array stores properties for another array in the groups of one or more parallel arrays.
13. The method of claim 1, wherein said plurality of groups of one or more parallel arrays, said master database header, and said plurality of class data headers are defined in an object-oriented computer language.
14. The method of claim 1, wherein said step of storing said plurality of groups of one or more parallel arrays on a computer-readable medium comprises the step of making a single operating system call to store each of the one or more parallel arrays in each group.
15. The method of claim 1, wherein data entries in at least one group of one or more parallel arrays comprise routing coordinates used in a circuit design, wherein the routing coordinates are normalized by coordinates of a fixed routing grid prior to being stored.
16. The method of claim 15, further comprising the steps of storing a set of header codes indicating that the stored routing coordinates are to be multiplied by the coordinates of the fixed routing grid when being read out.
17. The method of claim 16, wherein at least one array in said plurality of groups of one or more parallel arrays comprises a From-To class of data items, wherein each data item is associated with a set of default assumptions and only deviations from the set of default assumptions are stored with the data item.
18. The method of claim 17, wherein said set of default assumptions include assumptions that a path starts at coordinates [0, 0], each layer starts at a defined first layer, no layer change occurs, no width change occurs, if a layer change occurs from a first layer to a second layer then a default via connects the first layer to the second layer, and if a layer change occurs then the new layer has a default width.
19. The method of claim 18, further comprising the step of reading out said at least one array comprising the From-To class of data items to construct a vectored path of connectivity in the circuit design, wherein the construction of said vectored path of connectivity uses said set of default assumptions for each data item in the absence of deviations stored with the data item.
20. A computer-readable medium for storing data for access by an electronic design automation software tool being executed on a data processing system, the computer-readable medium including a data structure comprising: a plurality of data objects, each data object comprising: an allocation field indicating a number of array entries that have been allocated; a use field indicating a number of array entries that are currently in use for storing data instances; and a set of parallel arrays, wherein the set of parallel arrays has an internal relationship such that each parallel array has an equal number of storage locations and a set of elements taken from the parallel arrays at a common array index represents a record for a single data instance in the data object, the set of parallel arrays configured such that performing an operation taken from a set of data object operations does not alter the internal relationship, and wherein at least one of the parallel arrays is configured to store data entries comprising index values referencing at least one other array.
21. The computer-readable medium of claim 20, wherein the set of data object operations includes a creation operation for a new parallel array in the data object and a modification operation for changing a dimension of at least one of the parallel arrays in the data object, wherein the at least one other array resides in a separate data object, and wherein the data structure further comprises a master database header comprising a plurality of pointers to the plurality of data objects.
22. The computer-readable medium of claim 20, wherein at least one of the parallel arrays is configured to store data entries comprising delta values representing original data values modified by a base value to reduce storage requirements.
23. The computer-readable medium of claim 22, wherein the modification of the original data values comprises subtraction of the base value.
24. The computer-readable medium of claim 22, wherein the modification of the original data values comprises division by the base value.
25. The computer-readable medium of claim 20, wherein one of the data objects stores properties, another of the data objects stores owners, and the owner data object includes extra data bits available for storing bit-codes that identify common properties.
26. The computer-readable medium of claim 20, wherein the plurality of data objects are defined in an object-oriented computer language.
27. The computer-readable medium of claim 20, wherein the at least one other array resides in the same data object as the at least one of the parallel arrays configured to store index values, and wherein at least one of the parallel arrays is configured to store data entries comprising markers for first values in the parallel arrays of the same data object, thereby enabling creation of linked lists within the data object's arrays.
28. The computer-readable medium of claim 27, wherein the at least one other array has an associated set of predefined default assumptions such that only deviations from the set of predefined default assumptions are stored.
29. The computer-readable medium of claim 28, wherein the at least one other array stores bytestream encoded From-To data, and the set of predefined default assumptions comprise: a path starts at default coordinates; a path starts on a layer equal to a default element in a layer array; there are no layer changes unless otherwise indicated; there are no width changes unless otherwise indicated; when a layer change occurs, a default via is presumed unless otherwise indicated; and when a layer change occurs, a default width for the new layer is presumed unless otherwise indicated.
30. The computer-readable medium of claim 20, wherein at least one of the parallel arrays comprises an intelligent integer array such that storage locations defined by the intelligent integer array change their physical dimensions automatically based upon a maximum value entered into the intelligent integer array, the intelligent integer array utilizing a mode data value and a maximum data value for the intelligent integer array.
31. The computer-readable medium of claim 20, wherein the at least one other array comprises a bytestream encoded array residing in the same data object as the at least one of the parallel arrays configured to store index values, and wherein the bytestream encoded array comprises an array of bytes organized such that a single data entry is stored using a minimum number of bytes needed to represent the single data entry, and a portion of each first byte, from a set of one or more bytes holding a single data entry, contains a header code indicating how many bytes make up the set of one or more bytes holding the single data entry.
32. The computer-readable medium of claim 31, wherein the header code is defined to correlate particular header code values to particular base values such that a bytestream encoded data entry comprises a delta value representing a data value modified by the base value identified by the particular header code value.
33. The computer-readable medium of claim 32, wherein the modification of the data value comprises subtraction of the base value.
34. The computer-readable medium of claim 32, wherein the modification of the data value comprises division by the base value.
35. The computer-readable medium of claim 32, wherein the base value comprises a coordinate for a fixed routing grid.
36. The computer-readable medium of claim 32, wherein the base value comprises the data value for a previous bytestream encoded data entry.
37. The computer-readable medium of claim 32, wherein the header code is further defined to correlate particular header code values to particular data values.
38. The computer-readable medium of claim 37, wherein the portion of each first byte consists of eight bits, and header code values above a maximum value represent a layer change to a layer equal to the header code value less the maximum value.
39. A data processing system executing an application program and containing a database used by the application program, the data processing system comprising: a processor for executing the application program; and a memory including a data structure accessible by the application program, the data structure comprising: a plurality of data objects, each data object comprising: an allocation field indicating a number of array entries that have been allocated; a use field indicating a number of array entries that are currently in use for storing data instances; a set of parallel arrays, wherein the set of parallel arrays has an internal relationship such that each parallel array has an equal number of storage locations and a set of elements taken from the parallel arrays at a common array index represents a record for a single data instance in the data object, the set of parallel arrays configured such that performing an operation taken from a set of data object operations does not alter the internal relationship, and wherein at least one of the parallel arrays is configured to store data entries comprising index values referencing at least one other array.
40. The data processing system of claim 39, wherein one of the data objects stores properties, another of the data objects stores owners, and the owner data object includes extra data bits available for storing bit-codes that identify common properties.
41. The data processing system of claim 39, wherein the at least one other array resides in the same data object as the at least one of the parallel arrays configured to store index values, and at least one of the parallel arrays is configured to store data entries comprising markers for first values in the parallel arrays of the same data object, thereby enabling creation of linked lists within the data object's arrays, and wherein the at least one other array has an associated set of predefined default assumptions such that only deviations from the set of predefined default assumptions are stored.
42. The data processing system of claim 39, wherein at least one of the parallel arrays comprises an intelligent integer array such that storage locations defined by the intelligent integer array change their physical dimensions automatically based upon a maximum value entered into the intelligent integer array, the intelligent integer array utilizing a mode data value and a maximum data value for the intelligent integer array.
43. The data processing system of claim 42, wherein the at least one other array comprises a bytestream encoded array residing in the same data object as the at least one of the parallel arrays configured to store index values, and wherein the bytestream encoded array comprises an array of bytes organized such that a single data entry is stored using a minimum number of bytes needed to represent the single data entry, and a portion of each first byte, from a set of one or more bytes holding a single data entry, contains a header code indicating how many bytes make up the set of one or more bytes holding the single data entry.
44. The data processing system of claim 43, wherein the header code is defined to correlate particular header code values to particular data values and to particular base values such that some header codes represent data and other header codes represent a type of data and a modification operation, whereby performing the modification operation on a delta value designated by the header code, using the base value, results in an original value.
45. The data processing system of claim 44, wherein at least one of the parallel arrays is configured to store data entries comprising delta values representing original data values modified by a base value to reduce storage requirements.
46. The data processing system of claim 45, wherein a hierarchical tree structure is used to store cell names and net names.
47. The data processing system of claim 46, further comprising a control knob for selectively controlling memory saving techniques used by the data processing system, the memory saving techniques including use of hierarchical tree structures, use of delta values in place of original data values, and use of three-byte encoding and decoding for intelligent integer arrays.
48. The data processing system of claim 39, wherein the internal relationships for a plurality of the data objects are harmonized with each other, thereby enabling retrieval of multiple related data values from the multiple data objects using only a single cross-referencing index value for each parallel array.
49. In a data processing system executing an application program, wherein the application program accesses a data structure comprising information resident in a database, wherein the data structure resides in a memory of the data processing system and includes a plurality of data objects, each data object comprising a plurality of parallel arrays having an internal relationship such that each parallel array in the data object has an equal number of storage locations and a set of elements taken from the parallel arrays in the data object at a common array index represents a record for a single data item, a method of using the parallel arrays of a data object to store and retrieve data, comprising: creating a new parallel array in the data object when a new data field is needed, the creating step being performed such that the internal relationship of the parallel arrays is not changed; storing data in the data object using a first index number, the storing step automatically increasing the size of the parallel arrays when necessary to store the data, the storing step occurring without changing the internal relationship; retrieving data by first-retrieving a second index value from a first parallel array in the data object using a first index number and second-retrieving the data from an array in another data object using the second index value as an index; saving the data structure to a secondary storage; loading the entire data structure from the secondary storage when the entire data structure is needed; and loading the data structure from the secondary storage incrementally when only a portion of the data structure is needed, wherein the incremental loading step is performed without reading in all records for a skipped data object.
50. The method of claim 49, further comprising retrieving a dynamically associated data value via the data object using the first index number when the dynamically associated data value is needed, the retrieving a dynamically associated data value step comprising: accessing a second data object containing associated data values; creating a temporary lookup table for finding a first associated data value for the data item in the data object, the creating step utilizing a bit field in the second data object that marks first associated data values, and the creating step utilizing a pre-counted number of data lists corresponding to the data stored in the data object; and using the temporary lookup table to retrieve the associated data value, if present.
51. The method of claim 50, wherein the retrieving a dynamically associated data value step further comprises first-checking extra bits in the data object, the extra bits configured to store data representing common instances of the second data object.
BACKGROUND OF THE INVENTION
1) Field of the Invention
The field of the present invention relates to electronic design automation for chip design and, more particularly, to computer-implemented methods, systems and/or products for database construction and management useful for electronic design automation applications.
Chip designers often use electronic design automation (EDA) software tools to speed up the design process and to allow simulation of a chip design prior to prototyping or production. Chip design using EDA software tools generally involves an iterative process whereby the chip design is gradually perfected. In a typical chip design process using EDA software tools, the chip designer builds up a circuit by inputting information at a computer workstation generally having high quality graphics capability so as to display portions of the circuit design as needed. A top-down design methodology is usually employed using hardware description languages (HDLs), such as Verilog® or VHDL, for example, by which the designer creates an integrated circuit by hierarchically defining functional components of the circuit, and then decomposing each component into smaller and smaller components.
Two of the primary types of components used in integrated circuits are datapaths and control logic. Control logic, typically random logic, is used to control the operations of datapaths. Datapath areas of the circuit perform functional operations, such as mathematical or other operations. More particularly, datapaths are typically composed of large numbers of highly regular and structured datapath functions, each such function typically including an arrangement of logic cells. In large integrated circuit designs, the number of logic cells can be enormous, on the order of millions. The logic cells, and the information relating to them, are kept track of using a very large database.
The various components of an integrated circuit are initially defined by their functional operations and relevant inputs and outputs. The designer may also provide basic organizational information about the placement of components in the circuit using floorplanning tools. During these design stages, the designer generally structures the circuit using considerable hierarchical information, and has typically provided substantial regularity in the design.
From the HDL description, the actual logic cell implementation is typically determined by logic synthesis, which converts the functional description of the circuit into a specific circuit implementation. The logic cells are then placed (i.e., given specific coordinate locations in the circuit layout) and routed (i.e., wired or connected together according to the designer's circuit definitions). The placement and routing software routines generally accept as their input a flattened netlist that has been generated by the logic synthesis process. This flattened netlist identifies the specific logic cell instances from a target standard cell library, and describes the specific cell-to-cell connectivity.
Further explanation of a particular chip design process, with particular emphasis on placement and routing of datapaths, is set forth, for example, in U.S. Pat. No. 5,838,583, hereby incorporated by reference as if set forth fully herein.
While EDA software tools provide substantial assistance to chip designers, the demands on such software tools have dramatically increased over the years as the number of components that can be fit on a single microchip has likewise increased. One aspect of such software tools that has particularly been strained is the database that is required to keep track of all of the myriad components and connections used in a chip design. The database is needed to store, among other things, the type and content of each cell (known as instances), its connections to other cells, and its location within the chip layout. Where hundreds of thousands, or millions, of cells are required for a given integrated circuit design, the database for storing such information can become quite large.
One conventional way of implementing a database for storing chip design information is to create a series of individually allocated chunks of memory, referred to as "beads", and to link these beads dynamically with pointers. Such a database design is convenient from the standpoint of ease of varying the number of elements, since adding new elements simply involves adding more records to a linked list (or circular list) of like records. However, pointer list database designs have several drawbacks. A major disadvantage is that pointers take up a great deal of memory space. Since pointers are generally required to reach any point in working memory, for large working memories the pointers need to be relatively long. Currently, pointers are often four bytes (32 bits) in length. Where millions of cells are present in a chip design, cross-referencing information through pointers therefore can consume huge amounts of memory.
An additional disadvantage of pointer list databases is that they can take a long time to read into working memory. This problem arises because when the database is stored on a disk (or other permanent storage medium), the pointer values are not recorded since they will have different values each time the data is read in from the disk. Thus, when the database is read from the disk into working memory space, pointer values need to be assigned. The assignment of pointer values upon being read is known as pointer set-up, and the amount of time needed for carrying out this process can be quite long for large databases. Where numerous design changes are made, which is a frequent occurrence with large, complicated chip designs, the pointer set-up time is endured each time the database is read from disk, which can greatly slow down the design process.
Yet another disadvantage with pointer list databases in EDA applications is that temporary data associated with a particular bead cannot easily be accessed unless space is permanently allocated within the bead. To provide for such temporary data storage and manipulation, a single generic work field, which the programmer is generally allowed to use for any purpose, is often allocated in the bead. Usually, the generic work field is used for a pointer to additional data. To provide many or all records with a generic work field generally requires setting aside a large amount of storage space, much of which may go unused and therefore be wasted.
Since data is accessed only through pointer lists, it is possible to end up with completely isolated chunks of memory that cannot be reached through the pointer lists at all. This situation leads to inefficient use of memory resources.
Another disadvantage of pointer list databases is that their format as stored on the disk is usually different from their in-memory format (i.e., after the database is read into working memory). The data beads are generally read into working memory in bits and pieces, or else they can be read in singly. In the latter case, backward compatibility with older versions of the database becomes a nuisance and inevitably slows down the read process. In addition, substantial care must be taken at the design stage, since it is often difficult to add extra pointers and data fields after the database design has been set.
Another type of conventional database is based on a series of records stored sequentially in an array. However, array-based databases for storing chip design data have not been utilized extensively in electronic design automation applications. One possible reason for the unpopularity of array-based databases in chip design applications is that such databases are commonly viewed as inefficient for handling large aggregates of data. Typically, the same amount of memory space must be allocated in advance for each record, yet the amount of data within each record can vary widely. From a programming standpoint, it can be unwieldy to handle data stored in an array-based database. Furthermore, dynamic size changes to an array-based database structure are difficult. Another drawback is the difficulty of inserting new records anywhere but at the end of the array.
It would therefore be advantageous to provide a database especially suited for chip design applications, which overcomes one or more of the foregoing problems.
SUMMARY OF THE INVENTION
The present invention relates to computer-implemented methods, systems and/or products for providing a database particularly useful for electronic design automation applications.
In one embodiment as described herein, a database for storing chip design information comprises a plurality of parallel arrays. One or more parallel arrays are used for storing a particular class of information. In this context, a class is an abstract data type comprising a collection of data, such as integers or characters, possibly in connection with a set of special functions (member functions) for operating on the collection of data. For many chip designs, there may be a hundred or more classes of information. The union of related entries commencing at a given array index across the one or more parallel arrays of a particular class forms the structure for a given instance within a class. Between classes, individual records in an array may cross-reference, through an array index, records in other arrays. For example, the database may include an array of port records in one class and a separate array of associated net records in another class, wherein each port record refers to the net assigned to it by an index number into the array of net records. A union of entries across classes that are cross-referenced together with array indexes may form a complete structure for a cell instance or other type of system component.
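The parallel-array organization and the index-based cross-referencing described above can be sketched as follows. This is an illustrative model only, not the patented implementation; the field names (port_name, net_name, and so on) are hypothetical.

```python
# Net class: a group of parallel arrays.  The entries at a common index
# together form one net instance.
net_name = ["clk", "rst", "data0"]
net_width = [1, 1, 8]

# Port class: a separate group of parallel arrays.  Instead of a pointer,
# port_net stores an integer index into the net class's arrays.
port_name = ["P1", "P2", "P3"]
port_net = [2, 0, 1]    # P1 is assigned to net "data0", P2 to "clk", etc.

def port_record(i):
    """Assemble the complete record for port i by taking the union of
    entries at index i, then following the cross-class index."""
    n = port_net[i]
    return {"port": port_name[i], "net": net_name[n], "width": net_width[n]}

print(port_record(0))   # → {'port': 'P1', 'net': 'data0', 'width': 8}
```

Because the cross-reference is a small integer index rather than a full machine pointer, it survives a round trip to disk unchanged, which is the basis for the disk-format/in-memory-format equivalence noted below.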
Due to the general absence of pointers, and the use of array indexes in their stead, large savings of memory are possible. Also, the version of the database as stored on disk or in other permanent storage medium can be almost exactly the same as the version stored in working memory (i.e., the "in-memory" version). Another advantage of the array-based database structures described herein is that the record arrays can be written to disk and read from disk with a single operating system call for each array, providing increased speed of performance. Moreover, since each record set is of a known size, it is possible to quickly step over those arrays which are not needed, making read-on-demand capability relatively easy to implement.
Further memory savings and speed advantages may be obtained in other embodiments of the invention, wherein the inherent sequential nature of records stored in the array is used as linking information, thus avoiding the requirement of storing such information in memory. For example, the ports on a cell can be assumed to be in the same order as the port definitions on the corresponding cell master. As a result, a cross-referencing index from the port instance array to the port master array need only be provided to the first entry in each array, and it can be assumed that each subsequent port instance or port definition follows sequentially for the given cell.
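The implicit sequential linking described above can be sketched as follows, again with hypothetical field names. Each cell stores only the index of its first port instance (plus a count); the remaining ports are assumed to follow sequentially, so no per-port link is stored at all.

```python
cell_name = ["inv0", "nand1"]
cell_first_port = [0, 2]    # index of each cell's first port instance
cell_port_count = [2, 3]    # ports for a cell are stored contiguously

# Port instances, in the same order as the port definitions on the
# corresponding cell masters.
port_instance = ["A", "Y", "A", "B", "Y"]

def ports_of(cell):
    """Recover a cell's ports using only the first-entry index and count."""
    first = cell_first_port[cell]
    return port_instance[first:first + cell_port_count[cell]]

print(ports_of(1))   # → ['A', 'B', 'Y']
```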
Another technique for saving memory space is to avoid storing all of the coordinate or spatial information for a given shape, but rather to store only the offset information from the preceding shape. In one embodiment, an assumption is made that the second shape starts at the ending point of the first shape. The system can also provide certain default values or characteristics for information within the array records, with the default values being used unless overridden by an indicator in the array record. Since little or no information needs to be stored where the default values are used, memory storage requirements can be substantially reduced.
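A sketch of the offset-only storage idea, under the assumption stated above that each shape starts where the preceding one ended: only (dx, dy) deltas are stored, and absolute coordinates are reconstructed by accumulation from the start point. The names here are illustrative:

```cpp
#include <cassert>
#include <vector>

struct Point { int x, y; };

// Reconstruct absolute end points from a start point plus stored deltas.
// Each delta is the offset from the preceding shape's ending point, so
// full coordinates never need to be stored per shape.
inline std::vector<Point> reconstruct(Point start,
                                      const std::vector<Point>& deltas) {
    std::vector<Point> pts;
    Point cur = start;
    for (const Point& d : deltas) {
        cur.x += d.x;      // accumulate the running coordinate
        cur.y += d.y;
        pts.push_back(cur);
    }
    return pts;
}
```

Because deltas between adjacent shapes tend to be small, they also compress well under the byte-adaptive encodings described later.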
As another technique for conserving memory in connection with the usage of parallel arrays as described above, the one or more parallel arrays within a given class may adaptively allocate storage space for data entries based on the size of the data to be stored. In a particular embodiment, from one to four bytes of data (or a "null" designator) may be adaptively allocated for storing data entries. The selection of allocation size may be determined by the largest value of the stored entries. The resulting array structure may be referred to as an intelligent integer array. Alternatively, one of the one or more parallel arrays in a given class may be a "bytestream" encoded array, wherein a header code for a given data entry within the bytestream encoded array indicates the number of bytes following the header code for that data entry.
In a particular embodiment as disclosed herein, in which the above parallel array techniques are utilized, a database head comprises a set of pointers to a plurality of memory blocks (i.e., beads), each of which comprises a plurality of records in the form of one or more parallel arrays. Several databases may be stored in working memory at the same time, and a separate database head may be provided for each database. The database heads themselves may be connected to one another through pointers in a linked list.
In a preferred embodiment, each class has a data header with a field indicating the number of indexes allocated for each of the one or more parallel arrays and another field indicating the actual number of indexes used within the one or more parallel arrays. In addition, the data header of a given class preferably includes a pointer indicating the position in memory of a main data header. Because the main data header in turn has pointers to the positions in memory of the other classes, instances in the given class may refer to related instances in the other classes through an integer index number without requiring the use of other pointers.
Further embodiments, variations and enhancements are also described herein or shown in the accompanying drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
In support of the above-mentioned system components, a chip floorplanner
Referring now to
According to one embodiment as described herein, a database useful in electronic design automation (EDA) applications is provided. In this database, integrated circuit data such as netlists, cell instances, and cell names are organized into classes. In a preferred embodiment, the database of the present invention is implemented in the C++ language. Although the object-oriented features of C++ provide useful functions such as information hiding and operator overloading, the database of the present invention may also be implemented in non-object-oriented languages such as C.
Turning now to
The data header
In a preferred embodiment, the same array index number is used across all of the parallel arrays
Once the database structure is determined, additional fields may readily be added by adding another parallel array
In a preferred embodiment, pointers are not necessary to relate one class to another. Instead, an integer index is used to relate an object in one class to an object in another class. In such an embodiment, pointers would be used to identify each individual class as a whole. A preferred version of such an embodiment is illustrated in more detail in the example shown in FIG.
Information stored within a particular class
A specific example is provided in
By relating classes through index numbers (integers) rather than through pointers, the present invention conserves vast amounts of working memory. For example, in conventional 32-bit computers, each pointer is 4 bytes, and in more modern 64-bit computers, each pointer is 8 bytes, doubling the amount of storage required. However, because the database structure described herein preferably uses index numbers rather than pointers to cross-relate data among parallel arrays and classes, the storage requirements for information are independent of the computer's addressable working memory space. As a result, increasing processing power from a 32-bit computer to a 64-bit computer will not double the required storage for the index numbers.
In a preferred implementation in which C++ is used as the programming language, it may be observed that variables of type integer ("int") are stored in 4 bytes, regardless of whether a given integer could be stored in fewer bytes. Although the present invention limits the use of pointers, the memory savings would be less significant if, through the preferred use of the C++ language, every single index number had to be stored in 4 bytes. Thus, two techniques are provided herein for reducing the required memory. In the first technique, the parallel arrays
Turning first to the use of intelligent integer arrays, attention is drawn to FIG.
In one embodiment in which the preferred programming language of C++ is used, the different modes of the adaptive class
The 0 byte mode is particularly useful when all the data is NULL values (i.e., when the database is not using this field). This mode takes up almost no memory. In sharp contrast, an unused field in most prior art databases will always take up memory space. The 1 byte, 2 byte and 3 byte modes use C++ unsigned bytes to maximize the range. Because unsigned bytes are used, special accommodation may need to be made to provide for NULL values. Typically, a value of -1 is used in the database to represent a NULL pointer in the 0 byte mode, whereas [
The data header
It may be observed that in present versions of C++, no variable type is provided that consists of three bytes. Thus, in accordance with one embodiment as disclosed herein, in the 3 byte mode an individual data entry consists of three sequential unsigned characters. Assuming that num_allocated is an arbitrary integer n in the other modes, in the 3 byte mode the actual num_allocated would be 3n. However, because C++ allows the use of overloaded operators, users of the adaptive class
Some examples will illustrate the memory saving benefits of the intelligent integer array. A number of classes are commonly used in a chip design database, and some of these classes, such as Layer, Master Cell, and Master Port, have a very limited number of possible entries, so that they will usually be referred to by other classes with index numbers that have relatively little dynamic range. For example, there are typically well under 254 (the maximum value representable by one byte, assuming provision for a -1 or NULL value) layers in an integrated circuit database, and so other classes can always refer to the Layer class with index numbers that are less than 254, i.e., less than a single byte of information. Thus, the 1 byte mode could be used for an array of indexes referencing the Layer class. Similarly, the 1 byte mode could typically be used for arrays of Cell Master indexes or Port Master indexes. Even for classes with index numbers having a relatively large dynamic range, such as Cell Instance, the index number would still be expected to be under 4 bytes in virtually all contemplated situations, even in the most current deep submicron (DSM) designs. The 3 byte mode is expected to accommodate most any integer required for a class index in a chip design database, providing a substantial memory savings by avoiding 4 byte "int" type storage.
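The 3 byte mode can be sketched as follows. This is a minimal illustration of only that one mode, assuming an illustrative class name; the mode selection, widening logic, NULL handling, and overloaded-operator interface described in the text are omitted. Each logical entry is packed as three sequential unsigned characters, so n entries occupy 3n bytes instead of 4n:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Intelligent-integer-array sketch, fixed in 3 byte mode: each entry is
// three sequential unsigned chars, reassembled into an integer on read.
class Int3Array {
public:
    void push_back(std::uint32_t v) {          // v must fit in 24 bits
        bytes_.push_back(static_cast<unsigned char>(v & 0xFF));
        bytes_.push_back(static_cast<unsigned char>((v >> 8) & 0xFF));
        bytes_.push_back(static_cast<unsigned char>((v >> 16) & 0xFF));
    }
    std::uint32_t get(std::size_t i) const {   // concatenate three bytes
        std::size_t p = 3 * i;
        return static_cast<std::uint32_t>(bytes_[p]) |
               (static_cast<std::uint32_t>(bytes_[p + 1]) << 8) |
               (static_cast<std::uint32_t>(bytes_[p + 2]) << 16);
    }
    std::size_t size() const { return bytes_.size() / 3; }
private:
    std::vector<unsigned char> bytes_;         // 3 bytes per logical entry
};
```

The shifting and masking on every access is the speed cost the "Control Knob" discussion later trades against this mode's memory savings.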
In another embodiment, bytestream encoding is used as an alternative to the use of intelligent integer arrays. Bytestream encoding is particularly useful for data which has a wide dynamic range (i.e., a wide expected value range) and thus might not be as efficiently stored using an intelligent integer array. In a preferred embodiment implementing bytestream encoding, an array of unsigned characters (1 byte) is used to store variable-length data, with additional bytes being allocable depending upon the data value. The first byte in a data entry, and more specifically the first nibble (4 bits), indicates via a header code how many subsequent bytes need to be read to retrieve the integer value. Because fewer than 16 header codes suffice to encode integer values of up to 5 bytes (a huge number even for foreseeable databases), the header code can be stored in the first 4 bits of the start byte. This leaves the top 4 bits of the start byte available for storing data. The types of header codes may, in one embodiment, be as given in the following table:
Because the data is interpreted positionally, the start byte of each integer needs to be known in order to make it accessible.
The memory savings of bytestream encoding may be significant. Consider, for example, a case in which one million integers are to be stored, all having values of less than 255. Using an intelligent integer array, as previously described herein, all one million integers could be stored in the 1 byte mode. If, however, a single entry were added of value 256, the storage space might double because the number of bytes needed to store the data of an intelligent integer array is determined by the maximum value that is stored. However, using bytestream encoding the data is encoded on an individual basis, resulting in one million 1-byte records and a single 2-byte record. Effectively, the "rogue" value did not penalize the storage requirements for the other one million entries.
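The general scheme can be sketched as below. This is an assumed encoding for illustration only: the patent's actual header-code table is not reproduced here. In this sketch the low nibble of the start byte is the header code giving the number of following bytes, and the top nibble carries the most significant 4 bits of the value:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append one value to the bytestream: start byte = (top 4 data bits << 4)
// | header code, where the header code is the count of following bytes.
inline void bs_encode(std::vector<unsigned char>& out, std::uint32_t value) {
    std::uint64_t v = value;                // widen so large shifts are safe
    unsigned k = 0;                         // following bytes needed
    while ((v >> (8 * k)) >= 16) ++k;       // top nibble must hold the rest
    out.push_back(static_cast<unsigned char>(((v >> (8 * k)) << 4) | k));
    for (unsigned i = k; i-- > 0;)          // big-endian payload bytes
        out.push_back(static_cast<unsigned char>((v >> (8 * i)) & 0xFF));
}

// Read the value starting at 'pos'; advances 'pos' past the entry. The
// data is positional, so decoding must begin at an entry's start byte.
inline std::uint32_t bs_decode(const std::vector<unsigned char>& in,
                               std::size_t& pos) {
    unsigned char start = in[pos++];
    unsigned k = start & 0x0F;              // header code: byte count
    std::uint32_t v = start >> 4;           // top 4 data bits
    while (k-- > 0) v = (v << 8) | in[pos++];
    return v;
}
```

Under this sketch a value below 16 costs one byte, a value below 4096 costs two, and a lone large value costs only its own extra bytes, matching the "rogue value" example above.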
Additional memory savings can be achieved by providing additional header codes tailored to applications particular to certain chip design scenarios. For example, because many routing coordinates and other data tend to lie on a fixed routing grid, the stored data may be normalized by the coordinates of the grid, making it easier to fit into a smaller number of bytes, provided that the division can be accomplished without losing information. An additional set of header codes may then be provided, similar to the previously described header codes, which indicate that the encoded value must be multiplied by the coordinates of the routing grid when read out.
In chip design applications, a significant use for bytestream encoding may be in connection with the Shape class. A shape is the most primitive object of routing, and can take the form, for example, of a Rectangle, Path, Line, Point, etc. The number of items associated with a particular shape depends upon what kind of shape it is. A rectangle, for example, requires a bounding box, while a point requires only the X, Y coordinates. Most shapes are associated with a particular net, so that the shapes in the database must be linked together in a linked list so they can be found by their owner net. Because an integrated circuit database may contain millions of shapes for a given circuit design, it is important to keep memory usage low for the Shape class.
In one embodiment, memory space is conserved by bytestream encoding each particular shape. The number of bytes used for this bytestream encoding depends upon the encoded data. Each shape preferably starts with a header code to indicate what type of shape it is, followed by the unique data for that type of shape.
Each parallel array can be an intelligent integer array (although the Bits array, which can store boolean fields, would typically be a simple array of unsigned characters). The Net record itself preferably has an index number pointing to its first shape. The arrangement of
This general design principle of using bytestream encoding and delta values is applicable in many situations. For example, one application of a bytestream encoded class is referred to as a "From-To" class, and has certain advantages for efficient memory usage, particularly with respect to routing information. It is generally very important to store routed connections efficiently for computer aided design (CAD) applications. As integrated circuit designs have become larger and larger, the memory needed to store the physical routing is becoming critically close to current machine limits. The "From-To" bytestream encoding techniques described herein allow for storage of physical routing data in a much compressed form.
A "From-To" is essentially a vectored path of connectivity, which can change layer and width as desired. The principles of "From-To" bytestream encoding may be explained with reference to
Consider the segment of the conduction path in
Storage of such records in variables of type âintâ would, in a language such as C++, require 32 bytes per segment. However, portions of this routing data may be viewed as redundant and need not be stored. For example, the starting coordinates for the next segment on layer METAL
Accordingly, memory space can be conserved by making certain assumptions when storing routing data. In one embodiment, the following assumptions are made when coding and reconstituting the bytestream encoded From-To data:
1) A path starts at coordinates [
2) Each layer starts as layer #
3) There is no layer change unless otherwise indicated.
4) There is no width change unless otherwise indicated.
5) When the layer is changed, assume the default via that connects the two layers unless otherwise indicated.
6) When the layer is changed, assume the default width for the new layer.
The header codes for the bytestream encoded From-To class attempt to store the integer data in as few bytes as possible. The following are a few examples to illustrate the technique:
SEG_INT_HORIZONTAL: The next 4 bytes contain an integer that should be added to the current X coordinate.
SEG_SHORT_NEG_VERTICAL: The next 2 bytes contain an integer that should be subtracted from the current Y coordinate.
SEG_SHORT_WIDTH: The next 2 bytes contain an integer for the new width.
Because the header codes are relative to the current X,Y coordinate, the deltas (changes in X and Y) tend to be small numbers and can typically be encoded into one, two or three bytes, instead of the usual four bytes required for an integer in languages such as C++.
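A decoder for a few of these header codes, including the layer-change convention described below (any code above 127 selects a new layer), might look like the sketch here. The numeric values assigned to the SEG_* codes and the byte order of the payloads are assumptions for illustration; the text defines only the codes' meanings:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed numeric values for illustration only.
enum : unsigned char {
    SEG_INT_HORIZONTAL     = 1,  // next 4 bytes: add to current X
    SEG_SHORT_NEG_VERTICAL = 2,  // next 2 bytes: subtract from current Y
    SEG_SHORT_WIDTH        = 3,  // next 2 bytes: new width
};

// Running decoder state: coordinates, width, and layer carry over from
// segment to segment, which is what makes the deltas small.
struct FromToState { int x = 0, y = 0, width = 0, layer = 0; };

// Apply one header code starting at 'pos'; returns the position of the
// next header code.
inline std::size_t apply_code(FromToState& s,
                              const std::vector<unsigned char>& bs,
                              std::size_t pos) {
    unsigned char code = bs[pos++];
    if (code > 127) {                       // layer-change convention
        s.layer = code - 127;
        return pos;
    }
    auto rd16 = [&](std::size_t p) {        // 2-byte big-endian payload
        return (static_cast<std::uint32_t>(bs[p]) << 8) | bs[p + 1];
    };
    switch (code) {
    case SEG_INT_HORIZONTAL: {
        std::uint32_t d = 0;                // 4-byte big-endian payload
        for (int i = 0; i < 4; ++i) d = (d << 8) | bs[pos++];
        s.x += static_cast<int>(d);
        break;
    }
    case SEG_SHORT_NEG_VERTICAL:
        s.y -= static_cast<int>(rd16(pos));
        pos += 2;
        break;
    case SEG_SHORT_WIDTH:
        s.width = static_cast<int>(rd16(pos));
        pos += 2;
        break;
    }
    return pos;
}
```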
Moreover, in one embodiment, the header code is expanded to include the entire first byte. This allows the use of additional header codes, which are particular to a class. Thus, the additional header codes are specific to particular aspects of the class, such as segment widths, diagonal segments, and vias (including stacked vias) in a From-To class.
The most prolific header code is generally for a layer change. Because a typical database will not require more than 127 defined header codes, a convention may be adopted wherein any header code with a value greater than 127 indicates a layer change. The new layer index number is obtained by subtracting 127 from the value of a header code which has indicated a layer change. Because a typical database in EDA applications will normally have far fewer than 127 layers, this technique will nearly always suffice. If, however, the database requires more than 127 layers, larger header codes may be used to store the layer index.
An example of From-To bytestream encoding in accordance with the above principles is shown in FIG.
In the example shown in
While intelligent integer arrays and bytestream encoding described above represent two powerful features to minimize the required memory, other features may also be provided to further reduce memory storage requirements. For example, every object in the database can have a property assigned to it, permitting the dynamic addition of information to each object. The property may include a Name and a Value. In one embodiment, the Value can be chosen as one of the following:
LIST (like TEXT but used in LISP)
TIME_T (unix time code)
It would generally be advantageous to minimize the amount of memory needed to store properties. For example, many owners (i.e., the objects potentially possessing a property) do not have properties, and it would be very wasteful of memory to have a dedicated "first_property" field for each object, especially in the case of "Instterms" which can run into the millions. Thus, it is preferred to provide a dynamically creatable lookup mechanism which can find the first property of a particular owner quickly. There are at least two techniques for accomplishing this lookup mechanism: first, a parallel lookup array (which may be an intelligent integer array), and second, a hash table.
To implement either of these two lookup techniques, in accordance with a preferred embodiment, during the creation of a property array the first property is marked as being "FIRST" (using a bit field). Also, the number of owner lists is summed during the creation of the property array. For a hash table, the number of owner lists indicates how much memory will be needed to store the hash table. For a parallel array, the number of owners determines the array size, and hence the size of memory needed to store it. Choosing between the two methods depends primarily upon the number of owners possessing a property, with some bias given to the parallel lookup array where memory needs are comparable in view of its quicker speed over the hash table.
In the case in which a parallel array is used as the lookup mechanism, each Property class preferably stores in its parallel arrays (preferably intelligent integer arrays) the following:
OWNER (an index number into some other class)
NEXT (next property in a linked list with same owner)
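Walking such a property chain can be sketched as follows, with illustrative field names and -1 standing in for the end-of-chain marker. The FIRST bit marks chain heads, and NEXT holds the index of the next property with the same owner:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int NIL = -1;  // assumed end-of-chain marker

// Property class held as parallel arrays: entry i of each array is one
// property record.
struct PropertyClass {
    std::vector<int>  owner;   // index into the owning class
    std::vector<int>  next;    // next property of the same owner, or NIL
    std::vector<bool> first;   // FIRST marker bit (chain head)
};

// Collect the property indexes belonging to one owner: scan for its
// marked FIRST property, then follow the NEXT links.
inline std::vector<int> properties_of(const PropertyClass& p, int owner) {
    std::vector<int> out;
    for (std::size_t i = 0; i < p.owner.size(); ++i) {
        if (p.first[i] && p.owner[i] == owner) {
            for (int j = static_cast<int>(i); j != NIL; j = p.next[j])
                out.push_back(j);
            break;
        }
    }
    return out;
}
```

The linear scan shown here is what the lookup array or hash table described above replaces when faster access to each owner's FIRST property is wanted.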
Turning now to
The lookup mechanism (either the hash table or the lookup array) can be built on the fly, especially since the FIRST properties on each linked list are marked, so it is relatively simple to look through the property table for these start properties. The NAME, OWNER, NEXT and DATA fields are all preferably intelligent integer arrays. The BITS array is an array of unsigned characters. Two new tables are also added: classes that store arrays of 4-byte integers and float values.
In a particular embodiment, the method for storing the DATA for each property varies depending upon the type of property:
TEXT, LIST: the DATA array is an index into the STRINGTABLE
INTEGER, BOOLEAN, TIME_T: the DATA array stores the integer value
FLOAT: the DATA is an index into the FLOAT TABLE array of floats
3×INTEGER triplet: the DATA is an index into the INTEGER TABLE; the 3 integers are the next sequential entries in the table.
3×FLOAT triplet: the DATA is an index into the FLOAT TABLE; the 3 floats are the next sequential entries in the table.
Although the Property class of
int MaxBitProperties(); // returns how many bits are available for use
void SetPropBit(int ownerindex, int bitindex); // set a bit TRUE
void UnSetPropBit(int ownerindex, int bitindex); // set a bit FALSE
Boolean TestPropBit(int ownerindex, int bitnum); // test a bit
These functions can be implemented for each class; some classes may not have any available bits, in which case MaxBitProperties() would return zero. In
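One possible implementation of these accessors against a BITS array of unsigned characters is sketched below. The one-byte-per-owner layout and the choice of 8 available bits are assumptions for illustration; a real class would report only the bits left unused by its own boolean fields:

```cpp
#include <cassert>
#include <vector>

// Bit-property accessors over a BITS parallel array (one unsigned char
// per owner in this sketch).
struct BitProps {
    std::vector<unsigned char> bits;           // BITS array, one byte per owner

    int MaxBitProperties() const { return 8; } // bits available per owner
    void SetPropBit(int owner, int bit)   { bits[owner] |=  (1u << bit); }
    void UnSetPropBit(int owner, int bit) { bits[owner] &= ~(1u << bit); }
    bool TestPropBit(int owner, int bit) const {
        return (bits[owner] >> bit) & 1u;
    }
};
```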
The BIT properties are preferably not hard-coded, but rather registered by the user application at startup time. For example, the code could be:
GNS_top.RegisterProp(GNS_INSTANCE, PROPTYPE_TEXT, "dpRtStatus", "routed Special");
These definitions are preferably stored in the property table array
Turning now to
When an instance name is encoded, the hierarchy name tree table
Besides the aforementioned techniques, other advantages in conserving memory space and increasing database speed may be achieved from the fundamental arrangement of each class into one or more parallel arrays. For example, the inherent next and previous information that exists in an array can be used to great advantage. Aside from the ability to step forwards and backwards through the list, the database may associate records of two different arrays by their common, relative ordering. Thus, the ports on a cell, stored in consecutive records in a first array, can be assumed to be in the same order as the port definitions on the corresponding master, which are stored in consecutive records in the same relative positions in a second array.
As another advantage, due to the general lack of reliance on pointers, the disk version of the database can be almost exactly the same as the in-memory version. Moreover, the record arrays can be written and read with a single operating system call for each array, which is extremely fast. Since each record is of known size (as indicated in the data header), it is possible to quickly step over those arrays that are not needed. Thus, the parallel array database structure provides a read-on-demand capability for very little cost.
Another advantage provided by the fundamental arrangement of each class into one or more parallel arrays is that the classes may be incrementally loaded as needed. In a typical prior art database in which objects in one class relate to objects in another class only through the use of pointers, the database ordinarily could not be loaded incrementally, at least not conveniently so. This is because a substantial delay would be experienced in attempting to load a linked list database incrementally, since to skip a class would require that each record be read individually to locate the end of the class, and a class may have millions of members. However, in various database structures in accordance with the preferred embodiment(s) as described herein, such delay is avoided, because it can be readily determined how far to skip in order to reach the end of an array. This is because instances within a given class are related by array index number, not by pointers. Given the fact that the number of bytes in each array member is known, it can be readily determined how many bytes must be skipped to reach the end of an array. Where the database is implemented using the C++ programming language, the fseek() function call, which, being a low-level call, is very fast, may be used to quickly skip over a class. For example, a million gate array can be fully skipped in hundredths of a second.
With incremental loading, a given class is read into memory only when needed. Because of the large size of some classes (e.g., the String class), providing an incremental loading capability allows substantial memory savings for applications that do not need a particular class or classes.
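The skip operation can be sketched as below, with illustrative header fields: because the data header records how many entries are used and each entry's byte width is known, the class's total byte length is computable and can be skipped with a single fseek() call rather than by reading records individually:

```cpp
#include <cassert>
#include <cstdio>

// Illustrative subset of a class's data header, as read from disk.
struct ClassHeader {
    long num_used;        // entries actually used
    long bytes_per_entry; // known fixed width of each entry
    int  num_arrays;      // parallel arrays in this class
};

// Skip past an unneeded class's arrays with one fast low-level call.
inline int skip_class(std::FILE* f, const ClassHeader& h) {
    long total = h.num_used * h.bytes_per_entry * h.num_arrays;
    return std::fseek(f, total, SEEK_CUR);  // 0 on success
}
```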
Another advantage of using the various parallel-array techniques for implementing databases, as described with various preferred embodiments herein, is that the database can be largely machine independent. For example, a Sun workstation stores integers in a different byte order than does an IBM-compatible personal computer (PC). To make a database created by an IBM-compatible PC accessible to a Sun workstation, the Sun workstation must generally first swap the bytes to a different order, to make them correct for the Sun workstation data protocol. This transformation may be carried out by a relatively simple swapping process that can be part of the database software. In one embodiment, the database file starts with a known test integer. If the integer does not match the expected value, it is byte-swapped and compared again. If a match results, the successful byte-swap is noted, and thereafter byte-swapping is automatically applied as each class is read into memory. However, much of the parallel-array database need not be byte-swapped. For example, a 1-byte array, such as the bytestream array, or an intelligent integer array in 1 byte mode, need not be swapped. Similarly, because data in the 3 byte mode of the intelligent integer array is formed from the internally-controlled concatenation of three consecutive bytes, an intelligent integer array in 3 byte mode also need not be swapped. Thus, relatively fast conversion of the parallel-array database can be accomplished between different machines.
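The byte-order probe can be sketched as follows, with a hypothetical magic value (the patent does not specify the test integer). If the first word reads back wrong, it is byte-swapped and compared again, and the result governs how every subsequently loaded class is treated:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kMagic = 0x1A2B3C4Du;  // hypothetical test integer

// Reverse the byte order of a 32-bit value.
inline std::uint32_t byte_swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Returns whether byte-swapping is required for this file; 'ok' reports
// whether the first word matched the magic value in either byte order
// (false means the file is not recognized at all).
inline bool needs_swap(std::uint32_t first_word, bool& ok) {
    if (first_word == kMagic) { ok = true; return false; }    // native order
    if (byte_swap32(first_word) == kMagic) { ok = true; return true; }
    ok = false;
    return false;
}
```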
The parallel-array-based class structure of the preferred embodiment allows databases created by machines of different processor sizes (e.g., 32-bit computers and 64-bit computers) to be used by one another, because the array structure is the same independent of the processor size. In prior art databases relying upon linked lists, however, such interoperability would be unavailable, because either pointers or pointer-offsets unavoidably get written along with the data on disk, and pointers (or pointer-offsets) for a 32-bit computer cannot be utilized, without modification, by a 64-bit computer, and vice versa.
Most of the features of the present invention described above are designed to reduce the amount of memory used for a design at the expense of speed. However, some users will have more RAM and/or disk space than others, and may wish to trade back memory savings for faster access times. Thus, in one embodiment, a "Control Knob" is provided to allow user adjustment of the compromise between speed and memory.
Adjustment of the "Control Knob" turns off and on various memory saving aspects of the present invention. Four main factors are involved. First, the 3-byte encoding and decoding for the intelligent integer arrays is much slower than the 1, 2 and 4-byte encoding because it involves AND'ing and SHIFTING three separate unsigned characters in order to store and retrieve the value. Thus, the "Control Knob" can turn off 3-byte encoding, thereby causing the intelligent integer arrays to go directly from 2-byte encoding to 4-byte encoding.
Second, the hierarchical name tree structure, used for storing cell and net names, drastically reduces the memory needed for those designs that have been flattened, but this memory savings is at the expense of speed. Thus, the "Control Knob" can turn off the use of the hierarchical name tree structures, thereby causing cell and net names to be saved normally. Third, the storing of delta coordinates using the routing grid instead of storing actual coordinates saves memory but slows the storing and retrieval of integers. Thus, the "Control Knob" can turn off the use of delta data values instead of actual data values. Finally, when objects are deleted from the database, there may be some embedded unused data. When the garbage percentage of a particular array becomes intolerable, the database "Garbage Collects" to get rid of it. This involves reconstituting the array in most cases, which may pause the process. If the user is not interested in garbage collecting (except, perhaps, on database writeout), he can turn it off with the "Control Knob."
Utilizing the various embodiments as described herein, a fast, flexible and efficient database is provided. Experimentation has shown that databases constructed and utilized in accordance with the inventive techniques described herein can increase read and write speed by 20 to 100 times. Furthermore, the size of the database will normally be much smaller (e.g., 3 to 6 times smaller) than those relying upon pointer lists.
Although the database of the present invention was described with respect to integrated circuit EDA applications, those of ordinary skill in the art will appreciate that the principles of the database could be readily adapted to other applications, such as printed circuit board design. Thus, it is to be understood that the embodiments described herein are merely illustrative of the many possible specific arrangements that can be devised to represent application of the principles of the invention.
Field of search:
No:1 - DATABASE OR FILE ACCESSING
No:11 - Layout editor (e.g., updating)
No:10 - Distributed or remote access