US20030074537A1 - Method and apparatus for indexing a cache - Google Patents


Info

Publication number
US20030074537A1
US20030074537A1 (Application US10/298,961)
Authority
US
United States
Prior art keywords
bits
string
array
address
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/298,961
Inventor
Roland Pang
Gregory Thornton
Bryon Conley
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US10/298,961
Publication of US20030074537A1
Legal status: Abandoned

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02: Addressing or allocation; Relocation
    • G06F 12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F 12/10: Address translation
    • G06F 12/1027: Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB]
    • G06F 12/1045: Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache
    • G06F 12/1054: Address translation using associative or pseudo-associative address translation means, e.g. translation look-aside buffer [TLB] associated with a data cache, the data cache being concurrently physically addressed
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2212/00: Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F 2212/60: Details of cache memory
    • G06F 2212/608: Details relating to cache mapping
    • G06F 2212/6082: Way prediction in set-associative cache

Definitions

  • bits 15-35 may be used to identify the individual way, if any, which contains the address of the requested data.
  • bits 15-23 are used, for example, to execute a 9-bit compare for way prediction. Specifically, these bits are compared, for example, on the eight ways of the selected set. Alternatively, this 9-bit compare may be performed on all eight sets before bits 12-14 are used to multiplex the eight sets of the identified block down to a selected set.
  • bits 24-35 may be used to confirm whether the predicted way actually contains the requested data. Specifically, these bits may be compared on array A2 or some portion thereof (e.g. a single set) to determine if a predicted way contains the requested data or instead contains a different piece of data having the same 9-bit tag as the requested data. Techniques and algorithms for performing the necessary compares are well known in the art.
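The way-prediction compare described in these passages can be sketched as a short simulation. This is an illustrative model only, not the patented circuit; the function name and data layout are assumptions.

```python
def predict_way(microtags, physical_addr):
    """9-bit compare of address bits 15-23 against the micro-tag stored in
    each of the eight ways of the selected set; returns the first matching
    way (a predicted hit) or None (the data is not in the cache)."""
    mtag = (physical_addr >> 15) & 0x1FF          # bits 15-23
    for way, stored in enumerate(microtags):
        if stored == mtag:
            return way
    return None

# Example: place the micro-tag of an address in way 3 and look it up.
addr = 0x123456
selected_set = [None] * 8
selected_set[3] = (addr >> 15) & 0x1FF
print(predict_way(selected_set, addr))
```

A `None` result corresponds to step 109N in the flow diagrams: no way holds the micro-tag, so the data must be fetched from elsewhere.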
  • FIGS. 5A, 5B and 6 outline a method of searching an embodiment of a cache index according to the present invention.
  • the method begins with step 101 of FIG. 5, with the cache index maintained in an initial configuration organized, for example, into 2 arrays.
  • the illustrated method is described in connection with the exemplary cache index described above.
  • the first array stores, for example, a number of 9-bit strings
  • the second array stores, for example, 12-bit strings.
  • the cache index (aided by, for example, a control block 20) receives bits 7-11. These bits 7-11 are untranslated, for example, from bits 7-11 of the physical address of the data. As noted above, this 5-bit string of bits 7-11 uniquely identifies one of 32 blocks of 8 sets in the 256-set cache index.
  • using these bits, the system (e.g. the cache unit 10) can identify a single block of 8 sets in the first array, as shown in FIG. 3.
  • the system can then read out this block of 8 sets (step 103) into the auxiliary memory structure B1, as shown in FIG. 4. Because bits 7-11 in the virtual address are untranslated from bits 7-11 in the physical address, the system can perform this initial search and retrieval without waiting for the translation of the entire virtual address.
  • while the system performs this preliminary searching and reading out, the remainder of the virtual address may be translated in the background to generate the remainder of the physical address.
  • the system receives the translated bits in two steps, first bits 12-23 and later bits 24-35.
  • the cache index can receive these bits, including bits 12-14 (step 104).
  • the system may perform several sets of steps concurrently.
  • the 3-bit string of bits 12-14 allows the system to multiplex the 8 sets of the read-out block down to a single set (step 105).
  • This single set potentially stores an address tag for the requested data.
  • the system compares, for example, bits 15-23 of the physical address on each of the ways of the single selected set identified in step 105 (step 108). In the present exemplary embodiment, this 9-bit string is compared on all eight ways of the set. If bits 15-23 match any of the 9-bit strings contained in the read-out block (step 109), the system predicts a hit and ships, for example, the identity of a predicted way to the data array and the array A2 (step 110). If bits 15-23 do not match any of the 9-bit strings contained in the memory structure B1, then the data is not in the cache, and the system must look elsewhere for the requested data (step 109N).
  • an individual corresponding index set of the array A2 may be identified using bits 7-14.
  • the system may then read out the corresponding index set from the second array into, for example, the comparator 32 (step 106).
  • address bits 7-14 may likewise be used to identify a corresponding data set in the cache data array 40.
  • This corresponding data set is, for example, the single selected set that might contain the requested data (as opposed to the requested data address tags, which are contained in arrays A1 and A2).
  • the cache may begin a read out of the predicted way of the selected set of the data array (step 110B).
  • the system receives, for example, bits 24-35 of the physical address and compares those bits, for example, on the predicted way of the corresponding index set identified in step 106.
  • the 12-bit physical address string is compared to the 12-bit string in the predicted way of the corresponding index set (step 111). If a match occurs (i.e. if the two 12-bit strings are identical), the predicted hit is confirmed (step 112) and the data being shipped by the cache to the processor may be used. If no match occurs, the data being shipped is not the requested data.
  • the processor can, for example, ignore the shipped data, or the shipped data may otherwise be canceled (step 113).
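The confirmation step in these passages amounts to a 12-bit equality check. A minimal sketch follows; the names and data layout are my assumptions, not the patent's.

```python
def confirm_hit(upper_tags, predicted_way, physical_addr):
    """Compare bits 24-35 of the physical address against the 12-bit upper
    tag stored in the predicted way of the corresponding set of array A2.
    True confirms the predicted hit; False means the shipped data must be
    ignored or canceled (step 113)."""
    return upper_tags[predicted_way] == ((physical_addr >> 24) & 0xFFF)

a2_set = [0] * 8
a2_set[3] = 0x5A                            # pretend way 3 holds upper tag 0x5A
print(confirm_hit(a2_set, 3, 0x5A << 24))   # confirmed hit
print(confirm_hit(a2_set, 3, 0x5B << 24))   # same micro-tag, different data
```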
  • the method and apparatus according to the present invention need not be limited to cache indexes of the structure described above. Rather, the method and apparatus according to the present invention may be used with any cache index in which each virtual address and corresponding physical address share some untranslated bits and in which these common bits do not uniquely identify a set containing the requested data.
  • FIG. 7 shows, for example, a cache index organized into 2^n sets.
  • a string containing at least n bits is required to uniquely identify each of the 2^n sets contained in the cache index shown in FIG. 7.
  • when searching this array for a particular set, if fewer than n bits are common between the virtual address and the physical address, then the virtual address must be translated to obtain the physical address before the particular set can be uniquely identified.
  • the method and apparatus according to an embodiment of the present invention allow a search to begin based upon the partial physical address formed by any bits common to the virtual and physical addresses. If a number k of bits are shared between the virtual address and the physical address, where k is less than n, then the cache index can be subdivided into, for example, 2^k blocks, each block containing 2^(n-k) sets. Each string of k shared bits can then be used to uniquely identify one of the 2^k blocks within the cache index. The 2^(n-k) sets of the identified block may therefore be read out into the auxiliary memory structure B shown in FIG. 7, and these sets may be later multiplexed down to 1 set when the full physical address is available.
  • the “blocks” need not be represented in the actual architecture of the cache index. Instead, it is merely required that the available common bits be utilized to reduce the potential hits (i.e. the potential sets) to a minimum, and that these potential sets be read out to an auxiliary memory structure for later multiplexing down to a single set.
  • in step 201, the cache index array is organized, for example, into 2^k blocks, where k is the number of shared bits between the virtual address and the physical address. Each block contains, for example, 2^(n-k) sets, with 2^n being the total number of sets contained in the cache.
  • upon receiving a request for data that includes the virtual address of the requested data, the system immediately has the string of k bits common to both the virtual address and the physical address (step 202). Using this string, and without waiting for the translation of the full physical address, the system searches the array and determines which block might contain the requested data (step 203). Once that block is identified, the system reads out the 2^(n-k) sets contained within that block to the auxiliary memory structure B (step 204).
  • once the remaining address bits become available, the 2^(n-k) sets can be multiplexed down to 1 set (step 206).
  • Remaining address bits may then, for example, be read (step 207 ) and compared to the address strings contained in the ways of the set to determine whether the requested data is contained within the cache (step 208 ). If the compare registers a hit, the data may be read out (step 210 ). If no hit occurs, the data is not within the cache (step 209 N).
  • the number of sets and number of blocks need not be powers of two (e.g. 2^n where n is an integer). Even if the number of sets or blocks is not a power of two, bits common to both the virtual and physical addresses can be used to eliminate all but a subset of the sets. This subset of sets (e.g. a block) may then be read out and later multiplexed down to one set when other address bits become available.
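The block/set arithmetic of this generalized scheme can be written down directly. This is a sketch under the power-of-two assumption stated above; the function name is mine.

```python
def block_geometry(n: int, k: int):
    """With 2**n total sets and k untranslated address bits (k < n), the
    index subdivides into 2**k blocks of 2**(n - k) sets each."""
    assert 0 < k < n
    return 2 ** k, 2 ** (n - k)

# The exemplary 256-set index with 5 shared bits (bits 7-11):
print(block_geometry(8, 5))   # (32, 8): 32 blocks of 8 sets
```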

Abstract

A method for indexing a cache includes searching a cache index using a partial physical address, the partial physical address including any bits of the virtual address which are untranslated between the virtual address and the physical address. The partial physical address is used to identify a block of the cache index sets that might contain an address of requested data. The identification is performed prior to translation of the virtual address to the physical address. Once identified, the block is read out into an auxiliary memory structure. After the full physical address becomes available, the block is multiplexed down to one set, and a compare is performed on the ways of the set to determine if the requested data is in the cache and, if so, which way the data is in. A device for achieving the method includes a cache index organized into two arrays, each having a number of sets and a number of ways. One of the arrays may be used to store micro-tags for way prediction. In addition, the device includes an auxiliary memory structure for receiving and storing intermediate search results.

Description

    FIELD OF THE INVENTION
  • The present invention relates to address searching within a cache index. In particular, the present invention relates to address searching within a cache index using a partial physical address. [0001]
  • BACKGROUND OF THE INVENTION
  • Data is stored in memory according to a physical address scheme. Software programmers, however, write program code that requires the retrieval of data using a virtual or linear address scheme (referred to herein as the “virtual address”). Therefore, it becomes necessary for a system to translate a virtual address for a piece of data to a physical address before the data can be read from physical memory. [0002]
  • Many conventional searching techniques require that the full physical address be translated from the virtual address of the requested data prior to initiating a search. This significantly slows down the process of actually retrieving the data from memory, especially if the data is stored in high speed cache memory. Since a physical address is necessary before identifying and/or retrieving data from a cache, many conventional cache systems must wait until the translation from the virtual to the physical address is complete. This process can delay a search by a clock cycle or more. [0003]
  • In an attempt to solve this problem, other known cache systems have implemented a technique for searching an index for cache data based upon a partial physical address. This technique is based upon the recognition that a virtual address and a physical address share some address bits in common, so that these bits are available to the system immediately. [0004]
  • Generally, caches and cache indexes are organized into a number of sets, with each set containing one or more entry locations, or ways. In order to begin a partial-address search using the above-mentioned techniques, the available string of bits (i.e. those bits common to the virtual and physical address) must be of sufficient length to uniquely identify the individual set which might contain the requested data. According to known systems, only then may the system read out an individual set whose ways may be later searched to determine the location of requested data. [0005]
  • For example, for some systems, address bits 0-11 for a virtual and physical address are the same. In this example, bits 0-6 are used for addressing data within each entry of the cache, and therefore are not used to index the entries themselves. [0006]
  • For smaller caches, for example 16 kilobyte caches, it is possible to begin searching a cache index using bits 7-11, as many such caches are organized into 32 or fewer sets. This is possible because bits 7-11 can identify 32 individual sets (the five bits forming 2^5, or 32, unique binary numerals). Thus for caches with 32 or fewer sets, if bits 7-11 are available they may be immediately used to uniquely identify the individual set which might contain the requested data. [0007]
  • Larger caches, however, for example 256 kilobyte caches, typically contain more than 32 sets. A 256 kilobyte cache containing, for example, 256 sets requires an 8-bit string (e.g. bits 7-14) to effectively begin a search of the cache index. Known cache systems, therefore, have not initiated a search of a cache index for large caches using bits 0-11, since these systems require bits 12-14 to initiate the search anyway. These systems must wait until the translation process is complete, and the full physical address is available, before initiating a search of the cache index. The problem with this method, however, is that one or more clock cycles are wasted waiting for the translation process to finish prior to beginning the search for cache data. [0008]
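The set-count arithmetic in this background discussion can be checked with a few lines of Python; the helper name is mine, not from the patent.

```python
import math

def index_bits(num_sets: int) -> int:
    """Number of address bits needed to uniquely select one of num_sets sets."""
    return math.ceil(math.log2(num_sets))

# 32 sets need only the 5 untranslated bits 7-11, so the search can start
# immediately; 256 sets need 8 bits (7-14), three of which (12-14) exist
# only after translation.
print(index_bits(32), index_bits(256))   # 5 8
```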
  • SUMMARY OF THE INVENTION
  • The method for searching a cache index includes the steps of receiving a virtual address of a requested data element which has at least one common bit which is untranslated between the virtual address and a physical address and searching the cache index using the at least one common bit, before the virtual address is completely translated, to identify a selected block. The cache index may be organized into blocks, each block containing a plurality of sets. Each of the plurality of sets may contain at least one way, each of the ways containing an address tag. [0009]
  • An embodiment of a device according to the present invention includes a cache index organized into a number of sets and ways, and an auxiliary memory structure for receiving and storing intermediate search results.[0010]
  • BRIEF DESCRIPTION OF THE DRAWING
  • FIG. 1A shows a schematic view of a cache unit including an embodiment of a cache index according to the present invention. [0011]
  • FIG. 1B shows a schematic view of the cache unit of FIG. 1A connected to a processor. [0012]
  • FIG. 1C shows a schematic view of an embodiment of a cache index according to the present invention. [0013]
  • FIG. 2A shows a schematic view of an array of the cache index of FIG. 1C. [0014]
  • FIG. 2B shows a schematic view of a second array of the cache index of FIG. 1C. [0015]
  • FIG. 3 shows a schematic view of an embodiment of an array of a cache index according to the present invention with an identified block of sets. [0016]
  • FIG. 4 shows a schematic view of the array of FIG. 2 with an identified block of sets read out to an auxiliary memory structure. [0017]
  • FIGS. 5A, 5B, and 6 show a flow diagram of a method according to the present invention. [0018]
  • FIG. 7 shows a schematic view of another embodiment of an array of a cache index according to the present invention. [0019]
  • FIGS. 8A and 8B show a flow diagram of a method for searching the data array of FIG. 7.[0020]
  • DETAILED DESCRIPTION
  • FIG. 1A shows a cache unit 10 including an embodiment of a cache index according to the present invention. This system includes a cache index 30 connected to a cache data array 40. The functions of the cache index 30 are controlled by control block 20. The control block 20 may be any type of controlling circuit or processor. As shown in FIG. 1B, the cache unit 10 may be connected, for example, to a processor 50. [0021]
  • FIG. 1C shows an exemplary structure of a cache index 30 according to the present invention. In this embodiment, the cache index 30 includes two arrays A1 and A2. The cache index 30 also includes, for example, an auxiliary memory structure B1 which may store intermediate search results from array A1. Comparators 31 and 32 are also provided, for example, in the cache index 30. These may be used by the control block 20 to perform comparisons between partial address strings of a requested piece of data and partial address strings contained in the arrays A1 and A2. [0022]
  • FIGS. 2A and 2B show this exemplary cache index 30 in greater detail. The cache index 30 stores address tags or simply “tags” for the actual data stored in the cache data array 40. As noted above, the cache index is organized into, for example, two separate arrays A1 and A2. Each of the two arrays is further organized into, for example, 256 sets, each of the sets containing, for example, 8 ways (other particular configurations being possible). [0023]
  • The number of ways in each set generally indicates the number of entries that may be stored in each set. One skilled in the art will understand that, given two caches with the same number of total entries (e.g. 2048), the cache organized into fewer sets and a greater number of ways (e.g. 256 sets and 8 ways) will be more architecturally flexible than one having a greater number of sets and fewer ways (e.g. 512 sets and 4 ways). In particular, the former will be able to store data in more combinations than the latter. However, the expanded options provided by the former arrangement generally lead to longer search times. [0024]
  • As noted above, an embodiment of the present invention utilizes two arrays A1 and A2, each organized into, for example, 256 sets and 8 ways, which are used to store address tags. The separate arrays A1 and A2 store different portions of the address tag. The first array A1 stores, for example, tag bits 15-23 of an address tag. These bits form a so-called “micro-tag” or “μtag” which is used, for example, for way prediction. The second array A2 stores, for example, tag bits 24-35 of an address tag. These bits form a so-called “upper tag” and may be used to confirm a predicted address hit. Thus in an exemplary embodiment, the first array A1 stores a nine bit tag (bits 15-23) in 256 sets and 8 ways, while the second array A2 stores a 12 bit tag (bits 24-35) in 256 sets and 8 ways. In array A1, the 256 sets are organized into, for example, 32 blocks of 8 sets each. It can be understood that the address tag may be any suitable length. In addition, the bits comprising the address tag may be apportioned between the two arrays A1 and A2 in any suitable manner (i.e. the micro-tag and upper tag may each include any suitable number of bits). [0025]
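Under the exemplary bit ranges given here, the split of an address tag between A1 and A2 can be sketched as follows. The mask values are derived from the stated ranges; this is an illustration, not the patented hardware.

```python
def micro_tag(addr: int) -> int:
    """9-bit micro-tag stored in array A1: address bits 15-23."""
    return (addr >> 15) & 0x1FF

def upper_tag(addr: int) -> int:
    """12-bit upper tag stored in array A2: address bits 24-35."""
    return (addr >> 24) & 0xFFF

# Build a 36-bit address from its fields and recover the two tag pieces.
addr = (0xABC << 24) | (0x155 << 15) | (0x33 << 7) | 0x15
print(hex(micro_tag(addr)), hex(upper_tag(addr)))   # 0x155 0xabc
```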
  • The cache index 30 according to the illustrated embodiment of the present invention differs from known cache indexes in that it includes, for example, auxiliary memory structure B1 for receipt and storage of intermediate search results. As described below, this auxiliary memory structure B1 allows searching of the cache index based upon a partial physical address that need not uniquely identify an individual set that might contain the requested data. [0026]
  • In the system employing the illustrated embodiment of the cache index 30 of the present invention, bits 0-11 of each virtual or linear address are, for example, identical to (or otherwise untranslated from) bits 0-11 of each corresponding physical address. Of these bits, bits 0-6 are used, for example, to address the data within each entry of the cache. Thus these bits are not used for addressing the entries themselves within the cache index 30. In contrast, bits 7-11 of the physical address may be used for indexing the entries stored in the cache data array 40. These bits therefore identify, for example, a partial physical address for cache indexing purposes. It can be understood that this partial physical address need not include the exemplary bits 7-11, but that other bit strings may be equally useful in practicing the present invention. [0027]
  • [0028] In the exemplary embodiment, the 5-bit string of bits 7-11 may be used to identify 2^5, or 32, individual memory locations. These 32 individual memory locations correspond, for example, to the 32 blocks of 8 sets each in the array A1. Accordingly, bits 7-11 may be used (by, for example, the control block 20) to identify the particular block in array A1 that might contain the address of a requested piece of data. The control block 20 may then cause the selected block to be read to the auxiliary memory structure B1. Because bits 7-11 are available as soon as the data request is received, the control block 20 may perform these functions without waiting for a translation of the requested address.
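A minimal sketch of this early block selection, assuming the exemplary bit layout (the helper name is illustrative, not part of the disclosure):

```python
def block_index(virt_addr):
    # Bits 7-11 are untranslated between the virtual and physical
    # address, so one of the 32 blocks can be chosen before the
    # address translation completes.
    return (virt_addr >> 7) & 0x1F
```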
  • [0029] Unlike bits 7-11, bits 12-35 of each virtual address are, for example, not identical to bits 12-35 of the corresponding physical address. Rather, bits 12-35 must be translated by the system to generate bits 12-35 of the physical address. This translation may be performed using any known technique and typically requires a full clock cycle or more. In an exemplary embodiment, the translation occurs while bits 7-11 are used to identify the selected block discussed above. The translation may also be performed in parts, so that a portion of the (translated) bits become available before the remaining bits. In an exemplary embodiment, bits 12-23 are available to the cache index 30 first, followed by bits 24-35.
  • [0030] Bits 12-35 of the physical address may perform a variety of functions with respect to addressing data within the cache. In an exemplary embodiment, the 3-bit string of bits 12-14 is used, for example, to index the eight individual sets within each block, so that bits 7-14, as a group, uniquely index each of the 256 sets of the cache index 30. Thus when bits 12-14 are available, they may be used, for example, by the control block 20 to multiplex the eight sets of the identified block down to a single selected set that might contain an address tag of the requested data.
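Under the same exemplary layout, bits 7-14 taken together can be sketched as an 8-bit set index (illustrative only):

```python
def set_index(phys_addr):
    # Bits 7-11 select one of 32 blocks (available early, untranslated);
    # bits 12-14 select one of the 8 sets within that block (available
    # after translation).  Together the 8 bits index all 256 sets.
    return (phys_addr >> 7) & 0xFF
```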
  • [0031] Once the selected set is identified, bits 15-35 may be used to identify the individual way, if any, which contains the address of the requested data. In the exemplary embodiment, bits 15-23 are used, for example, to execute a 9-bit compare for way prediction. Specifically, these bits are compared, for example, on the eight ways of the selected set. Alternatively, this 9-bit compare may be performed on all eight sets before bits 12-14 are used to multiplex the eight sets of the identified block down to a selected set.
  • [0032] If a hit is predicted (i.e. if bits 15-23 of the requested data match a 9-bit tag in the selected set), then bits 24-35 may be used to confirm whether the predicted way actually contains the requested data. Specifically, these bits may be compared on array A2 or some portion thereof (e.g. a single set) to determine if a predicted way contains the requested data or instead contains a different piece of data having the same 9-bit tag as the requested data. Techniques and algorithms for performing the necessary compares are well known in the art.
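The two-stage compare (a 9-bit prediction followed by a 12-bit confirmation) might be sketched as follows, with the set modeled as a list of (micro-tag, upper-tag) pairs; this data layout is an assumption for illustration, not the hardware structure:

```python
def lookup(set_ways, micro, upper):
    # set_ways: one (micro_tag, upper_tag) pair per way of the set.
    # A matching 9-bit micro-tag predicts a hit; the 12-bit upper
    # tag then confirms or cancels the prediction.
    for way, (m, u) in enumerate(set_ways):
        if m == micro:                          # predicted hit
            return way if u == upper else None  # confirm or cancel
    return None                                 # miss: data not cached
```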
  • [0033] FIGS. 5A, 5B and 6 outline a method of searching an embodiment of a cache index according to the present invention. The method begins with step 101 of FIG. 5A, with the cache index maintained in an initial configuration organized, for example, into two arrays. For purposes of clarity, the illustrated method is described in connection with the exemplary cache index described above. Thus the first array stores, for example, a number of 9-bit strings, and the second array stores, for example, 12-bit strings.
  • [0034] Upon receiving a request for data that contains a virtual address for the requested data (step 102), the cache index (aided by, for example, a control block 20) receives bits 7-11. These bits 7-11 are untranslated, for example, from bits 7-11 of the physical address of the data. As noted above, this 5-bit string of bits 7-11 uniquely identifies one of 32 blocks of 8 sets in the 256-set cache index.
  • [0035] Accordingly, upon receiving bits 7-11, the system (e.g. the cache unit 10) can identify a single block of 8 sets in the first array, as shown in FIG. 3. The system can then read out this block of 8 sets (step 103) into the auxiliary memory structure B1, as shown in FIG. 4. Because bits 7-11 in the virtual address are untranslated from bits 7-11 in the physical address, the system can perform this initial search and retrieval without waiting for the translation of the entire virtual address.
  • [0036] While the system performs this preliminary searching and reading out, the remainder of the virtual address may be translated in the background to generate the remainder of the physical address. In an embodiment of the present invention, the system receives the translated bits in two steps, first bits 12-23 and later bits 24-35. Once the translation of bits 12-23 from the virtual address to the physical address is complete, the cache index can receive these bits, including bits 12-14 (step 104).
  • [0037] At this point, the system may perform several sets of steps concurrently. First, the 3-bit string of bits 12-14 allows the system to multiplex the 8 sets of the read-out block down to a single set (step 105). This single set potentially stores an address tag for the requested data. The system then compares, for example, bits 15-23 of the physical address on each of the ways of the single selected set identified in step 105 (step 108). In the present exemplary embodiment, this 9-bit string is compared on all eight ways of the set. If bits 15-23 match any of the 9-bit strings contained in the read-out block (step 109), the system predicts a hit and ships, for example, the identity of a predicted way to the data array and the array A2 (step 110). If bits 15-23 do not match any of the 9-bit strings contained in the memory structure B1, then the data is not in the cache, and the system must look elsewhere for the requested data (step 109N).
  • [0038] Second, concurrent with the above steps, an individual corresponding index set of the array A2 may be identified using bits 7-14. The system may then read out the corresponding index set from the second array into, for example, the comparator 32 (step 106).
  • [0039] Third, concurrent with the above steps, address bits 7-14 may likewise be used to identify a corresponding data set in the cache data array 40. This corresponding data set is, for example, the single selected set that might contain the requested data (as opposed to the requested data address tags, which are contained in arrays A1 and A2).
  • [0040] Assuming a hit has been predicted in step 110, the cache may begin a read out of the predicted way of the selected set of the data array (step 110B). Next, while the data is being shipped, the system receives, for example, bits 24-35 of the physical address and compares those bits, for example, on the predicted way of the corresponding index set identified in step 106. In other words, the 12-bit physical address string is compared to the 12-bit string in the predicted way of the corresponding index set (step 111). If a match occurs (e.g. if the 12-bit string is identical to the 12-bit string in the predicted way), the predicted hit is confirmed (step 112) and the data being shipped by the cache to the processor may be used. If no match occurs, the data being shipped is not the requested data. The processor can, for example, ignore the shipped data, or the shipped data may otherwise be canceled (step 113).
  • [0041] It can be understood that the method and apparatus according to the present invention need not be limited to cache indexes of the structure described above. Rather, the method and apparatus according to the present invention may be used with any cache index in which each virtual address and corresponding physical address share some untranslated bits and in which these common bits do not uniquely identify a set containing the requested data.
  • [0042] FIG. 7 shows, for example, a cache index organized into 2^n sets. One skilled in the art can understand that a string containing at least n bits is required to uniquely identify each of the 2^n sets contained in the cache index as shown in FIG. 5. In searching this array for a particular set, if less than n bits are common between the virtual address and the physical address, then the virtual address must be translated to obtain the physical address before the particular set can be uniquely identified.
  • [0043] The method and apparatus according to an embodiment of the present invention, however, allow a search to begin based upon the partial physical address formed by any bits common to the virtual and physical addresses. If a number k of bits are shared between the virtual address and the physical address, where k is less than n, then the cache index can be subdivided into, for example, 2^k blocks, each block containing 2^(n−k) sets. Each string of k shared bits can then be used to uniquely identify one of the 2^k blocks within the cache index. The 2^(n−k) sets may therefore be read out into an auxiliary memory structure B shown in FIG. 5, and these sets may be later multiplexed down to 1 set when the full physical address is available.
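The block/set arithmetic above can be checked with a small helper (the name and interface are illustrative):

```python
def partition(n, k):
    # A cache index of 2**n sets with k address bits shared between
    # the virtual and physical addresses (k < n) can be divided into
    # 2**k blocks of 2**(n - k) sets each.
    assert 0 < k < n
    return 2 ** k, 2 ** (n - k)  # (number of blocks, sets per block)
```

For the exemplary embodiment (n = 8, k = 5) this yields the 32 blocks of 8 sets described earlier.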
  • [0044] It can be understood that the “blocks” need not be represented in the actual architecture of the cache index. Instead, it is merely required that the available common bits be utilized to reduce the potential hits (i.e. the potential sets) to a minimum, and that these potential sets be read out to an auxiliary memory structure for later multiplexing down to a single set.
  • [0045] A further exemplary method corresponding to the above is outlined in detail in FIG. 8. In step 201, the cache index array is organized, for example, into 2^k blocks, where k is the number of shared bits between the virtual address and the physical address. Each block contains, for example, 2^(n−k) sets, with 2^n being the total number of sets contained in the cache.
  • [0046] Upon receiving a request for data that includes the virtual address of the requested data, the system immediately receives the string of k bits common to both the virtual address and the physical address (step 202). Using this string, and without waiting for the translation of the full physical address, the system searches the array and determines which block might contain the requested data (step 203). Once that block is identified, the system reads out the 2^(n−k) sets contained within that block to an auxiliary memory structure B shown in FIG. 5 (step 204).
  • [0047] Once the full physical address has been translated, the 2^(n−k) sets can be multiplexed down to 1 set (step 206). Remaining address bits may then, for example, be read (step 207) and compared to the address strings contained in the ways of the set to determine whether the requested data is contained within the cache (step 208). If the compare registers a hit, the data may be read out (step 210). If no hit occurs, the data is not within the cache (step 209N).
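Steps 202-208 above can be sketched end to end; the dictionary-based index, the translate callable, and the particular bit positions (7-11 shared, 12-14 set select, 15 and above tag) are illustrative assumptions, not the claimed apparatus:

```python
def search(index, virt_addr, translate):
    # index: block number -> list of 8 sets; each set is a list of tags.
    # translate: returns the full physical address once translation
    # completes (modeled here as a plain function call).
    shared = (virt_addr >> 7) & 0x1F        # step 202: receive shared k bits
    block = index[shared]                   # steps 203-204: read out the block
    phys = translate(virt_addr)             # background translation completes
    chosen = block[(phys >> 12) & 0x7]      # step 206: multiplex down to 1 set
    tag = phys >> 15                        # step 207: remaining address bits
    return tag in chosen                    # step 208: compare on the ways
```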
  • [0048] It can also be understood that the number of sets and the number of blocks need not be powers of two (e.g. 2^n where n is an integer). Even if the number of sets or blocks is not a power of two, bits common to both the virtual and physical addresses can be used to eliminate all but a subset of the sets. This subset of sets (e.g. a block) may then be read out and later multiplexed down to one set when other address bits become available.
  • [0049] The cache index and method according to the present invention have been described with respect to several exemplary embodiments. It can be understood, however, that there are many other variations of the above described embodiments which will be apparent to those skilled in the art. It is understood that these modifications are within the teaching of the present invention, which is to be limited only by the claims appended hereto.

Claims (22)

What is claimed is:
1. A cache index, comprising:
a first array for storing a plurality of first partial address tags, the first array organized into a plurality of first-array sets, the plurality of first-array sets organized into a plurality of blocks, each of the plurality of blocks containing a subset of the plurality of first-array sets, each of the plurality of first-array sets containing a plurality of ways, each of the plurality of ways of the first-array sets containing one of the plurality of first partial address tags; and
an auxiliary memory structure.
2. The cache index according to claim 1, further comprising a control block, the control block receiving a first string of bits, the first string of bits being untranslated between a virtual address and a physical address, the control block using the first string of bits to identify a selected one of the plurality of blocks, the selected one of the plurality of blocks being read into the auxiliary memory structure;
the control block receiving a second string of bits and using the second string of bits to multiplex the subset of first-array sets contained in the selected one of the plurality of blocks down to a selected first-array set; and
the control block receiving a third string of bits and comparing the third string of bits on each of the plurality of ways contained in the selected first-array set, a hit being predicted when the third string of bits matches one of the first partial address tags contained in the selected first-array set.
4. The cache index according to claim 2, wherein the first string of bits is a 5-bit string.
5. The cache index according to claim 4, wherein the first string of bits includes bits 7-11 of the virtual address.
6. The cache index according to claim 2, further comprising a second array for storing a plurality of second partial address tags, the second array organized into a plurality of second-array sets, each of the second-array sets containing a plurality of ways, each of the plurality of ways of the second-array sets containing one of the plurality of second partial address tags;
the control block identifying a corresponding second-array set using the first and second strings of bits; and
the control block receiving a fourth string of bits and comparing the fourth string of bits on each of the ways contained in the corresponding second-array set, a hit being confirmed when the fourth string of bits matches one of the second partial address tags contained in the corresponding second-array set.
7. The cache index according to claim 6, wherein the first string of bits is a 5-bit string.
8. The cache index according to claim 7, wherein the first string of bits includes bits 7-11 of the virtual address.
9. The cache index according to claim 8, wherein each of the plurality of first partial address tags is a 9-bit string and each of the plurality of second partial address tags is a 12-bit string.
10. The cache index according to claim 8, wherein the first string of bits includes bits 7-11 of the virtual address and the second string of bits includes bits 12-14 of the physical address.
11. A method for searching a cache index, the cache index organized into blocks each containing a plurality of sets, each of the plurality of sets containing at least one way, each of the ways containing an address tag, comprising the steps of:
a. receiving a virtual address of a requested data element which has at least one common bit, the common bit being untranslated between the virtual address and a physical address; and
b. searching the cache index using the at least one common bit, before the virtual address is completely translated, to identify a selected block.
12. The method according to claim 11, further comprising the steps of:
c. reading out the selected block into an auxiliary data structure;
d. receiving a first string of translated bits and a second string of translated bits;
e. multiplexing the selected block using the first string of translated bits to identify a single selected set that might contain the requested address tag;
f. comparing the second string of translated bits on the at least one way contained within the single selected set to determine if the requested data is in the cache; and
g. shipping a predicted way to the cache when the second string of translated bits matches one of the at least one address tags contained in the single selected set.
13. The method according to claim 12, further comprising the following step, which is performed concurrent with steps (b) and (c) and prior to step (d):
h. translating a remainder of the virtual address to generate a remainder of the physical address, the remainder of the physical address including the first string of translated bits and the second string of translated bits.
14. The method according to claim 13, wherein the at least one common bit includes bits 7-11 of the virtual address.
15. A method for searching a cache index connected to a cache, the cache index comprising a first array and a second array, the first array organized into a plurality of first-array sets and the second array organized into a plurality of second-array sets, each of the first-array sets and the second-array sets containing at least one way, the ways of the first array containing micro address tags and the ways of the second array containing confirmation address tags, comprising the steps of:
a. receiving a virtual address of a requested data element which includes at least one common bit untranslated from a bit in a physical address;
b. searching on the first array using the at least one common bit, before the virtual address is completely translated, to identify a selected block of the plurality of first-array sets; and
c. reading out the selected block into an auxiliary data structure.
16. The method according to claim 15, further comprising the steps of:
d. receiving a first string of translated bits and a second string of translated bits;
e. multiplexing the selected block using the first string of translated bits to identify a single potential first-array set;
f. comparing the second string of translated bits on the at least one way contained within the single potential first-array set, a hit being predicted when the second string of bits matches one of the micro address tags contained within the single selected set; and
g. shipping an identity of a predicted way to the second array and the cache when the second string of bits matches one of the micro address tags contained within the single selected set, the predicted way being located in the cache.
17. The method according to claim 16, further comprising the following step, which is performed while steps (e), (f), and (g) are performed:
h. identifying and reading out from the second array a corresponding second-array set.
18. The method according to claim 17, further comprising the following step, which is performed while steps (e), (f), and (g) are performed:
i. identifying and reading out a corresponding data set from the cache, the corresponding data set containing the predicted way.
19. The method according to claim 18, further comprising the following step:
j. reading out predicted data from the cache, the predicted data being contained in the predicted way.
20. The method according to claim 19, further comprising the following steps, which are performed concurrent with step (j):
k. receiving a third string of translated bits; and
l. comparing the third string of translated bits on the at least one way of the corresponding second-array set, a hit being confirmed when the third string of translated bits matches one of the confirmation address tags contained within the corresponding second array set.
21. The method according to claim 20, further comprising the following step, which is performed concurrent with steps (b) and (c) and prior to step (d):
m. translating a remainder of the virtual address to generate a remainder of the physical address, the remainder of the physical address including the first string of translated bits, the second string of translated bits, and the third string of translated bits.
22. The method according to claim 20, wherein the at least one common bit includes bits 7-11 of the virtual address.
23. The method according to claim 22, wherein the first string of translated bits includes bits 12-14 of the physical address, the second string of translated bits includes bits 15-23 of the physical address, and the third string of translated bits includes bits 24-35 of the physical address.
US10/298,961 1997-12-31 2002-11-19 Method and apparatus for indexing a cache Abandoned US20030074537A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US10/298,961 US20030074537A1 (en) 1997-12-31 2002-11-19 Method and apparatus for indexing a cache

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US09/002,163 US6516386B1 (en) 1997-12-31 1997-12-31 Method and apparatus for indexing a cache
US10/298,961 US20030074537A1 (en) 1997-12-31 2002-11-19 Method and apparatus for indexing a cache

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US09/002,163 Division US6516386B1 (en) 1997-12-31 1997-12-31 Method and apparatus for indexing a cache

Publications (1)

Publication Number Publication Date
US20030074537A1 true US20030074537A1 (en) 2003-04-17

Family

ID=21699498

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/002,163 Expired - Fee Related US6516386B1 (en) 1997-12-31 1997-12-31 Method and apparatus for indexing a cache
US10/298,961 Abandoned US20030074537A1 (en) 1997-12-31 2002-11-19 Method and apparatus for indexing a cache

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/002,163 Expired - Fee Related US6516386B1 (en) 1997-12-31 1997-12-31 Method and apparatus for indexing a cache

Country Status (1)

Country Link
US (2) US6516386B1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040181626A1 (en) * 2003-03-13 2004-09-16 Pickett James K. Partial linearly tagged cache memory system
US7467131B1 (en) * 2003-09-30 2008-12-16 Google Inc. Method and system for query data caching and optimization in a search engine system
US20120066475A1 (en) * 2002-09-13 2012-03-15 Nytell Software LLC Translation lookaside buffer
US20120297110A1 (en) * 2011-05-18 2012-11-22 University Of North Texas Method and apparatus for improving computer cache performance and for protecting memory systems against some side channel attacks
US20150293853A1 (en) * 2006-09-29 2015-10-15 Arm Finance Overseas Limited Data cache virtual hint way prediction, and applications thereof
US9946547B2 (en) 2006-09-29 2018-04-17 Arm Finance Overseas Limited Load/store unit for a processor, and applications thereof
WO2018231408A1 (en) * 2017-06-15 2018-12-20 Rambus Inc. Hybrid memory module
US10497044B2 (en) 2015-10-19 2019-12-03 Demandware Inc. Scalable systems and methods for generating and serving recommendations

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6516386B1 (en) * 1997-12-31 2003-02-04 Intel Corporation Method and apparatus for indexing a cache
US6760814B2 (en) 2001-12-17 2004-07-06 Lsi Logic Corporation Methods and apparatus for loading CRC values into a CRC cache in a storage controller
US6772289B1 (en) * 2001-12-17 2004-08-03 Lsi Logic Corporation Methods and apparatus for managing cached CRC values in a storage controller
US6901551B1 (en) 2001-12-17 2005-05-31 Lsi Logic Corporation Method and apparatus for protection of data utilizing CRC
US7996619B2 (en) * 2004-04-22 2011-08-09 Intel Corporation K-way direct mapped cache
US8364897B2 (en) * 2004-09-29 2013-01-29 Intel Corporation Cache organization with an adjustable number of ways
DE602005023273D1 (en) * 2005-04-29 2010-10-14 St Microelectronics Srl An improved cache system
US10210087B1 (en) 2015-03-31 2019-02-19 EMC IP Holding Company LLC Reducing index operations in a cache
US10922228B1 (en) 2015-03-31 2021-02-16 EMC IP Holding Company LLC Multiple location index
US10169246B2 (en) * 2017-05-11 2019-01-01 Qualcomm Incorporated Reducing metadata size in compressed memory systems of processor-based systems
US10747679B1 (en) * 2017-12-11 2020-08-18 Amazon Technologies, Inc. Indexing a memory region

Citations (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5014195A (en) * 1990-05-10 1991-05-07 Digital Equipment Corporation, Inc. Configurable set associative cache with decoded data element enable lines
US5060137A (en) * 1985-06-28 1991-10-22 Hewlett-Packard Company Explicit instructions for control of translation lookaside buffers
US5067078A (en) * 1989-04-17 1991-11-19 Motorola, Inc. Cache which provides status information
US5193166A (en) * 1989-04-21 1993-03-09 Bell-Northern Research Ltd. Cache-memory architecture comprising a single address tag for each cache memory
US5265220A (en) * 1986-12-23 1993-11-23 Nec Corporation Address control device for effectively controlling an address storing operation even when a request is subsequently cancelled
US5307477A (en) * 1989-12-01 1994-04-26 Mips Computer Systems, Inc. Two-level cache memory system
US5386527A (en) * 1991-12-27 1995-01-31 Texas Instruments Incorporated Method and system for high-speed virtual-to-physical address translation and cache tag matching
US5412787A (en) * 1990-11-21 1995-05-02 Hewlett-Packard Company Two-level TLB having the second level TLB implemented in cache tag RAMs
US5606683A (en) * 1994-01-28 1997-02-25 Quantum Effect Design, Inc. Structure and method for virtual-to-physical address translation in a translation lookaside buffer
US5636363A (en) * 1991-06-14 1997-06-03 Integrated Device Technology, Inc. Hardware control structure and method for off-chip monitoring entries of an on-chip cache
US5640339A (en) * 1993-05-11 1997-06-17 International Business Machines Corporation Cache memory including master and local word lines coupled to memory cells
US5682515A (en) * 1993-01-25 1997-10-28 Benchmarq Microelectronics, Inc. Low power set associative cache memory with status inhibit of cache data output
US5696925A (en) * 1992-02-25 1997-12-09 Hyundai Electronics Industries, Co., Ltd. Memory management unit with address translation function
US5732242A (en) * 1995-03-24 1998-03-24 Silicon Graphics, Inc. Consistently specifying way destinations through prefetching hints
US5740416A (en) * 1994-10-18 1998-04-14 Cyrix Corporation Branch processing unit with a far target cache accessed by indirection from the target cache
US5835963A (en) * 1994-09-09 1998-11-10 Hitachi, Ltd. Processor with an addressable address translation buffer operative in associative and non-associative modes
US5854943A (en) * 1996-08-07 1998-12-29 Hewlett-Packard Company Speed efficient cache output selector circuitry based on tag compare and data organization
US5860104A (en) * 1995-08-31 1999-01-12 Advanced Micro Devices, Inc. Data cache which speculatively updates a predicted data cache storage location with store data and subsequently corrects mispredicted updates
US5953747A (en) * 1994-03-30 1999-09-14 Digital Equipment Corporation Apparatus and method for serialized set prediction
US5956752A (en) * 1996-12-16 1999-09-21 Intel Corporation Method and apparatus for accessing a cache using index prediction
US5956746A (en) * 1997-08-13 1999-09-21 Intel Corporation Computer system having tag information in a processor and cache memory
US6078995A (en) * 1996-12-26 2000-06-20 Micro Magic, Inc. Methods and apparatus for true least recently used (LRU) bit encoding for multi-way associative caches
US6092172A (en) * 1996-10-16 2000-07-18 Hitachi, Ltd. Data processor and data processing system having two translation lookaside buffers
US6145054A (en) * 1998-01-21 2000-11-07 Sun Microsystems, Inc. Apparatus and method for handling multiple mergeable misses in a non-blocking cache
US6256709B1 (en) * 1997-06-26 2001-07-03 Sun Microsystems, Inc. Method for storing data in two-way set associative odd and even banks of a cache memory
US6516386B1 (en) * 1997-12-31 2003-02-04 Intel Corporation Method and apparatus for indexing a cache
US6651144B1 (en) * 1998-06-18 2003-11-18 Hewlett-Packard Development Company, L.P. Method and apparatus for developing multiprocessor cache control protocols using an external acknowledgement signal to set a cache to a dirty state

Patent Citations (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5060137A (en) * 1985-06-28 1991-10-22 Hewlett-Packard Company Explicit instructions for control of translation lookaside buffers
US5265220A (en) * 1986-12-23 1993-11-23 Nec Corporation Address control device for effectively controlling an address storing operation even when a request is subsequently cancelled
US5067078A (en) * 1989-04-17 1991-11-19 Motorola, Inc. Cache which provides status information
US5193166A (en) * 1989-04-21 1993-03-09 Bell-Northern Research Ltd. Cache-memory architecture comprising a single address tag for each cache memory
US5542062A (en) * 1989-12-01 1996-07-30 Silicon Graphics, Inc. Cache memory system employing virtual address primary instruction and data caches and physical address secondary cache
US5307477A (en) * 1989-12-01 1994-04-26 Mips Computer Systems, Inc. Two-level cache memory system
US5699551A (en) * 1989-12-01 1997-12-16 Silicon Graphics, Inc. Software invalidation in a multiple level, multiple cache system
US5014195A (en) * 1990-05-10 1991-05-07 Digital Equipment Corporation, Inc. Configurable set associative cache with decoded data element enable lines
US5412787A (en) * 1990-11-21 1995-05-02 Hewlett-Packard Company Two-level TLB having the second level TLB implemented in cache tag RAMs
US5636363A (en) * 1991-06-14 1997-06-03 Integrated Device Technology, Inc. Hardware control structure and method for off-chip monitoring entries of an on-chip cache
US5386527A (en) * 1991-12-27 1995-01-31 Texas Instruments Incorporated Method and system for high-speed virtual-to-physical address translation and cache tag matching
US5696925A (en) * 1992-02-25 1997-12-09 Hyundai Electronics Industries, Co., Ltd. Memory management unit with address translation function
US5682515A (en) * 1993-01-25 1997-10-28 Benchmarq Microelectronics, Inc. Low power set associative cache memory with status inhibit of cache data output
US5640339A (en) * 1993-05-11 1997-06-17 International Business Machines Corporation Cache memory including master and local word lines coupled to memory cells
US5727180A (en) * 1993-05-11 1998-03-10 International Business Machines Corporation Memory including master and local word lines coupled to memory cells storing access information
US5606683A (en) * 1994-01-28 1997-02-25 Quantum Effect Design, Inc. Structure and method for virtual-to-physical address translation in a translation lookaside buffer
US5953747A (en) * 1994-03-30 1999-09-14 Digital Equipment Corporation Apparatus and method for serialized set prediction
US5835963A (en) * 1994-09-09 1998-11-10 Hitachi, Ltd. Processor with an addressable address translation buffer operative in associative and non-associative modes
US5740416A (en) * 1994-10-18 1998-04-14 Cyrix Corporation Branch processing unit with a far target cache accessed by indirection from the target cache
US5732242A (en) * 1995-03-24 1998-03-24 Silicon Graphics, Inc. Consistently specifying way destinations through prefetching hints
US5860104A (en) * 1995-08-31 1999-01-12 Advanced Micro Devices, Inc. Data cache which speculatively updates a predicted data cache storage location with store data and subsequently corrects mispredicted updates
US5854943A (en) * 1996-08-07 1998-12-29 Hewlett-Packard Company Speed efficient cache output selector circuitry based on tag compare and data organization
US6092172A (en) * 1996-10-16 2000-07-18 Hitachi, Ltd. Data processor and data processing system having two translation lookaside buffers
US5956752A (en) * 1996-12-16 1999-09-21 Intel Corporation Method and apparatus for accessing a cache using index prediction
US6078995A (en) * 1996-12-26 2000-06-20 Micro Magic, Inc. Methods and apparatus for true least recently used (LRU) bit encoding for multi-way associative caches
US6256709B1 (en) * 1997-06-26 2001-07-03 Sun Microsystems, Inc. Method for storing data in two-way set associative odd and even banks of a cache memory
US5956746A (en) * 1997-08-13 1999-09-21 Intel Corporation Computer system having tag information in a processor and cache memory
US6516386B1 (en) * 1997-12-31 2003-02-04 Intel Corporation Method and apparatus for indexing a cache
US6145054A (en) * 1998-01-21 2000-11-07 Sun Microsystems, Inc. Apparatus and method for handling multiple mergeable misses in a non-blocking cache
US6651144B1 (en) * 1998-06-18 2003-11-18 Hewlett-Packard Development Company, L.P. Method and apparatus for developing multiprocessor cache control protocols using an external acknowledgement signal to set a cache to a dirty state

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8607026B2 (en) * 2002-09-13 2013-12-10 Nytell Software LLC Translation lookaside buffer
US20120066475A1 (en) * 2002-09-13 2012-03-15 Nytell Software LLC Translation lookaside buffer
US20040181626A1 (en) * 2003-03-13 2004-09-16 Pickett James K. Partial linearly tagged cache memory system
US7467131B1 (en) * 2003-09-30 2008-12-16 Google Inc. Method and system for query data caching and optimization in a search engine system
US20170192894A1 (en) * 2006-09-29 2017-07-06 Arm Finance Overseas Limited Data cache virtual hint way prediction, and applications thereof
US10268481B2 (en) 2006-09-29 2019-04-23 Arm Finance Overseas Limited Load/store unit for a processor, and applications thereof
US10768939B2 (en) 2006-09-29 2020-09-08 Arm Finance Overseas Limited Load/store unit for a processor, and applications thereof
US9632939B2 (en) * 2006-09-29 2017-04-25 Arm Finance Overseas Limited Data cache virtual hint way prediction, and applications thereof
US20150293853A1 (en) * 2006-09-29 2015-10-15 Arm Finance Overseas Limited Data cache virtual hint way prediction, and applications thereof
US9946547B2 (en) 2006-09-29 2018-04-17 Arm Finance Overseas Limited Load/store unit for a processor, and applications thereof
US10430340B2 (en) * 2006-09-29 2019-10-01 Arm Finance Overseas Limited Data cache virtual hint way prediction, and applications thereof
US20120297110A1 (en) * 2011-05-18 2012-11-22 University Of North Texas Method and apparatus for improving computer cache performance and for protecting memory systems against some side channel attacks
US9396135B2 (en) * 2011-05-18 2016-07-19 University Of North Texas Method and apparatus for improving computer cache performance and for protecting memory systems against some side channel attacks
US10497044B2 (en) 2015-10-19 2019-12-03 Demandware Inc. Scalable systems and methods for generating and serving recommendations
US11164235B2 (en) 2015-10-19 2021-11-02 Salesforce.Com, Inc. Scalable systems and methods for generating and serving recommendations
WO2018231408A1 (en) * 2017-06-15 2018-12-20 Rambus Inc. Hybrid memory module
CN110537172A (en) * 2017-06-15 2019-12-03 Rambus Inc. Hybrid memory module
US11080185B2 (en) 2017-06-15 2021-08-03 Rambus Inc. Hybrid memory module
US11573897B2 (en) 2017-06-15 2023-02-07 Rambus Inc. Hybrid memory module

Also Published As

Publication number Publication date
US6516386B1 (en) 2003-02-04

Similar Documents

Publication Publication Date Title
US6516386B1 (en) Method and apparatus for indexing a cache
US5148538A (en) Translation look ahead based cache access
US4587610A (en) Address translation systems for high speed computer memories
KR100423276B1 (en) Instruction fetch method and apparatus
US4914582A (en) Cache tag lookaside
JPH11203199A (en) Cache memory
JP2003509733A5 (en)
US4322815A (en) Hierarchical data storage system
US4821171A (en) System of selective purging of address translation in computer memories
US5241638A (en) Dual cache memory
JPH0778735B2 (en) Cache device and instruction read device
KR20040033029A (en) Method and apparatus for decoupling tag and data accesses in a cache memory
EP0404126B1 (en) Cache memory simultaneously conducting update for mishit and decision on mishit of next address
JPH0668736B2 (en) Apparatus and method for providing a cache memory unit with a write operation utilizing two system clock cycles
US6289438B1 (en) Microprocessor cache redundancy scheme using store buffer
US6009504A (en) Apparatus and method for storing data associated with multiple addresses in a storage element using a base address and a mask
US5077826A (en) Cache performance in an information handling system employing page searching
EP0581425A1 (en) Rapid data retrieval from data storage structures using prior access predictive annotations
US7543113B2 (en) Cache memory system and method capable of adaptively accommodating various memory line sizes
US5136702A (en) Buffer storage control method and apparatus
US7047400B2 (en) Single array banked branch target buffer
EP0206050A2 (en) Virtually addressed cache memory with physical tags
US7873780B2 (en) Searching a content addressable memory
US5704056A (en) Cache-data transfer system
US4733367A (en) Swap control apparatus for hierarchical memory system

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION