US5990910A - Method and apparatus for co-processing multi-formatted data - Google Patents

Publication number
US5990910A
Authority
US
United States
Prior art keywords
data
processor
memory
format
data elements
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
US09/047,193
Inventor
Indra Laksono
Anthony Asaro
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ATI Technologies ULC
Original Assignee
ATI Technologies ULC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ATI Technologies ULC
Priority to US09/047,193
Assigned to ATI TECHNOLOGIES, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: ASARO, ANTHONY, LAKSONO, INDRA
Priority to US09/088,190 (US6195105B1)
Application granted
Publication of US5990910A
Assigned to ATI TECHNOLOGIES ULC. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: ATI TECHNOLOGIES INC.
Anticipated expiration
Expired - Lifetime

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G06F9/30025 Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3877 Concurrent instruction execution, e.g. pipeline, look ahead using a slave processor, e.g. coprocessor

Definitions

  • the software driver 13 includes a ring buffer algorithm such that the data elements received from the host processor are stored in memory 14 in a ring buffer fashion.
  • the software driver provides the indicators to the co-processor.
  • the software driver module 13 may include programming instructions to interpret the data elements as to their particular data format and make the conversion if necessary. While the software driver may perform this function, it is preferable to have the co-processor make such determinations since the host processor 12 executes the programming instructions of the software driver 13.
  • the co-processor retrieves data elements from memory 14 and stores them in a first buffer 30.
  • the first buffer 30 may function as a command FIFO such that the co-processor 16 queues incoming commands and data elements.
  • the data elements stored in the first buffer 30 are provided to the programmable parsing module 32, which includes a plurality of parsing modules 34.
  • Each of the parsing modules is operable to convert the data format of one of the plurality of data formats to the data format of the co-processor.
  • one parsing module may perform the function of converting 3-D graphics data into 2-D graphics data.
  • another parsing module 34 may include programming instructions to convert a digitally encoded signal into the 2-D video graphics.
  • the converted data elements are provided to the second buffer 36 and are eventually pulled into the processing module 38, which includes a plurality of co-processor execution modules 40.
  • Each of the co-processor execution modules 40 performs a particular function specific to the co-processor 16.
  • the co-processor execution modules 40 may include a setup engine, an edgewalker circuit, a texel blending module, etc.
  • FIG. 3 illustrates a logic diagram of a method for co-processing multi-formatted data.
  • the process begins at step 50 where a data block is written into memory.
  • the data block is written into memory in a substantially continuous manner as instructed by a host processor.
  • the data block includes data elements that have one of a plurality of data formats. If the co-processor were a video graphics co-processor, the data blocks would include graphical data of an image and a plurality of commands for rendering the image.
  • the data block may be for an image, a sub-frame of data, a frame of data, or a frame grouping of data, where a frame of data is representative of one screen, or window.
  • the data formats of the data elements may be based on two-dimensional video graphics, three-dimensional video graphics and/or digitally encoded video graphics, such as DVD, MPEG 1 and 2, etc.
  • the host processor may provide the data block to a software driver.
  • the software driver would route the data elements to memory such that the data elements are stored in a ring buffer manner. Regardless of how the data blocks are provided to memory, they are stored in a ring buffer manner.
  • At step 52, a selected data element of the data block is retrieved.
  • the retrieval of the data block may be done by a co-processor in response to receiving one of a plurality of indicators.
  • the indicator may indicate a single data element or a group of data elements.
  • At step 54, the co-processor interprets the selected data element to identify a data format.
  • At step 56, a determination is made as to whether the data format equals a first data format, where the first data format is that in which the co-processor functions. If so, the process proceeds to step 58 where the co-processor processes the selected data element based on commands.
  • If not, the process proceeds to step 60 where the data format of the selected data element is converted into the first data format. Having made the conversion, the process proceeds to step 62 where the converted data elements are processed by the co-processor.
  • FIG. 4 illustrates a logic diagram of an alternate method for co-processing multi-formatted data.
  • the process begins at step 70 where a host processor provides a data block to a software driver.
  • the data block includes a plurality of data elements that are formatted in one of a plurality of data formats.
  • the process then proceeds to step 72 where the software driver interprets the data elements to identify a particular data format.
  • the process then proceeds to step 74 where the software driver determines whether the data format of the data elements matches the data format of a co-processor. If so, the process proceeds to step 78. If not, the process proceeds to step 76 where the format of the data elements is converted into a format consistent with that of a co-processor.
  • At step 78, the data elements, or the converted data elements, are stored in memory.
  • At step 80, the data elements that are stored in memory are retrieved by a co-processor.
  • At step 82, the co-processor processes the retrieved data elements in accordance with commands contained within at least some of the data elements.
  • FIG. 5 illustrates a logic diagram of yet another alternate method for co-processing multi-formatted data.
  • the process begins at step 90 where a host processor instructs a data block to be stored in memory in a substantially continuous manner.
  • the process then proceeds to step 92 where the host processor provides a plurality of indicators to a co-processor.
  • the plurality of indicators relate to the data elements as they are being stored in memory.
  • the process then proceeds to step 94 where the co-processor utilizes one of the plurality of indicators to retrieve a corresponding data element from memory.
  • the co-processor determines at step 96 whether the data format of the retrieved data element matches the co-processor's data format. If so, the process proceeds to step 98 where the co-processor processes the data in accordance with commands contained within at least some of the data elements. If, however, the data formats are not consistent, the process proceeds to step 100.
  • At step 100, the co-processor converts the data format of the retrieved data element into the co-processor data format. Having done this, the co-processor processes the converted data based on the commands contained in at least some of the data processing elements.
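
The buffer-to-buffer flow described for FIG. 2 (first buffer, programmable parsing module, second buffer, processing module) might be sketched as follows. This is only an illustrative model under stated assumptions: the format labels `"2d"`, `"3d"` and `"mpeg"`, the `PARSING_MODULES` table, and the string-based stand-ins for conversion and rendering are all hypothetical and do not come from the patent.

```python
from collections import deque

NATIVE = "2d"  # assumed native format of the co-processor

# Hypothetical parsing modules, one per non-native format.
PARSING_MODULES = {
    "3d": lambda payload: f"2d-from-3d({payload})",
    "mpeg": lambda payload: f"2d-from-mpeg({payload})",
}


def run_pipeline(elements):
    """Pass (format, payload) pairs through both buffers and return the results."""
    first_buffer = deque(elements)   # queued elements retrieved from memory
    second_buffer = deque()          # native-format payloads only
    processed = []

    # Programmable parsing module: pick the parser that matches each format.
    while first_buffer:
        fmt, payload = first_buffer.popleft()
        if fmt != NATIVE:
            if fmt not in PARSING_MODULES:
                raise ValueError(f"no parsing module for format {fmt!r}")
            payload = PARSING_MODULES[fmt](payload)
        second_buffer.append(payload)

    # Processing module: a stand-in for the co-processor execution modules.
    while second_buffer:
        processed.append(f"rendered:{second_buffer.popleft()}")
    return processed
```

Keeping the parsers in a lookup table mirrors the idea that the parsing module is programmable: supporting another format would mean adding an entry rather than changing the processing stage.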

Abstract

A method and apparatus for co-processing multi-formatted data which begins when a host processor writes data blocks, in a substantially continuous manner, into memory. Each of the data blocks includes a plurality of data elements and each data element has one of a plurality of data formats. As the data block is being stored in memory, a co-processor retrieves selected data elements from the memory. Upon retrieving the selected data elements, the co-processor interprets them to identify the data format. If the data format is consistent with the data format of the co-processor, the co-processor processes the data element without conversion. If, however, the data format of the selected data element is not consistent with the data format of the co-processor, the co-processor converts the format of the selected data element into the format consistent with the co-processor.

Description

TECHNICAL FIELD OF THE INVENTION
The present invention relates generally to computer systems and more particularly to co-processing multi-formatted data within computer systems.
BACKGROUND OF THE INVENTION
A computer is known to include a central processing unit, system memory, video graphics circuitry, audio processing circuitry, and peripheral ports. The peripheral ports allow the computer to interface with peripheral devices such as printers, monitors, external tape drives, Internet, etc. In such a computer, the central processing unit functions as a host processor while the video graphics circuit functions as a loosely coupled co-processor. In general, the host processor executes applications and, during execution, calls upon the co-processor to execute its particular function. For example, if the host central processing unit requires a drawing operation to be done, it requests, via a command through a command delivery system, the video graphics co-processor to perform the drawing function.
In many situations, the host central processing unit needs to know the current status of the co-processor, or co-processors, before it can continue with processing the particular application and/or before sending new commands to the co-processor. The host central processing unit obtains such status information from the co-processors via a handshaking protocol. In essence, the host central processing unit initiates the handshaking protocol by polling a co-processor to obtain its status and by polling a co-processor register to obtain the stored status. The host processor then determines whether the co-processor's status has changed. If so, the host processor updates the co-processor register and continues with additional processing operations. If not, the host processor waits until the co-processor has completed the current task. Such a technique is known as poll and register writes.
To reduce the host processor's idle time while it is waiting for the co-processor, a command first-in, first-out ("FIFO") queue may be incorporated. The command FIFO stores queued commands from the host processor that are awaiting execution by the co-processor. When the co-processor is able to perform a command, it retrieves the command from the command FIFO. As the co-processor executes a queued command, it updates a co-processor register. In this implementation, the host processor needs to verify that the command FIFO is not full and still needs to read the co-processor register to determine the current status of the co-processor. If the command FIFO is relatively small, i.e., holds a limited number of commands, the host processor still experiences wait cycles while the co-processor completes the processing of a command thereby freeing space in the command FIFO.
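The effect of FIFO depth on host wait cycles can be illustrated with a small simulation. This is a minimal sketch rather than anything taken from the patent: the function name `count_host_stalls`, the host that attempts one push per cycle, the fixed `cycles_per_command` execution time, and the choice to free a FIFO entry when a command starts executing are all simplifying assumptions.

```python
from collections import deque


def count_host_stalls(num_commands, fifo_depth, cycles_per_command):
    """Simulate a host pushing commands into a bounded command FIFO.

    Hypothetical model: the host tries to push one command per cycle, and the
    co-processor needs `cycles_per_command` cycles to execute each command.
    Returns the number of cycles the host spent waiting on a full FIFO.
    """
    fifo = deque()
    pushed = 0   # commands the host has queued so far
    busy = 0     # cycles left on the command currently being executed
    stalls = 0

    while pushed < num_commands:
        # Co-processor side: start a new command when idle, then work one cycle.
        if busy == 0 and fifo:
            fifo.popleft()  # assumption: the entry is freed when execution starts
            busy = cycles_per_command
        if busy > 0:
            busy -= 1

        # Host side: push if there is room, otherwise spend a wait cycle.
        if len(fifo) < fifo_depth:
            fifo.append(pushed)
            pushed += 1
        else:
            stalls += 1
    return stalls
```

Under this model, comparing `count_host_stalls(20, 2, 4)` with `count_host_stalls(20, 8, 4)` shows the deeper FIFO absorbing more commands before the host begins to wait, at the cost of the extra storage.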
Increasing the size of the command FIFO, such that the host processor can download as many commands as needed, may reduce the wait cycles. But, by increasing the command FIFO, the required memory is increased, as is the die area, and the cost of the co-processor.
An additional issue that reduces concurrency between the host processor and the co-processor arises when the co-processor is required to process data having different formats. For example, if the co-processor is a video graphics co-processor, it may be required to process a variety of graphics data having different formats. Such graphics data includes two-dimensional images, three-dimensional images, MPEG data, etc., each of which uses a different data format. To process the variously formatted graphics data, the central processing unit pushes a processing command to a software driver that converts the command into a format that is compatible with the processing format of the video graphics co-processor. Once the software driver has converted the command, the converted command is provided to the video graphics co-processor's command FIFO. Once the command FIFO is full, the central processor experiences wait periods since it cannot push additional commands to the software driver until the FIFO has an available entry.
Therefore, a need exists for a method and apparatus that provides co-processing of multi-formatted data with minimal wait periods and without the need for increasing the command FIFO.
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 illustrates a schematic block diagram of a processing system in accordance with the present invention;
FIG. 2 illustrates a more detailed schematic block diagram of the processing system of FIG. 1;
FIG. 3 illustrates a logic diagram of a method for co-processing multi-formatted data in accordance with the present invention;
FIG. 4 illustrates a logic diagram of a method for a co-processor to perform data format conversions in accordance with the present invention; and
FIG. 5 illustrates a logic diagram of an alternate method for a co-processing multi-formatted data conversion in accordance with the present invention.
DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT
Generally, the present invention provides a method and apparatus for co-processing multi-formatted data. Such a process begins when a host processor writes data blocks, in a substantially continuous manner, into memory. Each of the data blocks includes a plurality of data elements and each data element has one of a plurality of data formats. As the data block is being stored in memory, a co-processor retrieves selected data elements from the memory. Upon retrieving the selected data elements, the co-processor interprets them to identify the data format. If the data format is consistent with the data format of the co-processor, the co-processor processes the data element without conversion. If, however, the data format of the selected data element is not consistent with the data format of the co-processor, the co-processor converts the format of the selected data element into the format consistent with the co-processor. With such a method and apparatus, the co-processor is performing the format conversion process. As such, the central processor's providing of data elements, which include commands, is no longer dependent on the processing of the commands by the co-processor. Thus, the central processing unit can continuously provide data elements to the memory and the co-processor may retrieve them and process them at its own rate. By breaking the dependency, the host processor and co-processor operate in a much more concurrent manner than in previous embodiments.
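The decoupling described above can be sketched with two threads sharing a queue. This is a rough illustration under stated assumptions, not the patented implementation: `queue.Queue` stands in for the shared memory, the format label `"2d"` is assumed to be the co-processor's native format, the `None` end marker is hypothetical, and the conversion is a string placeholder.

```python
import queue
import threading

NATIVE = "2d"  # assumed native format of the co-processor


def host(memory, elements):
    """Host side: write every element without waiting on the consumer."""
    for element in elements:
        memory.put(element)
    memory.put(None)  # hypothetical end-of-block marker


def co_processor(memory, results):
    """Co-processor side: pull elements at its own rate, converting if needed."""
    while True:
        element = memory.get()
        if element is None:
            break
        fmt, payload = element
        if fmt != NATIVE:
            # Stand-in for a real format conversion.
            fmt, payload = NATIVE, f"converted({payload})"
        results.append((fmt, payload))


def run(elements):
    """Run host and co-processor concurrently and return the processed elements."""
    memory = queue.Queue()  # stands in for the shared memory
    results = []
    worker = threading.Thread(target=co_processor, args=(memory, results))
    worker.start()
    host(memory, elements)
    worker.join()
    return results
```

Because the conversion happens on the consumer side, the host loop contains nothing but writes, which is the dependency break the paragraph describes.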
The present invention can be more fully described with FIGS. 1 through 5. FIG. 1 illustrates a schematic block diagram of a processing system that includes a host processor 12, memory 14, and a co-processor 16. The host processor 12 may be a central processing unit within a personal computer, laptop computer, and/or work station, or may be a stand-alone processing device such as a micro-processor, micro-controller, digital signal processor, and/or any other device that manipulates digital information based on programming instructions. In operation, the host processor writes data blocks 16, which include a plurality of data elements, to the memory 14. In addition, the host processor 12 provides signals 22 to the co-processor 16. The signals 22 provide instructions to the co-processor 16 indicating the manner in which the data elements are to be retrieved from the memory 14.
The memory 14 may be system memory, local memory to the host processor, local memory to the co-processor, or a combination thereof. The memory may be constructed of random access memory, floppy disk memory, hard disk memory, magnetic tape memory, CD memory, DVD memory, and/or any device that stores digital information. Further, the memory 14 is arranged in a ring buffer 26 such that the last data element n is followed by the first data element 0. In this manner, the host processor 12 writes the data elements of data block 16 into the ring buffer in a circular fashion. The writing of data into a ring buffer is known, thus no further discussion will be provided except to facilitate the understanding of the present invention.
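A ring buffer in which element n wraps around to element 0 can be sketched as below. This is a generic, minimal version for illustration: the class name `RingBuffer`, the separate read and write indices, and the choice to refuse a write when the buffer is full are assumptions, since the text does not say how a full buffer is handled.

```python
class RingBuffer:
    """Fixed-size circular buffer: the last slot is followed by slot 0."""

    def __init__(self, size):
        self.slots = [None] * size
        self.write_index = 0   # next slot the host writes
        self.read_index = 0    # next slot the co-processor reads
        self.count = 0         # elements written but not yet read

    def write(self, element):
        """Host side: store an element; returns False if the buffer is full."""
        if self.count == len(self.slots):
            return False
        self.slots[self.write_index] = element
        self.write_index = (self.write_index + 1) % len(self.slots)
        self.count += 1
        return True

    def read(self):
        """Co-processor side: fetch the oldest unread element, or None if empty."""
        if self.count == 0:
            return None
        element = self.slots[self.read_index]
        self.read_index = (self.read_index + 1) % len(self.slots)
        self.count -= 1
        return element
```

The modulo on each index is what makes the buffer circular: after the last slot is used, the next write or read lands on slot 0.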
The co-processor 16 may be a micro-processor, micro-controller, digital signal processor, processor on a video graphics card, and/or any other device that manipulates digital information based on programming instructions. The co-processor 16 retrieves selected data elements 20 from the memory based on signals 22. Upon receiving the selected data elements, the co-processor determines whether the format of the data elements is consistent with the format for which the co-processor is constructed. If so, the co-processor processes the selected data elements to produce the processed data 24. If the format is inconsistent, the co-processor 16 converts the selected data elements 20 into a format consistent with that of the co-processor 16 and produces the processed data therefrom.
The processing performed by the co-processor 16 may relate to video graphics processing wherein the data block is representative of a sub-frame, frame and/or frame grouping of two-dimensional video graphics, three-dimensional video graphics and/or digitally encoded video graphics (e.g., MPEG). Each of the video graphics types has a different data format, such that the two-dimensional video graphics has one format, the three-dimensional video graphics has another format and the digitally encoded video graphics has yet another data format. As such, the co-processor, if constructed to process two-dimensional video graphics, would have to convert the three-dimensional video graphics and/or digitally encoded video graphics data elements into the equivalent two-dimensional video graphics data elements for processing. Such processing includes producing pixel data from the data elements for subsequent display on a monitor.
For example, assume that the video graphics co-processor 16 is constructed to span an object based on the start pixel of the span (e.g., DST. START) and the width of the span (e.g., DST. WIDTH). The current video image being process by the host processor, however, has a data format where the span information is provided as the start pixel and the stop pixel. As such, the co-processor 16 converts the start and stop pixel span information into start and width span information. Once converted into this format, the co-processor 16 can process the data, which, in this example, is processing the span of an object. By having the co-processor performing the format conversion, the host processor is no longer dependent upon the co-processor such that the host processor and the co-processor operate with greater concurrency.
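The span conversion in this example reduces to simple arithmetic. The sketch below assumes the stop pixel is inclusive (the text does not say either way), so the width is stop minus start plus one; the function name `span_stop_to_width` is hypothetical.

```python
def span_stop_to_width(start, stop):
    """Convert (start pixel, stop pixel) span data into (start pixel, width).

    Assumption: the stop pixel is part of the span (inclusive), so a span
    from pixel 10 to pixel 14 covers five pixels.
    """
    if stop < start:
        raise ValueError("stop pixel must not precede start pixel")
    width = stop - start + 1
    return start, width


converted = span_stop_to_width(10, 14)
```

If the stop pixel were exclusive in a given format, the width would instead be `stop - start`.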
FIG. 2 illustrates a more detailed schematic block diagram of the processing system 10. The processing system 10 includes the host processor 12, a software driver 13, memory 14, and the co-processor 16. The co-processor 16 includes a first buffer 30, a programmable parsing module 32, a second buffer 36, and a processing module 38. Also shown is a comparison between the prior art host processor and co-processor interaction (lower left portion of the Figure) and the interaction of the present invention (above the prior art illustration). The prior art process shows the host processor processing for a while and then waiting while the co-processor is processing. When the co-processor is done processing, the host processor resumes processing and the co-processor waits. The alternating of processing and waiting between the host processor and co-processor continues until the co-processor has completed its task. In contrast, the present invention allows the host processor to continually write a data block or plurality of data blocks for a given function.
The host processor also provides indications (i.e., signals 22 of FIG. 1) as to the data that it is writing to the memory. (Note that an indicator may be generated for each data element stored in memory or an indicator may be generated for each group of data elements stored in memory. For example, an indicator may be generated for every hundred triangles of an image that are provided to memory.) While the host processor is continuously writing data elements to the memory, the co-processor is pulling selected data elements from the memory. The selection of the data elements is based on the indications received from the host processor. As such, the host processor and co-processor are concurrently processing, wherein a majority of the host processor's processing time is spent writing data elements into the memory and not performing data format conversions.
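A rough sketch of this indicator scheme, under stated assumptions, is shown below. Two threads stand in for the host processor and the co-processor, a plain Python list stands in for the memory (not a ring buffer, for brevity), and each indicator is assumed to be a count of how many elements are valid so far. The names `host`, `coprocessor`, and `GROUP_SIZE` are illustrative and do not come from the patent.

```python
import queue
import threading

GROUP_SIZE = 100  # one indicator per hundred elements, as in the example above


def host(elements, memory, indicators):
    """Write elements continuously; post an indicator after each group."""
    total = len(elements)
    for i, element in enumerate(elements, start=1):
        memory.append(element)             # the host never waits on the co-processor
        if i % GROUP_SIZE == 0 or i == total:
            indicators.put(i)              # meaning: elements [0, i) are now valid
    indicators.put(None)                   # assumed end-of-block marker


def coprocessor(memory, indicators, processed):
    """Pull only the elements that an indicator has made available."""
    read_pos = 0
    while True:
        valid_upto = indicators.get()      # blocks until the host signals
        if valid_upto is None:
            break
        processed.extend(memory[read_pos:valid_upto])
        read_pos = valid_upto


memory, processed = [], []
indicators = queue.Queue()
elements = list(range(250))
workers = [
    threading.Thread(target=host, args=(elements, memory, indicators)),
    threading.Thread(target=coprocessor, args=(memory, indicators, processed)),
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

With 250 elements the host posts indicators at 100, 200, and 250, and the co-processor consumes each group while the host keeps writing.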
The software driver 13 includes a ring buffer algorithm such that the data elements received from the host processor are stored in memory 14 in a ring buffer fashion. In addition, the software driver provides the indicators to the co-processor. Further, the software driver 13 may include programming instructions to interpret the data elements as to their particular data format and make the conversion if necessary. While the software driver may perform this function, it is preferable to have the co-processor make such determinations, since the host processor 12 executes the programming instructions of the software driver 13 and any conversion performed by the driver therefore consumes host processing time.
The co-processor, based on the indicators, retrieves data elements from memory 14 and stores them in a first buffer 30. The first buffer 30 may function as a command FIFO such that the co-processor 16 queues incoming commands and data elements. The data elements stored in the first buffer 30 are provided to the programmable parsing module 32, which includes a plurality of parsing modules 34. Each of the parsing modules is operable to convert the data format of one of the plurality of data formats to the data format of the co-processor. For example, one parsing module may perform the function of converting 3-D graphics data into 2-D graphics data, while another parsing module 34 may include programming instructions to convert a digitally encoded signal into the 2-D video graphics.
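One way to picture the programmable parsing module is as a table of converters keyed by source format. The sketch below is an assumption-laden illustration: the format tags (`"2d"`, `"3d"`, `"encoded"`), the tuple layout of a data element, and both placeholder conversions are invented for the example and are far simpler than any real 3-D projection or video decode.

```python
NATIVE_FORMAT = "2d"  # assumed tag for the co-processor's own data format


def parse_3d(payload):
    """Placeholder 3-D to 2-D conversion: drop the z coordinate."""
    x, y, _z = payload
    return (x, y)


def parse_encoded(payload):
    """Placeholder decode: the 'encoded' payload is an "x,y" string."""
    x, y = payload.split(",")
    return (int(x), int(y))


# One parsing module per non-native data format.
PARSING_MODULES = {
    "3d": parse_3d,
    "encoded": parse_encoded,
}


def programmable_parse(element):
    """Return the element's payload in the native format, converting if needed."""
    fmt, payload = element
    if fmt == NATIVE_FORMAT:
        return payload
    if fmt not in PARSING_MODULES:
        raise ValueError(f"no parsing module for format {fmt!r}")
    return PARSING_MODULES[fmt](payload)


first_buffer = [("2d", (1, 2)), ("3d", (3, 4, 5)), ("encoded", "6,7")]
second_buffer = [programmable_parse(element) for element in first_buffer]
```

Supporting a further format in this sketch means adding one entry to `PARSING_MODULES`, which is one plausible reading of "programmable".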
The converted data elements are provided to the second buffer 36 and are eventually pulled into the processing module 38, which includes a plurality of co-processor execution modules 40. Each of the co-processor execution modules 40 performs a particular function specific to the co-processor 16. For example, if the co-processor 16 is a video graphics co-processor, the co-processor execution modules 40 may include a setup engine, an edgewalker circuit, texel blending module, etc.
FIG. 3 illustrates a logic diagram of a method for co-processing multi-formatted data. The process begins at step 50 where a data block is written into memory. The data block is written into memory in a substantially continuous manner as instructed by a host processor. The data block includes data elements that have one of a plurality of data formats. If the co-processor were a video graphics co-processor, the data blocks would include graphical data of an image and a plurality of commands for rendering the image. The data block may be for an image, a sub-frame of data, a frame of data, or a frame grouping of data, where a frame of data is representative of one screen, or window. The data formats of the data elements may be based on two-dimensional video graphics, three-dimensional video graphics and/or digitally encoded video graphics, such as DVD, MPEG 1 and 2, etc.
As an alternative to the host processor writing the data block directly into memory, the host processor may provide the data block to a software driver. Upon receiving the data block, the software driver would route the data elements to memory such that the data elements are stored in a ring buffer manner. Regardless of how the data blocks are provided to memory, they are stored in a ring buffer manner.
The process then proceeds to step 52 where a selected data element of the data block is retrieved. The retrieval of the data element may be done by a co-processor in response to receiving one of a plurality of indicators. The indicator may indicate a single data element or a group of data elements. The process then proceeds to step 54 where the co-processor interprets the selected data element to identify a data format. The process then proceeds to step 56 where a determination is made as to whether the data format equals a first data format, where the first data format is that in which the co-processor functions. If so, the process proceeds to step 58 where the co-processor processes the selected data element based on commands.
If, however, the data format of the selected data element is not that of the first data format, the process proceeds to step 60. At step 60, the data format of the selected data element is converted into the first data format. Having made the conversion, the process proceeds to step 62 where the converted data elements are processed by the co-processor.
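Steps 54 through 62 of FIG. 3 can be summarized in a few lines. The following is a hedged sketch rather than the actual co-processor logic: the dictionary layout of a data element, the format tags, and the helper names are assumptions, and the conversion reuses the start/stop span example with an inclusive stop pixel.

```python
def stop_to_width(span):
    """Assumed conversion: (start, inclusive stop) -> (start, width)."""
    start, stop = span
    return (start, stop - start + 1)


def fill_span(span):
    """Stand-in for processing: list the pixels covered by (start, width)."""
    start, width = span
    return list(range(start, start + width))


def co_process(element, native_format, converters, process):
    fmt = element["format"]                        # step 54: identify the format
    if fmt == native_format:                       # step 56: is it the first format?
        return process(element["data"])            # step 58: process directly
    converted = converters[fmt](element["data"])   # step 60: convert
    return process(converted)                      # step 62: process converted data


CONVERTERS = {"start_stop": stop_to_width}
native = co_process({"format": "start_width", "data": (10, 3)},
                    "start_width", CONVERTERS, fill_span)
foreign = co_process({"format": "start_stop", "data": (10, 12)},
                     "start_width", CONVERTERS, fill_span)
```

Both calls yield the same pixel list, which is the point of the conversion step: downstream processing sees only the first data format.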
FIG. 4 illustrates a logic diagram of an alternate method for co-processing multi-formatted data. The process begins at step 70 where a host processor provides a data block to a software driver. The data block includes a plurality of data elements that are formatted in one of a plurality of data formats. The process then proceeds to step 72 where the software driver interprets the data elements to identify a particular data format. The process then proceeds to step 74 where the software driver determines whether the data format of the data elements matches the data format of a co-processor. If so, the process proceeds to step 78. If not, the process proceeds to step 76 where the format of the data elements is converted into a format consistent with that of a co-processor.
The process then proceeds to step 78 where the data elements, or the converted data elements, are stored in memory. The process then proceeds to step 80 where the data elements that are stored in memory are retrieved by a co-processor. The process then proceeds to step 82 where the co-processor processes the retrieved data elements in accordance with commands contained within at least some of the data elements.
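For contrast with the co-processor-side conversion, the FIG. 4 alternative moves the format check into the software driver before anything reaches memory. The sketch below is illustrative only; the element layout, the `"draw"` and `"skip"` commands, and the placeholder `to_2d` conversion are assumptions made for the example.

```python
CO_PROCESSOR_FORMAT = "2d"  # assumed tag for the co-processor's data format


def to_2d(element):
    """Placeholder conversion: keep only the x and y coordinates."""
    return {
        "format": "2d",
        "command": element["command"],
        "point": element["point"][:2],
    }


def driver_store(data_block, memory):
    """Steps 72 to 78: interpret, convert if needed, then store."""
    for element in data_block:
        if element["format"] != CO_PROCESSOR_FORMAT:  # step 74: format mismatch?
            element = to_2d(element)                  # step 76: convert in the driver
        memory.append(element)                        # step 78: store in memory


def coprocessor_run(memory):
    """Steps 80 to 82: retrieve and process according to embedded commands."""
    output = []
    for element in memory:
        if element["command"] == "draw":
            output.append(element["point"])
    return output


data_block = [
    {"format": "2d", "command": "draw", "point": (1, 2)},
    {"format": "3d", "command": "draw", "point": (3, 4, 5)},
    {"format": "3d", "command": "skip", "point": (6, 7, 8)},
]
memory = []
driver_store(data_block, memory)
drawn = coprocessor_run(memory)
```

In this arrangement the co-processor needs no parsing step, at the cost of the conversion work running on the host processor that executes the driver.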
FIG. 5 illustrates a logic diagram of yet another alternate method for co-processing multi-formatted data. The process begins at step 90 where a host processor instructs a data block to be stored in memory in a substantially continuous manner. The process then proceeds to step 92 where the host processor provides a plurality of indicators to a co-processor. The plurality of indicators relate to the data elements as they are being stored in memory. The process then proceeds to step 94 where the co-processor utilizes one of the plurality of indicators to retrieve a corresponding data element from memory.
Upon retrieving the data element, the co-processor determines at step 96 whether the data format of the retrieved data element matches the co-processor's data format. If so, the process proceeds to step 98 where the co-processor processes the data in accordance with commands contained within at least some of the data elements. If, however, the data formats are not consistent, the process proceeds to step 100. At step 100, the co-processor converts the data format of the retrieved data element into the co-processor data format. Having done this, the co-processor processes the converted data based on the commands contained in at least some of the data elements.
The preceding discussion has presented a method and apparatus for co-processing multi-formatted data. By shifting the conversion of multi-formatted data external to the central processing unit, the concurrency between the central processing unit and co-processor is substantially increased. In addition, by shifting the determination into the co-processor, a substantial portion of processing time is off-loaded from the central processing unit since it does not have to perform such a conversion. As one of average skill in the art would readily appreciate, the present invention is applicable to a wide variety of co-processing environments and should not be limited to just the video graphics arena.

Claims (20)

What is claimed is:
1. A method for co-processing multi-formatted data, the method comprises:
a) writing a data block into memory, wherein the data block is being written into the memory in a substantially continuous manner as instructed by a host processor, wherein data elements of the data block have one of a plurality of data formats;
b) retrieving selected data elements of the data block from the memory;
c) interpreting the selected data elements to identify a data format of the plurality of data formats;
d) determining whether the data format is a first data format of the plurality of data formats;
e) when the data format is not the first data format, converting the selected data elements into data elements having the first data format to produce converted data elements; and
f) processing the converted data elements by a co-processor.
2. The method of claim 1, wherein the data block includes a plurality of commands for rendering an image.
3. The method of claim 1, wherein step (a) further comprises:
providing, by the host processor, the data block to a software driver; and
routing, by the software driver, the data elements to the memory, wherein the memory stores the data elements in a ring buffer manner.
4. The method of claim 3 further comprises:
providing, by the host processor, a plurality of indicators;
receiving, by the co-processor, one of the plurality of indicators, wherein the co-processor utilizes the one of the plurality of indicators to retrieve the selected data elements.
5. The method of claim 1, wherein the data block is representative of a sub-frame, frame, or frame grouping of at least one of: two-dimensional video graphics data, three-dimensional video graphics data, and digitally encoded video graphics data.
6. The method of claim 5, wherein the two-dimensional video graphics data is formatted in at least one of the plurality of data formats, the three-dimensional video graphics data is formatted in at least another one of the plurality of data formats, and the digitally encoded video graphics data is formatted in at least one other of the plurality of data formats.
7. The method of claim 1 further comprises:
providing, by the host processor, the data block to a software driver; and
interpreting, by the software driver, the data elements to identify the data format; and
converting, by the software driver, the data elements into data elements having the first format when the data format is not the first data format prior to the writing the data block into the memory.
8. A method for providing concurrency between a host processor and a co-processor, the method comprises the steps of:
a) instructing, by the host processor, a data block to be stored in memory, wherein the instructing causes the data block to be stored in a substantially continuous manner;
b) providing, by the host processor, a plurality of indicators as data elements of the data block are stored in the memory;
c) utilizing, by the co-processor, one of the plurality of indicators to retrieve a corresponding data element of the data block from the memory;
d) converting, by the co-processor, formatting of the corresponding data element to a co-processor data format to produce a converted data element when the corresponding data element has a data format that is inconsistent with the co-processor data format; and
e) processing, by the co-processor, the converted data element.
9. The method of claim 8, wherein the data block includes a plurality of commands for rendering an image.
10. The method of claim 8, wherein step (a) further comprises:
providing, by the host processor, the data block to a software driver; and
routing, by the software driver, the data elements to the memory, wherein the memory stores the data elements in a ring buffer manner.
11. The method of claim 8, wherein the data block is representative of a sub-frame, frame, or frame grouping of at least one of: two-dimensional video graphics data, three-dimensional video graphics data, and digitally encoded video graphics data.
12. The method of claim 11, wherein the two-dimensional video graphics data is formatted in at least one of the plurality of data formats, the three-dimensional video graphics data is formatted in at least another one of the plurality of data formats, and the digitally encoded video graphics data is formatted in at least one other of the plurality of data formats.
13. A processing system comprises:
a host processor;
memory operably coupled to the host processor, wherein the memory stores a data block in a substantially continuous manner based on writing instructions from the host processor, and wherein data elements of the data block are stored in a ring buffer manner; and
a co-processor operably coupled to the memory and to receive signals from the host processor, wherein the co-processor retrieves selected data elements of the data block from the memory based on at least one of the signals, wherein the co-processor converts a data format of the selected data elements to produce converted data elements when the data format is inconsistent with a co-processor data format, and wherein the co-processor processes the converted data elements.
14. The processing system of claim 13 further comprises a software driver operably coupled to receive the data elements and to cause the data elements to be stored in the memory.
15. The processing system of claim 14, wherein the co-processor further comprises:
a first buffer operably coupled to receive the selected data elements;
a programmable parsing module operably coupled to the first buffer, wherein the programmable parsing module interprets the selected data elements and converts the data format of the selected data elements when the data format is inconsistent with the co-processor data format;
a second buffer operably coupled to store the converted data elements; and
a processing module operably coupled to process the converted data elements.
16. The processing system of claim 15, wherein the data block includes a plurality of commands for rendering an image, and wherein the processing module includes a rendering circuit.
17. A co-processor comprises:
a first buffer operably coupled to retrieve selected data elements of a data block from a memory in concurrence with writing the data block into the memory;
a programmable parsing module operably coupled to the first buffer, wherein the programmable parsing module interprets the selected data elements and converts the data format of the selected data elements to produce converted data elements when the data format is inconsistent with a co-processor data format;
a second buffer operably coupled to store the converted data elements; and
a processing module operably coupled to process the converted data elements in concurrence with the writing of the data block into the memory.
18. The co-processor of claim 17, wherein the data block further comprises a plurality of commands and a data stream for rendering an image, and wherein the processing module includes a rendering circuit.
19. The co-processor of claim 17, wherein the programmable parsing module further comprises a plurality of parsing modules, wherein each of the plurality of parsing modules corresponds to at least one of the plurality of data formats.
20. The co-processor of claim 17, wherein the processing module further comprises a plurality of co-processor execution units.
US09/047,193 1998-03-24 1998-03-24 Method and apparatus for co-processing multi-formatted data Expired - Lifetime US5990910A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US09/047,193 US5990910A (en) 1998-03-24 1998-03-24 Method and apparatus for co-processing multi-formatted data
US09/088,190 US6195105B1 (en) 1998-03-24 1998-06-01 Method and apparatus for improved concurrent video graphic processing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US09/047,193 US5990910A (en) 1998-03-24 1998-03-24 Method and apparatus for co-processing multi-formatted data

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US09/088,190 Continuation-In-Part US6195105B1 (en) 1998-03-24 1998-06-01 Method and apparatus for improved concurrent video graphic processing

Publications (1)

Publication Number Publication Date
US5990910A true US5990910A (en) 1999-11-23

Family

ID=21947569

Family Applications (1)

Application Number Title Priority Date Filing Date
US09/047,193 Expired - Lifetime US5990910A (en) 1998-03-24 1998-03-24 Method and apparatus for co-processing multi-formatted data

Country Status (1)

Country Link
US (1) US5990910A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5797028A (en) * 1995-09-11 1998-08-18 Advanced Micro Devices, Inc. Computer system having an improved digital and analog configuration
US5854639A (en) * 1994-03-03 1998-12-29 Fujitsu Limited Graphic display unit and graphic display method using the same
US5925099A (en) * 1995-06-15 1999-07-20 Intel Corporation Method and apparatus for transporting messages between processors in a multiple processor system

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150033000A1 (en) * 1999-02-25 2015-01-29 Pact Xpp Technologies Ag Parallel Processing Array of Arithmetic Unit having a Barrier Instruction
US6307558B1 (en) * 1999-03-03 2001-10-23 Intel Corporation Method of hierarchical static scene simplification
US9690747B2 (en) 1999-06-10 2017-06-27 PACT XPP Technologies, AG Configurable logic integrated circuit having a multidimensional structure of configurable elements
US6707457B1 (en) * 1999-09-30 2004-03-16 Conexant Systems, Inc. Microprocessor extensions for two-dimensional graphics processing
US9256575B2 (en) 2000-10-06 2016-02-09 Pact Xpp Technologies Ag Data processor chip with flexible bus system
US9552047B2 (en) 2001-03-05 2017-01-24 Pact Xpp Technologies Ag Multiprocessor having runtime adjustable clock and clock dependent power supply
US9436631B2 (en) 2001-03-05 2016-09-06 Pact Xpp Technologies Ag Chip including memory element storing higher level memory data on a page by page basis
US9141390B2 (en) 2001-03-05 2015-09-22 Pact Xpp Technologies Ag Method of processing data with an array of data processors according to application ID
US9250908B2 (en) 2001-03-05 2016-02-02 Pact Xpp Technologies Ag Multi-processor bus and cache interconnection system
US10031733B2 (en) 2001-06-20 2018-07-24 Scientia Sol Mentis Ag Method for processing data
US9411532B2 (en) 2001-09-07 2016-08-09 Pact Xpp Technologies Ag Methods and systems for transferring data between a processing device and external devices
US9170812B2 (en) 2002-03-21 2015-10-27 Pact Xpp Technologies Ag Data processing system having integrated pipelined array data processor
US10579584B2 (en) 2002-03-21 2020-03-03 Pact Xpp Schweiz Ag Integrated data processing core and array data processor and method for processing algorithms
US9274984B2 (en) 2002-09-06 2016-03-01 Pact Xpp Technologies Ag Multi-processor with selectively interconnected memory units
US10296488B2 (en) 2002-09-06 2019-05-21 Pact Xpp Schweiz Ag Multi-processor with selectively interconnected memory units
US7743176B1 (en) 2005-03-10 2010-06-22 Xilinx, Inc. Method and apparatus for communication between a processor and hardware blocks in a programmable logic device
US7669037B1 (en) * 2005-03-10 2010-02-23 Xilinx, Inc. Method and apparatus for communication between a processor and hardware blocks in a programmable logic device
US20080267795A1 (en) * 2007-04-27 2008-10-30 Rusty Singer Positive Displacement Injection Pump
US20100053223A1 (en) * 2008-08-29 2010-03-04 Mitsubishi Electric Corporation Gradation control method and display device

Legal Events

Date Code Title Description
AS Assignment

Owner name: ATI TECHNOLOGIES, INC., CANADA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:LAKSONO, INDRA;ASARO, ANTHONY;REEL/FRAME:009121/0351

Effective date: 19980319

STCF Information on status: patent grant

Free format text: PATENTED CASE

FPAY Fee payment

Year of fee payment: 4

FPAY Fee payment

Year of fee payment: 8

FPAY Fee payment

Year of fee payment: 12

AS Assignment

Owner name: ATI TECHNOLOGIES ULC, CANADA

Free format text: CHANGE OF NAME;ASSIGNOR:ATI TECHNOLOGIES INC.;REEL/FRAME:026270/0027

Effective date: 20061025