US20070277096A1 - Method of dividing structured documents into several parts - Google Patents

Info

Publication number
US20070277096A1
Authority
US
United States
Prior art keywords
information
document
structured
main
secondary portion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US11/800,550
Inventor
Claude Seyrat
Cedric Thienot
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Expway SA
Original Assignee
Expway SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Expway SA
Priority to US11/800,550
Publication of US20070277096A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/958: Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • G06F16/986: Document structures and storage, e.g. HTML extensions
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10: TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10S: TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00: Data processing: database and file management or data structures
    • Y10S707/99941: Database schema or data structure
    • Y10S707/99942: Manipulating data structure, e.g. compression, compaction, compilation

Abstract

The method applies to a structured document (D) presenting a hierarchical structure defined by a structure schema, the document combining a main structured set (1) of information including information subsets (1.1, 1.2, 1.3, . . . , 1.2.2.2), at least some of the information subsets being structured and being capable of including information subsets of lower hierarchical level, each information subset being associated in the higher level information set with a respective information type (T). The method comprises the steps of: dividing the document into structured portions (P1, P2, P3) capable of being handled individually, namely a main portion (P1) and at least one secondary portion (P2, P3), the main portion containing at least the main set (1) of information, and the secondary portion containing an information subset (1.2.1, 1.2.2) which is removed from the main set of information, each secondary portion being attached to the main portion or to another secondary portion; and allocating a predefined value to the information type of each information subset (1.2.1, 1.2.2) that has been removed from an information set (1.2) of higher hierarchical level.

Description

  • The present invention relates to a method enabling structured documents to be divided into several parts.
  • It applies particularly but not exclusively to handling, transmitting, storing, and reading structured multimedia documents, digital or video images or image sequences, movies or video programs, and more generally to any transfer of said documents between processor units interconnected by data transmission networks, or between a processor unit and a storage unit, or indeed between a processor unit and a playback unit such as a television set if the document is a video program.
  • More and more frequently, documents handled and transmitted in this way contain a plurality of different types of data integrated in a structure. A structured document is a collection of data sets, each associated with a type and attributes, and interconnected by relationships that are mainly hierarchical. Such documents use a markup language such as Standard Generalized Markup Language (SGML), Hypertext Markup Language (HTML), or Extensible Markup Language (XML), serving in particular to distinguish between the various subsets of information making up the document. In contrast, in a “linear” document, the content information of the document is mixed in with layout information and type information.
  • A structured document includes markers for separating different sets of information in the document. For SGML, XML, or HTML formats, these markers are referred to as “tags” and have the form “<XXXX>” and “</XXXX>”, the first marker marking the beginning of a set of information called “XXXX”, and the second marking the end of said set. A set of information may itself be made up of a plurality of lower-level sets of information. Thus, a structured document presents a tree or hierarchical structure schema, each node representing a set of information and being connected to a node at a higher hierarchical level representing a set of information that contains the sets of information at lower level. The nodes situated at the ends of branches in such a tree structure represent sets of information containing data of predetermined type, themselves not suitable for being resolved into subsets of information.
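As an illustration of the tree view described above, the following minimal sketch parses a small XML fragment with Python's standard `xml.etree.ElementTree` module and lists its nodes by hierarchical level. The sample document and its tag names (`movie`, `title`, `cast`, `actor`) are hypothetical and do not come from the patent; the helper names `outline` and `leaves` are likewise assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the tag names are illustrative only.
SAMPLE = (
    "<movie>"
    "<title>Example</title>"
    "<cast><actor>A</actor><actor>B</actor></cast>"
    "</movie>"
)

def outline(element, depth=0):
    """Return (depth, tag) pairs in document order."""
    rows = [(depth, element.tag)]
    for child in element:
        rows.extend(outline(child, depth + 1))
    return rows

def leaves(element):
    """Return the tags of nodes with no child elements (ends of branches)."""
    if len(element) == 0:
        return [element.tag]
    return [tag for child in element for tag in leaves(child)]

root = ET.fromstring(SAMPLE)
for depth, tag in outline(root):
    print("  " * depth + tag)
```

The nodes returned by `leaves` correspond to the sets of information at the ends of branches, which cannot be resolved into further subsets.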
  • Thus, a structured document contains separation markers represented in textual or binary data form, said markers defining information sets or subsets that can themselves contain other subsets of information defined by the markers.
  • A structured document is associated with a structure schema defining the structure in the form of rules together with the type of information in each set of information of the document. A schema is constituted by nested groups of information set structures, these groups possibly being ordered sequences, groups of alternative elements, or groups of necessary elements, ordered or not ordered.
  • At present, when a structured document is to be transmitted, it is initially compressed so as to minimize the volume of data to be transmitted. For best efficiency in such compression processing, the document structuring data is also compressed, given that the recipient of the document is assumed to know beforehand the structure schema of the document and to be able to use the structure schema to determine at all times what information set is about to be received. It is therefore essential for the structure of the document as transmitted to correspond exactly to the structure schema that the recipient of the document intends to use for receiving and decoding the document, since otherwise the recipient cannot determine the type of data that has been transmitted and is thus incapable of decoding the data and of reconstituting the original document.
  • Unfortunately, structured documents for transmission are tending to become more and more voluminous. Proposals have been made, for example, to transmit or broadcast complete descriptions of movies or TV programs in this way.
  • In this context, if a transmission error should occur while a document is being transmitted, the recipient of the document may no longer be able to determine which subset is being transmitted, in which case the entire document needs to be transmitted again. Furthermore, if it is desired to transmit a movie sequence and display it simultaneously on a screen, it can be necessary to comply with periods of time for transmitting the various elements of the sequence. Certain elements of the sequence must also be capable of being transmitted several times over so as to enable a recipient who was not connected at the beginning of the transmission of the sequence to receive and display the end of the sequence.
  • It may also be necessary to replace a portion of a document by another, these two portions having the same structure schema.
  • The solution which consists in retransmitting the entire document leads to a considerable increase in the volume of information that needs to be transmitted. It is therefore desirable to be able to divide a document into a plurality of portions which are transmitted separately. It turns out that present transmission methods are not suitable for transmitting a document in part only.
  • An object of the invention is to overcome that drawback. This object is achieved by providing a method of dividing a structured document presenting a hierarchical structure defined by a structure schema, the document combining a main set of information including information subsets, at least some of the information subsets being capable of including information subsets of lower hierarchical level, each information subset being associated with a respective information type.
  • According to the invention, the method comprises the steps of:
  • dividing the document into portions that can be handled separately, namely a main portion and at least one secondary portion, the main portion containing at least the main set of information, and the secondary portion containing an information subset which is removed from the main set of information, each secondary portion being attached to the main portion or to another secondary portion; and
  • allocating a predefined value to the information type of each information subset that has been removed from a higher level information set.
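The two steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Node` class, the `divide` function, the numeric type values, and the use of the node label as the reference to the detached portion are all hypothetical. The tree mirrors the node labels of FIG. 1, and the predefined value is assumed to be zero.

```python
from dataclasses import dataclass, field

REMOVED = 0  # assumed predefined type value marking a removed subset

@dataclass
class Node:
    type_id: int   # information type T (non-zero for real data in this sketch)
    name: str      # label of the subset, e.g. "1.2.1"
    children: list = field(default_factory=list)

def divide(root, names_to_detach):
    """Return (main_portion, secondary_portions) without mutating root."""
    secondary = {}

    def copy(node):
        kept = []
        for child in node.children:
            if child.name in names_to_detach:
                # The detached subset becomes a portion of its own ...
                secondary[child.name] = copy(child)
                # ... and its slot keeps only the predefined type value.
                kept.append(Node(REMOVED, child.name))
            else:
                kept.append(copy(child))
        return Node(node.type_id, node.name, kept)

    return copy(root), secondary

# Tree with the node labels of FIG. 1; the type values are arbitrary.
tree = Node(1, "1", [
    Node(2, "1.1"),
    Node(3, "1.2", [
        Node(4, "1.2.1", [Node(5, "1.2.1.1")]),
        Node(4, "1.2.2", [Node(5, "1.2.2.1"), Node(5, "1.2.2.2")]),
    ]),
    Node(6, "1.3", [Node(7, "1.3.1")]),
])

main, parts = divide(tree, {"1.2.1", "1.2.2"})
```

Because `copy` recurses into detached subsets as well, a secondary portion can itself contain placeholders, which corresponds to a secondary portion being attached to another secondary portion.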
  • In this way, each portion is understandable on its own and can be decoded regardless of the selected partitioning. In addition, when such a portion is transmitted and the transmission fails, the remainder of the document remains valid and only the portion that was not transmitted correctly needs to be retransmitted, there being no need to retransmit the entire document. Furthermore, there is no need to have main portions and secondary portions upstream from a portion in order to be able to decode that portion, since each portion is valid and comprehensible on its own. By means of these dispositions, a transmitted document can be enriched and modified as time progresses.
  • Advantageously, the document includes a header which is inserted in each portion, the header including a flag whose value specifies whether or not the document is complete.
  • According to a feature of the invention, each portion has a header containing information giving the location of the portion in the hierarchical structure of the document.
  • Said information concerning the location of the secondary portion in the hierarchical structure of the document advantageously describes a path in said structure, defining the position of the secondary portion in the document.
  • Said path may be defined in absolute manner relative to the main set of information of the document. It may also be defined in relative manner relative to the position of a most recently-transmitted secondary portion.
  • Alternatively, each type of information allocated to the predefined value is followed by a reference to the secondary portion containing the subset of information associated with the type of information, said information concerning the location of the secondary portion in the hierarchical structure of the document being the reference of said secondary portion.
  • The method may also include transmitting a plurality of document portions associated with the same location in the structure. Under such circumstances, the most recently-transmitted portion replaces the previous portion that was associated with the same location.
  • Provision may also be made for the header of each portion to contain information specifying a way of processing the portion relative to a portion associated with the same location in the structure.
  • The structured document may be of the SGML, XML, or HTML type, for example.
  • A preferred embodiment of the invention is described below by way of non-limiting examples and with reference to the accompanying drawing, in which:
  • FIG. 1 shows a tree structure in which each node symbolizes a set or a subset of information in a structured document which is normally transmitted as a single entity;
  • FIG. 2 shows the structured document of FIG. 1 partitioned into a plurality of portions, each capable of being transmitted separately in accordance with the invention;
  • FIG. 3 shows in greater detail the structure of the information contained in a structured document; and
  • FIG. 4 shows another tree structure illustrating a method of defining the position of a portion of the structure, said portion being transmitted separately from the remainder of the structure.
  • FIG. 1 shows a tree structure comprising a root node 1 partitioned into three lower level nodes, of which a first node 1.1 is not partitioned into lower level nodes, a second node 1.2 comprises two nodes 1.2.1 and 1.2.2, and a third node 1.3 comprises a single node 1.3.1. The two nodes 1.2.1 and 1.2.2 of the second node 1.2 are respectively attached to one lower level node 1.2.1.1 and to two lower level nodes 1.2.2.1 and 1.2.2.2.
  • This structure represents a structured document D comprising a header H in which a certain number of parameters are defined that define the coding and display format of the document, and a main body B containing the information and the sets of information constituting the document.
  • According to the invention, a structured document can be transmitted as a plurality of separate portions P1, P2, P3, i.e. a main portion P1 and secondary portions P2 and P3 which are attached to the main portion (FIG. 2). Such transmission is preferably performed after each portion for separate transmission has been compressed in appropriate manner. Each portion of the document, whether or not it is compressed, comprises a header H, H2, H3, and a main body B1, B2, B3.
  • As shown in FIG. 3, a main body B of the document comprises a data header DH and one or more data bodies DB each containing the information of an information subset of the document. The data header DH may have a field K enabling ambiguity to be resolved at the time the document is decoded, in particular by giving a number enabling the following data set to be defined, and/or a field containing the number N of occurrences of the data body DB.
  • Depending on the format used, each data body DB may comprise a field T specifying the type of information it contains, a field L giving the length of the information as a number of bits or of bytes, a field A containing the attributes of the information subsets, and a field Val containing the value or the content of the information subsets.
  • Since the document is structured in the form of a tree structure, the field Val may itself contain a data header field DH and one or more fields containing a data body DB.
  • On this topic, it should be observed that in the structure schema shown in FIG. 1, the information contained in the document is held in the nodes 1.1, 1.2.1.1, 1.2.2.1, 1.2.2.2, and 1.3.1 situated at the ends of the branches, and also in the attribute fields A of the subsets symbolized by all of the nodes of the document.
  • According to the invention, when it is desired to transmit a part of such a document, and regardless of whether it has previously been compressed, the field T containing the type of the information in a data body DB that has not been transmitted or that has been withdrawn from the document receives a predefined value specifying that the following information subset is not transmitted. This predefined particular value for information type is selected to be equal to zero, for example, when a document is in compressed form, with other types of information having values that are not zero.
  • If this predefined value appears in the transmitted document, the length field L and the fields A and Val which normally follow the information type do not appear in the transmitted data. Consequently, following an information type that is equal to the predefined value, there is the header DH of the next set of data in the document, or an end-of-document flag.
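The effect of the predefined value on the transmitted data can be sketched with a toy byte layout. Everything about this layout is an assumption made for illustration: a one-byte data header holding the number N of data bodies, a one-byte field T, a two-byte big-endian field L, and no attribute field A. The patent does not specify field widths.

```python
import struct

REMOVED = 0  # assumed predefined type value: "the following subset is not transmitted"

def encode(bodies):
    """bodies: list of (type, value_bytes) pairs, or (REMOVED, None) for a removed subset."""
    out = bytearray([len(bodies)])            # toy data header DH: number N of data bodies
    for type_id, value in bodies:
        out.append(type_id)                   # field T
        if type_id == REMOVED:
            continue                          # no L or Val follows the predefined value
        out += struct.pack(">H", len(value))  # field L, length in bytes
        out += value                          # field Val
    return bytes(out)

def decode(data):
    """Inverse of encode; a removed subset is returned as (REMOVED, None)."""
    count, pos, bodies = data[0], 1, []
    for _ in range(count):
        type_id = data[pos]
        pos += 1
        if type_id == REMOVED:
            bodies.append((REMOVED, None))    # next byte already belongs to the next body
            continue
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        bodies.append((type_id, data[pos:pos + length]))
        pos += length
    return bodies

sample = [(2, b"abc"), (REMOVED, None), (3, b"xy")]
encoded = encode(sample)
```

The decoder never needs the removed content to stay in step with the stream: after reading a type equal to the predefined value it moves directly to the next data body.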
  • Provision can be made to add a parameter to the document header H to specify whether or not the document is transmitted in full, so as to inform the recipient of the document whether the document that is being received is being transmitted in full or in part.
  • The portions P1, P2, and P3 may be transmitted separately one or more times. For this purpose, each has a header H, H2, H3 comprising firstly a parameter specifying that the document is not complete, followed by a definition of the location of the transmitted portion in the tree structure of the complete document.
  • In this way, a structured document can be enriched and modified over time.
  • It should be observed that there is no need to transmit the main portion P1 since the location definitions appearing in the headers of the secondary portions enable the processor unit which receives the transmitted secondary portions to determine the location of each received portion in the structure of the document and thus to decode it. In addition, the document can be partitioned in such a manner that the main portion does not contain any payload data, so that the entire document can be reconstituted from the secondary portions and their locations within the document structure.
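A receiver that rebuilds the document from whichever portions arrive might look like the sketch below. The `Portion` class, the representation of a location as a tuple of child indexes, and the dictionary-based tree are assumptions chosen for brevity; the patent describes the location as a path or a reference and does not prescribe a data model.

```python
import copy
from dataclasses import dataclass

@dataclass
class Portion:
    complete: bool   # header flag: True only if this is the whole document
    location: tuple  # header: child indexes from the root, () for the main portion
    body: dict       # {"type": int, "children": [...]}; type 0 marks a removed subset

def empty_slot():
    return {"type": 0, "children": []}

def reassemble(portions):
    """Attach each portion at its location, shallowest first."""
    root = empty_slot()
    for portion in sorted(portions, key=lambda p: len(p.location)):
        node = root
        for index in portion.location:
            # Create placeholders when the enclosing portion was never received.
            while len(node["children"]) <= index:
                node["children"].append(empty_slot())
            node = node["children"][index]
        node.update(copy.deepcopy(portion.body))
    return root

main = Portion(False, (), {"type": 1, "children": [
    {"type": 2, "children": []},
    {"type": 0, "children": []},   # removed subset, sent separately
]})
secondary = Portion(False, (1,), {"type": 3, "children": [{"type": 5, "children": []}]})

full = reassemble([secondary, main])   # arrival order does not matter
partial = reassemble([secondary])      # main portion never transmitted
```

In `partial`, the secondary portion is still placed correctly even though the main portion is absent; the missing surroundings simply keep the predefined type value.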
  • In addition, the headers H, H2, H3 of the portions P1, P2, P3 may contain information specifying a mode of processing the portion relative to an already transmitted portion associated with the same location in the structure, for example whether the transmitted portion is to replace an already transmitted portion associated with the same location, or whether it should not be taken into account if it already appears in the received document, or indeed whether it should be merged with the already transmitted portion associated with the same location.
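The three processing modes named above (replace, do not take into account, merge) can be sketched as a small dispatch on a header field. The mode names, the `apply_portion` function, and the representation of a portion's content as a flat dictionary are assumptions; in particular, the patent does not define what merging means, and this sketch simply lets newly received keys overwrite existing ones.

```python
MODES = {"replace", "ignore", "merge"}

def apply_portion(received, location, content, mode="replace"):
    """Update the received document (a dict keyed by location) with one portion."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if location not in received or mode == "replace":
        received[location] = dict(content)
    elif mode == "merge":
        # Assumed merge rule: new keys are added, clashing keys are overwritten.
        received[location] = {**received[location], **content}
    # mode == "ignore": keep what was already received
    return received

doc = {}
apply_portion(doc, "/c/a[last]", {"title": "v1"})
apply_portion(doc, "/c/a[last]", {"title": "v2"}, mode="ignore")
after_ignore = dict(doc["/c/a[last]"])
apply_portion(doc, "/c/a[last]", {"year": "2007"}, mode="merge")
after_merge = dict(doc["/c/a[last]"])
apply_portion(doc, "/c/a[last]", {"title": "v3"})
after_replace = dict(doc["/c/a[last]"])
```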
  • As shown in FIG. 4, this definition of location may comprise the names of all of the higher nodes going back to the root node R, possibly associated with an order number relative to the higher node. For example, the first node of the first node of the third node of the first node attached to the root node (identified in FIG. 4 by a sequence of arrows coming from the root node R) can be referenced as follows:
  • /c/a[last]/b[1]/d
  • This notation indicates that it is a node of type “d” connected to the first node of type “b” connected to the last node of type “a” connected to the node of type “c” which is directly connected to the root node R.
  • Other portions of the document can then be transmitted either by using the absolute definition method (relative to the root node R) as described above, or else, and advantageously, by using a relative definition method. Thus, for example, the third node connected to the same node immediately above the preceding node may be referenced as follows:
  • ../e[2]
  • This notation states that reference is being made to the second node, which must be of type “e”, that is connected to the same node at immediately higher level as referenced by the notation “../”. It can be seen that this second method is more compact than the first.
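  • The absolute and relative location definitions can be illustrated by a small resolver. This is a sketch under stated assumptions, not the notation's formal grammar: it assumes steps are separated by "/", that an index in square brackets (a number or "last") counts among sibling nodes of the same type, that a missing index means the first such node, and that a relative path is evaluated from the previously referenced node. The `Node` class and `resolve` function are hypothetical names.

```python
import re


class Node:
    """A node of the document tree, identified by its type name."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []

    def add(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child


# One path step: a type name optionally followed by [n] or [last].
STEP = re.compile(r"^(\w+)(?:\[(\d+|last)\])?$")


def resolve(root, path, current=None):
    """Return the node designated by an absolute or relative location path."""
    node = root if path.startswith("/") else current
    if node is None:
        raise ValueError("a relative path needs a current node")
    for step in filter(None, path.split("/")):
        if step == "..":
            if node.parent is None:
                raise LookupError("cannot go above the root node")
            node = node.parent
            continue
        match = STEP.match(step)
        if match is None:
            raise ValueError(f"malformed step: {step!r}")
        name, index = match.groups()
        same_type = [c for c in node.children if c.name == name]
        position = len(same_type) if index == "last" else int(index or 1)
        if not 1 <= position <= len(same_type):
            raise LookupError(f"no node matches step {step!r}")
        node = same_type[position - 1]
    return node
```

  • With a tree in which the last "a" node under "c" holds a "b" node whose children are one "d" and two "e" nodes, `resolve(root, "/c/a[last]/b[1]/d")` returns the "d" node, and `resolve(root, "../e[2]", current=d)` then returns the second "e" node without restating the path from the root.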
  • Alternatively, the location of the transmitted portion P2, P3 of the document may be defined merely by means of a reference to the document portion, said reference having already been transmitted in the main portion P1 of the document, e.g. following the predefined value specifying that the following information subset is not transmitted.
  • Preferably, the document, or the portions P1, P2, P3 of the document for transmission, is/are compressed beforehand. For this purpose, it is advantageous in each document portion to distinguish between structure information and content information, given that certain document portions need not contain any content information. Thus, in the example of FIGS. 2 and 3, the structure information is constituted by all of the fields except for the value fields Val when these fields are not structured, i.e. when they are not capable of being partitioned into structured subsets of information. In the example of FIG. 2, these are the fields Val of the information subsets 1.1, 1.2.1.1, 1.2.2.1, 1.2.2.2, and 1.3.1, situated at the bottom ends of the branches of the document tree structure.
  • Compression processing proper consists, for example, in reading the portion of the document that is to be compressed sequentially, in applying an appropriate compression algorithm for processing the structure information, and in applying a compression algorithm adapted to the information type when a non-partitionable field Val appears while reading the document portion. It should be observed that in a compressed document or document portion, the structure information and the content information appear in the same order as in the original, non-compressed document.
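  • The sequential processing described above can be sketched as a single depth-first pass that encodes structure information with one coder and each non-partitionable value with a coder chosen by its type, writing both into the same stream in document order. This is a toy illustration only: the one-byte type codes in `TYPE_CODES`, the tag bytes, and the tuple representation of a portion are assumptions, not the encoding used by the method.

```python
import struct

# Hypothetical schema table: each information type maps to a one-byte code.
TYPE_CODES = {"doc": 0, "title": 1, "section": 2, "count": 3}
CODE_TYPES = {code: name for name, code in TYPE_CODES.items()}


def encode_value(value) -> bytes:
    """Content coder chosen by value type: fixed-width int or length-prefixed text."""
    if isinstance(value, int):
        return b"I" + struct.pack(">i", value)
    data = str(value).encode("utf-8")
    return b"S" + struct.pack(">H", len(data)) + data


def encode(node, out: bytearray) -> None:
    """Read a (type, value) node sequentially and append its encoding to out."""
    type_name, value = node
    out.append(TYPE_CODES[type_name])            # structure information
    if isinstance(value, list):                  # structured: recurse in order
        out += b"N" + bytes([len(value)])
        for child in value:
            encode(child, out)
    else:                                        # non-partitionable Val field
        out += encode_value(value)


def decode(buf: bytes, pos: int = 0):
    """Inverse of encode; returns the node and the position after it."""
    type_name = CODE_TYPES[buf[pos]]
    tag = buf[pos + 1:pos + 2]
    pos += 2
    if tag == b"N":
        count = buf[pos]
        pos += 1
        children = []
        for _ in range(count):
            child, pos = decode(buf, pos)
            children.append(child)
        return (type_name, children), pos
    if tag == b"I":
        (number,) = struct.unpack_from(">i", buf, pos)
        return (type_name, number), pos + 4
    if tag == b"S":
        (length,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        return (type_name, buf[pos:pos + length].decode("utf-8")), pos + length
    raise ValueError(f"unknown tag: {tag!r}")
```

  • Because structure and content bytes are interleaved in reading order, the decoder can rebuild the portion in one forward pass, which mirrors the observation that the compressed form preserves the order of the original.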
  • It is also possible to apply a statistical compression algorithm, such as Zip.
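  • As a brief illustration of this alternative, a serialized portion can be passed through a general-purpose compressor. The sketch below uses Python's standard `zlib` module, which implements the DEFLATE method commonly used inside Zip archives; the function names and the choice of compression level are assumptions for illustration.

```python
import zlib


def pack_portion(serialized: bytes) -> bytes:
    """Compress an already serialized document portion."""
    return zlib.compress(serialized, level=9)


def unpack_portion(packed: bytes) -> bytes:
    """Restore the serialized portion from its compressed form."""
    return zlib.decompress(packed)
```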

Claims (7)

1-12. (canceled)
13. A method of handling at least one structured document having a hierarchical structure defined in a structure schema, the structured document comprising a main structured set of information including information subsets, at least one of the information subsets being structured and including information subsets of lower hierarchical level, each information subset being associated in a higher level information set with a respective information type, the structure corresponding to each information type being defined in the structure schema, the structured document being divided into structured portions capable of being handled individually, namely a main portion and at least one secondary portion, the main portion containing at least a main set of information, and the second portion containing an information subset which is removed from the main set of information, each secondary portion being attached to the main portion or to another secondary portion, the structured document comprising in each information set from which at least one information subset has been removed, the information type of each removed information subset having a predefined allocated value,
said method comprising the steps of:
receiving by a recipient a data stream formed by at least one secondary portion,
reading by the recipient at least some received secondary portions, the step of reading comprising at least one step of updating over time the plurality of secondary portions associated with a same location in the structure in accordance with at least one predefined rule.
14. The method according to claim 13,
wherein during the step of updating, the most recently received secondary portion replaces the previously received secondary portion associated with the same location in the structure.
15. The method according to claim 13,
wherein a header of each read secondary portion contains information specifying a processing mode to be applied to said secondary portion relative to an already received secondary portion associated with the same location in the structure.
16. The method according to claim 13,
wherein the structured schema of the structured document is known by the recipient.
17. The method according to claim 13,
wherein the structured document is partitioned in such a manner that the main portion does not contain any payload data, so that the entire document is reconstituted from the secondary portions and their locations within the document structure.
18. The method according to claim 13,
wherein the data stream comprises the main portion, the step of reading comprising at least one step of updating over time a plurality of main portions associated with a same location in the structure in accordance with at least one predefined rule.
US11/800,550 2000-12-18 2007-05-04 Method of dividing structured documents into several parts Abandoned US20070277096A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US11/800,550 US20070277096A1 (en) 2000-12-18 2007-05-04 Method of dividing structured documents into several parts

Applications Claiming Priority (5)

Application Number Priority Date Filing Date Title
FR00/16507 2000-12-18
FR0016507A FR2818409B1 (en) 2000-12-18 2000-12-18 METHOD FOR DIVIDING STRUCTURED DOCUMENTS INTO MULTIPLE PARTS
US10/451,473 US7275060B2 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts
PCT/FR2001/004008 WO2002050708A1 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts
US11/800,550 US20070277096A1 (en) 2000-12-18 2007-05-04 Method of dividing structured documents into several parts

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
PCT/FR2001/004008 Division WO2002050708A1 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts
US10/451,473 Division US7275060B2 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts

Publications (1)

Publication Number Publication Date
US20070277096A1 true US20070277096A1 (en) 2007-11-29

Family

ID=8857802

Family Applications (2)

Application Number Title Priority Date Filing Date
US10/451,473 Expired - Fee Related US7275060B2 (en) 2000-12-18 2001-12-14 Method for dividing structured documents into several parts
US11/800,550 Abandoned US20070277096A1 (en) 2000-12-18 2007-05-04 Method of dividing structured documents into several parts

Country Status (6)

Country Link
US (2) US7275060B2 (en)
EP (1) EP1344151A1 (en)
JP (1) JP4145144B2 (en)
AU (1) AU2002219311A1 (en)
FR (1) FR2818409B1 (en)
WO (1) WO2002050708A1 (en)

Families Citing this family (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3944014B2 (en) * 2002-07-09 2007-07-11 株式会社東芝 Document editing method, document editing system, and document processing program
US7838430B2 (en) * 2003-10-28 2010-11-23 Applied Materials, Inc. Plasma control using dual cathode frequency mixing
US7464330B2 (en) * 2003-12-09 2008-12-09 Microsoft Corporation Context-free document portions with alternate formats
US8661332B2 (en) * 2004-04-30 2014-02-25 Microsoft Corporation Method and apparatus for document processing
US7549118B2 (en) * 2004-04-30 2009-06-16 Microsoft Corporation Methods and systems for defining documents with selectable and/or sequenceable parts
US7383500B2 (en) * 2004-04-30 2008-06-03 Microsoft Corporation Methods and systems for building packages that contain pre-paginated documents
US7418652B2 (en) * 2004-04-30 2008-08-26 Microsoft Corporation Method and apparatus for interleaving parts of a document
US7359902B2 (en) * 2004-04-30 2008-04-15 Microsoft Corporation Method and apparatus for maintaining relationships between parts in a package
US7580948B2 (en) * 2004-05-03 2009-08-25 Microsoft Corporation Spooling strategies using structured job information
US7755786B2 (en) 2004-05-03 2010-07-13 Microsoft Corporation Systems and methods for support of various processing capabilities
US7519899B2 (en) 2004-05-03 2009-04-14 Microsoft Corporation Planar mapping of graphical elements
US8243317B2 (en) * 2004-05-03 2012-08-14 Microsoft Corporation Hierarchical arrangement for spooling job data
US8363232B2 (en) * 2004-05-03 2013-01-29 Microsoft Corporation Strategies for simultaneous peripheral operations on-line using hierarchically structured job information
US7617450B2 (en) * 2004-09-30 2009-11-10 Microsoft Corporation Method, system, and computer-readable medium for creating, inserting, and reusing document parts in an electronic document
US7584111B2 (en) * 2004-11-19 2009-09-01 Microsoft Corporation Time polynomial Arrow-Debreu market equilibrium
US7617229B2 (en) * 2004-12-20 2009-11-10 Microsoft Corporation Management and use of data in a computer-generated document
US20060136816A1 (en) * 2004-12-20 2006-06-22 Microsoft Corporation File formats, methods, and computer program products for representing documents
US7617451B2 (en) * 2004-12-20 2009-11-10 Microsoft Corporation Structuring data for word processing documents
US7752632B2 (en) * 2004-12-21 2010-07-06 Microsoft Corporation Method and system for exposing nested data in a computer-generated document in a transparent manner
US7770180B2 (en) * 2004-12-21 2010-08-03 Microsoft Corporation Exposing embedded data in a computer-generated document
US8111694B2 (en) 2005-03-23 2012-02-07 Nokia Corporation Implicit signaling for split-toi for service guide
US20060277452A1 (en) * 2005-06-03 2006-12-07 Microsoft Corporation Structuring data for presentation documents
US20070022128A1 (en) * 2005-06-03 2007-01-25 Microsoft Corporation Structuring data for spreadsheet documents
US8176414B1 (en) * 2005-09-30 2012-05-08 Google Inc. Document division method and system
US20090307225A1 (en) * 2005-10-06 2009-12-10 Smart Internet Technology Crc Pty Ltd. Methods and systems for facilitating access to a schema
JP5570202B2 (en) * 2009-12-16 2014-08-13 キヤノン株式会社 Structured document analysis apparatus, structured document analysis method, and computer program
JP5480034B2 (en) 2010-06-24 2014-04-23 インターナショナル・ビジネス・マシーンズ・コーポレーション Method, program and system for dividing tree structure of structured document

Citations (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5142689A (en) * 1982-09-27 1992-08-25 Siemens Nixdorf Informationssysteme Ag Process for the preparation of the connection of one of several data processor devices to a centrally synchronized multiple line system
US5812999A (en) * 1995-03-16 1998-09-22 Fuji Xerox Co., Ltd. Apparatus and method for searching through compressed, structured documents
US5956726A (en) * 1995-06-05 1999-09-21 Hitachi, Ltd. Method and apparatus for structured document difference string extraction
US6021202A (en) * 1996-12-20 2000-02-01 Financial Services Technology Consortium Method and system for processing electronic documents
US6061697A (en) * 1996-09-11 2000-05-09 Fujitsu Limited SGML type document managing apparatus and managing method
US6304578B1 (en) * 1998-05-01 2001-10-16 Lucent Technologies Inc. Packet routing and queuing at the headend of shared data channel
US6311187B1 (en) * 1998-12-29 2001-10-30 Sun Microsystems, Inc. Propogating updates efficiently in hierarchically structured data under a push model
US20020013791A1 (en) * 2000-06-06 2002-01-31 Niazi Uzair Ahmed Data file processing
US6349302B1 (en) * 1997-07-08 2002-02-19 Hitachi, Ltd. Document processing method and system, and computer-readable recording medium having document processing program recorded therein
US20020023113A1 (en) * 2000-08-18 2002-02-21 Jeff Hsing Remote document updating system using XML and DOM
US6370536B1 (en) * 1996-11-12 2002-04-09 Fujitsu Limited Information management apparatus and information management program recording medium for compressing paragraph information
US6377957B1 (en) * 1998-12-29 2002-04-23 Sun Microsystems, Inc. Propogating updates efficiently in hierarchically structured date
US6606633B1 (en) * 1998-09-22 2003-08-12 Nec Corporation Compound document management system and compound document structure managing method
US6610104B1 (en) * 1999-05-05 2003-08-26 Inventec Corp. Method for updating a document by means of appending
US6635089B1 (en) * 1999-01-13 2003-10-21 International Business Machines Corporation Method for producing composite XML document object model trees using dynamic data retrievals
US6671853B1 (en) * 1999-07-15 2003-12-30 International Business Machines Corporation Method and system for selectively streaming markup language documents
US6681395B1 (en) * 1998-03-20 2004-01-20 Matsushita Electric Industrial Company, Ltd. Template set for generating a hypertext for displaying a program guide and subscriber terminal with EPG function using such set broadcast from headend
US20040205598A1 (en) * 1998-12-18 2004-10-14 Toru Takahashi Method and system for management of structured document and medium having processing program therefor
US6848078B1 (en) * 1998-11-30 2005-01-25 International Business Machines Corporation Comparison of hierarchical structures and merging of differences
US6850948B1 (en) * 2000-10-30 2005-02-01 Koninklijke Philips Electronics N.V. Method and apparatus for compressing textual documents
US6871320B1 (en) * 1998-09-28 2005-03-22 Fujitsu Limited Data compressing apparatus, reconstructing apparatus, and method for separating tag information from a character train stream of a structured document and performing a coding and reconstruction
US6966027B1 (en) * 1999-10-04 2005-11-15 Koninklijke Philips Electronics N.V. Method and apparatus for streaming XML content
US6996770B1 (en) * 1999-07-26 2006-02-07 Microsoft Corporation Methods and systems for preparing extensible markup language (XML) documents and for responding to XML requests
US7627881B1 (en) * 1999-01-29 2009-12-01 Sony Corporation Transmitting apparatus and receiving apparatus

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997034240A1 (en) * 1996-03-15 1997-09-18 University Of Massachusetts Compact tree for storage and retrieval of structured hypermedia documents
US6119123A (en) * 1997-12-02 2000-09-12 U.S. Philips Corporation Apparatus and method for optimizing keyframe and blob retrieval and storage
EP0928070A3 (en) * 1997-12-29 2000-11-08 Phone.Com Inc. Compression of documents with markup language that preserves syntactical structure
JP2000083059A (en) 1998-07-06 2000-03-21 Jisedai Joho Hoso System Kenkyusho:Kk Index information distributing method, index information distributing device, retrieving device and computer readable recording medium recording program for functioning computer as each means of those devices

Also Published As

Publication number Publication date
JP4145144B2 (en) 2008-09-03
FR2818409A1 (en) 2002-06-21
AU2002219311A1 (en) 2002-07-01
EP1344151A1 (en) 2003-09-17
US20040054669A1 (en) 2004-03-18
JP2004524606A (en) 2004-08-12
FR2818409B1 (en) 2003-03-14
WO2002050708A1 (en) 2002-06-27
US7275060B2 (en) 2007-09-25

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION