US20010032218A1 - Method and apparatus for utilizing document type definition to generate structured documents - Google Patents



Publication number
US20010032218A1
Authority
US
United States
Prior art keywords
document
metafile
objects
readable medium
machine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/754,969
Inventor
Evan Huang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
XMLCities Inc
Original Assignee
XMLCities Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by XMLCities Inc
Priority to US09/754,969
Assigned to XMLCITIES, INC. Assignment of assignors interest (see document for details). Assignors: HUANG, EVAN S.
Priority to TW90124215A
Publication of US20010032218A1
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0283 Price estimation or determination
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/103 Formatting, i.e. changing of presentation of documents
    • G06F40/117 Tagging; Marking up; Designating a block; Setting of attributes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/151 Transformation
    • G06F40/157 Transformation using dictionaries or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods

Definitions

  • the present invention generally relates to the area of document processing and electronic publishing systems and more particularly relates to a method and apparatus for generating structured documents with user-defined document type definitions.
  • the present invention also relates to a mechanism provided to users to convert unstructured documents for various presentations using the method and apparatus, wherein the unstructured documents are defined to be files composed, edited, or managed via an authoring application (e.g. word processing).
  • the Internet is a rapidly growing communication network of interconnected computers around the world. Together, these millions of connected computers form a vast repository of hyperlinked information that is readily accessible by any of the connected computers from anywhere and anytime. With millions of web pages being created and added to this vast repository each year, there is a tremendous need to quickly and easily convert documents, such as presentations, data sheets or brochures, into a format presentable to and accessible by other applications or computers on the Internet.
  • a preferable format that is presentable to a web browsing application is in a markup language, such as HyperText Markup Language (HTML), Extensible Markup Language (XML), Standard Generalized Markup Language (SGML) or Wireless Markup Language (WML).
  • Files or documents that are so composed, edited or managed for web browsing applications are commonly referred to as structured files or documents.
  • the ability to provide user-defined document type definitions (DTD) or document schema definition opens a new paradigm for information exchange or storage.
  • the challenge is how to generate structured documents with arbitrarily user-defined DTD.
  • A structured document with specific DTD can either be created from an unstructured document or converted from a structured document with another type of DTD.
  • the exemplary editors include Adobe FrameMaker, Arbortext Epic, and SoftQuad XMetal. These editors usually provide a structural view along with a word processing view, where the word processing view is like the traditional word processing environment for unstructured document while the structural view contains the document structure of data elements defined in certain DTD.
  • To create a structured document from scratch in these editors, a user usually needs to create an unstructured document in the word processing view. With a desired DTD loaded in, the user constructs a document structure tree in the structural view in accordance with document elements defined in the DTD. Typically, the user copies and pastes or drags and drops the data elements from the created document into the document structure tree.
  • association between data elements and document elements is a crucial and laborious process for creating or converting an unstructured or structured document into a structured document with specific DTD.
  • a keyword extracting approach extracts a keyword representative of the document structure from an unstructured document and the keyword/text pairs are used as the association between document elements and data elements.
  • a coordinate approach associates data elements with markup language tags in document elements by sorting the coordinates for coordinate documents.
  • a logical structure approach analyzes the document structure by matching the predetermined patterns and parses the data elements based on the analyzed document elements.
  • the present invention has been made in consideration of the above described problems and needs and has particular applications to presentations over the Internet.
  • One of the features in the present invention is the use of identifiers in a DTD file to associate selected objects or group objects so that association information of selected objects or group objects can facilitate the generation of files in a markup language suitable for presentations on various media.
  • the present invention may be implemented as a method, a system, a product or other practical forms.
  • the present invention is a method.
  • the method receives a definition file including document type definitions (DTD) and displays a metafile along with the definition file, the metafile including a number of displayable objects and respective decoration attributes about each of the displayable objects.
  • the definition file includes a structure for document elements, each corresponding to one of the displayable objects in the metafile. Some of the document elements include a number of identifiers, each of the identifiers being assigned to one of the document elements.
  • the identifiers are numerals and/or alphabets.
  • the identifiers are one or more of a font name, a color name, a size, a font type, a color, a style, various effects or other symbols.
  • the method associates at least one of the identifiers with one of the displayable objects.
  • the present invention is implemented as a method for providing a document conversion process, the method comprising: activating a counter having a numbering system; converting an unstructured document into a metafile, wherein the metafile includes a number of displayable objects and respective decoration attributes about each of the displayable objects; receiving a definition file including document type definitions (DTD) relating to the unstructured document; generating a modified metafile including association information of at least one of the displayable objects associated with one of the definitions in the definition file; and causing the counter to increment as soon as the modified metafile is to be saved.
  • FIG. 1A shows a basic system configuration in which the present invention may be implemented in accordance with a preferred embodiment
  • FIG. 1B shows internal construction blocks of a system in which the present invention may be implemented and executed to achieve desired results contemplated in the present invention
  • FIG. 2A illustrates an example of an unstructured document that may be composed, edited or managed by an authoring tool.
  • FIG. 2B is an example of document type definitions (DTD);
  • FIG. 2C shows a structured document for the unstructured document shown in FIG. 2A based on the document type definitions (DTD) in FIG. 2B;
  • FIG. 3A illustrates a functional diagram according to one embodiment of the present invention
  • FIG. 3B shows a visual environment implementing a conversion module according to one embodiment of the present invention
  • FIG. 3C shows an example of a style sheet designed in XML format with respect to displayable objects in a metafile shown in FIG. 3B;
  • FIG. 3D shows an exemplary message from a dongle
  • FIG. 3E shows a process flowchart of using a product including an implementation of conversion module according to one embodiment of the present invention
  • FIG. 4 illustrates a block diagram of data processing apparatus which imports, edits, and converts unstructured or structured document into structured documents with user-defined DTD using structure-based font information;
  • FIG. 5 is an association table for document elements defined in DTD in FIG. 2B and font attributes;
  • FIG. 6 shows an editing result for the unstructured document in FIG. 2A, where each parsed data element has been assigned font attributes based on the association table in FIG. 5;
  • FIG. 7 shows a transformation process which converts the parsed data elements in FIG. 6 into the desired structured document with the exemplary DTD in FIG. 2B;
  • FIG. 8 is an intermediate structured document, which contains parsed data elements with assigned font IDs associated with a list of font document elements;
  • FIG. 9 shows a transformation process which converts the intermediate structured document in FIG. 8 into the desired structured document with DTD in FIG. 2B;
  • FIG. 10 shows an implementation of the transformation process in FIG. 9 using the extensible style language for transformation (XSLT).
  • references herein to “one embodiment” or “an embodiment” mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention.
  • the appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, the order of blocks in process flowcharts or diagrams representing one or more embodiments of the invention does not inherently indicate any particular order nor imply any limitations in the invention.
  • FIG. 1A shows a basic system configuration in which the present invention may be implemented in accordance with a preferred embodiment.
  • Unstructured documents such as product descriptions, functions lists and price schedules, may be created using an authoring tool executed on a computer 100 .
  • Files or Documents created by an authoring tool are referred to as unstructured documents.
  • Exemplary authoring tools may include Microsoft Office (e.g. Microsoft Word, Microsoft PowerPoint, and Microsoft Excel), Adobe FrameMaker and Adobe Photoshop.
  • the unstructured documents may be uploaded to computing device 102 that may serve as a central repository.
  • Computing device 102 may be a server station from Sun Microsystems (www.sun.com) or a desktop computer loaded with a compiled and linked version of one embodiment implementing the present invention.
  • computer 100 and computing device 102 are inseparable and perform document conversion process and generate structured documents that may be ultimately represented in a format of markup language such as XML or HTML.
  • the structured documents represented in XML are converted to HTML format and become available through a private network 110 to a service server 104 that hosts what is generally referred to as a www (world wide web) site.
  • a user uses a desk computer 106 that operates a browsing application and is coupled to data network 108 to access files on service server 104 .
  • These files, represented by the structured documents in computing device 102, may represent the latest product information originally composed via an authoring tool.
  • the present invention is not limited to the Internet applications. It may be practiced in individual computers in which users often create documents in different word processing formats, such as FrameMaker or Microsoft Word.
  • the present invention may be utilized to convert documents to a markup representation regardless of the exact word processing formats.
  • FIG. 1B shows internal construction blocks of a system 118 in which the present invention may be implemented and executed.
  • System 118 may correspond to a client device (e.g. computer 100 , 102 or 106 ) or a server device (e.g. server 104 ).
  • system 118 includes a central processing unit (CPU) 122 interfaced to a data bus 120 and a device interface 124.
  • CPU 122 executes certain instructions to manage all devices and interfaces coupled to data bus 120 for synchronized operations and device interface 124 may be coupled to an external device such as computer 102 hence documents therefrom are received into memory or storage through data bus 120 .
  • Also interfaced to data bus 120 are display interface 126, network interface 128, printer interface 130 and floppy disk drive interface 138.
  • a compiled and linked version of one embodiment of the present invention is loaded into storage 136 through floppy disk drive interface 138 , network interface 128 , device interface 124 or other interfaces coupled to data bus 120 .
  • Main memory 132 such as random access memory (RAM) is also interfaced to data bus 120 to provide CPU 122 with the instructions and access to memory storage 136 for data and other instructions.
  • When executing stored application program instructions, such as the compiled and linked version of the present invention, CPU 122 is caused to manipulate the data to achieve results contemplated by the present invention.
  • ROM (read only memory) 134 is provided for storing invariant instruction sequences such as a basic input/output system (BIOS) for operation of keyboard 140, display 126 and pointing device 142 if there are any.
  • FIG. 2A illustrates an example of an unstructured document 200 that may be composed, edited or managed by an authoring tool.
  • data is generally presented in sequence, which usually follows a reading order (e.g. from top to bottom and left to right). This sequence may be parsed into segments of data elements, where each data element 102 is assigned decoration attributes or information such as positions, font color, font size, font type, style and various effects.
  • decoration information is essentially for proper layout and presentation purpose when a file containing the data elements is opened by the authoring tool for display on a display screen.
  • an unstructured document is printed to a metafile format that contains the decoration information.
  • a commonly used metafile format is Portable Document Format (PDF).
  • FIG. 2B illustrates an example of DTD 208 for “recipe-type” documents, in which a document is to be broken down into structures of document elements.
  • a particular document element 210 may contain other document elements and attributes.
  • Another example of the document element 212 contains only the parsed character data.
  • FIG. 2C shows structured document 220 converted correspondingly from unstructured document 200 in FIG. 2A with respect to DTD 208 in FIG. 2B. As shown in the figure, the data sequence in the unstructured document is parsed into data elements associated with document elements defined in DTD for the structured document.
  • the structured document can easily access certain information via the document elements.
  • Presentation of a structured document is usually defined in separate style sheets, e.g., written in cascading style sheet (CSS) or extensible style language for formatting objects (XSL-FO), which interprets layout for each document element.
  • This feature allows a structured document to be presented in different layouts for different media through different style sheets.
  • the decoration information or formatting attributes such as font information in an unstructured document, unless defined in DTD as attributes of document elements, are abandoned after an unstructured document is converted into a corresponding structured document. Further modification of formatting information will in general not affect the converted structured documents.
  • FIG. 3A illustrates a functional diagram 300 according to one embodiment of the present invention.
  • a conversion module 302 comprises an association module 302 and an integration module 306 .
  • Association module 302 receives an unstructured document, preferably in a metafile format.
  • association module 302 also receives a file, referred to as a definition file, including DTD that are predefined.
  • DTD is defined according to the nature or purposes of the unstructured document.
  • If the unstructured document is in a category of recipes, e.g. document 200 in FIG. 2A, the DTD in a definition file as shown in FIG. 2B is designed in accordance with the “recipe-type” documents.
  • FIG. 3B shows an environment 320 implementing conversion module 302 according to one embodiment of the present invention.
  • Environment 320 includes two displays 322 and 324 for a user to perform a conversion of an unstructured document to a file in markup language (referred to as a markup language file).
  • Display 322 is used to display the unstructured document.
  • a metafile version of the unstructured document is loaded for display.
  • a metafile referring to either the unstructured document or a printed version thereof, typically contains many displayable objects. Each object is a cluster or a group of characters or words or a graphic representation.
  • each word or an isolated numeral is a displayable object which is inherently carried over in the metafile.
  • each object is defined by a number of attributes or decoration information including, but not limited to, type, size, color and position of the object such that it can be “printed” correctly.
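  • As a rough illustration of the displayable objects described above, the following Python sketch models one object with a few decoration attributes and sorts a page into reading order (top to bottom, then left to right). The class name, field names and coordinate values are assumptions made for illustration; the specification does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass
class DisplayableObject:
    """One displayable object of a metafile page (hypothetical model)."""
    text: str         # the characters or word carried by the object
    font_type: str    # e.g. "Times"
    font_size: float  # in points
    color: str        # e.g. "black"
    x: float          # horizontal position on the page
    y: float          # vertical position, measured from the top of the page


def reading_order(objects):
    """Sort objects top to bottom, then left to right."""
    return sorted(objects, key=lambda o: (o.y, o.x))


# Three word objects on the same line, deliberately stored out of order.
page = [
    DisplayableObject("Salsa", "Times", 18.0, "black", 160.0, 40.0),
    DisplayableObject("Green", "Times", 18.0, "black", 40.0, 40.0),
    DisplayableObject("Chili", "Times", 18.0, "black", 100.0, 40.0),
]
title_words = [o.text for o in reading_order(page)]
```

  • Keeping the position alongside the text is what allows the words to be recovered in reading order before they are grouped.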
  • a number of objects can be grouped manually by a user in terms of their meanings or purposes.
  • group object 326 includes three character-type objects “Green”, “Chili” and “Salsa”. Naturally the three character-type objects form a title as a group object 326.
  • the object grouping may be performed for the rest of the displayed metafile in display 322 .
  • Display 324 is used to display a definition file prepared for the metafile in display 322 .
  • the definition file is presented graphically as “DTD Pool” 328 .
  • the graphical representation 328 of DTD 208 in FIG. 2B is used in display 324 to illustrate the hierarchical relationships among the document elements.
  • an auxiliary XML tree 330 is produced from “DTD Pool” 328 .
  • Auxiliary XML tree 330 also shows the hierarchical relationships among the document elements.
  • each of the document elements is assigned to an identifier that may include, but not be limited to, a numeral, a name, a font, a type name or a color.
  • the identifier is in “data” of each of the document elements.
  • “data” 334 is activated when group object 326 is selected.
  • One of the features in the present invention is an underlying association that relates group object 326 with the identifier in “data” 334 . Specifically in one embodiment, if the identifier in “data” 334 is a color, “green”, group object 326 is highlighted in green to indicate that this group object has been associated with the DTD. If the identifier is a font, “Arial”, group object 326 is highlighted in style Arial to indicate that this group object has been associated with the DTD.
  • a group object 340 can be associated with an identifier in data 342 under “ingredient”
  • a group object 344 can be associated with an identifier in data 346 under “amount” of “ingredient” and so on.
  • the metafile in display 322 has been segmented and the displayable objects therein are respectively grouped and each of the group objects is associated with the document element in the loaded DTD by an identifier.
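  • The grouping and association steps above might be sketched in Python as follows. This is only an illustration under stated assumptions: the element names "ingredient" and "amount" come from the description, while the "title" element name, the colour identifiers and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical identifiers: one highlight colour per document element.
IDENTIFIERS = {"title": "green", "ingredient": "blue", "amount": "orange"}


@dataclass
class GroupObject:
    words: List[str]                  # the grouped character-type objects
    element: Optional[str] = None     # document element chosen from the DTD
    identifier: Optional[str] = None  # identifier stored with the association

    @property
    def text(self) -> str:
        return " ".join(self.words)


def associate(group: GroupObject, element: str) -> GroupObject:
    """Record which document element a group object belongs to."""
    if element not in IDENTIFIERS:
        raise KeyError(f"no identifier defined for element {element!r}")
    group.element = element
    group.identifier = IDENTIFIERS[element]
    return group


title = associate(GroupObject(["Green", "Chili", "Salsa"]), "title")
```

  • After the call, the group carries both the element name and its identifier, which is the association information a modified metafile would need to keep.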
  • Display 322 now has a modified metafile 310 , an example of which will be illustrated below.
  • modified metafile 310 is input to an integration module 306 that further receives a style sheet.
  • a style sheet is typically configured to include mapping rules in accordance with the media on which the objects from the metafile will be presented.
  • One exemplary medium is a web presentation of a file accessible by a browser (e.g. Internet Explorer from Microsoft).
  • the file is in markup language, such as HTML or XML, and is referred to as a markup language file.
  • FIG. 3C shows an example of such style sheet designed in XML format with respect to the displayable objects in the metafile.
  • a style sheet is designed to position, color or size respectively each of the objects so that a proper and attentive presentation can be achieved for a particular media.
  • the example in FIG. 3C is designed for presenting a “recipe-type” document and causes the modified metafile to generate a proper XML file when loaded.
  • integration module 306 generates the XML file from the modified metafile in accordance with the style sheet.
  • the mapping rules can be loaded in with the DTD file so that integration module 306 performs mapping from the modified metafile to a markup language file in accordance with the loaded mapping rules.
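  • A minimal sketch of what the integration step could look like, assuming the modified metafile can be reduced to (identifier, text) pairs and the mapping rules to a lookup from identifier to element name. The rule table, the root tag and the sample text are hypothetical; only the Python standard library module xml.etree.ElementTree is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping rules: identifier -> element name in the output file.
MAPPING_RULES = {"green": "title", "blue": "ingredient"}


def to_markup(groups, root_tag="recipe"):
    """Build an XML string from (identifier, text) pairs of a modified metafile."""
    root = ET.Element(root_tag)
    for identifier, text in groups:
        tag = MAPPING_RULES.get(identifier)
        if tag is None:
            continue  # object was never associated with the DTD; skip it
        ET.SubElement(root, tag).text = text
    return ET.tostring(root, encoding="unicode")


modified_metafile = [
    ("green", "Green Chili Salsa"),
    ("blue", "tomatillos"),
    ("none", "page 1"),  # unassociated object, dropped from the output
]
xml_text = to_markup(modified_metafile)
```

  • Swapping in a different rule table is the sketch's analogue of loading a different style sheet for a different medium.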
  • conversion module 302 is implemented in software and may be distributed as an application to users or service providers. It is understood that the conversion process from an unstructured document to a markup language file is difficult to quantify in a cost-determinable way.
  • a counter 308 is included in conversion module 302 .
  • counter 308 is configured to count the number of pages in the metafile to be converted. Every time all of the objects in a display (i.e. a page display) are associated with the document elements in a DTD file and saved as a corresponding modified metafile, counter 308 increments.
  • FIG. 3D shows an example of counting results kept in a dongle.
  • a dongle (pronounced DONG-uhl) is a mechanism for ensuring that only authorized users can copy or use a specific software application, especially very expensive programs.
  • Common implementations of a dongle include a hardware key that plugs into a parallel or serial port on a computer and that a software application accesses for verification before continuing to run; special key diskettes accessed in a similar manner; and registration numbers that are loaded into some form of read-only memory at the factory or during a system setup.
  • an owner of a product including an implementation of conversion module 302 may distribute the product free or at very low cost to users.
  • the user needs to produce volumes of web pages from the unstructured documents composed, edited or managed by various authoring tools.
  • One of the benefits for the user to receive the product in such manner is not to have to come up with a large capital for acquiring the product before using it.
  • the users may pay for the usage of the products.
  • one of the purposes of using a dongle with conversion module 302 is to manage the usage thereof. As a result, the owner of the product can control the usage of the product by controlling the dongle containing the usage information.
  • FIG. 3E shows a process flowchart 370 of using a product including an implementation of conversion module 302 according to one embodiment of the present invention.
  • the product is leased by a user or a business.
  • the product is used by a service provider providing services to businesses that need to convert unstructured documents to structured documents for different media presentation (e.g. presentation on a web site).
  • Process 370 starts with generating metafiles from authored documents at 372 .
  • the authored documents may have been prepared using one or more authoring tools.
  • metafiles are preferably obtained from the authored documents so that conversion module 302 does not have to be respectively configured for each of the different authoring tools.
  • the preference of a metafile is not an inherent limitation to the current invention but is to make the product or conversion module 302 work more efficiently.
  • a conversion interface or a print driver could be configured to accommodate any type of the authored documents or generate the metafiles.
  • Once the metafiles are obtained, they may be loaded into a visual environment in which the metafiles can be respectively displayed. Environment 320 of FIG. 3B may be applicable so that pages of each of the metafiles can be individually loaded for display.
  • an authorization process 378 is triggered to ensure that the user is operating an authorized product.
  • one exemplary authorization method is through a dongle that is pre-set by a business or a dealer that offers/owns/controls the product. If authorization process 378 indicates that process 370 is not authorized, typically a display is shown to the user as to where the product can be authorized.
  • One of the procedures in setting authorization 376 involves a purchase of a permitted quantity for the number of pages converted or saved.
  • a dongle is used for coupling to a computer executing process 370 .
  • the dongle includes a first and a second number.
  • the first number is a starting number, for example, “10”
  • the second number is a limit number, for example, “1000”, which means that 1000 pages of converted documents can be processed and saved by process 370.
  • Once process 370 is permitted to proceed to 380, the user is permitted to group a number of displayable objects into group objects according to, perhaps, their meanings or their purposes and in view of a DTD file loaded and displayed nearby.
  • the group objects can be respectively associated with definitions in the DTD. At least some of the definitions have a number of identifiers, preferably each identifier is associated or designated to one of the definitions.
  • the associations between the selected objects and the definitions are to be saved in a modified metafile.
  • a counter is to be checked at 386.
  • the first and the second numbers in dongle are compared.
  • If the first number is substantially close to the second number, for example the two numbers being the same, process 370 will ask for a replenishment of the permitted usage.
  • the user has to get the dongle reset or reconfigured by a business or a dealer that can now collect fees based on the information in the dongle.
  • the numbers have been reset and now permit process 370 to proceed.
  • a save step can be conducted.
  • the modified metafile or a markup language file can be saved in a storage space.
  • the markup language file is generated from the modified metafile in reference to a style sheet for a predefined media presentation.
  • the counter is incremented.
  • the counter is checked at 386 , in particular after 384 , in FIG. 3E.
  • the description has made it evident to those skilled in the art that the counter could be checked or consulted virtually anywhere along process 370 .
  • One of the objectives for using a counter herein is to facilitate a business to control and determine the usage of process 370 so that a cost could be determined and a fee could be charged.
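  • The counter check described for process 370 can be illustrated with a small Python sketch. It assumes the first number is a running page count and the second number is the permitted limit, which is one reading of the description; the class and method names are hypothetical, and no real dongle interface is modelled.

```python
class UsageMeter:
    """Hypothetical stand-in for the two numbers kept in the dongle."""

    def __init__(self, start: int, limit: int):
        self.count = start  # first number: running page count
        self.limit = limit  # second number: permitted quantity

    def authorized(self) -> bool:
        """Usage is permitted while the count is below the limit."""
        return self.count < self.limit

    def save_page(self, page_name: str) -> str:
        """Check the counter, then increment it as the page is saved."""
        if not self.authorized():
            raise PermissionError("usage limit reached; replenish the dongle")
        self.count += 1
        return f"saved {page_name}"

    def replenish(self, extra_pages: int) -> None:
        """Raise the limit, as a dealer resetting the dongle might."""
        self.limit += extra_pages


meter = UsageMeter(start=10, limit=12)
meter.save_page("page-1")
meter.save_page("page-2")
blocked = not meter.authorized()  # the two numbers are now equal
```

  • In this sketch the check happens immediately before the save, but as the description notes, the counter could be consulted elsewhere along the process.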
  • FIG. 4 shows a functional block diagram 400 of a data processing module 404 according to one embodiment of the present invention.
  • Data processing module 404 included in integration module 306 comprises an input module 406 , an editing module 410 and a transformation or filtering module 414 .
  • One of the functions performed by data processing module 404 is to convert unstructured documents or structured documents with different DTD into corresponding structured documents with predefined or specific DTD.
  • Input module 406 loads documents or imports documents from a document database 402 that may correspond to a repository in computing device 102 of FIG. 1A. Alternatively input module 406 can start a new document 408 . It should be noted that the loaded or imported documents, can be either unstructured (e.g. a metafile) or structured and may have contained pre-created structure-based font information in certain cases.
  • An editing module 410 communicates with input module 406 and creates/edits the structure-based font information for the input documents. This module allows selections of data elements for the input documents and provides an editing environment to alter the font attributes such as font type, font style, font color, font size, and font effects for the selected data elements.
  • the way to parse the input documents into data elements and to assign font attributes is based on an association table for the document elements defined in a desired DTD and associated font attributes 412 .
  • An exemplary association table 500 for DTD 412 is given in FIG. 5, which contains the fields document element 502, element attribute 504, font type 506, font style 508, font color 510, font size 512, and font effect 514.
  • FIG. 6 shows an editing result 600 for the unstructured document 200 of FIG. 2A.
  • Each parsed data element or combined object 602, 604, 606, 608, 610, 612 and 614 has been assigned font attributes based on the association table in FIG. 5 and is displayed in the associated font.
  • This module allows sequential selection of data elements, based on the reading order of the input document 602, to edit their font information.
  • This module also allows region grouping of data elements to edit their font information.
  • This module can also provide an auxiliary view of the association table.
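  • The association table described above can be pictured as a lookup from font attributes to document elements. The following is a minimal sketch only: the field names follow FIG. 5 as described, but the class `FontAttributes`, the function `element_for`, and every font value in the rows are hypothetical, since the actual contents of FIG. 5 are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FontAttributes:
    """One row's font fields, named after the columns described for FIG. 5."""
    font_type: str
    font_style: str
    font_color: str
    font_size: int
    font_effect: str = "none"


# Hypothetical rows keyed by (document element, element attribute).
# The element names appear in the description; the font values are invented.
ASSOCIATION_TABLE = {
    ("title", None): FontAttributes("Arial", "bold", "green", 18),
    ("ingredient", None): FontAttributes("Times", "regular", "black", 12),
    ("procedure", None): FontAttributes("Times", "italic", "blue", 12),
}


def element_for(font: FontAttributes) -> Optional[str]:
    """Return the document element whose font attributes match, if any."""
    for (element, _attribute), attrs in ASSOCIATION_TABLE.items():
        if attrs == font:
            return element
    return None
```

  • Keying the table on exact font attributes keeps the lookup unambiguous, which is presumably why each document element is given a distinct font in the editing step.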
  • Transformation or filtering module 414 converts the loaded documents into structured documents with user-defined document type definitions (DTD) using the structure-based font information. Mapping rules for the conversion 416, based on the document elements and font attributes, are imported or designed in this module.
  • FIG. 7 illustrates an example 700 of the mapping rules for converting the edited document 602 into the structured document 220 of FIG. 2C.
  • 702 starts and 718 ends the “document” element
  • 704 starts and 714 ends the “recipe” element
  • 706 forms the “ingredient” element
  • 708 forms the “procedure” element
  • 710 forms the “presentation” element
  • 712 forms the “originate” element.
  • The structure-based font information is used to locate the data elements, and the located data elements are assigned as attributes or parsed character data for document elements.
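  • The mapping rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the rules of FIG. 7: the element names “document”, “recipe”, “ingredient”, “procedure” and “presentation” come from the description, while the sample text, the use of font color as the locating attribute, and the names `EDITED`, `COLOR_TO_ELEMENT` and `to_structured` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical edited data elements in reading order: (text, font color).
EDITED = [
    ("2 cups tomatillos", "black"),
    ("Blend until smooth.", "blue"),
    ("Serve chilled.", "purple"),
]

# Hypothetical mapping rule: the font color locates the document element.
COLOR_TO_ELEMENT = {
    "black": "ingredient",
    "blue": "procedure",
    "purple": "presentation",
}


def to_structured(elements):
    """Wrap each located data element as parsed character data."""
    document = ET.Element("document")
    recipe = ET.SubElement(document, "recipe")
    for text, color in elements:
        name = COLOR_TO_ELEMENT.get(color)
        if name is None:
            continue  # no rule matches this font, so the text is skipped
        ET.SubElement(recipe, name).text = text
    return document


xml_text = ET.tostring(to_structured(EDITED), encoding="unicode")
```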
  • The mapping rules can be implemented in, but are not restricted to, programming languages such as Java, JavaScript, extensible style language for transformation (XSLT), and C/C++, or in any built-in or programmable hardware devices.
  • The converted documents can be either saved as a file document or exported into a document database 418.
  • The transformation module can also output the edited documents as intermediate structured documents that contain the structure-based font information.
  • The intermediate structured documents can be reloaded for further editing or batch conversion.
  • An example of the intermediate structured document 800 for the edited document 600 is given in FIG. 8, where 802 contains “font” elements with specific font attributes and 804 contains the parsed data elements with “font_ID” attributes that associate them with the font information. Since parsed data elements with the same font attributes are grouped under the same “font_ID” in the intermediate structured document, mapping rules for conversion can be designed based on the grouped font information.
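  • To make the grouping by “font_ID” concrete, the sketch below parses a small intermediate document and collects the data elements that share a font. Only the “font” element and the “font_ID” attribute are taken from the description; the wrapper tags `intermediate`, `fonts`, `data` and `element`, the sample values, and the function `group_by_font_id` are assumptions made for illustration and need not match FIG. 8.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical intermediate structured document.
INTERMEDIATE = """
<intermediate>
  <fonts>
    <font id="f1" type="Arial" color="green" size="18"/>
    <font id="f2" type="Times" color="black" size="12"/>
  </fonts>
  <data>
    <element font_ID="f1">Green Chili Salsa</element>
    <element font_ID="f2">2 cups tomatillos</element>
    <element font_ID="f2">1 onion</element>
  </data>
</intermediate>
"""


def group_by_font_id(xml_text):
    """Collect data element text under each font_ID, in document order."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for element in root.iter("element"):
        groups[element.get("font_ID")].append(element.text)
    return dict(groups)


groups = group_by_font_id(INTERMEDIATE)
```

  • A mapping rule can then address a whole group at once (for example, everything under one font ID becomes “ingredient” elements) instead of testing each data element's font attributes separately.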
  • FIG. 9 illustrates an example of the mapping rules for converting the intermediate document 800 into the structured document 220 .
  • 902 starts and 918 ends the “document” element
  • 904 starts and 914 ends the “recipe” element
  • 906 forms the “ingredient” element
  • 908 forms the “procedure” element
  • 910 forms the “presentation” element
  • 912 forms the “originate” element.
  • The grouped font information is used to locate the data elements, and the located data elements are assigned as attributes or parsed character data for document elements.
  • FIG. 10 shows an example of implementing the mapping rules given in FIG. 9 using the extensible style language for transformation (XSLT).
  • The invention described above is preferably implemented in software, hardware or a combination of both. At least portions of the invention can be embodied as computer readable code on a computer readable medium.
  • The computer readable medium is any data storage device that can store data that can thereafter be read by a computing device. Examples of the computer readable medium include read-only memory, random-access memory, disk drives, floppy disks, CD-ROMs, DVDs, magnetic tape, optical data storage devices, and carrier waves.
  • The computer readable media can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.

Abstract

The use of identifiers in user-defined document type definitions is disclosed for converting unstructured documents to structured documents. The identifiers in user-defined document type definitions are used to associate selected objects or group objects in the unstructured documents so that association information of the selected objects or group objects can facilitate the generation of files in a markup language suitable for presentations on various media.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application claims the benefits of the provisional application, No. 60/179,330, entitled “Method and Apparatus for Generating Structured Documents with User-defined Document Type Definitions Using Structure-based Font Information”, filed Jan. 31, 2000, which is hereby incorporated by reference for all purposes. [0001]
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0002]
  • The present invention generally relates to the area of document processing and electronic publishing systems and more particularly relates to a method and apparatus for generating structured documents with user-defined document type definitions. The present invention also relates to a mechanism provided to users to convert unstructured documents for various presentations using the method and apparatus, wherein the unstructured documents are defined to be files composed, edited, or managed via an authoring application (e.g. word processing). [0003]
  • 2. Description of the Related Art [0004]
  • The Internet is a rapidly growing communication network of interconnected computers around the world. Together, these millions of connected computers form a vast repository of hyperlinked information that is readily accessible by any of the connected computers from anywhere and anytime. With millions of web pages being created and added to this vast repository each year, there is a tremendous need to quickly and easily convert documents, such as presentations, data sheets or brochures, into a format presentable to and accessible by other applications or computers on the Internet. [0005]
  • It is well known that a preferable format that is presentable to a web browsing application (e.g. a browser) is in a markup language, such as HyperText Markup Language (HTML), Extensible Markup Language (XML), Standard Generalized Markup Language (SGML) or Wireless Markup Language (WML). Files or documents that are so composed, edited or managed for web browsing applications are commonly referred to as structured files or documents. Among all the benefits of the structured documents, the ability to provide user-defined document type definitions (DTD) or document schema definition opens a new paradigm for information exchange or storage. However, the challenge is how to generate structured documents with arbitrarily user-defined DTD. [0006]
  • A structured document with a specific DTD can either be created from an unstructured document or converted from a structured document with another type of DTD. There are several editors for generating structured documents. Exemplary editors include Adobe FrameMaker, Arbortext Epic, and SoftQuad XMetal. These editors usually provide a structural view along with a word processing view, where the word processing view is like the traditional word processing environment for unstructured documents while the structural view contains the document structure of data elements defined in a certain DTD. To create a structured document from scratch in these editors, a user usually needs to create an unstructured document in the word processing view. With a desired DTD loaded in, the user constructs a document structure tree in the structural view in accordance with the document elements defined in the DTD. Typically, the user copies and pastes or drags and drops the data elements from the created document into the document structure tree. [0007]
  • To convert a structured document with one DTD into another DTD in these editors, one needs to load in the structured document, to modify the tags and attributes of document elements from one DTD to another, and to shuffle the data elements or to parse new data elements associated with redefined document elements in the new DTD. [0008]
  • Among the procedures described above, the association between data elements and document elements is a crucial and laborious step in creating or converting an unstructured or structured document into a structured document with a specific DTD. Several approaches have been proposed to associate the data elements and the document elements to simplify the generation of the structured document. For example, a keyword extracting approach extracts a keyword representative of the document structure from an unstructured document, and the keyword/text pairs are used as the association between document elements and data elements. A coordinate approach associates data elements with markup language tags in document elements by sorting the coordinates for coordinate documents. A logical structure approach analyzes the document structure by matching predetermined patterns and parses the data elements based on the analyzed document elements. Nevertheless, none of the above approaches has considered using identifiers (e.g. font information) to associate the data elements and document elements. There is, therefore, a need for a generic approach that uses the identifier information in user-defined document type definitions to associate data elements and document elements for generating structured documents. [0009]
  • In addition, the procedures required by the exemplary editors are somewhat tedious and laborious and are inherently of high cost. Quite often, a business that has many documents to convert has to outsource the process due to the inefficiency and slowness associated with the conversion process. On the other end, the conversion process conducted by a service provider is difficult to quantify because it mainly involves manual and repeated processes that depend on the complexities of the documents. There is thus another need for a mechanism for quantifying the conversion of unstructured documents to structured documents for various presentations in a cost-determinable way. [0010]
  • SUMMARY OF THE INVENTION
  • The present invention has been made in consideration of the above described problems and needs and has particular applications to presentations over the Internet. One of the features in the present invention is the use of identifiers in a DTD file to associate selected objects or group objects so that association information of selected objects or group objects can facilitate the generation of files in a markup language suitable for presentations on various media. [0011]
  • The present invention may be implemented as a method, a system, a product or other practical forms. According to one implementation, the present invention is a method. The method receives a definition file including document type definitions (DTD) and displays a metafile along with the definition file, the metafile including a number of displayable objects and respective decoration attributes about each of the displayable objects. The definition file includes a structure for document elements, each corresponding to one of the displayable objects in the metafile. Some of the document elements include a number of identifiers, each of the identifiers being assigned to one of the document elements. In one implementation, the identifiers are numerals and/or alphabets. In another implementation, the identifiers are one or more of a font name, a color name, a size, a font type, a color, a style, various effects or other symbols. The method associates at least one of the identifiers with one of the displayable objects. [0012]
  • According to another implementation, the present invention is implemented as a method for providing a document conversion process, the method comprising: activating a counter having a numbering system; converting an unstructured document into a metafile, wherein the metafile includes a number of displayable objects and respective decoration attributes about each of the displayable objects; receiving a definition file including document type definitions (DTD) relating to the unstructured document; generating a modified metafile including association information of at least one of the displayable objects associated with one of the definitions in the definition file; and causing the counter to increment as soon as the modified metafile is to be saved. [0013]
  • Objects and advantages together with the foregoing are attained in the exercise of the invention in the following description and resulting in the embodiments illustrated in the accompanying drawings. [0014]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where: [0015]
  • FIG. 1A shows a basic system configuration in which the present invention may be implemented in accordance with a preferred embodiment; [0016]
  • FIG. 1B shows internal construction blocks of a system in which the present invention may be implemented and executed to achieve desired results contemplated in the present invention; [0017]
  • FIG. 2A illustrates an example of an unstructured document that may be composed, edited or managed by an authoring tool; [0018]
  • FIG. 2B is an example of document type definitions (DTD); [0019]
  • FIG. 2C shows a structured document for the unstructured document shown in FIG. 2A based on the document type definitions (DTD) in FIG. 2B; [0020]
  • FIG. 3A illustrates a functional diagram according to one embodiment of the present invention; [0021]
  • FIG. 3B shows a visual environment implementing a conversion module according to one embodiment of the present invention; [0022]
  • FIG. 3C shows an example of a style sheet designed in XML format with respect to displayable objects in a metafile shown in FIG. 3B; [0023]
  • FIG. 3D shows an exemplary message from a dongle; [0024]
  • FIG. 3E shows a process flowchart of using a product including an implementation of conversion module according to one embodiment of the present invention; [0025]
  • FIG. 4 illustrates a block diagram of data processing apparatus which imports, edits, and converts unstructured or structured document into structured documents with user-defined DTD using structure-based font information; [0026]
  • FIG. 5 is an association table for document elements defined in DTD in FIG. 2B and font attributes; [0027]
  • FIG. 6 shows an editing result for the unstructured document in FIG. 2A, where each parsed data element has been assigned font attributes based on the association table in FIG. 5; [0028]
  • FIG. 7 shows a transformation process which converts the parsed data elements in FIG. 6 into the desired structured document with the exemplary DTD in FIG. 2B; [0029]
  • FIG. 8 is an intermediate structured document, which contains parsed data elements with assigned font IDs associated with a list of font document elements; [0030]
  • FIG. 9 shows a transformation process which converts the intermediate structured document in FIG. 8 into the desired structured document with DTD in FIG. 2B; and [0031]
  • FIG. 10 shows an implementation of the transformation process in FIG. 9 using the extensible style language for transformation (XSLT). [0032]
  • DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • In the following detailed description of the present invention, numerous specific details are set forth in order to provide a thorough understanding of the present invention. However, it will become obvious to those skilled in the art that the present invention may be practiced without these specific details. In other instances, well known methods, procedures, components, and circuitry have not been described in detail to avoid unnecessarily obscuring aspects of the present invention. The detailed description is presented largely in terms of procedures, logic blocks, processing, and other symbolic representations that directly or indirectly resemble the operations of data processing devices coupled to networks. These process descriptions and representations are the means used by those experienced or skilled in the art to most effectively convey the substance of their work to others skilled in the art. [0033]
  • Reference herein to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the invention. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Further, the order of blocks in process flowcharts or diagrams representing one or more embodiments of the invention do not inherently indicate any particular order nor imply any limitations in the invention. [0034]
  • Referring now to the drawings, in which like numerals refer to like parts throughout the several views, FIG. 1A shows a basic system configuration in which the present invention may be implemented in accordance with a preferred embodiment. Unstructured documents, such as product descriptions, function lists and price schedules, may be created using an authoring tool executed on a computer 100. Files or documents created by an authoring tool are referred to as unstructured documents. Exemplary authoring tools may include Microsoft Office (e.g. Microsoft Word, Microsoft PowerPoint, and Microsoft Excel), Adobe FrameMaker and Adobe Photoshop. The unstructured documents may be uploaded to computing device 102, which may serve as a central repository. Computing device 102 may be a server station from Sun Microsystems (www.sun.com) or a desktop computer loaded with a compiled and linked version of one embodiment implementing the present invention. [0035]
  • In one setting, computer 100 and computing device 102 are inseparable and perform the document conversion process and generate structured documents that may ultimately be represented in a markup language format such as XML or HTML. In one application, the structured documents represented in XML are converted to HTML format and become available through a private network 110 to a service server 104 that hosts what is generally referred to as a www (world wide web) site. [0036]
  • In one situation, a user uses a desktop computer 106 that operates a browsing application and is coupled to data network 108 to access files on service server 104. These files, represented by the structured documents in computing device 102, may represent the latest product information originally composed via an authoring tool. [0037]
  • As will be explained below, the present invention is not limited to the Internet applications. It may be practiced in individual computers in which users often create documents in different word processing formats, such as FrameMaker or Microsoft Word. The present invention may be utilized to convert documents to a markup representation regardless of the exact word processing formats. [0038]
  • FIG. 1B shows internal construction blocks of a system 118 in which the present invention may be implemented and executed. System 118 may correspond to a client device (e.g. computer 100, 102 or 106) or a server device (e.g. server 104). As shown in FIG. 1B, system 118 includes a central processing unit (CPU) 122 interfaced to a data bus 120 and a device interface 124. [0039]
  • CPU 122 executes certain instructions to manage all devices and interfaces coupled to data bus 120 for synchronized operations. Device interface 124 may be coupled to an external device such as computer 102, so that documents therefrom are received into memory or storage through data bus 120. Also interfaced to data bus 120 are display interface 126, network interface 128, printer interface 130 and floppy disk drive interface 138. Generally, a compiled and linked version of one embodiment of the present invention is loaded into storage 136 through floppy disk drive interface 138, network interface 128, device interface 124 or other interfaces coupled to data bus 120. [0040]
  • Main memory 132, such as random access memory (RAM), is also interfaced to data bus 120 to provide CPU 122 with the instructions and access to memory storage 136 for data and other instructions. In particular, when executing stored application program instructions, such as the compiled and linked version of the present invention, CPU 122 is caused to manipulate the data to achieve results contemplated by the present invention. ROM (read only memory) 134 is provided for storing invariant instruction sequences such as a basic input/output system (BIOS) for operation of keyboard 140, display 126 and pointing device 142, if any are present. [0041]
  • FIG. 2A illustrates an example of an unstructured document 200 that may be composed, edited or managed by an authoring tool. In an unstructured document, data is generally presented in sequence, which usually follows a reading order (e.g. from top to bottom and left to right). This sequence may be parsed into segments of data elements, where each data element 102 is assigned decoration attributes or information such as position, font color, font size, font type, style and various effects. The decoration information is essentially for proper layout and presentation when a file containing the data elements is opened by the authoring tool for display on a display screen. [0042]
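  • As a rough illustration of data elements carrying decoration attributes and being sequenced in reading order, consider the sketch below. The class `DataElement`, its field names, the coordinates and the function `reading_order` are all hypothetical; the description only states that each data element has attributes such as position and font information and that the sequence runs top to bottom and left to right.

```python
from dataclasses import dataclass


@dataclass
class DataElement:
    """A parsed segment of text with hypothetical decoration attributes."""
    text: str
    x: float  # distance from the left edge of the page
    y: float  # distance from the top edge of the page
    font_type: str
    font_size: int
    font_color: str


def reading_order(elements):
    """Sort top to bottom first, then left to right within a line."""
    return sorted(elements, key=lambda e: (e.y, e.x))


# Invented sample data, deliberately out of order.
elements = [
    DataElement("Salsa", 120.0, 10.0, "Arial", 18, "green"),
    DataElement("Green", 10.0, 10.0, "Arial", 18, "green"),
    DataElement("Ingredients", 10.0, 40.0, "Times", 12, "black"),
    DataElement("Chili", 70.0, 10.0, "Arial", 18, "green"),
]
ordered_text = [e.text for e in reading_order(elements)]
```

  • Sorting on exact coordinates is a simplification; text on the same visual line would in practice need a tolerance on the vertical position.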
  • According to one embodiment, an unstructured document is printed to a metafile format that contains the decoration information. An example of a metafile format is the commonly used Portable Document Format (PDF). One of the advantages of the metafile format is its independence from the authoring tool and perhaps from computers, so that a metafile can be opened or read identically in many different environments. [0043]
  • A structured document, such as one in SGML or XML, starts with document type definitions (DTD). FIG. 2B illustrates an example of DTD 208 for “recipe-type” documents, in which a document is to be broken down into structures of document elements. A particular document element 210 may contain other document elements and attributes. Another example of the document element 212 contains only the parsed character data. [0044]
  • FIG. 2C shows structured document 220 converted correspondingly from unstructured document 200 in FIG. 2A with respect to DTD 208 in FIG. 2B. As shown in the figure, the data sequence in the unstructured document is parsed into data elements associated with document elements defined in DTD for the structured document. [0045]
  • Unlike the unstructured document, the structured document allows certain information to be accessed easily via the document elements. Presentation of a structured document is usually defined in separate style sheets, e.g., written in cascading style sheet (CSS) or extensible style language for formatting objects (XSL-FO), which interpret the layout for each document element. This feature allows a structured document to be presented in different layouts for different media through different style sheets. Generally, the decoration information or formatting attributes, such as font information in an unstructured document, are abandoned after the unstructured document is converted into a corresponding structured document, unless they are defined in the DTD as attributes of document elements. Further modification of formatting information will in general not affect the converted structured documents. [0046]
  • FIG. 3A illustrates a functional diagram 300 according to one embodiment of the present invention. A conversion module 302 comprises an association module 302 and an integration module 306. Association module 302 receives an unstructured document, preferably in a metafile format. At the same time, association module 302 also receives a file, referred to as a definition file, including DTD that are predefined. Generally, DTD is defined according to the nature or purposes of the unstructured document. For example, if the unstructured document is in the category of recipes, e.g. document 200 in FIG. 2A, the DTD in a definition file as shown in FIG. 2B is designed in accordance with “recipe-type” documents. [0047]
  • To further understand association module 302, FIG. 3B shows an environment 320 implementing conversion module 302 according to one embodiment of the present invention. Environment 320 includes two displays 322 and 324 for a user to perform a conversion of an unstructured document to a file in markup language (referred to as a markup language file). Display 322 is used to display the unstructured document. In one preferable embodiment, a metafile version of the unstructured document is loaded for display. A metafile, referring to either the unstructured document or a printed version thereof, typically contains many displayable objects. Each object is a cluster or a group of characters or words or a graphic representation. As shown in display 322, each word or isolated numeral is a displayable object which is inherently carried over in the metafile. In other words, each object is defined by a number of attributes or decoration information including, but not limited to, type, size, color and position of the object such that it can be “printed” correctly. A number of objects can be grouped manually by a user in terms of their meanings or purposes. For example, group object 326 includes three character-type objects “Green”, “Chili” and “Salsa”. Naturally the three character-type objects form a title as a group object 326. The object grouping may be performed for the rest of the displayed metafile in display 322. [0048]
  • Display 324 is used to display a definition file prepared for the metafile in display 322. To facilitate operations of association module 302, the definition file is presented graphically as “DTD Pool” 328. For example, the graphical representation 328 of DTD 208 in FIG. 2B is used in display 324 to illustrate the hierarchical relationships among the document elements. [0049]
  • According to one embodiment that ultimately converts the metafile to an XML file, an auxiliary XML tree 330 is produced from “DTD Pool” 328. Auxiliary XML tree 330 also shows the hierarchical relationships among the document elements. In addition, each of the document elements is assigned an identifier that may include, but is not limited to, a numeral, a name, a font, a type name or a color. In one embodiment, the identifier is in “data” of each of the document elements. To associate group object 326 with a document element “title” 332, “data” 334 is activated when group object 326 is selected. One of the features in the present invention is an underlying association that relates group object 326 with the identifier in “data” 334. Specifically in one embodiment, if the identifier in “data” 334 is a color, “green”, group object 326 is highlighted in green to indicate that this group object has been associated with the DTD. If the identifier is a font, “Arial”, group object 326 is highlighted in style Arial to indicate that this group object has been associated with the DTD. [0050]
  • Similarly, a group object 340 can be associated with an identifier in data 342 under “ingredient”, a group object 344 can be associated with an identifier in data 346 under “amount” of “ingredient”, and so on. As a result, the metafile in display 322 has been segmented, the displayable objects therein are respectively grouped, and each of the group objects is associated with a document element in the loaded DTD by an identifier. Display 322 now has a modified metafile 310, an example of which will be illustrated below. [0051]
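  • The association between group objects and identifiers described above can be sketched as follows. This is a minimal illustration, assuming color identifiers: the “green” identifier for “title” follows the example in the description, whereas the other colors, the classes `DisplayObject` and `GroupObject`, and the function `associate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DisplayObject:
    """One displayable object of the metafile, e.g. a single word."""
    text: str
    highlight: Optional[str] = None


@dataclass
class GroupObject:
    """Objects grouped by the user according to meaning or purpose."""
    members: List[DisplayObject]
    element: Optional[str] = None
    identifier: Optional[str] = None


# Identifier per document element. Only "title" -> "green" is taken from
# the description; the remaining colors are invented for illustration.
IDENTIFIERS = {"title": "green", "ingredient": "blue", "amount": "orange"}


def associate(group: GroupObject, element: str) -> GroupObject:
    """Record the association and highlight every member accordingly."""
    identifier = IDENTIFIERS[element]
    group.element = element
    group.identifier = identifier
    for obj in group.members:
        obj.highlight = identifier
    return group


words = [DisplayObject(w) for w in ("Green", "Chili", "Salsa")]
title = associate(GroupObject(words), "title")
```

  • A collection of such associated group objects is one plausible way to think of the modified metafile: the original objects plus the association information.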
  • Referring now back to FIG. 3A, modified metafile 310 is input to an integration module 306 that further receives a style sheet. A style sheet is typically configured to include mapping rules in accordance with the media on which the objects from the metafile will be presented. One exemplary media is a web presentation of a file accessible by a browser (e.g. Internet Explorer from Microsoft). Hence, the file is in a markup language, such as HTML or XML, and is referred to as a markup language file. [0052]
  • FIG. 3C shows an example of such a style sheet designed in XML format with respect to the displayable objects in the metafile. Generally a style sheet is designed to position, color or size each of the objects so that a proper and attentive presentation can be achieved for a particular media. The example in FIG. 3C is designed for presenting a “recipe-type” document and causes the modified metafile to generate a proper XML file when loaded. In other words, integration module 306 generates the XML file from the modified metafile in accordance with the style sheet. Given the description herein, it can be noted that a style sheet does not have to be input to integration module 306. In one implementation, the mapping rules can be loaded in with the DTD file so that integration module 306 performs mapping from the modified metafile to a markup language file in accordance with the loaded mapping rules. [0053]
  • According to one embodiment, conversion module 302 is implemented in software and may be distributed as an application to users or service providers. It is understood that the conversion process from an unstructured document to a markup language file is difficult to quantify in a cost-determinable way. A counter 308 is therefore included in conversion module 302. In one embodiment, counter 308 is configured to count the number of pages in the metafile to be converted. Every time all of the objects in a display (i.e. a page display) are associated with the document elements in a DTD file and saved as a corresponding modified metafile, counter 308 increments. FIG. 3D shows an example of counting results kept in a dongle. A dongle (pronounced DONG-uhl) is a mechanism for ensuring that only authorized users can copy or use a specific software application, especially very expensive programs. Common implementations of a dongle include a hardware key that plugs into a parallel or serial port on a computer and that a software application accesses for verification before continuing to run; special key diskettes accessed in a similar manner; and registration numbers that are loaded into some form of read-only memory at the factory or during a system setup. [0054]
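  • The page-counting behavior of counter 308 can be sketched as below. This is an assumption-laden illustration: the class `PageCounter`, its method `save_page`, and the representation of a page as a mapping from object text to an associated document element are all hypothetical, and no storage in a dongle is modeled.

```python
class PageCounter:
    """Counts pages saved as modified metafiles (hypothetical sketch)."""

    def __init__(self):
        self.pages_saved = 0

    def save_page(self, associations):
        """Count the page only if every object on it has been associated.

        `associations` maps each object's text to the document element it
        was associated with, or to None if it is still unassociated.
        """
        if any(element is None for element in associations.values()):
            return False  # page is incomplete, so nothing is counted
        self.pages_saved += 1
        return True


counter = PageCounter()
first = counter.save_page({"Green Chili Salsa": "title", "1 onion": "ingredient"})
second = counter.save_page({"Serve chilled.": None})
```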
  • When the dongle needs to be reset, the conversion process can be evaluated in a cost-determinable way. According to one embodiment, an owner of a product including an implementation of [0055] conversion module 302 may distribute the product free or at very low cost to users. Typically the user needs to produce volumes of web pages from the unstructured documents composed, edited or managed by various authoring tools. One of the benefits for the user to receive the product in such manner is not to have to come up with a large capital for acquiring the product before using it. The users may pay for the usage of the products. Hence, one of the purposes of using a dongle with conversion module 302 is to manage the usage thereof. As a result, the owner of the product can control the usage of the product by controlling the dongle containing the usage information.
  • [0056] FIG. 3E shows a process flowchart 370 of using a product including an implementation of conversion module 302 according to one embodiment of the present invention. Sometimes, the product is leased by a user or a business. Other times, the product is used by a service provider providing services to businesses that need to convert unstructured documents to structured documents for different media presentations (e.g., presentation on a web site).
  • [0057] Process 370 starts with generating metafiles from authored documents at 372. Generally, the authored documents may have been prepared using one or more authoring tools. As described above, metafiles are preferably obtained from the authored documents so that conversion module 302 does not have to be separately configured for each of the different authoring tools. However, it should be noted that the preference for a metafile is not an inherent limitation of the current invention but is intended to make the product or conversion module 302 work more efficiently. Those skilled in the art understand that a conversion interface or a print driver could be configured to accommodate any type of authored document or to generate the metafiles.
  • [0058] Once the metafiles are obtained, they may now be loaded into a visual environment in which the metafiles can be respectively displayed. Environment 320 of FIG. 3B may be applicable so that pages of each of the metafiles can be individually loaded for display.
  • [0059] Before process 370 permits a user to proceed further, an authorization process 378 is triggered to ensure that the user is operating an authorized product. As described above, one exemplary authorization method is through a dongle that is pre-set by a business or a dealer that offers/owns/controls the product. If authorization process 378 indicates that process 370 is not authorized, typically a display is shown to the user as to where the product can be authorized. One of the procedures in setting authorization 376 involves a purchase of a permitted quantity for the number of pages converted or saved.
  • [0060] According to one embodiment, a dongle is used for coupling to a computer executing process 370. The dongle includes a first and a second number. The first number is a starting number, for example, “10”, and the second number is a limit number, for example, “1000”, which means that 1000 pages of converted documents can be processed and saved by process 370.
  • [0061] Once process 370 is permitted to proceed to 380, the user is now permitted to group a number of displayable objects into respective group objects according to, perhaps, their meanings or their purposes, in view of a DTD file loaded and displayed nearby. At 382, the group objects can be respectively associated with definitions in the DTD. At least some of the definitions have a number of identifiers; preferably, each identifier is associated with or designated to one of the definitions.
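The grouping and association steps at 380 and 382 can be pictured with a minimal Python sketch. Everything named here (the `DisplayableObject` and `GroupObject` classes, the `associate` helper, the identifier values and the sample text) is hypothetical and chosen only for illustration; the disclosure does not prescribe any particular data layout.

```python
from dataclasses import dataclass


@dataclass
class DisplayableObject:
    """One object from a metafile page, e.g. a run of characters (hypothetical)."""
    text: str
    x: float
    y: float
    font_type: str
    font_size: int


@dataclass
class GroupObject:
    """Several displayable objects that the user has grouped together."""
    members: list
    identifier: str = ""  # filled in once the group is associated with a definition

    def text(self) -> str:
        return " ".join(m.text for m in self.members)


def associate(group: GroupObject, identifier: str, definitions: dict) -> GroupObject:
    """Tag a group object with the identifier of one DTD definition."""
    if identifier not in definitions:
        raise KeyError(f"no DTD definition uses identifier {identifier!r}")
    group.identifier = identifier
    return group


# Assumed identifier -> document element mapping read from a DTD file.
definitions = {"1": "ingredient", "2": "procedure"}

page = [
    DisplayableObject("2 cups", 10, 40, "Arial", 10),
    DisplayableObject("flour", 60, 40, "Arial", 10),
    DisplayableObject("Mix well.", 10, 80, "Times", 12),
]

ingredient = associate(GroupObject(page[0:2]), "1", definitions)
procedure = associate(GroupObject(page[2:3]), "2", definitions)

for group in (ingredient, procedure):
    print(definitions[group.identifier], "->", group.text())
```

In this sketch the association is stored on the group object itself, which is one plausible way for a modified metafile to carry the information forward to a later conversion step.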
  • [0062] As described above, the associations between the selected objects and the definitions are to be saved in a modified metafile. Before process 370 permits such saving, a counter is to be checked at 386. In one embodiment, the first and the second numbers in the dongle are compared. When the first number is substantially close to the second number, for example when the two numbers are the same, process 370 will ask for a replenishment of the permitted usage. Typically, the user has to get the dongle reset or reconfigured by a business or a dealer that can now collect fees based on the information in the dongle. At 384, the numbers have been reset and now permit process 370 to proceed.
  • [0063] At 386, a save step can be conducted. Depending on an exact implementation, the modified metafile or a markup language file can be saved in a storage space. The markup language file is generated from the modified metafile in reference to a style sheet for a predefined media presentation. At 388, the counter is incremented.
  • [0064] It should be noted that the counter is checked at 386, in particular after 384, in FIG. 3E. In fact, the description has made it evident to those skilled in the art that the counter could be checked or consulted virtually anywhere along process 370. One of the objectives of using a counter herein is to help a business control and determine the usage of process 370 so that a cost can be determined and a fee can be charged.
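The check, save and increment sequence described for the counter can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Dongle` class is a software stand-in for the two numbers kept in the hardware key, the counter is checked immediately before each save, and usage is treated as exhausted when the first number reaches the second. None of these names come from the disclosure.

```python
from dataclasses import dataclass


class UsageExhausted(Exception):
    """Raised when the permitted page quota has been used up."""


@dataclass
class Dongle:
    """Hypothetical stand-in for the two numbers kept in the dongle."""
    count: int  # the "first number": pages counted so far
    limit: int  # the "second number": the limit

    def exhausted(self) -> bool:
        # Assumption: usage runs out when the two numbers are the same.
        return self.count >= self.limit

    def reset(self, count: int, limit: int) -> None:
        """Replenish the permitted usage, e.g. after a fee is collected."""
        self.count, self.limit = count, limit


def save_page(dongle: Dongle, page_name: str, storage: list) -> None:
    """Check the counter, save one converted page, then increment the counter."""
    if dongle.exhausted():
        raise UsageExhausted("usage must be replenished before saving")
    storage.append(page_name)
    dongle.count += 1


dongle = Dongle(count=998, limit=1000)
saved = []
for name in ("page-1", "page-2", "page-3"):
    try:
        save_page(dongle, name, saved)
    except UsageExhausted:
        print("blocked:", name)
        break

print(saved, dongle.count)
```

Because the check lives inside `save_page`, it could equally be called at other points of the process, which matches the remark that the counter may be consulted virtually anywhere along process 370.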
  • [0065] FIG. 4 shows a functional block diagram 400 of a data processing module 404 according to one embodiment of the present invention. Data processing module 404, included in integration module 306, comprises an input module 406, an editing module 410 and a transformation or filtering module 414. One of the functions performed by data processing module 404 is to convert unstructured documents, or structured documents with a different DTD, into corresponding structured documents with a predefined or specific DTD.
  • [0066] Input module 406 loads documents or imports documents from a document database 402 that may correspond to a repository in computing device 102 of FIG. 1A. Alternatively, input module 406 can start a new document 408. It should be noted that the loaded or imported documents can be either unstructured (e.g., a metafile) or structured and may, in certain cases, already contain pre-created structure-based font information.
  • [0067] An editing module 410 communicates with input module 406 and creates/edits the structure-based font information for the input documents. This module allows selections of data elements for the input documents and provides an editing environment to alter the font attributes such as font type, font style, font color, font size, and font effects for the selected data elements. The way to parse the input documents into data elements and to assign font attributes is based on an association table for the document elements defined in a desired DTD and associated font attributes 412. An exemplary association table 500 for DTD 412 is given in FIG. 5, which contains fields of document element 502, element attribute 504, font type 506, font style 508, font color 510, font size 512, and font effect 514.
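An association table with the fields listed for FIG. 5 might be represented as in the following Python sketch. The field names follow the text (document element, element attribute, font type, font style, font color, font size, font effect); the row values, the class names and the two lookup helpers are assumptions made purely for illustration, since the actual contents of FIG. 5 are not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FontAttributes:
    font_type: str
    font_style: str
    font_color: str
    font_size: int
    font_effect: str


@dataclass(frozen=True)
class AssociationRow:
    document_element: str
    element_attribute: Optional[str]
    font: FontAttributes


# Hypothetical rows; the real table values are not known from the text.
ASSOCIATION_TABLE = [
    AssociationRow("recipe", "name", FontAttributes("Arial", "bold", "black", 16, "none")),
    AssociationRow("ingredient", None, FontAttributes("Arial", "regular", "blue", 10, "none")),
    AssociationRow("procedure", None, FontAttributes("Times", "italic", "black", 12, "underline")),
]


def font_for(element: str, attribute: Optional[str] = None) -> FontAttributes:
    """Editing direction: which font should mark a given document element?"""
    for row in ASSOCIATION_TABLE:
        if row.document_element == element and row.element_attribute == attribute:
            return row.font
    raise LookupError(f"no association for {element!r} / {attribute!r}")


def element_for(font: FontAttributes) -> AssociationRow:
    """Conversion direction: which document element does a font stand for?"""
    for row in ASSOCIATION_TABLE:
        if row.font == font:
            return row
    raise LookupError("font attributes are not in the association table")


marker = font_for("ingredient")
print(marker)
print(element_for(marker).document_element)
```

The reverse lookup only works if each combination of font attributes is unique in the table, which is an assumption of this sketch rather than a stated requirement.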
  • [0068] FIG. 6 shows an editing result 600 for the unstructured document 200 of FIG. 2A. Each parsed data element or combined objects 602, 604, 606, 608, 610, 612 and 614 has been assigned font attributes based on the association table in FIG. 5 and displayed respectively in the associated font. During the parsing, this module allows sequence selections of data elements based on the reading order of the input document 602 to edit their font information. This module also allows region grouping of data elements to edit their font information. This module can also provide an auxiliary view of the association table.
  • [0069] Transformation or filtering module 414 converts the loaded documents into structured documents with user-defined document type definitions (DTD) using the structure-based font information. Mapping rules for the conversion 416, based on the document elements and font attributes, are imported or designed in this module.
  • [0070] FIG. 7 illustrates an example 700 of the mapping rules for converting the edited document 602 into the structured document 220 of FIG. 2C. In particular, 702 starts and 718 ends the “document” element, 704 starts and 714 ends the “recipe” document, 706 forms the “ingredient” element, 708 forms the “procedure” element, 710 forms the “presentation” element, and 712 forms the “originate” element. In 704, 706, 708, 710, and 712, the structure-based font information is used to locate the data elements, and the located data elements are assigned as attributes or parsed character data for document elements. These mapping rules can be implemented in, but are not restricted to, programming languages such as Java, JavaScript, extensible style language for transformation (XSLT), and C/C++, or in any built-in or programmable hardware devices. The converted documents can be either saved as a file document or exported into a document database 418.
  • [0071] Besides converting directly into the desired documents, the transformation module can also output the edited documents as intermediate structured documents that contain the structure-based font information. The intermediate structured documents can be reloaded for further editing or batch conversion.
  • [0072] An example of the intermediate structured document 808 for the edited document 600 is given in FIG. 8, where 802 contains “font” elements with specific font attributes and 804 contains the parsed data elements with “font_ID” attributes to associate the font information. Since the parsed data elements with the same font attributes have been grouped by the same “font_ID” in the intermediate structured documents, mapping rules for conversion can be designed based on the grouped font information.
  • [0073] FIG. 9 illustrates an example of the mapping rules for converting the intermediate document 800 into the structured document 220. In particular, 902 starts and 918 ends the “document” element, 904 starts and 914 ends the “recipe” document, 906 forms the “ingredient” element, 908 forms the “procedure” element, 910 forms the “presentation” element, and 912 forms the “originate” element. In 904, 906, 908, 910, and 912, the grouped font information is used to locate the data elements, and the located data elements are assigned as attributes or parsed character data for document elements.
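The idea of locating data elements through grouped font information can be sketched in Python with the standard `xml.etree.ElementTree` module. The intermediate document below is invented for illustration: only the “font” elements and the “font_ID” attribute come from the text, while the wrapper element names (`intermediate`, `fonts`, `data`, `item`), the sample content, the `name` attribute and the `RULES` table are assumptions and do not reproduce FIG. 8 or FIG. 9.

```python
import xml.etree.ElementTree as ET

# Hypothetical intermediate structured document with font_ID-tagged data.
INTERMEDIATE = """
<intermediate>
  <fonts>
    <font font_ID="f1" type="Arial" size="16" style="bold"/>
    <font font_ID="f2" type="Arial" size="10" style="regular"/>
    <font font_ID="f3" type="Times" size="12" style="italic"/>
  </fonts>
  <data>
    <item font_ID="f1">Pancakes</item>
    <item font_ID="f2">2 cups flour</item>
    <item font_ID="f2">1 egg</item>
    <item font_ID="f3">Mix and fry.</item>
  </data>
</intermediate>
"""

# Assumed mapping rules: font_ID -> (target element, placement of the data).
RULES = {
    "f1": ("recipe", "attribute"),
    "f2": ("ingredient", "text"),
    "f3": ("procedure", "text"),
}


def convert(intermediate_xml: str) -> ET.Element:
    """Build a structured document from font_ID-grouped data elements."""
    source = ET.fromstring(intermediate_xml.strip())
    document = ET.Element("document")
    recipe = ET.SubElement(document, "recipe")
    for item in source.iter("item"):
        target, placement = RULES[item.get("font_ID")]
        if placement == "attribute":
            # Data assigned as an attribute of the enclosing element.
            recipe.set("name", item.text)
        else:
            # Data assigned as parsed character data of a child element.
            ET.SubElement(recipe, target).text = item.text
    return document


result = convert(INTERMEDIATE)
print(ET.tostring(result, encoding="unicode"))
```

The two placements in `RULES` mirror the statement that located data elements become either attributes or parsed character data; an equivalent rule set could be expressed in XSLT instead.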
  • [0074] FIG. 10 shows an example of implementing the mapping rules given in FIG. 9 using the extensible style language for transformation (XSLT).
  • [0075] The invention described above is preferably implemented in software, hardware or a combination of both. At least portions of the invention can be embodied as computer readable code on a computer readable medium. The computer readable medium is any data storage device that can store data that can thereafter be read by a computing device. Examples of the computer readable medium include read-only memory, random-access memory, disk drives, floppy disks, CD-ROMs, DVDs, magnetic tape, optical data storage devices, and carrier waves. The computer readable media can also be distributed over network-coupled computer systems so that the computer readable code is stored and executed in a distributed fashion.
  • [0076] The present invention has been described in sufficient detail with a certain degree of particularity. It is understood by those skilled in the art that the present disclosure of embodiments has been made by way of examples only and that numerous changes in the arrangement and combination of parts may be resorted to without departing from the spirit and scope of the invention as claimed. While the embodiments discussed herein may appear to include some limitations as to the presentation of the information units, in terms of the format and arrangement, the invention has applicability well beyond such embodiments, which can be appreciated by those skilled in the art. Accordingly, the scope of the present invention is defined by the appended claims rather than the foregoing description of embodiments.

Claims (42)

1. A method for producing structured documents, the method comprising:
receiving a definition file including document type definitions (DTD);
displaying a metafile along with the definition file, the metafile including a number of displayable objects and respective decoration attributes about each of the displayable objects; and
associating at least one of the definitions in the definition file with one of the displayable objects.
2. The method of claim 1 further comprising:
generating a modified metafile that includes the displayable objects, each being associated with the at least one of the definitions in the definition file.
3. The method of claim 2 further comprising converting modified metafile to a markup language file in accordance with a set of mapping rules.
4. The method of claim 1, wherein the definition file includes a structure for document elements, each corresponding to one of the displayable objects in the metafile.
5. The method of claim 4, wherein some of the document elements include another layer of sub-document elements, each of subdocument elements corresponds to one of the displayable objects in the metafile.
6. The method of claim 4, wherein at least some of the document elements include respectively a number of identifiers, each of the identifiers being assigned to one of the at least some of the document elements.
7. The method of claim 6, wherein some of the identifiers are one or more of numerals and alphabets.
8. The method of claim 6, wherein some of the identifiers are selected from a group consisting of a font type, a color name, a size, a style, and an effect.
9. The method of claim 6, wherein the associating of the at least one of the definitions in the definition file comprises:
selecting one of the displayable objects; and
assigning one of the identifiers to the selected display object.
10. The method of claim 9, wherein the one of the identifiers is either a numeral or an alphabet.
11. The method of claim 10, wherein the one of the identifiers is one or more of (i) a font type, (ii) a color, (iii) a size, (iv) a style, and (v) an effect.
12. The method of claim 1, wherein the metafile is or is generated from an unstructured document that is composed, edited or managed by an authoring tool.
13. The method of claim 12, wherein some of the displayable objects are respective groups of characters.
14. The method of claim 13, wherein some of the decoration attributes include at least positions, font color, font size, font type, style, and effect for each of the groups of characters.
15. A method for producing structured documents, the method comprising:
activating an environment including a first display and a second display, the first display displaying a metafile and the second display displaying a definition file including document type definitions (DTD), wherein the metafile including a number of displayable objects and respective decoration attributes about each of the displayable objects, and wherein each of the document type definitions includes an identifier;
grouping a number of group objects, each of the group objects including a number of the displayable objects; and
associating each of the group objects with the identifier in one of the document type definitions.
16. The method of claim 15 further comprising generating a modified metafile including information of each of the group objects being associated with the identifier in one of the document type definitions.
17. The method of claim 16 further comprising:
converting the modified metafile to a markup language file in accordance with mapping rules.
18. The method of claim 17 wherein the markup language file is suitable for presentation on a selected media.
19. The method of claim 18 wherein the selected media is a web presentation on the Internet.
20. The method of claim 18 wherein the markup language file is based on a markup language selected from a group consisting of HyperText Markup Language (HTML), compact HyperText Markup Language (cHTML), Extensible Markup Language (XML), Standard Generalized Markup Language (SGML) or Wireless Markup Language (WML).
21. The method of claim 15 wherein some of the decoration attributes include at least position, font type, color, size, style, and effect for each of the groups of characters.
22. The method of claim 21 wherein some of the displayable objects are respective groups of characters.
23. The method of claim 22, wherein the identifier is one or more of a numeral and an alphabet.
24. The method of claim 23, wherein the identifier is one or more of (i) a font type, (ii) a color, (iii) a size, (iv) a style, and (v) an effect.
25. A machine-readable medium embodying instructions for execution by a processor, the instructions, when executed by the processor, causing the processor to produce structured documents, the machine-readable medium comprising:
program code for receiving a definition file including document type definitions (DTD);
program code for displaying a metafile along with the definition file, the metafile including a number of displayable objects and respective decoration attributes about each of the displayable objects; and
program code for associating at least one of the definitions in the definition file with one of the displayable objects.
26. The machine-readable medium of claim 25 further comprising:
program code for generating a modified metafile that includes the displayable objects, each being associated with the at least one of the definitions in the definition file.
27. The machine-readable medium of claim 25 further comprising program code for converting modified metafile to a markup language file in accordance with a set of mapping rules.
28. The machine-readable medium of claim 25, wherein the definition file includes a structure for document elements, each corresponding to one of the displayable objects in the metafile.
29. The machine-readable medium of claim 28, wherein some of the document elements include another layer of sub-document elements, each of sub-document elements corresponds to one of the displayable objects in the metafile.
30. The machine-readable medium of claim 28, wherein at least some of the document elements include respectively a number of identifiers, each of the identifiers being assigned to one of the at least some of the document elements.
31. The machine-readable medium of claim 30, wherein some of the identifiers are one of either numerals or alphabets.
32. The machine-readable medium of claim 30, wherein some of the identifiers are selected from a group consisting of a font type, a color, a size, a style, and an effect.
33. The machine-readable medium of claim 30, wherein the associating of the at least one of the definitions in the definition file comprises:
program code for selecting one of the displayable objects; and
program code for assigning one of the identifiers to the selected display object.
34. The machine-readable medium of claim 33, wherein the one of the identifiers is one or more of a numeral and an alphabet.
35. The machine-readable medium of claim 34, wherein the one of the identifiers is one or more of (i) a font type, (ii) a color, (iii) a size, (iv) a style, and (v) an effect.
36. The machine-readable medium of claim 25, wherein the metafile is or is generated from an unstructured document that is composed, edited or managed by an authoring tool.
37. The machine-readable medium of claim 36, wherein some of the displayable objects are respective groups of characters.
38. The machine-readable medium of claim 37, wherein some of the decoration attributes include at least position, font type, color, size, style, and effect for each of the groups of characters.
39. A machine-readable medium embodying instructions for execution by a processor, the instructions, when executed by the processor, causing the processor to produce structured documents, the machine-readable medium comprising:
program code for activating an environment including a first display and a second display, the first display displaying a metafile and the second display displaying a definition file including document type definitions (DTD), wherein the metafile including a number of displayable objects and respective decoration attributes about each of the displayable objects, and wherein each of the document type definitions includes an identifier;
program code for grouping a number of group objects, each of the group objects including a number of the displayable objects; and
program code for associating each of the group objects with the identifier in one of the document type definitions.
40. The machine-readable medium of claim 39 further comprising program code for generating a modified metafile including information of each of the group objects being associated with the identifier in one of the document type definitions.
41. The machine-readable medium of claim 40 further comprising program code for converting the modified metafile to a markup language file in accordance with mapping rules.
42. The method of claim 39 wherein some of the decoration attributes include at least position, font type, color, size, style, and effect for each of the groups of characters and wherein some of the displayable objects are respective groups of characters.
US09/754,969 2000-01-31 2001-01-05 Method and apparatus for utilizing document type definition to generate structured documents Abandoned US20010032218A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US09/754,969 US20010032218A1 (en) 2000-01-31 2001-01-05 Method and apparatus for utilizing document type definition to generate structured documents
TW90124215A TW525088B (en) 2001-01-05 2001-09-28 Method and apparatus for generating structured documents for various presentations

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US17933000P 2000-01-31 2000-01-31
US09/754,969 US20010032218A1 (en) 2000-01-31 2001-01-05 Method and apparatus for utilizing document type definition to generate structured documents

Publications (1)

Publication Number Publication Date
US20010032218A1 true US20010032218A1 (en) 2001-10-18

Family

ID=22656117

Family Applications (2)

Application Number Title Priority Date Filing Date
US09/754,861 Expired - Fee Related US6910182B2 (en) 2000-01-31 2001-01-05 Method and apparatus for generating structured documents for various presentations and the uses thereof
US09/754,969 Abandoned US20010032218A1 (en) 2000-01-31 2001-01-05 Method and apparatus for utilizing document type definition to generate structured documents

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US09/754,861 Expired - Fee Related US6910182B2 (en) 2000-01-31 2001-01-05 Method and apparatus for generating structured documents for various presentations and the uses thereof

Country Status (11)

Country Link
US (2) US6910182B2 (en)
EP (1) EP1166214B1 (en)
JP (1) JP2003521069A (en)
KR (1) KR20010110671A (en)
CN (1) CN1392986A (en)
AT (1) ATE300766T1 (en)
AU (2) AU2001226368A1 (en)
CA (1) CA2365622A1 (en)
DE (1) DE60112188T2 (en)
RU (1) RU2001128738A (en)
WO (2) WO2001055899A1 (en)

Cited By (72)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020036788A1 (en) * 2000-09-12 2002-03-28 Yasuhiro Hino Image processing apparatus, server apparatus, image processing method and memory medium
US20020094000A1 (en) * 2000-11-06 2002-07-18 Heilman Randy T. Method of controlling the turn off characteristics of a VCSEL diode
US20020118379A1 (en) * 2000-12-18 2002-08-29 Amit Chakraborty System and user interface supporting user navigation of multimedia data file content
US20020129061A1 (en) * 2001-03-07 2002-09-12 Swart Stacey J. Method and apparatus for creating files that are suitable for hardcopy printing and for on-line use
WO2003038662A1 (en) * 2001-10-31 2003-05-08 University Of Medicine & Dentistry Of New Jersey Conversion of text data into a hypertext markup language
US20030093565A1 (en) * 2001-07-03 2003-05-15 Berger Adam L. System and method for converting an attachment in an e-mail for delivery to a device of limited rendering capability
US20030177449A1 (en) * 2002-03-12 2003-09-18 International Business Machines Corporation Method and system for copy and paste technology for stylesheet editing
US20030177441A1 (en) * 2002-03-12 2003-09-18 International Business Machines Corporation Method and system for stylesheet execution interactive feedback
US20030182271A1 (en) * 2002-03-21 2003-09-25 International Business Machines Corporation Method and apparatus for generating electronic document definitions
US20030182623A1 (en) * 2002-03-21 2003-09-25 International Business Machines Corporation Standards-based formatting of flat files into markup language representations
US20030187827A1 (en) * 2002-03-29 2003-10-02 Fuji Xerox Co., Ltd. Web page providing method and apparatus and program
US20040008356A1 (en) * 2002-06-24 2004-01-15 Canon Kabushiki Kaisha Image forming apparatus, image forming method, and computer readable storage medium that stores control program
US20040049738A1 (en) * 2000-08-17 2004-03-11 Thompson Robert James Cullen Computer implemented system and method of transforming a source file into a transfprmed file using a set of trigger instructions
US20040049736A1 (en) * 2002-09-05 2004-03-11 Abdul Al-Azzawe Method for creating wrapper XML stored procedure
US20040083196A1 (en) * 2002-10-29 2004-04-29 Jason Reasor Hardware property management system and method
US20040177315A1 (en) * 2003-03-03 2004-09-09 International Business Machines Corporation Structured document bounding language
US20040177321A1 (en) * 2003-03-03 2004-09-09 International Business Machines Corporation Meta editor for structured documents
US20040205605A1 (en) * 2002-03-12 2004-10-14 International Business Machines Corporation Method and system for stylesheet rule creation, combination, and removal
US20050097449A1 (en) * 2003-10-31 2005-05-05 Jurgen Lumera System and method for content structure adaptation
US20050114764A1 (en) * 2003-11-25 2005-05-26 Gudenkauf John C. Producing a page of information based on a dynamic edit form and one or more transforms
US20050114765A1 (en) * 2003-11-25 2005-05-26 Gudenkauf John C. Producing a page of information based on a dynamic edit form and one or more transforms
US20050132272A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Differential dynamic content delivery
US20050132285A1 (en) * 2003-12-12 2005-06-16 Sung-Chieh Chen System and method for generating webpages
US20050240603A1 (en) * 2004-04-26 2005-10-27 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US20050257193A1 (en) * 2004-05-13 2005-11-17 Alexander Falk Method and system for visual data mapping and code generation to support data integration
US20050257731A1 (en) * 2004-03-24 2005-11-24 Bouchaud David Laurent C Submersible vehicle launch and recovery system
US20050278624A1 (en) * 2004-06-09 2005-12-15 Canon Kabushiki Kaisha Image processing apparatus, control method therefor, and program
US20050289121A1 (en) * 2003-05-27 2005-12-29 Masayuki Nakamura Web-compatible electronic device, web page processing method, and program
US20060116864A1 (en) * 2004-12-01 2006-06-01 Microsoft Corporation Safe, secure resource editing for application localization with automatic adjustment of application user interface for translated resources
US20060129745A1 (en) * 2004-12-11 2006-06-15 Gunther Thiel Process and appliance for data processing and computer program product
US20060155700A1 (en) * 2005-01-10 2006-07-13 Xerox Corporation Method and apparatus for structuring documents based on layout, content and collection
US20060168562A1 (en) * 2005-01-24 2006-07-27 International Business Machines Corporation Viewing and editing markup language files with complex semantics
US20060218475A1 (en) * 2005-03-24 2006-09-28 Bodin William K Differential dynamic content delivery with indications of interest from non-participants
US20060259638A1 (en) * 2000-12-20 2006-11-16 David Pociu Rapid development in a distributed application environment
US20070198516A1 (en) * 2006-01-31 2007-08-23 Ganapathy Palamadai R Method of and system for organizing unstructured information utilizing parameterized templates and a technology presentation layer
US7305455B2 (en) 2002-03-21 2007-12-04 International Business Machines Corporation Interfacing objects and markup language messages
US7315980B2 (en) 2002-03-21 2008-01-01 International Business Machines Corporation Method and apparatus for generating electronic document definitions
US20080010588A1 (en) * 2004-11-12 2008-01-10 Justsystems Corporation Document Processing Device and Document Processing Method
US20080320401A1 (en) * 2007-06-21 2008-12-25 Padmashree B Template-based deployment of user interface objects
US20090083620A1 (en) * 2004-11-12 2009-03-26 Justsystems Corporation Document processing device and document processing method
US20090106668A1 (en) * 2005-03-31 2009-04-23 International Business Machines Corporation Differential Dynamic Content Delivery With A Session Document Recreated In Dependence Upon An Interest Of An Identified User Participant
US20090259995A1 (en) * 2008-04-15 2009-10-15 Inmon William H Apparatus and Method for Standardizing Textual Elements of an Unstructured Text
US20090265339A1 (en) * 2006-04-12 2009-10-22 Lonsou (Beijing) Technologies Co., Ltd. Method and system for facilitating rule-based document content mining
US20090300705A1 (en) * 2008-05-28 2009-12-03 Dettinger Richard D Generating Document Processing Workflows Configured to Route Documents Based on Document Conceptual Understanding
US20090327862A1 (en) * 2008-06-30 2009-12-31 Roy Emek Viewing and editing markup language files with complex semantics
US20090327213A1 (en) * 2008-06-25 2009-12-31 Microsoft Corporation Document index for handheld application navigation
US20100017785A1 (en) * 2006-12-22 2010-01-21 Siemens Aktiengesellschaft Method for generating a machine-executable target code from a source code, associated computer program and computer system
US7657832B1 (en) * 2003-09-18 2010-02-02 Adobe Systems Incorporated Correcting validation errors in structured documents
US7774693B2 (en) 2004-01-13 2010-08-10 International Business Machines Corporation Differential dynamic content delivery with device controlling action
US20100241950A1 (en) * 2009-03-20 2010-09-23 Xerox Corporation Xpath-based display of a paginated xml document
US20100318743A1 (en) * 2009-06-10 2010-12-16 Microsoft Corporation Dynamic screentip language translation
US7890848B2 (en) 2004-01-13 2011-02-15 International Business Machines Corporation Differential dynamic content delivery with alternative content presentation
US8005025B2 (en) 2004-07-13 2011-08-23 International Business Machines Corporation Dynamic media content for collaborators with VOIP support for client communications
US8010885B2 (en) 2004-01-13 2011-08-30 International Business Machines Corporation Differential dynamic content delivery with a presenter-alterable session copy of a user profile
US8095575B1 (en) * 2007-01-31 2012-01-10 Google Inc. Word processor data organization
US8161131B2 (en) 2004-04-26 2012-04-17 International Business Machines Corporation Dynamic media content for collaborators with client locations in dynamic client contexts
US8180832B2 (en) 2004-07-08 2012-05-15 International Business Machines Corporation Differential dynamic content delivery to alternate display device locations
US8185814B2 (en) 2004-07-08 2012-05-22 International Business Machines Corporation Differential dynamic delivery of content according to user expressions of interest
US20130067313A1 (en) * 2011-09-09 2013-03-14 Damien LEGUIN Format conversion tool
US8499232B2 (en) * 2004-01-13 2013-07-30 International Business Machines Corporation Differential dynamic content delivery with a participant alterable session copy of a user profile
US20140181640A1 (en) * 2012-12-20 2014-06-26 Beijing Founder Electronics Co., Ltd. Method and device for structuring document contents
US20150199307A1 (en) * 2012-08-08 2015-07-16 Google Inc. Pluggable Architecture For Optimizing Versioned Rendering of Collaborative Documents
US9167087B2 (en) 2004-07-13 2015-10-20 International Business Machines Corporation Dynamic media content for collaborators including disparate location representations
US9378187B2 (en) * 2003-12-11 2016-06-28 International Business Machines Corporation Creating a presentation document
US20160292279A1 (en) * 2015-03-30 2016-10-06 Airwatch Llc Providing search results based on enterprise data
CN106469143A (en) * 2015-08-21 2017-03-01 国际商业机器公司 The estimation of file structure
US9852127B2 (en) 2008-05-28 2017-12-26 International Business Machines Corporation Processing publishing rules by routing documents based on document conceptual understanding
US10089388B2 (en) 2015-03-30 2018-10-02 Airwatch Llc Obtaining search results
US10318582B2 (en) 2015-03-30 2019-06-11 Vmware Inc. Indexing electronic documents
US10452904B2 (en) 2017-12-01 2019-10-22 International Business Machines Corporation Blockwise extraction of document metadata
US10592738B2 (en) * 2017-12-01 2020-03-17 International Business Machines Corporation Cognitive document image digitalization
EP4174866A1 (en) * 2021-10-27 2023-05-03 Koninklijke Philips N.V. User-guided structured document modeling

Families Citing this family (97)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100750074B1 (en) * 1999-03-09 2007-08-21 코닌클리케 필립스 일렉트로닉스 엔.브이. Method of coding a document
US7966234B1 (en) 1999-05-17 2011-06-21 Jpmorgan Chase Bank, N.A. Structured finance performance analytics system
JP4320491B2 (en) * 1999-11-18 2009-08-26 ソニー株式会社 Document processing system, terminal device, document providing device, document processing method, recording medium
AU2001249914A1 (en) * 2000-04-07 2001-10-23 Financeware.Com Method and apparatus for rendering electronic documents
US7249095B2 (en) 2000-06-07 2007-07-24 The Chase Manhattan Bank, N.A. System and method for executing deposit transactions over the internet
US8396859B2 (en) * 2000-06-26 2013-03-12 Oracle International Corporation Subject matter context search engine
US7313541B2 (en) 2000-11-03 2007-12-25 Jpmorgan Chase Bank, N.A. System and method for estimating conduit liquidity requirements in asset backed commercial paper
US7181684B2 (en) * 2000-12-12 2007-02-20 Oracle International Corporation Dynamic tree control system
US7703009B2 (en) * 2001-04-09 2010-04-20 Huang Evan S Extensible stylesheet designs using meta-tag information
JP2003036152A (en) * 2001-05-17 2003-02-07 Matsushita Electric Ind Co Ltd Information printing system
US7272594B1 (en) 2001-05-31 2007-09-18 Autonomy Corporation Ltd. Method and apparatus to link to a related document
US20030037023A1 (en) * 2001-08-07 2003-02-20 Intelliclaim Emulation process for making changes and revisions to computer data files
US20030208460A1 (en) * 2002-05-06 2003-11-06 Ncr Corporation Methods, systems and data structures to generate and link reports
US8224723B2 (en) 2002-05-31 2012-07-17 Jpmorgan Chase Bank, N.A. Account opening system, method and computer program product
US7117429B2 (en) * 2002-06-12 2006-10-03 Oracle International Corporation Methods and systems for managing styles electronic documents
US7607081B1 (en) 2002-06-28 2009-10-20 Microsoft Corporation Storing document header and footer information in a markup language document
US7584419B1 (en) 2002-06-28 2009-09-01 Microsoft Corporation Representing non-structured features in a well formed document
US7562295B1 (en) 2002-06-28 2009-07-14 Microsoft Corporation Representing spelling and grammatical error state in an XML document
US7523394B2 (en) * 2002-06-28 2009-04-21 Microsoft Corporation Word-processing document stored in a single XML file that may be manipulated by applications that understand XML
US7533335B1 (en) 2002-06-28 2009-05-12 Microsoft Corporation Representing fields in a markup language document
US7565603B1 (en) 2002-06-28 2009-07-21 Microsoft Corporation Representing style information in a markup language document
US7650566B1 (en) 2002-06-28 2010-01-19 Microsoft Corporation Representing list definitions and instances in a markup language document
CA2494808C (en) 2002-08-23 2008-12-30 Lg Electronics, Inc. Electronic document request/supply method based on xml
DE10250842B4 (en) * 2002-10-31 2010-11-11 OCé PRINTING SYSTEMS GMBH A method, computer program product and apparatus for processing a document data stream of an input format into an output format
KR100636909B1 (en) * 2002-11-14 2006-10-19 엘지전자 주식회사 Electronic document versioning method and updated information supply method using version number based on XML
US7293031B1 (en) * 2002-11-21 2007-11-06 Ncr Corp. Report specification generators and interfaces
JP2004192427A (en) * 2002-12-12 2004-07-08 Internet Disclosure Co Ltd Finance related disclosure document creating system
TW583556B (en) * 2002-12-20 2004-04-11 Inst Information Industry Method for translating web page document into web service interface and storage medium storing computer program for executing the method
AU2003901428A0 (en) * 2003-03-24 2003-04-10 Objective Systems Pty Ltd A system and method for formatting and distributing reading material
US7770184B2 (en) 2003-06-06 2010-08-03 Jp Morgan Chase Bank Integrated trading platform architecture
US7970688B2 (en) 2003-07-29 2011-06-28 Jp Morgan Chase Bank Method for pricing a trade
US7188127B2 (en) 2003-10-07 2007-03-06 International Business Machines Corporation Method, system, and program for processing a file request
US7155444B2 (en) * 2003-10-23 2006-12-26 Microsoft Corporation Promotion and demotion techniques to facilitate file property management between object systems
US20050097450A1 (en) * 2003-10-31 2005-05-05 Spx Corporation System and method for composition and decomposition of information objects
GB2411014A (en) * 2004-02-11 2005-08-17 Autonomy Corp Ltd Automatic searching for relevant information
EP1736894A4 (en) * 2004-03-30 2016-07-06 Jvc Kenwood Corp Digitization service manual generation method and additional data generation method
US8423447B2 (en) 2004-03-31 2013-04-16 Jp Morgan Chase Bank System and method for allocating nominal and cash amounts to trades in a netted trade
DE102004021269A1 (en) * 2004-04-30 2005-11-24 OCé PRINTING SYSTEMS GMBH Method, apparatus and computer program product for generating a page and / or area structured data stream from a row data stream
JP4154368B2 (en) * 2004-06-15 2008-09-24 キヤノン株式会社 Document processing apparatus, document processing method, and document processing program
US7693770B2 (en) 2004-08-06 2010-04-06 Jp Morgan Chase & Co. Method and system for creating and marketing employee stock option mirror image warrants
US7536634B2 (en) * 2005-06-13 2009-05-19 Silver Creek Systems, Inc. Frame-slot architecture for data conversion
US7599952B2 (en) * 2004-09-09 2009-10-06 Microsoft Corporation System and method for parsing unstructured data into structured data
US7818342B2 (en) * 2004-11-12 2010-10-19 Sap Ag Tracking usage of data elements in electronic business communications
US7711676B2 (en) * 2004-11-12 2010-05-04 Sap Aktiengesellschaft Tracking usage of data elements in electronic business communications
US7865519B2 (en) * 2004-11-17 2011-01-04 Sap Aktiengesellschaft Using a controlled vocabulary library to generate business data component names
JP4868733B2 (en) * 2004-11-25 2012-02-01 キヤノン株式会社 Structured document processing apparatus, structured document processing method, and program
US20070041041A1 (en) 2004-12-08 2007-02-22 Werner Engbrocks Method and computer program product for conversion of an input document data stream with one or more documents into a structured data file, and computer program product as well as method for generation of a rule set for such a method
KR100709379B1 (en) * 2004-12-30 2007-04-20 주식회사 엔리치텍 Making method for documents having the form appling the analyzed Meta-file
WO2006081428A2 (en) 2005-01-27 2006-08-03 Symyx Technologies, Inc. Parser for generating structure data
US7996443B2 (en) * 2005-02-28 2011-08-09 Microsoft Corporation Schema grammar and compilation
US8688569B1 (en) 2005-03-23 2014-04-01 Jpmorgan Chase Bank, N.A. System and method for post closing and custody services
US7756839B2 (en) 2005-03-31 2010-07-13 Microsoft Corporation Version tolerant serialization
US7478325B2 (en) * 2005-04-22 2009-01-13 Microsoft Corporation Methods for providing an accurate visual rendition of a text element formatted with an unavailable font
US7634515B2 (en) * 2005-05-13 2009-12-15 Microsoft Corporation Data model and schema evolution
US7587671B2 (en) * 2005-05-17 2009-09-08 Palm, Inc. Image repositioning, storage and retrieval
US7895219B2 (en) * 2005-05-23 2011-02-22 International Business Machines Corporation System and method for guided and assisted structuring of unstructured information
US7822682B2 (en) 2005-06-08 2010-10-26 Jpmorgan Chase Bank, N.A. System and method for enhancing supply chain transactions
JP2006350867A (en) * 2005-06-17 2006-12-28 Ricoh Co Ltd Document processing device, method, program, and information storage medium
CN100437594C (en) * 2005-09-02 2008-11-26 鸿富锦精密工业(深圳)有限公司 Figure element operating system and method
US7567928B1 (en) 2005-09-12 2009-07-28 Jpmorgan Chase Bank, N.A. Total fair value swap
US20070067397A1 (en) * 2005-09-19 2007-03-22 Available For Licensing Systems and methods for sharing documents
US7818238B1 (en) 2005-10-11 2010-10-19 Jpmorgan Chase Bank, N.A. Upside forward with early funding provision
US7730388B2 (en) * 2005-11-03 2010-06-01 Microsoft Corporation Converting an enhanced metafile into a chronologically independent object property list for conversion into a PDF document
WO2007064050A1 (en) * 2005-11-29 2007-06-07 Our Tech Co., Ltd. System offering a data- skin based on standard schema and the method
US7921367B2 (en) * 2005-12-20 2011-04-05 Oracle International Corp. Application generator for data transformation applications
US9207917B2 (en) 2005-12-20 2015-12-08 Oracle International Corporation Application generator for data transformation applications
US8280794B1 (en) 2006-02-03 2012-10-02 Jpmorgan Chase Bank, National Association Price earnings derivative financial product
US8407585B2 (en) * 2006-04-19 2013-03-26 Apple Inc. Context-aware content conversion and interpretation-specific views
US7620578B1 (en) 2006-05-01 2009-11-17 Jpmorgan Chase Bank, N.A. Volatility derivative financial product
US7647268B1 (en) 2006-05-04 2010-01-12 Jpmorgan Chase Bank, N.A. System and method for implementing a recurrent bidding process
US7916972B2 (en) * 2006-07-31 2011-03-29 Xerox Corporation Landmark-based form reading with declarative language
US9811868B1 (en) 2006-08-29 2017-11-07 Jpmorgan Chase Bank, N.A. Systems and methods for integrating a deal process
US20090300482A1 (en) * 2006-08-30 2009-12-03 Compsci Resources, Llc Interactive User Interface for Converting Unstructured Documents
US20080065671A1 (en) * 2006-09-07 2008-03-13 Xerox Corporation Methods and apparatuses for detecting and labeling organizational tables in a document
US7827096B1 (en) 2006-11-03 2010-11-02 Jp Morgan Chase Bank, N.A. Special maturity ASR recalculated timing
US7801926B2 (en) 2006-11-22 2010-09-21 Microsoft Corporation Programmable logic and constraints for a dynamically typed storage system
CN101464870B (en) * 2007-12-21 2011-03-23 鸿富锦精密工业(深圳)有限公司 Cross-drawings copying system and method for stamping mold parts
US8126837B2 (en) * 2008-09-23 2012-02-28 Stollman Jeff Methods and apparatus related to document processing based on a document type
US8365072B2 (en) * 2009-01-02 2013-01-29 Apple Inc. Identification of compound graphic elements in an unstructured document
US8738514B2 (en) 2010-02-18 2014-05-27 Jpmorgan Chase Bank, N.A. System and method for providing borrow coverage services to short sell securities
US8352354B2 (en) 2010-02-23 2013-01-08 Jpmorgan Chase Bank, N.A. System and method for optimizing order execution
US20120166953A1 (en) * 2010-12-23 2012-06-28 Microsoft Corporation Techniques for electronic aggregation of information
US8543911B2 (en) 2011-01-18 2013-09-24 Apple Inc. Ordering document content based on reading flow
US8442998B2 (en) 2011-01-18 2013-05-14 Apple Inc. Storage of a document using multiple representations
US8380753B2 (en) 2011-01-18 2013-02-19 Apple Inc. Reconstruction of lists in a document
US9323767B2 (en) 2012-10-01 2016-04-26 Longsand Limited Performance and scalability in an intelligent data operating layer system
US9588675B2 (en) 2013-03-15 2017-03-07 Google Inc. Document scale and position optimization
CN103885925B (en) * 2013-03-28 2017-04-26 中国证券监督管理委员会信息中心 Method for encapsulating XBRL (extensible business reporting language) instance documents
WO2015003245A1 (en) * 2013-07-09 2015-01-15 Blueprint Software Systems Inc. Computing device and method for converting unstructured data to structured data
US9361086B1 (en) * 2015-04-22 2016-06-07 International Business Machines Corporation Collating and intelligently sequencing installation documentation
US9881003B2 (en) * 2015-09-23 2018-01-30 Google Llc Automatic translation of digital graphic novels
CN106933781A (en) * 2015-12-30 2017-07-07 航天信息软件技术有限公司 A kind of word document data writing systems and method
CN107301162A (en) * 2016-04-14 2017-10-27 珠海金山办公软件有限公司 A kind of method and device for recognizing word or file
US10089285B2 (en) * 2016-12-14 2018-10-02 Rfpio, Inc. Method to automatically convert proposal documents
DE102016224894A1 (en) * 2016-12-14 2018-06-14 Robert Bosch Gmbh Diagnostic dongle for a tool and method for diagnosing and / or controlling a tool by means of a diagnostic dongle
KR101965563B1 (en) * 2017-03-17 2019-04-04 주식회사 인프라웨어 Method and apparatus editing electronic documents
KR101774257B1 (en) * 2017-05-15 2017-09-04 주식회사 한글과컴퓨터 Document editing apparatus for maintaining style of object and operating method thereof

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5276793A (en) * 1990-05-14 1994-01-04 International Business Machines Corporation System and method for editing a structured document to preserve the intended appearance of document elements
JP3023690B2 (en) * 1990-06-15 2000-03-21 富士ゼロックス株式会社 Document processing apparatus and method
GB9225566D0 (en) * 1992-12-07 1993-01-27 Incontext Corp System for display of structured documents
US5386369A (en) * 1993-07-12 1995-01-31 Globetrotter Software Inc. License metering system for software applications
WO1996017310A1 (en) 1994-11-29 1996-06-06 Avalanche Development Company System and process for creating structured documents
US6003048A (en) * 1995-04-27 1999-12-14 International Business Machines Corporation System and method for converting a coordinate based document to a markup language (ML) based document
JPH0969101A (en) 1995-08-31 1997-03-11 Hitachi Ltd Method and device for generating structured document
JPH10116275A (en) * 1996-10-11 1998-05-06 Fuji Xerox Co Ltd Document style editing device
JPH10307816A (en) * 1997-05-08 1998-11-17 Just Syst Corp Structured document processor its processing method and computer readable recording medium recording program for allowing computer to execute the method

Cited By (115)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040049738A1 (en) * 2000-08-17 2004-03-11 Thompson Robert James Cullen Computer implemented system and method of transforming a source file into a transformed file using a set of trigger instructions
US7386790B2 (en) * 2000-09-12 2008-06-10 Canon Kabushiki Kaisha Image processing apparatus, server apparatus, image processing method and memory medium
US20020036788A1 (en) * 2000-09-12 2002-03-28 Yasuhiro Hino Image processing apparatus, server apparatus, image processing method and memory medium
US20020094000A1 (en) * 2000-11-06 2002-07-18 Heilman Randy T. Method of controlling the turn off characteristics of a VCSEL diode
US20020118379A1 (en) * 2000-12-18 2002-08-29 Amit Chakraborty System and user interface supporting user navigation of multimedia data file content
US7013309B2 (en) * 2000-12-18 2006-03-14 Siemens Corporate Research Method and apparatus for extracting anchorable information units from complex PDF documents
US20060259638A1 (en) * 2000-12-20 2006-11-16 David Pociu Rapid development in a distributed application environment
US20020129061A1 (en) * 2001-03-07 2002-09-12 Swart Stacey J. Method and apparatus for creating files that are suitable for hardcopy printing and for on-line use
US20030093565A1 (en) * 2001-07-03 2003-05-15 Berger Adam L. System and method for converting an attachment in an e-mail for delivery to a device of limited rendering capability
WO2003038662A1 (en) * 2001-10-31 2003-05-08 University Of Medicine & Dentistry Of New Jersey Conversion of text data into a hypertext markup language
US20030167442A1 (en) * 2001-10-31 2003-09-04 Hagerty Clark Gregory Conversion of text data into a hypertext markup language
US7373597B2 (en) 2001-10-31 2008-05-13 University Of Medicine & Dentistry Of New Jersey Conversion of text data into a hypertext markup language
US7992088B2 (en) 2002-03-12 2011-08-02 International Business Machines Corporation Method and system for copy and paste technology for stylesheet editing
US20080098293A1 (en) * 2002-03-12 2008-04-24 Clarke Adam R Method and system for stylesheet execution interactive feedback
US20030177449A1 (en) * 2002-03-12 2003-09-18 International Business Machines Corporation Method and system for copy and paste technology for stylesheet editing
US20030177441A1 (en) * 2002-03-12 2003-09-18 International Business Machines Corporation Method and system for stylesheet execution interactive feedback
US7337391B2 (en) * 2002-03-12 2008-02-26 International Business Machines Corporation Method and system for stylesheet execution interactive feedback
US8117533B2 (en) 2002-03-12 2012-02-14 International Business Machines Corporation Method and system for stylesheet rule creation, combination, and removal
US20040205605A1 (en) * 2002-03-12 2004-10-14 International Business Machines Corporation Method and system for stylesheet rule creation, combination, and removal
US20030182623A1 (en) * 2002-03-21 2003-09-25 International Business Machines Corporation Standards-based formatting of flat files into markup language representations
US20030182271A1 (en) * 2002-03-21 2003-09-25 International Business Machines Corporation Method and apparatus for generating electronic document definitions
US7093195B2 (en) 2002-03-21 2006-08-15 International Business Machines Corporation Standards-based formatting of flat files into markup language representations
US7315980B2 (en) 2002-03-21 2008-01-01 International Business Machines Corporation Method and apparatus for generating electronic document definitions
US7305455B2 (en) 2002-03-21 2007-12-04 International Business Machines Corporation Interfacing objects and markup language messages
US7130842B2 (en) 2002-03-21 2006-10-31 International Business Machines Corporation Method and apparatus for generating electronic document definitions
US20030187827A1 (en) * 2002-03-29 2003-10-02 Fuji Xerox Co., Ltd. Web page providing method and apparatus and program
US7234110B2 (en) * 2002-03-29 2007-06-19 Fuji Xerox Co., Ltd. Apparatus and method for providing dynamic multilingual web pages
US7394563B2 (en) * 2002-06-24 2008-07-01 Canon Kabushiki Kaisha Image forming apparatus that executes an image trimming process with priority over other commands, method therefor, and storage medium storing a program therefor
US20040008356A1 (en) * 2002-06-24 2004-01-15 Canon Kabushiki Kaisha Image forming apparatus, image forming method, and computer readable storage medium that stores control program
US7322022B2 (en) 2002-09-05 2008-01-22 International Business Machines Corporation Method for creating wrapper XML stored procedure
US20040049736A1 (en) * 2002-09-05 2004-03-11 Abdul Al-Azzawe Method for creating wrapper XML stored procedure
US20040083196A1 (en) * 2002-10-29 2004-04-29 Jason Reasor Hardware property management system and method
US20040177321A1 (en) * 2003-03-03 2004-09-09 International Business Machines Corporation Meta editor for structured documents
US7213201B2 (en) * 2003-03-03 2007-05-01 International Business Machines Corporation Meta editor for structured documents
US20040177315A1 (en) * 2003-03-03 2004-09-09 International Business Machines Corporation Structured document bounding language
US10275437B2 (en) 2003-03-03 2019-04-30 International Business Machines Corporation Structured document bounding language
US20080282144A1 (en) * 2003-03-03 2008-11-13 International Business Machines Corporation Structured Document Bounding Language
US9542375B2 (en) 2003-03-03 2017-01-10 International Business Machines Corporation Structured document bounding language
US20050289121A1 (en) * 2003-05-27 2005-12-29 Masayuki Nakamura Web-compatible electronic device, web page processing method, and program
US7272787B2 (en) * 2003-05-27 2007-09-18 Sony Corporation Web-compatible electronic device, web page processing method, and program
US7657832B1 (en) * 2003-09-18 2010-02-02 Adobe Systems Incorporated Correcting validation errors in structured documents
US20050097449A1 (en) * 2003-10-31 2005-05-05 Jurgen Lumera System and method for content structure adaptation
US20050114764A1 (en) * 2003-11-25 2005-05-26 Gudenkauf John C. Producing a page of information based on a dynamic edit form and one or more transforms
US20050114765A1 (en) * 2003-11-25 2005-05-26 Gudenkauf John C. Producing a page of information based on a dynamic edit form and one or more transforms
US7162692B2 (en) * 2003-12-11 2007-01-09 International Business Machines Corporation Differential dynamic content delivery
US9378187B2 (en) * 2003-12-11 2016-06-28 International Business Machines Corporation Creating a presentation document
US20050132272A1 (en) * 2003-12-11 2005-06-16 International Business Machines Corporation Differential dynamic content delivery
US20050132285A1 (en) * 2003-12-12 2005-06-16 Sung-Chieh Chen System and method for generating webpages
US8578263B2 (en) 2004-01-13 2013-11-05 International Business Machines Corporation Differential dynamic content delivery with a presenter-alterable session copy of a user profile
US8499232B2 (en) * 2004-01-13 2013-07-30 International Business Machines Corporation Differential dynamic content delivery with a participant alterable session copy of a user profile
US7774693B2 (en) 2004-01-13 2010-08-10 International Business Machines Corporation Differential dynamic content delivery with device controlling action
US8010885B2 (en) 2004-01-13 2011-08-30 International Business Machines Corporation Differential dynamic content delivery with a presenter-alterable session copy of a user profile
US7890848B2 (en) 2004-01-13 2011-02-15 International Business Machines Corporation Differential dynamic content delivery with alternative content presentation
US20050257731A1 (en) * 2004-03-24 2005-11-24 Bouchaud David Laurent C Submersible vehicle launch and recovery system
US8161131B2 (en) 2004-04-26 2012-04-17 International Business Machines Corporation Dynamic media content for collaborators with client locations in dynamic client contexts
US8161112B2 (en) 2004-04-26 2012-04-17 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US7827239B2 (en) 2004-04-26 2010-11-02 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US20050240603A1 (en) * 2004-04-26 2005-10-27 International Business Machines Corporation Dynamic media content for collaborators with client environment information in dynamic client contexts
US20050257193A1 (en) * 2004-05-13 2005-11-17 Alexander Falk Method and system for visual data mapping and code generation to support data integration
US7681121B2 (en) * 2004-06-09 2010-03-16 Canon Kabushiki Kaisha Image processing apparatus, control method therefor, and program
US20050278624A1 (en) * 2004-06-09 2005-12-15 Canon Kabushiki Kaisha Image processing apparatus, control method therefor, and program
US8214432B2 (en) 2004-07-08 2012-07-03 International Business Machines Corporation Differential dynamic content delivery to alternate display device locations
US8185814B2 (en) 2004-07-08 2012-05-22 International Business Machines Corporation Differential dynamic delivery of content according to user expressions of interest
US8180832B2 (en) 2004-07-08 2012-05-15 International Business Machines Corporation Differential dynamic content delivery to alternate display device locations
US9167087B2 (en) 2004-07-13 2015-10-20 International Business Machines Corporation Dynamic media content for collaborators including disparate location representations
US8005025B2 (en) 2004-07-13 2011-08-23 International Business Machines Corporation Dynamic media content for collaborators with VOIP support for client communications
US20090083620A1 (en) * 2004-11-12 2009-03-26 Justsystems Corporation Document processing device and document processing method
US20080010588A1 (en) * 2004-11-12 2008-01-10 Justsystems Corporation Document Processing Device and Document Processing Method
US20060116864A1 (en) * 2004-12-01 2006-06-01 Microsoft Corporation Safe, secure resource editing for application localization with automatic adjustment of application user interface for translated resources
US20060129745A1 (en) * 2004-12-11 2006-06-15 Gunther Thiel Process and appliance for data processing and computer program product
US7693848B2 (en) * 2005-01-10 2010-04-06 Xerox Corporation Method and apparatus for structuring documents based on layout, content and collection
US20060155700A1 (en) * 2005-01-10 2006-07-13 Xerox Corporation Method and apparatus for structuring documents based on layout, content and collection
US7412649B2 (en) 2005-01-24 2008-08-12 International Business Machines Corporation Viewing and editing markup language files with complex semantics
US20060168562A1 (en) * 2005-01-24 2006-07-27 International Business Machines Corporation Viewing and editing markup language files with complex semantics
US7475340B2 (en) * 2005-03-24 2009-01-06 International Business Machines Corporation Differential dynamic content delivery with indications of interest from non-participants
US20090063944A1 (en) * 2005-03-24 2009-03-05 International Business Machines Corporation Differential Dynamic Content Delivery With Indications Of Interest From Non-Participants
US20060218475A1 (en) * 2005-03-24 2006-09-28 Bodin William K Differential dynamic content delivery with indications of interest from non-participants
US8230331B2 (en) 2005-03-24 2012-07-24 International Business Machines Corporation Differential dynamic content delivery with indications of interest from non-participants
US20090106668A1 (en) * 2005-03-31 2009-04-23 International Business Machines Corporation Differential Dynamic Content Delivery With A Session Document Recreated In Dependence Upon An Interest Of An Identified User Participant
US8245134B2 (en) 2005-03-31 2012-08-14 International Business Machines Corporation Differential dynamic content delivery with a session document recreated in dependence upon an interest of an identified user participant
US20070198516A1 (en) * 2006-01-31 2007-08-23 Ganapathy Palamadai R Method of and system for organizing unstructured information utilizing parameterized templates and a technology presentation layer
US20090265339A1 (en) * 2006-04-12 2009-10-22 Lonsou (Beijing) Technologies Co., Ltd. Method and system for facilitating rule-based document content mining
US8515939B2 (en) * 2006-04-12 2013-08-20 Lonsou (Beijing) Technologies Co., Ltd. Method and system for facilitating rule-based document content mining
US20100017785A1 (en) * 2006-12-22 2010-01-21 Siemens Aktiengesellschaft Method for generating a machine-executable target code from a source code, associated computer program and computer system
US9378190B2 (en) 2007-01-31 2016-06-28 Google Inc. Word processor data organization
US8458231B1 (en) * 2007-01-31 2013-06-04 Google Inc. Word processor data organization
US8095575B1 (en) * 2007-01-31 2012-01-10 Google Inc. Word processor data organization
US20080320401A1 (en) * 2007-06-21 2008-12-25 Padmashree B Template-based deployment of user interface objects
US20090259995A1 (en) * 2008-04-15 2009-10-15 Inmon William H Apparatus and Method for Standardizing Textual Elements of an Unstructured Text
US20090300705A1 (en) * 2008-05-28 2009-12-03 Dettinger Richard D Generating Document Processing Workflows Configured to Route Documents Based on Document Conceptual Understanding
US9852127B2 (en) 2008-05-28 2017-12-26 International Business Machines Corporation Processing publishing rules by routing documents based on document conceptual understanding
US10169546B2 (en) * 2008-05-28 2019-01-01 International Business Machines Corporation Generating document processing workflows configured to route documents based on document conceptual understanding
US20090327213A1 (en) * 2008-06-25 2009-12-31 Microsoft Corporation Document index for handheld application navigation
US20090327862A1 (en) * 2008-06-30 2009-12-31 Roy Emek Viewing and editing markup language files with complex semantics
US20100241950A1 (en) * 2009-03-20 2010-09-23 Xerox Corporation Xpath-based display of a paginated xml document
US8108766B2 (en) * 2009-03-20 2012-01-31 Xerox Corporation XPath-based display of a paginated XML document
US20100318743A1 (en) * 2009-06-10 2010-12-16 Microsoft Corporation Dynamic screentip language translation
US8612893B2 (en) 2009-06-10 2013-12-17 Microsoft Corporation Dynamic screentip language translation
US8312390B2 (en) 2009-06-10 2012-11-13 Microsoft Corporation Dynamic screentip language translation
US8910039B2 (en) * 2011-09-09 2014-12-09 Accenture Global Services Limited File format conversion by automatically converting to an intermediate form for manual editing in a multi-column graphical user interface
US20130067313A1 (en) * 2011-09-09 2013-03-14 Damien LEGUIN Format conversion tool
US20150199307A1 (en) * 2012-08-08 2015-07-16 Google Inc. Pluggable Architecture For Optimizing Versioned Rendering of Collaborative Documents
US20140181640A1 (en) * 2012-12-20 2014-06-26 Beijing Founder Electronics Co., Ltd. Method and device for structuring document contents
US10885086B2 (en) 2015-03-30 2021-01-05 Airwatch Llc Obtaining search results
US20160292279A1 (en) * 2015-03-30 2016-10-06 Airwatch Llc Providing search results based on enterprise data
US10229209B2 (en) * 2015-03-30 2019-03-12 Airwatch Llc Providing search results based on enterprise data
US10089388B2 (en) 2015-03-30 2018-10-02 Airwatch Llc Obtaining search results
US10318582B2 (en) 2015-03-30 2019-06-11 Vmware Inc. Indexing electronic documents
US11238118B2 (en) 2015-03-30 2022-02-01 Airwatch Llc Providing search results based on enterprise data
US10572579B2 (en) * 2015-08-21 2020-02-25 International Business Machines Corporation Estimation of document structure
CN106469143A (en) * 2015-08-21 2017-03-01 国际商业机器公司 The estimation of file structure
US10452904B2 (en) 2017-12-01 2019-10-22 International Business Machines Corporation Blockwise extraction of document metadata
US10592738B2 (en) * 2017-12-01 2020-03-17 International Business Machines Corporation Cognitive document image digitalization
US10977486B2 (en) 2017-12-01 2021-04-13 International Business Machines Corporation Blockwise extraction of document metadata
EP4174866A1 (en) * 2021-10-27 2023-05-03 Koninklijke Philips N.V. User-guided structured document modeling

Also Published As

Publication number Publication date
DE60112188T2 (en) 2005-12-29
CA2365622A1 (en) 2001-08-02
DE60112188D1 (en) 2005-09-01
ATE300766T1 (en) 2005-08-15
EP1166214A1 (en) 2002-01-02
EP1166214B1 (en) 2005-07-27
WO2001055900A1 (en) 2001-08-02
JP2003521069A (en) 2003-07-08
RU2001128738A (en) 2003-07-20
KR20010110671A (en) 2001-12-13
US20010032217A1 (en) 2001-10-18
AU2001226368A1 (en) 2001-08-07
AU2775401A (en) 2001-08-07
WO2001055900A9 (en) 2002-04-18
CN1392986A (en) 2003-01-22
WO2001055899A1 (en) 2001-08-02
US6910182B2 (en) 2005-06-21

Similar Documents

Publication Publication Date Title
EP1166214B1 (en) Method and apparatus for generating structured documents for various presentations
US7475337B1 (en) Generating structured documents by associating document elements in a first display with displayed document type definitions in a second display
US8484552B2 (en) Extensible stylesheet designs using meta-tag information
Chaudhri et al. XML data management: native XML and XML-enabled database systems
Bradley The XML companion
US6763343B1 (en) Preventing duplication of the data in reference resource for XML page generation
US7146564B2 (en) Extensible stylesheet designs using meta-tag and/or associated meta-tag information
US7290008B2 (en) Method to extend a uniform resource identifier to encode resource identifiers
US20040205592A1 (en) Method and apparatus for extensible stylesheet designs
US20040221233A1 (en) Systems and methods for report design and generation
US6442576B1 (en) Searching for documents with multiple element types
US20060143562A1 (en) Self-describing editors for browser-based WYSIWYG XML/HTML editors
EP1393205A2 (en) Improvements relating to developing documents
US20020156813A1 (en) Developing documents
CA2221436A1 (en) Dynamic incremental updating of electronic documents
EP1377917A2 (en) Extensible stylesheet designs using meta-tag information
KR20020057709A (en) XML builder
Manzoor Authoring presentation semantics for mathematical documents for the web
Vatnal et al. Web Content Management as a means of Exploitation of Internet Information Resources
Voth FileMaker Pro 6 Developer's Guide to XML/XSL
Oikonomidis et al. XML ASSESSMENT USAGE REPORT
Brickley SpeedXML: An agile, user-oriented query tool for XML documents
van Herwijnen et al. Other ISO text processing standards

Legal Events

Date Code Title Description
AS Assignment

Owner name: XMLCITIES, INC., CALIFORNIA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:HUANG, EVAN S.;REEL/FRAME:011429/0235

Effective date: 20010105

STCB Information on status: application discontinuation

Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION