US20010014891A1 - Display of media previews - Google Patents

Display of media previews

Info

Publication number
US20010014891A1
Authority
US
United States
Prior art keywords
preview
media object
frames
media
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US08/847,156
Other versions
US6370543B2 (en
Inventor
Eric M. Hoffert
Stephen R. Smoot
Karl Cremin
Adnan Ali
Michael Mills
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Procter and Gamble Co
Magnifi Inc
Insolvency Services Group Inc
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US08/847,156
Assigned to MAGNIFI, INC. reassignment MAGNIFI, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: HOFFERT, ERIC M., SMOOT, STEVE, MILLS, MICHAEL, ALI, ADNAN, CREMIN, KARL
Publication of US20010014891A1
Assigned to RUSTIC CANYON VENTURES, L.P., ACCENTURE LLP, PROCTER & GAMBLE PLATFORM, INC. reassignment RUSTIC CANYON VENTURES, L.P. SECURITY INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EMMPERATIVE MARKETING HOLDINGS, INC., EMMPERATIVE MARKETING, INC.
Publication of US6370543B2
Application granted
Assigned to WORLDWIDE MAGNIFI, INC. reassignment WORLDWIDE MAGNIFI, INC. CHANGE OF NAME (SEE DOCUMENT FOR DETAILS). Assignors: AT ONCE NETWORKS, INC./D.B.A./ MAGNIFI, INC.
Assigned to EMMPERATIVE MARKETING, INC. reassignment EMMPERATIVE MARKETING, INC. MERGER (SEE DOCUMENT FOR DETAILS). Assignors: WORLDWIDE MAGNIFI, INC.
Assigned to INSOLVENCY SERVICES GROUP, INC. reassignment INSOLVENCY SERVICES GROUP, INC. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: EMMPERATIVE MARKETING, INC.
Assigned to PROCTER & GAMBLE COMPANY, THE reassignment PROCTER & GAMBLE COMPANY, THE ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: INSOLVENCY SERVICES GROUP, INC.
Anticipated expiration
Expired - Lifetime

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/48Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7834Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using audio features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/785Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using colour or luminescence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7847Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content
    • G06F16/786Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using low-level visual features of the video content using motion, e.g. object motion or camera motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/912Applications of a database
    • Y10S707/913Multimedia
    • Y10S707/914Video
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/912Applications of a database
    • Y10S707/913Multimedia
    • Y10S707/915Image
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/912Applications of a database
    • Y10S707/913Multimedia
    • Y10S707/916Audio
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/912Applications of a database
    • Y10S707/943News
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10STECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10S707/00Data processing: database and file management or data structures
    • Y10S707/99941Database schema or data structure
    • Y10S707/99944Object-oriented database structure
    • Y10S707/99945Object-oriented database structure processing

Definitions

  • the present invention relates to the field of networking, specifically to the field of searching for and retrieval of information on a network.
  • text based search algorithms cannot answer such queries. Yet text based search tools are the predominant search tools available on the internet today. Even if text based search algorithms are enhanced to examine files for file type and are therefore able to detect whether a file is an audio, video or other multimedia file, little if any information is available about the content of the file beyond its file type.
  • search engine which is capable of searching the internet, or other large distributed network for multimedia information. It is also desirable that the search engine provide for analysis of the content of files found in the search and for display of previews of the information.
  • FIG. 1 illustrates an overall diagram of a media search and retrieval system as may implement the present inventions.
  • FIGS. 2A-C illustrate a flow diagram of a method of media crawling and indexing as may utilize the present inventions.
  • FIG. 3A illustrates an overall diagram showing analysis of digital audio files.
  • FIGS. 3B, 3C and 3D illustrate waveforms.
  • FIGS. 3E-H illustrate a flow diagram of a method of analyzing content of digital audio files.
  • FIG. 4A illustrates a user interface showing search results.
  • FIG. 4B illustrates components of a preview.
  • FIGS. 4C-4E illustrate a flow diagram of a method of providing for previews.
  • reference numerals in all of the accompanying drawings typically are in the form “drawing number” followed by two digits, xx; for example, reference numerals on FIG. 1 may be numbered 1xx; on FIG. 3, reference numerals may be numbered 3xx.
  • a reference numeral may be introduced on one drawing and the same reference numeral may be utilized on other drawings to refer to the same item.
  • FIG. 1 provides an overview of a system implementing various aspects of the present invention. As was stated above, it is desirable to provide a system which will allow searching of media files on a distributed network such as the internet or, alternatively, on intranets. It would be desirable if such a system were capable of crawling the network, indexing media files, examining and analyzing the media files' content, and presenting summaries of that content to users of the system to assist the user in selection of a desired media file.
  • the embodiment described herein may be broken down into 3 key components: (1) crawling and indexing of the network to discover multimedia files and to index them 100; (2) examining the media files for content (101-105); and (3) building previews which allow a user to easily identify media objects of interest 106.
  • FIGS. 2A-2C provide a description of a method for crawling and indexing a network to identify and index media files.
  • Hypertext markup language (HTML) in the network is crawled to locate media files, block 201.
  • Lexical information i.e., textual descriptions
  • the media index is then weighted, block 204, and data is stored for each media object, block 205.
  • The method of the described embodiment for crawling HTML to locate media files is illustrated in greater detail by FIG. 2B.
  • a process as used by the present invention may be described as follows:
  • the crawler starts with a seed of multimedia specific URL sites to begin its search.
  • Each seed site is handled by a separate thread for use in a multithreaded environment.
  • Each thread parses HTML pages (using a tokenizer with lexical analysis) and follows outgoing links from a page to search for new references to media files.
  • Outgoing links from an HTML page are either absolute or relative references.
  • Relative references are concatenated with the base URL to generate an absolute pathname.
  • Each new page which is parsed is searched for media file references.
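The handling of absolute and relative references described above can be sketched as follows. This is a minimal illustration, not the crawler of the described embodiment: the function name `resolve_links` and the example URLs are hypothetical, and the standard library's `urljoin` performs full URL resolution rather than plain string concatenation.

```python
from urllib.parse import urljoin


def resolve_links(base_url, hrefs):
    """Return an absolute URL for every outgoing link found on a page.

    Relative references are resolved against base_url; absolute
    references are returned unchanged.
    """
    return [urljoin(base_url, href) for href in hrefs]


# Hypothetical page and links, for illustration only.
links = resolve_links(
    "http://example.com/media/index.html",
    ["clips/intro.mov", "/audio/theme.wav", "http://other.example.org/a.avi"],
)
```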
  • the crawler scans for a robot exclusion protocol file. If the file is present, it indicates those directories which should not be scanned for information. The crawler will not index material which is disallowed by the optional robot exclusion file.
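A crawler can honor a robot exclusion file with Python's standard `urllib.robotparser` module. The sketch below parses exclusion rules from a string so that it runs without network access; the user agent name `media-crawler` and the example rules are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_exclusion_rules(robots_txt):
    """Parse the text of a robot exclusion file into a rule set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_index(rules, url, agent="media-crawler"):
    """Return True if the exclusion rules allow this agent to fetch url."""
    return rules.can_fetch(agent, url)


# Hypothetical exclusion file that disallows one directory.
rules = build_exclusion_rules("User-agent: *\nDisallow: /private/\n")
```

Here `may_index(rules, "http://example.com/private/clip.mov")` is rejected, while a URL outside `/private/` is allowed.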
  • a media description file (termed for purposes of this application the mediaX file).
  • the general format of this file for the described embodiment is provided in Appendix A. This file contains a series of records of textual information for each media file within the current directory.
  • the crawler scans for the media description file in each directory at a web site, and adds the text based information stored there into the index being created by the crawler.
  • the mediaX file allows for storage of information such as additional keywords, abstract and classification data. Since the mediaX file is stored directly within the directory where the media file resides, it ensures an implicit authentication process whereby the content provider can enhance the searchable aspects of the multimedia information and can do so in a secure manner.
  • the crawler can be constrained to operate completely within a single parent URL.
  • the user inputs a single URL corresponding to a single web site.
  • the crawler will then only follow outgoing links which are relative to the base URL for the site. All absolute links will not be followed.
  • By following only those links which are relative to the base URL only those web pages which are within a single web site will be visited, resulting in a search and indexing pass of a single web site.
  • This allows for the crawling and indexing of a single media-rich web site.
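The single-site constraint can be sketched by discarding every link that carries its own scheme or host before resolving the remainder against the base URL. The helper names below are hypothetical, and treating "relative" as "no scheme and no network location" is an assumption about how the rule would be applied.

```python
from urllib.parse import urljoin, urlparse


def is_relative(href):
    """A link is relative when it names neither a scheme nor a host."""
    parts = urlparse(href)
    return not parts.scheme and not parts.netloc


def links_within_site(base_url, hrefs):
    """Keep only relative links and resolve them against base_url."""
    return [urljoin(base_url, href) for href in hrefs if is_relative(href)]
```

Note that under this rule an absolute link that happens to point back into the same site is also skipped, which matches the statement that all absolute links are not followed.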
  • Each HTML page is scanned for predetermined types of HTML tags, block 211.
  • the following tags are scanned for:
  • a media uniform resource locator (URL)
  • Block 212: if there is a media URL, then the media URL is located and stored. However, in the described embodiment, certain media URLs may be excluded. For example, an embodiment may choose not to index URLs having certain keywords in the URL, certain prefixes, certain suffixes or particular selected URLs.
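One way to express the exclusion rules is a filter applied to each candidate media URL before it is stored. Every keyword, prefix, suffix and URL in the sketch below is a made-up placeholder; the text does not list the actual exclusions used.

```python
# All exclusion values here are hypothetical placeholders.
EXCLUDED_KEYWORDS = ("banner", "advert")
EXCLUDED_PREFIXES = ("http://ads.",)
EXCLUDED_SUFFIXES = (".gif",)
EXCLUDED_URLS = {"http://example.com/media/test.wav"}


def should_index(url):
    """Return False for URLs matching any configured exclusion rule."""
    lowered = url.lower()
    if lowered in EXCLUDED_URLS:
        return False
    if lowered.startswith(EXCLUDED_PREFIXES):
        return False
    if lowered.endswith(EXCLUDED_SUFFIXES):
        return False
    return not any(keyword in lowered for keyword in EXCLUDED_KEYWORDS)
```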
  • relevant lexical information is selected for each URL.
  • a web page which references a media file provides significant description of the media file as textual information on the web page.
  • the present invention has recognized that it would be useful to utilize this textual information.
  • certain web pages may reference only a single media file, while other web pages may reference a plurality of media files.
  • certain lexical information on the web page may be more relevant than other information to categorizing the media for later searching.
  • a special tag may be stored within the indexed text where the media reference occurs in the web page.
  • When queries are posed to the full-text database of the stored HTML pages which reference media, the distance of the keyword text from the media reference tag can be used to determine if there is a relevant match.
  • the standard distance from media reference to matching keyword utilized is ten words in each direction outwards from the media reference.
  • the word distance metric is called “lexical proximity”. For standard web pages where text surrounding media is generally relevant this is an appropriate value.
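Lexical proximity matching can be sketched over a tokenized page in which a special tag marks the position of the media reference. The marker string `<MEDIA>` and the function name are hypothetical; the defaults of ten words in each direction follow the standard distance given above, and separate `before` and `after` limits allow a one-sided window.

```python
MEDIA_TAG = "<MEDIA>"  # hypothetical marker stored where the media reference occurs


def matches_near_media(words, keyword, before=10, after=10):
    """Return True if keyword occurs within the proximity window of a media tag.

    words  : page text as a list of tokens, including MEDIA_TAG markers
    before : number of words examined ahead of the media reference
    after  : number of words examined following the media reference
    """
    keyword = keyword.lower()
    for i, word in enumerate(words):
        if word != MEDIA_TAG:
            continue
        window = words[max(0, i - before):i] + words[i + 1:i + 1 + after]
        if any(w.lower() == keyword for w in window):
            return True
    return False
```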
  • the user needs a mechanism by which to broaden or narrow the search, based on the relevance which is found by the default lexical proximity.
  • Users can employ an expand and narrow search button to change the default lexical proximity.
  • the expand function will produce more and more search results for a given query, as the lexical proximity value is increased.
  • a typical expand function will increase the lexical proximity value by a factor of two each time it is selected. When the expand function is used, more text will be examined which is located near the media reference to see if there is a keyword match. Expanding the search repeatedly will decrease precision and increase recall.
  • the narrow search button will do the reverse, by decreasing the lexical proximity value more and more.
  • a typical narrow function will decrease the lexical proximity value by a factor of two each time it is selected.
  • the narrow search button will reduce the number of search results, and home in on only that text information which directly surrounds the media reference. Narrowing the search will increase precision and decrease recall. The relevance of all resulting queries should be quite high, on average, as a search is narrowed using this method.
  • a search query may often produce a search result list with zero hits.
  • a method is employed which will iterate on the lexical proximity value until a set of ten search results is returned. The algorithm is as follows:
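The steps of this algorithm are not reproduced in this text. One plausible reading is sketched below, under loudly stated assumptions: the proximity value starts at the default of ten, doubles on each iteration as the expand function does, and stops at an upper bound (`max_proximity`, a hypothetical safeguard) so the loop always terminates. The `run_query` callable stands in for whatever search back end is used.

```python
def search_with_min_results(run_query, start=10, target=10, max_proximity=640):
    """Widen the lexical proximity until at least `target` results are found.

    run_query     : callable taking a proximity value and returning a list of hits
    start         : initial lexical proximity (ten words, the stated default)
    target        : number of results to aim for (ten, per the text)
    max_proximity : assumed upper bound that guarantees termination
    """
    proximity = start
    results = run_query(proximity)
    while len(results) < target and proximity < max_proximity:
        proximity *= 2  # assumption: same factor of two as the expand function
        results = run_query(proximity)
    return proximity, results
```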
  • Users should be able to specify the usage of lexical proximity to enhance the indexing of their search material. For example, if the web page author knows that all words which are ten words in front of the media reference are valid and relevant, then the author should specify a lexical proximity value which is only negative ten (i.e., look only in the reverse direction from the media URL by ten words). If the web page author knows that all words which are ten words after the media reference are valid and relevant, then the author should specify a lexical proximity value which is only positive ten. Finally, if the web author knows that both ten words ahead, and ten words behind the media reference are relevant, then the lexical proximity value should be set to positive/negative ten. Similarly, if the web author knows that the entire page contains relevant text for a single media file, then the lexical proximity value should be set to include all text on a page as relevant.
  • auxiliary data in the media file (copyright, author, producer, etc.)
  • Media content of files may be stored as downloadable files or as streaming files. Downloadable content is indexed by connecting to an HTTP server, downloading the media file, and then analyzing the file for the purposes of building a media rich index.
  • an HTTP server stores, not the content itself, but instead a reference to the media file. Therefore, the process of indexing such a file is not as straightforward as for a downloadable file which is stored on the HTTP server and may be downloaded from the server.
  • header information to be queried and indexed includes title, author, copyright; in the case of a video media file, additional information indexed may also include duration, video resolution, frame rate, etc.
  • This method can be applied to any streaming technology, including both streaming sound and video.
  • the media data which is indexed includes information which is resident in the file header (i.e., title, author, copyright), and which can be computed or analyzed based on information in the media file (i.e., sound volume level, video color and brightness, etc.).
  • the latter category of information includes content attributes which can be computed while the media is streaming, or after the media has completed streaming from a server. It should be noted that once the streaming media has been queried and received results back from the server, the streaming process can conclude as the indexing is complete.
  • a media index is generated by storing the information which has been discussed above in an index format.
  • the media index is weighted to provide for increased accuracy in the searching capabilities.
  • the weighting scheme is applied using a weighting factor for each of the following text items:

      ITEM                                                                WEIGHTING FACTOR
      URL of the media file                                               10
      Keywords embedded in the media file                                 10
      Textual annotations in the media file                               10
      Script dialogue, lyrics, and closed captioning in the media file    10
      Text strings associated with the media file anchor reference        9
      Text surrounding the media file reference                           7
      Title of the HTML document containing the media file                6
      Keywords and meta-tags associated with the HTML document            6
      URL for the HTML document containing the media file reference       5
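The weighting factors can be applied, for example, by summing the factor of every text item in which a query keyword appears. The text gives the factors but not how they are combined, so the summation, the field names and the record layout below are all assumptions made for illustration.

```python
# Weighting factors as listed in the text; the field names are hypothetical.
FIELD_WEIGHTS = {
    "media_url": 10,
    "embedded_keywords": 10,
    "annotations": 10,
    "captions": 10,          # script dialogue, lyrics, closed captioning
    "anchor_text": 9,
    "surrounding_text": 7,
    "page_title": 6,
    "page_keywords": 6,
    "page_url": 5,
}


def score(media_record, keyword):
    """Sum the weights of all text items that contain the keyword (assumed rule)."""
    keyword = keyword.lower()
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        if keyword in media_record.get(field, "").lower()
    )
```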
  • Media specific text e.g., closed captioning, annotations, etc.
  • Content attributes such as brightness, color or B/W, contrast, speech v. music and volume level.
  • sampling rate, frame rate, number of tracks, data rate, size may be stored).
  • the present invention is generally concerned with indexing two types of media files: (i) audio 102 and (ii) video 103.
  • the present invention discloses an algorithm used to predict the likelihood that a given video file contains a low, medium or high degree of motion.
  • the likelihood is computed as a single scalar value, which maps into one of N buckets of classification.
  • the value associated with the motion likelihood is called the “motion” metric.
  • a method for determining and classifying the brightness, contrast and color of the same video signal is also described. The combination of the motion metric along with brightness, contrast and color estimates enhances the ability of users to locate a specific piece of digital video.
  • the described method for estimating motion content and brightness, contrast and color can be used together with the described algorithm for searching the worldwide Internet in order to index and intelligently tag digital multimedia content.
  • the described method allows for powerful searching based on information signals stored inside the content within very large multimedia databases.
  • an index of multimedia information which includes a motion metric and brightness, contrast and color estimate
  • users can perform field based sorting of multimedia databases. For example, a user could execute the query: find me all video, from slow moving to fast, by Steven Spielberg, and the database engine would return a list of search results, ordered from slowest to fastest within the requested motion range.
  • the digital video file is associated with a digital audio sequence, then an analysis of the digital audio can occur. An analysis of digital audio could determine if the audio is either music or speech. It can also determine if the speaker is male or female, and other information. This type of information could then be used to allow a user query such as:
  • the described method in its preferred embodiment, is relatively fast to compute. Historically, most systems for analyzing video signals have operated in the frequency domain. Frequency domain processing, although potentially more accurate than image based analysis, has the disadvantage of being compute intensive, making it difficult to scan and index a network for multimedia information in a rapid manner.
  • a given video file contains low, medium or high amounts of motion
  • the scalar value is an estimate of the type of content found in the video file.
  • the method described here is appropriate for those video files which may be in a variety of different coding formats (such as Vector Quantization, Block Truncation Coding, Intraframe DCT coded), and need to be analyzed in a uniform uncompressed representation.
  • it is disclosed to decode the video into a uniform representation, since it may be coded in either an intraframe or an interframe coded format.
  • the method described here is a scheme for determining the average frame difference for a pixel in a sequence of video.
  • the same metric is determined. This is desirable, even though the interframe coded video has some information about frame to frame differences.
  • the reason that the interframe coded video is uncompressed and then analyzed, is that different coding schemes produce different types of interframe patterns which may be non uniform.
  • the disclosed invention is based on three discoveries:
  • time periods can be compressed into buckets which average visual change activity
  • slow moving video is typically comprised of small frame differences
  • moderate motion video is typically comprised of medium frame differences
  • fast moving video is typically comprised of large frame differences
  • video content such as talking heads and talk shows are comprised of slow moving video
  • video content such as newscasts and commercials are comprised of moderate speed video
  • video content such as sports and action films are comprised of fast moving video
  • the disclosed method operates generally by accessing a multimedia file and evaluating the video data to determine the visual change activity. The algorithm to compute the motion metric operates as follows:
  • the resulting value is the motion metric Z.
  • sample size is 8 bits per pixel, 24 bits for RGB
  • the degree of motion is stored in the index of a multimedia database. This facilitates user queries and searches based on the degree of motion for a sequence, including the ability to provide field based sorting of video clips based on motion estimates.
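The individual steps of the algorithm are not reproduced in this text. The sketch below is one reading consistent with the surrounding description: compute the average absolute frame difference per pixel on decoded 8-bit samples, average those differences within time buckets, average the buckets to obtain a scalar Z, and map Z to a low, medium or high class. The bucket size and the two thresholds are assumed values, not figures from the source.

```python
def frame_difference(prev, curr):
    """Average absolute difference per pixel between two decoded frames.

    Frames are lists of rows of 8-bit sample values (0-255).
    """
    total = sum(
        abs(a - b)
        for row_prev, row_curr in zip(prev, curr)
        for a, b in zip(row_prev, row_curr)
    )
    pixels = sum(len(row) for row in curr)
    return total / pixels


def motion_metric(frames, bucket_size=4):
    """Average frame differences within time buckets, then across buckets."""
    diffs = [frame_difference(p, c) for p, c in zip(frames, frames[1:])]
    if not diffs:
        return 0.0
    buckets = [diffs[i:i + bucket_size] for i in range(0, len(diffs), bucket_size)]
    bucket_means = [sum(b) / len(b) for b in buckets]
    return sum(bucket_means) / len(bucket_means)


def classify_motion(z, low=5.0, high=20.0):
    """Map the scalar metric to a class; both thresholds are assumptions."""
    if z < low:
        return "low"
    if z < high:
        return "medium"
    return "high"
```

Working on decoded pixel samples keeps the computation independent of the coding format, which is the reason given above for decompressing before analysis.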
  • the method described above is appropriate for those video files which may be in a variety of different coding formats (such as Vector Quantization, Block Truncation Coding, Intraframe DCT coded), and need to be analyzed in a uniform uncompressed representation.
  • the coded representation is decoded and then an analysis is applied in the image space domain on the uncompressed pixel samples.
  • some coding formats such as MPEG
  • the method described here uses the motion estimation data to derive an estimate of motion for a full sequence of video in a computationally efficient manner.
  • the MPEG coded data contains both motion vectors and motion vector lengths
  • the number of non-zero motion vectors is a measure of how many image blocks are moving
  • the length of motion vectors is a measure of how far image blocks are moving
  • slow moving video is comprised of few motion vectors and small vector lengths
  • moderate video is comprised of moderate motion vectors and moderate vector lengths
  • fast moving video is comprised of many motion vectors and large vector lengths
  • video content such as talking heads and talk shows are comprised of slow moving video
  • video content such as newscasts and commercials are comprised of moderate speed video
  • video content such as sports and action films are comprised of fast moving video
  • An algorithm to compute the motion metric may operate as follows:
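The steps are not reproduced in this text. A minimal sketch follows, assuming the motion vectors have already been extracted from the coded stream by a decoder (not shown) and are supplied as `(dx, dy)` pairs per frame. Combining the fraction of moving blocks with their mean vector length by multiplication is an assumption; it is equivalent to the mean vector length taken over all blocks.

```python
import math


def vector_motion_metric(frames):
    """Estimate motion from per-frame lists of (dx, dy) motion vectors.

    The metric is (fraction of non-zero vectors) * (mean length of the
    non-zero vectors), an assumed way of combining the two measures.
    """
    total = 0
    moving = 0
    length_sum = 0.0
    for vectors in frames:
        for dx, dy in vectors:
            total += 1
            length = math.hypot(dx, dy)
            if length > 0:
                moving += 1
                length_sum += length
    if moving == 0:
        return 0.0
    return (moving / total) * (length_sum / moving)
```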
  • time periods can be compressed into buckets which average brightness activity
  • the buckets can be averaged to derive an overall estimate of brightness level
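The bucket averaging for brightness can be sketched in the same way as for motion. The sketch assumes decoded frames of 8-bit grayscale samples and an arbitrary bucket size of four frames; for RGB input a luma conversion would be needed first, which is not shown.

```python
def brightness_estimate(frames, bucket_size=4):
    """Average per-frame brightness within time buckets, then across buckets.

    Frames are lists of rows of 8-bit grayscale samples (0-255).
    """
    if not frames:
        return 0.0
    frame_means = [
        sum(sum(row) for row in frame) / sum(len(row) for row in frame)
        for frame in frames
    ]
    buckets = [
        frame_means[i:i + bucket_size]
        for i in range(0, len(frame_means), bucket_size)
    ]
    bucket_means = [sum(b) / len(b) for b in buckets]
    return sum(bucket_means) / len(bucket_means)
```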
  • FIG. 3A provides an overview of the process.
  • a digital audio file is initially analyzed 301 and an initial determination is made whether the file is speech 307 or music 302. If the file is determined to be music and, in one embodiment, the file is “noisy”, a noise reduction filter may be applied and the analysis repeated 303. This is because a noisy speech file may be misinterpreted as music. If the file is music, an analysis may be done to determine if the music is fast or slow 304, and an analysis may be done to determine if the music is bass or treble 305 based on a pitch analysis.
  • an analysis might be done to determine if the speech 308 is fast or slow based on frequency and whether it is male or female 309 based on pitch.
  • knowing that a portion of an audio track for a movie starring Sylvester Stallone has a fast, male voice may be interpreted by retrieval software as indicating that portion of the audio track is an action scene involving Sylvester Stallone.
  • the voice recognition capability may be limited to only recognizing a known voice, while in other more advanced embodiments, omni-voice recognition capability may be added. In either event, the recognized text may be added to the stored information for the media file and be used for searching and retrieval.
  • time periods can be compressed into buckets which average amplitude activity
  • music is typically comprised of a continuous amplitude signal
  • speech is typically comprised of a discontinuous amplitude signal
  • sound effects are typically comprised of a discontinuous amplitude signal
  • audio comprised of music and speech has moderate rates of change in amplitude activity
  • Continuous signals are characterized by low rates of change.
  • Various types of music including rock, classical and jazz are often relatively continuous in nature with respect to the amplitude signal. Rarely does music jump from a minimum to a maximum amplitude. This is illustrated by FIG. 3C which illustrates a typical amplitude signal 330 for music.
  • FIG. 3B illustrates a typical amplitude signal 320 for speech.
  • FIG. 3D illustrates signal 340 having period 341 which would be interpreted as music, period 342 which would be speech, period 343 music, period 344 speech, period 345 music and period 346 speech.
  • if the audio file is a compressed file (which may be in any of a number of known compression formats), it is first decompressed using any of a number of known decompression algorithms, block 351.
  • an amplitude analysis is then performed on the audio track to provide a music speech metric value. The amplitude analysis is performed as follows:
  • the audio track is divided into time segments of a predetermined length, block 352 .
  • each time segment is 50 ms.
  • the time segments may be of a greater or lesser length.
  • a normalized amplitude deviation is computed, block 356 . This is described in greater detail with reference to FIG. 3F.
  • the maximum amplitude and minimum amplitude are determined, block 351. In the example of FIG. 3B, values range from 0 to 256 (in an alternative embodiment, the values may be based on floating point calculations and may range from 0 to 1.0).
  • the maximum amplitude value is shown as 160, for the second interval 322 , it is 158 and for the third interval 323 , it is 156.
  • the average maximum amplitude and average minimum amplitude are computed for all time intervals, block 352. Again, using the example in FIG.
  • the average maximum amplitude will be 158.
  • a value MAX-DEV is computed for each interval as the absolute value of maximum amplitude for the interval minus the average maximum, block 353 .
  • the MAX-DEV will be 2
  • the second interval it will be 0
  • for the third interval it will be 2.
  • the MAX-DEV is normalized by computing MAX-DEV * (REF-VALUE/MAX) where the reference value is 256 in the described embodiment (and may be 1.0 in a floating point embodiment) and MAX is the maximum amplitude for all of the intervals.
  • the normalized MAX-DEV values for each segment are averaged together, block 357 , to determine a music-speech metric. High values tend to indicate speech, low values tend to indicate music and medium values tend to indicate a combination, block 358 .
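The MAX-DEV steps above can be worked through in code using the example maxima of 160, 158 and 156. This is a minimal sketch: the function name is hypothetical, only the per-interval maxima are used (as in the steps quoted above), and splitting the track into intervals is assumed to have been done already.

```python
def music_speech_metric(interval_maxima, ref_value=256):
    """Average of the normalized MAX-DEV values over all intervals."""
    if not interval_maxima:
        raise ValueError("at least one interval is required")
    average_max = sum(interval_maxima) / len(interval_maxima)
    overall_max = max(interval_maxima)
    if overall_max == 0:
        return 0.0  # silent track: there is no deviation to measure
    # MAX-DEV for each interval, normalized by REF-VALUE / MAX.
    normalized = [
        abs(m - average_max) * (ref_value / overall_max)
        for m in interval_maxima
    ]
    return sum(normalized) / len(normalized)

# Example from the text: average maximum 158, MAX-DEV values 2, 0 and 2.
print(round(music_speech_metric([160, 158, 156]), 3))
```

With these inputs the normalization factor is 256/160 = 1.6, so the normalized deviations are 3.2, 0 and 3.2. Where the boundaries between "low", "medium" and "high" metric values lie is not stated here, so the sketch returns the raw value only.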
  • N seconds of the audio file may be randomly chosen for analysis.
  • the algorithm described above may be run on each channel, and the results averaged into a single scalar value to represent the entire sequence.
  • the percentage of each type of music-speech metric category may be computed and displayed. For example, for a soundtrack which is one hour long, which may consist of different periods of silence, music, speech and sound effects, the resulting characterization of the audio file would appear as follows:
  • volume level metric is an estimate of the volume of content found in the audio file.
  • time periods can be compressed into buckets which average volume activity
  • the buckets can be averaged to derive an overall estimate of volume level
  • the disclosed algorithm provides for determining the volume level of data in an audio file by evaluating the average amplitude for a set of sampled signals.
  • the disclosed algorithm comprises the steps of:
  • the audio track is mapped into X time segment buckets, 362 .
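The bucket mapping and averaging described above can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the samples are taken to be signed amplitudes, and "volume" is taken to be the mean absolute amplitude within each bucket.

```python
def bucket_volumes(samples, x_buckets):
    """Average absolute amplitude for each of x_buckets contiguous time segments."""
    if x_buckets <= 0 or len(samples) < x_buckets:
        raise ValueError("need at least one sample per bucket")
    n = len(samples)
    volumes = []
    for i in range(x_buckets):
        # Integer arithmetic keeps the buckets contiguous and non-empty.
        start = i * n // x_buckets
        end = (i + 1) * n // x_buckets
        chunk = samples[start:end]
        volumes.append(sum(abs(s) for s in chunk) / len(chunk))
    return volumes

def volume_level(samples, x_buckets):
    """Overall volume estimate: the average of the per-bucket volumes."""
    volumes = bucket_volumes(samples, x_buckets)
    return sum(volumes) / len(volumes)

samples = [0, 0, 10, -10, 100, -100, 50, 50]
print(bucket_volumes(samples, 4), volume_level(samples, 4))
```

The per-bucket list is also what a percentage breakdown by volume category would be computed from; the category boundaries themselves are not given in this excerpt.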
  • the percentage of each type of volume category is displayed. For example, for a soundtrack which is one hour long, which may consist of different periods of silence, loudness, softness and moderate sound levels, the resulting characterization of the audio file would appear as follows:
  • a focus of the method described herein is to generate a visual display of audio information which can aid a user to determine if an audio file contains the audio content they are looking for.
  • This method complements the other types of useful information which can be computed and/or extracted from digital audio files; the combination of context and content analysis, together with graphical display of content data, results in a composite useful snapshot of a piece of digital media information.
  • the algorithm described herein is used to display a time compressed representation of an audio signal.
  • the method is focused on visually providing some high level features of the time varying sound signal.
  • the method described can allow users to:
  • the results returned might be a set of fifty musical pieces by Beethoven. If the searcher knows that the piece of music they are looking for has a very quiet part towards the end of the piece, the user could view the graphical representation and potentially find the quiet part by seeing the waveform display illustrate a volume decrease towards the end of the waveform image. This could save the searcher great amounts of time that would have been required to listen to all fifty pieces of music.
  • a searcher might be looking for a speech by Martin Luther King, where the speech starts out with him yelling loudly, and then speaking in a normal tone of voice. If twenty speeches are returned from the search engine results, then the searcher could visually scan the results and look for a waveform display which shows high volume at the beginning and then levels off within the first portion of the audio signal. This type of visual identification could save the searcher great amounts of time which would be required to listen to all twenty speeches.
  • Continuous signals are characterized by low rates of change.
  • Various types of music including rock, classical and jazz are often relatively continuous in nature with respect to the amplitude signal. Rarely does music jump from a minimum to a maximum amplitude. Similarly, it is rare that speech results in a continuous amplitude signal with only small changes in amplitude.
  • Discontinuous signals are characterized by high rates of change. For speech, there are often bursty periods of large amplitude interspersed with extended periods of silence of low amplitude. For sound effects, there are often bursty periods of large amplitude interspersed with bursty periods of low amplitude.
  • a method is illustrated here which derives a visual representation of sound in a temporally compressed format.
  • the goal is to illustrate long term trends in the audio signal which will be useful to a user when searching digital multimedia content.
  • the method produces visual images of constant horizontal resolution, independent of the duration in seconds. This means that temporal compression must occur to varying degrees while still maintaining a useful representation of long term amplitude trends within a limited area of screen display.
  • mapping is that a single bucket represents N/X samples of sound
  • the N/X samples term is called a compressed time sample C
  • red represents maximum amplitude
  • a conventional speech recognition algorithm can then be applied (also called speech to text) which can convert the speech information in the audio file into textual information. This will allow the audio file to then be searchable based on its internal characteristics, as well as the actual narrative which is within the audio file.
  • the speech may correspond to closed captioning information, script dialogue or other forms of textual representation.
  • the described embodiment is concerned with parsing content files and building low-bandwidth previews of higher bandwidth data files. This allows rapid previewing of media data files, without need to download the entire file.
  • A sample of the results of a search, showing a media preview, is given in FIG. 4A.
  • the preview is explained in greater detail with reference to FIG. 4B.
  • FIG. 4B illustrates a preview 410 .
  • the preview comprises a first sprocket area 411 at the top of the preview and a second sprocket area at the bottom of the preview, and an image area having three images of height IH 412 and width IW 413.
  • the preview itself is of height FH 414 and width FW 415 .
  • the preview may include a copyright area 416 for providing copyright information relating to the preview and certain embodiments may contain an area, for example in the upper left hand corner of the first sprocket area 411 for a corporate logo or other branding information.
  • a general algorithm for generation and display of previews is disclosed with reference to FIG. 4C.
  • the media file is examined to locate portions having predetermined characteristics. For example, portions of a video file having fast action may be located. Or, black and white portions of a video may be located.
  • a preview of the object is generated and stored. This will be discussed in greater detail in connection with FIG. 4D. Finally, when requested by a user, for example, in response to a search, the preview may be displayed.
  • the object may be, for example, a digital video file, an animation file, or a panoramic image.
  • the file may be downloadable or streaming. And, if downloadable, the file may have table based frame descriptions or track based frame descriptions.
  • Animation objects include animated series of frames using a lossless differential encoding scheme and hyperlinked animation.
  • a preview is generated generally along the lines of the preview of FIGS. 4A and 4B, block 432.
  • an aspect ratio is computed for the preview.
  • the target filmstrip is set with a width FW 415 and a height FH.
  • a distance ID is set for the distance between images on the filmstrip.
  • a sprocket height and width are set resulting in a sprocket region height (SRH 411).
  • the particular heights and widths may vary from implementation to implementation dependent on a variety of factors such as expected display resolution. In alternative embodiments, differing sprocket designs may be utilized and background colors, etc. may be selected. In certain embodiments, it may be desirable to include copyright information 416 or a logo.
  • the number, width and height of images can be determined for display of a preview.
  • the selection of images for use in the preview is dependent on whether the preview is being generated for a 3D media object, a digital video or animation object, or a panoramic object.
  • N frames from the image are then decompressed to pure RGB at N fixed points in the image, where the N fixed points are at TW, 2*TW, 3*TW, . . . N*TW time into the media image. This process reduces the need to decompress the entire video file. Scanning to the particular ones of the N frames is accomplished by using the table based frame description, the track based frame description or by streaming dependent on the media source file.
  • An objective of choosing N frames spaced TW apart is to develop a preview with frames from various portions of the media file so that the user will be given an opportunity to review the various portions in making a determination if the user wishes to access the entire file.
  • the decompress process may utilize intraframe, predictive decoding or bilinear decoding dependent on the source file.
  • a color space conversion is then performed from RGB to YUV.
  • an adaptive noise reduction process may be performed.
  • Each of the N frames is then analyzed to determine if the frame meets predetermined criteria for display, block 444. Again, an objective is to provide the user with a quality preview allowing a decision if the entire file should be accessed.
  • each of the N frames is analyzed for brightness, contrast and quality. If a frame meets the criteria, block 445, then the frame is scaled, block 447, from its original width W and height H to width IW and height IH using interpolation. Linear interpolation is utilized and the aspect ratio is maintained.
  • Each frame is also analyzed for a set of attributes, block 448 .
  • the attributes in the described embodiment include brightness, contrast (luminance, deviation), chrominance, and dominant color.
  • Brightness indicates the overall brightness of a digital video clip.
  • Color indicates if the video clip is in full color or black and white, and contrast indicates the degree of contrast in the movie.
  • These high level content attributes tend to be more meaningful for the typically short video sequences which are published on the Internet and Intranet.
  • This information can then be used for enhanced searching.
  • chrominance can be used for searching for black and white versus color video.
  • embodiments may provide for optionally storing a feature vector for texture, composition and structure. These attributes can be averaged across the N frames and the average for each attribute is stored as a searchable metric.
  • the contrast of the frames may be enhanced using a contrast enhancement algorithm.
  • the maximum chrominance is computed for the selected N frames in the video sequence.
  • the maximum chrominance for the set of frames is then determined by finding the maximum chrominance for each frame by finding the maximum chrominance for all pixels in each frame. This maximum chrominance value for the set of selected frames is then compared against a threshold. If the maximum chrominance for the sequence is larger than the threshold, then the sequence is considered in full color. If the maximum chrominance for the sequence is smaller than the threshold, then the sequence is considered in black and white.
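The color versus black and white test above can be sketched as follows. This is a minimal illustration under assumptions: frames are represented as lists of (R, G, B) tuples, the conversion uses the common BT.601 luma weights, chrominance for a pixel is taken to be the larger of |U| and |V|, and the threshold of 10.0 is hypothetical. None of these specifics are stated in the source.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (0-255 per channel) to YUV using BT.601 weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

def max_chrominance(frames):
    """Largest chrominance magnitude over all pixels of all selected frames."""
    peak = 0.0
    for frame in frames:
        for r, g, b in frame:
            _, u, v = rgb_to_yuv(r, g, b)
            # Assumed chrominance measure: larger of |U| and |V|.
            peak = max(peak, abs(u), abs(v))
    return peak

def is_full_color(frames, threshold=10.0):
    """True if the maximum chrominance of the sequence exceeds the threshold."""
    return max_chrominance(frames) > threshold

gray_clip = [[(128, 128, 128), (20, 20, 20)]]
color_clip = [[(128, 128, 128)], [(255, 0, 0)]]
print(is_full_color(gray_clip), is_full_color(color_clip))
```

Because the decision uses a maximum rather than an average, a single saturated pixel in any selected frame is enough to classify the whole sequence as full color.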
  • the luminance is computed for the selected N frames in the video sequence. The luminance is then averaged into a single scalar value.
  • luminance values are computed for each frame of the digital video sequence.
  • the luminance values which fall below the fifth percentile, and above the ninety-fifth percentile are then removed from the set of values. This is done to remove random noise.
  • the remaining luminance values are then examined for the maximum and minimum luminance. The difference between the maximum and minimum luminance is computed as the contrast for a single frame.
  • the contrast value is then computed for all frames in the sequence, and the average contrast is stored as the resulting value.
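The contrast computation above can be sketched as follows. This is an illustration under assumptions: each frame is given as a flat list of luminance values, and the percentile cut is approximated by dropping the lowest and highest 5% of the sorted values (integer division), which is one of several reasonable percentile conventions.

```python
def frame_contrast(luma):
    """Contrast of one frame: max minus min luminance after trimming outliers."""
    if not luma:
        raise ValueError("frame has no luminance values")
    ordered = sorted(luma)
    trim = len(ordered) // 20  # about 5% of the values at each end
    kept = ordered[trim:len(ordered) - trim]
    return kept[-1] - kept[0]

def average_contrast(frames):
    """Average of the per-frame contrast values across the sequence."""
    contrasts = [frame_contrast(f) for f in frames]
    return sum(contrasts) / len(contrasts)

ramp = list(range(100))                       # smooth gradient
flat = [0] * 2 + [100] * 96 + [255] * 2       # flat frame with noisy outliers
print(frame_contrast(ramp), frame_contrast(flat), average_contrast([ramp, flat]))
```

The second frame shows why the trimming matters: without it, four noisy pixels would give a flat image a contrast of 255.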
  • audio and video clips may be associated with each frame, block 449 .
  • a standard audio segment may be selected or alternatively an audio selection algorithm may be applied which finds audio which meets predetermined criteria, such as a preset volume level.
  • for video, a video track of duration VD is selected.
  • the video selection may be a standard video segment or the video segment may be selected using a video selection algorithm which selects video segments meeting a predetermined criteria such as video at a predetermined brightness, contrast or motion.
  • a frame iterator algorithm is applied to select a new frame.
  • the frame iterator algorithm of the described embodiment selects another frame by iteratively selecting frames between the frame in question and the other frames until a frame is found which meets the criteria or until a predetermined number of iterations have been applied. If the predetermined number of iterations is applied without successfully finding a frame which meets the criteria, the originally selected frame is used.
  • the algorithm starts with the original frame at TW (or 2*TW, 3*TW, . . . N*TW) and selects, first, a frame at TW-(TW/2) (i.e., a frame halfway between the original frame and the beginning). If this frame does not meet the criteria, a frame at TW+(TW/2) is selected and iteratively frames are selected according to the pattern:
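The selection pattern itself is not reproduced in this excerpt. The sketch below shows one way such a frame iterator might behave, and it rests on a clearly labeled assumption: after the two candidates named in the text (TW/2 before and TW/2 after the original frame), the offset is assumed to halve on each round. The function names, the `meets_criteria` callback and the default iteration limit are likewise hypothetical.

```python
def candidate_times(original, tw, max_iterations):
    """Yield alternate frame times on either side of `original`."""
    step = tw / 2
    produced = 0
    while produced < max_iterations:
        for t in (original - step, original + step):
            if produced == max_iterations:
                return
            produced += 1
            yield t
        step /= 2  # ASSUMPTION: the offset halves each round

def pick_frame(original, tw, meets_criteria, max_iterations=6):
    """Return the first acceptable frame time, or the original as a fallback."""
    if meets_criteria(original):
        return original
    for t in candidate_times(original, tw, max_iterations):
        if meets_criteria(t):
            return t
    return original  # no candidate qualified: keep the originally selected frame

print(list(candidate_times(10.0, 10.0, 4)))
```

The fallback in `pick_frame` follows the text: when the iteration limit is reached without success, the originally selected frame is used.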
  • Interactive panoramic images are often stored as multimedia files. Typically these media files are stored as a series of compressed video images, with a total file size ranging from 100 to 500 Kbytes.
  • the described embodiment provides a method which creates a low bandwidth preview of a panoramic picture.
  • a preview in the described embodiment utilizes approximately 10 Kbytes in storage size which is only 1/10th to 1/50th of the original panoramic storage.
  • the preview provides a high-quality, low bit rate display of the full panoramic scene.
  • the method for creating the panoramic preview, block 450 may be described as follows:
  • Coding schemes can include progressive or interlaced transmission algorithms.
  • a top view, bottom view, front view and rear view are selected for images to display, block 441 .
  • the web server application can return a series of HTML and EMBED tags which set up a movie controller, allowing a user to interact with the videos of cars.
  • the low bandwidth preview (a filmstrip showing select scenes of the video clip)
  • the position of the mouse that is active when a user clicks within the preview will drive the resulting EMBED tags which are created and then returned from the server.
  • a user clicks down in frame X of a filmstrip then an in-line viewer is created which will begin display and playback of the movie at frame X.
  • a snippet or short segment of a video or audio file may be stored with the preview and associated with a particular portion of the preview. This method avoids the need to access the original file for playback of a short audio or video segment.
  • the described embodiment provides for both a text and visual method for showing that the search results are of different media types.
  • search results include digital video, audio, 3D, animation, etc.
  • Video may show karate methods, sound might be an interview with a karate expert, 3D could be a simulation of a karate chop and animation a display of a flipbook of a karate flip.
  • an icon which is representative of each type of media is employed.
  • users can select basic, detailed or visual search results. If a user selects visual search results, then only visual images, filmstrips or waveforms are presented to users as search results.
  • the visual search results are typically displayed as a set of mosaics on a page, usually multiple thumbnail images per row, and multiple filmstrips (usually two) per row. Clicking on images, waveforms or filmstrips then takes users to new web pages where more information is described about the media content. This allows users to rapidly scan a page of visual search results to see if they can find what they are looking for.
  • Text keywords may be found within certain multimedia files (e.g., the content of the file). For example, movie and other video files sometimes contain a movie text track, a closed caption track or a musical soundtrack lyrics track. For each text keyword which is found in one of these tracks, a new database is created by the process of the present invention. This database maps keywords to [text, timecode] pairs. This is done so that it is possible to map keywords directly to the media file and timecode position where the media file text reference occurs. The timecode position is subsequently used when producing search results to viewers, so that viewers can jump directly to that portion of the media sequence where the matching text occurs.
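The keyword database described above can be sketched as an in-memory mapping. This is a minimal illustration under assumptions: the function name and input layout are hypothetical, keywords are simply lowercased whitespace-separated words, and each entry also records the media file name so that one index can cover several files.

```python
from collections import defaultdict

def build_keyword_index(text_tracks):
    """Map each keyword to a list of (media_file, text, timecode) entries.

    text_tracks: {media_file: [(timecode_seconds, text), ...]}
    """
    index = defaultdict(list)
    for media_file, cues in text_tracks.items():
        for timecode, text in cues:
            for word in text.lower().split():
                keyword = word.strip(".,!?\"'")
                if keyword:
                    index[keyword].append((media_file, text, timecode))
    return index

tracks = {
    "karate.mov": [(12.0, "Karate chop!"), (47.5, "The karate flip")],
}
index = build_keyword_index(tracks)
for media_file, text, timecode in index["karate"]:
    print(media_file, timecode, text)
```

A search for "karate" then yields both timecodes, which is what allows a viewer to jump directly to the matching position in the media sequence.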

Abstract

A method and apparatus for searching for multimedia files in a distributed database and for displaying results of the search based on the context and content of the multimedia files.

Description

    RELATED APPLICATIONS
  • This application claims benefit of the following co-pending U.S. Provisional Applications: [0001]
  • 1) Method and Apparatus for Processing Context and Content of Multimedia Files When Creating Searchable Indices of Multimedia Content on Large, Distributed Networks; Serial No.: 60/018,312; Filed: May 24, 1996; [0002]
  • 2) Method and Apparatus for Display of Results of a Search Queries for Multimedia Files; Serial No.: 60/018,311; Filed: May 24, 1996; [0003]
  • 3) Method for Increasing Overall Performance of Obtaining Search Results When Searching on a Large, Distributed Database By Prioritizing Database Segments to be Searched; Serial No.: 60/018,238; Filed: May 24, 1996; [0004]
  • 4) Method for Processing Audio Files to Compute Estimates of Music-Speech Content and Volume Levels to Enable Enhanced Searching of Multimedia Databases; Serial No.: 60/021,452; Filed: Jul. 10, 1996; [0005]
  • 5) Method for Searching for Copyrighted Works on Large, Distributed Networks; Serial No.: 60/021,515; Filed: Jul. 10, 1996; [0006]
  • 6) Method for Processing Video Files to Compute Estimates of Motion Content, Brightness, Contrast and Color to Enable Enhanced Searching of Multimedia Databases; Serial No.: 60/021,517; Filed: Jul. 10, 1996; [0007]
  • 7) Method and Apparatus for Displaying Results of Search Queries for Multimedia Files; Serial No.: 60/021,466; Filed: Jul. 10, 1996; [0008]
  • 8) A Method for Indexing Stored Streaming Multimedia Content When Creating Searchable Indices of Multimedia Content on Large, Distributed Networks; Serial No.: 60/023,634; Filed: Aug. 9, 1996; [0009]
  • 9) An Algorithm for Exploiting Lexical Proximity When Performing Searches of Multimedia Content on Large, Distributed Networks; Serial No.: 60/023,633; Filed: Aug. 9, 1996; [0010]
  • 10) A Method for Synthesizing Descriptive Summaries of Media Content When Creating Searchable Indices of Multimedia Content on Large, Distributed Networks; Serial No.: 60/023,836; Filed: Aug. 12, 1996. [0011]
  • BACKGROUND OF THE INVENTION
  • 1. Field of the Invention [0012]
  • The present invention relates to the field of networking, specifically to the field of searching for and retrieval of information on a network. [0013]
  • 2. Description of the Related Art [0014]
  • Wouldn't it be nice to be able to log onto your local internet service provider, access the worldwide web, and search for some simple information, like “Please find me action movies with John Wayne which are in color?” or “Please find me audio files of Madonna talking?”, or “I would like black and white photos of the Kennedy assassination”. Or, how about even “Please find me an action movie starring Michael Douglas and show me a preview of portions of the movie where he is speaking loudly”. Perhaps, instead of searching the entire worldwide web, a company may want to implement this searching capability on its intranet. [0015]
  • Unfortunately, text based search algorithms cannot answer such queries. Yet, text based search tools are the predominant search tools available on the internet today. Even if text based search algorithms are enhanced to examine files for file type and, therefore, be able to detect whether a file is an audio, video or other multimedia file, little if any information is available about the content of the file beyond its file type. [0016]
  • Still further, what if the search returns a number of files? Which one is right? Can the user tell from looking at the title of the document or some brief text contained in the document as is done by many present day search engines? In the case of relatively small text files, downloading one or two or three “wrong” files, when searching for the right file, is not a major problem. However, when downloading relatively large multimedia files, it may be problematic to download the files without having a degree of assurance that the correct file has been found. [0017]
  • SUMMARY OF THE INVENTION
  • It is desirable to provide a search engine which is capable of searching the internet, or other large distributed network, for multimedia information. It is also desirable that the search engine provide for analysis of the content of files found in the search and for display of previews of the information. [0018]
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 illustrates an overall diagram of a media search and retrieval system as may implement the present inventions. [0019]
  • FIGS. 2A-C illustrate a flow diagram of a method of media crawling and indexing as may utilize the present inventions. [0020]
  • FIG. 3A illustrates an overall diagram showing analysis of digital audio files. [0021]
  • FIGS. 3B, 3C and 3D illustrate waveforms. [0022]
  • FIGS. 3E-H illustrate a flow diagram of a method of analyzing content of digital audio files. [0023]
  • FIG. 4A illustrates a user interface showing search results. [0024]
  • FIG. 4B illustrates components of a preview. [0025]
  • FIGS. 4C-4E illustrate a flow diagram of a method of providing for previews. [0026]
  • For ease of reference, it might be pointed out that reference numerals in all of the accompanying drawings typically are in the form “drawing number” followed by two digits, xx; for example, reference numerals on FIG. 1 may be numbered 1xx; on FIG. 3, reference numerals may be numbered 3xx. In certain cases, a reference numeral may be introduced on one drawing and the same reference numeral may be utilized on other drawings to refer to the same item. [0027]
  • DETAILED DESCRIPTION OF THE EMBODIMENTS
  • What is described herein is a method and apparatus for searching for, indexing and retrieving information in a large, distributed network. [0028]
  • 1.0 Overview [0029]
  • FIG. 1 provides an overview of a system implementing various aspects of the present invention. As was stated above, it is desirable to provide a system which will allow searching of media files on a distributed network such as the internet or, alternatively, on intranets. It would be desirable if such a system were capable of crawling the network, indexing media files, examining and analyzing the media file's content, and presenting summaries to users of the system of the content of the media files to assist the user in selection of a desired media file. [0030]
  • The embodiment described herein may be broken down into 3 key components: (1) crawling and indexing of the network to discover multimedia files and to index them 100; (2) examining the media files for content (101-105); and (3) building previews which allow a user to easily identify media objects of interest 106. Each of these phases of the embodiment provides, as will be appreciated, for unique methods and apparatus for allowing advanced media queries. [0031]
  • 2.0 Media Crawling and Indexing [0032]
  • FIGS. 2A-2C provide a description of a method for crawling and indexing a network to identify and index media files. Hypertext markup language (HTML) in the network is crawled to locate media files, block 201. Lexical information (i.e., textual descriptions) is located describing the media files, block 202, and a media index is generated, block 203. The media index is then weighted, block 204, and data is stored for each media object, block 205. Each of these steps will be described in greater detail below. [0033]
  • 2.1 Crawl HTML to locate media files [0034]
  • The method of the described embodiment for crawling HTML to locate media files is illustrated in greater detail by FIG. 2B. Generally, a process as used by the present invention may be described as follows: [0035]
  • The crawler starts with a seed of multimedia specific URL sites to begin its search. Each seed site is handled by a separate thread for use in a multithreaded environment. Each thread parses HTML pages (using a tokenizer with lexical analysis) and follows outgoing links from a page to search for new references to media files. Outgoing links from an HTML page are either absolute or relative references. Relative references are concatenated with the base URL to generate an absolute pathname. Each new page which is parsed is searched for media file references. When a new site is found by the crawler, there is a check against the internal database to ensure that the site has not already been visited (within a small period of time); this guarantees that the crawler only indexes unique sites within its database, and does not index the same site repeatedly. A hash table scheme is used to guarantee that only unique new URLs are added to the database. The URL of a link is mapped into a single bit in a storage area which can contain up to approximately ten million URLs. If any URL link which is found hashes to a bit position which is already set, then the URL is not added to the list of URLs for processing. As the crawler crawls the web, those pages which contain media references receive a higher priority for processing than those pages which do not reference media. As a result, pages linked to media specific pages will be visited by the crawler first in an attempt to index media related pages more quickly than through conventional crawler techniques. [0036]
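The one-bit-per-URL hash table and the resolution of relative links described above can be sketched as follows. This is a minimal illustration under assumptions: the class and function names are hypothetical, SHA-1 is used only as a convenient deterministic hash (the source does not name one), and the threading and priority handling are omitted.

```python
import hashlib
from urllib.parse import urljoin

TABLE_BITS = 10_000_000  # roughly ten million one-bit slots, as in the text

class SeenUrls:
    """Bit table recording which hash positions have already been used."""

    def __init__(self, bits=TABLE_BITS):
        self.bits = bits
        self.table = bytearray((bits + 7) // 8)

    def _slot(self, url):
        # ASSUMPTION: SHA-1 reduced modulo the table size; any stable hash works.
        digest = hashlib.sha1(url.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.bits

    def add_if_new(self, url):
        """Set the URL's bit; return False if that bit was already set."""
        slot = self._slot(url)
        byte, mask = slot // 8, 1 << (slot % 8)
        if self.table[byte] & mask:
            return False
        self.table[byte] |= mask
        return True

def resolve(base_url, href):
    """Turn a relative reference into an absolute URL; absolute ones pass through."""
    return urljoin(base_url, href)

seen = SeenUrls()
url = resolve("http://example.com/media/index.html", "clips/a.mov")
print(url, seen.add_if_new(url), seen.add_if_new(url))
```

One consequence of using a single bit per URL is that two different URLs can occasionally hash to the same position, in which case the second is skipped. The scheme trades that small loss for a table of about 1.25 megabytes.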
  • When entering a new site, the crawler scans for a robot exclusion protocol file. If the file is present, it indicates those directories which should not be scanned for information. The crawler will not index material which is disallowed by the optional robot exclusion file. On a per directory basis, there is proposed to be stored a media description file (termed for purposes of this application the mediaX file). The general format of this file for the described embodiment is provided in Appendix A. This file contains a series of records of textual information for each media file within the current directory. As will be discussed in greater detail below, the crawler scans for the media description file in each directory at a web site, and adds the text based information stored there into the index being created by the crawler. The mediaX file allows for storage of information such as additional keywords, abstract and classification data. Since the mediaX file is stored directly within the directory where the media file resides, it ensures an implicit authentication process whereby the content provider can enhance the searchable aspects of the multimedia information and can do so in a secure manner. [0037]
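The robot exclusion check described above can be illustrated with Python's standard `urllib.robotparser` module. This is a sketch rather than the crawler's own implementation: the helper name, the user agent string and the example rules are hypothetical, and the rules are parsed from an in-memory list so that no network access is needed. The mediaX file is not sketched because its format (Appendix A) is not part of this excerpt.

```python
from urllib.robotparser import RobotFileParser

def allowed_urls(robots_lines, user_agent, urls):
    """Return the URLs that the exclusion rules permit this agent to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules already downloaded from the site
    return [u for u in urls if parser.can_fetch(user_agent, u)]

robots = ["User-agent: *", "Disallow: /private/"]
urls = [
    "http://example.com/media/a.mov",
    "http://example.com/private/b.mov",
]
print(allowed_urls(robots, "media-crawler", urls))
```

In a live crawl the rules would first be fetched from the site's robots.txt location, for example with the parser's `set_url` and `read` methods.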
  • The crawler can be constrained to operate completely within a single parent URL. In this case, the user inputs a single URL corresponding to a single web site. The crawler will then only follow outgoing links which are relative to the base URL for the site. All absolute links will not be followed. By following only those links which are relative to the base URL, only those web pages which are within a single web site will be visited, resulting in a search and indexing pass of a single web site. This allows for the crawling and indexing of a single media-rich web site. Once a single web site has had an index created, then users may submit queries to find content located only at the web site of interest. This scheme will work for what is commonly referred to as “Intranet” sites, where a media-rich web site is located behind a corporate firewall, or for commercial web sites containing large multimedia datasets. [0038]
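The single-site constraint above amounts to a filter on outgoing links. A minimal sketch, assuming a hypothetical helper name and using the standard `urllib.parse` module to decide whether a link is relative:

```python
from urllib.parse import urljoin, urlparse

def links_to_follow(base_url, hrefs):
    """Keep only relative links, resolved against the base URL."""
    followed = []
    for href in hrefs:
        parts = urlparse(href)
        if parts.scheme or parts.netloc:
            continue  # absolute (or protocol-relative) link: not followed
        followed.append(urljoin(base_url, href))
    return followed

hrefs = [
    "clips/a.mov",
    "http://other.example/b.mov",
    "../about.html",
    "//cdn.example/c.mov",
]
print(links_to_follow("http://intranet.example/media/index.html", hrefs))
```

Note that this rule also skips absolute links that point back into the same site; that is consistent with the text, which states that all absolute links will not be followed.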
  • 2.1.1 Scan page for predetermined HTML tag types [0039]
  • Each HTML page is scanned for predetermined types of HTML tags, block 211. In this embodiment, the following tags are scanned for: [0040]
  • tables (single row and multi-row) [0041]
  • lists (ordered and unordered) [0042]
  • headings [0043]
  • JavaScript [0044]
  • client side image maps [0045]
  • server side image maps [0046]
  • header separators [0047]
  • 2.1.2 Determine if there is a media URL [0048]
  • Next, it is determined whether there is a media uniform resource locator (URL), block 212. If there is a media URL, then the media URL is located and stored. However, in the described embodiment, certain media URLs may be excluded. For example, an embodiment may choose not to index URLs having certain keywords in the URL, certain prefixes, certain suffixes or particular selected URLs. [0049]
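The exclusion rules mentioned above (keywords, prefixes, suffixes and selected URLs) could be expressed as a small predicate. The function name, parameter names and sample values below are hypothetical; the text does not specify which URLs an embodiment excludes.

```python
def is_excluded(url, keywords=(), prefixes=(), suffixes=(), blocked=()):
    """Return True if a media URL matches any of the exclusion rules."""
    return (
        url in blocked                          # particular selected URLs
        or any(k in url for k in keywords)      # keyword anywhere in the URL
        or url.startswith(tuple(prefixes))      # excluded prefixes
        or url.endswith(tuple(suffixes))        # excluded suffixes
    )


# Example rule set, chosen only for illustration.
skip = is_excluded(
    "http://example.com/ads/banner.gif",
    keywords=("/ads/",),
    suffixes=(".tmp",),
)
# skip is True because the URL contains the keyword "/ads/".
```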
  • 2.1.3 Locating relevant lexical information [0050]
  • Next, relevant lexical information (text) is selected for each URL. Often a web page which references a media file provides significant description of the media file as textual information on the web page. When indexing a media file, the present invention has recognized that it would be useful to utilize this textual information. However, certain web pages may reference only a single media file, while other web pages may reference a plurality of media files. In addition, certain lexical information on the web page may be more relevant than other information to categorizing the media for later searching. [0051]
  • It has been observed that relevant textual information may be directly surrounding the media reference on a web page, or it may be far from the media reference. However, it has been found that more often than not, the relevant text is very close (in lexical distance) to the media reference. Therefore, the following general rules are applied when associating lexical information with a media file: [0052]
  • 1) if the media file reference is found within a table, store the text within the table element as associated with the media file; [0053]
  • 2) if the media file reference is found within a list, store the text within the list element as associated with the media file; [0054]
  • 3) store the text in the heading as associated with the media file. In addition, in some embodiments, the text within higher level headings may also be stored. [0055]
  • 4) if there is javascript, store the text associated with the javascript tag; [0056]
  • 5) for client and server side image maps, if there is no relevant text, store only the URL. In addition, the image maps may be parsed to obtain all unique URLs and these may also be stored. [0057]
  • In some embodiments, a special tag may be stored within the indexed text where the media reference occurs in the web page. When queries are posed to the full-text database of the stored HTML pages which reference media, the distance of the keyword text from the media reference tag can be used to determine if there is a relevant match. The standard distance from media reference to matching keyword utilized is ten words in each direction outwards from the media reference. The word distance metric is called “lexical proximity”. For standard web pages where text surrounding media is generally relevant this is an appropriate value. [0058]
  • If the results of a search using lexical proximity are not satisfactory to a user, the user needs a mechanism by which to broaden or narrow the search, based on the relevance which is found by the default lexical proximity. Users can employ an expand and narrow search button to change the default lexical proximity. The expand function will produce more and more search results for a given query, as the lexical proximity value is increased. A typical expand function will increase the lexical proximity value by a factor of two each time it is selected. When the expand function is used, more text will be examined which is located near the media reference to see if there is a keyword match. Expanding the search repeatedly will decrease precision and increase recall. [0059]
  • The narrow search button will do the reverse, decreasing the lexical proximity value each time it is selected. A typical narrow function will decrease the lexical proximity value by a factor of two each time it is selected. The narrow search button will reduce the number of search results, and home in on the text which directly surrounds the media reference. Narrowing the search will increase precision and decrease recall. The relevance of the returned results should be quite high, on average, as a search is narrowed using this method. [0060]
  • When a database is limited in depth of entries, and is generated with a fixed lexical proximity value, a search query may often produce a search result list with zero hits. In order to increase the number of search results for the case of zero hits with fixed lexical proximity, a method is employed which will iterate on the lexical proximity value until a set of ten search results are returned. The algorithm is as follows: [0061]
  • perform the search query [0062]
  • look at the number of returned hits [0063]
  • if the number of returned hits is less than ten, then [0064]
  • perform a new search with the lexical proximity value doubled [0065]
  • continue the above process until ten search results are returned [0066]
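The iteration above can be sketched as follows. The toy index (a dictionary of word lists), the `<MEDIA>` marker and the `max_proximity` cap are assumptions; the cap is added only so that the loop terminates when the database holds fewer matches than requested, a case the text does not address.

```python
MEDIA_TAG = "<MEDIA>"  # hypothetical marker stored where a media reference occurs


def proximity_search(pages, keyword, proximity):
    """Return ids of pages where keyword lies within `proximity` words of a media tag."""
    hits = []
    for page_id, words in pages.items():
        for i, word in enumerate(words):
            if word != MEDIA_TAG:
                continue
            window = words[max(0, i - proximity): i + proximity + 1]
            if keyword in window:
                hits.append(page_id)
                break
    return hits


def search_with_expansion(pages, keyword, proximity=10, wanted=10, max_proximity=10240):
    """Double the lexical proximity until `wanted` hits are found or the cap is reached."""
    hits = proximity_search(pages, keyword, proximity)
    while len(hits) < wanted and proximity < max_proximity:
        proximity *= 2
        hits = proximity_search(pages, keyword, proximity)
    return hits, proximity


# The keyword sits 15 words before the media tag, so the default value of 10 misses it.
pages = {"page1": ["bond"] + ["filler"] * 14 + [MEDIA_TAG]}
hits, used = search_with_expansion(pages, "bond", wanted=1)
# hits == ["page1"], used == 20
```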
  • Users should be able to specify the usage of lexical proximity to enhance the indexing of their search material. For example, if the web page author knows that all words which are ten words in front of the media reference are valid and relevant, then the author should specify a lexical proximity value which is only negative ten (i.e., look only in the reverse direction from the media URL by ten words). If the web page author knows that all words which are ten words after the media reference are valid and relevant, then the author should specify a lexical proximity value which is only positive ten. Finally, if the web author knows that both ten words ahead, and ten words behind the media reference are relevant, then the lexical proximity value should be set to positive/negative ten. Similarly, if the web author knows that the entire page contains relevant text for a single media file, then the lexical proximity value should be set to include all text on a page as relevant. [0067]
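A directional lexical proximity setting can be modelled as two separate word counts, one for each side of the media reference. In this sketch the names `before` and `after` are assumptions: a value of "negative ten" corresponds to `before=10, after=0`, "positive ten" to `before=0, after=10`, and "positive/negative ten" to both set to ten.

```python
def words_near_media(words, media_index, before=10, after=10):
    """Return the words within the given distances of the media reference.

    words is the page text split into words; media_index is the position
    of the media reference within that list.
    """
    start = max(0, media_index - before)
    preceding = words[start:media_index]
    following = words[media_index + 1: media_index + 1 + after]
    return preceding + following


page = ["a", "b", "c", "<MEDIA>", "e", "f", "g"]
only_before = words_near_media(page, 3, before=2, after=0)   # ["b", "c"]
only_after = words_near_media(page, 3, before=0, after=2)    # ["e", "f"]
```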
  • In addition to the above-described processes for locating relevant lexical information, in the described embodiment, certain information is generally stored for all media URL's. In particular, the following information is stored: [0068]
  • the name of the media file [0069]
  • URL of the media file [0070]
  • text string which is associated with the media file anchor reference [0071]
  • title of the HTML document containing the media file [0072]
  • keywords associated with the HTML document [0073]
  • URL for the HTML document containing the media file reference [0074]
  • keywords embedded in the media file [0075]
  • textual annotations in the media file [0076]
  • script dialogue, closed captioning and lyric data in the media file [0077]
  • auxiliary data in the media file (copyright, author, producer, etc.) [0078]
  • auxiliary data located within the media reference in the HTML document [0079]
  • auxiliary data located in an associated media description file [0080]
  • 2.1.4 Streaming files [0081]
  • Media content of files may be stored as downloadable files or as streaming files. Downloadable content is indexed by connecting to an HTTP server, downloading the media file, and then analyzing the file for the purposes of building a media rich index. [0082]
  • In the case of streaming multimedia content, block 214, an HTTP server stores not the content itself but a reference to the media file. Therefore, the process of indexing such a file is not as straightforward as for a downloadable file which is stored on the HTTP server and may be downloaded from the server. [0083]
  • In the case of streaming media files certain information is gathered, block 215, as will be described with reference to FIG. 2C. [0084]
  • Below is described a method for indexing streaming files to index audio content and to index video content: [0085]
  • download the media file reference corresponding to the correct streaming media type [0086]
  • for each URL listed in the media file reference, perform the following operation: [0087]
  • connect directly to the media file on the media server where it resides, block 221 [0088]
  • commence streaming of the media on the appropriate TCP socket, block 222 [0089]
  • query the streaming media to obtain appropriate content attributes and header data, block 223 [0090]
  • add all relevant content attributes and header information into the media rich index, block 224 (header information to be queried and indexed includes title, author, copyright; in the case of a video media file, additional information indexed may also include duration, video resolution, frame rate, etc.) [0091]
  • determine if streaming text or synchronized multimedia information is included, block 225 [0092]
  • if it is, then stream the entire media clip, and index all text within the synchronized media track of the media file [0093]
  • if possible, store the time code for each block of text which occurs with the streaming media [0094]
  • This method can be applied to any streaming technology, including both streaming sound and video. The media data which is indexed includes information which is resident in the file header (e.g., title, author, copyright), and information which can be computed or analyzed based on the contents of the media file (e.g., sound volume level, video color and brightness, etc.). [0095]
  • The latter category of information includes content attributes which can be computed while the media is streaming, or after the media has completed streaming from a server. It should be noted that once the streaming media has been queried and the results have been received back from the server, the streaming process can conclude as the indexing is complete. [0096]
  • 2.2 Generate and weight a media index [0097]
  • As the network is crawled, a media index is generated by storing the information which has been discussed above in an index format. The media index is weighted to provide for increased accuracy in the searching capabilities. In the described embodiment, the weighting scheme applies a weighting factor to each of the following text items: [0098]
    ITEM                                                              WEIGHTING FACTOR
    URL of the media file                                             10
    Keywords embedded in the media file                               10
    Textual annotations in the media file                             10
    Script dialogue, lyrics, and closed captioning in the media file  10
    Text strings associated with the media file anchor reference      9
    Text surrounding the media file reference                         7
    Title of the HTML document containing the media file              6
    Keywords and meta-tags associated with the HTML document          6
    URL for the HTML document containing the media file reference     5
  • In other embodiments, alternative weighting factors may be utilized without departure from the present invention. [0099]
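As an illustration of how such weighting factors might be used at query time, the sketch below sums the factor of every text item in which a keyword appears. The field names and the additive combination are assumptions; the text gives the factors but does not state how they are combined into a score.

```python
# Weighting factors from the described embodiment, keyed by hypothetical field names.
WEIGHTS = {
    "media_url": 10,
    "embedded_keywords": 10,
    "annotations": 10,
    "dialogue_lyrics_captions": 10,
    "anchor_text": 9,
    "surrounding_text": 7,
    "html_title": 6,
    "html_keywords_meta": 6,
    "html_url": 5,
}


def score(record, keyword):
    """Sum the weights of all fields of an index record that contain the keyword."""
    kw = keyword.lower()
    return sum(
        weight
        for field, weight in WEIGHTS.items()
        if kw in record.get(field, "").lower()
    )


record = {
    "media_url": "http://example.com/bond.mov",
    "html_title": "James Bond trailers",
}
bond_score = score(record, "bond")  # 10 (media URL) + 6 (HTML title) = 16
```

Under this scheme a match in the media URL outranks a match that occurs only in the title of the referring page.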
  • 2.3 Store data for each media object [0100]
  • Finally, data is stored for each media object. In the described embodiment, the following data is stored: [0101]
  • Relevant text [0102]
  • HTML document title [0103]
  • HTML meta tags [0104]
  • Media specific text (e.g., closed captioning, annotations, etc.) [0105]
  • Media URL [0106]
  • Anchor text [0107]
  • Content previews (discussed below) [0108]
  • Content attributes (such as brightness, color or B&W, contrast, speech v. music and volume level; in addition, sampling rate, frame rate, number of tracks, data rate and size may be stored) [0109]
  • Of course, in alternative embodiments a subset or superset of these fields may be used. [0110]
  • 3.0 Content analysis [0111]
  • As was briefly mentioned above, it is desirable to not only search the lexical content surrounding a media file, but also to search the content of the media file itself in order to provide a more meaningful database of information to search. [0112]
  • As was shown in FIG. 1, the present invention is generally concerned with indexing two types of media files (i) audio 102 and (ii) video 103. [0113]
  • 3.1 Video Content [0114]
  • The present invention discloses an algorithm used to predict the likelihood that a given video file contains a low, medium or high degree of motion. In the described embodiment, the likelihood is computed as a single scalar value, which maps into one of N buckets of classification. The value associated with the motion likelihood is called the “motion” metric. A method for determining and classifying the brightness, contrast and color of the same video signal is also described. The combination of the motion metric with brightness, contrast and color estimates enhances the ability of users to locate a specific piece of digital video. [0115]
  • Once a motion estimate and brightness, contrast and color estimate exist for all video files located in an index of multimedia content, it is possible for users to execute search queries such as: [0116]
  • “find me all action packed videos” [0117]
  • “find me all dramas and talk shows” [0118]
  • If the digital video information is indexed in a database together with auxiliary text-based information, then it is possible to execute queries such as: [0119]
  • “find me all action packed videos of James Bond from 1967” [0120]
  • “find me all talk shows with Bill Clinton and Larry King from 1993” [0121]
  • Combining motion with other associated video file parameters, users can execute queries such as: [0122]
  • “find me all slow moving, black and white movies made by Martin Scorsese” [0123]
  • “find me all dark action movies filmed in Zimbabwe” [0124]
  • The described method for estimating motion content and brightness, contrast and color can be used together with the described algorithm for searching the worldwide Internet in order to index and intelligently tag digital multimedia content. The described method allows for powerful searching based on information signals stored inside the content within very large multimedia databases. Once an index of multimedia information exists which includes a motion metric and brightness, contrast and color estimate, users can perform field based sorting of multimedia databases. For example, a user could execute the query: find me all video, from slow moving to fast, by Steven Spielberg, and the database engine would return a list of search results, ordered from slowest to fastest within the requested motion range. In addition, if the digital video file is associated with a digital audio sequence, then an analysis of the digital audio can occur. An analysis of digital audio could determine if the audio is either music or speech. It can also determine if the speaker is male or female, and other information. This type of information could then be used to allow a user query such as: [0125]
  • “find me all fast video clips which contain loud music”; [0126]
  • “find me all action packed movies starring Sylvester Stallone and show me a preview of a portion of the movie where Stallone is talking”. [0127]
  • This type of powerful searching of content will become increasingly important, as vast quantities of multimedia information become digitized and moved onto digital networks which are accessible to large numbers of consumer and business users. [0128]
  • The described method, in its preferred embodiment, is relatively fast to compute. Historically, most systems for analyzing video signals have operated in the frequency domain. Frequency domain processing, although potentially more accurate than image based analysis, has the disadvantage of being compute intensive, making it difficult to scan and index a network for multimedia information in a rapid manner. [0129]
  • The described approach of low-cost computation applied to an analysis of motion and brightness, contrast and color has been found to be useful for rapid indexing of large quantities of digital video information when building searchable multimedia databases. Coupled with low-cost computation is the fact that most video files on large distributed networks (such as the Internet) are generally of limited duration. Hence the algorithms described herein can typically be applied to short duration video files in such a way that they can be represented as a single scalar value. This simplifies presentation to the user. [0130]
  • In addition to the image space method described here, an algorithm is presented which works on digital video (such as MPEG) which has already been transformed into a frequency domain representation. In this case, the processing can be done solely by analyzing the frequency domain and motion vector data, without needing to perform the computation moving the images into frequency space. [0131]
  • 3.1.1 Degree of Motion Algorithm Details (Image Space) [0132]
  • In order to determine if a given video file contains low, medium or high amounts of motion, it is disclosed to derive a single valued scalar which represents the video data file to a reasonable degree of accuracy. The scalar value, called the motion metric, is an estimate of the type of content found in the video file. The method described here is appropriate for those video files which may be in a variety of different coding formats (such as Vector Quantization, Block Truncation Coding, Intraframe DCT coded), and need to be analyzed in a uniform uncompressed representation. In fact, it is disclosed to decode the video into a uniform representation, since it may be coded in either an intraframe or an interframe coded format. If the video has been coded as intraframe, then the method described here is a scheme for determining the average frame difference for a pixel in a sequence of video. Likewise, for interframe coded sequences, the same metric is determined. This is desirable, even though the interframe coded video has some information about frame to frame differences. The reason that the interframe coded video is uncompressed and then analyzed is that different coding schemes produce different types of interframe patterns which may be non-uniform. The disclosed invention is based on three discoveries: [0133]
  • time periods can be compressed into buckets which average visual change activity [0134]
  • the averaged rate of change of image activity gives an indication of overall change [0135]
  • an indication of overall change rate is correlated with types of video content [0136]
  • The indication of overall change has been found to be highly correlated with the type of video information stored in a video file. It has been found through empirical examination that [0137]
  • slow moving video is typically comprised of small frame differences [0138]
  • moderate motion video is typically comprised of medium frame differences [0139]
  • fast moving video is typically comprised of large frame differences [0140]
  • and that, [0141]
  • video content such as talking heads and talk shows are comprised of slow moving video [0142]
  • video content such as newscasts and commercials are comprised of moderate speed video [0143]
  • video content such as sports and action films are comprised of fast moving video [0144]
  • The disclosed method operates generally by accessing a multimedia file and evaluating the video data to determine the visual change activity. An algorithm to compute the motion metric operates as follows: [0145]
  • A. Motion Estimator [0146]
  • if the number of samples N exceeds a threshold T, then repeat the Motion Estimator algorithm below for a set of time periods P=N/T. The value Z computed for each period P is then listed in a table of values. [0147]
  • as an optional preprocessing step, employ an adaptive noise reduction algorithm to remove noise. Apply either a flat field (mean), or stray pixel (median) filter to reduce mild and severe noise respectively. [0148]
  • if the video file contains RGB samples, then run the algorithm and average the results into a single scalar value to represent the entire sequence [0149]
  • B. Motion Estimator [0150]
  • determine a fixed sampling grid in time consisting of X video frames [0151]
  • if video samples are compressed, then decompress the samples [0152]
  • decompress all video samples into a uniform decoded representation [0153]
  • adjust RGB for contrast (low/med/high) [0154]
  • compute the RGB frame differences for each frame X with its nearest neighbor [0155]
  • sum up all RGB frame differences for each pixel in each frame X [0156]
  • compute the average RGB frame difference for each pixel for each frame X [0157]
  • sum and then average RGB frame differences for all pixels in all frames in a sequence. [0158]
  • the resulting value is the motion metric Z. The motion metric Z is normalized by taking Z-NORMAL=Z * (REF-VAL/MAX-DIFFERENCE) where MAX-DIFFERENCE is the maximum difference for all frames. [0159]
  • map the value Z into one of four categories [0160]
  • low degree of motion [0161]
  • moderate degree of motion [0162]
  • high degree of motion [0163]
  • very high degree of motion [0164]
  • Using a typical RGB range of 0-255, the categories for the scalar Z map to: [0165]
  • 0-20, motion content, low [0166]
  • 20-40, motion content, moderate [0167]
  • 40-60, motion content, high [0168]
  • 60 and above, motion content, very high [0169]
  • A specific example, using actual values, is as follows: [0170]
  • number of video frames X=1000 [0171]
  • sample size is 8 bits per pixel, 24 bits for RGB [0172]
  • average frame difference per frame is 15 [0173]
  • the sequence is characterized as low motion [0174]
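The image-space steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions: frames are already decoded into flat lists of `(r, g, b)` tuples, the frame difference is the absolute per-channel difference averaged over the three channels, the optional noise reduction, contrast adjustment and normalization steps are omitted, and the category boundaries are treated as half-open ranges.

```python
def motion_metric(frames):
    """Average per-pixel RGB difference between each frame and the next one.

    frames is a list of decoded frames, each a flat list of (r, g, b) tuples
    with channel values in the range 0-255.
    """
    total = 0.0
    count = 0
    for prev, cur in zip(frames, frames[1:]):
        for (r0, g0, b0), (r1, g1, b1) in zip(prev, cur):
            total += (abs(r1 - r0) + abs(g1 - g0) + abs(b1 - b0)) / 3
            count += 1
    return total / count if count else 0.0


def classify_motion(z):
    """Map the motion metric Z onto the categories listed above."""
    if z < 20:
        return "low"
    if z < 40:
        return "moderate"
    if z < 60:
        return "high"
    return "very high"


# Two tiny 4-pixel frames whose pixels each change by 15 on every channel.
frames = [[(0, 0, 0)] * 4, [(15, 15, 15)] * 4]
z = motion_metric(frames)      # 15.0
label = classify_motion(z)     # "low", matching the example above
```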
  • Note that when the number of video frames exceeds the threshold T, then the percentage of each type of motion metric category is displayed. For example, for a video sequence which is one hour long, which may consist of different periods of low, moderate and high motion, the resulting characterization of the video file would appear as follows: [0175]
  • 40%, motion content low [0176]
  • 10%, motion content moderate [0177]
  • 50%, motion content high [0178]
  • Once the degree of motion has been computed, it is stored in the index of a multimedia database. This facilitates user queries and searches based on the degree of motion for a sequence, including the ability to provide field based sorting of video clips based on motion estimates. [0179]
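For long sequences, the per-period values of Z can be summarized as a percentage for each category, as in the one-hour example above. The sketch below assumes the metric has already been computed for each period; the function names and the half-open category boundaries are illustrative choices.

```python
from collections import Counter


def classify(z):
    """Map a motion metric value onto a category (boundaries assumed half-open)."""
    if z < 20:
        return "low"
    if z < 40:
        return "moderate"
    if z < 60:
        return "high"
    return "very high"


def motion_profile(period_metrics):
    """Return the percentage of time periods that fall in each motion category."""
    labels = [classify(z) for z in period_metrics]
    counts = Counter(labels)
    return {label: 100 * n / len(labels) for label, n in counts.items()}


# Ten periods: four low, one moderate, five high.
profile = motion_profile([10, 10, 10, 10, 30, 50, 50, 50, 50, 50])
# profile == {"low": 40.0, "moderate": 10.0, "high": 50.0}
```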
  • 3.1.2 Degree of Motion Algorithm Details (Frequency Domain) [0180]
  • The method described above is appropriate for those video files which may be in a variety of different coding formats (such as Vector Quantization, Block Truncation Coding, Intraframe DCT coded), and need to be analyzed in a uniform uncompressed representation. The coded representation is decoded and then an analysis is applied in the image space domain on the uncompressed pixel samples. However, some coding formats (such as MPEG) already exist in the frequency domain and can provide useful information regarding motion, without a need to decode the digital video sequence and perform frame differencing averages. In the case of a coding scheme such as MPEG, the data in its native form already contains estimates of motion implicitly (indeed, the representation itself is called motion estimation). The method described here uses the motion estimation data to derive an estimate of motion for a full sequence of video in a computationally efficient manner. [0181]
  • In order to determine if a given video file contains low, medium or high amounts of motion, it is necessary to derive a single valued scalar which represents the video data file to a reasonable degree of accuracy. The scalar value, called the motion metric, is an estimate of the type of content found in the video file. The idea, when applied to MPEG coded sequences, is based on four key principles: [0182]
  • the MPEG coded data contains both motion vectors and motion vector lengths [0183]
  • the number of non-zero motion vectors is a measure of how many image blocks are moving [0184]
  • the length of motion vectors is a measure of how far image blocks are moving [0185]
  • averaging the number and length of motion vectors per frame indicates degrees of motion [0186]
  • The indication of overall motion has been found to be correlated with the type of video information stored in a video file. It has been found through empirical examination that [0187]
  • slow moving video is comprised of few motion vectors and small vector lengths [0188]
  • moderate video is comprised of moderate motion vectors and moderate vector lengths [0189]
  • fast moving video is comprised of many motion vectors and large vector lengths [0190]
  • and that, [0191]
  • video content such as talking heads and talk shows are comprised of slow moving video [0192]
  • video content such as newscasts and commercials are comprised of moderate speed video [0193]
  • video content such as sports and action films are comprised of fast moving video [0194]
  • An algorithm to compute the motion metric may operate as follows: [0195]
  • Motion Estimator (Frequency Domain) [0196]
  • if the number of frames N exceeds a threshold T, then repeat the Motion Estimator algorithm below for a set of time periods P=N/T. The value Z computed for each period P is then listed in a table of values. [0197]
  • Motion Estimator Algorithm [0198]
  • determine a fixed sampling grid in time consisting of X video frames [0199]
  • determine the total number of non-zero motion vectors for each video frame [0200]
  • determine the average number of non-zero motion vectors per coded block [0201]
  • determine the average length of motion vectors per coded block [0202]
  • sum and average the number of non-zero motion vectors per block in a sequence as A [0203]
  • sum and average the length of non-zero motion vectors per block in a sequence as B [0204]
  • compute a weighted average of the two averaged values as Z=W1 * A+W2 * B [0205]
  • the resulting value is the motion metric Z [0206]
  • map the value Z into one of four categories [0207]
  • low degree of motion [0208]
  • moderate degree of motion [0209]
  • high degree of motion [0210]
  • very high degree of motion [0211]
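The frequency-domain computation Z=W1 * A+W2 * B can be sketched as below. The input layout (one `(dx, dy)` motion vector per coded block, already extracted from the bitstream) and the default weights of 0.5 are assumptions; the text does not give values for W1 and W2 or the thresholds used to map this Z onto categories, so no mapping is shown.

```python
import math


def frequency_domain_metric(frames, w1=0.5, w2=0.5):
    """Weighted average of motion vector count and length per coded block.

    frames is a non-empty list of frames; each frame is a non-empty list of
    (dx, dy) motion vectors, one per coded block, with (0, 0) meaning no motion.
    """
    counts_per_block = []
    lengths_per_block = []
    for vectors in frames:
        nonzero = [(dx, dy) for dx, dy in vectors if (dx, dy) != (0, 0)]
        counts_per_block.append(len(nonzero) / len(vectors))
        lengths_per_block.append(
            sum(math.hypot(dx, dy) for dx, dy in nonzero) / len(vectors)
        )
    a = sum(counts_per_block) / len(frames)    # average number of non-zero vectors
    b = sum(lengths_per_block) / len(frames)   # average vector length
    return w1 * a + w2 * b


# Two frames of two blocks each; only one block moves, by a vector of length 5.
z = frequency_domain_metric([[(3, 4), (0, 0)], [(0, 0), (0, 0)]])
# A = 0.25, B = 1.25, so Z = 0.5 * 0.25 + 0.5 * 1.25 = 0.75
```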
  • Note that when the number of video frames exceeds the threshold T, then the percentage of each type of motion metric category is displayed. For example, for a video sequence which is one hour long, which may consist of different periods of low, moderate and high motion, the resulting characterization of the video file would appear as follows: [0212]
  • 40%, motion content low [0213]
  • 10%, motion content moderate [0214]
  • 50%, motion content high [0215]
  • 3.1.3 Brightness, Contrast and Color Algorithm Details [0216]
  • In order to determine if a given video file contains dark, moderate or bright intensities, it is necessary to derive a single valued scalar which represents the brightness information in the video data file to a reasonable degree of accuracy. The scalar value, called the brightness metric, is an estimate of the brightness of content found in the video file. The idea is based on two key principles: [0217]
  • time periods can be compressed into buckets which average brightness activity [0218]
  • the buckets can be averaged to derive an overall estimate of brightness level [0219]
  • By computing the luminance term for every pixel in a frame, and then for all frames in a sequence, and averaging this value, we end up with an average luminance for a sequence. [0220]
  • The same method above can be applied to determining a metric for contrast and color, resulting in a scalar value which represents an average contrast and color for a sequence. [0221]
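A minimal sketch of the average-luminance computation follows. The text does not specify how the luminance term is derived from RGB; the ITU-R BT.601 luma weights used here are an assumption, as is the input layout of frames as flat lists of `(r, g, b)` tuples.

```python
def brightness_metric(frames):
    """Average luminance over every pixel of every frame in a sequence.

    Luminance is approximated with the BT.601 weights (an assumption):
    Y = 0.299 R + 0.587 G + 0.114 B.
    """
    total = 0.0
    count = 0
    for frame in frames:
        for r, g, b in frame:
            total += 0.299 * r + 0.587 * g + 0.114 * b
            count += 1
    return total / count if count else 0.0


dark = brightness_metric([[(0, 0, 0)] * 4])
bright = brightness_metric([[(255, 255, 255)] * 4])
# dark is 0.0 and bright is approximately 255
```

The resulting scalar could then be bucketed into dark, moderate or bright in the same way that the motion metric is bucketed into categories.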
  • 3.1.4 Search Results Display [0222]
  • Once the motion and brightness level estimates have been determined, the values are displayed to the user in tabular or graphical form. The tabular format would appear as shown below: [0223]
  • Degree of motion: high [0224]
  • Video intensity: bright [0225]
  • The end result is a simple display of two pieces of textual information. This information is very low bandwidth, and yet encapsulates extensive processing and computation on the data set. As a result, users can find the multimedia information more quickly. [0226]
  • 3.2 Audio Content [0227]
  • Before reviewing an algorithm used by the disclosed embodiment for analyzing audio files in detail, it is worthwhile to briefly turn to FIG. 3A which provides an overview of the process. A digital audio file is initially analyzed 301 and an initial determination is made whether the file is speech 307 or music 302. If the file is determined to be music, in one embodiment, if the file is “noisy”, a noise reduction filter may be applied and the analysis repeated 303. This is because a noisy speech file may be misinterpreted as music. If the file is music, an analysis may be done to determine if the music is fast or slow 304 and an analysis may be done to determine if the music is bass or treble 305 based on a pitch analysis. In the case of speech, an analysis might be done to determine if the speech 308 is fast or slow based on frequency and whether it is male or female 309 based on pitch. By way of example, knowing that a portion of an audio track for a movie starring Sylvester Stallone has a fast, male voice may be interpreted by retrieval software as indicating that portion of the audio track is an action scene involving Sylvester Stallone. In addition, in certain embodiments, it may be desirable to perform voice recognition analysis to recognize the voice into text 310. In some embodiments, the voice recognition capability may be limited to only recognizing a known voice, while in other more advanced embodiments, omni-voice recognition capability may be added. In either event, the recognized text may be added to the stored information for the media file and be used for searching and retrieval. [0228]
  • 3.2.1 Computation of a music-speech metric [0229]
  • In order to determine if a given audio file contains music, speech, or a combination of both types of audio, it is disclosed in one embodiment to derive a single valued scalar which represents the audio data file to a reasonable degree of accuracy. The scalar value, called the music-speech metric, is an estimate of the type of content found in the audio file. The idea is based on three key principles: [0230]
  • time periods can be compressed into buckets which average amplitude activity [0231]
  • the averaged rate of change of amplitude activity gives an indication of overall change [0232]
  • an indication of overall amplitude change rate is correlated with types of audio content [0233]
  • The indication of overall change has been found to be highly correlated with the type of audio information stored in an audio file. It has been found through empirical examination that [0234]
  • music is typically comprised of a continuous amplitude signal [0235]
  • speech is typically comprised of a discontinuous amplitude signal [0236]
  • sound effects are typically comprised of a discontinuous amplitude signal [0237]
  • and that, [0238]
  • music signals are typically found to have low rates of change in amplitude activity [0239]
  • speech signals are typically found to have high rates of change in amplitude activity [0240]
  • sound effects are typically found to have high rates of change in amplitude activity [0241]
  • audio comprised of music and speech has moderate rates of change in amplitude activity [0242]
  • Continuous signals are characterized by low rates of change. Various types of music, including rock, classical and jazz are often relatively continuous in nature with respect to the amplitude signal. Rarely does music jump from a minimum to a maximum amplitude. This is illustrated by FIG. 3C which illustrates a typical amplitude signal 330 for music. [0243]
  • Similarly, it is rare that speech results in a continuous amplitude signal with only small changes in amplitude. Discontinuous signals are characterized by high rates of change. For speech, there are often bursty periods of large amplitude interspersed with extended periods of silence of low amplitude. This is illustrated by FIG. 3B which illustrates a typical amplitude signal 320 for speech. [0244]
  • Sometimes speech will be interspersed with music, for example if there is talk over a song. This is illustrated by FIG. 3D which illustrates signal 340 having period 341 which would be interpreted as music, period 342 which would be speech, period 343 music, period 344 speech, period 345 music and period 346 speech. [0245]
  • For sound effects, there are often bursty periods of large amplitude interspersed with bursty periods of low amplitude. [0246]
  • Turning now to FIG. 3E, if the audio file is a compressed file (which may be in any of a number of known compression formats), it is first decompressed using any known decompression algorithm, block 351. An amplitude analysis is then performed on the audio track to provide a music-speech metric value. The amplitude analysis is performed as follows: [0247]
  • The audio track is divided into time segments of a predetermined length, block 352. In the described embodiment, each time segment is 50 ms. However, in alternate embodiments, the time segments may be of a greater or lesser length. [0248]
  • For each segment, a normalized amplitude deviation is computed, block 356. This is described in greater detail with reference to FIG. 3F. First, for each time segment, the maximum amplitude and minimum amplitude are determined, block 351. In the example of FIG. 3B, values range from 0 to 256 (in an alternative embodiment, the values may be based on floating point calculations and may range from 0 to 1.0). For the first interval 321, the maximum amplitude value is shown as 160, for the second interval 322, it is 158 and for the third interval 323, it is 156. Then, the average maximum amplitude and average minimum amplitude are computed for all time intervals, block 352. Again, using the example in FIG. 3B, the average maximum amplitude will be 158. Next, a value MAX-DEV is computed for each interval as the absolute value of the maximum amplitude for the interval minus the average maximum, block 353. For the first interval of FIG. 3B, the MAX-DEV will be 2, for the second interval, it will be 0 and for the third interval, it will be 2. Finally, the MAX-DEV is normalized by computing MAX-DEV * (REF-VALUE/MAX) where the reference value is 256 in the described embodiment (and may be 1.0 in a floating point embodiment) and MAX is the maximum amplitude for all of the intervals. Thus, for the first interval, the normalized value for MAX-DEV will be 2 * (256/160) = 3.2. Normalizing the deviation value removes dependencies based on volume differences in the audio files and allows for comparison of files recorded at different volumes. [0249]
  • Finally, the normalized MAX-DEV values for each segment are averaged together, block 357, to determine a music-speech metric. High values tend to indicate speech, low values tend to indicate music and medium values tend to indicate a combination, block 358. [0250]
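The amplitude analysis above can be sketched in a few lines of Python. This is only an illustrative sketch, not the embodiment's code: the function name `music_speech_metric` and its parameters are hypothetical, samples are assumed to be non-negative amplitudes (for example 0 to 256), and only the maximum amplitude per segment is used, as in the worked example.

```python
def music_speech_metric(samples, sample_rate, segment_ms=50, ref_value=256.0):
    """Average of the normalized MAX-DEV values over fixed-length segments.

    Hypothetical helper: `samples` is a non-empty list of non-negative
    amplitudes and `sample_rate` is in samples per second.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    seg_len = max(1, int(sample_rate * segment_ms / 1000))
    # Maximum amplitude of each time segment.
    seg_max = [max(samples[i:i + seg_len])
               for i in range(0, len(samples), seg_len)]
    overall_max = max(seg_max)  # MAX over all intervals
    if overall_max == 0:
        return 0.0  # silent track: nothing deviates
    avg_max = sum(seg_max) / len(seg_max)
    # MAX-DEV per segment, normalized as MAX-DEV * (REF-VALUE / MAX).
    norm_dev = [abs(m - avg_max) * (ref_value / overall_max) for m in seg_max]
    return sum(norm_dev) / len(norm_dev)
```

With segment maxima of 160, 158 and 156, the per-segment normalized deviations are 3.2, 0 and 3.2, so the metric is their mean. Thresholds separating "low", "medium" and "high" values are not given in the text and would have to be chosen empirically.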
  • It should be noted that, for efficiency, only a portion of the audio file may be analyzed. For example, N seconds of the audio file may be randomly chosen for analysis. Also, if the audio file contains stereo or quadraphonic samples, then the algorithm described above may be run on each channel, and the results averaged into a single scalar value to represent the entire sequence. [0251]
  • Note also that when the number of samples exceeds the threshold T, then the percentage of each type of music-speech metric category may be computed and displayed. For example, for a soundtrack which is one hour long, which may consist of different periods of silence, music, speech and sound effects, the resulting characterization of the audio file would appear as follows: [0252]
  • 40%, music content: high, speech content: low [0253]
  • 10%, music content: high, speech content: medium [0254]
  • 10%, music content: medium, speech content: medium [0255]
  • 10%, music content: medium, speech content: high [0256]
  • 30%, music content: low, speech content: high [0257]
  • 3.2.2 Volume Algorithm Details [0258]
  • In order to determine if a given audio file contains quiet, soft or loud audio information, it is disclosed to derive a single valued scalar which represents the volume information in the audio data file to a reasonable degree of accuracy. The scalar value, called the volume level metric, is an estimate of the volume of content found in the audio file. The idea is based on two key principles: [0259]
  • time periods can be compressed into buckets which average volume activity [0260]
  • the buckets can be averaged to derive an overall estimate of volume level [0261]
  • In general, the disclosed algorithm provides for determining the volume level of data in an audio file by evaluating the average amplitude for a set of sampled signals. In particular, the disclosed algorithm comprises the steps of: [0262]
  • if the number of samples N exceeds a threshold T, then repeat the Volume Audio Channel Estimator algorithm, below, for a set of time periods P=N/T. The value Z computed for each period P is then listed in a table of values. [0263]
  • if the audio file contains mono samples, then run the algorithm on a single channel [0264]
  • if the audio file contains stereo samples, then run the algorithm on each channel, and average the results into a single scalar value to represent the entire sequence [0265]
  • if the audio file contains quadraphonic samples, then run the algorithm on each channel, and average the results into a single scalar value to represent the entire sequence [0266]
  • The algorithm used by the described embodiment for volume estimation is then given by FIG. 3G as follows: [0267]
  • if audio samples are compressed, then decompress the samples into a uniform PCM coded representation, block 361. [0268]
  • The audio track is mapped into X time segment buckets, block 362. [0269]
  • determine the total number of audio samples N, block 366. The samples are mapped into time segment buckets, block 367. The mapping is such that a single bucket represents N/X samples of sound, and the N/X samples are called a compressed time sample C. [0270]
  • compute the average amplitude value for each bucket X, block 368, by summing up all amplitude values within C and dividing by the number of samples in C to obtain an average amplitude. [0271]
  • compute the average amplitude A for all X buckets, block 369 [0272]
  • the resulting value is volume estimate A [0273]
  • map the value A into one of five categories: [0274]
  • quiet [0275]
  • soft [0276]
  • moderate [0277]
  • loud [0278]
  • very loud [0279]
  • Using a typical maximum amplitude excursion of 100, the categories for A map to: [0280]
  • 0-50, quiet [0281]
  • 50-70, soft [0282]
  • 70-80, moderate [0283]
  • 80-100, loud [0284]
  • 100 and above, very loud [0285]
  • It will be apparent to one skilled in the art that alternate “bucket sizes” can be used and the mapping may be varied from the mapping presented in the disclosed algorithm without departure from the spirit and scope of the invention. [0286]
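The volume estimation steps can be illustrated with the following sketch. The function names are hypothetical, samples are assumed to be decoded PCM amplitudes on the 0 to 100 scale used in the mapping, and the handling of boundary values (50, 70, 80, 100) is an assumption because the listed ranges share their endpoints.

```python
def volume_estimate(samples, num_buckets):
    """Average the per-bucket average amplitudes into a single value A."""
    n = len(samples)
    bucket_size = max(1, n // num_buckets)  # C = N/X samples per bucket
    bucket_avgs = []
    for start in range(0, n, bucket_size):
        chunk = samples[start:start + bucket_size]
        # Average amplitude within this compressed time sample.
        bucket_avgs.append(sum(abs(s) for s in chunk) / len(chunk))
    return sum(bucket_avgs) / len(bucket_avgs)


def volume_category(a):
    """Map A to a category; boundary values go to the louder category (assumed)."""
    if a < 50:
        return "quiet"
    if a < 70:
        return "soft"
    if a < 80:
        return "moderate"
    if a < 100:
        return "loud"
    return "very loud"
```

For a multi-channel file, one plausible use is to call `volume_estimate` once per channel and average the results before mapping to a category.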
  • When the number of samples exceeds the threshold T, then the percentage of each type of volume category is displayed. For example, for a soundtrack which is one hour long, which may consist of different periods of silence, loudness, softness and moderate sound levels, the resulting characterization of the audio file would appear as follows: [0287]
  • 30%, quiet [0288]
  • 20%, soft [0289]
  • 5%, moderate [0290]
  • 10%, loud [0291]
  • 35%, very loud [0292]
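A per-period breakdown like the one above could be produced by categorizing each time period and counting the labels. The sketch below assumes the periods have already been labeled; `category_percentages` is a hypothetical helper name.

```python
from collections import Counter


def category_percentages(labels):
    """Return the percentage of periods that fall into each category."""
    counts = Counter(labels)
    total = len(labels)
    return {label: 100 * count / total for label, count in counts.items()}
```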
  • 3.2.3 Search Results Display [0293]
  • Once the music-speech and volume level estimates have been determined, the values are displayed to the user in tabular or graphical form. The tabular format may appear as shown below: [0294]
  • Music content: high [0295]
  • Speech content: low [0296]
  • Volume level: loud [0297]
  • The end result is a simple display of three pieces of textual information. This information is very low bandwidth, and yet encapsulates extensive processing and computation on the data set. As a result, users can more quickly find the multimedia information they are looking for. [0298]
  • 3.2.4 Waveform display [0299]
  • A focus of the method described herein is to generate a visual display of audio information which can help a user determine if an audio file contains the audio content they are looking for. This method complements the other types of useful information which can be computed and/or extracted from digital audio files; the combination of context and content analysis, together with graphical display of content data, results in a useful composite snapshot of a piece of digital media information. [0300]
  • As users need to sift through large quantities of music, sound effects and speeches (on large distributed networks such as the Internet) it will be useful to process the audio signals to enhance the ability to distinguish one audio file from another. The use of only keyword based searching for media content will prove to be increasingly less useful than a true analysis and display of the media signal. [0301]
  • The algorithm described herein is used to display a time compressed representation of an audio signal. The method is focused on providing some high level features visually of the time varying sound signal. The method described can allow users to: [0302]
  • differentiate visually between music and speech [0303]
  • observe periods of silence interspersed with loud or soft music/speech [0304]
  • observe significant changes in volume level [0305]
  • identify extended periods in an audio track where volume level is very low or high [0306]
  • Using a multimedia search engine it is possible for users to execute a query such as: [0307]
  • “find me all soft music by Beethoven from the seventeenth century”
  • The results returned might be a set of fifty musical pieces by Beethoven. If the searcher knows that the piece of music they are looking for has a very quiet part towards the end of the piece, the user could view the graphical representation and potentially find the quiet part by seeing the waveform display illustrate a volume decrease towards the end of the waveform image. This could save the searcher great amounts of time that would have been required to listen to all fifty pieces of music. [0308]
  • Using a multimedia search engine it is possible for users to execute a query such as: [0309]
  • “find me all loud speeches by Martin Luther King”
  • A searcher might be looking for a speech by Martin Luther King, where the speech starts out with him yelling loudly, and then speaking in a normal tone of voice. If twenty speeches are returned from the search engine results, then the searcher could visually scan the results and look for a waveform display which shows high volume at the beginning and then levels off within the first portion of the audio signal. This type of visual identification could save the searcher great amounts of time which would be required to listen to all twenty speeches. [0310]
  • Continuous signals are characterized by low rates of change. Various types of music, including rock, classical and jazz are often relatively continuous in nature with respect to the amplitude signal. Rarely does music jump from a minimum to a maximum amplitude. Similarly, it is rare that speech results in a continuous amplitude signal with only small changes in amplitude. Discontinuous signals are characterized by high rates of change. For speech, there are often bursty periods of large amplitude interspersed with extended periods of silence of low amplitude. For sound effects, there are often bursty periods of large amplitude interspersed with bursty periods of low amplitude. These trends can often be identified computationally, or visually, or using both methods. A method is illustrated here which derives a visual representation of sound in a temporally compressed format. The goal is to illustrate long term trends in the audio signal which will be useful to a user when searching digital multimedia content. Note that the method produces visual images of constant horizontal resolution, independent of the duration in seconds. This means that temporal compression must occur to varying degrees while still maintaining a useful representation of long term amplitude trends within a limited area of screen display. [0311]
  • An algorithm, as used by the described embodiment, to compute and display the waveform operates as follows: [0312]
  • A. Waveform Display [0313]
  • if the number of samples N exceeds a threshold T, then repeat the Waveform Display algorithm below for a set of time periods P=N/T. A different waveform is computed for each time period. [0314]
  • if the audio file contains mono samples, then run the algorithm on a single channel [0315]
  • if the audio file contains stereo samples, then run the algorithm on each channel, and display the results for each channel [0316]
  • if the audio file contains quadraphonic samples, then run the algorithm on each channel, and display the results for each channel [0317]
  • B. Waveform Display Algorithm [0318]
  • determine a fixed sampling grid in time consisting of X buckets [0319]
  • if audio samples are compressed, then decompress the samples [0320]
  • decompress all audio samples into a uniform PCM coded representation [0321]
  • determine the total number of audio samples N [0322]
  • determine the number of samples which get mapped into a single bucket [0323]
  • the mapping is that a single bucket represents N/X samples of sound [0324]
  • the N/X samples term is called a compressed time sample C [0325]
  • compute the minimum, maximum and average amplitude value for each bucket X [0326]
  • display an RGB interpolated line from the minimum to the maximum amplitude [0327]
  • the line passes through the average amplitude [0328]
  • red represents maximum amplitude [0329]
  • green represents average amplitude [0330]
  • blue represents minimum amplitude [0331]
  • the interpolation occurs using integer arithmetic [0332]
  • the line is rendered vertically from top to bottom within each bucket X [0333]
  • compress the resulting waveform using a DCT based compression scheme (or alternate) [0334]
  • Note that when the number of samples exceeds the threshold T, then a series of waveform displays are computed. For example, for a soundtrack which is one hour long, which may consist of different periods of silence, music, speech and sound effects, the resulting waveform display characterization would need to be broken up into segments and displayed separately. The ability to scan through these displays would then be under user control. [0335]
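The per-bucket statistics and the RGB interpolated line of the Waveform Display Algorithm can be sketched as follows. This is an illustrative reading of the steps, not the embodiment's renderer: the helper names are hypothetical, amplitudes are assumed to be integers, and the actual drawing and DCT compression steps are omitted.

```python
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)


def bucket_stats(samples, num_buckets):
    """Return (min, max, average) amplitude for each bucket of N/X samples."""
    size = max(1, len(samples) // num_buckets)
    stats = []
    for start in range(0, len(samples), size):
        chunk = samples[start:start + size]
        stats.append((min(chunk), max(chunk), sum(chunk) // len(chunk)))
    return stats


def lerp_color(c0, c1, num, den):
    """Integer interpolation from c0 (num == 0) to c1 (num == den)."""
    if den == 0:
        return c0
    return tuple(a + (b - a) * num // den for a, b in zip(c0, c1))


def column_pixels(lo, hi, avg):
    """Colors for one vertical line, listed from top (max) to bottom (min)."""
    pixels = []
    for y in range(hi, lo - 1, -1):
        if y >= avg:
            color = lerp_color(GREEN, RED, y - avg, hi - avg)  # avg up to max
        else:
            color = lerp_color(GREEN, BLUE, avg - y, avg - lo)  # avg down to min
        pixels.append((y, color))
    return pixels
```

Each bucket yields one column, so the image width stays fixed at X columns regardless of the duration of the audio.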
  • 3.2.5 Additional Processing and Analysis of Audio Files [0336]
  • After a digital audio file has been classified as music, speech or a combination of the two, additional processing and analysis can be applied in order to extract more useful information from the data. This more useful information can be used to enhance the ability of users to search for digital audio files. [0337]
  • For the case of audio files which have been classified as music, with some degree of speech content (or which have been classified as speech, with some degree of music content) one can assume that there is a speaking or singing voice within the audio file accompanied with the music. A conventional speech recognition algorithm can then be applied (also called speech to text) which can convert the speech information in the audio file into textual information. This will allow the audio file to then be searchable based on its internal characteristics, as well as the actual lyrics or speech narrative which accompanies the music. [0338]
  • For the case of audio files which have been classified as speech, one can assume that there is a reasonable certainty of a speaking voice within the audio file. A conventional speech recognition algorithm can then be applied (also called speech to text) which can convert the speech information in the audio file into textual information. This will allow the audio file to then be searchable based on its internal characteristics, as well as the actual narrative which is within the audio file. The speech may correspond to closed captioning information, script dialogue or other forms of textual representation. [0339]
  • 3.2.6 Determining if a Given Music File contains Fast or Slow Music [0340]
  • When an audio file is first examined, a determination can be made if the audio data is sampled and digitized, or is completely synthetic. If the data has been digitized, then all of the processes described above can be applied. If the data has been synthesized, then the audio file is MIDI information (Musical Instrument Digital Interface). If a file has been identified as MIDI, then it is possible to scan for information in the file regarding tempo, tempo changes and key signature. That information can be analyzed to determine the average tempo, as well as the rate of change of the tempo. In addition, the key signature of the music can be extracted. The tempo, rate of change of tempo and key signature can all be displayed in search results for a user as: [0341]
  • tempo: (slow, moderate, fast) [0342]
  • rate of change of tempo (low, medium, high) [0343]
  • indicates if the music changes pace frequently [0344]
  • signature [0345]
  • key of music [0346]
  • indication of minor and major key [0347]
  • Note that when the number of samples exceeds the threshold T, then the percentage of each type of tempo category is displayed. For example, for a soundtrack which is one hour long, which may consist of different periods of fast, moderate or slow tempo levels, the resulting characterization of the music file would appear as follows: [0348]
  • 30%, slow [0349]
  • 20%, moderate [0350]
  • 20%, fast [0351]
  • 30%, very fast [0352]
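As a rough sketch of the tempo analysis, the code below assumes the tempo events have already been read from the MIDI file by some parser (not shown). Standard MIDI files store tempo as microseconds per quarter note, which converts to beats per minute as shown. The function names and the slow/fast thresholds are assumptions chosen only for illustration; the text does not specify them.

```python
def bpm_from_midi_tempo(us_per_quarter):
    """Convert a MIDI tempo value (microseconds per quarter note) to BPM."""
    return 60_000_000 / us_per_quarter


def average_tempo(tempo_events, total_seconds):
    """Time-weighted average BPM.

    `tempo_events` is a list of (start_seconds, bpm) sorted by start time,
    with the first event assumed to start at 0.
    """
    weighted = 0.0
    for i, (start, bpm) in enumerate(tempo_events):
        end = tempo_events[i + 1][0] if i + 1 < len(tempo_events) else total_seconds
        weighted += bpm * (end - start)
    return weighted / total_seconds


def tempo_category(bpm, slow_below=80, fast_above=120):
    """Map BPM to a label; both thresholds are illustrative assumptions."""
    if bpm < slow_below:
        return "slow"
    if bpm > fast_above:
        return "fast"
    return "moderate"
```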
  • 4.0 Previews [0353]
  • The described embodiment is concerned with parsing content files and building low-bandwidth previews of higher bandwidth data files. This allows rapid previewing of media data files, without need to download the entire file. [0354]
  • 4.1 Preview Overview [0355]
  • In the described embodiment, for video media files, a preview mechanism has been developed. A sample of the results of a search, showing a media preview, is given in FIG. 4A. The preview is explained in greater detail with reference to FIG. 4B. FIG. 4B illustrates a preview 410. The preview comprises a first sprocket area 411 at the top of the preview and a second sprocket area at the bottom of the preview, and an image area having three images of height IH 412 and width IW 413. The preview itself is of height FH 414 and width FW 415. In addition, in certain embodiments, the preview may include a copyright area 416 for providing copyright information relating to the preview, and certain embodiments may contain an area, for example in the upper left hand corner of the first sprocket area 411, for a corporate logo or other branding information. [0356]
  • A general algorithm for generation and display of previews is disclosed with reference to FIG. 4C. Generally, after finding a media object, as was discussed above in Section 1 in connection with crawling to locate media files, the media file is examined to locate portions having predetermined characteristics. For example, portions of a video file having fast action may be located, or portions of a video that are in black and white. [0357]
  • Next, a preview of the object is generated and stored. This will be discussed in greater detail in connection with FIG. 4D. Finally, when requested by a user, for example, in response to a search, the preview may be displayed. [0358]
  • 4.2 Preview Generation [0359]
  • Turning now to FIG. 4D, the process for generation of a preview is discussed in greater detail. Initially, a determination is made of the object type, block 431. The object may be, for example, a digital video file, an animation file, or a panoramic image. In the case of digital video, as was discussed above, the file may be downloadable or streaming. And, if downloadable, the file may have table based frame descriptions or track based frame descriptions. Animation objects include animated series of frames using a lossless differential encoding scheme and hyperlinked animation. [0360]
  • Regardless of the media type, a preview is generated generally along the lines of the preview of FIGS. 4A and 4B, block 432. [0361]
  • 4.2.1 Sizing of preview and images [0362]
  • The sizing of the preview and of images is done in the described embodiment as follows: [0363]
  • A) Initially, an aspect ratio is computed for the preview. The aspect ratio is computed as the width of a frame of the object divided by the height or A=W/H. [0364]
  • B) The target filmstrip is set with a width FW 415 and a height FH. A distance ID is set for the distance between images on the filmstrip. Next, a sprocket height and width are set, resulting in a sprocket region height (SRH 411). The particular heights and widths may vary from implementation to implementation dependent on a variety of factors such as expected display resolution. In alternative embodiments, differing sprocket designs may be utilized and background colors, etc. may be selected. In certain embodiments, it may be desirable to include copyright information 416 or a logo. [0365]
  • C) In any event, the target height IH 412 of a filmstrip image can be computed as IH=FH−(2 * SRH). The target width of an image can be computed as a function of the aspect ratio as follows: IW=A * IH. The number N of filmstrip images which will be displayed can then be computed as N=FW/(IW+ID). [0366]
  • Using the above calculations, the number, width and height of images can be determined for display of a preview. [0367]
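Steps A) through C) amount to a short calculation, sketched below. The function name and the pixel values in the usage note are hypothetical; rounding IW to the nearest pixel and truncating N to a whole number of images are assumptions, since the text gives only the formulas.

```python
def filmstrip_layout(frame_w, frame_h, strip_w, strip_h, sprocket_region_h, image_gap):
    """Return (IW, IH, N) for a filmstrip preview."""
    aspect = frame_w / frame_h                   # A = W/H
    image_h = strip_h - 2 * sprocket_region_h    # IH = FH - (2 * SRH)
    image_w = round(aspect * image_h)            # IW = A * IH
    count = strip_w // (image_w + image_gap)     # N = FW / (IW + ID)
    return image_w, image_h, count
```

For example, a 320x240 source frame on a 400x120 filmstrip with a 15 pixel sprocket region and a 10 pixel gap gives images of 120x90 and room for three of them.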
  • 4.2.2 Selection of images [0368]
  • The selection of images for use in the preview is dependent on whether the preview is being generated for a 3D media object, a digital video or animation object, or a panoramic object. [0369]
  • 4.2.2.1 Selection of images—Digital Video and Animation [0370]
  • For digital video or animation sequences, a temporal width TW is calculated, block 442, as TW=T/(N+1) where T is equal to the length (time) of the media object and N is the number of frames calculated as discussed above. N frames from the image are then decompressed to pure RGB at N fixed points in the image, where the N fixed points are at TW, 2*TW, 3*TW, . . . N*TW time into the media image. This process reduces the need to decompress the entire video file. Scanning to the particular ones of the N frames is accomplished by using the table based frame description, the track based frame description or by streaming, dependent on the media source file. An objective of choosing N frames spaced TW apart is to develop a preview with frames from various portions of the media file so that the user will be given an opportunity to review the various portions in making a determination if the user wishes to access the entire file. [0371]
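The choice of sample points can be sketched as below; `preview_sample_times` is a hypothetical name and times are assumed to be in seconds.

```python
def preview_sample_times(duration, n):
    """Return the N sample times TW, 2*TW, ..., N*TW with TW = T/(N+1)."""
    tw = duration / (n + 1)
    return [k * tw for k in range(1, n + 1)]
```

Dividing by N+1 rather than N keeps every sample strictly inside the clip, so neither the very first nor the very last frame is selected.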
  • The decompress process may utilize intraframe, predictive decoding or bilinear decoding dependent on the source file. In the described embodiment, a color space conversion is then performed from RGB to YUV. Optionally, an adaptive noise reduction process may be performed. [0372]
  • Each of the N frames is then analyzed to determine if the frame meets predetermined criteria for display, block 444. Again, an objective is to provide the user with a quality preview allowing a decision if the entire file should be accessed. In the described embodiment, each of the N frames is analyzed for brightness, contrast and quality. If the frame meets the criteria, block 445, then the frame is scaled, block 447, from its original width W and height H to width IW and height IH using interpolation. Linear interpolation is utilized and the aspect ratio is maintained. [0373]
  • Each frame is also analyzed for a set of attributes, block 448. The attributes in the described embodiment include brightness, contrast (luminance, deviation), chrominance, and dominant color. Brightness indicates the overall brightness of the digital video clip. Color indicates if the video clip is in full color or black and white, and contrast indicates the degree of contrast in the movie. These high level content attributes tend to be more meaningful for the typically short video sequences which are published on the Internet and Intranet. The computation for each of the attributes is detailed below. This information can then be used for enhanced searching. For example, chrominance can be used for searching for black and white versus color video. In addition, embodiments may provide for optionally storing a feature vector for texture, composition and structure. These attributes can be averaged across the N frames and the average for each attribute is stored as a searchable metric. In addition, optionally, the contrast of the frames may be enhanced using a contrast enhancement algorithm. [0374]
  • We will now briefly describe computation of the chrominance, luminance and contrast values. The maximum chrominance is computed for the selected N frames in the video sequence: the maximum chrominance of each frame is found as the maximum chrominance over all pixels in that frame, and the largest of these per-frame values is the maximum chrominance for the set of frames. This maximum chrominance value for the set of selected frames is then compared against a threshold. If the maximum chrominance for the sequence is larger than the threshold, then the sequence is considered in full color. If the maximum chrominance for the sequence is smaller than the threshold, then the sequence is considered in black and white. [0375]
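The color versus black-and-white test can be sketched as follows. Several details are assumptions made for illustration: frames are lists of (R, G, B) tuples, chrominance is taken as the magnitude of the (U, V) pair using the common BT.601 analog YUV coefficients, and the threshold value is arbitrary.

```python
import math


def chroma(r, g, b):
    """Chrominance magnitude of one pixel (BT.601 YUV coefficients assumed)."""
    u = -0.14713 * r - 0.28886 * g + 0.436 * b
    v = 0.615 * r - 0.51499 * g - 0.10001 * b
    return math.hypot(u, v)


def is_full_color(frames, threshold=10.0):
    """True if the maximum chrominance over all selected frames exceeds the threshold."""
    max_chroma = max(chroma(*pixel) for frame in frames for pixel in frame)
    return max_chroma > threshold
```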
  • The luminance is computed for the selected N frames in the video sequence. The luminance is then averaged into a single scalar value. [0376]
  • To determine contrast, luminance values are computed for each frame of the digital video sequence. The luminance values which fall below the fifth percentile, and above the ninety-fifth percentile are then removed from the set of values. This is done to remove random noise. The remaining luminance values are then examined for the maximum and minimum luminance. The difference between the maximum and minimum luminance is computed as the contrast for a single frame. The contrast value is then computed for all frames in the sequence, and the average contrast is stored as the resulting value. [0377]
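The contrast computation can be illustrated with the sketch below. It assumes each frame is supplied as a flat list of per-pixel luminance values and approximates the fifth and ninety-fifth percentiles by rank (dropping the lowest and highest 5% of the sorted values); the function names are hypothetical.

```python
def frame_contrast(luma):
    """Max minus min luminance after trimming the bottom and top 5% of values."""
    values = sorted(luma)
    n = len(values)
    cut = n * 5 // 100                      # values to drop at each end
    trimmed = values[cut:n - cut] or values  # keep everything if the frame is tiny
    return trimmed[-1] - trimmed[0]


def average_contrast(frames):
    """Average the per-frame contrast over all frames in the sequence."""
    return sum(frame_contrast(f) for f in frames) / len(frames)
```

Trimming the extremes keeps a handful of noisy pixels from making a flat frame appear to have high contrast.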
  • Finally, audio and video clips may be associated with each frame, block 449. For audio, a standard audio segment may be selected or alternatively an audio selection algorithm may be applied which finds audio which meets predetermined criteria, such as a preset volume level. For video, a video track of duration VD is selected. The video selection may be a standard video segment or the video segment may be selected using a video selection algorithm which selects video segments meeting a predetermined criteria such as video at a predetermined brightness, contrast or motion. [0378]
  • Going back to analysis of the frames, if one of the N frames does not meet the criteria, block 445, a frame iterator algorithm is applied to select a new frame. The frame iterator algorithm of the described embodiment selects another frame by iteratively selecting frames between the frame in question and the other frames until a frame is found which meets the criteria or until a predetermined number of iterations have been applied. If the predetermined number of iterations are applied without successfully finding a frame which meets the criteria, the originally selected frame is used. The algorithm starts with the original frame at TW (or, 2*TW, 3*TW . . . N*TW) and selects, first, a frame at TW−(TW/2) (i.e., a frame halfway between the original frame and the beginning). If this frame does not meet the criteria, a frame at TW+(TW/2) is selected and iteratively frames are selected according to the pattern: [0379]
  • (TW−(TW/2)), (TW+(TW/2)), (TW−(TW/4)), (TW+(TW/4)), . . . (TW−(TW/X)), (TW+(TW/X)).
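The frame iterator pattern can be sketched as a generator of candidate times. The names are hypothetical, `meets_criteria` stands in for the brightness, contrast and quality check, and doubling the divisor at each step (2, 4, 8, ...) is an assumption extrapolated from the first two terms of the pattern.

```python
def candidate_times(original, tw, max_iterations):
    """Yield original - TW/2, original + TW/2, original - TW/4, original + TW/4, ..."""
    divisor = 2
    produced = 0
    while produced < max_iterations:
        for sign in (-1, 1):
            if produced == max_iterations:
                return
            yield original + sign * tw / divisor
            produced += 1
        divisor *= 2  # assumed progression: 2, 4, 8, ...


def pick_frame_time(original, tw, meets_criteria, max_iterations=6):
    """Return the first acceptable time, falling back to the original frame."""
    if meets_criteria(original):
        return original
    for t in candidate_times(original, tw, max_iterations):
        if meets_criteria(t):
            return t
    return original  # no candidate qualified within the iteration limit
```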
  • 4.2.2.2 Selection of images—Panoramic [0380]
  • Interactive panoramic images are often stored as multimedia files. Typically these media files are stored as a series of compressed video images, with a total file size ranging from 100 to 500 Kbytes. The described embodiment provides a method which creates a low bandwidth preview of a panoramic picture. A preview, in the described embodiment, utilizes approximately 10 Kbytes in storage size, which is only 1/10th to 1/50th of the original panoramic storage. The preview provides a high-quality, low bit rate display of the full panoramic scene. In the described embodiment, the method for creating the panoramic preview, block 450, may be described as follows: [0381]
  • 1) extract all information from the header of the media file to determine the width, height and number of tiles for the panoramic scene. Create an offscreen buffer to generate the new panoramic picture preview. [0382]
  • 2) For each tiled image on the media file, decode the image using the coding algorithm which was used to encode the original tiles. The decoded files are converted to pure RGB and then to YUV. The tiles are scaled from (W, H) to (IW, IH) similar to as discussed above. In other embodiments, as a next step, the image may be scaled by a factor of two in each direction. [0383]
  • 3) Re-orient the tile by rotating it 90 degrees clockwise. [0384]
  • 4) For each scaled and rotated tile, copy the image (scanning from right to left) into the offscreen buffer. [0385]
  • 5) In the case of an embodiment which scales by the factor of two, when all tiles have been processed, examine the resulting picture size after it has been reduced by a factor of two. If the image is below a fixed resolution then the process is complete. If the image is above a fixed resolution, then reduce the picture size again by a factor of two, until it is less than or equal to the fixed resolution. [0386]
  • 6) Composite the reconstructed panoramic picture with filmstrip images on the top and bottom of the picture to create a look and feel consistent with the filmstrip images for video sequences. [0387]
  • 7) Any of a number of known compression algorithms may be applied to the reconstructed and composited panoramic picture to produce a low bandwidth image preview. Coding schemes can include progressive or interlaced transmission algorithms. [0388]
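Two of the steps above, the 90 degree clockwise rotation of step 3 and the repeated halving of step 5, can be sketched on plain nested lists. The helper names are hypothetical, an image is assumed to be a list of pixel rows, and halving by dropping every other row and column is an assumption; a real implementation would more likely filter before downsampling. Decoding, compositing and compression are not shown.

```python
def rotate_cw(tile):
    """Rotate a tile (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*tile[::-1])]


def halve(image):
    """Reduce width and height by two by keeping every other row and column."""
    return [row[::2] for row in image[::2]]


def reduce_to_fit(image, max_w, max_h):
    """Halve repeatedly until the image fits within max_w x max_h (both >= 1)."""
    while len(image[0]) > max_w or len(image) > max_h:
        image = halve(image)
    return image
```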
  • 4.2.2.3 Selection of images—3D [0389]
  • For 3D images, a top view, bottom view, front view and rear view are selected for images to display, block 441. [0390]
  • 4.3 Interactive Display of Search Results [0391]
  • When returning search results from a user's multimedia query to a database, it is disclosed to generate appropriate commands to drive a web browser display to facilitate interactive viewing of the search results. Depending on the position a user selects (for example with a mouse or other cursor control device) within a preview of the media content shown in the search result, the user will begin interaction with the content at different points in time or space. The end result is a more useful and interactive experience for a user employing a multimedia search engine. [0392]
  • For example, if a user searches for videos of a car, then the web server application can return a series of HTML and EMBED tags which setup a movie controller, allowing a user to interact with the videos of cars. When the low bandwidth preview (a filmstrip showing select scenes of the video clip) is presented to a user, the position of the mouse that is active when a user clicks within the preview will drive the resulting EMBED tags which are created and then returned from the server. For example: [0393]
  • if a user clicks down in frame X of a filmstrip, then an in-line viewer is created which will begin display and playback of the movie at frame X. In an alternative embodiment, a snippet or short segment of a video or audio file may be stored with the preview and associated with a particular portion of the preview. This method avoids the need to access the original file for playback of a short audio or video segment. [0394]
  • if a user clicks down at pan angle X, tilt angle Y and fov Z within a panorama filmstrip, then an in-line viewer is created which will begin display of the panorama at those precise viewing parameters. [0395]
  • if a user clicks down within a select viewpoint of a 3D scene within a filmstrip, then an in-line viewer is created which will begin display of the 3D scene at that viewpoint. [0396]
  • if a user clicks down within an audio waveform at time T, then an in-line viewer is created which will begin playback of the sound at that particular time T. [0397]
  • By allowing users to choose the point in time or space where display of interactive media begins, users can home in more precisely on the content they are looking for. For example, if a user is looking for a piece of music that contains a very loud passage, they may observe the volume increase in the graphical waveform display, click on that portion of the waveform and then hear the loud portion of the music. This takes them directly to the selection of interest. [0398]
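The mapping from a click position to a playback starting point can be illustrated with a minimal Python sketch. It is not taken from the described embodiment: the `Filmstrip` layout, the function names and the pixel-based geometry are all assumptions made for illustration, and a click that lands in the gap between two images is simply attributed to the image on its left.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Filmstrip:
    """Hypothetical filmstrip layout; every field name here is an assumption."""
    frame_width: int      # width of each preview image, in pixels
    spacing: int          # horizontal gap between preview images, in pixels
    source_frames: tuple  # source frame number shown in each slot, left to right


def filmstrip_click_to_frame(strip: Filmstrip, click_x: int) -> int:
    """Return the source frame number X at which playback should begin."""
    slot_pitch = strip.frame_width + strip.spacing
    slot = click_x // slot_pitch
    # Clamp so clicks outside the strip map to the first or last image.
    slot = max(0, min(slot, len(strip.source_frames) - 1))
    return strip.source_frames[slot]


def waveform_click_to_time(click_x: int, waveform_width: int,
                           duration_s: float) -> float:
    """Return the start time T (seconds) for a click within a waveform preview."""
    fraction = min(max(click_x / waveform_width, 0.0), 1.0)
    return fraction * duration_s


strip = Filmstrip(frame_width=80, spacing=4, source_frames=(0, 150, 300, 450))
start_frame = filmstrip_click_to_frame(strip, click_x=100)
start_time = waveform_click_to_time(click_x=200, waveform_width=400,
                                    duration_s=60.0)
```

In this sketch the server would use `start_frame` or `start_time` when composing the tags returned for the in-line viewer; how those values are encoded in the tags depends on the viewer and is left out.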
  • 4.4 Use of Media Icons to illustrate search results [0399]
  • When returning search results from a user's multimedia query to a database, the described embodiment provides both a textual and a visual method for showing that the search results are of different media types. For example, when executing a search for the word “karate”, it is possible that numerous search results will be returned, including digital video, audio, 3D, animation, etc. Video may show karate methods, sound might be an interview with a karate expert, 3D could be a simulation of a karate chop and animation a display of a flipbook of a karate flip. In order to enable a viewer to rapidly scan a page and distinguish the different media types, an icon which is representative of each type of media is employed. [0400]
  • Using a universal set of icons for media types, as shown in the figures, enhances the ability of users to scan a page of search results and quickly jump to those responses which are most relevant. In addition, the use of media icons can transcend barriers of language and culture, making it easier for people from different cultures and speaking different languages to understand search results for multimedia queries. [0401]
  • 4.5 Selection of basic, detailed or visual results [0402]
  • In the described embodiment, users can select basic, detailed or visual search results. If a user selects visual search results, then only visual images, filmstrips or waveforms are presented to users as search results. The visual search results are typically displayed as a set of mosaics on a page, usually multiple thumbnail images per row, and multiple filmstrips (usually two) per row. Clicking on images, waveforms or filmstrips then takes users to new web pages where more information about the media content is provided. This allows users to rapidly scan a page of visual search results to see if they can find what they are looking for. [0403]
  • 4.6 Timecode based display [0404]
  • Text keywords may be found within certain multimedia files (e.g., the content of the file). For example, movie and other video files sometimes contain a movie text track, a closed caption track or a musical soundtrack lyrics track. For each text keyword which is found in one of these tracks, a new database entry is created by the process of the present invention. This database maps keywords to [text, timecode] pairs, so that each keyword can be mapped directly to the media file and timecode position where the text reference occurs. The timecode position is subsequently used when presenting search results to viewers, so that viewers can jump directly to that portion of the media sequence where the matching text occurs. [0405]
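A keyword-to-timecode mapping of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format (one `(media_file, timecode_seconds, text)` tuple per caption or lyric line), the function names and the simple whitespace tokenizer are hypothetical and are not specified in the text above.

```python
import string
from collections import defaultdict


def tokenize(text):
    """Split a caption line into lowercase keywords, dropping punctuation."""
    words = (w.strip(string.punctuation).lower() for w in text.split())
    return {w for w in words if w}


def build_keyword_index(caption_lines):
    """Map each keyword to the (media_file, text, timecode) entries containing it.

    `caption_lines` is an assumed input: an iterable of
    (media_file, timecode_seconds, text) tuples.
    """
    index = defaultdict(list)
    for media_file, timecode, text in caption_lines:
        for keyword in tokenize(text):
            index[keyword].append((media_file, text, timecode))
    return dict(index)


def lookup(index, keyword):
    """Return matching entries ordered by media file, then by timecode."""
    hits = index.get(keyword.lower(), [])
    return sorted(hits, key=lambda hit: (hit[0], hit[2]))


caption_lines = [
    ("karate.mov", 47.0, "Karate flip, karate kick!"),
    ("karate.mov", 12.5, "A karate chop."),
    ("song.mov", 3.0, "La la la"),
]
index = build_keyword_index(caption_lines)
hits = lookup(index, "Karate")
```

Each hit carries the timecode at which the matching text occurs, so a search result can link directly to that position in the media file.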
  • ALTERNATIVES TO THE PREFERRED EMBODIMENT OF THE PRESENT INVENTION
  • There are, of course, alternatives to the described embodiment which are within the reach of one of ordinary skill in the relevant art. The present invention is intended to be limited only by the claims presented below. [0406]

Claims (19)

What is claimed is:
1. A computer implemented method of storing a representation of a media object comprising the steps of:
examining the media object to locate portions of the media object having predetermined characteristics; and
storing a preview of the media object.
2. The method as recited by claim 1 further comprising the step of displaying the preview.
3. The method as recited by claim 2 further comprising the step of searching a database of previews based on said predetermined characteristics prior to display of said preview.
4. The method as recited by claim 1 wherein said step of generating a preview comprises the steps of:
a) determining a preview image size; and
b) selecting images from said media object for display in said preview.
5. The method as recited by claim 4 wherein said media object is a digital video and said step of determining a preview image size comprises the steps of:
a) computing an aspect ratio A;
b) determining a target height IH of said preview image as the preview height FH less the height of any top and bottom border;
c) determining a target width IW of said preview image as a function of said target height IH and said aspect ratio A.
6. The method as recited by claim 5 further comprising the step of computing the number of images for display as the preview width divided by the sum of the target width IW and any spacing between images.
7. The method as recited by claim 4 wherein said step of selecting images from said media object comprises the steps of:
a) decompressing frames of said media object at N points;
b) analyzing each of said frames to determine if said frames meet predetermined criteria and if said frames do meet said predetermined criteria, selecting said frame for display;
c) if one of said frames do not meet said predetermined criteria, selecting a substitute frame.
8. A method of generating a preview comprising the steps of:
a) determining a preview image size; and
b) selecting images from said media object for display in said preview.
9. The method as recited by claim 8 wherein said media object is a digital video and said step of determining a preview image size comprises the steps of:
a) computing an aspect ratio A;
b) determining a target height IH of said preview image as the preview height FH less the height of any top and bottom border;
c) determining a target width IW of said preview image as a function of said target height IH and said aspect ratio A.
10. The method as recited by claim 9 further comprising the step of computing the number of images for display as the preview width divided by the sum of the target width IW and any spacing between images.
11. The method as recited by claim 8 wherein said step of selecting images from said media object comprises the steps of:
a) decompressing frames of said media object at N points wherein N is determined based on the size of said preview and the size of images to be displayed in said preview;
b) analyzing each of said frames to determine if said frames meet predetermined criteria and if said frames do meet said predetermined criteria, selecting said frame for display;
c) if one of said frames do not meet said predetermined criteria, selecting a substitute frame.
12. The method as recited by claim 11 further comprises the steps of:
a) scaling each of said selecting frames;
b) determining a predetermined set of attributes for said frames.
13. The method as recited by claim 12 further comprising the step of selecting audio portions associated with each of said selected frames.
14. The method as recited by claim 12 further comprising the step of selecting video portions associated with each of said selected frames.
15. A method viewing a media object comprising the steps of:
a) providing a low bandwidth representation of said media object for review by a user;
b) allowing said user to select said media object after review of said low bandwidth representation.
16. The method of claim 15 wherein said low bandwidth representation displays selected frames of said media object.
17. The method of claim 15 wherein said media object is a digital video and said selected frames are selected utilizing a time-based algorithm.
18. The method of claim 15 wherein said media object is a panoramic object and said selected frames are selected using a spatial based algorithm.
19. The method of claim 15 wherein said step of allowing a user to select said media object allows said user to access said media object at predetermined portions of said media object based on review and selection of said portion of said low bandwidth representation.
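
The sizing arithmetic recited in claims 5 and 6 (and again in claims 9 and 10) can be illustrated with a minimal Python sketch. Several details are assumptions rather than claim language: the aspect ratio A is taken as source width divided by source height, the target width IW is taken as IH multiplied by A and rounded to whole pixels, and the image count is rounded down. The function and parameter names are hypothetical.

```python
def preview_layout(src_width, src_height, preview_width, preview_height,
                   top_border=0, bottom_border=0, spacing=0):
    """Return (IW, IH, image_count) for a filmstrip-style preview.

    All parameters are in pixels. The rounding choices below are
    assumptions, not part of the claims.
    """
    aspect = src_width / src_height                              # A
    image_height = preview_height - top_border - bottom_border   # IH = FH less borders
    if image_height <= 0:
        raise ValueError("borders leave no room for preview images")
    image_width = max(1, round(image_height * aspect))           # IW as a function of IH and A
    image_count = preview_width // (image_width + spacing)       # whole images that fit
    return image_width, image_height, image_count


# A 320x240 source shown in a 600x100 preview with 10-pixel borders
# and 4 pixels between images.
layout = preview_layout(320, 240, preview_width=600, preview_height=100,
                        top_border=10, bottom_border=10, spacing=4)
```

With these inputs IH is 80, IW is 107 and five images fit across the preview.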
US08/847,156 1996-05-24 1997-04-30 Display of media previews Expired - Lifetime US6370543B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US08/847,156 US6370543B2 (en) 1996-05-24 1997-04-30 Display of media previews

Applications Claiming Priority (11)

Application Number Priority Date Filing Date Title
US1831296P 1996-05-24 1996-05-24
US1831196P 1996-05-24 1996-05-24
US1823896P 1996-05-24 1996-05-24
US2145296P 1996-07-10 1996-07-10
US2151796P 1996-07-10 1996-07-10
US2146696P 1996-07-10 1996-07-10
US2151596P 1996-07-10 1996-07-10
US2363396P 1996-08-09 1996-08-09
US2363496P 1996-08-09 1996-08-09
US2383696P 1996-08-12 1996-08-12
US08/847,156 US6370543B2 (en) 1996-05-24 1997-04-30 Display of media previews

Publications (2)

Publication Number Publication Date
US20010014891A1 (en) 2001-08-16
US6370543B2 (en) 2002-04-09

Family

ID=27582451

Family Applications (1)

Application Number Title Priority Date Filing Date
US08/847,156 Expired - Lifetime US6370543B2 (en) 1996-05-24 1997-04-30 Display of media previews

Country Status (1)

Country Link
US (1) US6370543B2 (en)

US8725648B2 (en) * 2009-09-01 2014-05-13 Savoirsoft, Inc. Digital rights content services architecture
JP2011055190A (en) * 2009-09-01 2011-03-17 Fujifilm Corp Image display apparatus and image display method
US10721269B1 (en) 2009-11-06 2020-07-21 F5 Networks, Inc. Methods and system for returning requests with javascript for clients before passing a request to a server
US9508011B2 (en) 2010-05-10 2016-11-29 Videosurf, Inc. Video visual and audio query
WO2011146276A2 (en) 2010-05-19 2011-11-24 Google Inc. Television related searching
WO2011149558A2 (en) 2010-05-28 2011-12-01 Abelow Daniel H Reality alternate
US10142687B2 (en) 2010-11-07 2018-11-27 Symphony Advanced Media, Inc. Audience content exposure monitoring apparatuses, methods and systems
US8955001B2 (en) 2011-07-06 2015-02-10 Symphony Advanced Media Mobile remote media control platform apparatuses and methods
WO2012099617A1 (en) 2011-01-20 2012-07-26 Box.Net, Inc. Real time notification of activities that occur in a web-based collaboration environment
US20120291056A1 (en) * 2011-05-11 2012-11-15 CSC Holdings, LLC Action enabled automatic content preview system and method
US9015601B2 (en) 2011-06-21 2015-04-21 Box, Inc. Batch uploading of content to a web-based collaboration environment
US9063912B2 (en) 2011-06-22 2015-06-23 Box, Inc. Multimedia content preview rendering in a cloud content management system
US9652741B2 (en) 2011-07-08 2017-05-16 Box, Inc. Desktop application for access and interaction with workspaces in a cloud-based content management system and synchronization mechanisms thereof
WO2013009328A2 (en) 2011-07-08 2013-01-17 Box.Net, Inc. Collaboration sessions in a workspace on cloud-based content management system
TWI426402B (en) * 2011-07-28 2014-02-11 Univ Nat Taiwan Science Tech Video searching method
US9197718B2 (en) 2011-09-23 2015-11-24 Box, Inc. Central management and control of user-contributed content in a web-based collaboration environment and management console thereof
US8515902B2 (en) 2011-10-14 2013-08-20 Box, Inc. Automatic and semi-automatic tagging features of work items in a shared workspace for metadata tracking in a cloud-based content management system with selective or optional user contribution
WO2013062599A1 (en) 2011-10-26 2013-05-02 Box, Inc. Enhanced multimedia content preview rendering in a cloud content management system
US9098474B2 (en) * 2011-10-26 2015-08-04 Box, Inc. Preview pre-generation based on heuristics and algorithmic prediction/assessment of predicted user behavior for enhancement of user experience
US8990307B2 (en) 2011-11-16 2015-03-24 Box, Inc. Resource effective incremental updating of a remote client with events which occurred via a cloud-enabled platform
GB2500152A (en) 2011-11-29 2013-09-11 Box Inc Mobile platform file and folder selection functionalities for offline access and synchronization
US9019123B2 (en) 2011-12-22 2015-04-28 Box, Inc. Health check services for web-based collaboration environments
US8805418B2 (en) 2011-12-23 2014-08-12 United Video Properties, Inc. Methods and systems for performing actions based on location-based rules
US9904435B2 (en) 2012-01-06 2018-02-27 Box, Inc. System and method for actionable event generation for task delegation and management via a discussion forum in a web-based collaboration environment
US11232481B2 (en) 2012-01-30 2022-01-25 Box, Inc. Extended applications of multimedia content previews in the cloud-based content management system
US8988578B2 (en) 2012-02-03 2015-03-24 Honeywell International Inc. Mobile computing device with improved image preview functionality
US9965745B2 (en) 2012-02-24 2018-05-08 Box, Inc. System and method for promoting enterprise adoption of a web-based collaboration environment
US9195636B2 (en) 2012-03-07 2015-11-24 Box, Inc. Universal file type preview for mobile devices
US9054919B2 (en) 2012-04-05 2015-06-09 Box, Inc. Device pinning capability for enterprise cloud service and storage accounts
US9575981B2 (en) 2012-04-11 2017-02-21 Box, Inc. Cloud service enabled to handle a set of files depicted to a user as a single file in a native operating system
US9413587B2 (en) 2012-05-02 2016-08-09 Box, Inc. System and method for a third-party application to access content within a cloud-based platform
US9396216B2 (en) 2012-05-04 2016-07-19 Box, Inc. Repository redundancy implementation of a system which incrementally updates clients with events that occurred via a cloud-enabled platform
US9691051B2 (en) 2012-05-21 2017-06-27 Box, Inc. Security enhancement through application access control
US9027108B2 (en) 2012-05-23 2015-05-05 Box, Inc. Systems and methods for secure file portability between mobile applications on a mobile device
US8914900B2 (en) 2012-05-23 2014-12-16 Box, Inc. Methods, architectures and security mechanisms for a third-party application to access content in a cloud-based platform
US9021099B2 (en) 2012-07-03 2015-04-28 Box, Inc. Load balancing secure FTP connections among multiple FTP servers
GB2505072A (en) 2012-07-06 2014-02-19 Box Inc Identifying users and collaborators as search results in a cloud-based system
US9792320B2 (en) 2012-07-06 2017-10-17 Box, Inc. System and method for performing shard migration to support functions of a cloud-based service
US9712510B2 (en) 2012-07-06 2017-07-18 Box, Inc. Systems and methods for securely submitting comments among users via external messaging applications in a cloud-based platform
US9473532B2 (en) 2012-07-19 2016-10-18 Box, Inc. Data loss prevention (DLP) methods by a cloud service including third party integration architectures
US8868574B2 (en) 2012-07-30 2014-10-21 Box, Inc. System and method for advanced search and filtering mechanisms for enterprise administrators in a cloud-based environment
US9794256B2 (en) 2012-07-30 2017-10-17 Box, Inc. System and method for advanced control tools for administrators in a cloud-based service
US8745267B2 (en) 2012-08-19 2014-06-03 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US9369520B2 (en) 2012-08-19 2016-06-14 Box, Inc. Enhancement of upload and/or download performance based on client and/or server feedback information
US9558202B2 (en) 2012-08-27 2017-01-31 Box, Inc. Server side techniques for reducing database workload in implementing selective subfolder synchronization in a cloud-based environment
US9135462B2 (en) 2012-08-29 2015-09-15 Box, Inc. Upload and download streaming encryption to/from a cloud-based platform
US9117087B2 (en) 2012-09-06 2015-08-25 Box, Inc. System and method for creating a secure channel for inter-application communication based on intents
US9311071B2 (en) 2012-09-06 2016-04-12 Box, Inc. Force upgrade of a mobile application via a server side configuration file
US9195519B2 (en) 2012-09-06 2015-11-24 Box, Inc. Disabling the self-referential appearance of a mobile application in an intent via a background registration
US9292833B2 (en) 2012-09-14 2016-03-22 Box, Inc. Batching notifications of activities that occur in a web-based collaboration environment
US10200256B2 (en) 2012-09-17 2019-02-05 Box, Inc. System and method of a manipulative handle in an interactive mobile user interface
US9553758B2 (en) 2012-09-18 2017-01-24 Box, Inc. Sandboxing individual applications to specific user folders in a cloud-based service
US10915492B2 (en) 2012-09-19 2021-02-09 Box, Inc. Cloud-based platform enabled with media content indexed for text-based searches and/or metadata extraction
US9519501B1 (en) 2012-09-30 2016-12-13 F5 Networks, Inc. Hardware assisted flow acceleration and L2 SMAC management in a heterogeneous distributed multi-tenant virtualized clustered system
US9959420B2 (en) 2012-10-02 2018-05-01 Box, Inc. System and method for enhanced security and management mechanisms for enterprise administrators in a cloud-based environment
US9705967B2 (en) 2012-10-04 2017-07-11 Box, Inc. Corporate user discovery and identification of recommended collaborators in a cloud platform
US9495364B2 (en) 2012-10-04 2016-11-15 Box, Inc. Enhanced quick search features, low-barrier commenting/interactive features in a collaboration platform
US9665349B2 (en) 2012-10-05 2017-05-30 Box, Inc. System and method for generating embeddable widgets which enable access to a cloud-based collaboration platform
US9165072B1 (en) 2012-10-09 2015-10-20 Amazon Technologies, Inc. Analyzing user searches of verbal media content
JP5982343B2 (en) 2012-10-17 2016-08-31 Box, Inc. Remote key management in a cloud-based environment
US9756022B2 (en) 2014-08-29 2017-09-05 Box, Inc. Enhanced remote key management for an enterprise in a cloud-based environment
US10395642B1 (en) * 2012-11-19 2019-08-27 Cox Communications, Inc. Caption data fishing
US10235383B2 (en) 2012-12-19 2019-03-19 Box, Inc. Method and apparatus for synchronization of items with read-only permissions in a cloud-based environment
US9396245B2 (en) 2013-01-02 2016-07-19 Box, Inc. Race condition handling in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
US9953036B2 (en) 2013-01-09 2018-04-24 Box, Inc. File system monitoring in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
EP2755151A3 (en) 2013-01-11 2014-09-24 Box, Inc. Functionalities, features and user interface of a synchronization client to a cloud-based environment
EP2757491A1 (en) 2013-01-17 2014-07-23 Box, Inc. Conflict resolution, retry condition management, and handling of problem files for the synchronization client to a cloud-based platform
US10375155B1 (en) 2013-02-19 2019-08-06 F5 Networks, Inc. System and method for achieving hardware acceleration for asymmetric flow connections
US9554418B1 (en) 2013-02-28 2017-01-24 F5 Networks, Inc. Device for topology hiding of a visited network
US10725968B2 (en) 2013-05-10 2020-07-28 Box, Inc. Top down delete or unsynchronization on delete of and depiction of item synchronization with a synchronization client to a cloud-based platform
US10846074B2 (en) 2013-05-10 2020-11-24 Box, Inc. Identification and handling of items to be ignored for synchronization with a cloud-based platform by a synchronization client
US9633037B2 (en) 2013-06-13 2017-04-25 Box, Inc. Systems and methods for synchronization event building and/or collapsing by a synchronization component of a cloud-based platform
US9805050B2 (en) 2013-06-21 2017-10-31 Box, Inc. Maintaining and updating file system shadows on a local device by a synchronization client of a cloud-based platform
US10110656B2 (en) 2013-06-25 2018-10-23 Box, Inc. Systems and methods for providing shell communication in a cloud-based platform
US10229134B2 (en) 2013-06-25 2019-03-12 Box, Inc. Systems and methods for managing upgrades, migration of user data and improving performance of a cloud-based platform
US9535924B2 (en) 2013-07-30 2017-01-03 Box, Inc. Scalability improvement in a system which incrementally updates clients with events that occurred in a cloud-based collaboration platform
US9213684B2 (en) 2013-09-13 2015-12-15 Box, Inc. System and method for rendering document in web browser or mobile device regardless of third-party plug-in software
US10509527B2 (en) 2013-09-13 2019-12-17 Box, Inc. Systems and methods for configuring event-based automation in cloud-based collaboration platforms
GB2518298A (en) 2013-09-13 2015-03-18 Box Inc High-availability architecture for a cloud-based concurrent-access collaboration platform
US9535909B2 (en) 2013-09-13 2017-01-03 Box, Inc. Configurable event-based automation architecture for cloud-based collaboration platforms
US8892679B1 (en) 2013-09-13 2014-11-18 Box, Inc. Mobile device, methods and user interfaces thereof in a mobile device platform featuring multifunctional access and engagement in a collaborative environment provided by a cloud-based platform
US9704137B2 (en) 2013-09-13 2017-07-11 Box, Inc. Simultaneous editing/accessing of content by collaborator invitation through a web-based or mobile application to a cloud-based collaboration platform
US10866931B2 (en) 2013-10-22 2020-12-15 Box, Inc. Desktop application for accessing a cloud collaboration platform
US9672280B2 (en) * 2014-04-10 2017-06-06 Google Inc. Methods, systems, and media for searching for video content
US10530854B2 (en) 2014-05-30 2020-01-07 Box, Inc. Synchronization of permissioned content in cloud-based environments
US9602514B2 (en) 2014-06-16 2017-03-21 Box, Inc. Enterprise mobility management and verification of a managed application by a content provider
US11838851B1 (en) 2014-07-15 2023-12-05 F5, Inc. Methods for managing L7 traffic classification and devices thereof
US9894119B2 (en) 2014-08-29 2018-02-13 Box, Inc. Configurable metadata-based automation and content classification architecture for cloud-based collaboration platforms
US10574442B2 (en) 2014-08-29 2020-02-25 Box, Inc. Enhanced remote key management for an enterprise in a cloud-based environment
US10038731B2 (en) 2014-08-29 2018-07-31 Box, Inc. Managing flow-based interactions with cloud-based shared content
US10652127B2 (en) 2014-10-03 2020-05-12 The Nielsen Company (Us), Llc Fusing online media monitoring data with secondary online data feeds to generate ratings data for online media exposure
US10182013B1 (en) 2014-12-01 2019-01-15 F5 Networks, Inc. Methods for managing progressive image delivery and devices thereof
US11895138B1 (en) 2015-02-02 2024-02-06 F5, Inc. Methods for improving web scanner accuracy and devices thereof
US9392324B1 (en) 2015-03-30 2016-07-12 Rovi Guides, Inc. Systems and methods for identifying and storing a portion of a media asset
US10834065B1 (en) 2015-03-31 2020-11-10 F5 Networks, Inc. Methods for SSL protected NTLM re-authentication and devices thereof
US10404698B1 (en) 2016-01-15 2019-09-03 F5 Networks, Inc. Methods for adaptive organization of web application access points in webtops and devices thereof
US9826285B1 (en) 2016-03-24 2017-11-21 Amazon Technologies, Inc. Dynamic summaries for media content
US10412198B1 (en) 2016-10-27 2019-09-10 F5 Networks, Inc. Methods for improved transmission control protocol (TCP) performance visibility and devices thereof
US10845956B2 (en) 2017-05-31 2020-11-24 Snap Inc. Methods and systems for voice driven dynamic menus
US11223689B1 (en) 2018-01-05 2022-01-11 F5 Networks, Inc. Methods for multipath transmission control protocol (MPTCP) based session migration and devices thereof

Family Cites Families (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4772941A (en) * 1987-10-15 1988-09-20 Eastman Kodak Company Video display system
US5136618A (en) * 1989-01-19 1992-08-04 Redband Technologies, Inc. Method and apparatus for bandwidth reduction of modulated signals
US5253275A (en) * 1991-01-07 1993-10-12 H. Lee Browne Audio and video transmission and receiving system
US5764276A (en) * 1991-05-13 1998-06-09 Interactive Pictures Corporation Method and apparatus for providing perceived video viewing experiences using still images
US5706290A (en) * 1994-12-15 1998-01-06 Shaw; Venson Method and apparatus including system architecture for multimedia communication
US5524193A (en) * 1991-10-15 1996-06-04 And Communications Interactive multimedia annotation method and apparatus
US5555407A (en) * 1993-02-17 1996-09-10 Home Information Services, Inc. Method of and apparatus for reduction of bandwidth requirements in the provision of electronic information and transaction services through communication networks
DE69516751T2 (en) * 1994-04-15 2000-10-05 Canon Kk Image preprocessing for character recognition system
US5493677A (en) 1994-06-08 1996-02-20 Systems Research & Applications Corporation Generation, archiving, and retrieval of digital images with evoked suggestion-set captions and natural language interface
US5600775A (en) * 1994-08-26 1997-02-04 Emotion, Inc. Method and apparatus for annotating full motion video and other indexed data structures
CA2153445C (en) * 1994-09-08 2002-05-21 Ashok Raj Saxena Video optimized media streamer user interface
JP2958742B2 (en) * 1994-10-07 1999-10-06 ローランド株式会社 Waveform data compression device, waveform data decompression device, quantization device, and data creation method using floating point
WO1996017313A1 (en) * 1994-11-18 1996-06-06 Oracle Corporation Method and apparatus for indexing multimedia information streams
US5751338A (en) 1994-12-30 1998-05-12 Visionary Corporate Technologies Methods and systems for multimedia communications via public telephone networks
US5729471A (en) * 1995-03-31 1998-03-17 The Regents Of The University Of California Machine dynamic selection of one video camera/image of a scene from multiple video cameras/images of the scene in accordance with a particular perspective on the scene, an object in the scene, or an event in the scene
US5729741A (en) 1995-04-10 1998-03-17 Golden Enterprises, Inc. System for storage and retrieval of diverse types of information obtained from different media sources which includes video, audio, and text transcriptions
US5677708A (en) 1995-05-05 1997-10-14 Microsoft Corporation System for displaying a list on a display screen
US5727141A (en) * 1995-05-05 1998-03-10 Apple Computer, Inc. Method and apparatus for identifying user-selectable regions within multiple display frames
JPH0916457A (en) * 1995-06-28 1997-01-17 Fujitsu Ltd Multimedia data retrieval system
US5943046A (en) * 1995-07-19 1999-08-24 Intervoice Limited Partnership Systems and methods for the distribution of multimedia information
US5778190A (en) * 1995-07-21 1998-07-07 Intel Corporation Encoding video signals using multi-phase motion estimation
US5687331A (en) 1995-08-03 1997-11-11 Microsoft Corporation Method and system for displaying an animated focus item
US5742816A (en) * 1995-09-15 1998-04-21 Infonautics Corporation Method and apparatus for identifying textual documents and multi-mediafiles corresponding to a search topic
US5737619A (en) 1995-10-19 1998-04-07 Judson; David Hugh World wide web browsing with content delivery over an idle connection and interstitial content display
US5884056A (en) 1995-12-28 1999-03-16 International Business Machines Corporation Method and system for video browsing on the world wide web
US5786856A (en) * 1996-03-19 1998-07-28 International Business Machines Method for adaptive quantization by multiplication of luminance pixel blocks by a modified, frequency ordered hadamard matrix
US5852435A (en) * 1996-04-12 1998-12-22 Avid Technology, Inc. Digital multimedia editing and data management system
US5870754A (en) * 1996-04-25 1999-02-09 Philips Electronics North America Corporation Video retrieval of MPEG compressed sequences using DC and motion signatures
US5903892A (en) 1996-05-24 1999-05-11 Magnifi, Inc. Indexing of media content on a network
US5797008A (en) 1996-08-09 1998-08-18 Digital Equipment Corporation Memory storing an integrated index of database records
US5895477A (en) * 1996-09-09 1999-04-20 Design Intelligence, Inc. Design engine for automatic layout of content
US6172672B1 (en) * 1996-12-18 2001-01-09 SeeItFirst.com Method and system for providing snapshots from a compressed digital video stream
US6081278A (en) * 1998-06-11 2000-06-27 Chen; Shenchang Eric Animation object having multiple resolution format

Cited By (209)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7181127B2 (en) 1997-05-16 2007-02-20 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US7181126B2 (en) 1997-05-16 2007-02-20 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US7254311B2 (en) 1997-05-16 2007-08-07 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US6400890B1 (en) * 1997-05-16 2002-06-04 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US6587637B2 (en) * 1997-05-16 2003-07-01 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US20020012521A1 (en) * 1997-05-16 2002-01-31 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US20020012519A1 (en) * 1997-05-16 2002-01-31 Hitachi, Ltd. Image retrieving method and apparatuses therefor
US7362323B2 (en) * 1998-05-23 2008-04-22 Eolas Technologies, Inc. Method and apparatus for identifying features of multidimensional image data in hypermedia systems
US20040078753A1 (en) * 1998-05-23 2004-04-22 Doyle Michael D. Method and apparatus for identifying features of multidimensional image data in hypermedia systems
US6785429B1 (en) * 1998-07-08 2004-08-31 Matsushita Electric Industrial Co., Ltd. Multimedia data retrieval device and method
US20140089241A1 (en) * 1999-02-01 2014-03-27 Steven M. Hoffberg System and method for intermachine markup language communications
US6473756B1 (en) * 1999-06-11 2002-10-29 Acceleration Software International Corporation Method for selecting among equivalent files on a global computer network
US7162493B2 (en) * 2000-02-23 2007-01-09 Penta Trading Ltd. Systems and methods for generating and providing previews of electronic files such as web files
US20030014415A1 (en) * 2000-02-23 2003-01-16 Yuval Weiss Systems and methods for generating and providing previews of electronic files such as web files
US8548978B2 (en) * 2000-04-07 2013-10-01 Virage, Inc. Network video guide and spidering
US8387087B2 (en) 2000-04-07 2013-02-26 Virage, Inc. System and method for applying a database to video multimedia
US20070282818A1 (en) * 2000-04-07 2007-12-06 Virage, Inc. Network video guide and spidering
US20080028047A1 (en) * 2000-04-07 2008-01-31 Virage, Inc. Interactive video application hosting
US8171509B1 (en) 2000-04-07 2012-05-01 Virage, Inc. System and method for applying a database to video multimedia
US7769827B2 (en) 2000-04-07 2010-08-03 Virage, Inc. Interactive video application hosting
US8495694B2 (en) 2000-04-07 2013-07-23 Virage, Inc. Video-enabled community building
US9684728B2 (en) * 2000-04-07 2017-06-20 Hewlett Packard Enterprise Development Lp Sharing video
US7962948B1 (en) 2000-04-07 2011-06-14 Virage, Inc. Video-enabled community building
US9338520B2 (en) 2000-04-07 2016-05-10 Hewlett Packard Enterprise Development Lp System and method for applying a database to video multimedia
US9602862B2 (en) 2000-04-16 2017-03-21 The Directv Group, Inc. Accessing programs using networked digital video recording devices
US10142673B2 (en) 2000-04-16 2018-11-27 The Directv Group, Inc. Accessing programs using networked digital video recording devices
US20050216443A1 (en) * 2000-07-06 2005-09-29 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US9542393B2 (en) 2000-07-06 2017-01-10 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US8117206B2 (en) 2000-07-06 2012-02-14 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US20090125534A1 (en) * 2000-07-06 2009-05-14 Michael Scott Morton Method and System for Indexing and Searching Timed Media Information Based Upon Relevance Intervals
US8527520B2 (en) 2000-07-06 2013-09-03 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevant intervals
US7490092B2 (en) * 2000-07-06 2009-02-10 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US8706735B2 (en) * 2000-07-06 2014-04-22 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US20130318121A1 (en) * 2000-07-06 2013-11-28 Streamsage, Inc. Method and System for Indexing and Searching Timed Media Information Based Upon Relevance Intervals
US9244973B2 (en) 2000-07-06 2016-01-26 Streamsage, Inc. Method and system for indexing and searching timed media information based upon relevance intervals
US9654238B2 (en) 2000-08-08 2017-05-16 The Directv Group, Inc. Method and system for remote television replay control
US8949374B2 (en) 2000-08-08 2015-02-03 The Directv Group, Inc. Method and system for remote television replay control
US20020087661A1 (en) * 2000-08-08 2002-07-04 Matichuk Chris E. One click web records
US10390074B2 (en) 2000-08-08 2019-08-20 The Directv Group, Inc. One click web records
US20020038358A1 (en) * 2000-08-08 2002-03-28 Sweatt Millard E. Method and system for remote television replay control
US9171851B2 (en) 2000-08-08 2015-10-27 The Directv Group, Inc. One click web records
US20040098365A1 (en) * 2000-09-14 2004-05-20 Christophe Comps Method for synchronizing a multimedia file
US7386782B2 (en) * 2000-09-14 2008-06-10 Alcatel Method for synchronizing a multimedia file
US8195769B2 (en) 2001-01-11 2012-06-05 F5 Networks, Inc. Rule based aggregation of files and transactions in a switched file system
USRE43346E1 (en) 2001-01-11 2012-05-01 F5 Networks, Inc. Transaction aggregation in a switched file system
US20090292734A1 (en) * 2001-01-11 2009-11-26 F5 Networks, Inc. Rule based aggregation of files and transactions in a switched file system
US8417681B1 (en) 2001-01-11 2013-04-09 F5 Networks, Inc. Aggregated lock management for locking aggregated files in a switched file system
US8396895B2 (en) 2001-01-11 2013-03-12 F5 Networks, Inc. Directory aggregation for files distributed over a plurality of servers in a switched file system
US8195760B2 (en) 2001-01-11 2012-06-05 F5 Networks, Inc. File aggregation in a switched file system
US8966557B2 (en) 2001-01-22 2015-02-24 Sony Computer Entertainment Inc. Delivery of digital content
US8036421B2 (en) 2001-07-05 2011-10-11 Digimarc Corporation Methods employing topical subject criteria in video processing
US8085979B2 (en) 2001-07-05 2011-12-27 Digimarc Corporation Methods employing stored preference data to identify video of interest to a consumer
US8122465B2 (en) 2001-07-05 2012-02-21 Digimarc Corporation Watermarking to set video usage permissions
US7778441B2 (en) 2001-07-05 2010-08-17 Digimarc Corporation Methods employing topical subject criteria in video processing
US20100199314A1 (en) * 2001-07-05 2010-08-05 Davis Bruce L Methods employing stored preference data to identify video of interest to a consumer
US20080008352A1 (en) * 2001-07-05 2008-01-10 Davis Bruce L Methods Employing Topical Subject Criteria in Video Processing
US8875198B1 (en) 2001-08-19 2014-10-28 The Directv Group, Inc. Network video unit
US7917008B1 (en) 2001-08-19 2011-03-29 The Directv Group, Inc. Interface for resolving recording conflicts with network devices
US9426531B2 (en) 2001-08-19 2016-08-23 The Directv Group, Inc. Network video unit
US9743147B2 (en) 2001-08-19 2017-08-22 The Directv Group, Inc. Network video unit
US9467746B2 (en) 2001-08-19 2016-10-11 The Directv Group, Inc. Network video unit
EP1351501A3 (en) * 2002-03-19 2008-01-30 British Broadcasting Corporation Method and system for accessing video data
EP1351501A2 (en) * 2002-03-19 2003-10-08 British Broadcasting Corporation Method and system for accessing video data
US20130018807A1 (en) * 2002-04-02 2013-01-17 Collaborative Agreements, LLC Method for Facilitating Transactions Between Two or More Parties
US20040015467A1 (en) * 2002-07-18 2004-01-22 Accenture Global Services, Gmbh Media indexing beacon and capture device
US7949689B2 (en) 2002-07-18 2011-05-24 Accenture Global Services Limited Media indexing beacon and capture device
US8108369B2 (en) 2003-01-07 2012-01-31 Accenture Global Services Limited Customized multi-media services
US20040133597A1 (en) * 2003-01-07 2004-07-08 Fano Andrew E. Customized multi-media services
US7593915B2 (en) * 2003-01-07 2009-09-22 Accenture Global Services Gmbh Customized multi-media services
WO2004084448A3 (en) * 2003-03-20 2007-07-05 Digital Networks North America System and method for navigation of indexed video content
US20040221311A1 (en) * 2003-03-20 2004-11-04 Christopher Dow System and method for navigation of indexed video content
WO2004084448A2 (en) * 2003-03-20 2004-09-30 Digital Networks North America, Inc. System and method for navigation of indexed video content
US7735104B2 (en) 2003-03-20 2010-06-08 The Directv Group, Inc. System and method for navigation of indexed video content
US8752115B2 (en) 2003-03-24 2014-06-10 The Directv Group, Inc. System and method for aggregating commercial navigation information
US20040190853A1 (en) * 2003-03-24 2004-09-30 Christopher Dow System and method for aggregating commercial navigation information
US20040221044A1 (en) * 2003-05-02 2004-11-04 Oren Rosenbloom System and method for facilitating communication between a computing device and multiple categories of media devices
US7673020B2 (en) 2003-05-02 2010-03-02 Microsoft Corporation System and method for facilitating communication between a computing device and multiple categories of media devices
US20050091274A1 (en) * 2003-10-28 2005-04-28 International Business Machines Corporation System and method for transcribing audio files of various languages
US20080052062A1 (en) * 2003-10-28 2008-02-28 Joey Stanford System and Method for Transcribing Audio Files of Various Languages
US7321852B2 (en) 2003-10-28 2008-01-22 International Business Machines Corporation System and method for transcribing audio files of various languages
US8996369B2 (en) 2003-10-28 2015-03-31 Nuance Communications, Inc. System and method for transcribing audio files of various languages
US20050125821A1 (en) * 2003-11-18 2005-06-09 Zhu Li Method and apparatus for characterizing a video segment and determining if a first video segment matches a second video segment
WO2005050973A3 (en) * 2003-11-18 2006-08-31 Motorola Inc Method for video segment matching
WO2005050973A2 (en) * 2003-11-18 2005-06-02 Motorola, Inc. Method for video segment matching
US20050163462A1 (en) * 2004-01-28 2005-07-28 Pratt Buell A. Motion picture asset archive having reduced physical volume and method
US20050165840A1 (en) * 2004-01-28 2005-07-28 Pratt Buell A. Method and apparatus for improved access to a compacted motion picture asset archive
US7502820B2 (en) 2004-05-03 2009-03-10 Microsoft Corporation System and method for optimized property retrieval of stored objects
US20050246375A1 (en) * 2004-05-03 2005-11-03 Microsoft Corporation System and method for encapsulation of representative sample of media object
WO2005111869A3 (en) * 2004-05-03 2006-01-05 Microsoft Corp System and method for encapsulation of representative sample of media object
US20060031384A1 (en) * 2004-05-03 2006-02-09 Microsoft Corporation System and method for optimized property retrieval of stored objects
US7574655B2 (en) 2004-05-03 2009-08-11 Microsoft Corporation System and method for encapsulation of representative sample of media object
US7953504B2 (en) 2004-05-14 2011-05-31 Synaptics Incorporated Method and apparatus for selecting an audio track based upon audio excerpts
US20050254366A1 (en) * 2004-05-14 2005-11-17 Renaud Amar Method and apparatus for selecting an audio track based upon audio excerpts
US7555554B2 (en) 2004-08-06 2009-06-30 Microsoft Corporation System and method for generating selectable extension to media transport protocol
US20060031545A1 (en) * 2004-08-06 2006-02-09 Microsoft Corporation System and method for generating selectable extension to media transport protocol
US8433735B2 (en) 2005-01-20 2013-04-30 F5 Networks, Inc. Scalable system for partitioning and accessing metadata over multiple servers
US8397059B1 (en) 2005-02-04 2013-03-12 F5 Networks, Inc. Methods and apparatus for implementing authentication
US20060200470A1 (en) * 2005-03-03 2006-09-07 Z-Force Communications, Inc. System and method for managing small-size files in an aggregated file system
US8239354B2 (en) 2005-03-03 2012-08-07 F5 Networks, Inc. System and method for managing small-size files in an aggregated file system
US20060277174A1 (en) * 2005-06-06 2006-12-07 Thomson Licensing Method and device for searching a data unit in a database
US20060294064A1 (en) * 2005-06-24 2006-12-28 Microsoft Corporation Storing queries on devices with rewritable media
US20060294585A1 (en) * 2005-06-24 2006-12-28 Microsoft Corporation System and method for creating and managing a trusted constellation of personal digital devices
US20070079010A1 (en) * 2005-10-04 2007-04-05 Microsoft Corporation Media exchange protocol and devices using the same
US8117342B2 (en) 2005-10-04 2012-02-14 Microsoft Corporation Media exchange protocol supporting format conversion of media items
US9600584B2 (en) * 2005-11-04 2017-03-21 Nokia Technologies Oy Scalable visual search system simplifying access to network and device functionality
US20140365472A1 (en) * 2005-11-04 2014-12-11 Nokia Corporation Scalable visual search system simplifying access to network and device functionality
US8849821B2 (en) * 2005-11-04 2014-09-30 Nokia Corporation Scalable visual search system simplifying access to network and device functionality
US20070106721A1 (en) * 2005-11-04 2007-05-10 Philipp Schloter Scalable visual search system simplifying access to network and device functionality
US20070245028A1 (en) * 2006-03-31 2007-10-18 Baxter Robert A Configuring content in an interactive media system
US8417746B1 (en) * 2006-04-03 2013-04-09 F5 Networks, Inc. File system management with enhanced searchability
WO2007124080A2 (en) * 2006-04-21 2007-11-01 Hewlett-Packard Development Company L.P. Method and system for finding data objects within large data-object libraries
WO2007124080A3 (en) * 2006-04-21 2008-04-17 Hewlett Packard Development Co Method and system for finding data objects within large data-object libraries
US20070250499A1 (en) * 2006-04-21 2007-10-25 Simon Widdowson Method and system for finding data objects within large data-object libraries
US20080071750A1 (en) * 2006-09-17 2008-03-20 Nokia Corporation Method, Apparatus and Computer Program Product for Providing Standard Real World to Virtual World Links
US8775452B2 (en) 2006-09-17 2014-07-08 Nokia Corporation Method, apparatus and computer program product for providing standard real world to virtual world links
US9678987B2 (en) 2006-09-17 2017-06-13 Nokia Technologies Oy Method, apparatus and computer program product for providing standard real world to virtual world links
US20080071770A1 (en) * 2006-09-18 2008-03-20 Nokia Corporation Method, Apparatus and Computer Program Product for Viewing a Virtual Database Using Portable Devices
US8682916B2 (en) 2007-05-25 2014-03-25 F5 Networks, Inc. Remote file virtualization in a switched file system
US9483405B2 (en) 2007-09-20 2016-11-01 Sony Interactive Entertainment Inc. Simplified run-time program translation for emulating complex processor pipelines
US20090254592A1 (en) * 2007-11-12 2009-10-08 Attune Systems, Inc. Non-Disruptive File Migration
US8548953B2 (en) 2007-11-12 2013-10-01 F5 Networks, Inc. File deduplication using storage tiers
US8117244B2 (en) 2007-11-12 2012-02-14 F5 Networks, Inc. Non-disruptive file migration
US8180747B2 (en) 2007-11-12 2012-05-15 F5 Networks, Inc. Load sharing cluster file systems
US20090150433A1 (en) * 2007-12-07 2009-06-11 Nokia Corporation Method, Apparatus and Computer Program Product for Using Media Content as Awareness Cues
US8352785B1 (en) 2007-12-13 2013-01-08 F5 Networks, Inc. Methods for generating a unified virtual snapshot and systems thereof
US8549582B1 (en) 2008-07-11 2013-10-01 F5 Networks, Inc. Methods for handling a multi-protocol content name and systems thereof
US8752084B1 (en) 2008-07-11 2014-06-10 The Directv Group, Inc. Television advertisement monitoring system
US9442933B2 (en) 2008-12-24 2016-09-13 Comcast Interactive Media, Llc Identification of segments within audio, video, and multimedia items
US11468109B2 (en) 2008-12-24 2022-10-11 Comcast Interactive Media, Llc Searching for segments based on an ontology
US20100158470A1 (en) * 2008-12-24 2010-06-24 Comcast Interactive Media, Llc Identification of segments within audio, video, and multimedia items
US8713016B2 (en) 2008-12-24 2014-04-29 Comcast Interactive Media, Llc Method and apparatus for organizing segments of media assets and determining relevance of segments to a query
US20100161580A1 (en) * 2008-12-24 2010-06-24 Comcast Interactive Media, Llc Method and apparatus for organizing segments of media assets and determining relevance of segments to a query
US9477712B2 (en) 2008-12-24 2016-10-25 Comcast Interactive Media, Llc Searching for segments based on an ontology
US10635709B2 (en) 2008-12-24 2020-04-28 Comcast Interactive Media, Llc Searching for segments based on an ontology
US11531668B2 (en) 2008-12-29 2022-12-20 Comcast Interactive Media, Llc Merging of multiple data sets
US10025832B2 (en) 2009-03-12 2018-07-17 Comcast Interactive Media, Llc Ranking search results
US9348915B2 (en) 2009-03-12 2016-05-24 Comcast Interactive Media, Llc Ranking search results
US9626424B2 (en) 2009-05-12 2017-04-18 Comcast Interactive Media, Llc Disambiguation and tagging of entities
US8533223B2 (en) 2009-05-12 2013-09-10 Comcast Interactive Media, LLC. Disambiguation and tagging of entities
US11562737B2 (en) 2009-07-01 2023-01-24 Tivo Corporation Generating topic-specific language models
US10559301B2 (en) 2009-07-01 2020-02-11 Comcast Interactive Media, Llc Generating topic-specific language models
US9892730B2 (en) 2009-07-01 2018-02-13 Comcast Interactive Media, Llc Generating topic-specific language models
US8370288B2 (en) 2009-07-20 2013-02-05 Sony Computer Entertainment America Llc Summarizing a body of media by assembling selected summaries
US8744993B2 (en) 2009-07-20 2014-06-03 Sony Computer Entertainment America Llc Summarizing a body of media by assembling selected summaries
US20110016079A1 (en) * 2009-07-20 2011-01-20 Adam Harris Summarizing a Body of Media
US8126987B2 (en) 2009-11-16 2012-02-28 Sony Computer Entertainment Inc. Mediation of content-related services
US8856148B1 (en) 2009-11-18 2014-10-07 Soundhound, Inc. Systems and methods for determining underplayed and overplayed items
US8392372B2 (en) 2010-02-09 2013-03-05 F5 Networks, Inc. Methods and systems for snapshot reconstitution
US9195500B1 (en) 2010-02-09 2015-11-24 F5 Networks, Inc. Methods for seamless storage importing and devices thereof
US8204860B1 (en) 2010-02-09 2012-06-19 F5 Networks, Inc. Methods and systems for snapshot reconstitution
US8725766B2 (en) * 2010-03-25 2014-05-13 Rovi Technologies Corporation Searching text and other types of content by using a frequency domain
US20110238698A1 (en) * 2010-03-25 2011-09-29 Rovi Technologies Corporation Searching text and other types of content by using a frequency domain
US9691430B2 (en) 2010-04-01 2017-06-27 Microsoft Technology Licensing, Llc Opportunistic frame caching
WO2011123146A1 (en) * 2010-04-01 2011-10-06 Microsoft Corporation Opportunistic frame caching
US20100211693A1 (en) * 2010-05-04 2010-08-19 Aaron Steven Master Systems and Methods for Sound Recognition
US9280598B2 (en) * 2010-05-04 2016-03-08 Soundhound, Inc. Systems and methods for sound recognition
US8688253B2 (en) * 2010-05-04 2014-04-01 Soundhound, Inc. Systems and methods for sound recognition
US8433759B2 (en) 2010-05-24 2013-04-30 Sony Computer Entertainment America Llc Direction-conscious information sharing
US9258175B1 (en) 2010-05-28 2016-02-09 The Directv Group, Inc. Method and system for sharing playlists for content stored within a network
US8423555B2 (en) 2010-07-09 2013-04-16 Comcast Cable Communications, Llc Automatic segmentation of video
US9177080B2 (en) 2010-07-09 2015-11-03 Comcast Cable Communications, Llc Automatic segmentation of video
USRE47019E1 (en) 2010-07-14 2018-08-28 F5 Networks, Inc. Methods for DNSSEC proxying and deployment amelioration and systems thereof
US9355407B2 (en) 2010-07-29 2016-05-31 Soundhound, Inc. Systems and methods for searching cloud-based databases
US10657174B2 (en) 2010-07-29 2020-05-19 Soundhound, Inc. Systems and methods for providing identification information in response to an audio segment
US8694537B2 (en) 2010-07-29 2014-04-08 Soundhound, Inc. Systems and methods for enabling natural language processing
US8694534B2 (en) 2010-07-29 2014-04-08 Soundhound, Inc. Systems and methods for searching databases by sound input
US9390167B2 (en) 2010-07-29 2016-07-12 Soundhound, Inc. System and methods for continuous audio matching
US10055490B2 (en) 2010-07-29 2018-08-21 Soundhound, Inc. System and methods for continuous audio matching
US9286298B1 (en) 2010-10-14 2016-03-15 F5 Networks, Inc. Methods for enhancing management of backup data sets and devices thereof
US10575031B2 (en) 2011-04-11 2020-02-25 Evertz Microsystems Ltd. Methods and systems for network based video clip generation and management
US20140090002A1 (en) * 2011-04-11 2014-03-27 Evertz Microsystems Ltd. Methods and systems for network based video clip generation and management
US11240538B2 (en) 2011-04-11 2022-02-01 Evertz Microsystems Ltd. Methods and systems for network based video clip generation and management
US9996615B2 (en) * 2011-04-11 2018-06-12 Evertz Microsystems Ltd. Methods and systems for network based video clip generation and management
US10078695B2 (en) 2011-04-11 2018-09-18 Evertz Microsystems Ltd. Methods and systems for network based video clip generation and management
US10832287B2 (en) 2011-05-10 2020-11-10 Soundhound, Inc. Promotional content targeting based on recognized audio
US10121165B1 (en) 2011-05-10 2018-11-06 Soundhound, Inc. System and method for targeting content based on identified audio and multimedia
US8396836B1 (en) 2011-06-30 2013-03-12 F5 Networks, Inc. System for mitigating file virtualization storage import latency
US10467289B2 (en) 2011-08-02 2019-11-05 Comcast Cable Communications, Llc Segmentation of video according to narrative theme
US8463850B1 (en) 2011-10-26 2013-06-11 F5 Networks, Inc. System and method of algorithmically generating a server side transaction identifier
USRE48725E1 (en) 2012-02-20 2021-09-07 F5 Networks, Inc. Methods for accessing data in a compressed file system and devices thereof
US9330277B2 (en) 2012-06-21 2016-05-03 Google Technology Holdings LLC Privacy manager for restricting correlation of meta-content having protected information based on privacy rules
US8959574B2 (en) 2012-06-21 2015-02-17 Google Technology Holdings LLC Content rights protection with arbitrary correlation of second content
WO2013191856A1 (en) * 2012-06-21 2013-12-27 Motorola Mobility Llc Correlation engine and method for granular meta-content having arbitrary non-uniform granularity
US11776533B2 (en) 2012-07-23 2023-10-03 Soundhound, Inc. Building a natural language understanding application using a received electronic record containing programming code including an interpret-block, an interpret-statement, a pattern expression and an action statement
US10957310B1 (en) 2012-07-23 2021-03-23 Soundhound, Inc. Integrated programming framework for speech and text understanding with meaning parsing
US10996931B1 (en) 2012-07-23 2021-05-04 Soundhound, Inc. Integrated programming framework for speech and text understanding with block and statement structure
US20140040273A1 (en) * 2012-08-03 2014-02-06 Fuji Xerox Co., Ltd. Hypervideo browsing using links generated based on user-specified content features
US9244923B2 (en) * 2012-08-03 2016-01-26 Fuji Xerox Co., Ltd. Hypervideo browsing using links generated based on user-specified content features
US9871842B2 (en) 2012-12-08 2018-01-16 Evertz Microsystems Ltd. Methods and systems for network based video clip processing and management
US10542058B2 (en) 2012-12-08 2020-01-21 Evertz Microsystems Ltd. Methods and systems for network based video clip processing and management
US20160133298A1 (en) * 2013-07-15 2016-05-12 Zte Corporation Method and Device for Adjusting Playback Progress of Video File
US9799375B2 (en) * 2013-07-15 2017-10-24 Xi'an Zhongxing New Software Co. Ltd Method and device for adjusting playback progress of video file
US9507849B2 (en) 2013-11-28 2016-11-29 Soundhound, Inc. Method for combining a query and a communication command in a natural language computer system
US9601114B2 (en) 2014-02-01 2017-03-21 Soundhound, Inc. Method for embedding voice mail in a spoken utterance using a natural language processing computer system
US9292488B2 (en) 2014-02-01 2016-03-22 Soundhound, Inc. Method for embedding voice mail in a spoken utterance using a natural language processing computer system
US11295730B1 (en) 2014-02-27 2022-04-05 Soundhound, Inc. Using phonetic variants in a local context to improve natural language understanding
US9564123B1 (en) 2014-05-12 2017-02-07 Soundhound, Inc. Method and system for building an integrated user profile
US10311858B1 (en) 2014-05-12 2019-06-04 Soundhound, Inc. Method and system for building an integrated user profile
US11030993B2 (en) 2014-05-12 2021-06-08 Soundhound, Inc. Advertisement selection by linguistic classification
US10776419B2 (en) * 2014-05-16 2020-09-15 Gracenote Digital Ventures, Llc Audio file quality and accuracy assessment
US20150331941A1 (en) * 2014-05-16 2015-11-19 Tribune Digital Ventures, Llc Audio File Quality and Accuracy Assessment
US10797888B1 (en) 2016-01-20 2020-10-06 F5 Networks, Inc. Methods for secured SCEP enrollment for client devices and devices thereof
US10665237B2 (en) 2017-04-26 2020-05-26 International Business Machines Corporation Adaptive digital assistant and spoken genome
US20190019498A1 (en) * 2017-04-26 2019-01-17 International Business Machines Corporation Adaptive digital assistant and spoken genome
US10607608B2 (en) * 2017-04-26 2020-03-31 International Business Machines Corporation Adaptive digital assistant and spoken genome
US20190019499A1 (en) * 2017-04-26 2019-01-17 International Business Machines Corporation Adaptive digital assistant and spoken genome
US10567492B1 (en) 2017-05-11 2020-02-18 F5 Networks, Inc. Methods for load balancing in a federated identity environment and devices thereof
US10833943B1 (en) 2018-03-01 2020-11-10 F5 Networks, Inc. Methods for service chaining and devices thereof
US20230401789A1 (en) * 2022-06-13 2023-12-14 Verizon Patent And Licensing Inc. Methods and systems for unified rendering of light and sound content for a simulated 3d environment

Also Published As

Publication number Publication date
US6370543B2 (en) 2002-04-09

Similar Documents

Publication Publication Date Title
US6282549B1 (en) Indexing of media content on a network
US6370543B2 (en) Display of media previews
US5983176A (en) Evaluation of media content in media files
US6374260B1 (en) Method and apparatus for uploading, indexing, analyzing, and searching media content
US7131059B2 (en) Scalably presenting a collection of media objects
US7149755B2 (en) Presenting a collection of media objects
Aigrain et al. Content-based representation and retrieval of visual media: A state-of-the-art review
US9122754B2 (en) Intelligent video summaries in information access
US7826709B2 (en) Metadata editing apparatus, metadata reproduction apparatus, metadata delivery apparatus, metadata search apparatus, metadata re-generation condition setting apparatus, metadata delivery method and hint information description method
US8156132B1 (en) Systems for comparing image fingerprints
US8090200B2 (en) Redundancy elimination in a content-adaptive video preview system
US7421455B2 (en) Video search and services
US20190320213A1 (en) Media management based on derived quantitative data of quality
WO2000045604A1 (en) Signal processing method and video/voice processing device
JP2005522785A (en) Media object management method
KR20030007736A (en) Streaming video bookmarks
JP2002140712A (en) Av signal processor, av signal processing method, program and recording medium
US20060126942A1 (en) Method of and apparatus for retrieving movie image
Metso et al. A content model for the mobile adaptation of multimedia information
JP3517349B2 (en) Music video classification method and apparatus, and recording medium recording music video classification program
England et al. I/browse: The bellcore video library toolkit
JP2000285242A (en) Signal processing method and video sound processing device
JP3358692B2 (en) Video block classification method and apparatus
Snoek et al. The role of visual content and style for concert video indexing
Shahraray et al. Multimedia Processing for Advanced Communications Services

Legal Events

Date Code Title Description
AS Assignment
Owner name: MAGNIFI, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:HOFFERT, ERIC M.;SMOOT, STEVE;CREMIN, KARL;AND OTHERS;REEL/FRAME:009054/0356;SIGNING DATES FROM 19980119 TO 19980303

AS Assignment
Owner name: PROCTER & GAMBLE PLATFORM, INC., OHIO
Free format text: SECURITY INTEREST;ASSIGNORS:EMMPERATIVE MARKETING HOLDINGS, INC.;EMMPERATIVE MARKETING, INC.;REEL/FRAME:012691/0660
Effective date: 20020304
Owner name: RUSTIC CANYON VENTURES, L.P., CALIFORNIA
Free format text: SECURITY INTEREST;ASSIGNORS:EMMPERATIVE MARKETING HOLDINGS, INC.;EMMPERATIVE MARKETING, INC.;REEL/FRAME:012691/0660
Effective date: 20020304
Owner name: ACCENTURE LLP, TEXAS
Free format text: SECURITY INTEREST;ASSIGNORS:EMMPERATIVE MARKETING HOLDINGS, INC.;EMMPERATIVE MARKETING, INC.;REEL/FRAME:012691/0660
Effective date: 20020304

STCF Information on status: patent grant
Free format text: PATENTED CASE

AS Assignment
Owner name: WORLDWIDE MAGNIFI, INC., CALIFORNIA
Free format text: CHANGE OF NAME;ASSIGNOR:AT ONCE NETWORKS, INC./D.B.A./ MAGNIFI, INC.;REEL/FRAME:013484/0426
Effective date: 19970320

AS Assignment
Owner name: EMMPERATIVE MARKETING, INC., CALIFORNIA
Free format text: MERGER;ASSIGNOR:WORLDWIDE MAGNIFI, INC.;REEL/FRAME:013496/0455
Effective date: 20010710

AS Assignment
Owner name: INSOLVENCY SERVICES GROUP, INC., CALIFORNIA
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:EMMPERATIVE MARKETING, INC.;REEL/FRAME:013552/0097
Effective date: 20020828

AS Assignment
Owner name: PROCTER & GAMBLE COMPANY, THE, OHIO
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:INSOLVENCY SERVICES GROUP, INC.;REEL/FRAME:013506/0320
Effective date: 20020828

FEPP Fee payment procedure
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

FPAY Fee payment
Year of fee payment: 4

FPAY Fee payment
Year of fee payment: 8

FPAY Fee payment
Year of fee payment: 12