WO2008138356A3 - System and method for arabic omni font written optica character recognition - Google Patents

System and method for arabic omni font written optica character recognition

Info

Publication number
WO2008138356A3
Authority
WO
WIPO (PCT)
Prior art keywords
character recognition
arabic
optica
omni
hmm
Prior art date
Application number
PCT/EG2007/000018
Other languages
French (fr)
Other versions
WO2008138356A2 (en)
Inventor
Mohsen Abdel-Razik Ali Rashwan
Mohamed Attia Mohamed El-Araby Ahmed
Original Assignee
The Engineering Company For The Development Of Computer Systems ; (Rdi)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
2007-05-15
Publication date
2009-09-24
Application filed by The Engineering Company For The Development Of Computer Systems ; (Rdi)
Priority to PCT/EG2007/000018 (WO2008138356A2)
Publication of WO2008138356A2
Publication of WO2008138356A3

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/18 Extraction of features or characteristics of the image
    • G06V30/1801 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections
    • G06V30/18019 Detecting partial patterns, e.g. edges or contours, or configurations, e.g. loops, corners, strokes or intersections by matching or filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/29 Graphical models, e.g. Bayesian networks
    • G06F18/295 Markov models or related models, e.g. semi-Markov models; Markov random fields; Networks embedding Markov models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

An optical character recognition (OCR) system and method are disclosed for Arabic script. The system is based on Hidden Markov Models (HMMs), an approach that has proven very successful in automatic speech recognition. It is divided into two subsystems: a trainer subsystem based on HMMs, and a recognizer subsystem, which comprises a feature extractor (M4) that provides a series of feature vectors to the vector quantizer module (M5) and which uses the Viterbi algorithm to map input bitmaps to output characters. The system uses a special histogram-based algorithm to decompose text into lines and words.
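The recognizer path named in the abstract (feature vectors, then vector quantization, then Viterbi decoding) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the function names `quantize` and `viterbi`, the two-entry codebook, the two-state model and all probabilities are hypothetical. In a real system the states would belong to trained character models, and the codebook and probabilities would come from the trainer subsystem.

```python
import math


def quantize(vectors, codebook):
    """Map each feature vector to the index of its nearest codebook entry."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
            for v in vectors]


def viterbi(obs, start_p, trans_p, emit_p):
    """Return the most likely state path for a discrete-observation HMM.

    Works in the log domain to avoid underflow on long sequences.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    n = len(start_p)
    # Initialise with the first observation.
    score = [log(start_p[s]) + log(emit_p[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        pointers, score = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + log(trans_p[r][s]))
            pointers.append(best)
            score.append(prev[best] + log(trans_p[best][s]) + log(emit_p[s][o]))
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy data (hypothetical): 2-D feature vectors and a 2-entry codebook.
codebook = [(0.0, 0.0), (1.0, 1.0)]
features = [(0.1, 0.0), (0.2, 0.1), (0.9, 1.0), (1.1, 0.8)]
symbols = quantize(features, codebook)

# Toy 2-state HMM (hypothetical probabilities).
start_p = [0.6, 0.4]
trans_p = [[0.7, 0.3], [0.3, 0.7]]
emit_p = [[0.9, 0.1], [0.2, 0.8]]
path = viterbi(symbols, start_p, trans_p, emit_p)
print(symbols, path)
```

Quantizing first keeps the HMM emissions discrete, so each state only needs a small table of symbol probabilities rather than a continuous density. The sketch does not cover the trainer subsystem or the histogram-based line and word decomposition.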

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EG2007/000018 WO2008138356A2 (en) 2007-05-15 2007-05-15 System and method for arabic omni font written optica character recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EG2007/000018 WO2008138356A2 (en) 2007-05-15 2007-05-15 System and method for arabic omni font written optica character recognition

Publications (2)

Publication Number Publication Date
WO2008138356A2 WO2008138356A2 (en) 2008-11-20
WO2008138356A3 WO2008138356A3 (en) 2009-09-24

Family

ID=40002673

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EG2007/000018 WO2008138356A2 (en) 2007-05-15 2007-05-15 System and method for arabic omni font written optica character recognition

Country Status (1)

Country Link
WO (1) WO2008138356A2 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8111911B2 (en) * 2009-04-27 2012-02-07 King Abdulaziz City For Science And Technology System and methods for arabic text recognition based on effective arabic text feature extraction
EP2662802A1 (en) * 2012-05-09 2013-11-13 King Abdulaziz City for Science & Technology (KACST) Method and system for preprocessing an image for optical character recognition
JP5986051B2 (en) * 2013-05-12 2016-09-06 King Abdulaziz City For Science And Technology (Kacst) Method for automatically recognizing Arabic text
US9014481B1 (en) 2014-04-22 2015-04-21 King Fahd University Of Petroleum And Minerals Method and apparatus for Arabic and Farsi font recognition
US9501708B1 (en) 2015-09-10 2016-11-22 King Fahd University Of Petroleum And Minerals Adaptive sliding windows for text recognition
CN109255113B (en) * 2018-09-04 2022-10-11 郑州信大壹密科技有限公司 Intelligent proofreading system
US11270153B2 (en) 2020-02-19 2022-03-08 Northrop Grumman Systems Corporation System and method for whole word conversion of text in image

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5933525A (en) * 1996-04-10 1999-08-03 Bbn Corporation Language-independent and segmentation-free optical character recognition system and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Computational Intelligence for Measurement Systems and Applications, CIMSA 2004 IEEE International Conference on Boston, MA, USA", 14 July 2004, article GOUDA ET AL.: "Segmentation of connected arabic characters using hidden Markov models", pages: 115 - 119, XP010773387 *
"Document Image Analysis for Libraries, 2006. DIAL '06. Second International Conference on Lyon, France", 27 April 2006, article NADIA BEN AMOR ET AL.: "Combining a hybrid Approach for Features Selection and Hidden Markov Models in Multifont Arabic Characters Recognition", pages: 103 - 107, XP010912514 *

Also Published As

Publication number Publication date
WO2008138356A2 (en) 2008-11-20

Similar Documents

Publication Publication Date Title
WO2008138356A3 (en) System and method for arabic omni font written optica character recognition
KR102386854B1 (en) Apparatus and method for speech recognition based on unified model
CN108986791B (en) Chinese and English language voice recognition method and system for civil aviation air-land communication field
US10127927B2 (en) Emotional speech processing
US9721561B2 (en) Method and apparatus for speech recognition using neural networks with speaker adaptation
US20090119105A1 (en) Acoustic Model Adaptation Methods Based on Pronunciation Variability Analysis for Enhancing the Recognition of Voice of Non-Native Speaker and Apparatus Thereof
KR101237799B1 (en) Improving the robustness to environmental changes of a context dependent speech recognizer
Bao et al. Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition
WO2005077098A8 (en) Handwriting and voice input with automatic correction
CN111210807B (en) Speech recognition model training method, system, mobile terminal and storage medium
Yu et al. Conversational Speech Transcription Using Context-Dependent Deep Neural Networks.
GB0207343D0 (en) Signal processing system
US20080004876A1 (en) Non-enrolled continuous dictation
CN111179917B (en) Speech recognition model training method, system, mobile terminal and storage medium
Lee et al. Audio-to-visual conversion using hidden markov models
Hartmann et al. Acoustic unit discovery and pronunciation generation from a grapheme-based lexicon
CN110415725A (en) Use the method and system of first language data assessment second language pronunciation quality
CN112133292A (en) End-to-end automatic voice recognition method for civil aviation land-air communication field
Vazhenina et al. Phoneme set selection for Russian speech recognition
US9953638B2 (en) Meta-data inputs to front end processing for automatic speech recognition
Luo et al. Modeling characters versus words for mandarin speech recognition
Joy et al. DNNs for unsupervised extraction of pseudo speaker-normalized features without explicit adaptation data
Murali Karthick et al. Speaker adaptation of convolutional neural network using speaker specific subspace vectors of SGMM
WO2004008433A3 (en) System and method for mandarin chinese speech recognition using an optimized phone set
KR20160015005A (en) Method and apparatus for discriminative training acoustic model based on class, and speech recognition apparatus using the same

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07722742

Country of ref document: EP

Kind code of ref document: A2

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 13/04/10)

122 Ep: pct application non-entry in european phase

Ref document number: 07722742

Country of ref document: EP

Kind code of ref document: A2