Mind Map: OCR intro

1. Optical character recognition (OCR): a branch of pattern recognition

2. Categories:

2.1. Task-specific

2.1.1. Ex: bank check reader, card reader

2.1.2. High throughput rates

2.1.3. Low error rates

2.2. General purpose

2.2.1. Ex: commercial OCR software

3. Current OCR systems:

3.1. Commercial

3.1.1. Capture development system 12

3.1.2. FineReader 7.0

3.1.3. Automatic Reader 7.0

3.1.4. NovoD DX

3.1.5. NeuroTalker

3.1.6. BBN

3.1.7. PrimeOCR

3.1.8. Cuneiform

3.1.9. Vividata OCR

3.2. Open source

3.2.1. NIST form-based handprint recognition system

3.2.2. Illuminator

3.2.3. Calpoly OCR

3.2.4. Xocr

3.2.5. Tesseract

4. Processing stages

4.1. Document digitization

4.2. Character/word recognition

4.2.1. Image analysis

4.2.2. Enhancement

4.2.3. Contextual processing

4.3. Output distribution

5. Trends

5.1. Adaptive OCR (robust across a wide range of printed documents)

5.1.1. Multi-script and multi-language

5.1.2. Omni-font text

5.1.3. Automatic document segmentation

5.1.4. Mathematical notation

5.2. Handwriting recognition

5.2.1. Hand-printed text in forms

5.2.2. Personal checks

5.2.3. Postal envelope and parcel address readers

5.2.4. Portable and handheld devices

5.3. Document image enhancement: applying image filters to the source document image

5.4. Intelligent post-processing

5.5. OCR in multimedia: reading text in video and images rather than in documents
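
As an illustration of the image-filter idea in 5.3, here is a minimal sketch of a median filter, one common way to remove isolated specks from a scanned page. The outline does not name a specific filter, so the choice of a median filter, the function name `median_filter`, and the list-of-rows grayscale representation (0 = black, 255 = white) are all assumptions made for this example.

```python
from statistics import median_low


def median_filter(image, radius=1):
    """Return a new image where each pixel is the median of its neighborhood.

    `image` is a list of rows of grayscale integers (hypothetical format).
    The window is clipped at the image border rather than padded.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Gather every pixel inside the (2*radius+1)-square window.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns a value that occurs in the window.
            row.append(median_low(window))
        result.append(row)
    return result


# A white page with a single stray dark speck in the middle.
noisy = [
    [255, 255, 255],
    [255, 0, 255],
    [255, 255, 255],
]
cleaned = median_filter(noisy)
```

In this toy case the speck is replaced by the surrounding white value. On real scans the window size has to be balanced against the stroke width of the text, since a large window can also erase thin strokes.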

6. OCR techniques

6.1. Image formats for OCR

6.1.1. Good choices: TIFF, BMP, GIF, PNG

6.1.2. Acceptable: JPEG (lossy compression discards information)

6.1.3. Not good: PCX (memory-inefficient); PDF (requires a third-party PDF reader)

6.2. Essential components:

6.2.1. Feature extractor

6.2.2. Classifier: template matching; structural classification; discriminant function classifier; Bayesian classifier; ANN (artificial neural networks)
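
Template matching, the first classifier listed in 6.2.2, can be sketched in a few lines. This is a toy version under stated assumptions: glyphs are tiny 3x3 binary bitmaps written as strings, the "feature extractor" is simply the raw pixels, and the names `TEMPLATES`, `hamming_distance`, and `classify` are hypothetical. Real systems normalize size and position first and use far larger templates.

```python
# Hypothetical 3x3 templates: "1" is ink, "0" is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def hamming_distance(glyph, template):
    """Count the pixels where the glyph and the template disagree."""
    return sum(
        p != q
        for glyph_row, template_row in zip(glyph, template)
        for p, q in zip(glyph_row, template_row)
    )


def classify(glyph, templates=TEMPLATES):
    """Return (label, distance) for the template closest to the glyph."""
    best = min(templates, key=lambda label: hamming_distance(glyph, templates[label]))
    return best, hamming_distance(glyph, templates[best])


# A "T" with one extra ink pixel in the bottom-right corner.
noisy_t = ["111", "010", "011"]
label, distance = classify(noisy_t)
```

The returned distance can double as a rough confidence value: a glyph that is far from every template is a candidate for rejection or for a second-stage classifier.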

6.3. Difficulty

6.3.1. Poor input: noisy, low-resolution, or multi-generation copies of an image

6.3.2. Incorrect image processing

6.3.3. Poor segmentation

6.3.4. Confusable characters (0 and O; l, I, and 1)

6.3.5. Imaging defects (light or heavy print, stray marks, curved baselines)

6.3.6. Punctuation

6.3.7. Mix of text and materials

6.3.8. Typography (font style, shaded background, unusual typefaces, large/small print)

6.3.9. Handwriting recognition (ICR, intelligent character recognition): character segmentation ambiguity; character shape variability

6.3.10. Language mixing: unexpected script; no character identification; fast-paced recognition

6.3.11. Script languages: large character set (>3300 characters); complexity of single characters; similarity between characters; double baseline

6.4. Solutions:

6.4. Solution:

6.4.1. Better preprocessing

6.4.2. Adaptive character classification: recognize the font face

6.4.3. Multi-character recognition

6.4.4. Use of context

6.4.5. Document image enhancement: apply filters

6.4.6. Document segmentation: separating text from other materials; finding text lines; detecting text reading order

6.4.7. Task-specific: zip code recognition and address validation; form-based text recognition (tax forms); automatic accounting procedures (bills); bank checks; passenger transport tickets; signature verification

6.4.8. For script languages: segmentation-free methods; segmentation-based methods; perception-oriented approach

6.4.9. Font: font abstraction; font identification
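
The "use of context" idea in 6.4.4 can be illustrated on the confusable characters from 6.3.4. The sketch below is an assumption-laden toy, not a method taken from the outline: it looks at each whitespace-separated token and, if the token is mostly digits, rewrites O, l, and I as digits; if it is mostly letters, it rewrites 0 as O. The names `TO_DIGIT`, `TO_LETTER`, `fix_token`, and `fix_line` are hypothetical.

```python
# Translation tables for the confusable pairs (assumed mapping).
TO_DIGIT = str.maketrans({"O": "0", "l": "1", "I": "1"})
TO_LETTER = str.maketrans({"0": "O"})


def fix_token(token):
    """Rewrite confusable characters to match the token's majority class."""
    digits = sum(ch.isdigit() for ch in token)
    letters = sum(ch.isalpha() for ch in token)
    if digits > letters:
        return token.translate(TO_DIGIT)
    if letters > digits:
        return token.translate(TO_LETTER)
    # A tie gives no usable context, so the token is left unchanged.
    return token


def fix_line(line):
    """Apply fix_token to every space-separated token in a line."""
    return " ".join(fix_token(token) for token in line.split(" "))


corrected = fix_line("Invoice 2O19 B0OK")
```

A majority vote per token is deliberately crude: it always produces an uppercase O and cannot decide between l and I inside a word. Dictionary lookup or a language model would be the usual next step.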

6.5. Validity

6.5.1. Check accuracy. With ground truth: precision (correct items / items in OCR output); recall (correct items / items in ground truth). Without ground truth: manual check by an expert; manual check for wrong, missing, and extra output; calculate pages; iteration

6.5.2. Check productivity: character classifier; document complexity; image noise; post-processing; interface issues; handwriting
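
The precision and recall definitions in 6.5.1 translate directly into code. The sketch below treats words as the "items" and counts an OCR word as correct when it also occurs in the ground truth, compared as multisets and ignoring position. That matching rule and the function name `precision_recall` are assumptions; the outline does not say how correct items are determined.

```python
from collections import Counter


def precision_recall(ocr_items, truth_items):
    """Return (precision, recall) for two lists of items.

    precision = correct items / items in OCR output
    recall    = correct items / items in ground truth
    """
    ocr_counts = Counter(ocr_items)
    truth_counts = Counter(truth_items)
    # Multiset intersection: each ground-truth item can be matched once.
    correct = sum((ocr_counts & truth_counts).values())
    precision = correct / len(ocr_items) if ocr_items else 0.0
    recall = correct / len(truth_items) if truth_items else 0.0
    return precision, recall


truth = "the quick brown fox".split()
ocr = "the qu1ck brown fox jumps".split()
precision, recall = precision_recall(ocr, truth)
```

Here three of the five OCR words are correct (precision 0.6) and three of the four ground-truth words are recovered (recall 0.75). The same function works at character level if the inputs are lists of characters instead of words.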