1. What is it?
1.1. Optical Character Recognition (OCR) is the process of taking an image of handwritten or typed text and converting it into data that a computer can understand.
1.2. Or, more simply: OCR is a technology that converts different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera, into editable and searchable data.
2. Product example
2.1. ABBYY FineReader
2.1.1. What is it?
2.1.1.1. ABBYY FineReader is OCR software that provides unmatched accuracy and conversion capabilities, virtually eliminating the retyping and reformatting of documents. Intuitive use and one-click automated tasks let you do more in fewer steps. Up to 190 languages are supported for text recognition, more than any other OCR software on the market.
2.1.2. What does it do?
2.1.2.1. FineReader converts scanned paper documents, digital images of text, and image-only PDFs into actionable formats such as Microsoft Word, Excel, or searchable PDF, enabling you to quote or entirely reuse text and table content without retyping.
2.1.2.1.1. Step 1
2.1.2.1.2. Step 2
2.1.2.1.3. Step 3
2.1.2.1.4. Step 4
2.1.2.1.5. Step 5
2.1.2.1.6. Step 6
3. What does it use?
3.1. Data is generally passed to the computer by a scanner or another device, such as a digital camera.
3.2. Some computer fax applications use OCR to transform incoming faxes from graphics files into word-processing documents.
4. How does it work?
4.1. Let's suppose life was really simple and there was only one letter in the alphabet: A. Even then, you can probably see that OCR would be quite a tricky problem, because every single person writes the letter A in a slightly different way.
4.1.1. Even with printed text there's an issue, because books and other documents are printed in many different typefaces (fonts), and a letter A can be printed in many different forms.
4.1.1.1. Broadly speaking, there are two different ways to solve this problem. Either by
4.1.1.1.1. Recognizing characters in their entirety (Pattern Recognition)
4.1.1.1.2. Or by detecting the individual lines and strokes that characters are made from (Feature Detection) and identifying them that way
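
The two approaches above can be contrasted with a small sketch. Everything in it is an illustrative assumption rather than how any real OCR product works: the 5x5 text bitmaps, the two-letter template set, and the helper names (pixel_match_score, recognize_by_pattern, looks_like_a) are all hypothetical. Pattern recognition is approximated by comparing a glyph pixel by pixel against stored templates; feature detection is approximated by checking for the strokes that make up an A (strokes joined at the top, a crossbar, two separate legs).

```python
from typing import Dict, List

# A glyph is a list of rows; '#' is ink and '.' is background (toy assumption).
Glyph = List[str]

# Hypothetical stored templates, one per known character.
TEMPLATES: Dict[str, Glyph] = {
    "A": ["..#..", ".#.#.", "#####", "#...#", "#...#"],
    "L": ["#....", "#....", "#....", "#....", "#####"],
}

# The same letter A drawn in a different "font" with a flat top.
ROUNDED_A: Glyph = [".###.", "#...#", "#####", "#...#", "#...#"]


def pixel_match_score(glyph: Glyph, template: Glyph) -> float:
    """Fraction of pixels that agree between a glyph and a template."""
    total = 0
    same = 0
    for glyph_row, template_row in zip(glyph, template):
        for g, t in zip(glyph_row, template_row):
            total += 1
            if g == t:
                same += 1
    return same / total


def recognize_by_pattern(glyph: Glyph) -> str:
    """Pattern recognition: pick the template that overlaps the glyph best."""
    return max(TEMPLATES, key=lambda name: pixel_match_score(glyph, TEMPLATES[name]))


def count_runs(row: str) -> int:
    """Count separate horizontal runs of ink in one row."""
    runs = 0
    previous = "."
    for pixel in row:
        if pixel == "#" and previous != "#":
            runs += 1
        previous = pixel
    return runs


def looks_like_a(glyph: Glyph) -> bool:
    """Feature detection: test for the strokes of an A, not its exact pixels."""
    joined_top = count_runs(glyph[0]) == 1    # the two sides meet at the top
    two_legs = count_runs(glyph[-1]) == 2     # two separate legs at the bottom
    crossbar = any(                           # one solid bar somewhere between
        count_runs(row) == 1 and row.count("#") >= 3 for row in glyph[1:-1]
    )
    return joined_top and two_legs and crossbar


if __name__ == "__main__":
    print(recognize_by_pattern(ROUNDED_A))
    print(looks_like_a(ROUNDED_A), looks_like_a(TEMPLATES["L"]))
```

In this toy setup the pixel comparison still picks A for the second font, but only with a partial overlap score, whereas the feature check depends only on the strokes being present. That is the intuition for why feature detection tends to cope better with unfamiliar typefaces and handwriting.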