What is OCR?
OCR (Optical Character Recognition) is the use of technology to recognize printed or handwritten text characters in digital images of physical documents, such as scanned paper documents.
The image is first converted to a two-tone (i.e. black-and-white) image, and its bright and dark areas are then analyzed:
Dark area = character to be recognized
Bright area = background
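The dark/bright split described above can be sketched as a simple threshold on grayscale pixel values. This is a minimal illustration only: the function name `binarize`, the fixed threshold of 128, and the sample patch are assumptions made for the example, and real OCR systems often choose the threshold adaptively.

```python
# Threshold a grayscale image (0 = black, 255 = white) into dark and bright areas.
def binarize(gray, threshold=128):
    """Return a grid where 1 marks a dark (character) pixel and 0 marks background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

# Hypothetical 3x5 grayscale patch containing one dark vertical stroke.
patch = [
    [250, 240, 30, 245, 255],
    [248, 235, 25, 250, 252],
    [251, 238, 20, 247, 254],
]

binary = binarize(patch)
for row in binary:
    # Show dark pixels as '#' and background as '.'
    print("".join("#" if v else "." for v in row))
```

Everything marked 1 is treated as part of a character; everything marked 0 is background.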
OCR works through pattern recognition and feature detection.
Pattern recognition
- Tries to match each input character against character matrices stored in a database
- Relies on the input characters being in a consistent font, such as OCR-A or OCR-B
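As a rough sketch of the matrix-matching idea, the example below compares a binarized glyph against a few stored bitmaps and picks the one with the most agreeing pixels. The 5x3 templates, the names `TEMPLATES`, `match_score`, and `recognize`, and the scoring rule are all assumptions for illustration, not how any particular OCR engine stores or compares characters.

```python
# Hypothetical 5x3 bitmaps standing in for a stored character matrix database.
TEMPLATES = {
    "I": ["111", "010", "010", "010", "111"],
    "L": ["100", "100", "100", "100", "111"],
    "T": ["111", "010", "010", "010", "010"],
}

def match_score(glyph, template):
    """Fraction of pixels that agree between two same-sized bitmaps."""
    total = sum(len(row) for row in template)
    same = sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )
    return same / total

def recognize(glyph):
    """Return the template character with the highest pixel agreement."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))

# A "T" with one noisy extra pixel in the fourth row.
noisy_t = ["111", "010", "010", "110", "010"]
print(recognize(noisy_t))
```

Because the comparison is pixel by pixel, this approach only holds up when the input font closely matches the stored templates, which is why consistent fonts matter.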
Feature detection
- More sophisticated identification methods
- Break down characters into "features" such as lines, loops, and intersections
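One feature that is easy to compute is the number of loops in a character, i.e. background regions fully enclosed by ink. The sketch below counts them with a flood fill; the function name `count_loops` and the tiny sample glyphs are assumptions for the example, and a practical system would combine several features (lines, intersections, and so on) rather than rely on loops alone.

```python
def count_loops(glyph):
    """Count enclosed background regions (loops) in a binary glyph.

    glyph is a list of equal-length strings: '1' is ink, '0' is background.
    """
    h, w = len(glyph), len(glyph[0])
    # Pad with a background border so all outer background forms one region.
    grid = [[0] * (w + 2)]
    grid += [[0] + [int(c) for c in row] + [0] for row in glyph]
    grid += [[0] * (w + 2)]
    seen = [[False] * (w + 2) for _ in range(h + 2)]

    def flood(r, c):
        # Mark every background pixel 4-connected to (r, c).
        stack = [(r, c)]
        seen[r][c] = True
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if 0 <= ny < h + 2 and 0 <= nx < w + 2:
                    if not seen[ny][nx] and grid[ny][nx] == 0:
                        seen[ny][nx] = True
                        stack.append((ny, nx))

    flood(0, 0)  # the outer background is not a loop
    loops = 0
    for r in range(h + 2):
        for c in range(w + 2):
            if grid[r][c] == 0 and not seen[r][c]:
                loops += 1  # found a background region the outside cannot reach
                flood(r, c)
    return loops

# Hypothetical 5x3 glyphs.
letter_o = ["111", "101", "101", "101", "111"]
digit_8 = ["111", "101", "111", "101", "111"]
letter_l = ["100", "100", "100", "100", "111"]
print(count_loops(letter_o), count_loops(digit_8), count_loops(letter_l))
```

Unlike pixel-by-pixel matching, a feature such as the loop count stays the same across many fonts and sizes, which is what makes feature detection more robust.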
OCR Applications
- Passport scanning, ID detection
- Document scanning
- Automated data entry, extraction and processing
- Receipt and Invoice Scanning
- Electronic Medical Records
- Forms and Surveys
- Banking: Electronic document processing (checks, etc.)
- ANPR / Traffic: Read vehicle license plates and VINs
- Logistics: Sort the mail
Introduction to Artificial Intelligence (AI) and Machine Learning
Google Cloud Vision
Cloud Vision allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.
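As a rough illustration of the OCR feature, the sketch below builds the JSON body for a text detection request to the Cloud Vision REST endpoint `images:annotate`. The helper name `build_ocr_request` and the placeholder image bytes are assumptions for the example. Sending the request also requires credentials (an API key or OAuth token), which are not shown here.

```python
import base64
import json

# REST endpoint for Cloud Vision image annotation.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"

def build_ocr_request(image_bytes):
    """Build the JSON body for a TEXT_DETECTION request on a single image."""
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Ask only for OCR; other feature types cover labels, faces, etc.
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }

# Placeholder bytes; a real call would read the contents of an image file.
body = build_ocr_request(b"fake-image-bytes")
print(json.dumps(body, indent=2))
```

The body would be sent with an HTTP POST to `ANNOTATE_URL`; the recognized text is expected in the `textAnnotations` field of the response.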