Introduction to OCR and Google Cloud Vision ~ Tech Share

Wednesday, March 16, 2022

What is OCR

OCR (Optical Character Recognition) is the use of technology to recognize printed or handwritten text characters in digital images of physical documents, such as scanned paper documents.

The image is first converted to a two-tone (i.e. black-and-white) image, and its bright and dark areas are then analyzed:

Dark area = character to be recognized

Bright area = background
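
As a rough illustration of the dark/bright split described above, here is a minimal sketch of fixed-threshold binarization. The function name `binarize`, the threshold of 128, and the tiny sample image are assumptions made for this example; real OCR engines typically choose the threshold adaptively.

```python
def binarize(gray, threshold=128):
    """Map each grayscale pixel (0-255) to 1 (dark, candidate character)
    or 0 (bright, background)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


# Hypothetical 3x4 grayscale image: low values are dark ink, high values are paper.
gray = [
    [250, 30, 240, 245],
    [248, 25, 35, 242],
    [251, 28, 239, 247],
]

binary = binarize(gray)
for row in binary:
    # Print dark pixels as '#' and bright pixels as '.'
    print("".join("#" if v else "." for v in row))
```

Everything marked `#` would be passed on to the recognition step, and everything marked `.` would be treated as background.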

OCR works by pattern recognition and feature detection.

Pattern recognition

Tries to match each input character against character matrices (templates) stored in a database
Relies on the input characters being printed in a consistent font, such as OCR-A or OCR-B
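
The matching idea can be sketched as follows, assuming each character has already been cut out and binarized into a small grid. The `TEMPLATES` table, the 3x3 grid size, and the pixel-agreement score are all hypothetical choices for this example, not how any particular OCR engine stores or compares its character matrices.

```python
# Hypothetical 3x3 templates: '1' is a dark pixel, '0' is background.
TEMPLATES = {
    "I": ["010", "010", "010"],
    "L": ["100", "100", "111"],
    "T": ["111", "010", "010"],
}


def score(glyph, template):
    """Count the pixels on which the glyph and the template agree."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the name of the template that agrees with the glyph on most pixels."""
    return max(TEMPLATES, key=lambda name: score(glyph, TEMPLATES[name]))


# An 'L' with one pixel missing still matches 'L' best.
noisy_l = ["100", "100", "110"]
print(recognize(noisy_l))
```

Because the score is a pixel-by-pixel comparison, a character in a different font or size would agree poorly with every template, which is why this approach depends on consistent fonts.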

Feature detection

A more sophisticated identification method
Breaks characters down into "features" such as lines, loops, and intersections
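
To make one such feature concrete, here is a small sketch that counts the loops (enclosed holes) in a binarized glyph. The approach used here, flood-filling the background from outside the glyph and then counting the background regions that were not reached, is an assumption chosen for illustration; the function name `count_loops` and the sample glyphs are hypothetical.

```python
def count_loops(glyph):
    """Count enclosed background regions in a glyph given as equal-length
    strings, where '#' is ink and any other character is background."""
    # Pad with a background border so all outside background is connected.
    h, w = len(glyph) + 2, len(glyph[0]) + 2
    grid = [[0] * w]
    grid += [[0] + [int(c == "#") for c in row] + [0] for row in glyph]
    grid += [[0] * w]
    seen = [[False] * w for _ in range(h)]

    def fill(r, c):
        # Iterative 4-connected flood fill over background pixels.
        stack = [(r, c)]
        seen[r][c] = True
        while stack:
            y, x = stack.pop()
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if 0 <= ny < h and 0 <= nx < w and not seen[ny][nx] and grid[ny][nx] == 0:
                    seen[ny][nx] = True
                    stack.append((ny, nx))

    fill(0, 0)  # mark the background outside the glyph
    loops = 0
    for r in range(h):
        for c in range(w):
            if grid[r][c] == 0 and not seen[r][c]:
                loops += 1  # unreached background region: a hole
                fill(r, c)
    return loops


LETTER_O = ["###", "#.#", "###"]
DIGIT_8 = ["###", "#.#", "###", "#.#", "###"]
LETTER_L = ["#..", "#..", "###"]
print(count_loops(LETTER_O), count_loops(DIGIT_8), count_loops(LETTER_L))
```

A loop count is largely independent of font and size, which is the appeal of feature-based recognition; a fuller classifier would combine it with other features such as line and intersection counts.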

OCR Applications

Passport scanning, ID detection
Document scanning
Automated data entry, extraction and processing
Receipt and Invoice Scanning
Electronic Medical Records
Forms and Surveys
Banking: Electronic document processing (checks, etc.)
ANPR / Traffic: Reading vehicle license plates and VINs
Logistics: Mail sorting

Google Cloud Vision

Cloud Vision allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.
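
As a hedged sketch of what an OCR call looks like, the snippet below builds the JSON body for the Cloud Vision REST method `images:annotate` with the `TEXT_DETECTION` feature. It only constructs the request and does not send it; sending requires a Google Cloud project with the Vision API enabled and valid credentials, which are not shown. The helper name `build_ocr_request` and the placeholder image bytes are assumptions for this example.

```python
import base64
import json

# REST endpoint for synchronous image annotation.
ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes):
    """Build the JSON body for a TEXT_DETECTION (OCR) request.

    The image is sent inline as base64-encoded content.
    """
    return {
        "requests": [
            {
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


# Placeholder bytes; in practice this would be the contents of an image file.
body = build_ocr_request(b"fake image bytes")
print(json.dumps(body, indent=2))
```

The body would be sent as an authenticated POST to `ANNOTATE_URL`. Other features mentioned above, such as label or landmark detection, are requested by changing the feature `type` (for example `LABEL_DETECTION` or `LANDMARK_DETECTION`).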