Simplest Digits OCR Implementation

azam sayeed
2 min read · May 26, 2020


Definition from Wikipedia

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast).[1]

Widely used as a form of data entry from printed paper data records — whether passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static-data, or any suitable documentation — it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

A Simple Methodology for Performing OCR in Python

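1. Available dataset: a digits image dataset covering 0–9 ships with scikit-learn and can be loaded from sklearn.datasets. It contains 1,797 grayscale images of 8×8 pixels each.
from sklearn.datasets import load_digits
# digits.images holds the 8x8 pixel arrays, digits.target the matching labels (0-9)
digits = load_digits()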

2. Flatten each image into a 1D NumPy array (for example with hstack) and collect the results in a pandas DataFrame for further processing.

3. Reduce the dimensionality with PCA from sklearn.decomposition, keeping enough components to represent 90% of the variance.

4. Train the model on the training split of the labelled data. The simplest model we have considered is logistic regression.

5. Evaluate the model's performance and accuracy using evaluation metrics.
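Steps 2 to 5 can be sketched roughly as follows. This is a minimal illustration and not necessarily what the notebook does: the 25% test split, the random_state, the max_iter value, the explicit svd_solver="full" setting, and the choice to split the data before fitting PCA are all assumptions made here for the example.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

digits = load_digits()

# Step 2: np.hstack joins the 8 rows of each image into one 64-value array;
# the flattened images then become the rows of a DataFrame.
rows = [np.hstack(image) for image in digits.images]
df = pd.DataFrame(rows)
labels = digits.target

# Assumption: hold out 25% of the images for testing, and split before PCA
# so the test images do not influence the learned projection.
X_train, X_test, y_train, y_test = train_test_split(
    df, labels, test_size=0.25, random_state=42, stratify=labels
)

# Step 3: a float n_components asks PCA for the smallest number of
# components whose cumulative explained variance exceeds that fraction.
pca = PCA(n_components=0.90, svd_solver="full")
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

# Step 4: logistic regression on the reduced features.
# Assumption: max_iter is raised so the solver has room to converge.
model = LogisticRegression(max_iter=5000)
model.fit(X_train_pca, y_train)

# Step 5: score the model on the held-out images.
y_pred = model.predict(X_test_pca)
accuracy = accuracy_score(y_test, y_pred)
print(f"PCA components kept: {pca.n_components_}")
print(f"Test accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred))
```

The classification report lists precision and recall per digit, which helps show which digits the model confuses with one another, something a single accuracy figure hides.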

The end-to-end implementation is given in the notebook below:
