Simplest Digits OCR Implementation

azam sayeed
2 min read · May 26, 2020

Definition from Wikipedia

Optical character recognition or optical character reader (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene-photo (for example the text on signs and billboards in a landscape photo) or from subtitle text superimposed on an image (for example from a television broadcast).[1]

Widely used as a form of data entry from printed paper data records — whether passport documents, invoices, bank statements, computerized receipts, business cards, mail, printouts of static-data, or any suitable documentation — it is a common method of digitizing printed texts so that they can be electronically edited, searched, stored more compactly, displayed on-line, and used in machine processes such as cognitive computing, machine translation, (extracted) text-to-speech, key data and text mining. OCR is a field of research in pattern recognition, artificial intelligence and computer vision.

A Simple Methodology for Digit OCR in Python

1. Available dataset: a dataset of digit images (0–9) is readily available in sklearn.datasets:
from sklearn.datasets import load_digits
digits = load_digits()

2. Flatten each image into a 1D NumPy array (for example with hstack) and load the result into a Pandas DataFrame for further processing.

3. Dimensionality reduction: use PCA from sklearn.decomposition, keeping just enough components to represent 90% of the variance.

4. Train the model on the training split of the labelled data. The simplest model considered here is Logistic Regression.

5. Evaluate the model's performance on the test split using accuracy and related metrics.
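The five steps can be sketched in a few lines. This is a minimal illustration rather than the notebook's exact code: the 80/20 split, `random_state=42`, `max_iter=5000` and the variable names are all assumptions. It also splits the data before fitting PCA, so the components are learned from the training images only. That is a design choice to avoid leaking test data into the projection, not something the steps above prescribe.

```python
import pandas as pd
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

# Step 1: load the 8x8 digit images and their labels.
digits = load_digits()

# Step 2: flatten each 8x8 image into a 64-element row, then wrap in a DataFrame.
flat_images = digits.images.reshape(len(digits.images), -1)
df = pd.DataFrame(flat_images)
labels = digits.target

# Assumed split: 80% train / 20% test, with a fixed seed for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    df, labels, test_size=0.2, random_state=42, stratify=labels
)

# Step 3: a float n_components asks PCA for the smallest number of
# components whose cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.90, svd_solver="full")
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)

# Step 4: fit a plain multiclass logistic regression on the reduced features.
model = LogisticRegression(max_iter=5000)
model.fit(X_train_pca, y_train)

# Step 5: score the model on the held-out test split.
y_pred = model.predict(X_test_pca)
accuracy = accuracy_score(y_test, y_pred)
print(f"Components kept: {pca.n_components_}")
print(f"Test accuracy: {accuracy:.3f}")
print(classification_report(y_test, y_pred))
```

The printed classification report gives per-digit precision and recall, which helps show which digits the model confuses, something a single accuracy figure hides.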

The end-to-end implementation is given in the notebook below:
