Machine learning based characters recognition

Bibliographic Details
Main Author: Song, Tianyi
Other Authors: Huang Guangbin
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/75485
Institution: Nanyang Technological University
Description
Summary: This final year project studies and implements machine learning techniques for character recognition. The author was tasked with developing a mobile business card scanner app based on these techniques. The author chose to research Tesseract, an open-source optical character recognition (OCR) engine sponsored by Google, and embedded the Tess-two library locally in the scanner. The scanner was developed for Android. It scans the characters on a business card, identifies each piece of information and saves it into the corresponding field of a new contact entry. Its functions include photo cropping and saving, character recognition, information extraction and contact adding. The report covers the app design, app structure, key code and testing results. Since OCR is the key technology of the application, its principles and development are also discussed, both for basic understanding and to guide future improvement of the scanner.