Deep learning for optical character recognition in online images
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/167016 |
Institution: | Nanyang Technological University |
Summary: | Product images on e-commerce listing sites contain a wealth of information about
the products. Encoding the text in these images into machine-readable form through
Optical Character Recognition (OCR) is crucial for machines to develop a better
understanding of the products. This allows data mining to be performed, laying the
foundation for advanced features that add value for users of the e-commerce
platform.
In this study, a custom end-to-end OCR system optimized for speed, recall, and
precision on online e-commerce images (online images) is proposed.
The pipeline, consisting of a Mask R-CNN based text detection model and an ABINet
text recognition model, achieves an accuracy of 67.3%. This represents a
96% relative increase in performance over the benchmark OCR pipeline. The pipeline also
achieves performance competitive with state-of-the-art (SOTA) algorithms in the ICDAR 2015
Born Digital competition after accounting for differences in challenge level. A web platform
was also developed to allow online text detection and recognition and to visualize
the performance of the pipeline. |
---|