A new method for detecting altered text in document images

As more and more office documents are captured, stored, and shared in digital format, and as image editing software becomes increasingly powerful, there is growing concern about document authenticity. To prevent illicit activities, this paper presents a new method for detecting altered t...

Bibliographic Details
Main Authors: Nandanwar, Lokesh, Shivakumara, Palaiahnakote, Pal, Umapada, Lu, Tong, Lopresti, Daniel, Seraogi, Bhagesh, Chaudhuri, Bidyut B.
Format: Article
Published: World Scientific Publ Co Pte Ltd 2021
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/28575/
Institution: Universiti Malaya
id my.um.eprints.28575
record_format eprints
spelling my.um.eprints.28575 2022-08-16T04:20:29Z http://eprints.um.edu.my/28575/ World Scientific Publ Co Pte Ltd 2021-09-30 Article PeerReviewed Nandanwar, Lokesh and Shivakumara, Palaiahnakote and Pal, Umapada and Lu, Tong and Lopresti, Daniel and Seraogi, Bhagesh and Chaudhuri, Bidyut B. (2021) A new method for detecting altered text in document images. International Journal of Pattern Recognition and Artificial Intelligence, 35 (12). ISSN 0218-0014, DOI https://doi.org/10.1142/S0218001421600107
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science
description As more and more office documents are captured, stored, and shared in digital format, and as image editing software becomes increasingly powerful, there is growing concern about document authenticity. To prevent illicit activities, this paper presents a new method for detecting altered text in document images. The proposed method explores the relationship between the positive and negative coefficients of the DCT to extract the distortions caused by tampering: the images reconstructed from the positive and from the negative coefficients are fused, which results in Positive-Negative DCT coefficients Fusion (PNDF). To take advantage of spatial information, the R, G, and B color channels of the input image are also fused, which results in RGB Fusion (RGBF). Next, the same fusion operation is used to fuse PNDF and RGBF, which yields a single fused image for the original input. A histogram computed from the fused image serves as the feature vector, which is then fed to a deep neural network that classifies altered text images. The proposed method is tested on the authors' own dataset and on standard datasets from the ICPR 2018 Fraud Contest, Altered Handwriting (AH), and faked IMEI number images. The results show that the proposed method is effective and outperforms existing methods irrespective of image type.
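The pipeline in the description can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the fusion operation, so `fuse` below is a hypothetical stand-in (element-wise larger magnitude); computing PNDF on the mean of the three channels, the 16-bin histogram, and all function names are likewise assumptions. The deep neural network classifier is omitted.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def dct2(x):
    """2D DCT: C_rows * X * C_cols^T."""
    return matmul(matmul(dct_matrix(len(x)), x), transpose(dct_matrix(len(x[0]))))

def idct2(y):
    """Inverse 2D DCT: C_rows^T * Y * C_cols (valid because C is orthonormal)."""
    return matmul(matmul(transpose(dct_matrix(len(y))), y), dct_matrix(len(y[0])))

def fuse(a, b):
    """ASSUMPTION: placeholder fusion, keeps the larger magnitude per pixel."""
    return [[max(abs(p), abs(q)) for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def pndf(gray):
    """Fuse the images reconstructed from positive-only and negative-only coefficients."""
    coeffs = dct2(gray)
    pos = [[c if c > 0 else 0.0 for c in row] for row in coeffs]
    neg = [[c if c < 0 else 0.0 for c in row] for row in coeffs]
    return fuse(idct2(pos), idct2(neg))

def rgbf(r, g, b):
    """Fuse the three color channels with the same fusion operation."""
    return fuse(fuse(r, g), b)

def histogram_features(img, bins=16):
    """Normalized histogram of pixel values over [min, max] of the image."""
    values = [v for row in img for v in row]
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width for a constant image
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]

def altered_text_features(r, g, b, bins=16):
    """Feature vector for one image given as three 2D lists (R, G, B)."""
    # ASSUMPTION: PNDF is computed on the channel mean; the abstract does not say.
    gray = [[(p + q + s) / 3.0 for p, q, s in zip(rr, rg, rb)]
            for rr, rg, rb in zip(r, g, b)]
    fused = fuse(pndf(gray), rgbf(r, g, b))
    return histogram_features(fused, bins)
```

A non-linear stand-in is used for `fuse` on purpose: because the DCT is linear, the positive-only and negative-only reconstructions sum to the original image, so a plain average would only return a scaled copy of the input and carry no extra information. The resulting feature vector would then be passed to a classifier of the reader's choice.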
format Article
author Nandanwar, Lokesh
Shivakumara, Palaiahnakote
Pal, Umapada
Lu, Tong
Lopresti, Daniel
Seraogi, Bhagesh
Chaudhuri, Bidyut B.
title A new method for detecting altered text in document images
publisher World Scientific Publ Co Pte Ltd
publishDate 2021
url http://eprints.um.edu.my/28575/
_version_ 1744649127538458624