Machine learning models for patent examination

Bibliographic Details
Main Author: Wang, Yanqing
Other Authors: Lihui CHEN
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/141313
Institution: Nanyang Technological University
Description
Summary: Text classification remains one of the most active research topics in Natural Language Processing (NLP). Due to the complexity of patent classification, patent examination still cannot be separated from human examiners, which makes the process inefficient and time-consuming. This is an urgent problem that needs to be addressed. Since Google reported BERT's outstanding performance on 11 NLP tasks at the end of October 2018, BERT (Bidirectional Encoder Representations from Transformers) has attracted great attention in the NLP field. This project uses the BERT model to build a model for patent examination tasks. It reviews and implements several patent examination tasks formulated as text classification, and studies their performance on large datasets. Starting from the classification results on patent datasets, a "summary" mechanism is added to improve the accuracy of patent classification. The work consists of two main tasks: (a) building a text classification model based on BERT, then training and optimizing it; (b) adding a "summary" mechanism to the model to improve classification accuracy and verifying the results.
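
As an illustration of task (a), the sketch below fine-tunes BERT for patent text classification. It assumes the Hugging Face transformers library and PyTorch, which the record does not specify; the IPC-style label set and the training examples are hypothetical placeholders.

```python
# Minimal sketch: fine-tuning BERT for patent text classification.
# Assumptions: Hugging Face `transformers` + PyTorch; the label set and
# training examples below are hypothetical placeholders, not from the thesis.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical IPC-style section labels.
LABELS = ["A", "B", "C", "D", "E", "F", "G", "H"]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS)
)

def encode(texts, labels):
    # Tokenize patent texts (e.g. an abstract, optionally concatenated with
    # a summary) and attach integer class labels.
    enc = tokenizer(texts, truncation=True, padding=True,
                    max_length=512, return_tensors="pt")
    enc["labels"] = torch.tensor(labels)
    return enc

# Placeholder training data: (patent text, label index) pairs.
texts = ["A method for encoding video frames ...",
         "A pharmaceutical composition comprising ..."]
labels = [7, 0]  # e.g. sections "H" and "A"

batch = encode(texts, labels)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few illustrative training steps
    optimizer.zero_grad()
    out = model(**batch)  # returns cross-entropy loss when labels are given
    out.loss.backward()
    optimizer.step()
```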