Context based patent classification and search (part B)
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/139957 |
Institution: | Nanyang Technological University |
Summary: | This report summarizes the research, methodologies and experimental implementations for context-based patent classification and search. Traditional patent search and prior art retrieval is a tedious process in which a patent examiner manually searches for relevant prior art for each submitted patent application. Although machine learning has been integrated with patent analysis to ease this process, research on patent information retrieval remains limited and previously attained results have proven unsatisfactory. This project explores, integrates and evaluates the suitability of various state-of-the-art Natural Language Processing and deep learning techniques, such as the Siamese neural network, “Global Vectors” (GloVe) word embeddings, the “Bidirectional Long Short-Term Memory” (BiLSTM) model and the “A Lite Bidirectional Encoder Representations from Transformers” (ALBERT) model, for implementing a prior art retrieval system. A publicly available patent dataset of 499 patent applications and 2410 patent citations was used for the experiments. Three experimental models, “Siamese BiLSTM+GloVe”, “Siamese ALBERT without fully connected layer” and “Siamese ALBERT with fully connected layer”, were implemented, using Euclidean Similarity and Cosine Similarity to score each patent citation against a patent application and rank the citations by that score. The suitability of the models for prior art retrieval is also discussed. The experimental results demonstrate that the two Siamese ALBERT models outperformed the Siamese BiLSTM model in prior art retrieval tasks by at least a factor of two, and further results indicate that adding the fully connected layer to Siamese ALBERT substantially improved the quality of prior art retrieval. |
---|
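
The ranking step described in the Summary can be illustrated with a minimal sketch. It assumes each patent application and citation has already been encoded into a fixed-length vector by one of the Siamese encoders; the toy vectors, the citation identifiers, and the function names below are hypothetical and not taken from the report. The record does not define "Euclidean Similarity", so the form 1 / (1 + distance) used here is an assumption.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def euclidean_similarity(u, v):
    """Assumed definition: map Euclidean distance d into (0, 1] as 1 / (1 + d)."""
    return 1.0 / (1.0 + math.dist(u, v))


def rank_candidates(query_vec, candidates, similarity=cosine_similarity):
    """Return (citation_id, score) pairs sorted from most to least similar."""
    scored = [(cid, similarity(query_vec, vec)) for cid, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 2-dimensional embeddings, for illustration only.
application = [1.0, 0.0]
citations = {
    "cit_a": [0.9, 0.1],
    "cit_b": [0.0, 1.0],
    "cit_c": [-1.0, 0.0],
}

by_cosine = rank_candidates(application, citations)
by_euclidean = rank_candidates(application, citations, euclidean_similarity)
```

With these toy vectors both measures place `cit_a` first. On real embeddings the two rankings can differ, because cosine similarity ignores vector magnitude while Euclidean distance does not.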