DeepCite: tool for systematic annotation of scientific literature enabling machine learning-based aggregation of research results
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/156647 |
Institution: | Nanyang Technological University |
Summary: | A research paper gathers and interprets information in writing and shares its results and findings so that others can learn from them. Although these are good intentions, the volume and rate at which research papers are published have increased exponentially over the last decade. Keeping up with this growing body of information is becoming overwhelming.
This project implements a machine learning text classification model together with an easy-to-use tool for the systematic annotation of scientific literature. The model classifies each sentence of a research paper as important or not important. The tool facilitates collection of the training data the model requires. The goal of the project is a model that helps researchers identify the important sentences in a paper, saving the time and effort otherwise needed to read through long research reports.
The tool is developed as a Chrome extension that exports the user's annotations in a table format. The collected data is then processed and passed to the model for training. The model is transformer-based: a deep learning architecture that relies heavily on self-attention to compute a representation of the input sequence. The results demonstrate that it is possible to classify important sentences in research papers. |
---|
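The summary mentions self-attention without showing how it computes a representation of the input sequence. The following is a minimal sketch of single-head scaled dot-product self-attention in plain Python, not the project's actual model. The function names, the toy token embeddings, and the identity projection matrices are all assumptions made for illustration; a real transformer learns its projections and stacks many such layers.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative only).

    x: one embedding per token, shape (n_tokens, d_model)
    w_q, w_k, w_v: projection matrices, shape (d_model, d_k)
    Returns the attended representations and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    # Score every token against every other token, scaled by sqrt(d_k).
    k_transposed = [list(col) for col in zip(*k)]
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_transposed)]
    # Each row of weights says how much one token attends to all tokens.
    weights = [softmax(row) for row in scores]
    # The new representation of a token is a weighted mix of the value vectors.
    return matmul(weights, v), weights


# Hypothetical toy input: three tokens with two-dimensional embeddings.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # stand-in for learned projections
output, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output vector is a weighted average of the value vectors. In a sentence classifier, representations like these would typically be pooled and passed to a classification layer that predicts "important" or "not important".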