Automation of scraping for conflict of interest in webpages
As technology advances, researchers compete to publish their papers to gain recognition and to demonstrate technological progress. However, because calls for papers are run by people, some submissions may come from authors who have had recent contact or relations with the organiser of the...
Main Author: | Lee, Ming Jia |
---|---|
Other Authors: | Sourav S Bhowmick |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Web scraping |
Online Access: | https://hdl.handle.net/10356/181181 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-181181 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-181181; 2024-11-18T01:16:39Z; Automation of scraping for conflict of interest in webpages; Lee, Ming Jia; Sourav S Bhowmick; College of Computing and Data Science; ASSourav@ntu.edu.sg; Computer and Information Science; Web scraping; Bachelor's degree; 2024-11-18T01:16:39Z; 2024-11-18T01:16:39Z; 2024; Final Year Project (FYP); Lee, M. J. (2024). Automation of scraping for conflict of interest in webpages. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181181; https://hdl.handle.net/10356/181181; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; Web scraping |
description |
As technology advances, researchers compete to publish their papers to gain recognition and to demonstrate technological progress. However, because calls for papers are run by people, some submissions may come from authors who have had recent contact or relations with the organisers of the events. This can make the competition unfair and create a conflict of interest between organisers and candidates. Hence, when a paper is submitted, the submission site requests information about conflicts of interest between the paper's authors and program committee (PC) members.
This project presents the development of a Python-based automation application that extracts information relating to conflicts of interest for an event. The goal is to automate the process so that it can run on multiple webpages at the same time, so that the user does not have to type in each webpage to be scraped individually.
Future work on the application will focus on optimising the code so that it does not extract excessive information, as well as improving its ability to scrape other information stored in the webpage. |
author2 |
Sourav S Bhowmick |
author_facet |
Sourav S Bhowmick; Lee, Ming Jia |
format |
Final Year Project |
author |
Lee, Ming Jia |
title |
Automation of scraping for conflict of interest in webpages |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/181181 |
_version_ |
1816859058304450560 |