Develop web crawler
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | 2012 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/48596 |
Institution: | Nanyang Technological University |
Summary: | The Blackboard system used by http://www.edventure.sg has a Campus Pack Wiki Tool that allows users to create and update Wikis to facilitate learning. Two such Wikis have been created for the course SC207/CPE207 Software Engineering, the Seminar Wiki and the Conspectus Wiki.
The Conspectus Wiki allows students to share their summaries of the subject, while the Seminar Wiki allows students to share their answers to, and opinions on, questions set by their lecturer. These Wikis are therefore an excellent channel for sharing knowledge.
The lecturer awards marks to students based on the number of comments they have made. However, there are so many comments on the Wikis that counting them manually for each student is tedious and time-consuming, so the counting process needs to be automated. The goal of this project is to develop an application to assist the lecturer in counting student names.
Before counting can start, data must be extracted from a Wiki, and student names must be filtered out and properly identified. The application cannot simply match names against the registered student names, because people do not always write their names the same way. For example, “John Tan” could also write his name as “Tan John”.
The key to developing this application is String manipulation. A String is a sequence of characters, and by breaking Strings down and comparing them, specific Strings can be identified. In this case, I want first to identify student names and filter them out of the raw data, and second to work out which registered student each name belongs to. |
---|
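
The name-matching idea in the summary can be illustrated with a minimal sketch. It assumes the comment author names have already been extracted from the Wiki, and it treats two names as the same student when they contain the same words in any order (so "Tan John" matches "John Tan"). The record does not state the project's implementation language or its actual matching rules, so the Python code and the helper names `name_key` and `count_comments` are hypothetical and for illustration only.

```python
from __future__ import annotations

from collections import Counter


def name_key(name: str) -> tuple:
    """Build an order-insensitive key for a name (hypothetical helper).

    Non-letter characters become spaces, the text is lowercased,
    and the resulting words are sorted, so "Tan, John" and
    "John Tan" produce the same key.
    """
    cleaned = "".join(ch if ch.isalpha() else " " for ch in name)
    return tuple(sorted(cleaned.lower().split()))


def count_comments(registered: list[str], authors: list[str]) -> Counter:
    """Count comments per registered student (hypothetical helper).

    `registered` holds the official student names; `authors` holds one
    entry per comment, as written by the commenter. Authors that match
    no registered student are ignored.
    """
    by_key = {name_key(n): n for n in registered}
    counts: Counter = Counter()
    for author in authors:
        student = by_key.get(name_key(author))
        if student is not None:
            counts[student] += 1
    return counts


if __name__ == "__main__":
    students = ["John Tan", "Mary Lim"]
    comment_authors = ["Tan John", "john tan", "Mary Lim", "Someone Else"]
    print(count_comments(students, comment_authors))
```

Sorting the words makes the comparison independent of word order, which covers the "John Tan" / "Tan John" case. It would not handle abbreviations, nicknames, or misspellings, which would need fuzzier comparison.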