Cognitive systems for data analytics
The cities of Paris, London, Chicago and New York (among others) have used large-scale bikeshare systems to facilitate the use of bicycles for urban commuting. This Final Year Project (FYP) focuses on the city of Chicago and estimates the relationship between bikeshare usage and other various transi...
Main Author: | Wu, Evan |
---|---|
Other Authors: | Tan Ah Hwee |
Format: | Final Year Project |
Language: | English |
Published: | 2018 |
Subjects: | DRNTU::Engineering::Computer science and engineering |
Online Access: | http://hdl.handle.net/10356/74040 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-74040 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-74040 2023-03-03T20:44:35Z Cognitive systems for data analytics Wu, Evan Tan Ah Hwee School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2018-04-23T14:33:20Z 2018-04-23T14:33:20Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74040 en Nanyang Technological University 129 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering |
spellingShingle |
DRNTU::Engineering::Computer science and engineering Wu, Evan Cognitive systems for data analytics |
description |
The cities of Paris, London, Chicago and New York (among others) have deployed large-scale bikeshare systems to facilitate the use of bicycles for urban commuting. This Final Year Project (FYP) focuses on the city of Chicago and estimates the relationship between bikeshare usage and various other transit modes, as well as weather conditions. In addition, I estimate the effects of ridership between transit modes, as well as their accessibility and availability. The scope of the project is therefore to produce insightful analysis, mainly of bikeshare but also of a collection of other transit modes, by analysing datasets with cognitive-related algorithms.
I mainly used Python for this Final Year Project; Python libraries such as Folium, BeautifulSoup, NumPy, scikit-learn and pandas were used for the necessary visualizations, data pre-processing and analysis. In addition, Java was used to implement Adaptive Resonance Theory (ART)-based algorithms for clustering, namely Fusion ART, Fusion ART with Match Tracking, Fusion ART with Interesting Features (WIF) and IFC ART. Throughout the experiments, the ART-based algorithms were shown to have low computational complexity and the ability to learn adaptively and, more importantly, they are cognitive. For each experiment, the four ART-based algorithms and the algorithms provided by scikit-learn are evaluated using the Davies-Bouldin index (DBI) to determine the best-performing algorithm.
Six experiments were conducted in this project. Some interesting findings were uncovered from the transit groups in the experiments, such as the high correlation between subway departures and check-in counts at nearby bikeshare stations, and the tendency for abnormally high taxi transaction counts to come from specific locations, for both departure and drop-off points. Anomalies were also found. One example occurred on 4th November 2016 at “Lake/State”, in the Central Business District (CBD) of Chicago, which recorded extremely high subway departures; events such as “Mac & Cheese Fest” and “Allstate Hot Chocolate 15K/5K” were taking place near “Lake/State” and most likely contributed greatly to this anomaly. The limitations of this project are that the results are representative only of people living in Chicago, Illinois (IL), and that the way data was recorded for certain datasets restricts the analysis; for example, the bus-related data records only the departure count for each route, which allows only a limited level of analysis. Future work could apply the algorithms to datasets representative of the populations of other countries and compare the differences in mobility, usage and behavioural patterns between these countries. |
author2 |
Tan Ah Hwee |
author_facet |
Tan Ah Hwee Wu, Evan |
format |
Final Year Project |
author |
Wu, Evan |
author_sort |
Wu, Evan |
title |
Cognitive systems for data analytics |
title_short |
Cognitive systems for data analytics |
title_full |
Cognitive systems for data analytics |
title_fullStr |
Cognitive systems for data analytics |
title_full_unstemmed |
Cognitive systems for data analytics |
title_sort |
cognitive systems for data analytics |
publishDate |
2018 |
url |
http://hdl.handle.net/10356/74040 |
_version_ |
1759856414227955712 |
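
The description above states that the clustering algorithms are compared using the Davies-Bouldin index (DBI), where a lower value indicates more compact and better-separated clusters. The following is a minimal sketch of that comparison in plain Python, not the project's actual code: the toy points, the two labelings and the names `candidate_a` and `candidate_b` are hypothetical stand-ins for the output of two clustering algorithms.

```python
import math


def centroid(points):
    """Return the coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a labeling (lower is better).

    For each cluster i, S_i is the mean distance of its points to its
    centroid. The index is the average over clusters of
    max_{j != i} (S_i + S_j) / d(centroid_i, centroid_j).
    Assumes no two clusters share the same centroid.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    centroids = {lab: centroid(pts) for lab, pts in clusters.items()}
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }

    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]

# Hypothetical outputs of two clustering algorithms on the same data.
candidates = {
    "candidate_a": [0, 0, 0, 1, 1, 1],  # matches the visible grouping
    "candidate_b": [0, 1, 0, 1, 0, 1],  # mixes the two groups
}

scores = {name: davies_bouldin(points, labels) for name, labels in candidates.items()}
best = min(scores, key=scores.get)  # lowest DBI wins
print(scores, best)
```

scikit-learn provides `sklearn.metrics.davies_bouldin_score(X, labels)` for the same measure; the hand-written version here is only meant to make the formula explicit.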