Cognitive systems for data analytics

Bibliographic Details
Main Author: Wu, Evan
Other Authors: Tan Ah Hwee
Format: Final Year Project
Language: English
Published: 2018
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/74040
Institution: Nanyang Technological University
id sg-ntu-dr.10356-74040
record_format dspace
spelling sg-ntu-dr.10356-74040 2023-03-03T20:44:35Z Cognitive systems for data analytics Wu, Evan Tan Ah Hwee School of Computer Science and Engineering DRNTU::Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2018-04-23T14:33:20Z 2018-04-23T14:33:20Z 2018 Final Year Project (FYP) http://hdl.handle.net/10356/74040 en Nanyang Technological University 129 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Wu, Evan
Cognitive systems for data analytics
description The cities of Paris, London, Chicago and New York (among others) have deployed large-scale bikeshare systems to facilitate the use of bicycles for urban commuting. This Final Year Project (FYP) focuses on the city of Chicago and estimates the relationship between bikeshare usage and various other transit modes, as well as weather conditions. In addition, I estimate the effects of ridership between the various transit modes, as well as their accessibility and availability. Hence, the scope of the project is to produce insightful analysis, mainly of bikeshare along with a collection of other transit modes, through analysis of datasets using cognitive-related algorithms. I mainly used Python for this Final Year Project; Python libraries such as Folium, BeautifulSoup, NumPy, scikit-learn and pandas were used for the necessary visualizations, data pre-processing and analysis. In addition, Java was used to implement Adaptive Resonance Theory (ART)-based algorithms for clustering, namely Fusion ART, Fusion ART with Match Tracking, Fusion ART with Interesting Features (WIF) and IFC ART. Throughout the experiments, the ART-based algorithms were shown to have low computational complexity with the ability to learn adaptively and, more importantly, they are cognitive. The four ART-based algorithms and the algorithms provided by scikit-learn are evaluated using the Davies-Bouldin index (DBI) to determine the best-performing algorithm for each experiment. Six experiments were conducted in this project. Some interesting findings were uncovered from the transit groups in the experiments, such as the high correlation between subway departures and check-in counts at nearby bikeshare stations, and the tendency of abnormally high taxi usage counts to come from specific locations for both departure and drop-off points.
Anomalies were also found. One example occurred on 4 November 2016 at “Lake/State” in the Central Business District (CBD) of Chicago, which recorded extremely high subway departures; events such as “Mac & Choose Fest” and “Allstate Hot Chocolate 15K/5K” were taking place near “Lake/State”, which most likely contributed greatly to this anomaly. The limitations of this project are that the results are representative only of people living in Chicago, Illinois (IL), and that the way data was recorded for certain datasets restricts the analysis; for example, the bus-related data records only the departure count for each route, allowing only a limited level of analysis. Future work could apply the algorithms to datasets representative of populations in other countries and compare the differences in mobility, usage and behavioural patterns between these countries.
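The description names the Davies-Bouldin index (DBI) as the measure used to compare clustering algorithms. As a rough illustration of what that measure computes, the sketch below implements the standard DBI definition in plain Python. It is not the project's code: the function name, the toy points and the labels are hypothetical and chosen only to show that a lower DBI corresponds to more compact, better-separated clusters. scikit-learn offers an equivalent `sklearn.metrics.davies_bouldin_score`, which would likely be the practical choice on real data.

```python
import math
from collections import defaultdict


def davies_bouldin_index(points, labels):
    """Return the Davies-Bouldin index of a labelled point set (lower is better)."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("DBI needs at least two clusters")

    # Centroid of each cluster: the coordinate-wise mean of its points.
    centroids = {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in clusters.items()
    }
    # Scatter of each cluster: mean Euclidean distance from its points to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    # For each cluster, take its worst (largest) similarity ratio to any other
    # cluster, then average those worst cases over all clusters.
    # Assumes no two clusters share the same centroid (that would divide by zero).
    total = 0.0
    for i in clusters:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical toy data: four (x, y) points, not taken from the project.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good_labels = [0, 0, 1, 1]  # two compact, well-separated clusters
poor_labels = [0, 1, 0, 1]  # two stretched clusters lying close together

dbi_good = davies_bouldin_index(points, good_labels)
dbi_poor = davies_bouldin_index(points, poor_labels)
print(dbi_good, dbi_poor)
```

With these toy points the well-separated labelling scores lower than the stretched one, which is the sense in which the algorithm with the lowest DBI would be treated as the best performer for an experiment.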
author2 Tan Ah Hwee
author_facet Tan Ah Hwee
Wu, Evan
format Final Year Project
author Wu, Evan
author_sort Wu, Evan
title Cognitive systems for data analytics
title_short Cognitive systems for data analytics
title_full Cognitive systems for data analytics
title_fullStr Cognitive systems for data analytics
title_full_unstemmed Cognitive systems for data analytics
title_sort cognitive systems for data analytics
publishDate 2018
url http://hdl.handle.net/10356/74040
_version_ 1759856414227955712