Using Hadoop and Cassandra for taxi data analytics: A feasibility study
This paper reports on a preliminary study to assess the feasibility of using the Open Cirrus Cloud Computing Research testbed to provide offline and online analytical support for taxi fleet operations. In the study, we benchmarked the performance gains from distributing the offline analysis of GPS l...
Main Authors: | KOH, Alvin Jun Yong; NGUYEN, Xuan Khoa; WOODARD, C. Jason |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2010 |
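Subjects: | taxi fleet management; GPS data; cloud computing; Apache Hadoop; Databases and Information Systems; Numerical Analysis and Scientific Computing |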
Online Access: | https://ink.library.smu.edu.sg/sis_research/7045 https://ink.library.smu.edu.sg/context/sis_research/article/8048/viewcontent/Using_Hadoop_and_Cassandra_for_Taxi_Data_Analytics__A_Feasibility.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-8048 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-8048; 2022-03-29T01:28:31Z; Using Hadoop and Cassandra for taxi data analytics: A feasibility study; KOH, Alvin Jun Yong; NGUYEN, Xuan Khoa; WOODARD, C. Jason; This paper reports on a preliminary study to assess the feasibility of using the Open Cirrus Cloud Computing Research testbed to provide offline and online analytical support for taxi fleet operations. In the study, we benchmarked the performance gains from distributing the offline analysis of GPS location traces over multiple virtual machines using the Apache Hadoop implementation of the MapReduce paradigm. We also explored the use of the Apache Cassandra distributed database system for online retrieval of vehicle trace data. While configuring the testbed infrastructure was straightforward, we encountered severe I/O bottlenecks in running the benchmarks due to the lack of local disk storage on the compute nodes. This design limitation severely impedes the analysis of large data sets using cloud computing technologies. 2010-06-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/7045; https://ink.library.smu.edu.sg/context/sis_research/article/8048/viewcontent/Using_Hadoop_and_Cassandra_for_Taxi_Data_Analytics__A_Feasibility.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; taxi fleet management; GPS data; cloud computing; Apache Hadoop; Databases and Information Systems; Numerical Analysis and Scientific Computing |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
taxi fleet management; GPS data; cloud computing; Apache Hadoop; Databases and Information Systems; Numerical Analysis and Scientific Computing |
description |
This paper reports on a preliminary study to assess the feasibility of using the Open Cirrus Cloud Computing Research testbed to provide offline and online analytical support for taxi fleet operations. In the study, we benchmarked the performance gains from distributing the offline analysis of GPS location traces over multiple virtual machines using the Apache Hadoop implementation of the MapReduce paradigm. We also explored the use of the Apache Cassandra distributed database system for online retrieval of vehicle trace data. While configuring the testbed infrastructure was straightforward, we encountered severe I/O bottlenecks in running the benchmarks due to the lack of local disk storage on the compute nodes. This design limitation severely impedes the analysis of large data sets using cloud computing technologies. |
format |
text |
author |
KOH, Alvin Jun Yong; NGUYEN, Xuan Khoa; WOODARD, C. Jason |
author_sort |
KOH, Alvin Jun Yong |
title |
Using Hadoop and Cassandra for taxi data analytics: A feasibility study |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2010 |
url |
https://ink.library.smu.edu.sg/sis_research/7045 https://ink.library.smu.edu.sg/context/sis_research/article/8048/viewcontent/Using_Hadoop_and_Cassandra_for_Taxi_Data_Analytics__A_Feasibility.pdf |
_version_ |
1770576194198044672 |