Resurrecting the anscombosaurus Rex : why learning exploratory data analysis is critical for biologists

Bibliographic Details
Main Author: Ho, Sung Yang
Other Authors: Goh Wen Bin Wilson
Format: Final Year Project
Language: English
Published: 2019
Online Access: http://hdl.handle.net/10356/77284
Institution: Nanyang Technological University
Description
Summary: Confirmatory Data Analysis (CDA) methods, such as p-values for significance, are increasingly abused in research. We recommend Exploratory Data Analysis (EDA) as a complement to CDA methods for gaining better insight into data. This study aims to design an introductory EDA tutorial, using feedback from participants and materials currently taught in online courses to design EDA subtopics that can be used in biology and, if possible, generalized to other fields. Our findings suggest that there is a knowledge gap between undergraduates and postgraduates. Students are exposed only to CDA methods, and they hold multiple misconceptions about graphical and statistical interpretation. The key findings suggest that biology students are not only not proficient in statistics but also lacking in data science skills. Hence, there is a pressing need to improve data science education in biology. The final design of the EDA subtopics, arrived at after content analysis, aims to teach students the importance of clean data, the power of data visualisation through the “ggplot2” R package, and patterns of significance when analysing data.