Automatic speech recognition and chat bot for air traffic control
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/177842 |
Institution: | Nanyang Technological University |
Summary: | Artificial Intelligence (AI) has demonstrated the ability to manage complex processes
highly effectively and is thus widely seen as a key component of future airport air traffic
management (ATM) systems. Future AI tools for ATM will rely on digital data, such as surveillance, radar,
weather, and flight plans, for their operation. However, the foundational Air Traffic
Control Officer (ATCo)-pilot communication medium is voice, which is a vital source
of situational data. Controller Pilot Data Link Communications (CPDLC) has been
developed as an alternative, text-based communication delivery method; however,
ATCo-pilot communications will not be completely transitioned to this framework in
the near-term future. Moreover, as CPDLC is a one-to-one communication paradigm,
the additional situational awareness of other traffic provided by traditional party-line
VHF communications is potentially lost. Therefore, an automated speech-to-text
translation tool can be seen as a missing link, enabling traditional ATCo-pilot voice
communications to be automatically translated and input into a datalink system such
as CPDLC. To this end, this paper presents a Machine Learning (ML) based Automatic
Speech Recognition (ASR) framework that is able to accurately translate VHF-quality
ATCo-pilot speech communication to text, achieving a Word Error Rate of only
6.13%. Moreover, the presented model is able to extract crucial information with an
accuracy and F1-score of 95.2% and 90.5% respectively. A detailed design of the
framework is provided to enable its replication by the wider research community. |
---|
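The summary reports a Word Error Rate (WER) without defining it. As a minimal sketch, assuming the conventional definition (word-level edit distance between the reference transcript and the recognised transcript, divided by the number of reference words), the metric could be computed as follows. The function name `word_error_rate` and the example transcripts are illustrative assumptions and are not taken from the project, whose exact text normalisation and scoring may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both transcripts are lower-cased and split on whitespace;
    a real ATC pipeline would likely normalise callsigns and numbers first.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcripts: one substituted word out of ten.
reference = "singapore six two one climb flight level three four zero"
hypothesis = "singapore six two one climb flight level three five zero"
wer = word_error_rate(reference, hypothesis)
```

Under this definition, a WER of 6.13% corresponds to roughly six word-level errors (substitutions, deletions, or insertions) per hundred reference words.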