Athena : a Raspberry Pi-based Alexa system

Bibliographic Details
Main Author: Khoo, Bryant Ju Seng
Other Authors: Oh Hong Lye
Format: Final Year Project
Language: English
Published: 2018
Online Access: http://hdl.handle.net/10356/74006
Institution: Nanyang Technological University
Description
Summary: Virtual assistants enabled with artificial intelligence have found many applications. They have evolved over the years and can now interact with users through text, voice and images. With recent advances in artificial intelligence and machine learning, many companies have entered the virtual assistant market, and products such as Amazon’s Echo have started to gain popularity in households. Despite the popularity of Amazon’s Alexa, there is still room for improvement. Alexa does not offer its users personalization options, and most available Alexa Skills only execute simple commands based on Alexa’s interaction model, which is not conversation-friendly. This project therefore explores the implementation of Amazon’s Alexa on a Raspberry Pi with personalised options and investigates how an Alexa Skill can be built to increase interactivity in a conversation. Athena, a Raspberry Pi-based Alexa system, was built on a Raspberry Pi 3 Model B unit. The system was implemented in the Python programming language, primarily so that the project can be extended in future development. Using the open source project AlexaPi, Athena can call the Alexa Voice Service (AVS) Application Programming Interface (API). The virtual assistant is invoked with a custom invocation word, “Athena”, detected by an external speech recognition engine, which allows flexibility in the choice of word. A facial recognition module was developed using an open source API and integrated with the system to provide further customization. In addition to these customizations, the project implements an Alexa Skill to investigate ways to create a more interactive conversation with the user.
During the exploration of Alexa skills, an open dialogue Alexa skill was implemented, but it did not provide a desirable conversational experience, for reasons detailed in the report. The NewsReader skill was therefore built with its conversation modelled as a Finite State Machine. The skill gives Athena the ability to read and summarize articles for the user based on “Search” and “Section” criteria. Articles that exceed the length limit permitted by AVS are summarised with the LexRank algorithm. The skill remembers contextual information about the conversation with the help of three levels of state management. Although Athena is still a prototype, one main advantage over the Amazon Echo is its customizability: being able to add custom modules such as the Facial Recognition module opens the system to many further upgrades and enhancements.
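
The idea of modelling a skill's conversation as a Finite State Machine can be illustrated with a minimal Python sketch. It is not taken from the project: the state names, prompts, the `Session` class and the `step` function are all hypothetical, and the sketch does not call the Alexa Skills Kit or AVS. It only shows how each user utterance can move the dialogue to a new state while contextual information (the chosen criterion and query) is remembered between turns.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    """Hypothetical dialogue states; the project's real states may differ."""
    START = auto()
    CHOOSE_CRITERION = auto()
    AWAIT_QUERY = auto()
    READING = auto()
    END = auto()


@dataclass
class Session:
    """Holds the current state plus context remembered across turns."""
    state: State = State.START
    context: dict = field(default_factory=dict)


def step(session: Session, utterance: str) -> str:
    """Advance the conversation by one turn and return the spoken reply."""
    text = utterance.strip().lower()

    # "stop" ends the conversation from any state.
    if text == "stop":
        session.state = State.END
        return "Goodbye."

    if session.state is State.START:
        session.state = State.CHOOSE_CRITERION
        return "Would you like to search, or to browse a section?"

    if session.state is State.CHOOSE_CRITERION:
        if text in ("search", "section"):
            session.context["criterion"] = text
            session.state = State.AWAIT_QUERY
            return f"Which {text} term should I use?"
        # Unrecognised input: stay in the same state and re-prompt.
        return "Please say search or section."

    if session.state is State.AWAIT_QUERY:
        session.context["query"] = text
        session.context["index"] = 0
        session.state = State.READING
        return f"Reading the first article for {text}. Say next or stop."

    if session.state is State.READING:
        if text == "next":
            session.context["index"] += 1
            return f"Reading article {session.context['index'] + 1}."
        return "Say next or stop."

    return "The session has ended."
```

A conversation is then a sequence of calls such as `step(session, "section")` followed by `step(session, "technology")`. Because every transition is explicit, the skill always knows which inputs are valid at a given point, which is the main appeal of this approach over an open dialogue.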