Determining intent of conversations through machine learning

Bibliographic Details
Main Author: Del Mundo, Gabriel V.
Format: text
Language: English
Published: Animo Repository 2018
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/5528
Description
Summary: Conversation is central to human life: interaction between two or more people promotes an exchange of ideas and thoughts. Because communication matters so much, automated conversational agents have seen widespread use and now appear in technologies such as navigation apps. Conversational agents form responses based on a person's input. However, current conversational systems lack the initiative to offer additional information to the user, since they lack knowledge of the conversation's context and of the user's intent. By modeling a person's intent, these systems gain knowledge of the current direction of a conversation. Forum posts and other data were extracted from a Filipino forum site called Pinoy Exchange to simulate conversations. Three machine learning methods were tested: Naive Bayes, Decision Trees (particularly Random Forest), and Convolutional Neural Networks. These methods were used to create two models: one that classifies dialogue acts to represent a user's intent, and one that classifies whether a post is about to conclude a conversation. The best-performing dialogue act model was the Convolutional Neural Network, which handled the multi-label classification problem with a Hamming Loss of 7.45. The conversation end model had difficulty classifying concluding conversations because the dataset was heavily skewed.
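
For readers unfamiliar with the metric named in the summary, the following is a minimal sketch of how Hamming loss is conventionally computed for a multi-label problem: the share of individual label slots that are predicted incorrectly, averaged over all posts and labels. The label names and the toy data are hypothetical and are not taken from the thesis. Hamming loss is normally a fraction between 0 and 1, so the reported 7.45 may be expressed as a percentage, but the record does not state the scale, so that reading is an assumption.

```python
def hamming_loss(y_true, y_pred):
    """Return the fraction of label slots where prediction and truth differ.

    Both arguments are lists of equal-length 0/1 rows, one row per post and
    one column per dialogue-act label.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")

    total = 0   # number of label slots compared
    wrong = 0   # number of slots where the prediction is incorrect
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row pair must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            total += 1
            if t != p:
                wrong += 1

    if total == 0:
        raise ValueError("no labels to compare")
    return wrong / total


# Hypothetical dialogue-act labels, for illustration only.
LABELS = ["question", "answer", "statement", "thanks"]

# Toy data: three posts, each tagged with zero or more dialogue acts.
y_true = [
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
y_pred = [
    [1, 0, 0, 0],   # fully correct
    [0, 1, 0, 0],   # misses "statement"
    [0, 0, 1, 1],   # adds a spurious "thanks"
]

loss = hamming_loss(y_true, y_pred)
print(f"Hamming loss: {loss:.4f}")
```

Lower values are better, and a value of 0 means every label of every post was predicted correctly. Unlike exact-match accuracy, the metric gives partial credit when a post has some of its labels right, which is why it suits multi-label dialogue act classification.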