TrueGPT: can you privately extract algorithms from ChatGPT in tabular classification?


Saved in:
Bibliographic Details
Main Author: Soegeng, Hans Farrell
Other Authors: Thomas Peyrin
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175642
Institution: Nanyang Technological University
Description
Abstract: Recently, Large Language Models (LLMs) have been shown to achieve impressive zero-shot classification on tabular data, revealing an internal algorithm A_LLM that requires no explicit training data. We predict that A_LLM will become a standard for tabular data classification, replacing resource-intensive custom ML models. However, the complexity of LLMs hinders regulatory transparency. To address this, we introduce a method that approximates A_LLM with human-interpretable binary feature rules, A_rule. We use the TT-rules (Truth Table rules) model of Benamira et al. (2023) to extract the binary rules from the LLM's inference on tabular datasets. After the extraction and approximation steps, we set aside the LLM and rely exclusively on A_rule for inference. Our method is fully automatic. We validate the approach on 8 public tabular datasets, with a user-activatable privacy-preserving feature to protect the data owner.
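The pipeline the abstract describes (label tabular rows with the LLM zero-shot, binarize the features, extract a small human-readable rule set approximating the LLM, then discard the LLM and classify with the rules alone) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's implementation: `llm_zero_shot_classify` is a mock stand-in for a real LLM call, and a crude one-conjunction learner stands in for the TT-rules extraction of Benamira et al.

```python
# Minimal sketch of LLM-to-rules distillation for tabular classification.
# NOTE: llm_zero_shot_classify is a mock; in a real pipeline this would
# prompt an LLM with a serialized row and parse its predicted class.

def llm_zero_shot_classify(row):
    # Mock "internal algorithm" standing in for the LLM's zero-shot answer.
    return 1 if row["age"] > 40 and row["income"] > 50 else 0

def binarize(row, thresholds):
    # Map numeric features to binary literals such as "age>40".
    return {f"{k}>{t}": row[k] > t for k, t in thresholds.items()}

def learn_conjunction(rows, labels, thresholds):
    # Crude rule extractor (stand-in for TT-rules): keep every literal
    # that is true in all rows the LLM labeled positive.
    literals = None
    for row, y in zip(rows, labels):
        if y == 1:
            true_lits = {lit for lit, v in binarize(row, thresholds).items() if v}
            literals = true_lits if literals is None else literals & true_lits
    return literals or set()

def rule_classify(row, literals, thresholds):
    # Inference with the extracted rules only; the LLM is no longer needed.
    b = binarize(row, thresholds)
    return int(all(b[lit] for lit in literals))

# Distillation set: the (mock) LLM labels these rows once.
rows = [
    {"age": 50, "income": 60},
    {"age": 45, "income": 80},
    {"age": 30, "income": 70},
    {"age": 55, "income": 40},
]
thresholds = {"age": 40, "income": 50}
labels = [llm_zero_shot_classify(r) for r in rows]   # [1, 1, 0, 0]
rules = learn_conjunction(rows, labels, thresholds)  # {"age>40", "income>50"}
print(rule_classify({"age": 60, "income": 90}, rules, thresholds))  # 1
```

Once extracted, the rule set is a handful of auditable literal comparisons, which is the regulatory-transparency point of the abstract; the optional privacy-preserving feature mentioned there is not reflected in this sketch.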