Large language models for urban data analysis: exploration of various methods and LLMs for traffic data prediction


Bibliographic Details
Main Author: Goh, Jeremy Chun Hao
Other Authors: Long Cheng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/181199
Description
Summary: This project investigates the application of Large Language Models (LLMs) to urban data analysis, with a focus on traffic data. Its primary objectives are to explore the capabilities of various LLMs, such as ChatGPT, Claude, and Llama, in urban data analysis; to develop methodologies for qualitative and quantitative traffic data prediction; and to compare their performance against traditional data analysis methods. Tailored prompts were designed for the experiments, which were run on Poe.com, a platform that allows users to create their own agents with a customized knowledge base. The study identified two main techniques for leveraging LLMs in urban data analytics: Standard Prompting and a One-time Setup method that creates a personalised assistant. While standard prompting requires a new prompt for each analysis, building the prompt into a personalised assistant eliminates the need for repeated prompt crafting, saving time and helping users who lack prompt engineering knowledge. Key findings for qualitative data prediction show that ChatGPT-4o excels in data interpretation, while Claude 3.5 Sonnet can provide actionable insights and realistic forecasts. Llama, however, faced challenges in achieving a moderate level of accuracy in data interpretation. Evaluation of the accuracy of quantitative predictions also revealed significant limitations when forecasting data such as the total number of cars, which is highly volatile because of changing policies. By contrast, public transport ridership followed a more stable and predictable pattern.
Overall, this project demonstrates that LLMs can significantly enhance urban data analytics workflows, yielding quicker insights into traffic patterns than traditional methods, although some limitations remain, such as the need for domain expertise in interpreting results.