Evaluation Framework for Large Language Model-based Conversational Agents
Tuan Q. Phan; Bernard Tan; Le Hoanh-Su; Nguyen Hoang Thuan (Eds.). Pacific-Asia Conference on Information Systems PACIS 2024 Proceedings. Ho Chi Minh City, Vietnam, July 1 - 5, 2024. Atlanta, GA: Association for Information Systems/AIS eLibrary 2024, pp. 1390 - 1406
Year of publication: 2024
Publication type: Miscellaneous (conference paper)
Language: English
Abstract
The integration of Large Language Models (LLM) in Conversational Agents (CA) enables a significant advancement in the agents’ ability to understand and respond to user queries in a more human-like manner. Despite the widespread adoption of LLMs in these agents, there exists a noticeable lack of research on standardized evaluation methods. Addressing this research gap, our study proposes a comprehensive evaluation framework tailored explicitly to LLM-based conversational agents. In a Design Science Research (DSR) project, we construct an evaluation framework that incorporates four essential components: the pre-defined objectives of the agents, corresponding tasks, and the selection of appropriate datasets and metrics. Our framework outlines how these elements relate to each other in the evaluation and enables a structured approach for the evaluation. We demonstrate how such a framework enables a more systematic evaluation process. This framework can be a guiding tool for researchers and developers working with LLM-based conversational agents.
Associated persons
- Anna Wolters
- Staff member
(Institut für Wirtschafts- und Verwaltungsinformatik)