Evaluation Framework for Large Language Model-based Conversational Agents

Tuan Q. Phan; Bernard Tan; Le Hoanh-Su; Nguyen Hoang Thuan (eds.): Pacific-Asia Conference on Information Systems, PACIS 2024 Proceedings. Ho Chi Minh City, Vietnam, July 1 - 5, 2024. Atlanta, GA: Association for Information Systems / AIS eLibrary 2024, pp. 1390 - 1406

Year of publication: 2024

Publication type: Miscellaneous (conference paper)

Language: English


Abstract


The integration of Large Language Models (LLM) in Conversational Agents (CA) enables a significant advancement in the agents’ ability to understand and respond to user queries in a more human-like manner. Despite the widespread adoption of LLMs in these agents, there exists a noticeable lack of research on standardized evaluation methods. Addressing this research gap, our study proposes a comprehensive evaluation framework tailored explicitly to LLM-based conversational agents. In a Design Science Research (DSR) project, we construct an evaluation framework that incorporates four essential components: the pre-defined objectives of the agents, corresponding tasks, and the selection of appropriate datasets and metrics. Our framework outlines how these elements relate to each other in the evaluation and enables a structured approach for the evaluation. We demonstrate how such a framework enables a more systematic evaluation process. This framework can be a guiding tool for researchers and developers working with LLM-based conversational agents.

  • Conversational Agents
  • Evaluation Framework
  • Large Language Models

Authors


Arz von Straussenburg, Arnold (author)

Linked persons