Evaluation Framework for Large Language Model-based Conversational Agents

Tuan Q. Phan; Bernard Tan; Le Hoanh-Su; Nguyen Hoang Thuan (eds.): Pacific-Asia Conference on Information Systems, PACIS 2024 Proceedings. Ho Chi Minh City, Vietnam, July 1 - 5, 2024. Atlanta, GA: Association for Information Systems / AIS eLibrary 2024, pp. 1390 - 1406

Year of publication: 2024

Publication type: Miscellaneous (conference paper)

Language: English


Abstract


The integration of Large Language Models (LLM) in Conversational Agents (CA) enables a significant advancement in the agents’ ability to understand and respond to user queries in a more human-like manner. Despite the widespread adoption of LLMs in these agents, there exists a noticeable lack of research on standardized evaluation methods. Addressing this research gap, our study proposes a comprehensive evaluation framework tailored explicitly to LLM-based conversational agents. In a Design Science Research (DSR) project, we construct an evaluation framework that incorporates four essential components: the pre-defined objectives of the agents, corresponding tasks, and the selection of appropriate datasets and metrics. Our framework outlines how these elements relate to each other in the evaluation and enables a structured approach for the evaluation. We demonstrate how such a framework enables a more systematic evaluation process. This framework can be a guiding tool for researchers and developers working with LLM-based conversational agents.

  • Conversational Agents
  • Evaluation Framework
  • Large Language Models

Authors


Arz von Straussenburg, Arnold (author)

Linked persons