As the use of Large Language Models (LLMs) grows, their limitations in supporting domain-specific scholarly research, particularly in Ancient World studies, become evident. To address this gap, this paper introduces the AI Librarian, a system that uses Retrieval-Augmented Generation (RAG) to let scholars ask questions about the content of nearly two thousand multilingual open-access publications from ISAW Papers, Digital Central Asian Archaeology, and the Ancient World Digital Library. The tool incorporates customized preprocessing for various document formats and offers an intuitive multilingual interface that ensures accurate, citation-supported scholarly engagement. Experiments on a large set of questions and answers show the effectiveness and efficiency of our solution. This project highlights the transformative potential of RAG-powered language systems in Digital Humanities and emphasizes the importance of investment in open-access knowledge infrastructure.
Di Pasqua, F., Merialdo, P., Torlone, R. (2025). Generative AI for Ancient Insights: Leveraging LLMs and RAG for Ancient World Studies. In Proceedings of the 2025 IEEE International Conference on Cyber Humanities, IEEE-CH 2025 (pp. 1-6). Institute of Electrical and Electronics Engineers Inc. [10.1109/IEEE-CH65308.2025.11279339].
Generative AI for Ancient Insights: Leveraging LLMs and RAG for Ancient World Studies
Merialdo P.; Torlone R.
2025-01-01


