Exploring the linguistic landscape of GPT4 and Jasper Language Models: a corpus-based analysis of Italian short stories for children

Monica Palmerini (Conceptualization)

2024-01-01

Abstract

The pervasive use of advanced chatbots for content generation and translation raises intriguing linguistic questions. Alongside the complex ethical considerations surrounding the development and use of these language models, there is a notable gap in general linguistics on this subject, particularly with regard to Italian, as shown by the scarcity of dedicated studies. This article aims to help fill that gap through a linguistic analysis of a set of texts produced by two widely used language models, GPT4 and Jasper. Methodologically, we built two corpora, each comprising 10 short stories for children, generated by ChatGPT and Jasper respectively, using prompt engineering tools to give each model highly detailed and explicit inputs. We added a third corpus of equal size, consisting of 10 classic fairy tales translated into Italian, such as Hansel and Gretel and Tom Thumb, to provide a human-language baseline for comparison. Through a qualitative and quantitative analysis carried out with Sketch Engine, we identified distinctive linguistic features of the ChatGPT and Jasper texts, with the aim of providing an initial characterization of these language models and with particular attention to the deviations and idiosyncrasies of this form of linguistic expression. Ultimately, the paper aims to prompt further reflection on the topic and to foster a deeper understanding of the implications and potential consequences of using these language models, in both ethical and linguistic terms.

Year of publication: 2024
ISBN: 978-84-1170-923-1
Citation: Calò, C., Palmerini, M. (2024). Exploring the linguistic landscape of GPT4 and Jasper Language Models: a corpus-based analysis of Italian short stories for children. In J.A.N.Á.y.O.S.O.G. Salud A. Flores Borjabad (Ed.), Tejiendo palabras: explorando la lengua, la lingüística y el proceso de traducción en la era de la inteligencia artificial (pp. 269-286). Madrid: Dykinson, S.L.
Appears in types: 2.1 Contribution in volume (Chapter or Essay)

Files in this item:

File	Access	Description	Type	License	Size	Format
2024_Palmerini-Calò_Exploring the LL of GPT4 and Jasper language models.pdf	Open access	Book chapter	Publisher's version (PDF)	Creative Commons	524.79 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11590/480787
