Accurate fact harvesting from natural language text in Wikipedia with Lector

Cannaviccio, Matteo; Barbosa, Denilson; Merialdo, Paolo

2016-01-01

Abstract

Many approaches have been introduced recently to automatically create or augment Knowledge Graphs (KGs) with facts extracted from Wikipedia, particularly from its structured components such as infoboxes. Although these structures are valuable, they represent only a fraction of the information expressed in the articles. In this work, we quantify the number of facts that can be harvested with high precision from the text of Wikipedia articles using information extraction techniques bootstrapped from the entities and relations already in a KG. Our experimental evaluation, which uses Freebase as the reference KG, reveals that we can augment several relations in the domain of people by more than 10%, with facts whose accuracy is over 95%. Moreover, the vast majority of these facts are missing from the infoboxes, YAGO, and DBpedia.

Year of publication: 2016

ISBN: 9781450343107

Citation: Cannaviccio, M., Barbosa, D., Merialdo, P. (2016). Accurate fact harvesting from natural language text in Wikipedia with Lector. In Proceedings of the 19th International Workshop on Web and Databases, WebDB 2016 (pp. 1-6). Association for Computing Machinery, Inc [10.1145/2932194.2932203].

Publication type: 4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/11590/308071
