Sajeva, A., Iannucci, S., Marchetti, C., Merialdo, P., Torlone, R. (2024). Clustering Amendments with Semantic Embeddings. In CEUR Workshop Proceedings (pp. 312-320). CEUR-WS.

Clustering Amendments with Semantic Embeddings

Sajeva A.; Iannucci S.; Merialdo P.; Torlone R.
2024-01-01

Abstract

The Italian Senate needs to cluster amendments in order to optimize the scheduling of parliamentary sessions. This task is currently carried out by Similis, an application that relies on a traditional term-frequency technique and therefore clusters amendments by wording rather than by semantics. Recent advances in natural language processing have led Italian institutions to investigate the adoption of pre-trained language models (PTLMs) for text analysis. Along this line, we propose CLAMSE, an alternative to Similis that uses pre-trained Sentence-BERT models to generate embeddings and then groups similar amendments through hierarchical agglomerative clustering. Our preliminary evaluation shows that CLAMSE, using embeddings generated by pre-trained models without fine-tuning, achieves performance comparable to that of Similis, paving the way for a clustering method with advanced contextual understanding. This study contributes to enhancing the effectiveness of institutional decision-making processes through the adoption of PTLMs.
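
As a rough illustration of the grouping step named in the abstract, the sketch below runs hierarchical agglomerative clustering over a handful of toy vectors. It is not the CLAMSE implementation: the cosine distance, the average-linkage criterion, the stopping threshold, and the names `agglomerative_clustering` and `toy_embeddings` are all assumptions made for this example, and the toy vectors merely stand in for embeddings that a Sentence-BERT model would produce.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_linkage(c1, c2, vectors):
    """Mean pairwise distance between the members of two clusters."""
    dists = [cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerative_clustering(vectors, threshold):
    """Merge the two closest clusters until the closest pair exceeds threshold.

    The linkage criterion and the threshold are assumptions of this sketch,
    not parameters taken from the paper.
    """
    clusters = [[i] for i in range(len(vectors))]  # start with singletons
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average-linkage distance.
        (a, b), best = min(
            (
                ((a, b), average_linkage(clusters[a], clusters[b], vectors))
                for a, b in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return [sorted(c) for c in clusters]


# Toy stand-ins for amendment embeddings (hypothetical values).
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.0, 0.9, 0.2],
]
clusters = agglomerative_clustering(toy_embeddings, threshold=0.2)
print(clusters)
```

With a distance threshold, the number of clusters does not have to be fixed in advance, which may suit a setting where the number of groups of similar amendments is unknown; whether CLAMSE stops merging this way is not stated in the abstract.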
Use this identifier to cite or link to this document: https://hdl.handle.net/11590/483468