We report on recent advances in the development of the ROADRUNNER system, which automatically infers a wrapper for HTML pages. A major drawback of the ROADRUNNER approach was its limited ability to handle irregularities in the source pages. To overcome this issue, we have developed a technique for dealing with chunks of unstructured HTML code. Several experiments conducted to evaluate the effectiveness of the approach produced encouraging results. Copyright © 2004, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.

Crescenzi, V., Mecca, G., Merialdo, P. (2004). Handling irregularities in ROADRUNNER. In AAAI Workshop - Technical Report (pp. 39-44).

Handling irregularities in ROADRUNNER

Crescenzi, Valter; Mecca, Giansalvatore; Merialdo, Paolo
2004-01-01


Use this identifier to cite or link to this document: https://hdl.handle.net/11590/307423
Citations
  • Scopus 3