We report on recent advances in the development of the ROADRUNNER system, which automatically infers a wrapper for HTML pages. A major drawback of the ROADRUNNER approach was its limited ability to handle irregularities in the source pages. To overcome this limitation, we have developed a technique for dealing with chunks of unstructured HTML code. Several experiments conducted to evaluate the effectiveness of the approach produced encouraging results. Copyright © 2004, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
Crescenzi, V., Mecca, G., & Merialdo, P. (2004). Handling irregularities in ROADRUNNER. In AAAI Workshop - Technical Report (pp. 39-44).
|Title:||Handling irregularities in ROADRUNNER|
|Publication date:||2004|
|Citation:||Crescenzi, V., Mecca, G., & Merialdo, P. (2004). Handling irregularities in ROADRUNNER. In AAAI Workshop - Technical Report (pp. 39-44).|
|Appears in categories:||4.1 Contribution in conference proceedings|