Resources
A Method to Evaluate and Compare Web IE
We report on an original information extraction evaluation method that compares proposals working on free, semi-structured, or structured documents stringently, homogeneously, and fairly.
Download
(0.1 MB,
application/pdf)
A Novel Approach to Web Information Extraction
We present a proposal to extract information from semi-structured web documents. It leverages a classical propositional technique and enhances it with the ability to learn from an unbounded context, which helps increase its effectiveness.
Download
(0.11 MB,
application/pdf)
New approaches to Web Information Extraction
In this dissertation, we focus on developing web information extractors that learn rules to extract information from semi-structured web documents, and on how to evaluate different information extraction proposals in order to rank them.
Download
(2.92 MB,
application/pdf)
On Exploring Search Heuristics for Web IE
In this paper, we employ a classical IE technique to explore different search heuristics and analyse their impact on effectiveness and/or efficiency.
Download
(0.33 MB,
application/pdf)
On Extracting Information from Deep Web Documents
We propose an approach that relies on an open catalogue of attributive and relational features of DOM nodes; it incorporates an optimisation that makes it very efficient.
Download
(0.38 MB,
application/pdf)
On Feeding Agents with Web Information
Dykers is a new approach that can learn very resilient web information extraction rules using neural networks.
Download
(0.2 MB,
application/pdf)
On Learning Web IE Rules With TANGO
TANGO is a new proposal to learn web information extraction rules that are represented as Horn clauses.
Download
(1.44 MB,
application/pdf)
On Member Labelling in Social Networks
Katz is a novel hybrid proposal that builds on neural networks to solve the member labelling problem.
Download
(0.16 MB,
application/pdf)
Optimising FOIL for Information Extraction (Optimizando FOIL para la Extracción de Información)
Optimising FOIL for web information extraction.
Download
(0.17 MB,
application/pdf)
ROLLER: A novel approach to Web IE
Roller is a new propositio-relational technique that relies on an open catalogue of features, so that it can adapt to the evolution of the Web, and on an independent base learner and rule scorer, so that it can benefit from continuous advances in machine learning.
Download
(0.59 MB,
application/pdf)
VENICE: A method to Rank Web Information Extractors
Venice is an automated method to rank web information extraction proposals.
Download
(0.62 MB,
application/pdf)