A NEW INFORMATION EXTRACTION METHOD BASED ON TRANSDUCERS
Vesna Pajić
Univerzitet u Beogradu, Poljoprivredni fakultet
Miloš Pajić
Univerzitet u Beogradu, Poljoprivredni fakultet
Staša Vujičić-Stanković
Univerzitet u Beogradu, Matematički fakultet
Keywords:
information extraction, natural language processing, data structuring
Abstract
This paper gives an overview of information extraction, whose methods and techniques are indispensable in information search and information management. Information extraction uses and combines techniques and methods from mathematics and computer science, such as natural language processing, formal language theory, probability, and statistics. Taking into account the specifics of information requests and of the textual resources from which information is extracted, we developed and present a new information extraction method, called the two-stage method based on transducers. The architecture of a system that implements this method is presented, along with an example of its application. The method is of particular significance in situations where annotated text corpora, which existing methods require (especially those based on probability and statistics), are lacking.