Infrastructure Components for Large-Scale Information Extraction Systems

William W. Cohen

Large-scale systems for information extraction include many different classifiers and extractors. Experience in building such systems shows that finding an appropriate architecture is both difficult and important. In particular, in systems containing many learned components, it is important to share information cleanly between the components and to sequence their actions flexibly. This paper describes an architecture for large-scale information extraction systems, based on a lightweight blackboard system for communication between components and a declarative control system for automatically sequencing component-level tasks such as classification, extraction, and feature computation.
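
As a rough illustration of the two ideas named in the abstract, the sketch below pairs a shared blackboard (here just a Python dictionary) with declaratively specified components that are sequenced automatically from what each one reads and writes. It is not the system described in the paper: the names `Component`, `run_pipeline`, `requires`, and `provides`, the three toy components, and the sample text are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Component:
    """Hypothetical component declaration: what it reads, what it writes."""
    name: str
    requires: Tuple[str, ...]   # blackboard keys this component reads
    provides: str               # blackboard key this component writes
    run: Callable[[Dict[str, object]], object]


def run_pipeline(components: List[Component],
                 blackboard: Dict[str, object]) -> List[str]:
    """Run each component once all of its required keys are on the blackboard.

    Returns the order in which components actually ran.
    """
    pending = list(components)
    order: List[str] = []
    while pending:
        ready = [c for c in pending
                 if all(key in blackboard for key in c.requires)]
        if not ready:
            stuck = [c.name for c in pending]
            raise ValueError(f"unsatisfiable dependencies for: {stuck}")
        for c in ready:
            blackboard[c.provides] = c.run(blackboard)
            order.append(c.name)
            pending.remove(c)
    return order


# Toy stand-ins for feature computation, classification, and extraction.
def compute_tokens(bb):
    return bb["text"].split()


def classify_page(bb):
    return "person_page" if "professor" in bb["tokens"] else "other"


def extract_names(bb):
    if bb["label"] != "person_page":
        return []
    return [t for t in bb["tokens"] if t[:1].isupper()]


# Declared in an arbitrary order; the control loop works out the sequence.
components = [
    Component("extractor", ("tokens", "label"), "names", extract_names),
    Component("classifier", ("tokens",), "label", classify_page),
    Component("features", ("text",), "tokens", compute_tokens),
]

blackboard = {"text": "Ada Lovelace is a professor"}
order = run_pipeline(components, blackboard)
print(order)
print(blackboard["names"])
```

In this sketch the components never call one another directly. Each one only reads and writes blackboard keys, so adding a new classifier or extractor means adding one declaration rather than editing a hand-written sequence of calls.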
