Processing Geographic Language

Prof. Inderjeet Mani, Brandeis University Boston

Humans are able to communicate geographic information in a highly concise but vague manner, posing interesting challenges for natural language understanding. In recent years, information extraction systems have been developed to ground geographical references in text in terms of geo-coordinates, with the tags produced by such systems being used by geographical search engines and mapping tools. However, without a standard for how different types of geographical entities should be tagged, such systems are impossible to evaluate reliably. I will describe an annotation scheme called SpatialML that has been used to accurately mark up places, their geo-coordinates, and spatial relationships in a variety of text corpora. SpatialML represents spatial relationships among geographical regions in terms of the Region Connection Calculus (RCC), and it has also been mapped to the Generalized Upper Model (GUM) ontology from the University of Bremen. SpatialML is also being used in the Cross-Language Evaluation Forum (CLEF) to assess tools that analyze geographical queries posed to search engines, and it is currently being integrated with a time markup standard (TimeML). Despite these positive trends, I will argue that a far more concerted research effort is required to address the fundamental challenges of geographic language.

Date: 11.05.2009

Time: 11:00 h

Location: Cartesium, Room 1;041, Enrique-Schmidt-Str. 5, Universität Bremen