TREC 2001 Arabic/English Cross-Language IR track

Overview

The National Institute of Standards and Technology (NIST) will conduct an evaluation of Cross-Language Information Retrieval (CLIR) technology in conjunction with the Text REtrieval Conference (TREC-2001). The focus for 2001 will be retrieval of Arabic-language newswire documents using topics in English or French. Additional information is available in the track guidelines.

Official Information

Track Guidelines
The track guidelines contain information about obtaining the test collection, running the experiments, and submitting the results.

Useful resources

Arabic IR and Computational Linguistics Resources
An extensive collection of links to resources that participants in the track might find useful.
Character set conversion
A Perl script provided by the LDC that converts the Arabic topic descriptions from ISO 8859-6 encoding to UTF-8, the encoding used by the document collection, so that the topics can be used in monolingual Arabic retrieval experiments. This script is freely redistributable.
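
For readers who want to see what such a conversion involves, the following is a minimal Python sketch of transcoding ISO 8859-6 bytes to UTF-8. It is not the LDC Perl script and makes no claim to match its behavior; the function names and file paths are hypothetical, chosen only for illustration.

```python
def iso8859_6_to_utf8(data: bytes) -> bytes:
    """Transcode ISO 8859-6 (Latin/Arabic) bytes to UTF-8 bytes.

    Illustrative helper, not part of the LDC distribution.
    Strict error handling makes unmappable bytes raise instead of
    being silently dropped.
    """
    text = data.decode("iso8859_6", errors="strict")
    return text.encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read an ISO 8859-6 file and write a UTF-8 copy (paths are hypothetical)."""
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(iso8859_6_to_utf8(raw))
```

ASCII characters pass through unchanged, while each Arabic letter, a single byte in ISO 8859-6, becomes a two-byte UTF-8 sequence. A call such as `convert_file("topics.iso", "topics.utf8")` would produce a UTF-8 copy of a topic file, assuming those file names.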

Fred Gey
Doug Oard
Last modified: Fri Jul 20 01:09:47 2001