TREC 2001 Arabic/English Cross-Language IR track
Overview
The National Institute of Standards and Technology (NIST) will conduct
an evaluation of Cross-Language Information Retrieval (CLIR)
technology in conjunction with the Text REtrieval Conference
(TREC-2001). The focus for 2001 will be retrieval of Arabic-language
newswire documents from topics in English or French. Additional
information is available in the track guidelines.
Official Information
- Track Guidelines
- The track guidelines contain information about obtaining the test
collection, running the experiments, and submitting the results.
Useful resources
- Arabic IR and Computational Linguistics Resources
- An extensive collection of links to resources that participants
in the track might find useful.
- Character set conversion
- A Perl script provided by the LDC for converting the Arabic
topic descriptions from the ISO 8859-6 encoding to UTF-8, the
encoding used by the document collection. The conversion is
needed for monolingual Arabic retrieval experiments. This
script is freely redistributable.
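For readers without Perl, the same kind of conversion can be sketched in Python. This is not the LDC script, only a minimal illustration that assumes the topic file is plain ISO 8859-6 text; the function names and the file paths passed to them are hypothetical.

```python
def iso8859_6_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO 8859-6 bytes as UTF-8 (illustrative helper, not the LDC script)."""
    # "iso8859_6" is the name of Python's built-in codec for ISO 8859-6.
    text = data.decode("iso8859_6")
    return text.encode("utf-8")


def convert_file(src_path: str, dst_path: str) -> None:
    """Convert a whole file; both paths are supplied by the caller."""
    # Binary mode avoids any implicit platform-dependent re-encoding.
    with open(src_path, "rb") as src:
        raw = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(iso8859_6_to_utf8(raw))
```

ASCII text such as SGML-style markup passes through unchanged, while each Arabic letter becomes a two-byte UTF-8 sequence. Decoding is strict by default, so input that is not valid ISO 8859-6 should surface as an error rather than be converted silently.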
Fred Gey
Doug Oard
Last modified: Fri Jul 20 01:09:47 2001