#LuceneEval
A library and application built on top of Lucene. It provides convenient methods to load the CACM
collection and its related files, run its queries, and evaluate the results using trec_eval.
##Configuration
There is currently no external configuration; to change a setting, you need to modify the code.
The file LuceneEval.java holds the configuration variables.
The default configuration is:

- `DATAFILE = "data/cacm/cacm.all";`
- `CACM_XML = "data/results/cacm.all.xml";`
- `QUERYFILE = "data/cacm/query.text";`
- `STOPWORDLIST = "data/cacm/common_words";`
- `CACM_QRELS_FILE = "data/cacm/qrels.text";`
- `TREC_QRELS_FILE = "data/results/trec_qrels";`
- `TREC_SEARCHRESULTS_FILE = "data/results/trec_searchresults";`
- `TREC_RESULTS_FILE = "data/results/trec_results";`
- `RESULTS_LIMIT = 20;`
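
Judging by their names, `TREC_QRELS_FILE` and `TREC_SEARCHRESULTS_FILE` are the two plain-text inputs that trec_eval reads. The sketch below shows the line layout trec_eval expects for each. The class `TrecFormat`, its method names, and the run tag `luceneeval` are hypothetical and used only for illustration; LuceneEval's own writer code may look different.

```java
import java.util.Locale;

// Hypothetical helper, not part of LuceneEval: formats lines in the
// plain-text layouts that trec_eval reads.
public class TrecFormat {

    // Relevance judgment line: "<query id> <iteration> <doc id> <relevance>".
    // trec_eval ignores the iteration column, so 0 is used here.
    public static String qrelsLine(int queryId, String docId, int relevance) {
        return String.format(Locale.ROOT, "%d 0 %s %d", queryId, docId, relevance);
    }

    // Search result line: "<query id> Q0 <doc id> <rank> <score> <run tag>".
    public static String resultLine(int queryId, String docId, int rank,
                                    double score, String runTag) {
        return String.format(Locale.ROOT, "%d Q0 %s %d %.4f %s",
                queryId, docId, rank, score, runTag);
    }

    public static void main(String[] args) {
        // Illustrative values only, not real CACM judgments or scores.
        System.out.println(qrelsLine(1, "1410", 1));
        System.out.println(resultLine(1, "1410", 1, 7.25, "luceneeval"));
    }
}
```

With `RESULTS_LIMIT = 20`, a results file in this layout would presumably hold at most 20 such lines per query.
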
##Dependencies
- Lucene: provides a Java-based indexing and search implementation, as well as spellchecking, hit highlighting, and advanced analysis/tokenization capabilities
- SimpleXML: a high-performance XML serialization and configuration framework for Java
- trec_eval: the standard tool used for evaluating an ad hoc retrieval run, given the results file and a standard set of judged results
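
trec_eval is a command-line program invoked as `trec_eval <qrels file> <results file>`. As a minimal sketch of how it can be driven from Java, the example below builds that command and redirects its output to a file. It assumes that `trec_eval` is on the `PATH` and that `TREC_RESULTS_FILE` is where the evaluation output should go; the class `TrecEvalRunner` and its methods are hypothetical and do not describe how LuceneEval itself calls the tool.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical runner, not part of LuceneEval.
public class TrecEvalRunner {

    // trec_eval takes the judgments file first, then the results file.
    public static List<String> buildCommand(String qrelsFile, String resultsFile) {
        return Arrays.asList("trec_eval", qrelsFile, resultsFile);
    }

    // Runs trec_eval and writes everything it prints to outputFile.
    // Returns the exit code of the process.
    public static int run(String qrelsFile, String resultsFile, String outputFile)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(qrelsFile, resultsFile));
        pb.redirectErrorStream(true);            // merge stderr into stdout
        pb.redirectOutput(new File(outputFile)); // save the report to a file
        return pb.start().waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Paths taken from the default configuration.
        int exit = run("data/results/trec_qrels",
                       "data/results/trec_searchresults",
                       "data/results/trec_results");
        System.out.println("trec_eval exited with code " + exit);
    }
}
```
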
##License
LuceneEval by Ivan Kanakarakis is licensed under the GNU GPLv3.
See COPYING for details.