Java - Problem measuring precision and recall in Lucene
I need to calculate precision and recall values in Lucene, and I use this source code:
import java.io.*;
import org.apache.lucene.benchmark.quality.*;
import org.apache.lucene.benchmark.quality.trec.*;
import org.apache.lucene.benchmark.quality.utils.*;
import org.apache.lucene.search.*;
import org.apache.lucene.store.*;

public class PrecisionRecall {
  public static void main(String[] args) throws Throwable {
    File topicsFile = new File("c:/users/raden/documents/lucene/lucenehibernate/lia/lia2e/src/lia/benchmark/topics.txt");
    File qrelsFile = new File("c:/users/raden/documents/lucene/lucenehibernate/lia/lia2e/src/lia/benchmark/qrels.txt");
    Directory dir = FSDirectory.open(new File("c:/users/raden/documents/myindex"));
    Searcher searcher = new IndexSearcher(dir, true);

    String docNameField = "filename";
    PrintWriter logger = new PrintWriter(System.out, true);

    // #1: read the TREC topics as QualityQuery objects
    TrecTopicsReader qReader = new TrecTopicsReader();
    QualityQuery qqs[] = qReader.readQueries(
        new BufferedReader(new FileReader(topicsFile)));

    // #2: create a Judge from the TREC qrels file
    Judge judge = new TrecJudge(new BufferedReader(
        new FileReader(qrelsFile)));

    // #3: verify that the queries and the judgments match
    judge.validateData(qqs, logger);

    // #4: parser that turns each topic's title into a query on "contents"
    QualityQueryParser qqParser = new SimpleQQParser("title", "contents");

    QualityBenchmark qrun = new QualityBenchmark(qqs, qqParser, searcher, docNameField);
    SubmissionReport submitLog = null;

    // #5: run the benchmark
    QualityStats stats[] = qrun.execute(judge, submitLog, logger);

    // #6: average the per-query stats and print them
    QualityStats avg = QualityStats.average(stats);
    avg.log("summary", 2, logger, " ");
    dir.close();
  }
}
And here are the contents of topicsFile:
<top>
<num> number: 0
<title> apache source
<desc> description:
<narr> narrative:
</top>
And the contents of qrelsFile:
# format:
#
# qnum 0 doc-name is-relevant
#
#
0 0 apache1.0.txt 1
0 0 apache1.1.txt 1
0 0 apache2.0.txt 1
Now the problem: when I run the source code, the displayed values of precision and recall are zero. Here is the result of the run.
0 - contents:apache contents:source
0 stats:
  search seconds: 0.047
  docname seconds: 0.039
  num points: 56.000
  num good points: 0.000
  max good points: 3.000
  average precision: 0.000
  mrr: 0.000
  recall: 0.000
  precision @ 1: 0.000
  precision @ 2: 0.000
  precision @ 3: 0.000
  precision @ 4: 0.000
  precision @ 5: 0.000
  precision @ 6: 0.000
  precision @ 7: 0.000
  precision @ 8: 0.000
  precision @ 9: 0.000
  precision @ 10: 0.000
  precision @ 11: 0.000
  precision @ 12: 0.000
  precision @ 13: 0.000
  precision @ 14: 0.000
  precision @ 15: 0.000
  precision @ 16: 0.000
  precision @ 17: 0.000
  precision @ 18: 0.000
  precision @ 19: 0.000
  precision @ 20: 0.000
summary
  search seconds: 0.047
  docname seconds: 0.039
  num points: 56.000
  num good points: 0.000
  max good points: 3.000
  average precision: 0.000
  mrr: 0.000
  recall: 0.000
  precision @ 1: 0.000
  precision @ 2: 0.000
  precision @ 3: 0.000
  precision @ 4: 0.000
  precision @ 5: 0.000
  precision @ 6: 0.000
  precision @ 7: 0.000
  precision @ 8: 0.000
  precision @ 9: 0.000
  precision @ 10: 0.000
  precision @ 11: 0.000
  precision @ 12: 0.000
  precision @ 13: 0.000
  precision @ 14: 0.000
  precision @ 15: 0.000
  precision @ 16: 0.000
  precision @ 17: 0.000
  precision @ 18: 0.000
  precision @ 19: 0.000
  precision @ 20: 0.000
Can anyone tell me what I have done wrong to make the precision and recall values zero? And what does it mean when precision and recall are zero? The reason I am doing this is that I need to measure the performance of my search engine, and precision and recall are one of the ways for me to achieve that.
Thanks.
Precision = 0 means that none of the results were correct. See the Wikipedia article, for example.
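To make that concrete, here is a minimal sketch of how the two numbers come out of a list of retrieved document names and the set of names judged relevant in the qrels file. It does not use Lucene at all; the class name `PrecisionRecallSketch`, its methods, and the `c:/docs/...` paths are hypothetical and only for illustration. In the posted log the search returned 56 results, none of them counted as good, and 3 were expected, which is the same situation as the first case below.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PrecisionRecallSketch {

    // Counts retrieved names that appear in the judged-relevant set.
    static int goodPoints(List<String> retrieved, Set<String> relevant) {
        int good = 0;
        for (String name : retrieved) {
            if (relevant.contains(name)) {
                good++;
            }
        }
        return good;
    }

    // precision = good results / number of retrieved results
    static double precision(List<String> retrieved, Set<String> relevant) {
        if (retrieved.isEmpty()) {
            return 0.0;
        }
        return (double) goodPoints(retrieved, relevant) / retrieved.size();
    }

    // recall = good results / number of relevant documents
    static double recall(List<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) goodPoints(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // The three names judged relevant in the posted qrels file.
        Set<String> relevant = new HashSet<>(Arrays.asList(
                "apache1.0.txt", "apache1.1.txt", "apache2.0.txt"));

        // Hypothetical case: the index returns full paths, so nothing matches.
        List<String> asPaths = Arrays.asList(
                "c:/docs/apache1.0.txt", "c:/docs/apache1.1.txt");
        System.out.println(precision(asPaths, relevant));
        System.out.println(recall(asPaths, relevant));

        // Hypothetical case: the index returns bare file names, so they match.
        List<String> asNames = Arrays.asList("apache1.0.txt", "apache1.1.txt");
        System.out.println(precision(asNames, relevant));
        System.out.println(recall(asNames, relevant));
    }
}
```

Whether a path mismatch is really what happens in your index is only a guess; the point is that a single string difference between the returned names and the qrels names is enough to drive both values to zero.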
I suggest trying an individual query and seeing what the results are. You may have an issue with your tokenizer; maybe it is not casing things right, etc.
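One check that might help when you look at the individual results: compare the value stored in your document-name field ("filename" in your code) against the names in the qrels file. As far as I can tell the judge looks names up as exact strings, so a difference in case or a leading directory would be enough to make every hit count as not relevant. The sketch below is plain Java, not Lucene; `DocNameCheck`, `normalize`, and `compare` are hypothetical helpers, and the sample values in `main` are made up.

```java
import java.util.Locale;

class DocNameCheck {

    // Reduces a stored name to a bare lowercase file name:
    // trims whitespace, drops any directory prefix, lowercases the rest.
    static String normalize(String stored) {
        String s = stored.trim().replace('\\', '/');
        int slash = s.lastIndexOf('/');
        if (slash >= 0) {
            s = s.substring(slash + 1);
        }
        return s.toLowerCase(Locale.ROOT);
    }

    // Describes how a stored field value relates to a name from the qrels file.
    static String compare(String stored, String judged) {
        if (stored.equals(judged)) {
            return "exact match";
        }
        if (normalize(stored).equals(normalize(judged))) {
            return "matches only after normalization";
        }
        return "no match";
    }

    public static void main(String[] args) {
        // Made-up stored values, compared against one name from the qrels file.
        String judged = "apache1.0.txt";
        String[] storedValues = {
            "apache1.0.txt", "C:/docs/Apache1.0.txt", "readme.txt"
        };
        for (String stored : storedValues) {
            System.out.println(stored + " -> " + compare(stored, judged));
        }
    }
}
```

If your real field values come back as "matches only after normalization", then either the qrels names or the indexed field value would need to change so that the two agree exactly.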