php - Zend_Lucene and wildcard operator weirdness


A quick summary of my problem: the wildcard operator doesn't seem to return the results I am expecting. I am testing against a keyword field.

Here is a sample showing the issue:

    include 'Zend/Loader/Autoloader.php';
    $autoloader = Zend_Loader_Autoloader::getInstance();
    $autoloader->setFallbackAutoloader(true);

    // Letters-only, case-insensitive UTF-8 analyzer as the default
    Zend_Search_Lucene_Analysis_Analyzer::setDefault(
        new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8_CaseInsensitive());

    @mkdir('/tmp/test-lucene');
    $index = Zend_Search_Lucene::create('/tmp/test-lucene');

    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::keyword('path', 'root/1/2/3'));
    $doc->addField(Zend_Search_Lucene_Field::unStored('contents', 'the lazy fox jump on dog bla bla bla'));
    $index->addDocument($doc);

    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::keyword('path', 'root/1'));
    $doc->addField(Zend_Search_Lucene_Field::unStored('contents', 'the lazy fox jump on dog bla bla bla'));
    $index->addDocument($doc);

    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::keyword('path', 'root/3/2/1'));
    $doc->addField(Zend_Search_Lucene_Field::unStored('contents', 'the lazy fox jump on dog bla bla bla'));
    $index->addDocument($doc);

    $doc = new Zend_Search_Lucene_Document();
    $doc->addField(Zend_Search_Lucene_Field::keyword('path', 'root/3/2/2'));
    $doc->addField(Zend_Search_Lucene_Field::unStored('contents', 'the lazy fox jump on dog bla bla bla'));
    $index->addDocument($doc);

    // Expected to match only root/3/2/1 and root/3/2/2
    $hits = $index->find('path:root/3/2*');
    foreach ($hits as $hit) {
        $doc = $hit->getDocument();
        echo $doc->getFieldValue('path') . PHP_EOL;
    }

This returns the whole set of documents instead of only the last two, which is what I expected.

Output:

    root/1/2/3
    root/1
    root/3/2/1
    root/3/2/2

So here is my question: why does Lucene (Zend_Lucene in this case) match the first two documents? I thought keyword fields were not tokenized.

PS: You might want to know why I am running this test. I have an ecommerce website with a database whose category table has a path field. For example, a category might have the path '/1/2/3', which means it is the category with id 3, its parent is the category with id 2, and so on.

The problem is that when a user runs a full-text search and specifies a category, I ideally want to return results from that category and from its child categories, so I need a Lucene way of doing path LIKE '/1/2%'.
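For reference, the behaviour wanted from Lucene here is a plain prefix match on the stored path, the same thing SQL's LIKE '/1/2%' does. A minimal sketch of that semantics in plain PHP, without Zend_Search_Lucene; the function name pathsWithPrefix() is made up for illustration:

```php
<?php
/**
 * Return the paths that start with $prefix, like SQL's LIKE '<prefix>%'.
 * pathsWithPrefix() is a hypothetical helper, not part of Zend_Search_Lucene.
 */
function pathsWithPrefix(array $paths, string $prefix): array
{
    $matches = array_filter($paths, function (string $path) use ($prefix): bool {
        // Compare only the first strlen($prefix) bytes of the path
        return strncmp($path, $prefix, strlen($prefix)) === 0;
    });

    return array_values($matches); // reindex from 0
}

$paths = array('root/1/2/3', 'root/1', 'root/3/2/1', 'root/3/2/2');

// Prints the two paths under root/3/2
print_r(pathsWithPrefix($paths, 'root/3/2'));
```

Note that a bare prefix such as 'root/3/2' would also match a sibling like 'root/3/20'. Ending both the stored paths and the prefix with a slash is one way to avoid that.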

One other possibility is to merge the results of an SQL query with the Lucene hits, but I would like to avoid that if possible because it performs poorly.

If you have any ideas, they are welcome.

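Use Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num_CaseInsensitive, and replace the slashes with a character that does not occur in your paths and that Zend_Search_Lucene treats as a word character. I used the German ß. The likely reason the original query matched everything is that the query string is still passed through the default analyzer, which splits it at the slashes, even though the keyword field itself is stored untokenized.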

    include 'Zend/Loader/Autoloader.php';
    $autoloader = Zend_Loader_Autoloader::getInstance();
    $autoloader->setFallbackAutoloader(true);

    // Analyzer that keeps digits as well as letters
    Zend_Search_Lucene_Analysis_Analyzer::setDefault(
        new Zend_Search_Lucene_Analysis_Analyzer_Common_Utf8Num_CaseInsensitive());

    @mkdir('/tmp/test-lucene');
    $index = Zend_Search_Lucene::create('/tmp/test-lucene');

    foreach (array('root/1/2/3', 'root/1', 'root/3/2/1', 'root/3/2/2') as $path) {
        // Store the path with every slash replaced by ß
        $path = str_replace('/', 'ß', $path);
        $doc = new Zend_Search_Lucene_Document();
        $doc->addField(Zend_Search_Lucene_Field::keyword('path', $path));
        $index->addDocument($doc);
    }

    // Apply the same replacement to the query, then undo it for display
    $hits = $index->find(str_replace('/', 'ß', 'path:root/3/2*'));
    foreach ($hits as $hit) {
        echo str_replace('ß', '/', $hit->getDocument()->getFieldValue('path')) . PHP_EOL;
    }
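To see why the substitution helps, the sketch below approximates the two analyzers with regular expressions. It rests on an assumption: that the Utf8 analyzer keeps only runs of letters, while the Utf8Num analyzer keeps runs of letters and digits. The tokenize() and encodePath() helpers and both patterns are stand-ins for illustration, not the library's own code.

```php
<?php
// Assumed approximations of the analyzers' token rules (not library code).
const LETTERS_ONLY       = '/\p{L}+/u';
const LETTERS_AND_DIGITS = '/[\p{L}\p{N}]+/u';

/** Hypothetical helper: split $text into the tokens matched by $pattern. */
function tokenize(string $pattern, string $text): array
{
    preg_match_all($pattern, $text, $found);

    return $found[0];
}

/** Hypothetical helper: replace slashes with ß, as in the answer above. */
function encodePath(string $path): string
{
    return str_replace('/', 'ß', $path);
}

// Letters only: the slashes and the digits are both discarded
print_r(tokenize(LETTERS_ONLY, 'root/3/2'));

// Letters and digits: the digits survive, but the path is split at each slash
print_r(tokenize(LETTERS_AND_DIGITS, 'root/3/2'));

// With ß in place of the slashes, the whole path stays a single token
print_r(tokenize(LETTERS_AND_DIGITS, encodePath('root/3/2')));
```

If that assumption holds, the first case suggests why the original query behaved like a search for everything under root: only the word root was left of the search term.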
