http://www.let.rug.nl/~gosse/Imix/clin04_tiedemann.pdf is a poster comparing several Information Retrieval (IR) engines, among them Lucene and Xapian.
This analysis could help in choosing a search engine for the Z3 ECM.
Posted at 07:59 PM | Permalink | Comments (0)
I attended Michael Salib's talk about Xapian, "Stupidity and laser cat toys: Indexing the US Patent Database with Xapian and Twisted".
Xapian is a probabilistic text search engine.
Michael used it to index the US Patent Database, which is pretty big indeed. He wrote a Python wrapper called Xapwrap, which you can get here: http://divmod.org/projects/xapwrap
Michael explained that Xapian was preferred to Lucene because it is easier to wrap in Python and provided faster queries and better precision.
I'm waiting for Michael to upload the slides to the EP site so I can give more precise feedback on this.
More info on PyLucene here: http://www.sauria.com/~twl/conferences/pycon2005/20050325/Pulling Java Lucene into Python.html (PyCon05 notes)
Feature-wise, Xapian has everything needed to run a scalable text engine (stemming based on Snowball, meta-indexes, etc.). It optionally uses Twisted's python.log for logging.
I have the feeling that Xapian would fit pretty well as an external indexer for Z3.
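To give a rough idea of what "probabilistic" ranking means in practice, here is a toy sketch in plain Python of a BM25-style score, the family of weighting schemes Xapian is built around. This is not Xapian's code nor the Xapwrap API: the function name `bm25_scores`, the parameter values `k1` and `b`, and the three sample documents are all made up for the example, and there is no stemming here.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # Rare terms get a higher weight (inverse document frequency).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates and is normalised by document length.
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical mini collection, invented for this example.
docs = [
    "laser pointer used as a cat toy",
    "method for indexing a patent database",
    "cat food dispenser",
]
scores = bm25_scores("laser cat", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

The document that matches both query terms should come out on top, and a document that matches none of them gets a score of zero. A real engine adds stemming, an on-disk index and a query parser on top of this kind of scoring.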
(Post originally written by Tarek Ziadé on the old Nuxeo blogs.)
Posted at 12:07 PM | Permalink | Comments (0)
(Post originally written by Julien Anguenot on the old Nuxeo blogs.)