Seen at EuroPython: Xapian text search engine

I attended Michael Salib's talk about Xapian, "Stupidity and laser cat toys: Indexing the US Patent Database with Xapian and Twisted".

Xapian is a probabilistic text search engine. Michael used it to index the US Patent Database, which is pretty big indeed.

He wrote a Python wrapper called Xapwrap, which you can get here: http://divmod.org/projects/xapwrap

Michael explained that Xapian was preferred to Lucene because it is easier to wrap in Python and provided faster queries and better precision. I'm waiting for Michael to upload the slides to the EP site before giving more precise feedback on this. More info on PyLucene here: http://www.sauria.com/~twl/conferences/pycon2005/20050325/Pulling Java Lucene into Python.html (PyCon05 notes)

Feature-wise, Xapian has everything needed to run a scalable text engine (stemming based on Snowball, meta-indexes, etc.). It optionally uses Twisted's python.log for logging.
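To give a rough idea of what "probabilistic" ranking means in a text search engine, here is a minimal pure-Python sketch of an inverted index scored with BM25, a common probabilistic weighting scheme. This is not Xapian's or Xapwrap's API, and the talk did not say which weighting was used: the names `TinyIndex` and `tokenize`, the sample documents, and the parameter values `k1` and `b` are all assumptions made for illustration. A real engine would also stem terms (for example with Snowball) before indexing.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (hypothetical helper, no stemming)."""
    return text.lower().split()


class TinyIndex:
    """Toy inverted index with BM25 ranking. Illustration only, not Xapian's API."""

    def __init__(self, k1=1.2, b=0.75):
        self.k1 = k1                       # term-frequency saturation (assumed value)
        self.b = b                         # length normalisation (assumed value)
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_len = {}                  # doc_id -> number of tokens

    def add(self, doc_id, text):
        tokens = tokenize(text)
        self.doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            self.postings[term][doc_id] = tf

    def search(self, query):
        """Return (doc_id, score) pairs, best match first."""
        n_docs = len(self.doc_len)
        if n_docs == 0:
            return []
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            if not docs:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - len(docs) + 0.5) / (len(docs) + 0.5))
            for doc_id, tf in docs.items():
                # Longer-than-average documents are penalised through `norm`.
                norm = self.k1 * (1 - self.b + self.b * self.doc_len[doc_id] / avg_len)
                scores[doc_id] += idf * tf * (self.k1 + 1) / (tf + norm)
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = TinyIndex()
index.add("p1", "laser pointer used as a cat toy")
index.add("p2", "method for indexing patent text")
index.add("p3", "cat toy with a laser and a laser mount")
results = index.search("laser cat")
```

Documents that never mention a query term are not scored at all, and the rest are ordered by how strongly the query terms stand out in them.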
I have the feeling that Xapian would fit pretty well as an external indexer for z3.
About this blog: Tarek Ziadé
All content is copyrighted by their author. CPSSkins is Copyright © 2003-2006 by Jean-Marc Orliaguet. | CPS is Copyright © 2002-2006 by Nuxeo SAS. |