ZCatalog for standalone ZODB
by Kevin Dangoor
I have just packaged up ZCatalog from Zope 3.1 for use with the standalone ZODB. I’m surprised that no one had released this previously, because all but the most trivial of ZODB apps will need some way to search on something other than the primary key.
Previously available solutions that I spotted are: [IndexedCatalog](http://www.async.com.br/projects/IndexedCatalog/) and [these instructions for getting Zope 2.6's ZCatalog working](http://slarty.polito.it:8069/~sciasbat/wiki/moin.cgi/StandaloneZodbHowto).
The major feature that ZCatalog has over IndexedCatalog is a full-text search index.
I had originally packaged up the Zope 2.8 catalog. That was generally working, but I wasn’t completely comfortable with it because the code was fairly “tangled up”. The Zope bits were spread throughout the code. Zope 3 has a beautiful architecture, and extracting the catalog from there was far easier.
Note, however, that this means that current ZCatalog plugins ([Dieter Maurer's AdvancedQuery](http://www.dieter.handshake.de/pyprojects/zope/#AdvancedQuery), or [TextIndexNG](http://www.zopyx.com/OpenSource/TextIndexNG), for example) won’t work directly. Hopefully, it will not be difficult to get these sorts of things running. Generally speaking, that is left as an exercise to the reader (patches accepted).
I’m calling this release 1.0 alpha 0, because the testing that I’ve done is approximately as extensive as what you see in the readme file. Which is to say that it passes the most basic of sanity checks, but I haven’t looked much beyond that. I’ll be exercising it a lot more this week, so I’ll probably have a bit more confidence later.
When it comes to developer tools, though, release early, release often is a good motto.
Did you actually check out IndexedCatalog?
http://www.async.com.br/projects/IndexedCatalog/
I mentioned IndexedCatalog (with the same link) in my post… Something that did just occur to me, however, is that it may be possible to rip the full text index out of Zope and use that with IndexedCatalog. IndexedCatalog has more features for querying, etc. than the Zope 3.1 catalog, but the fulltext index is a big draw. If Zope’s index can work with IndexedCatalog, then it’s the best of both.
I realize I’m blind. Indeed, we never needed FTI, but patches would be very welcome — I don’t actually think it should be so much trouble to add if you could get your Index class to conform to the general Index API.
I just took a closer look at IndexedCatalog (not an exhaustive look, by any stretch of the imagination). Though they serve similar functions, IndexedCatalog and ZCatalog (particularly Zope 3.1’s catalog) are very different in how they approach the problem. The mechanisms for specifying indexes and querying are very different.
Since I have ZCatalog running already and it looks like it will meet my needs for now, I’m going to head down that path rather than working to move Zope’s TextIndex into IndexedCatalog.
I like IndexedCatalog; it adds great features to ZODB minus the aforementioned hassle of extracting Zope 2’s catalog. Zope 3’s catalog packaged without Zope 3 is really very similar to IndexedCatalog, but interface-driven rather than driven at the class/attribute level. Something I’ve been working on recently is the Zope 3 content management project over at z3lab (http://www.z3lab.org/).
One of those projects is adding a flexible search API that can search multiple, pluggable sources with pluggable agents (http://comments.gmane.org/gmane.comp.web.zope.zope3.ecm.general/80). IndexedCatalog could be used as a search source and work in a unified framework with other sources like the Zope 3.1 catalog and other “searchable” sources as defined in the framework. Any comments you guys have on this would be welcome.
Hi, Michel
I couldn’t get to your actual proposal without registering, and I don’t have the cycles at the moment to hop through that to get to it. I did read your announcement, though.
I don’t like RDF-the-syntax, so I hope when you say that things are specified in RDF you mean RDF-the-model with some reasonable API. These days, people are pulling information from so many sources that a unified API does make good sense.
Sounds like a good idea to me, though the devil is always in the details.
Congrats on the new book, by the way!
I couldn’t find the download link.
IndexedCatalog didn’t support new-style classes last time I tried it.
I should put a link somewhere on the site for ZCatalog. I should also put up a new version soon, because I found a typo in one of my classes.
The file itself is an attachment at the bottom of the ZCatalog page:
http://www.blazingthings.com/zcatalog
Here’s the direct link:
http://www.blazingthings.com/files/ZCatalog-1.0a0.tar.gz
Not supporting new-style classes would be a drag. I have a feeling that it has nothing to do with IndexedCatalog, though… it’s probably more to do with the version of ZODB in use. Old versions of Persistent didn’t support new-style classes. The current ones most definitely do.
IIRC Johan has a branch that does support new-style classes somewhere, I’ve pinged him for it.
I get a “Page not found” with that link. I don’t see the attachment, either. Is there a magic handshake or something?
I can understand if you’re not quite ready to release code.
Nope, I was definitely ready to release code. Apparently, the attachments don’t show up if you’re not logged in. I wasn’t aware of that, and I’ll see about fixing that now.
The direct link I gave didn’t work because I updated to alpha 1, having fixed a bug along the way. Here’s the current link:
http://www.blazingthings.com/files/ZCatalog-1.0a1.tar.gz
And, like I said, the attachment should start showing up soon.
I had been looking for a text indexing library for python, and this seems great. The one thing I can’t figure out is how to actually save the catalog to disk. How do I specify where to save it? I’ve gone through your source code and don’t see where it connects to anything that asks for a filename. Thanks
The catalog itself is stored in the ZODB. As mentioned on the page for ZCatalog, you need to have ZODB 3.4.
http://www.zope.org/Products/ZODB3.4
That’s where you’ll find the storage location.
I followed the instructions and installed ZODB and ZopeInterface before using ZCatalog. I understand that the data gets stored into a ZODB instance, but ZODB itself must be stored on disk in order for it to be persistent. I’m just not sure how to specify the file to use. Thanks
I’d recommend that you read A.M. Kuchling’s ZODB Programming Guide:
http://www.zope.org/Wikis/ZODB/FrontPage/guide/index.html
You might also want to check out this article:
http://www.blueskyonmars.com/2005/06/18/zodb-vs-pysqlite-with-sqlobject/
for the caveats when using the ZODB (specifically, the packing requirement). It may be fine for you, but you should be aware of it.
The short answer to your question is something like this:
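from ZODB.FileStorage import FileStorage
from ZODB.DB import DB
# filename is the path of the database file to create or open
fs = FileStorage(filename)
db = DB(fs)
conn = db.open()  # open() is called on the DB instance, not the class
root = conn.root()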
The root is a persistent dictionary. You can create a ZCatalog and store it in that dictionary.