I was reading this article about linkfluence moving from CouchDB to Riak:

"Why we move away from CouchDB

We were already aware of Riak before we started using CouchDB, but we weren’t sure about trusting a new product at that point, so we decided, after some benchmarking, to go with CouchDB.

After the first couple of months, it was obvious that this was a bad choice.

Our main problems with CouchDB are scalability, versioning and stability.

Once we store a document in CouchDB, we modify it at least twice after the original write. Each modification generates a new version of the document. This feature is nice for some use cases, but we don’t need it, and there’s no way to disable it, so the size of our databases started to grow really large. You’ll probably tell me “hey, you know you can compact your database?”, and I’ll answer “sure”. The trouble is that we never managed to get it to compact an entire database without crashing (well, to be honest, with the last version of CouchDB we finally managed to compact one database).

The second issue is that one database == one file. When you have multiple small databases, this is fine. When you have only a few databases, and some grow to more than 1TB, the problems keep growing too (and it’s a real pain to back up).

We also had a lot of random crashes with CouchDB, even though the latest version was quite stable."
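As an aside, the compaction they mention is a per-database call against CouchDB's HTTP API. A minimal sketch in Python, assuming a local server on the default port and a hypothetical database called mydb (requests is just one way to make the call):

```python
import requests

# Hypothetical database name on a local CouchDB server; _compact queues a
# compaction run for that single database.
resp = requests.post(
    "http://localhost:5984/mydb/_compact",
    headers={"Content-Type": "application/json"},
)
print(resp.status_code)  # 202 means the compaction request was accepted
```

Whether that run finishes without crashing on a database that has grown past 1TB is, as they found, another matter.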

Does that sound familiar, fellow Zope developer? I know a lot about ZODB but little about CouchDB. One thing a lot of people don't know about ZODB is that it's very fast, and I think the same is true of CouchDB. Speed isn't the same as the raw speed of inserts/queries, though; once you add concurrency to the mix, the story gets a lot more complex.

These are the exact same perspectives I've always had on ZODB:

1) It's really convenient and powerful

2) It being a single HUGE file makes it hard to scale

3) Versioning can be nifty, but it's often not needed and causes headaches with packing (see the packing sketch after this list)

4) It works great but when it cracks it cracks hard and cryptically
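For what it's worth, the packing mentioned in point 3 is a one-liner on the DB object. A minimal sketch, assuming a FileStorage-backed Data.fs and an arbitrary seven-day retention window:

```python
from ZODB.FileStorage import FileStorage
from ZODB import DB

# Hypothetical file name and retention period.
storage = FileStorage("Data.fs")
db = DB(storage)
db.pack(days=7)  # drop object revisions older than a week
db.close()
```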

Comments

betabug

1.) It is :-)

2.) Not in my experience, but it depends of course on what "scale" means to you. A 9Gig Data.fs works fine for me; no idea if 1TB would be usable. The 9Gig is also without blob support; once that goes into the app, it will be a lot smaller.

3.) Versioning (aka "built-in undo") has saved my sweet butt a couple of times. Packing is really painless, never ever failed for me in many years. My incremental backup scripts include a packing / full backup cycle every few months.

4.) Never ever cracked for me. In my experience ZODB is a tough sucker. Many years of work on ZODB probably make a big difference, compared to the newcomers, in the robustness department.

Don't forget that on ZODB, the case stated above ("Once we store a document in CouchDB, we modify it at least twice after the original write") would result in only one write if it was in the same transaction. It's not quite clear from the description if they're talking about multiple transactions there.

Proper object design also means that not every edit writes a huge object to the DB; ideally only a small subobject is rewritten.
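To illustrate that point, here's a minimal sketch, assuming a FileStorage-backed Data.fs and a hypothetical Document class: the original write plus two later modifications, all inside one transaction, land on disk as a single new revision.

```python
import transaction
from persistent import Persistent
from ZODB import DB
from ZODB.FileStorage import FileStorage

class Document(Persistent):
    """Hypothetical document class, purely for illustration."""

db = DB(FileStorage("Data.fs"))   # hypothetical storage file
conn = db.open()
root = conn.root()

root["doc"] = Document()          # the original write...
root["doc"].title = "revised"     # ...plus two later modifications...
root["doc"].status = "published"
transaction.commit()              # ...are committed as one new revision, not three
conn.close()
db.close()
```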

Peter Bengtsson

All valid points. The interesting thing for me to take away from this is that I can "apply" my experience of ZODB to CouchDB. In a sense.

I wonder if CouchDB has the same delicate challenges with conflict errors as ZODB has.
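On the ZODB side, the usual answer to those conflict errors is to abort and retry the transaction. A minimal sketch, with a hypothetical commit_with_retry helper (the names are mine, not a ZODB API):

```python
import transaction
from ZODB.POSException import ConflictError

def commit_with_retry(update, attempts=3):
    """Hypothetical helper: run `update` and commit, retrying on write conflicts."""
    for _ in range(attempts):
        try:
            update()
            transaction.commit()
            return
        except ConflictError:
            transaction.abort()  # throw away the doomed state and try again
    raise RuntimeError("giving up after repeated ConflictErrors")
```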

