Speed test between django_mongokit and postgresql_psycopg2

http://github.com/peterbe/django-mongokit

9th of March 2010

Following on from yesterday's blog post, How and why to use django-mongokit, I extended the exampleproject inside the django-mongokit project with another app called exampleapp_sql, which does the same thing as exampleapp but with SQL instead. Then I added a very simple benchmarker app to the same project and wrote three functions:

  1. One to create 10/100/500/1000 instances of my class
  2. One to edit one field of all 10/100/500/1000 instances
  3. One to delete each of the 10/100/500/1000 instances
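
As a rough illustration of what those three functions measure, here is a minimal, self-contained sketch. It is not the benchmarker app from the project: `InMemoryTalks` and `run_benchmark` are hypothetical names, and the dict-backed store is only a stand-in for the MongoKit and Django ORM models so that the sketch runs without any database.

```python
import time


class InMemoryTalks:
    """Hypothetical stand-in backend (a plain dict), NOT MongoKit or the Django ORM."""

    def __init__(self):
        self._rows = {}
        self._next_id = 1

    def create(self, topic):
        pk = self._next_id
        self._next_id += 1
        self._rows[pk] = {"topic": topic}
        return pk

    def edit(self, pk, topic):
        # Change a single field of one existing record.
        self._rows[pk]["topic"] = topic

    def delete(self, pk):
        del self._rows[pk]

    def count(self):
        return len(self._rows)


def run_benchmark(store, how_many):
    """Time create, edit and delete for `how_many` records, one record at a time."""
    timings = {}

    start = time.perf_counter()
    ids = [store.create("Talk %d" % i) for i in range(how_many)]
    timings["create"] = time.perf_counter() - start

    start = time.perf_counter()
    for pk in ids:
        store.edit(pk, "Edited talk %d" % pk)
    timings["edit"] = time.perf_counter() - start

    start = time.perf_counter()
    for pk in ids:
        store.delete(pk)
    timings["delete"] = time.perf_counter() - start

    # Sum of the three phases measured above.
    timings["total"] = sum(timings.values())
    return timings


if __name__ == "__main__":
    for how_many in (10, 100, 500, 1000):
        result = run_benchmark(InMemoryTalks(), how_many)
        print("# %d" % how_many)
        print("Creating %d talks took %s seconds" % (how_many, result["create"]))
        print("Editing %d talks took %s seconds" % (how_many, result["edit"]))
        print("Deleting %d talks took %s seconds" % (how_many, result["delete"]))
        print("IN TOTAL %s seconds" % result["total"])
```

To compare real backends, the same `run_benchmark` loop would be pointed at two store objects that wrap the respective models. Each record is touched individually, so per-call overhead dominates the timings.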

The results can speak for themselves:

 # 10
 mongokit django_mongokit.mongodb
 Creating 10 talks took 0.0108649730682 seconds
 Editing 10 talks took 0.0238521099091 seconds
 Deleting 10 talks took 0.0241661071777 seconds
 IN TOTAL 0.058883190155 seconds

 sql django.db.backends.postgresql_psycopg2
 Creating 10 talks took 0.0994439125061 seconds
 Editing 10 talks took 0.088721036911 seconds
 Deleting 10 talks took 0.0888710021973 seconds
 IN TOTAL 0.277035951614 seconds

 # 100
 mongokit django_mongokit.mongodb
 Creating 100 talks took 0.114995002747 seconds
 Editing 100 talks took 0.181537866592 seconds
 Deleting 100 talks took 0.13414812088 seconds
 IN TOTAL 0.430680990219 seconds

 sql django.db.backends.postgresql_psycopg2
 Creating 100 talks took 0.856637954712 seconds
 Editing 100 talks took 1.16229200363 seconds
 Deleting 100 talks took 0.879518032074 seconds
 IN TOTAL 2.89844799042 seconds

 # 500
 mongokit django_mongokit.mongodb
 Creating 500 talks took 0.505300998688 seconds
 Editing 500 talks took 0.809900999069 seconds
 Deleting 500 talks took 0.65673494339 seconds
 IN TOTAL 1.97193694115 seconds

 sql django.db.backends.postgresql_psycopg2
 Creating 500 talks took 4.4399368763 seconds
 Editing 500 talks took 5.72280597687 seconds
 Deleting 500 talks took 4.34039878845 seconds
 IN TOTAL 14.5031416416 seconds

 # 1000
 mongokit django_mongokit.mongodb
 Creating 1000 talks took 0.957674026489 seconds
 Editing 1000 talks took 1.60552191734 seconds
 Deleting 1000 talks took 1.28869891167 seconds
 IN TOTAL 3.8518948555 seconds

 sql django.db.backends.postgresql_psycopg2
 Creating 1000 talks took 8.57405209541 seconds
 Editing 1000 talks took 14.8357069492 seconds
 Deleting 1000 talks took 11.9729249477 seconds
 IN TOTAL 35.3826839924 seconds

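Averaged over the four batch sizes, MongoDB is about 7 times faster, and the gap widens from roughly 4.7x at 10 talks to roughly 9.2x at 1000.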
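
That figure can be rechecked from the IN TOTAL lines above. The short sketch below copies those totals and averages the per-batch ratios. The names `totals` and `speedups` are mine, and taking the mean of the four ratios is an assumption about how the average was computed.

```python
# IN TOTAL seconds copied from the results above: (mongokit, sql) per batch size.
totals = {
    10: (0.058883190155, 0.277035951614),
    100: (0.430680990219, 2.89844799042),
    500: (1.97193694115, 14.5031416416),
    1000: (3.8518948555, 35.3826839924),
}


def speedups(totals):
    """Return how many times slower the SQL run was than the MongoKit run, per batch size."""
    return {n: sql / mongo for n, (mongo, sql) in totals.items()}


ratios = speedups(totals)
# Assumption: "on average" means the plain mean of the four per-batch ratios.
average = sum(ratios.values()) / len(ratios)

for n, ratio in sorted(ratios.items()):
    print("%4d talks: %.1fx" % (n, ratio))
print("average: %.1fx" % average)
```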

All in all, it doesn't really mean that much. We expect MongoDB to be faster than PostgreSQL because what it lacks in features it makes up for in speed. It's interesting to see it in action, and nice to see that MongoKit is fast enough to benefit from the database's speed.

As always with benchmarks: lies, lies and more damn lies! This doesn't really compare apples with apples, but hopefully django-mongokit makes the comparison fairer. Also, you're free to fork the project on GitHub, make your own optimizations and re-run the tests yourself.



Comment

Henrique Carvalho Alves - 9th March 2010
Can you compare the timing for making queries?

Single table/document queries, and then joined tables/simulated joins. I wonder how the benefits from I/O surpass the query times, and vice-versa.

Peter Bengtsson - 9th March 2010
I could. Perhaps when time allows. Feel free to fork it on github if you feel like you want to test something that is more relevant to your real-life projects.

Massimiliano Torromeo - 9th March 2010
What about SELECTs?

Peter Bengtsson - 11th March 2010
Both when it retrieves for editing and when it retrieves for deleting, it selects by key.

Michael Pasternak - 9th March 2010
I'm following your posts carefully :-)

It seems that "good old SQL" is often slower in those "create 1000 objects" benchmarks. MySQL has some benchmarks that make PostgreSQL look like a slow database. SQLite is also very fast when compared to MySQL. BerkeleyDB may be even faster ;) Unfortunately, some OSS projects I know stopped using it at some point. But then again, creating as many objects as possible quickly may really be your app model, so...

How about other benchmarks, like "update every record that matches 3 - 4 foreign keys AND an IN() range query"?

What about data reliability?

The default PostgreSQL installation is also far from perfect. If speed is your goal, you can tune PostgreSQL not to use the disk that often: disable fsync, set a large COMMIT delay and so on. These settings make the database unreliable in case of power loss, but for me (I run Django tests on a local machine) they make testing way faster.

Just my $0.05. For some time in my life, I neglected SQL databases. This didn't turn out to be as good as I thought.

Peter Bengtsson - 11th March 2010
I'm sure there are parameters to make it faster but there are parameters to make MongoDB faster too.

Truth is, when speed matters it's probably because your project matters. And if your project matters your data matters and then you'll need to take reliability and durability to a whole new level and you'd have to start over with the benchmarks.

Alex - 9th March 2010
You're comparing 1000 transactions vs. 1000 inserts to a database that does not guarantee durability.

Try running the tests when you wrap the inserts into *one* transaction.

Peter Bengtsson - 11th March 2010
Lack of transactions is definitely a key pain point. It changes your code as you need to do fewer things per function when you can't rely on rollback to save you.

But do note that the view in the benchmark does not use transactions. So I *am* comparing 1000 inserts vs. 1000 inserts.

cybernd - 10th May 2010
> So I *am* comparing 1000 inserts vs. 1000 inserts.

No, you don't.

PostgreSQL will add an implicit transaction if no one has opened an explicit one.

In your special case you are using Django without transactions in your code. Django's documentation states that by default it will wrap every call in one transaction.

So basically you are dealing with 1000 explicit transactions and 1000 inserts on PG's side.

Sorry to say it, but your benchmark results are misleading.

Also note that this is a synthetic micro benchmark. In the real world there would be many queries within one transaction, and also bulk queries to get rid of the client<>db roundtrip.

Peter Bengtsson - 11th May 2010
Perhaps you're right. Perhaps PostgreSQL is faster than the benchmark makes it out to be. But this is how Django+PostgreSQL works together.

Also, like I said before, both "sides" of this benchmark can be optimized further.

Eas - 10th March 2010
Michael, there are other postgres tuning parameters that help with throughput with concurrent updates without sacrificing durability.

Basically, you tell postgres to wait to see if any other transactions complete within a given time window so it can log them all at once. The cost is some potential latency as some transactions may not return as quickly. Look at commit_delay and commit_siblings. You may also need to tweak wal_buffers.

yashh - 27th May 2010
Good benchmarks, Peter. I tried to insert a Django MySQL model with 24018 entries into MongoDB. It took 2.037 secs to get all entries from MySQL (InnoDB, all indexes fit well into RAM) and 17.xx seconds to insert all 24K entries. It's pretty fast.

Peter Bengtsson - 29th May 2010
Fast indeed. But 24k objects isn't really a lot. What would be interesting is to know the speed when there are already 24m inserted.

I hear that Tokyo Cabinet is fast to write but dog slow to *index* when the total number goes up.

I wonder how fast MongoDB would be at inserting 24k entries when you already have 1m, 10m or 100m entries in there.