
on Zope, multiple cores, and the GIL

by David Glick posted Jan 17, 2010 04:12 PM
I recently installed HAProxy as a load balancer for a site that had previously run on a single Zope instance with 4 threads. I switched to 2 instances with 2 threads each, load-balanced by HAProxy. I wasn't expecting this change to have a noticeable effect on the site's performance, so I was happily surprised when the client mentioned that users of the site were commenting on the improved speed.

But why did the site get faster?

Looking at a munin graph of server activity, I observed a noticeable drop in the number of rescheduling interrupts -- a change that coincided with my change in server configuration:

[Figure: graph showing decreased contention after I switched to more Zope instances with fewer threads]

I suspect that the "before" portion of this graph illustrates a problem that occurs when running multi-threaded Python programs on multi-core machines: threads running on different cores fight for control of the Global Interpreter Lock (a problem Dave Beazley has called to the community's attention in a recent presentation). I suspect this also explains the improvement in performance once I switched to multiple processes with fewer threads. By switching to multiple processes, we let the operating system manage the concurrency, which it is much better at.
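
To get a rough feel for the effect described above, here is a minimal sketch that times the same CPU-bound work run sequentially and in threads. It is not Zope code: the function names and the workload size are made up for illustration, and it assumes a standard CPython build with a GIL.

```python
import threading
import time


def countdown(n):
    """Pure-Python CPU-bound loop; on CPython it needs the GIL to run."""
    while n > 0:
        n -= 1


def time_sequential(n, workers):
    """Run `workers` countdowns one after another in the current thread."""
    start = time.perf_counter()
    for _ in range(workers):
        countdown(n)
    return time.perf_counter() - start


def time_threaded(n, workers):
    """Run `workers` countdowns concurrently, one thread each."""
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(workers)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000  # arbitrary workload size, chosen only for illustration
    print("sequential: %.2fs" % time_sequential(n, 2))
    print("threaded:   %.2fs" % time_threaded(n, 2))
```

On a GIL-based interpreter the threaded run is typically no faster than the sequential one, because only one thread can execute Python bytecode at a time. Exact timings depend on the Python version and the machine.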

Moral of the story: If you're running Zope on a multi-core machine, having more than 2 threads per Zope instance is probably a bad move performance-wise, compared to the option of running more (load-balanced) instances with fewer threads.

(Using a single thread per instance might be even better, although of course you need to make sure you have enough instances to still handle your load, and you need to make sure single-threaded instances don't make calls to external services which then call back to that instance and block. I haven't experimented with using single-threaded instances yet myself.)
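
The callback hazard can be sketched with a plain thread pool standing in for a Zope instance. This is an analogy under stated assumptions, not Zope's actual request handling: `callback_blocks`, `outer` and `inner` are hypothetical names, and the timeout exists only so that the demonstration finishes instead of hanging.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def callback_blocks(workers, timeout=0.5):
    """Return True if a request that calls back into the same pool gets stuck."""
    pool = ThreadPoolExecutor(max_workers=workers)

    def inner():
        # The callback request; it needs a free worker to run.
        return "ok"

    def outer():
        # Stands in for a request whose external service calls back
        # into the same instance and waits for the answer.
        return pool.submit(inner).result(timeout=timeout)

    try:
        pool.submit(outer).result()
        return False
    except TimeoutError:
        # No free worker was available to serve the callback in time.
        return True
    finally:
        pool.shutdown()


if __name__ == "__main__":
    print("1 worker blocks: ", callback_blocks(1))
    print("2 workers block: ", callback_blocks(2))
```

With one worker, the only thread is busy waiting for a callback that can never be scheduled. With two workers, the second thread serves the callback and the outer request completes.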

Hanno Schlichting says:
Jan 17, 2010 06:34 PM
Did you have the obligatory "zope-conf-additional = python-check-interval 1000" line in your instance configs? Or did you use the Python default of 100?
With the default you get much more thrashing, but a setting like 1000 ticks usually gets that under control.
Lee Joramo says:
Jan 18, 2010 01:14 AM
David, thanks for the data. Your 'before' chart looks rather like the current charts for several of my servers. I have been planning to check out HAProxy since seeing Elizabeth Leddy's presentation at the Plone 2009 Conference in Budapest. Your data really helps me see that I need to invest the time in this approach.
HAProxy looks a little daunting. Do you have any advice or configuration files that you can share?
Matt Hamilton says:
Jan 17, 2010 06:33 PM
If you want to see some older info on this, take a look at the report I did back in 2002 on Zope's performance on Solaris.
This was mainly looking at Solaris's different threading systems. The default threading back then was NxM, in which they mapped N userspace threads to M kernel processes. This was due to kernel process context switches being that much heavier on Solaris (maybe due to massive multi-processor designs).
They have now switched by default to what is referred to in that report as the 'alternate threading model', which maps user threads 1 to 1 to kernel processes. This worked better for Zope as it meant the OS could schedule the threads better. In the case of Solaris there were also issues not just with the GIL but with locks on I/O that could end in deadlocks under the NxM threading model.
I guess what you are seeing now is really an extension of that same issue, but with Zope threads and cores replacing the OS threads and processes in that report.
Another thing that I've not seen mentioned at all in recent years is the python-check-interval setting in zope.conf. It used to be said that this was set too low for faster processors and should be raised. I was going to say that I don't even know if that setting still exists, but a quick search shows that, of course, Jarn are on the ball as usual:
[…]/jarn.checkinterval
-Matt
Laurence Rowe says:
Jan 17, 2010 10:43 PM
When configuring HAProxy it's important to remember that traffic is served not only by Zope's threads but also by filestream iterators. As Blob and skin resource images get served by filestream iterators outside of the thread, think carefully before setting 'maxconn 1' on your backend server. FWIW, I'm no longer convinced you get much benefit from cache affinity on individual Zope servers unless you have complex membrane or possibly LDAP users. There's a lot to be said for just using Varnish's inbuilt load balancing.