>>> ''.join(word[:3].lower() for word in 'David Isaac Glick'.split())



David Glick – Plone developer

by admin posted Apr 06, 2010 02:48 AM

The making of

by David Glick posted Apr 04, 2011 12:55 AM

An explanation of the pile of hacks we used to get ZODB running in the browser.

On April 1st Matthew Wilkes and I announced the launch of ZODB Webscale Edition, which runs the ZODB in a Javascript-based Python interpreter backed by HTML localstorage. It was of course in honor of April Fools' Day, since the entire concept of running the ZODB in a browser is a bit silly, but the technology actually works. Here's how we did it.

The concept

Matthew first approached me about a month ago with the intent to pull off something epic for April Fools' Day this year. His goal, he explained, was to "make something that nutcases would find useful but everyone else knows is stupid." We discussed various ideas, such as supporting ymacs as a richtext editor in Plone, before I remembered I had seen a way to run Python in the browser. We quickly ruled out running all of Zope2 in the browser as too big a project, but Matthew suggested doing just the ZODB, and I realized that backing it with HTML localstorage could make for a fun, reasonably scoped, buzzword-compliant demo. The idea was born.

The Emscripten Python interpreter

The hardest part of the problem—getting a Python interpreter implemented in Javascript—was already basically solved by the Emscripten project. Their Python interpreter was generated by compiling CPython to LLVM bytecode using clang, then using their tools to translate that into Javascript. The result is a 2.8MB closure-compiled "python.js" which includes the logic of CPython as well as implementations of basic C library calls like sprintf and malloc in terms of operations on a heap which consists of a Javascript array. We didn't have time to get the whole Emscripten toolchain set up and working so that we could build a non-packed python.js, but we did need to understand the basics of this Python interpreter, so we used the Google Closure Compiler to unpack the whitespace of python.js so it was at least semi-readable.

Unfortunately this Python interpreter had a major limitation—it had no implementation of importing modules, so you were limited to things like "sys" which are included statically in CPython. Obviously this wouldn't work for getting the ZODB working.

The import system

I wanted to allow dynamically importing as many things as possible, rather than simply bundling the ZODB code with the interpreter in some fashion. So it seemed like a good approach would be to write a small WSGI import server, and then make the interpreter fetch imports via AJAX in some way. But how, exactly?
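As a rough illustration of the "small WSGI import server" idea, here is a minimal sketch in Python 3. It is not the server used for the demo: the `SOURCES` dict, the `import_app` name, and the choice to take the module name from the last path segment are all assumptions made for the example.

```python
# Hypothetical in-memory "library": module name -> source text.
# A real server would more likely read files from disk.
SOURCES = {"this_demo": "ANSWER = 42\n"}


def import_app(environ, start_response):
    """WSGI app that returns the source of the module named in the URL path."""
    # Use the last path segment as the module name, e.g. /lib/this_demo
    name = environ.get("PATH_INFO", "").rsplit("/", 1)[-1]
    source = SOURCES.get(name)
    if source is None:
        # An empty body lets the client treat the module as missing.
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b""]
    body = source.encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

For local experiments, an app like this can be served with the standard library's `wsgiref.simple_server.make_server`.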

I knew that importing in Python calls the __import__ builtin, so I could monkeypatch __builtins__.__import__ to make it somehow fetch the module being imported by name, then manually construct a module using imp.new_module() and exec the fetched code in the new module's namespace. However, this would be Python code running within the sandbox of the Javascript-based interpreter, without the ability to make Javascript calls like fetching via AJAX. So how could we do the actual "fetch the code" step?

We could have used the CPython API (as translated into Javascript) to create, from Javascript, a new "builtin" module with a function for loading a module's code via AJAX. But neither of us had worked with the Python C API much, let alone with its closure-compiled Javascript variant, and this seemed like too big a task. So we hit on a simpler hack: we wrote a Javascript function to do the fetching and then "hijacked" the raw_input builtin by replacing the interpreter's reference to it (we picked raw_input because the Emscripten Python interpreter didn't implement it anyway).

The result is a glorious mixture of CPython API (we had to use a bit after all to unpack the argument with the module name and to pack the string with the returned source code) and JQuery:

function raw_input(self, args) {
    // stack management
    var b = a;
    a += 4;
    for(var d = b;d < a;d++) {
        i[d] = j[d] = 0
    }
    i[b] = 0;
    // unpack argument
    Module._PyArg_UnpackTuple(args, $ba, 0, 1, u([b, 0, 0, 0], 0, o));
    var name = ma(Module._PyString_AsString(Module._PyObject_Str(i[b])));

    // fetch via *synchronous* XMLHTTPRequest
    output('Importing ' + name + '...', 'status');
    var source = '';
    $.ajax({
        url: 'lib/' + name,
        error: function(xhr, status, code) {},
        success: function(result) {
            source = result;
        },
        async: false,
        dataType: 'text',
        cache: true
    });

    // return the source as a pointer into the Python heap
    var h = Module.Pointer_make(Module.intArrayFromString(source))
    a = b;
    return Module._PyString_FromString(h);
}
// hijack the raw_input builtin
n[RMb] = Module._builtin_raw_input = raw_input;

The __import__ hook could then be implemented in terms of the new raw_input builtin:

import sys, imp
_known_bad = set()
def __import__(name, globals={}, locals={}, fromlist=[], level=-1):
    if name in _known_bad:
        raise ImportError('Could not fetch module %s from server.' % name)
    if name in sys.modules:
        return sys.modules[name]

    # call our hook (we hijack raw_input below)
    source = raw_input(name)
    if not source:
        raise ImportError('Could not fetch module %s from server.' % name)

    m = imp.new_module(name)
    m.__file__ = name
    sys.modules[name] = m
    if '.' in name:
        parent, basename = name.rsplit('.', 1)
        if parent in sys.modules:
            setattr(sys.modules[parent], basename, m)
    exec source in m.__dict__
    return m
__builtins__.__import__ = __import__

This Python is included inline in the HTML, and found and executed during initialization of the interpreter. It is a bit buggy in its handling of packages, but worked well enough to let us move on to the more interesting aspects of the project.
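For readers who want to try the module-construction half of this trick on a current interpreter, here is a minimal Python 3 sketch of the same idea. It is an adaptation, not the demo's code: `FAKE_SERVER`, `fetch_source` and `load_remote_module` are hypothetical names, the "server" is just a dict, and `types.ModuleType` stands in for `imp.new_module`, since the `imp` module is gone from recent Python versions.

```python
import sys
import types

# Hypothetical stand-in for the import server: module name -> source text.
FAKE_SERVER = {
    "greeting": "def hello(name):\n    return 'Hello, ' + name\n",
}


def fetch_source(name):
    """Play the role of the hijacked raw_input(): '' means 'not found'."""
    return FAKE_SERVER.get(name, "")


def load_remote_module(name):
    """Build a module object from fetched source text."""
    if name in sys.modules:
        return sys.modules[name]
    source = fetch_source(name)
    if not source:
        raise ImportError("Could not fetch module %s from server." % name)
    module = types.ModuleType(name)
    module.__file__ = name
    # Register before executing so that circular imports can find the module.
    sys.modules[name] = module
    exec(source, module.__dict__)
    return module


greeting = load_remote_module("greeting")
print(greeting.hello("ZODB"))
```

The sketch deliberately leaves out the `__builtins__.__import__` replacement and the package handling, so it only shows how fetched source text becomes an importable module object.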

Making the ZODB work

So we could import things. "import this" worked great. The ZODB? Not so much. You see, we soon found out that Emscripten's Python interpreter is really quite minimalistic in its builtin modules. "os" is not included, as sandboxed Javascript can't access the local filesystem, so anything like "logging" which depends on it was a problem. Things like "threading", "re", and "time" were similarly missing. Even more problematic was the omission of the following modules which are used in pickling (sort of the core function of the ZODB): cPickle, marshal, and struct.

So we started hacking up our copies of the ZODB and transaction packages. We removed all the logging. We took out the threading locks, with the justification that Javascript is single-threaded anyway. time.time() got replaced with a simple incrementing counter. Et cetera. As for cPickle and its dependencies, we borrowed the pure Python implementations from PyPy. We also needed Tres Seaver's branch of ZODB to provide a pure Python implementation of the 'persistent' module. It took a couple evenings, but without too much effort we were eventually able to instantiate a DemoStorage, instantiate a DB, connect to it, and commit transactions on the root object. Major win!

The HTML localstorage backend

But it still wasn't a great demo. We wanted it to be possible to commit a transaction, then come back after leaving the page and be able to access the data that had been committed. And we wanted the persistence to happen in browser localstorage on the client side, rather than by passing values to the server. So we needed to find a way to modify or replace DemoStorage to pass its values to Javascript to be placed in localstorage, and to retrieve them again when the page is loaded.

After the CPython API hackery needed to get the imports working, I was a bit scared about doing a lot of passing values from Python to Javascript and back. So at this point I thought, "Wait. We have the entire Python interpreter runtime state in these Javascript arrays; why don't I just save and restore the whole interpreter?" Ultimately this led me down a rabbit hole to nowhere. I never quite figured out the correct bootstrapping process to get all the necessary Javascript variables re-initialized on subsequent loads, but with the Python heap, stack, etc replaced with the old state. And I was bumping up against the 5MB limit for what can be placed in localstorage.

Fortunately Matthew came along at this point with a different approach. He wrote a very simple ZODB storage class, the HTML5Storage, which stores pickles of modified objects and writes them to a (Python) global dict, keyed by object id, when a transaction is committed. And instead of messing around with the CPython API to interface Python with Javascript, he simply made a commit print out the JSON representation of that global dict, with a special identifier that the Javascript implementation of print() was modified to watch for and handle specially by parsing the JSON and stuffing it in localstorage. When the page is loaded, the stored values are passed back to Python by converting the localstorage contents to JSON, executing it as Python, and placing the values back in the Python global dict. (There is a bit of extra hackery to encode backslashes in the Python repr of the pickles, handled by the very Britishly named dodgy_encode function.)
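The "print a specially tagged line and let the host intercept it" channel can be sketched in plain Python 3. Everything here is illustrative: `MARKER` is a made-up tag rather than the identifier the demo used, `local_storage` is a dict standing in for the browser's localstorage, and `commit` returns the line instead of printing it so the example is easy to check.

```python
import json

MARKER = "@@localstorage@@"  # hypothetical tag, not the demo's identifier
local_storage = {}           # stands in for the browser's localstorage


def commit(pickles_by_oid):
    """Python side: serialize the whole dict as one specially tagged line."""
    return MARKER + json.dumps(pickles_by_oid)


def handle_print(line):
    """Host side: intercept tagged lines, pass everything else through."""
    if line.startswith(MARKER):
        payload = json.loads(line[len(MARKER):])
        for oid, data in payload.items():
            local_storage[oid] = data
        return None  # swallowed, nothing is shown to the user
    return line      # ordinary output is left alone


shown = handle_print(commit({"0x00": "pickle-bytes-as-text"}))
passthrough = handle_print("hello")
```

The appeal of this design is that output is the only thing that has to cross the Python/Javascript boundary, so no C API calls are needed on the way out.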

At this point, it should have worked. But there was one more hurdle.

Debugging the pickles

When we tried to reload the root object from the HTML5Storage, we were getting unexpected errors. I compared the pickles that had been generated in the Emscripten interpreter with those generated for a similar object on a real Python interpreter, and noticed that they were not the same. I used the pickletools.dis() function from the stdlib to examine the pickle bytecode, and figured out that the size of some strings in the pickle was getting recorded incorrectly, so the pickles were not being executed correctly during unpickling. Specifically, the size of some strings was getting recorded as \x02 regardless of the actual length of the string.
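As a small illustration of that debugging step, this Python 3 sketch disassembles a pickle with `pickletools.dis()`. The sample dict is made up, and the opcodes in a modern interpreter differ from those in the Python 2 pickles discussed here, so treat it as a demonstration of the tool rather than a reproduction of the bug.

```python
import io
import pickle
import pickletools

# A made-up object; protocol 2 is the newest protocol Python 2 understands.
data = pickle.dumps({"title": "root"}, protocol=2)

# Write the opcode listing into a buffer instead of stdout.
buffer = io.StringIO()
pickletools.dis(data, out=buffer)
listing = buffer.getvalue()
print(listing)
```

Each line of the listing shows a byte offset, an opcode and its argument, which is what makes it possible to compare two pickles opcode by opcode and spot a string whose recorded length does not match its content.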

I tracked the bug down to the repr() of the pickle that Matthew was doing in his dodgy_encode function. A variety of different bytes were all getting repr'd as \x02. And then I tracked this into the PyString_Repr implementation of the Javascriptified Python interpreter, to where it calls sprintf with a format of "\\x%02x". It turns out that Emscripten's implementation of sprintf was incomplete and supported neither the "02" used immediately after % to give a zero-padding width, nor the "x" used to specify printing a hex value.  It was interpreting the "02" literally, and the x was getting truncated off. Once I figured out where this issue was arising, it was a relatively simple matter to adjust the sprintf implementation to handle this format correctly.
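Python's `%` operator follows the same conventions as C's sprintf for this format, so the intended behavior is easy to show in Python 3. The `buggy_escape_byte` function below is only an imitation of the symptom described above (every byte coming out as \x02), not Emscripten's actual code.

```python
def escape_byte(value):
    # "%02x" means: hexadecimal, zero-padded to a width of two characters.
    return "\\x%02x" % value


def buggy_escape_byte(value):
    # Imitation of the symptom: "02" is emitted literally, the value is ignored.
    return "\\x02"


samples = (1, 10, 200)
correct = [escape_byte(b) for b in samples]
broken = [buggy_escape_byte(b) for b in samples]
print(correct)
print(broken)
```

With the correct format every byte gets a distinct escape, while the broken one collapses them all into the same text, which is why the string lengths in the pickles could no longer be recovered.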

And there was pickling, and there was unpickling: the first day. And Matthew and David looked on the hackery and saw that it was good.

The launch

Fortunately we still had a day left to put a bit of polish on the thing before April 1. Ryan Foster was kind enough to whip up a nice web-2.0-product-style design on short notice. (Kudos to Ryan; I basically said "we want something like" and he magically figured out exactly the sort of color scheme and logo I had in mind.) We added a bit of varnish in front and added some social media bling to the footer. In a flash of inspiration, I dubbed the site a project of "POSKey Enterprises," a reference to the POSKeyError one gets when trying to load an object from the ZODB that does not exist. We revamped the input/output to be much nicer and closer to a real Python interpreter.

The site launched to much fanfare on the morning of April 1. Like, I mean, we got literally dozens of pageviews. (Okay, actually a few hundred. And we realize the thing is a bit esoteric.) :) Some people assumed the thing was just a light frontend to a server-side interpreter, and were more impressed once we explained that everything was actually executing client-side. Thanks to everyone else who retweeted the link for helping spread the word a bit beyond the tiny Zope circles!

Ultimately, I think the project satisfied our goals; after all, hacking is about the journey, not the destination.


Inspect your ZODB with Eye

by David Glick posted Mar 21, 2011 01:50 PM

Eye is a utility for browsing the contents of a ZODB.

A fairly common complaint about the ZODB is that there's no generic tool for browsing its contents. In fact this is a bit of a lie, as there are at least 3 existing tools called "zodbbrowser," but they all depend on large parts of the Zope stack, and are therefore a bit hard to install.  So at the PyCon sprints I worked on adapting Roberto Allende's zope2.zodbbrowser into a Pyramid-based tool called Eye.

The result is easy to install and looks like it will be fairly useful for seeing all the objects present in a ZODB (not just the ones that the ZMI or some other app-level tool chooses to show). As an added bonus, it knows how to browse "broken" objects, so you don't have to have your application code in Eye's PYTHONPATH.

(Blue items are persistent objects; black ones are included in the ZODB only by virtue of being referenced by persistent objects, and do not get their own pickle.)

Eye can also be used to take a peek at any old set of Python objects that are not in a ZODB.

See the PyPI page for installation and usage instructions, or clone the project on github and send me pull requests. :)


Registering Add-on-specific components using z3c.baseregistry

by David Glick posted Oct 16, 2010 06:45 PM

z3c.baseregistry provides a safer way to register Zope components that should only be available within sites that have a particular add-on installed.

(Updated 10/17/2010 to reflect some minor corrections from Martin.)

At Groundwire it is very common for us to run a number of Plone sites within the same Zope instance. This introduces some unique requirements for add-on products to follow when they are registering Zope components:

  1. If the add-on registers some new component, it should do it in such a way that it is only available within sites that have the add-on installed.
  2. If the add-on overrides some default component from Zope or Plone, it should do it in such a way that it can be further overridden for one particular site.

There are several common approaches that can help with these requirements, but each has downsides. Overriding components using overrides.zcml is global (i.e., affects all Plone sites in an instance) and prevents further customization. Registering components for a browser layer only works for components that adapt the request (such as browser views). Registering persistent local utilities or adapters in a site's local component registry keeps things isolated, but can be a headache when it's time to uninstall the add-on or remove the implementation of a component.

There is a lesser-known fourth option: using z3c.baseregistry to create a registry specific to one add-on.

Component registries in Zope 2

In Zope 2 we typically deal with 2 types of component registries (also called site managers historically):

  1. The global registry, which is populated with components at startup by processing ZCML.
  2. A local registry associated with each Plone site (implemented in five.localsitemanager). These registries store components persistently in the ZODB, and can be populated via the zope.component.interfaces.IComponentRegistration API in Python, or via GenericSetup (componentregistry.xml).

When Zope traverses over a Plone site, its local registry is set as the active registry (via—or in older Zopes—which sets a thread local). After that, this registry is the one that can be obtained by zope.component.getSiteManager, and the one that will be implicitly used by the functions that do component lookups. If a component is looked up but not found in the local registry, it will fall back to checking in that registry's base registries. By default in Plone, there is just one base registry, which is the global registry.

Introducing z3c.baseregistry

However, there's no requirement that the global registry be the only base registry. z3c.baseregistry makes it possible to define additional, named registries which can be installed as additional base registries for a particular site. Then when a component lookup occurs, it will be looked for first in the local registry, then in the custom base registry, and then in the global registry.
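The lookup order can be pictured with a toy model. The `ToyRegistry` class below is a simplified illustration written for this example, not the zope.component API: real registries resolve their bases with more care, while this sketch checks the registry itself and then each base in order.

```python
class ToyRegistry:
    """Toy stand-in for a component registry with a chain of base registries."""

    def __init__(self, name, bases=()):
        self.name = name
        self.bases = tuple(bases)
        self._components = {}

    def register(self, key, component):
        self._components[key] = component

    def lookup(self, key):
        # Own registrations win; otherwise fall back to the bases in order.
        if key in self._components:
            return self._components[key]
        for base in self.bases:
            found = base.lookup(key)
            if found is not None:
                return found
        return None


global_registry = ToyRegistry("global")
addon_registry = ToyRegistry("addon", bases=[global_registry])
site_with_addon = ToyRegistry("site-a", bases=[addon_registry, global_registry])
site_without_addon = ToyRegistry("site-b", bases=[global_registry])

global_registry.register("mailer", "default mailer")
addon_registry.register("mailer", "add-on mailer")
addon_registry.register("tracker", "add-on tracker")

print(site_with_addon.lookup("mailer"))
print(site_without_addon.lookup("mailer"))
print(site_without_addon.lookup("tracker"))
```

In this model the site with the add-on sees the add-on's components, the other site only sees the global ones, and either site could still register its own override locally.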

The cool thing is that while the installation of a z3c.baseregistry is persistent, the components one contains are not. Instead, the components are populated at Zope startup via ZCML, very much like the global registry. The registerIn grouping directive lets us specify which registry components should be registered in:

<registerIn registry=".packageComponents">
  <!-- component directives here -->
</registerIn>

This means that when you uninstall an add-on that has its own base registry, you just need to remove the registry from the site manager's bases, rather than figuring out how to unregister each individual component as would be necessary for persistent components in the local registry. It also means that you can safely remove a component's class when you remove its registration without worrying about breaking legacy persistent registrations of that component.

Step by step

I've used this approach in a few projects lately. Here's what it looks like:

  1. Add z3c.baseregistry to the add-on's install_requires, and re-run buildout to make sure it is installed.
  2. Create a new registry instance.

    In (or could be elsewhere):

    from zope.component import getGlobalSiteManager
    from z3c.baseregistry.baseregistry import BaseComponents
    packageComponents = BaseComponents(getGlobalSiteManager(), '')

    Here, we made sure that the new registry has the global registry as its base, and is named after our add-on package (

  3. Register a local utility for looking up the new registry by name (this is used by z3c.baseregistry internally).

    In configure.zcml:

    <!-- registry for package-specific components -->
  4. Install the new registry in the bases for a particular site. z3c.baseregistry includes a form for doing this through the web, but it doesn't seem to work in Zope 2. Oh well, we can do it with a GenericSetup import handler instead.


    from zope.component import getSiteManager
    from zope.component.interfaces import IComponents
    def install_base_registry(site):
        sm = getSiteManager(context=site)
        reg = sm.getUtility(IComponents, name=u'')
        sm.__bases__ = tuple([reg] + [r for r in sm.__bases__ if r is not reg])

    You would then call this from the add-on's custom "import various" GenericSetup handler.

  5. Now components can be registered for the new add-on specific registry, using the registerIn grouping directive.

    In configure.zcml:

    <!-- make sure we can use registerIn -->
    <include package="z3c.baseregistry" file="meta.zcml"/>
    <registerIn registry=".packageComponents">
      <!-- component directives here -->
    </registerIn>

    These components will be found within sites that have the product installed, but not within sites that don't!


  • It should be obvious, but this only localizes the effects of ZCML directives whose effect is to register something in the component registry (e.g. utility, adapter, subscriber, browser:page). Directives that mutate other things, such as <class> which directly modifies a class, will still have a global effect.
  • Don't forget to make sure that the base registry gets removed from the local registry's bases when the add-on is uninstalled. Otherwise removing the product will break the site when it tries to unpickle the base registry.
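The install and uninstall bookkeeping comes down to rebuilding a tuple of bases. Here is a minimal sketch of both directions on plain tuples. The helper names are hypothetical, and in a real handler the result would be assigned back to the site manager's `__bases__` as in step 4.

```python
def with_base_first(bases, registry):
    """Return bases with `registry` in front, without duplicating it."""
    return tuple([registry] + [r for r in bases if r is not registry])


def without_base(bases, registry):
    """Return bases with every occurrence of `registry` removed."""
    return tuple(r for r in bases if r is not registry)


# Plain objects stand in for the global and add-on registries.
global_reg = object()
addon_reg = object()

bases = (global_reg,)
bases = with_base_first(bases, addon_reg)
bases = with_base_first(bases, addon_reg)  # installing twice is harmless
installed = bases
uninstalled = without_base(installed, addon_reg)
```

Comparing with `is` rather than `==` keeps the helpers from depending on how registries define equality.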

David Glick

I am a problem solver trying to make websites easier to build.

Currently I do this in my spare time as a member of the Plone core team, and during the day as an independent web developer specializing in Plone and custom Python web applications.