>>> ''.join(word[:3].lower() for word in 'David Isaac Glick'.split())

'davisagli'

David Glick – Plone developer

by admin posted Apr 05, 2010 11:48 PM

reflections on PyCon 2010

by David Glick posted Feb 23, 2010 02:50 PM

I just got back from the US PyCon 2010, my first Python Conference, where I had a blast. The conference felt to me a lot like Plone conferences in spirit, only with a greater diversity of software projects and of course more people (a record attendance of ~1100). It was held at the Hyatt in downtown Atlanta and was a great success logistically. One success in the organization of the conference was the push to get more women to attend, which resulted in 11% female attendees, an increase over previous years which I hope will continue as a trend.

Some highlights of the talks I attended were:

  • "Building Leafy Chat, DjangoDose, and Hurricane: Lessons Learned on the Real-Time Web with Python" by Alex Gaynor – Introduced me to Orbited, Twisted, Redis, and other tools for building scalable, interactive websites.
  • "Managing the world's oldest Django project" by James Bennett – I found myself drawing parallels between the evolution of Django and Ellington that James presented and that of Zope and Plone. The Django community is learning the same lessons about testing and reusability that we have.
  • "What every developer should know about database scalability" by Jonathan Ellis – good general overview of different strategies for replication and caching (focused on concepts rather than any particular software)
  • "Powerful Pythonic Patterns" by Alex Martelli – philosophizing on software patterns and anti-patterns in the Python context
  • "Demystifying Non-Blocking and Asynchronous I/O" by Peter A Portante – very helpful beginner-level overview
  • "Unladen Swallow: fewer coconuts, faster Python" by Collin Winter – an update on the state of Unladen Swallow, which was approved for merging into CPython during the language summit just before PyCon
  • "Pynie: Python 3 on Parrot" by Allison Randal – This one was for fun...I might keep an eye on Pynie just to see how a language actually gets implemented.
  • "How Python is guiding infrastructure construction in Africa" by Roy Hyunjin Han – Covered the use of Python for recognizing buildings in satellite imagery to help with planning development, etc.
  • "Why not run all your tests all the time? A Study of continuous integration systems" by C. Titus Brown – Bottom line: "Use Hudson."
  • the infamous Testing in Python BoF, which was a 3-hour lightning talk session organized one evening by the folks from Disney, complete with pizza, beer, heckling, and goats (the goat meme was introduced by Terry Peppers as an alternative to lolcats in slides, and ended up being adopted as a testing mascot).
  • "Tests and Testability" by Ned Batchelder – Not a lot new here for me, but a good overview by the creator of coverage.py.

Selecting which talk to go to was sometimes excruciating, and I'm looking forward to catching up with some of the ones I missed. Some of the ones I've heard recommended are:

  • "Deployment, development, packaging, and a little bit of the cloud" by Ian Bicking
  • "The state of Packaging" by Tarek Ziadé
  • "Scaling your Python application on EC2" by Jeremy Edberg – learnings from reddit
  • "Dude, Where's My Database?" by Eric Florenzano
  • "Understanding the Python GIL" by David Beazley – the hot topic of the conference
  • "The Python and the Elephant: Large Scale Natural Language Processing with NLTK and Dumbo" by Nitin Madnani and Jimmy L. Lin

Videos of the talks are, amazingly, already becoming available. Kudos to the A/V team.

On Sunday my attention waned and I got a bit mischievous. The Eldarion guys, who created Type War, set up OHWar, a Type War clone where you compete to correctly guess who said various quotes that were overheard at PyCon. After playing for far too long and still failing to hold first place, I decided it was a job for Python and created an OHWar-playing bot. I left it running in screen and came back a few hours later to find that I had not only topped the leaderboard but also hit the game's built-in score limit. :) This was also the evening that David Brenneman and I found the Django Pony unattended and added some "enhancements." ;)

Django pony with Plone stickers

Zope and Plone were not very visible in the conference schedule (there was one talk on Plone GetPaid and Satchmo, one on using Plone with Salesforce in which I contributed a few minutes of technical material to go with Chris Johnson's high-level overview, and one on the interface/adapter concepts...as well as a couple relating to repoze.bfg which has a Zopish ancestry). On the other hand, I believe Plone was, surprisingly, the only open source project with a booth in the exhibition hall. We had a nice-looking display with the Plone banner that continues to be passed around to US events, a bunch of collateral and books for display, and a big monitor for demoing Plone 4. Various people took turns staffing the booth, including members of the Atlanta Plone group, and Chris Calloway for much of Saturday. The Plone Foundation also subsidized World Plone Day T-shirts which a bunch of us wore on Saturday. We gathered for a photo and ended up with around 30 people.

Plone folks at PyCon

During the conference, a highlight for me was meeting and eating meals with various luminaries, including Jason Huggins (of Selenium fame), Holger Krekel (founder of PyPy), Wesley Chun (author of Core Python Programming) and even Guido himself (well, way down at the other end of the table). I also got to interact briefly with Allison Randal (from the Perl community), while trying out and submitting a new test for Pynie, a nascent Python implementation for the Parrot VM. I also now have a face to put with many additional names that I had only seen online before.

I was only able to join the sprints for one day, and mostly spent my time working on some miscellaneous tasks I hadn't been getting to. However, we were able to have a good meeting of GetPaid folks, to try to determine how to move forward with Brandon Rhodes' work to clean up payment processor configuration. I also did some refactoring of the GetPaid development buildout to clean it up, make sure it still works, and pave the way for updating the product for compatibility with Plone 4. If I had been able to stay longer, I think it would have been fun to participate in the great work being done in the Python packaging sprint, led by Tarek Ziadé and the Packaging Pig. Next year I will have to be sure to attend the entire sprint.

Reflections on building a member directory using Plone and Salesforce.com

by David Glick posted Feb 08, 2010 04:20 PM

I promised Chris Johnson that I would write up some of my learnings from a project integrating Plone and Salesforce.com, which Groundwire is just finishing up. So here you go, Chris!

The goal of the project is to provide web access to a directory of businesses who have paid for membership and inclusion in our client's directory -- while keeping the master data for the directory within Salesforce.com, not Plone. This involves several crucial challenges:

  1. How to present views for searching and browsing the Salesforce directory data within Plone
  2. How to provide the ability for businesses to log in and update their member profile
  3. How to provide the ability for businesses to apply and complete payment for membership, as well as to renew membership each year.

In this article, I'm going to focus on explaining how I approached the first two challenges. This is much more of a hand wave in the right direction, assuming a fair amount of background in Plone, than a detailed tutorial. That said, feel free to ask me questions about aspects of the implementation that I gloss over here.

Exposing the directory within Plone

Querying Salesforce directly on each request is a non-starter for many use cases. That's because Salesforce puts a pretty low limit on the number of API requests allowed per day (something like 1000 per user license). This means that we need a way to mirror data from Salesforce within Plone, and then update it in batch (thereby using fewer API requests) every night. (Building the directory as VisualForce pages within Salesforce Sites would be a valid alternative in some cases -- though requiring more work to integrate visually. But for this project it was a requirement that we be able to store additional data such as logos within Plone, as well as link to related content items for a business.)
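
To make the API budget concrete, here is a rough back-of-the-envelope sketch. It assumes a batched query costs one API call per batch of records returned (the initial query plus one queryMore per extra batch); the batch size of 500 and the traffic figures are illustrative assumptions, not numbers from this project.

```python
def nightly_sync_calls(record_count, batch_size=500):
    """API calls for one full sync: one call per batch of records returned.

    batch_size=500 is an assumed value; check what your org actually returns.
    """
    if record_count == 0:
        return 1  # the initial query is still issued
    # ceiling division without floats, so it behaves the same on Python 2 and 3
    return -(-record_count // batch_size)


def live_query_calls(page_views_per_day, queries_per_view=1):
    """API calls if every directory page view queried Salesforce directly."""
    return page_views_per_day * queries_per_view


print(nightly_sync_calls(3200))   # calls used by one nightly sync of 3200 records
print(live_query_calls(5000))     # calls used by 5000 live page views
```

Under these assumptions the nightly sync stays in single digits while live querying blows through a limit on the order of 1000 calls per day.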

How do we model data from Salesforce within Plone? It depends on what you need to do with the content in Plone. If you just need to be able to search and display a listing of results, then there is no reason to create full-fledged content items. In the past, for a case like this, I have just created temporary stub objects during a nightly dump of data from Salesforce, indexed them in a custom catalog, and then discarded the stubs. This is the most lightweight option; you have a catalog full of data for building your search views, but no unnecessary data hanging around.
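
The stub-object idea can be illustrated without any Zope machinery. The sketch below is a plain-Python stand-in, not the ZCatalog API: `TinyCatalog`, `StubProfile`, and the sample records are all hypothetical names and made-up data, used only to show that the indexed values outlive the objects they were read from.

```python
class StubProfile(object):
    """Throwaway object that exists only long enough to be indexed."""

    def __init__(self, record):
        self.sf_id = record['Id']
        self.title = record['Name']
        self.city = record.get('BillingCity', '')


class TinyCatalog(object):
    """Stand-in for a custom catalog: stores display values and one index."""

    def __init__(self):
        self.metadata = {}  # sf_id -> values needed to render a listing row
        self.by_city = {}   # city -> set of sf_ids

    def clear(self):
        self.metadata.clear()
        self.by_city.clear()

    def catalog_object(self, obj):
        self.metadata[obj.sf_id] = {'title': obj.title, 'city': obj.city}
        self.by_city.setdefault(obj.city, set()).add(obj.sf_id)

    def search(self, city):
        ids = self.by_city.get(city, set())
        return sorted((self.metadata[i] for i in ids),
                      key=lambda row: row['title'])


def nightly_dump(catalog, records):
    """Rebuild the catalog from scratch; stubs are never stored anywhere."""
    catalog.clear()
    for record in records:
        catalog.catalog_object(StubProfile(record))


# Made-up sample data standing in for a Salesforce query result.
records = [
    {'Id': '001A', 'Name': 'Acme Bakery', 'BillingCity': 'Seattle'},
    {'Id': '001B', 'Name': 'Zed Cycles', 'BillingCity': 'Seattle'},
    {'Id': '001C', 'Name': 'Birch Books', 'BillingCity': 'Tacoma'},
]
catalog = TinyCatalog()
nightly_dump(catalog, records)
print([row['title'] for row in catalog.search('Seattle')])
```

In a real site the catalog would be a ZCatalog with proper indexes and metadata columns; the point here is only the shape of the dump-index-discard cycle.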

If you actually need to be able to navigate to a full page view of a particular directory item, then you probably need an actual content item. I think Dexterity would be promising for this sort of thing, but for the project I'm just now wrapping up, I used Archetypes because I needed image scaling and the ability to link to other AT content as related items, both of which Dexterity doesn't have great support for yet.

Note that you don't actually need to define most aspects of the schema, if there are fields you want to display but don't need to have editable within Plone. For example, my schema looks something like this:

MemberProfileSchema = document.ATDocumentSchema.copy() + atapi.Schema((
    atapi.TextField('sf_id'),
    atapi.TextField('mailingAddress'),
    # etc...
))
# hide most fields
for field in MemberProfileSchema.fields():
    if field.schemata == 'default' and field.__name__ not in ('text',):
        field.widget.visible = {'edit':'invisible', 'view':'visible'}

Fields like mailingAddress get populated during the nightly data dump, but don't appear on the edit form if you edit the member profile. Why not? Well, mostly because I figured it would be hard to get an Archetypes edit form to save things to Salesforce as well as Plone. Alex Tokar at Web Collective tells me he has successfully taken this approach, though.

Here is an abbreviated version of the browser view that is called once a night to pull in the data from Salesforce:

"""
SFDC sync view. This is intended to be run via cron every night to update
the member profiles based on data from Salesforce.com.

It will:

 * Find all Accounts with a member status of 'Current' or 'Grace Period' (in
   our client's Salesforce schema this is a custom rollup field based on various
   criteria).
 
 * For each Account, find an existing Member Profile object in Plone whose
   'sf_id' field value equals the Id of the Account, and update it.
   
 * Or, if no existing Member Profile was found, create a new one and publish it.

 * Retract any existing Member Profiles that were no longer found as Accounts
   with the Current or Grace Period membership status in Salesforce, so they are
   still present but not publicly visible.

"""

import logging
import transaction
from zope.component import getUtility
from Products.Five import BrowserView
from Products.CMFCore.utils import getToolByName
from plone.i18n.normalizer.interfaces import IIDNormalizer
from Products.CMFPlone.utils import safe_unicode
from Products.CMFPlone.utils import _createObjectByType

SOBJECT_TYPE = 'Account'
FIELDS_TO_FETCH = (
    'Id',
    'Name',
    'Description',
    'BillingStreet',
    'BillingCity',
    'BillingState',
    'BillingPostalCode',
    # etc...
    )
FETCH_CRITERIA = "Member_Status__c = 'Current' OR Member_Status__c = 'Grace Period'"
DIRECTORY_ID = 'directory'
PROFILE_PORTAL_TYPE = 'Member Profile'

logger = logging.getLogger('SFDC Import')

class UpdateMemberProfilesFromSalesforce(BrowserView):
    
    def __init__(self, context, request):
        BrowserView.__init__(self, context, request)
        self.catalog = getToolByName(self.context, 'portal_catalog')
        self.wftool = getToolByName(self.context, 'portal_workflow')
        self.normalizer = getUtility(IIDNormalizer)
    
    def getDirectoryFolder(self):
        portal = getToolByName(self.context, 'portal_url').getPortalObject()
    
        # create the directory folder if it doesn't exist yet
        try:
            directory = portal.unrestrictedTraverse(DIRECTORY_ID)
        except KeyError:
            _createObjectByType('Large Plone Folder', portal, id=DIRECTORY_ID)
            directory = getattr(portal, DIRECTORY_ID)
        
        return directory
    
    def findOrCreateProfileBySfId(self, name, sf_id):
        res = self.catalog.searchResults(getSf_id = sf_id)
        if res:
            # update existing profile
            profile = res[0].getObject()
            logger.info('Updating %s' % '/'.join(profile.getPhysicalPath()))
            return profile
        else:
            # no existing profile has this sf_id: create a new one
            name = safe_unicode(name)
            profile_id = self.normalizer.normalize(name)
            directory = self.getDirectoryFolder()
            profile_id = directory.invokeFactory(PROFILE_PORTAL_TYPE, profile_id)
            profile = getattr(directory, profile_id)
            profile.setSf_id(sf_id)
            profile.reindexObject(idxs=['getSf_id'])
            logger.info('Creating %s' % '/'.join(profile.getPhysicalPath()))
        
        return profile
    
    def updateProfile(self, profile, data):
        profile.setSf_id(data.Id)
        profile.setTitle(data.Name)
        if not profile.getText():
            profile.setText(data.Description, mimetype='text/x-web-intelligent')
        profile.setMailingAddress("%s\n%s, %s %s" % (data.BillingStreet, data.BillingCity,
                                                     data.BillingState, data.BillingPostalCode))
        # etc...
        
        # publish and reindex
        try:
            self.wftool.doActionFor(profile, 'publish')
        except Exception:
            # transition not available, e.g. the profile is already published
            pass
        profile.reindexObject()
    
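    def hideProfileBySfId(self, sf_id):
        res = self.catalog.searchResults(getSf_id = sf_id)
        if not res:
            # nothing indexed under this sf_id (e.g. an empty value)
            return
        profile = res[0].getObject()
        try:
            self.wftool.doActionFor(profile, 'reject')
        except Exception:
            # transition not available, e.g. the profile is already hidden
            pass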

    def queryMembers(self):
        """ Returns an iterator over the records of active members from Salesforce.com """
        sfbc = getToolByName(self.context, 'portal_salesforcebaseconnector')
        where = '(' + FETCH_CRITERIA + ')'
        soql = "SELECT %s FROM %s WHERE %s" % (
            ','.join(FIELDS_TO_FETCH),
            SOBJECT_TYPE,
            where)
        logger.debug(soql)
        res = sfbc.query(soql)
        logger.info('%s records found.' % res['size'])
        for member in res:
            yield member
        while not res['done']:
            res = sfbc.queryMore(res['queryLocator'])
            for member in res:
                yield member
    
    def __call__(self, queryMembers=queryMembers):
        """ Updates the member directory based on querying Salesforce.com """
        
        # 0. get list of sf_ids for the profiles we already know about, so we
        # can keep track of which ones we need to make private
        sf_ids_not_found = set(self.catalog.uniqueValuesFor('getSf_id'))
        
        # 1. fetch active Member Profile records, update ones that match,
        #    and create new ones
        for i, data in enumerate(queryMembers(self)):
            profile = self.findOrCreateProfileBySfId(name = data.Name, sf_id = data.Id)
            self.updateProfile(profile, data)
            
            # commit periodically (every 10) to avoid conflicts
            if not i % 10:
                transaction.commit()
            
            # keep track of which profiles we need to hide
            sf_ids_not_found.discard(data.Id)
        
        # 2. hide any profiles that are no longer active
        for sf_id in sf_ids_not_found:
            self.hideProfileBySfId(sf_id)
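
Stripped of the Plone and Salesforce calls, the bookkeeping in this view is a three-way set reconciliation. The following is a minimal standalone sketch of that idea; `plan_sync` and the sample Ids are hypothetical and are not part of the view above.

```python
def plan_sync(known_ids, active_ids):
    """Split Salesforce Ids into those to update, create, and hide.

    known_ids:  Ids of profiles that already exist locally
    active_ids: Ids returned by the query for active members
    """
    known = set(known_ids)
    active = set(active_ids)
    return {
        'update': sorted(known & active),  # exists locally and still active
        'create': sorted(active - known),  # active but not yet local
        'hide': sorted(known - active),    # local but no longer active
    }


plan = plan_sync(known_ids=['001A', '001B', '001C'],
                 active_ids=['001B', '001C', '001D'])
print(plan)
```

The view itself gets the same result incrementally: it starts with every known Id in the "not found" set and discards each one as the query yields it, which avoids holding the full query result in memory.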

All that's left is writing the view which actually queries the catalog for these member profiles and presents them as a listing, which is relatively straightforward, and left as an exercise for the reader. :)

Allowing updates to directory profiles

So if the Archetypes content type doesn't allow edits to most of its fields, how did I provide for logged-in members to edit profile info? Well, there are 2 parts:

  1. The Salesforce Auth Plugin allows logins to Plone based on Account records in Salesforce (by matching on custom username and password fields on the Account).
  2. A custom z3c.form form reads values from the Account associated with the currently logged-in user, and writes to both that Account record in Salesforce and also to the associated Member Profile archetype within Plone (so that updates appear in the directory immediately).

I won't go into detail on the configuration of the Auth Plugin, as it is covered in the package's documentation. I configured it to load the Salesforce Id of the Account and several other fields into PAS member properties, for easy access within Plone. I did not configure all of the account fields as member properties -- while I could have done so, I didn't see much utility in that, since Plone can't (at least not yet) automatically generate an edit form for all the member properties.

Instead, I built a custom z3c.form form that reads and writes directly to Salesforce. This turned out to be less complicated than I anticipated, mostly thanks to a new ORM-style library I built. It wraps the objects returned from Salesforce by beatbox (whose attributes correspond to Salesforce field names) in a model whose attribute names match the field names of the form schema, which allows the wrapper to be used as the context of a z3c.form form. I'm not yet going to post the implementation of this library, as I intend to make some significant changes to the API before releasing it (real soon now?). But let me at least show you what using it looks like (again I have simplified from the real code):

from zope import schema
from zope.interface import Interface, implements
from z3c.form import form, field, button
from plone.z3cform.layout import wrap_form
from plone.memoize.instance import memoize
from Products.CMFCore.utils import getToolByName

from sforzando import SFObject, SFField

class IAccountGeneralInfo(Interface):
    """ Schema for member profile edit form """
    business_name = schema.TextLine(title = u'Business Name')
    # etc...

class SFAccount(SFObject):
    """ Adapts a Salesforce Account to the profile edit form schema"""
    implements(IAccountGeneralInfo)
    
    _sObjectType = 'Account'
    
    sf_id = SFField('Id')
    business_name = SFField('Name')
    # etc...

class ProfileEditForm(form.Form):
    """ An edit form for the current authenticated member's Account """
    
    label = u'Update Profile'
    fields = field.Fields(IAccountGeneralInfo)

    def _get_sf_id(self):
        """ Find the Salesforce Account Id corresponding to the current logged in member. """
        mtool = getToolByName(self.context, 'portal_membership')
        member = mtool.getAuthenticatedMember()
        sf_id = member.getProperty('sf_id')
        if not sf_id:
            raise Exception("Did not find valid Salesforce ID for member '%s'" % member.getId())
        return sf_id

    @memoize
    def getContent(self):
        """ Provides the object this form will edit.
            Memoized so we always get the same one for a given request. """
        sfbc = getToolByName(self.context, 'portal_salesforcebaseconnector')
        return SFAccount(sfbc, "Id='%s'" % self._get_sf_id())

    @button.buttonAndHandler(u'Update Profile')
    def handleUpdate(self, action):
        """ Handler for the Update Profile button """
        data, errors = self.extractData()
        if not errors:
            self.status = u'Changes saved.'
            # save changes to Salesforce
            sf_id = self._get_sf_id()
            sfbc = getToolByName(self.context, 'portal_salesforcebaseconnector')
            SFAccount.update(sfbc, id=sf_id, **data)
            # etc...additional code to update the local AT-based copy of the Account data...

ProfileEditView = wrap_form(ProfileEditForm)
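
Since the wrapper library itself is not shown, here is one way such a field mapping could work, using Python descriptors. This is a hypothetical sketch and not the sforzando implementation: `MappedField`, `RecordWrapper`, and the change tracking are invented for illustration, and it operates on a plain dict rather than a real Salesforce result.

```python
class MappedField(object):
    """Descriptor exposing one key of a raw record under a friendlier name."""

    def __init__(self, remote_name):
        self.remote_name = remote_name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance._record.get(self.remote_name)

    def __set__(self, instance, value):
        instance._record[self.remote_name] = value
        instance._dirty.add(self.remote_name)


class RecordWrapper(object):
    """Wraps a dict-like record and remembers which remote fields changed."""

    def __init__(self, record):
        self._record = dict(record)
        self._dirty = set()

    def changes(self):
        """Return only the modified remote fields, keyed by remote name."""
        return dict((name, self._record[name]) for name in self._dirty)


class Account(RecordWrapper):
    sf_id = MappedField('Id')
    business_name = MappedField('Name')


account = Account({'Id': '001A', 'Name': 'Acme Bakery'})
account.business_name = 'Acme Bakery & Cafe'
print(account.business_name)
print(account.changes())
```

Because the form only ever sees attributes like `business_name`, the form schema stays independent of the Salesforce field names, and `changes()` gives a natural payload for an update call.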

Formlib would probably also work just as well as z3c.form. And certainly using a PloneFormGen form with the 'update' feature of the salesforcepfgadapter would work without need for coding, if you don't need a particularly fancy form. As long as you mapped the Salesforce object Id as a member property in the Auth Plugin configuration, it's pretty easy to use that as the basis for determining which object the form should edit.

In conclusion

I'm pretty excited about the results of this project, which is one of the deeper integrations of Plone and Salesforce.com that I have worked on, and which builds on the tools Groundwire has led the development of over the past few years -- especially the Salesforce Auth Plugin. Giving Plone the ability to accept logins based on a CRM system opens the door to a lot of exciting possibilities -- think about being able to show visitors targeted content based on what your database knows about their interests or location, or allowing them to share content with other visitors from the same geographic area.

If you are putting to good use the tools and code discussed here, or are finding other cool things to do by integrating Plone and Salesforce, I'd love to hear about it.

Using HAProxy with Zope via Buildout

by David Glick posted Jan 19, 2010 03:14 PM

After my post on reducing GIL contention by using fewer Zope threads, Lee Joramo asked for more information on setting up HAProxy, so let me share my configuration. Much of the credit for this goes to Hanno Schlichting and Alex Clark, who offered me much good advice and a sample configuration, respectively.

First, a few words about what HAProxy offers. For the past couple years I've been using Pound to load balance between multiple backend Zope instances. But recently I've been hearing recommendations from people I trust (such as Jarn and Elizabeth Leddy) to try HAProxy instead.

HAProxy offers some nice features:

  • Backend health checks
  • Various load-balance algorithms for how requests get distributed to backends
  • Can do sticky sessions so that an authenticated user always hits the same backend
  • Warmup time (don't send as many requests to a Zope instance while it's starting up)
  • Provides a status page giving info on backend status and uptime, # of queued requests, # of active sessions, # of errors, etc.

Some of these are possible with Pound too, but the status screen was really the "killer app" for me. It is fun to watch, but also very useful for doing rolling restarts when new code needs to be deployed without an interruption in service.

HAProxy status page

Configuration

In my buildout.cfg I added:

[buildout]
...
parts =
    ...
    haproxy-build
    haproxy-conf

[haproxy-build]
recipe = plone.recipe.haproxy
url = http://dist.plone.org/thirdparty/haproxy-1.3.22.zip

[haproxy-conf]
recipe = collective.recipe.template
input = ${buildout:directory}/haproxy.conf.in
output = ${buildout:directory}/etc/haproxy.conf
maxconn = 24000
ulimit-n = 65536
user = zope
group = staff
bind = 127.0.0.1:8080

Here, we add a part called "haproxy-build" which uses the plone.recipe.haproxy recipe to build haproxy from source and add a bin/haproxy script for running it, and a part called "haproxy-conf" which builds the HAProxy configuration file by filling in variables in a template file called haproxy.conf.in.

Be sure to set the user and group variables to the user and group you want HAProxy to run as, and update the bind variable to set the port to which HAProxy should bind.
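
If the `${part:option}` placeholders are unfamiliar, the idea is simple string substitution against the values defined in buildout parts. The sketch below imitates that in plain Python purely as an illustration; it is not how buildout or collective.recipe.template is implemented, and `fill_template` is a hypothetical helper.

```python
import re

# Matches ${part:option}, e.g. ${haproxy-conf:maxconn}
PLACEHOLDER = re.compile(r'\$\{([\w.-]+):([\w.-]+)\}')


def fill_template(text, parts):
    """Replace each ${part:option} with parts[part][option].

    Raises KeyError if the template refers to an unknown part or option.
    """
    def lookup(match):
        part, option = match.group(1), match.group(2)
        return parts[part][option]
    return PLACEHOLDER.sub(lookup, text)


parts = {
    'haproxy-conf': {
        'maxconn': '24000',
        'user': 'zope',
        'bind': '127.0.0.1:8080',
    },
}
template = (
    "maxconn ${haproxy-conf:maxconn}\n"
    "user ${haproxy-conf:user}\n"
    "bind ${haproxy-conf:bind}\n"
)
print(fill_template(template, parts))
```

The practical upshot is that any option you add to the haproxy-conf part becomes available to the template under the same name.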

I run most of my Plone stack using supervisord, so I also updated my supervisord configuration in buildout to run HAProxy:

[supervisor]
recipe = collective.recipe.supervisor
...
programs =
    ...
    10 haproxy ${buildout:directory}/bin/haproxy [ -f ${buildout:directory}/etc/haproxy.conf -db ]

In a real life deployment, you'll probably also want a caching reverse proxy like squid or varnish sitting in front of HAProxy.

What about the contents of haproxy.conf.in? Here's mine:

global
  log 127.0.0.1 local6
  maxconn  ${haproxy-conf:maxconn}
  user     ${haproxy-conf:user}
  group    ${haproxy-conf:group}
  daemon
  nbproc 1

defaults
  mode http
  option httpclose
  # Remove requests from the queue if people press stop button
  option abortonclose
  # Try to connect this many times on failure
  retries 3
  # If a client is bound to a particular backend but it goes down,
  # send them to a different one
  option redispatch
  monitor-uri /haproxy-ping

  timeout connect 7s
  timeout queue   300s
  timeout client  300s
  timeout server  300s

  # Enable status page at this URL, on the port HAProxy is bound to
  stats enable
  stats uri /haproxy-status
  stats refresh 5s
  stats realm Haproxy\ statistics

frontend zopecluster
  bind ${haproxy-conf:bind}
  default_backend zope

# Load balancing over the zope instances
backend zope
  # Use Zope's __ac cookie as a basis for session stickiness if present.
  appsession __ac len 32 timeout 1d
  # Otherwise add a cookie called "serverid" for maintaining session stickiness.
  # This cookie lasts until the client's browser closes, and is invisible to Zope.
  cookie serverid insert nocache indirect
  # If no session found, use the roundrobin load-balancing algorithm to pick a backend.
  balance roundrobin
  # Use / (the default) for periodic backend health checks
  option httpchk

  # Server options:
  # "cookie" sets the value of the serverid cookie to be used for the server
  # "maxconn" is how many connections can be sent to the server at once
  # "check" enables health checks
  # "rise 1" means consider Zope up after 1 successful health check
  server  plone0101 127.0.0.1:${zeoclient1:http-address} cookie p0101 check maxconn 2 rise 1
  server  plone0102 127.0.0.1:${zeoclient2:http-address} cookie p0102 check maxconn 2 rise 1

This assumes that I have Zope instances built by parts called "zeoclient1" and "zeoclient2" in my buildout; you'll probably need to update those names.

You may want to adjust the "option httpchk" line to use a different URL for checking whether the Zope instances are up -- you want to point at something that can be rendered as quickly as possible (in my case it's the Zope root information screen, so I'm not too worried).

The maxconn setting for each backend should be at least the number of threads that Zope instance is running. Laurence Rowe pointed out to me that it should probably not be set to 1, since Zope also serves some things (such as blobs) via file stream iterators, which happens apart from the main ZPublisher threads. (So setting maxconn to 1 would mean serving a large blob could block other requests to that backend, for instance.)
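
The maxconn advice can be turned into a quick sanity check over the generated configuration file. This is a small sketch under stated assumptions: `check_maxconn` is a hypothetical helper, it only understands simple `server` lines like the ones above, and the ports in the sample are made up.

```python
def check_maxconn(config_text, threads_per_instance):
    """Return warnings for server lines whose maxconn looks too low."""
    warnings = []
    for line in config_text.splitlines():
        tokens = line.split()
        # only look at "server <name> <address> ..." lines
        if len(tokens) < 3 or tokens[0] != 'server':
            continue
        if 'maxconn' not in tokens:
            continue
        idx = tokens.index('maxconn')
        if idx + 1 >= len(tokens):
            continue  # "maxconn" with no value; leave that to HAProxy to report
        name = tokens[1]
        maxconn = int(tokens[idx + 1])
        if maxconn < threads_per_instance:
            warnings.append('%s: maxconn %d is below %d threads'
                            % (name, maxconn, threads_per_instance))
        elif maxconn == 1:
            warnings.append('%s: maxconn 1 lets one large download block the backend'
                            % name)
    return warnings


sample = """
  # "maxconn" is how many connections can be sent to the server at once
  server  plone0101 127.0.0.1:8081 cookie p0101 check maxconn 2 rise 1
  server  plone0102 127.0.0.1:8082 cookie p0102 check maxconn 1 rise 1
"""
for warning in check_maxconn(sample, threads_per_instance=1):
    print(warning)
```

With one Zope thread per instance, the first server passes and the second is flagged, which matches the reasoning about file stream iterators above.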

See the HAProxy configuration documentation for more details on the settings that can be used in this file.

David Glick

I am a problem solver trying to make websites easier to build.

Currently I do this in my spare time as a member of the Plone core team, and during the day as an independent web developer specializing in Plone and custom Python web applications.