Skip to content. | Skip to navigation

Personal tools

>>> ''.join(word[:3].lower() for word in 'David Isaac Glick'.split())

‘davisagli’

Navigation

You are here: Home

David Glick – Plone developer

by admin posted Apr 05, 2010 11:48 PM

How to disable Plone's HTML filtering

by David Glick posted Jan 30, 2014 12:40 PM

It's a little tricky (and often not wise) to disable Plone's HTML filters, but here's how.

A common complaint about content editing in Plone is that it is too strict about filtering out certain HTML. There is a good reason for its strictness: since some HTML tags can be used to inject unsafe content (such as cross-site scripting (XSS)) and since many Plone sites allow content editing by untrusted users, these HTML tags are disallowed by default as a security precaution.

But sometimes security is less important that being able to insert any markup. A client recently came to me with this situation:

  • It’s a news site that commonly wants to embed iframes, video, and scripts.
  • The only users with access to edit HTML are trusted editorial staff.
  • There are no other, untrusted sites on the same domain or Zope instance.
Sounds like a good candidate for turning off the filtering!

Step 1: Turn off Plone’s safe_html transform. Go to /portal_transforms/safe_html in the ZMI, and enter a 1 in the ‘disable_transform’ box. This prevents Plone from removing tags and attributes while rendering rich text.

That used to be all that was necessary to turn off filtering. But these days TinyMCE also filters HTML on the client side, using configuration based on the safe_html transform’s settings, and unfortunately it doesn’t pay attention to the ‘disable_transform’ flag. Luckily we can get around that…

Step 2: Monkey-patch the TinyMCE utility to return a wildcard when determining the allowed tags and attributes:

from Products.TinyMCE.utility import TinyMCE
TinyMCE.getValidElements = lambda self: {'*': ['*']}

Now you can edit a page, open TinyMCE’s HTML dialog, enter whatever you want, save, and it will get saved to the database and be shown on subsequent views of the page. However, if you’re using Safari, Chrome, and IE, some unsafe markup will still not show up on the initial view of the page right after saving. Why not?

These browsers provide automatic reflexive XSS protection. That means that if they detect the same potentially unsafe markup in both a request and its response, they’ll block or ignore it. It’s normally a pretty nice security precaution, but in this case it’s cramping our style. Fortunately these browsers also provide a way for a site to opt out of the protection using an HTTP response header.

Step 3: Set the "X-XSS-Protection: 0" response header. This can be done in your frontend webserver such as apache or nginx. In my case, though, I figured I only need to disable the protection for users who have permission to edit, so I added this to the site’s main_template:

tal:define="dummy python:checkPermission('Modify portal content', context) and request.RESPONSE.setHeader('X-XSS-Protection', '0');"

Et voilà. No more filtering, and the editorial staff can enter whatever markup they dream up.

P.S. If someone wants to fix TinyMCE to pay attention to safe_html's disable_transform flag, that'd be nice!

Update 2014-02-03:Thanks to Nathan van Gheem who just fixed Products.TinyMCE to pay attention to the disable_transform flag from safe_html. Once that fix is released, only steps 1 and 3 will be necessary.

Recent improvements to Plone-Salesforce integration

by David Glick posted Jul 10, 2013 11:45 PM

The simple-salesforce client and a celery-based message queue are keys to making Plone-Salesforce integration simpler and more robust.

I've been working on yet another big project that involves synchronizing data between Plone and Salesforce. As in past projects, Plone provides a powerful web content management platform that can support lots of editors with a number of different roles without paying for additional license fees, while Salesforce's superior reporting tools provide motivation for replicating data there as well. With help from Jason Lantz, I've been working on modernizing the toolset used to achieve this replication.

The improvements are on several fronts:

  1. Using the simple-salesforce library instead of beatbox and suds
  2. Handling incoming and outgoing data via a message queue
  3. Using Plone ids as a external id in Salesforce

Let me cover each of these in more detail.

simple-salesforce vs. beatbox and suds

Salesforce originally had only a SOAP API, and for a long time the clients available for interacting with Salesforce from Python only worked with the SOAP API. There was beatbox, which supported the generic Salesforce partner webservice pretty well but didn't handle custom webservices. And there was suds, which could parse WSDL for custom webservices and allow interacting with them. This ecosystem was annoying: you needed different libraries for different tasks, they both were as error-prone as most SOAP clients are, and you needed to update WSDL files whenever someone made a change to the webservice.

These days Salesforce has a REST API—both for generic operations and available from Apex for custom webservices—and Jason called my attention to the simple-salesforce library from the team at New Organizing Institute. I read the code and was impressed: it's only about 500 lines of code and I don't think I would do much differently if I were writing it from scratch. (I'd probably add some tests, but it is working okay for me in practice.) Using simple-salesforce, inserting a contact looks something like this:

from simple_salesforce import Salesforce
sf = Salesforce(
    username='SF_USERNAME', password='SF_PASSWORD',
    security_token='SF_TOKEN', sandbox=True)
sf.Contact.insert({
    'FirstName': 'David',
    'LastName': 'Glick',
    'Email': 'david@glicksoftware.com',
})

Handling data via a message queue

The first generation of Plone-Salesforce integration tools made synchronous calls out to Salesforce. This was the easiest thing to get working, but had some problems:

  • High latency. This was a problem both for user experience and because it tied up server threads and increased the likelihood of database conflict errors.
  • Tight coupling. If Salesforce was temporarily unreachable due to network issues or a maintenance window, the Plone user would see errors.

To solve these problems we knew we needed a message queue to allow for asynchronous processing of data being relayed from one system to the other. We did some research and chose celery (with RabbitMQ as a backend). Celery has a lot of features including support for multiple backends and the flower UI for viewing the queue, and it seems to have some traction in other parts of the Python web programming world.

At first I was worried that celery would be difficult to integrate with Zope, but it turns out it's not very complicated and hardly even warrants the creation of a new library. Celery uses a connection pool, so can be used from within a threaded environment like Zope. So the main consideration is making sure that it plays nicely with Zope's transaction management: you don't want a job to get queued twice if the code that places it on the queue runs within a transaction that hits a database conflict and gets retried. Using an after-commit hook makes this pretty easy to deal with. I ended up creating a @salesforce_task decorator which encapsulates the logic for deferring the queueing to the right time as well as creating a Salesforce connection. See the code.

Then there's the other direction: polling Salesforce periodically for data that have been updated. I used to handle this with a cron job and "bin/instance run" script, but celery has support for periodic tasks so I figured I'd give that a try. The trick here is that this task needs to modify objects in the ZODB, so needs a full Zope environment set up when it runs in the celery worker process. Doing this is non-obvious but not especially difficult; I created another decorator called @zope_task which handles the dirty details. See the code.

Using Plone ids as a external id in Salesforce

To sync multiple changes to the same object between the 2 systems, we either need to store a Salesforce Id on the object in Plone, or a Plone id on the object in Salesforce. In the past I've stored a Salesforce Id in Plone, but this is annoying, because it means when you create a new object you can't push it to Salesforce asynchronously, because you need to wait for the id of the inserted object and store it. It turns out the Salesforce API has pretty good support for referring to objects by an "external id field" if you create one. So now we are adding an external id field called Plone_Id__c to our objects in Salesforce, and then it's pretty darn easy to push changes to Salesforce:

@salesforce_task
def do_upsert(sf, sobject_type, record_id, data):
    sobject = getattr(sf, sobject_type)
    return sobject.upsert(record_id, data)

@grok.subscribe(IContact, IObjectAddedEvent)
@grok.subscribe(IContact, IObjectModifiedEvent)
def handle_contact_modified(contact, event):
    data = {
        'FirstName': contact.first_name,
        'LastName': contact.last_name,
    }
    do_upsert.delay('Contact', 'Plone_Id__c/' + contact.UID(), data)

(In practice I have some extra checks to make sure that we don't call out to Salesforce while tests are running, or when those events are triggered due to pulling data in from Salesforce.)

You can even use an external id when referencing related objects, so we could update a committee that references the Contact who is chairing it like this:

@grok.subscribe(ICommittee, IObjectAddedEvent)
@grok.subscribe(ICommittee, IObjectModifiedEvent)
def handle_committee_modified(committee, event):
    data = {
        'Name': committee.title,
        'Chair__r': {  # this is the relation to the chair in Salesforce
            'Plone_Id__c': committee.chair  # in Plone we store a UID of the chair
        },
    }
    do_upsert.delay('Committee__c', 'Plone_Id__c/' + committee.UID(), data)

Next steps

This is all working very nicely and I'm *much* happier than I was with the old approaches. I'm also quite happy that the new approaches are for the most part Zope-agnostic, and could easily be modified to handle Salesforce integration from other Python-based platforms. Potential directions for future development (if someone wants to pay me to work on them) are:

  • Package up the decorators for creating celery tasks that run within Zope or do callouts to external webservices into a reusable library.
  • Create a higher-level, reusable Plone add-on with a UI for configuring synchronization between Plone content types and Salesforce objects.
2 comments

Goodbye, Groundwire. Hello, world!

by David Glick posted Sep 11, 2012 11:16 AM

After 5 years, I'm moving on from Groundwire to work independently.

In the summer of 2007 I graduated from college, moved to Seattle, and started working as a web developer for Groundwire, then called ONE/Northwest. This week, five years later, I'm both sad and excited to announce that I'm moving on.

Working at Groundwire has been a fantastic experience for which I am deeply grateful. I got to work with the best colleagues in the world, as well as some awesome clients whose missions I believe in. I learned a lot about shepherding nonprofit organizations through the process of doing a significant technology project. I was privileged to spend time pushing the Plone content management system to the next level and building deep integration between it and the Salesforce.com CRM platform. Groundwire is still doing great work to help nonprofits engage their constituents, and if you're currently looking for a great opportunity to combine your tech skills with a desire to work for good, I can wholeheartedly recommend applying for one of Groundwire's open positions.

But nevertheless, it's time for me to "level up" to the next challenge. After Thursday, I will start working as a freelance contractor specializing in Plone, Python, and web app development. I'm looking forward to working with and contributing my skills to some other great consultancies and clients, as well as learning new tools and techniques for constructing web-based solutions. And I'm looking forward to having a more flexible schedule with more chances to work on my own projects in addition to paid work.

The Plone content management system has grown dear to me during my time at Groundwire, and I will not be abandoning it. During the next few weeks leading up to the annual Plone conference, I plan to spend as much time as possible working on various Plone-related improvements that I have had scarce time for lately, including:

  • Attending the Sea Sprint and pushing forward the Deco layout system
  • Continuing to improve Dexterity's capabilities for building content types through the web
  • Helping prepare the release of Plone 4.3 (including completing the demise of KSS)

I'll be doing this work on a volunteer basis, but if you've benefited from my contributions to Plone and/or want to help make sure I continue to have time available to work on it, please consider leaving me a tip:

Following the Plone conference, I am scheduling projects for mid-October and beyond, and am especially interested in:

  • Chances to work with Plone, Pyramid, and/or other Python-based web frameworks—or train other developers on how to use them well
  • Web projects that are interactive rather than merely presenting information
  • Projects that expand what is possible for non-developers to accomplish using Plone
  • Projects that are driven in part by a "social good" mission rather than merely profit

If you have something like that I can assist with, please be in touch!

12 comments
David Glick

David Glick

I am a problem solver trying to make websites easier to build.

Currently I do this in my spare time as a member of the Plone core team, and during the day as an independent web developer specializing in Plone and custom Python web applications.