The Art of Integrating Plone with Webservices

by David Glick posted Jul 12, 2011 08:37 PM
Part of Zope's power lies in integrating it with other web-based services. Python provides some powerful "batteries included" to help with this, but there are still some common pitfalls to watch out for.

What is a web service?

  • Lets code running on one machine (the client) interact with code on another (the server)
  • Transports messages over HTTP

Why web services?

  • Often easier to integrate than to build your own
  • Combine best-of-breed tools

Major categories

  • RPC-style / service-oriented architecture: the focus is on the action being performed.
  • REST-style / resource-oriented architecture: the focus is on the object being acted upon.

XML-RPC

  • Passes messages in a simple XML-based format.

    Sample request:

    <?xml version="1.0"?>
    <methodCall>
      <methodName>examples.getStateName</methodName>
      <params>
        <param>
            <value><i4>40</i4></value>
        </param>
      </params>
    </methodCall>
    

    Sample response:

    <?xml version="1.0"?>
    <methodResponse>
      <params>
        <param>
            <value><string>South Dakota</string></value>
        </param>
      </params>
    </methodResponse>
    
  • Support in the Python stdlib: xmlrpclib
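
To see how Python values map onto this wire format, the standard library can marshal and unmarshal messages without touching the network. This is a minimal sketch written for Python 3, where xmlrpclib was renamed xmlrpc.client; the method name and values are taken from the sample above.

```python
import xmlrpc.client

# Marshal a call to examples.getStateName(40) into an XML-RPC request body.
request_xml = xmlrpc.client.dumps((40,), methodname='examples.getStateName')

# Unmarshal it again, as a server would: loads() returns (params, method name).
params, method = xmlrpc.client.loads(request_xml)

# Marshal and unmarshal a response carrying a single string value.
response_xml = xmlrpc.client.dumps(('South Dakota',), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)
```

ServerProxy does the same marshalling behind the scenes and adds the HTTP transport.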

Example: Querying PyPI

>>> import xmlrpclib
>>> from pprint import pprint
>>> client = xmlrpclib.ServerProxy('http://pypi.python.org/pypi')
>>> pprint(client.release_urls('Plone', '4.0.1'))
[{'comment_text': '',
  'downloads': 177,
  'filename': 'Plone-4.0.1.zip',
  'has_sig': False,
  'md5_digest': 'be72596d49295b7207f0a861ee3530ed',
  'packagetype': 'sdist',
  'python_version': 'source',
  'size': 1507065,
  'upload_time': <DateTime '20101004T02:30:01' at 10071a248>,
  'url': 'http://pypi.python.org/packages/source/P/Plone/Plone-4.0.1.zip'}]

More info: http://wiki.python.org/moin/PyPiXmlRpc

Example: wsapi4plone

Provides an XML-RPC interface for interacting with a Plone site.

  • post_object
  • put_object
  • get_object
  • delete_object
  • query
  • get_schema
  • get_types
  • get_workflow
  • set_workflow
  • get_discussion

More info: http://pypi.python.org/pypi/wsapi4plone.core

SOAP

  • "big Web Services" — described by various WS-* W3C standards
  • passes XML-based messages like XML-RPC, but more complicated (can represent complex types)
  • WSDL (Web Services Description Language): machine-readable XML description of the interface
  • In Python:
    • soaplib
    • suds

Sample Request and Response

Sample request:

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="http://cicero.azavea.com/"
                   xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
   <SOAP-ENV:Header/>
   <ns1:Body>
      <ns0:GetOfficialsByAddress>
         <ns0:authToken>FOO</ns0:authToken>
         <ns0:address>1402 3rd Ave</ns0:address>
         <ns0:city>Seattle</ns0:city>
         <ns0:state>WA</ns0:state>
         <ns0:postalCode>98101</ns0:postalCode>
         <ns0:country>US</ns0:country>
         <ns0:districtType>NATIONAL_UPPER</ns0:districtType>
         <ns0:includeAtLarge>false</ns0:includeAtLarge>
      </ns0:GetOfficialsByAddress>
   </ns1:Body>
</SOAP-ENV:Envelope>

Sample response:

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <GetOfficialsByAddressResponse xmlns="http://cicero.azavea.com/">
      <GetOfficialsByAddressResult>
        <ElectedOfficialInfo>
          <ElectedOfficialID>326f9123-4196-49ff-a9ab-cca8194a12a8</ElectedOfficialID>
          <AssemblyName />
          (snip)
          <FirstName>Maria</FirstName>
          <MiddleInitial>E.</MiddleInitial>
          <LastName>Cantwell</LastName>
          (snip)
          <LastUpdateDate>2009-03-26T00:00:00</LastUpdateDate>
        </ElectedOfficialInfo>
      </GetOfficialsByAddressResult>
    </GetOfficialsByAddressResponse>
  </soap:Body>
</soap:Envelope>
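
To get a feel for what a SOAP library saves you, here is a rough sketch of pulling names out of that response by hand with ElementTree (Python 3). The XML is a trimmed copy of the sample response with the snipped fields left out; the prefix "c" is just a local alias chosen for this example.

```python
from xml.etree import ElementTree

# Trimmed copy of the sample response above.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <GetOfficialsByAddressResponse xmlns="http://cicero.azavea.com/">
      <GetOfficialsByAddressResult>
        <ElectedOfficialInfo>
          <FirstName>Maria</FirstName>
          <LastName>Cantwell</LastName>
        </ElectedOfficialInfo>
      </GetOfficialsByAddressResult>
    </GetOfficialsByAddressResponse>
  </soap:Body>
</soap:Envelope>"""

# Every element lives in a namespace, so lookups must be namespace-qualified.
NS = {
    'soap': 'http://schemas.xmlsoap.org/soap/envelope/',
    'c': 'http://cicero.azavea.com/',
}

root = ElementTree.fromstring(SAMPLE)
officials = [
    (info.findtext('c:FirstName', namespaces=NS),
     info.findtext('c:LastName', namespaces=NS))
    for info in root.findall('.//c:ElectedOfficialInfo', NS)
]
```

A WSDL-aware client generates this mapping for you, including the request envelope.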

suds example: Azavea's Cicero API

Azavea provides a web service to look up information about elected officials for a given address:

>>> from suds.client import Client
>>> auth_client = Client('http://cicero.azavea.com/Azavea.Cicero.WebService.v2/AuthenticationService.asmx?WSDL')
>>> token = auth_client.service.GetToken(username, password)

# printing a client lists its service's available methods
>>> client = Client('http://cicero.azavea.com/Azavea.Cicero.WebService.v2/ElectedOfficialQueryService.asmx?WSDL')
>>> print client
Suds ( https://fedorahosted.org/suds/ )  version: 0.4 GA  build: R699-20100913
Service ( ElectedOfficialQueryService ) tns="http://cicero.azavea.com/"
   Prefixes (2)
      ns0 = "http://cicero.azavea.com/"
      ns1 = "http://microsoft.com/wsdl/types/"
   Ports (2):
      (ElectedOfficialQueryServiceSoap)
         Methods (11):
            GetOfficialsByAddress(xs:string authToken, xs:string address, xs:string city, xs:string state, xs:string postalCode, xs:string country, xs:string districtType, xs:boolean includeAtLarge, )
(snip)

>>> officials = client.service.GetOfficialsByAddress(token, '1402 3rd Ave', 'Seattle', 'WA', '98101', 'US', 'NATIONAL_UPPER', True)
>>> officials.ElectedOfficialInfo[0].FirstName
'Maria'
>>> officials.ElectedOfficialInfo[0].LastName
'Cantwell'

RESTful APIs

  • reaction to "big web services"
  • resource-oriented
  • encourages direct use of features of HTTP (request methods, passing parameters in query string, caching, etc.)
  • response representations may vary. XML and JSON are common.
  • in Python:
    • urllib/urllib2 for transfer (stdlib)
    • ElementTree (stdlib), lxml, or some other XML library to parse XML
    • json (stdlib) to parse JSON
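
The building blocks listed above fit together roughly like this. It is a sketch for Python 3 (where urlencode lives in urllib.parse); the endpoint, parameters, and JSON body are made up for illustration, and the canned body stands in for what a real HTTP response might contain.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters, for illustration only.
base_url = 'https://api.example.com/events'
query = urlencode({'city': 'Seattle', 'limit': 5})
url = base_url + '?' + query

# Canned response body standing in for the bytes read from the server.
body = b'{"events": [{"id": 1, "title": "My Event"}]}'
data = json.loads(body.decode('utf-8'))
titles = [event['title'] for event in data['events']]
```

Letting urlencode build the query string takes care of escaping spaces and other special characters in parameter values.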

Example: Brown Paper Tickets API

Brown Paper Tickets provides an API for listing and registering for events.

Sample Response:

<?xml version="1.0"?>
<document>
<result>success</result>
<resultcode>000000</resultcode>
<note></note>
<event>
      <title>My Event</title>
      <link>http://www.brownpapertickets.com/event/120141</link>
      <description>blah blah blah</description>
      <event_id>120141</event_id>
      <live>y</live>
      (snip)
</event>
</document>

We can make a request to this service using urllib and parse the response using ElementTree:

>>> from urllib import urlencode, urlopen
>>> from xml.etree import ElementTree
>>> url = 'https://www.brownpapertickets.com/api2/eventlist?id=foo&client=bar'
>>> res = urlopen(url).read()
>>> tree = ElementTree.fromstring(res)
>>> for node in tree.findall('event'):
...     title = node.find('title').text
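
A slightly fuller sketch of the parsing step (Python 3), run against a trimmed copy of the sample response so it works offline. It assumes that a result of "success" is how this API signals a good call, as the sample suggests; parse_events is a name invented for this example.

```python
from xml.etree import ElementTree

# Trimmed copy of the sample response above.
SAMPLE = b"""<?xml version="1.0"?>
<document>
<result>success</result>
<resultcode>000000</resultcode>
<note></note>
<event>
      <title>My Event</title>
      <link>http://www.brownpapertickets.com/event/120141</link>
      <event_id>120141</event_id>
      <live>y</live>
</event>
</document>"""

def parse_events(xml_bytes):
    """Return a list of {'id', 'title'} dicts, or raise if the call failed."""
    tree = ElementTree.fromstring(xml_bytes)
    # Assumption: anything other than 'success' means the request failed.
    if tree.findtext('result') != 'success':
        raise ValueError('API call failed: %s' % tree.findtext('note'))
    return [
        {'id': node.findtext('event_id'), 'title': node.findtext('title')}
        for node in tree.findall('event')
    ]

events = parse_events(SAMPLE)
```

Checking the status field before reading the payload keeps a failed call from silently looking like an empty event list.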

Authentication

  • Token or API key in request
  • HTTP basic authentication
  • AuthSub & OAuth (requires callback to receive token)
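
As an illustration of the second option, HTTP basic authentication is just a base64-encoded "username:password" pair sent in the Authorization header. A minimal sketch for Python 3; the URL and credentials are placeholders, and basic_auth_request is a helper invented for this example.

```python
import base64
import urllib.request

def basic_auth_request(url, username, password):
    """Build a Request carrying an HTTP basic Authorization header."""
    credentials = ('%s:%s' % (username, password)).encode('utf-8')
    token = base64.b64encode(credentials).decode('ascii')
    request = urllib.request.Request(url)
    request.add_header('Authorization', 'Basic ' + token)
    return request

# Placeholder URL and credentials; nothing is sent until urlopen(request).
request = basic_auth_request('https://api.example.com/events', 'alice', 's3cret')
```

Base64 is an encoding, not encryption, so this should only be sent over HTTPS.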

What could go wrong

  • call times out
  • call fails
  • call succeeds but other code fails
  • ZODB conflict errors
  • long external calls tie up publisher threads

Timeouts & Deadlocks

  • Python's default socket timeout is None (forever), which is pathological
  • Can override with socket.setdefaulttimeout(), then catch socket.timeout
  • But note that it is a global setting, not per-thread
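
One way around the global setting is to pass a timeout for the individual connection instead; socket.create_connection() and urlopen() both accept a timeout argument. The sketch below (Python 3) demonstrates this against a local listener that never answers, standing in for an unresponsive web service; read_with_timeout is a name invented for this example.

```python
import socket

def read_with_timeout(host, port, timeout):
    """Return the server's reply, or None if it does not answer in time."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None
    finally:
        sock.close()

# A local listener that never sends anything. The kernel completes the
# TCP handshake, so the client connects fine and then waits for data.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))
listener.listen(1)
host, port = listener.getsockname()

reply = read_with_timeout(host, port, timeout=0.2)
listener.close()
```

Because the timeout is set on this one socket, other code in the same process keeps whatever timeout behaviour it had.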

Transactions

  • In Zope, we're used to transactions being handled automatically:
    • new transaction for each request
    • resource manager exists for common resources
    • two-phase commit ensures atomicity
  • But web services are not generally transactional
  • Can do the non-transactional (e.g. web service) calls last:
    • If something local fails, exception will cause transaction to abort
    • Web service call never happens
    • Can use transaction.get().addAfterCommitHook() for this (the hook receives a flag saying whether the commit succeeded)
  • What if we need to write something locally based on a response from a web service? (e.g. payment authorization)
    • If something fails _after_ the web service call, the transaction will abort but the web service call can't be undone
    • Workaround: Catch exceptions and make a new webservice call to undo the effect of the first (but what if _that_ fails?)
    • Workaround: Catch exceptions and log them (make sure your logging is foolproof!)
    • Workaround: Use an asynchronous task system like zc.async to queue the second part as a separate job that can be retried if it fails
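
The first two workarounds can be combined, roughly as sketched below. The callables authorize, void, and record_locally are hypothetical stand-ins for a payment service's "authorize" call, its matching "undo" call, and the local write; none of them comes from a real library.

```python
import logging

logger = logging.getLogger('payments')

def charge_and_record(authorize, void, record_locally, amount):
    """Call the web service, then write locally; compensate if the write fails."""
    auth_id = authorize(amount)            # remote call: cannot be rolled back
    try:
        record_locally(auth_id, amount)    # local write: may raise
    except Exception:
        try:
            void(auth_id)                  # compensating remote call
        except Exception:
            # The undo itself failed: leave a trail for manual cleanup.
            logger.exception('Could not void authorization %s', auth_id)
        raise                              # let the transaction abort as usual
    return auth_id

# Demonstration with fakes: the local write fails, so the charge is voided.
calls = []

def fake_authorize(amount):
    calls.append(('authorize', amount))
    return 'AUTH-1'

def fake_void(auth_id):
    calls.append(('void', auth_id))

def failing_record(auth_id, amount):
    raise RuntimeError('local write failed')

try:
    charge_and_record(fake_authorize, fake_void, failing_record, 10)
except RuntimeError:
    pass
```

Re-raising the original exception matters: the local transaction should still abort even when the compensation succeeds.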

ConflictErrors

A ConflictError occurs when connection A tries to commit changes to an object that another transaction (from connection B) has modified since connection A loaded it.

Even worse variant of the last case:

  1. Web service call succeeds
  2. Write to ZODB fails with a ConflictError
  3. ZPublisher RETRIES THE REQUEST -- and the web service gets called again
  4. Much weeping and gnashing of teeth

This is exacerbated by the fact that remote web service calls tend to be slow, which makes transactions last longer and increases the risk of conflicts.

Possible ways to mitigate:

  • Set the request's retry_max_count attribute to 0 (conflicting requests will fail hard instead of getting retried silently, but sometimes that's better)
  • Handle remote calls as zc.async jobs, so they take place in a separate transaction
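
The retry problem and the first mitigation can be reproduced with a toy model. Everything here is a stand-in: ConflictError mimics the ZODB exception, publish() is a simplified imitation of a publisher's retry loop rather than ZPublisher's actual code, and the "web service" is a list that records each call.

```python
class ConflictError(Exception):
    """Stand-in for ZODB's ConflictError, so the sketch runs without ZODB."""

def publish(view, retry_max_count=3):
    """Toy publisher loop: re-run the whole view after a conflict."""
    retries = 0
    while True:
        try:
            return view()
        except ConflictError:
            if retries >= retry_max_count:
                raise
            retries += 1

def make_view(remote_calls, conflicts):
    """Build a view that calls the 'web service', then 'commits'."""
    remaining = [conflicts]
    def view():
        remote_calls.append('charge card')  # side effect that cannot be undone
        if remaining[0] > 0:
            remaining[0] -= 1
            raise ConflictError()           # commit fails; local writes are lost
        return 'ok'
    return view

# With retries, one conflict means the remote call happens twice.
calls_with_retry = []
result = publish(make_view(calls_with_retry, conflicts=1))

# With retries disabled, the request fails hard after a single remote call.
calls_without_retry = []
try:
    publish(make_view(calls_without_retry, conflicts=1), retry_max_count=0)
    failed_hard = False
except ConflictError:
    failed_hard = True
```

The request still fails in the second case, but the remote side effect happens only once, which is usually easier to recover from.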

Tools and techniques

Maintaining a pool of clients

  • Generally want one client object per thread, to store the session token and avoid reinitializing
  • Store in foreign_connections dictionary, an attribute of the ZODB connection
  • Use _v_ attributes only as a fallback
  • See "How This Package Maintains Persistent Connections" at http://pypi.python.org/pypi/alm.solrindex
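
A rough sketch of that lookup order, loosely modeled on the approach described above rather than copied from alm.solrindex. The get_client helper, the _v_client attribute name, and the stand-in classes are all invented for this example; it relies on each ZODB connection being used by one thread at a time.

```python
def get_client(obj, factory):
    """Return a cached client for obj, creating one with factory() if needed."""
    jar = getattr(obj, '_p_jar', None)
    oid = getattr(obj, '_p_oid', None)
    if jar is None or oid is None:
        # Not stored in the ZODB yet: fall back to a volatile attribute.
        client = getattr(obj, '_v_client', None)
        if client is None:
            client = obj._v_client = factory()
        return client
    # Normal case: keep the client on the ZODB connection, keyed by object id.
    clients = getattr(jar, 'foreign_connections', None)
    if clients is None:
        clients = jar.foreign_connections = {}
    if oid not in clients:
        clients[oid] = factory()
    return clients[oid]

class FakeConnection(object):
    """Stand-in for a ZODB connection (normally found at obj._p_jar)."""

class Tool(object):
    """Stand-in for a persistent object that talks to a web service."""
    _p_jar = None
    _p_oid = None

created = []

def make_client():
    client = object()
    created.append(client)
    return client

tool = Tool()
tool._p_jar = FakeConnection()
tool._p_oid = b'\x00' * 8
first = get_client(tool, make_client)
second = get_client(tool, make_client)   # reuses the client created above
```

Keying by object id lets several tools share one connection without sharing a client.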

Server-side caching

  • Goal: improve performance and reduce API usage by cutting out unnecessary web service requests
  • plone.memoize provides various decorators to cache method results in different ways
    • on the request object
    • in an object attribute
    • in an object volatile attribute
    • in a global RAM cache
  • Can use something like time.time() // 3600 in the cache key to expire after no more than some given interval.
  • But remember there are only two hard problems in computer science: Naming things, cache invalidation, and off-by-one errors.
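
The time-bucket trick can be shown with a toy memoizer. This is not plone.memoize; cached and hourly_key are invented for this example, and the clock is injectable so the demonstration can simulate an hour passing.

```python
import functools
import time

def cached(get_key):
    """Toy memoizer: store results in a dict under get_key(*args)."""
    def decorator(func):
        store = {}
        @functools.wraps(func)
        def wrapper(*args):
            key = get_key(*args)
            if key not in store:
                store[key] = func(*args)
            return store[key]
        return wrapper
    return decorator

def hourly_key(clock=time.time):
    """Build a key function whose keys change once per hour."""
    def get_key(*args):
        return args + (clock() // 3600,)
    return get_key

now = [1000.0]   # fake clock value, in seconds, for the demonstration
lookups = []     # records each real (uncached) lookup

@cached(hourly_key(clock=lambda: now[0]))
def officials_for(address):
    lookups.append(address)
    return 'officials for ' + address

officials_for('1402 3rd Ave')   # cache miss: performs the lookup
officials_for('1402 3rd Ave')   # cache hit: same hour bucket
now[0] += 3600                  # an hour passes
officials_for('1402 3rd Ave')   # new bucket: performs the lookup again
```

Entries expire at the bucket boundary, so a value cached late in the hour lives only a few minutes; and this toy never evicts old buckets, which a real cache would need to do.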

Asynchronous Loading

If you're fetching something from a remote server for display as part of a web page, load it in a separate request after the main page loads.

Advantages:

  • Keeps the site from being perceived as slow
  • Separates long-running remote calls from write transactions, so the risk of ConflictErrors is reduced.
  • Sometimes the browser's JavaScript can load directly from the external service instead of hitting your server.

Example: collective.googleanalytics

jQuery makes it easy:

jq(function () {
    jq('#analytics--1027659344').load('http://davisagli.com/blog/plone-4-in-the-news/@@analytics_async', {
       'report_ids': 'page-pageviews-sparkline,page-top-keywords-table',
       'profile_ids': 'ga:31264872',
       'request_url': 'http://davisagli.com/blog/plone-4-in-the-news',
       'date_range': 'month'
    }, function () {
        jq('#analytics--1027659344').css({
                'background-image': 'none',
                'height': 'auto'
            });
    });
});

Asynchronous Processing

plone.async.core makes it easy to queue asynchronous jobs to run via the zc.async infrastructure.

As discussed above, this can be used to avoid blocking the ZPublisher threads and keeping transactions open when long-running external calls are needed.

Simple usage example:

from Products.Five import BrowserView
import zc.async.job
from plone.async.core import getQueues
from time import sleep

def do_work(count):
    for x in xrange(1, count):
        sleep(0.1)
    print 'done!'

class Work(BrowserView):

    def __call__(self):
        queue = getQueues()['']
        job = zc.async.job.Job(do_work, 100)
        queue.put(job)
        # returns immediately; job runs in another thread

Testing Web Services

Some approaches:

  • use a real connection (full system test)
  • connect to a stub server (for an example, see zc.authorizedotnet)
  • inject responses to particular calls (unit test)
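
The third approach might look like the following sketch, which uses unittest.mock from the Python 3 standard library to replace urlopen with a canned response. The function get_event_titles, the URL, and the canned XML are invented for this example.

```python
import urllib.request
from unittest import mock
from xml.etree import ElementTree

def get_event_titles(url):
    """Fetch an event list and return the event titles."""
    body = urllib.request.urlopen(url, timeout=10).read()
    tree = ElementTree.fromstring(body)
    return [node.findtext('title') for node in tree.findall('event')]

# Canned body standing in for the remote service's response.
CANNED = b"<document><event><title>My Event</title></event></document>"

def run_with_injected_response():
    """Call get_event_titles() with urlopen replaced by a fake."""
    fake_response = mock.Mock()
    fake_response.read.return_value = CANNED
    with mock.patch('urllib.request.urlopen',
                    return_value=fake_response) as fake_urlopen:
        titles = get_event_titles('https://example.invalid/eventlist')
    # The code under test should have made exactly one remote call.
    fake_urlopen.assert_called_once()
    return titles

titles = run_with_injected_response()
```

Tests like this run quickly and offline, but they only prove the code handles the responses you thought to inject, so they complement rather than replace a full system test.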

Serving webservices from Zope
