4Suite is a platform for XML and RDF applications. A portion of 4Suite, the repository, is an XML and RDF database management system. RDF is Resource Description Framework, an extensible system for managing metadata. Among the features the 4Suite repository offers are:
4Suite is particularly suited for building Web applications with XML technologies. It allows you to store, index, transform and render XML documents. I mentioned the various APIs available for 4Suite. There are other articles listed in the Resources section that focus on the command line, Web and XSLT APIs. In this article I take a closer look at the Python API. 4Suite is implemented in Python and C, and pretty much all the features of the repository are available to Python code. This is handy for scripted processing in 4Suite apps, integration into other Python tools, extension of 4Suite's capabilities and even for rapid access to and maintenance of the repository using Python's interactive prompt.
Because 4Suite is written mostly in Python, you can pretty much customize it to your heart's content using Python code, as long as you have the right permissions to access the application source code. But in this article, I shall stick to the official Python API, also known as Client Core (CCore).
You should be familiar with Python and somewhat familiar with XML and related technologies before reading this article. If you want to try the code examples yourself, you should have 4Suite installed and a repository instance initialized as described in the UNIX or Windows install guides. You should also have at least skimmed the repository quick start guide, and set up a non-super user with a home folder as recommended in that document.
All objects in 4Suite are available through CCore as Python proxy objects. These are arranged into a hierarchy with a proxy object representing the repository itself at the root, all top level folders (also known as containers) below the root, all subfolders below the top-level folders, and so on. This is similar to a file system metaphor on most operating systems. The first step in working with CCore is to get the repository proxy object, which you do by logging in. The following interactive Python session illustrates this:
>>> import sha
>>> from Ft.Server.Client import Core
>>> pw_hash = sha.new("uo").hexdigest()
>>> repo = Core.GetRepository("uo", pw_hash, "localhost", 8803)
This code works for the case where the username is "uo" and the password is "uo", accessing the local machine on the standard port. Adjust accordingly for your circumstances. Most of what you need to access the repository will be imported from the Ft.Server module. The GetRepository function takes the authentication and network information and creates a repository proxy object. You pass the user name and the SHA password hash, which I compute using the standard sha module. You get back a repository proxy object.
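The sha module used in the session was deprecated in Python 2.5 in favor of hashlib and does not exist in Python 3. The following is a minimal sketch of the same hash computation with hashlib, assuming the repository expects the hex SHA-1 digest of the password exactly as in the session above. The helper name password_hash is my own, not part of 4Suite.

```python
import hashlib

def password_hash(password):
    """Return the hex SHA-1 digest of a password (hypothetical helper)."""
    # hashlib works on bytes, so encode text passwords first.
    if not isinstance(password, bytes):
        password = password.encode("utf-8")
    return hashlib.sha1(password).hexdigest()

# Produces the same digest as sha.new("uo").hexdigest() in the session above.
pw_hash = password_hash("uo")
```

The resulting pw_hash would be passed to Core.GetRepository() in place of the value computed with the sha module.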
There are a lot of methods you can invoke on the repo object, as you can see by running dir(repo). But conveniently, you can access this object as a dictionary where the keys are the names of the child resources and the values are the proxy objects for each child resource. A resource is any object that is managed in the repository, including containers, XML files, raw (non-XML) files, and other things.
>>> repo.keys()
[u'web', u'ftss', u'home']
This shows that I have three top-level resources in the repository. Your own case may vary. Examine the ftss resource a bit more closely:
>>> obj = repo['ftss']
>>> obj
<Ft.Server.Client.Core.ContainerClient.ContainerClient instance at 0x81c90c4>
>>> obj.keys()
[u'servers', u'docs', u'dashboard', u'commands', u'demos', u'data', u'docdefs',
u'groups', u'users']
The object repr tells us that the ftss object is a container object, or more accurately a container client proxy object. You can also go further and look at the contents of obj['data'], which is itself a container, and the contents of containers can be accessed using dictionary idiom as well. You will find many more resources, most of which are now actual files. The following code displays the contents of the XML resource identified by the repository path /ftss/data/null.
>>> obj = repo['ftss']['data']['null']
>>> obj.getContent()
'<null/>\n'
The getContent() method retrieves the contents of the resource, which is XML in this case because the resource is an XML document. It can also be the contents of a raw file (anything from HTML to a JPEG to a ZIP file). All resources in the repository have a standard content view. If you invoke getContent() on a container, you'll see an XML-ized view of its entries.
You needn't always use dictionary access to navigate the repository. It also supports navigating local paths in the resource hierarchy. The following code fetches the null resource in a way generally equivalent to repo['ftss']['data']['null']:
>>> obj = repo.fetchResource('ftss/data/null')
>>> obj.getContent()
'<null/>\n'
You can also invoke fetchResource() on other objects to navigate relative to those objects. Any path that starts with "/" is absolute and is effectively fetched relative to the repository itself. You can also get the absolute path of any resource.
>>> c1 = repo.fetchResource('ftss/data')
>>> c1.getAbsolutePath()
u'/ftss/data'
>>> c2 = c1.fetchResource('..')
>>> c2.getAbsolutePath()
u'/ftss'
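To make the path rules concrete, here is a toy model of a resource hierarchy that supports dictionary access, relative paths, "..", and absolute paths starting with "/". It is only an illustration of the navigation behavior described above, not how 4Suite implements its proxy objects. The ToyContainer class and its add() and root() methods are hypothetical names, and the sketch runs on a current Python interpreter without 4Suite installed.

```python
class ToyContainer(object):
    """Stand-in for a container proxy; NOT the 4Suite implementation."""

    def __init__(self, name="", parent=None):
        self.name = name
        self.parent = parent
        self.children = {}

    def add(self, name):
        """Create and return a child container (hypothetical helper)."""
        child = ToyContainer(name, self)
        self.children[name] = child
        return child

    def keys(self):
        return sorted(self.children)

    def __getitem__(self, name):
        return self.children[name]

    def root(self):
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def fetchResource(self, path):
        # A leading "/" means: start from the root of the hierarchy.
        node = self.root() if path.startswith("/") else self
        for step in path.split("/"):
            if step in ("", "."):
                continue
            if step == "..":
                # Stay at the root if there is no parent to move to.
                node = node.parent if node.parent is not None else node
            else:
                node = node[step]
        return node

    def getAbsolutePath(self):
        parts = []
        node = self
        while node.parent is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))

# Build a tiny hierarchy that mirrors /ftss/data/null.
toy_repo = ToyContainer()
data = toy_repo.add("ftss").add("data")
data.add("null")
```

In this model, toy_repo.fetchResource("ftss/data/null") and toy_repo["ftss"]["data"]["null"] return the same object, which is the equivalence the sessions above demonstrate against the real repository.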
So far all these operations are read-only, but you can also update the repository. For example, I create a simple XML file in my home directory as follows:
>>> DOC = u"""<?xml version="1.0" encoding="UTF-8"?>
... <verse>
... <attribution>Wole Soyinka</attribution>
... <line>Traveller, you must set out</line>
... <line>At dawn. And wipe your feet upon</line>
... <line>The dog-nose wetness of the earth</line>
... </verse>
... """
>>> home_folder = repo.fetchResource('home/uo')
>>> new_doc = home_folder.createDocument('dawn.xml', DOC, imt='text/xml')
The createDocument() method on container objects creates an XML document by default. You can create specialized XML documents such as XSLT stylesheets optimized for transforms using additional options. The first parameter is the name of the document to be created, the second is the content of the document, a Python Unicode object. I explicitly specify the internet media type (IMT) of the resource. The repository keeps careful track of the IMT of resources because they are needed in Web environments, and for other considerations. The return value from createDocument() is a proxy object for the newly created resource.
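Because createDocument() parses the content as XML, it can be convenient to catch markup mistakes locally before sending a document to the repository. The following is a minimal sketch using the standard library's ElementTree parser. The helper name is_well_formed is my own, and the check only covers well-formedness; it makes no claim about how the repository itself reports a bad document.

```python
import xml.etree.ElementTree as ET

DOC = u"""<?xml version="1.0" encoding="UTF-8"?>
<verse>
<attribution>Wole Soyinka</attribution>
<line>Traveller, you must set out</line>
</verse>
"""

def is_well_formed(text):
    """Return True if text parses as XML (hypothetical helper)."""
    try:
        # Parse UTF-8 bytes so the data matches the encoding declaration.
        ET.fromstring(text.encode("utf-8"))
    except ET.ParseError:
        return False
    return True

good = is_well_formed(DOC)
bad = is_well_formed(u"<verse><line>unclosed</verse>")
```

Here good is True and bad is False, so a script could skip the createDocument() call for any content that fails the check.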
You can perform all sorts of XML processing operations on the new document. The following example applies one of the XSLT stylesheets that comes with 4Suite.
>>> xslt_obj = repo.fetchResource('ftss/data/decorated-xml.xslt')
>>> transform_result = new_doc.applyXslt([xslt_obj])
First I get a proxy object for the XSLT stylesheet I want to apply. Then I invoke the applyXslt() method on the source document. This method takes a list of stylesheets, so though I have only one, I put it into a list. The result is a tuple of which the first item is a string buffer with the transform output. The second item is the IMT of the result. I do not show the result because of its length, but do try it yourself and see. The transform result is a pretty HTML view of the XML document similar to the well-known Internet Explorer 5 view of an XML document.
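Since the result is a two-item tuple, a natural next step is to unpack it and save the output somewhere you can open it in a browser. The sketch below does that for a stand-in tuple shaped like the one described above, so it runs without a repository. The save_result helper, the sample content, and the dawn.html file name are all assumptions of mine.

```python
import os
import tempfile

def save_result(result, path):
    """Write a (content, imt) pair to path and return the IMT (hypothetical helper)."""
    content, imt = result  # first item: output buffer, second: media type
    if not isinstance(content, bytes):
        content = content.encode("utf-8")
    with open(path, "wb") as out:
        out.write(content)
    return imt

# Stand-in value shaped like the tuple described above; a live session
# would pass transform_result instead.
sample_result = ("<html><body>verse</body></html>", "text/html")
target = os.path.join(tempfile.gettempdir(), "dawn.html")
imt = save_result(sample_result, target)
```

In a live session you would call save_result(transform_result, target) and then check the returned IMT before deciding how to display the file.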
To create a non-XML resource in the repository, you must use a special method since createDocument() tries to parse the given contents as XML. The following adds to the repository an image file from a remote Web site.
>>> import urllib
>>> url = urllib.urlopen('http://4suite.org/include/4Suite-org.png')
>>> image_data = url.read()
>>> home_folder.createRawFile('4Suite-org.png', 'image/png', image_data)
<Ft.Server.Client.Core.RawFileClient.RawFileClient instance at 0x85bcbec>
createRawFile() takes the new resource name, an IMT, and then the data for the resource, the PNG file body in this case. The creation methods are only available on container objects and the repository itself.
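If you add many raw files, you may prefer to derive the IMT from the file name rather than typing it each time. Here is a minimal sketch using the standard mimetypes module. The raw_file_args helper and the fallback to application/octet-stream are my own choices, not 4Suite behavior, and the byte string is just the PNG signature standing in for real image data.

```python
import mimetypes

def raw_file_args(name, data):
    """Return (name, imt, data) in the order createRawFile() takes them
    (hypothetical helper)."""
    imt, _encoding = mimetypes.guess_type(name)
    if imt is None:
        imt = "application/octet-stream"  # generic fallback for unknown types
    return name, imt, data

# Stand-in data: the 8-byte PNG signature rather than a downloaded image.
args = raw_file_args("4Suite-org.png", b"\x89PNG\r\n\x1a\n")
# In a live session: home_folder.createRawFile(*args)
```

For the file name above the guessed IMT is image/png, matching the value passed by hand in the session.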
The repository is fully transactional. If I were to end this Python session right now, all my changes would be lost. In order to save my changes, I have to commit the transaction:
>>> repo.txCommit()
You can also use repo.txRollback() to discard the changes. Once you have ended a transaction in either way, you must not use the repo object any more or you'll get an error.
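In scripts it is easy to forget one of these calls, so one option is to wrap the work in a context manager that commits on success and rolls back on an exception. The following is a sketch of that pattern, not something 4Suite provides. The transaction helper and the FakeRepo stand-in are hypothetical; only the txCommit() and txRollback() method names come from the repository API shown above.

```python
from contextlib import contextmanager

@contextmanager
def transaction(repo):
    """Commit on success, roll back on any exception (hypothetical helper)."""
    try:
        yield repo
    except Exception:
        repo.txRollback()
        raise
    else:
        repo.txCommit()

class FakeRepo(object):
    """Stand-in that records calls; a live session would use the repo proxy."""

    def __init__(self):
        self.calls = []

    def txCommit(self):
        self.calls.append("commit")

    def txRollback(self):
        self.calls.append("rollback")

ok = FakeRepo()
with transaction(ok):
    pass  # create or update resources here

failed = FakeRepo()
try:
    with transaction(failed):
        raise ValueError("simulated failure")
except ValueError:
    pass
```

After these runs, ok.calls holds a single commit and failed.calls a single rollback. With a real repository, remember that the repo object is no longer usable once the with block exits, so you would log in again for further work.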
I hope this walkthrough of the Python API to the 4Suite repository is enough to get you started. There are many other details and capabilities I did not cover. Many of them build on these basics in a straightforward manner. As you can see, accessing the 4Suite repository from Python is very easy. 4Suite already provides the basic tools for building Web-based applications with XML technologies. The Python API adds a rich dimension of additional capabilities.