Java & Python REST Services (v0.3)


This page documents the various REST services that I develop. Currently, I describe the Java services (developed using Restlet) mainly providing cheminformatics functionality via the CDK. The Java services come as a single JAR file which includes a small web server, allowing for easy deployment. See Build & Install for more details.

I also provide mod_python based services. See here for more details.

Currently, this site does not host the services. See Servers to get a list of servers that host these services. The documentation for each service is independent of the actual server hosting the service.

The documentation for the various resources is currently manually generated. As a result, there's no practical way to generate code to use these services. In the future I'll probably provide a WADL document for the services. Of course, it is reminiscent of WSDL, which was something I wanted to move away from. See here for an interesting discussion.

Downloads

The CDK based Java services can be obtained in binary or source forms. The binary form (cdkrest.jar) can be directly run from the command line to deploy the REST services.

If you would like to modify the services you can download the sources as cdkrest-src-0.5.tgz or cdkrest-src-0.5.zip. The code is licensed under the LGPL.

Build & Install

You can regenerate the CDK REST services JAR file from the sources using ant (version 1.7.1 or later is required, along with ant-contrib). After installing ant, you should be able to run
ant rest
To see other options do
ant info
The JAR file contains the services as well as the Noelios restlet engine to host the services. This makes hosting the Java based services very easy since all you need to do is to run the JAR file. By default, it will start a server located at http://localhost:8182. You will likely want to start it on a specific host and port as shown below
java -jar cdkrest.jar -p 6666 -s my.own.host -l services.log
Note that this will block a terminal, so you will likely want to use nohup and run it in the background. You can then visit any of the services described below, for example
http://my.own.host:6666/cdk/fingerprint/CCOCCCO
For this example you should see a single line of 1's and 0's in your browser.
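
The same request can be made from a script. The following is a minimal Python 3 sketch, assuming that "appropriately quoted" means ordinary URL percent-encoding of the SMILES string; the function names fingerprint_url and fetch_fingerprint are hypothetical helpers, not part of the services.

```python
from urllib.parse import quote
from urllib.request import urlopen


def fingerprint_url(base, smiles):
    """Build the GET URL for the default fingerprint of one SMILES string."""
    # safe="" percent-encodes every reserved character, including "/" and "#",
    # which would otherwise be read as a path separator or a URL fragment.
    # Assumption: the server decodes percent-encoded path segments.
    return base.rstrip("/") + "/cdk/fingerprint/" + quote(smiles, safe="")


def fetch_fingerprint(base, smiles, timeout=10):
    """Request the fingerprint and return the body as a string of 1s and 0s."""
    with urlopen(fingerprint_url(base, smiles), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()
```

With a server running as shown above, fetch_fingerprint("http://my.own.host:6666", "CCOCCCO") should return the same line of 1's and 0's that the browser displays.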

Note that the depiction service provided by cdkrest.jar employs Swing and hence requires the X11 windowing system. If you plan to run these on a headless server an easy fix is to run a virtual framebuffer (such as Xvfb). In this case, the start up would look like:

Xvfb :2 -screen 1 800x600x16 &
export DISPLAY=:2.1
nohup java -jar cdkrest.jar -p 6666 -s my.own.host -l services.log &
In addition, it appears that the code may try to bind an IPv6 socket to an IPv4 address (as reported by Ola Spjuth). In such a case, adding
-Djava.net.preferIPv4Stack=true
to the command line should let it work.

Installation instructions for Gentoo Linux have been provided by Anders Lovgren and Ola Spjuth.

Documentation

Quick list of available resources

/cdk
/cdk/depict
/cdk/depict/{smiles}
/cdk/depict/{width}/{height}/{smiles}
/cdk/descriptors
/cdk/descriptors/{smiles}
/cdk/descriptor/{klass}
/cdk/descriptor/{klass}/{smiles}
/cdk/descriptor/tpsa/{smiles}
/cdk/descriptor/xlogp/{smiles}
/cdk/mw/{smiles}
/cdk/mf/{smiles}
/cdk/fingerprint
/cdk/fingerprint/{smiles}
/cdk/fingerprint/{type}/{smiles}
/cdk/substruct/{target}/{query}
/cdk/substruct

The convention employed by the following descriptions is as follows

Depiction

Function: Return 2D depictions of SMILES input
Accept: *
Return type: image/jpeg
URL: [GET] /cdk/depict/{smiles}
  {smiles} should be a valid SMILES string or a Base64 encoded SMILES string (appropriately quoted). The resultant image will be 200 pixels wide and 200 pixels high.
URL: [GET] /cdk/depict/{width}/{height}/{smiles}
  {smiles} should be a valid SMILES string or a Base64 encoded SMILES string (appropriately quoted). {width} and {height} should be integer values defining the width and height of the resultant image.
URL: [POST] /cdk/depict
  POST requests to this URL should include at least one form variable called molecule containing a SMILES or SDF string. No Base64 encoding is required. If no other form variable is specified, the resultant image is 200x200. To get other dimensions, specify width and height as form variables containing the desired width and height as integers.
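
As an illustration of the POST form described above, here is a minimal Python 3 sketch. The form variable names molecule, width and height come from the description; the helper names depict_request and save_depiction and the output file name are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def depict_request(base, molecule, width=None, height=None):
    """Build a POST request for /cdk/depict from a SMILES or SDF string."""
    form = {"molecule": molecule}
    # Only send dimensions when both are given; otherwise the service
    # falls back to its documented 200x200 default.
    if width is not None and height is not None:
        form["width"] = int(width)
        form["height"] = int(height)
    data = urlencode(form).encode("ascii")  # form-encoded body, as bytes
    return Request(base.rstrip("/") + "/cdk/depict", data=data)


def save_depiction(base, molecule, path, width=None, height=None, timeout=30):
    """Send the request and write the returned JPEG bytes to a file."""
    with urlopen(depict_request(base, molecule, width, height), timeout=timeout) as resp:
        with open(path, "wb") as out:
            out.write(resp.read())
```

For example, save_depiction("http://localhost:8182", "c1ccccc1", "benzene.jpg", 300, 300) would write a 300x300 image, assuming a server is running on the default port.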

Descriptors

Function: List or calculate molecular descriptors
Accept: text/plain, *
Return type: text/plain, text/xml
URL: [GET] /cdk/descriptors
  If text/plain is specified in the "Accept" header, the return is a plain text document with each descriptor resource on a separate line. Otherwise the return is an XML document with multiple specification-ref elements, each with an href attribute that links to the individual descriptor specification resource. See below for more details.
URL: [GET] /cdk/descriptors/{smiles}
  {smiles} should be a valid SMILES string. If text/plain is specified in the "Accept" header, the return is a plain text document with each descriptor value resource on a separate line. Otherwise the return is an XML document with multiple descriptor-ref elements, each with an href attribute that links to the individual descriptor value resource. See below for more details.
URL: [GET] /cdk/descriptor/{klass}
  {klass} should be a fully qualified CDK molecular descriptor class. A list of classes can be found here or, alternatively, obtained from the service URL shown above. This service ignores the "Accept" header and the Content-type of the result is always text/xml. The XML result is a "descriptor specification" that provides information regarding the descriptor implementation.
URL: [GET] /cdk/descriptor/{klass}/{smiles}
  {klass} should be a fully qualified CDK molecular descriptor class. A list of classes can be found here or, alternatively, obtained from the service URL shown above. {smiles} should be a valid SMILES string. This service ignores the "Accept" header and the Content-type of the result is always text/xml. The result is a document listing the value(s) of the descriptor (with the names of each descriptor calculated) for the specified molecule.
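
The effect of the "Accept" header on the two /cdk/descriptors resources can be sketched in Python 3 as follows. This is only an illustration: the helper names are hypothetical, and percent-encoding the SMILES in the path is an assumption about what the server accepts.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def descriptor_list_request(base, smiles=None, plain=True):
    """Build a GET request for /cdk/descriptors or /cdk/descriptors/{smiles}."""
    url = base.rstrip("/") + "/cdk/descriptors"
    if smiles is not None:
        # Assumption: the SMILES is percent-encoded as a single path segment.
        url += "/" + quote(smiles, safe="")
    # text/plain gives one resource per line; anything else gives XML.
    headers = {"Accept": "text/plain" if plain else "text/xml"}
    return Request(url, headers=headers)


def resource_lines(body):
    """Split a text/plain response into one resource per line, skipping blanks."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def list_descriptors(base, smiles=None, timeout=10):
    """Fetch the plain text listing and return it as a list of resource links."""
    with urlopen(descriptor_list_request(base, smiles), timeout=timeout) as resp:
        return resource_lines(resp.read().decode("utf-8"))
```

Each entry returned by list_descriptors can then be requested on its own to obtain the text/xml specification or value document.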

Fingerprints

Function: Return fingerprints for one or more molecules
Accept: *
Return type: text/plain
URL: [GET] /cdk/fingerprint/{smiles}
  {smiles} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted). Will return a binary string (1's & 0's) representation of the CDK path based fingerprint. The length of the string indicates the size of the fingerprint.
URL: [GET] /cdk/fingerprint/{type}/{smiles}
  {smiles} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted). {type} can be one of std, maccs or estate. Will return a binary string (1's & 0's) representation of the fingerprint. The length of the string indicates the size of the fingerprint.
URL: [POST] /cdk/fingerprint
  POST requests to this URL should include two form variables. The first should be called smiles and will be a comma separated string of SMILES. There is no need to Base64 encode these strings. The second should be called type and will be one of std, maccs or estate. The returned document will consist of N lines, where N is the number of SMILES specified. Each line will be a binary string (1's & 0's) representation of the fingerprint. The length of the string indicates the size of the fingerprint.

An example Python client would be

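from urllib.parse import urlencode
from urllib.request import Request, urlopen

url = 'http://localhost:8182/cdk/fingerprint'

# Comma separated SMILES; one fingerprint line is returned per molecule
slist = ','.join(['CCC(=O)CC', 'c1ccccc1', 'C(=O)CC(=O)COC'])
values = {'smiles': slist, 'type': 'maccs'}
data = urlencode(values).encode('ascii')  # POST bodies must be bytes in Python 3

req = Request(url, data)
resp = urlopen(req)
print(resp.read().decode('ascii'))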
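
The N-line response described above can be converted into something easier to work with. The sketch below is a hypothetical helper that turns each line into its length and the positions of its set bits. Counting positions from zero at the left of the string is an assumption; the services do not document a bit order.

```python
def parse_fingerprints(body):
    """Turn an N-line fingerprint response into (size, on_bits) tuples.

    size is the length of the binary string; on_bits lists the zero-based
    positions, counted from the left, at which the string contains '1'.
    """
    parsed = []
    for line in body.splitlines():
        bits = line.strip()
        if not bits:
            continue  # ignore blank lines such as a trailing newline
        if set(bits) - {"0", "1"}:
            raise ValueError("not a binary string: %r" % bits)
        parsed.append((len(bits), [i for i, b in enumerate(bits) if b == "1"]))
    return parsed


print(parse_fingerprints("0101\n1000\n"))  # [(4, [1, 3]), (4, [0])]
```

The sets of on bits are usually what is needed for comparing molecules, and they are much smaller than the full strings for sparse fingerprints.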

Molecular Weight

Function: Evaluate molecular weight of a SMILES input
Accept: *
Return type: text/plain
URL: [GET] /cdk/mw/{smiles}
  {smiles} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted).

Molecular Formulae

Function: Evaluate molecular formula of a SMILES input
Accept: text/plain, text/html
Return type: text/plain, text/html
URL: [GET] /cdk/mf/{smiles}
  {smiles} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted). If text/html is requested via an Accept header, the result will be an HTML form of the formula (so numbers show as subscripts or superscripts).

Substructure Matching

Function: Determine whether a substructure is present in one or more target molecules
Accept: *
Return type: text/plain
URL: [GET] /cdk/substruct/{target}/{query}
  {target} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted) and {query} should be a SMARTS pattern or a SMILES string. If the query is present in the target, the result is the string "true", otherwise "false". If the SMILES could not be parsed, the response code is 400 (BAD REQUEST) and the result is "fail". If there was an error while parsing the SMARTS, the response code is 500 (INTERNAL SERVER ERROR) and there is no result.
URL: [POST] /cdk/substruct
  POST requests to this URL should include two form variables. The first should be called target and will be a comma separated string of SMILES. The second should be called query and will be a SMARTS pattern or a SMILES string.

The result will be a plain text document with N lines, for N SMILES strings. For each line, if the query is present in the target, the line will equal "true", otherwise "false". If the SMILES could not be parsed, the line will be "fail". If there was an error while parsing the SMARTS, the response code is 500 (INTERNAL SERVER ERROR) and there is no result.

An example Python client would be

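from urllib.parse import urlencode
from urllib.request import Request, urlopen

url = 'http://localhost:8182/cdk/substruct'

# Comma separated target SMILES; one result line is returned per target
slist = ','.join(['CCC(=O)CC', 'c1ccccc1', 'C(=O)CC(=O)COC'])
values = {'target': slist, 'query': '[#6]=O'}
data = urlencode(values).encode('ascii')  # POST bodies must be bytes in Python 3

req = Request(url, data)
resp = urlopen(req)
print(resp.read().decode('ascii'))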
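
The three possible line values ("true", "false" and "fail") map naturally onto True, False and None. The following sketch shows one way to interpret the response; the helper names are hypothetical, and pairing results with targets assumes the lines come back in the same order as the submitted SMILES.

```python
def parse_substruct_lines(body):
    """Map each response line to True (match), False (no match) or None (fail)."""
    outcomes = {"true": True, "false": False, "fail": None}
    results = []
    for line in body.splitlines():
        token = line.strip().lower()
        if not token:
            continue  # ignore blank lines such as a trailing newline
        if token not in outcomes:
            raise ValueError("unexpected response line: %r" % line)
        results.append(outcomes[token])
    return results


def match_table(targets, body):
    """Pair each submitted target SMILES with its parsed result."""
    results = parse_substruct_lines(body)
    if len(results) != len(targets):
        raise ValueError("expected %d lines, got %d" % (len(targets), len(results)))
    # Assumption: result lines are in the same order as the targets.
    return list(zip(targets, results))


print(match_table(["CCC(=O)CC", "c1ccccc1", "not-smiles"], "true\nfalse\nfail\n"))
```

Returning None for "fail" keeps unparseable SMILES distinct from genuine non-matches, which a plain boolean would hide.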

Topological Polar Surface Area

Function: Returns the TPSA of a molecule
Accept: *
Return type: text/plain
URL: [GET] /cdk/descriptor/tpsa/{smiles}
  {smiles} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted).

Version

Function: Returns version information for the services
Accept: *
Return type: text/plain
URL: [GET] /cdk
  Note that this provides information for the whole application rather than any individual service. The result is plain text listing the version of the services, the JRE version and the operating system.

XLogP

Function: Returns the XLogP of a molecule
Accept: *
Return type: text/plain
URL: [GET] /cdk/descriptor/xlogp/{smiles}
  {smiles} should be a valid SMILES or Base64 encoded SMILES string (appropriately quoted).

Python Services

This section lists some mod_python based Python services. In general, most of these depend on external programs to do the real work.

3D Structure Generation

This service makes use of smi23d (SVN) to generate 3D structures. It is currently hosted on this server and provides a web form. In addition, you can directly obtain 3D structures by constructing URLs of the form
http://rest.rguha.net/threed/d3.py/get3d?smiles=c1ccccc1
Structures are returned as SD files. Please note that this is a wimpy server and if it gets too loaded, I will disable the service. Feel free to download the code and run it locally (you'll need to install smi23d and change the paths in the Python file).
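
URLs of this form can be built programmatically. The sketch below uses Python 3 and encodes the smiles query parameter so that characters such as "#" and "=" survive the trip; the helper names get3d_url and fetch_sdf are hypothetical, and it is an assumption that the service accepts standard query string encoding.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL above
GET3D_ENDPOINT = "http://rest.rguha.net/threed/d3.py/get3d"


def get3d_url(smiles, endpoint=GET3D_ENDPOINT):
    """Build the get3d URL with the SMILES encoded as a query parameter."""
    return endpoint + "?" + urlencode({"smiles": smiles})


def fetch_sdf(smiles, endpoint=GET3D_ENDPOINT, timeout=60):
    """Request a 3D structure and return the SD file contents as text."""
    with urlopen(get3d_url(smiles, endpoint), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

When running the code locally, pass the address of your own installation as the endpoint argument rather than loading the public server.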

Servers

This section lists servers that host the Java services. To use the services, simply append the resource URLs to the host names.

Hostname                                     Contact
http://toposome.chemistry.drexel.edu:6666    Rajarshi Guha, rajarshi.guha@gmail.com
http://ws1.bmc.uu.se:8182                    Ola Spjuth, ola.spjuth@farmbio.uu.se