## Displaying compounds with WebGL

After publishing my last article about OPSIN I was interested in using HTML5 techniques to display chemical compounds and found a nice library: ChemDoodle.

With ChemDoodle it’s very easy to display a molecule. Just download the libs and import them into your HTML code:

To display a compound you need its representation as a MOL file, which you can include in fewer than 10 lines:

Here is a sample with caffeine:

If your browser supports WebGL you should see a stick model. Use your mouse to interact with it. Very easy to use! Of course you can also load the MOL data from a file, but that is beyond the scope of this article.

## Benefit of standardization: OPSIN

I just read about a new tool that parses chemical names written in systematic IUPAC nomenclature.

OPSIN (Open Parser for Systematic IUPAC nomenclature) is an open source IUPAC nomenclature parser. The IUPAC provides rules for naming chemical compounds; you may have learned some of them in your first organic chemistry course.

The web interface also comes with an API that generates a 2D picture of the parsed compound. You can query the API by requesting the image at http://opsin.ch.cam.ac.uk/opsin/IUPAC-NAME.png, where IUPAC-NAME is replaced by the name of the compound. For example, to get an image for 2λ6,2',2''-spiroter[[1,3,2]benzodioxathiole] just follow these instructions and you’ll get an image like this:

Very smart, isn’t it? The web interface also provides InChI and SMILES strings and a CML definition.
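The URL scheme described above can be sketched in a few lines of Python. This is only a minimal illustration: the helper name `opsin_image_url` is made up for this example, and percent-encoding the name is my assumption about how special characters (brackets, commas, λ) are best passed along, not something the OPSIN documentation is quoted on here.

```python
from urllib.parse import quote

# Base URL as given in the text above.
OPSIN_BASE = "http://opsin.ch.cam.ac.uk/opsin/"


def opsin_image_url(iupac_name: str, base: str = OPSIN_BASE) -> str:
    """Build the PNG URL for a systematic name (hypothetical helper)."""
    # Assumption: percent-encode every reserved character (safe="") so that
    # commas, brackets and non-ASCII characters stay in one path segment.
    return base + quote(iupac_name, safe="") + ".png"


if __name__ == "__main__":
    print(opsin_image_url("2,4,6-trinitrotoluene"))
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should then return the rendered image, provided the service is reachable.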

It’s not limited to simple molecules. I’ve tried some more complex names, for example 3,6-diamino-N-[[15-amino-11-(2-amino-3,4,5,6-tetrahydropyrimidin-4-yl)-8-[(carbamoylamino)methylidene]-2-(hydroxymethyl)-3,6,9,12,16-pentaoxo-1,4,7,10,13-pentazacyclohexadec-5-yl]methyl]hexanamide:

What can I say? I’m impressed! You can download the tool from Bitbucket or use the web interface.

## R for the web

There is a nice R module for Apache: rApache. With it you can easily publish statistics on the web.

To install rApache first install the following packages from the Debian/Ubuntu repository:

So the basics are done. Let’s install rApache. Grab the latest version:

Extract the contents and cd into the new directory. The installation process should be straightforward; I only had to give a hint for the apxs2 location:

To tell Apache about the new module you need to create two more files. The first one is /etc/apache2/mods-available/r.conf:

Now all files in /R are treated as R scripts, and under /RApacheInfo you’ll find some information about your installation. The second file is /etc/apache2/mods-available/r.load:

This file just defines which lib to load. To finish the installation you need to load the rApache module and restart the webserver via:

That’s it. You can test whether everything worked by browsing to localhost/RApacheInfo; hopefully you’ll see some configuration details. To prepare your own tests, create a directory /var/www/R (assuming your document root is /var/www) and paste something like this into a file called test:

Browsing to localhost/R/test you should see something like this:

To create a graphic you need to change the content type to an image type. A small example might give you an idea:

Reload the page and you’ll see a more or less nice plot :-P That’s it for the moment. For a more interactive interface take a look at the ggplot2 mod.

## Converting peaks to Gaussians

Yesterday I updated the iso2l. One of the improvements is the MS mode: it’s now able to display isotopic clusters as they would be measured by MS instruments, instead of only theoretical ones. The task was to estimate a normal distribution for a theoretical isotope peak.

The accuracy of a mass spectrometry (MS) instrument is determined by its resolution. The higher the resolution, the more easily you can distinguish between two peaks. This is essential, especially for identifying isotopes: depending on the charge state of an ion, two isotopes may differ by less than 0.1 mass over charge (m/z). To determine the resolution of your MS instrument, just select one peak and measure its width at half of its height. This quantity is called $FWHM$ (full width at half maximum). The resolution $R$ is calculated by the following equation:
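Assuming the common definition of resolving power (peak position divided by peak width), this is:

$$R = \frac{m/z}{FWHM}$$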

So you see, for a fixed resolution the $FWHM$ grows with m/z, which reflects the characteristic of MS instruments that peaks at higher m/z are wider.

Now we want to go the other way around. We have a theoretical mass of a peak and want to estimate a mass distribution as measured by an instrument. These distributions look like normal distributions, so the obvious choice is to estimate a Gaussian $\mathcal{N}(\mu,\,\sigma^2)$:
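For reference, the standard density of a normal distribution with mean $\mu$ and standard deviation $\sigma$ is:

$$\mathcal{N}(\mu,\,\sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{ -\frac{(x-\mu)^2}{2\sigma^2} }$$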

It’s clear that $\mu = m/z$ of the peak, but we have to find $\sigma$ such that the distribution reaches its half-maximum at $\mu \pm \frac{1}{2} FWHM$. Since the normalization term $\frac{1}{\sqrt{2\pi\sigma^2}}$ doesn’t matter in this case, the formula simplifies to $\mathcal{N}(\mu,\,\sigma^2) = e^{ -\frac{(x-\mu)^2}{2\sigma^2} }\,$ with its maximum of 1 at $\mu$. As you know, $\sigma$ isn’t affected if we shift all data points by a constant value, so let’s shift them by $-\mu$. Now the distribution has its mean at 0. The equation we have to solve is:
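With the peak shifted to 0 and scaled to a maximum of 1, the half-maximum condition can be written and solved as follows (a reconstruction from the definitions above):

$$e^{ -\frac{x^2}{2\sigma^2} } = \frac{1}{2} \;\Rightarrow\; -\frac{x^2}{2\sigma^2} = -\ln 2 \;\Rightarrow\; x = \pm\sigma\sqrt{2\ln 2}$$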

You see, the half-maximum is at $\pm\sigma\sqrt{2\ln2}$, so $FWHM=2\sigma\sqrt{2\ln2}$. Conversely, given the $FWHM$ we can calculate $\sigma$ of the normal distribution with:
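Rearranging $FWHM=2\sigma\sqrt{2\ln2}$ for $\sigma$ gives:

$$\sigma = \frac{FWHM}{2\sqrt{2\ln 2}}$$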

Combining everything, a peak at m/z in an instrument with resolution $R$ can be approximated with a normal distribution $\mathcal{N}(\mu,\,\sigma^2)$ with parameters:
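Substituting $FWHM = \frac{m/z}{R}$ (assuming the resolution is defined as $R = \frac{m/z}{FWHM}$) into $FWHM=2\sigma\sqrt{2\ln2}$ yields:

$$\mu = m/z, \qquad \sigma = \frac{m/z}{2R\sqrt{2\ln 2}}$$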

You see, the higher the m/z, the larger $\sigma$ becomes.
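The whole conversion fits into a few lines of Python. This is a minimal sketch, not the iso2l implementation: the function names are made up for illustration, and it assumes the resolution is defined as $R = \frac{m/z}{FWHM}$.

```python
import math


def peak_sigma(mz: float, resolution: float) -> float:
    """Standard deviation of the Gaussian approximating a peak at `mz`."""
    # Assumption: R = (m/z) / FWHM, hence FWHM = (m/z) / R.
    fwhm = mz / resolution
    # FWHM = 2 * sigma * sqrt(2 * ln 2), solved for sigma.
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def peak_intensity(x: float, mz: float, resolution: float) -> float:
    """Unnormalized Gaussian with a maximum of 1 at x == mz."""
    sigma = peak_sigma(mz, resolution)
    return math.exp(-((x - mz) ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    mz, resolution = 500.0, 10000.0
    half_width = 0.5 * mz / resolution  # FWHM / 2
    print(peak_sigma(mz, resolution))
    print(peak_intensity(mz + half_width, mz, resolution))  # close to 0.5
```

Evaluating the peak at $\mu \pm \frac{1}{2} FWHM$ is a quick sanity check: the intensity there should come out as one half of the maximum.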

## Plotting w/o X

This might be interesting for non-X fans like me. I just found a nice way to plot to a simple terminal.

Using gnuplot you can enable terminal plots via set term dumb. Here is an example:

Very cool idea, isn’t it!? OK, you can’t see many details, but it can give you an overview even if you are just connected via SSH.
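If you want to trigger such a plot from a script, a small Python wrapper can feed the commands to gnuplot. This is a rough sketch under some assumptions: the function names are made up, gnuplot has to be installed and on your PATH, and I rely on gnuplot reading commands from standard input and printing the dumb-terminal plot to standard output.

```python
import shutil
import subprocess
from typing import Optional


def dumb_plot_script(expression: str = "sin(x)") -> str:
    """Return gnuplot commands that plot `expression` as ASCII art."""
    return f"set term dumb\nplot {expression}\n"


def plot_in_terminal(expression: str = "sin(x)") -> Optional[str]:
    """Run gnuplot and return the ASCII plot, or None if gnuplot is missing."""
    gnuplot = shutil.which("gnuplot")
    if gnuplot is None:
        return None
    # Assumption: gnuplot executes the commands it receives on stdin.
    result = subprocess.run(
        [gnuplot],
        input=dumb_plot_script(expression),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    plot = plot_in_terminal("sin(x)")
    print(plot if plot is not None else "gnuplot not found on PATH")
```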

If anybody has an idea how to do it with R please tell me!