How to use Amazon Cloud supercomputers to view molecules in remarkable detail

Cloud computing code speeds processing of data-intensive microscopy images
February 5, 2013

These microscope images of podosomes, cellular structures thought to be involved in cancer, show the differences in clarity produced by conventional microscopic techniques (left) and super-resolution imaging (right). Salk researchers have developed a method of utilizing cloud computing to significantly reduce the time necessary for processing super-resolution images. (Credit: Salk Institute for Biological Studies)

Salk Institute for Biological Studies researchers have shared a how-to secret for biologists: code for Amazon Cloud that significantly reduces the time necessary to process data-intensive microscopic images.

The method promises to speed research into the underlying causes of disease by making single-molecule microscopy of practical use for more laboratories.

“This is an extremely cost-effective way for labs to process super-resolution images,” says Hu Cang, Salk assistant professor in the Waitt Advanced Biophotonics Center and coauthor of the paper. “Depending on the size of the data set, it can save over a week’s worth of time.”

Background: the limits of microscope imaging

The latest frontier in basic biomedical research is to better understand the “molecular machines” called proteins and enzymes. Determining how they interact is key to discovering cures for diseases.

Unfortunately, conventional light microscopes cannot clearly show objects as small as single molecules. The available alternatives, such as electron microscopy, cannot be used effectively with living cells. In 1873, German physicist Ernst Abbe worked out the mathematics for improving resolution in light microscopes, but his calculations also established the diffraction limit.

According to the Abbe limit, it is impossible to tell two objects apart if they are closer together than about half the wavelength of the imaging light. Since the shortest wavelength we can see is around 400 nanometers (nm), features about 200 nm or smaller appear as a single blurry spot. The challenge for biologists is that the molecules they want to see are often only a few tens of nanometers in size.
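The arithmetic behind that figure can be sketched in a few lines of Python. This uses the simplified half-wavelength rule given above (the full Abbe formula also divides by the numerical aperture of the lens, which is ignored here), and the 20 nm molecule size is an illustrative assumption standing in for "a few tens of nanometers."

```python
def abbe_limit_nm(wavelength_nm: float) -> float:
    """Smallest resolvable separation under the simplified half-wavelength rule."""
    return wavelength_nm / 2.0


shortest_visible_nm = 400.0   # shortest visible wavelength cited in the article
molecule_nm = 20.0            # assumed example size, not a figure from the article

limit_nm = abbe_limit_nm(shortest_visible_nm)
print(limit_nm)                 # 200.0
print(limit_nm / molecule_nm)   # 10.0: the blur is ten times wider than the molecule
```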

“You have no idea how many single molecules are distributed within that blurry spot, so essential features and ideas remain obscure to you,” says Jennifer Lippincott-Schwartz, a Salk non-resident fellow and coauthor on the paper.

In the early 2000s, several techniques were developed to break through the Abbe limit, launching the new field of super-resolution microscopy. Among them was a method developed by Lippincott-Schwartz and her colleagues called Photoactivated Localization Microscopy, or PALM.

PALM and its sister techniques work because mathematics can see what the eye cannot: within the blurry spot, there are concentrations of photons that form bright peaks, which represent single molecules. The downside to these approaches is that it can take several hours to several days to crunch all the numbers required just to produce one usable image.
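To give a feel for how a bright peak can be pinned down more finely than the pixel grid, here is a minimal Python sketch that builds a synthetic blurry spot and estimates its center as an intensity-weighted centroid. It is an illustration only: the spot shape, grid size and the centroid estimator are all assumptions made for this example, and it is not presented as the algorithm the Salk team used.

```python
import math


def make_spot(size, cx, cy, sigma):
    """Synthetic blurry spot: a Gaussian peak sampled on a size x size pixel grid."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def centroid(image):
    """Intensity-weighted mean position (x, y) of all pixels in the image."""
    total = sum(sum(row) for row in image)
    x_mean = sum(x * v for row in image for x, v in enumerate(row)) / total
    y_mean = sum(y * v for y, row in enumerate(image) for v in row) / total
    return x_mean, y_mean


# Hypothetical molecule sitting between pixel centers at (7.3, 6.8).
spot = make_spot(size=15, cx=7.3, cy=6.8, sigma=1.5)
est_x, est_y = centroid(spot)
print(round(est_x, 2), round(est_y, 2))
```

Even though the spot is several pixels wide, the estimated center lands very close to the true sub-pixel position. Real data adds noise and overlapping molecules, which is part of why the full analysis is so much more expensive.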

“Calculating an area of 50 pixels can take nearly a full day on a state-of-the-art desktop computer,” says Lippincott-Schwartz.

Supercomputer cloud

Analysis using Amazon EC2 cloud. (a) Computation time versus the radius of the image mosaic. (b) Illustration of the Amazon EC2 cloud, which allows multiple instances, each of which can be individually configured on demand. (c) Cloud computing accelerates image reconstruction with increased field of view. Left, for a measured image, the inset shows super-resolution reconstruction of an area of 2.7 × 2.7 μm using a quad-core Core i7 computer; right, reconstruction of an increased area of 9.6 × 9.6 μm using 25 EC2 instances. The improvement is 12.5-fold. (Credit: Salk Institute for Biological Studies)

The Salk researchers now offer other scientists an easier alternative: the Amazon Elastic Compute Cloud (Amazon EC2), a service that provides access to supercomputing via the Internet, allowing massive computing tasks to be distributed over banks of computers.

To make the PALM technique more practical for use in biomedical research, the team wrote a computer script that allows any biologist to upload and process PALM images using Amazon Cloud.
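The general idea of spreading one large image over many machines can be sketched in Python as follows. This is not the team's script: the function names are hypothetical, a local thread pool stands in for separate cloud instances, and `process_tile` is a placeholder that only counts pixels where a real pipeline would run the expensive reconstruction.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(width, height, tile):
    """Return (x0, y0, x1, y1) boxes that together cover a width x height image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def process_tile(box):
    """Hypothetical placeholder for the costly per-tile reconstruction."""
    x0, y0, x1, y1 = box
    return box, (x1 - x0) * (y1 - y0)  # here: just the tile's pixel count


def reconstruct(width, height, tile, workers):
    """Process every tile independently, using up to `workers` tasks at once."""
    boxes = split_into_tiles(width, height, tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, boxes))


# Assumed example sizes: a 100 x 100 pixel image cut into 20 x 20 pixel tiles.
results = reconstruct(width=100, height=100, tile=20, workers=25)
print(len(results))                       # 25
print(sum(area for _, area in results))   # 10000
```

Because each tile is handled independently, adding workers shortens the wall-clock time without changing the total amount of computation, which is what makes rented cloud instances a natural fit.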

As a demonstration, Cang, Lippincott-Schwartz and post-doctoral researcher Ying Hu reconstructed images of podosomes, which are molecular machines that appear to encourage cancer cells to spread. In one instance, they cut the time needed to process an image from a whole day to 72 minutes. They also imaged tubulin, a protein essential for building various structures within cells. In that case, they cut the processing time from nine days to under three and a half hours.
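Those figures translate into speedup factors with a little arithmetic, sketched here in Python. It assumes "a whole day" means 24 hours and treats "under three and a half hours" as exactly 3.5 hours, so the tubulin figure is a lower bound.

```python
def speedup(before_minutes, after_minutes):
    """How many times faster the new processing time is than the old one."""
    return before_minutes / after_minutes


podosome = speedup(24 * 60, 72)             # one day down to 72 minutes
tubulin = speedup(9 * 24 * 60, 3.5 * 60)    # nine days down to 3.5 hours

print(podosome)            # 20.0
print(round(tubulin, 1))   # 61.7
```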

Their new paper provides a how-to tutorial for using the code to process PALM images through Amazon Cloud, helping other labs achieve similar increases in speed.

Other researchers on the study were Xiaolin Nan of Oregon Health and Science University School of Medicine and Prabuddha Sengupta of the Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health.

The research was supported by the Waitt Foundation.