How to use Amazon Cloud supercomputers to view molecules in remarkable detail
February 5, 2013

These microscope images of podosomes, cellular structures thought to be involved in cancer, show the differences in clarity produced by conventional microscopic techniques (left) and super-resolution imaging (right). Salk researchers have developed a method of utilizing cloud computing to significantly reduce the time necessary for processing super-resolution images. (Credit: Salk Institute for Biological Studies)
Salk Institute for Biological Studies researchers have shared a how-to secret for biologists: code for Amazon Cloud that significantly reduces the time necessary to process data-intensive microscopic images.
The method promises to speed research into the underlying causes of disease by making single-molecule microscopy of practical use for more laboratories.
“This is an extremely cost-effective way for labs to process super-resolution images,” says Hu Cang, Salk assistant professor in the Waitt Advanced Biophotonics Center and coauthor of the paper. “Depending on the size of the data set, it can save over a week’s worth of time.”
Background: the limits of microscope imaging
The latest frontier in basic biomedical research is to better understand the “molecular machines” called proteins and enzymes. Determining how they interact is key to discovering cures for diseases.
Unfortunately, conventional light microscopes cannot clearly show objects as small as single molecules. The available alternatives, such as electron microscopy, cannot be used effectively with living cells. In 1873, German physicist Ernst Abbe worked out the mathematics to improve resolution in light microscopes. But Abbe’s calculations also established the diffraction limit.
According to the Abbe limit, two objects cannot be told apart if they are closer together than about half the wavelength of the imaging light. Since the shortest wavelength we can see is around 400 nanometers (nm), anything 200 nm or smaller appears as a single blurry spot. The challenge for biologists is that the molecules they want to see are often only a few tens of nanometers in size.
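The arithmetic behind that 200 nm figure can be sketched in a few lines of Python. The function name `abbe_limit_nm` and the `numerical_aperture` parameter are illustrative assumptions, not part of the Salk code; the "half the wavelength" rule quoted here corresponds to a numerical aperture of 1 in the fuller form d = wavelength / (2 × NA).

```python
def abbe_limit_nm(wavelength_nm, numerical_aperture=1.0):
    """Smallest resolvable separation in nm: d = wavelength / (2 * NA).

    numerical_aperture=1.0 reproduces the simplified "half the
    wavelength" rule; real objectives have their own NA (assumption).
    """
    if wavelength_nm <= 0 or numerical_aperture <= 0:
        raise ValueError("wavelength and numerical aperture must be positive")
    return wavelength_nm / (2.0 * numerical_aperture)


print(abbe_limit_nm(400))  # 200.0
```

A molecule a few tens of nanometers across is therefore roughly ten times smaller than the finest detail such a microscope can separate.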
“You have no idea how many single molecules are distributed within that blurry spot, so essential features and ideas remain obscure to you,” says Jennifer Lippincott-Schwartz, a Salk non-resident fellow and coauthor on the paper.
In the early 2000s, several techniques were developed to break through the Abbe limit, launching the new field of super-resolution microscopy. Among them was a method developed by Lippincott-Schwartz and her colleagues called Photoactivated Localization Microscopy, or PALM.
PALM and its sister techniques work because mathematics can see what the eye cannot: within the blurry spot, there are concentrations of photons that form bright peaks, which represent single molecules. The downside to these approaches is that it can take several hours to several days to crunch all the numbers required just to produce one usable image.
“Calculating an area of 50 pixels can take nearly a full day on a state-of-the-art desktop computer,” says Lippincott-Schwartz.
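To illustrate the idea of finding a molecule inside a blurry spot, here is a minimal Python sketch. It is not the researchers' algorithm: it simulates one noise-free Gaussian spot and estimates its center with an intensity-weighted centroid, a deliberately simple stand-in for the far heavier statistical fitting that makes real reconstructions slow. The function names, spot width and photon count are all assumptions chosen for the example.

```python
import math


def simulate_spot(cx, cy, size=15, sigma=2.0, photons=1000.0):
    """Noise-free Gaussian blur of one emitter at (cx, cy), in pixel units."""
    return [
        [photons * math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]


def centroid(image):
    """Intensity-weighted mean position (x, y) of a 2D image."""
    total = sum(sum(row) for row in image)
    x = sum(v * col for row in image for col, v in enumerate(row)) / total
    y = sum(v * r for r, row in enumerate(image) for v in row) / total
    return x, y


# The spot is several pixels wide, yet its center is recovered
# to a small fraction of a pixel.
spot = simulate_spot(7.3, 6.8)
x, y = centroid(spot)
print(round(x, 2), round(y, 2))
```

With real data the spots overlap, blink and carry noise, so each position has to be estimated from many frames, which is where the long computation times come from.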
Supercomputer cloud

Analysis using Amazon EC2 cloud. (a) Computation time versus the radius of the image mosaic. (b) Illustration of the Amazon EC2 cloud, which allows multiple instances, each of which can be individually configured on demand. (c) Cloud computing accelerates image reconstruction with increased field of view. Left, for a measured image, the inset shows super-resolution reconstruction of an area of 2.7 × 2.7 μm using a quad-core Core i7 computer; right, reconstruction of an increased area of 9.6 × 9.6 μm using 25 EC2 instances. The improvement is 12.5-fold. (Credit: Salk Institute for Biological Studies)
The Salk researchers now offer other scientists an easier alternative: the Amazon Elastic Compute Cloud (Amazon EC2), a service that provides access to supercomputing via the Internet, allowing massive computing tasks to be distributed over banks of computers.
To make the PALM technique more practical for use in biomedical research, the team wrote a computer script that allows any biologist to upload and process PALM images using Amazon Cloud.
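The general pattern of spreading one large image over many machines can be sketched as follows. This is an assumption-laden illustration, not the team's script: a local thread pool stands in for the EC2 instances, `process_tile` is a hypothetical placeholder for the expensive per-tile reconstruction, and the image and tile sizes are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(width, height, tile):
    """Return (x0, y0, x1, y1) boxes that together cover the image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def process_tile(box):
    """Hypothetical stand-in for reconstructing one tile.

    Here it only counts the pixels in the tile.
    """
    x0, y0, x1, y1 = box
    return box, (x1 - x0) * (y1 - y0)


def reconstruct(width, height, tile, workers):
    """Process every tile concurrently; results come back in tile order."""
    tiles = split_into_tiles(width, height, tile)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, tiles))


results = reconstruct(width=100, height=100, tile=20, workers=25)
print(len(results))  # 25 tiles, one per worker
```

Because each tile can be processed independently, adding workers shortens the wall-clock time until the number of workers matches the number of tiles.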
As a demonstration, Cang, Lippincott-Schwartz and post-doctoral researcher Ying Hu reconstructed images of podosomes, molecular machines that appear to encourage cancer cells to spread. In one instance, they cut the time needed to process an image from a whole day to 72 minutes. They also imaged tubulin, a protein essential for building various structures within cells. In that case, they cut the processing time from nine days to under three and a half hours.
Their new paper provides a how-to tutorial for using the code to process PALM images through Amazon Cloud, helping other labs achieve similar increases in speed.
Other researchers on the study were Xiaolin Nan, of Oregon Health and Science University School of Medicine, and Prabuddha Sengupta, of The Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health.
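For a sense of scale, the two reported time savings can be turned into speedup factors. This small Python sketch assumes "a whole day" means 24 hours and treats three and a half hours as an upper bound on the tubulin run; the function name is illustrative.

```python
def speedup(before_hours, after_hours):
    """How many times faster the job ran."""
    if before_hours <= 0 or after_hours <= 0:
        raise ValueError("durations must be positive")
    return before_hours / after_hours


podosome = speedup(24.0, 72 / 60)  # one day (assumed 24 h) down to 72 minutes
tubulin = speedup(9 * 24.0, 3.5)   # nine days down to at most 3.5 hours

# Roughly 20-fold for the podosome image and at least about 60-fold for tubulin.
print(round(podosome, 1), round(tubulin, 1))
```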
The research was supported by the Waitt Foundation.
Comments (9)
by twm114
‘from nine days to under three and a half hours’ – beautiful! So can we expect a lot more pictures?
by Damon
Define human.
by Glen Lincoln
I still don’t understand how doing ginormous super computer calculations has the ability to make microscopes perform better in spite of the wavelength of light not permitting ten nanometer size imaging. There’s a gap in my conceptualization of the trick here, but I gotta say, I commend you for an exceptionally interesting and readable article.
by Editor
The calculations are intended to extract information from microscope images. The calculations do not require a supercomputer; they can be done with a desktop computer, but that takes a long time. The program extracts information from the fuzzy image (see the top picture on the left) from the microscope to show more details (right).
by Doug Safarik
http://www.youtube.com/watch?v=1eA3XCvrK90
by Bri
Since we are supposed to have our heads in the clouds in order to get work in the future, this article is a window into what that might entail. The cloud is very impressive at data processing. It’s like renting time on a supercomputer. The cloud saved a lot of processing time so all those work hours in data processing have been reduced. My guess is that would translate into some job loss for those that would have processed this information. Those people are different from the ones interpreting the data. Google’s cloud servers are basically a soft AI so there aren’t that many human jobs in association with that step. Once we get down to the data interpretation that is a job for a highly trained human. One strong Watsonesque or should I say Sherlockesque AI would learn how to interpret the data and process that faster than a human could. We could leave the data interpretation to a strong AI in the cloud but there goes some more jobs. I don’t know, it still looks to me like robots and strong AI will take all the jobs, no matter how sexy this cloud service appears.
by asiwel
We need to apply some reason here when we talk about what machines can do versus “job loss.” Why have computers when we could hire 1000s of people to run abacuses? In that the cloud here saves “processing time”, whether by hand or by desktop PC, that means possibly you or I can afford it or do it, when otherwise possibly we couldn’t have done so. Alternately, here perhaps the cloud computer is doing something that would be forever impossible or unreasonable to do by hand, so no “jobs” like that would ever be there to lose. When you try to save certain types of jobs, that really only means that only the 1% may be able to have them performed and millions who might benefit (if cost were reasonable) are left without. This is simply the Henry Ford logic of mass production versus cost.
by Editor
Yes, I don’t think there’s any difference in human tasks here (except for IT skills in using cloud supercomputers — possibly requiring additional staff for smaller labs). The key variable is computer processing time and thus overall throughput.
by someone
It seems the main advantage of using Amazon for this kind of task is sharing of information – via AMIs (so that other researchers can just take their machine image, launch it, and continue working on it).
As far as cost effectiveness, it would probably be cheaper in the long run for a department to buy a rack of servers, than to pay Amazon hourly rates (the most expensive cloud service currently on the market).