DNA sequence database hits a billion entries

January 18, 2006 | Source: KurzweilAI

The Wellcome Trust Sanger Institute’s World Trace Archive database of DNA sequences hit one billion entries Tuesday.

The Trace Archive, a store of all the sequence data produced and published by the world scientific community, is 22 terabytes (TB) in size and doubling every ten months. It is perhaps the largest single scientific database in Europe, if not the world, and is larger than the estimated 20 TB of equivalent text data in the Library of Congress.
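To give a feel for what "doubling every ten months" implies, here is a rough back-of-the-envelope sketch. It assumes the growth rate stays constant, which the article does not claim; the function name and the chosen time points are illustrative only.

```python
def projected_size_tb(months_from_now, current_tb=22.0, doubling_months=10.0):
    """Archive size after `months_from_now` months, ASSUMING steady doubling.

    The 22 TB starting size and ten-month doubling period come from the
    article; the constant-rate extrapolation is an assumption.
    """
    return current_tb * 2 ** (months_from_now / doubling_months)


# Print the projected size at a few illustrative points in time.
for months in (0, 10, 20, 30):
    print(f"{months:>2} months: {projected_size_tb(months):.0f} TB")
```

Under that assumption the archive would grow eightfold in two and a half years, from 22 TB to 176 TB.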

All the data are freely available as a resource to geneticists and the wider scientific community all over the globe.

Trace data are the raw results of genetic research. They allow researchers to identify and study genes, to reveal variations (mutations) in genes, and to study similarity to genes in other organisms. Each entry is a piece of genetic information averaging 864 characters long. Scientists can search these sequences and piece them together to build up the whole genetic information of organisms: mice, fish, flies, bacteria, and humans.
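The scale and the "piecing together" can be illustrated with a small sketch. The entry count and average length are taken from the article; the two short reads and the merge_overlap helper are invented for illustration and are not the archive's actual method. Real assembly software also has to cope with sequencing errors and far larger inputs.

```python
ENTRIES = 1_000_000_000  # entries reported in the article
AVG_LENGTH = 864         # average characters per entry, from the article

# Total sequence characters implied by those two figures.
total_characters = ENTRIES * AVG_LENGTH


def merge_overlap(left, right, min_overlap=3):
    """Join two reads when a suffix of `left` equals a prefix of `right`.

    Hypothetical helper: returns the merged string, or None if no overlap
    of at least `min_overlap` characters exists. The longest overlap wins.
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None


print(f"{total_characters:,} characters in total")
# Two made-up reads that share the four characters "TGCA".
print(merge_overlap("ACGTTGCA", "TGCAGGTC"))
```

The first line of output is 864,000,000,000 characters, and the two toy reads merge into ACGTTGCAGGTC.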

Source: Wellcome Trust news release