1000 Genomes Project announced
22 Jan 08
An international research consortium including Oxford University academics today announced the 1000 Genomes Project, an ambitious effort that will involve sequencing the genomes of at least a thousand people from around the world to create the most detailed and medically useful picture to date of human genetic variation.
Drawing on the expertise of multidisciplinary research teams, the 1000 Genomes Project will develop a new map of the human genome that will provide a view of biomedically relevant DNA variations at a resolution unmatched by current resources.
Recently developed catalogues of human genetic variation, such as the HapMap and the Wellcome Trust Case Control Consortium – both of which involved a number of Oxford researchers in key roles – have proved valuable in human genetic research. Using the HapMap and related resources, researchers have already discovered more than 100 regions of the genome containing genetic variants that are associated with risk of common human diseases, such as diabetes, coronary artery disease, prostate and breast cancer, rheumatoid arthritis, inflammatory bowel disease and age-related macular degeneration.
The new map will provide genomic context surrounding the HapMap’s genetic variants, giving researchers important clues to which variants might be causal, including more precise information on where to search for causal variants.
Professor Gil McVean, Professor Peter Donnelly, Dr Jonathan Marchini and Dr Simon Myers, all from Oxford’s Department of Statistics and Oxford’s Wellcome Trust Centre for Human Genetics, and their research groups, are making major contributions to the project. Professor McVean and Professor Donnelly were involved in the design of the project, and Professor McVean is co-chair of the analysis group.
Any two humans are more than 99 percent the same at the genetic level. However, it is important to understand the small fraction of genetic material that varies among people because it can help explain individual differences in susceptibility to disease, response to drugs or reaction to environmental factors. Variation in the human genome is organized into local neighborhoods called haplotypes, which are stretches of DNA usually inherited as intact blocks of information.
However, because existing maps of haplotypes (like the HapMap) have limited resolution, researchers often must follow up their findings with costly and time-consuming DNA sequencing to pinpoint the precise causative variants. The new map would enable researchers to zero in more quickly on disease-related genetic variants, speeding efforts to use genetic information to develop new strategies for diagnosing, treating and preventing common diseases.
As with other major human genome reference projects, data from the 1000 Genomes Project will be made available to the worldwide scientific community through freely accessible public databases.
With current approaches, researchers can search for two types of genetic variants related to disease: very rare genetic variants that have a severe effect, and common genetic variants that increase the risk of common conditions like heart disease. Between these two types of genetic variants — very rare and fairly common — there is a significant gap in scientific knowledge. The 1000 Genomes Project is designed to fill that gap, which researchers anticipate will contain many important variants that are relevant to human health and disease.
The 1000 Genomes Project aims to produce a catalog of variants that are present at a frequency of one percent or greater in the human population across most of the genome, and down to 0.5 percent or lower within genes. This will likely entail sequencing the genomes of at least 1,000 people.
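As a rough illustration of why a sample of about 1,000 people suits these frequency targets, the sketch below estimates the chance that a variant appears at least once in a sample. It is not the project's own design calculation. It assumes each person contributes two independently sampled chromosomes, and the function name prob_variant_observed is hypothetical. Seeing a variant once is also a weaker requirement than reliably detecting it from sequence data.

```python
def prob_variant_observed(freq: float, n_people: int) -> float:
    """Probability that at least one copy of a variant with population
    frequency `freq` appears among the chromosomes of `n_people` people.

    Assumption (not from the article): two chromosomes per person,
    each sampled independently from the population.
    """
    n_chromosomes = 2 * n_people
    return 1.0 - (1.0 - freq) ** n_chromosomes


if __name__ == "__main__":
    # Compare a small sample with the 1,000-person target at both
    # frequency thresholds mentioned in the text (1% and 0.5%).
    for freq in (0.01, 0.005):
        for n_people in (100, 1000):
            p = prob_variant_observed(freq, n_people)
            print(f"freq={freq:.3f}, people={n_people}: P(observed) = {p:.4f}")
```

Under these assumptions, a variant at one percent frequency is almost certain to be present in a sample of 1,000 people, while a sample of 100 people would miss it a noticeable fraction of the time.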
‘This project will examine the human genome in a detail that has never been attempted – the scale is immense,’ says Professor McVean. ‘At 6 trillion DNA bases, the 1000 Genomes Project will generate 60-fold more sequence data over its three-year course than have been deposited into public DNA databases over the past 25 years. In fact, when up and running at full speed, this project will generate more sequence in two days than was added to public databases for all of the past year.’
