Unpicking the grammar of genes | University of Oxford
OSB archive

Unpicking the grammar of genes

Jonathan Wood

The MHC on the short arm of chromosome 6 is the most gene-dense region of our DNA with around 230 genes all crammed into this stretch of our genome.

The MHC, or major histocompatibility complex, is known to play a pivotal role in our immune system, and around a third of the genes encoded there are known to have immune functions (not all of the genes’ functions are known yet, so the true proportion could be higher).

So it’s not surprising that DNA variations in this region have been linked to many autoimmune diseases, such as type 1 diabetes, rheumatoid arthritis, and coeliac disease. But the MHC has also been linked to diseases not related to the immune system, including breast cancer, asthma, infectious diseases and the adverse effects of certain drugs. It’s the genetic region with the largest number of disease associations, period.

But finding a genetic link to a condition is one thing. Determining the specific DNA changes that cause the increased risk of disease is another.

‘It’s a long-standing problem,’ according to Dr Julian Knight of the Wellcome Trust Centre for Human Genetics [WTCHG] at Oxford University, and it’s a problem that is particularly testing in the MHC.

The reason is that large lengths of DNA in the MHC, including whole lines of genes, tend to get inherited together. So people end up grouped with whole sets of DNA variations in common.

Because this co-inheritance of variations, or ‘linkage disequilibrium’, is particularly strong in the MHC, it is very difficult to unpick what lies behind any one DNA change linked with a disease. It could be something to do with that particular gene that is having an effect, or it could be another of the many genes closely coupled to it.
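To make the idea of co-inheritance concrete, here is a minimal sketch of how linkage disequilibrium between two sites is commonly summarised, using the standard D and r² statistics. The haplotype counts are made up for illustration and are not data from the study; the function name and the "A/a", "B/b" labels are likewise just assumptions for the example.

```python
from collections import Counter

def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two-site haplotypes written like "AB" or "ab".

    The first character is the variant at site 1 ("A" or "a"),
    the second is the variant at site 2 ("B" or "b").
    """
    n = len(haplotypes)
    counts = Counter(haplotypes)
    # Frequency of variant A at site 1 and variant B at site 2
    p_a = sum(c for h, c in counts.items() if h[0] == "A") / n
    p_b = sum(c for h, c in counts.items() if h[1] == "B") / n
    # Frequency with which A and B actually travel together
    p_ab = counts["AB"] / n
    # D is the excess over what independent inheritance would predict
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared

# Hypothetical samples: in "tight", A almost always comes with B;
# in "loose", the two sites are inherited independently.
tight = ["AB"] * 48 + ["ab"] * 48 + ["Ab"] * 2 + ["aB"] * 2
loose = ["AB"] * 25 + ["ab"] * 25 + ["Ab"] * 25 + ["aB"] * 25

d_tight, r2_tight = linkage_disequilibrium(tight)
d_loose, r2_loose = linkage_disequilibrium(loose)
```

When r² is close to 1, as in the "tight" sample, knowing the variant at one site almost tells you the variant at the other, which is exactly why a disease link to one site cannot easily be separated from its neighbours.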

Then there is the problem of defining at what level the change in DNA might be acting. The body has many layers of control to make sure genes are only active in the right places and in the right amounts.

The central process is the same of course – a DNA sequence is read out into RNA code, from which proteins are produced – but at each stage there are checks and balances to make sure each gene and its products are working at the right level to keep the biological processes they encode ticking over.

Perhaps a DNA change might alter the structure of a protein encoded by a gene, but it may also alter the activity of that gene or another it controls. It could turn a gene on or off like a switch, turn its activity up or down like a volume dial, or change the final form of the protein that is produced.

Claire Vandiedonck, Julian Knight and colleagues set out to probe some of these possibilities by investigating how sets of co-inherited DNA variants in the MHC might lead to changes in ‘gene expression’.

Controlling gene expression – the amount of RNA produced from a gene – is a way of turning up and down the gene’s activity.

The researchers mapped gene expression across the MHC for three common sets of co-inherited DNA variants people can have that are known to be associated with disease. Their results were recently published in the journal Genome Research.

To do this they had to design and construct their own custom DNA chip to be able to deal with the sequence variety in the MHC region. ‘We just couldn’t take an off-the-shelf microarray to get these results,’ Julian explains.

They found that the set of variants you have in the MHC does lead to differences in gene expression, and this was a common effect: 96 of the roughly 230 genes in the MHC showed differences in expression.

‘There were a lot more differences in gene expression than we might have guessed,’ says Julian. ‘There was also a great deal of expression from areas of DNA in between genes; a third of the RNAs produced come from outside of known genes.’

It’s likely that these are non-protein-coding RNAs. That is, these bits of DNA sequence are read off to produce RNA, but no protein is then made from the RNA sequence.

It’s been gradually recognised over the past decade and more that noncoding RNAs play an important role in regulating gene activity – it’s another layer of control to the action of our genes. This study may offer an indication of just how important these RNAs are in regulating genes in the MHC.

The researchers also found a lot more ‘alternative splicing’ in the MHC than happens in other regions of our genome.

Alternative splicing describes a process where the same initial piece of RNA produced from a single gene is cut up and stuck back together in different ways to give different proteins. The result is shorter and longer proteins, potentially carrying out different roles in the cell.
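As a toy illustration of the cutting and sticking described above, the sketch below models just one simple form of alternative splicing, in which an optional segment (an exon) is either kept or skipped. The exon names and RNA sequences are invented for the example, and real splicing is far more varied than this.

```python
from itertools import combinations

def splice_isoforms(exons, optional):
    """List the transcripts made by skipping any subset of the optional exons.

    `exons` is an ordered list of (name, sequence) pairs; exons not named in
    `optional` are always kept. Exon order is preserved in every transcript.
    """
    isoforms = []
    for k in range(len(optional) + 1):
        for skipped in combinations(optional, k):
            kept = [seq for name, seq in exons if name not in skipped]
            isoforms.append("".join(kept))
    return isoforms

# Hypothetical three-exon gene in which the middle exon can be skipped
exons = [("E1", "AUGGCU"), ("E2", "GAAUUC"), ("E3", "CCGUAA")]
isoforms = splice_isoforms(exons, optional=["E2"])
```

Here a single starting RNA gives two products, a longer one with all three segments and a shorter one without the middle segment. With several optional segments the number of possible products grows quickly, which is the source of the extra protein diversity.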

‘The greater alternative splicing in the MHC will mean a greater diversity in the proteins produced from the DNA sequence,’ explains Julian. ‘It increases the diversity of a region that already has the greatest number of possible gene variants.’

But most importantly, pinpointing where gene expression differs could identify a set of candidate genes that may be causing the increased risk of some autoimmune diseases. That’s what this study takes a step towards. These candidate genes can then be looked at in more detail.

‘We now have a route map of gene expression in the MHC that can help us understand what lies behind gene associations with various common diseases,’ Julian adds. ‘These findings have underlined the fact that we need to understand gene regulation as well as DNA sequence.’

He predicts that there will be many more of these studies in the future, as geneticists move on to unpick what lies behind genes known to be connected to many common diseases.

It seems that finding connections between DNA sequence and common conditions is one thing, but understanding how they are connected will involve investigating the many different levels of gene control and regulation there are in the body.

We’ll need to expand our knowledge of how our sequence of DNA letters is read out in organised phrases, sentences and whole paragraphs to really get the language of genetics and what it means for us. Expect stories about our genetics to get more complex before they get clearer.