There’s a Mystery Machine That Sculpts the Human Genome
Geneticists can’t see this machine, but they can see its works—and they say it might be the key to reshaping the genome.
Consider that the human genome is longer than the average human. It consists of around two meters of DNA, which must somehow fit into cells, whose nuclei are about 200,000 times narrower.
So it folds. And it folds in such a way that any given stretch can be easily unfolded, so the genes within it can be read and used. Knots are verboten, and anyone who has ever shoved headphones into their pockets will know how hard it is to scrunch an extremely long thread into a ball without knotting anything.
In the 1970s, biochemists showed that this feat of extreme origami begins when DNA is wrapped around proteins called histones, creating what looks like a string of beads. This reduces the packing problem, but doesn’t come close to solving it. The wrapped DNA must be folded and twisted in ever more complicated (and as yet unknown) ways. Eventually, it forms large loops.
The loops aren’t just a packing solution. They also bring genes into close contact with distant sequences that turn them on or off. So, the 3-D form of the genome also dictates its function. And to really understand how genes are used (and how they are misused in cases of disease), we need to appreciate the genome as a looping, twisting, physical entity, rather than just a string of letters.
In 2014, a team led by Erez Lieberman Aiden at Baylor College of Medicine took important steps towards this goal by creating an unprecedentedly detailed 3-D map of the human genome. These genetic cartographers used a technique called Hi-C to embalm the genome and identify regions that interact with one another. Using this method, they identified a grand total of 10,000 loops—far fewer than the millions that were thought to exist.
They also showed that the loops obey certain rules. Most tend to be short. They occur in the same places whether you’re looking at a neuron or a skin cell, a human cell or a mouse cell. And they almost always associate with a protein called CTCF, which acts as a fastener. In theory, two CTCF proteins will bind to separate stretches of DNA and then lock together, creating a loop and holding it in place.
“That was a total bombshell,” says student Suhas Rao, who worked on the project. He, like many others, had assumed that loops form when two stretches of free-floating DNA randomly find each other and are fastened by a pair of CTCF proteins. But that can’t be right. If it were, the CTCF landing sequences would align in all four possible orientations, rather than the very specific one that Rao saw in his data. The loops must be forming in a completely different way, one that’s deliberate and controlled.
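To make the "four possible orientations" concrete, here is a minimal sketch that simply enumerates them. It assumes each CTCF landing sequence can point either forward (">") or backward ("<") along the DNA. The names `classify` and `orientation_pairs` are invented for this illustration, and the label "convergent" for the favored arrangement (the two sequences pointing toward each other) comes from the published study rather than from the text above.

```python
from itertools import product

# Assumption: a landing sequence is either forward (">") or reverse ("<").
FORWARD, REVERSE = ">", "<"


def classify(left, right):
    """Name the arrangement of the two sequences at the ends of a loop."""
    if left == FORWARD and right == REVERSE:
        return "convergent"  # pointing toward each other
    if left == REVERSE and right == FORWARD:
        return "divergent"   # pointing away from each other
    return "tandem"          # both pointing the same way


# Every way to orient the left and right landing sequences: 2 x 2 = 4.
orientation_pairs = list(product([FORWARD, REVERSE], repeat=2))
labels = {pair: classify(*pair) for pair in orientation_pairs}

for (left, right), name in labels.items():
    print(left, right, name)
```

If loops formed by chance encounters, all four rows should turn up at comparable rates. Seeing almost exclusively one of them is what rules out the random-collision picture.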
Rao and fellow student Adrian Sanborn think that the key to this process is a cluster of proteins called an “extrusion complex,” which looks like a couple of Polo mints stuck together. The complex assembles on a stretch of DNA so that the long molecule threads through one hole, forms a very short loop, and then passes through the other one. Then, true to its name, the complex extrudes the DNA, pushing both strands outwards so that the loop gets longer and longer. And when the complex hits one of the CTCF landing sites, it stops, but only if the sites are pointing in the right direction.
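The extrusion idea can be sketched as a toy simulation. This is not the researchers' model, only an illustration under loud assumptions: DNA is a numbered line of positions, the complex pushes both sides of the loop outward one position per step, and each side halts only at a landing site that points back into the loop (or at the end of the DNA). The function name `extrude_loop` and the site layout are hypothetical.

```python
def extrude_loop(load_site, ctcf_sites, length):
    """Grow a loop outward from load_site until both sides are blocked.

    ctcf_sites maps a position to ">" (forward) or "<" (reverse).
    Assumption: the left side stops only at a ">" site and the right
    side only at a "<" site, so a finished loop has inward-pointing ends.
    Returns the (left, right) positions that anchor the loop.
    """
    left = right = load_site
    while True:
        left_blocked = left == 0 or ctcf_sites.get(left) == ">"
        right_blocked = right == length - 1 or ctcf_sites.get(right) == "<"
        if left_blocked and right_blocked:
            return left, right
        if not left_blocked:
            left -= 1   # reel in more DNA on the left
        if not right_blocked:
            right += 1  # reel in more DNA on the right


# Hypothetical layout: four landing sites on a 100-position stretch.
sites = {10: ">", 20: "<", 30: ">", 40: "<"}

print(extrude_loop(15, sites, 100))  # loaded between sites 10 and 20
print(extrude_loop(25, sites, 100))  # loaded between sites 20 and 30
```

In the second call the complex slides straight past the sites at 20 and 30, because from its point of view they face the wrong way. It stops only at 10 and 40. Because a loop grows from a single starting point on one strand, it cannot wrap around a distant piece of DNA, which is one way to see why extruded loops stay free of knots.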
This explanation is almost perfect. It accounts for everything that the team have seen in their work: why the loops don’t get tangled, and why the CTCF landing sites align the way they do. “This is an important milestone in understanding the three dimensional structure of chromosomes, but like most great papers, it raises more questions than it provides answers,” says Kim Nasmyth, a biochemist at the University of Oxford who first proposed the concept of an extrusion complex in 2001.