Treasure hunting in our junk DNA

The DNA that was once deemed useless may hold answers to some of the unsolved questions in biology. Sofie Bates talks with scientists about the emerging field of lncRNAs and why these molecules are so exciting. Illustrated by Stephanie Kinkel.


Scientists sequenced the human genome in 2003, stringing together the genetic letters of our DNA. But when they began to decipher those 3 billion letters, scientists were surprised to find that 99 percent of our DNA appeared to be junk – just meaningless genetic material between the important chunks.

Since then, a treasure trove of molecules has been found within that 99 percent. That so-called junk DNA produces about 30,000 molecules called long non-coding RNAs (lncRNAs). These molecules are long strands of genetic material, some of which loop and coil to interact with other molecules and regulate what happens in our cells.

“One of the main motivations for studying lncRNAs – and indeed studying all of the ‘junk’ DNA – is that there’s a very big elephant in the room of biology,” said Rory Johnson, who uses computers to study thousands of lncRNAs at a time at the University of Bern in Switzerland. What are these molecules produced by the junk DNA, what are they doing, and could there be other important molecules buried in the junk?


Scientists hope that junk DNA holds answers to some of the lingering questions of biology, which may lead to new cures for intractable diseases. Though the field of lncRNA biology is still in its infancy, early successes – such as the development of a lncRNA-based test for prostate cancer – are promising, lncRNA researchers say. But before we see more widespread use of lncRNAs in the clinic, scientists need to understand what lncRNAs are, what they’re doing, and how they fit in with what we already know.

Susan Carpenter, a researcher at the University of California, Santa Cruz, is one of the scientists trying to answer these questions. She’s spent a decade working to understand these enigmatic molecules, but she still gets giddy as she wonders aloud about the mysteries of lncRNAs.

“It’s like if you had just identified proteins – having never seen them before – and you were like ‘There’s so many of them! What are they doing? Why are they important?’ I feel like we’re at the same place,” she said. “It’s basically like discovering proteins all over again.”

Fitting into a convoluted network

A strand of DNA. Scientists believe some of the junk DNA produces lncRNAs.

A Dutch chemist was the first to describe proteins in 1838. He and other scientists noticed that certain molecules clumped together when they were exposed to acid or heat, similarly to how egg whites congeal as they cook. A few decades later, scientists discovered the existence of DNA — though it took another century to discern that DNA is responsible for storing information in the cell.

Information in our cells is passed from DNA to molecules called messenger RNAs (mRNAs) and then to proteins. Instructions for making and operating a human body are stored in the DNA, transcribed into mRNAs, then carried out by proteins, the hardworking laborers of the cell.

The discovery of proteins has led to countless research breakthroughs, which in turn have led to new ways to diagnose and treat the ailments of the human body. But as scientists learn more, they also learn how much they don’t know. The molecules in our cells don’t operate in a perfectly linear workflow, but in a convoluted network. Scientists still don’t understand the nuances of that network.

“If we want to understand how a human being works, we probably can’t just understand it in terms of proteins. There’s a lot of complexity that makes us human beings, and that complexity may be lying in the ‘junk’ DNA,” said Johnson.

Some of that complexity may lie in molecules such as lncRNAs.

Sofie Bates tells the story of the race to sequence the human genome, and the surprising discovery that came after. Illustration by Stephanie Kinkel.

Working with a work in progress

If the defining characteristic of mRNAs is that they carry messages, the defining characteristic of lncRNAs is that they don’t. mRNAs are like the paperboys of the cell, scurrying off to deliver instructions for making new proteins. For lncRNAs, the messages are blank.

Scientists have long known about a few lncRNAs, such as Xist, which coats one of the two X chromosomes in female cells and switches it off. But molecules like Xist were once considered an exception, not the norm.


Now, scientists have identified anywhere between 10,000 and 100,000 lncRNAs in humans. The only characteristics they share are that they’re longer than 200 genetic letters and they don’t contain a message. That’s it.

“That we define it that way shows how little we know about them,” said Daniel Lim, a researcher at UCSF studying lncRNAs in brain diseases.

The definition is a work in progress, and the concept of “lncRNAs” will likely change as scientists identify new ones and learn more about them. Of the thousands of lncRNAs they’ve found, many might just be noise, a byproduct of sloppy evolution and poor planning by nature. But scientists speculate that some lncRNAs play critical roles in the human body because their levels differ greatly between sick and healthy people.

For example, in a review published in the Journal of Hematology & Oncology in 2018, researchers reported that one lncRNA is more abundant in patients with several types of cancer, including cancers in white blood cells and the brain, than in healthy people. Because this particular lncRNA, MALAT1, is prevalent in many cancers, some scientists speculate that it helps drive the development of tumors.

Scientists have also found that the number and type of lncRNAs in our cells fluctuate to respond to and initiate changes in the cell, allowing lncRNAs to exert finer-tuned control than is possible with other cellular regulators. Levels of lncRNAs can also change drastically in a particular tissue, like the brain or the lung, while remaining stable throughout the rest of the body. Because they change with such precision, lncRNAs may have a unique ability to help scientists understand how our cells change in cancer and other diseases.

lncRNAs across the tree of life

Some of the lncRNAs that have been identified appear to be performing essential tasks that keep our cells healthy. The challenge is figuring out which lncRNAs are important and which aren’t, then figuring out what the important ones do.

The tree of life based on genetic sequences. By David Hills.

For instance, Igor Ulitsky, a researcher at the Weizmann Institute of Science in Israel, is comparing lncRNAs in humans to those in other animals. He’s looking for those that exist in both humans and other species. He reasons that if a lncRNA has stuck around for millions of years in radically different species across the tree of life, such as humans and zebrafish, it’s probably doing something important and is therefore worth studying.

Ulitsky is using this trick to sift through thousands of lncRNAs, helping other scientists decide which to study in more depth. In a 2015 study published in Cell Reports, Ulitsky and colleagues reported that most lncRNAs are unique to individual species. However, they found over a thousand lncRNAs that appeared to have similar functions in mammals from humans to other primates to mice. Those are the ones most worth studying, Ulitsky says: “There are thousands of lncRNAs in our genome but, experimentally, we can only go after very few of those at a time.”

Inflamed curiosity

Susan Carpenter has been studying one of these lncRNAs at the bench for about a decade, and she’s just now starting to understand what it does and how it works.

Photo of Susan Carpenter by Carolyn Lagattuta.

She was drawn to lncRNA-Cox2 because she thought it might be involved in a physiological reaction called inflammation, the body’s first line of defense and a component of many diseases.

LncRNA-Cox2 sits right next to the Cox2 gene, which is essential to trigger the redness, heat, and swelling of inflammation. Cox2 is even targeted by drugs like Advil.

Carpenter wondered whether lncRNA-Cox2 also influenced inflammation. To answer that question, she started by creating mouse models.

She genetically engineered mice so that some were missing snippets of the Cox2 gene. She made other mice with the nearby lncRNA-Cox2 switched on or off. Carpenter wanted to see how deleting pieces of Cox2 and lncRNA-Cox2 changed the inflammatory response in the mice. That would give her some clues as to which parts were essential for the lncRNA’s function and what each piece was doing.

The answer was more complicated than she anticipated.

So far, she’s found that just this one lncRNA works in a multitude of ways. For one, lncRNA-Cox2 interacts with the Cox2 gene to turn up its activity and trigger the inflammatory response in the mice.


Carpenter suspects that the lncRNA recruits cellular scribes to the Cox2 gene, allowing them to copy the information in the DNA into messages that mRNAs will deliver. Those mRNAs will sound the alarm and tell the cell to start manufacturing Cox2 proteins to fight the infection.

However, Carpenter has found that lncRNA-Cox2 interacts with more than just its neighboring gene, Cox2. It also changes the activity of far-away genes to turn them on or off. In this way lncRNA-Cox2 may act like a master switch to kick-start the inflammation process.

She also found that lncRNA-Cox2 is almost always turned on in the lung, perhaps poised to start an inflammatory response that might defend against infection; the lungs are constantly exposed to the environment as animals breathe in microbes, so lung cells have to be ready to fight off an infection with every breath.

Carpenter’s research indicates that lncRNA-Cox2 is a key player in inflammation — the body’s initial defense system that is triggered in nearly every disease, from neurodegeneration to chronic arthritis to cancer.

“It’s so interesting that we have this fundamental process that we need to keep us healthy, and that often it’s gone out of control in many different diseases,” said Carpenter. She hopes that understanding the role lncRNA-Cox2 plays in inflammation will lead to a better understanding of what goes wrong in the human body in many types of disease.

“Our thinking is that if we could better understand what controls inflammation, then we might be able to find drug targets that would have a major impact on all of those diseases,” said Carpenter.

A long but promising road

It’s taken years of lab work for Carpenter and her team to get these answers that just barely scratch the surface of everything that lncRNA-Cox2 is doing. “We’ve just studied this one example, and this is how complex it is,” she said.

Though there may be similarities between this lncRNA and others, every lncRNA is likely unique. Different lncRNAs may do the fine-tuned regulation that keeps our cells functioning properly from day to day. That lncRNAs are so complex, changing rapidly to control and respond to changes in the cell, might also make them a good indicator of health problems. That’s why many scientists are optimistic about using lncRNAs as biomarkers — molecules that signal physiological changes — to diagnose patients in the clinic by measuring lncRNAs in urine, blood, or bone marrow.

In 2009, for example, scientists at Gen-Probe Inc., a biotech company in southern California, conducted the first clinical trial to measure a lncRNA. They measured levels of the PCA3 lncRNA, which had been linked to the development of prostate cancer, in the urine of 495 men. The scientists found higher amounts of the PCA3 lncRNA in the urine of men who tested positive for prostate cancer than in those who did not.

For Carpenter, the success of lncRNA-PCA3 demonstrates that lncRNA research at the bench has the potential to one day help patients in doctors’ offices worldwide. In 2012, the US Food and Drug Administration approved the first lncRNA-based cancer biomarker, using the PCA3 lncRNA. Through a simple urine test, doctors can analyze the levels of the PCA3 lncRNA to help diagnose prostate cancer.

However, scientists are cautious in their expectations: many biomarkers that look promising in initial studies don’t hold up in clinical trials. So it’s not clear how many more lncRNA-based tests will succeed, and most lncRNA research is far from ready for the clinic.

The lncRNA researchers working on this problem are cautiously optimistic. But they’re also tenacious.

“If we want to identify something new, we have to start with the fundamentals,” said Carpenter. “How do you ever get to the end if you don’t start at the very beginning?”

© 2019 Sofie Bates / UC Santa Cruz Science Communication Program

Sofie Bates


Author

B.S. (genetics and genomics, minor in professional writing) University of California, Davis

Internships: Inside Science video team and Science News

When I was eleven, I convinced my classmates that dragons were real. I’d watched a film about dragons—complete with dissection of a computer-generated “dragon”—and felt compelled to share such groundbreaking science with my classmates. As I presented my findings to my classmates, they bounced in their plastic chairs, hands raised high with questions. My teacher intervened, explaining that dragons didn’t exist—but “mockumentaries” did.

My passion for communicating science resurfaced during a college research conference. As I explained my genetics experiments, I realized I was more excited to talk about the science than I was to do more benchwork. Now, I hope to instill the same sense of wonder in grown-ups as I did in my sixth-grade classmates. But I’ve learned my lesson: fact-check first.

Stephanie Kinkel


Illustrator

B.S. (molecular biology) University of San Diego

M.S. (biology) Massachusetts Institute of Technology

Internship: Millennium School (Berkeley, California)

Stephanie is a teacher, illustrator, and molecular biologist. Currently working as a STEM educator at the Millennium School in San Francisco, she is also a freelance science illustrator.

Her interest in the relationship between cells and human disease led her to earn her B.S. from the University of San Diego (UCSD) and then her M.S. from the Massachusetts Institute of Technology (MIT). Beyond her own studies, she taught science and math for six years before her admission to the Science Illustration program.

In her work and overall life, Stephanie tends to be heartfelt and detail-oriented. Whether working in paint, colored pencil, or digitally, Stephanie hopes to communicate the details of a phenomenon or place. Striving to capture the motion and wonder of the natural world, she hopes to convey the beauty of its smallest parts.

 
