How Exactly Does One Map the Human Gut?

Mary Brophy Marcus

There was a recent flurry of excitement when UK researchers from the Wellcome Sanger Institute, a nonprofit genomics and genetics research organization in Cambridge, England, announced that they had created the most detailed cell map of the human gastrointestinal (GI) tract to date. Built from spatial and single-cell data from 1.6 million cells (from 271 donors), the work, published in Nature, yields valuable information about the gut in health and disease.

“We now have a nice idea of what cell types we can find in the gut,” said co-author Rasa Elmentaite, PhD, co-founder and head of Genomics and Data Science at Ensocell.

No small question: How, exactly, does one go about mapping the human gut?

The story behind the construction of the Gut Cell Atlas, an arm of the larger Human Cell Atlas project (which published more than 40 studies in Nature in November 2024), is one of global collaboration and scientific precision.

Most of all, Elmentaite said, it has taken the scientists involved down paths they never anticipated.

How It Began

Over the past 5-7 years, there have been many individual publications across institutes where research labs have created snippets of what the cells in the gut look like, said Elmentaite, referring to the GI tract, which in this case encompasses the mouth, throat, esophagus, stomach, small intestine, large intestine, rectum, and anus.

Labs studying gut cells often focus on one specific region, for example, the small intestine or the large intestine, if they’re working on Crohn’s disease research. “It makes sense. The GI tract is really regionalized,” she said.

In 2019 and 2021, Elmentaite and her colleagues published papers that mapped some cells in the human gut, and their work garnered attention from other scientists around the world. That’s how they decided to take on the challenge of creating a more comprehensive map of the cells in both the healthy gut and in GI tracts where disease is present.

“We realized there are so many other labs interested in understanding the intestinal tract holistically,” she said. “We thought it would be really interesting to look at the cells in the context of the whole GI system.”

Not a Bad ‘Lockdown Project’

During the pandemic, while the rest of society was playing with sourdough starter, Elmentaite’s lab decided to pool their datasets with data from other labs to see what would happen. They had more than 20 datasets at the time and wanted to generate more to fill gaps where less information was available (eg, data on the cells in the stomach).

“It required getting samples from surgeons, going into the lab and processing those samples into single cells, and then processing that data bioinformatically,” said Elmentaite.

She explained that generating more datasets was not a job for one person and “required convincing a lot of scientists that it was worthwhile,” including a cancer biologist and a mucosal immunologist. They also brought in more technical bioinformatics expertise and recruited an IT team of several people who worked only on processing and cleaning the data, “so it was aligned very specifically, and processed very uniformly, across the study,” Elmentaite said.

In the end, with researchers coming together on many Zoom calls from Australia, Germany, Norway, Spain, and numerous locations across the United Kingdom, they were able to integrate 25 single-cell RNA sequencing datasets that encompassed the entire healthy GI tract in development and in adults, leading to a healthy gut reference atlas that includes about 1.1 million cells and 136 cell subgroups.

How to Build a Two-Dimensional (2D) Map of the Gut

Here’s where the weeds get taller: Raw sequencing reads — bits of different RNA that the researchers sequenced from individual biological samples — needed to be mapped back to genes so that the researchers could understand what kinds of genes were expressed in the samples.

“There is a lot of curation there that needs to happen because the versions of transcriptome [a complete set of RNA molecules in a single cell or tissue], and what the genome looks like, is evolving constantly. So we have to map it back to the same transcriptomic reference,” Elmentaite said.

Once they knew all the different genes active in a cell, the researchers were able, using bioinformatics, to pull all of the datasets together and visualize them in a 2D space — essentially, a representation of a map.

“The cells that are transcriptionally more similar to each other, they will cluster together. And the ones that are more distinct transcriptionally will be farther away from each other,” Elmentaite said.
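As a rough illustration of that idea, and not the atlas team's actual pipeline, the sketch below log-transforms made-up counts for three well-known marker genes (EPCAM for epithelial cells, CD3E for T cells, MS4A1 for B cells) and compares cells by Euclidean distance. The counts, the cell names, and the helper functions `log_normalize`, `distance`, and `nearest` are all hypothetical; real analyses use thousands of genes and a dimensionality-reduction step before anything is drawn in 2D.

```python
import math

# Toy counts for three marker genes, in this order: EPCAM, CD3E, MS4A1.
# Every number here is invented for illustration only.
cells = {
    "epithelial_1": [120, 0, 1],
    "epithelial_2": [95, 2, 0],
    "t_cell_1": [1, 80, 3],
    "b_cell_1": [0, 4, 70],
}

def log_normalize(counts, scale=10_000):
    """Scale counts to a fixed total per cell, then apply log(1 + x)."""
    total = sum(counts)
    return [math.log1p(c / total * scale) for c in counts]

def distance(a, b):
    """Euclidean distance between two expression profiles."""
    return math.dist(a, b)

profiles = {name: log_normalize(counts) for name, counts in cells.items()}

def nearest(name):
    """Return the name of the cell whose profile is closest to `name`."""
    others = [n for n in profiles if n != name]
    return min(others, key=lambda n: distance(profiles[name], profiles[n]))

print(nearest("epithelial_1"))
```

In this toy data, the two epithelial cells are each other's nearest neighbors, which is the property that makes similar cells land close together on the map.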

For decades, GI tract researchers had individually studied which types of genes are activated in an epithelial cell vs a T cell vs a B cell, she said, and “then suddenly you’re seeing it on your screen all at once. It’s amazing to see.”

In Brutal Detail

One of the biggest challenges in interpreting the data was making sure that technical differences between the samples were accounted for, a task that involved many people, Elmentaite said.

For example, samples processed in different labs may have sequencing differences between them. Or the kind of enzymes used to process the tissue might have been different from lab to lab. Sometimes they’d even record details as fine as the time of day a sample was collected, if that information was available.

“Computationally, we considered all of these variables and tried to regress as much of that as possible. And then, if we saw clusters that resembled gene expression that we know is consistent with some of the cell types, then we knew that we’d regressed all the batch effects,” said Elmentaite.
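A deliberately simplified sketch of what "regressing out" a technical variable can mean: the code below removes each lab's average offset for a single gene, which is equivalent to a linear regression on lab as a categorical covariate. It is an assumption-laden toy, not the method used for the atlas; the lab names, the expression values, and the function `regress_out_batch` are hypothetical, and the approach only behaves well when each lab contributes a similar mix of cell types.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements of one gene: (lab, cell_type, expression).
# lab_B reads 2 units higher than lab_A for purely technical reasons.
observations = [
    ("lab_A", "epithelial", 5.0),
    ("lab_A", "t_cell", 1.0),
    ("lab_B", "epithelial", 7.0),
    ("lab_B", "t_cell", 3.0),
]

def regress_out_batch(rows):
    """Subtract each lab's mean, then add back the overall mean."""
    by_batch = defaultdict(list)
    for batch, _, value in rows:
        by_batch[batch].append(value)
    overall = mean(value for _, _, value in rows)
    batch_mean = {batch: mean(values) for batch, values in by_batch.items()}
    return [
        (batch, cell_type, value - batch_mean[batch] + overall)
        for batch, cell_type, value in rows
    ]

corrected = regress_out_batch(observations)
for row in corrected:
    print(row)
```

After the correction, epithelial cells from both labs share one value and T cells share another, so the remaining difference reflects cell type rather than where the sample was processed.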

Before ‘Eureka’ Come a Few ‘Aha’ Moments

There were many “aha moments,” but also some puzzling ones, she said. One of the biggest surprises was seeing that the identity of a cell believed to be set from development can actually change if a person has a lot of irritation and inflammation, as occurs in inflammatory bowel disease. Epithelial cell metaplasia was one of the surprises.

“Metaplasia describes an activation or differentiation of one differentiated cell type into another differentiated cell type. And we knew that that exists in some of the upper GI diseases, but we didn’t realize that the same sort of mechanism exists in the small intestine,” Elmentaite said.

Epithelial cells are among the most abundant cells in the body. The researchers knew from established research that these cells act a bit like first responders and provide healing to the gut. “We could see that they were producing a lot of mucus that helps to potentially flush down the microbes that are triggering inflammation. But the ‘aha moment’ for us was that actually there’s a dual function in these cells,” she said.

They could see in the transcriptional profile a significant production of chemokines (a family of proteins that play a role in the body’s immune responses) and major histocompatibility complex class II molecules (cell surface proteins involved in the body’s immune response), and the epithelial cells seemed to attract neutrophils and monocytes and to interact with certain T cells — adding to the cycle of chronic inflammation.

“I think for a lot of us, this was a surprise because we think of epithelial cells as more like a barrier, just a passive player in, for example, inflammatory bowel disease,” said Elmentaite.

In addition to reporting their findings on epithelial cell metaplasia, the researchers processed 12 GI disease datasets, including celiac disease, Crohn’s disease, GI-related cancers, and ulcerative colitis.

A Gut Cell Atlas for the People (Well, Scientific People)

“The Gut Cell Atlas is usable for everyone, and everyone has a chance to contribute,” said Elmentaite. But 1.6 million cells is only a start and more data are needed, she said, and the broader Human Cell Atlas project is spearheading efforts to collect them. There are also efforts to get all researchers to use the same cell annotations, research-wide standards that would make the atlas truly usable for everyone.

Keith Summa, MD, PhD, assistant professor of medicine in the Division of Gastroenterology and Hepatology at Feinberg School of Medicine, Northwestern University, Chicago, focuses much of his work on inflammatory bowel disease as well as disorders of the gut-brain interaction. He conducts basic and translational research using experimental models of intestinal inflammation.

Northwestern University has a tissue biorepository that includes colon and intestine tissue samples from healthy individuals as well as individuals with different disease states. “One potential area where I could see this being a useful tool for us is that we could utilize the Gut Cell Atlas in the analysis of our tissue samples to see if we find commonalities between what’s described in the atlas and then what we’re observing,” said Summa, who was not involved in the UK-based project.

He said Northwestern University’s biorepository has details about what medications people were on and the extent and severity of their disease.

“We may be able to utilize the Gut Cell Atlas to help us look at specific factors in terms of how people are responding to different treatments, how different cell types are affected by different treatments, or how they are active or not active in different severities of disease. It may help us provide more precision to our understanding of these different conditions,” said Summa.

He also noted how GI specialists use “Crohn’s disease” as a single diagnosis, “but that encompasses a pretty wide spectrum of phenotypes and behaviors,” he said. “I think looking at such a cell-specific level may help to better identify some of the different pathways that are active and the different phenotypes or behaviors of this disease.” 

Next Step: Visualize It in a Three-Dimensional (3D) Space

The next stage for Elmentaite and her colleagues is to evolve the 2D map into a 3D map so they can visualize which cells are where and how they are organized in space.

“The next step, which I think is super exciting, is understanding how these cells depend on each other,” she said. “Really functionally understanding if some of them are essential and others are not. And doing AI modeling to understand what we can learn from this vast amount of data that we’re generating. And of course, that includes how we use this knowledge to create precision therapies.”
