How disease sleuths are using genomics to track the coronavirus

Rapid sequencing of viral genomes can help public health officials figure out the origins, spread and nature of quickly moving epidemics
By Bob Holmes
May 3, 2020

In the early stages of a pandemic like Covid-19, public health officials need a lot of answers fast. How quickly is the virus spreading, and through which routes? How can we contain it? And when can we safely relax the most stringent control measures such as shelter-in-place?

Answering those questions is never easy, but in the face of the new coronavirus, epidemiologists have a powerful tool that wasn’t available for the earlier SARS and MERS epidemics (also caused by coronaviruses): rapid, large-scale sequencing of viral genomes. These genetic sequences from viruses that have infected patients, together with old-fashioned tracing of personal contacts, allow health officials to track the spread of a virus from person to person and place to place faster and more accurately than ever before. That speed, they hope, will translate into earlier control of the virus, and more precise management of the pandemic’s end stages.

Geneticists have been able to sequence viral genomes for decades, of course, but the latest advances in the technology mean they can now do so in a matter of hours or days. Just as quickly, scientists around the world can share what they learn via a global open-source network known as Nextstrain. That speed and cooperation have been a game-changer, enabling this “genomic epidemiology” to be used in real time as the Covid-19 pandemic unfolds.

“We have used genomic epidemiology in other contexts where we were getting sequence in a month or a few weeks, but we’ve never had anything where we’ve had such fast turnaround or the number of sequences being shared from so many places so quickly,” says Emma Hodcroft, a genetic epidemiologist at the University of Basel in Switzerland and member of the Nextstrain network.

S. WOHL ET AL / AR VIROLOGY 2016 / KNOWABLE MAGAZINE
Using genome sequences, researchers can deduce evolutionary relationships between different versions of the virus, helping to track the origin of a pandemic. From this and other information, they can reconstruct how and where the virus may have spread from person to person.

Sloppy copies

Much of the power of genomic epidemiology stems from the fact that most viruses make lots of mistakes when they copy their genomes, so changes in the sequence — that is, new mutations — turn up relatively often. That’s especially true of viruses that use RNA as their genetic material, as coronaviruses do. Very few of these mutations affect how the virus behaves — most have no apparent consequence at all — but researchers can use them as markers to build a family tree of the virus and to see how the virus has changed over time and how it has spread from locale to locale.

Early in the Covid-19 outbreak, researchers all over the world began sequencing viruses sampled from patients and building a family tree of the virus on Nextstrain. Almost immediately, they could see that the tree was short: the virus sequences had not yet accumulated many distinct mutations, meaning that the new coronavirus, SARS-CoV-2, hadn’t been infecting humans for long. Moreover, the tree had a single trunk, indicating that every virus infecting humans likely descended from a single case.

In contrast, periodic outbreaks of MERS in humans in the 2010s look more like a shrubland: multiple small clusters of virus genotypes that are not closely related to one another, indicating that MERS must have jumped repeatedly from camels to humans and then fizzled out.

The SARS-CoV-2 virus’s genetic mutability also means that epidemiologists can use changes in its genome to trace the spread of the virus during an epidemic. That’s because most mutations are essentially random, so each branch of the virus tree is likely to bear its own unique set of mutations. If one person’s virus contains mutations A, B and C, for example, that person could have caught it from someone whose virus carries A and B or A and C, but not from someone whose virus has A, B, C and D.
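The A, B, C reasoning above can be written as a simple set comparison. The following Python snippet is only an illustrative sketch, not how Nextstrain or any real pipeline works: it assumes mutations accumulate and are never lost, and the function name `could_be_source` and the mutation labels are hypothetical.

```python
def could_be_source(source_mutations, recipient_mutations):
    """Return True if a virus carrying `source_mutations` could be the
    ancestor of (or identical to) one carrying `recipient_mutations`.

    Simplifying assumption (not from the article): mutations only
    accumulate along a transmission chain and never revert.
    """
    return set(source_mutations) <= set(recipient_mutations)


# The patient from the example, whose virus carries mutations A, B and C.
patient = {"A", "B", "C"}

# Hypothetical candidate sources, labeled by the mutations they carry.
candidates = {
    "carrier of A and B": {"A", "B"},
    "carrier of A and C": {"A", "C"},
    "carrier of A, B, C and D": {"A", "B", "C", "D"},
}

results = {
    label: could_be_source(mutations, patient)
    for label, mutations in candidates.items()
}

for label, possible in results.items():
    print(f"{label}: {'possible source' if possible else 'ruled out'}")
```

The subset test captures the key idea: a source can have fewer mutations than the person it infected, but not extra ones.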

J.L. GARDY & N.J. LOMAN / NATURE REVIEWS GENETICS 2018 / KNOWABLE MAGAZINE
Mutations in a viral genome can serve as genetic breadcrumbs, giving scientists insight into viral origins and spread.

Early in the current pandemic, Nextstrain noted the appearance of identical or near-identical coronavirus genomes from people in countries as widely spaced as Canada, Australia and the UK. The genomes were so similar that scientists inferred they must have shared a common source. That red flag prompted further questioning, which revealed that all of the sick had recently travelled to Iran.

“We could confirm that these patients must have been infected in Iran, because that’s the only thing they had in common,” says Hodcroft. Without the genomes, nothing would have linked those patients, and the Iranian connection would not have been noticed as quickly. Similarly, most viral genomes in the New York City region closely , suggesting that infections came from there, not directly from China.

Of course, epidemiologists also track transmission routes the traditional way, by interviewing people and tracing their contacts. However, this method can’t keep up in the face of a pandemic, where thousands of new cases are added every day.

“There’s an advantage to old-fashioned shoe-leather contact tracing, because you can actually talk to people and find out who they spoke to,” says Hodcroft. “But as the number of cases rises, you cannot contact-trace everyone. You just don’t have enough people. That’s where using genetics can be a big help.”

Viral family tree

Genomes can be especially good at answering a key public health question early in an epidemic: Are new infections in a given locality imported by travelers, or are they homegrown? The latter — the result of the virus circulating within the community — would create a need for the social-distancing measures now familiar to so many of us.

“If you’re seeing strains that are really, really similar, that suggests that they’re transmitting locally,” says Shirlee Wohl, a genomic epidemiologist at Johns Hopkins Bloomberg School of Public Health and coauthor of a review of the field in the 2016 Annual Review of Virology. “That’s information you really can’t get from any other method.”

This portion of the evolutionary tree of SARS-CoV-2 virus shows three separate clusters of virus from Covid-19 patients in Ontario, Canada (red dots). Within each cluster, viruses are closely related, indicating local transmission, but the three clusters are more distantly related, indicating that each cluster was introduced separately from elsewhere. The most likely source is the US, based on the similarities in the viral sequences.

For example, the first Covid-19 infection in the state of Washington was in a traveler returning from Wuhan, China, where the outbreak began. When a later infection in Washington turned out to have a nearly identical sequence, this was strong evidence of community transmission — especially because the two individuals, though unacquainted, lived in the same county.
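What “nearly identical” means can be made concrete by counting the positions at which two aligned genomes differ. The sketch below is a toy illustration under stated assumptions: the 16-letter fragments are invented (real SARS-CoV-2 genomes are roughly 30,000 letters long), the sequences are assumed to be already aligned, and the function name `count_differences` is hypothetical.

```python
def count_differences(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Invented fragments for illustration only; not real viral sequence data.
first_case = "ACGTACGTACGTACGT"
later_case = "ACGTACGTACGAACGT"      # differs from first_case at one position
unrelated_case = "ACCTACGTTCGTACGA"  # differs from first_case at three positions

close = count_differences(first_case, later_case)
distant = count_differences(first_case, unrelated_case)

print(f"first vs. later case: {close} difference(s)")
print(f"first vs. unrelated case: {distant} difference(s)")
```

In this toy setup, the smaller count is the kind of signal that would point toward a shared local chain of transmission, while the larger one would be more consistent with a separate introduction. How few differences count as "close" is a judgment call that this sketch does not settle.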

Unfortunately for genetic detectives, the Covid-19 virus changes a little too slowly for optimal tracking of transmission chains, Wohl notes. HIV, in contrast, mutates so quickly that each person usually carries a unique genotype, allowing epidemiologists to pinpoint the exact source of each new infection. For the Covid-19 virus, each viral lineage accumulates about 30 new mutations per year, which works out to about one new mutation per two links in the transmission chain. As a result, exactly the same viral genome sequence can be found in several people, so genome-trackers can narrow transmission down only to a handful of suspects.
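The arithmetic behind those two figures can be checked in a few lines. The sketch below uses only the numbers quoted above, plus one labeled assumption that is not from the article: that the number of new mutations per transmission link follows a Poisson distribution.

```python
import math

MUTATIONS_PER_YEAR = 30  # figure quoted in the article
LINKS_PER_MUTATION = 2   # "one new mutation per two links"

# Average time for a lineage to pick up one new mutation.
days_per_mutation = 365 / MUTATIONS_PER_YEAR

# Time per transmission link implied by combining the two figures.
implied_days_per_link = days_per_mutation / LINKS_PER_MUTATION

# Average number of new mutations per link.
mutations_per_link = 1 / LINKS_PER_MUTATION

# ASSUMPTION (not from the article): mutations per link are Poisson
# distributed, so the chance of zero new mutations is exp(-mean).
p_identical_after_one_link = math.exp(-mutations_per_link)

print(f"days per mutation: {days_per_mutation:.1f}")
print(f"implied days per link: {implied_days_per_link:.1f}")
print(f"chance two linked cases share an identical genome: "
      f"{p_identical_after_one_link:.0%}")
```

That works out to about 12 days per mutation and about 6 days per link. Under the Poisson assumption, roughly 6 in 10 transmissions would pass the genome along unchanged, which is why identical sequences turn up in several people.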

Additional uncertainty comes from the fact that researchers can’t possibly sequence viruses from every infected individual in a widespread pandemic. As of April 20, nearly 2.5 million people worldwide had been infected with SARS-CoV-2, but Nextstrain listed just 4,558 sequences. That can lead to false conclusions. “The beautiful danger is it looks like it can tell you a lot of enticing stories,” says Hodcroft. “But we don’t know that the scenario is exactly what happened.”
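To put that gap in proportion, the two figures can be divided directly. This is a rough sketch only: “nearly 2.5 million” is rounded to 2.5 million, and reported infections are themselves an undercount of true infections.

```python
SEQUENCED_GENOMES = 4_558        # Nextstrain sequences as of April 20
REPORTED_INFECTIONS = 2_500_000  # "nearly 2.5 million", rounded for illustration

sampled_fraction = SEQUENCED_GENOMES / REPORTED_INFECTIONS
infections_per_genome = REPORTED_INFECTIONS / SEQUENCED_GENOMES

print(f"share of reported infections sequenced: {sampled_fraction:.2%}")
print(f"about one genome per {infections_per_genome:.0f} reported infections")
```

Fewer than 2 in every 1,000 reported infections had a sequenced genome, so most links in any transmission chain are invisible to the tree.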

In late February, for example, sequencers found patients in Germany and Italy who shared the same unusual viral mutation. Since the German patient had gotten sick sooner, this led some researchers to suggest that the virus had spread from Germany to Italy. In reality, though, both German and Italian patients could have caught the virus from some third person, yet unidentified, whose virus was not sequenced.

Still, these limitations have not kept genomic epidemiology from playing a key role in the Covid-19 pandemic. The approach has helped public health officials identify the pathogen, trace its travels and recognize community spread promptly. And in the months ahead, the method may have more to contribute.

Using virus sequence data, researchers can track the spread of Covid-19 around the world. The animation starts in late 2019 and shows the first virus genome sequences found in January 2020 from Wuhan, China, with disease spreading rapidly in the weeks after.

One contribution is likely to come from longer-term studies of where mutations fall in the genome. Most of the genetic changes, remember, make little or no difference to the virus: They are “neutral,” in evolutionary biologists’ parlance. But mutations that change the shape of key proteins, such as the spike protein on the surface of the virus that binds to receptors in our cells, are more likely to matter.

Looking to see how these regions have changed since the virus infected humans may eventually help virologists understand why this particular virus has been able to adapt to us so well, says Hodcroft. However, this will require painstaking experiments over many months to reveal the functional effect of each mutation. “It’s not something that’s done in an afternoon,” she says.

Before that happens, genomic epidemiology promises to help public health officials find the smartest way to relax the burdensome social-distancing measures that are so important in controlling the pandemic right now. By using genomic breadcrumbs to track the transmission of the virus, epidemiologists hope to identify which activities are most likely to spread the virus. If schools, for example, turn out to pose a relatively low risk, authorities may be able to re-open those sooner.

“That hopefully means we can start relaxing those lockdowns faster than we might have 10 years ago, when we didn’t have this technology,” says Hodcroft. But that depends on a key factor that was not much in evidence at the start of the epidemic: the willingness of politicians to heed scientists’ warnings and advice.

This article originally appeared in Knowable Magazine, an independent journalistic endeavor from Annual Reviews.

Bob Holmes is a science writer in Edmonton, Canada.
