St. Petersburg University scientists create new assembler for viral genome deciphering

Researchers at St. Petersburg State University - the alma mater of Russian President Vladimir Putin - continue their battle with the coronavirus infection and are developing new tools that will help combat not only COVID-19, but similar pathogens as well. Bioinformatics experts at St. Petersburg State University’s Center for Algorithmic Biotechnology, together with their colleagues from the University of California at San Diego, have unveiled metaviralSPAdes - a new assembler that makes it possible to single out viral genomes from among many other sequences and piece them together. This will help decipher the genomes of pathogens faster and more conveniently, and so speed up the development of test systems and vaccines against dangerous infections. The work is described in a scientific article published in the journal Bioinformatics.

When humanity is faced with a new virus, the first thing biologists do is try to decipher its genome in order to diagnose the disease and develop a vaccine. However, if sequencing has to be done amid the outbreak of a new pathogen, a problem arises. For example, the saliva of a patient with COVID-19, which was used for the very first decoding of the SARS-CoV-2 coronavirus, contained the genomes of many other, mostly harmless viruses, not to mention the hundreds of bacteria that live in the human mouth and complicate the search for viral sequences.

This example shows the importance of being able to solve a much more complex computational problem than deciphering a single genome, namely assembling metagenomes: sets of hundreds of different genomes of microorganisms living in the same environment. The problem, however, is that such work can yield thousands of sequences that may include fragments of the genetic code of both viruses and bacteria, which makes it hard to tell exactly which data belong to the desired pathogen.

In addition, scientists inevitably face another task - sequencing the metavirome in order to pick out the viral sequences hiding among much longer bacterial sequences. Once this is done, bioinformatics experts can stitch together, literally piece by piece, the complete genome of the virus that caused the outbreak.

Until recently, researchers lacked a dedicated tool for assembling viral metagenomes, but a team of Russian and US scientists from St. Petersburg State University and the University of California at San Diego has developed the metaviralSPAdes assembler, which makes it significantly easier to analyze the results of metavirome sequencing.

Biologists are still unable to read an entire genome the way we read a book, from beginning to end. Instead, they read small snippets of a genome, which is why assembling a genome is much like putting together a puzzle of a million fragments. This task is often considered one of the most complex algorithmic problems in bioinformatics. Still, it can be solved. For example, the most widely used genomic assembler, SPAdes (Saint Petersburg Assembler), also created by the Russian-American team of scientists, has already been used in almost 9,000 studies. It has helped scientists analyze the pathogens behind outbreaks of Middle East Respiratory Syndrome (MERS) in Saudi Arabia, Ebola in Congo, gonorrhea in England, meningitis in Ghana, dengue fever in Sumatra and dozens of other outbreaks in the eight years since SPAdes came along.
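
To make the puzzle analogy concrete, here is a toy sketch in Python of how short reads can be merged into one sequence using a de Bruijn graph, the general approach behind the SPAdes family of assemblers. It is an illustration only, not the SPAdes implementation: the function names, the reads and the value of k are all made up for the example, and it assumes error-free reads, a single linear genome and no repeated (k-1)-mers, none of which holds for real sequencing data.

```python
from collections import defaultdict


def kmers(read, k):
    """Return every substring of length k in the read, left to right."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def assemble(reads, k):
    """Toy de Bruijn assembly (hypothetical helper, not the SPAdes API).

    Assumes error-free reads from one linear genome in which every
    (k-1)-mer occurs only once.
    """
    edges = defaultdict(set)     # (k-1)-mer -> set of following (k-1)-mers
    indegree = defaultdict(int)  # number of distinct incoming edges per node
    for read in reads:
        for kmer in kmers(read, k):
            prefix, suffix = kmer[:-1], kmer[1:]
            if suffix not in edges[prefix]:
                edges[prefix].add(suffix)
                indegree[suffix] += 1

    # The genome starts at the only node that nothing points to.
    starts = [node for node in edges if indegree[node] == 0]
    if len(starts) != 1:
        raise ValueError("expected exactly one start node in this toy model")

    node = starts[0]
    contig = node
    visited = {node}
    # Follow the path while the next step is unambiguous and not yet seen.
    while len(edges.get(node, ())) == 1:
        (node,) = edges[node]
        if node in visited:
            break
        visited.add(node)
        contig += node[-1]
    return contig
```

Given the three overlapping made-up reads `"GTGCAAT"`, `"ATGGCGT"` and `"GCGTGCA"` in any order, `assemble(reads, 4)` rebuilds the 12-letter sequence `ATGGCGTGCAAT` they were cut from. Real data add sequencing errors, repeats and uneven coverage, which is where most of the difficulty lies.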

It should also be noted that assembling a metagenome from 1,000 genomes is far more difficult than assembling a single genome sequence. Here you have to deal with 1,000 individual puzzles instead of one: you need to put together a “picture” whose fragments are mixed with billions of pieces from other puzzles. To solve this problem, three years ago the same Russian-US team of scientists who created SPAdes developed the metaSPAdes assembler, which in turn became the leading metagenomic assembler. It made extracting viral sequences from a huge amount of data much easier, but the new-generation metaviralSPAdes assembler is able not only to locate fragments of viral genomes, but also to assemble them into a finished “puzzle” - the pathogen genome.

The COVID-19 pandemic came as a wake-up call for biologists studying animal-to-human transmission of viruses, and reminded us of the importance of studying the different carriers of this virus, such as bats, whose one-of-a-kind immune system allows them to live with multiple pathogens that are lethal to humans. We need to know the nature of the diseases they catch before, not after, a pandemic strikes.

Counting the viral genomes of various animals is not an easy task, of course, but with the creation of metaviralSPAdes, biologists will now find it much easier to reconstruct the viral genomes of bats or any other potential sources of future pandemics.

The new assembler was developed by Dmitry Antipov and Mikhail Raiko, researchers at the Center for Algorithmic Biotechnology of St. Petersburg State University’s Institute of Translational Biomedicine; Alla Lapidus, the Center’s Deputy Director and a St. Petersburg State University Professor; and Pavel Pevzner, Professor at the University of California at San Diego and a world-acclaimed expert in bioinformatics.

Earlier, scientists at St. Petersburg State University’s Center for Algorithmic Biotechnology helped their colleagues from the Smorodintsev Research Institute of Influenza to decipher, for the first time, the genome of the “Russian” variant of SARS-CoV-2, the virus that caused the COVID-19 pandemic. On March 15, 2020, they managed to extract the viral RNA from a swab sample obtained from an infected resident of St. Petersburg.

Recently, an international team of scientists led by Pavel Pevzner developed a novel computational method for detecting cyclopeptides, a class of substances that includes many well-known antibiotics. Using this method, called CycloNovo, the scientists have found 79 new potential candidates for the role of bacteria killers.
