
Researchers from the Institute for Integrative Systems Biology (I2SysBio), a joint centre of the University of Valencia (UV) and the Spanish National Research Council (CSIC), have developed CleanBar, a bioinformatics tool for analysing DNA sequences from complex samples at the level of individual cells. The study, led by Mária Džunková and published in ISME Communications, focuses on observing how phages (viruses that infect bacteria, often used to combat multidrug-resistant pathogenic bacteria) multiply within each bacterial cell, and it reveals that infections are not uniform across all cells.
“Applied to the study of viral infections, it provides a direct window to observe how viruses spread within each cell, how many copies they produce and how the infection varies between different bacteria within the same population”, explains Vicente Arnau, the application's programmer, researcher at I2SysBio and lecturer in the UV’s Department of Computer Science.
Single-cell sequencing is a technology that makes it possible to analyse the genetic material of individual cells, in contrast with traditional sequencing, which analyses an average drawn from many cells in a tissue or sample. This technique enables researchers to unravel cellular heterogeneity, identify complex cell types and understand subtle differences between cells. CleanBar is an open-source application that is easy to install and can quickly analyse sequencing files of varying sizes, from thousands to millions of sequences.
Applied to microbial sequencing, single-cell sequencing makes it possible to obtain the genetic code of individual bacteria present in a sample, along with information about bacterial interactions with plasmids and bacteriophages. In practice, each cell (or bacterium) is identified by adding a unique DNA tag to it.
Several technologies exist, but the most accessible for any laboratory is the technique known as split-and-pool barcoding, recently developed by the Lithuanian biotechnology company Atrandi Biosciences. In this type of barcoding, a set of four different sequences or tags, known as barcodes, is added to each cell in the laboratory using a 96-well plate, with no need to purchase expensive equipment. The resulting combination of barcodes allows the DNA from each cell to be told apart.
“Afterwards, the DNA sequences will have to be separated and grouped according to their barcodes and, in this way, we will obtain the sequenced DNA of each individual cell. Researchers from three different groups at I2SysBio collaborated on this work to present our software, CleanBar, a tool for separating the DNA sequences obtained according to the barcodes used to identify each of the bacteria in a sample”, explains Vicente Arnau. He adds that CleanBar can also detect barcodes in variable positions, with spacing between them that differs from what is expected, and can predict misclassifications in which one or two barcodes fail.
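As a rough illustration of the separating and grouping step described above, the following Python sketch assigns toy reads to cells by their four-barcode combination. It is not CleanBar's code or algorithm: the barcode sequences, the labels, the `max_gap` tolerance and the function names `find_barcodes` and `demultiplex` are all hypothetical, chosen only to show how barcodes at slightly shifted positions can still be found and how reads with a missing barcode can be set aside.

```python
from collections import defaultdict

# Hypothetical toy barcode sets, one per barcoding round (not real kit sequences).
ROUNDS = [
    {"ACGT": "A1", "TGCA": "A2"},
    {"GATC": "B1", "CTAG": "B2"},
    {"AACC": "C1", "GGTT": "C2"},
    {"ATAT": "D1", "CGCG": "D2"},
]


def find_barcodes(read, rounds=ROUNDS, max_gap=3):
    """Scan a read for one barcode per round, in order.

    Each barcode may start up to `max_gap` bases later than expected
    (an assumed tolerance). Returns (labels, insert_start), where a label
    is None when no barcode of that round was found.
    """
    labels = []
    pos = 0
    for table in rounds:
        k = len(next(iter(table)))  # barcode length for this round
        found = None
        last_start = min(pos + max_gap, len(read) - k)
        for start in range(pos, last_start + 1):
            label = table.get(read[start:start + k])
            if label is not None:
                found = (label, start + k)
                break
        if found is None:
            labels.append(None)
            pos += k  # assume the failed barcode still occupied its slot
        else:
            labels.append(found[0])
            pos = found[1]
    return labels, pos


def demultiplex(reads):
    """Group read inserts by barcode combination; keep incomplete reads apart."""
    cells = defaultdict(list)
    unassigned = []
    for read in reads:
        labels, insert_start = find_barcodes(read)
        if None in labels:
            unassigned.append(read)
        else:
            cells[tuple(labels)].append(read[insert_start:])
    return dict(cells), unassigned


reads = [
    "ACGTGATCAACCATATTTTTGGGG",   # all four barcodes at the expected positions
    "ACGTTGATCTTAACCATATCCCC",    # same cell, with extra bases between barcodes
    "TGCACTAGGGTTCGCGAAAA",       # a different cell
    "ACGTGATCTTTTATATGGGG",       # third barcode unreadable
]
cells, unassigned = demultiplex(reads)
for combo, inserts in cells.items():
    print("-".join(combo), inserts)
print("unassigned:", len(unassigned))
```

In this sketch, the first two reads end up in the same group despite their different spacing, and the read with a failed barcode is reported separately rather than assigned to a cell. A real tool would also need to handle sequencing errors inside the barcodes themselves, which this example does not attempt.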
This project is funded by the Gen-T programme of the Valencian Regional Government (GV), the Investigo programme (GV) and the Spanish Ministry of Science, Innovation and Universities.
Article reference: Vicente Arnau, Alicia Ortiz-Maiques, Juan Valero-Tebar, Lucas Mora-Quilis, Vaida Kurmauskaite, Lorea Campos Dopazo, Pilar Domingo-Calap, Mária Džunková, CleanBar: a versatile demultiplexing tool for split-and-pool barcoding in single-cell omics, ISME Communications, Volume 5, Issue 1, January 2025, ycaf134, https://doi.org/10.1093/ismeco/ycaf134








