Articles

Efficient privacy-preserving variable-length substring match for genome sequence

The development of a privacy-preserving technology is important for accelerating genome data sharing. This study proposes an algorithm that securely searches a variable-length substring match between a query a...

Authors: Yoshiki Nakagawa, Satsuya Ohata and Kana Shimizu

Citation: Algorithms for Molecular Biology 2022 17:9

Content type: Research

Published on: 26 April 2022

Tree diet: reducing the treewidth to unlock FPT algorithms in RNA bioinformatics

Hard graph problems are ubiquitous in Bioinformatics, inspiring the design of specialized Fixed-Parameter Tractable algorithms, many of which rely on a combination of tree-decomposition and dynamic programming...

Authors: Bertrand Marchand, Yann Ponty and Laurent Bulteau

Citation: Algorithms for Molecular Biology 2022 17:8

Content type: Research

Published on: 2 April 2022

Adding hydrogen atoms to molecular models via fragment superimposition

Most experimentally determined structures of biomolecules lack annotated hydrogen positions due to their low electron density. However, thorough structure analysis and simulations require knowledge about the p...

Authors: Patrick Kunzmann, Jacob Marcel Anter and Kay Hamacher

Citation: Algorithms for Molecular Biology 2022 17:7

Content type: Software

Published on: 29 March 2022

Perplexity: evaluating transcript abundance estimation in the absence of ground truth

There has been rapid development of probabilistic models and inference methods for transcript abundance estimation from RNA-seq data. These models aim to accurately estimate transcript-level abundances, to acc...

Authors: Jason Fan, Skylar Chan and Rob Patro

Citation: Algorithms for Molecular Biology 2022 17:6

Content type: Research

Published on: 25 March 2022

Space-efficient representation of genomic k-mer count tables

k-mer counting is a common task in bioinformatic pipelines, with many dedicated tools available. Many of these tools output k-mer count tables containing both k-mers and counts, easily reaching tens of...

Authors: Yoshihiro Shibuya, Djamal Belazzougui and Gregory Kucherov

Citation: Algorithms for Molecular Biology 2022 17:5

Content type: Research

Published on: 21 March 2022

Fast characterization of segmental duplication structure in multiple genome assemblies

The increasing availability of high-quality genome assemblies raised interest in the characterization of genomic architecture. Major architectural elements, such as common repeats and segmental duplications (S...

Authors: Hamza Išerić, Can Alkan, Faraz Hach and Ibrahim Numanagić

Citation: Algorithms for Molecular Biology 2022 17:4

Content type: Research

Published on: 18 March 2022

Parsimonious Clone Tree Integration in cancer

Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to...

Authors: Palash Sashittal, Simone Zaccaria and Mohammed El-Kebir

Citation: Algorithms for Molecular Biology 2022 17:3

Content type: Research

Published on: 14 March 2022

Efficiently sparse listing of classes of optimal cophylogeny reconciliations

Cophylogeny reconciliation is a powerful method for analyzing host-parasite (or host-symbiont) co-evolution. It models co-evolution as an optimization problem where the set of all optimal solutions may represe...

Authors: Yishu Wang, Arnaud Mary, Marie-France Sagot and Blerina Sinaimeri

Citation: Algorithms for Molecular Biology 2022 17:2

Content type: Research

Published on: 15 February 2022

A new 1.375-approximation algorithm for sorting by transpositions

Authors: Luiz Augusto G. Silva, Luis Antonio B. Kowada, Noraí Romeu Rocco and Maria Emília M. T. Walter

Citation: Algorithms for Molecular Biology 2022 17:1

Content type: Research

Published on: 15 January 2022

An optimized FM-index library for nucleotide and amino acid search

Pattern matching is a key step in a variety of biological sequence analysis pipelines. The FM-index is a compressed data structure for pattern matching, with search run time that is independent of the length o...

Authors: Tim Anderson and Travis J. Wheeler

Citation: Algorithms for Molecular Biology 2021 16:25

Content type: Software article

Published on: 31 December 2021

An improved approximation algorithm for the reversal and transposition distance considering gene order and intergenic sizes

In the comparative genomics field, one of the goals is to estimate a sequence of genetic changes capable of transforming a genome into another. Genome rearrangement events are mutations that can alter the gene...

Authors: Klairton L. Brito, Andre R. Oliveira, Alexsandro O. Alexandrino, Ulisses Dias and Zanoni Dias

Citation: Algorithms for Molecular Biology 2021 16:24

Content type: Research

Published on: 29 December 2021

A simpler linear-time algorithm for the common refinement of rooted phylogenetic trees on a common leaf set

The supertree problem, i.e., the task of finding a common refinement of a set of rooted trees, is an important topic in mathematical phylogenetics. The special case of a common leaf set L is known to be solvable i...

Authors: David Schaller, Marc Hellmuth and Peter F. Stadler

Citation: Algorithms for Molecular Biology 2021 16:23

Content type: Research

Published on: 6 December 2021

Testing the agreement of trees with internal labels

A semi-labeled tree is a tree where all leaves as well as, possibly, some internal nodes are labeled with taxa. Semi-labeled trees encompass ordinary phylogenetic trees and taxonomies. Suppose we are given a c...

Authors: David Fernández-Baca and Lei Liu

Citation: Algorithms for Molecular Biology 2021 16:22

Content type: Research

Published on: 4 December 2021

Approximation algorithm for rearrangement distances considering repeated genes and intergenic regions

The rearrangement distance is a method to compare genomes of different species. Such distance is the number of rearrangement events necessary to transform one genome into another. Two commonly studied events a...

Authors: Gabriel Siqueira, Alexsandro Oliveira Alexandrino, Andre Rodrigues Oliveira and Zanoni Dias

Citation: Algorithms for Molecular Biology 2021 16:21

Content type: Research

Published on: 13 October 2021

DeepGRP: engineering a software tool for predicting genomic repetitive elements using Recurrent Neural Networks with attention

Repetitive elements contribute a large part of eukaryotic genomes. For example, about 40 to 50% of human, mouse and rat genomes are repetitive. So identifying and classifying repeats is an important step in ge...

Authors: Fabian Hausmann and Stefan Kurtz

Citation: Algorithms for Molecular Biology 2021 16:20

Content type: Software Article

Published on: 23 August 2021

Heuristic algorithms for best match graph editing

Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics as a representation of the pairwise most closely related genes among multiple species. An arc connec...

Authors: David Schaller, Manuela Geiß, Marc Hellmuth and Peter F. Stadler

Citation: Algorithms for Molecular Biology 2021 16:19

Content type: Research

Published on: 17 August 2021

A novel method for inference of acyclic chemical compounds with bounded branch-height based on artificial neural networks and integer programming

Analysis of chemical graphs is becoming a major research topic in computational molecular biology due to its potential applications to drug design. One of the major approaches in such a study is inverse quant...

Authors: Naveed Ahmed Azam, Jianshen Zhu, Yanming Sun, Yu Shi, Aleksandar Shurbevski, Liang Zhao, Hiroshi Nagamochi and Tatsuya Akutsu

Citation: Algorithms for Molecular Biology 2021 16:18

Content type: Research

Published on: 14 August 2021

INGOT-DR: an interpretable classifier for predicting drug resistance in M. tuberculosis

Prediction of drug resistance and identification of its mechanisms in bacteria such as Mycobacterium tuberculosis, the etiological agent of tuberculosis, is a challenging problem. Solving this problem requires a ...

Authors: Hooman Zabeti, Nick Dexter, Amir Hosein Safari, Nafiseh Sedaghat, Maxwell Libbrecht and Leonid Chindelevitch

Citation: Algorithms for Molecular Biology 2021 16:17

Content type: Research

Published on: 10 August 2021

Approximate search for known gene clusters in new genomes using PQ-trees

Gene clusters are groups of genes that are co-locally conserved across various genomes, not necessarily in the same order. Their discovery and analysis is valuable in tasks such as gene annotation and predicti...

Authors: Galia R. Zimerman, Dina Svetlitsky, Meirav Zehavi and Michal Ziv-Ukelson

Citation: Algorithms for Molecular Biology 2021 16:16

Content type: Research

Published on: 9 July 2021

Shape decomposition algorithms for laser capture microdissection

In the context of biomarker discovery and molecular characterization of diseases, laser capture microdissection is a highly effective approach to extract disease-specific regions from complex, heterogeneous ti...

Authors: Leonie Selbach, Tobias Kowalski, Klaus Gerwert, Maike Buchin and Axel Mosig

Citation: Algorithms for Molecular Biology 2021 16:15

Content type: Research

Published on: 8 July 2021

Distinguishing linear and branched evolution given single-cell DNA sequencing data of tumors

Cancer arises from an evolutionary process where somatic mutations give rise to clonal expansions. Reconstructing this evolutionary process is useful for treatment decision-making as well as understanding evol...

Authors: Leah L. Weber and Mohammed El-Kebir

Citation: Algorithms for Molecular Biology 2021 16:14

Content type: Research

Published on: 6 July 2021

Bayesian optimization with evolutionary and structure-based regularization for directed protein evolution

Directed evolution (DE) is a technique for protein engineering that involves iterative rounds of mutagenesis and screening to search for sequences that optimize a given property, such as binding affinity to a ...

Authors: Trevor S. Frisby and Christopher James Langmead

Citation: Algorithms for Molecular Biology 2021 16:13

Content type: Research

Published on: 1 July 2021

Using Robinson-Foulds supertrees in divide-and-conquer phylogeny estimation

One of the Grand Challenges in Science is the construction of the Tree of Life, an evolutionary tree containing several million species, spanning all life on earth. However, the construction of the Tree of Life i...

Authors: Xilin Yu, Thien Le, Sarah A. Christensen, Erin K. Molloy and Tandy Warnow

Citation: Algorithms for Molecular Biology 2021 16:12

Content type: Research

Published on: 28 June 2021

Using the longest run subsequence problem within homology-based scaffolding

Genome assembly is one of the most important problems in computational genomics. Here, we suggest addressing an issue that arises in homology-based scaffolding, that is, when linking and ordering contigs to ob...

Authors: Sven Schrinner, Manish Goel, Michael Wulfert, Philipp Spohr, Korbinian Schneeberger and Gunnar W. Klau

Citation: Algorithms for Molecular Biology 2021 16:11

Content type: Research

Published on: 28 June 2021

Disk compression of k-mer sets

K-mer based methods have become prevalent in many areas of bioinformatics. In applications such as database search, they often work with large multi-terabyte-sized datasets. Storing such large datasets is a de...

Authors: Amatur Rahman, Rayan Chikhi and Paul Medvedev

Citation: Algorithms for Molecular Biology 2021 16:10

Content type: Research

Published on: 21 June 2021

The Bourque distances for mutation trees of cancers

Mutation trees are rooted trees in which nodes are of arbitrary degree and labeled with a mutation set. These trees, also referred to as clonal trees, are used in computational oncology to represent the mutati...

Authors: Katharina Jahn, Niko Beerenwinkel and Louxin Zhang

Citation: Algorithms for Molecular Biology 2021 16:9

Content type: Research

Published on: 10 June 2021

LazyB: fast and cheap genome assembly

Advances in genome sequencing over the last years have led to a fundamental paradigm shift in the field. With steadily decreasing sequencing costs, genome projects are no longer limited by the cost of raw seq...

Authors: Thomas Gatter, Sarah von Löhneysen, Jörg Fallmann, Polina Drozdova, Tom Hartmann and Peter F. Stadler

Citation: Algorithms for Molecular Biology 2021 16:8

Content type: Research

Published on: 1 June 2021

The energy-spectrum of bicompatible sequences

Genotype-phenotype maps provide a meaningful filtration of sequence space and RNA secondary structures are particular such phenotypes. Compatible sequences, which satisfy the base-pairing constraints of a give...

Authors: Fenix W. Huang, Christopher L. Barrett and Christian M. Reidys

Citation: Algorithms for Molecular Biology 2021 16:7

Content type: Research

Published on: 1 June 2021

Fast and efficient Rmap assembly using the Bi-labelled de Bruijn graph

Genome wide optical maps are high resolution restriction maps that give a unique numeric representation to a genome. They are produced by assembling hundreds of thousands of single molecule optical maps, which...

Authors: Kingshuk Mukherjee, Massimiliano Rossi, Leena Salmela and Christina Boucher

Citation: Algorithms for Molecular Biology 2021 16:6

Content type: Research

Published on: 25 May 2021

Exact transcript quantification over splice graphs

The probability of sequencing a set of RNA-seq reads can be directly modeled using the abundances of splice junctions in splice graphs instead of the abundances of a list of transcripts. We call this model gra...

Authors: Cong Ma, Hongyu Zheng and Carl Kingsford

Citation: Algorithms for Molecular Biology 2021 16:5

Content type: Research

Published on: 10 May 2021

Natural family-free genomic distance

A classical problem in comparative genomics is to compute the rearrangement distance, that is the minimum number of large-scale rearrangements required to transform a given genome into another given genome. Th...

Authors: Diego P. Rubert, Fábio V. Martinez and Marília D. V. Braga

Citation: Algorithms for Molecular Biology 2021 16:4

Content type: Research

Published on: 10 May 2021

Improving metagenomic binning results with overlapped bins using assembly graphs

Metagenomic sequencing allows us to study the structure, diversity and ecology in microbial communities without the necessity of obtaining pure cultures. In many metagenomics studies, the reads obtained from m...

Authors: Vijini G. Mallawaarachchi, Anuradha S. Wickramarachchi and Yu Lin

Citation: Algorithms for Molecular Biology 2021 16:3

Content type: Research

Published on: 4 May 2021

Fast lightweight accurate xenograft sorting

With an increasing number of patient-derived xenograft (PDX) models being created and subsequently sequenced to study tumor heterogeneity and to guide therapy decisions, there is a similarly increasing need fo...

Authors: Jens Zentgraf and Sven Rahmann

Citation: Algorithms for Molecular Biology 2021 16:2

Content type: Research

Published on: 2 April 2021

Quantifying steric hindrance and topological obstruction to protein structure superposition

In computational structural biology, structure comparison is fundamental for our understanding of proteins. Structure comparison is, e.g., algorithmically the starting point for computational studies of struct...

Authors: Peter Røgen

Citation: Algorithms for Molecular Biology 2021 16:1

Content type: Research

Published on: 27 February 2021

Fast and accurate structure probability estimation for simultaneous alignment and folding of RNAs with Markov chains

Simultaneous alignment and folding (SA&F) of RNAs is the indispensable gold standard for inferring the structure of non-coding RNAs and their general analysis. The original algorithm, proposed by Sankoff, solv...

Authors: Milad Miladi, Martin Raden, Sebastian Will and Rolf Backofen

Citation: Algorithms for Molecular Biology 2020 15:19

Content type: Research

Published on: 13 November 2020

gsufsort: constructing suffix arrays, LCP arrays and BWTs for string collections

The construction of a suffix array for a collection of strings is a fundamental task in Bioinformatics and in many other applications that process strings. Related data structures, such as the Longest Common Prefix...

Authors: Felipe A. Louza, Guilherme P. Telles, Simon Gog, Nicola Prezza and Giovanna Rosone

Citation: Algorithms for Molecular Biology 2020 15:18

Content type: Software Article

Published on: 22 September 2020

A linear-time algorithm that avoids inverses and computes Jackknife (leave-one-out) products like convolutions or other operators in commutative semigroups

Data about herpesvirus microRNA motifs on human circular RNAs suggested the following statistical question. Consider independent random counts, not necessarily identically distributed. Conditioned on the sum, ...

Authors: John L. Spouge, Joseph M. Ziegelbauer and Mileidy Gonzalez

Citation: Algorithms for Molecular Biology 2020 15:17

Content type: Research

Published on: 19 September 2020

Reconstruction of time-consistent species trees

The history of gene families—which are equivalent to event-labeled gene trees—can to some extent be reconstructed from empirically estimated evolutionary event-relations containing pairs of orthologous, paralo...

Authors: Manuel Lafond and Marc Hellmuth

Citation: Algorithms for Molecular Biology 2020 15:16

Content type: Research

Published on: 20 August 2020

On an enhancement of RNA probing data using information theory

Identifying the secondary structure of an RNA is crucial for understanding its diverse regulatory functions. This paper focuses on how to enhance target identification in a Boltzmann ensemble of structures via...

Authors: Thomas J. X. Li and Christian M. Reidys

Citation: Algorithms for Molecular Biology 2020 15:15

Content type: Research

Published on: 7 August 2020

Algorithms for the quantitative Lock/Key model of cytoplasmic incompatibility

Cytoplasmic incompatibility (CI) relates to the manipulation by the parasite Wolbachia of its host reproduction. Despite its widespread occurrence, the molecular basis of CI remains unclear and theoretical models...

Authors: Tiziana Calamoneri, Mattia Gastaldello, Arnaud Mary, Marie-France Sagot and Blerina Sinaimeri

Citation: Algorithms for Molecular Biology 2020 15:14

Content type: Research

Published on: 22 July 2020

Fast computation of genome-metagenome interaction effects

Association studies have been widely used to search for associations between common genetic variants observations and a given phenotype. However, it is now generally accepted that genes and environment must be...

Authors: Florent Guinot, Marie Szafranski, Julien Chiquet, Anouk Zancarini, Christine Le Signor, Christophe Mougel and Christophe Ambroise

Citation: Algorithms for Molecular Biology 2020 15:13

Content type: Research

Published on: 1 July 2020

Evolution through segmental duplications and losses: a Super-Reconciliation approach

The classical gene and species tree reconciliation, used to infer the history of gene gain and loss explaining the evolution of gene families, assumes an independent evolution for each family. While this assum...

Authors: Mattéo Delabre, Nadia El-Mabrouk, Katharina T. Huber, Manuel Lafond, Vincent Moulton, Emmanuel Noutahi and Miguel Sautie Castellanos

Citation: Algorithms for Molecular Biology 2020 15:12

Content type: Research

Published on: 26 May 2020

Precise parallel volumetric comparison of molecular surfaces and electrostatic isopotentials

Geometric comparisons of binding sites and their electrostatic properties can identify subtle variations that select different binding partners and subtle similarities that accommodate similar partners. Becaus...

Authors: Georgi D. Georgiev, Kevin F. Dodd and Brian Y. Chen

Citation: Algorithms for Molecular Biology 2020 15:11

Content type: Research

Published on: 25 May 2020

Context-aware seeds for read mapping

Most modern seed-and-extend NGS read mappers employ a seeding scheme that requires extracting t non-overlapping seeds in each read in order to find all valid mappings under an edit distance threshold of t. As t g...

Authors: Hongyi Xin, Mingfu Shao and Carl Kingsford

Citation: Algorithms for Molecular Biology 2020 15:10

Content type: Research

Published on: 23 May 2020

Detecting transcriptomic structural variants in heterogeneous contexts via the Multiple Compatible Arrangements Problem

Transcriptomic structural variants (TSVs), large-scale transcriptome sequence changes due to structural variation, are common in cancer. TSV detection from high-throughput sequencing data is a computationally c...

Authors: Yutong Qiu, Cong Ma, Han Xie and Carl Kingsford

Citation: Algorithms for Molecular Biology 2020 15:9

Content type: Research

Published on: 15 May 2020

The distance and median problems in the single-cut-or-join model with single-gene duplications

In the field of genome rearrangement algorithms, models accounting for gene duplication often lead to hard problems. For example, while computing the pairwise distance is tractable in most duplication-free mod...

Authors: Aniket C. Mane, Manuel Lafond, Pedro C. Feijao and Cedric Chauve

Citation: Algorithms for Molecular Biology 2020 15:8

Content type: Research

Published on: 4 May 2020

Non-parametric and semi-parametric support estimation using SEquential RESampling random walks on biomolecular sequences

Non-parametric and semi-parametric resampling procedures are widely used to perform support estimation in computational biology and bioinformatics. Among the most widely used methods in this class is the stand...

Authors: Wei Wang, Jack Smith, Hussein A. Hejase and Kevin J. Liu

Citation: Algorithms for Molecular Biology 2020 15:7

Content type: Research

Published on: 16 April 2020

Linear-time algorithms for phylogenetic tree completion under Robinson–Foulds distance

We consider two fundamental computational problems that arise when comparing phylogenetic trees, rooted or unrooted, with non-identical leaf sets. The first problem arises when comparing two trees where the le...

Authors: Mukul S. Bansal

Citation: Algorithms for Molecular Biology 2020 15:6

Content type: Research

Published on: 13 April 2020

From pairs of most similar sequences to phylogenetic best matches

Many of the commonly used methods for orthology detection start from mutually most similar pairs of genes (reciprocal best hits) as an approximation for evolutionarily most closely related pairs of genes (recipr...

Authors: Peter F. Stadler, Manuela Geiß, David Schaller, Alitzel López Sánchez, Marcos González Laffitte, Dulce I. Valdivia, Marc Hellmuth and Maribel Hernández Rosales

Citation: Algorithms for Molecular Biology 2020 15:5

Content type: Research

Published on: 9 April 2020

Alignment- and reference-free phylogenomics with colored de Bruijn graphs

The increasing amount of available genome sequence data enables large-scale comparative studies. A common task is the inference of phylogenies—a challenging task if close reference sequences are not available,...

Authors: Roland Wittler

Citation: Algorithms for Molecular Biology 2020 15:4

Content type: Research

Published on: 7 April 2020