In recent years, the scientific community has devoted substantial resources to developing experimental and analytical strategies for the detection of RNA modifications. These efforts have generated a number of algorithms and software packages, which have been extensively reviewed elsewhere [13]. Current approaches for modification detection from Nanopore data can be divided into two categories: those based on the detection of modification-induced basecalling errors and those based on the analysis of the electrical signal. The first strategy, implemented in tools such as Epinano [14], DiffErr [15], Eligos [16], and Drummer [17], has shown interesting results despite not considering the effects of RNA modifications on the raw electrical signal; however, modern basecalling models tend to become less sensitive to common PTMs, so methods in this group risk quickly becoming ineffective at detecting modifications. On the other hand, methods based on raw signal space analysis (such as Tombo [18], Mines [19], xPore [20], nanom6A [21], nanoRMS [22], nanoDoc [23], Yanocomp [24], and Penguin [25]) can lead to richer comparative analyses, but are more complicated and come with steeper computational costs. The methods described above can be further classified into two groups: de novo detection methods, which use a trained model to identify modifications, and comparative methods, in which differences between two samples are evaluated to infer the presence of a modification. At present, de novo strategies are often hindered by the difficulty of generating a training set containing all k-mer contexts with and without modifications. For this reason, the majority of existing methods instead take a comparative approach, in which the sample of interest is compared to a reference sample devoid of modifications. Here we introduce Nanocompore, a flexible and versatile analysis method dedicated to the detection of RNA modifications from DRS datasets in signal space.
To identify potential modification sites, Nanocompore uses a model-free comparative approach based on a two-component Gaussian mixture model, in which an experimental RNA sample is compared against a sample with fewer or no modifications. In principle, this can be applied to any modification, provided that an appropriate control depleted of the modification is available and that the modification significantly alters the current signal. We demonstrate this for seven different RNA modifications in synthetic oligonucleotides, as well as extensively for m6A in coding and noncoding native RNAs in yeast and mammalian cells. Nanocompore includes several unique features: (1) robust signal realignment based on Nanopolish, (2) modelling of biological variability, (3) the ability to run multiple statistical tests, (4) prediction of RNA modifications using both signal intensity and duration (dwell time), and (5) an automated pipeline that runs all the preprocessing steps. Finally, the results generated by Nanocompore can also be leveraged to infer RNA modifications at single-molecule resolution.
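The comparative idea described above can be illustrated with a minimal sketch. It is not Nanocompore's actual implementation: the function names (`fit_gmm2`, `cluster_by_condition_pvalue`, `simulate`), the diagonal-covariance EM fit, the simulated intensity and dwell-time values, and the chi-square test (used here purely as a stand-in for whichever statistical test is chosen) are all assumptions made for illustration. For one position, reads from both conditions are pooled, a two-component Gaussian mixture is fitted on (intensity, log10 dwell time), and the cluster assignment is tested for association with the condition.

```python
import math
import random


def fit_gmm2(points, n_iter=50):
    """Fit a 2-component, diagonal-covariance Gaussian mixture by EM.

    points: list of (intensity, log10_dwell_time) tuples for one position.
    Returns one hard cluster label (0 or 1) per point.
    """
    dims = range(len(points[0]))
    # Deterministic start: split the points at the median of the first dimension.
    ordered = sorted(points)
    half = len(ordered) // 2
    groups = [ordered[:half], ordered[half:]]
    means = [[sum(p[d] for p in g) / len(g) for d in dims] for g in groups]
    variances = [[max(1e-6, sum((p[d] - m[d]) ** 2 for p in g) / len(g)) for d in dims]
                 for g, m in zip(groups, means)]
    weights = [0.5, 0.5]
    resp = [0.5] * len(points)
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        for i, p in enumerate(points):
            log_dens = []
            for k in (0, 1):
                lp = math.log(weights[k])
                for d in dims:
                    lp -= 0.5 * math.log(2 * math.pi * variances[k][d])
                    lp -= (p[d] - means[k][d]) ** 2 / (2 * variances[k][d])
                log_dens.append(lp)
            top = max(log_dens)
            e0, e1 = (math.exp(v - top) for v in log_dens)
            resp[i] = e1 / (e0 + e1)
        # M-step: update weights, means and variances from the responsibilities.
        for k in (0, 1):
            r = [x if k == 1 else 1.0 - x for x in resp]
            nk = max(sum(r), 1e-9)
            weights[k] = nk / len(points)
            for d in dims:
                means[k][d] = sum(ri * p[d] for ri, p in zip(r, points)) / nk
                variances[k][d] = max(1e-6, sum(
                    ri * (p[d] - means[k][d]) ** 2 for ri, p in zip(r, points)) / nk)
    return [1 if x > 0.5 else 0 for x in resp]


def cluster_by_condition_pvalue(control, test):
    """Chi-square p-value (1 d.o.f.) for cluster membership versus condition."""
    labels = fit_gmm2(control + test)
    table = [[0, 0], [0, 0]]  # rows: condition, columns: mixture component
    for i, label in enumerate(labels):
        table[0 if i < len(control) else 1][label] += 1
    n = len(labels)
    row = [sum(r) for r in table]
    col = [table[0][c] + table[1][c] for c in (0, 1)]
    if 0 in row or 0 in col:
        return 1.0  # all reads fell in one cluster: no evidence of a difference
    chi2 = sum((table[r][c] - row[r] * col[c] / n) ** 2 / (row[r] * col[c] / n)
               for r in (0, 1) for c in (0, 1))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def simulate(n, modified_fraction, rng):
    """Hypothetical reads: modified bases shift intensity and dwell time."""
    reads = []
    for _ in range(n):
        if rng.random() < modified_fraction:
            reads.append((rng.gauss(108.0, 2.0), rng.gauss(-1.7, 0.2)))
        else:
            reads.append((rng.gauss(100.0, 2.0), rng.gauss(-2.0, 0.2)))
    return reads


rng = random.Random(0)
control = simulate(200, 0.0, rng)   # modification-depleted sample
sample = simulate(200, 0.6, rng)    # partially modified sample
print(cluster_by_condition_pvalue(control, sample))
```

In this toy setting, a small p-value indicates that reads from the two conditions populate the two mixture components unevenly, which is the signature a comparative method looks for. Because each read receives its own cluster label, the same fit also hints at how per-read assignments could be derived.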
In this paper we introduce Nanocompore, a robust and versatile method for the identification of multiple types of RNA modification from Nanopore DRS data. Nanocompore performs a signal-level comparison between two conditions, allowing identification of significant changes indicative of the presence or absence of RNA modifications (Fig. 1). Our approach has several advantages over alternative RNA PTM mapping methods. First, it is based on Nanopore DRS, a technique that is seeing rapid adoption and that, unlike previous genome-wide strategies, is not affected by reverse transcription or PCR amplification biases. Second, it maps RNA modifications in the context of long reads, giving critical information on RNA PTMs on individual gene isoforms. Third, our comparative strategy does not require any training and can be applied as-is to different RNA modifications, as long as a modification-depleted reference sample is available. Fourth, the approach implemented in Nanocompore paves the way for future work on RNA modifications at single-molecule resolution. Finally, we implemented analysis pipelines in the Nextflow and Snakemake domain-specific languages, allowing automatic execution of all processing steps, from raw data up to the execution of Nanocompore and other RNA modification tools, thus greatly simplifying the bioinformatics work.
To compare Nanocompore against most of the other tools available for RNA modification detection in a reproducible way, we wrote a Snakemake pipeline called MetaCompore ( -slide/MetaCompore). For this study, we used MetaCompore v0.1.2, which includes the latest versions of the following tools: Epinano 1.2.0, Eligos 2.0.0, Tombo 1.5.1, differr_nanopore_drs (latest version), Mines (latest version) and Nanocompore 1.0.3. MetaCompore preprocesses the data for all the tools, including basecalling with ONT-Guppy 4.2.2 (except for Epinano, which required the older 3.1.5 version), read alignment to the reference transcriptome with Minimap2 2.17, alignment filtering with pyBiotools 0.2.7 and signal realignment with f5c 0.6. For portability and reproducibility, every module of MetaCompore is provided within its own Singularity container and all the options used for a run are tracked in a YAML configuration file. Nanocompore and Epinano are the only tools that support experimental replicates. For all the other tools, we merged the data obtained from replicates. Since every tool outputs different statistics in a different format, MetaCompore filters the data following the respective authors' recommendations and, when possible, converts the results into a common format listing the significant sites with their associated p-values and effect sizes. For Nanocompore and Tombo, which both work in signal space, we added a peak-calling denoising step to narrow down the results.
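The peak-calling step is not specified in detail here, so the following is only a minimal sketch of one plausible approach, not MetaCompore's actual code. The function name `call_peaks`, the threshold, the window size and the example p-values are all hypothetical. The idea is to keep a significant position only if it has the smallest p-value within a small window, so that a run of adjacent significant positions is reduced to a single site.

```python
def call_peaks(pvalues, threshold=0.01, window=2):
    """Keep significant positions that are local minima of the p-value.

    pvalues: dict mapping transcript position -> p-value.
    A position is retained if its p-value passes `threshold` and is the
    smallest among the positions within +/- `window` (ties go to the
    leftmost position).
    """
    peaks = []
    for pos in sorted(pvalues):
        p = pvalues[pos]
        if p > threshold:
            continue
        neighbours = [(pvalues[q], q)
                      for q in range(pos - window, pos + window + 1)
                      if q in pvalues]
        if min(neighbours) == (p, pos):
            peaks.append(pos)
    return peaks


# Hypothetical per-position p-values for one transcript.
example = {10: 0.5, 11: 1e-3, 12: 1e-8, 13: 1e-4, 14: 0.2, 20: 1e-5, 21: 0.3}
print(call_peaks(example))  # [12, 20]
```

Here positions 11 to 13 all pass the threshold, but only position 12, which carries the strongest signal, is reported, while the isolated site at position 20 is kept unchanged.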