Category: Genetics (Non-PD)
Objective: To determine the full sequence and length of the pentanucleotide repeat in the RFC1 gene by Cas9-targeted Nanopore sequencing in patients with cerebellar ataxia (CA), neuropathy (N), and vestibular areflexia (VA) syndrome (CANVAS).
Background: Biallelic expansions of a pentanucleotide repeat, along with multiple alterations of the wild-type AAAAG motif, were recently shown to cause CANVAS. Long-read (third-generation) sequencing is well suited to resolving repeat structures, but analysis of the resulting data is challenging. Oxford Nanopore Technologies (ONT) sequencing is one such single-molecule approach; it can generate reads of up to several megabases and can thereby resolve short tandem repeats (STRs).
Method: Using blood-derived high molecular weight DNA from CANVAS patients, we performed Cas9 target enrichment of the intronic repeat region in RFC1. For this, we used four guide RNAs cutting ~2.7 kb upstream and ~3.1 kb downstream of the repeat. Sequencing was done on a MinION. We performed base calling with Guppy and read mapping with Minimap2, and identified reads matching unique 30 bp sequences upstream or downstream of the STR region. From those reads we extracted the repeat sequence.
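The flank-based extraction step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes exact string matching of the two 30 bp flanks, whereas real Nanopore reads contain errors and would likely need approximate matching. The function names and the flank sequences are hypothetical placeholders, not the actual RFC1 flanks.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def extract_repeat(read: str, up_flank: str, down_flank: str):
    """Return the sequence between the two flanks, or None if the read
    does not span both of them.

    Both orientations are tried, because a read may come from either strand.
    Exact matching is an assumption made for simplicity.
    """
    for candidate in (read, reverse_complement(read)):
        start = candidate.find(up_flank)
        if start == -1:
            continue
        repeat_start = start + len(up_flank)
        end = candidate.find(down_flank, repeat_start)
        if end != -1:
            return candidate[repeat_start:end]
    return None


# Hypothetical 30 bp flanks (placeholders, not the real RFC1 sequences).
UP = "ACGTTGCATGCCATGGATCCGATTACAGCT"
DOWN = "TTGACCGGTAACGTCAGGCTAATCGGATCA"

# A toy read that spans both flanks and carries four AAGGG units.
read = "GGG" + UP + "AAGGG" * 4 + DOWN + "CCC"
print(extract_repeat(read, UP, DOWN))
```

A read that contains only one flank returns None here, which mirrors the distinction between reads matching a flanking sequence and reads fully covering the repeat.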
Results: On-target reads accounted for 0.1% of the 142 Mb Nanopore run, with an average read length of 3 kb. Fifteen reads matched a flanking sequence, of which four fully covered the repeat and the flanking regions on both ends. Expanded repeat sequences differed between reads and were not compatible with a biallelic state but instead suggested several alleles with diverse repeat motifs. We investigated base calling qualities and found that they were low for the repeat region, particularly for repeat sequences containing bases other than G and A (Phred score QV often <4, i.e. base calling accuracy below about 60%). When low-quality repeat sequences were disregarded, only the repeat motif AAGGG was observed throughout the repeat region. The repeat length was 4.4 kb, corresponding to ~900 pentanucleotide repeats.
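The base-level quality check can be illustrated with the standard Phred relation, accuracy = 1 - 10^(-Q/10), combined with a motif count that ignores low-quality stretches. This is a sketch under stated assumptions rather than the analysis actually used: the threshold `min_q`, the fixed reading frame of non-overlapping five-base windows, and the function names are all hypothetical, and insertions or deletions in real reads would shift the frame.

```python
from collections import Counter


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred score to the probability that the base call is correct."""
    return 1.0 - 10 ** (-q / 10)


def count_confident_motifs(repeat, quals, min_q=10, unit=5):
    """Count non-overlapping motifs of length `unit` in which every base
    has a quality of at least `min_q` (a hypothetical threshold)."""
    counts = Counter()
    for i in range(0, len(repeat) - unit + 1, unit):
        if min(quals[i:i + unit]) >= min_q:
            counts[repeat[i:i + unit]] += 1
    return counts


# Toy data: three well-supported AAGGG units, then one low-quality unit.
repeat = "AAGGG" * 3 + "ACTGG"
quals = [20] * 15 + [3] * 5

print(count_confident_motifs(repeat, quals))
print(round(phred_to_accuracy(3), 3), round(phred_to_accuracy(4), 3))
print(4400 // 5)  # number of 5 bp units in a 4.4 kb repeat
```

Under this relation, a Phred score of 3 corresponds to roughly 50% accuracy and a score of 4 to roughly 60%. A 4.4 kb repeat holds 880 five-base units, consistent with the rounded figure of ~900.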
Conclusion: Cas9-targeted Nanopore sequencing of large insertions is challenging and requires thorough quality control at the base level rather than the read level. However, when these pitfalls are addressed, it is a powerful method for determining repeat length and sequence, enabling high-throughput analysis of the CANVAS-causing repeat.
To cite this abstract in AMA style:
I. Wohlers, H. Pott, S. Schaake, J. Trinh, H. Busch, K. Lohmann. Be aware of pitfalls: Bioinformatic analysis of Cas9-targeted Nanopore sequencing of the RFC1 repeat in CANVAS [abstract]. Mov Disord. 2021; 36 (suppl 1). https://www.mdsabstracts.org/abstract/be-aware-of-pitfalls-bioinformatic-analysis-of-cas9-targeted-nanopore-sequencing-of-the-rfc1-repeat-in-canvas/. Accessed December 11, 2024.
MDS Abstracts - https://www.mdsabstracts.org/abstract/be-aware-of-pitfalls-bioinformatic-analysis-of-cas9-targeted-nanopore-sequencing-of-the-rfc1-repeat-in-canvas/