I’ve written a few times about an odd sort of unnatural DNA sequence, where some of the nucleotides are connected via “click” triazole units rather than the traditional phosphodiester backbone. I remember wondering what the chemical biology community would make of these things, and I wanted to report on at least one ingenious answer to that question.
A general situation in the field is when you have a big pile of RNA or DNA pieces, and you want to analyze them – sort them out, determine how many different ones are in there and which ones are present more than the others. This is the underlying question that “RNA-seq” assays are trying to answer: when you perturb some cells by changing something about their environment, which genes’ transcription is affected? Which protein levels go up or down? One reasonable proxy for those is the levels of various mRNAs, but there’s a lot of RNA in a cell, and sorting through it looking for changes has been nontrivial.
There are some steps that a lot of these assays use to clean up the sample and get it into a form that can be more easily analyzed. The clean-up often involves some method to get all the ribosomal RNA out of there or to direct your readout away from it somehow, because that’s the majority of the RNA in cell lysate, and you don’t want your assay technology chasing it. Then there’s usually a fragmentation step, to break up the mRNAs into useful-sized pieces (which will be reassembled later on, computationally). Another common step is the use of reverse transcriptase enzymes to make complementary DNA out of all those mRNA pieces, because our techniques to amplify and sequence DNA are so flippin’ powerful. There’s often a step that tries to stick some sort of unique tag/label/identifier onto each DNA piece as well, using ligase enzymes, to aid in keeping track of all this stuff. These enzymes are the key to the whole business, along with modern sequencer technology and its associated computing prowess to reassemble everything from the pieces.
Here’s how the “Click-seq” variant works, though. You take your pile of mRNAs and expose them to reverse transcriptase, as before. But this time, you feed in a set amount of azide-containing nucleotides along with the usual mix of normal 2′-deoxynucleotides to make your complementary DNAs. The thing is, these are chain-terminating residues for the RT enzyme, so what you end up with is a random heap of cDNA pieces, all of which end (at the 3′ end) with an azidonucleotide. And these turn into handles for introducing those labels/barcodes/identifiers. This way you avoid having to do a separate fragmentation step, because the reverse transcriptase has already produced fragments for you as it randomly choked on the azidonucleotides. And you avoid having to use the ligase enzymes to label everything, which can be expensive – you attach the labels via azide/alkyne reactions instead. The clicked oligos are still viable for PCR and sequencing, as it turns out, so that part of the process works fine, and the whole thing avoids some recombination artifacts that the other variants can be vulnerable to (ligases ligating the wrong things, etc.).
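To get a feel for why random chain termination can stand in for a fragmentation step, here is a toy simulation in Python. It's a sketch under some loud assumptions that are mine, not the Click-seq authors': the enzyme picks an azidonucleotide at each position with a fixed probability equal to its share of the nucleotide pool (`azido_fraction`, a made-up parameter), priming starts at a random spot on the template, and nothing else stops the enzyme except running off the end. Under those assumptions the terminated cDNAs have geometrically distributed lengths, averaging about 1/`azido_fraction` bases.

```python
import random


def simulate_cdna_fragments(template_len, azido_fraction, n_molecules, seed=0):
    """Toy model of chain-terminated reverse transcription.

    Returns a list of (length, terminated) tuples, one per cDNA molecule.
    `terminated` is True when the cDNA ended on an azidonucleotide and so
    carries the 3' azide handle needed for the click reaction.

    Assumptions (illustrative only): uniform random priming position, and a
    constant per-base chance `azido_fraction` of adding a terminator.
    """
    rng = random.Random(seed)
    fragments = []
    for _ in range(n_molecules):
        start = rng.randrange(template_len)   # random priming site
        max_len = template_len - start        # bases left before the template ends
        length = 0
        terminated = False
        while length < max_len:
            length += 1                       # one more base incorporated
            if rng.random() < azido_fraction:
                terminated = True             # azidonucleotide added, synthesis stops
                break
        fragments.append((length, terminated))
    return fragments


def mean_clickable_length(fragments):
    """Average length of the fragments that ended in an azidonucleotide."""
    clickable = [length for length, terminated in fragments if terminated]
    return sum(clickable) / len(clickable) if clickable else 0.0


if __name__ == "__main__":
    frags = simulate_cdna_fragments(
        template_len=100_000, azido_fraction=0.05, n_molecules=20_000
    )
    print("mean clickable fragment length:", round(mean_clickable_length(frags), 1))
```

The practical upshot in this toy model is that the ratio of azidonucleotides to normal ones is the knob for fragment size: more terminator gives shorter pieces, less gives longer ones. Real enzymes surely have their own preferences about which nucleotides they'll accept, so treat the numbers as qualitative.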
Because of that, it’s also good for picking up real recombination events that might be biologically relevant, but rare on the absolute scale. This happens with some viruses, and the technique has been used to look at so-called “defective interfering” RNAs which emerge over time in viral infections. These are somewhat mysterious – in some cases, they seem to hijack and (as the name says) perhaps interfere with the usual infectious cycle, but they may also prolong it and make it worse. It looks like some viruses have evolved to let these things accumulate, and there must be a reason for that, too. The only way to unravel all that is with a good enough RNA analysis technique, and the clicked nucleotides just may be providing one.