Visualize experimental data on your RNA 2D.

Today, many experimental approaches make it possible to probe the folding state of RNAs, from single molecules to entire populations. This creates a need to explore the resulting reactivity profiles with suitable visualizations.

One option is to map these profiles onto RNA 2D predictions, producing a kind of “folded heatmap”. RNArtist will soon provide a panel to visualize your experimental data on your 2Ds, with a range of options to fit your needs (color gradients, min/max values, cutoff, normalization, …). In the meantime, you can already produce such visualizations with the scripting facilities of RNArtistCore, the drawing engine on which RNArtist is built. Let’s see how to do it.

How to use RNArtistCore from the command line?

You will need to have Java installed on your computer. From the command line, type “java” to check whether it is already installed. If not, you will find a release for your computer on this page.
Download the jar file for the latest release of RNArtistCore from its GitHub page (on the right, you will see a section named “Releases”).
From your terminal, try this command (replace X.X.X with your release number):

java -jar rnartistcore-X.X.X-SNAPSHOT-jar-with-dependencies.jar

You should see the following output:

Usage: java -jar rnartistcore-X.X.X-jar-with-dependencies.jar [-c] path/to/your/plotting_script.kts

It works! Now you need to write a script that defines where your structural data is located and how to plot it.

How to plot your RNA 2D?

You will need:

Your 2D structure described in a file (it is also possible to describe your 2D directly in the script). I will use the Vienna format, but BPSeq or Ct are fine too. Here is the content of my file named rna.vienna:

>my RNA
GGGACCGCCCGGGAAACGGGCGAAAAACGAGGUGCGGGCACCUCGUGACGACGGGAGUUCGACCGUGACGCAUGCGGAAAUUGGAGGUGAGUUCGCGAAUACGCAAGCGAAUACGCCCUGCUUACCGAAGCAAGCG
.....((((((.....))))))....((((((((....))))))))....((((........))))..(((.(((..........(((((((..(((....)))..(((....)))...)))))))...))).)))

Your experimental data in a text file. Each line must contain two fields (the position in the sequence and the experimental value) separated by a space. Here are the first lines of my file named data.txt:

I randomly generated 136 values between 0.0 and 1.0, one per residue. To keep things simple, put everything in the same directory (the jar, Vienna and txt files).
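If you want a similar test file to play with, a minimal Python sketch could look like the one below. The file name data.txt and the 136 positions come from the description above; the function name, the seed and the three-decimal formatting are my own assumptions, not something RNArtistCore requires.

```python
import random


def write_random_data(path, n_positions, seed=None):
    """Write one 'position value' pair per line, positions starting at 1."""
    rng = random.Random(seed)  # a seed makes the file reproducible
    with open(path, "w") as handle:
        for position in range(1, n_positions + 1):
            # two fields separated by a single space, as described above
            handle.write(f"{position} {rng.random():.3f}\n")


if __name__ == "__main__":
    # 136 values, one per residue of the example RNA
    write_random_data("data.txt", 136, seed=42)
```

For real experiments, you would of course replace the random values with your own reactivities, keeping the same two-column layout.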

RNArtistCore provides a Domain Specific Language (DSL) to write your plotting instructions in a script. It is really simple to use. Let’s start with a basic example. Create a file named script.kts in the same folder as your other files, then copy and paste this content into it:

rnartist {
    ss {
        vienna {
            file="rna.vienna"
        }
    }
    
    theme {
        details {
            value = 3    
        }
        scheme {
            value = "Persian Carolina"
        }
    }
    
    svg {
        path = "."
        width = 500.0
        height = 500.0
    }
}

The vienna element defines the location of your structural file. The theme element provides a nice rendering (details level 3 shows the residues, and the scheme “Persian Carolina” colors them nicely). Run the command from earlier with script.kts as its argument. After a moment, you should see a new file named rna.svg in your folder, containing this drawing:

How to map your data on your RNA 2D?

You will now modify your script to:

link your data to the 2D structure
specify the color gradient used to color the residues

This gives the following script:

rnartist {
    ss {
        vienna {
            file="rna.vienna"
        }
    }

    data {
        file = "data.txt"
    }

    theme {
        details {
            value = 3    
        }
        color {
            type = "N"
            value = "lightyellow"
            to = "firebrick"
        }
    }

    svg {
        path = "."
        width = 500.0
        height = 500.0
    }
}

The data element has to be defined before the theme element. In the theme element, the scheme has been removed and replaced with a color element. When RNArtistCore sees a property named “to” in a color element, it knows that it has to build a color gradient (from “value” to “to”) and use it to color each residue according to its value in the dataset.
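To get a feeling for what such a gradient does, here is a small Python sketch that maps a value to a color by linear interpolation in RGB space. This is only an illustration: I am assuming a simple linear blend, and RNArtistCore may compute its gradient differently. The hex codes are the standard CSS values for lightyellow (#ffffe0) and firebrick (#b22222); the function names are hypothetical.

```python
def hex_to_rgb(color):
    """Convert '#rrggbb' into a tuple of three integers."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def gradient_color(value, start="#ffffe0", end="#b22222", vmin=0.0, vmax=1.0):
    """Blend linearly from start to end for a value in [vmin, vmax]."""
    t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp values outside the range
    channels = (
        round(a + t * (b - a))
        for a, b in zip(hex_to_rgb(start), hex_to_rgb(end))
    )
    return "#{:02x}{:02x}{:02x}".format(*channels)


if __name__ == "__main__":
    for value in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(value, gradient_color(value))
```

Low values stay close to the first color and high values move toward the second one, which is what makes the drawing read like a heatmap.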

If you rerun the previous command, your file named rna.svg should now contain this drawing:

Thanks to the RNArtistCore DSL, we can easily produce other kinds of results. For example, we could…

restrict the coloring to purines

color {
    type = "R"
    value = "lightyellow"
    to = "firebrick"
}

restrict the coloring to purines in helices

color {
    type = "R@helix"
    value = "lightyellow"
    to = "firebrick"
}

restrict the coloring to purines in helices, between positions 1 and 50 (covering the first two helices in the 2D)

color {
    type = "R@helix"
    value = "lightyellow"
    to = "firebrick"
    
    location {
        1 to 50
    }
}

change the colors and restrict to residues whose values are between 0.3 and 0.7

color {
    value = "white"
    to = "orange"
    data between 0.3..0.7
}
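Before running such a script, it can be handy to check which residues a given filter should pick up. The Python sketch below reads a data file in the two-column format described earlier and lists the positions matching a residue type and a value range. It is independent of RNArtistCore: the function names are hypothetical, “R” is taken in its usual IUPAC sense (A or G), and I am assuming that the bounds of the range are inclusive.

```python
def load_data(path):
    """Read 'position value' lines into a dict {position: value}."""
    values = {}
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) == 2:  # skip empty or malformed lines
                values[int(fields[0])] = float(fields[1])
    return values


def select_positions(sequence, values, residues="ACGU", low=0.0, high=1.0):
    """Return the 1-based positions whose residue and value match the filter."""
    selected = []
    for position, residue in enumerate(sequence, start=1):
        value = values.get(position)
        if value is None:
            continue  # no experimental value for this position
        if residue.upper() in residues and low <= value <= high:
            selected.append(position)
    return selected
```

For example, select_positions(sequence, load_data("data.txt"), residues="AG", low=0.3, high=0.7) would list the purines whose values fall between 0.3 and 0.7.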

Easy and efficient! If you want to give it a try, here is the link to download the latest version of the script. As a reminder, you can find many example scripts in the RNArtistCore documentation on GitHub.

Great, but I have thousands of files to process!

Inside the vienna element, you can define a folder instead of a single file.

vienna {
    path="/User/bwayne/vienna_files/"
}

With the same command, RNArtistCore will now process every Vienna file stored in this folder, with one limitation: if an element in the script targets a specific location and your RNAs differ in size, you will get unexpected results. There are two options:

Either you generate one script per RNA, which is very easy with a language like Python.
Or you describe your locations according to a reference numbering system, like the columns of a structural alignment. This reference allows RNArtistCore to infer the absolute positions for each RNA when needed. But this is for another post.
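As an illustration of the first option, here is a minimal Python sketch that writes one script per Vienna file. The DSL template simply reuses the elements shown in this post. Everything else is an assumption of mine: the function names, the convention that each RNA has a data file with the same base name and a .txt extension, and the locations dictionary giving the range to color for each file.

```python
from pathlib import Path

# Same elements as the scripts above; doubled braces are literal braces
# for str.format.
SCRIPT_TEMPLATE = """rnartist {{
    ss {{
        vienna {{
            file = "{vienna}"
        }}
    }}

    data {{
        file = "{data}"
    }}

    theme {{
        details {{
            value = 3
        }}
        color {{
            type = "R@helix"
            value = "lightyellow"
            to = "firebrick"

            location {{
                {start} to {end}
            }}
        }}
    }}

    svg {{
        path = "."
        width = 500.0
        height = 500.0
    }}
}}
"""


def build_script(vienna, data, start, end):
    """Fill the template for a single RNA."""
    return SCRIPT_TEMPLATE.format(vienna=vienna, data=data, start=start, end=end)


def write_scripts(folder, locations):
    """Write <name>.kts next to each <name>.vienna listed in locations.

    locations maps a Vienna file name to its (start, end) range.
    """
    written = []
    for vienna_path in sorted(Path(folder).glob("*.vienna")):
        if vienna_path.name not in locations:
            continue  # no range defined for this RNA
        start, end = locations[vienna_path.name]
        data_name = vienna_path.with_suffix(".txt").name  # assumed naming
        script_path = vienna_path.with_suffix(".kts")
        script_path.write_text(
            build_script(vienna_path.name, data_name, start, end)
        )
        written.append(script_path)
    return written
```

Each generated script can then be run with the same java command as before, one RNA at a time.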

Written on October 19, 2023