The best tRNA database

Functionalities

The T-Psi-C database lets you search across all available tRNA sequences. For most tRNA molecules, information on modifications and secondary structure is available, along with the cellular localization and the source organism. Every entry whose structure has been deposited in the PDB database is cross-referenced to it. For each tRNA entry, one model is selected and displayed with the Jmol applet. Where no deposited structure is available, the sequences were aligned automatically with our own alignment algorithm and a model was built with ModeRNA.

The secondary structure of each entry is drawn with Traveler and displayed by our own JavaScript code. Each modification is cross-referenced to the Modomics database.

There are two main ways to search. The first is filtering by various parameters, including the modified and unmodified sequence. The second is a BLAST search of the data.

Any verified user may upload data to the database. To do so, you need to create an account, which enables you to submit your own data. Before being displayed, submitted data are checked and accepted by our team to avoid spam and misleading information.

tRNA data can be downloaded in FASTA format (for both modified and unmodified sequences), in dot-bracket format (for secondary structure information) and as CSV (for most of the available information).

If you want to retrieve or upload data automatically, a dedicated API is available for that purpose.

Technology

The database is built on the Django framework (Python) with MySQL. Searching uses Elasticsearch and BLAST+, while the input of new data and the API use Django REST framework. Secondary structures are displayed using XML generated by Traveler (on the basis of single, manually created positions for a template tRNA), and tertiary structures using the Jmol applet. Folding energy was calculated from the unmodified sequences with the ViennaRNA package (ViennaRNA-2.4.14).

Modelling

Missing tertiary structures were modelled with ModeRNA because it can take modifications into account. ModeRNA requires a structural alignment, which was generated with MUSCLE (muscle3.8.31). Because MUSCLE is a multiple sequence alignment program for proteins, we represented each dot-bracket symbol by an amino acid that has a specific score in the substitution matrix used for the alignment:

. -> T
( -> R
) -> Y
[ -> G
] -> D
{ -> T
} -> T
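
The encoding above can be sketched in a few lines of Python. This is only an illustration of the table, not the code used in the database; the names `DOT_BRACKET_TO_AMINO_ACID` and `encode_structure` are made up for this example.

```python
# Mapping copied from the table above. Note that '.', '{' and '}' all map
# to 'T', so the encoding cannot be reversed unambiguously.
DOT_BRACKET_TO_AMINO_ACID = {
    ".": "T",
    "(": "R",
    ")": "Y",
    "[": "G",
    "]": "D",
    "{": "T",
    "}": "T",
}


def encode_structure(dot_bracket: str) -> str:
    """Translate a dot-bracket string into a pseudo-protein sequence."""
    try:
        return "".join(DOT_BRACKET_TO_AMINO_ACID[symbol] for symbol in dot_bracket)
    except KeyError as exc:
        # Reject symbols that are not listed in the table.
        raise ValueError(f"unsupported dot-bracket symbol: {exc.args[0]!r}") from None


print(encode_structure("((..[[..))..]]"))  # prints RRTTGGTTYYTTDD
```

The resulting string can be passed to a protein aligner as if it were an ordinary amino acid sequence.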

We use multiple sequence alignment because it gives better results than comparing two sequences. A new alignment is generated for each new batch of sequences or single new sequence. The model and template are then selected from the alignment, and the alignment is simplified by removing unnecessary gaps. This method is still under development.
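
As a rough illustration of the gap-removal step, the sketch below takes two rows extracted from a larger alignment and drops every column in which both rows contain a gap. Treating exactly those columns as "unnecessary" is our assumption for this example, and the function name `drop_shared_gap_columns` is hypothetical.

```python
from typing import Tuple


def drop_shared_gap_columns(template: str, target: str, gap: str = "-") -> Tuple[str, str]:
    """Remove alignment columns where both rows hold a gap character."""
    if len(template) != len(target):
        raise ValueError("aligned sequences must have equal length")
    # Keep a column if at least one of the two rows has a residue in it.
    kept = [(a, b) for a, b in zip(template, target) if not (a == gap and b == gap)]
    if not kept:
        return "", ""
    new_template, new_target = zip(*kept)
    return "".join(new_template), "".join(new_target)


print(drop_shared_gap_columns("GC--AU-G", "G-C-AU-G"))  # prints ('GC-AUG', 'G-CAUG')
```
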
Moreover, ModeRNA cannot model some modifications or an unknown nucleotide. We therefore replaced them for modelling purposes:

H -> A
< -> C
; -> G
N -> U
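
A minimal sketch of this substitution, assuming the sequence is held as a plain Python string of one-character symbols. The names `MODELLING_SUBSTITUTIONS` and `prepare_for_modelling` are illustrative only.

```python
# Substitutions copied from the list above.
MODELLING_SUBSTITUTIONS = {"H": "A", "<": "C", ";": "G", "N": "U"}
_TRANSLATION_TABLE = str.maketrans(MODELLING_SUBSTITUTIONS)


def prepare_for_modelling(sequence: str) -> str:
    """Replace the listed symbols; every other character is left unchanged."""
    return sequence.translate(_TRANSLATION_TABLE)


print(prepare_for_modelling("GCH<;NAU"))  # prints GCACGUAU
```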

Modelled molecules were then energy-minimized with MMTK. Because automatic minimization of full-size structures caused various problems, structures were minimized in a sliding window (size 15, step 10, 200 cycles each). If an error occurred in a window, that part was skipped while the rest of the model was still minimized. Finally, the modelled structures were assessed visually, and those that contained significant missing regions or did not resemble a tRNA at all were removed from the database.
We are still working on improving the modelling methods; our goal is for 100% of tRNAs to be modelled automatically with good quality.
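
The sliding-window scheme can be sketched as follows. This is not the MMTK code used for the database: `minimize_window` is a hypothetical callback standing in for the actual minimization call, the window is assumed to be counted in residues, and truncating the last window at the end of the chain is also an assumption.

```python
from typing import Callable, List, Tuple

Window = Tuple[int, int]  # half-open range [start, end) of residue indices


def sliding_windows(length: int, size: int = 15, step: int = 10) -> List[Window]:
    """Return overlapping windows that together cover `length` residues."""
    if length <= 0:
        return []
    windows = []
    start = 0
    while True:
        end = min(start + size, length)  # truncate the final window
        windows.append((start, end))
        if end == length:
            break
        start += step
    return windows


def minimize_in_windows(
    length: int,
    minimize_window: Callable[[int, int, int], None],
    size: int = 15,
    step: int = 10,
    cycles: int = 200,
) -> List[Window]:
    """Minimize window by window; return the windows skipped after an error."""
    skipped = []
    for start, end in sliding_windows(length, size, step):
        try:
            minimize_window(start, end, cycles)
        except Exception:
            # A failing window is left as it is; the remaining windows still run.
            skipped.append((start, end))
    return skipped


print(sliding_windows(76))
# prints [(0, 15), (10, 25), (20, 35), (30, 45), (40, 55), (50, 65), (60, 75), (70, 76)]
```

With a step smaller than the window size, neighbouring windows overlap by five residues, so every junction between windows is also minimized at least once.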

Versioning

Each change to the secondary structure or sequence (modified or unmodified) creates a new version of the entry. Old versions remain available but are excluded from database searches. Each entry also records its creation date and last modification date. For tertiary structures, a separate last modification date is available.
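
The versioning rule can be illustrated with a small in-memory sketch. It does not reflect the actual Django models; the class names, field names and the `searchable` method are all hypothetical, and the tertiary-structure date is left out.

```python
from dataclasses import dataclass, replace
from datetime import date

# Fields whose change triggers a new version, as described above.
VERSIONED_FIELDS = ("sequence", "modified_sequence", "secondary_structure")


@dataclass(frozen=True)
class EntryVersion:
    number: int
    sequence: str
    modified_sequence: str
    secondary_structure: str
    created: date


class Entry:
    def __init__(self, sequence, modified_sequence, secondary_structure, today):
        first = EntryVersion(1, sequence, modified_sequence, secondary_structure, today)
        self.versions = [first]

    @property
    def current(self):
        return self.versions[-1]

    @property
    def created(self):
        return self.versions[0].created

    @property
    def last_modified(self):
        return self.current.created

    def update(self, today, **changes):
        """Append a new version only if a versioned field really changed."""
        unknown = set(changes) - set(VERSIONED_FIELDS)
        if unknown:
            raise ValueError(f"not a versioned field: {sorted(unknown)}")
        candidate = replace(self.current, **changes)
        if all(getattr(candidate, f) == getattr(self.current, f) for f in VERSIONED_FIELDS):
            return self.current
        new_version = replace(candidate, number=self.current.number + 1, created=today)
        self.versions.append(new_version)
        return new_version

    def searchable(self):
        """Only the newest version takes part in searches."""
        return [self.current]
```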

People

Marcin Sajek, IHG AIM
Tomasz Woźniak, Linkedin profile, IHG AIM, Wozniak Technologies
prof. Jadwiga Jaruzelska
prof. Jan Barciszewski

Last updated in 2019.