Bio Databases 2018: How do they taste?

By: @finchtalk
Thu Jan 04, 2018
Database Growth. Total DBs are in the top line.
DBs per year are on the bottom.

It’s a new year and new edition of Nucleic Acids Research’s (NAR’s) Annual Database issue. It’s the 25th year for the NAR database series, the eighth for FinchTalk, and the first at the new home for Discovering Biology in a Digital World. 

As expected, Bio Databases continue to grow in number. NAR’s database now catalogs 1737 molecular biology databases, up 75 from last year. Of the 181 papers in this year's NAR issue, 82 articles represent new databases, 84 are about updates of existing databases, and 15 are about databases that have not been in NAR before. 47 databases were removed from the NAR archive due to unresponsive servers. 

As I scanned this year's list of new databases, FlavorDB piqued my interest. Its description, “flavor molecules,” looked simply delicious. From the website’s summary, “Flavor is an expression of olfactory and gustatory sensations experienced through a multitude of chemical processes triggered by molecules.” That’s science speak for the chemistry of smell and taste. As flavor molecules can regulate metabolic processes, and have health implications, such a resource can be valuable to the scientific community. The database itself (as of Jan 04, 2018) has 25,595 flavor molecules. For the scientifically-minded, or molecular gastronomists, the database also links flavor molecules that are shared between foods. These links can be used to discover novel food pairings in a Flavor Network.  

FlavorDB's Flavor Network. Select a food and see other foods that share flavor molecules, in this case Alaska wild-rhubarb.

The database has several search interfaces. The flavor search lets you find molecules by common names, chemical names, functional groups, hand-drawn structures, and more. In another tab you can search by food entities or natural ingredients to get lists of molecules that create a food's bouquet and taste sensations. The help options start with wine (we think this is always a good idea, provided you’re over 21). Another tab lets you search by natural sources. Here, you need to know your scientific names. Typing “hops” yields nothing, whereas typing “Humulus” (because everyone knows hops is Humulus lupulus) lets you see the chemical flavor constituents that make beer "hoppy." Of course, searching with hops as an entity works just fine. After all, you do not H. lupulus your beer.

The last tab, “Flavor Pairing,” lets you search by entities and ingredients and then shows you other entities and ingredients that share common molecules. Hops, for example, shares flavors with many kinds of cheese, apples, cocoa, and rum. Wasabi, as might be expected, shares flavors with horseradish, radishes, and turnips, and also pineapple. Who knew? Maybe sushi with pineapple is a good idea after all.
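For the programmatically minded, the idea behind flavor pairing can be sketched in a few lines of Python. This is only an illustration of "foods that share molecules," not FlavorDB's actual method or API. The molecule IDs below are hypothetical placeholders rather than real database records, and the function names are made up for this example.

```python
from itertools import combinations

# Hypothetical toy data: food -> set of placeholder molecule IDs.
# These are NOT real FlavorDB records; they only illustrate the idea.
FOOD_MOLECULES = {
    "wasabi": {"mol_1", "mol_2", "mol_3"},
    "horseradish": {"mol_1", "mol_2", "mol_4"},
    "pineapple": {"mol_3", "mol_5"},
    "rum": {"mol_6"},
}


def shared_molecules(food_molecules):
    """Return {(food_a, food_b): shared molecules} for every pair that overlaps."""
    pairs = {}
    for a, b in combinations(sorted(food_molecules), 2):
        common = food_molecules[a] & food_molecules[b]  # set intersection
        if common:
            pairs[(a, b)] = common
    return pairs


def pairings_for(food, food_molecules):
    """Rank other foods by how many molecules they share with `food`."""
    base = food_molecules[food]
    scores = [
        (other, len(base & mols))
        for other, mols in food_molecules.items()
        if other != food and base & mols
    ]
    # Most shared molecules first; ties broken alphabetically.
    return sorted(scores, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for other, count in pairings_for("wasabi", FOOD_MOLECULES):
        print(f"wasabi pairs with {other}: {count} shared molecule(s)")
```

Each overlapping pair is an edge in a small network, which is roughly what a flavor network drawing represents. A real analysis would likely weight edges by more than a raw count.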

FlavorDB also has data on receptors, classifying them by their ability to detect sweet, bitter, sour, umami, and odor sensations. Unfortunately, I did not see a connection from the flavor molecules to their receptors. That would be interesting and is perhaps in future plans. Even better would be to provide links to flavor molecules in 3D structures like our Receptors - Hot Stuff Collection.

With flavors on my brain, I continued to scan the list of new databases. PICKLES and MINTbase caught my eye. More food-related databases? Nope, PICKLES is about "Pooled In-vitro CRISPR Knockout Library Essentiality Screens." Because CRISPR technology provides a way to test gene knockouts at large scale, we need new ways to score the results from many experiments. Essentiality is a new concept that provides a way to define and score a gene's importance. In short, some genes cannot be knocked out; they are essential for life and thus have high essentiality. And, if you have an essential gene knocked out, you are in a pickle.

MINTbase contains information about mitochondrial and nuclear tRNA fragments. While tRNAs are involved in translating RNA into protein, we are learning that fragments of tRNA, or tRFs, can regulate gene expression at the translation level by interfering with proteins that bind mRNA. Turns out there are a lot of tRFs, so we need a database. 

Speaking of RNA, many of the new databases focus on various components of the RNA world (covered in Bio Databases 2014). These include CirGRDB - Regulation of RNAs in circadian rhythms; CSCD - Cancer-Specific circRNA Database; ExoRBase - Human blood exosome RNAs; ITSoneDB - Eukaryotic ribosomal RNA Internal Transcribed Spacer 1 sequences; Lnc2Meth - lncRNAs and DNA methylation; miRCarta - miRNAs and precursors; mirTrans - Cell-specific transcriptional information for human miRNAs; MSDD - miRNA SNP Disease Database; RISE - RNA-RNA interactions; and RNArchitecture - Structural classification of RNAs. RNA is very important.

MINTbase and PICKLES caught my eye because of their names. Tabloid Proteome also has an intriguing name. It tracks protein associations inferred from mass spectrometry. Inferred associations in a tabloid, hmm... Last, one of my favorite names this year was dbCAN-seq, because you can too.
