Biological sequences are the fundamental data type through which scientists interpret biology. Despite the exponential increase in the amount of sequence data, we are limited in our ability to predict functions for the vast majority of sequences.

For instance, fewer than 1% of sequenced genes have laboratory-validated functions, and fewer than half can be associated with even a hypothesized function. We need a more intelligent and efficient way to propagate functional information across biological sequences.

Tatta Bio is building new data infrastructure and a search engine to map sequences to function. We are starting with protein function, focusing on highly diverse sequences.

Gaia is in active development. Sign up below if you would like to receive updates on new features!