The Sequence Ontology (SO) is an ontology that describes biological sequences using a standardized vocabulary and defined relationships between terms. In addition to the power of controlled terms, SO groups terms at multiple levels, which makes it easier to perform logic- and algorithm-based operations across large groups of features, types, and effects. Furthermore, adopting SO as a core requirement allows us to avoid the ambiguity of natural language and the imperfection of free-text descriptions.
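As a rough illustration of how multi-level grouping supports algorithmic operations, the sketch below selects every annotation that falls under a broader term by walking is_a links. The hierarchy is a simplified, hand-written fragment that uses SO-style term names; the real SO graph contains intermediate terms and other relationship types, and the helper names ancestors and is_kind_of are hypothetical.

```python
# Toy is_a edges (child -> parent). This is a simplified, hand-written
# fragment for illustration only, NOT the actual SO hierarchy.
IS_A = {
    "missense_variant": "coding_sequence_variant",
    "stop_gained": "coding_sequence_variant",
    "coding_sequence_variant": "sequence_variant",
    "intron_variant": "sequence_variant",
}


def ancestors(term):
    """Return the term followed by every ancestor reachable through is_a."""
    chain = [term]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def is_kind_of(term, group):
    """True if `term` is `group` itself or one of its descendants."""
    return group in ancestors(term)


# Group a mixed list of annotations under one broader term.
annotations = ["missense_variant", "intron_variant", "stop_gained"]
coding = [t for t in annotations if is_kind_of(t, "coding_sequence_variant")]
print(coding)
```

Because the grouping comes from the ontology rather than from string matching on free text, the same query keeps working when more specific child terms are added to the hierarchy.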
The Data Model's use of SO.
For the ClinGen Data Model (DM), we have attempted to use SO terms wherever possible to describe many aspects of our model, including sequences, features, alterations, amino acids, and amino acid changes.
Our ValueSets relating to sequence annotation allow knowledge to be captured on two levels: Primary and Ancillary types. Primary types provide the highest level of description, while Ancillary types add necessary biological meaning and, if desired, record variant effects.
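One way to picture the two levels is a record with a required primary type and an optional list of ancillary types. This is a minimal sketch under stated assumptions: the class name SequenceAnnotation and its fields are hypothetical and are not the DM's actual schema, and the pairing of "SNV" with "missense_variant" is an illustrative choice of SO-style terms, not taken from the ClinGen ValueSets.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SequenceAnnotation:
    """Hypothetical record; field names are illustrative, not the DM schema."""

    primary_type: str  # required, highest-level description
    ancillary_types: List[str] = field(default_factory=list)  # optional detail

    def all_types(self):
        """Return the primary type followed by any ancillary types."""
        return [self.primary_type, *self.ancillary_types]


# A record with only the primary level filled in.
bare = SequenceAnnotation(primary_type="SNV")

# The same primary type, with an ancillary term recording a variant effect.
detailed = SequenceAnnotation(
    primary_type="SNV",
    ancillary_types=["missense_variant"],
)

print(bare.all_types())
print(detailed.all_types())
```

Keeping the ancillary list optional mirrors the "if desired" wording above: a record is complete with a primary type alone, and richer annotation can be layered on without changing its shape.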
Ontologies used for term definitions.
Additionally, we use SO and the Relations Ontology (RO) terms, when available, to standardize other aspects of the model, including reference types and inter-model relationships.