Scope and Usage
The Sequence Ontology defines a gene as "a region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions."
The current definition of gene is similar, but explicitly allows for non-functional genes. Furthermore, while the concept of a gene may include elements such as regulatory regions, version 0.1 of the allele model only relates genes to reference sequences for transcripts and genes (such as LRG), and will therefore not contain data on regulatory elements.
Included under this definition of gene are:
- Protein-coding genes
- Pseudogenes
- Non-coding RNA sequences
- Computationally predicted genes
There is no guarantee that a gene is ever transcribed; some computationally predicted genes and pseudogenes, for example, may never be. Furthermore, many genes are never translated.
A gene is identified by an external authority, typically HGNC, although the data model is configured to recognize identifiers from multiple authorities.
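As a minimal sketch of what recognizing identifiers from multiple authorities could look like, the Python below models an identifier as an (authority, value) pair and selects a preferred one. The `GeneIdentifier` type, the `preferred_identifier` helper, the priority order, and the "NCBI Gene" authority label are assumptions made for illustration; they are not defined by the allele model.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class GeneIdentifier:
    """Hypothetical (authority, value) pair; not defined by the allele model."""
    authority: str  # name of the issuing authority, e.g. "HGNC"
    value: str      # the identifier as issued by that authority


def preferred_identifier(
    identifiers: Sequence[GeneIdentifier],
    priority: Sequence[str] = ("HGNC",),
) -> Optional[GeneIdentifier]:
    """Return the first identifier from the highest-priority authority.

    Falls back to the first identifier of any authority, or None when
    the sequence is empty.
    """
    for authority in priority:
        for ident in identifiers:
            if ident.authority == authority:
                return ident
    return identifiers[0] if identifiers else None


# Illustrative values only.
ids = [
    GeneIdentifier("NCBI Gene", "672"),
    GeneIdentifier("HGNC", "HGNC:1100"),
]
chosen = preferred_identifier(ids)
print(chosen.authority, chosen.value)
```

Keeping the authority alongside the value means identifiers from different sources never need to be compared by string format alone, and the priority order can be changed without touching the stored data.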
Context
The Gene resource is described by the following attributes:
- identifier: the set of identifiers from external authorities that identify the Gene.
- symbol: the string symbol identifying the gene.
- name: the human readable name for a gene.
- aliasSymbol: a list of alternative symbols used to identify the gene.
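The attributes listed above can be sketched as a small Python data structure. The attribute names come from the list; the Python types, the shape of each `identifier` entry (an `authority`/`value` mapping), and the sample values are assumptions for illustration only. The result is not the content of the published examples or of Gene.xsd.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class Gene:
    # Attribute names follow the list above; the types are assumptions.
    identifier: List[Dict[str, str]]  # identifiers from external authorities
    symbol: str                       # the string symbol identifying the gene
    name: str                         # the human-readable name
    aliasSymbol: List[str] = field(default_factory=list)  # alternative symbols


# Illustrative values only.
gene = Gene(
    identifier=[{"authority": "HGNC", "value": "HGNC:1100"}],
    symbol="BRCA1",
    name="BRCA1 DNA repair associated",
    aliasSymbol=["RNF53"],
)

# Serialize to plain JSON; the key order follows the field order above.
print(json.dumps(asdict(gene), indent=2))
```

Defaulting `aliasSymbol` to an empty list reflects that a gene may have no alternative symbols; whether the other attributes are required is not stated in the description above.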
Resource Model
Related Resources: ReferenceSequence
Notes
Formal Definitions
Schema
schema: Gene.xsd
Examples
id | name | | | |
---|---|---|---|---|
G101 | G101-ILK | json-ld | xml | json |
G102 | G102-BRCA1 | json-ld | xml | json |