ReferenceSequence

A versioned sequence of nucleotide bases or amino acids.

Scope and Usage

All alleles are described relative to a ReferenceSequence by way of the ContextualAllele.ReferenceCoordinate (see ContextualAllele). Both the ContextualAllele position (or coordinate) and the sequence at that position are relative to the ReferenceSequence.

The current version of the model is based on four specific types of conceptual reference sequence: ChromosomeReferenceSequence, GeneReferenceSequence, TranscriptReferenceSequence and AminoAcidReferenceSequence.

Resource Model

ReferenceSequence Resource Diagram

Related Resources: ContextualAllele & Gene

Notes

  • This version of the ReferenceSequence resource considers only the associations between amino acid and transcript reference sequences. This constrains the scope of codes in the value set for the ReferenceSequence.related.relatedType attribute.

Definitions & Bindings

ReferenceSequence

Definition

Stable, reliable, public consensus sequence of a portion of the genome against which alleles are aligned and described.

Control
1..1
Requirements

For every ContextualAllele.ReferenceCoordinate there must be a corresponding reference sequence.

ReferenceSequence.identifier

Definition

The versioned accession that can be used to uniquely identify the sequence in a public database. All users of reference sequences SHALL be sure of the identity of the reference sequence. Often only one identifier will be used, but if more than one reference accession is used, all of them are required to map to the same sequence.

Control
0..*
Type

Identifier

Requirements

For ReferenceSequence, an identifier is valid unless the accession is withdrawn. If a new version of a sequence is created, it will be given a different versioned accession, and therefore a different identifier. In this case, the previous identifier is still valid: it can still be used to retrieve the given sequence, even if that sequence is no longer the most recent sequence associated with a given entity.
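The versioning behavior described above can be illustrated with a small sketch. It assumes the common "accession.version" form seen in the examples on this page (for instance NM_007294.3). The function name split_versioned_accession is hypothetical and not part of the model, and identifiers without a dotted numeric suffix (such as LRG_292t1) are returned with no version.

```python
from typing import Optional, Tuple


def split_versioned_accession(identifier: str) -> Tuple[str, Optional[int]]:
    """Split an identifier such as 'NM_007294.3' into ('NM_007294', 3).

    Assumption: the version is the integer after the last dot. Identifiers
    without such a suffix are returned unchanged with a version of None.
    """
    base, dot, suffix = identifier.rpartition(".")
    if dot and base and suffix.isdecimal():
        return base, int(suffix)
    return identifier, None


# Two versions of the same accession yield different identifiers,
# so each one continues to retrieve its own sequence.
old = split_versioned_accession("NC_000011.9")
new = split_versioned_accession("NC_000011.10")
```

Comparing the two results shows the same base accession with different versions, which is why a previous identifier remains valid after a newer version is published.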

ReferenceSequence.referenceSequenceType

Definition

The type of reference sequence.

Control
1..1
Binding

ReferenceSequenceType: reference-sequence-type (Codes identifying kinds of reference sequences)

Type

code

Requirements

Currently, the supported types are transcript, gene, chromosome and amino_acid (see Binding).

ReferenceSequence.chromosome

Definition

The chromosome to which the reference sequence is naturally bound.

Control
0..1
Binding

ReferenceSequenceChromosome: reference-sequence-chromosome (Codes identifying human chromosomes.)

Type

code

Requirements

This should be provided when the sequence represents a 'chromosome'.

ReferenceSequence.cdsStart

Definition

The offset of the start of the coding region from the start of the reference sequence. This should be provided when the sequence represents a 'transcript'.

NOTE: Unlike reference coordinates, this is a 1-based index representing the position of the nucleotide that starts the coding region.

Control
0..1
Type

integer

ReferenceSequence.cdsEnd

Definition

The offset of the end of the coding region from the start of the reference sequence. This should be provided when the sequence represents a 'transcript'.

NOTE: Unlike reference coordinates, this is a 1-based index representing the position of the nucleotide that ends the coding region.

Control
0..1
Type

integer
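Because cdsStart and cdsEnd are 1-based nucleotide positions, extracting the coding region with a 0-based, half-open slice means subtracting one from cdsStart only. The sketch below illustrates this. It assumes cdsEnd is inclusive (the position of the last coding nucleotide, as the note above suggests); the function name and the toy sequence are hypothetical.

```python
def coding_region(sequence: str, cds_start: int, cds_end: int) -> str:
    """Return the coding region of a transcript sequence.

    cds_start and cds_end are 1-based positions; cds_end is assumed inclusive.
    """
    if not 1 <= cds_start <= cds_end <= len(sequence):
        raise ValueError(
            "expected 1 <= cdsStart <= cdsEnd <= length of the sequence"
        )
    # Convert the 1-based inclusive start to a 0-based slice start;
    # the inclusive 1-based end equals the exclusive 0-based slice end.
    return sequence[cds_start - 1:cds_end]


# Toy transcript (not a real sequence): 3 leading bases, a 9-base CDS, 2 trailing bases.
transcript = "GGCATGAAATAGCC"
cds = coding_region(transcript, 4, 12)
```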

ReferenceSequence.gene

Definition

Identifies the gene related to a 'gene' or 'transcript' reference sequence.

Control
0..1
Type

Resource(Gene)

ReferenceSequence.referenceGenome

Definition

The genome build in which the chromosomal reference sequence is referenced.

Control
0..1
Type

CodeableConcept

Requirements

The current version of the model does not provide for the same version of a chromosomal reference sequence to be associated with more than one reference genome build.

ReferenceSequence.related

Definition

A relationship to a different reference sequence, provided for additional context.

Control
0..1
Requirements

The current version of the resource only defines relationship type terminology bindings to support the relationship between transcript and amino acid reference sequences.

ReferenceSequence.related.relatedType

Definition

The type of relationship between an amino acid reference sequence and a transcript reference sequence, or vice versa.

Control
0..1
Binding

ReferenceSequenceRelationshipType: reference-sequence-relationship-type (Codes describing the relationships between ReferenceSequences)

Type

code

ReferenceSequence.related.target

Definition

The target reference sequence for the specified related.relatedType.

Control
0..1
Type

Resource(ReferenceSequence)
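The conditional requirements in the definitions above can be collected into a simple check. This is a sketch only: the class and function names are hypothetical, the attributes mirror a subset of the element names on this page, and the "should be provided" statements are reported as messages rather than enforced as errors. The chromosome value "17" is an assumed code; the actual codes come from the reference-sequence-chromosome value set.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Types listed under ReferenceSequence.referenceSequenceType.
SUPPORTED_TYPES = {"transcript", "gene", "chromosome", "amino_acid"}


@dataclass
class ReferenceSequence:
    """Hypothetical in-memory view of a subset of the resource elements."""
    reference_sequence_type: str
    identifiers: List[str] = field(default_factory=list)
    chromosome: Optional[str] = None
    cds_start: Optional[int] = None
    cds_end: Optional[int] = None


def check_reference_sequence(rs: ReferenceSequence) -> List[str]:
    """Return a list of messages for unmet conditional requirements."""
    problems: List[str] = []
    if rs.reference_sequence_type not in SUPPORTED_TYPES:
        problems.append(
            f"unsupported referenceSequenceType: {rs.reference_sequence_type}"
        )
    if rs.reference_sequence_type == "chromosome" and rs.chromosome is None:
        problems.append("chromosome should be provided for a 'chromosome' sequence")
    if rs.reference_sequence_type == "transcript":
        if rs.cds_start is None:
            problems.append("cdsStart should be provided for a 'transcript' sequence")
        if rs.cds_end is None:
            problems.append("cdsEnd should be provided for a 'transcript' sequence")
    return problems


# Identifiers taken from the Examples table; other values are illustrative.
chr17 = ReferenceSequence("chromosome", ["NC_000017.11"], chromosome="17")
brca1_tx = ReferenceSequence("transcript", ["NM_007294.3"])
```

Calling check_reference_sequence(chr17) returns an empty list, while brca1_tx yields two messages because neither cdsStart nor cdsEnd is set.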

Schema

schema: ReferenceSequence.xsd

Examples

id     name                      formats
RS201  RS201-NM_001014794.2-ILK  json-ld, xml, json
RS202  RS202-NC_000011.9-b37     json-ld, xml, json
RS203  RS203-NC_000011.10-b38    json-ld, xml, json
RS210  RS210-NC_000017.11-b38    xml, json
RS211  RS211-NC_000017.10-b37    xml, json
RS212  RS212-NG_005905.2-BRCA1   xml, json
RS213  RS213-NM_007294.3-BRCA1   xml, json
RS214  RS214-NP_009225.1-BRCA1   xml, json
RS215  RS215-U14680.1-BRCA1      xml, json
RS216  RS216-LRG_292p1-BRCA1     xml, json
RS217  RS217-LRG_292t1-BRCA1     xml, json
RS218  RS218-LRG_292-BRCA1       xml, json