IUPAC codes

The International Union of Pure and Applied Chemistry (IUPAC) has defined a standard single-character representation of DNA bases. Each character specifies either a single base (e.g. G for guanine, A for adenine) or a set of bases (e.g. R for either G or A). UCSC uses these single-character codes to represent multiple observed alleles of single-base polymorphisms.

Symbol  Bases             Origin of designation
G       G                 Guanine
A       A                 Adenine
T       T                 Thymine
C       C                 Cytosine
R       G or A            puRine
Y       T or C            pYrimidine
M       A or C            aMino
K       G or T            Keto
S       G or C            Strong interaction (3 H bonds)
W       A or T            Weak interaction (2 H bonds)
H       A or C or T       not-G, H follows G in the alphabet
B       G or T or C       not-A, B follows A
V       G or C or A       not-T (not-U), V follows U
D       G or A or T       not-C, D follows C
N       G or A or T or C  aNy
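
The table above can be read as a two-way lookup: from a code to the set of bases it stands for, and from a set of observed alleles back to a single code. The following Python sketch illustrates that idea. The names IUPAC_TO_BASES, BASES_TO_IUPAC, expand_code and alleles_to_code are hypothetical and chosen for this example; they are not part of any UCSC tool or library.

```python
# Each IUPAC code and the bases it represents, transcribed from the table above.
IUPAC_TO_BASES = {
    "G": "G", "A": "A", "T": "T", "C": "C",
    "R": "GA", "Y": "TC", "M": "AC", "K": "GT",
    "S": "GC", "W": "AT",
    "H": "ACT", "B": "GTC", "V": "GCA", "D": "GAT",
    "N": "GATC",
}

# Reverse lookup: an unordered set of bases -> the code that represents it.
BASES_TO_IUPAC = {
    frozenset(bases): code for code, bases in IUPAC_TO_BASES.items()
}


def expand_code(code):
    """Return the set of bases represented by one IUPAC code (case-insensitive)."""
    return set(IUPAC_TO_BASES[code.upper()])


def alleles_to_code(alleles):
    """Return the IUPAC code for a collection of single-base alleles.

    Raises KeyError if the collection is empty or contains anything
    other than G, A, T or C.
    """
    key = frozenset(allele.upper() for allele in alleles)
    return BASES_TO_IUPAC[key]
```

For example, alleles_to_code(["A", "G"]) returns "R", and expand_code("Y") returns the set containing "T" and "C". Using a frozenset as the key makes the lookup independent of the order in which alleles are listed and collapses duplicates.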