Eukaryotic DNA is associated with histone proteins and organized into a complex nucleoprotein structure called chromatin. This structure decreases the accessibility of DNA but also helps to protect it from damage. Access to DNA is achieved by highly regulated local chromatin decondensation.
The 'building block' of chromatin is the nucleosome. This contains ~150 bp of DNA wrapped in 1.65 left-handed superhelical turns around a histone octamer, which consists of two each of the core histones H2A, H2B, H3 and H4 (Luger et al. 1997, Andrews & Luger 2011).
Most organisms have multiple genes encoding the major histone proteins. The replication-dependent genes for the five histone proteins are clustered together in the genome in all metazoans. Human replication-dependent histones occur in a large cluster on chromosome 6 termed HIST1, a smaller cluster HIST2 on chromosome 1q21, and a third small cluster HIST3 on chromosome 1q42 (Marzluff et al. 2002). Histone genes are named systematically according to their cluster and location within the cluster.
The 'major' histone genes are expressed primarily during the S phase of the cell cycle and code for the bulk of cellular histones. Histone variants are usually present as single-copy genes that are not restricted in their expression to S phase, contain introns and are often polyadenylated (Old & Woodland 1984). Some variants have significant differences in primary sequence and distinct biophysical characteristics that are thought to alter the properties of nucleosomes. Others localize to specific regions of the genome. Some variants, referred to as replacement histones, can exchange with pre-existing major histones during development and differentiation (Kamakaka & Biggins 2005). These variants can become the predominant species in differentiated cells (Pina & Suau 1987, Wunsch et al. 1991). Histone variants may have specialized functions in regulating chromatin dynamics.
The H2A histone family has the highest sequence divergence and the largest number of variants. H2A.Z and H2A.X are considered 'universal variants', found in almost all organisms (Talbert & Henikoff 2010). Variants differ mostly in the C-terminus, including the docking domain, which is implicated in interactions with the (H3-H4)x2 tetramer within the nucleosome, and in the L1 loop, which is the interaction interface of H2A-H2B dimers (Bonisch & Hake 2012). Canonical H2A proteins are expressed almost exclusively during S-phase. There are several nearly identical canonical H2A isoforms (Marzluff et al. 2002), and no functional specialization of these isoforms has been demonstrated (Bonisch & Hake 2012). Reversible histone modifications such as acetylation and methylation regulate transcription from genomic DNA, defining the 'readability' of genes in specific tissues (Kouzarides 2007, Marmorstein & Trievel 2009, Butler et al. 2012).
N.B. The coordinates of post-translational modifications represented here follow Reactome standardized naming, which includes the UniProt standard practice whereby coordinates refer to the translated protein before any further processing. Histone literature typically refers to coordinates of the protein after the initiating methionine has been removed; therefore the coordinates of post-translationally modified histone residues described here are frequently +1 when compared with the literature. For more information on Reactome's standards for naming pathway events, the molecules that participate in them and representation of post-translational modifications, please refer to Naming Conventions on the Reactome Wiki or Jupe et al. 2014.
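As an illustration of the coordinate offset described in the note above, the following minimal Python sketch converts a residue label between the literature convention (initiating methionine removed) and the UniProt-style convention (full translated protein). The function names and the default offset of 1 are assumptions made for this example and are not part of Reactome or UniProt tooling; the offset applies only to proteins whose initiating methionine is removed and that undergo no other N-terminal processing.

```python
import re

# Hypothetical helpers for illustration only; not part of any Reactome or UniProt API.
# Assumption: the only difference between the two conventions is the removed
# initiating methionine, so positions differ by exactly `offset` (default 1).

_RESIDUE = re.compile(r"([A-Z])(\d+)")


def _split(residue: str) -> tuple[str, int]:
    """Split a label such as 'K9' into its amino-acid letter and position."""
    match = _RESIDUE.fullmatch(residue)
    if match is None:
        raise ValueError(f"unrecognised residue label: {residue!r}")
    return match.group(1), int(match.group(2))


def literature_to_uniprot(residue: str, offset: int = 1) -> str:
    """Shift a literature-style label to UniProt-style numbering."""
    amino_acid, position = _split(residue)
    return f"{amino_acid}{position + offset}"


def uniprot_to_literature(residue: str, offset: int = 1) -> str:
    """Shift a UniProt-style label back to literature-style numbering."""
    amino_acid, position = _split(residue)
    if position - offset < 1:
        raise ValueError(f"{residue!r} has no literature-style equivalent")
    return f"{amino_acid}{position - offset}"


if __name__ == "__main__":
    # Lysine 9 of histone H3 in the literature corresponds to position 10
    # when the initiating methionine is counted.
    print(literature_to_uniprot("K9"))   # K10
    print(uniprot_to_literature("K10"))  # K9
```

Because the note states that coordinates are "frequently" rather than always shifted by one, the offset is kept as a parameter and should be checked for each protein rather than applied blindly.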