The CSTF (cleavage stimulation factor) complex, which determines the efficiency of polyadenylation, is composed of CSTF1 (also known as CSTF50), CSTF2 (also known as CSTF64), and CSTF3 (also known as CSTF77) (Takagaki and Manley 2000: human proteins were used; Grozdanov et al. 2018: human proteins were used; Yang et al. 2018: human proteins were used). CSTF3 directly interacts with both CSTF1 and CSTF2, while CSTF1 and CSTF2 do not interact directly (Takagaki and Manley 2000). The WD40 repeats of CSTF1 are responsible for its interaction with CSTF3 (Takagaki and Manley 2000). The proline-rich domain of CSTF3 is responsible for its interaction with both CSTF1 and CSTF2 (Takagaki and Manley 2000). The hinge domain of CSTF2 is responsible for binding to CSTF3 (Hockert et al. 2010: human proteins were used; Grozdanov et al. 2018). A reconstitution experiment with human proteins confirmed that the CSTF complex is hexameric: a dimer of trimers held together by homodimerization of the two CSTF3 molecules and of the two CSTF1 molecules (Yang et al. 2018). CSTF2 relies on CSTF3 to be transported into the nucleus (Grozdanov et al. 2018; Grozdanov et al. 2020). CSTF3 stimulates the RNA binding of CSTF2 (Grozdanov et al. 2018), and CSTF1 fine-tunes the recognition of the target sequence by the RNA recognition motif (RRM) of CSTF2 (Yang et al. 2018). CSTF2 also interacts with SYMPK (Symplekin) (Takagaki and Manley 2000; Yao et al. 2013), a protein known to be closely associated with the CPSF complex (Shi et al. 2009; Schönemann et al. 2014). CSTF2 has a paralog called CSTF2T (also known as tauCSTF64, CSTF64tau, or CstF64t), which shares >70% sequence identity with CSTF2 in both human and mouse (Dass et al. 2002). CSTF2T can replace CSTF2 in the CSTF complex and exhibits tissue-biased expression and subtly different RNA-polymer preferences, contributing to tissue-specific regulation of polyadenylation as described below (Dass et al. 2002; Monarez et al. 2007; Yao et al. 2013; Huber et al. 2005).
CSTF2 binds, via its RRM, to U/GU-rich downstream sequence elements (DSEs) typically located ~15-40 nucleotides downstream of the cleavage site in the pre-mRNA (MacDonald et al. 1994: human CSTF2 was used; Masmouzadeh and Latham 2024: human CSTF2 was used), thereby promoting proximal 3'UTR PAS usage at many gene loci (Xia et al. 2014; Akman et al. 2015). CSTF2 thus determines the accuracy and efficiency of polyadenylation, and also brings the rest of the CSTF complex to this site through interaction between CSTF2 and CSTF3, stabilizing the CSTF complex assembly (Hockert et al. 2010; Yang et al. 2018).
In mouse, Cstf2t mRNA was reported to be ubiquitously expressed, with protein expression restricted to testes and brain (Huber et al. 2005). Subsequently, Cstf2t protein was also reported to be expressed in mouse immune cells (Hockert et al. 2011). Using mouse proteins, it was shown that the affinities of CSTF2 and CSTF2T for various RNA polymers differ, with both proteins showing similar affinities for poly(G), poly(A), and poly(C), but CSTF2 having a higher affinity for poly(U) and CSTF2T for poly(GU), supporting the hypothesis that CSTF2T promotes tissue-specific patterns of polyadenylation in tissues where it is expressed (Monarez et al. 2007). CSTF2 is an X chromosome gene, while CSTF2T is an autosomal gene, and mouse Cstf2 was shown to undergo suppression in the testes due to meiotic XY-body formation (Dass et al. 2007). Cstf2t is expressed during meiosis and haploid differentiation during spermatogenesis, and its targeted disruption causes aberrant spermatogenesis, leading to male infertility (Dass et al. 2007; Hockert et al. 2011). A homozygous nonsense mutation CSTF2T R327* was reported in an infertile male patient with severe oligoasthenospermia, which involves both low sperm count and reduced sperm motility (Gorukmez and Gorukmez 2020). One of the spermatogenesis genes whose mRNA polyadenylation is regulated by Cstf2t in mouse is the transcription factor cAMP-responsive element modulator (CREM) (Grozdanov et al. 2016). Besides regulating the expression of genes involved in spermatogenesis, mouse studies imply that CSTF2T also regulates the expression of genes involved in the adhesion of motile spermatozoa to eggs during fertilization (Tardif et al. 2010). Cstf2t knockout mice show normal immune function, suggesting that in immune cells the role of Cstf2t is either redundant with that of Cstf2 or non-essential (Hockert et al. 2011). Cstf2t is important in memory function in female mice and in anxiety regulation in male mice (Harris et al. 2016). 
In the human cervical carcinoma cell line HeLa, CSTF2 was shown to interact with RNA at poly(A) sites (PASs), with the affinity varying between the different sites (Yao et al. 2012). RNAi-mediated depletion of CSTF2 in HeLa cells has a relatively small effect on the global polyadenylation profile, but simultaneous depletion of CSTF2 and CSTF2T leads to greater changes in the polyadenylation profile, mostly characterized by increased relative use of distal PASs (Yao et al. 2012). CSTF2 was shown to bind to thousands of dormant intronic PASs in HeLa cells that are suppressed, at least in part, by U1 small nuclear ribonucleoproteins (U1 snRNPs) (Yao et al. 2012). Contrary to earlier studies, Yao et al. (2013) reported that both CSTF2 and CSTF2T proteins were widely expressed in human cell lines and mouse tissues, with tissue-specific variations in protein levels. Both proteins co-immunoprecipitate with the other two components of the CSTF complex, CSTF1 and CSTF3, in HeLa cells, and exhibit highly similar RNA-binding specificities both in vitro and in vivo (Yao et al. 2013). CSTF2 and CSTF2T negatively regulate each other's expression (Yao et al. 2013). Although the hinge domains of both CSTF2 and CSTF2T are capable of binding to SYMPK, the interaction between SYMPK and the full-length CSTF2T is inhibited by the P/G-rich domain in CSTF2T (Yao et al. 2013: human embryonic kidney cell line HEK293 was used).
In addition to its paralog CSTF2T, CSTF2 also possesses several splicing isoforms whose expression was shown to be particularly pronounced in the nervous system, where they may regulate alternative polyadenylation of neural mRNAs (Shankarling et al. 2009; Shankarling and MacDonald 2013). CSTF2 was shown to regulate expression of splicing isoforms of acetylcholinesterase (ACHE), which hydrolyzes the neurotransmitter acetylcholine and terminates synaptic transmission, through regulation of alternative polyadenylation of ACHE pre-mRNAs (Nazim et al. 2016). A missense mutation in CSTF2, CSTF2 D50A, leads to intellectual disability in male patients (Grozdanov et al. 2020). The CSTF2 D50A mutant interacts with CSTF3 similarly to the wild-type protein but shows higher affinity for RNA (Grozdanov et al. 2020). When expressed in mice, the Cstf2 D50A mutant leads to altered polyadenylation in over 1300 genes critical for brain development (Grozdanov et al. 2020).
In HeLa cells, CSTF2 was shown to undergo post-translational arginine dimethylation in a cell cycle-dependent manner, but the effect of this modification on the function of the CSTF complex in polyadenylation is unknown (Kim et al. 2010). In normal human diploid fibroblasts, arginine dimethylation of CSTF2 decreases in senescent cells (Lim et al. 2010).
Mouse embryonic stem cells (mESCs) that lack Cstf2 display slower growth, loss of pluripotency, and a lengthened G1 phase, correlating with increased polyadenylation of histone mRNAs (Youngblood et al. 2014). In particular, Cstf2 is needed for mESC differentiation into endoderm lineages and cardiomyocytes (Youngblood and MacDonald 2014). Cstf2t is able to partially compensate for Cstf2 in inhibiting polyadenylation of histone mRNAs (Youngblood et al. 2014). Using human HeLa and HEK293 cell lines, it was shown that CSTF2 functions as part of the heat-labile factor (HLF), composed of the cleavage/polyadenylation specificity factor (CPSF) complex, SYMPK, and CSTF2, with CSTF2 levels increasing toward the S-phase and CSTF2 being responsible for the recruitment of the HLF complex to histone pre-mRNAs (Romeo et al. 2014). CSTF2 depletion results in misprocessed histone transcripts that become polyadenylated and mainly accumulate in the nucleus, where they are targets of the exosome machinery (Romeo et al. 2014). CSTF2T also binds histone mRNAs, and was also shown to bind small nucleolar RNAs (snoRNAs) and small nuclear RNAs (snRNAs) (Kargapolova et al. 2017). CSTF2T is involved in alternative processing of snRNAs, including U1, promoting internal oligoadenylation of snRNAs, which results in their shortening and targets them for rapid degradation (Kargapolova et al. 2017). In mouse male germ cells, Cstf2t regulates the expression of histones and histone-like proteins (Grozdanov et al. 2018).
CSTF2 promotes the usage of proximal PASs, and its frequent upregulation in tumors is associated with the global 3'UTR shortening characteristic of cancer cells (Xia et al. 2014; Akman et al. 2015). Upregulation of CSTF2 in tumors is stimulated by EGF signaling (Akman et al. 2015).