E3 ligase finding its substrate schematic

How do E3 ubiquitin ligases achieve selective protein degradation?

The ubiquitin-proteasome system (UPS) is the major route through which the cell achieves selective protein degradation, and hence the UPS plays a critical role in essentially all cellular processes. We are interested in how specificity is achieved within the UPS, particularly in the context of immune cell function. How does the cell identify proteins that need to be degraded? What cellular machinery is involved, and how is the activity of these components regulated? We combine expression screening approaches with loss-of-function CRISPR screening approaches with the goal of (1) identifying substrates of E3 ubiquitin ligases, and (2) delineating the specific molecular features ("degrons") that dictate substrate recognition. UPS components are a largely untapped source of potential drug targets; an enhanced understanding of how E3 ligases select their substrates will be invaluable to guide the development of novel therapeutics.

E1-E2-E3 UPS pathway schematic

The Ubiquitin-Proteasome System

Ubiquitin is a small (76 amino acid) protein which, through the action of the E1 activating enzyme, the E2 ubiquitin-conjugating enzymes and the E3 ubiquitin ligases, is covalently attached to protein substrates to target them for proteolytic destruction by the proteasome. The ubiquitin-mediated proteasomal degradation of proteins plays an important role in essentially all major cellular processes, underscored by the fact that well over 1000 proteins (>5% of all human genes) are involved in the system. Dysfunction of the ubiquitin system is associated with a variety of human conditions, including cancer and neurodegeneration.

E3 ligase families schematic

Identifying substrates of E3 ubiquitin ligases

Over 800 E3 ligases are encoded in the human genome. These are thought to be critical in providing specificity within the UPS, by selecting the protein substrates to be tagged with ubiquitin and hence targeted for proteasomal degradation. However, for most E3 ligases few (if any) substrates have been identified, and hence understanding how E3s perform this role remains a major challenge. E3 ubiquitin ligases are thought to selectively identify their substrates through the recognition of specific sequence elements - termed degrons - found in their target substrates. However, despite the central role of degrons in proteostasis, remarkably few degron motifs have been characterized thus far. The ability to identify degrons and match them to their cognate E3 at scale will be critical if we are to achieve a systems-level understanding of the UPS.

GPS approach schematic

The Global Protein Stability (GPS) approach

Our work is centered around the Global Protein Stability (GPS) system, a genetic approach to studying the ubiquitin-proteasome system. GPS is an expression screening technique which enables the stability of exogenously expressed GFP-tagged proteins to be measured in cultured mammalian cells. GPS is based on a standard lentiviral expression vector expressing two fluorescent proteins: DsRed, which serves as an internal reference, and GFP, which is fused to the partner of interest. Because DsRed and GFP are expressed from the same transcript, the ratio of the green signal to the red signal can be used to read out the effect of the fusion partner on the stability of GFP. GPS relies on fluorescence-activated cell sorting (FACS) followed by Illumina sequencing to measure the stability of GFP fusion proteins. In a typical GPS experiment, cells expressing a library of GFP-fusion constructs are partitioned into 4 bins or 6 bins (depending on the FACS instrument used) based on the GFP/DsRed ratio. Genomic DNA is then extracted from the sorted cells in each bin and used as a template for a PCR reaction to amplify the sequence encoding the fusion partner. Subsequent Illumina sequencing of the amplicons allows for the quantification of the abundance of each fusion in each bin, thereby permitting a measurement of protein stability.
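As a rough illustration of how per-bin read counts might be turned into a single stability measurement, the sketch below computes a read-weighted mean bin number for each construct. This is an assumption made for illustration rather than a description of our analysis pipeline: the function names, the per-bin depth normalisation and the example counts are all hypothetical.

```python
from typing import Dict, List


def normalise_by_bin(counts: Dict[str, List[int]]) -> Dict[str, List[float]]:
    """Convert raw read counts to the fraction of each bin's reads, so that
    bins sequenced to different depths contribute comparably."""
    n_bins = len(next(iter(counts.values())))
    totals = [sum(c[i] for c in counts.values()) for i in range(n_bins)]
    return {
        name: [c[i] / totals[i] if totals[i] else 0.0 for i in range(n_bins)]
        for name, c in counts.items()
    }


def stability_score(bin_values: List[float]) -> float:
    """Weighted mean bin number; bin 1 holds the lowest GFP/DsRed ratio."""
    total = sum(bin_values)
    if total == 0:
        raise ValueError("construct has no reads in any bin")
    return sum(i * v for i, v in enumerate(bin_values, start=1)) / total


# Hypothetical read counts for two constructs across a 4-bin sort.
counts = {
    "stable_fusion": [0, 10, 30, 60],
    "unstable_fusion": [70, 20, 10, 0],
}
scores = {
    name: stability_score(values)
    for name, values in normalise_by_bin(counts).items()
}
```

Under this scheme a construct found mostly in the low GFP/DsRed bins scores close to 1, while one found mostly in the highest bin scores close to the number of bins.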

ORFs versus peptides GPS schematic

Global stability profiling using a human ORFeome library

GPS screens with the human ORFeome library allow us to gain a global picture of protein stability in a single experiment. The human ORFeome library consists of ~14,000 full-length, sequence-verified human open reading frames (ORFs) which can be identified through unique molecular barcodes located at their 3' end. By performing comparative profiling experiments in the presence and absence of a chemical inhibitor or genetic perturbation, we use GPS-ORFeome screens to examine how the stability of cellular proteins is impacted by the ubiquitin-proteasome system. Once unstable proteins of interest have been identified, we combine the GPS expression screening approach with loss-of-function CRISPR screens to identify the E3 ligase(s) responsible. Alternatively, substrates of specific E3 ligases can be identified by searching for proteins stabilised in cells lacking the E3 ligase compared to wild-type cells.
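The comparative step can be sketched as a simple difference in stability scores between two conditions, for example wild-type cells versus cells lacking an E3 ligase. The function name, the `min_shift` threshold and the example scores below are hypothetical, and a real analysis would also need to account for replicate variability.

```python
from typing import Dict, List, Tuple


def stabilised_constructs(
    control: Dict[str, float],
    treated: Dict[str, float],
    min_shift: float = 1.0,
) -> List[Tuple[str, float]]:
    """Return constructs whose stability score rises by at least min_shift
    in the treated condition, largest shift first."""
    shared = control.keys() & treated.keys()
    shifts = [(name, treated[name] - control[name]) for name in shared]
    hits = [(name, shift) for name, shift in shifts if shift >= min_shift]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical stability scores (mean bin number) in two conditions.
wild_type = {"ORF_A": 1.5, "ORF_B": 3.5, "ORF_C": 2.0}
e3_knockout = {"ORF_A": 3.5, "ORF_B": 3.5, "ORF_C": 2.5}

candidates = stabilised_constructs(wild_type, e3_knockout)
```

Here only `ORF_A` passes the threshold, marking it as a candidate substrate under these assumed numbers.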

GPS approach schematic

Mapping degrons using GPS

Degrons are primarily thought to comprise short peptide motifs lying in unstructured regions of proteins. To characterise degron motifs, we exploit microarray-based oligonucleotide synthesis to generate custom GPS libraries wherein short peptides are fused to GFP. Once short peptides capable of destabilising GFP have been identified, we delineate the precise nature of the degron motif through comprehensive mutagenesis experiments and identify the cognate E3 ligase that recognises the motif through CRISPR screens.
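To give a sense of what a comprehensive mutagenesis library looks like in practice, the sketch below enumerates every single-residue substitution of a peptide. It is only an illustration: the function name and the example peptide are hypothetical, and it assumes the input uses the 20 standard one-letter amino acid codes.

```python
from typing import Iterator, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_substitutions(peptide: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (position, wild-type residue, replacement, mutant sequence)
    for every single-residue substitution; positions are 1-based."""
    for index, wild_type in enumerate(peptide):
        for replacement in AMINO_ACIDS:
            if replacement == wild_type:
                continue  # skip the unchanged sequence
            mutant = peptide[:index] + replacement + peptide[index + 1:]
            yield index + 1, wild_type, replacement, mutant


# A hypothetical 3-residue peptide gives 3 x 19 = 57 variants.
variants = list(single_substitutions("GSA"))
```

Each variant can then be synthesised as an oligonucleotide and assayed in the same way as the parent peptide, so that positions where substitutions restore stability mark the residues that matter for recognition.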

ORFs versus peptides GPS schematic

Terminal degrons: the N-degron and C-degron pathways

One particularly interesting class of degrons comprises those which lie at the ends of proteins. By performing GPS-peptide screens using libraries of peptides derived from the extreme N- and C-termini of human proteins, we have characterised a suite of Cullin-RING E3 ligase complexes which target specific peptide motifs when they are located at protein ends. Recent work includes the identification of an N-terminal glycine degron pathway regulated by Cul2^ZYG11B and Cul2^ZER1, and the identification of a suite of E3 ligases that target degron motifs located at protein C-termini. We seek to delineate additional pathways through which E3 ligases target degrons located at protein termini and to understand the primary physiological roles of these pathways.
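Designing a terminal peptide library starts with taking a fixed-length window from each end of every protein sequence. A minimal sketch follows; the function name, the window length and the example sequence are hypothetical, and details such as removal of the initiator methionine are not modelled.

```python
from typing import Tuple


def terminal_peptides(sequence: str, length: int) -> Tuple[str, str]:
    """Return the (N-terminal, C-terminal) peptides of the given length.
    Sequences shorter than the window are returned whole for both ends."""
    if length <= 0:
        raise ValueError("length must be positive")
    return sequence[:length], sequence[-length:]


# Hypothetical protein sequence and window length.
n_term, c_term = terminal_peptides("MGSSKLAVQRC", 4)
```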

Schematic representation of viral gene products that interfere with the MHC-I antigen presentation pathway

How do viruses exploit the ubiquitin system?

Viruses manipulate the cellular machinery of their host to permit their efficient replication whilst simultaneously evading detection by the host immune system. Deciphering the mechanisms by which viral gene products achieve these goals is informative not only for understanding mechanisms of viral pathogenesis, but also for elucidating the fundamental biology of the cell. As the ubiquitin-proteasome system regulates so many different cellular processes, it is unsurprising that a wide range of viruses have evolved strategies to hijack it. We employ a range of genetic screening techniques to (1) identify substrates of virally-encoded E3 ubiquitin ligases, and (2) characterise viral gene products that manipulate endogenous E3s to promote the degradation of cellular proteins.

E1-E2-E3 UPS pathway schematic

The MHC-I antigen presentation pathway

The major histocompatibility complex class I (MHC-I) antigen presentation pathway plays a central role in the defence against viral infection. We are interested in understanding both the role of the ubiquitin-proteasome system in the generation of antigenic peptides for presentation on MHC-I molecules, and how viruses exploit the ubiquitin system to antagonise the MHC-I pathway to evade detection by the immune system. Previous work has focused on dissecting the mechanism of action of viral gene products that interfere with this pathway. Retroviral gene-trap forward genetic screens in the near-haploid KBM-7 cell line led to the identification of PLP2 as a cellular protein essential for the function of K3 and K5, two E3 ligases encoded by Kaposi's sarcoma-associated herpesvirus (KSHV), and TMEM129, a cellular E3 ligase hijacked by the US11 gene product of human cytomegalovirus (HCMV).

E3 ligase families schematic

The HUSH complex

We are also interested in cellular pathways involved in sensing and responding to viral infection. Previous work in this area has focussed on the HUSH complex, a critical epigenetic regulator that we discovered in 2015. HUSH is a complex of three proteins, TASOR, MPP8 and Periphilin, which acts by recruiting two effector proteins, the histone methyltransferase SETDB1 and the chromatin remodeller MORC2, to target sites in order to instill epigenetic repression. HUSH can drive the silencing of integrated HIV-1, and indeed HIV-1 accessory proteins exploit the ubiquitin system to target HUSH subunits for proteasomal degradation. Inhibitors of HUSH function therefore represent a potential strategy to eradicate the pool of latent HIV-1 in AIDS patients that currently necessitates lifelong antiretroviral therapy.


Developing new genetic tools to probe gene function

Developing and exploiting state-of-the-art genetic technologies lies at the core of our approach. We specialise in large-scale genetic screening approaches, powered by fluorescence-based phenotypic selection using FACS and analysed through next-generation sequencing and computational strategies. We perform both expression screens, where we leverage highly-parallel microarray-based oligonucleotide synthesis to generate custom lentiviral expression libraries, and loss-of-function screens, using either large-scale guide RNA libraries to enable CRISPR/Cas9 screens or gene-trap mutagenesis screens in near-haploid KBM-7 cells. We seek new avenues - both experimental and computational - through which to enhance, combine and deploy these techniques, and, where our questions cannot be addressed using existing tools, we seek to develop novel genetic technologies to further our research goals.

GPS ORFs vs peptides schematic

Genetic tools to monitor protein stability

A major goal of our work is to develop effective genetic tools to probe the function of the ubiquitin-proteasome system. This work is based on the Global Protein Stability (GPS) system, which allows for the stability of GFP-tagged fusion proteins to be read out by FACS. Combining GPS with a human ORFeome library permits the stability of ~14,000 human proteins to be assayed in a single pooled experiment; alternatively, we leverage microarray oligonucleotide synthesis to generate custom libraries of GFP-peptide fusion proteins which allow the delineation of degron motifs. We combine these expression screening approaches with loss-of-function CRISPR screens to match the degron motifs responsible for substrate instability to the cognate E3 ubiquitin ligase.

E3 ligase families schematic

Retroviral gene-trap forward genetic screens in near-haploid human cells

The discovery of the near-haploid KBM-7 cell line allows forward genetic screens to be carried out in human cells. With each gene represented by only a single allele, insertional mutagenesis using a gene-trap retroviral vector generates a library of gene knockouts; this library can then be interrogated to identify mutant cells displaying a phenotype of interest. We demonstrated that haploid screens could be performed using FACS-based phenotypic selection, and exploited this approach to identify a suite of novel genes involved in host-pathogen interactions. Although such screens are restricted to a single cell type, this technique still offers some advantages over genome-wide CRISPR/Cas9 screens for certain applications, for example, where a guide RNA library targeted to a pre-defined set of genes is not desirable.

GPS approach schematic

Differential Viral Integration (DIVA)

DIVA is a genetic technique to assay chromatin accessibility. DIVA is based on the premise that lentiviruses will more readily integrate into regions of the genome loosely packaged in "open" chromatin as compared to more densely packaged heterochromatic sites. Thus, by mapping lentiviral integration sites on a large scale, genomic loci exhibiting altered chromatin architecture between control cells and experimentally treated cells can be identified. We have exploited the DIVA approach to examine the role of MORC2-mediated chromatin compaction in epigenetic repression by the HUSH complex.
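The comparison at the heart of DIVA can be illustrated by counting mapped integration sites in fixed-size genomic windows and comparing the counts between conditions. The sketch below is a simplified assumption rather than the published analysis: the function names, window size, pseudocount and example coordinates are hypothetical, and it assumes both samples contain a similar total number of mapped sites.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

Site = Tuple[str, int]    # (chromosome, position of integration)
Window = Tuple[str, int]  # (chromosome, window index)


def window_counts(sites: Iterable[Site], window_size: int) -> Counter:
    """Count integration sites falling in each fixed-size genomic window."""
    return Counter((chrom, pos // window_size) for chrom, pos in sites)


def log2_ratios(
    control: Counter, treated: Counter, pseudocount: float = 1.0
) -> Dict[Window, float]:
    """log2(treated / control) per window; the pseudocount avoids division
    by zero in windows with no integrations in one condition."""
    windows = control.keys() | treated.keys()
    return {
        w: math.log2((treated[w] + pseudocount) / (control[w] + pseudocount))
        for w in windows
    }


# Hypothetical integration sites in control and treated cells.
control_sites = [("chr1", 100), ("chr1", 900), ("chr1", 1500)]
treated_sites = [("chr1", 1200), ("chr1", 1800), ("chr1", 1900)]

ratios = log2_ratios(
    window_counts(control_sites, 1000),
    window_counts(treated_sites, 1000),
)
```

Positive values mark windows that gained integrations in the treated cells, which under the DIVA premise suggests more accessible chromatin; negative values suggest the opposite.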