
Research

The goal of the CSMP is to express, purify, and determine the structures of representative members of membrane protein classes. Where a class of membrane proteins is represented in prokaryotes, the first structure of a homolog is likely to be determined for a prokaryotic or archaeal member. Many human proteins, however, have no good homologs in prokaryotes or archaea. These include the receptors targeted by neuro-psychopharmaceutical drugs, the reuptake pumps targeted by the new antidepressants, and the ~1500 GPCRs in the human genome, which include numerous key drug targets. Nearly 30% of all eukaryotic proteins are membrane proteins, and these include the protein targets of over 40% of all drugs in use today. The mechanisms and atomic interactions are not understood for any one of these. In most cases the structures of the particular membrane proteins targeted by today's drugs have not yet been determined. This is primarily because they are membrane proteins, for which preparation in a structurally homogeneous and functionally active state, and subsequent structure determination, have been extremely challenging.

CSMP supports an integrated program of interdependent subprojects and core facilities that provide routine processes, including protein purification, characterization, X-ray diffraction at beamlines 5.0.2 and 8.3.1 of the Advanced Light Source in Berkeley, and structure determination by electron microscopy and NMR.

Bioinformatics supports two key aspects of the overall effort. First, it contributes to the construction of a target list of representative proteins whose structures are to be determined. Second, it leverages the experimentally determined structures by structurally and functionally characterizing many more related protein sequences. Target selection and protein structure modeling are clearly interdependent: the better our prediction methods, the fewer experimentally determined target structures are needed for a given level of structural coverage and accuracy.

To illustrate the rich set of mechanisms of membrane proteins, we include a statistical summary here:

Figure D1. The distribution of proteins as a function of the number of predicted transmembrane helices in the four genomes of interest. The human sequences were taken from the NCBI RefSeq database, the others from UniProt.

Genome	# all proteins	# transmembrane proteins	Fraction (%)	# proteins with ≥3 tm helices	# clusters (30% seq id)	# clusters (50% seq id)
H. sapiens	21,901	5,463	24.9	2,635	1,833	2,025
P. aeruginosa	5,565	1,290	23.1	816	732	789
E. coli	4,289	1,072	24.9	700	643	681
M. jannaschii	1,715	364	21.2	207	198	204

Table D1. Frequency of membrane proteins and their clusters in four genomes. "# transmembrane proteins" indicates the number of proteins predicted to contain at least one transmembrane (tm) helix. Predicted transmembrane proteins with at least three helices were clustered. The clustering was performed such that all members of a cluster share at least 50% sequence identity and differ in length by no more than 50 residues. The number of clusters for all membrane proteins from the four genomes together is 3,273 and 3,699 at the 30% and 50% sequence identity cutoffs, respectively. The average cluster sizes are 1.3 and 1.2 proteins, respectively. At the 30% sequence identity cutoff, there are 2,648, 413, 125, and 88 clusters with 1, 2, 3, and more than 3 members, respectively.
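The clustering criterion in the caption (every pair of members meets the identity cutoff and the length limit) can be sketched as a greedy, complete-linkage pass. This is only an illustration under stated assumptions, not the procedure actually used to produce Table D1: the function names are hypothetical, and `toy_identity` is a stand-in that compares positions without alignment, where a real analysis would use an alignment-based identity.

```python
from typing import Callable, List, Sequence


def can_join(candidate: str, cluster: Sequence[str],
             identity: Callable[[str, str], float],
             min_identity: float = 0.5, max_len_diff: int = 50) -> bool:
    """True if the candidate meets both criteria against EVERY cluster member."""
    return all(
        abs(len(candidate) - len(member)) <= max_len_diff
        and identity(candidate, member) >= min_identity
        for member in cluster
    )


def greedy_cluster(seqs: Sequence[str],
                   identity: Callable[[str, str], float],
                   min_identity: float = 0.5,
                   max_len_diff: int = 50) -> List[List[str]]:
    """Assign each sequence to the first cluster it can join, else start a new one."""
    clusters: List[List[str]] = []
    for seq in seqs:
        for cluster in clusters:
            if can_join(seq, cluster, identity, min_identity, max_len_diff):
                cluster.append(seq)
                break
        else:
            # No existing cluster accepted the sequence.
            clusters.append([seq])
    return clusters


def toy_identity(a: str, b: str) -> float:
    """Hypothetical stand-in: matching positions over the longer length (no alignment)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


if __name__ == "__main__":
    seqs = ["AAAAAAAAAA", "AAAAAAAAAC", "CCCCCCCCCC", "AAAAAGGGGG"]
    for cutoff in (0.5, 0.6):
        clusters = greedy_cluster(seqs, toy_identity, min_identity=cutoff)
        print(cutoff, len(clusters), clusters)
```

A greedy pass depends on input order, so cluster counts from a sketch like this need not match those of an exhaustive method. The pairwise check in `can_join` is what enforces that all members of a cluster, not just a representative, satisfy the cutoffs.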

Membrane proteins that span the membrane three or more times will be expressed and then subjected to three-dimensional structure determination by one of several methods. These include high-resolution X-ray crystallography, electron crystallography of two-dimensional crystals, single-particle electron microscopic imaging and alignment, and nuclear magnetic resonance (NMR).