Brought to you by molecularsciences.org.
This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License.
This publication may not be redistributed without this notice.

Protein-Protein Interaction

Proteins control and mediate many of the biological activities of cells. Specific protein-protein interactions are involved in almost all physiological processes.

Small Scale Detection Methods

Interactions can be observed at the atomic level, for example by examining X-ray crystallography structures. Crystal structures yield specific information on the atoms and residues involved in the interaction.

We can infer membership in a complex and/or direct interaction using classical techniques such as co-immunoprecipitation (pull-down), affinity chromatography, and immunoassays such as ELISA and RIA. FRET (Fluorescence Resonance Energy Transfer) and SPR (Surface Plasmon Resonance) are biophysical techniques that help identify protein-protein interactions. Yeast two-hybrid is a commonly used technique for identifying protein-protein interactions.

At the cellular level, we can use "activity bioassays" to observe an interaction, e.g. a cell proliferation assay to observe a ligand-receptor interaction.

Co-immunoprecipitation

Immunoprecipitation (IP) is the technique of precipitating an antigen out of solution using an antibody specific to that antigen. This process can be used to enrich a given protein to some degree of purity. Co-immunoprecipitation (also known as a 'pull-down') is a purification procedure used to determine whether two different molecules (usually proteins) interact. It can identify interacting proteins or protein complexes present in cell extracts: by precipitating one protein believed to be in a complex, additional members of the complex are captured as well and can be identified.

An antibody specific to the protein of interest is added to a cell lysate. The antibody-protein complex is then pelleted, usually with protein G-Sepharose, which binds most antibodies. Any proteins or other molecules bound to the first protein are pelleted with it. Proteins in the pellet can be identified by western blot (if an antibody exists) or by sequencing a purified protein band.

Advantages of co-immunoprecipitation
- Proteins are in their native state
- Proteins are at their native concentration (unless overexpressed by transfection)

Disadvantages of co-immunoprecipitation
- Mixing of compartments during cell lysis, i.e. proteins that interact in the lysate might not share a compartment in the intact cell
- Detection of stable interactions only
- Does not indicate whether interaction is direct
- Antibodies required

Affinity Chromatography

Affinity chromatography partitions proteins or peptides on the basis of their specific ligand-binding affinity. The matrix of an affinity column contains ligands that are highly selective for particular proteins or classes of proteins. Beads containing antibodies, for example, can be used to isolate a single protein or peptide from a complex mixture.

Affinity chromatography methods typically involve a two-step elution procedure in which the first fraction emerging from the column comprises all the proteins or peptides that failed to interact with the affinity matrix and the second fraction comprises all the proteins or peptides that were retained on the column.

Steps
1. Prepare column and sample
2. Sample introduction
3. Adsorption of protein of interest
4. Removal of impurities
5. Elution of proteins of interest

The identity of the protein can be verified with western blot or SDS-PAGE.

Advantages:
- Can be used for protein purification
- Purified proteins can be quantified

Disadvantages:
- In vitro interaction
- Detects only interactions with a low dissociation constant (high affinity)
- Coupling to the matrix might affect protein conformation
- Interactions might be missed if a post-translational modification (PTM) is required
- Fusion proteins may interfere with protein function
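
The dissociation-constant limitation can be made concrete with the standard one-site binding relation, fraction bound = [L] / ([L] + Kd). The sketch below assumes the free binding partner is in large excess over the immobilized ligand; the function name, concentration, and Kd values are illustrative assumptions and do not come from the sources.

```python
def fraction_bound(free_conc_molar: float, kd_molar: float) -> float:
    """Equilibrium fraction of immobilized ligand that is occupied.

    Simple one-site model: fraction = [L] / ([L] + Kd).
    Assumes the free partner concentration is not depleted by binding.
    """
    if free_conc_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_conc_molar / (free_conc_molar + kd_molar)


if __name__ == "__main__":
    free_conc = 1e-7  # 100 nM of free partner (hypothetical value)
    for kd in (1e-9, 1e-7, 1e-5):  # tight, moderate, and weak binders
        occupied = fraction_bound(free_conc, kd)
        print(f"Kd = {kd:.0e} M -> fraction bound = {occupied:.3f}")
```

With these assumed numbers, a weak interaction leaves almost none of the ligand occupied, which is why such partners tend to be washed away before elution.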

FRET – Fluorescence Resonance Energy Transfer

FRET (Fluorescence Resonance Energy Transfer) is a technique for measuring interactions between two proteins in vivo. In this technique, two different fluorescent molecules (fluorophores) are genetically fused to the two proteins of interest. Regular (non-FRET) fluorescence occurs when a fluorophore absorbs electromagnetic energy at one wavelength (the excitation wavelength) and re-emits that energy at a different wavelength (the emission wavelength). In FRET, an excited donor fluorophore instead transfers its energy non-radiatively to a nearby acceptor fluorophore, so acceptor emission upon donor excitation indicates that the two fusion proteins are close together. Fluorophores used for FRET include GFP and its variants BFP, CFP, and YFP.

Conditions
- Overlap of the donor emission spectrum and the acceptor absorption spectrum
- Appropriate orientation of transition dipoles
- Distance between fluorophores from 10 to 100 Å
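
The distance condition reflects how steeply transfer efficiency falls with separation. A minimal sketch using the Förster relation E = 1 / (1 + (r/R0)^6), where R0 is the Förster radius (the separation at which efficiency is 50%). The default R0 of 50 Å is an assumed, illustrative value; real values depend on the fluorophore pair, and the function name is hypothetical.

```python
def fret_efficiency(distance_angstrom: float,
                    forster_radius_angstrom: float = 50.0) -> float:
    """Energy-transfer efficiency from the Forster relation.

    E = 1 / (1 + (r / R0)**6); the default R0 is an assumption.
    """
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    # Efficiency at a few separations, all in angstroms
    for r in (10, 25, 50, 75, 100):
        print(f"r = {r:>3} A -> E = {fret_efficiency(r):.4f}")
```

Because of the sixth-power dependence, efficiency drops from nearly complete to nearly zero over a few tens of ångströms, so a measurable signal implies that the two fusion proteins are very close.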

Advantages
- In vivo interaction
- Subcellular location
- Nondestructive method, done in living cells
- Imaging in individual cells
- Real-time

Disadvantages
- Difficult to set up
- Requires fusion proteins
- Limited to distances of up to about 100 Å
- Rapid photobleaching
- Low signal-to-noise ratio

Yeast Two-Hybrid

The two-hybrid system is a useful way to detect proteins that interact with a protein you are studying. In general, it is used primarily for initial identification of interacting proteins, not for detailed characterization of the interaction.

The system is based on the modular organization of transcription factors. A transcription factor is a protein that binds DNA at a specific promoter or enhancer site, where it regulates transcription. Transcription factors can be selectively activated or deactivated by other proteins, often as the final step in signal transduction.

A DNA-binding domain (DBD) is any protein motif that binds double- or single-stranded DNA, either with affinity for a specific sequence (or set of sequences) or with a general affinity for DNA. [5] Transcription factors bind DNA via their DNA-binding domains. The protein of interest is fused to a DBD; this fusion is termed the bait. Another protein is fused to an activation domain (AD); this fusion is commonly termed the prey. If the two proteins bind, the DBD and AD are brought together and the reporter gene is expressed.

Advantages:
- Very high numbers of coding sequences assayed in a relatively simple experiment;
- Wide variety of interactions detected and characterized following one single commonly used protocol;
- In vivo assay;
- No need for protein purification.

Limitations
- Spurious activation of reporter genes, e.g. by self-activators. This can be countered by using multiple reporter genes or by swapping the two domains between the two proteins
- Mutational events leading to an increase in the rate of transcription
- Fusion to irrelevant small peptides
- Indirect interactions (e.g. endogenous yeast proteins serve as a bridge)
- Subcellular location: proteins are brought into proximity in the nucleus, which may not reflect their real physiological state
- Proteins that don’t normally bind in vivo may appear to interact, because the environment differs between yeast and mammalian cells
- The cDNA for the interacting protein might not be represented in the library (or under-represented)
- No expression of the fusion protein
- Insufficient folding and/or stability of a fusion protein
- Absence of the required post-translational modifications
- Toxicity of fusion proteins

Large Scale Detection Methods

Bioinformatics Tools

Assessing reliability of PPIs

PPI reliability can be assessed by:
- IG measure - interaction generality
- EPR - expression profile reliability index
- PVM - paralogous verification method

IG

IG is based on the assumption that interactions observed in a complicated interaction network are more reliable. It takes information from a list of interacting proteins. It requires information on local topology.
- Calculating IG: number of interactions - number of proteins interacting with multiple proteins + 1
- Smaller IG value = higher number of interactions, more complex graph, common role and/or location

Advantages
- Doesn't rely on external information
- Can serve as a filter to improve the reliability of a PPI dataset
- Can improve functional prediction

Disadvantages
- Can eliminate true positives

IG2 incorporates principal component analysis (PCA) into the algorithm.
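
The IG calculation can be sketched on a toy network. This is one reading of the formula in these notes, not a reference implementation: "number of interactions" is taken as the number of distinct partners of the interacting pair, and "proteins interacting with multiple proteins" as those partners that have more than one interaction anywhere in the network. The function names and the example network are hypothetical.

```python
from collections import defaultdict


def build_adjacency(interactions):
    """Map each protein to the set of proteins it interacts with."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a != b:  # self-interactions are ignored in this sketch
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def interaction_generality(interactions, protein_a, protein_b):
    """IG of the pair (protein_a, protein_b) under the assumptions above."""
    adjacency = build_adjacency(interactions)
    # Distinct partners of either protein, excluding the pair itself
    partners = (adjacency[protein_a] | adjacency[protein_b]) - {protein_a, protein_b}
    # Partners that themselves interact with more than one protein
    multi = {p for p in partners if len(adjacency[p]) > 1}
    return len(partners) - len(multi) + 1


if __name__ == "__main__":
    # Toy network: A is a hub with two dead-end partners (C and D)
    network = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E"), ("E", "F")]
    print("IG(A, B) =", interaction_generality(network, "A", "B"))
    print("IG(E, F) =", interaction_generality(network, "E", "F"))
```

Under this reading, a pair surrounded by dead-end partners gets a high IG and is treated as less reliable, while a pair whose partners are embedded in the wider network gets a low IG.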

EPR - methods

EPR takes information from mRNA expression profiles.

Functionally related genes tend to be expressed in a concerted fashion. True interacting proteins are more likely to be encoded by genes with similar expression profiles than random pairs.

EPR index estimates the biologically relevant fraction of protein interactions detected in a high throughput screen. It does so by comparing RNA expression profiles for the proteins whose interactions are found in the screen with expression profiles for known interacting pairs of proteins.

1. Collect the mRNA expression levels of interacting pairs under several conditions.
2. Calculate a distance measure d2 between the mRNA expression levels of each interacting pair, then plot the distribution of d2 for the interaction dataset.
3. Compare this distribution to the distributions of standard interacting and non-interacting sets.
4. Calculate the percentage of true interactions in the network.

Non-interacting pairs have broad distributions and low peaks. Interacting pairs have sharp distributions and high peaks. Thus, the larger the distance between the expression profiles of two proteins, the lower the probability that they interact.
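
The steps above can be sketched in a few lines. These notes do not define d2, so the sketch assumes it is the squared Euclidean distance between two expression profiles. It also assumes that the final step is done by fitting the observed distribution as a mixture, observed ≈ alpha × interacting + (1 - alpha) × non-interacting, with alpha as the estimated fraction of true interactions. All function names and numbers are hypothetical.

```python
def squared_distance(profile_x, profile_y):
    """Assumed d2: squared Euclidean distance between two expression profiles."""
    if len(profile_x) != len(profile_y):
        raise ValueError("profiles must cover the same conditions")
    return sum((x - y) ** 2 for x, y in zip(profile_x, profile_y))


def histogram(values, bin_edges):
    """Fraction of values in each [edge_i, edge_i+1) bin.

    Values outside the bin range are not counted in any bin.
    """
    if not values:
        raise ValueError("no values to bin")
    counts = [0] * (len(bin_edges) - 1)
    for value in values:
        for i in range(len(counts)):
            if bin_edges[i] <= value < bin_edges[i + 1]:
                counts[i] += 1
                break
    return [count / len(values) for count in counts]


def estimate_true_fraction(observed, interacting, non_interacting):
    """Least-squares alpha for observed ~ alpha*interacting + (1-alpha)*non_interacting."""
    numerator = sum((o - n) * (i - n)
                    for o, i, n in zip(observed, interacting, non_interacting))
    denominator = sum((i - n) ** 2 for i, n in zip(interacting, non_interacting))
    if denominator == 0:
        raise ValueError("reference distributions are identical")
    # Clamp to [0, 1] so the result can be read as a fraction
    return min(1.0, max(0.0, numerator / denominator))


if __name__ == "__main__":
    # Toy expression profiles (one value per condition) for three protein pairs
    pairs = [([1.0, 2.0, 3.0], [1.1, 2.1, 2.9]),
             ([0.5, 0.4, 0.6], [2.0, 0.1, 1.5]),
             ([1.0, 1.0, 1.0], [1.2, 0.9, 1.1])]
    d2_values = [squared_distance(x, y) for x, y in pairs]
    observed = histogram(d2_values, [0.0, 1.0, 2.0, 4.0])
    # Toy reference distributions over the same bins (made-up numbers)
    interacting_ref = [0.7, 0.2, 0.1]
    non_interacting_ref = [0.1, 0.3, 0.6]
    alpha = estimate_true_fraction(observed, interacting_ref, non_interacting_ref)
    print("estimated fraction of true interactions:", round(alpha, 3))
```

The estimate describes the dataset as a whole; it says nothing about whether any single pair in it is a true interaction.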

- can assess the overall quality of an interaction database, but not that of individual interactions
- the similarity of the expression profile cannot be used alone to predict protein-protein interaction
- the profiles do allow an estimation of the percentage of biologically relevant interactions within a set

Disadvantages
- Many non-interacting proteins are also co-expressed (false positives)
- Many interacting proteins are not co-expressed (false negatives)
- mRNA expression profiles are required. They may not be available for most organisms.

PVM - Paralogous verification method

If two proteins are paralogs, then the proteins they interact with are also often paralogs. Paralogs arise from gene duplication events (para- as in parallel). To score an interaction between P1 and P2, collect all paralogs of P1 and of P2, then count the number of interactions between the two families, ignoring the interaction between P1 and P2 itself. The count is the PVM score.
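
The counting procedure can be sketched as follows. The sketch assumes each family consists of the protein itself plus its listed paralogs, and that interactions are undirected; the function name, the paralog table, and the example data are hypothetical.

```python
def pvm_score(interactions, paralogs, protein_1, protein_2):
    """Count observed interactions between the paralog families of two proteins.

    The interaction between protein_1 and protein_2 itself is ignored.
    Assumption: each family includes the protein plus its paralogs.
    """
    observed = {frozenset(pair) for pair in interactions}
    family_1 = {protein_1} | set(paralogs.get(protein_1, ()))
    family_2 = {protein_2} | set(paralogs.get(protein_2, ()))
    query = frozenset((protein_1, protein_2))

    counted = set()  # a set, so each undirected pair is counted once
    for a in family_1:
        for b in family_2:
            pair = frozenset((a, b))
            if a != b and pair != query and pair in observed:
                counted.add(pair)
    return len(counted)


if __name__ == "__main__":
    interactions = [("P1", "P2"), ("P1a", "P2a"), ("P1", "P2a"), ("P1b", "X")]
    paralogs = {"P1": ["P1a", "P1b"], "P2": ["P2a"]}
    print("PVM score for P1-P2:", pvm_score(interactions, paralogs, "P1", "P2"))
```

A pair with no paralogs, or whose paralogs have no recorded interactions, scores zero and so cannot be verified this way.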

- PVM judges an interaction likely if the putatively interacting pair has paralogs that also interact.
- PVM scores individual interactions.
- The magnitude of the PVM score is not important.
- PVM is very selective (low false-positive rate) but not sensitive.
- Can be used only in cases where the proteins have paralogs.

- can assess the quality of individual interactions
- can estimate the total number of biologically relevant interactions within a dataset (How?)
- can be used to indicate the quality of the dataset

Disadvantages
- Low sensitivity
- Not applicable to all organisms

Sources

[1] http://en.wikipedia.org/wiki/Immunoprecipitation
[2] http://www.biochem.northwestern.edu/holmgren/Glossary/Definitions/Def-C/...
[3] Lecture Notes of Dr. Lina Yip
[4] Principles of Proteomics by R. M. Twyman
[5] http://en.wikipedia.org/wiki/DNA-binding_domain