Swiss-Prot defines proteomics as "the qualitative and quantitative comparison of proteomes under different conditions to further unravel biological processes."
A genome is the collection of all the genes present in a living being. Genomics is the field concerned with the study of genomes. The proteome is the collection of all the proteins. Someone doing proteomics is doing research on the proteome.
Proteomics is a far more complicated branch of science than genomics. All cells of an organism that contain DNA carry essentially the same genome. However, the proteomes of different cells can differ greatly. For example, the proteomes of a neuron and a red blood cell in the same organism would have very little similarity. Proteins are the functional units of living beings, so two cells performing completely different tasks would have very different sets of functional units (proteomes).
The complexities and problems involved in proteomics are enormous in comparison with genomic research. The genome gives us the code for making proteins. Then why do we need to study proteins themselves? The answer is that the genome just gives us a list of parts. It does not tell us what the parts do, how they interact, when they are produced and in what quantity.
Unlike differential display-PCR, cDNA microarrays, and serial analysis of gene expression, proteomics techniques directly investigate the functional molecules. Protein abundance often does not correspond to mRNA abundance. Therefore protein quantitation is essential.
Post-translational modifications (PTMs) are very common in proteins. A PTM can modify both protein function and structure. PTMs cannot be predicted from the genetic code alone. Proteomics is capable of PTM identification and quantitation.
Before a protein can be examined, it must be extracted from tissues, cells, and organelles. It has to be separated and profiled in a manner which would preserve its functional and structural integrity.
Proteomics aims to:
Proteomics research often involves the following steps:
All living beings, including human beings, have different bodily fluids. Each fluid performs several functions or makes the execution of several functions possible. Generally, a fluid contains a mixture of compounds, including many proteins. Our bodies make sure that our various bodily fluids remain separate. For example, blood and cerebrospinal fluid (CSF) do not mix.
When studying a disease, researchers often study the proteins involved in the disease or disorder. In a diagnostic experiment, the choice of body fluid is crucial. For example, CSF is a better choice than blood plasma for many brain diseases since blood and neurons are separated by the blood-brain barrier.
There are hundreds of different fluids present in our body. We cannot cover all of them in this book. We will only be looking at the most common and most important ones.
Plasma is the liquid portion of the blood, in which the blood cells are suspended. Plasma makes up about 55% to 60% of blood volume. To collect blood plasma, blood is drawn into a syringe containing a suitable anticoagulant, which prevents clotting, and the blood cells (red blood cells (RBC), white blood cells (WBC), and platelets) are then removed, typically by centrifugation. Plasma contains fibrinogen and other clotting factors, lipids, salts, urea, antibodies, etc. It serves as a transport medium for nutrients, waste, and cells. In proteomics, plasma is fractionated before being examined since it contains too many substances and too many proteins.
To collect serum, blood is allowed to clot. Once the clot is removed, we are left with serum. Serum contains water, electrolytes, albumin, antibodies, etc. It does not contain red blood cells (RBC), white blood cells (WBC), platelets, or fibrinogen.
Cerebrospinal fluid (CSF) is a clear body fluid found in the subarachnoid space around the brain and spinal cord. The subarachnoid space lies between the skull and the cerebral cortex. CSF cushions and buffers the cortex. It is collected by lumbar puncture and is used in diagnosing cerebrospinal and neurological diseases such as Creutzfeldt-Jakob disease.
In animals, urine is produced by filtering blood through the kidneys. It is then collected in the bladder and excreted through the urethra. Urine contains excess compounds and undesirable substances that are not needed by the body or are harmful to it.
The major interest in the analysis of urine is to discriminate between glomerular and tubular diseases. In glomerular diseases, additional high molecular weight plasma proteins may be detected in the urine due to alteration of the glomerulus. In contrast, tubular diseases show only additional low molecular weight proteins.
Several other fluids are also routinely used in medicine, diagnostics and research, but we will not be discussing them in this book.
In proteomics, fractionation is a separation process in which a mixture of compounds is divided into smaller fractions according to a gradient. The gradient can be based on a specific property or a set of properties. Fractionation is widely employed to separate substances of interest from other substances. Fractionation techniques can be applied at different levels.
An ideal homogenate contains a suspension of intact and individualized subcellular compartments. To obtain individualized cells, we use chelating agents such as EDTA together with enzymatic and mechanical disaggregation. We then disrupt the plasma membrane with detergents or by mechanical methods such as ultrasonication.
While some claim cell fractionation and subcellular fractionation to be different, most researchers use the terms interchangeably. Cell fractionation refers to fractionating the different components of the cell. This is among the first steps in protein profiling.
Subcellular compartments can be separated based on their properties such as size, density, and charge. The first step is to separate the nuclei and the unbroken cells from cytoplasmic organelles by differential sedimentation at low centrifugal force in order to obtain the postnuclear supernatant (PNS). The supernatant can then be separated using differential centrifugation based on size, weight, density or even shape. Differential centrifugation is a time-dependent technique. Isopycnic centrifugation separates by density by passing the organelles through a sucrose gradient. Free-flow electrophoresis (FFE) separates by charge.[3] Immunoisolation techniques can be used to separate organelles using their biological properties. Several commercial and non-commercial devices and techniques exist for this purpose.
The goal of sample fractionation is to improve the detection of low abundance proteins. The dynamic range of 2DE is limited to 10⁴, while the protein expression range is from 10⁷ to 10¹². Therefore, only the most abundant proteins are detected.
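The size of this gap is easy to work out. The short sketch below uses only the figures quoted above; the function and constant names are our own and purely illustrative.

```python
import math

# Figures quoted in the text above.
GEL_DYNAMIC_RANGE = 1e4        # dynamic range of 2DE
EXPRESSION_RANGE_LOW = 1e7     # lower estimate of the protein expression range
EXPRESSION_RANGE_HIGH = 1e12   # upper estimate of the protein expression range


def uncovered_orders(expression_range: float, detection_range: float) -> float:
    """Orders of magnitude of abundance that fall outside the detection range."""
    gap = math.log10(expression_range) - math.log10(detection_range)
    return max(0.0, gap)


low_gap = uncovered_orders(EXPRESSION_RANGE_LOW, GEL_DYNAMIC_RANGE)
high_gap = uncovered_orders(EXPRESSION_RANGE_HIGH, GEL_DYNAMIC_RANGE)
print(f"Between {low_gap:.0f} and {high_gap:.0f} orders of magnitude are not covered.")
```

In other words, a single gel misses somewhere between three and eight orders of magnitude of protein abundance, which is why the sample has to be subdivided first.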
Increasing the amount of sample is not always a good solution. It is usually better to further subdivide the mixture. Fractionation methods are based on the physico-chemical properties of proteins. The following table lists some of these properties and the techniques that exploit them.
| Property | Fractionation Method |
|---|---|
| Size/shape | size exclusion chromatography |
| Surface charge | ion-exchange chromatography |
| Isoelectric point | electrophoretic methods |
| Surface hydrophobicity | reverse phase chromatography |
| Binding specificity | affinity chromatography |
| Solubility | solvent extraction |
22 proteins comprise 99% of the protein mass in serum. Affinity chromatography can be used to remove albumin, immunoglobulins, etc. Antibodies and proteins A, G, and L are often used to remove these proteins.
Protein precipitation can be induced by using organic solvents such as acetone, salts such as ammonium sulfate, or by changing the pH. Protein precipitation is used for the removal of large, abundant proteins. This method lacks specificity.
Ultrafiltration is a pressure-driven separation process based on a semipermeable membrane. It achieves separation on the basis of size: the membrane retains larger molecules. This method also lacks specificity.
The analysis of proteins, whether on a small scale or large scale, requires methods for the separation of protein mixtures into their individual components. Protein separation methods can be placed on a sliding scale from fully selective to fully nonselective. Selective methods aim to isolate individual proteins from a mixture usually by exploiting very specific properties such as their binding specificity or biochemical function.
Whether protein separation is selective, partially selective or nonselective, it is important to remember that the underlying principle is always the exploitation of physical and chemical differences between proteins which cause them to behave differently in particular environments. These physical and chemical differences are determined by the number, type and order of amino acids in the protein, and by any PTMs that have taken place.
The two predominant methods for separating proteins are 2D gel electrophoresis and chromatography techniques.
A major requirement for separation techniques is high resolution. The separation technique should produce fractions that comprise very simple mixtures of proteins, and ideally each fraction should contain an individual protein. This essentially rules out 1D techniques, i.e. those that exploit a single chemical or physical property as the basis for separation. The other major requirement in proteomics is high throughput. The separation technique should resolve all the proteins in one experiment and should ideally be easy to automate. The final requirement is that the fractionation procedure should be compatible with downstream analysis by mass spectrometry. The two groups of techniques which dominate proteomics are 2DGE and multidimensional liquid chromatography.
Any charged molecule in solution will migrate in an applied electric field, a phenomenon known as electrophoresis. The rate of migration depends on the strength of the electric field and the charge density of the molecule, i.e. the ratio of charge to mass. Polyacrylamide gels are favored because they facilitate separation by sieving the proteins on the basis of their size.
All high resolution protein fractionation methods employ multidimensional separation processes that exploit different properties of proteins for separation in each dimension. Although PAGE separates proteins according to both charge and mass, exploiting both these principles in the same dimension still results in a low resolution separation.
The first dimension is usually isoelectric focusing (IEF), in which proteins are separated on the basis of their net charge irrespective of their mass. The underlying principle is that electrophoresis is carried out in a pH gradient, allowing each protein to migrate to the position where the surrounding pH equals its isoelectric point (pI). At that point the protein carries no net charge and stops moving.
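To make the idea of a pI concrete, here is a minimal sketch that estimates the net charge of a protein at a given pH and then searches for the pH at which that charge is zero. It assumes that each ionizable group behaves independently (Henderson-Hasselbalch) and uses approximate, textbook-style pKa values; real proteins deviate from this, and both the pKa table and the function names are illustrative assumptions rather than a reference method.

```python
# Approximate pKa values (assumed, textbook-style figures).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0, "N_TERM": 9.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1, "C_TERM": 2.0}


def net_charge(sequence: str, ph: float) -> float:
    """Estimated net charge of a one-letter amino acid sequence at a given pH."""
    sequence = sequence.upper()
    counts = {aa: sequence.count(aa) for aa in "KRHDECY"}
    counts["N_TERM"] = 1
    counts["C_TERM"] = 1
    # Fraction of each basic group that is protonated (carries +1).
    positive = sum(counts[g] / (1 + 10 ** (ph - pka)) for g, pka in PKA_POSITIVE.items())
    # Fraction of each acidic group that is deprotonated (carries -1).
    negative = sum(counts[g] / (1 + 10 ** (pka - ph)) for g, pka in PKA_NEGATIVE.items())
    return positive - negative


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """Find the pH of zero net charge by bisection between pH 0 and pH 14."""
    low, high = 0.0, 14.0
    # Net charge falls steadily as pH rises, so bisection converges.
    while high - low > tolerance:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


basic_peptide = "KKKK"
acidic_peptide = "DDDD"
print(isoelectric_point(basic_peptide), isoelectric_point(acidic_peptide))
```

In an IEF gel, the lysine-rich peptide would focus toward the basic end of the pH gradient and the aspartate-rich peptide toward the acidic end, regardless of their masses.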
IPG gels are the most commonly used since they don’t suffer from cathodic drift and poor reproducibility.
The second dimension, usually carried out by SDS-PAGE, separates the proteins according to their molecular mass irrespective of charge. SDS binds to the proteins and gives them a large negative charge that dwarfs their intrinsic charge.
Capillary electrophoresis (CE) is carried out in glass tubes that are typically about 50 micrometers in diameter and up to 1 meter in length. The tubes may or may not be filled with gel, but the presence of gel facilitates sieving of the proteins or peptides and enhances size-dependent separation. The thin tubes are efficient at dissipating heat, allowing the use of very strong electric fields. The separations are thus rapid, efficient and can be monitored in real time rather than at the experiment’s end point. The major application of capillary electrophoresis has been the separation of peptides in relatively simple mixtures such as tryptic digests of purified proteins or spots excised from 2D gels.
More recently, CE has been coupled with HPLC and ESI-MS/MS for high-throughput proteomics.
Major limitations of 2DGE are:
- Resolution
- Sensitivity
- Representation
- Automation
There are several ways to increase the resolution of 2DGE:
- Use a larger gel
- Use zoom gels – 2nd dimension on 2 gels
- Prefractionate the sample
There are many different techniques for staining gels, but there is no single ideal method. The choice depends on your needs and often leads to a combination of techniques. While several reactive dyes can be used to stain proteins prior to electrophoresis, post-electrophoretic staining of the separated proteins remains the preferred method.
Proteins that are inherently colored, such as hemoglobin and myoglobin, are easily observed directly in polyacrylamide gels upon exposure to light in the visible spectrum. Unfortunately, this is not the case for the vast majority of proteins, whose visualization requires the use of dyes or stains. Many of the organic dyes and stains that have been adapted for the detection of proteins in polyacrylamide gels are derived from dyes originally used in the textile industry. Currently, the most commonly used stains include amido black, Coomassie blue and silver.
Coomassie blue
CBB-R (reddish blue) and CBB-G (greenish blue), the two forms of Coomassie Brilliant Blue (CBB), are the most sensitive and convenient to use. CBB requires an acidic environment to enhance ionic interactions between the dye and the basic amino acid moieties of the protein. The proteins and CBB also interact via hydrogen bonds and van der Waals forces.
The duration of staining depends on gel thickness and polyacrylamide composition: the thicker the gel, the longer the staining time. Gels can be destained either by passive diffusion or electrophoretically.
CBB is an aromatic compound. It is not well suited to SDS gels that are to be stored for a long period of time.
Amido Black
Amido black is not as sensitive as CBB but it enjoys selected applications because of its rapid staining and destaining properties.
Silver Staining
For most applications, visualization of proteins with CBB is sufficiently sensitive. However, if one is interested in determining the absolute purity of a protein or wishes to determine trace amounts of proteins then more sensitive protein-visualization techniques must be utilized. [5]
Although silver staining is the most sensitive of all non-radioactive protein-visualization methods currently available, it does have a number of drawbacks. Probably one of the most serious problems is that certain proteins stain either very poorly or not at all with silver, appearing as negatively stained spots against a darker background. The shortcoming is further emphasized by the lack of staining of certain calcium-binding proteins, such as calmodulin. [5]
Fluorescent protein labeling
Fluorescent labeling methods are extremely sensitive but are not commonly used, largely because of their difficulty of use, need for additional equipment, and cost. Although proteins can be labeled after electrophoresis, they are usually labeled before electrophoresis. The fluorescent molecules bind covalently to proteins. Commonly used fluorophores are dansyl and ANS.
Reverse staining
The idea behind reverse staining is to stain the gel and not the proteins. Reverse staining methods are rapid, display intermediate sensitivity and do not require prior fixation of proteins within the gel matrix.
Labeling proteins with radioactive isotopes
Radiolabeling methods
Labeling proteins with radioactive isotopes before or after electrophoresis is the most sensitive labeling method. The commonly used isotopes are ¹⁴C, ³⁵S, ³²P, ³H, and ¹²⁵I. Cells or tissues are exposed to radioactive isotopes, which they metabolize. Alternatively, proteins are labeled post-synthetically by any of a variety of chemical methods, including oxidative iodination.
Radiolabeled protein bands or spots can be detected by liquid scintillation counting or autoradiography.
Immunoblotting
Immunoblotting uses antibodies to detect specific proteins of interest.
[1] O’Farrell, P. Z., J. Biol. Chem. 1975, 250, 4007-4021.
[2] Rabilloud, T., Valette, C., Lawrence, J., Electrophoresis 1994, 15, 1552-1558.
[3] Sanchez, ARBF 1998, Sample preparation and solubilization: crucial steps preceding the two-dimensional gel electrophoresis process
[4] R.M. Twyman, Principles of Proteomics
[5] Peter J. Wirth, Alfredo Romano, 1995, Staining methods in gel electrophoresis, including use of multiple detection methods
[6] Gurd, F. R., Methods Enzymol. 1967, 11, 532-541.
[7] Griffith, O. W., Anal. Biochem. 1980, 106, 207-212.
[8] Brune, D. R., Anal. Biochem. 1992, 207, 285-290.
Any separation technique that distributes the components of a mixture between two phases, a fixed stationary phase and a free-moving mobile phase, is known as chromatography. There are many different chromatography formats but all depend on the same underlying principle. A mixture of molecules is dissolved in a solvent and fed into the chromatography process. As the mobile phase moves over the stationary phase, the components of the mixture can interact with the molecules of both the solvent and the stationary matrix. Different components in the mixture move at different rates because of their differing affinities for each phase. Molecules with the lowest affinity for the stationary phase will move the most quickly because they tend to remain in the solvent, while molecules with the highest affinity move the most slowly because they tend to stay associated with stationary phase and are left behind. This results in the mixture being partitioned into a series of fractions, which can be eluted and collected individually.
LC is used more often than other chromatography formats because of its versatility and compatibility with MS. Unlike 2DGE, LC is suitable for the separation of both proteins and peptides, and can therefore be applied either upstream of 2DGE to prefractionate the sample, downstream to separate the peptide mixtures from single excised spots, or instead of 2DGE itself.
In LC methods, the stationary phase is a porous matrix while the mobile phase is a solvent containing dissolved proteins or peptides. The rate at which each component moves through the column depends on its affinity for the stationary phase.
Based on the ability of proteins to bind to certain molecules
Affinity chromatography partitions proteins or peptides on the basis of their specific ligand-binding affinity. The matrix of an affinity column contains ligands that are highly selective for particular proteins or classes of proteins. Elution is therefore a two-step process: material that does not bind is washed through first, and the retained proteins are then released. [4]
Based on the charge differences (due to surface residues)
Ion-exchange chromatography separates proteins or peptides according to their charge. It is based on the reversible adsorption of solute molecules to a solid phase that contains charged chemical groups. Cationic (+ve) or anionic (-ve) resins may be used, and these attract molecules of opposite charge in the solvent. Instead of a two-step elution procedure, gradient elution is achieved by washing the column with buffers of gradually increasing ionic strength or pH. [4]
Based on differences in hydrophobicity (due to nonpolar surface residues)
Reversed-phase (RP) chromatography involves the reversible adsorption of proteins or peptides to the stationary phase matrix, and multiple fractions are produced by gradient elution. The proteins and peptides are separated according to their hydrophobicity, and the reversed-phase resin consists of hydrophobic ligands such as C4 to C18 alkyl groups. Gradient elution is achieved by gradually increasing the amount of an organic modifier in the elution buffer, which disrupts the weakest hydrophobic interactions first. Among the different chromatography techniques, RP-HPLC is the most powerful method and has the highest resolution. [4]
Based on size differences
The flexibility of LC methods in terms of combining different separating principles makes multidimensional chromatography an attractive technology. Not all methods are compatible so care must be taken before coupling two methods.
Proteins control and mediate many of the biological activities of cells. Specific protein-protein interactions are involved in almost all physiological processes.
Interactions can be observed at the atomic level, for example by examining X-ray crystal structures. The crystal structures yield specific information on the atoms and residues involved in the interaction.
We can infer complex formation and/or direct interaction using classical techniques such as (co-)immunoprecipitation/pull-down, affinity chromatography, and immunoassays such as ELISA and RIA. FRET (Fluorescence Resonance Energy Transfer) and SPR (Surface Plasmon Resonance) are techniques that help identify protein-protein interactions. Yeast two-hybrid is a commonly used technique for identifying protein-protein interactions.
At the cellular level, we can use "activity bioassays" to observe an interaction, e.g. cell proliferation assay to observe a ligand-receptor interaction.
Immunoprecipitation (IP) is the technique of precipitating an antigen out of solution using an antibody specific to that antigen. This process can be used to enrich a given protein to some degree of purity. Co-immunoprecipitation (also known as a 'pull-down') can identify interacting proteins or protein complexes present in cell extracts: by precipitating one protein believed to be in a complex, additional members of the complex are captured as well and can be identified. Co-immunoprecipitation is thus a purification procedure for determining whether two different molecules (usually proteins) interact. An antibody specific to the protein of interest is added to a cell lysate. The antibody-protein complex is then pelleted, usually using protein G sepharose, which binds most antibodies. If any proteins or other molecules bind to the first protein, they will also be pelleted. Proteins in the pellet can be identified by western blot (if an antibody exists) or by sequencing a purified protein band.

Advantages of co-immunoprecipitation
- Proteins in their native state
- At their native concentration (unless transfection)
Disadvantages of co-immunoprecipitation
- Mixing of compartments during cell lysis, i.e. interacting proteins might not be in the same cellular compartment
- Detection of stable interactions only
- Does not indicate whether interaction is direct
- Antibodies required
Affinity chromatography partitions proteins or peptides on the basis of their specific, ligand-binding affinity. The matrix of an affinity column contains ligands that are highly selective for particular proteins or classes of proteins. Beads containing antibodies, for example, can be used to isolate a single protein or peptide from a complex mixture.
Affinity chromatography methods typically involve a two-step elution procedure in which the first fraction emerging from the column comprises all the proteins or peptides that failed to interact with the affinity matrix and the second fraction comprises all the proteins or peptides that were retained on the column.
Steps
1. Prepare column and sample
2. Sample introduction
3. Adsorption of the protein of interest
4. Removal of impurities
5. Elution of proteins of interest
The identity of the protein can be verified by western blot or SDS-PAGE.
Advantages:
- Can use for protein purification
- Purified proteins can be quantitated
Disadvantages:
- In vitro interaction;
- Detection of interactions characterized by low dissociation constant;
- Coupling to the matrix might affect protein conformation;
- Interactions might be missed if PTM required.
- Fusion proteins may interfere with protein function.
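FRET (Fluorescence Resonance Energy Transfer) is a technique for measuring interactions between two proteins in vivo. In this technique, two different fluorescent molecules (fluorophores) are genetically fused to the two proteins of interest. Regular (non-FRET) fluorescence occurs when a fluorophore absorbs electromagnetic energy at one wavelength (the excitation wavelength) and re-emits that energy at a different wavelength (the emission wavelength). In FRET, the excited donor fluorophore instead transfers its energy non-radiatively to a nearby acceptor fluorophore, which then emits at its own emission wavelength. Because this transfer occurs only over very short distances, acceptor emission indicates that the two fusion proteins are in close proximity. Fluorophores used for FRET include GFP and its derivatives BFP, CFP, and YFP.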
Conditions
- Overlap of emission/absorption spectra
- Appropriate orientation of transition dipoles
- Distance between fluorophores from 10 to 100 Å
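The distance condition can be illustrated with the standard Förster relation, in which the transfer efficiency falls off with the sixth power of the donor-acceptor distance. The sketch below is only an illustration: the Förster radius of 50 Å is an assumed value (it depends on the fluorophore pair, spectral overlap and dipole orientation), and the function name is our own.

```python
def fret_efficiency(distance_angstrom: float, forster_radius_angstrom: float = 50.0) -> float:
    """Förster transfer efficiency E = 1 / (1 + (r / R0)^6).

    R0 (the Förster radius) is the distance at which E = 0.5.
    The default of 50 Å is an assumed, illustrative value.
    """
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and the Förster radius must be > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


for r in (10, 25, 50, 75, 100):
    print(f"r = {r:3d} Å  ->  E = {fret_efficiency(r):.3f}")
```

With these assumptions the efficiency is close to 1 at 10 Å, exactly 0.5 at the Förster radius, and under 2% at 100 Å, which is why FRET is practical only over roughly this distance range.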
Advantages
- In vivo interaction
- Subcellular location
- Nondestructive method, done in living cells
- Imaging in individual cells
- Real-time
Disadvantages
- Difficult to set up
- Fusion proteins
- Limited to distances of up to about 100 Å
- Rapid photobleaching
- Low signal to noise ratios
The two-hybrid system is a useful way to detect proteins that interact with a protein you are studying. In general, it is used primarily for initial identification of interacting proteins, not for detailed characterization of the interaction.
The system is based on modular organization of transcription factors. A transcription factor is a protein that binds DNA at a specific promoter or enhancer region or site, where it regulates transcription. Transcription factors can be selectively activated or deactivated by other proteins, often as the final step in signal transduction.
A DNA-binding domain (DBD) is any protein motif that binds to double- or single-stranded DNA, either with affinity for a specific sequence or set of sequences or with a general affinity for DNA. [5] Transcription factors bind to DNA via their DNA-binding domains. In the two-hybrid system, the protein of interest is fused to a DBD; this fusion is termed the bait. Another protein is fused to an activation domain (AD); this fusion is commonly termed the prey. If the two proteins bind, the DBD and AD are brought together and the reporter gene is expressed.

Advantages:
- Very high numbers of coding sequences assayed in a relatively simple experiment;
- Wide variety of interactions detected and characterized following one single commonly used protocol;
- In vivo assay;
- No need for protein purification.
Limitations
- Spurious activation of reporter genes, e.g. by self-activators; this can be countered by using multiple reporter genes or by swapping the two domains between the two proteins
- Mutational events leading to an increase in the rate of transcription
- Fusion to irrelevant small peptides
- Indirect interactions (e.g. endogenous yeast proteins serve as a bridge)
- Subcellular location: proteins are brought to proximity in the nucleus. This may not be the real physiological state
- Proteins that don’t normally bind in vivo: different environment in yeast and mammalian cells
- The cDNA for the interacting protein might not be represented in the library (or under-represented)
- No expression of the fusion protein
- Insufficient folding and/or stability of a fusion protein
- Absence of the required post-translational modifications
- Toxicity of fusion proteins
The reliability of protein-protein interaction (PPI) data can be assessed by:
- IG measure - interaction generality
- EPR - expression profile reliability index
- PVM - paralogous verification method
IG is based on the assumption that interactions observed in a complicated interaction network are more reliable. It takes information from a list of interacting proteins. It requires information on local topology.
- Calculating IG: number of interactions - number of proteins interacting with multiple proteins + 1
- A smaller IG value indicates a higher number of interactions, a more complex graph, and a common role and/or location
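The calculation above can be sketched in a few lines. The sketch rests on one reading of the formula, which is an assumption on our part: for an interacting pair, count the other proteins that interact with either member of the pair, subtract those that themselves interact with more than one protein, and add 1. The function names and the toy network are hypothetical.

```python
from collections import defaultdict


def build_neighbors(interactions):
    """Map each protein to the set of proteins it interacts with."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def interaction_generality(pair, interactions):
    """IG of one interaction, under the reading of the formula stated above."""
    a, b = pair
    neighbors = build_neighbors(interactions)
    # Proteins that interact with either member of the pair.
    partners = (neighbors[a] | neighbors[b]) - {a, b}
    # Partners that are themselves connected to more than one protein.
    multi_partners = {p for p in partners if len(neighbors[p]) > 1}
    return len(partners) - len(multi_partners) + 1


# Toy network (hypothetical proteins).
toy_network = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "E"), ("E", "F")]
print(interaction_generality(("A", "B"), toy_network))  # partners C, D, E; E is multi-connected
print(interaction_generality(("E", "F"), toy_network))  # only partner B, which is multi-connected
```

Under this reading, a pair surrounded by "dead-end" partners gets a high IG, while a pair embedded in a well-connected neighbourhood gets a low IG and is treated as more reliable.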
Advantages
- doesn't rely on external information
- can serve as filter to improve reliability of PPI dataset
- can improve functional prediction
Disadvantages
- can eliminate true positives
IG2 incorporates principal component analysis (PCA) into the algorithm.
EPR takes information from mRNA expression profiles.
Functionally related genes tend to be expressed in a concerted fashion. True interacting proteins are more likely to be encoded by genes with similar expression profiles than random pairs.
EPR index estimates the biologically relevant fraction of protein interactions detected in a high throughput screen. It does so by comparing RNA expression profiles for the proteins whose interactions are found in the screen with expression profiles for known interacting pairs of proteins.
1. We collect the mRNA expression levels of interacting pairs under several conditions.
2. Create the distribution of distances for the network. Calculate a distance measure d² between the mRNA expression levels of each interacting pair. Plot the distribution of d² for the dataset of interactions.
3. Compare this distribution to distributions of standard interacting and non-interacting sets
4. Calculate the percentage of true interactions in the network
Non-interacting pairs have broad distributions with low peaks. Interacting pairs have sharp distributions with high peaks. Thus, the larger the distance between the expression profiles of two proteins, the lower the probability that they interact.
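The four steps can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: d² is taken to be the squared Euclidean distance between two expression profiles, and the percentage of true interactions is estimated by fitting the observed distribution as a mixture of the reference interacting and non-interacting distributions by least squares. All function names and the example numbers are hypothetical.

```python
def squared_distance(profile_a, profile_b):
    """d^2 between two expression profiles measured under the same conditions."""
    return sum((x - y) ** 2 for x, y in zip(profile_a, profile_b))


def normalized_histogram(values, bin_edges):
    """Fraction of values falling into each bin; the last bin includes its upper edge."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def estimate_true_fraction(observed, interacting, non_interacting):
    """Least-squares alpha for: observed = alpha * interacting + (1 - alpha) * non_interacting."""
    diff = [i - n for i, n in zip(interacting, non_interacting)]
    denominator = sum(d * d for d in diff)
    if denominator == 0:
        raise ValueError("reference distributions are identical")
    numerator = sum((o - n) * d for o, n, d in zip(observed, non_interacting, diff))
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical reference distributions of d^2 over three bins.
reference_interacting = [0.7, 0.2, 0.1]      # sharp, peaked at small distances
reference_non_interacting = [0.1, 0.3, 0.6]  # broad, shifted to large distances
# A screen in which 40% of the reported pairs are true interactions.
observed = [0.4 * i + 0.6 * n for i, n in zip(reference_interacting, reference_non_interacting)]
alpha = estimate_true_fraction(observed, reference_interacting, reference_non_interacting)
print(f"Estimated fraction of true interactions: {alpha:.2f}")
```

Note that the estimate describes the dataset as a whole; it says nothing about whether any particular pair in the screen is a true interaction.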
- can assess the overall quality of an interaction database, but not that of individual interactions
- the similarity of the expression profile cannot be used alone to predict protein-protein interaction
- the profiles do allow an estimation of the percentage of biologically relevant interactions within a set
Disadvantages
- Many non-interacting proteins are also co-expressed (false positives)
- Many interacting proteins are not co-expressed (false negatives)
- mRNA expression profiles are required. They may not be available for most organisms.
If two proteins are paralogs, then the proteins they interact with are also often paralogs. Paralogs arise from gene duplication events (the prefix "para" as in "parallel"). If P1 and P2 interact, collect all paralogs of P1 and all paralogs of P2. Count the number of interactions between the two families, ignoring the interaction between P1 and P2 itself. The count is the PVM score.
• PVM judges an interaction likely if the putatively interacting pair has paralogs that also interact.
• PVM scores individual interactions.
• Magnitude of PVM score is not important
• PVM is very selective (low FP) but not sensitive
• Can be used only in cases where the proteins have paralogs
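The counting procedure described above can be sketched as follows. The data structures are assumptions made for illustration: paralogs are given as a dictionary from a protein to its known paralogs, and interactions as a list of unordered pairs. The protein names are hypothetical.

```python
def pvm_score(p1, p2, paralogs, interactions):
    """Count interactions between the families of p1 and p2, ignoring p1-p2 itself."""
    family1 = {p1} | set(paralogs.get(p1, ()))
    family2 = {p2} | set(paralogs.get(p2, ()))
    observed = {frozenset(pair) for pair in interactions}
    original = frozenset((p1, p2))
    counted = set()
    for a in family1:
        for b in family2:
            pair = frozenset((a, b))
            # Skip self-pairs, the interaction being tested, and pairs already counted.
            if a == b or pair == original or pair in counted:
                continue
            if pair in observed:
                counted.add(pair)
    return len(counted)


# Hypothetical example data.
example_paralogs = {"P1": ["P1a", "P1b"], "P2": ["P2a"]}
example_interactions = [
    ("P1", "P2"),    # the interaction being tested (ignored in the count)
    ("P1a", "P2a"),  # between the two families: counted
    ("P1b", "P2"),   # between the two families: counted
    ("P1", "P1a"),   # within one family: not counted
]
print(pvm_score("P1", "P2", example_paralogs, example_interactions))
```

Because the magnitude of the score is not important, a simple rule such as "score greater than zero" is enough to flag an interaction as supported by its paralogs.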
- can assess the quality of individual interactions
- can estimate the total number of biologically relevant interactions within a dataset (How?)
- can be used to indicate the quality of the dataset
Disadvantages
- Low sensitivity
- Not applicable to all organisms
[1] http://en.wikipedia.org/wiki/Immunoprecipitation
[2] http://www.biochem.northwestern.edu/holmgren/Glossary/Definitions/Def-C/...
[3] Lecture Notes of Dr. Lina Yip
[4] Principles of Proteomics by R. M. Twyman
[5] http://en.wikipedia.org/wiki/DNA-binding_domain
What is the purpose of biological expression systems?
To produce proteins.
Why express heterologous proteins?
Proteins are difficult to purify and obtain in large quantities. Expressing proteins fulfills both needs. It is also less expensive once an expression system has been set up.
What are expression systems?
Expression systems are vectors and host cells which are used to produce a protein. An expression system may be cell-based or cell free.
What are the necessary steps to produce a protein?
1. deciding which polynucleotide sequence to introduce
2. introducing mRNA or DNA into a cell-free extract (cell-free systems), or
3. introducing DNA into a cell (cell-based systems)
4. extracting and purifying the desired proteins
How can we produce a novel protein?
It is possible to mutate the tRNA genes of any organism, i.e. to modify the genes which code for tRNA. The goal of this exercise is to stimulate or suppress an activity. This technique allows the addition of extra amino acids into the system and may result in the production of novel proteins. Protein splicing can take place after translation.
What are the advantages and disadvantages of cell-free systems?
Advantages
• bypassing cellular metabolism means that label incorporation can be more efficient
• highly toxic gene products can be made in this way
Disadvantages
• Cellular chaperones are not present, although they can be added to the system
• PTMs that require cellular structures cannot be made
What are the advantages and disadvantages of cell-based systems?
Advantages
• can be adapted for producing very large quantities of protein at relatively low cost
• cell can be used to make isotopically labeled proteins
• can incorporate PTM
• can use cell-based expression to look at the function of a protein in the context of a living cell
• expression can be made to last as long as the cell culture lives.
Disadvantages
• Time consuming: cloning of target DNA, introduction of vector, growth of cell culture, and induction
• Proteins need to be extracted and purified from the cells
What is transformation?
Genetic alteration of a cell after uptake of foreign DNA.
What is transfection?
Transformation of animal cells.
What is transduction?
Transduction is the process by which bacterial DNA is moved from one bacterium to another by a virus.
How do you choose an expression system?
The choice depends largely on your needs. You cannot, for example, readily produce full-length glycosylated antibodies in bacteria.
What is a vector?
Traditionally in medicine, a vector is an organism that does not cause disease itself but which spreads infection by conveying pathogens from one host to another. Mosquitoes do not cause malaria but spread the pathogen. A virus itself may serve as a vector, if it has been re-engineered and is used to deliver a gene to its target cell. A "vector" in this sense is a vehicle for delivering genetic material such as DNA to a cell by a process called transduction. In genetic engineering vectors are usually plasmids or phages.
What is a plasmid?
Plasmids are autonomously replicating extrachromosomal DNA molecules. Plasmids often contain genes that confer a selective advantage on the bacterium harboring them, e.g. antibiotic resistance. Every plasmid contains at least one DNA sequence that serves as a starting point for DNA replication, enabling the plasmid DNA to replicate independently of the chromosome.
Plasmids used in genetic engineering are called vectors. They are used to transfer genes from one organism to another and typically contain a genetic marker conferring a phenotype that can be selected for or against. Most also contain a polylinker or MCS (multiple cloning site), which is a short region containing several commonly used restriction sites allowing easy insertion of DNA fragments at this location.
What are lentiviral vectors?
Lentiviruses are a type of retrovirus that can infect both dividing and nondividing cells because their preintegration complex (virus “shell”) can get through the intact membrane of the nucleus of the target cell. HIV is a very effective basis for lentiviral vectors because it has evolved to infect and express its genes in human helper T cells and macrophages. The only cells lentiviruses cannot gain access to are quiescent cells (in the G0 state) because this state blocks the reverse transcription step. Reverse transcription is the process of converting RNA to DNA.
What is a FLAG-tag?
FLAG-tag is a polypeptide tag that is added to a recombinantly expressed protein. It can be used in affinity chromatography to separate the recombinant, overexpressed protein from wild-type protein expressed by the host organism. It can also be used to isolate protein complexes with multiple subunits.
What are inducible and constitutive promoters?
A promoter is a DNA sequence that enables a gene to be transcribed. The promoter is recognized by RNA polymerase, which then initiates transcription. Inducible promoters can be artificially activated or inhibited. Constitutive promoters are always active; their expression is not controlled by endogenous factors. Constitutive promoters are used for repressor proteins.
Describe lentiviral infection
1. The virus attaches to CD4 and a co-receptor on the T helper cell membrane.
2. It releases the viral core into the cytoplasm.
3. The virus loses its envelope, leaving a nucleoprotein complex.
4. The viral RNA is reverse transcribed, the complex enters the nucleus, and the viral DNA inserts itself into the genome.
How can we insert a vector into a cell?
- Precipitation at the cell surface
- electroporation
- lipid vesicles
- using projectiles
- using viral vectors
How do we detect or purify a protein?
- Separate the lysate using electrophoresis
- Western blot
Amino acids
RW - H donors
NQST - H donors and acceptors
KDEYH - donor or acceptor based on their ionization state
Which amino acids are involved in electrostatic interactions? In hydrophobic interactions?
Electrostatic: polar - SCN WYTH
Hydrophobic : non polar - FW MAIL PGV
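The hydrogen-bond groupings listed above can be turned into a small lookup, which is handy for annotating a sequence residue by residue. This is only a sketch: the three groups are copied from these notes, while the function names `h_bond_class` and `annotate` are made up for illustration.

```python
# Hydrogen-bonding groups exactly as listed in the notes above
# (one-letter amino acid codes).
H_BOND_CLASSES = {
    "donor": set("RW"),
    "donor and acceptor": set("NQST"),
    "depends on ionization state": set("KDEYH"),
}


def h_bond_class(residue):
    """Return the hydrogen-bonding class of a one-letter residue code."""
    residue = residue.upper()
    for label, members in H_BOND_CLASSES.items():
        if residue in members:
            return label
    # Residues the notes do not assign to any group.
    return "not listed"


def annotate(sequence):
    """Pair every residue of a sequence with its hydrogen-bonding class."""
    return [(aa, h_bond_class(aa)) for aa in sequence]


if __name__ == "__main__":
    for aa, label in annotate("RNKG"):
        print(aa, label)
```

Residues that the notes do not assign to a group (for example G) are reported as "not listed" rather than guessed.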
What is covalent capture?
We place a Cys or Thr at the end of a polypeptide chain in solid phase with aldehydes. This helps purification as the lateral chain would not grow.
Ribosome Display
http://www.bioc.uzh.ch/plueckthun/slide_shows/Slides/ribo/ppframe.htm
How to cut & paste transgenes?
Restriction enzymes
How are codons read?
The adaptor in protein synthesis is tRNA. tRNA molecules contain an amino acid attachment site and a template recognition site (the anticodon). Ribosomes assemble amino acid chains according to instructions that originate on the DNA template and are copied onto the mRNA transcript.
What are the start and stop codons?
start: AUG
stop: UAA, UAG, UGA
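To make the role of the start and stop codons concrete, here is a minimal sketch of reading an mRNA codon by codon. It assumes the standard genetic code and a single reading frame that begins at the first AUG; the function name `translate` is illustrative and not taken from any library.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G order:
# the first base varies slowest and the third base fastest.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}


def translate(mrna):
    """Translate from the first AUG up to the first in-frame stop codon."""
    mrna = mrna.upper().replace("T", "U")  # also accept a DNA-style sequence
    start = mrna.find(START)
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon in STOPS:
            break
        protein.append(CODON_TABLE[codon])
    return "".join(protein)


if __name__ == "__main__":
    print(translate("GGAUGGCCUAAUU"))
```

The sketch raises a `KeyError` on characters other than A, C, G, U (or T), which is acceptable for an illustration but would need handling for real sequence data.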
What are suppressor mutations?
List cell-based systems
Prokaryotic Transformation - concerns bacteria; precipitation of DNA at the surface, electroporation
Prokaryotic Transduction - uses viral vectors
Eukaryotic Transduction - uses viral vectors
Eukaryotic Transfection - DNA at surface, electroporation, precipitation, projectiles, etc.
What is a polylinker or MCS?
A polylinker or MCS (multiple cloning site) is a short region containing several commonly used restriction sites allowing easy insertion of DNA fragments at this location.
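As an illustration of why an MCS is convenient, the sketch below scans a sequence for the recognition sites of three widely used restriction enzymes. The enzymes chosen and the example sequence are assumptions made for this example; they are not taken from the notes. Because these sites are palindromic, searching one strand is sufficient.

```python
# Recognition sequences of three commonly used restriction enzymes.
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def find_sites(dna):
    """Map each enzyme to the 0-based positions of its site in dna.

    Enzymes with no site in the sequence are omitted from the result.
    """
    dna = dna.upper()
    hits = {}
    for enzyme, site in RESTRICTION_SITES.items():
        positions = []
        pos = dna.find(site)
        while pos != -1:
            positions.append(pos)
            pos = dna.find(site, pos + 1)
        if positions:
            hits[enzyme] = positions
    return hits


def unique_cutters(dna):
    """Enzymes that cut exactly once, the useful ones for inserting DNA."""
    return sorted(e for e, p in find_sites(dna).items() if len(p) == 1)


if __name__ == "__main__":
    hypothetical_mcs = "AAGAATTCGGATCCTT"  # invented example sequence
    print(find_sites(hypothetical_mcs))
    print(unique_cutters(hypothetical_mcs))
```

An enzyme that cuts only once, inside the MCS, opens the vector at a single known position, which is what makes insertion of a DNA fragment predictable.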
What is Ori?
The origin of replication (also called the replication origin) is a particular DNA sequence at which DNA replication is initiated. DNA replication may proceed from this point bidirectionally or unidirectionally.
What are the specific features of an expression vector?
- Promoter - inducible such as lac operon
- Origin of Replication
- Transgene
- Tags to help purification
- IRES - to allow expression of multiple genes
- MCS - for restriction enzymes
- antibiotic resistance genes
What is an ORF?
An open reading frame (ORF) is the section of a genome between a start codon and a stop codon, i.e. a section which could potentially code for a protein.
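The definition above can be sketched as a simple scan over the three forward reading frames of a DNA sequence. This is a deliberately minimal version: it ignores the reverse strand, uses the DNA spelling of the codons (ATG, and TAA/TAG/TGA for the stops), and the name `find_orfs` and its `min_codons` parameter are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (start, end) pairs of ORFs on the forward strand.

    start is the index of the A in ATG; end is the index just past the
    stop codon, so dna[start:end] is the whole ORF including the stop.
    min_codons is the minimum number of codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    seq = "CCATGAAATAGCC"
    for start, end in find_orfs(seq):
        print(start, end, seq[start:end])
```

A candidate that never reaches a stop codon is not reported, in line with the definition of an ORF as the stretch between a start and a stop.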
What are the steps in Western Blot?
Draw a vector
What can we add to the vector? Why?
IRES: to express multiple genes
neoR: neomycin resistance gene, to select the cells which carry the vector
tag or flag: to facilitate purification
When a protein is cloned, how do we verify that the protein is expressed?
Western blot (WB) and mass spectrometry (MS)
How do we purify?
2D gel electrophoresis, chromatography
Explain affinity chromatography
1. An antigen is coupled to an immunoabsorbent.
2. Proteins are passed over the immunoabsorbent. The desired antibodies bind non-covalently to the antigen. Everything else flows through.
3. Use salt or urea for elution to separate the antibodies from the antigen.
4. Dialysis to remove the elution materials.
Ion exchange chromatography
Reverse Phase chromatography
Gel-filtration, size-exclusion chromatography
Why are antibodies important?
Protein detection: FACS, WB, ELISA
Protein purification: affinity chromatography
Protein localization: immunofluorescence
FACS
Fluorescent antibodies can also be used with a fluorescence-activated cell sorter (FACS) to identify and even physically separate different cell populations according to the antigens expressed on their surfaces. A FACS is a flow cytometer designed to detect fluorescent light stimulated by the laser beam. Cells binding fluorescently tagged monoclonal antibodies directed against specific cell surface CD antigens, for example, will emit fluorescent light as they pass through the cytometer's laser beam. The relative intensity of emitted fluorescent light correlates with the amount of antibody bound to the cell surface which in turn correlates with the level of CD marker expression by the cell. With proper combinations of antibodies with different fluorescent tags, specific cell populations can be sorted and even isolated from a complex mixture of cells.
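The sorting logic described above can be sketched as a simple gating step: each cell is classified as positive or negative for each marker by comparing its fluorescence intensity with a threshold. The marker names, intensities and thresholds in the example are invented for illustration, and `gate` is a hypothetical helper, not part of any cytometry package.

```python
def gate(cells, thresholds):
    """Group cells by which markers meet or exceed their threshold.

    cells maps a cell id to a dict of marker -> fluorescence intensity.
    thresholds maps a marker to the intensity that counts as positive.
    Returns a dict of phenotype tuple, e.g. ("CD4+", "CD8-"), to cell ids.
    """
    populations = {}
    for cell_id, intensities in cells.items():
        phenotype = tuple(
            marker + ("+" if intensities.get(marker, 0.0) >= cutoff else "-")
            for marker, cutoff in sorted(thresholds.items())
        )
        populations.setdefault(phenotype, []).append(cell_id)
    return populations


if __name__ == "__main__":
    # Invented intensities in arbitrary units, for illustration only.
    cells = {
        "c1": {"CD4": 900, "CD8": 20},
        "c2": {"CD4": 15, "CD8": 700},
        "c3": {"CD4": 850, "CD8": 10},
    }
    print(gate(cells, {"CD4": 100, "CD8": 100}))
```

Real instruments gate on continuous distributions with compensation between fluorescence channels; the fixed thresholds here only convey the idea of separating populations by marker expression.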
ELISA
Enzyme-Linked Immunosorbent Assay (ELISA) is used to detect the presence of an antibody or antigen. It uses two antibodies: the first is specific to the antigen, and the second, which is coupled to an enzyme, binds the first. If the first antibody attaches to an antigen, the enzyme on the second produces a detectable chromogenic or fluorogenic signal.
Immunofluorescence
1. Add your primary antibodies which will bind to a certain protein.
2. Add your secondary fluorescent antibodies which will bind to primary antibodies.
3. View with a microscope.
Monoclonal antibodies
Monoclonal antibodies are all identical, produced by clones of a single antibody-producing cell. They recognize one specific epitope. Producing mAb requires immunizing an animal, usually a mouse; obtaining immune cells from its spleen; and fusing the cells with a cancer cell (such as cells from a myeloma) to make them immortal, which means that they will grow and divide indefinitely. A tumor of the fused cells is called a hybridoma, and these cells secrete mAb. The development of the immortal hybridoma requires the use of animals; no commonly accepted non-animal alternatives are available. An investigator who wishes to study a particular protein or other molecule selects a hybridoma cell line that secretes mAb that reacts strongly with that protein or molecule. The cells must grow and multiply to form a clone that will produce the desired mAb.
Western Blot
A sample is subjected to electrophoresis on an SDS-PAGE gel. The resolved proteins are transferred from the gel onto a membrane. An antibody specific for an antigen is added. A labeled second antibody, which is specific to the first antibody, is added, and unbound antibody is rinsed away. If the label is detected, we have the antigen, otherwise we don’t.
Polyclonal antibodies
Most antigens have several epitopes. Polyclonal antibodies are heterogeneous mixtures of antibodies, each specific for one of the various epitopes on an antigen.
1. Immunize animal
2. Purify IgG from the serum
Antibody Structure
Antibody Fragments
CDR - complementarity determining regions
What is ribosome display?
Ribosome display is a technology for the in vitro selection and evolution of very large protein libraries. The entire procedure is performed in vitro, without using cells at any step. Steps:
1. Isolate mRNA
2. Reverse transcribe mRNA to DNA
3. Amplify with PCR
4. In vitro transcription to produce mRNA
5. In vitro translation. Since there is no stop codon, the newly translated proteins remain tethered to the ribosome
6. Selection on a surface-bound target
7. Dissociation of the mRNA-ribosome-protein complexes. Repeat from step 1.
Features of an expression vector?
Packaging plasmid
Envelope plasmid
How to get nucleic acid into the cell
Flow diagram for getting a transgene into a cell
How do you get the transgene in?
How do you switch it on?
Pros and cons of different systems
Checking expression and adjusting parameters
Purification of the expressed protein
1. Protein Expression:
- Isolation and characterization of recombinant proteins
3. Protein/peptide chemical synthesis
4. Protein arrays
5. Protein Interactome
- Methodology
- Protein-protein interactions
- Protein-polynucleotide interactions
- Interaction with other biomolecules
6. Interactome bioinformatics:
- Definition of interactome (systems biology)
- Practical approaches
- Prediction: description of different algorithms
- Modelling interactions, pathways, cellular systems
- Challenges for bioinformaticians (prediction and interpretation of complex data)
- Visualization tools
7. Interactome databases:
- Interaction databases
- Pathway databases
- Information content and overlap
- Limitations
Antibodies
Monoclonal and polyclonal antibodies: creation, how to create antibodies (lympho-B)
How to create antibodies without animals (phage and ribosome display)
Ribosome display: we start with a library of 2 billion antibody sequences. Once we have our mRNA-ribosome-protein complex, we do a panning rather than a screening to find the right antibody (Ab) directed against our antigen (Ag); we then recover the mRNA which corresponds to this protein and introduce mutations into it to increase the affinity.
Antibodies : structure, production
Humoral response, cellular response, CDR
Antibody-antigen steric complementarity
Bonds: Hydrophobic interactions, Electrostatic interactions,
Which amino acids are related to which bonds
Difference between monoclonal and polyclonal antibodies - molecular aspects of this question: gene recombination, variable parts, CDR, affinity maturation.
Primary and secondary immune response
How many cDNA and antibodies are in a bank of antibodies
Structure of antibodies and antigens. Draw. Interactions. Amino acids involved. (steric complementarity, RW donor, NQST donor-acceptor, DEKYH donor-acceptor, electrostatic interactions (DE and KRH), hydrophobic interactions (FWYMAIL).
Why does entropy increase when two hydrophobic regions come closer? Because water and other polar molecules are normally ordered around the surface of a hydrophobic region; when they are released, the entropy increases in that region.
Peptide Synthesis
Non-natural amino acids (stop suppression, solid phase chemistry)
How to create a protein which contains sys: mutated tRNA (slow and expensive) or chemical synthesis (slow and expensive, and limited in size)
Peptide synthesis: reaction, yield, purification, etc.
Properties of amino acids
Different types of chromatographies
Network model: random, scale free, hierarchical, p(k) and c(k)
Origins of scale free in nature
IG, EPR and PVM. Which one is most used with networks (EPR)? Why?
Computation method for PPI? How it works
Draw a graph of K vs. P(k) for scale free. Duplication hypothesis and hubs.
Experimental methods
Coimmunoprecipitation
Scale free networks: details, comparison, robustness (tolerance to error and resistance to attacks) in comparison with random networks
Y2H: false positives, pros and cons
A method to detect antibodies (CoIP) with advantages and disadvantages
Databases
Format PSI-XML
Some entries come from experimentation, others from prediction
PPI prediction
A method for PPI and explain
Similarity between phylogenetic trees. MSA of proteins and distance matrix.
Domain interactions method for ppi IntAct
Computational methods for PPI
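For the network topics listed above (P(k) for random versus scale-free models, and hubs), the degree distribution of an interaction network can be computed directly from a list of pairwise interactions. This is a minimal sketch: the function name `degree_distribution` and the toy edge list are invented for illustration, and proteins with no interactions are invisible because they never appear in an edge list.

```python
from collections import Counter


def degree_distribution(edges):
    """Return P(k): the fraction of nodes that have exactly k partners.

    edges is an iterable of (protein_a, protein_b) pairs. Duplicate
    pairs and self-interactions are ignored.
    """
    unique = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    degree = Counter()
    for pair in unique:
        for node in pair:
            degree[node] += 1
    if not degree:
        return {}
    counts = Counter(degree.values())
    n_nodes = len(degree)
    return {k: counts[k] / n_nodes for k in sorted(counts)}


if __name__ == "__main__":
    # Toy network: one hub "H" connected to four partners.
    toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D")]
    print(degree_distribution(toy_edges))
```

Plotting k against P(k) on log-log axes is the usual way to check for the heavy tail expected of a scale-free network: many nodes with few partners and a few highly connected hubs.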
[1] O’Farrell, P. Z., J. Biol. Chem. 1975, 250, 4007-4021.
[2] Rabilloud, T., Valette, C., Lawrence, J., Electrophoresis 1994, 15, 1552-1558.
[3] Sanchez, ARBF 1998, Sample preparation and solubilization: crucial steps preceding the two-dimensional gel electrophoresis process
[4] R.M. Twyman, Principles of Proteomics
[5] Peter J. Wirth, Alfredo Romano, 1995, Staining methods in gel electrophoresis, including use of multiple detection methods
[6] Gurd, F. R., Methods Enzymol. 1967, 11, 532-541.
[7] Griffith, O. W., Anal. Biochem. 1980, 106, 207-212.
[8] Brune, D. R., Anal. Biochem. 1992, 207, 285-290.
[9] Yohann Couté, Geneva University Hospital, mpb lecture notes