X-ray crystallography

X-ray crystallography is a technique that is widely used to determine the structures of proteins. It exploits the fact that X-rays are scattered, or diffracted, in a predictable manner when they pass through a protein crystal. X-rays are diffracted when they encounter electrons, so the nature of the scattering depends on the number of electrons present in each atom and on the organization of the atoms in space. Diffracted X-rays can interfere with each other constructively or destructively. Therefore, when protein molecules are regularly arranged in a crystal, the interference between X-rays scattered in the same direction generates an interpretable pattern of spots. The crystal essentially amplifies the diffraction signal. The diffraction patterns are used to build a 3D image of the electron clouds of the molecule, known as an electron density map. The structural model of the protein is then built within this electron density map. [2]

Why do we need x-ray crystallography?

The function of a protein depends on its structure. Therefore determining protein structure accurately and determining reliable answers to structure related questions is crucial. Knowledge of accurate molecular structures is a prerequisite for drug design. X-ray crystallography is the oldest and most widely used technique to determine protein structure.

X-ray crystallography provides atomic-level resolution and has no upper limit on protein mass. However, it requires protein crystals, which can be very difficult to produce. It provides a static, averaged picture of the protein structure. Hydrogen atoms are barely visible by x-ray crystallography. Accuracy depends heavily on the quality of the crystal.

Alternate Methods

NMR and electron microscopy can be considered alternatives to x-ray crystallography. X-ray crystallography and NMR both produce atomic-level resolution. Electron microscopy offers molecular resolution. NMR is more difficult to use and much more expensive.

The method's principle

According to the method's principle, the limit of resolution (LR) depends on the wavelength being used.

LR = λ/2

A light microscope uses visible light, with a wavelength of 400 nm < λ < 800 nm. Thus, the best value for LR is 400/2 = 200 nm, which is suitable for viewing organelle structures but not protein structures.

X-rays have a wavelength of 0.1 Å < λ < 100 Å (0.01 nm < λ < 10 nm). At the short-wavelength end of this range, the LR is smaller than the distance between two bonded atoms, about 1.2 Å. Therefore, x-rays can be used at the atomic level.
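The comparison above is simple arithmetic, and a minimal sketch makes it explicit. The function name and the example wavelength of 1 Å (0.1 nm) are illustrative choices, not values taken from the lecture material.

```python
def limit_of_resolution(wavelength_nm: float) -> float:
    """Return LR = lambda / 2, in the same unit as the input (nm)."""
    return wavelength_nm / 2.0


# Shortest wavelength in the 400-800 nm range quoted above.
lr_400_nm = limit_of_resolution(400.0)

# An X-ray wavelength of 0.1 nm (1 Å); an illustrative value, not from the source.
lr_xray_nm = limit_of_resolution(0.1)

print(f"400 nm radiation: LR = {lr_400_nm} nm")
print(f"0.1 nm X-rays:    LR = {lr_xray_nm * 10:.2f} Å")
```

With 1 Å x-rays the limit comes out at 0.5 Å, which is below the 1.2 Å interatomic distance used above.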

X-rays are produced when fast electrons collide with matter. The wavelength of the generated x-rays depends on the material that the electrons strike. Monochromatic x-rays are used to solve smaller molecules.

Workflow

  1. protein expression
  2. protein purification
  3. crystal production
  4. x-ray diffraction & phasing
  5. data collection - analysis of diffraction patterns
  6. model construction - I = A² (intensity = square of amplitude)

Crystallization


A crystal is a solid formed by ordered atoms and ions. Ordered means that the same pattern is repeated along a regular lattice. Crystals are necessary because diffraction from an individual molecule is too weak to measure. Crystals act as amplifiers of the scattering signal because they contain many copies of the same molecule ordered in the same fashion.

Accurate structure determination requires a well-ordered crystal that diffracts X-rays strongly. Hydrophobic proteins or proteins with hydrophobic domains are the most difficult to crystallize.

Crystal formation is a multiparametric process. It involves three steps:

  1. Nucleation
  2. Growth
  3. Cessation of growth

Crystallization is nothing more and nothing less than forcing a protein to precipitate into regularly ordered three dimensional arrays. These 3D arrays are the crystal.

Protein Solubility

Proteins are placed in solution with salts (precipitants). The solubility curve is a representation of protein solubility. Saturation occurs when protein leaves and joins the solid phase at equal rates, so that the solid and solution phases are in equilibrium. Salting-out occurs when protein solubility decreases as the concentration of salt increases. Salting-in occurs when protein solubility increases as the concentration of salt increases. Nucleation occurs in the labile zone. Crystal growth occurs in the metastable zone. The goal is not to precipitate the protein, but to keep it in the labile and metastable zones until crystals are formed. The probability of nucleation increases with increasing supersaturation.

Crystallization energy barrier


There is a crystallization energy barrier which must be overcome by proteins before they can crystallize. The critical nucleus corresponds to the higher energy intermediate. The higher the energy barrier, the slower the rate of nucleation.

Saturation Zones

The probability of nucleation increases with increasing supersaturation, because supersaturation increases the likelihood that a critical nucleus will form. In addition, a smaller nucleus is sufficient to induce crystal formation in a more highly supersaturated solution. See phase diagram above. Saturation increases as we go from left to right.
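Both trends can be illustrated with classical nucleation theory for a spherical nucleus. This model is brought in here as an assumption and is not part of the cited lecture material. It gives a critical radius r* = 2γv / (kT ln S) and a barrier ΔG* = 16πγ³v² / (3 (kT ln S)²), where S is the supersaturation ratio, γ the interfacial energy and v the molecular volume. The numerical values of γ and v below are rough, hypothetical figures chosen only to show the trend.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def critical_radius(gamma, mol_volume, supersaturation, temperature=293.0):
    """Critical nucleus radius (m) from classical nucleation theory."""
    if supersaturation <= 1.0:
        raise ValueError("nucleation requires a supersaturation ratio > 1")
    driving_force = K_B * temperature * math.log(supersaturation)
    return 2.0 * gamma * mol_volume / driving_force


def nucleation_barrier(gamma, mol_volume, supersaturation, temperature=293.0):
    """Height of the nucleation energy barrier (J) for a spherical nucleus."""
    if supersaturation <= 1.0:
        raise ValueError("nucleation requires a supersaturation ratio > 1")
    driving_force = K_B * temperature * math.log(supersaturation)
    return 16.0 * math.pi * gamma**3 * mol_volume**2 / (3.0 * driving_force**2)


# Hypothetical, order-of-magnitude parameters (assumptions, not measured values).
GAMMA = 1.0e-3        # interfacial energy, J/m^2
MOL_VOLUME = 3.0e-26  # volume of one protein molecule, m^3

for s in (2.0, 5.0, 10.0):
    r_nm = critical_radius(GAMMA, MOL_VOLUME, s) * 1e9
    barrier_kt = nucleation_barrier(GAMMA, MOL_VOLUME, s) / (K_B * 293.0)
    print(f"S = {s:4.1f}: critical radius = {r_nm:5.1f} nm, barrier = {barrier_kt:6.1f} kT")
```

Under these assumptions both the critical radius and the barrier shrink as S grows, which matches the statement that nucleation becomes more likely at higher supersaturation.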

Crystallization experiment issues

A protein solution is mixed with precipitating reagents such as NH4PO3 to induce nucleation and, subsequently, crystal growth. The choice of the precipitating agent is important since no one reagent is compatible with all proteins.

Crystallization is affected by several parameters such as:

  • physico-chemical parameters: temperature, pH, etc.
  • biological parameters: origin, contamination, etc.
  • biochemical and biophysical parameters: binding of ligands, additives, etc.

Purity and homogeneity of the macromolecules are very important. Purity refers to a lack of contaminants. Homogeneity refers to a lack of both conformational heterogeneity and sequence heterogeneity. Conformational heterogeneity refers to flexible domains and denaturation. Sequence heterogeneity refers to post-translational modifications (PTMs) and proteolytic fragmentation. To reduce conformational heterogeneity, it is common to block a flexible domain with a ligand or to chop it off.

There are several ways to detect contamination and heterogeneity, such as gel electrophoresis, immunological titrations, etc.

In general, the more you know about your protein, the more likely you are to crystallize it. Homogeneous, compact, and globular proteins are more likely to be crystallized than heterogeneous and non-globular ones.

Solubility space

Crystallization experiments depend on large amounts of pure, soluble protein. However, it is difficult to obtain and purify large amounts of a rare protein. A sparse matrix allows rapid sampling of solubility space. It is a matrix of buffers which is used to make crude extracts that are rapidly assayed for the soluble protein using gel electrophoresis. A sparse matrix refers to a system which loosely couples different entities.

Sparse Matrix

A sparse matrix is a matrix of buffers. The matrix is used to quickly estimate the best buffer for a given protein. Based on this technology, several commercial vendors supply screens which automate crystal growth. The products from different vendors range from identical to broadly similar.
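The idea of sampling a large condition space with a small number of trials can be sketched as follows. This is only an illustration: the factor names and levels are hypothetical, and a uniform random draw stands in for the selection used in real screens, which is not described in the source.

```python
import itertools
import random


def sparse_matrix_screen(factors, n_conditions, seed=0):
    """Pick n_conditions distinct combinations from the full factorial grid."""
    full_grid = list(itertools.product(*factors.values()))
    if n_conditions > len(full_grid):
        raise ValueError("cannot sample more conditions than the grid contains")
    rng = random.Random(seed)  # fixed seed so the screen is reproducible
    chosen = rng.sample(full_grid, n_conditions)
    names = list(factors)
    return [dict(zip(names, combo)) for combo in chosen]


# Hypothetical factors and levels, for illustration only.
FACTORS = {
    "buffer_pH": [4.5, 5.5, 6.5, 7.5, 8.5],
    "precipitant": ["ammonium sulfate", "PEG 4000", "MPD", "sodium chloride"],
    "precipitant_level": ["low", "medium", "high"],
    "additive": ["none", "MgCl2", "glycerol"],
}

screen = sparse_matrix_screen(FACTORS, n_conditions=24)
for condition in screen[:3]:
    print(condition)
```

Here the full grid holds 5 × 4 × 3 × 3 = 180 combinations, while the screen tests only 24 of them.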

Growing Crystals

There are several setups which allow crystal growth. Most use a variant of the vapor diffusion method. The most widely used method is the hanging drop method.

Vapor Diffusion

A few microliters of protein solution are mixed with a roughly equal volume of reservoir solution containing the precipitants. A drop of this mixture is put on a glass slide which covers the reservoir. As the protein/precipitant mixture in the drop is less concentrated than the reservoir solution (we mixed the protein solution with the reservoir solution about 1:1), water evaporates from the drop and is taken up by the reservoir. As a result, the concentration of both protein and precipitant in the drop slowly increases, and crystals may form. A variety of other techniques is available, such as sitting drops, dialysis buttons, and gel and microbatch techniques.
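The concentration change in the drop can be estimated with a simple mass balance. The sketch below assumes that only water leaves the drop, that the reservoir is large enough for its precipitant concentration to stay fixed, and that equilibrium is reached when the precipitant concentration in the drop matches the reservoir. The function name and example numbers are hypothetical.

```python
def vapor_diffusion_drop(protein_stock, reservoir_precipitant,
                         protein_volume_ul=2.0, reservoir_volume_ul=2.0):
    """Estimate drop composition before and after equilibration.

    Concentrations are returned in the units of the inputs.
    """
    drop_volume = protein_volume_ul + reservoir_volume_ul
    initial = {
        "volume_ul": drop_volume,
        "protein": protein_stock * protein_volume_ul / drop_volume,
        "precipitant": reservoir_precipitant * reservoir_volume_ul / drop_volume,
    }
    # The amount of precipitant in the drop is conserved, so the drop shrinks
    # to the volume of reservoir solution that was originally added to it.
    final_volume = reservoir_volume_ul
    final = {
        "volume_ul": final_volume,
        "protein": protein_stock * protein_volume_ul / final_volume,
        "precipitant": reservoir_precipitant,
    }
    return initial, final


# Example: 10 mg/ml protein stock, 2.0 M precipitant in the reservoir, 1:1 mix.
initial, final = vapor_diffusion_drop(10.0, 2.0)
print("initial:", initial)
print("final:  ", final)
```

For a 1:1 mix under these assumptions, both concentrations in the drop double during equilibration.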

Cat whisker streaking is the preferred method of seeding. Touch an existing crystal with a whisker; friction dislodges seeds, which stick to the whisker and can then be streaked through a fresh drop.

Alternate methods for growing crystals

  • Dehydration
  • Crushing
  • Birefringence

Is it a crystal?

So we have a crystal. Is it a salt crystal or a protein crystal? One way to find out is to set up a no-protein control drop: an identical experiment in which the drop contains no protein. Another way is to run the crystal on a gel, but with this method you lose your crystal. The most definitive way to tell protein crystals apart from salt is to test their diffraction properties.

Don't judge the crystal by its looks. A nice crystal may not diffract and an ugly crystal may diffract at high resolution.

When the experiment fails

If you fail to crystallize, try the following:

  • wait longer
  • rescreen
  • use additives such as salts, detergents, and metals
  • work on the protein solution
  • chop up the protein

Solving the crystal

  • An asymmetric unit is the smallest entity (molecule) of the crystal that has no symmetry.
  • The unit cell is built by applying symmetry operators and translation along the 3 axes (X, Y, Z).
  • The edges of the unit cell define the axes of the crystal (edge lengths a, b, c and angles α, β, γ).

If an asymmetric unit contains 8 monomers, then the unit is composed of 8 copies of the protein chain.

Space Groups

Crystals and lattices can be classified into space groups based on the symmetry with which they fill space in a crystal lattice.

The combination of the 14 Bravais lattices with the 32 point groups and additional translational elements such as screw axes and glide planes gives a total of 230 space groups. Of these, only the 65 space groups without mirror planes and inversion centers are possible for protein crystals, because proteins are chiral molecules.

Diffracted X-rays can interfere constructively or destructively. Constructive interference occurs when the waves travel in unison, while destructive interference is the opposite. By unison, we mean that the waves are in phase, so that their amplitudes add; waves that are out of phase cancel each other.
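A short sketch shows how the combined amplitude of two scattered waves depends on their phase difference. Each wave is represented as a complex number, a standard convention that is assumed here rather than taken from the source.

```python
import cmath
import math


def combined_amplitude(amplitude_1, amplitude_2, phase_difference_rad):
    """Amplitude of the sum of two waves with the given phase difference."""
    wave_1 = cmath.rect(amplitude_1, 0.0)
    wave_2 = cmath.rect(amplitude_2, phase_difference_rad)
    return abs(wave_1 + wave_2)


for degrees in (0, 90, 180):
    total = combined_amplitude(1.0, 1.0, math.radians(degrees))
    print(f"phase difference {degrees:3d} deg: combined amplitude = {total:.3f}")
```

Two equal waves that are in phase add to twice the amplitude, while a phase difference of 180° cancels them.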

The pattern of diffraction allows direct determination of the unit cell and geometry (space group).

The resolution is calculated by: d_min = λ / (2 sin θ), where θ is the largest diffraction angle at which reflections are measured. A resolution of 2 Å or better is considered high resolution. Anything close to 6 Å is considered low resolution.
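The resolution formula can be evaluated directly. In this sketch the wavelength of 1.0 Å and the angles are illustrative values, and θ is taken to be half of the total scattering angle, as in Bragg's law.

```python
import math


def resolution_limit(wavelength_angstrom, theta_max_deg):
    """Return d_min = lambda / (2 sin theta) in Å."""
    if not 0.0 < theta_max_deg <= 90.0:
        raise ValueError("theta must lie between 0 and 90 degrees")
    theta = math.radians(theta_max_deg)
    return wavelength_angstrom / (2.0 * math.sin(theta))


for theta_deg in (5.0, 15.0, 30.0):
    d_min = resolution_limit(1.0, theta_deg)
    print(f"theta = {theta_deg:4.1f} deg: d_min = {d_min:.2f} Å")
```

Reflections recorded at larger angles correspond to smaller d_min, which is why a crystal that diffracts to wide angles gives a higher-resolution structure.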

Phase Problem

Once we have acquired the diffraction patterns, we need to calculate an electron density map from them. The process requires three pieces of information:

  1. wavelength λ of the incident x-rays - this is already known
  2. amplitude of the scattered x-rays - this can be determined by the intensity of the reflections
  3. phase of diffraction - this is not known and cannot be determined from the pattern of reflections.
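Why the missing phases matter can be seen in a one-dimensional toy model. The sketch below places two point "atoms" in a unit cell, computes their structure factors, and rebuilds the density twice: once with the true phases and once with the measured amplitudes only (all phases set to zero). The atom positions and electron counts are hypothetical, and a real calculation is three-dimensional.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """atoms: list of (fractional position, electron count) pairs."""
    return {
        h: sum(z * cmath.exp(2j * math.pi * h * x) for x, z in atoms)
        for h in range(-h_max, h_max + 1)
    }


def fourier_synthesis(factors, n_points=100):
    """Density on a grid: rho(x) = sum over h of F(h) exp(-2 pi i h x)."""
    density = []
    for i in range(n_points):
        x = i / n_points
        rho = sum(f * cmath.exp(-2j * math.pi * h * x) for h, f in factors.items())
        density.append(rho.real)
    return density


def peak_position(density):
    """Fractional coordinate of the highest density point."""
    return density.index(max(density)) / len(density)


ATOMS = [(0.20, 6), (0.55, 8)]  # hypothetical toy cell
true_factors = structure_factors(ATOMS, h_max=10)

# The experiment yields intensities, i.e. amplitudes only; the phases are lost.
amplitudes_only = {h: complex(abs(f), 0.0) for h, f in true_factors.items()}

print("peak with true phases:", peak_position(fourier_synthesis(true_factors)))
print("peak with zero phases:", peak_position(fourier_synthesis(amplitudes_only)))
```

With the true phases the strongest peak falls on the heavier atom at x = 0.55. With the phases discarded the strongest peak moves to the origin, so the amplitudes alone do not locate the atoms.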

Further experiments are usually necessary to determine diffraction phases. The standard approach is to produce heavy atom-containing isomorphous crystals. These crystals have the same structure but produce different diffraction patterns. This is achieved by soaking the protein crystals in a heavy metal salt solution so that the heavy metal atoms diffuse into spaces originally occupied by the solvent. By comparing the reflections generated by several different isomorphous crystals (MIR - multiple isomorphous replacement), the positions of the heavy atoms can be worked out, and this allows the diffraction phases in the unsubstituted crystal to be deduced. [2]

Using the MIR process we acquire:

  • amplitude and phase of heavy atoms
  • amplitude of protein
  • amplitudes of protein and heavy metal

The phase of the protein can then be estimated from these three amplitudes and one phase. The phase information is then used to construct an electron density map by means of a Fourier transform.
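The estimate from three amplitudes and one phase can be sketched with the cosine rule. It assumes that the structure factor of the derivative crystal is the vector sum of the protein and heavy-atom contributions, F_PH = F_P + F_H, so that |F_PH|² = |F_P|² + |F_H|² + 2 |F_P| |F_H| cos(φ_P − φ_H). The function name and the test values are illustrative.

```python
import cmath
import math


def protein_phase_candidates(amp_p, amp_ph, amp_h, phase_h):
    """Return the two protein phases (radians) consistent with one derivative."""
    if amp_p == 0.0 or amp_h == 0.0:
        raise ValueError("amplitudes must be non-zero to define a phase")
    cos_delta = (amp_ph**2 - amp_p**2 - amp_h**2) / (2.0 * amp_p * amp_h)
    # Measurement noise can push the value slightly outside [-1, 1].
    cos_delta = max(-1.0, min(1.0, cos_delta))
    delta = math.acos(cos_delta)
    return phase_h + delta, phase_h - delta


# Synthetic example: build F_PH from a known protein phase, then recover it.
f_p = cmath.rect(10.0, 1.0)  # protein: amplitude 10, phase 1.0 rad
f_h = cmath.rect(3.0, 0.4)   # heavy atoms: amplitude 3, phase 0.4 rad
f_ph = f_p + f_h

candidates = protein_phase_candidates(abs(f_p), abs(f_ph), abs(f_h), cmath.phase(f_h))
print("candidate protein phases:", [round(c, 3) for c in candidates])
```

A single derivative leaves two possible phases. Comparing the candidates from a second derivative with different heavy-atom positions selects the common one, which is why several isomorphous crystals are used.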

Finally, a structural model is built into the electron density map. This requires one more crucial piece of information - the amino acid sequence - because C, O, N atoms cannot be distinguished with certainty by x-ray diffraction so amino acid side chains are difficult to identify.

Assessing the quality of an x-ray structure

To assess the quality of an x-ray structure, we evaluate:

  • resolution - higher resolution = more accurate details
  • geometry - evaluate bonds, angles, positions of side chains, etc.
  • how well the model fits into the experimentally measured density - R-factor

A well refined crystal structure should have:

  • R-factor < 0.20, Free-R < 0.27
  • RMSD bond lengths < 0.02 Å, bond angles < 2°
  • Only a few outliers in the Ramachandran plot
  • No large deviation from ideal stereochemistry without a good reason
  • Water molecules with reasonable hydrogen bonds and B-factor
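The R-factor compares the observed structure factor amplitudes with those calculated from the model: R = Σ | |F_obs| − |F_calc| | / Σ |F_obs|. The sketch below uses a handful of made-up amplitudes and, as a simplifying assumption, omits the scale factor that is normally applied between observed and calculated data. The Free-R is the same quantity computed over a set of reflections that was kept out of refinement.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over matching reflections."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes for four reflections.
observed = [100.0, 80.0, 60.0, 40.0]
calculated = [90.0, 85.0, 66.0, 35.0]

print(f"R-factor = {r_factor(observed, calculated):.3f}")
```

A model that reproduced the observed amplitudes exactly would give R = 0.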

Source

[1] Lecture slides Dr. Leonardo Scapozza, University of Geneva
[2] Principles of Proteomics by R. M. Twyman
[3] Structural Bioinformatics by Bourne & Weissig