• The challenge of protein purification becomes self-evident when one considers the complex mixture of macromolecules present in a biological matrix such as a cell or tissue extract.
• Several thousand other proteins with different properties are present in any given cell type (~5,000–8,000 proteins).
• Nonproteinaceous materials are present as well:
  • DNA
  • RNA
  • Polysaccharides
  • Lipids

8.1. Protein abundance in the cell
• Actin – 1,000,000 molecules per cell
• Transcription factor – 10 molecules per cell

8.2. How much protein is needed and what level of purity is required?
Most sensitive instruments:
• Electrospray ionization (ESI)
• Matrix-assisted laser desorption/ionization (MALDI)
Sensitivity: 0.2–1 pmol (for a 25-kDa protein, ~5–25 ng, since 0.2 pmol × 25,000 g/mol = 5 ng)
Ion trap: ~1 fmol

One mole of a protein is the amount that contains $6.022 \times 10^{23}$ molecules of that protein (Avogadro's number). The mass of one mole of a protein in grams (g) is numerically equal to its molecular mass in daltons. For example, for a protein with a molecular mass of 20,000 daltons, the mass of 1 mole of protein is 20,000 g.
• 1 mmol = $10^{-3}$ mol
• 1 µmol = $10^{-6}$ mol
• 1 nmol = $10^{-9}$ mol
• 1 pmol = $10^{-12}$ mol
• 1 fmol = $10^{-15}$ mol

8.3. Using recombinant proteins often simplifies the purification process
Purifying rare proteins (growth factors, receptors, or transcription factors) from natural biological sources is often extremely difficult, requiring very large quantities of starting material and a 1–2-million-fold purification to achieve homogeneity.
In contrast, purifying over-expressed recombinant proteins in milligram to kilogram quantities has been greatly simplified by the ability to produce target proteins containing a fusion partner (a "purification handle") designed to facilitate protein purification.
• TNF (E. coli): ? L medium; package size 50 µg – USD 625
• TNF (HL60 tissue culture medium): 18 L – 20 µg of protein (Wang and Creasy, 1985)
Tumor necrosis factor (TNF-α) is a multifaceted polypeptide cytokine known to be a mediator of inflammation. It is an acute-phase protein that initiates a cascade of cytokines and increases vascular permeability, thereby recruiting macrophages and neutrophils to a site of infection. TNF-α secreted by macrophages causes blood clotting, which serves to contain the infection. TNF-α has been detected in the synovial fluid of patients with rheumatoid arthritis.

8.4. Proteins can be separated on the basis of their intrinsic properties
Many proteins and peptides of biological interest are of very low abundance, often constituting <0.1% of the total cellular proteins.
• Large quantities of source material required (>0.1%).
• Availability of separation facilities (e.g., instrumentation).
• Physical constraints of chromatographic resin support capabilities.
To fully exploit the chemical and physical properties of a target protein when designing a purification strategy, the following parameters should be determined for the target protein in initial pilot studies:
• molecular weight (e.g., by SDS-PAGE, size-exclusion chromatography, or analytical centrifugation),
• pI (e.g., by isoelectric focusing), and
• stability with respect to pH, salt, temperature, proteases, and the inclusion of additives in protein solvents to maintain biological activity (e.g., detergents, thiol reagents, and metal ions).

8.4.1. Exposed amino acid side chains determine protein solubility
Protein-to-protein variation in solubility is due to differences in the ratio of solvent-exposed charged (polar) amino acids to hydrophobic amino acids on protein surfaces.
Parameters that influence the solubility of a protein include:
• solvent pH
• the ionic strength and nature of the buffer ions
• solvent polarity
• temperature
Proteins tend to precipitate differentially from aqueous solution upon addition of:
• neutral salts (ammonium sulfate)
• polymers (polyethylene glycol)
• organic solvents (ethanol, acetone)
Because it is not possible to predict the solubility properties of proteins accurately, much of the skill in purification comes from experience in handling proteins under a variety of conditions.

8.4.2. The size and shape of proteins affect their movement through liquids and gels
Titin is the largest known protein. Its human variant consists of 34,350 amino acids, with the molecular weight of the mature protein being approximately 3,816,188.13 Da. Its mouse homologue is even larger, comprising 35,213 amino acids with a MW of 3,906,487.6 Da.
Proteins vary markedly in size, ranging from a few amino acid residues with a mass of a few hundred daltons to more than 10,000 amino acids with a molecular mass in excess of 1,000,000 daltons. However, the molecular mass of most proteins falls in the range of 6 kDa to 200 kDa.
In size-exclusion chromatography, a protein solution is passed through a column of porous beads. The internal diameter of the pores is such that large proteins do not have access to the internal space of the bead, whereas small proteins have free access, so large proteins pass through the column faster and elute first.
In SDS-PAGE, proteins are denatured and fully coated with the negatively charged detergent SDS, such that they migrate in electrophoretic gels on the basis of their molecular weight.

8.4.3. Differences in the surface charge of proteins are exploited in ion-exchange chromatography
The net charge on a protein is the sum of the charges of its positively and negatively charged amino acid residues at the pH of the solvent.
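As a rough illustration of how the net charge depends on the solvent pH, the sketch below sums the fractional charges of the ionizable groups using the Henderson–Hasselbalch relation. The pKa values are approximate textbook values and are an assumption of this example (real values shift with the local environment in the folded protein); cysteine and tyrosine are ignored for simplicity, and the function name `net_charge` and the residue counts are hypothetical.

```python
# Approximate pKa values (assumed for illustration; sources differ slightly).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0, "N_term": 9.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C_term": 3.1}


def net_charge(counts, ph):
    """Estimate the net charge from counts of ionizable groups at a given pH."""
    charge = 0.0
    for group, pka in POSITIVE_PKA.items():
        # Fraction of the group that is protonated and carries +1.
        charge += counts.get(group, 0) / (1.0 + 10 ** (ph - pka))
    for group, pka in NEGATIVE_PKA.items():
        # Fraction of the group that is deprotonated and carries -1.
        charge -= counts.get(group, 0) / (1.0 + 10 ** (pka - ph))
    return charge


# Hypothetical composition rich in lysine and arginine.
basic = {"K": 12, "R": 8, "H": 2, "D": 5, "E": 6, "N_term": 1, "C_term": 1}

for ph in (4.0, 7.0, 10.0):
    print(f"pH {ph}: net charge {net_charge(basic, ph):+.1f}")
```

The estimated charge falls as the pH rises, because basic groups lose their protons and acidic groups become ionized.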
Basic proteins (having a net positive charge at neutral pH) have a majority of basic amino acids (e.g., arginine, lysine, and histidine). Acidic proteins (having a net negative charge at neutral pH) have a majority of acidic amino acids (e.g., aspartic acid, glutamic acid). The pH at which the net charge of a protein is zero is referred to as the isoelectric point (pI).

8.4.4. Ligand-binding proteins may be purified by affinity chromatography
Most proteins exert their biological function by specifically interacting with some other cellular component.
• Enzymes bind to substrates, cofactors, activators, inhibitors, and metal ions.
• Hormones bind to receptors.
• Transcription factors bind to nuclear locations, export signals and DNA templates.
The exquisite specificity of antibody–antigen interactions forms the basis for immunoaffinity chromatography.
Metal ions attached to a chromatographic support form the basis for immobilized metal affinity chromatography (IMAC).

8.4.5. Posttranslational modifications provide additional opportunities for purification by affinity chromatography
Posttranslational modifications are fundamental to processes controlling cellular behavior, including cell signaling, growth, and transformation.
• Addition of carbohydrates to form glycoproteins
• Addition of phosphates to form phosphoproteins
• Addition of lipids to form lipoproteins
These modifications can be exploited for capture:
• Glycoproteins can be captured using immobilized lectins.
• Phosphoproteins can be captured using immobilized antibodies directed against phosphotyrosine or, alternatively, using IMAC.
Figure: Leucoagglutinin, a toxic phytohemagglutinin found in raw Vicia faba (Wikipedia)

8.4.6. Thermostable proteins can often be purified easily
Proteins are typically inactivated and precipitate if heated to 95°C.
However, some proteins exhibit a remarkable degree of thermoresistance:
• Stathmin (mammalian intracellular regulatory protein)
• Muscle phosphatase inhibitor I
• Alkaline phosphatase (innate resistance to digestion with proteases)
Very often, proteins with an intrinsically disordered structure are thermoresistant:
• Tau (protein associated with microtubules)
• Casein (major milk protein; 80%)

8.5. Devising strategies for protein purification
Before attempting to design a purification scheme, it is always worthwhile to carry out pilot experiments on the crude extract to determine whether the target protein possesses any unusual chemical properties that might be exploited in a purification strategy:
• Molecular weight
• pI
• Degree of hydrophobicity
• Presence of carbohydrate (glycoprotein)
• Phosphate modification
• Free sulfhydryl groups
• Stability with respect to:
  • pH
  • Salt
  • Temperature
  • Proteolytic degradation
  • Mechanical shear
• Bioaffinity for heavy metals
If the nucleotide sequence is known, much of this information might be obtained by close inspection of the deduced amino acid sequence.

8.5.1. Is retention of biological activity essential?
An important consideration is whether it is essential to retain biological activity of the target protein during purification.
Most proteins retain activity under the following conditions:
• Low temperature
• Neutral aqueous buffers
• Stabilizing additives (glycerol, detergent)
Some chromatography techniques use conditions that are incompatible with biological activity:
• Reversed-phase columns:
  • Organic solvents (acetonitrile)
  • Ion-pairing acids (TFA)
• Immunoaffinity columns:
  • HCl (10 mM)
  • NaOH (10 mM)
  • MgCl2 (3 M)
  • Glycine buffer (pH 2.3)
To limit the losses of biological activity of labile target proteins during purification, it is important to:
• minimize the number of steps in the purification protocol,
• avoid the need for buffer exchange between steps, and
• discriminate between losses of biological activity due to denaturation and physical losses caused by irreversible adsorption to the chromatographic support or by proteolytic degradation.

8.5.2. How many purification steps are necessary?
The average number of steps necessary to purify proteins to homogeneity is four, with an overall yield of 28% and a purification factor of 6380, corresponding to an average ninefold purification and 73% yield per step.
It is generally recognized that with most conventional chromatographic supports, there are compromises among speed, resolution, recovery, and capacity.
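The overall and per-step averages quoted for a four-step purification are linked by simple compounding: the overall yield is the product of the step yields, and the overall purification factor is the product of the step factors. The sketch below checks this arithmetic under the simplifying assumption that all four steps are identical (real steps differ); the function names are hypothetical.

```python
def overall(per_step_yield, per_step_fold, steps):
    """Compound an identical per-step yield and fold-purification over n steps."""
    return per_step_yield ** steps, per_step_fold ** steps


def per_step(overall_yield, overall_fold, steps):
    """Geometric-mean yield and fold-purification per step."""
    return overall_yield ** (1 / steps), overall_fold ** (1 / steps)


y, f = overall(0.73, 9, 4)
print(f"overall: yield {y:.0%}, purification {f:.0f}-fold")
# overall: yield 28%, purification 6561-fold

ys, fs = per_step(0.28, 6380, 4)
print(f"per step: yield {ys:.0%}, purification {fs:.1f}-fold")
# per step: yield 73%, purification 8.9-fold
```

The small gap between 6561 and 6380 is consistent with "ninefold" being a rounded per-step average (about 8.9).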
1. Clarifying the starting material
In any purification protocol, it is important to include steps for removing insoluble residues (e.g., lipid droplets, which cause column blockage) and to keep the initial clarification/concentration step as rapid as possible (to limit proteolytic degradation):
• Differential centrifugation (awkward)
• Filtration through a plug of glass wool or fine mesh cloth (less efficient)
• Fractional precipitation (salts, polymers, organic solvents) – very high (80%) average yield and able to gently concentrate large volumes
• Ultrafiltration with a variety of molecular-mass cutoff limits (1,000–300,000 daltons) – relatively slow

2. Capturing the target protein
For native proteins, enrichment is best accomplished using high-capacity/low-resolution chromatographic procedures:
• Hydrophobic interaction chromatography (HIC)
• Anion-exchange chromatography
• Nonbiospecific affinity chromatography (triazine dye chromatography)
• Immunoaffinity chromatography (high cost)
For recombinant proteins, several fusion systems have been developed:
• Oligohistidines (IMAC)
• Antigenic epitopes (mAb)
• Carbohydrate-binding proteins (domains recognized by lectins)
• Biotin-binding domain (affinity to avidin or streptavidin)
If required, a specific protease cleavage site can be engineered into the fusion protein.

3. Purifying and concentrating – intermediate steps
This step should be designed to provide further purification and reduction of sample volume, and it is best accomplished using intermediate-capacity/intermediate- to high-resolution chromatography.

4. Final polishing
The purpose of the final polishing step(s) is to remove any minor contaminants remaining, to remove possible aggregates, and to prepare the homogeneous target protein for its intermediate use or for storage.

Which order of steps is best?
According to an analysis of successful purification methods by Bonnerjea et al. (1986):
homogenization → clarification/fractional precipitation → anion-exchange chromatography → affinity separation → SEC
An important consideration in designing the order of purification steps is to minimize buffer-exchange steps:
fractional precipitation using ammonium sulfate → hydrophobic interaction chromatography → ion-exchange chromatography → SEC, dialysis, or membrane ultrafiltration

Checklist for protein purification:
• Define end goals
• Establish a rapid analytical assay
• In pilot experiments, define the chemical and physical characteristics of the target protein (pI, size, temperature stability, ligand specificity)
• Keep the purification procedure as simple as possible:
  • Minimize sample handling at every stage
  • Remove damaging contaminants
  • Be careful with addition of stabilizing additives (detergent, salts)

8.5.3. Enrichment of low-abundance proteins by preparative electrophoresis
Protein abundance in a biological sample spans a wide dynamic range, from about 1,000,000 copies per cell for the cytoskeletal proteins that maintain cellular architecture down to about 10 copies per cell for a transcription factor.
2D electrophoresis can separate only a subset of a total proteome, at best 1,500–2,000 proteins.

8.5.4. Strategies based on chromatographic methods for protein and peptide purification
Chromatography (Greek: color writing) – a technique for separating the components of a mixture by allowing the sample to distribute between two phases: one remains stationary (stationary phase), while the other moves (mobile phase).
The stationary phase may be:
• A packed bed of solid material in a column (liquid chromatography)
• Spread as a thin layer or film on a flat plate (thin-layer chromatography)
• Paper (paper chromatography)
Liquid chromatography requires:
• A stationary phase (with controlled structure and surface chemistry).
• A column (packed with stationary phase).
• A mobile phase(s) or solvent(s) of controlled chemical composition that moves the solute through the column.
• Chromatography equipment capable of accurately delivering the sample to the column.
• Software programs for blending the mobile phase(s).

8.5.4.1. Liquid chromatography – stationary phase

8.5.4.2. Liquid chromatography – support matrix
• Rigid solids – inorganic materials (porous silica, controlled pore glass, hydroxyapatite, alumina, and zirconia) (4,000–6,000 psi; 26–39 MPa)
• Hard gels – synthetic organic polymers (polystyrene-divinylbenzene, polyacrylamide, polyvinyl alcohols, and polymethacrylate), e.g., POROS or SOURCE (2,000–5,000 psi; 13–35 MPa)
• Soft gels – natural polymers such as cellulose, dextran, and agarose.
In most cases, support matrices used in protein and peptide applications are hydrophilic and charge-neutral, and have low nonspecific binding characteristics.

8.5.4.3. Liquid chromatography – particle size and structure
• Particle size (dp) – a critical determinant in LC that influences the chromatographic efficiency in a given separation (mechanical stability – column lifetime; surface area – analyte capacity factor k´). (POROS/SOURCE: dp = 3–20 µm in diameter)
• Particle size distribution – a narrow range of particle sizes is preferable (columns packed with a very broad particle size distribution are inefficient and less permeable)

8.5.4.4. Liquid chromatography – pore structure (accessible surface area)
Typically, the pore diameter must be 5 times the size of the molecules being purified to permit them to access all of the pores via molecular diffusion.
• Macroporous packings contain pores ranging from 1,000 to 10,000 Å (IEC).
• Mesoporous (wide-pore) packings contain pore diameters of 180–500 Å (HIC).
• Microporous packings have 60–120 Å pore diameters (RP-HPLC).
Stationary phase vs. bonded phase
Whereas the column packing matrix provides the chemically inert "skeleton" for the stationary phase, the bonded phase provides the functional groups, which are designed to selectively bind solute molecules.

8.5.4.5. Liquid chromatography – bonded phase
Functional R groups are usually:
• Methyl (CH3) groups
• Hydrocarbon chains (C6, C8 or C18)
• Nitrile group (–C≡N)
• Amino group (–NH2)
• Carboxylic group (–COO⁻)
• Sulphate group (–SO3⁻)
• Phenyl group (–C6H5)
8.5.4.6. Liquid chromatography – chromatographic performance
The separation of charged molecules, such as peptides and proteins, as they move down a column is affected by the differential migration of solutes and by their spreading or dispersion (peak or band broadening).
When the interaction of a solute with the stationary phase is very strong, the solute is retained to a greater extent and thus moves more slowly.
Equilibrium distribution coefficient: $K_D = S_S / S_M$
• $S_S$ – concentration of a solute in the stationary phase
• $S_M$ – concentration of the same solute in the mobile phase
The migration behavior is influenced by three major variables: the composition of the mobile phase (pH, ionic strength), the composition of the stationary phase, and the separation temperature.
Molecular spreading is the result of dilution of a solute band as it moves down the column (it is governed by kinetic and physical processes, whereas migration is governed by thermodynamic processes).

8.5.4.7. Liquid chromatography – basic retention principles
Selectivity (α) is a measure of the difference in retention between the solute of interest and other solutes in the sample.
Retention is simply the time ($t_r$) or volume ($v_r$) it takes for a solute to move through the column.
The capacity factor (k´, or the retention factor) is the number of column volumes, beyond the void volume, required to elute a particular solute; $t_0$ represents the void time:
$k' = (t_r - t_0)/t_0$
Selectivity is sometimes expressed as the ratio of the capacity factors k´ of the two solutes being separated:
$\alpha = k'_2 / k'_1$
Figure: Selectivity and retention (N = theoretical plate number)
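To make the retention definitions concrete, the sketch below computes the capacity factor k´ = (tr − t0)/t0 for two solutes and then their selectivity α as the ratio of the two capacity factors. The retention times are made-up illustrative values, and the function names are hypothetical.

```python
def capacity_factor(t_r, t_0):
    """k' = (t_r - t_0) / t_0, where t_r is retention time and t_0 is void time."""
    if t_0 <= 0 or t_r < t_0:
        raise ValueError("expected 0 < t_0 <= t_r")
    return (t_r - t_0) / t_0


def selectivity(k_late, k_early):
    """alpha = k'(later solute) / k'(earlier solute), so that alpha >= 1."""
    return k_late / k_early


t0 = 1.5                         # void time in minutes (illustrative)
k_a = capacity_factor(6.0, t0)   # earlier-eluting solute: (6.0 - 1.5) / 1.5 = 3.0
k_b = capacity_factor(7.5, t0)   # later-eluting solute:   (7.5 - 1.5) / 1.5 = 4.0
alpha = selectivity(k_b, k_a)

print(f"k'_A = {k_a:.2f}, k'_B = {k_b:.2f}, alpha = {alpha:.2f}")
```

Because k´ is normalized to the void time, it stays the same if every time in the run is scaled by a common factor (for example, by a change in flow rate), which makes it easier to compare between runs than the raw retention time.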
Band broadening and efficiency
Asymmetric peaks: an asymmetry value >1 indicates a tailing peak (commonly caused by sites on the packing with a stronger than normal retention of the solute).

Resolution
Resolution ($R_S$) is defined as the extent of separation between two chromatographic peaks:
$R_S = 2\,(t_{rB} - t_{rA})/(W_A + W_B)$
where $t_{rA}$ and $t_{rB}$ are the retention times of solutes A and B, and $W_A$ and $W_B$ are their peak widths at baseline.
Resolution is a composite function of both thermodynamic and kinetic parameters and is expressed in terms of an equation that includes the selectivity factor α, the capacity factor k´, and the plate number N:
$R_S = \tfrac{1}{4}\,(\alpha - 1)\,\sqrt{N}\,\left[k'/(1 + k')\right]$
• k´ – Capacity is directly related to the distribution coefficient of a solute between the mobile and stationary phases.
• α – Selectivity is affected by the surface chemistry of the column packing, the nature and composition of the mobile phase, the nature of the stationary phase, and the gradient shape.
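As a small worked example of the peak-based definition RS = 2 (trB − trA)/(WA + WB), the sketch below computes the resolution of two peaks from their retention times and baseline widths. The numbers are illustrative only, and the function name `resolution` is hypothetical.

```python
def resolution(t_a, t_b, w_a, w_b):
    """R_S = 2 (t_rB - t_rA) / (W_A + W_B).

    t_a, t_b: retention times of the earlier and later peak.
    w_a, w_b: baseline peak widths, in the same units as the times.
    """
    if w_a <= 0 or w_b <= 0:
        raise ValueError("peak widths must be positive")
    return 2.0 * (t_b - t_a) / (w_a + w_b)


# Two peaks at 6.0 and 7.5 min with baseline widths of 0.8 and 1.0 min.
rs = resolution(6.0, 7.5, 0.8, 1.0)
print(f"R_S = {rs:.2f}")

# Broader peaks at the same retention times resolve less well.
rs_broad = resolution(6.0, 7.5, 1.6, 2.0)
print(f"R_S (broader peaks) = {rs_broad:.2f}")
```

Doubling both peak widths halves the resolution, which shows why band broadening matters as much as the spacing between the peaks.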
8.5.4.8. Liquid chromatography – sample capacity
Sample capacity is the amount of sample that can be injected into a chromatographic system without overloading the column (expressed as the number of grams of sample per gram of column packing). It can be assessed by:
• Measurement of saturation or equilibrium capacity
• Frontal adsorption analysis
Column loadability
For optimal chromatographic performance and to achieve the greatest resolution, column loadability is a critical parameter.

8.5.4.8. Liquid chromatography – packing a column