Determining Protein Structure
While determining the polypeptide sequence resulting from gene translation is straightforward, determining the actual three-dimensional (3D) structure requires sophisticated experimental techniques. One such long-standing technique is X-ray crystallography, which is based on the scattering of X-rays by the electrons in the crystal's atoms. Think of the regular structure of table salt crystals. The atoms forming that structure are spaced very precisely in the crystal. Due to this regular spacing, a particular diffraction pattern forms when X-rays strike the crystal. By analyzing the diffraction pattern, one can reconstruct the position of each atom in the crystal and thus build a three-dimensional map of the molecule. Although proteins are much more complex than table salt, researchers have crystallized many of them in their native configuration and have used X-ray crystallography to find their 3D structures. The 3D structures of proteins are available to all scientists in a public database called the Protein Data Bank (PDB).
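The link between regular atomic spacing and the diffraction pattern can be illustrated with Bragg's law, n·λ = 2·d·sin(θ), which the passage does not name but which describes the angles at which X-rays scattered from evenly spaced planes reinforce one another. The short Python sketch below is only an illustration: the function name `bragg_angle_deg` is hypothetical, and the wavelength and spacing values are rough, assumed numbers chosen to resemble a common laboratory X-ray source and a table salt crystal.

```python
import math


def bragg_angle_deg(wavelength, spacing, order=1):
    """Return the angle theta (in degrees) that satisfies Bragg's law,
    order * wavelength = 2 * spacing * sin(theta).

    wavelength and spacing must use the same length unit (for example,
    angstroms). Returns None when no reflection of that order can occur.
    """
    sin_theta = order * wavelength / (2.0 * spacing)
    if sin_theta > 1.0:
        return None  # spacing is too small for this wavelength and order
    return math.degrees(math.asin(sin_theta))


# Assumed, illustrative values: an X-ray wavelength of about 1.54 angstroms
# and a plane spacing of about 2.82 angstroms.
theta = bragg_angle_deg(1.54, 2.82)
print(f"First-order reflection near theta = {theta:.1f} degrees")
```

Working backward is the useful direction in practice: measured angles reveal the spacings, and many such measurements together constrain where the atoms sit.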
Not all proteins can be crystallized, however. For example, membrane proteins have many hydrophobic amino acids and are particularly difficult to crystallize. A different technique, nuclear magnetic resonance (NMR), analyzes proteins in solution. NMR is based on the principle that the nuclei of some elements' atoms, such as hydrogen, resonate when a molecule, such as a protein, is placed in a powerful magnetic field. NMR measures the chemical shifts of the atoms' nuclei in the protein, which depend on nearby atoms and on their distances from each other. From these signals, researchers derive a set of distances between specific pairs of atoms. NMR data therefore generate a set of models of possible structures, rather than a single structure. For smaller proteins in particular, NMR can determine the 3D structure quite accurately.
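One way to picture how a set of pairwise distances leads to several possible models is to check each candidate model against the distance limits and keep those that fit. The Python sketch below is a simplified illustration, not the procedure that NMR software actually follows; the function name `restraint_violations`, the atom labels, the coordinates, and the distance limits are all made up for the example.

```python
import math


def restraint_violations(coords, restraints, tolerance=0.0):
    """List the distance restraints that a candidate model fails to satisfy.

    coords     -- dict mapping an atom label to its (x, y, z) position
    restraints -- list of (atom_a, atom_b, max_distance) tuples
    tolerance  -- extra slack allowed before a restraint counts as violated

    Returns a list of (atom_a, atom_b, excess_distance) tuples.
    """
    violations = []
    for atom_a, atom_b, max_distance in restraints:
        distance = math.dist(coords[atom_a], coords[atom_b])
        if distance > max_distance + tolerance:
            violations.append((atom_a, atom_b, distance - max_distance))
    return violations


# Hypothetical restraints and two hypothetical candidate models.
restraints = [("H1", "H2", 4.0), ("H1", "H3", 2.0)]
model_a = {"H1": (0.0, 0.0, 0.0), "H2": (3.0, 4.0, 0.0), "H3": (1.0, 0.0, 0.0)}
model_b = {"H1": (0.0, 0.0, 0.0), "H2": (2.0, 2.0, 0.0), "H3": (0.0, 1.5, 0.0)}

for name, model in (("A", model_a), ("B", model_b)):
    count = len(restraint_violations(model, restraints))
    print(f"Model {name}: {count} violated restraint(s)")
```

Because more than one arrangement of atoms can satisfy the same distance limits, a check like this typically leaves a family of acceptable models rather than a single answer.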
Despite advances in techniques for determining protein structure, the structures of many proteins are still unknown. With the help of protein prediction programs, computer analysis of genome sequences is producing thousands of new hypothetical proteins of unknown structure and function. These proteins are called "hypothetical proteins" because they represent the products predicted from the gene sequence; however, there is, as yet, no evidence that they are actually made and there is no known function for them.
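To show how a protein can be predicted from a gene sequence alone, the Python sketch below translates a DNA coding sequence with the standard genetic code. It is a deliberately minimal illustration: the function name `translate` is hypothetical, and the sketch reads only one strand, starts at the first ATG, and stops at the first stop codon. Real gene prediction programs must also consider both strands, other reading frames, and, in eukaryotes, introns.

```python
from itertools import product

# Standard genetic code, listed with each base position cycling through T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate from the first ATG to the first in-frame stop codon.

    Returns one-letter amino acid codes, or an empty string if the sequence
    has no ATG. Codons with unrecognized letters are reported as 'X'.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")
        if amino_acid == "*":  # stop codon ends the predicted protein
            break
        protein.append(amino_acid)
    return "".join(protein)


# A made-up sequence used only to exercise the function.
print(translate("CCATGAAATGGTGA"))
```

The output of such a step is only a predicted sequence, which is why these products remain hypothetical until experiments confirm that the protein is made.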
Computer programs may help determine the structure of proteins whose function is not yet known. By comparing the sequence of the unknown protein to the sequences of proteins with known 3D structures, these programs can make a predictive model of the unknown protein's structure using the known proteins as templates. The success of this method depends on the quality of the match between the known template proteins and the unknown target protein. In addition, when the function of the template protein is known, it may help identify the function of the unknown protein. These prediction programs do not produce structures with the detail or reliability of experimental techniques such as X-ray crystallography. They do, however, provide a means to analyze, in a reasonable time period, the large number of new proteins identified by the analysis of whole genomes.
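One simple measure of the quality of the match between a target and a template is the percentage of aligned positions at which the two sequences carry the same amino acid. The Python sketch below assumes the two sequences have already been aligned, with "-" marking gaps. The function name `percent_identity` and the example sequences are hypothetical, and actual modeling programs weigh additional evidence beyond this single number.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Return the percentage of identical residues among aligned positions.

    Both arguments must be rows of the same alignment, so they must have
    equal length. Positions where either sequence has a gap are skipped.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for target_residue, template_residue in zip(aligned_target, aligned_template):
        if target_residue == gap or template_residue == gap:
            continue
        compared += 1
        if target_residue == template_residue:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Made-up aligned fragments used only for illustration.
identity = percent_identity("MKT-AY", "MKSLAY")
print(f"Sequence identity: {identity:.0f}%")
```

In general, the higher the identity between target and template, the more confidence a researcher can place in a model built from that template.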