References

Multiple flexible structure alignment using partial order graphs. Yuzhen Ye and Adam Godzik. Bioinformatics, 2005 21(10):2362-2369. (color figures)
POSA: a user-driven, interactive multiple protein structure alignment server. Z Li, P Natarajan, Y Ye, T Hrabe, A Godzik. Nucl. Acids Res. (2014) doi: 10.1093/nar/gku394

POSA terms

POSA: Partial Order Structure Alignment
POA: Partial Order Alignment
POG: Partial Order Graph
MPStrA: Multiple Protein Structure Alignment

POSA alignment output

POSA was developed to generate partial order alignments of protein structures, meaning that the output alignments are partial order graphs. No alignments in the typical column-row format are provided.
The POSA server provides two ways to display the partial order alignments: (1) browse the alignment through our interactive webpage, or (2) download the alignment files.

How to read POA text file

POSA generates partial order alignments, which means we cannot provide the typical column-row alignments in any format. Instead we provide two kinds of alignment output that may be useful for some users.
Here we use the POSA result of 3 calmodulin-like proteins to demonstrate. This page provides links to the two alignment outputs shown below.

(1) Partial Order Graph (POG) along with amino acids

This file (show me the demonstration file) records the list of proteins between <PRO>...</PRO>: d1ncxa_ (index = 0), d2sasa_ (index = 1) and d1jfja_ (index = 2).

Between <NODE> and </NODE>, each line represents a node in the POG and records the residues aligned at this node. For example:
"1 0 A.2.S" records node 1 (note: node indexes start from 0). It contains residue "S" at position 2 of chain "A" in protein 0 (i.e., d1ncxa_). The position is defined as the "resSeq number" plus the "insertion code", read from columns 23-27 of the ATOM section. Protein indexes also start from 0; protein names and indexes can be found between the <PRO></PRO> tags. "A.2.S" is the residue identifier that specifies which residue of protein 0 is meant.
"20 0 A.21.E 1 A.9.K 2 A.1.M" records node 20, which represents an aligned position common to all 3 input structures (i.e., protein 0 residue A.21.E, protein 1 residue A.9.K and protein 2 residue A.1.M).
Between <EDGE> and </EDGE>, each line records an edge connecting two nodes; e.g., "0 0 1" records edge 0 connecting node 0 and node 1.
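The node and edge lines described above can be read with a few lines of Python. This is a minimal sketch, not part of POSA: the function names are hypothetical, and it assumes that fields are separated by whitespace and that every residue identifier has the form chain.position.residue with a non-blank chain Id.

```python
from typing import Dict, Tuple

Residue = Tuple[str, str, str]  # (chain, position, amino acid)


def parse_residue_id(res_id: str) -> Residue:
    """Split an identifier such as 'A.21.E' into chain, position and residue.

    The position stays a string because it may carry an insertion code.
    """
    chain, rest = res_id.split(".", 1)
    position, amino_acid = rest.rsplit(".", 1)
    return chain, position, amino_acid


def parse_node_line(line: str) -> Tuple[int, Dict[int, Residue]]:
    """Return (node index, {protein index: residue}) for one NODE line."""
    fields = line.split()
    node_index = int(fields[0])
    residues = {}
    # After the node index, fields alternate: protein index, residue identifier.
    for protein, res_id in zip(fields[1::2], fields[2::2]):
        residues[int(protein)] = parse_residue_id(res_id)
    return node_index, residues


def parse_edge_line(line: str) -> Tuple[int, int, int]:
    """Return (edge index, first node, second node) for one EDGE line."""
    edge, first, second = (int(value) for value in line.split())
    return edge, first, second
```

For instance, parse_node_line("20 0 A.21.E 1 A.9.K 2 A.1.M") yields node 20 with one residue for each of the proteins 0, 1 and 2.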
Between <CORE> and </CORE>: parameters about the common core for the input structures. For example:
<CORE>0 3 132 2.80 1 2.32</CORE>
  • 0 -- has no meaning
  • 3 -- number of structures that are in the common core
  • 132 -- the size of the common core
  • 2.80 -- average RMSD between all the structures
  • 2.32 -- the minimum of the average RMSD between one of the structures (as representative) and the remaining structures.
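As an illustration, the <CORE> element can be read as follows. This is a hedged sketch with hypothetical names (CommonCore, parse_core). It assumes the values are whitespace-separated and that the minimum average RMSD is the last value; the fifth value ("1" in the example) is not described in the list above, so the sketch skips it.

```python
import re
from typing import NamedTuple


class CommonCore(NamedTuple):
    n_structures: int     # number of structures in the common core
    size: int             # size of the common core
    mean_rmsd: float      # average RMSD between all structures
    min_mean_rmsd: float  # minimum average RMSD to a representative


def parse_core(text: str) -> CommonCore:
    """Extract the documented common-core parameters from a <CORE> element."""
    match = re.search(r"<CORE>(.*?)</CORE>", text, re.S)
    if match is None:
        raise ValueError("no <CORE> element found")
    fields = match.group(1).split()
    # fields[0] has no meaning; fields[4] is undocumented and ignored here.
    return CommonCore(int(fields[1]), int(fields[2]),
                      float(fields[3]), float(fields[5]))


core = parse_core("<CORE>0 3 132 2.80 1 2.32</CORE>")
print(core.n_structures, core.size, core.mean_rmsd, core.min_mean_rmsd)
```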
Length or size of the POG is enclosed in the <LEN> tag. For example:
<LEN>201 208</LEN>
  • 201 -- the number of nodes
  • 208 -- the number of edges
Users may prepare alignments they desire by extracting alignment information from this file.
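One way to extract alignment information from the node lines is to collect, for a chosen pair of proteins, the residues that share a node. The sketch below is only an illustration: aligned_pairs is a hypothetical helper, the node lines are assumed to be whitespace-separated as in the examples, and the pairs are returned in file order, which is not guaranteed to follow the sequence order of either protein.

```python
def aligned_pairs(node_lines, protein_a, protein_b):
    """Return residue-identifier pairs that share a node in the POG."""
    pairs = []
    for line in node_lines:
        fields = line.split()
        # Map protein index -> residue identifier for this node.
        residues = dict(zip(map(int, fields[1::2]), fields[2::2]))
        if protein_a in residues and protein_b in residues:
            pairs.append((residues[protein_a], residues[protein_b]))
    return pairs


nodes = ["1 0 A.2.S", "20 0 A.21.E 1 A.9.K 2 A.1.M"]
print(aligned_pairs(nodes, 0, 2))  # [('A.21.E', 'A.1.M')]
```

Node 1 holds a residue from protein 0 only, so it contributes no pair.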

(2) Alignment of amino acids

This file (show me the demonstration file) explicitly lists the residues of aligned regions that are at least 5 residues long and shared by all input structures (aligned positions shared by only a subset of the input structures are not listed). The lengths of the regions spanning between the aligned regions are shown between {}.

The first line in the file shows that there are 6 such aligned regions, labeled as S1 - S6 (same as in the simplified POG graph).
The second line "d1ncxa_ {20}EFKAAFDM{0}FDADGGGDIST{1}.." shows residues 21-28 of d1ncxa_ (EFKAAFDM), residues 29-39 (FDADGGGDIST), and so on.
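The residue ranges quoted above follow from adding up the gap lengths and the region lengths. A minimal sketch, assuming each line consists of a protein name followed by alternating {gap length} and residue strings, and counting positions from 1 along the line (aligned_regions is a hypothetical name):

```python
import re


def aligned_regions(line):
    """Return (protein name, [(first, last, residues), ...]) for one line."""
    name, body = line.split(None, 1)
    regions = []
    position = 0  # number of residues consumed so far
    for gap, residues in re.findall(r"\{(\d+)\}([A-Za-z]*)", body):
        position += int(gap)  # skip the residues between aligned regions
        if residues:
            regions.append((position + 1, position + len(residues), residues))
            position += len(residues)
    return name, regions


# Only the part of the example line shown in the text is used here.
print(aligned_regions("d1ncxa_ {20}EFKAAFDM{0}FDADGGGDIST{1}"))
# ('d1ncxa_', [(21, 28, 'EFKAAFDM'), (29, 39, 'FDADGGGDIST')])
```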

In addition, the qualified positions of the residues that are aligned in all input structures are recorded between the <ALN></ALN> tags. For example,
"0 0 A.21.E 1 A.9.K 2 A.1.M" records the first set of aligned residues from proteins 0, 1 and 2. The first number "0" is the index of the aligned position, which starts from 0.

PDB inputs in a tar/zip file

The POSA server accepts structures (in PDB format) packed in either a tar file (gzipped or not) or a zip file as input. Be sure that (a) the PDB files are clean (e.g., use a filtered PDB file that contains only the specific chain you want to use instead of the original PDB file with multiple chains); and (b) all files are placed in the current directory when the archive is created (not in a subdirectory, and not a mix of both).
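A filtered PDB file with a single chain can be produced with a short script. The sketch below is an illustration under stated assumptions, not a POSA tool: keep_chain is a hypothetical helper, it relies on the standard PDB column layout (chain Id in column 22 of ATOM records), and it keeps only ATOM records, dropping HETATM records and all header lines.

```python
def keep_chain(pdb_lines, chain_id):
    """Return the ATOM records of one chain, followed by an END record."""
    kept = []
    for line in pdb_lines:
        # Column 22 (index 21) of an ATOM record holds the chain Id.
        if line.startswith("ATOM  ") and len(line) > 21 and line[21] == chain_id:
            kept.append(line.rstrip("\n"))
    kept.append("END")
    return kept


# Example usage (file names are placeholders):
# with open("original.pdb") as source:
#     filtered = keep_chain(source, "A")
# with open("AAAA.pdb", "w") as target:
#     target.write("\n".join(filtered) + "\n")
```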

E.g., if you have three structures, AAAA.pdb, BBBB.pdb and CCCC.pdb, in the current directory, you can archive them in any of three ways that the POSA server recognizes.

> tar cvf pdb.tar AAAA.pdb BBBB.pdb CCCC.pdb
> tar zcvf pdb.tar.gz AAAA.pdb BBBB.pdb CCCC.pdb
> zip pdb.zip AAAA.pdb BBBB.pdb CCCC.pdb

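The same archives can also be built from a script. The following sketch uses Python's standard tarfile and zipfile modules; pack_flat is a hypothetical helper that stores every file under its base name, so that no subdirectories end up in the archive regardless of where the files live.

```python
import os
import tarfile
import zipfile


def pack_flat(paths, archive_path):
    """Pack files into a .zip, .tar or .tar.gz archive without subdirectories."""
    if archive_path.endswith(".zip"):
        with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in paths:
                archive.write(path, arcname=os.path.basename(path))
    else:
        gzipped = archive_path.endswith((".tar.gz", ".tgz"))
        with tarfile.open(archive_path, "w:gz" if gzipped else "w") as archive:
            for path in paths:
                archive.add(path, arcname=os.path.basename(path))


# Example usage (file names as in the commands above):
# pack_flat(["AAAA.pdb", "BBBB.pdb", "CCCC.pdb"], "pdb.tar.gz")
```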
PDB inputs with user-defined constraints

To help users enter constraints on the input protein structures, the POSA server interface provides a convenient and flexible protein structure definition table.

There are 5 columns in the table. The first 2 columns, PDB and Chain, are required; they specify the amino acid chain to be aligned from each input protein structure (in PDB format). The input protein structure can be provided as a PDB code or SCOP code entered in the PDB column, or by uploading a local coordinate file in PDB format. User-defined constraints are provided in the Segments, Reference and OtherChains columns.

Please use the example below to learn how to specify input structures with constraints and how they will be interpreted.
Please use this format for your input, and use 'na' as a placeholder for undefined values. Use a dot (.) for an unspecified chain Id, in which case the first chain will be used. If the chain Id is a space, please enter 2 dots (..).

#PDB    Chain   Segments                  Reference   OtherChains
1a9n    B       NA                        No          Q
1urn    A       11-16,40-46,55-60,76-86   Yes         P
2err    A       NA                        No          B
1yty    A       NA                        No          C
2kg1    A       NA                        No          B
If you copy and paste the text above into the text area of the second input method, or manually enter the values into the input table of the first input method on the POSA homepage, the result will be as shown in Job Home (visualization of the protein structure comparison in Jmol).



Column descriptions

PDB (required)
  The initial protein structure, given as a file in PDB format or as a PDB or SCOP code. A PDB file named column1.pdb should be provided; otherwise it will be downloaded from the PDB site or the SCOP site.
Chain (required)
  The chain Id from column1.pdb that will be used for structure alignment.
Segments (optional; default = "NA")
  Segments defined as "resSeq-resSeq" and separated by commas, where "resSeq" is the residue sequence number field (columns 23-26) in the ATOM line. This defines which segments of the chain in column 2 will be used for structure alignment. When absent, the whole chain is used.
Reference (optional; Yes/No; default = "No")
  Whether this structure is used as the reference, meaning that all other structures will be superimposed onto it. Only one reference is allowed (if more than one is given, the first reference structure is used). If a reference is present, the rigid pairwise structure alignment method is used; otherwise, the POSA multiple structure alignment method is used.
OtherChains (optional; default = "NA")
  Chain Ids separated by commas. These chains are not used when calculating the structure alignment; they are added in post-processing, using the same transformation matrix as the structure alignment, so that users can view them.

NOTE: To input structures using the second input method, please follow these format rules:
  • Comment lines start with "#".
  • Columns are delimited by spaces or tabs (default).
  • Cell contents must not contain spaces.
  • If a cell is empty, please use "na" or "NA" as a placeholder.
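To check a table before submitting it, the format rules above can be applied in a short script. This is a minimal sketch of one possible interpretation, not the server's own parser: parse_structure_table and the dictionary keys are hypothetical, missing trailing cells are treated as "NA", and segment bounds are assumed to be non-negative integers without insertion codes.

```python
def _is_na(cell):
    return cell.upper() == "NA"


def parse_structure_table(text):
    """Parse the structure definition table into a list of dictionaries."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # blank or comment line
            continue
        cells = line.split()                  # spaces or tabs as delimiters
        if len(cells) < 2:
            raise ValueError("PDB and Chain are required: %r" % raw)
        cells += ["NA"] * (5 - len(cells))    # assumption: pad missing cells
        pdb, chain, segments, reference, others = cells[:5]
        if chain == ".":
            chain = None                      # unspecified: first chain is used
        elif chain == "..":
            chain = " "                       # chain Id is a space
        records.append({
            "pdb": pdb,
            "chain": chain,
            "segments": None if _is_na(segments) else [
                tuple(int(bound) for bound in segment.split("-"))
                for segment in segments.split(",")
            ],
            "reference": reference.lower() == "yes",
            "other_chains": [] if _is_na(others) else others.split(","),
        })
    return records
```

With the example table, the 1urn row gives segments [(11, 16), (40, 46), (55, 60), (76, 86)] and reference True, while rows with "NA" in the Segments column give None, meaning the whole chain.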

POSA structure display page

There are 2 frames (example). The top frame shows the sequence information for the superimposed protein structures. The bottom frame shows the three-dimensional superposition of the proteins. The sequence information in the top frame can be the protein sequence (example), the protein structure alignment in text format (example), or the Simplified Partial Order Graph.
The bottom frame contains one large structure view and several small views, each showing a single protein structure. The large view depicts all superimposed structures. Individual structures in the large view can be switched on and off, as can any chains, segments or ligands, for better structure comparison. Structures in the smaller views rotate synchronously when the "Synchronize" box (bottom of the frame) is checked.
The PDB file names and chain Ids in the structure windows and in the top frame are named and colored consistently. Each PDB file name consists of the user's initial PDB file name or PDB id, plus the chain Id used for structure superposition, plus the segment ids if the user chose to use some segments rather than the full chain (optional), plus one or more chain Ids that the user wants to include in the structure display (these chains are not used for structure superposition but are added after the structure comparison is done on the user-selected chains).
Example: protein structures of 3 antibodies (PGT135, VRC01 and VRC23) were compared. Antibody "PGT135" with PDB id 4jm2 has chains A, B, C, D, E. Antibody "VRC01" with PDB id 3ngb has chains A, D, G, I, B, E, H, J, C, F, K, L. Antibody "VRC23" with PDB id 4j6r has chains G, H, L. Although each antibody structure has more than one chain, the structure superposition is only allowed on a single chain (in full or in part).
Note: if you prefer to use 2 or more chains from a PDB file for structure superposition, it is recommended to edit the PDB file and rename the chain ids of the selected chains to one single unique chain id.
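Renaming chain ids can be done by rewriting column 22 of the coordinate records. The sketch below is an assumption-laden illustration (merge_chains is a hypothetical helper): it relies on the standard PDB column layout, touches only ATOM, HETATM and TER records, and does not renumber residues, so the selected chains should not share residue numbers.

```python
def merge_chains(pdb_lines, chains, new_chain_id):
    """Give all records of the selected chains one common chain id."""
    if len(new_chain_id) != 1:
        raise ValueError("a PDB chain id is a single character")
    merged = []
    for line in pdb_lines:
        is_coordinate_record = line.startswith(("ATOM  ", "HETATM", "TER   "))
        # Column 22 (index 21) holds the chain id in these records.
        if is_coordinate_record and len(line) > 21 and line[21] in chains:
            line = line[:21] + new_chain_id + line[22:]
        merged.append(line)
    return merged


# Example usage (file name and chain ids are placeholders):
# with open("complex.pdb") as source:
#     lines = merge_chains(source.read().splitlines(), {"H", "L"}, "X")
```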

Here the structure comparison was executed on the single antigen chain from each antibody (chain E from PGT135, chain G from VRC01 and chain G from VRC23). After the structure comparison is done on the selected chains (or segments), one can choose to include any other chains or all chains from each antibody in the final superimposed structures for display. These chains are added to the final display by applying the transformation matrix used in the structure superposition. In this example, one of the heavy chains (4jm2:A; 3ngb:H; 4j6r:H) and one of the light chains (4jm2:B; 3ngb:L; 4j6r:L) were added to the final display.
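Applying a superposition transformation to additional chains amounts to a rigid-body transform of their coordinates. The sketch below shows the generic operation x' = R x + t in plain Python. The function name and the split into a 3x3 rotation matrix and a translation vector are assumptions made for illustration; the format in which POSA stores its transformation matrix is not described here.

```python
def transform(coords, rotation, translation):
    """Apply x' = R x + t to a list of (x, y, z) coordinates."""
    moved = []
    for x, y, z in coords:
        moved.append(tuple(
            row[0] * x + row[1] * y + row[2] * z + shift
            for row, shift in zip(rotation, translation)
        ))
    return moved


# Rotation by 90 degrees about the z axis, then a shift of 1 along x.
rotation = [[0, -1, 0],
            [1, 0, 0],
            [0, 0, 1]]
translation = (1, 0, 0)
print(transform([(1, 0, 0)], rotation, translation))  # [(1, 1, 0)]
```

Because the same rotation and translation are applied to every atom, chains that were not part of the alignment keep their position relative to the aligned chain.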

How to visualize superimposed 3D structures

Multiple superimposed structures are visualized interactively on our website. POSA currently supports only Jmol for displaying protein structures. To run Jmol, please make sure that Java is installed on your system and that the Java Applet Plug-in and JavaScript are enabled in your web browser.
  • How to enable Java-Plugins on websites

    Please follow the steps described in the help provided by Java on how to enable Java plug-ins on websites.

    For WinXP/Win 7, go to System Settings => Java => Security => add posa.godziklab.org to the allowed servers list.
    For Mac, open Chrome settings => Settings => Advanced Settings => Privacy => Content Settings => Plug-Ins => Manage Exceptions => add posa.godziklab.org to allowed servers.
  • Web browsers and platforms that are tested

    We recommend using Chrome for POSA visualization, as it has proved to display Jmol results most reliably on Mac OS, Win7 and Win8. We have also tested Safari, Chrome and Firefox on Mac OS X 10.6 and above. We will keep this page updated as new browsers or platforms are tested. Please report any errors or comments to POSA.
  • Tips and troubleshooting for known problems

    • Jmol window does NOT show the structure(s)

      (1) If structures are not displayed, the default Windows system settings may be set to high security. POSA uses Jmol, an uncertified applet, so structure viewing is blocked by default. To solve this, please go to Control Panel -> Java -> Security -> Edit site list and add the POSA URL posa.sanfordburnham.org to the exception list. For more information, go to http://www.java.com/en/download/exception_sitelist.jsp

      (2) Please try to RELOAD the page.

      (3) Please minimize the browser windows to check whether a warning window is waiting for you to click "Allow" to run the Java Applet.
    • Synchronize box does NOT work

      Please uncheck and then re-check the "Synchronize" box, but DO NOT reload the page.
    • Java Applet does not work in Safari and Chrome in Mac OS

      Apple supplies its own Java. Java 7 from Oracle is not compatible with Chrome and Safari on Mac. To restore Apple's Java SE 6, please refer to Apple support for details.

      These 2 commands will restore Java SE 6 for running the Java Applet.
      sudo mv /Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin/ /Library/Internet\ Plug-Ins/Disabled\ Plug-Ins/

      For Safari
      sudo ln -sf /System/Library/Java/Support/Deploy.bundle/Contents/Resources/JavaPlugin2_NPAPI.plugin /Library/Internet\ Plug-Ins/JavaAppletPlugin.plugin
  • Web resources for Jmol and Scripts: