LOOS  v2.3.2

Convergence Analysis Tools

A collection of tools for assessing statistical error and convergence, found in the Packages/Convergence/ directory:

assign_frames

Given a trajectory and a set of fiducial structures (histogram centers), assign each frame in the trajectory to a histogram bin. Part of the workflow for computing the effective sample size. (See effsize.pl)
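
The assignment step can be illustrated with a minimal sketch. It assumes the structures are already aligned and uses a plain RMSD as the distance; the function names (`rmsd`, `assign_to_bins`) and the toy data are hypothetical and are not part of LOOS.

```python
import math

def rmsd(a, b):
    """RMSD between two equal-length lists of (x, y, z) points (no alignment)."""
    total = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(total / len(a))

def assign_to_bins(frames, fiducials):
    """Return, for each frame, the index of the closest fiducial structure."""
    return [min(range(len(fiducials)), key=lambda k: rmsd(frame, fiducials[k]))
            for frame in frames]

# Toy data: two single-atom "structures" serve as histogram centers.
fiducials = [[(0.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]]
frames = [[(1.0, 0.0, 0.0)], [(9.0, 0.0, 0.0)], [(4.0, 0.0, 0.0)]]
assignments = assign_to_bins(frames, fiducials)
print(assignments)
```

Each entry of `assignments` is the bin index for the corresponding frame; counting the entries per index gives the structural histogram.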

avgconv

Computes the RMSD between the average structure for time i and i+1 for a trajectory. The "locally optimal" flag determines whether the trajectory is globally aligned first or whether each block of frames used in the average is aligned prior to averaging.

bcom

Implements the block covariance overlap method. Briefly, this is analogous to block averaging: the trajectory is broken up into blocks of a given size, a PCA is computed for each block, and the covariance overlap is calculated between the block's PCA and the PCA for the entire trajectory. This is then repeated for increasing block sizes. A Z-score for the bcom result can also be calculated (using the --zscore=1 flag and optionally setting the number of "tries" to use).
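
As an illustration of the quantity compared between a block and the full trajectory, here is a minimal sketch of a commonly used definition of covariance overlap. It is an assumption that bcom uses exactly this form. The function name and argument layout are hypothetical, and the eigenvalues and unit eigenvectors are assumed to come from a separate PCA step.

```python
import math

def covariance_overlap(vals_a, vecs_a, vals_b, vecs_b):
    """Covariance overlap between two eigen-decompositions.

    vals_*: eigenvalues; vecs_*: the matching unit eigenvectors (lists of floats).
    Returns 1 for identical covariance matrices and 0 for orthogonal subspaces.
    """
    trace_sum = sum(vals_a) + sum(vals_b)
    cross = 0.0
    for la, va in zip(vals_a, vecs_a):
        for lb, vb in zip(vals_b, vecs_b):
            dot = sum(x * y for x, y in zip(va, vb))
            cross += math.sqrt(la * lb) * dot * dot
    # Clamp tiny negative values caused by floating-point round-off.
    ratio = max(0.0, (trace_sum - 2.0 * cross) / trace_sum)
    return 1.0 - math.sqrt(ratio)

same = covariance_overlap([2.0, 1.0], [[1.0, 0.0], [0.0, 1.0]],
                          [2.0, 1.0], [[1.0, 0.0], [0.0, 1.0]])
disjoint = covariance_overlap([2.0, 1.0], [[1, 0, 0, 0], [0, 1, 0, 0]],
                              [2.0, 1.0], [[0, 0, 1, 0], [0, 0, 0, 1]])
print(same, disjoint)
```

In a block analysis, this value would be computed for every block of a given size and averaged, then plotted against block size.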

block_average

Reads a simple column-formatted text file and computes the block-averaged standard error as a function of block size. The plateau value is the best estimate of the true standard error. Reference: Flyvbjerg, H. & Petersen, H. G. J. Chem. Phys., 1989, 91, 461-466
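
The idea behind the block-averaged standard error can be sketched as follows. This is a minimal illustration rather than the block_average implementation; the function name and the toy series are hypothetical.

```python
import math

def block_standard_error(data, block_size):
    """Standard error of the mean estimated from contiguous block averages."""
    n_blocks = len(data) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    means = [sum(data[i * block_size:(i + 1) * block_size]) / block_size
             for i in range(n_blocks)]
    grand = sum(means) / n_blocks
    # Sample variance of the block means, then the standard error of their mean.
    var = sum((m - grand) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)

# A strongly correlated toy series: the naive estimate (block size 1) is too small.
series = [1.0, 1.0, 1.0, 1.0, 3.0, 3.0, 3.0, 3.0]
for size in (1, 2, 4):
    print(size, block_standard_error(series, size))
```

For correlated data the estimate grows with block size until the blocks are longer than the correlation time, which is where the plateau appears.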

block_avgconv

Block-averaging of RMSD between average structures for a trajectory. "Range" in this case is the range of block sizes and not strictly which frames of the trajectory to use.

bootstrap_overlap.pl

Perl program to compute the bcom and bootstrapped bcom for a trajectory, generating a plot of their ratio and an exponential fit. Also generates a plot of the residual error in the fit. Use the "--help" option for more details. Note that the number of block sizes used is somewhat conservative, so it's probably a good idea to use a low number of block sizes initially to get a quick idea of how good or bad the sampling is, and then use the higher number of blocks for a more detailed analysis. Also note that plotting requires gnuplot. If you do not have gnuplot installed (or do not like gnuplot), use the "--noplot" flag to disable plotting.

boot_bcom

Bootstrapped bcom is similar to bcom above, but rather than using contiguous blocks, it uses a bootstrap procedure, randomly selecting frames from the trajectory to build decorrelated blocks. If no seed for the random number generator is given, LOOS will pick a default (based on the current system clock). The --replicates option determines how many blocks are generated for a given size.
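
The frame selection can be sketched as below. This assumes sampling with replacement, as in a standard bootstrap; whether boot_bcom samples exactly this way is not stated here. The function and parameter names are hypothetical.

```python
import random

def bootstrap_blocks(n_frames, block_size, replicates, seed=None):
    """Build `replicates` blocks of randomly chosen frame indices.

    Frames are drawn with replacement (an assumption). A fixed seed makes
    the selection reproducible; seed=None lets Python choose one.
    """
    rng = random.Random(seed)
    return [[rng.randrange(n_frames) for _ in range(block_size)]
            for _ in range(replicates)]

blocks = bootstrap_blocks(n_frames=1000, block_size=50, replicates=20, seed=42)
print(len(blocks), len(blocks[0]))
```

Because the indices are drawn from the whole trajectory instead of a contiguous window, the frames in each block are approximately decorrelated, which is what makes the comparison with contiguous blocks informative.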

chist

Calculates either a cumulative histogram (where each output row is the histogram up to that point), or a windowed histogram.
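
The cumulative variant can be sketched as follows. The bin bounds, the `stride` between output rows, and the use of raw (unnormalized) counts are assumptions made for illustration, and the function name is hypothetical.

```python
def cumulative_histograms(values, lo, hi, n_bins, stride):
    """Return (n_points, counts) rows, each the histogram of the first n_points values."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    rows = []
    for i, v in enumerate(values, start=1):
        if lo <= v < hi:
            # min() guards against round-off pushing a value into a bin past the end.
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
        if i % stride == 0:
            rows.append((i, list(counts)))
    return rows

rows = cumulative_histograms([0.1, 0.6, 0.7, 0.2], lo=0.0, hi=1.0, n_bins=2, stride=2)
for n_points, counts in rows:
    print(n_points, counts)
```

A windowed histogram would instead reset `counts` after each output row, so that every row covers only the most recent window of data.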

coscon

Computes the cosine content for varying windows of a trajectory, based on Hess, B. "Convergence of sampling in protein simulations." Phys Rev E (2002) 65(3):031910


decorr_time

Decorrelation time as computed by structural histogram analysis. The default values for the range of N-values, repetitions, and bin fraction are taken from the paper below and may need to be changed, particularly if you suspect your trajectory is undersampled. Reference: Lyman & Zuckerman, J Phys Chem B (2007) 111:12876-82

effsize.pl

Perl front-end to the effective sample size tools (ufidpick, assign_frames, hierarchy, neff). If you want to apply the Zuckerman-style effective sample size method (see the entry for neff, below), you should probably use this script instead of the individual tools, since it automates the process of picking fiducial structures (the frames that will be the centers of your histogram bins), assigning the frames from the trajectories to those bins, working out the mean first passage time between bins, and computing the effective sample size. Reference: Lyman & Zuckerman, Biophys J (2006) 91:164-72

fidpick

Picks fiducial structures for structural histograms. Reference: Lyman & Zuckerman, Biophys J (2006) 91:164-72

hierarchy

Given a trajectory whose structures have been binned into states via reference structures, computes the mean first passage time between states and then constructs a hierarchy of states based on exchange rates. Used to generate input for neff. Based on Zhang, Bhatt, and Zuckerman; JCTC, DOI: 10.1021/ct1002384 and code provided by the Zuckerman Lab (http://www.ccbb.pitt.edu/Faculty/zuckerman/software.html)


neff

Computes effective sample size given an assignment and state file (from hierarchy). Based on Zhang, Bhatt, and Zuckerman; JCTC, DOI: 10.1021/ct1002384 and code provided by the Zuckerman Lab (http://www.ccbb.pitt.edu/Faculty/zuckerman/software.html)

qcoscon

Computes a "quick" cosine content using the entire trajectory for the top few modes, based on Hess, B. "Convergence of sampling in protein simulations." Phys Rev E (2002) 65(3):031910

sortfids

Sorts fiducials (from fidpick) based on a decreasing bin population.

ufidpick

Picks a set of fiducial structures from a trajectory using a uniform distribution. Reference: Lyman & Zuckerman, J Phys Chem B (2007) 111:12876-82
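
The cosine content reported by coscon and qcoscon can be illustrated with a minimal sketch. It assumes the projection of the trajectory onto a principal component is already available as one value per frame, and it samples the cosine at frame midpoints, which is an assumed discretization. The function name is hypothetical.

```python
import math

def cosine_content(projection, mode=1):
    """Cosine content of one principal-component time series.

    projection: projection of the trajectory onto one mode, one value per frame.
    mode: 1-based index k of the cosine with k half-periods over the window.
    """
    n = len(projection)
    overlap = sum(math.cos(mode * math.pi * (t + 0.5) / n) * p
                  for t, p in enumerate(projection))
    norm = sum(p * p for p in projection)
    return (2.0 / n) * overlap * overlap / norm

n_frames = 200
# A projection that is exactly a half-period cosine, and one that is a full period.
pure = [math.cos(math.pi * (t + 0.5) / n_frames) for t in range(n_frames)]
other = [math.cos(2 * math.pi * (t + 0.5) / n_frames) for t in range(n_frames)]
print(cosine_content(pure, 1))   # close to 1
print(cosine_content(other, 1))  # close to 0
```

A value near 1 for the leading modes suggests the motion resembles random diffusion, which points to poor sampling; values near 0 are more encouraging.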