Why protein R-factors are so large: a self-consistent analysis.

Overview

The R-factor and R-free are commonly used to measure the quality of protein models obtained by X-ray crystallography. Well-refined protein structures usually have R-factors in the range of 20-25%, whereas intrinsic errors in the experimental data are usually around 5%. We use molecular dynamics simulations to perform a self-consistent analysis that identifies the major factors contributing to the large values of protein R-factors. The analysis shows that significant R-factor values can arise from two sources: the use of isotropic B-factors to model anisotropic protein motions, and coordinate errors. Even in the absence of coordinate errors, the use of isotropic B-factors can produce R-factors of around 10%; for coordinate errors smaller than 0.2 Å, the two error types make similar contributions. Inaccuracy of the energy function and multistate protein dynamics are unlikely to contribute significantly to the large R-factors.
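
For readers less familiar with the quantity, the conventional crystallographic R-factor is the sum of absolute differences between observed and scaled calculated structure-factor amplitudes, divided by the sum of the observed amplitudes; R-free is the same quantity evaluated only on reflections held out of refinement. The following is a minimal sketch assuming NumPy. The function name `r_factor` and the least-squares choice of scale factor are illustrative assumptions and are not taken from the analysis described here.

```python
import numpy as np


def r_factor(f_obs, f_calc, scale=None):
    """Conventional R = sum| |Fobs| - k|Fcalc| | / sum|Fobs|.

    f_obs, f_calc: structure factors for the same set of reflections
                   (amplitudes, or complex values whose modulus is taken).
    scale:         overall scale factor k; if None, a least-squares scale
                   is used (an assumption made for this sketch).
    """
    f_obs = np.abs(np.asarray(f_obs))
    f_calc = np.abs(np.asarray(f_calc))
    if scale is None:
        # k minimizing sum (|Fobs| - k|Fcalc|)^2
        scale = np.sum(f_obs * f_calc) / np.sum(f_calc ** 2)
    return np.sum(np.abs(f_obs - scale * f_calc)) / np.sum(f_obs)


# Hypothetical amplitudes, for illustration only.
print(r_factor([10.0, 20.0], [9.0, 22.0], scale=1.0))
```

Passing only the held-out (test-set) reflections to the same function gives the corresponding R-free under these assumptions.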