DISPARE: DIScriminative PAttern REfinement for Position Weight Matrices.

Overview

abstract

BACKGROUND: The accurate determination of transcription factor binding affinities is an important problem in biology and key to understanding the gene regulation process. Position weight matrices are commonly used to represent the binding properties of transcription factor binding sites but suffer from low information content and a large number of false matches in the genome. We describe a novel algorithm for the refinement of position weight matrices representing transcription factor binding sites based on experimental data, including ChIP-chip analyses. We present an iterative weight matrix optimization method that is more accurate in distinguishing true transcription factor binding sites from a negative control set. The initial position weight matrix comes from JASPAR, TRANSFAC or other sources. The main new features are the discriminative nature of the method and matrix width and length optimization.

RESULTS: The algorithm was applied to the increasing collection of known transcription factor binding sites obtained from ChIP-chip experiments. The results show that our algorithm significantly improves the sensitivity and specificity of matrix models for identifying transcription factor binding sites.

CONCLUSION: When the transcription factor is known, it is more appropriate to use a discriminative approach such as the one presented here to derive its transcription factor-DNA binding properties starting with a matrix, as opposed to performing de novo motif discovery. Generating more accurate position weight matrices will ultimately contribute to a better understanding of eukaryotic transcriptional regulation, and could potentially offer a better alternative to ab initio motif discovery.

authors

da Piedade, Isabelle

Tang, Man-Hung Eric

Elemento, Olivier

publication date

November 26, 2009

published in

BMC Bioinformatics

Research

keywords

Computational Biology
Software

Identity

PubMed Central ID

PMC2788558

Scopus Document Identifier

71549150896

Digital Object Identifier (DOI)

10.1186/1471-2105-10-388

PubMed ID

19941641

Additional Document Info

has global citation frequency

6

volume

10
