Common data model for neuroscience data and data model exchange.
Academic Article
Abstract
OBJECTIVE: Generalizing the data models underlying two prototype neurophysiology databases, the authors describe and propose the Common Data Model (CDM) as a framework for federating a broad spectrum of disparate neuroscience information resources.

DESIGN: Each component of the CDM derives from one of five superclasses (data, site, method, model, and reference) or from relations defined between them. A hierarchic attribute-value scheme for metadata enables interoperability, with variable tree depth to serve specific intra-domain or broad inter-domain queries. To mediate data exchange between disparate systems, the authors propose a set of XML-derived schemas for describing not only data sets but also data models. These include biophysical description markup language (BDML), which mediates interoperability between data resources by providing a meta-description for the CDM.

RESULTS: The set of superclasses potentially spans the data needs of contemporary neuroscience. Data elements abstracted from neurophysiology time series and histogram data represent data sets that differ in dimension and concordance. Site elements transcend neurons to describe subcellular compartments, circuits, regions, or slices; non-neuroanatomic sites range from sequences to patients. Methods and models are highly domain-dependent.

CONCLUSIONS: True federation of data resources requires explicit public description, in a metalanguage, of the contents, query methods, data formats, and data models of each data resource. Any data model that can be derived from the defined superclasses is potentially conformant, and interoperability can be enabled by recognition of BDML-described compatibilities. Such metadescriptions can buffer technologic changes.
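The design described in the abstract can be illustrated with a minimal sketch. The abstract names the five superclasses and the hierarchic attribute-value scheme but gives no concrete API, so every class name, field, metadata key, and the `has_path` helper below is a hypothetical choice made for illustration, not part of the CDM or BDML. The sketch shows how a shallow attribute path might serve a broad inter-domain query while a deeper path serves a specific intra-domain one.

```python
from dataclasses import dataclass, field


@dataclass
class CDMElement:
    """Hypothetical base for the five superclasses named in the abstract."""
    name: str
    # Hierarchic attribute-value metadata, modeled here as nested dicts (an assumption).
    metadata: dict = field(default_factory=dict)


# The five superclasses; the bodies are empty because the abstract gives no fields.
class Data(CDMElement): pass
class Site(CDMElement): pass
class Method(CDMElement): pass
class Model(CDMElement): pass
class Reference(CDMElement): pass


def has_path(tree, path):
    """Return True if the nested attribute tree contains every key along `path`."""
    node = tree
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return False
        node = node[key]
    return True


# Illustrative records only; the names and attribute values are invented.
sites = [
    Site("cell-01", {"site": {"neuron": {"type": "pyramidal"}}}),
    Site("slice-07", {"site": {"slice": {"region": "hippocampus"}}}),
]

# Shallow path: a broad query that matches any kind of site.
broad = [s.name for s in sites if has_path(s.metadata, ["site"])]

# Deeper path: a specific query that matches only neuron records with a type.
specific = [s.name for s in sites if has_path(s.metadata, ["site", "neuron", "type"])]
```

Here `broad` contains both records and `specific` contains only the neuron, which mirrors the idea that query specificity tracks the depth of the attribute tree being consulted.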