Comparison and optimization of machine learning methods for automated classification of circulating tumor cells.

Academic Article

Overview

abstract

  • Advances in rare cell capture technology have made possible the interrogation of circulating tumor cells (CTCs) captured from whole patient blood. However, locating captured cells in the device by manual counting is tedious (hours per sample), which bottlenecks data processing, and is inconsistent and prone to user bias, which compromises the results. Some recent work has automated the cell location and classification process to address these problems, employing image processing and machine learning (ML) algorithms to locate and classify cells in fluorescent microscope images. However, the type of machine learning method used is a part of the design space that has not been thoroughly explored. Thus, we have trained four ML algorithms on three different datasets. The trained ML algorithms locate and classify thousands of possible cells in a few minutes rather than a few hours, representing an order-of-magnitude increase in processing speed. Furthermore, some algorithms have a significantly (P < 0.05) higher area under the receiver operating characteristic curve than do other algorithms. Additionally, significant (P < 0.05) losses in performance occur when training on cell lines and testing on CTCs (and vice versa), indicating the need to train on a system that is representative of future unlabeled data. Optimal algorithm selection depends on the peculiarities of the individual dataset, indicating the need for a careful comparison and optimization of algorithms for individual image classification tasks. © 2016 International Society for Advancement of Cytometry.
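
The abstract compares algorithms by area under the receiver operating characteristic curve (AUC). As a minimal sketch of that metric only, and not the authors' pipeline, the Python below computes AUC from labels and classifier scores using the standard rank-based identity (the fraction of positive/negative pairs ranked correctly). The function name `roc_auc` and all labels and scores are hypothetical illustration values, not data from the paper, and the significance testing (P < 0.05) mentioned in the abstract is not shown.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking identity.

    labels: iterable of 0/1 (1 = object of interest, e.g. a true cell).
    scores: classifier outputs, higher meaning "more likely positive".
    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores from two classifiers on the same six candidate objects.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
scores_b = [0.9, 0.3, 0.4, 0.5, 0.6, 0.1]

auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
print(f"classifier A AUC = {auc_a:.3f}, classifier B AUC = {auc_b:.3f}")
```

In this toy case classifier A ranks 8 of the 9 positive/negative pairs correctly and classifier B ranks 5 of 9. To probe the train/test mismatch the abstract describes, the same function could be applied to scores from a model trained on one dataset and evaluated on another.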

publication date

  • October 18, 2016

Research

keywords

  • Image Processing, Computer-Assisted
  • Neoplastic Cells, Circulating
  • Pattern Recognition, Automated

Identity

Scopus Document Identifier

  • 84991738363

Digital Object Identifier (DOI)

  • 10.1002/cyto.a.22993

PubMed ID

  • 27754580

Additional Document Info

volume

  • 89

issue

  • 10