Comparison and optimization of machine learning methods for automated classification of circulating tumor cells.

Overview

abstract

Advances in rare cell capture technology have made possible the interrogation of circulating tumor cells (CTCs) captured from whole patient blood. However, locating captured cells in the device by manual counting bottlenecks data processing by being tedious (hours per sample) and compromises the results by being inconsistent and prone to user bias. Some recent work has been done to automate the cell location and classification process to address these problems, employing image processing and machine learning (ML) algorithms to locate and classify cells in fluorescent microscope images. However, the type of machine learning method used is a part of the design space that has not been thoroughly explored. Thus, we have trained four ML algorithms on three different datasets. The trained ML algorithms locate and classify thousands of possible cells in a few minutes rather than a few hours, representing an order of magnitude increase in processing speed. Furthermore, some algorithms have a significantly (P < 0.05) higher area under the receiver operating characteristic curve than do other algorithms. Additionally, significant (P < 0.05) losses to performance occur when training on cell lines and testing on CTCs (and vice versa), indicating the need to train on a system that is representative of future unlabeled data. Optimal algorithm selection depends on the peculiarities of the individual dataset, indicating the need for a careful comparison and optimization of algorithms for individual image classification tasks. © 2016 International Society for Advancement of Cytometry.
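The abstract compares algorithms by area under the receiver operating characteristic curve (AUC). As a rough illustration of what that metric measures, the sketch below computes AUC in its pairwise form: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one. This is not the authors' code. The function name `roc_auc`, the two classifiers, and all labels and scores are hypothetical, and the significance testing (P < 0.05) mentioned in the abstract is not reproduced here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive example receives
    a higher score than a randomly chosen negative one; ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores from two classifiers on the same six candidate objects
# (1 = cell of interest, 0 = other). The numbers are made up for illustration.
labels = [1, 1, 1, 0, 0, 0]
scores_a = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
scores_b = [0.9, 0.3, 0.4, 0.5, 0.6, 0.1]

auc_a = roc_auc(labels, scores_a)  # 8 of 9 positive/negative pairs ranked correctly
auc_b = roc_auc(labels, scores_b)  # 5 of 9 pairs ranked correctly

print(f"classifier A AUC: {auc_a:.3f}")
print(f"classifier B AUC: {auc_b:.3f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration; a rank-based computation would be the usual choice for datasets with thousands of candidate cells.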

authors

Lannin, Timothy B

Thege, Fredrik I

Kirby, Brian James

publication date

October 18, 2016

published in

Cytometry. Part A: the journal of the International Society for Analytical Cytology

Research

keywords

Image Processing, Computer-Assisted
Neoplastic Cells, Circulating
Pattern Recognition, Automated

Identity

Scopus Document Identifier

84991738363

Digital Object Identifier (DOI)

10.1002/cyto.a.22993

PubMed ID

27754580

Additional Document Info

has global citation frequency

25

volume

89

issue

10
