Melanoma Cancer Cell Dataset

The dataset

Controle_P1_20x_Ph_menor.mov

Melanoma B16F10 cells

We present a new open multitemporal image dataset: the Melanoma Cancer Cell (MCC) dataset. It supports the study of cancer cell migration and of the anti-migration effect promoted by specific drugs. Sequences are classified as treated or untreated cells, which makes it possible to characterize the phenotypic and morphological effects of a drug. This in turn helps elucidate some intrinsic biological mechanisms of cancer cells, particularly tissue invasion and metastasis formation.

This dataset covers two conditions of long-term culture of metastatic murine melanoma B16F10 cells in Roswell Park Memorial Institute (RPMI) medium (supplemented with 10% Fetal Bovine Serum (FBS), Streptomycin 10 mg/mL and Penicillin 10,000 Units/mL). First, B16F10 cells were plated (5x10^4 cells/mL) in a 35 mm polystyrene dish and, after 24 h, exposed to hydroxyurea (30 mM) or to medium only (control group). The cells were then placed in a BioStation IM-Q inverted microscope (Nikon), and images from 69 fields were acquired over 24 hours by a high-sensitivity cooled charge-coupled device (CCD) camera (40x objective). The final database consists of 69 image sequences of 95 frames each, with a spatial resolution of 640x480 pixels and a duration of one minute.

Images

Example of cells from the melanoma cancer cell dataset. Two example cells are marked with a bold black square around their nucleoid. On the right is a zoomed view of one of them.

The melanoma cancer cell dataset is composed of 69 image sequences of control melanoma cells and 69 image sequences of cells treated with hydroxyurea. On the left, we see the evolution of untreated melanoma cancer cells over time. On the right, we see the cells treated with hydroxyurea. It is easy to see how the number of cells increases without any treatment.


Results

To evaluate the results of our experiments, we applied a 5x2-fold protocol. It consists of randomly splitting the MCC dataset five times into two folds, balanced by class. In each repetition, the training and testing sets were swapped, and consequently five analyses were conducted for every model employed.
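The splitting step of the protocol can be sketched in plain Python as follows. This is a minimal illustration, not the code used in the paper: the function name `five_by_two_splits`, the seed, and the label strings are hypothetical, and the 69/69 class sizes are taken from the figure caption above.

```python
import random
from collections import defaultdict


def five_by_two_splits(labels, n_repeats=5, seed=0):
    """Yield (repeat, train_indices, test_indices) for a class-balanced 5x2-fold protocol.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)

    # Group sample indices by class so each fold keeps the class balance.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    for repeat in range(n_repeats):
        fold_a, fold_b = [], []
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)
            half = len(shuffled) // 2
            fold_a.extend(shuffled[:half])
            fold_b.extend(shuffled[half:])

        # Each repetition produces two runs: the folds swap roles.
        yield repeat, sorted(fold_a), sorted(fold_b)
        yield repeat, sorted(fold_b), sorted(fold_a)


# Assumed labels: 69 control sequences and 69 hydroxyurea-treated sequences.
labels = ["control"] * 69 + ["treated"] * 69
splits = list(five_by_two_splits(labels))
```

Each of the five repetitions contributes two train/test runs, so `splits` holds ten runs in total, grouped in pairs by repetition.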

The baseline uses three handcrafted features and deep features from ResNet-50.

We also used our framework Features As Spatiotemporal Tensors (FASTensor). If you want to learn more about this method, please see the reference below.

We used SVMs as the inference model for the classification tasks and compared the results with the baselines using the accuracy metric. The feature extraction modules in this work were implemented using the scikit-image (skimage) library, while the SVM and the validation procedure were coded using the scikit-learn (sklearn) library. The core of the FASTensor approach uses the NumPy and SciPy libraries.
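As a rough sketch of how an SVM can be evaluated with accuracy under a 5x2-fold scheme in scikit-learn, consider the example below. Everything specific in it is an assumption made for illustration: the features are synthetic random vectors rather than FASTensor or baseline descriptors, and the RBF kernel, the feature scaling step, the feature dimension, and the random seeds are not taken from the paper.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in features (assumption): 138 sequences, 16 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(138, 16))
y = np.repeat([0, 1], 69)  # 0 = control, 1 = treated (assumed encoding)
X[y == 1] += 1.0           # shift one class so the toy problem is learnable

# Five repetitions of a stratified 2-fold split, i.e. ten train/test runs.
cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=5, random_state=0)

scores = []
for train_idx, test_idx in cv.split(X, y):
    # Kernel and scaling are illustrative choices, not the paper's settings.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    model.fit(X[train_idx], y[train_idx])
    predictions = model.predict(X[test_idx])
    scores.append(accuracy_score(y[test_idx], predictions))

mean_accuracy = float(np.mean(scores))
std_accuracy = float(np.std(scores))
```

To evaluate real descriptors, replace `X` with one feature vector per image sequence and `y` with the corresponding treated/untreated labels; the mean and standard deviation over the ten runs can then be compared across models.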

Reference

If you want to use this dataset, please contact virginia@teiacoltec.org and refer to the paper below:

  • MOTA, V. F.; OLIVEIRA, H.; SCALZO, S.; DITTZ, D.; SANTOS, R. J.; SANTOS, J. A.; ARAÚJO, A. A. From video pornography to cancer cells: a tensor framework for spatiotemporal description. Multimedia Tools and Applications, v. 1, p. 1-31, 2020.