Gastric Cancer Tissue Segmentation Dataset

Select Regions of Interest from WSI

Experienced pathologists select one ROI from each WSI, with the criterion that each selected ROI contains more than three tissue types.

They also mark which tissue types are present in each ROI.

Mark Every Region by Different Colors

For each ROI, the annotators delineate tissue boundaries with several irregular curves and label the tissue type of each delineated region based on the pathologists' marks.

The pathologists then review the annotations and either (i) accept them after minor pixel-level corrections or (ii) provide detailed feedback for re-annotation.

This dataset is proposed for tissue segmentation of gastric cancer.


  • It consists of 100 ROIs from the WSIs of 100 gastric cancer cases.

  • Six tissue types are annotated: tumor, lymphoid stroma, desmoplastic stroma, smooth muscle, necrosis, and others.

  • The original WSIs are derived from the TCGA database.


  • The dataset provided here is for research purposes only. Commercial uses are not allowed.

  • If you intend to publish a research work that uses any of these datasets, you must cite our publication.

Papers

Unsupervised Representation Learning for Tissue Segmentation in Histopathological Images: From Global to Local Contrast

Zeyu Gao, Chang Jia, Yang Li, Xianli Zhang, Bangyang Hong, Jialun Wu, Tieliang Gong, Chunbao Wang, Deyu Meng, Yefeng Zheng, and Chen Li

TMI, 2022.

Ground Truth Demo

Data Format

This dataset (100 ROIs) is divided into two subsets: 80 ROIs for training and 20 for testing. Each sample of this dataset is composed of two parts:

  1. The original ROIs (image patches) were selected from WSIs.

    • Saved as PNG files under the corresponding folder.

  2. The corresponding annotation of each ROI.

    • Saved as text files under the corresponding folder.

    • The annotation of each ROI is a pixel matrix with values in {-1, 1, 2, 3, 4, 5, 6}; -1 is equivalent to 6.

    • The values 1, 2, 3, 4, 5, and 6 represent tumor, lymphoid stroma, desmoplastic stroma, smooth muscle, necrosis, and others, respectively.

Download this dataset from: https://nextcloud.chenli.group/index.php/s/kDWNsAbycWLTosW
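The annotation format described above can be read with a short parser. The following is a minimal sketch, not the authors' loading code. It assumes that each text file stores one matrix row per line with values separated by whitespace or commas, which the description does not specify. The names `CLASS_NAMES`, `parse_annotation`, and `class_fractions` are hypothetical helpers introduced for illustration.

```python
from collections import Counter

# Label IDs as listed in the dataset description.
CLASS_NAMES = {
    1: "tumor",
    2: "lymphoid stroma",
    3: "desmoplastic stroma",
    4: "smooth muscle",
    5: "necrosis",
    6: "others",
}


def parse_annotation(text):
    """Parse a label matrix from the contents of one annotation file.

    Assumption: one row per line, values separated by whitespace or commas.
    The value -1 is remapped to 6 ("others"), as stated in the description.
    """
    rows = []
    for line in text.splitlines():
        tokens = line.replace(",", " ").split()
        if not tokens:
            continue  # skip blank lines
        # int(float(tok)) accepts values written as "1" or "1.0"
        values = [int(float(tok)) for tok in tokens]
        rows.append([6 if v == -1 else v for v in values])
    return rows


def class_fractions(mask):
    """Return the fraction of pixels per tissue type in a parsed mask."""
    counts = Counter(v for row in mask for v in row)
    total = sum(counts.values())
    return {CLASS_NAMES[k]: counts[k] / total for k in sorted(counts)}


if __name__ == "__main__":
    # Tiny synthetic example; real annotations are several thousand pixels wide.
    demo = "1 1 -1\n2 6 5\n"
    mask = parse_annotation(demo)
    print(mask)
    print(class_fractions(mask))
```

For real files, the text would be read with `open(path).read()` and passed to `parse_annotation`. Given the ROI sizes, an array library would be a more practical choice than nested lists, but the remapping logic stays the same.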

Statistics

  • The mean size of ROIs is 4655 × 5276 pixels.

  • For the annotation process:

    • The one-time acceptance rate for the annotated ROIs is 78%, and the remaining annotated ROIs are accepted after one revision.

    • The minor corrections made by the pathologists changed 8.4% of all pixels in total.
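To give a sense of scale, the figures above can be combined in a rough back-of-the-envelope calculation. This is only a sketch under stated assumptions: the product of the mean dimensions is used as an approximation of the mean ROI area (the two are not equal in general), and the 100-ROI count is taken from the dataset description. All variable names are illustrative.

```python
# Figures taken from the dataset description.
NUM_ROIS = 100
MEAN_SIDE_A, MEAN_SIDE_B = 4655, 5276  # mean ROI size in pixels
FIRST_PASS_RATE = 0.78                 # one-time acceptance rate
CORRECTED_FRACTION = 0.084             # share of pixels changed by minor corrections

# Approximation: product of mean dimensions stands in for mean area.
approx_pixels_per_roi = MEAN_SIDE_A * MEAN_SIDE_B
approx_total_pixels = approx_pixels_per_roi * NUM_ROIS

# ROIs accepted on the first review vs. after one revision.
first_pass_rois = round(NUM_ROIS * FIRST_PASS_RATE)
revised_rois = NUM_ROIS - first_pass_rois

# Rough number of pixels touched by the pathologists' minor corrections.
approx_corrected_pixels = round(approx_total_pixels * CORRECTED_FRACTION)

print(f"~{approx_pixels_per_roi / 1e6:.1f} million pixels per ROI")
print(f"~{approx_total_pixels / 1e9:.2f} billion annotated pixels in total")
print(f"{first_pass_rois} ROIs accepted on first review, {revised_rois} after one revision")
print(f"~{approx_corrected_pixels / 1e6:.0f} million pixels corrected")
```

These numbers are estimates derived from the reported means, not statistics published with the dataset.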