Data Descriptor

Whole-Slide Images and Patches of Clear Cell Renal Cell Carcinoma Tissue Sections Counterstained with Hoechst 33342, CD3, and CD8 Using Multiple Immunofluorescence

by Georg Wölflein 1,*,†, In Hwa Um 2,†, David J. Harrison 2,3 and Ognjen Arandjelović 1
1 School of Computer Science, University of St Andrews, North Haugh, St Andrews KY16 9SX, Scotland, UK
2 School of Medicine, University of St Andrews, North Haugh, St Andrews KY16 9TF, Scotland, UK
3 Division of Laboratory Medicine, Lothian NHS University Hospitals, Edinburgh EH16 6SA, Scotland, UK
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Submission received: 8 January 2023 / Revised: 7 February 2023 / Accepted: 10 February 2023 / Published: 15 February 2023

Abstract

In recent years, there has been an increased effort to digitise whole-slide images of cancer tissue. This effort has opened up a range of new avenues for the application of deep learning in oncology. One such avenue is virtual staining, where a deep learning model is tasked with reproducing the appearance of stained tissue sections, conditioned on a different, often less expensive, input stain. However, data to train such models in a supervised manner, where the input and output stains are aligned on the same tissue sections, are scarce. In this work, we introduce a dataset of ten whole-slide images of clear cell renal cell carcinoma tissue sections counterstained with Hoechst 33342, CD3, and CD8 using multiple immunofluorescence. We also provide a set of over 600,000 patches of size 256 × 256 pixels extracted from these images, together with cell segmentation masks, in a format amenable to training deep learning models. It is our hope that this dataset will be used to further the development of deep learning methods for digital pathology by serving as a dataset for comparing and benchmarking virtual staining models.

1. Summary

With approximately 13,300 new cases every year, kidney cancer is the seventh-most-common type of cancer in the U.K. [1]. Clear cell renal cell carcinoma (ccRCC), a subtype of kidney cancer, whose name is derived from the appearance of its tumour cells under the microscope, is by far the most prevalent [2,3]. Studying its highly heterogeneous and vascularised tumour microenvironment (TME) is important for improving our understanding of the disease and its progression [4].
An important technique in clinical oncology and cancer research is the process of immunostaining, which facilitates the visualisation of various proteins in the cells of cancer tissue using artificial colouration [5] to distinguish between different cell types. Immunostaining assists pathologists in diagnosing cancer and deciding on treatment options [6,7,8]. Multiple immunofluorescence (mIF) allows different proteins to be visualised simultaneously by the enzymatic reaction between fluorescent-coated tyramide and horseradish peroxidase (HRP) [6,9,10]. In this work, we employed mIF with three different fluorophores to decorate ccRCC tissue sections for Hoechst 33342, cluster of differentiation 3 (CD3), and cluster of differentiation 8 (CD8). The first is a widely used fluorescent counterstaining dye that highlights cell nuclei [11], while the other two highlight specific cell subtypes: CD3 identifies T lymphocytes, and CD8 marks cytotoxic T lymphocytes.
Digitising whole slide images (WSIs) of tumour tissue as gigapixel images (typically around 100,000 × 100,000 pixels in size) has become an increasingly common practice in the last decade, not only in research, but also clinical settings [12]. The contemporaneous advent of deep learning, which flourishes with the availability of large amounts of data, has sparked leaps in the computer vision community. These advancements, combined with the availability of digital pathology images, pave the way towards developing automated methods for WSI analysis. Potential applications vary from slide-level tasks such as patient risk stratification [13,14], to specific image tasks such as detecting cellular subtypes and their spatial distribution [15,16,17]. In this setting, deep learning not only has the potential to help reduce the workload of pathologists, but also to alleviate inter-observer bias, which is a common problem in pathology [18,19].
In an effort to facilitate deep learning research in digital pathology, we present a dataset of ten WSIs of ccRCC tissue, alongside the corresponding clinical data. The fact that our images contain three channels of information (Hoechst 33342, CD3, and CD8) makes our dataset particularly well-suited to the task of virtual staining [20], where a deep learning model is tasked with translating from one type of stain to another. In other words, given an image of stain A, the model should produce an image that appears as if the tissue section had instead been stained with another stain B. Our dataset, which is available in the BioImage Archive (http://www.ebi.ac.uk/bioimage-archive, accessed on 14 February 2023) under Accession Number S-BIAD605 [21], is presented in a manner that is suitable for training deep learning models by providing image patches and cell segmentation masks alongside the raw WSIs. Indeed, our dataset has already been used for training a modified generative adversarial network (GAN) [22,23] to convert Hoechst images to CD3 and CD8 [17]. Hoechst staining is significantly less expensive than CD3 and CD8 staining [17], so the ability to synthesise the latter from the former could also represent a significant cost saving.

2. Data Description

Our dataset consists of WSIs digitised from the tumour tissue of ten patients with ccRCC. The slides were sourced from the Pathology Archive in Lothian NHS (Ethics Reference 10/S1402/33). Using mIF, the slides were stained with Hoechst, CD3, and CD8 before being scanned at 20× objective magnification on a Zeiss Axioscan scanner, resulting in a dataset of ten WSIs, each with three channels (Hoechst, CD3, and CD8).
We present the slides in two different formats: as raw WSIs and as preprocessed non-overlapping image patches of size 256 × 256 pixels covering the entire tissue region of the WSI. Furthermore, we provide the associated patients’ clinical data in CSV format.

2.1. Raw Whole-Slide Images

We supply all ten WSIs in CZI format named according to the following convention: ICAIRDXXX_MCM2FITC_CD3CY3_CD8CY5_MCK750.czi, where XXX is the patient ID (referred to as the iCAIRD number in Section 2.3). As the naming convention suggests, the Hoechst intensities are captured in the FITC channel, CD3 in the CY3 channel, and CD8 in the CY5 channel. Figure 1 shows a low-resolution thumbnail of one of the WSIs.

2.2. Preprocessed Image Patches

The patches.tar.gz archive (70 GB) contains the image patches. There are ten folders, one for each WSI, named according to the same convention as the raw WSIs in Section 2.1. For each patch, we supply a JSON file containing the metadata of the patch and the paths to the various image files associated with that patch. The JSON files are named ICAIRDXXX_MCM2FITC_CD3CY3_CD8CY5_MCK750 [x=X, y=Y, w=256, h=256].json, where XXX is the patient ID and X, Y give the location of the patch’s top-left corner in the WSI’s pixel coordinates. Listing 1 shows the structure of the JSON file.
Listing 1. Structure of the JSON file accompanying each patch.
		{
		 "original_file": "ICAIRD1007_MCM2FITC_CD3CY3_CD8CY5_MCK750.czi",
		 "x": 51712,
		 "y": 51968,
		 "w": 256,
		 "h": 256,
		 "images": [
		 {
		  "file": "ICAIRD1007_MCM2FITC_CD3CY3_CD8CY5_MCK750 […].png",
		  "mode": "mask",
		  "channel": "CD3"
		 },
		 // …
		 ]
		}
		
In addition to the self-explanatory metadata fields referencing the original WSI file and patch coordinates, there is a field named images, which contains a list of image files associated with the patch. Each image file is described by a JSON object with the following fields: file, mode, and channel. The file field contains the name of the particular image file (located in the same folder as the JSON file itself). The mode field indicates the type of image file, which can be either mask (indicating a binary cell mask, i.e., a black and white image where white pixels represent the detected cells of a specific type) or raw (indicating a monochrome image with pixel intensities normalised according to Section 3.3.1). Table 1 lists the seven different image files associated with each patch alongside their respective mode and channel attributes. Each image is 256 × 256 pixels in size and supplied in PNG format.
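The JSON structure described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset's tooling: the function name `load_patch` is hypothetical, and it assumes that each JSON file on disk is valid JSON with exactly the fields shown in Listing 1.

```python
import json
from pathlib import Path


def load_patch(json_path):
    """Read one patch JSON file and index its images by (mode, channel).

    Hypothetical helper: assumes the fields shown in Listing 1.
    """
    json_path = Path(json_path)
    with json_path.open(encoding="utf-8") as f:
        meta = json.load(f)
    # Image files are located in the same folder as the JSON file itself.
    images = {
        (img["mode"], img["channel"]): json_path.parent / img["file"]
        for img in meta["images"]
    }
    return meta, images
```

With this index, the path of the CD3 mask of a patch would be `images[("mask", "CD3")]`, and the patch position in the WSI is available as `meta["x"]` and `meta["y"]`.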
In total, the dataset consists of 627,519 non-overlapping patches. At 20× magnification, each 256 × 256 pixel patch corresponds to a physical size of about 58 × 58 μm. Statistics on the representation of each cell type in the dataset are provided in Table 2.
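As a rough worked example, the figures above imply a pixel size and a total imaged area. The results are approximate because the 58 μm side length is itself a rounded value; the variable names are chosen here for illustration only.

```python
PATCH_SIDE_PX = 256
PATCH_SIDE_UM = 58.0   # approximate side length quoted in the text
NUM_PATCHES = 627_519

# Physical size of one pixel.
um_per_pixel = PATCH_SIDE_UM / PATCH_SIDE_PX

# Area covered by one patch, and by all patches together.
patch_area_um2 = PATCH_SIDE_UM ** 2
total_area_mm2 = NUM_PATCHES * patch_area_um2 / 1e6  # 1 mm^2 = 1e6 um^2

print(f"{um_per_pixel:.3f} um per pixel")
print(f"{patch_area_um2:.0f} um^2 per patch")
print(f"{total_area_mm2:.0f} mm^2 covered by all patches")
```

Under these assumptions, one pixel spans roughly 0.23 μm and the patches together cover on the order of 2100 mm² of tissue across the ten slides.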

2.3. Clinical Data

We provide a CSV file containing clinical data for the ten patients (clinical_data.csv, 571 B). The patients’ iCAIRD numbers were used as the identifiers and match up with the names of the WSIs in Section 2.1 and the patches in Section 2.2. Data include the gender, age at surgery, five-year recurrence, and number of disease-free months after surgery. We also include morphological features assessed by a pathologist, including tumour size, lymph node involvement, and tumour grade, amongst others (Table 3 provides a full list of the columns).
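The clinical data can be joined to the images through the iCAIRD number. The sketch below uses only the Python standard library. It assumes that the header row of the CSV file uses the column names listed in Table 3, which should be checked against the actual file, and the sample rows are made up for illustration.

```python
import csv
import io


def read_clinical_data(file_obj):
    """Return a dict mapping patient ID to that patient's row (a dict of strings).

    Assumes a header row with an "ICAIRD number" column, as listed in Table 3.
    """
    return {row["ICAIRD number"]: row for row in csv.DictReader(file_obj)}


# Made-up rows for illustration only; these are NOT values from the dataset.
sample = io.StringIO(
    "ICAIRD number,Gender,Response,Disease-free months\n"
    "ICAIRD_001,F,0,61.5\n"
    "ICAIRD_002,M,1,14.0\n"
)
patients = read_clinical_data(sample)

# "Response" is 1 if the disease recurred within five years after surgery.
recurred = [pid for pid, row in patients.items() if row["Response"] == "1"]
print(recurred)
```

For the real file, the same function could be called on `open("clinical_data.csv", newline="")`.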

3. Methods

3.1. Multiplex Immunofluorescence Protocol

The method of staining the slides and obtaining the WSIs was described in the work of Wölflein et al. [17], but we include it here for completeness. The Leica BOND RX automated immunostainer (Leica Microsystems, Milton Keynes, U.K.) was utilised to perform mIF. The sections were dewaxed at 72 °C using BOND dewax solution (Leica, AR9222) and rehydrated in absolute alcohol followed by deionised water. The sections were treated with BOND epitope retrieval 1 (ER1) buffer (Leica, AR9961) for 20 min at 100 °C to unmask the epitopes. The endogenous peroxidase was blocked with peroxide block (Leica, DS9800), followed by serum-free protein block (Agilent, x090930-2). The sections were incubated with the first primary antibody (CD8, Agilent, M710301-2, 1:400 dilution) for 40 min at room temperature, followed by anti-mouse HRP-conjugated secondary antibody (Agilent, K400111-2) for 40 min. Then, the CD8 antigen was visualised by Cy5-conjugated tyramide signal amplification (TSA) (Akoya Bioscience, NEL745001KT). Redundant antibodies, which were not covalently bound, were stripped off by ER1 buffer at 95 °C for 20 min. Then, the second primary antibody (CD3, Agilent, A045229-2, 1:400 dilution) was visualised by TSA Cy3, following the same steps as for the first antibody, from the peroxide block to the ER1 buffer stripping. Cell nuclei were counterstained with Hoechst 33342 (Thermo Fisher, H3570, 1:100), and the sections were mounted with ProLong Gold antifade mountant (Thermo Fisher, P36930).

3.2. Whole-Slide Image Acquisition

The fluorescence images were captured using a Zeiss Axio Scan Z1. We used three different fluorescent channels (Hoechst 33342, Cy3, and Cy5) simultaneously to capture individual channel images under 20× objective magnification, with respective exposure times of 10 ms, 20 ms, and 30 ms. Figure 2 shows the density curves of the three different channel intensities across the entire dataset.

3.3. Patch Processing

3.3.1. Intensity Normalisation

PNG files store pixels as 8-bit integers, which limits the dynamic range of the images. However, when examining the intensity histograms in Figure 3, we observed that most pixel luminance was concentrated at the lower end of the range. A naïve quantisation of the image to the range $[0, 255]$ would lose most of the important information, specifically the variation at the lower end. To address this, we applied a form of thresholding.
Each histogram in Figure 3 exhibits one main peak (disregarding the leftmost maximum at an intensity close to zero, corresponding to background pixels). Therefore, we found it sufficient to assume that the histogram follows a normal distribution $\mathcal{N}(\mu, \sigma^2)$, the parameters of which we obtained using maximum likelihood estimation. In practice, most of the important information is contained between the peak and three standard deviations to the right, i.e., in the range $[\mu, \mu + 3\sigma]$, indicated by the red lines in Figure 3. Eliminating intensities to the left of that peak ($x < \mu$) reduces the background noise. Moreover, pixels with high intensities ($x > \mu + 3\sigma$) are rare and can thus be discarded as well because they do not add much information. As a result, we transformed the intensities $x$ to the $[0, 1]$ range by the function:
$$f(x) = \min\left(1, \max\left(0, \frac{x - \mu}{3\sigma}\right)\right).$$
Note that we estimated the parameters μ and σ from the histograms of the entire WSIs and not on a per-patch basis, due to the high variance between the patches. Furthermore, the described intensity normalisation procedure was applied to each stain separately, as illustrated by the sample patch in Figure 4.
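The normalisation above can be sketched in NumPy as follows. This is an illustrative sketch rather than the pipeline's actual code: the function names are hypothetical, the way background pixels are excluded before fitting is not specified in the text (here the caller is assumed to pass only the relevant intensities), and the final rounding to 8-bit values is an assumption about how the result is written to PNG.

```python
import numpy as np


def fit_normal(intensities):
    """Maximum likelihood estimates (mu, sigma) of a normal distribution."""
    x = np.asarray(intensities, dtype=np.float64)
    # np.std uses ddof=0 by default, which is the maximum likelihood estimate.
    return x.mean(), x.std()


def normalise(x, mu, sigma):
    """Map intensities to [0, 1]: mu maps to 0 and mu + 3*sigma maps to 1."""
    x = np.asarray(x, dtype=np.float64)
    return np.clip((x - mu) / (3.0 * sigma), 0.0, 1.0)


def to_uint8(normalised):
    """Quantise values in [0, 1] to 8-bit integers (assumed storage step)."""
    return np.round(normalised * 255).astype(np.uint8)


# mu and sigma would be estimated once per WSI and channel, then reused
# for every patch of that WSI.
mu, sigma = 100.0, 50.0
patch = np.array([50.0, 100.0, 175.0, 250.0, 400.0])
print(normalise(patch, mu, sigma))
```

In this example, intensities at or below μ = 100 map to 0, the intensity 175 (halfway to μ + 3σ = 250) maps to 0.5, and anything at or above 250 saturates at 1.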

3.3.2. Nucleus Segmentation

As indicated in Section 2.2, we supply the normalised image patches of each of the three channels (Hoechst, CD3, and CD8). However, we also include masks for each of the three channels (see Table 1), which are generated by a nucleus segmentation algorithm. These masks can be used to evaluate the quality of virtual staining algorithms [17,20] or even directly train segmentation models.
Our approach to nucleus segmentation uses the Hoechst channel as the starting point, rather than segmenting cells directly on the CD3/CD8 channels, which are less reliable for this purpose. First, we segmented all nuclei in the Hoechst channel using the StarDist algorithm [27], a popular deep-learning-based nucleus segmentation method. We employed StarDist because it is able to produce plausible non-overlapping masks even in crowded areas where instance segmentation models such as Mask R-CNN [28] tend to generate blobs of multiple cells [27]. This is because StarDist represents cells as star-convex polygons, whereas instance segmentation models simply operate on a pixel level. Figure 4g depicts the result of StarDist with a probability threshold of 0.6 and no cell expansion, as we employed it in our pipeline. Following Hoechst cell segmentation, we applied a threshold on the CD3 channel to identify which nuclei in the Hoechst mask were CD3+ (Figure 4h). We repeated this process for the CD8 channel as well (Figure 4i). The entire nucleus segmentation pipeline (i.e., the aforementioned steps) was implemented as scripts using the QuPath software [29].
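The classification step can be illustrated with a small NumPy sketch. It is not the QuPath implementation used for the dataset: the function name is hypothetical, and the criterion shown here (mean channel intensity inside each nucleus compared against a threshold) is an assumption, since the text does not state which per-nucleus measurement was thresholded.

```python
import numpy as np


def classify_nuclei(labels, channel, threshold):
    """Binary mask of nuclei whose mean channel intensity exceeds threshold.

    labels:  integer array, 0 for background and 1..K for segmented nuclei
    channel: intensity array of the same shape (e.g. the CD3 channel)
    The mean-intensity criterion is an assumption made for illustration.
    """
    labels = np.asarray(labels)
    channel = np.asarray(channel, dtype=np.float64)
    n = int(labels.max()) + 1
    # Sum of intensities and pixel count for every label.
    sums = np.bincount(labels.ravel(), weights=channel.ravel(), minlength=n)
    counts = np.bincount(labels.ravel(), minlength=n)
    means = np.divide(sums, counts, out=np.zeros(n), where=counts > 0)
    positive = means > threshold
    positive[0] = False  # label 0 is background, never a cell
    return positive[labels]


labels = np.array([[0, 1, 1],
                   [0, 2, 2],
                   [0, 0, 0]])
cd3 = np.array([[0.9, 0.8, 0.6],
                [0.9, 0.1, 0.2],
                [0.0, 0.0, 0.0]])
cd3_mask = classify_nuclei(labels, cd3, threshold=0.5)
print(cd3_mask)
```

Here nucleus 1 has a mean intensity of 0.7 and is marked positive, while nucleus 2 (mean 0.15) is not. Because the CD8 mask is a subset of the CD3+ cells (Table 1), a CD8 mask could be obtained in the same way and then intersected with `cd3_mask`.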
Two factors impacted the quality of the masks. First, the Hoechst and CD3 stains may sometimes not align perfectly, which is evident in Figure 4e, where some of the high-intensity blobs do not match exactly with Figure 4d. This is because, while Hoechst stains the cell nuclei, CD3 is expressed only in a tiny part of a T cell’s cytoplasm. Analogous reasoning applies to CD8. The second factor is the thickness of the slides (4 μm), which causes some cells to be out of focus, as is evident from the varying intensity levels in Figure 4a–c. As a result of both of these factors, some CD3+ or CD8+ cells may mistakenly not be classified as such.

Author Contributions

Conceptualisation, G.W., I.H.U., D.J.H. and O.A.; methodology, G.W. and I.H.U.; software, G.W.; validation, G.W.; formal analysis, G.W.; investigation, I.H.U. and G.W.; resources, I.H.U. and D.J.H.; data curation, I.H.U. and G.W.; writing—original draft preparation, G.W.; writing—review and editing, I.H.U., D.J.H. and O.A.; visualisation, G.W.; supervision, O.A. and D.J.H.; project administration, O.A. and D.J.H.; funding acquisition, D.J.H. All authors have read and agreed to the published version of the manuscript.

Funding

G.W. is supported by Lothian NHS. This project received funding from the European Union’s Horizon 2020 research and innovation programme under Grant Agreement No. 101017453 as part of the KATY project. This work was supported in part by the Industrial Centre for AI Research in Digital Diagnostics (iCAIRD), which is funded by Innovate UK on behalf of UK Research and Innovation (UKRI) (Project Number 104690).

Institutional Review Board Statement

The work was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of NHS Lothian NRS BioResource, REC-approved Research Tissue Bank (REC Approval Ref. 13/ES/0126, 3 February 2015).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Acknowledgments

We would like to thank Craig Marshall, Lothian Biorepository, who granted access to the samples.

Conflicts of Interest

The funders had no role in the design of the study; in the collection, analyses, or interpretation of the data; in the writing of the manuscript; nor in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
ccRCC: clear cell renal cell carcinoma
TME: tumour microenvironment
mIF: multiplex immunofluorescence
IHC: immunohistochemistry
WSI: whole-slide image
GAN: generative adversarial network
CD3: cluster of differentiation 3
CD8: cluster of differentiation 8
TSA: tyramide signal amplification
HRP: horseradish peroxidase
JSON: JavaScript object notation
PNG: portable network graphics
CSV: comma-separated values

References

  1. Cancer Research UK. Kidney Cancer Statistics. Available online: https://www.cancerresearchuk.org/health-professional/cancer-statistics/statistics-by-cancer-type/kidney-cancer (accessed on 30 September 2022).
  2. Siegel, R.L.; Miller, K.D.; Fuchs, H.E.; Jemal, A. Cancer Statistics, 2021. CA Cancer J. Clin. 2021, 71, 7–33.
  3. Moch, H.; Cubilla, A.L.; Humphrey, P.A.; Reuter, V.E.; Ulbright, T.M. The 2016 WHO Classification of Tumours of the Urinary System and Male Genital Organs—Part A: Renal, Penile, and Testicular Tumours. Eur. Urol. 2016, 70, 93–105.
  4. De Filippis, R.; Wölflein, G.; Um, I.H.; Caie, P.D.; Warren, S.; White, A.; Suen, E.; To, E.; Arandjelović, O.; Harrison, D.J. Use of high-plex data reveals novel insights into the tumour microenvironment of clear cell renal cell carcinoma. Cancers 2022, 14, 5387.
  5. Coons, A.H.; Creech, H.J.; Jones, R.N.; Berliner, E. The Demonstration of Pneumococcal Antigen in Tissues by the Use of Fluorescent Antibody. J. Immunol. 1942, 45, 159–170.
  6. Kalyuzhny, A.E. Immunohistochemistry—Essential Elements and Beyond; Springer: Berlin/Heidelberg, Germany, 2016.
  7. Goldstein, N.S.; Hewitt, S.M.; Taylor, C.R.; Yaziji, H.; Hicks, D.G.; Members of Ad-Hoc Committee on Immunohistochemistry Standardization. Recommendations for Improved Standardization of Immunohistochemistry. Appl. Immunohistochem. Mol. Morphol. 2007, 15, 124–133.
  8. Donaldson, J.G. Immunofluorescence Staining. Curr. Protoc. Cell Biol. 2015, 69, 3–4.
  9. Zaidi, A.U.; Enomoto, H.; Milbrandt, J.; Roth, K.A. Dual Fluorescent in Situ Hybridization and Immunohistochemical Detection with Tyramide Signal Amplification. J. Histochem. Cytochem. 2000, 48, 1369–1375.
  10. Buchwalow, I.B.; Böcker, W. Immunohistochemistry: Basics and Methods; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2010.
  11. Chazotte, B. Labeling Nuclear DNA with Hoechst 33342. Cold Spring Harb. Protoc. 2011, 2011, pdb-prot5557.
  12. Caie, P.D.; Dimitriou, N.; Arandjelović, O. Precision Medicine in Digital Pathology via Image Analysis and Machine Learning. In Artificial Intelligence and Deep Learning in Pathology; Elsevier: Amsterdam, The Netherlands, 2021; pp. 149–173.
  13. Kather, J.N.; Krisam, J.; Charoentong, P.; Luedde, T.; Herpel, E.; Weis, C.A.; Gaiser, T.; Marx, A.; Valous, N.A.; Ferber, D.; et al. Predicting survival from colorectal cancer histology slides using deep learning: A retrospective multicenter study. PLoS Med. 2019, 16, e1002730.
  14. Yao, J.; Zhu, X.; Huang, J. Deep multi-instance learning for survival prediction from whole-slide images. In Proceedings of the Medical Image Computing and Computer Assisted Intervention–MICCAI 2019: 22nd International Conference, Shenzhen, China, 13–17 October 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 496–504.
  15. Abousamra, S.; Gupta, R.; Hou, L.; Batiste, R.; Zhao, T.; Shankar, A.; Rao, A.; Chen, C.; Samaras, D.; Kurc, T.; et al. Deep learning-based mapping of tumor infiltrating lymphocytes in whole-slide images of 23 types of cancer. Front. Oncol. 2022, 11, 5971.
  16. Cooper, J.; Um, I.H.; Arandjelović, O.; Harrison, D.J. Lymphocyte Classification from Hoechst Stained Slides with Deep Learning. Cancers 2022, 14, 5957.
  17. Wölflein, G.; Um, I.H.; Harrison, D.J.; Arandjelović, O. HoechstGAN: Virtual Lymphocyte Staining Using Generative Adversarial Networks. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 2–8 January 2023; pp. 4997–5007.
  18. Warren, A.Y.; Harrison, D. WHO/ISUP classification, grading and pathological staging of renal cell carcinoma: Standards and controversies. World J. Urol. 2018, 36, 1913–1926.
  19. Bektas, S.; Bahadir, B.; Kandemir, N.O.; Barut, F.; Gul, A.E.; Ozdamar, S.O. Intraobserver and interobserver variability of Fuhrman and modified Fuhrman grading systems for conventional renal cell carcinoma. Kaohsiung J. Med Sci. 2009, 25, 596–600.
  20. Mulrane, L.; Rexhepaj, E.; Penney, S.; Callanan, J.J.; Gallagher, W.M. Automated image analysis in histopathology: A valuable tool in medical diagnostics. Expert Rev. Mol. Diagn. 2008, 8, 707–725.
  21. Wölflein, G.; Um, I.H.; Harrison, D.J.; Arandjelović, O. Whole Slide Images and Patches of Clear Cell Renal Cell Carcinoma Counterstained with Multiple Immunofluorescence for Hoechst, CD3, and CD8. 2022. Available online: https://www.ebi.ac.uk/biostudies/bioimages/studies/S-BIAD605 (accessed on 6 February 2023).
  22. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative Adversarial Nets. In Proceedings of the Advances in Neural Information Processing Systems; Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., Weinberger, K., Eds.; Curran Associates, Inc.: New York, NY, USA, 2014; Volume 27.
  23. Isola, P.; Zhu, J.Y.; Zhou, T.; Efros, A.A. Image-to-image translation with conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 1125–1134.
  24. Fuhrman, S.A.; Lasky, L.C.; Limas, C. Prognostic significance of morphologic parameters in renal cell carcinoma. Am. J. Surg. Pathol. 1982, 6, 655–663.
  25. Amin, M.B.; Greene, F.L.; Edge, S.B.; Compton, C.C.; Gershenwald, J.E.; Brookland, R.K.; Meyer, L.; Gress, D.M.; Byrd, D.R.; Winchester, D.P. The eighth edition AJCC cancer staging manual: Continuing to build a bridge from a population-based to a more “personalized” approach to cancer staging. CA Cancer J. Clin. 2017, 67, 93–99.
  26. Leibovich, B.C.; Blute, M.L.; Cheville, J.C.; Lohse, C.M.; Frank, I.; Kwon, E.D.; Weaver, A.L.; Parker, A.S.; Zincke, H. Prediction of progression after radical nephrectomy for patients with clear cell renal cell carcinoma: A stratification tool for prospective clinical trials. Cancer Interdiscip. Int. J. Am. Cancer Soc. 2003, 97, 1663–1671.
  27. Schmidt, U.; Weigert, M.; Broaddus, C.; Myers, G. Cell Detection with Star-Convex Polygons. In Proceedings of the Medical Image Computing and Computer Assisted Intervention—MICCAI 2018—21st International Conference, Granada, Spain, 16–20 September 2018; pp. 265–273.
  28. He, K.; Gkioxari, G.; Dollár, P.; Girshick, R. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2961–2969.
  29. Bankhead, P.; Loughrey, M.B.; Fernández, J.A.; Dombrowski, Y.; McArt, D.G.; Dunne, P.D.; McQuaid, S.; Gray, R.T.; Murray, L.J.; Coleman, H.G.; et al. QuPath: Open source software for digital pathology image analysis. Sci. Rep. 2017, 7, 16878.
Figure 1. Thumbnail image of one of the WSIs in the dataset, displaying the Hoechst channel in blue, CD3 in yellow, and CD8 in red. Note that the individual cells are too small to be identified at the low resolution of this image.
Figure 2. Intensity histograms of all 10 WSIs in the dataset (each WSI corresponds to a differently coloured line).
Figure 3. Intensity histograms (left axes) and fit normal distributions (right axes) of a sample WSI’s Hoechst and CD3 channels. The CD8 histograms behave similarly.
Figure 4. A 256 × 256 pixel patch extracted from the WSI in Figure 1, showing raw and normalised intensities for Hoechst, CD3, and CD8, as well as masks for different cell types. CD8+ cells are a subset of CD3+ cells because CD3 highlights all T cells, whereas CD8 binds only to cytotoxic T cells. (a) Hoechst. (b) CD3. (c) CD8. (d) normalised Hoechst. (e) normalised CD3. (f) normalised CD8. (g) StarDist [27] cell mask. (h) CD3+ cells. (i) CD8+ cells.
Table 1. Types of image files associated with each patch, alongside their respective mode and channel attributes.
| Mode | Channel | Description |
|---|---|---|
| raw | H3342 | normalised Hoechst patch |
| raw | Cy3 | normalised CD3 patch |
| raw | Cy5 | normalised CD8 patch |
| mask | Hoechst | segmentation mask of all detected cells |
| mask | CD3 | segmentation mask of CD3+ cells (subset of Hoechst cells) |
| mask | CD8 | segmentation mask of CD8+ cells (subset of CD3+ cells) |
| mask | unclassified | segmentation mask of CD3- cells (subset of Hoechst cells) |
Table 2. Representation of cell subtypes across the dataset. Presence refers to the percentage of patches that contain at least one cell of the respective subtype. Area coverage means the percentage of pixels that are occupied by each cell subtype.
| | Hoechst | CD3 | CD8 |
|---|---|---|---|
| Total cells | 15,956,049 | 3,390,533 | 1,894,016 |
| Cells per patch | 25.42 | 5.40 | 3.02 |
| Presence | 99.95% | 93.08% | 71.61% |
| Area coverage | 26.48% | 5.01% | 3.02% |
Table 3. Columns in the clinical data table. Note that the “Disease-free months” column indicates a lower bound, as some patients may have experienced recurrence after the period of data collection.
| Column Name | Format | Description |
|---|---|---|
| ICAIRD number | ICAIRD_XXX | patient ID |
| Gender | M or F | gender |
| Response | 0 or 1 | recurrence within 5 years after surgery |
| Age at surgery | whole number | age at surgery in years |
| Disease-free months | float | number of months with no recurrence |
| Fuhrman nuclear grade | 1–4 | Fuhrman grade [24] |
| ISUP nuclear grade | 1–4 | ISUP grade [3] |
| Tumour stage | 1a, 1b, 2a, 2b, 3a, 3b, 3c, or 4 | tumour size according to TNM system [25] |
| Tumour size | float | tumour size in cm |
| Node status | 0 or 1 | lymph node status according to TNM system [25] |
| Necrosis | 0 or 1 | whether necrosis is detected |
| Leibovich score (Fuhrman) | 0–11 | Leibovich score [26] using Fuhrman nuclear grade [24] |
| Leibovich score (ISUP) | 0–11 | Leibovich score [26] using ISUP nuclear grade [3] |

Wölflein, G.; Um, I.H.; Harrison, D.J.; Arandjelović, O. Whole-Slide Images and Patches of Clear Cell Renal Cell Carcinoma Tissue Sections Counterstained with Hoechst 33342, CD3, and CD8 Using Multiple Immunofluorescence. Data 2023, 8, 40. https://doi.org/10.3390/data8020040
