{"id":422,"date":"2023-03-14T18:39:59","date_gmt":"2023-03-14T18:39:59","guid":{"rendered":"https:\/\/todaysainews.com\/index.php\/2023\/03\/14\/a-case-study-of-feature-discovery-and-validation-in-pathology-google-ai-blog\/"},"modified":"2025-04-27T07:33:58","modified_gmt":"2025-04-27T07:33:58","slug":"a-case-study-of-feature-discovery-and-validation-in-pathology-google-ai-blog","status":"publish","type":"post","link":"https:\/\/todaysainews.com\/index.php\/2023\/03\/14\/a-case-study-of-feature-discovery-and-validation-in-pathology-google-ai-blog\/","title":{"rendered":"a case study of feature discovery and validation in pathology \u2013 Google AI Blog"},"content":{"rendered":"<div id=\"post-body-451009585245785746\">\n<span class=\"byline-author\">Posted by Ellery Wulczyn and Yun Liu, Google Research<\/span><\/p>\n<p><img decoding=\"async\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEj3EL99qmeogEZryesEPlH-pp_akdzLfek1zinRdpoTW1tsebb-uRCQoKxrKLPSBKxvjxdyP5Eh8t43CIcOR2yaj7PZPUifUMEN-2eDXXIUMB2xHOs-2LIuzRN-LT6TkupmMVZpHfTnWrX6xjOQxSiPuSzw-l_dgH-mBLPEaSGLG_RuhCRCQslU_X1uoA\/s2200\/TAF.png\" style=\"display: none;\"\/><\/p>\n<p>\nWhen a patient is diagnosed with cancer, one of the most important steps is examination of the tumor under a microscope by pathologists to determine the cancer <a href=\"https:\/\/www.cancer.gov\/about-cancer\/diagnosis-staging\/staging\">stage<\/a> and to characterize the tumor. This information is central to understanding clinical prognosis (i.e., likely patient outcomes) and for determining the most appropriate treatment, such as undergoing surgery alone versus surgery plus chemotherapy. 
Developing machine learning (ML) tools in pathology to assist with the microscopic review represents a compelling research area with many potential applications.\n<\/p>\n<p><a name=\"more\"\/><\/p>\n<p>\nPrevious studies have shown that ML can accurately <a href=\"https:\/\/ai.googleblog.com\/2018\/10\/applying-deep-learning-to-metastatic.html\">identify<\/a> and <a href=\"https:\/\/ai.googleblog.com\/2018\/11\/improved-grading-of-prostate-cancer.html\">classify<\/a> tumors in pathology images and can even predict patient <a href=\"https:\/\/www.nature.com\/articles\/s43856-021-00005-3\">prognosis<\/a> using <em>known<\/em> pathology features, such as the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Gleason_grading_system\">degree to which gland appearances deviate from normal<\/a>. While these efforts focus on using ML to detect or quantify known features, alternative approaches offer the potential to identify <em>novel<\/em> features. The discovery of new features could in turn further improve cancer prognostication and treatment decisions for patients by extracting information that isn\u2019t yet considered in current workflows.\n<\/p>\n<p>\nToday, we\u2019d like to share progress we\u2019ve made over the past few years towards identifying novel features for colorectal cancer in collaboration with teams at the <a href=\"https:\/\/www.medunigraz.at\/en\/\">Medical University of Graz<\/a> in Austria and the <a href=\"https:\/\/en.unimib.it\/medicine-and-surgery\">University of Milano-Bicocca<\/a> (UNIMIB) in Italy. Below, we will cover several stages of the work: (1) training a model to predict prognosis from pathology images without specifying the features to use, so that it can learn what features are important; (2) probing that prognostic model using explainability techniques; and (3) identifying a novel feature and validating its association with patient prognosis. 
We describe this feature and evaluate its use by pathologists in our recently published paper, \u201c<a href=\"https:\/\/jamanetwork.com\/journals\/jamanetworkopen\/fullarticle\/2802392\">Pathologist validation of a machine-learned feature for colon cancer risk stratification<\/a>\u201d. To our knowledge, this is the first demonstration that medical experts can learn new prognostic features from machine learning, a promising start for the future of this \u201clearning from deep learning\u201d paradigm.\n<\/p>\n<h2>Training a prognostic model to learn what features are important<\/h2>\n<p>\nOne potential approach to identifying novel features is to train ML models to directly predict patient outcomes using only the images and the paired outcome data. This is in contrast to training models to predict \u201cintermediate\u201d human-annotated labels for <em>known<\/em> pathologic features and then using those features to predict outcomes.\n<\/p>\n<p>\nInitial work by our team showed the feasibility of training models to <a href=\"https:\/\/journals.plos.org\/plosone\/article?id=10.1371\/journal.pone.0233678\">directly predict prognosis for a variety of cancer types<\/a> using the publicly available <a href=\"https:\/\/www.cancer.gov\/about-nci\/organization\/ccg\/research\/structural-genomics\/tcga\">TCGA dataset<\/a>. It was especially exciting to see that for some cancer types, the model&#8217;s predictions were prognostic after controlling for available pathologic and clinical features. Together with collaborators from the <a href=\"https:\/\/www.medunigraz.at\/en\/\">Medical University of Graz<\/a> and the <a href=\"https:\/\/biobank.medunigraz.at\/en\/\">Biobank Graz<\/a>, we subsequently extended this work using a large de-identified <a href=\"https:\/\/www.nature.com\/articles\/s41746-021-00427-2\">colorectal cancer<\/a> cohort. 
Interpreting these model predictions became an intriguing next step, but common <a href=\"https:\/\/www.tensorflow.org\/tutorials\/interpretability\/integrated_gradients\">interpretability techniques<\/a> were challenging to apply in this context and did not provide clear insights.\n<\/p>\n<h2>Interpreting the model-learned features<\/h2>\n<p>\nTo probe the features used by the prognostic model, we used a second model (trained to identify image similarity) to cluster cropped <em>patches<\/em> of the large pathology images. We then used the prognostic model to compute the average ML-predicted risk score for each cluster.\n<\/p>\n<p>\nOne cluster stood out for its high average risk score (associated with poor prognosis) and its distinct visual appearance. Pathologists described the images as involving high-grade tumor (i.e., least resembling normal tissue) in close proximity to adipose (fat) tissue, leading us to dub this cluster the \u201ctumor adipose feature\u201d (TAF); see the next figure for detailed examples of this feature. 
Further analysis showed that the relative quantity of TAF was itself highly and independently prognostic.\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgFd6X1LzZKMLVcWM844SqSQA7d6Z8Bmgz_0iM5Bg56lDeNK4vj0xiDRkES_5AHnI10lTmvW8U_xrBqGdoRDdhYXR-tLuVszvDyLz-CHwEdXbtGcctYWnBXepG4bvSMfeSohq6LdUCcXY-45qXzNm03hMXfIPRr43c08as_CAtkLer9yThU4B5K0kjzPw\/s1822\/image1.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"1200\" data-original-width=\"1822\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgFd6X1LzZKMLVcWM844SqSQA7d6Z8Bmgz_0iM5Bg56lDeNK4vj0xiDRkES_5AHnI10lTmvW8U_xrBqGdoRDdhYXR-tLuVszvDyLz-CHwEdXbtGcctYWnBXepG4bvSMfeSohq6LdUCcXY-45qXzNm03hMXfIPRr43c08as_CAtkLer9yThU4B5K0kjzPw\/s16000\/image1.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\">A prognostic ML model was developed to predict patient survival directly from unannotated giga-pixel pathology images. A second image similarity model was used to cluster cropped patches of pathology images. The prognostic model was used to compute the average model-predicted risk score for each cluster. One cluster, dubbed the \u201ctumor adipose feature\u201d (TAF) stood out in terms of its high average risk score (associated with poor survival) and distinct visual appearance. 
Pathologists learned to identify TAF and pathologist scoring for TAF was shown to be prognostic.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\">\n<div class=\"separator\" style=\"clear: both; text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEiskWphONjkzDdG7sihhx7lnzybxq0zZ7oip1eSit7M3vX1oyVhkhGYjUB7UNpWQfH3t8LulsP4An1Sc22mFE191M0KbeTFWSvf0JreZq7Sl80bjhHIFrGtl8YjHtlufDqa6WMyUntbsl37jLCr86VMNgsnEU-Xvw2sIw3oDCxZqjK7EtK8tPF_WxezBQ\/s554\/image3.gif\" style=\"margin-left: 1em; margin-right: 1em;\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" data-original-height=\"500\" data-original-width=\"554\" height=\"289\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEiskWphONjkzDdG7sihhx7lnzybxq0zZ7oip1eSit7M3vX1oyVhkhGYjUB7UNpWQfH3t8LulsP4An1Sc22mFE191M0KbeTFWSvf0JreZq7Sl80bjhHIFrGtl8YjHtlufDqa6WMyUntbsl37jLCr86VMNgsnEU-Xvw2sIw3oDCxZqjK7EtK8tPF_WxezBQ\/s320\/image3.gif\" width=\"320\"\/><\/a><\/div>\n<\/td>\n<td>\u00a0<\/td>\n<td style=\"text-align: center;\">\n<div class=\"separator\" style=\"clear: both; text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgXzOKAiLBsxdDuK1krXghTu3vkSmPAsGoeFgG9Uc810v6L5st32StvcYK179iP__jvE0OoxjfK--9-SJmzd0V7qctVNPpuJz_xmlKD0TFwYSWlg_Q1yWGjsc4PsRvLHYBNlV-YF1GzEmbjPh1mHjznHS-u4xWDUqqu2yJpgLViGgjC8aWFM6Scb11RTg\/s2000\/image4.png\" style=\"margin-left: 1em; margin-right: 1em;\"><img loading=\"lazy\" decoding=\"async\" border=\"0\" data-original-height=\"2000\" data-original-width=\"2000\" height=\"320\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEgXzOKAiLBsxdDuK1krXghTu3vkSmPAsGoeFgG9Uc810v6L5st32StvcYK179iP__jvE0OoxjfK--9-SJmzd0V7qctVNPpuJz_xmlKD0TFwYSWlg_Q1yWGjsc4PsRvLHYBNlV-YF1GzEmbjPh1mHjznHS-u4xWDUqqu2yJpgLViGgjC8aWFM6Scb11RTg\/s320\/image4.png\" 
width=\"320\"\/><\/a><\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\"><b>Left<\/b>: H&amp;E pathology slide with an overlaid heatmap indicating locations of the tumor adipose feature (TAF). Regions highlighted in red\/orange are considered more likely to be TAF by the image similarity model than regions highlighted in green\/blue or regions not highlighted at all. <b>Right<\/b>: Representative collection of TAF patches across multiple cases.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Validating that the model-learned feature can be used by pathologists<\/h2>\n<p>\nThese studies provided a compelling example of the potential for ML models to predict patient outcomes and a methodological approach for obtaining insights into model predictions. However, there remained the intriguing questions of whether pathologists could learn to identify and score the feature identified by the model, and whether their scoring would retain demonstrable prognostic value.\n<\/p>\n<p>\nIn <a href=\"https:\/\/jamanetwork.com\/journals\/jamanetworkopen\/fullarticle\/2802392\">our most recent paper<\/a>, we collaborated with pathologists from <a href=\"https:\/\/en.unimib.it\/medicine-and-surgery\">UNIMIB<\/a> to investigate these questions. Using example images of TAF from the <a href=\"https:\/\/www.nature.com\/articles\/s41746-021-00427-2\">previous publication<\/a> to learn and understand this feature of interest, UNIMIB pathologists developed scoring guidelines for TAF. If TAF was not seen, the case was scored as \u201cabsent\u201d; if TAF was observed, the categories \u201cunifocal\u201d, \u201cmultifocal\u201d, or \u201cwidespread\u201d were used to indicate its relative quantity. 
Our study showed that pathologists could reproducibly identify the ML-derived TAF and that their scoring for TAF provided statistically significant prognostic value on an independent retrospective dataset. To our knowledge, this is the first demonstration of pathologists learning to identify and score a specific pathology feature originally identified by an ML-based approach.\n<\/p>\n<h2>Putting things in context: learning from deep learning as a paradigm<\/h2>\n<p>\nOur work is an example of people \u201clearning from deep learning\u201d. In traditional ML, models learn from hand-engineered features informed by existing domain knowledge. More recently, in the deep learning era, a combination of large-scale model architectures, compute, and datasets has enabled learning directly from raw data, but this is often at the expense of human interpretability. Our work couples the use of deep learning to predict patient outcomes with interpretability methods, to extract new knowledge that could be applied by pathologists. 
We see this process as a natural next step in the evolution of applying ML to problems in medicine and science, moving from the use of ML to distill existing human knowledge to people using ML as a tool for knowledge discovery.\n<\/p>\n<table align=\"center\" cellpadding=\"0\" cellspacing=\"0\" class=\"tr-caption-container\" style=\"margin-left: auto; margin-right: auto;\">\n<tbody>\n<tr>\n<td style=\"text-align: center;\"><a href=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEg-tb16Nq5sAwzF3HSq0vxzi8azooLxuYz-W32agiz4JYM6l_VfUYO7hfRsPocnAGyDA_r2eZZNDwbSgtX2RXilPyrt7SaBQM8MzBzKtGh5AW3towJ6nZ7SisEdaHFqLpJ_pSFVqdfz70ckUGbe4XIEbHb2HorAJVOIQLJdjEqTBZKSm-Kj6oCSliLqrQ\/s1620\/image2.png\" style=\"margin-left: auto; margin-right: auto;\"><img decoding=\"async\" border=\"0\" data-original-height=\"1200\" data-original-width=\"1620\" src=\"https:\/\/blogger.googleusercontent.com\/img\/b\/R29vZ2xl\/AVvXsEg-tb16Nq5sAwzF3HSq0vxzi8azooLxuYz-W32agiz4JYM6l_VfUYO7hfRsPocnAGyDA_r2eZZNDwbSgtX2RXilPyrt7SaBQM8MzBzKtGh5AW3towJ6nZ7SisEdaHFqLpJ_pSFVqdfz70ckUGbe4XIEbHb2HorAJVOIQLJdjEqTBZKSm-Kj6oCSliLqrQ\/s16000\/image2.png\"\/><\/a><\/td>\n<\/tr>\n<tr>\n<td class=\"tr-caption\" style=\"text-align: center;\">Traditional ML focused on engineering features from raw data using existing human knowledge. Deep learning enables models to learn features directly from raw data at the expense of human interpretability. Coupling deep learning with interpretability methods provides an avenue for expanding the frontiers of scientific knowledge by learning from deep learning.<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>Acknowledgements<\/h2>\n<p>\n<em>This work would not have been possible without the efforts of coauthors Vincenzo L&#8217;Imperio, Markus Plass, Heimo Muller, Nicol\u00f2&#8217; Tamini, Luca Gianotti, Nicola Zucchini, Robert Reihs, Greg S. Corrado, Dale R. Webster, Lily H. Peng, Po-Hsuan Cameron Chen, Marialuisa Lavitrano, David F. Steiner, Kurt Zatloukal, Fabio Pagni. 
We also appreciate the support from Verily Life Sciences and the Google Health Pathology teams \u2013 in particular Timo Kohlberger, Yunnan Cai, Hongwu Wang, Kunal Nagpal, Craig Mermel, Trissia Brown, Isabelle Flament-Auvigne, and Angela Lin. We also appreciate manuscript feedback from Akinori Mitani, Rory Sayres, and Michael Howell, and illustration help from Abi Jones. This work would also not have been possible without the support of Christian Guelly, Andreas Holzinger, Robert Reihs, Farah Nader, the Biobank Graz, the efforts of the slide digitization team at the Medical University Graz, the participation of the pathologists who reviewed and annotated cases during model development, and the technicians of the UNIMIB team.<\/em>\n<\/p>\n<\/div>\n<p><a href=\"http:\/\/ai.googleblog.com\/2023\/03\/learning-from-deep-learning-case-study.html\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Posted by Ellery Wulczyn and Yun Liu, Google Research When a patient is diagnosed with cancer, 
one<\/p>\n","protected":false},"author":2,"featured_media":423,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[20],"tags":[],"class_list":["post-422","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-google-ai"],"_links":{"self":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/422","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/comments?post=422"}],"version-history":[{"count":1,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/422\/revisions"}],"predecessor-version":[{"id":2861,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/posts\/422\/revisions\/2861"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media\/423"}],"wp:attachment":[{"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/media?parent=422"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/categories?post=422"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/todaysainews.com\/index.php\/wp-json\/wp\/v2\/tags?post=422"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}