Thesis: Feature-Fusion Neck Model for Content-Based Histopathological Image Retrieval
Abstract
Extracting feature descriptors from histopathological images poses a significant challenge for Content-Based Image Retrieval (CBIR) systems, which are essential tools for assisting pathologists. This complexity arises from the diverse tissue types and the high dimensionality of Whole Slide Images (WSIs). Deep learning models such as Convolutional Neural Networks (CNNs) and Vision Transformers have improved the extraction of these feature descriptors. These models typically generate embeddings from their deepest, single-scale feature maps through linear or advanced pooling layers. However, embeddings that focus on local spatial details at a single scale tend to miss the richer spatial context available in earlier layers. This limitation highlights the need for methods that incorporate multi-scale information to enrich feature descriptors for histopathological image analysis. In this work, we propose the Local-Global Feature Fusion Embedding Model, an approach that consists of a pre-trained backbone for multi-scale feature extraction, a neck branch for local-global feature fusion, and a Generalized Mean (GeM)-based pooling head for generating robust feature descriptors. In our experiments, we trained the model's neck and head on the ImageNet-1k and PanNuke datasets using the Sub-center ArcFace loss function. Performance was evaluated on the Kimia Path24C dataset for histopathological image retrieval. The proposed model achieved a Recall@1 of 99.40% on test patches, outperforming state-of-the-art methods.
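To make the described pipeline concrete, the sketch below shows, in PyTorch, how GeM pooling and a local-global fusion neck could be wired together. It is a minimal illustration under stated assumptions: the class names (GeM, LocalGlobalFusionNeck), channel sizes, projection layers, and concatenation-based fusion are illustrative choices, not the thesis's exact architecture or training setup.

```python
# Minimal sketch (assumed PyTorch implementation; layer sizes and the
# fusion strategy are illustrative, not the thesis's exact architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F


class GeM(nn.Module):
    """Generalized Mean (GeM) pooling.

    p = 1 reduces to average pooling; p -> infinity approaches max pooling.
    """
    def __init__(self, p: float = 3.0, eps: float = 1e-6):
        super().__init__()
        self.p = nn.Parameter(torch.tensor(p))  # learnable pooling exponent
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> (B, C)
        x = x.clamp(min=self.eps).pow(self.p)
        x = F.adaptive_avg_pool2d(x, 1).pow(1.0 / self.p)
        return x.flatten(1)


class LocalGlobalFusionNeck(nn.Module):
    """Fuses an earlier (local, higher-resolution) backbone feature map with
    the final (global, lower-resolution) map before GeM pooling.
    Channel sizes and the 1x1-projection + concatenation design are assumptions."""
    def __init__(self, local_ch: int, global_ch: int, out_ch: int = 512):
        super().__init__()
        self.local_proj = nn.Conv2d(local_ch, out_ch, kernel_size=1)
        self.global_proj = nn.Conv2d(global_ch, out_ch, kernel_size=1)
        self.pool = GeM()
        self.fc = nn.Linear(2 * out_ch, out_ch)

    def forward(self, local_feat: torch.Tensor, global_feat: torch.Tensor) -> torch.Tensor:
        local_emb = self.pool(self.local_proj(local_feat))
        global_emb = self.pool(self.global_proj(global_feat))
        fused = torch.cat([local_emb, global_emb], dim=1)
        return F.normalize(self.fc(fused), dim=1)  # L2-normalized descriptor


if __name__ == "__main__":
    # Toy multi-scale maps standing in for backbone outputs.
    local_feat = torch.randn(2, 256, 28, 28)
    global_feat = torch.randn(2, 1024, 7, 7)
    neck = LocalGlobalFusionNeck(local_ch=256, global_ch=1024)
    print(neck(local_feat, global_feat).shape)  # torch.Size([2, 512])
```

In a retrieval setting, the resulting L2-normalized descriptors would be compared by cosine similarity (here, a dot product), and a margin-based loss such as Sub-center ArcFace would be applied on top of them during training.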