Decomposition Sampling for Efficient Region Annotations in Active Learning

arXiv ID: 2512.07606v1

Published: 2025-12-08

Authors: Jingna Qiu, Frauke Wilm, Mathias Öttl, Jonas Utz, Maja Schlereth, Moritz Schillinger, Marc Aubreville, Katharina Breininger

Categories: cs.CV

Relevance Score: 0.95 / 1.00

Summary

This paper introduces Decomposition Sampling (DECOMP), a novel active learning strategy designed to improve annotation efficiency for dense prediction tasks, especially in medical imaging. DECOMP addresses limitations of existing region-level sampling methods by decomposing images into class-specific components using pseudo-labels and guiding region selection with class-wise confidence. It consistently outperforms baselines across ROI classification, 2D, and 3D segmentation tasks by enhancing diversity and boosting performance on challenging minority classes.

Medical Relevance

This work is highly relevant to medicine as it directly tackles the labor-intensive and costly process of annotating medical images, which is a major bottleneck in AI development. By making annotation more efficient and improving model performance on crucial minority classes, DECOMP accelerates the creation of robust and accurate AI tools for clinical use.

AI Health Application

This research improves the efficiency and accuracy of developing AI models for medical applications, specifically in tasks like disease detection, tumor segmentation, and organ segmentation from medical images (e.g., MRI, CT, X-ray, microscopy). By enabling more efficient and targeted annotation, it reduces the cost and time required to build robust AI systems for healthcare, accelerating the deployment of AI in clinical settings for improved diagnostics and treatment planning.

Key Points

The research targets the high cost and time intensity of region-level annotation for dense prediction tasks, a significant bottleneck in developing AI for medical imaging.
Existing active learning methods for representative annotation region selection suffer from high computational and memory costs, irrelevant region choices, and heavy reliance on uncertainty sampling.
The paper proposes Decomposition Sampling (DECOMP), a new active learning sampling strategy to enhance annotation diversity and efficiency.
DECOMP operates by decomposing images into class-specific components using pseudo-labels, allowing for targeted sampling from each identified class.
The sampling process is strategically guided by class-wise predictive confidence, ensuring that annotation efforts are prioritized towards difficult or underrepresented (minority) classes.
The method aims to specifically address the challenge of effectively sampling regions belonging to minority classes, which are often critical for accurate medical diagnosis and prognosis.
DECOMP consistently surpasses baseline active learning methods across diverse medical imaging tasks, including ROI classification, 2D segmentation, and 3D segmentation, by better sampling minority-class regions and boosting overall performance on these challenging classes.
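
The confidence-guided prioritization described in the points above can be illustrated with a small sketch. The abstract does not say how class-wise confidence is computed or how it is turned into annotation counts, so everything here is an assumption made for illustration: confidence is taken as the mean top softmax probability over the pixels pseudo-labeled as each class, and the annotation budget is split in proportion to (1 - confidence). The function names `classwise_confidence` and `allocate_budget` are hypothetical and are not taken from the DECOMP code base.

```python
import numpy as np

def classwise_confidence(probs):
    """probs: (C, H, W) softmax output for one image.

    Returns the pseudo-label map (H, W) and the mean top probability per
    predicted class (C,). Classes that are never predicted get NaN.
    Assumption: this definition of confidence is illustrative only.
    """
    pseudo = probs.argmax(axis=0)
    top = probs.max(axis=0)
    conf = np.full(probs.shape[0], np.nan)
    for c in range(probs.shape[0]):
        mask = pseudo == c
        if mask.any():
            conf[c] = top[mask].mean()
    return pseudo, conf

def allocate_budget(conf, budget):
    """Split `budget` regions across predicted classes.

    Assumption: lower-confidence classes receive proportionally more
    regions; classes absent from the pseudo-labels receive none.
    """
    present = ~np.isnan(conf)
    weights = np.where(present, 1.0 - conf, 0.0) + 1e-8 * present
    raw = weights / weights.sum() * budget
    alloc = np.floor(raw).astype(int)
    # Hand leftover regions to the largest fractional parts.
    remainder = budget - alloc.sum()
    order = np.argsort(-(raw - alloc))
    alloc[order[:remainder]] += 1
    return alloc
```

Under these assumptions, a class predicted with mean confidence 0.6 receives roughly four times as many regions as one predicted with 0.9, which is one simple way to give difficult classes additional annotations.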

Methodology

Decomposition Sampling (DECOMP) is an active learning sampling strategy. It first decomposes each image into class-specific components using pseudo-labels, then samples regions from each of these components. Class-wise predictive confidence guides the sampling so that more difficult classes receive additional annotations. The method was evaluated against baseline methods on ROI classification, 2D segmentation, and 3D segmentation tasks.
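
A minimal sketch of the decompose-then-sample idea follows. It is not the authors' implementation: the abstract does not specify how components are formed or how regions are chosen within a class. As stand-ins, this sketch assumes non-overlapping square tiles as candidate regions, assigns each tile to the majority class of its pseudo-labels, and draws tiles uniformly at random within each class. The names `decompose_into_tiles` and `sample_regions` are hypothetical.

```python
import numpy as np

def decompose_into_tiles(pseudo, tile, num_classes):
    """Group non-overlapping tile x tile regions by majority pseudo-label.

    pseudo: (H, W) integer pseudo-label map.
    Returns {class_id: [(y, x), ...]} with top-left tile coordinates.
    """
    h, w = pseudo.shape
    groups = {}
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            patch = pseudo[y:y + tile, x:x + tile]
            counts = np.bincount(patch.ravel(), minlength=num_classes)
            groups.setdefault(int(counts.argmax()), []).append((y, x))
    return groups

def sample_regions(groups, per_class, rng):
    """Draw up to `per_class` tiles from every class-specific group."""
    selected = []
    for c in sorted(groups):
        tiles = groups[c]
        k = min(per_class, len(tiles))
        idx = rng.choice(len(tiles), size=k, replace=False)
        selected.extend((c, *tiles[i]) for i in idx)
    return selected

# Toy image: class 0 everywhere except one 2x2 patch of class 1.
pseudo = np.zeros((8, 8), dtype=int)
pseudo[:2, :2] = 1
groups = decompose_into_tiles(pseudo, tile=2, num_classes=2)
picked = sample_regions(groups, per_class=2, rng=np.random.default_rng(0))
```

In this toy image only 1 of 16 tiles belongs to class 1, so uniform sampling over all tiles would often miss it, whereas sampling per class always includes it. This mirrors, in simplified form, the diversity argument made for decomposition.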

Key Findings

The primary finding is that DECOMP consistently outperforms baseline active learning methods across ROI classification, 2D segmentation, and 3D segmentation tasks. The abstract attributes this to better sampling of minority-class regions, which boosts model performance on these difficult-to-learn classes.

Clinical Impact

DECOMP has the potential to significantly reduce the annotation burden on highly skilled medical professionals (e.g., radiologists, pathologists), thereby accelerating the development and deployment of AI models in clinical settings. By improving model accuracy, especially for rare diseases or subtle pathological findings (minority classes), it can lead to more reliable diagnostic tools, aid in earlier detection, and contribute to better patient outcomes with more efficient use of limited expert resources.

Limitations

The abstract does not explicitly state limitations or caveats of the DECOMP method itself. It highlights the limitations of *existing* region selection methods (high computational/memory costs, irrelevant choices, heavy reliance on uncertainty sampling), which DECOMP aims to address.

Future Directions

Future research directions are not explicitly mentioned in the provided abstract.

Medical Domains

Radiology, Pathology, Oncology, Medical Image Analysis

Keywords

Active Learning, Dense Prediction, Medical Imaging, Region Annotation, Segmentation, Decomposition Sampling, Minority Classes, Pseudo-labeling

Abstract

Active learning improves annotation efficiency by selecting the most informative samples for annotation and model training. While most prior work has focused on selecting informative images for classification tasks, we investigate the more challenging setting of dense prediction, where annotations are more costly and time-intensive, especially in medical imaging. Region-level annotation has been shown to be more efficient than image-level annotation for these tasks. However, existing methods for representative annotation region selection suffer from high computational and memory costs, irrelevant region choices, and heavy reliance on uncertainty sampling. We propose decomposition sampling (DECOMP), a new active learning sampling strategy that addresses these limitations. It enhances annotation diversity by decomposing images into class-specific components using pseudo-labels and sampling regions from each class. Class-wise predictive confidence further guides the sampling process, ensuring that difficult classes receive additional annotations. Across ROI classification, 2-D segmentation, and 3-D segmentation, DECOMP consistently surpasses baseline methods by better sampling minority-class regions and boosting performance on these challenging classes. Code is available at https://github.com/JingnaQiu/DECOMP.git.