Efficient Vision Mamba for MRI Super-Resolution via Hybrid Selective Scanning

arXiv ID: 2512.19676v1

Published: 2025-12-22

Authors: Mojtaba Safari, Shansong Wang, Vanessa L Wildman, Mingzhe Hu, Zach Eidex, Chih-Wei Chang, Erik H Middlebrooks, Richard L. J Qiu, Pretesh Patel, Ashesh B. Jani, Hui Mao, Zhen Tian, Xiaofeng Yang

Categories: cs.CV, physics.med-ph

Relevance Score: 0.98 / 1.00


Summary

This paper introduces an efficient Vision Mamba-based super-resolution (SR) framework for MRI that targets the trade-off between fidelity and computational cost in existing deep learning methods. It combines multi-head selective state-space models with hybrid scanning to recover resolution and anatomical detail, and it reports state-of-the-art accuracy on brain and prostate MRI datasets using only 0.9M parameters and 57 GFLOPs.

Medical Relevance

High-resolution MRI is fundamental for precise diagnosis in various medical conditions. This SR framework offers a means to achieve enhanced image quality and anatomical detail post-acquisition, potentially reducing scan times or improving diagnostic confidence without requiring expensive hardware upgrades, making advanced MRI more accessible and efficient in clinical settings.

AI Health Application

This AI framework applies deep learning (specifically Vision Mamba and state-space models) to perform super-resolution on Magnetic Resonance Imaging (MRI) scans. Its application in health is to improve the quality and detail of MRI images, which are crucial for medical diagnosis, at a low computational cost. Because resolution is recovered after the scan, it may also allow shorter acquisitions. This could support more efficient and potentially more accurate diagnoses for conditions affecting the brain (e.g., neurological disorders) and prostate (e.g., prostate cancer detection).

Key Points

Targets the need for high-resolution MRI, which is limited by long acquisition times, and the fidelity-efficiency trade-off of current deep learning SR methods.
Proposes a novel SR framework based on multi-head selective state-space models (MHSSM) combined with a lightweight channel MLP.
Employs 2D patch extraction with hybrid scanning within 'MambaFormer blocks' to effectively capture long-range dependencies in MRI data.
Evaluated on diverse clinical datasets: 7T brain T1 MP2RAGE (n=142) and 1.5T prostate T2w MRI (n=334).
Achieved superior quantitative performance (e.g., SSIM=0.951 for brain, PSNR=27.15 dB for prostate) compared to baselines including bicubic interpolation, GANs, a Transformer (SwinIR), a Mamba model (MambaIR), and diffusion models.
Demonstrated exceptional computational efficiency, utilizing only 0.9M parameters and 57 GFLOPs, representing a 99.8% parameter reduction and 97.5% computation reduction compared to Res-SRDiff.
The framework's high accuracy, detailed anatomical preservation, and low computational demand highlight its strong potential for practical clinical translation.
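The reduction percentages in the points above can be turned into a rough estimate of the Res-SRDiff baseline cost. The sketch below is a back-of-the-envelope check, not a figure from the paper: only the four constants are reported, the function name `implied_baseline` is made up for illustration, and the derived values are approximate because the percentages are rounded.

```python
# Back-of-the-envelope check of the reported reductions.
# Only the four constants below come from the summary; the baseline figures
# are derived here and are approximate because the percentages are rounded.

PARAMS_M = 0.9           # proposed model, millions of parameters (reported)
GFLOPS = 57.0            # proposed model, GFLOPs (reported)
PARAM_REDUCTION = 0.998  # reduction versus Res-SRDiff (reported)
FLOP_REDUCTION = 0.975   # reduction versus Res-SRDiff (reported)


def implied_baseline(value: float, reduction: float) -> float:
    """Solve value = baseline * (1 - reduction) for the baseline cost."""
    return value / (1.0 - reduction)


baseline_params_m = implied_baseline(PARAMS_M, PARAM_REDUCTION)
baseline_gflops = implied_baseline(GFLOPS, FLOP_REDUCTION)

print(f"Implied Res-SRDiff parameters: ~{baseline_params_m:.0f}M")
print(f"Implied Res-SRDiff compute:    ~{baseline_gflops:.0f} GFLOPs")
```

Under these assumptions the baseline works out to roughly 450M parameters and about 2,280 GFLOPs, which is consistent in order of magnitude with a diffusion-based model but should be checked against the paper's own tables.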

Methodology

The proposed super-resolution framework combines multi-head selective state-space models (MHSSM) with a lightweight channel MLP for efficient long-range dependency capture. It uses 2D patch extraction alongside hybrid scanning within novel 'MambaFormer' blocks. Each MambaFormer block integrates MHSSM, depthwise convolutions, and gated channel mixing to process image features. The model was trained and evaluated on 7T brain T1 MP2RAGE maps (n=142) and 1.5T prostate T2w MRI (n=334) and compared against a suite of baseline SR methods: bicubic interpolation, GANs (CycleGAN, Pix2pix, SPSR), a Transformer (SwinIR), a Mamba model (MambaIR), and diffusion models (I2SB, Res-SRDiff).
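To make the idea of scanning a 2D patch grid with a selective recurrence more concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the abstract does not say which scan orders the hybrid scheme uses or how the state-space model is parameterised, so the four traversal orders, the scalar state, the hand-written gate, and the averaging step are all assumptions, and the names `scan_orders`, `selective_scan`, and `hybrid_scan` are hypothetical.

```python
# Illustrative sketch only: scan orders, gate, and merge rule are assumptions,
# not details taken from the paper.
import math
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def scan_orders(rows: int, cols: int) -> Dict[str, List[Coord]]:
    """Return four assumed traversal orders over a rows x cols patch grid."""
    row_major = [(r, c) for r in range(rows) for c in range(cols)]
    col_major = [(r, c) for c in range(cols) for r in range(rows)]
    return {
        "row_forward": row_major,
        "row_backward": row_major[::-1],
        "col_forward": col_major,
        "col_backward": col_major[::-1],
    }


def selective_scan(sequence: List[float]) -> List[float]:
    """Toy scalar recurrence: h_t = a_t * h_{t-1} + (1 - a_t) * x_t.

    The gate a_t depends on the current input, which is the "selective" part.
    A real selective SSM learns this dependence and uses vector-valued states.
    """
    h = 0.0
    outputs = []
    for x in sequence:
        a = 1.0 / (1.0 + math.exp(abs(x)))  # hypothetical gate in (0, 0.5]
        h = a * h + (1.0 - a) * x           # larger |x| overwrites more state
        outputs.append(h)
    return outputs


def hybrid_scan(grid: List[List[float]]) -> List[List[float]]:
    """Run the toy scan along every order and average the results per patch."""
    rows, cols = len(grid), len(grid[0])
    orders = scan_orders(rows, cols)
    merged = [[0.0] * cols for _ in range(rows)]
    for order in orders.values():
        values = selective_scan([grid[r][c] for r, c in order])
        for (r, c), y in zip(order, values):
            merged[r][c] += y / len(orders)
    return merged


patch_grid = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]  # 2 x 3 grid of patch features
fused = hybrid_scan(patch_grid)
```

Each traversal lets a patch receive context from a different direction in a single linear-time pass, which is the general motivation for multi-directional scanning in vision state-space models. How the paper actually merges directions and heads may well differ from the simple average used here.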

Key Findings

The framework achieved state-of-the-art accuracy and exceptional computational efficiency. For 7T brain data, it reported SSIM=0.951±0.021, PSNR=26.90±1.41 dB, LPIPS=0.076±0.022, and GMSD=0.083±0.017, significantly outperforming all baselines (p<0.001). For 1.5T prostate data, it achieved SSIM=0.770±0.049, PSNR=27.15±2.19 dB, LPIPS=0.190±0.095, and GMSD=0.087±0.013. Critically, the model utilized only 0.9M parameters and 57 GFLOPs, demonstrating a 99.8% reduction in parameters and 97.5% reduction in computation compared to the Res-SRDiff model, while also surpassing SwinIR and MambaIR in both accuracy and efficiency.
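For readers less familiar with the metrics, PSNR is the one with a simple closed form: 10 * log10(data_range^2 / MSE), reported in dB, with higher values being better. The sketch below uses that standard definition on flattened intensity lists. The function name `psnr`, the default `data_range` of 1.0, and the toy inputs are assumptions, since the abstract does not state the intensity normalisation or whether the metric is computed per slice or per volume.

```python
import math
from typing import List


def psnr(reference: List[float], estimate: List[float], data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for two equally sized, flattened images."""
    if not reference or len(reference) != len(estimate):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the super-resolved estimate.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0.0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(data_range ** 2 / mse)


# Toy example with intensities assumed to be scaled to [0, 1].
high_res = [0.0, 1.0]
super_resolved = [0.1, 0.9]
print(f"PSNR: {psnr(high_res, super_resolved):.2f} dB")
```

SSIM, LPIPS, and GMSD have no comparably short definition and are normally taken from an established library rather than reimplemented.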

Clinical Impact

This framework could allow high-resolution MRI to be reconstructed from shorter, lower-resolution acquisitions, addressing a major bottleneck in clinical MRI workflows. Because it preserves anatomical detail at low computational cost, it allows faster post-processing and easier integration into existing clinical diagnostic pipelines, potentially leading to quicker diagnoses, reduced patient discomfort, and improved diagnostic confidence, particularly in neuroradiology and urology.

Limitations

The provided abstract does not explicitly state any limitations of the proposed framework.

Future Directions

The provided abstract does not explicitly state any future research directions.

Medical Domains

Neuroradiology, Urology, Diagnostic Imaging, Medical Physics

Keywords

MRI super-resolution, Vision Mamba, selective state-space models, computational efficiency, anatomical detail, 7T MRI, prostate MRI, deep learning

Abstract

Background: High-resolution MRI is critical for diagnosis, but long acquisition times limit clinical use. Super-resolution (SR) can enhance resolution post-scan, yet existing deep learning methods face fidelity-efficiency trade-offs.

Purpose: To develop a computationally efficient and accurate deep learning framework for MRI SR that preserves anatomical detail for clinical integration.

Materials and Methods: We propose a novel SR framework combining multi-head selective state-space models (MHSSM) with a lightweight channel MLP. The model uses 2D patch extraction with hybrid scanning to capture long-range dependencies. Each MambaFormer block integrates MHSSM, depthwise convolutions, and gated channel mixing. Evaluation used 7T brain T1 MP2RAGE maps (n=142) and 1.5T prostate T2w MRI (n=334). Comparisons included Bicubic interpolation, GANs (CycleGAN, Pix2pix, SPSR), transformers (SwinIR), Mamba (MambaIR), and diffusion models (I2SB, Res-SRDiff).

Results: Our model achieved superior performance with exceptional efficiency. For 7T brain data: SSIM=0.951±0.021, PSNR=26.90±1.41 dB, LPIPS=0.076±0.022, GMSD=0.083±0.017, significantly outperforming all baselines (p<0.001). For prostate data: SSIM=0.770±0.049, PSNR=27.15±2.19 dB, LPIPS=0.190±0.095, GMSD=0.087±0.013. The framework used only 0.9M parameters and 57 GFLOPs, reducing parameters by 99.8% and computation by 97.5% versus Res-SRDiff, while outperforming SwinIR and MambaIR in accuracy and efficiency.

Conclusion: The proposed framework provides an efficient, accurate MRI SR solution, delivering enhanced anatomical detail across datasets. Its low computational demand and state-of-the-art performance show strong potential for clinical translation.