InfoMotion: A Graph-Based Approach to Video Dataset Distillation for Echocardiography

Summary

This paper introduces InfoMotion, a novel graph-based approach for distilling compact synthetic echocardiographic video datasets to address challenges with large-scale medical video data. By leveraging motion feature extraction, class-wise graph construction, and the Infomap algorithm, the method selects a diverse and informative subset. It achieves a 69.38% test accuracy on EchoNet-Dynamic using only 25 synthetic videos, demonstrating efficiency for medical video dataset distillation.

Medical Relevance

Echocardiography is a cornerstone for diagnosing and monitoring cardiovascular diseases. This work aims to make the development and deployment of AI models in this domain more efficient by reducing the storage, computation, and training time that large echocardiographic video datasets demand, which could make such models more practical and accessible in clinical settings.

AI Health Application

This research applies AI (specifically dataset distillation using motion feature extraction and graph-based representative sample selection) to echocardiographic video data. The goal is to synthesize a compact, informative subset of medical data, which improves the efficiency of storage, computation, and model training for AI systems used in the diagnosis and monitoring of cardiovascular diseases. This directly enables more scalable and efficient development of medical AI applications in cardiology.

Key Points

  • Addresses the growing volume of echocardiographic video data, which strains storage, computation, and model training efficiency in medical AI.
  • Proposes dataset distillation as a promising solution to create compact, informative synthetic video subsets that retain key clinical features.
  • Introduces "InfoMotion," a novel graph-based methodology specifically designed for distilling medical video datasets.
  • The InfoMotion method involves three key stages: motion feature extraction to capture temporal dynamics, class-wise graph construction, and representative sample selection using the Infomap algorithm.
  • The objective is to select a diverse and informative subset of synthetic videos that accurately preserves the essential characteristics and clinical features of the original large dataset.
  • Evaluated on the EchoNet-Dynamic dataset, a widely recognized benchmark for cardiac function assessment using echocardiography.
  • Achieves a test accuracy of 69.38% using only 25 synthetic videos, which the authors present as evidence of the method's effectiveness and scalability for medical video analysis.

Methodology

The InfoMotion method for distilling synthetic echocardiographic video datasets involves: (1) **Motion Feature Extraction** from the original videos to capture their temporal dynamics; (2) **Class-wise Graph Construction** where nodes represent video samples and edges reflect feature similarities within each diagnostic class; and (3) **Representative Sample Selection** using the Infomap algorithm, which identifies highly informative and diverse synthetic videos from the constructed graphs.
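The three stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not specify the motion descriptor, the similarity measure, or the edge threshold, so the sketch assumes mean absolute frame differences, cosine similarity, and a hypothetical `threshold` parameter. To stay dependency-free it also swaps Infomap for plain connected components as the community step; a real pipeline would run an Infomap implementation at that point. All function names are hypothetical.

```python
import math
from collections import defaultdict

def motion_feature(video):
    """Assumed descriptor: mean absolute difference between consecutive frames.
    `video` is a list of frames, each a flat list of pixel intensities."""
    return [
        sum(abs(c - p) for p, c in zip(prev, curr)) / len(curr)
        for prev, curr in zip(video, video[1:])
    ]

def cosine(a, b):
    """Cosine similarity between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def build_graph(features, threshold):
    """Weighted undirected graph: nodes are videos, edges link similar features."""
    graph = {i: {} for i in range(len(features))}
    for i in range(len(features)):
        for j in range(i + 1, len(features)):
            w = cosine(features[i], features[j])
            if w >= threshold:
                graph[i][j] = w
                graph[j][i] = w
    return graph

def communities(graph):
    """Stand-in for Infomap (NOT Infomap itself): connected components via DFS."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in graph[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        comps.append(sorted(comp))
    return comps

def select_representatives(graph):
    """Pick one node per community: largest total edge weight, lowest index on ties."""
    return [
        max(comp, key=lambda n: (sum(graph[n].values()), -n))
        for comp in communities(graph)
    ]

def distill(videos, labels, threshold=0.9):
    """Class-wise selection: returns {label: [indices of selected videos]}."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    selected = {}
    for label, idxs in by_class.items():
        feats = [motion_feature(videos[i]) for i in idxs]
        graph = build_graph(feats, threshold)
        selected[label] = [idxs[n] for n in select_representatives(graph)]
    return selected
```

Building one graph per class keeps the selection balanced across diagnostic classes, and choosing one representative per community is one simple way to favor diversity: near-duplicate motion patterns collapse into a single selected video.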

Key Findings

The primary finding is that a highly compact synthetic echocardiographic video dataset can remain effective for training. InfoMotion achieved a test accuracy of 69.38% on the EchoNet-Dynamic dataset using only 25 synthetic videos. The authors take this as evidence that the method is effective and scalable in preserving critical information from large medical video datasets.

Clinical Impact

This approach could benefit clinical AI by reducing the computational and storage burden of training machine learning models on large echocardiographic video datasets. That could in turn allow faster model development, easier deployment in resource-constrained environments, and broader adoption of AI tools for cardiovascular disease diagnosis and monitoring.

Limitations

The abstract does not explicitly state any limitations of the proposed InfoMotion method.

Future Directions

The abstract does not explicitly mention future research directions.

Medical Domains

Cardiology, Cardiovascular Imaging, Diagnostic Imaging, Medical Artificial Intelligence, Biomedical Engineering

Keywords

Echocardiography, Dataset Distillation, Video Analysis, Graph-Based Learning, Infomap Algorithm, Cardiovascular Imaging, Machine Learning, Medical Data Efficiency

Abstract

Echocardiography plays a critical role in the diagnosis and monitoring of cardiovascular diseases as a non-invasive real-time assessment of cardiac structure and function. However, the growing scale of echocardiographic video data presents significant challenges in terms of storage, computation, and model training efficiency. Dataset distillation offers a promising solution by synthesizing a compact, informative subset of data that retains the key clinical features of the original dataset. In this work, we propose a novel approach for distilling a compact synthetic echocardiographic video dataset. Our method leverages motion feature extraction to capture temporal dynamics, followed by class-wise graph construction and representative sample selection using the Infomap algorithm. This enables us to select a diverse and informative subset of synthetic videos that preserves the essential characteristics of the original dataset. We evaluate our approach on the EchoNet-Dynamic dataset and achieve a test accuracy of 69.38% using only 25 synthetic videos. These results demonstrate the effectiveness and scalability of our method for medical video dataset distillation.

Comments

Accepted at MICAD 2025