Addressing Medical Imaging Limitations with Synthetic Data Generation | NVIDIA Technical Blog (2024)

Synthetic data in medical imaging offers numerous benefits, including the ability to augment datasets with diverse and realistic images where real data is limited. This reduces the costs and labor associated with annotating real images. Synthetic data also provides an ethical alternative to using sensitive patient data, which helps with education and training without compromising patient privacy.

This post introduces MAISI, an NVIDIA AI Foundation model for 3D computed tomography (CT) image generation. The overarching goal of MAISI is to revolutionize the field of medical imaging by providing a reliable and efficient way to generate high-quality synthetic images that can be used for various research and clinical applications. By overcoming the challenges of data scarcity and privacy concerns, MAISI aims to enhance the accessibility and usability of medical imaging data.

The model can generate high-resolution synthetic CT images and corresponding segmentation masks with up to 127 anatomical classes (including bones, organs, and tumors), reaching voxel dimensions of 512 × 512 × 512 with a spacing of 1.0 × 1.0 × 1.0 mm³. A key application is data augmentation: generating realistic synthetic medical imaging data to supplement real datasets that are limited by privacy concerns or by the rarity of a condition.
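To put these numbers in perspective, the short calculation below works out the voxel count, physical extent, and raw storage size of one such volume. The dimensions and spacing come from the text above; the 2-byte voxel type (int16, which is common for CT intensities) is an assumption made only for illustration.

```python
# Back-of-the-envelope size of one generated CT volume.
# Dimensions and spacing are taken from the post; bytes_per_voxel is an
# assumption (int16 storage), not something the post specifies.
dims = (512, 512, 512)          # voxels along each axis
spacing_mm = (1.0, 1.0, 1.0)    # physical size of one voxel, in mm
bytes_per_voxel = 2             # assumed int16 storage

num_voxels = dims[0] * dims[1] * dims[2]
extent_mm = tuple(d * s for d, s in zip(dims, spacing_mm))
size_mib = num_voxels * bytes_per_voxel / 2**20

print(num_voxels)   # 134217728
print(extent_mm)    # (512.0, 512.0, 512.0)
print(size_mib)     # 256.0
```

Under these assumptions, a single volume covers a cube of roughly 51 cm per side and holds about 134 million voxels, which is why generating it directly in image space is expensive.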

Overview

The DLMED research team at NVIDIA focused on preserving high-resolution detail and context in 3D medical image generation modeling. Augmenting a dataset with such images not only enriches it but can also improve the performance of other machine learning models in medical imaging. Another major application is saving annotation work. Generating (image, label) pairs based on user-defined classes simplifies the creation of annotated synthetic medical images, providing a cost-effective alternative to the labor-intensive task of collecting and annotating real medical data.

Furthermore, the MAISI model also addresses the issue of ethical data use. It provides a responsible alternative to using sensitive patient data, as the images generated do not correspond to real individuals. This capability is invaluable for generating a variety of medical images for educational purposes, helping trainees and medical students make diagnoses without having to access confidential patient records.

Foundation compression network

To generate high-resolution 3D images, the research team trained a foundation compression model that is designed to efficiently compress CT and magnetic resonance imaging (MRI) data into a condensed feature space. This variational autoencoder (VAE) model accepts CT or MRI images as inputs and produces a feature representation output. The output serves as the foundational input for the subsequent latent diffusion model. The training regimen for this model encompassed a vast collection of CT and MRI images from various anatomical regions and featuring diverse voxel spacings.

This extensive training has endowed the model with robust adaptability, enabling application to diverse datasets without the need for additional fine-tuning. In parallel, a decoder model was meticulously trained to accurately reconstruct high-resolution images from the generated feature sets.
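The benefit of working in a condensed feature space can be illustrated with simple shape arithmetic. The sketch below assumes a spatial downsampling factor of 4 per axis and 4 latent channels; both values are assumptions chosen for illustration, since the post does not state the compression settings, and the function names are hypothetical.

```python
import math


def latent_shape(image_dims, downsample=4, latent_channels=4):
    """Channels-first shape of the latent tensor for a given image size.

    downsample and latent_channels are illustrative assumptions.
    """
    for d in image_dims:
        if d % downsample != 0:
            raise ValueError(f"dimension {d} is not divisible by {downsample}")
    return (latent_channels,) + tuple(d // downsample for d in image_dims)


def compression_ratio(image_dims, downsample=4, latent_channels=4):
    """Ratio of input voxel count to number of latent values."""
    n_input = math.prod(image_dims)
    n_latent = math.prod(latent_shape(image_dims, downsample, latent_channels))
    return n_input / n_latent


print(latent_shape((512, 512, 512)))       # (4, 128, 128, 128)
print(compression_ratio((512, 512, 512)))  # 16.0
```

With these assumed settings, the diffusion model would operate on 16 times fewer values than the full-resolution image, which is the practical motivation for training the compression network first.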

Foundation diffusion network

Latent diffusion models (LDMs) have emerged as a powerful tool within generative machine learning, particularly for synthesizing 3D medical images. These models function by iteratively removing noise from a sample drawn from a random distribution within a latent space. Training the model to perform this denoising effectively teaches it the underlying distribution of the training data, so that it can then generate novel, high-fidelity samples.
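The iterative denoising idea can be sketched with a toy example. The code below runs a standard DDPM-style reverse loop on a single scalar "latent". In place of a trained network it uses an oracle noise predictor that is exact when the data distribution is a single point, so the loop should end at that point. The schedule values, the scalar latent, and all function names are assumptions for illustration; this is not the sampler or network used by MAISI.

```python
import math
import random


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule: per-step betas, alphas, and cumulative alpha products."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars


def oracle_noise(x_t, alpha_bar_t, target):
    """Stand-in for a trained denoiser: the exact noise when all data equals `target`."""
    return (x_t - math.sqrt(alpha_bar_t) * target) / math.sqrt(1.0 - alpha_bar_t)


def sample(target=2.0, num_steps=50, seed=0):
    """Start from Gaussian noise and remove the predicted noise step by step."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.gauss(0.0, 1.0)  # pure noise in the (toy) latent space
    for t in reversed(range(num_steps)):
        eps = oracle_noise(x, alpha_bars[t], target)
        mean = (x - betas[t] / math.sqrt(1.0 - alpha_bars[t]) * eps) / math.sqrt(alphas[t])
        # Fresh noise is added at every step except the last one.
        noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0
        x = mean + math.sqrt(betas[t]) * noise
    return x


print(abs(sample(target=2.0) - 2.0) < 1e-6)  # True
```

In a real LDM, the scalar is replaced by a multi-channel 3D latent tensor, the oracle by a trained neural network, and the final latent is passed through the decoder to obtain an image.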

In the domain of 3D medical imaging, LDMs hold immense promise for generating anatomically accurate and diverse images. By learning the data distribution, the model can produce synthetic images that reflect real-world variations.

Our LDM was trained using large-scale, high-resolution CT datasets. We also incorporated conditionings based on body regions as an extra feature embedding. These regions encompass the head, chest, abdomen, and lower body. At the inference stage, users can specify the body regions for which they wish to generate CT images. Two concrete examples of generated CT images are shown in Figure 1.
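One plausible way to turn a requested body region into a conditioning vector is to one-hot encode the topmost and bottommost regions of the desired scan range. The sketch below is a hypothetical encoding written for illustration; the region names come from the text above, but the function names and the vector layout are assumptions and should not be read as the model's actual input format.

```python
# Region names follow the post, ordered from top to bottom of the body.
REGIONS = ("head", "chest", "abdomen", "lower body")


def one_hot(region):
    """One-hot vector over REGIONS for a single region name."""
    if region not in REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    return [1 if name == region else 0 for name in REGIONS]


def region_condition(top, bottom):
    """Concatenate one-hot codes for the top and bottom regions of a scan range.

    Hypothetical layout: first 4 entries encode `top`, last 4 encode `bottom`.
    """
    if top not in REGIONS or bottom not in REGIONS:
        raise ValueError("top and bottom must both be known regions")
    if REGIONS.index(top) > REGIONS.index(bottom):
        raise ValueError("top region must not lie below bottom region")
    return one_hot(top) + one_hot(bottom)


print(region_condition("chest", "abdomen"))  # [0, 1, 0, 0, 0, 0, 1, 0]
```

A vector like this could be embedded and added to the diffusion model's other inputs, so that a single set of weights serves every body region.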

Figure 1. Two examples of CT images generated by the model

ControlNet to support additional conditioning

ControlNet is a framework that supports various spatial contexts as additional conditioning for diffusion models like Stable Diffusion. It was introduced in the paper, Adding Conditional Control to Text-to-Image Diffusion Models. With ControlNet, users have more control over the generation process. The output can be customized with different spatial contexts such as depth maps, segmentation maps, scribbles, key points, and more.

Specifically, the research team leveraged ControlNet to treat the organ segmentation maps, including 127 anatomic structures, as the extra condition of the foundation diffusion model to facilitate the CT image generation. Figure 2 shows a typical generated CT image and its corresponding segmentation condition.

Figure 2. A generated CT image and its corresponding segmentation condition

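ControlNet keeps a locked copy of the pretrained diffusion model's weights and adds a trainable copy of its encoding blocks that receives the extra condition. The two copies are connected by “zero-convolution” layers, which are convolution layers whose weights and biases are initialized to zero. Because these layers output zeros at the start of training, the model preserves the semantics already learned by the pretrained foundation diffusion model while the trainable copy gradually learns the specific spatial conditioning required for the task.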
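The effect of zero-initialized connections can be seen in a toy, single-channel example. In the sketch below, a simple function stands in for a pretrained block, and two 1x1 "convolutions" (scale and shift) play the role of the zero-convolution layers on the way into and out of the trainable copy. Everything here is a simplified illustration with hypothetical names, not the actual network code.

```python
def conv1x1(values, weight, bias):
    """1x1 convolution on a single-channel signal: scale and shift each value."""
    return [weight * v + bias for v in values]


def pretrained_block(values):
    """Toy stand-in for a pretrained block; both copies start from these weights."""
    return [2.0 * v + 1.0 for v in values]


def controlled_forward(x, condition, zero_in=(0.0, 0.0), zero_out=(0.0, 0.0)):
    """Locked path plus a trainable path gated by two zero-initialized convolutions."""
    locked = pretrained_block(x)
    injected = conv1x1(condition, *zero_in)            # condition enters via a zero conv
    trainable = pretrained_block([v + c for v, c in zip(x, injected)])
    control = conv1x1(trainable, *zero_out)            # and leaves via another zero conv
    return [a + b for a, b in zip(locked, control)]


x = [1.0, 2.0]
condition = [5.0, 5.0]

# At initialization the zero convolutions contribute nothing.
print(controlled_forward(x, condition))  # [3.0, 5.0]

# Once training moves the weights away from zero, the condition takes effect.
print(controlled_forward(x, condition, zero_in=(1.0, 0.0), zero_out=(0.5, 0.0)))  # [9.5, 12.5]
```

The first call returns exactly what the locked block alone would produce, which is why adding the control branch does not disturb the pretrained model before training begins.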

Performance evaluation

Our team conducted a comprehensive evaluation of the foundation diffusion model and the ControlNet using multiple datasets. This ensures broad coverage of many different body regions.

Image quality

First, we evaluated the quality of the images generated by our model by comparing them with images produced by baseline methods, using the model weights provided for those baselines. Table 1 reports the results for chest CT image generation, measured against real chest CT datasets.

Our method demonstrated superior performance over previous methods according to the Fréchet Inception Distance (FID) scores. In addition, our generated images are much closer to real images in appearance.
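For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to feature embeddings of real and generated images. Writing the mean and covariance of the real features as \mu_r and \Sigma_r, and those of the generated features as \mu_g and \Sigma_g, the standard definition is shown below. The post does not state which feature extractor was used, so this is the general formula rather than a description of the exact evaluation setup.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values mean the two feature distributions are closer, and two identical distributions give a score of 0.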

Table 1. FID (Average) ↓

Group       Dataset or method   MSD Task 06*   LIDC-IDRI   TCIA
Real        MSD Task 06         -              3.987       1.858
Real        LIDC-IDRI           3.987          -           4.744
Real        TCIA                1.858          4.744       -
Synthesis   HA-GAN              98.208         116.260     98.064
Synthesis   MAISI               19.008         31.370      20.338

Subsequently, we retrained several state-of-the-art diffusion model-based methods using our datasets. The results in Tables 2 and 3 show that our method consistently outperformed the previous methods on both our dataset and an unseen dataset (autoPET 2023).

Table 2

Method   FID (XY Plane) ↓   FID (YZ Plane) ↓   FID (ZX Plane) ↓   FID (Average) ↓
DDPM     10.031             36.782             43.109             29.974
LDM      12.409             19.202             22.452             18.021
HA-GAN   10.439             10.108             10.842             10.463
MAISI    1.225              2.846              2.854              2.308

Table 3

Method   FID (XY Plane) ↓   FID (YZ Plane) ↓   FID (ZX Plane) ↓   FID (Average) ↓
DDPM     18.524             23.696             25.604             22.608
LDM      16.853             10.191             10.093             12.379
HA-GAN   17.432             10.266             13.572             13.757
MAISI    14.165             5.770              8.510              9.481
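The Average column in these tables is the arithmetic mean of the three per-plane scores. The short check below recomputes it for the first of the two FID-by-plane tables above, using the published per-plane values; the variable names are illustrative.

```python
from statistics import mean

# Per-plane FID values (XY, YZ, ZX) copied from the first FID-by-plane table.
plane_fid = {
    "DDPM": (10.031, 36.782, 43.109),
    "LDM": (12.409, 19.202, 22.452),
    "HA-GAN": (10.439, 10.108, 10.842),
    "MAISI": (1.225, 2.846, 2.854),
}

# Average over the three planes, rounded to the precision used in the table.
averages = {method: round(mean(scores), 3) for method, scores in plane_fid.items()}

for method, value in averages.items():
    print(f"{method}: {value}")
```

The recomputed averages match the published column, and they make the per-plane pattern easier to see: the baselines degrade noticeably on the YZ and ZX planes, while MAISI stays low on all three.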

Figure 3 shows that the images generated by our method exhibit significantly enhanced details and more accurate global anatomical structures.

Figure 3. Images generated by our method

Downstream tasks

One of the most important applications of a generative model is synthesizing new data to augment model training. We can therefore evaluate the quality of generated images by measuring the impact of adding synthetic data to training. We adopted the Auto3DSeg pipeline, an automatic pipeline for developing medical image segmentation solutions in MONAI, and trained each segmentation model from scratch with five-fold cross-validation to reduce the effect of randomness.
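As a reminder of what five-fold cross-validation involves, the sketch below splits case indices into five folds and yields one train/validation split per fold. It is a generic, minimal illustration with a hypothetical function name; it does not reproduce how Auto3DSeg assigns cases to folds.

```python
def k_fold_splits(num_cases, k=5):
    """Yield (train_indices, val_indices) for each of k folds.

    Cases are assigned to folds round-robin; each case is used for
    validation exactly once across the k splits.
    """
    if not 2 <= k <= num_cases:
        raise ValueError("k must be between 2 and the number of cases")
    indices = list(range(num_cases))
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val


for train, val in k_fold_splits(10, k=5):
    print(len(train), val)
```

Training one model per split and averaging the five results reduces the chance that a reported gain is an artifact of one particular train/validation partition.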

There are two sets of experiments:

  1. Real: The normal model training is conducted on real data.
  2. Real + Synthetic: Real and synthetic data are combined in equal proportions during training to show the effect of synthetic data for data augmentation.
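The second setup can be sketched as follows, under one reading of "equal proportions": keep every real case and add the same number of synthetic cases. That interpretation, the function name, and the case identifiers are assumptions for illustration only.

```python
import random


def mix_equal(real_cases, synthetic_cases, seed=0):
    """Return a shuffled training list with as many synthetic cases as real ones.

    Assumes "equal proportions" means a 1:1 ratio with all real cases kept.
    """
    n = len(real_cases)
    if len(synthetic_cases) < n:
        raise ValueError("not enough synthetic cases for a 1:1 mix")
    rng = random.Random(seed)
    chosen = rng.sample(list(synthetic_cases), n)  # pick n synthetic cases without replacement
    combined = list(real_cases) + chosen
    rng.shuffle(combined)
    return combined


real = ["real_0", "real_1", "real_2"]
synthetic = ["syn_0", "syn_1", "syn_2", "syn_3", "syn_4"]
training_set = mix_equal(real, synthetic)
print(len(training_set))  # 6
```

Fixing the seed keeps the selection reproducible, which matters when the same mix has to be reused across cross-validation folds.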

As shown in Table 4, adding synthetic data improved the Dice score on the test set for all five tumor types, by roughly 2.5 to 4.5 percentage points. These results indicate better generalizability of models trained with synthetic data.

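Table 4

Experiment         Dataset                Tumor Type         Dice Score   Improvement
Real               MSD Task 06            Lung Tumor         0.581        -
Real + Synthetic   MSD Task 06            Lung Tumor         0.625        4.5%
Real               MSD Task 10            Colon Tumor        0.449        -
Real + Synthetic   MSD Task 10            Colon Tumor        0.490        4.1%
Real               In-House Bone Lesion   Bone Lesion        0.504        -
Real + Synthetic   In-House Bone Lesion   Bone Lesion        0.534        3.0%
Real               MSD Task 03            Hepatic Tumor      0.662        -
Real + Synthetic   MSD Task 03            Hepatic Tumor      0.687        2.5%
Real               MSD Task 07            Pancreatic Tumor   0.433        -
Real + Synthetic   MSD Task 07            Pancreatic Tumor   0.473        4.0%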
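The Dice score reported in Table 4 measures the overlap between a predicted mask and the ground-truth mask: twice the size of their intersection divided by the sum of their sizes. The sketch below computes it for flat binary masks. It is a minimal illustration, not the metric implementation used in the experiments, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * intersection / total


pred = [1, 1, 0, 0, 1]
target = [1, 0, 0, 1, 1]
print(round(dice_score(pred, target), 3))  # 0.667
```

The Improvement column is consistent with an absolute difference in Dice score expressed in percentage points. For the colon tumor rows, for example, 0.490 − 0.449 = 0.041, which corresponds to the listed 4.1%.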

Qualitative assessment

Figure 4 shows qualitative evaluations of three cases having abnormalities. It can be seen that MAISI yields excellent CT generation quality on both normal organs and abnormal tumor regions, as shown in the boxes of each subfigure. Our results indicate that MAISI effectively delineates abnormal tissue boundaries with high fidelity, demonstrating its robustness in capturing intricate details based on segmentation mask conditions in medical imaging. MAISI has the potential to effectively enhance the diversity and realism of generated CT images for data augmentation purposes.

Figure 4. Qualitative evaluation of three cases with abnormalities

Notably, in each case, MAISI accurately simulates the appearance of abnormal tumor regions and opens the possibility of enriching the dataset with variations in tumor morphology and spatial distribution. These findings highlight the potential of MAISI as a powerful tool for augmenting medical imaging datasets, thereby improving the robustness and generalization of machine learning models in clinical applications.

Summary

MAISI is a state-of-the-art foundation AI model for generating 3D high-resolution synthetic medical images with corresponding labels to address data limitations, reduce annotation costs, and maintain patient privacy. With its ability to generate high-resolution volumes and segmentation masks covering up to 127 anatomical classes, MAISI is poised to make a significant impact in medical imaging. Incorporating MAISI-generated synthetic data into the training of segmentation models improved Dice scores by about 2.5 to 4.5 percentage points across five tumor types, paving the way for increased robustness and generalization in clinical applications.

To explore the potential of synthetic data generation with MAISI for your projects, join the early access program.

Acknowledgments

All co-authors wish to note that they made equal contributions to the research presented here and to the writing of this post.
