Toward Replacing Late Gadolinium Enhancement With Artificial Intelligence Virtual Native Enhancement for Gadolinium-Free Cardiovascular Magnetic Resonance Tissue Characterization in Hypertrophic Cardiomyopathy


Figure I. Illustration of pre-processing the LGE image to match the T1-map pixel to pixel. In this example, the LGE image (B) (raw pixel spacing of 1.41 mm) was interpolated to the matching pixel spacing (1.15 mm) and transformed to the matching Image Position and Image Orientation of the T1-map (A), based on the DICOM metadata. The resulting image (C) has the same size, pixel spacing, position, and orientation as the T1-map, and is therefore a pixel-to-pixel match in the myocardium.

Figure II.
LGE examples and their desired quality categories. Assessors were blinded to whether an image was VNE or LGE. (A-E) The five quality categories. (F) An example demonstrating the use of refined quality scoring. In the interface, human observers were allowed to register intermediate scores on a scale of 0-100, e.g., '36' in this example. The motivation for this design was that categorical scales are typically more intuitive for human operators, whereas finer numerical scales are more suitable for statistical analyses.
Figure III. T1-maps (n=10) in the test materials that were excluded before VNE and LGE quality assessment. Severe artefacts are present in all of these T1-maps, preventing any interpretation; these cases were therefore manually excluded from subsequent analysis (see note 1 in Figure 2).

VNE generator training using cGAN
A cGAN consists of two "adversarial" models: a generative model G and a discriminative model D. In this application, G is the VNE generator that produces VNE images resembling LGE, and D is a classification neural network (Figure II) that distinguishes between VNE and LGE images. G and D are trained simultaneously.
Objective
G and D are trained by optimizing the value of an objective function (Figure ). Suppose there is a native CMR input x which is processed by G to produce the VNE image G(x) that resembles the LGE image y. In this application, the objective for cGAN optimization can be expressed as an adversarial minimax game:

G* = arg min_G max_D { λ1 E_{x,y}[ ||y − G(x)||_1 ] + λ2 E_{x,y}[ ||φ(y) − φ(G(x))||_1 ] + E_x[ log(1 − D(G(x))) ] + E_y[ log D(y) ] }

where G is optimized to minimize the objective function, while D is optimized to maximize it. The first term is an L1 loss that encourages the generator G to produce G(x) that matches y pixel by pixel. Rather than exactly replicating real LGE signal intensities, this VNE application focuses on enhancing the native CMR signals and translating the native images into the presentation of LGE. To account for this, the second term is a perceptual loss [49], which calculates differences between high-level image feature representations of G(x) and y. The features, denoted by φ(G(x)) and φ(y), are generated from the last convolutional layer of a 16-layer VGG network pre-trained on ImageNet [50]. In the third and fourth terms, G(x) and y are input to the discriminator D, which produces the "realness" labels D(G(x)) and D(y), with 1 denoting "real" LGE and 0 denoting "virtual" LGE. The objective of training D is to distinguish between real and virtual LGE images, i.e., to maximize the last two terms.
Simultaneously, G is encouraged to produce VNE images that cannot be distinguished from real LGE appearance by the discriminator D, i.e., to minimize the third term. The weighting parameters λ1 and λ2 balance the magnitudes of the terms. In this application, a much lower λ1 = 20 and a higher λ2 = 200 were set in order to enforce matching of perceptual features rather than pixel values. This strategy results in a trained generator that translates the existing native CMR signals into the LGE image appearance.
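As an illustrative sketch only (not the authors' implementation), the value of the objective above can be computed in NumPy for one image pair, with a trivial stand-in for the VGG feature extractor φ and scalar stand-ins for the discriminator outputs D(·); all function names here are assumptions for the sketch:

```python
import numpy as np

def cgan_objective(vne, lge, phi, d_vne, d_lge, lam1=20.0, lam2=200.0, eps=1e-12):
    """Value of the cGAN minimax objective for one (VNE, LGE) pair.

    vne, lge     : 2-D image arrays, standing in for G(x) and y
    phi          : feature extractor, standing in for the VGG network
    d_vne, d_lge : discriminator outputs D(G(x)) and D(y), in [0, 1]
    """
    l1_term = lam1 * np.mean(np.abs(lge - vne))                  # pixel-wise L1 loss
    perc_term = lam2 * np.mean(np.abs(phi(lge) - phi(vne)))      # perceptual loss
    adv_terms = np.log(1.0 - d_vne + eps) + np.log(d_lge + eps)  # adversarial terms
    return l1_term + perc_term + adv_terms

# Toy check: a perfect generator (VNE identical to LGE) with a maximally
# fooled discriminator drives the L1 and perceptual terms to zero.
rng = np.random.default_rng(0)
img = rng.random((8, 8))
phi = lambda im: im.mean(axis=0)  # trivial stand-in for VGG features
value = cgan_objective(img, img.copy(), phi, d_vne=0.0, d_lge=1.0)
```

The separate optimization directions follow from this single expression: D's parameters affect only the last two terms (which D maximizes), while G's parameters affect the first three (which G minimizes).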
To account for the inevitable position differences between the native modalities and the LGE images used in training, an additional modification was made to the first L1 loss term: the LGE image is shifted locally to search for the best match,

L_L1(G) = min_{δx, δy} E_{x,y}[ || y_(δx, δy) − G(x) ||_1 ]

where δx, δy ∈ {−10, −9, …, 10} denote the shift in pixels horizontally and vertically, and y_(δx, δy) denotes the LGE image y shifted accordingly.
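This shift-and-match search can be sketched in a few lines of NumPy. The sketch uses np.roll, which wraps pixels around at the image borders; this is a simplifying assumption, as the boundary handling of the actual implementation is not specified in the text:

```python
import numpy as np

def shifted_l1(vne, lge, max_shift=10):
    """Best-case L1 loss over integer shifts of the LGE image.

    Searches dx, dy in {-max_shift, ..., max_shift}, as in the modified
    first loss term, and returns the minimum mean absolute difference.
    """
    best = np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(lge, shift=(dy, dx), axis=(0, 1))  # wrap-around shift
            best = min(best, np.mean(np.abs(shifted - vne)))
    return best

# Toy check: an LGE image that is a displaced copy of the VNE image
# is matched exactly by the search.
rng = np.random.default_rng(1)
vne = rng.random((32, 32))
lge = np.roll(vne, shift=(3, -2), axis=(0, 1))
loss = shifted_l1(vne, lge)
```

The exhaustive 21 × 21 search is cheap at these shift ranges and avoids requiring sub-pixel registration between the native and LGE acquisitions.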

Optimization
To improve the robustness of the model, on-the-fly augmentation was employed on the training dataset, introducing uniformly distributed random rotation within ±5 degrees and translation within ±2 pixels around the manually annotated center of the LV cavity.

Supplemental Content: HCMR investigators