This is the first study to suggest that TV 3D US is not accurate in measuring the internal indentation length in patients with IUS or AUA when compared to a diagnostic hysteroscopy. Measurement on TV 3D US consistently underestimates the IILFM compared to the actual measurement on diagnostic hysteroscopy. This finding held irrespective of the type of anomaly or the actual length of the internal indentation as measured on diagnostic hysteroscopy. Patients with moderate internal indentation length (10–14 mm) and those with significant internal indentation length (≥ 15 mm) on diagnostic hysteroscopy had such lesions underestimated on TV 3D US with or without SIH. In turn, our data suggest that there may be an element of uterine factor in some infertile patients that goes undiagnosed if one relies on TV 3D US with or without SIH. Such uterine anomalies may also be found in some patients with unexplained infertility, and they may offer a possible etiology for repeated implantation failure after various infertility treatments, including IVF-ET. The presence of a uterine septum has been linked to repeated implantation failure [25]. In addition, our findings may in part explain the variable incidence of uterine septum and AUA in infertile patients in the literature [26]. It is not clear why the measurement of the internal indentation length of IUS and AUA on TV 3D US is not accurate. However, volume transvaginal ultrasound images are computer generated, and it is therefore possible that the IILFM in the generated images is not accurate.
Published data suggest that TV 3D US is the gold standard for the diagnosis of septate, sub-septate, and AUA, with a high degree of sensitivity and specificity [8, 14]. However, most of this literature does not clearly differentiate between patients with complete septum and IUS, or between those with subtle and significant IUS [9]. Patients with complete uterine septum and those with significant IUS could arguably be suspected on TV 2D US (in transverse view) or even on HSG, although such tests cannot differentiate between these anomalies and a complete or incomplete bicornuate uterus because they do not evaluate the external fundal contour [31]. In addition, the data in the literature on the use of TV 3D US for the diagnosis of AUA are both limited and confusing. Such studies may also have included patients with subtle arcuate uterine anomaly of no clinical significance. In a study by Graupera et al. 2015 that compared TV 3D US to MRI in the diagnosis of Müllerian anomalies using the ESHRE–ESGE consensus, all patients had a significant pathology, as they were initially suspected on TV 2D US [9]. Moini et al. 2013 compared the findings on TV 3D US to hysteroscopy and laparoscopy in patients with suspected uterine septum and reported that TV 3D US was more accurate in long complete septum and less accurate in IUS and AUA [24]. The latter findings are similar to those in our study.
Several publications, albeit with small sample sizes, that reported on the accuracy of TV 3D US in the diagnosis of IUS or AUA relied on comparison to office hysteroscopy [7, 8, 17]. Many investigators found office hysteroscopy to be a valuable method to screen for congenital anomalies of the uterine cavity [7, 8, 17]. However, we believe that such subtle anomalies can be missed if uterine distension is not adequate, as commonly happens during office hysteroscopy to avoid patient discomfort [19, 27, 28]. A recent video abstract suggested that even a diagnostic hysteroscopy performed under modified general anesthesia can miss the diagnosis of such anomalies if uterine distension is suboptimal [3]. In addition, the narrow and small view on office hysteroscopy can make the evaluation for such anomalies difficult, especially if the patient is uncomfortable. Furthermore, a study that reported on the reproducibility of diagnosing intrauterine abnormalities through office hysteroscopy found that the interobserver agreement was disappointing [15]. In a study by Smit et al. 2013, it was shown that in infertile patients the international agreement on the diagnosis of the septate uterus and arcuate uterus by office hysteroscopy was rather disappointing [27]. The same findings were reported in a subsequent study by the same authors, even when some diagnostic criteria were used at the time of office hysteroscopy [28]. Therefore, the consensus on the accuracy of office hysteroscopy in the assessment of the uterine shape seems to be poor, especially for the less profound variations. However, the suggestion that office hysteroscopy is less accurate than a diagnostic hysteroscopy under sedation in the diagnosis of IUS and AUA can only be confirmed by a prospective comparative study.
In a prospective study of patients with recurrent pregnancy loss who were suspected to have septate, sub-septate, and AUA, TV 3D US was extremely accurate in making the diagnosis of such anomalies, as confirmed on subsequent diagnostic hysteroscopy and laparoscopy [8]. In the same study, the authors reported that a negative study on TV 3D US was also accurate in ruling out such anomalies, as confirmed on subsequent office hysteroscopy [8]. In contrast, 29.5% of the patients in our study were found to have no evidence of any internal indentation on TV 3D US with or without SIH (0.00 mm; subgroup 1). Therefore, our findings are not in agreement with those of Ghi et al. 2009 [8]. If the recommendation by Ghi et al. 2009 [8] were followed, 29.5% of the patients in our study (subgroup 1) would have been considered to have no IUS or significant AUA based on TV 3D US. In addition, another 29.1% of the patients in our study, with IILFM of 1–4.9 mm (subgroup 2), would also have been considered normal with respect to such Müllerian anomaly based on TV 3D US. Furthermore, an additional 36.4% of the patients in our study, with IILFM of 5–9.9 mm (subgroup 3), would also have been considered a variant of normal based on TV 3D US. This group of patients (subgroup 3) would have been considered to have an insignificant internal indentation length, irrespective of its appearance, according to recent ASRM guidelines [30]. In such patients, a hysteroscopy would not be recommended and, in turn, the diagnosis would have been missed. On the other hand, these patients (subgroup 3) would be considered to have an IUS, irrespective of its appearance, according to the ESHRE/ESGE classification [12].
All in all, in only 4.9% of our patients would TV 3D US with or without SIH have revealed an internal indentation length of ≥ 10 mm (subgroup 4) and, in turn, led to a correct diagnosis according to the criteria used in our study, provided that one does not follow the recent ASRM guidelines, which define IUS as an internal indentation length of > 15 mm. In 95.4% of the population studied, by contrast, TV 3D US would most likely have been interpreted as normal or a variant of normal for such Müllerian anomaly [30]. It is worth noting that even if the new ESHRE/ESGE classification of uterine septum (IILFM > 50% of myometrial thickness) had been used in our study, only 41.0% of the patients (subgroup 3 and subgroup 4) would have been suspected [12].
It is important to discuss our findings in the context of the recent ASRM guidelines and the ESHRE/ESGE classification [12, 30]. Some of our patients with AUA and an internal indentation length of 5–9.9 mm (subgroup 3) on TV 3D US would have been diagnosed with arcuate uterus, but it would have been thought to be of no clinical significance based on the recent ASRM guidelines [30]. On the other hand, all the patients in our study who had AUA with IILFM ≥ 10 mm on hysteroscopy would have been considered unclassifiable by the recent ASRM 2016 guidelines. In that context, it is worth noting that some investigators defined AUA as an anomaly with an arcuate appearance and an indentation length between 1.0 and 1.5 cm [18]. In contrast, all the patients in our study would have been considered to have an IUS based on the ESHRE/ESGE classification [12].
It should be emphasized that TV 3D US is an essential part of the work-up of infertile patients. The value of TV 3D US in the diagnosis of IUS and AUA is well established, and it is therefore a mandatory step in the assessment of the uterine cavity in patients with a suspected IUS, AUA, or bicornuate uterus, especially before planning operative hysteroscopy. TV 3D US is a noninvasive and reproducible method that can provide information about both the external contour and the uterine cavity at the same time. Hysteroscopy alone (without laparoscopy) cannot differentiate between septate and bicornuate uterus and should only be performed after TV 3D US is done in such patients.
Our study has its limitations. First, it is a retrospective study, with all the limitations related to this design. Second, TV 3D US was performed in the follicular phase. Many authors have suggested that the view is better in the luteal phase, as the endometrium is more prominent [8]. However, in our experience this was not a limiting factor in obtaining good visualization of the endometrial cavity. Another limitation is that the technique used to measure the internal indentation length at the fundal midline (IILFM) has not been validated and may not be perfectly accurate. However, even if that is the case, the difference may be a few millimeters and, in turn, may not affect the overall conclusions of this study. More studies are needed to confirm our findings. In future studies, the use of a novel graduated intrauterine palpator described recently by Di Spiezio et al. 2016 [6] may enhance the accuracy of measurement of IILFM on hysteroscopy. Another limitation is operator bias, as all surgeries were performed by the senior author, who was aware of the TV 3D US findings prior to surgery. On the other hand, this may be considered a strength of the study: despite the negative results on TV 3D US, the operator was not influenced by such findings at the time of hysteroscopy. A further strength of this study is that all the data are from one center, which eliminates variability in diagnostic measures between operators. In addition, the large sample size is another strength of this study.