Title: Validation of models to diagnose ovarian cancer in patients managed surgically or conservatively: multicentre cohort study
Author: Ben Van Calster, Lil Valentin, Wouter Froyman, Chiara Landolfo, Jolien Ceusters, Antonia C Testa, Laure Wynants, Povilas Sladkevicius, Caroline Van Holsbeke, Ekaterini Domali, Robert Fruscio, Elisabeth Epstein, Dorella Franchi, Marek J Kudla, Valentina Chiappa, Juan L Alcazar, Francesco P G Leone, Francesca Buonomo, Maria Elisabetta Coccia, Stefano Guerriero, Nandita Deo, Ligita Jokubkiene, Luca Savelli, Daniela Fischerová, Artur Czekierdowski, Jeroen Kaijser, An Coosemans, Giovanni Scambia, Ignace Vergote, Tom Bourne, Dirk Timmerman
Abstract: Objective To evaluate the performance of diagnostic prediction models for ovarian malignancy in all patients with an ovarian mass managed surgically or conservatively.
Design Multicentre cohort study.
Setting 36 oncology referral centres (tertiary centres with a specific gynaecological oncology unit) or other types of centre.
Participants Consecutive adult patients presenting with an adnexal mass between January 2012 and March 2015 and managed by surgery or follow-up.
Main outcome measures Overall and centre specific discrimination, calibration, and clinical utility of six prediction models for ovarian malignancy (risk of malignancy index (RMI), logistic regression model 2 (LR2), simple rules, simple rules risk model (SRRisk), and assessment of different neoplasias in the adnexa (ADNEX) with or without CA125). ADNEX allows the risk of malignancy to be subdivided into risks of a borderline, stage I primary, stage II-IV primary, or secondary metastatic malignancy. The outcome was based on histology if patients underwent surgery, or on results of clinical and ultrasound follow-up at 12 (±2) months. Multiple imputation was used when the outcome based on follow-up was uncertain.
Results The primary analysis included 17 centres that met strict quality criteria for surgical and follow-up data (5717 of all 8519 patients). 812 patients (14%) had a mass that was already in follow-up at study recruitment, leaving 4905 patients for the statistical analysis. The outcome was benign in 3441 (70%) patients and malignant in 978 (20%). Uncertain outcomes (486, 10%) were most often explained by limited follow-up information. The overall area under the receiver operating characteristic curve was highest for ADNEX with CA125 (0.94, 95% confidence interval 0.92 to 0.96), ADNEX without CA125 (0.94, 0.91 to 0.95), and SRRisk (0.94, 0.91 to 0.95), and lowest for RMI (0.89, 0.85 to 0.92). Calibration varied among centres for all models; however, the ADNEX models and SRRisk were the best calibrated. Calibration of the estimated risks for the tumour subtypes was good for ADNEX irrespective of whether CA125 was included as a predictor. Overall clinical utility (net benefit) was highest for the ADNEX models and SRRisk, and lowest for RMI. For patients who received at least one follow-up scan (n=1958), the overall area under the receiver operating characteristic curve ranged from 0.76 (95% confidence interval 0.66 to 0.84) for RMI to 0.89 (0.81 to 0.94) for ADNEX with CA125.
Conclusions Our study found that the ADNEX models and SRRisk are the best models to distinguish between benign and malignant masses in all patients presenting with an adnexal mass, including those managed conservatively.
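Illustrative note: clinical utility is reported above as net benefit. The abstract does not spell out the formula, so the sketch below assumes the standard decision-curve definition: true positives per patient minus false positives per patient, weighted by the odds of the risk threshold. The function name `net_benefit`, the 10% threshold, and the four-patient toy data are hypothetical and are not taken from the study.

```python
def net_benefit(risks, outcomes, threshold):
    """Net benefit of classifying a mass as malignant when risk >= threshold.

    Assumes the standard decision-curve definition:
        NB = TP/N - (FP/N) * threshold / (1 - threshold)
    risks:    predicted probabilities of malignancy (0..1)
    outcomes: 1 = malignant, 0 = benign
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    if len(risks) != len(outcomes) or not risks:
        raise ValueError("risks and outcomes must be non-empty and equal length")
    n = len(risks)
    # Count patients flagged as high risk, split by true outcome.
    tp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    fp = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy data (not study data): four patients, two malignant.
toy_risks = [0.05, 0.20, 0.60, 0.90]
toy_outcomes = [0, 0, 1, 1]

model_nb = net_benefit(toy_risks, toy_outcomes, 0.10)
# Reference strategy "treat all": every patient is flagged regardless of risk.
treat_all_nb = net_benefit([1.0] * len(toy_outcomes), toy_outcomes, 0.10)
```

Under this definition, a model is clinically useful at a given threshold only if its net benefit exceeds both the treat-all strategy and the treat-none strategy (net benefit 0).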