How to Get an AUC Confidence Interval


Blogger,, and Senior Data Scientist


AUC (area under the ROC curve) is an important metric for classification models in machine learning and is often used to summarize a model’s performance. In effect, AUC is a number between 0 and 1 that measures how well a model rank-orders its predictions: 1 means every positive case is scored above every negative case, and 0.5 is what random scoring would give. For a detailed explanation of AUC, see this link.
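To make the rank-ordering interpretation concrete, here is a minimal base R sketch that computes AUC as the share of positive/negative pairs in which the positive case gets the higher score (ties count as half). The helper name auc_by_pairs and the toy vectors are made up for illustration and are not part of the original walkthrough.

```r
# Hypothetical helper: AUC as the share of (positive, negative) pairs
# where the positive case receives the higher score; ties count as 0.5.
auc_by_pairs <- function(labels, scores) {
  pos <- scores[labels == 1]
  neg <- scores[labels == 0]
  # compare every positive score with every negative score
  diffs <- outer(pos, neg, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

# toy data (made up for illustration)
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8)

auc_by_pairs(labels, scores)
# 0.75 (3 of the 4 positive/negative pairs are ordered correctly)
```

The pairwise comparison is quadratic in the number of rows, so it is only meant to show the idea on small inputs.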

Since AUC is widely used, a confidence interval around it is valuable both for demonstrating a model’s performance and for comparing two or more models. For example, if model A has a higher AUC than model B, but the 95% confidence intervals around the two AUC values overlap, then the models may not be statistically different in performance. We can get a confidence interval around AUC using R’s pROC package, which computes the interval with DeLong’s method by default and can also use bootstrapping.
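Checking whether two intervals overlap is only a rough guide. For a formal comparison, pROC also provides roc.test, which can run DeLong’s test on two ROC curves built from the same observations. The sketch below uses simulated labels and scores (the variable names, seed, and noise levels are assumptions for illustration only), not the movies data.

```r
library(pROC)

set.seed(42)  # arbitrary seed so the simulated data is reproducible
n <- 500
y <- rbinom(n, 1, 0.5)

# two made-up score vectors: "model A" is less noisy than "model B"
score_a <- y + rnorm(n, sd = 1)
score_b <- y + rnorm(n, sd = 2)

# build one ROC curve per model on the same labels
roc_a <- roc(y, score_a, levels = c(0, 1), direction = "<", quiet = TRUE)
roc_b <- roc(y, score_b, levels = c(0, 1), direction = "<", quiet = TRUE)

# DeLong's test for a difference between the two AUCs
test_result <- roc.test(roc_a, roc_b, method = "delong")
test_result$p.value
```

A small p-value here suggests the two AUCs differ by more than sampling noise would explain.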

Building a simple model to test

To demonstrate how to get an AUC confidence interval, let’s build a model using a movies dataset from Kaggle (you can get the data here).

Reading in the data

# load packages
library(pROC)
library(dplyr)
library(randomForest)
# read in dataset
movies <- read.csv("movie_metadata.csv")
# remove records with missing budget / gross data
movies <- movies %>% filter(!is.na(budget) & !is.na(gross))

Split into train / test

Next, let’s randomly select 70% of the records to be in the training set and leave the rest for testing.

# get random sample of rows
train_rows <- sample(1:nrow(movies), .7 * nrow(movies))
# split data into train / test
train_data <- movies[train_rows,]
test_data <- movies[-train_rows,]
# select only fields we need
train_need <- train_data %>% select(gross, duration, director_facebook_likes, budget, imdb_score, content_rating, movie_title)
test_need <- test_data %>% select(gross, duration, director_facebook_likes, budget, imdb_score, content_rating, movie_title)
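Because sample draws different rows on every run, the split above (and therefore the AUC and its interval) will vary between runs. A minimal sketch of a reproducible split on a toy data frame follows; the seed value and the toy data are assumptions for illustration only.

```r
set.seed(123)  # arbitrary seed; any fixed value makes the split repeatable
toy <- data.frame(id = 1:10, x = rnorm(10))

# floor() makes the training-set size an explicit whole number
train_rows <- sample(seq_len(nrow(toy)), floor(0.7 * nrow(toy)))
train_toy <- toy[train_rows, ]
test_toy <- toy[-train_rows, ]

nrow(train_toy)  # 7
nrow(test_toy)   # 3
```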

Create the label

Lastly, we need to create our label, i.e., what we’re trying to predict. Here, we’re going to predict whether a movie’s gross beats its budget (1 if so, 0 if not).

train_need$beat_budget <- as.factor(ifelse(train_need$gross > train_need$budget, 1, 0))
test_need$beat_budget <- as.factor(ifelse(test_need$gross > test_need$budget, 1, 0))

Train a random forest

Now, let’s train a simple random forest model with just 50 trees.

# drop training rows with missing values in any selected field
train_need <- train_need[complete.cases(train_need),]
# train a random forest
forest <- randomForest(beat_budget ~ duration + director_facebook_likes + budget + imdb_score + content_rating,
                       train_need, ntree = 50)

Getting an AUC confidence interval

Next, let’s use our model to get predictions on the test set.

test_pred <- predict(forest, test_need, type = "prob")[,2]

And now, we’re ready to get our confidence interval! We can do that in just one line of code using the ci.auc function from pROC. By default, this function computes a 95% confidence interval using DeLong’s method, as the "(DeLong)" tag in the output indicates; bootstrapping (2000 replicates by default) is available by passing method = "bootstrap". Here, the 95% confidence interval for the AUC on the test set runs from 0.6198 to 0.6822, as can be seen below.

ci.auc(test_need$beat_budget, test_pred) 
# 95% CI: 0.6198-0.6822 (DeLong)
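The "(DeLong)" tag in the output names the method that was used. To request a specific method, ci.auc accepts a method argument, with boot.n controlling the number of bootstrap replicates. The sketch below runs both methods on simulated data; the labels, scores, and seed are made up for illustration, so the numbers will not match the movies example.

```r
library(pROC)

set.seed(1)  # arbitrary seed for the simulated data and the bootstrap
y <- rbinom(300, 1, 0.5)
pred <- y + rnorm(300)

roc_obj <- roc(y, pred, levels = c(0, 1), direction = "<", quiet = TRUE)

# DeLong interval (analytic, no resampling)
ci_delong <- ci.auc(roc_obj, method = "delong")

# bootstrap interval with 2000 resamples
ci_boot <- ci.auc(roc_obj, method = "bootstrap", boot.n = 2000)

ci_delong
ci_boot
```

Each result is a three-element vector holding the lower bound, the AUC estimate, and the upper bound. On a reasonably large sample the two intervals are usually close.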

We can adjust the confidence level using the conf.level parameter:

ci.auc(test_need$beat_budget, test_pred, conf.level = 0.9) 
# 90% CI: 0.6248-0.6772 (DeLong)
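To see how the confidence level translates into interval bounds, here is a hand-rolled percentile bootstrap in base R. It is a sketch of the general idea rather than pROC’s implementation: the helper names (auc_rank, boot_aucs, percentile_ci), the simulated data, and the seed are all assumptions for illustration.

```r
# AUC via the Mann-Whitney rank formula (ties get average ranks)
auc_rank <- function(labels, scores) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  r <- rank(scores)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# resample positives and negatives separately, recompute AUC each time
boot_aucs <- function(labels, scores, n_boot = 2000) {
  pos <- which(labels == 1)
  neg <- which(labels == 0)
  replicate(n_boot, {
    idx <- c(pos[sample.int(length(pos), replace = TRUE)],
             neg[sample.int(length(neg), replace = TRUE)])
    auc_rank(labels[idx], scores[idx])
  })
}

# the confidence level picks which quantiles of the bootstrap AUCs to report
percentile_ci <- function(aucs, conf.level = 0.95) {
  alpha <- 1 - conf.level
  quantile(aucs, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(7)  # arbitrary seed
y <- rbinom(400, 1, 0.5)
pred <- y + rnorm(400)

aucs <- boot_aucs(y, pred)
ci95 <- percentile_ci(aucs, 0.95)
ci90 <- percentile_ci(aucs, 0.90)
ci95
ci90
```

Because both intervals come from the same set of bootstrap AUCs, the 90% interval always sits inside the 95% one, which mirrors the narrower 90% interval in the output above.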

Originally posted on Blog.
