R code: Back Transform from Caret’s preProcess()

This post provides a short R function for back-transforming data processed by caret’s preProcess() function, a step that is not yet implemented in the caret R package. This is useful, for example, when forecasting stock prices with deep learning models such as the LSTM, which require normalized input data, while the results need to be converted back to the original scale.

Reverse Transform from Caret’s preProcess()

The caret R package provides a very convenient function, preProcess(), which transforms given data into normalized or standardized form. However, it does not provide a back (or reverse) transformation function.

Transformation

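method = “center”, “scale”, or c(“center”, “scale”)

x′ = x − μ_x (center), x′ = x/σ_x (scale), x′ = (x − μ_x)/σ_x (both)

method = “range”, rangeBounds = c(a, b)

x′ = (x − x_min)/(x_max − x_min) × (b − a) + a

Here μ_x, σ_x, x_min, and x_max are the mean, standard deviation, minimum, and maximum of each column of the training data.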

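These transformations are performed with the preProcess() function in the caret R package: preProcess() estimates the parameters from the training data, and predict() applies the transformation.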

preProc <- preProcess(training, 
                      method = c("center", "scale"))
transformed <- predict(preProc, training)
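As a rough check of what these methods compute, the sketch below applies the standardization and range formulas by hand in base R. It does not call caret; the helper names standardize and rescale_range are made up for illustration, and the statistics are assumed to be the plain mean, standard deviation, minimum, and maximum of the input.

```r
# Hypothetical helpers (not part of caret): apply the formulas by hand.

# center and scale: x' = (x - mean) / sd
standardize <- function(x, mu = mean(x), sigma = sd(x)) {
    (x - mu) / sigma
}

# range: x' = (x - min) / (max - min) * (b - a) + a
rescale_range <- function(x, a, b, x_min = min(x), x_max = max(x)) {
    (x - x_min) / (x_max - x_min) * (b - a) + a
}

x <- c(2, 4, 6, 8, 10)
z <- standardize(x)
r <- rescale_range(x, a = -1, b = 1)

print(z)   # standardized values
print(r)   # values rescaled to [-1, 1]
```

Passing mu, sigma, x_min, and x_max explicitly makes it possible to reuse training statistics on new data, which is what predict() does with a fitted preProcess object.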

Back Transformation

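method = “center”, “scale”, or c(“center”, “scale”)

x = x′ + μ_x (center), x = x′σ_x (scale), x = x′σ_x + μ_x (both)

method = “range”, rangeBounds = c(a, b)

x = (x′ − a)/(b − a) × (x_max − x_min) + x_min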

These back transformations can be accomplished with the following R code.

#===========================================================
# back transform using the object from the caret preProcess
#===========================================================
 
back_preProc <- function(preProc, df_trans, digits = 10) {
    
    pp <- preProc
    nr <- nrow(df_trans)
    methods <- names(pp$method)   # methods stored in the preProcess object
    
    # repeat the stored column statistics over all rows
    av <- t(replicate(nr, pp$mean))
    st <- t(replicate(nr, pp$std))
    a  <- pp$rangeBounds          # c(a, b) of the range method
    x_max <- t(replicate(nr, pp$ranges[2,]))
    x_min <- t(replicate(nr, pp$ranges[1,]))
    
    if (all(c("center", "scale") %in% methods)) {
        df <- df_trans*st + av    # undo center and scale
    } else if ("center" %in% methods) {
        df <- df_trans + av       # undo center only
    } else if ("scale" %in% methods) {
        df <- df_trans*st         # undo scale only
    } else {
        # otherwise the method is assumed to be "range"
        df <- (df_trans-a[1])/(a[2]-a[1])*(x_max - x_min) + x_min
    }
    
    return(round(df, digits))
}
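The inverse range mapping can also be written with sweep(), which applies per-column statistics without building replicated matrices. The sketch below is an alternative under stated assumptions, not the function above: back_range and its arguments are hypothetical, and the column minima and maxima are passed in directly instead of being read from a preProcess object.

```r
# Hypothetical helper: invert a range transformation column by column.
#   x_min, x_max : per-column minima and maxima of the training data
#   bounds       : the c(a, b) interval used in the forward transform
back_range <- function(df_trans, x_min, x_max, bounds = c(0, 1)) {
    m    <- as.matrix(df_trans)
    unit <- (m - bounds[1]) / (bounds[2] - bounds[1])   # map back to [0, 1]
    out  <- sweep(unit, 2, x_max - x_min, "*")          # restore column width
    out  <- sweep(out, 2, x_min, "+")                   # restore column offset
    as.data.frame(out)
}

raw   <- data.frame(x = c(-10, 0, 4), y = c(-0.010, 0, 0.004))
x_min <- sapply(raw, min)
x_max <- sapply(raw, max)

# forward range transform to [-1, 1], done by hand
fwd <- sweep(as.matrix(raw), 2, x_min, "-")
fwd <- sweep(fwd, 2, x_max - x_min, "/") * 2 - 1

restored <- back_range(fwd, x_min, x_max, bounds = c(-1, 1))

# largest absolute round-trip error
print(max(abs(as.matrix(restored) - as.matrix(raw))))
```

Because sweep() works on the column margin of a matrix, the same code handles a single-column input without special treatment.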

Exercise

The exercise applies a range transformation to [-1, 1] to training and test sample data, then back-transforms both.

#========================================================#
# Quantitative ALM, Financial Econometrics & Derivatives 
# ML/DL using R, Python, Tensorflow by Sang-Heon Lee 
#
# https://kiandlee.blogspot.com
#--------------------------------------------------------#
# backtransform of caret::preProcess
#========================================================#
 
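# Note: rm(list = ls()) clears the workspace, so define (or source)
# back_preProc() after this line, before it is called below.
graphics.off(); rm(list = ls())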
 
library(caret)
 
#-----------------------------------------
# sample data
#-----------------------------------------
df <- data.frame(x = -10:10, y = -10:10*0.001)
 
#-----------------------------------------
# train/test splitting data
#-----------------------------------------
# When rows are selected from a one-column data frame, 
# the result becomes a vector. To avoid this and keep 
# a data frame, use the drop = FALSE option. 
df_train <- df[1:15, , drop = FALSE]
df_test <- df[16:21, , drop = FALSE]
 
 
#-----------------------------------------
# create transform function
#-----------------------------------------
preProc <- preProcess(df_train, method = "range", 
                      rangeBounds = c(-1, 1))
 
#=====================================================
# transform
#=====================================================
df_train_trans <- predict(preProc, df_train)
df_test_trans  <- predict(preProc, df_test)
 
    
#=====================================================
# back transform of train and test data
#=====================================================
df_train_back <- back_preProc(preProc, df_train_trans)
df_test_back  <- back_preProc(preProc, df_test_trans)
 
 
#-----------------------------------------
# print comparisons of raw, transformed, and back-transformed data
#-----------------------------------------
temp <- cbind(df_train, df_train_trans, df_train_back)
 
print("========= Train Data =========")
colnames(temp) <- c(
    paste0("raw_",colnames(df_train)),
    paste0("trans_",colnames(df_train_trans)),
    paste0("back_",colnames(df_train_back)))
print(temp)
    
print("========= Test Data  =========")
temp <- cbind(df_test, df_test_trans, df_test_back)
colnames(temp) <- c(
    paste0("raw_",colnames(df_test)),
    paste0("trans_",colnames(df_test_trans)),
    paste0("back_",colnames(df_test_back)))
print(temp)

Comparisons of the original, transformed, and back-transformed data deliver the expected results.

The upper and lower bounds of the transformed test data are not 1 and -1, because the raw data has a trend and the transformation parameters are estimated from the training data only. I use trending sample data to make this result distinct.
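A quick hand calculation makes this concrete. The sketch below reuses the x column of the sample data and the train/test split from the exercise, and applies the range formula with the training minimum and maximum. It is plain base R and does not call caret; the helper name scale_with_train is made up for illustration.

```r
x_train <- -10:4    # rows 1 to 15 of column x
x_test  <- 5:10     # rows 16 to 21 of column x
a <- -1; b <- 1     # rangeBounds = c(-1, 1)

# statistics come from the training data only
x_min <- min(x_train)
x_max <- max(x_train)

# hypothetical helper: range transform using the training min and max
scale_with_train <- function(x) {
    (x - x_min) / (x_max - x_min) * (b - a) + a
}

print(range(scale_with_train(x_train)))
print(range(scale_with_train(x_test)))
```

The training values span exactly [-1, 1], while every test value maps above 1 because the test rows lie beyond the training maximum.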

For additional insight on this topic and to download the R scripts, visit https://kiandlee.blogspot.com/2022/10/r-code-back-transform-from-carets.html.

This material is from SHLee AI Financial Model and is being posted with its permission. The views expressed in this material are solely those of the author and/or SHLee AI Financial Model and Interactive Brokers is not endorsing or recommending any investment or trading discussed in the material. This material is not and should not be construed as an offer to buy or sell any security. It should not be construed as research or investment advice or a recommendation to buy, sell or hold any security or commodity. This material does not and is not intended to take into account the particular financial conditions, investment objectives or requirements of individual customers. Before acting on this material, you should consider whether it is suitable for your particular circumstances and, as necessary, seek professional advice.