Dynamic Nelson-Siegel model with R code
Sang-Heon Lee
This article explains how to estimate the parameters of the dynamic Nelson-Siegel (DNS) model (Diebold and Li, 2006; Diebold, Rudebusch, and Aruoba, 2006) using the Kalman filter. We estimate not only the parameters but also the filtered latent factors, namely level, slope, and curvature, using R code.
Dynamic Nelson-Siegel model
1. DNS model
The dynamic Nelson-Siegel model can be written in state space form, consisting of a measurement equation and a state equation, as follows.
\begin{align} y_t (\boldsymbol{\tau}) &= B(\boldsymbol{\tau}) X_t + \epsilon_t (\boldsymbol{\tau}) \\ X_t - \mu_X &= \phi_X (X_{t-1}-\mu_X) + \eta_t \end{align}
\[ B(\boldsymbol{\tau}) = \begin{bmatrix} 1 & \displaystyle \frac{1-e^{-\tau_1 \lambda}}{\tau_1 \lambda} & \displaystyle \frac{1-e^{-\tau_1 \lambda}}{\tau_1 \lambda}-e^{-\tau_1 \lambda} \\ 1 & \displaystyle \frac{1-e^{-\tau_2 \lambda}}{\tau_2 \lambda} & \displaystyle \frac{1-e^{-\tau_2 \lambda}}{\tau_2 \lambda}-e^{-\tau_2 \lambda} \\ \vdots & \vdots & \vdots \\ 1 & \displaystyle \frac{1-e^{-\tau_N \lambda}}{\tau_N \lambda} & \displaystyle \frac{1-e^{-\tau_N \lambda}}{\tau_N \lambda}-e^{-\tau_N \lambda} \end{bmatrix}, \] \[ y_t (\boldsymbol{\tau}) = \begin{bmatrix} y_t(\tau_1) \\ y_t(\tau_2) \\ \vdots \\ y_t(\tau_N) \end{bmatrix}, \quad \epsilon_t (\boldsymbol{\tau}) = \begin{bmatrix} \epsilon_t(\tau_1) \\ \epsilon_t(\tau_2) \\ \vdots \\ \epsilon_t(\tau_N) \end{bmatrix}, \] \[ \phi_X = \begin{bmatrix} \phi_L & 0 & 0 \\ 0 & \phi_S & 0 \\ 0 & 0 & \phi_C \end{bmatrix}, \quad \mu_X = \begin{bmatrix} \mu_L \\ \mu_S \\ \mu_C \end{bmatrix} \]
where \(y_{t}(\tau)\) is the continuously compounded spot rate of maturity \(\tau\) at time \(t\), and \(X_t = (L_t, S_t, C_t)^{\prime}\) collects the level, slope, and curvature factors. Their unconditional means and autoregressive coefficients are denoted by \( \mu_L, \mu_S, \mu_C \) and \( \phi_L, \phi_S, \phi_C \), respectively, and \( \lambda \) is the exponential decay parameter. The disturbances are multivariate normal: \( \epsilon_{t} \sim N( 0_{N \times 1}, \operatorname{diag}(\sigma_{\tau_1}^2,\sigma_{\tau_2}^2,\dots,\sigma_{\tau_N}^2)) \) in the measurement equation and \( \eta_{t} \sim N( 0_{3 \times 1}, \operatorname{diag}(\sigma_L^2,\sigma_S^2,\sigma_C^2)) \) in the state equation.
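For intuition, the loading matrix \(B(\boldsymbol{\tau})\) can be evaluated directly from \(\lambda\) and a vector of maturities. Below is a minimal R sketch; the function name ns_loadings and the example values of \(\lambda\) and the maturities are illustrative only (the article's own code defines the equivalent function NS.B).

# minimal sketch: Nelson-Siegel factor loadings for a vector of maturities
# (ns_loadings is an illustrative name; the article's code calls it NS.B)
ns_loadings <- function(lambda, tau) {
    level     <- rep(1, length(tau))                        # loading on L_t
    slope     <- (1 - exp(-lambda * tau)) / (lambda * tau)  # loading on S_t
    curvature <- slope - exp(-lambda * tau)                 # loading on C_t
    cbind(level, slope, curvature)
}

# example: loadings at a few maturities with an illustrative lambda = 0.44
round(ns_loadings(0.44, c(0.25, 1, 5, 10, 20)), 4)

The slope loading starts near one at the shortest maturity and decays toward zero, while the curvature loading is hump-shaped, which is what lets the three columns act as level, slope, and curvature.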
Because the DNS model in state space form is linear in the factors, we can use the Kalman filter to estimate the parameters by numerical optimization. Letting \( \psi_0 = (I - \phi_X) \mu_X \) and \( \psi_1 = \phi_X \), the Kalman filter is the following recursion. \begin{align} X_{t|t-1} &= \psi_0 + \psi_1 X_{t-1|t-1} \\ V_{t|t-1} &= \psi_1 V_{t-1|t-1} \psi_1^{\prime} + \Sigma_{\eta} \\ e_{t|t-1} &= y_t - B(\boldsymbol{\tau}) X_{t|t-1} \\ ev_{t|t-1} &= B(\boldsymbol{\tau}) V_{t|t-1} B(\boldsymbol{\tau})^{\prime} + \Sigma_{\epsilon}\\ X_{t|t} &= X_{t|t-1} + K_t e_{t|t-1} \\ V_{t|t} &= V_{t|t-1} - K_t B(\boldsymbol{\tau}) V_{t|t-1} \end{align} where \(K_t = V_{t|t-1} B(\boldsymbol{\tau})^{\prime} ev_{t|t-1}^{-1}\) is the Kalman gain, which reflects the uncertainty of the prediction and serves as the weight with which the time \(t-1\) prediction is updated after observing the time \(t\) data.
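As a hedged illustration, one recursion of the filter above maps almost line by line into R. The helper name kf_step and its argument names are hypothetical; the article's code performs the same steps inside its loglike loop shown in Section 2.

# minimal sketch of one Kalman filter recursion for the DNS state space model
# (kf_step is an illustrative helper, not part of the original code)
kf_step <- function(y, prevX, prevV, Phi0, Phi1, B, Q, H) {
    # prediction
    Xhat <- Phi0 + Phi1 %*% prevX
    Vhat <- Phi1 %*% prevV %*% t(Phi1) + Q
    # prediction error and its variance
    e  <- y - B %*% Xhat
    ev <- B %*% Vhat %*% t(B) + H
    # Kalman gain and updating
    K <- Vhat %*% t(B) %*% solve(ev)
    list(X = Xhat + K %*% e,            # X_{t|t}
         V = Vhat - K %*% B %*% Vhat,   # V_{t|t}
         e = e, ev = ev)
}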
Since the Kalman filter is an iterative procedure, we use \(X_{1|0} = \mu_X \) and \(V_{1|0} = (I-\phi_X \phi_X^{\prime})^{-1} \Sigma_{\eta} \) as initial values. We then use a numerical optimization algorithm to search for the parameters that maximize the log likelihood, which is constructed from the conditional prediction errors (\(e_{t|t-1}\)) and their variances (\(ev_{t|t-1}\)). \begin{align} \ln L (\boldsymbol{\theta}) &= -\frac{NT}{2} \ln(2\pi) - \frac{1}{2} \sum_{t=1}^{T} \ln |ev_{t|t-1}| \\ &\quad - \frac{1}{2} \sum_{t=1}^{T} e_{t|t-1}^{\prime}(ev_{t|t-1})^{-1}e_{t|t-1} \end{align}
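Putting initialization, recursion, and likelihood together, a minimal sketch of the likelihood evaluation (reusing the hypothetical kf_step helper above; the article's loglike function in Section 2 implements the same logic) looks like this.

# minimal sketch: accumulate the log likelihood over T observations
# (assumes m.spot is a T x N matrix of yields and that B, mu_X, phi_X, Q, H
#  have already been built from the parameter vector)
dns_loglik <- function(m.spot, B, mu_X, phi_X, Q, H) {
    Phi0 <- (diag(3) - phi_X) %*% mu_X
    Phi1 <- phi_X
    X  <- mu_X                                        # X_{1|0} = mu_X
    V  <- solve(diag(3) - phi_X %*% t(phi_X)) %*% Q   # V_{1|0}
    ll <- 0
    for (t in 1:nrow(m.spot)) {
        s  <- kf_step(m.spot[t, ], X, V, Phi0, Phi1, B, Q, H)
        ll <- ll - 0.5 * ncol(m.spot) * log(2 * pi) -
              0.5 * log(det(s$ev)) - 0.5 * t(s$e) %*% solve(s$ev) %*% s$e
        X <- s$X; V <- s$V                            # carry filtered state forward
    }
    as.numeric(ll)
}

The optimizer then minimizes the negative of this value; in the article's code this is done with optim and the Nelder-Mead method.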
2. R code for DNS model
Now we can implement the DNS model in R as follows. The estimation uses the monthly KTB (Korean Treasury Bond) term structure of zero-coupon spot rates from January 2011 to December 2019. The data are end-of-month, zero-coupon yields at thirteen maturities: 0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 4, 5, 7, 10, and 20 years.
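For reference, the maturity vector that the code below reads from the spreadsheet's column headers should therefore correspond to the following (an assumption about the file layout, since the Excel file itself is not shown here):

# the thirteen maturities (in years); in the code they are taken from the
# column names of the Excel sheet
v.mat <- c(0.25, 0.5, 0.75, 1, 1.5, 2, 2.5, 3, 4, 5, 7, 10, 20)
length(v.mat)  # 13 maturities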
#=========================================================================#
# Financial Econometrics & Engineering, ML/DL using R, Python, Tensorflow
# by Sang-Heon Lee
#
# https://kiandlee.blogspot.com
#-------------------------------------------------------------------------#
# The estimation of Dynamic Nelson-Siegel using Kalman filter
#=========================================================================#

library(readxl)
library(expm)
library(numDeriv)

graphics.off()   # clear all graphs
rm(list = ls())  # remove all objects from the workspace

setwd("D:/a_book_FIER_Ki_Lee/ch04_KICS/code/dns_DNS_est")

# DNS factor loading matrix
NS.B <- function(lambda, tau) {
    col1 <- rep.int(1, length(tau))
    col2 <- (1 - exp(-lambda*tau))/(lambda*tau)
    col3 <- col2 - exp(-lambda*tau)
    return(cbind(col1, col2, col3))
}

# parameter restrictions
trans <- function(b0) {
    b1 <- b0
    b1[1:3]      <- 1/(1 + exp(b0[1:3]))  # AR coefficients, state equation
    b1[7:9]      <- b0[7:9]^2             # variances of factor disturbances
    b1[10]       <- b0[10]^2              # scalar lambda (decay parameter)
    b1[11:npara] <- b0[11:npara]^2        # variances of measurement errors
    return(b1)
}

# inverse transformation of parameters
inv_trans <- function(b0) {
    b1 <- b0
    b1[1:3]      <- log(1/b0[1:3] - 1)    # AR coefficients, state equation
    b1[7:9]      <- sqrt(b0[7:9])         # variances of factor disturbances
    b1[10]       <- sqrt(b0[10])          # scalar lambda (decay parameter)
    b1[11:npara] <- sqrt(b0[11:npara])    # variances of measurement errors
    return(b1)
}

# negative log likelihood function to be minimized
loglike <- function(para_un, m.spot) {

    para_con <- trans(para_un)           # parameter restrictions

    A      <- diag(para_con[1:3])        # AR matrix
    MU     <- para_con[4:6]              # mean vector
    Q      <- diag(para_con[7:9])        # covariance in state eq.
    lambda <- para_con[10]               # lambda
    H      <- diag(para_con[11:npara])   # covariance in measurement eq.

    # factor loading matrix
    B  <- NS.B(lambda, v.mat)
    tB <- t(B)

    # initialization
    prevX <- MU
    prevV <- solve(diag(gnk) - A%*%t(A))%*%Q
    Phi0  <- (diag(gnk) - A)%*%MU
    Phi1  <- A

    loglike <- 0

    for (t in 1:nobs) {

        # prediction
        Xhat <- Phi0 + Phi1%*%prevX
        Vhat <- Phi1%*%prevV%*%t(Phi1) + Q

        # prediction error
        y_real <- m.spot[t,]      # observed yields
        y_fit  <- B%*%Xhat        # predicted yields
        v      <- y_real - y_fit  # prediction error

        # updating
        ev    <- B%*%Vhat%*%tB + H
        evinv <- solve(ev)
        KG    <- Vhat%*%tB%*%evinv     # Kalman gain
        prevX <- Xhat + KG%*%v         # E[X|y_t]   updated mean
        prevV <- Vhat - KG%*%B%*%Vhat  # Cov[X|y_t] updated covariance

        # log likelihood function
        loglike <- loglike - 0.5*nmat*log(2*pi) -
                   0.5*log(det(ev)) - 0.5*t(v)%*%evinv%*%v

        gm.factor[t,] <<- prevX  # store filtered factors (global assignment)
    }
    return(-loglike)
}

#=========================================================================#
# Main : DNS term structure model estimation
#=========================================================================#

gnk <- 3  # number of latent factors (L, S, C)

# read excel spot data
file    <- "spot_2011_2019.xlsx"
sheet   <- "monthly"
df.spot <- read_excel(file, sheet, col_names = TRUE)

# divide date and data
v.ymd  <- df.spot[,1]
v.mat  <- as.numeric(colnames(df.spot)[-1])
m.spot <- as.matrix(df.spot[,-1])

nmat <- length(v.mat)  # number of maturities
nobs <- nrow(m.spot)   # number of observations

# factor estimates
gm.factor <- matrix(0, nobs, gnk)

#-----------------------------------------------------------------
# initial guess (constrained scale); transformed to unconstrained
#-----------------------------------------------------------------
init_para_con <- c(
    # factor AR(1) coefficients
    9.673486e-01,  9.011626e-01,  7.891438e-01,
    # unconditional factor means
    3.315541e-02, -1.089514e-02, -6.660581e-03,
    # factor variances
    3.194870e-06,  3.202131e-06,  1.105737e-05,
    # lambda
    4.375971e-01,
    # measurement error variances
    3.344444e-07, 1.030344e-07, 5.787589e-12, 2.783731e-08,
    9.018871e-08, 5.191208e-08, 4.779619e-08, 1.501038e-07,
    4.134415e-08, 7.313451e-08, 5.929881e-08, 7.771536e-08,
    7.063936e-07
)

npara <- length(init_para_con)  # number of parameters
init_para_un <- inv_trans(init_para_con)

m <- optim(init_para_un, loglike,
           control = list(maxit = 5000, trace = 2),
           method = c("Nelder-Mead"), m.spot = m.spot)

prev_likev <- m$value
v.likev    <- m$value

i <- 1
repeat {
    print(paste(i, "-th iteration"))

    m <- optim(m$par, loglike,
               control = list(maxit = 5000, trace = 2),
               method = c("Nelder-Mead"), m.spot = m.spot)

    v.likev <- cbind(v.likev, m$value)
    print(paste(m$value, " <- likelihood value"))

    if (abs(m$value - prev_likev) < 0.00000000001) {
        break
    }

    prev_likev <- m$value
    i <- i + 1
    print(v.likev)
}

name_theta <- c("a_1", "a_2", "a_3", "mu_1", "mu_2", "mu_3",
                "sigma_1", "sigma_2", "sigma_3", "lambda",
                "epsilon_1", "epsilon_2", "epsilon_3", "epsilon_4",
                "epsilon_5", "epsilon_6", "epsilon_7", "epsilon_8",
                "epsilon_9", "epsilon_10", "epsilon_11", "epsilon_12",
                "epsilon_13")

# draw 3 factor estimates
x11(width = 6, height = 5)
matplot(gm.factor, type = "l", ylab = "L,S,C", lty = 1,
        main = "DNS 3 Factor Estimates (L,S,C)", lwd = 2)

# Delta method for statistical inference
grad    <- jacobian(trans, m$par)
hess    <- hessian(func = loglike, x = m$par, m.spot = m.spot)
vcv_con <- grad%*%solve(hess)%*%t(grad)

# parameter | std.err | t-value | p-value
theta  <- trans(m$par)
stderr <- sqrt(diag(vcv_con))
tstat  <- theta/stderr
pvalue <- 2*pt(-abs(tstat), df = nobs - npara)

df.est <- cbind(theta, round(stderr, 4), round(tstat, 4), round(pvalue, 4))
rownames(df.est) <- name_theta  # parameter names
colnames(df.est) <- c("parameter", "std.err", "t-stat", "p-value")
print(df.est)
3. Estimated DNS model
Running the above R code for the DNS model, we obtain the estimated parameters and the latent factor estimates (\(L, S, C\)). The following figure shows the convergence of the log likelihood together with the estimated parameters and their standard errors, t-statistics, and p-values. We use the delta method for statistical inference.
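As a quick check, the fitted yield curve for any month can be rebuilt from the measurement equation using the estimated \(\lambda\) and the filtered factors. This is a minimal sketch that assumes the objects theta, gm.factor, v.mat, m.spot, nobs, and NS.B from the code above are still in the workspace.

# minimal sketch: fitted yield curve for the last observation, y_hat = B(tau) X_t
lambda_hat <- theta[10]               # estimated decay parameter
B_hat <- NS.B(lambda_hat, v.mat)      # factor loadings at estimated lambda
y_fit <- B_hat %*% gm.factor[nobs, ]  # fitted yields for the last month
plot(v.mat, m.spot[nobs, ], pch = 16,
     xlab = "maturity (years)", ylab = "yield",
     main = "Observed vs fitted curve (last month)")
lines(v.mat, y_fit, lwd = 2)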
We can also plot the time-varying filtered estimates of the latent factors (level, slope, and curvature).
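A minimal sketch of such a plot, assuming the first column of df.spot can be coerced to Date (an assumption about the data layout):

# minimal sketch: filtered level, slope, curvature over time
# (assumes the first column of df.spot holds month-end dates)
dates <- as.Date(df.spot[[1]])
plot(dates, gm.factor[, 1], type = "l", lwd = 2, ylim = range(gm.factor),
     xlab = "", ylab = "L, S, C", main = "Filtered DNS factor estimates")
lines(dates, gm.factor[, 2], lwd = 2, col = 2)
lines(dates, gm.factor[, 3], lwd = 2, col = 3)
legend("topright", c("Level", "Slope", "Curvature"), lty = 1, lwd = 2, col = 1:3)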
References
Diebold, F.X., Li, C., 2006. Forecasting the term structure of government bond yields. Journal of Econometrics 130 (2), 337-364.
Diebold, F.X., Rudebusch, G.D., Aruoba, S.B., 2006. The macroeconomy and the yield curve: a dynamic latent factor approach. Journal of Econometrics 131 (1-2), 309-338.
\(\blacksquare\)