Take Home Exam 2019 Winter - Using R for Economics and Statistics Analysis

Problem 1

(20 points, 4 points each) This question uses the larger VIX data set {vixlarge.mat}, which contains the VIX data and the associated dates. (This is a MATLAB-format file; you can load it by installing the package “R.matlab” and calling its function readMat.) The CBOE VIX is colloquially referred to as the “fear index” or the “fear gauge”. We choose to study the VIX not only because of the widespread consensus that it is a barometer of overall market sentiment regarding investors’ risk appetite, but also because many trading strategies rely on the VIX index for hedging and speculative purposes.

(a) Plot the VIX data against date. Clearly label the horizontal and vertical axes.
(b) We know that volatility (\(y_{t}\) hereafter) exhibits a high degree of persistence, and it is likely that \(y_{t}\) is better forecast by using more lags, \(y_{t-1}, y_{t-2}, \ldots\) This leads us to the model with \(J\) lags: \[ y_{t}=\beta_{0}+\beta_{1} y_{t-1}+\beta_{2} y_{t-2}+\cdots+\beta_{J} y_{t-J}+u_{t} \] where \(u_{t}\) is the error term. Capturing long-range dependence might entail \(J=10\), \(J=20\), or higher. Let the dependent variable \(y\) be the VIX, let the first column of the regressor matrix \(X\) be the intercept term, and let the 2nd to 23rd columns of \(X\) be lags 1 to 22 of the VIX. Write your own code to implement the AR(1) to AR(22) models, and pick the best model by AIC and by BIC. Generate a table to illustrate your results. Are the two selections the same? If not, why?
(c) Set the window length to 3000 and forecast the next period \(y_{t+1}\); start from the beginning of the sample and roll forward until the end. For each roll, forecast using AR(1) to AR(22). Compute the mean squared forecast errors and the mean absolute forecast errors for AR(1) to AR(22) and report them in a table.
(d) In (b) and (c), estimating such a large number of coefficients could entail a lot of estimation error and lead to poor forecasting properties. Following Fernandes, Medeiros, and Scharth (2014), a very popular way to model the VIX index is the heterogeneous autoregressive (HAR) model of Corsi (2009). The HAR model is popular not only because it approximates the long memory and multiscaling properties of the VIX index well, but also because it is very easy to implement in practice. The standard HAR model in Corsi (2009) postulates that the \(h\)-step-ahead daily volatility \(y_{t+h}\) can be modeled by \[ y_{t+h}=\beta_{0}+\beta_{d} \bar{y}_{t}^{(1)}+\beta_{w} \bar{y}_{t}^{(5)}+\beta_{m} \bar{y}_{t}^{(22)}+\epsilon_{t+h} \] where we define \[ \bar{y}_{t}^{(l)} \equiv l^{-1} \sum_{s=1}^{l} y_{t-s} \] as the average of the previous \(l\) periods of \(y\) from period \(t\), and \(\left\{\epsilon_{t}\right\}\) is a zero-mean innovation process. A typical choice in the literature for the lag index vector \(l\) is \([1,5,22]\), so as to mirror the daily, weekly, and monthly components of the volatility process. Use the HAR model to repeat the forecasting exercises in (b) and (c), and compare the HAR model with the best AR model from (b) and (c) by AIC, BIC, and the rolling method. Which one is the best?
(e) Try to come up with an algorithm that can beat the best performing method stated in question (d). Clearly describe your motivation, the details of the algorithm, and the results.
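The steps in (b) to (d) can be sketched as follows. This is a minimal illustration under stated assumptions, not a model answer: the series is simulated rather than read from vixlarge.mat, the rolling window is 300 instead of 3000 so that the short simulated sample still yields forecasts, the information criteria use the Gaussian form \(T \log \hat{\sigma}^{2}\) plus a penalty on a common estimation sample, and the HAR regressors are taken to be the averages of the 1, 5, and 22 most recent observations available when the forecast is made. The helper names (fit_ar, roll_fc, loss) are hypothetical.

```r
set.seed(1)
# Simulated persistent series standing in for the VIX (assumption, not the real data)
y <- as.numeric(arima.sim(list(ar = c(0.6, 0.2)), n = 600)) + 20

max_lag <- 22
# embed() puts y_t in column 1 and y_{t-1}, ..., y_{t-22} in columns 2 to 23
E    <- embed(y, max_lag + 1)
yt   <- E[, 1]
lags <- E[, -1, drop = FALSE]

# AR(p) by OLS on a common sample, so the criteria are comparable across p
fit_ar <- function(p) {
  X   <- cbind(1, lags[, 1:p, drop = FALSE])
  b   <- solve(crossprod(X), crossprod(X, yt))
  res <- yt - X %*% b
  n_obs <- length(yt)
  k     <- p + 1
  s2    <- sum(res^2) / n_obs
  c(p = p, AIC = n_obs * log(s2) + 2 * k, BIC = n_obs * log(s2) + k * log(n_obs))
}
ic <- as.data.frame(t(sapply(1:max_lag, fit_ar)))
best_aic <- ic$p[which.min(ic$AIC)]
best_bic <- ic$p[which.min(ic$BIC)]

# HAR regressors: intercept plus averages of the last 1, 5, and 22 observations
har_x <- cbind(1, lags[, 1], rowMeans(lags[, 1:5]), rowMeans(lags[, 1:22]))

# Rolling one-step-ahead forecast errors with a fixed estimation window
roll_fc <- function(X, yv, window) {
  sapply(seq_len(length(yv) - window), function(i) {
    rows <- i:(i + window - 1)
    Xw   <- X[rows, , drop = FALSE]
    b    <- drop(solve(crossprod(Xw), crossprod(Xw, yv[rows])))
    yv[i + window] - sum(X[i + window, ] * b)
  })
}
loss <- function(e) c(MSFE = mean(e^2), MAFE = mean(abs(e)))

window   <- 300
ar_loss  <- t(sapply(1:max_lag, function(p)
  loss(roll_fc(cbind(1, lags[, 1:p, drop = FALSE]), yt, window))))
har_loss <- loss(roll_fc(har_x, yt, window))
```

Here `ic` holds the AIC and BIC table for AR(1) to AR(22), and `rbind(ar_loss, HAR = har_loss)` gives the forecast-error table. Because the BIC penalty per coefficient, \(\log T\), exceeds the AIC penalty of 2 for this sample size, the BIC choice can never be a larger model than the AIC choice.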

Problem 2

(18 points, 3 points each) (Commodity prices). Consider the daily gold price, London Bullion Market, price per troy ounce in U.S. dollars at 10:30 AM local time, from January 2, 1992 to March 31, 2015. See file GoLDLBM1030.txt.

Obtain the time plot of the gold price.
Let \(r_{t}\) be the log return of the daily gold price. Obtain the time plot of \(r_{t}\).
Are there serial correlations in the \(r_{t}\) series? You may use \(Q(10)\) to draw the conclusion.
Build an AR model for \(r_{t}\). Check the adequacy of the model.
Remove any parameter of the AR model with \(t\)-ratio less than 1.645 in absolute value. Write down the final model.
Use the final model to compute 1-step to 3-step ahead forecasts of \(r_{t}\) at the forecast origin March 31, 2015.
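The workflow in this problem can be sketched with base R alone. The sketch below is only an illustration under assumptions: the price series is simulated rather than read from the gold price file, the AR order is chosen by the AIC of `ar()` with a cap of 8 lags, and insignificant coefficients are removed by fixing them at zero through the `fixed` argument of `arima()`.

```r
set.seed(42)
# Simulated daily prices standing in for the gold series (assumption)
r_sim <- as.numeric(arima.sim(list(ar = 0.3), n = 1000, sd = 0.01))
price <- 1200 * exp(cumsum(r_sim))

r <- diff(log(price))                              # log returns r_t
q10 <- Box.test(r, lag = 10, type = "Ljung-Box")   # Q(10) test for serial correlation

# Order selection by AIC, then maximum likelihood estimation of the AR(p) model
p   <- max(1, ar(r, order.max = 8)$order)
fit <- arima(r, order = c(p, 0, 0))
t_ratio <- fit$coef / sqrt(diag(fit$var.coef))

# Refit with every parameter whose |t-ratio| < 1.645 fixed at zero (NA = estimate freely)
fixed <- ifelse(abs(t_ratio) < 1.645, 0, NA_real_)
fit2  <- arima(r, order = c(p, 0, 0), fixed = fixed, transform.pars = FALSE)

# Adequacy check: Q(10) on the residuals, adjusting for the free AR coefficients
n_ar_free <- sum(is.na(fixed[seq_len(p)]))
adequacy  <- Box.test(residuals(fit2), lag = 10, type = "Ljung-Box", fitdf = n_ar_free)

# 1-step to 3-step ahead forecasts from the end of the sample
fc <- predict(fit2, n.ahead = 3)
```

`fc$pred` holds the three point forecasts and `fc$se` their standard errors. With the real data, the forecast origin is the last observation in the file.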

Solution: First, we write a function to estimate \(\phi\).

Now, we can run a simulation with the function.

Problem 3

(32 points, 4 points each) If two asset returns \(R_{1, t}\) and \(R_{2, t}\) have correlation \(\rho\) and time-varying volatilities \(\sigma_{1, t}\) and \(\sigma_{2, t}\), then their covariance (GARCH covariance) is:

\[ \sigma_{12, t}=\rho \sigma_{1, t} \sigma_{2, t}, \] where \(\rho\) is assumed to be constant. In the following, estimate \(\rho\) between the two series.

1. Get the close prices of “MSFT” and “AMZN” from “2017-01-04” to “2019-12-25”, and compute the return as “msftret” and “amznret”.

## [1] "MSFT"

## [1] "AMZN"

1. Set the GARCH specification as “garchspec”: the mean function is a constant, the variance function is the standard GARCH, and the distribution is the skew t distribution.
1. Estimate the GARCH model for each return series.
1. Compute the standardized returns for each return series, compute \(\rho\) as the sample correlation of the standardized returns and store it as “msftamzncor”, and print the correlation.
1. Compute the GARCH covariance and plot it.
1. An important application of the GARCH covariance is to optimize the variance of a portfolio, which depends on the portfolio weights, the variances of all the assets, and the covariances between the asset returns. The variance of a portfolio of two assets (\(\sigma_{p, t}^{2}\)) with weight \(w_{1, t}\) invested in asset 1 and \(\left(1-w_{1, t}\right)\) in asset 2 is \[ \sigma_{p, t}^{2}=w_{1, t}^{2} \sigma_{1, t}^{2}+\left(1-w_{1, t}\right)^{2} \sigma_{2, t}^{2}+2 w_{1, t}\left(1-w_{1, t}\right) \sigma_{12, t}. \] There are many ways to define the optimal weight \(w_{1, t}^{*}\). One approach is to set \(w_{1, t}\) such that the portfolio variance \(\sigma_{p,t}^{2}\) is minimized. What is the first-order condition for \(w_{1, t}^{*}\)? What is the solution for \(w_{1, t}^{*}\)?
1. If we use “MSFT” and “AMZN” to construct the portfolio, what is the optimal weight for “MSFT”? Plot your results.
1. Following the CAPM, a stock’s beta is used to measure the systematic risk of the stock. It is defined as the covariance of the stock return and the market return, divided by the variance of the market return. The higher the beta, the riskier the stock and thus the higher the required rate of return. For US stocks, the market return is the return on the S&P 500. Compute the dynamic beta of “MSFT” and plot your results.

## [1] "GSPC"
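The portfolio and beta formulas above translate directly into vectorized R. The sketch below is an illustration under assumptions: the volatility paths and the correlations are made-up numbers rather than fitted GARCH output, and the function names (min_var_weight, port_var, dynamic_beta) are hypothetical. Setting the derivative of \(\sigma_{p, t}^{2}\) with respect to \(w_{1, t}\) to zero gives \(w_{1, t}^{*}=\left(\sigma_{2, t}^{2}-\sigma_{12, t}\right) /\left(\sigma_{1, t}^{2}+\sigma_{2, t}^{2}-2 \sigma_{12, t}\right)\), which the code checks numerically with `optimize()`.

```r
# Minimum-variance weight on asset 1, given volatilities and a constant correlation
min_var_weight <- function(sigma1, sigma2, rho) {
  cov12 <- rho * sigma1 * sigma2
  (sigma2^2 - cov12) / (sigma1^2 + sigma2^2 - 2 * cov12)
}

# Portfolio variance for weight w on asset 1 and (1 - w) on asset 2
port_var <- function(w, sigma1, sigma2, rho) {
  w^2 * sigma1^2 + (1 - w)^2 * sigma2^2 + 2 * w * (1 - w) * rho * sigma1 * sigma2
}

# Dynamic beta: covariance with the market divided by the market variance
dynamic_beta <- function(sigma_stock, sigma_mkt, rho) {
  (rho * sigma_stock * sigma_mkt) / sigma_mkt^2
}

# Hypothetical volatility paths (not estimated from data)
s <- seq(0, 6, length.out = 250)
sigma1    <- 0.015 + 0.005 * abs(sin(s))
sigma2    <- 0.020 + 0.008 * abs(cos(s))
sigma_mkt <- 0.010 + 0.003 * abs(sin(s / 2))
rho_12  <- 0.6   # assumed correlation between the two assets
rho_1m  <- 0.7   # assumed correlation between asset 1 and the market

w_star <- min_var_weight(sigma1, sigma2, rho_12)
beta_1 <- dynamic_beta(sigma1, sigma_mkt, rho_1m)

# Numerical check of the closed form at the first date
opt <- optimize(port_var, interval = c(-5, 5), sigma1 = sigma1[1],
                sigma2 = sigma2[1], rho = rho_12, tol = 1e-8)
```

With fitted models, the volatility vectors would be replaced by the conditional standard deviations from each GARCH fit, and `plot(w_star, type = "l")` or `plot(beta_1, type = "l")` would show the time paths.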

Problem 4

(30 points, 5 points each)

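The moving average (MA) is one of the most commonly used technical analysis methods in the stock market; it is used to find effective trading signals during large market swings. The moving average is not only simple but also effective. According to finance practitioners, moving-average models can beat most discretionary strategies and are an essential basic tool for trading stocks and futures. The moving-average system is an important part of technical analysis of the stock market. The core of technical analysis is in fact statistics: statistics of past prices, and the charts built from them, are used to form expectations about future price movements, and trading plans are made based on those expectations.

The moving average (MA) takes Dow Jones's “average cost concept” as its theoretical basis and applies the statistical principle of the “moving average”: the average stock prices over a period of time are joined into a curve that shows the historical fluctuation of the stock price and thereby reflects the future trend of the stock price index. It is a visual expression of Dow Theory.

The moving average is computed as the arithmetic mean of the closing prices over a number of consecutive days; the number of days is the parameter of the MA. In technical analysis, the moving average is an indispensable indicator. Using the statistical “moving average” principle, it averages the daily market prices over a moving window to obtain a trend value, which serves as a tool for judging price movements.

Formula: MA = (C1 + C2 + C3 + C4 + C5 + … + Cn)/n, where C is the closing price and n is the moving-average period.

Moving averages fall into three types according to their horizon: short-term, medium-term, and long-term. Short-term moving averages generally use a period of 5 or 10 days; medium-term moving averages mostly use 30 or 60 days; long-term moving averages mostly use 100 or 200 days. The moving average smooths the data series and helps identify the trend of the stock market. The larger n is, the less the moving average reflects short-term fluctuations in the series, but the better it captures the overall trend.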
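As a small worked example of the formula, the sketch below computes a trailing moving average with `stats::filter()` and illustrates the smoothing effect of a larger n. The helper name `sma` and the simulated prices are assumptions made for illustration only.

```r
# Trailing simple moving average: mean of the current and the previous (n - 1) closes
sma <- function(x, n) as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))

close_toy <- c(10, 11, 12, 13, 14, 15)
ma3 <- sma(close_toy, 3)   # first two entries are NA, then (10 + 11 + 12) / 3, and so on

# Smoothing effect on a simulated price path (assumption: not real market data)
set.seed(123)
close_sim <- 5 * exp(cumsum(rnorm(252, 0, 0.02)))
ma5  <- sma(close_sim, 5)
ma30 <- sma(close_sim, 30)

# Day-to-day changes of the longer average are much less volatile
sd_change_5  <- sd(diff(ma5),  na.rm = TRUE)
sd_change_30 <- sd(diff(ma30), na.rm = TRUE)
```

The first n - 1 values of each average are NA because a full window is not yet available, which is why a 30-day average only starts on the 30th trading day.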

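1. Obtain the historical closing prices over the most recent year for the A-share stock China Baoan (000009.XSHE); use the 252 observations up to December 27, 2019 as the trading-period observations.
1. Using the data from (1), construct the 5-day, 10-day, and 30-day moving averages and plot the closing price together with the moving averages of different lengths (that is, draw the three moving averages and the price line in one figure). Compare the characteristics of the different moving averages.
1. A moving average crosses the price line, and moving averages of different lengths cross each other; these crossing points can be used to identify trading signals. Golden cross: when the 10-day moving average crosses the 30-day moving average from below, so that the 10-day average lies above the 30-day average, the crossing point is a golden cross. A golden cross is a bullish signal; after it appears, the market has some room to rise, and it is the best time to enter. Death cross: when the 30-day moving average crosses the 10-day moving average from below, so that the 30-day average lies above the 10-day average, the crossing point is called a “death cross”. A death cross signals the arrival of a bear market in which prices will fall, and it is the best time to exit. Applying moving-average theory well and grasping the true market trend can yield considerable profits. Building a trading strategy involves two parts. The first is the trading signal: here it is determined by the crossing of the stock price and the 30-day moving average, buying when the price crosses above the 30-day moving average and selling when it crosses below. The second is the trading rule: we assume that we enter the market with no position and start trading when the first trading signal appears; when a buy signal appears, the entire position is invested at once and held until a sell signal appears, at which point everything is sold. Transaction costs and slippage are ignored, and trades can be executed at the closing price on the day the signal appears. Following the definition of the golden cross, find the dates on which buy signals appear and the dates on which sell signals appear.
1. Using the results from (3), compute the daily returns of the strategy and plot the time series of daily returns.
1. To judge whether a strategy is effective, its returns must be compared with those of a benchmark strategy. Here the benchmark is to buy the stock at the start of the trading period and hold it until the end of the trading period. Compute the daily returns of the benchmark strategy and plot the time series.
1. Compute the cumulative returns of the strategy in (3) and of the benchmark strategy, and plot them in one figure.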
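The trading rule in (3) to (6) can be sketched as follows. This is an illustration under assumptions rather than a model answer: the closing prices are simulated instead of downloaded for 000009.XSHE, returns are simple daily returns, and a trade made at the close on the signal day starts earning returns on the next day. The helper name `sma` is hypothetical.

```r
set.seed(2019)
n <- 252
close <- 5 * exp(cumsum(rnorm(n, 0, 0.02)))   # simulated closing prices (assumption)

sma  <- function(x, k) as.numeric(stats::filter(x, rep(1 / k, k), sides = 1))
ma30 <- sma(close, 30)

# Signals: price crossing the 30-day moving average
above <- close > ma30                      # NA until the 30-day window is full
prev  <- c(NA, head(above, -1))            # yesterday's state
buy   <- which(!is.na(prev) & !prev & above)   # price crosses above MA30
sell  <- which(!is.na(prev) & prev & !above)   # price crosses below MA30

# Position at each day's close: start flat, go fully long on a buy, flat on a sell
pos <- numeric(n)
state <- 0
for (i in seq_len(n)) {
  if (i %in% buy)  state <- 1
  if (i %in% sell) state <- 0
  pos[i] <- state
}

ret       <- c(0, diff(close) / head(close, -1))   # simple daily returns
strat_ret <- c(0, head(pos, -1)) * ret             # earn day-i return if long at close of day i - 1
bench_ret <- ret                                   # buy and hold over the whole period

cum_strat <- cumprod(1 + strat_ret) - 1
cum_bench <- cumprod(1 + bench_ret) - 1
```

`buy` and `sell` index the signal days, so with real data they would be used to look up the corresponding dates. Calling `plot(cum_bench, type = "l")` followed by `lines(cum_strat, col = 2)` draws both cumulative return paths in one figure.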

Instructor: Tao Zeng (曾涛)

Due date: January 17, 2020
