PORTFOLIO THEORY – LECTURE NOTES 4
Dr. Andrea Rigamonti

DOWNSIDE RISK MEASURES

So far we have measured risk using the variance (or the standard deviation, which is simply its square root). This rests on the assumption that returns are symmetrically distributed, or at least that the investor only cares about volatility as a whole, without distinguishing between upside and downside movements. While this is not realistic (investors want to limit losses, not gains), it greatly simplifies optimization procedures. Moreover, downside risk measures tend to be more difficult to estimate as inputs for optimization procedures, which may lead to worse performance despite targeting a more appropriate measure of risk. For these reasons, we do not consider portfolio optimization techniques targeting downside risk here. However, it is still important to be familiar with the most popular downside risk measures, as they are useful for performance evaluation, and some of them are also employed by regulatory authorities supervising the banking sector.

The downside risk measure most closely related to the variance is the semivariance, which is defined as:¹

$$\sigma_B^2 = \frac{1}{T} \sum_{t=1}^{T} \left[\min(R_t - B, 0)\right]^2$$

where $T$ is the number of periods in the estimation window, $R_t$ is the portfolio return in period $t$, and $B$ is the benchmark below which the investor considers volatility to count as risk. To apply this formula, one replaces with 0 the deviation $R_t - B$ of every return above the benchmark, and then proceeds as in the computation of the variance: the deviations are squared and averaged over the $T$ periods.

$B$ depends on the preferences of the investor. It is convenient (and also has some nice theoretical properties) to set $B$ equal to the risk-free rate. In this way, if one works with excess returns, $B$ can be treated as equal to zero. However, in principle, it can be set to any value.

The square root of $\sigma_B^2$ is called the downside deviation, which we denote by $\sigma_B$. The downside deviation is to the semivariance what the standard deviation is to the variance.
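As a minimal sketch of the semivariance and downside deviation formulas above, the following Python snippet uses only the standard library. The function names and the sample of excess returns are made up for illustration; the benchmark defaults to zero, which assumes the inputs are excess returns.

```python
import math


def semivariance(returns, benchmark=0.0):
    """Downside semivariance: average squared shortfall below the benchmark."""
    if not returns:
        raise ValueError("returns must be non-empty")
    # min(R_t - B, 0) keeps shortfalls and sets deviations above B to zero
    shortfalls = [min(r - benchmark, 0.0) for r in returns]
    # Divide by T (all periods), not by the number of shortfalls
    return sum(s * s for s in shortfalls) / len(returns)


def downside_deviation(returns, benchmark=0.0):
    """Square root of the downside semivariance."""
    return math.sqrt(semivariance(returns, benchmark))


# Hypothetical excess returns over five periods (illustrative only)
excess_returns = [0.02, -0.01, 0.03, -0.04, 0.01]
sv = semivariance(excess_returns)
dd = downside_deviation(excess_returns)
print(f"semivariance = {sv:.6f}, downside deviation = {dd:.6f}")
```

Only the two negative excess returns contribute to the sum, but the average is still taken over all five periods, as in the formula.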
¹ Technically, this is the downside semivariance, as it is also possible to compute an upside semivariance by replacing Min with Max in the formula. However, we are generally interested in the downside semivariance, which we therefore simply call "semivariance".

Theoretically, one can compute an optimal mean-semivariance or minimum semivariance portfolio by simply replacing the covariance matrix with the semicovariance matrix (the analogue of the covariance matrix in a downside risk setting) in the optimization procedures. However, estimating this matrix presents several challenges, so we do not address this topic. We still provide some intuition regarding what it means to target the semivariance, distinguishing between different scenarios:

• If the distribution is symmetric and the benchmark is equal to the sample mean, targeting the variance or the semivariance is always equivalent. One should therefore target the former, as its sample estimates are more accurate than those of the latter.

• If the distribution is symmetric but the benchmark is not equal to the sample mean, targeting the variance or the semivariance is equivalent only if we set a target return (i.e., mean-semivariance optimization). Minimizing the variance and minimizing the semivariance without a target return are not equivalent in this setting.

• If the distribution is not symmetric, targeting the variance is never equivalent to targeting the semivariance.

The following figure provides a graphical illustration.

To compute the risk-adjusted return in this context, the Sharpe ratio should be replaced by the Sortino ratio, which is similar to the Sharpe ratio but replaces the risk-free rate with the benchmark $B$, and the standard deviation with the downside deviation $\sigma_B$:

$$\text{Sortino} = \frac{R - B}{\sigma_B}$$

Another popular downside risk measure is the Value at Risk (VaR).
VaR measures the maximum potential loss that an investor can suffer over a certain period, with a $1 - \alpha$ confidence level. $\alpha$ is set by the investor; for example, $\alpha = 0.05$ corresponds to a 95% confidence level. More formally, given a profit and loss distribution $Y$, we can define VaR as:

$$\mathrm{VaR}_\alpha(Y) = -\inf\{y \in \mathbb{R} : P(Y \le y) > \alpha\}$$

For example, if we set $\alpha = 0.05$ and, when evaluating a set of returns, we get $\mathrm{VaR} = 0.04$, it means that we have a 5% chance of losing 4% or more in one period over the time horizon considered.

VaR can be computed in different ways. The most commonly used is the historical method: we simply rank the historical returns in increasing order and then read off the (typically negative) return at the $\alpha$ quantile (for $\alpha = 0.05$, the 5th percentile). Another possibility is the parametric method: we assume that returns follow a certain distribution and we compute the loss at the chosen quantile. Simulation ("Monte Carlo") approaches are also possible.

The main problem with VaR is that it is not a coherent risk measure. Consider the outcomes $V_1$ and $V_2$ of two investments. A risk measure is said to be coherent if it possesses the following desirable properties:

• Monotonicity: if $V_1$ is larger than or equal to $V_2$ in every possible scenario, then the risk of $V_1$ cannot be higher than the risk of $V_2$. Formally: if $V_1 \ge V_2$, then $\mathrm{Risk}(V_1) \le \mathrm{Risk}(V_2)$.

• Translation invariance: for any outcome $V$, adding a certain (riskless) amount $C$ reduces the risk by that amount. Formally: $\mathrm{Risk}(V + C) = \mathrm{Risk}(V) - C$.

• Positive homogeneity: multiplying all outcomes by a positive constant $\lambda$ should scale the risk measure by the same constant. In other words, if we invest, say, twice the original amount, the risk measure should also double. Formally: $\mathrm{Risk}(\lambda V) = \lambda\,\mathrm{Risk}(V)$.
• Subadditivity: the risk of a combination of two risky positions should be lower than or equal to the sum of the risks of the individual positions. In other words, diversifying by combining different assets should reduce risk, or at worst leave it unaffected, but it cannot increase it. Formally: $\mathrm{Risk}(V_1 + V_2) \le \mathrm{Risk}(V_1) + \mathrm{Risk}(V_2)$.

The Value at Risk satisfies the first three conditions, but not the last one. As it violates subadditivity, risk quantified using VaR can sometimes increase with greater diversification, which contradicts the intuition that diversification should not add risk. To overcome this problem, the Conditional Value at Risk (CVaR), also known as Expected Shortfall (ES), has been proposed:

$$\mathrm{CVaR}_\alpha(Y) = \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_u(Y)\, du$$

where $u$ is just the variable of integration and $du$ is the differential of this variable (i.e., we are integrating from 0 to $\alpha$ using infinitesimal increments in $u$). Since $\mathrm{VaR}_u(Y)$ is already defined as a positive loss, no additional minus sign is needed. In more intuitive terms, the CVaR measures the average (the "expected") loss that we get, given that the loss exceeds the VaR. As it is a coherent measure of risk, it is preferred to, and more commonly used than, the VaR. Of course, in order to compute the CVaR, you first need to compute the VaR.

The following figure provides a graphical intuition of VaR and CVaR:

On the return axis, the CVaR always lies to the left of the VaR (so, expressed as a positive loss, it is at least as large as the VaR), because it is the average loss that we have when we find ourselves in the red area to the left of the VaR.

Finally, another popular measure of downside risk is the drawdown (DD). The drawdown is the decline in the value of an investment from a peak to a low point. Different drawdown measures can be computed. A popular and easy-to-compute one is the maximum drawdown (MDD):

$$MDD = \frac{\text{Trough Value} - \text{Peak Value}}{\text{Peak Value}}$$

where the "Trough Value" is the lowest point reached after the peak, and the peak and trough are the pair that gives the largest decline in the series.
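The maximum drawdown formula above can be sketched in a few lines of Python. This is one possible reading of the definition, not necessarily the procedure used in the course: it tracks the running peak and keeps the largest percentage decline from it, so it assumes strictly positive values. The function name and the value series are hypothetical.

```python
def max_drawdown(values):
    """Largest peak-to-trough decline, as a negative fraction of the peak."""
    if not values:
        raise ValueError("values must be non-empty")
    peak = values[0]   # highest value seen so far
    mdd = 0.0          # most negative drawdown seen so far
    for v in values:
        peak = max(peak, v)
        # (current value - running peak) / running peak is <= 0
        mdd = min(mdd, (v - peak) / peak)
    return mdd


# Hypothetical portfolio values (illustrative only)
portfolio_values = [100.0, 120.0, 90.0, 110.0, 130.0, 91.0]
mdd = max_drawdown(portfolio_values)
print(f"MDD = {mdd:.2%}")
```

In this made-up series the largest decline runs from the peak of 130 to the trough of 91, which is larger than the earlier decline from 120 to 90.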
Obviously, a smaller MDD (in absolute value, since the formula above yields a negative number) is preferable to a larger one. In the worst possible case, the MDD is equal to 100% in absolute value, i.e., the value of the investment drops to zero.

Source: https://financetrain.com

MDD fails to consider the frequency and duration of losses, and does not account for the size of any gains. To account for the gains, we can use a more informative measure called the Calmar ratio:

$$\text{Calmar} = \frac{R - r_f}{MDD}$$

This is similar to the Sharpe ratio, but the MDD (taken in absolute value) is used instead of the standard deviation.
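The Calmar ratio above, and the historical VaR and CVaR estimators described earlier in these notes, are simple enough to sketch together in Python. Several choices here are assumptions rather than part of the notes: the historical VaR is taken as the smallest ordered return whose empirical probability exceeds $\alpha$, the CVaR is estimated as the average of the returns up to and including that one, $R$ in the Calmar ratio is taken to be the mean per-period return, and the MDD enters in absolute value. All names and data are hypothetical.

```python
import math


def historical_var_cvar(returns, alpha=0.05):
    """Historical VaR and CVaR, both reported as positive losses."""
    if not returns or not 0.0 < alpha < 1.0:
        raise ValueError("need a non-empty sample and 0 < alpha < 1")
    ordered = sorted(returns)  # worst return first
    # Smallest k such that the empirical probability k / T exceeds alpha
    k = min(math.floor(alpha * len(ordered)) + 1, len(ordered))
    var = -ordered[k - 1]
    # Assumed estimator: average loss over the k worst observations
    cvar = -sum(ordered[:k]) / k
    return var, cvar


def calmar_ratio(values, risk_free=0.0):
    """(mean per-period return - risk-free rate) / |maximum drawdown|."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    returns = [b / a - 1.0 for a, b in zip(values, values[1:])]
    mean_return = sum(returns) / len(returns)
    peak, mdd = values[0], 0.0
    for v in values:
        peak = max(peak, v)                # running peak
        mdd = min(mdd, (v - peak) / peak)  # largest decline from the peak
    if mdd == 0.0:
        raise ValueError("no drawdown in the sample; ratio undefined")
    return (mean_return - risk_free) / abs(mdd)


# Hypothetical per-period returns and portfolio values (illustrative only)
sample_returns = [0.02, -0.01, 0.03, -0.04, 0.01, 0.0,
                  -0.06, 0.015, 0.025, -0.02, 0.005, 0.01]
var, cvar = historical_var_cvar(sample_returns, alpha=0.1)
calmar = calmar_ratio([100.0, 120.0, 90.0, 110.0, 130.0, 91.0])
print(f"VaR = {var:.4f}, CVaR = {cvar:.4f}, Calmar = {calmar:.4f}")
```

With twelve observations and $\alpha = 0.1$, the VaR is the second-worst return and the CVaR averages the two worst returns, so the CVaR is the larger loss. In practice other quantile conventions (for example, interpolating between order statistics) and annualized returns in the Calmar ratio are also common.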