Designing and Applying a Nonparametric Option Valuation Model

This paper derives, tests and discusses a comprehensive and easy-to-use nonparametric option-valuation model that combines a representative set of historical data on underlying asset returns with minimal implied information on current market expectations of trend and volatility. Testing on empirical data from Warsaw Stock Exchange trading on two distinct days in 2014 suggests that such distribution-free models can deliver useful market insights and are practically applicable, particularly where derivative markets are relatively new, incomplete or illiquid, or for the valuation of real options.


Introduction
Since the inception of the Black-Scholes model (B-S), two distinct theoretical approaches have predominantly been used to derive the value of financial options (Black and Scholes, 1973, applied both). The first (equilibrium models), represented by e.g. Merton (1976), Cox et al. (1985) and Hull and White (1987), assumes stochastic processes spanning the economy and estimates their parameters. The other (no-arbitrage models), represented by e.g. Cox and Ross (1976), Harrison and Kreps (1979) and Rubinstein (1994), calibrates to a set of market prices in complete markets.
Both methods derive from the premise that all market prices are correct (in other words, markets are efficient) and therefore they cannot, by definition, allow profit opportunities. However, their assumptions have always been disputed, both from the theoretical and empirical points of view (a technical summary is provided by Campbell et al., 1997, pp. 27-80; see also the contextual narrative by Vlachý, 2013).
Early studies (Black and Scholes, 1972) found that B-S systematically biases the impacts of the strike price and the time to maturity on option prices. Following Latane and Rendleman (1976), focus has shifted towards the measure of implied volatility. Rubinstein (1994) documented the existence and development of the volatility smile and volatility skew. Shimko (1993) and others demonstrated that implied distributions of S&P 500 are negatively skewed and leptokurtic, all refuting the lognormality assumption of Black and Scholes (1973). An elaborate analysis of the distributional effects is provided by e.g. Stádník (2014), who derived them from trading feedbacks.
Researchers have sought to redress these issues by incorporating stochastic volatility (Hull and White, 1987) and jumps (Bates, 1996) into parametric models. However, Das and Sundaram (1999) showed that these features mitigate but do not eliminate the smile. Aside from empirical rebuttals, the efficiency assumption has been disproved fundamentally, as by Grossman and Stiglitz (1980), and finally made unconvincing by the joint-hypothesis argument, which stipulates that the market efficiency theory is not falsifiable and thus cannot be proved by any conceivable test (Lo and MacKinlay, 1999).
Ultimately, although parametric models exist that describe the actual behavior of markets more realistically than B-S (i.e. improve data fit), they are not widely used by practitioners, who consider them impractical. Practitioners instead revert to simple B-S without seeking fundamental theoretical substantiation, using it solely as a smoothing tool calibrated against observed implied volatilities (Dumas et al., 1998; Bates, 2003; Berkowitz, 2010). Some or all of these concerns have motivated various attempts to use nonparametric, i.e. distribution-free, models derived directly from observed historical returns, an approach resembling the heuristics used by practitioners many decades before the B-S era (Vlachý, 2014). Most of these models use interpolation and smoothing techniques such as kernel regression (Aït-Sahalia and Lo, 1998), neural networks (Hutchinson et al., 1994) or splines (Bates, 2000). The fundamental argument against this family of models is that they require vast amounts of option market price data for calibration, yet do not constitute any general theory of option pricing. They are also highly prone to overfitting and data-snooping (Campbell et al., 1997, pp. 523-524).
Direct use of the discrete historical distribution function (histogram) is much less common in the literature and comprises two distinct approaches. On the one hand, Stutzer (1996) and Alcock and Carmichael (2008) utilized the relative entropy principle for predictive pricing based solely on historical returns. This measure is designated in Stutzer's pioneering paper as the "canonical estimator", and the line of reasoning is further developed and generalized by Liu (2010) within the framework of uncertainty theory. On the other hand, Chen and Palmon (2005) used the Capital Asset Pricing Model to derive the risk-adjusted discount rate, requiring complete series of historical option price data. Virtually all available studies rely on market prices from S&P 500 trading.
The research presented herein is innovative in two respects. First, it applies data from the Warsaw Stock Exchange, intentionally selecting a developing market with a degree of efficiency and completeness substantially lower than that of major international markets (Strawiński and Ślepaczuk, 2008), while offering ample well-documented market information. We note that parametric option-valuation models were tested on this market by Piontek (2007) and Kaminski (2013); the remarkable evolutionary algorithm approach pursued by Myszkowski and Rachwalski (2009) unfortunately did not engage options. Second, we use a distribution-free model that does not derive from historical derivative prices, using just a single pair of current quotes for its calibration. The goals are to derive the model, apply it to two distinct trading days in 2014 (chosen so as to cover situations with both a positive and a negative implied drift) and discuss the results.
1 Statistical Analysis of the Market

Historical Returns of the Underlying Asset
Ten years of daily closing prices for the Warsaw Stock Exchange WIG 20 Index are used, on a moving basis (WIG 20 is a capitalization-weighted price index published since 1994). Table 1 summarizes the distribution statistics for returns over various holding periods (commensurate with current settlement dates) as of June 12, 2014 and October 28, 2014, respectively. The measures demonstrate major deviations from normality (Thadewald and Büning, 2007). Kernel-smoothed return distributions (in percentage returns) for 1- and 99-day holding periods are charted in Figure 1 and illustrate considerable differences in their characteristics.
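The holding-period statistics described above can be sketched as follows. This is a minimal illustration, not the computation behind Table 1: the price series is synthetic (it is not WIG 20 data), the function names are hypothetical, and every overlapping window is used, which may differ slightly from the indexing applied in the paper. The last printed column compares each standard deviation with the value obtained by scaling the 1-day figure with the square root of the holding period.

```python
import numpy as np

def holding_period_returns(prices, horizon):
    """Overlapping gross returns S[i + horizon] / S[i] over `horizon` trading days."""
    prices = np.asarray(prices, dtype=float)
    return prices[horizon:] / prices[:-horizon]

def describe(returns):
    """Mean, sample standard deviation, skewness and excess kurtosis."""
    r = np.asarray(returns, dtype=float)
    z = (r - r.mean()) / r.std(ddof=0)  # standardized observations
    return {
        "mean": r.mean(),
        "std": r.std(ddof=1),
        "skew": np.mean(z ** 3),
        "excess_kurt": np.mean(z ** 4) - 3.0,
    }

# Synthetic fat-tailed stand-in for roughly ten years of daily closes
# (an assumption for illustration only, NOT actual index data).
rng = np.random.default_rng(0)
prices = 2400.0 * np.exp(np.cumsum(0.012 * rng.standard_t(df=4, size=2500)))

base = describe(holding_period_returns(prices, 1))
for horizon in (1, 8, 99):
    stats = describe(holding_period_returns(prices, horizon))
    # Ratio of the observed std to the square-root-of-time scaled 1-day std.
    ratio = stats["std"] / (base["std"] * np.sqrt(horizon))
    print(f"{horizon:>3}d std={stats['std']:.4f} skew={stats['skew']:.2f} "
          f"kurt={stats['excess_kurt']:.2f} sqrt-scaling ratio={ratio:.2f}")
```

A ratio far from 1 for the longer horizons would indicate that square-root-of-time scaling fits the sample poorly.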
Both Table 1 and Figure 1 also clearly refute the broadly accepted heuristic of volatility scaling with the square root of time, as well as virtually any other practicable scaling function. Extant derivative prices (all closing prices as of June 12, 2014 and October 28, 2014, respectively) facilitate calculations of the implied drift and volatilities. The drift may be implied either from futures prices (labeled F) using (1), as in Chow et al. (2000), or from pairs of call and put prices, assuming both contracts were traded at strike price X (labeled P-C X), using put-call parity (2), as in Hull (2012, pp. 221-231). It is important to note that put-call parity is distribution-neutral for European options. Implied volatilities at strike price X are calculated by iteration of the Black (1976) model, assuming the drift implied by futures prices (3).
where $T$ represents the time to expiry of the contract, $S$ the spot price of the underlying asset, $F$ the futures price for that asset, and $P_X$ and $C_X$ the put and call option prices at strike price $X$, respectively. $N(x)$ is the cumulative distribution function of the standard normal distribution.
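The implied drift and implied volatility calculations can be sketched as below. The exact form of expressions (1) to (3) is not reproduced in this text, so the sketch relies on standard textbook relationships as an assumption: F = S exp(mu T) for the futures-implied drift, C - P = exp(-r T)(S exp(mu T) - X) for put-call parity, and the Black (1976) call formula inverted by bisection. The symbol mu, the risk-free rate argument `r`, all function names and the input quotes are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def drift_from_futures(spot, futures, t):
    """Annualized drift mu implied by F = S * exp(mu * t) (assumed form)."""
    return math.log(futures / spot) / t

def drift_from_parity(spot, strike, call, put, r, t):
    """Annualized drift implied by C - P = exp(-r t) * (S exp(mu t) - X)."""
    forward = strike + (call - put) * math.exp(r * t)
    return math.log(forward / spot) / t

def black76_call(forward, strike, sigma, r, t):
    """Black (1976) value of a European call on a forward/futures price."""
    d1 = (math.log(forward / strike) + 0.5 * sigma ** 2 * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return math.exp(-r * t) * (forward * norm_cdf(d1) - strike * norm_cdf(d2))

def implied_vol_call(price, forward, strike, r, t, lo=1e-6, hi=5.0, tol=1e-10):
    """Implied volatility by bisection; the call value increases with sigma."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if black76_call(forward, strike, mid, r, t) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical quotes, for illustration only (not WSE data).
spot, futures, r, t = 2400.0, 2385.0, 0.025, 8 / 252
print("futures-implied drift:", drift_from_futures(spot, futures, t))
print("implied vol:", implied_vol_call(30.0, futures, 2400.0, r, t))
```

Comparing `drift_from_futures` with `drift_from_parity` for the same expiry gives a direct check on how closely a market respects put-call parity.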
The annualized results for all relevant contracts and contract combinations are summarized in Table 2. The values suggest major pricing inconsistencies when considered within the bounds of conventional parametric valuation models. While the short-term options display a distinct volatility smile, the other series are skewed. Significant differences between futures- and option-based drift also expose gross deviations from put-call parity. This is unambiguous market inefficiency, primarily reflecting low liquidity combined with positive transaction costs and the bid-offer spread.
According to June 12 prices, a pronounced normal backwardation, expressed by the negative drift, was also implied in the short term (the implied dividend yield can be readily derived by subtracting the drift from the current risk-free rate). This was presumably due to forthcoming dividends, as well as bearish sentiment linked to the anticipated impacts of the Polish pension reform (WSE, 2014; Allen & Overy, 2014).

Deriving the Nonparametric Model
The preceding findings suggest that the observed market processes cannot be fully explained by equilibrium or arbitrage-free models, which merits testing a nonparametric model based on historical simulation. Its rationale is best described by the following intuitive reasoning: investors presume (i.e. have experience) that the distribution of returns over the time to expiry of the contract generally follows that of historical returns (1.1), apart from the currently expected drift and volatility. Their ad-hoc projections (1.2) are used to calibrate the distribution.
To achieve meaningful results, a historical underlying asset price dataset clearly needs to be available, subsuming a sufficiently diverse scope of market scenarios. In the present case this is satisfied by the inclusion of various realized shocks during the pertinent 10-year period (including e.g. the U.S. subprime mortgage crisis, collapse of Bear Stearns and Lehman Brothers, Greece bailout, as well as Euro debt downgrade), as illustrated by Figure 2. Various methods may be considered in practice to obtain reasonable estimates of drift and volatility over the holding period, including econometric models, panel research or the auctioning of appropriate financial instruments. It also suffices to partially complete the market using either an existing price of a pair of call and put options, or a single option jointly with the futures price. In the present model design, available pairs of at-the-money options will be used.
It is essential to emphasize the assumption that the particular pair of derivative quotations used for calibration would be the only relevant forward-looking information available in the market (otherwise risk-neutrality would not hold). All other realized derivative prices are thus ignored in the model construct and used in this study solely for its empirical testing. As a matter of fact, this may be considered a realistic framework in illiquid markets, where the trader starts with a benchmark to calibrate his/her model. Subsequent quotations, matched against open positions in the book then serve to recalibrate in a dynamically changing bid-offer price band.
Denoting the current price of the underlying asset $S$, the option exercise period $T$, historical prices $S_i$ for $i = \{0, \dots, n\}$, historical periodic returns $R_{i,T} = S_{i+T-1} / S_{i-1}$ for $i = \{1, \dots, n-T\}$ with $N = n - T$, and the normalized strike price ${}^{*}X = X / S$, call ($C$) and put ($P$) option values can be calculated from a histogram of historical returns $R_{i,T}$ as in (4) and (5).
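$$ {}^{*}C_{\mathrm{hist}} = E\left(e^{-k_T} \max\{R_{i,T} - {}^{*}X,\ 0\}\right) \qquad (4) $$
$$ {}^{*}P_{\mathrm{hist}} = E\left(e^{-k_T} \max\{{}^{*}X - R_{i,T},\ 0\}\right) \qquad (5) $$
where $E(\cdot)$ represents the conditionally expected value, $k_T$ the appropriate discount rate, and ${}^{*}C = C / S$ and ${}^{*}P = P / S$ the normalized values of call and put options, respectively.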
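The histogram valuation in (4) and (5) amounts to averaging discounted payoffs over the empirical sample of returns. A minimal sketch follows, assuming the expectation is taken as an equally weighted sample mean and that the discount rate is already expressed over the holding period; the function name and the sample values are hypothetical.

```python
import numpy as np

def hist_option_values(returns, strike_norm, k_period):
    """Normalized call and put values from a sample of gross returns.

    returns     : historical gross returns R_{i,T} over the holding period
    strike_norm : normalized strike X / S
    k_period    : discount rate over the holding period (assumed already
                  scaled to the period, so the discount factor is exp(-k_period))
    """
    r = np.asarray(returns, dtype=float)
    discount = np.exp(-k_period)
    call = discount * np.mean(np.maximum(r - strike_norm, 0.0))
    put = discount * np.mean(np.maximum(strike_norm - r, 0.0))
    return call, put

# Tiny illustrative sample (hypothetical returns, not market data).
call, put = hist_option_values([0.9, 1.0, 1.1, 1.2], strike_norm=1.0, k_period=0.0)
print(call, put)
```

Multiplying the normalized values by the current spot price S gives the option values in index points.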
The original $R_{i,T}$ distribution is fundamentally not risk-neutral, and $k_T$ would thus have to represent a risk-adjusted discount rate as in Stutzer (1996) or Chen and Palmon (2005). Option valuation according to (4) and (5) also accounts for neither the underlying asset's expected drift nor its volatility. Accordingly, the historical returns have to be adjusted using a transformation ${}^{*}R_{i,T} = f(R_{i,T})$. We use the function specified by (6).
where RT equals the mean of the original distribution Ri,T.
The particular form of (6) has been chosen to retain as much as possible of the information contained in the original empirical distribution of returns, including its skew and kurtosis, modifying only the mean and standard deviation. Options written on the transformed distribution of ${}^{*}R_{i,T}$ then become risk-neutral, because its expected future value equals $E_T$ (this argument is substantiated by e.g. rational expectations, as in Muth, 1960), and their present values can thus be calculated using the continuously compounded risk-free rate $r$ as in (7) and (8).
The model is calibrated by numerically solving the equation system comprising (6), (7) and (8) for the standard deviation and the expected value $E_T$ of the transformed distribution, while setting the prices ${}^{*}C$ and ${}^{*}P$ equal to a pair of realized at-the-money option prices.
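The calibration step can be illustrated with the sketch below. The functional form of (6) is not reproduced in this text, so the code assumes an affine transformation that shifts and rescales the historical returns; this changes only the mean and standard deviation and leaves skewness and kurtosis intact, but it is an assumption and not necessarily the author's function. Under that assumption the difference between call and put values fixes the mean directly, and the standard deviation is then found by bisection. All names, the synthetic sample and the quotes are hypothetical, and the risk-free rate is taken as already scaled to the holding period.

```python
import numpy as np

def transform(returns, target_mean, target_std):
    """ASSUMED affine adjustment: only mean and standard deviation change."""
    r = np.asarray(returns, dtype=float)
    return target_mean + (r - r.mean()) * (target_std / r.std())

def model_values(returns, strike_norm, target_mean, target_std, r_period):
    """Normalized call and put values on the transformed sample."""
    adj = transform(returns, target_mean, target_std)
    discount = np.exp(-r_period)  # risk-free rate over the holding period
    call = discount * np.mean(np.maximum(adj - strike_norm, 0.0))
    put = discount * np.mean(np.maximum(strike_norm - adj, 0.0))
    return call, put

def calibrate(returns, strike_norm, call_obs, put_obs, r_period, tol=1e-12):
    """Solve for (mean, std) that reproduce one observed call/put pair."""
    # Call minus put equals the discounted (mean - strike), which pins the mean.
    target_mean = strike_norm + (call_obs - put_obs) * np.exp(r_period)
    lo, hi = 0.0, float(np.std(returns))
    # Widen the bracket until the model call is at least the observed call.
    while model_values(returns, strike_norm, target_mean, hi, r_period)[0] < call_obs:
        hi *= 2.0
        if hi > 1e3:
            raise ValueError("observed call price cannot be matched")
    for _ in range(200):  # the call value is nondecreasing in target_std
        mid = 0.5 * (lo + hi)
        if model_values(returns, strike_norm, target_mean, mid, r_period)[0] < call_obs:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return target_mean, 0.5 * (lo + hi)

# Synthetic returns and hypothetical at-the-money quotes (illustration only).
rng = np.random.default_rng(1)
sample = np.exp(0.03 * rng.standard_t(df=5, size=2000))
mean_T, std_T = calibrate(sample, 1.0, call_obs=0.012, put_obs=0.011, r_period=0.001)
print(mean_T, std_T)
print(model_values(sample, 0.97, mean_T, std_T, 0.001))  # value another strike
```

Once calibrated on the at-the-money pair, the same two parameters are reused to value all other strikes for that expiry, which is how the simulated prices can be compared with realized ones.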
Tables 3 and 4 (Appendix) compile the results, comparing realized, B-S and simulated option prices for relevant strikes (calibration points highlighted bold). B-S prices are calculated using implied at-the-money volatilities and the Black (1976) formula with futures-implied drift.

Discussion
The results suggest that nonparametric simulation outperforms B-S in predicting actual prices for most out-of-the-money options, which B-S tends to undervalue. B-S also clearly tends to perform worse in the normal backwardation situation (i.e. on June 12). This is due to the fundamental assumptions of B-S, which follow from the efficient market hypothesis and are clearly not supported by the empirical data.
Nevertheless, some trades were, in fact, executed at prices much closer to B-S valuation. This is probably due to the actual usage of B-S as the preferred smoothing instrument by traders at low market liquidity; otherwise, it would be impossible to explain such a conformity concurrently with the observed breaches of put-call parity.
Another interesting finding relates to the realized market prices of 8-day put options struck at 2 300 and 2 350, respectively, on June 12. Whereas the conventionally applied implied volatility measure (Table 2) seems to indicate that the 2 300 contract is relatively dear, nonparametric simulation (Table 3) shows its price to be adequate, but that of the 2 350 contract to be low. In practice, this would signify a potential arbitrage opportunity.
A final remark should be made on the intra-day dynamics of actual trading. Traders in an imperfect market, such as that on the WSE, do not base their quotations on a complete set of market prices, but rather on the latest quotations, combined with the existing (long or short) positions in their own books. The proposed trading system thus conveniently represents an iteration process, responding to the most current information available in the market.

Conclusions
The model derived in this paper has been shown to be a meaningful heuristic and an efficient information carrier, useful for both analytical and trading purposes.
In the context of various existing alternatives, the model compares as follows: In contrast to most parametric models, it accounts for the volatility smile/smirk, autocorrelations and other empirically observed features of real-world markets. In contrast to any parametric model, it does not make particular economic assumptions, such as that of efficient or complete markets, only put-call parity. The put-call parity functionality, on the other hand, may be considered a positive feature, distinguishing it from most non-parametric models. Also, compared to more sophisticated parametric and most non-parametric models, it is simple and straightforward to interpret, implement and use.
Subject to further research and testing, the proposed model can thus be considered an attractive alternative to the existing option-pricing models in the cases where an extensive and representative historical dataset of underlying asset prices is available, with the derivative market being new, incomplete or illiquid. It may also offer interesting application opportunities for particular real-option valuation problems.