Finding Fair Value [CODE INSIDE]

Regressions, Splines, and Marks

Mar 10, 2024

∙ Paid

Introduction

Many people try to begin market making and struggle to understand what is meant by fair value or the process of establishing it. Most people think that this comes out of some model that is entirely new, but that couldn’t be further from the truth.

We’ll cover how to establish where the market views the fair value to be in spot, futures, and options markets and then move on to how to convert this into our own estimate of fair value.

Succinctly there are 4 main problems when it comes to market making:

What is my fair value?
What is my spread?
What is my skew?
What is my size?

Your fair value is what you quote around. The spread is the average distance between your bid/ask quotes and your fair value. The skew is the difference in distance from a fair value between the bid and ask, respectively. Of course, size is how much you quote.

Size is sometimes negligible; perhaps if you are quoting just one symbol or a small universe, then you weigh it based on some established share of your account, but when there are a lot of symbols, such as with shitcoins or options, you run into the issue of how to allocate your capital between different books.

For now, we will simply tackle the first problem and focus on showing how this problem is approached and solved. You still need to do a lot of research independently, but this should give you the tooling.

Index

Introduction
Index
Fitting Fair Value
SVI Surface Parameterization
Regressions
Wing Model
Conclusion

Fitting Fair Value

When looking at any orderbook, we need to isolate one price as fair value. In most cases, this will be the mid-price or simply the middle of the orderbook, but in cases where we feel there is a heavy imbalance of liquidity, we can weigh by bid/ask sizes.

The question then arises as to how many levels should we include in our weighting. This is best put in the context of why we weigh this at all - the reason being to get an accurate idea of where the price will be in the future. If we feel that imbalance typically will push the price in one direction (which it does), then we should select the number of levels where there is maximal predictability and relevance.

This makes things very complicated, though, so it’s a very late-stage optimization. In 99% of cases, the mid-price is fine. I wouldn’t even bother with micro-price (refer to the Stoikov paper on this to understand micro-price).

Once we have done this, we can use it as our “dumb” fair value. We will improve upon it by shifting it, but we start from the market price as our basis vector. For futures and spot, we simply take the midprice or weighted-midprice and call it a day, but for options, we need to fit a smooth curve for our volatility smile and even a 3D surface to get our volatility surface, which we quote around.

SVI Surface Parameterizations

Splines are employed to find this smooth surface. Most large trading firms will have complicated models with tens of parameters, but for now, we will start with the most advanced model that exists in the public literature.

We start with SVI. This is a spline model. Effectively, we want to fit a smooth surface which ensures consistency across our quotes and implied volatilities so they all make sense pricing wise when put together. We achieve that via the SVI model.

\(w_{imp}^{svi} = a + b\cdot\left({\rho}\cdot\left(x-m\right) + \sqrt{\left(x-m\right)^2+{\sigma}^2}\right) \)

Parameters

x (float): Moneyness, typically defined as the log of the strike price divided by the spot price.
a (float): Parameter a, represents the base level of the total implied variance across strikes.
b (float): Parameter b, modulates the volatility skewness/smile.
sigma (float): Parameter sigma, controls the convexity of the volatility smile.
rho (float): Parameter rho, determines the skewness or the slope of the volatility smile. It is the correlation coefficient between the asset return and its volatility.
m (float): Parameter m, signifies the mode or the peak of the smile curve.

We characterize the raw parameterization of the total implied variance (assuming some static maturity) in the above formula, with all values other than x estimated from the market (we know moneyness).

def calculate_total_implied_variance(moneyness: float,
                                     base_level: float,
                                     skew_modulation: float,
                                     smile_convexity: float,
                                     smile_slope: float,
                                     smile_peak: float) -> float:
    
    return base_level + skew_modulation * (smile_slope*(moneyness - smile_peak) + ((moneyness - smile_peak)**2 + smile_convexity**2)**0.5)

# Example: Calculate the total implied variance for given parameters
x = 0.05  # Example moneyness
a = 0.04  # Base level of total implied variance
b = 0.1   # Volatility skewness/smile modulation
sigma = 0.2  # Convexity control of the volatility smile
rho = -0.5  # Skewness/slope of the volatility smile
m = 0.01  # Peak of the smile curve

w_imp_svi = calculate_total_implied_variance(x, a, b, sigma, rho, m)
print(f"Total Implied Variance: {w_imp_svi}")

NOTE: For our market data, we need to calculate time to expiry as a fraction of a year, calculate log-moneyness, and remove OTM options.

In order to estimate the best parameters, we need to solve using ordinary least squares, minimizing the below function:

\(\min_{a,b,\sigma,\rho,m}\sum_{i=1}^{n}(w_{raw}(x,a,b,\sigma,\rho,m) - w_{market})^2\)

from typing import List, Tuple
import pandas as pd
import numpy as np

def calculate_residuals(params: Tuple[float, float, float, float, float],
                        time_to_expiry: float, 
                        market_data: pd.DataFrame) -> np.ndarray:
    """
    Calculates the residuals between the market implied volatilities and the volatilities
    predicted by the Raw SVI model for a given time to expiry.

    This function is used within a least-squares optimization process to find the SVI
    parameters that best fit the market data for each time to maturity slice.

    Parameters:
    - params (Tuple[float, float, float, float, float]): A tuple containing the SVI parameters
      (a, b, sigma, rho, m) to be optimized.
    - time_to_expiry (float): The specific time to expiry (in years) for which the residuals
      are being calculated.
    - market_data (pd.DataFrame): A DataFrame containing market data, which must include columns
      for 'time_to_expiry_yrs' (time to expiry in years), 'log_moneyness' (the log of moneyness),
      and 'mid_iv' (the market's median implied volatility).

    Returns:
    - np.ndarray: An array of residuals between the market and model implied volatilities
      for the specified time to expiry.
    """
    # Filter market data for the specific time to expiry
    specific_expiry_data = market_data[market_data['time_to_expiry_yrs'] == time_to_expiry]
    
    # Calculate the total implied variance using the Raw SVI model for filtered data
    w_svi = np.array([raw_svi(x, *params) for x in specific_expiry_data['log_moneyness']])
    
    # Extract the actual market implied volatilities
    iv_actual = specific_expiry_data['mid_iv'].values
    
    # Calculate residuals between market implied volatilities and model predictions
    residuals = iv_actual - np.sqrt(w_svi / time_to_expiry)
    
    return residuals

The Quant Stack