The Quant Stack

HFT Alpha Research 101

How to research an HFT alpha

Quant Arb
Jun 02, 2026
∙ Paid

Introduction


I have found many proprietary HFT and MFT alphas in my time as a quant and have monetised them across different desks and books. In today’s article, I will walk through how I personally approach finding alpha in the HFT domain: how to research short-term alphas, either for MFT execution or for skewing into as part of an HFT strategy.

We will cover backtesting, factor modelling, residualization, correlation testing, horizon analysis, forecasting, monetisation, and more.

Alpha research is one of the core drivers of profitability for any HFT desk, and whether the managers know how to develop alphas can make or break the performance of a pod.

Index


  1. Introduction

  2. Index

  3. The Research Data Setup

  4. HFT Factors

  5. Backtesting Signals

  6. Similarity Analysis

  7. Horizon Analysis

  8. Forecasting

  9. Monetisation

The Research Data Setup


The first part we need to cover is our research data. We need to produce various lookahead targets to forecast. A reasonable setup is:

1s, 3s, 5s, 15s, 1min, 5min, 15min

Sub-1s timeframes can also be forecast, although forecasts at horizons like 100ms often require low latency to act on. The shorter you go, the more you rely on breaking down the events that transpired, as opposed to detailed statistical analysis of feature performance.

Low latency events are often very deterministic in their signal and you can clearly assert why certain behaviours are happening with only a handful of samples. I’ve written about this before in one of my prior articles:

Timeframes & Research Types (Quant Arb, May 14, 2024)

When constructing our return targets, it is wise to always use right-labelled, left-closed aggregation. We should also use disjoint return intervals, with no overlap between them. For example, if our timeframes were 1min, 5min, and 15min, we would do:

t+0 to t+1

t+1 to t+5

t+5 to t+15

We would NOT do:

t+0 to t+1

t+0 to t+5

t+0 to t+15

This is because we want to be able to distinguish the effects and their timeframes. If I test a very strong alpha with roughly 1 minute of strength, it will show as working on 1min, 5min, and 15min if I do not use disjoint returns, because every overlapping target contains the first minute. With disjoint returns, I will not see it in the 5_15 interval, nor in 1_5, so I know that it only works for 1min.

Additionally, we should not use trade bars to get our returns. We need to use mid-price to calculate our return targets. This is because there will be significant bid/ask bounce in illiquid names, which makes trade bars show a much stronger reversal effect than actually exists.
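A minimal sketch of the disjoint target construction, assuming the mid-price (taken here as the average of best bid and best ask) has already been sampled on a gap-free 1-second grid. The function name `disjoint_forward_returns`, the `ret_{start}_{end}` column naming, the choice of log returns, and the synthetic random-walk data are all illustrative assumptions rather than anything from the original pipeline.

```python
import numpy as np
import pandas as pd

# Horizon edges in seconds: 1s, 3s, 5s, 15s, 1min, 5min, 15min.
HORIZONS_S = (1, 3, 5, 15, 60, 300, 900)


def disjoint_forward_returns(mid: pd.Series, horizons_s=HORIZONS_S) -> pd.DataFrame:
    """Disjoint forward log returns of the mid-price.

    Assumes `mid` has exactly one row per second with no gaps, so shifting
    by k rows looks k seconds ahead. Column `ret_a_b` at time t holds the
    log return from t+a to t+b (seconds).
    """
    log_mid = np.log(mid)
    edges = (0, *horizons_s)
    cols = {}
    for start, end in zip(edges[:-1], edges[1:]):
        # shift(-k) aligns the value k rows in the future with the current row.
        cols[f"ret_{start}_{end}"] = log_mid.shift(-end) - log_mid.shift(-start)
    # Rows too close to the end of the sample have no future data and are NaN.
    return pd.DataFrame(cols)


# Synthetic 1-second mid-price (a random walk), purely for illustration.
# With real data, mid would be (best_bid + best_ask) / 2 sampled on the grid.
n = 2000
index = pd.Timestamp("2026-01-01") + pd.to_timedelta(np.arange(n), unit="s")
rng = np.random.default_rng(0)
mid = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(0.0, 1e-4, n))), index=index)

targets = disjoint_forward_returns(mid)
print(targets.columns.tolist())
```

Because the intervals are disjoint, the columns sum to the overlapping t+0 to t+15min log return, so nothing is lost by splitting. Each column simply isolates one slice of the horizon, which is what lets you see where a signal's strength actually sits.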

HFT Factors


What is a factor model on the HFT timeframes, and why do we need it for our alpha research pipeline? The idea of risk premiums disappears when we enter the HFT world. All of our factors will be in the high single or even double digit Sharpe ranges pre-fees, and unmonetisable post-fees (at least on their own). We should not think of factors here as some sort of inherent risk premium, but simply as alphas like any other, ones we deem very important.
