Author Archives: Dean Markwick's Blog -- Julia

Trend Following with ETFs

By: Dean Markwick's Blog -- Julia

Re-posted from: https://dm13450.github.io/2022/11/18/Trend-Following-with-ETFs.html

Trend following is a rebranded name for momentum trading strategies. It looks for assets whose price has been rising and buys them, on the belief that the price will continue to rise, and likewise sells (or shorts) assets whose price has been falling. I’ll use this post to show you how to build a basic trend-following strategy in Julia with free market data.


Enjoy these types of posts? Then you should sign up for my newsletter. It’s a short monthly recap of anything and everything I’ve found interesting recently plus
any posts I’ve written. So sign up and stay informed!


Trend following is one of the core quant strategies out there: countless pieces have been written, podcasts recorded, and threads discussed all over Twitter about how to build and use a trend-following strategy effectively. This blog post is my own exploration of trend following, using ETFs to explore the ideas of momentum.

Over the past few months, we have been experiencing the first real period of sustained volatility and generally negative performance across all asset classes. Stocks no longer just go up. With inflation, the strengthening of the dollar, and the rise in global interest rates, it feels like being 100% long equities isn’t a foolproof plan and diversifying can help. This is where trend following comes in. It tries to provide positive returns whether the market is going up or down, and to give diversification in regimes like the one we appear to be in today.

As ever I’ll be using AlpacaMarkets.jl for the data and walking you through all the steps. My inspiration comes from Cantab Capital (now part of GAM), who had a beautiful blog post doing something similar: Trend is not your only friend. Since moving over to the GAM website it no longer looks as good!

using AlpacaMarkets
using DataFrames, DataFramesMeta
using Dates
using Plots, PlotThemes
using RollingFunctions, Statistics
theme(:bright)

ETFs vs Futures in Trend Following

In the professional asset management world trend following is implemented using futures. They are typically cheaper to trade and make it easier to use leverage. For the average retail investor though, trading futures is a bit more of a headache as you need to roll them as they expire. So ETFs can be the more stress-free option that represents the same underlying.

More importantly, Alpaca Markets has data on ETFs for free whereas I think I would have to pay for the futures market data.

So let’s start by getting the data and preparing it for the strategy.

We will be using three ETFs that represent different asset classes:

  • SPY for stocks
  • BND for bonds
  • GLD for gold

We expect these three ETFs to move somewhat independently of each other given their different durations, sensitivities to interest rates, and risk profiles.

The stock_bars function from AlpacaMarkets.jl returns the daily OHLC data that we will be working with. You also want to use the adjustment="all" flag so that dividends and stock splits are accounted for.

spy = stock_bars("SPY", "1Day"; startTime = now() - Year(10), limit = 10000, adjustment = "all")[1]
bnd = stock_bars("BND", "1Day"; startTime = now() - Year(10), limit = 10000, adjustment = "all")[1];
gld = stock_bars("GLD", "1Day"; startTime = now() - Year(10), limit = 10000, adjustment = "all")[1];

We do some basic cleaning: formatting the timestamp into a Date and pulling just the columns we want, the open, the close, and the next day's open.

function parse_date(t)
    Date(string(split(t, "T")[1]))
end

function clean(df, x)
    df = @transform(df, :Date = parse_date.(:t), :Ticker = x, :NextOpen = [:o[2:end]; NaN])
    @select(df, :Date, :Ticker, :c, :o, :NextOpen)
end

spy = clean(spy, "SPY")
bnd = clean(bnd, "BND")
gld = clean(gld, "GLD");

Joining it all into one long data frame gives us the model data going forward.

allPrices = vcat(spy, bnd, gld)
allPrices = sort(allPrices, :Date)
last(allPrices, 6)

6 rows × 5 columns

     Date        Ticker  c        o        NextOpen
     Date        String  Float64  Float64  Float64
 1   2022-10-31  SPY     386.21   386.44   390.14
 2   2022-10-31  BND      70.35    70.36    70.58
 3   2022-10-31  GLD     151.91   152.16   153.82
 4   2022-11-01  SPY     384.52   390.14   NaN
 5   2022-11-01  BND      70.26    70.58   NaN
 6   2022-11-01  GLD     153.46   153.82   NaN

plot(plot(spy.Date, spy.c, label = :none, title = "SPY"),
plot(bnd.Date, bnd.c, label = :none, title = "BND", color = "red"),
plot(gld.Date, gld.c, label = :none, title= "GLD", color = "green"), layout = (3,1))

ETF Closing Prices

The prices of each ETF are on different scales: BND is between 70 and 80, GLD is in the 100s, and SPY is in the 300-400 range. The scale on which they move is also different: SPY has doubled since 2016, GLD has risen around 80%, and BND hasn’t done much. Therefore we need to normalise both of these factors to something comparable across all three. To achieve this we first calculate the log returns of the close-to-close move of each ETF. We then calculate the rolling standard deviation of the return series to represent the volatility, which is used to normalise the log returns:

$$\hat{r}_t = 0.1 \, \frac{\ln p_t - \ln p_{t-1}}{\sigma_t}.$$

allPrices = @transform(groupby(allPrices, :Ticker), 
                      :Return = [NaN; diff(log.(:c))],
                      :ReturnTC = [NaN; diff(log.(:NextOpen))]);

We also calculate returns using the NextOpen series as a way to assess the transaction costs of the trend-following strategy, but more on that later.

runvar calculates the 256-day moving variance, which we take the square root of to get the running volatility. The normalisation step is then a simple multiplication and division.

allPrices = @transform(groupby(allPrices, :Ticker), :RunVol = sqrt.(runvar(:Return,  256)));
allPrices = @transform(groupby(allPrices, :Ticker), :rhat = :Return .* 0.1 ./ :RunVol);

Dropping the NaNs removes the data points from before we had enough observations for the 256-day volatility calculation, and taking the cumulative sum of the returns gives us an equity curve.

allPricesClean = @subset(allPrices, .!isnan.(:rhat ))
allPricesClean = @transform(groupby(allPricesClean, :Ticker), :rhatC = cumsum(:rhat), :rc = cumsum(:Return));

To check this transformation has worked we aggregate across each ticker.

@combine(groupby(allPricesClean, :Ticker), :AvgReturn = mean(:Return), :AvgNormReturn = mean(:rhat),
                                           :StdReturn = std(:Return), :StdNormReturn = std(:rhat))

3 rows × 5 columns

     Ticker  AvgReturn     AvgNormReturn  StdReturn   StdNormReturn
     String  Float64       Float64        Float64     Float64
 1   SPY      0.000372633   0.00343358    0.0122161   0.115383
 2   BND     -9.4164e-5    -0.00223913    0.00349196  0.113982
 3   GLD      0.000214564   0.00288888    0.0086931   0.101731

Summarising the average and standard deviation of the return and normalised return shows the standard deviation is now close to 0.1 as intended.

This is where leverage comes in: because bonds have lower volatility than stocks, we have to borrow money to increase the volatility of our bond investment. For example, let’s borrow £100 against £10 of collateral and invest the £100 in BND. If BND moves up by 1% the position is now worth £101; we sell, pay back our loan, and are left with £11, which is a 10% return on our original £10. So even though the price only moved by 1%, the use of leverage has amplified our return 10x.
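
To make that arithmetic explicit, here is a toy Julia sketch of the same example (my own illustration, not part of the strategy code):

# Toy illustration of the leverage example: £10 of our own money, £100
# borrowed and invested, and a 1% move in the underlying ETF.
capital  = 10.0                          # our collateral
borrowed = 100.0                         # the loan we invest
move     = 0.01                          # BND rises 1%

position_value = borrowed * (1 + move)   # 101.0
profit = position_value - borrowed       # 1.0 left after repaying the loan
return_on_capital = profit / capital     # 0.1, i.e. a 10% return on our £10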

Plotting the cumulative log returns shows how they are on similar scales now.

plot(allPricesClean.Date, allPricesClean.rhatC, 
     group = allPricesClean.Ticker, legend=:topleft, title = "Normalised Cumulative Returns")

ETF Log Returns

The Trend Following Signal

With our data in good shape, we can move on to the actual signal construction. Just like the Cantab article, we will be using the 100-day moving average. Using the RollingFunctions.jl package again, we just have to set the window length to 100 and it will do the hard work for us. The actual signal is whether this running average is positive or negative. If it is greater than zero we want to go long: the signal is saying buy. Likewise, if the rolling average is less than zero we want to go short: the signal is saying sell. So this simply means taking the sign of the rolling average each day.

If we don’t want to go short and just want to know when to buy or sell out, we can simplify it further by only using the signal when it is positive. We will call this the long-only signal.

Using data frame manipulation we can calculate the signal per day for each ETF.

allPricesClean = @transform(groupby(allPricesClean, :Ticker), 
                            :Signal = sign.(runmean(:rhat, 100)), 
                            :SignalLO = runmean(:rhat, 100) .> 0);

Evaluating the Trend Following Strategy

We’ve got our signal that says when to go long and short in each ETF. We now need to combine the return of each ETF per day to get the trend-following return. As we are using log returns, this is as simple as summing the returns multiplied by the signal across the ETFs each day. We have three ETFs, so we need to weight each of the returns by 1/3, otherwise when comparing to the single ETFs we would have 3x as much capital invested in the trend-following strategy.

portRes = @combine(groupby(allPricesClean, :Date), 
           :TotalReturn = sum((1/3)*(:Signal .* :rhat)), 
           :TotalReturnLO = sum((1/3)*(:SignalLO .* :rhat)),
           :TotalReturnTC = sum((1/3) * (:Signal .* :ReturnTC)),
           :TotalReturnUL = sum((1/3) * (:Signal .* :Return)));

Again, plotting the cumulative returns shows that this trend-following strategy (dark blue) has done great: it has massively outperformed just being long the SPY ETF. Even when we remove the shorting element (Trend Following - LO, red), it has done well.

portRes = @transform(portRes, :TotalReturnC = cumsum(:TotalReturn), 
                              :TotalReturnLOC = cumsum(:TotalReturnLO), 
                              :TotalReturnTCC = cumsum(:TotalReturnTC),
                              :TotalReturnULC = cumsum(:TotalReturnUL))

plot(portRes.Date, portRes.TotalReturnC, label = "Trend Following", legendposition = :topleft, linewidth=3)
plot!(portRes.Date, portRes.TotalReturnLOC, label = "Trend Following - LO", legendposition = :topleft, linewidth=3)

plot!(allPricesClean.Date, allPricesClean.rhatC, group = allPricesClean.Ticker)

Trend following results

It’s up and to the right which is a good sign. By following the trends in the market we can profit when it is both going up and down. This is without any major sophistication on predicting the direction either. Simply using the average produces enough of a signal to be profitable.

Let’s focus on just what has happened this year.

portRes2022 = @transform(@subset(portRes, :Date .>= Date("2022-01-01")), 
            :TotalReturnC = cumsum(:TotalReturn), 
            :TotalReturnLOC = cumsum(:TotalReturnLO),
            :TotalReturnULC = cumsum(:TotalReturnUL))

allPricesClean2022 = @subset(allPricesClean, :Date .>= Date("2022-01-01"))
allPricesClean2022 = @transform(groupby(allPricesClean2022, :Ticker), :rhatC = cumsum(:Return))

plot(portRes2022.Date, portRes2022.TotalReturnULC, label = "Trend Following", legendposition = :topleft, linewidth = 3)
plot!(portRes2022.Date, portRes2022.TotalReturnLOC, label = "Trend Following - LO", legendposition = :topleft, linewidth =3)
plot!(allPricesClean2022.Date, allPricesClean2022.rhatC, group = allPricesClean2022.Ticker)

2022 Trend Following Results

The long-only strategy (red line) staying flat indicates that we’ve been out of the market since July. The trend-following strategy has been helped by the fact that it has been all one-way traffic in the markets recently: everything has been heading lower and lower across all the asset classes since the summer.

Implementation Shortfall and Thinking Practically

Everything above looks like a decent trend-following approach. But how do we implement this, or simulate an implementation? By this, I mean obtaining the trades that should be made.

The mechanics of this strategy are simple:

  1. Calculate the close-to-close return of each ETF.
  2. If the 100-day moving average of these returns is positive, buy; if it is negative, sell or go short.

The price we can trade at, though, is not the close price: the market is closed and you can’t place any more orders! Instead, you will be buying at the open on the next day. So whilst our signal has triggered, our actual price is different from the theoretical price.

As a side note, it is possible to trade after hours, and in small size you might be able to get close to the closing price.

Given this is retail and the sizes are so small, we are going to assume we get the next day’s open price. We can then compare our purchase price to the model price. The difference between the two is called ‘implementation shortfall’ and measures how far from the intended price you actually traded.

If we are buying higher than the model price (or selling lower) we are missing out on some of the move and it’s going to eat into the performance of the strategy.
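
In terms of the quantities used in the code below, the shortfall for a single trade, in basis points relative to the model (close) price, is

$$\mathrm{IS}_{\mathrm{bps}} = \mathrm{side} \times 10^4 \times \frac{P_{\text{next open}} - P_{\text{close}}}{P_{\text{close}}},$$

where side is +1 for a buy and -1 for a sell, so a positive value is always a cost to us.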

To calculate this Implementation Shortfall (IS) we pull out the days where the signal changed, as this indicates a trade needs to be made.

allPricesClean = @transform(groupby(allPricesClean, :Ticker), :SigChange = [NaN; diff(:Signal)])

trades = @subset(allPricesClean[!, [:Date, :Ticker, :o, :c, :Signal, :NextOpen, :SigChange]], 
             :SigChange .!= 0);

Then by calculating the difference between the next open and the closing price we have our estimate of the IS value.

@combine(groupby(trades, :Ticker), 
        :N = length(:Signal), 
        :IS = mean(:Signal .* 1e4 .* (:NextOpen .- :c) ./ :c))

3 rows × 3 columns

     Ticker  N      IS
     String  Int64  Float64
 1   SPY     40     14.7934
 2   BND     69     -6.45196
 3   GLD     61     10.4055

For SPY and GLD we have lost roughly 15bps and 10bps to this difference between the close and the next open. For BND, though, we would actually have traded at a better price than the close; again, this is down to the one-way direction of bonds this year. Overall though, implementation shortfall is a big drag on returns. We can plot the equity curve including this transaction cost.

plot(portRes.Date, portRes.TotalReturnULC, label = "Trend Following", legendposition = :topleft)
plot!(allPricesClean.Date, allPricesClean.rc, group = allPricesClean.Ticker)
plot!(portRes.Date, portRes.TotalReturnTCC, label = "Trend Following - TC", legendposition = :topleft, linewidth = 3)

Equity Curve with Transaction Costs

What’s the old saying:

“In theory there is no difference between theory and practice – in practice there is” (Yogi Berra)

Implementation shortfall is just one transaction cost. There are actual, physical costs to consider too:

  • Brokerage costs
  • Borrow fees
  • Capacity

Let’s start with the simple transaction costs. Each time you buy or sell you are probably paying some sort of small fee. At Interactive Brokers it is $0.005 per share with a minimum of $1 per order. So if you trade $100 of one of these ETFs you will be paying 1% in fees. You could get rid of this fee completely with a free broker like Robinhood, which is one way to keep the costs down.
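
As a rough sketch of that fee arithmetic (my own toy example, assuming the fixed-pricing schedule quoted above):

# Toy per-order fee: an assumed $0.005 per share with a $1 minimum per order.
order_fee(shares; per_share = 0.005, min_fee = 1.0) = max(shares * per_share, min_fee)

notional = 100.0          # a $100 order of a ~$100-per-share ETF is about 1 share
fee = order_fee(1)        # the $1 minimum applies
fee / notional * 100      # ≈ 1% of the trade lost to fees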

This model requires you to go short the different ETFs, so you will also need to pay borrow fees to whoever you borrow the ETF from. This is free for up to $100k at Interactive Brokers, so not a concern for the low-capacity retail trader. You can’t short sell on Robinhood directly; instead you’ll have to use options or inverse ETFs, which just complicates matters and overall isn’t going to bring costs down.

All these fees mean there is both a minimum and a maximum amount of capital with which you can implement this model. If your budget is too small your returns will just be eaten up by transaction costs. If you have a large amount of money, the implementation shortfall hurts. This is what we mean by capacity.

Next Steps in Trend Following

The above implementation is the most basic trend-following strategy possible. It only uses three assets when there is a whole world of other ETFs out there. It has a simple signal (a running average) and uses that to allocate either 100% long or 100% short. There are several ways in which you can go further and make this better.

  • Expand the asset universe.

Including more ETFs and ones that are uncorrelated to SPY, BND and GLD gives you a broader opportunity to go long or short. I would look at including some sort of commodity, real estate, and international equity component as the next ETFs.

  • A better trend signal.

The simple moving average is great as there are no moving parts, but nothing stops it from being extended into a more sophisticated model. A common approach is to add a second, faster moving average (say 5 days) and check whether this fast average is above or below the slow one; see the sketch after this list.

  • Better asset allocation using the signal

The current signal is either -1 or +1 with no in-between. This can lead to some volatile behaviour when the underlying average hovers around 0, flipping the position between long and short over consecutive days. Instead, you could map the signal strength onto a target position with some sort of function. This could mean waiting for the signal to get strong enough before trading and then linearly adding to the position until you max out the allocation.
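
As a starting point for the last two ideas, here is a hedged sketch (not part of the original strategy) of a fast/slow crossover signal that is then clamped into a fractional position. The 5-day fast window and the 0.02 scaling constant are arbitrary choices for illustration.

using RollingFunctions

# Fast/slow moving-average crossover mapped to a position in [-1, 1].
# rhat is the volatility-normalised return series from earlier.
function crossover_position(rhat; fast = 5, slow = 100, scale = 0.02)
    fastMA = runmean(rhat, fast)
    slowMA = runmean(rhat, slow)
    raw = fastMA .- slowMA          # positive when the fast average is above the slow one
    clamp.(raw ./ scale, -1, 1)     # scale into a fractional position instead of just ±1
end

# Applied per ticker, e.g.
# allPricesClean = @transform(groupby(allPricesClean, :Ticker),
#                             :Position = crossover_position(:rhat));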

Conclusion

Trend following is a profitable strategy, at least on paper. We’ve shown that a simple rolling average of the returns can produce a profitable signal. When we scrutinise it a bit further, though, we find that it is difficult to achieve these types of returns in practice. The overnight price movements in the ETFs mean that we trade at a worse price than the model assumes. Combined with the general transaction costs of making the trades, this makes it a hard strategy to implement, and, to rely on another quote, there is no such thing as a free lunch: you have to put in a remarkable amount of effort to scale this kind of strategy up. We then highlighted several ways you could take this research further. So you now have everything at your fingertips to explore trend following yourself. Let me know anything I’ve missed in the comments below.

Optimising FPL with Julia and JuMP

By: Dean Markwick's Blog -- Julia

Re-posted from: https://dm13450.github.io/2022/08/05/FPL-Optimising.html

One of my talks for JuliaCon 2022 explored the use of JuMP to optimise a Fantasy Premier League (FPL) team. You can watch my presentation here: Optimising Fantasy Football with JuMP; this blog post is an accompaniment and extension to that talk. I’ve used FPL Review’s free expected points model and their tools to generate the team images, so go check them out.


Enjoy these types of posts? Then you should sign up for my newsletter. It’s a short monthly recap of anything and everything I’ve found interesting recently plus
any posts I’ve written. So sign up and stay informed!


Now, last season was my first time playing this game. I started with an analytical approach but didn’t write the team-optimising routines until later in the season, by which time it was too late to make much of a difference. I finished at 353k, so not too bad for a first attempt, but quite a way off that 100k “good player” milestone. I won’t be starting a YouTube channel for FPL anytime soon.

Still, a new season awaits, and with more knowledge, a hand-crafted optimiser, and some expected points, let’s see if I can do any better.

A Quick Overview of FPL

FPL is a fantasy football game where you need to choose a team of 15 players that consists of:

  • 2 goalkeepers
  • 5 defenders
  • 5 midfielders
  • 3 forwards

Then from these 15 players, you choose a team of 11 each week that must conform to:

  • 1 goalkeeper
  • Between 3 and 5 defenders
  • Between 2 and 5 midfielders
  • Between 1 and 3 forwards

You have a budget of £100 million and you can have at most 3 players from a given team. So no more than 3 Liverpool players etc.

You then score points based on how many goals a player scores, how many assists they make, and various other ways. Each week you can transfer one player out of your squad of 15 for a new player.

That’s the long and short of it: you want to score the most points each week while being forward-looking enough to ensure you are set up for future game weeks.

A Quick Overview of JuMP

JuMP is an optimisation library for Julia. You write out your problem in the JuMP modelling language, supply an optimiser, and let it work its magic. For a detailed explanation of how you can solve the FPL problem in JuMP I recommend you watch my JuliaCon talk linked above.

But in short, we want to maximise the number of expected points subject to the above constraints while sticking to the overall budget. The code is easy to interpret and there is just the odd bit of massaging to make it do what we want.
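
To give a flavour of what that looks like, here is my own minimal sketch of the core squad-selection problem in JuMP. This is not the squad_selector function used below (which also handles the starting 11 and the bench weighting); the HiGHS solver, the helper name pick_squad, and the budget of 1000 (i.e. £100m in the tenths-of-a-million units used later in the post) are my assumptions.

using JuMP, HiGHS

# expPoints, cost, posInt (1=GK, 2=DEF, 3=MID, 4=FWD) and teamInt are
# vectors with one entry per player, as built later in the post.
function pick_squad(expPoints, cost, posInt, teamInt; budget = 1000)
    n = length(expPoints)
    model = Model(HiGHS.Optimizer)
    @variable(model, x[1:n], Bin)                 # 1 if we pick the player
    @objective(model, Max, sum(expPoints .* x))   # maximise expected points
    @constraint(model, sum(cost .* x) <= budget)  # stay within budget
    @constraint(model, sum(x[i] for i in 1:n if posInt[i] == 1) == 2)  # 2 goalkeepers
    @constraint(model, sum(x[i] for i in 1:n if posInt[i] == 2) == 5)  # 5 defenders
    @constraint(model, sum(x[i] for i in 1:n if posInt[i] == 3) == 5)  # 5 midfielders
    @constraint(model, sum(x[i] for i in 1:n if posInt[i] == 4) == 3)  # 3 forwards
    # no more than 3 players from any one club
    @constraint(model, [t in unique(teamInt)], sum(x[i] for i in 1:n if teamInt[i] == t) <= 3)
    optimize!(model)
    findall(value.(x) .> 0.5)                     # indices of the selected 15
end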

All my optimising functions are in the file below, which will be hosted on GitHub shortly so you can keep up to date with my tweaks.

include("team_optim_functions.jl")

FPL Review Expected Points

To start, we need some indication of each player’s ability. This comes in the form of an expected points model, which takes into account the player’s position, form, and overall ability to score FPL points. Rather than build my own expected points model I’m going to use FPL Review’s numbers. They are a very popular site for this type of data, and the amount of time I would have to invest to come up with a better model would not be worth the effort. Plus, I feel that the amount of variance in FPL points makes it a tough job anyway; it’s better to crowdsource the effort and use someone else’s results.

That being said, once you’ve set your team, there might be some edge in interpreting the statistics. But that’s a problem for another day.

FPL Review is nice enough to make their free model available as a downloadable CSV, so you can head there, download the file, and pull it into Julia.

using CSV, DataFrames, DataFramesMeta, CategoricalArrays
df = CSV.read("fplreview_1658563959", DataFrame)

To verify the numbers they have produced, we can look at the total number of points each team is expected to score over the 5 game weeks they provide.

sort(@combine(groupby(df, :Team), 
       :TotalPoints_1 = sum(cols(Symbol("1_Pts"))),
       :TotalPoints_2 = sum(cols(Symbol("2_Pts"))),
       :TotalPoints_3 = sum(cols(Symbol("3_Pts"))),
       :TotalPoints_4 = sum(cols(Symbol("4_Pts"))),
       :TotalPoints_5 = sum(cols(Symbol("5_Pts"))) 
        ), :TotalPoints_5, rev=true)

20 rows × 3 columns

     Team            TotalPoints_1_2  TotalPointsAll
     String15        Float64          Float64
  1  Man City        117.63           296.58
  2  Liverpool       115.52           284.36
  3  Chelsea          94.37           243.1
  4  Arsenal          90.38           241.0
  5  Spurs            90.85           237.81
  6  Man Utd          92.77           215.76
  7  Brighton         79.35           205.93
  8  Wolves           87.02           203.23
  9  Aston Villa      88.65           202.32
 10  Brentford        74.75           199.51
 11  Leicester        77.99           197.34
 12  West Ham         76.19           197.33
 13  Leeds            80.61           194.82
 14  Newcastle        89.81           190.98
 15  Everton          68.46           189.73
 16  Crystal Palace   64.95           180.66
 17  Southampton      72.23           176.97
 18  Fulham           63.02           172.57
 19  Bournemouth      62.31           161.9
 20  Nott'm Forest    69.65           161.05

This looks pretty sensible: Man City and Liverpool are up at the top and the newly promoted teams are at the bottom. So it looks like FPL Review know what they are doing.

With that done, let’s move on to optimising. I have to take the dataframe and prepare the inputs for my optimising functions.

expPoints1 = df[!, "1_Pts"]
expPoints2 = df[!, "2_Pts"]
expPoints3 = df[!, "3_Pts"]
expPoints4 = df[!, "4_Pts"]
expPoints5 = df[!, "5_Pts"]

cost = df.BV*10
position = df.Pos
team = df.Team

#currentSquad = rawData.Squad

posInt = recode(position, "M" => 3, "G" => 1, "F" => 4, "D" => 2)
df[!, "PosInt"] = posInt
df[!, "TotalExpPoints"] = expPoints1 + expPoints2 + expPoints3 + expPoints4 + expPoints5
teamDict = Dict(zip(sort(unique(team)), 1:20))
teamInt = get.([teamDict], team, NaN);

I have to multiply the buy values (BV) by 10 to get the values in the same units as my optimising code.

The Set and Forget Team

In this scenario, we add up the expected points for all five game weeks and run the optimiser to select the highest-scoring team over those 5 weeks. No transfers, and we set the bench weighting to 0.5.

# Best set and forget
modelF, resF = squad_selector(expPoints1 + expPoints2 + expPoints3 + expPoints4 + expPoints5, 
    cost, posInt, teamInt, 0.5, false)

Set and forget

It’s a pretty strong-looking team. Big at the back with all the premium defenders, which is a slight danger, as one conceded goal by either Liverpool or Man City could spell disaster for your rank. Plus, no Salah is a bold move.

To add some human input, we can look at the other £5 million defenders to assess who to swap Walker with.

first(sort(@subset(df[!, [:Name, :Team, :Pos, :BV, :TotalExpPoints]], :BV .<= 5.0, :Pos .== "D", :Team .!= "Arsenal"), 
     :TotalExpPoints, rev=true), 5)

     Name      Team         Pos      BV       TotalExpPoints
     String31  String15     String1  Float64  Float64
 1   Walker    Man City     D        5.0      16.53
 2   Digne     Aston Villa  D        5.0      15.25
 3   Doherty   Spurs        D        5.0      15.18
 4   Romero    Spurs        D        5.0      15.03
 5   Dunk      Brighton     D        4.5      14.74

So Doherty or Digne seems like a decent shout. This just goes to show that you can’t blindly follow the optimiser, and that you can add some alpha by tweaking as you see fit.

Update After Two Game Weeks

What if we now allow transfers? We will optimise for the first two game weeks and then see how many transfers are needed afterwards to maximise the number of points.

model, res1 = squad_selector(expPoints1 + expPoints2, cost, posInt, teamInt, 0.5)
currentSquad = zeros(nrow(df))
currentSquad[res1[1]["Squad"] .> 0.5] .= 1

res = Array{Dict{String, Any}}(undef, length(0:5))

expPoints = zeros(length(0:5))

for (i, t) in enumerate(0:5)
    model, res3 = transfer_test(expPoints3 + expPoints4 + expPoints5, cost, posInt, teamInt, 0.5, currentSquad, t, true)
    res[i] = res3[1]
    expPoints[i] = res3[1]["ExpPoints"]
end

Checking the expected points of the teams and adjusting for any transfers after the first two free ones gives us:

expPoints .- [0,0,0,1,2,3]*4
6-element Vector{Float64}:
 162.385
 164.295
 167.987
 165.767
 164.084
 161.726

So making two transfers improves our expected score by about 5 points, which seems worth it. Going beyond two transfers incurs the 4-point hit per extra transfer, and the adjusted expected points start to fall, so it doesn’t seem worth it.

Update after 2 GWs

So Botman and Watkins are switched out for Gabriel and Toney. Again, not a bad-looking team, and making these transfers improves the expected points by 5.

Shortcomings

The FPL community can be split into two camps: those that think data helps and those that think watching the games and the players helps. So what are the major issues with these teams?

Firstly, Spurs players are a glaring omission from all of the results. Given their strong finish to last season and the high expectations coming into this season, that is potentially a problem.

Things can change very quickly. After the first week we will have some information on how different players are looking, and by that time these teams could be very wrong, with little flexibility to adjust to the new information. I am reminded of last year, when Luke Shaw was a hot pick in lots of initial teams, and look how that turned out.

Then there is how off-meta these teams are. It’s hard to judge what the template team is going to be at this early stage of pre-season, but if you aren’t accounting for who other people will own, you can find yourself left behind, all for the sake of being contrarian. For example, this team puts lots of money into goalkeepers when you could potentially spend that elsewhere.

Some of the players in these teams might not get many minutes. Especially at the cheaper price points, I could be selecting fringe players rather than reliable starters from the lower teams. Again, similar to the last point, there are ‘enablers’ that the wider community believes to be the most reliable picks at the lower price points.

And finally, variance. FPL is a game of variance. Haaland is projected to score 7 points in his first match, which is the equivalent of playing the full 90 minutes plus a goal or assist. He could quite easily score only 1 point after coming on for the last 10 minutes, and then you are panicking about future game weeks. Relying on these optimised teams can sometimes mean you forget about the variance and how easy it is for a player to come nowhere near the number of points they are predicted.

Conclusion and What Next

Overall, using the optimiser helps reduce the manual process of working out whether there is a better player at each price point. Instead, you can use it to inspire some teams and build on them from there, adjusting accordingly. There are still some tweaks I can build into the optimiser: making sure it doesn’t overload the defence with players from the same team, and seeing if I can simulate week by week what the optimal transfer, if there is one, should be.

I also want to try and make this a bit more interactive so I’m less reliant on the notebooks and have something more production-ready that other people can play with.

Also, given we get a free wildcard over Christmas, I can do a mid-season review and essentially start again! So check back here in a few months’ time.

Modelling Microstructure Noise Using Hawkes Processes

By: Dean Markwick's Blog -- Julia

Re-posted from: https://dm13450.github.io/2022/05/11/modelling-microstructure-noise-using-hawkes-processes.html

Microstructure noise is where the price we observe isn’t the ‘true’ price of the underlying asset. The observed price doesn’t diffuse as we assume in typical derivative pricing models; instead, we see some quirks in the underlying data. For example, there is an explosion of realised variance as we use finer and finer time subsampling periods.

Last month I wrote about
calculating realised volatility
and now I’ll be taking it a step further. I’ll show you how this
microstructure noise manifests itself in futures trade data and how I
use a Hawkes process to come up with a price formation model that fits
the actual data.

The original work was all done in
Modeling microstructure noise with mutually exciting point processes. I’ll
be explaining the maths behind it, showing you how to fit the models
in Julia and hopefully educating you on this high-frequency finance
topic.


Enjoy these types of posts? Then you should sign up for my newsletter. It’s a short monthly recap of anything and everything I’ve found interesting recently plus
any posts I’ve written. So sign up and stay informed!

A Bit of Background

My Ph.D. was all about using Hawkes processes to model when things happen and how these things are self-exciting. You may have read my work on Hawkes processes before, either through my Julia package HawkesProcesses.jl, or examples of me using Hawkes processes to model terror attack data, or just how to calculate the deviance information criterion of a Hawkes process. The real hardcore might have read my Ph.D. thesis.

But how can we use a Hawkes process to model the price of a financial asset? There is a vast amount of research on how Hawkes processes can be used in price formation processes for financial instruments and other high-frequency problems, and this post will act as a primer for anyone interested in using Hawkes processes for such problems.

The High-Frequency Problem

At short timescales, a price moves randomly rather than with any trend. Amazon stock might trade thousands of times in a minute, but that’s supply and demand changing rather than people thinking Amazon’s future is going to change from one minute to the next. So we need a different way of thinking about how prices move at short timescales compared to longer timescales.

We can build nice mathematical models guessing how a price might move;
it might move like a random walk or maybe a random walk with jumps in
the price now and then. But, no matter what model we use, it must
match up with what is observed in the real world. One of
these observations is a phenomenon called ‘microstructure noise’ and we
only see it in high-frequency data.

Microstructure noise is a catch-all term for different things happening in the market. This includes the bid-ask bounce: people buy and sell at two different prices, so it looks like the price is moving when in reality it is just oscillating around the mid-price. The discreteness of prices at these time scales also plays a part. There is a minimum increment by which prices can change, and this has real effects on how prices move. Exchanges need to pay attention to their tick sizes, as setting them incorrectly can help or hinder liquidity. All of this has a real effect that we can observe when we calculate a realised volatility.

Realised volatility measures how much the price moved over a period. These high-frequency effects give the impression of more movement than the ‘true’ volatility. So we end up seeing our measurement of volatility explode as the time scale we use gets smaller and smaller: calculating the volatility using 1-second intervals gives a larger value than using 1-minute intervals. This means our volatility estimate depends on the time scale used, so what is the ‘real’ volatility?

We aren’t interested in working out what the real volatility
is. Instead, we want to build a model for a price that displays this
volatility scaling effect.

The Hawkes Process Model for a Price

How do we use Hawkes processes to build a model that will have this microstructure noise?

Let’s call the price at time t, X(t), and guess that it moves by summing up the positive jumps N1(t) and negative jumps N2(t) that have happened up to time t:

$$X(t) = N_1(t) - N_2(t)$$

When do these jumps occur? How are these jumps distributed? This is
where we use the Hawkes process.

A Hawkes process is a self-exciting point process. When something
happens it increases the probability of another event happening. This
is the self-exciting behaviour we want. Every time there is a positive
jump, there is an increase in the probability of a negative jump
happening and, likewise, when there is a negative jump there is a greater
probability of a positive jump.

Hawkes demonstration
Each jump increases the probability of the other type of jump happening, as in this picture.

When someone buys and pushes the price higher by removing liquidity, it’s more likely that someone will now sell at the new higher price, introducing some downward pressure. This is mean reversion: prices move higher, then outside forces push them lower, and vice versa.

With Hawkes processes there are three parameters:

  • μ – the background rate, i.e. how often jumps occur at random. This is common to both positive and negative jumps.
  • κ – the ‘force’ that increases the probability of the other type of jump happening.
  • The kernel g(t), which dictates how long that force lasts. Here it is an exponential decay with parameter β.

Hawkes parameters demonstration
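
To make the parameterisation concrete, here is a small sketch of my own (following the same convention as the theoretical results later) of the conditional intensity implied by these three parameters, where the exciting events are the jumps of the opposite stream. The numbers plugged in are purely illustrative.

# Conditional intensity of one jump stream: background rate mu plus the decaying
# influence of past jumps of the *other* stream, with kernel g(t) = beta * exp(-beta * t).
intensity(t, events; mu, kappa, beta) =
    mu + kappa * sum((beta * exp(-beta * (t - ti)) for ti in events if ti < t); init = 0.0)

# e.g. the downward-jump intensity 2 seconds after upticks at t = 0.0 and t = 1.5
intensity(2.0, [0.0, 1.5]; mu = 0.2, kappa = 0.7, beta = 0.2)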

We will fit a Hawkes process to a price series to infer the 3 parameters that describe how the jumps behave. This model will hopefully replicate the ‘microstructure noise’ effects we see in practice.

Futures Trade Data

In the early days of my Ph.D., I answered an email advertising for early-stage grad students to do some prop trading. As part of the interview, they gave me some data and asked me to write a simple moving-average cross-over strategy. I failed miserably, as I never heard back from them. But I did get some nice data, which now that I’m older and wiser I recognise as reported trades from a futures exchange. This is the data we will be using today to calibrate the model, and luckily it’s similar to the data used in the original paper.

using CSV
using DataFrames, DataFramesMeta
using Plots
using Dates
using Statistics

All the usual packages when working with data in Julia.

rawData = CSV.read("fgbl__BNH14_clean.csv", DataFrame, header=false)
rename!(rawData, [:UnixTime, :Price, :Volume, :DateTime]);
first(rawData, 5)

5 rows × 4 columns

     UnixTime       Price    Volume  DateTime
     Int64          Float64  Int64   String
 1   1378908794086  136.9    1       09/11/201315:13:14.086
 2   1378974046854  137.25   5       09/12/201309:20:46.854
 3   1378990110771  137.55   1       09/12/201313:48:30.771
 4   1378998136894  137.7    1       09/12/201316:02:16.894
 5   1378999992561  137.55   1       09/12/201316:33:12.561

To clean the data we convert the Unix timestamp to an actual DateTime object and pull out the hour and the date of the trade.

cleanData = @transform(rawData, DateTimeClean = DateTime.(:DateTime, dateformat"mm/dd/yyyyHH:MM:SS.sss"), 
                                DateTimeUnix = unix2datetime.(:UnixTime ./ 1000) )
cleanData = @transform(cleanData, Date = Date.(:DateTimeUnix),
                                  Hour = hour.(:DateTimeUnix));
first(cleanData[:,[:UnixTime, :DateTimeClean, :DateTimeUnix]], 5)

5 rows × 3 columns

     UnixTime       DateTimeClean            DateTimeUnix
     Int64          DateTime                 DateTime
 1   1378908794086  2013-09-11T15:13:14.086  2013-09-11T14:13:14.086
 2   1378974046854  2013-09-12T09:20:46.854  2013-09-12T08:20:46.854
 3   1378990110771  2013-09-12T13:48:30.771  2013-09-12T12:48:30.771
 4   1378998136894  2013-09-12T16:02:16.894  2013-09-12T15:02:16.894
 5   1378999992561  2013-09-12T16:33:12.561  2013-09-12T15:33:12.561

To get an idea of the data we are looking at, I aggregate the total number of trades and the total volume over each day and plot these as time series.

dayData = groupby(cleanData, :Date)
dailyVolumes = @combine(dayData, TotalVolume = sum(:Volume),
                                  TotalTrades = length(:Volume),
                                  FirstTradeTime = minimum(:DateTimeUnix))

xticks = minimum(dailyVolumes.Date):Month(2):maximum(dailyVolumes.Date)
xticks_labels = Dates.format.(xticks, "yyyy-mm")

p1 = plot(dailyVolumes.Date, dailyVolumes.TotalVolume, seriestype=:scatter, label="Daily Volume", legend = :topleft, xticks = (xticks, xticks_labels))
p2 = plot(dailyVolumes.Date, dailyVolumes.TotalTrades, seriestype=:scatter, label= "Daily Number of Trades", legend = :topleft, xticks = (xticks, xticks_labels))
plot(p1, p2, fmt=:png)

Daily futures volume

It takes a while for trading in this futures contract to take off. This is the period where it slowly becomes the front-month contract and is then the most active.

Also, because trading doesn’t cross over the daylight saving dates, we don’t have to worry about timezones. Always a bonus!

What about if we look at what hour is the most active?

hourDataG = groupby(cleanData, [:Date, :Hour])
hourDataS = @combine(hourDataG, TotalHourVolume = sum(:Volume),
                                  TotalHourTrades = length(:Volume))
hourDataS = leftjoin(hourDataS, dailyVolumes, on=:Date)

hourDataS = @transform(hourDataS, FracVolume = :TotalHourVolume ./ :TotalVolume)

hourDataG = groupby(hourDataS, :Hour)
hourDataS = @combine(hourDataG, MeanFracVolume = mean(:FracVolume),
                                 MedianFracVolume = median(:FracVolume))
sort!(hourDataS, :Hour)
bar(hourDataS.Hour, hourDataS.MedianFracVolume * 100, title = "Fraction of Daily Volume", label=:none, fmt=:png)

Hourly volume fraction of a futures contract.

Early in the morning (just after the exchange opens) and late afternoon (when the Americans start trading) are when there is the most activity.

For our analysis we are going to focus on the hours 14, 15 and 16, to make sure that we have the most active period. This is the same as the original paper, which also took a subset of the day.

How to Calculate the Volatility Signature

Let X(t) be the price of the future at time t. The signature is the quadratic variation over a window of [0,T], which is more commonly known as the realised volatility:

$$C(\tau) = \frac{1}{T} \sum_{n=0}^{T/\tau} \left| X\left((n+1)\tau\right) - X(n\tau) \right|^2.$$

τ is our sampling frequency, say every minute, etc.

To calculate the volatility across the trades we have to pay particular attention to the fact that these trades are irregularly spaced, so we need to fill forward the price for every t value.

function get_price(t::Number, prices, times)
    ind = min(searchsortedfirst(times, t), length(times))
    sp = ind == 0 ? 0 : prices[ind]
end

function get_price(t::Array{<:Number}, prices, times)
    res = Array{Float64}(undef, length(t))
    for i in eachindex(t)
        res[i] = get_price(t[i], prices, times)
    end
    res
end

Our get_price function here will return the last price before time t.

To calculate the signature value we choose a τ value and generate the sampling times between 0 and T with a step size of τ. We then pull the price at those times and calculate the quadratic variation. Again, we add a method to calculate the signature for an array of different τ’s.

function signature(tau::Number, x, t, maxT)
    inds = collect(0:tau:maxT)
    prices = get_price(inds, x, t)
    
    rets = prices[2:end] .- prices[1:(end-1)]
    (1/maxT) * sum(abs.(rets) .^ 2)
end

function signature(tau::Array{<:Number}, x, t, maxT)
   res = Array{Float64}(undef, length(tau))
    for i in eachindex(res)
        res[i] = signature(tau[i], x, t, maxT)
    end
    res
end

Now let’s apply this to the data. We are only interested in when the future was actively trading and in the hours 14 to 16. We convert the times into seconds since 13:59:59 and calculate the signature for all the dates, before taking the final average.

We take the log price of the last trade to represent our actual X(t). We also only look at 2014 dates, as that is when the future is active.

cleanData2014 = @subset(cleanData, :Date .>= Date("2014-01-01"))

uniqueDates = unique(cleanData2014.Date)

eventList = Array{Array{Float64}}(undef, length(uniqueDates))
priceList = Array{Array{Float64}}(undef, length(uniqueDates))

signatureList = Array{Array{Float64}}(undef, length(uniqueDates))
avgSignature = zeros(length(1:1:200))

for (i, dt) in enumerate(uniqueDates)
   
    subData = @subset(cleanData2014, :Date .== dt, :Hour .<= 16, :Hour .>= 14)
    eventList[i] = getfield.(subData.DateTimeClean .- (DateTime(dt) + Hour(14) - Second(1)), :value) ./ 1e3
    priceList[i] = subData.Price
    
    signatureList[i] = signature(collect(1:1:200), log.(priceList[i]), eventList[i] .+ rand(length(eventList[i])), 3*60*60 + 1)
    avgSignature .+= signatureList[i]
end

avgSignature = avgSignature ./ length(eventList);

To plot the signature we take the average across all the dates and then normalise by the τ=60 value.

plot(avgSignature / avgSignature[60], seriestype=:scatter, 
    label = "Average Signature", xlabel = "Tau", ylabel = "Realised Volatility (normalised)", fmt=:png)

Realised Volatility Signature

This is an interesting plot with big consequences in high-frequency finance.

This explosion in realised volatility at small timescales (as τ → 0) comes from microstructure noise. If prices evolved as Brownian motion, the above plot would be flat across all timescales, so this result contradicts lots of classical finance assumptions.

Practically, this is a pain if we are trying to measure the current volatility: it depends on the timescale we are looking at, and there isn’t one true volatility using the normal methods. Instead, we need to be aware of these microstructure effects as we use a finer τ over which to calculate the volatility.

This is where the Hawkes model comes in. If we assume the price X(t) moves as in the jump equation above, can we produce a similar signature plot?

The Theoretical Signature Under a Hawkes Process

After doing some maths (you can read the paper for the full details), we arrive at the following equation for the theoretical signature.

If both N1 and N2 are realisations of Hawkes processes with parameters μ, κ and kernel g(t) = β exp(-βt), then the theoretical signature can be written as

$$C(\tau) = \Lambda \left( k^2 + (1 - k^2) \, \frac{1 - e^{-\gamma \tau}}{\gamma \tau} \right),$$

where

$$\Lambda = \frac{2\mu}{1-\kappa}, \qquad k = \frac{1}{1+\kappa}, \qquad \gamma = \beta(\kappa + 1).$$

These are from the paper and adjusted based on my parameterisation of the Hawkes process. This gives us our theo_signature function.

function theo_signature(tau, bg, kappa, kern)
    Lambda = 2*bg/(1-kappa)
    k = 1/(1 + kappa)
    gamma = kern*(kappa + 1)
    @. Lambda * (k^2 + (1-k^2) * (1 - exp(-gamma * tau)) / (gamma*tau))
end

Calibrating the Hawkes Process Model

We now move on to fitting the Hawkes process to the data. I’ll be using
a new method that takes a different approach to my package HawkesProcesses.jl.

We have the theoretical volatility signature from a Hawkes process (theo_signature) and the plot above of what the actual signature looks like. Therefore, it is just a case of minimising a loss function to find the best-fitting parameters. I’ll use the root mean square error as my loss function and the Optim.jl package to perform the minimisation.

signatureRMSE(x, sig) = sqrt(
    mean(
        (sig .- theo_signature(1:200, x[1], x[2], x[3])).^2
        )
)

using Optim
optRes = optimize(x->signatureRMSE(x, avgSignature/avgSignature[10]), rand(3))

paramEst = Optim.minimizer(optRes)
3-element Vector{Float64}:
 0.24402236592012655
 0.7417867072115396
 0.19569169443936185

These are the three parameters of the Hawkes process which appear
sensible.

plot(avgSignature, label="Observed", seriestype=:scatter)
plot!(avgSignature[10]*theo_signature(1:200, paramEst[1], paramEst[2], paramEst[3]), label="Theoretical", lw=3, xlabel = "Tau", ylabel = "Realised Variance", fmt=:png)

Theoretical vs Observed Signature

So a nice match-up between the theoretical signature and what we
observed. This gives some weight to the Hawkes model as a
representation of the price process.
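
As an extra sanity check (my own addition, not from the original post), we can simulate the two mutually exciting jump streams with the fitted parameters via Ogata-style thinning and build a synthetic price path X(t) = N1(t) - N2(t):

using Random

# Simulate two mutually exciting streams of up/down jumps: shared background
# rate mu, branching ratio kappa and kernel g(t) = beta * exp(-beta * t),
# where each stream is excited by the *other* stream's past jumps.
function simulate_jumps(mu, kappa, beta, maxT)
    up, down = Float64[], Float64[]
    # intensity of a stream at time t given the other stream's past events
    lambda(t, other) = mu + kappa * beta * sum((exp(-beta * (t - ti)) for ti in other); init = 0.0)
    t = 0.0
    while t < maxT
        lambda_bar = lambda(t, up) + lambda(t, down)  # upper bound (intensities only decay)
        t += randexp() / lambda_bar                   # candidate event time
        t > maxT && break
        l_up, l_down = lambda(t, down), lambda(t, up) # cross-excitation
        u = rand() * lambda_bar
        if u < l_up
            push!(up, t)                              # accept a positive jump
        elseif u < l_up + l_down
            push!(down, t)                            # accept a negative jump
        end                                           # otherwise reject and continue
    end
    up, down
end

up, down = simulate_jumps(paramEst[1], paramEst[2], paramEst[3], 3*60*60)
jumps = sort(vcat(tuple.(up, 1), tuple.(down, -1)))
X = cumsum(last.(jumps));   # the synthetic price path

Feeding this synthetic path back through the signature function from earlier is a quick way to eyeball whether the fitted model reproduces the shape of the observed curve.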

Interpreting the Hawkes Parameters

Our κ value of 0.75 shows there is a large amount of excitement with each price jump. A β value of 0.2 means this excitement decays with a characteristic time of about 5 seconds (1/β). So if we see an uptick in the price, we expect an elevated chance of a downtick that fades over roughly five seconds. The opposite is also true: a downtick makes an uptick more likely over the next few seconds.
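
As a quick back-of-the-envelope check (my own addition), the timescales implied by the fitted β:

1 / paramEst[3]       # characteristic decay time of the kernel, ≈ 5.1 seconds
log(2) / paramEst[3]  # half-life of the excitement, ≈ 3.5 seconds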

Conclusion

Microstructure noise shows up when we start calculating volatility at high-frequency timescales. We have shown that it is a real effect using some futures data and then built a Hawkes model to try and reproduce it. We managed to get the right shape of the volatility signature and found that there was quite a bit of mean reversion between the up- and down-ticks, lasting around 5 seconds.

What’s next? In a future post, I will introduce another dimension and show that there can also be correlation across assets under a similar method, reproducing another high-frequency phenomenon. This will be based on the same paper and will show how we can start looking at the correlation between two assets and how that changes at high-frequency time scales.