Blog

Understanding knowledge organisation terminology
2 February 2025
In data science and machine learning, the structures we use to represent grounded knowledge are increasingly important, but the corresponding names for those structures are often confused and conflated. In this post I'll disambiguate a few of the most common terms and provide examples of how and where they can be applied and misapplied.
Reasons for uncertainty in machine learning models
1 April 2024
Why should we care about uncertainty in machine learning models? And is it even possible to calculate it?
The best way to encode dates, times, and other cyclical features
14 May 2023
A little bit of trigonometry is all you need to get the most from your datetime features
Privileged vs non-privileged bases in machine learning
5 May 2023
What's do we mean when we talk about privileged and non-privileged bases in neural networks? How is a basis chosen or trained, and what can we say about their features?
A custom keybinding to merge jupyter notebook cells in vscode
8 March 2023
Merging cells with ⇧m
Understanding positional embeddings in transformer models
25 January 2023
Positional embeddings are key to the success of transformer models like BERT and GPT, but the way they work is often left unexplored. In this deep-dive, I want to break down the problem they're intended to solve and establish an intuitive feel for how they achieve it.
Super fast approximate nearest neighbour search with locality sensitive hashing and elasticsearch
5 December 2022
This novel approach to approximate nearest neighbour search achieves super-fast query times by using elasticsearch to index and query the outputs of an LSH model. The cheap, scalable architecture for both hashing and querying make it an ideal approach for huge datasets in production environments.
Iterating over large datasets with "yield" in python
2 December 2022
Generators come to the rescue when your dataset is too big for your machine
Creating a downloadable .pdf copy of a page using next.js and puppeteer
27 January 2022
My CV is written and styled to be viewed on the web, but some clever javascript creates a downloadable PDF version automatically, every time the site is built!
You should commit your .env files to version control (carefully)
20 August 2021
Share your .env files without their secrets, using dotenv-stripout
A CLI alias for quick and easy passwords
22 June 2021
Using python and zsh to create secure random strings whenever I need them
Better terminal tab titles
28 May 2021
Making my terminal tabs more readable with git and zsh
Automating tests with Vercel and GitHub actions
9 May 2021
Making sure that what I build works for everyone, all the time
Using Emoji as Favicons
5 December 2020
How I set a unique emoji as the favicon for each page on this website
Anti-racism
5 June 2020
You can, and should make anti-racism part of your day-to-day work
def from above
15 May 2020
How do you do relative imports in python??