
Showing posts from May, 2023

Program your own Chess Engine with Monte Carlo Tree Search

Deep Blue beats Kasparov. The ChatGPT moment of my childhood happened in 1997, when IBM's Deep Blue supercomputer defeated the reigning world chess champion, Garry Kasparov, winning a six-game match 3.5 to 2.5. It was perhaps the first time a significant demonstration of AI made headlines worldwide. It was a real eye-opener for me at school and, apart from the Terminator films, an inspiration for a career forged in the universities of Australia and Belgium a decade or more later. Since then, AI-based chess engines have grown ever more powerful and accessible: Chess.com lets you play chess on your phone against cloud-hosted engines that are superior to Deep Blue. To a schoolboy in the 90s, this stuff was close to magic. I had almost no idea how one would go about building such a thing, and no one to ask or to help me with such an endeavour. The aim of this post is to help someone like my 16-year-old self, who is interested in building a chess engine but doesn't know where t...

Train LLMs on your Laptop Using Low Rank Approximation

Recent advancements in AI have made it possible to run Large Language Models (LLMs) on smaller machines with limited resources. Only a few months ago, these models were the preserve of large companies, requiring hundreds of GPUs, hundreds of millions of dollars, and months to train. This new era of AI accessibility is now enabling individuals to run and train LLMs on their laptops, Pixel phones, and even Raspberry Pis. In this post, we'll discuss the developments that have led to this change and explore the techniques employed to make LLMs more accessible to everyone, in particular something called Low Rank Adaptation (LoRA). The Turning Point: Llama.cpp and the Leaked LLaMA Weights. On March 10th, Bulgarian developer Georgi Gerganov released Llama.cpp, a C/C++ implementation of LLaMA inference that uses quantization to let users run the model on an M1 MacBook. Llama.cpp is based on Meta's LLaMA model, a GPT-3-class model intended for academic use. Although Meta made the code open-source, the model weights ...
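To give a feel for why Low Rank Adaptation makes training so much cheaper, here is a minimal back-of-the-envelope sketch. The sizes (d = 1024, rank r = 8) are illustrative assumptions, not taken from any particular model: instead of learning a full d x d weight update, LoRA learns two small factors A (d x r) and B (r x d) whose product approximates the update.

```python
# Illustrative LoRA parameter count (hypothetical sizes, not from a real model).
# A full update to a d x d weight matrix needs d*d trainable parameters;
# a rank-r factorisation delta_W = A @ B (A: d x r, B: r x d) needs only 2*d*r.
d, r = 1024, 8

full_update = d * d          # parameters to train the full matrix
lora_update = d * r + r * d  # parameters to train the two low-rank factors

print(full_update)                 # 1048576
print(lora_update)                 # 16384
print(full_update // lora_update)  # 64 -> 64x fewer trainable parameters
```

Because r is tiny compared to d, the savings grow with model size, which is a large part of why fine-tuning on a laptop becomes feasible.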