Posts

Showing posts from May, 2025

AI Prompt Engineering - Use Code not Words

AI language models don’t actually reason in a human sense. For those interested in how these systems are trained, I recommend checking out Demystifying LLMs with Andrej Karpathy.

The Token Challenge

When processing text, language models work with “tokens” rather than complete words. The relationship between words and tokens isn’t always one-to-one. For instance, the term “LLM” gets split into two separate tokens in the paragraph below. Similarly, longer or unusual strings can be divided into numerous tokens: the word “SuperCaliFragilisticExpialiDociouc” is broken down by GPT-4o into 11 distinct tokens.

It’s important to understand that AI responses are generated probabilistically, one token at a time, with deliberate randomness incorporated. This explains why asking the same question multiple times often yields different answers. These fundamental characteristics create significant constraints when AI attempts text analysis tasks. For example, until recently, many langu...
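Both points are easy to check directly. The minimal sketch below assumes OpenAI’s open-source tiktoken library is installed (pip install tiktoken) and uses the o200k_base encoding that GPT-4o models use; exact splits and counts depend on the tokenizer version. The sampling step at the end uses made-up logits purely to illustrate how temperature-scaled sampling yields varying output.

```python
import math
import random

import tiktoken

# 1) Words vs. tokens: o200k_base is the encoding used by GPT-4o.
enc = tiktoken.get_encoding("o200k_base")
for text in ["LLM", "SuperCaliFragilisticExpialiDociouc"]:
    ids = enc.encode(text)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{text!r} -> {len(ids)} tokens: {pieces}")

# 2) Probabilistic generation: the next token is sampled from a
#    distribution, so repeated runs can differ. The logits here are
#    invented for three hypothetical candidate tokens.
logits = {"cat": 2.0, "dog": 1.5, "fish": 0.5}
temperature = 0.8  # lower -> more deterministic, higher -> more random
weights = {t: math.exp(l / temperature) for t, l in logits.items()}
total = sum(weights.values())
probs = {t: w / total for t, w in weights.items()}
print(random.choices(list(probs), weights=list(probs.values()), k=5))
```

Running the sampling step several times will generally produce different token sequences, which is exactly why identical prompts can return different answers.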

Demystifying LLMs with Andrej Karpathy

The emergence of Large Language Models (LLMs) represents a pivotal advancement in artificial intelligence, transforming multiple industries. Andrej Karpathy’s presentation, “Deep Dive into LLMs like ChatGPT”, offers an accessible yet comprehensive exploration of these models. As former Director of AI at Tesla and a founding member of OpenAI, Karpathy breaks down complex concepts for audiences regardless of technical background.

While most generative AI training focuses on prompt engineering to generate specific content, this only scratches the surface of how LLMs truly function.

Core LLM Development Process

LLMs are developed through several critical stages:

Data Acquisition and Preparation: Models are trained on massive datasets collected from internet sources. This extensive collection enables the LLM to learn statistical patterns in human language.

Data Cleaning: Internet-sourced data contains significant noise, including duplicates, spam, and low-qual...
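The cleaning stage lends itself to a concrete illustration. Below is a minimal, hypothetical sketch of exact-duplicate removal via content hashing plus a crude length filter; the function name and thresholds are illustrative assumptions, and real pretraining pipelines use far more sophisticated quality heuristics.

```python
import hashlib

def clean_corpus(documents, min_words=20):
    """Hypothetical cleaning pass: drop exact duplicates and very short docs."""
    seen_hashes = set()
    kept = []
    for doc in documents:
        # Hash the normalized text to detect exact duplicates.
        digest = hashlib.sha256(doc.strip().encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue  # exact duplicate, drop it
        if len(doc.split()) < min_words:
            continue  # too short to be useful training text
        seen_hashes.add(digest)
        kept.append(doc)
    return kept
```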