GuppyLM

A 9M-param LLM you can train in 5 min and run in any browser

GuppyLM is a 9-million-parameter transformer language model designed for education, built to demystify the complete LLM development pipeline from scratch. The full stack covers dataset generation, tokenizer training, model training, export to ONNX, 4-bit quantization, and in-browser inference via WebAssembly. The final model weighs roughly 10 MB and runs entirely client-side with no server required.

Training takes approximately 5 minutes on a single Google Colab GPU, the kind of experiment any developer can run on a free tier. The project includes a working browser demo and step-by-step documentation walking through every stage of the pipeline. The creator's goal is to make the full LLM lifecycle tangible for learners who have heard about transformers but never actually trained one.

The project hit the top of Hacker News Show HN submissions with nearly 900 points, an exceptional response that reflects widespread hunger for genuinely accessible ML education. In an era of 400B-parameter models and multi-million-dollar training runs, a model that fits in a browser tab and trains in a coffee break is a meaningful pedagogical counterpoint.
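The "tokenizer training" stage of the pipeline can be made concrete with a minimal byte-pair-encoding sketch: repeatedly merge the most frequent adjacent symbol pair in the corpus. This is a generic illustration under that assumption, not GuppyLM's actual tokenizer code, and `train_bpe` and the toy corpus are invented for the example:

```python
from collections import Counter

def train_bpe(corpus: str, num_merges: int):
    """Learn BPE merge rules from a toy corpus by greedily merging the
    most frequent adjacent symbol pair (teaching sketch, not GuppyLM's code)."""
    # Start from words split into characters, weighted by word frequency.
    words = Counter(tuple(w) for w in corpus.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs across all words.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # ties resolve to first-seen pair
        merges.append(best)
        # Apply the chosen merge to every word in the vocabulary.
        new_words = Counter()
        for word, freq in words.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(word[i] + word[i + 1])
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_words[tuple(out)] += freq
        words = new_words
    return merges

merges = train_bpe("low low low lower lowest", 3)
```

On this toy corpus the first merges build up the shared stem: `('l', 'o')`, then `('lo', 'w')`. Real tokenizer trainers add byte-level fallback, special tokens, and a vocabulary size target, but the core loop is this small.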

Panel Reviews

The Builder

Developer Perspective

Ship

This is exactly what ML education has been missing — a full pipeline you can actually run, not just read about. The WASM + ONNX browser deployment is particularly sharp: students get immediate feedback running their trained model in a tab without any server setup. Perfect for workshops, university courses, or self-directed engineers getting past the 'just use the API' ceiling.

The Skeptic

Reality Check

Skip

At nine million parameters, the output reads like a broken Markov chain: this is a teaching toy, not something you'd use for any real task. There's a risk learners walk away thinking they understand LLMs when they've actually trained a system orders of magnitude simpler than production models. The educational framing needs stronger caveats about the scaling gap.

The Futurist

Big Picture

Ship

Democratizing the LLM pipeline matters for the long game. The next generation of AI researchers and engineers needs hands-on experience with the full stack — tokenization, training dynamics, quantization, deployment. GuppyLM makes that accessible to anyone with a browser. That's a compounding investment in the talent pool.

The Creator

Content & Design

Ship

For content creators and educators teaching technical literacy, this is a remarkable tool. The browser demo is immediately shareable and requires zero setup from students. Being able to show a live, working language model trained from scratch in an afternoon session — that's transformative for classroom engagement.

Community Sentiment

Overall: 1,442 mentions (83% positive, 13% neutral, 4% negative)

Hacker News: 892 mentions

Full pipeline in one repo — tokenizer through WASM deploy

Reddit: 200 mentions

Using it for university ML courses

Twitter/X: 350 mentions

Training in 5 minutes on free Colab