ShipInspector
A poker hand history analyzer that surfaces leaks and edges
ShipInspector ingests poker hand history files and surfaces statistical patterns: where you're bleeding chips, where you have edges, and what your opponents' tendencies look like. Built for serious players who want data-driven improvement, not vibes.
ShipInspector started from a simple observation: poker players are surrounded by data and still routinely study the game in the least data-native way possible. Every session produces hand histories dense with action, sizing, position, and showdown information, but most players either never look at them or look at them through tools that feel like accounting software. The interesting problem was not just computing a few familiar stats. It was turning messy, drifting hand-history formats into an event stream stable enough that a player could see where they were actually leaking money.
The Problem
Poker analysis sounds straightforward until you try to ingest real hand histories at scale. In theory, a site exports text files, you parse them, you compute VPIP, PFR, and 3bet rates, and you call it a day. In practice, the raw material is adversarial. PokerStars and GGPoker both emit hand histories that look regular from ten feet away and turn slippery the moment you rely on them. Formats drift across years. Stakes are encoded differently. Tournament and cash-game structures diverge. Regional or client-specific variants introduce small inconsistencies that are trivial for a human reader and dangerous for a parser.
That is why ShipInspector exists. The goal was to build something a serious player could drop files into and get back something more useful than a wall of numbers: positional tendencies, aggression patterns, leak signals, and diagnostic output that actually points to study priorities. But to get there, the product first had to solve the much less glamorous problem of reading the game correctly.
What Was Built
The core flow is straightforward for the user and deliberately more opinionated under the hood. A player uploads hand histories exported from PokerStars or GGPoker. ShipInspector ingests the files, parses each hand into a normalized representation, computes derived actions and outcomes, and then runs a statistical analysis layer across the resulting corpus. What the user sees is not raw text processing. What they see is an organized picture of how they are playing by position, where their frequencies diverge from sound baselines, and which patterns are likely costing them the most money.
The ingestion layer had to be boring in the best sense: resilient, repeatable, and suspicious of the input. Hand histories are not application-level JSON contracts. They are text artifacts written for human legibility first and machine stability second. That means the parser cannot act like the source is trustworthy. It has to assume the opposite. Each file moves through a parse-and-normalize pipeline that extracts seat assignments, stack sizes, blind structure, street actions, bet sizes, board cards, showdown results, and final payouts when relevant. Once that representation is stable, analysis becomes possible. Before that, every statistic is built on sand.
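To make the idea of a normalized representation concrete, here is a minimal sketch of what one parsed hand could look like. The class names, field names, and the `validate` checks are assumptions made for illustration, not ShipInspector's actual schema. The point is only that every hand ends up in one shape, and that the shape can be checked before any statistic touches it.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class Action:
    street: str                      # "preflop", "flop", "turn", or "river"
    player: str
    kind: str                        # "post", "fold", "check", "call", "bet", "raise"
    amount: Decimal = Decimal("0")   # Decimal avoids float drift on money


@dataclass
class Hand:
    # Hypothetical normalized record; one instance per parsed hand.
    hand_id: str
    site: str                        # e.g. "pokerstars" or "ggpoker"
    small_blind: Decimal
    big_blind: Decimal
    seats: dict[int, str]            # seat number -> player name
    stacks: dict[str, Decimal]       # player name -> starting stack
    button_seat: int
    actions: list[Action] = field(default_factory=list)
    board: list[str] = field(default_factory=list)
    payouts: dict[str, Decimal] = field(default_factory=dict)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the hand looks sane."""
        problems = []
        if self.button_seat not in self.seats:
            problems.append("button seat is not occupied")
        if self.small_blind <= 0 or self.big_blind <= 0:
            problems.append("blinds must be positive")
        known = set(self.seats.values())
        for action in self.actions:
            if action.player not in known:
                problems.append(f"action by unseated player: {action.player}")
        if len(self.board) not in (0, 3, 4, 5):
            problems.append(f"impossible board length: {len(self.board)}")
        return problems
```

A hand that fails validation is better quarantined than counted, since a silently wrong hand contaminates every frequency computed from it.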
The analysis layer focuses on the stats players actually use to reason about their game, but it frames them in a way that supports action rather than trivia. VPIP, PFR, and 3bet form the basic triad because they describe whether someone is entering pots too loosely, raising enough, or failing to apply pressure preflop. Position-specific breakdowns matter just as much, because a global stat line can hide a leak that is obvious once you isolate cutoff, button, small blind, or big blind behavior. A player might appear reasonably aggressive in aggregate while bleeding heavily from the blinds or entering too many dominated pots from early position. ShipInspector treats those distinctions as the point, not an optional drill-down.
The Parsing Challenge
The hardest engineering problem in ShipInspector is parsing, not statistics. Statistics are only difficult when the input is wrong, incomplete, or inconsistently modeled. The actual arithmetic behind win rates, preflop frequencies, and showdown trends is not mysterious. The hard part is deciding what happened in a hand when the source text has changed shape three times over the life of a poker client and a dozen more times in edge cases.
PokerStars is a good example of why this matters. Over the years, its hand histories have shifted in formatting details that feel cosmetic until code depends on them. Timestamps move. Header phrasing changes. Tournament metadata is expressed differently. Currency, buy-in, and blind information can appear in slightly different structures depending on game type and era. Even when the semantic meaning is the same, the parser cannot assume identical token order or line layout across files. If you build a brittle parser against one clean sample, it will look competent right up until the first serious corpus hits it.
GGPoker introduces a different flavor of pain. Some of its format variants are platform-specific, some appear tied to region or client version, and some simply encode the same action in a way that does not align cleanly with what another site would emit. This is where naive regex-heavy parsing breaks down. A parser that is optimized only for the happy path tends to pass tests written from the happy path and then quietly misclassify hands in production. In a poker tool, that is lethal. If the system misreads action order or folds aggression into the wrong branch of a hand, every downstream diagnostic becomes fiction.
ShipInspector handles that problem by treating parsing as defensive interpretation rather than one-shot extraction. Variant handling is explicit. Pattern matching is layered. Fallback chains exist for the places where the source format is known to drift. The parser tries to identify the hand type, platform, and relevant structural cues before it commits to a deeper interpretation of the action lines. That makes the code less elegant than a single perfect grammar, but much more honest about the reality of the domain. In domains with long-lived text exports, robustness comes from accepting messiness rather than wishing it away.
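A layered fallback chain of this kind might look roughly like the sketch below. The header shapes are simplified illustrations rather than an authoritative description of either site's export format, and the variant labels, `HEADER_CHAIN`, and `parse_header` are hypothetical names. Strict patterns run first, a loose pattern salvages what it can, and anything unrecognized raises instead of being guessed at.

```python
import re
from typing import Callable, Optional

# Illustrative header shapes only; real exports carry more fields and variants.
_HEADER_NEW = re.compile(
    r"^PokerStars Hand #(?P<id>\d+):\s+(?P<game>.+?) \((?P<stakes>[^)]+)\)")
_HEADER_OLD = re.compile(
    r"^PokerStars Game #(?P<id>\d+):\s+(?P<game>.+?) \((?P<stakes>[^)]+)\)")
_HEADER_LOOSE = re.compile(r"#(?P<id>\d+)")


class UnrecognizedHeader(ValueError):
    """Raised when no parser in the chain understands the line."""


def _strict(pattern: re.Pattern, variant: str) -> Callable[[str], Optional[dict]]:
    def parse(line: str) -> Optional[dict]:
        match = pattern.match(line)
        if match is None:
            return None
        return {"variant": variant, **match.groupdict()}
    return parse


def _loose(line: str) -> Optional[dict]:
    # Last resort: recover only the hand id and flag the result as degraded.
    match = _HEADER_LOOSE.search(line)
    if match is None:
        return None
    return {"variant": "loose", "id": match.group("id"), "game": None, "stakes": None}


# Order matters: most specific first, most permissive last.
HEADER_CHAIN = [_strict(_HEADER_NEW, "hand-v2"), _strict(_HEADER_OLD, "game-v1"), _loose]


def parse_header(line: str) -> dict:
    for parser in HEADER_CHAIN:
        result = parser(line)
        if result is not None:
            return result
    raise UnrecognizedHeader(line)
```

Recording which variant matched is a deliberate choice in this sketch: it lets later stages treat a "loose" parse with suspicion, and it makes drift visible as a rising share of fallback matches.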
Testing follows the same philosophy. The only meaningful parser test suite is one built from real hand histories across platforms, years, and weird cases. Synthetic examples help with isolated behavior, but they do not surface the sort of drift that breaks production tools. A strong parser earns trust by surviving a corpus, not by looking clever in abstraction. That lesson shaped the whole project: if the parser is wrong one percent of the time in a random spot, the user will eventually study the wrong leak with full confidence.
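A corpus run in that spirit can be sketched in a few lines. This assumes hands inside an export are separated by blank lines, and `split_hands` and `run_corpus` are hypothetical helper names. The `parse` argument stands in for whatever parser is under test.

```python
from collections import Counter
from typing import Callable, Iterable


def split_hands(text: str) -> list[str]:
    """Split an export into hands, assuming blank lines separate them."""
    blocks = [block.strip() for block in text.replace("\r\n", "\n").split("\n\n")]
    return [block for block in blocks if block]


def run_corpus(
    files: Iterable[tuple[str, str]],
    parse: Callable[[str], object],
) -> tuple[Counter, list[tuple[str, int, str]]]:
    """Parse every hand in every (file name, file text) pair.

    Returns running totals plus one (file, hand index, error type) entry
    per failure, so a regression can be traced to the exact hand.
    """
    totals: Counter = Counter()
    failures = []
    for name, text in files:
        for index, raw in enumerate(split_hands(text)):
            totals["hands"] += 1
            try:
                parse(raw)
            except Exception as exc:  # a corpus run should survive any single hand
                totals["failed"] += 1
                failures.append((name, index, type(exc).__name__))
    return totals, failures
```

Tracking the failure rate per corpus run, rather than asserting on a handful of hand-picked samples, is what turns "the parser works" into a number that can regress visibly.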
The Analysis Engine
Once a hand history is normalized, ShipInspector can do the part users think of as the product. It aggregates play by seat and position, computes frequencies over meaningful sample sizes, and turns those frequencies into diagnostic signals. VPIP tells you how often a player voluntarily puts money in the pot preflop. PFR tells you how often they raise preflop. 3bet measures how often they re-raise a preflop raise rather than call or fold. Those three numbers are basic, but they remain foundational because they expose the shape of a player’s strategy before you even get into postflop detail.
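The three frequencies can be computed from normalized preflop actions with very little code. The sketch below assumes each hand arrives as a list of seated players plus an ordered list of `(player, kind)` preflop actions. That input shape, `PreflopStats`, and `preflop_stats` are illustrative assumptions. It counts 3bets against opportunities (hands where the player acted facing exactly one raise), which is one common convention.

```python
from dataclasses import dataclass


def _pct(numerator: int, denominator: int) -> float:
    return 100.0 * numerator / denominator if denominator else 0.0


@dataclass
class PreflopStats:
    hands: int = 0
    vpip_hands: int = 0
    pfr_hands: int = 0
    threebet_opps: int = 0
    threebets: int = 0

    def vpip(self) -> float:
        return _pct(self.vpip_hands, self.hands)

    def pfr(self) -> float:
        return _pct(self.pfr_hands, self.hands)

    def threebet(self) -> float:
        return _pct(self.threebets, self.threebet_opps)


def preflop_stats(hero: str, hands: list[tuple[list[str], list[tuple[str, str]]]]) -> PreflopStats:
    stats = PreflopStats()
    for players, actions in hands:
        if hero not in players:
            continue
        stats.hands += 1
        raises_so_far = 0
        vpip = pfr = faced_raise = threebet = False
        for player, kind in actions:
            if kind == "post":
                continue  # forced blinds are not voluntary money
            if player == hero:
                if raises_so_far == 1:
                    # Hero acts facing exactly one raise: a 3bet opportunity.
                    faced_raise = True
                    threebet = threebet or kind == "raise"
                if kind in ("call", "raise"):
                    vpip = True
                if kind == "raise":
                    pfr = True
            if kind == "raise":
                raises_so_far += 1
        # Each flag counts at most once per hand.
        stats.vpip_hands += vpip
        stats.pfr_hands += pfr
        stats.threebet_opps += faced_raise
        stats.threebets += threebet
    return stats
```

Note that every count here depends on action order being right, which is exactly where a misparsed hand does its damage.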
Where the tool becomes useful is in context. A player with a superficially reasonable VPIP may still be opening too wide under the gun. A player with a normal overall 3bet rate may be missing profitable re-raise spots from the small blind. A player can look balanced in the aggregate and still be massively over-folding the big blind versus late-position pressure. ShipInspector is built to surface those distortions rather than bury them. Position is not metadata; it is the lens that makes the statistics interpretable.
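Position has to be derived before it can be used as a lens. The following sketch assigns a label to each occupied seat from the button location and then groups a stat by that label. It assumes seat numbers increase clockwise, and the label set (UTG, HJ, CO and so on) is one naming convention among several. `assign_positions` and `vpip_by_position` are hypothetical names.

```python
from collections import defaultdict


def assign_positions(seats: list[int], button_seat: int) -> dict[int, str]:
    """Map each occupied seat number to a position label."""
    ordered = sorted(seats)
    if len(ordered) < 2 or button_seat not in ordered:
        raise ValueError("need at least two seats and an occupied button")
    b = ordered.index(button_seat)
    # Clockwise order starting left of the button, so the button comes last.
    clockwise = ordered[b + 1:] + ordered[:b + 1]
    if len(clockwise) == 2:
        # Heads-up: the button also posts the small blind.
        return {clockwise[1]: "BTN/SB", clockwise[0]: "BB"}
    labels = {clockwise[0]: "SB", clockwise[1]: "BB", clockwise[-1]: "BTN"}
    middle = clockwise[2:-1]  # seats between the blinds and the button
    late = ["HJ", "CO"][-len(middle):] if middle else []
    early_count = len(middle) - len(late)
    for i, seat in enumerate(middle):
        if i < early_count:
            labels[seat] = "UTG" if i == 0 else f"UTG+{i}"
        else:
            labels[seat] = late[i - early_count]
    return labels


def vpip_by_position(records: list[tuple[str, bool]]) -> dict[str, float]:
    """records: one (position label, did hero VPIP) pair per hand."""
    counts = defaultdict(lambda: [0, 0])  # position -> [hands, vpip hands]
    for position, vpip in records:
        counts[position][0] += 1
        counts[position][1] += int(vpip)
    return {pos: round(100.0 * v / h, 1) for pos, (h, v) in counts.items()}
```

The heads-up branch is the kind of edge case worth calling out: the blinds swap relative to the button, and a parser that ignores this will mislabel every heads-up hand.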
Leak detection sits on top of that statistical layer. The goal is not to tell users what every number means in theory. The goal is to identify patterns that likely warrant attention. If someone is folding to 3bets from the big blind at an unsustainably high frequency, that is not an interesting stat artifact; it is an actionable weakness. If their button open frequency is too tight, they are probably leaving money on the table. If their aggression collapses after the flop, they may be entering pots without a coherent plan for later streets. These are the kinds of patterns players want from software: not just “here are your numbers,” but “here is where your strategy is predictably breaking down.”
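One way to express such patterns is as a small table of rules evaluated against the per-position stats. The sketch below is an assumption about structure, not the tool's real rule set, and the threshold numbers are placeholders chosen for the example rather than tuned baselines. Each rule also carries a minimum sample size so that a noisy frequency over a few hands is never reported as a leak.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    stat: str
    position: str
    low: Optional[float]    # flag values below this percentage
    high: Optional[float]   # flag values above this percentage
    min_sample: int         # ignore the stat until this many observations exist
    message: str


# Placeholder thresholds for illustration only.
RULES = [
    Rule("fold_to_3bet", "BB", None, 70.0, 50,
         "Folding to 3bets from the big blind too often"),
    Rule("open_raise", "BTN", 40.0, None, 100,
         "Button open frequency is too tight"),
]


def detect_leaks(
    observed: dict[tuple[str, str], tuple[float, int]],
    rules: list[Rule] = RULES,
) -> list[str]:
    """observed maps (position, stat) to (percentage, sample size).

    Returns messages for violated rules, largest deviation first.
    """
    findings = []
    for rule in rules:
        entry = observed.get((rule.position, rule.stat))
        if entry is None:
            continue
        value, sample = entry
        if sample < rule.min_sample:
            continue  # not enough hands to trust the frequency
        if rule.high is not None and value > rule.high:
            gap = value - rule.high
        elif rule.low is not None and value < rule.low:
            gap = rule.low - value
        else:
            continue
        findings.append((gap, rule.message))
    findings.sort(reverse=True)
    return [message for _, message in findings]
```

Sorting by the size of the deviation is a crude stand-in for "which leak costs the most"; a fuller version would presumably weight each finding by how often the spot occurs and how much money moves in it.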
Diagnostic Output
The output is designed to feel like analysis, not bookkeeping. A user should be able to upload hands and quickly understand what kind of player the dataset describes. Are they too passive? Too loose from early position? Under-defending the big blind? Missing value by not 3betting enough? The product is strongest when it turns a statistical profile into study direction.
That means diagnostics are framed as decisions and tendencies. Instead of making the user reverse-engineer meaning from a dashboard of isolated percentages, ShipInspector emphasizes interpretable findings: you are entering too many pots from the small blind, you are not putting enough pressure on late-position opens, you are folding to aggression in spots that stronger pools defend more aggressively. A good analysis tool compresses the distance between data and correction. It should help a player decide what to review in their database, what spots to drill, and what strategic assumptions to revisit.
Why Parsing Matters More Than People Think
Most users assume the intelligence in a poker analyzer lives in the statistics layer. In reality, that layer is only as good as the parser beneath it. Parsing is the part that decides whether a hand was understood correctly at all. Get that wrong, and even beautifully presented output becomes misleading. That is what makes ShipInspector interesting from an engineering perspective. It is a product about analysis built on top of a reliability problem.
The broader lesson is that domain software often hides its hardest work in the least visible layer. People talk about dashboards, models, and insights because those are the parts they can see. But in products like this, the real leverage comes from building a normalization layer strong enough that the visible insights deserve trust. Once that exists, the higher-level features start to feel obvious. Before it exists, they are theater.
Outcome
ShipInspector gives serious players a way to turn their own hand histories into a readable map of how they actually play. It ingests PokerStars and GGPoker variants, survives format drift that would break brittle parsers, computes the core statistical profile of a strategy, and surfaces leak diagnostics that are useful enough to shape study. What I learned building it is that the interesting software problem was never “can we calculate poker stats?” It was “can we build a parser honest enough to keep those stats true?” Once that answer is yes, the rest of the product gets to matter.