Computer Vision
Standing up the full applied loop around a computer-vision product — dataset curation, evaluation, false-positive analysis, and human review — so it was reliable enough to ship.
Problem
Raw image data is noisy, and the interesting part of a computer-vision product isn't getting a model to fire — it's turning its output into signals reliable enough that a real product, and real people, can act on them. Accuracy on a clean test set rarely survives contact with the messy inputs production actually sees.
Approach
I helped stand up the full applied loop around the model, not just the model itself: curating datasets, running object detection and classification, evaluating results against held-out sets, analysing false positives and mining hard negatives, validating before anything shipped, and keeping a human review step in production. A semantic vision-language stage handled the "what is this, really?" judgment alongside the detectors. The point was a loop that improves over time, not a one-off model drop.
- Dataset curation as a first-class task
- Evaluation sets, not demo accuracy
- False-positive & hard-negative analysis
- Human-in-the-loop review in production
- Iterative retraining as new edge cases appear
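To make the evaluation and hard-negative steps concrete, here is a minimal sketch, not the project's actual code. It assumes each held-out prediction has already been matched to a ground-truth label (a real detector would first need box matching, for example by IoU), and every name in it (`Prediction`, `per_class_precision_recall`, `mine_hard_negatives`, the labels and scores) is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    image_id: str
    predicted: str     # label the model produced
    actual: str        # label from the held-out annotation
    confidence: float  # model score in [0, 1]


def per_class_precision_recall(preds):
    """Return {label: (precision, recall)} over a held-out set."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for p in preds:
        if p.predicted == p.actual:
            tp[p.predicted] += 1
        else:
            fp[p.predicted] += 1  # the model claimed this label wrongly
            fn[p.actual] += 1     # the true label was missed
    report = {}
    for label in sorted(set(tp) | set(fp) | set(fn)):
        claimed = tp[label] + fp[label]
        present = tp[label] + fn[label]
        precision = tp[label] / claimed if claimed else 0.0
        recall = tp[label] / present if present else 0.0
        report[label] = (precision, recall)
    return report


def mine_hard_negatives(preds, min_confidence=0.8):
    """Confident mistakes, most confident first: candidates for retraining."""
    wrong = [
        p for p in preds
        if p.predicted != p.actual and p.confidence >= min_confidence
    ]
    return sorted(wrong, key=lambda p: p.confidence, reverse=True)


# Toy held-out set, invented for illustration.
held_out = [
    Prediction("img_001", "radiator", "radiator", 0.95),
    Prediction("img_002", "radiator", "garage_door", 0.91),
    Prediction("img_003", "garage_door", "garage_door", 0.88),
    Prediction("img_004", "radiator", "garage_door", 0.55),
]
report = per_class_precision_recall(held_out)
hard_negatives = mine_hard_negatives(held_out)
```

On this toy data the "radiator" class has perfect recall but only one-in-three precision, and `img_002` is the single confident mistake worth adding to the next training round. A headline accuracy figure would show neither.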
What made it hard
The hardest part wasn't running the models. It was building enough trust around the results for them to be useful in a real product. False positives, missed detections, edge cases, and changing data distributions meant that accuracy alone wasn't enough. A detector confidently calling a garage door a radiator isn't a bug you train away in a single pass — it's a signal that the system needs a hard negative and a review step, not just a better model. The challenge was creating a process where model output could be evaluated, improved, and reviewed over time rather than treated as a one-time implementation.
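One way to picture the review step feeding the loop is the sketch below. It is a hypothetical in-memory queue, not the production workflow: `ReviewQueue`, its methods, and the sample image IDs are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewQueue:
    """Holds detections a person should check before the product acts on them."""
    pending: list = field(default_factory=list)
    hard_negatives: list = field(default_factory=list)

    def submit(self, image_id, label, confidence):
        self.pending.append((image_id, label, confidence))

    def resolve(self, image_id, reviewer_label):
        """Record a reviewer verdict; overturned detections become hard negatives."""
        for item in list(self.pending):
            if item[0] == image_id:
                self.pending.remove(item)
                predicted = item[1]
                if reviewer_label != predicted:
                    # Keep both labels so the next training round can use the pair.
                    self.hard_negatives.append((image_id, predicted, reviewer_label))
                return reviewer_label
        raise KeyError(image_id)


queue = ReviewQueue()
queue.submit("img_002", "radiator", 0.91)
queue.submit("img_005", "radiator", 0.87)
queue.resolve("img_002", "garage_door")  # reviewer overturns the model
queue.resolve("img_005", "radiator")     # reviewer confirms the model
```

After both verdicts the queue is empty and `hard_negatives` holds the single overturned case, `("img_002", "radiator", "garage_door")`, ready to be added to the training and evaluation sets.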
Trade-offs
The main trade-off was balancing automation against reliability. Increasing coverage often introduced more false positives, while making the system more conservative reduced useful detections. Instead of chasing perfect automation, we accepted that some decisions were better supported by review workflows, evaluation datasets, and iterative improvement. The goal wasn't a perfect model — it was a system that could improve safely as new edge cases appeared.
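The trade-off can be sketched as a confidence-threshold sweep plus a three-way routing rule. This is an illustration under assumptions rather than the thresholds or data we used: the scores are invented, "coverage" here means the share of correct detections that survive the threshold, and `sweep_thresholds`, `route`, and the cut-offs 0.9 and 0.4 are hypothetical.

```python
def sweep_thresholds(scored, thresholds):
    """scored: list of (confidence, is_correct) pairs.

    Returns one (threshold, kept, precision, coverage) row per threshold.
    """
    total_correct = sum(1 for _, ok in scored if ok)
    rows = []
    for t in thresholds:
        kept = [(c, ok) for c, ok in scored if c >= t]
        true_pos = sum(1 for _, ok in kept if ok)
        precision = true_pos / len(kept) if kept else 1.0
        coverage = true_pos / total_correct if total_correct else 0.0
        rows.append((t, len(kept), precision, coverage))
    return rows


def route(confidence, auto_accept=0.9, discard=0.4):
    """Send the uncertain middle band to a person instead of forcing a decision."""
    if confidence >= auto_accept:
        return "accept"
    if confidence < discard:
        return "discard"
    return "review"


# Invented scores: (model confidence, whether the detection was correct).
scored = [
    (0.95, True), (0.91, True), (0.88, False),
    (0.70, True), (0.55, False), (0.30, False),
]
rows = sweep_thresholds(scored, [0.3, 0.6, 0.9])
decisions = [route(c) for c, _ in scored]
```

On this toy data, raising the threshold from 0.3 to 0.9 lifts precision from 0.5 to 1.0 while coverage falls from 1.0 to two thirds. The routing rule sidesteps a single cut-off: two detections are accepted automatically, three go to review, and one is discarded.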
What I learned
The general lesson — that the problem is almost never the model — is in the companion article. What this work taught me specifically is that shipping computer vision is mostly building the system around the model: the evaluation, the review, and the loop that lets it improve safely as the data underneath it keeps changing.