Technical
Real-time player tracking with SAM3
How we built a vision system that tracks players across frames with persistent IDs, even through occlusions and camera cuts.
Sports video analysis has always been limited by the tedious manual work of tagging players frame by frame. With SAM3 and some careful engineering, we've reduced hours of manual work to real-time inference.
The Challenge
Traditional tracking systems struggle with sports footage because:
- Players frequently occlude each other
- Camera cuts break tracking continuity
- Jersey numbers are often unreadable at broadcast resolution
- Fast motion causes blur and detection failures
Our Approach
We combined SAM3's segmentation capabilities with a custom re-identification model trained on sports-specific features. The key insight was treating player identity as a graph problem rather than a frame-by-frame classification.
Physics-Aware Tracking
We added constraints based on physical plausibility—a player can't teleport across the field in one frame. This simple heuristic eliminated 60% of ID switches.
def validate_track(prev_pos, curr_pos, max_velocity=12):
"""12 m/s is roughly max sprint speed"""
distance = np.linalg.norm(curr_pos - prev_pos)
return distance < max_velocity * frame_dtResults
On our benchmark of NFL broadcast footage, we achieved 41% better tracking accuracy compared to off-the-shelf solutions, with real-time inference at 30fps on an RTX 4090.
Want to see this in action?
We build systems like this for clients. Let's talk.