Built end-to-end

Jigsaw Puzzle Solver AI

Swift, SwiftUI, Core Graphics, Python, PyTorch

Goal

Create the fastest jigsaw puzzle solver AI in the world, paired with a physical robot that can autonomously solve puzzles in real time. This is a multi-phase research project combining computer vision, supervised learning, reinforcement learning, and robotics.

Phase 1: Data Generation ✅

Built a native macOS app that generates high-quality jigsaw puzzle training data at scale.

  • Openverse integration: search and download tens of thousands of Creative Commons images directly into projects, with licence and attribution preserved
  • Puzzle cutting engine that draws realistic interlocking edges as CGPath cubic Bézier curves with randomised control points
  • Adjacent pieces share edge curves (forward/reversed traversal) for perfect interlocking fit
  • Configurable grid sizes from 1x2 to 100x100
  • AI normalisation pipeline: centre-crop, resize, and pad every piece to a uniform square canvas for ML training
  • Batch processing: process multiple images in one go with per-item progress tracking
  • Pieces exported as transparent PNGs with metadata JSON (bounding boxes, neighbour lists, grid positions)
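The shared-edge idea above can be sketched in a few lines. The app itself builds its edges with CGPath in Swift; the Python below is only an illustration under assumptions: the function names (`make_edge`, `reversed_edge`, `point_on`), the two-segment tab shape, and the `jitter` parameter are all hypothetical. It shows why traversing one stored curve forward for one piece and reversed for its neighbour gives an exact fit: both pieces trace the same points.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Cubic = Tuple[Point, Point, Point, Point]  # start, control 1, control 2, end


def make_edge(start: Point, end: Point, rng: random.Random, jitter: float = 0.2) -> List[Cubic]:
    """Build one edge as two cubic segments meeting at a tab tip (simplified shape)."""
    (x0, y0), (x1, y1) = start, end
    dx, dy = x1 - x0, y1 - y0
    nx, ny = -dy, dx  # normal to the edge, same length as the edge
    sign = rng.choice((-1.0, 1.0))  # the tab points to one side or the other

    def at(t: float, h: float) -> Point:
        # Point at fraction t along the edge, offset h along the normal, lightly randomised
        t += rng.uniform(-jitter, jitter) * 0.1
        h += rng.uniform(-jitter, jitter) * 0.1
        return (x0 + dx * t + nx * h * sign, y0 + dy * t + ny * h * sign)

    tip = at(0.5, 0.25)
    first = (start, at(0.6, 0.0), at(0.2, 0.25), tip)
    second = (tip, at(0.8, 0.25), at(0.4, 0.0), end)
    return [first, second]


def reversed_edge(edge: List[Cubic]) -> List[Cubic]:
    """Same curve traversed end to start: reverse the segment order and each segment's points."""
    return [(p3, p2, p1, p0) for (p0, p1, p2, p3) in reversed(edge)]


def point_on(seg: Cubic, t: float) -> Point:
    """Evaluate a cubic Bezier segment at parameter t in [0, 1]."""
    u = 1.0 - t
    w = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    return (sum(wi * p[0] for wi, p in zip(w, seg)),
            sum(wi * p[1] for wi, p in zip(w, seg)))


# One piece uses `edge`, its neighbour uses `mate`; both describe the same curve.
edge = make_edge((0.0, 0.0), (100.0, 0.0), random.Random(7))
mate = reversed_edge(edge)
```

Because the neighbour's edge is derived from the stored curve rather than generated separately, the randomised control points never need to be matched up after the fact.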

Phase 2: Dataset Generation ✅

Structured ML dataset creation from 2-piece puzzles, with labelled pair categories and leakage-free splits.

  • Four pair categories: correct match (same image, same edges), wrong shape match (same edges, different image), wrong image match (same image, different edges), wrong nothing (different image and different edges)
  • Image-level train/validation/test splits to prevent data leakage
  • Shared grid edges enable shape-match pairs across images for harder negatives
  • Larger grids produce more pair positions and diverse piece types (corners, edges, interior)
  • Datasets persisted as independent entities (survive project deletion), exportable to external directories
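As a rough illustration of the two ideas above, the sketch below assigns whole images to splits and labels a pair by whether its two pieces share an image and an edge shape. It is a minimal sketch, not the app's code: the function names, the label strings, the 80/10/10 ratios, and the seeded shuffle are all assumptions.

```python
import random
from typing import Dict, List


def split_by_image(image_ids: List[str], seed: int = 0,
                   train: float = 0.8, valid: float = 0.1) -> Dict[str, List[str]]:
    """Assign whole images to splits so no image contributes pieces to two splits."""
    ids = sorted(set(image_ids))      # deterministic starting order
    random.Random(seed).shuffle(ids)  # seeded shuffle for reproducibility
    n_train = int(len(ids) * train)
    n_valid = int(len(ids) * valid)
    return {
        "train": ids[:n_train],
        "valid": ids[n_train:n_train + n_valid],
        "test": ids[n_train + n_valid:],  # remainder goes to test
    }


def pair_category(same_image: bool, same_edges: bool) -> str:
    """Map a piece pair to one of the four categories (label strings are hypothetical)."""
    if same_image and same_edges:
        return "correct"
    if same_edges:
        return "wrong_shape_match"   # edges fit, image content differs
    if same_image:
        return "wrong_image_match"   # image content agrees, edges differ
    return "wrong_nothing"           # neither image nor edges agree


splits = split_by_image([f"img{i:03d}" for i in range(20)], seed=1)
```

Splitting by image rather than by pair matters because pairs cut from the same image share texture and colour statistics; if they landed in both train and test, the test score would overstate how well the model generalises.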

Phase 3: Model Architecture and Training ✅

Siamese Neural Network training infrastructure built directly into the app.

  • Architecture presets: reusable SNN configurations with 3 built-in defaults (Quick Test, Recommended, High Capacity)
  • Configurable ConvBlocks, comparison methods, and hyperparameters
  • Auto-generates self-contained PyTorch training scripts + requirements.txt
  • Local training: auto venv creation, pip install, subprocess execution with live epoch progress
  • Cloud training via SSH: uploads dataset + scripts via scp, runs on remote GPU instances, streams live progress, auto-downloads results
  • Experiment tracking: script SHA-256 hashes, preset names, notes, timestamps, metrics comparison table
  • Auto-imports metrics.json and Core ML model on completion
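The experiment-tracking bullet can be pictured with a small sketch. It assumes a record shape that the write-up does not specify: the field names and the `experiment_record` helper are hypothetical, and only the use of a SHA-256 hash of the training script comes from the description above.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict


def script_sha256(script_text: str) -> str:
    """Hex SHA-256 of the generated training script, used to tell runs apart."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


def experiment_record(script_text: str, preset_name: str, notes: str,
                      metrics: Dict[str, float]) -> dict:
    """Bundle what a run needs to be compared later (field names are assumptions)."""
    return {
        "script_sha256": script_sha256(script_text),
        "preset": preset_name,
        "notes": notes,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "metrics": metrics,
    }


record = experiment_record(
    script_text="print('training placeholder')\n",  # stand-in for a generated script
    preset_name="Quick Test",
    notes="illustrative run",
    metrics={"val_accuracy": 0.0},                  # placeholder value, not a real result
)
manifest_json = json.dumps(record, indent=2)
```

Hashing the script text means two runs with the same preset name but different generated code remain distinguishable in the comparison table.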

Phase 4: Piece Matching Model (next)

Train the Siamese Neural Network on generated datasets to learn piece compatibility. Evaluate matching accuracy across different puzzle complexities and ambiguous textures.

Phase 5: Assembly Solver (planned)

Develop an RL agent or optimisation-based solver that uses pairwise match scores to plan and execute full puzzle assembly.
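This phase is still planned, so the following is only a toy baseline showing how pairwise match scores could drive assembly; it is not the solver described above. It assumes a single row of pieces and a hypothetical `score[(a, b)]` table giving how well piece `b` fits to the right of piece `a`, then chains pieces greedily from every possible starting piece and keeps the best chain.

```python
from typing import Dict, List, Tuple


def greedy_row(pieces: List[str], score: Dict[Tuple[str, str], float]) -> List[str]:
    """Order pieces left to right by repeatedly appending the best-scoring right neighbour."""
    best_order: List[str] = []
    best_total = float("-inf")
    for start in pieces:
        order, total = [start], 0.0
        remaining = set(pieces) - {start}
        while remaining:
            last = order[-1]
            # Missing pairs count as score 0.0; sorting makes tie-breaking deterministic
            nxt = max(sorted(remaining), key=lambda p: score.get((last, p), 0.0))
            total += score.get((last, nxt), 0.0)
            order.append(nxt)
            remaining.remove(nxt)
        if total > best_total:
            best_order, best_total = order, total
    return best_order


# Hypothetical match scores for three pieces
scores = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "A"): 0.2, ("B", "A"): 0.1}
row = greedy_row(["C", "A", "B"], scores)
```

A greedy chain like this can commit to a locally good pair that blocks a better global arrangement, which is one reason to consider an RL agent or a global optimiser for full grids.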

Phase 6: Physical Robot (planned)

Add computer vision to detect physical pieces from an overhead camera, and integrate a robot arm with a vacuum gripper for pick-rotate-place operations, bridging the virtual and physical worlds.

Tech Decisions

  • Native macOS app in Swift + SwiftUI for maximum performance on image processing
  • Puzzle cutting and image processing implemented natively on Core Graphics, with no third-party dependencies in the app
  • PyTorch for model training with Core ML export for on-device inference
  • SSH-based cloud training for GPU access without leaving the app
  • Four-level project hierarchy: Project > Cut > CutImageResult > Pieces
  • Full persistence layer with JSON manifests for all entities
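To make the four-level hierarchy and its JSON manifests concrete, here is a minimal sketch. The app is written in Swift, so this Python mirror is purely illustrative: the class fields and the `to_manifest` helper are assumptions, loosely based on the piece metadata listed in Phase 1 (bounding boxes, neighbour lists, grid positions).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Piece:
    file: str                  # exported transparent PNG
    grid_position: List[int]   # [row, column]
    bounding_box: List[float]  # [x, y, width, height]
    neighbours: List[str]      # file names of adjacent pieces


@dataclass
class CutImageResult:
    image_name: str
    pieces: List[Piece] = field(default_factory=list)


@dataclass
class Cut:
    rows: int
    columns: int
    results: List[CutImageResult] = field(default_factory=list)


@dataclass
class Project:
    name: str
    cuts: List[Cut] = field(default_factory=list)


def to_manifest(project: Project) -> str:
    """Serialise the whole hierarchy to one JSON document."""
    return json.dumps(asdict(project), indent=2)


left = Piece("p0.png", [0, 0], [0.0, 0.0, 60.0, 50.0], ["p1.png"])
right = Piece("p1.png", [0, 1], [40.0, 0.0, 60.0, 50.0], ["p0.png"])
project = Project("demo", [Cut(1, 2, [CutImageResult("photo.jpg", [left, right])])])
manifest = to_manifest(project)
```

Keeping each level as plain data makes the manifests easy to read from the generated Python training scripts as well as from the app.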

Outcome

Built a native macOS puzzle generator app that produces training data at scale, with interlocking pieces cut along Bézier curves.