SnapMath

End-to-end system that converts handwritten or printed math into LaTeX and solves the resulting expressions with SymPy.

AI/ML

Tech Stack

PyTorch, Hugging Face Transformers, ViT-B/16, TrOCR-Base, Flask, Bootstrap, ONNX, SymPy, DDP/NCCL

Key Highlights

CROHME'14 (handwritten) 62% Exact-Match, 6.1% SER

im2latex-100k (printed) 6.7% CER, 55% EM

Trained in 2h55m on 2× NVIDIA L40 (FP16 DDP)

Inference ≈ 42 ms/expression; ~220M params

~116k CROHME expressions processed

Custom data collator and evaluation metrics

Project Details

I built an end-to-end system that converts images of handwritten or printed math into LaTeX and, as the next step, solves the recognized expressions with SymPy. It uses a ViT-Base + TrOCR-Base VisionEncoderDecoder model and a Flask web app.

**Results:** CROHME'14 (handwritten) 62% Exact-Match, 6.1% SER; im2latex-100k (printed) 6.7% CER, 55% EM.
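For readers unfamiliar with these metrics, here is a minimal sketch of how they could be computed. It assumes SER means symbol error rate (edit distance over whitespace-separated LaTeX tokens) and CER means character error rate, both normalized by total reference length. The function names are illustrative and are not taken from the project's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def exact_match(refs, hyps):
    """Fraction of predictions identical to their reference string."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


def error_rate(refs, hyps, tokenize):
    """Total edit distance divided by total reference length, in tokens."""
    errors = sum(edit_distance(tokenize(r), tokenize(h))
                 for r, h in zip(refs, hyps))
    length = sum(len(tokenize(r)) for r in refs)
    if length == 0:
        raise ValueError("references contain no tokens")
    return errors / length


def character_error_rate(refs, hyps):
    # Assumption: CER counts every character, including spaces.
    return error_rate(refs, hyps, list)


def symbol_error_rate(refs, hyps):
    # Assumption: LaTeX symbols are separated by whitespace.
    return error_rate(refs, hyps, str.split)
```

Under these definitions a single wrong symbol drops Exact-Match for that expression to zero while barely moving SER or CER, which is one plausible reason the Exact-Match figures sit well below what the error rates alone might suggest.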

**Performance:** Trained in 2h55m on 2× NVIDIA L40 (FP16 DDP); inference ≈ 42 ms/expression; ~220M params.

**Data & Prep:** ~116k CROHME expressions; images normalized to 384×384 with histogram equalization; 83 math symbols added to the TrOCR BPE tokenizer.
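The image normalization step can be illustrated with a small dependency-free sketch. It assumes 8-bit grayscale input held as a list of rows and a plain nearest-neighbor resize that ignores aspect ratio. The real pipeline most likely relies on an imaging library, and its exact resizing strategy is not stated here, so treat the helper names and choices below as hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Spread grayscale intensities so their cumulative distribution is roughly linear."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


def resize_nearest(image, size=384):
    """Nearest-neighbor resize to size x size (aspect ratio is not preserved)."""
    h, w = len(image), len(image[0])
    return [[image[r * h // size][c * w // size] for c in range(size)]
            for r in range(size)]


def preprocess(image, size=384):
    """Equalize contrast, then resize to the model's input resolution."""
    return resize_nearest(equalize_histogram(image), size)
```

Equalizing before resizing keeps the histogram computed on the original pixels. Doing it in the other order would also be a reasonable choice.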

**Engineering:** Custom data collator, label smoothing + AdamW, beam decoding, ONNX export path.
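The collator itself is not shown on this page. As a rough illustration of what such a component usually does for a sequence-to-sequence OCR model, the sketch below pads variable-length label sequences to a common length with -100, the value that PyTorch's cross-entropy loss ignores by default. It uses plain Python lists so it runs without dependencies; in training these would be tensors. The field names `pixel_values` and `labels` are assumed.

```python
IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def collate(batch, pad_label=IGNORE_INDEX):
    """Pad each example's label ids to the longest sequence in the batch.

    `batch` is a list of dicts with an image under "pixel_values" and a
    list of token ids under "labels" (assumed field names).
    """
    max_len = max(len(example["labels"]) for example in batch)
    return {
        # Images are already a fixed size, so they are simply gathered.
        "pixel_values": [example["pixel_values"] for example in batch],
        # Padding positions carry pad_label so the loss can skip them.
        "labels": [
            example["labels"] + [pad_label] * (max_len - len(example["labels"]))
            for example in batch
        ],
    }
```

Padding with the ignore index rather than the tokenizer's pad id means padded positions contribute nothing to the loss, including the label-smoothed loss.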

**App:** Flask UI with secure uploads (MIME/extension whitelist, size limits, secure_filename) and live preview.
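The upload checks can be sketched in a framework-independent way. The whitelists and the 5 MiB limit below are placeholder values, not the app's actual configuration, and `validate_upload` is a hypothetical helper name.

```python
import os

ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg"}    # hypothetical whitelist
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg"}  # hypothetical whitelist
MAX_UPLOAD_BYTES = 5 * 1024 * 1024                # hypothetical 5 MiB limit


def validate_upload(filename, mime_type, data):
    """Return (ok, reason) for an uploaded file.

    filename:  name supplied by the client
    mime_type: Content-Type declared for the upload
    data:      raw file bytes
    """
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, "extension not allowed"
    if mime_type not in ALLOWED_MIME_TYPES:
        return False, "MIME type not allowed"
    if len(data) == 0:
        return False, "empty file"
    if len(data) > MAX_UPLOAD_BYTES:
        return False, "file too large"
    return True, "ok"
```

In a Flask app, the client-supplied name would additionally go through Werkzeug's `secure_filename` before the file is written to disk, and Flask's `MAX_CONTENT_LENGTH` setting can reject oversized requests before they reach the view.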

My contributions:

Designed and fine-tuned ViT-Base + TrOCR; implemented FP16 DDP training and ablations (FP32 vs FP16, vocab extension, ViT-Large).

Built the preprocessing/tokenization pipeline and evaluation (EM/SER/CER).

Developed the Flask frontend and backend, wired up inference and the planned LaTeX rendering, and prepared ONNX deployment.

© 2025 Hüseyin Bora Baran. All rights reserved.