SnapMath
End-to-end system that converts handwritten or printed math into LaTeX, with SymPy-based solving as the planned next step.
Tech Stack
Key Highlights
CROHME'14 (handwritten): 62% EM, 6.1% SER
im2latex-100k (printed): 55% EM, 6.7% CER
Trained in 2h55m on 2× NVIDIA L40 (FP16 DDP)
Inference ≈ 42 ms/expression; ~220M params
~116k CROHME expressions processed
Custom data collator and evaluation metrics
Project Details
I built an end-to-end system that converts images of handwritten or printed math into LaTeX, using a ViT-Base + TrOCR-Base VisionEncoderDecoder model served through a Flask web app. Solving the recognized expressions with SymPy is the next step.
**Results:** CROHME'14 (handwritten): 62% Exact-Match (EM), 6.1% SER; im2latex-100k (printed): 55% EM, 6.7% CER.
**Performance:** Trained in 2h55m on 2× NVIDIA L40 (FP16 DDP); inference ≈ 42 ms/expression; ~220M params.
**Data & Prep:** ~116k CROHME expressions; images resized to 384×384 and histogram-equalized; 83 math symbols added to the TrOCR BPE tokenizer vocabulary.
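As an illustration of the image preparation step, here is a minimal pure-Python sketch of histogram equalization followed by a resize to 384×384. It is not the project's code: the function names, the nearest-neighbor resize, and the resize-then-equalize order are assumptions, and a real pipeline would more likely use an image library with proper interpolation.

```python
from typing import List

Image = List[List[int]]  # rows of 8-bit grayscale pixels (0-255)


def equalize_histogram(img: Image) -> Image:
    """Stretch pixel intensities over 0-255 using the cumulative histogram."""
    hist = [0] * 256
    for row in img:
        for p in row:
            hist[p] += 1

    # Cumulative distribution of pixel counts.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: there is nothing to equalize.
        return [row[:] for row in img]

    # Lookup table mapping each old intensity to its equalized value.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * 255)) for c in cdf]
    return [[lut[p] for p in row] for row in img]


def resize_nearest(img: Image, size: int = 384) -> Image:
    """Nearest-neighbor resize to size x size (assumed; ignores aspect ratio)."""
    h, w = len(img), len(img[0])
    return [[img[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def preprocess(img: Image, size: int = 384) -> Image:
    """Resize first, then equalize (the order is an assumption)."""
    return equalize_histogram(resize_nearest(img, size))
```

Calling `preprocess` on any non-empty grayscale image returns a 384×384 grid whose darkest pixel maps to 0 and brightest to 255, which is the contrast normalization the equalization step is meant to provide.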
**Engineering:** Custom data collator, label smoothing + AdamW, beam decoding, ONNX export path.
**App:** Flask UI with secure uploads (MIME/extension whitelist, size limits, secure_filename) and live preview.
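The upload checks can be sketched without any framework. The example below is a standard-library illustration only, not the app's actual handler: `validate_upload`, the allowed extensions, the 5 MB limit, and the magic-byte check (standing in for MIME validation) are all assumptions. In a Flask app, `werkzeug.utils.secure_filename` and the `MAX_CONTENT_LENGTH` config key would typically cover the filename and size parts.

```python
import os

# Hypothetical whitelist and limit; the real app's values are not stated.
ALLOWED_EXTENSIONS = {"png", "jpg", "jpeg"}
MAX_UPLOAD_BYTES = 5 * 1024 * 1024

# Leading bytes expected for each allowed type (a stand-in for a MIME check).
MAGIC_BYTES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpg": b"\xff\xd8\xff",
    "jpeg": b"\xff\xd8\xff",
}


def validate_upload(filename: str, data: bytes) -> str:
    """Return the lowercase extension if the upload passes every check.

    Raises ValueError on a disallowed extension, an oversized payload,
    or content that does not match the claimed file type.
    """
    # Drop any directory components a client may have sent.
    name = os.path.basename(filename.replace("\\", "/"))
    ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""

    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"extension not allowed: {ext!r}")
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    if not data.startswith(MAGIC_BYTES[ext]):
        raise ValueError("content does not match extension")
    return ext
```

Checking the content bytes as well as the extension means a renamed executable is rejected even when its filename looks like an image.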
My contributions:
• Designed and fine-tuned ViT-Base + TrOCR; implemented FP16 DDP training and ablations (FP32 vs FP16, vocab extension, ViT-Large).
• Built the preprocessing/tokenization pipeline and evaluation (EM/SER/CER).
• Developed the Flask frontend and backend, planned the inference and LaTeX-rendering integration, and prepared the ONNX deployment path.
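For readers unfamiliar with the metrics, the following sketch shows one common way to compute EM and edit-distance-based error rates. It rests on assumptions the page does not confirm: SER is treated here as an error rate over whitespace-separated LaTeX tokens, CER as the same rate over characters, and both are aggregated over the whole corpus rather than averaged per expression. The function names are illustrative.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def exact_match(refs: List[str], hyps: List[str]) -> float:
    """Fraction of predictions identical to their reference."""
    return sum(r == h for r, h in zip(refs, hyps)) / len(refs)


def error_rate(refs: List[Sequence], hyps: List[Sequence]) -> float:
    """Total edits divided by total reference length (corpus level)."""
    edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return edits / sum(len(r) for r in refs)


refs = ["x ^ 2 + 1", r"\frac { a } { b }"]
hyps = ["x ^ 2 + 1", r"\frac { a } { c }"]

em = exact_match(refs, hyps)
cer = error_rate(refs, hyps)  # characters
ser = error_rate([r.split() for r in refs],
                 [h.split() for h in hyps])  # tokens (assumed meaning of SER)
```

In this toy pair, one wrong symbol drops EM to 0.5 while the token-level rate stays small, which is why a strict metric like EM is usually reported alongside an edit-based one.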