FileScope

Intelligent file analyzer that classifies, audits, and cleans local files using machine learning algorithms.

AI/ML, Dark Theme

Tech Stack

Python, Tkinter, scikit-learn, TensorFlow/Keras, ResNet50, NumPy/Pandas, send2trash, zipfile, hashlib

Key Highlights

TF-IDF + Naive Bayes for text classification

ResNet50 feature extraction for image analysis

MD5 hash-based duplicate detection

Multi-threaded Tkinter GUI with progress tracking

Safe file operations with undo functionality

Exportable CSV reports and audit logs

Project Details

I built a desktop app that classifies, audits, and cleans local files, using TF-IDF + Naive Bayes for text and ResNet50 features for images. It then lets you compress, archive, or delete files safely through a Tkinter UI.

Multimodal classification:

**Text:** TF-IDF vectorization → Multinomial Naive Bayes (topic/doctype labels).

**Images:** ResNet50 (pretrained) feature extractor → lightweight classifier (logistic regression or SVM).
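A minimal sketch of the text branch described above, assuming scikit-learn's `TfidfVectorizer` and `MultinomialNB` chained in a pipeline. The toy corpus and the `finance`/`notes` labels are hypothetical placeholders, not the project's training data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus; the real training data and labels are not shown here.
train_texts = [
    "invoice total amount due payment",
    "payment receipt invoice number tax",
    "meeting agenda notes action items",
    "notes from weekly meeting agenda",
]
train_labels = ["finance", "finance", "notes", "notes"]

# TF-IDF turns each document into a weighted term vector;
# Multinomial Naive Bayes then learns per-class term distributions.
text_clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
text_clf.fit(train_texts, train_labels)

# Classify the extracted text of a new file.
predicted = text_clf.predict(["invoice payment due"])[0]
```

Keeping the vectorizer and classifier in one pipeline object means a single persisted artifact can be reloaded for prediction, which fits the model persistence mentioned under contributions.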

**Batch scan & insights:** Recursively indexes folders, extracts metadata (size, type, mtime), computes MD5 hashes to detect duplicates, and surfaces large, old, or rarely opened files as cleanup candidates.
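The duplicate detection step can be sketched roughly as follows, assuming a size pre-filter so that only files sharing a byte size are hashed. The function names `md5_of_file` and `find_duplicates` are illustrative, not the project's actual API.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded fully into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return groups of paths under `root` that share both size and MD5 digest."""
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                by_size[os.path.getsize(path)].append(path)
            except OSError:
                continue  # unreadable or vanished file: skip it

    by_hash = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate, so skip hashing
        for path in paths:
            try:
                by_hash[(size, md5_of_file(path))].append(path)
            except OSError:
                continue

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

The size check is cheap metadata, so it avoids reading most files at all; MD5 is used here for content matching only, not for security.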

**Interactive GUI:** Tkinter table with filters, preview pane, progress bars, cancel-safe scanning, and one-click actions (compress to ZIP, move to archive, safe delete to OS trash).
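One common way to get cancel-safe scanning is a background thread that reports through a queue and checks a cancellation flag between files. The sketch below assumes that pattern; `scan_worker`, `start_scan`, and the message tuples are hypothetical names.

```python
import os
import queue
import threading


def scan_worker(root, results, cancel_event):
    """Walk `root`, push file paths to `results`, and stop early if cancelled."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if cancel_event.is_set():
                results.put(("cancelled", None))
                return
            results.put(("file", os.path.join(dirpath, name)))
    results.put(("done", None))


def start_scan(root):
    """Start the scan in a daemon thread and return handles for the UI to use."""
    results = queue.Queue()
    cancel_event = threading.Event()
    worker = threading.Thread(
        target=scan_worker, args=(root, results, cancel_event), daemon=True
    )
    worker.start()
    return worker, results, cancel_event
```

In a Tkinter app, the main thread would typically drain `results` with `get_nowait()` from a callback scheduled via `after(...)`, so widgets are only touched from the UI thread; a Cancel button would simply call `cancel_event.set()`.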

**Quality & reporting:** Confusion matrix, precision/recall/F1, per-class support; exportable CSV of findings and actions log for auditability.
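To make the reported metrics concrete, here is a standard-library sketch of per-class precision, recall, F1, and support, plus CSV export. scikit-learn's `sklearn.metrics` module offers equivalents; the helper names and column layout below are assumptions for illustration.

```python
import csv
import io
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, F1, and support for each label."""
    labels = sorted(set(y_true) | set(y_pred))
    pair_counts = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    rows = []
    for label in labels:
        tp = pair_counts[(label, label)]
        predicted = sum(c for (_t, p), c in pair_counts.items() if p == label)
        support = sum(c for (t, _p), c in pair_counts.items() if t == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({"label": label, "precision": precision,
                     "recall": recall, "f1": f1, "support": support})
    return rows


def metrics_to_csv(rows):
    """Serialize metric rows to CSV text, ready to write to a report file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["label", "precision", "recall", "f1", "support"]
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, with true labels `["doc", "doc", "img", "img", "img"]` and predictions `["doc", "img", "img", "img", "doc"]`, the `doc` class has precision 1/2 and recall 1/2, while `img` has precision 2/3 and recall 2/3 with support 3.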

**Safety rails:** Dry-run mode, undo queue, permission checks, integrity verification after compress/move.
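A rough sketch of how dry-run mode, permission checks, and post-compress integrity verification might fit together, using only the standard `zipfile` module. The function name `compress_to_zip` and the plan format are hypothetical.

```python
import os
import zipfile


def compress_to_zip(paths, archive_path, dry_run=True):
    """Return the planned actions; write the archive only when dry_run is False."""
    # Permission check: plan only files the current user can read.
    plan = [("add", path) for path in paths if os.access(path, os.R_OK)]
    if dry_run:
        return plan  # nothing on disk is touched in dry-run mode

    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for _action, path in plan:
            # Assumption: base names are unique; a real tool would keep relative paths.
            archive.write(path, arcname=os.path.basename(path))

    # Integrity verification: testzip() returns the first corrupt member, or None.
    with zipfile.ZipFile(archive_path) as archive:
        bad_member = archive.testzip()
        if bad_member is not None:
            raise zipfile.BadZipFile(f"corrupt member: {bad_member}")
    return plan
```

Originals would only be removed after verification succeeds, and sending them to the OS trash (for instance with `send2trash`) rather than deleting them outright keeps the step reversible.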

My contributions:

Implemented the text & image pipelines, feature caching, and model persistence; wrote the duplicate finder (hash + size heuristics).

Built the Tkinter UI (virtualized table, preview, progress), multi-threaded scanning worker, and action handlers with rollback.

Authored evaluation scripts and reporting (metrics, CSV export), plus config profiles for "aggressive" vs "conservative" cleanup.

© 2025 Hüseyin Bora Baran. All rights reserved.