
Projects

The things I've built — from SaaS platforms and NLP tools to systems programming, data visualization, and deep learning.

Wheelbase

Multi-tenant dealership management SaaS — auction management, vehicle inventory, reconditioning workflows, AI assistant, collaborative docs, and programmatic marketing video generation.

A Turborepo monorepo with Bun workspaces containing 5 applications and 4 shared packages. Full vehicle lifecycle management from VIN decoding through inventory intake, customizable status pipelines (Kanban + list views), reconditioning workflows, and auction scheduling with runlist CSV import. One dealership is actively using the platform.

Technical Highlights

tRPC API Layer: End-to-end type-safe API with 20+ routers and 150+ procedures, TanStack Query for server state with optimistic updates.
Go VIN Decoder: Gin HTTP service backed by a ~2GB local SQLite database (~1.6M pattern rows, ~8.7M valid-character rows). Multi-pass VDS pattern matching, check-digit validation, auto-correction for single-character errors.
AI System: Natural language → DSL query builder with tenant-scoped execution, risk classification for write operations, approval token workflow for destructive actions, streaming chat via OpenRouter proxy.
Real-time Collaboration: TipTap editor with extensions (code blocks, tables, task lists, KaTeX math, Mermaid diagrams), Yjs CRDT for conflict-free concurrent editing, Supabase Realtime as transport layer.
Multi-Tenancy: PostgreSQL Row-Level Security on 30+ tables, tenant isolation via tenant_id scoping, role-based access control (admin, manager, tech, detailer, read_only).
Runlist Upload: Streaming CSV processing with constant memory usage, dynamic column-to-field mapping, 500-record batch inserts with atomic guarantees; orphaned records are auto-deleted on link failure.
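
The check-digit validation and single-character auto-correction mentioned for the VIN decoder can be illustrated with a short sketch. It is written in Python for brevity and is not the service's Go code. It assumes the standard North American check-digit scheme (position 9, weighted sum modulo 11), and the function names are hypothetical. A single corrupted character usually has several substitutions that pass the check, so a real decoder would need extra evidence, such as pattern matching, to choose among them.

```python
# Transliteration values for the check-digit sum; I, O and Q never appear in a VIN.
_VALUES = {str(d): d for d in range(10)}
_VALUES.update(zip("ABCDEFGH", range(1, 9)))
_VALUES.update(zip("JKLMN", range(1, 6)))
_VALUES.update({"P": 7, "R": 9})
_VALUES.update(zip("STUVWXYZ", range(2, 10)))

# Position weights; the check digit itself (position 9) has weight 0.
_WEIGHTS = (8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2)


def check_digit(vin):
    """Compute the expected check character for a 17-character VIN."""
    total = sum(_VALUES[c] * w for c, w in zip(vin, _WEIGHTS))
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def is_valid(vin):
    """True if the VIN has 17 legal characters and a matching check digit."""
    vin = vin.upper()
    return (
        len(vin) == 17
        and all(c in _VALUES for c in vin)
        and vin[8] == check_digit(vin)
    )


def single_char_candidates(vin):
    """List every VIN that differs in exactly one position and passes the check."""
    vin = vin.upper()
    candidates = []
    for i in range(len(vin)):
        for c in _VALUES:
            if c != vin[i]:
                candidate = vin[:i] + c + vin[i + 1:]
                if is_valid(candidate):
                    candidates.append(candidate)
    return candidates
```

Calling `single_char_candidates` on a VIN with one mistyped character returns a list that includes the intended VIN alongside other check-passing alternatives.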

Stack

Next.js 16, React 19, TypeScript, tRPC, TanStack Query, Zustand, Shadcn/ui, Tailwind CSS, Go 1.25, Gin, SQLite, Supabase, TipTap, Yjs, Turborepo, Bun, Docker, MinIO, Remotion

Grammario

Full-stack linguistic analysis platform that helps language learners understand grammar through interactive visualizations and AI-powered explanations.

Users enter a sentence in one of 5 supported languages (Italian, Spanish, German, Russian, Turkish) and receive deep grammatical analysis — tokenization, lemmatization, POS tagging, morphological analysis, and dependency parsing — all visualized as interactive syntax trees and linear dependency graphs. LLM-generated pedagogical insights explain why grammar works the way it does. A gamification layer (streaks, XP/levels, achievements) and spaced-repetition vocabulary system (SM-2 algorithm) drive engagement.
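
The SM-2 scheduling named above can be sketched as follows. This is one common formulation of the published algorithm, not Grammario's actual code; the `Card` type and `review` function are hypothetical names. Implementations differ on details such as whether a failed review also changes the ease factor. This sketch updates it on every review and clamps it at 1.3.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0     # consecutive successful reviews
    interval_days: int = 0   # days until the next review
    ease: float = 2.5        # SM-2 "E-Factor"


def review(card, quality):
    """Return the card's next state after a review graded 0 (blackout) to 5 (perfect)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality >= 3:
        # Successful recall: 1 day, then 6 days, then grow by the ease factor.
        if card.repetitions == 0:
            interval = 1
        elif card.repetitions == 1:
            interval = 6
        else:
            interval = round(card.interval_days * card.ease)
        repetitions = card.repetitions + 1
    else:
        # Failed recall: restart the repetition sequence.
        repetitions, interval = 0, 1

    # Ease factor update from the SM-2 description, floored at 1.3.
    miss = 5 - quality
    ease = card.ease + (0.1 - miss * (0.08 + miss * 0.02))
    return Card(repetitions, interval, max(1.3, ease))
```

Three perfect reviews in a row schedule a new word at 1, 6, and then 16 days; a later failing grade resets the interval to 1 day and lowers the ease factor.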

Technical Highlights

NLP Strategy Pattern: Language-family-specific processing via RomanceStrategy (clitics, multi-word token expansion), InflectionStrategy (case governance, verbal aspect), and AgglutinativeStrategy (Turkish morpheme segmentation with vowel harmony and consonant softening).
LLM Integration: Dual-provider setup (OpenRouter primary, OpenAI fallback) with response caching (100-entry manual cache), JSON-mode parsing, and language-specific prompt engineering for pedagogical output.
Backend: FastAPI (async Python 3.11), Stanford NLP (Stanza) pipelines with LRU-based model caching (up to 5 language models in memory with eviction), Pydantic v2 schema validation.
Infrastructure: Dockerized multi-stage builds, Nginx reverse proxy with rate limiting (10 req/s), SSL via Let's Encrypt, GitHub Actions CI/CD (test → build → push to GHCR → SSH deploy to DigitalOcean), frontend on Vercel.
Video Generation: Remotion-based promotional video system rendering landscape (YouTube) and portrait (Shorts/TikTok/Reels) formats programmatically from React components.
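
The LRU-based model caching described for the backend could look roughly like this. It is a minimal sketch, not the project's implementation: `ModelCache` and its `loader` callback are hypothetical, and the loader stands in for whatever builds a Stanza pipeline for a language code.

```python
from collections import OrderedDict


class ModelCache:
    """Keep at most `capacity` loaded models, evicting the least recently used."""

    def __init__(self, loader, capacity=5):
        self._loader = loader          # callable: language code -> loaded model
        self._capacity = capacity
        self._models = OrderedDict()   # ordered from least to most recently used

    def get(self, lang):
        if lang in self._models:
            # Cache hit: mark this language as most recently used.
            self._models.move_to_end(lang)
            return self._models[lang]
        # Cache miss: load the model, then evict the oldest entry if over capacity.
        model = self._loader(lang)
        self._models[lang] = model
        if len(self._models) > self._capacity:
            self._models.popitem(last=False)
        return model

    def cached_languages(self):
        return list(self._models)
```

An `OrderedDict` keeps the bookkeeping explicit; `functools.lru_cache` on the loader function would be a shorter alternative when no inspection of the cache contents is needed.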

Stack

Next.js 16, React 19, TypeScript, Tailwind CSS v4, ReactFlow, Zustand, TanStack Query, Framer Motion, FastAPI, Python 3.11, Stanza, Pydantic v2, Supabase, Docker, Nginx, GitHub Actions, Vercel, DigitalOcean, Remotion

Global Terrorism Data Visualization Dashboard

Interactive 3D globe and analytics dashboard transforming 177,000+ records from the Global Terrorism Database (1970–2017).

Renders terrorism incidents as geospatial points on an interactive 3D globe with hover tooltips showing location, attack type, and casualty details. Provides seven analytics views — Overview, Trends, Regions, Attack Types, Targets, Weapons, and Hotspots — each with detailed statistical breakdowns. Backend processes and serves the full 177K-record dataset through a REST API with data cleaning, NaN handling, type conversion, and performance-conscious sampling (5,000 points for smooth globe rendering).

Technical Highlights

3D Globe: Globe.gl for WebGL-based earth rendering with realistic textures, night-time lighting, and geospatial incident point plotting.
Data Pipeline: Cleans raw GTD data (handling missing coordinates, NaN imputation, and type coercion for JSON serialization) and serves it through 8 analytical aggregation endpoints.
Performance: Data sampling strategies balancing visualization density against rendering performance, lazy loading for on-demand data fetching, and backend caching.
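
The cleaning and sampling steps above can be sketched in a few lines. This version uses plain Python dictionaries rather than the project's Pandas code, and the function names, the `latitude`/`longitude` keys, and the fixed random seed are illustrative assumptions. It drops records without coordinates, replaces NaN with `None` so the output is valid JSON, and caps the result at 5,000 points.

```python
import json
import math
import random


def clean_record(record):
    """Replace float NaN values with None so the record serializes as strict JSON."""
    return {
        key: (None if isinstance(value, float) and math.isnan(value) else value)
        for key, value in record.items()
    }


def prepare_globe_points(records, limit=5000, seed=0):
    """Clean records, drop those without coordinates, and sample down to `limit`."""
    cleaned = [clean_record(r) for r in records]
    located = [
        r for r in cleaned
        if r.get("latitude") is not None and r.get("longitude") is not None
    ]
    if len(located) <= limit:
        return located
    # A seeded generator keeps the sampled subset stable between requests.
    return random.Random(seed).sample(located, limit)


def to_json(points):
    # allow_nan=False raises instead of emitting the non-standard NaN token.
    return json.dumps(points, allow_nan=False)
```

Uniform random sampling is the simplest choice; a density-aware or casualty-weighted sample would be a reasonable refinement if sparse regions need to stay visible.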

Stack

React, Globe.gl (WebGL), Tailwind CSS, Axios, Vite, FastAPI, Python, Pandas, Docker, Nginx

Teen Phone Addiction Prediction Dashboard

Full-stack ML project predicting teen phone addiction levels from lifestyle and behavioral survey data.

Trains a RandomForestRegressor to predict an Addiction_Level score (0–10 continuous) from survey features, tracked with a comprehensive metric suite (MSE, RMSE, R², MAE, MAPE, Max Error, Median AE). Serves predictions through a FastAPI /predict endpoint that automatically loads the latest registered model from MLflow. Provides a Streamlit dashboard for exploratory data analysis, feature importance visualization, and an interactive prediction playground.
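
For reference, the metric suite listed above reduces to a handful of formulas over the prediction errors. The sketch below computes them with the standard library only; the project itself would more likely call scikit-learn's metric functions, and `regression_metrics` is a hypothetical name. MAPE is returned as a fraction rather than a percentage, and because Addiction_Level can be 0, the denominator is guarded with a small epsilon, which makes MAPE very large for zero targets.

```python
import math
from statistics import mean, median


def regression_metrics(y_true, y_pred, eps=1e-12):
    """Compute MSE, RMSE, MAE, R^2, MAPE, max error, and median absolute error."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]

    mse = mean([e * e for e in errors])
    target_mean = mean(y_true)
    ss_res = sum(e * e for e in errors)                    # residual sum of squares
    ss_tot = sum((t - target_mean) ** 2 for t in y_true)   # total sum of squares

    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mean(abs_errors),
        # R^2 is undefined when every target is identical.
        "r2": 1 - ss_res / ss_tot if ss_tot else float("nan"),
        "mape": mean([a / max(abs(t), eps) for a, t in zip(abs_errors, y_true)]),
        "max_error": max(abs_errors),
        "median_ae": median(abs_errors),
    }
```

Each value in the returned dictionary maps naturally onto one logged metric per training run.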

Technical Highlights

ML Pipeline: scikit-learn RandomForestRegressor with MLflow experiment tracking; every training run logs parameters, metrics, and the serialized model artifact for reproducible experiments and automatic model versioning.
Prediction API: FastAPI backend with Pydantic schema validation; dynamically loads the latest MLflow-registered model at startup so deployments always serve the most recent trained version.
Dashboard: Streamlit app with data filtering/EDA, feature importance charts (Seaborn/Matplotlib), and a prediction playground that calls the FastAPI backend in real time.

Stack

scikit-learn, MLflow, FastAPI, Pydantic, Streamlit, Pandas, Seaborn, Matplotlib, Docker

Skin Cancer Classification CNN

Deep learning application for binary classification of skin lesions as benign or malignant, built on the HAM10000 dermatoscopy dataset (~10,000 images).

Implemented in both TensorFlow/Keras and PyTorch, with MLflow experiment tracking, CoreML conversion for iOS/macOS deployment, and Docker containerization. Maps seven diagnostic types into a binary label scheme with class-imbalance downsampling and ImageNet normalization.

Technical Highlights

PyTorch CNN: Three convolutional blocks (3→16→32→64 channels, 3×3 kernels) with a fully connected layer (64×28×28 → 256 units), ReLU activations, and sigmoid output. Binary Cross-Entropy loss, Adam optimizer (lr=0.001).
TensorFlow/Keras CNN: Sequential model with three Conv2D blocks (32 filters), Dropout(0.5) for regularization, ReduceLROnPlateau callback (factor=0.2, patience=5) for adaptive learning rate scheduling.
MLOps: MLflow experiment tracking logging hyperparameters, per-epoch metrics, and model artifacts. Model checkpointing saves the best model based on validation accuracy.
Deployment: CoreML conversion with coremltools, from a torch.jit.trace TorchScript export (PyTorch) and from the Keras model (TF), for on-device inference on iOS 13+ and macOS. GPU-accelerated training with automatic CUDA device detection.
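
The 64×28×28 input to the PyTorch model's fully connected layer follows from simple shape arithmetic. The sketch below assumes details the description does not state: a 224×224 RGB input, padding of 1 on each 3×3 convolution (so the convolution preserves spatial size), and a 2×2 max-pool after every block. Under those assumptions the feature map halves three times, 224 → 112 → 56 → 28.

```python
def conv2d_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial output size of a non-overlapping max-pool along one dimension."""
    return size // window


def trace_shapes(input_size=224, channels=(16, 32, 64)):
    """Return the (channels, height, width) shape after each conv + pool block."""
    shapes, size = [], input_size
    for out_channels in channels:
        size = pool_out(conv2d_out(size))
        shapes.append((out_channels, size, size))
    return shapes


shapes = trace_shapes()
flat_features = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]

# Weights plus biases of the 256-unit fully connected layer.
fc_parameters = flat_features * 256 + 256
```

With these assumptions `flat_features` is 50,176 and the fully connected layer alone holds about 12.8 million parameters, far more than the three convolutional blocks combined.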

Stack

PyTorch, TensorFlow / Keras, scikit-learn, Pandas, NumPy, Pillow, MLflow, CoreML Tools, Docker, CUDA

procmon

Linux host telemetry and process-monitoring tool written in C++17 with ncurses. Real-time terminal dashboard powered directly by /proc, with process-level CPU and memory analytics, fast filtering, and suspicious-process tagging for defensive workflows.

Parses live telemetry from Linux kernel-exposed interfaces (/proc/stat, /proc/meminfo, /proc/<pid>/*) and renders a responsive low-level TUI with no heavy framework. Provides real-time host metrics (total CPU and memory utilization) alongside a full process table showing PID, state, CPU %, memory %, suspicious tag, and command. Interactive keyboard controls for sorting (by CPU, memory, or PID), live substring filtering on PID/command, and immediate non-blocking input.

Technical Highlights

Process Telemetry: Collects per-process data from /proc/<pid>/stat, status, cmdline, and comm. Computes CPU % normalized against elapsed process lifetime and logical CPU count, and memory % as RSS relative to host MemTotal.
Defensive Tagging: Heuristic-based suspicious-process flags: TMP_EXEC for commands launched from temp/shared-memory paths, LOLBIN for living-off-the-land patterns, SPIKE for very high CPU consumers.
System Metrics: Enumerates numeric process directories under /proc, computes host CPU % using delta sampling of /proc/stat, and derives host memory % from /proc/meminfo.
ncurses Dashboard: High-frequency terminal UI with interactive sort/filter operations and non-blocking keyboard input. Designed for low overhead and rapid scanning aligned with SOC/IR-style endpoint analysis.
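
The two CPU calculations described above can be sketched as follows. This is Python for brevity, not procmon's C++ code, and the function names are hypothetical. The functions take the raw text of /proc/stat and /proc/<pid>/stat as arguments so they can run without a live /proc. The clock tick rate is passed in and defaults to 100, a common value that a real tool would query from the system.

```python
def parse_cpu_line(line):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = [int(x) for x in line.split()[1:9]]  # user through steal
    idle = fields[3] + fields[4]                  # idle + iowait
    return idle, sum(fields)


def host_cpu_percent(prev_line, curr_line):
    """Host CPU % between two samples of the aggregate 'cpu' line."""
    idle0, total0 = parse_cpu_line(prev_line)
    idle1, total1 = parse_cpu_line(curr_line)
    d_total = total1 - total0
    if d_total <= 0:
        return 0.0
    return 100.0 * (1.0 - (idle1 - idle0) / d_total)


def process_cpu_percent(stat_line, uptime_s, ncpus, clk_tck=100):
    """Lifetime-average CPU % of one process, normalized by logical CPU count."""
    # The command name is wrapped in parentheses and may itself contain spaces
    # or ')', so split on the last ')' before indexing the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    utime, stime = int(rest[11]), int(rest[12])   # stat fields 14 and 15
    starttime = int(rest[19])                     # stat field 22, in clock ticks
    elapsed = uptime_s - starttime / clk_tck
    if elapsed <= 0:
        return 0.0
    return 100.0 * ((utime + stime) / clk_tck) / elapsed / ncpus
```

Delta sampling gives the host figure for the most recent interval, while the per-process figure is an average over the process's whole lifetime, so a long-idle process that suddenly spikes will show a low percentage at first.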

Stack

C++17, ncurses, Linux /proc, Make