03 / BUILDS

WORK

The things I've built — from SaaS platforms and NLP tools to systems programming, data visualization, and deep learning.

◆

Wheelbase

wheelbase.io

Multi-tenant dealership management SaaS platform encompassing auction management, vehicle inventory tracking, reconditioning workflows, AI-powered operations assistance, collaborative document management, and programmatic marketing video generation. Built as a Turborepo monorepo with Bun workspaces containing 5 applications and 4 shared packages.

Full vehicle lifecycle management: VIN decoding → inventory intake → customizable status pipeline (Kanban + list views) → reconditioning workflow (stage definitions, holds, parts orders, work orders, inspections) → auction scheduling with runlist CSV import and per-car assessment wizards. AI assistant that converts natural language into tenant-scoped SQL queries (SELECT-only), performs risk-assessed write operations (high-risk ops require explicit approval tokens), and personalizes responses via context documents (USER.md, DEALERSHIP.md, TEAM.md). Real-time collaborative document editor (Vault app) with TipTap, Yjs + Supabase Realtime for multiplayer editing, version history, folder hierarchy, and template gallery. Multi-tenant architecture with Row-Level Security — all queries scoped by tenant_id, with role-based access control (admin, manager, tech, detailer, read_only) and per-tenant customization of status pipelines, recon stages, and demand categories.
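The approval-token gate for high-risk writes could look roughly like the sketch below. It is written in Python for brevity and is not Wheelbase's actual code: the operation names, the HMAC-based token format, and both function names are assumptions made for illustration.

```python
import hashlib
import hmac

# Hypothetical operation names; the real risk classifier is not described here.
HIGH_RISK_OPS = {"delete_vehicle", "bulk_update_status"}


def issue_approval_token(secret: bytes, tenant_id: str, op: str, payload_digest: str) -> str:
    """Sign one (tenant, operation, payload) triple so the token cannot be reused elsewhere."""
    message = f"{tenant_id}:{op}:{payload_digest}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def authorize_write(secret: bytes, tenant_id: str, op: str, payload_digest: str, token=None) -> bool:
    """Low-risk ops pass straight through; high-risk ops need a matching approval token."""
    if op not in HIGH_RISK_OPS:
        return True
    if token is None:
        return False
    expected = issue_approval_token(secret, tenant_id, op, payload_digest)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, token)
```

Binding the token to the tenant and the payload digest means an approval granted for one operation cannot be replayed against another tenant or a different payload.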

Technical Highlights

→Monorepo — Turborepo + Bun workspaces orchestrating 5 apps and 4 shared packages (@wheelbase/ui, /utils, /types, /config) with subpath exports for tree-shakeable type imports.

→Frontend (Main App) — Next.js 16 App Router, React 19, TypeScript, tRPC for end-to-end type-safe API layer (20+ routers, 150+ procedures), TanStack Query for server state with optimistic updates, Zustand for global state, Shadcn/ui component library.

→Go VIN Decoder — Gin HTTP framework with a self-contained VIN decoder backed by a ~2GB local SQLite database (NHTSA data, ~1.6M pattern rows, ~8.7M valid-character rows). Decoding pipeline: extract model year from position 10 with 30-year cycle logic, WMI lookup for manufacturer/make, multi-pass VDS pattern matching (positions 4–8) to decode body style, engine, drive type, and model.

→Go Runlist Upload — Streaming CSV processing with constant memory usage, dynamic column-to-field mapping via ImportFlow configurations fetched from Supabase, client-side UUID generation for batching, 500-record batch inserts with atomic-like guarantees.

→AI System — Natural language → DSL query builder with tenant-scoped execution, risk classification for write operations (low/high), approval token workflow for destructive actions, context document versioning with draft/publish, streaming chat via OpenRouter proxy.

→Real-time Collaboration (Vault) — TipTap editor with extensions (code blocks, tables, task lists, KaTeX math, Mermaid diagrams), Yjs CRDT for conflict-free concurrent editing, Supabase Realtime as transport layer, 2-second debounced auto-save, version history with restore.

→Multi-Tenancy & Security — PostgreSQL Row-Level Security on all tables, tenant isolation via tenant_id scoping, role-based access control, tenant/dealership switching via cookies + API routes, custom feature preferences per tenant.

→Infrastructure — Docker Compose for multi-service deployment (frontend, landing, backend), multi-stage Go build with CGO for SQLite, MinIO object storage for the VIN database (~2GB), Remotion for programmatic video generation.
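The 30-year model-year cycle used by the VIN decoder can be illustrated with a short sketch. The real service is written in Go; this Python version only shows the arithmetic, and the function name and `max_year` parameter are hypothetical.

```python
# Position-10 codes in cycle order: 21 letters (I, O, Q, U, Z are skipped), then 1-9.
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"
CYCLE_START = 1980              # 'A' first maps to model year 1980
CYCLE_LENGTH = len(YEAR_CODES)  # 30


def model_year_candidates(vin: str, max_year: int) -> list:
    """Return every model year the position-10 code could mean, up to max_year."""
    if len(vin) != 17:
        raise ValueError("VIN must be 17 characters")
    code = vin[9].upper()  # position 10 is index 9
    offset = YEAR_CODES.find(code)
    if offset < 0:
        raise ValueError(f"invalid model-year code: {code!r}")
    year = CYCLE_START + offset
    candidates = []
    while year <= max_year:
        candidates.append(year)
        year += CYCLE_LENGTH
    return candidates
```

A code such as `A` is ambiguous (1980 or 2010), so a decoder needs a tie-breaker. One commonly cited convention for light vehicles is to check whether position 7 is numeric or alphabetic, but how Wheelbase resolves it is not stated here.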

Stack

Next.js 16, React 19, TypeScript, tRPC, TanStack Query, Zustand, Shadcn/ui, Tailwind CSS, Vite, TanStack Router, Hono, TipTap, Yjs, Go 1.25, Gin, SQLite, Supabase, MinIO, Turborepo, Bun, Docker, Remotion 4.0

◆

Grammario

DEEP DIVE → grammario.ai

A production-ready, full-stack linguistic analysis platform that helps language learners understand grammar through interactive visualizations and AI-powered explanations. Deployed and publicly accessible at grammario.ai.

Users enter a sentence in one of 5 supported languages (Italian, Spanish, German, Russian, Turkish) and receive deep grammatical analysis: tokenization, lemmatization, POS tagging, morphological analysis, and dependency parsing, all visualized as interactive syntax trees and linear dependency graphs. LLM-generated pedagogical insights explain why grammar works the way it does, providing rules, examples, and cultural nuance. A gamification layer (streaks, XP/levels, achievements, daily goals) and a spaced-repetition vocabulary system (SM-2 algorithm) drive engagement.
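For readers unfamiliar with SM-2, a single review step can be sketched as follows. This follows the published SM-2 description (quality grades 0-5, ease factor floored at 1.3). The `CardState` class and `sm2_review` function are illustrative names, not Grammario's code, and implementations differ on whether the interval uses the ease factor from before or after the update. This sketch uses the value from before.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0      # consecutive successful reviews
    interval_days: int = 0    # days until the next review
    ease_factor: float = 2.5  # SM-2 starting ease


def sm2_review(state: CardState, quality: int) -> CardState:
    """Apply one review with quality 0-5 and return the updated state."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")

    if quality >= 3:
        if state.repetitions == 0:
            interval = 1
        elif state.repetitions == 1:
            interval = 6
        else:
            interval = round(state.interval_days * state.ease_factor)
        repetitions = state.repetitions + 1
    else:
        # Failed recall: restart the repetition sequence.
        repetitions = 0
        interval = 1

    # Ease factor update from the SM-2 description, floored at 1.3.
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, state.ease_factor + delta)
    return CardState(repetitions, interval, ease)
```

Three perfect reviews in a row schedule the word after 1, 6, and then 16 days, while a grade below 3 resets the interval to one day and lowers the ease factor.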

Technical Highlights

→Frontend — Next.js 16 (App Router, SSR, API routes), React 19, TypeScript, Tailwind CSS v4, ReactFlow for interactive node-graph visualizations, Dagre for automated tree layout, Zustand for client state with localStorage persistence, TanStack Query for server state and caching, Framer Motion for animations.

→Backend — FastAPI (async Python 3.11), Stanford NLP (Stanza) pipelines for linguistic analysis with LRU-based model caching (up to 5 language models in memory with eviction), Pydantic v2 for schema validation.

→NLP Strategy Pattern — Language-family-specific processing: RomanceStrategy (clitics, multi-word token expansion), InflectionStrategy (case government, verbal aspect), AgglutinativeStrategy (Turkish morpheme segmentation including vowel harmony, consonant softening, buffer consonants).

→LLM Integration — Dual-provider setup (OpenRouter primary, OpenAI fallback) with response caching (100-entry manual cache), JSON-mode parsing, and language-specific prompt engineering for pedagogical output.

→Auth & Database — Supabase Auth (email/password + Google OAuth, PKCE flow), PostgreSQL with Row-Level Security policies for complete user data isolation, auto-triggers for profile creation and timestamp management.

→Infrastructure — Dockerized (multi-stage production builds, non-root user), Docker Compose orchestration, Nginx reverse proxy with rate limiting (10 req/s), SSL via Let's Encrypt/Certbot, GitHub Actions CI/CD pipeline (test → build → push to GHCR → SSH deploy to DigitalOcean), frontend on Vercel.
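The LRU-based model caching mentioned in the backend highlight can be sketched with an `OrderedDict`. This is a minimal illustration rather than the production code: `ModelCache` and its `loader` callback are hypothetical, and in the real backend the loader would presumably build a Stanza pipeline for the requested language.

```python
from collections import OrderedDict
from typing import Callable


class ModelCache:
    """Keep at most `capacity` loaded models, evicting the least recently used."""

    def __init__(self, loader: Callable[[str], object], capacity: int = 5) -> None:
        self._loader = loader
        self._capacity = capacity
        self._models = OrderedDict()

    def get(self, lang: str) -> object:
        if lang in self._models:
            self._models.move_to_end(lang)  # mark as most recently used
            return self._models[lang]
        model = self._loader(lang)  # expensive: loads a pipeline into memory
        self._models[lang] = model
        if len(self._models) > self._capacity:
            self._models.popitem(last=False)  # drop the least recently used
        return model

    def cached_languages(self) -> list:
        """Languages currently held, ordered from least to most recently used."""
        return list(self._models)
```

With five supported languages and a capacity of five, eviction only starts to matter if more languages are added or the capacity is lowered to fit a smaller server.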

Stack

Next.js 16, React 19, TypeScript, Tailwind CSS v4, ReactFlow, Zustand, TanStack Query, Framer Motion, Radix UI, Axios, FastAPI, Python 3.11, Stanza, OpenAI SDK, Pydantic v2, Uvicorn, Supabase, JWT, Docker, Docker Compose, Nginx, Let's Encrypt, GitHub Actions, Vercel, DigitalOcean, Remotion

◆

Global Terrorism Data Visualization

Source

A full-stack data visualization application that transforms 177,000+ records from the Global Terrorism Database (GTD, START consortium) into an interactive 3D globe interface with a comprehensive analytics dashboard.

Renders terrorism incidents (1970–2017) as geospatial points on an interactive 3D globe with hover tooltips showing location, attack type, and casualty details. Provides seven analytics views — Overview, Trends, Regions, Attack Types, Targets, Weapons, and Hotspots — each with detailed statistical breakdowns. Backend processes and serves the full 177K-record dataset through a REST API with data cleaning, NaN handling, type conversion, and performance-conscious sampling.

Technical Highlights

→Frontend — React, Globe.gl for WebGL-based 3D earth rendering with realistic textures and night-time lighting, Tailwind CSS, Axios for API communication, Vite as build tool.

→Backend — FastAPI with Pandas for data manipulation and analysis across 177K+ records, Uvicorn ASGI server, CORS middleware, RESTful API design with 8 endpoints.

→Data Engineering — Built a data pipeline that cleans raw GTD data — handling missing coordinates, NaN imputation, type coercion for JSON serialization — and serves it through analytical aggregation endpoints.

→Performance — Implemented data sampling strategies to balance visualization density against rendering performance, lazy loading for on-demand data fetching, and backend caching.

→Infrastructure — Dockerized with Docker Compose (backend + frontend + Nginx reverse proxy).
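The cleaning and sampling steps described in the highlights can be sketched without Pandas. The project itself uses Pandas, so this stdlib version is a stand-in that only shows the logic. The field names (`latitude`, `longitude`, `nkill`) and both function names are assumptions.

```python
import math
import random
from typing import Any, Optional


def _is_nan(value: Any) -> bool:
    return isinstance(value, float) and math.isnan(value)


def clean_record(row: dict) -> Optional[dict]:
    """Drop rows without coordinates; turn remaining NaN values into None (JSON null)."""
    for key in ("latitude", "longitude"):
        value = row.get(key)
        if value is None or _is_nan(value):
            return None
    return {key: (None if _is_nan(value) else value) for key, value in row.items()}


def sample_points(rows: list, limit: int, seed: int = 0) -> list:
    """Return at most `limit` cleaned rows, sampled reproducibly."""
    cleaned = [c for c in map(clean_record, rows) if c is not None]
    if len(cleaned) <= limit:
        return cleaned
    return random.Random(seed).sample(cleaned, limit)
```

Replacing NaN with `None` matters because NaN is not valid JSON, and a fixed seed keeps the globe showing the same subset of points between requests.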

Stack

React, Globe.gl (WebGL), Tailwind CSS, Axios, Vite, FastAPI, Python, Pandas, Uvicorn, Docker, Docker Compose, Nginx

◆

Teen Phone Addiction Prediction

Source Demo

A full-stack machine learning project that predicts teen phone addiction levels from lifestyle and behavioral survey data, combining model training with experiment tracking, a real-time prediction API, and an interactive data exploration dashboard.

Trains a RandomForestRegressor to predict an Addiction_Level score (0–10 continuous) from survey features, tracked with a comprehensive metric suite (MSE, RMSE, R², MAE, MAPE, Max Error, Median AE). Serves predictions through a FastAPI /predict endpoint that automatically loads the latest registered model from MLflow. Provides a Streamlit dashboard for exploratory data analysis, feature importance visualization, and an interactive prediction playground.
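For reference, the metric suite reduces to a few lines of arithmetic. The sketch below uses the standard textbook definitions in plain Python. The project presumably relies on scikit-learn's implementations, which may handle edge cases such as zero targets in MAPE differently.

```python
import math
from statistics import mean, median


def regression_metrics(y_true: list, y_pred: list) -> dict:
    """Compute the metric suite with the standard textbook definitions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]
    mse = mean(e * e for e in abs_err)
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in abs_err)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mean(abs_err),
        "r2": 1 - ss_res / ss_tot,  # undefined when y_true is constant
        # MAPE divides by the target, so rows with a true score of 0 are skipped here.
        "mape": mean(e / abs(t) for e, t in zip(abs_err, y_true) if t != 0),
        "max_error": max(abs_err),
        "median_ae": median(abs_err),
    }
```

Because the target runs from 0 to 10, true scores of exactly 0 are possible, which is why MAPE needs the explicit guard and is best read alongside MAE rather than on its own.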

Technical Highlights

→ML Pipeline — scikit-learn RandomForestRegressor with MLflow experiment tracking: every training run logs parameters, metrics, and the serialized model artifact, enabling reproducible experiments and automatic model versioning.

→Prediction API — FastAPI backend with Pydantic schema validation; dynamically loads the latest MLflow-registered model at startup so deployments always serve the most recent trained version without manual model path updates.

→Dashboard — Streamlit app with data filtering/EDA, feature importance charts (Seaborn/Matplotlib), and a prediction playground that calls the FastAPI backend in real-time.

→Infrastructure — Dockerized for reproducible deployment.

Stack

scikit-learn, MLflow, FastAPI, Pydantic, Uvicorn, Streamlit, Pandas, Seaborn, Matplotlib, Docker

◆

Skin Cancer Classification CNN

Source

A deep learning application for binary classification of skin lesions as benign or malignant, built on the HAM10000 dermatoscopy dataset (~10,000 images). Includes dual-framework implementations in TensorFlow/Keras and PyTorch, MLflow experiment tracking, and CoreML conversion for iOS/macOS deployment.

Trains convolutional neural networks to classify dermatoscopic skin lesion images into benign or malignant categories, mapping seven diagnostic types into a binary label scheme. Provides multiple training scripts with CLI configuration for epoch count and optional CoreML model conversion for iOS/macOS deployment. Tracks experiments with MLflow, logging hyperparameters, per-epoch metrics, and model artifacts for reproducibility.
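Collapsing the seven HAM10000 diagnosis codes into a binary label, and then balancing the classes by downsampling, might look like the sketch below. The seven codes are the dataset's own. Which of them count as malignant is an assumption of this sketch, since the project's mapping is not stated here, and the function names are hypothetical.

```python
import random
from collections import defaultdict

# Assumed grouping of the seven HAM10000 `dx` codes; the project's actual
# mapping may differ (akiec in particular is sometimes treated as pre-malignant).
MALIGNANT = {"mel", "bcc", "akiec"}
BENIGN = {"nv", "bkl", "df", "vasc"}


def to_binary_label(dx: str) -> int:
    """Map a diagnosis code to 1 (malignant) or 0 (benign)."""
    if dx in MALIGNANT:
        return 1
    if dx in BENIGN:
        return 0
    raise ValueError(f"unknown diagnosis code: {dx!r}")


def downsample(samples: list, seed: int = 0) -> list:
    """Randomly trim every class to the size of the smallest one.

    `samples` is a list of (image_id, dx) pairs; returns (image_id, label) pairs.
    """
    by_label = defaultdict(list)
    for image_id, dx in samples:
        by_label[to_binary_label(dx)].append(image_id)
    target = min(len(ids) for ids in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label, ids in sorted(by_label.items()):
        balanced.extend((image_id, label) for image_id in rng.sample(ids, target))
    return balanced
```

Downsampling discards data from the majority class, which is simple but costly on a dataset of this size. Class weighting is a common alternative when every image matters.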

Technical Highlights

→Data Pipeline — Ingests the HAM10000 dataset (~10,000 images) across two directories, joined with a CSV metadata file via Pandas merge on image_id. Addresses class imbalance through downsampling. Images resized (224×224 for PyTorch, 128×128 for TensorFlow), normalized using ImageNet statistics.

→PyTorch CNN — Three convolutional blocks (Conv2d → ReLU → MaxPool2d) with progressively deeper filters (3→16→32→64 channels), followed by a fully connected layer with ReLU activation and a single sigmoid output neuron. Trained with Binary Cross-Entropy loss and Adam optimizer.

→TensorFlow/Keras CNN — Sequential model with three Conv2D + ReLU + MaxPooling2D blocks, followed by Flatten, Dense(64) with ReLU, Dropout(0.5) for regularization, and a single sigmoid output unit. Includes ReduceLROnPlateau callback for adaptive learning rate scheduling.

→MLOps — MLflow experiment tracking logs hyperparameters, per-epoch metrics, and model artifacts. Model checkpointing saves the best model based on validation accuracy.

→Deployment — CoreML conversion via coremltools for on-device inference on iOS 13+ and macOS. GPU-accelerated training with automatic CUDA device detection. CLI interface via argparse.
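The size of the fully connected layer's input follows from the input resolution and the three Conv → ReLU → MaxPool blocks. The arithmetic below assumes 3×3 kernels with padding 1 and 2×2 pooling, none of which is stated above, so treat the resulting numbers as an example rather than the model's actual dimensions.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(side: int, channels: list, kernel: int = 3,
                       padding: int = 1, pool: int = 2) -> int:
    """Trace a square image through Conv -> ReLU -> MaxPool blocks; return the flatten size."""
    for _ in channels:
        side = conv_out(side, kernel, padding=padding)  # convolution
        side = conv_out(side, pool, stride=pool)        # max pooling
    return channels[-1] * side * side
```

Under these assumptions a 224×224 input shrinks to 28×28 after three blocks, so `flattened_features(224, [16, 32, 64])` gives 64 × 28 × 28 = 50,176 inputs to the fully connected layer. Without padding the same network would produce 43,264.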

Stack

PyTorch, TensorFlow / Keras, scikit-learn, Pandas, NumPy, Pillow, MLflow, CoreML Tools, Docker, CUDA

◆

procmon

Source

Linux host telemetry and process-monitoring tool written in C++17 with ncurses. Real-time terminal dashboard powered directly by /proc, with process-level CPU and memory analytics, fast filtering, and suspicious-process tagging for defensive workflows.

Parses live telemetry from Linux kernel-exposed interfaces (/proc/stat, /proc/meminfo, /proc/<pid>/*) and renders a responsive low-level TUI with no heavy framework. Provides real-time host metrics alongside a full process table showing PID, state, CPU %, memory %, suspicious tag, and command. Interactive keyboard controls for sorting, live substring filtering, and immediate non-blocking input.
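Host CPU usage from /proc/stat is a difference between two samples, not a single reading. The tool itself is C++17, and the Python sketch below only illustrates the calculation. Treating iowait as idle time is one common convention and an assumption here, as are the function names.

```python
def parse_cpu_line(line: str) -> tuple:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # user nice system idle iowait irq softirq steal
    # (guest time is already counted inside user/nice, so it is not added again).
    ticks = [int(v) for v in fields[1:9]]
    idle = ticks[3] + (ticks[4] if len(ticks) > 4 else 0)
    return idle, sum(ticks)


def cpu_percent(prev_line: str, curr_line: str) -> float:
    """Host CPU utilisation between two samples, as a percentage."""
    prev_idle, prev_total = parse_cpu_line(prev_line)
    curr_idle, curr_total = parse_cpu_line(curr_line)
    total_delta = curr_total - prev_total
    if total_delta <= 0:
        return 0.0
    busy_delta = total_delta - (curr_idle - prev_idle)
    return 100.0 * busy_delta / total_delta
```

In practice the two lines come from reading the first line of /proc/stat twice, one refresh interval apart. The counters are cumulative since boot, so a single sample only yields the average since boot.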

Technical Highlights

→Process Telemetry — Collects per-process data from /proc/<pid>/stat, status, cmdline, and comm. Computes CPU % normalized against elapsed process lifetime and logical CPU count, and memory % as RSS relative to host MemTotal.

→Defensive Tagging — Heuristic-based suspicious process flags: TMP_EXEC for commands launched from temp/shared-memory paths, LOLBIN for living-off-the-land patterns, SPIKE for very high CPU consumers.

→System Metrics — Enumerates numeric process directories under /proc, computes host CPU % using delta sampling of /proc/stat, and derives host memory % from /proc/meminfo.

→ncurses Dashboard — High-frequency terminal UI with interactive sort/filter operations and non-blocking keyboard input. Designed for low overhead and rapid scanning aligned with SOC/IR-style endpoint analysis.
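The three tags can be expressed as simple predicates over a process's command line and CPU usage. The sketch below is illustrative only and is in Python, not the tool's C++. The path prefixes, the LOLBIN substrings, and the 90% threshold are all assumptions, not procmon's actual rules.

```python
TMP_PREFIXES = ("/tmp/", "/var/tmp/", "/dev/shm/")
# Hypothetical living-off-the-land substrings; a real rule set would be broader.
LOLBIN_PATTERNS = ("bash -i", "nc -e", "base64 -d", "| sh")
SPIKE_THRESHOLD = 90.0  # percent; an assumed cut-off for "very high" CPU


def tag_process(cmdline: str, cpu_percent: float) -> list:
    """Return the heuristic tags that apply to one process."""
    tags = []
    executable = cmdline.split(" ", 1)[0] if cmdline else ""
    if executable.startswith(TMP_PREFIXES):
        tags.append("TMP_EXEC")
    if any(pattern in cmdline for pattern in LOLBIN_PATTERNS):
        tags.append("LOLBIN")
    if cpu_percent >= SPIKE_THRESHOLD:
        tags.append("SPIKE")
    return tags
```

Substring heuristics like these are cheap enough to run on every refresh but produce false positives, so they work best as triage hints for an analyst, not as verdicts.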

Stack

C++17, ncurses, Linux /proc, Make

END OF INDEX