INDEX / 06 ENTRIES

03 / BUILDS

WORK

The things I've built — from SaaS platforms and NLP tools to systems programming, data visualization, and deep learning.

01

Wheelbase

Multi-tenant dealership management SaaS platform encompassing auction management, vehicle inventory tracking, reconditioning workflows, AI-powered operations assistance, collaborative document management, and programmatic marketing video generation. Built as a Turborepo monorepo with Bun workspaces containing 5 applications and 4 shared packages.

Full vehicle lifecycle management: VIN decoding → inventory intake → customizable status pipeline (Kanban + list views) → reconditioning workflow (stage definitions, holds, parts orders, work orders, inspections) → auction scheduling with runlist CSV import and per-car assessment wizards. AI assistant that converts natural language into tenant-scoped SQL queries (SELECT-only), performs risk-assessed write operations (high-risk ops require explicit approval tokens), and personalizes responses via context documents (USER.md, DEALERSHIP.md, TEAM.md). Real-time collaborative document editor (Vault app) with TipTap, Yjs + Supabase Realtime for multiplayer editing, version history, folder hierarchy, and template gallery. Multi-tenant architecture with Row-Level Security — all queries scoped by tenant_id, with role-based access control (admin, manager, tech, detailer, read_only) and per-tenant customization of status pipelines, recon stages, and demand categories.
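
A minimal sketch of the approval-token gate for high-risk writes, written in Python for brevity rather than the project's TypeScript. Every name here (HIGH_RISK_OPS, issue_approval_token, authorize) and the HMAC-based token format are assumptions made for illustration; the actual risk rules and token scheme are not described in this index.

```python
import hashlib
import hmac
from typing import Optional

SECRET = b"demo-secret"                    # assumption: a server-side signing key
HIGH_RISK_OPS = {"delete", "bulk_update"}  # assumption: which operations count as high risk


def classify_risk(operation: str) -> str:
    """Return "high" for destructive operations and "low" for everything else."""
    return "high" if operation in HIGH_RISK_OPS else "low"


def issue_approval_token(tenant_id: str, operation: str) -> str:
    """Sign (tenant, operation) so an approval cannot be replayed elsewhere."""
    message = f"{tenant_id}:{operation}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def authorize(tenant_id: str, operation: str, token: Optional[str] = None) -> bool:
    """Low-risk operations pass; high-risk ones need a matching approval token."""
    if classify_risk(operation) == "low":
        return True
    if token is None:
        return False
    expected = issue_approval_token(tenant_id, operation)
    return hmac.compare_digest(expected, token)
```

Binding the token to both the tenant and the operation means an approval granted for one dealership's delete cannot authorize the same action for another tenant.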

Technical Highlights

Monorepo: Turborepo + Bun workspaces orchestrating 5 apps and 4 shared packages (@wheelbase/ui, /utils, /types, /config) with subpath exports for tree-shakeable type imports.
Frontend (Main App): Next.js 16 App Router, React 19, TypeScript, tRPC for end-to-end type-safe API layer (20+ routers, 150+ procedures), TanStack Query for server state with optimistic updates, Zustand for global state, Shadcn/ui component library.
Go VIN Decoder: Gin HTTP framework with a self-contained VIN decoder backed by a ~2GB local SQLite database (NHTSA data, ~1.6M pattern rows, ~8.7M valid-character rows). Decoding pipeline: extract model year from position 10 with 30-year cycle logic, WMI lookup for manufacturer/make, multi-pass VDS pattern matching (positions 4–8) to decode body style, engine, drive type, and model.
Go Runlist Upload: Streaming CSV processing with constant memory usage, dynamic column-to-field mapping via ImportFlow configurations fetched from Supabase, client-side UUID generation for batching, 500-record batch inserts with atomic-like guarantees.
AI System: Natural language → DSL query builder with tenant-scoped execution, risk classification for write operations (low/high), approval token workflow for destructive actions, context document versioning with draft/publish, streaming chat via OpenRouter proxy.
Real-time Collaboration (Vault): TipTap editor with extensions (code blocks, tables, task lists, KaTeX math, Mermaid diagrams), Yjs CRDT for conflict-free concurrent editing, Supabase Realtime as transport layer, 2-second debounced auto-save, version history with restore.
Multi-Tenancy & Security: PostgreSQL Row-Level Security on all tables, tenant isolation via tenant_id scoping, role-based access control, tenant/dealership switching via cookies + API routes, custom feature preferences per tenant.
Infrastructure: Docker Compose for multi-service deployment (frontend, landing, backend), multi-stage Go build with CGO for SQLite, MinIO object storage for the VIN database (~2GB), Remotion for programmatic video generation.
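
The model-year step of the VIN pipeline can be sketched as below. This is an illustration in Python, not the Go service itself. The 30-character year alphabet is the standard one for VIN position 10; the tie-break (pick the latest cycle that does not exceed a caller-supplied latest_year) is an assumption, and the real decoder may resolve the ambiguity differently.

```python
# Year codes for VIN position 10: 21 letters (no I, O, Q, U, Z) then digits 1-9.
# The sequence starts at 1980 and repeats every 30 years.
YEAR_CODES = "ABCDEFGHJKLMNPRSTVWXY123456789"


def decode_model_year(vin: str, latest_year: int) -> int:
    """Return the model year encoded at position 10 of a 17-character VIN.

    latest_year is the newest plausible model year (for example, next
    calendar year); it is a hypothetical parameter used to pick the cycle.
    """
    if len(vin) != 17:
        raise ValueError("a VIN must be 17 characters long")
    code = vin[9].upper()
    offset = YEAR_CODES.find(code)
    if offset < 0:
        raise ValueError(f"invalid model-year code: {code!r}")
    year = 1980 + offset
    # Advance by whole 30-year cycles while the result is still plausible.
    while year + 30 <= latest_year:
        year += 30
    return year
```

For example, a VIN whose tenth character is 3 decodes to 2003 when latest_year is 2026, because the next candidate (2033) is still in the future.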

Stack

Next.js 16, React 19, TypeScript, tRPC, TanStack Query, Zustand, Shadcn/ui, Tailwind CSS, Vite, TanStack Router, Hono, TipTap, Yjs, Go 1.25, Gin, SQLite, Supabase, MinIO, Turborepo, Bun, Docker, Remotion 4.0

02

A production-ready, full-stack linguistic analysis platform that helps language learners understand grammar through interactive visualizations and AI-powered explanations. Deployed and publicly accessible at grammario.ai.

Users enter a sentence in one of 5 supported languages (Italian, Spanish, German, Russian, Turkish) and receive deep grammatical analysis: tokenization, lemmatization, POS tagging, morphological analysis, and dependency parsing — all visualized as interactive syntax trees and linear dependency graphs. LLM-generated pedagogical insights explain why grammar works the way it does, providing rules, examples, and cultural nuance. Gamification layer (streaks, XP/levels, achievements, daily goals) and a spaced-repetition vocabulary system (SM-2 algorithm) to drive engagement.
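
For reference, one review step of SM-2 can be sketched as follows. This is a generic rendering of the published algorithm, not the platform's own code; the Card type and field names are hypothetical, and the choice to leave the ease factor unchanged after a failed review is one common reading of the original description.

```python
from dataclasses import dataclass


@dataclass
class Card:
    repetitions: int = 0    # consecutive successful reviews
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2 "E-Factor", never allowed below 1.3


def review(card: Card, quality: int) -> Card:
    """Apply one SM-2 review graded from 0 (blackout) to 5 (perfect recall)."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the schedule and keep the ease factor as-is.
        return Card(repetitions=0, interval_days=1, ease=card.ease)
    if card.repetitions == 0:
        interval = 1
    elif card.repetitions == 1:
        interval = 6
    else:
        interval = round(card.interval_days * card.ease)
    # Ease goes up for easy answers and down for hesitant ones.
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    ease = max(1.3, card.ease + delta)
    return Card(card.repetitions + 1, interval, ease)
```

Starting from a new card, two perfect reviews schedule the word 1 day and then 6 days out; after that, each interval is the previous one multiplied by the ease factor.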

Technical Highlights

Frontend: Next.js 16 (App Router, SSR, API routes), React 19, TypeScript, Tailwind CSS v4, ReactFlow for interactive node-graph visualizations, Dagre for automated tree layout, Zustand for client state with localStorage persistence, TanStack Query for server state and caching, Framer Motion for animations.
Backend: FastAPI (async Python 3.11), Stanza (Stanford NLP) pipelines for linguistic analysis with LRU-based model caching (up to 5 language models in memory with eviction), Pydantic v2 for schema validation.
NLP Strategy Pattern: Language-family-specific processing via RomanceStrategy (clitics, multi-word token expansion), InflectionStrategy (case governance, verbal aspect), and AgglutinativeStrategy (Turkish morpheme segmentation including vowel harmony, consonant softening, buffer consonants).
LLM Integration: Dual-provider setup (OpenRouter primary, OpenAI fallback) with response caching (100-entry manual cache), JSON-mode parsing, and language-specific prompt engineering for pedagogical output.
Auth & Database: Supabase Auth (email/password + Google OAuth, PKCE flow), PostgreSQL with Row-Level Security policies for complete user data isolation, auto-triggers for profile creation and timestamp management.
Infrastructure: Dockerized (multi-stage production builds, non-root user), Docker Compose orchestration, Nginx reverse proxy with rate limiting (10 req/s), SSL via Let's Encrypt/Certbot, GitHub Actions CI/CD pipeline (test → build → push to GHCR → SSH deploy to DigitalOcean), frontend on Vercel.
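
The LRU model caching mentioned under Backend can be sketched with an OrderedDict. This is a minimal illustration, assuming a loader callback that builds a pipeline for a language code; ModelCache and its methods are hypothetical names, and the real backend may manage Stanza pipelines differently.

```python
from collections import OrderedDict
from typing import Any, Callable, List


class ModelCache:
    """Keep at most `capacity` loaded models, evicting the least recently used."""

    def __init__(self, loader: Callable[[str], Any], capacity: int = 5) -> None:
        self._loader = loader      # assumption: builds a pipeline for a language code
        self._capacity = capacity
        self._models: "OrderedDict[str, Any]" = OrderedDict()

    def get(self, lang: str) -> Any:
        if lang in self._models:
            self._models.move_to_end(lang)    # mark as most recently used
            return self._models[lang]
        model = self._loader(lang)            # slow path: load from disk
        self._models[lang] = model
        if len(self._models) > self._capacity:
            self._models.popitem(last=False)  # drop the least recently used entry
        return model

    def cached_languages(self) -> List[str]:
        """Language codes currently in memory, least recently used first."""
        return list(self._models)
```

With five supported languages and a capacity of five, eviction only occurs if the capacity is lowered or more languages are added; the cap mainly bounds memory on small hosts.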

Stack

Next.js 16, React 19, TypeScript, Tailwind CSS v4, ReactFlow, Zustand, TanStack Query, Framer Motion, Radix UI, Axios, FastAPI, Python 3.11, Stanza, OpenAI SDK, Pydantic v2, Uvicorn, Supabase, JWT, Docker, Docker Compose, Nginx, Let's Encrypt, GitHub Actions, Vercel, DigitalOcean, Remotion

03

Global Terrorism Data Visualization

A full-stack data visualization application that transforms 177,000+ records from the Global Terrorism Database (GTD, START consortium) into an interactive 3D globe interface with a comprehensive analytics dashboard.

Renders terrorism incidents (1970–2017) as geospatial points on an interactive 3D globe with hover tooltips showing location, attack type, and casualty details. Provides seven analytics views — Overview, Trends, Regions, Attack Types, Targets, Weapons, and Hotspots — each with detailed statistical breakdowns. Backend processes and serves the full 177K-record dataset through a REST API with data cleaning, NaN handling, type conversion, and performance-conscious sampling.
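
The cleaning step matters because NaN is not valid JSON. A minimal standard-library sketch of the idea follows; the project itself uses Pandas, so this only illustrates the logic. The field names follow GTD column conventions (iyear, latitude, longitude, nkill, nwound), but which columns the API actually serves is an assumption.

```python
import json
import math
from typing import List, Optional

INT_FIELDS = {"iyear", "nkill", "nwound"}  # assumption: counts served as integers


def _is_nan(value) -> bool:
    return isinstance(value, float) and math.isnan(value)


def clean_record(record: dict) -> Optional[dict]:
    """Return a JSON-safe copy of one row, or None if it cannot be plotted."""
    for key in ("latitude", "longitude"):
        value = record.get(key)
        if value is None or _is_nan(value):
            return None  # no usable coordinates, so the point is skipped
    cleaned = {}
    for key, value in record.items():
        if _is_nan(value):
            cleaned[key] = None        # serialized as JSON null
        elif key in INT_FIELDS:
            cleaned[key] = int(value)  # 3.0 casualties becomes 3
        else:
            cleaned[key] = value
    return cleaned


def to_json(rows: List[dict]) -> str:
    """Serialize cleaned rows; allow_nan=False raises if a NaN slipped through."""
    cleaned = [c for c in map(clean_record, rows) if c is not None]
    return json.dumps(cleaned, allow_nan=False)
```

Passing allow_nan=False turns a silent bad payload into an immediate error, which is easier to debug than a browser-side JSON parse failure.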

Technical Highlights

Frontend: React, Globe.gl for WebGL-based 3D earth rendering with realistic textures and night-time lighting, Tailwind CSS, Axios for API communication, Vite as build tool.
Backend: FastAPI with Pandas for data manipulation and analysis across 177K+ records, Uvicorn ASGI server, CORS middleware, RESTful API design with 8 endpoints.
Data Engineering: Built a data pipeline that cleans raw GTD data (handling missing coordinates, NaN imputation, type coercion for JSON serialization) and serves it through analytical aggregation endpoints.
Performance: Implemented data sampling strategies to balance visualization density against rendering performance, lazy loading for on-demand data fetching, and backend caching.
Infrastructure: Dockerized with Docker Compose (backend + frontend + Nginx reverse proxy).
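
One simple form the sampling under Performance could take is a seeded uniform sample capped at a maximum point count. The sketch below is an assumption about the general approach, not the project's actual strategy; sample_points, max_points, and seed are hypothetical names.

```python
import random
from typing import List, Sequence


def sample_points(points: Sequence, max_points: int, seed: int = 0) -> List:
    """Return at most max_points items, chosen uniformly and reproducibly."""
    if len(points) <= max_points:
        return list(points)
    # A fixed seed means the same request always yields the same subset,
    # so the globe does not flicker between reloads.
    rng = random.Random(seed)
    return rng.sample(list(points), max_points)
```

A uniform sample preserves the overall geographic density pattern while keeping the number of rendered points bounded.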

Stack

React, Globe.gl (WebGL), Tailwind CSS, Axios, Vite, FastAPI, Python, Pandas, Uvicorn, Docker, Docker Compose, Nginx

04

Teen Phone Addiction Prediction

A full-stack machine learning project that predicts teen phone addiction levels from lifestyle and behavioral survey data, combining model training with experiment tracking, a real-time prediction API, and an interactive data exploration dashboard.

Trains a RandomForestRegressor to predict an Addiction_Level score (0–10 continuous) from survey features, tracked with a comprehensive metric suite (MSE, RMSE, R², MAE, MAPE, Max Error, Median AE). Serves predictions through a FastAPI /predict endpoint that automatically loads the latest registered model from MLflow. Provides a Streamlit dashboard for exploratory data analysis, feature importance visualization, and an interactive prediction playground.
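
The metric suite can be written out explicitly as below. This standard-library sketch is only meant to show the definitions; the project presumably relies on scikit-learn's metric functions, and regression_metrics is a hypothetical helper name.

```python
import math
import statistics
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute common regression metrics from paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    squared = [e * e for e in errors]
    mean_true = statistics.fmean(y_true)
    ss_res = sum(squared)                               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = statistics.fmean(squared)
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": statistics.fmean(abs_errors),
        # MAPE as a fraction, not a percentage; it divides by the true value,
        # so it is undefined for targets equal to 0.
        "mape": statistics.fmean([a / abs(t) for a, t in zip(abs_errors, y_true)]),
        "max_error": max(abs_errors),
        "median_ae": statistics.median(abs_errors),
        "r2": 1.0 - ss_res / ss_tot,
    }
```

Because the target ranges from 0 to 10, a true score of exactly 0 would make this plain MAPE divide by zero, so it is worth reading alongside MAE and Median AE rather than on its own.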

Technical Highlights

ML Pipeline: scikit-learn RandomForestRegressor with MLflow experiment tracking. Every training run logs parameters, metrics, and the serialized model artifact, enabling reproducible experiments and automatic model versioning.
Prediction API: FastAPI backend with Pydantic schema validation; loads the latest MLflow-registered model at startup, so each deployment or restart serves the most recent trained version without manual model path updates.
Dashboard: Streamlit app with data filtering/EDA, feature importance charts (Seaborn/Matplotlib), and a prediction playground that calls the FastAPI backend in real time.
Infrastructure: Dockerized for reproducible deployment.

Stack

scikit-learn, MLflow, FastAPI, Pydantic, Uvicorn, Streamlit, Pandas, Seaborn, Matplotlib, Docker

05

Skin Cancer Classification CNN

A deep learning application for binary classification of skin lesions as benign or malignant, built on the HAM10000 dermatoscopy dataset (~10,000 images). Implemented in both TensorFlow/Keras and PyTorch, with MLflow experiment tracking and CoreML conversion for iOS/macOS deployment.

Trains convolutional neural networks to classify dermatoscopic skin lesion images into benign or malignant categories, mapping seven diagnostic types into a binary label scheme. Provides multiple training scripts with CLI configuration for epoch count and optional CoreML model conversion for iOS/macOS deployment. Tracks experiments with MLflow, logging hyperparameters, per-epoch metrics, and model artifacts for reproducibility.
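
The seven-to-two label mapping can be sketched as follows. The seven codes are the values of the dx column in the HAM10000 metadata; the particular benign/malignant split shown is an assumption (a commonly used grouping), since the index does not state which one the training scripts apply.

```python
# The seven HAM10000 diagnosis codes, grouped into two classes.
# Assumption: this grouping is illustrative, not confirmed by the project.
MALIGNANT = {"mel", "bcc", "akiec"}   # melanoma, basal cell carcinoma, actinic keratoses
BENIGN = {"nv", "bkl", "df", "vasc"}  # nevi, benign keratosis, dermatofibroma, vascular


def to_binary_label(dx: str) -> int:
    """Map a diagnosis code to 1 (malignant) or 0 (benign)."""
    code = dx.strip().lower()
    if code in MALIGNANT:
        return 1
    if code in BENIGN:
        return 0
    raise ValueError(f"unknown diagnosis code: {dx!r}")
```

Raising on unknown codes, rather than defaulting to benign, keeps a metadata typo from silently adding mislabeled training examples.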

Technical Highlights

Data Pipeline: Ingests the HAM10000 dataset (~10,000 images) across two directories, joined with a CSV metadata file via Pandas merge on image_id. Addresses class imbalance through downsampling. Images resized (224×224 for PyTorch, 128×128 for TensorFlow), normalized using ImageNet statistics.
PyTorch CNN: Three convolutional blocks (Conv2d → ReLU → MaxPool2d) with progressively deeper filters (3→16→32→64 channels), followed by a fully connected layer with ReLU activation and a single sigmoid output neuron. Trained with Binary Cross-Entropy loss and Adam optimizer.
TensorFlow/Keras CNN: Sequential model with three Conv2D + ReLU + MaxPooling2D blocks, followed by Flatten, Dense(64) with ReLU, Dropout(0.5) for regularization, and a single sigmoid output unit. Includes ReduceLROnPlateau callback for adaptive learning rate scheduling.
MLOps: MLflow experiment tracking that logs hyperparameters, per-epoch metrics, and model artifacts. Model checkpointing saves the best model based on validation accuracy.
Deployment: CoreML conversion via coremltools for on-device inference on iOS 13+ and macOS. GPU-accelerated training with automatic CUDA device detection. Command-line interface via argparse.
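
The downsampling mentioned under Data Pipeline can be sketched as trimming every class to the size of the smallest one. This is a standard-library illustration under that assumption; the function name, the (image_id, label) tuple shape, and the fixed seed are hypothetical, and the project may balance classes to a different ratio.

```python
import random
from collections import defaultdict
from typing import List, Tuple


def downsample(samples: List[Tuple[str, int]], seed: int = 0) -> List[Tuple[str, int]]:
    """Balance classes by randomly trimming each one to the smallest class size."""
    by_label = defaultdict(list)
    for image_id, label in samples:
        by_label[label].append(image_id)
    target = min(len(ids) for ids in by_label.values())
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    balanced = []
    for label, ids in by_label.items():
        for image_id in rng.sample(ids, target):
            balanced.append((image_id, label))
    rng.shuffle(balanced)      # avoid long runs of a single class
    return balanced
```

Downsampling discards majority-class images, which is a reasonable trade when the benign class dominates and training time matters more than using every image.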

Stack

PyTorch, TensorFlow / Keras, scikit-learn, Pandas, NumPy, Pillow, MLflow, CoreML Tools, Docker, CUDA

06

procmon

Linux host telemetry and process-monitoring tool written in C++17 with ncurses. Real-time terminal dashboard powered directly by /proc, with process-level CPU and memory analytics, fast filtering, and suspicious-process tagging for defensive workflows.

Parses live telemetry from Linux kernel-exposed interfaces (/proc/stat, /proc/meminfo, /proc/<pid>/*) and renders a responsive low-level TUI with no heavy framework. Provides real-time host metrics alongside a full process table showing PID, state, CPU %, memory %, suspicious tag, and command. Interactive keyboard controls for sorting, live substring filtering, and immediate non-blocking input.
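
Host CPU usage from /proc/stat is a delta between two samples, since the file only exposes cumulative tick counters. The sketch below shows the arithmetic in Python rather than the tool's C++17; the function names are hypothetical, and counting iowait as idle time is a common convention that procmon may or may not follow.

```python
from typing import Tuple


def parse_cpu_line(line: str) -> Tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate "cpu" line of /proc/stat."""
    fields = line.split()
    if len(fields) < 9 or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line with at least 8 counters")
    # Order: user nice system idle iowait irq softirq steal (guest is already in user).
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + values[4]  # assumption: iowait is treated as idle
    return idle, sum(values)


def cpu_percent(before: str, after: str) -> float:
    """Busy percentage between two samples of the aggregate cpu line."""
    idle_a, total_a = parse_cpu_line(before)
    idle_b, total_b = parse_cpu_line(after)
    total_delta = total_b - total_a
    if total_delta <= 0:
        return 0.0  # no ticks elapsed between samples
    return 100.0 * (1.0 - (idle_b - idle_a) / total_delta)
```

On a Linux host, the two inputs would be the first line of /proc/stat read twice with a short sleep in between.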

Technical Highlights

Process Telemetry: Collects per-process data from /proc/<pid>/stat, status, cmdline, and comm. Computes CPU % normalized against elapsed process lifetime and logical CPU count, and memory % as RSS relative to host MemTotal.
Defensive Tagging: Heuristic-based suspicious process flags: TMP_EXEC for commands launched from temp/shared-memory paths, LOLBIN for living-off-the-land patterns, SPIKE for very high CPU consumers.
System Metrics: Enumerates numeric process directories under /proc, computes host CPU % using delta sampling of /proc/stat, and derives host memory % from /proc/meminfo.
ncurses Dashboard: High-frequency terminal UI with interactive sort/filter operations and non-blocking keyboard input. Designed for low overhead and rapid scanning aligned with SOC/IR-style endpoint analysis.
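
The three tags can be illustrated with a small rule function. Only the tag names come from the description above; the path prefixes, the binary list, and the 80% threshold are all assumptions chosen for the example, and the sketch is in Python rather than the tool's C++17.

```python
import os
from typing import List

TMP_PREFIXES = ("/tmp/", "/var/tmp/", "/dev/shm/")      # assumption: temp/shared-memory paths
LOLBIN_NAMES = {"curl", "wget", "nc", "ncat", "socat"}  # assumption: example binaries only
SPIKE_THRESHOLD = 80.0                                  # assumption: CPU % cutoff


def tag_process(cmdline: str, cpu_pct: float) -> List[str]:
    """Return the heuristic tags that apply to one process."""
    tokens = cmdline.split()
    exe = tokens[0] if tokens else ""
    tags = []
    if exe.startswith(TMP_PREFIXES):
        tags.append("TMP_EXEC")  # binary launched from a world-writable location
    if os.path.basename(exe) in LOLBIN_NAMES:
        tags.append("LOLBIN")    # legitimate tool that is often abused
    if cpu_pct >= SPIKE_THRESHOLD:
        tags.append("SPIKE")     # unusually heavy CPU consumer
    return tags
```

Matching on the executable's basename rather than a substring of the whole command avoids flagging unrelated arguments that happen to contain a tool name.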

Stack

C++17, ncurses, Linux /proc, Make
END OF INDEX