Next.js 15
TypeScript
FastAPI
BullMQ
YOLO
Prisma
Modal GPU
Pinecone
Prometheus
Grafana
- AI-native retail execution platform featuring async YOLO shelf analysis, multi-signal fraud detection, and governed outlet master data.
- Orchestrates an async BullMQ pipeline that runs YOLO detection followed by LLM reasoning, with Prometheus queue metrics and a dead-letter-queue (DLQ) replay CLI.
- Features a calibrated 5-signal fraud engine (including SHA-256, dHash, GPS mismatch, and EXIF analysis) that runs before inference to avoid spending GPU time on suspect submissions.
- Implements weighted outlet similarity matching (pg_trgm + geo prefilter) with three-tier resolution and non-destructive duplicate merging.
- Designed a two-tier RAG operational assistant combining exact DB queries (via Prisma) with Pinecone semantic search over visit reports.
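The dHash signal named above can be illustrated with a minimal sketch. It assumes the image has already been downscaled to a small grayscale grid (in practice an imaging library would do that step), and the function names and the `max_distance` threshold are hypothetical, not taken from the project.

```python
def dhash_bits(gray):
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `gray` is a list of rows of grayscale values that has already been
    downscaled (assumption); a 9x8 grid yields the usual 64-bit hash.
    """
    bits = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops from left to right.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a, b, max_distance=5):
    """Hypothetical threshold: a small Hamming distance suggests a re-used photo."""
    return hamming(a, b) <= max_distance
```

Because only the sign of each neighbouring difference is kept, a uniform brightness shift leaves the hash unchanged, which is why a check like this can catch lightly edited resubmissions that a SHA-256 comparison would miss.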
Python
YOLOv11
AWS EKS
FastAPI
Docker
ONNX
Prometheus
ArgoCD
Helm
- Fine-tuned YOLOv11L and deployed it on AWS EKS as an inference
microservice for dense, multi-class retail product detection.
- Converted the model to ONNX and served it in-process with ONNX Runtime, using INT8
quantization and right-sized CPU requests (200m/pod) for bin-packing, to reach sub-500ms
detection entirely on CPU, with no GPU required.
- Achieved a 4× CPU inference speedup and 70–90% cost savings via INT8, AMX, and
Spot node diversification across m7i-flex, c7i-flex, and t3.small.
- GitOps progressive delivery: ArgoCD Image Updater polls ECR,
Argo Rollouts runs weighted canaries through NGINX Ingress with automated
smoke tests before promotion.
- Spot resilience enforced with Pod Disruption Budgets and native 2-minute interruption
handling for zero-downtime operation on volatile hardware.
- DevSecOps gates in GitHub Actions (OIDC/STS): ruff, pip-audit, and Trivy CVE scans
guard the ECR boundary; Loki/Prometheus/Grafana for observability.
Go
Python
Celery
RabbitMQ
Terraform
AWS
K3s
Redis
- Migrated the HTTP edge from FastAPI to Go, preserving full contract
parity with the existing Celery worker runtime and Alembic-managed schema.
- Architected a two-node K3s cluster on AWS (On-Demand control plane + Spot data plane) with
full Terraform IaC and automated SSM reconciliation, running async image processing queued
through RabbitMQ.
- Optimized cost with EC2 Spot and graceful eviction via AWS Node Termination
Handler.
- Resilience under load with Redis idempotency keys, pybreaker circuit
breakers, and k6 stress tests.
- Zero-Trust GitHub Actions (OIDC) CI/CD; telemetry and traces to Grafana
LGTM via Alloy.
- Decoupled ML inference from fast conversions via dedicated queues to prevent worker starvation.
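The Redis idempotency keys mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: `InMemoryStore`, `process_once`, and the `idem:` key prefix are hypothetical, and the in-memory store stands in for Redis so the example runs without a server. With redis-py, the same claim step would be a single `set(key, value, nx=True, ex=ttl)` call.

```python
import time


class InMemoryStore:
    """Stand-in for Redis (assumption); mimics SET key value NX EX ttl."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expiry timestamp)
        self._clock = clock

    def set_if_absent(self, key, value, ttl_seconds):
        """Store the key only if it is missing or expired; return True on success."""
        now = self._clock()
        entry = self._data.get(key)
        if entry is not None and entry[1] > now:
            return False     # key is still held by an earlier request
        self._data[key] = (value, now + ttl_seconds)
        return True


def process_once(store, idempotency_key, handler, ttl_seconds=3600):
    """Run `handler` only for the first request that claims the key.

    Returns the handler result, or None when the request is a duplicate.
    """
    if not store.set_if_absent(f"idem:{idempotency_key}", "claimed", ttl_seconds):
        return None
    return handler()
```

Claiming the key before doing the work means a retried or redelivered message is dropped cheaply; the TTL bounds how long a key is remembered.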
Flutter
Kotlin
CameraX
ML Kit
AlarmManager
GeofencingClient
MapLibre
- Flutter/Native split: Flutter UI shell backed by a Kotlin alarm engine that
owns scheduling, ringing, recovery, and dismissal authority.
- Location alarms via GeofencingClient with hybrid geofence + passive
approach-assist and a 10-state health model per alarm.
- Mission-based dismissal with native inactivity enforcement: math, steps
(TYPE_STEP_DETECTOR), and QR (CameraX + ML Kit) missions.
- Direct-boot persistence and reboot recovery before first unlock via
device-protected storage and LOCKED_BOOT_COMPLETED.
- 28-finding security/performance/reliability audit driving a dedicated
hardening sprint. Macrobenchmark + Perfetto performance tooling.
FastAPI
Kafka
InfluxDB
React 19
Distributed monitoring platform — FastAPI microservices, a Kafka ingestion
pipeline, tiered InfluxDB retention, and real-time Telegram alerts.
LangChain
Pinecone
Gemini 2.5
Cohere
Hybrid-search RAG over docs — semantic + BM25 retrieval in Pinecone, Cohere
rerank, and Gemini 2.5 generation with query-time alpha tuning.
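Query-time alpha tuning can be pictured as a convex blend of the two retrieval signals. The sketch below works at the score level and assumes both score sets are already normalized to [0, 1]; `hybrid_rank` and its arguments are hypothetical names, and the project may instead weight the dense and sparse query vectors before querying.

```python
def hybrid_rank(semantic, keyword, alpha):
    """Rank documents by alpha * semantic + (1 - alpha) * keyword score.

    `semantic` and `keyword` map document ids to scores in [0, 1] (assumption).
    alpha = 1.0 is purely semantic; alpha = 0.0 is purely keyword (BM25-style).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    doc_ids = set(semantic) | set(keyword)
    scored = {
        doc: alpha * semantic.get(doc, 0.0) + (1.0 - alpha) * keyword.get(doc, 0.0)
        for doc in doc_ids
    }
    # Highest blended score first; ties broken by id for a stable order.
    return sorted(scored, key=lambda doc: (-scored[doc], doc))
```

A lower alpha favours exact-term matches such as identifiers or error codes, while a higher alpha favours paraphrased, conceptual queries.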
LangGraph
FastAPI
Multi-LLM
Firecrawl
AI 30-day learning-plan generator — LangGraph agent orchestration, cross-LLM
routing (OpenAI/Gemini/OpenRouter), and Firecrawl + Tavily retrieval.
FastAPI
Docker
Nginx
PostgreSQL
Microservices routine manager — split Auth/Core FastAPI services behind an
Nginx gateway, Argon2 + JWT security, and automated PDF export.