Harrison Muskat

Staff engineer — cloud platform & DevOps leadership, now building multi-agent AI infrastructure.

contactme@muskat.dev · GitHub · LinkedIn

Summary

Staff engineer with 9+ years operating cloud-native infrastructure at scale, now building platform and AI infrastructure for multi-agent LLM systems. Hands-on across GCP (GKE/Autopilot, CloudSQL, Pub/Sub), Terraform, Kubernetes, and CI/CD orchestration, with a track record of leading DevOps teams of 7–12 engineers and 24/7 on-call rotations. Shipped multi-tenant platforms serving 100+ customer environments at 300K+ messages/hour and cut deployment times by ~87%.

Experience

Staff Software Engineer, Platform & AI Infrastructure · Nov 2025 – Present

Ichi · Oakland, CA

  • Architected the core platform for a building-code compliance SaaS (Rails, Next.js, GraphQL), including a multi-agent LLM orchestration layer integrating Gemini Flash/Pro and OpenAI behind a unified provider abstraction.
  • Stood up the observability stack from the ground up — Prometheus, Grafana, and OpenTelemetry across Sidekiq workers, API endpoints, and the LLM gateway — establishing baseline latency, error, and token-usage metrics.
  • Re-architected document processing into an asynchronous, webhook-based pipeline with S3 object storage and Redis caching, eliminating synchronous blocking in the request path and removing N+1 queries.
  • Designed a security-isolated public-sharing architecture using segregated GraphQL types, enabling external collaboration and audit-grade visit tracking without exposing authenticated tenant data.
  • Built an end-to-end streaming pipeline delivering dynamically generated structured outputs over GraphQL subscriptions, with artifact versioning and self-healing error recovery.

Staff DevOps Engineer · Feb 2019 – Oct 2025

6 River Systems · Waltham, MA

  • Converted the Terraform IaC surface to TypeScript invoking Kubernetes and Google APIs directly, cutting average deployment time from ~1 hour to ~8 minutes (~87%) and improving extensibility, reliability, and traceability.
  • Built core infrastructure for a multi-tenant ETL platform on GKE Autopilot, scaling to 100+ customer environments and 300,000+ messages/hour, backed by Prometheus/Grafana observability and Google IAP.
  • Designed and rolled out OIDC-based SSO and configurable multi-tenant infrastructure with customer-facing management tooling — directly enabling two major enterprise client wins.
  • Led the 24/7/365 on-call rotation and gave technical leadership to a 12-engineer platform team; as DevOps Lead, ran a 7-engineer team across release engineering, user management, metrics/reporting, and IoT.
  • Cut robot-fleet upgrade time by ~67% (from ~60 to ~20 minutes), created a CircleCI orb standardizing CI/CD across repos, and built Node.js internal tooling that replaced manual database operations.

Software, QA & Operations · Jan 2015 – Feb 2019

Earlier roles · Greater Boston, MA

  • 6 River Systems — Senior Software Engineer in Test (2017–18): designed full-system regression, smoke, and integration tests for the mobile robotic fulfillment platform; first hands-on exposure to Docker, Kubernetes, Spinnaker, and GCP — the pivot into DevOps.
  • Solve at MIT — Web Developer (2018–19): lead developer for the Solve platform; modernized legacy code, built test coverage, and led the transition to an Agile SDLC.
  • athenahealth — Senior Payer Rules Associate (2015–17): led a cross-team Client Work Reduction project, shipping automations that cut client workload by ~10%.

Skills

Cloud Platforms: GCP (GKE/Autopilot, Cloud Run, Compute Engine, CloudSQL, Cloud Storage, Pub/Sub, IAP, IAM, BigQuery)
Container Orchestration: Kubernetes, Docker, Helm
Infrastructure as Code: Terraform, Kubernetes/Google APIs, Helm, Ansible, Spinnaker
CI/CD: CircleCI, GitHub Actions, Jenkins, Cloud Build, release orchestration
Observability & Reliability: Prometheus, VictoriaMetrics, Alertmanager, Grafana, OpenTelemetry, on-call, incident response, runbooks
Security & Identity: OIDC, JWT-based SSO, Workload Identity, IAM, multi-tenant isolation
Networking: Cloud DNS, Nginx, load balancing
Data: Postgres, MySQL, SQLite, Redis, Pub/Sub, BigQuery
Languages: TypeScript, JavaScript, Node.js, Ruby, Rails, Bash
Methodologies: Agile, SRE practices, postmortems, hiring & technical interviewing

Education

  • B.A. in Religion, cum laude, Amherst College · 2012