AI Platforms Engineer

Ade Daramola.
Building systems
that think and heal.

I design and ship production-grade AI-native infrastructure — autonomous Kubernetes platforms, agentic RAG pipelines, and multi-model orchestration systems that act with confidence and fail gracefully.


About

I'm a senior AI/Cloud engineer specialising in production-grade agentic systems, cloud-native infrastructure, and multi-agent orchestration. My focus is engineering software that can reason, decide, and act with minimal human intervention.

Every platform I build is infrastructure-as-code first — Terraform-managed, properly observed, and documented from day one. Not proof-of-concepts. Deployable, maintainable systems.

My work sits at the intersection of AWS cloud architecture and modern AI frameworks: LangGraph, confidence-gated decision engines, and retrieval-augmented generation at production scale.

"Every remediation is a Git commit. The cluster never changes outside a reviewed pull request or a high-confidence auto-apply."

Selected Work

Three platforms.
One through-line.

01
AIOps Platform

GitOps Sentinel

A production-grade AIOps platform for Kubernetes — intercepts infrastructure anomalies, routes them through a 5-agent reasoning pipeline, and acts only when confidence justifies it. Every fix is a Git commit; every rollback is automatic.

Confidence-gated routing: a score of 80% or higher auto-applies the fix, 40–79% opens a PR, and below 40% escalates to on-call, so routine high-confidence incidents need no human in the loop.
AWS Step Functions pipeline: Classifier → Root Cause → Action Planner → Confidence Scorer → Router, all Lambda-backed.
Full GitOps write path via Argo CD; failed post-remediation Prometheus checks trigger an automatic revert PR.
12+ custom Terraform modules; 33 unit tests at 100% pass rate.
Python, AWS Lambda, Step Functions, Terraform, EventBridge, DynamoDB, Argo CD, Kubernetes / EKS, Prometheus, Amazon Bedrock
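
The confidence gate described above can be sketched in a few lines. This is an illustration only, not the platform's actual code: the names `Route` and `route_remediation` are hypothetical, and the score is assumed to arrive as a fraction between 0 and 1. The thresholds are the bands listed above.

```python
from enum import Enum


class Route(Enum):
    """Possible outcomes of the confidence gate (hypothetical names)."""
    AUTO_APPLY = "auto_apply"  # commit the fix without human review
    OPEN_PR = "open_pr"        # propose the fix as a pull request
    ESCALATE = "escalate"      # hand the incident to on-call


# Bands from the description above, expressed as fractions.
AUTO_APPLY_MIN = 0.80
OPEN_PR_MIN = 0.40


def route_remediation(confidence: float) -> Route:
    """Map a confidence score in [0, 1] to a remediation route."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= AUTO_APPLY_MIN:
        return Route.AUTO_APPLY
    if confidence >= OPEN_PR_MIN:
        return Route.OPEN_PR
    return Route.ESCALATE
```

Keeping the thresholds as named constants means the bands can be tuned without touching the routing logic.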
02
Agentic RAG System

Medical Agentic RAG

A production agentic retrieval system for medical knowledge — streaming token-by-token responses, heuristic confidence scoring, and iterative relevance checking across three distinct knowledge sources.

LangGraph stateful workflow: Router → Retriever → Relevance Checker (up to 3 iterations) → Augment → Generate, with live DuckDuckGo web fallback.
SSE streaming from GPT-4o-mini through FastAPI to React; real-time token-by-token output with confidence scores and source badges.
PostgreSQL + pgvector vector store, rate limiting (20 req/min), API key auth, and CORS restrictions — production-hardened from day one.
AWS stack via Terraform: ECS Fargate, RDS PostgreSQL 16, ALB, ECR, S3 + CloudFront, Secrets Manager.
Python, LangGraph, FastAPI, Terraform, PostgreSQL / pgvector, GPT-4o-mini, React · SSE, ECS Fargate, CloudFront, Docker
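
The bounded relevance loop in the workflow above can be illustrated without any framework. This is a plain-Python sketch under stated assumptions, not the LangGraph graph itself: the function name and all four callables (`retrieve`, `is_relevant`, `rewrite`, `web_fallback`) are hypothetical stand-ins, and rewriting the query between attempts is an assumed strategy that the description does not specify.

```python
from typing import Callable, List

MAX_ITERATIONS = 3  # "up to 3 iterations", per the description above


def retrieve_with_relevance_check(
    query: str,
    retrieve: Callable[[str], List[str]],
    is_relevant: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    web_fallback: Callable[[str], List[str]],
) -> List[str]:
    """Retry retrieval until relevant documents are found or the budget runs out."""
    current = query
    for _ in range(MAX_ITERATIONS):
        # Keep only documents judged relevant to the ORIGINAL question.
        docs = [d for d in retrieve(current) if is_relevant(query, d)]
        if docs:
            return docs
        # Assumption: reformulate the query before the next attempt.
        current = rewrite(current)
    # Nothing relevant after the final attempt: fall back to web search.
    return web_fallback(query)
```

The hard iteration cap is what keeps the loop from running indefinitely when the knowledge sources simply do not contain an answer.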
03
LLM Orchestration

Multi-LLM Platform

A unified orchestration layer abstracting multiple LLM providers behind a single API — enabling intelligent model routing, per-provider cost tracking, and automatic failover across OpenAI, Anthropic, and AWS Bedrock.

Provider-agnostic adapter pattern: routing decisions based on cost, latency, and capability requirements per request.
Automatic fallback strategies and per-run token cost breakdowns for transparent LLM spend management.
Designed for extensibility — new providers added via a single adapter interface with no changes to the routing layer.
Python, FastAPI, LangChain / LangGraph, OpenAI, Anthropic, AWS Bedrock, Docker, Terraform
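
A rough sketch of the adapter pattern with automatic fallback, as described above. Everything here is hypothetical and simplified: `ProviderAdapter`, `StubAdapter`, and `complete_with_fallback` are illustrative names, no real provider SDK is called, and routing considers cost only, whereas the platform also weighs latency and capability.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol, Tuple


class ProviderAdapter(Protocol):
    """The single interface every provider implements (hypothetical)."""
    name: str
    cost_per_1k_tokens: float

    def complete(self, prompt: str) -> str: ...


@dataclass
class StubAdapter:
    """Stand-in adapter; a real one would wrap a provider SDK call."""
    name: str
    cost_per_1k_tokens: float
    handler: Callable[[str], str]

    def complete(self, prompt: str) -> str:
        return self.handler(prompt)


def complete_with_fallback(
    prompt: str, adapters: List[ProviderAdapter]
) -> Tuple[str, str]:
    """Try providers from cheapest to most expensive; return (provider, text)."""
    errors = []
    for adapter in sorted(adapters, key=lambda a: a.cost_per_1k_tokens):
        try:
            return adapter.name, adapter.complete(prompt)
        except Exception as exc:  # any provider failure triggers fallback
            errors.append(f"{adapter.name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Because the router depends only on the `ProviderAdapter` interface, adding a provider means writing one new adapter class and leaving the routing function unchanged.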

Technical Expertise

Cloud-native.
AI-first.

AI & Agentic Systems
  • Multi-agent pipeline orchestration
  • LangGraph stateful workflows
  • Retrieval-Augmented Generation
  • Confidence-gated decision engines
  • LLM routing & provider abstraction
  • Semantic vector search (pgvector)
Cloud Infrastructure
  • AWS Lambda, Step Functions, EventBridge
  • ECS Fargate, RDS, ALB, CloudFront
  • Amazon Bedrock & SageMaker
  • DynamoDB, S3, Secrets Manager
  • Terraform — 12+ custom modules
  • IAM scoped roles & X-Ray tracing
Kubernetes & GitOps
  • EKS cluster management
  • Argo CD GitOps controller
  • OPA Gatekeeper policy enforcement
  • Prometheus + Grafana observability
  • Helm chart deployment
  • Alertmanager integration
Backend & APIs
  • FastAPI — production-grade
  • SSE streaming responses
  • PostgreSQL + pgvector
  • Docker & docker-compose
  • Rate limiting & API key auth
  • Structured JSON logging
Languages
  • Python (primary)
  • HCL / Terraform
  • JavaScript / React
  • Bash / Makefile
  • Rego (Open Policy Agent)
  • YAML / JSON
Practices
  • Test-driven development (pytest)
  • Infrastructure as Code
  • Event-driven architecture
  • CI/CD — GitHub Actions
  • Security-first IAM design
  • Cost-aware architecture

Let's build something
genuinely ambitious.

hello@isokandev.com  ·  isokandev.com