Resources List
Resources Created in LOCI Self-Hosted Deployment
This document lists all resources created by the standalone deployment infrastructure.
Table of Contents
Infrastructure & Networking
EKS Cluster & Node Groups
Kubernetes Resources
AWS Services
Route53 & DNS
Monitoring & Logging
Security & IAM
Storage
Notes
All resources are tagged with cost allocation tags for tracking
Resources follow AWS Well-Architected Framework principles
High availability is configured for critical components (databases, node groups)
Backup and disaster recovery configured for databases
Monitoring and logging are comprehensive across all services
Security follows least-privilege IAM principles
All data is encrypted at rest and in transit
Infrastructure & Networking
VPC Resources
VPC (example-cluster)
CIDR: 10.3.0.0/16 (example CIDR)
DNS hostnames enabled
DNS support enabled
Subnets
Public Subnets
10.3.1.0/24 (Availability Zone A)
10.3.3.0/24 (Availability Zone B)
Tagged for ELB: kubernetes.io/role/elb = 1
Private Subnets
10.3.2.0/24 (Availability Zone A)
10.3.4.0/24 (Availability Zone B)
Tagged for internal ELB: kubernetes.io/role/internal-elb = 1
Networking Components
Internet Gateway (IGW)
NAT Gateway (Single NAT Gateway for cost optimization)
Route Tables (Public and Private)
VPC Endpoints (for private access to AWS services):
ECR API Endpoint (com.amazonaws.{region}.ecr.api)
ECR DKR Endpoint (com.amazonaws.{region}.ecr.dkr)
CloudWatch Logs Endpoint (com.amazonaws.{region}.logs)
EventBridge Endpoint (com.amazonaws.{region}.events) - Optional
S3 Gateway Endpoint (com.amazonaws.{region}.s3)
SageMaker Runtime Endpoint (com.amazonaws.{region}.sagemaker.runtime)
SageMaker API Endpoint (com.amazonaws.{region}.sagemaker.api)
Security Groups
EKS Cluster Security Group
Node Security Groups (for each node group)
SageMaker Endpoint Security Group
VPC Endpoint Security Group
EKS Cluster & Node Groups
EKS Cluster
Cluster Name: example-cluster (example name, will be set from config.ini)
Kubernetes Version: Latest supported
CloudWatch Logging: Enabled (14-day retention)
Public Endpoint Access: Enabled
Node Groups
1. Private Node Group
Instance Types: t3.medium
Desired Size: 2
Min Size: 2
Max Size: 3
Subnets: Private subnets
Capacity Type: ON_DEMAND
Workload Type: Private workloads
Labels: workload-type=private
Services: Backend application, Monitoring services (Grafana, Prometheus, Loki)
2. Database Node Group
Instance Types: t3.xlarge
Desired Size: 2
Min Size: 2
Max Size: 5
Subnets: Private subnets
Disk Size: 400GB (gp3, encrypted)
Workload Type: Database workloads
Labels: workload-type=database
3. Public Node Group (Disabled)
Status: Disabled (desired_size = 0)
Note: Public node group is not created. All workloads run on private nodes.
4. MCP Node Group
Instance Types: t3.xlarge
Desired Size: 2
Min Size: 2
Max Size: 4
Subnets: Private subnets
Capacity Type: ON_DEMAND
Workload Type: MCP workloads
Labels: workload-type=mcp
Services: MCP Server, MCP Agent, Code Optim Agent
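Workloads select these node groups through the workload-type label each group carries. A minimal nodeSelector sketch, assuming a Deployment pinned to the MCP node group (the Deployment name and image are placeholders, not actual chart values):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-mcp-workload        # placeholder name
  namespace: mcp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: example-mcp-workload
  template:
    metadata:
      labels:
        app: example-mcp-workload
    spec:
      nodeSelector:
        workload-type: mcp          # schedules pods onto MCP Node Group nodes only
      containers:
        - name: app
          image: example-mcp-workload:latest   # placeholder image
```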
EKS Add-ons
EBS CSI Driver (for persistent volumes)
AWS Load Balancer Controller
CoreDNS (for service discovery)
kube-proxy
VPC CNI
Kubernetes Resources
Namespaces
backend - Backend application namespace
frontend - Frontend application namespace
db - Database namespace (PostgreSQL)
neo4j - Neo4j database namespace
lambda - Lambda functions namespace
mcp - MCP services namespace (MCP Server, MCP Agent, Code Optim Agent)
monitoring - Monitoring stack namespace (Grafana, Prometheus)
logging - Logging stack namespace (Loki, Fluent-bit)
Helm Charts Deployed
1. Backend (charts/backend/)
Deployment: Backend application pods
Service: ClusterIP service for backend
Service (LB): LoadBalancer service for backend
Ingress: ALB Ingress for external access
HPA: Horizontal Pod Autoscaler
Image: loci-backend:latest
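A minimal sketch of the HPA above, assuming it scales the backend Deployment on CPU utilization; the target name, replica bounds, and threshold are illustrative rather than the chart's actual values:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: loci-backend
  namespace: backend
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: loci-backend              # assumed Deployment name
  minReplicas: 2                    # illustrative bounds
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70    # illustrative threshold
```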
2. Frontend (charts/frontend/)
Deployment: Frontend application pods
NodeSelector: workload-type: private (explicitly configured - runs on Private Node Group)
Service: ClusterIP service for frontend
Ingress: ALB Ingress for external access
Image: example-frontend:latest (example image name)
Hostname: example.com (example hostname, will be set from config.ini)
3. Neo4j (charts/neo4j/)
Helm Chart: Neo4j Community Edition
Deployment: Neo4j database pods
NodeSelector: workload-type: database (explicitly configured - runs on Database Node Group)
Service: ClusterIP and LoadBalancer services
Backup CronJob: Automated Neo4j backups to S3
Credentials: Managed via Kubernetes secrets
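A minimal sketch of what the backup CronJob above could look like, assuming a nightly dump copied to the backup bucket; the schedule, image, paths, and bucket name are illustrative, not the chart's actual job definition:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: neo4j-backup                 # placeholder name
  namespace: neo4j
spec:
  schedule: "0 2 * * *"              # illustrative: nightly at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            workload-type: database
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: amazon/aws-cli:latest     # illustrative image
              command: ["/bin/sh", "-c"]
              args:
                # illustrative: copy a dump produced on a shared volume to S3
                - aws s3 cp /backups/neo4j.dump s3://loci-backups-example/neo4j/neo4j-$(date +%F).dump
```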
4. PostgreSQL (charts/postgres-pgo/)
PostgreSQL Operator: PGO (PostgreSQL Operator)
PostgreSQL Cluster: High Availability PostgreSQL
NodeSelector: workload-type: database (explicitly configured - runs on Database Node Group)
Service: loci-postgres-ha service
Backup CronJob: Automated PostgreSQL backups to S3
Credentials: Managed via Kubernetes secrets
5. Grafana (charts/grafana/)
Helm Chart: Grafana
Namespace: monitoring
Deployment: Grafana pods
NodeSelector: None (can run on any node group)
Service: ClusterIP service
Ingress: ALB Ingress for external access
Hostname: grafana.example.com (example hostname, will be set from config.ini)
Storage: 10Gi persistent volume
Dashboards: Pre-configured dashboards (cluster-logs.json)
Data Sources: Prometheus, Loki
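A minimal sketch of Grafana data-source provisioning consistent with the two data sources above, using the in-cluster Prometheus and Loki URLs listed later in this document (file layout assumed, not copied from the chart):

```yaml
# e.g. mounted under /etc/grafana/provisioning/datasources/
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.monitoring.svc.cluster.local:9090
    isDefault: true
  - name: Loki
    type: loki
    access: proxy
    url: http://loci-logs-loki.logging.svc.cluster.local:3100
```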
6. Prometheus (charts/prometheus/)
Helm Chart: Prometheus
Namespace: monitoring
Deployment: Prometheus pods
NodeSelector: workload-type: private (explicitly configured - runs on Private Node Group)
Service: ClusterIP service (prometheus.monitoring.svc.cluster.local:9090)
Storage: 50Gi persistent volume
Retention: 7 days
Scraping: Metrics collection from cluster
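A minimal sketch of a Prometheus configuration consistent with the cluster-wide scraping above; the 7-day retention is normally set via the --storage.tsdb.retention.time=7d server flag, and the actual chart values will differ:

```yaml
# prometheus.yml fragment (illustrative)
global:
  scrape_interval: 30s              # illustrative interval
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod                   # discover pods across the cluster
  - job_name: kubernetes-nodes
    kubernetes_sd_configs:
      - role: node                  # discover node/kubelet targets
```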
7. Loki (charts/loki/)
Helm Chart: Loki
Namespace: logging
Deployment: Loki pods (SingleBinary mode)
NodeSelector: workload-type: private (explicitly configured - runs on Private Node Group)
Service: ClusterIP service (loci-logs-loki.logging.svc.cluster.local:3100)
Storage: 50Gi persistent volume
Retention: 48 hours (2 days)
External Service: For Fluent-bit integration
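A minimal sketch of the Loki settings implied by the 48-hour retention above, assuming compactor-based retention; the chart's values layout may differ:

```yaml
# Loki configuration fragment (SingleBinary mode)
limits_config:
  retention_period: 48h             # matches the 2-day retention above
compactor:
  retention_enabled: true           # compactor enforces the retention period
```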
8. Fluent-bit (charts/fluent-bit/)
DaemonSet: Fluent-bit pods
Namespace: logging
NodeSelector: workload-type: private (explicitly configured - runs on all Private Node Group nodes)
ConfigMap: Fluent-bit configuration
ServiceAccount: With IRSA for CloudWatch access
ClusterRole: Permissions for log collection
Log Forwarding: To Loki and CloudWatch
9. Lambda (charts/lambda/)
Kubernetes Functions: Lambda functions deployed as K8s resources
RBAC: ServiceAccount and RoleBindings
Functions:
Version Uploaded Lambda
Service Status Updated Lambda
SageMaker Status Lambda
10. MCP Server (charts/mcp/mcp-server/)
Helm Chart: MCP Server
Namespace: mcp
Deployment: MCP Server pods
NodeSelector: workload-type: mcp (explicitly configured - runs on MCP Node Group)
Service: ClusterIP service (loci-mcp-server.mcp.svc.cluster.local:7000)
Image: loci-mcp-server:latest
Resources: 1 CPU limit, 3Gi memory limit
Environment: Backend URL configured
11. MCP Agent (charts/mcp/mcp-agent/)
Helm Chart: MCP Agent
Namespace: mcp
Deployment: MCP Agent pods
NodeSelector: workload-type: mcp (explicitly configured - runs on MCP Node Group)
Service: ClusterIP service (port 8000)
Ingress: ALB Ingress for external access
Hostname: agent.example.com (example hostname, will be set from config.ini)
Image: example-mcp-agent:latest (example image name)
Resources: 2 CPU limit, 3Gi memory limit
Environment: Backend URL and AWS credentials configured
MCP Server URL: example-mcp-server.mcp.svc.cluster.local (example service name)
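A minimal sketch of how the MCP Agent container could reference the in-cluster MCP Server and backend services named above; the environment variable names and secret name are placeholders, not the chart's actual keys:

```yaml
# container env fragment for the MCP Agent Deployment (names are illustrative)
env:
  - name: MCP_SERVER_URL
    value: http://loci-mcp-server.mcp.svc.cluster.local:7000
  - name: BACKEND_URL
    value: http://loci-backend.backend.svc.cluster.local
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: aws-credentials       # illustrative secret in the mcp namespace
        key: aws_access_key_id
```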
AWS Services
ECS (Elastic Container Service)
ECS Cluster: LociPlatform{LociEnv}Cluster
Capacity Providers: FARGATE, FARGATE_SPOT
Task Definitions:
Static Analysis Task: LociPlatform{LociEnv}StaticAnalysis
CPU: 8192 (8 vCPU)
Memory: 61440 MB (60 GB)
Container: static-analysis
LCLM Task: LociPlatform{LociEnv}LCLM
CPU: 8192 (8 vCPU)
Memory: 61440 MB (60 GB)
Container: lclm
SageMaker
SageMaker Model: LociPlatform{LociEnv}Model
Model Package ARN: From config (lclm-multi)
VPC Configuration: Enabled
Endpoint Configuration: LociPlatform{LociEnv}EndpointConfig
Instance Type: ml.g5.xlarge
Async Inference: Enabled
SNS Notifications: Configured
SageMaker Endpoint: LociPlatform{LociEnv}Endpoint
Auto Scaling: Enabled (min: 1, max: configured)
CloudWatch Alarms: CPU and Memory utilization
DynamoDB
Table: LociPlatform{LociEnv}PipelineDB
Billing Mode: PAY_PER_REQUEST
Primary Key: version_id (Hash), artifact_action (Range)
Point-in-Time Recovery: Enabled
Encryption: Server-side encryption enabled
S3 Buckets
Storage Bucket: loci-{environment} (or configured name)
Versioning: Suspended
Encryption: AES256
Public Access: Blocked
Lifecycle Rules: Transition to IA (30 days), Glacier (90 days)
Backup Bucket: loci-backups-{environment}
Used for: PostgreSQL backups, Neo4j backups
Terraform State Bucket: loci-terraform (or configured)
State file: cluster/terraform.tfstate
EventBridge
Event Bus: LociPlatform{LociEnv}Events
Event Rules:
LociPlatform{LociEnv}PipelineStarted
LociPlatform{LociEnv}PipelineCompleted
LociPlatform{LociEnv}PipelineFailed
LociPlatform{LociEnv}VersionUploaded
LociPlatform{LociEnv}ServiceStatusUpdated
SNS (Simple Notification Service)
Topic: LociPlatform{LociEnv}Notifications
Used for: SageMaker async inference notifications
Used for: Pipeline event notifications
CloudWatch
Log Groups:
/ecs/LociPlatform{LociEnv}StaticAnalysis (14-day retention)
/ecs/LociPlatform{LociEnv}LCLM (14-day retention)
/aws/sagemaker/Endpoints/LociPlatform{LociEnv}Endpoint (14-day retention)
/aws/lambda/LociPlatform{LociEnv}-VersionUploadedLambda (14-day retention)
/aws/lambda/LociPlatform{LociEnv}-ServiceStatusUpdatedLambda (14-day retention)
/aws/lambda/LociPlatform{LociEnv}-SageMakerStatusLambda (14-day retention)
Alarms:
SageMaker High CPU Utilization
SageMaker High Memory Utilization
Lambda Functions (Kubernetes-based)
Version Uploaded Lambda: Handles S3 upload events
Service Status Updated Lambda: Handles ECS task status updates
SageMaker Status Lambda: Handles SageMaker endpoint status
Route53 & DNS
Public DNS (Route53 Hosted Zone)
Hosted Zone: example.com (example zone ID: Z08012303MU06OUHG8U2M)
Records Created:
example.com → Frontend ALB (example hostname)
api.example.com → Backend ALB (example hostname)
grafana.example.com → Grafana ALB (example hostname)
agent.example.com → MCP Agent ALB (example hostname)
Private DNS (Route53 Private Hosted Zone)
Private Zone: k8s.{environment}.internal
VPC Association: Associated with VPC
Service Discovery Records:
neo4j.k8s.{environment}.internal → Neo4j LoadBalancer
loci-postgres-ha.k8s.{environment}.internal → PostgreSQL LoadBalancer
loci-backend.k8s.{environment}.internal → Backend LoadBalancer
Route53 Resolver
Resolver Endpoint: {environment}-k8s-dns-forwarder (OUTBOUND)
Resolver Rule: {environment}-k8s-service-forwarding
Forwards svc.cluster.local queries to CoreDNS
Rule Association: Associated with VPC
Monitoring & Logging
Prometheus
Metrics Collection: Cluster-wide metrics
Scraping Targets: All Kubernetes services
Storage: Persistent volume
Grafana
Dashboards: Pre-configured dashboards
Cluster Logs Dashboard
Data Sources: Prometheus, Loki
Access: https://grafana.example.com (example URL, will be set from config.ini)
Loki
Log Aggregation: Centralized log storage
Service: loci-logs-loki.logging.svc.cluster.local
Integration: Fluent-bit → Loki
Fluent-bit
Log Collection: DaemonSet on all nodes
Log Forwarding:
To Loki (cluster logs)
To CloudWatch (AWS service logs)
Filters: Log parsing and enrichment
CloudWatch
Container Insights: Enabled on ECS cluster
Log Groups: All application and service logs
Metrics: ECS, SageMaker, Lambda metrics
Security & IAM
IAM Roles
ECS Roles
ECS Task Execution Role: LociPlatform{LociEnv}ECSTaskExecutionRole
ECR access
CloudWatch Logs access
S3 read access (model artifacts)
ECS Task Role: LociPlatform{LociEnv}ECSTaskRole
Full S3 access
EventBridge access
DynamoDB access
SageMaker access
ECS access
SNS access
CloudWatch Logs access
SageMaker Roles
SageMaker Execution Role: LociPlatform{LociEnv}SageMakerExecutionRole
SageMaker Full Access
S3 access (storage bucket)
SNS access (notifications)
ECR access (cross-account)
Model Package access (cross-account)
Lambda Roles
Lambda Execution Roles: Created via Kubernetes IRSA
ECS access
S3 access
EventBridge access
DynamoDB access
SageMaker access
Kubernetes Service Accounts
lambda-secret-updater: ServiceAccount with IRSA
Updates Kubernetes secrets from AWS Secrets Manager
fluent-bit: ServiceAccount with IRSA
CloudWatch Logs write access
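A minimal sketch of an IRSA-enabled ServiceAccount like the ones above, assuming the standard EKS role-arn annotation; the AWS account ID and role name are placeholders:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluent-bit
  namespace: logging
  annotations:
    # IRSA: pods using this ServiceAccount assume the IAM role below (placeholder account/role)
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/example-fluent-bit-cloudwatch
```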
IAM Policies
DynamoDB Pipeline Access Policy: LociPlatform{LociEnv}DynamoDBAccess
SageMaker S3 Access Policy: LociPlatform{LociEnv}SageMakerS3Access
SageMaker SNS Access Policy: LociPlatform{LociEnv}SageMakerSNSAccess
SageMaker Model Package Access Policy: LociPlatform{LociEnv}SageMakerModelPackageAccess
SageMaker ECR Access Policy: LociPlatform{LociEnv}SageMakerECRAccess
Security Groups
EKS Cluster Security Group: Cluster control plane access
Node Security Groups: Node group access rules
SageMaker Endpoint Security Group: Endpoint access (HTTPS from VPC)
VPC Endpoint Security Group: VPC endpoint access (HTTPS from VPC)
Secrets Management
Kubernetes Secrets:
Neo4j credentials (neo4j namespace)
PostgreSQL credentials (db namespace)
Lambda environment variables (lambda namespace)
AWS credentials (mcp namespace) - Used by MCP Agent and Code Optim Agent
AWS Secrets Manager: (if configured)
Database passwords
API keys
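A minimal sketch of one of the Kubernetes secrets listed above (Neo4j credentials); the secret name, key, and value are placeholders, with real values supplied from config or Secrets Manager:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: neo4j-credentials           # placeholder name
  namespace: neo4j
type: Opaque
stringData:
  NEO4J_AUTH: neo4j/example-password   # placeholder; never commit real credentials
```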
Storage
EBS Volumes
Database Node Volumes: 400GB gp3 encrypted volumes
Persistent Volumes: Created via EBS CSI Driver
PostgreSQL data volumes
Neo4j data volumes
Prometheus storage
S3 Storage
Application Data: loci-{environment} bucket
Model artifacts
SageMaker input/output
Application uploads
Backups: loci-backups-{environment} bucket
PostgreSQL backups (pgBackRest)
Neo4j backups
Terraform State: loci-terraform bucket
Infrastructure state files
Storage Classes (Kubernetes)
gp3: General purpose SSD (default)
io1: Provisioned IOPS SSD (if configured)
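A minimal sketch of the default gp3 StorageClass backed by the EBS CSI driver, matching the encrypted gp3 volumes described above:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com         # EBS CSI driver
parameters:
  type: gp3
  encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```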
Summary
Total Resource Count (Approximate)
VPC Resources: ~15 (VPC, subnets, gateways, route tables, endpoints)
EKS Resources: 1 cluster + 3 node groups (Private + Database + MCP; Public disabled)
Kubernetes Namespaces: 8 (backend, frontend, db, neo4j, lambda, mcp, monitoring, logging)
Helm Charts: 12
ECS Resources: 1 cluster + 2 task definitions
SageMaker Resources: 1 model + 1 endpoint config + 1 endpoint
DynamoDB Tables: 1
S3 Buckets: 2-3
Route53 Zones: 2 (1 public + 1 private)
Route53 Records: ~8-10
IAM Roles: ~5-7
IAM Policies: ~8-10
Security Groups: ~5-7
CloudWatch Log Groups: ~6
CloudWatch Alarms: 2
EventBridge Resources: 1 event bus + 5 rules
SNS Topics: 1
Lambda Functions: 3 (Kubernetes-based)
Node Group Assignments
Services by Node Group
Private Node Group (workload-type: private)
Backend (backend namespace)
Application pods
NodeSelector: workload-type: private (explicitly configured)
Service: loci-backend
LoadBalancer: Internal NLB for backend access
Frontend (frontend namespace)
Application pods
NodeSelector: workload-type: private (explicitly configured)
Service: Frontend service
Ingress: ALB for external access
Prometheus (monitoring namespace)
Prometheus pods
NodeSelector: workload-type: private (explicitly configured)
Storage: 50Gi persistent volume
Service: ClusterIP (prometheus.monitoring.svc.cluster.local:9090)
Loki (logging namespace)
Loki pods
NodeSelector: workload-type: private (explicitly configured)
Storage: 50Gi persistent volume
Service: ClusterIP (loci-logs-loki.logging.svc.cluster.local:3100)
Fluent-bit (logging namespace)
DaemonSet pods (runs on all private nodes)
NodeSelector: workload-type: private (explicitly configured)
Log collection from all nodes
Database Node Group (workload-type: database)
Neo4j (neo4j namespace)
Neo4j database pods
NodeSelector: workload-type: database (explicitly configured)
Storage: 120Gi EBS volumes
Service: neo4j (ClusterIP) + LoadBalancer (Internal NLB)
PostgreSQL (db namespace)
PostgreSQL cluster pods (managed by PGO)
NodeSelector: workload-type: database (explicitly configured)
Replicas: 2 instances
Storage: 400Gi EBS volumes per instance
Service: loci-postgres-ha (LoadBalancer - Internal NLB)
PGO Operator: Deployed in db namespace (no nodeSelector - control plane component)
MCP Node Group (workload-type: mcp)
MCP Server (mcp namespace)
MCP Server pods
NodeSelector: workload-type: mcp (explicitly configured)
Service: loci-mcp-server (ClusterIP)
Resources: 1 CPU, 3Gi memory
MCP Agent (mcp namespace)
MCP Agent pods
NodeSelector: workload-type: mcp (explicitly configured)
Service: ClusterIP + Ingress (ALB)
Hostname: agent.example.com (example hostname)
Resources: 2 CPU, 3Gi memory
Public Node Group (workload-type: public) - DISABLED
Status: Not created (disabled in configuration)
Note: Public node group is disabled. System pods (CoreDNS, kube-proxy, ALB Ingress Controller) will run on private nodes.
All Node Groups
Fluent-bit (logging namespace)
DaemonSet deployed on all nodes
No nodeSelector (runs on all node types)
Collects logs from all nodes
Node Group Summary
Public Node Group: DISABLED (not created)
Private Node Group: Application and system workloads (with the public group disabled, system pods also run here):
Backend (explicitly configured with workload-type: private)
Frontend (explicitly configured with workload-type: private)
Prometheus (explicitly configured with workload-type: private)
Loki (explicitly configured with workload-type: private)
Fluent-bit DaemonSet (explicitly configured with workload-type: private)
Min Size: 2 nodes
Database Node Group: Database workloads only:
Neo4j (explicitly configured with workload-type: database)
PostgreSQL (explicitly configured with workload-type: database)
Min Size: 2 nodes
MCP Node Group: MCP/AI workloads:
MCP Server (explicitly configured with workload-type: mcp)
MCP Agent (explicitly configured with workload-type: mcp)
Code Optim Agent (explicitly configured with workload-type: mcp)
Min Size: 2 nodes
Instance Type: t3.xlarge (4 vCPU, 16GB RAM)
Monitoring Services Details
Namespace: monitoring (Grafana, Prometheus) and logging (Loki, Fluent-bit)
Node Assignment:
Prometheus: Explicitly configured with workload-type: private - runs on Private Node Group
Loki: Explicitly configured with workload-type: private - runs on Private Node Group
Fluent-bit: Explicitly configured with workload-type: private - DaemonSet runs on all Private Node Group nodes
Grafana: No explicit nodeSelector (can run on any node group, typically scheduled on Private Node Group)
Storage: All monitoring services use persistent volumes (Grafana: 10Gi, Prometheus: 50Gi, Loki: 50Gi)