With your home server running Kubernetes, it’s time to deploy the infrastructure that powers modern data platforms. In this chapter, we’ll install MinIO for S3-compatible storage, Dagster for orchestration, and Prometheus with Grafana for monitoring.
The goal: clean URLs, no port-forwarding, and production-ready components accessible from anywhere on your network.
What You’ll Deploy
| Component | Purpose | Access |
|---|---|---|
| MinIO | S3-compatible object storage | API on port 9000, Console on port 80 |
| Dagster | Data orchestration platform | Web UI on port 80 |
| Prometheus | Metrics collection | Query interface on port 80 |
| Grafana | Visualization dashboards | Web UI on port 80 |
With LoadBalancer services backed by MetalLB, each component gets its own IP address. No memorizing port numbers—just clean URLs like http://192.168.1.241 for the MinIO Console.
Prerequisites
In the previous chapter you enabled essential addons so your Kubernetes cluster has:
- MicroK8s with the dns, storage, and metallb addons enabled
- MetalLB configured with an IP address pool (e.g., 192.168.1.210-192.168.1.250)
- kubectl configured (via microk8s kubectl or an alias)
MicroK8s bundles Helm, but it’s accessed via microk8s helm. Set up an alias to use helm directly:
alias helm='microk8s helm'
Add this to your ~/.bashrc or ~/.zshrc to make it permanent.
Creating the Namespaces
Kubernetes namespaces provide logical isolation between workloads—separate resource quotas, access controls, and cleaner organization. We’ll use two namespaces: data for storage services like MinIO, and pipeline for orchestration tools like Dagster.
kubectl create namespace data
kubectl create namespace pipeline
Installing MinIO
MinIO provides S3-compatible object storage that runs on your hardware. Your code uses standard AWS SDKs, but data never leaves your network.
A note on MinIO licensing: MinIO has shifted toward an enterprise model, with the community edition now positioned for “test and dev use.” Features like site replication and advanced security require an enterprise license. For our tutorial purposes, the community edition works fine. If you want a fully open-source alternative, consider Garage—a self-hosted S3-compatible storage system without enterprise restrictions.
Add the Helm Repository
helm repo add minio https://charts.min.io/
helm repo update
Create Credentials
kubectl create secret generic minio-credentials \
--from-literal=rootUser=admin \
--from-literal=rootPassword=$(openssl rand -base64 32) \
-n data
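The password is generated inline rather than typed by hand, so it never lands in your shell history as plaintext you chose. A quick sanity check of what that subshell produces (a sketch, assuming only openssl is installed): 32 random bytes always base64-encode to a 44-character string.

```shell
# Generate a root password the same way the secret above does
PASSWORD=$(openssl rand -base64 32)

# 32 random bytes encode to exactly 44 base64 characters
echo "length: ${#PASSWORD}"   # length: 44
```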
These are your MinIO root credentials—the admin account with full access to create buckets, manage users, and configure policies. Treat them like your AWS root credentials.
To retrieve them later:
# Access key
kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootUser}' | base64 --decode
# Secret key
kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootPassword}' | base64 --decode
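The base64 --decode step is needed because Kubernetes stores every secret value base64-encoded, and jsonpath returns the stored (encoded) form. A minimal illustration of the round-trip, independent of the cluster:

```shell
# What kubectl returns for a secret field is the base64-encoded value...
ENCODED=$(printf 'admin' | base64)
echo "$ENCODED"                              # YWRtaW4=

# ...so you decode it to recover the original string
DECODED=$(printf '%s' "$ENCODED" | base64 --decode)
echo "$DECODED"                              # admin
```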
Deploy with Helm
Create a minio-values.yaml file:
image:
repository: quay.io/minio/minio
# We pin to this version because later releases removed key features from the
# console (API key generation, certain admin functions). This version includes
# the full-featured web UI.
tag: RELEASE.2025-04-22T22-12-26Z
mode: standalone
replicas: 1
persistence:
enabled: true
size: 500Gi
service:
type: LoadBalancer
port: 9000
consoleService:
type: LoadBalancer
port: 80
resources:
requests:
memory: 2Gi
existingSecret: minio-credentials
This values file configures: standalone mode (single node), 500GB of persistent storage, LoadBalancer services for network access, and a reference to our credentials secret. Helm merges these values with the chart’s defaults—we only specify what we want to override.
Install MinIO:
helm install minio minio/minio \
  --namespace data \
--values minio-values.yaml \
--version 5.4.0
Verify the Installation
# Check pod status
kubectl get pods -n data -l app=minio
# Get LoadBalancer IPs
kubectl get svc -n data | grep minio
You should see two services with external IPs assigned. Open the Console IP in your browser to access MinIO’s web interface.
Installing Dagster
Dagster is an asset-based orchestration platform that thinks in terms of data products rather than tasks. It’s ideal for data pipelines where the focus is on what data exists, not just what jobs run.
Add the Helm Repository
helm repo add dagster https://dagster-io.github.io/helm
helm repo update
Create Credentials for MinIO Access
Dagster needs MinIO credentials to store logs and access data:
MINIO_ACCESS_KEY=$(kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootUser}' | base64 --decode)
MINIO_SECRET_KEY=$(kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootPassword}' | base64 --decode)
kubectl create secret generic dagster-aws-credentials \
--from-literal=AWS_ACCESS_KEY_ID=$MINIO_ACCESS_KEY \
--from-literal=AWS_SECRET_ACCESS_KEY=$MINIO_SECRET_KEY \
-n pipeline
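For the credentials to actually reach your pipeline code, this secret has to be injected into Dagster’s user-code pods. A sketch of how that could look in dagster-values.yaml, assuming the chart’s dagster-user-deployments section with its envSecrets field (verify the exact keys with helm show values dagster/dagster for your chart version); the deployment name and image are hypothetical placeholders:

```yaml
dagster-user-deployments:
  enabled: true
  deployments:
    - name: my-pipeline                         # hypothetical deployment name
      image:
        repository: registry.local/my-pipeline  # hypothetical image
        tag: latest
      dagsterApiGrpcArgs:
        - "--python-file"
        - "/opt/dagster/app/definitions.py"
      port: 3030
      envSecrets:
        - name: dagster-aws-credentials  # injects AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
```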
Deploy with Helm
Create a dagster-values.yaml file:
dagsterWebserver:
service:
type: LoadBalancer
port: 80
dagsterDaemon:
enabled: true
postgresql:
enabled: true
persistence:
enabled: true
size: 10Gi
runLauncher:
type: K8sRunLauncher
Dagster requires PostgreSQL for metadata storage. Here we use the bundled PostgreSQL that ships with the Helm chart. In larger setups, you might deploy a shared PostgreSQL instance in your data namespace and configure Dagster to use it instead.
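If you later move to a shared PostgreSQL, the switch is a few lines in the same values file. A sketch, assuming the chart’s standard postgresql.* keys (check helm show values dagster/dagster for your version); the host, user, and database names are hypothetical:

```yaml
postgresql:
  enabled: false                                    # don't deploy the bundled instance
  postgresqlHost: postgres.data.svc.cluster.local   # hypothetical shared instance
  postgresqlUsername: dagster
  postgresqlPassword: ""                            # prefer sourcing this from a secret per the chart docs
  postgresqlDbName: dagster
```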
Install Dagster:
helm install dagster dagster/dagster \
--namespace pipeline \
--values dagster-values.yaml \
--version 1.12.8
Verify the Installation
# Watch pods come up (takes 2-3 minutes)
kubectl get pods -n pipeline -w
# Get the web UI URL (the chart names the service dagster-dagster-webserver)
kubectl get svc dagster-dagster-webserver -n pipeline
Expected pods:
- dagster-daemon - Runs schedules and sensors
- dagster-webserver - Web UI
- dagster-postgresql - Metadata database
The user deployments pod may show errors initially—that’s expected until we deploy pipeline code, which we’ll do in a later chapter.
Installing Prometheus and Grafana
MicroK8s includes a managed observability addon that bundles Prometheus, Grafana, and Alertmanager with sensible defaults.
microk8s enable observability
This installs everything in the observability namespace. Wait for the pods to be ready:
kubectl get pods -n observability -w
Access Grafana
By default, Grafana uses a ClusterIP service. Patch it to LoadBalancer so it’s accessible from your network:
kubectl patch svc kube-prom-stack-grafana -n observability -p '{"spec": {"type": "LoadBalancer"}}'
Verify the external IP was assigned:
kubectl get svc -n observability | grep grafana
Get the admin password:
kubectl get secret kube-prom-stack-grafana -n observability \
-o jsonpath="{.data.admin-password}" | base64 --decode
Default username is admin. Change the password after first login.
Collecting Your Service URLs
Create a helper script to display all your service endpoints:
#!/bin/bash
echo "=== Service URLs ==="
echo "MinIO API: http://$(kubectl get svc minio -n data -o jsonpath='{.status.loadBalancer.ingress[0].ip}'):9000"
echo "MinIO Console: http://$(kubectl get svc minio-console -n data -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
echo "Dagit UI: http://$(kubectl get svc dagster-dagster-webserver -n pipeline -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
echo "Grafana: http://$(kubectl get svc kube-prom-stack-grafana -n observability -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
Save these URLs—you’ll use them throughout the series.
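One caveat with the raw jsonpath queries above: they return an empty string while an IP is still <pending>, which produces a broken URL. A small wrapper (a sketch; the service and namespace names are the ones used in this chapter) makes that case explicit and builds each URL in one place:

```shell
# Build an http:// URL for a LoadBalancer service; the port argument is optional.
svc_url() {
  local svc=$1 ns=$2 port=$3 ip
  ip=$(kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$ip" ]; then
    echo "($svc: IP pending)"
    return 1
  fi
  echo "http://${ip}${port:+:$port}"
}

# Usage:
#   svc_url minio data 9000        # MinIO API
#   svc_url minio-console data     # MinIO Console
```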
Verification Checklist
Before moving forward, confirm:
- All pods show Running status (except user deployments, which may error)
- MinIO Console loads in your browser
- Dagit UI loads (even with missing code errors)
- Grafana login works with admin credentials
- All services have external IPs assigned (no <pending> status)
What’s Running Now
You’ve deployed a complete infrastructure stack:
- MinIO ready to store your data lake with S3-compatible APIs
- Dagster ready to orchestrate pipelines with a visual interface
- Prometheus collecting cluster metrics
- Grafana ready for custom dashboards
Each service has its own LoadBalancer IP, accessible from any machine on your network. This is the same architecture pattern used in cloud deployments—you’ve just built it on hardware you control.
What’s Next
With infrastructure in place, we need to design how data flows through the system. Chapter 3 covers the medallion architecture pattern and bucket organization that will structure your data lake.
Building production data infrastructure? We help organizations design and implement modern data platforms. Get in touch to discuss your project.