With your home server running Kubernetes, it’s time to deploy the infrastructure that powers modern data platforms. In this chapter, we’ll install MinIO for S3-compatible storage, Dagster for orchestration, and Prometheus with Grafana for monitoring.
The goal: clean URLs, no port-forwarding, and production-ready components accessible from anywhere on your network.
What You’ll Deploy
| Component | Purpose | Access |
|---|---|---|
| MinIO | S3-compatible object storage | API on port 9000, Console on port 80 |
| Dagster | Data orchestration platform | Web UI on port 80 |
| Prometheus | Metrics collection | Query interface on port 80 |
| Grafana | Visualization dashboards | Web UI on port 80 |
With LoadBalancer services backed by MetalLB, each component gets its own IP address. No memorizing port numbers—just clean URLs like http://192.168.1.241 for the MinIO Console.
Prerequisites
In the previous chapter you enabled essential addons so your Kubernetes cluster has:
- MicroK8s with the dns, storage, and metallb addons enabled
- MetalLB configured with an IP address pool (e.g., 192.168.1.210-192.168.1.250)
- kubectl configured (via microk8s kubectl or an alias)
MicroK8s bundles Helm, but it’s accessed via microk8s helm. Set up an alias to use helm directly:
alias helm='microk8s helm'
Add this to your ~/.bashrc or ~/.zshrc to make it permanent.
Creating the Namespaces
Kubernetes namespaces provide logical isolation between workloads—separate resource quotas, access controls, and cleaner organization. We’ll use two namespaces: data for storage services like MinIO, and pipeline for orchestration tools like Dagster.
kubectl create namespace data
kubectl create namespace pipeline
Installing MinIO
MinIO provides S3-compatible object storage that runs on your hardware. Your code uses standard AWS SDKs, but data never leaves your network.
A note on MinIO licensing: MinIO has shifted toward an enterprise model, with the community edition now positioned for “test and dev use.” Features like site replication and advanced security require an enterprise license. For our tutorial purposes, the community edition works fine. If you want a fully open-source alternative, consider Garage—a self-hosted S3-compatible storage system without enterprise restrictions.
Add the Helm Repository
helm repo add minio https://charts.min.io/
helm repo update
Create Credentials
kubectl create secret generic minio-credentials \
--from-literal=rootUser=admin \
--from-literal=rootPassword=$(openssl rand -base64 32) \
-n data
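The password is generated inline rather than typed by hand, so it never lands in your shell history as plaintext you chose. A quick sanity check of what that subshell produces (a sketch, assuming only openssl is installed): 32 random bytes always base64-encode to a 44-character string.

```shell
# Generate a root password the same way the secret above does
PASSWORD=$(openssl rand -base64 32)

# 32 random bytes encode to exactly 44 base64 characters
echo "length: ${#PASSWORD}"   # length: 44
```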
These are your MinIO root credentials—the admin account with full access to create buckets, manage users, and configure policies. Treat them like your AWS root credentials.
To retrieve them later:
# Access key
kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootUser}' | base64 --decode
# Secret key
kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootPassword}' | base64 --decode
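The base64 --decode step is needed because Kubernetes stores every secret value base64-encoded, and jsonpath returns the stored (encoded) form. A minimal illustration of the round-trip, independent of the cluster:

```shell
# What kubectl returns for a secret field is the base64-encoded value...
ENCODED=$(printf 'admin' | base64)
echo "$ENCODED"                              # YWRtaW4=

# ...so you decode it to recover the original string
DECODED=$(printf '%s' "$ENCODED" | base64 --decode)
echo "$DECODED"                              # admin
```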
Deploy with Helm
Create a minio-values.yaml file:
image:
repository: quay.io/minio/minio
# We pin to this version because later releases removed key features from the
# console (API key generation, certain admin functions). This version includes
# the full-featured web UI.
tag: RELEASE.2025-04-22T22-12-26Z
mode: standalone
replicas: 1
persistence:
enabled: true
size: 500Gi
service:
type: LoadBalancer
port: 9000
consoleService:
type: LoadBalancer
port: 80
resources:
requests:
memory: 2Gi
existingSecret: minio-credentials
This values file configures: standalone mode (single node), 500GB of persistent storage, LoadBalancer services for network access, and a reference to our credentials secret. Helm merges these values with the chart’s defaults—we only specify what we want to override.
Install MinIO:
helm install minio minio/minio \
  --namespace data \
--values minio-values.yaml \
--version 5.4.0
Verify the Installation
# Check pod status
kubectl get pods -n data -l app=minio
# Get LoadBalancer IPs
kubectl get svc -n data | grep minio
You should see two services with external IPs assigned. Open the Console IP in your browser to access MinIO’s web interface.
Installing Dagster
Dagster is an asset-based orchestration platform that thinks in terms of data products rather than tasks. It’s ideal for data pipelines where the focus is on what data exists, not just what jobs run.
Add the Helm Repository
helm repo add dagster https://dagster-io.github.io/helm
helm repo update
Create Credentials for MinIO Access
Dagster needs MinIO credentials to store logs and access data:
MINIO_ACCESS_KEY=$(kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootUser}' | base64 --decode)
MINIO_SECRET_KEY=$(kubectl get secret minio-credentials -n data \
-o jsonpath='{.data.rootPassword}' | base64 --decode)
kubectl create secret generic dagster-aws-credentials \
--from-literal=AWS_ACCESS_KEY_ID=$MINIO_ACCESS_KEY \
--from-literal=AWS_SECRET_ACCESS_KEY=$MINIO_SECRET_KEY \
-n pipeline
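For the credentials to actually reach your pipeline code, this secret has to be injected into Dagster’s user-code pods. A sketch of how that could look in dagster-values.yaml, assuming the chart’s dagster-user-deployments section with its envSecrets field (verify the exact keys with helm show values dagster/dagster for your chart version); the deployment name and image are hypothetical placeholders:

```yaml
dagster-user-deployments:
  enabled: true
  deployments:
    - name: my-pipeline                         # hypothetical deployment name
      image:
        repository: registry.local/my-pipeline  # hypothetical image
        tag: latest
      dagsterApiGrpcArgs:
        - "--python-file"
        - "/opt/dagster/app/definitions.py"
      port: 3030
      envSecrets:
        - name: dagster-aws-credentials  # injects AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
```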
Deploy with Helm
Create a dagster-values.yaml file:
dagsterWebserver:
service:
type: LoadBalancer
port: 80
dagsterDaemon:
enabled: true
postgresql:
enabled: true
persistence:
enabled: true
size: 10Gi
runLauncher:
type: K8sRunLauncher
Dagster requires PostgreSQL for metadata storage. Here we use the bundled PostgreSQL that ships with the Helm chart. In larger setups, you might deploy a shared PostgreSQL instance in your data namespace and configure Dagster to use it instead.
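If you later move to a shared PostgreSQL, the switch is a few lines in the same values file. A sketch, assuming the chart’s standard postgresql.* keys (check helm show values dagster/dagster for your version); the host, user, and database names are hypothetical:

```yaml
postgresql:
  enabled: false                                    # don't deploy the bundled instance
  postgresqlHost: postgres.data.svc.cluster.local   # hypothetical shared instance
  postgresqlUsername: dagster
  postgresqlPassword: ""                            # prefer sourcing this from a secret per the chart docs
  postgresqlDbName: dagster
```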
Install Dagster:
helm install dagster dagster/dagster \
--namespace pipeline \
--values dagster-values.yaml \
--version 1.12.8
Verify the Installation
# Watch pods come up (takes 2-3 minutes)
kubectl get pods -n pipeline -w
# Get the web UI URL (the chart names the service dagster-dagster-webserver)
kubectl get svc dagster-dagster-webserver -n pipeline
Expected pods:
- dagster-daemon - Runs schedules and sensors
- dagster-webserver - Web UI
- dagster-postgresql - Metadata database
The user deployments pod may show errors initially—that’s expected until we deploy pipeline code, which we’ll do in a later chapter.
Installing Prometheus and Grafana
MicroK8s includes a managed observability addon that bundles Prometheus, Grafana, and Alertmanager with sensible defaults.
microk8s enable observability
This installs everything in the observability namespace. Wait for the pods to be ready:
kubectl get pods -n observability -w
Access Grafana
By default, Grafana uses a ClusterIP service. Patch it to LoadBalancer so it’s accessible from your network:
kubectl patch svc kube-prom-stack-grafana -n observability -p '{"spec": {"type": "LoadBalancer"}}'
Verify the external IP was assigned:
kubectl get svc -n observability | grep grafana
Get the admin password:
kubectl get secret kube-prom-stack-grafana -n observability \
-o jsonpath="{.data.admin-password}" | base64 --decode
Default username is admin. Change the password after first login.
Collecting Your Service URLs
Create a helper script to display all your service endpoints:
#!/bin/bash
echo "=== Service URLs ==="
echo "MinIO API: http://$(kubectl get svc minio -n data -o jsonpath='{.status.loadBalancer.ingress[0].ip}'):9000"
echo "MinIO Console: http://$(kubectl get svc minio-console -n data -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
echo "Dagit UI: http://$(kubectl get svc dagster-dagster-webserver -n pipeline -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
echo "Grafana: http://$(kubectl get svc kube-prom-stack-grafana -n observability -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
Save these URLs—you’ll use them throughout the series.
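One caveat with the raw jsonpath queries above: they return an empty string while an IP is still <pending>, which produces a broken URL. A small wrapper (a sketch; the service and namespace names are the ones used in this chapter) makes that case explicit and builds each URL in one place:

```shell
# Build an http:// URL for a LoadBalancer service; the port argument is optional.
svc_url() {
  local svc=$1 ns=$2 port=$3 ip
  ip=$(kubectl get svc "$svc" -n "$ns" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  if [ -z "$ip" ]; then
    echo "($svc: IP pending)"
    return 1
  fi
  echo "http://${ip}${port:+:$port}"
}

# Usage:
#   svc_url minio data 9000        # MinIO API
#   svc_url minio-console data     # MinIO Console
```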
Verification Checklist
Before moving forward, confirm:
- All pods show Running status (except user deployments, which may error)
- MinIO Console loads in your browser
- Dagit UI loads (even with missing code errors)
- Grafana login works with admin credentials
- All services have external IPs assigned (no <pending> status)
What’s Running Now
You’ve deployed a complete infrastructure stack:
- MinIO ready to store your data lake with S3-compatible APIs
- Dagster ready to orchestrate pipelines with a visual interface
- Prometheus collecting cluster metrics
- Grafana ready for custom dashboards
Each service has its own LoadBalancer IP, accessible from any machine on your network. This is the same architecture pattern used in cloud deployments—you’ve just built it on hardware you control.
What’s Next
With infrastructure in place, we need to design how data flows through the system. Chapter 3 covers the medallion architecture pattern and bucket organization that will structure your data lake.
Building production data infrastructure? We help organizations design and implement modern data platforms. Get in touch to discuss your project.