Setting Up Fluentd in a Kubernetes Environment with Elasticsearch and Kibana for Production
Table of contents
- Introduction
- Prerequisites
- Why is Elasticsearch Deployed as a StatefulSet?
- Step 1: Deploy Elasticsearch in Kubernetes (Production Setup)
- Step 2: Deploy Kibana in Kubernetes
- Step 3: Deploy Fluentd as a DaemonSet in Kubernetes
- Step 4: Visualizing Logs in Kibana
- Step 5: (Optional) Securing the Setup for Production
- Conclusion
Introduction
In modern Kubernetes environments, log management is crucial for monitoring application health, debugging issues, and ensuring security compliance. One of the most popular logging solutions for Kubernetes is the EFK stack (Elasticsearch, Fluentd, Kibana).
This guide provides a step-by-step approach to setting up Fluentd for log collection in a Kubernetes cluster, while deploying Elasticsearch and Kibana for centralized log storage and visualization in a production-ready environment.
Why Use Fluentd, Elasticsearch, and Kibana in Kubernetes?
1. Fluentd (Log Collector and Forwarder)
Collects logs from Kubernetes pods, services, and nodes.
Processes, filters, and enriches logs before sending them to Elasticsearch.
Lightweight and scales well in production.
2. Elasticsearch (Log Storage and Search Engine)
A distributed and scalable search engine designed to store, index, and query logs efficiently.
Ideal for real-time log analysis.
3. Kibana (Visualization and Monitoring)
Provides a web-based interface for searching, analyzing, and visualizing logs.
Helps DevOps teams monitor system health and detect anomalies.
Prerequisites
Before setting up Fluentd, Elasticsearch, and Kibana, ensure that:
✅ You have a Kubernetes cluster (AWS EKS, GKE, AKS, or a self-hosted cluster).
✅ kubectl is installed and configured to interact with the cluster.
✅ Helm is installed (for easier deployment of Elasticsearch and Kibana).
✅ You have sufficient CPU, memory, and storage resources.
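A quick way to confirm the tooling prerequisites and cluster connectivity is to run:

kubectl version --client
kubectl get nodes
helm version

If kubectl get nodes lists your worker nodes, the cluster is reachable and you are ready to proceed.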
Why is Elasticsearch Deployed as a StatefulSet?
In Kubernetes, StatefulSets are used for stateful applications like Elasticsearch that require:
Stable network identities (predictable pod names for easy clustering).
Persistent storage (data should not be lost if a pod restarts).
Ordered deployments and scaling to avoid data corruption.
How Elasticsearch Works as a StatefulSet
Persistent Storage
Each Elasticsearch pod gets a PersistentVolumeClaim (PVC), ensuring that log data is retained even if a pod restarts.
Example storage paths:
elasticsearch-master-0 → /data/elasticsearch
elasticsearch-master-1 → /data/elasticsearch
elasticsearch-master-2 → /data/elasticsearch
Stable Network Identity
Each pod gets a unique, predictable hostname (e.g., elasticsearch-master-0, elasticsearch-master-1). This helps Elasticsearch nodes discover each other easily.
Ordered Scaling & Updates
Kubernetes ensures that nodes are started in order and shut down gracefully.
This prevents data corruption or split-brain issues in Elasticsearch.
Example: Deploying Elasticsearch as a StatefulSet
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
spec:
  serviceName: "elasticsearch"
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.10.0
          resources:
            requests:
              memory: "2Gi"
              cpu: "1"
          volumeMounts:
            - name: storage
              mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
    - metadata:
        name: storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
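Note that serviceName: "elasticsearch" above must refer to a headless Service, which is what gives each pod its stable DNS name (e.g., elasticsearch-0.elasticsearch). A minimal sketch, assuming the default Elasticsearch ports:

apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
spec:
  clusterIP: None          # headless: stable per-pod DNS names instead of a virtual IP
  selector:
    app: elasticsearch
  ports:
    - name: http           # assumption: default Elasticsearch HTTP port
      port: 9200
    - name: transport      # assumption: default node-to-node transport port
      port: 9300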
Benefits of Using StatefulSets for Elasticsearch
Ensures high availability and data persistence.
Prevents data loss and ensures graceful scaling.
Stable networking for Elasticsearch cluster discovery.
Step 1: Deploy Elasticsearch in Kubernetes (Production Setup)
Elasticsearch requires persistent storage and proper resource allocation in production. We will use Helm to deploy it.
1. Add the Helm Repository
helm repo add elastic https://helm.elastic.co
helm repo update
2. Deploy Elasticsearch with Helm
helm install elasticsearch elastic/elasticsearch \
--set replicas=3 \
--set minimumMasterNodes=2 \
--set persistence.enabled=true \
--set resources.requests.cpu=1 \
--set resources.requests.memory=2Gi
Configuration Explanation:
replicas=3: Deploys a 3-node cluster (production-ready setup).
minimumMasterNodes=2: Ensures high availability.
persistence.enabled=true: Enables persistent storage for logs.
resources.requests.cpu=1, memory=2Gi: Allocates sufficient resources.
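If you prefer to keep these settings in version control instead of passing them on the command line, the same flags can be expressed as a values file. A sketch, assuming the value names used above and a hypothetical file name values-elasticsearch.yaml:

# values-elasticsearch.yaml
replicas: 3
minimumMasterNodes: 2
persistence:
  enabled: true
resources:
  requests:
    cpu: "1"
    memory: "2Gi"

Then install with:

helm install elasticsearch elastic/elasticsearch -f values-elasticsearch.yaml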
3. Verify Elasticsearch Deployment
kubectl get pods -n default -l app=elasticsearch
4. Expose Elasticsearch (Optional for External Access)
kubectl port-forward svc/elasticsearch-master 9200:9200
Now, you can access Elasticsearch at http://localhost:9200.
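With the port-forward active, you can also confirm that the cluster has formed correctly. A healthy three-node cluster should report a green (or at least yellow) status:

curl "http://localhost:9200/_cluster/health?pretty"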
Step 2: Deploy Kibana in Kubernetes
Kibana will connect to Elasticsearch to visualize logs.
1. Deploy Kibana Using Helm
helm install kibana elastic/kibana \
--set service.type=ClusterIP
2. Verify Kibana Deployment
kubectl get pods -l app=kibana
3. Expose Kibana UI (Port Forwarding for Testing)
kubectl port-forward svc/kibana-kibana 5601:5601
Now, you can access Kibana at http://localhost:5601.
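For production you would normally expose Kibana through an Ingress rather than port forwarding. A minimal sketch, assuming an NGINX ingress controller and a placeholder hostname (both are assumptions to adapt to your environment):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
spec:
  ingressClassName: nginx            # assumption: an NGINX ingress controller is installed
  rules:
    - host: kibana.example.com       # assumption: replace with your own domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kibana-kibana  # Service name created by the Kibana Helm chart
                port:
                  number: 5601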
Step 3: Deploy Fluentd as a DaemonSet in Kubernetes
Fluentd will run as a DaemonSet, ensuring that logs from all nodes in the cluster are collected and forwarded to Elasticsearch.
1. Create a Fluentd Configuration File
Create a ConfigMap for Fluentd configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config
  namespace: kube-system
data:
  fluentd.conf: |
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>
    <match kubernetes.**>
      @type elasticsearch
      host elasticsearch-master
      port 9200
      logstash_format true
      logstash_prefix kubernetes-logs
      flush_interval 5s
    </match>
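Optionally, the fluentd-kubernetes-daemonset image bundles the kubernetes_metadata filter plugin. Adding a filter block like the following to fluentd.conf, between the <source> and <match> sections, enriches each record with pod, namespace, and label metadata (a sketch; it assumes the plugin is present in your image):

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>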
2. Apply the ConfigMap to the Cluster
kubectl apply -f fluentd-config.yaml
3. Deploy Fluentd DaemonSet
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      serviceAccountName: fluentd
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.15-debian-elasticsearch7
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch-master"
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: config-volume
              mountPath: /etc/fluent/config.d
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: config-volume
          configMap:
            name: fluentd-config
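The DaemonSet references serviceAccountName: fluentd, so that service account must exist; if you use the kubernetes_metadata filter it also needs read access to pods and namespaces. A minimal sketch:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system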
4. Apply Fluentd DaemonSet to the Cluster
kubectl apply -f fluentd-daemonset.yaml
5. Verify Fluentd Logs
kubectl logs -l name=fluentd -n kube-system
If Fluentd is running correctly, it should show log data being forwarded to Elasticsearch.
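You can also verify that Fluentd has started creating indices in Elasticsearch (with the port-forward from Step 1 still active); index names follow the kubernetes-logs prefix configured earlier:

curl "http://localhost:9200/_cat/indices/kubernetes-logs-*?v"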
Step 4: Visualizing Logs in Kibana
1. Access Kibana
If Kibana is not exposed externally, you can use port forwarding:
kubectl port-forward svc/kibana-kibana 5601:5601
Then, open localhost:5601 in your browser.
2. Configure Kibana to Read Logs from Elasticsearch
In Kibana, navigate to Management → Stack Management → Index Patterns.
Click Create Index Pattern and enter kubernetes-logs-* (same as defined in the Fluentd config).
Select the @timestamp field and save.
3. Explore Kubernetes Logs
Go to Discover in Kibana.
Filter logs using pod names, namespaces, or error messages.
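For example, if you added the kubernetes_metadata filter described in Step 3, you can query the enriched fields with KQL in the Discover search bar (the field names are an assumption and depend on your Fluentd configuration):

kubernetes.namespace_name : "kube-system" and log : *error*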
Step 5: (Optional) Securing the Setup for Production
For a secure and production-ready setup, you should:
Enable authentication and role-based access control (RBAC) for Elasticsearch and Kibana.
Use persistent storage (PVCs) for Elasticsearch data.
Enable TLS encryption for Elasticsearch and Fluentd communications.
Set up log retention policies in Elasticsearch.
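As an example of the last point, Elasticsearch index lifecycle management (ILM) can delete old indices automatically. A sketch that removes indices seven days after creation (the policy name and retention period are assumptions):

curl -X PUT "http://localhost:9200/_ilm/policy/kubernetes-logs-retention" \
  -H 'Content-Type: application/json' \
  -d '{
    "policy": {
      "phases": {
        "hot":    { "actions": {} },
        "delete": { "min_age": "7d", "actions": { "delete": {} } }
      }
    }
  }'

The policy then has to be attached to the kubernetes-logs-* indices, for example through an index template that sets index.lifecycle.name.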
Conclusion
By deploying Fluentd as a DaemonSet, Elasticsearch for storage, and Kibana for visualization, we achieve a scalable, centralized logging system for Kubernetes clusters.
🚀 Benefits of This Setup:
✅ Real-time log aggregation across Kubernetes nodes.
✅ Advanced search & filtering for troubleshooting.
✅ Scalable & production-ready architecture.
✅ Integrated with Kibana for visualization.
With this setup, your team can monitor, analyze, and troubleshoot logs efficiently in any Kubernetes production environment! 🚀