Monitoring your EKS clusters' audit logs
At the beginning of this year, Falco introduced a game-changing feature: Falco plugins. They allow Falco to monitor and trigger alerts for any kind of event. Since the launch of the new plugin framework, the Falco community has collaborated to create plugins for GitHub, AWS CloudTrail, and Okta. A dedicated plugin has also replaced the way Falco consumes the Audit Logs generated by a K8s API server. With these plugins, Falco covers more aspects of your infrastructure in greater depth and lets you use a single syntax for rules.
For months (okay, maybe years...), our adopters have asked us for a way to monitor K8s Audit Logs. The previous implementation used an internal web server to receive the logs from the Kubernetes API. Although it was functional, installing and managing it across clusters was a very manual process. This method also didn't support clusters managed by cloud providers, such as EKS, AKS, or GKE, because those providers capture the Audit Logs for their own usage and then add them to their log aggregators.
This situation is now solved thanks to the plugin framework, and we're proud to announce the first release of the plugin for EKS Audit Logs!
How it works
AWS captures the Audit Logs and exposes them in the CloudWatch Logs service. We have made libs available to create a clean session with the AWS API and pull logs from the relevant CloudWatch Logs stream. You can reuse these libs for any plugin you'd like to create for any Amazon service.
The plugin is configured as follows:
    plugins:
      - name: k8saudit-eks
        library_path: libk8saudit-eks.so
        init_config:
          region: "us-east-1"
          profile: "default"
          shift: 10
          polling_interval: 10
          use_async: false
          buffer_size: 500
        open_params: "my-cluster"
      - name: json
        library_path: libjson.so
        init_config: ""
    load_plugins: [k8saudit-eks, json]
- profile: The profile to use to create the session; the AWS_PROFILE env var is used if present
- region: The region of your EKS cluster; the AWS_REGION env var is used if present
- use_async: If true, the async extraction optimization is enabled (default: true)
- polling_interval: Polling interval in seconds (default: 5s)
- shift: Time shift into the past, in seconds (default: 1s)
- buffer_size: Buffer size (default: 200)
- open_params: A string containing the name of your EKS cluster (required)
If you run Falco inside an EKS cluster with an OIDC provider set up, you can omit the profile and region parameters and rely on a service account + IAM Role instead (see the official docs).
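To make the shift and polling_interval parameters more concrete, here is a minimal Python sketch of how one polling cycle could be turned into a CloudWatch Logs query. It is not the plugin's actual code: the function name audit_log_query and the windowing logic (each poll covers polling_interval seconds, ending shift seconds in the past) are assumptions made for illustration. The log group name follows the pattern EKS uses for control plane logs, and audit streams carry the kube-apiserver-audit- prefix.

```python
import time


def audit_log_query(cluster_name, shift=1, polling_interval=5, now=None):
    """Build the query parameters for one polling cycle.

    Hypothetical helper: the windowing below is an assumption about how
    `shift` and `polling_interval` could interact, not the plugin's code.
    """
    if not cluster_name:
        raise ValueError("open_params (the EKS cluster name) is required")
    now = time.time() if now is None else now
    end = now - shift               # leave the most recent `shift` seconds alone
    start = end - polling_interval  # one polling interval worth of logs
    return {
        # EKS control plane logs land in this CloudWatch log group
        "logGroupName": f"/aws/eks/{cluster_name}/cluster",
        # Audit log streams are prefixed this way
        "logStreamNamePrefix": "kube-apiserver-audit-",
        # CloudWatch Logs expects milliseconds since the epoch
        "startTime": int(start * 1000),
        "endTime": int(end * 1000),
    }


if __name__ == "__main__":
    # Values taken from the sample configuration above
    print(audit_log_query("my-cluster", shift=10, polling_interval=10))
```

The returned dictionary uses the parameter names of the CloudWatch Logs FilterLogEvents API, so with boto3 installed it could be passed to a logs client as `filter_log_events(**query)`. A larger shift gives CloudWatch more time to receive late events, at the cost of extra detection delay.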
A good thing about Kubernetes is that it brings standards to our industry. Despite a few differences, clusters work the same way and produce logs in the same format. This helps us enormously. By creating the k8saudit plugin, we declared the fields to extract, as well as some default rules, which we can reuse for any plugin that consumes the same Audit Logs. It is a time saver for both developers and adopters.
You can find the proposed default rules here. To use them, just add an alternative required plugin at the beginning of the rules YAML file, like this:
    - required_engine_version: 15
    - required_plugin_versions:
      - name: k8saudit
        version: 0.1.0
        alternatives:
          - name: k8saudit-eks
            version: 0.1.0
Let's try with a dummy rule, just to check it works:
    - required_engine_version: 15
    - required_plugin_versions:
      - name: k8saudit-eks
        version: 0.1.0
    - rule: Dummy rule
      desc: >
        Dummy rule
      condition: >
        ka.verb in (get,create,delete,update)
      output: user=%ka.user.name verb=%ka.verb target=%ka.target.name target.namespace=%ka.target.namespace resource=%ka.target.resource
      priority: WARNING
      source: k8s_audit
      tags: [k8s]
falco -c falco.yaml -r rules/dummy_k8s_audit_rules.yaml
    16:07:42.045023000: Warning user=eks:certificate-controller verb=get target=eks-certificates-controller target.namespace=kube-system resource=configmaps
    Events detected: 1
    Rule counts by severity:
       WARNING: 1
    Triggered rules by rule name:
       Dummy rule: 1
    Syscall event drop monitoring:
       - event drop detected: 0 occurrences
       - num times actions taken: 0
The references to eks prove it works!
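To show where the values in that alert come from, here is a small Python sketch that mimics the dummy rule against a sample audit event. It illustrates the field mapping only and is not how Falco evaluates rules. The sample event is made up to mirror the alert above, and the helper names (extract, dummy_rule_matches, render_output) and the "<NA>" placeholder are choices made for this sketch. Each ka.* field maps to a path in the Kubernetes audit event JSON.

```python
import json

# Made-up event shaped like a Kubernetes audit log entry (audit.k8s.io/v1),
# with values mirroring the alert shown above.
SAMPLE_EVENT = json.loads("""
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "verb": "get",
  "user": {"username": "eks:certificate-controller"},
  "objectRef": {
    "resource": "configmaps",
    "namespace": "kube-system",
    "name": "eks-certificates-controller"
  }
}
""")

# Falco field name -> path inside the audit event JSON
FIELD_PATHS = {
    "ka.verb": ("verb",),
    "ka.user.name": ("user", "username"),
    "ka.target.name": ("objectRef", "name"),
    "ka.target.namespace": ("objectRef", "namespace"),
    "ka.target.resource": ("objectRef", "resource"),
}


def extract(event, field):
    """Walk the JSON path for a field; return a placeholder if it is missing."""
    value = event
    for key in FIELD_PATHS[field]:
        if not isinstance(value, dict) or key not in value:
            return "<NA>"  # placeholder chosen for this sketch
        value = value[key]
    return str(value)


def dummy_rule_matches(event):
    """Equivalent of: ka.verb in (get,create,delete,update)"""
    return extract(event, "ka.verb") in ("get", "create", "delete", "update")


def render_output(event):
    """Equivalent of the rule's output template."""
    return (
        f"user={extract(event, 'ka.user.name')} "
        f"verb={extract(event, 'ka.verb')} "
        f"target={extract(event, 'ka.target.name')} "
        f"target.namespace={extract(event, 'ka.target.namespace')} "
        f"resource={extract(event, 'ka.target.resource')}"
    )


if __name__ == "__main__":
    if dummy_rule_matches(SAMPLE_EVENT):
        print(render_output(SAMPLE_EVENT))
```

Because the rule only constrains the verb, it fires on most API traffic. That suits a smoke test, but a production rule would normally narrow the condition further, for example on ka.target.resource or ka.target.namespace.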
Our tests showed some latency between the arrival of the logs in the CloudWatch Logs stream and their evaluation by Falco. This is more visible with heavily loaded API servers. The solution is to size the nodes where Falco runs accordingly; a minimum size of 2xlarge is a safe option.
The .so artifacts are available here, and the official Helm chart will be updated as soon as the Falco images are. Stay tuned.
With this first plugin for a managed K8s solution, we hope to open the door to new community contributions for other flavors like GKE and AKS. If you need to create a plugin for another AWS service, also take a look at the libs we created to help developers.
You can find us in the Falco community. Please feel free to reach out to us for any questions, suggestions, or even for a friendly chat!
If you would like to find out more about Falco:
- Get started at Falco.org.
- Check out the Falco project on GitHub.
- Get involved in the Falco community.
- Meet the maintainers on the Falco Slack.
- Follow @falco_org on Twitter.