Deepfactor provides a K8s webhook which automatically injects a lightweight, language-agnostic library, referred to as the Deepfactor runtime in this document, into the containers being observed with Deepfactor. This library intercepts relevant events and sends telemetry to the Deepfactor portal for analysis and alert generation. This document describes steps to troubleshoot issues with the Deepfactor K8s webhook and runtime.
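As a quick sanity check that the webhook is registered with the API server at all, you can list the cluster's mutating webhook configurations (a sketch; the exact configuration name depends on your installation, so grep loosely):

```sh
# look for the Deepfactor admission webhook entry
kubectl get mutatingwebhookconfigurations | grep -i deepfactor
```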
If you notice that your Kubernetes pods are not instrumented/mutated with Deepfactor, or instrumented processes are not reporting the expected telemetry, please follow the steps below to collect logs and information that will help Deepfactor support staff debug the issue.
- Check webhook installation and pod status.

```sh
WEBHOOK_NS="df-webhook"
kubectl get pods -n $WEBHOOK_NS
# check pods are running, haven't restarted, and the validation pod has completed successfully
# e.g. no lines should be printed for the command:
kubectl get pods -n $WEBHOOK_NS --no-headers | grep -v 'Completed' | grep -v 'Running'
```
- Check webhook <-> portal connectivity and the cluster & namespace configurations.

```sh
# collect the webhook log
WEBHOOK_PNAME=`kubectl get pods -n $WEBHOOK_NS | grep mutating-webhook | awk '{print $1}'`
kubectl logs $WEBHOOK_PNAME -n $WEBHOOK_NS > $WEBHOOK_PNAME.log
# check for any error lines; investigate any portal communication error first
grep '^E' $WEBHOOK_PNAME.log | grep 'error updating webhook config' | tail
# confirm the webhook was able to retrieve the cluster & namespace configuration from the portal:
# find the last line which says 'Config reloaded' and review the configuration
grep -n 'Config reloaded, config' $WEBHOOK_PNAME.log | tail -n1
```
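If the webhook log shows portal communication errors, it can help to test connectivity from inside the webhook pod. A minimal sketch, assuming the portal is reachable at https://<your-portal-host> (placeholder) and the webhook image ships with wget:

```sh
# replace <your-portal-host> with your Deepfactor portal address
kubectl exec $WEBHOOK_PNAME -n $WEBHOOK_NS -- \
  wget -q -O /dev/null https://<your-portal-host> \
  && echo "portal reachable" || echo "portal unreachable"
```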
- Check if the pod or image is excluded by configuration.

```sh
# inspect the last namespace configuration after 'Config reloaded'
# for pod or image name exclusion patterns
grep -n 'Config reloaded, config' $WEBHOOK_PNAME.log
# look for excluded pod images or names
grep 'ExcludeImageNameRegularExpression' $WEBHOOK_PNAME.log
grep 'ExcludePodNameRegularExpression' $WEBHOOK_PNAME.log
```
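If you find an exclusion pattern in the configuration, you can test it against the pod or image name locally. A small sketch (the pattern below is a made-up example; grep -E uses POSIX ERE, close enough to the webhook's regex syntax for a quick check of simple patterns):

```sh
# hypothetical exclusion regex copied from the 'Config reloaded' line
PATTERN='^transactionhistory-.*'
POD_NAME='transactionhistory-68d7bb76d8-jmbdz'
echo "$POD_NAME" | grep -E "$PATTERN" > /dev/null && echo "excluded" || echo "not excluded"
```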
- Check if the component was successfully registered with the Deepfactor portal.

Check the logs of the df-init-con-0 container in the instrumented pod. Look for alerts, dfctl register success, dfinit-test results, or warnings.

```sh
pod=transactionhistory-68d7bb76d8-jmbdz
ns=myns
kubectl logs $pod -c df-init-con-0 -n $ns > $pod.dfinit-con0.log
# grab a few lines after register
grep -A 5 'dfctl register' $pod.dfinit-con0.log
```
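You can also confirm that the webhook actually mutated the pod by checking whether the Deepfactor init container is present in the pod spec (assuming the injected container is named df-init-con-0, as above):

```sh
# list init containers; expect df-init-con-0 if the pod was mutated
kubectl get pod $pod -n $ns -o jsonpath='{.spec.initContainers[*].name}'; echo
```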
- Log in to the Deepfactor portal UI and locate the application corresponding to the pod of interest. If there are any warnings associated with that application, please take a screenshot.
- Collect the number of restarts, container exit codes and reasons, probes, and resource requests & limits.

```sh
pod=transactionhistory-68d7bb76d8-jmbdz
ns=myns
kubectl describe pod $pod -n $ns > $pod.describe
# check for container "Exit Code"(s) and Reason(s)
grep -A 10 'State:' $pod.describe
# check for probe failures and restart event history
grep -A 99 '^Events:' $pod.describe
# check for resources and probes in the pod spec
grep -A 2 -e 'Requests:' -e 'Limits:' $pod.describe
grep -e 'Readiness:' -e 'Startup:' -e 'Liveness:' $pod.describe
```
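For a compact summary of the same information, a jsonpath query can pull per-container restart counts and last exit codes directly (a sketch using standard kubectl output formatting):

```sh
# container name, restart count, last terminated exit code
kubectl get pod $pod -n $ns -o \
  jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'
```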
Collect the previous container's stdout log if there was a restart. This can include aborts, kill signals, crashes, or exceptions, along with other important context from just before the container exited. Once a container has restarted 3-7 times and a 'running without deepfactor' alert is reported, collect the 'baseline' log for the container for comparison.
```sh
kubectl logs $pod -p -n $ns > $pod.prev.log
tail -n20 $pod.prev.log
# check if the container is running without deepfactor
df_enabled=`kubectl exec -it $pod -n $ns -- sh -c \
  'grep "Container started without Deepfactor" /tmp-df/df-con-*.log.entry > /dev/null && echo .0df'`
# another check for df in pid 1, but not 100% accurate
df_pid_1=`kubectl exec -it $pod -n $ns -- sh -c \
  'grep "libdf\.so" /proc/1/maps > /dev/null && echo .pid1df'`
# collect a baseline log
kubectl logs $pod -n $ns > $pod$df_pid_1$df_enabled.log
ls -l $pod$df_pid_1$df_enabled.log
```
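To run the same "running without Deepfactor" check across every pod in the namespace, a loop like the following can help (a sketch; it assumes each pod has a shell and writes the /tmp-df logs used above):

```sh
for p in `kubectl get pods -n $ns -o name | cut -d/ -f2`; do
  kubectl exec $p -n $ns -- sh -c \
    'grep -q "Container started without Deepfactor" /tmp-df/df-con-*.log.entry' 2>/dev/null \
    && echo "$p: running WITHOUT deepfactor"
done
```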
- Collect runtime logs and enable verbose debug logging (the env settings below enable debug runtime/java logs).

```sh
pod=transactionhistory-68d7bb76d8-jmbdz
ns=myns
# copy the runtime logs out of the pod
# option: add -c <container> if the pod has more than one container
kubectl cp $pod:/tmp-df $pod-tmp-df -n $ns
# check the dfeventd log for connectivity errors and periodic telemetry event counts
# vi $pod-tmp-df/dfeventd-*.log
```
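Before sending the collected /tmp-df directory to support, a quick grep for obvious problems in the dfeventd log can save a round trip:

```sh
# surface connectivity or send errors in the runtime event daemon log
grep -i -e 'error' -e 'fail' $pod-tmp-df/dfeventd-*.log | tail -n 20
```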
Set `DF_DEBUG=true` in the pod env for runtime verbose logging:

```yaml
env:
- name: DF_DEBUG
  value: "true"
```
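If you prefer not to hand-edit YAML, kubectl set env applies the same change to a workload and triggers a rollout (a sketch; <your-deployment> is a placeholder for the workload that owns the pod):

```sh
kubectl set env deployment/<your-deployment> DF_DEBUG=true -n $ns
```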
Set `DF_JAVA_LOG_FILE=/tmp/df.java.log` in the pod env to enable the Java agent (class usage) debug log file.
Set `DF_DEBUG_VERBOSE=true` in the pod env to get a verbose dfeventd-X.log with all telemetry decoded.
- Collect the webhook static-scan pod log and enable verbose debug logging.
```sh
WEBHOOK_NS="df-webhook"
kubectl get pods -n $WEBHOOK_NS
# collect the webhook static scan log
SCAN_PNAME=`kubectl get pods -n $WEBHOOK_NS | grep static-scan | awk '{print $1}'`
kubectl logs $SCAN_PNAME -n $WEBHOOK_NS > $SCAN_PNAME.log
```
Enable debug logging for the webhook and static-scan pods by editing each deployment's env:
```sh
WEBHOOK_DEPLOY=`kubectl get deployments -n $WEBHOOK_NS | grep mutating-webhook | awk '{print $1}'`
kubectl edit deployment $WEBHOOK_DEPLOY -n $WEBHOOK_NS
SCAN_DEPLOY=`kubectl get deployments -n $WEBHOOK_NS | grep static-scan | awk '{print $1}'`
kubectl edit deployment $SCAN_DEPLOY -n $WEBHOOK_NS
```

In each deployment, add to the container env:

```yaml
env:
- name: DF_DEBUG
  value: "true"
```
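Equivalently, the same env change can be applied without an interactive edit, assuming both pods read DF_DEBUG as described above:

```sh
kubectl set env deployment/$WEBHOOK_DEPLOY DF_DEBUG=true -n $WEBHOOK_NS
kubectl set env deployment/$SCAN_DEPLOY DF_DEBUG=true -n $WEBHOOK_NS
```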