1. Envoy Threat Model
refer to: threat_model
Threat modelling is the process of:
- Identifying and enumerating threats and vulnerabilities
- Devising mitigations
- Prioritising residual risks
- Escalating the most important risks
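The steps above can be sketched as a small risk register. This is only an illustration: the threat entries, the 1 to 5 likelihood and impact scales, the likelihood-times-impact score, and the escalation cutoff are all assumptions made for the example, not part of Envoy's published threat model.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    name: str
    likelihood: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)
    mitigated: bool = False  # True once a mitigation is in place

    @property
    def score(self) -> int:
        # Simple likelihood x impact score; an assumed scheme for illustration.
        return self.likelihood * self.impact


def residual_risks(threats):
    """Return the unmitigated threats, highest score first."""
    return sorted((t for t in threats if not t.mitigated),
                  key=lambda t: t.score, reverse=True)


def to_escalate(threats, cutoff=15):
    """Return residual risks whose score meets the (assumed) cutoff."""
    return [t for t in residual_risks(threats) if t.score >= cutoff]


# Hypothetical entries, invented for this example.
threats = [
    Threat("admin interface exposed to the network", 3, 5, mitigated=True),
    Threat("memory exhaustion from many connections", 4, 4),
    Threat("spoofed x-forwarded-for header", 3, 3),
]

for threat in to_escalate(threats):
    print(f"escalate: {threat.name} (score {threat.score})")
```

Mitigated threats drop out of the register, the rest are ordered by score, and only those at or above the cutoff are escalated.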
Why Threat Model?
- Identify security flaws early
- Save money and avoid time-consuming redesigns
- Focus your security requirements
- Identify complex risks and data flows for critical assets
2. Configuration Best Practices
refer to: best_practices/edge
An example of running Envoy with a secure configuration:
# https://github.com/solo-io/hoot/blob/master/03-security/edge.yaml
admin:
  # access log to admin interface
  access_log_path: "/tmp/envoy_admin.log"
  address:
    socket_address:
      address: 127.0.0.1 # only the local machine can access the admin interface
      port_value: 9090
# overload manager: protect envoy from overloading system resources
overload_manager:
  # Interval to check the resources
  refresh_interval: 0.25s
  resource_monitors:
  - name: "envoy.resource_monitors.fixed_heap"
    typed_config:
      "@type": type.googleapis.com/envoy.config.resource_monitor.fixed_heap.v2alpha.FixedHeapConfig
      # Defining the max heap size (tune to your system) - the thresholds below are percentages of
      # this value. For example - in kubernetes this should match your pod's memory limit
      max_heap_size_bytes: 2147483648 # 2 GiB
  actions:
  # When using 95% of the heap limit defined above, try to shrink it
  - name: "envoy.overload_actions.shrink_heap"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: 0.95
  # When reaching 98% of the heap limit defined above, stop accepting new requests
  - name: "envoy.overload_actions.stop_accepting_requests"
    triggers:
    - name: "envoy.resource_monitors.fixed_heap"
      threshold:
        value: 0.98
static_resources:
  listeners:
  - name: example_listener_name
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 8443
    listener_filters:
    - name: "envoy.filters.listener.tls_inspector"
      typed_config: {}
    # Number of bytes buffered in the connection buffer.
    per_connection_buffer_limit_bytes: 32768 # 32 KiB
    filter_chains:
    - filter_chain_match:
        server_names: ["example.com", "www.example.com"]
      transport_socket:
        name: envoy.transport_sockets.tls
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
          common_tls_context:
            tls_certificates:
            - certificate_chain: { filename: "example_com_cert.pem" }
              private_key: { filename: "example_com_key.pem" }
      filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          # Use the peer's address as the client address (i.e. don't use XFF header to determine client address)
          # see more info here: https://www.envoyproxy.io/docs/envoy/v1.15.0/configuration/http/http_conn_man/headers#x-forwarded-for
          use_remote_address: true
          common_http_protocol_options:
            idle_timeout: 3600s # 1 hour
            # Reject requests that have headers with underscores, to protect upstreams that convert
            # them to hyphens
            headers_with_underscores_action: REJECT_REQUEST
          http2_protocol_options:
            # max http2 streams at the same time
            max_concurrent_streams: 100
            # the buffer size for each stream
            initial_stream_window_size: 65536 # 64 KiB
            # the buffer size for the connection
            initial_connection_window_size: 1048576 # 1 MiB
          stream_idle_timeout: 300s # 5 mins, must be disabled for long-lived and streaming requests
          request_timeout: 300s # 5 mins, must be disabled for long-lived and streaming requests
          route_config:
            virtual_hosts:
            - name: default
              domains: ["*"]
              routes:
              - match: { prefix: "/" }
                route:
                  cluster: service_foo
                  idle_timeout: 15s # must be disabled for long-lived and streaming requests
          http_filters:
          - name: envoy.filters.http.router
  clusters:
  - name: service_foo
    connect_timeout: 15s
    type: STATIC
    lb_policy: ROUND_ROBIN
    per_connection_buffer_limit_bytes: 32768 # 32 KiB
    load_assignment:
      cluster_name: service_foo
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.1
                port_value: 8083
    http2_protocol_options:
      initial_stream_window_size: 65536 # 64 KiB
      initial_connection_window_size: 1048576 # 1 MiB
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
        sni: example.com
layered_runtime:
  # use runtime to limit number of connections to prevent file descriptor exhaustion
  layers:
  - name: static_layer_0
    static_layer:
      envoy:
        resource_limits:
          listener:
            example_listener_name:
              connection_limit: 10000
      overload:
        global_downstream_max_connections: 50000 # maximum number of downstream connections envoy allows
- In the admin section, binding to 127.0.0.1 means only the local machine can access the admin interface.
- In the overload manager section, Envoy monitors resource usage to protect itself from being overloaded (DDoS protection).
- In the HTTP connection manager section, use_remote_address means Envoy uses the connection's real remote address as the client address and appends it to the x-forwarded-for header (use case: matching the same client for rate limiting).
- In the layered_runtime section, global limits cap the number of connections Envoy accepts, which prevents file descriptor exhaustion.
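To make the overload manager thresholds concrete, the sketch below converts the fractions from the config into byte values. The helper name `heap_thresholds` and the use of whole percentages are assumptions for illustration; Envoy performs this comparison internally.

```python
GIB = 1024 ** 3


def heap_thresholds(max_heap_size_bytes, percents):
    """Map each action name to the heap size (bytes) at which it triggers.

    `percents` holds whole percentages (95 for a 0.95 threshold) so the
    arithmetic stays in integers.
    """
    return {action: max_heap_size_bytes * pct // 100
            for action, pct in percents.items()}


thresholds = heap_thresholds(
    2 * GIB,  # max_heap_size_bytes from the config above
    {
        "envoy.overload_actions.shrink_heap": 95,
        "envoy.overload_actions.stop_accepting_requests": 98,
    },
)
for action, limit in thresholds.items():
    print(f"{action}: {limit} bytes ({limit / GIB:.2f} GiB)")
```

With a 2 GiB maximum heap, Envoy starts shrinking the heap at about 1.9 GiB and stops accepting new requests at about 1.96 GiB. The thresholds scale with max_heap_size_bytes, so that value should be tuned to the memory actually available to the process.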
3. Hardening
Some Hardening Tips:
- Use a minimal image, with no sharp tools (curl, gcc, ssh, etc.)
- Envoy doesn’t write files unless you tell it to, so set readOnlyRootFilesystem
- Remove capabilities
- Non-root user
- PIE (to allow ASLR)
An example of running Envoy with a hardened configuration:
# https://github.com/solo-io/hoot/blob/master/03-security/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: gloo
    gateway-proxy-id: gateway-proxy
    gloo: gateway-proxy
  name: gateway-proxy
  namespace: gloo-system
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      gateway-proxy-id: gateway-proxy
      gloo: gateway-proxy
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "8081"
        prometheus.io/scrape: "true"
      labels:
        gateway-proxy: live
        gateway-proxy-id: gateway-proxy
        gloo: gateway-proxy
    spec:
      containers:
      - args:
        - --disable-hot-restart
        env:
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        image: quay.io/solo-io/gloo-ee-envoy-wrapper:1.4.8
        imagePullPolicy: IfNotPresent
        name: gateway-proxy
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        - containerPort: 8443
          name: https
          protocol: TCP
        resources:
          requests:
            cpu: 500m
            memory: 256Mi
          limits:
            cpu: 1000m
            memory: 1Gi
        securityContext:
          allowPrivilegeEscalation: false
          runAsUser: 10101
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - ALL
          readOnlyRootFilesystem: true
        volumeMounts:
        - mountPath: /etc/envoy
          name: envoy-config
      dnsPolicy: ClusterFirst
      serviceAccountName: gateway-proxy
      volumes:
      - configMap:
          defaultMode: 420
          name: gateway-proxy-envoy-config
        name: envoy-config
- Set resource requests and limits.
- In securityContext, run as a non-root user, disallow privilege escalation, and use a read-only root file system.
- Drop all capabilities except NET_BIND_SERVICE (which allows Envoy to bind to privileged ports, i.e. ports below 1024).
- Use a separate, dedicated service account.
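The container-level points above can be checked mechanically. The following is a minimal sketch, assuming the container spec has already been loaded into a plain Python dict (for example, from the Deployment YAML). The function name `hardening_findings` is hypothetical, and the checks cover only the settings discussed in these notes.

```python
def hardening_findings(container):
    """Return a list of hardening problems found in a container spec dict."""
    sc = container.get("securityContext", {})
    caps = sc.get("capabilities", {})
    findings = []

    if sc.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation should be false")
    if not sc.get("readOnlyRootFilesystem"):
        findings.append("readOnlyRootFilesystem should be true")
    # UID 0 is root; a missing runAsUser falls back to the image's user.
    if sc.get("runAsUser") in (None, 0):
        findings.append("runAsUser should be a non-zero UID")
    if "ALL" not in caps.get("drop", []):
        findings.append("capabilities.drop should include ALL")
    extra = set(caps.get("add", [])) - {"NET_BIND_SERVICE"}
    if extra:
        findings.append(f"unexpected added capabilities: {sorted(extra)}")
    if "limits" not in container.get("resources", {}):
        findings.append("resource limits should be set")
    return findings


# Container settings copied from the Deployment above.
gateway_proxy = {
    "resources": {
        "requests": {"cpu": "500m", "memory": "256Mi"},
        "limits": {"cpu": "1000m", "memory": "1Gi"},
    },
    "securityContext": {
        "allowPrivilegeEscalation": False,
        "runAsUser": 10101,
        "capabilities": {"add": ["NET_BIND_SERVICE"], "drop": ["ALL"]},
        "readOnlyRootFilesystem": True,
    },
}
print(hardening_findings(gateway_proxy))
```

An empty list means the container passes every check in the sketch. A check like this could run in CI against rendered manifests, though a policy engine would be the more usual choice in a real cluster.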