云原生零信任架构设计

零信任（Zero Trust）是一种安全模型，基于"永不信任，始终验证"的原则，要求对每个访问请求进行验证，无论其来源位置。在云原生环境中，零信任架构为动态、分布式的微服务提供了强大的安全保障。

🎯 零信任核心原则

基础理念和架构

零信任核心原则零信任架构框架

yaml

zero_trust_principles:
  never_trust_always_verify:
    description: "永不信任，始终验证"
    implementation:
      - "移除网络边界信任假设"
      - "对所有连接进行身份验证"
      - "基于身份而非网络位置授权"
      - "持续监控和验证访问"
    
    traditional_vs_zero_trust:
      traditional_model:
        trust_boundary: "网络边界（防火墙）"
        internal_trust: "内网默认信任"
        security_focus: "边界防护"
        access_control: "基于网络位置"
      
      zero_trust_model:
        trust_boundary: "每个连接点"
        internal_trust: "内网默认不信任"
        security_focus: "身份和数据保护"
        access_control: "基于身份和上下文"
  
  least_privilege_access:
    description: "最小权限访问"
    principles:
      - "仅授予执行任务所需的最小权限"
      - "基于时间和上下文的动态权限"
      - "定期审查和调整权限"
      - "自动化权限管理"
    
    implementation_levels:
      user_level: "用户最小权限原则"
      service_level: "服务间最小权限通信"
      data_level: "数据访问最小化"
      network_level: "网络连接最小化"
  
  assume_breach:
    description: "假设已被入侵"
    strategies:
      containment: "限制横向移动"
      detection: "快速检测异常行为"
      response: "自动化事件响应"
      recovery: "快速恢复和隔离"
    
    design_implications:
      - "微分段网络设计"
      - "端到端加密"
      - "行为监控和分析"
      - "自动化隔离机制"
  
  verify_explicitly:
    description: "显式验证"
    verification_factors:
      identity: "身份认证"
      device: "设备信任状态"
      location: "访问位置"
      behavior: "行为模式"
      risk: "风险评估"
    
    multi_factor_verification: |
      # 多因素验证示例
      verification_policy:
        identity_verification:
          - "多因素认证(MFA)"
          - "证书认证"
          - "生物识别"
        
        device_verification:
          - "设备证书"
          - "设备合规性检查"
          - "设备信任评分"
        
        contextual_verification:
          - "访问时间验证"
          - "地理位置验证"
          - "行为模式分析"

yaml

zero_trust_architecture:
  control_plane:
    description: "控制平面组件"
    components:
      policy_engine:
        responsibility: "策略决策制定"
        functions:
          - "访问策略评估"
          - "风险分析计算"
          - "决策记录和审计"
        
        implementation: |
          # Open Policy Agent (OPA) 策略引擎
          apiVersion: v1
          kind: ConfigMap
          metadata:
            name: opa-policies
            namespace: zero-trust-system
          data:
            access-policy.rego: |
              package authz
              
              default allow = false
              
              # 允许经过身份验证的用户访问其命名空间资源
              allow {
                  input.identity.authenticated == true
                  input.resource.namespace == input.identity.namespace
                  input.operation in ["get", "list", "watch"]
              }
              
              # 管理员可以访问所有资源
              allow {
                  input.identity.roles[_] == "admin"
                  input.identity.authenticated == true
              }
              
              # 基于时间的访问控制
              allow {
                  input.identity.authenticated == true
                  is_business_hours
                  input.resource.sensitivity != "high"
              }
              
              is_business_hours {
                  now := time.now_ns()
                  hour := time.weekday(now)[1]
                  hour >= 9
                  hour <= 17
              }
      
      policy_administrator:
        responsibility: "策略配置和管理"
        functions:
          - "策略生命周期管理"
          - "策略版本控制"
          - "策略验证和测试"
        
        configuration: |
          # 策略管理配置
          apiVersion: security.io/v1
          kind: PolicyConfiguration
          metadata:
            name: zero-trust-policies
          spec:
            policyEngine: "opa"
            policyRepository: "git://policies.example.com/zero-trust"
            updateStrategy:
              type: "RollingUpdate"
              validation: true
              testSuite: true
            
            policies:
            - name: "identity-verification"
              version: "v1.2.0"
              scope: "cluster"
              enforcement: "strict"
            
            - name: "network-access"
              version: "v1.1.0"
              scope: "namespace"
              enforcement: "warn"
  
  data_plane:
    description: "数据平面组件"
    components:
      policy_enforcement_point:
        description: "策略执行点"
        implementations:
          service_mesh_proxy: |
            # Istio/Envoy 作为PEP
            apiVersion: security.istio.io/v1beta1
            kind: AuthorizationPolicy
            metadata:
              name: zero-trust-authz
              namespace: production
            spec:
              selector:
                matchLabels:
                  app: web-app
              rules:
              - from:
                - source:
                    principals: ["cluster.local/ns/production/sa/frontend"]
                to:
                - operation:
                    methods: ["GET", "POST"]
                when:
                - key: source.ip
                  values: ["10.0.0.0/16"]
                - key: request.headers[x-user-role]
                  values: ["user", "admin"]
          
          ingress_controller: |
            # NGINX Ingress with OPA
            apiVersion: networking.k8s.io/v1
            kind: Ingress
            metadata:
              name: zero-trust-ingress
              annotations:
                nginx.ingress.kubernetes.io/auth-url: "http://opa-service.auth-system.svc.cluster.local:8181/v1/data/authz/allow"
                nginx.ingress.kubernetes.io/auth-response-headers: "X-User,X-Roles,X-Risk-Score"
            spec:
              rules:
              - host: app.example.com
                http:
                  paths:
                  - path: /
                    pathType: Prefix
                    backend:
                      service:
                        name: web-app
                        port:
                          number: 80
          
          api_gateway: |
            # API Gateway 策略执行
            apiVersion: gateway.networking.k8s.io/v1
            kind: HTTPRoute
            metadata:
              name: zero-trust-route
            spec:
              parentRefs:
              - name: secure-gateway
              rules:
              - matches:
                - path:
                    type: PathPrefix
                    value: /api/
                filters:
                - type: ExtensionRef
                  extensionRef:
                    group: security.example.com
                    kind: ZeroTrustPolicy
                    name: api-access-policy
                backendRefs:
                - name: api-service
                  port: 8080
  
  trust_algorithms:
    description: "信任算法和评分"
    risk_scoring:
      factors: |
        # 风险评分因子
        risk_factors:
          identity_factors:
            authentication_method: 
              weight: 0.3
              values:
                mfa: 0.1
                certificate: 0.2
                password_only: 0.8
            
            identity_freshness:
              weight: 0.2
              calculation: "time_since_last_auth / max_session_time"
          
          device_factors:
            device_compliance:
              weight: 0.25
              values:
                compliant: 0.1
                partially_compliant: 0.5
                non_compliant: 0.9
            
            device_reputation:
              weight: 0.15
              calculation: "historical_risk_score + anomaly_score"
          
          contextual_factors:
            location_risk:
              weight: 0.1
              values:
                trusted_location: 0.1
                unknown_location: 0.7
                high_risk_location: 0.9
            
            time_based_risk:
              weight: 0.1
              calculation: "outside_business_hours ? 0.3 : 0.1"
            
            behavioral_anomaly:
              weight: 0.2
              calculation: "ml_anomaly_score"
    
    dynamic_trust_adjustment: |
      # 动态信任调整算法
      trust_adjustment_policy:
        triggers:
          - "failed_authentication_attempts > 3"
          - "unusual_access_pattern_detected"
          - "device_compliance_degraded"
          - "high_risk_location_access"
        
        actions:
          step_up_authentication:
            condition: "risk_score > 0.6"
            action: "require_additional_mfa"
          
          access_restriction:
            condition: "risk_score > 0.8"
            action: "restrict_to_read_only"
          
          session_termination:
            condition: "risk_score > 0.9"
            action: "terminate_all_sessions"
          
          continuous_monitoring:
            condition: "risk_score > 0.5"
            action: "increase_monitoring_frequency"

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230

🏗️ 云原生零信任实现

服务网格零信任

Istio零信任配置网络微分段

yaml

istio_zero_trust:
  mutual_tls_configuration:
    strict_mtls: |
      # 严格mTLS策略
      apiVersion: security.istio.io/v1beta1
      kind: PeerAuthentication
      metadata:
        name: default
        namespace: zero-trust-system
      spec:
        mtls:
          mode: STRICT
      
      ---
      # 命名空间级别mTLS
      apiVersion: security.istio.io/v1beta1
      kind: PeerAuthentication
      metadata:
        name: namespace-policy
        namespace: production
      spec:
        mtls:
          mode: STRICT
        portLevelMtls:
          8080:
            mode: STRICT
          9090:
            mode: PERMISSIVE  # Prometheus监控端口
    
    certificate_management: |
      # 证书自动轮换配置
      apiVersion: install.istio.io/v1alpha1
      kind: IstioOperator
      metadata:
        name: control-plane
      spec:
        values:
          pilot:
            env:
              # 证书有效期设置
              DEFAULT_WORKLOAD_CERT_TTL: "24h"
              MAX_WORKLOAD_CERT_TTL: "24h"
              # 根证书轮换
              ROOT_CA_CERT_TTL: "8760h"  # 1年
        
        components:
          pilot:
            k8s:
              env:
              - name: CITADEL_ENABLE_WORKLOAD_CERT_TTL
                value: "true"
              - name: CITADEL_WORKLOAD_CERT_TTL
                value: "24h"
  
  authorization_policies:
    service_to_service: |
      # 服务间访问控制
      apiVersion: security.istio.io/v1beta1
      kind: AuthorizationPolicy
      metadata:
        name: frontend-to-backend
        namespace: production
      spec:
        selector:
          matchLabels:
            app: backend-service
        rules:
        - from:
          - source:
              principals: ["cluster.local/ns/production/sa/frontend-service"]
        - to:
          - operation:
              methods: ["GET", "POST"]
              paths: ["/api/v1/*"]
        - when:
          - key: source.ip
            values: ["10.244.0.0/16"]  # Pod CIDR
          - key: request.headers[x-request-id]
            notValues: [""]
    
    user_to_service: |
      # 用户到服务访问控制
      apiVersion: security.istio.io/v1beta1
      kind: AuthorizationPolicy
      metadata:
        name: user-access-policy
        namespace: production
      spec:
        selector:
          matchLabels:
            app: web-app
        rules:
        - from:
          - source:
              requestPrincipals: ["https://auth.example.com/users/*"]
        - to:
          - operation:
              methods: ["GET", "POST", "PUT", "DELETE"]
        - when:
          - key: request.auth.claims[role]
            values: ["user", "admin"]
          - key: request.auth.claims[email_verified]
            values: ["true"]
    
    external_service_access: |
      # 外部服务访问控制
      apiVersion: security.istio.io/v1beta1
      kind: AuthorizationPolicy
      metadata:
        name: external-api-access
        namespace: production
      spec:
        rules:
        - to:
          - operation:
              hosts: ["external-api.example.com"]
              methods: ["GET", "POST"]
        - when:
          - key: source.labels[app]
            values: ["trusted-app"]
          - key: source.labels[version]
            values: ["v1.0", "v1.1"]
  
  traffic_encryption:
    end_to_end_encryption: |
      # 端到端加密配置
      apiVersion: networking.istio.io/v1beta1
      kind: DestinationRule
      metadata:
        name: backend-encryption
        namespace: production
      spec:
        host: backend-service.production.svc.cluster.local
        trafficPolicy:
          tls:
            mode: ISTIO_MUTUAL
            minProtocolVersion: TLSV1_3
            maxProtocolVersion: TLSV1_3
            cipherSuites:
            - ECDHE-ECDSA-AES256-GCM-SHA384
            - ECDHE-RSA-AES256-GCM-SHA384
        
        portLevelSettings:
        - port:
            number: 8080
          tls:
            mode: ISTIO_MUTUAL
            sni: backend-service.production.svc.cluster.local

yaml

network_microsegmentation:
  calico_zero_trust:
    default_deny: |
      # Calico默认拒绝策略
      apiVersion: projectcalico.org/v3
      kind: GlobalNetworkPolicy
      metadata:
        name: default-deny-all
      spec:
        order: 1000
        selector: all()
        types:
        - Ingress
        - Egress
        # 默认拒绝所有流量
    
    zone_based_policies: |
      # 基于区域的网络策略
      apiVersion: projectcalico.org/v3
      kind: NetworkPolicy
      metadata:
        name: dmz-to-app-tier
        namespace: production
      spec:
        order: 100
        selector: zone == "app-tier"
        types:
        - Ingress
        
        ingress:
        # 允许DMZ区域访问应用层
        - action: Allow
          source:
            selector: zone == "dmz"
          destination:
            ports:
            - 8080
            - 8443
        
        # 拒绝其他区域直接访问
        - action: Deny
          source:
            selector: zone != "dmz"
    
    service_based_segmentation: |
      # 基于服务的微分段
      apiVersion: projectcalico.org/v3
      kind: NetworkPolicy
      metadata:
        name: service-segmentation
        namespace: production
      spec:
        order: 200
        selector: app == "database"
        types:
        - Ingress
        
        ingress:
        # 仅允许应用服务访问数据库
        - action: Allow
          source:
            selector: app in {"web-app", "api-service"}
          destination:
            ports:
            - 5432  # PostgreSQL
            - 3306  # MySQL
        
        # 记录拒绝的连接
        - action: Log
          source:
            selector: app not in {"web-app", "api-service"}
        
        - action: Deny
          source:
            selector: app not in {"web-app", "api-service"}
  
  ebpf_enforcement:
    cilium_policies: |
      # Cilium eBPF网络策略
      apiVersion: cilium.io/v2
      kind: CiliumNetworkPolicy
      metadata:
        name: zero-trust-l7-policy
        namespace: production
      spec:
        endpointSelector:
          matchLabels:
            app: api-service
        
        ingress:
        - fromEndpoints:
          - matchLabels:
              app: frontend
          toPorts:
          - ports:
            - port: "8080"
              protocol: TCP
            rules:
              http:
              - method: "GET"
                path: "/api/v1/users"
                headers:
                - "Authorization: Bearer .*"
              - method: "POST"
                path: "/api/v1/users"
                headers:
                - "Content-Type: application/json"
        
        egress:
        - toEndpoints:
          - matchLabels:
              app: database
          toPorts:
          - ports:
            - port: "5432"
              protocol: TCP
        
        # DNS策略
        - toFQDNs:
          - matchName: "external-api.example.com"
          toPorts:
          - ports:
            - port: "443"
              protocol: TCP
    
    runtime_security: |
      # eBPF运行时安全策略
      apiVersion: cilium.io/v2
      kind: CiliumClusterwideNetworkPolicy
      metadata:
        name: runtime-security-policy
      spec:
        endpointSelector: {}
        
        # 阻止特权提升
        ingress:
        - fromEntities:
          - "cluster"
          toPorts:
          - ports:
            - port: "22"  # SSH
              protocol: TCP
          rules:
            # 阻止来自容器内部的SSH连接
            l7proto: tcp
            tcp:
            - method: "CONNECT"
        
        # 监控和记录可疑活动
        egress:
        - toEntities:
          - "world"
          toPorts:
          - ports:
            - port: "22"
            - port: "23"  # Telnet
            - port: "135" # RPC
              protocol: TCP
          rules:
            # 记录可疑出站连接
            l7proto: tcp
            tcp:
            - method: "CONNECT"

🔐 身份和访问管理

统一身份认证

OIDC集成工作负载身份

yaml

oidc_integration:
  kubernetes_oidc:
    api_server_config: |
      # Kubernetes API服务器OIDC配置
      kube_apiserver_flags:
        - --oidc-issuer-url=https://auth.example.com
        - --oidc-client-id=kubernetes
        - --oidc-username-claim=email
        - --oidc-username-prefix=oidc:
        - --oidc-groups-claim=groups
        - --oidc-groups-prefix=oidc:
        - --oidc-ca-file=/etc/ssl/certs/oidc-ca.pem
        - --oidc-required-claim=email_verified:true
    
    kubectl_config: |
      # kubectl OIDC配置
      apiVersion: v1
      kind: Config
      users:
      - name: oidc-user
        user:
          auth-provider:
            name: oidc
            config:
              issuer-url: https://auth.example.com
              client-id: kubernetes
              client-secret: kubernetes-secret
              refresh-token: refresh-token-value
              id-token: id-token-value
              idp-certificate-authority: /path/to/ca.crt
  
  istio_oidc:
    request_authentication: |
      # Istio JWT认证配置
      apiVersion: security.istio.io/v1beta1
      kind: RequestAuthentication
      metadata:
        name: jwt-auth
        namespace: production
      spec:
        selector:
          matchLabels:
            app: web-app
        jwtRules:
        - issuer: "https://auth.example.com"
          jwksUri: "https://auth.example.com/.well-known/jwks.json"
          audiences:
          - "kubernetes"
          - "web-app"
          # JWT必须包含的声明
          claims:
            email_verified: "true"
            aud: "kubernetes"
          # JWT有效期验证
          fromHeaders:
          - name: Authorization
            prefix: "Bearer "
          fromParams:
          - "access_token"
    
    authorization_with_jwt: |
      # 基于JWT的授权策略
      apiVersion: security.istio.io/v1beta1
      kind: AuthorizationPolicy
      metadata:
        name: jwt-authz
        namespace: production
      spec:
        selector:
          matchLabels:
            app: web-app
        rules:
        - from:
          - source:
              requestPrincipals: ["https://auth.example.com/*"]
        - when:
          - key: request.auth.claims[role]
            values: ["admin", "user"]
          - key: request.auth.claims[email_verified]
            values: ["true"]
          - key: request.auth.claims[exp]
            # 确保token未过期
            notValues: [""]

yaml

workload_identity:
  spiffe_spire:
    spire_server_config: |
      # SPIRE Server配置
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: spire-server
        namespace: spire-system
      data:
        server.conf: |
          server {
            bind_address = "0.0.0.0"
            bind_port = "8081"
            trust_domain = "example.com"
            data_dir = "/opt/spire/data/server"
            log_level = "INFO"
            ca_ttl = "168h"  # 7天
            default_x509_svid_ttl = "1h"
          }
          
          plugins {
            DataStore "sql" {
              plugin_data {
                database_type = "postgres"
                connection_string = "postgres://spire:password@postgres:5432/spire"
              }
            }
            
            NodeAttestor "k8s_psat" {
              plugin_data {
                clusters = {
                  "example-cluster" = {
                    service_account_allow_list = ["spire-system:spire-agent"]
                  }
                }
              }
            }
            
            KeyManager "memory" {
              plugin_data = {}
            }
          }
    
    spire_agent_daemonset: |
      # SPIRE Agent DaemonSet
      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: spire-agent
        namespace: spire-system
      spec:
        selector:
          matchLabels:
            app: spire-agent
        template:
          metadata:
            labels:
              app: spire-agent
          spec:
            serviceAccountName: spire-agent
            containers:
            - name: spire-agent
              image: ghcr.io/spiffe/spire-agent:1.8.0
              args:
              - -config
              - /opt/spire/conf/agent/agent.conf
              volumeMounts:
              - name: spire-config
                mountPath: /opt/spire/conf/agent
              - name: spire-bundle
                mountPath: /opt/spire/conf/bundle
              - name: spire-agent-socket
                mountPath: /tmp/spire-agent/public
              securityContext:
                allowPrivilegeEscalation: false
                readOnlyRootFilesystem: true
                runAsNonRoot: true
                runAsUser: 1000
                capabilities:
                  drop:
                  - ALL
            volumes:
            - name: spire-config
              configMap:
                name: spire-agent
            - name: spire-bundle
              configMap:
                name: spire-bundle
            - name: spire-agent-socket
              hostPath:
                path: /tmp/spire-agent/public
                type: DirectoryOrCreate
    
    workload_registration: |
      # 工作负载注册
      apiVersion: spire.spiffe.io/v1alpha1
      kind: ClusterSPIFFEID
      metadata:
        name: web-app-registration
      spec:
        spiffeIDTemplate: "spiffe://example.com/ns/{{ .PodMeta.Namespace }}/sa/{{ .PodSpec.ServiceAccountName }}/pod/{{ .PodMeta.Name }}"
        podSelector:
          matchLabels:
            app: web-app
        workloadSelectorTemplates:
        - "k8s:pod-label:app:web-app"
        - "k8s:ns:production"
        
        # X.509 SVID配置
        x509SVIDTemplate:
          subject:
            commonName: "web-app.production.svc.cluster.local"
            organization: ["example.com"]
            organizationalUnit: ["platform-team"]
          dnsNames:
          - "web-app.production.svc.cluster.local"
          - "web-app"
          ttl: "1h"
  
  service_account_token_projection:
    projected_tokens: |
      # 投影式ServiceAccount Token
      apiVersion: v1
      kind: Pod
      metadata:
        name: zero-trust-workload
        namespace: production
      spec:
        serviceAccountName: app-service-account
        containers:
        - name: app
          image: myapp:latest
          volumeMounts:
          - name: service-account-token
            mountPath: /var/run/secrets/tokens
            readOnly: true
          env:
          - name: TOKEN_PATH
            value: "/var/run/secrets/tokens/api-token"
        volumes:
        - name: service-account-token
          projected:
            sources:
            - serviceAccountToken:
                path: api-token
                expirationSeconds: 3600  # 1小时
                audience: api.example.com
            - serviceAccountToken:
                path: vault-token
                expirationSeconds: 600   # 10分钟
                audience: vault.example.com
            - configMap:
                name: kube-root-ca.crt
                items:
                - key: ca.crt
                  path: ca.crt
    
    token_review_webhook: |
      # Token Review Webhook
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: token-review-config
        namespace: zero-trust-system
      data:
        webhook-config.yaml: |
          apiVersion: v1
          kind: Config
          clusters:
          - name: token-review-cluster
            cluster:
              server: https://token-review.zero-trust-system.svc.cluster.local
              certificate-authority-data: LS0tLS1CRUdJTi...
          users:
          - name: token-review-user
            user:
              client-certificate-data: LS0tLS1CRUdJTi...
              client-key-data: LS0tLS1CRUdJTi...
          contexts:
          - name: token-review-context
            context:
              cluster: token-review-cluster
              user: token-review-user
          current-context: token-review-context

📊 监控和可观测性

零信任监控策略

安全监控指标事件响应自动化

yaml

security_monitoring:
  authentication_metrics:
    prometheus_rules: |
      # 认证相关监控指标
      groups:
      - name: zero-trust-authentication
        rules:
        - record: authentication_failure_rate
          expr: rate(istio_request_total{response_code!~"2.."}[5m])
        
        - alert: HighAuthFailureRate
          expr: authentication_failure_rate > 0.1
          for: 2m
          labels:
            severity: warning
          annotations:
            summary: "High authentication failure rate detected"
            description: "Authentication failure rate is {{ $value }} req/sec"
        
        - record: mfa_bypass_attempts
          expr: increase(oauth2_authentication_total{mfa_required="true",mfa_completed="false"}[1h])
        
        - alert: MFABypassAttempt
          expr: mfa_bypass_attempts > 0
          for: 0m
          labels:
            severity: critical
          annotations:
            summary: "MFA bypass attempt detected"
            description: "{{ $value }} MFA bypass attempts in the last hour"
  
  network_monitoring:
    istio_telemetry: |
      # Istio网络遥测配置
      apiVersion: telemetry.istio.io/v1alpha1
      kind: Telemetry
      metadata:
        name: zero-trust-metrics
        namespace: istio-system
      spec:
        metrics:
        - providers:
          - name: prometheus
        - overrides:
          - match:
              metric: ALL_METRICS
            operation: UPSERT
            tags:
              zero_trust_policy_applied:
                value: "true"
              source_workload_trust_score:
                value: "{{ .source_workload.labels['trust-score'] | default('unknown') }}"
              destination_workload_trust_score:
                value: "{{ .destination_workload.labels['trust-score'] | default('unknown') }}"
    
    network_policy_monitoring: |
      # 网络策略监控
      groups:
      - name: zero-trust-network
        rules:
        - record: network_policy_violations
          expr: rate(cilium_policy_l3_l4_denied_total[5m])
        
        - alert: NetworkPolicyViolation
          expr: network_policy_violations > 1
          for: 1m
          labels:
            severity: warning
          annotations:
            summary: "Network policy violations detected"
            description: "{{ $value }} network policy violations per second"
        
        - record: lateral_movement_attempts
          expr: rate(cilium_policy_l3_l4_denied_total{direction="ingress"}[5m])
        
        - alert: LateralMovementAttempt
          expr: lateral_movement_attempts > 0.1
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "Potential lateral movement detected"
            description: "{{ $value }} suspicious lateral movement attempts per second"
  
  behavioral_analytics:
    anomaly_detection: |
      # 异常行为检测配置
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: behavioral-analytics
        namespace: zero-trust-system
      data:
        analytics-config.yaml: |
          behavioral_analytics:
            user_behavior:
              baseline_learning_period: "7d"
              anomaly_threshold: 0.8
              features:
                - "access_patterns"
                - "resource_usage"
                - "time_patterns"
                - "location_patterns"
            
            service_behavior:
              baseline_learning_period: "24h"
              anomaly_threshold: 0.9
              features:
                - "api_call_patterns"
                - "response_time_patterns"
                - "error_rate_patterns"
                - "resource_consumption"
            
            network_behavior:
              baseline_learning_period: "1h"
              anomaly_threshold: 0.95
              features:
                - "connection_patterns"
                - "data_transfer_volumes"
                - "protocol_usage"
                - "destination_diversity"
    
    ml_pipeline: |
      # 机器学习异常检测流水线
      apiVersion: argoproj.io/v1alpha1
      kind: Workflow
      metadata:
        name: anomaly-detection-pipeline
        namespace: zero-trust-system
      spec:
        entrypoint: anomaly-detection
        templates:
        - name: anomaly-detection
          steps:
          - - name: data-collection
              template: collect-metrics
          - - name: feature-engineering
              template: process-features
              arguments:
                artifacts:
                - name: raw-data
                  from: "{{steps.data-collection.outputs.artifacts.metrics}}"
          - - name: model-inference
              template: run-inference
              arguments:
                artifacts:
                - name: features
                  from: "{{steps.feature-engineering.outputs.artifacts.features}}"
          - - name: alert-generation
              template: generate-alerts
              arguments:
                artifacts:
                - name: predictions
                  from: "{{steps.model-inference.outputs.artifacts.predictions}}"
        
        - name: collect-metrics
          container:
            image: prometheus-client:latest
            command: [python, collect_metrics.py]
            args: ["--output", "/tmp/metrics.json"]
          outputs:
            artifacts:
            - name: metrics
              path: /tmp/metrics.json
        
        - name: process-features
          container:
            image: ml-processor:latest
            command: [python, feature_engineering.py]
            args: ["--input", "/tmp/raw-data", "--output", "/tmp/features.json"]
          inputs:
            artifacts:
            - name: raw-data
              path: /tmp/raw-data
          outputs:
            artifacts:
            - name: features
              path: /tmp/features.json

yaml

incident_response:
  automated_response:
    quarantine_workflow: |
      # 自动隔离工作流
      apiVersion: argoproj.io/v1alpha1
      kind: WorkflowTemplate
      metadata:
        name: security-incident-response
        namespace: zero-trust-system
      spec:
        entrypoint: incident-response
        templates:
        - name: incident-response
          inputs:
            parameters:
            - name: incident-type
            - name: affected-workload
            - name: severity
          steps:
          - - name: validate-incident
              template: validate
              arguments:
                parameters:
                - name: incident-type
                  value: "{{inputs.parameters.incident-type}}"
          
          - - name: isolate-workload
              template: network-isolation
              arguments:
                parameters:
                - name: workload
                  value: "{{inputs.parameters.affected-workload}}"
              when: "{{steps.validate-incident.outputs.result}} == 'high-risk'"
          
          - - name: collect-forensics
              template: forensic-collection
              arguments:
                parameters:
                - name: workload
                  value: "{{inputs.parameters.affected-workload}}"
          
          - - name: notify-team
              template: notification
              arguments:
                parameters:
                - name: severity
                  value: "{{inputs.parameters.severity}}"
        
        - name: network-isolation
          inputs:
            parameters:
            - name: workload
          container:
            image: kubectl:latest
            command: [sh, -c]
            args:
            - |
              # 创建隔离网络策略
              cat <<EOF | kubectl apply -f -
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: quarantine-{{inputs.parameters.workload}}
                namespace: production
              spec:
                podSelector:
                  matchLabels:
                    app: {{inputs.parameters.workload}}
                policyTypes:
                - Ingress
                - Egress
                # 仅允许DNS和监控流量
                egress:
                - to: []
                  ports:
                  - protocol: UDP
                    port: 53
                - to:
                  - namespaceSelector:
                      matchLabels:
                        name: monitoring
                  ports:
                  - protocol: TCP
                    port: 9090
              EOF
    
    response_playbooks: |
      # 响应剧本配置
      response_playbooks:
        privilege_escalation:
          severity: "critical"
          actions:
            immediate:
              - "terminate_suspicious_sessions"
              - "revoke_elevated_permissions"
              - "enable_enhanced_monitoring"
            
            investigation:
              - "collect_audit_logs"
              - "analyze_access_patterns"
              - "review_rbac_bindings"
            
            remediation:
              - "update_rbac_policies"
              - "implement_additional_controls"
              - "conduct_security_review"
        
        lateral_movement:
          severity: "high"
          actions:
            immediate:
              - "isolate_affected_workloads"
              - "block_suspicious_connections"
              - "increase_monitoring_frequency"
            
            investigation:
              - "trace_network_connections"
              - "analyze_service_interactions"
              - "check_compromised_credentials"
            
            remediation:
              - "update_network_policies"
              - "rotate_service_credentials"
              - "strengthen_microsegmentation"
        
        data_exfiltration:
          severity: "critical"
          actions:
            immediate:
              - "block_external_connections"
              - "quarantine_data_sources"
              - "preserve_forensic_evidence"
            
            investigation:
              - "analyze_data_access_logs"
              - "identify_compromised_accounts"
              - "assess_data_exposure_scope"
            
            remediation:
              - "implement_dlp_controls"
              - "update_data_classification"
              - "enhance_access_controls"
  
  compliance_automation:
    continuous_compliance: |
      # 持续合规检查
      apiVersion: batch/v1
      kind: CronJob
      metadata:
        name: zero-trust-compliance-check
        namespace: zero-trust-system
      spec:
        schedule: "0 */6 * * *"  # 每6小时运行一次
        jobTemplate:
          spec:
            template:
              spec:
                containers:
                - name: compliance-checker
                  image: compliance-tools:latest
                  command:
                  - /bin/sh
                  - -c
                  - |
                    # 检查零信任策略合规性
                    echo "Starting Zero Trust compliance check..."
                    
                    # 检查mTLS策略
                    kubectl get peerauthentication --all-namespaces -o json | \
                    jq '.items[] | select(.spec.mtls.mode != "STRICT")' > /tmp/mtls-violations.json
                    
                    # 检查网络策略覆盖
                    kubectl get namespaces -o json | \
                    jq '.items[] | select(.metadata.name != "kube-system" and .metadata.name != "kube-public")' | \
                    while read ns; do
                      ns_name=$(echo $ns | jq -r '.metadata.name')
                      policy_count=$(kubectl get networkpolicies -n "$ns_name" --no-headers | wc -l)
                      if [ "$policy_count" -eq 0 ]; then
                        echo "No network policies in namespace: $ns_name" >> /tmp/policy-violations.txt
                      fi
                    done
                    
                    # 检查权限配置
                    kubectl get clusterrolebindings -o json | \
                    jq '.items[] | select(.roleRef.name == "cluster-admin")' > /tmp/admin-bindings.json
                    
                    # 生成合规报告
                    python /scripts/generate-compliance-report.py
                
                restartPolicy: OnFailure

📋 零信任面试重点

核心理念类

零信任的核心原则是什么？
- 永不信任，始终验证
- 最小权限访问
- 假设已被入侵
- 显式验证
- 各原则的实现方式
零信任与传统边界安全的区别？
- 信任边界的重新定义
- 安全策略的执行位置
- 网络架构设计差异
- 身份验证方式变化
零信任架构的核心组件？
- 策略引擎和策略管理
- 策略执行点配置
- 信任算法设计
- 数据和控制平面分离

实现技术类

如何在Kubernetes中实现零信任？
- 服务网格mTLS配置
- 网络微分段策略
- RBAC最小权限设计
- Pod安全标准实施
零信任网络策略设计原则？
- 默认拒绝策略
- 基于身份的访问控制
- 动态策略调整
- 流量加密要求
工作负载身份管理方案？
- SPIFFE/SPIRE实现
- ServiceAccount Token投影
- 证书自动轮换
- 身份验证集成

监控审计类

零信任环境的监控策略？
- 认证失败监控
- 网络策略违规检测
- 异常行为分析
- 风险评分机制
如何检测横向移动攻击？
- 网络流量分析
- 服务间通信监控
- 权限使用模式分析
- 异常连接检测
零信任事件响应自动化？
- 自动隔离机制
- 威胁响应工作流
- 取证数据收集
- 恢复程序设计

高级实践类

零信任成熟度评估模型？
- 能力成熟度等级
- 实施路线图设计
- 关键指标定义
- 持续改进机制
零信任与DevSecOps集成？
- 安全左移实践
- CI/CD安全集成
- 基础设施即代码安全
- 自动化安全验证
多云环境零信任实施？
- 跨云身份联邦
- 统一策略管理
- 混合云网络安全
- 一致性安全控制

🔗 相关内容

Kubernetes安全概述 - Kubernetes安全整体架构
容器安全 - 容器安全完整实践
服务网格安全 - 服务网格安全配置
网络策略 - Kubernetes网络安全

零信任架构代表了现代安全思维的重要转变，在云原生环境中实施零信任需要综合考虑身份管理、网络安全、策略执行和监控等多个方面。通过系统性的零信任实施，可以显著提升云原生应用的安全防护能力。

云原生零信任架构设计 ​

🎯 零信任核心原则 ​

基础理念和架构 ​

🏗️ 云原生零信任实现 ​

服务网格零信任 ​

🔐 身份和访问管理 ​

统一身份认证 ​

📊 监控和可观测性 ​

零信任监控策略 ​

📋 零信任面试重点 ​

核心理念类 ​

实现技术类 ​

监控审计类 ​

高级实践类 ​

🔗 相关内容 ​

云原生零信任架构设计

🎯 零信任核心原则

基础理念和架构

🏗️ 云原生零信任实现

服务网格零信任

🔐 身份和访问管理

统一身份认证

📊 监控和可观测性

零信任监控策略

📋 零信任面试重点

核心理念类

实现技术类

监控审计类

高级实践类

🔗 相关内容