kube-apiserver

Overview

Components:

  1. API Extensions Server: creates HTTP handlers for CRDs.
  2. API Server: manages the core API and core Kubernetes components.
  3. Aggregator Layer: proxies requests for a registered extension resource to the extension API server, which runs in a Pod in the same cluster.
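
The way these three components split incoming requests can be pictured as a chain of HTTP handlers. The following is a minimal sketch using only the Go standard library, not the actual kube-apiserver code: the helper names (`text`, `withPrefix`, `route`), the response strings, and the group `metrics.example.io` are all hypothetical, and the order of the chain is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strings"
)

// text returns a handler that always answers with the given body.
func text(body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, body)
	})
}

// withPrefix serves requests under prefix with h and passes the rest to next.
func withPrefix(prefix string, h, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if strings.HasPrefix(r.URL.Path, prefix) {
			h.ServeHTTP(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// route sends a GET for path through the chain and returns the response body.
func route(path string) string {
	// Stand-in for an extension API server running in a Pod.
	ext := httptest.NewServer(text("extension API server"))
	defer ext.Close()
	target, err := url.Parse(ext.URL)
	if err != nil {
		return err.Error()
	}

	// CRD-backed resources live under /apis/; unknown paths get a 404.
	crds := withPrefix("/apis/", text("API extensions server"), http.NotFoundHandler())
	// Core resources live under /api/.
	core := withPrefix("/api/", text("core API server"), crds)
	// The aggregator proxies only the group registered with it (hypothetical name).
	proxy := httputil.NewSingleHostReverseProxy(target)
	chain := withPrefix("/apis/metrics.example.io/", proxy, core)

	rec := httptest.NewRecorder()
	chain.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	body, _ := io.ReadAll(rec.Result().Body)
	return string(body)
}

func main() {
	for _, p := range []string{
		"/api/v1/pods",
		"/apis/example.com/v1/widgets",
		"/apis/metrics.example.io/v1/nodes",
	} {
		fmt.Printf("%-36s -> %s\n", p, route(p))
	}
}
```

Only the registered group leaves the process through the reverse proxy; everything else is answered locally by one of the two in-process handlers.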

kube-apiserver

Auditing

https://kubernetes.io/docs/tasks/debug/debug-cluster/audit/

example
apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

Pass the policy file and the log path to kube-apiserver with these flags:

--audit-policy-file=/etc/kubernetes/audit-policy.yaml \
--audit-log-path=/var/log/kubernetes/audit/audit.log
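
Rules in an audit policy are compared against each request in order, and the first matching rule sets the audit level. The sketch below illustrates that first-match behaviour for a simplified version of the example policy. It is not the kube-apiserver implementation: the `Rule` and `Request` types are hypothetical, matching on users, verbs, API groups and non-resource URLs is left out, and returning `None` when nothing matches is an assumption of this sketch.

```go
package main

import "fmt"

// Rule is a simplified audit rule; an empty slice matches anything.
type Rule struct {
	Level      string
	Resources  []string
	Namespaces []string
	Names      []string
}

// Request carries the attributes this sketch matches on.
type Request struct {
	Resource, Namespace, Name string
}

// matches reports whether v is allowed; an empty list allows every value.
func matches(allowed []string, v string) bool {
	if len(allowed) == 0 {
		return true
	}
	for _, a := range allowed {
		if a == v {
			return true
		}
	}
	return false
}

// levelFor returns the level of the first rule that matches req.
func levelFor(rules []Rule, req Request) string {
	for _, r := range rules {
		if matches(r.Resources, req.Resource) &&
			matches(r.Namespaces, req.Namespace) &&
			matches(r.Names, req.Name) {
			return r.Level
		}
	}
	return "None" // assumption: nothing is logged when no rule matches
}

// examplePolicy mirrors a subset of the rules in the policy above.
func examplePolicy() []Rule {
	return []Rule{
		{Level: "RequestResponse", Resources: []string{"pods"}},
		{Level: "Metadata", Resources: []string{"pods/log", "pods/status"}},
		{Level: "None", Resources: []string{"configmaps"}, Names: []string{"controller-leader"}},
		{Level: "Request", Resources: []string{"configmaps"}, Namespaces: []string{"kube-system"}},
		{Level: "Metadata", Resources: []string{"secrets", "configmaps"}},
		{Level: "Metadata"}, // catch-all
	}
}

func main() {
	policy := examplePolicy()
	for _, req := range []Request{
		{"pods", "default", "nginx"},
		{"configmaps", "kube-system", "controller-leader"},
		{"configmaps", "kube-system", "coredns"},
		{"configmaps", "default", "app-config"},
		{"deployments", "default", "web"},
	} {
		fmt.Printf("%+v -> %s\n", req, levelFor(policy, req))
	}
}
```

Because evaluation stops at the first match, the order matters: the `controller-leader` rule has to appear before the broader configmap rules, otherwise it would never take effect.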

Run kube-apiserver locally

Prerequisite

  1. Bash version 4 or later. On macOS the preinstalled bash is 3.2 (as in the output below), so install a newer one: brew install bash

    version

    bash --version
    GNU bash, version 3.2.57(1)-release (arm64-apple-darwin21)
    Copyright (C) 2007 Free Software Foundation, Inc.
    
  2. OpenSSL: LibreSSL also works. (brew install openssl should work too.)

    version

    openssl version
    OpenSSL 3.1.0 14 Mar 2023 (Library: OpenSSL 3.1.0 14 Mar 2023)
    
  3. etcd: brew install etcd

    version

    etcd --version
    etcd Version: 3.5.7
    Git SHA: 215b53cf3
    Go Version: go1.19.5
    Go OS/Arch: darwin/arm64
    

Steps

  1. Build Kubernetes binary (ref: Build Kubernetes).
    1. Clone Kubernetes repo.
      git clone https://github.com/kubernetes/kubernetes
      
    2. Build the version you want to use.
      git checkout release-1.26 # you can choose any version
      make
      
  2. Run etcd. (ref: etcd)

    etcd
    
  3. Create certificates.

    ./generate_certificate.sh
    

    manual steps

    1. Create certificates for service-account.

      openssl genrsa -out service-account-key.pem 4096
      openssl req -new -x509 -days 365 -key service-account-key.pem -subj "/CN=test" -sha256 -out service-account.pem
      
    2. Create certificate for apiserver.

      1. Generate a 2048-bit CA key (ca.key):
        openssl genrsa -out ca.key 2048
        
      2. Generate ca.crt from ca.key (use -days to set the certificate's validity period):
        openssl req -x509 -new -nodes -key ca.key -subj "/CN=127.0.0.1" -days 10000 -out ca.crt
        
      3. Generate the server key (server.key):
        openssl genrsa -out server.key 2048
        
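      4. Create csr.conf, the OpenSSL config for the CSR. It must define the v3_ext section referenced in step 6.
      5. Generate the certificate signing request (server.csr):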
        openssl req -new -key server.key -out server.csr -config csr.conf
        
      6. Generate the server certificate (server.crt) from ca.key, ca.crt, and server.csr:
        openssl x509 -req -in server.csr -CA ca.crt -CAkey ca.key \
        -CAcreateserial -out server.crt -days 10000 \
        -extensions v3_ext -extfile csr.conf
        

    For more details, see Generate Certificates Manually.

  4. Run the built binary.

    Set the path:

    PATH_TO_KUBERNETES_DIR=~/repos/kubernetes/kubernetes
    

    Check the API server's version:

    ${PATH_TO_KUBERNETES_DIR}/_output/bin/kube-apiserver --version
    
    Kubernetes v1.26.3-11+9043dd888deae0
    

    Start API server:

    ${PATH_TO_KUBERNETES_DIR}/_output/bin/kube-apiserver --etcd-servers http://localhost:2379 \
    --service-account-key-file=service-account-key.pem \
    --service-account-signing-key-file=service-account-key.pem \
    --service-account-issuer=api \
    --tls-cert-file=server.crt \
    --tls-private-key-file=server.key \
    --client-ca-file=ca.crt
    
  5. Configure kubeconfig. (You can skip this step by running ./generate_certificate.sh)

    (I'm too lazy to generate a separate crt and key for kubectl, so the server's pair is reused here.)

    kubectl config set-cluster local-apiserver \
    --certificate-authority=ca.crt \
    --embed-certs=true \
    --server=https://127.0.0.1:6443 \
    --kubeconfig=kubeconfig
    
    kubectl config set-credentials admin \
    --client-certificate=server.crt \
    --client-key=server.key \
    --embed-certs=true \
    --kubeconfig=kubeconfig
    
    kubectl config set-context default \
    --cluster=local-apiserver \
    --user=admin \
    --kubeconfig=kubeconfig
    
    kubectl config use-context default --kubeconfig=kubeconfig
    
  6. Check component status. Only etcd is healthy, because the scheduler and controller-manager are not running.

    kubectl get componentstatuses --kubeconfig kubeconfig
    
    Warning: v1 ComponentStatus is deprecated in v1.19+
    NAME                 STATUS      MESSAGE                                                                                        ERROR
    scheduler            Unhealthy   Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
    controller-manager   Unhealthy   Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
    etcd-0               Healthy     {"health":"true","reason":""}
    

  7. Create service account default
    kubectl create sa default --kubeconfig kubeconfig
    
  8. Create a Pod

    kubectl run nginx --image nginx --kubeconfig kubeconfig
    pod/nginx created
    

    A new Pod is created but will always remain Pending, because there is no scheduler to assign it to a node and no kubelet to start its containers.

    kubectl get pod --kubeconfig kubeconfig
    NAME    READY   STATUS    RESTARTS   AGE
    nginx   0/1     Pending   0          41s
    

  9. Read the data from etcd

    etcdctl get /registry/pods/default/nginx
    /registry/pods/default/nginx
    k8s
    
    v1Pod�
    �
    nginxdefault"*$a77f3131-9ce0-4319-a7c2-ea859df720212����Z
    
    runnginx��
    
    kubectl-runUpdatev����FieldsV1:�
    �{"f:metadata":{"f:labels":{".":{},"f:run":{}}},"f:spec":{"f:containers":{"k:{\"name\":\"nginx\"}":{".":{},"f:image":{},"f:imagePullPolicy":{},"f:name":{},"f:resources":{},"f:terminationMessagePath":{},"f:terminationMessagePolicy":{}}},"f:dnsPolicy":{},"f:enableServiceLinks":{},"f:restartPolicy":{},"f:schedulerName":{},"f:securityContext":{},"f:terminationGracePeriodSeconds":{}}}B�
    �
    kube-api-access-2f85qk�h
    "
    
    �token
    (&
    
    kube-root-ca.crt
    ca.crtca.crt
    )'
    %
            namespace
    v1metadata.namespace��
    nginxnginx*BJL
    kube-api-access-2f85q-/var/run/secrets/kubernetes.io/serviceaccount"2j/dev/termination-logrAlways����FileAlways 2
                                                            ClusterFirstBdefaultJdefaultRX`hr���default-scheduler�6
    node.kubernetes.io/not-readyExists"     NoExecute(��8
    node.kubernetes.io/unreachableExists"   NoExecute(�����PreemptLowerPriority
    Pending"*2J
    BestEffortZ"
    

    The value is stored in a binary encoding. You can decode it with auger (https://github.com/jpbetz/auger).

    Clone and build

    AUGER_DIR=~/repos/jpbetz/auger
    mkdir -p $AUGER_DIR
    git clone https://github.com/jpbetz/auger $AUGER_DIR && cd $AUGER_DIR
    go build -o auger main.go
    
    etcdctl get /registry/pods/default/nginx | $AUGER_DIR/auger decode
    
    apiVersion: v1
    kind: Pod
    metadata:
      creationTimestamp: "2023-03-25T00:21:26Z"
      labels:
        run: nginx
      name: nginx
      namespace: default
      uid: a77f3131-9ce0-4319-a7c2-ea859df72021
    spec:
      containers:
      - image: nginx
        imagePullPolicy: Always
        name: nginx
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
          name: kube-api-access-2f85q
          readOnly: true
      dnsPolicy: ClusterFirst
      priority: 0
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: default
      serviceAccountName: default
      terminationGracePeriodSeconds: 30
      tolerations:
      - effect: NoExecute
        key: node.kubernetes.io/not-ready
        operator: Exists
        tolerationSeconds: 300
      - effect: NoExecute
        key: node.kubernetes.io/unreachable
        operator: Exists
        tolerationSeconds: 300
      volumes:
      - name: kube-api-access-2f85q
        projected:
          defaultMode: 420
          sources:
          - {}
          - configMap:
              items:
              - key: ca.crt
                path: ca.crt
              name: kube-root-ca.crt
          - downwardAPI:
              items:
              - fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.namespace
                path: namespace
    status:
      phase: Pending
      qosClass: BestEffort
    
  10. Cleanup

    kubectl delete pod nginx --kubeconfig kubeconfig
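
For readers who prefer to see the manual certificate steps (step 3) as code, here is a rough Go equivalent of the CA and server-certificate commands, using only the standard library. It is a sketch under assumptions: csr.conf is not shown above, so the server subject name, the SAN entries (`127.0.0.1`, `localhost`), and the key usages are guesses made for this example, and the function names are hypothetical. The certificates are kept in memory rather than written to ca.crt and server.crt.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// newCA creates a self-signed CA, roughly `openssl genrsa` + `openssl req -x509 -new`.
func newCA() (*x509.Certificate, *rsa.PrivateKey, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "127.0.0.1"}, // same CN as the openssl command
		NotBefore:             time.Now().Add(-time.Minute),
		NotAfter:              time.Now().AddDate(0, 0, 10000), // -days 10000
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// newServerCert issues a server certificate signed by the CA, roughly
// `openssl req -new` followed by `openssl x509 -req -CA ... -CAkey ...`.
func newServerCert(ca *x509.Certificate, caKey *rsa.PrivateKey) (*x509.Certificate, *rsa.PrivateKey, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(2),
		Subject:      pkix.Name{CommonName: "kube-apiserver"}, // assumption: csr.conf is not shown
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().AddDate(0, 0, 10000),
		KeyUsage:     x509.KeyUsageDigitalSignature | x509.KeyUsageKeyEncipherment,
		// Both usages are set because the walkthrough reuses this pair for kubectl.
		ExtKeyUsage: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth},
		IPAddresses: []net.IP{net.ParseIP("127.0.0.1")}, // assumed SAN
		DNSNames:    []string{"localhost"},              // assumed SAN
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, ca, &key.PublicKey, caKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// verifyServer checks that the server certificate chains to the CA and is valid for 127.0.0.1.
func verifyServer(ca, server *x509.Certificate) error {
	pool := x509.NewCertPool()
	pool.AddCert(ca)
	_, err := server.Verify(x509.VerifyOptions{
		Roots:     pool,
		DNSName:   "127.0.0.1",
		KeyUsages: []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	})
	return err
}

func main() {
	ca, caKey, err := newCA()
	if err != nil {
		panic(err)
	}
	server, _, err := newServerCert(ca, caKey)
	if err != nil {
		panic(err)
	}
	fmt.Println("verification error:", verifyServer(ca, server))
}
```

The final verification mirrors what kubectl does when it connects to https://127.0.0.1:6443 with ca.crt as the certificate authority: the server certificate has to chain to the CA and list the address being dialled.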
    

Errors

  1. Error1: mkdir /var/run/kubernetes: permission denied

    E0302 06:40:09.767084   37385 run.go:74] "command failed" err="error creating self-signed certificates: mkdir /var/run/kubernetes: permission denied"
    

    Run

    sudo mkdir /var/run/kubernetes
    sudo chown -R "$(whoami)" /var/run/kubernetes
    

  2. Error2: service-account-issuer is a required flag, --service-account-signing-key-file and --service-account-issuer are required flags

    E0302 07:14:46.234431   79468 run.go:74] "command failed" err="[service-account-issuer is a required flag, --service-account-signing-key-file and --service-account-issuer are required flags]"
    

    BoundServiceAccountTokenVolume has been GA since v1.22, so you need to pass --service-account-signing-key-file and --service-account-issuer.

apiextensions-apiserver

It provides an API for registering CustomResourceDefinitions.

When creating CRD:

  1. Store CRD resource.
  2. Validate the CRD with several controllers.
  3. CRD handler automatically creates HTTP handler for the CRD.

When deleting CRD:

  1. Wait until finalizingController deletes all the custom resources.

  • NewCustomResourceDefinitionHandler is called in CompletedConfig.New
  • CompletedConfig.New
    1. Prepare genericServer with completedConfig.New.
    2. Initialize CustomResourceDefinitions with GenericAPIServer.
    3. Initialize apiGroupInfo with genericapiserver.NewDefaultAPIGroupInfo.
    4. Install API group with s.GenericAPIServer.InstallAPIGroup.
    5. Initialize clientset for CRD with crdClient, err := clientset.NewForConfig(s.GenericAPIServer.LoopbackClientConfig)
    6. Initialize and set informer with s.Informers = externalinformers.NewSharedInformerFactory(crdClient, 5*time.Minute)
    7. Prepare handlers
      1. delegateHandler
      2. versionDiscoveryHandler
      3. groupDiscoveryHandler
    8. Initialize EstablishingController.
    9. Initialize crdHandler by NewCustomResourceDefinitionHandler with versionDiscoveryHandler, groupDiscoveryHandler, informer, delegateHandler, establishingController, etc.
    10. Set HTTP handler for GenericAPIServer with crdHandler.
      s.GenericAPIServer.Handler.NonGoRestfulMux.Handle("/apis", crdHandler)
      s.GenericAPIServer.Handler.NonGoRestfulMux.HandlePrefix("/apis/", crdHandler)
      
    11. Initialize controllers.
      • discoveryController
      • namingController
      • nonStructuralSchemaController
      • apiApprovalController
      • finalizingController
      • openapicontroller
    12. Set AddPostStartHookOrDie for GenericAPIServer to start informer.
    13. Set AddPostStartHookOrDie for GenericAPIServer to start controllers.
    14. Set AddPostStartHookOrDie for GenericAPIServer to wait until CRD informer is synced.

Functions

Delete

Return value:

  1. runtime.Object
  2. bool
  3. error

Steps:

  1. Get key
  2. Get obj from the storage
  3. BeforeDelete: responsible for setting deletionTimestamp.
    1. The return value is
      1. graceful (bool)
      2. gracefulPending (bool)
      3. err (error)
    2. Case1: if not deleting gracefully -> false, false, nil
    3. Case2: Update DeletionTimestamp & DeletionGracePeriodSeconds if necessary
    4. Case3: if gracefulStrategy.CheckGracefulDelete is false -> false, false, nil
  4. If pendingGraceful is true, call finalizeDelete and return. (This function doesn't modify the object.)
  5. deletionFinalizersForGarbageCollection

    shouldUpdateFinalizers, _ := deletionFinalizersForGarbageCollection(ctx, e, accessor, options)
    

    1. deletionFinalizersForGarbageCollection: removes the orphan and foreground finalizers unless shouldOrphanDependents and shouldDeleteDependents, respectively, are true. It returns true if the finalizers need to be updated, otherwise false.
  6. Update deleteImmediately:

    1. If there are pending finalizers -> false
      1. markAsDeleting sets the obj's DeletionGracePeriodSeconds to 0 and the DeletionTimestamp to "now".
    2. If GracePeriodSeconds > 0 -> false
    3. If neither pendingGraceful nor graceful -> true

    if graceful || pendingFinalizers || shouldUpdateFinalizers {
        err, ignoreNotFound, deleteImmediately, out, lastExisting = e.updateForGracefulDeletionAndFinalizers(ctx, name, key, options, preconditions, deleteValidation, obj)
        // Update the preconditions.ResourceVersion if set since we updated the object.
        if err == nil && deleteImmediately && preconditions.ResourceVersion != nil {
            accessor, err = meta.Accessor(out)
            if err != nil {
                return out, false, apierrors.NewInternalError(err)
            }
            resourceVersion := accessor.GetResourceVersion()
            preconditions.ResourceVersion = &resourceVersion
        }
    }
    
    1. updateForGracefulDeletionAndFinalizers
  7. If deleteImmediately is false or there is an error, return (the object is not deleted immediately).
  8. If this is a dry run, return.
  9. Finally, delete the obj from the storage.
    err := e.Storage.Delete(ctx, key, out, &preconditions, storage.ValidateObjectFunc(deleteValidation), dryrun.IsDryRun(options.DryRun), nil);
    
    1. finalizeDelete
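
The finalizer bookkeeping in step 5 can be sketched in isolation. The version below is a simplified, standalone illustration rather than the registry code: it takes the two decisions (orphan dependents, delete dependents in the foreground) as plain booleans, and it assumes the function reports true when the finalizer list has to change, which is what the name shouldUpdateFinalizers suggests. The function names are hypothetical; the finalizer strings are the standard "orphan" and "foregroundDeletion" values.

```go
package main

import (
	"fmt"
	"sort"
)

const (
	finalizerOrphan     = "orphan"
	finalizerForeground = "foregroundDeletion"
)

// finalizersForGC drops both garbage-collection finalizers, adds back the ones
// that are wanted, and reports whether the result differs from the input.
func finalizersForGC(existing []string, orphan, foreground bool) (bool, []string) {
	updated := []string{}
	for _, f := range existing {
		if f == finalizerOrphan || f == finalizerForeground {
			continue
		}
		updated = append(updated, f)
	}
	if orphan {
		updated = append(updated, finalizerOrphan)
	}
	if foreground {
		updated = append(updated, finalizerForeground)
	}
	return !sameElements(existing, updated), updated
}

// sameElements compares two string slices while ignoring order.
func sameElements(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	x := append([]string(nil), a...)
	y := append([]string(nil), b...)
	sort.Strings(x)
	sort.Strings(y)
	for i := range x {
		if x[i] != y[i] {
			return false
		}
	}
	return true
}

func main() {
	// An object carrying the orphan finalizer, deleted with foreground propagation.
	changed, out := finalizersForGC([]string{"orphan", "example.com/cleanup"}, false, true)
	fmt.Println(changed, out)

	// No garbage-collection finalizers involved: nothing to update.
	changed, out = finalizersForGC([]string{"example.com/cleanup"}, false, false)
	fmt.Println(changed, out)
}
```

Finalizers that have nothing to do with garbage collection (the hypothetical `example.com/cleanup` here) pass through untouched, so they still count as pending finalizers in step 6.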

    References