Kubernetes-master charm

This charm is an encapsulation of the Kubernetes master processes and the operations to run on any cloud for the entire lifecycle of the cluster.

This charm is built from other charm layers using the Juju reactive framework. Each of the other layers focuses on a specific subset of operations, which leaves this layer specific to the operations of the Kubernetes master processes.

Deployment

This charm is not fully functional when deployed by itself. It requires other charms to model a complete Kubernetes cluster. A Kubernetes cluster needs a distributed key-value store such as etcd and the kubernetes-worker charm, which delivers the Kubernetes node services. A cluster also requires a Software Defined Network (SDN), a container runtime such as containerd, and Transport Layer Security (TLS) so that the components in a cluster communicate securely.

Please take a look at the Charmed Kubernetes or the Kubernetes core bundles for examples of complete models of Kubernetes clusters.

Resources

The kubernetes-master charm takes advantage of the Juju Resources feature to deliver the Kubernetes software.

In deployments on public clouds, the Charm Store provides the resource to the charm automatically with no user intervention. Some environments with strict firewall rules may not be able to contact the Charm Store. In these network-restricted environments, the resource can be uploaded to the model by the Juju operator.

Snap Refresh

The Kubernetes resources used by this charm are snap packages. When not specified during deployment, these resources come from the public store. By default, the snapd daemon will refresh all snaps installed from the store four (4) times per day. A charm configuration option is provided for operators to control this refresh frequency.

NOTE: this is a global configuration option and will affect the refresh time for all snaps installed on a system.

Examples:

## refresh kubernetes-master snaps every tuesday
juju config kubernetes-master snapd_refresh="tue"

## refresh snaps at 11pm on the last (5th) friday of the month
juju config kubernetes-master snapd_refresh="fri5,23:00"

## delay the refresh as long as possible
juju config kubernetes-master snapd_refresh="max"

## use the system default refresh timer
juju config kubernetes-master snapd_refresh=""

For more information, see the snap documentation.

Configuration

This charm supports some configuration options to set up a Kubernetes cluster that works in your environment, detailed in the section below.

For some specific Kubernetes service configuration tasks, please also see the section on configuring K8s services.

Name | Type | Default | Description
allow-privileged | string | auto | See notes
api-extra-args | string | | See notes
audit-policy | string | See notes | Audit policy passed to kube-apiserver via --audit-policy-file. For more info, please refer to the upstream documentation at https://kubernetes.io/docs/tasks/debug-application-cluster/audit/
audit-webhook-config | string | | Audit webhook config passed to kube-apiserver via --audit-webhook-config-file. For more info, please refer to the upstream documentation at https://kubernetes.io/docs/tasks/debug-application-cluster/audit/
authn-webhook-endpoint | string | | See notes
authorization-mode | string | Node,RBAC | Comma separated authorization modes. Allowed values are “RBAC”, “Node”, “Webhook”, “ABAC”, “AlwaysDeny” and “AlwaysAllow”.
cephfs-mounter | string | default | The client driver used for cephfs based storage. Options are “fuse”, “kernel” and “default”.
channel | string | 1.22/stable | Snap channel to install Kubernetes master services from
client_password | string | | Password to be used for admin user (leave empty for random password).
controller-manager-extra-args | string | | See notes
dashboard-auth | string | auto | See notes
default-cni | string | | See notes
default-storage | string | auto | The storage class to make the default storage class. Allowed values are “auto”, “none”, “ceph-xfs”, “ceph-ext4”, “cephfs”. Note: Only works in Kubernetes >= 1.10
dns-provider | string | auto | See notes
dns_domain | string | cluster.local | The local domain for cluster dns
enable-dashboard-addons | boolean | True | Deploy the Kubernetes Dashboard
enable-keystone-authorization | boolean | False | If true and the Keystone charm is related, users will authorize against the Keystone server. Note that if related, users will always authenticate against Keystone.
enable-metrics | boolean | True | If true the metrics server for Kubernetes will be deployed onto the cluster.
enable-nvidia-plugin | string | auto | Load the nvidia device plugin daemonset. Supported values are “auto” and “false”. When “auto”, the daemonset will be loaded only if GPUs are detected. When “false” the nvidia device plugin will not be loaded.
extra_packages | string | | Space separated list of extra deb packages to install.
extra_sans | string | | Space-separated list of extra SAN entries to add to the x509 certificate created for the master nodes.
ha-cluster-dns | string | | DNS entry to use with the HA Cluster subordinate charm. Mutually exclusive with ha-cluster-vip.
ha-cluster-vip | string | | Virtual IP for the charm to use with the HA Cluster subordinate charm. Mutually exclusive with ha-cluster-dns. Multiple virtual IPs are separated by spaces.
image-registry | string | See notes | Container image registry to use for CDK. This includes addons like the Kubernetes dashboard, metrics server, ingress, and dns along with non-addon images including the pause container and default backend image.
install_keys | string | | See notes
install_sources | string | | See notes
keystone-policy | string | See notes | Policy for Keystone authorization. This is used when a Keystone charm is related to kubernetes-master in order to provide authorization for Keystone users on the Kubernetes cluster.
keystone-ssl-ca | string | | Keystone certificate authority encoded in base64 for securing communications to Keystone. For example: juju config kubernetes-master keystone-ssl-ca=$(base64 /path/to/ca.crt)
loadbalancer-ips | string | | See notes
nagios_context | string | juju | See notes
nagios_servicegroups | string | | A comma-separated list of nagios servicegroups. If left empty, the nagios_context will be used as the servicegroup
package_status | string | install | The status of service-affecting packages will be set to this value in the dpkg database. Valid values are “install” and “hold”.
proxy-extra-args | string | | See notes
require-manual-upgrade | boolean | True | When true, master nodes will not be upgraded until the user triggers it manually by running the upgrade action.
scheduler-extra-args | string | | See notes
service-cidr | string | 10.152.183.0/24 | CIDR to use for Kubernetes services. After deployment it is only possible to increase the size of the IP range. It is not possible to change or shrink the address range after deployment.
snapd_refresh | string | max | See notes
storage-backend | string | auto | The storage backend for kube-apiserver persistence. Can be “etcd2”, “etcd3”, or “auto”. Auto mode will select etcd3 on new installations, or etcd2 on upgrades.
sysctl | string | See notes | See notes

allow-privileged

Description:

Allow kube-apiserver to run in privileged mode. Supported values are “true”, “false”, and “auto”. If “true”, kube-apiserver will run in privileged mode by default. If “false”, kube-apiserver will never run in privileged mode. If “auto”, kube-apiserver will not run in privileged mode by default, but will switch to privileged mode if GPU hardware is detected on a worker node.

api-extra-args

Description:

Space separated list of flags and key=value pairs that will be passed as arguments to kube-apiserver. For example a value like this:

  runtime-config=batch/v2alpha1=true profiling=true

will result in kube-apiserver being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true
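
The mapping from the config value to command-line options can be sketched in a few lines of Python. This is only an illustration of the behaviour described above, not the charm's own code; the function name extra_args_to_flags is hypothetical.

```python
def extra_args_to_flags(extra_args: str) -> list[str]:
    """Turn a space-separated extra-args value into a list of '--' options."""
    flags = []
    for item in extra_args.split():
        # Split on the first '=' only, so a value such as
        # 'batch/v2alpha1=true' keeps its own '=' character.
        key, sep, value = item.partition("=")
        flags.append(f"--{key}{sep}{value}")
    return flags


print(extra_args_to_flags("runtime-config=batch/v2alpha1=true profiling=true"))
```

The same convention applies to the controller-manager-extra-args, proxy-extra-args and scheduler-extra-args options.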

audit-policy

Default:

apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Don't log read-only requests from the apiserver
- level: None
  users: ["system:apiserver"]
  verbs: ["get", "list", "watch"]
# Don't log kube-proxy watches
- level: None
  users: ["system:kube-proxy"]
  verbs: ["watch"]
  resources:
  - resources: ["endpoints", "services"]
# Don't log nodes getting their own status
- level: None
  userGroups: ["system:nodes"]
  verbs: ["get"]
  resources:
  - resources: ["nodes"]
# Don't log kube-controller-manager and kube-scheduler getting endpoints
- level: None
  users: ["system:unsecured"]
  namespaces: ["kube-system"]
  verbs: ["get"]
  resources:
  - resources: ["endpoints"]
# Log everything else at the Request level.
- level: Request
  omitStages:
  - RequestReceived

authn-webhook-endpoint

Description:

Custom endpoint to check when authenticating kube-apiserver requests. This must be an HTTPS URL accessible by the kubernetes-master units. For example:

https://your.server:8443/authenticate

When a JSON-serialized TokenReview object is POSTed to this endpoint, it must respond with appropriate authentication details. For more info, please refer to the upstream documentation at https://kubernetes.io/docs/reference/access-authn-authz/authentication/#webhook-token-authentication
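
As a rough sketch of what such an endpoint has to do, the Python below answers a TokenReview request body with a TokenReview response. The token table and the function name review_token are hypothetical, and the HTTPS serving layer is left out; a real service would check the token against an identity provider.

```python
import json

# Hypothetical token table, for illustration only.
KNOWN_TOKENS = {
    "s3cr3t": {"username": "alice", "uid": "1001", "groups": ["developers"]},
}


def review_token(body: str) -> str:
    """Answer a JSON-serialized TokenReview with an authentication verdict."""
    review = json.loads(body)
    token = review.get("spec", {}).get("token", "")
    user = KNOWN_TOKENS.get(token)
    if user:
        status = {"authenticated": True, "user": user}
    else:
        status = {"authenticated": False}
    return json.dumps({
        # Echo the API version the caller used.
        "apiVersion": review.get("apiVersion", "authentication.k8s.io/v1"),
        "kind": "TokenReview",
        "status": status,
    })


request = json.dumps({
    "apiVersion": "authentication.k8s.io/v1",
    "kind": "TokenReview",
    "spec": {"token": "s3cr3t"},
})
print(review_token(request))
```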

controller-manager-extra-args

Description:

Space separated list of flags and key=value pairs that will be passed as arguments to kube-controller-manager. For example a value like this:

  runtime-config=batch/v2alpha1=true profiling=true

will result in kube-controller-manager being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true

dashboard-auth

Description:

Method of authentication for the Kubernetes dashboard. Allowed values are “auto”, “basic”, and “token”. If set to “auto”, basic auth is used unless Keystone is related to kubernetes-master, in which case token auth is used.

DEPRECATED: this option has no effect on Kubernetes 1.19 and above.

default-cni

Description:

Default CNI network to use when multiple CNI subordinates are related.

The value of this config should be the application name of a related CNI subordinate. For example:

juju config kubernetes-master default-cni=flannel

If unspecified, then the default CNI network is chosen alphabetically.
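
The selection rule can be pictured with a small Python sketch. The function name pick_default_cni and its arguments are hypothetical; the sketch assumes the configured value, when present, is used as given.

```python
def pick_default_cni(related_apps, default_cni=""):
    """Return the CNI application to use as the default network."""
    if default_cni:
        # An explicit default-cni setting takes precedence.
        return default_cni
    # Otherwise fall back to the alphabetically first related application.
    return sorted(related_apps)[0] if related_apps else None


print(pick_default_cni(["flannel", "calico"]))
print(pick_default_cni(["flannel", "calico"], default_cni="flannel"))
```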

dns-provider

Description:

DNS provider addon to use. Can be “auto”, “core-dns”, “kube-dns”, or “none”.

CoreDNS is only supported on Kubernetes 1.14+.

When set to “auto”, the behavior is as follows:

  • New deployments of Kubernetes 1.14+ will use CoreDNS
  • New deployments of Kubernetes 1.13 or older will use KubeDNS
  • Upgraded deployments will continue to use whichever provider was previously used.
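
The three rules above can be written out as a short Python sketch. The function name resolve_dns_provider, the version tuple, and the previous argument (the provider already in use on an upgraded deployment) are assumptions made for illustration.

```python
def resolve_dns_provider(configured, k8s_version, previous=None):
    """Resolve the dns-provider setting to a concrete provider name."""
    if configured != "auto":
        # "core-dns", "kube-dns" and "none" are taken as given.
        return configured
    if previous is not None:
        # Upgraded deployments keep whichever provider they already use.
        return previous
    # New deployments: CoreDNS from Kubernetes 1.14, KubeDNS before that.
    return "core-dns" if k8s_version >= (1, 14) else "kube-dns"


print(resolve_dns_provider("auto", (1, 22)))
print(resolve_dns_provider("auto", (1, 13)))
print(resolve_dns_provider("auto", (1, 22), previous="kube-dns"))
```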

image-registry

Default:

rocks.canonical.com:443/cdk

install_keys

Description:

List of signing keys for install_sources package sources, per charmhelpers standard format (a yaml list of strings encoded as a string). The keys should be the full ASCII armoured GPG public keys. While GPG key ids are also supported and looked up on a keyserver, operators should be aware that this mechanism is insecure. null can be used if a standard package signing key is used that will already be installed on the machine, and for PPA sources where the package signing key is securely retrieved from Launchpad.

install_sources

Description:

List of extra apt sources, per charm-helpers standard format (a yaml list of strings encoded as a string). Each source may be either a line that can be added directly to sources.list(5), or in the form ppa:<user>/<ppa-name> for adding Personal Package Archives, or a distribution component to enable.

keystone-policy

Default:

apiVersion: v1
kind: ConfigMap
metadata:
  name: k8s-auth-policy
  namespace: kube-system
  labels:
    k8s-app: k8s-keystone-auth
data:
  policies: |
    [
      {
       "resource": {
          "verbs": ["get", "list", "watch"],
          "resources": ["*"],
          "version": "*",
          "namespace": "*"
        },
        "match": [
          {
            "type": "role",
            "values": ["k8s-viewers"]
          },
          {
            "type": "project",
            "values": ["k8s"]
          }
        ]
      },
      {
       "resource": {
          "verbs": ["*"],
          "resources": ["*"],
          "version": "*",
          "namespace": "default"
        },
        "match": [
          {
            "type": "role",
            "values": ["k8s-users"]
          },
          {
            "type": "project",
            "values": ["k8s"]
          }
        ]
      },
      {
       "resource": {
          "verbs": ["*"],
          "resources": ["*"],
          "version": "*",
          "namespace": "*"
        },
        "match": [
          {
            "type": "role",
            "values": ["k8s-admins"]
          },
          {
            "type": "project",
            "values": ["k8s"]
          }
        ]
      }
    ]

loadbalancer-ips

Description:

Space-separated list of IP addresses of load balancers in front of the control plane. These can be either virtual IP addresses that have been floated in front of the control plane or the IP of a load balancer appliance such as an F5. Workers will alternate IP addresses from this list to distribute load. For example, if you have 2 IPs and 4 workers, each IP will be used by 2 workers. Note that this will only work if kubeapi-load-balancer is not in use and there is a relation between kubernetes-master:kube-api-endpoint and kubernetes-worker:kube-api-endpoint. If using the kubeapi-load-balancer, see the loadbalancer-ips configuration variable on the kubeapi-load-balancer charm.
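
The effect of alternating addresses can be illustrated with a simple round-robin in Python. This is a sketch only: the function name, the worker unit names and the example addresses are hypothetical, and the charm may order workers differently.

```python
def assign_loadbalancer_ips(workers, loadbalancer_ips):
    """Map each worker to one address from a space-separated IP list."""
    ips = loadbalancer_ips.split()
    if not ips:
        return {}
    # Round-robin: worker i gets address i modulo the number of addresses.
    return {worker: ips[i % len(ips)] for i, worker in enumerate(workers)}


workers = [f"kubernetes-worker/{n}" for n in range(4)]
for worker, ip in assign_loadbalancer_ips(workers, "10.0.0.10 10.0.0.11").items():
    print(worker, ip)
```

With 2 addresses and 4 workers, each address ends up serving 2 workers.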

nagios_context

Description:

Used by the nrpe subordinate charms. A string that will be prepended to the instance name to set the host name in Nagios. For instance, the host name would be something like:

    juju-myservice-0

If you’re running multiple environments with the same services in them this allows you to differentiate between them.

proxy-extra-args

Description:

Space separated list of flags and key=value pairs that will be passed as arguments to kube-proxy. For example a value like this:

  runtime-config=batch/v2alpha1=true profiling=true

will result in kube-proxy being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true

scheduler-extra-args

Description:

Space separated list of flags and key=value pairs that will be passed as arguments to kube-scheduler. For example a value like this:

  runtime-config=batch/v2alpha1=true profiling=true

will result in kube-scheduler being run with the following options: --runtime-config=batch/v2alpha1=true --profiling=true

snapd_refresh

Description:

How often snapd handles updates for installed snaps. Setting an empty string will check 4x per day. Set to “max” to delay the refresh as long as possible. You may also set a custom string as described in the ‘refresh.timer’ section here:

  https://forum.snapcraft.io/t/system-options/87

DEPRECATED in 1.19: Manage installed snap versions with the snap-store-proxy model config. See: https://snapcraft.io/snap-store-proxy and https://juju.is/docs/offline-mode-strategies#heading--snap-specific-proxy

sysctl

Default:

{ net.ipv4.conf.all.forwarding : 1, net.ipv4.neigh.default.gc_thresh1 : 128, net.ipv4.neigh.default.gc_thresh2 : 28672, net.ipv4.neigh.default.gc_thresh3 : 32768, net.ipv6.neigh.default.gc_thresh1 : 128, net.ipv6.neigh.default.gc_thresh2 : 28672, net.ipv6.neigh.default.gc_thresh3 : 32768, fs.inotify.max_user_instances : 8192, fs.inotify.max_user_watches : 1048576, kernel.panic : 10, kernel.panic_on_oops: 1, vm.overcommit_memory : 1 }

Description:

YAML formatted associative array of sysctl values, e.g.: ‘{kernel.pid_max : 4194303 }’. Note that kube-proxy handles the conntrack settings. The proper way to alter them is to use the proxy-extra-args config to set them, e.g.:

  juju config kubernetes-master proxy-extra-args="conntrack-min=1000000 conntrack-max-per-core=250000"
  juju config kubernetes-worker proxy-extra-args="conntrack-min=1000000 conntrack-max-per-core=250000"

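The proxy-extra-args conntrack-min and conntrack-max-per-core can be set to 0 to ignore kube-proxy’s settings and use the sysctl settings instead. Note the fundamental difference between the two settings: conntrack-max-per-core is a per-CPU-core value that kube-proxy multiplies by the number of cores, whereas nf_conntrack_max is an absolute limit for the whole host.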
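
To make the relationship concrete, the sketch below estimates the limit kube-proxy would apply for a given core count. It reflects the documented meaning of the two kube-proxy flags (a per-core value with conntrack-min as a floor); the function name is hypothetical and this is not kube-proxy's own code.

```python
def effective_conntrack_max(cpu_cores, max_per_core, conntrack_min):
    """Estimate the connection-tracking limit kube-proxy would set.

    Returns None when max_per_core is 0, meaning kube-proxy leaves the
    limit alone and the nf_conntrack_max sysctl value stays in effect.
    """
    if max_per_core <= 0:
        return None
    # Scale with the number of cores, but never go below conntrack-min.
    return max(max_per_core * cpu_cores, conntrack_min)


print(effective_conntrack_max(8, 250000, 1000000))
print(effective_conntrack_max(2, 250000, 1000000))
print(effective_conntrack_max(8, 0, 0))
```

With the example values used above, an 8-core machine gets a limit of 2000000, while a 2-core machine falls back to the conntrack-min floor of 1000000.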

Configuring K8s services

Charmed Kubernetes ships with sensible, tested default configurations to ensure a reliable Kubernetes experience, but of course these can be changed to reflect the purpose and resources of your cluster. The configuration section above details all available configuration options; this section deals with specific, commonly used settings. You may also wish to read the Addons page for information on the extra services installed with Charmed Kubernetes.

IPVS (IP Virtual Server)

IPVS implements transport-layer load balancing as part of the Linux kernel, and can be used by the kube-proxy service to handle service routing. By default kube-proxy uses a solution based on iptables, but this can cause a lot of overhead in systems with large numbers of nodes. There is more information on this in the upstream Kubernetes IPVS deep dive documentation.

IPVS is an extra option for kube-proxy, and can be enabled by changing the configuration:

juju config kubernetes-master proxy-extra-args="proxy-mode=ipvs"

It is also necessary to change this configuration option on the worker:

juju config kubernetes-worker proxy-extra-args="proxy-mode=ipvs"

Admission controls

As with other aspects of the Kubernetes API, admission controls can be enabled by adding extra values to the charm’s api-extra-args configuration.

For admission controls, it may be useful to refer to the Kubernetes blog for more information on the options, but for example, to add the PodSecurityPolicy admission controller:

  1. Check any current config settings for api-extra-args (there are none by default):
    juju config kubernetes-master api-extra-args
    
  2. Append the desired config option to the previous output and apply:
    juju config kubernetes-master api-extra-args="enable-admission-plugins=PodSecurityPolicy"
    

Note that prior to Kubernetes 1.16 (kubernetes-master revision 778), the config setting was admission-control, rather than enable-admission-plugins.
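
Because api-extra-args is a single string, step 2 means building a new value that keeps whatever was already set. A minimal Python sketch of that merge follows; the helper name append_extra_arg is hypothetical, and replacing an existing entry with the same key is a design choice of this sketch rather than documented charm behaviour.

```python
def append_extra_arg(current: str, new_arg: str) -> str:
    """Return an extra-args value with new_arg added to the current value."""
    new_key = new_arg.split("=", 1)[0]
    # Drop any existing entry for the same key so it is not passed twice.
    kept = [arg for arg in current.split() if arg.split("=", 1)[0] != new_key]
    return " ".join(kept + [new_arg])


current = "profiling=true"  # output of step 1, empty by default
value = append_extra_arg(current, "enable-admission-plugins=PodSecurityPolicy")
print(f'juju config kubernetes-master api-extra-args="{value}"')
```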

Adding SANs and certificate regeneration

As explained in the Certificates and trust overview, the extra_sans configuration setting can be used to add SANs and regenerate the x509 certificate(s) for the API server running on the Kubernetes master node(s), and for the load balancer. When this configuration is changed, the master node(s) will regenerate their certificates and restart the API server to update the certificate used for communication. Note: This is disruptive and restarts the API server.

The process is the same for both the kubernetes-master and the kubeapi-load-balancer. The configuration option takes a space-separated list of extra entries:

juju config kubernetes-master extra_sans="master.mydomain.com lb.mydomain.com"
juju config kubeapi-load-balancer extra_sans="master.mydomain.com lb.mydomain.com"

To clear the entries out of the certificate, use an empty string:

juju config kubernetes-master extra_sans=""
juju config kubeapi-load-balancer extra_sans=""

DNS for the cluster

The DNS add-on allows pods to have DNS names in addition to IP addresses. The Kubernetes cluster DNS server (based on the SkyDNS library) supports forward lookups (A records), service lookups (SRV records) and reverse IP address lookups (PTR records). More information about the DNS can be obtained from the Kubernetes DNS admin guide.

Actions

You can run an action with the following command:

juju run-action kubernetes-master ACTION [parameters] [--wait]

apply-manifest

Apply a JSON-formatted Kubernetes manifest to the cluster

This action has the following parameters:


json

The content of the manifest to deploy in JSON format

Default:



cis-benchmark

Run the CIS Kubernetes Benchmark against snap-based components.

This action has the following parameters:


apply

Apply remediations to address benchmark failures. The default, 'none', will not attempt to fix any reported failures. Set to 'conservative' to resolve simple failures. Set to 'dangerous' to attempt to resolve all failures. Note: Applying any remediation may result in an unusable cluster.

Default: none


config

Archive containing configuration files to use when running kube-bench. The default value is known to be compatible with snap components. When using a custom URL, append '#<hash_type>=<checksum>' to verify the archive integrity when downloaded.

Default: https://github.com/charmed-kubernetes/kube-bench-config/archive/cis-1.5.zip#sha1=cb8e78712ee5bfeab87d0ed7c139a83e88915530


release

Set the kube-bench release to run. If set to 'upstream', the action will compile and use a local kube-bench binary built from the master branch of the upstream repository (https://github.com/aquasecurity/kube-bench). This value may also be set to an accessible archive containing a pre-built kube-bench binary, for example: https://github.com/aquasecurity/kube-bench/releases/download/v0.0.34/kube-bench_0.0.34_linux_amd64.tar.gz#sha256=f96d1fcfb84b18324f1299db074d41ef324a25be5b944e79619ad1a079fca077

Default: https://github.com/aquasecurity/kube-bench/releases/download/v0.2.3/kube-bench_0.2.3_linux_amd64.tar.gz#sha256=429a1db271689aafec009434ded1dea07a6685fee85a1deea638097c8512d548
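
Both the config and release parameters accept a URL with a '#<hash_type>=<checksum>' suffix. The Python sketch below shows one way such a suffix could be split off and checked against downloaded bytes. The function names are hypothetical and this is not necessarily how the action verifies archives.

```python
import hashlib
from urllib.parse import urldefrag


def split_checksum(url):
    """Split 'URL#<hash_type>=<checksum>' into (url, hash_type, checksum)."""
    base, fragment = urldefrag(url)
    hash_type, _, checksum = fragment.partition("=")
    return base, hash_type, checksum


def verify_archive(data: bytes, hash_type: str, checksum: str) -> bool:
    """Check downloaded bytes against the expected hex digest."""
    return hashlib.new(hash_type, data).hexdigest() == checksum.lower()


url = ("https://github.com/charmed-kubernetes/kube-bench-config/archive/"
       "cis-1.5.zip#sha1=cb8e78712ee5bfeab87d0ed7c139a83e88915530")
print(split_checksum(url))
```

After downloading the archive, passing its bytes together with the parsed hash type and checksum to verify_archive would confirm its integrity.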



create-rbd-pv

Create a RADOS Block Device (RBD) volume in Ceph and a PersistentVolume for it. Note this is deprecated on Kubernetes >= 1.10 in favor of CSI, where PersistentVolumes are created dynamically to back PersistentVolumeClaims.

This action has the following parameters:


filesystem

File system type to format the volume.

Default: xfs


mode

Access mode for the persistent volume.

Default: ReadWriteOnce


name

Name the persistent volume.

Default:


size

Size in MB of the RBD volume.

Default:


skip-size-check

Allow creation of overprovisioned RBD.

Default: False



debug

Collect debug data


get-kubeconfig

Retrieve Kubernetes cluster config, including credentials


namespace-create

Create new namespace

This action has the following parameters:


name

Namespace name, e.g. staging

Default:



namespace-delete

Delete namespace

This action has the following parameters:


name

Namespace name, e.g. staging

Default:



namespace-list

List existing k8s namespaces


restart

Restart the Kubernetes master services on demand.


upgrade

Upgrade the Kubernetes snaps

This action has the following parameters:


fix-cluster-name

If using the OpenStack cloud provider, whether to fix the cluster name sent to it to include the cluster tag. This fixes an issue with load balancers conflicting with other clusters in the same project but will cause new load balancers to be created which will require manual intervention to resolve.

Default: True



More information