Kubernetes 1.6: Multi-user, Multi-workloads at Scale

Tuesday, March 28, 2017

Kubernetes 1.6: Multi-user, Multi-workloads at Scale

Today we’re announcing the release of Kubernetes 1.6.

In this release the community’s focus is on scale and automation, to help you deploy multiple workloads to multiple users on a cluster. We are announcing that 5,000 node clusters are supported. We moved dynamic storage provisioning to stable. Role-based access control (RBAC), kubefed, kubeadm, and several scheduling features are moving to beta. We have also added intelligent defaults throughout to enable greater automation out of the box.

What’s New

Scale and Federation : Large enterprise users looking for proof of at-scale performance will be pleased to know that Kubernetes’ stringent scalability SLO now supports 5,000 node (150,000 pod) clusters. This 150% increase in total cluster size, powered by a new version of etcd v3 by CoreOS, is great news if you are deploying applications such as search or games which can grow to consume larger clusters.

For users who want to scale beyond 5,000 nodes or spread across multiple regions or clouds, federation lets you combine multiple Kubernetes clusters and address them through a single API endpoint. In this release, the kubefed command line utility graduated to beta - with improved support for on-premise clusters. kubefed now automatically configures kube-dns on joining clusters and can pass arguments to federated components.

Security and Setup : Users concerned with security will find that RBAC, now beta adds a significant security benefit through more tightly scoped default roles for system components. The default RBAC policies in 1.6 grant scoped permissions to control-plane components, nodes, and controllers. RBAC allows cluster administrators to selectively grant particular users or service accounts fine-grained access to specific resources on a per-namespace basis. RBAC users upgrading from 1.5 to 1.6 should view the guidance here

Users looking for an easy way to provision a secure cluster on physical or cloud servers can use kubeadm, which is now beta. kubeadm has been enhanced with a set of command line flags and a base feature set that includes RBAC setup, use of the Bootstrap Token system and an enhanced Certificates API.

Advanced Scheduling : This release adds a set of powerful and versatile scheduling constructs to give you greater control over how pods are scheduled, including rules to restrict pods to particular nodes in heterogeneous clusters, and rules to spread or pack pods across failure domains such as nodes, racks, and zones.

Node affinity/anti-affinity, now in beta, allows you to restrict pods to schedule only on certain nodes based on node labels. Use built-in or custom node labels to select specific zones, hostnames, hardware architecture, operating system version, specialized hardware, etc. The scheduling rules can be required or preferred, depending on how strictly you want the scheduler to enforce them.

A related feature, called taints and tolerations, makes it possible to compactly represent rules for excluding pods from particular nodes. The feature, also now in beta, makes it easy, for example, to dedicate sets of nodes to particular sets of users, or to keep nodes that have special hardware available for pods that need the special hardware by excluding pods that don’t need it.

Sometimes you want to co-schedule services, or pods within a service, near each other topologically, for example to optimize North-South or East-West communication. Or you want to spread pods of a service for failure tolerance, or keep antagonistic pods separated, or ensure sole tenancy of nodes. Pod affinity and anti-affinity, now in beta, enables such use cases by letting you set hard or soft requirements for spreading and packing pods relative to one another within arbitrary topologies (node, zone, etc.).

Lastly, for the ultimate in scheduling flexibility, you can run your own custom scheduler(s) alongside, or instead of, the default Kubernetes scheduler. Each scheduler is responsible for different sets of pods. Multiple schedulers is beta in this release. 

Dynamic Storage Provisioning : Users deploying stateful applications will benefit from the extensive storage automation capabilities in this release of Kubernetes.

Since its early days, Kubernetes has been able to automatically attach and detach storage, format disk, mount and unmount volumes per the pod spec, and do so seamlessly as pods move between nodes. In addition, the PersistentVolumeClaim (PVC) and PersistentVolume (PV) objects decouple the request for storage from the specific storage implementation, making the pod spec portable across a range of cloud and on-premise environments. In this release StorageClass and dynamic volume provisioning are promoted to stable, completing the automation story by creating and deleting storage on demand, eliminating the need to pre-provision.

The design allows cluster administrators to define and expose multiple flavors of storage within a cluster, each with a custom set of parameters. End users can stop worrying about the complexity and nuances of how storage is provisioned, while still selecting from multiple storage options.

In 1.6 Kubernetes comes with a set of built-in defaults to completely automate the storage provisioning lifecycle, freeing you to work on your applications. Specifically, Kubernetes now pre-installs system-defined StorageClass objects for AWS, Azure, GCP, OpenStack and VMware vSphere by default. This gives Kubernetes users on these providers the benefits of dynamic storage provisioning without having to manually setup StorageClass objects. This is a change in the default behavior of PVC objects on these clouds. Note that default behavior is that dynamically provisioned volumes are created with the “delete” reclaim policy. That means once the PVC is deleted, the dynamically provisioned volume is automatically deleted so users do not have the extra step of ‘cleaning up’.

In addition, we have expanded the range of storage supported overall including:

  • ScaleIO Kubernetes Volume Plugin enabling pods to seamlessly access and use data stored on ScaleIO volumes.
  • Portworx Kubernetes Volume Plugin adding the capability to use Portworx as a storage provider for Kubernetes clusters. Portworx pools your server capacity and turns your servers or cloud instances into converged, highly available compute and storage nodes.
  • Support for NFSv3, NFSv4, and GlusterFS on clusters using the COS node image 
  • Support for user-written/run dynamic PV provisioners. A golang library and examples can be found here.
  • Beta support for mount options in persistent volumes

Container Runtime Interface, etcd v3 and Daemon set updates : while users may not directly interact with the container runtime or the API server datastore, they are foundational components for user facing functionality in Kubernetes’. As such the community invests in expanding the capabilities of these and other system components.

  • The Docker-CRI implementation is beta and is enabled by default in kubelet. Alpha support for other runtimes, cri-o, frakti, rkt, has also been implemented.
  • The default backend storage for the API server has been upgraded to use etcd v3 by default for new clusters. If you are upgrading from a 1.5 cluster, care should be taken to ensure continuity by planning a data migration window. 
  • Node reliability is improved as Kubelet exposes an admin configurable Node Allocatable feature to reserve compute resources for system daemons.
  • Daemon set updates lets you perform rolling updates on a daemon set

Alpha features : this release was mostly focused on maturing functionality, however, a few alpha features were added to support the roadmap

  • Out-of-tree cloud provider support adds a new cloud-controller-manager binary that may be used for testing the new out-of-core cloud provider flow
  • Per-pod-eviction in case of node problems combined with tolerationSeconds, lets users tune the duration a pod stays bound to a node that is experiencing problems
  • Pod Injection Policy adds a new API resource PodPreset to inject information such as secrets, volumes, volume mounts, and environment variables into pods at creation time.
  • Custom metrics support in the Horizontal Pod Autoscaler changed to use 
  • Multiple Nvidia GPU support is introduced with the Docker runtime only

These are just some of the highlights in our first release for the year. For a complete list please visit the release notes.

Community
This release is possible thanks to our vast and open community. Together, we’ve pushed nearly 5,000 commits by some 275 authors. To bring our many advocates together, the community has launched a new program called K8sPort, an online hub where the community can participate in gamified challenges and get credit for their contributions. Read more about the program here.

Release Process

A big thanks goes out to the release team for 1.6 (lead by Dan Gillespie of CoreOS) for their work bringing the 1.6 release to light. This release team is an exemplar of the Kubernetes community’s commitment to community governance. Dan is the first non-Google release manager and he, along with the rest of the team, worked throughout the release (building on the 1.5 release manager, Saad Ali’s, great work) to uncover and document tribal knowledge, shine light on tools and processes that still require special permissions, and prioritize work to improve the Kubernetes release process. Many thanks to the team.

User Adoption

We’re continuing to see rapid adoption of Kubernetes in all sectors and sizes of businesses. Furthermore, adoption is coming from across the globe, from a startup in Tennessee, USA to a Fortune 500 company in China. 

  • JD.com, one of China’s largest internet companies, uses Kubernetes in conjunction with their OpenStack deployment. They’ve move 20% of their applications thus far on Kubernetes and are already running 20,000 pods daily. Read more about their setup here
  • Spire, a startup based in Tennessee, witnessed their public cloud provider experience an outage, but suffered zero downtime because Kubernetes was able to move their workloads to different zones. Read their full experience here.

“With Kubernetes, there was never a moment of panic, just a sense of awe watching the automatic mitigation as it happened.”

  • Share your Kubernetes use case story with the community here.

Availability
Kubernetes 1.6 is available for download here on GitHub and via get.k8s.io. To get started with Kubernetes, try one of the these interactive tutorials

Get Involved
CloudNativeCon + KubeCon in Berlin is this week March 29-30, 2017. We hope to get together with much of the community and share more there!

Share your voice at our weekly community meeting

Many thanks for your contributions and advocacy!

– Aparna Sinha, Senior Product Manager, Kubernetes, Google

PS: read this series of in-depth articles on what’s new in Kubernetes 1.6