The eventual destination of this journey is to demonstrate cloud native capabilities: elastic load handling, zero downtime operations, agile application updates, dynamic feature testing and so on. In Part 1 we saw how to take an example application packaged in a docker image, customize it slightly and push the customized image to an OCI Registry. Here in Part 2 we'll look at elastic load handling. The load we are addressing is many users or many similar problems in parallel, as opposed to single large computations in sequence: horizontal scale, not vertical scale. We'll achieve this by running multiple copies of the docker container from Part 1 in a cluster orchestrated by kubernetes.
The traditional horizontal scale solution is to aggregate identical servers into a cluster and interact with the cluster as if it were a single server. The elastic part comes in when the cluster can grow or shrink dynamically based on load and on how well the members are keeping up: if the existing members are overwhelmed, add more members; if they are underwhelmed, remove some.
In this blog we are only scaling; we'll visit the elastic capability in a later blog on service mesh.
Creating a cluster in OCI is as simple as pressing a button. Go to the main menu, choose Developer Services and select Container Clusters (OKE).
Use the kubernetes command line tool kubectl to access the OKE cluster. Install kubectl with your package manager, for example dnf, apt or yum. Set up access by copying the kubeconfig for your cluster as described in the OCI documentation. (You will also need to set up the OCI command line tool if you have not already done so.)
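The kubeconfig copy is a one-liner with the OCI CLI; a sketch, where the cluster OCID and region are placeholders to be replaced with your own values:

```shell
# Merge this cluster's kubeconfig into ~/.kube/config
# (the cluster OCID and region below are placeholders)
oci ce cluster create-kubeconfig \
  --cluster-id ocid1.cluster.oc1...aaaa \
  --file $HOME/.kube/config \
  --region eu-frankfurt-1

# Sanity check that kubectl can reach the cluster
kubectl get nodes
```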
To kick things off let's run a single copy of the default SageMath image, analogous to what we did with docker (docker run -it sagemath/sagemath). To test launching the default Sage image, run it in a pod from the kubectl command line. Once that's working we'll move on to a declarative description in a yaml file and set up a deployment to manage multiple pods and running instances. A kubernetes pod roughly corresponds to a cluster member in our opening discussion. (If terms like deployment, pod, service, replica set and node are unfamiliar, there is a good overview in the kubernetes documentation.)
Use the kubectl run command to launch the image directly:
# kubectl run --generator=run-pod/v1 --image=sagemath/sagemath sagemath-app sage-jupyter --port=8888
pod/sagemath-app created
Once the container is running we need access from outside the cluster in order to test. Create an external access path by exposing the pod via a load balancer service.
# kubectl expose pod sagemath-app --port=8888 --name=sagemath-single --type=LoadBalancer
service/sagemath-single exposed
Find the external IP address by looking up the newly created service:
# kubectl get services
NAME              TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
kubernetes        ClusterIP      x.x.x.x      none          443/TCP          25d
sagemath-single   LoadBalancer   x.x.x.x      x.x.x.x       8888:30811/TCP   37m
Browse to x.x.x.x:8888 to check the running container. Notice that OKE manages port 8888 on the load balancer VCN automatically, so you will not need to open it manually.
Great, the container is up and running, but where is the login token? The console output where we retrieved the token in Part 1 is not immediately visible, but it can be displayed using the kubectl logs command. There is a more general problem at hand here of integrating authentication with OCI IAM; we will look at solutions to that problem in detail in a future blog.
The console output (stdout) is available; use kubectl to display the logs for the pod.
# kubectl logs -f sagemath-app
[I 07:38:03.240 NotebookApp] Using MathJax: nbextensions/mathjax/MathJax.js
[I 07:38:03.247 NotebookApp] Writing notebook server cookie secret to /home/sage/.local/share/jupyter/runtime/notebook_cookie_secret
[I 07:38:03.437 NotebookApp] Serving notebooks from local directory: /home/sage
[I 07:38:03.437 NotebookApp] The Jupyter Notebook is running at:
[I 07:38:03.437 NotebookApp] http://(sagemath-app or 127.0.0.1):8888/?token=23a6f3186b2941c52e5694a5d538f13c1e9542265f78f6c6
[I 07:38:03.437 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 07:38:03.441 NotebookApp]
    To access the notebook, open this file in a browser:
        file:///home/sage/.local/share/jupyter/runtime/nbserver-1-open.html
    Or copy and paste one of these URLs:
        http://(sagemath-app or 127.0.0.1):8888/?token=23a6f3186b2941c52e5694a5d538f13c1e9542265f78f6c6
[W 07:38:19.106 NotebookApp] Clearing invalid/expired login cookie username-132-145-255-128-8888
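Before moving on to the declarative setup, the single pod and its service from this experiment can be cleaned up; a sketch, using the names created above:

```shell
# Remove the test pod launched with "kubectl run"
kubectl delete pod sagemath-app

# Remove the load balancer service that exposed it
kubectl delete service sagemath-single
```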
Now that we've seen an analog to "docker run" using "kubectl run", let's proceed to the normal method of writing a yaml description of our desired cluster (deployment, service, replicas and pods). Declaring three replicas will give us three members in our cluster. Writing yaml for kubernetes isn't too much of a hassle; introductions are easy to find, but the quickest way to learn is by example. Here is an example that represents a deployment with three replicas and a load balancer service. (Use kubectl delete to clean up the pod and service above before creating these new objects.)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sagemath-app
spec:
  selector:
    matchLabels:
      app: sagemath-app
  replicas: 3
  template:
    metadata:
      labels:
        app: sagemath-app
    spec:
      containers:
      - name: sagemath-app
        image: sagemath/sagemath:latest
        args:
        - sage-jupyter
        ports:
        - containerPort: 8888
---
apiVersion: v1
kind: Service
metadata:
  name: sagemath-single
spec:
  type: LoadBalancer
  ports:
  - port: 8888
    protocol: TCP
    targetPort: 8888
  selector:
    app: sagemath-app
Apply or create using kubectl, and note the difference: create is the imperative command, apply the declarative one. Apply will create the objects if nothing exists and will update them to the declared spec in the yaml file if they already exist. If you wanted to change the replicas from three to five, change the declaration in the yaml file and kubectl apply.
# kubectl apply -f sagemath.yaml
deployment.apps/sagemath-app created
service/sagemath-single created
# kubectl get pods
NAME                            READY   STATUS    RESTARTS   AGE
sagemath-app-677fc4477d-9d9kl   1/1     Running   0          18s
sagemath-app-677fc4477d-dg9h6   1/1     Running   0          18s
sagemath-app-677fc4477d-gdmbj   1/1     Running   0          18s
# kubectl get services
NAME              TYPE           CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
kubernetes        ClusterIP      x.x.x.x      none          443/TCP          25d
sagemath-single   LoadBalancer   x.x.x.x     x.x.x.x        8888:30947/TCP   28s
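As an aside to the create-versus-apply distinction, a replica change can also be made imperatively; a sketch, assuming the deployment name above (note that a later kubectl apply would reconcile the count back to whatever the yaml declares):

```shell
# Imperative alternative to editing replicas: in the yaml file;
# a subsequent "kubectl apply -f sagemath.yaml" will reset the
# count to the declared value
kubectl scale deployment sagemath-app --replicas=5

# Watch the new pods come up
kubectl get pods
```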
Once the container is running in all three pods and the service is up, open a browser to the external IP:8888. All three cluster members are running, which is good news, but there is bad news: the application is unusable. When you enter the login token you don't know which pod you've connected to, so you need to try any or all of the three tokens to log in. Once logged in, your next click may go to a different pod via the load balancer, and you need a second token to log in again. And this is just the beginning of the problems: if you create a notebook on one pod, your next network exchange may go to a different pod where the notebook does not exist. The larger problem surfaced here is session state management and whether the containers are stateful or not. The simplest fix, which is more a matter of sweeping the problem under the rug, is to implement "sticky sessions" based on client IP: the load balancer keeps track of a client's IP address and, once it sends that client's traffic to a pod, sends all subsequent traffic from the client to the same pod. This can be enabled by adding sessionAffinity: ClientIP to the spec section of the service definition as shown in the clip below.
---
apiVersion: v1
kind: Service
metadata:
  name: sagemath-single
spec:
  type: LoadBalancer
  ports:
  - port: 8888
    protocol: TCP
    targetPort: 8888
  sessionAffinity: ClientIP
  selector:
    app: sagemath-app
Add the session affinity line to sagemath.yaml and apply the change.
# kubectl apply -f sagemath.yaml
deployment.apps/sagemath-app unchanged
service/sagemath-single configured
Notice the deployment is unchanged and the service reconfigured. So things should be better now, right? Trying it out, we find the sessions aren't "sticking" as expected and the problem persists. Take a look at the service using kubectl describe and we can see why:
# kubectl describe service sagemath-single
Name:                     sagemath-single
Namespace:                default
Labels:                   none
Annotations:              kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"name":"sagemath-single","namespace":"default"},"spec":{"ports":[{"port":...
Selector:                 app=sagemath-app
Type:                     LoadBalancer
IP:                       10.96.136.4
LoadBalancer Ingress:     x.x.x.x
Port:                     unset  8888/TCP
TargetPort:               8888/TCP
NodePort:                 unset  30066/TCP
Endpoints:                10.244.0.6:8888,10.244.1.3:8888,10.244.2.4:8888
Session Affinity:         ClientIP
External Traffic Policy:  Cluster
Events:
  Type     Reason                      Age                    From                Message
  ----     ------                      ----                   ----                -------
  Normal   EnsuredLoadBalancer         31m                    service-controller  Ensured load balancer
  Normal   EnsuringLoadBalancer        2m16s (x7 over 32m)    service-controller  Ensuring load balancer
  Warning  CreatingLoadBalancerFailed  2m16s (x6 over 4m52s)  service-controller  Error creating load balancer (will retry): failed to ensure load balancer for service default/sagemath-single: invalid service: OCI only supports SessionAffinity "None" currently
We needed a more robust solution anyway, so now we have an additional reason to pursue something better. The sticky approach itself is acceptable in lieu of the more complete solution of fixing session state management and moving notebook storage out of the containers, and sticky sessions based on a browser cookie will suffice, so we'll take that route. First let's step back and get our customized image from the OCIR registry as promised up top. (Clean up the deployment and service using kubectl delete; note you can use the .yaml file with the delete command.)
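The cleanup can be driven by the same yaml file that created the objects; a sketch:

```shell
# Delete every object declared in the file: the deployment
# (and with it the replica set and pods) and the service
kubectl delete -f sagemath.yaml
```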
In order to use the custom image we pushed to OCIR in Part 1, we need to get the registry path and authorization into the deployment description in sagemath.yaml.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: sagemath-prod
spec:
  selector:
    matchLabels:
      app: sagemath-prod
  replicas: 3
  template:
    metadata:
      labels:
        app: sagemath-prod
    spec:
      containers:
      - name: sagemath-prod
        image: fra.ocir.io/intvravipati/jhf/sagemath:latest
        args:
        - sage-jupyterlab
        ports:
        - containerPort: 8888
      imagePullSecrets:
      - name: ocirsecret
---
apiVersion: v1
kind: Service
metadata:
  name: sagemath-http
spec:
  type: LoadBalancer
  ports:
  - port: 8888
    protocol: TCP
    targetPort: 8888
  selector:
    app: sagemath-prod
We are using the sage-jupyterlab arg at container launch to get our JupyterLab customization instead of the default Jupyter notebook. The image path should look familiar from Part 1 where we pushed the image to OCIR. The Auth Token used for docker login is required to authorize the pull. To avoid putting the token in the .yaml file, we point to it there as imagePullSecrets: and store it as a kubernetes secret object. Use kubectl to create the object as shown, substituting your Auth Token for xxxxx in --docker-password= below. (Also substitute your tenancy and username/email info accordingly, of course.)
kubectl create secret docker-registry ocirsecret \
  --docker-server=fra.ocir.io \
  --docker-username='intvravipati/John.Featherly@oracle.com' \
  --docker-password='xxxxx' \
  --docker-email='John.Featherly@oracle.com'
Once the ocirsecret object has been created apply the app description file sagemath.yaml using kubectl apply -f sagemath.yaml.
# kubectl apply -f sagemath.yaml
deployment.apps/sagemath-prod created
service/sagemath-http created
# kubectl get deployments
NAME            DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
sagemath-prod   3         3         3            3           4s
# kubectl get services
NAME            TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
kubernetes      ClusterIP      10.96.0.1      none          443/TCP          162m
sagemath-http   LoadBalancer   10.96.208.83   x.x.x.x       8888:31395/TCP   54m
# kubectl get pods
NAME                             READY   STATUS    RESTARTS   AGE
sagemath-prod-65dd578fcc-62b2b   1/1     Running   0          14s
sagemath-prod-65dd578fcc-dtqqw   1/1     Running   0          14s
sagemath-prod-65dd578fcc-lh5zr   1/1     Running   0          14s
Browse to the external IP:8888 and you should be able to get to JupyterLab on the pods, as we did before with the base image and Jupyter notebook. That was fairly painless; now we have the session state problem to address.
The LoadBalancer service has been implicitly providing ingress by creating an external IP address for the cluster and directing traffic from the outside to cluster members. To gain more control and capability we'll break this apart and explicitly use a kubernetes Ingress object along with a traffic director running in a separate pod. There are a number of options for the cookie-capable traffic director. Eventually we will want sophisticated traffic control to implement canary testing and the like using istio, but for now we'll choose the widely popular nginx. First let's remove the load balancer and the accompanying implied ingress by removing the type: LoadBalancer line from the spec section of the service in sagemath.yaml. A name for the port has also been added; the section will now look like:
---
apiVersion: v1
kind: Service
metadata:
  name: sagemath-http
spec:
  ports:
  - name: http
    port: 8888
    protocol: TCP
    targetPort: 8888
  selector:
    app: sagemath-prod
Apply the change with kubectl apply -f sagemath.yaml. (If you get an error 'The Service "sagemath-http" is invalid', delete the old LoadBalancer service first and then retry the kubectl apply.) The service type should have changed to the default, ClusterIP.
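If the apply is rejected, the delete-and-retry sequence looks like this; the final get should report TYPE as ClusterIP:

```shell
# Remove the old LoadBalancer service, then re-apply the
# edited yaml and verify the new service type
kubectl delete service sagemath-http
kubectl apply -f sagemath.yaml
kubectl get service sagemath-http
```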
There are a number of references on deploying an nginx ingress controller, the OCI documentation for example; unfortunately the link to the .yaml file referenced in that doc is broken. I have used the nginx installation guide.
Reproducing the main steps here: first create a separate namespace (apart from default) for the controller.
# kubectl create namespace ingress-nginx
Create a file kustomization.yaml with the following content.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: ingress-nginx
bases:
- github.com/kubernetes/ingress-nginx/deploy/cluster-wide
- github.com/kubernetes/ingress-nginx/deploy/cloud-generic
Apply the kustomization with kubectl apply --kustomize .
# kubectl apply --kustomize .
serviceaccount/nginx-ingress-serviceaccount created
role.rbac.authorization.k8s.io/nginx-ingress-role created
clusterrole.rbac.authorization.k8s.io/nginx-ingress-clusterrole created
rolebinding.rbac.authorization.k8s.io/nginx-ingress-role-nisa-binding created
clusterrolebinding.rbac.authorization.k8s.io/nginx-ingress-clusterrole-nisa-binding created
configmap/nginx-configuration created
configmap/tcp-services created
configmap/udp-services created
service/ingress-nginx created
deployment.apps/nginx-ingress-controller created
The ingress-nginx service has the external IP that we'll use to access the cluster. The last step is to create an explicit ingress object that defines the frontend DNS name, the backend service we are connecting to, and the affinity cookie parameters. Create a file named ingress.yaml and enter the following contents:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx-sagemath
  annotations:
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "route"
    nginx.ingress.kubernetes.io/session-cookie-expires: "129600"
    nginx.ingress.kubernetes.io/session-cookie-max-age: "129600"
spec:
  rules:
  - host: sage.featherly.net
    http:
      paths:
      - backend:
          serviceName: sagemath-http
          servicePort: 8888
        path: /
Apply as usual with kubectl apply -f ingress.yaml.
The DNS name is required, so either make an entry in your /etc/hosts or create an A record on your DNS server. The IP address to put in DNS is found from the ingress or the service:
# kubectl get ingress
NAME             HOSTS                ADDRESS   PORTS   AGE
nginx-sagemath   sage.featherly.net   x.x.x.x   80      41s
# kubectl get services --namespace=ingress-nginx
NAME            TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)                      AGE
ingress-nginx   LoadBalancer   10.96.223.69   x.x.x.x       80:32219/TCP,443:32280/TCP   56s
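For a quick local test, a hosts file entry is enough; a sketch, substituting the EXTERNAL-IP of the ingress-nginx service for x.x.x.x:

```shell
# /etc/hosts entry mapping the ingress host rule to the
# nginx load balancer's external IP (substitute the real IP)
x.x.x.x    sage.featherly.net
```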
Browse to sage.featherly.net (and find the login token) to get JupyterLab as we had in Part 1. The app is finally usable. Try creating folders, notebooks, terminals and so on; you should find everything working as expected. Check your browser for cookies and you should find the "route" cookie we're using for session affinity.
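The affinity cookie can also be checked from the command line; a sketch with curl, assuming the DNS or hosts entry above is in place (the ingress listens on port 80):

```shell
# Request the app through the ingress and print the response
# headers; nginx should set the "route" session cookie
curl -s -D - -o /dev/null http://sage.featherly.net/ | grep -i '^set-cookie'
```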
We've scaled our customized app from a single docker container to a kubernetes cluster. In Part 3 we'll take a look at application management topics like service mesh, canary testing and zero downtime upgrades.