Replacing Supervisor Cluster Certificates

In my last post, I discussed replacing the supervisor cluster’s VIP cert and the load balancer cert. This article is a continuation of sorts.

A supervisor cluster is composed of many individual components that communicate with each other and with vCenter using certificates, which is why these certs need to be valid at all times. If one of them (for example, the API server’s etcd client cert) expires, then none of the kube components running on the worker nodes will be able to communicate with the etcd cluster.

Ideally, the certificates get replaced with every version upgrade (minor or major). But if for some reason you ended up with an invalid or expired certificate, then until last month your only option was to call GSS, who would manually replace those certs for you using a documented procedure. Thanks to feedback from many enterprise customers, we now have a cert-manager tool and a KB for this. I am going to walk through that KB and record the steps here.

Procedure:

Step 1: SSH into one of the supervisor cluster nodes and run the following command to note the validity of the certs. You can use this procedure to log in.

find / -type f \( -name "*.cert" -o -name "*.crt" \) -print 2>/dev/null | egrep -iv 'ca.crt$|ca-bundle.crt$|kubelet\/pods|var\/lib\/containerd|run\/containerd|backup' | xargs -L 1 -t -i bash -c 'openssl x509 -noout -text -in {}|grep After'

You should see an output similar to the one below

Though the certs are still valid for a good amount of time, I will go ahead and replace them to show how the tool works. Afterwards, I will run the same command again to highlight the difference in their validity.
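If you would rather see "days remaining" than a raw date, the expiry line printed by openssl can be converted with a small helper. This is only a sketch: days_until_expiry is a hypothetical name of my own, it is not part of any VMware tool, and it assumes GNU date (for the -d option) is available on the node.

```shell
# Sketch only: days_until_expiry is a hypothetical helper, not part of any VMware tool.
# It accepts either "notAfter=<date>" (from openssl x509 -enddate) or
# "Not After : <date>" (from openssl x509 -text) and prints whole days remaining.
# Assumes GNU date, which understands the -d option.
days_until_expiry() {
    # Strip the label so only the date itself is left.
    _date=$(printf '%s\n' "$1" | sed -e 's/^.*Not After *: *//' -e 's/^notAfter=//')
    # Optional second argument: "now" in epoch seconds (useful for testing).
    _now=${2:-$(date +%s)}
    _end=$(date -u -d "$_date" +%s) || return 1
    # Integer division, so partial days are truncated.
    echo $(( (_end - _now) / 86400 ))
}

# Example (the path is a placeholder):
# days_until_expiry "$(openssl x509 -noout -enddate -in /path/to/some.crt)"
```

A small or negative number tells you at a glance which certs need attention first.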

Step 2: Download the wcp_cert_manager zip file from KB 90627 and copy it to the vCenter where the supervisor cluster is enabled. You can do this using either WinSCP or the SCP command line utility.

Step 3: Unzip the file using the following command

unzip wcp_cert_manager.zip

Step 4: Run the following command to list the supervisor cluster(s) enabled on this vCenter.

./certmgr supervisors

Step 5: Run the following command to replace the certificates.

./certmgr certificates rotate

Note: If there are multiple supervisor clusters in the vCenter, you need to pass the "-c" flag and specify the cluster ID, which would be "domain-c8:5ad4efac-9894-450b-9a5e-613236c044ba" from the screenshot above.
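Since the cluster ID is long and easy to mistype, a quick format check before rotating can save a failed run. The sketch below is my own addition: is_cluster_id is a hypothetical helper, and the pattern is inferred only from the single example ID above, so treat it as an assumption rather than the official format. The exact placement of the -c flag should be confirmed against the KB.

```shell
# Sketch only: is_cluster_id is a hypothetical helper. The pattern is inferred
# from the one example ID in the note above and may not cover every format.
is_cluster_id() {
    printf '%s\n' "$1" | grep -Eq '^domain-c[0-9]+:[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$'
}

cluster_id="domain-c8:5ad4efac-9894-450b-9a5e-613236c044ba"
if is_cluster_id "$cluster_id"; then
    echo "Cluster ID looks well-formed: $cluster_id"
    # Then pass it with the -c flag as described in the note, for example:
    # ./certmgr certificates rotate -c "$cluster_id"
else
    echo "Unexpected cluster ID format: $cluster_id" >&2
fi
```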

Supervisor VM 1 – Overall Status – Ok
Supervisor VM 2 – Overall Status – Ok
Supervisor VM 3 – Overall Status – Ok
Spherelet certificates (on ESXi hosts) are being replaced as well

The spherelet certs can be found on the hosts under the directory /etc/vmware/spherelet.
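To check the expiry of the spherelet certs on a host without knowing the exact file names, something like the following could be used. It is a sketch under assumptions: list_cert_expiry is a hypothetical helper name, and it assumes openssl is available in the host's shell. It is written in plain POSIX sh so it does not depend on bash.

```shell
# Sketch only: list_cert_expiry is a hypothetical helper. It tries every regular
# file in a directory and prints the expiry line for those that openssl can
# parse as an X.509 certificate; other files (keys, configs) are skipped silently.
list_cert_expiry() {
    for _f in "$1"/*; do
        [ -f "$_f" ] || continue
        _end=$(openssl x509 -noout -enddate -in "$_f" 2>/dev/null) || continue
        echo "$_f: $_end"
    done
}

# Example, run on an ESXi host:
# list_cert_expiry /etc/vmware/spherelet
```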

That’s it. We have replaced all the relevant certificates required to keep our environment up and running 🙂

Validation

As promised above, I am going to run the same find command to check the validity of the certs after rotation.

Before replacement, all of the certs were valid until Jan 25; they are now valid until Feb 14 of next year 🙂

Conclusion

As you can see, except for vip.crt and load-balancer.crt, all the other valid certs got a longer validity from the replacement we just did. I honestly don’t have an answer as to why these two are not replaced, but I assume it is because we already have a UI to do it, which is what I wrote about in my previous blog.

However, this has made our lives much simpler, and I hope the tool will improve beyond this and gain more features in the near future.

As always, feel free to pass on your feedback by leaving a comment for me below. Happy learning!
