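Renewing k8s Certificates
One day an application deployed on k8s started failing. Its logs showed that communication with the API server had failed, for this reason: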
certificate verify failed: certificate has expired
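Investigation showed that the k8s certificates generated by kubeadm are valid for 1 year. The official recommendation is to upgrade the cluster version regularly, because the certificates are renewed as part of a version upgrade. If you do not upgrade, you have to renew the certificates manually.
Certificate renewal steps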
NOTE
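- The following operations must be run on every Control Plane server.
- This example applies only to k8s clusters deployed with kubeadm.
Run the following command to check when the certificates expire: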
bash
kubeadm certs check-expiration
The command returns the following:
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[check-expiration] Error reading configuration from the Cluster. Falling back to default configuration
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Sep 18, 2024 03:37 UTC   <invalid>       ca                      no
apiserver                  Sep 18, 2024 03:37 UTC   <invalid>       ca                      no
apiserver-etcd-client      Sep 18, 2024 03:37 UTC   <invalid>       etcd-ca                 no
apiserver-kubelet-client   Sep 18, 2024 03:37 UTC   <invalid>       ca                      no
controller-manager.conf    Sep 18, 2024 03:37 UTC   <invalid>       ca                      no
etcd-healthcheck-client    Sep 18, 2024 03:37 UTC   <invalid>       etcd-ca                 no
etcd-peer                  Sep 18, 2024 03:37 UTC   <invalid>       etcd-ca                 no
etcd-server                Sep 18, 2024 03:37 UTC   <invalid>       etcd-ca                 no
front-proxy-client         Sep 18, 2024 03:37 UTC   <invalid>       front-proxy-ca          no
scheduler.conf             Sep 18, 2024 03:37 UTC   <invalid>       ca                      no
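Every certificate shows <invalid> in the RESIDUAL TIME column, which means all of them have expired. Run the following command to renew them: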
bash
kubeadm certs renew all
The command returns:
[renew] Reading configuration from the cluster...
[renew] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[renew] Error reading configuration from the Cluster. Falling back to default configuration
certificate embedded in the kubeconfig file for the admin to use and for kubeadm itself renewed
certificate for serving the Kubernetes API renewed
certificate the apiserver uses to access etcd renewed
certificate for the API server to connect to kubelet renewed
certificate embedded in the kubeconfig file for the controller manager to use renewed
certificate for liveness probes to healthcheck etcd renewed
certificate for etcd nodes to communicate with each other renewed
certificate for serving etcd renewed
certificate for the front proxy client renewed
certificate embedded in the kubeconfig file for the scheduler manager to use renewed
Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates.
Checking the expiration again gives the following result:
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Sep 18, 2025 07:32 UTC   364d            ca                      no
apiserver                  Sep 18, 2025 07:32 UTC   364d            ca                      no
apiserver-etcd-client      Sep 18, 2025 07:32 UTC   364d            etcd-ca                 no
apiserver-kubelet-client   Sep 18, 2025 07:32 UTC   364d            ca                      no
controller-manager.conf    Sep 18, 2025 07:32 UTC   364d            ca                      no
etcd-healthcheck-client    Sep 18, 2025 07:32 UTC   364d            etcd-ca                 no
etcd-peer                  Sep 18, 2025 07:32 UTC   364d            etcd-ca                 no
etcd-server                Sep 18, 2025 07:32 UTC   364d            etcd-ca                 no
front-proxy-client         Sep 18, 2025 07:32 UTC   364d            front-proxy-ca          no
scheduler.conf             Sep 18, 2025 07:32 UTC   364d            ca                      no
The certificates have been renewed: each one is now valid for another 364 days.
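Rather than reading the table by eye on every Control Plane server, the check can be scripted. The following is a minimal sketch, not part of kubeadm: `list_expired_certs` is a hypothetical helper name, and it assumes the column layout shown above, where the expiry date takes up fields 2 to 6 and RESIDUAL TIME is field 7.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a kubeadm feature): read the text output of
# `kubeadm certs check-expiration` on stdin and print the name of every
# certificate whose RESIDUAL TIME column is "<invalid>".
list_expired_certs() {
  # Assumes the layout shown in this article: field 1 is the certificate
  # name, fields 2-6 are the date ("Sep 18, 2024 03:37 UTC"), field 7 is
  # RESIDUAL TIME. Log lines and the header row never match.
  awk '$7 == "<invalid>" { print $1 }'
}

# Usage on a control plane node (not executed here):
#   kubeadm certs check-expiration | list_expired_certs
```

An empty result means no certificate is reported as expired.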
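Finally, restart kube-apiserver, kube-controller-manager, kube-scheduler and etcd. kubeadm runs them as static Pods, so moving their manifests out of /etc/kubernetes/manifests/ and back again makes the kubelet stop and recreate them: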
bash
cd /etc/kubernetes/manifests/
mv *.yaml ../
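# Wait about 20 seconds (the kubelet's default fileCheckFrequency) so the static Pods are actually stopped
sleep 20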
cd ..
mv *.yaml manifests/
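One caveat with the commands above: the second `mv *.yaml manifests/` runs in /etc/kubernetes/, so it would also move any other YAML file that happens to live there. A slightly more defensive sketch stashes the manifests in a dedicated temporary directory instead. `restart_static_pods` and the `manifests-stash` directory name are hypothetical, and the 20-second default assumes the kubelet's default `fileCheckFrequency`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: restart static Pods by moving their manifests out of
# the kubelet's manifest directory, waiting, and moving them back.
#   $1  manifest directory (default: /etc/kubernetes/manifests)
#   $2  seconds to wait    (default: 20, assumed kubelet fileCheckFrequency)
restart_static_pods() {
  local manifests_dir="${1:-/etc/kubernetes/manifests}"
  local wait_secs="${2:-20}"
  local stash

  # Create the stash next to the manifest directory so that only the files
  # moved here are moved back, and both moves stay on one filesystem.
  stash="$(mktemp -d "$(dirname "$manifests_dir")/manifests-stash.XXXXXX")" || return 1

  # Move the manifests out; the kubelet then stops the static Pods.
  mv "$manifests_dir"/*.yaml "$stash"/ || { rmdir "$stash"; return 1; }

  sleep "$wait_secs"

  # Move them back; the kubelet recreates the Pods with the new certificates.
  mv "$stash"/*.yaml "$manifests_dir"/ && rmdir "$stash"
}

# Usage on a control plane node (not executed here):
#   restart_static_pods
```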
etcd failure
After the certificates were renewed, etcd turned out not to be working properly. The error was:
Error creating pod: etcdserver: mvcc: database space exceeded
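This error means the etcd database has exceeded its space quota, so the space has to be reclaimed manually by compacting old revisions and defragmenting the database.
First download etcdctl: https://github.com/etcd-io/etcd/releases/download/v3.4.34/etcd-v3.4.34-linux-amd64.tar.gz
After extracting it, run the following commands: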
bash
ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
--key /etc/kubernetes/pki/apiserver-etcd-client.key \
endpoint status
127.0.0.1:2379, 715784118fc0c46d, 3.5.9, 2.1 GB, false, false, 62, 129354155, 129354155, memberID:8167141660220769389 alarm:NOSPACE
ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
--key /etc/kubernetes/pki/apiserver-etcd-client.key \
endpoint status --write-out="json" | grep revision
[{"Endpoint":"127.0.0.1:2379","Status":{"header":{"cluster_id":16530211416065483436,"member_id":8167141660220769389,"revision":109753907,"raft_term":62},"version":"3.5.9","dbSize":2147610624,"leader":2066636333178759182,"raftIndex":129354091,"raftTerm":62,"raftAppliedIndex":129354091,"errors":["memberID:8167141660220769389 alarm:NOSPACE "],"dbSizeInUse":2772992}}]
ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
--key /etc/kubernetes/pki/apiserver-etcd-client.key \
compact 109753907
compacted revision 109753907
ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
--key /etc/kubernetes/pki/apiserver-etcd-client.key \
defrag
Finished defragmenting etcd member[127.0.0.1:2379]
ETCDCTL_API=3 etcdctl \
--cacert /etc/kubernetes/pki/etcd/ca.crt \
--cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
--key /etc/kubernetes/pki/apiserver-etcd-client.key \
endpoint status
127.0.0.1:2379, 715784118fc0c46d, 3.5.9, 2.0 MB, false, false, 62, 129354611, 129354611, memberID:8167141660220769389 alarm:NOSPACE
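The database size dropped from 2.1 GB before compaction to 2.0 MB afterwards. Note that the last status output still reports alarm:NOSPACE: compaction and defragmentation free the space, but the alarm stays raised until it is cleared with etcdctl alarm disarm.
For details on etcd space compaction, see the Maintenance page of the official etcd documentation.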
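The sequence above can be condensed into a few shell functions. This is a sketch under stated assumptions rather than an official procedure: the function names `etcdctl_k8s`, `extract_revision` and `reclaim_etcd_space` are hypothetical, the certificate paths are the ones used in this article, and the revision is pulled out of the JSON with `grep` and `cut` instead of being copied by hand. It also ends with `alarm disarm`, which the listing above does not run.

```bash
#!/usr/bin/env bash
# Hypothetical helpers condensing the manual etcd maintenance steps.

# Run etcdctl with the TLS flags used in this article.
etcdctl_k8s() {
  ETCDCTL_API=3 etcdctl \
    --cacert /etc/kubernetes/pki/etcd/ca.crt \
    --cert /etc/kubernetes/pki/apiserver-etcd-client.crt \
    --key /etc/kubernetes/pki/apiserver-etcd-client.key \
    "$@"
}

# Read `endpoint status --write-out=json` on stdin and print the first
# "revision" value found, e.g. 109753907.
extract_revision() {
  grep -o '"revision":[0-9]*' | head -n 1 | cut -d: -f2
}

# Compact up to the current revision, defragment, then clear the alarm.
reclaim_etcd_space() {
  local rev
  rev="$(etcdctl_k8s endpoint status --write-out=json | extract_revision)"
  [ -n "$rev" ] || { echo "could not determine the current revision" >&2; return 1; }
  etcdctl_k8s compact "$rev" &&
    etcdctl_k8s defrag &&
    etcdctl_k8s alarm disarm
}

# Usage on a control plane node (not executed here):
#   reclaim_etcd_space
```

Nothing runs until `reclaim_etcd_space` is called, so the file can be sourced safely and each step can still be run by hand through `etcdctl_k8s`.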