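Backing up etcd on a kubeadm-deployed Kubernetes cluster
A Kubernetes cluster deployed with kubeadm runs etcd as a container, so the etcdctl command is not available on the host. To take backups from the host, download the etcd binary release and use the etcdctl it ships.
Check the version and download the tool
Run etcdctl version inside the container through kubectl to find the etcd version: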
# kubectl exec -it -n kube-system etcd-k8s-master01 -- etcdctl version
etcdctl version: 3.5.4
API version: 3.5
With the version confirmed as 3.5.4, download the matching binary release from GitHub and copy etcdctl into a directory on the system PATH:
wget https://github.com/etcd-io/etcd/releases/download/v3.5.4/etcd-v3.5.4-linux-amd64.tar.gz
tar -xzf etcd-v3.5.4-linux-amd64.tar.gz
cp -a etcd-v3.5.4-linux-amd64/etcdctl /usr/bin/
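The version lookup and the download can be tied together so that the host binary always matches the server. The following is a minimal sketch: the helper names etcdctl_version_from_output and etcd_release_url are illustrative and not part of any tool, and the URL pattern is simply the one used in the wget command above.

```shell
# Extract "3.5.4" from the output of `etcdctl version` read on stdin.
# Carriage returns are stripped because `kubectl exec -it` allocates a TTY.
etcdctl_version_from_output() {
    tr -d '\r' | awk -F': ' '/^etcdctl version/ { print $2; exit }'
}

# Build the GitHub release URL for a version ($1) and architecture ($2, default amd64).
etcd_release_url() {
    version="$1"
    arch="${2:-amd64}"
    printf 'https://github.com/etcd-io/etcd/releases/download/v%s/etcd-v%s-linux-%s.tar.gz\n' \
        "$version" "$version" "$arch"
}

# Example usage (not executed here; the Pod name is the one from this cluster):
#   ver=$(kubectl exec -n kube-system etcd-k8s-master01 -- etcdctl version | etcdctl_version_from_output)
#   wget "$(etcd_release_url "$ver")"
```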
Inspect the etcd configuration
View the /etc/kubernetes/manifests/etcd.yaml file:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://192.168.4.11:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://192.168.4.11:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://192.168.4.11:2380
    - --initial-cluster=k8s-master01=https://192.168.4.11:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://192.168.4.11:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://192.168.4.11:2380
    - --name=k8s-master01
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.5.4-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
The --cert-file, --key-file and --trusted-ca-file flags under command give the certificate paths:
--cert-file=/etc/kubernetes/pki/etcd/server.crt
--key-file=/etc/kubernetes/pki/etcd/server.key
--trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
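Rather than reading the paths off by eye, the flag values can be pulled out of the manifest with sed. This is a minimal sketch: manifest_flag is a hypothetical helper name, and it assumes each flag sits on its own "- --flag=value" line as in the manifest above.

```shell
# Print the value of one etcd flag from a static Pod manifest.
# $1 = manifest path, $2 = flag name without the leading dashes.
# The pattern is anchored on "- --<flag>=", so "cert-file" does not match "peer-cert-file".
manifest_flag() {
    sed -n "s/^ *- --$2=//p" "$1" | head -n 1
}

# Example usage (not executed here):
#   manifest_flag /etc/kubernetes/manifests/etcd.yaml cert-file
#   manifest_flag /etc/kubernetes/manifests/etcd.yaml key-file
#   manifest_flag /etc/kubernetes/manifests/etcd.yaml trusted-ca-file
```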
Test the connection with etcdctl endpoint health:
# ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key endpoint health
https://127.0.0.1:2379 is healthy: successfully committed proposal: took = 24.627386ms
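Every etcdctl call in this post repeats the same endpoint and certificate flags. One way to avoid retyping them is a small shell wrapper. The sketch below assumes the paths found above; etcdctl_local is a made-up name, not an etcd command.

```shell
# Wrapper that supplies the connection flags used throughout this post
# and forwards any remaining arguments to etcdctl unchanged.
etcdctl_local() {
    ETCDCTL_API=3 etcdctl \
        --endpoints="https://127.0.0.1:2379" \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key \
        "$@"
}

# Example usage (not executed here):
#   etcdctl_local endpoint health
```

A wrapper function is used here instead of exporting ETCDCTL_* environment variables, so that the full commands shown elsewhere in this post keep working unchanged in the same shell.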
Backup
Using the information gathered above, back up with etcdctl to /opt/etcd-snapshot.db:
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot save /opt/etcd-snapshot.db
The final message, Snapshot saved at /opt/etcd-snapshot.db, indicates that the backup succeeded.
The following command shows the status of the backup:
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot status /opt/etcd-snapshot.db
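For recurring backups, a fixed file name such as /opt/etcd-snapshot.db is overwritten on every run. A common variation is to timestamp each snapshot and keep only the most recent few. The sketch below is one possible approach; the helper names, the etcd-snapshot-*.db naming scheme and the retention count are all assumptions, not something the original setup prescribes.

```shell
# Print a timestamped snapshot path inside directory $1.
snapshot_name() {
    printf '%s/etcd-snapshot-%s.db\n' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Keep the newest $2 snapshots (by modification time) in directory $1
# and delete the rest.
prune_snapshots() {
    ls -1t "$1"/etcd-snapshot-*.db 2>/dev/null \
        | tail -n +"$(( $2 + 1 ))" \
        | while IFS= read -r old; do
            rm -f -- "$old"
        done
}

# Example usage (not executed here), with the same flags as the save command above:
#   ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" \
#       --cacert=/etc/kubernetes/pki/etcd/ca.crt \
#       --cert=/etc/kubernetes/pki/etcd/server.crt \
#       --key=/etc/kubernetes/pki/etcd/server.key \
#       snapshot save "$(snapshot_name /opt)"
#   prune_snapshots /opt 7
```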
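Restore
Stop all API server instances
On every master node, stop the control-plane services: kube-apiserver, kube-controller-manager and kube-scheduler. Under kubeadm these run as static Pods, so moving their manifests out of the manifest directory makes kubelet stop them: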
mkdir /opt/backup
mv /etc/kubernetes/manifests/kube-* /opt/backup/
If you are unsure where the manifest directory is, you can locate it as follows:
- Check the kubelet service with systemctl status kubelet.service:
# systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Thu 2023-02-23 16:38:26 CST; 2 days ago
Docs: https://kubernetes.io/docs/
Main PID: 1226 (kubelet)
Tasks: 15
Memory: 83.2M
CGroup: /system.slice/kubelet.service
└─1226 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --container-runtime=remote --...
The output points to the drop-in file /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf.
- Inspect this 10-kubeadm.conf file:
# grep -Ev '^#' /usr/lib/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
EnvironmentFile=-/etc/sysconfig/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
KUBELET_CONFIG_ARGS references another file, /var/lib/kubelet/config.yaml.
- Run cat /var/lib/kubelet/config.yaml to view that configuration file:
apiVersion: kubelet.config.k8s.io/v1beta1
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.crt
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 0s
    cacheUnauthorizedTTL: 0s
cgroupDriver: systemd
clusterDNS:
- 172.16.0.10
clusterDomain: cluster.local
cpuManagerReconcilePeriod: 0s
evictionPressureTransitionPeriod: 0s
fileCheckFrequency: 0s
healthzBindAddress: 127.0.0.1
healthzPort: 10248
httpCheckFrequency: 0s
imageMinimumGCAge: 0s
kind: KubeletConfiguration
logging:
  flushFrequency: 0
  options:
    json:
      infoBufferSize: "0"
  verbosity: 0
memorySwap: {}
nodeStatusReportFrequency: 0s
nodeStatusUpdateFrequency: 0s
rotateCertificates: true
runtimeRequestTimeout: 0s
shutdownGracePeriod: 0s
shutdownGracePeriodCriticalPods: 0s
staticPodPath: /etc/kubernetes/manifests
streamingConnectionIdleTimeout: 0s
syncFrequency: 0s
volumeStatsAggPeriod: 0s
The staticPodPath setting in this file is the directory that holds the manifests of the Kubernetes control-plane components.
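The lookup above can be reduced to a single awk call once the kubelet configuration file is known. This is a minimal sketch; static_pod_path is a hypothetical helper name, and the default argument is the config path found in the previous step.

```shell
# Print the staticPodPath value from a kubelet configuration file.
# $1 = config file (defaults to the path found via 10-kubeadm.conf above).
static_pod_path() {
    awk '$1 == "staticPodPath:" { print $2; exit }' "${1:-/var/lib/kubelet/config.yaml}"
}

# Example usage (not executed here):
#   ls "$(static_pod_path)"
```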
Restore state on all etcd instances
Once the API services are stopped, restore etcd from the snapshot into a new data directory:
ETCDCTL_API=3 etcdctl --endpoints="https://127.0.0.1:2379" --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key snapshot restore /opt/etcd-snapshot.db --data-dir=/var/lib/etcd-restore
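In older versions this step may require --skip-hash-check to skip hash verification. The flag is also needed when the snapshot file was copied out of a data directory instead of being created with snapshot save, because such a file carries no integrity hash.
Change the hostPath mount directory in the etcd manifest: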
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd-restore
      type: DirectoryOrCreate
    name: etcd-data
That is, change the original path: /var/lib/etcd to path: /var/lib/etcd-restore.
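The manifest edit can also be scripted. The sketch below rewrites only the hostPath line and leaves the mountPath and --data-dir entries alone, since those refer to the path inside the container. retarget_etcd_data is a hypothetical helper name, and it assumes the path sits on its own "path: ..." line as in the manifest shown earlier.

```shell
# Print manifest $1 with the hostPath "path: $2" line changed to "path: $3".
# The match is anchored to the whole line, so "mountPath: /var/lib/etcd"
# and "path: /etc/kubernetes/pki/etcd" are left untouched.
retarget_etcd_data() {
    sed "s|^\( *\)path: $2\$|\1path: $3|" "$1"
}

# Example usage (not executed here): write the result outside the manifest
# directory first, then move it into place in one step.
#   retarget_etcd_data /etc/kubernetes/manifests/etcd.yaml /var/lib/etcd /var/lib/etcd-restore > /tmp/etcd.yaml
#   mv /tmp/etcd.yaml /etc/kubernetes/manifests/etcd.yaml
```

Writing to a temporary file and moving it afterwards is a design choice: kubelet watches the manifest directory, so it should only ever see a complete file.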
Restart all API server instances
mv /opt/backup/* /etc/kubernetes/manifests
systemctl restart kubelet
After a short wait, kubectl can be used normally again.
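Instead of waiting an arbitrary amount of time, the check can be polled. The following is a minimal sketch of a generic retry helper; wait_until is a made-up name, and the attempt count and delay in the usage line are arbitrary choices.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds to sleep between attempts,
# remaining arguments = the command to run.
wait_until() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage (not executed here): poll the API server for up to about 150 seconds.
#   wait_until 30 5 kubectl get nodes && echo "API server is back"
```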