# [K8s] Kubernetes Installation and Deployment Guide (V2)

## 0 Preface

Once you are familiar with the basic concepts and operating principles of K8s, you can try deploying a K8s cluster yourself. See: K8s概述 - 博客园/千千寰宇.

This post differs from the earlier one, [虚拟化/云原生] Kubernetes 安装部署指南 - 博客园/千千寰宇:

- Previous post: Docker as CRI/runtime, K8s server version 1.25.0, with `"exec-opts": ["native.cgroupdriver=systemd"]`
- This post: containerd as CRI/runtime, K8s server version 1.28.0

## 1 K8s Installation and Deployment

### Environment and prerequisites

- N CentOS 7 servers (N ≥ 3)
- 2 CPU cores and 2 GB RAM per server
- Refresh the YUM mirror source:

```bash
yum update
yum upgrade
```

### Step 1: Master + worker nodes — install and run Docker

```bash
yum -y install wget
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
yum -y install docker-ce

# Check the version
docker version

systemctl enable docker
systemctl start docker
```

### Step 2: Master + worker nodes — install kubeadm / kubectl / kubelet

Configure the Aliyun package repository:

```bash
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
#baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
#gpgcheck=0
repo_gpgcheck=1
#repo_gpgcheck=0
#gpgkey=https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
#exclude=kubelet kubeadm kubectl
EOF

# Refresh the YUM metadata
yum update
```

Disable SELinux — setting it to permissive mode effectively disables it. Running `setenforce 0` plus the `sed` command below puts SELinux into permissive mode; this is required so that containers can access the host filesystem, which the Pod network needs in order to work properly.

```bash
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
```

Turn off the firewall:

```bash
systemctl stop firewalld
systemctl disable firewalld
```

Disable swap — the kubelet will not work properly unless swap is off. For example, `swapoff -a` disables swap temporarily; to make the change survive reboots, also disable swap in configuration such as `/etc/fstab` or `systemd.swap`, depending on how your system is set up.

```bash
swapoff -a
```

Install and enable the kubelet:

```bash
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

# Start on boot
sudo systemctl enable --now kubelet
```

### Step 3: Deploy the master node

Inspect kubeadm's default configuration:

```bash
# kubeadm config print init-defaults
...
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.k8s.io
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
...
```

`apiVersion` and `kubernetesVersion` here must match the kubeadm.yaml written below.

- kind = InitConfiguration: initialization settings, such as the bootstrap token and the apiserver address
- kind = ClusterConfiguration: settings for master components such as apiserver, etcd, network, scheduler and controller-manager
- kind = KubeletConfiguration: settings for the kubelet component
- kind = KubeProxyConfiguration: settings for the kube-proxy component

Write kubeadm.yaml:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: 1.28.0
imageRepository: registry.aliyuncs.com/google_containers
#imageRepository: k8s.gcr.io
controllerManager: {}
dns:
  type: CoreDNS
apiServer:
  # extraArgs consists of key: value pairs
  extraArgs:
    runtime-config: "api/all=true"
etcd:
  local:
    dataDir: /data/k8s/etcd
scheduler: {}
```

The kubeadm ClusterConfiguration object exposes an `extraArgs` field that can override the default arguments passed to control-plane components (APIServer, ControllerManager and Scheduler).

Enable kubelet.service:

```bash
systemctl enable kubelet.service
```

Start the container runtime, containerd:

```bash
rm -rf /etc/containerd/config.toml

# Adjust the containerd configuration and add registry mirrors:
# 1) regenerate the default configuration, then edit it
sudo mkdir -p /etc/containerd
sudo containerd config default | sudo tee /etc/containerd/config.toml
```

Edit `/etc/containerd/config.toml`:

```toml
...
[plugins."io.containerd.grpc.v1.cri"]
  # Changed: 1 line
  # sandbox_image = "registry.k8s.io/pause:3.6"
  sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"

[plugins."io.containerd.grpc.v1.cri".registry]
...
  # Added: 3+2 lines (not counting comment/blank lines)
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
      # Aliyun mirror accelerator, obtained from https://cr.console.aliyun.com/cn-hangzhou/instances/mirrors
      endpoint = ["https://xxx.mirror.aliyuncs.com", "https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".registry.mirrors."registry.k8s.io"]
      endpoint = ["https://registry.aliyuncs.com/google_containers"]
...
```

```bash
systemctl restart containerd
systemctl status containerd
```
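The steps above touch several host settings (swap, SELinux, firewalld, containerd). Before moving on, it can save a failed `kubeadm init` to re-check them in one pass; the following is a minimal sketch using standard commands, where the expected values simply reflect the choices made in this guide:

```bash
#!/usr/bin/env bash
# Preflight sanity check for the host settings configured above (sketch).
set -u

echo "== swap (should list nothing) =="
swapon --show || true

echo "== SELinux (should be Permissive or Disabled) =="
getenforce

echo "== firewalld (should be inactive) =="
systemctl is-active firewalld || true

echo "== containerd (should be active) =="
systemctl is-active containerd

echo "== sandbox image configured in containerd =="
grep sandbox_image /etc/containerd/config.toml
```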
### Make bridged traffic visible to iptables / ip6tables

```bash
cd /etc/sysctl.d/
vi k8s-sysctl.conf
```

Add the following lines:

```
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
```

```bash
ls /etc/sysctl.d/k8s-sysctl.conf

# These two sysctl keys only exist once the br_netfilter kernel module is loaded
modprobe br_netfilter

# Apply immediately
sudo sysctl --system
```

### Time synchronization (all machines, optional)

```bash
# sudo yum install -y ntpdate

# Run a one-shot sync
# sudo ntpdate time.windows.com
6 Apr 00:57:35 ntpdate[8944]: step time server 52.231.114.183 offset -0.921723 sec
```
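Optionally, before running `kubeadm init` in the next step, you can pre-pull the control-plane images so that init itself is faster and any registry problem surfaces early. A short sketch using the standard kubeadm subcommands and the kubeadm.yaml written in Step 3:

```bash
# List the images kubeadm will use according to our config
sudo kubeadm config images list --config ~/k8s-deployments/kubeadm.yaml

# Pull them ahead of time through containerd
sudo kubeadm config images pull --config ~/k8s-deployments/kubeadm.yaml
```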
### Initialize the master node: kubeadm / kubelet

Initializing the master generates important files such as `/var/lib/kubelet/config.yaml`.

```bash
# kubeadm init --config ~/k8s-deployments/kubeadm.yaml
[init] Using Kubernetes version: v1.28.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local vm-a] and IPs [10.96.0.1 192.168.xx.211]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost vm-a] and IPs [192.168.xx.211 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost vm-a] and IPs [192.168.xx.211 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 7.507281 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node vm-a as control-plane by adding the labels: [node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node vm-a as control-plane by adding the taints [node-role.kubernetes.io/control-plane:NoSchedule]
[bootstrap-token] Using token: qi82d7.glltv3hltpe4aq08
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.xx.211:6443 --token qi82d7.glltv3hltpe4aq08 \
    --discovery-token-ca-cert-hash sha256:6054b8402053e9eb8f6cb134c066f3e28ae80aa5fd28cec002af1f4199383890
```
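If you lose the join command, both credentials in it can be regenerated on the master — the `kubeadm token create --print-join-command` shortcut is shown just below; a sketch of the longer manual route, using standard kubeadm and openssl invocations, follows:

```bash
# Issue a fresh bootstrap token (tokens expire after 24h by default)
sudo kubeadm token create

# List the current tokens
sudo kubeadm token list

# Recompute the discovery-token-ca-cert-hash from the cluster CA certificate
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt \
  | openssl rsa -pubin -outform der 2>/dev/null \
  | openssl dgst -sha256 -hex \
  | sed 's/^.* //'
```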
Notes:

- The cluster's Service network (a private range) = 10.96.0.0/12, kubeadm's default Service CIDR (configurable via `--service-cidr`).
- 10.96.0.1 is normally the ClusterIP of the Kubernetes API Server (used by in-cluster Pods to reach the API Server; it backs the default `kubernetes` Service). Once the cluster is up (after the later steps), you can verify this from inside the cluster.

If you did not save the join command, you can regenerate it on the master:

```bash
kubeadm token create --print-join-command
```

```bash
# Check the ClusterIP of the kubernetes service
# kubectl get svc kubernetes -n default
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   10h

# Check how all services are spread across the Service CIDR
# kubectl get svc --all-namespaces -o wide
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE   SELECTOR
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  10h   <none>
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   10h   k8s-app=kube-dns
```

This step used a configuration file; a plain command-line invocation works just as well. Demo (sourced from the web; screenshot omitted).

### Step 4: Master node — start using the cluster

Run the following on the master node only (not on worker nodes):

```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

Or, as root:

```bash
export KUBECONFIG=/etc/kubernetes/admin.conf
```

Check the cluster nodes:

```bash
# kubectl get nodes
NAME   STATUS     ROLES           AGE   VERSION
vm-a   NotReady   control-plane   9h    v1.28.2
```

### Step 5: Worker nodes — join the cluster

Run on worker nodes only, not on the master.

Install kubelet / kubectl / kubeadm (see the dedicated section above):

- kubeadm: the command that bootstraps the cluster.
- kubectl: the command-line tool for talking to the cluster.
- kubelet: runs on every node, starting Pods and containers.

Join the cluster:

```bash
kubeadm join 192.168.xx.211:6443 --token qi82d7.glltv3hltpe4aq08 \
    --discovery-token-ca-cert-hash sha256:6054b8402053e9eb8f6cb134c066f3e28ae80aa5fd28cec002af1f4199383890
```

### Step 6: Master node — install the CNI network plugin: Calico

Download the Calico images into containerd. This approach works around registries (e.g., Aliyun's) that may not carry the Calico images, and around docker.io connectivity problems.

```bash
# Step 1: pull the images
docker pull calico/cni:v3.26.1
docker pull calico/node:v3.26.1
docker pull calico/kube-controllers:v3.26.1
```

If docker.io is not directly reachable:

```bash
docker pull m.daocloud.io/docker.io/calico/cni:v3.26.1
docker pull m.daocloud.io/docker.io/calico/node:v3.26.1
docker pull m.daocloud.io/docker.io/calico/kube-controllers:v3.26.1

docker tag m.daocloud.io/docker.io/calico/cni:v3.26.1 docker.io/calico/cni:v3.26.1
docker tag m.daocloud.io/docker.io/calico/node:v3.26.1 docker.io/calico/node:v3.26.1
docker tag m.daocloud.io/docker.io/calico/kube-controllers:v3.26.1 docker.io/calico/kube-controllers:v3.26.1
```

```bash
# Step 2: import the docker images into containerd
ctr -n k8s.io images import <(docker save calico/cni:v3.26.1)
ctr -n k8s.io images import <(docker save calico/node:v3.26.1)
ctr -n k8s.io images import <(docker save calico/kube-controllers:v3.26.1)
```

Download calico.yaml manually and copy it to `~/k8s-deployments/calico.yaml`:

```bash
[root@vm-a ~]# touch ~/k8s-deployments/calico.yaml
```

- https://github.com/projectcalico/calico/blob/v3.26.1/manifests/calico.yaml (recommended)
- or https://docs.projectcalico.org/manifests/calico.yaml (not recommended: version mismatch, v3.26.1 is required)
- or https://calico-v3-25.netlify.app/archive/v3.25/manifests/calico.yaml (not recommended: version mismatch, v3.26.1 is required)
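Before applying the manifest, it is worth confirming that the imported images actually landed in the `k8s.io` containerd namespace, which is the one the kubelet reads from. A quick check with standard ctr/crictl commands:

```bash
# Images must live in the k8s.io namespace to be visible to Kubernetes
sudo ctr -n k8s.io images ls | grep calico

# The CRI view of the same image store
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock images | grep calico
```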
```bash
[root@vm-a ~]# kubectl apply -f ~/k8s-deployments/calico.yaml
poddisruptionbudget.policy/calico-kube-controllers configured
serviceaccount/calico-kube-controllers unchanged
serviceaccount/calico-node unchanged
configmap/calico-config unchanged
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org configured
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrole.rbac.authorization.k8s.io/calico-node configured
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrolebinding.rbac.authorization.k8s.io/calico-node unchanged
daemonset.apps/calico-node configured
deployment.apps/calico-kube-controllers configured
```

Check the running state:

```bash
# Once Calico is ready, check CoreDNS
[root@vm-a ~]# kubectl get pod -n kube-system -o wide
NAME                                       READY   STATUS    RESTARTS      AGE     IP                    NODE   NOMINATED NODE   READINESS GATES
calico-kube-controllers-7ddc4f45bc-7nvqn   1/1     Running   0             9m52s   172.16.51.65          vm-a   <none>           <none>
calico-node-sm5hw                          1/1     Running   0             9m52s   fd00:6868:6868::52b   vm-a   <none>           <none>
coredns-66f779496c-52htv                   1/1     Running   0             2d1h    172.16.51.67          vm-a   <none>           <none>
coredns-66f779496c-tsgrb                   1/1     Running   0             2d1h    172.16.51.66          vm-a   <none>           <none>
etcd-vm-a                                  1/1     Running   1             2d1h    fd00:6868:6868::af9   vm-a   <none>           <none>
kube-apiserver-vm-a                        1/1     Running   1             2d1h    fd00:6868:6868::af9   vm-a   <none>           <none>
kube-controller-manager-vm-a               1/1     Running   3 (53m ago)   2d1h    fd00:6868:6868::52b   vm-a   <none>           <none>
kube-proxy-7crkv                           1/1     Running   0             2d1h    fd00:6868:6868::af9   vm-a   <none>           <none>
kube-scheduler-vm-a                        1/1     Running   5 (53m ago)   2d1h    fd00:6868:6868::52b   vm-a   <none>           <none>

# If CoreDNS is still stuck, restart it
[root@vm-a ~]# kubectl rollout restart deployment coredns -n kube-system
```

```bash
[root@vm-a ~]# echo -e "\n=== Node status ===" && kubectl get nodes -o wide
NAME   STATUS   ROLES           AGE    VERSION   INTERNAL-IP           EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION            CONTAINER-RUNTIME
vm-a   Ready    control-plane   2d1h   v1.28.2   fd00:6868:6868::52b   <none>        CentOS Linux 7 (Core)   3.10.0-1160.el7.x86_64    containerd://1.6.33
```

### Step 7: Verify the deployment on the master node

Check the kubelet's state and logs:

```bash
sudo systemctl status kubelet --no-pager
sudo journalctl -xeu kubelet -n 50 --no-pager | tail -30
```

Check that the images were pulled successfully:

```bash
# sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock images | grep pause
registry.aliyuncs.com/google_containers/pause   3.9   e6f1816883972   322kB
```
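Typing `--runtime-endpoint` on every crictl call gets tedious; crictl can also read its endpoints from `/etc/crictl.yaml`. A small convenience sketch (optional, not required by this guide):

```bash
# Persist the endpoints so a plain `crictl ps` works
sudo crictl config --set runtime-endpoint=unix:///var/run/containerd/containerd.sock
sudo crictl config --set image-endpoint=unix:///var/run/containerd/containerd.sock

# Inspect the generated config, then test
cat /etc/crictl.yaml
sudo crictl images | head
```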
Check the containers' running state:

```bash
# sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a
CONTAINER       IMAGE           CREATED         STATE     NAME                      ATTEMPT   POD ID          POD
1f57bb4c9ab9d   ea1030da44aa1   6 minutes ago   Running   kube-proxy                0         4d286784c3ddb   kube-proxy-7crkv
280d71fdbe867   f6f496300a2ae   6 minutes ago   Running   kube-scheduler            1         9c57afb350343   kube-scheduler-vm-a
0b32789371828   4be79c38a4bab   6 minutes ago   Running   kube-controller-manager   1         14ab1f95b380a   kube-controller-manager-vm-a
2e821a289e1ca   73deb9a3f7025   6 minutes ago   Running   etcd                      1         b06ec8f19d648   etcd-vm-a
62288683995c3   bb5e0dde9054c   6 minutes ago   Running   kube-apiserver            1         e4b6ad50d3758   kube-apiserver-vm-a
```

For comparison, check docker:

```bash
docker ps -a
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
```

The empty output confirms the container runtime has indeed been switched to containerd.

## Z FAQ for K8s installation and deployment

### Q: How do I reset a K8s node?

```bash
# 1. Reset kubeadm (removes all cluster configuration and containers)
sudo kubeadm reset -f

# 2. Remove leftover configuration files and directories
sudo rm -rf /etc/kubernetes/
sudo rm -rf /var/lib/kubelet/
sudo rm -rf /var/lib/etcd/
sudo rm -rf ~/.kube/

# 3. Clean the CNI network configuration
sudo rm -rf /etc/cni/net.d/
sudo rm -rf /var/lib/cni/

# 4. Flush iptables rules (optional, but recommended)
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X

# 5. Restart the kubelet (or the whole machine)
sudo systemctl restart kubelet
# or reboot to be sure everything is clean
sudo reboot
```

Check whether the kubelet is running or erroring:

```bash
sudo systemctl status kubelet
sudo journalctl -xeu kubelet -n 100 --no-pager | tail -50
```

### Q: kubeadm init fails with a timeout waiting for the control-plane components?

Cause analysis: this error means `kubeadm init` timed out waiting for the control-plane components to come up. The most common causes are a kubelet that is not running properly, or a misconfigured container runtime.

1. Check the kubelet process state

```bash
sudo systemctl status kubelet
```

If it shows `inactive (dead)` or a failed state, try starting it:

```bash
sudo systemctl start kubelet
sudo systemctl enable kubelet
```

2. Inspect the detailed kubelet logs

```bash
sudo journalctl -xeu kubelet -n 200 --no-pager
```

Watch in particular for these errors:

- failed to run Kubelet
- node "xxx" not found
- cannot find cgroup
- container runtime is down

3. Check the container runtime (containerd)

```bash
# Is containerd running?
sudo systemctl status containerd

# If not, start it
sudo systemctl start containerd
sudo systemctl enable containerd

# Verify containerd via the CRI
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock version
```
4. Check the control-plane containers

```bash
# List all Kubernetes-related containers
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock ps -a | grep kube | grep -v pause

# If a container has failed, read its logs (replace CONTAINERID)
sudo crictl --runtime-endpoint unix:///var/run/containerd/containerd.sock logs CONTAINERID
```

Common causes and fixes

Cause A: kubelet misconfiguration

Check the kubelet configuration file:

```bash
cat /var/lib/kubelet/config.yaml
```

Common problems:

- `cgroupDriver` does not match containerd's (it should be `systemd`)
- node-name resolution issues

Fix a cgroupDriver mismatch:

```bash
# Check containerd's cgroup driver
sudo cat /etc/containerd/config.toml | grep SystemdCgroup

# Make sure the kubelet uses the same driver:
# edit /var/lib/kubelet/config.yaml and set:
#   cgroupDriver: systemd

# Restart both services
sudo systemctl restart containerd
sudo systemctl restart kubelet
```

Cause B: image pulls failing (very common inside mainland China)

The control-plane images cannot be pulled from registry.k8s.io.

Fix: use a domestic mirror.

Automatic approach: see the `/etc/containerd/config.toml` section above (verified by the author).

Manual approach (not verified):

```bash
# Does kubeadm.yaml specify an imageRepository?
grep imageRepository ~/k8s-deployments/kubeadm.yaml

# If not, add this to kubeadm.yaml:
#   imageRepository: registry.aliyuncs.com/google_containers

# Or pull the images by hand
sudo kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers

# Then re-initialize
sudo kubeadm init --config ~/k8s-deployments/kubeadm.yaml
```

Cause C: an earlier reset was incomplete

If you previously ran `kubeadm reset` but etcd or network configuration was left behind:

```bash
# Clean thoroughly
sudo kubeadm reset -f
sudo rm -rf /etc/kubernetes/ /var/lib/kubelet/ /var/lib/etcd/ /var/lib/cni/ /etc/cni/
sudo rm -rf ~/.kube/

# Remove leftover network interfaces
sudo ip link delete cni0 2>/dev/null || true
sudo ip link delete flannel.1 2>/dev/null || true

# Reboot
sudo reboot
```

Recommended troubleshooting flow:

```bash
# 1. Read the actual error first
sudo journalctl -xeu kubelet -n 100 --no-pager | tail -50

# 2. Handle it by error type; common cases:

# Case 1: "connection refused" when talking to containerd
sudo systemctl restart containerd
sudo systemctl restart kubelet

# Case 2: "ImagePullBackOff" or other image-related errors
sudo kubeadm config images pull --image-repository=registry.aliyuncs.com/google_containers

# Case 3: cgroup-related errors
# edit /var/lib/kubelet/config.yaml and ensure cgroupDriver: systemd
sudo systemctl restart kubelet

# 3. Re-initialize
sudo kubeadm init --config ~/k8s-deployments/kubeadm.yaml
```

This way you act on the concrete error rather than a bare timeout message.

### Q: Docker Engine does not implement the CRI that Kubernetes requires — how do I fix that?

Docker Engine does not implement the CRI needed for a container runtime to work with Kubernetes, so an additional service, cri-dockerd, must be installed. cri-dockerd is a project that preserves support for the legacy built-in Docker Engine integration, which was removed from the kubelet in version 1.24.
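For reference, a rough sketch of what installing cri-dockerd can look like on CentOS 7. The version number is a placeholder (check the Mirantis/cri-dockerd releases page for a current one), and the paths and unit names follow that project's packaging conventions at the time of writing — treat this as an outline under those assumptions, not a verified recipe:

```bash
# Hypothetical version pin — check https://github.com/Mirantis/cri-dockerd/releases
VER=0.3.4
wget "https://github.com/Mirantis/cri-dockerd/releases/download/v${VER}/cri-dockerd-${VER}.amd64.tgz"
tar xzf "cri-dockerd-${VER}.amd64.tgz"
sudo install -o root -g root -m 0755 cri-dockerd/cri-dockerd /usr/local/bin/cri-dockerd

# The systemd units ship in the source repo under packaging/systemd
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.service
wget https://raw.githubusercontent.com/Mirantis/cri-dockerd/master/packaging/systemd/cri-docker.socket
sudo mv cri-docker.service cri-docker.socket /etc/systemd/system/
sudo sed -i 's|/usr/bin/cri-dockerd|/usr/local/bin/cri-dockerd|' /etc/systemd/system/cri-docker.service

sudo systemctl daemon-reload
sudo systemctl enable --now cri-docker.socket
```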
### Q: How do I uninstall kubeadm?

```bash
sudo yum remove kubelet kubeadm kubectl
```

### Q: How do I reinstall a specific version of kubeadm?

```bash
yum install kubelet-1.23.17 kubeadm-1.23.17 kubectl-1.23.17 kubernetes-cni
```

### Q: How do I enable the kubelet?

```bash
sudo systemctl enable --now kubelet
```

### Q: How do I check the kubeadm version?

```bash
# kubeadm version
```

### Q: How do I change the version in kubeadm.yaml?

Edit kubeadm.yaml and change the version, e.g. to 1.23.0.

### Q: How do I deploy the master node with kubeadm?

See above.

### Q: The cluster reports "couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused"?

Problem description:

```bash
# kubectl get nodes
E0406 09:59:50.133311   30401 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
E0406 09:59:50.133757   30401 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
(the same error repeats several times)
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```

Cause analysis — the root cause: kubectl needs a kubeconfig file to know how to reach the cluster. The error shows it found none, so it fell back to the default localhost:8080.

Fix

Method 1:

```bash
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

Or, as root:

```bash
export KUBECONFIG=/etc/kubernetes/admin.conf
```

Check the cluster nodes:

```bash
# kubectl get nodes
NAME   STATUS     ROLES           AGE   VERSION
vm-a   NotReady   control-plane   9h    v1.28.2
```

Checking the Pods before the fix shows the same failure:

```bash
# kubectl get pods -n kube-system
E0406 00:39:29.018802   8634 memcache.go:265] couldn't get current server API group list: Get "http://localhost:8080/api?timeout=32s": dial tcp [::1]:8080: connect: connection refused
(the same error repeats several times)
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```
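To confirm which kubeconfig kubectl is actually using — helpful when the error above persists for one user but not another — the standard inspection commands are:

```bash
# Precedence, highest first: --kubeconfig flag, then $KUBECONFIG, then ~/.kube/config
echo "${KUBECONFIG:-<unset>}"
ls -l ~/.kube/config

# Show the effective, merged client configuration and active context
kubectl config view --minify
kubectl config current-context
```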
### Q: A node is stuck in NotReady — how do I troubleshoot?

NotReady means the node has not become fully ready. The most common causes are a missing CNI (container network) plugin, or failing kubelet health checks.

1. Look at the detailed status and events

```bash
# Describe the node
# kubectl describe node vm-a
...
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 06 Apr 2026 10:12:33 +0800   Mon, 06 Apr 2026 00:10:03 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 06 Apr 2026 10:12:33 +0800   Mon, 06 Apr 2026 00:10:03 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 06 Apr 2026 10:12:33 +0800   Mon, 06 Apr 2026 00:10:03 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Mon, 06 Apr 2026 10:12:33 +0800   Mon, 06 Apr 2026 00:10:03 +0800   KubeletNotReady              container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized
...
```

Focus on:

- the Conditions section (especially Ready and NetworkUnavailable)
- the Events section

2. Check the kubelet logs

```bash
# sudo journalctl -xeu kubelet -n 100 --no-pager | tail -50
...
Apr 06 10:17:13 vm-a kubelet[7901]: E0406 10:17:13.966639 7901 kubelet.go:2855] "Container runtime network not ready" networkReady="NetworkReady=false reason:NetworkPluginNotReady message:Network plugin returns error: cni plugin not initialized"
...
```

3. Check the system Pods

```bash
# sudo kubectl get pods -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
coredns-66f779496c-52htv       0/1     Pending   0          10h
coredns-66f779496c-tsgrb       0/1     Pending   0          10h
etcd-vm-a                      1/1     Running   1          10h
kube-apiserver-vm-a            1/1     Running   1          10h
kube-controller-manager-vm-a   1/1     Running   1          10h
kube-proxy-7crkv               1/1     Running   0          10h
kube-scheduler-vm-a            1/1     Running   1          10h
```

Most common cause: the CNI plugin is missing.

A freshly initialized cluster must have a CNI plugin installed (Calico, Flannel, Weave, ...); otherwise the node stays NotReady forever.

Install Calico (recommended for production):

```bash
# Install the Calico CNI
# kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml
poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
serviceaccount/calico-cni-plugin created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpfilters.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrole.rbac.authorization.k8s.io/calico-cni-plugin created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-cni-plugin created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created
```

Or, if you need to tune Calico's network settings (Pod CIDR, network mode), download custom-resources.yaml first, edit it, then deploy:

```bash
curl -O https://raw.githubusercontent.com/projectcalico/calico/v3.28.2/manifests/custom-resources.yaml

# Optionally rewrite the images to a domestic mirror (not verified)
sed -i 's|docker.io/calico|docker.mirrors.ustc.edu.cn/calico|g' custom-resources.yaml
```

In the file, pay particular attention to:

- Pod CIDR (`cidr`): defaults to 192.168.0.0/16; it must match the cluster's `--cluster-cidr` (a kube-apiserver parameter — check with `kubectl -n kube-system get pod kube-apiserver-xxx -o yaml | grep cluster-cidr`);
- network mode (`vxlanEnabled`): defaults to `true` (VXLAN mode, no inter-node BGP configuration needed); for higher performance set it to `false` and enable BGP (which requires extra BGP peer configuration).

Once edited, deploy it with `kubectl apply -f custom-resources.yaml`.
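While the CNI comes up, it helps to watch the DaemonSet rollout and node readiness rather than polling by hand; a small sketch with standard kubectl commands:

```bash
# Block until every calico-node pod has rolled out
kubectl -n kube-system rollout status daemonset/calico-node

# Then confirm the pods and the node state
kubectl get pods -n kube-system -o wide
kubectl get nodes
```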
Check the download/deployment progress:

```bash
# Wait until everything is Running
# kubectl get pod -n kube-system -o wide
# kubectl get nodes
```

Install Flannel (simpler; for non-production environments):

```bash
kubectl apply -f https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml
```

Other possible causes:

| Cause | How to check | Fix |
|---|---|---|
| CNI not installed | `kubectl get pods -n kube-system` shows no network Pods | install Calico/Flannel |
| kubelet cannot reach the API | `journalctl -u kubelet` shows connection errors | check firewall/certificates |
| Container runtime down | `sudo crictl ps` fails | restart containerd |
| Disk pressure / low memory | `kubectl describe node` shows DiskPressure | free disk space / scale up |
| kube-proxy not running | `kubectl get pods -n kube-system` shows no kube-proxy | inspect the kube-proxy Pod |

Summary — recommended troubleshooting flow:

```bash
# 1. Node details
kubectl describe node vm-a

# 2. System pods (is anything network-related running?)
kubectl get pods -n kube-system -o wide

# 3. If you only see kube-apiserver/kube-controller-manager/kube-scheduler/etcd
#    but no calico/flannel/coredns, the CNI is missing

# 4. Install a CNI (Calico here)
kubectl apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/calico.yaml

# 5. Wait for the pods to start
kubectl get pods -n kube-system -w

# 6. Check the node (usually Ready after 1-2 minutes)
kubectl get nodes
```

### Q: What about the Calico CNI plugin?

Calico

Calico is a full-featured CNI. Beyond Pod-to-Pod connectivity (via BGP, VXLAN and other modes), it natively integrates Network Policy, enabling fine-grained access control (restricting Pod-to-Pod traffic, blocking external access, and so on), which fully meets our requirements. Calico 3.28.2 is a stable release with strong compatibility (Kubernetes 1.24+) and excellent performance, making it a first choice for production.

Calico is an open-source layer-3 virtual networking solution for interconnecting cloud-native workloads and enforcing policy. Compared with Flannel, its advantage is Network Policy: users can dynamically define ACL rules over packets entering and leaving containers, applying security policy to Pod-to-Pod traffic on demand. Beyond that, Calico integrates with most orchestration-capable environments and can provide multi-host networking for VMs as well as containers.

(Calico runtime architecture diagram — image omitted.)

Flannel

Flannel is a lightweight CNI that provides Pod connectivity mainly via VXLAN or host-gw modes; its strengths are simple deployment and a small resource footprint. However, Flannel does not support Network Policy out of the box — even with add-on plugins the configuration is complex and the functionality incomplete — so it cannot satisfy a hard "enforce network policy" requirement and is ruled out here.
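To make the Network Policy difference concrete, here is a minimal default-deny-ingress policy of the kind Calico enforces but vanilla Flannel cannot; the `default` namespace is just an example:

```bash
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default        # example namespace
spec:
  podSelector: {}           # selects every Pod in the namespace
  policyTypes:
    - Ingress               # no ingress rules listed => all inbound traffic is denied
EOF
```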
### Q: How do I uninstall an old CNI (Flannel)?

```bash
# 1. Delete the old CNI's Pods and namespace (Flannel here)
kubectl delete ns kube-flannel
kubectl delete pod -n kube-system -l app=flannel

# 2. Clean the old CNI configuration on every node
rm -rf /etc/cni/net.d/*   # remove the CNI config files
systemctl restart kubelet # restart the kubelet so the change takes effect
```

### Q: Why can the cluster's internal Service IP (10.96.0.1) be pinged from the host's physical LAN (the 192.168 network)?

A classic networking question. If 10.96.0.1 answers pings from the 192.168 network, some routing or bridging mechanism must be at work.

Possible causes:

1. The host's routing table

The host has a route for the 10.96.0.0/12 range:

```bash
# Inspect the routing table on the host
# ip route | grep 10.96
(actual result: nothing)
```

Other possible outputs:

```
10.96.0.0/12 via 192.168.x.x dev eth0      # via some gateway
10.96.0.0/12 dev cni0 proto kernel         # via the CNI bridge
```

2. Kubernetes network modes: host networking or port mappings

| Scenario | Meaning |
|---|---|
| `hostNetwork: true` | the Pod uses the host network stack directly |
| `hostPort` | maps a container port onto the host |
| NodePort Service | exposes the service at `<NodeIP>:<Port>` |

3. The CNI plugin's bridge mode

Common Kubernetes CNIs (Flannel, Calico, Weave) will:

- create virtual bridges on the host (cni0, docker0, flannel.1)
- route the Pod and Service CIDRs through the host

```bash
# Inspect the host's bridges
# brctl show
bridge name   bridge id           STP enabled   interfaces
docker0       8000.024221f3205a   no
virbr0        8000.52540013f1cb   yes           virbr0-nic

# ip link show type bridge
3: virbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 52:54:00:xx:xx:cb brd ff:ff:ff:ff:ff:ff
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 02:42:21:xx:xx:5a brd ff:ff:ff:ff:ff:ff
```

4. iptables / IPVS forwarding rules

Kube-proxy creates NAT/forwarding rules on the host:

```bash
# iptables rules for kubernetes services
# sudo iptables -t nat -L | grep 10.96.0.1
KUBE-SVC-NPX46M4PTMTKRN6Y  tcp  --  anywhere  10.96.0.1   /* default/kubernetes:https cluster IP */ tcp dpt:https

# sudo iptables -t filter -L | grep 10.96
REJECT  udp  --  anywhere  10.96.0.10  /* kube-system/kube-dns:dns has no endpoints */ udp dpt:domain reject-with icmp-port-unreachable
REJECT  tcp  --  anywhere  10.96.0.10  /* kube-system/kube-dns:dns-tcp has no endpoints */ tcp dpt:domain reject-with icmp-port-unreachable
REJECT  tcp  --  anywhere  10.96.0.10  /* kube-system/kube-dns:metrics has no endpoints */ tcp dpt:9153 reject-with icmp-port-unreachable
```

5. MetalLB or a similar load balancer

If the cluster runs MetalLB, Service IPs may be exposed directly on the LAN:

- MetalLB's L2 mode announces 10.96.0.1 onto the LAN via ARP/NDP
- other devices on the LAN then believe 10.96.0.1 is local

How to confirm?

On the host:

```bash
# 1. Route lookup
# ip route get 10.96.0.1
10.96.0.1 via 192.168.xx.1 dev ens33 src 192.168.xx.211
    cache

# 2. Which path do packets take?
# traceroute 10.96.0.1
traceroute to 10.96.0.1 (10.96.0.1), 30 hops max, 60 byte packets
 1  XiaoQiang (192.168.xx.1)  1.927 ms  2.570 ms  2.502 ms
 2  192.168.1.1 (192.168.1.1)  3.097 ms  3.037 ms  2.963 ms
 3  100.64.xx.1 (100.64.xx.1)  10.148 ms  10.071 ms  10.007 ms
 4  183.222.xx.121 (183.222.xx.121)  6.912 ms  5.619 ms 183.222.xx.125 (183.222.xx.125)  5.784 ms
 5  10.96.0.1 (10.96.0.1)  7.407 ms  7.334 ms  8.256 ms

# 3. LAN ARP / IP-to-MAC resolution
# arp 10.96.0.1
10.96.0.1 (10.96.0.1) -- no entry

# 4. Interface addresses
# ip addr | grep 10.96
(actual result: nothing)

# 5. kube-proxy mode
# kubectl get configmap kube-proxy -n kube-system -o yaml | grep mode
mode: ""
```

On another machine in the LAN:

```bash
# Check the ARP table for 10.96.0.1's MAC address
arp -a | grep 10.96.0.1

# Compare with the host's MAC address
arp -a | grep <host IP>
```

Most likely situations:

| Your cluster setup | What explains the ping |
|---|---|
| Flannel/Calico with default config | the host acts as a router, forwarding 10.96.0.0/12 |
| MetalLB (L2 mode) | 10.96.0.1 is ARP-announced onto the LAN |
| Custom routes | a static route was added on the router/host |
| kube-proxy in IPVS mode | a virtual IP exists on the host |

Security reminder: if 10.96.0.1 were reachable across the whole LAN, the Kubernetes internal network would be exposed externally. Check:

- whether NetworkPolicy is configured correctly
- whether Service reachability needs to be restricted
- whether firewall rules are too permissive

Final conclusion, based on the traceroute path:

- 192.168.xx.1 (your home router)
- 192.168.1.1 (the modem in router mode / the upstream router)
- 100.64.x.1 (carrier CGNAT shared address — your broadband sits behind carrier-grade NAT, with no public IP)
- 183.222.x.x (the carrier's public egress)
- 10.96.0.1 (the local carrier's BRAS, the broadband access server; it 1) handles PPPoE dial-up authentication, 2) assigns your internet-facing IP, 3) applies rate limiting, policy and billing, 4) is your broadband's real upstream gateway)

Key points:

- This 10.96.0.1 is a device inside the carrier's network (a BRAS, SR, core router, gateway, etc.)
- Carriers route private addresses internally, so even though your traffic exits via the public network, it can still reach that device
- The IP is only visible inside the carrier's network; no one else on the internet can reach it
- This particular 10.96.0.1 is neither a public host nor an attacker — it is your local carrier's core access device. Pinging or tracerouting it is harmless and is even a common way to test local broadband quality.
- To measure real internet latency, do not ping 10.96.0.1; ping a public DNS instead, e.g. 114.114.114.114, 223.5.5.5 or 8.8.8.8.

### Q: What is the difference between the K8s Service CIDR (default 10.96.0.0/12) and the Pod CIDR?

The difference is fundamental — these are the two core address ranges in Kubernetes. In the simplest terms:

- Pod CIDR: real IPs used by individual containers
- Service CIDR: stable "virtual IPs" provided for services

1. Pod CIDR (podSubnet)

IPs for the Pods themselves.

Characteristics:

- every Pod gets a real IP when it starts
- dynamic and ephemeral: rebuild the Pod and the IP changes
- used for direct Pod-to-Pod communication
- real L2/L3-routed addresses
- directly pingable inside the cluster

Example: 10.244.0.0/16, with Pod IPs such as 10.244.1.5 or 10.244.2.8.

2. Service CIDR (serviceSubnet)

The 10.96.0.0/12 you are dealing with is this one. These are virtual IPs (ClusterIPs), not real NIC addresses.

Characteristics:

- used by Services, and fixed once assigned
- not bound to any real container or NIC
- provides load balancing plus service discovery
- forwarded by kube-proxy via iptables/ipvs
- you cannot ping the backing Pods through it; you can only reach service ports

Example: 10.96.0.0/12, with Service IPs such as 10.96.0.1 (kubernetes) and 10.96.0.10 (coredns).

```bash
# From the host, try to trace CoreDNS's 10.96.0.10 (result: unreachable)
# traceroute 10.96.0.10
...(fails => the cluster's 10.x private range is indeed not exposed to the host's LAN)
```

3. Side-by-side comparison

| Aspect | Pod CIDR | Service CIDR |
|---|---|---|
| Used by | every Pod | every Service |
| Real IP? | real, backed by a NIC | virtual, no NIC |
| Changes? | changes on every rebuild | fixed once created |
| Purpose | direct Pod-to-Pod traffic | stable entry point, load balancing |
| Directly pingable? | yes | may answer ping, but that says nothing about backend Pods |
| Network mode | real routing | NAT/iptables/ipvs forwarding |

4. How does traffic flow? (a simple analogy)

- Pod IP = someone's mobile number (changes with a new phone)
- Service IP = the company switchboard (never changes)
- You call the switchboard (Service IP) → it connects you to a person (Pod IP)

5. Why did you hit a conflict?

Because:

- your carrier's internal network is itself in 10.96.x.x
- the K8s default Service CIDR is also 10.96.0.0/12
- the routes clash, causing odd access behavior and traceroutes that end at carrier equipment

The fix is to move the Service CIDR to a range that does not overlap 10.96, for example:

- 10.255.0.0/16
- 172.31.0.0/16
- 192.168.100.0/20
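To see which Service CIDR a kubeadm cluster was actually built with, the value is recorded in the kubeadm-config ConfigMap and on the kube-apiserver static-Pod manifest; a quick inspection sketch:

```bash
# The ClusterConfiguration kubeadm stored at init time
kubectl -n kube-system get configmap kubeadm-config -o yaml | grep serviceSubnet

# The flag the API server is actually running with
sudo grep service-cluster-ip-range /etc/kubernetes/manifests/kube-apiserver.yaml
```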
### Q: How do I change the K8s default 10.96.x.x range to something else? // TODO

Problem description:

```bash
# Look at the CLUSTER-IP range
# kubectl get svc --all-namespaces
NAMESPACE     NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  25h
kube-system   kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   25h
```

What we want to change, 10.96.0.0/12, is the K8s Service CIDR (`service-cluster-ip-range` / `serviceSubnet`), not the Pod CIDR.

Solution: left as a TODO in the original. (For a fresh cluster, you can simply set `networking.serviceSubnet` in kubeadm.yaml before running `kubeadm init`; changing it on a running cluster effectively amounts to re-initializing.)

### Q: How do I list all container images across all namespaces with kubectl?

https://kubernetes.io/zh-cn/docs/tasks/access-application-cluster/list-all-running-container-images/

```bash
# kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c

# Output:
      1 busybox
      1 docker.io/calico/kube-controllers:v3.26.1
      1 docker.io/calico/node:v3.26.1
      2 registry.aliyuncs.com/google_containers/coredns:v1.10.1
      1 registry.aliyuncs.com/google_containers/etcd:3.5.9-0
      1 registry.aliyuncs.com/google_containers/kube-apiserver:v1.28.0
      1 registry.aliyuncs.com/google_containers/kube-controller-manager:v1.28.0
      1 registry.aliyuncs.com/google_containers/kube-proxy:v1.28.0
      1 registry.aliyuncs.com/google_containers/kube-scheduler:v1.28.0
```

And on the docker side:

```bash
# docker images
REPOSITORY   TAG   IMAGE ID   CREATED   SIZE

# docker ps
CONTAINER ID   IMAGE   COMMAND   CREATED   STATUS   PORTS   NAMES
```

## Y Recommended Reading

- K8s概述 - 博客园/千千寰宇
- [Docker] Docker 基础教程(概念/原理/基础操作) - 博客园/千千寰宇
- [Docker] 基于CENTOS7安装Docker环境 - 博客园/千千寰宇
- [Docker] Docker Compose 基础教程(概念/基础操作) - 博客园/千千寰宇

## X References

- Linux安装Kubernetes(k8s)详细教程 - CSDN
- kubeadm安装kubernetes 1.16.2 - 博客园 (recommended)
- kubeadm-config说明 - CSDN
- K8s集群CNI升级:Calico3.28.2安装全攻略 - 实践 - 博客园 — Flannel and Calico are both mainstream Kubernetes CNIs, but they differ critically on network-policy support
- kubernetes集群(k8s)之安装部署Calico 网络 - CSDN (highly recommended)