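Elasticsearch Shard Limit Reached
Symptoms
A colleague reported that WeCom chat history could no longer be searched, and Elasticsearch was reporting an exception. Investigation showed that the cluster had hit its shard limit: the limit was 2000 shards, and 1999 were already open.
DeepSeek suggested adding a parameter that sets the shard limit.
After the configuration file was edited accordingly, however, the Docker container would no longer start properly.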
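The arithmetic behind the shard limit can be sketched as follows. Elasticsearch compares the number of open shards against `cluster.max_shards_per_node` multiplied by the number of data nodes, and rejects operations that would push the cluster past that figure. The helper name `would_exceed_limit` is hypothetical, and the assumption that the next index needs one primary and one replica is an illustration, not something stated in the incident.

```python
def would_exceed_limit(open_shards: int, new_shards: int,
                       max_shards_per_node: int, data_nodes: int) -> bool:
    """Return True if adding new_shards would push the cluster past its limit."""
    cluster_limit = max_shards_per_node * data_nodes
    return open_shards + new_shards > cluster_limit


# Values from the incident: 1999 open shards, limit 2000, one data node.
# Assumption: the next index needs 1 primary + 1 replica = 2 shards.
print(would_exceed_limit(1999, 2, 2000, 1))  # True: 2001 > 2000
print(would_exceed_limit(1999, 2, 3000, 1))  # False: 2001 <= 3000
```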
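Resolution
Restoring the configuration file
Copy elasticsearch.yml from the container to the host so that it can be restored. Because the container cannot start properly, the command may need several attempts; it only has to succeed once.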
bash
docker cp chatsync-elasticsearch:/usr/share/elasticsearch/config/elasticsearch.yml .
Set the configuration file to:
yaml
## Set the key directly; no persistent: wrapper is needed in this file
cluster.max_shards_per_node: 3000
Copy the file back into the container and wait for the container to restart.
bash
docker cp elasticsearch.yml chatsync-elasticsearch:/usr/share/elasticsearch/config/elasticsearch.yml
Checking the startup log
The startup log showed that a problem remained:
json
{
"type": "server",
"timestamp": "2025-02-28T13:39:51,387+08:00",
"level": "INFO",
"component": "o.e.c.r.a.AllocationService",
"cluster.name": "docker-cluster",
"node.name": "elasticsearch",
"message": "Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[gupaoedu-wxcp-msg-2023-08-21][0]]]).",
"cluster.uuid": "9dflySuCQOCWvfZqolNj3Q",
"node.id": "FNI3xNTLQj6yYGaq8LI9Sg"
}
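The log entry above records the cluster moving from RED to YELLOW, which means all primary shards are assigned but some replicas are not. When scanning many such entries, the transition can be pulled out with a regular expression. This is a sketch only: the function name `health_transition` is hypothetical, and the pattern assumes the message wording shown in this log.

```python
import json
import re

# Matches messages such as:
# "Cluster health status changed from [RED] to [YELLOW] (reason: ...)"
STATUS_CHANGE = re.compile(
    r"Cluster health status changed from \[(\w+)\] to \[(\w+)\]"
)


def health_transition(log_line: str):
    """Return (old_status, new_status) from a JSON log entry, or None if absent."""
    entry = json.loads(log_line)
    match = STATUS_CHANGE.search(entry.get("message", ""))
    return match.groups() if match else None


# A trimmed copy of the entry above, serialized back to one JSON line.
line = json.dumps({
    "level": "INFO",
    "component": "o.e.c.r.a.AllocationService",
    "message": "Cluster health status changed from [RED] to [YELLOW] "
               "(reason: [shards started [[gupaoedu-wxcp-msg-2023-08-21][0]]]).",
})
print(health_transition(line))  # ('RED', 'YELLOW')
```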
Checking cluster health
Log in to Kibana: https://ke.gupaoedu.cn/csk/
bash
GET /_cluster/health?pretty
json
{
"cluster_name": "docker-cluster",
"status": "yellow",
"timed_out": false,
"number_of_nodes": 1,
"number_of_data_nodes": 1,
"active_primary_shards": 1012,
"active_shards": 1012,
"relocating_shards": 0,
"initializing_shards": 0,
"unassigned_shards": 991,
"delayed_unassigned_shards": 0,
"number_of_pending_tasks": 0,
"number_of_in_flight_fetch": 0,
"task_max_waiting_in_queue_millis": 0,
"active_shards_percent_as_number": 50.52421367948078
}
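The numbers in this response are consistent with each other, which the sketch below checks. Because the status is yellow, every primary is assigned, so the 991 unassigned shards must be replicas; on a single-node cluster a replica cannot be placed on the same node as its primary. The function name `summarize_health` is hypothetical, and the formula is a simplified reading of the response rather than Elasticsearch's own code.

```python
def summarize_health(health: dict) -> tuple[int, float]:
    """Return (total_shards, active_percent) computed from a cluster health dict."""
    total = (health["active_shards"]
             + health["initializing_shards"]
             + health["unassigned_shards"])
    percent = 100.0 * health["active_shards"] / total
    return total, percent


# Values copied from the health response above.
health = {
    "active_primary_shards": 1012,
    "active_shards": 1012,
    "initializing_shards": 0,
    "unassigned_shards": 991,
}
total, percent = summarize_health(health)
print(total, round(percent, 2))  # 2003 50.52
```

The total of 2003 shards is above the old limit of 2000 and below the new limit of 3000.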
bash
GET _cat/allocation?v
bash
shards disk.indices disk.used disk.avail disk.total disk.percent host      ip        node
  1012       83.1gb   558.3gb      229gb    787.3gb           70 10.0.1.60 10.0.1.60 elasticsearch
   991                                                                               UNASSIGNED
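Setting the shard limit
The cluster settings showed that the previously persisted value was still 2000. Persistent cluster settings take precedence over elasticsearch.yml, so the value in the file had no effect.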
bash
GET /_cluster/settings
json
{
"persistent": {
"cluster": {
"max_shards_per_node": "2000"
}
},
"transient": {}
}
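The order in which Elasticsearch resolves a setting explains this result: transient cluster settings win over persistent ones, which win over elasticsearch.yml, which wins over the built-in default. The lookup below is a toy model of that order, not Elasticsearch code; the function name `effective_setting` and the flat dictionaries are assumptions made for illustration.

```python
def effective_setting(name: str, transient: dict, persistent: dict,
                      config_file: dict, default=None):
    """Return the first value found, checking sources in precedence order."""
    for source in (transient, persistent, config_file):
        if name in source:
            return source[name]
    return default


# State of this cluster before the fix: 2000 persisted via the API,
# 3000 written to elasticsearch.yml.
value = effective_setting(
    "cluster.max_shards_per_node",
    transient={},
    persistent={"cluster.max_shards_per_node": "2000"},
    config_file={"cluster.max_shards_per_node": "3000"},
)
print(value)  # 2000
```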
Raising it to 3000 resolved the problem:
bash
PUT /_cluster/settings
{
"persistent" : {
"cluster" : {
"max_shards_per_node" : "3000"
}
},
"transient" : { }
}
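Outside Kibana, the same request could be prepared from a script. The sketch below only builds the HTTP request with the standard library and does not send it. The base URL `http://localhost:9200` and the function name `build_limit_request` are assumptions, and any authentication the cluster requires is left out.

```python
import json
import urllib.request


def build_limit_request(base_url: str, limit: int) -> urllib.request.Request:
    """Build a PUT /_cluster/settings request that persists the shard limit."""
    body = {"persistent": {"cluster": {"max_shards_per_node": limit}}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Assumed address; replace with the real Elasticsearch endpoint.
req = build_limit_request("http://localhost:9200", 3000)
print(req.get_method(), req.full_url)
```

Sending the request is left to the caller, for example with `urllib.request.urlopen(req)`.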