Using Redis Clusters

Overview

Redis offers three cluster modes: master-slave mode, Sentinel mode, and Cluster mode. This article demonstrates their use with Redis 6.0.1 on Debian 10.

Master-Slave Mode

Introduction to Master-Slave Mode

Master-slave mode is the simplest of the three. In this mode, databases fall into two categories: the master database (master) and the slave databases (slaves).
Master-slave mode has the following characteristics:

  • The master handles both reads and writes; when a write changes data, the change is automatically synchronized to the slaves
  • Slaves are generally read-only and receive the data synchronized from the master
  • A master can have multiple slaves, but each slave belongs to exactly one master
  • A slave going down affects neither reads on the other slaves nor reads and writes on the master; after restarting, it synchronizes its data back from the master
  • If the master goes down, reads on the slaves are unaffected, but Redis stops accepting writes; write service resumes once the master restarts
  • When the master goes down, no slave is promoted from the slave nodes to become the new master

How it works: when a slave starts, it sends a SYNC command to the master. On receiving SYNC, the master saves a snapshot in the background (RDB persistence) and buffers the write commands received while the snapshot is being taken, then sends both the snapshot file and the buffered commands to the slave. The slave loads the snapshot file and replays the buffered commands.

After this initial synchronization, the master forwards every write command it receives to its slaves, keeping master and slave data consistent.
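The propagation mechanism just described can be sketched as a toy model in Python (hypothetical classes, not the redis-py API): the master applies each write locally, then forwards the same command to every attached slave.

```python
class Node:
    """Minimal in-memory key-value store."""
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value

class Master(Node):
    """Applies writes locally, then forwards them to its replicas."""
    def __init__(self):
        super().__init__()
        self.replicas = []

    def set(self, key, value):
        self.apply(key, value)
        for replica in self.replicas:   # command propagation after initial sync
            replica.apply(key, value)

master = Master()
slave = Node()
master.replicas.append(slave)
master.set("key1", "100")
print(slave.data["key1"])  # -> 100: the write reached the slave
```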

Security: once the master has a password set, clients need that password to access it; a slave also needs the password to start replication, which is set in its configuration file; clients can access a slave without a password.

Drawback: as the above shows, the master is unique in master-slave mode; if it goes down, Redis cannot provide write service.

Setting Up Master-Slave Mode

Environment Preparation

  • master node 192.168.50.11
  • slave node 192.168.50.12
  • slave node 192.168.50.13

Download and Installation

mkdir /Redis

cd /Redis

wget http://download.redis.io/releases/redis-6.0.1.tar.gz

tar zxf redis-6.0.1.tar.gz && mv redis-6.0.1/ /usr/local/redis

cd /usr/local/redis && make && make install

A problem you may run into:

zmalloc.h:50:10: fatal error: jemalloc/jemalloc.h: No such file or directory
#include <jemalloc/jemalloc.h>
^~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [Makefile:293: adlist.o] Error 1
make[1]: Leaving directory '/usr/local/redis/src'
make: *** [Makefile:6: all] Error 2

The cause is that jemalloc overrides the ANSI C malloc and free functions on Linux. The fix is to add a parameter when running make.

make MALLOC=libc

That is:

cd /usr/local/redis && make MALLOC=libc && make install

Creating a Redis Service

Create a service file, referring to the default one (/usr/local/redis/utils/systemd-redis_server.service):

vim /usr/lib/systemd/system/redis.service

[Unit]
Description=Redis persistent key-value database
After=network.target
After=network-online.target

[Service]
ExecStart=/usr/local/bin/redis-server /usr/local/redis/redis.conf
ExecStop=/Redis/redis-shutdown
Type=forking
LimitNOFILE=10032
NoNewPrivileges=yes
TimeoutStartSec=5
TimeoutStopSec=5
UMask=0077
User=redis
Group=redis
RuntimeDirectory=redis
RuntimeDirectoryMode=0755

[Install]
WantedBy=multi-user.target

Create the service shutdown script:

vim /Redis/redis-shutdown

#!/bin/bash
#
# Wrapper to properly shut down redis and sentinel
test x"$REDIS_DEBUG" != x && set -x

REDIS_CLI=/usr/local/bin/redis-cli

# Retrieve service name
SERVICE_NAME="$1"
if [ -z "$SERVICE_NAME" ]; then
    SERVICE_NAME=redis
fi

# Get the proper config file based on service name
CONFIG_FILE="/usr/local/redis/$SERVICE_NAME.conf"

# Use awk to retrieve host, port from config file
HOST=`awk '/^[[:blank:]]*bind/ { print $2 }' $CONFIG_FILE | tail -n1`
PORT=`awk '/^[[:blank:]]*port/ { print $2 }' $CONFIG_FILE | tail -n1`
PASS=`awk '/^[[:blank:]]*requirepass/ { print $2 }' $CONFIG_FILE | tail -n1`
SOCK=`awk '/^[[:blank:]]*unixsocket\s/ { print $2 }' $CONFIG_FILE | tail -n1`

# Just in case, use default host, port
HOST=${HOST:-127.0.0.1}
if [ "$SERVICE_NAME" = redis ]; then
    PORT=${PORT:-6379}
else
    PORT=${PORT:-26379}
fi

# Setup additional parameters
# e.g. password-protected redis instances
[ -z "$PASS" ] || ADDITIONAL_PARAMS="-a $PASS"

# shutdown the service properly
if [ -e "$SOCK" ] ; then
    $REDIS_CLI -s $SOCK $ADDITIONAL_PARAMS shutdown
else
    $REDIS_CLI -h $HOST -p $PORT $ADDITIONAL_PARAMS shutdown
fi

Run the following commands:

chmod +x /Redis/redis-shutdown

useradd -s /sbin/nologin redis

chown -R redis:redis /usr/local/redis

mkdir -p /data/redis # create the directory for database backup files

chown -R redis:redis /data/redis

apt-get install -y bash-completion && source /etc/profile # install bash command completion

systemctl daemon-reload

systemctl enable redis

Modifying the Redis Configuration

First back up the default configuration file /usr/local/redis/redis.conf, then modify the original.

master node 192.168.50.11

vim /usr/local/redis/redis.conf

bind 192.168.50.11 # listening IPs; separate multiple IPs with spaces
port 6379
daemonize yes # allow running in the background
supervised systemd
logfile "/usr/local/redis/redis.log" # log path
dir /data/redis # directory for database backup files
masterauth 123456    # password slaves use to connect to the master; may be omitted on the master
requirepass 123456   # password required to connect to this master; may be omitted on slaves
appendonly yes # generate appendonly.aof under /data/redis/ and append every write request to it

slave node 192.168.50.12

vim /usr/local/redis/redis.conf

bind 192.168.50.12
port 6379
daemonize yes
supervised systemd
logfile "/usr/local/redis/redis.log"
dir /data/redis
replicaof 192.168.50.11 6379
masterauth 123456 # password slaves use to connect to the master; may be omitted on the master
requirepass 123456 # password required to connect to the master; may be omitted on slaves
appendonly yes

slave node 192.168.50.13

vim /usr/local/redis/redis.conf

bind 192.168.50.13
port 6379
daemonize yes
supervised systemd
logfile "/usr/local/redis/redis.log"
dir /data/redis
replicaof 192.168.50.11 6379
masterauth 123456 # password slaves use to connect to the master; may be omitted on the master
requirepass 123456 # password required to connect to the master; may be omitted on slaves
appendonly yes

Start Redis on each of the three hosts

systemctl start redis

Checking Cluster Status

Connect to the master node with redis-cli:

redis-cli -h 192.168.50.11 -a 123456

Then run info replication; the output is shown below. Both slave nodes are online.

# Replication
role:master
connected_slaves:2
slave0:ip=192.168.50.12,port=6379,state=online,offset=196,lag=0
slave1:ip=192.168.50.13,port=6379,state=online,offset=196,lag=0
master_replid:cbe130618b6daa41b1018c3f4a6f1ad61aab8da9
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:196
master_repl_meaningful_offset:0
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:196

Connect to one of the slave nodes:

redis-cli -h 192.168.50.12 -a 123456

Then run info replication; the output is shown below. The master node is up.

# Replication
role:slave
master_host:192.168.50.11
master_port:6379
master_link_status:up
master_last_io_seconds_ago:1
master_sync_in_progress:0
slave_repl_offset:350
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:d567faadfceb894806bcdaaf6ea244458001ea3a
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:350
master_repl_meaningful_offset:14
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:15
repl_backlog_histlen:336

Data Demonstration

On the master node:

192.168.50.11:6379> keys *
(empty array)
192.168.50.11:6379> set key1 100
OK
192.168.50.11:6379> set key2 mengmei.moe
OK
192.168.50.11:6379> keys *
1) "key2"
2) "key1"

On a slave node:

192.168.50.12:6379> keys *
1) "key2"
2) "key1"
192.168.50.12:6379> config get dir
1) "dir"
2) "/data/redis"
192.168.50.12:6379> config get dbfilename
1) "dbfilename"
2) "dump.rdb"
192.168.50.12:6379> get key1
"100"
192.168.50.12:6379> get key2
"mengmei.moe"
192.168.50.12:6379> set key3 GUNDAM
(error) READONLY You can't write against a read only replica.

As you can see, data written on the master node is quickly synchronized to the slave nodes, and data cannot be written on a slave node.

Sentinel Mode

Introduction to Sentinel Mode

The weakness of master-slave mode is the lack of high availability: once the master goes down, Redis can no longer accept writes. Sentinel was created to address this.

A sentinel is, as the name suggests, a sentry: its job is to monitor the health of the Redis cluster. Its characteristics are:

  • Sentinel mode is built on top of master-slave mode; with only a single Redis node, sentinel is pointless
  • When the master goes down, sentinel picks one of the slaves as the new master and rewrites the configuration files; the other slaves' configuration files are updated too, e.g. their slaveof settings are pointed at the new master
  • When the old master restarts, it no longer acts as a master; it joins as a slave and synchronizes data from the new master
  • Since sentinel itself is a process that can die, multiple sentinels are run together to form a sentinel cluster
  • With multiple sentinels configured, the sentinels also monitor one another automatically
  • If the master-slave setup uses passwords, sentinel synchronizes those settings into the configuration files as well; there is nothing to worry about
  • A sentinel or sentinel cluster can manage several master-slave Redis deployments, and several sentinels can monitor the same Redis
  • Sentinels are best not deployed on the same machines as Redis; otherwise, when a Redis server dies, its sentinel dies with it

How it works:

  • Each sentinel sends a PING command once per second to the masters, slaves, and other sentinel instances it knows about
  • If an instance's last valid reply to PING is older than the value specified by the down-after-milliseconds option, the instance is marked by sentinel as subjectively down
  • If a master is marked subjectively down, every sentinel monitoring it confirms, once per second, that the master really has entered that state
  • When enough sentinels (no fewer than the value set in the configuration file) confirm within the specified time window that the master is indeed subjectively down, it is marked objectively down
  • Under normal conditions, each sentinel sends an INFO command to every master and slave it knows about once every 10 seconds
  • Once a master is marked objectively down, the sentinels raise their INFO frequency toward that master's slaves from once every 10 seconds to once per second
  • If not enough sentinels agree that the master is down, its objectively-down state is removed;
    if the master returns valid replies to a sentinel's PING again, its subjectively-down state is removed
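The sdown-to-odown transition described above can be sketched as a small simulation (hypothetical function names; a quorum of 2 out of 3 sentinels is assumed, matching the configuration used later in this article):

```python
DOWN_AFTER_MS = 30_000  # down-after-milliseconds
QUORUM = 2              # from: sentinel monitor mymaster ... 2

def is_sdown(last_valid_pong_ms, now_ms):
    """A sentinel marks the master subjectively down when its last
    valid PING reply is older than down-after-milliseconds."""
    return now_ms - last_valid_pong_ms > DOWN_AFTER_MS

def is_odown(sdown_votes):
    """The master becomes objectively down once at least `quorum`
    sentinels agree it is subjectively down."""
    return sum(sdown_votes) >= QUORUM

now = 100_000
# Three sentinels: two have not heard from the master for over 30s.
votes = [is_sdown(60_000, now), is_sdown(65_000, now), is_sdown(95_000, now)]
print(votes)            # [True, True, False]
print(is_odown(votes))  # True: a failover can be triggered
```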

When using sentinel mode, clients should not connect to Redis directly; instead they connect to a sentinel's ip and port, and the sentinel supplies the address of the Redis instance currently serving. When the master goes down, sentinel notices and hands out the new master to its users.
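The client-side view in the previous paragraph can be sketched as follows (hypothetical class names; a real client would issue SENTINEL get-master-addr-by-name against the Sentinel port):

```python
class SentinelRegistry:
    """Tracks the current master address for each named master group."""
    def __init__(self):
        self.masters = {"mymaster": ("192.168.50.11", 6379)}

    def get_master_addr(self, name):
        # Clients always ask sentinel rather than hard-coding the master.
        return self.masters[name]

    def failover(self, name, new_addr):
        # After electing a new master, sentinel updates the address it hands out.
        self.masters[name] = new_addr

sentinel = SentinelRegistry()
print(sentinel.get_master_addr("mymaster"))   # ('192.168.50.11', 6379)
sentinel.failover("mymaster", ("192.168.50.12", 6379))
print(sentinel.get_master_addr("mymaster"))   # ('192.168.50.12', 6379)
```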

Setting Up Sentinel Mode

Environment Preparation

  • master node 192.168.50.11, sentinel port: 26379
  • slave node 192.168.50.12, sentinel port: 26379
  • slave node 192.168.50.13, sentinel port: 26379

Modifying the Sentinel Configuration

First back up the default configuration file /usr/local/redis/sentinel.conf, then modify the original.

Note: the configuration is identical on every host.

vim /usr/local/redis/sentinel.conf

port 26379
daemonize yes
logfile "/usr/local/redis/sentinel.log"
dir "/usr/local/redis/sentinel"
sentinel monitor mymaster 192.168.50.11 6379 2 # declaring the master failed requires agreement from at least 2 sentinels; n/2+1 is recommended, where n is the number of sentinels
sentinel auth-pass mymaster 123456
sentinel down-after-milliseconds mymaster 30000 # time before the master is considered subjectively down; default 30s

Note that sentinel auth-pass mymaster 123456 must come after sentinel monitor mymaster 192.168.50.11 6379 2; otherwise startup fails with the error below.

*** FATAL CONFIG FILE ERROR ***
Reading the configuration file, at line 104
>>> 'sentinel auth-pass mymaster 123456'
No such master with specified name.

Start sentinel on each of the three hosts

Create the sentinel working directory:

mkdir /usr/local/redis/sentinel && chown -R redis:redis /usr/local/redis/sentinel

Start sentinel:

/usr/local/bin/redis-sentinel /usr/local/redis/sentinel.conf

View the sentinel log on any host

tail -f /usr/local/redis/sentinel.log

4855:X 08 May 2020 22:05:26.225 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
4855:X 08 May 2020 22:05:26.225 # Redis version=6.0.1, bits=64, commit=00000000, modified=0, pid=4855, just started
4855:X 08 May 2020 22:05:26.225 # Configuration loaded
4856:X 08 May 2020 22:05:26.226 * Increased maximum number of open files to 10032 (it was originally set to 1024).
4856:X 08 May 2020 22:05:26.227 * Running mode=sentinel, port=26379.
4856:X 08 May 2020 22:05:26.227 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
4856:X 08 May 2020 22:05:26.228 # Sentinel ID is 4f615cf9985cb26ae1bfabbadd88baaa90f32ee7
4856:X 08 May 2020 22:05:26.228 # +monitor master mymaster 192.168.50.11 6379 quorum 2
4856:X 08 May 2020 22:05:26.228 * +slave slave 192.168.50.12:6379 192.168.50.12 6379 @ mymaster 192.168.50.11 6379
4856:X 08 May 2020 22:05:26.229 * +slave slave 192.168.50.13:6379 192.168.50.13 6379 @ mymaster 192.168.50.11 6379

Sentinel Mode Events

  • +reset-master: the master was reset.
  • +slave: a new slave was detected and attached by Sentinel.
  • +failover-state-reconf-slaves: the failover state switched to reconf-slaves.
  • +failover-detected: another Sentinel started a failover, or a slave was converted into a master.
  • +slave-reconf-sent: the leader Sentinel sent the SLAVEOF command to an instance to point it at the new master.
  • +slave-reconf-inprog: the instance is configuring itself as a slave of the specified master, but synchronization has not yet completed.
  • +slave-reconf-done: the slave finished synchronizing with the new master.
  • -dup-sentinel: one or more Sentinels monitoring the given master were removed as duplicates (this happens when a Sentinel instance restarts).
  • +sentinel: a new Sentinel monitoring the given master was detected and added.
  • +sdown: the given instance is now subjectively down.
  • -sdown: the given instance is no longer subjectively down.
  • +odown: the given instance is now objectively down.
  • -odown: the given instance is no longer objectively down.
  • +new-epoch: the current epoch was updated.
  • +try-failover: a new failover is in progress, waiting to be elected by the majority of Sentinels.
  • +elected-leader: won the election for the specified epoch; the failover can now proceed.
  • +failover-state-select-slave: the failover is now in the select-slave state: Sentinel is looking for a slave that can be promoted to master.
  • no-good-slave: Sentinel could not find a suitable slave to promote. It will try again after a while, or may abandon the failover entirely.
  • selected-slave: Sentinel found a suitable slave to promote.
  • failover-state-send-slaveof-noone: Sentinel is promoting the specified slave to master and waiting for the promotion to complete.
  • failover-end-for-timeout: the failover was aborted for timing out; the slaves will eventually be configured to replicate with the new master anyway.
  • failover-end: the failover completed successfully; all slaves now replicate from the new master.
  • +switch-master: configuration change: the master's IP and address have changed. This is the message most external users care about.
  • +tilt: entered tilt mode.
  • -tilt: exited tilt mode.

Master Failure Demonstration

Stop the Redis service on the master node:

systemctl stop redis

View the sentinel log:

tail -f /usr/local/redis/sentinel.log

4856:X 08 May 2020 22:33:02.310 # +sdown master mymaster 192.168.50.11 6379
4856:X 08 May 2020 22:33:02.388 # +new-epoch 1
4856:X 08 May 2020 22:33:02.390 # +vote-for-leader 2e1b5fcb4e1f6882114de8b401a9226baf70f92e 1
4856:X 08 May 2020 22:33:03.463 # +odown master mymaster 192.168.50.11 6379 #quorum 3/2
4856:X 08 May 2020 22:33:03.464 # Next failover delay: I will not start a failover before Fri May 8 22:39:02 2020
4856:X 08 May 2020 22:33:03.591 # +config-update-from sentinel 2e1b5fcb4e1f6882114de8b401a9226baf70f92e 192.168.50.13 26379 @ mymaster 192.168.50.11 6379
4856:X 08 May 2020 22:33:03.591 # +switch-master mymaster 192.168.50.11 6379 192.168.50.12 6379
4856:X 08 May 2020 22:33:03.592 * +slave slave 192.168.50.13:6379 192.168.50.13 6379 @ mymaster 192.168.50.12 6379
4856:X 08 May 2020 22:33:03.592 * +slave slave 192.168.50.11:6379 192.168.50.11 6379 @ mymaster 192.168.50.12 6379
4856:X 08 May 2020 22:33:33.612 # +sdown slave 192.168.50.11:6379 192.168.50.11 6379 @ mymaster 192.168.50.12 6379

The log shows that the master role has moved from 192.168.50.11 to 192.168.50.12.

Check the cluster info on 192.168.50.12:

/usr/local/bin/redis-cli -h 192.168.50.12 -p 6379 -a 123456

info replication

# Replication
role:master
connected_slaves:1
slave0:ip=192.168.50.13,port=6379,state=online,offset=273931,lag=0
master_replid:14080496fc5537e6afacb66c78bb9cd76382b165
master_replid2:aa569683e2d83252129b5e8badb7cf13ccd04008
master_repl_offset:274213
master_repl_meaningful_offset:274213
second_repl_offset:160367
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:274213

192.168.50.12 has become the master, and the cluster currently has a single slave: 192.168.50.13.

Looking at the Redis configuration file on 192.168.50.13, you will also notice replicaof 192.168.50.12 6379; this change was made by sentinel when it elected the new master.

Start the Redis service on 192.168.50.11 again:

systemctl start redis

View the sentinel log:

tail -f /usr/local/redis/sentinel.log

4856:X 08 May 2020 22:33:03.591 # +switch-master mymaster 192.168.50.11 6379 192.168.50.12 6379
4856:X 08 May 2020 22:33:03.592 * +slave slave 192.168.50.13:6379 192.168.50.13 6379 @ mymaster 192.168.50.12 6379
4856:X 08 May 2020 22:33:03.592 * +slave slave 192.168.50.11:6379 192.168.50.11 6379 @ mymaster 192.168.50.12 6379
4856:X 08 May 2020 22:33:33.612 # +sdown slave 192.168.50.11:6379 192.168.50.11 6379 @ mymaster 192.168.50.12 6379
4856:X 08 May 2020 22:48:05.697 # -sdown slave 192.168.50.11:6379 192.168.50.11 6379 @ mymaster 192.168.50.12 6379

Check the cluster info on 192.168.50.11:

/usr/local/bin/redis-cli -h 192.168.50.11 -p 6379 -a 123456

info replication

# Replication
role:slave
master_host:192.168.50.12
master_port:6379
master_link_status:up
master_last_io_seconds_ago:0
master_sync_in_progress:0
slave_repl_offset:426692
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:14080496fc5537e6afacb66c78bb9cd76382b165
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:426692
master_repl_meaningful_offset:426692
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:351295
repl_backlog_histlen:75398

Even though the original master, 192.168.50.11, restarted its Redis service, it joined the cluster as a slave; 192.168.50.12 remains the master.

Cluster Mode

Introduction to Cluster Mode

Sentinel mode satisfies most production needs and provides high availability. But when there is too much data to fit on a single server, master-slave and sentinel modes are no longer enough: the stored data must be sharded across multiple Redis instances. Cluster mode was created to overcome the capacity limit of a single Redis machine, distributing Redis data across multiple machines according to fixed rules.
Cluster can be seen as a combination of sentinel and master-slave mode: it provides both replication and master re-election, so a deployment with three shards, each kept in two copies, needs six Redis instances. Because data is distributed across the cluster's machines by rule, capacity can be extended by adding machines when the data volume grows.

To use a cluster, just enable the cluster-enabled option in the Redis configuration file. A cluster needs at least three master databases to run properly, and new nodes are very easy to add.

Cluster features:

  • Multiple Redis nodes are networked together and share data
  • Every node is a master with one slave (or with several); the slaves serve no requests and act purely as standbys
  • Commands touching multiple keys at once (such as MSET/MGET) are not supported, because Redis must spread keys evenly across the nodes;
    under high concurrency, creating key-value pairs across nodes simultaneously would hurt performance and cause unpredictable behavior
  • Nodes can be added and removed online
  • Clients may connect to any master node for reads and writes

Understanding Cluster Slots

When Redis is deployed with redis-cluster, all physical nodes are mapped onto hash slots 0-16383. A common beginner misconception is that a slot is where data is stored, i.e. that the cluster can hold only 16383+1 keys.
That is not the case: a slot merely denotes the storage range assigned to a Redis node (you can think of it as an alias for that node).
For example, with 3 Redis nodes, slots can be assigned as follows (the split need not be even):

  Redis node    slots
  A             0~5000
  B             5001~10000
  C             10001~16383

When a key is inserted into Redis, CRC16(key) mod 16384 is computed; if the result is, say, 3000, the key is stored on node A. Reads work the same way: for a key to be read, CRC16(key) mod 16384 first locates the slot, then the owning node is asked for the data. It behaves much like an index.
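As an illustration, the lookup can be reproduced in Python. Assumptions: the CRC16 variant below (XMODEM: polynomial 0x1021, initial value 0) is the checksum Redis Cluster specifies, and the node ranges follow the A/B/C example above.

```python
def crc16(data: bytes) -> int:
    """CRC16/XMODEM (poly 0x1021, init 0); check value of b'123456789' is 0x31C3."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else crc << 1
            crc &= 0xFFFF
    return crc

def key_slot(key: bytes) -> int:
    """Map a key to one of the 16384 hash slots."""
    return crc16(key) % 16384

def node_for(slot: int) -> str:
    """Find the owning node in the example A/B/C layout."""
    for node, lo, hi in [("A", 0, 5000), ("B", 5001, 10000), ("C", 10001, 16383)]:
        if lo <= slot <= hi:
            return node

# expected: 9189 B (the same slot redis-cli reports for key1 later in this article)
print(key_slot(b"key1"), node_for(key_slot(b"key1")))
```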

Setting Up Cluster Mode

Environment Preparation

Stop the current Redis service on all three hosts:

systemctl stop redis

Then prepare to run two Redis instances on each host:

  • 192.168.50.11 ports: 7001, 7002
  • 192.168.50.12 ports: 7003, 7004
  • 192.168.50.13 ports: 7005, 7006

Modifying the Configuration Files

On host 192.168.50.11, run:

mkdir /usr/local/redis/cluster

chown -R redis:redis /usr/local/redis/cluster

cp /usr/local/redis/redis.conf /usr/local/redis/cluster/redis_7001.conf

cp /usr/local/redis/redis.conf /usr/local/redis/cluster/redis_7002.conf

mkdir -p /data/redis/cluster/{redis_7001,redis_7002} && chown -R redis:redis /data/redis/cluster

Edit the 7001 configuration file; be sure to comment out the replicaof line:

vim /usr/local/redis/cluster/redis_7001.conf

bind 192.168.50.11
port 7001
daemonize yes
supervised systemd
pidfile "/var/run/redis_7001.pid"
logfile "/usr/local/redis/cluster/redis_7001.log"
dir "/data/redis/cluster/redis_7001"
#replicaof 192.168.50.12 6379
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes_7001.conf
cluster-node-timeout 15000

Edit the 7002 configuration file; be sure to comment out the replicaof line:

vim /usr/local/redis/cluster/redis_7002.conf

bind 192.168.50.11
port 7002
daemonize yes
supervised systemd
pidfile "/var/run/redis_7002.pid"
logfile "/usr/local/redis/cluster/redis_7002.log"
dir "/data/redis/cluster/redis_7002"
#replicaof 192.168.50.12 6379
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes_7002.conf
cluster-node-timeout 15000

Start the Redis instances:

redis-server /usr/local/redis/cluster/redis_7001.conf

View the log: tail -f /usr/local/redis/cluster/redis_7001.log

redis-server /usr/local/redis/cluster/redis_7002.conf

View the log: tail -f /usr/local/redis/cluster/redis_7002.log

The configuration on the other two hosts mirrors 192.168.50.11 and is omitted here.

Creating the Cluster

redis-cli -a 123456 --cluster create 192.168.50.11:7001 192.168.50.11:7002 192.168.50.12:7003 192.168.50.12:7004 192.168.50.13:7005 192.168.50.13:7006 --cluster-replicas 1

Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.50.12:7004 to 192.168.50.11:7001
Adding replica 192.168.50.13:7006 to 192.168.50.12:7003
Adding replica 192.168.50.11:7002 to 192.168.50.13:7005
M: 8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001
slots:[0-5460] (5461 slots) master
S: 5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002
replicates 7952e19c242f45152e840bd6748b9c251c3b0ae0
M: 134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003
slots:[5461-10922] (5462 slots) master
S: f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004
replicates 8d2cf30fbef15544352d1756d128bee18d849547
M: 7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005
slots:[10923-16383] (5461 slots) master
S: 344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006
replicates 134ca729913d1705b92df38567d3bcccd6da9514
Can I set the above configuration? (type 'yes' to accept): yes    # type yes to accept the configuration above
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.....
>>> Performing Cluster Check (using node 192.168.50.11:7001)
M: 8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
S: 344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006
slots: (0 slots) slave
replicates 134ca729913d1705b92df38567d3bcccd6da9514
S: 5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002
slots: (0 slots) slave
replicates 7952e19c242f45152e840bd6748b9c251c3b0ae0
M: 134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
S: f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004
slots: (0 slots) slave
replicates 8d2cf30fbef15544352d1756d128bee18d849547
M: 7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.

We can see that:

  • 192.168.50.11:7001 is a master; its slave is 192.168.50.12:7004
  • 192.168.50.12:7003 is a master; its slave is 192.168.50.13:7006
  • 192.168.50.13:7005 is a master; its slave is 192.168.50.11:7002

The nodes.conf file is generated automatically:

ls /data/redis/cluster/redis_7001/

appendonly.aof dump.rdb nodes_7001.conf

cat /data/redis/cluster/redis_7001/nodes_7001.conf

344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006@17006 slave 134ca729913d1705b92df38567d3bcccd6da9514 0 1588994534000 6 connected
8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001@17001 myself,master - 0 1588994532000 1 connected 0-5460
5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002@17002 slave 7952e19c242f45152e840bd6748b9c251c3b0ae0 0 1588994535825 5 connected
134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003@17003 master - 0 1588994533807 3 connected 5461-10922
f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004@17004 slave 8d2cf30fbef15544352d1756d128bee18d849547 0 1588994534820 4 connected
7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005@17005 master - 0 1588994534000 5 connected 10923-16383
vars currentEpoch 6 lastVoteEpoch 0

集群操作

Logging In to the Cluster

redis-cli -c -h 192.168.50.11 -p 7001 -a 123456

Viewing Cluster Info

Run: cluster info

192.168.50.11:7001> cluster info
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:1
cluster_stats_messages_ping_sent:1001
cluster_stats_messages_pong_sent:981
cluster_stats_messages_sent:1982
cluster_stats_messages_ping_received:976
cluster_stats_messages_pong_received:1001
cluster_stats_messages_meet_received:5
cluster_stats_messages_received:1982

Listing Node Info

Run: cluster nodes

192.168.50.11:7001> cluster nodes
344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006@17006 slave 134ca729913d1705b92df38567d3bcccd6da9514 0 1588995637329 6 connected
8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001@17001 myself,master - 0 1588995636000 1 connected 0-5460
5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002@17002 slave 7952e19c242f45152e840bd6748b9c251c3b0ae0 0 1588995636000 5 connected
134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003@17003 master - 0 1588995638334 3 connected 5461-10922
f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004@17004 slave 8d2cf30fbef15544352d1756d128bee18d849547 0 1588995639340 4 connected
7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005@17005 master - 0 1588995637000 5 connected 10923-16383

As shown, this matches the contents of the nodes.conf file.

Read/Write Test

192.168.50.11:7001> get key1
-> Redirected to slot [9189] located at 192.168.50.12:7003
(nil)
192.168.50.12:7003> set key1 valueA
OK
192.168.50.12:7003> set key2 valueB
-> Redirected to slot [4998] located at 192.168.50.11:7001
OK
192.168.50.11:7001> set key3 valueC
OK
192.168.50.11:7001> set key4 valueD
-> Redirected to slot [13120] located at 192.168.50.13:7005
OK
192.168.50.13:7005> get key1
-> Redirected to slot [9189] located at 192.168.50.12:7003
"valueA"
192.168.50.12:7003> get key2
-> Redirected to slot [4998] located at 192.168.50.11:7001
"valueB"
192.168.50.11:7001> get key3
“valueC”

This shows that a Redis cluster is decentralized: every node is equal, and you can connect to any of them to get and set data.

Of course, "equal" applies to the master nodes; the slave nodes serve no requests at all and are simply backups of their masters.
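The redirections in the transcript above can be modeled as a small sketch (hypothetical classes; `slot()` here is a stand-in hash, not the real CRC16): a node that does not own a key's slot names the owner, and the client retries against that node, mirroring the MOVED redirect.

```python
import zlib

def slot(key: str) -> int:
    # Stand-in hash; real Redis Cluster uses CRC16(key) % 16384.
    return zlib.crc32(key.encode()) % 16384

class ClusterNode:
    def __init__(self, name, lo, hi):
        self.name, self.lo, self.hi, self.data = name, lo, hi, {}

    def owns(self, s):
        return self.lo <= s <= self.hi

# Slot layout of the three masters created above.
nodes = [ClusterNode("7001", 0, 5460),
         ClusterNode("7003", 5461, 10922),
         ClusterNode("7005", 10923, 16383)]

def cluster_set(entry, key, value):
    """SET sent to an arbitrary node; follow one redirect if it is not the owner."""
    s = slot(key)
    if not entry.owns(s):                      # node replies: MOVED <slot> <owner>
        entry = next(n for n in nodes if n.owns(s))
    entry.data[key] = value
    return entry.name                          # where the key actually landed

owner = cluster_set(nodes[0], "key4", "valueD")
print(owner)  # the master that owns slot(key4)
```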

Adding Nodes

Add a node on 192.168.50.12:

mkdir /data/redis/cluster/redis_7007

chown -R redis:redis /usr/local/redis && chown -R redis:redis /data/redis

cp /usr/local/redis/cluster/redis_7003.conf /usr/local/redis/cluster/redis_7007.conf

As before, edit the 7007 configuration file; be sure to comment out any replicaof line:

vim /usr/local/redis/cluster/redis_7007.conf

bind 192.168.50.12
port 7007
daemonize yes
supervised systemd
pidfile "/var/run/redis_7007.pid"
logfile "/usr/local/redis/cluster/redis_7007.log"
dir "/data/redis/cluster/redis_7007"
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes_7007.conf
cluster-node-timeout 15000

Start Redis:

redis-server /usr/local/redis/cluster/redis_7007.conf

View the log: tail -f /usr/local/redis/cluster/redis_7007.log

Add a node on 192.168.50.13:

mkdir /data/redis/cluster/redis_7008

chown -R redis:redis /usr/local/redis && chown -R redis:redis /data/redis

cp /usr/local/redis/cluster/redis_7005.conf /usr/local/redis/cluster/redis_7008.conf

As before, edit the 7008 configuration file; be sure to comment out any replicaof line:

vim /usr/local/redis/cluster/redis_7008.conf

bind 192.168.50.13
port 7008
daemonize yes
supervised systemd
pidfile "/var/run/redis_7008.pid"
logfile "/usr/local/redis/cluster/redis_7008.log"
dir "/data/redis/cluster/redis_7008"
masterauth 123456
requirepass 123456
appendonly yes
cluster-enabled yes
cluster-config-file nodes_7008.conf
cluster-node-timeout 15000

Start Redis:

redis-server /usr/local/redis/cluster/redis_7008.conf

View the log: tail -f /usr/local/redis/cluster/redis_7008.log

Add the new nodes to the cluster:

After logging in to the cluster, run: cluster meet 192.168.50.12 7007

cluster nodes

192.168.50.11:7001> cluster meet 192.168.50.12 7007
OK
192.168.50.11:7001> cluster nodes
553f36263cabfc0b8f35860a0474c1427e498059 192.168.50.12:7007@17007 master - 0 1588997264491 0 connected
344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006@17006 slave 134ca729913d1705b92df38567d3bcccd6da9514 0 1588997266502 6 connected
8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001@17001 myself,master - 0 1588997264000 1 connected 0-5460
5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002@17002 slave 7952e19c242f45152e840bd6748b9c251c3b0ae0 0 1588997265497 5 connected
134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003@17003 master - 0 1588997266000 3 connected 5461-10922
f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004@17004 slave 8d2cf30fbef15544352d1756d128bee18d849547 0 1588997263000 4 connected
7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005@17005 master - 0 1588997264000 5 connected 10923-16383

cluster meet 192.168.50.13 7008

cluster nodes

192.168.50.11:7001> cluster meet 192.168.50.13 7008
OK
192.168.50.11:7001> cluster nodes
553f36263cabfc0b8f35860a0474c1427e498059 192.168.50.12:7007@17007 master - 0 1588997402433 0 connected
344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006@17006 slave 134ca729913d1705b92df38567d3bcccd6da9514 0 1588997403440 6 connected
8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001@17001 myself,master - 0 1588997402000 1 connected 0-5460
5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002@17002 slave 7952e19c242f45152e840bd6748b9c251c3b0ae0 0 1588997401000 5 connected
134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003@17003 master - 0 1588997400000 3 connected 5461-10922
f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004@17004 slave 8d2cf30fbef15544352d1756d128bee18d849547 0 1588997401000 4 connected
7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005@17005 master - 0 1588997400419 5 connected 10923-16383
28275a071173ed1712e3404b0a1b4d94a52049f0 192.168.50.13:7008@17008 master - 0 1588997401425 7 connected

As you can see, both newly added nodes joined the cluster as masters.

Changing a Node's Role

Change the role of the newly added 192.168.50.13:7008 node to be a slave of 192.168.50.12:7007:

redis-cli -c -h 192.168.50.13 -p 7008 -a 123456 cluster replicate 553f36263cabfc0b8f35860a0474c1427e498059

cluster replicate is followed by a node_id and changes the role of the corresponding node.

You can also make the change after logging in to the cluster:

192.168.50.13:7008> cluster replicate 553f36263cabfc0b8f35860a0474c1427e498059

View the node info after the change:

192.168.50.11:7001> cluster nodes
553f36263cabfc0b8f35860a0474c1427e498059 192.168.50.12:7007@17007 master - 0 1588997910887 0 connected
344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006@17006 slave 134ca729913d1705b92df38567d3bcccd6da9514 0 1588997910000 6 connected
8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001@17001 myself,master - 0 1588997911000 1 connected 0-5460
5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002@17002 slave 7952e19c242f45152e840bd6748b9c251c3b0ae0 0 1588997912902 5 connected
134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003@17003 master - 0 1588997910000 3 connected 5461-10922
f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004@17004 slave 8d2cf30fbef15544352d1756d128bee18d849547 0 1588997910000 4 connected
7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005@17005 master - 0 1588997911000 5 connected 10923-16383
28275a071173ed1712e3404b0a1b4d94a52049f0 192.168.50.13:7008@17008 slave 553f36263cabfc0b8f35860a0474c1427e498059 0 1588997911894 7 connected

As shown, the role of node 192.168.50.13:7008 has changed to slave.

Inspecting the corresponding nodes.conf file shows that it has changed as well; it records the node information of the current cluster.

cat /data/redis/cluster/redis_7001/nodes_7001.conf

553f36263cabfc0b8f35860a0474c1427e498059 192.168.50.12:7007@17007 master - 0 1588997622940 0 connected
344066d97ccff073f81220458a5813c5c9c1c954 192.168.50.13:7006@17006 slave 134ca729913d1705b92df38567d3bcccd6da9514 0 1588997618911 6 connected
8d2cf30fbef15544352d1756d128bee18d849547 192.168.50.11:7001@17001 myself,master - 0 1588997620000 1 connected 0-5460
5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 192.168.50.11:7002@17002 slave 7952e19c242f45152e840bd6748b9c251c3b0ae0 0 1588997622000 5 connected
134ca729913d1705b92df38567d3bcccd6da9514 192.168.50.12:7003@17003 master - 0 1588997621000 3 connected 5461-10922
f7ce1489c2c2f382a24d86c5db2f12b734cdc669 192.168.50.12:7004@17004 slave 8d2cf30fbef15544352d1756d128bee18d849547 0 1588997622000 4 connected
7952e19c242f45152e840bd6748b9c251c3b0ae0 192.168.50.13:7005@17005 master - 0 1588997621930 5 connected 10923-16383
28275a071173ed1712e3404b0a1b4d94a52049f0 192.168.50.13:7008@17008 slave 553f36263cabfc0b8f35860a0474c1427e498059 0 1588997621000 7 connected
vars currentEpoch 7 lastVoteEpoch 0

Removing Nodes

Remove a slave node: redis-cli --cluster del-node <ip:port> <node_id>

For example, to remove the 192.168.50.11:7002 node:

redis-cli --cluster del-node 192.168.50.11:7002 5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 -a 123456

>>> Removing node 5f8515bf4f7e3c2fd117704d1893b0e1c9f21d26 from cluster 192.168.50.11:7002
>>> Sending CLUSTER FORGET messages to the cluster...
>>> Sending CLUSTER RESET SOFT to the deleted node.

Remove a master node:

First reshard the node to move its data, preventing data loss.

redis-cli --cluster reshard ip:port

Follow the prompts to migrate all slots to another master node.
Then run redis-cli --cluster del-node <ip:port> <node_id>

To force-remove a master node without migrating its slots:

  • Find the dir path configured in the node's redis.conf
  • Under that path, delete the node's nodes_port.conf, .aof, and .rdb files
  • Finally, run the node-removal command
  • This approach can also be used to tear down the whole cluster when the data need not be kept
  • If an error occurs during cluster operations, use redis-cli --cluster check ip:port to check the cluster for errors, and redis-cli --cluster fix ip:port to repair them

 
