集群介绍

前文有搭建MGR和keepalived的学习,目前结合之前的路线,MGR单主一主两从无法满足需求,我们此章从proxysql对MGR进行分片,搭建两套MGR组复制,并且proxysql对其进行分片规则和流量分发,zookeeper做管理配置协调中心。

  • ZooKeeper 用于集群协调和配置管理

  • ProxySQL 实现高效分片路由

  • MySQL Group Replication 保证数据高可用

192.168.93.133 mysql
192.168.93.134 mysql
192.168.93.135 mysql
192.168.93.136 mysql+zookeeper
192.168.93.137 proxysql+zookeeper+keepalived
192.168.93.138 proxysql+zookeeper+keepalived

系统环境+主机名

192.168.93.133 db1-master
192.168.93.134 db1-slave
192.168.93.135 db2-master
192.168.93.136 db2-slave zk3
192.168.93.137 proxysql1 zk1
192.168.93.138 proxysql2 zk2 

测试功能:实时业务流量下发 -》验证两套mgr集群的数据是否根据age分片下发-》proxysql故障 -》keepalived是否自动切换 -》mgr主节点故障 -》从节点是否接管流量 -》 流量实时下发  

集群架构:

ZooKeeper核心机制

目前结合zookeeper的机制原理,期望实现如下在架构中的关键作用

  1. 配置中心

    • 存储集群拓扑信息

    • 管理分片路由规则

    • 保存备份策略

  2. 状态管理

    • 记录当前主节点信息

    • 跟踪节点健康状态

    • 维护备份历史

  3. 分布式协调

    • 主从切换通知

    • 分布式锁(用于备份任务)

    • 集群选主

  4. 服务发现

    • ProxySQL 节点注册

    • MySQL 实例发现

    • 监控目标自动发现

基础环境配置 

先不用ansible,后面搭建完成后可以用ansible自动化部署,现在一步一步更能理解。

zookeeper部署 

分别在如下的六台虚机创建脚本执行:

环境信息根据需求来添加

cat env.sh 
chmod +x env.sh
#!/bin/bash
# 设置主机名解析
cat >> /etc/hosts <<EOF
192.168.93.133 db1-master
192.168.93.134 db1-slave
192.168.93.135 db2-master
192.168.93.136 db2-slave zk3
192.168.93.137 proxysql1 zk1
192.168.93.138 proxysql2 zk2
EOF

# 关闭防火墙
systemctl stop firewalld && systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config

# 时间同步
yum install -y ntp
echo "server ntp.aliyun.com iburst" >> /etc/ntp.conf
systemctl restart ntpd && systemctl enable ntpd

curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo
yum clean all 
yum makecache fast
# 安装基础工具
yum install -y epel-release
yum install -y git wget curl vim net-tools htop python3 python3-p nc

 配置完可以检查一下机器的时间是否同步,防火墙是否关闭。

机器有限,对应的hostname修改成hosts里面的主机名,通过查看hostname,配置myid。

三台zk:136、137、138

cat zk.sh 
#!/bin/bash
# 安装JDK
yum install -y java-11-openjdk
# 下载镜像
url="https://archive.apache.org/dist/zookeeper/zookeeper-3.8.1/apache-zookeeper-3.8.1-bin.tar.gz"
wget $url -O /tmp/apache-zookeeper-3.8.1-bin.tar.gz

# 解压安装
tar -zxvf /tmp/apache-zookeeper-3.8.1-bin.tar.gz -C /opt
mv /opt/apache-zookeeper-3.8.1-bin /opt/zookeeper

# 创建目录
mkdir -p /data/zookeeper

# 配置zoo.cfg
cat > /opt/zookeeper/conf/zoo.cfg <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
EOF

# 节点特定配置
# 分别在不同节点执行对应命令
h=$(hostname)
case $h in
  proxy-1) echo "1" > /data/zookeeper/myid ;;
  proxy-2) echo "2" > /data/zookeeper/myid ;;
  n3) echo "3" > /data/zookeeper/myid ;;
  *)   echo "ERROR: 未知主机名" && exit 1 ;;
esac

# 启动服务
/opt/zookeeper/bin/zkServer.sh start

启动成功 

/opt/zookeeper/bin/zkServer.sh start

# 验证状态
/opt/zookeeper/bin/zkServer.sh status

tickTime=2000  zk的基本时间单位毫秒,心跳检测超时计算
initLimit=10  从节点初始化连接到zk主的最大时间,10x2000=20秒
syncLimit=5   从和主之间心跳响应的最大超时时间,5x2000=10秒
dataDir=/data/zookeeper 数据目录
clientPort=2181 zk对外访问端口
server.1=zk1:2888:3888  2888为节点间数据同步端口,3888为选举端口
server.2=zk2:2888:3888
server.3=zk3:2888:3888

 创建节点

/opt/zookeeper/bin/zkCli.sh
create /database-cluster ""
create /database-cluster/mgr-clusters ""
create /database-cluster/mgr-clusters/db1 ""
create /database-cluster/mgr-clusters/db2 ""
create /database-cluster/mgr-clusters/db1/master "192.168.93.133"
create /database-cluster/mgr-clusters/db1/slaves '["192.168.93.134"]'
create /database-cluster/mgr-clusters/db2/master "192.168.93.135"
create /database-cluster/mgr-clusters/db2/slaves '["192.168.93.136"]'
create /database-cluster/proxysql ""
create /database-cluster/proxysql/active "192.168.93.137"
create /database-cluster/proxysql/standby "192.168.93.138"
create /database-cluster/locks ""

 MGR部署

MGR主要命令:

之前的文章有搭过,可查看!

 MGR节点master(133/135)配置,里面的ip和端口改成实际的。步骤写成脚本了,可执行脚本部署。

cat /data/db.sh 
chmod +x db.sh
#!/bin/bash

systemctl stop mysqld 2>/dev/null
yum remove -y mysql-* mariadb-* 2>/dev/null
rpm -e --nodeps $(rpm -qa | grep -i mysql) 2>/dev/null
rm -rf /var/lib/mysql /var/log/mysqld* /etc/my.cnf* /var/run/mysqld
rm -rf /data/mysql /data/mysql-base
find / -name mysql -type d -exec rm -rf {} + 2>/dev/null

# 安装依赖
yum install -y libaio numactl perl wget tar

# 创建用户和组
if ! id "mysql" &>/dev/null; then
    groupadd mysql
    useradd -r -g mysql -s /bin/false mysql
fi

# 创建目录
mkdir -p /data/mysql-base
mkdir -p /data/mysql/{data,logs,tmp,binlog,conf,run}
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql

# 下载并解压mysql二进制包
mysql_version="8.0.42"
mysql_pkg="mysql-${mysql_version}-linux-glibc2.17-x86_64"

cd /data/mysql-base
echo "下载 mysql ${mysql_version}"
wget https://dev.mysql.com/get/Downloads/MySQL-8.0/${mysql_pkg}.tar.xz
tar -xvf ${mysql_pkg}.tar.xz
rm -f ${mysql_pkg}.tar.xz
mv ${mysql_pkg} mysql
echo "mysql ${mysql_version} 解压完成"



# 配置文件导入
cat > /data/mysql/conf/my.cnf <<EOF
[mysqld]
user=mysql
port=3306
basedir=/data/mysql-base/mysql
datadir=/data/mysql/data
socket=/data/mysql/run/mysqld.sock
tmpdir=/data/mysql/tmp
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/run/mysqld.pid
log-bin=/data/mysql/binlog/mysql-bin
relay-log=/data/mysql/binlog/relay-bin

character-set-server=UTF8MB4
lower_case_table_names=1
server_id=1
enforce_gtid_consistency=on
gtid_mode=on
default_authentication_plugin=mysql_native_password

plugin_dir=/data/mysql-base/mysql/lib/plugin
binlog_checksum=NONE
transaction_write_set_extraction=XXHASH64
loose-group_replication_recovery_use_ssl=ON
loose-group_replication_group_name="90e11f7a-58a5-11f0-a076-000c29366dc1"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address="192.168.93.137:24901"
loose-group_replication_group_seeds="192.168.93.137:24901,192.168.93.138:24901"
loose-group_replication_bootstrap_group=OFF

[client]
socket=/data/mysql/run/mysqld.sock
EOF

# 软链接到默认配置环境,可以不软
ln -sf /data/mysql/conf/my.cnf /etc/my.cnf

# 初始化mysql
echo "初始化mysql数据库---"
/data/mysql-base/mysql/bin/mysqld --initialize-insecure \
    --user=mysql \
    --basedir=/data/mysql-base/mysql \
    --datadir=/data/mysql/data

# systemd服务文件配置
cat > /etc/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Server
After=network.target

[Service]
User=mysql
Group=mysql
Type=forking
Environment="MYSQL_HOME=/data/mysql/conf"
Environment="PATH=/data/mysql-base/mysql/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
ExecStart=/data/mysql-base/mysql/bin/mysqld --defaults-file=/data/mysql/conf/my.cnf --daemonize --pid-file=/data/mysql/run/mysqld.pid
ExecReload=/bin/kill -HUP \$MAINPID
Restart=on-failure
RestartPreventExitStatus=1
PrivateTmp=true
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF

# 目录权限
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql

# 启动mysql服务
systemctl daemon-reload
systemctl start mysqld
systemctl enable mysqld

# 等待mysql启动
echo "等待mysql启动..."
sleep 10
while ! /data/mysql-base/mysql/bin/mysqladmin ping --silent; do
    echo "等待mysql启动..."
    sleep 5
done

# 设置root简单密码
echo "设置root密码..."
/data/mysql-base/mysql/bin/mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '123';"

# 设置登录方式环境变量
echo 'export PATH=/data/mysql-base/mysql/bin:$PATH' > /etc/profile.d/mysql.sh
source /etc/profile.d/mysql.sh

# 配置组复制
echo "配置组复制..."
/data/mysql-base/mysql/bin/mysql -p'123' <<EOF
SET SQL_LOG_BIN=0;
CREATE USER repl@'%' IDENTIFIED WITH mysql_native_password BY '123';
GRANT REPLICATION SLAVE ON *.* TO repl@'%';
GRANT BACKUP_ADMIN ON *.* TO repl@'%';
GRANT GROUP_REPLICATION_STREAM ON *.* TO repl@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
CHANGE REPLICATION SOURCE TO SOURCE_USER='repl', SOURCE_PASSWORD='123' FOR CHANNEL 'group_replication_recovery';

SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

SELECT * FROM performance_schema.replication_group_members;

EOF


echo "MySQL ${mysql_version} 二进制安装完成"

 执行成功如下:warning不用管,是密码简单提醒

 slave节点(134、136)执行

cat /data/db.sh 

#!/bin/bash

systemctl stop mysqld 2>/dev/null
yum remove -y mysql-* mariadb-* 2>/dev/null
rpm -e --nodeps $(rpm -qa | grep -i mysql) 2>/dev/null
rm -rf /var/lib/mysql /var/log/mysqld* /etc/my.cnf* /var/run/mysqld
rm -rf /data/mysql /data/mysql-base
find / -name mysql -type d -exec rm -rf {} + 2>/dev/null

# 安装依赖
yum install -y libaio numactl perl wget tar

# 创建用户和组
if ! id "mysql" &>/dev/null; then
    groupadd mysql
    useradd -r -g mysql -s /bin/false mysql
fi

# 创建目录
mkdir -p /data/mysql-base
mkdir -p /data/mysql/{data,logs,tmp,binlog,conf,run}
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql

# 下载并解压mysql二进制包
mysql_version="8.0.42"
mysql_pkg="mysql-${mysql_version}-linux-glibc2.17-x86_64"

cd /data/mysql-base
echo "下载 mysql ${mysql_version}"
wget https://dev.mysql.com/get/Downloads/MySQL-8.0/${mysql_pkg}.tar.xz
tar -xvf ${mysql_pkg}.tar.xz
rm -f ${mysql_pkg}.tar.xz
mv ${mysql_pkg} mysql
echo "mysql ${mysql_version} 解压完成"



# 配置文件导入
cat > /data/mysql/conf/my.cnf <<EOF
[mysqld]
user=mysql
port=3306
basedir=/data/mysql-base/mysql
datadir=/data/mysql/data
socket=/data/mysql/run/mysqld.sock
tmpdir=/data/mysql/tmp
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/run/mysqld.pid
log-bin=/data/mysql/binlog/mysql-bin
relay-log=/data/mysql/binlog/relay-bin

character-set-server=UTF8MB4
lower_case_table_names=1
server_id=2
enforce_gtid_consistency=on
gtid_mode=on
default_authentication_plugin=mysql_native_password

plugin_dir=/data/mysql-base/mysql/lib/plugin
binlog_checksum=NONE
transaction_write_set_extraction=XXHASH64
loose-group_replication_recovery_use_ssl=ON
loose-group_replication_group_name="90e11f7a-58a5-11f0-a076-000c29366dc1"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address="192.168.93.138:24901"
loose-group_replication_group_seeds="192.168.93.137:24901,192.168.93.138:24901"
loose-group_replication_bootstrap_group=OFF

[client]
socket=/data/mysql/run/mysqld.sock
EOF

# 软链接到默认配置环境,可以不软
ln -sf /data/mysql/conf/my.cnf /etc/my.cnf

# 初始化mysql
echo "初始化mysql数据库---"
/data/mysql-base/mysql/bin/mysqld --initialize-insecure \
    --user=mysql \
    --basedir=/data/mysql-base/mysql \
    --datadir=/data/mysql/data

# systemd服务文件配置
cat > /etc/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Server
After=network.target

[Service]
User=mysql
Group=mysql
Type=forking
Environment="MYSQL_HOME=/data/mysql/conf"
Environment="PATH=/data/mysql-base/mysql/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
ExecStart=/data/mysql-base/mysql/bin/mysqld --defaults-file=/data/mysql/conf/my.cnf --daemonize --pid-file=/data/mysql/run/mysqld.pid
ExecReload=/bin/kill -HUP \$MAINPID
Restart=on-failure
RestartPreventExitStatus=1
PrivateTmp=true
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF

# 目录权限
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql

# 启动mysql服务
systemctl daemon-reload
systemctl start mysqld
systemctl enable mysqld

# 等待mysql启动
echo "等待mysql启动..."
sleep 10
while ! /data/mysql-base/mysql/bin/mysqladmin ping --silent; do
    echo "等待mysql启动..."
    sleep 5
done

# 设置root简单密码
echo "设置root密码..."
/data/mysql-base/mysql/bin/mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '123';"

# 设置登录方式环境变量
echo 'export PATH=/data/mysql-base/mysql/bin:$PATH' > /etc/profile.d/mysql.sh
source /etc/profile.d/mysql.sh

# 配置组复制
echo "配置组复制..."
/data/mysql-base/mysql/bin/mysql -p'123' <<EOF
set sql_log_bin=0;
create user repl@'%' identified with mysql_native_password by '123';
grant replication slave on *.* to repl@'%';
grant backup_admin on *.* to repl@'%';
grant group_replication_stream on *.* to repl@'%';
flush privileges;
set sql_log_bin=1;
install plugin group_replication soname 'group_replication.so';
change replication source to source_user='repl', source_password='123' for channel 'group_replication_recovery';
start group_replication;
select * from performance_schema.replication_group_members;
EOF


echo "MySQL ${mysql_version} 二进制安装完成"

上面两个主从脚本在四个虚机跑完,没有报错,就不用看这个报错说明,直接查看组复制状态就成。

如有报错,可登录数据库查看组复制状态,然后去查看error.log日志去排查,大部分都是配置错误,hosts错误、多了gtid等错误。

如果以前有配置过的,有环境的,就只下载插件,my.cnf配置上面的group的参数,先清掉master信息,再执行上面的 

#建立复制关系
change master to  master_user='repl', master_password='123' for channel  'group_replication_recovery';
# 开启自启服务
set global group_replication_bootstrap_group=on;
#开启组复制
start group_replication;
# 关闭自启
set global group_replication_bootstrap_group=off;
#查看组复制关系
select * from performance_schema.replication_group_members;
#清除master信息,重置binlog和gtid等
reset master;
#查看uuid,克隆的机器要改uuid;停止mysql,删掉auto.cnf,再启动myql即可
select uuid();
show master status\G
# 查看gtid是否清空
 select * from mysql.gtid_executed;

MGR两套主从架构状态信息如下:

133、134

 135、136

分别在两个MGR集群的133和135创建以下两个用户: 

create user 'appuser'@'%' identified by '123';
grant all privileges on *.* to 'appuser'@'%';

create user 'monitor'@'%' identified by '123';
grant all privileges on *.* to 'monitor'@'%';
flush privileges;

proxysql部署

分别在两台proxysql节点(137、138)执行以下安装脚本:

cat proxy.sh 
chmod +x proxy.sh
#!/bin/bash

cat <<EOF | tee /etc/yum.repos.d/proxysql.repo
[proxysql]
name=ProxySQL YUM repository
baseurl=https://repo.proxysql.com/ProxySQL/proxysql-2.5.x/centos/\$releasever
gpgcheck=1
gpgkey=https://repo.proxysql.com/ProxySQL/proxysql-2.5.x/repo_pub_key
EOF

#下载mysql客户端,proxysql需要mysql客户端连接
yum install -y proxysql mysql-client
systemctl start proxysql.service
systemctl enable proxysql.service

 137 proxysql1执行如下进行分片规则配置:

rule_id 规则唯一标识,按数字顺序处理规则
active 1=启用规则,0=禁用规则
match_digest 匹配抽象的sql语句,如^insert,会把insert开头全部匹配
match_pattern

匹配具体sql语句,如insert into user_info (name, address, age, email) values('ll

q','湖南省',19,'com'); 

re_modifiers 正则修饰符,如CASELESS为忽略大小写
flagIN / flagOUT 规则链标记传递:flagOUT 设置标记,flagIN 匹配标记
destination_hostgroup SQL 路由的目标主机组(如 10=写组1,20=读组,30=写组2)
apply 1=匹配后立即执行并停止后续规则,0=继续向下匹配
age 匹配字age字段
\s*,\s*

\s* = 匹配 0 个或多个空白字符

, = 匹配逗号

匹配 age 后的逗号,允许前后有空格

[^)]+\)

[^)]+ = 匹配 1 个或多个非右括号 ) 的字符

\) = 匹配右括号  )

匹配 age, ...) 中的字段列表部分(name,)

\s*VALUES\s*\(

\s* = 匹配 0 个或多个空白字符

VALUES = 匹配 VALUES字段

\s* = 匹配空白字符

\( = 匹配左括号 

[^)]+?,\s*[^)]+?,\s*

[^)]+? = 匹配 1 个或多个非右括号 ) 的字符

, = 匹配逗号

\s* = 匹配空白字符

两个就是重复以上匹配

(\d*[02468])

\d* = 匹配 0 个或多个数字

[02468] = 匹配一个偶数(0/2/4/6/8)

匹配以偶数结尾的整数

\D 匹配一个非数字字符(确保年龄值结束,如逗号 ,、右括号 ) 或空格)
# 默认登录管理端口
mysql -u admin -padmin -h 127.0.0.1 -P 6032
# 配置数据库节点的
replace into mysql_servers(hostgroup_id, hostname, port) values
(10, '192.168.93.133', 3306),
(20, '192.168.93.134', 3306),  
(30, '192.168.93.135', 3306),
(40, '192.168.93.136', 3306); 

# 配置监控用户
update global_variables set variable_value='monitor' 
  where variable_name='mysql-monitor_username';
update global_variables set variable_value='123' 
  where variable_name='mysql-monitor_password';

# 添加连接用户
replace into mysql_users(username, password, default_hostgroup) 
values ('appuser', '123', 10);
# 设置用户只能对test1操作,如不设置,那执行sql时得对指定库操作
update mysql_users  set default_schema = 'test1'   where username = 'appuser';
# 持久化用户加执行
load mysql users to runtime;
save mysql users to disk;

# 配置读写分离规则
# 先在rules表定义insert/update/delete/replace/select for update(加锁读) 写操作
replace into mysql_query_rules (rule_id, active, match_digest, re_modifiers, flagOUT, apply)
values 
(1, 1, '^SELECT.*FOR UPDATE', 'CASELESS', 1000, 0),
(2, 1, '^(INSERT|UPDATE|DELETE|REPLACE)', 'CASELESS', 1000, 0);

# 定义读操作select,flageout就是定义,级别低于规则1,apply传递定义
replace into mysql_query_rules (rule_id, active, match_digest, re_modifiers, flagOUT, apply)
values 
(3, 1, '^SELECT', 'CASELESS', 2000, 0);

# 定义insert规则,根据age字段来分片,因为id是自增的,并且跨实例的id分奇偶数表数据会很难看,所以用age年龄用正则去对年龄进行取模判断奇偶数,flagint接受传递的定义
replace into mysql_query_rules (
    rule_id, active, 
    match_pattern, re_modifiers, 
    flagIN, destination_hostgroup, apply
)
values 
# age是偶数的下发到hostgroup10
(4, 1, 'age\s*,\s*[^)]+\)\s*VALUES\s*\([^)]+?,\s*[^)]+?,\s*(\d*[02468])\D', 'CASELESS', 1000, 10, 1),
# age是奇数的下发到hostgroup30
(5, 1, 'age\s*,\s*[^)]+\)\s*VALUES\s*\([^)]+?,\s*[^)]+?,\s*(\d*[13579])\D', 'CASELESS', 1000, 30, 1);

# 条件操作规则,根据 name 和 email 分片
replace into mysql_query_rules (
    rule_id, active, 
    match_pattern, re_modifiers, 
    flagIN, destination_hostgroup, apply
)
values 
# 条件update和delete操作
(6, 1, 'WHERE\s+name\s*=\s*[''"]?([^''")]+)[''"]?\s+AND\s+email\s*=\s*[''"]?([^''")]+)', 'CASELESS', 1000, 10, 1),
# select操作解释定义,条件根据name和email去查
(7, 1, 'WHERE\s+name\s*=\s*[''"]?([^''")]+)[''"]?\s+AND\s+email\s*=\s*[''"]?([^''")]+)', 'CASELESS', 2000, 20, 1);

# 默认路由规则
replace into mysql_query_rules (
    rule_id, active, 
    flagIN, destination_hostgroup, apply
)
values 
# 写操作默认路由 (hostgroup10)
(8, 1, 1000, 10, 1),
# 读操作默认路由 (hostgroup20)
(9, 1, 2000, 20, 1);




# proxy自动检测主从配置,用read_only=on/off去判断集群主从关系
replace into mysql_replication_hostgroups VALUES 
(10,20,'read_only','ProxySQL自动主从检测'),
(30,40,'read_only','ProxySQL自动主从检测');

# 保存配置,disk表示持久化到磁盘,runtime表示立即生效
save mysql servers to disk;
save mysql variables to disk;
save mysql query rules to disk;
save mysql users to disk;
load mysql servers to runtime;
load mysql variables to runtime;
load mysql query rules to runtime;
load mysql users to runtime;

 138 proxysql2 就是复制 proxysql1的配置:

#创建proxysql集群配置同步账号,2.0以上版本支持的功能
set admin-cluster_username = 'cuser';
set admin-cluster_password = '123';
update global_variables set variable_value = 'admin:admin;cuser:123' 
where variable_name = 'admin-admin_credentials';

load admin variables to runtime;
save admin variables to disk;
 
# 创建集群节点
insert into proxysql_servers (hostname, port, weight, comment) 
values ('192.168.93.137', 6032, 1, 'node1'), 
       ('192.168.93.138', 6032, 1, 'node2');

# 调整的参数刷盘和及时生效
load proxysql servers to runtime;
save proxysql servers to disk;  
load mysql users to runtime; 
save mysql users to disk;
   

 配置完成后,可在137上查看四台db节点的信息,查看配置的读写分离状态

keepalived部署

137、138安装keepalived,防止proxysql单点故障

137

yum install -y keepalived

cat keepalived.conf


! Configuration File for keepalived

global_defs {
    router_id proxysql1   #主机名
script_user root     #脚本执行用户
enable_script_security      
}

vrrp_script chk_proxysql {
    interval 2    #执行脚本间隔
    weight -20    #权重优先级

}

vrrp_instance VI_1 {
    state MASTER    #服务器名主从设置
    interface ens33    #网卡用ip addr查看
    virtual_router_id 10    #路由id
    priority 100    #优先级
    advert_int 1    #检查间隔
    
    authentication {
        auth_type PASS    #认证
        auth_pass 1111
    }
    
    virtual_ipaddress {
        192.168.93.101/24       dev ens33     #vip
    }
    
    
}

#启动keepalived,开启开机自启,查看状态
systemctl start keepalived

 systemctl enable keepalived
systemctl status keepalived

138

yum install -y keepalived

cat keepalived.conf

! Configuration File for keepalived

global_defs {
    router_id proxysql2
script_user root 
enable_script_security
}

vrrp_script chk_proxysql {
    interval 2
    weight -20
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 10
    priority 90
    advert_int 1
    
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    
    virtual_ipaddress {
        192.168.93.101/24       dev ens33
    }
   }



systemctl start keepalived

 systemctl enable keepalived
systemctl status keepalived

状态信息 

 安装完keepalived可使用vip去查后端数据库是否连接,sql是否能正常下发。

mysql -h 192.168.93.101 -P 6033 -uappuser -p123 -e "select @@hostname"

可以看到select根据读写分离下发至db1-slave上,查到主机名。 

 测试

测试功能:实时业务流量下发 -》验证两套mgr集群的数据是否根据age分片下发-》proxysql故障 -》keepalived是否自动切换 -》mgr主节点故障 -》从节点是否接管流量 -》 流量实时下发 

在测试之前应查看mgr同步状态,查看proxysql对后端数据库的状态是否正常

proxysql 137上如下命令可查看信息和状态

mysql -u admin -padmin -h 127.0.0.1 -P 6032
# 查看后端主从是否状态正常
select * from runtime_mysql_servers;
# 查看monitor监控后端服务 
select * from monitor.mysql_server_ping_log  order by time_start_us desc  limit 10;
# 查看服务配置
select * from mysql_servers;
# 查看分片规则配置
select * from mysql_query_rules;

 测试前要创建test1库

create database test1;

使用之前proxysql安装时创建的用户appuser,python脚本模拟流量下发

#!/usr/bin/env python3
import pymysql
from pymysql import Error
import random
import time
from faker import Faker
 
 #pymysql连接vip和账号
db_config = {
    'host': '192.168.93.101',
    'user': 'appuser',
    'password': '123',
    'database': 'test1',
    'port': 3306,
    'charset': 'utf8mb4',
    'cursorclass': pymysql.cursors.DictCursor
}
#第三方库,随机生成中文数据
 
fake = Faker('zh_CN')
 
#创建表结构
def create_table():
    try:
        with pymysql.connect(**db_config) as conn:
            with conn.cursor() as cursor:
                create_table_sql = """
                create table if not exists user_info(
                    id int auto_increment primary key,
                    name varchar(10) not null,
                    address varchar(30) not null,
                    age int,
                    email varchar(20),
                    create_time timestamp default current_timestamp)
                    engine=innodb default charset=utf8mb4;
                """
                cursor.execute(create_table_sql)
            conn.commit()
 #随机生成数据
 
def generate_random_data():
    return {
        'name': fake.name(),
        'address': fake.address().replace('\n', ' '),
        'age': random.randint(18, 80),
        'email': fake.email()
    }
 
 #连接数据库,insert随机的数据
def insert_data():
    try:
        with pymysql.connect(**db_config) as conn:
           with conn.cursor() as cursor:
              insert_sql = """
              insert into user_info (name, address, age, email)
                values (%(name)s, %(address)s, %(age)s, %(email)s)
                """
                data = generate_random_data()
                cursor.execute(insert_sql, data)
            conn.commit()
    except Error as e:
        print(f"数据写入失败: {e}")
 
 #while循环每次写入三条数据
def main():
    create_table()
    try:
        print("开始写入数据")
        while True:
            for _ in range(3):
                insert_data()
 
            print(f"写入3行数据,时间: {time.strftime('%Y-%m-%d %H:%M:%S')}")
            time.sleep(1) 
    except KeyboardInterrupt:
        print("\n已停止写入")
 
 
if __name__ == "__main__":
    main()

 执行如上脚本,制造数据,执行如下

 查看proxysql配置的分片规则命中数,rule_id就是分片规则的id,可以看到insert走的配置的规则,且规则4、5就是把数据分片到奇偶数到hostgroup10(133),hostgroup30(135)

 

查看两个133、135主的数据分布情况

 可以清晰的看到age字段奇数偶数分离在两个集群,且id字段为自增连续的

 此时去停止proxysql1的服务,停机制作故障

shutdown -h now

 去138,proxysql2去查看vip是否飘逸,vip101飘逸至138

 去查看脚本状态,是否有报错,可以看到无报错,vip飘逸在故障时间点会导致sql下发有延迟,但还是只要不到超时时间则会等待飘逸完成后继续下发,超时容忍可以去配置。

 此时继续停掉133,mgr-1集群主节点,制造故障

继续查看脚本执行情况,无中断。 

 mgr-1集群的从节点134,去134查看业务流量是否持续写入,可以看到表数据一直在增加,理论上无明显影响

 查看135,偶数下发的节点,数据持续增加,理论上无影响

 分别查看偶数分片情况,此时134为mgr-1集群的主

 135查看奇数分片数据情况

现在开机对137proxysql主节点

 此时可看到因为proxysql配置了抢占模式,也就是137会把101vip抢占回来,此时影响了流量下发,可以配置不抢占模式,理论上影响面更小。

对133开机,查看mgr同步状态已中断

134状态为主,单节点 

 在133执行开启组复制模式,此时133加入集群,并且会把故障时间段的数据从134同步过来

验证数据是否同步,133数据如下 

 134数据如下,同步情况正常,故障后加入集群也能把数据同步

 总结:该高可用分布式架构,可保证节点之间的高可用,避免单点故障,保证流量下发,细节部分可以调整,大数据量情况下的sql优化、索引优化、配置参数优化等。
 缺点:分片规则复杂,而且分片规则只对insert做了详细的正则,对其他修改和删除等规则不完善,对于分片来说,使用成熟的分片分库分表的中间件会更好(mycat/dble)。无自动备份、监控管理等功能。可用xtrabackup和Prometheus+grafana搭建较为成熟的数据库体系。

 问题

部署完成中出现的一些问题,及排查思路

执行测试脚本时,遇到写入失败,且报错为无法写入是因为开启了super-read-only,所以判断为写入流量至读节点。

 从137上去单独执行,发现确实是写入到从节点了

此时可判断为是配置问题,去排查配置信息,发现runtime表,有判断两个从节点也为可写入模式 

如上图,runtime_mysql_servers表中,运行的主从,因为proxy配置自动监控判断主从节点,导致134和136也被判断成主节点,可写入,所以排查了看到是因为组复制规则的判断条件是innodb_read_only 是否开启,innodb_read_only=on则判断为读节点,off判断为可写节点。

而134从节点上的为off 

 所以结合上述应该修改条件为read_only来判断

删除原因规则,重新插入新规则,持久化和直接生效。

 delete from mysql_replication_hostgroups;

 select * from mysql_replication_hostgroups;

 insert into mysql_replication_hostgroups 
    (writer_hostgroup, reader_hostgroup, check_type, comment) 
    values
   (10, 20, 'read_only', 'ProxySQL自动主从检测'),
   (30, 40, 'read_only', 'ProxySQL自动主从检测');


save mysql servers to disk;
save mysql variables to disk;
save mysql query rules to disk;
save mysql users to disk;
load mysql servers to runtime;
load mysql variables to runtime;
load mysql query rules to runtime;
load mysql users to runtime;

​

 问题解决

部署过程中,有很多其他的问题,不全部列出,有问题可先思考问题的流程逻辑,从日志及报错信息入手,查看配置及规则,调整配置来测试问题。

Logo

更多推荐