proxysql+mysql+zookeeper+keepalived分布式数据库集群架构-分片高可用篇
前文有搭建MGR和keepalived的学习,目前结合之前的路线,MGR单主一主两从无法满足需求,我们此章从proxysql对MGR进行分片,搭建两套MGR组复制,并且proxysql对其进行分片规则和流量分发,zookeeper做管理配置协调中心。ZooKeeper 用于集群协调和配置管理ProxySQL 实现高效分片路由MySQL Group Replication 保证数据高可用mysqlm
集群介绍
前文有搭建MGR和keepalived的学习,目前结合之前的路线,MGR单主一主两从无法满足需求,我们此章从proxysql对MGR进行分片,搭建两套MGR组复制,并且proxysql对其进行分片规则和流量分发,zookeeper做管理配置协调中心。
-
ZooKeeper 用于集群协调和配置管理
-
ProxySQL 实现高效分片路由
-
MySQL Group Replication 保证数据高可用
192.168.93.133 | mysql |
192.168.93.134 | mysql |
192.168.93.135 | mysql |
192.168.93.136 | mysql+zookeeper |
192.168.93.137 | proxysql+zookeeper+keepalived |
192.168.93.138 | proxysql+zookeeper+keepalived |
系统环境+主机名
192.168.93.133 db1-master
192.168.93.134 db1-slave
192.168.93.135 db2-master
192.168.93.136 db2-slave zk3
192.168.93.137 proxysql1 zk1
192.168.93.138 proxysql2 zk2
测试功能:实时业务流量下发 -》验证两套mgr集群的数据是否根据age分片下发-》proxysql故障 -》keepalived是否自动切换 -》mgr主节点故障 -》从节点是否接管流量 -》 流量实时下发
集群架构:
ZooKeeper核心机制
目前结合zookeeper的机制原理,期望实现如下在架构中的关键作用
-
配置中心:
-
存储集群拓扑信息
-
管理分片路由规则
-
保存备份策略
-
-
状态管理:
-
记录当前主节点信息
-
跟踪节点健康状态
-
维护备份历史
-
-
分布式协调:
-
主从切换通知
-
分布式锁(用于备份任务)
-
集群选主
-
-
服务发现:
-
ProxySQL 节点注册
-
MySQL 实例发现
-
监控目标自动发现
-
基础环境配置
先不用ansible,后面搭建完成后可以用ansible自动化部署,现在一步一步更能理解。
zookeeper部署
分别在如下的六台虚机创建脚本执行:
环境信息根据需求来添加
cat env.sh
chmod +x env.sh
#!/bin/bash
# 设置主机名解析
cat >> /etc/hosts <<EOF
192.168.93.133 db1-master
192.168.93.134 db1-slave
192.168.93.135 db2-master
192.168.93.136 db2-slave zk3
192.168.93.137 proxysql1 zk1
192.168.93.138 proxysql2 zk2
EOF
# 关闭防火墙
systemctl stop firewalld && systemctl disable firewalld
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# 时间同步
yum install -y ntp
echo "server ntp.aliyun.com iburst" >> /etc/ntp.conf
systemctl restart ntpd && systemctl enable ntpd
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo
yum clean all
yum makecache fast
# 安装基础工具
yum install -y epel-release
yum install -y git wget curl vim net-tools htop python3 python3-p nc
配置完可以检查一下机器的时间是否同步,防火墙是否关闭。
机器有限,对应的hostname修改成hosts里面的主机名,通过查看hostname,配置myid。
三台zk:136、137、138
cat zk.sh
#!/bin/bash
# 安装JDK
yum install -y java-11-openjdk
# 下载镜像
url="https://archive.apache.org/dist/zookeeper/zookeeper-3.8.1/apache-zookeeper-3.8.1-bin.tar.gz"
wget $url -O /tmp/apache-zookeeper-3.8.1-bin.tar.gz
# 解压安装
tar -zxvf /tmp/apache-zookeeper-3.8.1-bin.tar.gz -C /opt
mv /opt/apache-zookeeper-3.8.1-bin /opt/zookeeper
# 创建目录
mkdir -p /data/zookeeper
# 配置zoo.cfg
cat > /opt/zookeeper/conf/zoo.cfg <<EOF
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/data/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
EOF
# 节点特定配置
# 分别在不同节点执行对应命令
h=$(hostname)
case $h in
proxy-1) echo "1" > /data/zookeeper/myid ;;
proxy-2) echo "2" > /data/zookeeper/myid ;;
n3) echo "3" > /data/zookeeper/myid ;;
*) echo "ERROR: 未知主机名" && exit 1 ;;
esac
# 启动服务
/opt/zookeeper/bin/zkServer.sh start
启动成功
/opt/zookeeper/bin/zkServer.sh start
# 验证状态
/opt/zookeeper/bin/zkServer.sh status
tickTime=2000 zk的基本时间单位毫秒,心跳检测超时计算
initLimit=10 从节点初始化连接到zk主的最大时间,10x2000=20秒
syncLimit=5 从和主之间心跳响应的最大超时时间,5x2000=10秒
dataDir=/data/zookeeper 数据目录
clientPort=2181 zk对外访问端口
server.1=zk1:2888:3888 2888为节点间数据同步端口,3888为选举端口
server.2=zk2:2888:3888
server.3=zk3:2888:3888
创建节点
/opt/zookeeper/bin/zkCli.sh
create /database-cluster ""
create /database-cluster/mgr-clusters ""
create /database-cluster/mgr-clusters/db1 ""
create /database-cluster/mgr-clusters/db2 ""
create /database-cluster/mgr-clusters/db1/master "192.168.93.133"
create /database-cluster/mgr-clusters/db1/slaves '["192.168.93.134"]'
create /database-cluster/mgr-clusters/db2/master "192.168.93.135"
create /database-cluster/mgr-clusters/db2/slaves '["192.168.93.136"]'
create /database-cluster/proxysql ""
create /database-cluster/proxysql/active "192.168.93.137"
create /database-cluster/proxysql/standby "192.168.93.138"
create /database-cluster/locks ""
MGR部署
MGR主要命令:
之前的文章有搭过,可查看!
MGR节点master(133/135)配置,里面的ip和端口改成实际的。步骤写成脚本了,可执行脚本部署。
cat /data/db.sh
chmod +x db.sh
#!/bin/bash
systemctl stop mysqld 2>/dev/null
yum remove -y mysql-* mariadb-* 2>/dev/null
rpm -e --nodeps $(rpm -qa | grep -i mysql) 2>/dev/null
rm -rf /var/lib/mysql /var/log/mysqld* /etc/my.cnf* /var/run/mysqld
rm -rf /data/mysql /data/mysql-base
find / -name mysql -type d -exec rm -rf {} + 2>/dev/null
# 安装依赖
yum install -y libaio numactl perl wget tar
# 创建用户和组
if ! id "mysql" &>/dev/null; then
groupadd mysql
useradd -r -g mysql -s /bin/false mysql
fi
# 创建目录
mkdir -p /data/mysql-base
mkdir -p /data/mysql/{data,logs,tmp,binlog,conf,run}
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql
# 下载并解压mysql二进制包
mysql_version="8.0.42"
mysql_pkg="mysql-${mysql_version}-linux-glibc2.17-x86_64"
cd /data/mysql-base
echo "下载 mysql ${mysql_version}"
wget https://dev.mysql.com/get/Downloads/MySQL-8.0/${mysql_pkg}.tar.xz
tar -xvf ${mysql_pkg}.tar.xz
rm -f ${mysql_pkg}.tar.xz
mv ${mysql_pkg} mysql
echo "mysql ${mysql_version} 解压完成"
# 配置文件导入
cat > /data/mysql/conf/my.cnf <<EOF
[mysqld]
user=mysql
port=3306
basedir=/data/mysql-base/mysql
datadir=/data/mysql/data
socket=/data/mysql/run/mysqld.sock
tmpdir=/data/mysql/tmp
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/run/mysqld.pid
log-bin=/data/mysql/binlog/mysql-bin
relay-log=/data/mysql/binlog/relay-bin
character-set-server=UTF8MB4
lower_case_table_names=1
server_id=1
enforce_gtid_consistency=on
gtid_mode=on
default_authentication_plugin=mysql_native_password
plugin_dir=/data/mysql-base/mysql/lib/plugin
binlog_checksum=NONE
transaction_write_set_extraction=XXHASH64
loose-group_replication_recovery_use_ssl=ON
loose-group_replication_group_name="90e11f7a-58a5-11f0-a076-000c29366dc1"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address="192.168.93.137:24901"
loose-group_replication_group_seeds="192.168.93.137:24901,192.168.93.138:24901"
loose-group_replication_bootstrap_group=OFF
[client]
socket=/data/mysql/run/mysqld.sock
EOF
# 软链接到默认配置环境,可以不软
ln -sf /data/mysql/conf/my.cnf /etc/my.cnf
# 初始化mysql
echo "初始化mysql数据库---"
/data/mysql-base/mysql/bin/mysqld --initialize-insecure \
--user=mysql \
--basedir=/data/mysql-base/mysql \
--datadir=/data/mysql/data
# systemd服务文件配置
cat > /etc/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Server
After=network.target
[Service]
User=mysql
Group=mysql
Type=forking
Environment="MYSQL_HOME=/data/mysql/conf"
Environment="PATH=/data/mysql-base/mysql/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
ExecStart=/data/mysql-base/mysql/bin/mysqld --defaults-file=/data/mysql/conf/my.cnf --daemonize --pid-file=/data/mysql/run/mysqld.pid
ExecReload=/bin/kill -HUP \$MAINPID
Restart=on-failure
RestartPreventExitStatus=1
PrivateTmp=true
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target
EOF
# 目录权限
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql
# 启动mysql服务
systemctl daemon-reload
systemctl start mysqld
systemctl enable mysqld
# 等待mysql启动
echo "等待mysql启动..."
sleep 10
while ! /data/mysql-base/mysql/bin/mysqladmin ping --silent; do
echo "等待mysql启动..."
sleep 5
done
# 设置root简单密码
echo "设置root密码..."
/data/mysql-base/mysql/bin/mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '123';"
# 设置登录方式环境变量
echo 'export PATH=/data/mysql-base/mysql/bin:$PATH' > /etc/profile.d/mysql.sh
source /etc/profile.d/mysql.sh
# 配置组复制
echo "配置组复制..."
/data/mysql-base/mysql/bin/mysql -p'123' <<EOF
SET SQL_LOG_BIN=0;
CREATE USER repl@'%' IDENTIFIED WITH mysql_native_password BY '123';
GRANT REPLICATION SLAVE ON *.* TO repl@'%';
GRANT BACKUP_ADMIN ON *.* TO repl@'%';
GRANT GROUP_REPLICATION_STREAM ON *.* TO repl@'%';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
CHANGE REPLICATION SOURCE TO SOURCE_USER='repl', SOURCE_PASSWORD='123' FOR CHANNEL 'group_replication_recovery';
SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;
SELECT * FROM performance_schema.replication_group_members;
EOF
echo "MySQL ${mysql_version} 二进制安装完成"
执行成功如下:warning不用管,是密码简单提醒
slave节点(134、136)执行
cat /data/db.sh
#!/bin/bash
systemctl stop mysqld 2>/dev/null
yum remove -y mysql-* mariadb-* 2>/dev/null
rpm -e --nodeps $(rpm -qa | grep -i mysql) 2>/dev/null
rm -rf /var/lib/mysql /var/log/mysqld* /etc/my.cnf* /var/run/mysqld
rm -rf /data/mysql /data/mysql-base
find / -name mysql -type d -exec rm -rf {} + 2>/dev/null
# 安装依赖
yum install -y libaio numactl perl wget tar
# 创建用户和组
if ! id "mysql" &>/dev/null; then
groupadd mysql
useradd -r -g mysql -s /bin/false mysql
fi
# 创建目录
mkdir -p /data/mysql-base
mkdir -p /data/mysql/{data,logs,tmp,binlog,conf,run}
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql
# 下载并解压mysql二进制包
mysql_version="8.0.42"
mysql_pkg="mysql-${mysql_version}-linux-glibc2.17-x86_64"
cd /data/mysql-base
echo "下载 mysql ${mysql_version}"
wget https://dev.mysql.com/get/Downloads/MySQL-8.0/${mysql_pkg}.tar.xz
tar -xvf ${mysql_pkg}.tar.xz
rm -f ${mysql_pkg}.tar.xz
mv ${mysql_pkg} mysql
echo "mysql ${mysql_version} 解压完成"
# 配置文件导入
cat > /data/mysql/conf/my.cnf <<EOF
[mysqld]
user=mysql
port=3306
basedir=/data/mysql-base/mysql
datadir=/data/mysql/data
socket=/data/mysql/run/mysqld.sock
tmpdir=/data/mysql/tmp
log-error=/data/mysql/logs/error.log
pid-file=/data/mysql/run/mysqld.pid
log-bin=/data/mysql/binlog/mysql-bin
relay-log=/data/mysql/binlog/relay-bin
character-set-server=UTF8MB4
lower_case_table_names=1
server_id=2
enforce_gtid_consistency=on
gtid_mode=on
default_authentication_plugin=mysql_native_password
plugin_dir=/data/mysql-base/mysql/lib/plugin
binlog_checksum=NONE
transaction_write_set_extraction=XXHASH64
loose-group_replication_recovery_use_ssl=ON
loose-group_replication_group_name="90e11f7a-58a5-11f0-a076-000c29366dc1"
loose-group_replication_start_on_boot=OFF
loose-group_replication_local_address="192.168.93.138:24901"
loose-group_replication_group_seeds="192.168.93.137:24901,192.168.93.138:24901"
loose-group_replication_bootstrap_group=OFF
[client]
socket=/data/mysql/run/mysqld.sock
EOF
# 软链接到默认配置环境,可以不软
ln -sf /data/mysql/conf/my.cnf /etc/my.cnf
# 初始化mysql
echo "初始化mysql数据库---"
/data/mysql-base/mysql/bin/mysqld --initialize-insecure \
--user=mysql \
--basedir=/data/mysql-base/mysql \
--datadir=/data/mysql/data
# systemd服务文件配置
cat > /etc/systemd/system/mysqld.service <<EOF
[Unit]
Description=MySQL Server
After=network.target
[Service]
User=mysql
Group=mysql
Type=forking
Environment="MYSQL_HOME=/data/mysql/conf"
Environment="PATH=/data/mysql-base/mysql/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin"
ExecStart=/data/mysql-base/mysql/bin/mysqld --defaults-file=/data/mysql/conf/my.cnf --daemonize --pid-file=/data/mysql/run/mysqld.pid
ExecReload=/bin/kill -HUP \$MAINPID
Restart=on-failure
RestartPreventExitStatus=1
PrivateTmp=true
LimitNOFILE=65535
[Install]
WantedBy=multi-user.target
EOF
# 目录权限
chown -R mysql:mysql /data/mysql-base /data/mysql
chmod 750 /data/mysql
# 启动mysql服务
systemctl daemon-reload
systemctl start mysqld
systemctl enable mysqld
# 等待mysql启动
echo "等待mysql启动..."
sleep 10
while ! /data/mysql-base/mysql/bin/mysqladmin ping --silent; do
echo "等待mysql启动..."
sleep 5
done
# 设置root简单密码
echo "设置root密码..."
/data/mysql-base/mysql/bin/mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '123';"
# 设置登录方式环境变量
echo 'export PATH=/data/mysql-base/mysql/bin:$PATH' > /etc/profile.d/mysql.sh
source /etc/profile.d/mysql.sh
# 配置组复制
echo "配置组复制..."
/data/mysql-base/mysql/bin/mysql -p'123' <<EOF
set sql_log_bin=0;
create user repl@'%' identified with mysql_native_password by '123';
grant replication slave on *.* to repl@'%';
grant backup_admin on *.* to repl@'%';
grant group_replication_stream on *.* to repl@'%';
flush privileges;
set sql_log_bin=1;
install plugin group_replication soname 'group_replication.so';
change replication source to source_user='repl', source_password='123' for channel 'group_replication_recovery';
start group_replication;
select * from performance_schema.replication_group_members;
EOF
echo "MySQL ${mysql_version} 二进制安装完成"
上面两个主从脚本在四个虚机跑完,没有报错,就不用看这个报错说明,直接查看组复制状态就成。
如有报错,可登录数据库查看组复制状态,然后去查看error.log日志去排查,大部分都是配置错误,hosts错误、多了gtid等错误。
如果以前有配置过的,有环境的,就只下载插件,my.cnf配置上面的group的参数,先清掉master信息,再执行上面的
#建立复制关系
change master to master_user='repl', master_password='123' for channel 'group_replication_recovery';
# 开启自启服务
set global group_replication_bootstrap_group=on;
#开启组复制
start group_replication;
# 关闭自启
set global group_replication_bootstrap_group=off;
#查看组复制关系
select * from performance_schema.replication_group_members;
#清除master信息,重置binlog和gtid等
reset master;
#查看uuid,克隆的机器要改uuid;停止mysql,删掉auto.cnf,再启动myql即可
select uuid();
show master status\G
# 查看gtid是否清空
select * from mysql.gtid_executed;
MGR两套主从架构状态信息如下:
133、134
135、136
分别在两个MGR集群的133和135创建以下两个用户:
create user 'appuser'@'%' identified by '123';
grant all privileges on *.* to 'appuser'@'%';
create user 'monitor'@'%' identified by '123';
grant all privileges on *.* to 'monitor'@'%';
flush privileges;
proxysql部署
分别在两台proxysql节点(137、138)执行以下安装脚本:
cat proxy.sh
chmod +x proxy.sh
#!/bin/bash
cat <<EOF | tee /etc/yum.repos.d/proxysql.repo
[proxysql]
name=ProxySQL YUM repository
baseurl=https://repo.proxysql.com/ProxySQL/proxysql-2.5.x/centos/\$releasever
gpgcheck=1
gpgkey=https://repo.proxysql.com/ProxySQL/proxysql-2.5.x/repo_pub_key
EOF
#下载mysql客户端,proxysql需要mysql客户端连接
yum install -y proxysql mysql-client
systemctl start proxysql.service
systemctl enable proxysql.service
137 proxysql1执行如下进行分片规则配置:
rule_id |
规则唯一标识,按数字顺序处理规则 |
active |
1 =启用规则,0 =禁用规则 |
match_digest |
匹配抽象的sql语句,如^insert,会把insert开头全部匹配 |
match_pattern |
匹配具体sql语句,如insert into user_info (name, address, age, email) values('ll q','湖南省',19,'com'); |
re_modifiers |
正则修饰符,如CASELESS为 忽略大小写 |
flagIN / flagOUT |
规则链标记传递:flagOUT 设置标记,flagIN 匹配标记 |
destination_hostgroup |
SQL 路由的目标主机组(如 10 =写组1,20 =读组,30 =写组2) |
apply |
1 =匹配后立即执行并停止后续规则,0 =继续向下匹配 |
age |
匹配字age字段 |
\s*,\s* |
匹配 |
[^)]+\) |
匹配 |
\s*VALUES\s*\( |
|
[^)]+?,\s*[^)]+?,\s* |
两个就是重复以上匹配 |
(\d*[02468]) |
匹配以偶数结尾的整数 |
\D |
匹配一个非数字字符(确保年龄值结束,如逗号 , 、右括号 ) 或空格) |
# 默认登录管理端口
mysql -u admin -padmin -h 127.0.0.1 -P 6032
# 配置数据库节点的
replace into mysql_servers(hostgroup_id, hostname, port) values
(10, '192.168.93.133', 3306),
(20, '192.168.93.134', 3306),
(30, '192.168.93.135', 3306),
(40, '192.168.93.136', 3306);
# 配置监控用户
update global_variables set variable_value='monitor'
where variable_name='mysql-monitor_username';
update global_variables set variable_value='123'
where variable_name='mysql-monitor_password';
# 添加连接用户
replace into mysql_users(username, password, default_hostgroup)
values ('appuser', '123', 10);
# 设置用户只能对test1操作,如不设置,那执行sql时得对指定库操作
update mysql_users set default_schema = 'test1' where username = 'appuser';
# 持久化用户加执行
load mysql users to runtime;
save mysql users to disk;
# 配置读写分离规则
# 先在rules表定义insert/update/delete/replace/select for update(加锁读) 写操作
replace into mysql_query_rules (rule_id, active, match_digest, re_modifiers, flagOUT, apply)
values
(1, 1, '^SELECT.*FOR UPDATE', 'CASELESS', 1000, 0),
(2, 1, '^(INSERT|UPDATE|DELETE|REPLACE)', 'CASELESS', 1000, 0);
# 定义读操作select,flageout就是定义,级别低于规则1,apply传递定义
replace into mysql_query_rules (rule_id, active, match_digest, re_modifiers, flagOUT, apply)
values
(3, 1, '^SELECT', 'CASELESS', 2000, 0);
# 定义insert规则,根据age字段来分片,因为id是自增的,并且跨实例的id分奇偶数表数据会很难看,所以用age年龄用正则去对年龄进行取模判断奇偶数,flagint接受传递的定义
replace into mysql_query_rules (
rule_id, active,
match_pattern, re_modifiers,
flagIN, destination_hostgroup, apply
)
values
# age是偶数的下发到hostgroup10
(4, 1, 'age\s*,\s*[^)]+\)\s*VALUES\s*\([^)]+?,\s*[^)]+?,\s*(\d*[02468])\D', 'CASELESS', 1000, 10, 1),
# age是奇数的下发到hostgroup30
(5, 1, 'age\s*,\s*[^)]+\)\s*VALUES\s*\([^)]+?,\s*[^)]+?,\s*(\d*[13579])\D', 'CASELESS', 1000, 30, 1);
# 条件操作规则,根据 name 和 email 分片
replace into mysql_query_rules (
rule_id, active,
match_pattern, re_modifiers,
flagIN, destination_hostgroup, apply
)
values
# 条件update和delete操作
(6, 1, 'WHERE\s+name\s*=\s*[''"]?([^''")]+)[''"]?\s+AND\s+email\s*=\s*[''"]?([^''")]+)', 'CASELESS', 1000, 10, 1),
# select操作解释定义,条件根据name和email去查
(7, 1, 'WHERE\s+name\s*=\s*[''"]?([^''")]+)[''"]?\s+AND\s+email\s*=\s*[''"]?([^''")]+)', 'CASELESS', 2000, 20, 1);
# 默认路由规则
replace into mysql_query_rules (
rule_id, active,
flagIN, destination_hostgroup, apply
)
values
# 写操作默认路由 (hostgroup10)
(8, 1, 1000, 10, 1),
# 读操作默认路由 (hostgroup20)
(9, 1, 2000, 20, 1);
# proxy自动检测主从配置,用read_only=on/off去判断集群主从关系
replace into mysql_replication_hostgroups VALUES
(10,20,'read_only','ProxySQL自动主从检测'),
(30,40,'read_only','ProxySQL自动主从检测');
# 保存配置,disk表示持久化到磁盘,runtime表示立即生效
save mysql servers to disk;
save mysql variables to disk;
save mysql query rules to disk;
save mysql users to disk;
load mysql servers to runtime;
load mysql variables to runtime;
load mysql query rules to runtime;
load mysql users to runtime;
138 proxysql2 就是复制 proxysql1的配置:
#创建proxysql集群配置同步账号,2.0以上版本支持的功能
set admin-cluster_username = 'cuser';
set admin-cluster_password = '123';
update global_variables set variable_value = 'admin:admin;cuser:123'
where variable_name = 'admin-admin_credentials';
load admin variables to runtime;
save admin variables to disk;
# 创建集群节点
insert into proxysql_servers (hostname, port, weight, comment)
values ('192.168.93.137', 6032, 1, 'node1'),
('192.168.93.138', 6032, 1, 'node2');
# 调整的参数刷盘和及时生效
load proxysql servers to runtime;
save proxysql servers to disk;
load mysql users to runtime;
save mysql users to disk;
配置完成后,可在137上查看四台db节点的信息,查看配置的读写分离状态
keepalived部署
137、138安装keepalived,防止proxysql单点故障
137
yum install -y keepalived
cat keepalived.conf
! Configuration File for keepalived
global_defs {
router_id proxysql1 #主机名
script_user root #脚本执行用户
enable_script_security
}
vrrp_script chk_proxysql {
interval 2 #执行脚本间隔
weight -20 #权重优先级
}
vrrp_instance VI_1 {
state MASTER #服务器名主从设置
interface ens33 #网卡用ip addr查看
virtual_router_id 10 #路由id
priority 100 #优先级
advert_int 1 #检查间隔
authentication {
auth_type PASS #认证
auth_pass 1111
}
virtual_ipaddress {
192.168.93.101/24 dev ens33 #vip
}
}
#启动keepalived,开启开机自启,查看状态
systemctl start keepalived
systemctl enable keepalived
systemctl status keepalived
138
yum install -y keepalived
cat keepalived.conf
! Configuration File for keepalived
global_defs {
router_id proxysql2
script_user root
enable_script_security
}
vrrp_script chk_proxysql {
interval 2
weight -20
}
vrrp_instance VI_1 {
state BACKUP
interface ens33
virtual_router_id 10
priority 90
advert_int 1
authentication {
auth_type PASS
auth_pass 1111
}
virtual_ipaddress {
192.168.93.101/24 dev ens33
}
}
systemctl start keepalived
systemctl enable keepalived
systemctl status keepalived
状态信息
安装完keepalived可使用vip去查后端数据库是否连接,sql是否能正常下发。
mysql -h 192.168.93.101 -P 6033 -uappuser -p123 -e "select @@hostname"
可以看到select根据读写分离下发至db1-slave上,查到主机名。
测试
测试功能:实时业务流量下发 -》验证两套mgr集群的数据是否根据age分片下发-》proxysql故障 -》keepalived是否自动切换 -》mgr主节点故障 -》从节点是否接管流量 -》 流量实时下发
在测试之前应查看mgr同步状态,查看proxysql对后端数据库的状态是否正常
proxysql 137上如下命令可查看信息和状态
mysql -u admin -padmin -h 127.0.0.1 -P 6032
# 查看后端主从是否状态正常
select * from runtime_mysql_servers;
# 查看monitor监控后端服务
select * from monitor.mysql_server_ping_log order by time_start_us desc limit 10;
# 查看服务配置
select * from mysql_servers;
# 查看分片规则配置
select * from mysql_query_rules;
测试前要创建test1库
create database test1;
使用之前proxysql安装时创建的用户appuser,python脚本模拟流量下发
#!/usr/bin/env python3
import pymysql
from pymysql import Error
import random
import time
from faker import Faker
#pymysql连接vip和账号
db_config = {
'host': '192.168.93.101',
'user': 'appuser',
'password': '123',
'database': 'test1',
'port': 3306,
'charset': 'utf8mb4',
'cursorclass': pymysql.cursors.DictCursor
}
#第三方库,随机生成中文数据
fake = Faker('zh_CN')
#创建表结构
def create_table():
try:
with pymysql.connect(**db_config) as conn:
with conn.cursor() as cursor:
create_table_sql = """
create table if not exists user_info(
id int auto_increment primary key,
name varchar(10) not null,
address varchar(30) not null,
age int,
email varchar(20),
create_time timestamp default current_timestamp)
engine=innodb default charset=utf8mb4;
"""
cursor.execute(create_table_sql)
conn.commit()
#随机生成数据
def generate_random_data():
return {
'name': fake.name(),
'address': fake.address().replace('\n', ' '),
'age': random.randint(18, 80),
'email': fake.email()
}
#连接数据库,insert随机的数据
def insert_data():
try:
with pymysql.connect(**db_config) as conn:
with conn.cursor() as cursor:
insert_sql = """
insert into user_info (name, address, age, email)
values (%(name)s, %(address)s, %(age)s, %(email)s)
"""
data = generate_random_data()
cursor.execute(insert_sql, data)
conn.commit()
except Error as e:
print(f"数据写入失败: {e}")
#while循环每次写入三条数据
def main():
create_table()
try:
print("开始写入数据")
while True:
for _ in range(3):
insert_data()
print(f"写入3行数据,时间: {time.strftime('%Y-%m-%d %H:%M:%S')}")
time.sleep(1)
except KeyboardInterrupt:
print("\n已停止写入")
if __name__ == "__main__":
main()
执行如上脚本,制造数据,执行如下
查看proxysql配置的分片规则命中数,rule_id就是分片规则的id,可以看到insert走的配置的规则,且规则4、5就是把数据分片到奇偶数到hostgroup10(133),hostgroup30(135)
查看两个133、135主的数据分布情况
可以清晰的看到age字段奇数偶数分离在两个集群,且id字段为自增连续的
此时去停止proxysql1的服务,停机制作故障
shutdown -h now
去138,proxysql2去查看vip是否飘逸,vip101飘逸至138
去查看脚本状态,是否有报错,可以看到无报错,vip飘逸在故障时间点会导致sql下发有延迟,但还是只要不到超时时间则会等待飘逸完成后继续下发,超时容忍可以去配置。
此时继续停掉133,mgr-1集群主节点,制造故障
继续查看脚本执行情况,无中断。
mgr-1集群的从节点134,去134查看业务流量是否持续写入,可以看到表数据一直在增加,理论上无明显影响
查看135,偶数下发的节点,数据持续增加,理论上无影响
分别查看偶数分片情况,此时134为mgr-1集群的主
135查看奇数分片数据情况
现在开机对137proxysql主节点
此时可看到因为proxysql配置了抢占模式,也就是137会把101vip抢占回来,此时影响了流量下发,可以配置不抢占模式,理论上影响面更小。
对133开机,查看mgr同步状态已中断
134状态为主,单节点
在133执行开启组复制模式,此时133加入集群,并且会把故障时间段的数据从134同步过来
验证数据是否同步,133数据如下
134数据如下,同步情况正常,故障后加入集群也能把数据同步
总结:该高可用分布式架构,可保证节点之间的高可用,避免单点故障,保证流量下发,细节部分可以调整,大数据量情况下的sql优化、索引优化、配置参数优化等。
缺点:分片规则复杂,而且分片规则只对insert做了详细的正则,对其他修改和删除等规则不完善,对于分片来说,使用成熟的分片分库分表的中间件会更好(mycat/dble)。无自动备份、监控管理等功能。可用xtrabackup和Prometheus+grafana搭建较为成熟的数据库体系。
问题
部署完成中出现的一些问题,及排查思路
执行测试脚本时,遇到写入失败,且报错为无法写入是因为开启了super-read-only,所以判断为写入流量至读节点。
从137上去单独执行,发现确实是写入到从节点了
此时可判断为是配置问题,去排查配置信息,发现runtime表,有判断两个从节点也为可写入模式
如上图,runtime_mysql_servers表中,运行的主从,因为proxy配置自动监控判断主从节点,导致134和136也被判断成主节点,可写入,所以排查了看到是因为组复制规则的判断条件是innodb_read_only 是否开启,innodb_read_only=on则判断为读节点,off判断为可写节点。
而134从节点上的为off
所以结合上述应该修改条件为read_only来判断
删除原因规则,重新插入新规则,持久化和直接生效。
delete from mysql_replication_hostgroups;
select * from mysql_replication_hostgroups;
insert into mysql_replication_hostgroups
(writer_hostgroup, reader_hostgroup, check_type, comment)
values
(10, 20, 'read_only', 'ProxySQL自动主从检测'),
(30, 40, 'read_only', 'ProxySQL自动主从检测');
save mysql servers to disk;
save mysql variables to disk;
save mysql query rules to disk;
save mysql users to disk;
load mysql servers to runtime;
load mysql variables to runtime;
load mysql query rules to runtime;
load mysql users to runtime;
问题解决
部署过程中,有很多其他的问题,不全部列出,有问题可先思考问题的流程逻辑,从日志及报错信息入手,查看配置及规则,调整配置来测试问题。
更多推荐
所有评论(0)