整理这篇文档主要记录复制搭建过程中遇到的问题,被其中一个搞了好久才突然一下子明白过来。。。。
1.备份恢复的时候报错:SQL0970N The system attempted to write to a read-only file. SQLSTATE=55009
2.SQL1768N Unable to start HADR. Reason code = “5”.
3.没有复制。。。
辅助工具: db2diag
db2diag -level Error
在一台机器上安装了两个实例db2inst1,db2inst2,然后将HADR_LOCAL_SVC 和HADR_REMOTE_SVC设置的与实例的运行端口一致,所以,一直在报“SQL1768N Unable to start HADR. Reason code = “5””的错误,实际上在这里有个误解,我们配置MySQL的复制时,要使用端口号3306,但是这里的复制监听端口跟MySQL的端口不一样的,因为DB2的进程实际上就是Linux的进程,所以在这里需要另行开端口,也就是我们随意指定一个监听端口,当启动HADR后,这个端口就会打开,负责复制的通信。
如图:
2017-09-01-16.28.54.340565+480 I908211E796 LEVEL: Error
PID : 36456 TID : 139973198079744 PROC : db2sysc
INSTANCE: db2inst2 NODE : 000 DB : KILO
APPHDL : 0-17 APPID: *LOCAL.db2inst2.170901082853
AUTHID : DB2INST2 HOSTNAME: db22.com
EDUID : 22 EDUNAME: db2agent (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrValidateLocalServiceDbCfg, probe:21210
MESSAGE : ZRC=0x87800142=-2021654206=HDR_ZRC_CONFIG_LOCAL_SVC"HADR_LOCAL_SVC configuration parameter is an invalid service name"
DATA #1 : <preformatted>
HADR_LOCAL_SVC cannot be same port as SVCENAME or SVCENAME +1
HADR_LOCAL_SVC 54001, Port # 54001
SVCENAME 54001, Port # 54001
SVCENAME + 1 54002
诊断日志中说HADR_LOCAL_SVC cannot be same port as SVCENAME or SVCENAME +1 ,所以突然就想到应该是要另开端口的。
2.db2inst1上的备份,db2inst2上恢复时报SQL0970N :read-only的错误
[db2inst2@a18 ~]$ db2 restore db kilo taken at 20170904100841
SQL0970N The system attempted to write to a read-only file. SQLSTATE=55009
2017-09-04-10.12.50.228929+480 I169707E3155 LEVEL: Error (OS)
PID : 4096 TID : 140640650258176 PROC : db2sysc
INSTANCE: db2inst2 NODE : 000 DB :
APPHDL : 0-13 APPID: *LOCAL.db2inst2.170904020959
AUTHID : DB2INST2 HOSTNAME: a18.mysql.iso0.lfk.com
EDUID : 21 EDUNAME: db2agent (instance)
FUNCTION: DB2 Common, OSSe, ossErrorIOAnalysis, probe:100
CALLED : OS, -, open OSERR: EACCES (13)
DATA #1 : String, 157 bytes
A total of 2 analysis will be performed :- User info- Path access permission
[db2inst2@a18 ~]$ db2 restore db kilo taken at 20170904100841 on /home/db2inst2
DB20000I The RESTORE DATABASE command completed successfully.
3.在primary上建表并插入数据,standby上没有收到。
db2diag -level Error
... ... ...
2017-09-01-17.08.55.963065+480 E3711665E645 LEVEL: Error
PID : 39151 TID : 140330070435584 PROC : db2sysc
INSTANCE: db2inst1 NODE : 000 DB : KILO
HOSTNAME: db22.com
EDUID : 41 EDUNAME: db2hadrp.0.1 (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20200
MESSAGE : Did not receive anything through HADR connection for the duration of HADR_TIMEOUT. Closing connection.
DATA #1 : String, 30 bytes
hdrCurrentTime/hdrLastRecvTime
DATA #2 : unsigned integer, 4 bytes
1504256935
DATA #3 : unsigned integer, 4 bytes
15042564172017-09-01-18.18.58.511765+480 E3720982E514 LEVEL: Error (OS)
PID : 39151 TID : 140330070435584 PROC : db2sysc
INSTANCE: db2inst1 NODE : 000 DB : KILO
HOSTNAME: db22.com
EDUID : 41 EDUNAME: db2hadrp.0.1 (KILO)
FUNCTION: DB2 UDB, oper system services, sqloPdbQuerySocketErrorStatus, probe:15
MESSAGE : ZRC=0x810F001C=-2129723364=SQLO_NO_CONNECTION "No connection"
CALLED : OS, -, getsockopt OSERR: ECONNRESET (104)2017-09-01-18.19.19.039294+480 I3721497E557 LEVEL: Error
PID : 39151 TID : 140330070435584 PROC : db2sysc
INSTANCE: db2inst1 NODE : 000 DB : KILO
HOSTNAME: db22.com
EDUID : 41 EDUNAME: db2hadrp.0.1 (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrSendRedirectMsgToOneAddress, probe:31070
MESSAGE : ZRC=0xFFFFFFFF=-1
DATA #1 : <preformatted>
The HADR primary was not able to form a TCP connection with the standby: [0000:0000:0000:0000:0000:0000:0000:0001]:54001.... ... ...
2017-09-04-11.13.22.594229+480 I445495E514 LEVEL: Error
PID : 11131 TID : 140677660796672 PROC : db2sysc
INSTANCE: db2inst2 NODE : 000 DB : KILO
HOSTNAME: a18.mysql.iso0.lfk.com
EDUID : 90 EDUNAME: db2hadrs.0.0 (KILO)
FUNCTION: DB2 UDB, High Availability Disaster Recovery, hdrEduAcceptEvent, probe:20280
MESSAGE : ZRC=0x810F0019=-2129723367=SQLO_CONN_REFUSED "Connection refused"
DATA #1 : <preformatted>
Failed to connect to primary.
The HADR primary was not able to form a TCP connection with the standby: [0000:0000:0000:0000:0000:0000:0000:0001]:54001.
4.备库设置可读
[db2inst2@a18 ~]$ db2 connect to kilo
SQL1776N The command cannot be issued on an HADR database. Reason code = "1".
启用“在备用数据库上读取”
将 DB2_HADR_ROS 注册表变量设置为 ON。
[db2inst2@a18 ~]$ db2set DB2_HADR_ROS=ON
#重启一下db2实例,重启一下复制
[db2inst2@a18 ~]$ db2 connect to kiloDatabase Connection InformationDatabase server = DB2/LINUXX8664 11.1.2.2SQL authorization ID = DB2INST2Local database alias = KILO
5.设置了可读,如果用户不一致,还是会因为没有权限报SQL0551N的错误呢
SQL0551N
authorization-ID does not have the required authorization or privilege to perform operation operation on object object-name.
SELECT AUTHORITY, D_USER, D_GROUP, D_PUBLIC, ROLE_USER, ROLE_GROUP, ROLE_PUBLIC, D_ROLE FROM TABLE (SYSPROC.AUTH_LIST_AUTHORITIES_FOR_AUTHID ('ALICE', 'U') ) AS T ORDER BY AUTHORITY
在备份之前进行授权。
授权后备份恢复后又遇到一个问题:
执行select时报错:SQL1773N,
查找参考官文:
SQL1773N
该语句或命令失败,因为其需要使用允读的 HADR 备用数据库上不支持的功能。原因码:原因码。
说明与原因码对应的解释是:1使用了除 UR 之外的隔离级别,并且已关闭 DB2_STANDBY_ISO 注册表变量。
db2set DB2_STANDBY_ISO=UR
终于正确读出来了~
https://www.ibm.com/support/knowledgecenter/zh/SSEPGG_10.1.0/com.ibm.db2.luw.admin.ha.doc/doc/t0011725.html
操作步骤
本次搭建采用一机两个实例搭建:
| HADR角色 | host | svcname | instname | 数据库版本 | 数据库名称 |
|---|---|---|---|---|---|
| 主库 | localhost | ba5oltp1 | 10.5 | ST | |
| 备库 | localhost | db2inst1 | 10.5 | – |
1).主机上设置循环日志
2).主机上连接到授权
db2 connect to st
db2 "grant DATAACCESS ON DATABASE TO DB2INST1"
3).主机上备份数据库
db2 backup database to /tmp
4).root更改备份文件所有者
chown db2inst1 /tmp/ST*
5).备库上恢复
db2 restore db st from /tmp taken at 20170920112008 on /home/db2inst1
6).编辑/etc/services文件,添加与SVCENAME不同的服务名和端口号
#ba5oltp1
$db2 get dbm cfg | grep SVCTCP/TP Service name (SVCENAME)=DB2cbaoltp1
#主库所在的实例服务
DB2cba5oltp1 50110/tcp
#添加主库HADR的端口号
DB2_ba5oltp1_1 60071/tcp#备库所在的实例服务
DB2_db2inst1 60011/tcp
#备库HADR的端口号
DB2_db2inst1_1 60013/tcp
7).根据配置执行更新
#主库上:
db2 update db cfg for ST using HADR_LOCAL_HOST localhost HADR_LOCAL_SVC DB2_ba5oltp1_1 HADR_REMOTE_HOST localhost HADR_REMOTE_SVC DB2_db2inst1_1 HADR_REMOTE_INST db2inst1#备库上:
db2 update db cfg for ST using HADR_LOCAL_HOST localhost HADR_LOCAL_SVC DB2_db2inst1_1 HADR_REMOTE_HOST localhost HADR_REMOTE_SVC DB2_ba5oltp1_1 HADR_REMOTE_INST ba5oltp1
启动HADR
备库上:
















