您的位置:首页 > 大数据 > 人工智能

11g RAC TAF 实现 Failover Connect 理解Time Connection Failover/Runtime Connection Failover

2015-10-15 13:49 603 查看
相关脚本

 
select instance_name from v$instance;
select distinct sid from v$mystat;
 
select count(1) from aaa a, aaa b;
 
select a.SID, a.FAILOVER_TYPE, a.FAILOVER_METHOD, a.FAILED_OVER
from v$session a
where a.sid in (select distinct sid from v$mystat);
 
set serveroutput on
DECLARE
v_count NUMBER;
BEGIN
for i in 1..10000000 loop
select count(1) into v_count from test;
dbms_output.put_line('i:='||i);
end loop;
END;
/

1fff8

客户端监听

RACDB_TAF =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = racscan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = taf_service)
    )
  )

srvctl modify service -d racdb -s taf_service -q TRUE -P BASIC -m BASIC -e SELECT -z 180 -w 5

重启service
srvctl stop service -d racdb -s taf_service
srvctl start service -d racdb -s taf_service

service状态

SQL>select a.name,
a.NETWORK_NAME,
a.FAILOVER_METHOD,
a.FAILOVER_TYPE,
a.FAILOVER_RETRIES,
a.FAILOVER_DELAY
from dba_services a
where a.NAME = 'taf_service';
2    3    4    5    6    7    8
NAME	     NETWORK_NAME FAILOVER_M FAILOVER_T FAILOVER_RETRIES FAILOVER_DELAY
------------ ------------ ---------- ---------- ---------------- --------------
taf_service  taf_service  BASIC      SELECT		     180	      5


运行sql

SQL>select count(1) from aaa a,aaa b;


停止实例,必须是abort,正常的shutdown,会停止sql
srvctl stop instance -d racdb -i racdb1 -o abort

或者是srvctl stop instance -d racdb -i racdb2 -o abort

结果

SQL>select instance_name from v$instance;
 
INSTANCE_NAME
----------------
racdb2
 
SQL>select count(1) from aaa a,aaa b;
 
 
select count(1) from aaa a,aaa b
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
 
 
 
SQL>
 
SQL>
SQL>select instance_name from v$instance;
 
INSTANCE_NAME
----------------
racdb1


可以看到中间并没有发生错误,手工停止sql后,实例已经切换过去了

--禁止TAF策略
srvctl modify service -d racdb -s taf_service -q false -P none -m none -e none -z 0 -w 0

测试其他场景

 长时间单SQL/故障前session id,故障后session id长时间单SQL/故障前instance id,故障后i nstance id故障表象短时间,多SQL/故障前session id,故障后session id短时间,多SQL/故障前instance id,故障后i nstance id故障表象是否自动转移实例
不使用TAF33/session已断开,无法连接racdb2/session已断开,无法连接
SQL>select count(1) from aaa a, aaa b;
select count(1) from aaa a, aaa b
*
ERROR at line 1:
ORA-00028: your session has been killed
ORA-00028: your session has been killed


153/session已断开,无法连接racdb2/session已断开,无法连接
SQL>set serveroutput on
DECLARE
v_count NUMBER;
BEGIN
for i in 1..10000000 loop
select count(1) into v_count from test;
dbms_output.put_line('i:='||i);
end loop;
END;
/SQL>2    3    4    5    6    7    8  9  
ERROR:
ORA-03114: not connected to ORACLE
 
 
DECLARE
*
ERROR at line 1:
ORA-00028: your session has been killed
ORA-00028: your session has been killed
ORA-06512: at line 5

不能
TAF:select37/143racdb1/racdb2无影响,正常运行,session切换28/153racdb1/racdb2
/SQL>2    3    4    5    6    7    8  9  
 
DECLARE
*
ERROR at line 1:
ORA-25408: can not safely replay call

可以,但会出现 ORA-25408错误
TAF:session153/34racdb1/racdb2
SQL>SQL>
select count(1) from aaa a, aaa b
*
ERROR at line 1:
ORA-25408: can not safely replay call


32/28racdb1/racdb2
SQL>set serveroutput on
DECLARE
v_count NUMBER;
BEGIN
for i in 1..10000000 loop
select count(1) into v_count from test;
dbms_output.put_line('i:='||i);
end loop;
END;
/SQL>2    3    4    5    6    7    8  9  
DECLARE
*
ERROR at line 1:
ORA-25408: can not safely replay call


 
结论
没有TAF,发生故障,会直接中断连接,无法继续操作。
使用TAF,可以保证session继续,不会中断,但session本身被切换了
TAF的Failover type=select 时,执行的sql不会中断,可以做到断点续传

也就是说TAF可以实现运行时的Failover,Runtime connection failover

修改TAF策略语法

srvctl modify service -d RAC11g -s Email -q TRUE -P BASIC -e SELECT -z 180 -w 5 -j LONG 具体各个参数的含义可参见如下注释:

Usage: srvctl modify service -d <db_unique_name> -s <service_name> [-c {UNIFORM |
SINGLETON}] [-P {BASIC|PRECONNECT|NONE}] [-l
[PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] [-y
{AUTOMATIC | MANUAL}][-q {true|false}] [-x {true|false}] [-j {SHORT|LONG}] [-B
{NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}] [-m
{NONE|BASIC}] [-z <integer>] [-w <integer>]
-d <db_unique_name>      Unique name for the database
-s <service>             Service name
-c {UNIFORM | SINGLETON} Service runs on every active server in the server
pool hosting this service (UNIFORM) or just one server (SINGLETON)
-P {NONE | BASIC | PRECONNECT}        TAF policy specification
-l <role>                Role of the service (primary, physical_standby,
logical_standby, snapshot_standby)
-y <policy>              Management policy for the service (AUTOMATIC or MANUAL)
-e <Failover type>       Failover type (NONE, SESSION, or SELECT)
-m <Failover method>     Failover method (NONE or BASIC)
-w <integer>             Failover delay
-z <integer>             Failover retries
-j <clb_goal>  Connection Load Balancing Goal (SHORT or LONG). Default is LONG.
-B <Runtime Load Balancing Goal>     Runtime Load Balancing Goal (SERVICE_TIME,
THROUGHPUT, or NONE)
-x <Distributed Transaction Processing>  Distributed Transaction Processing (TRUE or FALSE)
-q <AQ HA notifications> AQ HA notifications
(TRUE or FALSE)
-h                       Print usage


相关脚本

 
select instance_name from v$instance;
select distinct sid from v$mystat;
 
select count(1) from aaa a, aaa b;
 
select a.SID, a.FAILOVER_TYPE, a.FAILOVER_METHOD, a.FAILED_OVER
from v$session a
where a.sid in (select distinct sid from v$mystat);
 
set serveroutput on
DECLARE
v_count NUMBER;
BEGIN
for i in 1..10000000 loop
select count(1) into v_count from test;
dbms_output.put_line('i:='||i);
end loop;
END;
/


客户端监听

RACDB_TAF =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = racscan)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = taf_service)
    )
  )

srvctl modify service -d racdb -s taf_service -q TRUE -P BASIC -m BASIC -e SELECT -z 180 -w 5

重启service
srvctl stop service -d racdb -s taf_service
srvctl start service -d racdb -s taf_service

service状态

SQL>select a.name,
a.NETWORK_NAME,
a.FAILOVER_METHOD,
a.FAILOVER_TYPE,
a.FAILOVER_RETRIES,
a.FAILOVER_DELAY
from dba_services a
where a.NAME = 'taf_service';
2    3    4    5    6    7    8
NAME	     NETWORK_NAME FAILOVER_M FAILOVER_T FAILOVER_RETRIES FAILOVER_DELAY
------------ ------------ ---------- ---------- ---------------- --------------
taf_service  taf_service  BASIC      SELECT		     180	      5


运行sql

SQL>select count(1) from aaa a,aaa b;


停止实例,必须是abort,正常的shutdown,会停止sql
srvctl stop instance -d racdb -i racdb1 -o abort

或者是srvctl stop instance -d racdb -i racdb2 -o abort

结果

SQL>select instance_name from v$instance;
 
INSTANCE_NAME
----------------
racdb2
 
SQL>select count(1) from aaa a,aaa b;
 
 
select count(1) from aaa a,aaa b
*
ERROR at line 1:
ORA-01013: user requested cancel of current operation
 
 
 
SQL>
 
SQL>
SQL>select instance_name from v$instance;
 
INSTANCE_NAME
----------------
racdb1


可以看到中间并没有发生错误,手工停止sql后,实例已经切换过去了

--禁止TAF策略
srvctl modify service -d racdb -s taf_service -q false -P none -m none -e none -z 0 -w 0

测试其他场景

 长时间单SQL/故障前session id,故障后session id长时间单SQL/故障前instance id,故障后i nstance id故障表象短时间,多SQL/故障前session id,故障后session id短时间,多SQL/故障前instance id,故障后i nstance id故障表象是否自动转移实例
不使用TAF33/session已断开,无法连接racdb2/session已断开,无法连接
SQL>select count(1) from aaa a, aaa b;
select count(1) from aaa a, aaa b
*
ERROR at line 1:
ORA-00028: your session has been killed
ORA-00028: your session has been killed


153/session已断开,无法连接racdb2/session已断开,无法连接
SQL>set serveroutput on
DECLARE
v_count NUMBER;
BEGIN
for i in 1..10000000 loop
select count(1) into v_count from test;
dbms_output.put_line('i:='||i);
end loop;
END;
/SQL>2    3    4    5    6    7    8  9  
ERROR:
ORA-03114: not connected to ORACLE
 
 
DECLARE
*
ERROR at line 1:
ORA-00028: your session has been killed
ORA-00028: your session has been killed
ORA-06512: at line 5

不能
TAF:select37/143racdb1/racdb2无影响,正常运行,session切换28/153racdb1/racdb2
/SQL>2    3    4    5    6    7    8  9  
 
DECLARE
*
ERROR at line 1:
ORA-25408: can not safely replay call

可以,但会出现 ORA-25408错误
TAF:session153/34racdb1/racdb2
SQL>SQL>
select count(1) from aaa a, aaa b
*
ERROR at line 1:
ORA-25408: can not safely replay call


32/28racdb1/racdb2
SQL>set serveroutput on
DECLARE
v_count NUMBER;
BEGIN
for i in 1..10000000 loop
select count(1) into v_count from test;
dbms_output.put_line('i:='||i);
end loop;
END;
/SQL>2    3    4    5    6    7    8  9  
DECLARE
*
ERROR at line 1:
ORA-25408: can not safely replay call


 
结论
没有TAF,发生故障,会直接中断连接,无法继续操作。
使用TAF,可以保证session继续,不会中断,但session本身被切换了
TAF的Failover type=select 时,执行的sql不会中断,可以做到断点续传

也就是说TAF可以实现运行时的Failover,Runtime connection failover

修改TAF策略语法

srvctl modify service -d RAC11g -s Email -q TRUE -P BASIC -e SELECT -z 180 -w 5 -j LONG 具体各个参数的含义可参见如下注释:

Usage: srvctl modify service -d <db_unique_name> -s <service_name> [-c {UNIFORM |
SINGLETON}] [-P {BASIC|PRECONNECT|NONE}] [-l
[PRIMARY][,PHYSICAL_STANDBY][,LOGICAL_STANDBY][,SNAPSHOT_STANDBY]] [-y
{AUTOMATIC | MANUAL}][-q {true|false}] [-x {true|false}] [-j {SHORT|LONG}] [-B
{NONE|SERVICE_TIME|THROUGHPUT}] [-e {NONE|SESSION|SELECT}] [-m
{NONE|BASIC}] [-z <integer>] [-w <integer>]
-d <db_unique_name>      Unique name for the database
-s <service>             Service name
-c {UNIFORM | SINGLETON} Service runs on every active server in the server
pool hosting this service (UNIFORM) or just one server (SINGLETON)
-P {NONE | BASIC | PRECONNECT}        TAF policy specification
-l <role>                Role of the service (primary, physical_standby,
logical_standby, snapshot_standby)
-y <policy>              Management policy for the service (AUTOMATIC or MANUAL)
-e <Failover type>       Failover type (NONE, SESSION, or SELECT)
-m <Failover method>     Failover method (NONE or BASIC)
-w <integer>             Failover delay
-z <integer>             Failover retries
-j <clb_goal>  Connection Load Balancing Goal (SHORT or LONG). Default is LONG.
-B <Runtime Load Balancing Goal>     Runtime Load Balancing Goal (SERVICE_TIME,
THROUGHPUT, or NONE)
-x <Distributed Transaction Processing>  Distributed Transaction Processing (TRUE or FALSE)
-q <AQ HA notifications> AQ HA notifications
(TRUE or FALSE)
-h                       Print usage
内容来自用户分享和网络整理,不保证内容的准确性,如有侵权内容,可联系管理员处理 点击这里给我发消息
标签: