查看: 234|回复: 0

[Mysql数据库] Innodb:RR隔离级别下insert...select 对select表加锁模型和死锁案列

发表于 7 天前
太阳http代理AD

最近有网友遇到了在RR隔离级别下insert A select B where B.COL=** 由于select表也就是B表引发的死锁的问题。分析死锁日志后,笔者进行模拟重现了这位网友遇到了2种场景并且在本文中进行详细的描述。


感谢叶金荣老师对本文的审核,笔者也曾是一名知数堂的学生

一、基本概念

在开始正文之前我打算介绍一下一些基本概念,特别是锁模型和兼容矩阵会对本文的阅读有相当大的帮助。

1、 innodb lock模型
  1. #define LOCK_ORDINARY 0 /*!< this flag denotes an ordinary
  2. next-key lock in contrast to LOCK_GAP
  3. or LOCK_REC_NOT_GAP */
复制代码

默认是LOCK_ORDINARY即普通的next_key_lock,锁住行及以前的间隙。

  1. #define LOCK_GAP 512 /*!< when this bit is set, it means that the
  2. lock holds only on the gap before the record;
  3. for instance, an x-lock on the gap does not
  4. give permission to modify the record on which
  5. the bit is set; locks of this type are created
  6. when records are removed from the index chain
复制代码

间隙锁,锁住行以前的间隙,不锁住本行。

  1. #define LOCK_REC_NOT_GAP 1024 /*!< this bit means that the lock is only on
  2. the index record and does NOT block inserts
  3. to the gap before the index record; this is
  4. used in the case when we retrieve a record
  5. with a unique key, and is also used in
  6. locking plain SELECTs (not part of UPDATE
  7. or DELETE) when the user has set the READ
  8. COMMITTED isolation level */
复制代码

行锁,锁住行而不锁住任何间隙。

  1. #define LOCK_INSERT_INTENTION 2048 /*!< this bit is set when we place a waiting
  2. gap type record lock request in order to let
  3. an insert of an index record to wait until
  4. there are no conflicting locks by other
  5. transactions on the gap; note that this flag
  6. remains set when the waiting lock is granted,
  7. or if the lock is inherited record */
复制代码

插入意向锁,如果插入的记录在某个已经锁定的间隙内为这个锁。

2、 innodb lock兼容矩阵
  1. /* LOCK COMPATIBILITY MATRIX
  2. * IS IX S X AI
  3. * IS + + + - +
  4. * IX + + - - +
  5. * S + - + - -
  6. * X - - - - -
  7. * AI + + - - -
复制代码
3、infimum和supremum

一个page中包含这两个伪记录。页中所有的行未删除(或删除未purge)的行逻辑上都连接到这两个虚列之间,表现为一个逻辑链表数据结构,其中supremum伪记录的锁始终为next_key_lock。

4、heap no

heap no存储在fixed_extrasize 中。heap no 为物理存储填充的序号,页的空闲空间挂载在page free链表中(头插法)可以重用,但是重用此heap no不变,如果一直是insert 则heap no 不断增加,并不是按照ROWID(主键)排序的逻辑链表顺序,而是物理填充顺序。

5、n bits

和这个page相关的锁位图的大小,每一行记录都有1 bit的位图信息与其对应,用来表示是否加锁,并且始终预留64bit。例如我的表有9条数据,同时包含infimum和supremum虚拟记录即 64+9+2 bits,即75bits但是必须被8整除向上取整为一个字节,结果也就是就是80 bits。注意不管是否加锁每行都会对应一bit的位图。

6、lock struct

这是LOCK的内存结构体源码中用lock_t表示其可以包含

  1. lock_table_t tab_lock;/*!< table lock */
  2. lock_rec_t rec_lock;/*!< record lock */
复制代码

一般来说innodb上锁都会对表级加上IX,这占用一个结构体。然后分别对二级索引和主键进行加锁,每一个BLOCK会占用这样一个结构体。

7、row lock

这个信息描述了当前事务加锁的行数,他是所有lock struct结构体中排除table lock以外所有加锁记录的总和,并且包含了infimum和supremum伪列。

8、逐步加锁

如果细心的朋友应该会发现在show engine 中事务信息中的row lock在对大量行进行加锁的时候会不断的增加,因为加行锁最终会调用lock_rec_lock逐行加锁,这也会增加了大数据量加锁的触发死锁的可能性。

二、Innodb层对insert...select 中select表的加锁模式

RR隔离级别下insert A select B where B.COL=**,innodb层会对B表满足条件的数据进行加锁,但是RC模式下B表记录不会加任何innodb层的锁,表现如下:

三、Innodb层对insert...select中select表的加锁测试

下面我们分别对其进行测试和打印输出:

1. 如果B.COL有二级(唯一),并且执行计划使用到了(非using index)

使用语句:

  1. drop table t1;
  2. drop table t2;
  3. create table t1(id int primary key,n1 varchar(20),n2 varchar(20),key(n1));
  4. create table t2 like t1;
  5. insert into t1 values(1,'gao1','gao'),(2,'gao1','gao'),(3,'gao1','gao'),(4,'gao2','gao'),(5,'gao2','gao'),(6,'gao2','gao'),(7,'gao3','gao'),(8,'gao4','gao');
复制代码

查看执行计划:

  1. mysql> desc insert into t2 select * from t1 force index(n1) where n1='gao2';
  2. +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
  3. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  4. +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
  5. | 1 | INSERT | t2 | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
  6. | 1 | SIMPLE | t1 | NULL | ref | n1 | n1 | 23 | const | 3 | 100.00 | NULL |
  7. +----+-------------+-------+------------+------+---------------+------+---------+-------+------+----------+-------+
复制代码

执行语句:

  1. begin;insert into t2 select * from t1 force index(n1) where n1='gao2';
复制代码
观察结果:
  1. -----TRX NO:28470 LOCK STRUCT(1)(Add by gaopeng)
  2. RECORD LOCKS space id 86 page no 4 n bits 80 index n1 of table `test`.`t1` trx id 28470 lock mode S(LOCK_S) locks gap and rec(LOCK_ORDINARY[next_key_lock])
  3. Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  4. 0: len 4; hex 67616f32; asc gao2;;
  5. 1: len 4; hex 80000004; asc ;;
  6. Record lock, heap no 6 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  7. 0: len 4; hex 67616f32; asc gao2;;
  8. 1: len 4; hex 80000005; asc ;;
  9. Record lock, heap no 7 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  10. 0: len 4; hex 67616f32; asc gao2;;
  11. 1: len 4; hex 80000006; asc ;;
复制代码
  1. -----TRX NO:28470 LOCK STRUCT(1)(Add by gaopeng)
  2. RECORD LOCKS space id 86 page no 3 n bits 80 index PRIMARY of table `test`.`t1` trx id 28470 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
  3. Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  4. 0: len 4; hex 80000004; asc ;;
  5. 1: len 6; hex 000000006f20; asc o ;;
  6. 2: len 7; hex bc000001300134; asc 0 4;;
  7. 3: len 4; hex 67616f32; asc gao2;;
  8. 4: len 3; hex 67616f; asc gao;;
  9. Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  10. 0: len 4; hex 80000005; asc ;;
  11. 1: len 6; hex 000000006f20; asc o ;;
  12. 2: len 7; hex bc000001300140; asc 0 @;;
  13. 3: len 4; hex 67616f32; asc gao2;;
  14. 4: len 3; hex 67616f; asc gao;;
  15. Record lock, heap no 7 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  16. 0: len 4; hex 80000006; asc ;;
  17. 1: len 6; hex 000000006f20; asc o ;;
  18. 2: len 7; hex bc00000130014c; asc 0 L;;
  19. 3: len 4; hex 67616f32; asc gao2;;
  20. 4: len 3; hex 67616f; asc gao;;
复制代码
  1. -----TRX NO:28470 LOCK STRUCT(1)(Add by gaopeng)
  2. RECORD LOCKS space id 86 page no 4 n bits 80 index n1 of table `test`.`t1` trx id 28470 lock mode S(LOCK_S) locks gap before rec(LOCK_GAP)
  3. Record lock, heap no 8 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  4. 0: len 4; hex 67616f33; asc gao3;;
  5. 1: len 4; hex 80000007; asc ;;
复制代码
1.png
1.png 2. 如果B.COL有二级(唯一),并且执行计划使用到了(非using index)

使用语句:

  1. drop table t1;
  2. drop table t2;
  3. create table t1(id int primary key,n1 varchar(20),n2 varchar(20),unique key(n1));
  4. create table t2 like t1;
  5. insert into t1 values(1,'gao1','gao'),(2,'gao2','gao'),(3,'gao3','gao'),(4,'gao4','gao'),(5,'gao5','gao'),(6,'gao6','gao'),(7,'gao7','gao'),(8,'gao8','gao');
复制代码

查看执行计划:

  1. mysql> desc insert into t2 select * from t1 force index(n1) where n1 in ('gao2','gao3','gao4');
  2. +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
  3. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  4. +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
  5. | 1 | INSERT | t2 | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
  6. | 1 | SIMPLE | t1 | NULL | range | n1 | n1 | 23 | NULL | 3 | 100.00 | Using index condition |
  7. +----+-------------+-------+------------+-------+---------------+------+---------+------+------+----------+-----------------------+
复制代码

执行语句:

  1. begin;insert into t2 select * from t1 force index(n1) where n1 in ('gao2','gao3','gao4');
复制代码
观察输出:
  1. -----TRX NO:30514 LOCK STRUCT(1)(Add by gaopeng)
  2. RECORD LOCKS space id 94 page no 4 n bits 80 index n1 of table `test`.`t1` trx id 30514 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
  3. Record lock, heap no 3 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  4. 0: len 4; hex 67616f32; asc gao2;;
  5. 1: len 4; hex 80000002; asc ;;
  6. Record lock, heap no 4 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  7. 0: len 4; hex 67616f33; asc gao3;;
  8. 1: len 4; hex 80000003; asc ;;
  9. Record lock, heap no 5 PHYSICAL RECORD: n_fields 2; compact format; info bits 0
  10. 0: len 4; hex 67616f34; asc gao4;;
  11. 1: len 4; hex 80000004; asc ;;
复制代码
  1. -----TRX NO:30514 LOCK STRUCT(1)(Add by gaopeng)
  2. RECORD LOCKS space id 94 page no 3 n bits 80 index PRIMARY of table `test`.`t1` trx id 30514 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
  3. Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  4. 0: len 4; hex 80000002; asc ;;
  5. 1: len 6; hex 000000007728; asc w(;;
  6. 2: len 7; hex a200000115011c; asc ;;
  7. 3: len 4; hex 67616f32; asc gao2;;
  8. 4: len 3; hex 67616f; asc gao;;
  9. Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  10. 0: len 4; hex 80000003; asc ;;
  11. 1: len 6; hex 000000007728; asc w(;;
  12. 2: len 7; hex a2000001150128; asc (;;
  13. 3: len 4; hex 67616f33; asc gao3;;
  14. 4: len 3; hex 67616f; asc gao;;
  15. Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  16. 0: len 4; hex 80000004; asc ;;
  17. 1: len 6; hex 000000007728; asc w(;;
  18. 2: len 7; hex a2000001150134; asc 4;;
  19. 3: len 4; hex 67616f34; asc gao4;;
  20. 4: len 3; hex 67616f; asc gao;;
复制代码
2.png
2.png 3.如果B.COL没有二级索引

使用语句:

  1. drop table t1;
  2. drop table t2;
  3. create table t1(id int primary key,n1 varchar(20),n2 varchar(20));
  4. create table t2 like t1;
  5. insert into t1 values(1,'gao1','gao'),(2,'gao2','gao'),(3,'gao3','gao'),(4,'gao4','gao'),(5,'gao5','gao'),(6,'gao6','gao'),(7,'gao7','gao'),(8,'gao8','gao');
复制代码

查看执行计划:

  1. mysql> desc insert into t2 select * from t1 where n1 in ('gao2','gao3','gao4');
  2. +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
  3. | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
  4. +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
  5. | 1 | INSERT | t2 | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
  6. | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 8 | 37.50 | Using where |
  7. +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------------+
复制代码

执行语句:

  1. begin;insert into t2 select * from t1 where n1 in ('gao2','gao3','gao4');
复制代码
观察输出:
  1. -----TRX NO:30535 LOCK STRUCT(1)(Add by gaopeng)
  2. RECORD LOCKS space id 94 page no 3 n bits 80 index PRIMARY of table `test`.`t1` trx id 30535 lock mode S(LOCK_S) locks gap and rec(LOCK_ORDINARY[next_key_lock])
  3. Record lock, heap no 1 PHYSICAL RECORD: n_fields 1; compact format; info bits 0
  4. 0: len 8; hex 73757072656d756d; asc supremum;;
  5. Record lock, heap no 2 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  6. 0: len 4; hex 80000001; asc ;;
  7. 1: len 6; hex 000000007728; asc w(;;
  8. 2: len 7; hex a2000001150110; asc ;;
  9. 3: len 4; hex 67616f31; asc gao1;;
  10. 4: len 3; hex 67616f; asc gao;;
  11. Record lock, heap no 3 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  12. 0: len 4; hex 80000002; asc ;;
  13. 1: len 6; hex 000000007728; asc w(;;
  14. 2: len 7; hex a200000115011c; asc ;;
  15. 3: len 4; hex 67616f32; asc gao2;;
  16. 4: len 3; hex 67616f; asc gao;;
  17. Record lock, heap no 4 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  18. 0: len 4; hex 80000003; asc ;;
  19. 1: len 6; hex 000000007728; asc w(;;
  20. 2: len 7; hex a2000001150128; asc (;;
  21. 3: len 4; hex 67616f33; asc gao3;;
  22. 4: len 3; hex 67616f; asc gao;;
  23. Record lock, heap no 5 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  24. 0: len 4; hex 80000004; asc ;;
  25. 1: len 6; hex 000000007728; asc w(;;
  26. 2: len 7; hex a2000001150134; asc 4;;
  27. 3: len 4; hex 67616f34; asc gao4;;
  28. 4: len 3; hex 67616f; asc gao;;
  29. Record lock, heap no 6 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  30. 0: len 4; hex 80000005; asc ;;
  31. 1: len 6; hex 000000007728; asc w(;;
  32. 2: len 7; hex a2000001150140; asc @;;
  33. 3: len 4; hex 67616f35; asc gao5;;
  34. 4: len 3; hex 67616f; asc gao;;
  35. Record lock, heap no 7 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  36. 0: len 4; hex 80000006; asc ;;
  37. 1: len 6; hex 000000007728; asc w(;;
  38. 2: len 7; hex a200000115014c; asc L;;
  39. 3: len 4; hex 67616f36; asc gao6;;
  40. 4: len 3; hex 67616f; asc gao;;
  41. Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  42. 0: len 4; hex 80000007; asc ;;
  43. 1: len 6; hex 000000007728; asc w(;;
  44. 2: len 7; hex a2000001150158; asc X;;
  45. 3: len 4; hex 67616f37; asc gao7;;
  46. 4: len 3; hex 67616f; asc gao;;
  47. Record lock, heap no 9 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  48. 0: len 4; hex 80000008; asc ;;
  49. 1: len 6; hex 000000007728; asc w(;;
  50. 2: len 7; hex a2000001150164; asc d;;
  51. 3: len 4; hex 67616f38; asc gao8;;
  52. 4: len 3; hex 67616f; asc gao;;
复制代码
3.png
3.png 四、insert...select由于select查询表引起的死锁

有了上面的理论,我们知道在RR隔离级别下insert...select会对select符合条件的数据加上LOCK_S锁,我曾经总结过出现死锁的条件:

由于存在对select符合条件的数据加上LOCK_S锁的情况,RR模式下insert...select出现死锁的概率无疑更加高,我通过测试模拟出这种情况,严格意义上是相同的语句在高并发情况下表现为两种死锁情况。

测试脚本:

  1. create table b(id int primary key,name1 varchar(20),name2 varchar(20));
  2. alter table b add key(name1);
  3. DELIMITER //
  4. CREATE PROCEDURE test_i()
  5. begin
  6. declare num int;
  7. set num = 1;
  8. while num <= 3000 do
  9. insert into b values(num,concat('gao',num),'gaopeng');
  10. set num=num+1;
  11. end while;
  12. end//
  13. call test_i()//
  14. create table a like b//
  15. DELIMITER ;
复制代码

语句都是一样的:

TX1 TX2
begin; -
update b set name2='test' where id=2999; -
- insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999);
update b set name2='test' where id=999; -

但是在高并发下相同的语句却表现出不同的死锁情况,现在进行分析:

情况1:

如下图红色记录为不能获得锁的记录:

情况1.jpg
情况1.jpg 情况2:

这种情况比较极端只能在高并发上出现

如下图红色记录为不能获得锁的记录:

情况2.jpg
情况2.jpg 五、源码修改和参数增加

情况2的测试需要在高并发下才会出现,因为insert...select语句是一条语句很难人为控制,也就是很让他在特定条件下停止。但是为了能够模拟出这种情况笔者对innodb增加了4个参数如下,为了方便识别我都加上了自己的名字的拼音:

  1. mysql> show variables like '%gaopeng%';
  2. +---------------------------+-------+
  3. | Variable_name | Value |
  4. +---------------------------+-------+
  5. | innodb_gaopeng_sl_heap_no | 0 |
  6. | innodb_gaopeng_sl_ind_id | 0 |
  7. | innodb_gaopeng_sl_page_no | 0 |
  8. | innodb_gaopeng_sl_time | 0 |
  9. +---------------------------+-------+
复制代码

默认情况都是0,即不启用。他们的意思如下:

并且在源码lock_rec_lock 开头增加如下代码:

  1. //add by gaopeng
  2. /*if find index_id heap no page no to sleep srv_gaopeng_sl_time secs*/
  3. if(srv_gaopeng_sl_ind_id && srv_gaopeng_sl_page_no && srv_gaopeng_sl_heap_no)
  4. {
  5. if(heap_no == (ulint)(srv_gaopeng_sl_heap_no) && (block->page.id).page_no() ==(ib_uint32_t)(srv_gaopeng_sl_page_no)
  6. && index->id ==(index_id_t)(srv_gaopeng_sl_ind_id) )
  7. {
  8. lock_mutex_exit();
  9. sleep(srv_gaopeng_sl_time);
  10. lock_mutex_enter();
  11. }
  12. }
  13. //add end
复制代码

这样一旦判定为符合条件的记录,本条记录的加锁钱便会休眠指定的秒如果我们拟定在LOCK_S:id=2997之前睡眠30秒,那么情况2能够必定发生如下图:

情况3.jpg
情况3.jpg 六、实际测试 情况1:
TX1 TX2
begin; -
update b set name2='test' where id=2999;对id:2999加LOCK_X锁 -
- insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999);对id:996,997,998,999,2995,2996,2997,2998加LOCK_S锁,但是对id:2999加LOCK_S锁时发现已经加LOCK_X锁,需等待
update b set name2='test' where id=999;对id:999加LOCK_X锁,但是发现已经加LOCK_S锁,需等待,触发死锁检测 -
TX1触发死锁,TX1在权重判定下回滚 -

死锁报错语句:

  1. mysql> update b set name2='test' where id=999;
  2. ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
复制代码

死锁日志:

  1. *** (1) TRANSACTION:
  2. TRANSACTION 48423, ACTIVE 7 sec starting index read
  3. mysql tables in use 2, locked 2
  4. LOCK WAIT 5 lock struct(s), heap size 1160, 9 row lock(s), undo log entries 8
  5. MySQL thread id 4, OS thread handle 140737223177984, query id 9110 localhost root Sending data
  6. insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)
  7. *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
  8. RECORD LOCKS space id 119 page no 18 n bits 160 index PRIMARY of table `test`.`b` trx id 48423 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
  9. Record lock, heap no 86 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  10. 0: len 4; hex 80000bb7; asc ;;
  11. 1: len 6; hex 00000000bd26; asc &;;
  12. 2: len 7; hex 21000001511e7d; asc ! Q };;
  13. 3: len 7; hex 67616f32393939; asc gao2999;;
  14. 4: len 4; hex 74657374; asc test;;
  15. *** (2) TRANSACTION:
  16. TRANSACTION 48422, ACTIVE 24 sec starting index read
  17. mysql tables in use 1, locked 1
  18. 3 lock struct(s), heap size 1160, 2 row lock(s), undo log entries 1
  19. MySQL thread id 3, OS thread handle 140737223444224, query id 9111 localhost root updating
  20. update b set name2='test' where id=999
  21. *** (2) HOLDS THE LOCK(S):
  22. RECORD LOCKS space id 119 page no 18 n bits 160 index PRIMARY of table `test`.`b` trx id 48422 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP)
  23. Record lock, heap no 86 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  24. 0: len 4; hex 80000bb7; asc ;;
  25. 1: len 6; hex 00000000bd26; asc &;;
  26. 2: len 7; hex 21000001511e7d; asc ! Q };;
  27. 3: len 7; hex 67616f32393939; asc gao2999;;
  28. 4: len 4; hex 74657374; asc test;;
  29. *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
  30. RECORD LOCKS space id 119 page no 10 n bits 456 index PRIMARY of table `test`.`b` trx id 48422 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
  31. Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  32. 0: len 4; hex 800003e7; asc ;;
  33. 1: len 6; hex 00000000b534; asc 4;;
  34. 2: len 7; hex bd000001310110; asc 1 ;;
  35. 3: len 6; hex 67616f393939; asc gao999;;
  36. 4: len 7; hex 67616f70656e67; asc gaopeng;;
  37. *** WE ROLL BACK TRANSACTION (2)
复制代码

信息提取如下:

  1. TRX1:48423
  2. LOCK HOLD:死锁信息不提供
  3. LOCK WAIT:
  4. 表:b
  5. 索引:`PRIMARY`
  6. 锁模式:LOCK_S|LOCK_REC_NOT_GAP|LOCK_WAIT
  7. 记录:主键为0Xbb7(2999)
  8. 附加信息:space id 119 page no 18 heap no 86
  9. CURRENT SQL:
  10. insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)
  11. TRX2:48422(触发死锁、权重回滚)
  12. LOCK HOLD:
  13. 表:b
  14. 索引:`PRIMARY`
  15. 锁模式:LOCK_X|LOCK_REC_NOT_GAP
  16. 记录:主键为0Xbb7(2999)
  17. 附加信息:pace id 119 page no 18 heap no 86
  18. LOCK WAIT:
  19. 表:b
  20. 索引:`PRIMARY`
  21. 锁模式:LOCK_X|LOCK_REC_NOT_GAP|LOCK_WAIT
  22. 记录:主键为0X3e7(999)
  23. 附加信息:space id 119 page no 10 heap no 11
  24. CURRENT SQL:
  25. update b set name2='test' where id=999
复制代码
情况2:

如上我们拟定在语句

  1. insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)
复制代码

对b表记录加锁时在2997加锁前停顿30秒,那么我就需要找到b表主键2997的index_id、page_no、heap_no三个信息,这里使用到我的innblock工具

  1. ./innblock b.ibd scan 16
  2. ===INDEX_ID:121
  3. level1 total block is (1)
  4. block_no: 3,level: 1|*|
  5. level0 total block is (9)
  6. block_no: 5,level: 0|*|block_no: 6,level: 0|*|block_no: 7,level: 0|*|
  7. block_no: 10,level: 0|*|block_no: 11,level: 0|*|block_no: 13,level: 0|*|
  8. block_no: 15,level: 0|*|block_no: 17,level: 0|*|block_no: 18,level: 0|*|
复制代码

因为为顺序插入那么2997必定到page 18中然后如下:

  1. ./innblock b.ibd 18 16
  2. ==== Block base info ====
  3. block_no:18 space_id:121 index_id:121
  4. ....
  5. (84) normal record offset:3287 heapno:83 n_owned 0,delflag:N minflag:0 rectype:0
  6. (85) normal record offset:3326 heapno:84 n_owned 0,delflag:N minflag:0 rectype:0
  7. (86) normal record offset:3365 heapno:85 n_owned 0,delflag:N minflag:0 rectype:0
  8. (87) normal record offset:3404 heapno:86 n_owned 0,delflag:N minflag:0 rectype:0
  9. (88) normal record offset:3443 heapno:87 n_owned 0,delflag:N minflag:0 rectype:0
复制代码

因为为顺序插入heap_no 84就是id为2997的记录。我们使用另外一个工具bcview进行验证

  1. ./bcview b.ibd 16 3326 4
  2. current block:00000018--Offset:03326--cnt bytes:04--data is:80000bb5
复制代码

当然0Xbb5就是2997
因此设置参数为:

  1. set global innodb_gaopeng_sl_heap_no=84;
  2. set global innodb_gaopeng_sl_ind_id=121;
  3. set global innodb_gaopeng_sl_page_no=18;
  4. set global innodb_gaopeng_sl_time=30;
  5. mysql> show variables like '%gaopeng%';
  6. +---------------------------+-------+
  7. | Variable_name | Value |
  8. +---------------------------+-------+
  9. | innodb_gaopeng_sl_heap_no | 84 |
  10. | innodb_gaopeng_sl_ind_id | 121 |
  11. | innodb_gaopeng_sl_page_no | 18 |
  12. | innodb_gaopeng_sl_time | 30 |
  13. +---------------------------+-------+
复制代码

那么情况2执行顺序如下:

TX1 TX2
begin; -
update b set name2='test' where id=2999; 对id:2999加LOCK_X锁 -
- insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999);对id:在加锁到996,997,998,999,2995,2996加LOCK_S锁,在对id:2997加锁前睡眠30秒,为下面的update语句腾出时间)
update b set name2='test' where id=999;对id:999加LOCK_X锁等待但发现已经加LOCK_S锁,需等待 -
- 醒来后继续对2997、2998、2999加LOCK_S锁,但是发现id:2999已经加LOCK_X锁,需等待,触发死锁检测
TX1权重回滚 -

死锁报错语句:

  1. mysql> update b set name2='test' where id=999;
  2. ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
复制代码

死锁日志:

  1. *** (1) TRANSACTION:
  2. TRANSACTION 51545, ACTIVE 41 sec starting index read
  3. mysql tables in use 1, locked 1
  4. LOCK WAIT 3 lock struct(s), heap size 1160, 2 row lock(s), undo log entries 1
  5. MySQL thread id 9, OS thread handle 140737223444224, query id 18310 localhost root updating
  6. update b set name2='test' where id=999
  7. *** (1) WAITING FOR THIS LOCK TO BE GRANTED:
  8. RECORD LOCKS space id 121 page no 10 n bits 456 index PRIMARY of table `test`.`b` trx id 51545 lock_mode X(LOCK_X) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
  9. Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  10. 0: len 4; hex 800003e7; asc ;;
  11. 1: len 6; hex 00000000c167; asc g;;
  12. 2: len 7; hex bc000001300110; asc 0 ;;
  13. 3: len 6; hex 67616f393939; asc gao999;;
  14. 4: len 7; hex 67616f70656e67; asc gaopeng;;
  15. *** (2) TRANSACTION:
  16. TRANSACTION 51546, ACTIVE 30 sec starting index read
  17. mysql tables in use 2, locked 2
  18. 5 lock struct(s), heap size 1160, 9 row lock(s), undo log entries 8
  19. MySQL thread id 8, OS thread handle 140737223177984, query id 18309 localhost root Sending data
  20. insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)
  21. *** (2) HOLDS THE LOCK(S):
  22. RECORD LOCKS space id 121 page no 10 n bits 456 index PRIMARY of table `test`.`b` trx id 51546 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP)
  23. Record lock, heap no 8 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  24. 0: len 4; hex 800003e4; asc ;;
  25. 1: len 6; hex 00000000c164; asc d;;
  26. 2: len 7; hex b90000012d0110; asc - ;;
  27. 3: len 6; hex 67616f393936; asc gao996;;
  28. 4: len 7; hex 67616f70656e67; asc gaopeng;;
  29. Record lock, heap no 9 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  30. 0: len 4; hex 800003e5; asc ;;
  31. 1: len 6; hex 00000000c165; asc e;;
  32. 2: len 7; hex ba0000014f0110; asc O ;;
  33. 3: len 6; hex 67616f393937; asc gao997;;
  34. 4: len 7; hex 67616f70656e67; asc gaopeng;;
  35. Record lock, heap no 10 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  36. 0: len 4; hex 800003e6; asc ;;
  37. 1: len 6; hex 00000000c166; asc f;;
  38. 2: len 7; hex bb0000012f0110; asc / ;;
  39. 3: len 6; hex 67616f393938; asc gao998;;
  40. 4: len 7; hex 67616f70656e67; asc gaopeng;;
  41. Record lock, heap no 11 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  42. 0: len 4; hex 800003e7; asc ;;
  43. 1: len 6; hex 00000000c167; asc g;;
  44. 2: len 7; hex bc000001300110; asc 0 ;;
  45. 3: len 6; hex 67616f393939; asc gao999;;
  46. 4: len 7; hex 67616f70656e67; asc gaopeng;;
  47. *** (2) WAITING FOR THIS LOCK TO BE GRANTED:
  48. RECORD LOCKS space id 121 page no 18 n bits 160 index PRIMARY of table `test`.`b` trx id 51546 lock mode S(LOCK_S) locks rec but not gap(LOCK_REC_NOT_GAP) waiting(LOCK_WAIT)
  49. Record lock, heap no 86 PHYSICAL RECORD: n_fields 5; compact format; info bits 0
  50. 0: len 4; hex 80000bb7; asc ;;
  51. 1: len 6; hex 00000000c959; asc Y;;
  52. 2: len 7; hex 00000002240110; asc $ ;;
  53. 3: len 7; hex 67616f32393939; asc gao2999;;
  54. 4: len 4; hex 74657374; asc test;;
  55. *** WE ROLL BACK TRANSACTION (1)
复制代码

信息提取如下:

  1. TRX1:51545
  2. LOCK HOLD:死锁信息不提供
  3. LOCK WAIT:
  4. 表:b
  5. 索引:`PRIMARY`
  6. 锁模式:LOCK_MODE:LOCK_X|LOCK_REC_NOT_GAP|LOCK_WAIT
  7. 记录:主键为0X3e7
  8. 附加信息: space id 121 page no 10 heap no 11
  9. CURRENT SQL:
  10. update b set name2='test' where id=999
  11. TRX2:51546
  12. LOCK HOLD:
  13. 表:b
  14. 索引:`PRIMARY`
  15. 锁模式:LOCK_MODE:LOCK_S|LOCK_REC_NOT_GAP
  16. 记录:主键为0X3e4到0X3e7的多个行锁
  17. 附加信息:space id 121 page no 10
  18. LOCK WAIT:
  19. 表:b
  20. 索引:`PRIMARY`
  21. 锁模式:LOCK_MODE:LOCK_S|LOCK_REC_NOT_GAP|LOCK_WAIT
  22. 记录:主键为0Xbb7
  23. 附加信息:space id 121 page no 10 heap no 86
  24. CURRENT SQL:
  25. insert into a select * from b where id in (996,997,998,999,2995,2996,2997,2998,2999)
复制代码

我们通过死锁日志明显的看出同样的语句报出来的死锁信息却不一样,在高并发下相同语句,两种死锁场景都是可能发生的。

七、总结

分析死锁一般要从死锁日志中获取如下信息

在重现的时候,必须要做到和线上死锁信息完全匹配那么这个死锁场景才叫测试成功了,从这个例子我们就发现,同样的语句产生的死锁信息却不一样,我们当然就要按照不通的场景去考虑,本文中的情况2比较复杂一般只是在高并发先出现,测试也相对麻烦,本文通过修改源码的方式进行测试的,否则很难重现。找到原因后就需要采取必要的措施,比如本文中的例子需要考虑:

强调一点对于出现LOCK_S这样的锁最好深入分析,因为这种锁并不多见。

作者微信:

微信.jpg
微信.jpg


太阳http代理AD
回复

使用道具 举报