explain排查分析慢sql

健康平安的活着 2022-06-03 我要评论

一概述

1.0 sql调优的过程

SQL调优过程：

观察，至少跑1天，看看生产的慢SQL情况。
开启慢查询日志，设置阙值，比如超过5秒钟的就是慢SQL，并将它抓取出来。
explain + 慢SQL分析。
show profile，查询SQL在Mysql服务器里面的执行细节和生命周期情况。
运维经理 or DBA，进行SQL数据库服务器的参数调优。

1.1 优化索引口诀

优化的口诀如下：

全值匹配我最爱，最佳左前缀法则 ；

带头大哥不能死，中间兄弟不能断；

索引列上少计算，范围之后全失效；

LIKE 百分写最右，覆盖索引不写 *；

不等空值还有 OR，索引影响要注意；

VAR 引号不可丢， SQL 优化有诀窍。

注意：以下操作都是在所建索引

CREATE INDEX idx_age_deptid_name ON emp(age,deptid,NAME);

下进行的操作，如下:

1.1.1 全值匹配我最爱

CREATE INDEX idx_age_deptid_name ON emp(age,deptid,NAME);

结论：全职匹配我最爱指的是，查询的字段按照顺序在索引中都可以匹配到！ ,SQL 中查询字段的顺序，跟使用索引中字段的顺序，没有关系。优化器会在不影响 SQL 执行结果的前提下，给你自动地优化。

1.1.2 最佳左前缀法则，带头大哥不能死，中间兄弟不能断；

使用复合索引，需要遵循最佳左前缀法则，即如果索引了多列，要遵守最左前缀法则。指的是查询从索引的最左前列开始并且不跳过索引中的列

前面两个sql索引失效的原因：

即使跳过了中间的索引，但是其长度没变化，跟第一个sql只使用name的索引的长度一样，那就说明第二个sql值使用了部分索引，只使用了name的索引，后面的age，pos失效。不然的话长度肯定大于74。

结论：过滤条件要使用索引必须按照索引建立时的顺序，依次满足， 一旦跳过某个字段，索引后面的字段都无法被使用。

1.1.3 索引列上少计算

不在索引列上做任何操作（计算、函数、（自动or手动）类型转换），会导致索引失效而转向全表扫描。

所以字符串类型的数据，该加单引号的一定要加！

1.1.4 范围之后全失效

复合索引：CREATE INDEX idx_age_deptid_name ON emp(age,deptid,NAME);

索引列上不能有范围查询，少用>，<，between…and等结构；范围查询的列忽略，索引失效，后面的索引列也跟着失效，不起作用。

建议：将可能做范围查询的字段的索引顺序放在最后

1.1.5 覆盖索引不写 *

即查询列和索引列一致，不要写 select *!， 尽量使用覆盖索引（只访问索引的查询（索引列和查询列一致））区别在于extra，索引的不同，速度不一样

1.1.6 使用不等于(!= 或者<>)的时候

mysql 在使用不等于 (!= 或者 <>) 时，有时会无法使用索引会导致全表扫描。这个得知道这种情况，根据业务情况，得写这种范围查询，还是要写的。一切满足业务。

1.1.7 不用 is null 或者is not null

is not null 用不到索引，如果某列字段中包含null，is null是可以用到索引的如果某列字段中不包含null，is null是不可以用到索引的

1.1.8 LIKE 百分写最右

1. 注意看模糊查询的细节，只有xx%前缀查询才不会失效

2.如果要实现两边百分号，不能失效，%xx%，可以使用覆盖索引来解决

增加一个索引CREATE INDEX idx_user_nameAge ON tbl_user(NAME,age);

如果使用下面这些，都是使用的覆盖索引，结果都是一样的， 注意id之所以没加索引，但是可以加入使用不会失效，是因为他是主键

但是但是如果加入了没有主键又不是索引的东西，%xx%就会失效

1.1.9 字符串不加单引号索引失效

如varchar类型，自己写成int型，虽然类型不正确也可以查询，但是底层会帮你转换类型，索引直接失效，变成了全表查询。字符串不加单引号索引失效。

1.1.10 少用or，用它来连接时会索引失效。

少用or，用它来连接时会索引失效。

用使用 union all 或者 union 来替代：

1.2 案例分析

1.3 建索引总结

1.对于单键索引，尽量选择针对当前query过滤性更好的索引。

2.在选择组合索引的时候，当前Query中过滤性最好的字段在索引字段顺序中，位置越靠前越好。

3.在选择组合索引的时候，尽量选择可以能够包含当前query中的where字句中更多字段的索引。

4.尽可能通过分析统计信息和调整query的写法来达到选择合适索引的目的。

二案例分析

2.1 单表分析

2.2.1 分析过程

1.sql语句

explain SELECT id, author_id FROM article WHERE category_id = 1 AND comments > 1 ORDER BY views DESC LIMIT 1;

通过执行查看发现：此sql进行了全表查询，而且在extra还出现了Using filesort等问题。

2解决办法：建立其复合索引

create index idx_article_ccv on article(category_id,comments,views);

或者

ALTER TABLE 'article' ADD INDEX idx_article_ccv ( 'category_id , 'comments', 'views' );

通过观察发现：扫描的范围发生改变，变为range，但是extra还是using filetext。

3.原因在于：

当comments字段在联合索引里处于中间位置时，因comments > 1条件是一个范围值(所谓range)，MySQL无法利用索引再对后面的views部分进行检索，即range类型查询字段后面的索引无效。

4.解决办法：建立复合索引是对的，但是其思路要避开中间那个范围的索引进去。只加入另外两个索引即可create index idx_article_cv on article(category_id, views);

到此优化完成！

2.2.2 原因结论

索引失效违反的规则为：

联合索引中，范围索引字段之后全失效。

2.2 两表表分析

2.2.1 案例准备

1.calss表

CREATE TABLE IF NOT EXISTS `class` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`id`)
);

2.book表

CREATE TABLE IF NOT EXISTS `book` (
bookid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`card` INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (`bookid`)
);

1.不建索引

2.假设在class 表中对card字段创建索引

在 class 表上建立索引：alter table class add index idx_card(card);

3.假设在book表中对card字段创建索引

ALTER TABLE `book` ADD INDEX idx_card( `card`);

通过，1，2，3情况的比较，2，3使用到了索引，要比1的情况好，3要比2的好，查询效率高，通过过book表，这个大端表实现rows为1，etra为using index 相比情况2中book使用到索引，效率高。

2.2.2 结论

left join 时候，小表放到左边，小表叫驱动表，大表放到右边，大表叫被驱动表。

优化关联查询时，只有在被驱动表上建立索引才有效！

索引两表优化，左连接右表建索引，右连接左表建索引

2.3 3表表分析

2.3.1 案例

现在再建一张新表：

CREATE TABLE IF NOT EXISTS phone(
    phoneid INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    card INT(10) UNSIGNED NOT NULL,
    PRIMARY KEY(phoneid)
)ENGINE=INNODB;

添加必要的索引

通过后两张表

分别是alter tablephoneadd index z(card);，以及alter tablebookadd index y(card);

2.3.2 结论

leftjoin时：永远用小表结果集驱动大表结果集，保证join语句中被驱动的表的join条件的字段添加了索引

2.4 4表表分析

2.4.1 实验比较

1.虚表作为左表，实表作为右表

EXPLAIN SELECT ed.name ' 人物 ',c.name ' 掌门 ' FROM (SELECT e.name,d.ceo from t_emp e LEFT JOIN t_dept d on e.deptid=d.id) ed LEFT JOIN t_emp c on ed.ceo= c.id;

2.实体表作为左表，虚表作为实体表

EXPLAIN SELECT e.name ' 人物 ',tmp.name ' 掌门 ' FROM t_emp e LEFT JOIN (SELECT d.id did,e.name FROM t_dept d LEFT JOIN t_emp e ON d.ceo=e.id)tmp ON e.deptId=tmp.did;

3. 直接关联

EXPLAIN SELECT e1.name ' 人物 ',e2.name ' 掌门 ' FROM t_emp e1 LEFT JOIN t_dept d on e1.deptid = d.id LEFT JOIN t_emp e2 on d.ceo = e2.id ;

2.4.2 结论

第一个查询效率较高，且有优化的余地。第二个案例中， 子查询作为被驱动表，由于子查询是虚表，无法建立索引，因此不能优化 1子查询尽量不要放在被驱动表，有可能使用不到索引；

2left join时，尽量让实体表作为被驱动表。

3.第3种情况，相对于前面2种，进行直接关联。 能够直接多表关联的尽量直接关联，不用子查询。

2.5 inner join关联

2.5.1 例子

1.EXPLAIN SELECT * FROM book inner join class on class.card=book.card;

2.EXPLAIN SELECT * FROM class inner join book on class.card=book.card;

2.5.2 结论

两个查询字段调换顺序，发现结果也是一样的！ inner join 时，mysql 会自己帮你把小结果集的表选为驱动表。

2.6 子查询优化

2.6.1 情况描述

1.查询所有不为掌门人的员工，按年龄分组！ select age as ' 年龄 ', count(*) as ' 人数' from t_emp where id not in (select ceo from t_dept where ceo is not null) group by age;

可以看到 dept表是全表扫描。没有使用上索引

2.解决 dept 表的全表扫描，建立 ceo 字段的索引：

3.再次查询

4.修改sql写法：替换 not in

select age as ' 年龄 ',count(*) as ' 人数 ' from emp e left join dept d on e.id=d.ceo where d.id is null group by age;

2.6.2 结论

在范围判断时，尽量不要使用 not in 和 not exists，使用 left join on xxx is null 代替。

2.7 小表驱动大表（in 或者exits的使用）

2.7.1 情况举例

select * from A where id in (select id from B)

1.当B表的数据集必须小于A表的数据集时，用in优于exists。

等价于:

for select id from B

for select * from A where A.id = B.id

2.当A表的数据集系小于B表的数据集时，用exists优于in

等价于：

for select * from A

for select * from B where B.id = A.id

2.7.2 结论

1.小表驱动大表

2.EXSTS(subquey)只返回TRUE或FALSE，因此子查询中的SELECT * 也可以是 SELECT 1 或select ‘X’，官方说法是实际执行时会忽略SELECT清单，因此没有区别。

3.EXISTS子查询往往也可以用条件表达式，其他子查询或者JOIN来替代，何种最优需要具体问题具体分析

2.8 order by

2.8.1 案例说明

1.创建一张表

create table tblA( #id int primary key not null auto_increment, age int, birth timestamp not null );

2.创建索引

索引 create index idx_A_ageBirth on tblA(age, birth);

1.如果索引的时候也是按照顺序

2.索引的时候不按顺序就会出现这样的情况

2.8.2 结论

1.MySQL支持二种方式的排序：FileSort和lIndex

Index效率高，它指MySQL扫描索引本身完成排序

FileSort方式效率较低。

2 ORDER BY子句，尽量使用Index方式排序，避免使用FileSort方式排序。

3. ORDER BY满足两情况，会使用Index方式排序：

没有where条件过滤的情况，ORDER BY语句使用索引最左前缀原则。

使用where子句与Order BY子句条件列组合满足索引最左前缀原则。

4.order by 时候，select * 是一个大忌，select * 影响排序速度。影响情况：

当Query的字段大小总和小于max_length_for_sort_data而且排序字段不是TEXT|BLOB类型时，会用改进后的算法——单路排序，否则用老算法——多路排序。
两种算法的数据都有可能超出sort_buffer的容量，超出之后，会创建tmp文件进行合并排序，导致多次IO，但是用单路排序算法的风险会更大一些，所以要提高sort_buffer__size。
尝试提高sort_buffer_size，不管用哪种算法，提高这个参数都会提高效率，当然，要根据系统的能力去提高，因为这个参数是针对每个进程的。
尝试提高max_length_for_sort_data，提高这个参数，会增加用改进算法的概率。但是如果设的太高，数据总容量超出sort_buffer_size的概率就增大，明显症状是高的磁盘I/O活动和低的处理器使用率。

4.常见案例

2.9 group by

2.9.1 结论

GroupBy优化（和order by差不多）

group by实质是先排序后进行分组，遵照索引建的最佳左前缀。

当无法使用索引列，增大max_length_for_sort_data参数的设置 + 增大sort_buffer_size参数的设置。

where高于having，能写在where限定的条件就不要去having限定了。

group by 使用索引的原则几乎跟 order by 一致，唯一区别是 groupby 即使没有过滤条件用到索引，也可以直接使用索引

2.10 最左前缀原则

2.10.1 描述

1.建表

CREATE TABLE `tb_student` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `age` int(10) DEFAULT NULL,
  `pos` varchar(255) DEFAULT NULL,
  `address` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;

2.10.2 不建索引的情况

explain   select * from tb_student

结论：可以看到是全表扫描

2.10.3 创建单列索引

alter table tb_student add index name_index(name);
desc select * from tb_student where  name='ljf'

结论：使用到了索引，且索引级别为ref；

2.10.4 创建复合索引

alter table tb_student drop index name_index;
 alter table tb_student add index name_age_pos_index(name,age,pos);
  show index from tb_student

情况1：使用到了索引，但是没有用到覆盖索引，select * (id,name,age,pos,address)查询的列大于所建的索引列(name,age,pos),etra 为null
desc select name,age,pos,address from tb_student where name='ljf' and age=22 and pos='dd'

    desc select * from tb_student where  name='ljf'

    desc select address from tb_student where  name='ljf'

情况2：使用到了索引，走覆盖索引，所建的索引列，正好能覆盖住 select 要查询的列，走覆盖索引，etra 为using index

    desc select name from tb_student where  name='ljf'
        desc select age from tb_student where  name='ljf'
        desc select pos from tb_student where  name='ljf'
      desc select name,pos from tb_student where  name='ljf'
        desc select name,pos from tb_student where  name='ljf' and age=22 and pos='dd'
                desc select name,age,pos from tb_student where  name='ljf' 
                             desc select name,age,pos from tb_student where  name='ljf' and age=23
        #（255*3+2+1）+（4+1）+（255*3+2+1）=1541
        desc select name,age,pos from tb_student where  name='ljf' and age=22 and pos='dd'

情况3：使用到了索引，使用部分索引，部分索引字段失效，有using index，和using where 条件过滤

     #中间兄弟不能断
             desc select name,age,pos from tb_student where  name='ljf' and pos='dd'
              desc select age,pos from tb_student where  name='ljf' and pos='dd'

desc select * from tb_student where  name='ljf' and pos='dd'

                #带头大哥不能断
                desc select name,pos from tb_student where     pos='dd'

2.11 综合where+group by

explain select d.deptName,if(avg(age)>40,' 老鸟 ',' 菜鸟 ') from dept d inner join emp e
on d.id=e.deptid
group by d.deptName,d.id

查看：

优化思路：

1.在emp表中对deptid字段创建索引；create index index_deptid on emp(deptid);

2.在 deprt表中对depatname，id两个字段创建索引：create index idx_deptName_id on dept(deptName,id);

3.dept为驱动表，emp为被驱动表

2.12 综合

1.首先创建表

 create  table student(
    id int,
      first_name varchar(10),
     last_name varchar(10),
     primary key(id),
    key index_first(first_name)
)engine=innodb default charset=utf8;

1.进行查询

 -- 插入数据
insert into student values (1,'a','b');
 -- 按照first_name查找
 desc select first_name,last_name from student where first_name='a';

结论：当设置first_name为普通索引(单列索引)，按照first_name查询；type:ref、possible_keys:indes_first、key:indes_first、extra:null，用到索引；

2.使用复合索引

-- 设置first_name,last_name复合索引
 alter table student drop index index_first;
 alter table student add index index_name(first_name,last_name);
-- 按照first_name查找
 desc select first_name,last_name from student where first_name='a';

结论：当设置first_name，last_name为复合索引(联合索引)，按照first_name查询；type:ref、possible_keys:indes_name、key:indes_name、extra:Using index；type:ref用到索引，因为是复合索引不需要回表扫描，extra:Using index索引覆盖；注意此时key_len为33=10*3(utf8)+2(变长)+1(null标志位)，用到了复合索引的一部分即first_name

3.使用复合索引

结论：当设置first_name，last_name为复合索引(联合索引)，按照last_name查询；type:index、possible_keys:null、key:indes_name、extra:Using where，Using index；type:index而不是ref，扫描索引树，复合索引的最左原则；此时key_len为66=10*3(utf8)+2(变长)+1(null)+10*3(utf8)+2(变长)+1(null标志位)；Using where应where子句进行限制