mysqlIndexOptimization - juedaiyuer/researchNote GitHub Wiki

#MySQL---索引优化#

##安装演示数据库sakila##

http://downloads.mysql.com/docs/sakila-db.tar.gz

tar -zxf sakila-db.tar.gz

mysql -uroot -p <sakila-schema.sql
mysql -uroot -p <sakila-data.sql

##如何选择合适的列建立索引##

  1. 在where从句,group by从句,order by从句,on从句中出现的列

  2. 索引字段越小越好

  3. 离散度大的列放在联合索引的前面

    SELECT * FROM payment WHERE staff_id=2 AND customer_id=584;

    index(staff_id,customer_id) < index(customer_id,staff_id)

    customer_id离散度更大

    mysql> select count(distinct customer_id),count(distinct staff_id) from payment; +-----------------------------+--------------------------+ | count(distinct customer_id) | count(distinct staff_id) | +-----------------------------+--------------------------+ | 599 | 2 | +-----------------------------+--------------------------+

##优化索引策略##

索引列上不能使用表达式或函数

select ... from product
where to_days(out_date)-to_days(current_date)<=30

select ... from product
where out_date<=date_add(current_date,interval 30 day)

前缀索引和索引列的选择性

Btree索引对键值的大小是有限制的,Innodb不超过767个字节,myisam不超过1000个字节,对于字符串来说有时有些不足

CREATE INDEX index_name ON table(col_name(n));

索引的选择性是不重复的索引值和表的记录数的比值

前缀索引

###联合索引###

如何选择索引列的顺序

  1. 经常会被使用到的列优先
  2. 选择性高的列优先
  3. 宽度小的列优先

###覆盖索引###

优点

  1. 可以优化缓存,减少磁盘IO
  2. 可以减少随即IO,变随即IO操作变为顺序IO操作
  3. 可以避免对Innodb主键索引的二次查询
  4. 可以避免MySAM表进行系统调用

无法使用覆盖索引的情况

  1. 存储引擎不支持覆盖索引(memory)

  2. 查询中使用了太多的列(select *)

  3. 使用了双%号的like查询

    mysql> explain select language_id from film where language_id=1\G;

    Extra: Using index

    mysql> explain select * from film where language_id=1\G;

    Extra: Using where

    mysql> show create table actor\G;

    mysql> explain select actor_id,last_name from actor where last_name='Joe'\G;

    Extra: Using index

###使用索引扫描来优化排序###

通过排序操作 按照索引顺序扫描数据

  1. 索引的列顺序和Order by子句的顺序完全一致

  2. 索引中所有列的方向(升序,降序)和order by子句完全一致

  3. order by中的字段全部在关联表中的第一张表中

    mysql> show create table rental;

    mysql> explain select * from rental where rental_date>'2005-01-01' order by rental_id\G;

    type: index

    mysql> explain select * from rental where rental_date='2005-05-09' order by inventory_id,customer_id\G;

###模拟Hash索引优化查询##

mysql> show create table film\g;

`title` varchar(255) NOT NULL 这一列使用前缀索引

mysql> alter table film add title_md5 varchar(32);

mysql> update film set title_md5=md5(title);

mysql> create index idx_md5 on film(title_md5);

#使用Btree索引模拟Hash索引

mysql> explain select * from film where title_md5=md5('EGG IGBY') and title='EGG IGBY'\G;
  • 只能处理键值的全值匹配查找
  • 所使用的Hash函数决定着索引键的大小

###利用索引优化锁###

  • 索引可以减少锁定的行数

  • 索引可以加快处理速度,同时也加快了锁的释放

    mysql> show create table actor\G;

    mysql> drop index idx_actor_last_name on actor;

    #查看执行计划 mysql> explain select * from actor where last_name='WOOD'\G;

    #连接1,加排它锁 mysql> select * from actor where last_name='WOOD' for update;

    #连接2,被阻塞 mysql> select * from actor where last_name='willis' for update;

    #回滚操作,释放锁 mysql> rollback;

    #恢复索引 mysql> create index idx_lastname on actor(last_name);

    #连接1,加排它锁 mysql> select * from actor where last_name='WOOD' for update;

    #连接2,未被阻塞 mysql> select * from actor where last_name='willis' for update;

###删除重复和冗余的索引###

#重复索引
primary key(id),unique key(id),index(id)

#冗余索引
index(a),index(a,b)
primary key(id),index(a,id)

create index idx_customerid_staffid on payment(customer_id,staff_id);

use information_schema

#查找重复及冗余索引SQL
#执行失败,需要调试
SELECT a.TABLE_SCHEMA AS '数据名'
      ,a.table_name AS '表名'
      ,a.index_name AS '索引1'
      ,b.INDEX_NAME AS '索引2'
      ,a.COLUMN_NAME AS '重复列名'
FROM STATISTICS a JOIN STATISTICS b ON
a.TABLE_SCHEMA=b.TABLE.SCHEMA AND a.TABLE_NAME=b.table_name
AND a.SEQ_IN_INDEX=b.SEQ_IN_INDEX AND a.COLUMN_NAME=
b.COLUMN_NAME WHERE a.SEQ_IN_INDEX =1 AND a.INDEX_NAME <>
b.INDEX_NAME

#检查工具
#percona-toolkit
pt-duplicate-key-checker -uroot -p h=127.0.0.1

###查找未被使用过的索引###

目前MySQL中还没有记录索引的使用情况,但是在perconMySQL和MariaDB中可以通过INDEX_STATISTICS表来查看哪些索引未被使用,但在MySQL中目前只能通过慢查日志pt-index-usage工具来进行索引使用情况分析

pt-index-usage -uroot -p mysql-slow.log

mysql> SELECT object_schema,object_name,index_name,b.`TABLE_ROWS`
 FROM performance_schema.table_io_waits_summary_by_index_usage a
 JOIN information_schema.tables b ON
 a.`OBJECT_SCHEMA`=b.`TABLE_SCHEMA` AND
 a.`OBJECT_NAME`=b.`TABLE_NAME`
 WHERE index_name IS NOT NULL
 AND count_star=0
 ORDER BY object_schema,object_name;

###更新索引统计信息及减少索引碎片###

analyze table table_name

optimize table table_name 使用不当会导致锁表