MySQL Execution Plan--COUNT相關測驗-有解無憂

COUNT全表記錄

在MySQL中，相同的SQL不同的存盤引擎執行計劃不同：

對于MyISAM引擎，由于使用表鎖進行并發控制，同一時間點多個并發執行緒執行相同查詢獲得的結果相同，且MyISAM存盤引擎專門存盤表總記錄數，因此使用COUNT(*)查詢全表記錄時能直接回傳，

而對于InnoDB存盤引擎，由于使用MVCC和行鎖進行并發控制，同一時間點多個并發執行緒執行相同查詢獲得的結果存在差異(每個回話的READVIEW不同)，且沒有專門存盤表總記錄數，因此每次查詢都需要掃描全表或掃描某個索引的全部記錄，

For transactional storage engines such as InnoDB, storing an exact row count is problematic. Multiple transactions may be occurring at the same time, each of which may affect the count.

InnoDB does not keep an internal count of rows in a table because concurrent transactions might “see” different numbers of rows at the same time. Consequently, SELECT COUNT(*) statements only count rows visible to the current transaction.

Prior to MySQL 5.7.18, InnoDB processes SELECT COUNT(*) statements by scanning the clustered index. As of MySQL 5.7.18, InnoDB processes SELECT COUNT(*) statements by traversing the smallest available secondary index unless an index or optimizer hint directs the optimizer to use a different index. If a secondary index is not present, the clustered index is scanned.

Processing SELECT COUNT(*) statements takes some time if index records are not entirely in the buffer pool. For a faster count, create a counter table and let your application update it according to the inserts and deletes it does. However, this method may not scale well in situations where thousands of concurrent transactions are initiating updates to the same counter table. If an approximate row count is sufficient, use SHOW TABLE STATUS.

InnoDB handles SELECT COUNT(*) and SELECT COUNT(1) operations in the same way. There is no performance difference.

For MyISAM tables, COUNT(*) is optimized to return very quickly if the SELECT retrieves from one table, no other columns are retrieved, and there is no WHERE clause.

This optimization only applies to MyISAM tables, because an exact row count is stored for this storage engine and can be accessed very quickly. COUNT(1) is only subject to the same optimization if the first column is defined as NOT NULL，

現有測驗表TB101：

CREATE TABLE `tb101` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `C1` int(11) NOT NULL,
  `C2` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=MyISAM AUTO_INCREMENT=140001 DEFAULT CHARSET=utf8

對于沒有WHERE條件的COUNT(*)/COUNT(1)/COUNT(ID)/COUNT(C1)的執行計劃為：

mysql> EXPLAIN SELECT COUNT(*) FROM TB101 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: Select tables optimized away
1 row in set, 1 warning (0.00 sec)

對于沒有WHERE條件的COUNT(C2)的執行計劃為：

mysql> EXPLAIN SELECT COUNT(C2) FROM TB101 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TB101
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 140000
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

可以發現，對于MyISAM存盤引擎，在沒有WHERE條件下，如果C1列為NOT NULL，則可以將COUNT(C1)與COUNT(*)和COUNT(1)做相同的處理，

針對上面的測驗，對于InnoDB存盤引擎，在沒有WHERE條件下：

1、ID列為NOT NULL主鍵，COUNT(ID)和COUNT(1)或COUNT(*)的執行計劃相同，回傳結果相同，

2、C1列為NOT NULL，COUNT(C1)和COUNT(1)或COUNT(*)的執行結果相同，但執行計劃不同，

COUNT(expr)異同

1、COUNT(1)和COUNT(*)等價，兩者在執行計劃和執行效率上完全相同，

個人推薦使用COUNT(1)替換COUNT(*)，原因是簡單直觀，
他人推薦使用COUNT(1),原因是符合SQL92標準，阿里巴巴Java開發手冊推薦，

2、COUNT(*)和COUNT(C1)不一定等價，兩者執行計劃和執行結果會存在差異，

COUNT(*)：執行回傳滿足WHERE條件的行數，不考慮NULL值問題
COUNT(C1): 執行回傳滿足WHERE條件且C1不等于NULL的行數，不統計C1等于NULL的行，

換種理解思路：
對于MyISAM引擎表和InnoDB引擎表，無論是顯式主鍵還是因此ROWID，都要求非空唯一，每行記錄都肯定存在一個不為NULL的列(列組)，因此計算COUNT(*)時不需要考慮NULL值問題，

一個有趣的擴展，如果C1為NOT NULL，那么COUNT(C1)與COUNT(1)的回傳結果相同，那么MySQL會對此進行優化么？

現有測驗表結果如下：

CREATE TABLE `tb01` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `C1` int(11) NOT NULL,
  `C2` int(11) NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `IDX_C2` (`C2`)
) ENGINE=InnoDB AUTO_INCREMENT=16384 DEFAULT CHARSET=utf8

查看COUNT(*)和COUNT(C1)的執行計劃：

mysql> EXPLAIN SELECT COUNT(*) FROM TB01 WHERE C2<100 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TB01
   partitions: NULL
         type: range
possible_keys: IDX_C2
          key: IDX_C2
      key_len: 4
          ref: NULL
         rows: 99
     filtered: 100.00
        Extra: Using where; Using index
1 row in set, 1 warning (0.00 sec)

mysql> EXPLAIN SELECT COUNT(C1) FROM TB01 WHERE C2<100 \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TB01
   partitions: NULL
         type: range
possible_keys: IDX_C2
          key: IDX_C2
      key_len: 4
          ref: NULL
         rows: 99
     filtered: 100.00
        Extra: Using index condition
1 row in set, 1 warning (0.00 sec)

從上面執行計劃可以發現，在處理COUNT(*)時，僅需要使用IDX_C2即可完成查詢，因此Extra為Using index，而在處理COUNT(C1)時，需要使用IDX_C2進行過濾后再執行回表查詢，因此Extra為Using index condition，

針對上面的測驗，MyISAM存盤引擎和InnoDB存盤引擎的測驗結果相同，

COUNT(DISTINC ...)操作

MySQL官網解釋為：

COUNT(DISTINCT expr,[expr...])

Returns a count of the number of rows with different non-NULL expr values.

In MySQL, you can obtain the number of distinct expression combinations that do not contain NULL by giving a list of expressions. In standard SQL, you would have to do a concatenation of all expressions inside COUNT(DISTINCT ...).

在MySQL中允許執行：

SELECT COUNT(DISTINCT ID,C1) FROM TB02;

但不允許執行：

SELECT COUNT(ID,C1) FROM TB02;

總結

1、對于InnoDB和MyISAM存盤引擎，COUNT(1)和COUNT(*)在任何場景下都等價，執行性能和執行計劃相同，

2、在查詢全表記錄(沒有WHERE條件)時，對于MyISAM存盤引擎，存盤引擎存盤表總記錄數，無需掃描資料因此查詢可以很快回傳，對于InnoDB存盤引擎，需要掃描全表或某個索引的全部記錄因此查詢可能比較耗時，

3、對于MyISAM存盤引擎，在沒有WHERE條件情況下，如果列C1為NOT NULL，那么COUNT(C1)和COUNT(*)執行操作相同，

4、對于InnoDB存盤引擎，如果列C1為主鍵，那么COUNT(C1)和COUNT(*)執行計劃和執行效率相同，如果C1為NOT NULL，那么COUNT(C1)和COUNT(*)執行計劃和執行效率不一定相同，只有在查詢使用C1列上索引時才可能相同，

參考鏈接：

https://dev.mysql.com/doc/refman/8.0/en/group-by-functions.html#function_count

https://dev.mysql.com/doc/refman/5.7/en/create-index.html

https://mp.weixin.qq.com/s/IOHvtel2KLNi-Ol4UBivbQ

轉載請註明出處，本文鏈接：https://www.uj5u.com/shujuku/115820.html

標籤：MySQL

上一篇：MySQL索引查詢原理

下一篇：神奇的 SQL 之團結的力量 → JOIN