主鍵上的SQLMAX，如果已經編入索引，是否不需要過濾條件？-有解無憂

select MAX(id) from studenthistory 
where class_id = 1 
   and date(created_at) = '2021-11-05' 
   and time(created_at) > TIME('04:00:00') 
group by student_id

composite indexes = ("class_id", "student_id", "created_at")

id 是主鍵。

是date(created_at) = '2021-11-05' and time(created_at) > TIME('04:00:00')過濾條件不必要的最大功能，因為studenthistory已經編入索引上和類標識碼student_id資料？

我添加日期時間過濾器的唯一原因是這個表會隨著時間的推移變得越來越大。（歷史資料）我想減少查詢必須搜索的行數。

但是對于 Max 函式的情況 - 我相信 MAX 會簡單地獲取最后一個值而不檢查整行，如果它被索引了。

那么我可以安全地洗掉日期時間過濾器并將其轉換為

select MAX(id) from studenthistory 
where class_id = 1 
group by student_id

并具有相同的性能？（或者更好，因為它不需要進一步過濾？）

檢查查詢計劃似乎性能相似，但目前表的大小相當小..

第一的：

| -> Group aggregate: max(id)  (cost=1466.30 rows=7254) (actual time=2.555..5.766 rows=3 loops=1)
    -> Filter: ((cast(studenthistory.created_at as date) = '2021-11-05') and (cast(riderlocation.created_at as time(6)) > <cache>(cast('04:00:00' as time))))  (cost=740.90 rows=7254) (actual time=0.762..5.384 rows=5349 loops=1)
        -> Index lookup on studenthistory using idx_studenthistory_class_id_931474 (class_id=1)  (cost=740.90 rows=7254) (actual time=0.029..3.589 rows=14638 loops=1)
 |

1 row in set (0.00 sec)

第二：

| -> Group aggregate: max(studenthistory.id)  (cost=1475.40 rows=7299) (actual time=0.545..5.271 rows=10 loops=1)
    -> Index lookup on studenthistory using idx_studenthistory_class_id_931474 (class_id=1)  (cost=745.50 rows=7299) (actual time=0.026..4.164 rows=14729 loops=1)
 |
1 row in set (0.01 sec)

提前謝謝了

更新：應用@rick james 的建議：

將索引更改為 (class_id, student_id, id)。

FLUSH STATUS;
explain FORMAT=JSON SELECT MAX(`id`) `0` FROM `studenthistory` WHERE `class_id`=1 AND `created_at`>='2021-11-05T18:25:50.544850 00:00' GROUP BY `student_id`;


| {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "940.10"
    },
    "grouping_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "studenthistory",
        "access_type": "ref",
        "possible_keys": [
          "fk_studenthist_student_e25b0310",
          "idx_studenthistory_class_id_931474"
        ],
        "key": "idx_studenthistory_class_id_931474",
        "used_key_parts": [
          "class_id"
        ],
        "key_length": "4",
        "ref": [
          "const"
        ],
        "rows_examined_per_scan": 8381,
        "rows_produced_per_join": 2793,
        "filtered": "33.33",
        "cost_info": {
          "read_cost": "102.00",
          "eval_cost": "279.34",
          "prefix_cost": "940.10",
          "data_read_per_join": "130K"
        },
        "used_columns": [
          "id",
          "created_at",
          "student_id",
          "class_id"
        ],
        "attached_condition": "(`test-table`.`studenthistory`.`created_at` >= TIMESTAMP'2021-11-05 18:25:50.54485')"
      }
    }
  }
} |

i.e. only class_id is used as an index, (as created_at is no longer in the index. rows_produced_per_join is lower due to filter: 2793,

Without datetime filter:

FLUSH STATUS;
mysql> explain FORMAT=JSON SELECT MAX(`id`) `0` FROM `riderlocation` WHERE `zone_id`=1  GROUP BY `rider_id`;


| {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "854.75"
    },
    "grouping_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "studenthistory",
        "access_type": "ref",
        "possible_keys": [
          "fk_studenthistory_student_e25b0310",
          "idx_studenthistory_class_id_931474"
        ],
        "key": "idx_studenthistory_class_id_931474",
        "used_key_parts": [
          "class_id"
        ],
        "key_length": "4",
        "ref": [
          "const"
        ],
        "rows_examined_per_scan": 8381,
        "rows_produced_per_join": 8381,
        "filtered": "100.00",
        "using_index": true,
        "cost_info": {
          "read_cost": "16.65",
          "eval_cost": "838.10",
          "prefix_cost": "854.75",
          "data_read_per_join": "392K"
        },
        "used_columns": [
          "id",
          "student_id",
          "class_id"
        ]
      }
    }
  }
} |

Runs on all 3 indexes ("class_id", "student_id", "id"), same 8381 number of rows slightly lower query cost (940 -> 854)

Applying the first query with original index ("class_id", "student_id", "created_at") yields:

FLUSH STATUS;
    explain FORMAT=JSON SELECT MAX(`id`) `0` FROM `studenthistory` WHERE `class_id`=1 AND `created_at`>='2021-11-05T18:25:50.544850 00:00' GROUP BY `student_id`;

| {
  "query_block": {
    "select_id": 1,
    "cost_info": {
      "query_cost": "858.94"
    },
    "grouping_operation": {
      "using_filesort": false,
      "table": {
        "table_name": "studenthistory",
        "access_type": "ref",
        "possible_keys": [
          "fk_studenthistory_student_e25b0310",
          "idx_studenthistory_class_id_931474"
        ],
        "key": "idx_studenthistory_class_id_931474",
        "used_key_parts": [
          "zone_id"
        ],
        "key_length": "4",
        "ref": [
          "const"
        ],
        "rows_examined_per_scan": 8381,
        "rows_produced_per_join": 2793,
        "filtered": "33.33",
        "using_index": true,
        "cost_info": {
          "read_cost": "20.84",
          "eval_cost": "279.34",
          "prefix_cost": "858.94",
          "data_read_per_join": "130K"
        },
        "used_columns": [
          "id",
          "created_at",
          "student_id",
          "class_id"
        ],
        "attached_condition": "(`test-table`.`studenthistory`.`created_at` >= TIMESTAMP'2021-11-05 18:25:50.54485')"
      }
    }
  }
} |

The cost this time is 858, rows "rows_examined_per_scan": 8381, "rows_produced_per_join": 2793. Only class_id was used as key however. (why.?) not the remaining student_id and created_at

uj5u.com熱心網友回復：

查詢 1

select MAX(id) from studenthistory 
where class_id = 1 
   and date(created_at) = '2021-11-05' 
   and time(created_at) > TIME('04:00:00') 
group by student_id

不要把日期分開；改成

AND created_at > '2021-11-05 04:00:00'

如果要檢查當天“創建”的行，請使用

AND created_at >= '2021-11-05'
AND created_at  < '2021-11-05'   INTERVAL 1 DAY

或者，如果您想檢查“今天”：

AND created_at >= CURDATE()

今天凌晨 4 點后：

AND created_at >= CURDATE()   INTERVAL 4 HOUR

使用date(created_at)使created_at部分INDEX無法使用。（參見“sargable”）

select MAX(id) ... group by student_id

可能會回傳多行——每個學生一個。也許你想擺脫group by? 或者指定一個特定的student_id？

查詢 2 可能運行得更快：

select MAX(id) from studenthistory 
where class_id = 1 
group by student_id

但最佳索引是INDEX(class_id, student_id, id),（可以同時包含兩個復合索引。）

它可能回傳多行，所以也許你想要

select student_id, MAX(id) from studenthistory 
where class_id = 1 
group by student_id

最大限度

我相信 MAX 會簡單地獲取最后一個值而不檢查整行，如果它被索引了。

有時。

您的第二個查詢可以做到這一點。但是第一個查詢不能——因為范圍測驗（在 created_at 上）是障礙。

解釋

查詢計劃似乎……類似

唉，EXPLAIN省略了細節。您可以使用獲得更多詳細資訊EXPLAIN FORMAT=JSON SELECT ...，但不一定足夠詳細。

I think you will find that the second query will give a much smaller value for "Rows" after adding my suggested index.

A way to get an accurate measure of "rows (table or index) touched":

FLUSH STATUS;
SELECT ...;
SHOW SESSION STATUS LIKE 'Handler%';

uj5u.com熱心網友回復：

查詢：

select MAX(id) from studenthistory 
where class_id = 1 
group by student_id

如果您創建索引，速度會很快：

create index ix1 on studenthistory (class_id, student_id, id);

轉載請註明出處，本文鏈接：https://www.uj5u.com/qukuanlian/350369.html

標籤：mysql sql database

上一篇：Mongodb按內部元素分組

下一篇：如何在mongodb資料庫或所有專案中查找特定專案？