導讀
作者:Gopal Shankar
翻譯:徐晨亮原文地址: https://mysqlserverteam.com/mysql-8-0-improvements-to-information_schema/
INFORMATION_SCHEMA
subsystem design in MySQL 8.0. In this post I will first go through our legacy implementation as it has stood since MySQL 5.1, and then cover what’s changed.
INFORMATION_SCHEMA
子系統設計中,我們做了一些很有用的增強,在這篇文章中,我將會介紹自MySQL 5.1以來的舊的實作方式,然后介紹我們做了什么改變,
INFORMATION_SCHEMA
was first introduced into MySQL 5.0, as a standards compliant way of retrieving meta data from a running MySQL server. When we look at the history of
INFORMATION_SCHEMA
there have been a number of complaints about the performance of certain queries, particularly in the case that there are many database objects (schemas, tables etc).
INFORMATION_SCHEMA
首次引入MySQL 5.0,作為一種從正在運行的MySQL服務器檢索元資料的標準兼容方式,當我們回顧
INFORMATION_SCHEMA
的歷史時,對于某些特定查詢性能總是有很多的抱怨,特別是在有許多資料庫物件(schema,表等)的情況下,
INFORMATION_SCHEMA
queries. The optimizations are described in the MySQL manual, and apply when the user provides an explicit schema name or table name in the query.
INFORMATION_SCHEMA
查詢的執行速度,MySQL手冊
<鏈接1>
中描述了這些優化,當用戶在查詢中提供顯式schema名稱或表名時,將會應用這些,
INFORMATION_SCHEMA
performance is still a major pain point for many of our users. The key reason behind these performance issues in the current
INFORMATION_SCHEMA
implementation is that
INFORMATION_SCHEMA
tables are implemented as temporary tables that are created on-the-fly during query execution. These temporary tables are populated via:
-
Meta data from files, e.g. table definitions from .FRM files.
-
Details from storage engines, e.g. dynamic table statistics.
-
Data from global data structures in the MySQL server.
INFORMATION_SCHEMA的
性能仍然是我們許多用戶的主要痛點,在當前
INFORMATION_SCHEMA
實作方式下產生的性能問題背后的關鍵原因是,
INFORMATION_SCHEMA
表的查詢實作方式是在查詢執行期間創建臨時表,這些臨時表通過以下方式填充:
-
元資料來自檔案,例如:表定義來自FRM檔案
-
細節來自于存盤引擎,例如:動態表的統計資訊
-
來自MySQL server層中全域資料結構的資料
INFORMATION_SCHEMA
query would end-up doing lot of I/O reading each individual FRM files from the file system. And it would also end-up using more CPU cycles in effort to open the table and prepare related in-memory data structures. It does attempt to use the MySQL server table cache (the system variable ‘
table_definition_cache
‘), however in large server instances it’s very rare to have a table cache that is large enough to accommodate all of these tables.
INFORMATION_SCHEMA
查詢最侄訓從檔案系統中讀取每個單獨的FRM檔案,造成很多I/O讀取,
并且最侄訓會消耗更多的CPU來打開表并準備相關的記憶體資料結構,
它確實嘗試使用MySQL server層的表快取(系統變數
table_definition_cache
),但是在大型實體中,很少有一個足夠大的表快取來容納所有的表,
INFORMATION_SCHEMA
query. For example, let us consider the two queries below
INFORMATION_SCHEMA
查詢未使用優化,則可以很容易碰到上面的性能問題,
例如,讓我們考慮下面的兩個查詢
-> TABLE_SCHEMA = 'test' AND TABLE_NAME = 't1'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: TABLES
partitions: NULL
type: ALL
possible_keys: NULL
key: TABLE_SCHEMA,TABLE_NAME
key_len: NULL
ref: NULL
rows: NULL
filtered: NULL
Extra: Using where; Skip_open_table; Scanned 0 databases
1 row in set, 1 warning ( 0. 00 sec)
mysql > EXPLAIN SELECT TABLE_NAME FROM INFORMATION_SCHEMA. TABLES WHERE
-> TABLE_SCHEMA like 'test%' AND TABLE_NAME like 't%'\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: TABLES
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: NULL
filtered: NULL
Extra: Using where; Skip_open_table; Scanned all databases
1 row in set, 1 warning ( 0. 00 sec)
EXPLAIN
output, we see that the former query would use the values provided in
WHERE
clause for the
TABLE_SCHEMA
and
TABLE_NAME
field as a key to read the desired
FRM
files from the file system. However, the latter query would end up reading all the
FRM
in the entire data directory, which is very costly and does not scale.
EXPLAIN
的輸出可以看到,
我們看到前一個查詢將使用
WHERE
子句中為
TABLE_SCHEMA
和
TABLE_NAME
欄位提供的值作為鍵,從檔案系統讀取所需
FRM
檔案,
但是,后一個查詢最終將讀取整個資料目錄中的所有
FRM
,這非常昂貴且無法擴展,
FRM
files) and also help MySQL to move towards supporting transactional DDL. For more details on introduction of data dictionary feature in 8.0 and its benefits, please look at Staale’s post here.
FRM
檔案),并幫助MySQL轉向支持事務DDL,有關在8.0中引入資料字典功能及其優點的更多詳細資訊,請在此處查看Staale的文章
<鏈接2>
,
INFORMATION_SCHEMA
table as a database VIEW over the data dictionary tables. This eliminates costs such as the creation of temporary tables for each
INFORMATION_SCHEMA
query during execution on-the-fly, and also scanning file-system directories to find FRM files. It is also now possible to utilize the full power of the MySQL optimizer to prepare better query execution plans using indexes on data dictionary tables.
INFORMATION_SCHEMA
查詢創建臨時表,以及掃描檔案系統目錄以查找FRM檔案,
現在還可以利用MySQL優化器的全部功能,使用資料字典表上的索引來獲得更好的執行計劃,
+ --+-----------+-----++------+------------------+----------++----------------------+----+--------+----------------------------------+
|id|select_type|table ||type |possible_keys |key ||ref |rows|filtered|Extra |
+ --+-----------+-----++------+------------------+----------++----------------------+----+--------+----------------------------------+
| 1|SIMPLE |cat ||index |PRIMARY |name ||NULL | 1| 100. 00|Using index |
| 1|SIMPLE |sch ||eq_ref|PRIMARY,catalog_id|catalog_id || mysql. cat.id,const | 1| 100. 00|Using index |
| 1|SIMPLE |tbl ||eq_ref|schema_id |schema_id || mysql. sch.id,const | 1| 10. 00|Using index condition; Using where|
| 1|SIMPLE |col ||eq_ref|PRIMARY |PRIMARY || mysql. tbl.collation_id| 1| 100. 00|Using index |
+ --+-----------+-----++------+------------------+----------++----------------------+----+--------+----------------------------------+
mysql > EXPLAIN SELECT TABLE_NAME FROM INFORMATION_SCHEMA. TABLES WHERE TABLE_SCHEMA like 'test%' AND TABLE_NAME like 't%';
+ --+-----------+-----++------+------------------+----------++-----------------------+----+--------+---------------------------------+
|id|select_type|table ||type |possible_keys |key || ref |rows|filtered|Extra |
+ --+-----------+-----++------+------------------+----------++-----------------------+----+--------+---------------------------------+
| 1|SIMPLE |cat ||index |PRIMARY |name || NULL | 1| 100. 00|Using index |
| 1|SIMPLE |sch ||ref |PRIMARY,catalog_id|catalog_id || mysql. cat.id | 6| 16. 67|Using where; Using index |
| 1|SIMPLE |tbl ||ref |schema_id |schema_id || mysql. sch.id | 26| 1. 11|Using index condition;Using where|
| 1|SIMPLE |col ||eq_ref|PRIMARY |PRIMARY || mysql. tbl.collation_id| 1| 100. 00|Using index |
+ --+-----------+-----++------+------------------+----------++-----------------------+----+--------+---------------------------------+
INFORMATION_SCHEMA
design in 8.0, we see that it is much more efficient than MySQL 5.7. As an example, this query is now ~100 times faster (with 100 databases with 50 tables each). A separate blog will describe more about performance of
INFORMATION_SCHEMA
in 8.0.
INFORMATION_SCHEMA
來看性能提升時,我們發現它比MySQL 5.7更有效,
例如,此查詢現在快?100倍(100個資料庫,每個50個表),
另外一篇博客將詳細介紹8.0 中
INFORMATION_SCHEMA
性能 ,
FROM information_schema. tables
WHERE TABLE_SCHEMA LIKE 'db%';
Sources of Metadata
INFORMATION_SCHEMA
tables are implemented as a VIEW over the data dictionary tables in 8.0. Currently we have the following
INFORMATION_SCHEMA
tables designed as views:
INFORMATION_SCHEMA
表都通過8.0中的資料字典表作為視圖實作,
目前,我們將以下
INFORMATION_SCHEMA
表設計為視圖:
-
SCHEMATA
-
TABLES
-
COLUMNS
-
VIEWS
-
CHARACTER_SETS
-
COLLATIONS
-
COLLATION_CHARACTER_SET_APPLICABILITY
-
STATISTICS
-
KEY_COLUMN_USAGE
-
TABLE_CONSTRAINTS
INFORMATION_SCHEMA
tables as views:
INFORMATION_SCHEMA
表作為視圖:
-
EVENTS
-
TRIGGERS
-
ROUTINES
-
REFERENTIAL_CONSTRAINTS
INFORMATION_SCHEMA
queries which are not directly implemented as VIEWs over data dictionary tables, let me first describe that there are two types of meta data which are presented in
INFORMATION_SCHEMA
tables:
INFORMATION_SCHEMA
表中有兩種型別的元資料:
-
Static table metadata. For example:
TABLE_SCHEMA,TABLE_NAME,TABLE_TYPE,ENGINE. These statistics will be read directly from the data dictionary. -
Dynamic table metadata. For example:
AUTO_INCREMENT,AVG_ROW_LENGTH,DATA_FREE. Dynamic metadata frequently changes (for example: the auto_increment value will advance after each insert).In many cases the dynamic metadata will also incur some cost to accurately calculate on demand, and accuracy may not be beneficial for the typical query. Consider the case of theDATA_FREEstatistic which shows the number of free bytes in a table – a cached value is usually sufficient. -
靜態表元資料, 例如:
TABLE_SCHEMA,TABLE_NAME,TABLE_TYPE,ENGINE, 這些靜態資料將會從資料字典中直接讀取 -
動態表元資料, 例如:
AUTO_INCREMENT,AVG_ROW_LENGTH,DATA_FREE, 動態元資料經常會變更(例如: 自增值會在每次插入后自增), 在許多情況下,動態元資料也會產生一些成本,以便按需準確計算,并且對于某些特定的查詢這個準確性并沒有用, 考慮DATA_FREE統計資訊的情況,該統計資訊顯示表中的空閑位元組數 - 快取值通常就足夠了,
information_schema_stats
(default
cached
), and can be changed to
information_schema_stats=latest
in order to always retrieve the dynamic information directly from the storage engine (at the cost of slightly higher query execution).
information_schema_stats
(默認
快取
)進行配置,并且可以更改為
information_schema_stats = latest
,以便始終直接從存盤引擎檢索動態資訊(以稍高的查詢執行為代價)
ANALYZE TABLE
on the table, to update the cached dynamic statistics.
ANALYZE TABLE
,以更新快取的動態統計資訊,
INFORMATION_SCHEMA
design in 8.0 is a step forward enabling:
-
-
Simple and maintainable implementation.
-
Us to get rid of numerous
INFORMATION_SCHEMAlegacy bugs. -
Proper use of the MySQL optimizer for
INFORMATION_SCHEMAqueries. -
INFORMATION_SCHEMAqueries to execute ~100 times faster, compared to 5.7, when retrieving static table metadata, as show in query Q1.
-
INFORMATION_SCHEMA
設計是向前邁出的一步:
-
-
簡單且可維護的實作,
-
我們擺脫了很多的
INFORMATION_SCHEMA遺留漏洞, -
正確使用MySQL優化器進行
INFORMATION_SCHEMA查詢, -
與檢索靜態表元資料時的5.7相比,
INFORMATION_SCHEMA查詢執行速度快~100倍,如查詢Q1中所示,
-
INFORMATION_SCHEMA
in 8.0. The new implementation comes with a few changes in behavior when compared to the old
INFORMATION_SCHEMA
implementation. Please check the MySQL manual for more details about it.
INFORMATION_SCHEMA的
討論,
與舊的
INFORMATION_SCHEMA
實作相比,新的實作方式有一些變化,
有關它的更多詳細資訊,請查看MySQL手冊,
END
掃碼加入MySQL技術Q群
(群號:529671799)
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/224343.html
標籤:其他
