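Introduction to Implicit Type Conversion
Oracle Database supports two kinds of data type conversion: explicit conversion (Explicit Datatype Conversion) and implicit conversion (Implicit Datatype Conversion). When the two values in a comparison or operation have different data types (the source type differs from the target type) and no conversion function is specified, Oracle must convert one of the values so that the operation can proceed. This is implicit type conversion. It is performed automatically, but only when the conversion makes sense.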
Data Conversion
Generally an expression cannot contain values of different datatypes. For example, an expression cannot multiply 5 by 10 and then add 'JAMES'. However, Oracle supports both implicit and explicit conversion of values from one datatype to another.
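For details on implicit type conversion, see the "Data Type Comparison Rules" chapter of the official documentation. The figure below is the implicit conversion matrix from that chapter; it shows at a glance which data types can be converted to which.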
![clip_image001[4] clip_image001[4]](https://img.uj5u.com/2020/09/13/60082130128141.png)
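Implicit conversion rules:
Implicit type conversion happens in many places; we often just do not notice it. I will not give an example for every case, so consult the official documentation for the full description. Some of the common rules are excerpted below: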
The following rules govern implicit data type conversions:
- During INSERT and UPDATE operations, Oracle converts the value to the data type of the affected column.
- During SELECT FROM operations, Oracle converts the data from the column to the type of the target variable.
- When manipulating numeric values, Oracle usually adjusts precision and scale to allow for maximum capacity. In such cases, the numeric data type resulting from such operations can differ from the numeric data type found in the underlying tables.
- When comparing a character value with a numeric value, Oracle converts the character data to a numeric value.
- Conversions between character values or NUMBER values and floating-point number values can be inexact, because the character types and NUMBER use decimal precision to represent the numeric value, and the floating-point numbers use binary precision.
- When converting a CLOB value into a character data type such as VARCHAR2, or converting BLOB to RAW data, if the data to be converted is larger than the target data type, then the database returns an error.
- During conversion from a timestamp value to a DATE value, the fractional seconds portion of the timestamp value is truncated. This behavior differs from earlier releases of Oracle Database, when the fractional seconds portion of the timestamp value was rounded.
- Conversions from BINARY_FLOAT to BINARY_DOUBLE are exact.
- Conversions from BINARY_DOUBLE to BINARY_FLOAT are inexact if the BINARY_DOUBLE value uses more bits of precision than supported by the BINARY_FLOAT.
- When comparing a character value with a DATE value, Oracle converts the character data to DATE.
- When you use a SQL function or operator with an argument of a data type other than the one it accepts, Oracle converts the argument to the accepted data type.
- When making assignments, Oracle converts the value on the right side of the equal sign (=) to the data type of the target of the assignment on the left side.
- During concatenation operations, Oracle converts from noncharacter data types to CHAR or NCHAR.
- During arithmetic operations on and comparisons between character and noncharacter data types, Oracle converts from any character data type to a numeric, date, or rowid, as appropriate. In arithmetic operations between CHAR/VARCHAR2 and NCHAR/NVARCHAR2, Oracle converts to a NUMBER.
- Most SQL character functions are enabled to accept CLOBs as parameters, and Oracle performs implicit conversions between CLOB and character types. Therefore, functions that are not yet enabled for CLOBs can accept CLOBs through implicit conversion. In such cases, Oracle converts the CLOBs to CHAR or VARCHAR2 before the function is invoked. If the CLOB is larger than 4000 bytes, then Oracle converts only the first 4000 bytes to CHAR.
- When converting RAW or LONG RAW data to or from character data, the binary data is represented in hexadecimal form, with one hexadecimal character representing every four bits of RAW data. Refer to "RAW and LONG RAW Data Types" for more information.
- Comparisons between CHAR and VARCHAR2 and between NCHAR and NVARCHAR2 types may entail different character sets. The default direction of conversion in such cases is from the database character set to the national character set. Table 2-9 shows the direction of implicit conversions between different character types.
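The rules above can be restated as follows; corrections are welcome if anything is wrong or imprecise.
1. For INSERT and UPDATE operations, Oracle implicitly converts the inserted or updated value to the data type of the affected column.
2. For SELECT statements, Oracle implicitly converts the column data to the data type of the target variable.
3. When handling numeric values, Oracle usually adjusts precision and scale to allow for maximum capacity. The numeric data type produced by such an operation can therefore differ from the numeric data type in the underlying table.
4. When a character value is compared with a numeric value, Oracle implicitly converts the character value to a number.
5. Conversions between character or NUMBER values and floating-point values can be inexact, because the character types and NUMBER represent numeric values with decimal precision while floating-point numbers use binary precision.
6. When a CLOB value is converted to a character data type such as VARCHAR2, or a BLOB is converted to RAW, the database returns an error if the data is larger than the target data type.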
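7. When a timestamp value is converted to DATE (in practice this mostly arises in assignments, such as inserting a timestamp into a DATE column), the fractional seconds are truncated. Earlier releases of Oracle Database rounded them instead.
8. Conversions from BINARY_FLOAT to BINARY_DOUBLE are exact.
9. Conversions from BINARY_DOUBLE to BINARY_FLOAT are inexact when the BINARY_DOUBLE value uses more bits of precision than BINARY_FLOAT supports.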
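10. When a character value is compared with a DATE value, Oracle converts the character value to DATE.
11. When a SQL function or operator is called with an argument whose data type is not the one it accepts, Oracle converts the argument to the accepted data type.
12. In an assignment, Oracle converts the value on the right side of the equal sign to the data type of the target on the left side.
13. In concatenation operations (usually written with ||), Oracle implicitly converts non-character data types to character types.
14. In arithmetic operations on, and comparisons between, character and non-character data (NUMBER, DATE, ROWID and so on), Oracle converts the character data to the appropriate type: a number, a date or a rowid.
In arithmetic operations between CHAR/VARCHAR2 and NCHAR/NVARCHAR2, Oracle converts both operands to NUMBER.
15. Comparisons between CHAR/VARCHAR2 and NCHAR/NVARCHAR2 may involve different character sets. The default direction of conversion is from the database character set to the national character set.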
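A few of these rules can be observed directly with one-row queries against DUAL. The statements below are a minimal sketch added for illustration; they were not part of the original test session, and the column aliases are arbitrary.

```sql
-- Arithmetic on a character value: '10' is implicitly converted to NUMBER.
SELECT '10' + 5 AS sum_value FROM dual;                 -- 15

-- Concatenation: both numbers are implicitly converted to character data.
SELECT 10 || 5 AS concat_value FROM dual;               -- '105'

-- Character vs. numeric comparison: '007' is converted to the number 7.
SELECT COUNT(*) AS matches FROM dual WHERE '007' = 7;   -- 1

-- The explicit equivalents state the intent instead of relying on the rules.
SELECT TO_NUMBER('10') + 5 AS sum_value FROM dual;
SELECT TO_CHAR(10) || TO_CHAR(5) AS concat_value FROM dual;
```

The third query is worth noting: the strings '007' and '7' are different, yet both compare equal to the number 7, because the comparison is carried out on numbers.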
Here are two simple examples of situations in which implicit conversion occurs:
Example:
SQL> create table test(object_id varchar2(12), object_name varchar2(64));

Table created.

SQL> insert into test
  2  select object_id, object_name from dba_objects;

63426 rows created.

SQL> commit;

Commit complete.

SQL> create index ix_test_n1 on test(object_id);

Index created.

SQL> select count(*) from test where object_id=20;

  COUNT(*)
----------
         1

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  4bh7yzj5ma0ks, child number 0
-------------------------------------
select count(*) from test where object_id=20

Plan hash value: 1950795681

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |    45 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |     8 |            |          |
|*  2 |   TABLE ACCESS FULL| TEST |     3 |    24 |    45  (20)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(TO_NUMBER("OBJECT_ID")=20)

Note
-----
   - dynamic sampling used for this statement

23 rows selected.
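As shown above, the implicit conversion follows the rule "When comparing a character value with a numeric value, Oracle converts the character data to a numeric value." (The SELECT FROM rule may look relevant too, but it concerns converting fetched column data to the type of the target variable, so it is not the one at work here.) Because the conversion is applied to the OBJECT_ID column itself (TO_NUMBER("OBJECT_ID")), the index on that column cannot be used and the execution plan falls back to a full table scan. With a small change to the SQL, the plan uses an INDEX RANGE SCAN instead: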
SQL> select count(*) from test where object_id='20';

  COUNT(*)
----------
         1

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  7800f6da7c909, child number 0
-------------------------------------
select count(*) from test where object_id='20'

Plan hash value: 4037411162

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |            |     1 |     6 |            |          |
|*  2 |   INDEX RANGE SCAN| IX_TEST_N1 |     1 |     6 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"='20')

19 rows selected.
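When the application really holds a numeric value, the conversion can be written explicitly on the value side so that the indexed column is left untouched. The statement below is a minimal sketch against the TEST table from this example; the rewrite is an illustration and was not run in the original session.

```sql
-- Convert the value, not the column: the predicate stays "OBJECT_ID" = '20',
-- so the index on OBJECT_ID remains usable.
select count(*) from test where object_id = TO_CHAR(20);
```

The two forms are not strictly equivalent. TO_NUMBER("OBJECT_ID")=20 also matches a stored value such as '020', and it raises ORA-01722 (invalid number) as soon as it is evaluated against a row whose OBJECT_ID is not a valid number. The string comparison does neither.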
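Here is another case, for the rule that Oracle converts character data to DATE when a character value is compared with a DATE value. The conversion works in most situations, but it can be a hidden logic bomb: when the NLS_DATE_FORMAT setting changes, the statement may raise an error or silently return wrong results.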
SQL> SELECT *
  2  FROM scott.emp
  3  WHERE hiredate between '01-JAN-1981' and '01-APR-1981';

     EMPNO ENAME      JOB              MGR HIREDATE         SAL       COMM     DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
      7499 ALLEN      SALESMAN        7698 20-FEB-81       1600        300         30
      7521 WARD       SALESMAN        7698 22-FEB-81       1250        500         30

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  czyc76busj56d, child number 0
-------------------------------------
SELECT * FROM scott.emp WHERE hiredate between '01-JAN-1981' and
'01-APR-1981'

Plan hash value: 3956160932

--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |       |       |     2 (100)|          |
|*  1 |  TABLE ACCESS FULL| EMP  |     2 |    74 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("HIREDATE"<=TO_DATE(' 1981-04-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "HIREDATE">=TO_DATE(' 1981-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

21 rows selected.
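The dependence on NLS_DATE_FORMAT can be removed by writing the conversion explicitly. The two statements below are a sketch of how the query above might be rewritten; they were not part of the original session.

```sql
-- Explicit format mask and date language: the result no longer depends on
-- the session's NLS_DATE_FORMAT or NLS_DATE_LANGUAGE settings.
SELECT *
  FROM scott.emp
 WHERE hiredate BETWEEN TO_DATE('01-JAN-1981', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE = AMERICAN')
                    AND TO_DATE('01-APR-1981', 'DD-MON-YYYY', 'NLS_DATE_LANGUAGE = AMERICAN');

-- ANSI date literals are always written as YYYY-MM-DD and need no format mask.
SELECT *
  FROM scott.emp
 WHERE hiredate BETWEEN DATE '1981-01-01' AND DATE '1981-04-01';
```

With the string literals of the original query, a session whose NLS_DATE_FORMAT is, for example, YYYY-MM-DD would fail to convert '01-JAN-1981', whereas either form above behaves the same in every session.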
Problems with Implicit Type Conversion
Implicit and Explicit Data Conversion
Oracle recommends that you specify explicit conversions, rather than rely on implicit or automatic conversions, for these reasons:
· SQL statements are easier to understand when you use explicit datatype conversion functions.
· Implicit datatype conversion can have a negative impact on performance, especially if the datatype of a column value is converted to that of a constant rather than the other way around.
· Implicit conversion depends on the context in which it occurs and may not work the same way in every case. For example, implicit conversion from a datetime value to a VARCHAR2 value may return an unexpected year depending on the value of the NLS_DATE_FORMAT parameter.
· Algorithms for implicit conversion are subject to change across software releases and among Oracle products. Behavior of explicit conversions is more predictable.
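Although implicit conversion happens automatically in many places, relying on it is not recommended. Oracle advises specifying explicit conversions rather than depending on implicit or automatic ones, mainly for the following reasons:
· SQL statements are easier to understand when explicit conversion functions are used.
· Implicit conversion can hurt performance, especially when the data type of a column value is converted to that of a constant rather than the other way around.
· Implicit conversion depends on the context in which it occurs and may not work the same way in every case. For example, an implicit conversion from a datetime value to VARCHAR2 may return an unexpected year, depending on the value of the NLS_DATE_FORMAT parameter.
· The algorithms for implicit conversion may change across software releases and among Oracle products. Explicit conversions behave more predictably; relying on implicit ones can leave a trap for later.
· If an implicit conversion occurs in an index expression, Oracle Database might not use the index, because the index is defined for the pre-conversion data type. This can hurt performance.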
Tom Kyte's article "On Implicit Conversions and More" also summarizes several problems that implicit data type conversion brings:
- The resulting code typically has logic bombs in it. The code appears to work in certain circumstances but will not work in others.
- The resulting code relies on default settings. If someone changes the default settings, the way the code works will be subject to change. (A DBA changing a setting can make your code work entirely differently from the way it does today.)
- The resulting code can very well be subject to SQL injection bugs.
- The resulting code may end up performing numerous unnecessary repeated conversions (negatively affecting performance and consuming many more resources than necessary).
- The implicit conversion may be precluding certain access paths from being available to the optimizer, resulting in suboptimal query plans. (In fact, this is exactly what is happening to you!)
- Implicit conversions may prevent partition elimination.
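The examples above already touched on this. The next example shows that an implicit type conversion does not necessarily stop the execution plan from using an index. A serious performance problem arises only when the implicit conversion function is applied to the indexed column in the predicate, that is, when the column's type is converted to the type of the value it is compared with.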
SQL> drop table test;

Table dropped.

SQL> create table test
  2  as
  3  select * from dba_objects;

Table created.

SQL> create index ix_test_n1 on test(object_id);

Index created.

SQL> select count(*) from test where object_id='20';

  COUNT(*)
----------
         1

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  29jmhh43kkrv4, child number 0
-------------------------------------
select count(*) from test where object_id='20'

Plan hash value: 4037411162

--------------------------------------------------------------------------------
| Id  | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |       |       |     1 (100)|          |
|   1 |  SORT AGGREGATE   |            |     1 |    13 |            |          |
|*  2 |   INDEX RANGE SCAN| IX_TEST_N1 |    10 |   130 |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("OBJECT_ID"=20)

Note
-----
   - dynamic sampling used for this statement

23 rows selected.
![clip_image002[4] clip_image002[4]](https://img.uj5u.com/2020/09/13/60082130128142.png)
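An implicit conversion did take place in this statement, but it was applied to the string '20', which was converted to the number 20. The conversion did not touch the OBJECT_ID column, that is, the column side of the predicate, so the access path through IX_TEST_N1 was unaffected.
So when you meet questions such as "Does an implicit conversion always prevent index use?" or "Does an implicit type conversion always invalidate the index?", analyze the specific case: what matters is which side of the predicate gets converted, and no blanket answer applies.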
The next example shows an implicit type conversion triggered by a bind variable:
SQL> create table test
  2  as
  3  select * from dba_objects;

Table created.

SQL> commit;

Commit complete.

SQL> create index ix_test_object_name on test(object_name);

Index created.

SQL> variables v_object_name nvarchar2(30);
SP2-0734: unknown command beginning "variables ..." - rest of line ignored.
SQL> variable v_object_name nvarchar2(30);
SQL> exec :v_object_name :='I_OBJ1';

PL/SQL procedure successfully completed.

SQL> select count(*) from test where object_name=:v_object_name;

  COUNT(*)
----------
         1

SQL> SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  ft05prnxtpk9u, child number 0
-------------------------------------
select count(*) from test where object_name=:v_object_name

Plan hash value: 1950795681

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |       |       |   113 (100)|          |
|   1 |  SORT AGGREGATE    |      |     1 |    66 |            |          |
|*  2 |   TABLE ACCESS FULL| TEST |    10 |   660 |   113  (11)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(SYS_OP_C2C("OBJECT_NAME")=:V_OBJECT_NAME)

Note
-----
   - dynamic sampling used for this statement

23 rows selected.
![clip_image003[4] clip_image003[4]](https://img.uj5u.com/2020/09/13/60082130128143.png)
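The implicit conversion here follows the rule "Comparisons between CHAR and VARCHAR2 and between NCHAR and NVARCHAR2 types may entail different character sets. The default direction of conversion in such cases is from the database character set to the national character set." Oracle carries it out with the internal function SYS_OP_C2C, which it applies to the OBJECT_NAME column.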
SYS_OP_C2C is an internal function which does an implicit conversion of varchar2 to national character set using TO_NCHAR function. Thus, the filter completely changes as compared to the filter using normal comparison.
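The usual fix is to give the bind variable the same data type as the column. The SQL*Plus commands below are a minimal sketch reusing the TEST table and variable name from the example; they were not run in the original session, so the resulting plan should be checked with DBMS_XPLAN.DISPLAY_CURSOR.

```sql
-- Declare the bind variable as VARCHAR2, matching the OBJECT_NAME column,
-- so that no character set conversion is needed on the column side.
variable v_object_name varchar2(30)
exec :v_object_name := 'I_OBJ1'
select count(*) from test where object_name = :v_object_name;

-- If the bind variable has to stay NVARCHAR2 (for example because the client
-- driver always binds national character strings), convert the bind value
-- explicitly instead of letting Oracle convert the column.
select count(*) from test where object_name = TO_CHAR(:v_object_name);
```

The second form assumes every bound value can be represented in the database character set; characters that cannot may be lost in the conversion.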
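How to Find SQL with Implicit Conversions?
Some companies audit every SQL statement before release and can stop most statements with implicit type conversions at the source. Most companies, however, lack the capability or resources for that, so what matters most is finding the statements in the database that involve implicit conversions. Two queries are commonly used for this: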
SELECT
SQL_ID,
PLAN_HASH_VALUE
FROM
V$SQL_PLAN X
WHERE
X.FILTER_PREDICATES LIKE '%INTERNAL_FUNCTION%'
GROUP BY
SQL_ID,
PLAN_HASH_VALUE;
SELECT
SQL_ID,
PLAN_HASH_VALUE
FROM
V$SQL_PLAN X
WHERE
X.FILTER_PREDICATES LIKE '%SYS_OP_C2C%'
GROUP BY
SQL_ID,
PLAN_HASH_VALUE;
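The two queries above inspect only FILTER_PREDICATES. A conversion can also show up in ACCESS_PREDICATES, so a combined search over both columns may be useful. The query below is a sketch of that idea, not a definitive audit; it uses only documented columns of V$SQL_PLAN.

```sql
-- One pass over the cached plans, checking both predicate columns for the
-- two markers and showing which object the plan step operates on.
SELECT DISTINCT
       p.sql_id,
       p.plan_hash_value,
       p.object_owner,
       p.object_name,
       p.filter_predicates,
       p.access_predicates
  FROM v$sql_plan p
 WHERE p.filter_predicates LIKE '%SYS_OP_C2C%'
    OR p.access_predicates LIKE '%SYS_OP_C2C%'
    OR p.filter_predicates LIKE '%INTERNAL_FUNCTION%'
    OR p.access_predicates LIKE '%INTERNAL_FUNCTION%'
 ORDER BY p.sql_id;
```

Because V$SQL_PLAN only covers cursors currently in the shared pool, the result is a sample of recent activity and every hit still needs manual review.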
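Note, however, that INTERNAL_FUNCTION appearing in an execution plan does not necessarily mean the statement involves an implicit data type conversion. On this point see my blog post "Does INTERNAL_FUNCTION in an Oracle execution plan always mean an implicit conversion?". The statements found this way must therefore still be examined and verified one by one.
My other post, "The internal function SYS_OP_C2C and implicit type conversion in Oracle", is also worth reading for anyone not yet familiar with implicit type conversion.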
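How to Avoid Implicit Type Conversion?
1: During database design and when writing SQL, stay consistent and avoid unnecessary data type conversions.
When modeling, standardize column types. In particular, columns used to join to other tables must have matching data types, which avoids unnecessary implicit conversions.
In queries, keep the predicate values consistent with the column types, and make sure each bind variable has the same data type as the column it is compared with.
2: Use conversion functions to convert types explicitly.
Common conversion functions include:
· TO_CHAR: converts a DATE or NUMBER to a string.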
· TO_DATE: converts CHAR or VARCHAR2 to DATE. For timestamps, use TO_TIMESTAMP or TO_TIMESTAMP_TZ.
· TO_NUMBER: converts CHAR or VARCHAR2 to NUMBER.
3: Create a function-based index on SYS_OP_C2C.
This approach is rarely used, but it is a valid optimization in special cases.
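For the bind variable example earlier, option 3 might look like the sketch below. The index name is hypothetical, and SYS_OP_C2C is an internal, undocumented function, so its behavior may differ between releases; treat this as a workaround to test rather than a recommended design.

```sql
-- Function-based index whose expression matches the converted predicate
-- SYS_OP_C2C("OBJECT_NAME") = :V_OBJECT_NAME seen in the execution plan.
-- ix_test_object_name_c2c is a made-up name for illustration.
create index ix_test_object_name_c2c on test (SYS_OP_C2C(object_name));

-- Re-run the statement with the NVARCHAR2 bind variable, then check whether
-- the optimizer picked the new index.
select count(*) from test where object_name = :v_object_name;
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR);
```

This is mainly of interest when the application cannot be changed to bind a VARCHAR2 value; fixing the bind type remains the simpler option.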
References:
https://blogs.oracle.com/oraclemagazine/on-implicit-conversions-and-more
https://docs.oracle.com/cd/E21764_01/apirefs.1111/e12048/cql_elements.htm#CQLLR290
https://docs.oracle.com/en/database/oracle/oracle-database/19/sqlrf/Data-Type-Comparison-Rules.html#GUID-98BE3A78-6E33-4181-B5CB-D96FD9DC1694
Please credit the source when reposting. Link to this article: https://www.uj5u.com/shujuku/18431.html
