[20221018]本地運行與遠程運行.txt
--//鏈接http://blog.tanelpoder.com/2008/02/05/oracle-hidden-costs-revealed-part-1/.
--//里面提到一個問題本地運行與遠程運行,oracle性能存在怎么區別,理論講如果不考慮網路傳輸,兩組差別不大.
--//因為Oracle是一個客戶端服務器資料庫系統,所有的執行都是在本地執行的,而不管客戶機的位置如何,因此性能是相同的,
--//作者給出一個例子,說明一些區別:
1.環境:
SCOTT@book> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
2.分別從widnows客戶端以及linux服務端測驗看看.
--//注:lotslios.sql 來自 tpt 里面的測驗腳本.
--//測驗在本地服務器.
SCOTT@book> @spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- ------ ------- ---------- --------------------------------------------------
36 45801 14705 DEDICATED 14706 26 206 alter system kill session '36,45801' immediate;
SCOTT@book> set timing on
SCOTT@book> @lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000000
Elapsed: 00:00:08.11
--//注:@lotslios 1e5根本測驗不出來.
--//測驗在客戶端windows:
SCOTT@78> @spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
53 44367 4476:8736 DEDICATED 14714 27 144 alter system kill session '53,44367' immediate;
SCOTT@78> set timing on
SCOTT@78> @tpt/lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000000
Elapsed: 00:00:10.40
--//你可以運行多次,都是windows下測驗時間大于在本地服務器的測驗時間.
--//作者通過使用他自己寫Snapper包以及V$SESSTAT,看不出任何差異.
--//使用strace跟蹤(注:作者的服務器solaris,使用truss).
--//測驗在本地服務器,使用strace跟蹤服務端行程.
$ strace -cp 14706
Process 14706 attached - interrupt to quit
^CProcess 14706 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
nan 0.000000 0 2 read
nan 0.000000 0 2 write
nan 0.000000 0 27 getrusage
nan 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 41 total
--//測驗在客戶端windows,使用strace跟蹤服務端行程.
$ strace -cp 14714
Process 14714 attached - interrupt to quit
^CProcess 14714 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
96.13 0.024820 0 949593 poll
3.87 0.001000 500 2 read
0.00 0.000000 0 2 write
0.00 0.000000 0 83 getrusage
0.00 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.025820 949690 total
--//很慢!!我不得按ctrl+c停止strace,你可以發現大量呼叫poll.
--//可以發現測驗在客戶端windows,多了一個poll 系統呼叫,作者測驗平臺solaris,呼叫的是pollsys.
# man -a pool
POLL(2) Linux Programmer's Manual POLL(2)
NAME
poll, ppoll - wait for some event on a file descriptor
SYNOPSIS
#include <poll.h>
int poll(struct pollfd *fds, nfds_t nfds, int timeout);
#define _GNU_SOURCE
#include <poll.h>
int ppoll(struct pollfd *fds, nfds_t nfds,
const struct timespec *timeout, const sigset_t *sigmask);
--//轉載:http://blog.tanelpoder.com/2008/02/05/oracle-hidden-costs-revealed-part-1/.
So, there is a big difference in number of pollsys() system calls, depending on which client was used for connecting.
The pollsys syscall is normally used for checking whether there is any data that can be read from a file descriptor (or
whether the file descriptor is ready for receiving more writes). As TCP sockets on Unix are also accessed through file
descriptors, Oracle could be polling the client TCP connection file descriptor… but (without prior knowledge) we can
not be sure.
因此,pollsys()系統呼叫的數量有很大的差異,這取決于用于連接的客戶端,pollsys系統通常用于檢查是否有可以從檔案描述符讀取的
資料(或者檔案描述符是否準備好接收更多的寫操作),由于Unix上的TCP套接字也可以通過檔案描述符訪問,Oracle可以輪詢客戶端TCP連
接檔案描述符…但是(沒有事先知識)我們不能確定,
...
Oracle client server communication normally works in RPC fashion – for example a client sends a command to Oracle and
Oracle doesn't return anything until the command is completed.
Oracle客戶端服務器通信通常以RPC的方式作業——例如,客戶端向Oracle發送一個命令,而Oracle在該命令完成之前不會回傳任何東西
,
Now if a user tries to cancel their query (using CTRL+C in sqlplus or calling OCIBreak in non-blocking OCI), a cancel
packet is sent to server over TCP. The packet will be stored in the server side receive buffer of OS TCP stack and
becomes available for reading for the server process (via a TCP socket). However if the server process is in a
long-running loop executing a query, it needs to periodically check the TCP receive socket for any outstanding packets.
And this is exactly what the pollsys() system call does.
現在,如果用戶試圖取消他們的查詢(在sqlplus中使用CTRL+C或在非阻塞OCI中呼叫OCIBreak),一個取消資料包將通過TCP發送到服務器
,該資料包將存盤在OS TCP堆疊的服務器端接識訓沖區中,并可為服務器行程讀取(通過TCP套接字),但是,如果服務器行程處于執行查詢
的長時間運行的回圈中,那么它需要定期檢查TCP接收套接字中是否有任何未完成的資料包,這正是pollsys()系統所做的,
This approach for cancelling an operation is called in-band break, as the break packet is sent in-band with all other
traffic. The server process has to be programmed to periodically check for any newly arrived packets, even if it is
already busy working on something else.
這種取消操作的方法稱為in-band break,因為中斷包與所有其他業務一起在帶內發送,服務器行程必須被編程,以定期檢查任何新到達
的資料包,即使它已經在忙于處理其他事情,
There are several functions in Oracle kernel where the developers have put the check for in-band breaks. This means that
in some highly repetitive operations (like nested loop join) the same functions are hit again and again – causing
frequent polling on the TCP socket. And too frequent polling is what causes the peformance degradation.
在Oracle內核中有幾個函式,開發人員可以檢查in-band breaks,這意味著在一些高度重復的操作(如嵌套回圈連接)中,相同的函式會被
反復命中,導致TCP套接字上頻繁輪詢,而過頻繁的輪詢是導致性能下降的原因,
However Oracle network layer has a sqlnet.ora parameter called break_poll_skip, which can help in such situations. This
parameters defines, how many times to just silently skip the TCP socket polling when the nsmore2recv() function is
called. The parameter defaults to 3 in recent versions, which means that only 1 of 3 polls are actually executed ( from
above test case it's seen that for 4 million consistent gets roughly 1/3 = 1.3 million pollsys() calls were executed ).
然而,Oracle網路層有一個名為break_poll_skip的sqlnet.ora引數,這可以在這種情況下提供幫助,此引數定義了當呼叫nsmore2recv()
函式時,只需無聲地跳過TCP套接字輪詢的次數,在最近的版本中,引數默認為3,這意味著實際3個輪詢中只有1個被執行(從上面的測驗
用例可以看出,400萬一致得到大約1/3=130萬個民調系統()呼叫),
--//換成執行lotslios 1e5,再使用strace跟蹤看看.
--//測驗在本地服務器,使用strace跟蹤服務端行程.
$ strace -cp 14706
Process 14706 attached - interrupt to quit
^CProcess 14706 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
nan 0.000000 0 2 read
nan 0.000000 0 2 write
nan 0.000000 0 19 getrusage
nan 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 33 total
--//測驗在客戶端windows,使用strace跟蹤服務端行程.
$ strace -cp 14714
Process 14714 attached - interrupt to quit
^CProcess 14714 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000018 0 897 poll
0.00 0.000000 0 2 read
0.00 0.000000 0 2 write
0.00 0.000000 0 19 getrusage
0.00 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000018 930 total
--//poll=897次.
SCOTT@78> set autot traceonly
SCOTT@78> @tpt/lotslios 1e5
generate lots of LIOs by repeatedly full scanning through a small table...
Elapsed: 00:00:00.06
Execution Plan
---------------------------
Plan hash value: 3691747574
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 23P (1)|999:59:59 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | NESTED LOOPS | | 2401P| 15E| 23P (1)|999:59:59 |
| 4 | NESTED LOOPS | | 79T| 650T| 769G (1)|999:59:59 |
| 5 | NESTED LOOPS | | 2631M| 14G| 25M (1)| 84:57:18 |
| 6 | TABLE ACCESS FULL| OBJ$ | 87098 | 255K| 295 (1)| 00:00:04 |
|* 7 | TABLE ACCESS FULL| OBJ$ | 30210 | 90630 | 293 (1)| 00:00:04 |
|* 8 | TABLE ACCESS FULL | OBJ$ | 30210 | 90630 | 293 (1)| 00:00:04 |
|* 9 | TABLE ACCESS FULL | OBJ$ | 30210 | 90630 | 293 (1)| 00:00:04 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM<=1e5)
7 - filter("A"."OWNER#"="B"."OWNER#")
8 - filter("B"."OWNER#"="C"."OWNER#")
9 - filter("C"."OWNER#"="D"."OWNER#")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2706 consistent gets
0 physical reads
0 redo size
346 bytes sent via SQL*Net to client
471 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
--//按照作者介紹break_poll_skip預設3, consistent gets/3 = 2706/3= 902,與跟蹤看到的897接近.
--//順便提一下,不知道作者如何測驗的,@lotslios 10000,consistent gets達到了4089670.或許11.2.0.4執行計劃發生了變化.疑問??
--//因為回傳是count(*),僅僅1行.即使設定arraysize=2 ,邏輯讀我的測驗也是2706
--//可以通過改變break_poll_skip的sqlnet.ora引數,減少poll呼叫.
3.改變break_poll_skip引數在sqlnet.ora檔案中.
--//修改break_poll_skip=10,注意測驗時要重新登錄才生效!!
SCOTT@78> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
53 44369 5380:8500 DEDICATED 15041 27 145 alter system kill session '53,44369' immediate;
SCOTT@78> set timing on
SCOTT@78> @tpt/lotslios 1e5
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000
Elapsed: 00:00:00.04
$ strace -cp 15041
Process 15041 attached - interrupt to quit
^CProcess 15041 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
nan 0.000000 0 2 read
nan 0.000000 0 2 write
nan 0.000000 0 269 poll
nan 0.000000 0 19 getrusage
nan 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 302 total
--//2706/10 = 270.6,poll呼叫269,已經非常接近.
SCOTT@78> @tpt/lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000000
Elapsed: 00:00:09.99
--//比前面10.40快了一點點.
--//修改break_poll_skip=1000
$ grep break sqlnet.ora
break_poll_skip=1000
SCOTT@78> @ spid
SID SERIAL# PROCESS SERVER SPID PID P_SERIAL# C50
---------- ---------- ------------------------ --------- -------------------- ------- ---------- --------------------------------------------------
53 44371 4544:8592 DEDICATED 15081 27 146 alter system kill session '53,44371' immediate;
SCOTT@78> set timing on
SCOTT@78> @tpt/lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000000
Elapsed: 00:00:09.65
SCOTT@78> @tpt/lotslios 1e5
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000
Elapsed: 00:00:00.04
$ strace -cp 15081
Process 15081 attached - interrupt to quit
^CProcess 15081 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
nan 0.000000 0 2 read
nan 0.000000 0 2 write
nan 0.000000 0 3 poll
nan 0.000000 0 19 getrusage
nan 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000000 36 total
--//補充測驗@tpt/lotslios 1e8:
SCOTT@78> @tpt/lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000000
Elapsed: 00:00:09.98
$ strace -cp 15081
Process 15081 attached - interrupt to quit
^CProcess 15081 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000066 0 2849 poll
0.00 0.000000 0 2 read
0.00 0.000000 0 2 write
0.00 0.000000 0 29 getrusage
0.00 0.000000 0 10 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000066 2892 total
SCOTT@78> set autot traceonly
SCOTT@78> @tpt/lotslios 1e8
SCOTT@78> set autot traceonly
SCOTT@78> @tpt/lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
Elapsed: 00:00:09.68
Execution Plan
----------------------------------------------------------
Plan hash value: 3691747574
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 23P (1)|999:59:59 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | COUNT STOPKEY | | | | | |
| 3 | NESTED LOOPS | | 2401P| 15E| 23P (1)|999:59:59 |
| 4 | NESTED LOOPS | | 79T| 650T| 769G (1)|999:59:59 |
| 5 | NESTED LOOPS | | 2631M| 14G| 25M (1)| 84:57:18 |
| 6 | TABLE ACCESS FULL| OBJ$ | 87098 | 255K| 295 (1)| 00:00:04 |
|* 7 | TABLE ACCESS FULL| OBJ$ | 30210 | 90630 | 293 (1)| 00:00:04 |
|* 8 | TABLE ACCESS FULL | OBJ$ | 30210 | 90630 | 293 (1)| 00:00:04 |
|* 9 | TABLE ACCESS FULL | OBJ$ | 30210 | 90630 | 293 (1)| 00:00:04 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter(ROWNUM<=1e8)
7 - filter("A"."OWNER#"="B"."OWNER#")
8 - filter("B"."OWNER#"="C"."OWNER#")
9 - filter("C"."OWNER#"="D"."OWNER#")
Statistics
----------------------------------------------------------
12 recursive calls
0 db block gets
2859366 consistent gets
0 physical reads
0 redo size
346 bytes sent via SQL*Net to client
471 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
--//2859366/1000= 2859.366
3.收尾:
--//取消break_poll_skip設定.
$ grep break sqlnet.ora
#break_poll_skip=1000
--//補充說明如果break_poll_skip設定10,100,使用strace跟蹤很慢.
--//break_poll_skip=100
SCOTT@78> @tpt/lotslios 1e8
generate lots of LIOs by repeatedly full scanning through a small table...
COUNT(*)
----------
100000000
Elapsed: 00:00:16.32
--//2859358 consistent gets
$ strace -cp 15165
Process 15165 attached - interrupt to quit
^CProcess 15165 detached
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00 0.000052 0 28487 poll
0.00 0.000000 0 5 read
0.00 0.000000 0 5 write
0.00 0.000000 0 1 lseek
0.00 0.000000 0 46 getrusage
0.00 0.000000 0 26 times
------ ----------- ----------- --------- --------- ----------------
100.00 0.000052 28570 total
轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/518784.html
標籤:Oracle
上一篇:MySQL資料庫-資料表(四)
