由于出現ora-00020超出最大行程數的問題,修改了processes數為1500
之后資料庫都正常,但是最近發現在udump的幾乎所有的trc檔案中,都有5-6個ORA-00020的報錯,而在alert檔案中沒有報錯,資料庫看著正常,只是似乎感覺有些慢,請各位看看是什么原因?
ORA-00020: maximum number of processes 150 exceeded
/export/home/oracle/admin/orcl/udump/orcl_ora_21259.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /export/home/oracle/product/10.0
System name: SunOS
Node name: EEE-DataBase
Release: 5.10
Version: Generic_148888-01
Machine: sun4v
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 0
Unix process pid: 21259, image: oracle@EEE-DataBase
SQL> show parameter process;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
aq_tm_processes integer 0
db_writer_processes integer 16
gcs_server_processes integer 0
job_queue_processes integer 10
log_archive_max_processes integer 2
processes integer 1500
SQL> select count(*) from v$process;
COUNT(*)
----------
256
uj5u.com熱心網友回復:
你感覺出問題了
uj5u.com熱心網友回復:
應該還是有問題的,但是無從下手了uj5u.com熱心網友回復:
有個人,沒病,他去找醫生看病,醫生說他沒病,但是他還是說自己有病,結果會怎樣?
A、醫生把他攆出來了;
B、醫生把它攆出來了,他轉頭捅死了醫生
C、他一直感覺自己有病,但所有醫生都說他沒病,然后有一天他瘋了,真病了……



好吧,說正經的:
你說你感覺資料庫慢?慢在哪里?慢在某個功能點?還是所有操作上?慢很很多嗎?比如原來跑了100毫秒的SQL,現在跑了200毫秒?資料庫看著正常?你為什么覺得他正常?感覺它正常?然后感覺它又有點慢?
1、如果你覺得整個庫都慢,那么提供:你覺得資料庫慢時段資料庫的AWR報告;
2、如果你覺得某個功能點慢,那么找出這個功能點跑的慢的SQL,提供這個SQL的執行計劃。
至少先有了這些資訊再提性能問題,上面雖說是玩笑,但有一點與醫生看病是異曲同工的:問題不是猜出來的、感受出來的——談戀愛的時候問題大概是這么來的
,而是需要檢查,需要有憑有據。BTW:行程數這種東西,即使真的出現與之相關的問題,它也只會是表象——比如行程數爆了,不會是問題的根源。
uj5u.com熱心網友回復:
說實話,感覺資料可以正常也就是沒發現無故宕機,看了alert檔案,沒有致命的錯誤而已。資料庫服務器是新換的,cpu和記憶體比原來強一倍,但是用戶反映比以前慢。alert檔案:
Fri Nov 17 15:04:16 2017
Thread 1 advanced to log sequence 17463 (LGWR switch)
Current log# 9 seq# 17463 mem# 0: /oradata/orcl/redo09.log
Fri Nov 17 15:09:46 2017
Thread 1 advanced to log sequence 17464 (LGWR switch)
Current log# 10 seq# 17464 mem# 0: /oradata/orcl/redo10.log
Fri Nov 17 15:16:54 2017
Thread 1 advanced to log sequence 17465 (LGWR switch)
Current log# 1 seq# 17465 mem# 0: /oradata/orcl/redo01.log
Fri Nov 17 15:19:11 2017
Thread 1 advanced to log sequence 17466 (LGWR switch)
Current log# 2 seq# 17466 mem# 0: /oradata/orcl/redo02.log
Fri Nov 17 15:19:29 2017
Thread 1 advanced to log sequence 17467 (LGWR switch)
Current log# 3 seq# 17467 mem# 0: /oradata/orcl/redo03.log
Fri Nov 17 15:19:52 2017
Thread 1 advanced to log sequence 17468 (LGWR switch)
Current log# 4 seq# 17468 mem# 0: /oradata/orcl/redo04.log
Fri Nov 17 15:20:39 2017
Thread 1 advanced to log sequence 17469 (LGWR switch)
Current log# 5 seq# 17469 mem# 0: /oradata/orcl/redo05.log
Fri Nov 17 15:22:36 2017
ORA-00060: Deadlock detected. More info in file /export/home/oracle/admin/orcl/udump/orcl
_ora_25025.trc.
Fri Nov 17 15:25:18 2017
Thread 1 advanced to log sequence 17470 (LGWR switch)
Current log# 11 seq# 17470 mem# 0: /oradata/orcl/redo11.log
Fri Nov 17 15:31:22 2017
Thread 1 advanced to log sequence 17471 (LGWR switch)
Current log# 12 seq# 17471 mem# 0: /oradata/orcl/redo12.log
Fri Nov 17 15:38:24 2017
Thread 1 advanced to log sequence 17472 (LGWR switch)
Current log# 6 seq# 17472 mem# 0: /oradata/orcl/redo06.log
Fri Nov 17 15:47:12 2017
Thread 1 advanced to log sequence 17473 (LGWR switch)
Current log# 7 seq# 17473 mem# 0: /oradata/orcl/redo07.log
Fri Nov 17 15:52:40 2017
Thread 1 advanced to log sequence 17474 (LGWR switch)
Current log# 8 seq# 17474 mem# 0: /oradata/orcl/redo08.log
Fri Nov 17 15:54:17 2017
ORA-00060: Deadlock detected. More info in file /export/home/oracle/admin/orcl/udump/orcl
_ora_7809.trc.
Fri Nov 17 15:57:15 2017
Thread 1 advanced to log sequence 17475 (LGWR switch)
Current log# 9 seq# 17475 mem# 0: /oradata/orcl/redo09.log
Fri Nov 17 16:01:30 2017
Thread 1 advanced to log sequence 17476 (LGWR switch)
Current log# 10 seq# 17476 mem# 0: /oradata/orcl/redo10.log
報死鎖的那個trc:
/export/home/oracle/admin/orcl/udump/orcl_ora_7809.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /export/home/oracle/product/10.0
System name: SunOS
Node name: EAS-DataBase
Release: 5.10
Version: Generic_148888-01
Machine: sun4v
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 0
Unix process pid: 7809, image: oracle@EAS-DataBase
ORA-00020: maximum number of processes 150 exceeded
/export/home/oracle/admin/orcl/udump/orcl_ora_7809.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /export/home/oracle/product/10.0
System name: SunOS
Node name: EAS-DataBase
Release: 5.10
Version: Generic_148888-01
Machine: sun4v
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 0
Unix process pid: 7809, image: oracle@EAS-DataBase
ORA-00020: maximum number of processes 150 exceeded
/export/home/oracle/admin/orcl/udump/orcl_ora_7809.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
ORACLE_HOME = /export/home/oracle/product/10.0
System name: SunOS
Node name: EAS-DataBase
Release: 5.10
Version: Generic_148888-01
Machine: sun4v
Instance name: orcl
Redo thread mounted by this instance: 1
Oracle process number: 0
Unix process pid: 7809, image: oracle@EAS-DataBase
ORA-00020: maximum number of processes 150 exceeded
/export/home/oracle/admin/orcl/udump/orcl_ora_7809.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
"/export/home/oracle/admin/orcl/udump/orcl_ora_7809.trc" 15049 行,842130 字符
詳細的trc檔案和awr我上傳在這里了,麻煩給看下?
https://cloud.ciscol.com/nextcloud/index.php/s/M4UxYcI6WLgX2ky
uj5u.com熱心網友回復:
好久沒看到solaris了~這個系統好像有大量的臨時表在創建,在洗掉?插入這些表或者參考到這些表的時候,搞出了很多文本不太一樣的SQL來,可能因為設計問題,其他也有一些SQL也不方便使用系結變數,決議看來是有點壓力的,共享池還撐到了18g,比buffer cache大,但是300次/s軟決議,30次/s的硬決議,應該還算過得去,不知道為啥決議時間占DB TIME的比重會那么高,高達30%,有沒有同時段作業系統的性能監控,CPU的使用率是不是比較高?或者說這服務器上是不是有其他應用在與資料庫爭搶資源?
還有,資料庫1/10的DB TIME是TX鎖爭用,結合alert日志經常報出ORA-60,看來應用的邏輯是有點問題的,這個包括死鎖,你們得自己查程式了,也許優化流程中的TOP SQL可能一定程度上改善鎖爭用的現象。
uj5u.com熱心網友回復:
接上:buffer cache的命中率也比較低,看來buffer cache確實被擠兌得有點慘,記憶體的設定也許也該關注下,看看是不是主機還有足夠的物理記憶體擴充buffer cache,或者說SGAuj5u.com熱心網友回復:
主機占用平時都不高,記憶體64G的,應用只有oracle,設定了sga_target=32G,但是原來的系統只有32G記憶體,才給了oracle20G,程式都一樣,切實代碼有待優化,但使用者感覺比原系統慢。仍然對trc檔案里的processor耿耿于懷,感覺是不是哪里沒設定好?
-bash-3.2$ vmstat 2
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr s5 s6 sd sd in sy cs us sy id
0 0 0 49316536 22549088 2614 202 20577 1 1 0 0 -0 4 -1 -1 3223 8294 3308 1 0 98
0 0 0 48328040 20626272 237 871 76 4 4 0 0 0 2 0 0 3229 11172 3695 1 0 99
0 0 0 48327272 20624968 19 88 12 0 0 0 0 0 0 0 0 2352 8449 2068 1 0 98
0 0 0 48327944 20625128 19 22 12 0 0 0 0 0 0 0 0 1394 2152 1044 0 0 100
0 0 0 48324840 20621768 187 532 12 0 0 0 0 0 0 0 0 1981 5092 1880 1 0 99
oracle引數:
System parameters with non-default values:
processes = 1500
__shared_pool_size = 18622709760
__large_pool_size = 16777216
__java_pool_size = 117440512
__streams_pool_size = 16777216
sga_target = 34359738368
control_files = /oradata/orcl/control01.ctl, /oradata/orcl/control02.ctl, /o
radata/orcl/control03.ctl
db_block_size = 8192
__db_cache_size = 15435038720
compatible = 10.2.0.3.0
log_archive_format = %t_%s_%r.dbf
db_file_multiblock_read_count= 16
db_recovery_file_dest = /oradata/flash_recovery_area
db_recovery_file_dest_size= 1649267441664
undo_management = AUTO
undo_tablespace = UNDOTBS1
undo_retention = 32400
remote_login_passwordfile= EXCLUSIVE
db_domain =
dispatchers = (PROTOCOL=TCP) (SERVICE=orclXDB)
job_queue_processes = 10
background_dump_dest = /export/home/oracle/admin/orcl/bdump
user_dump_dest = /export/home/oracle/admin/orcl/udump
core_dump_dest = /export/home/oracle/admin/orcl/cdump
audit_file_dest = /export/home/oracle/admin/orcl/adump
db_name = orcl
open_cursors = 300
pga_aggregate_target = 6830424064
uj5u.com熱心網友回復:
主機還有足夠的物理記憶體擴充buffer cache,或者說SGA,有什么建議嗎?uj5u.com熱心網友回復:
那個trc里的我覺得也許是個巧合:以前有過同樣行程號的行程,曾經報過這樣的錯誤,后面剛好有個爆出ORA-60的行程講資訊也寫到那個trc檔案去了,否則好像找不到其他理由了,除非你修改processes后沒有重啟,但看你上面貼出的引數應該是alert日志中重啟的時候顯示的引數,說明你已經重啟并生效的,所以我只能想到有行程號重復的這種情況。如果優化程式很實作(因為看來這是jindie的ERP系統),也許物件的統計資訊可以檢查下是不是沒有跟上資料的變化?如果SQL優化或者程式優化短時間內沒法做,可以考慮增大SGA看看情況。
uj5u.com熱心網友回復:
不過最好根據原來系統穩定運行時候的共享池和buffer cache的情況,給現在的系統中的這兩個引數也設定個最小值,也就是shared_pool_size和db_cache_size
uj5u.com熱心網友回復:
另外,我仔細看了下你那個vmstat的輸出,好像記憶體曾經爆掉過?為什么會有那么多持續的pi?也就是有大量資料持續從swap往物理記憶體移動,如果是這樣子的話,調大SGA也許還不是個好主意了,有可能加劇swap的活動,或者迫不得已增大SGA后,還得時刻觀察系統的記憶體使用情況。uj5u.com熱心網友回復:
謝了,確實是jindie的ERP系統,sql優化估計短時間做不了,有時間再調一下sga吧,那個ora-00020的報錯真的很奇怪,沒辦法只能先無視了。uj5u.com熱心網友回復:
物件的統計資訊也可以先查一下,因為有些看上去像是統計的SQL執行時間還是比較長的,如果有必要的話要收集物件的統計資訊,以得到更好的執行計劃。
plsql dev如果平常登錄也慢的話,考慮收集下資料字典的統計資訊,如果平常可以,那么可能是因為高峰期資源被占用的厲害導致它也受累了。
uj5u.com熱心網友回復:
對了,鑒于這個系統可能有大量的臨時表在創建生成,這些物件的統計資訊可以暫時不予理會,因為收集的時候萬一表里沒資料,那么樂子就大了,很可能有反效果。另外,是不是同時做了遷移(遷移看來可能性很大)或者升級?從10.2.0.1升級到了10.2.0.4?我印象中好像我們單位也折騰過升級……
uj5u.com熱心網友回復:
http://www.xifenfei.com/2017/11/dbconsole-ora-00020.html轉載請註明出處,本文鏈接:https://www.uj5u.com/shujuku/79232.html
標籤:高級技術
下一篇:資料庫專案設計實作
