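I am trying to load a table with about 200 million records from Oracle into Postgres using SSIS. Oracle, Postgres, and SSIS are on separate servers.
Reading data from Oracle
To read data from the Oracle database, I am using an OLE DB connection that uses "Oracle Provider for OLE DB". The OLE DB source is configured to read data using an SQL command.
There are 44 columns in total, mostly varchar, with 11 numeric and 3 timestamp columns.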


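Loading data into Postgres
To load data into Postgres, I am using an ODBC connection. The ODBC destination component is configured to load data in batch mode (not row-by-row inserts).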



SSIS configuration
I created an SSIS package that contains only a simple Data Flow Task.



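Issue
The load seems to take many hours to reach even one million rows. The source query returns results quickly when executed in SQL Developer, but when I tried to export the results, it threw a limit-exceeded error.
In SSIS, when I try to preview the results of the source SQL command, it returns: The system cannot find message text for message number 0x80040e51 in the message file for OraOLEDB. (OraOLEDB)
Note that neither the source (SQL command) nor the target table has any indexes.
Could you suggest any ways to improve the load performance?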
Answer:
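I will try to give some tips to help you improve your package performance. You should start by troubleshooting the package systematically to find the performance bottleneck.
Some of the provided links are related to SQL Server. No worries! The same rules apply to other database management systems.
1. Available resources
First, you should make sure that you have sufficient resources to load the data from the source server into the destination server.
Make sure that the available memory on the source, ETL, and destination servers can handle the amount of data you are trying to load. Also, make sure that your network bandwidth is not limiting the data transfer performance.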
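As a rough way to sanity-check the network side, the sketch below estimates a best-case transfer time from the row count, an average row size, and a link speed. Only the 200 million row count comes from the question; the 500-byte average row size and the 1 Gbit/s link are assumptions for illustration, and the helper name `estimate_transfer_hours` is hypothetical.

```python
def estimate_transfer_hours(row_count, avg_row_bytes, bandwidth_mbit_per_s):
    """Return the best-case hours needed to push the data over one network link."""
    total_bytes = row_count * avg_row_bytes
    # Convert megabits per second to bytes per second.
    bytes_per_second = bandwidth_mbit_per_s * 1_000_000 / 8
    return total_bytes / bytes_per_second / 3600


ROW_COUNT = 200_000_000   # from the question
AVG_ROW_BYTES = 500       # assumption: measure the real average row size
BANDWIDTH_MBIT = 1000     # assumption: a 1 Gbit/s link between servers

hours = estimate_transfer_hours(ROW_COUNT, AVG_ROW_BYTES, BANDWIDTH_MBIT)
total_gb = ROW_COUNT * AVG_ROW_BYTES / 1e9
print(f"{total_gb:.0f} GB to move, at least {hours:.2f} hours on the wire")
```

If the observed load time is far above an estimate like this, the network is probably not the main bottleneck and the source read, the buffers, or the destination writes deserve a closer look.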
Data flow task buffer size/limits
With SSIS, data is held in memory while it is transferred from the source to the destination. Two properties of the data flow task specify how much data is held in the memory buffers used by the SSIS pipeline.

Based on the following Integration Services performance tuning white paper:
DefaultMaxBufferRows – DefaultMaxBufferRows is a configurable setting of the SSIS Data Flow task that is automatically set at 10,000 records. SSIS multiplies the Estimated Row Size by the DefaultMaxBufferRows to get a rough sense of your dataset size per 10,000 records. You should not configure this setting without understanding how it relates to DefaultMaxBufferSize.
DefaultMaxBufferSize – DefaultMaxBufferSize is another configurable setting of the SSIS Data Flow task. The DefaultMaxBufferSize is automatically set to 10 MB by default. As you configure this setting, keep in mind that its upper bound is constrained by an internal SSIS parameter called MaxBufferSize which is set to 100 MB and can not be changed.
Try changing these values and measure the package performance after each change until you find a combination that improves it.
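The relationship between the two properties can be sketched with a little arithmetic. The function below is a simplified model, not the exact SSIS engine logic: it assumes a buffer holds at most DefaultMaxBufferRows rows and at most DefaultMaxBufferSize bytes, whichever limit is hit first. The 2,000-byte estimated row size is an assumption, and the name `rows_per_buffer` is hypothetical.

```python
def rows_per_buffer(estimated_row_bytes,
                    default_max_buffer_rows=10_000,
                    default_max_buffer_size=10 * 1024 * 1024):
    """Simplified model: rows in one buffer, capped by both settings."""
    rows_that_fit = default_max_buffer_size // estimated_row_bytes
    return min(default_max_buffer_rows, rows_that_fit)


# Assumption: a wide row of about 2,000 bytes (44 mostly-varchar columns).
print(rows_per_buffer(2000))
# Narrow rows: the row-count setting is the limiting factor instead.
print(rows_per_buffer(500))
# Raising both settings lets each buffer carry more wide rows.
print(rows_per_buffer(2000, 50_000, 100 * 1024 * 1024))
```

In this model, wide rows fill the default 10 MB buffer before the 10,000-row limit is reached, so raising DefaultMaxBufferRows alone would change nothing; the two settings need to be tuned together.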
4. Destination
Indexes/Triggers/Constraints
You should make sure that the destination table does not have any constraints or triggers, since they can significantly decrease the data load performance: each inserted batch has to be validated or preprocessed before it is stored.
- Are SQL Server database triggers evil?
- The benefits, costs, and documentation of database constraints
Besides, the more indexes the table has, the lower the data load performance.
ODBC Destination custom properties
The ODBC destination has several custom properties that can affect the data load performance, such as BatchSize (rows per batch) and TransactionSize (only available in the advanced editor).
- ODBC connection is very slow
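To illustrate how a batch size and a transaction size interact during a load, here is a minimal Python sketch. It is not SSIS or ODBC code: it uses the standard-library `sqlite3` module purely as a stand-in for the Postgres target, and the table `target`, the function `load_in_batches`, and the chosen sizes are all hypothetical.

```python
import sqlite3
from itertools import islice


def load_in_batches(conn, rows, batch_size, transaction_size):
    """Insert rows batch by batch, committing roughly every transaction_size rows.

    Returns the number of commits issued.
    """
    cursor = conn.cursor()
    iterator = iter(rows)
    uncommitted = 0
    commits = 0
    while True:
        # Pull the next batch_size rows from the source stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        cursor.executemany("INSERT INTO target (id, name) VALUES (?, ?)", batch)
        uncommitted += len(batch)
        if uncommitted >= transaction_size:
            conn.commit()
            commits += 1
            uncommitted = 0
    if uncommitted:
        # Commit whatever is left after the last full transaction.
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")  # stand-in for the real destination
conn.execute("CREATE TABLE target (id INTEGER, name TEXT)")
conn.commit()

rows = ((i, f"row-{i}") for i in range(10_000))
commits = load_in_batches(conn, rows, batch_size=1_000, transaction_size=5_000)
loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(loaded, commits)
```

Larger batches mean fewer round trips to the destination, while the transaction size controls how much work is committed at once. Both are worth varying and measuring, in the same way as the buffer settings.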
