Tested version: FlinkX 1.10
Final test results:
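Parallelism is set via speed.channel in the job json file.
Taskmanager memory is set via taskmanager.memory.process.size in flink-conf.yaml.
The number of slots is set via taskmanager.numberOfTaskSlots in flink-conf.yaml.
With -confProp, only jobmanager.memory.mb took effect; no other setting passed that way had any effect.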
YARN configuration
yarn.scheduler.minimum-allocation-mb=1024
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-mb=12288
yarn.scheduler.maximum-allocation-vcores=12
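The YARN limits above bound what each Flink container can request. As a rough sketch, assuming the scheduler rounds every memory request up to a multiple of yarn.scheduler.minimum-allocation-mb and rejects requests above yarn.scheduler.maximum-allocation-mb (the exact rounding depends on which scheduler is in use and was not verified in these tests), the granted size can be estimated as follows. The function name normalize_request_mb is made up for illustration; the defaults mirror the 1024 MB (1G) minimum and 12288 MB (12G) maximum listed here.

```python
import math


def normalize_request_mb(requested_mb: int, min_mb: int = 1024, max_mb: int = 12288) -> int:
    """Estimate the container size granted for a memory request.

    Assumption: requests are rounded up to a multiple of min_mb and
    requests larger than max_mb are rejected.
    """
    if requested_mb > max_mb:
        raise ValueError(f"request of {requested_mb} MB exceeds the {max_mb} MB maximum")
    # Round up to the next multiple of min_mb, never going below min_mb itself.
    return max(min_mb, math.ceil(requested_mb / min_mb) * min_mb)


for request in (1024, 1500, 2048):
    print(request, "->", normalize_request_mb(request))
```

Under this assumption a 1024m jobmanager or taskmanager maps to exactly one 1G container, which is consistent with the memory totals reported in the tests below.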
Flink configuration
jobmanager.rpc.address: server001
jobmanager.rpc.port: 6123
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1024m
taskmanager.numberOfTaskSlots: 1
parallelism.default: 2
Test 1
mysql2hive.json configuration
"splitPk": "id"
"channel": 2
FlinkX job launch script
/usr/local/src/flinkx-1.10/bin/flinkx \
-mode yarnPer \
-job /usr/local/src/flinkx-1.10/job/mysql2hive.json \
-queue root.default
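Result: as expected. The mysql2hive job is configured with 2 channels for reading and writing, i.e. a parallelism of 2. With slot=1 that means 2 taskmanagers, so 3 containers (including the jobmanager), 3 CPU cores, and 3G of memory.
Open question: does FlinkX need a split key (splitPk) to be set before parallelism takes effect? The next test checks this.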
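The arithmetic behind the Test 1 result can be written down as a small estimator. This is only a sketch of the rule of thumb used in these notes, not FlinkX or Flink code: the names Estimate and estimate are invented for illustration, the extra container and core are attributed to the jobmanager, and rounding the taskmanager count up for channel values that do not divide evenly by the slot count is an assumption the tests here do not cover.

```python
import math
from dataclasses import dataclass


@dataclass
class Estimate:
    taskmanagers: int
    containers: int
    vcores: int
    memory_mb: int


def estimate(channel: int, slots_per_tm: int, jm_mb: int, tm_mb: int) -> Estimate:
    """Estimate YARN resources for a per-job FlinkX run."""
    # One taskmanager per `slots_per_tm` parallel channels (rounded up: assumption).
    taskmanagers = math.ceil(channel / slots_per_tm)
    return Estimate(
        taskmanagers=taskmanagers,
        containers=taskmanagers + 1,              # taskmanagers plus one jobmanager
        vcores=taskmanagers * slots_per_tm + 1,   # one core per slot plus one for the jobmanager
        memory_mb=jm_mb + taskmanagers * tm_mb,   # jobmanager heap plus all taskmanagers
    )


# Test 1: channel=2, slot=1, jobmanager 1024m, taskmanager 1024m
print(estimate(channel=2, slots_per_tm=1, jm_mb=1024, tm_mb=1024))
```

For the Test 1 settings this gives 2 taskmanagers, 3 containers, 3 cores, and 3072 MB, matching the observed 3G.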


Test 2: stream_stream.json
{
  "job" : {
    "content" : [ {
      "reader" : {
        "parameter" : {
          "column" : [ {
            "name" : "id",
            "type" : "id"
          }, {
            "name" : "string",
            "type" : "string"
          } ],
          "sliceRecordCount" : [ "10000" ]
        },
        "name" : "streamreader"
      },
      "writer" : {
        "parameter" : {
          "print" : true
        },
        "name" : "streamwriter"
      }
    } ],
    "setting" : {
      "speed" : {
        "channel" : 2
      }
    }
  }
}
FlinkX launch script
/usr/local/src/flinkx-1.10/bin/flinkx \
-mode yarnPer \
-job /usr/local/src/flinkx-1.10/docs/example/stream_stream.json \
-queue root.default
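I had assumed that only relational database sources could read data in parallel, but that is not the case.
Parallelism works as expected: number of taskmanagers = parallelism / slot = 2 / 1 = 2.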
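Since the parallelism comes from the job file rather than from flink-conf.yaml, it can be handy to read it straight out of the json before submitting. The sketch below uses only the Python standard library; the helper names read_channel and taskmanagers_needed are made up for illustration, the embedded job text is a trimmed stand-in for stream_stream.json, and falling back to 1 when speed.channel is missing is an assumption, not documented FlinkX behaviour.

```python
import json
import math

# Trimmed stand-in for stream_stream.json; only the fields used here are kept.
JOB_JSON = """
{
  "job": {
    "content": [ { "reader": { "name": "streamreader" },
                   "writer": { "name": "streamwriter" } } ],
    "setting": { "speed": { "channel": 2 } }
  }
}
"""


def read_channel(job_text: str, default: int = 1) -> int:
    """Return job.setting.speed.channel, or `default` if it is absent (assumed fallback)."""
    job = json.loads(job_text)
    speed = job.get("job", {}).get("setting", {}).get("speed", {})
    return int(speed.get("channel", default))


def taskmanagers_needed(channel: int, slots_per_tm: int) -> int:
    """taskmanagers = parallelism / slot, rounded up."""
    return math.ceil(channel / slots_per_tm)


channel = read_channel(JOB_JSON)
print(f"channel={channel} -> {taskmanagers_needed(channel, slots_per_tm=1)} taskmanagers")
```

With taskmanager.numberOfTaskSlots: 1 this reports 2 taskmanagers for the job above, the same count observed in Test 2.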

Test 3: continuing from the configuration above, change flink-conf.yaml to parallelism.default: 1 and taskmanager.numberOfTaskSlots: 2
Flink configuration
parallelism.default: 1
taskmanager.numberOfTaskSlots: 2
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1024m
FlinkX json configuration
"channel" : 2
FlinkX launch script
/usr/local/src/flinkx-1.10/bin/flinkx \
-mode yarnPer \
-job /usr/local/src/flinkx-1.10/docs/example/stream_stream.json \
-queue root.default
Expected: taskmanagers = parallelism / slot = 2 / 2 = 1, i.e. 2 containers; CPU cores = taskmanagers * slot + 1 = 1*2+1 = 3; memory = 2G.
Result: as expected.

Test 4: continuing from the configuration above, change speed.channel in the json to 4
Flink configuration
parallelism.default: 1
taskmanager.numberOfTaskSlots: 2
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 1024m
FlinkX json configuration
"speed" : { "channel" : 4 }
FlinkX launch script
/usr/local/src/flinkx-1.10/bin/flinkx \
-mode yarnPer \
-job /usr/local/src/flinkx-1.10/docs/example/stream_stream.json \
-queue root.default
Expected: taskmanagers = parallelism / slot = 4 / 2 = 2, i.e. 3 containers; CPU cores = taskmanagers * slot + 1 = 2*2+1 = 5; memory = jobmanager.heap + taskmanagers * taskmanager.memory = 1024 + 2*1024 = 3072m (3G).
Result: as expected.

Test 5: continuing from the configuration above, change taskmanager.memory.process.size to 2048m
Flink configuration
parallelism.default: 1
taskmanager.numberOfTaskSlots: 2
jobmanager.heap.size: 1024m
taskmanager.memory.process.size: 2048m
FlinkX json configuration
"speed" : { "channel" : 4 }
FlinkX launch script
/usr/local/src/flinkx-1.10/bin/flinkx \
-mode yarnPer \
-job /usr/local/src/flinkx-1.10/docs/example/stream_stream.json \
-queue root.default
Expected: taskmanagers = parallelism / slot = 4 / 2 = 2, i.e. 3 containers; CPU cores = taskmanagers * slot + 1 = 2*2+1 = 5; memory = jobmanager.heap + taskmanagers * taskmanager.memory = 1024 + 2*2048 = 5120m (5G).
Result: as expected.
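As a cross-check, the same rule of thumb can be replayed against the figures reported for each test. The table below copies the settings and observed values from Tests 1, 3, 4 and 5 (Test 2 only reported the taskmanager count, so it is left out); the predict helper is hypothetical and simply encodes the formulas used in these notes, with the jobmanager fixed at 1024m as configured.

```python
import math

JM_MB = 1024  # jobmanager.heap.size used in every test

# (test, channel, slots per taskmanager, taskmanager MB,
#  reported containers, reported cores, reported memory in GB)
RUNS = [
    ("Test 1", 2, 1, 1024, 3, 3, 3),
    ("Test 3", 2, 2, 1024, 2, 3, 2),
    ("Test 4", 4, 2, 1024, 3, 5, 3),
    ("Test 5", 4, 2, 2048, 3, 5, 5),
]


def predict(channel: int, slots: int, tm_mb: int) -> tuple:
    """Return (containers, cores, memory in GB) from the rule of thumb."""
    taskmanagers = math.ceil(channel / slots)
    containers = taskmanagers + 1
    cores = taskmanagers * slots + 1
    memory_gb = (JM_MB + taskmanagers * tm_mb) // 1024
    return containers, cores, memory_gb


for name, channel, slots, tm_mb, *reported in RUNS:
    predicted = predict(channel, slots, tm_mb)
    status = "match" if list(predicted) == reported else "MISMATCH"
    print(f"{name}: predicted={predicted} reported={tuple(reported)} {status}")
```

All four rows should come out as a match, which is just another way of saying that the results above behaved as expected for these particular settings; it says nothing about cases where channel is not a multiple of the slot count.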

The remaining problem to work on is how to make the taskmanager memory and slot settings passed via -confProp take effect.
