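Contents
- 1 Introducing and installing DataX
- 1.1 About DataX
- 1.2 Supported data sources
- 1.3 How it works
- 1.4 Installing DataX
- 2 Using DataX
- 2.1 streamTostream
- 2.1.1 Create the configuration file (JSON format)
- 2.1.2 Start DataX
- 2.1.3 Execution result
- 2.2 mysqlTohbase
- 2.2.1 Create the configuration file (JSON format)
- 2.2.2 Run DataX
- 2.2.3 Execution result
- 2.3 mysqlTohdfs
- 2.3.1 Create the configuration file
- 2.4 hbaseTomysql
- 2.4.1 Create the configuration file (as above)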
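1 Introducing and installing DataX
1.1 About DataX
DataX is an offline synchronization tool for heterogeneous data sources, open-sourced by Alibaba. It aims to provide stable, efficient data synchronization between heterogeneous sources such as relational databases (MySQL, Oracle, and so on), HDFS, Hive, ODPS, HBase, and FTP.
To solve the heterogeneous-synchronization problem, DataX turns a complex mesh of point-to-point sync links into a star-shaped topology, with DataX acting as the central transport that connects every data source. When a new data source needs to be added, it only has to be connected to DataX, and it can then synchronize seamlessly with all the sources that are already supported.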
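To make the mesh-versus-star argument concrete, the short Python sketch below counts how many direct source-to-target links n data sources would need if every pair were wired together, against how many plugins the star layout needs when each source contributes one reader and one writer. The function names are illustrative only and are not part of DataX.

```python
def mesh_links(n: int) -> int:
    """Directed source -> target links when every pair is wired directly."""
    return n * (n - 1)


def star_plugins(n: int) -> int:
    """One reader plugin plus one writer plugin per data source."""
    return 2 * n


for n in (3, 6, 10):
    print(n, mesh_links(n), star_plugins(n))
```

The mesh count grows quadratically while the star count grows linearly, which is why adding one more source to a star layout only costs one new reader/writer pair.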

1.2 Supported data sources

1.3 How it works

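1.4 Installing DataX
Download: http://datax-opensource.oss-cn-hangzhou.aliyuncs.com/datax.tar.gz
Source code: https://github.com/alibaba/DataX
DataX does not depend on any other service: upload the archive, extract it, and configure the environment variables.
It can also be extracted and used directly on Windows.
# Extract to /usr/local/soft/
tar -zxvf datax.tar.gz -C /usr/local/soft/
# Configure the environment variables (for example, add DataX's bin directory to PATH so that datax.py can be run directly)
vim /etc/profile
# Reload the profile
source /etc/profile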

2 Using DataX
2.1 streamTostream
Read data from a stream and print it to the console.
2.1.1 Create the configuration file (JSON format)
# stream2stream.json
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "streamreader",
          "parameter": {
            "sliceRecordCount": 10,
            "column": [
              {
                "type": "long",
                "value": "10"
              },
              {
                "type": "string",
                "value": "hello,你好,世界-DataX"
              }
            ]
          }
        },
        "writer": {
          "name": "streamwriter",
          "parameter": {
            "encoding": "UTF-8",
            "print": true
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  }
}
2.1.2 Start DataX
datax.py stream2stream.json
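A job file that is not valid JSON cannot be loaded, so it can be worth parsing it before launching. The sketch below is a minimal pre-check written for this article: `load_job` is a hypothetical helper, not a DataX API, and it only verifies the `job.content[].reader/writer` layout used by the configuration files here.

```python
import json


def load_job(text: str) -> dict:
    """Parse a job definition and check the layout used in this article."""
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed
    # input, for example a trailing comma inside an object.
    job = json.loads(text)
    for entry in job["job"]["content"]:
        if "reader" not in entry or "writer" not in entry:
            raise ValueError("each content entry needs a reader and a writer")
    return job


sample = '{"job": {"content": [{"reader": {}, "writer": {}}]}}'
print(sorted(load_job(sample)["job"].keys()))
```

In practice the text would come from reading the job file, for example `load_job(open("stream2stream.json", encoding="utf-8").read())`.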
2.1.3 Execution result
2021-12-07 20:08:30.673 [job-0] INFO JobContainer -
Job start time : 2021-12-07 20:08:20
Job end time : 2021-12-07 20:08:30
Total elapsed time : 10s
Average throughput : 95B/s
Record write speed : 5rec/s
Total records read : 50
Total read/write failures : 0
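The totals in this run can be reproduced from the configuration with a little arithmetic. The sketch below assumes that `sliceRecordCount` applies to each channel separately; that reading is inferred from the numbers in this run rather than taken from the streamreader source.

```python
slice_record_count = 10  # "sliceRecordCount" in the reader parameters
channels = 5             # "channel" under setting.speed
elapsed_seconds = 10     # total elapsed time reported by the run

total_records = slice_record_count * channels
records_per_second = total_records // elapsed_seconds

print(total_records, records_per_second)  # 50 5
```

Both values agree with the 50 records and 5rec/s reported in the result.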
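2.2 mysqlTohbase
Create the student database and the student table in MySQL beforehand.
Create the datax_test table in HBase beforehand, with the cf1 column family used by the writer.
If your database and table names differ, change the corresponding parameters accordingly.
2.2.1 Create the configuration file (JSON format)
# mysqlTohbase.json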
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "123456",
            "column": [
              "id",
              "name",
              "age",
              "gender",
              "clazz",
              "last_mod"
            ],
            "splitPk": "id",
            "connection": [
              {
                "table": [
                  "student"
                ],
                "jdbcUrl": [
                  "jdbc:mysql://master:3306/student?useSSL=false&characterEncoding=utf8"
                ]
              }
            ]
          }
        },
        "writer": {
          "name": "hbase11xwriter",
          "parameter": {
            "hbaseConfig": {
              "hbase.zookeeper.quorum": "master:2181,node1:2181,node2:2181"
            },
            "table": "datax_test",
            "mode": "normal",
            "rowkeyColumn": [
              {
                "index": 0,
                "type": "string"
              }
            ],
            "column": [
              {
                "index": 1,
                "name": "cf1:name",
                "type": "string"
              },
              {
                "index": 2,
                "name": "cf1:age",
                "type": "string"
              },
              {
                "index": 3,
                "name": "cf1:gender",
                "type": "string"
              },
              {
                "index": 4,
                "name": "cf1:clazz",
                "type": "string"
              }
            ],
            "versionColumn": {
              "index": 5
            },
            "encoding": "utf-8"
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  }
}
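The `index` values in the writer refer to positions in the reader's `column` list: index 0 is `id`, index 1 is `name`, and so on up to index 5 for `last_mod`. The sketch below illustrates that mapping on a made-up sample row. `to_hbase_put` is a hypothetical helper written for this article and is not how hbase11xwriter is implemented.

```python
def to_hbase_put(record, rowkey_index, columns, version_index):
    """Map one reader record (a list of values) to a rowkey, cells and a version."""
    cells = {col["name"]: str(record[col["index"]]) for col in columns}
    return {
        "rowkey": str(record[rowkey_index]),
        "cells": cells,
        "version": record[version_index],
    }


# Same indexes and names as the writer's "column" block.
columns = [
    {"index": 1, "name": "cf1:name"},
    {"index": 2, "name": "cf1:age"},
    {"index": 3, "name": "cf1:gender"},
    {"index": 4, "name": "cf1:clazz"},
]

# Made-up row in reader order: id, name, age, gender, clazz, last_mod.
record = [1001, "Alice", 20, "F", "class-1", "2021-12-07 20:00:00"]

put = to_hbase_put(record, rowkey_index=0, columns=columns, version_index=5)
print(put["rowkey"], put["cells"], put["version"])
```

If the reader's column order changes, every `index` in the writer has to be updated to match.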
2.2.2 Run DataX
datax.py mysqlTohbase.json
2.2.3 Execution result
2021-12-07 20:51:14.214 [job-0] INFO JobContainer -
Job start time : 2021-12-07 20:51:03
Job end time : 2021-12-07 20:51:14
Total elapsed time : 11s
Average throughput : 4.30KB/s
Record write speed : 100rec/s
Total records read : 1000
Total read/write failures : 0
2.3 mysqlTohdfs
2.3.1 Create the configuration file
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "username": "root",
            "password": "123456",
            "column": [
              "id",
              "name",
              "age",
              "gender",
              "clazz",
              "last_mod"
            ],
            "splitPk": "age",
            "connection": [
              {
                "table": [
                  "student"
                ],
                "jdbcUrl": [
                  "jdbc:mysql://master:3306/student"
                ]
              }
            ]
          }
        },
        "writer": {
          "name": "hdfswriter",
          "parameter": {
            "defaultFS": "hdfs://master:9000",
            "fileType": "text",
            "path": "/user/hive/warehouse/datax.db/students",
            "fileName": "student",
            "column": [
              {
                "name": "id",
                "type": "bigint"
              },
              {
                "name": "name",
                "type": "string"
              },
              {
                "name": "age",
                "type": "INT"
              },
              {
                "name": "gender",
                "type": "string"
              },
              {
                "name": "clazz",
                "type": "string"
              },
              {
                "name": "last_mod",
                "type": "string"
              }
            ],
            "writeMode": "append",
            "fieldDelimiter": ","
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  }
}
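The `splitPk` field tells mysqlreader which column to shard the table on so that several channels can read in parallel. The sketch below shows the general idea of cutting a key range into contiguous pieces. It is a simplified illustration with a hypothetical `split_ranges` helper; the real plugin chooses its own number of shards and its own boundaries.

```python
def split_ranges(min_pk: int, max_pk: int, parts: int):
    """Split the closed interval [min_pk, max_pk] into contiguous closed ranges."""
    total = max_pk - min_pk + 1
    parts = max(1, min(parts, total))  # never create empty ranges
    step, extra = divmod(total, parts)
    ranges, start = [], min_pk
    for i in range(parts):
        size = step + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


# A dense key such as id 1..1000 splits evenly across 5 readers.
print(split_ranges(1, 1000, 5))
# A narrow column such as ages 18..24 leaves only a few distinct values per range.
print(split_ranges(18, 24, 5))
```

A column with few distinct values, such as `age`, tends to give shards of very different sizes, so a primary key like `id` (used in the mysqlTohbase job) is usually the safer choice for `splitPk`.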
2.4 hbaseTomysql
2.4.1 Create the configuration file (as above)
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "hbase11xreader",
          "parameter": {
            "hbaseConfig": {
              "hbase.zookeeper.quorum": "master:2181"
            },
            "table": "student",
            "encoding": "utf-8",
            "mode": "normal",
            "column": [
              {
                "name": "rowkey",
                "type": "string"
              },
              {
                "name": "cf1:name",
                "type": "string"
              },
              {
                "name": "cf1:age",
                "type": "string"
              },
              {
                "name": "cf1:gender",
                "type": "string"
              },
              {
                "name": "cf1:clazz",
                "type": "string"
              }
            ],
            "range": {
              "startRowkey": "",
              "endRowkey": "",
              "isBinaryRowkey": false
            }
          }
        },
        "writer": {
          "name": "mysqlwriter",
          "parameter": {
            "writeMode": "insert",
            "username": "root",
            "password": "123456",
            "column": [
              "id",
              "name",
              "age",
              "gender",
              "clazz"
            ],
            "preSql": [
              "truncate student2"
            ],
            "connection": [
              {
                "jdbcUrl": "jdbc:mysql://master:3306/student2?useUnicode=true&characterEncoding=utf8",
                "table": [
                  "student2"
                ]
              }
            ]
          }
        }
      }
    ],
    "setting": {
      "speed": {
        "channel": 5
      }
    }
  }
}
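Reader and writer columns are matched by position, so the first reader column (`rowkey`) lands in the first writer column (`id`), and so on. The sketch below pairs the two lists from this job and builds the rough shape of the statement that `"writeMode": "insert"` implies. `insert_template` is a hypothetical helper, and the exact SQL that mysqlwriter generates may differ.

```python
reader_columns = ["rowkey", "cf1:name", "cf1:age", "cf1:gender", "cf1:clazz"]
writer_columns = ["id", "name", "age", "gender", "clazz"]


def insert_template(table, columns):
    """Build a parameterized INSERT statement for the given columns."""
    column_list = ", ".join(columns)
    placeholders = ", ".join(["?"] * len(columns))
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


# The two lists must have the same length for a positional mapping to make sense.
assert len(reader_columns) == len(writer_columns)

mapping = list(zip(reader_columns, writer_columns))
for source, target in mapping:
    print(f"{source} -> {target}")

print(insert_template("student2", writer_columns))
```

Because `preSql` truncates student2 before each run, the job replaces the table contents rather than appending to them.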
