Hive動態磁區insert 資料寫入HDFS時磁區欄位為NULL失敗-有解無憂

問題描述：
將全國資訊按照省級單位動態磁區，然后設定分桶，在最后將結果寫入HDFS檔案系統中失敗，只生成了一個磁區

建表陳述句：
create table position_fenqufentong (id int,city_id bigint,city_name string,county_id bigint,county_name string,town_id bigint,town_name string,village_id bigint,village_name string) partitioned by (province_name string) clustered by (city_name) sorted by(city_id) into 3 buckets row format delimited fields terminated by '\t' collection items terminated by '\t' lines terminated by '\n' stored as textfile location '/user/hive/warehouse/chinamap.db/fenqufentong_position';

裝載資料
from position_all insert into  table position_fenqufentong partition(province_name) select id,city_id,city_name,county_id,county_name,town_id,town_name,village_id,village_name,province_name;

運行結果
Query ID = hadoop_20181205104902_9ed34916-2369-46fa-9186-41a3d9f83911
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1543976396107_0001, Tracking URL = http://hadoop3:8088/proxy/application_1543976396107_0001/
Kill Command = /opt/modules/hadoop-2.6.5/bin/hadoop job  -kill job_1543976396107_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 3
2018-12-05 10:49:43,280 Stage-1 map = 0%,  reduce = 0%
2018-12-05 10:50:17,859 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 9.04 sec
2018-12-05 10:50:18,908 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 9.09 sec
2018-12-05 10:51:19,553 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 9.09 sec
2018-12-05 10:51:56,830 Stage-1 map = 100%,  reduce = 22%, Cumulative CPU 10.1 sec
2018-12-05 10:52:04,219 Stage-1 map = 100%,  reduce = 44%, Cumulative CPU 11.81 sec
2018-12-05 10:52:12,488 Stage-1 map = 100%,  reduce = 56%, Cumulative CPU 19.93 sec
2018-12-05 10:52:25,281 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 28.27 sec
2018-12-05 10:52:39,520 Stage-1 map = 100%,  reduce = 78%, Cumulative CPU 29.07 sec
2018-12-05 10:52:41,611 Stage-1 map = 100%,  reduce = 89%, Cumulative CPU 29.78 sec
2018-12-05 10:53:12,294 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 37.19 sec
MapReduce Total cumulative CPU time: 37 seconds 190 msec
Ended Job = job_1543976396107_0001
Loading data to table chinamap.position_fenqufentong partition (province_name=null)
Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter partition. alter is not possible
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 3   Cumulative CPU: 38.31 sec   HDFS Read: 82878437 HDFS Write: 72514675 SUCCESS
Total MapReduce CPU Time Spent: 38 seconds 310 msec

uj5u.com熱心網友回復：

磁區欄位還敢給null，你心也是夠大。IFNULL(partition_column,'未知') 應該不難吧

uj5u.com熱心網友回復：

動態磁區磁區欄位是根據指定的province_name動態生成的，資料源position_all表中的province_name欄位都是非空的，所以不知道這個null是從哪里來的

uj5u.com熱心網友回復：

請問如何解決Unable to alter partition. alter is not possible，我是用spark寫入磁區表，相同的資料執行了兩次，第二次就報這個錯。

轉載請註明出處，本文鏈接：https://www.uj5u.com/qita/269914.html

標籤：分布式計算/Hadoop

上一篇：衛星資料傳輸模塊

下一篇：python呼叫類提示object()不接受任何引數如如何解決