關于spark與cassandra結合使用的問題!!官網案列跑不通!!!!!
直接貼代碼,我基本上都是按照官網案列來的!!!如下:分不夠,只剩下這么點了 ....望大家幫幫我
maven依賴:
<!-- spark java 先注釋 -->

<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.10</artifactId>
<version>1.6.0-M2</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector-java_2.10</artifactId>
<version>1.6.0-M1</version>
</dependency>
<!-- spark java 結束 -->
<!-- spark core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.1</version>
</dependency>
<!-- spark core 結束 -->
<!-- cassandra驅動 -->
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-core</artifactId>
<version>3.0.1</version>
</dependency>
<!-- cassandra物體物件映射相關連的包 -->
<dependency>
<groupId>com.datastax.cassandra</groupId>
<artifactId>cassandra-driver-mapping</artifactId>
<version>3.0.1</version>
</dependency>
然后直接就是測驗代碼了:
/**
* 獲取連接
*/
public static JavaSparkContext getConnection() {
// 獲取連接方式
SparkConf conf = new SparkConf(true).setAppName("spark and cassandra")
//.set("spark.testing.memory", "2147480000")//分配記憶體,記憶體不足512M
.set("spark.cassandra.connection.host", "192.168.1.13");
JavaSparkContext sc = new JavaSparkContext("spark://192.168.1.13:7077", "SparkOptionCassandra1", conf);
System.out.println(sc.master() + " : " + sc.appName());
return sc;
}
/**
* spark讀取cassandra表資料 22222
*/
public static void getDataFromCassandra() {
JavaSparkContext sc = getConnection();
try {
JavaRDD<String> cassandraRowsRDD = javaFunctions(sc).cassandraTable("xmmsg", "people")
.map(new Function<CassandraRow, String>() {
public String call(CassandraRow cassandraRow) throws Exception {
return cassandraRow.toString();
}
});
System.out.println("Data as CassandraRows: \n" + StringUtils.join("\n", cassandraRowsRDD.collect()));
} catch (Exception e) {
e.printStackTrace();
}finally{
sc.stop();
sc.close();
}
}
然后報錯資訊:

然后保存也是:哎
/**
* 持久化資料到cassandra資料庫
*/
public static void savePerson() {
try {
JavaSparkContext sc = getConnection();
List<Person> people = Arrays.asList(
Person.newInstance(1, "John", new Date()),
Person.newInstance(2, "Anna", new Date()),
Person.newInstance(3, "Andrew", new Date())
);
JavaRDD<Person> rdd = sc.parallelize(people);
javaFunctions(rdd).writerBuilder("xmmsg", "people", mapToRow(Person.class)).saveToCassandra();
} catch (Exception e) {
e.printStackTrace();
}
}

報錯資訊:

請大神幫幫我,謝謝啦!!!!!!!
還有一個關于sparksql的問題:
public static void writeResouces(){
JavaSparkContext sc=getConnection("first","local");
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read().format("json").load("c://test//people.json");
//不知道為什么輸出的檔案居然是檔案夾?win和linux區別?
df.select("name", "age").write().format("parquet").save("c://test/namesAndAges2.parquet");
//可以這么 查詢
DataFrame df2 = sqlContext.sql("SELECT * FROM parquet.`c://test/namesAndAges2.parquet");
System.out.println(df2.count());
}
為什么我在win本地生成是namesAndAges2.parquet檔案夾呢,里面啥東西都沒有,

在linux上面能生成檔案,但是沒法讀取!!

uj5u.com熱心網友回復:
為什么沒人呢,哎...轉載請註明出處,本文鏈接:https://www.uj5u.com/qita/78686.html
標籤:Spark
