How to Efficiently Convert a Java/Scala List to a DataFrame/Dataset in Spark?

1. Java List to DataFrame or Dataset

    import scala.beans.BeanProperty

    // createDataFrame(list, beanClass) infers columns from JavaBean getters,
    // so each field carries @BeanProperty to generate getXxx/setXxx methods.
    case class CaseJava(
      @BeanProperty var num: String,
      @BeanProperty var id: String,
      @BeanProperty var start_time: String,
      @BeanProperty var istop_time: String)

    val listData: java.util.List[CaseJava] = new java.util.ArrayList[CaseJava]
    listData.add(new CaseJava("11", "22", "33", "44"))
    val dataFrame = spark.createDataFrame(listData, classOf[CaseJava])

2. Scala MutableList to DataFrame or Dataset

Method 1:

    import scala.collection.mutable
    import org.apache.spark.sql.SparkSession

    case class TestPerson(name: String, age: Long, salary: Double)

    val spark = SparkSession.builder().appName("Spark-SQL").master("local[2]").getOrCreate()
    import spark.implicits._

    val tom = TestPerson("Tom Hanks", 37, 35.5)
    val sam = TestPerson("Sam Smith", 40, 40.5)
    val PersonList = mutable.MutableList[TestPerson]()
    // Add data to the list
    PersonList += tom
    PersonList += sam
    // Expand the list's elements into a Seq so that each person becomes one row.
    // Seq(PersonList) would instead wrap the whole list as a single element.
    val personDS = Seq(PersonList: _*).toDS()

Method 2:

    case class TestPerson(name: String, age: Long, salary: Double)

    val spark = SparkSession.builder().appName("List to Dataset").master("local[*]").getOrCreate()
    val tom = TestPerson("Tom Hanks", 37, 35.5)
    val sam = TestPerson("Sam Smith", 40, 40.5)
    // mutable.MutableList[TestPerson]() is not required here;
    // an immutable List is cleaner.
    val PersonList = List(tom, sam)
    import spark.implicits._
    PersonList.toDS().show()

Method 3:

    case class TestPerson(name: String, age: Long, salary: Double)

    // Reuses the SparkSession `spark` and the scala.collection.mutable import from above
    import spark.implicits._

    val tom = TestPerson("Tom Hanks", 37, 35.5)
    val sam = TestPerson("Sam Smith", 40, 40.5)
    val PersonList = mutable.MutableList[TestPerson]()
    PersonList += tom
    PersonList += sam

    val personDS = PersonList.toDS()
    println(personDS.getClass)
    personDS.show()

    val personDF = PersonList.toDF()
    println(personDF.getClass)
    personDF.show()
    personDF.select("name", "age").show()

For more details, see: https://stackoverflow.com/questions/39397652/convert-scala-list-to-dataframe-or-dataset
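
As a complement to section 1, a `java.util.List` of case-class instances can probably also be turned into a typed Dataset directly, without going through JavaBean introspection. The sketch below assumes Spark 2.x or later on the classpath and uses the `createDataset` overload that accepts a `java.util.List` together with an implicit `Encoder`. The object name `JavaListToDataset` and the helper `toDataset` are hypothetical names chosen for illustration; `CaseJava` mirrors the field names from section 1.

```scala
import java.util.{ArrayList => JArrayList, List => JList}
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Same field names as CaseJava in section 1; defined at top level so that
// Spark can derive an Encoder for it.
case class CaseJava(num: String, id: String, start_time: String, istop_time: String)

// Hypothetical helper object, for illustration only.
object JavaListToDataset {

  // Converts a java.util.List of case-class instances into a typed Dataset.
  def toDataset(spark: SparkSession, data: JList[CaseJava]): Dataset[CaseJava] = {
    import spark.implicits._ // brings the implicit case-class Encoder into scope
    spark.createDataset(data) // resolves to the java.util.List overload
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("JavaList to Dataset")
      .master("local[2]")
      .getOrCreate()

    val listData: JList[CaseJava] = new JArrayList[CaseJava]()
    listData.add(CaseJava("11", "22", "33", "44"))

    val ds: Dataset[CaseJava] = toDataset(spark, listData)
    val df: DataFrame = ds.toDF() // untyped view with the same columns

    ds.show()
    df.printSchema()
    spark.stop()
  }
}
```

With this approach the columns come from the case-class fields through the Encoder, so no getters are needed; `ds.toDF()` yields the DataFrame form if untyped operations are preferred.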