How to create a table in Hive from a DataFrame

Hi everyone. As usual, here is a quick recipe, this time for creating a Hive table from the data in a DataFrame.

I am using a spark-shell connected to a development cluster. The Cloudera version is CDH 5.5.2, so, according to the official Cloudera site, the Hive version is 1.1.0.

These versions will matter when I have to code this as a packaged Scala solution.
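To show where those versions would end up, here is a minimal build.sbt sketch. The Spark and Scala versions are the ones the spark-shell banner prints; the project name, the Cloudera repository URL and the "provided" scope are my assumptions, so check them against your own cluster before relying on them.

```scala
// build.sbt: a sketch, not a tested build definition.
// The project name and version are hypothetical.
name := "dataframe-to-hive"
version := "0.1.0"

// Scala version reported by the spark-shell banner.
scalaVersion := "2.10.4"

// Assumption: the CDH artifacts are resolved from Cloudera's public repository.
resolvers += "cloudera-repos" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

// "provided" because the cluster already ships these jars (an assumption about the deployment).
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.5.0-cdh5.5.2" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.5.0-cdh5.5.2" % "provided",
  "org.apache.spark" %% "spark-hive" % "1.5.0-cdh5.5.2" % "provided"
)
```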

The important thing…

Setting default log level to "WARN".

To adjust logging level use sc.setLogLevel(newLevel).

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.5.0-cdh5.5.2
      /_/

Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67)

Type in expressions to have them evaluated.

Type :help for more information.

17/03/01 09:16:01 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.

Spark context available as sc (master = yarn-client, app id = application_1486135355323_0057).

SQL context available as sqlContext.

// Important! Make sure that sqlContext really is a HiveContext. It should be, but…

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

sqlContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@7ded4eb

scala> val mydf = sqlContext.read.parquet("aParquetFile.parquet")

mydf: org.apache.spark.sql.DataFrame = […]

scala> mydf.cache

res0: mydf.type = […]

scala> mydf.count

res1: Long = 28246060

scala> sqlContext.sql("show databases").collect().foreach(println)

[default]

scala> mydf.registerTempTable("tempTable")

// Do the bulk load with CREATE TABLE ... AS SELECT

scala> sqlContext.sql("CREATE TABLE myTable AS SELECT * FROM tempTable")

// IMPORTANT! Unregister the temp table once the Hive table has been created.

scala> scala.util.Try(sqlContext.dropTempTable("tempTable"))

res11: scala.util.Try[Unit] = Success(())
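Before leaving the shell, it may be worth checking that the new table holds as many rows as the source DataFrame. This is only a sketch: sameRowCount is a hypothetical helper of mine, not part of Spark, and it assumes the table was created in the current database.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper: true when the Hive table has exactly as many rows
// as the DataFrame it was created from.
def sameRowCount(sqlContext: HiveContext, source: DataFrame, tableName: String): Boolean =
  sqlContext.table(tableName).count() == source.count()

// Usage in the same spark-shell session (paste the definition with :paste first):
// sameRowCount(sqlContext, mydf, "myTable")
```

Both counts trigger a Spark job, so on a table of this size expect it to take a moment.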

Then you can go to your Hue client and run queries against the new table.
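For a packaged Scala solution, the same steps could look roughly like the sketch below. The object name, the argument handling and the default values are hypothetical; only the Spark calls are the ones used in the shell session above. It assumes spark-core, spark-sql and spark-hive 1.5.0 are on the classpath and that the job is launched with spark-submit.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Hypothetical standalone version of the spark-shell recipe.
object DataFrameToHiveTable {
  def main(args: Array[String]): Unit = {
    // Hypothetical defaults: pass the real path and table name as arguments.
    val parquetPath = if (args.length > 0) args(0) else "aParquetFile.parquet"
    val tableName   = if (args.length > 1) args(1) else "myTable"

    val sc = new SparkContext(new SparkConf().setAppName("DataFrameToHiveTable"))
    // A HiveContext is needed so that CREATE TABLE goes to the Hive metastore.
    val sqlContext = new HiveContext(sc)

    try {
      val df = sqlContext.read.parquet(parquetPath)
      df.registerTempTable("tempTable")
      try {
        // Same bulk load as in the shell session.
        sqlContext.sql(s"CREATE TABLE $tableName AS SELECT * FROM tempTable")
      } finally {
        // Always unregister the temp table, even if the CREATE fails.
        sqlContext.dropTempTable("tempTable")
      }
    } finally {
      sc.stop()
    }
  }
}
```

The nested try/finally is just my way of making sure the temp table is dropped and the context is stopped whatever happens. The table name is interpolated into the SQL as-is, so it should only come from a trusted source.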

That's it. Have fun in the process and be a nice person,

Alonso
