clapotiot-logo

Case Clause Using Spark SQL

Facebook
Twitter
LinkedIn
WhatsApp

Applying Case Clause In Spark SQL For Different Input Sources.

Hive table As Input:

Step 1: Create table using spark sql.

spark.sql("create table student (id int, name string, total_marks int) row format delimited fields terminated by ‘,’  stored as textfile")

Step 2: Insert values to the table using spark sql.

spark.sql("insert into student values(1001, 'ganesh', 435)")
spark.sql("insert into student values(1002, 'prabu', 405)")
spark.sql("insert into student values(1003, 'raju', 375)")
spark.sql("insert into student values(1004, 'mano', 355)")
spark.sql("insert into student values(1005, 'balu', 235)")
spark.sql("insert into student values(1006, 'prasanna', 250)")

Validate the table created through spark sql in hive.

Validate table in hive client.

Step 3: Use case clause to create new dataframe with one more column using the existing column.

newdf=spark.sql("select id,name,total_marks,case when total_marks>400 then 'A' when total_marks<400 and total_marks>350 then 'B' else 'c' end as Grade from student order by id")

Step 4: Validate the results.

newdf.show()
View dataframe values.

Input as CSV:

Step 1: Read the csv and create a dataframe.

df2= spark.read.csv(‘file:/// /home/skiganesh/codes/sources/student/stdf1.csv’,header=‘true’, inferSchema=‘true’) 

inferSchema ‘True’ will go through the input once to determine the input schema.

Header ‘True’ will consider the first row as header.

file is used to read data from local file system, it should be dfs if it is from HDFS system.

Step 2: Verify the schema.

df2.schema

Step 3: Create a temp view using the dataframe.

df2.createOrReplaceTempView(‘sudent_view’)

Step 4: Apply case clause on the tempview using spark sql.

df3=spark.sql("select id,name,total_marks,case when total_marks>400 then 'A' when total_marks<400 and total_marks>350 then 'B' else 'c' end as Grade from student_view order by id") 

Step 5: Validate the result.

df3.show()

References:

https://spark.apache.org/docs/latest/sql-ref-syntax-qry-select-case.html

Leave a Comment

Your email address will not be published. Required fields are marked *

Copyright © Claypot Technologies 2021