Execute Snowflake procedure with Spark Scala - scala

I'm writing because I don't know how to execute a Snowflake procedure from Azure Databricks.
This is my Snowflake procedure:
CREATE OR REPLACE PROCEDURE getBalanceFrontAndInTotalFront(tableName VARCHAR, stringBalanceFront VARCHAR, stringInTotalFront VARCHAR)
RETURNS VARCHAR
NOT NULL
LANGUAGE javascript
AS
$$
  // Bind the procedure arguments (Snowflake exposes them in uppercase).
  var tableName = TABLENAME;
  var balanceFront = STRINGBALANCEFRONT;
  var inTotalFront = STRINGINTOTALFRONT;
  // Dynamically compose the SQL statement to execute.
  var sqlCommand = "SELECT BALANCE_FRONT, IN_TOTAL_FRONT, SUM(AMOUNT) AS AMOUNT FROM (SELECT " + balanceFront + " AS \"BALANCE_FRONT\", AMOUNT, " + inTotalFront + " AS \"IN_TOTAL_FRONT\" FROM " + tableName + ") GROUP BY BALANCE_FRONT, IN_TOTAL_FRONT";
  // Prepare the statement.
  var stmt = snowflake.createStatement({sqlText: sqlCommand});
  // Execute the statement.
  var rs = stmt.execute();
  // Collect each row as "BALANCE_FRONT:AMOUNT:IN_TOTAL_FRONT".
  var arrayValues = [];
  while (rs.next()) {
    var column1 = rs.getColumnValue(1);
    var column2 = rs.getColumnValue(2);
    var column3 = rs.getColumnValue(3);
    arrayValues.push([column1 + ':' + column3 + ':' + column2]);
  }
  return arrayValues;
$$;
When I execute the procedure in Snowflake:
set stringBalanceFront = 'CASE WHEN Balance_Type like (\'%A%\')THEN \'ACTIVO\' WHEN Balance_Type like (\'%P%\') THEN \'PASIVO\' WHEN Balance_Type like (\'%N%\') THEN \'NETO\' ELSE \'RESTO\' END';
set stringInTotalFront = 'CASE WHEN Balance_Type like (\'%A%\')THEN \'true\' ELSE \'false\' END';
CALL getBalanceFrontAndInTotalFront('DMAAS_OUTPUT_DATA_TABLE_0049_D18CER', $stringBalanceFront, $stringInTotalFront);
I obtain the following array of strings:
RESTO:-184281744:false,ACTIVO:-17881395:true,NETO:20599:false,PASIVO:12672:false
I am trying to run this procedure from Spark with the following code, and it obviously fails:
val stringBalanceFront = Funciones.generarCondiciones(dfOrdenado, Variables.CAMPO_BALANCE_FRONT.toLowerCase())
val stringInTotalFront = Funciones.generarCondiciones(dfOrdenado, Variables.CAMPO_IN_TOTAL_FRONT.toLowerCase())
val query = s"CALL getBalanceFrontAndInTotalFront(${cfgVal.getRutaMasterNoAgregada}, ${stringBalanceFront}, ${stringInTotalFront});"
val arrayBalanceFront = spark.read
.format(SNOWFLAKE_SOURCE_NAME)
.options(snowOptionsRead)
.option("query", query)
.load()
And I get the following error:
21/07/15 17:14:36 ERROR Uncaught throwable from user code: net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:
syntax error line 1 at position 15 unexpected 'CALL'.
What is the correct way to execute a Snowflake procedure from Spark? Keep in mind that I want to return the results to a val in Spark.
Thanks in advance!
Best regards.
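The connector's "query" option always wraps the given SQL in a SELECT, which is why the CALL fails to parse. One workaround, sketched below under the assumption that your spark-snowflake connector ships the net.snowflake.spark.snowflake.Utils helper and that snowOptionsRead is the same options Map used above, is to send the CALL through Utils.runQuery and read the procedure's VARCHAR result from the returned JDBC ResultSet (check your connector version; older versions may not return a ResultSet):

import net.snowflake.spark.snowflake.Utils

// Sketch, not a verified solution: run the CALL through the connector's own
// JDBC session instead of the "query" option. The procedure arguments are
// VARCHARs, so they must be single-quoted, and single quotes embedded in the
// CASE expressions must be doubled before interpolation.
val call =
  s"""CALL getBalanceFrontAndInTotalFront(
     |  '${cfgVal.getRutaMasterNoAgregada}',
     |  '${stringBalanceFront.replace("'", "''")}',
     |  '${stringInTotalFront.replace("'", "''")}')""".stripMargin

val rs = Utils.runQuery(snowOptionsRead, call)  // assumed to return java.sql.ResultSet
rs.next()
val arrayBalanceFront: String = rs.getString(1) // the VARCHAR the procedure returns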

Related

How to pass **kwargs as databricks notebook parameter to a function argument? Or execute a databricks notebook function directly through ADF

I am executing a custom function:
def test_insert(col1, **kwargs):
    try:
        sql = "INSERT INTO target_tbl SELECT * FROM source_tbl WHERE col1 = '{}'".format(col1)
        if len(kwargs.items()) != 0:
            for i in kwargs.items():
                sql = sql + " AND {} = '{}'".format(i[0], i[1])
        return sql
    except Exception as e:
        return e
It generates a SQL script like: test_insert('1001', col2='PRD002') => output: INSERT INTO sales_history_output SELECT * FROM sales_history WHERE col1 = 'SOP001' AND col2 = 'PRD002'.
Now I want to pass the **kwargs parameters through a Databricks notebook parameter. Is there any way to do this? When I pass '1001', col2 = 'PRD002' as the notebook parameter, it is read as a single string, not as **kwargs.

JDBC SELECT COUNT(*) returns empty resultset on HSQLDB

I would expect to always receive a result set with one row from a SELECT COUNT, but results.next() always returns false. This is on HSQLDB 2.5.1.
The code below prints:
number of columns: 1. First column C1 with type INTEGER
No COUNT results
statement = connection.createStatement();
// check if table empty
statement.executeQuery("SELECT COUNT(*) FROM mytable");
ResultSet results = statement.getResultSet();
System.out.println("number of columns: " + results.getMetaData().getColumnCount()
        + ". First column " + results.getMetaData().getColumnName(1)
        + " with type " + results.getMetaData().getColumnTypeName(1));
int numberOfRows = 0;
boolean hasResults = results.next();
if (hasResults) {
    numberOfRows = results.getInt(1);
    System.out.println("Table size " + numberOfRows);
} else {
    System.out.println("No COUNT results");
}
statement.close();
Executing the same SQL statement in my SQL console works fine:
C1
104
Other JDBC actions on this database work fine as well.
Is there something obvious I'm missing?
The getResultSet method is applicable to execute, but not to executeQuery, which returns a ResultSet directly. That returned ResultSet is the one you need to use; at the moment you are losing it because you never assign it to anything.
See https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeQuery(java.lang.String) and https://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#getResultSet()
ResultSet results = statement.executeQuery("SELECT COUNT(*) FROM mytable");

Add quotation to each element of a list

I need to use a variable that I've created earlier in Spark to select data from a Teradata table:
%spark
sqlContext.setConf("spark.sql.parquet.writeLegacyFormat", "true")
val query = "select distinct cod_contrato from xxx.contratos"
val df = sqlContext.sql(query)
val dfv = df.select("cod_contrato")
The variable is a string.
So I would like to query the database using that vector of strings:
If I use:
%spark
val sql = s"(SELECT * FROM xx2.CONTRATOS where cod_contrato in '$dfv') as query"
I get:
(SELECT * FROM xx2.CONTRATOS where cod_contrato in '[cod_contrato: string]') as query
The desired result would be:
SELECT * FROM xx2.CONTRATOS where cod_contrato in ('11111', '11112' )
How can I transform the vector into a list enclosed in parentheses, with quotation marks around each element?
Thanks!
Here is my attempt. Starting from some dataframe:
val test = df.select("id").as[String].collect
> test: Array[String] = Array(6597, 8011, 2597, 5022, 5022, 6852, 6852, 5611, 14838, 14838, 2588, 2588)
test is now an Array[String], so mkString can build the IN clause:
val sql = "(SELECT * FROM xx2.CONTRATOS where cod_contrato in " + test.mkString("('", "','", "')") + ") as query"
> sql: String = (SELECT * FROM xx2.CONTRATOS where cod_contrato in ('6597','8011','2597','5022','5022','6852','6852','5611','14838','14838','2588','2588')) as query
The final result is now a string (the outer parentheses and alias make it usable as a JDBC subquery).
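To then run that string against Teradata, a follow-up sketch (the JDBC URL, driver class, and credentials below are placeholders, not from the original post) can pass it as the dbtable subquery:

// Sketch: use the "(...) as query" string built above as a JDBC subquery.
val contratos = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:teradata://<host>")          // placeholder URL
  .option("driver", "com.teradata.jdbc.TeraDriver") // assumed Teradata driver
  .option("dbtable", sql)
  .option("user", "<user>")                         // placeholder credentials
  .option("password", "<password>")
  .load()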
Make a temp view of the values you want to filter on and then reference it in the query
%spark
sqlContext.setConf("spark.sql.parquet.writeLegacyFormat", "true")
val query = "select distinct cod_contrato from xxx.contratos"
sqlContext.sql(query).selectExpr("cast(cod_contrato as string)").createOrReplaceTempView("dfv_table")
val sql = "(SELECT * FROM xx2.CONTRATOS where cod_contrato in (select * from dfv_table)) as query"
This will work for the query in Spark SQL, but it will not return a query string. Lamanus's answer should be sufficient if all you want is the query as a string.

PostgreSQL {call Update Set ...} getting "syntax error at or near SET"

I'm changing queries from an Oracle Database to PostgreSQL, and in this query I am getting this error:
ERROR: syntax error at or near "SET"
the query is:
{call UPDATE alarm_instance SET last_update_time=default, wait_expire_time=null, core_number=nextval(SEQ_ALRM_NUMBR)
where wait_time <= current_date RETURNING alarm_instance_id bulk collect INTO ?}
I am using JDBC to connect to the database and here is the call code
try (CallableStatement cs = super.prepareCall_(query)) {
    cs.registerOutParameter(1, Types.ARRAY);
    cs.execute();
    ...
I have taken a long look at the Postgres documentation, cannot find what is wrong, and haven't found any answer to this specific situation.
An UPDATE statement can't be executed with a CallableStatement. A CallableStatement is essentially only intended to call stored procedures. In case of Oracle that includes anonymous PL/SQL blocks.
And bulk collect is invalid in Postgres to begin with.
It seems you want something like this:
String sql =
    "UPDATE alarm_instance " +
    "   SET last_update_time = default, " +
    "       wait_expire_time = null, " +
    "       core_number = nextval('SEQ_ALRM_NUMBR') " +
    " WHERE wait_time <= current_date RETURNING alarm_instance_id";
Statement stmt = connection.createStatement();
stmt.execute(sql);
int rowsUpdated = stmt.getUpdateCount();
ResultSet rs = stmt.getResultSet();
while (rs.next()) {
    // do something with the returned IDs
}

How to execute multi line sql in spark sql

How can I execute lengthy, multi-line Hive queries in Spark SQL? For example, a query like the one below:
val sqlContext = new HiveContext (sc)
val result = sqlContext.sql ("
select ...
from ...
");
Use """ instead, so for example
val results = sqlContext.sql ("""
select ....
from ....
""");
or, if you want to format code, use:
val results = sqlContext.sql ("""
|select ....
|from ....
""".stripMargin);
You can use triple-quotes at the start/end of the SQL code or a backslash at the end of each line.
val results = sqlContext.sql ("""
create table enta.scd_fullfilled_entitlement as
select *
from my_table
""");
# PySpark variant: a backslash at the end of each line continues the string
results = sqlContext.sql(" \
create table enta.scd_fullfilled_entitlement as \
select * \
from my_table \
")
val query = """(SELECT
a.AcctBranchName,
c.CustomerNum,
c.SourceCustomerId,
a.SourceAccountId,
a.AccountNum,
c.FullName,
c.LastName,
c.BirthDate,
a.Balance,
case when [RollOverStatus] = 'Y' then 'Yes' Else 'No' end as RollOverStatus
FROM
v_Account AS a left join v_Customer AS c
ON c.CustomerID = a.CustomerID AND c.Businessdate = a.Businessdate
WHERE
a.Category = 'Deposit' AND
c.Businessdate= '2018-11-28' AND
isnull(a.Classification,'N/A') IN ('Contractual Account','Non-Term Deposit','Term Deposit')
AND IsActive = 'Yes' ) tmp """
It is worth noting that the length is not the issue, only how the string is written. You can use """ as Gaweda suggested, or simply build the string programmatically, e.g. with a StringBuilder. For example:
val selectElements = Seq("a", "b", "c")
val builder = StringBuilder.newBuilder
builder.append("select ")
builder.append(selectElements.mkString(","))
builder.append(" from my_table")
builder.append(" where d < 10")
val results = sqlContext.sql(builder.toString())
In addition to the above, you can also concatenate strings:
val results = sqlContext.sql("select .... " +
" from .... " +
" where .... " +
" group by ....
");
Write your SQL inside triple quotes, like """ sql code """:
df = spark.sql(f""" select * from table1 """)
Triple-quoted strings work the same way in Scala Spark and PySpark; the f prefix shown here is Python's f-string syntax for interpolation.
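In Scala, the s interpolator fills the same role inside triple quotes as Python's f-string; a minimal sketch (the table name is a hypothetical placeholder):

val tableName = "table1"  // hypothetical placeholder
val df = spark.sql(s"""select * from $tableName""")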