How to use a function or method on a Spark data frame column for transformation using Scala - scala

I have created a function in Scala equivalent to the Oracle DECODE function. I want to use it with Spark DataFrame columns. I have tried, but I am getting multiple datatype-mismatch errors.
I do not want to create a UDF for each program. I want to create something generic and reuse it multiple times.
Function:
def ODECODE(column: Any, Param: Any*) : Any = {
var index = 0
while (index < Param.length) {
var P = Param(index)
var Q = column
if (P.equals(Q))
return Param(index + 1)
else index = index + 1
}
return Param (Param.length - 1)
}
I want to use it something like this:
Assume "Emp" is a DataFrame containing data from the employee table with the columns (First Name, Last Name, Grade).
Emp.select(ODECODE("grade", "A", 1, "B", 2, "C", 3, "FAIL")).show()
This is one example. The datatype of the grade column can be String or Integer, so I declared the parameters of the decode function above as Any. With DataFrames, however, it does not perform the transformation and reports datatype mismatches.
I want to create individual functions/methods for some of the unsupported Oracle functions and reuse them wherever required in my transformations. Any suggestion to make this work is appreciated.

I know this is late, but I actually needed this and found your example. I was able to implement it with a few changes. I am no expert though, there may be a better way of doing this.
def ODECODE(column: Any, params: Any*): String = {
  // params holds (search, result) pairs, optionally followed by a default value
  var index = 0
  while (index + 1 < params.length) {
    if (params(index) == column) {
      // Match found: return the result that follows the search value
      return params(index + 1).toString
    }
    index += 2
  }
  // No pair matched: use the trailing default if there is one, otherwise "0"
  if (params.length % 2 == 1) params.last.toString else "0"
}
println(ODECODE("TEST", 0, "TEgST", 8, "***", 0))

Related

Rust: Convert SQL types to Rust automatically using sqlx

I'm new to Rust and am working on a generic database viewer app using SQLx. I've run into a problem when running a simple query like
SELECT * FROM TABLE_NAME
The problem is that I don't know the types of the rows and columns beforehand. So how should I parse the query results?
I've tried the following from this post
for r in row {
let mut row_result: Vec<String> = Vec::new();
for col in r.columns() {
let value = r.try_get_raw(col.ordinal()).unwrap();
let value = match value.is_null() {
true => "NULL".to_string(),
false => {
let mat = value.as_str();
match mat {
Ok(m) => m.to_string(),
Err(err) => {
dbg!(err);
"ERROR".to_string()
}
}
}
};
// println!("VALUE-- {:?}", value);
row_result.push(value);
}
result.push(row_result);
}
The problem is that for some columns, such as the ID columns, it returns values like this:
"\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0004"
and for some columns I'm getting the following error from the dbg! macro:
Utf8Error {
valid_up_to: 2,
error_len: Some(
1,
),
}
Can anyone help me here?
By the way, I'm using Postgres, so all rows are of type PgRow.
Follow-up: I was able to get the column types from the information_schema, but they come back as strings and I couldn't find any way to convert those into Rust types, for example INT8 -> i64 or TEXT -> String.
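For what it's worth, the ID output above looks like the 8-byte big-endian encoding of the integer 4, which is what one would expect if the value arrives in Postgres's binary format rather than as text. Below is a minimal sketch of dispatching on a type name to decode such raw bytes, using only the standard library. The names Value, be_int and decode_raw are hypothetical, the type names are the ones from the question, and how the raw bytes and the type name are obtained from SQLx is left out on purpose.

```rust
/// A decoded cell value (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
    Unsupported(String),
}

/// Read a big-endian signed integer of exactly `width` bytes (2, 4 or 8),
/// sign-extending it to i64.
fn be_int(bytes: &[u8], width: usize) -> Option<i64> {
    if bytes.len() != width {
        return None;
    }
    // Pre-fill with 0xFF for negative numbers so the sign is preserved.
    let mut buf = if bytes[0] & 0x80 != 0 { [0xFFu8; 8] } else { [0u8; 8] };
    buf[8 - width..].copy_from_slice(bytes);
    Some(i64::from_be_bytes(buf))
}

/// Decode the raw bytes of one non-NULL cell based on its type name.
/// Assumption: the bytes are in the binary format, with big-endian integers.
fn decode_raw(type_name: &str, bytes: &[u8]) -> Value {
    let unsupported = || Value::Unsupported(type_name.to_string());
    match type_name {
        "INT2" => be_int(bytes, 2).map(Value::Int).unwrap_or_else(unsupported),
        "INT4" => be_int(bytes, 4).map(Value::Int).unwrap_or_else(unsupported),
        "INT8" => be_int(bytes, 8).map(Value::Int).unwrap_or_else(unsupported),
        "TEXT" | "VARCHAR" => std::str::from_utf8(bytes)
            .map(|s| Value::Text(s.to_string()))
            .unwrap_or_else(|_| unsupported()),
        // Every other type needs its own decoding rule.
        _ => unsupported(),
    }
}

fn main() {
    // The bytes shown for the ID column in the question.
    let id = decode_raw("INT8", &[0, 0, 0, 0, 0, 0, 0, 4]);
    println!("{:?}", id); // prints Int(4)
    println!("{:?}", decode_raw("TEXT", b"hello"));
}
```

The same match could be extended with further arms per type. Types without an arm fall through to Unsupported instead of being forced through a string conversion.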

Swift Subscript for getting columns and/or rows of matrix

I created a custom Matrix struct but am having problems with subscript definitions. I want to have 2 subscripts, one for getting a row, and one for getting a column. Currently, I have this:
subscript(row: Int, column: Int? = nil) -> [Double] {
get {
return Array(self.matrix[row*self.column..<(row*self.column)+3])
}
set(rowValues) {
for i in 0..<self.column{
self.matrix[row*self.column + i] = rowValues[i]
}
}
}
subscript(row: Int? = nil, column: Int) -> [Double] {
get {
var col = [Double]()
for columnIndex in 0..<self.row{
col.append(self.matrix[columnIndex*self.column + column])
}
return col
}
}
But if I call matrix[i], it returns the ith row, and if I write matrix[column: i], the compiler says the label is extraneous and rejects it. The only way to ask for a column is to call it like matrix[nil, i]. That is fine, but I was wondering whether it is possible to omit the row entirely and get the column directly (like matrix[, i]).

How to return a variable in a function in kotlin

I created a function that receives an input and compares it to a list. When it finds a match, it returns the match; in this case the match is an attribute of a class that I created.
I understand that the problem is with the return statement, so I declared the return type of the function as "Any". Beyond that, I'm kind of lost.
The error is this: A 'return' expression required in a function with a block body ('{...}')
class Class1(var self: String)
var test_class = Class1("")
fun giver(){
test_class.self = "Anything"
}
class Funciones(){
fun match_finder(texto: String): Any{
var lista = listOf<String>(test_class.self)
var lista_de_listas = listOf<String>("test_class.self")
var count = -1
for (i in lista_de_listas){
count = count + 1
if (texto == i){
lista_de_listas = lista
var variable = lista_de_listas[count]
return variable
}
}
}
}
fun main(){
giver()
var x = "test_class.self"
var funcion = Funciones()
var y = funcion.match_finder(x)
println(y)
}
To explain what the problem is, let's consider the following code:
class MyClass {
fun doSomething(): String {
val numbers = listOf(1, 2, 3)
for (number in numbers) {
if (number % 2 == 0) {
return "There is at least one even number in the list"
}
}
}
}
If you try compiling it you'll get the same error message as in your question: A 'return' expression required in a function with a block body ('{...}'). Why is that?
Well, we defined a function doSomething returning a String (it could be any other type), but we return a result only if the list of numbers contains at least one even number. What should it return if there is no even number? The compiler cannot know that, so it reports this error. We can fix the code by returning a value or by throwing an exception:
class MyClass {
fun doSomething(): String {
val numbers = listOf(1, 2, 3)
for (number in numbers) {
if (number % 2 == 0) {
return "There is at least one even number in the list"
}
}
// return something if the list doesn't contain any even number
return "There is no even number in the list"
}
}
The same logic applies to your original code: what should the function return if there is no i such that texto == i?
Please also note that the solution you proposed may be syntactically correct - meaning it compiles correctly - but will probably do something unexpected. The for loop is useless since the if/else statement will always cause the function to return during the first iteration, so the value "There is no match" could be returned even if a match actually exists later in the list.
I searched online. In case someone else has the same problem, the correct code is as follows:
class Funciones(){
fun match_finder(texto: String): Any{
var lista = listOf<String>(test_class.self)
var lista_de_listas = listOf<String>("test_class.self")
var count = -1
var variable = " "
for (i in lista_de_listas){
count = count + 1
if (texto == i){
lista_de_listas = lista
var variable = lista_de_listas[count]
return variable
} else {
return "There is no match"
}
}
return variable
}
}

Read data with ILNumerics.IO.HDF5

I am trying ILNumerics.IO.HDF5 and cannot read the following data:
Variable length strings in Datasets and Attributes.
Datasets with variable length arrays. Each cell contains an array of numbers, which are histograms.
Compound data, i.e. Datasets with structs containing some numbers.
In HDFView 2.10.1 I can read this data:
https://anonfiles.com/file/13756916026cafc4e4ec7c333f235bda
How can I use ILNumerics.IO.HDF5 with this data?
I found another post that suggests reading the string as char.
But with a variable length string, an exception is thrown: "Error reading data from the attribute!"
var file = new H5File("test.h5");
H5Dataset ds1 = file.First<H5Dataset>("Wind");
var att = ds1.Attributes["Aggregator"];
var value = att.Get<char>();
Could you provide more information on how you write the string attributes and what exactly the issue is? When you say 'can not read', do you get a null return value or an exception?
I write strings as attributes in my application and it works fine. I am guessing there could be a problem in the way you write the string. As per Haymo's suggestion, I convert the string into a char array and write it as an attribute. Here is the sample code:
private ILRetArray<Char> ConvertStringToArray(string str)
{
using (ILScope.Enter())
{
ILArray<Char> A = ILMath.array<Char>(' ', 1, str.Length);
for (int i = 0; i < str.Length; i++)
{
A.SetValue(str[i], 0, i);
}
return A;
}
}
Test Case :
using (var file = new H5File("testwrite.h5"))
{
var ds = new H5Dataset("data", ILMath.rand(10,10));
file.Add(ds);
string teststr = "Test string";
ILArray<char> charStr = ConvertStringToArray(teststr);
ds.Attributes.Add(new H5Attribute("mystring",charStr));
//Read back the dataset and its attributes
var group = file.Find<H5Dataset>("data").First();
ILArray<Char> storedData = group.Attributes["mystring"].Get<Char>();
}

SWIFT IF ELSE and Modulo

In Swift, I need to create a simple for-condition-increment loop with all the multiples of 3 from 3-100. So far I have:
var multiplesOfThree: [String] = []
for var counter = 0; counter < 30; ++counter {
multiplesOfThree.append("0")
if counter == 3 {
multiplesOfThree.append("3")
} else if counter == 6 {
multiplesOfThree.append("6")
} else if counter == 9 {
multiplesOfThree.append("9")
}
println("Adding \(multiplesOfThree[counter]) to the Array.")
}
I would like to replace all the if and else if statements with something like:
if (index %3 == 0)
but I’m not sure what the proper syntax would be? Also, if I have a single IF statement do I need a .append line to add to the Array?
You are very much on the right track. A few notes:
Swift provides a more concise way to iterate over a fixed number of integers using the ..< operator (the half-open range operator).
Your if statement with the modulus operator is exactly correct.
To make a string from an Int you can use \(expression) inside a string. This is called String Interpolation
Here is the working code:
var multiplesOfThree: [String] = []
for test in 0..<100 {
if (test % 3 == 0) {
multiplesOfThree.append("\(test)")
}
}
However, there is no reason to iterate over every number. You can simply continue to add 3 until you reach your max:
var multiplesOfThree: [String] = []
var multiple = 0
while multiple < 100 {
multiplesOfThree.append("\(multiple)")
multiple += 3
}
As rickster pointed out in the comments, you can also do this more concisely with the stride function, using its by parameter to set the step:
var multiplesOfThree: [String] = []
for multiple in stride(from: 0, to: 100, by: 3) {
multiplesOfThree.append("\(multiple)")
}
Getting even more advanced, you can use the map function to do this all in one line. The map method lets you apply a transform on every element in an array:
let multiplesOfThree = Array(map(stride(from: 0, to: 100, by: 3), { "\($0)" }))
Note: To understand this final code, you will need to understand the syntax around closures well.