Disable scientific notation in Spark Scala [closed] - scala

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed last year.
I have a table to consume from Spark and run data transformations on.
There is a column that has values with 8 digits after the decimal point, something like this: 0.00000001
The column holds absolute values like the number above.
I need to take the absolute value of that column without letting Spark turn it into scientific notation. When Spark reads the table, it renders the column in scientific notation.
I have tried converting to String, Double, and Float, but nothing works. I need this field as Decimal; the column is already natively of type Decimal.
Is there some way to do this?
Code to simulate:
val df = spark.sparkContext.parallelize(Seq(0.00000001)).toDF("Value")
Spark shows me this: 0E-8
And I need it as a Decimal: 0.00000001
Thanks!

You can use the format_number function to get the desired result (note that it returns a formatted String for display rather than a Decimal):
spark.sparkContext.parallelize(Seq(0.00000001)).toDF("Value")
  .selectExpr("format_number(Value, '#.########') as Value")
  .show(false)
/*
+----------+
|Value |
+----------+
|0.00000001|
+----------+*/
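If the field must remain Decimal, it helps to know that the 0E-8 rendering is not a change to the stored value: it comes from java.math.BigDecimal.toString, which switches to scientific notation once the exponent drops below a threshold, while toPlainString always prints the plain form. A minimal plain-Scala sketch (no Spark needed) showing the difference:

```scala
// BigDecimal.toString uses scientific notation when the adjusted
// exponent is below -6; toPlainString never does. The value itself
// is identical in both cases.
val d = new java.math.BigDecimal("0.00000001")
println(d.toString)      // 1E-8
println(d.toPlainString) // 0.00000001
```

So keeping the column as DecimalType loses nothing; the scientific form only appears in show's default rendering, and format_number (or toPlainString on collected values) is purely a display-side fix.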


How do I filter the line with the most recent date, keeping the same structure as input? [duplicate]

This question already has answers here:
Spark filter based on the max date record
(3 answers)
Closed 6 months ago.
I'm trying to write a function in Scala that filters for the most recent date. I want to keep it general: whatever dataframe I pass in, as long as it has a "date" column, it should return the most recent line of that dataframe. It's worth noting that my date column is usually defined as a string, in the format yyyy-MM-dd. I'm pretty sure my code here is flawed, but I guess it illustrates the idea.
def fixDate(table: DataFrame): DataFrame = {
  table
    .withColumn("date", from_unixtime(unix_timestamp(col("date"), "yyyy-MM-dd"), "yyyyMMdd").cast(IntegerType))
    .filter(col("date") === functions.max("date"))
}
It depends a lot on what you want to achieve. If you want all rows that carry the maximum value in the "date" field, then for performance it is better to split this into two stages: first compute the maximum value of the column, then filter by it:
val maxValue = df
  .withColumn("eventTime", unix_timestamp(col("date"), "yyyy-MM-dd").cast(TimestampType))
  .agg(max("eventTime")).collect()(0)(0)

df.withColumn("eventTime", unix_timestamp(col("date"), "yyyy-MM-dd").cast(TimestampType))
  .filter(col("eventTime") === maxValue)
  .drop("eventTime")
If you want any record with the maximum value, then I think the answers from this post will help you: How to get the row from a dataframe that has the maximum value in a specific column?
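One more observation: because yyyy-MM-dd is zero-padded, lexicographic order on the raw strings already matches chronological order, so the maximum date can be found without parsing at all. A quick plain-Scala illustration:

```scala
// Zero-padded yyyy-MM-dd strings sort chronologically, so the plain
// string maximum is already the latest date, with no parsing needed.
val dates = Seq("2021-03-05", "2021-12-01", "2020-07-19")
val latest = dates.max
println(latest) // 2021-12-01
```

The same holds in Spark: when the column is in this format, comparing and aggregating the raw string column directly is enough.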

Get last N rows ordered ascending without using nested SELECTs [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I'd like to get the last N rows in ascending order. Is it possible without using nested SELECTs?
The reason for asking is to find out whether there is a more efficient way.
Nested selects:
select id, timestamp, col3, col4
from (
  select * from t_myTable
  order by timestamp desc
  limit 5
) as d
order by timestamp asc
The id column is auto-incremental.
Thanks.
Since you need two different orderings, I cannot imagine a way to do it without a subquery. And even if there were one, I am pretty confident that such a query would be more complicated and would perform worse.

Matlab data pull and save [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
What I'm attempting to do is simple:
Given an Excel dataset and user-defined inputs: open the associated Excel file, pull the information associated with the user's input, and save it in a separate Excel file.
I've already developed a list of values, and the program recognizes user input with associated checks. I'm stuck on getting MATLAB to use this information to open the correct dataset; I don't know how to get MATLAB to pull a row/column from Excel with a silent open, and I don't know how to save that data into a separate Excel file. Any help would be appreciated, thank you.
Consider using the functions readtable and writetable if you have a recent MATLAB (anything more recent than R2013b). The readtable function will 'silently' read data from a specific worksheet in an Excel file into a MATLAB table variable. From there you can 'query' the table to find the specific rows you want and write the result to a new Excel file with writetable.
Using readtable, you can specify the range of data with the parameters sheet and range.
requested_data = readtable(excel_file, ...
    'sheet', input_sheet_name, ...
    'range', input_data_range);
and write the data to another Excel file with
writetable(requested_data, output_excel_file, ...
    'sheet', output_sheet_name, ...
    'range', output_data_range);
Note: Remember to set the values for excel_file, input_sheet_name, input_data_range, output_excel_file, output_sheet_name, and output_data_range before running above commands.
To query the table for the rows you need, one way is to use ismember as in this answer.
Finally, use writetable to store the values.
See also: sheetnames, detectImportOptions, and SpreadsheetImportOptions

Save an ordered list of data into Core Data [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have an NSTableView where the user can add rows.
Each row is saved directly into Core Data.
At the moment I can request the records from Core Data and sort them by object ID to get the correct order, because the object ID is nearly the same as an incremental number.
But now the user can reorder the rows.
How can I save this new order of rows into Core Data?
Using the id as a way to order is not guaranteed to work, so you shouldn't do that.
Instead, add a field that represents the order.
What we do is have a field called pos that is an integer, and we set it sparsely. The first record can be 100, the second 200, etc. Then when we re-order, we set pos to the mid-point of the records right before and after. Every once in a while you need to re-number the records; the more sparsely you space them, the less often you need to re-number.
When you add a new record, set its pos to the max value + 100 (or whatever spacing you are using).
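The position arithmetic described above is language-agnostic; here is a minimal sketch of it in plain Scala (the names spacing, posForAppend, and posBetween are hypothetical, standing in for the logic around your Core Data pos attribute):

```scala
// Sparse ordering: leave gaps between positions so that a single
// move usually only rewrites one record.
val spacing = 100

// Position for a record appended at the end of the list.
def posForAppend(existing: Seq[Int]): Int =
  (if (existing.isEmpty) 0 else existing.max) + spacing

// Position for a record dropped between two neighbours. When the
// result collides with a neighbour, the gap is exhausted and the
// whole list should be renumbered with fresh spacing.
def posBetween(before: Int, after: Int): Int = (before + after) / 2

println(posForAppend(Seq(100, 200))) // 300
println(posBetween(100, 200))        // 150
```

On a reorder you only update the moved record's pos; a full renumbering pass is needed only when two neighbouring positions become adjacent.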

What HIGH and LOW actually mean in a ranges table [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
A ranges table contains a HIGH and a LOW value. What do they actually do, and what are they used for?
A range table can be used in Open SQL queries (with the IN clause) as a restriction. You can define several ranges to include in or exclude from your query result. For instance:
report zztest.

types:
  lt_kunrng type range of kunnr_d.

data:
  it_rnrng type lt_kunrng,
  wa_rnrng like line of it_rnrng,
  it_kna1  type standard table of kna1.

START-OF-SELECTION.

  move:
    'I'          to wa_rnrng-sign,
    'BT'         to wa_rnrng-option,
    '0001000000' to wa_rnrng-low,
    '0002999999' to wa_rnrng-high.
  append wa_rnrng to it_rnrng.

  clear wa_rnrng.
  move:
    'I'          to wa_rnrng-sign,
    'EQ'         to wa_rnrng-option,
    '0003500000' to wa_rnrng-low.
  append wa_rnrng to it_rnrng.

  select * from kna1 into corresponding fields of table it_kna1
    where kunnr in it_rnrng.
This example defines a range table manually, adds two rows to it, and then queries the transparent table KNA1 with it. The internal table it_kna1 should then contain all KNA1 records with KUNNR between 0001000000 and 0002999999, plus the record with KUNNR 0003500000. So you can combine several ranges in a single range table.
The field SIGN contains either 'I' for Include or 'E' for Exclude (meaning that matching records should be included in or excluded from the result). The field OPTION defines the comparison operator ('EQ' for equal, 'BT' for between, 'GT' for greater than, 'GE' for greater or equal, and so forth). The fields LOW and HIGH contain the actual range limits. If you only want to match a single value, you don't have to supply both LOW and HIGH; you only need to supply LOW (and of course SIGN and OPTION).
Range tables are also created when you define a SELECT-OPTIONS element in a report. Users can then use the selection screen to define the ranges they want to use in the report.
REPORT zztest.

TABLES: kna1.

DATA: it_kna1 TYPE STANDARD TABLE OF kna1.

SELECT-OPTIONS:
  so_kndrng FOR kna1-kunnr.

START-OF-SELECTION.

  SELECT * FROM kna1 INTO CORRESPONDING FIELDS OF TABLE it_kna1
    WHERE kunnr IN so_kndrng.
For further information regarding range tables, you can also refer to the official SAP documentation.