Why is my specified range in PROC IMPORT being ignored?

I am trying to import a set of exchange rates. The data set looks like this:
That is to say, the actual data should be read from row 5 downward in the sheet named "Växelkurser". The variable names should be read from row 4.
I tried the following code:
PROC IMPORT
DATAFILE="/opt3/01_Dataleveranser/03_IBIS/Inläsning/IBIS3/Växelkurser macrobond/Växelkurser19DEC2022.xlsx"
OUT=WORK.VALUTOR_0000
DBMS=xlsx
REPLACE;
sheet="Växelkurser";
getnames=yes;
range="Växelkurser$A4:0";
RUN;
And I get the following result:
I clearly specified that SAS should start reading from the fourth row and that the variable names should be read from that row. Why is this being ignored and how would I make this work?

The problem seems to be that you are specifying both sheet= and range=. The sheet= statement tells SAS to read the whole sheet, and I think this overrides the later range= statement.
Remove the following line and the code should work as expected:
sheet="Växelkurser";

Related

MySQL: How do I remove/skip the first row (before the headers) from my stored procedure result?

I am calling a stored procedure which results in the following output
CALL `resale`.`reportProfitAndLossSummary`(3,' ',599025,TRUE);
OUTPUT:
"CONCAT('"',
CONCAT_WS('","',
"Promoter",
"Event",
"Event Description",
"Zone",
"Tickets Unsold",
"
""Promoter","Event","Event Description","Zone","Tickets Unsold","Avg. Unsold Price","Tickets Sold","Avg. Sold Price","Avg. Cost","Profit","Revenue""
""Qcue","10/2/2022 1:15 PM Pirates # Cardinals","Pirates # Cardinals","1/3B Field Box",0,,16,149.761250,42.000000,1724.18,2396.18"
I exported the result to .csv and discovered that a new code chunk is created above the header, which distorts the structure of the file. Is there a way to skip this code chunk? I tried "-N" and "-ss", since the code chunk appears as the header, but neither worked in MySQL Workbench. Setting the header option to "FALSE" in the stored procedure call removes the actual headers and not the undesired code.
The stored procedure was developed by someone else, so I am not sure where to begin fixing this. The goal is to remove the undesired code from the query result itself, not from the .csv export.

SAS Proc Import Specific Range from xlsm file

I need to import an xlsm file and pull just one cell value from a specific sheet.
I've tried using the below but get a 'CLI error trying to establish connection' error. I do have to use the rsubmit blocks. What am I doing wrong?
RSUBMIT INHERITLIB=(mywork);
OPTIONS msglevel=i VALIDVARNAME= any;
proc import datafile="\\mysite.com\folder1\folder2\myfile.xlsm"
dbms=EXCELCS replace out=Output;
range="EmailSummary$O5";
run;
ENDRSUBMIT;
If you want to import only one cell then you need to tell IMPORT not to look for names and also give it both the upper left and lower right cell of the range.
getnames=no;
range="EmailSummary$O5:O5";

Using the toInteger function with locale and format parameters

I've got a data flow with a CSV file as source. The column NewPositive is a string and contains numbers formatted in European style with a dot as thousands separator, e.g. 1.019 meaning 1019.
If I use the function toInteger to convert my NewPositive column to an int via toInteger(NewPositive,'#.###','de'), I only get the thousands digit, e.g. 1 for 1.019, and not the rest. Why? For testing I tried creating a constant column, toInteger('1.019','#.###','de'), and it gives 1019 as expected. So why does the function not work for my column? The column is trimmed, and if I compare the first value with the equality function, equals('1.019',NewPositive) returns true.
Please note: I know it's very easy to create a workaround by toInteger(replace(NewPositive,'.','')), but I want to learn how to use the toInteger function with the locale and format parameters.
Here is sample data:
Dato;NewPositive
2021-08-20;1.234
2021-08-21;1.789
I was able to reproduce this, and it looks like a bug to me. I have reported it to the ADF team and will let you know once I hear back from them. You already have a workaround, so please go ahead with that to unblock yourself.
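For comparison, the result the question expects from the pattern '#.###' with locale 'de' can be sketched in Python. This is not how ADF implements toInteger; `parse_grouped_int` is a hypothetical helper, and the separator table is an assumption limited to the two locales shown.

```python
# Assumed grouping separators for the two locales used in this illustration.
GROUPING_SEPARATOR = {"de": ".", "en": ","}


def parse_grouped_int(text: str, locale_code: str = "de") -> int:
    """Parse an integer whose thousands are grouped, e.g. '1.019' under 'de'."""
    separator = GROUPING_SEPARATOR[locale_code]
    # Drop the grouping separator, then parse the remaining digits.
    return int(text.strip().replace(separator, ""))


# Sample values from the question's CSV.
for sample in ["1.019", "1.234", "1.789"]:
    print(sample, "->", parse_grouped_int(sample))
```

This does the same thing as the replace() workaround mentioned in the question, so it only shows the target behavior and does not explain the bug.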

Issues with "QUERY(IMPORTRANGE)"

Here's my first question on this forum, though I've read through a lot of good answers here.
Can anyone tell me what I'm doing wrong with my attempt to do a query import from one sheet to a column in another?
Here's the formula I've tried, but all my adjustments still get me a parsing error.
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1yGPdI0eBRNltMQ3Wr8E2cw-wNlysZd-XY3mtAnEyLLY/edit#gid=163356401","Master Treatment Log (Responses)!V2:V")"WHERE Col8="'&B2&'")")
Note that importrange is only needed for imports between spreadsheets. If you only import from one sheet into another within the same spreadsheet I would suggest using filter() or query().
Assuming the value in B2 is actually a string (and not a number), you can try
=QUERY(IMPORTRANGE("https://docs.google.com/spreadsheets/d/1yGPdI0eBRNltMQ3Wr8E2cw-wNlysZd-XY3mtAnEyLLY/edit#gid=163356401","Master Treatment Log (Responses)!V2:V"), "WHERE Col8='"&B2&"'", 0)
Note the added comma before "WHERE", and that each single quote sits inside a string literal so that the value of B2 is concatenated between them. If you want to import a header row, change 0 to 1.
See if that helps? If not, please share a copy of your spreadsheet (sensitive data erased).

Why does Open XML API Import Text Formatted Column Cell Rows Differently For Every Row

I am working on an ingestion feature that will take a strongly formatted .xlsx file and import the records to a temp storage table and then process the rows to create db records.
One of the columns is strictly formatted as "Text", but the Open XML API seems to handle the column's cells differently on a row-by-row basis. Some of the values look numeric but truly are not (which is why we format the column as Text).
Some examples are "211377", "211727.01", "209395.388", "209395.435".
What these values represent is not important. What happens is that some values (using the Open XML API v2.5 library) are read in properly as text, whether retrieved from the Shared Strings collection or simply from the InnerXML property, while others get pulled in as numbers with what appears to be appended rounding or precision.
For example the "211377", "211727.01" and "209395.435" all come in exactly as they are in the spreadsheet but the "209395.388" value is being pulled in as "209395.38800000001" (there are others that this happens to as well).
There seems to be no rhyme or reason to which values get messed up and which ones import fine. What is really frustrating is that if I use the native Import feature in SQL Server Management Studio and ingest the same spreadsheet to a temp table, this does not happen. So how is it that the SSMS import can handle these values as pure text for all rows but the Open XML API cannot?
To begin, your main problem seems to be with values like this one:
"209395.388" value is being pulled in as "209395.38800000001"
Yes, in the .xlsx file the value is stored as 209395.38800000001 instead of 209395.388. That is a normal way to store a floating-point number: 209395.388 has no exact binary representation, and the file holds the 17-significant-digit form of the nearest double. Nothing is wrong with it. You can confirm this with the following code snippet:
string val = "209395.38800000001"; // <= what we extract with Open XML
Console.WriteLine(double.Parse(val)); // <= parse it as a double and print it
The output is :
209395.388 // <= yes the expected value
So there's nothing wrong in the value you extract from .xlsx using Open Xml SDK.
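The same round trip can be checked outside .NET. The sketch below uses Python (not part of the original answer) purely as an illustration; it assumes only that the cell's raw text is the string quoted in the question.

```python
# Raw text as stored in the worksheet XML (value taken from the question).
raw = "209395.38800000001"

parsed = float(raw)  # parse as a 64-bit IEEE 754 double

# Both spellings name the same double, so nothing was corrupted on import.
same_double = parsed == 209395.388

# repr() gives the shortest string that round-trips to the same double.
shortest = repr(parsed)

# Formatting with 17 significant digits reproduces what the file contains.
seventeen_digits = format(209395.388, ".17g")

print(same_double, shortest, seventeen_digits)
# prints: True 209395.388 209395.38800000001
```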
Now to cells: yes, a cell can hold a variety of formats: numbers, text, Booleans, or shared-string text. You can also apply styles to a cell, which format the stored value for display in Excel (for example date/time formats or forced strings). This is how Excel handles such a wide variety of data. It needs this kind of formatting, and the .xlsx file format has to be somewhat complex to support it all.
My advice is to run each extracted value through an appropriate parse method: first determine what it represents (for example, whether it is a number or text), then apply the matching parse.
For example:
string val = "209395.38800000001";
Console.WriteLine(float.Parse(val)); // <= float.Parse yields a different value: 209395.4
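One way to act on this advice is to pass numeric cell text through a double before keeping it as a string. The sketch below is in Python rather than C#, and `normalize_cell_text` is a hypothetical helper name. It assumes that text which does not parse as a number should be returned unchanged, and that integers should be kept exactly as written.

```python
import math


def normalize_cell_text(raw: str) -> str:
    """Return numeric cell text in its shortest round-trip decimal form.

    Hypothetical helper: anything that is not a finite number comes back as is.
    """
    try:
        number = float(raw)
    except ValueError:
        return raw  # genuine text, for example a shared string
    if not math.isfinite(number) or number.is_integer():
        return raw  # keep integers such as "211377" exactly as written
    # repr() picks the shortest decimal string that maps to the same double.
    # Note: very small or very large magnitudes may come back in exponent form.
    return repr(number)


for raw in ["211377", "211727.01", "209395.38800000001", "N/A"]:
    print(raw, "->", normalize_cell_text(raw))
```

Parsing to a double (not a single-precision float) matters here, because the stored text is the 17-digit form of a double.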
Update:
Here is how the value is saved in the internal XML.
Try it for yourself:
Make an .xlsx file with the value 209395.388 -> change the extension to .zip -> unzip it -> go to the xl/worksheets folder -> open sheet1.xml
You will notice that the value is stored as 209395.38800000001, as seen in the attached image. So there is nothing wrong with the API for extracting the stored number. It is up to you to decide what format to apply.
But if you format the whole column as Text before adding the data, you will see that the .xlsx file holds the data as is, simply as a string.
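The manual unzip-and-inspect steps can also be scripted. The following is a minimal Python sketch, not part of the original answer; `raw_cell_values` is a hypothetical helper, and the default part name `xl/worksheets/sheet1.xml` is an assumption that may not hold for every workbook.

```python
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def raw_cell_values(xlsx_file, sheet_part="xl/worksheets/sheet1.xml"):
    """Return (cell reference, type attribute, raw <v> text) for each cell.

    `xlsx_file` may be a path or a binary file-like object.
    `sheet_part` is an assumed default; the part name can differ per workbook.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read(sheet_part))
    cells = []
    for cell in root.iterfind(".//m:sheetData/m:row/m:c", NS):
        value = cell.find("m:v", NS)
        cells.append((
            cell.get("r"),  # cell reference such as "A1"
            cell.get("t"),  # None for plain numbers, "s" for shared strings
            value.text if value is not None else None,
        ))
    return cells
```

Calling `raw_cell_values("Book1.xlsx")` (a hypothetical file name) lists what is physically stored. A numeric cell has no `t` attribute and carries the 17-digit number, whereas a cell entered as Text has `t="s"` and its `<v>` is an index into xl/sharedStrings.xml.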