I'm inputting a file that needs to be converted to an XML file, and I also want to input a .i with the definition of the temp-table used to create the XML. Also, the delimiter isn't working (I need a way to pass a variable where the command expects a literal delimiter). Thanks!
define input parameter pInputFile as character no-undo.
define input parameter pDelimiter as character no-undo.
define input parameter pIncludeFile as character no-undo. /* not sure this is possible? */
define output parameter pOutputFile as character no-undo init "/tmp/out..".
/* start of .i */
define temp-table ttGeneric no-undo
    field cust_id  as integer
    field name     as character
    field address  as character
    field address2 as character
    field city     as character
    field state    as character
    field zip      as character
    field cust_key as character
    index idx is primary cust_id.
/* end of .i */
define stream sImport.

input stream sImport from value(pInputFile) no-echo.
repeat:
    create ttGeneric.
    import stream sImport delimiter pDelimiter ttGeneric. /* the delimiter doesn't work here */
end.
input stream sImport close.
temp-table ttGeneric:write-xml("file", pOutputFile, yes).
Maybe I could set a preprocessor in the calling program (somehow)?
Delimiters to IMPORT and EXPORT must be literal strings. You cannot use variables, fields, parameters or anything like that.
I have, on occasion, worked around that with a CASE statement. i.e.:
case pDelimiter:
    when "," then import stream sImport delimiter "," ttGeneric.
    when "|" then import stream sImport delimiter "|" ttGeneric.
end case.
Ugly. But it works.
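The same idea, sketched outside ABL for comparison (Python here; the function and the whitelist are illustrative, not part of the original answer): the runtime delimiter is checked against a fixed set before use, which is effectively what the CASE statement does.

```python
# Sketch of the CASE-statement workaround: a delimiter arrives in a
# variable, but only a fixed whitelist of values is actually honored.
import csv
import io

SUPPORTED_DELIMITERS = (",", "|", ";", "\t")  # assumed whitelist

def import_rows(text, delimiter):
    """Parse delimited text, rejecting delimiters outside the whitelist."""
    if delimiter not in SUPPORTED_DELIMITERS:
        raise ValueError("unsupported delimiter: %r" % delimiter)
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))
```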
I think you might be trying to say that you want to pass the name of an include file that contains a TT definition? And somehow associate that definition with a temp-table?
If that is more or less correct then you are probably being too specific about your requirement -- you probably really just want to dynamically create a TT whose definition is external to the program and unknown at compile time.
One way to do that is to use the read-xmlschema() method -- you're already using write-xml() so it's a small step... first convert your .i to a little .p like so:
define temp-table ttGeneric no-undo
    field cust_id  as integer
    field name     as character
    field address  as character
    field address2 as character
    field city     as character
    field state    as character
    field zip      as character
    field cust_key as character
    index idx is primary cust_id.

buffer ttGeneric:write-xmlschema( "file", "ttgeneric.xsd", true, ?, ?, ? ).

return.
(This little stub lets you create the .XSD file. It serves no other purpose. Just run it once to get that file.)
Then when you want to use that temp-table:
define variable tx as handle no-undo.
create temp-table tx.
tx:read-xmlschema( "file", "ttgeneric.xsd", ?, ?, ? ).
(Note: Unlike with delimiters you can use variables & parameters etc. for the xsd name and it could be in a longchar rather than a file...)
The next adventure you will run into is figuring out a replacement for IMPORT that works with dynamic temp-tables. Buffer handles don't have import() and export() methods :(
The following snippets may help:
define variable dummy as character no-undo extent 128.
define variable i     as integer   no-undo.
...
dummy = ?.
import dummy.
...
tx:default-buffer-handle:buffer-create().
do i = 1 to 128:
    if dummy[i] = ? then leave.
    tx:default-buffer-handle:buffer-field( i ):buffer-value = dummy[i].
end.
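For comparison, here is a rough Python analogue of that array trick: read each delimited line into cells, stop at the first missing cell, and assign the values positionally to the record's fields. The field names come from the question's temp-table; everything else is illustrative.

```python
# Sketch: generic delimited import into records whose field list is only
# known at run time (here a plain list; in ABL, the dynamic buffer).
import csv
import io

FIELDS = ["cust_id", "name", "address", "address2",
          "city", "state", "zip", "cust_key"]

def load_records(text, delimiter=","):
    """Build one dict per line, assigning cells to fields by position."""
    records = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        rec = {}
        for i, field in enumerate(FIELDS):
            if i >= len(row):
                break            # like "if dummy[i] = ? then leave."
            rec[field] = row[i]
        records.append(rec)
    return records
```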
Related
I am attempting to import a CSV into ADF; however, the file header is not on the first line of the file. Its position is dynamic, so I need to match the header based on its first column (e.g. "TestID,"), which is a string.
Example Data (Header is on Line 4)
Date:,01/05/2022
Time:,00:30:25
Test Temperature:,25C
TestID,StartTime,EndTime,Result
TID12345-01,00:45:30,00:47:12,Pass
TID12345-02,00:46:50,00:49:12,Fail
TID12345-03,00:48:20,00:52:17,Pass
TID12345-04,00:49:12,00:49:45,Pass
TID12345-05,00:50:22,00:51:55,Fail
I found this article, which addresses the issue; however, I am struggling to rewrite the expression from using an integer to using a string.
https://kromerbigdata.com/2019/09/28/adf-dynamic-skip-lines-find-data-with-variable-headers
First Expression
iif(!isNull(toInteger(left(toString(byPosition(1)),1))),toInteger(rownum),toInteger(0))
As the article states, this expression looks at the first character of each row, and if it is an integer it returns the row number (rownum).
How do I perform this action for a string (e.g "TestID,")
Many Thanks
Jonny
I think you want to treat the first line that starts with a string as your header, and the preceding lines that start with numbers should not be considered part of the header. You can use the isNan function to check whether the first character is not a number (i.e., a string), as in the modified expression below:
iif(isNan(left(toString(byPosition(1)),1))
,toInteger(rownum)
,toInteger(0)
)
Following is a breakdown of the above expression:
left(toString(byPosition(1)),1): gets the first character from the left side of the first column.
isNan: checks whether that character is "not a number".
iif: if it is not a number (true), return rownum; otherwise return 0.
Alternatively, you can use a function like isInteger() to check whether the first character is an integer and act accordingly.
Later on, as explained in the cited article, you need to find the minimum rownum to skip.
Hope it helps.
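Outside ADF, the string-match version the question asks for (find the header by its known first column) can be sketched in Python; header_prefix is the "TestID," example from the question, and the function is illustrative only:

```python
# Sketch: count how many leading lines precede a header row that is
# identified by a known first column (e.g. "TestID,").
def rows_to_skip(lines, header_prefix="TestID,"):
    """Return the number of lines before the header row, or 0 if absent."""
    for index, line in enumerate(lines):
        if line.startswith(header_prefix):
            return index
    return 0
```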
I have a series of formatted numeric variables and I would like to convert them all into character variables assigned the corresponding values found in the format labels. Here is an example of the format:
proc format;
value Group
1= 'Experimental 1'
2= 'Experimental 2'
3= 'Treatment as usual';
run;
My variable Group_num has values 1-3 and has this format applied. I want to create a new character variable called Group_char which has the values "Experimental 1", "Experimental 2", and "Treatment as usual".
The (long) way I would do this would be:
data out;
set in;
format Group_char $30.;
if Group_num=1 then Group_char="Experimental 1";
if Group_num=2 then Group_char="Experimental 2";
if Group_num=3 then Group_char="Treatment as usual";
run;
However, I need to do this for 13 different variables, and I don't know their values, format names, or format labels without inspecting the data further. Ideally, I would use whatever format is already applied to each variable to translate it into a new character variable automatically, without needing to know the format name, the labels, or the original values. If I had to look up the format name and could then create the character variable from that alone, that would still be better than also needing the original values and the format labels.
Alternatively, my problem would also be solved if there is a way of importing SPSS datasets using variable value labels only, leaving the values themselves out of the picture entirely, so that numeric variables with value labels are imported as character variables.
Thank you
First off, it's usually best not to do this - most of the time that you need the character bits, you can get them off of the formats.
But, that said... you need to look at the vvalue function.
data want;
set have;
var_char = vvalue(var_num);
run;
vvalue returns the formatted value of the argument.
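As a conceptual cross-check (Python, not SAS): VVALUE amounts to looking the raw value up in the format attached to the variable and falling back to the raw value when there is no label. Here the Group format from the question is sketched as a lookup table; the function name is illustrative.

```python
# The Group format from the question as a lookup table; vvalue_like
# mimics what VVALUE does for a variable with that format attached.
GROUP_FORMAT = {
    1: "Experimental 1",
    2: "Experimental 2",
    3: "Treatment as usual",
}

def vvalue_like(value, fmt):
    """Return the format label for value, or the raw value as text."""
    return fmt.get(value, str(value))
```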
I am working in MATLAB, trying to add units to the column headers of a table of values that I will then insert into an SQLite database. Some of my column names contain German characters (e.g. 'ß', 'ä'), which is invalid because of the special characters: everything I've found so far says that column headers must be valid variable names, i.e. alphanumeric and "_" only.
But I cannot change my original database column names, so does anyone know of a work-around?
My code of building a table and sending into database is:
insertData = cell2table(full_matrix,'VariableNames',colnames);
insert(conn,tableName,colnames,insertData);
And some of my column names:
'maß','kapazität', 'räder'
Thank you very much for helping.
Do you have to create the table first? I would try just passing the cell array of data directly to insert like so:
insert(conn, tableName, colnames, full_matrix);
The above assumes that it is the cell2table call that is giving you an error related to the special characters. If it's the insert call, then I guess MATLAB won't let you create databases with column names that don't conform to its variable naming conventions. If that's the case, you'll have to convert the column names to something valid, which you can do with either genvarname (for older MATLAB versions) or matlab.lang.makeValidName (suggested for versions R2014a and newer):
colNames = {'maß','kapazität', 'räder'};
validNames = genvarname(colNames);
% or...
validNames = matlab.lang.makeValidName(colNames, 'ReplacementStyle', 'hex');
validNames =
1×3 cell array
'ma0xDF' 'kapazit0xE4t' 'r0xE4der'
Note that the above solutions replace the invalid characters with their hex equivalents. You could also change the 'ReplacementStyle' to replace them with underscores or delete them altogether. I would go with the hex values because it gives you the option of converting the column names back to their original string values if you need those for anything later. Here's how you could do that using regexprep, hex2dec, and char:
originalNames = regexprep(validNames, '0x([\dA-F]{2})', '${char(hex2dec($1))}');
originalNames =
1×3 cell array
'maß' 'kapazität' 'räder'
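The hex ReplacementStyle and its reversal can be imitated outside MATLAB too. A Python sketch, assuming the names contain only single-byte Latin-1 characters (multi-byte characters would need a wider escape); both functions are illustrative:

```python
import re

def make_valid_name(name):
    """Replace each non-ASCII character with a 0xHH escape, mimicking
    matlab.lang.makeValidName with 'ReplacementStyle','hex'."""
    return "".join(c if ord(c) < 128 else "0x%02X" % ord(c) for c in name)

def restore_name(valid):
    """Reverse the escaping, like the regexprep/hex2dec/char one-liner."""
    return re.sub(r"0x([0-9A-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), valid)
```

Note that a name containing a literal "0x41" would be falsely decoded by the reversal; the hex style only round-trips cleanly when the originals contain no such sequences.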
I am facing an issue while converting Unicode data into national characters.
When I convert the Unicode data into national using the NATIONAL-OF function, junk characters like # are appended after the string.
E.g.
Ws-unicode pic X(200)
Ws-national pic N(600)
-- let the value in Ws-unicode be これらの変更は, received from the Java end.
move function national-of ( Ws-unicode , 1208 ) to Ws-national.
-- after converting, the value is like これらの変更は #.
I do not want the extra # characters added after conversion.
Please help me find a possible solution. I have tried to replace N'#' with space using the INSPECT clause;
it worked well but failed in some specific scenarios: if the user's input contains a genuine #, it also gets converted to a space.
Below is a snippet of code I used to convert EBCDIC to UTF-8. Before I started capturing string lengths, I was also getting # symbols:
STRING
FUNCTION DISPLAY-OF (
FUNCTION NATIONAL-OF (
WS-EBCDIC-STRING(1:WS-XML-EBCDIC-LENGTH)
WS-EBCDIC-CCSID
)
WS-UTF8-CCSID
)
DELIMITED BY SIZE
INTO WS-UTF8-STRING
WITH POINTER WS-XML-UTF8-LENGTH
END-STRING
SUBTRACT 1 FROM WS-XML-UTF8-LENGTH
What this code does is string the UTF-8 representation of the EBCDIC string into another variable. The WITH POINTER clause captures the new length of the string + 1 (+1 because the pointer ends up at the position after the last character), hence the SUBTRACT.
Using this method, you know exactly how long the second string is and can use it at that exact length.
That should remove the unwanted #s.
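The reason this works: the conversion runs only over the used portion of the fixed-width field, and the tracked length tells you exactly how much of the result is meaningful. A Python sketch of the same idea (the field width, the character-based length, and the function are illustrative, not COBOL semantics):

```python
FIELD_WIDTH = 200  # stands in for PIC X(200), space padded

def convert_field(value, used_length):
    """Convert only the used prefix of a fixed-width field; the padding
    beyond it is what would otherwise come out as junk characters."""
    padded = value.ljust(FIELD_WIDTH)            # the fixed-width field
    utf8 = padded[:used_length].encode("utf-8")  # convert used part only
    return utf8, len(utf8)                       # value plus exact length
```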
EDIT:
One thing I forgot to mention: in my case, the # signs were actually EBCDIC low-values when viewing the actual hex on the mainframe.
Use INSPECT with FUNCTION REVERSE and stop after the first occurrence of #.
I have a folder on the C: drive which contains 1000 .txt files, and I want to get a list of all of them. How can I get this list?
Use the OS-DIR() function.
For example:
DEFINE STREAM dirlist.
DEFINE VARIABLE filename AS CHARACTER FORMAT "x(30)" NO-UNDO.
INPUT STREAM dirlist FROM OS-DIR(".").
REPEAT:
IMPORT STREAM dirlist filename.
DISPLAY filename.
END.
INPUT CLOSE.
DEFINE INPUT PARAMETER ipcPath AS CHARACTER NO-UNDO. /* for example: ipcPath = "C:\temp\" */
DEFINE VARIABLE chFiles AS CHARACTER NO-UNDO.

INPUT FROM OS-DIR(ipcPath).
REPEAT:
    IMPORT UNFORMATTED chFiles NO-ERROR.
    DISPLAY chFiles FORMAT "X(75)".
END.
INPUT CLOSE.
chFiles is a space-delimited list containing the file name, the full path, and an 'F' (file) or 'D' (directory) flag.
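For comparison outside ABL, the same listing is a short Python sketch (function name illustrative); it filters a folder down to its .txt files:

```python
import os

def list_txt_files(folder):
    """Return the .txt file names found in folder, sorted."""
    return sorted(name for name in os.listdir(folder)
                  if name.lower().endswith(".txt"))
```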
I have a directory-tools program which enables a developer to do all kinds of fun things with file systems. You can get the code here: http://communities.progress.com/pcom/docs/DOC-16578