Using nzload to upload a file with two differing date formats

I am trying to load a file exported from a table in an Oracle database into Netezza. The file contains two different date formats: one field uses DD-MON-YY and another uses DD-MON-YYYY hh24:MI:SS. Is there any way within nzload to cater for two different date formats in one file?
Thanks,
rob

If your file is fixed-length, you can use zones.
However, if it's field-delimited, you can use a preprocessing tool such as sed to convert all the date and timestamp values to one standard format before piping the output to nzload.
For example, take these two values:
1. 01-JAN-17
2. 01-JAN-2017 11:20:32
Let's convert the date field to the same format as the date part of the timestamp:
cat output.dat |\
sed -E 's/([0-9]{2})-([A-Z]{3})-([0-9]{2})([^0-9]|$)/\1-\2-20\3\4/g' |\
nzload -dateStyle DMONY -dateDelim '-'
The sed expression is fairly simple; let's break it down:
# look for 2 digits followed by
# 3 uppercase letters followed by
# 2 digits, all separated by '-'
# elements are grouped with '()' so they can be referred to by number
# the 4th group requires a non-digit (or end of line) after the 2-digit year,
# so the first two digits of a 4-digit year such as 2017 are not matched
's/([0-9]{2})-([A-Z]{3})-([0-9]{2})([^0-9]|$)
# reconstruct the date using the group numbers and separator, prefixing 20 to YY,
# and put back the character captured by the 4th group
/\1-\2-20\3\4
# apply globally
/g'
In nzload we have also specified the date format and its delimiter.
The regular expression has to be adapted to the date formats involved and to the format they are converted to, so this may not be a universal solution.
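The same normalization can be sketched in Python, where word boundaries make it explicit that a four-digit year must be left alone. This is a minimal sketch and not part of the original answer: the function name normalize_dates is hypothetical, and the fixed 20 century prefix is the same assumption the sed command makes.

```python
import re

# DD-MON-YY with word boundaries on both sides, so the first two digits of a
# four-digit year (as in 01-JAN-2017) are never taken for a two-digit year.
SHORT_DATE = re.compile(r"\b(\d{2})-([A-Z]{3})-(\d{2})\b")


def normalize_dates(line: str, century: str = "20") -> str:
    """Expand every DD-MON-YY in line to DD-MON-YYYY (assumes one fixed century)."""
    return SHORT_DATE.sub(
        lambda m: f"{m.group(1)}-{m.group(2)}-{century}{m.group(3)}", line
    )


if __name__ == "__main__":
    print(normalize_dates("01-JAN-17|01-JAN-2017 11:20:32"))
```

Values that already carry a four-digit year, and the time part of the timestamp, pass through unchanged.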

How to delete space in character text?

I wrote code that automatically pulls time-related information from the system. The month names in table T247 have a fixed length of 10 characters, which looks bad on the report screen because of the trailing spaces.
I print it this way:
WRITE : 'Bugün', t_month_names-ltx, ' ayının'.
CONCATENATE gv_words-word '''nci günü' INTO date.
CONCATENATE date ',' INTO date.
CONCATENATE date gv_year INTO date SEPARATED BY space.
TRANSLATE date TO LOWER CASE.
I tried CONDENSE t_month_names-ltx NO-GAPS. to delete the spaces, but it was not enough.
I was able to get the desired layout by hard-coding the output position of the next WRITE:
WRITE : 'Bugün', t_month_names-ltx.
WRITE : 14 'ayının'.
CONCATENATE gv_words-word '''nci günü' INTO date.
CONCATENATE date ',' INTO date.
CONCATENATE date gv_year INTO date SEPARATED BY space.
TRANSLATE date TO LOWER CASE.
But this is not a proper solution. How do I achieve this dynamically?

You could use a temporary field of type STRING:
DATA l_month TYPE STRING.
l_month = t_month_names-ltx.
WRITE : 'Bugün', l_month.
WRITE : 14 'ayının'.
CONCATENATE gv_words-word '''nci günü' INTO date.
CONCATENATE date ',' INTO date.
CONCATENATE date gv_year INTO date SEPARATED BY space.
TRANSLATE date TO LOWER CASE.

You cannot delete trailing spaces from a TYPE C field, because it has a fixed length. The unused length is always filled with spaces.
But after you have assembled your string, you can use CONDENSE without NO-GAPS to reduce any run of more than one space within the string to a single space.
Add CONDENSE date. below the code you wrote and you should get the result you want.
Another option is to abandon CONCATENATE and use string templates (string literals within | symbols) for string assembly instead, which do not have the annoying habit of including the trailing spaces of TYPE C fields:
DATA long_char TYPE C LENGTH 128.
long_char = 'long character field'.
WRITE |this is a { long_char } inserted without spaces|.
Output:
this is a long character field inserted without spaces

Extracting symbol names from nm output

I'd like to use the symbol names printed by nm -P -g to generate a .c file. However, I'm not sure how to extract those symbol names.
Reading https://pubs.opengroup.org/onlinepubs/9699919799/utilities/nm.html says:
The format given in nm STDOUT uses <space> characters between the fields, which may be any number of <blank> characters required to align the columns.
I'm not sure how to interpret this. Should my regex be ^[^ ]+_mkdocs[ ] or something else? I want the result to be the extracted symbol name followed by (&doc);
e.g.
foo_mkdocs T 0 0
should become
foo_mkdocs(&doc);
but I'm unsure if I'm understanding nm's output format specification correctly.
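One way to read the quoted sentence is that fields are separated by one or more blanks, so splitting each line on whitespace and taking the first field avoids the regex altogether. Below is a minimal Python sketch under that reading, assuming nm is run without -A so that the symbol name is the first field; the function name mkdocs_calls is hypothetical.

```python
from typing import List


def mkdocs_calls(nm_output: str, suffix: str = "_mkdocs") -> List[str]:
    """Turn `nm -P -g` output into C call statements for matching symbols."""
    calls = []
    for line in nm_output.splitlines():
        fields = line.split()  # splits on any run of blanks
        if fields and fields[0].endswith(suffix):
            calls.append(f"{fields[0]}(&doc);")
    return calls


if __name__ == "__main__":
    sample = "foo_mkdocs T 0 0\nmain       T 0 0\n"
    print("\n".join(mkdocs_calls(sample)))
```

Because split() collapses runs of blanks, the result does not depend on how many blanks nm uses to align the columns.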

multi-character separator in `set datafile separator "|||"` doesn't work

I have an input file example.data with a triple-pipe as separator, dates in the first column, and also some more or less unpredictable text in the last column:
2019-02-01|||123|||345|||567|||Some unpredictable textual data with pipes|,
2019-02-02|||234|||345|||456|||weird symbols # and commas, and so on.
2019-02-03|||345|||234|||123|||text text text
When I try to run the following gnuplot5 script
set terminal png size 400,300
set output 'myplot.png'
set datafile separator "|||"
set xdata time
set timefmt "%Y-%m-%d"
set format x "%y-%m-%d"
plot "example.data" using 1:2 with linespoints
I get the following error:
line 8: warning: Skipping data file with no valid points
plot "example.data" using 1:2 with linespoints
^
"time.gnuplot", line 8: x range is invalid
Even stranger, if I change the last line to
plot "example.data" using 1:4 with linespoints
then it works. It also works for 1:7 and 1:10, but not for other numbers. Why?

When using the
set datafile separator "chars"
syntax, the string is not treated as one long separator. Instead, every character listed between the quotes becomes a separator on its own. From [Janert, 2016]:
If you provide an explicit string, then each character in the string will be
treated as a separator character.
Therefore,
set datafile separator "|||"
is actually equivalent to
set datafile separator "|"
and a line
2019-02-05|||123|||456|||789
is treated as if it had ten columns, of which only the columns 1,4,7,10 are non-empty.
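The column count is easy to check outside gnuplot. The Python sketch below mimics the per-character behaviour by splitting on a single |; it only illustrates the counting and says nothing about how gnuplot is implemented.

```python
line = "2019-02-05|||123|||456|||789"

# Splitting on the single character "|" yields an empty field between
# each pair of adjacent pipes.
fields = line.split("|")
non_empty = [i for i, f in enumerate(fields, start=1) if f]

print(len(fields))  # number of columns
print(non_empty)    # 1-based positions of the non-empty columns
```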
Workaround
Find some other character that is unlikely to appear in the dataset (in the following, I'll assume \t as an example). If you can't dump the dataset with a different separator, use sed to replace ||| by \t:
sed 's/|||/\t/g' example.data > modified.data # in the command line
then proceed with
set datafile separator "\t"
and modified.data as input.
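If sed is not available, or its replacement syntax does not interpret \t (not every sed implementation does), the same conversion is short in Python. This is a sketch that assumes the data contains no literal tab characters; the names replace_separator, convert_file and convert.py are hypothetical.

```python
import sys


def replace_separator(text: str, old: str = "|||", new: str = "\t") -> str:
    """Replace every occurrence of the multi-character separator."""
    return text.replace(old, new)


def convert_file(src_path: str, dst_path: str) -> None:
    """Write a copy of src_path with ||| replaced by a tab character."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(replace_separator(line))


if __name__ == "__main__" and len(sys.argv) == 3:
    # e.g. python convert.py example.data modified.data
    convert_file(sys.argv[1], sys.argv[2])
```

Single pipes inside the free-text column are left untouched, since only the full ||| sequence is replaced.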

You basically gave the answer yourself.
If you can influence the separator in your data, use a separator which typically does not occur in your data or text. I always thought \t was made for that.
If you cannot influence the separator in your data, use an external tool (awk, Python, Perl, ...) to modify your data. In these languages it is probably a "one-liner". gnuplot has no direct replace function.
If you don't want to install external tools and want to ensure platform independence, there is still a way to do it with gnuplot alone. It is not a "one-liner", but there is almost nothing you can't do with gnuplot ;-).
Edit: simplified version with the input from @Ethan (https://stackoverflow.com/a/54541790/7295599).
Assume your data is in a datablock named $Data. The following code replaces ||| with \t and puts the result into $DataOutput.
### Replace string in dataset
reset session
$Data <<EOD
# data with special string separators
2019-02-01|||123|||345|||567|||Some unpredictable textual data with pipes|,
2019-02-02|||234|||345|||456|||weird symbols # and commas, and so on.
2019-02-03|||345|||234|||123|||text text text
EOD
# replace string function
# prefix RS_ to avoid variable name conflicts
replaceStr(s,s1,s2) = (RS_s='', RS_n=1, (sum[RS_i=1:strlen(s)] \
((s[RS_n:RS_n+strlen(s1)-1] eq s1 ? (RS_s=RS_s.s2, RS_n=RS_n+strlen(s1)) : \
(RS_s=RS_s.s[RS_n:RS_n], RS_n=RS_n+1)), 0)), RS_s)
set print $DataOutput
do for [RS_j=1:|$Data|] {
print replaceStr($Data[RS_j],"|||","\t")
}
set print
print $DataOutput
### end of code
Output:
# data with special string separators
2019-02-01 123 345 567 Some unpredictable textual data with pipes|,
2019-02-02 234 345 456 weird symbols # and commas, and so on.
2019-02-03 345 234 123 text text text

Why does strptime from Time::Piece not parse my format?

My colleague (who has left the company) wrote a bunch of scripts, including batch and Perl scripts, and I'm removing their dependencies on regional settings.
In the last Perl script, he wrote the following piece of code:
my $format = "%d.%m.%Y %H:%M";
my $today_converted = Time::Piece->strptime($today, $format) - ONE_HOUR - ONE_HOUR - ONE_HOUR - ONE_HOUR - ONE_HOUR;
(The idea is to get the time five hours before midnight of that particular date.)
The value of $today seems to be "03/04/2017" (the third of April, in European date format), which the Time::Piece implementation does not seem to understand:
Error parsing time at C:/Perl64/lib/Time/Piece.pm line 481.
Which format can I use that Time::Piece understands?

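In the format you have dots (.) as the date delimiter, but in the data you have slashes (/). That's why it doesn't parse: the format has to match the string exactly. The format also expects a time part (%H:%M), which a date-only value such as 03/04/2017 does not contain.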

I think it's worth clarifying that strptime() will parse most date and time formats - that's the point of the method. But you need to define the format of the date string that you are parsing. That's what the second parameter to strptime() (in this case, your $format variable) is for.
The letters used in the format are taken from a standard list of definitions which is used by every implementation of strptime() (and its inverse, strftime()). See man strptime on your system for a complete list of the available options.
In your case, the format is %d.%m.%Y %H:%M - which means that it will parse timestamps which have the day, month and year separated by dots, followed by a space and the hours and minutes separated by a colon. If you want to parse timestamps in a different format, then you will need to change the definition of $format.
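The exact-match requirement can be tried out in any language that offers strptime(). The sketch below uses Python's datetime.strptime rather than Time::Piece, purely as an illustration, and assumes $today really is the date-only string 03/04/2017 quoted in the question.

```python
from datetime import datetime, timedelta

today = "03/04/2017"  # value reported in the question (3 April 2017)

# The original format has dots and a time part, so it cannot match.
try:
    datetime.strptime(today, "%d.%m.%Y %H:%M")
    matched = True
except ValueError:
    matched = False

# A format that mirrors the string exactly parses to midnight of that day.
midnight = datetime.strptime(today, "%d/%m/%Y")
five_hours_before = midnight - timedelta(hours=5)

print(matched)
print(five_hours_before)
```

In the Perl script, the corresponding change would be to set $format to "%d/%m/%Y", provided $today never carries a time part.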

How to retrieve date from a Rundeck job

I'm trying to achieve something like this in a rundeck 2.6 job:
touch /foo/bar/${DATE:MM/dd/yyyy}-baz
but it doesn't work properly and the date is not interpreted at all. Is there a proper way to do this?

You can use this bash script:
#!/bin/bash
touch /foo/bar/`date "+%m/%d/%Y"`-baz
The backquotes perform command substitution: the output of the date command is inserted into the touch command. Note that the slashes in the formatted date are path separators, so touch only succeeds if the month and day directories (e.g. /foo/bar/04/03) already exist; use a different separator, such as -, if you want a single file name.
According to the date man page:
An operand with a leading plus (`+') sign signals a user-defined format string which specifies the format in which to display the date and time. The format string may contain any of the conversion specifications described in the strftime(3) manual page, as well as any arbitrary text.
The format string above uses the following conversion specifiers:
%m The month as a decimal number (range 01 to 12). (Calculated from tm_mon.)
%d The day of the month as a decimal number (range 01 to 31). (Calculated from tm_mday.)
%Y The year as a decimal number including the century. (Calculated from tm_year.)
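The effect of these specifiers can be checked with a fixed date, for instance through Python's strftime, which accepts the same conversion characters. A small sketch; the date is arbitrary and chosen only for illustration.

```python
from datetime import date

sample = date(2017, 4, 3)

# %m = month (01-12), %d = day of month (01-31), %Y = year with century
stamp = sample.strftime("%m/%d/%Y")
name = f"{stamp}-baz"

print(name)
```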

You can also define an option that uses that date format specifier.
Set the default value of the option to use the specifier, e.g.:
<option name="date" value="${DATE:MM/dd/yyyy}-baz" />
Inside your step, reference it as ${option.date}.
