UNIX programming - Solaris

Hi, I want to convert a UNIX date to a normal date (YYYY-MM-DD):
22222,0,0,0,14387
33333,0,0,0,14170
44444,0,0,0,14244
55555,0,0,0,14190
66666,0,0,0,14528
77777,0,0,0,14200
88888,0,0,0,0
99999,0,0,0,0
Here the 5th column represents the UNIX date.
I want to convert it into
22222,0,0,0,2009-05-23
and similarly for the remaining rows.
Can anybody help me?

A simple awk script ought to do it:
awk -F, 'BEGIN{OFS=","} { print $1,$2,$3,$4,strftime("%Y-%m-%d",$5) }' myFile.txt
Cheers.
EDIT:
You're not using unix timestamps, but I checked your data, and it appears you're using days since the epoch, so here goes:
awk -F, 'BEGIN{OFS=","} { print $1,$2,$3,$4,strftime("%Y-%m-%d",$5*86400) }' myFile.txt
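Note that strftime is a GNU awk (gawk) extension, so the stock awk or nawk on Solaris may not have it. A minimal sketch that shells out to a GNU-style date instead (on Solaris this is often installed as gdate; adjust the command name as needed):
awk -F, 'BEGIN{OFS=","} { cmd = "date -u -d @" ($5*86400) " +%Y-%m-%d"; cmd | getline d; close(cmd); print $1,$2,$3,$4,d }' myFile.txt
The close() keeps awk from accumulating open pipes when the file is long.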

Related

How to regex today or previous days date using awk and $date?

Column 13 of my data contains a date in YYMMDD format. I'm trying to regex using $date for today and previous days. Neither of the following pieces of code works. Could someone give me some insights?
TODAY
awk -F, ($13~/$(date '+%Y%m%d')/) {n++} END {print n+0}' file.csv)
3 DAYS AGO
awk -F, ($13~/$(date -d "$date -3 days" '+%Y%m%d')/) {n++} END {print n+0}' file.csv
Your Awk attempts have rather severe quoting problems. You will generally want to single-quote your Awk script, and pass in any parameters as variables with -v.
awk -F, -v when="$(date -d "-3 days" '+%Y%m%d')" '$13~when {n++} END {print n+0}' file.csv
Perhaps notice also that $date is not defined anywhere. The notation $(cmd ...) is a command substitution which runs cmd ... and replaces the expression with its output.
Probably also notice that date -d is a GNU extension and is not portable, though it will work on Linux and other platforms where you have the GNU utilities installed.
More fundamentally, depending on what's in $13, you might want to implement a simple date parsing for that format, so that you can specify a range of acceptable values, rather than search for matches on static text.
This quoting is correct for Bourne-style Unix shells. If you are on Windows, the quoting rules are quite different, and quite likely often impossible to apply in useful ways.
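As for the range idea: %Y%m%d strings compare cleanly as numbers, so instead of regex-matching a single day you can count everything between two cutoffs. A sketch, again assuming GNU date for the relative -d offset and that column 13 really holds %Y%m%d values:
from="$(date -d '-3 days' '+%Y%m%d')"   # lower bound, three days ago
to="$(date '+%Y%m%d')"                  # upper bound, today
awk -F, -v from="$from" -v to="$to" '($13+0) >= (from+0) && ($13+0) <= (to+0) {n++} END {print n+0}' file.csv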
If you are using GNU AWK then you might use its Time Functions. To check that they work, run
awk 'END{print strftime("%y%m%d")}' emptyfile.txt
which should output the current day in YYMMDD format. If it does, then you might get what you want the following way:
awk 'BEGIN{today=strftime("%y%m%d");threedago=strftime("%y%m%d",systime()-259200)}END{print today, threedago}' emptyfile.txt
output (as of today)
210809 210806
Explanation: strftime's first argument is the time format: %y is the year 00...99, %m is the month 01...12, %d is the day 01...31. The second argument is optional and is seconds since the start of the epoch; if it is skipped, the current time is used. systime() returns the number of seconds since the start of the epoch, and 259200 is 72 hours expressed as seconds.
Example of usage as a regexp: let's say that I have file.txt as follows
210807 120
210808 150
210809 100
and want to retrieve the content of the 2nd column for today, then I can do
awk 'BEGIN{today=strftime("%y%m%d")}$1~today{print $2}' file.txt
getting output (as of today)
100
(tested in gawk 4.2.1)

Convert timestamp (unix 13 digits) to datetime format for a complete column of a csv file using awk or sed

I have a csv file with multiple columns. The first column has timestamps like:
1529500027127
1529500027227
1529500027327
1529500027428
1529500027528
1529500027628
1529500027728
I know you can do something like this for a specific timestamp:
date -d @1529500027528
But how can I select all values of the columns and do that? I tried the next command:
date -d "$(awk -F , -v OFS=, '$1/=1000')" file.csv
I am trying to understand how the date command works together with other commands.
Since a sample of the expected output is not given, I could only test with the given 1st-column values; this is written and tested in GNU awk. You could use awk's strftime function, and since the OP has mentioned that Input_file is a csv file, FS and OFS are set to , here.
awk 'BEGIN{FS=OFS=","} {$1=strftime("%Y/%m/%d %H:%M:%S",$1/1000)}1' Input_file
From man awk for strftime:
strftime([format [, timestamp[, utc-flag]]]) Format timestamp
according to the specification in format. If utc-flag is present and
is non-zero or non-null, the result is in UTC, otherwise the result is
in local time. The timestamp should be of the same form as returned
by systime(). If timestamp is missing, the current time of day is
used. If format is missing, a default format equivalent to
the output of date(1) is used. The default format is available in
PROCINFO["strftime"]. See the specification for the strftime()
function in ISO C for the format conversions that are guaranteed to be
available.
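One side effect of dividing by 1000 is that the millisecond part of the 13-digit value is dropped. If you need to keep it, a sketch (GNU awk again, with a hypothetical output format that appends the milliseconds):
awk 'BEGIN{FS=OFS=","} {ms = $1 % 1000; $1 = strftime("%Y/%m/%d %H:%M:%S", int($1/1000)) sprintf(".%03d", ms)}1' Input_file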
If you want to use an external date -d@... command, you could do this:
awk -F, -v 'OFS=,' '{"date -d@"$1 | getline timestamp ; $1=timestamp; print}' filename
Obviously finding a builtin function to do the same job (in this case, the strftime function as suggested by another answer) is a more efficient solution in terms of execution time, but the above gives an example of how to call out to external programs that you may already be familiar with.
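One caveat with the getline-from-a-pipe approach: every input line starts a new date process and opens a new pipe, so on large files it is worth closing each pipe. A sketch that also divides by 1000, as in the other answer, so the 13-digit value is treated as milliseconds:
awk -F, -v OFS=, '{cmd = "date -d@" int($1/1000); cmd | getline timestamp; close(cmd); $1 = timestamp; print}' filename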

unix yyyymmddhhmmss format conversion to specific date format

There is a bash script running that outputs folder names appended with a timestamp, e.g. logs_debug_20190213043348. I need to be able to extract the date into a readable format yyyy.mm.dd.hh.mm.ss and also maybe convert it to the GMT timezone. I'm using the method below to extract it.
echo "${folder##*_}" | awk '{ print substr($0,1,4)"."substr($0,5,2)"."substr($0,7,2)"."substr($0,9,6)}'
Is there a better way to print the output without writing complex shell scripts?
The internal string conversion functions are too limited, so we use sed and tr when needed.
## The "readable" format yyyy.mm.dd.hh.mm.ss isn’t understood by date.
## yyyy-mm-dd hh:mm:ss is. So we first produce the latter.
# Note how to extract the last 14 characters of ${folder} and that, since
# we know (or should have checked somewhere else) that they are all digits,
# we match them with a simple dot instead of the more precise but less
# readable [0-9] or [[:digit:]]
# -E selects regexp dialect where grouping is done with simple () with no
# backslashes.
d="$(sed -Ee's/(....)(..)(..)(..)(..)(..)/\1-\2-\3 \4:\5:\6/'<<<"${folder:(-14)}")"
# Print the UTC date (for Linux and other systems with GNU date)
date -u -d "$d"
# Convert to your preferred "readable" format
# echo "${d//[: -]/.}" would have the same effect, avoiding tr
tr ': -' '.'<<<"$d"
For systems with BSD date (notably MacOS), use
date -juf'%Y-%m-%d %H:%M:%S' "$d"
instead of the date command given above. Of course, in this case the simplest way would be:
# Convert to readable
d="$(sed -Ee's/(....)(..)(..)(..)(..)(..)/\1.\2.\3.\4.\5.\6/'<<<"${folder:(-14)}")"
# Convert to UTC
date -juf'%Y.%m.%d.%H.%M.%S' "$d"
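If all you need is the readable yyyy.mm.dd.hh.mm.ss form (without the timezone conversion), plain parameter expansion avoids sed entirely. A minimal sketch, assuming the folder name ends in the 14-digit stamp:
ts="${folder##*_}"                                                    # 20190213043348
echo "${ts:0:4}.${ts:4:2}.${ts:6:2}.${ts:8:2}.${ts:10:2}.${ts:12:2}"  # 2019.02.13.04.33.48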
Here's a pipeline that does what you want. It certainly isn't simple looking, but taking it component by component it can be understood:
echo "20190213043348" | \
sed -e 's/\([[:digit:]]\{4\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)/\1-\2-\3 \4:\5:\6/' | \
xargs -d '\n' date -d | \
xargs -d '\n' date -u -d
The first line is simply printing the date string so sed can format it (so that it can easily be modified to fit the way you are passing in the string).
The second line with sed is converting the string from the format you give, to something like this, which can be parsed by date: 2019-02-13 04:33:48
Then, we pass the date to date using xargs, and it formats it with the timezone of the device running the script (CST in my case): Wed Feb 13 04:33:48 CST 2019
The final line converts the date string given by the first invocation of date to UTC time rather than being stuck in the local time: Wed Feb 13 10:33:48 UTC 2019
If you want it in a different format, you can modify the final invocation of date using the +FORMAT argument.
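For example, to get the dotted yyyy.mm.dd.hh.mm.ss form in UTC directly, the final stage could pass a format string through xargs -I (a GNU xargs feature); a sketch:
echo "20190213043348" | \
sed -e 's/\([[:digit:]]\{4\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)\([[:digit:]]\{2\}\)/\1-\2-\3 \4:\5:\6/' | \
xargs -I{} date -u -d {} '+%Y.%m.%d.%H.%M.%S'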

sed command to change date format

Here is a snippet of the file I'm working with:
709ENVUN07,SET1,FE10,GB0009252882,GB,GBX,NULL,S,O,LO,1510.00000000,173,N,F,28022007,07:51:15,3717
208ATNHG07,SET1,FE10,GB0009252882,GB,GBX,NULL,S,O,LO,1550.00000000,1800,N,F,18012007,15:48:21,654681
As you can see the date is in this format: 28022007, 18012007
Using sed I've successfully changed it to the format I want.
gzip -dc allGlaxoOrderHistory.CSV.gz |sed 's/\([0-9]\{2\}\)\([0-9]\{2\}\)\(2[0-9]\{3\}\)/\1-\2-\3/g' > newOrderHistory.csv
However, sed is also changing GB0009252882 to GB00-09-252882, as you can see below:
709ENVUN07,SET1,FE10,GB00-09-252882,GB,GBX,NULL,S,O,LO,1510.00000000,173,N,F,28-02-2007,07:51:15,3717
208ATNHG07,SET1,FE10,GB00-09-252882,GB,GBX,NULL,S,O,LO,1550.00000000,1800,N,F,18-01-2007,15:48:21,654681
The question is: how do I change 28022007, 18012007 to 28-02-2007, 18-01-2007 without GB0009252882 changing too?
Your date field is the 15th from the start. You can write your pattern like this:
sed 's/\(\([^,]*,\)\{14\}..\)\(..\)/\1-\3-/'
Where [^,]*, describes a field (with its separator).
You can also work by fields more easily with awk. You only need to set the input and output delimiter to ,
With awk (GNU), target the 15th field:
awk -F, -vOFS=, '{$15=gensub(/(..)(..)(....)/, "\\1-\\2-\\3", "g", $15)}1' yourfile
The parameter -F, sets the input delimiter and -vOFS=, sets the output delimiter. The 1 at the end is used as a shortcut for print.
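If gensub is not available (it is a GNU awk extension), the same 15th-field edit works with substr in any POSIX awk; a sketch:
awk 'BEGIN{FS=OFS=","} {$15 = substr($15,1,2) "-" substr($15,3,2) "-" substr($15,5,4)}1' yourfile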

Bash/AWK/SED Match and ReWrite a string of numbers (date) in a line

I have a text file with the following contents repeating about 60 times, coming from a converted .ics file:
Start Vak
Tijd van: 20120411T093000Z
Tijd tot: 20120411T100000Z
Klas(sen) en Docent(en): VPOS0A1 VPOS0A2 Mariel Kers
Vak: Ex. Verst. beperk.
Lokaal: 7.05
Einde Vak
I want to rewrite the "Tijd van" and "Tijd tot" values to become a good date (in a bash script on a GNU/Linux system with awk, sed, grep etc.). I tried to use awk to find it:
awk '/^Tijd.*[:digit:][:digit:]Z$/; { getline; print $0; }' rooster2.txt
and grep:
egrep '/^Tijd(.*)[:digit:][:digit:]Z$/' rooster2.txt
But neither of them even finds the line.
What I want is to get that date rewritten to a time format that is easier to parse in bash, like the epoch, or something like 31.04.2012 13:00:00. I do not want to replace or rewrite the whole line, just the specific string! Anything, either tips, examples or links, is welcome and very useful.
Try this (GNU sed):
sed -r 's/(Tijd ...: )(....)(..)(..).(..)(..)(..)./\1 \4.\3.\2 \5:\6:\7/' FILE
There are several issues with your awk code:
While [:digit:] refers to "any digit", you still need another pair of square brackets ([...]) for the character group: [[:digit:]] (Just imagine you wanted "a, any digit, or _"; this would be [a[:digit:]_], the outer square brackets defining the character group.)
The semicolon (;) between your pattern (/.../) and the corresponding action ({...}) separates the two, so you have a pattern with no action, resulting in the standard action {print $0}, and a second action without a pattern, resulting in it being performed for all records (i.e. lines).
The getline asks awk to read the next record (i.e. line) before continuing.
Taking all that together your code does the following:
Print all lines matching /^Tijd.*[:digit:][:digit:]Z$/ (that is none, since [:digit:] translates to "one of :,d,i,g, or t").
Additionally, for all lines: read the next line and print it.
Thus, it will print all but the first line (because that is the only one that is not the next one to any other line).
Assuming you just want to print the lines matching "starting with 'Tijd' and ending with two digits followed by a 'Z'" you could use the following code:
awk '/^Tijd.*[[:digit:]][[:digit:]]Z$/{ print $0; }' rooster2.txt
Since {print $0} is the standard action you could even shorten that to
awk '/^Tijd.*[[:digit:]][[:digit:]]Z$/' rooster2.txt
To solve your actual problem you could use something like the following:
awk '/^Tijd.*[[:digit:]][[:digit:]]Z$/{year=substr($NF,1,4);month=substr($NF,5,2);day=substr($NF,7,2);hour=substr($NF,10,2);min=substr($NF,12,2);sec=substr($NF,14,2);$NF=day"."month"."year" "hour":"min":"sec}1' rooster2.txt
This works as follows:
For records (i.e lines) matching the pattern (/.../), rearrange the last field ($NF) to your needs.
Print all records (i.e. lines) (1 is a pattern matching all records (i.e. lines) with no specified action, resulting in the standard one ({print $0}))
Note that GNU awk also has a strftime function.
However, that needs the timestamp to be in a different format.
If you want to use that you must still rearrange the field, first:
awk -v FORMAT="%c" '/^Tijd.*[[:digit:]][[:digit:]]Z$/{$NF=strftime(FORMAT,mktime(substr($NF,1,4)" "substr($NF,5,2)" "substr($NF,7,2)" "substr($NF,10,2)" "substr($NF,12,2)" "substr($NF,14,2)))}1' rooster2.txt
Now, you just need to adjust FORMAT to your needs to change the format.
See man strftime for details.
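Since you mentioned the epoch: dropping the strftime call and keeping just mktime gives you seconds since the epoch directly. A sketch, noting that mktime interprets the broken-down time as local time while the trailing Z marks UTC (newer gawk versions accept an optional utc-flag second argument to mktime if that matters):
awk '/^Tijd.*[[:digit:]][[:digit:]]Z$/{$NF=mktime(substr($NF,1,4)" "substr($NF,5,2)" "substr($NF,7,2)" "substr($NF,10,2)" "substr($NF,12,2)" "substr($NF,14,2))}1' rooster2.txt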
As a Ruby one-liner: require time for Time.parse, then replace the matching regexp. You may look at the strftime method for formatting the time output.
[slmn#uriel ~]$ ruby -rtime -ne 'puts $_.sub(/(Tijd (van|tot): )(.*)/) { $1 + Time.parse($3).strftime("%D %T") }' < yourfile.txt
Start Vak
Tijd van: 04/11/12 09:30:00
Tijd tot: 04/11/12 10:00:00
Klas(sen) en Docent(en): VPOS0A1 VPOS0A2 Mariel Kers
Vak: Ex. Verst. beperk.
Lokaal: 7.05
Einde Vak