I want to change the second column to upper case and I want to do it in shell script only. (no one liners!)
#!/bin/sh
# read file line by line
file="/pdump/country.000000.txt"
while read line
do
mycol=`echo $line | awk -F"," '{print $2}'`
mycol_new=`echo $mycol | tr "[:lower:]" [:upper:]`
echo $line | awk -F"," '{print $1 "," $mycol_new "," $3 "," $4 "," $5 "," $6 "," $7 "," $8}'
done < $file
I am not able to replace the $2 with $mycol_new.
Any suggestion?
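awk cannot see $mycol_new because it is a shell variable: inside the single-quoted awk program the shell does not expand it, so awk treats mycol_new as an unset awk variable and $mycol_new becomes $0, the whole line. Here is one way of passing a shell variable into awk using the -v flag: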
echo $line | awk -v var="$mycol_new" -F"," '{print $1 "," var "," $3 "," $4 "," $5 "," $6 "," $7 "," $8}'
Here is an alternative method which lets the shell expand $mycol_new:
echo $line | awk -F"," '{print $1 ",'"$mycol_new"'," $3 "," $4 "," $5 "," $6 "," $7 "," $8}'
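Putting the -v approach back into the original loop could look like the sketch below. The function name upper_second_col is made up for illustration, the input is assumed to be comma-separated with at least two fields per line, and field 2 is assigned directly instead of listing all eight fields.

```shell
#!/bin/sh
# Hypothetical helper: upper-case the second comma-separated column of a file.
upper_second_col() {
    while IFS= read -r line; do
        # Extract column 2 and upper-case it in the shell.
        mycol=$(printf '%s\n' "$line" | awk -F',' '{print $2}')
        mycol_new=$(printf '%s\n' "$mycol" | tr '[:lower:]' '[:upper:]')
        # Hand the shell value to awk with -v, then overwrite field 2.
        printf '%s\n' "$line" |
            awk -v var="$mycol_new" 'BEGIN { FS = OFS = "," } { $2 = var; print }'
    done < "$1"
}
```

Calling upper_second_col /pdump/country.000000.txt would print the converted lines to standard output. Note that awk interprets backslash escapes in -v values, so this sketch assumes the column contains no backslashes.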
Why no one-liners? Doing homework?
$ cat file
one two three four
five six seven eight
$ awk '{$2=toupper($2)}1' file
one TWO three four
five SIX seven eight
If you want to do this all in the shell, then you don't need awk:
IFS=,
while read line; do
set -- $line
a="$1"
b="${2^^}" # assumes bash, use "tr" otherwise
shift 2
set -- "$a" "$b" "$@"
echo "$*"
done < "$file" > "$file.new"
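For shells without the bash-only ${2^^} expansion, the same loop can fall back to tr, as the comment above suggests. This is a minimal sketch: the function name upper_col2_posix is hypothetical, every line is assumed to have at least two fields, and trailing empty fields are lost by the word splitting.

```shell
#!/bin/sh
# Hypothetical helper: upper-case column 2 using only POSIX sh and tr.
# The body runs in a subshell, so the IFS and set -f changes do not leak out.
upper_col2_posix() (
    while IFS= read -r line; do
        # Split the line on commas into the positional parameters.
        # set -f stops characters such as * from being expanded as globs.
        set -f
        IFS=,
        set -- $line
        a=$1
        b=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
        shift 2
        set -- "$a" "$b" "$@"
        # "$*" joins the parameters with the first character of IFS (a comma).
        printf '%s\n' "$*"
    done < "$1"
)
```

Usage would be upper_col2_posix "$file" > "$file.new", mirroring the redirection in the loop above.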
I am looking for a way to convert all dates in a row of a CSV file into this format. For example, I want to convert 23/1/17 to 23/01/2017.
I use Unix.
Thank you.
My file looks like this:
23/1/17
17/08/18
1/1/2
5/6/03
18/05/2019
And I want this:
23/01/2017
17/08/2018
01/01/2002
05/06/2003
18/05/2019
I used date_samples.csv as my test data:
23/1/17,17/08/18,1/1/02,5/6/03,18/05/2019
cat date_samples.csv | tr "," "\n" | awk 'BEGIN{FS=OFS="/"}{print $2,$1,$3}' | \
while read CMD; do
date -d $CMD +%d/%m/%Y >> temp
done; cat temp | tr "\n" "," > converted_dates.csv ; rm temp; truncate -s-1 converted_dates.csv
Output:
23/01/2017,17/08/2018,01/01/2002,05/06/2003,18/05/2019
This portion of the code converts each "," to a newline and rearranges the input from DD/MM/YY to MM/DD/YY, since the date command does not accept dates in DD/MM/YY form. It then loops through the rearranged dates, converts them to DD/MM/YYYY format, and temporarily stores them in temp.
cat date_samples.csv | tr "," "\n" | awk 'BEGIN{FS=OFS="/"}{print $2,$1,$3}' | \
while read CMD; do
date -d $CMD +%d/%m/%Y >> temp
done;
This line cat temp | tr "\n" "," > converted_dates.csv ; rm temp; truncate -s-1 converted_dates.csv converts the newlines back to ",", writes the output to converted_dates.csv, deletes temp, and uses truncate to remove the trailing comma.
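If spawning date for every value is too slow, the padding can also be done with plain string handling. The sketch below is not the method above: it never calls date, so it does no calendar validation, and it assumes every year with fewer than four digits belongs to the 2000s (date -d applies its own rule for two-digit years). The function name normalize_dates is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite D/M/Y dates in comma-separated rows as DD/MM/YYYY.
# Assumption: years with fewer than four digits belong to the 2000s.
normalize_dates() {
    awk 'BEGIN { FS = OFS = "," }
    {
        for (i = 1; i <= NF; i++) {
            n = split($i, p, "/")
            if (n != 3) continue    # leave anything that is not D/M/Y alone
            year = (length(p[3]) < 4) ? 2000 + p[3] : p[3]
            $i = sprintf("%02d/%02d/%04d", p[1], p[2], year)
        }
        print
    }' "$@"
}
```

It reads files named on the command line or standard input, for example normalize_dates date_samples.csv > converted_dates.csv, and it also works when the file has one date per line.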
Using awk:
awk -F, '{ for (i=1;i<=NF;i++) { split($i,map,"/");if (length(map[3])==1) { map[3]="0"map[3] } "date -d \""map[2]"/"map[1]"/"map[3]"\" \"+%d/%m/%Y\"" | getline dayte;close("date -d \""map[2]"/"map[1]"/"map[3]"\" \"+%d/%m/%Y\"");$i=dayte }OFS="," }1' file
Explanation:
awk -F, '{
for (i=1;i<=NF;i++) {
split($i,map,"/"); # Loop through each comma-separated field and split it into the array map using "/" as the separator
if (length(map[3])==1) {
map[3]="0"map[3] # If the year is just one digit, pad it with a leading 0
}
"date -d \""map[2]"/"map[1]"/"map[3]"\" \"+%d/%m/%Y\"" | getline dayte; # Run the date command on month/day/year and read the result into the variable dayte (%Y gives the four-digit year)
close("date -d \""map[2]"/"map[1]"/"map[3]"\" \"+%d/%m/%Y\""); # Close the date execution pipe
$i=dayte # Replace the field with the dayte variable
}
OFS="," # Set the output field separator
}1' file
Hi, I have a file with logs. Some fields in these logs are separated by a space and others by a tab or two spaces. How can I collapse all whitespace to a single space using PowerShell?
I want something similar to awk in Linux bash, because I have one field that contains spaces in its value, like this: "type:Windows Resources"
awk '{print $1 " " $2 " " $3 " " $5 " " $6 " " $7 " " $8 " " $9 " " $10}'
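The following replaces only runs of two or more whitespace characters with a single space. A lone tab is left unchanged, because \s{2,} needs at least two characters to match.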
((Get-Content logFile.txt) -replace '\s{2,}',' ') >> logFileTemp.txt;
Copy-Item logFileTemp.txt logFile.txt -Force
$test = " Testing removal of spaces"
$SpacesConverted = $test -replace "[ ]{1,1000}"," "
$SpacesConverted
Output:
Testing removal of spaces
This replaces any run of between 1 and 1000 spaces with a single space. If any part of the log has more than 1000 consecutive spaces, just increase that number.
If you want to run this on a file...
$File = "c:\file123.txt"
(gc $File) -replace "[ ]{1,1000}"," " | sc $File
Something like this, replace 1 or more whitespaces:
echo "hi hi`t`thi" > file1.txt
cat file1.txt
hi hi hi
(cat file1.txt) -replace '\s+',' ' | set-content file2.txt
cat file2.txt
hi hi hi
I am trying to swap two columns in a text file. The field separator is the pipe '|' sign.
I found and tried
awk ' { t = $1; $1 = $2; $2 = t; print; } ' input_file
it works when the field separator is the tab.
I also tried
awk -F\| '{print $2 $1 }' input_file
but I do not manage to specify the '|' as output field separator.
You need to define the OFS, Output Field Separator. You can do it with either of these ways:
awk 'BEGIN{FS=OFS="|"} {t=$1; $1=$2; $2=t; print} ' input_file
or
awk -v OFS="|" -v FS="|" ' {t=$1; $1=$2; $2=t; print} ' input_file
or (thanks Jaypal!)
awk '{t=$1; $1=$2; $2=t; print}' OFS="|" FS="|" input_file
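As a quick check, the first form can be wrapped in a small function and fed pipe-separated sample data. The function name swap_first_two and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the BEGIN{FS=OFS="|"} form shown above.
swap_first_two() {
    # Setting FS and OFS to the same value keeps "|" on both input and output.
    awk 'BEGIN { FS = OFS = "|" } { t = $1; $1 = $2; $2 = t; print }' "$@"
}
```

Running printf 'a|b|c\n1|2|3\n' | swap_first_two prints b|a|c and 2|1|3, with the remaining columns untouched.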
I'm trying to turn a big list of data into a CSV. It's basically a giant list with no spaces, and the rows are separated by newlines. I have made a bash script that loops through the document, awks out the line, cuts the byte range, and then adds a comma and appends it to the end of the line. It looks like this:
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 1-12 | tr -d '\n' >> $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 13-17 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 18-22 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
awk -v n=$x 'NR==n { print;exit}' PROP.txt | cut -c 23-34 | tr -d '\n' | xargs -I {} sed -i '' -e 's~$~,{}~' $x.tmp
The problem is this is EXTREMELY slow, and the data has about 400k rows. I know there must be a better way to accomplish this. Essentially I just need to add a comma after every 12/17/22/34 etc character of a line.
Any help is appreciated, thank you!
There are many many ways to do this with Perl. Here is one way:
perl -pe 's/(.{12})(.{5})(.{5})(.{12})/$1,$2,$3,$4,/' < input-file > output-file
The matching pattern in the substitution captures four groups of text from the beginning of each line with 12, 5, 5, and 12 arbitrary characters. The replacement pattern places a comma after each group.
With GNU awk, you could write
gawk 'BEGIN {FIELDWIDTHS="12 5 5 12"; OFS=","} {$1=$1; print}'
The $1=$1 part is to force awk to rewrite the line, incorporating the output field separator, without changing anything.
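FIELDWIDTHS is a gawk extension. Where only a POSIX awk is available, a similar result can be sketched with substr. The function name fixed_to_csv and its argument order are assumptions for illustration, and this version drops any characters past the last width.

```shell
#!/bin/sh
# Hypothetical helper: split fixed-width lines into comma-separated fields.
# Usage: fixed_to_csv "12 5 5 12" [file...]
fixed_to_csv() {
    widths=$1
    shift
    awk -v widths="$widths" '
    BEGIN { n = split(widths, w, " ") }    # w[1..n] holds the column widths
    {
        out = ""
        pos = 1
        for (i = 1; i <= n; i++) {
            sep = (i > 1) ? "," : ""
            out = out sep substr($0, pos, w[i])
            pos += w[i]
        }
        print out
    }' "$@"
}
```

For example, fixed_to_csv "12 5 5 12" PROP.txt > PROP.csv makes a single pass over the file instead of several awk and sed calls per line.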
This is very much a job for substr.
use strict;
use warnings;
my @widths = (12, 5, 5, 12);
while (my $line = <DATA>) {
my $offset = 0; # start from the beginning of each line
for my $width (@widths) {
$offset += $width;
substr $line, $offset, 0, ',';
++$offset;
}
print $line;
}
__DATA__
1234567890123456789012345678901234567890
Output:
123456789012,34567,89012,345678901234,567890
I've a tab delimited log file that has date time in format '2011-07-20 11:34:52' in the first two columns:
An example line from the log file is:
2011-07-20 11:34:15 LHR3 1488 111.111.111.111 GET djq2eo454b45f.cloudfront.net /1010.gif 200 - Mozilla/5.0%20(Windows%20NT%206.1;%20rv:5.0)%20Gecko/20100101%20Firefox/5.0 T=F&Event=SD&MID=67&AID=dc37bcff-70ec-419a-ad43-b92d6092c9a2&VID=8&ACID=36&ENV=demo-2&E=&P=Carousel&C=3&V=3
I'm trying to convert the date time to epoch using just awk:
cat logfile.log | grep 1010.gif | \
awk '{ print $1" "$2" UTC|"$5"|"$10"|"$11"|"$12 }' | \
awk 'BEGIN {FS="|"};{system ("date -d \""$1"\" +%s" ) | getline myvar}'
This gets me some of the way, in that it gives me the epoch (in seconds, so without the three trailing zeros of a millisecond timestamp). However, I'm only getting the output of the system command, whereas I really want to substitute $1 with the epoch time.
I'm aiming for the following output:
<epoch time>|$5|$10|$11|$12
I've tried just using:
cat logfile.log | grep 1010.gif | awk '{ print d };' "d=$(date +%s -d"$1")"
But this just gives me blank rows.
Any thoughts?
Thanks
This assumes gawk. Note that mktime interprets the timestamp as local time, so it cannot do any timezone translation.
... | gawk '
BEGIN {OFS = "|"}
{
split($1, d, "-")
split($2, t, ":")
epoch = mktime(d[1] " " d[2] " " d[3] " " t[1] " " t[2] " " t[3])
print epoch, $5, $10, $11, $12
}
'
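If the log timestamps are UTC, or gawk is not available, the epoch can also be computed arithmetically in any POSIX awk. The sketch below is an alternative to mktime, not the answer's method: it counts days since 1970-01-01 with a standard civil-calendar formula. The function name log_to_epoch is hypothetical, and it assumes the timestamps are UTC and not earlier than 1970.

```shell
#!/bin/sh
# Hypothetical helper: print "<epoch>|$5|$10|$11|$12" for each log line.
# Assumption: columns 1 and 2 hold a UTC date and time, not before 1970.
log_to_epoch() {
    awk 'BEGIN { OFS = "|" }
    {
        split($1, d, "-")
        split($2, t, ":")
        y = d[1] + 0; m = d[2] + 0; day = d[3] + 0
        # Treat March as the first month so that the leap day
        # falls at the end of the shifted year.
        if (m <= 2) { y--; m += 9 } else { m -= 3 }
        era = int(y / 400)                      # 400-year cycle number
        yoe = y - era * 400                     # year within the cycle
        doy = int((153 * m + 2) / 5) + day - 1  # day within the shifted year
        doe = yoe * 365 + int(yoe / 4) - int(yoe / 100) + doy
        days = era * 146097 + doe - 719468      # days since 1970-01-01
        epoch = days * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
        print epoch, $5, $10, $11, $12
    }' "$@"
}
```

It slots into the pipeline from the question as grep 1010.gif logfile.log | log_to_epoch.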