Extracting data from a file - sed

I have a file, results.txt, that looks like this:
a.txt
{some data}
success!!
b.txt
{some data}
success!!
c.txt
{some data}
error!!
I want to extract data from it. I want an output like:
a.txt: success
b.txt: success
c.txt: error
The problem is that the {some data} part can be arbitrarily long.
How can this be done?

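awk:
BEGIN {
state=0
}
# A line ending in .txt starts a new record; remember the file name.
state==0 && /\.txt$/ {
filename=$0
state=1
next
}
# The first line ending in !! closes the record; strip the !! and print.
state==1 && /!!$/ {
sub(/!!$/, "")
print filename ": " $0
state=0
next
}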
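Since the question asks about sed, here is a minimal sketch of the same idea using sed's hold space. It assumes, like the awk script above, that file names end in .txt and that the status line ends in !!; the function name summarize_results is made up for this illustration.

```shell
# summarize_results is a hypothetical helper name, not from the thread.
# h   : on a line ending in .txt, copy the file name into the hold space
# H;x : on a line ending in !!, append the status to the held name and swap it in
summarize_results() {
    sed -n '/\.txt$/h; /!!$/{s/!!$//;H;x;s/\n/: /;p;}' "$@"
}

# Sample input with a multi-line {some data} part.
printf '%s\n' a.txt '{some' blah 'data}' 'success!!' c.txt '{some data}' 'error!!' |
    summarize_results
```

With the sample input this prints "a.txt: success" and "c.txt: error". A data line that itself ends in .txt or !! would confuse it, so treat it as a starting point rather than a robust parser.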

$ cat file
a.txt
{some
blah
data}
success!!
b.txt
{some data}
success!!
c.txt
{some data}
error!!
$ awk 'BEGIN{ FS="[{}]|\n";RS=""}{gsub(/!!/,"",$NF);print $1":"$NF}' file
a.txt:success
b.txt:success
c.txt:error
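Update (like the command above, this relies on blank lines separating the records in file, because an empty RS selects awk's paragraph mode):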
$ awk -vRS= -vFS="\n" '{print $1":"$NF}' file
a.txt:success!!
b.txt:success!!
c.txt:error!!

You can also do it this way:
sed -e 's/^{some data}$//g;/^$/d;' results.txt | sed '$!N;s/\n/: /'

This works for me:
cat results.txt | xargs |sed 's/\ {[^}]*}/:/g' | sed 's/!! /\n/g'
a.txt: success
b.txt: success
c.txt: error!!

cat results.txt | grep -E "(([a-z]\.txt)|((success)|(error)!!))" | tr -d '\n' | sed 's/!!/!!\n/g'
should do it. If your sed does not understand \n in the replacement, you may have to use a literal newline instead.

awk '{print $1": "$4}' RS="\n\n" results.txt

Related

base64 decode to csv file, sed script

I have the following script, which extracts the text inside the "reportBody" element, but I also need to base64-decode that text and write the result to a new file. How can I do this?
Here's the script:
cat $1 | tr "\n" "|" | grep -o '<reportBody>.*</reportBody>' | sed 's/\(<reportBody>\|<\/reportBody>\)//g' | sed 's/|/\n/g' | sed '/^\s*$/d' > $2
I tried:
cat $1 | tr "\n" "|" | grep -o '<reportBody>.*</reportBody>' | sed 's/\(<reportBody>\|<\/reportBody>\)//g' | sed 's/|/\n/g' | sed '/^\s*$/d' | base64 -d $2 > $2
but it doesn't decode the text.
Can I overwrite the same file, or at least save the decoded text in a new one, without calling additional modules from Python etc.?
Note: the file contains 20k+ characters to decode.
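One likely cause is that base64 -d $2 > $2 makes base64 read the file $2, which the shell has already truncated for the redirection, instead of reading the pipe. A minimal sketch of a fix, assuming GNU coreutils base64, a single reportBody element, and an output file that differs from the input; the function name decode_report_body is made up for this illustration:

```shell
# Hypothetical helper: base64-decode the text inside <reportBody>...</reportBody>.
# $1 = input file, $2 = output file (must not be the same file as $1).
decode_report_body() {
    tr '\n' '|' < "$1" \
        | grep -o '<reportBody>.*</reportBody>' \
        | sed 's/<reportBody>//; s/<\/reportBody>//' \
        | tr -d '| \t\r' \
        | base64 -d > "$2"   # no file operand: base64 decodes what arrives on the pipe
}
```

It would be called as decode_report_body input.xml decoded.csv, where both file names are placeholders. Deleting the | placeholders is safe because | is not part of the base64 alphabet.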

grep + grep + sed = sed: no input files

Can anybody help me please?
grep " 287 " file.txt | grep "HI" | sed -i 's/HIS/HID/g'
sed: no input files
I also tried xargs:
grep " 287 " file.txt | grep HI | xargs sed -i 's/HIS/HID/g'
sed: invalid option -- '6'
This works fine:
grep " 287 " file.txt | grep HI

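sed -i edits the files named on its command line; when sed reads from a pipe there is no file to edit, hence "no input files". If you want to keep your pipeline: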
f=file.txt
tmp=$(mktemp)
grep " 287 " "$f" | grep "HI" | sed 's/HIS/HID/g' > "$tmp" && mv "$tmp" "$f"
Or, simplify:
sed -i -n '/ 287 / {/HI/ s/HIS/HID/p}' file.txt
That will filter out any line that does not contain " 287 " and "HI" -- is that what you want? I suspect you really want this:
sed -i '/ 287 / {/HI/ s/HIS/HID/}' file.txt
For lines that match / 287 /, execute the commands in braces. In there, for lines that match /HI/, search for the first "HIS" and replace with "HID". sed implicitly prints all lines if -n is not specified.
Other commands that do the same thing:
awk '/ 287 / && /HI/ {sub(/HIS/, "HID")} {print}' file.txt > new.txt
perl -i -pe '/ 287 / and /HI/ and s/HIS/HID/' file.txt
awk does not have an "in-place" option (except gawk -i inplace in recent gawk versions).

Removing matching text from line

I have an example cut down from a log file:
112 172.172.172.1#50912 (ssl.bing.com):
I would like to remove the # and the digits after it, as well as the (): around the URL.
The desired result is:
112 172.172.172.1 ssl.bing.com
Here is the sed one-liner I have been working on:
cat newdns.log | sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > old.log
Thanks

Using sed, you could say:
sed 's/#[0-9]*//;s/(\(.*\)):$/\1/' filename
or, in a single substitution:
sed 's/#[0-9]* *(\(.*\)):$/ \1/' filename

Another sed:
sed -r 's/#[^ ]+|[():]//g'
$ echo '112 172.172.172.1#50912 (ssl.bing.com):' | sed -r 's/#[^ ]+|[():]//g'
112 172.172.172.1 ssl.bing.com

Pattern extraction using SED or AWK

How do I extract 68 from v1+r0.68?

Using awk, returns everything after the last '.'
echo "v1+r0.68" | awk -F. '{print $NF}'

Using sed to get the number after the last dot:
echo 'v1+r0.68' | sed 's/.*[.]\([0-9][0-9]*\)$/\1/'

grep is good at extracting things:
kent$ echo " v1+r0.68"|grep -oE "[0-9]+$"
68

Match the digit string before the end of the line using grep:
$ echo 'v1+r0.68' | grep -Eo '[0-9]+$'
68
Or match any digits after a .
$ echo 'v1+r0.68' | grep -Po '(?<=\.)\d+'
68
Print everything after the . with awk:
echo "v1+r0.68" | awk -F. '{print $NF}'
68
Substitute everything before the . with sed:
echo "v1+r0.68" | sed 's/.*\.//'
68
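If the value is already in a shell variable, no external tool is needed at all. A minimal sketch using POSIX parameter expansion, where ${var##*.} removes the longest prefix ending in a dot; the function name after_last_dot is made up for this illustration:

```shell
# Hypothetical helper: print whatever follows the last '.' in the first argument.
after_last_dot() {
    printf '%s\n' "${1##*.}"
}

after_last_dot 'v1+r0.68'
```

Like the awk and sed variants above, this returns everything after the last dot and does not check that the result consists of digits.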

Type man grep
and you will see:
...
-o, --only-matching
Show only the part of a matching line that matches PATTERN.
Then run echo 'v1+r0.68' | grep -o '68'
If you want the result saved to a particular file, do:
echo 'v1+r0.68' | grep -o '68' > anyWhereSpecial.file_ending

Using grep with sed and writing a new file based on the results

I'm very new to some of the command line utilities and have been looking for a while for a command that would accomplish my goal.
The goal is to find files that contain a string of text, replace it with a new string, and then write the results to a file that is named the same as the original, but in a different directory.
Obviously this is not working, so I am asking how you who know about this stuff would go about it.
grep -rl 'stringToFind' *.* | sed 's|oldString|newString|g' < fileNameFromGrep > ./new/fileNameFromGrep
Thanks for your input!
John

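for f in `find /YOUR/SEARCH/DIR/ROOT -type f -exec fgrep -l 'stringToFind' \{\} \;` ; do
# ${f##*/} strips the directory part, so the copy lands directly in ./new
sed 's|oldString|newString|g' < "${f}" > ./new/"${f##*/}"
done
That will do it for you, provided ./new exists and no file name contains whitespace.
If you have spaces in filenames (bash):
find /PATH -type f -print0 | while IFS= read -r -d '' file
do
# -q: signal a match through the exit status instead of printing the name
grep -qF 'stringToFind' "$file" && \
sed 's|oldString|newString|g' < "${file}" > ./new/"${file##*/}"
done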

#!/bin/bash
for file in *; do
if grep -qF 'stringToFind' "$file"; then
sed 's/oldString/newString/g' "$file" > "./new/$file"
fi
done
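The loop above assumes that ./new already exists and that every entry in the current directory is a regular file. A slightly more defensive sketch of the same approach follows; the function name copy_with_replacement and its two arguments are made up for this illustration, and the search and replacement strings are the placeholders from the question.

```shell
# Hypothetical helper: $1 = source directory, $2 = destination directory.
copy_with_replacement() {
    src=$1
    dest=$2
    mkdir -p "$dest"                     # create the target directory if needed
    for file in "$src"/*; do
        [ -f "$file" ] || continue       # skip subdirectories and other non-files
        if grep -qF 'stringToFind' "$file"; then
            # ${file##*/} is the file name without its directory part
            sed 's/oldString/newString/g' "$file" > "$dest/${file##*/}"
        fi
    done
}
```

Calling copy_with_replacement . ./new reproduces the behaviour of the script above. It does not descend into subdirectories.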

for file in path/to/dir/*
do
# grep -q reports a match through its exit status
if grep -q 'pattern' "$file"; then
# ${file##*/} drops the path/to/dir/ prefix from the output name
sed 's/oldString/newString/g' "$file" > /path/to/newdir/"${file##*/}"
fi
done

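You can try:
sed -i -e "s/oldString/newString/g" \
$(grep -Rlsi 'pattern' path/to/dir/)
Note that this edits the matching files in place rather than writing copies to another directory.
sed:
i edit files in place
e add the script that follows to the commands to run
grep:
R recursive
l print only the names of files that contain a match
s suppress error messages
i ignore case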
