Print the rows that start with a string in awk


For example:
Event .123232434
1232323 Event
1233 Event 234
Event 2323
I just want to print the rows that start with "Event" using awk.
Is there some way?

It's just:
/^Event/
A rule that isn't associated with any code prints every line that matches.
As paxdiablo pointed out, you could also use grep in this case:
grep '^Event'
If you want to read the input from a file, this becomes
awk '/^Event/' /path/to/file
or
grep '^Event' /path/to/file
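As a quick check, the pattern-only rule can be run against the sample rows from the question. This is only a minimal sketch; the variable names sample and matched are made up for illustration.

```shell
# Sample rows from the question, fed to awk on standard input.
sample='Event .123232434
1232323 Event
1233 Event 234
Event 2323'

# A pattern with no action: awk prints every line that matches ^Event.
matched=$(printf '%s\n' "$sample" | awk '/^Event/')
printf '%s\n' "$matched"
```

Only the two rows that begin with Event are printed; rows that merely contain Event elsewhere are skipped because of the ^ anchor.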

You don't need awk for this, you can just use:
grep '^[^0-9]' inputfilename
That will give you all lines that start with a character that isn't a digit.
If you must use awk, it has an equivalent variant:
awk '$0 ~ /^[^0-9]/' inputfilename
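Note that this pattern is broader than ^Event: it keeps any line whose first character is not a digit. The sketch below illustrates the difference, assuming a hypothetical extra row "Other 42" that is not in the question's data.

```shell
# Two rows from the question plus one hypothetical row ("Other 42").
rows='Event 2323
1232323 Event
Other 42'

# Keeps every line that does not start with a digit.
by_digit=$(printf '%s\n' "$rows" | grep '^[^0-9]')
# Keeps only lines that start with the word Event.
by_word=$(printf '%s\n' "$rows" | grep '^Event')

printf 'not a digit:\n%s\n' "$by_digit"
printf 'starts with Event:\n%s\n' "$by_word"
```

On the question's sample the two patterns give the same result; they diverge as soon as the input has other non-numeric line starts.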

Related

sed: How do I delete the first 100 lines of a text file

I would like to delete the first 100 lines of a text file using sed. I know how to delete the first line by using:
sed '1d' filename
or the 100th line by typing
sed '100d' filename
How do I specify a range? I thought something like this would work:
sed '1:100d' filename
However, this obviously didn't work. Can someone show me how to specify a range? Thanks in advance for your help.

This works in GNU sed, and the comma-separated address range is standard in POSIX sed as well:
sed '1,100d' file
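A small way to try the range without a real file, assuming 105 generated lines stand in for the input (the generator and the variable name remaining are illustrative only):

```shell
# Generate 105 numbered lines as a stand-in for the real file,
# then delete lines 1 through 100 with an address range.
remaining=$(awk 'BEGIN { for (i = 1; i <= 105; i++) print i }' | sed '1,100d')
printf '%s\n' "$remaining"
```

Only lines 101 to 105 are left, which confirms that the range is inclusive at both ends.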

awk can also print lines based on conditions on the line number.
For example, the following prints the lines (records, in awk terms) whose number is greater than 100:
awk 'NR>100' inputfile
Other conditions work the same way:
awk 'NR==100' inputfile # prints the 100th line
awk 'NR<100' inputfile # prints lines 1 through 99
awk 'NR>100' inputfile # prints from line 101 onwards
awk 'NR>=100' inputfile # prints from line 100 onwards
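The same NR conditions can be tried on a tiny input. This sketch assumes a five-line input and a threshold of 3 instead of 100, purely to keep the output short:

```shell
# Five one-letter lines stand in for a longer file.
lines=$(printf '%s\n' a b c d e)

third=$(printf '%s\n' "$lines" | awk 'NR==3')   # only line 3
before=$(printf '%s\n' "$lines" | awk 'NR<3')   # lines 1 and 2
after=$(printf '%s\n' "$lines" | awk 'NR>3')    # lines 4 and 5

printf 'NR==3: %s\n' "$third"
printf 'NR<3:\n%s\n' "$before"
printf 'NR>3:\n%s\n' "$after"
```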

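You can also do it with -n and p. Note that '1,100p' would print only the first 100 lines, so to drop them, print from line 101 to the end instead:
sed -n '101,$p' Input_file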

Should I use cut or awk to extract fields and field substrings?

I have a file with pipe-separated fields. I want to print a subset of field 1 and all of field 2:
cat tmpfile.txt
# 10 chars.|variable length num|text
ABCDEFGHIJ|99|U|HOMEWORK
JIDVESDFXW|8|C|CHORES
DDFEXFEWEW|73|B|AFTER-HOURS
I'd like the output to look like this:
# 6 chars.|variable length num
ABCDEF|99
JIDVES|8
DDFEXF|73
I know how to get fields 1 & 2:
cat tmpfile.txt | awk 'BEGIN {FS="|"} {print $1"|"$2}'
And know how to get the first 6 characters of field 1:
cat tmpfile.txt | cut -c 1-6
I know this is fairly simple, but I can't figure out how to combine the awk and cut commands.
Any suggestions would be greatly appreciated.

You could use awk. Use the substr() function to trim the first field:
awk -F'|' '{print substr($1,1,6),$2}' OFS='|' inputfile
For your input, it'd produce:
ABCDEF|99
JIDVES|8
DDFEXF|73
Using sed, you could say:
sed -r 's/^(.{6})[^|]*([|][^|]*).*/\1\2/' inputfile
to produce the same output.
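For reference, here is the awk variant run against the three data rows from the question. It is a sketch only; the variable names data and trimmed are illustrative, and OFS is passed with -v rather than as a trailing assignment.

```shell
# Data rows from the question (header comment line left out).
data='ABCDEFGHIJ|99|U|HOMEWORK
JIDVESDFXW|8|C|CHORES
DDFEXFEWEW|73|B|AFTER-HOURS'

# Split on |, keep the first 6 characters of field 1 and all of field 2,
# and join them with | again via OFS.
trimmed=$(printf '%s\n' "$data" | awk -F'|' -v OFS='|' '{ print substr($1, 1, 6), $2 }')
printf '%s\n' "$trimmed"
```

substr() counts from 1, so substr($1, 1, 6) is the first six characters regardless of how long the field is.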

You could use cut and paste, but then you have to read the file twice, which is a big deal if the file is very large:
paste -d '|' <(cut -c 1-6 tmpfile.txt ) <(cut -d '|' -f2 tmpfile.txt )

Just for another variation (this relies on field 1 being exactly 10 characters wide, as in the question):
awk -F\| -v OFS=\| '{print $1,$2}' t.in | cut -c 1-6,11-
Also, as tripleee points out, two cuts can do this too:
cut -c 1-6,11- t.in | cut -d\| -f 1,2
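A minimal sketch of the two-cut pipeline on one row from the question. It assumes, as the question states, that the first field is always 10 characters wide; with a different width the character positions would be wrong.

```shell
# First cut: keep characters 1-6 and everything from character 11 on,
# which drops characters 7-10 of the fixed-width first field.
# Second cut: keep only the first two |-separated fields.
short=$(printf '%s\n' 'ABCDEFGHIJ|99|U|HOMEWORK' | cut -c 1-6,11- | cut -d'|' -f 1,2)
printf '%s\n' "$short"
```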

I like a combination of cut and sed, but that's just a preference:
cut -f1-2 -d"|" tmpfile.txt|sed 's/\([A-Z]\{6\}\)[A-Z]\{4\}/\1/g'
Result:
# 10 chars.|variable length num
ABCDEF|99
JIDVES|8
DDFEXF|73
Edit: (Removed the useless cat) Thanks!

grep or awk - how to return line if column 1 and 3 have the same value

I have a tab delimited file and I want the output to have the entire line in my file if values in column 1 are the same as the values in column 3. Having very limited knowledge in perl and linux, this is as close as I came to a solution.
File example
Apple Sugar Apple
Apple Butter Orange
Raisins Flour Orange
Orange Butter Orange
The results would be:
Apple Sugar Apple
Orange Butter Orange
Code:
#!/bin/sh
awk '{
    prev=$0; f1=$1; f3=$3;
    getline
    if ($1 == $3) {
        print prev
        print
    }
}' myfilename
I am sure that there is an easier solution to it. Maybe even a grep or awk on the command line. But that was the only code I could find that seemed to give me my solution.
Thanks!

It's easy with awk:
awk '$1 == $3' myfile
The default action is to print out the record, so if fields 1 and 3 are equal, that's what will happen.
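Run against the sample rows from the question, the one-liner behaves as described. This is a sketch with the data inlined; the variable names rows and same are illustrative.

```shell
# Sample rows from the question (whitespace-separated here).
rows='Apple Sugar Apple
Apple Butter Orange
Raisins Flour Orange
Orange Butter Orange'

# Pattern only, no action: print records whose fields 1 and 3 are equal.
same=$(printf '%s\n' "$rows" | awk '$1 == $3')
printf '%s\n' "$same"
```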

Using awk
awk is the tool for the job:
awk '$1 == $3'
If your fields in the data are strictly tab separated and may contain blanks, then you will need to specify the field separator explicitly:
awk -F'\t' '$1 == $3'
(where the \t represents a tab; you may have to type Tab (or even Control-V Tab) to get it into the string).
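The difference the explicit separator makes can be seen with fields that contain blanks. The two rows below are hypothetical data, not from the question:

```shell
# Two tab-separated rows whose fields contain spaces (hypothetical data).
rows=$(printf 'Red Apple\tSugar\tRed Apple\nRed Apple\tButter\tRed Pear\n')

# With -F'\t' the fields are "Red Apple", "Sugar", "Red Apple".
by_tab=$(printf '%s\n' "$rows" | awk -F'\t' '$1 == $3')
# With default splitting the fields are "Red", "Apple", "Sugar", ...
by_blank=$(printf '%s\n' "$rows" | awk '$1 == $3')

printf 'tab separator:\n%s\n' "$by_tab"
printf 'default separator:\n%s\n' "$by_blank"
```

With the default separator the first row is missed, because $1 is "Red" and $3 is "Sugar".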
Using grep
You can do it with grep, but you don't want to do it with grep:
grep -E '([A-Za-z]+)\t[A-Za-z]+\t\1'
The key part of the regex is the \1, which means 'the same value as the first captured string'.
You might even go through gyrations like this in bash:
grep -E $'([A-Za-z]+)\t[A-Za-z]+\t\\1'
You could simplify life by noting (assuming) there are no spaces within fields:
grep -E '([A-Za-z]+)[[:space:]]+[A-Za-z]+[[:space:]]+\1'
As noted in one of the comments, I didn't put a $ at the end of the search pattern; it would be feasible (though the data would have to be cleaned up to contain tabs and drop trailing blanks), so that 'Good Noise GoodBad' would not be picked up. There are other ways to do it, and you can make the regex more and more complex to handle more possible situations. But those only go to emphasize that the awk solution is better; awk deals with the details automatically.
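The effect of anchoring can be sketched as follows, assuming a grep that accepts back-references with -E (GNU grep does; POSIX does not require it). The 'Good Noise GoodBad' row is the example mentioned above; the variable names are illustrative.

```shell
rows='Apple Sugar Apple
Apple Butter Orange
Good Noise GoodBad'

# Unanchored: \1 may match just the start of the third column.
loose=$(printf '%s\n' "$rows" | grep -E '([A-Za-z]+)[[:space:]]+[A-Za-z]+[[:space:]]+\1')
# Anchored with ^ and $: the whole first and third columns must be equal.
strict=$(printf '%s\n' "$rows" | grep -E '^([A-Za-z]+)[[:space:]]+[A-Za-z]+[[:space:]]+\1$')

printf 'unanchored:\n%s\n' "$loose"
printf 'anchored:\n%s\n' "$strict"
```

The anchored form only suits rows with exactly three columns and no trailing blanks, which is one more detail that awk '$1 == $3' handles without extra work.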

Using grep:
grep -P '^([^\t]+)\t[^\t]+\t\1(\t|$)' inFile

Using sed or awk, how can I alter the first field in a delimited line?

I have a delimited file whose first few fields look like this:
2774013300|184500|2012-01-04 23:00:00|
and I want to alter certain rows whose first field equals or exceeds 8 characters.
I want to truncate the value in the first column.
In the case of 2774013300 I want its value to become 27740133.
I would like to do this in sed, preferably, or awk.
Using sed, I can find any number that exceeds 8 digits at the beginning of the line, but am not quite sure how to truncate it, using, I would assume, substitute.
sed -n -e '/^[0-9]\{10,\}/p' infile
I am thinking I could use grouping for the first 8 characters and return those in a substitute command, but I'm not quite sure how to do that.
In awk, I can detect the first field, but am not quite sure how to use substr to alter the first field and then return the remaining fields, so a full line is preserved.
awk -F'|' '{ if (length($1) > 9) { print $1; print length($1);} }' infile

Depending on the subtleties of your situation, you can use
sed 's/^\([0-9]\{8\}\)[0-9]*/\1/' infile
or
sed 's/^\([0-9]\{8\}\)[0-9]\{1,\}/\1/' infile
which with GNU sed can be simplified to
sed -r 's/^([0-9]{8})[0-9]+/\1/' infile
or, if you need to, add -n and p.
Example:
$ sed 's/^\([0-9]\{8\}\)[0-9]*/\1/' <<<'2774013300|184500|2012-01-04 23:00:00|'
27740133|184500|2012-01-04 23:00:00|

Using awk:
awk -F'|' 'BEGIN{OFS=FS}length($1)>8{$1=substr($1,1,8)}{print}'
example:
$ echo "2774013300|184500|2012-01-04 23:00:00|" | awk -F'|' 'BEGIN{OFS=FS}length($1)>8{$1=substr($1,1,8)}{print}'
27740133|184500|2012-01-04 23:00:00|
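To see that shorter values pass through untouched, here is a sketch using 1-based substr() and a length test of more than 8 characters. The second row is hypothetical and not from the question.

```shell
# First row is from the question; the second (7-digit key) is hypothetical.
rows='2774013300|184500|2012-01-04 23:00:00|
1234567|184501|2012-01-04 23:05:00|'

# Setting OFS=FS makes awk rebuild a modified record with | between fields.
truncated=$(printf '%s\n' "$rows" |
    awk -F'|' 'BEGIN { OFS = FS } length($1) > 8 { $1 = substr($1, 1, 8) } { print }')
printf '%s\n' "$truncated"
```

Assigning to $1 causes the record to be rebuilt with OFS, which is why OFS has to be set to | for the rest of the line to keep its delimiters.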

How do I print the word after a regex but not after a similar word?

I want an awk or sed command to print the word after regexp.
I want to find the WORD after a WORD but not the WORD that looks similar.
The file looks like this:
somethingsomething
X-Windows-Icon=xournal
somethingsomething
Icon=xournal
somethingsomething
somethingsomething
I want "xournal" from the line that says "Icon=xournal". This is how far I have come so far. I have tried an awk command too, but it was also unsuccessful.
cat "${file}" | grep 'Icon=' | sed 's/.*Icon=//' >> /tmp/text.txt
But I get both matches, so the text file contains xournal twice, which I don't want.

Use ^ to anchor the pattern at the beginning of the line. And you can even do the grepping directly within sed:
sed -n '/^Icon=/ { s/.*=//; p; }' "$file" >> /tmp/text.txt
You could also use awk, which I think reads a little better. Using = as the field separator, if field 1 is Icon then print field 2:
awk -F= '$1=="Icon" {print $2}' "$file" >> /tmp/text.txt
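Both commands can be checked against the sample lines from the question. This is a sketch with the data inlined instead of read from "$file"; the variable names are illustrative.

```shell
# Sample lines from the question.
desktop='somethingsomething
X-Windows-Icon=xournal
somethingsomething
Icon=xournal'

# sed: select lines starting with Icon=, strip up to the =, print.
via_sed=$(printf '%s\n' "$desktop" | sed -n '/^Icon=/ { s/.*=//; p; }')
# awk: split on =, print field 2 when field 1 is exactly Icon.
via_awk=$(printf '%s\n' "$desktop" | awk -F= '$1 == "Icon" { print $2 }')

printf 'sed: %s\nawk: %s\n' "$via_sed" "$via_awk"
```

Each prints xournal once, because X-Windows-Icon=xournal neither starts with Icon= nor has Icon as its first field.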

This might be useful even though Perl is not one of the tags.
If you are interested in Perl, this small program will do the task for you:
#!/usr/bin/perl -w
while (<>)
{
    # Anchor at the start of the line so X-Windows-Icon= is not matched.
    if (/^Icon=/i)
    {
        print $';
    }
}
This is the output:
C:\Documents and Settings\Administrator>io.pl new2.txt
xournal
Explanation:
while (<>) reads the input from the file given as an argument on the command line.
/^Icon=/i is the regex used in the if condition; the ^ anchors it to the start of the line.
$' holds the part of the line after the match, which is what gets printed.

All you need is:
sed -n 's/^Icon=//p' file

