removing columns with perl on tab delimited file [closed] - perl

I am trying to remove three columns with Perl from a tab-delimited file.
Input file:
A B C D
Expected/new file:
A B C
I saw in another question how to remove only one column; the answer was:
perl.exe -na -e "print qq{$F[3]\n}" < input
How could I rewrite this to remove three columns?
Thanks

Does this work for you:
perl.exe -na -e "print qq{@F[0..2]\n}" < input > newfile

Use perl in awk-mode:
$ cat -T f1
a^Ib^Ic^Id^Ie^If
a^Ib^Ic^Id^Ie^If
a^Ib^Ic^Id^Ie^If
$ perl -F'\t' -lane 'print $F[0],"\t",$F[1],"\t",$F[2]' input
a b c
a b c
a b c
or space-separated:
$ perl -F'\t' -lane 'print qq{@F[0..2]}' input
a b c
a b c
a b c
or, to print the first three columns tab-separated with awk:
$ awk 'BEGIN{OFS="\t"}{print $1, $2, $3}' input
a b c
a b c
a b c
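The one-liners above hard-code three columns. As a minimal sketch of the same idea with the column count as a parameter, the function below keeps the first N tab-separated fields of each input line. The name keep_first_cols is hypothetical, and the sketch assumes a POSIX awk is available.

```shell
# keep_first_cols N: print the first N tab-separated columns of standard input.
# (Hypothetical helper name, not part of any answer above.)
keep_first_cols() {
  awk -F'\t' -v OFS='\t' -v n="$1" '{
    line = $1
    # Append fields 2..N, stopping early on short lines
    for (i = 2; i <= n && i <= NF; i++) line = line OFS $i
    print line
  }'
}

# Example: keep columns 1-3 of a four-column line
printf 'A\tB\tC\tD\n' | keep_first_cols 3
```

For a fixed leading range like this one, cut -f1-3 input gives the same result, since cut splits on tabs by default.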

perl -lane "pop @F; print qq(@F)" input

Here's another option (Perl v5.14+):
perl -lne "print s/.+\K\s+\S+$//r" inFile


Linux shell script, parsing each line [closed]

I am facing a problem with my shell script (I'm using sh):
I have a file with multiple lines, some of which contain mail addresses, for example:
abcd
plm
name_aA.2isurnamec@Text.com -> this line satisfies the condition
random efgh
aaaaaa
naaame_aB.3isurnamec@Text.ro -> same (this note is not part of the file)
I have used grep to filter the correct mail addresses like this:
grep -E '^[a-z][a-zA-Z_]*.[0-9][a-zA-Z0-9]+@[A-Z][A-Z0-9]{,12}.(ro|com|eu)$' file.txt
I have to write a shell script that checks the file and prints the following (for the above example it would look like this):
"Incorrect:" abcd
"Incorrect:" plm
"Correct:" name_aA.2isurnamec@Text.com
"Incorrect:" random efgh
"Incorrect:" aaaaaa
"Correct:" naaame_aB.3isurnamec@Text.ro
I want to solve this problem using grep or sed together with while, if, pipes and so on; I don't want to use lists or other constructs.
I have tried something like this:
grep condition abc.txt | while read -r line ; do
echo "Processing $line"
# your code goes here
done
but it only prints the correct lines. I know that I can also print the lines that don't match the condition with grep -v, but I want to print the lines in the order they appear in the text file.
I'm having trouble parsing each line of the file, or maybe I don't need to parse the lines one by one; I really don't know how to solve it.
If you could help me, I would appreciate it.
Thanks
#!/bin/bash
pattern='^[a-z][a-zA-Z_]*\.[0-9][a-zA-Z0-9]+@[A-Z][A-Za-z0-9]{,12}\.(ro|com|eu)$'
# Read standard input line by line; IFS= and -r keep each line verbatim
while IFS= read -r line; do
    if [ "$line" ]; then    # skip empty lines
        if printf '%s\n' "$line" | grep -E -q "$pattern"; then
            echo "\"Correct:\" $line"
        else
            echo "\"Incorrect:\" $line"
        fi
    fi
done
Invoke it like this, assuming the bash script is called filter and the text file is called text.txt: ./filter < text.txt
Note that the full stops in the regular expression are escaped and that the domain name can contain lowercase letters (although, I think that your regex is too restrictive). Other characters are not escaped because the string is in single quotes.
while reads the standard input line by line into $line; the first if skips the empty lines; the second one checks $line against $pattern (-q suppresses grep output).
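The loop above starts one grep process per line. As a rough alternative sketch, the same labelling can be done in a single sed pass that keeps the input order: delete empty lines, prefix matching lines with "Correct:" and branch to the end of the script, and prefix everything else with "Incorrect:". The function name classify_addresses is hypothetical. The sketch assumes a sed that accepts -E and that the addresses use @ as the separator, and it writes the interval as {0,12} for portability.

```shell
# classify_addresses: label each non-empty line of standard input.
# (Hypothetical helper name; the regular expression is the one from the answer above.)
classify_addresses() {
  sed -E '
    /^$/d
    /^[a-z][a-zA-Z_]*\.[0-9][a-zA-Z0-9]+@[A-Z][A-Za-z0-9]{0,12}\.(ro|com|eu)$/ {
      s/^/"Correct:" /
      b
    }
    s/^/"Incorrect:" /
  '
}

# Example run on a few lines from the question
printf '%s\n' abcd 'name_aA.2isurnamec@Text.com' 'random efgh' | classify_addresses
```

The b command with no label jumps to the end of the script, so a line that has already been marked "Correct:" never reaches the final substitution.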

What is the function of the '-n option' in sed? [closed]

Can anyone give me an example to showcase the usage of the -n option? The way the document explained it was too vague for me.
It controls whether a line is printed even when no command explicitly prints it.
Given these lines:
$ printf '%s\n' {a..c}{0..1}
a0
a1
b0
b1
c0
c1
With -n, only the lines that match '1' are printed:
$ printf '%s\n' {a..c}{0..1} | sed -n '/1/p;'
a1
b1
c1
Without -n, every line is printed, and the lines that match '1' are printed a second time because of the p command:
$ printf '%s\n' {a..c}{0..1} | sed '/1/p;'
a0
a1
a1
b0
b1
b1
c0
c1
c1
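To make the contrast concrete, here is a small sketch that produces the same filtered output in two ways: with -n plus an explicit p, and without -n by deleting the lines that do not match. The helper sample_lines is hypothetical and writes the sample input out explicitly, so it does not rely on bash brace expansion.

```shell
# Sample input, equivalent to the six lines used above
sample_lines() { printf '%s\n' a0 a1 b0 b1 c0 c1; }

# With -n, nothing is printed unless a command asks for it (here: p)
sample_lines | sed -n '/1/p'

# Without -n, auto-printing stays on, so non-matching lines are deleted instead
sample_lines | sed '/1/!d'
```

Both commands print a1, b1 and c1 once each; which form to use is largely a matter of whether the script is easier to state as "print these" or "delete those".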
Easily available with man sed:
-n By default, each line of input is echoed to the standard output
after all of the commands have been applied to it. The -n option
suppresses this behavior.
You use the -n option when you're using sed as a filter and are mainly eliminating material, so you don't want input lines printed by default. In the absence of -n, at the end of each edit cycle, the 'line' in the pattern space is printed by default. For example:
sed -n -e '/something/ s/another-thing/dohicky/p' some-file
This looks for lines containing 'something', and effectively ignores all other lines. On the lines that match 'something', the string 'another-thing' is replaced by 'dohicky' and the resulting line is printed.
A slightly more realistic example: change the shell of users of csh to tcsh:
sed -n -e '/\/bin\/csh/ s%/bin/csh%/bin/tcsh%p' /etc/passwd
This shows you only the lines that will be changed. Then, if you decide it is correct, you can use:
sed -i .bak -e '/\/bin\/csh/ s%/bin/csh%/bin/tcsh%' /etc/passwd
to make the changes permanent. Still not a good example (you need to lock the password file so it is not modified by other programs while you're editing it, and so on and so forth), but it looks more realistic.
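The preview-then-apply workflow can be sketched on a scratch file instead of the real /etc/passwd. The two function names below are hypothetical. The apply step writes to a new file and renames it rather than using -i, because the syntax of the in-place option differs between sed implementations.

```shell
# preview_shell_change FILE: show only the lines that would change (dry run)
preview_shell_change() {
  sed -n -e '/\/bin\/csh/ s%/bin/csh%/bin/tcsh%p' "$1"
}

# apply_shell_change FILE: rewrite FILE via a temporary copy, keeping FILE.bak
apply_shell_change() {
  cp "$1" "$1.bak" &&
  sed -e '/\/bin\/csh/ s%/bin/csh%/bin/tcsh%' "$1.bak" > "$1.new" &&
  mv "$1.new" "$1"
}

# Example on a scratch file with made-up entries
scratch=$(mktemp)
printf '%s\n' 'alice:x:1000:1000::/home/alice:/bin/csh' \
              'bob:x:1001:1001::/home/bob:/bin/bash' > "$scratch"
preview_shell_change "$scratch"
apply_shell_change "$scratch"
cat "$scratch"
rm -f "$scratch" "$scratch.bak"
```

The same address and substitution are used in both steps; only the -n option and the p flag differ, which is what turns the command from a preview into an edit.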

How to remove a string between a pattern using sed / awk command in linux [closed]

How can I remove the letter f from the following string in a file:
a;b;c;d;e;f;g;h;i;j;k;l;m
This needs to be done with sed or awk, using ; as the delimiter.
The output should be:
a;b;c;d;e;g;h;i;j;k;l;m
This might work for you (GNU sed):
sed 's/[^;]*;//6' file
$ echo 'a;b;c;d;e;f;g;h;i;j;k;l;m' | sed 's/;*f;*/;/'
a;b;c;d;e;g;h;i;j;k;l;m
This is easier with a Perl in-place one-liner (perl -p -i -e) than with sed (unless sed has added an in-place edit flag in the last 20 years).
perl -p -i -e 's/;f;/;/' fileName.txt
sed 's/f;//' YourFile
Be careful if f is only a placeholder in the sample: a more general pattern may contain characters that are special in a regular expression and would need escaping.
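If the field is identified by its position rather than its content, one way to avoid escaping concerns entirely is to rebuild the line from its fields. The following is a minimal sketch with awk; the function name drop_field is hypothetical, and it assumes ; never occurs inside a field.

```shell
# drop_field N: remove the Nth ;-separated field from each line of standard input
drop_field() {
  awk -F';' -v OFS=';' -v n="$1" '{
    out = ""; sep = ""
    for (i = 1; i <= NF; i++) {
      if (i == n) continue        # skip the unwanted field
      out = out sep $i
      sep = OFS                   # separator is needed from the second kept field on
    }
    print out
  }'
}

# Example: f is the 6th field
echo 'a;b;c;d;e;f;g;h;i;j;k;l;m' | drop_field 6
```

Because no pattern is matched against the field's text, this works the same whatever the sixth field contains.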

How to find specific number patterns in a data file [closed]

I have a data file that looks like this
15105021
15105043
15106013
15106024
15106035
15105024
15105042
15106015
15106021
15106034
and I need to grep the lines that contain number sequences such as 1510603 or 1510504.
I tried this awk command:
awk /[1510603,1510504]/ soursefile.txt
but it does not work.
Using egrep with a word boundary on the left-hand side, since the OP wants to match any trailing digits on the right-hand side:
egrep '\b(1510603|1510504)' file
15105043
15106035
15105042
15106034
A shorter awk:
awk '/1510603|1510504/' file
Based on the contents of your file, the following should suffice:
grep -E '^1510603|^1510504' file
If your grep version does not support the -E flag, try egrep instead of grep.
If you insist on awk:
awk '/^1510603/ || /^1510504/' file
I think this works:
egrep '1510603|1510504' source
Your question is very poorly stated, but if you want to print all numbers in the file that begin with either 1510603 or 1510504, then you can write this in Perl
perl -ne 'print if /^1510(?:603|504)/' sourcefile.txt
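The original attempt fails because square brackets form a bracket expression, which matches a single character from the listed set rather than either of the two numbers. The sketch below shows the difference on the sample data from the question; the helper name numbers is hypothetical.

```shell
# Sample data from the question
numbers() {
  printf '%s\n' 15105021 15105043 15106013 15106024 15106035 \
                15105024 15105042 15106015 15106021 15106034
}

# A bracket expression matches ONE character out of the set {0,1,3,4,5,6,","},
# so every line counts as a match
numbers | grep -c '[1510603,1510504]'

# Alternation anchored at the start of the line selects the two prefixes
numbers | grep -E '^(1510603|1510504)'
```

The first command reports that all ten lines match, while the second prints only the four lines that begin with one of the two prefixes.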

How to double quote all fields in a text file? [closed]

I'm looking for a quick and efficient way to double quote all fields in tab delimited or comma separated text files.
Ideally, this would be a Perl one-liner that I can run from the command-line, but I'm open to any kind of solution.
Use Text::CSV:
perl -MText::CSV -e'
my $c = Text::CSV->new({always_quote => 1, binary => 1, eol => "\n"}) or die;
$c->print(\*STDOUT, $_) while $_ = $c->getline(\*ARGV)' <<'END'
foo,bar, baz qux,quux
apple,"orange",spam, eggs
END
Output:
"foo","bar"," baz qux","quux"
"apple","orange","spam"," eggs"
The always_quote option is the important one here.
If your file does not contain any double quoted strings containing the delimiter, you can use
perl -laF, -ne '$" = q(","); print qq("@F")'
awk -F, -v OFS='","' -v q='"' '{$0=q$0q;$1=$1}7' file
For example, comma-separated:
kent $ echo "foo,bar,baz"|awk -F, -v OFS='","' -v q='"' '{$0=q$0q;$1=$1}7'
"foo","bar","baz"
Tab-separated would be similar.
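For the tab-delimited case, a sketch along the same lines might look as follows. The function name quote_tsv is hypothetical, and like the awk command above it assumes that no field already contains a tab or a double quote.

```shell
# quote_tsv: wrap every tab-separated field of standard input in double quotes
quote_tsv() {
  awk -F'\t' -v OFS='"\t"' -v q='"' '{
    $1 = $1          # force awk to rebuild the line with OFS between fields
    print q $0 q     # add the outer quotes
  }'
}

# Example: three fields, one containing a space
printf 'foo\tbar baz\tqux\n' | quote_tsv
```

Setting OFS to quote, tab, quote supplies the inner quotes between fields, so only the first and last quote have to be added explicitly.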