How to double quote all fields in a text file? [closed] - perl

I'm looking for a quick and efficient way to double-quote all fields in tab-delimited or comma-separated text files.
Ideally, this would be a Perl one-liner that I can run from the command-line, but I'm open to any kind of solution.

Use Text::CSV:
perl -MText::CSV -e'
my $c = Text::CSV->new({always_quote => 1, binary => 1, eol => "\n"}) or die;
$c->print(\*STDOUT, $_) while $_ = $c->getline(\*ARGV)' <<'END'
foo,bar, baz qux,quux
apple,"orange",spam, eggs
END
Output:
"foo","bar"," baz qux","quux"
"apple","orange","spam"," eggs"
The always_quote option is the important one here.
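Since the question also mentions tab-delimited input, the same approach should work with Text::CSV's sep_char option (a sketch; file.tsv is a hypothetical file name):
perl -MText::CSV -e'
my $c = Text::CSV->new({sep_char => "\t", always_quote => 1, binary => 1, eol => "\n"}) or die;
$c->print(\*STDOUT, $_) while $_ = $c->getline(\*ARGV)' file.tsv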

If your file does not contain any double-quoted strings containing the delimiter, you can use
perl -laF, -ne '$" = q(","); print qq("@F")'
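For example, given a comma-separated line, this should produce:
$ echo "foo,bar,baz" | perl -laF, -ne '$" = q(","); print qq("@F")'
"foo","bar","baz"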

awk -F, -v OFS='","' -v q='"' '{$0=q$0q;$1=$1}7' file
For example, comma-separated:
kent $ echo "foo,bar,baz"|awk -F, -v OFS='","' -v q='"' '{$0=q$0q;$1=$1}7'
"foo","bar","baz"
Tab-separated would be similar.
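For instance (a sketch, using a literal tab as the field separator and printf to generate a tab-separated test line):
$ printf "foo\tbar\tbaz\n" | awk -F'\t' -v OFS='","' -v q='"' '{$0=q$0q;$1=$1}7'
"foo","bar","baz"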

Related

Find and replace a string in Perl [closed]

I have the following command line:
perl -i -pe 's/_GSV*//g' file.fasta
My goal is change some sequences that have the following pattern:
GSVIVG01006342001_GSVIVT01006342001
I want to find everything that starts with _GSV and is followed by anything (that's why I put the '*') and replace it with nothing.
When I run my command, it only removes the literal _GSV and gives me this:
GSVIVG01006342001IVT01006342001
but what I want is:
GSVIVG01006342001
Can anybody tell me what's wrong with my command line?
Before the *, add a dot, which matches any character:
perl -i -pe 's/_GSV.*//g' file.fasta
You can also add the $ anchor to make sure the match extends to the end of the string:
perl -i -pe 's/_GSV.*$//g' file.fasta
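For example, on the sample sequence:
$ echo "GSVIVG01006342001_GSVIVT01006342001" | perl -pe 's/_GSV.*$//'
GSVIVG01006342001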

Print all lines matching a pattern and within a delimiter [closed]

I have a file containing lines as below.
#user
codecodecodecodecodecodecodecode
codecodecodecodecodecodecodecode
codecodecodecodecodecodecodecode
codecodecodecodecodecodecodecode;
#user1
code1code1code1code1code1code1
code1code1code1code1code1code1
code1code1code1code1code1code1
code1code1code1code1code1code1;
#user2
code2code2code2code2code2code2
code2code2code2code2code2code2
code2code2code2code2code2code2
code2code2code2code2code2code2;
#user (again "user" but with a different code)
code3code3code3code3code3code3
code3code3code3code3code3code3
code3code3code3code3code3code3
code3code3code3code3code3code3;
I want to extract only the codes for "user"; the output I'm looking for is:
#user
codecodecodecodecodecodecodecode
codecodecodecodecodecodecodecode
codecodecodecodecodecodecodecode
codecodecodecodecodecodecodecode;
#user
code3code3code3code3code3code3
code3code3code3code3code3code3
code3code3code3code3code3code3
code3code3code3code3code3code3;
The result should return only the lines matching "user" and their respective codes.
I tried awk -F";" '{print $1}' $file but I can't isolate the codes for a specific user.
As a Perl one-liner:
perl -ne 'print if /^#/' in.txt
You can use awk as follows:
awk -F\; '/code/' test.txt
Assuming the field delimiter is ; and the pattern is code.
If you want to print a particular line or column, use awk's NR and NF variables respectively.
For more info: http://www.gnu.org/software/gawk/manual/gawk.html
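Neither of those one-liners produces exactly the output asked for (only the blocks headed by #user). A minimal Perl sketch, assuming the header lines in the real file are literally #user, #user1, and so on:
perl -ne '$p = ($_ eq "#user\n") if /^#/; print if $p' file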

removing columns with perl on tab delimited file [closed]

I am trying to remove three columns from a tab-delimited file with Perl.
Input file:
A B C D
Expected/new file:
A B C
I saw in another question how to remove only one column; the answer was:
perl.exe -na -e "print qq{$F[3]\n}" < input
How could I rewrite this to remove three columns?
Thanks
Does this work for you:
perl.exe -na -e "print qq{@F[0..2]\n}" < input > newfile
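Note that interpolating the @F[0..2] slice joins the fields with single spaces. If the output needs to stay tab-separated, a sketch that joins explicitly:
perl.exe -na -e "print join(qq{\t}, @F[0..2]), qq{\n}" < input > newfile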
Use perl in awk-mode:
$ cat -T f1
a^Ib^Ic^Id^Ie^If
a^Ib^Ic^Id^Ie^If
a^Ib^Ic^Id^Ie^If
$ perl -F'\t' -lane 'print $F[0],"\t",$F[1],"\t",$F[2]' input
a b c
a b c
a b c
or space-separated:
$ perl -F'\t' -lane 'print qq{@F[0..2]}' input
a b c
a b c
a b c
or, to print the first three columns tab-separated, in awk:
$ awk 'BEGIN{OFS="\t"}{print $1, $2, $3}' input
a b c
a b c
a b c
perl -lane "pop @F; print qq(@F)" input
Here's another option (Perl v5.14+):
perl -lne "print s/.+\K\s+\S+$//r" inFile

How to find specific number patterns in a data file [closed]

I have a data file that looks like this
15105021
15105043
15106013
15106024
15106035
15105024
15105042
15106015
15106021
15106034
and I need to grep lines that have sequence numbers like 1510603, 1510504
I tried this awk command
awk /[1510603,1510504]/ soursefile.txt
but it does not work (the square brackets make a character class, not an alternation).
Using egrep with a word boundary on the left, since the OP wants to match every number that begins with those sequences:
egrep '\b(1510603|1510504)' file
15105043
15106035
15105042
15106034
A shorter awk:
awk '/1510603|1510504/' file
Based on the contents of your file, the following should suffice:
grep -E '^1510603|^1510504' file
If your grep version does not support the -E flag, try egrep instead of grep
If you insist on awk
awk '/^1510603/ || /^1510504/' file
Think this works:
egrep '1510603|1510504' source
Your question is very poorly stated, but if you want to print all numbers in the file that begin with either 1510603 or 1510504, then you can write this in Perl
perl -ne 'print if /^1510(?:603|504)/' sourcefile.txt

Change "<path>" value with different string depends on last line [closed]

Change "<path>" value with different string depends on last line. In that case when see in the last line "*" to replace "<path>" with "ls -lrt" and to separate the "*" from the last line when see slash for anything else with "find".
Text file:
<path>/etc/inet.d/*.conf
<path>/etc/rc/*
<path>/etc/rc*
Expect View:
find /etc/inet.d/*.conf
ls -lrt /etc/rc/ *
ls -lrt /etc/rc*
I think you mean last character of each line, not last line!
If that is right, check this out:
awk '{if($0~/\*$/)sub(/<path>/,"ls -lrt ");else sub(/<path>/,"find ")}7' file
with your data:
kent$ echo "<path>/etc/inet.d/*.conf
<path>/etc/rc/*
<path>/etc/rc*"|awk '{if($0~/\*$/)sub(/<path>/,"ls -lrt ");else sub(/<path>/,"find ")}7'
find /etc/inet.d/*.conf
ls -lrt /etc/rc/*
ls -lrt /etc/rc*
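A roughly equivalent Perl one-liner (a sketch; like the awk answer, it does not add the extra space shown in the expected view):
perl -pe 'if (/\*$/) { s/<path>/ls -lrt / } else { s/<path>/find / }' file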