Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I have the following XML:
<funcprototype>
<funcdef>void <function>foo</function></funcdef>
<paramdef>int <parameter>target</parameter></paramdef>
<paramdef>char <parameter>name</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>void <function>foo2</function></funcdef>
<paramdef>int <parameter>target2</parameter></paramdef>
<paramdef>char <parameter>name2</parameter></paramdef>
</funcprototype>
I need to get:
void foo( int target char name)
void foo2( int target2 char name2)
Using sed I can get:
void foo(
int target
char name
)
void foo2(
int target2
char name2
)
I do it using this command
awk "/\<funcprototype\>/,/\<\/funcprototype\>/ { print }" foo.xml | sed 's/^[ ^t]*//;s/[ ^]*$//'|sed -e '/^$/d'|sed 's/ //g'| sed 's/<funcprototype>//;s/<funcdef>//;s/<function>/ /;s/<\/function><\/funcdef>/(/;s/<paramdef>//;s/<parameter>/ /;s/<\/parameter><\/paramdef>//;s/<\/funcprototype>/)/;'
How can I get the output I want?
Processing file formats like XML in sed is always a hack, so the "correct" solution depends heavily on which inputs you want to accept. The following sed script at least works on the example data you provided:
:loop
/<\/funcprototype>/ ! { N; b loop; }
s/\n/ /g;
s/<\/\?\(funcdef\|parameter\|function\|funcprototype\)>//g;
s/<paramdef>/(/g;
s/<\/paramdef>/)/g;
s/) *(/, /g;
s/[ ][ ]*/ /g;
s/^ //;
s/ $//;
s/ (/(/;
The interesting bit is the :loop part in the first two lines: the :loop line defines a label, and the second line appends the next input line to the buffer and jumps back to the label until the buffer contains the closing </funcprototype> tag. So after these two commands, a whole multi-line <funcprototype> .. </funcprototype> block is in the buffer (with \n characters separating the lines). The newline characters are then replaced with blanks by the command s/\n/ /g in the third line.
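For readers who find the sed control flow hard to follow, here is a rough Python sketch of the same accumulate-then-substitute idea. It is not part of the original answer; the function name flatten_prototypes and the SAMPLE constant are made up for illustration, and like the sed script it only claims to handle input shaped like the example.

```python
import re

# Sample input copied from the question.
SAMPLE = """\
<funcprototype>
<funcdef>void <function>foo</function></funcdef>
<paramdef>int <parameter>target</parameter></paramdef>
<paramdef>char <parameter>name</parameter></paramdef>
</funcprototype>
<funcprototype>
<funcdef>void <function>foo2</function></funcdef>
<paramdef>int <parameter>target2</parameter></paramdef>
<paramdef>char <parameter>name2</parameter></paramdef>
</funcprototype>
"""


def flatten_prototypes(lines):
    """Yield one signature string per <funcprototype> block (hypothetical helper)."""
    buffer = []
    for line in lines:
        buffer.append(line.strip())
        if "</funcprototype>" not in line:
            continue  # keep accumulating, like `N; b loop` in the sed script
        text = " ".join(buffer)  # like s/\n/ /g
        buffer = []
        # Drop the tags that carry no punctuation of their own.
        text = re.sub(r"</?(funcdef|parameter|function|funcprototype)>", "", text)
        # Each <paramdef> pair becomes a parenthesised group ...
        text = text.replace("<paramdef>", "(").replace("</paramdef>", ")")
        # ... and adjacent groups are merged into one comma-separated list.
        text = re.sub(r"\) *\(", ", ", text)
        # Collapse runs of blanks and trim both ends.
        text = re.sub(r" +", " ", text).strip()
        # Remove the blank between the function name and the first "(".
        yield text.replace(" (", "(", 1)


for signature in flatten_prototypes(SAMPLE.splitlines()):
    print(signature)
```

As in the sed version, any lines between two blocks are simply swept into the next block's buffer, so this is only suitable for files that contain nothing but such blocks.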
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
My original txt file is like:
A "a,b,c"
B "d"
C "e,f"
How do I convert it to:
A a
A b
A c
B d
C e
C f
I tried this
perl -ane '@s=split(/\,/, $F[1]); foreach $k (@s){print "$F[0] $k\n";}' txt.txt
It worked, but how can I eliminate the double quotes (" ")?
You can use substitution to remove the double quotes
@s = split /,/, $F[1] =~ s/"//gr;
The /r modifier (available since Perl 5.14) returns the modified value instead of changing $F[1] in place.
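The same strip-then-split step can be sketched in Python for comparison. This is only an illustration, not part of the answer; the helper name expand is hypothetical, and it assumes each line has exactly a key followed by one quoted, comma-separated field.

```python
def expand(line):
    """Turn 'A "a,b,c"' into ['A a', 'A b', 'A c'] (hypothetical helper)."""
    # Split on the first run of whitespace: key, then the quoted list.
    key, quoted = line.split(None, 1)
    # Remove the double quotes (and trailing newline), then split on commas.
    values = quoted.strip().replace('"', "").split(",")
    return [f"{key} {value}" for value in values]


for row in ['A "a,b,c"', 'B "d"', 'C "e,f"']:
    for pair in expand(row):
        print(pair)
```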
#!/usr/bin/perl
use strict;
use warnings;
#Open files
open(my $in, '<', 'rearrange letters.txt') or die "Cannot open input file: $!";
open(my $tmp, '>', 'temp.txt') or die "Cannot open temp.txt: $!";
# A "a,b,c"
# B "d"
# C "e,f"
#Testing the read itself (not the result of chomp) means no extra trailing newline is needed
while (my $ln = <$in>) {
    chomp $ln;
    #Get rid of double quotes
    $ln =~ s/"//g;
    #Take out the non quoted first character and store it in a scalar variable
    my @line = split(' ', $ln);
    my $capletter = shift @line;
    #Skip blank or incomplete lines
    next unless defined $capletter && @line;
    #split the lower case characters and assign them to an array
    my @lowercaselist = split(',', $line[0]);
    #Iterate through the list of lower case characters and print each to the temp file preceded by the capital character
    for my $lcl (@lowercaselist) {
        print {$tmp} "$capletter $lcl\n";
    }
}
#Close the files
close($in);
close($tmp);
#Overwrite the old file with the temp file if that is what you want
rename('temp.txt', 'rearrange letters.txt') or die "Cannot rename temp.txt: $!";
PASS AC=0;AF=0.048;
AN=2;
ASP;
BaseQRankSum=0.572;
CAF=[0.9605,.,0.03949];
CLNACC=RCV000111759.1,RCV000034730
I'm new here. I want to know how to match CAF=[0.9605,.,0.03949] using a regular expression. Thank you.
while (<>) {
if (
/^CAF= # start of line, then literal 'CAF='
\[ # literal '['
[^\]]+ # 1+ characters different from ']'
\]; # closing ']', then literal ';'
/x
)
{
print;
}
}
The /x modifier allows for linebreaks and comments in the regex (to improve readability).
Or, as a one liner:
perl -ne 'print if (/^CAF=\[[^\]]+\];/);' <your_file>
This prints the complete lines containing the desired pattern.
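Python's re module has an equivalent of /x, so the commented pattern carries over almost unchanged. The sketch below is an illustration only; the name CAF_PATTERN is made up, and it assumes, as the sample does, that the CAF field starts its own line.

```python
import re

# re.VERBOSE plays the role of Perl's /x: whitespace and # comments
# inside the pattern are ignored.
CAF_PATTERN = re.compile(
    r"""
    ^CAF=      # start of line, then literal 'CAF='
    \[         # literal '['
    [^\]]+     # one or more characters other than ']'
    \];        # closing ']', then literal ';'
    """,
    re.VERBOSE,
)

sample_lines = [
    "AN=2;",
    "BaseQRankSum=0.572;",
    "CAF=[0.9605,.,0.03949];",
]

for line in sample_lines:
    if CAF_PATTERN.match(line):
        print(line)
```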
You need to read the documentation for Perl regex. What you are asking doesn't look more complex than a beginner could match having read the docs:
http://perldoc.perl.org/perlre.html
What is a sed script that will remove the "\n" character but only if it is inside "" characters (delimited string), not the \n that is actually at the end of the (virtual) line?
For example, I want to turn this file
"lalala","lalalslalsa"
"lalalala","lkjasjdf
asdfasfd"
"lalala","dasdf"
(line 2 has an embedded \n ) into this one
"lalala","lalalslalsa"
"lalalala","lkjasjdf \\n asdfasfd"
"lalala","dasdf"
(Line 2 and 3 are now joined, and the real line feed was replaced with the character string \\n (or any other easy to spot character string, I'm not picky))
I don't just want to remove every other newline as a previous question asked, nor do I want to remove ALL newlines, just those that are inside quotes. I'm not wedded to sed, if awk would work, that's fine too.
The file being operated on is too large to fit in memory all at once.
sed is an excellent tool for simple substitutions on a single line, but for anything else you should use awk, e.g.:
$ cat tst.awk
{
if (/"$/) {
print prev $0
prev = ""
}
else {
prev = prev $0 " \\\\n "
}
}
$ awk -f tst.awk file
"lalala","lalalslalsa"
"lalalala","lkjasjdf \\n asdfasfd"
"lalala","dasdf"
Below was my original answer, but after seeing @NeronLeVelu's approach of just testing for a quote at the end of the line I realized I was doing this in a much too complicated way. You could just replace gsub(/"/,"&") % 2 below with !/"$/ and it'd work the same, but the above code is a simpler implementation of the same functionality and will now handle embedded escaped double quotes as long as they aren't at the end of a line.
$ cat tst.awk
{ $0 = saved $0; saved="" }
gsub(/"/,"&") % 2 { saved = $0 " \\\\n "; next }
{ print }
$ awk -f tst.awk file
"lalala","lalalslalsa"
"lalalala","lkjasjdf \\n asdfasfd"
"lalala","dasdf"
The above only stores 1 output line in memory at a time. It just keeps building up an output line from input lines while the number of double quotes in that output line is an odd number, then prints the output line when it eventually contains an even number of double quotes.
It will fail if you can have double quotes inside your quoted strings escaped as \", not "", but you don't show that in your posted sample input so hopefully you don't have that situation. If you have that situation you need to write/use a real CSV parser.
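The quote-counting approach can also be sketched in Python, reading the input one line at a time so that only the current output record is held in memory. This is an illustration under the same assumptions as the awk script (quotes inside fields are doubled, never backslash-escaped); the function name join_quoted_lines and its marker parameter are hypothetical.

```python
def join_quoted_lines(lines, marker=r" \\n "):
    """Yield records, replacing newlines inside quoted fields with `marker`."""
    pending = ""
    for line in lines:
        pending += line.rstrip("\n")
        if pending.count('"') % 2:
            # Odd number of quotes so far: a quoted field is still open,
            # so this line break was inside the field.
            pending += marker
        else:
            yield pending
            pending = ""
    if pending:
        # Input ended inside an open quote; emit what was collected.
        yield pending


sample = [
    '"lalala","lalalslalsa"\n',
    '"lalalala","lkjasjdf\n',
    'asdfasfd"\n',
    '"lalala","dasdf"\n',
]
for record in join_quoted_lines(sample):
    print(record)
```

Passing an open file object as lines keeps the streaming behaviour, since Python iterates over files lazily.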
sed -n ':load
/"$/ !{N
b load
}
:cycle
s/^\(\([^"]*"[^"]*"\)*\)\([^"]*"[^"]*\)\n/\1\3 \\\\n /
t cycle
p' YourFile
Load lines into the working buffer until a closing line (one ending with ") is found or the end of the file is reached.
Replace a \n that is preceded, counting from the start of the buffer, by any number of complete open/close pairs of " and then a single unmatched " (with only non-quote characters in between) with the escaped form of the newline. In effect, the leading string plus \n is replaced by the leading string plus the escaped newline.
If a substitution occurred, try another one (:cycle and t cycle).
Print the result.
Continue until the end of the file.
Thanks to @Ed Morton for the remark about the escaped newline.
I'm a beginner to sed. I know that it's possible to apply a command (or a set of commands) to a certain range of lines like so
sed '/[begin]/,/[end]/ [some command]'
where [begin] is a regular expression that designates the beginning line of the range and [end] is a regular expression that designates the ending line of the range (which is itself included in the range).
I'm trying to use this to specify a range of lines in a file and join them all into one line. Here's my best try, which didn't work:
sed '/[begin]/,/[end]/ {
N
s/\n//
}
'
I'm able to select the set of lines I want without any problem, but I just can't seem to merge them all into one line. If anyone could point me in the right direction, I would be really grateful.
One way using GNU sed:
sed -n '/begin/,/end/ { H;g; s/^\n//; /end/s/\n/ /gp }' file.txt
This is straightforward if you only want to select some lines and join them. Use Steve's answer or my pipe-to-tr alternative:
sed -n '/begin/,/end/p' | tr -d '\n'
It becomes a bit trickier if you want to keep the other lines as well. Here is how I would do it (with GNU sed):
join.sed
/\[begin\]/ {
:a
/\[end\]/! { N; ba }
s/\n/ /g
}
So the logic here is:
When [begin] line is encountered start collecting lines into pattern space with a loop.
When [end] is found stop collecting and join the lines.
Example:
seq 9 | sed -e '3s/^/[begin]\n/' -e '6s/$/\n[end]/' | sed -f join.sed
Output:
1
2
[begin] 3 4 5 6 [end]
7
8
9
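The collect-then-join logic of join.sed can be written out in Python as a rough sketch. This is not the answer's method, just the same idea in another language; the function name join_range and its parameters are hypothetical, and if the end marker never appears the sketch simply emits whatever it collected, joined.

```python
import re


def join_range(lines, begin, end, sep=" "):
    """Join each begin..end run of lines into one line; pass others through."""
    begin_re, end_re = re.compile(begin), re.compile(end)
    collected = None  # None means "not currently inside a range"
    for raw in lines:
        line = raw.rstrip("\n")
        if collected is None and not begin_re.search(line):
            yield line  # ordinary line outside any range
            continue
        if collected is None:
            collected = []  # the begin line starts a new range
        collected.append(line)
        if end_re.search(line):
            yield sep.join(collected)  # like s/\n/ /g on the pattern space
            collected = None
    if collected is not None:
        yield sep.join(collected)  # no end marker before end of input


demo = ["1", "2", "[begin]", "3", "4", "5", "6", "[end]", "7", "8", "9"]
for out in join_range(demo, r"\[begin\]", r"\[end\]"):
    print(out)
```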
I like your question. I also like Sed. Regrettably, I do not know how to answer your question in Sed; so, like you, I am watching here for the answer.
Since no Sed answer has yet appeared here, here is how to do it in Perl:
perl -we 'my $flag = 0; while (<>) { chomp; if (/[begin]/) {$flag = 1;} print if $flag; if (/[end]/) {print "\n" if $flag; $flag = 0;} } print "\n" if $flag;'
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Character Lowercase to Uppercase in Shell Scripting
I have the value james,adam,john and I am trying to turn it into James,Adam,John (the first character of each name should be uppercase).
echo 'james,adam,john' | sed 's/\<./\u&/g'
does not work on all systems. On one system it works, but on another it does not.
A="james adam john"
B=( $A )
echo "${B[@]^}"
but it throws a syntax error. So I am doing it with a long script using a while loop, which is too lengthy.
Is there any shortcut way to do this?
There are many ways to define "beginning of a name". This method chooses any letter after a word boundary and transforms it to upper case. As a side effect, this will also work with names such as "Sue Ellen", or "Billy-Bob".
echo "james,adam,john" | perl -pe 's/(\b\pL)/\U$1/g'
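For comparison, the same "letter after a word boundary" rule can be sketched in Python. This is an illustration, not part of the answer; the function name capitalize_words is made up, and [^\W\d_] is used as a stand-in for Perl's \pL because Python's built-in re module has no \p classes.

```python
import re


def capitalize_words(text):
    """Uppercase every letter that directly follows a word boundary."""
    # [^\W\d_] = a word character that is neither a digit nor "_",
    # i.e. a letter.
    return re.sub(r"\b[^\W\d_]", lambda m: m.group().upper(), text)


print(capitalize_words("james,adam,john"))
print(capitalize_words("sue ellen,billy-bob"))
```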
With Perl:
echo "james,adam,john" | \
perl -ne 'print join(",", map{ ucfirst } split(/,/))'
You can use awk like this to capitalize first letter of every word in your input:
echo "james,adam,john" | awk 'BEGIN { RS=","; FS=""; ORS=","; OFS=""; }
{ $1=toupper($1); print $0; }'
OUTPUT
James,Adam,John
Same method as TLP but with GNU sed:
echo "james,adam,john,sue ellen,billy-bob" | sed -r 's/\b(.)/\u\1/g'
output:
James,Adam,John,Sue Ellen,Billy-Bob
If only the first letter of each comma-separated name should be capitalized, use this instead:
echo "james,adam,john,sue ellen,billy-bob" | sed 's/[^,]*/\u&/g'
output:
James,Adam,John,Sue ellen,Billy-bob
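The per-field variant needs no regular expression at all in a language with string splitting. A minimal Python sketch, with the hypothetical helper name capitalize_fields, assuming a comma is the only separator:

```python
def capitalize_fields(text, sep=","):
    """Uppercase only the first character of each separator-delimited field."""
    # field[:1] is safe for empty fields; the rest of the field is left
    # untouched (str.capitalize() would also lowercase it).
    return sep.join(field[:1].upper() + field[1:] for field in text.split(sep))


print(capitalize_fields("james,adam,john,sue ellen,billy-bob"))
```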