How to deal with sed in a Tcl script with substitution

I am writing a Tcl script which inserts some text in a file after the matched line. The following is the basic code in the script:
set test_lists [list "test_1"\
"test_2"\
"test_3"\
"test_4"\
"test_5"
]
foreach test $test_lists {
set content "
'some_data/$test'
"
exec sed -i "/dog/a$content" /Users/l/Documents/Codes/TCL/file.txt
}
However, when I run this script, it always shows me this error:
dyn-078192:TCL l$ tclsh test.tcl
sed: -e expression #1, char 12: unknown command: `''
while executing
"exec sed -i "/dog/a$content" /Users/l/Documents/Codes/TCL/file.txt"
("foreach" body line 5)
invoked from within
"foreach test $test_lists {
set content "
'some_data/$test'
"
exec sed -i "/dog/a$content" /Users/l/Documents/Codes/TCL/file.txt
}"
(file "test.tcl" line 8)
Somehow it always tries to evaluate the first word in $content as a command.
Any idea what I should do here to make this work?
Thanks.

You first should decide exactly what characters need to be processed by sed. (See https://unix.stackexchange.com/questions/445531/how-to-chain-sed-append-commands for why this can matter…) They might possibly be:
/dog/a\
'some_data/test_1'
which would turn a file like:
abc
dog
hij
into
abc
dog
'some_data/test_1'
hij
If that's what you want, you can then proceed to the second stage: getting those characters from Tcl into sed.
# NB: *no* newline here!
set content "'some_data/$test'"
# NB: there's a quoted backslash and two quoted newlines here
exec sed -i "/dog/a\\\n$content\n" /Users/l/Documents/Codes/TCL/file.txt
One of the few places where you need to be careful with quoting in Tcl is when you have backslashes and newlines in close proximity.
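To check the first stage on its own, before Tcl quoting gets involved, the same sed program can be run from a plain shell. This is a minimal sketch, assuming a POSIX shell; the helper name insert_after is made up for illustration, and it writes to standard output instead of using -i so the input stays untouched:

```shell
# insert_after PATTERN TEXT: copy stdin to stdout, adding TEXT as a new
# line after every line that matches PATTERN. (Hypothetical helper name.)
insert_after() {
    # Inside double quotes the shell turns \\ into a single backslash, so
    # sed receives:  /PATTERN/a\ <newline> TEXT <newline>
    sed "/$1/a\\
$2
"
}

printf 'abc\ndog\nhij\n' | insert_after dog "'some_data/test_1'"
```

If this prints the four-line result shown above, the sed program is right and any remaining trouble lies in how Tcl builds the argument. The sketch does not protect PATTERN or TEXT if they contain slashes or backslashes.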
Why not perform the text transformation directly in Tcl itself? This might reverse the order of the inserted lines compared to the original code. You can fix that by applying lreverse to the list at a convenient time, and perhaps you will also want to do further massaging of the text to insert. Those are all refinements...
set test_lists [list "'some_data/test_1'"\
"'some_data/test_2'"\
"'some_data/test_3'"\
"'some_data/test_4'"\
"'some_data/test_5'"
]
set filename /Users/l/Documents/Codes/TCL/file.txt
set REGEXP "dog"
# Read in the data; this is good even for pretty large files
set f [open $filename]
set lines [split [read $f] "\n"]
close $f
# Search for first matching line by regular expression
set idx [lsearch -regexp $lines $REGEXP]
if {$idx >= 0} {
# Found something, so do the insert in the list of lines
set lines [linsert $lines [expr {$idx + 1}] {*}$test_lists]
# Write back to the file as we've made changes
set f [open $filename "w"]
puts -nonewline $f [join $lines "\n"]
close $f
}

(an extended comment, not an answer)
Running this in the shell to clarify your desired output: is this what you want?
$ cat file.txt
foo
dog A
dog B
dog C
dog D
dog E
bar
$ for test in test_{1..5}; do content="some_data/$test"; sed -i "/dog/a$content" file.txt; done
$ cat file.txt
foo
dog A
some_data/test_5
some_data/test_4
some_data/test_3
some_data/test_2
some_data/test_1
dog B
some_data/test_5
some_data/test_4
some_data/test_3
some_data/test_2
some_data/test_1
dog C
some_data/test_5
some_data/test_4
some_data/test_3
some_data/test_2
some_data/test_1
dog D
some_data/test_5
some_data/test_4
some_data/test_3
some_data/test_2
some_data/test_1
dog E
some_data/test_5
some_data/test_4
some_data/test_3
some_data/test_2
some_data/test_1
bar
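The reversed order comes from running sed once per item: each run appends directly after the matching line, pushing the earlier insertions down. If the intended result is the five lines in their original order, one option is to append the whole block in a single pass. A sketch using awk, assuming a POSIX shell; the helper name append_block and the environment variable names PATTERN and BLOCK are invented for this example:

```shell
# append_block PATTERN BLOCK: copy stdin to stdout and print BLOCK (which
# may span several lines) after every line matching the regex PATTERN.
append_block() {
    # Pass both values through the environment so that newlines and
    # backslashes in BLOCK reach awk unchanged.
    PATTERN=$1 BLOCK=$2 awk '
        { print }                                           # echo the input line
        $0 ~ ENVIRON["PATTERN"] { print ENVIRON["BLOCK"] }  # then the block, if matched
    '
}

# Build the five lines once, in the intended order.
block=$(for n in 1 2 3 4 5; do echo "some_data/test_$n"; done)

printf 'foo\ndog A\ndog B\nbar\n' | append_block dog "$block"
```

Each "dog" line is then followed by test_1 through test_5 in ascending order.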

Related

How to print some free text in addition to SED extract

Well-known SED command to extract a first line and print to another file
sed -n '1 p' /p/raw.txt | cat >> /p/001.txt ;
gives an output in /p/001.txt like
John Doe
But how to modify this command above to add some free text and have, for example, the output like
Name: John Doe
Thanks for any hint to try.
You can do that in a single command (and no sub-shells):
sed 's/^/Name: /;q' /p/raw.txt >> /p/001.txt
This prefixes "Name: " in front of the first line, prints it, then quits so you don't process additional lines. Add a line number before the q to print all lines up to (and including) that number. The output is appended to /p/001.txt just like your original code.
If you want a range of lines:
sed -n '3,9{s/^/Name: /;p}9q' /p/raw.txt >> /p/001.txt
This reads from lines 3-9, performs the substitution, prints, then quits after line 9.
If you want specific lines, I recommend awk:
awk 'NR==3 || NR==9 { print "Name: " $0 } NR>=9 { exit }' /p/raw.txt >> /p/001.txt
This has two clauses. One says the number of record (line number) is either 3 or 9, in which case we print the prefix and the line. The other tells us to stop reading the file after the 9th record.
Here are two more commands to show how awk can act on just the first line(s) or a given range:
awk '{ print "Name: " $0 } NR >= 1 { exit }' /p/raw.txt >> /p/001.txt
awk '3 <= NR { print "Name: " $0 } NR >= 9 { exit }' /p/raw.txt >> /p/001.txt
It appears you're continuously building one file from the other. Consider:
tail -Fn0 /p/raw.txt |sed 's/^/Name: /' >> /p/001.txt
This will run continuously, adding only new entries (added after the command is run) to /p/001.txt
Perhaps you have lots of duplicates to resolve?
awk 'NR != FNR { $0 = "Name: " $0 } !s[$0]++' \
/p/001.txt /p/raw.txt > /tmp/001.txt && mv /tmp/001.txt /p/001.txt
This folds together the previously saved names with any new names, printing names only once (!s[$0]++ is true when s[$0] is zero (its default state), but after the evaluation, it increments to one, making it false on the second occurrence. When a bare clause has no action, the line is printed.) Because we're reading the output file, we need a temporary output. Upon its successful completion, we then move it atop the target output file.
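A small way to watch the de-duplication at work, using throwaway files instead of /p/001.txt and /p/raw.txt. This is only a sketch: the function name merge_names and the file names are made up, and the array is renamed to seen for readability.

```shell
# merge_names SAVED RAW: print the lines of SAVED unchanged, then each
# line of RAW prefixed with "Name: ", skipping anything already printed.
merge_names() {
    awk 'NR != FNR { $0 = "Name: " $0 }   # true only while reading the second file
         !seen[$0]++                      # print a line the first time it is seen
    ' "$1" "$2"
}

dir=$(mktemp -d)
printf 'Name: John Doe\n' > "$dir/saved.txt"
printf 'John Doe\nJane Roe\nJane Roe\n' > "$dir/raw.txt"
merge_names "$dir/saved.txt" "$dir/raw.txt"
rm -r "$dir"
```

One caveat worth knowing: if the saved file is empty, NR and FNR stay equal while the raw file is read, so the prefix is never added.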
printf "Name : %s\n" "$(sed -n '1p;q' /p/raw.txt)" >/p/001.txt
should do it. Alternatively, with echo instead of printf:
echo -e "Name : $(sed -n '1p;q' /p/raw.txt)" >/p/001.txt
Note
The q command makes sed quit without processing any more commands or input.
The -e option tells echo to interpret escape sequences. This is a peculiarity of bash shell.

Replace text only in a clause delimited by keywords

I want to process text in some files where an expression sometimes sits on a single line and sometimes spans multiple lines.
For example, on multiple lines:
myfunc(param1,
param2,
param3);
or on one line:
myfunc(param1, param2, param3);
Is there a way to make sed process only the text between the keywords myfunc and ;?
As a first step, this would help me join every multi-line occurrence into a single line. Then I can do my manipulation on the one-line form.
Is this possible?
Sed is for simple substitutions on individual lines, that is all. For anything even slightly more interesting you should be using awk. Something like this is probably what you want:
$ cat file
myfunc(param1,
param2,
param3);
myfunc(param1, param2, param3);
$ cat tst.awk
/myfunc/ { buf=""; inBlock=1 }
inBlock {
buf = (buf==""?"":buf RS) $0
if (/;/) {
$0 = buf
sub(/param2/,"lets have a tea party")
inBlock = 0
}
}
!inBlock
$ awk -f tst.awk file
myfunc(param1,
lets have a tea party,
param3);
myfunc(param1, lets have a tea party, param3);
Just replace the sub(/param2/,"lets have a tea party") line with whatever it is you really want to do with that block of text between myfunc and ;.
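The same buffering idea also covers the first step the question mentions, joining each multi-line call into one line. The following is a sketch under stated assumptions rather than a general parser: a call starts on a line containing myfunc and ends at the first line containing ;, and the function name join_calls is made up for illustration.

```shell
# join_calls: copy stdin to stdout, folding every myfunc(...); call that
# spans several lines into a single line.
join_calls() {
    awk '
        /myfunc/ { inBlock = 1; buf = "" }      # a call starts on this line
        inBlock {
            if (buf == "") {
                buf = $0                        # first line of the call, kept as is
            } else {
                sub(/^[ \t]+/, "")              # drop indentation of continuation lines
                buf = buf " " $0
            }
            if (/;/) { print buf; inBlock = 0 } # terminator seen: emit the joined call
            next
        }
        { print }                               # lines outside a call pass through
    '
}

printf 'myfunc(param1,\n    param2,\n    param3);\nmyfunc(a, b, c);\n' | join_calls
```

A ; inside a string or comment would end the block early, so real source code may need a stricter end condition.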
You can use the following sed script:
extract.sed:
# Check for "my"
/\bmy\b/ {
# Replace everything in front of
# my (including it)
s/.*\bmy\b//
# Define a label "a"
:a
# If the line does not contain "processes"
/\bprocesses\b/!{
# Get the next line of input and append
# it to the pattern buffer
N
# Branch back to label "a"
ba
}
# Replace "processes" and everything after it
s/\bprocesses\b.*//
# Print the pattern buffer
p
}
Call it like this:
sed -nf extract.sed input.txt

Use sed to replace word in 2-line pattern

I try to use sed to replace a word in a 2-line pattern with another word. When in one line the pattern 'MACRO "something"' is found then in the next line replace 'BLOCK' with 'CORE'. The "something" is to be put into a reference and printed out as well.
My input data:
MACRO ABCD
CLASS BLOCK ;
SYMMETRY X Y ;
Desired outcome:
MACRO ABCD
CLASS CORE ;
SYMMETRY X Y ;
My attempt in sed so far:
sed 's/MACRO \([A-Za-z0-9]*\)/,/ CLASS BLOCK ;/MACRO \1\n CLASS CORE ;/g' input.txt
The above did not work giving message:
sed: -e expression #1, char 30: unknown option to `s'
What am I missing?
I'm open to one-liner solutions in perl as well.
Thanks,
Gert
Using a perl one-liner in slurp mode:
perl -0777 -pe 's/MACRO \w+\n CLASS \KBLOCK ;/CORE ;/g' input.txt
Or using a streaming example:
perl -pe '
s/^\s*\bCLASS \KBLOCK ;/CORE ;/ if $prev;
$prev = $_ =~ /^MACRO \w+$/
' input.txt
Explanation:
Switches:
-0777: Slurp files whole
-p: Creates a while(<>){...; print} loop for each line in your input file.
-e: Tells perl to execute the code on command line.
When in one line the pattern 'MACRO "something"' is found then in the
next line replace 'BLOCK' with 'CORE'.
sed works on lines of input. If you want to perform substitution on the next line of a specified pattern, then you need to add that to the pattern space before being able to do so.
The following might work for you:
sed '/MACRO/{N;s/\(CLASS \)BLOCK/\1CORE/;}' filename
Quoting from the documentation:
`N'
Add a newline to the pattern space, then append the next line of
input to the pattern space. If there is no more input then sed
exits without processing any more commands.
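To see that N limits the change to the line right after MACRO, the command can be tried on input that also has a CLASS BLOCK line with no MACRO before it. A small sketch; the wrapper name fix_class is invented for the example:

```shell
# fix_class: on every line matching MACRO, pull in the following line
# with N and rewrite "CLASS BLOCK" to "CLASS CORE" in the joined text.
fix_class() {
    sed '/MACRO/{N;s/\(CLASS \)BLOCK/\1CORE/;}'
}

# The first CLASS BLOCK line has no MACRO before it and should be left alone.
printf 'CLASS BLOCK ;\nMACRO ABCD\n  CLASS BLOCK ;\n  SYMMETRY X Y ;\n' | fix_class
```

Only the CLASS line directly below MACRO ABCD changes. Note that the line swallowed by N is never tested against /MACRO/ itself, so two consecutive MACRO lines would need a different approach.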
If you want to make use of address range as in your attempt, then you need:
sed '/MACRO/,/CLASS BLOCK/{s/\(CLASS\) BLOCK/\1 CORE/}' filename
I'm not sure why you need a backreference for substituting the macro name.
You could try this awk command also,
awk '{print}/MACRO/ {getline; sub (/BLOCK/,"CORE");{print}}' file
It prints every line as is, and on seeing the word MACRO on a line it performs the replacement on the line that follows.
Since getline has so many pitfalls, I try not to use it, so:
awk 'prev {sub(/BLOCK/,"CORE")} {prev = /MACRO/} 1' file
MACRO ABCD
CLASS CORE ;
SYMMETRY X Y ;
This could do it
#!awk -f
BEGIN {
RS = ";"
}
/MACRO/ {
sub("BLOCK", "CORE")
}
{
printf "%s", (s++ ? ";" $0 : $0)
}
"line" ends with ;
sub BLOCK for CORE in "lines" with MACRO
print ; followed by "line" unless first line

Find the line numbers where a specific word appears with “sed” in the Tcl shell

I need to search for a specific word in a file starting from specific line and return the line numbers only for the matched lines.
Let's say I want to search a file called myfile for the word my_word and then store the returned line numbers.
In a shell script, the command:
sed -n '10,$ { /$my_word /= }' $myfile
works fine, but how do I write that command in the Tcl shell?
% exec sed -n '10,$ { /$my_word/= }' $file
extra characters after close-brace.
I want to add that the following command works fine on tcl shell but it starts from the beginning of the file
% exec sed -n "/$my_word/=" $file
447431
447445
448434
448696
448711
448759
450979
451006
451119
451209
451245
452936
454408
I have solved the problem as follows
set lineno 10
if { ! [catch {exec sed -n "/$new_token/=" $file} lineFound] && [string length $lineFound] > 0 } {
set lineNumbers [split $lineFound "\n"]
foreach num $lineNumbers {
if {[expr {$num >= $lineno}] } {
lappend col $num
}
}
}
I still can't find a single-line command that solves the problem.
Any suggestions?
There's one thing I don't understand: is the text you are looking for stored in the variable called my_word, or is it the literal value my_word?
In your line
% exec sed -n '10,$ { /$my_word/= }' $file
I'd say it's the first case. So you have before it something like
% set my_word wordtosearch
% set file filetosearchin
Your mistake is to use the single quote character ' to enclose the sed expression. That character is an enclosing operator in sh, but has no meaning in Tcl.
You use it in sh to group many words in a single argument that is passed to sed, so you have to do the same, but using Tcl syntax:
% set my_word wordtosearch
% set file filetosearchin
% exec sed -n "10,$ { /$my_word/= }" $file
Here, you use the "..." to group.
You don't escape the $ in $my_word because you want $my_word to be substituted with the string wordtosearch.
I hope this helps.
After some trial and error, I came up with:
set output [exec sed -n "10,\$ \{ /$myword/= \}" $myfile]
# Do something with the output
puts $output
The key is to escape characters that are special to Tcl, such as the dollar sign and curly braces.
Update
Per Donal Fellows, we do not need to escape the dollar sign:
set output [exec sed -n "10,$ \{ /$myword/= \}" $myfile]
I have tried the new revision and found it works. Thank you, Donal.
Update 2
I finally gained access to a Windows 7 machine, installed Cygwin (which includes sed and tclsh). I tried out the above script and it works just fine. I don't know what your problem is. Interestingly, the same script failed on my Mac OS X system with the following error:
sed: 1: "10,$ { /ipsum/= }": extra characters at the end of = command
while executing
"exec sed -n "10,$ \{ /$myword/= \}" $myfile"
invoked from within
"set output [exec sed -n "10,$ \{ /$myword/= \}" $myfile]"
(file "sed.tcl" line 6)
I guess there is a difference between Linux and BSD systems.
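The error text suggests that the sed on Mac OS X wants the = command terminated before the closing brace, which GNU sed does not insist on. One way to sidestep the difference, as far as I know, is to give each part of the script its own -e, since sed joins -e fragments with newlines. A shell sketch (the function name line_numbers_from and the sample file are invented); in Tcl the same three -e arguments could be passed to exec:

```shell
# line_numbers_from START WORD FILE: print the numbers of the lines from
# START to the end of FILE that contain WORD.
line_numbers_from() {
    # Equivalent to the script:  START,${ <newline> /WORD/= <newline> }
    sed -n -e "$1,\${" -e "/$2/=" -e '}' "$3"
}

f=$(mktemp)
printf 'ipsum\n2\n3\n4\n5\n6\n7\n8\n9\nipsum ten\n11\nlorem ipsum\n' > "$f"
line_numbers_from 10 ipsum "$f"
rm "$f"
```

The match on line 1 is skipped because it lies before the start line. WORD is pasted into the sed script as is, so a slash in it would break the address.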
Update 3
I have tried the same script under Linux/Tcl 8.4 and it works. That might mean Tcl 8.4 has nothing to do with it. Here is something else that might help: Tcl comes with a package called fileutil, which is part of the tcllib. The fileutil package contains a useful tool for this case: fileutil::grep. Here is a sample on how to use it in your case:
package require fileutil
proc grep_demo {myword myfile} {
foreach line [fileutil::grep $myword $myfile] {
# Each line is in the format:
# filename:linenumber:text
set lineNumber [lindex [split $line :] 1]
if {$lineNumber >= 10} { puts $lineNumber}
}
}
grep_demo $myword $myfile
Here is how to do it with awk
awk 'NR>=10 && $0~f {print NR}' f="$my_word" "$myfile"
This searches from line 10 onward for every line that contains the word stored in the shell variable my_word, in the file whose name is stored in the variable myfile, and prints the matching line numbers.

Remove newline depending on the format of the next line

I have a special file with this kind of format :
title1
_1 texthere
title2
_2 texthere
I would like every line starting with "_" to be moved up as a second column of the line before it.
I tried to do that using sed with this command :
sed 's/_\n/ /g' filename
but it is not giving me what I want to do (doing nothing basically)
Can anyone point me to the right way of doing it?
Thanks
Try following solution:
In sed, the loop is built by creating a label (:a); while the current line is not the last one ($!), append the next line (N) and branch back to label a:
:a
$! {
N
b a
}
After this we have the whole file in memory, so do a global substitution for each _ preceded by a newline:
s/\n_/ _/g
p
All together is:
sed -ne ':a ; $! { N ; ba }; s/\n_/ _/g ; p' infile
That yields:
title1 _1 texthere
title2 _2 texthere
If your whole file is like your sample (pairs of lines), then the simplest answer is
paste - - < file
Otherwise
awk '
NR > 1 && /^_/ {printf "%s", OFS}
NR > 1 && !/^_/ {print ""}
{printf "%s", $0}
END {print ""}
' file
This might work for you (GNU sed):
sed ':a;N;s/\n_/ /;ta;P;D' file
This avoids slurping the file into memory.
or:
sed -e ':a' -e 'N' -e 's/\n_/ /' -e 'ta' -e 'P' -e 'D' file
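For reference, here is the same loop wrapped in a function and run on the sample data. One deliberate change in this sketch, which is my assumption and not part of the answer above: N is written as $!N, because some sed implementations discard the pattern space when N hits end of input, while GNU sed prints it. The function name join_underscore_lines is made up.

```shell
# join_underscore_lines: copy stdin to stdout, gluing every line that
# starts with "_" onto the end of the previous line.
join_underscore_lines() {
    # :a          loop label
    # $!N         append the next line, unless this is already the last one
    # s/\n_/ /    if the appended line starts with _, join it (dropping the _)
    # ta          after a successful join, try to pull in another line
    # P;D         print and delete the first line, keep the remainder
    sed -e ':a' -e '$!N' -e 's/\n_/ /' -e 'ta' -e 'P' -e 'D'
}

printf 'title1\n_1 texthere\ntitle2\n_2 texthere\n' | join_underscore_lines
```

Like the command above, this removes the underscore; use "s/\n_/ _/" as the substitution to keep it.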
A Perl approach:
perl -00pe 's/\n_/ /g' file
Here, the -00 causes perl to read the file in paragraph mode where a "line" is defined by two consecutive newlines. In your example, it will read the entire file into memory and therefore, a simple global substitution of \n_ with a space will work.
That is not very efficient for very large files though. If your data is too large to fit in memory, use this:
perl -ne 'chomp;
s/^_// ? print "$l " : print "$l\n" if $. > 1;
$l=$_;
END{print "$l\n"}' file
Here, the file is read line by line (-n) and the trailing newline removed from all lines (chomp). At the end of each iteration, the current line is saved as $l ($l=$_). At each line, if the substitution is successful and a _ was removed from the beginning of the line (s/^_//), then the previous line is printed with a space in place of a newline print "$l ". If the substitution failed, the previous line is printed with a newline. The END{} block just prints the final line of the file.