Is there a wildcard to patch any sequence of characters between two characters? - diff

I want to modify a patch that changes the checksum of a file
-SRCREV = "a43570ced29f21cfbd5eff12b843f9214271aaf3"
+SRCREV = "somechecksum"
to something else. Meanwhile, the checksum that the patch spells out explicitly has changed in the actual file, so the hunk fails and I would have to generate a new patch, which can get a bit cumbersome.
So is there a way to replace anything in between the quotes or the whole line?

If you tell diff to use ed-style output, the original text does not appear in the patch. However, there is no protection against corruption if you try to patch a file where the line to be changed has moved:
mkdir a b c t
cat <<'EOD' >a/test
1. a
2. b
3. c
4. d
5. e
EOD
cat <<'EOD' >b/test
1. a
2. b
3 <<< changed!
4. d
5. e
EOD
cat <<'EOD' >c/test
1. a
2. b
4. d
3. c
5. e
EOD
diff -u a/test b/test >patch1
cp a/test t/test
patch -p1 t/test <patch1
cat t/test # identical to b/test
cp c/test t/test
patch -p1 t/test <patch1
cat t/test # correct line is changed
diff -e a/test b/test >patch2
cp a/test t/test
patch -p1 t/test <patch2
cat t/test # identical to b/test
cp c/test t/test
patch -p1 t/test <patch2
cat t/test # wrong line is changed - result is broken
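For comparison, the ed-style patch generated above should contain only the line number and the replacement text, something like the following, which is why patch has no context to verify:
$ cat patch2
3c
3 <<< changed!
.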
If you don't have to use diff/patch, then a simple sed script might work. Something like:
sed -i -e 's/^\(SRCREV = \)"[^"]*"/\1"somechecksum"/' fileToPatch
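If the value keeps changing, the same sed call can be parameterised in a small wrapper script; the script name and arguments below are just placeholders:
#!/bin/sh
# hypothetical helper: ./set-srcrev.sh <new-revision> <file-to-edit>
new_rev=$1
file=$2
sed -i -e "s/^\(SRCREV = \)\"[^\"]*\"/\1\"${new_rev}\"/" "$file"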

Related

How to list files and substitute pattern in makefile

I have scripts like this
#!/bin/sh
SHELL_CMD=busybox
and try to substitute the pattern so the scripts use bash instead, via a Makefile:
#!/bin/bash
#SHELL_CMD=busybox
The following is the rule in the Makefile:
release:
    @rm -rf my_temp/
    @mkdir my_temp/
    @cp dir1/burn_*.sh dir1/dump_*.sh my_temp/
    @cd my_temp/; \
    for f in $(shell ls); do \
        sed 's:#!/bin/sh\nSHELL_CMD=busybox:#!/bin/bash\n#SHELL_CMD=busybox:1' $${f} > temp; mv temp $${f}; \
    done; \
    cd ..;
    @cd my_temp/; tar -jcv -f bash.tar.bz2 *.sh; cd ..;
My questions are:
1. In the for loop, it didn't get the correct script names. How can I fix it?
2. Is there a better pattern for this sed substitution?
You are much better off doing the substitution without trying to match the newline in the source string. Unless you are ready to do some complex sed-fu (see "How can I replace a newline (\n) using sed?"), you can just apply the substitution to each of the lines separately.
You can do this to apply the action on both the 1st and 2nd lines. Also, the $(shell ls) part is not needed; you can just use a shell glob expression to get the files ending in .sh:
    @for f in *.sh; do \
        sed -i -e '1 s:#!/bin/sh:#!/bin/bash:1' -e '2 s:^:#:1' $${f} ;\
    done
If you don't want the -i in-place substitution, use the tmp file approach as you had originally shown.
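For reference, a minimal sketch of that temp-file variant inside the same recipe (the .tmp suffix is only an example):
    @for f in *.sh; do \
        sed -e '1 s:#!/bin/sh:#!/bin/bash:' -e '2 s:^:#:' $${f} > $${f}.tmp && mv $${f}.tmp $${f} ;\
    done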

Replace first line in directory files

I would like to execute this make command to first replace the first line of all CSV files inside the directory and then replace the # with "," on the other lines.
The second command is working fine and does what it is supposed to do, but the first one only replaces the first line of the first file.
Could anyone give me a hand with that?
csv:
    $(DOCKER_RUN) npm run csv-generator
    make format-csv
format-csv:
    @sed -i '' '1 s/^.*$$/"bar","repository"/g' $(CURDIR)/foo/npm/*.csv
    @sed -i '' 's/\(.*\)#/\1","/g' $(CURDIR)/foo/npm/*.csv
The reason that the first sed command "fails" is that sed doesn't reset the line counter between input files (on your system, and neither on my Mac OS X machine, see comments):
$ cat test1
a
b
g
$ cat test2
aa
bb
cc
$ sed -n '=' test1 test2 # the '=' sed command outputs line numbers
1
2
3
4
5
6
This is why the first sed command isn't doing what you want it to do: it only affects the first line of the first file.
The solution is to loop over the files and call sed for each of them (untested in Makefile):
    @for f in $(CURDIR)/foo/npm/*.csv; do \
        sed -i '' '1 s/^.*$$/"bar","repository"/g' $$f; \
    done
Using find and xargs will also work, just make sure that find isn't picking up files further down in the folders.
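For example, where find supports -maxdepth (a sketch only, not tested in the Makefile), the search depth can be limited explicitly:
    @find $(CURDIR)/foo/npm -maxdepth 1 -name '*.csv' -type f | \
        xargs -L 1 sed -i '' '1 s/^.*$$/"bar","repository"/g'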
EDIT: In light of the comments on this answer, I would recommend avoiding the use of sed -i on multiple files altogether, and converting both statements into for loops (in this case, they may be collapsed into one loop with two statements):
    @for f in $(CURDIR)/foo/npm/*.csv; do \
        sed -i '' '1 s/^.*$$/"bar","repository"/g' $$f; \
        sed -i '' 's/\(.*\)#/\1","/g' $$f; \
    done
In my experience, using for loops in Makefiles seems to be far more common than using find and xargs. This is probably due to incompatibilities between find and xargs versions across Unices. It also makes the Makefile a lot easier to read if one uses explicit loops.
I managed to solve it with:
    @find $(CURDIR)/foo/npm -name "*.csv" -type f | xargs -L 1 sed -i '' '1 s/^.*$$/"bar"/g'

Sed replace pattern with line number

I need to replace the pattern ### with the current line number.
I managed to print it on the next line with both awk and sed.
sed -n "/###/{p;=;}" file prints the line number on the next line; without the p;, it replaces the whole line with the number.
sed -e "s/###/{=;}/g" file used to make sense in my head, since = returns the line number of the matched pattern, but it just gives me the literal text {=;}.
What am I missing? I know this is a silly question; I couldn't find the answer in the sed manual, and it's not quite clear to me.
If possible, point out what I was missing and how to make it work. Thank you.
Simple awk oneliner:
awk '{gsub("###",NR,$0);print}'
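For example, with some made-up input:
$ printf 'foo\nbar ### baz\n###\n' | awk '{gsub("###",NR,$0);print}'
foo
bar 2 baz
3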
Given the limitations of the = command, I think it's easier to divide the job in two (actually, three) parts. With GNU sed you can do:
$ sed -n '/###/=' test > lineno
and then something like
$ sed -e '/###/R lineno' test | sed '/###/{:r;N;s/###\([^\n]*\n\)\([^\n]*\)/\2\1/;tr;:c;s/\n\n/\n/;tc}'
I'm afraid there's no simple way with sed because, like the = command, the r command and the GNU extension R don't read files into the pattern space, but rather append the lines directly to the output, so their contents cannot be modified in any way. Hence the pipe to another sed command.
If the contents of test are
fooo
bar ### aa
test
zz ### bar
the above will produce
fooo
bar 2 aa
test
zz 4 bar
This might work for you (GNU sed):
sed = file | sed 'N;:a;s/\(\(.*\)\n.*\)###/\1\2/;ta;s/.*\n//'
An alternative using cat:
cat -n file | sed -E ':a;s/^(\s*(\S*)\t.*)###/\1\2/;ta;s/.*\t//'
As noted by Lev Levitsky, this isn't possible with one invocation of sed, because the line number is sent directly to standard output.
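For instance, the = command just interleaves the numbers with the normal output (a small made-up example):
$ printf 'a\n###\nb\n' | sed '/###/='
a
2
###
b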
You could have sed write a sed-script for you, and do the replacement in two passes:
infile
a
b
c
d
e
###
###
###
a
b
###
c
d
e
###
Find the lines that contain the pattern:
sed -n '/###/=' infile
Output:
6
7
8
11
15
Pipe that into a sed-script writing a new sed-script:
sed 's:.*:&s/###/&/:'
Output:
6s/###/6/
7s/###/7/
8s/###/8/
11s/###/11/
15s/###/15/
Execute:
sed -n '/###/=' infile | sed 's:.*:&s/###/&/:' | sed -f - infile
Output:
a
b
c
d
e
6
7
8
a
b
11
c
d
e
15
Is this ok?
kent$ echo "a
b
c
d
e"|awk '/d/{$0=$0" "NR}1'
a
b
c
d 4
e
If the pattern "d" is matched, the line number is appended at the end of the line.
edit
Oh, you want to replace the pattern, not append the line number... take a look at the new command:
kent$ echo "a
b
c
d
e"|awk '/d/{gsub(/d/,NR)}1'
a
b
c
4
e
and the line could be written like this as well: awk '1+gsub(/d/,NR)' file
A one-liner to modify FILE in place, replacing each LINE with the corresponding line number:
seq 1 `wc -l FILE | awk '{print $1}'` | xargs -IX sed -i 'X s/LINE/X/' FILE
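For instance, on a small made-up file (GNU sed assumed, since -i is used without a suffix):
$ printf 'x\nLINE\ny\nLINE\n' > FILE
$ seq 1 `wc -l FILE | awk '{print $1}'` | xargs -IX sed -i 'X s/LINE/X/' FILE
$ cat FILE
x
2
y
4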
Following on from https://stackoverflow.com/a/53519367/29924
If you try this on OS X, the version of sed is different and you need to do:
seq 1 `wc -l FILE | awk '{print $1}'` | xargs --verbose -IX sed -i bak "X s/__line__/X/" FILE
see https://markhneedham.com/blog/2011/01/14/sed-sed-1-invalid-command-code-r-on-mac-os-x/

split a large text (xyz) database into x equal parts

I want to split a large text database (~10 million lines). I can use a command like
$ sed -i -e '4 s/(dB)//' -e '4 s/Best\ unit/Best_Unit/' -e '1,3 d' '/cygdrive/c/ Radio Mobile/Output/TRC_TestProcess/trc_longlands.txt'
$ split -l 1000000 /cygdrive/P/2012/Job_044_DM_Radio_Propogation/Working/FinalPropogation/TRC_Longlands/trc_longlands.txt 1
The first line is to clean the database and the next is to split it,
but then the output files do not have the field names. How can I incorporate the field names into each dataset and pipe a list which has the original file, the new file name and the line numbers (from the original file) in it? This is so that it can be used in the ArcGIS model to re-join the final simplified polygon datasets.
ALTERNATIVELY, AND MORE USEFULLY: as this needs to go into an ArcGIS model, a Python-based solution is best. More details are in https://gis.stackexchange.com/questions/21420/large-point-to-polygon-by-buffer-join-buffer-dissolve-issues#comment29062_21420 and Remove specific lines from a large text file in python
SO, GOING WITH A CYGWIN-based Python solution as per the answer by icyrock.com,
we have process_text.sh:
cd /cygdrive/P/2012/Job_044_DM_Radio_Propogation/Working/FinalPropogation/TRC_Longlands
mkdir processing
cp trc_longlands.txt processing/trc_longlands.txt
cd txt_processing
sed -i -e '4 s/(dB)//' -e '4 s/Best\ unit/Best_Unit/' -e '1,3 d' 'trc_longlands.txt'
split -l 1000000 trc_longlands.txt trc_longlands_
cat > a
h
1
2
3
4
5
6
7
8
9
^D
split -l 3
split -l 3 a 1
mv 1aa 21aa
for i in 1*; do head -n1 21aa|cat - $i > 2$i; done
for i in 21*; do echo ---- $i; cat $i; done
how can "TRC_Longlands" and the path be replaced with the input filename -in python we have %path%/%name for this.
in the last line is "do echo" necessary?
and this is called by python using
import os
os.system("process_text.bat")
where process_text.bat is basically
bash process_text.sh
I get the following error when running it from DOS:
Microsoft Windows [Version 6.1.7601] Copyright (c) 2009 Microsoft
Corporation. All rights reserved.
C:\Users\georgec>bash
P:\2012\Job_044_DM_Radio_Propogation\Working\FinalPropogat
ion\TRC_Longlands\process_text.sh 'bash' is not recognized as an
internal or external command, operable program or batch file.
Also, when I run the bash command from Cygwin, I get:
georgec#ATGIS25
/cygdrive/P/2012/Job_044_DM_Radio_Propogation/Working/FinalPropogation/TRC_Longlands
$ bash process_text.sh : No such file or directory:
/cygdrive/P/2012/Job_044_DM_Radio_Propogation/Working/FinalPropogation/TRC_Longlands
cp: cannot create regular file `processing/trc_longlands.txt\r': No
such file or directory : No such file or directory: txt_processing :
No such file or directoryds.txt
but the files are created in the root directory.
Why is there a "." after the directory name? How can they be given a .txt extension?
If you want to just prepend the first line of the original file to all but the first of the splits, you can do something like:
$ cat > a
h
1
2
3
4
5
6
7
^D
$ split -l 3
$ split -l 3 a 1
$ ls
1aa 1ab 1ac a
$ mv 1aa 21aa
$ for i in 1*; do head -n1 21aa|cat - $i > 2$i; done
$ for i in 21*; do echo ---- $i; cat $i; done
---- 21aa
h
1
2
---- 21ab
h
3
4
5
---- 21ac
h
6
7
Obviously, the first file will have one line fewer than the middle parts, and the last part might be shorter too, but if that's not a problem, this should work just fine. Of course, if your header has more lines, just change head -n1 to head -nX, X being the number of header lines.
Hope this helps.
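Applied to the split files from the question, the same idea would look roughly like this (a sketch only; the trc_longlands_ part names are assumed from the split prefix used above):
head -n1 trc_longlands_aa > header.tmp
for f in trc_longlands_*; do
    [ "$f" = trc_longlands_aa ] && continue   # the first part already starts with the header
    cat header.tmp "$f" > "$f.hdr" && mv "$f.hdr" "$f"
done
rm header.tmp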

diff and patch using wrong line ending when creating new files or line end changed

I'm trying to create a patch using diff, but I can't get the patch to use the line end characters used in the files when creating a new file or to change the line ending when the file changes it. Basically, I'm doing:
cp -r dir1 dir3
diff -ruN dir1 dir2 > dir3\patch.txt
cd dir3
patch -p1 < patch.txt
All the changes between dir1 and dir2 properly apply, except the end of line character for new files is defaulting to CR+LF, even where the file in dir2 uses LF as an end of line marker. Also, any files where the difference between them is just a line end change are not patched in any way -- diff doesn't seem to see any change.
So running diff -rq dir2 dir3 gives a bunch of "Files aaa and bbb differ" messages, but diff -rwq dir2 dir3 works fine.
I'm using diff - GNU diffutils version 2.7 and patch 2.5 from UnxUtils on Windows XP.
Is there any way to make new and changed files included in the patch keep the line ending from the original file?
This works:
cp -r dir1 dir3
diff --binary -ruN dir1 dir2 > dir3\patch.txt
cd dir3
patch --no-backup-if-mismatch --binary -u -p1 < patch.txt
Not using the --binary flag means that the file is parsed line by line, ignoring EOL. For some reason, it won't always patch cleanly (gives a Hunk #1 succeeded at 1 with fuzz 1. message) so I had to include --no-backup-if-mismatch to prevent it making .orig files. The -u seems to be optional, since patch will figure the patch type out on it's own.