sed utility -d functionality in filename listing

sed utility -d functionality in filename listing - sed

I want the explanation for the below command as I am not able to understand functionality of -d in sed
Does it list the file based on the modified time of the file or does it list the files if the count of files is more than 90 in the specified path.
Filename format is same for all the file
Filename: 20171010220002.txt
FILES_TO_RETAIN=90
ls -1t /apps/feroz/*.txt|sed "1,${FILES_TO_RETAIN:-90}d"
I know function ls and sed. ls functionality is explained below
-1 list only the filename
-t list the file based on the modified time, new file will on the top
As per man page for sed the explanation for -d is provided below.
-d Delete pattern space. Start next cycle.

There is no option -d in sed. The first parameter to sed is taken as the sed-script.
Here d is the command in the sed-script and is one of the Commands which accept address ranges.
1,90 is the address range for the command d.
sed "1,${FILES_TO_RETAIN:-90}d" will delete line 1 to 90 from the input, which in your case is the file-list
Conclusion: The resulting output is the list of files sorted by modified time (newest on top), excluding the first 90 files (newest ones).
side-note: using tail +<n> would do the same (but excluding n-1 lines).

Related

Remove space between a keyword and paranthesis

I have nearly 300 files in 60 folders .
As per the C++ coding guidelines, I need to replace below lines from *.cpp and *.cl files (wants to remove extra space between if and for statement) -
for (* .....)
with
for(* .....)
and also
if (* .....)
with
if(* .....)
Can any one suggest me the grep command to do search and replace for all files.
Edited:
I tried with below commands:
sed -i 's/for (/for(/g' *.cpp
But got error like below:
sed: can't read *.cpp: No such file or directory

I think you need sed command (stream editor, see man sed on your mashine). It is more suitable for file editing.
sed -i -E 's/(for|if)[ ]+(\(.*\))/\1\2/g'
Let me explain:
-i stands for inline, that means that all changes will be done and saved in the file
-E is needed to use extended regular expression inside with sed
s/(for|if)[ ]+(\(.*\))/\1\2/g
s stands for substitute
/ is a separator, which separates different parts of command. Between first / and second / there is pattern that you need to find (and then replace). After second / and third / there that we want to have after substitution.
g in the end stands for global, that means to make changes in the whole file.
How to apply to every file that you need?
This question is already exist, so in the end you need to run in directory where are your files stored following command
find ./ -type f -exec sed -i -E 's/(for|if)[ ]+(\(.*\))/\1\2/g' {} \;
I hope, this will help:)

I have created the file "brol.txt", with following content:
for (correct
for(wrong
if (correct
if(wrong
I have launched following grep command:
grep -E "for \(|if \(" brol.txt
With following result:
for (correct
if (correct
Explanation:
grep -E means extended grep (allows to search for expression1 OR expression2,
separated by a pipe character)
\( means the search for a round bracket. The backslash is an escape character.

Sed inside a while read loop

I have been reading a lot of questions and answers about using sed within a while loop. I think I have the command down correctly, but I seem to get no output once I put all of the pieces together. Can someone tell me what I am missing?
I have an input file with 700 variables, one on each line. I need to use each of these 700 variables within a sed command. I run the following command to verify variables are outputting correctly:
cat Input_File.txt | while read var; do echo $var; done
I then try to add in the sed command as follows:
cat Input_File.txt | while read var; do sed -n "/$var/,+10p" Multi-BLAST_5814.txt >> Multi_BLAST_Subset; done
This command leaves me without an error, but a blinking cursor as if this is an infinite loop. It should use each of the 700 variables, find the corresponding line in Multi_BLAST_5814.txt and output the search variable line and the 10 lines after the search term into a new file, appending each as it goes. I can execute the sed command alone with a manually set single value variable successfully and I can execute the while loop successfully using the input file. Anyone have a thought as to why this is not working?
User, that is exactly what I have done to this point.
I have a large text file (128 MB) with BLAST output. I need to search through this for a subset of results for 769 samples (Out of the 5814 samples that are in the file).
I have created a .txt file with those 769 sample names.
To test grep and sed, I manually assigned a variable with one of the 769 samples names I need to search and can get the results I need as follows:
$ Otu="S41_Folmer_Otu96;size=12;"
$ grep $Otu -A 10 Multi_BLAST_5814.txt
OR
$ sed -n "/$Otu/,+10p" Multi_BLAST_5814.txt
The OUTPUT is exactly what I want as follows:
Query= S41_Folmer_Otu96;size=12;
Length=101
Sequences producing significant alignments: Score(Bits) E Value
gi|58397553|gb|AY830431.1| Scopelocheirus schellenbergi clone... 180 1E-41
gi|306447543|gb|HQ018876.1| Liposcelis paeta isolate CZ cytoc... 174 6E-40
gi|306447533|gb|HQ018871.1| Liposcelis decolor isolate CQ cyt... 104 9E-19
gi|1043259532|gb|KX130860.1| Batocera rufomaculata isolate Br... 99 4E-17
gi|987210821|gb|KR141076.1| Psocoptera sp. BOLD:ACO1391 vouch... 81 1E-11
To Test to make sure the input file contains the correct variables I run the following:
$ Cat Input_File.txt
$ while read Otu; do echo $Otu; done <Input_File.txt
S41_Folmer_Otu96;size=12;
S78_Folmer_Otu15;size=538;
S73_Leray_Otu52;size=6;
S66_Leray_Otu93;size=6;
S10_Folmer_Otu10;size=1612;
... All 769 variables
Again, this is exactly what I expect and is correct.
But, When I do either of the following commands, nothing is printed to the screen (if I leave off the write file/append action) or to the file I need to create.
$ cat Input_File.txt | while read Otu; do grep "$Otu" -A 10 Multi_BLAST_5814.txt >> Multi_BLAST_Subset.txt; done
$ cat Input_File.txt | while read Otu; do sed -n "/$Otu/,+10p" Multi_BLAST_5814.txt >> Multi_BLAST_Subset.txt; done
Sed hangs and never closes, leaving me at a blinking cursor. Grep finishes but also gives no output. I am at a loss as to why this is not working. Everything works inidividually, so I may be left with manually searching all 769 samples copy/paste.

If you have access to GNU grep no need for a sed command, grep "$var" -A 10 will do the same thing and won't break if $var contains the delimiter used in your sed command.
From man grep :
-A NUM, --after-context=NUM
Print NUM lines of trailing context after matching lines.
Places a line containing a group separator (--) between
contiguous groups of matches. With the -o or --only-matching
option, this has no effect and a warning is given.

Not sure whether you have already attempted it but try breaking the problem into smaller chunks. Simple example below :
$ cat Input_File.txt
one
two
three
$
$ cat file.txt
This is line one
This is line two
This is line three
This is another four
This is another five
This is another six
This is another seven
$
$ cat Input_File.txt | while read var ; do echo $var ; sed -n "/$var/,+1p" file.txt ; done
one
This is line one
This is line two
two
This is line two
This is line three
three
This is line three
This is another four
$

Extracting the contents between two different strings using bash or perl

I have tried to scan through the other posts in stack overflow for this, but couldn't get my code work, hence I am posting a new question.
Below is the content of file temp.
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://schemas.xmlsoap.org/soap/envelope/<env:Body><dp:response xmlns:dp="http://www.datapower.com/schemas/management"><dp:timestamp>2015-01-
22T13:38:04Z</dp:timestamp><dp:file name="temporary://test.txt">XJzLXJlc3VsdHMtYWN0aW9uX18i</dp:file><dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:file></dp:response></env:Body></env:Envelope>
This file contains the base64 encoded contents of two files names test.txt and test1.txt. I want to extract the base64 encoded content of each file to seperate files test.txt and text1.txt respectively.
To achieve this, I have to remove the xml tags around the base64 contents. I am trying below commands to achieve this. However, it is not working as expected.
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g' > test.txt
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g' > test1.txt
Below command:
sed -n '/test.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test.txt">##g'|perl -p -e 's#</dp:file>##g'
produces output:
XJzLXJlc3VsdHMtYWN0aW9uX18i
<dp:file name="temporary://test1.txt">lc3VsdHMtYWN0aW9uX18i</dp:response> </env:Body></env:Envelope>`
Howeveer, in the output I am expecting only first line XJzLXJlc3VsdHMtYWN0aW9uX18i. Where I am commiting mistake?
When i run below command, I am getting expected output:
sed -n '/test1.txt"\>/,/\<\/dp:file\>/p' temp | perl -p -e 's#<dp:file name="temporary://test1.txt">##g'|perl -p -e 's#</dp:file></dp:response></env:Body></env:Envelope>##g'
It produces below string
lc3VsdHMtYWN0aW9uX18i
I can then easily route this to test1.txt file.
UPDATE
I have edited the question by updating the source file content. The source file doesn't contain any newline character. The current solution will not work in that case, I have tried it and failed. wc -l temp must output to 1.
OS: solaris 10
Shell: bash

sed -n 's_<dp:file name="\([^"]*\)">\([^<]*\).*_\1 -> \2_p' temp
I add \1 -> to show link from file name to content but for content only, just remove this part
posix version so on GNU sed use --posix
assuming that base64 encoded contents is on the same line as the tag around (and not spread on several lines, that need some modification in this case)
Thanks to JID for full explaination below
How it works
sed -n
The -n means no printing so unless explicitly told to print, then there will be no output from sed
's_
This is to substitute the following regex using _ to separate regex from the replacement.
<dp:file name=
Regular text
"\([^"]*\)"
The brackets are a capture group and must be escaped unless the -r option is used( -r is not available on posix). Everything inside the brackets is captured. [^"]* means 0 or more occurrences of any character that is not a quote. So really this just captures anything between the two quotes.
>\([^<]*\)<
Again uses the capture group this time to capture everything between the > and <
.*
Everything else on the line
_\1 -> \2
This is the replacement, so replace everything in the regex before with the first capture group then a -> and then the second capture group.
_p
Means print the line
Resources
http://unixhelp.ed.ac.uk/CGI/man-cgi?sed
http://www.grymoire.com/Unix/Sed.html

/usr/xpg4/bin/sed works well here.
/usr/bin/sed is not working as expected in case if the file contains just 1 line.
below command works for a file containing only single line.
/usr/xpg4/bin/sed -n 's_<env:Envelope\(.*\)<dp:file name="temporary://BackUpDir/backupmanifest.xml">\([^>]*\)</dp:file>\(.*\)_\2_p' securebackup.xml 2>/dev/null
Without 2>/dev/null this sed command outputs the warning sed: Missing newline at end of file.
This because of the below reason:
Solaris default sed ignores the last line not to break existing scripts because a line was required to be terminated by a new line in the original Unix implementation.
GNU sed has a more relaxed behavior and the POSIX implementation accept the fact but outputs a warning.

SED Delete lines and replace with new from file

Have been looking at SED documention but need a little pointer in the right direction
I have 200 files I want to modify in a batch.
Source is html file.
Need to create a new file for the changes.
Want to delete the first part of each file up to the first tag (This is 20 or so lines but can vary slightly).
Then insert the contents of a source file (the same for all files) into the new target file starting at line 1, for 30 or so lines. The number of lines to insert does not match the number that are deleted though.
Hope you can help.
Paul

This can certainly be done with sed(1), but I would probably use the vanilla editor ed(1).
$ cat > bigfix.sh
for i in "$#"; do
ed "$i" << \eof
1,/<tag>/-1d
0r otherfile.html
w
q
eof
done
$ sh bigfix.sh file*.html
This shell script takes arguments and runs ed(1) on each arg. It deletes lines starting from the first and ending on the line right before the one with <tag>. It then puts otherfile.html at the top and writes out the result.

For an individual file:
sed -e '1,/tag/{/tag/r insertfile' -e ';d}' inputfile > outputfile
For many files:
find . -name 'criterion*.ext' -type f -exec sh -c 'sed -e "1,/tag/{/tag/r insertfile" -e ';d}" "{}" > "{}.new"' \;
Edit:
Fixed the find command to use sh because of the redirection. Note the change in quoting from the previous version.

How do you get a list of files included in a diff?

I have a patch file containing the output from git diff. I want to get a summary of all the files that, according to the patch file, have been added or modified. What command can I use to achieve this?

patchutils includes a lsdiff utility.

grep '+++' mydiff.patch seems to do the trick.
I can also use git diff --names-only which is probably the better approach.

grep '+++' mydiff.patch|perl -pe 's/\+\+\+ //g'
Details:
git diff produces output in the format
+++ b/file
So if you're using grep as Nathan suggested
grep '+++' mydiff.patch
You'll have the list of affected files, prepended by '+++ ' (3 plus signs and a space).
I often need to further process files and find it convenient to have one filename per line without anything else. This can be achieved with the following command, where perl/regex removes these plus signs and the space.
grep '+++' mydiff.patch|perl -pe 's/\+\+\+ //g'
For patch files generated with diff -Naur, the mydiff.patch file contains entries with filename and date ( is indicating the tabulator whitespace character)
+++ b/file<tab>2013-07-03 13:58:45.000000000 +0200
To extract the filenames for this, use
grep '+++' mydiff.patch|perl -pe 's/\+\+\+ (.*)\t.*/\1/g'

A decent way to do this is to use the --stat flag (or the --summary flag, if you need only new / deleted / renamed files for some reason).
Example:
git apply --stat peer.diff | awk '{ print $1 }' | sed '$d'
1-js/03-code-quality/index.md
CONTR.md
LICENSE.md
README.md
chat-app.readme.md

When you parse patches generated by git format-patch or others containing additional information about number of lines edited, it's crucial to search for ^+++ (at the start of the line) rather than just +++.
For example:
grep '^+++' *.patch | sed -e 's#+++ [ab]/##'
will output paths without a/ or b/ at the begin.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

sed utility -d functionality in filename listing - sed

Related

Remove space between a keyword and paranthesis

Sed inside a while read loop

Extracting the contents between two different strings using bash or perl

SED Delete lines and replace with new from file

How do you get a list of files included in a diff?

Categories

Resources