Weird option of the wc command - command-line

The wc command has an option --files0-from=F. According to the manual, it reads input from the files specified by NUL-terminated names in file F; if F is -, then names are read from standard input. Why NUL-terminated names? Isn't it more convenient to just separate the names with a newline or a space?

It's more convenient if you have filenames with spaces (or newlines, or tabs) in them.
This is often used with find -print0, which outputs its list of files with \0 as a separator instead of newlines.
$ find . -type f -print0 | wc -c --files0-from=-
15 ./c d
12 ./a b
27 total
xargs has a -0 option for similar reasons.
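As a quick illustration of why newline separation breaks (the filenames below are made up for the demo): a name containing an embedded newline splits into two bogus entries, while NUL-separated output survives intact.
$ touch 'a b' $'c\nd'                       # one name with a space, one with a newline
$ find . -type f | wc -l                    # prints 3: the embedded newline splits one name in two
$ find . -type f -print0 | xargs -0 wc -c   # NUL separation handles both names correctly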

Related

sed remove a special control character from many files

Can someone please give me the exact syntax for removing ^# from thousands of HTML files in nested directories using sed? The ^# is a control character inserted by a Windows program that generated these files. I cannot seem to get the syntax right.
I tried this (but it did not work) using a file since I could not enter the control-character at the command prompt:
find ./ *.html -type f -exec sed -i 's/^#//g' {} ;
POSIX sed doesn't handle NUL bytes in its input, but GNU sed can, using a hex escape:
find . -name '*.html' -type f -exec sed -i 's/\x0//g' '{}' +
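To check which files still contain NUL bytes (before or after the edit), something like this should work, assuming GNU grep with PCRE support (-P):
find . -name '*.html' -type f -exec grep -lP '\x00' {} +   # list files that still contain a NUL byte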

How to double the spacing of certain number of lines in a file using sed

I need to double the spacing for a certain number of lines in all the files in a folder.
I need to know the command for this.
For double-spacing one whole file I used the command:
sed '/^$/d' fileName | sed G
I need to know how to do this for only a specific number of lines.
I want to make the change in all the files in the folder structure.
This might work for you (GNU find, sed and parallel):
find -type f | parallel --dry-run -q sed -Ezi 's/\n+/\n/g;s/\n/&&/10g;s/\n+/\n/21g'
This will double-space lines 10-20 in all files in the current directory and below (but only after the commands have been checked by the user via the dry run and --dry-run has been removed).
A less efficient alternative:
find -type f | parallel --dry-run 'sed -i "/\S/!d" {} ; sed -i "10,20G" {}'
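If parallel isn't available, a minimal sketch of the same two-pass idea with plain find and GNU sed (the \S extension and the 10,20 range are kept from above):
find . -type f -exec sh -c 'for f; do
  sed -i "/\S/!d" "$f" &&   # first pass: delete blank lines
  sed -i "10,20G" "$f"      # second pass: double-space lines 10-20
done' sh {} +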

How to rename all the files (without for loop) in a single line command?

I want to rename all the files in my home directory (for example abc to abc_bkp) without using any loops, and it should be a single-line command in Unix (bash script).
If the directory contains nothing but files, this should do it:
ls | xargs -I {} mv {} {}_bkp
If it contains subdirectories, links, and other things you don't want to rename, you must filter the output of ls. Here is a crude way to do it; maybe someone can suggest a more elegant approach:
ls -l | grep ^- | cut -d' ' -f 13 | xargs -I {} mv {} {}_bkp
If you don't want to use loops, then I believe the best approach is the find command. Try the following command as a dry run first; once you are satisfied with the results, remove the echo from it to perform the actual rename.
find -type f -or -type d | xargs -I % echo mv % %_bkp
-I: from the xargs man page:
-I replace-str
    Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
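A hedged variant that avoids parsing ls and also handles nested directories safely, assuming GNU find (-depth renames children before their parent directories, so pending paths stay valid; -mindepth 1 skips the starting directory itself):
find . -depth -mindepth 1 \( -type f -o -type d \) -execdir sh -c 'mv -- "$1" "${1}_bkp"' sh {} \;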

finding most recent file version from list of file path names with jumbled file names

I recently lost a bunch of files from Eclipse in an accidental copy/replace dilemma. I was able to recover most of them, but in the Eclipse metadata folder I found a history of files, some of which are the ones I need. The path for the history is:
($WORKSPACE/.metadata/.plugins/org.eclipse.core.resources/.history).
Inside there are a bunch of folders like 3e, 2f, 1a, ff, etc., each with a couple of files named like "2054f7f9a0d30012175be7013ca49f5b". I was able to do a recursive grep with a keyword I know would be in the file and get back a list of file names (grep -R -l 'KEYWORD'), but now I can't figure out how to sort them by most recently modified.
Any help would be great, thanks!
You can try:
find $WORK.../.history -type f -printf '%T@\t%p\n' | sort -nr | cut -f2- | xargs grep 'your_pattern'
Decomposed:
the find finds all plain files and prints their modification time and path
the sort sorts them numerically, in reverse, so the highest number (the most recently modified) comes first
the cut removes the time from each line
xargs runs its command for each file it gets on its input;
in this case it runs the grep command, so
the first file grep searches is the most recently modified one
The above doesn't work when the filenames contain spaces, but hopefully this is not your case... The -printf works only with GNU find.
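If spaces (or newlines) in names could matter after all, here is a NUL-separated sketch of the same pipeline, assuming GNU find, sort, cut and xargs recent enough for the -z/-0 options:
find $WORK.../.history -type f -printf '%T@\t%p\0' | sort -znr | cut -zf2- | xargs -0 grep 'your_pattern'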
For repeated searches, you can split the command into two parts:
find $WORK.../.history -type f -printf '%T@\t%p\n' | sort -nr | cut -f2- > /somewhere/FILENAMES_SORTED_BY_MODIF_TIME
so in the first step you save the list of filenames, sorted by modification time, somewhere, and afterwards you can repeatedly grep their contents with:
< /somewhere/FILENAMES_SORTED_BY_MODIF_TIME xargs grep 'your_pattern'
The above command is usually written as
xargs grep 'your_pattern' < /somewhere/FILENAMES_SORTED_BY_MODIF_TIME
but bash is happy with the redirection at the start, and written that way it is easier to change the pattern for the grep, since the pattern then sits at the end of the line.
If you want to check the list of filenames together with their modification times, you can break the above commands up as:
find $WORK.../.history -type f -printf "%T@\t%Tc\t%p\n" | sort -nr > /somewhere/FILENAMES_WITH_DATE
Check the list (each line now contains a human-readable date too) and then use:
< /somewhere/FILENAMES_WITH_DATE cut -f3- | xargs grep 'your_pattern'
Note that you now need -f3- and not -f2- as in the first example, because of the extra date field.

unix find and replace text in dir and subdirs

I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to replace 'my-silly-home-page-name' with 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also, I am trying this with perl, using the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories: "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N".
Thanks,
jml
find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass as many names at a time as possible to Perl as arguments. Both work with any file name, including those that contain newlines.
If you are on Windows and using a native Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also make a backup of the original. Change -i to -i.bak. You can then delete the backups using
find . -type f -name '*.bak' -delete
This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it gathers all the files recursively from the given directory (. in the example) with find and, through xargs, runs sed with the same substitution command as the perl command in the question.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).
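Whichever tool you pick, a quick sanity check afterwards is to rerun the grep from the question and confirm it finds nothing:
grep -r "my-silly-home-page-name.html" . || echo "no occurrences left"
(grep exits non-zero when it finds no matches, so the echo only fires once the replacement is complete.)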
There are probably better ways to do this, but you can use:
find . -name oldname.html | perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old, $_ } <>'
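For what it's worth, the same rename can be sketched without Perl, assuming GNU find's -execdir (each match is renamed within its own directory):
find . -name oldname.html -execdir mv {} newname.html \;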
FYI, grep searches for a pattern; find searches for files.