Output of a command looks something like this
# ls -l abc.zip
lrwxrwxrwx 1 sri dba 122 Mar 27 23:37 /a/b/c/abc.zip -> /x/y/z/abc.zip
I need to extract/cut the whole path which comes after ->. I have used cut -f -d but its not working some times, I guess column number is changing. So I need sed equivalent of this.
What you are reading is the information about a symlink. Parsing ls is definitely a bad idea.
What about using readlink instead? It does read value of a symbolic link:
readlink abc.zip
Or with -f for the full path:
readlink -f abc.zip
See an example:
$ touch a
$ ln -s a b # we create a symbolic link "b" to the file "a"
$ ls -l b
lrwxrwxrwx 1 me me 1 Apr 6 09:54 b -> a
$ readlink -f "b" # we check the full "destination" of the symlink "b"
/home/me/a
$ readlink "b" # we check the "destination" of the symlink "b"
a
This might work for you (GNU sed):
sed -n 's/^l.*->\s*//p' file
Using 'cut' you can only specify single character delimiters.
For string delimiters an easy and clear option is awk. In your case try:
ls -ls abc.zip | awk -F '->' '{print $2}'
Related
I want to rename all the files in my home directory (example abc), in the format (abc_bkp) without using any loops and it should be a single line command in unix (bash script).
If the directory contains nothing but files, this should do it:
ls | xargs -I {} mv {} {}_bkp
If it contains subdirectories, links, and other things you don't want to rename, you must filter the output of ls. Here is a crude way to do it; maybe someone can suggest a more elegant approach:
ls -l | grep ^- | cut -d' ' -f 13 | xargs -I {} mv {} {}_bkp
If you don't want to use loops then I believe the BEST way could be find command, try following command as a DRY run first and once you are satisfy with results then you could remove echo from it to give a real shot.
find -type f -or -type d | xargs -I % echo mv % %_bkp
-I: From man xargs page:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not
terminate
input items; instead the separator is the newline character. Implies -x and -L 1.
sed -i is creating a backup of all files in subdirectories before editing in place (as expected) but it's not actually editing files in subdirectories.
$ mkdir -p a/b
$ echo "A" > a/a.txt
$ echo "B" > a/b/b.txt
Now I have two text files, one in a one in a subdirectory of a
$ sed -i.bac "1s/^/PREPENDED /" a/**/*.txt
Backups are created for both:
$ find a
a
a/a.txt
a/a.txt.bac
a/b
a/b/b.txt
a/b/b.txt.bac
Only a.txt is edited:
$ cat a/a.txt
PREPENDED A
$ cat a/b/b.txt
B
I'm using ZSH (so I have globstar support) and I'm on Mac.
Why is this happening and how can I fix it?
It's happening because your sed invocation only has a single line 1, which happens to be in a.txt. If you want it to do it for each file then you need to invoke sed multiple times.
for f in a/**/*.txt
do
sed ... "$f"
done
Since you are needing to descend through several levels of directories, a single invocation of sed alone is not sufficient. However, using find you can accomplish what you want in a single line. If you are not familiar with find ... -exec '{}' \; it is worth taking a few minutes with startpage.com and do a quick search. In your case, the following invocation works well:
find a -type f -name "*.txt" -exec sed -i.bac 's/^/PREPENDED /' '{}' \;
Here find searches directory a and all below for any file (-type f) matching *.txt, then for each file (indicated by '{}') -exec executes sed -i.bac 's/^/PREPENDED /' and lastly an escaped \; is given to indicate the end of the -exec command.
results:
$ ls -1 a
b
a.txt
a.txt.bac
$ ls -1 a/b
b.txt
b.txt.bac
$ cat a/a.txt
PREPENDED A
$ cat a/b/b.txt
PREPENDED B
As was correctly pointed out, with globstar set shopt -s globstar it is unnecessary to use find as the following invocation of sed is sufficient:
sed -i.bac 's/^/PREPENDED /' a/**/*.txt
I want TO extract list of names from other bigger file (input), having that name and some additional information associated with that name. My problem is with grep -f option, as it is not matching the exact entries in input file but some other entries that contain similar name.
I tried:
$ grep -f list.txt -A 1 input >output
Following are the format of files;
list.txt
TE_final_35005
TE_final_1040
Input file
>TE_final_10401
ACGTACGTACGTACGT
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
Required output:
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
output I am getting:
>TE_final_10401
ACGTACGTACGTACGT
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
Although TE_final_10401 is not in the list.txt
How I can use ^ in list?
Please help to match the exact value or suggest other ways to do this.
Add the whole word switch (-w):
grep -w -A1 -f list.txt infile
Output:
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
A couple of things, remove the blanks lines from the files first:
sed -i '/^\s*$/d' file list
Then -w is used to match whole words only and -A1 will print the next line after the match:
$ grep -w -A1 -f list file > new_file
$ cat new_file
>TE_final_35005
ACGTACGATCAGT
>TE_final_1040
ACGTACGTACGT
as others have mentioned, adding the -w flag is the cleanest and easiest approach based on your sample data. but since you explicitly asked how you could use ^ in list.txt, here's another option.
to add ^ and/or $ anchors to each line in list.txt:
$ cat list.txt
^>TE_final_35005[ ]*$
^>TE_final_1040[ ]*$
this searches for your patterns at the start of the line, preceded by a > character, and ignores any trailing spaces. then your previous command will work (assuming you either remove those blank lines or change your argument to -A 2).
if you'd like to add these anchors to the list file automatically (and delete any blank lines at the same time), use this awk construct:
awk '{if($0 != ""){print "^>"$0"[ ]*$"}}' list.txt >newlist.txt
or if you prefer sed inplace editing:
sed -i '/^[ ]*$/d;s/\(.*\)/^>\1[ ]*$/g' list.txt
The following command is correctly changing the contents of 2 files.
sed -i 's/abc/xyz/g' xaa1 xab1
But what I need to do is to change several such files dynamically and I do not know the file names. I want to write a command that will read all the files from current directory starting with xa* and sed should change the file contents.
I'm surprised nobody has mentioned the -exec argument to find, which is intended for this type of use-case, although it will start a process for each matching file name:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} \;
Alternatively, one could use xargs, which will invoke fewer processes:
find . -type f -name 'xa*' | xargs sed -i 's/asd/dsg/g'
Or more simply use the + exec variant instead of ; in find to allow find to provide more than one file per subprocess call:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} +
Better yet:
for i in xa*; do
sed -i 's/asd/dfg/g' $i
done
because nobody knows how many files are there, and it's easy to break command line limits.
Here's what happens when there are too many files:
# grep -c aaa *
-bash: /bin/grep: Argument list too long
# for i in *; do grep -c aaa $i; done
0
... (output skipped)
#
You could use grep and sed together. This allows you to search subdirectories recursively.
Linux: grep -r -l <old> * | xargs sed -i 's/<old>/<new>/g'
OS X: grep -r -l <old> * | xargs sed -i '' 's/<old>/<new>/g'
For grep:
-r recursively searches subdirectories
-l prints file names that contain matches
For sed:
-i extension (Note: An argument needs to be provided on OS X)
Those commands won't work in the default sed that comes with Mac OS X.
From man 1 sed:
-i extension
Edit files in-place, saving backups with the specified
extension. If a zero-length extension is given, no backup
will be saved. It is not recommended to give a zero-length
extension when in-place editing files, as you risk corruption
or partial content in situations where disk space is exhausted, etc.
Tried
sed -i '.bak' 's/old/new/g' logfile*
and
for i in logfile*; do sed -i '.bak' 's/old/new/g' $i; done
Both work fine.
#PaulR posted this as a comment, but people should view it as an answer (and this answer works best for my needs):
sed -i 's/abc/xyz/g' xa*
This will work for a moderate amount of files, probably on the order of tens, but probably not on the order of millions.
Another more versatile way is to use find:
sed -i 's/asd/dsg/g' $(find . -type f -name 'xa*')
I'm using find for similar task. It is quite simple: you have to pass it as an argument for sed like this:
sed -i 's/EXPRESSION/REPLACEMENT/g' `find -name "FILE.REGEX"`
This way you don't have to write complex loops, and it is simple to see, which files you are going to change, just run find before you run sed.
u can make
'xxxx' text u search and will replace it with 'yyyy'
grep -Rn '**xxxx**' /path | awk -F: '{print $1}' | xargs sed -i 's/**xxxx**/**yyyy**/'
There's some good answers above. I thought I'd throw in one more that is succinct and parallelizable, using GNU parallel, which I often prefer to xargs:
parallel sed -i 's/abc/xyz/g' {} ::: xa*
Combine this with the -j N option to run N jobs in parallel.
If you are able to run a script, here is what I did for a similar situation:
Using a dictionary/hashMap (associative array) and variables for the sed command, we can loop through the array to replace several strings. Including a wildcard in the name_pattern will allow to replace in-place in files with a pattern (this could be something like name_pattern='File*.txt' ) in a specific directory (source_dir).
All the changes are written in the logfile in the destin_dir
#!/bin/bash
source_dir=source_path
destin_dir=destin_path
logfile='sedOutput.txt'
name_pattern='File.txt'
echo "--Begin $(date)--" | tee -a $destin_dir/$logfile
echo "Source_DIR=$source_dir destin_DIR=$destin_dir "
declare -A pairs=(
['WHAT1']='FOR1'
['OTHER_string_to replace']='string replaced'
)
for i in "${!pairs[#]}"; do
j=${pairs[$i]}
echo "[$i]=$j"
replace_what=$i
replace_for=$j
echo " "
echo "Replace: $replace_what for: $replace_for"
find $source_dir -name $name_pattern | xargs sed -i "s/$replace_what/$replace_for/g"
find $source_dir -name $name_pattern | xargs -I{} grep -n "$replace_for" {} /dev/null | tee -a $destin_dir/$logfile
done
echo " "
echo "----End $(date)---" | tee -a $destin_dir/$logfile
First, the pairs array is declared, each pair is a replacement string, then WHAT1 will be replaced for FOR1 and OTHER_string_to replace will be replaced for string replaced in the file File.txt. In the loop the array is read, the first member of the pair is retrieved as replace_what=$i and the second as replace_for=$j. The find command searches in the directory the filename (that may contain a wildcard) and the sed -i command replaces in the same file(s) what was previously defined. Finally I added a grep redirected to the logfile to log the changes made in the file(s).
This worked for me in GNU Bash 4.3 sed 4.2.2 and based upon VasyaNovikov's answer for Loop over tuples in bash.
The Silver Searcher Solution
I'm adding another option for those people who don't know about the amazing tool called The Silver Searcher (command line tool is ag).
Note: You can use grep and other tools to do the same thing here, but The Silver Searcher is fantastic :)
TLDR
ag -l 'abc' | xargs sed -i 's/abc/xyz/g'
Install The Silver Searcher
sudo apt install silversearcher-ag # Debian / Ubuntu
sudo pacman -S the_silver_searcher # Arch / EndeavourOS
sudo yum install epel-release the_silver_searcher # RHEL / CentOS
Demo Files
Paste the following into your terminal to create some demonstration files:
mkdir /tmp/food
cd /tmp/food
content="Everybody loves to abc this food!"
echo "$content" > ./milk
echo "$content" > ./bread
mkdir ./fastfood
echo "$content" > ./fastfood/pizza
echo "$content" > ./fastfood/burger
mkdir ./fruit
echo "$content" > ./fruit/apple
echo "$content" > ./fruit/apricot
Using 'ag'
The following ag command will recursively find all the files that contain the string 'abc'. It ignores the .git directory, .gitignore files, and other ignore files:
$ ag 'abc'
milk
1:Everybody loves to abc this food!
bread
1:Everybody loves to abc this food!
fastfood/burger
1:Everybody loves to abc this food!
fastfood/pizza
1:Everybody loves to abc this food!
fruit/apple
1:Everybody loves to abc this food!
fruit/apricot
1:Everybody loves to abc this food!
To just list the files that contain the string 'abc', use the -l switch:
$ ag -l 'abc'
bread
fastfood/burger
fastfood/pizza
fruit/apricot
milk
fruit/apple
Changing Multiple Files
Finally, using xargs and sed, we can replace the 'abc' string with another string:
ag -l 'abc' | xargs sed -i 's/abc/eat/g'
In the above command, ag is listing all the files that contain the string 'abc'. The xargs command is splitting the file names and piping them individually into the sed command.
I would like the option of extracting the following string/data:
/work/foo/processed/25
/work/foo/processed/myproxy
/work/foo/processed/sample
=or=
25
myproxy
sample
But it would help if I see both.
From this output using cut or perl or anything else that would work:
Found 3 items
drwxr-xr-x - foo_hd foo_users 0 2011-03-16 18:46 /work/foo/processed/25
drwxr-xr-x - foo_hd foo_users 0 2011-04-05 07:10 /work/foo/processed/myproxy
drwxr-x--- - foo_hd testcont 0 2011-04-08 07:19 /work/foo/processed/sample
Doing a cut -d" " -f6 will get me foo_users, testcont. I tried increasing the field to higher values and I'm just not able to get what I want.
I'm not sure if cut is good for this or something like perl?
The base directories will remain static /work/foo/processed.
Also, I need the first line Found Xn items removed. Thanks.
You can do a substitution from beginning to the first occurrence of / , (non greedily)
$ your_command | ruby -ne 'print $_.sub(/.*?\/(.*)/,"/\\1") if /\//'
/work/foo/processed/25
/work/foo/processed/myproxy
/work/foo/processed/sample
Or you can find a unique separator (field delimiter) to split on. for example, the time portion is unique , so you can split on that and get the last element. (2nd element)
$ ruby -ne 'print $_.split(/\s+\d+:\d+\s+/)[-1] if /\//' file
/work/foo/processed/25
/work/foo/processed/myproxy
/work/foo/processed/sample
With awk,
$ awk -F"[0-9][0-9]:[0-9][0-9]" '/\//{print $NF}' file
/work/foo/processed/25
/work/foo/processed/myproxy
/work/foo/processed/sample
perl -lanF"\s+" -e 'print #F[-1] unless /^Found/' file
Here is an explanation of the command-line switches used:
-l: remove line break from each line of input, then add one back on print
-a: auto-split each line of input into an #F array
-n: loop through each line of input
-F: the regexp pattern to use for the auto-split (with -a)
-e: the perl code to execute (for each line of input if using -n or -p)
If you want to just output the last portion of your directory path, and the basedir is always '/work/foo/processed', I would do this:
perl -nle 'print $1 if m|/work/foo/processed/(\S+)|' file
Try this out :
<Your Command> | grep -P -o '[\/\.\w]+$'
OR if the directory '/work/foo/processed' is always static then:
<Your Command>| grep -P -o '\/work\/foo\/processed\/.+$'
-o : Show only the part of a matching line that matches PATTERN.
-P : Interpret PATTERN as a Perl regular expression.
In this example, the last word in the input will be matched .
(The word can also contain dot(s)),so file names like 'text_file1.txt', can be matched).
Ofcourse, you can change the pattern, as per your requirement.
If you know the columns will be the same, and you always list the full path name, you could try something like:
ls -l | cut -c79-
which would cut out the 79th character until the end. That might work in this exact case, but I think it would be better to find the basename of the last field. You could easily do this in awk or perl. Respond if this is not what you want and I'll add the awk and perl versions.
take the output of your ls command and pipe it to awk
your command|awk -F'/' '{print $NF}'
your_command | perl -pe 's#.*/##'