I have multiple files with lines like:
foo, 123456
bar, 654321
baz, 098765
I would like to remove everything on each line before (and including) the comma.
The output would be:
123456
654321
098765
I attempted to use the following after seeing something similar on another question, but the user didn't leave an explanation, so I'm not sure how the wildcard would be handled:
find . -name "*.csv" -type f | xargs sed -i -e '/*,/d'
Thank you for any help you can offer.
METHOD 1:
If it's always the 2nd column you want, you can do this with awk. This command actually splits the rows on whitespace rather than on the comma, so it picks up your second column -- the numbers, without the leading space:
awk '{print $2}' < whatever.csv
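With the sample lines from the question saved in whatever.csv (a filename assumed here just for illustration), this prints:
123456
654321
098765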
METHOD 2:
Or to get everything after the comma (including the space):
sed -e 's/^.*,//g' < whatever.csv
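Unlike METHOD 1, this keeps the leading space. On a single sample line:
echo "foo, 123456" | sed -e 's/^.*,//g'
 123456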
METHOD 3:
If you want to find all of the .csv files and get the output of all of them together, you can do:
sed -e 's/^.*,//g' `find . -name '*.csv' -print`
METHOD 4:
Or the same way you were starting to -- with find and xargs:
find . -name '*.csv' -type f -print | xargs sed -e 's/^.*,//'
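One caveat for METHODS 3 and 4: both the backtick form and plain xargs will split apart any .csv filenames containing spaces. A safer variant, assuming your find and xargs support the -print0/-0 pair (GNU and BSD versions both do):
find . -name '*.csv' -type f -print0 | xargs -0 sed -e 's/^.*,//'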
METHOD 5:
To turn all of the .csv files into .txt files, processed in the way described above, you can write a brief shell script, like this:
Create a script "bla.sh":
#!/bin/sh
for infile in `find . -name '*.csv' -print` ; do
    outfile=`echo "$infile" | sed -e 's/\.csv$/.txt/'`
    echo "$infile --> $outfile"
    sed -e 's/^.*,//g' < "$infile" > "$outfile"
done
Make it executable by typing this:
chmod 755 bla.sh
Then run it:
./bla.sh
This will create a .txt output file with everything after the comma for each .csv input file.
ALTERNATE METHOD 5:
Or if you need them to be named .csv, the script could be updated like this -- this just makes an output file named "file-new.csv" for each input file named "file.csv":
#!/bin/sh
for infile in `find . -name '*.csv' -print` ; do
    outfile=`echo "$infile" | sed -e 's/\.csv$/-new.csv/'`
    echo "$infile --> $outfile"
    sed -e 's/^.*,//g' < "$infile" > "$outfile"
done
Something like this should work for a single file. Let's say the input is 'yourfile' and you want the output to go to 'outfile'.
sed 's/^.*,//' < yourfile > outfile
The syntax to do a search-and-replace is s/input_pattern/replacement/
The ^ anchors the input pattern to the beginning of the line.
A dot . matches any single character; .* matches a string of zero or more of any character.
The , matches the comma.
The replacement pattern is empty, so whatever matched the input_pattern will be removed.
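One subtlety worth noting: .* is greedy, so on a line with more than one comma the match runs all the way to the last comma:
echo "a,b, 123" | sed 's/^.*,//'
 123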
Related
I have a folder of 500 *.INI files that I need to manually edit. Within each INI file, I have the line Source =. I would like that line to become Source = C:\software\{filename}.
For instance, a dx4.ini file would need to be fixed to become: Source = C:\software\dx4
Is there a quick way to do this with Find, Grep, or Sed functions?
You can try with sed
For example
Input file (file.txt) contents:
Source =
some lines..
script (the backslashes are doubled so that sed's replacement emits single ones; -i requires GNU sed):
newstring='Source = C:\\software\\dx4'
oldstring='Source ='
sed -i "s/$oldstring/$newstring/g" file.txt
After running the above commands
output:
Source = C:\software\dx4
some lines..
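The command above hardcodes dx4; here is a sketch for applying it across all 500 files, deriving each name with basename (this assumes GNU sed for -i, and that each file's base name is what belongs after C:\software\):
for f in *.INI; do
    base=$(basename "$f" .INI)
    sed -i "s#Source =#Source = C:\\\\software\\\\$base#" "$f"
done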
If you want to edit a file in a script, I think ed is the way to go. Combined with a shell for loop:
for file in *.INI; do
base=$(basename "$file" .INI)
ed -s "$file" <<EOF
/^Source =/s/=/= C:\\\\software\\\\$base/
w
EOF
done
(This does assume that filenames will not have newlines or ampersands in their names)
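To see what ed actually receives: for a hypothetical dx4.INI, after the shell expands the here-document ($base and the doubled backslashes), the script fed to ed is:
/^Source =/s/=/= C:\\software\\dx4/
w
and ed's s command in turn collapses each \\ to a single literal backslash, yielding Source = C:\software\dx4 in the file.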
With GNU awk for the 3rd arg to match(), gensub(), and "inplace" editing:
awk -i inplace '
match($0,/(.*Source = C:\\software\\){filename}(.*)/,a) {
fname = gensub(/\..*/,"",1,FILENAME)
$0 = a[1] fname a[2]
}
1' *.INI
The above assumes you're running in a UNIX environment, though your use of the term "folder" instead of "directory", and that path starting with C: and containing backslashes, makes me suspicious. If you're on Windows, save the part between the two 's (exclusive) in a file named foo.awk and execute it as awk -i inplace -f foo.awk *.INI, or however you normally execute commands like this on Windows.
find . -name '*.ini' -type f > stack
while read -r line
do
    sed -i "s#Source =#Source = C:\\\\software\\\\dx4#" "${line}"
done < stack
Assuming that a) you have sed with "-i" (the in-place edit flag, which AFAIK is not always portable) and b) your sed doesn't choke on the doubled escape sequences, that should work.
How do I split a file into N files, using the first 2 characters of each line as the filename?
Ex input file:
AA23409234TEXT
BA23201202Other Text
AA23509234YADA
BA23202202More Text.
C1000000000000000000
Should generate 3 files:
AA.txt
AA23409234TEXT
AA23509234YADA
BA.txt
BA23201202Other Text
BA23202202More Text.
C1.txt
C1000000000000000000
I'm thinking of using a sed script similar to this
/^(..)/w \1
But what that really does is create a file named '\1' instead of the capture group.
Any ideas?
$ awk '{fname=substr($0, 1, 2); print >>fname}' input.txt
Or
$ while read -r line; do echo "$line" >>"${line:0:2}"; done <input.txt
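Neither one-liner adds the .txt suffix shown in the example; a small variation of the awk version does (computing the full name first keeps the redirection target unambiguous, since awk's parsing of a concatenated target after >> is implementation-dependent):
$ awk '{fname = substr($0, 1, 2) ".txt"; print >> fname}' input.txt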
The first thing you need to do is determine all of your file names:
filenames=$(sed 's/\(..\).*/\1/' listOfStrings.txt | sort | uniq)
Then, loop through those filenames
for filename in $filenames
do
    sed -n "/^$filename/ p" listOfStrings.txt > "$filename.txt"
done
I have not tested this, but I think it should work.
This might work for you:
sed 's/\(..\).*/echo "&" >>\1.txt/' file | sh
or if you have GNU sed:
sed 's/\(..\).*/echo "&" >>\1.txt/e' file
I'm trying to change the name of "my-silly-home-page-name.html" to "index.html" in all documents within a given master directory and subdirs.
I saw this: Shell script - search and replace text in multiple files using a list of strings.
And this: How to change all occurrences of a word in all files in a directory
I have tried this:
grep -r "my-silly-home-page-name.html" .
This finds the lines on which the text exists, but now I would like to substitute 'my-silly-home-page-name' for 'index'.
How would I do this with sed or perl?
Or do I even need sed/perl?
Something like:
grep -r "my-silly-home-page-name.html" . | sed 's/$1/'index'/g'
?
Also; I am trying this with perl, and I try the following:
perl -i -p -e 's/my-silly-home-page-name\.html/index\.html/g' *
This works, but I get an error when perl encounters directories, saying "Can't do inplace edit: SOMEDIR-NAME is not a regular file, <> line N"
Thanks,
jml
find . -type f -exec \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g' {} +
Or if your find doesn't support -exec +,
find . -type f -print0 | xargs -0 \
perl -i -pe's/my-silly-home-page-name(?=\.html)/index/g'
Both pass as many names as possible to Perl as arguments at a time. Both work with any file name, including those that contain newlines.
If you are on Windows and you are using a Windows build of Perl (as opposed to a cygwin build), -i won't work unless you also do a backup of the original. Change -i to -i.bak. You can then go and delete the backups using
find . -type f -name '*.bak' -delete
This should do the job:
find . -type f -print0 | xargs -0 sed -e 's/my-silly-home-page-name\.html/index\.html/g' -i
Basically it recursively gathers all the files under the given directory (. in the example) with find and, through xargs, runs sed on them with the same substitution command as in the perl command from the question.
Regarding the question about sed vs. perl, I'd say that you should use the one you're more comfortable with since I don't expect huge differences (the substitution command is the same one after all).
There are probably better ways to do this but you can use:
find . -name oldname.html | perl -e 'map { s/[\r\n]//g; $old = $_; s/oldname\.html$/newname.html/; rename $old,$_ } <>';
Fyi, grep searches for a pattern; find searches for files.
I'm new to sed, and need to grab just the filename from the output of find. I need to have find output the whole path for another part of my script, but I want to just print the filename without the path. I also need to match starting from the beginning of the line, not from the end. In English, I want to match the first group of characters ending with ".txt" that does not contain a "/". Here's my attempt that doesn't work:
ryan@fizz:~$ find /home/ryan/Desktop/test/ -type f -name \*.txt
/home/ryan/Desktop/test/two.txt
/home/ryan/Desktop/test/one.txt
ryan@fizz:~$ find /home/ryan/Desktop/test/ -type f -name \*.txt | sed s:^.*/[^*.txt]::g
esktop/test/two.txt
ne.txt
Here's the output I want:
two.txt
one.txt
Ok, so the solutions offered answered my original question, but I guess I asked it wrong. I don't want to kill the rest of the line past the file suffix I'm searching for.
So, to be more clear, if the following:
bash$ new_mp3s=`find mp3Dir -type f -name \*.mp3` && cp -rfv $new_mp3s dest
`/mp3Dir/one.mp3' -> `/dest/one.mp3'
`/mp3Dir/two.mp3' -> `/dest/two.mp3'
What I want is:
bash$ new_mp3s=`find mp3Dir -type f -name \*.mp3` && cp -rfv $new_mp3s dest | sed ???
`one.mp3' -> `/dest'
`two.mp3' -> `/dest'
Sorry for the confusion. My original question just covered the first part of what I'm trying to do.
2nd edit:
here's what I've come up with:
DEST=/tmp && cp -rfv `find /mp3Dir -type f -name \*.mp3` $DEST | sed -e 's:[^\`].*/::' -e "s:$: -> $DEST:"
This isn't quite what I want though. Instead of setting the destination directory as a shell variable, I would like to change the first sed operation so it only changes the cp output before the "->" on each line, so that I still have the 2nd part of the cp output to operate on with another '-e'.
3rd edit:
I haven't figured this out using only sed regex's yet, but the following does the job using Perl:
cp -rfv `find /mp3Dir -type f -name \*.mp3` /tmp | perl -pe "s:.*/(.*.mp3).*\`(.*/).*.mp3\'$:\$1 -> \$2:"
I'd like to do it in sed though.
Something like this should do the trick:
find yourdir -type f -name \*.txt | sed 's/.*\///'
or, slightly clearer,
find yourdir -type f -name \*.txt | sed 's:.*/::'
Why don't you use basename instead?
find /mydir | xargs -I{} basename {}
No need for external tools if you're using GNU find
find /path -name "*.txt" -printf "%f\n"
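If your find isn't GNU (e.g. stock BSD/macOS find), a portable near-equivalent combines it with basename, as in the answer above:
find /path -name "*.txt" -exec basename {} \;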
I landed on the question based on the title: using sed to grab filename from fullpath.
So, using sed, the following is what worked for me...
FILENAME=$(echo "$FULLPATH" | sed -n 's/^\(.*\/\)*\(.*\)/\2/p')
The first group captures any directories from the path. This is discarded.
The second group captures the text following the last slash (/). This is returned.
Examples:
echo "/test/file.txt" | sed -n 's/^\(.*\/\)*\(.*\)/\2/p'
file.txt
echo "/test/asd/asd/entrypoint.sh" | sed -n 's/^\(.*\/\)*\(.*\)/\2/p'
entrypoint.sh
echo "/test/asd/asd/default.json" | sed -n 's/^\(.*\/\)*\(.*\)/\2/p'
default.json
find /mydir | awk -F'/' '{print $NF}'
path="parentdir2/parentdir1/parentdir0/dir/FileName"
name=${path##*/}
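For example:
path="parentdir2/parentdir1/parentdir0/dir/FileName"
name=${path##*/}
echo "$name"   # prints: FileName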
I am interested into getting into bash scripting and would like to know how you can traverse a unix directory and log the path to the file you are currently looking at if it matches a regex criteria.
It would go like this:
Traverse a large unix directory path file/folder structure.
If the current file's contents contained a string that matched one or more regex expressions,
Then append the file's full path to a results text file.
Bash or Perl scripts are fine, although I would prefer how you would do this using a bash script with grep, awk, etc commands.
find . -type f -print0 | xargs -0 grep -l -E 'some_regexp' > /tmp/list.of.files
Important parts:
-type f makes the find list only files
-print0 prints the files separated not by \n but by \0 - it is here to make sure it will work in case you have files with spaces in their names
xargs -0 - splits input on \0, and passes each element as argument to the command you provided (grep in this example)
The cool thing about using xargs is that if your directory really contains a lot of files, you can speed up the process by parallelizing it:
find . -type f -print0 | xargs -0 -P 5 -L 100 grep -l -E 'some_regexp' > /tmp/list.of.files
This will run the grep command in 5 separate copies, each scanning another batch of up to 100 files.
use find and grep
find . -exec grep -l -e 'myregex' {} \; >> outfile.txt
-l on the grep gets just the file name
-e on the grep specifies a regex
{} places each file found by the find command on the end of the grep command
>> outfile.txt appends to the text file
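A related trick: POSIX find also accepts + as the -exec terminator, which passes filenames to grep in batches (much like xargs does) instead of forking one grep per file:
find . -type f -exec grep -l -e 'myregex' {} + >> outfile.txt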
grep -l -R <regex> <location> should do the job.
If you wanted to do this from within Perl, you can take the find commands that people suggested and turn them into a Perl script with find2perl:
If you have:
$ find ...
make that
$ find2perl ...
That outputs a Perl program that does the same thing. From there, if you need to do something that's easy in Perl but hard in shell, you just extend the Perl program.
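For example (a hypothetical invocation; find2perl ships with older Perl distributions but was removed from core in Perl 5.22, so you may need to install it from CPAN):
$ find2perl . -type f -name '*.txt' > findprog.pl
$ perl findprog.pl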
find /path -type f -name "*.txt" | awk '
{
    # $0 is a filename from find; read that file line by line
    while ((getline line < $0) > 0) {
        if (line ~ /pattern/) {
            print $0 ":" line
            # do some other things here
        }
    }
    close($0)   # avoid exhausting file descriptors on large trees
}'