How to write .csv from another .csv in Bash or Ksh - matlab

I have a lot of outputs from Matlab in .csv format. I would like to put them together into one output.csv file.
So my idea was to use the .csv files created by Matlab as variables for my global output.csv.
#!/bin/ksh -p
# Reading results from results.csv
echo Study name?
read NAME
cd PROJECTS/04_${NAME}
sed -i 's/\r//g' Results.csv
while IFS=";" read -r R1 R2 R3
do
echo $R1
echo $R2
echo $R3
while IFS=";" write -r var1 var2 var3 var4
do
var1=$NAME
var2=$R1
var3=$R2
var4=$R3
done > >(tail -n +2 /PROJECTS/teste_output.csv)
done < <(tail /PROJECTS/04_${NAME}/Results.csv)
Each Results.csv is in this format:
2.1680114865303;0;-0.00516967741714325
Using my code for one specific file I get:
2.1680114865303
0
-0.00516967741714325
That means it's doing the first part, but it's not writing to my output.csv.
So I would like to know how to write in this case. Is it possible to read more than one .csv at the same time?
Ideally I would like to have one more while loop to read a list of files, get each Results.csv, and write it into output.csv:
#!/bin/ksh -p
# Reading results from results.csv
while IFS=";" read -r NAME c1 c2 c3 c4 c5 c6 c7 c8 c9 c10 c11
do
cd PROJECTS/04_${NAME}
sed -i 's/\r//g' Results.csv
while IFS=";" read -r R1 R2 R3
do
echo $R1
echo $R2
echo $R3
while IFS=";" write -r var1 var2 var3 var4
do
var1=$NAME
var2=$R1
var3=$R2
var4=$R3
done > >(tail -n +2 /PROJECTS/teste_output.csv)
done < <(tail /PROJECTS/04_${NAME}/Results.csv)
done < <(tail -n +2 /PROJECTS/input.csv)
So I read a list of files in my input.csv, get NAME, get the results for that NAME, and put them into my global output.csv.
For now, my code is able to read the list and read the results from Matlab (Results.csv), but it's not writing to my output.csv. If it's easier, I could make 2 or 3 bash scripts to do it step by step.
I already tried with bash and ksh, but neither worked.
Thanks for your help in advance :)

Is this what you are trying to accomplish?
(Making a few guesses...)
read -p "Study name? " name
in=PROJECTS/04_"${name}"/Results.csv
out=/PROJECTS/teste_output.csv
sed -i 's/\r//g' "$in"
printf "Study," > "$out"
head -1 "$in" >> "$out"
while read -r line || [[ -n "$line" ]]
do echo "$name;$line"
done < <(tail -n +2 "$in") >> "$out"
I switched your input and output - sorry if that's not what you meant.
This doesn't parse the fields, just leaves the separators in place and prepends the study name, and does it all in one loop.
Again, apologies if I misread your intent.
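If you also want the outer loop over input.csv from your second snippet, here is a rough sketch in the same style (untested; it assumes the first field of each input.csv line is the study name and keeps the question's mix of PROJECTS/ and /PROJECTS/ paths as-is):
out=/PROJECTS/teste_output.csv
: > "$out"                                # start with an empty output file
while IFS=";" read -r name rest
do
    in=PROJECTS/04_"${name}"/Results.csv
    sed -i 's/\r//g' "$in"                # strip Windows line endings
    while read -r line || [[ -n "$line" ]]
    do echo "$name;$line"
    done < "$in" >> "$out"
done < <(tail -n +2 /PROJECTS/input.csv)  # skip the header line of input.csv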

Related

How to rename all the files (without for loop) in a single line command?

I want to rename all the files in my home directory (for example abc) to the format abc_bkp, without using any loops, and it should be a single-line command in Unix (bash script).
If the directory contains nothing but files, this should do it:
ls | xargs -I {} mv {} {}_bkp
If it contains subdirectories, links, and other things you don't want to rename, you must filter the output of ls. Here is a crude way to do it; maybe someone can suggest a more elegant approach:
ls -l | grep ^- | cut -d' ' -f 13 | xargs -I {} mv {} {}_bkp
If you don't want to use loops, then I believe the best way could be the find command. Try the following command as a dry run first, and once you are satisfied with the results, remove the echo from it to do the actual rename.
find -type f -or -type d | xargs -I % echo mv % %_bkp
-I: From man xargs page:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character. Implies -x and -L 1.
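For what it's worth, if any of the file names contain spaces, a find-based variant that hands each name to mv through a small shell is more robust. A sketch (assuming a find that supports -maxdepth; keep the echo for a dry run and remove it to actually rename):
find . -maxdepth 1 -type f -exec sh -c 'echo mv "$1" "$1"_bkp' _ {} \;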

How to insert total number of lines in a new line before the first line command?

I have thousands of .xyz files which are chemical coordinates like this one (for instance):
Fe 0.000000000 0.000000000 0.000000000
C 2.112450000 0.000000000 0.000000000
C 0.039817193 1.817419422 0.000000000
I searched a lot for a simple command, like sed or head and tail, to write the counted number of lines at the top of the file (adding a newline \n, with two spaces before the total number) but couldn't succeed.
I would really appreciate any help given.
The output must be like this:
3
Fe 0.000000000 0.000000000 0.000000000
C 2.112450000 0.000000000 0.000000000
C 0.039817193 1.817419422 0.000000000
You can do it as a composite command:
(wc -l < fileA && cat fileA) > outputA
Note that I used < there to make sure wc does not print the filename on the first line. On Mac OS at least, it does if you don't use redirection like that.
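To illustrate the difference (the exact spacing of wc output varies by platform):
$ wc -l fileA
3 fileA
$ wc -l < fileA
3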
Edit: If you need to apply the same to many files:
mkdir output
ls *.xyz | while read -r filename; do
(wc -l < "$filename" && cat "$filename") > output/"$filename"
done
Just for fun, here's a command that you should not use, but works in some cases:
tee xxx < fileA | wc -l > xxx # don't do this
Try this:
sed '1i \
'$(wc -l < file.xyz) file.xyz
To process multiple files with find and edit the files in place using the -i flag:
find . -name '*.xyz' -exec sh -c 'sed -i "1s/.*/$(wc -l < {})\n&/" {}' \;
NB: as the files will be edited in place with this command, you may want to first check the output. To do so, remove the -i flag after sed. If the command meets your needs, then add the -i flag again.
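For completeness, here is a sketch of an awk alternative: it reads the file twice, counting lines on the first pass and printing the count (with the two leading spaces the question asks for) before the first line on the second pass:
awk 'NR==FNR { n++; next } FNR==1 { printf "  %d\n", n } { print }' file.xyz file.xyz > file_new.xyz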

Replace first line in directory files

I would like to execute this make command to first replace the first line of all csv files inside the directory and then replace the # with "," throughout the other lines.
The second command is working fine and does what it is supposed to do, but the first one only replaces the line on the first file.
Could anyone give me a hand with that?
csv:
$(DOCKER_RUN) npm run csv-generator
make format-csv
format-csv:
@sed -i '' '1 s/^.*$$/"bar","repository"/g' $(CURDIR)/foo/npm/*.csv
@sed -i '' 's/\(.*\)#/\1","/g' $(CURDIR)/foo/npm/*.csv
The reason that the first sed command "fails" is that sed doesn't reset the line counter between input files (on your system, and neither on my Mac OS X machine, see comments):
$ cat test1
a
b
g
$ cat test2
aa
bb
cc
$ sed -n '=' test1 test2 # the '=' sed command outputs line numbers
1
2
3
4
5
6
This is why the first sed command isn't doing what you want it to do, it only affects the first file's first line.
The solution is to loop over the files and call sed for each of them (untested in Makefile):
@for f in $(CURDIR)/foo/npm/*.csv; do \
sed -i '' '1 s/^.*$$/"bar","repository"/g' $f; \
done
Using find and xargs will also work, just make sure that find isn't picking up files further down in the folders.
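For example, a sketch of that find/xargs variant, written as a plain shell command run from the project root (inside a Makefile recipe the $ signs would need to be doubled as in the snippets above): -maxdepth 1 keeps find from descending into subfolders, and -n 1 runs one sed per file so the first-line address applies to every file:
find foo/npm -maxdepth 1 -name '*.csv' -print0 | xargs -0 -n 1 sed -i '' '1 s/^.*$/"bar","repository"/'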
EDIT: In light of the comments on this answer, I would recommend avoiding the use of sed -i on multiple files altogether, and convert both statements into for-loops (in this case, they may be collapsed into one loop with two statements):
@for f in $(CURDIR)/foo/npm/*.csv; do \
sed -i '' '1 s/^.*$$/"bar","repository"/g' $f; \
sed -i '' 's/\(.*\)#/\1","/g' $f; \
done
In my experience, using for-loops in Makefiles seems to be far more common than using find and xargs. This is probably due to incompatibilities between find and xargs versions across Unices. It also makes the Makefile a lot easier to read if one uses explicit loops.
I managed to solve with:
@find $(CURDIR)/foo/npm -name "*.csv" -type f | xargs -L 1 sed -i '' '1 s/^.*$$/"bar"/g'

comparing two directories with separate diff output per file

I need to see what has changed between two directories that contain different versions of a software source code. While I have found a way to get a single .diff file, how can I obtain a separate file for each changed file in the two directories? I need this because the "main" diff is about 6 MB and I wanted something more manageable.
I came across this problem too, so I ended up with a few lines of a shell script. It takes three arguments: the source and destination directories (as used for diff) and a target folder (which should exist) for the output.
It's a bit hacky, but maybe it will be useful for someone. So use it with care, especially if your paths contain special characters.
#!/bin/sh
DIFFARGS="-wb"
LANG=C
TARGET=$3
SRC=`echo $1 | sed -e 's/\//\\\\\\//g'`
DST=`echo $2 | sed -e 's/\//\\\\\\//g'`
if [ ! -d "$TARGET" ]; then
echo "'$TARGET' is not a directory." >&2
exit 1
fi
diff -rqN $DIFFARGS "$1" "$2" | sed "s/Files $SRC\/\(.*\?\) and $DST\/\(.*\?\) differ/\1/" | \
while read file
do
if [ ! -d "$TARGET/`dirname \"$file\"`" ]; then
mkdir -p "$TARGET/`dirname \"$file\"`"
fi
diff $DIFFARGS -N "$1/$file" "$2/$file" > "$TARGET"/"$file.diff"
done
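For example, assuming the script is saved as dirdiff.sh (a hypothetical name) and made executable, it could be invoked like this:
mkdir -p diffs_out
./dirdiff.sh old_version new_version diffs_out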
If you want to compare source code, it is better to commit it to a version control system such as "svn".
After you have done so, do a diff of your uploaded code and pipe it to file.diff:
svn diff --old svn:url1 --new svn:url2 > file.diff
A bash for loop will work for you. The following will diff two directories with C source code and produce a separate diff for each file.
for FILE in $(find <FIRST_DIR> -name '*.[ch]'); do DIFF=<DIFF_DIR>/$(echo $FILE | grep -o '[-_a-zA-Z0-9.]*$').diff; diff -u $FILE <SECOND_DIR>/${FILE#<FIRST_DIR>/} > $DIFF; done
Use the correct patch level for the lines starting with +++
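For instance (an illustration, not from the original answer): if the +++ lines read something like <SECOND_DIR>/src/foo.c and you apply a diff from inside a copy of the first tree, strip the one leading directory component with the patch level:
cd <FIRST_DIR> && patch -p1 < <DIFF_DIR>/foo.c.diff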

Change multiple files

The following command is correctly changing the contents of 2 files.
sed -i 's/abc/xyz/g' xaa1 xab1
But what I need to do is to change several such files dynamically, and I do not know the file names. I want to write a command that will read all the files in the current directory whose names start with xa, and sed should change their contents.
I'm surprised nobody has mentioned the -exec argument to find, which is intended for this type of use-case, although it will start a process for each matching file name:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} \;
Alternatively, one could use xargs, which will invoke fewer processes:
find . -type f -name 'xa*' | xargs sed -i 's/asd/dsg/g'
Or more simply use the + exec variant instead of ; in find to allow find to provide more than one file per subprocess call:
find . -type f -name 'xa*' -exec sed -i 's/asd/dsg/g' {} +
Better yet:
for i in xa*; do
sed -i 's/asd/dfg/g' $i
done
because nobody knows how many files are there, and it's easy to break command line limits.
Here's what happens when there are too many files:
# grep -c aaa *
-bash: /bin/grep: Argument list too long
# for i in *; do grep -c aaa $i; done
0
... (output skipped)
#
You could use grep and sed together. This allows you to search subdirectories recursively.
Linux: grep -r -l <old> * | xargs sed -i 's/<old>/<new>/g'
OS X: grep -r -l <old> * | xargs sed -i '' 's/<old>/<new>/g'
For grep:
-r recursively searches subdirectories
-l prints file names that contain matches
For sed:
-i extension (Note: An argument needs to be provided on OS X)
Those commands won't work in the default sed that comes with Mac OS X.
From man 1 sed:
-i extension
Edit files in-place, saving backups with the specified
extension. If a zero-length extension is given, no backup
will be saved. It is not recommended to give a zero-length
extension when in-place editing files, as you risk corruption
or partial content in situations where disk space is exhausted, etc.
Tried
sed -i '.bak' 's/old/new/g' logfile*
and
for i in logfile*; do sed -i '.bak' 's/old/new/g' $i; done
Both work fine.
@PaulR posted this as a comment, but people should view it as an answer (and this answer works best for my needs):
sed -i 's/abc/xyz/g' xa*
This will work for a moderate amount of files, probably on the order of tens, but probably not on the order of millions.
Another more versatile way is to use find:
sed -i 's/asd/dsg/g' $(find . -type f -name 'xa*')
I'm using find for a similar task. It is quite simple: you have to pass its output to sed as arguments, like this:
sed -i 's/EXPRESSION/REPLACEMENT/g' `find -name "FILE.REGEX"`
This way you don't have to write complex loops, and it is simple to see which files you are going to change; just run find before you run sed.
You can do it like this, where 'xxxx' is the text you search for and 'yyyy' is the text it will be replaced with:
grep -Rn 'xxxx' /path | awk -F: '{print $1}' | xargs sed -i 's/xxxx/yyyy/'
There are some good answers above. I thought I'd throw in one more that is succinct and parallelizable, using GNU parallel, which I often prefer to xargs:
parallel sed -i 's/abc/xyz/g' {} ::: xa*
Combine this with the -j N option to run N jobs in parallel.
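For example, to run four replacement jobs at a time:
parallel -j 4 sed -i 's/abc/xyz/g' {} ::: xa*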
If you are able to run a script, here is what I did for a similar situation:
Using a dictionary/hashMap (associative array) and variables for the sed command, we can loop through the array to replace several strings. Including a wildcard in name_pattern allows replacing in place in files matching a pattern (this could be something like name_pattern='File*.txt') in a specific directory (source_dir).
All the changes are written to the logfile in destin_dir.
#!/bin/bash
source_dir=source_path
destin_dir=destin_path
logfile='sedOutput.txt'
name_pattern='File.txt'
echo "--Begin $(date)--" | tee -a $destin_dir/$logfile
echo "Source_DIR=$source_dir destin_DIR=$destin_dir "
declare -A pairs=(
['WHAT1']='FOR1'
['OTHER_string_to replace']='string replaced'
)
for i in "${!pairs[@]}"; do
j=${pairs[$i]}
echo "[$i]=$j"
replace_what=$i
replace_for=$j
echo " "
echo "Replace: $replace_what for: $replace_for"
find $source_dir -name $name_pattern | xargs sed -i "s/$replace_what/$replace_for/g"
find $source_dir -name $name_pattern | xargs -I{} grep -n "$replace_for" {} /dev/null | tee -a $destin_dir/$logfile
done
echo " "
echo "----End $(date)---" | tee -a $destin_dir/$logfile
First, the pairs array is declared; each pair is a replacement pair, so WHAT1 will be replaced with FOR1, and OTHER_string_to replace will be replaced with string replaced in the file File.txt. In the loop the array is read, the first member of the pair is retrieved as replace_what=$i and the second as replace_for=$j. The find command searches the directory for the filename (which may contain a wildcard) and the sed -i command replaces in place in the same file(s) what was previously defined. Finally I added a grep redirected to the logfile to log the changes made in the file(s).
This worked for me with GNU Bash 4.3 and sed 4.2.2, and is based upon VasyaNovikov's answer to Loop over tuples in bash.
The Silver Searcher Solution
I'm adding another option for those people who don't know about the amazing tool called The Silver Searcher (command line tool is ag).
Note: You can use grep and other tools to do the same thing here, but The Silver Searcher is fantastic :)
TLDR
ag -l 'abc' | xargs sed -i 's/abc/xyz/g'
Install The Silver Searcher
sudo apt install silversearcher-ag # Debian / Ubuntu
sudo pacman -S the_silver_searcher # Arch / EndeavourOS
sudo yum install epel-release the_silver_searcher # RHEL / CentOS
Demo Files
Paste the following into your terminal to create some demonstration files:
mkdir /tmp/food
cd /tmp/food
content="Everybody loves to abc this food!"
echo "$content" > ./milk
echo "$content" > ./bread
mkdir ./fastfood
echo "$content" > ./fastfood/pizza
echo "$content" > ./fastfood/burger
mkdir ./fruit
echo "$content" > ./fruit/apple
echo "$content" > ./fruit/apricot
Using 'ag'
The following ag command will recursively find all the files that contain the string 'abc'. It ignores the .git directory, .gitignore files, and other ignore files:
$ ag 'abc'
milk
1:Everybody loves to abc this food!
bread
1:Everybody loves to abc this food!
fastfood/burger
1:Everybody loves to abc this food!
fastfood/pizza
1:Everybody loves to abc this food!
fruit/apple
1:Everybody loves to abc this food!
fruit/apricot
1:Everybody loves to abc this food!
To just list the files that contain the string 'abc', use the -l switch:
$ ag -l 'abc'
bread
fastfood/burger
fastfood/pizza
fruit/apricot
milk
fruit/apple
Changing Multiple Files
Finally, using xargs and sed, we can replace the 'abc' string with another string:
ag -l 'abc' | xargs sed -i 's/abc/eat/g'
In the above command, ag lists all the files that contain the string 'abc'. The xargs command takes those file names and passes them as arguments to the sed command.
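If ag isn't available, a plain grep equivalent of the same pipeline (assuming GNU grep and sed) is a reasonable fallback, as mentioned above:
grep -rl 'abc' . | xargs sed -i 's/abc/eat/g'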