Concatenate txt file contents and/or add break to all - append

I have a bunch of .txt files that need to be combined into one big file that can be read by programs such as Microsoft Excel.
The problem is that the files currently do not have a break at the end of them, so they end up in one long line.
Here's an example of what I have (the numbers represent the line number):
1. | first line of txt file
2. | second line
Here's what I want to turn that into:
1. | first line of txt file
2. | second line
3. |
I have around 3000 of these files in a folder, all in the same format. Is there any way to take these files and add a blank line to the end of them all? I'd like to do this without the need for complicated code (PHP, etc.). I know there are similar things you can do using the terminal (I'm on CentOS), but if something does specifically what I require, I'm missing it.

The simplest way to achieve this is with a bash for-loop:
for file in *.txt; do
echo >> "$file"
done
This iterates over all .txt files in the current directory and appends a newline to each file. It can be written on one line; you only need to add a ; before the done.
Note that $file is quoted to handle files with spaces and other funny characters in their names.
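If the files are spread across several directories rather than all in the same one, you can replace *.txt with **/*.txt to iterate over all .txt files in the current folder and all of its subdirectories. In bash (4.0 or later) this requires enabling the option first with shopt -s globstar; without it, ** behaves like a single *.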
An alternative way is to use sed:
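sed -i '$ s:$:\n:' *.txt
The -i flag tells sed to edit the files in place; with GNU sed it also makes sed treat each file separately, so the address $ matches the last line of each file. The s command then substitutes the end of that line (again $) with a newline (\n, a GNU sed extension in the replacement), thus appending a line to the end of the file. Single quotes keep the shell from touching the $ characters.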
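If the loop might be run more than once, or some files already end in a newline, a small guard avoids stacking up extra blank lines. The following is a minimal sketch, assuming bash and a POSIX tail; the scratch directory and the file names a.txt and b.txt are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Demo in a scratch directory (hypothetical file names).
workdir=$(mktemp -d)
printf 'first line\nsecond line' > "$workdir/a.txt"    # no trailing newline
printf 'first line\nsecond line\n' > "$workdir/b.txt"  # already ends in a newline

for file in "$workdir"/*.txt; do
    # Command substitution strips trailing newlines, so $(tail -c1 ...) is
    # empty when the last byte is a newline (or the file is empty).
    if [ -n "$(tail -c1 "$file")" ]; then
        echo >> "$file"
    fi
done

wc -l "$workdir"/*.txt   # each file should now report 2 lines
```

Running the loop a second time leaves both files unchanged, which is the point of the guard.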

Try this snippet:
for f in *; do (cat "$f" && echo "") > "$f.tmp"; done && rename -f 's/\.tmp$//' *.tmp
This basically takes every file in the folder (for f in *; do),
outputs the file on STDOUT (cat "$f") followed by a newline (echo ""),
redirects the output into filename.tmp (> "$f.tmp"),
and then moves the *.tmp files over the original files (rename -f 's/\.tmp$//' *.tmp).
Note that this is the syntax of the Perl-based rename; the util-linux rename shipped with CentOS takes different arguments.
Edit:
Or even simpler:
for f in *; do echo "" >> "$f"; done
This basically takes every file in the folder (for f in *; do),
outputs a newline (echo "")
and appends it to the file (>> "$f").

Related

sed - Incorporate a string from specific file/folder names into text files

I am trying to add a string as the first line of a set of text files.
The first part of every string is identical (here, "#Test:"). The last part of the string should be the number that appears in the folder and file names.
The folder names consist exclusively of digits from 1-52. There are between 1-20 files in each folder with the following structure:
1 (folder)
1_tree1 (file)
1_tree2 (file)
1_tree3 (file)
...
2 (folder)
2_tree1 (file)
2_tree2 (file)
2_tree3 (file)
...
...
The operating system is Ubuntu 20.04.
I am able to change each file separately. For example, the following command in the terminal adds #Test:1 to one file in the first folder.
sed -i '1s/^/#Test:1 \n/' '/path/to/the/file'
However, when I try to do this for all files in all folders, I get stuck. Could you show me how to do it automatically? I am not necessarily restricted to sed.
Thank you.
This might be what you want, using GNU awk for "inplace" editing, BEGINFILE (so it'll work even on empty input files), and gensub():
find . -type f -exec awk -i inplace '
BEGINFILE { print "#Test:" gensub(".*/([0-9]+)_[^/]+$","\\1",1,FILENAME) }
1' {} +
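For comparison, the same idea can be sketched as a plain bash loop around the sed command from the question, taking the number from the folder name. This is an illustrative sketch only: it assumes GNU sed (for -i without a suffix and \n in the replacement), it builds a made-up scratch layout instead of touching real data, and unlike the awk version it does nothing to empty files, because sed has no first line to edit there.

```shell
#!/usr/bin/env bash
# Scratch layout mimicking the question: numeric folders holding <n>_tree<k> files.
root=$(mktemp -d)
mkdir "$root/1" "$root/2"
printf 'data\n' > "$root/1/1_tree1"
printf 'data\n' > "$root/2/2_tree1"

for dir in "$root"/*/; do
    n=$(basename "$dir")            # folder name, e.g. 1
    for f in "$dir"*; do
        [ -f "$f" ] || continue     # skip anything that is not a regular file
        # GNU sed: put "#Test:<n>" on a new first line of the file
        sed -i "1s/^/#Test:$n\n/" "$f"
    done
done

head -n 1 "$root/1/1_tree1" "$root/2/2_tree1"
```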

Using Perl with Sed and Capture Groups to add another string to end of a string

I have a requirement to go through each file and add a new string at the end of a particular statement in the file. I already have the list of such files (each file is actually a SAS program) containing this statement. My aim is to edit each file in place after first creating a backup, so I have decided to use Perl to do this in-place editing on an AIX 7.1 machine.
The statement I intend to extend always contains three keywords (FILENAME, FTP and HOST) that identify it, and it is always terminated by a semicolon. The statement can also occur multiple times in the same file.
Example of the statement in the file is:
FILENAME IN FTP "" LS HOST=XXXX USER=XXXX PASS=XXXX ;
The same type of statement can also span multiple lines, with some additional options on the statement.
FILENAME Test FTP "Sample.xls"
CD="ABCDEFG"
USER=XXXXX
PASS=XXXXX
HOST=XXXXX
BINARY
;
OR
filename Novell ftp &pitalist.
HOST=&HOST.
USER="XXXXXXXX"
PASS="XXXXXXX"
DEBUG
LRECL=10000;
My aim is to add a new string, %ftps_opts, at the end of the above statement, just before the terminating semicolon. There should be at least one space or a newline between the existing statement and this new string, as shown below.
FILENAME IN FTP "" LS HOST=XXXX USER=XXXX PASS=XXXX %ftps_opts;
FILENAME Test FTP "Sample.xls"
CD="ABCDEFG"
USER=XXXXX
PASS=XXXXX
HOST=XXXXX
BINARY
%ftps_opts;
filename Novell ftp &pitalist.
HOST=&HOST.
USER="XXXXXXXX"
PASS="XXXXXXX"
DEBUG
LRECL=10000 %ftps_opts;
Is there a way to use a capture group in Perl to capture the existing statement in each file up to the semicolon and then append the new string to it, separated by a space or newline? The Input.txt file holds the list of files containing the FILENAME FTP statement shown above.
Something like this :
#!/bin/bash
input="$HOME/Input.txt"
while IFS= read -r line
do
echo "$line"
perl -p -i.orig -e 's/(capture group)/\1 %ftps_opts /gi' "$line"
echo "done"
done < "$input"
Thank you.
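You can tell Perl to process the whole file at once instead of line by line (-0777 reads each file as a single record, so the pattern can span newlines). Add -i.orig, as in your script, to edit in place and keep a backup: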
perl -0777 -pe 's/(filename[^;]*ftp[^;]*host[^;]*)/$1 %ftps_opts/gi' -- file

Copy lines from multiple files in subfolders into one file

I'm very, very new to programming and trying to learn how to make tedious analysis tasks a little faster. I have a master folder (Master) with 50 experiment folders, and within each experiment folder is another set of folders holding text files. I want to extract 2 lines from one of the text files (experiment title on line 7, slope on line 104) and copy them to a new single file.
So far, all I have learned is how to extract the lines and add to a new file.
sed -n '7p; 104 p' reco.txt >> results.txt
How can I extract these two lines from all files 'reco.txt' in the subfolder of the folder 'Master' and export into a single text file?
As much explanation as you can bear would be great to help me learn.
You can use find in combination with xargs for this. On its own, you can get a list of all relevant files:
find . -name reco.txt -print
This finds all files named reco.txt in the current directory (.) or any subdirectories and writes them to standard output.
Now, normally you could use the -exec argument to find, which runs a program on the files found. With -exec ... {} +, multiple results are combined into a single execution (appended to the command line), and that does not suit this invocation of sed: sed numbers lines continuously across all the files it is given, so 7p and 104p would only apply to the first file. (-exec ... {} \; runs the command once per file and would also work.)
So, instead of -exec, you can use xargs, which does essentially the same thing but with more control.
find Master -name reco.txt -print0 | xargs -0 -n1 sed -n '7p; 104 p' > results.txt
This does the following:
Searches in the directory Master or subdirectories for any file named reco.txt.
Outputs each filename with null-terminator instead of newline (-print0) -- this allows the full path to contain characters that usually need escaping (such as spaces)
Pipes the result into xargs, which does the following:
Accepts null-terminated strings (-0)
Only puts at most one file into each command (-n1)
Runs sed -n '7p; 104 p' on that file
Entire output is redirected to results.txt, which will overwrite any existing contents in the file.
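The one-process-per-file behaviour can also be had from find alone by ending -exec with \; instead of +. Here is a small sketch under assumptions: the scratch tree, the folder names and the seq-generated contents are made up so that the result can be checked by eye.

```shell
#!/usr/bin/env bash
# Scratch tree standing in for Master/<experiment>/<run>/reco.txt (hypothetical names).
master=$(mktemp -d)
mkdir -p "$master/exp1/run a" "$master/exp2/run b"
seq 1 120 > "$master/exp1/run a/reco.txt"       # line 7 is "7", line 104 is "104"
seq 1001 1120 > "$master/exp2/run b/reco.txt"   # line 7 is "1007", line 104 is "1104"

results="$master/results.txt"
# '{} \;' makes find start one sed per file, so 7p and 104p are counted per file.
find "$master" -name reco.txt -exec sed -n '7p; 104p' {} \; > "$results"

cat "$results"
```

The order in which find visits the folders is not guaranteed, so the two pairs of lines may appear in either order in results.txt.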

Bash rename files with underscore

I have files that are named CULT_2009_BARRIERS_EXP_Linear.dbf
and would like to rename them to
CULT_BARRIERS_EXP_Linear.dbf .
The files have a date prefixed with them which is always different showing when it was captured.
I have tried to rename them using regular expressions. I want to test whether the file name contains numbers and then rename it. I have used
if [[ $file =~ [0-9] ]]; then rename -v "s/[0-9]//g" * && rename -v s/[_]_/_/ *;
which partially works. But I would ideally like to have a single rename command, as that is good practice.
A single rename command would be enough. Just run the command below in the directory where the .dbf files are actually stored.
rename -v "s/_[0-9]+//g" *.dbf
[0-9] matches a single digit character, and + repeats the previous token one or more times, so _[0-9]+ matches an underscore followed by one or more digits.
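The Perl-based rename is not installed everywhere (on some distributions the rename command is the util-linux one, which takes different arguments). In that case, the same substitution can be sketched with a bash loop and mv. The scratch directory and the second sample file name below are assumptions made for the demo.

```shell
#!/usr/bin/env bash
# Scratch directory with sample names (the second one is made up).
dir=$(mktemp -d)
touch "$dir/CULT_2009_BARRIERS_EXP_Linear.dbf" "$dir/CULT_2011_ROADS_EXP_Linear.dbf"

for path in "$dir"/*.dbf; do
    name=${path##*/}                                      # strip the directory part
    new=$(printf '%s\n' "$name" | sed -E 's/_[0-9]+//g')  # same substitution as the rename
    # Rename only when the name changes and the target does not already exist.
    if [ "$name" != "$new" ] && [ ! -e "$dir/$new" ]; then
        mv -- "$path" "$dir/$new"
    fi
done

ls "$dir"
```

The existence check matters here: two files that differ only in their date would otherwise collapse onto the same name and one would overwrite the other.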

Appending and overwriting the beginning of a text file (windows)

I have two text files. I'd like to take the content of file1.txt, which has four lines, and write it over the first four lines of file2.txt. That is, the first four lines of file2.txt should be replaced, but the rest of its original content (the other lines) should be kept.
How can I do that using a batch or the windows prompt?
copy file1.txt temp.txt
echo. >> temp.txt
more +4 file2.txt >> temp.txt
move /y temp.txt file2.txt
EDIT: added the "echo. >> temp.txt" instruction, which should add a newline to temp.txt, thereby allowing for a "clean" merge of file2.txt (if file1.txt doesn't end with a newline).
Note that more +4 skips the first four lines of file2.txt, so the appended output starts at line 5.
Unless the four lines at the start of the two files occupy exactly the same amount of space, you can't, without rewriting the whole file.
You can't insert or delete data into files at arbitrary points - you can overwrite existing data (byte for byte), truncate the file or append to the end, but not remove or insert into the middle.
So basically you'd need to:
Start a new file consisting of the first four lines of file1.txt
Skip past the first four lines of file2.txt
Append the rest of file2.txt to the new file, then replace file2.txt with it
You can do this fairly easily with the head/tail commands from Unix, which you could get from Cygwin if that's an acceptable solution. It's likely that the head/tail from the Windows Services for Unix would work too.
If you grab the coreutils from Gnutils you'll be able to do a lot of the stuff you can do with Cygwin without having to install cygwin.
Then you can use things like head, tail and cat which will allow you to do what you're looking to.
e.g.
head -n 4 file2.txt
to get the first four lines of file2.
Extract the zip from the page linked above, and grab whichever of the utils you need to use out of the bin directory and put them in a directory in your path - e.g. for the below you'd want mv, head and tail. You could use the built in DOS move command, but you'd need to change the options slightly.
The question is a little unclear, but if you're looking to remove the first four lines of file2.txt and append them to file1.txt you can do the following:
head -n 4 file2.txt >> file1.txt
tail -n +5 file2.txt > temp.txt
mv temp.txt file2.txt
With batch alone I'm not sure you can do it.
With Unix commands you can -- and you can easily use Unix commands under Windows using Cygwin.
In that case you want:
#!/bin/bash
head -n 4 file1.txt > result.txt # first 4 lines of file1
tail -n +5 file2.txt >> result.txt # append lines 5, 6, 7... of file2
mv result.txt file2.txt # replace file2.txt with the result
You could do it if you wrote a script in something other than Windows batch. VBScript or JScript with Windows Script Host should be able to do it. Each of those has a way to read lines from one file and write them over the lines of another.
You can do this by creating a temporary third file, pulling the lines from the first file and adding them to the temp file, then reading the second file and, after reading in four carriage return/linefeed pairs, write the rest to the temp file. Then, delete the second file and rename the temp file to the second file name.