batch rename files that start with '-'? - perl

I usually get a bunch of files whose names start with a dash '-'. This causes all sorts of problems when I run Linux commands, because anything after - is interpreted as a flag.
What is the fastest way to rename these files so there is no dash at the front of the name? I can manually rename each file by passing '--' before the file name. For example, '-File1' can be renamed with
mv -- -File1 File1
But this is not ideal when I have to rename hundreds of files on the fly. Currently I export them, use a Windows program to batch rename them, and then upload them back to the Linux box.

The easiest way to refer to such a file is ./-File1. (You only have the problem if the file is in the current directory, anyway.) Maybe if you get used to that it's not so bad.
To bulk rename them, you could do something like:
for f in -*; do mv "./$f" "renamed$f"; done
or, as @shellter suggests in a comment, to reproduce the example in the OP:
for f in -*; do mv "./$f" "${f#-}"; done
Note: the above will only remove a single - from the name.
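If some names start with more than one dash and you want them all stripped, one option (a sketch, not from the original answers) is a Perl one-liner using the rename builtin; it only touches the current directory:
perl -e 'for (glob "./-*") { (my $new = $_) =~ s{^\./-+}{./}; rename $_, $new or warn "$_: $!\n" }'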

If you have the util-linux package (most do?):
rename - '' ./-*
man rename
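Note that on some distributions (Debian and Ubuntu, for example) the rename command is the Perl-based File::Rename script rather than the util-linux tool, so it takes a substitution expression instead; assuming that variant, something like this should do the same job:
rename 's{^\./-+}{./}' ./-*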

It might be easier to do this in the shell, but if you're worried about special cases, or if you would just rather use Perl, there are a couple of ways to do it. One is to use File::Copy's mv:
use strict;
use warnings;
use feature 'say';
use File::Copy qw(mv);

opendir(my $dir, ".") or die "Can't open directory: $!";
foreach my $file (readdir($dir)) {
    my $new_name = $file =~ s/^-+//r;   # works if the filename begins with multiple '-'s
    if ($new_name ne $file) {
        say "$file -> $new_name";
        mv $file, $new_name;
    }
}
or use the rename builtin, although it may not work on every system implementation (for example, across filesystem boundaries):
rename $file, $new_name; #instead of mv $file, $new_name;
In either case, if a file with the new name exists it will get silently overwritten with this code. You might need some logic to take care of that:
# Stick inside the "if" clause above
if (-e $new_name) {
    say "$new_name already exists!";
    next;
}
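Putting those pieces together, a complete version of the script might look like this sketch (the closedir and the or warn on mv are small additions, not part of the original snippets):
use strict;
use warnings;
use feature 'say';
use File::Copy qw(mv);

opendir(my $dir, ".") or die "Can't open directory: $!";
foreach my $file (readdir($dir)) {
    my $new_name = $file =~ s/^-+//r;   # strip all leading dashes
    next if $new_name eq $file;         # nothing to rename
    if (-e $new_name) {
        say "$new_name already exists!";
        next;
    }
    say "$file -> $new_name";
    mv $file, $new_name or warn "Could not rename $file: $!";
}
closedir($dir);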

Using find:
find -name '-*' -exec rename -- - '' {} \;

Related

Read same extension multiple files in one directory in Perl

I currently have an issue with reading files in one directory.
I need to take all the fastq files in a directory, run the script for each file, and then put the new files in an 'Edited_sequences' folder.
The script I have for one file is
perl -ne '$i++; if($i<80001){print}' BM2003_TCCCAGAACAAC_L001_R1_001.fastq > ./Edited_sequences/BM2003_TCCCAGAACAAC_L001_R1_001.fastq
It takes the first 80000 lines in one fastq file then outputs the result.
Now if, for example, I have 2000 fastq files, I would need to copy and paste that command 2000 times.
I know there is a glob command suited to this situation, but I just do not know how to use it.
Please help me out.
You can use perl to do the copying for you: the first argument, *.fastq, expands to all the fastq files, and the second, ./Edited_sequences, is the target folder for the new files:
perl -e '$d=pop; `head -80000 "$_" > "$d/$_"` for @ARGV' *.fastq ./Edited_sequences
glob gets you an array of filenames matching a particular expression. It's frequently used with <> brackets, a lot like reading input (you can think of it as reading files from a directory).
This is a simple example that will print the names of every ".fastq" file in the current directory:
print "$_\n" for <*.fastq>;
The important part is <*.fastq>, which gives us an array of filenames matching that expression (in this case, a file extension). If you need to change which directory your Perl script is working in, you can use chdir.
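For example (the path here is purely illustrative), to switch into the directory that holds the fastq files before globbing:
chdir "/data/fastq_run1" or die "Can't chdir: $!";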
From there, we can process your files as needed:
while (my $filename = <*.fastq>) {
    open(my $in,  '<', $filename) or die $!;
    open(my $out, '>', "./Edited_sequences/$filename") or die $!;
    for (1..80000) {
        my $line = <$in>;
        last unless defined $line;   # stop early if the file has fewer lines
        print $out $line;
    }
}
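One practical addition (not in the original answer): make sure the Edited_sequences folder exists before the loop, otherwise every open for writing will fail:
mkdir "Edited_sequences" unless -d "Edited_sequences";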
You have two choices:
1. Use Perl to read in the 2000 files and process them as part of your program
2. Use the shell to pass each of those 2000 files to your command line
Here's the bash alternative:
for file in *.fastq
do
perl -ne '$i++; if($i<80001){print}' "$file" > "./Edited_sequences/$file"
done
Your same Perl script, but with the shell finding each file. This should work and not overload the command line. The for loop in bash, if handed a glob, expands it correctly.
However, I always recommend that you don't actually execute the command, but echo the resulting commands into a file:
for file in *.fastq
do
echo "perl -ne '\$i++; if(\$i<80001){print}' \
\"$file\" > \"./Edited_sequences/$file\"" >> myoutput.txt
done
Then, you can look at myoutput.txt to make sure it looks good before you actually do any real harm. Once you've determined that myoutput.txt is a good file, you can execute that as a shell script:
$ bash myoutput.txt

how to create a script from a perl script which will use bash features to copy a directory structure

Hi, I have written a Perl script which copies an entire directory structure from source to destination. I then have to create a restore script from the Perl script which will undo what the Perl script has done; that is, create a (shell) script which can use bash features to restore the contents from destination back to source. I am struggling to find the correct function or command which can copy recursively (not a hard requirement), but I want exactly the same structure as it was before.
Below is the way I am trying to create a file called restore to do the restoration.
I am particularly looking for the algorithm.
Also, restore will restore the structure to a directory given as a command-line argument if one is supplied; if not, you can assume the default inputs supplied to the Perl script:
$source
$target
In this case we want to copy from target back to source.
So we have two different parts in one script:
1. Copy from source to destination.
2. Create a script file which will undo what part 1 has done.
I hope this makes it clear.
unless(open FILE, '>'."$source/$file")
{
# Die with error message
# if we can't open it.
die "\nUnable to create $file\n";
}
# Write some text to the file.
print FILE "#!/bin/sh\n";
print FILE "$1=$target;\n";
print FILE "cp -r \n";
# close the file.
close FILE;
# here we change the permissions of the file
chmod 0755, "$source/$file";
The last problem I have is that I couldn't get $1 into my restore file, because Perl treats it as one of its own variables, but I need it for reading command-line input when I run restore, as in ./restore /home/xubuntu/User (where $0 = ./restore and $1 = /home/xubuntu/User).
First off, the standard way in Perl for doing this:
unless(open FILE, '>'."$source/$file") {
die "\nUnable to create $file\n";
}
is to use the or statement:
open my $file_fh, ">", "$source/$file"
    or die "Unable to create '$file'";
It's just easier to understand.
A more modern way would be use autodie; which will handle all IO problems when opening or writing to files.
use strict;
use warnings;
use autodie;
open my $file_fh, '>', "$source/$file";
You should look at the Perl Modules File::Find, File::Basename, and File::Copy for copying files and directories:
use File::Find;
use File::Basename;
use File::Copy;

my @file_list;
find( sub {
        return unless -f;
        push @file_list, $File::Find::name;
    },
    $directory );
Now, @file_list will contain all the files in $directory.
for my $file ( @file_list ) {
    my $directory = dirname $file;
    mkdir $directory unless -d $directory;
    copy $file, ...;
}
Note that autodie will also terminate your program if the mkdir or copy commands fail.
I didn't fill in the copy command because where you want to copy and how may differ. Also you might prefer use File::Copy qw(cp); and then use cp instead of copy in your program. The copy command will create a file with default permissions while the cp command will copy the permissions.
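Purely as an illustration of one way to fill in that copy (the $target destination root is an assumption, and the layout simply mirrors the source tree under it):
use File::Copy qw(copy);
use File::Basename;
use File::Path qw(make_path);

# $directory is the source root passed to find() above; $target is an assumed destination root.
for my $file ( @file_list ) {
    (my $relative = $file) =~ s/^\Q$directory\E//;   # path below the source root
    my $dest     = "$target$relative";
    my $dest_dir = dirname($dest);
    make_path($dest_dir) unless -d $dest_dir;
    copy($file, $dest) or die "Copy of $file to $dest failed: $!";
}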
You didn't explain why you wanted a bash shell command. I suspect you wanted to use it for the directory copy, but you can do that in Perl anyway. If you still need to create a shell script, the easiest way is via a here document:
print {$file_fh} <<END_OF_SHELL_SCRIPT;
Your shell script goes here
and it can contain as many lines as you need.
Since there are no quotes around `END_OF_SHELL_SCRIPT`,
Perl variables will be interpolated
This is the last line. The END_OF_SHELL_SCRIPT marks the end
END_OF_SHELL_SCRIPT
close $file_fh;
See Here-docs in Perldoc.
First, I see that you want to make a copy script; if you only need to copy files, you can use:
system("cp -r /sourcepath /targetpath");
Second, if you need to copy subfolders, you can use the -r switch.

How do I run a Perl script on multiple input files with the same extension?

How do I run a Perl script on multiple input files with the same extension?
perl scriptname.pl file.aspx
I'm looking to have it run for all aspx files in the current directory
Thanks!
In your Perl file,
my @files = <*.aspx>;
for my $file (@files) {
    # do something.
}
The <*.aspx> is called a glob.
You can pass those files to perl with a wildcard.
In your script:
foreach (@ARGV) {
    print "file: $_\n";
    # open your file here...
    # ..do something
    # close your file
}
On the command line:
$ perl myscript.pl *.aspx
You can use glob explicitly, to use shell parameters without depending too much on the shell behaviour.
for my $file ( map { glob($_) } @ARGV ) {
    print $file, "\n";
}
You may need to handle possible duplicates if more than one parameter expands to the same filename.
For a simple one-liner with -n or -p, you want
perl -i~ -pe 's/foo/bar/' *.aspx
The -i~ says to modify each target file in place, and leave the original as a backup with an ~ suffix added to the file name. (Omit the suffix to not leave a backup. But if you are still learning or experimenting, that's a bad idea; removing the backups when you're done is a much smaller hassle than restoring the originals from a backup if you mess something up.)
If your Perl code is too complex for a one-liner (or just useful enough to be reusable) obviously replace -e '# your code here' with scriptname.pl ... though then maybe refactor scriptname.pl so that it accepts a list of file name arguments, and simply use scriptname.pl *.aspx to run it on all *.aspx files in the current directory.
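If you want that same in-place behaviour from inside a script rather than a one-liner, here is a hedged sketch using the $^I special variable (which is what -i sets under the hood); the s/foo/bar/ body is just the placeholder substitution from above:
#!/usr/bin/perl
use strict;
use warnings;

$^I = '~';        # enable in-place editing with a ~ backup, like -i~
while (<>) {
    s/foo/bar/;   # placeholder edit; print writes back into the current file
    print;
}
Run it as perl scriptname.pl *.aspx and each file is rewritten in place, with the originals kept as *.aspx~.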
If you need to recurse a directory structure and find all files with a particular naming pattern, the find utility is useful.
find . -name '*.aspx' -exec perl -pi~ -e 's/foo/bar/' {} +
If your find does not support -exec ... + try with -exec ... \; though it will be slower and launch more processes (one per file you find instead of as few as possible to process all the files).
To only scan some directories, replace . (which names the current directory) with a space-separated list of the directories to examine, or even use find to find the directories themselves (and then perhaps explore -execdir for doing something in each directory that find selects with your complex, intricate, business-critical, maybe secret list of find option predicates).
Maybe also explore find2perl to do this directory recursion natively in Perl.
If you are on a Linux machine, you could try something like this:
for i in `ls /tmp/*.aspx`; do perl scriptname.pl "$i"; done
For example, to handle perl scriptname.pl *.aspx *.asp:
On Linux the shell expands wildcards, so the Perl can simply be
for (@ARGV) {
    operation($_);    # do something with each file
}
Windows doesn't expand wildcards, so expand the wildcards in each argument in Perl as follows. The for loop then processes each file in the same way as above:
for (map {glob} @ARGV) {
    operation($_);    # do something with each file
}
For example, this will print the expanded list under Windows
print "$_\n" for(map {glob} #ARGV);
You can also pass the path where you have your aspx files and read them one by one.
#!/usr/bin/perl -w
use strict;
my $path = shift;
my @files = split /\n/, `ls $path/*.aspx`;
foreach my $file (@files) {
    # do something...
}

How can I create a directory if one doesn't exist using Perl?

Currently, my Perl output is hard-coded to dump into the following Unix directory:
my $stat_dir = "/home/courses/" . **NEED DIR VAR HERE**;
The filename is built as such:
$stat_file = $stat_dir . "/" . $sess . substr($yr, 2, 2) . "_COURSES.csv";
I need a similar approach to building Unix directories, but I need to check if they exist first before creating them.
How can I do auto-numbering (revisions) of the $stat_file so that when these files get pumped into the same directory, they do not overwrite or append to existing files in the directory?
Erm... mkdir $stat_dir unless -d $stat_dir?
It really doesn't seem like a good idea to embed 'extra' questions like that.
Use the -d operator and File::Path.
use File::Path qw(make_path);

eval { make_path($dir) };
if ($@) {
    print "Couldn't create $dir: $@";
}
make_path has an advantage over mkdir in that it can create trees of arbitrary depth.
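For example (the path is only illustrative), a single call creates every missing level of the tree:
use File::Path qw(make_path);

# Creates /home/courses, /home/courses/2013 and /home/courses/2013/fall as needed;
# levels that already exist are left alone.
make_path("/home/courses/2013/fall");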
And use -e to check whether the file exists:
my $fileSuffix = 0;
while (-e $filename) {
    $filename = $filePrefix . ++$fileSuffix . $fileExtension;
}
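Combining that with the question's variables, a sketch of building a non-clobbering $stat_file could look like this (the _revN naming scheme is just one possible choice):
my $base      = $stat_dir . "/" . $sess . substr($yr, 2, 2) . "_COURSES";
my $stat_file = "$base.csv";

# If that name is taken, append _rev1, _rev2, ... until we find a free one.
my $rev = 0;
while (-e $stat_file) {
    $stat_file = $base . "_rev" . ++$rev . ".csv";
}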
Remember the directory's -d existence doesn't mean -w writable. But assuming you're in a personal area the mkdir($dir) unless(-d $dir) would work fine.
Perl has a built-in function mkdir
Take a look at perldoc perlfunc or the mkdir program from Perl Power Tools.
I believe it is safe to try to create a directory that already exists; take a look at the docs.

How can I scan multiple log files to find which ones have a particular IP address in them?

Recently there have been a few attackers trying malicious things on my server so I've decided to somewhat "track" them even though I know they won't get very far.
Now, I have an entire directory containing the server logs and I need a way to search through every file in the directory, and return a filename if a string is found. So I thought to myself, what better of a language to use for text & file operations than Perl? So my friend is helping me with a script to scan all files for a certain IP, and return the filenames that contain the IP so I don't have to search for the attacker through every log manually. (I have hundreds)
#!/usr/bin/perl
$dir = ".";
opendir(DIR, "$dir");
@files = grep(/\.*$/,readdir(DIR));
closedir(DIR);
foreach $file (@files) {
open FILE, "$file" or die "Unable to open files";
while(<FILE>) {
print if /12.211.23.200/;
}
}
although it is giving me directory read errors. Any assistance is greatly appreciated.
EDIT: Code edited, but it's still saying "permission denied, cannot open directory" on line 10. I am just going to run the script from within the logs directory, in case you are wondering about the directory change to ".".
Mike.
Can you use grep instead?
To get all the lines with the IP, I would directly use grep, no need to show a list of files, it's a simple command:
grep 12\.211\.23\.200 *
I like to pipe it to another file and then open that file in an editor...
If you insist on wanting the filenames, it's also easy
grep -l 12\.211\.23\.200 *
grep is available on all Unix/Linux systems with the GNU tools, or on Windows using one of the many implementations (unxutils, cygwin, etc.).
You have to concatenate $dirname with $filename when using files found through readdir; remember you haven't chdir'ed into the directory where those files reside.
open FH, "<", "$dirname/$filename" or die "Cannot open $filename: $!";
Incidentally, why not just use grep -r to recursively search all subdirectories under your log dir for your string?
EDIT: I see your edits, and two things. First, this line:
#files = grep(/\.*$/,readdir(DIR));
Is not effective, because you are searching for zero or more . characters at the end of the string. Since it's zero or more, it'll match everything in the directory. If you're trying to exclude files ending in ., try this:
@files = grep(!/\.$/,readdir(DIR));
Note the ! sign for negation if you're trying to exclude those files. Otherwise (if you only want those files and I'm misunderstanding your intent), leave the ! out.
In any case, if you're getting your die message on line 10, most likely you're hitting a file that has permissions such that you can't read it. Try putting the filename in the die output so you can see which file it's failing on:
open FILE, "$file" or die "Unable to open file: $file";
But as with other answers, and to reiterate: Why not use grep? The unix command, not the Perl function.
This will get the file names you are looking for in Perl, and will probably do it much faster than looping and running a Perl regex yourself.
@files = `find ~/ServerLogs -name "*.log" | xargs grep -l "<ip address>"`;
Although, this will require a *nix compliant system, or Cygwin on Windows.
Firstly get a list of files within your source directory:
opendir(DIR, "$dir");
@files = grep(/\.log$/,readdir(DIR));
closedir(DIR);
And then loop through those files
foreach $file (@files)
{
    # file processing code
}
My first suggestion would be to use grep instead. The right tool for the job, they say...
But to answer your question:
readdir just returns the filenames from the directory. You'll need to concatenate the directory name and filename together.
$path = "$dirname/$filname";
open FH, $path or die ...
Then you should ignore files that are actually directories, such as "." and "..". After getting the $path, check to see if it's a file.
if (-f $path) {
    open FH, $path or die ...;
    while (<FH>) {
        ...
    }
}
BTW, I thought I would throw in a mention for File::Next. To iterate over all files in a directory (recursively):
use Path::Class; # always useful.
use File::Next;
use feature 'say';

my $files = File::Next::files( dir(qw/path to files/) ); # look in path/to/files
while ( defined( my $file = $files->() ) ) {
    $file = file( $file );
    say "Examining $file";
    say "found foo" if $file->slurp =~ /foo/;
}
File::Next is taint-safe.
~ doesn't auto-expand in Perl.
opendir my $fh, '~/' or die("Doin It Wrong"); # Doing It Wrong.
opendir my $fh, glob('~/') and die( "Thats right!" );
Also, if you must use readdir(), make sure you guard the expression thus:
while (defined(my $filename = readdir(DH))) {
...
}
If you don't do the defined() test, the loop will terminate if it finds a file called '0'.
Have you looked on CPAN for log parsers? I searched with 'log parse' and it yielded over 200 hits. Some (probably many) won't be relevant - some may be. It depends, in part, on which web server you are using.
Am I reading this right? Your line 10 that gives you the error is
open FILE, "$file" or die "Unable to open files";
And the $file you are trying to read, according to line 6,
@files = grep(/\.*$/,readdir(DIR));
is a file that ends with zero or more dots. Is this what you really wanted? This basically matches every file in the directory, including "." and "..". Maybe you don't have enough permission to open the parent directory for reading?
EDIT: if you only want to read all files (including hidden ones), you might want to use something like the following:
opendir(DIR, ".");
#files = readdir(DIR);
closedir(DIR);
foreach $file (#files) {
if ($file ne "." and $file ne "..") {
open FILE, "$file" or die "cannot open $file\n";
# do stuff with FILE
}
}
Note that this doesn't take care of sub directories.
I know I am way late to this discussion (ran across it while searching for grep related posts) but I am going to answer anyway:
It isn't specified clearly if these are web server logs (Apache, IIS, W3SVC, etc.) but the best tool for mining those for data is the LogParser tool from Microsoft. See logparser.com for more info.
LogParser will allow you to write SQL-like statements against the log files. It is very flexible and very fast.
Use perl from the command line, like a better grep
perl -wnl -e '/12\.211\.23\.200/ and print;' *.log > output.txt
The benefit here is that you can chain logic far more easily:
perl -wnl -e '( /12\.211\.23\.2(0[1-9]|1[01])/ or /denied/i ) and print;' *.log
If you are feeling wacky, you can also use more advanced command-line options to feed one Perl one-liner's results into other Perl one-liners.
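For instance (a sketch using the same example IP), one one-liner can emit the file name for every matching line and a second can tally the hits per file:
perl -wnl -e 'print $ARGV if /12\.211\.23\.200/;' *.log \
  | perl -wnl -e '$count{$_}++; END { print "$_: $count{$_}" for sort keys %count }'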
You really need to read "Minimal Perl: For UNIX and Linux People", an awesome book on this very sort of thing.
First, use grep.
But if you don't want to, here are two small improvements you can make that I haven't seen mentioned yet:
1) Change:
@files = grep(/\.*$/,readdir(DIR));
to
@files = grep({ !-d "$dir/$_" } readdir(DIR));
This way you will exclude not just "." and ".." but also any other subdirectories that may exist in the server log directory (which the open downstream would otherwise choke on).
2) Change:
print if /12.211.23.200/;
to
print if /12\.211\.23\.200/;
"." is a regex wildcard meaning "any character". Changing it to "\." will reduce the number of false positives (unlikely to change your results in practice but it's more correct anyway).