Attempting to pull just filename from backtick - perl

I am trying to pull just a filename out of the output of a backtick command. My code is as follows:
$var = `munpack -f filename`;
If anyone is familiar with mpack the output will be something like:
tempdesc.txt: File exists
file_20130620.zip (application/octet-stream)
I am trying to get just the filename, but every regex I have tried has failed. I even tried removing the line breaks first, on the theory that they were just whitespace, and those regexes failed too. I can list every regex I have tried if necessary, but maybe someone already has something that works. So far I can neither produce the match I'd like nor alter the output in any way. To be clear, I'm looking for something that outputs just the filename, e.g. file_20130620.zip
Some suggestions given with output:
$var =~ m{^(.+?)\(}m and print "$1\n";
output:
tempdesc.txt: File exists
file_20130620.zip
($filename) = $var =~ /(?s:.*\n)?(.*) \([^)]+\)\n/;
output:
tempdesc.txt: File exists
file_20130620.zip
if($var =~/\S+: [^\n]+\n(\S+) [^\n]+\n/) { printf $1; }
output:
tempdesc.txt: File exists
Fix per ysth:
$var = `munpack -f filename 2>/dev/null`; #will remove 'tempdesc.txt: File exists'

Assuming the filename is followed by a space and a parenthesized MIME type on the last line of output:
($filename) = $var =~ /(?s:.*\n)?(.*) \([^)]+\)\n/;
Though rather than parse the output, I'd create a temporary directory (using File::Temp), unpack into it, and just look at what file(s) are there.
It is possible that the File Exists warning isn't actually in $var, but is appearing in your output because munpack is writing it to stderr (which doesn't get captured by backticks.)
Try doing munpack -q -f ... or munpack -f ... 2>/dev/null.

If you can assume that the filename is followed by a parenthesized description, something like this works:
$var =~ m{^(.+?)\(}m and print "$1\n";
The /m modifier treats the string as multiple lines, so that ^ and $ can match at the start and end of any line. See perlre.
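As a minimal sketch of the multi-line match described above, the following runs the regex against a hard-coded copy of the sample output instead of calling munpack. The string in $var is an assumption standing in for the real command output, and the pattern assumes the filename contains no spaces.

```perl
use strict;
use warnings;

# Stand-in for the backtick output shown in the question (assumption:
# both lines end up in the captured string).
my $var = "tempdesc.txt: File exists\n"
        . "file_20130620.zip (application/octet-stream)\n";

# With /m, ^ and $ anchor to each line, so only a line of the form
# "name (type)" can match; the warning line is skipped.
my ($filename) = $var =~ /^(\S+) \([^)]+\)$/m;

print "$filename\n" if defined $filename;
```

Anchoring both ends of the line is what keeps the "File exists" warning from being captured along with the filename.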

I put your sample output into file1.txt, as I don't have the mpack utility installed.
This regexp works:
#!/usr/bin/perl
my $var = `more file1.txt`;
if ($var =~ /\S+: [^\n]+\n(\S+) [^\n]+\n/)
{
    # print rather than printf, so a % in the filename is not treated as a format
    print $1;
}

How to rename multiple files in a folder with a specific format?

I have many files in a folder with the format '{galaxyID}-cutout-HSC-I-{#}-pdr2_wide.fits', where {galaxyID} and {#} are different numbers for each file. Here are some examples:
2185-cutout-HSC-I-9330-pdr2_wide.fits
992-cutout-HSC-I-10106-pdr2_wide.fits
2186-cutout-HSC-I-9334-pdr2_wide.fits
I want to change the format of all files in this folder to match the following:
2185_HSC-I.fits
992_HSC-I.fits
2186_HSC-I.fits
namely, I want to take out "cutout", the second number, and "pdr2_wide" from each file name. I would prefer to do this in either Perl or Python. For my Perl script, so far I have the following:
rename [-n];
my @parts=split /-/;
my $this=$parts[0].$parts[1].$parts[2].$parts[3].$parts[4].$parts[5];
$_ = $parts[0]."_".$parts[2]."_".$parts[3];
*fits
which gives me the error message
Not enough arguments for rename at ./rename.sh line 3, near "];" Execution of ./rename.sh aborted due to compilation errors.
I included the [-n] because I want to make sure the changes are what I want before actually doing it; either way, this is in a duplicated directory just for safety.
It looks like you are using the rename you get on Ubuntu (it's not the one that's on my ArchLinux box), but there are other ones out there. But, you've presented it oddly. The brackets around -n shouldn't be there and the ; ends the command.
The syntax, if you are using what I think you are, is this:
% rename -n -e PERL_EXPR file1 file2 ...
The Perl expression is the argument to the -e switch, and can be a simple substitution. Note that this expression is a string that you give to -e, so that probably needs to be quoted:
% rename -n -e 's/-\d+-pdr2_wide//' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-cutout-HSC-I.fits)
And, instead of doing this in one step, I'd do it in two:
% rename -n -e 's/-cutout-/-/; s/-\d+-pdr2_wide//' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-HSC-I.fits)
There are other patterns that might make sense. Instead of taking away parts, you can keep parts:
% rename -n -e 's/\A(\d+).*(HSC-I).*/$1-$2.fits/' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-HSC-I.fits)
I'd be inclined to use named captures so the next poor slob knows what you are doing:
% rename -n -e 's/\A(?<galaxy>\d+).*(HSC-I).*/$+{galaxy}-$2.fits/' *.fits
rename(2185-cutout-HSC-I-9330-pdr2_wide.fits, 2185-HSC-I.fits)
From your description {galaxyID}-cutout-HSC-I-{#}-pdr2_wide.fits, I assume that cutout-HSC-I is fixed.
Here's a script that will do the rename. It takes a list of files on stdin. But, you could adapt to take the output of readdir:
#!/usr/bin/perl
master(@ARGV);
exit(0);
sub master
{
    my($oldname);
    while ($oldname = <STDIN>) {
        chomp($oldname);
        # find the file extension/suffix
        my($ix) = rindex($oldname,".");
        next if ($ix < 0);
        # get the suffix
        my($suf) = substr($oldname,$ix);
        # only take filenames of the expected format
        next unless ($oldname =~ /^(\d+)-cutout-(HSC-I)/);
        # get the new name
        my($newname) = $1 . "_" . $2 . $suf;
        printf("OLDNAME: %s NEWNAME: %s\n",$oldname,$newname);
        # rename the file
        # change to "if (1)" to actually do it
        if (0) {
            rename($oldname,$newname) or
                die("unable to rename '$oldname' to '$newname' -- $!\n");
        }
    }
}
For your sample input file, here's the program output:
OLDNAME: 2185-cutout-HSC-I-9330-pdr2_wide.fits NEWNAME: 2185_HSC-I.fits
OLDNAME: 992-cutout-HSC-I-10106-pdr2_wide.fits NEWNAME: 992_HSC-I.fits
OLDNAME: 2186-cutout-HSC-I-9334-pdr2_wide.fits NEWNAME: 2186_HSC-I.fits
The above is how I usually do things but here's one with just a regex. It's fairly strict in what it accepts [for safety], but you can adapt as desired:
#!/usr/bin/perl
master(@ARGV);
exit(0);
sub master
{
    my($oldname);
    while ($oldname = <STDIN>) {
        chomp($oldname);
        # only take filenames of the expected format
        next unless ($oldname =~ /^(\d+)-cutout-(HSC-I)-\d+-pdr2_wide([.].+)$/);
        # get the new name
        my($newname) = $1 . "_" . $2 . $3;
        printf("OLDNAME: %s NEWNAME: %s\n",$oldname,$newname);
        # rename the file
        # change to "if (1)" to actually do it
        if (0) {
            rename($oldname,$newname) or
                die("unable to rename '$oldname' to '$newname' -- $!\n");
        }
    }
}
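For comparison, here is a sketch that combines the named captures suggested earlier with the strict pattern, and reads the directory with glob instead of stdin. The helper name new_name and the $dry_run flag are made up for illustration, and the script assumes it is run from inside the folder holding the files.

```perl
use strict;
use warnings;

# Map an old name to the new one, or return nothing if the name does
# not have the expected format. (new_name is a hypothetical helper.)
sub new_name {
    my ($old) = @_;
    return unless $old =~ /\A(?<galaxy>\d+)-cutout-(?<band>HSC-I)-\d+-pdr2_wide(?<suffix>\.fits)\z/;
    return "$+{galaxy}_$+{band}$+{suffix}";
}

my $dry_run = 1;    # set to 0 to actually rename

for my $old (glob '*-cutout-HSC-I-*-pdr2_wide.fits') {
    my $new = new_name($old) or next;
    print "$old -> $new\n";
    next if $dry_run;
    # refuse to clobber an existing file
    die "'$new' already exists\n" if -e $new;
    rename($old, $new) or die "unable to rename '$old' to '$new' -- $!\n";
}
```

Keeping the name mapping in its own function makes it easy to check against a few sample names before any file is touched.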

How to grep a variable which stores a full text file and print matching lines

Hi, I have been trying to run some code in which a variable, $logs, holds all my Linux logs.
Now I want to grep that variable for a pattern and print the whole of every line that contains the pattern.
Anyway, here is my code:
my $logs = $ssh->exec("cat system_logs");
my $search = "pattern";
if(grep($search,$logs))
{
    # This is where I want to print the matched lines.
    # I want to print the whole line. What command should I use?
}
Any help is greatly appreciated.
Try this:
foreach (grep(/$search/, split(/\n/, $logs))) {
    print $_."\n";
}
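A self-contained sketch of the same idea, using a made-up log string in place of the $ssh->exec result. The sample lines and the search term are assumptions for illustration; \Q...\E is added so the search term is matched literally even if it contains regex metacharacters.

```perl
use strict;
use warnings;

# Stand-in for the text returned by $ssh->exec("cat system_logs").
my $logs = join "\n",
    'Jan 1 boot ok',
    'Jan 1 disk error on sda',
    'Jan 2 login ok',
    'Jan 2 net error (eth0)';

my $search = 'error';

# Split the text into lines, then keep the lines containing the term.
my @matches = grep { /\Q$search\E/ } split /\n/, $logs;

print "$_\n" for @matches;
```

The original grep($search, $logs) is always true for a non-empty $search because the first argument is evaluated as a plain expression, not as a pattern applied to each line.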

Weird behavior with Perl string concatenation

I'm working on a pretty simple script, reading a maplist.txt file and using the \n separated map names in it to build a command string - however, I'm getting some unexpected behavior.
My full code:
# compiles a map pack from maplist.txt
# for every server.
# Filipe Dobreira <dobreira@gmail.com>
# v1 @ Sept. 2011
use strict;
my @servers = <*>;
foreach my $server (@servers)
{
    # we only want folders:
    next if -f $server;
    print "server: $server\n";
    my $maplist = $server . '/orangebox/cstrike/maplist.txt';
    my $mapdir = $server . '/orangebox/cstrike/maps';
    print " maplist: $maplist\n";
    print " map folder: $mapdir\n";
    # check if the maplist actually exists:
    if(!(-e $maplist))
    {
        print "!!! failed to find $maplist\n";
        next;
    }
    open MAPLIST, "<$maplist";
    foreach my $map (<MAPLIST>)
    {
        chomp($map);
        next if !$map;
        # full path to the map file:
        my $mapfile = "$mapdir/$map.bsp";
        print "$mapfile\n";
    }
}
Where I declare $mapfile, I expect the result to be something like:
zombieescape1/orangebox/cstrike/maps/ze_stargate_escape_v8.bsp
However, it seems like the concatenation is being made to the START of the string, and the final result ends up being something like:
.bspiescape1/orangebox/cstrike/maps/ze_stargate_escape_v8
So the .bsp portion is actually being written over the start of the leftmost string. I have very little perl experience, and I can only assume this is me failing to understand some quirk or operator behavior.
Note: I've also tried using "${mapdir}/${map}.bsp", concatenating everything with the dot operator, and a join "", $mapdir, $map, ".bsp", with the same result.
Thanks in advance.
PS: for reference, here's what a maplist.txt looks like:
zm_3dubka_v3
zm_4way_tunnel_v2
zm_abstractchode_pyramid2
zm_anotheruglyzmap_v1e
zm_app7e_betterbworld_JDfix_v3
zm_atix_helicopter_mini
zm_base_winter_beta3
zm_battleforce_panic_ua
zm_black_lion_macd_v8
zm_bunker_f57_v2
zm_burbsdelchode_b3
zm_choddarena_b12
zm_choddasnowpanic_b4
zm_citylife_V2b
zm_crazycity
zm_deep_thought_nv
zm_desert_fortress_v2
ZM_desprerados_a1
zm_doomlike_station_v2
zm_dust_arena_v1_final
zm_exhibit_night_2F
zm_facility_v1
zm_farm3_nav72
zm_firewall_samarkand
zm_fortress_b7
zm_ghs_flats
zm_gl33m4x_errata
zm_idm_hauntedhouse_v1
zm_industry_v2
zm_kruma_kakariko_village_006
zm_kruma_panic_004
zm_lila_off!ce_v4
zm_little_city_v5pf_fix
zm_moonlight_v3_pF
zm_moon_roflicious_pF_02
zm_moocbblechode_b2
zm_mountain_b2
zm_neko_abura_v2
zm_neko_athletic_park_v2
zm_novum_v3_JDfix
zm_ocx_orly_v4
zm_officeattack_b5a
zm_officerush_betav7
zm_officesspace_pfss
zm_omi_facility_pfv2
zm_penumbra_PF3
zm_raindance_ak_v2
zm_roflicious_pfcf2
zm_roy_abandoned_canals_new
zm_roy_barricade_factory
zm_roy_highway
zm_roy_industrial_complex
zm_roy_old_industrial_pF
zm_roy_the_ship_pf
zm_roy_zombieranch_night_b4
zm_survival_f2a
zm_temple_v3pf
zm_towers_v3
zm_tx_highschool_zkedit_v2
zm_unpanicv2_pF
zm_vc2_office_redone_b1
zm_wasteyard_beta3
zm_winterfun_b4a
zm_wtfhax_v6
zm_wtfhax_v6e
zm_wwt_twinsteel_v8
I'd guess that maplist.txt has non-Unix line endings, probably DOS (CRLF), and as a result you see what looks like prepending.
The problem is that chomp() only removes the trailing newline, leaving the carriage return behind. When the string is printed, the carriage return sends the cursor back to the start of the line, so .bsp overwrites the first characters.
You might find that if you set the Perl special variable $/ (the input record separator) before opening the map list, chomp then does the job: it will consume both line-ending characters.
$/ = qq{\r\n};
Another solution would be to convert the line endings in the file before processing, perhaps using dos2unix.
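A third option, sketched here as an alternative to the two suggestions above, is to strip the line ending with a substitution that accepts either CRLF or LF, so the script works whichever way the file was saved. The two input lines are hard-coded stand-ins for lines read from maplist.txt.

```perl
use strict;
use warnings;

my $mapdir = 'zombieescape1/orangebox/cstrike/maps';

# Stand-ins for lines read from maplist.txt: one DOS-style, one Unix-style.
my @raw = ("ze_stargate_escape_v8\r\n", "zm_towers_v3\n");

my @mapfiles;
for my $line (@raw) {
    # Remove a trailing LF together with an optional preceding CR.
    (my $map = $line) =~ s/\r?\n\z//;
    next if $map eq '';
    push @mapfiles, "$mapdir/$map.bsp";
}

print "$_\n" for @mapfiles;
```

Unlike changing $/, this leaves the global record separator alone, at the cost of one extra substitution per line.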

Read from a file and compare the content with a variable

#!/usr/bin/perl
some code........
..................
system ("rpm -q iptables > /tmp/checkIptables");
my $iptables = open FH, "/tmp/checkIptables";
The above code is meant to check whether iptables is installed on your Linux machine. If it is installed, the command rpm -q iptables produces output like this:
iptables-1.4.7-3.el6.x86_64
Now I have redirected this output to the file named as checkIptables.
Now I want to check whether the variable $iptables matches with the output given above or not. I do not care about version numbers.
It should be something like
if ($iptables eq iptables*){
...............
.......................}
But iptables* gives error.
You could use a regex to check the string:
$iptables =~ /^iptables/
Also, you do not need a tmp file, you can instead open a pipe:
use strict;
use warnings;
use autodie;
open my $fh, '-|', "rpm -q iptables";
my $line = <$fh>;
if ($line =~ /^iptables/) {
    print "iptables is installed";
}
This will read the first line of the output, and check it against the regex.
Or you can use backticks:
my $lines = `rpm -q iptables`;
if ($lines =~ /^iptables/) {
    print "iptables is installed";
}
Note that backticks may return more than one line of data, so you may need to compensate for that.
I think what you're looking for is a regular expression or a "pattern match". You want the string to match a pattern, not a particular thing.
if ( $iptables =~ /^iptables\b/ ) {
...
}
=~ is the binding operator and tells the supplied regular expression that its source is that variable. The regular expression simply says look at the beginning of the string for the sequence "iptables" followed by a "word-break". Since '-' is a "non-word" character (not alphanumeric or '_') it breaks the word. You could use '-' as well:
/^iptables-/
But you can probably do the whole thing with this statement:
use strict;
use warnings;
use List::MoreUtils qw<any>;
...
if ( any { m/^iptables-/ } `rpm -q iptables` ) {
...
}
This pipes the output directly into a list via backticks and searches through that list with any (see List::MoreUtils::any).
Why not just check the exit status of "rpm -q", which is 0 if the package is installed and 1 if it is not?
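A rough sketch of the exit-status approach follows. The helper name command_succeeds is made up for illustration, and the demonstration runs the current perl binary ($^X) rather than rpm so that it works on machines without rpm; the rpm call itself is shown only in a comment.

```perl
use strict;
use warnings;

# True when the command ran and exited with status 0.
# (command_succeeds is a hypothetical helper name.)
sub command_succeeds {
    my @cmd = @_;
    my $status = system(@cmd);    # list form: no shell involved
    return $status == 0;
}

# Demonstration with commands whose exit status is known in advance.
print command_succeeds($^X, '-e', 'exit 0') ? "ok\n" : "failed\n";
print command_succeeds($^X, '-e', 'exit 1') ? "ok\n" : "failed\n";

# For the question, the check would be along the lines of:
#   if (command_succeeds('rpm', '-q', 'iptables')) { ... }
```

Note that system lets the command's own output through to the terminal, so rpm would still print the package name unless its output is redirected.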

Perl Regular Expressions + delete line if it starts with #

How to delete lines if they begin with a "#" character using Perl regular expressions?
For example (need to delete the following examples)
line="#a"
line=" #a"
line="# a"
line=" # a"
...
the required syntax
$line =~ s/......../..
or skip loop if line begins with "#"
from my code:
open my $IN ,'<', $file or die "can't open '$file' for reading: $!";
while( defined( $line = <$IN> ) ){
.
.
.
You don't delete lines with s///. (In a loop, you probably want next;)
In the snippet you posted, it would be:
while (my $line = <$IN>) {
    if ($line =~ /^\s*#/) { next; }
    # will skip the rest of the code if a line matches
    ...
}
Shorter forms /^\s*#/ and next; and next if /^\s*#/; are possible.
perldoc perlre
/^\s*#/
^ - "the beginning of the line"
\s - "a whitespace character"
* - "0 or more times"
# - just a #
Based off Aristotle Pagaltzis's answer you could do:
perl -ni.bak -e'print unless m/^\s*#/' deletelines.txt
Here, the -n switch makes perl put a loop around the code you provide
which will read all the files you pass on the command line in
sequence. The -i switch (for “in-place”) says to collect the output
from your script and overwrite the original contents of each file with
it. The .bak parameter to the -i option tells perl to keep a backup of
the original file in a file named after the original file name with
.bak appended. For all of these bits, see perldoc perlrun.
deletelines.txt (initially):
#a
b
#a
# a
c
# a
becomes:
b
c
Program (Cut & paste whole thing including DATA section, adjust shebang line, run)
#!/usr/bin/perl
use strict;
use warnings;
while(<DATA>) {
    next if /^\s*#/; # skip comments
    print; # process data
}
__DATA__
# comment
data
# another comment
more data
Output
data
more data
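$text =~ s/^\s*#.*\n//mg;
That will delete every line that begins with # (after optional whitespace) from the whole file held in $text, without requiring you to loop through each line of the text manually. The /m modifier is needed so that ^ matches at the start of every line rather than only at the start of the string.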
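A small runnable sketch of the whole-string approach, using the contents of deletelines.txt from the earlier answer as a hard-coded string. It swaps \s for \h (horizontal whitespace, Perl 5.10 or later), which is my own adjustment: with /m, \s* can also swallow a preceding blank line, while \h* cannot.

```perl
use strict;
use warnings;

# Stand-in for a file slurped into one string.
my $text = "#a\nb\n #a\n# a\nc\n  # a\n";

# Copy, then delete every line whose first non-blank character is '#'.
# \n? also covers a final comment line with no trailing newline.
(my $stripped = $text) =~ s/^\h*#.*\n?//mg;

print $stripped;
```

Working on a copy keeps the original text available in case the comment lines are needed later.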