s3cmd list of contents - only filenames - perl one liner?

Currently I'm using s3cmd ls s3://location/ > file.txt to get a list of the contents of my S3 bucket and save it to a text file. However, this returns dates, file sizes, paths and filenames.
For example:
2011-10-18 08:52 6148 s3://location//picture_1.jpg
I only need the filenames in the S3 bucket, so in the above example I only need picture_1.jpg.
Any suggestions?
Could this be done with a Perl one liner maybe after the initial export?

Use awk to print the fourth column, which holds the full s3:// path:
s3cmd ls s3://location/ | awk '{ print $4 }' > file.txt
If you have filenames with spaces, try:
s3cmd ls s3://location/ | awk '{ s = ""; for (i = 4; i <= NF; i++) s = s $i " "; print s }' > file.txt
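If only the last path component is wanted, a plain shell loop can strip everything up to the final slash. This is a minimal sketch, assuming the four-column layout shown in the question; the function name s3_basenames and the sample listing are made up for illustration.

```shell
# Hypothetical helper: reads "date time size path" lines on stdin and
# prints only the part of the path after the last slash.
s3_basenames() {
    # The last variable given to read receives the rest of the line,
    # so paths that contain spaces stay in one piece.
    while read -r _date _time _size path; do
        printf '%s\n' "${path##*/}"
    done
}

# Sample data in the format from the question (the second name is invented).
listing='2011-10-18 08:52 6148 s3://location//picture_1.jpg
2011-10-18 08:52 6148 s3://location//my picture.jpg'

names=$(printf '%s\n' "$listing" | s3_basenames)
printf '%s\n' "$names"
```

With real data the pipeline would be s3cmd ls s3://location/ | s3_basenames > file.txt.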

File::Listing does not support this format because the designers of this listing format were stupid enough to not simply reuse an existing one. Let's parse it manually instead.
use URI;
my @ls = (
    "2011-10-18 08:52 6148 s3://location//picture_1.jpg\n",
    "2011-10-18 08:52 6148 s3://location//picture_2.jpg\n",
    "2011-10-18 08:52 6148 s3://location//picture_3.jpg\n",
);
for my $line (@ls) {
    chomp $line;
    # last whitespace-separated column is the URI; its last path segment is the filename
    my $basename = (URI->new((split q( ), $line)[-1])->path_segments)[-1];
    print "$basename\n";
}
__END__
picture_1.jpg
picture_2.jpg
picture_3.jpg
As oneliner:
perl -mURI -lne 'print((URI->new((split q( ), $_)[-1])->path_segments)[-1])' < input

I am sure a specific module is the safer option, but if the data is reliable, you can get away with a one-liner:
Assuming the input is:
2011-10-18 08:52 6148 s3://location//picture_1.jpg
2011-10-18 08:52 6148 s3://location//picture_2.jpg
2011-10-18 08:52 6148 s3://location//picture_3.jpg
...
The one-liner:
perl -lnwe 'print for m#(?<=//)([^/]+)$#'
-l chomps the input, and adds newline to end of print statements
-n adds a while(<>) loop around the script
(?<=//) lookbehind assertion finds a double slash
...followed by non-slashes to the end of the line
The for loop assures us that non-matches are not printed.
The benefit of the -n option is that this one-liner may be used in a pipe, or on a file.
command | perl -lnwe '...'
perl -lnwe '...' filename
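The same matching rule (a double slash followed only by non-slashes up to the end of the line) can be sketched with sed as well. This is an illustration rather than a replacement for the Perl one-liner; the function name and the second sample line are assumptions added to show that non-matching lines are skipped.

```shell
# Hypothetical helper name. Prints the text after the last "//" when no
# further slash follows; lines that do not match print nothing (-n ... p).
names_after_double_slash() {
    sed -n 's|.*//\([^/][^/]*\)$|\1|p'
}

# The second line uses a single slash before the filename, so the only "//"
# (after "s3:") is followed by another slash and the line does not match.
listing='2011-10-18 08:52 6148 s3://location//picture_1.jpg
2011-10-18 08:52 6148 s3://location/picture_2.jpg'

matched=$(printf '%s\n' "$listing" | names_after_double_slash)
printf '%s\n' "$matched"
```

This also shows the limit of the pattern: it relies on the listing having a double slash directly before the filename.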

Related

Passing bash variable to perl one-liners - insights required

I am fairly new to Perl and I have been toying with one-liners to get some file operations done. I am using Perl to print the segment of a file defined by line numbers that are obtained from another file. My current issue is as follows:
export var=10 ; perl -ne 'print $_ if $. == $ENV{var}' filename.txt
This prints line 10. To print from line 10 to the end of the file, I tried:
export var=10 ; perl -ne 'print if $ENV{var} .. -1' filename.txt
This fails: it prints the whole file. The following works:
export var=10 ; perl -ne 'print if $. >= $ENV{var} && $. <= $ENV{var}+5 ' filename.txt
But since the number of lines after the required line varies, this is not a viable solution.
Perl's flip-flop operator has some warts of its own: only a constant operand is implicitly compared with the line number $., whereas a variable such as $ENV{var} is evaluated as a plain boolean. When in doubt, compare against $. explicitly:
export var=10 ; perl -ne 'print if $.== $ENV{var} .. -1' filename.txt
You don't need to use environment variables:
var=10; echo "$(seq 20 35)" | perl -lne 'print if $. >= '"$var"';'
29
30
31
32
33
34
35
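Take a look at the way I spliced in $var: the single-quoted Perl code is closed, "$var" is expanded by the shell, and then the single quotes are reopened.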
Using flip-flop:
var=10; echo "$(seq 20 35)" | perl -lne 'print if $.== '"$var"' .. -1;'
29
30
31
32
33
34
35
From the line after line 10 to the end of the file:
export var=10 ; perl -ne 'print $_ if $. > $ENV{var}' filename.txt
If you want it to include line 10:
export var=10 ; perl -ne 'print $_ if $. >= $ENV{var}' filename.txt
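For comparison, the same "from line N to the end" selection can be sketched without Perl by handing the shell variable to sed as the start of an address range. The function name print_from_line and the sample numbers are only for illustration.

```shell
# Hypothetical helper: print stdin from line number $1 through the last line.
# Inside double quotes the shell expands $1, while \$ stays a literal "$",
# which is sed's address for the last line.
print_from_line() {
    sed -n "$1,\$p"
}

var=4
tail_part=$(printf '%s\n' 20 21 22 23 24 25 | print_from_line "$var")
printf '%s\n' "$tail_part"   # prints 23, 24 and 25, one per line
```

As with the Perl versions above, the start line is inclusive.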

How to add blank line after every grep result using Perl?

How to add a blank line after every grep result?
For example, grep -o "xyz" may give something like -
file1:xyz
file2:xyz
file2:xyz2
file3:xyz
I want the output to be like this -
file1:xyz

file2:xyz
file2:xyz2

file3:xyz
I would like to do something like
grep "xyz" | perl (code to add a new line after every grep result)
This is the direct answer to your question:
grep 'xyz' | perl -pe 's/$/\n/'
But this is better:
perl -ne 'print "$_\n" if /xyz/'
EDIT
Ok, after your edit, you want (almost) this:
grep 'xyz' * | perl -pe 'print "\n" if /^([^:]+):/ && ! $seen{$1}++'
If you don’t like the blank line at the beginning, make it:
grep 'xyz' * | perl -pe 'print "\n" if /^([^:]+):/ && ! $seen{$1}++ && $. > 1'
NOTE: This won’t work right on filenames with colons in them. :)
If you want to use perl, you could do something like
grep "xyz" | perl -p -e 's/(.*)/$1\n/'
If you want to use sed (where I seem to have gotten better results), you could do something like
grep "xyz" | sed 's/.*/\0\n/g'
This prints a newline after every single line of grep output:
grep "xyz" | perl -pe 'print "\n"'
This prints a newline in between results from different files. (Answering the question as I read it.)
grep 'xyz' * | perl -pe '/(.*?):/; if ($f ne $1) {print "\n"; $f=$1}'
Use a state machine to determine when to print a blank line:
#!/usr/bin/env perl
use strict;
use warnings;
# state variable to determine when to print a blank line
my $prev_file = '';
# change DATA to the appropriate input file handle
while( my $line = <DATA> ){
    # did the state change?
    if( my ( $file ) = $line =~ m{ \A ([^:]*) \: .*? xyz }msx ){
        # blank lines between states
        print "\n" if $file ne $prev_file && length $prev_file;
        # set the new state
        $prev_file = $file;
    }
    # print every line
    print $line;
}
__DATA__
file1:xyz
file2:xyz
file2:xyz2
file3:xyz
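The same state-machine idea can be sketched as a shell filter: remember the filename prefix of the previous line and emit a blank line whenever it changes. This is a rough sketch under the same assumption as the answers above (no colons in filenames); the function name is made up.

```shell
# Hypothetical helper: reads "file:match" lines on stdin and prints a blank
# line between groups of lines that come from different files.
blank_between_files() {
    prev=''
    while IFS= read -r line; do
        file=${line%%:*}              # text before the first colon
        if [ -n "$prev" ] && [ "$file" != "$prev" ]; then
            printf '\n'               # the file changed: separate the groups
        fi
        printf '%s\n' "$line"
        prev=$file
    done
}

grouped=$(printf '%s\n' file1:xyz file2:xyz file2:xyz2 file3:xyz | blank_between_files)
printf '%s\n' "$grouped"
```

In practice it would sit at the end of the pipeline, as in grep 'xyz' * | blank_between_files.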

Unix/Perl - Remove contents of a file before a pattern

I have a file like this
### SECTION 1 ###
data data
data data
### SECTION 2 ###
data data
data data
Now I want everything before SECTION 2 to be removed.
How can I do this in Perl or Unix?
To edit the file in-place:
perl -i -ne 'print if /SECTION 2/..0' file
perl -ne '$m = 1 if $_ =~ /SECTION 2/ ; next unless $m ; print $_;' filename > newfilename
$ perl -pi -e '$_ = "" unless /SECTION 2/ .. /(*FAIL)/' file
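On the Unix side, a sed address range gives the same result as the flip-flop one-liners: print from the first line matching the pattern through the last line. A minimal sketch, assuming the pattern contains no slash; the function name from_pattern is hypothetical and the sample text is the file from the question.

```shell
# Hypothetical helper: print stdin from the first line matching the basic
# regular expression $1 through the end of input.
from_pattern() {
    sed -n "/$1/,\$p"
}

sample='### SECTION 1 ###
data data
data data
### SECTION 2 ###
data data
data data'

kept=$(printf '%s\n' "$sample" | from_pattern 'SECTION 2')
printf '%s\n' "$kept"
```

Unlike perl -i, this writes to standard output, so redirect it to a new file and move that over the original if the change should be made in place.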

want to read file line by line and then want to cut the line on delimiter

cat $INPUT_FILE| while read LINE
do
abc=cut -d ',' -f 4 $LINE
Perl:
cat $INPUT_FILE | perl -lne '{my @fields = split /,/; print $fields[3];}'
The key is to use command substitution if you want the output of a command saved in a variable.
POSIX shell (sh):
while read -r LINE
do
abc=$(printf '%s\n' "$LINE" | cut -d ',' -f 4)
done < "$INPUT_FILE"
If you're using a legacy Bourne shell, use backticks instead of the preferred $():
abc=`printf '%s\n' "$LINE" | cut -d ',' -f 4`
In some shells, you may not need to use an external utility.
Bash, ksh, zsh:
while read -r LINE
do
IFS=, read -r f1 f2 f3 abc remainder <<< "$LINE"
done < "$INPUT_FILE"
or
while read -r LINE
do
IFS=, read -r -a array <<< "$LINE"
abc=${array[3]}
done < "$INPUT_FILE"
or
saveIFS=$IFS
while read -r LINE
do
IFS=,
array=($LINE)
IFS=$saveIFS
abc=${array[3]}
done < "$INPUT_FILE"
Bash:
while IFS= read -r line ; do
    cut -d, -f4 <<<"$line"
done < "$INPUT_FILE"
Straight Perl:
open(my $fh, '<', $INPUT_FILE) or die "Could not open $INPUT_FILE: $!";
while (my $line = <$fh>) {
    chomp $line;
    my @fields = split /,/, $line;
    my $use_this_field_value = $fields[3];
    # do something with field value here
}
close $fh;

awk or perl one-liner to print line if second field is longer than 7 chars

I have a file of 1000 lines, each line has 2 words, separated by a space. How can I print each line only if the last word length is greater than 7 chars? Can I use awk RLENGTH? is there an easy way in perl?
@OP, awk's RLENGTH is only set when you call the match() function. Use the length() function to check the number of characters instead:
awk 'length($2)>7' file
if you are using bash, a shell solution
while read -r a b
do
if [ "${#b}" -gt 7 ];then
echo "$a $b"
fi
done <"file"
perl -ane 'print if length($F[1]) > 7'
You can do:
perl -ne '@a=split/\s+/; print if length($a[1]) > 7' input_file.txt
Options used:
-n assume 'while (<>) { ... }' loop around program
-e 'command' one line of program (several -e's allowed, omit programfile)
You can use the auto-split option as used by Chris
-a autosplit mode with -n or -p (splits $_ into @F)
perl -ane 'length $F[1] > 7 && print' input_file
perl -lane 'print if (length($F[$#F]) > 7)' fileName
or
perl -pae '$_ = "" if (length($F[$#F]) <= 7)' fileName