Split, insert and join

Here's what I want to achieve: split a comma-separated one-liner, append @domain.com to each item, then join it back into a comma-separated string.
The one-liner contains something like:
username1,username2,username3
and I want it to become something like:
username1@domain.com,username2@domain.com,username3@domain.com
Here is the Perl script I tried, which doesn't work properly:
my $var ='username1,username2,username3';
my @tkens = split /,/, $var;
my @user;
foreach my $tken (@tkens) {
push (@user, "$tken\@domain.com");
}
my $to = join(',',@user);
Is there a shortcut for this in Perl? Please post a sample. Thanks.

Split, transform, stitch:
my $var ='username1,username2,username3';
print join ",", map { "$_\@domain.com" } split(",", $var);
# ==> username1@domain.com,username2@domain.com,username3@domain.com
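As a minimal sketch, the same split/map/join chain can be wrapped in a small helper so the domain is not hard-coded. The name add_domain and its two parameters are illustrative and do not come from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: append "@$domain" to every comma-separated item.
sub add_domain {
    my ($list, $domain) = @_;
    # split breaks the string apart, map rewrites each piece,
    # and join stitches the pieces back together.
    return join ',', map { "$_\@$domain" } split /,/, $list;
}

print add_domain('username1,username2,username3', 'domain.com'), "\n";
```

Reading the chain right to left mirrors the order in which the steps run: split first, then map, then join.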

You could also use a regular expression substitution:
#!/usr/bin/perl
use strict;
use warnings;
my $var = "username1,username2,username3";
# Replace every comma (and the end of the string) with @domain.com followed by a comma
$var =~ s/$|,/\@domain.com,/g;
# Remove extra comma after last item
chop $var;
print "$var\n";
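A possible variation, sketched here under the assumption that items never contain commas, is to capture each item and put it back with the suffix attached. That way no trailing comma is produced and no chop is needed. The variable names $domain and $with_domain are illustrative.

```perl
use strict;
use warnings;

my $var    = 'username1,username2,username3';
my $domain = 'domain.com';    # assumed value, taken from the question

# Copy $var, then rewrite every run of non-comma characters in the copy
# as "item@domain"; the commas themselves are left untouched.
(my $with_domain = $var) =~ s/([^,]+)/$1\@$domain/g;

print "$with_domain\n";
```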

You already have good answers, so I will only explain why your script does not appear to work. I don't see a print or say statement in your code, so it is unclear how you are printing the result. You also don't need the last line of your program as a separate step: simply append @domain.com to each value, push it onto an array, and print the array with join.
#!/usr/bin/perl
use strict;
use warnings;
my $var = 'username1,username2,username3';
my @tkens = split ',', $var;
my @user;
foreach my $tken (@tkens)
{
push @user, $tken."\@domain.com"; # `.` after `$tken` for concatenation
}
print join(',', @user), "\n";
Output:
username1@domain.com,username2@domain.com,username3@domain.com

Related

How to remove array's newlines and add an element at the beginning of it in Perl?

First of all, I apologize for editing my initial post; after I added my code, the question became unclear.
So, I have an array (@start_cod) containing lines separated by \n as follows:
print @start_cod;
tatatattataattatatttat
cacacacaacaccacaac
aaaaaaaaaaaaaaa
I need to remove the newlines and add ">text" ONLY at the beginning of the array, as follows:
>text
tatatattataattatatttatcacacacaacaccacaacaaaaaaaaaaaaaaa
I tried:
s/\s+\z// for @start_cod;
print ">text@start_cod";
I also tried chomp:
chomp @start_cod;
print ">text@start_cod";
and
my @start_cod = split("\n",$start_cod);
$start_cod = join("",@start_cod);
print ">text$start_cod";
but I get
aaaaaaaaaaaaaaaaaaa>textcacacacacaacaccacaac>textaattatatattataattatatttat
Any suggestions on how to handle this in Perl Programming?
Here is my code which works 100%.
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
my %alliloux =();
$/="\n>";
while (<>) {
s/>//g;
my ($onoma, @seq) = split (/\n/, $_);
my ($sp, $head) = split (/\./, $onoma);
push @{ $alliloux{$sp} }, join "\n", ">$onoma", @seq;
}
foreach my $sp (keys %alliloux) {
chomp $sp;
my ($head, $dna) = split(/\t/, $sp);
my @start_cod = substr($dna, 3);
say @start_cod;
}
Input file:
>name aaaaaaaaaaaaaaaaaa
>name2 acacacacacaacaccacaac
>namex aattatatattataattatatttat
output after Perl run
tatatattataattatatttat
cacacacaacaccacaac
aaaaaaaaaaaaaaa
Desired output:
>text
tatatattataattatatttatcacacacaacaccacaacaaaaaaaaaaaaaaa

If I understand your question correctly, this should do what you want:
use strict;
use warnings;
my @start_cod = (
'aaaaaaaaaaaaaaaaaa',
'acacacacacaacaccacaac',
'aattatatattataattatatttat',
);
print ">text\n", @start_cod, "\n";
The print first prints ">text" and a newline once, then the @start_cod items follow on a single line, and the final "\n" makes sure there is a newline after the last element.
Output:
>text
aaaaaaaaaaaaaaaaaaacacacacacaacaccacaacaattatatattataattatatttat
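If the array elements still carry their trailing newlines, as in the question, a chomp on the whole array before joining should give the desired single line. This is only a sketch: the sample values are copied from the question, and $record is an illustrative name.

```perl
use strict;
use warnings;

# Sample lines that still end in newlines (values from the question).
my @start_cod = (
    "tatatattataattatatttat\n",
    "cacacacaacaccacaac\n",
    "aaaaaaaaaaaaaaa\n",
);

chomp @start_cod;    # strips the trailing newline from every element

# Header once, then all sequences glued together, then one final newline.
my $record = ">text\n" . join('', @start_cod) . "\n";

print $record;
```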

You might want to see Read FASTA into Hash. It's the same problem and very close to the code I wrote before I read it. Also, there are modules on CPAN that can handle FASTA.
I think you want to combine the sequences that start with the same name, disregarding the numbers. The sequences shouldn't have interior whitespace. In your code, you are constantly adding whitespace. You even join on a newline. So, you go to the doctor and say "My arm hurts when I do this", and the doctor says "So don't do that". :)
When you run into this sort of problem, check the results of your operations at each step to see if you get what you expect. Here's a much simplified version of a program that I think does what you want. I've removed most of the data structures because they are complicating your process.
In short, read a line and remove the newline at the end. That's one source of your newlines. Then, extract the sequence and concatenate that to the previous sequence. When you join with newlines, you are adding newlines. So, don't do that:
use v5.14;
use warnings;
use Data::Dumper;
my %alliloux = ();
while (<DATA>) {
chomp; # get rid of that newline!
s/>//g;
# now split on whitespace, but only up to two parts.
# There's no array here.
my( $name, $seq ) = split /\s+/, $_, 2;
# remove the numbers at the end to get the prefix of the
# name.
my $prefix = $name =~ s/\d+\z//r;
# append the current sequence for this prefix to what we
# have already seen.
$alliloux{$prefix} .= $seq;
}
say Dumper( \%alliloux );
foreach my $base ( keys %alliloux ) {
say ">text $alliloux{$base}";
}
__DATA__
>name aaa
>name2 cccc
>name99 aattaatt
You don't need the intermediate array. You can build up your string as you go. You don't need to have all the parts before you do that.
Now, to figure out where you might be going wrong, do a little at a time. Ensure that you've extracted the right thing. It's handy to put characters around the variables you interpolate so you can see whitespace at the beginning or end:
while (<DATA>) {
chomp; # get rid of that newline!
s/>//g;
my( $name, $seq ) = split /\s+/, $_, 2;
say "Name: <$name>";
say "Seq: <$seq>"
}
Then, add another step, and ensure that works:
while (<DATA>) {
chomp; # get rid of that newline!
s/>//g;
my( $name, $seq ) = split /\s+/, $_, 2;
say "Name: <$name>";
say "Seq: <$seq>";
my $prefix = $name =~ s/\d+\z//r;
say "Prefix: <$prefix>";
}
Repeat this process for each step. Then, when you come with a question, you've pinpointed the point where things diverge. Here's the same technique in your program:
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
while (<DATA>) {
s/>//g;
my ($onoma, @seq) = split (/\n/, $_);
say "Onoma: <$onoma>";
}
__DATA__
>name aaa
>name2 cccc
>name99 aattaatt
The output shows that you never had anything in @seq. You are splitting on a newline, but unless you've changed the default line ending, you'll only get a newline at the end:
Onoma: <name aaa>
Onoma: <name2 cccc>
Onoma: <name99 aattaatt>
Now there's nothing in @seq, so a line like join "\n", ">$onoma", @seq; is really just join "\n", ">$onoma". You could have seen that with a little checking.

The description of the problem lacks clarity.
Judging by the desired output, the following code comes to mind. Please check whether it does what you were looking for.
Even after looking at your code, it is not clear what you are trying to do; some parts of it do not make much sense.
use strict;
use warnings;
use feature 'say';
my @start_cod;
while( <DATA> ) {
chomp;
next unless />\s?name.?\s+(.*)/;
push @start_cod, $1;
}
print ">text\n" . join('',@start_cod);
__DATA__
>name aaaaaaaaaaaaaaaaaa
>name2 acacacacacaacaccacaac
.
.
.
> namex aattatatattataattatatttat

How to get the last item of a split in Perl?

$k="1.3.6.1.4.1.1588.2.1.1.1.6.2.1.37.32";
@a= split('\.',$k);
print @a[-1]; # WORKS!
print (split '\.',$k)[-1]; # Fails: not proper syntax.
I'd like to print the last element of a split without having to use an intermediary variable. Is there a way to do this? I'm using Perl 5.14.

Perl attributes the opening parenthesis to the print function, so the syntax error arises because print(...) cannot be followed by [-1]. This happens even if there is whitespace between print and the parenthesis. If you do not want to add another pair of parentheses around print's arguments, prefix the parenthesis with a unary +, which stops Perl from parsing it as the start of print's argument list.
print +(split '\.', $k)[-1];
If the slice is not the argument of a function that takes parentheses, it also works the way you tried:
my $foo = (split '\.', $k)[-1];
print $foo;
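The other option mentioned above, adding parentheses to print itself, can be sketched as follows. The outer pair belongs to print and the inner pair forms the list that is sliced. The variable names $last, $first and $second are illustrative.

```perl
use strict;
use warnings;

my $k = '1.3.6.1.4.1.1588.2.1.1.1.6.2.1.37.32';

# Outer parentheses: print's argument list. Inner parentheses: the list slice.
print( (split /\./, $k)[-1] );
print "\n";

# A list slice can pick any positions, not only the last one.
my $last = (split /\./, $k)[-1];
my ($first, $second) = (split /\./, $k)[0, 1];
```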

Instead of creating a complete list and slicing it to get the last element, you could use a regex capture:
use strict;
use warnings;
my $k = "1.3.6.1.4.1.1588.2.1.1.1.6.2.1.37.32";
my ($last) = $k =~ /(\d+)$/;
print $last;
Output:
32

rindex() finds the last position of a substring, while index() finds the first, so you can take everything after the last dot without splitting at all:
print substr( $k, rindex($k, '.')+1 );
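One detail worth checking with this approach is the input that contains no dot at all. The sketch below wraps the expression in a helper, after_last_dot (a hypothetical name), to show that case.

```perl
use strict;
use warnings;

# Hypothetical helper: return whatever follows the last dot.
sub after_last_dot {
    my ($s) = @_;
    # rindex returns -1 when there is no dot, so the offset becomes 0
    # and the whole string comes back unchanged.
    return substr $s, rindex($s, '.') + 1;
}

print after_last_dot('1.3.6.1.37.32'), "\n";
print after_last_dot('nodots'), "\n";
```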

Matching in Perl

I am trying to get text in between two dots of a line, but my program returns the entire line.
For example: I have text which looks like:
My sampledata 1,2 for perl .version 1_1.
I used the following match statement
$x =~ m/(\.)(.*)(\.)/;
My output for $x should be version 1_1, but I am getting the entire line as my match.

In your code, the value of $x will not change after the match; a match only tests the string and sets the capture variables.
When $x is successfully matched with m/(\.)(.*)(\.)/, your three capture groups will contain '.', 'version 1_1' and '.' respectively (in the order given). $2 will give you 'version 1_1'.
Since you probably only want the part 'version 1_1', you need not capture the two dots. This code will give you the same result:
$x =~ m/\.(.*)\./;
print $1;
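Because $1 keeps its old value when a match fails, it can be safer to take the capture from the match's return value in list context. A minimal sketch, using the sample line from the question and an illustrative variable name $version:

```perl
use strict;
use warnings;

my $x = 'My sampledata 1,2 for perl .version 1_1.';

# In list context a successful match returns its captures; on failure
# the list is empty and $version stays undefined.
my ($version) = $x =~ /\.(.*)\./;

print defined($version) ? "$version\n" : "no match\n";
```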

Try this:
my $str = "My sampledata 1,2 for perl .version 1_1.";
$str =~ /\.\K[^.]+(?=\.)/;
print $&;
The period must be escaped outside of a character class.
\K resets all that has been matched before (you can replace it by a lookbehind (?<=\.))
[^.] means any character except a period.
For several results, you can do this:
my $str = "qwerty .target 1.target 2.target 3.";
my @matches = ($str =~ /\.\K[^.]+(?=\.)/g);
print join("\n", @matches);
If you don't want the same period to serve as both the end of one match and the start of the next, you can do this:
my $str = "qwerty .target 1.target 2.target 3.";
my @matches = ($str =~ /\.([^.]+)\./g);
print join("\n", @matches)."\n";

It should be simple enough to do something like this:
#!/usr/bin/perl
use warnings;
use strict;
my @tests = (
"test one. get some stuff. extra",
"stuff with only one dot.",
"another test line.capture this. whatever",
"last test . some data you want.",
"stuff with only no dots",
);
for my $test (@tests) {
# For this example, I skip $test if the match fails,
# otherwise, I move on and do stuff with $want
next if $test !~ /\.(.*)\./;
my $want = $1;
print "got: $want\n";
}
Output
$ ./test.pl
got: get some stuff
got: capture this
got: some data you want

Read from input and store comma separated values in Hash

I have a Perl question like this:
Write a Perl program that will read a series of last names and phone numbers from the given input. The names and numbers should be separated by a comma. Then print the names and numbers alphabetically according to last name. Use hashes.
Any idea how to solve this?

There's more than one way to do it :)
my %phonebook;
while(<>) {
chomp;
my ($name, $phone) = split /,/;
$phonebook{$name} = $phone;
}
print "$_ => $phonebook{$_}\n" for sort keys %phonebook;

Something like the following perhaps.
my %hash;
foreach(<>){ #reads your args from commandline or input-file
chomp; #remove the trailing newline so the number prints cleanly
my @arr = split(/\,/); #split at comma, every line
$hash{$arr[0]} = $arr[1]; #assign to hash
}
#print hash here
foreach my $key (sort keys %hash ) #sort and iterate
{
print "Name: " . $key . " Number: " . $hash{$key} . "\n";
}

Tasks like this are the strength of Perl's command-line switches. See perldoc perlrun for more information!
Command line input
$ perl -naF',\s*' -lE'$d{$F[0]}=$F[1];END{say"$_: $d{$_}"for sort keys%d}'
Moe, 12345
Pi, 31416
Homer, 54321
Output
Homer: 54321
Moe: 12345
Pi: 31416

Assuming that we split on commas (you should use Text::CSV generally), we can actually create this hash with a simple application of the map function and the diamond operator (<>).
#!/usr/bin/env perl
use strict;
use warnings;
my %phonebook = map { chomp; split /,/ } <>;
use Data::Dumper;
print Dumper \%phonebook;
The last two lines are just to visualize the result, and the upper three should be in all scripts. The meat of the work is done all in the one line.
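The one-line map relies on every input line splitting into exactly two fields. A more defensive variant is sketched below; it skips blank lines, trims whitespace around the comma, and limits the split to two fields. The in-memory @lines array stands in for <> and reuses the sample rows from the command-line answer; @report is an illustrative name.

```perl
use strict;
use warnings;

# Sample input standing in for <>; one blank line is included on purpose.
my @lines = ("Moe, 12345\n", "Pi, 31416\n", "\n", "Homer, 54321\n");

my %phonebook;
for my $line (@lines) {
    chomp $line;
    next unless $line =~ /\S/;                         # skip blank lines
    my ($name, $phone) = split /\s*,\s*/, $line, 2;    # at most two fields
    $phonebook{$name} = $phone;
}

# Sort by name and format one entry per line.
my @report = map { "$_: $phonebook{$_}" } sort keys %phonebook;
print "$_\n" for @report;
```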

how to put a field-separator in Spreadsheet::ParseExcel::Simple

The following code works well for me, but I cannot figure out how to separate the columns with a field separator such as a comma (,).
Please advise, thanks.
#! /usr/bin/perl
use strict;
use warnings;
use Spreadsheet::ParseExcel::Simple;
my @data;
my $xls = Spreadsheet::ParseExcel::Simple->read('mylargefile.xls');
foreach my $sheet ($xls->sheets) {
while ($sheet->has_data) {
@data = $sheet->next_row;
print "@data \n";
}
}

Since @data is an array of cells, you can use the built-in join() function like so:
print join(',', @data);
Or replace the comma with a separator of your choice.
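Another option, if you would rather keep the interpolated "@data" form from the question, is to change Perl's list separator variable $" for a limited scope. The sketch below uses made-up cell values in place of a real row from next_row, and it does not quote cells that themselves contain commas.

```perl
use strict;
use warnings;

# Stand-in for one row returned by next_row; the cell values are made up.
my @data = ('id', 'name', 'price');

my $line = do {
    local $" = ',';    # separator used when an array is interpolated in a string
    "@data";
};

print "$line\n";
```

Using local restores the default separator (a single space) as soon as the block ends, so other interpolations in the program are unaffected.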
