XML::LibXML parsing and choosing element a quick one

I always get stuck at some logic questions when it comes to programming. This is logical and easy, I think, but hard for me, as I am unable to get there. I am using XML::LibXML to parse XML files. Now in the following code
<CommentsCorrectionsList>
<CommentsCorrections RefType="Cites">
<RefSource>Brain Cogn. 2005 Jul;58(2):245</RefSource>
</CommentsCorrections>
<CommentsCorrections RefType="RepublishedIn">
<RefSource>Brain Cogn. 2005 Jul;58(2):246-8</RefSource>
<PMID Version="1">16044513</PMID>
</CommentsCorrections>
<CommentsCorrections RefType="PartialRetractionOf">
<RefSource>Curr Opin Organ Transplant. 2001 Mar;6(1):95-101</RefSource>
</CommentsCorrections>
</CommentsCorrectionsList>
I want to select the CommentsCorrections elements for every RefType except 'Cites'. How do I do it? I thought of putting all the wanted RefTypes in another variable and then using it to get the other data. Is that the right way? I tried it with some dummy variables like the following:
my $sam = "A" || "B" || "C";
print "test= ";
my $test = <>;
if ($test == $sam) {
print $test;
print "success";} else {
print "NO";}
I know this may be silly for some of you, but I have only been writing programs for a month or so, and I sometimes get frustrated because I don't know what to do. I try to learn many things. Please forgive me if this is really a silly question.
Also, I thought to do
if(!($foo->findnodes('CommentsCorrectionsList/CommentsCorrections[@RefType="Cites"]'))){
do foreach and get the data
}
But in this case how do I avoid the RefType="Cites" in foreach and make it equal to the other RefType which I want. I mean I don't know if such boolean can be used in foreach statement. I tried finding and also did trial and error but nothing in hand. Any help is greatly appreciated.
Thank you.

CommentsCorrectionsList/CommentsCorrections[@RefType != "Cites"]
I personally use the XPath spec as my reference, but there might be friendlier references out there.
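As a minimal sketch of how that expression might be used, the following parses the sample XML from the question and prints every entry whose RefType is not "Cites". The variable names are illustrative, and the leading slash assumes CommentsCorrectionsList is the document root, which it is only in this cut-down sample. Elements that have no RefType attribute at all would not match the predicate.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample document copied from the question.
my $xml = <<'XML';
<CommentsCorrectionsList>
  <CommentsCorrections RefType="Cites">
    <RefSource>Brain Cogn. 2005 Jul;58(2):245</RefSource>
  </CommentsCorrections>
  <CommentsCorrections RefType="RepublishedIn">
    <RefSource>Brain Cogn. 2005 Jul;58(2):246-8</RefSource>
    <PMID Version="1">16044513</PMID>
  </CommentsCorrections>
  <CommentsCorrections RefType="PartialRetractionOf">
    <RefSource>Curr Opin Organ Transplant. 2001 Mar;6(1):95-101</RefSource>
  </CommentsCorrections>
</CommentsCorrectionsList>
XML

my $doc = XML::LibXML->load_xml(string => $xml);

# The predicate keeps each CommentsCorrections whose RefType attribute is not "Cites".
my @wanted = $doc->findnodes(
    '/CommentsCorrectionsList/CommentsCorrections[@RefType != "Cites"]'
);

my @ref_types;
for my $node (@wanted) {
    my $type   = $node->getAttribute('RefType');
    my $source = $node->findvalue('RefSource');
    push @ref_types, $type;
    print "$type: $source\n";
}
```

Because the filtering happens in the XPath expression, the foreach loop needs no extra condition.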

Related

How do I return the text of a WebElement using Perl's Selenium::Remote::Driver?

I feel like I must be missing something obvious. I know the XPath to a WebElement and I want to compare the text in that element to some string.
Say the XPath for the element is /html/body/div/a/strong and I want to compare it to $str.
So it shows up in source like ...<strong>find this string</strong>
So I say
use strict;
use warnings;
use Selenium::Remote::Driver;
# Fire up a Selenium object, get to the page, etc..
Test::More::ok($sel->find_element("/html/body/div/a") eq $str, "Text matched");
When I run this the test fails when it should pass. When I try to print the value of find_element($xpath) I get some hash reference. I've googled around some and find examples telling me to try find_element($xpath)->get_text() but get_text() isn't even a method in the original package. Is it an obsolete method that used to actually work?
Most of the examples online for this module say "It's easy!" then show me how to get_title() but not how to check the text at an XPath. I might be going crazy.

Following up on our comment thread, I'm posting this as an answer:
Selenium::Remote::Driver definitely has a method named find_element(), and Selenium::Remote::WebElement definitely has a method named get_text().
Something like this...
my $text = $sel->find_element(...)->get_text();
...works fine on my end, though it looks like it'll error out if the element isn't found.

Perl Net::Google::Calendar event occurrence

I have the following code (Perl), but I'm not sure how to extract the recurring occurrences of an event.
use Net::Google::Calendar;
my @cals = $parent->get_calendars;
for my $cal_selected (@cals) {
printf "CALENDAR: %s\n", $cal_selected->title;
$parent->set_calendar( $cal_selected );
for ( $parent->get_events() ) {
my $title = $_->title || "";
my ($start, $end) = $_->when();
print "title: $title, start: $start, end: $end \n";
}
}
So, one of the events has a "start" value but recurs every 4 days, and I don't see those occurrences in my current output.
Output looks like this:
title: "Event 1", start: 2012-08-06T00:00:00, end: 2012-08-13T00:00:00
title: "Event 2", start: 2012-07-05T00:00:00, end: 2012-07-05T00:00:00
"Event 1" has very clear start and end dates, 6 days ..
"Event 2" starts and ends on the same day, but its occurrences are at 2012-07-05, 2012-07-09 and 2012-07-13.
But I don't see that in the output itself ...

You might need to try the recurrence method. The problem is that if you use this and you don't have the Data::ICal module installed, you'll get an undef instead of an error. Unfortunately, you'll also get an undef if you have Data::ICal installed and there's no recurrence. That's just bad module programming.
The recurrence method returns a Data::ICal::Entry::Event object.
From there, I'm not entirely sure what to do. The Data::ICal::Entry::Event documentation is really sparse. I had to read through the iCalendar RFC which mentioned that this field is stored in something called rdate. There's a mention of multiple rdate properties in the Data::ICal::Entry::Event module, but nothing that says how to access it, or what format it's in.
I think you're supposed to use Data::ICal::DateTime to help parse this information. This module has a recurrence method that allows you to pull out the union of all rdates. Unfortunately, it uses a Data::ICal object, and you have a Data::ICal::Entry::Event object. Maybe that's a subclass of Data::ICal, so maybe it'll still work.
Unfortunately, I don't have Google Calendar set up, so I really have no way of playing around with this. Sorry about not being more definite about this. However, I hope this leads you the right way.
One more recommendation: Use Data::Dumper to dump out the various objects you're getting. This will help you understand what's going on and how information is stored in these objects. With a bit of luck, you'll be able to figure out exactly what methods and modules you need to get the information you want.
It's not polite to peek, but when you don't understand the question, looking up the answer in back of the book can sometimes help.
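A minimal sketch of that Data::Dumper suggestion, using a made-up blessed hash as a stand-in for whatever object the calendar module really returns. The class name My::Fake::Event and its fields are assumptions for illustration only.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for an object returned by a calendar API.
my $event = bless {
    title => 'Event 2',
    when  => [ '2012-07-05T00:00:00', '2012-07-05T00:00:00' ],
}, 'My::Fake::Event';

# Sorted keys give a stable dump that is easier to compare between runs.
local $Data::Dumper::Sortkeys = 1;

my $dump = Dumper($event);
print $dump;
```

The dump shows both the class the object is blessed into and the fields stored inside it, which is often enough to work out which accessor to call next.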

Problems check username input against flat file for user creation

I am working on a user login and am having trouble with the user creation part. My problem is that I am trying to check the input username against a text file to see if that username already exists. I can't seem to get it to compare the input username to the array that I have brought in. I have tried two different ways of accomplishing this. One using an array and another using something I read online that I don't quite understand. Any help or explanation would be greatly appreciated.
Here is my attempt using an array to compare off of
http://codepad.org/G7xmsf3z
Here is my second attempt
http://codepad.org/SbeqmdbG

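In your first attempt, do the comparison inside the loop, and only create the account once the loop has checked every entry:
my $taken = 0;
foreach my $pair (@incomingarray) {
    my ($name, $value) = split /:/, $pair;
    if ($name eq $username) {
        $taken = 1;    # found an existing entry, no need to look further
        last;
    }
}
if ($taken) {
    print p("Username is already taken, try again");
}
else {
    # append the new account only after all existing names were checked
    open(my $out, '>>', 'password.txt') or die "Cannot open password.txt: $!";
    print $out $username.":".$hashpass."\n";
    close($out);
    print p("Your account has been created successfully");
}
print end_html();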
In your second attempt, I think you should try changing the line:
if (%users eq $username) {
with this one:
if (defined $users{$username}) {

As has been stated above regarding locking the flat file against other processes, there is the issue of scaling too: the more users you have, the slower the lookup will be.
I started years ago with a flat file, believing I would never scale enough to require a real database, and I didn't want to learn how to use MySQL, for example. Eventually, after flat-file corruptions and long lookup times, I had no choice but to move to a database.
Later you will find yourself wanting to store user preferences and such, it's easy to add a new field to a database. Flatfile will end up having the overhead of splitting each line into separate fields.
I'd suggest you do it properly with a database.

As in my comment, you should not be using a flatfile to hold your user info. You should use a proper database that will handle concurrent access for you rather than having to understand and code up how to deal with all of that yourself!
If you insist on using an array, you can search it with grep() if it is not "too large":
if (grep /^\Q$username\E:/, @incomingarray) {
print "user name '$username' is already registered, try again\n";
}
else {
print "user name '$username' is not already registered\n";
}
I see some other problems in your code as well.
You should always prefer lexical (my) variables over package (our) variables.
Why do you think (erroneously) that $name and $username cannot be lexical variables?
You should always use the 3-arg form of open() and check its return value like in your 2nd code example. Your open() in the 1st code example is how it was done many many years ago.
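As a rough sketch of those points together (lexical variables, the three-argument open() with its return value checked, and a lookup by username), the following reads name:hash lines into a hash. The temporary file, its contents and the helper name username_taken are made up for the demonstration. It does no file locking, so it does not solve the concurrency problem mentioned above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway password file for the demonstration (made-up data).
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "alice:hash1\nbob:hash2\n";
close($tmp_fh) or die "Cannot close $path: $!";

# Three-argument open with a lexical filehandle, checking the return value.
open(my $in, '<', $path) or die "Cannot open $path: $!";
my %users;
while (my $line = <$in>) {
    chomp $line;
    my ($name, $hash) = split /:/, $line, 2;
    $users{$name} = $hash;
}
close($in);

# Hypothetical helper: true if the name is already present in the file.
sub username_taken {
    my ($username) = @_;
    return exists $users{$username} ? 1 : 0;
}

for my $candidate (qw(alice carol)) {
    printf "%s: %s\n", $candidate,
        username_taken($candidate) ? 'taken' : 'available';
}
```

A hash lookup with exists avoids scanning the whole array for each check, and it compares the name exactly rather than as a pattern.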

How to skip 'die' in perl

I am trying to extract data from a website using a Perl API. The process takes a list of URIs as input, then I extract the related information for each URI from the website. If the information for one URI is not present, the program dies. Something like the code below:
my @tags = $c->posts_for(uri =>"$currentURI");
die "No candidate related articles\n" unless @tags;
Now, I don't want the program to stop if it doesn't get any tags. I want the program to skip that particular URI and go to the next available URI. How can I do it?
Thank you for your time and help.
Thank you,
Sammed

Well, assuming that you're inside a loop processing each of the URIs in turn, you should be able to do something like:
next unless @tags;
For example, the following program only prints lines that are numeric:
while (<STDIN>) {
next unless /^\d+$/;
print;
}
The loop processes every input line in turn but, when one is found that doesn't match that regular expression (all numeric), it restarts the loop (for the next input line) without printing.
The same method is used in that first code block above to restart the loop if there are no tags, moving to the next URI.

Besides the traditional flow control tools, i.e. next/last in a loop or return in a sub, one can use exceptions in perl:
eval {
    die "Bad bad thing";
};
if ($@) {
    # do something about it
}
Or just use Try::Tiny.
However, from the description of the task it seems next is enough (so I voted for @paxdiablo's answer).
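For the case where the die happens inside code you cannot change, here is a hedged sketch of the eval approach inside a loop. The fetch_tags sub and the example URIs are hypothetical stand-ins for the real API call, which is not shown in the question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an API call that dies when it finds nothing.
sub fetch_tags {
    my ($uri) = @_;
    die "No candidate related articles for $uri\n" if $uri =~ /empty/;
    return ("tag-for-$uri");
}

my (@processed, @skipped);
for my $uri (qw(http://a.example http://empty.example http://b.example)) {
    my @tags;
    # eval yields undef if the block died; the trailing 1 marks success.
    my $ok = eval { @tags = fetch_tags($uri); 1 };
    if (!$ok) {
        warn "Skipping: $@";
        push @skipped, $uri;
        next;    # carry on with the next URI instead of exiting
    }
    push @processed, $uri;
}

print "processed: @processed\n";
print "skipped: @skipped\n";
```

Testing the return value of eval rather than $@ alone is a common precaution, since $@ can be cleared or overwritten before it is inspected.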

The question is rather strange, but as near as I can tell, you are asking how to control the flow of your current loop. Of course, using die will cause your program to exit, so if you do not want that, you should not use die. That seems elementary to me, which is why it is a strange question.
So, I assume you have a loop such as:
for my $currentURI (@uris) {
    my @tags = $c->posts_for(uri =>"$currentURI");
    die "No candidate related articles\n" unless @tags;
    # do stuff with @tags here....
}
And if @tags is empty, you want to go to the next URI. Well, that's a simple thing to solve. There are many ways.
next unless @tags;
for my $tag (@tags) { ... stuff ... }
if (@tags) { .... }
Next is the simplest one. It skips to the end of the loop block and starts with the next iteration. However, using a for or if block causes the same behaviour, and so are equivalent. For example:
for my $currentURI (@uris) {
    my @tags = $c->posts_for(uri =>"$currentURI");
    for my $tag (@tags) {
        do_something($tag);
    }
}
Or even:
for my $currentURI (@uris) {
    for my $tag ($c->posts_for(uri =>"$currentURI")) {
        do_something($tag);
    }
}
In this last example, we removed @tags altogether, because it is not needed. The inner loop will run zero times if there are no "tags".
This is not really complex stuff, and if you feel unsure, I suggest you play around a little with loops and conditionals to learn how they work.

Creating a sort of "composable" parser for log files

I've started a little pet project to parse log files for Team Fortress 2. The log files have an event on each line, such as the following:
L 10/23/2009 - 21:03:43: "Mmm... Cycles!<67><STEAM_0:1:4779289><Red>" killed "monkey<77><STEAM_0:0:20001959><Blue>" with "sniperrifle" (customkill "headshot") (attacker_position "1848 813 94") (victim_position "1483 358 221")
Notice there are some common parts of the syntax for log files. Names, for example consist of four parts: the name, an ID, a Steam ID, and the team of the player at the time. Rather than rewriting this type of regular expression, I was hoping to abstract this out slightly.
For example:
my $name = qr/(.*)<(\d+)><(.*)><(Red|Blue)>/;
my $kill = qr/"$name" killed "$name"/;
This works nicely, but the regular expression now returns results that depend on the format of $name (breaking the abstraction I'm trying to achieve). The example above would match as:
my ($name_1, $id_1, $steam_1, $team_1, $name_2, $id_2, $steam_2, $team_2)
But I'm really looking for something like:
my ($player1, $player2)
Where $player1 and $player2 would be tuples of the previous data. I figure the "killed" event doesn't need to know exactly about the player, as long as it has information to create the player, which is what these tuples provide.
Sorry if this is a bit of a ramble, but hopefully you can provide some advice!

I think I understand what you are asking. What you need to do is reverse your logic. First use a regex to split the string into two parts, then extract your tuples from each part. That way the event regex doesn't need to know about the name format, and you just have two generic player-parsing regexes. Here is a short example:
#!/usr/bin/perl
use strict;
use Data::Dumper;
my $log = 'L 10/23/2009 - 21:03:43: "Mmm... Cycles!<67><STEAM_0:1:4779289><Red>" killed "monkey<77><STEAM_0:0:20001959><Blue>" with "sniperrifle" (customkill "headshot") (attacker_position "1848 813 94") (victim_position "1483 358 221")';
my ($player1_string, $player2_string) = $log =~ m/(".*") killed (".*?")/;
my @player1 = $player1_string =~ m/(.*)<(\d+)><(.*)><(Red|Blue)>/;
my @player2 = $player2_string =~ m/(.*)<(\d+)><(.*)><(Red|Blue)>/;
print STDERR Dumper(\@player1, \@player2);
Hope this is what you were looking for.

Another way to do it, but the same strategy as dwp's answer:
my @players =
map { [ /(.*)<(\d+)><(.*)><(Red|Blue)>/ ] }
$log_text =~ /"([^\"]+)" killed "([^\"]+)"/
;
Your log data contains several items of balanced text (quoted and parenthesized), so you might consider Text::Balanced for parts of this job, or perhaps a parsing approach rather than a direct attack with regex. The latter might be fragile if the player names can contain arbitrary input, for example.
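One way the two-stage strategy from these answers might be packaged so that the event parser hands back ($player1, $player2) is sketched below. The helper names parse_player and parse_kill and the hash keys are assumptions, and the sketch still assumes player names contain no double quotes.

```perl
use strict;
use warnings;

# Reusable pattern for one player string: name, ID, Steam ID, team.
my $player_re = qr/^(.*)<(\d+)><(.*?)><(Red|Blue)>$/;

# Hypothetical helper: turn one player string into a hash reference.
sub parse_player {
    my ($text) = @_;
    my ($name, $id, $steam, $team) = $text =~ $player_re or return;
    return { name => $name, id => $id, steam => $steam, team => $team };
}

# Hypothetical helper: the kill event only locates the two player strings.
sub parse_kill {
    my ($line) = @_;
    my ($attacker, $victim) = $line =~ /"([^"]+)" killed "([^"]+)"/ or return;
    my $player1 = parse_player($attacker);    # undef if the string is malformed
    my $player2 = parse_player($victim);
    return ($player1, $player2);
}

my $log = 'L 10/23/2009 - 21:03:43: "Mmm... Cycles!<67><STEAM_0:1:4779289><Red>" killed "monkey<77><STEAM_0:0:20001959><Blue>" with "sniperrifle" (customkill "headshot") (attacker_position "1848 813 94") (victim_position "1483 358 221")';

my ($player1, $player2) = parse_kill($log);
print "$player1->{name} ($player1->{team}) killed $player2->{name} ($player2->{team})\n";
```

Only parse_player knows the layout of a player string, so other event types could reuse it without repeating the capture groups.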

Consider writing a Regexp::Log subclass.
