Mojolicious, Mojo::DOM select tag by contains text - perl

Is there an analog of the ":contains()" selector (jQuery, JSoup) in Mojolicious?
The selector ":contains('text') ~ td + td" works in jQuery and JSoup. How can I convert it to a Mojolicious selector?
http://api.jquery.com/contains-selector/
Description: Select all elements that contain the specified text.
version added: 1.1.4
jQuery( ":contains(text)" )
text: A string of text to look for. It's case sensitive.
http://jsoup.org/apidocs/org/jsoup/select/Selector.html
:contains(text) elements that contains the specified text. The search
is case insensitive. The text may appear in the found element, or any
of its descendants.
Mojolicious analog?

Untested, but I would go in the direction of
$dom->find('*')
->grep(sub { $_->all_text =~ /text/ })
->map('following', 'td')
->map('find', 'td')
(if you have something more specific before your :contains, like at least a tag name selector, then replace the * with that, which should greatly help the performance).

After a few experiments with hobbs's code, I can reproduce the result of the jQuery/JSoup selector:
:contains('some string') ~ td + td
Mojo:
$dom
-> find('*')
-> grep(sub { $_ -> text =~ /some string/; })
-> map('following', '~ td + td')
-> flatten;
But I don't think this is a universal or the best way to do such a selection. It is just a start.
text
Extract text content from this element only (not including child
elements), smart whitespace trimming is enabled by default.
flatten
Flatten nested collections/arrays recursively and create a new
collection with all elements.
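To make the matching rule itself concrete, here is a minimal sketch in Python (not Mojolicious) of what ":contains('text') ~ td + td" selects, using only the standard library's xml.etree.ElementTree. The function name contains_then_td_plus_td and the sample markup are assumptions made up for illustration. Like jQuery, it is case sensitive and looks at the text of the element and its descendants (closer to all_text than to text).

```python
import xml.etree.ElementTree as ET


def contains_then_td_plus_td(root, needle):
    """Emulate ":contains(needle) ~ td + td" on an ElementTree (illustrative only)."""
    matches = []
    for parent in root.iter():
        children = list(parent)
        # Index of the first child whose text (including descendants) contains needle.
        first = next(
            (i for i, child in enumerate(children)
             if needle in "".join(child.itertext())),
            None,
        )
        if first is None:
            continue
        # "E ~ td + td": a td whose immediately preceding sibling is a td
        # that itself comes after E among the same parent's children.
        for j in range(first + 2, len(children)):
            if children[j].tag == "td" and children[j - 1].tag == "td":
                matches.append(children[j])
    return matches


SAMPLE = (
    "<table>"
    "<tr><td>Name</td><td>a</td><td>b</td><td>c</td></tr>"
    "<tr><td>Other</td><td>x</td></tr>"
    "</table>"
)

cells = contains_then_td_plus_td(ET.fromstring(SAMPLE), "Name")
print([cell.text for cell in cells])
```

Only the first matching child per parent needs checking, because any later match would select a subset of the same siblings.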

Related

autocomplete with Materialize - text instead of optional image

I am using Materialize Autocomplete and I wonder if there is a way to use text instead of the "optional image". Why? If the text is not unique, the user will not know which entry to choose. The options might be names, and there might be two people with the same first name and surname.
When typing my question I found out that I cannot use duplicate entries in data
data: {
"Radek": myself,
"Radek": some other Radek,
"Radoslav": 'http://placehold.it/250x250'
},
js fiddle example
When you look at the source you find the following lines relevant for the images:
autocompleteOption.append('<img src="'+ data[key] +'" class="right circle"><span>'+ key +'</span>');
and
var img = $el.find('img');
$el.html("<span>" + beforeMatch + "<span class='highlight'>" + matchText + "</span>" + afterMatch + "</span>");
$el.prepend(img);
This prevents us from using the image-attribute for anything other than images.
We can insert something like this to try to trick Materialize:
"Radoslav": '" style="display: none;">Inserted Text <br><span style="display: none'
but it will just be converted to text, resulting in an option equivalent to
"Inserted Text Radoslav": none
So there is sadly nothing to be gained here.
If you are looking to insert a linebreak, however, you can use this answer on How to force Materialize autocomplete text to flow to new line?

Regex: Capture Groups and Empty Fields (SWIFT 5 | ICU Regex Engine)

I need some help correcting my regex. I have a string of text (a large body of HTML), and I need to pattern match it so that the data nested within its <div> tags can be extracted and used.
Let's take a test case with <div id=1>:
<div id=1>UID:1currentPartNumber:63222TRES003H1workcenter:VLCSKDcycleTime:98.8curPartCycleTime:63.66partsMade:233curCycleTimeActual:62.4target:291actual:233downtime:97statusReason:lineStatus:Productionefficiency:80.05plusminus:-260curProdTime:7/16/2019 12:28:01 PM</div>
Note that lineStatus can either have a value or be empty, and the same is true of statusReason.
I am able to come up with a regex that does MOST of the work but I am struggling with cases where values are not present.
Here is my attempt:
(
(<div id=(\d|\d\d)>)
(UID:(\d|\d\d))
(currentPartNumber:(.{1,20}))
(workcenter:(.{1,20}))
(cycleTime:(.{1,6}))
(curPartCycleTime:(.{1,6}))
(partsMade:(.{1,6}))
(CycleTimeActual:(.{1,6}))
(target:(.{1,6}))
(actual:(.{1,6}))
(downtime:(.{1,6}))
((statusReason:((?:.)|(.{1,6}))))
((lineStatus:((?:.)|(.{1,6}))))
(Productionefficiency:(.{1,6}))
(plusminus:(.{1,6}))
(curProdTime:(.{1,30}))
)
Split it up just for readability.
Thanks,
You are very, very close.
If you use:
(
(<div id=\d{1,2}>)
(UID:\d{1,2})
(currentPartNumber:(.{1,20}))
(workcenter:(.{1,20}))
(cycleTime:(.{1,6}))
(curPartCycleTime:(.{1,6}))
(partsMade:(.{1,6}))
(CycleTimeActual:(.{1,6}))
(target:(.{1,6}))
(actual:(.{1,6}))
(downtime:(.{1,6}))
(statusReason:(.{0,6}))
(lineStatus:(.{0,6}))
(Productionefficiency:(.{1,6}))
(plusminus:(.{1,6}))
(curProdTime:(.{1,30}))
(<\/div>)
)
Then $3\n$4\n$6\n$8\n$10\n$12\n$14\n$16\n$18\n$20\n$22\n$24\n$26\n$28\n$30 will be:
UID:1
currentPartNumber:63222TRES003H1
workcenter:VLCSKD
cycleTime:98.8
curPartCycleTime:63.66
partsMade:233cur
CycleTimeActual:62.4
target:291
actual:233
downtime:97
statusReason:
lineStatus:
Productionefficiency:80.05
plusminus:-260
curProdTime:7/16/2019 12:28:01 PM
By using (statusReason:(.{0,6}))(lineStatus:(.{0,6})) you make the value of statusReason and lineStatus truly optional.
I also simplified the start <div> and UID detection.
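As an alternative to fixed-width quantifiers such as .{1,6}, here is a sketch in Python's re module (not the ICU engine) that anchors each value on the next known key with a lazy match, so empty values need no special casing. The key list is an assumption read off the sample line; in particular it assumes the field after lineStatus is named efficiency, so that the lineStatus value in the sample is "Production". The names KEYS, PATTERN and parse_div are made up for illustration.

```python
import re

# Assumed field order, read off the sample <div>; adjust if the real feed differs.
KEYS = [
    "UID", "currentPartNumber", "workcenter", "cycleTime", "curPartCycleTime",
    "partsMade", "curCycleTimeActual", "target", "actual", "downtime",
    "statusReason", "lineStatus", "efficiency", "plusminus", "curProdTime",
]

# Each value is matched lazily (.*?) and ends where the next key (or </div>) begins,
# so a value may be empty without any optional-group tricks.
PATTERN = re.compile(
    r"<div id=(?P<div_id>\d{1,2})>"
    + "".join(rf"{key}:(?P<{key}>.*?)" for key in KEYS)
    + r"</div>"
)


def parse_div(html):
    """Return a dict of field name -> value for the first matching <div>, or None."""
    match = PATTERN.search(html)
    return match.groupdict() if match else None


SAMPLE = (
    "<div id=1>UID:1currentPartNumber:63222TRES003H1workcenter:VLCSKD"
    "cycleTime:98.8curPartCycleTime:63.66partsMade:233curCycleTimeActual:62.4"
    "target:291actual:233downtime:97statusReason:lineStatus:Production"
    "efficiency:80.05plusminus:-260curProdTime:7/16/2019 12:28:01 PM</div>"
)

fields = parse_div(SAMPLE)
print(fields["statusReason"], fields["lineStatus"], fields["partsMade"])
```

Named groups also avoid having to count capture-group numbers such as $3 or $30. The same idea should carry over to ICU named groups, but that is untested here.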
Try Regex: ((<div id=(\d|\d\d)>)(UID:(\d|\d\d))(currentPartNumber:(.{1,20}))(workcenter:(.{1,20}))(cycleTime:(.{1,6}))(curPartCycleTime:(.{1,6}))(partsMade:(.{1,6}))(CycleTimeActual:(.{1,6}))(target:(.{1,6}))(actual:(.{1,6}))(downtime:(.{1,6}))(statusReason:(.{1,6})?)(lineStatus:(.{1,6})?)(Productionefficiency:(.{1,6}))(plusminus:(.{1,6}))(curProdTime:(.{1,30})))
Demo
Warning: You can't Parse HTML with regex

What are all the Unicode properties a Perl 6 character will match?

The .uniprop returns a single property:
put join ', ', 'A'.uniprop;
I get back one property (the general category):
Lu
Looking around I didn't see a way to get all the other properties (including derived ones such as ID_Start and so on). What am I missing? I know I can go look at the data files, but I'd rather have a single method that returns a list.
I am mostly interested in this because regexes understand properties and match the right properties. I'd like to take any character and show which properties it will match.
"A".uniprop("Alphabetic") will get the Alphabetic property. Are you asking for what other properties are possible?
All of the properties that have a checkmark by them will likely work. This issue just displays the status of roast testing for them: https://github.com/perl6/roast/issues/195
This may be more useful for you: https://github.com/rakudo/rakudo/blob/master/src/core/Cool.pm6#L396-L483
The first hash just maps aliases for the property names to the full names. The second hash specifies whether the property is B for boolean, S for a string, I for integer, nv for numeric value, na for Unicode Name, and a few other specials.
If I didn't understand your question, please let me know and I will revise this answer.
Update: It seems you want to find all the properties that will match. To do that, iterate over everything in https://github.com/rakudo/rakudo/blob/master/src/core/Cool.pm6#L396-L483 and look only at the string, integer, and boolean properties. Here is the full thing: https://gist.github.com/samcv/ae09060a781bb4c36ae6cac80ea9325f
sub MAIN {
    use Test;
    my $char = 'a';
    my @result = what-matches($char);
    for @result {
        ok EVAL("'$char' ~~ /$_/"), "$char ~~ /$_/";
    }
}
use nqp;
sub what-matches (Str:D $chr) {
    my @result;
    my %prefs = prefs();
    for %prefs.keys -> $key {
        given %prefs{$key} {
            when 'S' {
                my $propval = $chr.uniprop($key);
                if $key eq 'Block' {
                    @result.push: "<:In" ~ $propval.trans(' ' => '') ~ ">";
                }
                elsif $propval {
                    @result.push: "<:" ~ $key ~ "<" ~ $chr.uniprop($key) ~ ">>";
                }
            }
            when 'I' {
                @result.push: "<:" ~ $key ~ "<" ~ $chr.uniprop($key) ~ ">>";
            }
            when 'B' {
                @result.push: ($chr.uniprop($key) ?? "<:$key>" !! "<:!$key>");
            }
        }
    }
    @result;
}
sub prefs {
my %prefs = nqp::hash(
'Other_Grapheme_Extend','B','Titlecase_Mapping','tc','Dash','B',
'Emoji_Modifier_Base','B','Emoji_Modifier','B','Pattern_Syntax','B',
'IDS_Trinary_Operator','B','ID_Continue','B','Diacritic','B','Cased','B',
'Hangul_Syllable_Type','S','Quotation_Mark','B','Radical','B',
'NFD_Quick_Check','S','Joining_Type','S','Case_Folding','S','Script','S',
'Soft_Dotted','B','Changes_When_Casemapped','B','Simple_Case_Folding','S',
'ISO_Comment','S','Lowercase','B','Join_Control','B','Bidi_Class','S',
'Joining_Group','S','Decomposition_Mapping','S','Lowercase_Mapping','lc',
'NFKC_Casefold','S','Simple_Lowercase_Mapping','S',
'Indic_Syllabic_Category','S','Expands_On_NFC','B','Expands_On_NFD','B',
'Uppercase','B','White_Space','B','Sentence_Terminal','B',
'NFKD_Quick_Check','S','Changes_When_Titlecased','B','Math','B',
'Uppercase_Mapping','uc','NFKC_Quick_Check','S','Sentence_Break','S',
'Simple_Titlecase_Mapping','S','Alphabetic','B','Composition_Exclusion','B',
'Noncharacter_Code_Point','B','Other_Alphabetic','B','XID_Continue','B',
'Age','S','Other_ID_Start','B','Unified_Ideograph','B','FC_NFKC_Closure','S',
'Case_Ignorable','B','Hyphen','B','Numeric_Value','nv',
'Changes_When_NFKC_Casefolded','B','Expands_On_NFKD','B',
'Indic_Positional_Category','S','Decomposition_Type','S','Bidi_Mirrored','B',
'Changes_When_Uppercased','B','ID_Start','B','Grapheme_Extend','B',
'XID_Start','B','Expands_On_NFKC','B','Other_Uppercase','B','Other_Math','B',
'Grapheme_Link','B','Bidi_Control','B','Default_Ignorable_Code_Point','B',
'Changes_When_Casefolded','B','Word_Break','S','NFC_Quick_Check','S',
'Other_Default_Ignorable_Code_Point','B','Logical_Order_Exception','B',
'Prepended_Concatenation_Mark','B','Other_Lowercase','B',
'Other_ID_Continue','B','Variation_Selector','B','Extender','B',
'Full_Composition_Exclusion','B','IDS_Binary_Operator','B','Numeric_Type','S',
'kCompatibilityVariant','S','Simple_Uppercase_Mapping','S',
'Terminal_Punctuation','B','Line_Break','S','East_Asian_Width','S',
'ASCII_Hex_Digit','B','Pattern_White_Space','B','Hex_Digit','B',
'Bidi_Paired_Bracket_Type','S','General_Category','S',
'Grapheme_Cluster_Break','S','Grapheme_Base','B','Name','na','Ideographic','B',
'Block','S','Emoji_Presentation','B','Emoji','B','Deprecated','B',
'Changes_When_Lowercased','B','Bidi_Mirroring_Glyph','bmg',
'Canonical_Combining_Class','S',
);
}
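The same "iterate over a table of property names and look each one up" idea can be sketched in Python for comparison. This is not Perl 6 and covers far fewer properties: it is limited to what the standard unicodedata module exposes, and the property names used as dictionary keys, as well as the names PROPERTY_LOOKUPS and property_dump, are labels chosen here for illustration.

```python
import unicodedata

# Assumed mapping from Unicode property names to the stdlib lookups that
# roughly correspond to them; only a small subset of properties is available.
PROPERTY_LOOKUPS = {
    "Name": lambda c: unicodedata.name(c, ""),
    "General_Category": unicodedata.category,
    "Bidi_Class": unicodedata.bidirectional,
    "Canonical_Combining_Class": unicodedata.combining,
    "East_Asian_Width": unicodedata.east_asian_width,
    "Bidi_Mirrored": lambda c: bool(unicodedata.mirrored(c)),
    "Decomposition": unicodedata.decomposition,
    "Numeric_Value": lambda c: unicodedata.numeric(c, None),
}


def property_dump(char):
    """Return a dict of property name -> value for a single character."""
    return {name: lookup(char) for name, lookup in PROPERTY_LOOKUPS.items()}


for name, value in property_dump("A").items():
    print(f"{name}: {value!r}")
```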
OK, so here's another take on answering this question, but the solution is not perfect. Bring the downvotes!
If you join the #perl6 channel on freenode, there's a bot called unicodable6 with functionality that you may find useful. You can ask it to do this (e.g. for the characters A and π simultaneously):
<AlexDaniel> propdump: Aπ
<unicodable6> AlexDaniel, https://gist.github.com/b48e6062f3b0d5721a5988f067259727
Not only does it show the value of each property, but if you give it more than one character it will also highlight the differences!
Yes, it seems like you're looking for a way to do that within perl 6, and this answer is not it. But in the meantime it's very useful. Internally Unicodable just iterates through this list of properties. So basically this is identical to the other answer in this thread.
I think someone can make a module out of this (hint-hint), and then the answer to your question will be “just use module Unicode::Propdump”.

button in a nested table that has dynamic id

I am trying to click on a button in a table cell. The button has a dynamic name with the prefix button_keep.
This is the unique path that Firebug reports for the table cell:
#mergePatientsSelectedTable > tbody:nth-child(2) > tr:nth-child(1) > td:nth-child(2) > table:nth-child(1) > tbody:nth-child(1) > tr:nth-child(1) > td:nth-child(1)
I'm trying to get there using the following, but it does not find the button:
b.table(:id => "mergePatientsSelectedTable").tbody{2}.tr{1}.td{2}.table{1}.tbody{1}.tr{1}.td{1}.button(:index => 0).button(:name => /button_keep/).click
I have tried flashing the cells and using IRB, yet it keeps raising an assertion that it cannot find the button.
Any suggestions? Thank you.
You do not need to write out the whole path. Each method/locator looks in all descendants, not just direct children. Using entire paths can make the code quite brittle.
So why not just locate the button based on the known attributes:
b.table(:id => "mergePatientsSelectedTable").button(:name => /button_keep/).click
But to explain why you were having problems with your solution, doing tbody{2} actually returns the first tbody element (not the second). The {2} is a block that gets ignored.
For example, consider the html:
<div>hi</div>
<div>bye</div>
You can see the first div gets returned when using a block:
b.div{2}.text
#=> "hi"
To get the second div, you can either use the index locator or get the second element of the collection:
b.div(:index => 1).text
#=> "bye"
b.divs[1].text
#=> "bye"
So if you really wanted to do the entire path, you could have done:
b.table(:id => "mergePatientsSelectedTable").tbody(:index => 1).tr.td(:index => 1).table.tbody.tr.td.button(:name => /button_keep/).click
Note that:
nth-child is 1-based index while Watir uses a 0-based index.
If you want the first match, you do not need to include the index - :index => 0 is assumed.
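The "search all descendants, then filter on a known attribute" idea can be sketched outside Watir as well. The following Python example uses only the standard library's xml.etree.ElementTree on a made-up, well-formed fragment; the function name find_button, the sample markup, and the button name button_keep_42 are assumptions for illustration, not part of Watir or of the original page.

```python
import re
import xml.etree.ElementTree as ET


def find_button(root, table_id, name_pattern):
    """Find the first button anywhere below the table with the given id
    whose name attribute matches name_pattern (illustrative only)."""
    table = next((t for t in root.iter("table") if t.get("id") == table_id), None)
    if table is None:
        return None
    # iter() walks every descendant, so the nesting depth does not matter.
    return next(
        (b for b in table.iter("button")
         if re.search(name_pattern, b.get("name", ""))),
        None,
    )


PAGE = (
    '<html><body><table id="mergePatientsSelectedTable"><tbody><tr>'
    "<td>x</td>"
    "<td><table><tbody><tr><td>"
    '<button name="button_keep_42">Keep</button>'
    "</td></tr></tbody></table></td>"
    "</tr></tbody></table></body></html>"
)

button = find_button(ET.fromstring(PAGE), "mergePatientsSelectedTable", r"button_keep")
print(button.get("name"))
```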

xpath find location in document

Trying to find the value of "aging-timer" in the following XML document:
<ospf3-database-information xmlns="http://xml.juniper.net/junos/11.1R2/junos-routing">
<ospf3-database external-heading="OSPF3">
<lsa-type>Extern</lsa-type>
<lsa-id>0.0.0.1</lsa-id>
<advertising-router>172.27.255.6</advertising-router>
<sequence-number>0x80000001</sequence-number>
<age>1792</age>
<checksum>0x90bd</checksum>
<lsa-length>28</lsa-length>
<ospf3-external-lsa>
<ospf3-prefix>::/0</ospf3-prefix>
<ospf3-prefix-options>0x0</ospf3-prefix-options>
<type-value>1</type-value>
<metric>0</metric>
</ospf3-external-lsa>
<ospf-database-extensive>
<aging-timer junos:seconds="1808">
00:30:08
</aging-timer>
<installation-time junos:seconds="1790">
00:29:50
</installation-time>
<expiration-time junos:seconds="1808">
00:30:08
</expiration-time>
<send-time junos:seconds="1790">
00:29:50
</send-time>
<lsa-changed-time junos:seconds="1790">
00:29:50
</lsa-changed-time>
<lsa-change-count>1</lsa-change-count>
</ospf-database-extensive>
</ospf3-database>
</ospf3-database-information>
The only caveat is that I only want it if "ospf3-prefix" has a text of ::/0.
I am trying this xpath:
//x:ospf3-external-lsa[x:ospf3-prefix=\"::/0\"]/ancestor::x:ospf3-database/x:ospf-database-extensive/x:aging-timer
using the x prefix to handle namespaces in Perl:
my $xdatav6 = XML::LibXML->load_xml(string => $defv6);
my $dataxv6 = XML::LibXML::XPathContext->new($xdatav6);
$dataxv6->registerNs("x", "http://xml.juniper.net/junos/11.1R2/junos-routing");
But this doesn't work. When I use the ancestor axis, it seems to reach everything except what is under the "ospf-database-extensive" tree.
The XML you show isn't valid as it doesn't have a URI for the junos namespace that is used in junos:seconds. I changed the root tag to read
xmlns:junos="http://xml.juniper.net/junos/11.1R2/junos-routing"
to get things working.
This is the longest XPath expression I've seen, but this does the trick! Put the appropriate namespace prefixes in as needed.
/ospf3-database-information/ospf3-database[ospf3-external-lsa/ospf3-prefix="::/0"]/ospf-database-extensive/aging-timer
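For comparison, the same lookup can be sketched in Python with the standard library's xml.etree.ElementTree, which supports only a subset of XPath, so the predicate is checked in code rather than in the path. The XML below is a trimmed version of the document from the question, with the junos prefix bound to the same URI as the default namespace, as described above. The names NS, XML and aging_timers are made up for illustration.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://xml.juniper.net/junos/11.1R2/junos-routing"
NS = {"x": NS_URI}  # "x" is an arbitrary prefix for the default namespace

# Trimmed sample; xmlns:junos is added so that junos:seconds can be parsed.
XML = f"""<ospf3-database-information xmlns="{NS_URI}" xmlns:junos="{NS_URI}">
  <ospf3-database external-heading="OSPF3">
    <ospf3-external-lsa>
      <ospf3-prefix>::/0</ospf3-prefix>
    </ospf3-external-lsa>
    <ospf-database-extensive>
      <aging-timer junos:seconds="1808">
        00:30:08
      </aging-timer>
    </ospf-database-extensive>
  </ospf3-database>
</ospf3-database-information>"""


def aging_timers(xml_text, prefix="::/0"):
    """Return the aging-timer text of every ospf3-database whose prefix matches."""
    root = ET.fromstring(xml_text)
    timers = []
    for db in root.findall("x:ospf3-database", NS):
        found = db.findtext("x:ospf3-external-lsa/x:ospf3-prefix", namespaces=NS)
        if found == prefix:
            timer = db.find("x:ospf-database-extensive/x:aging-timer", NS)
            if timer is not None:
                timers.append((timer.text or "").strip())
    return timers


print(aging_timers(XML))
```

Selecting the ospf3-database element first and then descending into both branches avoids the ancestor axis entirely, which is the same design choice as in the XPath expression above.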