Get innermost regex match for an href regex catcher - Swift

I want to match <a> tags whose href contains a YouTube link.
I have this regex:
"<a.*?href=(.*?youtu.?be.*?)>.*?</a>"
It works, but it doesn't stop when the tag is closed.
For example, if I have:
<a href=www.google.come>google</a> <a href=www.youtube.com>youtube</a>
Desired result:
<a href=www.youtube.com>youtube</a>
Actual result:
<a href=www.google.come>google</a> <a href=www.youtube.com>youtube</a>
It catches the whole thing, treating Google's <a as the opener and YouTube's </a> as the closer.
I want to make my regex a little smarter, so that it stops the match when the Google closer appears and starts a new match attempt when the YouTube opener appears.
I tried this but it didn't work:
"<a.*?[^>]href=(.*?youtu.?be.*?)>.*?</a>"
And also tried this:
"<a[^>].*?href=(.*?youtu.?be.*?)>.*?</a>"

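You can try this pattern. It replaces each . with [^>], so no part of the match can run past the end of the opening tag. The slash in </a> does not need a backslash, and \/ is not a valid escape in a Swift string literal:
"<a[^>]*?href=[^>]*?youtu.?be[^>]*?>[^>]*?</a>"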

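A quick way to compare the two patterns is sketched below. It is written in Python rather than Swift, and the variable names (html, loose, strict) are only illustrative. The input is the sample string from the question.

```python
import re

# Sample input from the question.
html = "<a href=www.google.come>google</a> <a href=www.youtube.com>youtube</a>"

# Pattern from the question: ".*?" may cross tag boundaries,
# so the match can begin at the first <a and end at the last </a>.
loose = re.compile(r"<a.*?href=(.*?youtu.?be.*?)>.*?</a>")

# Pattern from the answer: "[^>]" cannot consume a ">",
# so each match attempt stays inside a single opening tag.
strict = re.compile(r"<a[^>]*?href=[^>]*?youtu.?be[^>]*?>[^>]*?</a>")

loose_match = loose.search(html).group(0)
strict_matches = strict.findall(html)

print(loose_match)
print(strict_matches)
```

The loose pattern returns the whole input, while the strict one returns only the YouTube anchor.
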

Illegal Character; Space is not Allowed

I am trying to validate my website and I am unable to solve this. This is the error I get:
**Bad value https://www.facebook.com/sharer/sharer.php?u=https://www.company.com/blog/post/back-to-running/&t=Back to Running for attribute href on element a: Illegal character in query: space is not allowed.**
Where is the space I am supposed to fix? I tried %20, %2F and a bunch of other encodings, but everything I tried broke my blog. I also got errors related to XML. Any help would be appreciated.
<a class="tweet" target="_blank" href="https://twitter.com/share?url=https%3A%2F%2Fwww.company.com%2Fblog%2Fpost%2Fback-to-running%2F" title="Tweet This"></a>
<a class="facebook" target="_blank" href="https://www.facebook.com/sharer/sharer.php?u=https://www.snapediting.com/blog/post/back-to-running/&t=Back to Running" title="Share on Facebook"></a>
The spaces are in the t parameter ("Back to Running"). Try changing
back-to-running/&t=Back to Running
to
back-to-running/&t=Back%20to%20Running
If that doesn't work, use "+" instead:
back-to-running/&t=Back+to+Running
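If the link is generated by code rather than typed by hand, the encoding can be left to a library. The following is a minimal Python sketch using the standard urllib.parse module. The variable names are illustrative, and it assumes the sharer accepts a fully percent-encoded u parameter, as the Twitter link in the question already does.

```python
from urllib.parse import quote, quote_plus, urlencode

page_url = "https://www.company.com/blog/post/back-to-running/"
title = "Back to Running"

# The two encodings suggested above for the space character.
percent_title = quote(title)      # spaces become %20
plus_title = quote_plus(title)    # spaces become +

# Build the whole query string, percent-encoding both values.
query = urlencode({"u": page_url, "t": title}, quote_via=quote)
share_url = "https://www.facebook.com/sharer/sharer.php?" + query

print(percent_title)
print(plus_title)
print(share_url)
```
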

Using xpath to parse out html attributes from webpage

I am having trouble extracting some attributes out of an html page and need some ideas to help me get unstuck.
I am using PowerShell with the HtmlAgilityPack to parse the HTML. I have a very crude version that I was able to write with regex, but it doesn't always work, so I thought the better option would be to use XPath to parse the results. If regex is the way to go, please let me know.
So far I have been able to grab the page that I am interested in and split it apart by rows.
$results = $htmldoc.DocumentNode.SelectNodes("//p[@class='row']")
After the page is split up I am trying to iterate through each row using xpath to grab the information I am interested in.
ForEach ($item in $results) {
    $ID = $null
    $ID = $item.OuterHtml
}
This gets me close to what I want, but it grabs a bunch of other info that I don't want as well. Here is what $item.OuterHtml looks like at this point.
OuterHtml : <p class="row" data-latitude="41.5937565437255" data-longitude="-93.6437636649079" data-pid="4184719674">
<span class="star"></span> <span class="pl"> <span class="date">Nov 27</span> iPhone and other Cell Phone Unlocks
</span> <span class="l2"> <span class="pnr"> <small> (Des Moines)</small> <span class="px"> <span class="p"> <a href="#" class="maptag"
data-pid="4184719674">map</a></span></span> </span> <a class="gc" href="/mod/" data-cat="mod">cell phones - by dealer</a> </span> </p>
I just want the data-pid attribute.
I have tried a bunch of other ways to extract the data-pid attribute but haven't had any success. Here is one method I have tried, but it keeps returning the same value over and over.
$ID = $Date.DocumentNode.SelectSingleNode("//p/@data-pid")
I have a feeling that this is something simple but have hit a roadblock. Let me know what other information I need to post.
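SelectSingleNode on the document node searches the whole page and returns only the first match, which is why you get the same value every time. In your foreach loop you already have the row node, so you should be able to get the attribute's value like this: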
$ID = $item.GetAttributeValue("data-pid", "")
To walk all the attributes on that node try:
$item.Attributes | Select Name,Value
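The same idea, reading data-pid from each row element rather than querying the whole document, can be sketched in Python with the standard html.parser module. This is not HtmlAgilityPack. The RowPidParser class is a hypothetical name, and the second row in the sample markup is invented for illustration.

```python
from html.parser import HTMLParser


class RowPidParser(HTMLParser):
    """Collects the data-pid attribute of every <p class="row"> element."""

    def __init__(self):
        super().__init__()
        self.pids = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only the row paragraphs are of interest; the nested <a> tag
        # also carries data-pid but is skipped by the tag check.
        if tag == "p" and attr_map.get("class") == "row":
            self.pids.append(attr_map.get("data-pid", ""))


# Shortened sample: the first row follows the question, the second is made up.
sample = (
    '<p class="row" data-latitude="41.59" data-pid="4184719674">'
    '<a href="#" class="maptag" data-pid="4184719674">map</a></p>'
    '<p class="row" data-pid="4184719675">second row</p>'
)

parser = RowPidParser()
parser.feed(sample)
pids = parser.pids
print(pids)
```
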

Need help to click Watir Web driver button

I have the HTML as
<li>
<div data-track="discovered_spots" data-filter="discovered_spots" class="button filter-button">
<span class="icon-compass"></span>
Discoveries (2)
</div>
</li>
I am trying to click the button:
@browser.div(:class =>'button filter-button').span(:text => ' Discoveries (2) ').click
It doesn't work, and it doesn't produce any errors either.
I have also tried using XPath and the text:
#@browser.div(:xapth => "//div[@data-track='discovered_spots']").click
#@browser.div(:text => "Discoveries").click
but those produce errors.
I have also tried the code from other related questions, but none of it worked. Any help would be very useful.
Try:
@browser.div(:text, /Discoveries/).click
A regex for the :text locator only has to match part of the element's text. Matching on Discoveries alone, without the number and parentheses, casts a wider net, so the locator keeps working if the count of 2 changes.
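The difference between an exact string and a regex can be shown with a small Python sketch. It uses plain re rather than Watir, and div_text is an assumed rendering of the element's text with its surrounding whitespace.

```python
import re

# Assumed text content of the div, including whitespace around the label.
div_text = "\n  Discoveries (2)\n"

# An exact comparison fails because of the whitespace and the count.
exact_match = div_text == "Discoveries"

# A regex search only needs the word to appear somewhere in the text.
regex_match = re.search(r"Discoveries", div_text) is not None

# The same regex still matches if the count changes.
changed_count_match = re.search(r"Discoveries", "Discoveries (17)") is not None

print(exact_match, regex_match, changed_count_match)
```
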

Html.ActionLink syntax error

I want to change this
<a class="more" href="Subject?SubjectId=@Html.DisplayFor(model => item.Id)">Devamı »</a>
to
<a class="more"@Html.ActionLink("Devamı ", "Subject", new {Subject?SubjectId= item.Id }) ></a>
I cannot put &raquo; and Subject?SubjectId= into @Html.ActionLink.
Why?
According to Microsoft's documentation, Html.ActionLink returns a complete anchor element (<a></a>) based on the parameters you give it.
You can't put the anchor returned by ActionLink inside your own anchor element, so choose one way of creating the link or the other. Route values are also passed as properties of an anonymous object, for example new { SubjectId = item.Id }, not as query-string text.
You may be looking for the Url.Action method, which returns only the URL: http://msdn.microsoft.com/en-us/library/system.web.mvc.urlhelper.action(v=vs.108).aspx
The first argument of this overload is the action name:
<a class="more" href="@Url.Action("Subject", new { SubjectId = item.Id })">Devamı »</a>

In Tritium, how do I transform all <p> tags to <div> tags?

I’m working in the Moovweb SDK and am optimizing my personal desktop site for mobile.
How do I transform all my <p> tags to <div> tags? I really don't want to do it manually! Search and replace?? haha
You can use the name() function to change the name of an element. For example:
$("//p") {
  name("div")
}
See it in action here: http://tester.tritium.io/bd1be4f2c187aed317351688e23f01127d26343a
Cheap way: Add p{margin:0} to your CSS, this will remove the only special styling of <p> tags making them look like <div>s.
This is only a visual effect, though. For instance, you're still not allowed to put a <form> inside a <p>, even with the above CSS. If that's what you're after, a simple search and replace will do:
Replace <p> with <div>
Replace <p␣ (left angle, p, space) with <div␣ (there's a space at the end of that one too)
Replace </p> with </div>
That should do it!
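The three replacements can also be scripted. Below is a minimal Python sketch. The p_to_div function is a hypothetical helper name, and it assumes the tags are written in lowercase exactly as listed above.

```python
def p_to_div(html: str) -> str:
    """Apply the three literal replacements listed above."""
    html = html.replace("<p>", "<div>")
    # The trailing space keeps tags such as <pre> or <param> untouched.
    html = html.replace("<p ", "<div ")
    html = html.replace("</p>", "</div>")
    return html


sample = '<p class="row">a</p><p>b</p><pre>c</pre>'
converted = p_to_div(sample)
print(converted)
```

The <pre> element in the sample is left alone, which is why the second replacement includes the space after <p.
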