I have an option on my social media website to search for users.
After searching for users I display the searched users name and username in plain text. I want to make it so the user can click on the plain text and be redirected to the user's profile they searched for.
So far I searched for CSS which would help me achieve this but all of them link to a url but I want to call a function instead. Is this possible?
No. The functions don't exist outside of the Perl, which runs on the server.
The browser can only interact with the Perl by requesting URLs from the server.
You need to map the URLs on to the functions you want to run.
If you're doing this by hand, you would typically do something like:
my $action = $q->param('action');
if ($action eq "show_user") {
show_user();
}
Frameworks such as Catalyst, Dancer, and Web::Simple provide routing systems to make this easier.
For example, in Catalyst (probably the most complex of the options I suggested above, but the one I'm most familiar with) you might do something like:
sub show_user : Local : Args(1) {
# code to run in http://example.com/show_user/gettingthere is requested
my ($self, $c, $username) = #_;
}
Firstly, you can't put links in plain text. So I suspect that you're actually returning HTML.
Secondly, I'm not sure why you think that CSS would be the right too for changing how your links work. CSS is for changing appearance, not behaviour.
On most social media web sites, each user has a profile page and that page has a URL. So shouldn't your search results just include those profile page links rather than a call to some Perl function?
All in all, you seem rather confused. I think you should probably have another go at explaining what you're doing - as currently it's really not very clear. Perhaps include some code - that usually clarifies things.
Related
I am trying to get specific content on a confluence cloud wiki to display content based on a specific user. The scenario here is that there are links on a page but only 1 should display, the one that displays is based on whom ever is logged in.
I have been told how a macro is the way forward, but I have read the documentation and I am at a loss. I do not understand what I have to do or how to write a confluence macro. could someone help me out with either an example or some links? I have searched like crazy, but maybe i am not asking the right questions but hopfully you can all help me out?
There's a plugin for this:
https://marketplace.atlassian.com/plugins/net.customware.confluence.plugin.visibility
But I'm not sure how thoroughly it hides the content. It might still be visible if users view the page source. If you're trying to hide content which needs to be really protected, you'll probably need to do something else.
Depending on how many users are going to be using the page, you could also just make separate spaces for them, add the permissions to those spaces, and then use a page-include on your "main" page to display the content. If they don't have access it shouldn't show up. You might experience some formatting issues with that solution, however.
Finally, you could grab the username with jquery and display stuff based on that. This solution will be pretty easy if you are familiar with javascript/jquery.
Edit: Here are some helpful resources on how to use javascript and jquery within confluence:
https://confluence.atlassian.com/display/CONFKB/How+to+Use+JavaScript+in+Confluence
https://developer.atlassian.com/confdev/confluence-plugin-guide/writing-confluence-plugins/including-javascript-and-css-resources
I need something like this
def sitemap(): SiteMap = SiteMap( Menu.i("Home") / "index#myhash/subhash" )
The point is I need add hash to the menu url. So, the resulting url would be like this
Home .
Is there any way to do it? I need just a temporary solution, so, any idea/hack would help.
thank you
The answer I recieved in lift community mail list is that I can't add hash when using sitemap because (as I see) sitemap is intented to be a server side thing that will also catch/match different url locations and help to router them. And since hashes are never sent to server, SiteMap does not work with hashes at all.
To solve my problem I just built menu links myself (I just added html markup to the page). It was acceptable for my case. In more complicated situation one can use snippet (self-written) to generate the menu in preferable way.
I have this gwt app which say, runs on http://mygwtapp.com/ (which is actually: http://mygwtapp.com/index.html)
The app host a database of users, queried by searching usernames using the search view and results are shown in the user results view. Pretty useful enough. However I need to bb add a way that user view can be viewed by just typing http://myapp.com/user123
I am thinking that the question I have here, the answer is a server side solution. However if there's a client side solution, please let me know.
One fellow here in StackOVerflow suggested that the format would be like this:
mygwtapp.com/index.html#user123
However the format is important to be like: http://myapp.com/user123
The 'something' in 'http://host/path#something' is a Fragment identifier. FIs have a specific feature: the page isn't reloaded if only FI part in URL changes, but they still take part in browser history.
FI's are a browser mechanism that GWT uses to create "pages", i.e. parts of GWT application that are bookmarkable and have history support.
You can try to use an URL without # (the FI separator), but then you will have a normal URL, that reloads the page with every change and it could not be (easily) a part of a normal GWT app.
mygwtapp.com/index.html#user123
That would be using the History mechanism (http://code.google.com/webtoolkit/doc/latest/DevGuideCodingBasicsHistory.html) which I would add is the recommended way of doing it.
However, if you insist on using something like http://myapp.com/user123, one of the possible ways is to have a servlet which accepts this request (you might have to switch to something like http://myapp.com/details?id=user123). The servlet will look up the DB and return your host html back. Before returning it will inject the required details as a Dictionary entry in the page (http://google-web-toolkit.googlecode.com/svn/javadoc/1.5/com/google/gwt/i18n/client/Dictionary.html) On the client you can read this data and display on the UI
I am interested in learning Perl. I am using Learning Perl books and cpan's web-sites for reference.
I am looking forward to do some web/text scraping application using Perl to apply whatever I have learnt.
Please suggest me some good options to begin with.
(this is not a homework. want to do something in Perl that would help me exploit basic Perl features)
If the web pages you want to scrape require JavaScript to function properly, you are going to need more than what WWW::Mechanize can provide you. You might even have to resort to controlling a specific browser via Perl (e.g. using Win32::IE::Mechanize or WWW::Mechanize::Firefox).
I haven't tried it, but there is also WWW::Scripter with the WWW::Scripter::Plugin::JavaScript plugin.
As others have said, WWW::Mechanize is an excellent module to use for web scraping tasks; you'll do well to learn how to use it, it can make common tasks very easy. I've used it for several web scraping tasks, and it just takes care of all the boring stuff - "go here, find a link with this text and follow it, now find a form with fields named 'username' and 'password', enter these values and submit the form...".
Scrappy is also well worth a look - it lets you do a lot with very little code - an example from its documentation:
my $spidy = Scrappy->new;
$spidy->crawl('http://search.cpan.org/recent', {
'#cpansearch li a' => sub {
print shift->text, "\n";
}
});
Scrappy makes use of Web::Scraper under the hood, which you might want to look at too as another option.
Also, if you need to extract data from HTML tables, HTML::TableExtract makes this dead easy - you can locate the table you're interested in by naming the headings it contains, and extract data very easily indeed, for example:
use HTML::TableExtract;
$te = HTML::TableExtract->new( headers => [qw(Date Price Cost)] );
$te->parse($html_string) or die "Didn't find table";
foreach $row ($te->rows) {
print join(',', #$row), "\n";
}
The most popular web scraping module for Perl is WWW::Mechanize, which is excellent if you can't just retrieve your destination page but need to navigate to it using links or forms, for instance, to log in. Have a look at its documentation for inspiration.
If your needs are simple, you can extract the information you need from the HTML using regular expressions (but beware your sanity), otherwise it might be better to use a module such as HTML::TreeBuilder to do the job.
A module that seems interesting, but that I haven't really tried yet, is WWW::Scripter. It's a subclass of WWW::Mechanize, but has support for Javascript and AJAX, and also integrates HTML::DOM, another way to extract information from the page.
Try the Web-Scraper Perl module. A beginners tutorial can be found here.
It's safe, easy to use and fast.
You may also want to have a look at my new Perl wrapper over Java HtmlUnit. It is very easy to use, e.g. look at the quick tutorial here:
http://code.google.com/p/spidey/wiki/QuickTutorial
By tomorrow I will publish some detailed installation instructions and a first release.
Unlike Mechanize and alike you get some JavaScript support and it is way faster and less memory demanding than screen scraping.
I've just noticed that the long, convoluted Facebook URLs that we're used to now look like this:
http://www.facebook.com/example.profile#!/pages/Another-Page/123456789012345
As far as I can recall, earlier this year it was just a normal URL-fragment-like string (starting with #), without the exclamation mark. But now it's a shebang or hashbang (#!), which I've previously only seen in shell scripts and Perl scripts.
The new Twitter URLs now also feature the #! symbols. A Twitter profile URL, for example, now looks like this:
http://twitter.com/#!/BoltClock
Does #! now play some special role in URLs, like for a certain Ajax framework or something since the new Facebook and Twitter interfaces are now largely Ajaxified?
Would using this in my URLs benefit my Web application in any way?
This technique is now deprecated.
This used to tell Google how to index the page.
https://developers.google.com/webmasters/ajax-crawling/
This technique has mostly been supplanted by the ability to use the JavaScript History API that was introduced alongside HTML5. For a URL like www.example.com/ajax.html#!key=value, Google will check the URL www.example.com/ajax.html?_escaped_fragment_=key=value to fetch a non-AJAX version of the contents.
The octothorpe/number-sign/hashmark has a special significance in an URL, it normally identifies the name of a section of a document. The precise term is that the text following the hash is the anchor portion of an URL. If you use Wikipedia, you will see that most pages have a table of contents and you can jump to sections within the document with an anchor, such as:
https://en.wikipedia.org/wiki/Alan_Turing#Early_computers_and_the_Turing_test
https://en.wikipedia.org/wiki/Alan_Turing identifies the page and Early_computers_and_the_Turing_test is the anchor. The reason that Facebook and other Javascript-driven applications (like my own Wood & Stones) use anchors is that they want to make pages bookmarkable (as suggested by a comment on that answer) or support the back button without reloading the entire page from the server.
In order to support bookmarking and the back button, you need to change the URL. However, if you change the page portion (with something like window.location = 'http://raganwald.com';) to a different URL or without specifying an anchor, the browser will load the entire page from the URL. Try this in Firebug or Safari's Javascript console. Load http://minimal-github.gilesb.com/raganwald. Now in the Javascript console, type:
window.location = 'http://minimal-github.gilesb.com/raganwald';
You will see the page refresh from the server. Now type:
window.location = 'http://minimal-github.gilesb.com/raganwald#try_this';
Aha! No page refresh! Type:
window.location = 'http://minimal-github.gilesb.com/raganwald#and_this';
Still no refresh. Use the back button to see that these URLs are in the browser history. The browser notices that we are on the same page but just changing the anchor, so it doesn't reload. Thanks to this behaviour, we can have a single Javascript application that appears to the browser to be on one 'page' but to have many bookmarkable sections that respect the back button. The application must change the anchor when a user enters different 'states', and likewise if a user uses the back button or a bookmark or a link to load the application with an anchor included, the application must restore the appropriate state.
So there you have it: Anchors provide Javascript programmers with a mechanism for making bookmarkable, indexable, and back-button-friendly applications. This technique has a name: It is a Single Page Interface.
p.s. There is a fourth benefit to this technique: Loading page content through AJAX and then injecting it into the current DOM can be much faster than loading a new page. In addition to the speed increase, further tricks like loading certain portions in the background can be performed under the programmer's control.
p.p.s. Given all of that, the 'bang' or exclamation mark is a further hint to Google's web crawler that the exact same page can be loaded from the server at a slightly different URL. See Ajax Crawling. Another technique is to make each link point to a server-accessible URL and then use unobtrusive Javascript to change it into an SPI with an anchor.
Here's the key link again: The Single Page Interface Manifesto
First of all: I'm the author of the The Single Page Interface Manifesto cited by raganwald
As raganwald has explained very well, the most important aspect of the Single Page Interface (SPI) approach used in FaceBook and Twitter is the use of hash # in URLs
The character ! is added only for Google purposes, this notation is a Google "standard" for crawling web sites intensive on AJAX (in the extreme Single Page Interface web sites). When Google's crawler finds an URL with #! it knows that an alternative conventional URL exists providing the same page "state" but in this case on load time.
In spite of #! combination is very interesting for SEO, is only supported by Google (as far I know), with some JavaScript tricks you can build SPI web sites SEO compatible for any web crawler (Yahoo, Bing...).
The SPI Manifesto and demos do not use Google's format of ! in hashes, this notation could be easily added and SPI crawling could be even easier (UPDATE: now ! notation is used and remains compatible with other search engines).
Take a look to this tutorial, is an example of a simple ItsNat SPI site but you can pick some ideas for other frameworks, this example is SEO compatible for any web crawler.
The hard problem is to generate any (or selected) "AJAX page state" as plain HTML for SEO, in ItsNat is very easy and automatic, the same site is in the same time SPI or page based for SEO (or when JavaScript is disabled for accessibility). With other web frameworks you can ever follow the double site approach, one site is SPI based and another page based for SEO, for instance Twitter uses this "double site" technique.
I would be very careful if you are considering adopting this hashbang convention.
Once you hashbang, you can’t go back. This is probably the stickiest issue. Ben’s post put forward the point that when pushState is more widely adopted then we can leave hashbangs behind and return to traditional URLs. Well, fact is, you can’t. Earlier I stated that URLs are forever, they get indexed and archived and generally kept around. To add to that, cool URLs don’t change. We don’t want to disconnect ourselves from all the valuable links to our content. If you’ve implemented hashbang URLs at any point then want to change them without breaking links the only way you can do it is by running some JavaScript on the root document of your domain. Forever. It’s in no way temporary, you are stuck with it.
You really want to use pushState instead of hashbangs, because making your URLs ugly and possibly broken -- forever -- is a colossal and permanent downside to hashbangs.
To have a good follow-up about all this, Twitter - one of the pioneers of hashbang URL's and single-page-interface - admitted that the hashbang system was slow in the long run and that they have actually started reversing the decision and returning to old-school links.
Article about this is here.
I always assumed the ! just indicated that the hash fragment that followed corresponded to a URL, with ! taking the place of the site root or domain. It could be anything, in theory, but it seems the Google AJAX Crawling API likes it this way.
The hash, of course, just indicates that no real page reload is occurring, so yes, it’s for AJAX purposes. Edit: Raganwald does a lovely job explaining this in more detail.