How to put the Amazon Kindle Web Browser into Article Mode - mobile-website

How do you put the Amazon Kindle web browser into Article Mode via HTML or Javascript?
Editor's note: Some pages are not automatically detected as "articles" by the Kindle 3 browser, and give an error message when trying to go into Article mode. What does the Article mode use to determine what portion of the page to display?

Is this related at all to Readability?
http://lifehacker.com/5163401/readability-bookmarklet-quick+formats-pages-for-smoother-text
Actual JavaScript code for Readability, which is heuristic based:
// Study all the paragraphs and find the chunk that has the most <p>'s and keep it:
This also appears to be related to Safari 5's Reader mode. Here is what is required for Safari Reader:
This definitely needs more investigating, but so far, these appear to be the most important factors for Safari’s Reader functionality to kick in:
Use the right markup, i.e. make sure the most important content is wrapped inside a container element. Whether you use <article>, <div> or even <span> doesn’t seem to matter — as long as it’s not <p>.
The content needs to be long enough. Use enough words, use enough paragraphs, use enough punctuation. Every paragraph should have at least 100 characters.
Reader doesn’t work for local documents.

http://www.wired.com/gadgetlab/2010/09/simple-tip-turns-kindle-into-ultimate-news-reader/ - The "f" key feature outlined above or some other feature? Not quite sure what article mode means.

It means that the browser will try to identify if the page you are looking at has a main body of text (is an article), parse it out and then display only that text without clutter and for easy scrolling.
I don't think you can force it via the web page's code

As far as I know, once the website has loaded (and if you are on an specific topic) you can turn on the "Article Mode" from the menu.
I've seen similar JS tools for Chrome too, so I assume it's part of webkit.

Related

Google Rich Snippets warnings for hCard

I get the following errors from the Google Rich Snippet Tool for my website http://iancrowther.co.uk/
hcard
Warning: This information will not appear as a rich snippet in search results results, because it seems to describe an organization. Google does not currently display organization information in rich snippets
Warning: At least one field must be set for Hcard.
Warning: Missing required field "name (fn)".
Im experimenting with vcard and Schema.org and am wondering if I'm missing something or the validator is playing up. I have added vcard and Schema.org markup to the body which may be causing confusion. Also, I am making the assumption I can use both methods to markup my code.
Update:
I guess with the body tag, I'm just trying to let Google discover the elements which make up the schema object within the page. I'm not sure if this is a good / bad way to approach things? However it lets my markup be free of specific blocks of markup. I guess this is open to discussion but I like the idea of having a natural flow to the content that's decorated in the background. Do you think there is any negative impact? I'm undecided.
I am in favour of the Person structure, this was a good call as this is more representative of the current site content. I am a freelance developer and as such use this page as my Organisation landing page, so I guess I have to make a stronger decision of the sites goals and tailor the content accordingly, ie Organisation or Person.
I understand that there is no immediate rich snippet gains, but im a web guy so have a keen interest in these kind of things.
With schema testing, I find it easiest to start from the most obvious problem, and try to work our way deeper from there. Note, I have zero experience with hcard, but I don't believe the error you mentioned actually has anything to do with your hcard properties.
The most obvious problem I see, is that your body tag has an itemtype of schema.org\Organization. When you set an itemtype on a dom element, you are saying that everything inside of that element is going to help describe that itemtype. Since you've placed this on your body element, you are quite literally telling Google that your entire page is about an organization.
From the content of your page, I would recommend changing that itemtype to schema.org\Person. This would seem to be a more accurate description. Once you make that change and run the scanner again, you may see more errors relating to the schema and we can work through those too (for example, you'll probably need to set familname and givenName).
With all of that said, you should know that currently there are no rich snippets that you will gain from adding this schema data. Properly setting this up on your page, is only good to do, especially since we don't know what rich snippets Google or others will expose in the future, but currently you won't see any additional rich snippets in Google search results from adding these tags. I don't want to discourage you from setting this up properly but I just want to set your expectations.

Is it safe to use only HTML editor instead of Textarea?

I am thinking of converting my forum input textarea exclusively to TinyMCE HTML editor. I already have both options but it is a pain maintaining both and inserting images in textarea needs preview etc...
This is more of a general question. Do you think it is safe to include HTML editor (with all the safety measures like paste only text, filter for html not allowed etc...) as the only kind of editor on a forum? It's 2011 and machines are generally fast, connection are better.
What are the downsides of using HTMl editor instead of text field? I can not imagine a blog CMS to have "normal" textarea for input.
But for some reason on forums I do not see many html editors... Even the TinyMCE site has a textarea for their editor. So is there really something to watch out for and a no go...?
I know it is more of a phylosophical question, but I guess you have experience with forums, blogs, etc...
My site is about cooking and beeing able to insert pictures (and upload them) the easy way seems to be a big plus for our home cooks ;-)
If you don't consider security (you'll need to filter the HTML input on the server side so it won't contain anything dangerous), there's only the user experience left for consideration. On a forum you write text most of the time. There's seldom any use for more functionality than bold, italics and images. The solution used here on Stack Overflow addresses this by having a very limited set of functions, and applying it in the textarea with a sane markup language.
Other forums either use old software or didn't think the improved user experience was worth the effort. The textarea-only solution fits most forums well enough since most of the input is text-only anyway.
I do think you would benefit from HTML input. Make sure that only allowed HTML can be sent though, since the user can circumvent everything on the client side.
TinyMCE uses Javascript to add functionality to an existing textarea. If Javascript is disabled, then the user will be presented with a normal textarea anyway.
I would say it's relatively safe, as long as all input from the user is validated on the server before it's used for anything.

How to enable iOS 5 Safari Reader on my website?

How does the Reader function of Mobile Safari in iOS 5 work? How do I enable it on my site. How do I tell it what content on my page is an article to trigger this function?
A lot of the answers posted here contain false information. Here are some corrections/clarifications:
The <article> element works fine as a wrapper; Safari Reader recognizes it. My site is an example. It doesn’t matter which wrapper element you choose, as long as there is one, other than <body> or <p>. You can use <article>, <div>, <section>; or elements that are semantically incorrect for this purpose, like <nav>, <aside>, <footer>, <header>; or even inline elements like <span> (!).
No headings are required for Reader to work. Here’s an example of a document without any <h*> elements on which Reader works fine: http://mathiasbynens.be/demo/safari-reader-test-3
I posted some more details regarding my findings here: http://mathiasbynens.be/notes/safari-reader
I've tested 100 or so variations of this on my iPhone in order to figure out what triggers this elusive Reader state. My conclusions are as follows:
Here is what I found had an impact:
Having around 200 or more words (or 1000 characters including whitespace) in the article you want to trigger the "Reader" seems necessary
The reader was NEVER triggered when I had less than 170 words; although it was sometimes triggered when I had 180 or 190 words.
Text inside certain elements such as <ol> or <ul> (that are not typically used to contain a story) will not count towards the 200 words (they will however be displayed in the reader if the reader is triggered for other reasons)
Wrapping the 200 words in a block element such as a <div> or <article> seems necessary (that said, I'd be surprised if there were any websites where that was not already the case)
For full disclosure, here is what I found did NOT have an impact:
Whether using a header or not
Whether wrapping the text in a <p> or letting it flow freely
Punctuations (ie removing all periods, commas, etc, did not have an impact)
It seems the algorithm it is based on is looking for p-Tags and it counts delimiters like "." in the innerText. The section (div) with the most points gets the focus.
see:
http://lab.arc90.com/experiments/readability/
Seems to be the base for the Reader-mode, at least Safari attributes it in the Acknowledgements, see:
file:///C:/Program%20Files/Safari/Safari.resources/Help/Acknowledgments.html
Arc90 ( Readability )
Copyright © Arc90 Inc.
Readability is licensed under the Apache License, Version 2.0.
This question (How to disable Safari Reader in a web page) has more details. Copied here:
I'm curious to know more about what triggers the Reader option in Safari and what does not. I wouldn't plan to implement anything that would disable it, but curious as a technical exercise.
Here is what I've learned so far with some basic playing around:
You need at least one H tag
It does not go by character count alone but by the number of P tags and length
Probably looks for sentence breaks '.' and other criteria
Safari will provide the 'Reader' if, with a H tag, and the following:
1 P tag, 2417 chars
4 P tags, 1527 chars
5 P tags, 1150 chars
6 P tags, 862 chars
If you subtract 1 character from any of the above, the 'Reader' option is not available.
I should note that the character count of the H tag plays a part but sadly did not realize this when I determined the results above. Assume 20+ characters for H tag and fixed throughout the results above.
Some other interesting things:
Setting for P tags removes them from the count
Setting display to none, and then showing them 230ms later with Javascript avoided the Reader option too
I'd be interested if anyone can determine this in full.
Both Firefox and Chrome have the similar plugin named iReader. Here is its project with source code.
http://code.google.com/p/ireader-extension/
Read the code to get more.
I was struggling with this. I finally took out the <ul> markings in my story, and viola! it started working.
I didn't put any wrapper around the body, but may have done it by accident.
HTML5 article tag doesn't trigger it on my tests. It also doesn't seem to work on offline content (i.e. pages saved on your local machine).
What does seem to trigger it is a div block with a lot of p's with a lot of text.
The p tag theory sounds good. I think it also detects other elements as well. One of our pages with 6 paragraphs didn't trigger the Reader, but one with 4 paragraphs and an img tag did.
It's also smart enough to detect multi-page articles. Try it out on a multi-page article on nytimes.com or nymag.com. Would be interested to know how it detects that as well.
Surprising though it may be, it indeed does not pay any attention to the HTML5 article tag, particularly disappointing given that Safari 5 has complete support for article, section, nav, etc in CSS--they can be styled just like a div now, and behave the same as any block level element.
I had specifically set up a site with an article tag and several inner section tags, in prep for semantic HTML5 labeling for exactly such a purpose, so I was really hoping that Safari 5 would use that for Reader. No such luck--probably should file a bug on this, as it would make a great deal of sense. It in fact completely ignores most of the h2 level subheads on the page, each marked as a section, only displaying the single div that adheres to the criteria mentioned previously.
Ironically, the old version of the same site, which has neither article, section, nor separating div tags, recognizes the whole body for display in Reader.
See Article Publishing Guidelines.
Here are APIs about how to read and parse: Readability Developer APIs. There's already a project you can refer: ruby-readability.
A brief history:
The Safari Reader feature since Apple's Safari 5 browser embeded a codebase named Readability, and Readability started off as a simple, Javascript-based reading tool that turned any web page into a customizable reading view. It was released by Arc90 (as an Arc90 Lab experiment), a New York City-based design and technology shop, back in early 2009. It's also embeded in Amazon Kindle and popular iPad applications like Flipboard and Reeder.
I am working on algorithms for cleaning web-sites from information "waste" similar to Safari Reader feature. It's not so good as readability but has some cool stuff.
You can learn more at smartbrowser.codeplex.com project page.

What's the shebang/hashbang (#!) in Facebook and new Twitter URLs for?

I've just noticed that the long, convoluted Facebook URLs that we're used to now look like this:
http://www.facebook.com/example.profile#!/pages/Another-Page/123456789012345
As far as I can recall, earlier this year it was just a normal URL-fragment-like string (starting with #), without the exclamation mark. But now it's a shebang or hashbang (#!), which I've previously only seen in shell scripts and Perl scripts.
The new Twitter URLs now also feature the #! symbols. A Twitter profile URL, for example, now looks like this:
http://twitter.com/#!/BoltClock
Does #! now play some special role in URLs, like for a certain Ajax framework or something since the new Facebook and Twitter interfaces are now largely Ajaxified?
Would using this in my URLs benefit my Web application in any way?
This technique is now deprecated.
This used to tell Google how to index the page.
https://developers.google.com/webmasters/ajax-crawling/
This technique has mostly been supplanted by the ability to use the JavaScript History API that was introduced alongside HTML5. For a URL like www.example.com/ajax.html#!key=value, Google will check the URL www.example.com/ajax.html?_escaped_fragment_=key=value to fetch a non-AJAX version of the contents.
The octothorpe/number-sign/hashmark has a special significance in an URL, it normally identifies the name of a section of a document. The precise term is that the text following the hash is the anchor portion of an URL. If you use Wikipedia, you will see that most pages have a table of contents and you can jump to sections within the document with an anchor, such as:
https://en.wikipedia.org/wiki/Alan_Turing#Early_computers_and_the_Turing_test
https://en.wikipedia.org/wiki/Alan_Turing identifies the page and Early_computers_and_the_Turing_test is the anchor. The reason that Facebook and other Javascript-driven applications (like my own Wood & Stones) use anchors is that they want to make pages bookmarkable (as suggested by a comment on that answer) or support the back button without reloading the entire page from the server.
In order to support bookmarking and the back button, you need to change the URL. However, if you change the page portion (with something like window.location = 'http://raganwald.com';) to a different URL or without specifying an anchor, the browser will load the entire page from the URL. Try this in Firebug or Safari's Javascript console. Load http://minimal-github.gilesb.com/raganwald. Now in the Javascript console, type:
window.location = 'http://minimal-github.gilesb.com/raganwald';
You will see the page refresh from the server. Now type:
window.location = 'http://minimal-github.gilesb.com/raganwald#try_this';
Aha! No page refresh! Type:
window.location = 'http://minimal-github.gilesb.com/raganwald#and_this';
Still no refresh. Use the back button to see that these URLs are in the browser history. The browser notices that we are on the same page but just changing the anchor, so it doesn't reload. Thanks to this behaviour, we can have a single Javascript application that appears to the browser to be on one 'page' but to have many bookmarkable sections that respect the back button. The application must change the anchor when a user enters different 'states', and likewise if a user uses the back button or a bookmark or a link to load the application with an anchor included, the application must restore the appropriate state.
So there you have it: Anchors provide Javascript programmers with a mechanism for making bookmarkable, indexable, and back-button-friendly applications. This technique has a name: It is a Single Page Interface.
p.s. There is a fourth benefit to this technique: Loading page content through AJAX and then injecting it into the current DOM can be much faster than loading a new page. In addition to the speed increase, further tricks like loading certain portions in the background can be performed under the programmer's control.
p.p.s. Given all of that, the 'bang' or exclamation mark is a further hint to Google's web crawler that the exact same page can be loaded from the server at a slightly different URL. See Ajax Crawling. Another technique is to make each link point to a server-accessible URL and then use unobtrusive Javascript to change it into an SPI with an anchor.
Here's the key link again: The Single Page Interface Manifesto
First of all: I'm the author of the The Single Page Interface Manifesto cited by raganwald
As raganwald has explained very well, the most important aspect of the Single Page Interface (SPI) approach used in FaceBook and Twitter is the use of hash # in URLs
The character ! is added only for Google purposes, this notation is a Google "standard" for crawling web sites intensive on AJAX (in the extreme Single Page Interface web sites). When Google's crawler finds an URL with #! it knows that an alternative conventional URL exists providing the same page "state" but in this case on load time.
In spite of #! combination is very interesting for SEO, is only supported by Google (as far I know), with some JavaScript tricks you can build SPI web sites SEO compatible for any web crawler (Yahoo, Bing...).
The SPI Manifesto and demos do not use Google's format of ! in hashes, this notation could be easily added and SPI crawling could be even easier (UPDATE: now ! notation is used and remains compatible with other search engines).
Take a look to this tutorial, is an example of a simple ItsNat SPI site but you can pick some ideas for other frameworks, this example is SEO compatible for any web crawler.
The hard problem is to generate any (or selected) "AJAX page state" as plain HTML for SEO, in ItsNat is very easy and automatic, the same site is in the same time SPI or page based for SEO (or when JavaScript is disabled for accessibility). With other web frameworks you can ever follow the double site approach, one site is SPI based and another page based for SEO, for instance Twitter uses this "double site" technique.
I would be very careful if you are considering adopting this hashbang convention.
Once you hashbang, you can’t go back. This is probably the stickiest issue. Ben’s post put forward the point that when pushState is more widely adopted then we can leave hashbangs behind and return to traditional URLs. Well, fact is, you can’t. Earlier I stated that URLs are forever, they get indexed and archived and generally kept around. To add to that, cool URLs don’t change. We don’t want to disconnect ourselves from all the valuable links to our content. If you’ve implemented hashbang URLs at any point then want to change them without breaking links the only way you can do it is by running some JavaScript on the root document of your domain. Forever. It’s in no way temporary, you are stuck with it.
You really want to use pushState instead of hashbangs, because making your URLs ugly and possibly broken -- forever -- is a colossal and permanent downside to hashbangs.
To have a good follow-up about all this, Twitter - one of the pioneers of hashbang URL's and single-page-interface - admitted that the hashbang system was slow in the long run and that they have actually started reversing the decision and returning to old-school links.
Article about this is here.
I always assumed the ! just indicated that the hash fragment that followed corresponded to a URL, with ! taking the place of the site root or domain. It could be anything, in theory, but it seems the Google AJAX Crawling API likes it this way.
The hash, of course, just indicates that no real page reload is occurring, so yes, it’s for AJAX purposes. Edit: Raganwald does a lovely job explaining this in more detail.

Use of HTML 5 doctype creates a gap at top of page on iphone safari browser

Update: Please disregard, my problem was caused by an advertisement bar being inserted by the vendor who provides my workplace wireless service.
I was building a mobile friendly website and wanted to use HTML 5. However when I specify the doctype as <!DOCTYPE HTML> , I get a gap at the top of the page on safari on the iphone.
I notice that other sites have the same problem such as nextstop.com and nike.com
I guess safari does not fully support HTML 5 yet. Anybody know of a workaround?
HTML 5 is still in a very unstable state. Don't use it in a production environment.
Edit Just so you guys know what it's about, HTML 5 is currently an Editor's Draft, and the document clearly states (in the Status of This Document section) that this specification is not stable, and that a consensus may not have been reached on any of the proposed sections. I think it should be clear enough that it means it's a bit early to start using it.
All browsers correctly interpret the HTML doctype. Putting it in sets your browser into Standards Compliant mode, that is the only difference with or without the doctype.
You can use a CSS reset tool like http://meyerweb.com/eric/tools/css/reset/ to get rid of default margins and padding on all elements.