mobile: html5 vs xhtml [closed]


As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 9 years ago.
I am building a mobile app (a hybrid mobile web app with a native shell) with most users on the iPhone (some on the BlackBerry) and am wondering whether it should be written in HTML5 or XHTML.
Any insight would be great.

tl;dr: Use HTML5, because text/html XHTML is parsed as HTML5, and proper XHTML can fail spectacularly.
Current browsers don't actually support HTML4 or XHTML/1.x any more. They treat all documents as HTML5/XHTML5 (e.g. <video> will work even if you set HTML4 or XHTML/1.x DOCTYPE).
Your choice isn't really between HTML5 and XHTML, but between text/html and XML parsing mode and quirks and standards rendering modes. Browser engines are not aligned with versions of W3C specs.
The real choices are:
quirks vs standards mode. "Quirks" is emulation of IE5 bugs and box model. Quirks mode bites you if you omit the DOCTYPE or use one of the obsolete DOCTYPEs (like HTML4 Transitional).
The obvious choice is to enable standards mode by putting (any) modern DOCTYPE in every document.
text/html vs application/xhtml+xml. The XML mode enables XHTML features that weren't in HTML (such as namespaces and self-closing syntax on all elements) and most importantly enables draconian error handling.
NB: it's not possible to enable XML mode from within a document. The only way to enable it is via the real Content-Type HTTP header (for this purpose <meta> and DOCTYPE are ignored!)
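To make the Content-Type point concrete, here is a minimal sketch using Python's standard http.server. The same bytes are served under both media types, and only the header decides which parser a browser uses. The /xml path, the content_type helper, the serve function and port 8000 are all assumptions made up for this illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# One document, written so that it is valid in both parsing modes.
PAGE = b"""<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><meta charset="utf-8" /><title>Demo</title></head>
<body><p>Hello</p></body>
</html>
"""


def content_type(xml_mode: bool) -> str:
    """Return the Content-Type value that selects the browser's parser."""
    if xml_mode:
        return "application/xhtml+xml; charset=utf-8"  # XML parser, draconian errors
    return "text/html; charset=utf-8"  # HTML parser, forgiving


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hypothetical routing: paths under /xml get XML mode, all others HTML.
        xml_mode = self.path.startswith("/xml")
        self.send_response(200)
        self.send_header("Content-Type", content_type(xml_mode))
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)


def serve(port: int = 8000) -> None:
    """Start the demo server on localhost (blocks until interrupted)."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling serve() and opening / and /xml in a browser should show the same page handled by the two different parsers; nothing inside PAGE itself changes between the two requests.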
The XML mode was supposed to be the best for mobiles in times of WAP and XHTML Basic, but in practice it turned out to be a fantasy!
If you use application/xhtml+xml mode your page will be completely inaccessible to many users of GSM connections!
There's some proxy software used by major mobile operators, at least in the UK and Poland where I've tested it, which injects invalid HTML into everything that looks HTML-like, including properly served XHTML documents.
This means that your well-formed, perfect XHTML will be destroyed in transit and the user will see only an XML parse error on their side. The user won't be able to notify you about the problem, and since the markup is mangled outside your server, it isn't something you can fix.
This is what all XML-mode (correctly served XHTML) pages look like on O2 UK: an XML parse error instead of the page.
(the page renders fine when loaded via Wi-Fi or VPN that prevents mobile operator from screwing up the markup)
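The difference in error handling can be sketched with Python's standard library. Here xml.etree.ElementTree stands in for a browser's XML parser and html.parser.HTMLParser for a forgiving HTML parser; neither is a real browser engine, and the injected snippet is a made-up example of what a proxy might add, not a capture of what any operator actually sends.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>Hi</p></body></html>"
)

# Hypothetical proxy damage: unclosed <img> and <br> tags before </body>.
INJECTED = WELL_FORMED.replace("</body>", "<img src='x.gif'><br></body>")


def parses_as_xml(markup: str) -> bool:
    """Return True if the markup is well-formed XML, False on a parse error."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TextCollector(HTMLParser):
    """Collect text content, ignoring whatever tag structure is present."""

    def __init__(self):
        super().__init__()
        self.text = []

    def handle_data(self, data):
        self.text.append(data)


def text_via_html_parser(markup: str) -> str:
    """Extract the text with the lenient HTML parser."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.text)


print(parses_as_xml(WELL_FORMED))      # the original document is fine
print(parses_as_xml(INJECTED))         # the XML parser rejects the whole page
print(text_via_html_parser(INJECTED))  # the HTML parser still yields the text
```

A single unclosed tag is enough to make the XML parse fail outright, while the lenient parser still recovers the content, which is the behaviour the answer describes for text/html.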

HTML5 and XHTML are not exclusive choices. You can use both at once (XHTML 5) or you can use neither (HTML 4).
I wouldn't author documents to [X]HTML5 yet as the standard is not yet finished, never mind any implementations. The “HTML5” features we have available in some browsers are generally scripting extensions that don't affect HTML at a markup level at all.

My understanding is that neither the iPhone nor the Blackberry fully support HTML 5 yet. So unless you need some specific HTML 5 features I would stick with XHTML.

Pick any of them. XHTML is just an XML-language serialisation of HTML, so in reality, it's just DOM nodes encoded in a different way. (Maybe I could create a JSON-serialised version of HTML?) Really, the choice of SGML or XML serialisation depends on whether or not the device supports it. Apple uses WebKit, which fully supports XHTML.
Remember to send your XHTML as application/xhtml+xml or it won't be treated as XHTML!
Oh... and one other thing. All browsers that I know of support XHTML except IE.


Web Scraping with Scala [closed]

Closed 10 years ago.
Just wondering if anyone knows of a web-scraping library that takes advantage of Scala's succinct syntax. So far, I've found Chafe, but this seems poorly-documented and maintained. I'm wondering if anyone out there has done scraping with Scala and has advice. (I'm trying to integrate into an existing Scala framework rather than use a scraper written in, say, Python.)

First, there is a plethora of HTML scraping libs on the JVM; all you need to do is pimp one of them (the "pimp my library" pattern).
The four I have used are:
HtmlUnit - Will emulate the browser and even run Javascript
Jericho - Preserves formatting and ideal if you want to edit the scraped HTML
NekoHtml
JSoup -- I could not get it to work with Scala, though it might work
I have used Selenium but never for scraping. Scala has a wrapper around selenium.
I would recommend pimping an existing Java library over some half baked Scala lib.

I don't have a Scala-specific recommendation, but for the JVM in general I've had good success with:
JSoup: you can use CSS selectors to "scrape" the document. Really nice to work with.
Use Tagsoup to get your input HTML to XML, then use XML processors to "Scrape".
The Tagsoup route actually works quite well with Scala since Scala's built-in XML "dsl" is pretty concise (if you can forgive its perf issues and occasional API weirdness). Also, Tagsoup will handle nearly any garbage document you give it. It also has niceties like built-in understanding of many HTML entities that other SAXParsers will choke on as being undeclared.
tl;dr - JSoup + CSS selectors if possible, otherwise Tagsoup + scala XML. If slow is ok, tagsoup first, then jsoup the result.

I'd recommend Goose: https://github.com/jiminoc/goose
It's not as general-use as you might need but if you are scraping article content from popular sites, it may work out of the box. It also provides a framework for you to work from if you want to extend their code to cover other sites.

Web developers - what is the best tool for inspecting network traffic, and what is your default browser? [closed]

Closed 9 years ago.
I'm quite new at web development, and I heard about all kind of tools and plug-ins for inspecting the network traffic.
The vast majority of my work is on server side, and I work with ASP.NET if that's relevant..
For now I use Fiddler which seems great, but I also heard about Firebug (for firefox) and ieHttpHeaders.. are these in the same category as Fiddler or do they serve a different purpose? Are there more tools I should be aware of?
As my default browser I use Chrome because I think it's the fastest.
What do you use and why?
Edit: If you recommend a tool, could you please explain what it does, and why you chose it?
Thanks a lot.

The Firebug net tab is useful for looking at the network activity. It is somewhat better integrated in the browser than Fiddler.

Probably should make this a community wiki.
I use Firefox because I cannot live without some of my extensions. I stick usually with Firebug because of the nice integration with Firefox and an extension. I do have Fiddler installed to use with IE and also the Firefox extension.
I also run WireShark in case I need a deeper level of capture, packets.

Fiddler is good for looking at network traffic in detail.
Firebug does it also (under the net panel). And Firebug Lite for IE also deserves a look.
I personally use Firefox as my default browser and for development with Firebug.
For pure research I sometimes fire up Chrome because of its snappiness as you mentioned.

Firefox with firebug extension - the Net panel contains pretty much anything you could want to inspect network traffic.
Chrome has something very similar builtin (no extension needed), but has slightly better tools to benchmark the performance, along with an audit tab that firebug doesn't have.
Both are good. You might also want to try opera dragonfly which is a little different, but just as useful as those 2.

Ethereal can be very handy for network protocol analysis. Otherwise firefox / firebug as main tools.

Firefox with Firebug is good; you can also try Chrome, which includes a suite of developer tools.

Software Requirement Specifications for Web Applications [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm looking for some guidance/books to read when it comes to creating a software requirement specification for a web application. For inspiration I have read some spec documents for desktop based applications. The documents I have read capture a systems functional requirements in use cases which tend to be rather data oriented with use cases centered around the various CRUD operations the application is intended to perform.
I like this structure however I'm finding it rather difficult to marry it to what my web application needs to do, mostly reading data as opposed to manipulating it. I've had a go at writing some use cases however they all tend to boil down to "Search for item", "Change view of search results" or "User selects facet to refine search results". This doesn't sound quite right to me and makes me wonder if I'm going about this the right way.
Are there planning differences between web based and desktop based applications?

In my experience, there is really nothing wrong with having all the specifications be CRUD. Most of the time, an application isn't just "a simple CRUD app." Requirements evolve, and different parts of the system tend to diverge and acquire some specific logic.
Even if it feels like repeating the same CRUD sentences over and over, actually writing them down and thinking about it (instead of copy & pasting) will often uncover hidden requirements.

The differences between desktop based applications and web based applications are staggering.
I recommend reading these in exactly this order and apply this knowledge in exactly the opposite order, aside from CSS 3, HTML 5, and XHTML 1.1:
RFC 3986 - URI
RFC 2616 - HTTP 1.1
RFC 4346 - TLS 1.1
RFC 4251 - SSH Protocol
RFC 4252 - SSH Authentication
RFC 4253 - SSH Transport
RFC 2045 - MIME
RFC 4627 - JSON
HTML 4.01
XML
XHTML 1.0
XHTML 1.1
ECMAScript
CSS 2
HTML 5 (Not a standard)
CSS 3 (Not a standard)
Web Content Accessibility Guidelines 2.0
Symantec Internet Security Threat Report Volume XIV
Symantec Internet Security Threat Report Volume XV
OWASP Top 10
SEO
Once you have finished reading this you should begin to understand how the basic technology of the web works. Only at this point would you be ready to develop, conformantly, for a web application. There are many other technologies at play, but these are the basics and once you are familiar with the basics you will know where else to look for more information.

Basically you can use the same method as for desktop applications, although you might make some additions, because web applications often tend to have different types of requirements. First of all, read something good about use cases: there are different use case levels, and that might be a solution to your use cases that do not seem quite right. Also, do not forget about use case generalization and parametrized use cases if CRUD repetition is the problem.
One thing that is often more important in web applications than in desktop apps is usability. This is because of the nature of the web: people often have the choice of not using your service and going to the next Google result if your app is not usable. So a good addition to the spec, I think, is personas: find some possible instances of the human actors for your use cases, think of goals they might often want to achieve with your web app, and present how they will achieve them (and try to make it super easy, of course).
Another important thing is the information architecture, the way in which you will provide information in your web app. This comprises navigation and some basic layout, but not necessarily design: just information about where to find something in your web app. This can be done using some rapid prototyping tools.

Can you suggest a component CMS that is compatible with IBM's DITA [closed]

Closed 10 years ago.
I am looking for a Component CMS solution that is compatible with IBM's DITA in terms of preserving the document hierarchy/structure created in DITA (ditamaps).
I am not necessarily looking for an open source solution.
Other requirements would be:
- file migration
- XML support (ingestion, editing, export)
- PDF support (publishing)
- Workflow management
- Localization support (managing versions across locales)
- Output tagging

As you are looking at CCMS, be sure that you consider the following factors:
How easy is it to get your content into the system?
How easy is it to get your content out of the system?
Does the system use proprietary mechanisms for filtering, rather than support for DITAVAL that is part of the OASIS standards?
Part of the beauty of DITA is that if you follow the standard and do not use proprietary mechanisms, you easily can exchange content with business partners, move to another CCMS, if you needed, and so forth.
Older CCMS use proprietary mechanisms for some things. It's entirely understandable, since they were developed before DITA was a standard, and so have legacy customers with implementations that they must support.

It's a bit dangerous (a lot dangerous) to be choosing something like a Component CMS based on questions on a forum like this, but as long as you're asking you could look at things like: SDL Trisoft, IXIASOFT DITA CMS, Vasont, XDocs, or DITA Exchange to get an idea of what is out there. CCMS systems are vastly different from each other both in price and functionality, so things like:
Number of users
Distribution of users
IT 'religious affiliations' (e.g. SharePoint addiction, Linux)
Use of DITA features like Conref, KeyRef, SubjectScheme
Versioning flexibility requirements
Translation management
will all greatly affect the decision making process. We tend to spend time with a client before making solid recommendations so this is simply something to get you started in your research.
PS - As you may know Arbortext is not a Component CMS at all, it's an editor.

Sorry for not understanding the following thing, from the question.
[file-migration] What is the current format?
If it comes as (DITA) XML (or can be migrated to XML), the following procedure might be a solution:
[Import] Import the (DITA) XML into a Version Control System;
[Edit] From there it is easy to modify by multiple people;
[Export] Always possible, from CVS system;
[Publishing] Automatic PDF generation (DITA Open Toolkit);
[Localization support] Use branches for the different languages;
[Tagging] Tag a final release when it is published.
See also the What is the recommended tool chain for formatting XML DocBook? as these same suggestions can be used for a DITA tool chain.

IBM built DITA on Arbortext. Arbortext was the only vendor to be a charter member of the OASIS DITA Technical Committee and they continue that activity as PTC. (Arbortext was acquired by PTC in 2005).
Arbortext also supports DocBook (since conception), S1000D, and custom doctypes with no customization of the application required.
Happy to talk more about this offline.

If you are still looking for options you should check out easyDITA (http://easydita.com)

What tool/format do you use for writing your specifications? [closed]

Closed 10 years ago.
I would like to know what kind of tool you use for writing your specifications. I think it's essential to use a tool that supports some kind of plain text format so that one can control the specification with a source control system like SVN. For the specification as for the code as well, it's important to have a history of all changes.
At present we write our specification in an XML format. TeX would also be an alternative, but it's hard for people who have never worked with it.
So let me know, what kind of tools or formats you use for specifications.
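As a small illustration of why a plain-text format matters for change history, here is a sketch using Python's standard difflib to show the line-level diff that a source control system can produce for a text spec. The two spec revisions, the spec_diff helper and the file name are invented for this example.

```python
import difflib


def spec_diff(old: str, new: str, name: str = "spec.txt") -> list:
    """Return a unified diff between two revisions of a plain-text spec."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=name + " (old)",
            tofile=name + " (new)",
            lineterm="",
        )
    )


# Two hypothetical revisions of the same specification.
old = "1. The user can log in.\n2. The user can search.\n"
new = (
    "1. The user can log in.\n"
    "2. The user can search by title.\n"
    "3. The user can log out.\n"
)

for line in spec_diff(old, new):
    print(line)
```

Each changed requirement shows up as a removed or added line, which is the kind of history that is hard to get from a binary document format.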

DocBook edited with XXE, translated to pdf with xslt when needed to be sent to clients.
Best change ever, so much easier to write, so much easier to merge, and when it's converted it doesn't look so godawfully unprofessional as MSWord.
Plus the structured document style is already there, unlike bloody word which you have to fight with to get working.

We used TeX (MikTeX) and it was perfect because:
plain text - edit in Vim/Notepad - just everywhere
powerful formatting using predefined macros one of us wrote
one-click generation of PDF
The only problem was to get diagrams (from ArgoUML) in.
On another project I saw Word templates being used - awful stuff directed from above.
I'd consider using something like wiki/forum on intranet. Imagine using GoogleDocs - there is versioning, it's online.. but not applicable for commercial development.

At work a lot of our documents go under SharePoint or some other document system that really slows down the "release" of a document. This means there are copies of the documents all over the place, and getting someone to properly release something is a headache. Because of this I normally received specs in PowerPoint or on scrap paper. So I put up a wiki (MediaWiki) at work that we now keep all project specs in. This allows them to be viewable by anyone in the company and editable by our development group. Sometimes a developer will ask the boss for a clarification as they pass by, and the developer can update the spec themselves, which I think is a huge advantage. Also, when people update a spec with new information, the history makes it very easy to see what the most recent changes were - meaning I can see what was happening before and what needs to happen now.
I still keep a spec that was scribbled on some notebook paper up on my wall as a reminder.

I've come to use Docbook for all such things. It's easy, flexible, and will generate html, tex (and thus pdf), etc.

Microsoft Word. I know it doesn't meet with your requirements but in every job I've had I've used Microsoft Word for the specifications. You can, and I have, put Word documents in a source control system - The only thing you lose is the ability to diff between documents. Although I do vaguely remember reading somewhere that there are diff tools for word that can be used.

At work we use a wiki because they are great for collaboration, but Microsoft Word will work.
You can actually diff two different versions of a Word document using Word itself - it uses the "track changes" feature to show the differences. (If you don't believe me, try diffing two versions of a Word document using TortoiseSVN.)
For long documents, I actually prefer Word over the wiki because it is well suited to editing long documents and business folks are more comfortable working with Word documents.