Create URL based on entered data in iOS - iphone

If user entered say google then
i need to add http://www.google.com ie missing part.
User may enter any thing say google.in or www.google or anything.
Now goal to complete the left over url as we check url using regex like this:
NSString *urlRegEx = #"(http|https)://((\\w)*|([0-9]*)|([-|_])*)+([\\.|/]((\\w)*|([0-9]*)|([-|_])*))+";
That given url is valid or not

Don't forget the third level domain names with suffixes like .co.uk, .co.us, .com.co, etc.
A fully qualified domain name must have at least one dot. If it doesn't have at least one dot, then you might add .com to the end.
If it does have at least one dot, then it gets more complicated. .google could be a top level domain in the future, though it isn't now. Perhaps you want to keep a white list of all "valid" first and second level domain names. You evaluate the entered domain name from the right until it stops matching domains from your list. The remainder is the "registered" domain name and any sub domains. If you don't find any matches, then add .com.
Alternatively, rather than parsing the domain name, you could just try to resolve it, and if it doesn't resolve, then add .com and try again.

I think I know what you're asking however other than the regex how are you actually validating that the URL is valid? It seems as though you're making an incorrect assumption that all URLs follow a common syntax. As as example, http://www.www.extra-www.org/ is a valid URL so if you apply your regex (as I understand your intent) the user would get to http://www.extra-www.org which may not be the same site as the one the user wanted (even though in this case it is because it forwards). Another example is http://www.www.com... if the user enters "www" your regex will kill it. A final example is if a site doesn't have its DNS registered WITH the "www" - your regex will incorrectly add the "www" piece.
EDIT: what happens if the user needs https as opposed to http and only enters "google"?

Haven't tried this one myself yet, but an NSDataDetector for the type NSTextCheckingTypeLink should be able to do you a decent job.

Related

Application Request Routing on local machine

I installed ARR on my local machine and setup a server farm with a single server in it (localhost). I added two redirect routing rules. However, it doesn't do the redirect. My Default Web Site has ab additional binding like this one: localhost.mycompany.com. I tried putting that in the server farm and it still didn't work. The redirect rules look like this.
Uses wildcards in the pattern
inbound pattern: */path2/*/*/*/method*
Redirect URL: /path1/path2/api/item/method
EDIT: When I use the Test Pattern and enter one of the URLs against my rule it parses it successfully
Also tried putting the full hostname (e.g. http://localhost.mycompany.com/...) in the redirect rule as well as using the alias localServerFarm (which is the name of server farm). Nothing worked.
The module is "working" in some respect because when I had a broken rule it sure told me about it when I tried to load any url on localhost. Once I fixed the rule, I no longer got the error message but it doesn't do any redirection.
This was just a matter of getting the redirect rule correct. In the rules list there is a column named Input and it's setting is URL Path. So, the only input to the pattern match is the path part of the URL not including the / at the beginning. All I had to do was change the */ at the beginning of my pattern to just *, e.g. */path2/*/*/*/method* changed to *path2/*/*/*/method*.
I don't know if there's any other setting for the Input field (it isn't settable in the rule definition screen) but for anyone creating rules remember that only the path without a leading / is what's used for evaluating the pattern match. One note is that if you're matching from the beginning of the path, as I am, you don't need the * at the beginning of the pattern. However, if you go into the test pattern screen and paste a full URL into the Input data it will not just grab the path part of that URL and feed it to the pattern match will use the entire string so it will require an * at the beginning of your pattern to work.

How can I be sure an email address is unique?

There's a pub in my town whereby, if you sign up to their newsletter using their website and provide a "unique" email address, you get a free drink. On a whim, I decided to sign up a second time using myemail+one#gmail.com. It let me. I'm now sitting on a nice comfy pile of free drink vouchers.
This got me thinking about a system we have here, where the email address is considered the unique identifier. Checking the code, sure enough, if we were offering vouchers in our business, someone else would be sitting pretty.
The basic, stab-in-the-dark, fix is to check for the "+" character and ignore everything after it (up to the #), and compare using that. But I am unsure if this was the intent for the + character. Would that work?
Secondly, are there any other caveats that would allow a user to sign up multiple times with a seemingly different email address, but which actually would always end up in the same mailbox?
This question is language-agnostic.
While using a plus sign as an e-mail address alias is a known feature of gmail, other mailers do either not allow it or use a minus sign instead. '+' is a legitimate character to be used as part of an email address according to the RFC.
The use of '.' is also a gray area. john.doe#gmail.com and johndoe#gmail.com send also both to the same email address and look different.
In order to validate the uniqueness of an email address you will have to prepare a rule base for your application, keep it up to date and still expect surprises...

Including email in restful URIs

In my application I have URIs that include the userid to identify user private resources. And userids are emails, for example:
/users/user2#example.com/private-resource
It's a good practice to put email in the URI, including characters like . and #?
Or should I use some other type of userid? Like an hash for example?
If the email can function as fixed identifier it would be ok.
The thing is that most of the time applications will allow users to change emails; in this case it would be more bullet proof to control the ID-space yourself, e.g. by using a surrogate key. (Because the email (if users can change them) is not the identity but merely a property of the resource).
Another argument against emails is - as #Rob points out - a potential security issue.
There's no problem with that... just make sure you URL-encode the appropriate characters.
(Email's kind of sensitive information, though, you might want to pass a hash or surrogate key instead just to protect your users' email addresses if the URLs get passed around.)
though having '-' in uri is absolutely fine.
use of '#' , '#' are discouraged
however, as long as email id is an id e.g. http://example.org/user/mail/{email}/1234
guess thats fine.

Is it possible to include comments inside a non email host name?

I am working on a more complete email validator in java and came across an interesting ability to embed comments within an email both in the "username" and "address" portions.
The following snippet from http://www.dominicsayers.com/isemail/ has this to say about comments within an email.
Comments are text surrounded by parentheses (like these). These are OK but don't form part of the address. In other words mail sent to first.last#example.com will go to the same place as first(a).last(b)#example(c).com(d). Strange but true.
Do emails like this really exist ?
Is it possible to form hosts like this ?
I tried entering an url such as "google(ignore).com" but firefox and some other browsers failed and i was wondering is this because its it wrong or is it because they dont know about host name comments ?
That syntax -- comments within an addr-spec -- was indeed permissible by the original email RFC, RFC 822. However, the placement of comments like you'd like to use them was deprecated when that RFC was revised by RFC 2822... 10 years ago. It's still marked as obsolete in the current version, RFC 5322. There's no good excuse for emitting anything using that syntax.
Address parsers are supposed to be backwards-compatible in order to cover all conceivable cases, including 10-years-deprecated bits like the one you're trying to take advantage of here. But I'll bet that many, many receiving mail agents will fail to properly parse out those comments. So even though you may have technically found a loophole via the "obsolete addressing" section of the RFC, it's not likely to do you much good in practice.
As for HTTP, the syntax rules aren't the same as email syntax rules. As you're seeing, the comment section from RFC 822 isn't applicable.
Just because you can do it in the spec, doesn't mean that you should. For example, Gmail will not accept that comment format for address.
Second, (to your last point), paren-comments being allowed in email addresses doesn't mean that they work for URLs.
Finally, my advice: I'd tailor the completeness of your validator to your requirements. If you're writing a new MTA (mail transfer agent), you'll probably have to do it all. If you're writing a validator for a user input, keep it simple:
look for one #,
make sure you have stuff before (username) and after (domain name),
make sure you have a "dot" in the hostname string,
[extra credit] do a DNS lookup of the hostname to make sure it resolves.

"/ " in a URL completely bypassing ASP Routing

One of my routes in an MVC project serves documents that have been uploaded.
For SEO and user friendliness purposes I want the document title to be included in the URL, the route will take the ID from the incoming URL, match it with the document then redirect to a URL with the filename appended to the ID. As document titles can have a wide variety of characters in the title including ones that are used to break up parameters the filename is a catchall parameter.
This works fine for almost all characters in the title including reserved ones such as "/" but when the title include the combination "/ " routing breaks. Not just in terms of not match this route but apparently bypassing the entirety of the application and returning a 404, I tried to use Phil Haack's RouteDebugger but that was also giving a 404 rather than catching the request.
My web.config has requestvalidation turned off and I can't seem to find anyway to get the application to catch the request.
I would recommend you filtering those character or you will have many headaches. Here's a function you might find interesting.