"/ " in a URL completely bypassing ASP Routing - asp.net-mvc-2

One of my routes in an MVC project serves documents that have been uploaded.
For SEO and user friendliness purposes I want the document title to be included in the URL, the route will take the ID from the incoming URL, match it with the document then redirect to a URL with the filename appended to the ID. As document titles can have a wide variety of characters in the title including ones that are used to break up parameters the filename is a catchall parameter.
This works fine for almost all characters in the title including reserved ones such as "/" but when the title include the combination "/ " routing breaks. Not just in terms of not match this route but apparently bypassing the entirety of the application and returning a 404, I tried to use Phil Haack's RouteDebugger but that was also giving a 404 rather than catching the request.
My web.config has requestvalidation turned off and I can't seem to find anyway to get the application to catch the request.

I would recommend you filtering those character or you will have many headaches. Here's a function you might find interesting.

Related

Application Request Routing on local machine

I installed ARR on my local machine and setup a server farm with a single server in it (localhost). I added two redirect routing rules. However, it doesn't do the redirect. My Default Web Site has ab additional binding like this one: localhost.mycompany.com. I tried putting that in the server farm and it still didn't work. The redirect rules look like this.
Uses wildcards in the pattern
inbound pattern: */path2/*/*/*/method*
Redirect URL: /path1/path2/api/item/method
EDIT: When I use the Test Pattern and enter one of the URLs against my rule it parses it successfully
Also tried putting the full hostname (e.g. http://localhost.mycompany.com/...) in the redirect rule as well as using the alias localServerFarm (which is the name of server farm). Nothing worked.
The module is "working" in some respect because when I had a broken rule it sure told me about it when I tried to load any url on localhost. Once I fixed the rule, I no longer got the error message but it doesn't do any redirection.
This was just a matter of getting the redirect rule correct. In the rules list there is a column named Input and it's setting is URL Path. So, the only input to the pattern match is the path part of the URL not including the / at the beginning. All I had to do was change the */ at the beginning of my pattern to just *, e.g. */path2/*/*/*/method* changed to *path2/*/*/*/method*.
I don't know if there's any other setting for the Input field (it isn't settable in the rule definition screen) but for anyone creating rules remember that only the path without a leading / is what's used for evaluating the pattern match. One note is that if you're matching from the beginning of the path, as I am, you don't need the * at the beginning of the pattern. However, if you go into the test pattern screen and paste a full URL into the Input data it will not just grab the path part of that URL and feed it to the pattern match will use the entire string so it will require an * at the beginning of your pattern to work.

Wrong NSURLQueryItem percentage encoding for Google CSE

I'm writing app using Google custom search engine.
I received my search engine ID XXXXXXXX219143826571:7h9XXXXXXX (most interesting part bold).
Now I'm trying to use NSURLQueryItem to embed my ID into URL by using:
let params = ["cx" : engineID,...]
...
components.queryItems = parameters.map {
NSURLQueryItem(name: String($0), value: String($1))
}
It should percentage escape item to XXXXXXXX219143826571%3A7h9XXXXXXX (This value I'm getting when using Google APIs explorer while testing, it shows url dress that was used). It is not doing it. I'm getting url without escaping, no changes. If I use escaped string as engine ID in this mapping, I'm getting escaped string XXXXXXXX219143826571%253A7h9XXXXXXX (additional '25' is added to query).
Can someone tell me how to fix it? I don't want to use String and then convert it to URL by NSURL(string: str)!. It is not elegant.
Edit:
I'm using app Info.plist to save ID and I retrieve it by calling:
String(NSBundle.mainBundle().objectForInfoDictionaryKey("ApiKey")!)
Colons are allows in the query part of a URL string. There should be no need to escape them.
Strictly speaking, the only things that absolutely have to be encoded in that part of a URL are ampersands, hash marks (#), and (assuming you're doing a GET query with form encoding) equals signs. However, question marks in theory may cause problems, slashes are technically not allowed (but work just fine), and semicolons are technically allowed (but again, work in practice).
Colons, AFAIK, only have special meaning in the context of paths (if the OS treats it as a path separator) and in that it separates the scheme (protocol) from the rest of the URL.
So don't worry about the colon being unencoded unless the Google API barfs for some reason.

How to use URI as a REST resource?

I am building a RESTful API for retrieving and storing comments in threads.
A comment thread is identified by an arbitrary URI -- usually this is the URL of the web page where the comment thread is related to. This design is very similar to what Disqus uses with their system.
This way, on every web page, querying the related comment thread doesn't require storing any additional data at the client -- all that is needed is the canonical URL to the page in question.
My current implementation attempts to make an URI work as a resource by encoding the URI as a string as follows:
/comments/https%3A%2F%2Fexample.org%2Ffoo%2F2345%3Ffoo%3Dbar%26baz%3Dxyz
However, before dispatching it to my application, the request URI always gets decoded by my server to
/comments/https://example.org/foo/2345?foo=bar&baz=xyz
This isn't working because the decoded resource name now has path delimiters and a query string in it causing routing in my API to get confused (my routing configuration assumes the request path contains /comments/ followed by a string).
I could double-encode them or using some other encoding scheme than URI encode, but then that would add complexity to the clients, which I'm trying to avoid.
I have two specific questions:
Is my URI design something I should continue working with or is there a better (best?) practice for doing what I'm trying to do?
I'm serving API requests with a Go process implemented using Martini 'microframework'. Is there something Go or Martini specific that I should do to make the URI-encoded resource names to stay encoded?
Perhaps a way to hint to the routing subsystem that the resource name is not just a string but a URL-encoded string?
I don't know about your url scheme for your application, but single % encoded values are valid in a url in place of the chars they represent, and should be decoded by the server, what you are seeing is what I would expect. If you need to pass url reserved characters as a value and not have them decoded as part of the url, you will need to double % encode them. It's a fairly common practice, the complexity added to the client & server will not be that much, and a short comment will do rightly.
In short, If you need to pass url chars, double % encode them, it's fine.

Create URL based on entered data in iOS

If user entered say google then
i need to add http://www.google.com ie missing part.
User may enter any thing say google.in or www.google or anything.
Now goal to complete the left over url as we check url using regex like this:
NSString *urlRegEx = #"(http|https)://((\\w)*|([0-9]*)|([-|_])*)+([\\.|/]((\\w)*|([0-9]*)|([-|_])*))+";
That given url is valid or not
Don't forget the third level domain names with suffixes like .co.uk, .co.us, .com.co, etc.
A fully qualified domain name must have at least one dot. If it doesn't have at least one dot, then you might add .com to the end.
If it does have at least one dot, then it gets more complicated. .google could be a top level domain in the future, though it isn't now. Perhaps you want to keep a white list of all "valid" first and second level domain names. You evaluate the entered domain name from the right until it stops matching domains from your list. The remainder is the "registered" domain name and any sub domains. If you don't find any matches, then add .com.
Alternatively, rather than parsing the domain name, you could just try to resolve it, and if it doesn't resolve, then add .com and try again.
I think I know what you're asking however other than the regex how are you actually validating that the URL is valid? It seems as though you're making an incorrect assumption that all URLs follow a common syntax. As as example, http://www.www.extra-www.org/ is a valid URL so if you apply your regex (as I understand your intent) the user would get to http://www.extra-www.org which may not be the same site as the one the user wanted (even though in this case it is because it forwards). Another example is http://www.www.com... if the user enters "www" your regex will kill it. A final example is if a site doesn't have its DNS registered WITH the "www" - your regex will incorrectly add the "www" piece.
EDIT: what happens if the user needs https as opposed to http and only enters "google"?
Haven't tried this one myself yet, but an NSDataDetector for the type NSTextCheckingTypeLink should be able to do you a decent job.

How to find the #fragment in a URL in Lift

I'm pretty new to Lift, and one of the things I've been trying to find is how to, in the context of a snippet, find the '#' in the current page's URL. So if a user visits http://www.example.com/some/path/page#stuff then I would like to extract "stuff" from that. I've been googling and searching the API docs and have yet to find anything for this.
I don't think the part behind the # ever gets sent to the server in the first place.
That's what wikipedia has to say about it:
In URIs a hashmark # introduces the
optional fragment near the end of the
URL. The generic RFC 3986 syntax for
URIs also allows an optional query
part introduced by a question mark ?.
In URIs with a query and a fragment
the fragment follows the query. Query
parts depend on the URI scheme and are
evaluated by the server — e.g., http:
supports queries unlike ftp:.
Fragments depend on the document MIME
type and are evaluated by the client
(Web-browser). Clients are not
supposed to send URI-fragments to
servers when they retrieve a document,
and without help from a local
application (see below) fragments do
not participate in HTTP redirections.
I don't think the part behind the #
ever gets sent to the server in the
first place.
You are correct, sir. That is the entire point of the hash.
Dylan, you could do something from the Javascript side:
$.ajax( { data : { fragment : window.location.hash ...