nginx: rewrite a LOT (2000+) of urls with parameters - redirect

I have to migrate a lot of URLs with params, which look like this:
/somepath/somearticle.html?p1=v1&p2=v2 --> /some-other-path-a
and also the same URL without params:
/somepath/somearticle.html --> /some-other-path-b
The tricky part is that the two destination URLs are totally different pages in the new system, whereas in the old system the params just indicated which tab to open by default.
I tried different rewrite rules, but came to the conclusion that parameters are not considered by nginx rewrites. I found a way using location directives, but having 2000+ location directives just feels wrong.
Does anybody know an elegant way to get this done? It may be worth noting that besides those 2000+ redirects, I have another 200.000(!) redirects. They already work, because they're rather simple. So what I want to emphasize is that performance is key!

You cannot match the query string (anything from the ? onwards) in location and rewrite expressions, as it is not part of the normalized URI. See this document for details.
The entire URI, including the query string, is available in the $request_uri variable. Matching against $request_uri may be problematic if the parameters are not always sent in the same order, because the comparison is made against the raw string.
To process many URIs, use a map directive, for example:
# The map block must be placed in the http context, outside any server block.
map $request_uri $redirect {
    default 0;
    /somepath/somearticle.html?p1=v1&p2=v2  /some-other-path-a;
    /somepath/somearticle.html              /some-other-path-b;
}
server {
    ...
    if ($redirect) {
        return 301 $redirect;
    }
    ...
}
You can also use regular expressions in the map, for example, if the URIs also contain optional unmatched parameters. See this document for more.
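With 2000+ entries, the map body is probably easier to generate than to type by hand. The following is only a sketch, assuming the old/new pairs are available as an array; the function name buildRedirectMap and the input shape are made up for illustration and are not part of nginx. It emits exact-match entries only and refuses strings that would need quoting in the nginx config:

```javascript
// Hypothetical helper: turn [oldUri, newUri] pairs into an nginx map block.
// Assumption: exact-match entries only, no regular expressions.
function buildRedirectMap(pairs, variable = "$redirect") {
  const lines = [`map $request_uri ${variable} {`, "    default 0;"];
  const seen = new Set();
  for (const [from, to] of pairs) {
    for (const value of [from, to]) {
      // Whitespace, quotes, braces, ';' and '$' have special meaning in nginx config.
      if (/[\s;"'{}$]/.test(value)) {
        throw new Error(`unsupported character in: ${value}`);
      }
    }
    // Requiring a leading '/' also rules out '~' (regex) and keywords like "default".
    if (!from.startsWith("/")) throw new Error(`source must start with '/': ${from}`);
    if (seen.has(from)) throw new Error(`duplicate source: ${from}`);
    seen.add(from);
    lines.push(`    ${from} ${to};`);
  }
  lines.push("}");
  return lines.join("\n");
}
```

The output could be written to a separate file and pulled into the http block with an include directive, so that the generated entries stay out of the hand-written configuration.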

Related

Should a RESTful API avoid requiring the client to know the resource hierarchy?

Our API's entry point has a rel named "x:reports" (where x is a prefix defined in the HAL representation, by way of a curie - but that's not important right now).
There are several types of reports. Following "x:reports" provides a set of these affordances, each with a rel of its own - one rel is named "x:proofofplay". There is a set of lookup values associated with this type of report (and only this type of report). The representation returned by following "x:proofofplay" has a rel to this set of values, "x:artwork".
This results in the following hierarchy
reports
    proofofplay
        artwork
While the "x:artwork" resource is fairly small, it does take some time to fetch it (10 sec). So the client has opted to async load it at app launch.
In order to get the "x:artwork" href the client has to follow the links. I'm not sure whether this is a problem. It seems potentially unRESTful, as the client is depending on out-of-band knowledge of the path to this resource. If the path to artwork ever changes (highly unlikely) the client will break (though the hrefs themselves can change with impunity).
To see why I'm concerned, the launch function looks like this:
launch: function () {
    var me = this;
    Rest.getLinksFromEntryPoint(function (links) {
        Rest.getLinksFromHref(links["x:reports"].href, function (reportLinks) {
            Rest.getLinksFromHref(reportLinks["x:proofofplay"].href, function (popLinks) {
                me.loadArtworks(popLinks["x:artwork"].href);
            });
        });
    });
}
This hard-coding of the path simultaneously makes me think "that's fine - it's based on a published resource model" and "I bet Roy Fielding will be mad at me".
Is this fine, or is there a better way for a client to safely navigate such a hierarchy?
The HAL answer to this is to embed the resources.
Depending a bit on your server-side technology, this should be good enough in your case because you need all the data to be there before the start of the application, and since you worry about doing this sequentially, you might parallelize this on the server.
Your HAL client should ideally treat things in _links and things in _embedded as the same type of thing, with the exception that in the second case, you are also pre-populating the HTTP cache for the resources.
Our js-based client does something like this:
var client = new Client(bookMarkUrl);
var resource = await client
    .follow('x:reports')
    .follow('x:proofofplay')
    .follow('x:artwork')
    .get();
If any of these intermediate links are specified in _links, we follow them and do GET requests on demand, but if any appear in _embedded, the request is skipped and the local cache is used. This has the benefit that in the future we can add new things from _links to _embedded and speed up clients without them having to be aware of the change. It's all seamless.
In the future we intend to switch from HAL's _embedded to HTTP/2 Push instead.
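The "embedded first, link second" decision described above can be sketched as a small pure function. This is not the client's actual implementation; resolveRel and its return shape are hypothetical, and taking the first entry when a rel holds an array is a simplification:

```javascript
// Sketch: decide how to reach `rel` from a parsed HAL document.
// Returns the embedded resource if present, otherwise the href to GET.
function resolveRel(halDoc, rel) {
  // HAL allows a rel to hold either one object or an array of them;
  // this sketch simply takes the first entry of an array.
  const first = (value) => (Array.isArray(value) ? value[0] : value);

  const embedded = first((halDoc._embedded || {})[rel]);
  if (embedded !== undefined) {
    return { source: "embedded", resource: embedded }; // no request needed
  }
  const link = first((halDoc._links || {})[rel]);
  if (link !== undefined && typeof link.href === "string") {
    return { source: "link", href: link.href }; // caller must GET this href
  }
  throw new Error(`rel not found: ${rel}`);
}
```

A follow chain would call this once per rel and only issue a GET when the result says "link", which is why the server can later move a resource into _embedded without the calling code changing.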

nginx redirect old site urls and modify 1 language suffix only

I want to redirect old site URLs to the new site, but the new site has different page names and the language codes have changed too.
For example:
en/about/info will redirect to en/com/information
but
ge/about/info will go to ka/com/information
map $request_uri $redirect_uri {
<lang>/about/info/ $lang/com/information/
}
Any ideas how I would go about this? There are a lot of URLs, so I don't want to hardcode them for each language.
The map directive can capture parts of a regular expression, but cannot use that capture in the mapped result.
So it is possible to create a named capture called lang (for example) and use it after the mapped variable is evaluated. For example:
map $request_uri $redirect_uri {
    ~*(?<lang>/\w\w/)about/info/ com/information/;
}
And in the server or location block:
if ($redirect_uri) {
    return 301 $lang$redirect_uri;
}
Note that $lang is only created after the value of $redirect_uri is evaluated in the if statement.
See this document for details.
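The capture-then-prefix logic, plus the ge to ka rename from the question, can be sketched outside nginx, for example to check a page table before turning it into map entries. Everything named here (redirectTarget, LANG_RENAMES, PAGE_RENAMES) is hypothetical, and unlike the unanchored nginx regex above, this sketch anchors the match at the start of the URI:

```javascript
// Hypothetical table of renamed language codes (only ge -> ka comes from the question).
const LANG_RENAMES = new Map([["ge", "ka"]]);

// Hypothetical page table: old path after the language prefix -> new path.
const PAGE_RENAMES = new Map([["about/info/", "com/information/"]]);

// Returns the redirect target for a request URI, or null if no rule applies.
function redirectTarget(requestUri) {
  // Named groups: a two-character language code, then the path up to any query string.
  const match = /^\/(?<lang>\w\w)\/(?<page>[^?]*)/.exec(requestUri);
  if (!match) return null;
  const { lang, page } = match.groups;
  const newPage = PAGE_RENAMES.get(page.toLowerCase());
  if (newPage === undefined) return null;
  // Languages without a rename keep their original code.
  const newLang = LANG_RENAMES.get(lang.toLowerCase()) || lang;
  return `/${newLang}/${newPage}`;
}
```

Each page appears once in the table regardless of how many languages exist, which is the property the question asks for.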

How can I redirect hundreds of hostnames to other hostnames using nginx

I have a system where many (~20k) subdomains use nginx's default_server, which passes the work off to an app.
I also have many (~100) hostnames that need to be redirected to a correct one, that is different for each hostname and that would then redirect to the default_server.
one.example.com -> eleven.example.com
two.example.com -> twelve.domain.com
three.example.com -> wibble.example.com
blah.domain.com -> fifteen.example.com
The redirects are arbitrary, ie there is no pattern to them.
Rather than having to update nginx config to add a new server block whenever a new redirect is needed or updated I'd prefer to use a map file of some sort that nginx can check for redirects.
Sadly having searched about quite a bit I've not found anything like it, all examples I've found use a new server block for each redirecting host or use regexes. I'd prefer to be able to update a map file or database on the fly that nginx can refer to.
My current best option I have is to update the background app to apply the redirects.
I had previously found the map directive, but it wasn't clear that it could be used in this way and none of the examples showed it. That said, it turned out to be quite easy.
This is what I have, and it seems to work:
map $host $redirect_host {
    hostnames;
    one.david.org eleven.david.org;
    two.david.org twelve.steve.org;
    three.steve.org thirteen.david.org;
    four.steve.org fourteen.steve.org;
}
server {
    ...
    if ($redirect_host) {
        return 301 $scheme://$redirect_host$request_uri;
    }
    ...
}
It's a shame that this solution requires nginx to re-read its configuration whenever the map changes (a reload is enough, a full restart is not needed), but it's not a big deal.

Mojo Routes: Handle assorted tags in url

I am building a Mojo app to replace a vanilla mod_perl application.
The app currently handles url structures like:
/
/type/bold/
/keyword/hello/
/audience/all/
/type/bold/keyword/hello/
/keyword/hello/audience/all/
/keyword/hello/type/bold/audience/all/
/audience/all/type/bold/keyword/hello/
These are key/value pairs in the URL, and they can appear in any order.
I am looking for a way to handle that without simply making a route for every permutation of tags, as that gets repetitive even with only 3 different types of tags.
In that case you should probably just make a route that matches everything and parse the url yourself.
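The "parse the url yourself" step is language-independent; here is a minimal sketch in JavaScript of what the catch-all handler would do with the path it receives. The function name parseTagPath is made up, and the allowed keys are taken from the example URLs in the question:

```javascript
// Tag names seen in the question's example URLs.
const ALLOWED_KEYS = new Set(["type", "keyword", "audience"]);

// Sketch: parse "/key/value/key/value/" into an object, in any order.
function parseTagPath(path) {
  // Drop empty segments caused by leading, trailing or doubled slashes.
  const segments = path.split("/").filter((s) => s !== "");
  if (segments.length % 2 !== 0) {
    throw new Error(`key without value in: ${path}`);
  }
  const tags = {};
  for (let i = 0; i < segments.length; i += 2) {
    const key = segments[i];
    if (!ALLOWED_KEYS.has(key)) throw new Error(`unknown tag: ${key}`);
    tags[key] = decodeURIComponent(segments[i + 1]);
  }
  return tags;
}
```

Because the pairs end up in one object, /type/bold/keyword/hello/ and /keyword/hello/type/bold/ produce the same result, so a single route covers every permutation.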

zend framework urls and get method

I am developing a website using zend framework.
I have a search form that uses the GET method. When the user clicks the submit button, the query string appears in the URL after the ? mark, but I want it to be a Zend-like URL.
Is that possible?
As well as the JS approach you can do a redirect back to the preferred URL you want. I.e. let the form submit via GET, then redirect to the ZF routing style.
This is, however, overkill unless you have a really good reason to want to create neat URLs for your search queries. Generally speaking a search form should send a GET query that can be bookmarked. And there's nothing wrong with ?param=val style parameters in a URL :-)
ZF URLs are a little odd in that they force URL parameters to be part of the main URL. I.e. domain.com/controller/action/param/val/param2/val rather than domain.com/controller/action?param=val&param2=val
This isn't always what you want, but it seems to be the way frameworks are going with URL parameters.
There is no obvious solution. The form generated by zf will be a standard html one. When submitted from the browser using GET it will result in a request like
/action/specified/in/form?var1=val1&var2=val2
The only solution to get a "zendlike url" (one with / instead of ? or &) would be to hack the form submission using JavaScript. For example, you can listen for onSubmit, abort the submission and instead redirect the browser to a translated URL. I personally don't believe this solution is worth the added complexity, but it should do what you're looking for.
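The "translated url" step could look roughly like the following. This is a sketch only: toSlashUrl is a made-up name, the field values are assumed to be plain strings, and skipping empty fields is a design choice made here so that name/value pairs stay aligned in the path:

```javascript
// Sketch: translate form fields into a slash-style URL
// (/action/param/val/param2/val2 instead of /action?param=val&param2=val2).
function toSlashUrl(actionPath, fields) {
  // Remove trailing slashes so the join below does not produce "//".
  const parts = [actionPath.replace(/\/+$/, "")];
  for (const [name, value] of Object.entries(fields)) {
    if (value === "") continue; // an empty value would misalign the pairs
    parts.push(encodeURIComponent(name), encodeURIComponent(value));
  }
  return parts.join("/");
}
```

In a browser this would be called from the form's submit handler, which cancels the default submission and assigns the result to window.location. Values that contain a slash end up as %2F in the path and may need extra care, depending on how the server treats encoded slashes.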
After raging against this for a day and a half, and doing my best to figure out the right way to do this fairly simple thing, I gave up and did the following. I still can't believe there's not a better way.
The use case that necessitates this is a simple record listing, with a form up top for adding some filters (via GET), maybe some column sorting, and Zend_Paginator thrown in for good measure. I ran into issues using the Url view helper in my pagination partial, but I suspect that even with just sorting and a filter form, Zend_View_Helper_Url would still fall down.
But I digress. My solution was to add a method to my base controller class that merges any raw query-string parameters with the existing zend-style slashy-params, and redirects (but only if necessary). The method can be called in any action that doesn't have to handle POSTs.
Hopefully someone will find this useful. Or even better, find a better way:
/**
 * Translate standard URL parameters (?foo=bar&baz=bork) to zend-style
 * param (foo/bar/baz/bork). Query-string style
 * values override existing route-params.
 */
public function mergeQueryString(){
    if ($this->getRequest()->isPost()){
        throw new Exception("mergeQueryString only works on GET requests.");
    }
    $q = $this->getRequest()->getQuery();
    $p = $this->getRequest()->getParams();
    if (empty($q)) {
        //there's nothing to do.
        return;
    }
    $action = $p['action'];
    $controller = $p['controller'];
    $module = $p['module'];
    unset($p['action'],$p['controller'],$p['module']);
    $params = array_merge($p,$q);
    $this->_helper->getHelper('Redirector')
        ->setCode(301)
        ->gotoSimple(
            $action,
            $controller,
            $module,
            $params);
}