I have a website that is available in multiple languages. The way this is set up now is that it looks at the HTTP Accept-Language header and redirects the user to the matching language, or uses a default language when no match is found.
The problem that I am facing is that web crawlers can't index the root page, because it gives a 302 redirect: http://www.mydomain.com gets redirected to http://www.mydomain.com/nl/
The only way the website can be indexed is if I supply a sitemap for the whole website, including the languages. I have done that, but I have not seen any indexed pages for weeks now.
So my question is: would it be better to just have the website work in a default language?
To see the website in your own language, you would then have to select the language on the root page itself.
The problem that I am facing is that web crawlers can't index the root page
I haven't seen this problem before. Web crawlers certainly follow 302 redirects. Is there any chance that you're unknowingly blocking visitors that send no Accept-Language header, as web crawlers often do?
So my question is: would it be better to just have the website work in a default language? To see the website in your own language, you would then have to select the language on the root page itself.
I'd rather use the Accept-Language header and display the language that most closely matches the language(s) specified in the header, as per the HTTP 1.1 specification. If none is specified, I'd display English as the default language, or at least the language with the biggest coverage among the (expected) website audience.
I see in your question history that you're a PHP developer, so here's a useful snippet to determine the closest match based on the Accept-Language header as per the HTTP 1.1 specification:
function get_language($available_languages, $preferred_language = 'auto') {
    // Crawlers and some clients send no Accept-Language header at all; treat that as an empty string.
    $header = isset($_SERVER['HTTP_ACCEPT_LANGUAGE']) ? $_SERVER['HTTP_ACCEPT_LANGUAGE'] : '';
    preg_match_all('/([[:alpha:]]{1,8})(-([[:alpha:]|-]{1,8}))?(\s*;\s*q\s*=\s*(1\.0{0,3}|0\.\d{0,3}))?\s*(,|$)/i',
        $preferred_language == 'auto' ? $header : $preferred_language, $languages, PREG_SET_ORDER);
    $preferred_language = $available_languages[0]; // Set default for the case no match is found.
    $best_qvalue = 0;
    foreach ($languages as $language_items) {
        $language_prefix = strtolower($language_items[1]);
        $language = $language_prefix . (!empty($language_items[3]) ? '-' . strtolower($language_items[3]) : '');
        $qvalue = !empty($language_items[5]) ? floatval($language_items[5]) : 1.0;
        if (in_array($language, $available_languages) && ($qvalue > $best_qvalue)) {
            // Exact match (e.g. "nl-be") with a better q-value.
            $preferred_language = $language;
            $best_qvalue = $qvalue;
        } else if (in_array($language_prefix, $available_languages) && (($qvalue*0.9) > $best_qvalue)) {
            // Prefix-only match (e.g. "nl" for "nl-be") counts slightly less than an exact match.
            $preferred_language = $language_prefix;
            $best_qvalue = $qvalue * 0.9;
        }
    }
    return $preferred_language;
}
(the above is actually a rewrite/finetune of an example found somewhere at php.net)
It can be used as follows:
$available_languages = array(
    'en' => 'English',
    'de' => 'Deutsch',
    'nl' => 'Nederlands'
);
$requested_language = get_it_somehow_from_URL() ?: 'auto';
$current_language = get_language(array_keys($available_languages), $requested_language);
if ($requested_language != $current_language) {
    // Unknown or missing language in the URL: redirect to the best match.
    header('Location: /' . $current_language . '/' . $requested_page);
    exit;
}
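To see the matching rule at work on a few concrete headers, here is a rough Python port of the PHP function above. It is only a sketch for experimenting: the regex is simplified, the header is passed in explicitly instead of being read from the request, and the sample headers are made up.

```python
import re

# Simplified version of the pattern used in the PHP snippet:
# language prefix, optional subtag, optional q-value.
_LANG_RE = re.compile(
    r"([a-z]{1,8})(?:-([a-z-]{1,8}))?"
    r"(?:\s*;\s*q\s*=\s*(1\.0{0,3}|0\.\d{0,3}))?\s*(?:,|$)",
    re.IGNORECASE,
)


def get_language(available, header):
    """Return the best available language for an Accept-Language header.

    Falls back to available[0] when the header is empty or nothing matches.
    """
    best, best_q = available[0], 0.0
    for match in _LANG_RE.finditer(header or ""):
        prefix = match.group(1).lower()
        subtag = match.group(2)
        full = prefix + ("-" + subtag.lower() if subtag else "")
        q = float(match.group(3)) if match.group(3) else 1.0
        if full in available and q > best_q:
            best, best_q = full, q
        elif prefix in available and q * 0.9 > best_q:
            # A prefix-only match counts slightly less than an exact match.
            best, best_q = prefix, q * 0.9
    return best


if __name__ == "__main__":
    langs = ["en", "de", "nl"]
    print(get_language(langs, "nl-BE,nl;q=0.8,en;q=0.5"))  # nl
    print(get_language(langs, "fr-FR,fr;q=0.9"))           # en (default)
    print(get_language(langs, ""))                         # en (no header, e.g. a crawler)
```

The last call is the interesting one for the crawler question: with no header at all, the function simply returns the default language instead of failing.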
I have a WordPress multisite with nginx, and I have been trying to find a way to redirect users based on their browser language.
Thanks to Mark and Joris I was able to redirect in most cases, but I have one problem.
Here are my situation and code for your information.
My situation
My multisite setup uses subdomains. My main site is in Korean and the other two sites are in Japanese and English.
Obviously I want to redirect Japanese users to the Japanese site and international users to the English site, and I think I have figured this out.
But if I want to go to the main Korean site from one of the subdomain sites, I keep getting redirected back to jp.domain.com or en.domain.com. There would not be many use cases like this, but I think this should be possible.
Code
location = / {
    default_type text/html;
    rewrite_by_lua '
        if ngx.var.cookie_lang == "ko" then
            return
        elseif ngx.var.cookie_lang == "ja" then
            ngx.redirect("http://jp.domain.com/")
            return
        elseif ngx.var.cookie_lang == "en" then
            ngx.redirect("http://en.domain.com/")
            return
        end
        if ngx.var.http_accept_language then
            for lang in (ngx.var.http_accept_language .. ","):gmatch("([^,]*),") do
                if string.sub(lang, 0, 2) == "ko" then
                    ngx.header["Set-Cookie"] = "lang=ko; path=/"
                    return
                elseif string.sub(lang, 0, 2) == "ja" then
                    ngx.header["Set-Cookie"] = "lang=ja; path=/"
                    ngx.redirect("http://jp.domain.com/")
                    return
                end
            end
        end
        ngx.header["Set-Cookie"] = "lang=en; path=/"
        ngx.redirect("http://en.domain.com/")
    ';
}
location / {
    try_files $uri $uri/ /index.php?$args;
    rewrite_by_lua '
        if ngx.var.arg_lang == "ko" then
            ngx.header["Set-Cookie"] = "lang=ko; path=/"
        elseif ngx.var.arg_lang == "ja" then
            ngx.header["Set-Cookie"] = "lang=ja; path=/"
        elseif ngx.var.arg_lang == "en" then
            ngx.header["Set-Cookie"] = "lang=en; path=/"
        end
    ';
}
Any help would be appreciated.
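One observation that may help: in the config above, the "location = /" block only looks at the cookie and the Accept-Language header, never at the lang argument, so a request for /?lang=ko that still carries a lang=ja cookie gets redirected again. A possible precedence (explicit argument first, then cookie, then header, then English) is sketched below in Python. This is only a model of the decision order, not a tested nginx config; the function name and the SITES table are made up for illustration.

```python
# Hypothetical mapping of language to redirect target; None means
# "stay on the main (Korean) site".
SITES = {
    "ko": None,
    "ja": "http://jp.domain.com/",
    "en": "http://en.domain.com/",
}


def decide(arg_lang, cookie_lang, accept_language):
    """Return (language, redirect_target) for a request to the root page."""
    if arg_lang in SITES:
        # An explicit ?lang=... wins over everything else.
        lang = arg_lang
    elif cookie_lang in SITES:
        lang = cookie_lang
    else:
        # Same simple header scan as the Lua code: first ko/ja entry wins,
        # q-values are ignored, English is the fallback.
        lang = "en"
        for item in (accept_language or "").split(","):
            code = item.strip()[:2].lower()
            if code in ("ko", "ja"):
                lang = code
                break
    return lang, SITES[lang]


if __name__ == "__main__":
    # Visitor with a Japanese cookie explicitly asks for the Korean site.
    print(decide("ko", "ja", "ja,en"))  # ('ko', None)
```

Whatever language is chosen would then also be written back to the cookie, so the next request without an argument follows the explicit choice.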
I am using Varnish 4 in front of Apache. I need requests to deutsh.de whose Accept-Language header prefers es or ca (unless it also contains de or en) to be redirected to spanish.es.
Could somebody provide me with the appropriate syntax?
Thank you
So I managed to put together something in the file used to start varnish:
sub vcl_recv {
    # Redirect only when neither de nor en is present and at least one of es, ca, eu is.
    if((req.http.Accept-Language !~ "de" && req.http.Accept-Language !~ "en") && (req.http.Accept-Language ~ "es" || req.http.Accept-Language ~ "ca" || req.http.Accept-Language ~ "eu"))
    {
        return(synth(301,"Moved Permanently"));
    }
}
sub vcl_synth {
    if(req.http.Accept-Language ~ "es" || req.http.Accept-Language ~ "ca" || req.http.Accept-Language ~ "eu")
    {
        set resp.http.Location = "http://spanish.es";
        return (deliver);
    }
}
...This appears to work
I have slightly extended the proposed solution with a regex that guarantees that neither German nor English is configured as a higher-priority language in the Accept-Language header.
To explain the regex, it helps to keep in mind what such an Accept-Language header might look like: Accept-Language: de-DE,en-US,es
To respect the user's preferences, the regex searches for the wanted language but at the same time ensures that none of the other offered languages is found before it.
The latter is achieved somewhat cryptically with a negative look-ahead expression "^((?!de|en).)*" to ensure that neither de nor en appears before the "es|ca|eu" entry.
^ # line beginning
.* # any character repeated any number of times, including 0
?! # negative look-ahead assertion
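To get a feel for what the negative look-ahead does, the pattern can be tried against a few sample headers. The sketch below uses Python's re module purely for experimenting; the function name and the sample headers are made up, and Varnish's own regex engine (PCRE) may differ in details.

```python
import re

# Same pattern as in the VCL: no "de" or "en" may start anywhere
# before the first "es", "ca" or "eu".
SPANISH_FIRST = re.compile(r"^((?!de|en).)*(es|ca|eu)")


def prefers_spanish_site(accept_language):
    """True if es/ca/eu appears before any de/en in the header."""
    return SPANISH_FIRST.search(accept_language) is not None


if __name__ == "__main__":
    for header in ("es,en", "fr,es;q=0.8,en;q=0.5", "de-DE,en-US,es", "en-US,es"):
        print(header, "->", prefers_spanish_site(header))
```

The first two headers match because es comes before any de or en; the last two do not, because the look-ahead refuses to step over the leading de or en.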
Additionally, I have added a check for whether SSL is already in use, so that the language switch and the SSL switch happen in one redirect.
With return(synth(850, "Moved permanently")); you save one if clause in vcl_synth, which shortens your config considerably, especially when you have many of these language-based redirects.
sub vcl_recv {
    if (req.http.X-Forwarded-Proto !~ "(?i)https" && req.http.Accept-Language ~ "^((?!de|en).)*(es|ca|eu)") {
        # req.url already starts with "/", so no trailing slash after the host.
        set req.http.x-redir = "https://spanish.es" + req.url;
        return(synth(850, "Moved permanently"));
    }
}
sub vcl_synth {
    if (resp.status == 850) {
        set resp.http.Location = req.http.x-redir;
        set resp.status = 301;
        return (deliver);
    }
}
In a TYPO3 6.1 site, I would like to make the creation of restricted (fe_groups) pages as easy as possible for editors. There's not one single protected area, but several protected pages all over the pagetree.
What I would like to achieve would be that whenever a page has some login behaviour/restriction and no valid fe_user is logged in, there is a redirection to a central login page.
I have found the post "TYPO3 - Redirecting to login page when user is not logged in", which refers to the same issue, but the solution requires setting PIDs by hand.
I can hardly believe that such a feature ("set target page for redirections based on access restrictions") is not available. Or does it exist, or is it on a roadmap somewhere? And if not, is there a workaround?
This is indeed a big missing feature in TYPO3. The problem is that because of the way TYPO3 is built it's hard to determine whether a page doesn't exist (404) or access is forbidden (403). I did some further development of an unpublished extension that does the job, see https://github.com/phluzern/adfc_pagenotfound
In readme.txt you will find the configuration that is needed. It is in use with TYPO3 4.7, therefore some used classes may be deprecated or removed in 6.1. If so, fork the project, change them and make some pull requests so I can update it.
The extension makes use of a custom parameter $arPid (access restriction pid). The ID of the access-restricted page is sent to the login page. Your login form must be able to handle this parameter in order to redirect, see an example here:
https://github.com/phluzern/phzldap/blob/master/pi1/class.tx_phzldap_pi1.php#L133
It might be better to use a redirect_url as it is supported in felogin.
Update
In the meantime, I'm using an improved class with the following features:
If access to page is forbidden, redirect to a login page with the standard redirect_url parameter. This allows a redirect after a successful fe login using EXT:felogin without modifications and also supports speaking URLs.
Redirect to 404 page if page is not found respecting the current language of the site.
The code is as follows:
<?php
use TYPO3\CMS\Core\Utility\GeneralUtility;
class user_pageNotFound {
    /**
     * Detect language and redirect to 404 error page
     *
     * @param array $params "currentUrl", "reasonText" and "pageAccessFailureReasons"
     * @param \TYPO3\CMS\Frontend\Controller\TypoScriptFrontendController $tsfeObj
     */
    public function pageNotFound($params, $tsfeObj) {
        /*
         * If a non-existing page with a RealURL path was requested (www.mydomain.tld/foobar), a fe_group value for an empty
         * key is set:
         * $params['pageAccessFailureReasons']['fe_group'][null] = 0;
         * This is the reason why the second check was implemented.
         */
        if (!empty($params['pageAccessFailureReasons']['fe_group']) && !array_key_exists(null, $params['pageAccessFailureReasons']['fe_group'])) {
            // page access failed because of missing permissions
            header('HTTP/1.0 403 Forbidden');
            $this->initTSFE(1);
            /** @var \TYPO3\CMS\Frontend\ContentObject\ContentObjectRenderer $cObj */
            $cObj = GeneralUtility::makeInstance('TYPO3\\CMS\\Frontend\\ContentObject\\ContentObjectRenderer');
            $loginUrl = $cObj->typoLink_URL(array(
                'parameter' => $GLOBALS['TYPO3_CONF_VARS']['FE']['pageNotFound_handling_loginPageID'],
                'useCacheHash' => FALSE,
                'forceAbsoluteUrl' => TRUE,
                'additionalParams' => '&redirect_url=' . $params['currentUrl']
            ));
            TYPO3\CMS\Core\Utility\HttpUtility::redirect($loginUrl);
        } else {
            // page not found
            // get first realurl configuration array (important for multidomain)
            $realurlConf = array_shift($GLOBALS['TYPO3_CONF_VARS']['EXTCONF']['realurl']);
            // look for language configuration
            foreach ($realurlConf['preVars'] as $conf) {
                if ($conf['GETvar'] == 'L') {
                    foreach ($conf['valueMap'] as $k => $v) {
                        // if the key is empty (e.g. default language without prefix), skip it
                        if (empty($k)) {
                            continue;
                        }
                        // we expect a part like "/de/" in requested url
                        if (GeneralUtility::isFirstPartOfStr($params['currentUrl'], '/' . $k . '/')) {
                            $tsfeObj->pageErrorHandler('/index.php?id=' . $GLOBALS['TYPO3_CONF_VARS']['FE']['pageNotFound_handling_redirectPageID'] . '&L=' . $v);
                        }
                    }
                }
            }
            // handle default language
            $tsfeObj->pageErrorHandler('/index.php?id=' . $GLOBALS['TYPO3_CONF_VARS']['FE']['pageNotFound_handling_redirectPageID']);
        }
    }
    /**
     * Initializes a TypoScript Frontend necessary for using TypoScript and TypoLink functions
     *
     * @param int $id
     * @param int $typeNum
     */
    protected function initTSFE($id = 1, $typeNum = 0) {
        \TYPO3\CMS\Frontend\Utility\EidUtility::initTCA();
        if (!is_object($GLOBALS['TT'])) {
            $GLOBALS['TT'] = new \TYPO3\CMS\Core\TimeTracker\NullTimeTracker;
            $GLOBALS['TT']->start();
        }
        $GLOBALS['TSFE'] = GeneralUtility::makeInstance('TYPO3\\CMS\\Frontend\\Controller\\TypoScriptFrontendController', $GLOBALS['TYPO3_CONF_VARS'], $id, $typeNum);
        $GLOBALS['TSFE']->sys_page = GeneralUtility::makeInstance('TYPO3\\CMS\\Frontend\\Page\\PageRepository');
        $GLOBALS['TSFE']->sys_page->init(TRUE);
        $GLOBALS['TSFE']->connectToDB();
        $GLOBALS['TSFE']->initFEuser();
        $GLOBALS['TSFE']->determineId();
        $GLOBALS['TSFE']->initTemplate();
        $GLOBALS['TSFE']->rootLine = $GLOBALS['TSFE']->sys_page->getRootLine($id, '');
        $GLOBALS['TSFE']->getConfigArray();
        if (\TYPO3\CMS\Core\Utility\ExtensionManagementUtility::isLoaded('realurl')) {
            $rootline = \TYPO3\CMS\Backend\Utility\BackendUtility::BEgetRootLine($id);
            $host = \TYPO3\CMS\Backend\Utility\BackendUtility::firstDomainRecord($rootline);
            $_SERVER['HTTP_HOST'] = $host;
        }
    }
}
The only things you need to configure are the PIDs of the "page not found" page and the login page:
// ID of the page to redirect to if page was not found
$GLOBALS['TYPO3_CONF_VARS']['FE']['pageNotFound_handling_redirectPageID'] = 4690;
// ID of the page to redirect to if current page is access protected
$GLOBALS['TYPO3_CONF_VARS']['FE']['pageNotFound_handling_loginPageID'] = 5404;
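As a side note on the redirect_url parameter mentioned above: the class passes the current URL along as-is. If the originally requested URL can contain its own query string, it may be safer to URL-encode it before appending it to the login URL. The following Python sketch only illustrates that idea; the function name and the example URLs are made up, and whether encoding is needed in a given TYPO3 setup is an assumption to verify.

```python
from urllib.parse import urlencode


def login_redirect_url(login_page_url, current_url):
    """Append an encoded redirect_url parameter to the login page URL."""
    separator = "&" if "?" in login_page_url else "?"
    # urlencode escapes "/", "?", "=" and "&" in the value, so the original
    # query string cannot be confused with parameters of the login page.
    return login_page_url + separator + urlencode({"redirect_url": current_url})


if __name__ == "__main__":
    print(login_redirect_url("https://www.mydomain.tld/login/",
                             "/intranet/reports/?year=2014"))
    # https://www.mydomain.tld/login/?redirect_url=%2Fintranet%2Freports%2F%3Fyear%3D2014
```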
Could you please tell me what Zend Framework uses to determine the current URL? More precisely, I am interested in what Zend uses to determine the domain name: $_SERVER['HTTP_HOST'] or
$_SERVER['SERVER_NAME'] (or maybe something else)?
P.S. I searched the documentation (I do not know this framework) and Google, but did not find an answer to my question.
Try $this->getRequest()->getRequestUri() to get the currently requested URI.
In a view script, use $this->url() to get the current URL.
Or get it statically through the Zend front controller instance:
$uri = Zend_Controller_Front::getInstance()->getRequest()->getRequestUri();
You can also fetch the request object from the front controller singleton and read the scheme and host from it:
$request = Zend_Controller_Front::getInstance()->getRequest();
$url = $request->getScheme() . '://' . $request->getHttpHost();
In the view, use it as:
echo $this->serverUrl(true); # return with controller, action,...
You should avoid hardcoding it as in this example (DO NOT USE!):
echo 'http://' . $_SERVER['SERVER_NAME'] . $_SERVER['PHP_SELF'];
Instead of that, use something like:
$uri = $this->getRequest()->getHttpHost() . $this->view->url();
If you want to use getRequest() in Zend, read more about The Request Object.
YOU CAN SKIP EVERYTHING BELOW (A DISSECTION OF HOW IT WORKS).
Below is the full code showing how the value behind getRequestUri() is determined. It also shows why you should use the request object instead of $_SERVER directly: where the data lives depends on the platform.
First, if the URI is null, HTTP_X_REWRITE_URL is used if it is set (IIS). If not, the code checks for a URI rewritten by IIS7 and takes the unencoded URL. If neither applies, REQUEST_URI is used, with the scheme and host stripped when they are present; failing that, ORIG_PATH_INFO is used and the QUERY_STRING is appended.
If a URI was passed in but is not a string, the method just returns $this.
If a URI string was passed in, its query string is parsed and set as the query parameters before the URI is stored.
if ($requestUri === null) {
    if (isset($_SERVER['HTTP_X_REWRITE_URL'])) { // check this first so IIS will catch
        $requestUri = $_SERVER['HTTP_X_REWRITE_URL'];
    } elseif (
        // IIS7 with URL Rewrite: make sure we get the unencoded url (double slash problem)
        isset($_SERVER['IIS_WasUrlRewritten'])
        && $_SERVER['IIS_WasUrlRewritten'] == '1'
        && isset($_SERVER['UNENCODED_URL'])
        && $_SERVER['UNENCODED_URL'] != ''
    ) {
        $requestUri = $_SERVER['UNENCODED_URL'];
    } elseif (isset($_SERVER['REQUEST_URI'])) {
        $requestUri = $_SERVER['REQUEST_URI'];
        // Http proxy reqs setup request uri with scheme and host [and port] + the url path, only use url path
        $schemeAndHttpHost = $this->getScheme() . '://' . $this->getHttpHost();
        if (strpos($requestUri, $schemeAndHttpHost) === 0) {
            $requestUri = substr($requestUri, strlen($schemeAndHttpHost));
        }
    } elseif (isset($_SERVER['ORIG_PATH_INFO'])) { // IIS 5.0, PHP as CGI
        $requestUri = $_SERVER['ORIG_PATH_INFO'];
        if (!empty($_SERVER['QUERY_STRING'])) {
            $requestUri .= '?' . $_SERVER['QUERY_STRING'];
        }
    } else {
        return $this;
    }
} elseif (!is_string($requestUri)) {
    return $this;
} else {
    // Set GET items, if available
    if (false !== ($pos = strpos($requestUri, '?'))) {
        // Get key => value pairs and set $_GET
        $query = substr($requestUri, $pos + 1);
        parse_str($query, $vars);
        $this->setQuery($vars);
    }
}
$this->_requestUri = $requestUri;
return $this;
I'm using a lighttpd 404 error handler, with a static 404 page. The entire conf file looks like this:
server.document-root = "/home/www/local/www/mysite/html"
url.rewrite = (
    "^(.*)/($|\?.*)" => "$1/index.html",
    "^(.*)/([^.?]+)($|\?.*)$" => "$1/$2.html"
)
server.error-handler-404 = "/404.html"
$HTTP["scheme"] == "http" {
    url.redirect = ( "^/blog.html$" => "/blog/",
        # various individual redirects
    )
}
$HTTP["scheme"] == "https" {
    $HTTP["url"] !~ "^/blog/admin/" {
        url.redirect = ( "^/(.*)" => "http://www.mysite.com/$1" )
    }
}
However, when I go to an address that should 404, I do correctly see our 404 page, but the status code is 200.
The lighttpd docs say that you should get a 404 status code if using a static page.
I think we're using a static page, but could something about the way we're redirecting mean that we're actually not?
Sorry for the newbie question.
I fixed this by using server.errorfile-prefix instead; I think it is simply a bug in server.error-handler-404.
Seems to be a known bug in some versions of lighttpd. I was actually using a later version than 1.4.17, but still seeing the same problem.