What percentage of visits typically arrive without a referer when following a link from one site to another (both HTTPS)? - http-referer

In my webserver logfiles there is a mismatch between the number of visits that one referring website reports to me, and how often that site appears as a referer in my Apache logfiles.
Given that the only difference between one visit and another is on the side of the user, and given that most internet users do not even know about referers, let alone how to change or suppress them in their browser, I am wondering:
What percentage of users will usually have their referer empty when they follow a link from one website to another with a browser?
All links are from and to https.
Notes.
I do understand in which cases the referer may be empty. That is not the question.
I can calculate the percentage for this specific website, but I'm not sure I can trust the numbers they report.

Related

How to redirect a website according to country's IP address

I'm working on a messenger app whose server-side code is developed in Erlang.
The problem I'm facing concerns redirecting a website according to its country-specific domain.
For example: when a user types google.co in the message box, it automatically displays google.co.uk. How can I redirect it to google.co.in if I'm in India?
For finding the country's location, I found this library on GitHub: https://github.com/mochi/egeoip
How can I use this geolocation to redirect to the particular country-specific website?
Screenshot: when I entered facebook.com, it automatically displayed a preview in my local language.
But in the case of my app, it shows the preview in some foreign language, Russian maybe.
I've read the comments, and since you are not considering datasets as an option, I think what you may want to do is something like this:
The first thing to understand is how those previews work. In any popular messaging app, if you type in a URL, the app sends a request to that URL and fetches the website's metadata, which is then displayed in the UI.
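To make that step concrete, here is a minimal sketch in TypeScript, assuming a runtime that provides the standard fetch API and DOMParser (e.g. a browser/WebView client); the Open Graph selectors are the usual ones, but nothing here is tied to any particular app.

```typescript
// Minimal sketch: fetch a page and pull out the metadata a link preview shows.
// Assumes a runtime with fetch and DOMParser (e.g. a browser/WebView); a plain
// browser page would also need the target site to allow CORS.
interface LinkPreview {
  finalUrl: string;
  title: string;
  description: string | null;
  image: string | null;
}

async function buildPreview(url: string): Promise<LinkPreview> {
  const response = await fetch(url); // follows redirects by default
  const html = await response.text();
  const doc = new DOMParser().parseFromString(html, "text/html");

  // Prefer Open Graph tags, fall back to <title> / <meta name="description">.
  const meta = (selector: string): string | null =>
    doc.querySelector(selector)?.getAttribute("content") ?? null;

  return {
    finalUrl: response.url, // the URL you actually ended up on
    title: meta('meta[property="og:title"]') ?? doc.title,
    description:
      meta('meta[property="og:description"]') ?? meta('meta[name="description"]'),
    image: meta('meta[property="og:image"]'),
  };
}
```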
Country detection is a bit more complicated and is done in a variety of ways, but thankfully you (almost) don't have to do anything yourself. This is a rather long topic, but I'll try to keep it short.
Text Localization
Some websites (which might be the case with Facebook in your example) do country detection at the application layer and then, based on that country, serve the website's text in a specific language. This usually happens before the website renders its content, so you do not have to worry about it.
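If you do want to nudge this yourself, one application-layer signal that many localized sites honour is the Accept-Language request header; whether any given site (Facebook included) uses it is an assumption on my part, but it is cheap to send from whatever fetches your previews. A hedged sketch with placeholder URL and locale:

```typescript
// Hedged sketch: ask the site for the user's language explicitly when fetching
// the preview. Whether a given site honours Accept-Language is an assumption;
// the URL and locale below are placeholders.
async function fetchWithUserLanguage(url: string, userLocale: string): Promise<string> {
  const response = await fetch(url, {
    headers: { "Accept-Language": userLocale }, // e.g. "hi-IN" or "en-IN"
  });
  return response.text();
}

// Usage: fetchWithUserLanguage("https://example.com/", "en-IN")
```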
GeoDNS
This one occurs at the DNS layer and is probably the most popular. A domain name can be assigned a handful of IP addresses, and those IPs can point to different versions of the website; with GeoDNS it is up to the DNS manager to assign a country to each IP. When a DNS query comes from Russia, the requesting IP's country is resolved and the IP assigned to that country (if any) is returned. Websites use this especially for country-specific features or content; the best-known example is Netflix.
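You can observe the DNS side of this yourself; a small Node.js/TypeScript sketch (the hostname is a placeholder) run from machines in different regions may print different addresses for the same name.

```typescript
// Sketch: the same hostname can resolve to different IPs depending on which
// resolver (and therefore, roughly, which region) the query comes from.
// "example.com" is a placeholder; compare the output from different countries.
import { resolve4 } from "node:dns/promises";

async function showResolution(hostname: string): Promise<void> {
  const addresses = await resolve4(hostname);
  console.log(`${hostname} resolves here to: ${addresses.join(", ")}`);
}

showResolution("example.com").catch(console.error);
```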
Redirects
In the case of Google redirecting you to a different domain, this might be how they do it: the country is resolved from the IP address at the application (HTTP) layer, and the server then issues a 301/302 redirect pointing to the new domain name. This is the one you may need to handle yourself. Since your application needs to make an HTTP request to the URL the user has entered, if that request returns a redirect, you must follow it. Many HTTP libraries/clients already do this, but in some you might have to explicitly enable the option to follow redirects.
One important thing to note is to make the HTTP request on the client side. Otherwise, you will always resolve to the same country (the one where your server resides), regardless of where your user is.
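As a concrete illustration of those last two points, here is a hedged TypeScript sketch using the standard fetch API, which follows redirects by default; response.url is the final, possibly country-specific, URL. The Google URL is only an example, and running this on the user's device (rather than on your Erlang server) is exactly what makes the country come out right.

```typescript
// Sketch: follow redirects and report the final, possibly country-specific URL.
// fetch follows redirects by default; response.url is where you ended up after
// any 301/302 hops. Assumes Node 18+ or a native/WebView runtime (a plain browser
// page would be blocked by CORS for most third-party sites).
async function resolveFinalUrl(url: string): Promise<string> {
  const response = await fetch(url, { redirect: "follow" });
  if (response.redirected) {
    console.log(`Redirected: ${url} -> ${response.url}`);
  }
  return response.url;
}

// Placeholder usage: run on the user's device, this lands on the local domain
// (e.g. a .co.in address from India) because the request comes from the user's IP.
resolveFinalUrl("https://www.google.com/").then((finalUrl) => {
  console.log(`Build the preview against: ${finalUrl}`);
});
```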

301 Redirect with limited access to site

I've recently developed a new website for a local charity that organises an annual sporting event. With the event coming up in a few weeks we approached the previous/existing 'dev' company to either redirect the domain to the new site/server or transfer the domain to us.
This other 'company' is refusing to do anything, simply because they want to force the charity to stay with them, so that they get good local publicity.
So, we've purchased a new domain for the site but need to redirect the old site to the new one. Unfortunately, the system the old web company uses is very poor and cumbersome. It only gives us access to the files which form the content of a given page. It doesn't, however, give us any access to the site template/style elements, nor to things like the .htaccess file(s).
So, at the moment the best I've come up with is using the existing system's single input for the site description to force in a meta refresh that will bump users over to the new domain/site. However, this isn't going to result in a permanent 301 redirect for users or search engines.
As such, I'm desperately hoping to come up with a way to force a 301 for all pages without directly accessing every page content file and manually adding in some sort of redirect.
Due to a crappy, sitewide, one-size-fits-all, unescaped meta tag implementation, I was able to inject an additional meta tag with a redirect to the new domain.

screenshot-grabbing email tool

I have a web site with various graphs embedded in it that are generated externally. Occasionally those graphs will fail to generate and I would like to catch that when it happens. These graphs are embedded in multiple pages and I would rather not check each page manually. Is there any kind of tool or perhaps a browser addon that could periodically take screenshots of different URLs and email them in a single email? It would be sufficient to have scaled-down screenshots of full pages emailed maybe once a day to me, allowing me to take a quick glance and see that all the graphs are there and look okay.
I'm a big fan of automation. Rather than have emails generated that you then have to look at, take a look at 'replacing custom missing images in jquery'. This will run a piece of JavaScript for each image that fails to load. Extending that to make a request to a URL that you control, which could also include the broken URL (or just the filename that is broken), would not be too hard. That URL would then generate an email and store the broken URL so that it doesn't send 5,000 emails if there's a flurry of hits to your page.
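A minimal sketch of that idea in TypeScript; "/report-broken-image" is a hypothetical endpoint of mine, and the server behind it would be the piece that actually sends and rate-limits the email.

```typescript
// Sketch: report every image that fails to load to an endpoint you control.
// "/report-broken-image" is a hypothetical endpoint; the server behind it would
// send the email and deduplicate/rate-limit so a traffic spike doesn't produce
// thousands of messages. Include this script early so it runs before the images load.
document.addEventListener(
  "error",
  (event) => {
    const target = event.target;
    if (target instanceof HTMLImageElement) {
      const report = new URL("/report-broken-image", window.location.origin);
      report.searchParams.set("src", target.src);
      report.searchParams.set("page", window.location.href);
      // sendBeacon is fire-and-forget; a fetch() POST would work just as well.
      navigator.sendBeacon(report.toString());
    }
  },
  true // error events don't bubble, so listen in the capture phase
);
```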
Another idea, building on the above, is to effectively change the external 404 from the source site to a local one (e.g. /backend/missing-images/). The full path need not exist; you are just generating a local 404 record in your Apache logs. Logwatch will then email you a list of 404 pages from the Apache log daily (or more often, if you want).
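A sketch of that second variant: when an external graph fails, swap its src for a path under your own domain so the failure lands as a 404 in your Apache access log for Logwatch to pick up. The "/backend/missing-images/" prefix is just the placeholder from above.

```typescript
// Sketch: when an external graph fails to load, point the <img> at a non-existent
// local path instead, so the failure shows up as a 404 in your own Apache access
// log (which Logwatch then emails to you). "/backend/missing-images/" is the
// placeholder prefix from above.
document.addEventListener(
  "error",
  (event) => {
    const img = event.target;
    // The origin check stops us from looping when the local placeholder 404s too.
    if (img instanceof HTMLImageElement && !img.src.startsWith(window.location.origin)) {
      const brokenFile = img.src.split("/").pop() ?? "unknown";
      img.src = `/backend/missing-images/${encodeURIComponent(brokenFile)}`;
    }
  },
  true // capture phase, since error events don't bubble
);
```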

Redirect to a web server depending on location with nginx

I'm working on a website that has to be reachable from many countries under the same domain.
I'd like to know how I can receive a request with nginx (or any other static file server) and send it to different web servers depending on the IP's location.
I mean, what is the point of having multiple DB machines in countries A and B if the server that serves you the page is chosen by round robin?
Maybe there's another solution to my problem; I would be very happy if someone could explain it to me.
It sounds like you are looking for a geographic page re-director.
This company provides a solution that will do the trick: www.geobytes.com
The idea is that your web server will redirect visitors to a location-specific HTML page. That way, a visitor in India who goes to www.example.com will be shown a page customized for India, while a visitor from, say, Canada will see the Canadian home page.
It looks like they have PHP (http://forums.geobytes.com/viewtopic.php?f=9&t=6815) and JavaScript APIs.
Some of their products are free, like the geographic page re-director (http://www.geobytes.com/GeoDirection.htm).
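I don't know Geobytes' exact JavaScript API, so here is a generic, hedged sketch of what a client-side geographic re-director boils down to: a country lookup (the "/geo" endpoint and the page mapping below are hypothetical) followed by a window.location redirect.

```typescript
// Generic sketch of a client-side geographic page re-director. "/geo" is a
// hypothetical lookup (your own endpoint or a third-party service) that returns
// the visitor's ISO country code; the page mapping is an example only.
const COUNTRY_PAGES: Record<string, string> = {
  IN: "/in/index.html",
  CA: "/ca/index.html",
};

async function redirectByCountry(): Promise<void> {
  try {
    const response = await fetch("/geo");
    const { country } = (await response.json()) as { country: string };
    const target = COUNTRY_PAGES[country];
    if (target && window.location.pathname !== target) {
      window.location.replace(target); // replace() keeps the back button usable
    }
  } catch {
    // If the lookup fails, stay on the default page.
  }
}

redirectByCountry();
```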
Hope it helps.
As Stack Overflow is for programming issues, you’ll probably get a better response at https://serverfault.com/, which is geared toward “Networking, servers, or maintaining other people's PCs”. (See the FAQ.)

is it possible to know where the user is coming from when he uses the back button?

For example,
if a user goes to google -> example.com -> newwebsite.com
If he goes back to example.com, the HTTP referrer will still be google.com.
How can I detect that he went to newwebsite.com?
I believe that the back button will send the HTTP headers that were sent to the site the first time around, since it's not really a new visit.
Say you displayed an error page if the user's http-referrer was newwebsite.com. The first time they visited, they would get your site. If they went to newwebsite.com, and then hit back (meaning they wanted to go back in time, through their browser history, not load the page again with new headers), then they would get an error page, and the nature of the back button would be defeated. I don't know if this inspires that behavior or not, it just makes sense to me that way.
Maybe it's possible, but it would be entirely browser-dependent. Why do you need this functionality, anyway? Newwebsite isn't referring the user to your website; there's no connection between the two at all. It just happens to be the last page that the user visited.
If a visitor uses the back button, the page might be loaded from browser cache. In that case, no referrer is sent.
Using Google Analytics, you can see how many visitors came from a given website. This might give you some information.
I don't believe that this is generally possible. You could pull tricks with JavaScript on your site so that all the links navigated from there could be detected and recorded, but once the user is off your site you've got no control.
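For what it's worth, a hedged sketch of that kind of on-site trick: record outbound clicks from your own pages. The "/outbound" endpoint is hypothetical, and this only tells you where a visitor went from your site, not anything they do afterwards.

```typescript
// Sketch: record which external links users follow from your own pages.
// "/outbound" is a hypothetical logging endpoint; this only tells you where a
// visitor went *from your site*, not anything they do afterwards.
document.addEventListener("click", (event) => {
  const element = event.target instanceof Element ? event.target : null;
  const link = element?.closest("a");
  if (link && link.href && new URL(link.href).origin !== window.location.origin) {
    // sendBeacon won't block or cancel the navigation that is about to happen.
    navigator.sendBeacon(
      "/outbound",
      JSON.stringify({ to: link.href, from: window.location.href, at: Date.now() })
    );
  }
});
```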
If you provided the browser, i.e. developed your own, then you could choose to expose the browser history via an API.
http://jeremiahgrossman.blogspot.com/2006/08/i-know-where-youve-been.html
This describes a technique for exploiting the browser's agreement to modify links to indicate that they have been traversed (e.g. changing the colour of the link) so that visited sites can be detected. However, this only works for a pre-declared set of links; it's not a generally applicable approach.
My feeling is that attempts to hide the nature of browsers (users can hop around all over the place) tend to lead to unsatisfactory 79% solutions that mystify users.
What problem are you actually trying to solve?
You can use sessions in order to track the path of pages. It really works well; try it.
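One simple reading of that suggestion, sketched in TypeScript with the browser's sessionStorage (a server-side session would serve the same purpose); note that, like everything above, it only shows the path taken through your own pages, not visits to newwebsite.com.

```typescript
// Sketch: keep a per-tab trail of the pages visited on *your* site in
// sessionStorage (a server-side session would serve the same purpose).
// This cannot reveal visits to other sites such as newwebsite.com.
function recordPageVisit(): void {
  const key = "visitTrail";
  const trail: string[] = JSON.parse(sessionStorage.getItem(key) ?? "[]");
  trail.push(window.location.pathname);
  sessionStorage.setItem(key, JSON.stringify(trail));
}

recordPageVisit();
// Inspect later, e.g. when the user comes back via the back button:
// console.log(JSON.parse(sessionStorage.getItem("visitTrail") ?? "[]"));
```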