I am new to Github Pages and was just trying out the links to some pages.
username.github.io works, but www.username.github.io does not. Why is that? I understand that the answer is probably in some corner of the internet, but I searched for it and failed to find an explanation.
That is due to the simple fact that GitHub has not configured their DNS records to support this naming scheme.
While this is entirely possible using wildcards (see Wildcard DNS record), the web has been shifting away from the www convention for some time now.
The reasons for that shift are largely subjective, so it is not in the scope of Stack Overflow to settle them, but given the ubiquitous nature of the World Wide Web it is fair to infer that the shorter a URL is, the better, if only to make it easier for people to remember and type.
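If you want to check the DNS side of this yourself, here is a quick sketch using only the Python standard library. "username" is a placeholder, and whether the www form resolves at all depends on GitHub's current DNS configuration for *.github.io:

```python
# Quick lookup of both hostnames. Replace "username" with a real account.
# Even if the www form resolves, GitHub Pages may still not serve the site
# for that hostname, since the Host header would not match a configured page.
import socket

for host in ("username.github.io", "www.username.github.io"):
    try:
        print(host, "->", socket.gethostbyname(host))
    except socket.gaierror as exc:
        print(host, "-> lookup failed:", exc)
```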
Related
So I work behind a corporate network and, at the moment, VSCode's extension search feature doesn't work because it's blocked by our proxy. I'm in the middle of working with our network guys to get it whitelisted and had a question regarding the hostnames listed in their documentation page.
Namely, what does each of the hostnames they tell you to whitelist do, and which features need it? Some of them are fairly self-explanatory based on the name, but others are less clear. For convenience, here's the list of URLs along with a guess at each one's purpose:
VSCode Proxy URLs
I tried asking on the /r/vscode subreddit, but got no bites there. Since the GitHub issue tracker isn't for questions, I'm asking here.
I recently read an article, "www. is not deprecated", which strongly advises against redirecting from www to no-www. I would like to know the main cons of such a redirection and the main cons of redirecting from no-www to www. How would it impact site scalability, search engine visibility, problems with cookies, etc.?
I'm going to suggest something controversial. It doesn't matter. Use either domain.
There are legitimate issues with serving content from a single domain over HTTP/1.1. You have to do domain sharding in order to parallelize content delivery, but browsers only open a handful of connections to each host at a time, so even that scaling is limited.
However, the issues of sharding are gone with HTTP/2: you can parallelize assets natively over a single connection. https://http2.github.io/faq/
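If you want to see which protocol a server actually negotiates over a single connection, here is a small sketch; it assumes the third-party httpx package is installed with its HTTP/2 extra (the h2 library), which is not part of the standard library:

```python
# Check the negotiated protocol version for one request.
# Requires: pip install "httpx[http2]"
import httpx

with httpx.Client(http2=True) as client:
    response = client.get("https://http2.github.io/")
    print(response.http_version)  # e.g. "HTTP/2" or "HTTP/1.1"
```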
When you need to scale beyond a single server you'll be faced with other issues, but throwing more hardware at the problem will be the easiest solution. When your site becomes large enough, you'll want to use a Content Delivery Network, at which point scaling becomes a non-issue for the front end.
There are issues with cross-domain cookies. If you do scale to such a size that you need single sign-on, you won't be worrying about subdomain cookies; you'll probably be looking at a single sign-on service such as Facebook, Google, or OpenID, or rolling your own SAML 2.0 solution. A CDN can also provide a way to handle cross-domain cookies.
Someone else can speak with authority regarding SEO.
Build your site the way you find aesthetically pleasing, and deal with the scaling issues when you come to them.
Edit: I did think of one advantage of using www.example.com: you can CNAME www, whereas you would not be able to CNAME the bare example.com.
Since the article covers the reasons for keeping the www, I'll not repeat those and will look at the other side instead:
It's mostly aesthetic: some people think a bare domain looks better.
The www isn't needed, and some think it is a relic of the past. Who even differentiates between the World Wide Web and the Internet anymore? Certainly not your browser, which is more concerned with the protocol (http/https) than with three letters tacked on to the beginning of a website domain.
And finally, it's extra typing for the user, and extra speaking too: www is quite a mouthful when reading out a web address, and don't even come near me with the "dub dub dub" phrasing that some try to use to address this.
Personally, www still wins it for me, mostly from the recognition factor rather than from the technical issues raised in the article (though they help cement this opinion), in the same way that a .com or country-code domain is more recognisable as a web address than some of the new TLDs.
Using a subdomain in your website address (of which www is the most recognisable) does have technical advantages, as raised in the article, some of which can be worked around; but beyond those it's a personal preference, so I'm not sure SO is the best place for this, since there is no "right" answer.
One thing is clear: you should pick one domain variant and stick with it. Redirect to your preferred version (with or without www) so that anyone who ends up on the wrong one is steered right. This makes sense from a cleanliness point of view and also from an SEO point of view, since search engines see the two domains as separate and you don't want the same content showing on both as duplicate content. In the same vein, it's best practice to have your web server listen on both domains to do that redirect and, if using https, to make sure your certificate covers both domains.
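To illustrate that redirect, here is a minimal sketch using only the Python standard library; example.com stands in for your preferred variant, and in practice this rule normally lives in your web server or CDN configuration rather than in application code:

```python
# Redirect www.example.com to the bare domain with a permanent (301) redirect,
# preserving the requested path. Serve the real site on the canonical host.
from http.server import BaseHTTPRequestHandler, HTTPServer

CANONICAL_HOST = "example.com"  # the variant you picked (here: no-www)

class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        host = self.headers.get("Host", "")
        if host.startswith("www."):
            self.send_response(301)
            self.send_header("Location", f"https://{CANONICAL_HOST}{self.path}")
            self.end_headers()
        else:
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Canonical host: serve the site here.\n")

if __name__ == "__main__":
    HTTPServer(("", 8080), RedirectHandler).serve_forever()
```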
I know Tumblr makes it very easy to point your Tumblr blog to a specific subdomain by changing the "CNAME" or "A" record in DNS. What I wanted to know is how I can point the Tumblr blog to website.com/blog rather than to the subdomain blog.website.com.
If you simply prefer website.com/blog over blog.website.com, you could set up the former URL with a redirect to blog.website.com. That has the added benefit of allowing people to access the site in two different ways.
However, if the problem is that you can't change your DNS settings, you could use an iframe. That probably isn't a very good idea though, because it messes with the browser's back and forward buttons and doesn't display the updated URL when you navigate within the frame.
Another non-DNS solution, if you have some kind of scripting available, is to use Tumblr's API to recreate your blog in your own page.
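As a rough sketch of that API approach, the following pulls posts from the Tumblr v2 API with the standard library and prints them; the blog name and API key are placeholders, and the exact JSON fields vary by post type:

```python
# Fetch posts from the Tumblr v2 API and list them; rendering them inside
# your own page at website.com/blog is then up to your site's templating.
import json
import urllib.request

API_KEY = "YOUR_TUMBLR_API_KEY"   # placeholder: register a Tumblr app to get one
BLOG = "yourblog.tumblr.com"      # placeholder: your blog identifier

url = f"https://api.tumblr.com/v2/blog/{BLOG}/posts?api_key={API_KEY}"
with urllib.request.urlopen(url) as resp:
    data = json.load(resp)

for post in data["response"]["posts"]:
    # Text posts have "title"/"body"; other post types use different fields.
    print(post.get("title") or post["type"], "-", post["post_url"])
```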
Your options are really dependent on your situation, which you haven't told us much about.
I'm maintaining the website http://www.totalworkflow.co.uk and am not sure whether HTTrack follows the instructions given in the robots.txt file. If there is a way to keep HTTrack away from the website, please suggest how to implement it, or just tell me the robot name so I can block it from crawling my website. If this is not possible with robots.txt, please recommend another way to keep these robots away from the website.
You are right that there is no requirement for spam crawlers to follow the guidelines given in the robots.txt file; I know robots.txt is only respected by genuine search engines. However, an application like HTTrack could behave legitimately if its developers hard-coded it to honour the robots.txt guidelines when they are provided; if that option were offered, the application would be really useful for its intended purpose. Coming back to my issue: what I would like to find is a solution that keeps the HTTrack crawlers away without hard-coding anything on the web server. I want to try solving this at the webmaster level first, but your idea is definitely worth considering in the future. Thank you.
It should obey robots.txt, but robots.txt is something a crawler doesn't have to obey (and, for spam bots, it is actually a pretty good place to find what you don't want other people to see), so what's the guarantee that (even if it obeys it now) some future version won't include an option to ignore all robots.txt rules and meta tags? I think a better approach is to configure your server-side application to detect and block user agents. There is a chance that the user agent string is hard-coded in the crawler's source code and the user won't be able to change it to stop you from blocking that crawler. All you have to do is write a server script to log user agent information (or check your server logs) and then create blocking rules from it. Alternatively, you can just google a list of known "bad agents". To block user agents on a server that supports .htaccess, have a look at this thread for one way of doing it:
Block by useragent or empty referer
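If you would rather enforce this in application code than in .htaccess, here is a minimal WSGI sketch of the idea; the blocked strings are just examples, and since a user can change the crawler's user agent, treat this as a first line of defence rather than a guarantee:

```python
# Reject requests whose User-Agent matches a simple blocklist.
# HTTrack's default user agent string contains "HTTrack", but it can be changed.
from wsgiref.simple_server import make_server

BLOCKED_AGENTS = ("httrack", "wget")  # example substrings, lowercase

def application(environ, start_response):
    user_agent = environ.get("HTTP_USER_AGENT", "").lower()
    if any(bad in user_agent for bad in BLOCKED_AGENTS):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Forbidden\n"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]

if __name__ == "__main__":
    make_server("", 8000, application).serve_forever()
```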
I'm setting up another website and have run into this issue again. Once and for all, I would like to know the pros and cons of each method. No external resource seems to provide a decent answer, so I hoped fellow coders could help me. I don't want to know HOW to do it, since that is fairly easy to find out; I just want to know which one (www to non-www, or non-www to www) would be better for my site.
Redirect www.domain.com to domain.com.
Advertising the www. part of a domain is now rather antiquated, and besides, the address is shorter without it. When we post a domain, the general public assumes we mean the web service unless we specify otherwise.
If there's a question as to whether it's a domain at all (for less common top-level domains like .cc), I'd rather include http:// than www.
The main reason not to include the www. is that the address is shorter without it (i.e. the www. is not necessary).
Put yourself in the reader's shoes, with their short attention spans. Since the www. does not differentiate your website, you want the reader to see and recognize the differentiating part of your domain immediately. The best way to do this is to put the unique part first (without the www.). Plus, social networking and the mobile space like shorter links.
To summarize, the trend is to no longer use www., but to redirect for the people who are in the habit of typing www..
As far as I know there is no technical or SEO advantage either way, as long as the redirects work properly.
I prefer no-www, because the 'www.' is simply unnecessary.