Individual pages via GitHub Pages custom domain or FTP to the webserver

I manage a website which is built from a GitHub repository via an action which commits a live version to a certain branch, and then the webserver routinely checks if there are any updates on this branch and, if so, pulls them down to its public_html directory. This then serves the website on some domain, say
example.com.
For various (practically immutable) reasons, there are individual webpages that are "built" from other individual repositories — here I say "built" because these repositories are almost always just some .html files and such, with little post-processing, and could thus be served directly via GitHub Pages. I want these to be served at example.com/individual-page. To achieve this, I currently have a GitHub action which transfers the files via FTP to a directory on the webserver that is symlinked inside public_html, thus making all the files accessible.
However, it now occurs to me that I could "simply" (assuming this is even possible; I imagine it would need some DNS tweaking) activate GitHub Pages on these individual repositories, set the custom domain to example.com, and avoid having to go via FTP. On one hand, it seems conceptually simpler to have public_html on the webserver contain only the files coming from the main website build, and it would be easier to spin up new standalone pages from GitHub repositories; on the other hand, "everything that appears on example.com should be found in the same directory" also seems like a reasonable principle.
What (if any) is the recommended best practice here: to have these individual pages managed by GitHub pages with custom domains (since they are basically just web previews of the contents of the repositories), or to continue to transfer everything over to the webserver in one directory?
In other words: is it a "good idea" to partially host your website with GitHub Pages? Is this even possible with the right DNS settings?
(I must admit, I don't really understand what exactly my browser does when I navigate to example.com/individual-page, or what would happen if such a directory existed on my webserver while GitHub Pages was also trying to serve a webpage at the same address, so I guess bonus points if you want to explain the basics!)

The DNS option you describe doesn't work for pages.
While you can use a CNAME record to point your domain to another domain or an A record to point your domain to an IP address, DNS doesn't handle individual pages (as in example.com/a). It would work if each page was, for instance, a.example.com, but that's not a page, it's a whole different website.
What you can do, however, is include those other repositories as submodules of your repository, and then everything works without any DNS magic (except the simple CNAME record, which isn't magic).
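A minimal sketch of that setup (the repository name and path are hypothetical):

# Add the standalone page's repository as a submodule of the main site;
# the path becomes example.com/individual-page after the usual build/deploy
git submodule add https://github.com/you/individual-page.git individual-page
git commit -m "Add individual-page as a submodule"

# Later, to pick up new commits from the standalone repository:
git submodule update --remote individual-page
git commit -am "Update individual-page to latest"

Note that if your build runs in an action, the checkout step has to be told to fetch submodules (e.g. the submodules input of actions/checkout).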
Implementing the submodule approach would be a good idea, as it's the simplest. That said, as long as your current solution works automatically without issues and the hosting cost isn't a problem, I don't see any need to take the time to implement a new one.
If you want to only serve some files or pages from the submodules, you can have build actions and serve a specific directory.
I must admit, I don't really understand what exactly my browser does when I navigate to example.com/individual-page
Your browser requests the DNS records for your domain (example.com), in this case the A record (since this is the root domain). This A record gives an IP address, which your browser then uses to request the page. As you can see, individual pages aren't handled by DNS at all.
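You can watch this happen from the command line (the IP address below is made up):

# Ask DNS for the A record of the domain; note that no path is involved
dig +short A example.com
# 203.0.113.10

# The path only ever appears in the HTTP request sent to that IP address
curl -v http://example.com/individual-page
# > GET /individual-page HTTP/1.1
# > Host: example.com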
That means only one machine handles all requests for a domain (which doesn't mean it can't delegate some pages to another machine, but that's more complex). It also means "directory conflicts" are impossible: either your server handles the request or GitHub does. Your browser doesn't check whether the page exists on server A and, if not, try server B.

Related

How to do file versioning with a CDN and a load balancer?

So I'm using a very simple CDN service. You point it at your website, and if you request content through their hostname, they'll cache it for you after the first call.
I use this for all my static content, like JavaScript files and images.
This all works perfectly, and I like that it has very little maintenance or setup cost.
The problem starts when rolling out new versions of JavaScript files. A JavaScript file automatically gets a new hash if the file changes.
Because the rollout across multiple instances is not simultaneous, though, a problem occurs. I tried to model it in this diagram:
In words:
A request hits a server running the new version
The page requests the JS file with the new version hash
The CDN correctly detects that the file is not cached
The CDN requests the original file with the new hash from the load balancer
The load balancer routes the CDN's request to a random server, accidentally serving it from a server still running the old version
The CDN caches the old version under the new hash
Everyone gets served the old version from the CDN
There are some ways I know of to fix this, e.g. manually uploading files to separate storage with the hash baked into the name, etc.
But this needs extra code and has more "moving parts" that make maintenance more complicated.
I would prefer to have something that works as seamlessly as the normal CDN behavior.
I guess this is a common problem for sites that are running on multiple instances, but I can't find a lot of information about this.
What is the common way to solve this?
Edit
I think another solution would be to somehow force the CDN to go to the same instance for the .js file as for the original HTML file, but how?
Here are a few ideas from my solutions in the past, though the CDN you are using will rule out some of these:
Exclude .js files from the CDN caching service, preventing them from being cached in the first place.
Poke the CDN with a request to invalidate the cache for a specific file at release time.
In your build/deploy script, change the name of the .js file and reference the new file in your HTML.
Use query parameters after the .js file name, which are ignored but cached under a different address reference, e.g. /mysite/myscript.js?build1234 (a sketch of automating the last two ideas follows below).
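As a rough sketch of what that automation can look like at deploy time (all file names and paths here are hypothetical):

#!/bin/sh
# Derive a short hash from the file contents, so the URL changes
# exactly when the file itself changes
HASH=$(sha256sum static/myscript.js | cut -c1-8)

# Rewrite the reference in the built HTML to carry the hash as a
# cache-busting query parameter; renaming the file itself works the
# same way, with a mv before the sed
sed -i "s|myscript\.js[^\"]*|myscript.js?v=${HASH}|" build/index.html

Note that the query-parameter variant only helps if your CDN includes query strings in its cache key; some have to be configured to do so.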
The problem with this kind of issue is that the cached copies live on the browser and CDN side, so once a bad copy has been cached there is not much you can do from the server side.
The most common approach I know is basically the one you mention: adding a hash to the file names or to the URLs used to fetch them.
The thing is, you should not do this manually. Use a web application bundler, like Webpack, to automate the process; the right tool will depend on the technologies you are using. I saw this for the first time with GWT 13 years ago, and all the recent projects I've worked on, using AngularJS or React, were integrated with build tools that do this automatically.
Once it's implemented, your users will get the latest version, and resources will be cached correctly to speed up your site.
If you can also automate the full pipeline to remove old resources from the CDN once their configured expiration has been reached, so much the better.
I fixed this in the end by referencing the CDN version only after a few minutes of runtime.
So if the runtime is less than 5 minutes, it refers to:
/scripts/example.js?v=351
After 5 minutes it refers to the CDN version:
https://cdn.example.com/scripts/example.js?v=351
After 5 minutes we are pretty sure that all instances are running the new version, so we don't accidentally cache an old version under the new hash.
The downside is that at very busy moments you lose the advantage of the CDN right after a redeploy, but I haven't seen a better alternative yet.
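A minimal shell sketch of that switch-over logic, assuming a hypothetical release-timestamp file as the deploy marker (in a real application this decision would live in the page-rendering code):

# Time elapsed since the last deploy, from the release marker file
DEPLOY_TIME=$(stat -c %Y /var/www/app/RELEASE)
NOW=$(date +%s)

if [ $((NOW - DEPLOY_TIME)) -lt 300 ]; then
    PREFIX=""                         # first 5 minutes: serve locally
else
    PREFIX="https://cdn.example.com"  # afterwards: serve via the CDN
fi

echo "<script src=\"${PREFIX}/scripts/example.js?v=351\"></script>"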

Where are MX records set - in the domain or the hosting?

I have a domain name at web.com, hosting with another site, and want to set up email through Google Suite. My memory from previous experience tells me that I should be able to set MX records directly inside the domain, but web.com is telling me:
Your domain name is not pointing towards Web.com's name servers. This means if you want to make any changes, you will need to access the zone file at the place where you are pointing them, as control is there.
The hosting is taking place on another site. So I read the above and think this means I need to contact the guy who handles my hosting to add the records. So I do, and I get this response:
You have to do it through web.com to change DNS and add new records. It is done through the domain, not the hosting.
Is someone wrong? Or am I misunderstanding how this process works?
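(For reference, the zone file, and therefore the MX records, live at whichever DNS provider the domain's NS records point to; that can be checked directly, using example.com as a stand-in for the domain:)

# Which nameservers are authoritative for the domain?
dig +short NS example.com

# What MX records does the zone currently publish?
dig +short MX example.com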

Publicly available static files on Google Storage not loading if user's browser in incognito mode

We keep static files (images, JavaScript, and CSS) for our websites stored in a Google Storage bucket, with different folders for different types of resources. Each file is accessed via its name coupled with a custom subdomain mapped via a CNAME record to the appropriate Google Storage bucket.
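(From the outside, the setup looks like this; the subdomain is a stand-in for ours:)

# The custom subdomain is a CNAME to Google Cloud Storage's
# documented endpoint for CNAME-based bucket access
dig +short CNAME static.example.com
# c.storage.googleapis.com.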
This approach has worked fine. Today, however, when attempting to access our main website in Chrome's incognito (private browsing) mode, none of the pages on the site would load. After some detective work, we've determined that the problem is with the files stored at Google Storage, which are not loading.
Unfortunately, this doesn't seem to be a problem specific to Google Chrome. It occurs in the private browsing modes in Firefox and Internet Explorer as well (at least on the Windows 8.1 Professional platform we're using for testing).
The problem appears to occur only if we use the CNAME-based approach for accessing a file. For example, if this method is used in a private browser window to access one of our image files on Google Storage,
Image of a crowd on Google Storage - direct access to Google Storage
the file can be viewed without a problem. If, on the other hand, the file is viewed in a private browsing window using the CNAME approach, like this
Image of a crowd on Google Storage - access via CNAME
the image will not load.
What's worse, for reasons we don't completely understand, once this problem occurs in a private browsing window, in some browsers it will continue to interfere with the proper viewing of the website in regular (non-private) browser windows.
Has anyone encountered this problem and, if so, found a solution for it?
Thanks in advance for any tips or suggestions.
UPDATE (2015-05-26)
This problem is still under investigation. It may be ISP-specific, although our ISP (Verizon) believes it is a problem on Google's end. An attempt yesterday to fix things by tweaking some DNS settings seemed to work, but only temporarily; we began to experience the problem again today. I will update this posting further as more information becomes available.
ADDITIONAL UPDATE (2016-08-25)
(Note: I originally wrote this update on 2015-05-26, but failed to post it, and discovered it today. I'm adding it to complete the description of the issue.)
This issue has been resolved. I cannot say for certain what the source of the problem was, but I can give further information on the exact nature of the problem and on what may have solved it.
As I mentioned in the comments below, this appears to have been a relatively isolated issue. Further investigation revealed that the problem occurred only when the particular subdomain was accessed through Verizon Internet service (land-based or mobile) in the U.S. I do not know whether the problem was regional within the Verizon system or spanned the entire Verizon network, but I do know it affected both landline and mobile access using Verizon.
The problem also evolved. What started as a problem accessing files at the subdomain in a browser's incognito mode became a problem regardless of the browsing mode used. That said, it only occurred when the files were loaded in a browser; they could be retrieved without issue using, for example, wget, and pinging the subdomain worked fine over the Verizon network as well.
As the problem became more acute, I decided to do a thorough check of the DNS settings related to the subdomain. Here is where I discovered what may have been causing the problem. There was a slight discrepancy between the DNS settings at the domain registrar and the (separate) DNS service that we use.
The discrepancy didn't lead to conflicting reports as to how the subdomain should be resolved (which is probably why this problem hadn't occurred in the past). But, if I recall correctly, it led to the DNS service providing the CNAME record for the subdomain, without the registrar's DNS information fully confirming that the DNS service had the right to provide that information.
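(For anyone checking for this kind of discrepancy themselves, one way is to query each set of nameservers directly and compare the answers; the nameserver hostnames here are hypothetical:)

dig +short CNAME static.example.com @ns1.registrar.example
dig +short CNAME static.example.com @ns1.dnsservice.example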
This discrepancy was corrected. Within an hour or two, the problem resolved itself; anyone viewing the file using the two links above should now succeed with both.
I cannot say for certain, however, whether the change to the DNS settings we made to resolve the discrepancy, or some updating at Verizon, was responsible for the problem being resolved. I will say, however, that I never reported the issue to Verizon. (I didn't get that far.)
Although the DNS discrepancy had existed for more than a year or two, and had not created any problems that we were aware of, I personally think it is what caused the problem.

How to replace single web resources locally?

I often work with websites to which I have no code access but for which I need to test JavaScript, AJAX calls, and other resources in locally modified versions.
As an example, I needed to test new creative code originally served from a DoubleClick server. Instead of working directly on DoubleClick and losing a lot of time waiting for my changes to take effect, I wanted to manipulate a local copy of that JavaScript resource.
So I changed my /etc/hosts to point the DoubleClick server to localhost, where I run a local web server. This way I was able to test a local script instead of the original.
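The hosts-file change amounts to a single line (the hostname is hypothetical; use whichever server the resource normally loads from):

# /etc/hosts: resolve the remote host to the local web server
127.0.0.1   static.doubleclick.net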
Of course, this redirects all resources from that host, not only the one I was interested in, which often results either in weird behaviour as resources become unavailable or in a lot of effort on my part to make them available.
Is there a way to replace a specific URL by redirecting it to a local file/URL while leaving all the others from the same origin untouched?

CDN that supports switching between 2 files, dependent on User-Agent

I have a conundrum. I'd like my entire domain to be hosted by a CDN, so the root page, www.mysite.com/, should be served by the CDN. This is fine. However, I'd like to conditionally serve a different page (or a redirect) depending on whether the user-agent string is detected to be mobile (e.g. as on http://detectmobilebrowser.com/). And I'd like this, if possible, to be done server-side.
I know CloudFront can serve 2 different versions of the same file depending on a request header (gzipped or not), but I can't find any documentation stating whether it or any other CDN supports switching based on the user agent. Has anyone come across a way of doing this?
Thanks for any much appreciated help :D, Alec
Your CDN must be able to reply with an HTTP 301 Moved Permanently response based on parsing the User-Agent header when the user tries to access the webpage or object you'd like to switch.
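If you also control a server in front of (or beside) the CDN, a minimal sketch of that redirect with nginx could look like this; the pattern list is illustrative, not a complete mobile detector:

# Classify the request by User-Agent (very rough mobile detection)
map $http_user_agent $is_mobile {
    default                          0;
    "~*(android|iphone|ipod|mobile)" 1;
}

server {
    listen 80;
    server_name www.mysite.com;

    # Send mobile browsers to a mobile variant of the page
    if ($is_mobile) {
        return 301 http://www.mysite.com/mobile/;
    }

    root /var/www/mysite;
}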
A content delivery network (CDN) is intended more for hosting your static content, like images, scripts, media files, documents, etc., than your entire website.
The idea is to lighten the load by offloading static content from your origin server, as well as serving that content closer to users through a network of servers around the world.
A typical hosting setup for what you would like to do would be to have a page/server hosted at a "normal" provider, detect the user agent (client side or server side), and then render the links to the static resources hosted on the CDN based on the user agent.
To your second point: as mentioned before, CDNs are meant to host static files, so server-side detection of the user agent is unlikely to be supported. If you have a hosting environment as I described, with a page/server at the provider of your choice plus a CDN, you'll have all the options.
Some providers (e.g. Media Temple) offer CDN support on top of their normal page/server hosting.
Hope that helps.