Encoding issue on some pages (JSON files) served by GitHub Pages

I have an encoding issue on this repository: https://github.com/franceimage/franceimage.github.io
1/ Accents are wrong when I display https://franceimage.github.io/json/youtube.json in my browser (served by GitHub Pages).
2/ However, accents are right when I display the same file served locally (jekyll serve).
3/ Accents are right on the HTML pages (served by GitHub Pages).
Can somebody explain what is happening?

When you request json/youtube.json:
Locally, you get a Content-Type: application/json; charset=UTF-8 response header.
From GitHub Pages, you get Content-Type: application/json.
The transmitted files are identical.
As RFC 4627 states: "JSON text SHALL be encoded in Unicode. The default encoding is UTF-8."
It seems that browsers do not fall back to UTF-8 when they display a response whose Content-Type: application/json header carries no charset parameter, so the UTF-8 bytes are presumably decoded with some other default encoding and the accents come out wrong.
One idea is to take this question to the Jekyll / GitHub Pages community. You could file a feature request asking GitHub Pages to send the encoding in the header.
Jekyll Talk can be a good entry point for such a question.
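The difference between the two responses can be reproduced offline. The following is only a minimal sketch, not a description of what any particular browser does: `charset_from_content_type` and `decode_body` are hypothetical helpers, the sample JSON is made up, and windows-1252 is assumed as the legacy fallback encoding.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset()


def decode_body(body, content_type, fallback="windows-1252"):
    # Assumption: a client with a legacy default falls back to windows-1252
    # when the header names no charset.
    return body.decode(charset_from_content_type(content_type) or fallback)


# Made-up sample payload; the bytes are identical in both cases.
body = '{"title": "Château"}'.encode("utf-8")

local = decode_body(body, "application/json; charset=UTF-8")
pages = decode_body(body, "application/json")

print(local)  # {"title": "Château"}
print(pages)  # {"title": "ChÃ¢teau"}
```

The bytes never change; only the charset chosen by the reader does, which matches the observation that the transmitted files are identical.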

Related

Best practices for linking [<a href] canonical URL that have redirects while using localhost

Say I have a website:
https://www.example.com
This website has many different HTML pages such as:
https://www.example.com/page.html
The website is hosted on AWS Amplify and has a variety of 301 redirects which are handled with JSON. Below is an example:
[
  {
    "source": "https://www.example.com/page.html",
    "target": "https://www.example.com/page",
    "status": "301",
    "condition": null
  }
]
So, as a result, my page always shows /page instead of /page.html on the client side, as expected. I read a lot about canonical URLs today and learned:
For the quickest effect, use 3xx HTTP (also known as server-side) redirects.
Suppose your page can be reached in multiple ways:
https://example.com/home
https://home.example.com
https://www.example.com
Pick one of those URLs as your canonical URL, and use redirects to send traffic from the other URLs to your preferred URL.
From: How to specify a canonical with rel="canonical" and other methods | Google Search Central | Documentation | Google Developers
This is what I did with the JSON in AWS. I also found that using <link rel="canonical" href="desired page"> in the <head> of my HTML is the best practice for telling Google (Analytics, etc.) which page is the desired canonical one, and I have since updated all my pages accordingly.
Now the main problem is that whenever you hover over a link or copy its address, it includes the .html extension on the client side. As soon as this link is pasted and entered, the server redirects to the URL without the .html extension. My question is: what is the best practice for excluding the extension, so that the target address is what is shown when copying the link address, or when hovering and the href appears in the bottom left (Chrome MacOS 110.0.5481.77)?
I've seen sites using absolute paths that include the full domain. This isn't a problem in itself; however, most of the development of the site is done on localhost. Using absolute URLs would make that a hassle, as I would have to type in the full local path each time, including the .html extension, to get an accurate representation locally. Is there a correct way to do this?
*Most of this is new information to me, so if something I'm saying is invalid, please correct me.

Can I embed a google document in Github's readme.md using markdown?

I am trying to embed a Google Doc in GitHub's readme.md using markdown. Is this possible?
I have done the following:
Published the document to web and copied the iframe code.
Pasted the code in markdown. Nothing happens.
Pasted the code between and nothing happens.
Any suggestions?
This isn't possible. When GitHub renders a README or other text document on the site, it gets passed through a filter to sanitize it and remove anything potentially malicious, including JavaScript and iframes. That's because these documents are rendered in the context of the github.com domain, and any malicious code could steal user credentials or otherwise create privacy or security problems.
Note that even if you could bypass the sanitizer, GitHub sends a Content-Security-Policy header that restricts JavaScript to a single, specific domain and rejects all frames, so your browser would not render such content, and it would use the report URI to send a notification to GitHub that a violation had been detected.
You could save the Google Doc as Markdown, AsciiDoc, or HTML and place it in your repository for people to use if you want it to be visible on GitHub.
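To illustrate what such a filter does to a pasted iframe, here is a deliberately simplified sketch. It is not GitHub's sanitizer: the `BLOCKED_TAGS` set, the `StripBlockedTags` class, and the `sanitize` function are hypothetical names chosen for this example, and a real sanitizer works from an allowlist and also cleans attributes.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical blocklist for illustration only; real filters are far stricter.
BLOCKED_TAGS = {"iframe", "script"}


class StripBlockedTags(HTMLParser):
    """Re-emit the HTML it is fed, dropping blocked elements and their content."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.blocked_depth = 0  # > 0 while inside a blocked element

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKED_TAGS:
            self.blocked_depth += 1
        elif self.blocked_depth == 0:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in BLOCKED_TAGS:
            self.blocked_depth = max(0, self.blocked_depth - 1)
        elif self.blocked_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.blocked_depth == 0:
            # Character references were decoded by the parser, so re-escape.
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = StripBlockedTags()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


readme = '<p>Spec</p><iframe src="https://example.com/doc"></iframe>'
print(sanitize(readme))  # <p>Spec</p>
```

The iframe disappears without any error, which is consistent with "nothing happens" after pasting the embed code.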

Can't upload files to github - HTTP ERROR 400, crbug/1173575

I'm trying to upload some images to my github repo. In Firefox and Firefox developer edition, I'm getting a blank screen after I click "Commit changes", with an error in the console:
The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol.
In Chrome and in Edge, I get a mostly blank page with:
This page isn’t working. If the problem continues, contact the site owner. HTTP ERROR 400
And in the console:
Failed to load resource: the server responded with a status of 400 ()
VM9:7146 crbug/1173575, non-JS module files deprecated.
I can't find any information about what might be causing this and it doesn't seem to be happening to anyone else - it was working fine this morning, but is now failing in all browsers.
Edit: This is happening in both my GitHub accounts, so it is not account-specific.

Encrypted site and header location redirect

I have recently encrypted a site for a client and am finding that some people get an insecure-site warning in Chrome with the triangle icon.
My major problem, though, is that the contact form PHP processor has a:
header('location:https://www.bioloo.co.nz/index.php/thanks');
but viewers get the warning instead of being taken to the Thank You page. My web host reports that the certificate is valid and all absolute URLs are set properly.
Do I have to set a no-cache tag?
Any ideas please?

Facebook Share problem for Non English Urls

We have an Arabic website and we are trying to share a URL on Facebook. The URL looks like
http://www.website.com/ar/شاهدى-عروض-الأزياء-العالمية-بعيون-عربية/موضة/story/75
The problem is that Facebook does not pick up the thumbnails present on the above link.
When we debugged this through Fiddler, we found that the URL Facebook is trying to access is not the same as the one given above; it looks like
www.website.com/ar/%c3%98%c2%b4%c3%98%c2%a7%c3%99%e2%80%a1%c3%98%c2%af%c3%99%e2%80%b0-%c3%98%c2%b9%c3%98%c2%b1%c3%99%cb%86%c3%98%c2%b6-%c3%98%c2%a7%c3%99%e2%80%9e%c3%98%c2%a3%c3%98%c2%b2%c3%99%c5%a0%c3%98%c2%a7%c3%98%c2%a1-%c3%98%c2%a7%c3%99%e2%80%9e%c3%98%c2%b9%c3%98%c2%a7%c3%99%e2%80%9e%c3%99%e2%80%a6%c3%99%c5%a0%c3%98%c2%a9-%c3%98%c2%a8%c3%98%c2%b9%c3%99%c5%a0%c3%99%cb%86%c3%99%e2%80%a0-%c3%98%c2%b9%c3%98%c2%b1%c3%98%c2%a8%c3%99%c5%a0%c3%98%c2%a9/%c3%99%e2%80%a6%c3%99%cb%86%c3%98%c2%b6%c3%98%c2%a9/story/75
I need to know what Facebook did to the URL to turn it into the form shown.
One more thing I know is that this URL is not UTF-8 encoded. If the given Arabic URL is converted to UTF-8, it looks like the following and not like the above:
www.website.com/ar/%D8%B4%D8%A7%D9%87%D8%AF%D9%89-%D8%B9%D8%B1%D9%88%D8%B6-%D8%A7%D9%84%D8%A3%D8%B2%D9%8A%D8%A7%D8%A1-%D8%A7%D9%84%D8%B9%D8%A7%D9%84%D9%85%D9%8A%D8%A9-%D8%A8%D8%B9%D9%8A%D9%88%D9%86-%D8%B9%D8%B1%D8%A8%D9%8A%D8%A9/%D9%85%D9%88%D8%B6%D8%A9/story/75
So I need to know which encoding Facebook is using, or what Facebook is doing, to access the following URL when we share it:
www.website.com/ar/شاهدى-عروض-الأزياء-العالمية-بعيون-عربية/موضة/story/75
http://www.website.com/ar/شاهدى-عروض-الأزياء-العالمية-بعيون-عربية/موضة/story/75
That's not a URI (or URL). It's an IRI. Unfortunately a lot of software doesn't support IRI directly (including SO, as you can see from the way it has linked only the first part of the address!).
So if you want the link to work everywhere you'll have to write it as a plain URI with UTF-8-URL-encoded pathnames, as in the last example (%D8%B4...). Browsers will usually present the encoded link in the address bar as a nice IRI even when the link in the HTML document is a plain URI.
%c3%98%c2%b4... is what you get when you take bytes that are UTF-8 encoded and treat them as if they were ISO-8859-1-encoded (strictly, Windows-1252, judging by sequences such as %e2%80%a1), and then UTF-8-URL-encode them again, giving a broken “double UTF-8”. How are you getting the IRI into Facebook? Either there's an interface you're using that you're sending UTF-8 but which expects ISO-8859-1, or it's just a plain old bug on Facebook's part. Either way, you'll have to use the URI version for now.
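The two encodings can be reproduced with a short sketch. It uses only the first four letters of the path from the question and assumes cp1252 (the Windows superset of ISO-8859-1) as the wrongly applied encoding, since the byte 0x87 only becomes ‡ (%e2%80%a1) under cp1252.

```python
from urllib.parse import quote, unquote

segment = "شاهد"  # first four letters of the path in the question

# Correct form: percent-encode the UTF-8 bytes once.
correct = quote(segment)

# Broken form: the UTF-8 bytes are misread as cp1252 text,
# and that text is then percent-encoded as UTF-8 a second time.
misread = segment.encode("utf-8").decode("cp1252")
double_encoded = quote(misread)

print(correct)                 # %D8%B4%D8%A7%D9%87%D8%AF
print(double_encoded.lower())  # %c3%98%c2%b4%c3%98%c2%a7%c3%99%e2%80%a1%c3%98%c2%af

# Reversing the mistake recovers the original text.
repaired = unquote(double_encoded).encode("cp1252").decode("utf-8")
print(repaired == segment)     # True
```

Both printed strings match the start of the path segments quoted in the question, which supports the "double UTF-8" explanation.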