How to link files directly from GitHub (raw.github.com)

Are we allowed to link files directly from GitHub?
<link rel="stylesheet" href="https://raw.github.com/username/project/master/style.css"/>
<script src="https://raw.github.com/username/project/master/script.js"></script>
I know this is allowed on Google Code. This way I don't have to worry about updating a local file.

The great service RawGit was already mentioned, but I'll throw another into the ring: GitCDN.link
Benefits:
Lets you link to specific commits, as well as auto-get the latest (aka master)
Incurs no damage from high traffic volumes; RawGit asks that its dev.rawgit.com links be used only during development, whereas GitCDN gives you access to the latest version without the danger of the servers exploding
Gives you the option of auto-minifying your HTML, CSS and JavaScript, or serving it as written (https://min.gitcdn.link)
Adds compression (gzip)
Adds all the correct headers (Content-Type, Cache-Control, ETag, etc.)
Full disclosure: I'm a project maintainer at GitCDN.link
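For example, at the time of writing a stylesheet could be referenced roughly like this (hypothetical repository; the exact URL pattern may have changed, so check the GitCDN.link homepage):
<link rel="stylesheet" href="https://gitcdn.link/repo/username/project/master/style.css"/>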

You can use the external service rawgithub.com. Just remove the dot between the words 'raw' and 'github': https://raw.github.com/... => https://rawgithub.com/... and use that URL instead. You can find more info in this question.
However, according to the rawgithub website, it will be shutting down at the end of October 2019.

You can link directly to raw files, but it's best not to, since the raw files are always sent with a text/plain Content-Type header and can cause loading problems.

You need to carry out the following steps:
Get the raw URL of the file from GitHub. It is something like https://raw.githubusercontent.com/username/folder/example.css
Visit http://rawgit.com/ and paste the raw URL above into the input box. It will generate two URLs, one for development and the other for production.
Copy either one and you are done.
The file will act as if served from a CDN. You can also use gist URLs.
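For illustration (hypothetical repository), RawGit turned a raw URL into a pair of URLs roughly like this, development first, production second:
https://rawgit.com/username/project/master/example.css
https://cdn.rawgit.com/username/project/master/example.css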

GitHub Pages: https://yourusername.github.io/script.js
GitHub repo raw files: https://github.com/yourusername/yourusername.github.io/blob/master/script.js
Use GitHub Pages, DO NOT use raw files.
Reason:
GitHub Pages is served through a CDN; raw files are not. Accessing raw files hits GitHub's servers directly and increases server load.
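For example, with the repository above, you would link the CDN-backed Pages copy rather than the raw file:
<script src="https://yourusername.github.io/script.js"></script>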

Add a branch to your project named "gh-pages" and then (shortly after branching) you'll be able to use a direct URL such as https://username.github.io/project/master/style.css (using your URL, and assuming "style.css" is a file in a "master" folder in the root of your "project" repository... and that your GitHub account is "username").
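A minimal way to create that branch from the command line (standard git commands, run inside your repository):
git checkout -b gh-pages
git push origin gh-pages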

For those who ended up in this post and just want the raw link for an image hosted on GitHub:
For an image, you can simply append '?raw=true' to the link to the file.
E.g.
Original link:
https://github.com/githubusername/repo_name/blob/master/20160309_212617-1.png
Raw link:
https://github.com/githubusername/repo_name/blob/master/20160309_212617-1.png?raw=true
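GitHub answers the ?raw=true URL with a redirect to the corresponding raw.githubusercontent.com URL, so it can be embedded directly, e.g.:
<img src="https://github.com/githubusername/repo_name/blob/master/20160309_212617-1.png?raw=true">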

Use jsdelivr.com
Copied directly from https://www.jsdelivr.com/?docs=gh:
load any GitHub release, commit, or branch
note: we recommend using npm for projects that support it
https://cdn.jsdelivr.net/gh/user/repo@version/file
load jQuery v3.2.1
https://cdn.jsdelivr.net/gh/jquery/jquery@3.2.1/dist/jquery.min.js
use a version range instead of a specific version
https://cdn.jsdelivr.net/gh/jquery/jquery@3.2/dist/jquery.min.js
https://cdn.jsdelivr.net/gh/jquery/jquery@3/dist/jquery.min.js
omit the version completely to get the latest one
you should NOT use this in production
https://cdn.jsdelivr.net/gh/jquery/jquery/dist/jquery.min.js
add ".min" to any JS/CSS file to get a minified version
if one doesn't exist, we'll generate it for you
https://cdn.jsdelivr.net/gh/jquery/jquery@3.2.1/src/core.min.js
add / at the end to get a directory listing
https://cdn.jsdelivr.net/gh/jquery/jquery/
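For example, to include the jQuery build above on a page:
<script src="https://cdn.jsdelivr.net/gh/jquery/jquery@3.2.1/dist/jquery.min.js"></script>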

After searching for this same functionality, I ended up writing my own PHP script to act as a proxy. The trouble I kept running into is that even when you get the raw link from GitHub and reference it in your own page, the header sent over is 'text/plain', so Chrome would not execute my JavaScript file from GitHub. I also didn't like the other links posted for using third-party services, because of the obvious security/tampering issues possible.
So using this script, I can pass the raw link from GitHub to it, have the script set the correct headers, and then output the file as if it were coming from my own server. This script can also be used from a secure (HTTPS) application to pull in non-HTTPS scripts without throwing SSL errors warning of "non-secure links used".
Linking:
<script src="proxy.php?link=https://raw.githubusercontent.com/UserName/repo/master/my_script.js"></script>
proxy.php
<?php
###################################################################################################################
#
# This script can take two URL variables
#
# "type"
#     OPTIONAL
#     STRING
#     Sets the type of file that is output
#
# "link"
#     REQUIRED
#     STRING
#     The link to grab and output through this proxy script
#
###################################################################################################################

# First we need to set the headers for the output file,
# so check whether the type is specified and, if so, set it according to what is being requested
if(isset($_GET['type']) && $_GET['type'] != ''){
    switch($_GET['type']){
        case 'css':
            header('Content-Type: text/css');
            break;
        case 'js':
            header('Content-Type: text/javascript');
            break;
        case 'json':
            header('Content-Type: application/json');
            break;
        case 'rss':
            header('Content-Type: application/rss+xml; charset=ISO-8859-1');
            break;
        case 'xml':
            header('Content-Type: text/xml');
            break;
        default:
            header('Content-Type: text/plain');
            break;
    }
# Otherwise, try to determine what file type should be output from the file extension in the link
}else{
    # See if we can find a file extension in the link specified and set the headers accordingly.
    # Note: '.json' must be tested before '.js', because every '.json' link
    # also contains '.js' and would otherwise be mislabeled as JavaScript.
    if(strstr($_GET['link'], '.css') !== FALSE){
        header('Content-Type: text/css');
    }elseif(strstr($_GET['link'], '.json') !== FALSE){
        header('Content-Type: application/json');
    }elseif(strstr($_GET['link'], '.js') !== FALSE){
        header('Content-Type: text/javascript');
    }elseif(strstr($_GET['link'], '.rss') !== FALSE){
        header('Content-Type: application/rss+xml; charset=ISO-8859-1');
    }elseif(strstr($_GET['link'], '.xml') !== FALSE){
        header('Content-Type: text/xml');
    # If we still haven't found a suitable file extension, fall back to plain text
    }else{
        header('Content-Type: text/plain');
    }
}

# Now get the contents of the page we want
$contents = file_get_contents($_GET['link']);

# And finally, spit everything out
echo $contents;
?>
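One caveat: as written, the script will fetch whatever URL it is given, which makes it an open proxy. A minimal hardening sketch, assuming you only ever proxy files from raw.githubusercontent.com (this check is my addition, not part of the original script; place it before the header logic):
<?php
# Refuse any link whose host is not GitHub's raw-content domain
$host = isset($_GET['link']) ? parse_url($_GET['link'], PHP_URL_HOST) : NULL;
if($host !== 'raw.githubusercontent.com'){
    http_response_code(400);
    exit('Invalid link');
}
?>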

If your web server has allow_url_fopen active, GitHub serving the files as raw text/plain is not a problem, since you can first read the file in a PHP script of your own and send the proper MIME-type header before outputting it. (Note: it is allow_url_fopen that lets functions like readfile() and file_get_contents() open remote URLs; allow_url_include is only needed to include/require remote files and is best left disabled.)
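A minimal sketch of that idea (the repository path is a placeholder):
<?php
# Re-serve a raw GitHub file with the correct MIME type.
# Reading a remote URL this way requires allow_url_fopen.
header('Content-Type: text/javascript');
readfile('https://raw.githubusercontent.com/UserName/repo/master/my_script.js');
?>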

Related

TYPO3: File download in Backend

I would like to integrate a CSV download in the backend. The CSV file doesn't have to be saved on the server, so just a simple array-to-CSV for download.
I know using FAL is quite tedious in TYPO3, so I would like to know if there is a simple solution for my issue, like calling a "download" action and returning a "CSV string" to download.
I did use this solution for the download action, but I am looking for a solution without FAL and without keeping the file on the server.
No need for FAL or saving a file on the server. You can add a custom action in your controller that sets the content-type and disposition headers to treat your request like a download:
public function exportAction()
{
    // Just an example of how you could access the downloadable data.
    $records = $GLOBALS['TYPO3_DB']->exec_SELECTgetRows('*', 'tx_domain_model_table');
    // Modify the result to be a CSV-encoded string, JSON, or whatever you want it to be.
    $data = myConvert($records, 'csv');
    header('Content-Type: text/x-csv');
    header('Content-Disposition: attachment; filename="download.csv"');
    header('Pragma: no-cache');
    return $data;
}
Here $data is, for example, a CSV-encoded string.
What's more interesting is what kind of data you want to make downloadable; setting the header()s and returning any simple data type should work.
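For completeness, here is one possible shape for the myConvert() helper used above (my sketch, not part of the original answer; it only handles the 'csv' case):
<?php
function myConvert(array $records, $format)
{
    if ($format !== 'csv') {
        throw new \InvalidArgumentException('Only csv is handled in this sketch');
    }
    // Encode the rows through PHP's CSV writer into an in-memory stream
    $stream = fopen('php://temp', 'r+');
    foreach ($records as $row) {
        fputcsv($stream, $row);
    }
    rewind($stream);
    $csv = stream_get_contents($stream);
    fclose($stream);
    return $csv;
}
?>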

TYPO3 7.6: 404 error page: HTML wrapped in numbers

I created my own “404 Page not found” error page on a TYPO3 website and implemented it via the /typo3conf/LocalConfiguration.php as follows, using the page’s Speaking URL path:
return [
...
'FE' => [
...
'pageNotFound_handling' => '/page-not-found/',
]
]
Now when I call a non-existing page, the error page gets displayed, but there is a 4-character alphanumeric string (hexadecimal, as far as I've seen by now) BEFORE the HTML source code and a "0" AFTER it. Example (the number at the beginning is different after most reloads):
37b3
<!DOCTYPE html>
...
</html>
0
When calling the error page URL itself the page is returned correctly without those numbers.
Having the RealURL extension activated or deactivated does not make a difference.
Thanks a lot in advance!
I added the full description from the install tool and I guess we might find the solution there.
How TYPO3 should handle requests for non-existing/accessible pages.
empty (default)
The next visible page upwards in the page tree is shown.
'true' or '1'
An error message is shown.
String
Static HTML file to show (reads content and outputs with correct headers), e.g. notfound.html or http://www.example.org/errors/notfound.html.
Prefix "REDIRECT:"
If prefixed with "REDIRECT:" it will redirect to the URL/script after the prefix.
Prefix "READFILE:"
If prefixed with "READFILE" then it will expect the remaining string to be a HTML file which will be read and outputted directly after having the marker "###CURRENT_URL###" substituted with REQUEST_URI and ###REASON### with reason text, for example: READFILE:fileadmin/notfound.html.
Prefix "USER_FUNCTION:"
If prefixed with "USER_FUNCTION:" a user function is called, e.g. USER_FUNCTION:fileadmin/class.user_notfound.php:user_notFound->pageNotFound where the file must contain a class user_notFound with a method pageNotFound() inside with two parameters $param and $ref.
What you configured:
You're passing a string, so TYPO3 expects to find a static HTML file, which it doesn't find, because what you passed is a URL path rather than a file.
From what you try to achieve, I'd go with REDIRECT:/page-not-found/.
Thanks for pointing this one out, by the way; I will remove the string configuration from the core, since it does not make sense to have more people trip over this pitfall.
In short: change the following line in the FE section of your LocalConfiguration.php:
'pageNotFound_handling' => '/your404page.html',
to
'pageNotFound_handling' => 'REDIRECT:/your404page.html',
Cause
The actual cause is a combination of chunked Transfer-Encoding and TYPO3 not being able to decode it in some cases. In your case the page-not-found handler eventually uses GeneralUtility::getUrl() to retrieve the error page.
If you have [SYS][curlUse] enabled, it will use cURL to retrieve the page and there is no problem.
If you don't have [SYS][curlUse] enabled, it will open a socket, read the headers and then read the rest of the body. If the web server uses the "chunked" Transfer-Encoding, the body contains blocks of data, and each block starts with a line giving its length in hexadecimal format. The content ends with an empty block (with, of course, a line with the length "0").
cURL apparently knows how to decode chunked data.
getUrl() itself does not know how to handle chunked data, and it uses the content as-is as the page content.
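To make that concrete, here is a rough sketch (my own illustration, not TYPO3 code) of what decoding a chunked body involves:
<?php
function decodeChunkedBody($body)
{
    $decoded = '';
    $offset = 0;
    while (($lineEnd = strpos($body, "\r\n", $offset)) !== false) {
        // Each chunk starts with its length in hexadecimal on its own line
        $size = hexdec(trim(substr($body, $offset, $lineEnd - $offset)));
        if ($size === 0) {
            break; // a zero-length chunk ("0") marks the end of the content
        }
        $decoded .= substr($body, $lineEnd + 2, $size);
        // Skip past the chunk data and its trailing CRLF
        $offset = $lineEnd + 2 + $size + 2;
    }
    return $decoded;
}
?>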
In TYPO3 8 LTS the guzzle library is used to handle HTTP requests. In the guzzle code I can't find anything about handling chunked data. Guzzle will check if the cUrl PHP extension is present and use that as preferred transport. In most installations cUrl is present and since this decodes chunked data automagically no problem is visible. I have to test guzzle with PHP that has cUrl disabled to see if the issue is also present in v8/master.
Workaround/solution
If the PHP extension cURL is enabled in your installation, you can simply enable [SYS][curlUse] in the Install Tool. The numbers around the 404 page content will disappear.

How to determine file extension of downloaded content

I am downloading multiple files which may be of different types (eg. PDF or TIFF). I would like to save the files with the correct file extension for each file. I am able to look at the content-type header using:
$type = $mech->response->headers->header( 'Content-Type' );
Then I can work from there and make up my own file extensions based on the content type found, but is there a Perl module that already does this? How else can it be done?

How to use Postgres and static file manipulation in Nginx

I would like to use Nginx as my CDN for a file hosting system. I saw a great module for Nginx that allows a Postgres connection (https://github.com/FRiCKLE/ngx_postgres). It works really well; however, when I try to use it together with the alias directive, it seems to ignore the alias or the file download and gives me an empty file instead.
My idea is to take the UUID from the URL, find the correct file with a query, and then use the found details to set the filename header, so that the user's client will automatically name the downloaded file with the original filename instead of a UUID.
Here is the code.
location /dl {
    postgres_output none;
    postgres_pass database;
    postgres_query "SELECT * FROM \"Files\" WHERE uuid = '$args'";
    postgres_set $filename 0 name;
    alias /home/ubuntu/fileStorage;
    add_header Content-Disposition "attachment; filename=$filename";
}
I think somehow the postgres directive is locking up this block. Is there a way I can run the postgres query without affecting the download block?
It seems that you expect the line
add_header Content-Disposition "attachment; filename=$filename";
to cause the browser to download the file given by $filename. That is not how the Content-Disposition header works: it only tells the browser to treat the response body as an attachment (and what to name it); it does not change which body is sent. Here the postgres_pass handler generates the response, so the alias never comes into play. You're going to have to do something additional to get the proper content to the client. Perhaps what you really want is to issue a redirect?

How do I know what to name a file downloaded using HTTP?

I am creating an HTTP client downloader in Python. I am able to correctly download a file such as http://www.google.com/images/srpr/logo11w.png just fine. However, I'm not sure what to actually name the thing.
There is of course the filename at the end of the URL, but is this always reliable?
If I recall correctly, wget uses the following heuristic:
If a Content-Disposition header exists, get the filename from there.
If the filename component of the URL exists (e.g. http://myserver/filename), use that.
If there is no filename component (e.g. http://www.google.com), derive the filename from the Content-Type header (such as index.html for text/html)
In all cases, if this filename is already present in the directory use a numerical suffix, such as index (1).html, or overwrite, depending on configuration.
There are plenty of other flags that control other heuristics, such as creating .html for ASP/DHTML content-types.
In short, it really depends how far you want to go. For most people, doing the first two plus a basic Content-Type->name mapping should be enough.
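A rough sketch of those first two heuristics plus the fallback (written in PHP to match the other code in this thread; the function name and regex are my own, not taken from wget):
<?php
function guessFilename($url, array $headers)
{
    // 1. A Content-Disposition header with a filename wins.
    foreach ($headers as $header) {
        if (preg_match('/^Content-Disposition:.*filename="?([^";]+)"?/i', $header, $m)) {
            return basename($m[1]);
        }
    }
    // 2. Otherwise use the last path segment of the URL, if any.
    $path = parse_url($url, PHP_URL_PATH);
    $name = is_string($path) ? basename($path) : '';
    if ($name !== '') {
        return $name;
    }
    // 3. Fall back to a name derived from the Content-Type
    //    (only the text/html default is handled in this sketch).
    return 'index.html';
}
?>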