"Sorry, this is too big to display." - github

I uploaded a .ipynb file to GitHub using Git LFS, but the file is not being displayed. Any ideas on how to fix this?
"Sorry, this is too big to display."
The option "Include Git LFS objects in archives" in the settings is already active. I tried to find something here but couldn't find any topic with this specific problem. I would like to add this to my portfolio repo. Thanks a lot.

The problem is that the file is too large to display in the web view. The file is 36.8 MB, and to render that Jupyter Notebook, GitHub would have to download and process it, which, given its size, would probably take substantial resources and might simply time out.
As a result, GitHub is telling you that it's not going to render it, because rendering would likely fail due to the timeouts and resource limits it enforces to prevent denial-of-service attacks. If you want to display the notebook elsewhere, such as on your own site, you can render it there yourself, but GitHub won't do it for you.
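If you want a rendered version to link from a portfolio, one option is to convert the notebook to static HTML yourself and host that, or point a service such as nbviewer.org at the raw file. A minimal sketch using nbconvert, assuming a hypothetical filename:

    # convert the notebook to a standalone HTML page (requires the nbconvert package)
    pip install nbconvert
    jupyter nbconvert --to html portfolio-notebook.ipynb
    # produces portfolio-notebook.html, which any static host (including GitHub Pages) can serve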

Related

How to completely restore a repository history that uses LFS?

I am very confused about how this all works, so I am going to ask a series of questions.
So I am almost at the end of my final degree project, and I have been using GitHub for version control. At some point, I had to store large files (>100 MB) and got this message:
My first question is: what actually happens if I click "commit anyway"? Does it mean that I can't commit any more?
Anyway, I did some research about LFS and eventually installed it in my repo (by the way, this is a Unity project). I followed this video: https://www.youtube.com/watch?v=09McJ2NL7YM&t=615s. The author suggests using this custom .gitattributes: https://gist.github.com/nemotoo/b8a1c3a0f1225bb9231979f389fd4f3. It automatically tracks all files with certain extensions and pushes them to LFS. At the time I thought this was cool, until I realised that this file made me push all tracked files to LFS, no matter how big or small they are. What I should have done is use Bash to push to LFS only when I got the message above (so only for files >100 MB). Since my whole degree depends on this project, I did not want to mess with GitHub and spend time trying to "fix it", but I want to know whether there is any way to restore the whole history and make it as if I had never used LFS.
Lastly, since I have loads of files stored in LFS, I understand that whenever someone re-clones my repo, they can just do git lfs fetch --all and then git lfs pull (and this should use up my bandwidth, right?). But what happens if someone decides to just download the project ("Download ZIP")? Well, I have tried it, and all those files are missing completely. Is there a way to download the project with the original files instead of pointers?
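For reference, the usual sequence someone would run to get the real file contents after cloning looks roughly like this (the repo URL is a placeholder); the ZIP download has no equivalent step, which is why it only contains pointer files unless "Include Git LFS objects in archives" is enabled:

    # sketch: cloning an LFS-backed repo and materialising every LFS object
    git lfs install                                   # one-time setup on the new machine
    git clone https://github.com/<user>/<repo>.git    # placeholder URL
    cd <repo>
    git lfs fetch --all    # downloads all LFS objects (this is what consumes your bandwidth quota)
    git lfs pull           # replaces pointer files in the working tree with the real content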
Also, if you exceed the free 1 GB that GitHub provides and stop paying for additional data packs, do you lose all those files?
At some point in the future, I would like to remove LFS or, if I have to, use it only for the files >100 MB (I think there are just 2 in total). But would that still mean that the only way to get a complete version of the project is to clone the repo rather than download it?
Sorry for the long question but I really need to understand these things.
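Nothing in this thread confirms it, but the standard tool for the "undo LFS across the whole history" part is git lfs migrate export, which rewrites commits so that LFS pointers become ordinary Git blobs again. A hedged sketch (the include pattern and remote name are assumptions, and this rewrites history, so work on a backup clone):

    # rewrite every ref so files currently stored in LFS become normal Git objects
    git lfs migrate export --everything --include="*"
    # the rewritten commits have new hashes, so the remote must be force-pushed
    git push --force --all origin
    git push --force --tags origin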

Why doesn't GitHub Pages update immediately after you have pushed a binary file in a commit?

I see that when I push a text file to GitHub it is reflected immediately, but when I push a binary file it takes a minute or so to show up. I am using GitHub's API to push my changes. Are there any official docs that explain this?
Any help will be appreciated.
I tried searching and found suggestions that GitHub is not very good with very large files, but didn't find anything that explains this behaviour.
This probably has nothing to do with text files vs. binary files.
GitHub Pages sites are built using background jobs. Depending on how busy the Pages servers are, how many jobs are ahead of yours, and so on, you may see a build happen very quickly, or you may have to wait a while.
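One way to watch this for yourself is to poll the latest Pages build through the REST API. A rough sketch with curl (owner, repo, and token are placeholders):

    # ask for the most recent GitHub Pages build and its status
    curl -H "Accept: application/vnd.github+json" \
         -H "Authorization: Bearer <token>" \
         https://api.github.com/repos/<owner>/<repo>/pages/builds/latest
    # the JSON response includes a "status" field such as "building", "built", or "errored"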

Can you hide certain file types on GitHub's Pull Request page?

The current project that I am working on uses Jest snapshots, and as it is still early days, the snapshots are constantly changing. These new snapshots are filling up my PRs, and when reviewing diffs I need to either scroll past them (they tend to be long files) or go through the page first and minimize them all manually.
Is there a feature in GitHub or a Chrome extension that would allow me to automatically filter these files out? Even just minimize them?
I've tried Pretty Pull Requests, but I can't seem to get it to recognize the .snap files.
Thanks!
Would the simple expand/collapse function in the Files Changed view of GitHub not work? The issue is explained here --> Collapse / Expand files in the Files Changed view of a Pull Request
Otherwise, the refined-github extension may help.
Edit, per @robdonn's comment: the collapse/expand function has since been removed from refined-github (it was previously part of refined-github).
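One more option, not mentioned in the thread (so treat it as my own suggestion): GitHub collapses files it considers generated in the Files Changed view, and you can mark the snapshots as generated yourself via .gitattributes:

    # tell GitHub to treat Jest snapshots as generated, so their diffs are collapsed by default
    echo "*.snap linguist-generated=true" >> .gitattributes
    git add .gitattributes
    git commit -m "Collapse Jest snapshots in pull request diffs"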

source code for all of my files has changed

I have created a total of 9 sample pages for my project. I tagged three of them as templates to speed up the design process. I successfully exported all nine files, but upon returning to my account I discovered that all nine files had somehow reverted to the same source code.
In other words, instead of 9 distinctly different pages, I have 9 copies of the same page. How could this happen, and how can I fix it? I need to continue my work, but with no way to correct this problem or upload the correct files from my previous export, I feel as if I need to either start over or do all of my editing manually in an HTML editor going forward.
Any guidance would be much appreciated!
Make backups next time. I don't think you will be able to retrieve your old files if you didn't back them up, short of uploading your previous files again.
Please get in touch with support if you lost any data and we'll work with you to get it restored right away. Bugs like these are always high priority and we want to make sure your data is never lost.
In the new version of Divshot, saving is now manual, and with version history you have the option to revert to a previous version of a file at any time. All data is backed up on Amazon S3 with versioning.
After upgrading let me know if you continue to experience issues with cloning pages or using templates. We'll be happy to help!

Should I keep my site media in my website's repository?

I have a simple blog application written in Python, using Django. I use Git to version-control this website. The main content of the site is a blog. The blog entries are stored in a SQLite database (which is not version-controlled, but is backed up regularly); some entries contain images and other media (like PDFs).
I currently store this "blog media" in the repository, alongside other media (such as external JavaScript code, and images used for layout purposes -- all nicely organized, of course). It occurred to me, however, that this isn't really a good strategy, for a few reasons:
Whenever I post a new blog entry that contains an image or a link to a PDF, I have to add the image to the repo and then copy a new version to the production server -- which seems like a lot of work just to add an image. It'd be easier just to upload the image to the server (and make a local backup, of course).
Since this media is content rather than code, it doesn't seem necessary to store it alongside the code (and related style media) itself.
The repo contains a lot of binary files, which increase the overall size of the repo; and more importantly,
I never really edit these images, so why keep them under version-control?
So I'm considering removing these files from the repo, and just copying them to a directory on the server outside of the directory containing the Python code, templates, style sheets, etc., for the website.
However, I wondered: Is there a "best practice" for dealing with content images and other media in a website's repo, as opposed to images, etc., that are actually used as part of the site's layout and functionality?
Edit
To elaborate, I see a difference between keeping the code for the website in the repo, and also keeping the content of the site in the repo -- I feel that perhaps the content should be stored separately from the code that actually provides the functionality of the site (especially since the content may change more frequently, and I don't see a need to create new commits for "stuff" that isn't necessary for the functioning of the site itself).
Keep them in version control. If they never change, you don't pay a penalty for it. If they do change, well, then it turns out you needed the version control after all.
Initially, I would say don't put them in the repo because they'll never change, but then consider the situation of moving your website to a different server or hosting provider. You'd need an easy way to deploy it, and if it's not under version control, that's a lot of copy/paste that could go wrong. At least it's all in one place if/when something happens.
This isn't really an answer so much as something to consider.
Version them. Why not? I version the PSDs and everything. But if that makes you wince, I can understand. You should version the JavaScript and stylesheets, though; that stuff is code (of sorts).
Now, if by content you mean "the image I uploaded for a blog post" or "a PDF file I'm using in a comment", then I'd say no: don't version it. That kind of content is accounted for in the database or somewhere else. But the logo image, the sprites, and the stuff that makes up the look and feel of the site should absolutely be versioned.
I'll give you one more touchy-feely reason if you aren't convinced. Some day you'll wish you could go into your history and see what your site looked like 5 years ago. If you versioned your look & feel stuff, you'll be able to do it.
You are completely correct on two points.
You are using Version Control for your code.
You are backing up your live content database.
You have come to the correct conclusion that the "content images" are just that and have no business in your code's Version Control.
Back up your content images along with your database. You do not want to blur the lines between the two unless you want your "code" to be nothing more than your own blog site.
What if you wanted to start a completely different blog, or your friends all wanted one? You wouldn't be giving them a copy of your database with all your content, nor would it be any use for them to have a copy with all your content images.
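A minimal sketch of what "back up the content images along with your database" could look like in practice, assuming made-up paths and a made-up backup host:

    # snapshot the SQLite database, then ship it and the uploaded media off the server
    sqlite3 /srv/blog/db.sqlite3 ".backup '/tmp/blog-db.sqlite3'"
    rsync -az /tmp/blog-db.sqlite3 /srv/blog/media/ backup-host:/backups/blog/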
Most version control systems don't work well with binary files. That being said, if they're not changing, it makes little difference.
You just have to decide which is easier: keeping them in the repository, with the multi-step process of adding an image/PDF/whatever, or maintaining a separate set of actions for them (including backups). Personally, I'd keep them in version control. If you're not changing them, it's not harming anything. Why worry about something that isn't causing harm?
I think you need to ask yourself why you are using version control and why you are making backups. Probably because you want to safeguard yourself against loss or damage to your files, so that if something terrible happens you can fall back on your backups.
If you use version control and a separate backup system, you get into the problem of distribution, because the latest version of your site lives in different places. If something does go wrong, how much effort is it going to take you to restore things? To me, spreading things across version control and separate backups sounds like a lot of manual work that's not easily scriptable. What's more, when something does go wrong you're probably already stressed out anyway; making the restoration process harder will not help you much.
The way I see it, putting your static files in version control doesn't do any harm. You have to put them somewhere anyway, be it a version control repository or a normal file system. Since your static files never change, they're not taking up more space over time, so what's the problem? I recommend you just place all of it under version control and make it easy on yourself. Personally, I would back up my database at regular intervals and commit that backup to version control as well. This way you have everything in one place, and in the case of disaster you can easily do a fresh checkout/export to restore your site.
I've built this website. It has over a gigabyte of PDF files, and everything is stored under version control. If the server dies, all I have to do is a clean export and a re-import of the database, and the site is up and running again.
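A rough sketch of that backup-and-restore loop, with assumed paths and an assumed repo URL:

    # back up: dump the database as text and commit it alongside the site's code
    mkdir -p backups
    sqlite3 db.sqlite3 .dump > backups/blog.sql
    git add backups/blog.sql
    git commit -m "Database dump"

    # restore on a fresh server: clone (or export) the repo and rebuild the database
    git clone https://example.com/my-blog.git site
    cd site
    sqlite3 db.sqlite3 < backups/blog.sql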
If you are working on a web project, I would recommend creating a virtual directory for your media. For example, we set up a virtual directory in our local working copies of IIS for /images/, /assets/, etc., which points to the development/staging server that the customer has access to.
This speeds up source control (especially when using something clunky like Visual SourceSafe), and if the customer changes something during testing, it is automatically reflected in our local working copy.