APIs/Services to Generate Thumbnails of Document - thumbnails

We have a website that allows users to upload documents (Word, PDF, PPT, etc.).
We upload the files to Amazon S3, so every file has its own web URL.
For these uploaded documents, we would like to generate thumbnails. Each thumbnail needs to be generated from the document's content (like Google document viewer).
Is there any service or API that generates a thumbnail of a document from its URL?
Thanks and Regards,
Ashish Shukla

You could roll your own solution. I'm evaluating 2JPEG and it appears to support 275 formats including Word, Excel, Publisher & Powerpoint files. fCoder recommends running 2JPEG as a scheduled background task. The command line syntax is pretty comprehensive. I don't think it has the ability to process remote AWS files, but you could download the file, keep it locally temporarily, generate the thumbnail and then delete the local source file.
Here's a sample snippet to generate a thumbnail for a specific file:
2jpeg.exe -src "c:\files\myfile.docx" -dst "c:\files" -oper Resize size:"100 200" fmode:fit_width -options pages:"1" scansf:no overwrite:yes template:"{Title}_thumb.jpg" silent:yes
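The download, convert, delete workflow described above could be scripted roughly like this. It is only a sketch: the function names are made up for illustration, the flags are copied from the command above rather than from the 2JPEG documentation, and I have not verified that 2JPEG accepts the quoting that subprocess produces for arguments containing spaces.

```python
import os
import subprocess
import tempfile
import urllib.parse
import urllib.request


def build_2jpeg_command(src_path, dst_dir, exe="2jpeg.exe"):
    """Return the argument list for the 2JPEG call shown above (hypothetical helper)."""
    return [
        exe,
        "-src", src_path,
        "-dst", dst_dir,
        "-oper", "Resize", "size:100 200", "fmode:fit_width",
        "-options", "pages:1", "scansf:no", "overwrite:yes",
        "template:{Title}_thumb.jpg", "silent:yes",
    ]


def thumbnail_from_url(url, dst_dir, exe="2jpeg.exe"):
    """Download a document, run 2JPEG on the local copy, then drop the copy."""
    # Keep the original file name for the local copy.
    name = os.path.basename(urllib.parse.urlparse(url).path) or "document"
    with tempfile.TemporaryDirectory() as tmp:
        local_path = os.path.join(tmp, name)
        urllib.request.urlretrieve(url, local_path)
        # check=True raises CalledProcessError if 2JPEG exits with an error code.
        subprocess.run(build_2jpeg_command(local_path, dst_dir, exe), check=True)
    # Leaving the with-block deletes the temporary directory and the source copy.
```

Something like thumbnail_from_url(file_url, r"c:\thumbs") would then be called from the scheduled background task, once per newly uploaded file.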

You should also take a look at AWS Lambda. In fact, this presentation from the AWS re:Invent 2014 conference shows a live example of using Lambda to generate thumbnail images. This solution will be very reliable and very cost-effective, but it has the downside that you'll be responsible for maintaining the code and debugging issues.
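To give an idea of what the Lambda side might look like, here is a minimal sketch of a handler that reads an S3 upload notification and works out where each thumbnail would go. It is not the code from the presentation. The "thumbnails/" prefix and the helper names are my own assumptions, and the actual rendering and upload are left out because they depend on the document type and the libraries you bundle.

```python
import posixpath
from urllib.parse import unquote_plus


def uploaded_objects(event):
    """Yield (bucket, key) for every object in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 event notifications.
        yield s3["bucket"]["name"], unquote_plus(s3["object"]["key"])


def thumbnail_key(key, prefix="thumbnails/"):
    """Map a document key to a thumbnail key (naming scheme is an assumption)."""
    base, _ext = posixpath.splitext(key)
    return f"{prefix}{base}.jpg"


def handler(event, context):
    """Lambda entry point: plan one thumbnail per uploaded document."""
    planned = []
    for bucket, key in uploaded_objects(event):
        if key.startswith("thumbnails/"):
            # Skip our own output so the function does not trigger itself.
            continue
        planned.append(
            {"bucket": bucket, "source": key, "thumbnail": thumbnail_key(key)}
        )
        # Rendering the first page and uploading the result would go here.
    return planned
```

The prefix check matters if the thumbnails are written to the same bucket that triggers the function; writing them to a separate bucket avoids the problem altogether.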

Related

Load PDFs from local file share

I'm developing a web app for in-house use and I'm looking for a better way to display PDFs.
I've played around with Adobe's 'Work with Local File' example from GitHub (Adobe GitHub Example), and it works great using the file picker to display a PDF. Is it possible with Adobe's PDF Embed API to take a file located on a local file share and display the PDF?
I'm thinking I need to create a file promise but I'm not sure how to create that.
Unless you can make a network request to load the PDF, the answer is no. Browsers generally can't read from local files unless a user action actually picks the file. If your local share can be made accessible via HTTP, then you would be good to go.
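One way to make the share reachable over HTTP is a small static file server running on a machine that can see it. The following is a minimal sketch using only Python's standard library. The wide-open CORS header is my assumption for the case where the web app lives on a different origin than the file server, and it is only reasonable on an in-house network.

```python
import functools
import http.server


class CORSRequestHandler(http.server.SimpleHTTPRequestHandler):
    """Static file handler that also allows cross-origin requests."""

    def end_headers(self):
        # Assumption: the web app is served from another origin, so the
        # browser needs this header before it hands the PDF bytes to the page.
        self.send_header("Access-Control-Allow-Origin", "*")
        super().end_headers()


def make_server(directory, host="127.0.0.1", port=0):
    """Create (but do not start) a server for the files under `directory`."""
    handler = functools.partial(CORSRequestHandler, directory=directory)
    # Port 0 lets the operating system pick a free port.
    return http.server.ThreadingHTTPServer((host, port), handler)
```

Calling make_server(path_to_share, host="0.0.0.0", port=8000).serve_forever() would expose the files at http://that-machine:8000/, and the viewer can then be pointed at those URLs instead of at a file picker. For anything beyond a quick test, a real web server (IIS, nginx, Apache) pointed at the share is the more robust choice.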

Providing EXIF-free images in a gallery or other webpage

First, thanks for any and all help regarding this topic.
Sites like Facebook and Twitter strip EXIF information from images as they are uploaded. My goal is to allow users to upload images to our platform (working with Nextcloud and others) with full EXIF information; however, we need to display images that do not contain EXIF information or any other metadata. Without stripping the metadata and creating a second, EXIF-free image for each upload, is it possible to simply hide that EXIF info so that, if a user downloads the image, the EXIF is not embedded?
We were told that the only way to do this is to have a second, EXIF-free copy (when it is created, whether before, during, or after upload, is irrelevant). I'm hoping there's a way that we can simply display such a copy without doubling our physical space requirements.
Thanks again for your help.
Exif is metadata, along with IPTC, XMP, AFCP, ICC, FPXR, MPF, JPS and a comment, just for the JFIF/JPEG file format alone. Other picture file formats support even more/other metadata.
You wrote it yourself: a download, so it's a file in any case. Pictures are files, just like executables, movies, texts, music and archives are files, too. Metadata is part of a file's content, so whoever accesses the raw bytes of the file can grab everything in it; a file cannot be made "please don't look" proof. Whether you create the stripped copy on the fly, removing metadata every time a download is requested, or create it once, trading storage space for performance, remains your decision.
If there were something as simple as a "don't show" feature, the metadata would still be in the file and could be extracted easily by software written to ignore that instruction. Seriously, there's no shortcut here: do it properly, and don't try to save effort at the wrong end.
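For the on-the-fly variant, stripping does not require re-encoding the picture. Here is a minimal sketch for JPEG only that copies the file segment by segment and leaves out APP1 (Exif and XMP), APP13 (IPTC) and comment segments. The function name and the choice of which segments to drop are mine. It deliberately keeps APP2 because ICC color profiles live there, and it does not handle fill bytes or other unusual layouts, so treat it as an illustration rather than a hardened implementation.

```python
import struct

# Segments dropped: APP1 (Exif/XMP), APP13 (IPTC/Photoshop), COM (comment).
DROP_MARKERS = {0xE1, 0xED, 0xFE}
SOI = b"\xff\xd8"  # start-of-image marker
SOS = 0xDA         # start-of-scan marker; compressed image data follows


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte string without the DROP_MARKERS segments."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("unexpected byte where a segment marker should be")
        marker = data[pos + 1]
        if marker == SOS:
            # Everything from here to the end is image data; copy it verbatim.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in DROP_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan segment found")
```

A download endpoint could read the stored original, pass the bytes through strip_jpeg_metadata, and send the result, so only one copy is kept on disk. Whether that CPU cost per request is cheaper than the storage for a second copy is exactly the trade-off described above.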

Embed large or non-standard files in github issues

We use Github's issue tracking for a lot of project management, both code-related and not. For simple files like images, we can share them with Github's CDN via drag and drop into an issue or comment. However, this has limitations:
Github imposes a file type restriction: they will only allow GIF, JPEG, JPG, PNG, DOCX, GZ, LOG, PDF, PPTX, TXT, XLSX or ZIP.
Files larger than 25 MB or images larger than 10 MB are not supported.
While URLs are anonymized with Camo (https://docs.github.com/en/free-pro-team@latest/github/authenticating-to-github/about-anonymized-image-urls), files are not actually securely stored or password protected. This is really problematic when the files shared have a lot of sensitive data in them.
Is there a plugin or simple solution that would let us securely attach large or non-standard file types, while maintaining the nice UI of github issues? We'd be ok using a 3rd-party storage system (like Drive/Dropbox/Sharepoint/AWS), but forcing users to upload something then copy/paste a link over into the issue isn't ideal.
There's no way for you to embed other file types in an issue without using a standard Markdown link that gets rendered through Camo. That's because GitHub has a strict Content-Security-Policy that prevents files from other domains from even loading. That's intentional, since it prevents people from trying to embed tracking content or content that changes based on the user (e.g., ads).
Even if you could use some way to embed the files in the page, your browser probably wouldn't render them due to the Content-Security-Policy.

Which files to download for all wikipedia images

I want to download all the Chinese Wikipedia data (text + images). I downloaded the articles, but I got confused by the media files. The remote-media files are ridiculously huge; what are they? Do I have to download them?
From: http://ftpmirror.your.org/pub/wikimedia/imagedumps/tarballs/fulls/20121104/
zhwiki-20121104-local-media-1.tar 4.1G
zhwiki-20121104-remote-media-1.tar 69.9G
zhwiki-20121104-remote-media-2.tar 71.1G
zhwiki-20121104-remote-media-3.tar 69.3G
zhwiki-20121104-remote-media-4.tar 48.9G
Thanks!
I'd assume that they are the media files included from Wikimedia Commons, which are most of the images in the articles. From https://wikitech.wikimedia.org/wiki/Dumps/media:
For each wiki, we dump the image, imagelinks and redirects tables via /backups/imageinfo/wmfgetremoteimages.py. Files are written to /data/xmldatadumps/public/other/imageinfo/ on dataset2.
From the above we then generate the list of all remotely stored (i.e. on commons) media per wiki, using different args to the same script.
And it's not that huge for all files from the Chinese Wikipedia :-)

How to attach file to a GitHub issue?

I migrated a project from Bitbucket to GitHub and I cannot find a way to attach a file to an issue (e.g. screenshots, specs, etc.).
How can I do it?
You upload it somewhere and add the link in a comment. GitHub's Issues is rather primitive and doesn't allow attaching files.
Update: You can post images to GitHub issues now. The easiest way is to copy the image (right click, Copy image) and then paste it into the text box where you describe the issue.
OR
Just drag and drop
As of December 7, 2012, you can attach images by drag/drop or use a file chooser. See https://github.com/blog/1347-issue-attachments for more details.
To attach a file to an issue or pull request conversation, drag and drop it into the comment box.
The maximum size for files is 25MB and the maximum size for images is 10MB.
ZenHub.io Chrome plug-in will enable you to add any type of file to a github issue. It's stored on ZenHub's AWS server instead of github.com. From their website...
GitHub only allows you to upload image files. ZenHub adds the ability
to upload any type of file into issues and comments, transferring
securely to Amazon S3. With this you can really take your workflow to
the next level; try using GitHub for everything! Centralized
collaboration and transparency are awesome.
Update:
As of 11/03/2015 you can now upload these types of files to github without any extension or plug-in: PNG, GIF, JPG, DOCX, PPTX, XLSX, TXT, or PDF
As an illustration of the previous answers, see this comment:
I create a repository called catfood http://github.com/blueheadpublishing/catfood/ where I keep misc stuff (like screenshots and other attachments).
That way I can reference them in issues.
See https://github.com/blueheadpublishing/bookshop/issues/10
Some images showing the types of layout templates we want to have generated by templates:
Example One - Three Percentage Columns
Example Two - Two Percentage Columns Left
Example Three - Two Percentage Columns Right
Back in 2009, GitHub expressed the intent to add attachment to issues.
Attachments are something we'd like to add.
That topic hasn't been raised since in the GitHub group, though...
The format for embedding images into a GitHub comment is:
Format: ![Alt Text](url)
Example: ![GitHub Logo](/images/logo.png)
Use gist.github.com to upload any contents like code, log, html files etc. and share the link.
It's a bit of a kludge but you could create a junk branch, then commit the file to that branch and purge it later.
EDIT: This script may be of use to you:
https://github.com/wereHamster/ghup
I found an easy way to embed images in issues using Skitch. Just set up Skitch sharing and auto-copy the URL to the clipboard. Then paste it in when writing up the issue. I blogged about it here.
One quick/easy hack is to upload your attachment (say PDF or Office doc) to Dropbox, then include the Dropbox URL in the Github issue.
Mildly easier than using S3; many organizations are already using Dropbox; and Dropbox has good support for viewing many documents inline in the browser already.
8 years later (Dec. 2020), you can not only drag and drop images to PR/issues, but also... videos!
And in May 2021, this is now generally available.
Video upload public beta
You can now upload .mp4 and .mov files to issue, pull request, and discussion comments to share reproduction steps, design ideas, and experience details with your team.
The public beta will gradually rollout to all GitHub accounts over the coming week.
OK, here's what I use for screenshots.
http://www.techsmith.com/jing.html
It's free, fast, automatically uploads the image and pastes a URL link to your clipboard which you can Ctrl-V into the GitHub issue instantly.
It was a big sigh of relief when I discovered this :)
If your image is already uploaded to github, then you can attach raw link to issues. For example, if your image's location in github is:
https://github.com/Qlio/someproj/blob/master/assets/image.png
then you can change blob to raw like this:
https://github.com/Qlio/someproj/raw/master/assets/image.png
and then you can use this link to show image:
![My cool Image](https://github.com/Qlio/someproj/raw/master/assets/image.png)
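If you need to do this for many files, the blob-to-raw rewrite and the Markdown wrapping are easy to script. A minimal sketch (the function names are made up for illustration):

```python
def blob_to_raw(url: str) -> str:
    """Turn a github.com "blob" page URL into the matching "raw" file URL."""
    # Replace only the first occurrence, in case the file path itself
    # contains a directory named "blob".
    return url.replace("/blob/", "/raw/", 1)


def markdown_image(alt_text: str, url: str) -> str:
    """Build the Markdown image syntax used in GitHub comments."""
    return f"![{alt_text}]({url})"


if __name__ == "__main__":
    page = "https://github.com/Qlio/someproj/blob/master/assets/image.png"
    print(markdown_image("My cool Image", blob_to_raw(page)))
```

Run as a script, this prints the same line as the example above.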