How to embed a custom URL path to a file in Google Sites - google-sites-2016

I have a Google Sites website with a custom domain, let's say www.mysite.net. I want to add a dataset file (let's say file.csv) to the site so that it can be downloaded from a link like www.mysite.net/datasets/file.csv, but I do not know how to do that. I can insert a file from Drive into the site, but I want it to be downloadable from a custom URL like the one above. How can this be done? Can it be done from within Google Sites itself, or do I have to do something special?
Thanks!

Related

How do I rewrite a URL to drop the file extension for a PDF on GitHub Pages?

Imagine my website is hosted on GitHub Pages and has a custom domain website.com. I can access a PDF at website.com/mypdf.pdf
Is there a way to make it work at website.com/mypdf?
As mentioned in the comments, if you are using a static website hosted by a third party like GitHub Pages, you don't really get much control over the HTTP server. I would tentatively say you cannot control URL rewrite rules on GitHub.
What you could do instead is host a page with a bit of JavaScript that starts the download on a given event (button click, page load, etc.). That way you mask the actual download URL behind an HTML page, which by convention is served without a file extension (sketched below).
UPD: and sure enough, someone has already done this: http://lea.verou.me/2016/11/url-rewriting-with-github-pages/. The post is mostly about having nice URLs, but I believe file downloads could be implemented in a similar way.
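A minimal sketch of that masking page's script, assuming the extension-less page (e.g. the one served at website.com/mypdf) is an HTML page that loads this script, and using the /mypdf.pdf path from the question (TypeScript, compiled and included by the page):

// Runs in the browser once the masking page has loaded.
window.addEventListener("load", () => {
  // Create a temporary anchor that points at the real file on GitHub Pages.
  const link = document.createElement("a");
  link.href = "/mypdf.pdf";     // the actual file, same origin as the page
  link.download = "mypdf.pdf";  // suggested filename for the save dialog
  document.body.appendChild(link);
  link.click();                 // programmatically trigger the download
  link.remove();
});

The visitor only ever sees website.com/mypdf in the address bar; the .pdf URL is used just for the transfer.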
Yes, you could build your website with an MVC structure: make a controller and load the PDF file in its Index action. Then, when the action is called, your PDF will be served from a route like:
Students/AllResult etc.
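The answer above seems to assume a server-side MVC framework (the Students/AllResult route looks like ASP.NET MVC), which only applies if you control the server rather than using GitHub Pages. As a rough sketch of the same idea on a hypothetical Node/Express server (my substitution, not what the answer names), a controller-style route can return the PDF from an extension-less URL:

import express from "express";
import path from "path";

const app = express();

// GET /mypdf -- no file extension in the URL; the handler streams the PDF.
app.get("/mypdf", (_req, res) => {
  res.type("application/pdf");
  res.sendFile(path.join(process.cwd(), "files", "mypdf.pdf")); // hypothetical file location
});

app.listen(3000);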

Site results redirect to Google

A site I'm working on has been hacked. The CMS (which I didn't build) was accessed and some files (e.g. "km2jk4.php.jpg") were uploaded in image fields. I have since deleted them (a week ago). Now, when I search for the site on Google, then click the result, it either:
a) simply redirects me to the Google search page,
OR
b) shows a download dialogue asking me to download a zip file, with the source domain something like gb.celebritytravelnetwork.com.
Clearly the site's been compromised. But if I simply type the URL in the address bar, the site loads fine. This only happens when I click through Google results.
There is no .htaccess file on the server, and this is not a virus on my own computer, since many other people have reported the same thing happening, so that previously suggested question is not relevant here.
Any ideas please?
Thanks.
Your source files have been changed.
Check all the files included by the index page; the injected code may be in the header or footer includes.
Also try using Fetch as Google to see the pages as Googlebot sees them.

Different index files for different directories on Google Cloud Storage, possible?

Problem: I plan to have my Jekyll-generated static site served from Google Cloud Storage, but I need to serve feeds from example.com/feed/ for backwards compatibility with WordPress.
Possible solution: say the static feed file (index.xml) is located at example.com/feed/index.xml. If it were possible to set a different index file for that directory (apart from the bucket-wide setting, e.g. index.html), then people would be able to access my feed at example.com/feed/.
But is this possible? If not, is there an alternative I'd be missing?
You could potentially create an object in the example.com bucket with the name /feed/. That's a bit awkward to think about, and because of the way gsutil works you'd have to do it via the API manually, but it would allow you to serve a feed from example.com/feed/.
Alternatively, you could simply name your XML content /feed/index.html. If all of your users are indeed visiting example.com/feed/, then being able to name the file index.xml is not really important. The only special thing you'd need to do is make sure you set the right content type on the /feed/index.html object.
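For instance, with the Node.js client for Cloud Storage (a sketch assuming the bucket is named example.com as in the question and that Jekyll writes its output to _site/), the generated feed could be uploaded as feed/index.html with an RSS content type; writing an object literally named feed/ as in the first suggestion would look the same, just with a different object name:

import { Storage } from "@google-cloud/storage";
import { readFileSync } from "fs";

const storage = new Storage();
const bucket = storage.bucket("example.com"); // bucket named after the domain

async function uploadFeed(): Promise<void> {
  // Jekyll writes the feed to _site/feed/index.xml; store it as feed/index.html
  // so the bucket serves it for example.com/feed/, but keep an RSS content type
  // so feed readers still treat it as XML.
  const xml = readFileSync("_site/feed/index.xml");
  await bucket.file("feed/index.html").save(xml, {
    metadata: { contentType: "application/rss+xml" },
  });
}

uploadFeed().catch(console.error);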
Another thing to keep in mind here is that the feed itself should be linked from your main index page with a link like this:
<link rel="alternate" type="application/rss+xml" title="My Awesome Feed" href="http://example.com/path/to/feed.xml" />
That gives you the ability to name your feed sanely, and your users can point their feed readers at http://example.com/ directly -- the reader should be able to follow the link to the feed itself. That won't help you if you have an established readership that expects a WordPress-style feed, but you could steer new people in the right direction and deprecate the weird WordPress-style URL after a while.

Facebook won't render link image on wall post

I need a way for Facebook to render an image preview in wall posts. Sometimes all you need to do is copy and paste the link to the image and the preview shows up.
But with the images I need to link to, which are hosted on S3, no preview is rendered.
The second link is in fact a valid image address (if you open it you should see a creepy hand-drawn smiley face).
I did this test just using the normal Facebook GUI on the site, but I will be using the answer to this question in my app, which integrates with Facebook via the Open Graph API. (In case anyone thought this question was not programming related).
Anyone know what I need to do to get the second image to render in the post?
The key is image size. Don't ask me what the magic number is, but larger images work and smaller ones don't. It could be that it has to be larger than the preview size that Facebook uses.
In my test, I had static website configuration turned off. My guess is that only static links will generate a thumbnail for posts on Facebook, and the link you are trying to use is not a static one.
This is what Facebook gets when it tries to generate a thumbnail:
<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>D577337ADC9FA36A</RequestId>
<HostId>
DN9BnBduVLgHbf2lONA+e/fXQIOuT7W3WOFUPdthdpP2MZQhSLlolTvyJ0t9eZXn
</HostId>
</Error>
Solution: it turns out that to make this work, you cannot just map any arbitrary subdomain to any arbitrary bucket. The fully qualified subdomain name must be the same as the S3 bucket name.
1. Suppose the name of your site is static.mydomain.com. Then you need to create an S3 bucket with that same name, static.mydomain.com.
2. Once you configure that bucket as an S3 static website, it will have a URL assigned to it that looks something like http://static.mydomain.com.s3-website-us-east-1.amazonaws.com.
3. Go to your domain host and map your subdomain to the URL from step 2. In enom.com, that meant mapping the host "static" to the address "static.mydomain.com.s3-website-us-east-1.amazonaws.com" as a CNAME record.
Source: it will help you host a static site from your S3 account. Use image URLs from that custom domain path and they will work.
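The bucket-creation and website-hosting steps can also be scripted. A rough sketch with the AWS SDK for JavaScript v3, assuming the us-east-1 region and the static.mydomain.com name from the steps above (the CNAME in step 3 still has to be created at your DNS host):

import {
  S3Client,
  CreateBucketCommand,
  PutBucketWebsiteCommand,
} from "@aws-sdk/client-s3";

const bucketName = "static.mydomain.com"; // must exactly match the subdomain
const s3 = new S3Client({ region: "us-east-1" });

async function setUpBucket(): Promise<void> {
  // Step 1: create the bucket whose name matches the fully qualified subdomain.
  await s3.send(new CreateBucketCommand({ Bucket: bucketName }));

  // Step 2: enable static website hosting; the bucket then gets an endpoint like
  // http://static.mydomain.com.s3-website-us-east-1.amazonaws.com
  await s3.send(
    new PutBucketWebsiteCommand({
      Bucket: bucketName,
      WebsiteConfiguration: {
        IndexDocument: { Suffix: "index.html" },
        ErrorDocument: { Key: "error.html" },
      },
    })
  );
}

setUpBucket().catch(console.error);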

Can I use a `robots.txt` file for a subdirectory on my school's domain?

I own some webspace which is registered with a university. Google has unfortunately found my CV (resume) on the site, but has mis-indexed it as a scholarly publication, which is screwing up things like citation counts on Google Scholar. I tried uploading a robots.txt to my own subdirectory, but the problem is that Google ignores this file and instead uses the rules listed for the school's domain.
That is, the URL looks like
www.someschool.edu/~myusername/mycv.pdf
I have uploaded a robots.txt, which can be found here
www.someschool.edu/~myusername/robots.txt
And Google is ignoring it and instead using the robots.txt for the school's domain
www.someschool.edu/robots.txt
How can I make Googlebot ignore my CV?
Sadly, robots.txt is defined to be whatever you get when you GET /robots.txt, so you can't use it for your subdirectory.
What you can do is use the X-Robots-Tag HTTP header, if you can use custom .htaccess files. Here's Google's documentation on X-Robots-Tag.
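As a sketch, assuming the school's server is Apache with mod_headers enabled and that Header directives are allowed in per-user .htaccess files (none of which is guaranteed), a .htaccess placed next to the PDF could look like this:

# .htaccess in the same directory as mycv.pdf (i.e. under ~myusername/)
# Requires Apache mod_headers and permission to use Header in .htaccess.
<Files "mycv.pdf">
  Header set X-Robots-Tag "noindex, nofollow"
</Files>

Googlebot has to recrawl the PDF before the noindex takes effect, so it may take some time for the change to show up.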