Remove /index.html from url for static site - google-cloud-storage

I have a static site in a Google Cloud Storage bucket.
I rsync my site to the storage bucket with:
args: ["-m", "-h", "Content-Encoding:gzip", "rsync", "-c", "-r", "./folder", "gs://mysite.com"]
In the bucket's website configuration I have set:
/index.html
This results in:
mysite.com/category/index.html
I want to remove index.html from this URL, so in addition to the args above I tried the following as a second line:
args: ["-h", "Content-Type:text/html", "cp", "./folder/*/index.html", "gs://mysite.com/*"]
But this second set of args did not work.
How should I write the second args so that index.html is removed from the URL mysite.com/category/index.html?

The second args are probably working. The issue is that you are using cp, which copies files, so you are just uploading the index.html file again.
If you want to remove index.html, you have to use rm:
args: ["-h", "Content-Type:text/html", "rm", "gs://mysite.com/category/index.html"]

Related

How to set charset to UTF8 for text files with gsutil when uploading to Google Cloud Storage bucket?

We have a (public) Google Cloud Storage bucket that hosts a simple website, meaning both HTML and images.
Our build process uses Google Cloud Build; however, the question is not tied to Cloud Build but is specifically about how to use gsutil properly.
This is our current gsutil task:
# Upload it to the bucket
- name: gcr.io/cloud-builders/gsutil
  dir: "public/"
  args: [
    "-m", # run the copy in parallel
    "-h", "Cache-Control: public, max-age=0", # custom Cache-Control header
    "cp", # copy command
    "-r", # recursively
    ".", # source folder
    "gs://mybucket/" # the target bucket and folder
  ]
As you can see, this copies everything in the local public/ folder to the bucket and applies the Cache-Control header on all objects.
According to this:
https://cloud.google.com/storage/docs/gsutil/addlhelp/WorkingWithObjectMetadata
You can specify the content type with
-h "Content-Type:text/html; charset=utf-8"
However, this gives all objects (not only .html files, but also images, etc.) the content type text/html; charset=utf-8.
(I have even tried -h "Content-Type:; charset=utf-8", but then gsutil fails, saying it's an invalid content type value.)
Is there a way to tell gsutil to apply charset=utf-8 on all objects, without actually overwriting the main content type?
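One possible direction, sketched here rather than taken from the gsutil documentation, is to decide the Content-Type per file before uploading and to append charset=utf-8 only for text types. The helper name content_type_for and the application/octet-stream fallback are assumptions made for illustration; the type lookup itself uses Python's standard mimetypes module.

```python
import mimetypes


def content_type_for(path: str, charset: str = "utf-8") -> str:
    """Guess a Content-Type for `path`, adding a charset only to text/* types."""
    guessed, _encoding = mimetypes.guess_type(path)
    if guessed is None:
        # Assumption: unknown extensions fall back to a generic binary type.
        return "application/octet-stream"
    if guessed.startswith("text/"):
        return f"{guessed}; charset={charset}"
    return guessed
```

The returned value could then be passed to a separate upload step per file type (for example via -h "Content-Type:<value>"), so that images keep their own content type.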

FileSystemException when running a Dart Shelf Docker Container

I generated a dart project with dart create -t server-shelf . --force.
In the top-level folder I created a JSON file (my_data.json) with some mock data.
In my code I read the data from the JSON file like this:
final _data = json.decode(File('my_data.json').readAsStringSync()) as List<dynamic>;
But if I try to start my server with docker run -it -p 8080:8080 myserver, I get:
FileSystemException: Cannot open file, path = 'my_data.json' (OS Error: No such file or directory, errno = 2)
My Dockerfile:
# Use latest stable channel SDK.
FROM dart:stable AS build
# Resolve app dependencies.
WORKDIR /app
COPY pubspec.* ./
RUN dart pub get
# Copy app source code (except anything in .dockerignore) and AOT compile app.
COPY . .
RUN dart compile exe bin/server.dart -o bin/server
# Build minimal serving image from AOT-compiled `/server`
# and the pre-built AOT-runtime in the `/runtime/` directory of the base image.
FROM scratch
COPY --from=build /runtime/ /
COPY --from=build /app/bin/server /app/bin/
COPY my_data.json /app/my_data.json
# Start server.
EXPOSE 8080
CMD ["/app/bin/server"]
I think this happens because you didn't set the WORKDIR for the new image that you started building FROM scratch, so the relative path my_data.json is resolved against / instead of /app. You can fix this by adding WORKDIR /app again to the specification of the new image, the one used to run your application. It will look like this:
...
# Start server.
WORKDIR /app
EXPOSE 8080
CMD ["/app/bin/server"]
Replace
COPY my_data.json /app/my_data.json
with
COPY --from=build app/my_data.json app/
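To illustrate the WORKDIR explanation above, here is a small Python sketch (not Dart, and not part of the original answers) of how a relative path resolves against the working directory. The function name resolve_from is hypothetical; it only mimics what the operating system does when a process opens a relative path.

```python
import posixpath


def resolve_from(cwd: str, relative: str) -> str:
    """Return the absolute path `relative` maps to for a process running in `cwd`."""
    return posixpath.normpath(posixpath.join(cwd, relative))


# Without WORKDIR the final image starts in "/", so the file is looked up there.
print(resolve_from("/", "my_data.json"))     # /my_data.json
# With WORKDIR /app the same relative path points at the copied file.
print(resolve_from("/app", "my_data.json"))  # /app/my_data.json
```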

Is there a way to automatically create a container when starting Azurite?

For test purposes I build and run an Azurite Docker image in a test pipeline.
I would like the blob container to be created automatically after Azurite has started, as it would simplify things.
Is there any good way to achieve this?
For the Postgres image we use, we can specify an init.sql which is run on startup. If something similar is available for Azurite, that would be awesome.
You can use the following Dockerfile to install the azure-storage-blob Python package on the Alpine-based azurite image. The resulting image size is ~400MB, compared to ~1.2GB for the azure-cli image.
ARG AZURITE_VERSION="3.17.0"
FROM mcr.microsoft.com/azure-storage/azurite:${AZURITE_VERSION}
# Install azure-storage-blob python package
RUN apk update && \
apk --no-cache add py3-pip && \
apk add --virtual=build gcc libffi-dev musl-dev python3-dev && \
pip3 install --upgrade pip && \
pip3 install azure-storage-blob==12.12.0
# Copy init_azurite.py script
COPY ./init_azurite.py init_azurite.py
# Copy local blobs to azurite
COPY ./init_containers init_containers
# Run the blob emulator and initialize the blob containers
CMD python3 init_azurite.py --directory=init_containers & \
azurite-blob --blobHost 0.0.0.0 --blobPort 10000
The init_azurite.py script is a local Python script that uses the azure-storage-blob package to batch upload files and directories to the azurite blob storage emulator.
import argparse
import os
from time import sleep

from azure.core.exceptions import ResourceExistsError
from azure.storage.blob import BlobServiceClient, ContainerClient


def upload_file(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a single file to a path inside the container.
    """
    print(f"Uploading {source} to {dest}")
    with open(source, "rb") as data:
        try:
            container_client.upload_blob(name=dest, data=data)
        except ResourceExistsError:
            pass


def upload_dir(container_client: ContainerClient, source: str, dest: str) -> None:
    """
    Upload a directory to a path inside the container.
    """
    prefix = "" if dest == "" else dest + "/"
    prefix += os.path.basename(source) + "/"
    for root, dirs, files in os.walk(source):
        for name in files:
            dir_part = os.path.relpath(root, source)
            dir_part = "" if dir_part == "." else dir_part + "/"
            file_path = os.path.join(root, name)
            blob_path = prefix + dir_part + name
            upload_file(container_client, file_path, blob_path)


def init_containers(
    service_client: BlobServiceClient, containers_directory: str
) -> None:
    """
    Iterate on the containers directory and do the following:
    1- create the container.
    2- upload all folders and files to the container.
    """
    for container_name in os.listdir(containers_directory):
        container_path = os.path.join(containers_directory, container_name)
        if os.path.isdir(container_path):
            container_client = service_client.get_container_client(container_name)
            try:
                container_client.create_container()
            except ResourceExistsError:
                pass
            for blob in os.listdir(container_path):
                blob_path = os.path.join(container_path, blob)
                if os.path.isdir(blob_path):
                    upload_dir(container_client, blob_path, "")
                else:
                    upload_file(container_client, blob_path, blob)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Initialize azurite emulator containers."
    )
    parser.add_argument(
        "--directory",
        required=True,
        help="""
        Directory that contains subdirectories named after the
        containers that we should create. Each subdirectory will contain the files
        and directories of its container.
        """
    )
    args = parser.parse_args()
    # Connect to the localhost emulator (after 5 secs to make sure it's up).
    sleep(5)
    blob_service_client = BlobServiceClient(
        account_url="http://localhost:10000/devstoreaccount1",
        credential={
            "account_name": "devstoreaccount1",
            "account_key": (
                "Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq"
                "/K1SZFPTOtr/KBHBeksoGMGw=="
            )
        }
    )
    # Only initialize if not already initialized.
    if next(blob_service_client.list_containers(), None):
        print("Emulator already has containers, will skip initialization.")
    else:
        init_containers(blob_service_client, args.directory)
This script will be copied to the azurite container and will populate the initial blob containers every time the azurite container is started unless some containers were already persisted using docker volumes. In that case, nothing will happen.
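The script waits a fixed 5 seconds before connecting to the emulator. As a possible alternative, and only as a sketch, the wait could poll the blob port until it accepts TCP connections. The helper name wait_for_port and its default timeout and interval are assumptions for illustration and are not part of the original script.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); wait briefly and retry.
            time.sleep(interval)
    return False
```

In the script, a call such as wait_for_port("localhost", 10000) could stand in for sleep(5). Note that an open port only shows that the process is listening, not that every API is ready.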
Following is an example docker-compose.yml file:
azurite:
  build:
    context: ./
    dockerfile: Dockerfile
    args:
      AZURITE_VERSION: 3.17.0
  restart: on-failure
  ports:
    - 10000:10000
  volumes:
    - azurite-data:/opt/azurite
volumes:
  azurite-data:
Using such volumes will persist the emulator data until you destroy them (e.g. by using docker-compose down -v).
Finally, init_containers is a local directory that contains the containers and their folders/files. It will be copied to the azurite container when the image is built.
For example:
init_containers:
  container-name-1:
    dir-1:
      file.txt
      img.png
    dir-2:
      file.txt
  container-name-2:
    dir-1:
      file.txt
      img.png
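To check which blob names such a layout would produce without starting the emulator, a dry-run helper along the following lines may be useful. It is a sketch that mirrors the naming used by init_azurite.py (the path relative to the container directory, with / separators); the function name plan_uploads is hypothetical.

```python
import os


def plan_uploads(containers_directory: str) -> dict[str, list[str]]:
    """Return {container name: sorted blob names} without contacting any storage service."""
    plan: dict[str, list[str]] = {}
    for container_name in sorted(os.listdir(containers_directory)):
        container_path = os.path.join(containers_directory, container_name)
        if not os.path.isdir(container_path):
            continue  # only subdirectories become containers
        blobs = []
        for root, _dirs, files in os.walk(container_path):
            for name in files:
                rel = os.path.relpath(os.path.join(root, name), container_path)
                blobs.append(rel.replace(os.sep, "/"))
        plan[container_name] = sorted(blobs)
    return plan
```

For the layout above, container-name-1 would map to dir-1/file.txt, dir-1/img.png, and dir-2/file.txt.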
I've solved the issue by creating a custom docker image and executing azure-cli tools from a health check. There could certainly be better solutions, and I will update the accepted answer if someone posts a better solution.
In more detail
A solution to create the required data on startup is to run my own script. I chose to trigger the script from a health check I defined in docker-compose. What it does is use azure cli tools to create a container and then verify that it exists.
The script:
AZURE_STORAGE_CONNECTION_STRING="UseDevelopmentStorage=true"
export AZURE_STORAGE_CONNECTION_STRING
az storage container create -n images
az storage container show -n images
exit $?
However, the azurite image is based on alpine, which doesn't have apt, so installing azure cli was a bit tricky. So I did it the other way around, and based my image on mcr.microsoft.com/azure-cli:latest. With that done I installed Azurite like this:
RUN apk add npm
RUN npm install -g azurite --silent
All that's left is to actually run azurite, see the official azurite dockerfile for details.
It is possible to do this without azure-cli by using curl instead (and so without needing the azure-cli Docker image). However, getting the authentication header to work properly was a bit complicated, so using azure-cli was easier.

Getting HASH of individual files within folder uploaded to IPFS

When I upload a folder of .jpg files to IPFS, I get the HASH of that folder - which is cool.
But is each individual file in that folder also getting hashed?
And if so, how do I get the hash of each file?
I basically want to be able to upload a whole bunch of files - like 500 images - and do it all at once, or programmatically, and have the hash of each file be returned to me.
Any way to do this?
Yes! From the command line you get back the CIDs (the Content IDentifier, aka, IPFS hash) for each file added when you run ipfs add -r <path to directory>
$ ipfs add -r gifs
added QmfBAEYhJp9ZjGvv8utB3Yv8uuuxsDKjv9rurkHRsYU3ih gifs/martian-iron-man.gif
added QmRBHTH3p4W2xAzgLxvdh8VJvAmWBgchwCr9G98EprwetE gifs/needs-more-dogs.gif
added QmZbffnCcV598QxsUy7WphXCAMZJULZAzy94tuFZzbFcdK gifs/satisfied-with-your-care.gif
added QmTxnmk85ESr97j2xLNFeVZW2Kk9FquhdswofchF8iDGFg gifs/stone-of-triumph.gif
added QmcN71Qh56oSg2YXsEXuf8o6u5CrBXbyYYzgMyAkdkcxxK gifs/thanks-dog.gif
added QmTnuLaivKc1Aj8LBf2iWBHDXsmedip3zSPbQcGi6BFwTC gifs
The root CID for the directory is always the last item in the list.
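If the goal is to collect these hashes programmatically, one option is to capture the command's text output and parse it. The following sketch assumes the "added <CID> <path>" line format shown above; the function name parse_ipfs_add_output is hypothetical, and actually running ipfs (for example through subprocess) is left out.

```python
def parse_ipfs_add_output(output: str) -> dict[str, str]:
    """Map each added path to its CID, given the text printed by `ipfs add -r`."""
    cids: dict[str, str] = {}
    for line in output.splitlines():
        # Split into at most three parts so paths containing spaces stay intact.
        parts = line.strip().split(" ", 2)
        if len(parts) == 3 and parts[0] == "added":
            _added, cid, path = parts
            cids[path] = cid
    return cids
```

With the output above, the entry for gifs holds the root CID of the directory and the other entries hold the per-file CIDs.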
You can limit the output of that command to just include the CIDs using the --quiet flag
$ ipfs add -r gifs --quiet
QmfBAEYhJp9ZjGvv8utB3Yv8uuuxsDKjv9rurkHRsYU3ih
QmRBHTH3p4W2xAzgLxvdh8VJvAmWBgchwCr9G98EprwetE
QmZbffnCcV598QxsUy7WphXCAMZJULZAzy94tuFZzbFcdK
QmTxnmk85ESr97j2xLNFeVZW2Kk9FquhdswofchF8iDGFg
QmcN71Qh56oSg2YXsEXuf8o6u5CrBXbyYYzgMyAkdkcxxK
QmTnuLaivKc1Aj8LBf2iWBHDXsmedip3zSPbQcGi6BFwTC
Or, if you know the CID for a directory, you can list the files it contains and their individual CIDs with ipfs ls. Here I list the contents of the gifs dir from the previous example:
$ ipfs ls QmTnuLaivKc1Aj8LBf2iWBHDXsmedip3zSPbQcGi6BFwTC
QmfBAEYhJp9ZjGvv8utB3Yv8uuuxsDKjv9rurkHRsYU3ih 2252675 martian-iron-man.gif
QmRBHTH3p4W2xAzgLxvdh8VJvAmWBgchwCr9G98EprwetE 1233669 needs-more-dogs.gif
QmZbffnCcV598QxsUy7WphXCAMZJULZAzy94tuFZzbFcdK 1395067 satisfied-with-your-care.gif
QmTxnmk85ESr97j2xLNFeVZW2Kk9FquhdswofchF8iDGFg 1154617 stone-of-triumph.gif
QmcN71Qh56oSg2YXsEXuf8o6u5CrBXbyYYzgMyAkdkcxxK 2322454 thanks-dog.gif
You can do it programmatically with the core API in js-ipfs or go-ipfs. Here is an example of adding files from the local file system in Node.js using js-ipfs, from the docs for ipfs.addAll(files): https://github.com/ipfs/js-ipfs/blob/master/docs/core-api/FILES.md#importing-files-from-the-file-system
There is a super helpful video on how adding files to IPFS works over at https://www.youtube.com/watch?v=Z5zNPwMDYGg
And a walk through of js-ipfs here https://github.com/ipfs/js-ipfs/tree/master/examples/ipfs-101

make wget download a file directly to disk from bash

On a website, after logging in with my credentials, I am able to download data by changing the URL to variations of this:
https://data.somewhere.com/DataDownload/getfile.jsp?ccy=AUDUSD&df=BBO&year=2014&month=02&dllater=Download
This puts a zip file in my download directory.
If I try to automate it with wget using:
wget "https://data.somewhere.com/DataDownload/getfile.jsp?ccy=AUDUSD&df=BBO&year=2014&month=02&dllater=Download" --no-check-certificate --ignore-length
$ ~/dnloadHotSpot.sh
--2014-03-22 16:05:16-- https://data.somewhere.com/DataDownload/getfile.jsp?ccy=AUDUSD&df=BBO&year=2014&month=02&dllater=Download
Resolving data.somewhere.com (data.somewhere.com)... 209.191.250.173
Connecting to data.somewhere.com (data.somewhere.com)|209.191.250.173|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: ignored [text/html]
Saving to: `getfile.jsp#ccy=AUDUSD&df=BBO&year=2014&month=02&dllater=Download'
[ <=> ] 8,925 --.-K/s in 0.001s
2014-03-22 16:05:18 (14.4 MB/s) - `getfile.jsp#ccy=AUDUSD&df=BBO&year=2014&month=02&dllater=Download' saved [8925]
What else do I need to add to make wget actually download the file?
If you want to specify the name of the output file into which wget places the contents of the file it is downloading, use the capital -O option, something like:
wget -O myfilename ......
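When automating many such downloads, the -O argument and the URL can be generated per currency pair and month. The sketch below only builds the argument list; the function name build_wget_command and the <ccy>_<year>_<month>.zip naming scheme are assumptions for illustration, while the URL parameters and wget flags are the ones from the question.

```python
from urllib.parse import urlencode

BASE_URL = "https://data.somewhere.com/DataDownload/getfile.jsp"


def build_wget_command(ccy: str, year: int, month: int) -> list[str]:
    """Build a wget argument list that saves one month of data under an explicit name."""
    query = urlencode({
        "ccy": ccy,
        "df": "BBO",
        "year": year,
        "month": f"{month:02d}",
        "dllater": "Download",
    })
    output_name = f"{ccy}_{year}_{month:02d}.zip"  # hypothetical naming scheme
    return [
        "wget",
        "-O", output_name,
        "--no-check-certificate",
        "--ignore-length",
        f"{BASE_URL}?{query}",
    ]
```

The resulting list can be handed to subprocess.run(cmd, check=True). Passing the URL as a single list element also avoids shell quoting problems with the & characters.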