Fluentd daemonset alternative to S3 on Azure (Blob) - azure-devops

For log-intensive microservices, I was hoping to persist my logs into blobs and save them in azure blob storage (s3 alternative). However, I noticed that fluentd does not seem to support it out of the box.
Is there any alternative for persisting my logs in azure like so in S3?

There are plugins that support Fluend with Azure blob storage,specifically blob append:
Azure Storage Append Blob output plugin buffers logs in local file and uploads them to Azure Storage Append Blob periodically.
there's a step by step guide available here which is a Microsoft solution, there's also an external plugin with same capabilities here

There is an easy solution to use a lightweight log forwarding agent from DataDog which is called vector. This is, of course, free to use and a better alternative to fluentd for a non-enterprise level use case.
I recently set that up to forward the logs from Azure AKS to a storage bucket in near real-time. Feel free to check out my Blog and Youtube Video on the same. I hope it helps.

Related

Mounting Azure Blob Storage to Azure Databricks without using cluster

We have a requirement that while provisioning the Databricks service thru CI/CD pipeline in Azure DevOps we should able to mount a blob storage to DBFS without connecting to a cluster. Is it possible to mount object storage to DBFS cluster by using a bash script from Azure DevOps ?
I looked thru various forums but they all mention about doing this using dbutils.fs.mount but the problem is we cannot run this command in Azure DevOps CI/CD pipeline.
Will appreciate any help on this.
Thanks
What you're asking is possible but it requires a bit of extra work. In our organisation we've tried various approaches and I've been working with Databricks for a while. The solution that works best for us is to write a bash script that makes use of the databricks-cli in your Azure Devops pipeline. The approach we have is as follows:
Retrieve a Databricks token using the token API
Configure the Databricks CLI in the CI/CD pipeline
Use Databricks CLI to upload a mount script
Create a Databricks job using the Jobs API and set the mount script as file to execute
The steps above are all contained in a bash script that is part of our Azure Devops pipeline.
Setting up the CLI
Setting up the Databricks CLI without any manual steps is now possible since you can generate a temporary access token using the Token API. We use a Service Principal for authentication.
https://learn.microsoft.com/en-US/azure/databricks/dev-tools/api/latest/tokens
Create a mount script
We have a scala script that follows the mount instructions. This can be Python as well. See the following link for more information:
https://docs.databricks.com/data/data-sources/azure/azure-datalake-gen2.html#mount-azure-data-lake-storage-gen2-filesystem.
Upload the mount script
In the Azure Devops pipeline the databricks-cli is configured by creating a temporary token using the token API. Once this step is done, we're free to use the CLI to upload our mount script to DBFS or import it as a notebook using the Workspace API.
https://learn.microsoft.com/en-US/azure/databricks/dev-tools/api/latest/workspace#--import
Configure the job that actually mounts your storage
We have a JSON file that defines the job that executes the "mount storage" script. You can define a job to use the script/notebook that you've uploaded in the previous step. You can easily define a job using JSON, check out how it's done in the Jobs API documentation:
https://learn.microsoft.com/en-US/azure/databricks/dev-tools/api/latest/jobs#--
At this point, triggering the job should create a temporary cluster that mounts the storage for you. You should not need to use the web interface, or perform any manual steps.
You can apply this approach to different environments and resource groups, as do we. For this we make use of Jinja templating to fill out variables that are environment or project specific.
I hope this helps you out. Let me know if you have any questions!

Azure Storage: Deploy Queue via ARM template

Is there a way to deploy Queues within a Storage Account via ARM templates? Haven't found an option so far.
If yes, how?
If not, is it planned to provide this capability?
If not and not planned, which approach do you recommend for automated deployments?
Thanks
Is there a way to deploy Queues within a Storage Account via ARM templates?
No, it's not support now.
ARM templates could deploy Azure Storage accounts, blob container, but not support to deploy queues, or tables within storage account.
Here is a link to a sample template to create a container: https://azure.microsoft.com/en-us/resources/templates/101-storage-blob-container/
You can vote up your voice to this feedback to promote the further achieved.
A few ways you can think about data plane operations in deployment/apps
1) the code that consumes the queue can create on init if it doesn't exist
2) if you're deploying via a pipeline, use a pipeline task to perform data plane operations
3) in a template use the deploymentScript resource to create it - this is a bit heavyweight for the task you need, but it will work.
That help?

RMAN backup into Google Cloud Storage

I want to take Oracle database backup using RMAN directly into the Google Cloud Storage
I am unable to find the plugin to use to take the RMAN backups into Cloud Storage. We have a plugin for Amazon S3 and am looking for one such related to Google Cloud Storage.
I don't believe there's an official way of doing this. Although I did file a Feature Request for the Cloud Storage engineering team to look into that you can find here.
I recommend you to star the Feature Request, for easy visibility and access, allowing you to view its status updates. The Cloud Storage team might ask questions there too.
You can use gcsfuse to mount GCS bucket as file systems on your machine and use RMAN to create backups there.
You can find more information about gcsfuse on its github page. Here are the basic steps to mount a bucket and run RMAN:
Create a bucket oracle_bucket. Check that it doesn't have a retention policy defined on it (it looks like gcsfuse has some issues with retention policies).
Please have a look at mounting.md that describes credentials for GCS. For example, I created a service account with Storage Admin role and created a JSON key for it.
Next, set up credentials for gcsfuse on your machine. In my case, I set GOOGLE_APPLICATION_CREDENTIALS to the path to JSON key from step 1. Run:
sudo su - oracle
mkdir ./mnt_bucket
gcsfuse --dir-mode 755 --file-mode 777 --implicit-dirs --debug_fuse oracle_bucket ./mnt_bucket
From gcsfuse docs:
Important: You should run gcsfuse as the user who will be using the
file system, not as root. Do not use sudo.
Configure RMAN to create a backup in mnt_bucket. For example:
configure controlfile autobackup format for device type disk to '/home/oracle/mnt_bucket/%F';
configure channel device type disk format '/home/oracle/mnt_bucket/%U';
After you run backup database you'll see a backup files created in your GCS bucket.

how we can do automatic backup for compute engine disk everyday ? in google cloud

I have created instance in compute engine with windows server 2012. i cant see any option to take automatic backup for instance disk database everyday. there is option of snapshot but we need to operate this manually. please suggest any way to backup automatically and can be restore able on a single click. if is there any other possibility using cloud SQL storage or any other storage please recommend.
thanks
There's an API to take snapshots, see API section here:
https://cloud.google.com/compute/docs/disks/create-snapshots#create_your_snapshot
You can write a simple app to get triggered from Cron or something to take a snapshot periodically.
You have no provision for automatic back up for compute engine disk. But you can do a manual disk backup by creating a snapshot.
Best alternative way is to create a bucket and move your files there. Google cloud buckets have automated back up facility available.
Cloud storage and cloud SQL are your options for automated back ups in google cloud.

Setting the Durable Reduced Availability (DRA) attribute for a bucket using Storage Console

When manually creating a new cloud storage bucket using the web-based storage console (https://console.developers.google.com/), is there a way to specify the DRA attribute? From the documentation, it appears that the only way to create buckets with that attribute is to either use Curl, gsutil or some other script, but not the console.
There is currently no way to do this.
At present, the storage console provides only a subset of the Cloud Storage API, so you'll need to use one of the tools you mentioned to create a DRA bucket.
For completeness, it's pretty easy to do this using gsutil (documentation at https://developers.google.com/storage/docs/gsutil/commands/mb):
gsutil mb -c DRA gs://some-bucket