HashiCorp Vault Snapshot Decryption - hashicorp-vault

I created a snapshot of my vault server with the following command:
vault operator raft snapshot save snapshot.save
I now have the snapshot file, and I am able to use it to restore the server. I am trying to decrypt and read the snapshot file programmatically so that I can search for a value inside it. Is there a way to decrypt Vault snapshots into plaintext?

There isn't a way to just decrypt it into a mysql-dump-like output file, no.
You can start the server in recovery mode, then iterate through the tree, grepping for what you're looking for.
You can find docs on that here:
https://learn.hashicorp.com/tutorials/vault/inspecting-data-integrated-storage?in=vault/monitoring#secret-engine-data-example
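The "iterate through the tree" step could be sketched roughly as below. This is a minimal sketch, not a tested recovery procedure: it assumes the server is already running in recovery mode, that you hold a recovery token, and that the raw storage tree is listable through the sys/raw endpoint with ?list=true. The function names walk and http_list are hypothetical helpers introduced here for illustration.

```python
import json
import urllib.request


def http_list(addr, token, path):
    """List the child keys under sys/raw/<path> over the Vault HTTP API.

    Assumes addr is the server address and token is a recovery token.
    """
    url = f"{addr}/v1/sys/raw/{path}?list=true"
    req = urllib.request.Request(url, headers={"X-Vault-Token": token})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["data"]["keys"]


def walk(list_fn, prefix=""):
    """Yield every leaf path under prefix.

    list_fn(prefix) must return the child keys of prefix; keys that end
    in "/" are treated as folders and descended into.
    """
    for key in list_fn(prefix):
        path = prefix + key
        if key.endswith("/"):
            yield from walk(list_fn, path)
        else:
            yield path
```

Against a live server this might be driven with something like walk(lambda p: http_list(addr, token, p)), filtering the yielded paths for the name you are looking for. Searching the stored values themselves would additionally need a read of each leaf path, and the returned values may be encoded, so treat this only as a starting point.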

Related

s3fs-fuse encrypt/securely store passwords

We are following the instructions below to mount an S3 bucket on a machine:
https://docs.jdcloud.com/en/object-storage-service/s3fs
Question:
We are storing plain-text secrets/keys in a file that is required for the mount. Is there any other way to avoid exposing them in plain text, for example by encrypting them or storing them somewhere else? For example, when we mount with "-o password..", the value shows up in the process listing and in tools like lsof. Hence, we need a security fix for this.

Reading Json file from Azure datalake as a file using Json.load in Azure databricks /Synapse notebooks

I am trying to parse JSON data with multiple nested levels. My approach is to pass the file name to open(file_name) and load the data. When I provide a data lake path, it throws an error saying the file path was not found. I am able to read the data into dataframes, but how can I read a file from the data lake as a plain file and open it, without converting it to a dataframe?
Current approach on my local machine, which works:
f = open("File_Name.Json")
data = json.load(f)
Failing scenario when providing the data lake path:
f = open("Datalake path/File_Name.Json")
data = json.load(f)
You need to mount the data lake folder to a location in DBFS (in Databricks), although mounting is a security risk: anyone with access to the Databricks resource will have access to all mounted locations.
Documentation on mounting to dbfs: https://docs.databricks.com/data/databricks-file-system.html#mount-object-storage-to-dbfs
The open function works only with local files; out of the box it does not understand cloud file paths. You can of course try to mount the cloud storage, but as mentioned by @ARCrow, it would be a security risk (unless you create a so-called passthrough mount that controls access at the cloud storage level).
But if you're able to read the file into a dataframe, then the cluster has all the necessary settings for accessing the cloud storage. In this case you can just use the dbutils.fs.cp command to copy the file from the cloud storage to local disk, and then open it with the open function. Something like this:
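import json

# Copy the file from cloud storage to the driver's local disk
dbutils.fs.cp("Datalake path/File_Name.Json", "file:///tmp/File_Name.Json")
# Open the local copy as an ordinary file
with open("/tmp/File_Name.Json", "r") as f:
    data = json.load(f)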

Aspera Node API /files/{id}/files endpoint not returning up to date data

I am working on a webapp for transferring files with Aspera. We are using AoC for the transfer server and an S3 bucket for storage.
When I upload a file to my s3 bucket using aspera connect everything appears to be successful, I see it in the bucket, and I see the new file in the directory when I run /files/browse on the parent folder.
I am refactoring my code to use the /files/{id}/files endpoint to list the directory because the documentation says it is faster compared to the /files/browse endpoint. After the upload is complete, when I run the /files/{id}/files GET request, the new file does not show up in the returned data right away. It only becomes available after a few minutes.
Is there some caching mechanism in place? I can't find anything about this in the documentation. When I make a transfer in the AoC dashboard everything updates right away.
Thanks,
Tim
Yes, the file-id based system uses an in-memory cache (Redis).
This cache is updated when a new file is uploaded using Aspera. But for files moved directly on the storage, there is a daemon that periodically scans for new files.
If you want to bypass the cache and have the API read the storage directly, you can add this header to the request:
X-Aspera-Cache-Control: no-cache
Another possibility is to trigger a scan by reading
/files/{id}
for the folder id.
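As a rough illustration of the cache-bypass header, the listing request could be built like this. It is only a sketch: the node URL, the folder id, and the use of a bearer token for authorization are assumptions made for the example, and build_listing_request is a hypothetical helper name. Only the endpoint path and the X-Aspera-Cache-Control header come from the discussion above.

```python
import urllib.request


def build_listing_request(node_url, folder_id, bearer_token):
    """Build a GET request for /files/{id}/files that asks the node to skip its cache.

    node_url, folder_id and bearer_token are placeholders; adapt the
    Authorization header to whatever scheme your node actually uses.
    """
    url = f"{node_url}/files/{folder_id}/files"
    req = urllib.request.Request(url, method="GET")
    req.add_header("Authorization", f"Bearer {bearer_token}")
    # Header from the answer above: read the storage instead of the cache
    req.add_header("X-Aspera-Cache-Control", "no-cache")
    return req
```

The request can then be sent with urllib.request.urlopen(req). Bypassing the cache on every call probably gives up some of the speed advantage of the file-id endpoint, so it may be worth sending the header only right after an upload.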

Databricks load file from path which contains equals (=) sign

I'm looking to export Azure Monitor data from Log Analytics to a storage account and then read the JSON files into Databricks using PySpark.
The blob path for the Log Analytics export contains an equals (=) sign, and Databricks throws an exception when using the path.
WorkspaceResourceId=/subscriptions/subscription-id/resourcegroups/<resource-group>/providers/microsoft.operationalinsights/workspaces/<workspace>/y=<four-digit numeric year>/m=<two-digit numeric month>/d=<two-digit numeric day>/h=<two-digit 24-hour clock hour>/m=<two-digit 60-minute clock minute>/PT05M.json
Log Analytics Data Export
Is there a way to escape the equals sign so that the JSON files can be loaded from the blob location?
I tried a similar use case by following the Microsoft documentation. The steps are below:
Mount the storage container. We can do it with the Python code below. Make sure you pass all the parameters correctly, because incorrect parameters will lead to several different errors.
dbutils.fs.mount(
    source = "wasbs://<container-name>@<storage-account-name>.blob.core.windows.net",
    mount_point = "/mnt/<mount-name>",
    extra_configs = {"<conf-key>": dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>")})
The parameters are described below:
<storage-account-name> is the name of your Azure Blob storage account.
<container-name> is the name of a container in your Azure Blob storage account.
<mount-name> is a DBFS path representing where the Blob storage container or a folder inside the container (specified in source) will be mounted in DBFS.
<conf-key> can be either fs.azure.account.key.<storage-account-name>.blob.core.windows.net or fs.azure.sas.<container-name>.<storage-account-name>.blob.core.windows.net
dbutils.secrets.get(scope = "<scope-name>", key = "<key-name>") gets the key that has been stored as a secret in a secret scope.
Then you can access those files as below:
df = spark.read.text("/mnt/<mount-name>/...")
df = spark.read.text("dbfs:/mnt/<mount-name>/...")
There are also several other ways of accessing the files; all of them are described in the doc.
Also check the Log Analytics workspace doc to understand how data is exported to Azure Storage.
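To make the path layout from the question concrete, here is a small sketch that assembles the export path for a given timestamp under a mount point. The helper name export_path and the example mount point are hypothetical; only the WorkspaceResourceId=/.../y=/m=/d=/h=/m=/PT05M.json layout is taken from the question. It does not by itself show whether any escaping of the equals sign is needed.

```python
from datetime import datetime


def export_path(mount_point, workspace_resource_id, ts):
    """Build the Log Analytics export path for timestamp ts.

    workspace_resource_id is expected to start with "/subscriptions/...",
    matching the layout quoted in the question.
    """
    return (
        f"{mount_point}/WorkspaceResourceId={workspace_resource_id}"
        f"/y={ts:%Y}/m={ts:%m}/d={ts:%d}/h={ts:%H}/m={ts:%M}/PT05M.json"
    )


# Example with placeholder values
path = export_path(
    "/mnt/logs",
    "/subscriptions/sub-id/resourcegroups/rg/providers/"
    "microsoft.operationalinsights/workspaces/ws",
    datetime(2023, 1, 2, 3, 5),
)
print(path)
```

The resulting string could then be passed to a reader such as spark.read.json(path) once the container is mounted, assuming the mount point used here matches your own.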

Is there any example reference to use KMS keys for encrypting records delivered by firehose in Python 3?

I am trying to enable encryption of records delivered to S3 by Firehose using a KMS key, so that the data in S3 will be encrypted. I am trying to achieve this using a CloudFormation YAML file and Python.
I tried using the code below. Is this the right way? We are using a Lambda to read the encrypted files from that bucket, then decrypt them and send them to an SFTP server.
ExtendedS3DestinationConfiguration:
  BucketARN: !GetAtt s3bucket.Arn
  BufferingHints:
    IntervalInSeconds: '60'
    SizeInMBs: '10'
  CompressionFormat: UNCOMPRESSED
  EncryptionConfiguration:
    KMSEncryptionConfig:
      AWSKMSKeyARN: "arn:aws:kms:...:key/keyvalue"
Will this be enough, or is there anything else I need to add? Is there a better way to achieve this?