Input/output error when writing to google cloud storage bucket - google-cloud-storage

I created a GCS bucket on https://console.cloud.google.com/storage/
and successfully mounted it on an instance with gcsfuse.
However, when I try to write to the mounted directory, I get an Input/output error:
fuse_debug: Op 0x00000003 connection.go:395] <- LookUpInode (parent 1, name "test")
gcs: Req 0x1: <- StatObject("test/")
gcs: Req 0x2: <- StatObject("test")
gcs: Req 0x1: -> StatObject("test/") (31.355698ms): gcs.NotFoundError: googleapi: Error 404: Not Found, notFound
gcs: Req 0x2: -> StatObject("test") (51.589538ms): gcs.NotFoundError: googleapi: Error 404: Not Found, notFound
fuse_debug: Op 0x00000003 connection.go:476] -> Error: "no such file or directory"
fuse_debug: Op 0x00000004 connection.go:395] <- MkDir (parent 1, name "test")
gcs: Req 0x3: <- CreateObject("test/")
gcs: Req 0x3: -> CreateObject("test/") (13.513239ms): googleapi: Error 403: Insufficient Permission, insufficientPermissions
fuse_debug: Op 0x00000004 connection.go:476] -> Error: "CreateChildDir: googleapi: Error 403: Insufficient Permission, insufficientPermissions"
fuse: 2016/06/09 02:12:40.128885 *fuseops.MkDirOp error: CreateChildDir: googleapi: Error 403: Insufficient Permission, insufficientPermissions

It appears from the --foreground --debug_fuse output that you're using credentials that aren't allowed to write to the bucket. They are probably read-only (StatObject didn't return a 403, and gcsfuse checks at startup that it can list the bucket).
Try giving the docs about credentials a careful read. In particular, if you're getting credentials automatically on a Google Compute Engine VM, you probably forgot to create the VM with the storage-full scope.
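If you want to confirm which scopes the VM's default service account actually has, here is a small sketch (not from the original answer; it assumes the requests package and that it runs on the VM itself) that asks the metadata server directly:
import requests

# Query the GCE metadata server for the OAuth scopes granted to the default
# service account. Writing through gcsfuse needs a scope such as
# https://www.googleapis.com/auth/devstorage.read_write or devstorage.full_control
# (the latter is what --scopes storage-full grants).
resp = requests.get(
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/scopes",
    headers={"Metadata-Flavor": "Google"},
)
print(resp.text)  # one scope URL per line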

Please run gcsfuse with --foreground (and perhaps --debug_fuse) to get some indication of what the error is when it happens.

Related

Loki-logs for storing logs in gcs bucket

I am trying to configure storage for Loki logs, and I have configured a GCS bucket.
But when I try to view the Loki logs, I get a 403 error as follows:
""2822-18-21 18:43:52 level-error ts-2822-18-21T05:13:52.8647427222
caller=flush.go:146 org_id=fake msg="failed to flush user err="store put
chunk: googleapi: Error 483: Access denied., forbidden""
What might be the reason?
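Since the 403 is returned by GCS itself, one quick check (not from this thread; a rough sketch assuming the google-cloud-storage package and a placeholder bucket name) is whether the credentials Loki runs with can write an object to the bucket at all:
from google.cloud import storage

# Placeholder bucket name; use the bucket from Loki's GCS storage config.
bucket_name = "my-loki-chunks"

# storage.Client() picks up Application Default Credentials, i.e. the same
# service account Loki would typically use on that machine.
client = storage.Client()
blob = client.bucket(bucket_name).blob("permission-probe.txt")
blob.upload_from_string("probe")  # raises google.api_core.exceptions.Forbidden on a 403
print("write OK")
If this raises Forbidden, the service account is missing a write role on the bucket (for example roles/storage.objectAdmin).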

Operation failed: "This request is not authorized to perform this operation." in Synapse with a Pyspark Notebook

I am trying to execute the following command:
mssparkutils.fs.ls("abfss://mycontainer@myadfs.dfs.core.windows.net/myfolder/")
I get the error:
Py4JJavaError: An error occurred while calling z:mssparkutils.fs.ls.
: java.nio.file.AccessDeniedException: Operation failed: "This request is not authorized to perform this operation.", 403, GET, https://myadfs.dfs.core.windows.net/mycontainer?upn=false&resource=filesystem&maxResults=5000&directory=myfolder&timeout=90&recursive=false, AuthorizationFailure, "This request is not authorized to perform this operation.
I followed the steps described in this link,
granting both myself and my Synapse workspace the "Storage Blob Data Contributor" role at the container (file system) level.
Even so, I still get this persistent error. Am I missing other steps?
I got the same kind of error in my environment. I followed the official document, reproduced the setup, and now it works fine for me. The code below should solve your problem.
Sample code:
from pyspark.sql import SparkSession

# mssparkutils and the `spark` session are available by default in Synapse notebooks
account_name = 'your_blob_name'
container_name = 'your_container_name'
relative_path = 'your_folder_path'
linked_service_name = 'Your_linked_service_name'

# Get a SAS token for the linked service
sas_token = mssparkutils.credentials.getConnectionStringOrCreds(linked_service_name)

# Access Blob Storage
path = 'wasbs://%s@%s.blob.core.windows.net/%s' % (container_name, account_name, relative_path)
spark.conf.set('fs.azure.sas.%s.%s.blob.core.windows.net' % (container_name, account_name), sas_token)
print('Remote blob path: ' + path)
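After the SAS configuration is set, the directory listing from the question can be retried against the wasbs path built above (a hypothetical follow-up, not part of the original answer):
# Retry the listing through the SAS-configured wasbs path
files = mssparkutils.fs.ls(path)
for f in files:
    print(f.name, f.size)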
Updated answer
Reference for configuring Spark in a PySpark notebook:
https://techcommunity.microsoft.com/t5/azure-synapse-analytics-blog/notebook-this-request-is-not-authorized-to-perform-this/ba-p/1712566

Getting an error while using copy activity (polybase) in adf to copy parquet files in ADLS gen2 to Azure synapse table

My source is Parquet files in ADLS Gen2. All the Parquet files are part files of 10-14 MB; the total size is around 80 GB.
The sink is an Azure Synapse table.
The copy method is PolyBase. I get the error below within 5 seconds of execution:
ErrorCode=PolybaseOperationFailed,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Error happened when loading data into SQL Data Warehouse. Operation: 'Create external table'.,Source=Microsoft.DataTransfer.ClientLibrary,''Type=System.Data.SqlClient.SqlException,Message=External file access failed due to internal error: 'Error occurred while accessing HDFS: Java exception raised on call to HdfsBridge_IsDirExist. Java exception message:
HdfsBridge::isDirExist - Unexpected error encountered checking whether directory exists or not: AbfsRestOperationException: Operation failed: "This request is not authorized to perform this operation.", 403, HEAD, URL',Source=.Net SqlClient Data Provider,SqlErrorNumber=105019,Class=16,ErrorCode=-2146232060,State=1,Errors=[{Class=16,Number=105019,State=1,Message=External file access failed due to internal error: 'Error occurred while accessing HDFS: Java exception raised on call to HdfsBridge_IsDirExist. Java exception message:
HdfsBridge::isDirExist - Unexpected error encountered checking whether directory exists or not: AbfsRestOperationException: Operation failed: "This request is not authorized to perform this operation.", 403, HEAD,
I've seen this error caused by failed authentication; check whether the authorization header and/or signature is wrong.
For example, create the scoped credential using your ADLS Gen2 storage account access key:
CREATE DATABASE SCOPED CREDENTIAL [MyADLSGen2Cred] WITH
    IDENTITY = 'user',
    SECRET = 'zge . . . 8V/rw==';
The external data source is created as follows:
CREATE EXTERNAL DATA SOURCE [MyADLSGen2] WITH (
    TYPE = HADOOP,
    LOCATION = 'abfs://myblob@pabechevb.dfs.core.windows.net',
    CREDENTIAL = [MyADLSGen2Cred]);
You can specify wasb instead of abfs, and if you're using SSL, specify it as abfss. Then the external table is created as follows:
CREATE EXTERNAL TABLE [dbo].[ADLSGen2] (
    [Content] varchar(128))
WITH (
    LOCATION = '/',
    DATA_SOURCE = [MyADLSGen2],
    FILE_FORMAT = [TextFileFormat]);
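To sanity-check that the account key itself can reach the container before wiring it into PolyBase, here is a rough sketch (not part of the original answer; it assumes the azure-storage-file-datalake Python package, and the account, container, and key values are placeholders mirroring the SQL above):
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholder values; substitute your real account name, container, and access key.
service = DataLakeServiceClient(
    account_url="https://pabechevb.dfs.core.windows.net",
    credential="zge . . . 8V/rw==",  # the storage account access key
)
fs = service.get_file_system_client("myblob")
for p in fs.get_paths(recursive=False):
    print(p.name)
If this listing also fails with a 403, the problem is the key or the storage account's network/firewall settings rather than the PolyBase objects.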
You can find additional information in my book "Hands-On Data Virtualization with Polybase".

Kubernetes pod running via MicroK8s on Ubuntu on a GCE VM on Google Cloud

I am trying to run the Google Cloud Python SDK from inside a Kubernetes pod running on a Google Compute Engine VM. A service account is attached to the VM, giving it access to Secret Manager. I am able to access Secret Manager from the host; however, running the Python SDK from the pod complains that it cannot reach the metadata service:
>>> secret_id = 'unskript_test'
>>> name = client.secret_path(project_id, secret_id)
>>> response = client.get_secret(request={"name": name})
Traceback (most recent call last):
File "/opt/conda/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 67, in error_remapped_callable
return callable_(*args, **kwargs)
File "/opt/conda/lib/python3.7/site-packages/grpc/_channel.py", line 946, in __call__
return _end_unary_response_blocking(state, call, False, None)
File "/opt/conda/lib/python3.7/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
status = StatusCode.UNAVAILABLE
details = "Getting metadata from plugin failed with error: Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Enginemetadata service. Compute Engine Metadata server unavailable"
debug_error_string = "{"created":"#1630634901.103779641","description":"Getting metadata from plugin failed with error: Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true from the Google Compute Enginemetadata service. Compute Engine Metadata server unavailable","file":"src/core/lib/security/credentials/plugin/plugin_credentials.cc","file_line":90,"grpc_status":14}"
>
metadata.google.internal doesn't get resolved from the k8s pod:
jovyan@jovyan-25ca6c8c-157d-49e5-9366-f9d57fcb7a9f:~$ wget http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
--2021-09-03 02:11:19-- http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
Resolving metadata.google.internal (metadata.google.internal)... failed: Name or service not known.
wget: unable to resolve host address ‘metadata.google.internal’
However, the host is able to resolve it:
ubuntu@gcp-test-proxy:~$ wget http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
--2021-09-03 02:11:27-- http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/?recursive=true
Resolving metadata.google.internal (metadata.google.internal)... 169.254.169.254
Connecting to metadata.google.internal (metadata.google.internal)|169.254.169.254|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2021-09-03 02:11:27 ERROR 403: Forbidden.
How can I make the pod resolve metadata.google.internal?
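One thing worth checking from inside the pod (a hedged sketch, not from this thread; it assumes the requests package) is whether the metadata server is reachable at its fixed IP 169.254.169.254 when the required Metadata-Flavor header is sent. The 403 the host's wget received is expected, since wget does not send that header; the pod's real problem is DNS resolution of metadata.google.internal:
import requests

# The GCE metadata server always lives at 169.254.169.254. If this request
# succeeds while the hostname does not resolve, the issue is purely DNS inside
# the pod (e.g. no entry or search path for metadata.google.internal).
resp = requests.get(
    "http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/email",
    headers={"Metadata-Flavor": "Google"},  # required, otherwise the server returns 403
)
print(resp.status_code, resp.text)
If that works, one common workaround is adding a hostAliases entry (or a DNS stub) in the pod spec that maps metadata.google.internal to 169.254.169.254.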

Auth GET failed: 500 Internal Server Error

I have a problem with Swift. When I execute
swift -V 2.0 -A http://xxx.xxx.x.xx:5000/v2.0/ -U cookbook:demo -K openstack stat
I get this output:
Auth GET failed: http://xxx.xxx.x.xx:5000/v2.0/tokens 500 Internal Server Error
Any solution for me? :)
I hit this error while executing 'swift list':
Error: Account GET failed ... 503 Internal Server Error (first 60 chars of response)...
On the Swift storage node, I checked the log '/var/log/swift/account-server.log' and found this error message: [Errno 13] Permission denied '/srv/node/sdb1/accounts'
Based on that message, the root cause is that the swift user doesn't have permission on the directory '/srv/node/' on the storage node. Grant it with:
chown -R swift:swift /srv/node
That solved the problem. Hope this helps.