Getting error "insufficient system resources exist to complete the requested service" during file split - filestream

I am trying to split a large data file (4 GB) into multiple files.
I am getting the error below:
insufficient system resources exist to complete the requested service
How can I resolve this?
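In comparable cases this Windows error (ERROR_NO_SYSTEM_RESOURCES, error 1450) is often triggered by reading or writing the whole file, or very large buffers, in a single operation; copying in bounded chunks usually avoids it. The question itself concerns FileStream, but as a hedged illustration of the chunked approach, here is a minimal Python sketch; the source path, part size, and chunk size are assumptions rather than values from the question.

import os

def split_file(src_path, part_bytes=1024 ** 3, chunk_bytes=64 * 1024 * 1024):
    # Split src_path into ~1 GB parts, copying in 64 MB chunks so that no
    # single read or write has to hold a huge buffer in memory.
    part_index = 0
    with open(src_path, "rb") as src:
        while True:
            part_path = f"{src_path}.part{part_index:03d}"
            written = 0
            with open(part_path, "wb") as dst:
                while written < part_bytes:
                    chunk = src.read(min(chunk_bytes, part_bytes - written))
                    if not chunk:
                        break
                    dst.write(chunk)
                    written += len(chunk)
            if written == 0:
                os.remove(part_path)  # source exhausted; drop the empty part file
                return
            part_index += 1

# Hypothetical usage: split a 4 GB file into 1 GB parts.
# split_file(r"C:\data\bigfile.dat")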

Related

ADLS Gen2 operation failed for: An error occurred while sending the request. User error 2011

Hi, I am getting the above error when accessing a storage container folder where I am trying to get the metadata of a folder and its files. It can't access the folders for some reason. I checked the linked service and the storage container; public access is enabled and a private endpoint is also set.
Please let me know what else is missing.
I tried to reproduce this and got a similar error.
The cause of the error was that I was trying to access an ADLS Gen2 account that is not available or present.
After providing the correct information, I was able to connect to ADLS Gen2 successfully.
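For anyone hitting the same thing, a quick way to confirm that the account and folder actually exist (and that your identity can list them) before wiring them into the linked service is to query them directly. Below is a minimal sketch assuming the azure-storage-file-datalake and azure-identity packages, with placeholder account, container, and folder names.

from azure.identity import DefaultAzureCredential
from azure.storage.filedatalake import DataLakeServiceClient

# Placeholders: replace the account URL, container, and folder with your own.
service = DataLakeServiceClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    credential=DefaultAzureCredential(),
)
file_system = service.get_file_system_client("my-container")

# List the folder's children and print basic metadata, roughly what the
# Get Metadata activity would retrieve.
for path in file_system.get_paths(path="my/folder", recursive=False):
    print(path.name, "directory" if path.is_directory else "file", path.last_modified)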

External Table on DELTA format files in ADLS Gen 1

We have a number of Databricks DELTA tables created on ADLS Gen1, and there are also external tables built on top of each of those tables in one of the Databricks workspaces.
Similarly, I am trying to create the same sort of external tables on the same DELTA format files, but in a different workspace.
I have read-only access via a service principal on ADLS Gen1, so I can read the DELTA files through Spark DataFrames, as shown below:
read_data_df = spark.read.format("delta").load('dbfs:/mnt/data/<foldername>')
I am even able to create Hive external tables, but I see the following warning while reading data from the same table:
Error in SQL statement: AnalysisException: Incompatible format detected.
A transaction log for Databricks Delta was found at `dbfs:/mnt/data/<foldername>/_delta_log`,
but you are trying to read from `dbfs:/mnt/data/<foldername>` using format("hive"). You must use
'format("delta")' when reading and writing to a delta table.
To disable this check, SET spark.databricks.delta.formatCheck.enabled=false
To learn more about Delta, see https://learn.microsoft.com/azure/databricks/delta/index
;
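The check above fires because the table was declared with the default Hive format while the folder already contains a Delta transaction log. A minimal sketch of declaring the external table with the Delta format instead (the database and table names here are hypothetical; the location is the same mount as above):

# Sketch only: register an external (unmanaged) table over the existing Delta folder.
spark.sql("""
    CREATE TABLE IF NOT EXISTS mydb.my_external_table
    USING DELTA
    LOCATION 'dbfs:/mnt/data/<foldername>'
""")

# Reads now go through the Delta reader, so the format check passes.
spark.sql("SELECT * FROM mydb.my_external_table LIMIT 10").show()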
If I create the external table USING DELTA, then I see a different access error:
Caused by: org.apache.hadoop.security.AccessControlException:
OPEN failed with error 0x83090aa2 (Forbidden. ACL verification failed.
Either the resource does not exist or the user is not authorized to perform the requested operation.).
Does this mean that I would need full access, rather than just READ ONLY, on the underlying file system?
Thanks
Resolved after upgrading the Databricks Runtime environment to version DBR 7.3.

GCloud custom image upload failure due to size or permissions

I've been trying to upload two custom images for some time now and I have failed repeatedly. During the import process, the Google application always responds with the message that the Compute Engine default service account does not have the role roles/compute.storageAdmin. However, I have assigned it both using the CLI and the web interface.
Notably, the application throws this error while resizing the disk. The original size of the disk is about 10 GB; however, it tries to convert it to a 1024 GB (!) disk. This got me thinking: could it be that this is too big for the application, and hence it throws the error that it lacks permissions?
As a follow-up question, I have not found any option to set the size of the disk (neither in the CLI nor in the web app). Does anybody know of such an option?
Here is the error message I received:
ate-import-3ly9z": StatusMatch found: "Import: Resizing temp-translation-disk-3ly9z to 1024GB in projects/0000000000000/zones/europe-west4-a."
[import-and-translate]: 2020-05-01T07:46:30Z Error running workflow: step "import" run error: step "wait-for-signal" run error: WaitForInstancesSignal FailureMatch found for "inst-importer-import-and-translate-import-3ly9z": "ImportFailed: Failed to resize disk. The Compute Engine default service account needs the role: roles/compute.storageAdmin'"
[import-and-translate]: 2020-05-01T07:46:30Z Serial-output value -> target-size-gb:1024
[import-and-translate]: 2020-05-01T07:46:30Z Serial-output value -> source-size-gb:7
[import-and-translate]: 2020-05-01T07:46:30Z Serial-output value -> import-file-format:vmdk
[import-and-translate]: 2020-05-01T07:46:30Z Workflow "import-and-translate" cleaning up (this may take up to 2 minutes).
[import-and-translate]: 2020-05-01T07:47:34Z Workflow "import-and-translate" finished cleanup.
[import-image] 2020/05/01 07:47:34 step "import" run error: step "wait-for-signal" run error: WaitForInstancesSignal FailureMatch found for "inst-importer-import-and-translate-import-3ly9z": "ImportFailed: Failed to resize disk. The Compute Engine default service account needs the role: roles/compute.storageAdmin'"
ERROR
ERROR: build step 0 "gcr.io/compute-image-tools/gce_vm_image_import:release" failed: step exited with non-zero status: 1
ERROR: (gcloud.compute.images.import) build a9ccbeac-92c5-4457-a784-69d486e85c3b completed with status "FAILURE"
Thanks for your time!
EDIT: Not sure, but I'm fairly certain this is due to the 1024 GB being too big. I've uploaded a 64 GB image without any issues using the same methods. For those who read this after me, that's most likely the issue (:
This error message with the import of virtual disks has two root causes:
1. Cloud Build and/or Compute Engine and/or your user account does not have the correct IAM roles to perform these tasks. You can verify them here.
Cloud Build SA roles needed:
roles/iam.serviceAccountTokenCreator
roles/compute.admin
roles/iam.serviceAccountUser
Compute Engine SA roles needed:
roles/compute.storageAdmin
roles/storage.objectViewer
User Account roles needed:
roles/storage.admin
roles/viewer
roles/resourcemanager.projectIamAdmin
2.- " Not sure but I'm fairly certain this is due to the 1024GB being too big" The disk quota you have is less than 1T. The normal disk quota is 250-500 GB so that could be why by importing a 64 GB disk you encounter no problem.
You can check your quota in step 1 of this document; If you need to request more, you can follow steps 2 to 7.
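As a hedged sketch of checking that regional disk quota programmatically (the project ID and region are placeholders, and the google-api-python-client library with application default credentials is assumed):

from googleapiclient import discovery

# Sketch only: list the regional Compute Engine quotas and print the disk totals.
compute = discovery.build("compute", "v1")
region_info = compute.regions().get(project="my-project", region="europe-west4").execute()

for quota in region_info.get("quotas", []):
    if quota["metric"] in ("DISKS_TOTAL_GB", "SSD_TOTAL_GB"):
        print(f"{quota['metric']}: {quota['usage']:.0f} / {quota['limit']:.0f} GB")

If the DISKS_TOTAL_GB limit comes back below 1024, that points to the quota explanation above rather than a missing IAM role.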

Loading Amazon Redshift with a manifest, with an error in one file

When using the COPY command to load Amazon Redshift with a manifest, suppose one of the files contains an error.
Is there a way to just log the error for that file, but continue loading the other files?
The manifest file indicates whether a file is mandatory and whether an error should be generated if a file is not found. (Using a Manifest to Specify Data Files)
The COPY command will retry if it cannot read a file. (Errors When Reading Multiple Files)
The COPY command can specify a MAXERRORS parameter that permits a certain number of errors before the COPY command fails. (MAXERROR)
When loading data from files, Amazon Redshift will report any errors in the STL_LOAD_ERRORS table. (STL_LOAD_ERRORS)
As said above, the MAXERROR parameter should satisfy this requirement.
In addition, the COPY command's NOLOAD parameter checks the validity of the data without loading it. Running with the NOLOAD parameter is much faster, as it only parses the file.
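As a hedged sketch of what such a manifest-driven load could look like end to end (the cluster endpoint, table, bucket, and IAM role ARN are placeholders, and psycopg2 is assumed as the client):

import psycopg2

# The manifest at s3://my-bucket/loads/batch1.manifest would contain entries like:
# {"entries": [{"url": "s3://my-bucket/loads/part-0001.csv", "mandatory": true}, ...]}
conn = psycopg2.connect(
    host="my-cluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="dev", user="loader", password="...",
)

copy_sql = """
    COPY public.my_table
    FROM 's3://my-bucket/loads/batch1.manifest'
    IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftCopyRole'
    MANIFEST
    CSV
    MAXERROR 100;
"""

with conn, conn.cursor() as cur:
    cur.execute(copy_sql)
    # Rows rejected within the MAXERROR budget are recorded in STL_LOAD_ERRORS.
    cur.execute("""
        SELECT filename, line_number, err_reason
        FROM stl_load_errors
        ORDER BY starttime DESC
        LIMIT 20;
    """)
    for filename, line_number, err_reason in cur.fetchall():
        print(filename.strip(), line_number, err_reason.strip())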

KIE Workbench - How to Upload Large Rule File.xls

We are uploading 45,000 rules to KIE Workbench. These rules are declared in a single Excel sheet. We are planning to upload 5 files to a single KIE project, totalling up to 200,000 rules.
Problem Statement
Currently, for 20,000 rules, the validation and build take a lot of time.
We have to raise the VM options as follows, otherwise we get very high response times and sometimes a java.io buffer-size-exceeded exception:
Xms=512m Xmx=7168 MaxPermGen=4096
If the rule file contains validation errors, then validation alone takes 15 minutes.
Converting an .xls file to GDST format and then running Build & Deploy on the project takes more than an hour, and it is not satisfactory that we only find out after an hour whether there are validation errors, a build failure, or a successful deployment.
Other connected users are unable to perform any operation on KIE Workbench during the upload/validation/deployment of such large files.
After successfully converting an .xls file to GDST format, the Guided Decision Table editor is unable to load even 10k records.
During each validation or upload, we get an error that it is unable to deploy the artifact to http://repo1.maven.org/maven2. We are only uploading to KIE Workbench, so why is it going to that repository for deployment? Secondly, we have our own Nexus repository deployed in the organization where assets need to be deployed, not http://repo1.maven.org/maven2. For info, we are using Maven's distributionManagement in the kie-project's pom to deploy to our repository.