We manage our GitHub org through Terraform and are getting the following error against many of our modules.
* module.product_tools.github_team_repository.write: 2 error(s) occurred:
* module.product_tools.github_team_repository.write[1]: github_team_repository.write.1: At least one permission expected from permissions map.
This occurred after we hit an abuse detection mechanism in GitHub, which seemed to mess up our Terraform state. I reverted to an earlier version of the state file; however, I am now getting the above errors.
Anyone have any ideas about the permissions map?
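For context, the resources in question each set a single permission value, roughly like the sketch below (variable and team names are simplified placeholders, not our exact config):

resource "github_team_repository" "write" {
  count      = length(var.write_repositories)        # placeholder list of repository names
  team_id    = github_team.product_tools.id          # placeholder team reference
  repository = var.write_repositories[count.index]
  permission = "push"                                # one of: pull, triage, push, maintain, admin
}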
Thanks!
Sinéad
I recently updated our CDK code to move our OpenSearch cluster from version 1.3 to 2.3. The cluster itself seems to have upgraded to a healthy state and is still accessible and usable by our application, but CloudFormation failed when attempting to update our domain resource with:
Resource handler returned message: "Resource handler returned message: "Invalid request provided: DP Nodes are OOS, Tags operation is not allowed"
This kicked the stack into UPDATE_ROLLBACK_FAILED, and rolling back is not possible because the cluster cannot be downgraded back to 1.3.
I'm struggling to find any information about the error it's kicking out, and I'm not quite sure how to resolve it to unblock the CloudFormation stack.
Things I have tried:
Digging through CloudWatch logs only revealed information pertaining to queries.
Forcing the rollback to occur without the Domain resource. This got me back to an UPDATE_COMPLETE state, but each subsequent deploy of this stack will cause it to fail again, since the core issue is not resolved.
This was an odd presentation of a permissions issue. As I was reading through some docs, I stumbled upon this section, which discusses changes to tag-based access control.
This led me to start looking into CloudTrail a bit, and I stumbled upon the exact error that was firing when this deploy happened. It was a little odd because the assumed role granted admin access to CloudFormation, but the last line of this event record caught my eye:
"sourceIPAddress": "cloudformation.amazonaws.com",
"userAgent": "cloudformation.amazonaws.com",
"errorCode": "ValidationException",
"errorMessage": "DP Nodes are OOS, Tags operation is not allowed",
"eventSource": "es.amazonaws.com",
Upon adding es.amazonaws.com to the trust relationship of that role, the deploy fully re-ran successfully.
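In case it helps, the change amounts to adding es.amazonaws.com as a trusted service principal on that role. A trust-policy sketch looks roughly like the following (your role will have its own existing statements; only the es.amazonaws.com entry is the addition):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
          "cloudformation.amazonaws.com",
          "es.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}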
Hopefully this helps someone else.
Our terraform plan is suddenly reporting errors such as the following while it is 'refreshing state':
Error: multiple IAM policies found matching criteria (ARN:arn:aws:iam::aws:policy/ReadOnlyAccess); try different search;
on ../../modules/xxxx/policies.tf line 9, in data "aws_iam_policy" "read_only_access":
9: data "aws_iam_policy" "read_only_access" {
and
Error: no IAM policy found matching criteria (ARN: arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy); try different search
on ../../modules/xxxx/iam.tf line 97, in data "aws_iam_policy" "aws_eks_worker_node":
97: data "aws_iam_policy" "aws_eks_worker_node" {
We recently updated our dev EKS cluster from 1.20 to 1.21. Stage and Live environments are still on 1.20, but they are built from the same module. We didn't see these errors until a day after the upgrade, and there were no changes to the Terraform files being reported. The errors also appear to be somewhat intermittent and random: one plan run will be successful, while the next will fail on some of the policies we have defined.
I know this is a shot in the dark with limited information, so please ask questions if you have them. I'm really just looking for someone who knows what this error means, because Google isn't returning anything useful.
We are also running Terraform 0.14.
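For reference, the lookups that fail are plain data sources for AWS-managed policies, roughly like this (module paths and local names simplified):

data "aws_iam_policy" "read_only_access" {
  arn = "arn:aws:iam::aws:policy/ReadOnlyAccess"
}

data "aws_iam_policy" "aws_eks_worker_node" {
  arn = "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
}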
For various reasons we deleted the AWS S3 keys and replaced them. We are now unable to deploy the code through GitHub Actions due to the errors below.
"Resource is not in the state servicesStable"
In the AWS events tab we are getting the error below.
"unable to consistently start tasks successfully".
If required, let me know and I will share the task-def.json file.
I deleted my Kubeflow (KF) cluster last night to create a new one (using the kubectl cluster command, not kfctl delete), and when I tried to create a new one it failed; it does not work with either the CLI or the Console. I found other people have run into this issue before, for example here and here.
"However, as I said even with CLI my deployment fails, the error from console is:
Failed to apply: (kubeflow.error): Code 500 with message: coordinator Apply failed for gcp: (kubeflow.error): Code 500 with message: gcp apply could not update deployment manager Error could not update storage-kubeflow.yaml; Insert deployment error: googleapi: Error 403: Request had insufficient authentication scopes.
More details:
Reason: insufficientPermissions, Message: Insufficient Permission
and the error I get from Console is:
"Please enable APIs for your project and try again
Please enable cloud resource manager API: https://console.developers.google.com/apis/api/cloudresourcemanager.googleapis.com/ and iam API: https://console.developers.google.com/apis/api/iam.googleapis.com/"
Note that this error is wrong; all of the APIs are already enabled. I'm quite sure this is a Kubeflow bug, but I'm not sure how to find a workaround. Any thoughts?
With the CLI, I'm using my own account, which has "owner" privileges.
Thanks
It seems you have an issue with IAM and the installation of Kubeflow, a third-party product that is not itself supported by us; nevertheless, I went ahead and dug up some information about this machine learning product.
The main issues (although it seems you have already covered permissions) are permissions, the number of projects, and some fine-grained points.
I did some checking and found the following resources that may help:
a) Troubleshooting Kubeflow [1]
b) Deploying Kubeflow in GKE [2]
c) Kubeflow auto deployer for GKE [3]
There is also some discussion about mismatched permission settings in Kubeflow that may be worth reading [4].
Finally, there is a support group, also run on a best-effort basis due to the nature of Kubeflow, google-kubeflow-support@google.com, that may come in handy.
I trust this information will be useful for you to solve your issue.
I encountered a strange permissions error while building Docker images on the cloud. I switched to another machine, installed the gcloud CLI, ran gcloud init, and everything worked again.
However, I noticed that building images took much longer because I hadn't enabled the Kaniko cache (which I figured out from this post: gcloud rebuilds complete container but Dockerfile is the same, only the script has changed).
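For reference, enabling the cache boils down to the gcloud-level Kaniko setting, something along the lines of:

gcloud config set builds/use_kaniko True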
After enabling this feature, I tried to rebuild my last image and bam, the same error message:
Status: Downloaded newer image for gcr.io/kaniko-project/executor:latest
gcr.io/kaniko-project/executor:latest
error checking push permissions --
make sure you entered the correct tag name, and that you are authenticated correctly, and try again:
checking push permission for "eu.gcr.io/pipeline/tree-par": creating push check transport for eu.gcr.io failed:
GET https://eu.gcr.io/v2/token?scope=repository%3pipeline%2Ftree-par%3Apush%2Cpull&service=eu.gcr.io:
UNAUTHORIZED: You don't have the needed permissions to perform this operation, and you may have invalid credentials.
To authenticate your request, follow the steps in: https://cloud.google.com/container-registry/docs/advanced-authentication
ERROR
ERROR: build step 0 "gcr.io/kaniko-project/executor:latest" failed: step exited with non-zero status: 1
-------------------------------------------------------------------------------------------------------------------------------
ERROR: (gcloud.builds.submit) build bad4a9a4-054d-4ad7-991d-e5aeae039b7c completed with status "FAILURE"
Does anyone have any idea why this failed after enabling the Kaniko cache? I'd hate not to use it, because when it still worked it really decreased the time it took to create Docker images.
It seems that the issue comes from Kaniko's end.
Three days ago, in version v0.21.0, they added this fix:
Fix: GCR credential helper check does not respect DOCKER_CONFIG environment variable
Even after that release, a day later this issue was reported, where users saw a very similar error message:
"[...] You don't have the needed permissions to perform this operation, and you may have invalid credentials[...] "
This was already fixed yesterday with the release of version v0.22.0. The suggested workaround is to reference the pinned executor image:
gcr.io/kaniko-project/executor:v0.22.0
I would suggest using that image instead of executor:latest to "force" the use of the v0.22.0 version.
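One way to do that is to run Kaniko as an explicit Cloud Build step rather than relying on the gcloud default. A minimal cloudbuild.yaml sketch, reusing the repository path from the question (adjust to your own registry), might look like:

steps:
- name: 'gcr.io/kaniko-project/executor:v0.22.0'   # pinned instead of :latest
  args:
  - --destination=eu.gcr.io/pipeline/tree-par      # image from the question; change to your path
  - --cache=true                                   # keep the Kaniko layer cache enabled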
I hope this is helpful! :)