"Please check the configuration, the workflow was not able to start " error in litmus chaosCenter portal after opening the status of workflow - workflow

I faced the same issue while setting up LitmusChaos on OpenShift Local. It is most likely caused by a lack of permission to create the pod that runs the chaos scenario.
You can find the reason for the failure in the "workflow controller" or "subscriber" pod logs. In my case the "workflow controller" log showed that the chaos pod failed to start because it was trying to run as user 2000. The fix is:
oc adm policy add-scc-to-group anyuid system:authenticated
The above command grants the "anyuid" SCC to all authenticated users. Do not use it in a production environment. The best practice is to add the anyuid SCC only to the service account that you created for the Litmus agent.
References:
https://stackoverflow.com/a/65231547/8496688
https://cloud.redhat.com/blog/managing-sccs-in-openshift#:~:text=anyuid,inside%20and%20outside%20the%20container.
Edit: Instead of assigning the privileged SCC to all authenticated users, create the SCC provided by the Litmus team (OpenShift SCC link) and assign it to the service account (SA) that you specified when creating the agent, e.g.
oc adm policy add-scc-to-user privileged system:serviceaccount:myproject:mysvcacct
Ensure that you add the prefix "system:serviceaccount:<project_name>" when granting the SCC to the SA.
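A minimal sketch of that naming rule, assuming the project `myproject` and the service account `mysvcacct` from the example above. The helper `sa_subject` is hypothetical (it is not part of `oc`), and the `oc` commands are only printed, not executed, so nothing changes on the cluster until you run them yourself. The second form uses the `-z` option of `oc adm policy add-scc-to-user`, which takes the bare service account name and resolves it in the namespace given by `-n`.

```shell
#!/usr/bin/env bash
# Build the fully qualified user name OpenShift uses for a service account.
# sa_subject is a hypothetical helper, not an oc subcommand.
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

PROJECT=myproject   # assumed project (namespace) name
SA=mysvcacct        # assumed service account name
SCC=privileged      # SCC from the example; substitute the Litmus-provided SCC

# Print (do not run) two equivalent ways of granting the SCC to the SA.
echo "oc adm policy add-scc-to-user ${SCC} $(sa_subject "$PROJECT" "$SA")"
echo "oc adm policy add-scc-to-user ${SCC} -z ${SA} -n ${PROJECT}"
```

Copy whichever printed command you prefer and run it as a user who is allowed to manage SCCs.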
References:
https://docs.openshift.com/container-platform/3.11/admin_guide/manage_scc.html

Related

No agent pool found with identifier 61: why can't project collection administrator see deployment pools?

I am trying to set up a deployment agent on a server.
The process is failing with the message "No agent pool found with identifier 61".
Looking at the logs I can see the following
    INFO DeploymentGroupAgentConfigProvider] Found deployment group Web Servers with id 23
    INFO DeploymentGroupAgentConfigProvider] PoolId for deployment group 'Web Servers' is '61'.
This would suggest the server is connecting as it can find the deployment group id but not that of the pool.
This ties in with the fact that although I can see the relevant Deployment Group in DevOps, I cannot see the corresponding Deployment Pool.
My account is registered as a collection administrator which I would have thought would be enough to give me visibility of everything.
Also a colleague, who is also a collection administrator, CAN see the Deployment Pools which I can't.
Anyone got any idea why that might be?
When Googling for help on Deployment Groups and Pools, I found loads of information about Build Agents but not Deployment Agents and Pools.
Does anyone have a definitive resource for giving guidance on Deployment Groups and Pools, how they relate, how they differ and how they are administered?
Thanks
Link to the Deployment group documentation.
A deployment group is basically a specialized agent pool that is only usable with classic release definitions and provides some extra functionality, such as choosing which deployment group agent to use in which stage. You can't use deployment groups with multistage YAML pipelines, which is good to keep in mind.
I'm not sure how you are installing and configuring your agent, since you are not seeing the actual deployment groups, but there are specific switches in the agent installation command for deployment groups:
.\config.cmd --deploymentpool --deploymentpoolname "WebAppDemo-deployment-group-1" --agent $env:COMPUTERNAME --runasservice --work '_work' --url 'https://dev.azure.com/myorg/
That might explain your errors. A new deployment group inherits some security groups from the Azure DevOps project, so check whether you are a member of any of these: Contributors, Deployment Group Administrators, Project Administrators, Release Administrators. I can't remember how the agent configuration authenticates if you are not using a PAT, but I'd guess that you can't authenticate in the agent config with those deployment group switches unless you have rights to that group, or someone who has the rights has created a PAT for you.
If you do get rights to the deployment group, you can find an autogenerated installation script for agents there. I never got it to work out of the box back in the day (*), but you can at least copy the config command from there and run it manually.
(*) That was when Azure DevOps Server was launched. I haven't really used deployment groups in three years, so my info might be a bit outdated.
@JakkaK
Thanks for the response.
I have seen the Deployment group documentation page you referenced; however, I believe we have taken everything on it into account, so I was looking for other viewpoints.
In terms of security I have added myself to
Contributors
Deployment Group Administrators
Project Administrators
Release Administrators
The agent installation command/switch combo I used was
.\config.cmd --gituseschannel --deploymentgroup --deploymentgroupname "My Deployment Group" --agent $env:COMPUTERNAME --runasservice --work '_work' --url 'https://MyOrg.co.uk/tfs/' --collectionname 'MyCollection' --projectname 'My Project';
I tried the agent installation command/switch combo suggested and got the message
Access denied. <<User>> needs Manage permissions for pool "My Deployment Pool" to perform the action. For more information, contact the Azure DevOps Server administrator.
Given that I am in all the security groups listed above, this makes no sense.
One thing I have noticed is that, when creating a Deployment Group, the Deployment Pool name defaults to "Project Name-Deployment Group Name".
The project portion of the name of the Deployment Pool I am trying to connect to suggests that the project has been renamed since the Deployment Group was created.
For example:
The project is called "BBB Project Name".
The Deployment Pool, which as stated above appears to be named by a default format, is called "AAA BBB Project Name-Deployment Group Name".
i.e. the project looks like it has been renamed from "AAA BBB Project Name" to "BBB Project Name".
You'd like to think that relationships were by identifiers rather than names, but I'm wondering whether the rename could cause this issue.

Can't create role for service account because it is "not supported for this resource"

I have the following script I'd like to execute to create my service account and give it a Cloud Build Service Account role.
# create service account for github actions
gcloud iam service-accounts create github-actions --display-name="Github Actions"
# add iam permissions to github actions service account
gcloud iam service-accounts add-iam-policy-binding github-actions@project-id.iam.gserviceaccount.com --member='serviceAccount:github-actions@project-id.iam.gserviceaccount.com' --role='roles/cloudbuild.builds.builder'
The execution fails on the last command with
ERROR: Policy modification failed. For a binding with condition, run "gcloud alpha iam policies lint-condition" to identify issues in condition.
ERROR: (gcloud.iam.service-accounts.add-iam-policy-binding) INVALID_ARGUMENT: Role roles/cloudbuild.builds.builder is not supported for this resource.
I don't know what that means or, more to the point, what I can do to solve it. I need that service account to have that role so I can start Cloud Build from my GitHub Actions pipeline with that service account.
This can be confusing. Service accounts have a "dual nature": they can be treated as resources and as identities, just not at the same time. See Managing Service Accounts.
You're attempting to add an IAM binding to a service account (github-actions@project-id.iam.gserviceaccount.com). In this case, the service account is a resource. The binding you're attempting to make references the same service account (github-actions@project-id.iam.gserviceaccount.com), this time as an identity. This is not possible: as the error says, that role is not supported on a service account resource.
You possibly (!?) want to grant the binding (as it is) to the project (!) not the Service Account:
PROJECT=[[YOUR-PROJECT]]
ACCOUNT=github-actions
EMAIL="${ACCOUNT}@${PROJECT}.iam.gserviceaccount.com"
gcloud projects add-iam-policy-binding ${PROJECT} \
--member="serviceAccount:${EMAIL}" \
--role='roles/cloudbuild.builds.builder'
But, please ensure that is your intent before issuing that command.
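One way to check the result afterwards is to list the roles the service account holds on the project. The sketch below assumes a placeholder project ID `my-project`; the helper `sa_email` and the `run` wrapper are hypothetical. It uses the `--flatten`, `--filter` and `--format` options of `gcloud projects get-iam-policy`. By default it only prints the command, and setting `DRY_RUN=0` executes it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a service account email from account and project ID.
sa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Dry run by default: print the command. Set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

PROJECT="my-project"      # placeholder project ID (assumption)
ACCOUNT="github-actions"
EMAIL="$(sa_email "$ACCOUNT" "$PROJECT")"

# List only the roles that are bound to this service account on the project.
run gcloud projects get-iam-policy "$PROJECT" \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:${EMAIL}" \
  --format="table(bindings.role)"
```

The printed line loses the shell quoting, so run the script with `DRY_RUN=0` rather than pasting the echoed text.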

GKE Workload Identity PermissionDenied

I am trying to use Google's preferred "Workload Identity" method to enable my GKE app to securely access secrets from Google Secrets.
I've completed the setup and even checked all steps in the Troubleshooting section (https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity?hl=sr-ba#troubleshooting) but I'm still getting the following error in my logs:
Unhandled exception. Grpc.Core.RpcException:
Status(StatusCode=PermissionDenied, Detail="Permission
'secretmanager.secrets.list' denied for resource
'projects/my-project' (or it may not exist).")
I figured the problem was due to the node pool not using the correct service account, so I recreated it, this time specifying the correct service account.
The service account has the following roles added:
Cloud Build Service Account
Kubernetes Engine Developer
Container Registry Service Agent
Secret Manager Secret Accessor
Secret Manager Viewer
The relevant source code for the package I am using to authenticate is as follows:
var data = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
var request = new ListSecretsRequest
{
    ParentAsProjectName = ProjectName.FromProject(projectName),
};
var secrets = secretManagerServiceClient.ListSecrets(request);
foreach (var secret in secrets)
{
    var value = secretManagerServiceClient.AccessSecretVersion($"{secret.Name}/versions/latest");
    string secretVal = this.manager.Load(value.Payload);
    string configKey = this.manager.GetKey(secret.SecretName);
    data.Add(configKey, secretVal);
}
Data = data;
Ref. https://github.com/jsukhabut/googledotnet
Am I missing a step in the process?
Any idea why Google is still saying "Permission 'secretmanager.secrets.list' denied for resource 'projects/my-project' (or it may not exist)"?
As @sethvargo mentioned in the comments, you need to map the service account to your pod, because Workload Identity doesn't use the underlying node identity. Instead, it maps a Kubernetes service account to a GCP service account. Everything in Workload Identity happens at the per-pod level.
Assign a Kubernetes service account to the application and configure it to act as a Google service account.
1. Create a GCP service account with the required permissions.
2. Create a Kubernetes service account.
3. Assign the Kubernetes service account permission to impersonate the GCP service account.
4. Run your workload as the Kubernetes service account.
Make sure you are using the project ID, not the project name, when referring to the project or secret.
You cannot change the service account of a pod that has already been created.
Refer to the link to add a service account to the pods.
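The four steps can be sketched as the commands below. Every name here (`my-project`, `default`, `my-ksa`, `my-gsa`) is a placeholder, and `wi_member` and `run` are hypothetical helpers. The roles are the two Secret Manager roles listed in the question. The script only prints the commands unless `DRY_RUN=0` is set, so treat it as an outline under those assumptions rather than a verified setup.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions): replace with your own.
PROJECT_ID="my-project"
NAMESPACE="default"
KSA_NAME="my-ksa"   # Kubernetes service account
GSA_NAME="my-gsa"   # Google (GCP) service account
GSA_EMAIL="${GSA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Hypothetical helper: IAM member string that identifies a Kubernetes
# service account under Workload Identity.
wi_member() {
  printf 'serviceAccount:%s.svc.id.goog[%s/%s]\n' "$1" "$2" "$3"
}

# Dry run by default: print each command. Set DRY_RUN=0 to execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# 1. Create a GCP service account and grant it the required roles.
run gcloud iam service-accounts create "$GSA_NAME" --project "$PROJECT_ID"
for role in roles/secretmanager.secretAccessor roles/secretmanager.viewer; do
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member "serviceAccount:${GSA_EMAIL}" --role "$role"
done

# 2. Create a Kubernetes service account.
run kubectl create serviceaccount "$KSA_NAME" --namespace "$NAMESPACE"

# 3. Allow the Kubernetes service account to impersonate the GCP one,
#    and record the mapping on the Kubernetes side with an annotation.
run gcloud iam service-accounts add-iam-policy-binding "$GSA_EMAIL" \
  --role roles/iam.workloadIdentityUser \
  --member "$(wi_member "$PROJECT_ID" "$NAMESPACE" "$KSA_NAME")"
run kubectl annotate serviceaccount "$KSA_NAME" --namespace "$NAMESPACE" \
  "iam.gke.io/gcp-service-account=${GSA_EMAIL}"

# 4. Run the workload as the Kubernetes service account by setting
#    spec.serviceAccountName to $KSA_NAME in the pod template, then redeploy.
```

Because a running pod keeps the service account it started with, step 4 takes effect only for pods created after the pod template is updated.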

gcloud confusion about set/get IAM policy for a service account

There are two commands that have confused me for some time:
gcloud iam service-accounts get-iam-policy
gcloud iam service-accounts set-iam-policy
According to --help, these two commands treat the service account as a resource. Most often I use a service account as an identity: for example, in a project, I set a policy that binds a role to the service account so that the service account can operate on something in that project.
Can someone please explain what attaching a policy to a service account is for? How does a service account act as a resource rather than an identity?
As explained in this part of the official documentation, Managing service accounts:
When thinking of a service account as a resource, you can grant roles to other users to access or manage that service account.
So treating it as a resource lets you manage who can use and control the service account. To give some more detail, as in this example here, the policies attached to a service account configure the level of access that different users have to it: as mentioned there, some users can have viewer access while others have editor access.
To summarize, attaching policies to a service account lets you set different levels of access and permissions for the users who can access that service account.
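As a small illustration of the resource view, the sketch below grants a user the Service Account User role on one specific service account and then reads back the policy attached to that account. The service account email and the user `alice@example.com` are placeholders, and `run` is a hypothetical dry-run wrapper that prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Placeholder values (assumptions): replace with your own.
SA_EMAIL="my-sa@my-project.iam.gserviceaccount.com"
USER_MEMBER="user:alice@example.com"

# Dry run by default: print each command. Set DRY_RUN=0 to execute.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

# Service account as a RESOURCE: the policy lives on the service account
# and says who may use it (here, alice may act as it).
run gcloud iam service-accounts add-iam-policy-binding "$SA_EMAIL" \
  --member "$USER_MEMBER" \
  --role roles/iam.serviceAccountUser

# Read the policy attached to the service account itself.
run gcloud iam service-accounts get-iam-policy "$SA_EMAIL"
```

By contrast, when the service account is used as an identity, it appears as the `--member` of a binding on some other resource, such as a project.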

OpenShift, how do I give myself cluster-admin?

I just started using OpenShift and have permissions problems. I am on the free trial for OpenShift 4.3.3 and cannot get my containers to run as root. I am the only user on my instance and I have admin, but it says I need cluster-admin to run the containers as root?
I tried running:
oc policy add-role-to-group cluster-admin anyuid
and that returned:
Error from server (Forbidden): rolebindings.rbac.authorization.k8s.io "cluster-admin" is forbidden: user "hustlin" (groups=["system:authenticated:oauth" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
{APIGroups:["*"], Resources:["*"], Verbs:["*"]}
{NonResourceURLs:["*"], Verbs:["*"]}
Going through OpenShift Online -> Administrator view -> User Management -> Roles -> cluster-admin -> Role Bindings, it states:
Restricted Access
You don't have access to this section due to cluster policy.
Error details
rolebindings.rbac.authorization.k8s.io is forbidden: User "hustlin" cannot list resource "rolebindings" in API group "rbac.authorization.k8s.io" at the cluster scope
I feel like it should not be this difficult for me to run a container as root. Just testing out OpenShift and I haven't been able to successfully run a single container on the platform, they all eventually go to CrashLoopBackOff.
Yes, I did try the:
oc login -u system:admin
command and it prompted me for my password before returning:
error: username system:admin is invalid for basic auth
I even tried following this guide from the OpenShift blog, but it would not recognize oadm.