Checkov scan particular folder or PR custom branch files - azure-devops

Trying to run Checkov (for IaC validation) via Azure DevOps YAML pipelines, for ARM template files stored in Azure DevOps version control. The code below:
trigger: none
pool:
  vmImage: ubuntu-latest
stages:
- stage: 'runCheckov'
  displayName: 'Checkov - Scan ARM files'
  jobs:
  - job: 'RunCheckov'
    displayName: 'Checkov solution'
    steps:
    - bash: |
        docker pull bridgecrew/checkov
      workingDirectory: $(System.DefaultWorkingDirectory)
      displayName: 'Pull bridgecrew/checkov image'
    - bash: |
        docker run \
          --volume $(pwd):/scripts bridgecrew/checkov \
          --directory /scripts \
          --output junitxml \
          --soft-fail > $(pwd)/CheckovReport.xml
      workingDirectory: $(System.DefaultWorkingDirectory)
      displayName: 'Run checkov'
    - task: PublishTestResults@2
      inputs:
        testRunTitle: 'Checkov run results'
        failTaskOnFailedTests: false
        testResultsFormat: 'JUnit'
        testResultsFiles: 'CheckovReport.xml'
        searchFolder: '$(System.DefaultWorkingDirectory)'
        mergeTestResults: false
        publishRunAttachments: true
      displayName: 'Publish Test results'
The problem: how do I change the path/folder of the ARM templates to scan? Currently it scans all ARM templates found anywhere in my repo (repo1), regardless of what directory value I set.
Also, how do I scan only the files committed to a custom branch during PR review? The PR should trigger the build, but the build should scan only the files in that custom branch. I know how to trigger the build via the DevOps repository settings, but how do I ensure the build pipeline scans only the particular PR commit files, not the whole repo1 (and master branch)?

I recommend using the Docker image bridgecrew/checkov to set up a container job to run the Checkov scan. A container job runs all of the job's tasks inside the Docker container started from this image.
In the container job, you can check out the source repository into the container, then use a script task (such as the Bash task) to run the Checkov CLI to scan the files. On the script task, you can use the 'workingDirectory' option to specify the path/folder in which the commands run. Normally, the commands will only act on files in the specified directory and its subdirectories.
If you want to scan only the files in a specific branch, you can clone/check out that branch to the working directory of the job in the container, then, as mentioned above, use the Checkov CLI to scan files under the specified directory.
[UPDATE]
In the pipeline job, you can try to call the Azure DevOps REST API "Commits - Get Changes" to get all the changed files and folders for the particular commit.
Then use the Checkov CLI with the parameter --directory (-d) or --file (-f) to scan the specified file or folder.
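As a rough sketch of the PR case, the step below narrows the scan to files changed by the pull request. It swaps the REST call for a plain git diff against the PR target branch, which is my own substitution and not what "Commits - Get Changes" returns. The templates/ folder is a hypothetical path, and the step assumes the run was triggered by a PR so that System.PullRequest.TargetBranch is set.

```yaml
steps:
- checkout: self
  fetchDepth: 0              # full history, so the target branch can be diffed
  persistCredentials: true   # lets the script below run git fetch
- bash: |
    set -euo pipefail
    # TARGET_BRANCH looks like refs/heads/master; strip the prefix.
    target="${TARGET_BRANCH#refs/heads/}"
    git fetch origin "$target"
    # Files changed by the PR (deleted files excluded), limited to the
    # hypothetical templates/ folder.
    mapfile -t changed < <(git diff --name-only --diff-filter=d \
      "origin/$target...HEAD" -- 'templates/*.json')
    if [ "${#changed[@]}" -eq 0 ]; then
      echo "No changed ARM templates to scan."
      exit 0
    fi
    # One --file argument per changed file, using paths inside the container.
    args=()
    for f in "${changed[@]}"; do
      args+=(--file "/scripts/$f")
    done
    docker run --volume "$(pwd)":/scripts bridgecrew/checkov \
      "${args[@]}" --output junitxml --soft-fail > CheckovReport.xml
  env:
    TARGET_BRANCH: $(System.PullRequest.TargetBranch)
  condition: eq(variables['Build.Reason'], 'PullRequest')
  workingDirectory: $(System.DefaultWorkingDirectory)
  displayName: 'Run checkov on PR changes'
```

To scan one folder rather than the changed files, the --file arguments could be replaced with a single --directory /scripts/templates (again a hypothetical path).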

Related

How do you copy azure repo folders to a folder on a VM in an Environment in a pipeline?

I have an Environment called 'Dev' that has a resource, which is a VM. As part of the 'Dev' pipeline I want to copy files from a specific folder on the develop branch of a specific repo to a specific folder on the VM that's on the Environment.
I've not worked with Environments before, or with YAML pipelines much, but I gather I need to use the CopyFiles@2 task.
So I've got an azure pipeline yaml file something like this:
variables:
  isDev: $[eq(variables['Build.SourceBranch'], 'refs/heads/develop')]
stages:
- stage: Build
  jobs:
  - job: Build
    pool:
      vmImage: 'windows-latest'
    steps:
    - task: CopyFiles@2
      displayName: 'Copy Files'
      inputs:
        contents: 'myFolder\**'
        Overwrite: true
        targetFolder: $(Build.ArtifactStagingDirectory)
    - task: PublishBuildArtifacts@1
      inputs:
        pathToPublish: $(Build.ArtifactStagingDirectory)
        artifactName: myArtifact
- stage: Deployment
  dependsOn: Build
  condition: and(succeeded(), eq(variables.isDev, true))
  jobs:
  - deployment: Deploy
    displayName: Deploy to Dev
    pool:
      vmImage: 'windows-latest'
    environment: Dev
    strategy:
      runOnce:
        deploy:
          steps:
          - script: echo Foo Bar
The first question is: how do I get this to copy the files to a specific path on the Dev environment?
Is the PublishBuildArtifacts really needed? The reason I ask is that I want this to copy files every time the pipeline is run and not error if the artifact already exists.
It also feels a bit dirty to have to check the branch is the correct branch this way. Is there a better way to do it?
The deployment strategy you're using relies on specifying an agent pool, which means it doesn't run on the machines in the environment. If you use a strategy such as rolling, it will run the specified steps on those machines automatically, including any download steps to download artifacts.
Ref: https://learn.microsoft.com/en-us/azure/devops/pipelines/process/deployment-jobs?view=azure-devops#deployment-strategies
You need to publish artifacts as part of the pipeline if you want them to be automatically available to down-stream jobs. Each run will get a different set of artifacts, even if the actual artifact contents are the same.
That said, based on the YAML you posted, you probably don't need to. In fact, you don't need the "build" stage at all. You could just add a checkout step during your rolling deployment, and the repo would be cloned on each of the target machines.
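A minimal sketch of that idea, assuming the 'Dev' environment has the VM registered as a virtual machine resource. The target folder C:\deploy\myFolder is a made-up path, and maxParallel: 1 is just one possible setting.

```yaml
stages:
- stage: Deployment
  jobs:
  - deployment: Deploy
    displayName: Deploy to Dev
    # No pool here: the job runs on the VM resources of the environment.
    environment:
      name: Dev
      resourceType: VirtualMachine
    strategy:
      rolling:
        maxParallel: 1
        deploy:
          steps:
          - checkout: self          # clone the repo on the target machine
          - task: CopyFiles@2
            displayName: 'Copy myFolder to the VM'
            inputs:
              SourceFolder: '$(Build.SourcesDirectory)/myFolder'
              Contents: '**'
              TargetFolder: 'C:\deploy\myFolder'   # hypothetical destination
              OverWrite: true
```

With this shape there is no artifact to publish, so nothing can clash between runs. Each run simply checks out the repo again and overwrites the target folder.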
Ok, worked this out with help from this article: https://dev.to/kenakamu/azure-devops-yaml-release-pipeline-trigger-when-build-pipeline-completed-54d5.
I've taken the advice from Daniel Mann regarding the strategy being 'rolling'. I then split my pipeline into 2 pipelines; 1 for building the artifacts and 1 for releasing (copying them).
If you want to download just particular folders instead of all the source files from the repository, you can try using the REST API "Items - Get" to download each folder individually.
GET https://dev.azure.com/{organization}/{project}/_apis/git/repositories/{repositoryId}/items?path={path}&download=true&$format=zip&versionDescriptor.version={versionDescriptor.version}&resolveLfs=true&api-version=6.0
For example, suppose the repository contains the folder 'res/TestFolder01' and, in the YAML pipeline, I just want to download the 'TestFolder01' folder from the main branch.
jobs:
- job: build
  . . .
  steps:
  - checkout: none # Do not check out all the source files.
  - task: Bash@3
    displayName: 'Download particular folder'
    env:
      SYSTEM_ACCESSTOKEN: $(System.AccessToken)
    inputs:
      targetType: inline
      script: |
        curl -X GET \
          -o TestFolder01.zip \
          -u :$SYSTEM_ACCESSTOKEN 'https://dev.azure.com/MyOrg/MyProj/_apis/git/repositories/ShellScripts/items?path=/res/TestFolder01&download=true&$format=zip&versionDescriptor.version=main&resolveLfs=true&api-version=6.0'
This will download the 'TestFolder01' folder as a ZIP file (TestFolder01.zip) into the current working directory. You can use the unzip command to decompress it.
[UPDATE]
If you download the particular folders in the deploy job that targets your VM environment, then yes, the folders will be downloaded into the pipeline working directory on the VM.
You can think of a VM-type environment resource as a self-hosted agent installed on the VM. So, when your deploy job targets the VM environment resource, it runs on the self-hosted agent on the VM.
The pipeline working directory is under the directory where you installed the VM environment resource (self-hosted agent). Normally, you can use the variable $(Pipeline.Workspace) to get this path.
stages:
- stage: Deployment
  jobs:
  - deployment: Deploy
    displayName: 'Deploy to Dev'
    environment: 'Dev.VM-01'
    strategy:
      runOnce:
        deploy:
          steps:
          - task: Bash@3
            displayName: 'Download particular folder'
            env:
              SYSTEM_ACCESSTOKEN: $(System.AccessToken)
            inputs:
              targetType: inline
              script: |
                echo "Current working directory: $PWD"
                curl -X GET \
                  -o TestFolder01.zip \
                  -u :$SYSTEM_ACCESSTOKEN 'https://dev.azure.com/MyOrg/MyProj/_apis/git/repositories/ShellScripts/items?path=/res/TestFolder01&download=true&$format=zip&versionDescriptor.version=main&resolveLfs=true&api-version=6.0'

Not found scriptPath in azure devops

I put a shell script file in a folder at my repo root and tried to run it in my DevOps pipeline, but it says that it cannot find the scriptPath:
[error]Not found scriptPath: /home/vsts/work/1/s/pipelines/databricks-cli-config.sh
I am simply creating a task to run the shell script, like this:
- task: ShellScript@2
  inputs:
    scriptPath: 'pipelines/databricks-cli-config.sh'
    args: '$(databricks_host) $(databricks_token)'
  displayName: "Install and configure the Databricks CLI"
Any idea?
Make sure you check out your code and that you are at the correct directory level. If you are in a regular job, add a working directory:
- task: ShellScript@2
  inputs:
    scriptPath: 'pipelines/databricks-cli-config.sh'
    args: '$(databricks_host) $(databricks_token)'
    cwd: '$(System.DefaultWorkingDirectory)'
  displayName: "Install and configure the Databricks CLI"
If you use it in a deployment job, the code is not checked out there by default. So you need to publish this script as an artifact and then download it in the deployment job (deployment jobs download artifacts by default), or add a
- checkout: self
step to download the code in the deployment job.
I assumed that you use YAML.
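For the deployment-job case, a sketch might look like the following. The job name ConfigureCli and the environment name Dev are placeholders I made up; the ShellScript@2 task is the one from the question.

```yaml
jobs:
- deployment: ConfigureCli        # hypothetical job name
  environment: Dev                # hypothetical environment name
  strategy:
    runOnce:
      deploy:
        steps:
        - checkout: self          # deployment jobs skip checkout unless asked
        - task: ShellScript@2
          inputs:
            scriptPath: 'pipelines/databricks-cli-config.sh'
            args: '$(databricks_host) $(databricks_token)'
          displayName: 'Install and configure the Databricks CLI'
```

Without the checkout step, the sources folder on the agent stays empty, which would explain a "Not found scriptPath" error for a path that does exist in the repo.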

Avoid git clean with Azure Devops self-hosted Build Agent

I have a YAML build script in an Azure hosted git repository which gets triggered across 7 build agents running on a local VM. Every time this runs, the build performs a git clean which takes a significant amount of time due to a large node_modules folder which takes a long time to clean up.
The MSDN page here seems to suggest this is configurable but shows no detail of how to configure it. I can't tell whether this is a setting that should be specified on the agent, the YAML script, within DevOps on the pipeline, or where.
Is there any other documentation I'm missing or is this not possible?
Update:
The start of the YAML file is here:
variables:
  BUILD_VERSION: 1.0.0.$(Build.BuildId)
  buildConfiguration: 'Release'
  process.clean: false
jobs:
#############################################################
###### 1 - Build and publish .NET
#############################################################
- job: net_build_publish
  displayName: .NET build and publish
  pool:
    name: default
  steps:
  - script: echo $(BUILD_VERSION)
  - task: DotNetCoreCLI@2
    displayName: dotnet build $(buildConfiguration)
    inputs:
      command: 'build'
      projects: |
        myrepo/**/API/*.csproj
      arguments: '-c $(buildConfiguration) /p:Version=$(BUILD_VERSION)'
The complete yaml is a lot longer, but the output from the first job includes this output in a Checkout task
Starting: Checkout myrepo@master to s
==============================================================================
Task : Get sources
Description : Get sources from a repository. Supports Git, TfsVC, and SVN repositories.
Version : 1.0.0
Author : Microsoft
Help : [More Information](https://go.microsoft.com/fwlink/?LinkId=798199)
==============================================================================
Syncing repository: myrepo (Git)
Prepending Path environment variable with directory containing 'git.exe'.
git version
git version 2.26.2.windows.1
git lfs version
git-lfs/2.11.0 (GitHub; windows amd64; go 1.14.2; git 48b28d97)
git config --get remote.origin.url
git clean -ffdx
Removing myrepo/Data/Core/API/bin/
Removing myrepo/Data/Core/API/customersettings.json
Removing myrepo/Data/Core/API/obj/
Removing myrepo/Data/Core/Shared/bin/
Removing myrepo/Data/Core/Shared/obj/
....
We have another job further down which runs npm install and npm build for an Angular project, and every build in the pipeline is taking 5 minutes to perform the npm install step, possibly because of this git clean when retrieving the repository?
Click on your pipeline to show the run history
Click Edit
Click the 3 dot kebab menu
Click Triggers
Click YAML
Click Get Sources
Set Clean to False and Save
To say this is obfuscated is an understatement!
I can't say what effect this will have, though. I think the agent reuses the same folder each time a pipeline runs, and I'm not a Node.js developer, so I don't know what leaving old node_modules hanging around will do!
P.S. I don't think pipeline caching, which other people suggested, is what you were asking about. Pipeline caching zips up the cached folder, uploads it to your artifacts storage, and then downloads it on each run. If you only have one build agent, skipping git clean might actually be more efficient, but I'm not 100% sure.
As I mentioned, you need to calculate the hash of package-lock.json before you run npm install. If the hash is the same as the one stored next to node_modules, you can skip installing dependencies. This may help you achieve that:
steps:
- task: PowerShell@2
  displayName: 'Calculate and save package-lock.json hash'
  inputs:
    targetType: 'inline'
    pwsh: true
    script: |
      # generates a hash of package-lock.json
      $newHash = (Get-FileHash -Algorithm MD5 -Path package-lock.json).Hash
      $hashPath = "$(System.DefaultWorkingDirectory)/cache-npm/hash.txt"
      # read the previously stored hash, if there is one
      $storedHash = if (Test-Path -Path $hashPath) { Get-Content $hashPath -TotalCount 1 } else { $null }
      if ($storedHash -eq $newHash) {
        # hashes match, so node_modules is current and npm install can be skipped
        Write-Host "no need to install node_modules"
        Write-Host "##vso[task.setvariable variable=NodeModulesAreUpToDate;]true"
      } else {
        # first run, or package-lock.json changed: store the new hash
        $newHash > $hashPath
        Write-Host ("Hash File saved to " + $hashPath)
      }
      Write-Host $newHash
    workingDirectory: '$(System.DefaultWorkingDirectory)/cache-npm'
- script: npm install
  workingDirectory: '$(Build.SourcesDirectory)/cache-npm'
  condition: and(succeeded(), ne(variables['NodeModulesAreUpToDate'], 'true'))
git clean -ffdx will clean any change untracked by source control in the source. You may try Pipeline caching, which can help reduce build time by allowing the outputs or downloaded dependencies from one run to be reused in later runs, thereby reducing or avoiding the cost to recreate or redownload the same files again. Check the following link:
https://learn.microsoft.com/en-us/azure/devops/pipelines/release/caching?view=azure-devops#nodejsnpm
variables:
  npm_config_cache: $(Pipeline.Workspace)/.npm
steps:
- task: Cache@2
  inputs:
    key: 'npm | "$(Agent.OS)" | package-lock.json'
    restoreKeys: |
      npm | "$(Agent.OS)"
    path: $(npm_config_cache)
  displayName: Cache npm
The checkout step lets us set the boolean option clean to true or false. The default is true, so git clean runs by default.
Below is a minimal example with clean set to false.
jobs:
- job: Build_Job
  timeoutInMinutes: 0
  pool: 'PoolOne'
  steps:
  - checkout: self
    clean: false
    submodules: recursive
  - task: PowerShell@2
    displayName: Make build
    inputs:
      targetType: 'inline'
      script: |
        bash -c 'make'
More documentation and related options can be found here

How to keep secure files after a job finishes in Azure Devops Pipeline?

Currently I'm working on a pipeline script for Azure DevOps. I want to provide a Maven settings file as a secure file for the pipeline. The problem is that when I define a job only for providing the file, the file isn't there anymore when the next job starts.
I tried to define a job with a DownloadSecureFile task and a copy command to get the settings file. But when the next job starts the file isn't there anymore and therefore can't be used.
I already checked that by using pwd and ls in the pipeline.
This is part of my current YAML file (that actually works):
some variables
...
trigger:
  branches:
    include:
    - stable
    - master
jobs:
- job: Latest_Release
  condition: eq(variables['Build.SourceBranchName'], 'master')
  steps:
  - task: DownloadSecureFile@1
    name: settingsxml
    displayName: Download maven settings xml
    inputs:
      secureFile: settings.xml
  - script: |
      cp $(settingsxml.secureFilePath) ./settings.xml
      docker login -u $(AzureRegistryUser) -p $(AzureRegistryPassword) $(AzureRegistryUrl)
      docker build -t $(AzureRegistryUrl)/$(projectName):$(projectVersionNumber-Latest) .
      docker push $(AzureRegistryUrl)/$(projectName):$(projectVersionNumber-Latest)
....
other jobs
I wanted to put the DownloadSecureFile task and "cp $(settingsxml.secureFilePath) ./settings.xml" into a job of their own, because more jobs need this file for other branches/releases and I don't want to copy the exact same code into every job.
This is the YAML file as I wanted it:
some variables
...
trigger:
  branches:
    include:
    - stable
    - master
jobs:
- job: provide_maven_settings
  # no condition because all branches need the file
  steps:
  - task: DownloadSecureFile@1
    name: settingsxml
    displayName: Download maven settings xml
    inputs:
      secureFile: settings.xml
  - script: |
      cp $(settingsxml.secureFilePath) ./settings.xml
- job: Latest_Release
  condition: eq(variables['Build.SourceBranchName'], 'master')
  steps:
  - script: |
      docker login -u $(AzureRegistryUser) -p $(AzureRegistryPassword) $(AzureRegistryUrl)
      docker build -t $(AzureRegistryUrl)/$(projectName):$(projectVersionNumber-Latest) .
      docker push $(AzureRegistryUrl)/$(projectName):$(projectVersionNumber-Latest)
....
other jobs
In my dockerfile the settings file is used like this:
FROM maven:3.6.1-jdk-8-alpine AS MAVEN_TOOL_CHAIN
COPY pom.xml /tmp/
COPY src /tmp/src/
# can't find file when executing this
COPY settings.xml /root/.m2/
WORKDIR /tmp/
RUN mvn install
...
The error happens, when docker build is started, because it can't find the settings file. It can though, when I use my first YAML example. I have a feeling that it has something to do with each job having a "Checkout" phase, but I'm not sure about that.
Each job in Azure DevOps can run on a different agent. When you use Microsoft-hosted agents and split the pipeline into several jobs, a secure file copied in one job is gone in the next, because the second job runs on a fresh agent that doesn't have the file.
You can solve this by using a self-hosted agent (the file is copied to your machine, and the second job runs on the same machine, as long as both jobs land on the same agent).
Or you can upload the file somewhere else (secured) and download it in the second job (but then why not do that from the start...).
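Another option, not mentioned above, is to keep the download inside each job but write it only once as a steps template. This is a sketch under assumptions: the file name templates/maven-settings.yml is hypothetical, and the template has to live in the same repository as the pipeline.

```yaml
# templates/maven-settings.yml (hypothetical file name)
steps:
- task: DownloadSecureFile@1
  name: settingsxml
  displayName: Download maven settings xml
  inputs:
    secureFile: settings.xml
- script: cp $(settingsxml.secureFilePath) ./settings.xml
  displayName: Copy settings.xml next to the Dockerfile
```

Each job that needs the file then pulls in the template as its first step:

```yaml
jobs:
- job: Latest_Release
  condition: eq(variables['Build.SourceBranchName'], 'master')
  steps:
  - template: templates/maven-settings.yml
  - script: |
      docker login -u $(AzureRegistryUser) -p $(AzureRegistryPassword) $(AzureRegistryUrl)
      docker build -t $(AzureRegistryUrl)/$(projectName):$(projectVersionNumber-Latest) .
      docker push $(AzureRegistryUrl)/$(projectName):$(projectVersionNumber-Latest)
```

The file is still downloaded once per job, but the steps are written in one place, and each job has settings.xml on its own agent before docker build runs.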

How to have conditional image tags on azure devops build yaml pipeline?

For several environments, I need different Docker image tagging policies: dev and release should use the 'latest' tag, while production should have a proper version tag.
I am currently using a single YAML file for all Azure DevOps build pipelines, and I want the image tagging mode to be defined as a per-build variable (let's say it is called $(Versioned)).
The build step is shown below:
steps:
- bash: docker push $(imageFullName):latest
  displayName: 'docker push'
So is there any way to have an IF statement or other conditional operation here?
For example:
steps:
- bash: docker push $(imageFullName):IF($(Versioned), $(Build.BuildNumber), latest)
  displayName: 'docker push'
You can maybe do this with something like this:
steps:
- bash: docker push $(imageFullName):$(Build.BuildNumber)
  displayName: 'docker push (versioned)'
  condition: and(succeeded(), eq(variables['Versioned'], 'true'))
- bash: docker push $(imageFullName):latest
  displayName: 'docker push (latest)'
  condition: and(succeeded(), ne(variables['Versioned'], 'true'))
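An alternative sketch keeps a single step and picks the tag inside the script. It assumes the Versioned variable holds the literal string true for production builds, and that the image has already been tagged with whichever tag is pushed.

```yaml
steps:
- bash: |
    # Choose the tag from the Versioned pipeline variable.
    if [ "$VERSIONED" = "true" ]; then
      tag="$(Build.BuildNumber)"
    else
      tag="latest"
    fi
    docker push "$(imageFullName):$tag"
  env:
    VERSIONED: $(Versioned)
  displayName: 'docker push'
```

Compared with two conditional steps, this keeps the push in one place, at the cost of hiding the branch inside the script rather than in the pipeline's step list.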