Split tests file when using github actions parallel - github

I am actually using circle ci to launch my behat tests through docker.
I want to migrate to github actions for cost reasons but i can't find any clue on how i can split my files on github actions when using parallel build.
Here is the command i am using to split my files on circleCI: circleci tests glob "features/**/*.feature" | circleci tests split --split-by=filesize.
Is there any existing command to do it appart from using knapsack pro which i can't use.
Thanks!

Try using split_tests, a tool similar to circleci tests glob.
It has the same set of features: split files according to filename, lines count or timing of a previous test result.
You'll probably need to configure your Github-Actions ci.yml to:
Download a binary release of split_tests
Create a matrix with a number of nodes,
For each node, run the tests using the split_tests partial files list,
(Optionally) Save the JUnit test report to a cache, to re-use it in the next run for timing.

Related

Why choose github action when we can just run bash script in github workflow?

Just completed a GitHub workflow using more of them are actions, but also with one bash script.
When writing the workflow, it seems much quicker use bash script than actions.(since some actions are just do one thing. ) Why are the some reasons that we just need GitHub actions rather than bash script or python script trigger?
Or we are just supposed to use script languages for most part, then use GitHub actions for small portion of the whole workflow?
Interesting but not easy to answer with more information about what your goal is. The right answer might depend on your use case.
I have not used GitHub actions yet. Let me try to explain it anyway, starting pretty high level. Unfortunately, there's no option to add a table of contents ;) Please let me know if this helps.
1. What are GitHub Actions for?
From this "What is GitHub Actions? Benefits and examples" PDF file
GitHub Actions is a CI/CD tool for the GitHub flow. You can use it to integrate and deploy code changes to a third-party cloud application platform as well as test, track, and manage code changes. GitHub Actions also supports third-party CI/CD tools, the container platform Docker, and other automation platforms.
From docs.github.com
GitHub Actions is a continuous integration and continuous delivery (CI/CD) platform that allows you to automate your build, test, and deployment pipeline. You can create workflows that build and test every pull request to your repository or deploy merged pull requests to production. [...]
GitHub Actions goes beyond just DevOps and lets you run workflows when other events happen in your repository.
2. Continuous Integration/Continuous Deployment (CI/CD)
Usually, people run CI/CD tools to build, deploy, test, and run other tasks while doing that. We use another 3rd party CI/CD pipeline using Rake to build, test, and check links. Our pipeline invokes these small scripts you mention.
3. GitHub actions and scripts
From Essential features of GitHub Actions
If your job generates files that you want to share with another job in the same workflow, or if you want to save the files for later reference, you can store them in GitHub as artifacts. Artifacts are the files created when you build and test your code. For example, artifacts might include binary or package files, test results, screenshots, or log files. Artifacts are associated with the workflow run where they were created and can be used by another job. All actions and workflows called within a run have write access to that run's artifacts.
Here's the key point, I guess. You can really do a lot of crazy stuff within a workflow. All is related/specific to GitHub. Workflows are event-driven, meaning that you can run a series of commands after a specified event has occurred. For example, every time someone creates a pull request, you can automatically run a command that executes a test or other script.
4. GitHub action workflow and scripts
You can include different scripts in your workflow, e.g. using
Javascript: https://github.com/actions/github-script
Python: https://github.com/marketplace/actions/run-python-script
5. (Complex) Examples
You can check out the repository for docs.github.com for some more complex examples, see action-scripts and workflow folders. GitHub themselves seems to use it pretty heavily.
6. Advantages/Disadvantages of GitHub actions
OR: Differences to other CI tools
It took some time to find something not marketing-ish. Key points are:
beginner-friendly using YAML config files
no need to set up your own CI pipeline
You can check out this SO post from 2019 for a list of what's good and bad about GitHub actions.
In short - for readability and the DRY ("Don't repeat yourself") principle.
It's more or less the same as using functions in programming.
I can agree that some trivial actions are useless.
But "actions/checkout" for example is priceless!

Remove build from PR in Bitbucket using API

i have a project with two submodules - one is for the source code(A) and the second one contains end to end tests(B). The problem is that the build of the source code is successful, but e2e tests are failing and this does not allow me to merge. Is there any way bitbucket to get result only from A submodule and to skip the result from submodule B? Is there any way to say which result to be used as validation?
I believe you can disable the e2e test from your plan spec file. Or you can disable the test scan from the UI as well from under the tasks tab in the job stage. But it would be good to investigate why your tests are failing.

Deploy build files from continuous integration

I am working on a project with multiple people, a website application which requires webpack to be built, uglified, concatenated into a few files e.g. app.min.js, style.min.css etc. - As a result of this, in an effort to prevent merge conflicts we recently added the build folder to .gitignore, under the assumption that we would be able to build during deployment.
When pushing to the Master branch, we automatically "deploy" through Semaphore CI (similar to Travis) which runs composer install, npm install, and finally "npm run build" which triggers the webpack build. This is all built and then tested on the CI side of things, and then Semaphore automatically deploys to Amazon's Elastic Beanstalk where our application is hosted.
The problem with this is, it seems Semaphore doesn't upload the build it's just tested, but rather the Master branch itself which has no built JS or CSS. I'm wondering if there's a way to push these built files to deployment as well, or if running the entire build process AGAIN on Elastic Beanstalk is the only route. It seems unnecessary to have to do that process essentially 3 times, locally, CI, and then deployment. Every time a step like this is needed on EB the actual re-instantiation time gets longer, which I'd like to keep as short as possible.
Obviously if building it a 3rd time on EB is the only way to go about this then I'll have to, just wondering if there are better solutions for this whole workflow.
I haven't worked with Semaphore CI, but you might be able to use an .ebignore file.
If you create one, the cli will use that instead of your .gitignore file.
I find in some deployment situations you want the inverse of your .gitignore (all compiled, no src). It essentially lets you pick the files from your project directory that you want to deploy, in the same way as the .gitignore file.
Edit: I just noticed the documentation on aws is lacking. It only mentions file exclusion, but you can include files too.
Edit 2: I don't think Semaphore supports the use of .ebignore, so right now this solution isn't of any use. :(
I just had a great first experience with https://deploybot.com/. The can deploy directly to elastic beanstalk. It might be interesting or you.

Build Flow vs Build Pipeline

I'm trying to split up a few Jenkins jobs using the Build Flow plugin so that instead of three monolithic jobs, we have three "starting points" that then use the DSL to trigger downstream jobs. I chose Build Flow over the Build Pipeline plugin because it seemed like it was a lot harder to share jobs between different pipelines ( ie, sharing the workspace of the multiple starting jobs with a single compile job ).
Previously, I had three jobs set up: Project-PR, Project-DEV, and Project-PROD.
Project-PR would build whenever a pull request happened in GitHub, and would just run a smaller subset of our unit tests, so that we could get quick verification that the PR is okay to merge.
Project-DEV would build whenever a feature branch was merged in GitHub into the main development branch, as well as having the ability to be manually triggered and given a different branch to pull. It would run the full suite of unit -- basically a sanity check that everything is still good. Then it would compile and minify, and push to a QA environment for testing, and then it would run the full suite of integration tests against that QA environment. This step was configured as a parametrized build, with the parameter being the name of the branch to pull, test, and push. It would push to and set up QA environment specific to that branch, so that we could QA multiple features without having to merge to development ( ie, feature-one.qa.example.com, feature-two.qa.example.com ).
Project-PROD would only ever be manually triggered, and would do the full unit and integration test suite, compile and minify the front-end code ( Less, JS, and CSS ), and push the built code into a special "release branch" in GitHub that can then be deployed -- we haven't quite reached the point of Jenkins being in charge of deployment.
Now, what I wanted to set up was to split the subtasks into their own jobs, so that it'd be easy to set up new jobs to without having to copy and paste all the build steps ( or copying the job and changing all the things that need to be unique ). This would let us do things like create a copy of the Project-DEV, but switch out the last job for one that deploys to a staging environment set up in the cloud. Or easily create a job that could report test results to a third party source, ie copy the results to a shared network folder or something. Or any number of things. The goal is basically to use these subtask jobs as building blocks to let us build more complicated jobs, while also making it easier to update how one portion of the build works ( for example, maybe we switch to a different technology for compiling, which might change how Jenkins would compile the code ).
For example, the Project-PR would be split into the following:
Project-PULL -> Project-SetupBuildEnv -> Project-PartialUnitTests
(BuildFlow) (Normal Job) (Normal Job)
The SetupBuildEnv would just pull down any NPM or Composer requirements, and set up the directories required for testing and building. PartialUnitTests then run, and report it's results back up to the
The Project-DEV could be split up like so:
Project-DEV -> Project->SetupBuildEnv -> Project-FullUnitTests -> Project-Compile -> Project-Minify -> Project-DeployQA -> Project-FullIntegrationTests
This way, the parts of the build process that are shared ( in this case, Project-SetupBuildEnv ) can be easily shared between jobs, reducing duplication, and making it easier to update a step in the build process without having to remember EVERY job that uses that step.
Right now, I'm using the Shared Workspace plugin so that all the steps use the same workspace. However, I'm running into an issue with that: it's not actually using one workspace. What's happening is that the Build Flow job will get a directory ( eg: /sharedspace/shared_one ), and download the code from GitHub into there. Then it will trigger the DSL, which starts up the 'SetupBuildEnv' job. But instead of working inside the same directory, it will get a directory with a name like "/sharedspace/shared_one#2", and run the build setup task in there. Then when it goes to do the third step ( unit testing ), it fails, because now it's got a third directory ( /sharedspace/shared_one#3 ), but that directory didn't have the setup run, so the required node and composer modules are missing. What's weird is that it looks like the Shared Workspace plugin is copying the first shared workspace to another directory and incrementing a counter ( the #N part of the directory name ) and giving that to the other jobs to work in.
So, question time:
is there a way to fix the Shared Workspace plugin so that it's actually only using one directory for each job?
if not, is it possible to have the Clone Workspace plugin take an argument, so I can specify what archived workspace to use instead of using the dropdown?
another possiblity: would using the shared workspace plugin, but use the "Local subdirectory for repo (optional)" option in the advanced git job options to specify the directory to use?
failing all that, is there some other way to set up a build pipeline that can share jobs with other pipelines that I've missed out on?
In my experience, even if you do get this working, this might not be a scalable way to go longer term. We've found the shared workspace plugin entirely a bad idea for long / complex builds (similar reasons to yours - but also: scaling across dozens of slaves becomes hard suddenly). Arguably the idea is slightly against the spirit of modern scalable CI.
I'd instead delegate more to your build tools, be they Maven / Gradle, Ant, even Grunt, whatever. If you want to keep these builds truly modular, but can't afford to rebuild at each step (we decided full independence was worth wasting a few minutes per build) then perhaps look at creating useful artefacts at key stages - in your case, minified assets TARs, library JARs, or maybe webjars, or whatever, and deploy them to a (Maven?) repo.
Later build steps in your pipeline can quickly, easily, and repeatably pull the latest (or named version) assets from this centralised repo, and continue with the build process.
An alternative (with similarities) is to build one or more assets, but only promote them after increasing numbers of tests are run, which can be done in separate builds coordinated by your build flow, using the Promoted Builds plugin etc.

How to scale buildbot in a company

I've been looking into buildbot lately, and the lack of good documentation and sample configurations makes it hard to understand how buildbot is commonly used.
According to the buildbot manual, each buildmaster is responsible for 1 code base. That means that a company that wants to use buildbot on, say, 10 projects, needs to maintain 10 different sets of buildbot installations (master-slaves configurations, open ports, websites with output etc.). Is this really the way things are done? Am I missing an option that creates a mash-up that is easy to maintain and monitor?
Thanks!
At my place of work we use Buildbot to test a single program over several architectures and versions of Python. I use one build master to oversee about 16 slaves. Each set of slaves pulls from a different repo and tests it against Python 2.X.
From my experience, it would be easy to configure a single build master to run a mash-up of projects. This might not be a good idea because the waterfall page (where the build slaves report results) can get very congested with more than a few slaves. If you are comfortable scrolling through a long waterfall page, then this will not be an issue.
EDIT:
The update command in master.cfg:
test_python26_linux.addStep(ShellCommand, name = "update pygr",
command = ["/u/opierce/PygrBuildBot/update.sh","000-buildbot","ctb"], workdir=".")
000-buildbot and ctb are additional parameters to specify which branch and repo to pull from to get the information. The script update.sh is something I wrote to avoid an unrelated git problem. If you wanted to run different projects, you could write something like:
builder1.addStep(ShellCommand, name = "update project 1",
command = ["git","pull","git://github.com/your_id/project1.git"], workdir=".")
(the rest of builder1 steps)
builder2.addStep(ShellCommand, name = "update project 2",
command = ["git","pull","git://github.com/your_id/project2.git"], workdir=".")
(the rest of builder2 steps)
The two projects don't have to be related. Buildbot creates a directory for each builder and runs all the steps in that directory.
FYI, BuildBot 0.8.x supports several repositories on one master, simplifying things a bit.