I am interested in understanding whether Bazel can handle "two-stage builds", where dependencies are discovered based on the file contents and dependencies must be compiled before the code that depends on them (unlike C/C++, where dependencies are mostly header files that are not separately compiled). Concretely, I am building the Coq language, which is like OCaml.
My intuition for creating a build plan would use an (existing) tool (called coqdep) that reads a .v file and returns a list of all of its direct dependencies. Here's the algorithm that I have in mind:
1. invoke coqdep on the target file and (transitively) on each of its dependent files,
2. once transitive dependencies for a target are computed, add a rule to build the .vo from the .v that includes the transitive dependencies.
Ideally, the calls to coqdep (in step 1) would be cached between builds and so only need to be re-computed when the file changes. And the transitive closure of the dependency information would also be cached.
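To make steps 1 and 2 concrete, here is a rough, Bazel-agnostic sketch of the dependency-discovery part (written in Scala purely for illustration; it assumes a simplified version of coqdep's makefile-style `Foo.vo: Foo.v Bar.vo` output and ignores load paths and cycles):

import scala.sys.process._
import scala.collection.mutable

// Run coqdep on a single .v file and extract its direct dependencies.
// Assumes (simplified) output lines of the form "Foo.vo: Foo.v Bar.vo".
def directDeps(vFile: String): Seq[String] =
  Seq("coqdep", vFile).!!
    .linesIterator
    .flatMap(_.split(":") match {
      case Array(_, deps) => deps.trim.split("\\s+").filter(_.endsWith(".vo"))
      case _              => Array.empty[String]
    })
    .map(_.stripSuffix(".vo") + ".v")   // a .vo dependency means: build that .v first
    .toSeq

// Transitive closure, memoizing the per-file coqdep calls
// (this is the part that would ideally be cached between builds).
def transitiveDeps(vFile: String,
                   cache: mutable.Map[String, Seq[String]] = mutable.Map.empty): Set[String] = {
  val direct = cache.getOrElseUpdate(vFile, directDeps(vFile))
  direct.toSet ++ direct.flatMap(d => transitiveDeps(d, cache))
}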
Is it possible to implement this in Bazel? Are there any pointers to setting up builds for languages like this? Naively, it seems to be a two-stage build and I'm not sure how this fits into Bazel's compilation model. When I looked at the rules for OCaml, it seemed like they were relying on ocamlbuild to satisfy the build order and dependency requirements rather than doing it "natively" in Bazel.
Thanks for any pointers or insights.
(don't have enough rep to comment yet, so this is an answer)
#2 of Toraxis' answer is probably the most canonical.
gazelle is an example of this for Golang, which is in the same boat: dependencies for Golang files are determined outside a Bazel context by reading the import statements of source files. gazelle is a tool that writes/rewrites Golang rules in BUILD files according to the imports in source files of the Bazel workspace. Similar tools could be created for other languages that follow this pattern.
Toraxis' answer notes that "the generated BUILD file will be in the output folder, not in the source folder. So you also have to provide an executable that copies the files back into the source folder."
Note that binaries run via bazel run have the environment variable BUILD_WORKSPACE_DIRECTORY set to the root of the Bazel workspace (see the docs) so if your tool uses this environment variable, it could edit the BUILD files in-place rather than generating and copying back.
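For example (only a sketch: the actual BUILD-file rewriting is elided, and the package path is made up), such a tool could locate the workspace root like this:

import java.nio.file.{Files, Paths}

// Sketch of a tool meant to be invoked via `bazel run`.
// `bazel run` sets BUILD_WORKSPACE_DIRECTORY to the workspace root, so the tool
// can edit BUILD files in the source tree rather than in the output tree.
object UpdateBuildFiles {
  def main(args: Array[String]): Unit = {
    val workspace = sys.env.getOrElse("BUILD_WORKSPACE_DIRECTORY",
      sys.error("BUILD_WORKSPACE_DIRECTORY not set; run this via `bazel run`"))
    val buildFile = Paths.get(workspace, "theories", "BUILD.bazel") // example path
    // ... compute new rule definitions (e.g. from coqdep output) ...
    val newContents = "# generated rules would go here\n"           // placeholder
    Files.write(buildFile, newContents.getBytes("UTF-8"))
  }
}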
(In fact, the generating-and-copying-back strategy would likely not be feasible, because purely-generated files would contain only Coq rules, and not any other types of rules. To generate a BUILD file with Coq rules from one with other types of rules, one would have to add the BUILD files themselves as dependencies - which would create quite the mess!)
I'm looking into similar questions because I want to build ReasonML with Bazel.
Bazel computes the dependencies between Bazel targets based on the BUILD files in your repository without accessing your source files. The only interaction you can do with the file system during this analysis phase is to list directory contents by using glob in your rule invocations.
Currently, I see four options for getting fine-grained incremental builds with Bazel:
1. Spell out the fine-grained dependencies in hand-written BUILD files.
2. Use a tool for generating the BUILD files. You cannot directly wrap that tool in a Bazel rule to have it run during bazel build because the generated BUILD file would be in the output folder, not in the source folder. But you can run rules that call coqdep during the build, and provide an executable that edits the BUILD file in the source folder based on the (cacheable) result of the coqdep calls. Since you can read both the source and the output folder during the build, you could even print a message to the user if they have to run the executable again. Anyway, the full build process would be bazel run //tools/update-coq-build-files && bazel build to reach a fixed point.
3. Have coarse-grained dependencies in the BUILD files but use persistent workers to incrementally rebuild individual targets.
4. Have coarse-grained dependencies in the BUILD files but generate a separate action for each target file and use the unused_inputs_list argument of ctx.actions.run to communicate to Bazel which dependencies were actually unused.
I'm not really sure whether 3 and 4 would actually work or how much effort would be involved, though.
I'm trying to import a C project into Eclipse (CDT) that is managed by waf. There is a list of predefines generated by waf (when running ./waf configure). That list has to be imported into Project->Properties->C/C++ General/Paths and Symbols/Symbols/GNU C so that the indexer knows about them and does not print errors. That list (when using the GUI) is stored in the .cproject file. I created a Build Target that runs ./waf configure and stores the list in a file named DEFINES.txt. How do I automatically update the list in .cproject with the values from DEFINES.txt after running the Build Target?
I thought about the following solutions and their follow-up problems:
Solution: Writing a plug-in.
Problem: What is the appropriate extension point?
Solution: Writing an external program that calls ./waf configure, reads DEFINES.txt, and writes the list to .cproject. That program replaces the old Build Target.
Problem: How safe is this? Am I allowed to change the .cproject file by an external program without causing any problems?
Solution: Implementing the .cproject updating algorithm in wscript file.
Problem: This is not a solution for me, because the project is also used by others who do not use Eclipse as their IDE. The modified wscript would cause errors when those developers build the project.
Does anybody have better ideas or some advice?
Here is how to go about it:
Writing a plug-in: What I recommend is writing an extension to the LanguageSettingsProvider. The FAQ has some more info, but in summary, the extension point description says:
This extension point is used to contribute a new Language Settings Provider. A Language Settings Provider is used to get additions to compiler options such as include paths (-I) or preprocessor defines (-D) and others into the project model.
CMake has an option to generate .cproject as part of its configure stage, so you could do something similar. See the CMake Wiki for inspiration, but the summary is that you don't store the .cproject/.project files in source control and have CMake (or waf in your case) generate the IDE-specific files.
You could also just pick up the build settings using the build output parser and ignore the DEFINES.txt altogether. That requires running the build once from within Eclipse for CDT to see all the commands, and requires the commands to be parseable in the build output.
Newcomer to the IntelliJ IDE here, with no Java background. I've looked at Build Definition to get a brief idea of how I should organize my Scala files, but their example doesn't cover the full structure of an sbt-based project shown attached.
Can you advise what each folder should be used for (e.g. where my source files should go) and also point me to resources where I can read up more?
Thanks so much
It is described pretty well here:
http://www.scala-sbt.org/0.13.5/docs/Getting-Started/Directories.html
But to sum up:
.idea:
This contains the project files for your IDEA project and has nothing directly to do with sbt itself. However, IDEA (if auto-refresh is enabled) updates its own project each time the sbt build files change.
project:
This contains the sbt project files, except for the main build file (files ending in .sbt). The sbt build is itself based on Scala, and if you need some Scala code included in your build (e.g., code generation/meta-programming, pre-compiler macros), you can place Scala source files in this directory. The code in these files can be used by your build system, but is not part of your project itself.
To really understand how a build is made, you need to understand how the .sbt files and the Scala files for the build are located. When you run sbt, it searches for .sbt files in the directory you are standing in; once these are found, it searches for Scala files in the project directory. Together these files are the source of the build system, but because they are source files, they need to be built before they can be used. To build this build system, sbt uses sbt, so a build system to build the build system is needed. It therefore looks for .sbt files inside the project directory, and Scala files for that build inside project/project, and builds these files to get a build system that can build the build system (that can build your project). This can continue recursively down any project/project/project... chain, until sbt finds a project folder containing no Scala files, which therefore needs no building before use.
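As a small, purely illustrative example of such build-definition code, a file like project/Dependencies.scala could define values that build.sbt then uses (the names and versions below are just examples):

// project/Dependencies.scala -- Scala code that belongs to the build definition,
// not to your project itself.
import sbt._

object Dependencies {
  val scalaTest = "org.scalatest" %% "scalatest" % "2.2.4" % "test"
}

// build.sbt can then refer to it:
//   libraryDependencies += Dependencies.scalaTest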
The target folder inside project is the target folder for the sbt build of your build definition; see below for what a target folder is.
Normally you would not need to be concerned about this; just remember that build.sbt in your root directory is the build script for your project. project/plugins.sbt defines plugins activated for your build system, and project/build.properties contains special sbt properties. Currently the only sbt property I know of is which version of sbt should be used.
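For reference, project/build.properties usually contains a single line such as sbt.version=0.13.5, and project/plugins.sbt declares plugins with addSbtPlugin (the plugin below is only an example):

// project/plugins.sbt -- example only; add whatever plugins your build needs
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.5.0")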
src:
This is where you place the source files of your project. You should place any Java sources in src/main/java and Scala sources in src/main/scala. Resources are placed in src/main/resources.
The src/main/scala_2.11 folder is typically used if you have some code that is not binary compatible with different versions of Scala. In such cases you can configure sbt to use different source files when building for different versions of Scala. You probably do not need this, so I would advise just deleting the src/main/scala_2.11 folder.
Test sources are placed inside src/test/java and src/test/scala, and test resources are placed in src/test/resources.
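Putting it together, a typical layout (for illustration only) looks like this:

src/
  main/
    java/        <- Java sources
    scala/       <- Scala sources
    resources/   <- resources bundled with the main artifact
  test/
    java/
    scala/
    resources/   <- test resources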
target:
This folder is the target folder for sbt. All compiled files, generated packages and so on are placed somewhere inside this dir.
Most things in this dir are not so interesting, as most of it is just internal sbt things. However, if you build a jar file by calling sbt package, it will be placed inside target/scala-x (where x is the Scala version). There are also a lot of different plugins that can package your application in different ways, and they will normally also place the package files somewhere inside the target dir.
I have an sbt project with two sub-projects, A and B. A produces a standalone scala-based executable exe. When exe is run, it will produce a file out.xml. I want this file to be part of resources for project B. I do not want B to include any references to A's code, all I want is the out.xml file to be part of it. I suspect that http://www.scala-sbt.org/0.13.5/docs/Howto/generatefiles.html should be a good starting point, but I can't get my head around on how to split it between two projects. Any takers?
Since A is a dependency of the build process, which needs to run the executable to generate your XML file, you would list it as a libraryDependency in project/[something].sbt or project/project/[something].scala. This makes it available to code you put in build.sbt or project/[something].scala, but does not make it a transitive dependency of the resulting artifact of project B.
(Or you could of course make project A an sbt plugin itself, or create yet another project which is a plugin depending on A that runs the executable.)
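As a rough sketch of how the generation step itself could look in project B's settings (sbt 0.13 syntax, following the generate-files how-to linked in the question; `a.Main.writeXml` is a hypothetical entry point in project A's code, available because of the libraryDependency described above):

// In B's build.sbt -- a sketch, not a drop-in solution.
resourceGenerators in Compile += Def.task {
  val out = (resourceManaged in Compile).value / "out.xml"
  a.Main.writeXml(out)   // hypothetical: project A's code produces out.xml
  Seq(out)
}.taskValue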
From Gradle's documentation:
The scripts generated by this task are intended to be committed to your version control system. This task also generates a small gradle-wrapper.jar bootstrap JAR file and properties file which should also be committed to your VCS. The scripts delegates to this JAR.
From: What should NOT be under source control?
I think generated files should not be in the VCS.
When are gradlew and gradle/gradle-wrapper.jar needed?
Why not store a gradle version in the build.gradle file?
Because the whole point of the gradle wrapper is to be able, without having ever installed gradle, and without even knowing how it works, where to download it from, which version, to clone the project from the VCS, to execute the gradlew script it contains, and to build the project without any additional step.
If all you had was a gradle version number in a build.gradle file, you would need a README explaining everyone that gradle version X must be downloaded from URL Y and installed, and you would have to do it every time the version is incremented.
Because the whole point of the Gradle wrapper is to be able, without having ever installed Gradle
The same argument goes for the JDK; do you want to commit that also? Do you also commit all your dependency libraries?
The dependencies should be upgraded continuously as new versions are released to get security and other bug fixes, and because if you get too far behind it can be a very time-consuming task to get up to date again.
If the Gradle wrapper is incremented for every new release, and it is committed, the repo will grow very large. The problem is obvious when working with distributed VCS where a clone will download all versions of everything.
and without even knowing how it works
Create a build script that downloads the wrapper and uses it to build. Not everyone needs to know how the script works; they just need to agree that the project is built by executing it.
where to download it from, which version
task wrapper(type: Wrapper) {
gradleVersion = 'X.X'
}
for Gradle version >= 5:
wrapper {
gradleVersion = 'X.X'
}
and then
gradle wrapper
to download the correct version.
to clone the project from the VCS, to execute the gradlew script it contains, and to build the project without any additional step.
Solved by the steps above. Downloading the Gradle wrapper is no different from downloading any other dependency. The script could be smart enough to check for any current Gradle wrapper and only download it if there is a new version.
If the developer has never used Gradle before and maybe doesn't know the project is built with Gradle, then it is more obvious to run a build.sh compared to running gradlew build.
If all you had was a gradle version number in a build.gradle file, you would need a README explaining everyone that gradle version X must be downloaded from URL Y and installed,
No, you would not need a README. You could have one, but we are developers and we should automate as much as possible. Creating a script is better.
and you would have to do it every time the version is incremented.
If the developers agree that the correct process is to:
Clone repo
Run build script
Then upgrading to the latest Gradle wrapper is no problem. If the version has been incremented since the last run, the script could download the new version.
I would like to recommend a simple approach.
In your project's README, document that an installation step is required, namely:
gradle wrapper --gradle-version 3.3
This works with Gradle 2.4 or higher. This creates a wrapper without requiring a dedicated task to be added to "build.gradle".
With this option, ignore (do not check in) these files/folders for version control:
./gradle
gradlew
gradlew.bat
The key benefit is that you don't have to check-in a downloaded file to source control. It costs one extra step on installation. I think it is worth it.
According to the Gradle docs, adding gradle-wrapper.jar to the VCS is expected, as making the Gradle Wrapper available to developers is part of the Gradle approach:
To make the Wrapper files available to other developers and execution environments you’ll need to check them into version control. All Wrapper files including the JAR file are very small in size. Adding the JAR file to version control is expected. Some organizations do not allow projects to submit binary files to version control. At the moment there are no alternative options to the approach.
What is the "project"?
Maybe there is a technical definition of this idiom that excludes build scripts. But if we accept this definition, then we must say your "project" is not all the things that you need to have versioned!
But if we say "your project" is everything you have done, then we can say you must include it, and only it, in the VCS.
This is very theoretical and maybe not practical for our development work. So let's change it to "your project is every file (or folder) you need to edit directly".
"Directly" means "not indirectly", and "indirectly" means editing another file so that an effect is then reflected in this file.
So we reach the same conclusion the OP stated (and which is said here):
I think generated files should not be in the VCS.
Yes, because you haven't created them, so they are not part of "your project" according to the second definition.
So what is the result for these files:
build.gradle: Yes. We need to edit it. Our work should be versioned.
Note: It makes no difference where you edit it, whether in your text editor or in the Project Structure GUI; either way you are doing it directly!
gradle-wrapper.properties: Yes. We need to at least determine Gradle version in this file.
gradle-wrapper.jar and gradlew[.bat]: I haven't created or edited them in any of my development work, up to this moment! So the answer is "No". If you have done so, the answer is "Yes" for you, for that work (and for the same file you edited).
The important note about the latter case is that a user who clones your repo needs to execute this command in the repo's <root-directory> to auto-generate the wrapper files:
> gradle wrapper --gradle-version=$v --distribution-type=$distType
$v and $distType are determined from gradle-wrapper.properties:
distributionUrl=https\://services.gradle.org/distributions/gradle-{$v}-{$distType}.zip
See https://gradle.org/install/ for more information.
The gradle executable is bin/gradle[.bat] in a local distribution. The local distribution is not required to be the same as the one determined in the repo. After the wrapper files are created, gradlew[.bat] can download the determined Gradle distribution automatically (if it does not exist locally). The user should then probably regenerate the wrapper files using the new gradle executable (in the downloaded distribution), following the instructions above.
Note: The instructions above assume the user already has at least one Gradle distribution locally (e.g. ~/.gradle/wrapper/dists/gradle-4.10-bin/bg6py687nqv2mbe6e1hdtk57h/gradle-4.10). This covers almost all real cases. But what happens if the user doesn't have any distribution yet?
They can download it manually using the URL in the .properties file. But if they don't place it in the path that the wrapper expects, the wrapper will download it again! The expected path is completely predictable, but is out of scope here (see here for the most complex part).
There are also some easier (but dirty) ways. For example, they can copy the wrapper files (except the .properties file) from any other local/remote repository into their own repository and then run gradlew there. It will automatically download the suitable distribution.
Old question, fresh answer. If you don't upgrade Gradle often (most of us don't), it's better to commit it to the VCS. The main reason for me is to increase the build speed on the CI server. Nowadays, most projects are built and installed by CI servers, with a different server instance every time.
If you don't commit it, the CI server will download the jar for every build, which significantly increases build time. There are other ways to handle this problem, but I find this one the easiest to maintain.
I want my Ant build to take all Java sources from src/main/*, compile them, and place them inside bin/main/. I also want it to compile src/test/* sources to bin/test/. I want this behavior because I want to package the binaries into different JARs, and if they all just go to a single bin/ directory it will be impossible (well, extremely difficult!) to know which class files belong where.
When I go to configure my build path and then click the Source tab, I see an area toward the bottom that reads Default output folder: and allows you to browse for its location.
I'm wondering how to create bin/main and bin/test in an existing project without "breaking" Eclipse (it happens). I'm also wondering whether, if I just have my Ant build create and delete those directories during the clean-and-build process, Eclipse might not care what the default output is set to. But I can't find any documentation either way.
Thanks in advance for any help here.
In Eclipse, you can only have one output folder per project for your compiled Java files. So you cannot get Eclipse to do the split you want (i.e. compile src/main to bin/main and src/test to bin/test).
You can, if you want, create two Eclipse projects, one main project and one test project, where the test project depends on (and tests) the main project. However, in that case, each project should be in its own directory structure, which is not what you are asking for. But this is a common approach.
Another way, which I would recommend, would be to not mix Ant compilation and Eclipse's compilation. Make the Ant script the way you describe (i. e. compile the main and test directories separately and create two separate jar files). Change the Eclipse compile directory to something different, for instance bin/eclipse. Use the Ant script when making official builds or building for release. Use Eclipse's building only for development/debugging. This way, your two build systems will not get in each other's way and confuse each other.
Hope this answers your question and I understood it correctly. Good luck!