In the book "Embedded Linux Systems with the Yocto Project", Chapter 4 contains a sample called "HelloWorld - BitBake style". I encountered a bunch of problems trying to get the old example working against the "Sumo" release 2.5.
If you're like me, the first error you encountered following the book's instructions was that you copied across bitbake.conf and got:
ERROR: ParseError at /tmp/bbhello/conf/bitbake.conf:749: Could not include required file conf/abi_version.conf
And after copying over abi_version.conf as well, you kept finding more and more cross-connected files that needed to be moved, and then some relative-path errors after that... Is there a better way?
Here's a series of steps which can allow you to bitbake nano based on the book's instructions.
Unless otherwise specified, these samples and instructions are all based on the online copy of the book's code-samples. While convenient for copy-pasting, the online resource is not totally consistent with the printed copy, and contains at least one extra bug.
Initial workspace setup
This guide assumes that you're working with Yocto release 2.5 ("sumo"), installed into /tmp/poky, and that the build environment will go into /tmp/bbhello. If you don't have the Poky tools+libraries already, the easiest way to get them is to clone the repository with:
$ git clone -b sumo git://git.yoctoproject.org/poky.git /tmp/poky
Then you can initialize the workspace with:
$ source /tmp/poky/oe-init-build-env /tmp/bbhello/
If you start a new terminal window, you'll need to repeat the previous command to get your shell environment set up again, but it should not replace any of the files created inside the workspace the first time.
Wiring up the defaults
The oe-init-build-env script should have just created these files for you:
bbhello/conf/local.conf
bbhello/conf/templateconf.cfg
bbhello/conf/bblayers.conf
Keep these. They supersede some of the book's instructions, which means you should not create or keep these files:
bbhello/classes/base.bbclass
bbhello/conf/bitbake.conf
Similarly, do not overwrite bbhello/conf/bblayers.conf with the book's sample. Instead, edit it to add a single line pointing to your own meta-hello folder, ex:
BBLAYERS ?= " \
${TOPDIR}/meta-hello \
/tmp/poky/meta \
/tmp/poky/meta-poky \
/tmp/poky/meta-yocto-bsp \
"
Creating the layer and recipe
Go ahead and create the following files from the book-samples:
meta-hello/conf/layer.conf
meta-hello/recipes-editor/nano/nano.bb
We'll edit these files gradually as we hit errors.
Can't find recipe error
The error:
ERROR: BBFILE_PATTERN_hello not defined
It is caused by the book-website's bbhello/meta-hello/conf/layer.conf being internally inconsistent. It uses the collection-name "hello" but on the next two lines uses _test suffixes. Just change them to _hello to match:
# Set layer search pattern and priority
BBFILE_COLLECTIONS += "hello"
BBFILE_PATTERN_hello := "^${LAYERDIR}/"
BBFILE_PRIORITY_hello = "5"
Interestingly, this error is not present in the printed copy of the book.
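The mismatch described above can be caught before running bitbake. The following is only a rough sketch, not part of BitBake: the function name undefined_patterns is made up for illustration, and the regular expressions only understand simple one-line assignments like the ones in this layer.conf.

```python
import re


def undefined_patterns(layer_conf_text):
    """Return collection names that have no matching BBFILE_PATTERN_<name> line."""
    collections = []
    # Collect every name listed in BBFILE_COLLECTIONS = "..." / += "..." lines.
    for match in re.finditer(
        r'^BBFILE_COLLECTIONS\s*\+?=\s*"([^"]*)"', layer_conf_text, re.M
    ):
        collections.extend(match.group(1).split())
    # Collect the suffixes of all BBFILE_PATTERN_<suffix> assignments.
    defined = set(
        re.findall(r'^BBFILE_PATTERN_([\w-]+)\s*[:?+.]*=', layer_conf_text, re.M)
    )
    return [name for name in collections if name not in defined]


# The inconsistent file from the book's website, as described above.
broken = '''BBFILE_COLLECTIONS += "hello"
BBFILE_PATTERN_test := "^${LAYERDIR}/"
BBFILE_PRIORITY_test = "5"
'''
print(undefined_patterns(broken))
```

An empty list means every collection has a pattern; for the broken file the list contains "hello".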
No license error
The error:
ERROR: /tmp/bbhello/meta-hello/recipes-editor/nano/nano.bb: This recipe does not have the LICENSE field set (nano)
ERROR: Failed to parse recipe: /tmp/bbhello/meta-hello/recipes-editor/nano/nano.bb
Can be fixed by adding a license setting with one of the values that bitbake recognizes. In this case, add a line onto nano.bb of:
LICENSE="GPLv3"
Recipe parse error
ERROR: ExpansionError during parsing /tmp/bbhello/meta-hello/recipes-editor/nano/nano.bb
[...]
bb.data_smart.ExpansionError: Failure expanding variable PV_MAJOR, expression was ${@bb.data.getVar('PV',d,1).split('.')[0]} which triggered exception AttributeError: module 'bb.data' has no attribute 'getVar'
This is fixed by updating the inline Python expressions used in the recipe, because @bb.data.getVar was deprecated and has now been removed. Instead, call getVar on the datastore d directly, e.g.:
PV_MAJOR = "${@d.getVar('PV', True).split('.')[0]}"
PV_MINOR = "${@d.getVar('PV', True).split('.')[1]}"
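To see what these inline expressions evaluate to, here is a small stand-alone sketch. FakeDataStore is a made-up stand-in for BitBake's datastore that models only the getVar call; it is not BitBake code, and the version string is the nano version used in this recipe.

```python
class FakeDataStore:
    """Hypothetical stand-in for BitBake's datastore object 'd'."""

    def __init__(self, values):
        self._values = dict(values)

    def getVar(self, name, expand=True):
        # The real datastore also expands ${...} references; this one does not.
        return self._values.get(name)


d = FakeDataStore({"PV": "2.2.6"})

# Same expressions as in the recipe, minus the ${@...} wrapper.
pv_major = d.getVar("PV", True).split(".")[0]
pv_minor = d.getVar("PV", True).split(".")[1]
print(pv_major, pv_minor)
```

With PV set to 2.2.6, both the major and the minor component come out as the string "2".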
License checksum failure
ERROR: nano-2.2.6-r0 do_populate_lic: QA Issue: nano: Recipe file fetches files and does not have license file information (LIC_FILES_CHKSUM) [license-checksum]
This can be fixed by adding a directive to the recipe telling it what license-info-containing file to grab, and what checksum we expect it to have.
We can follow the way the recipe generates the SRC_URI, and modify it slightly to point at the COPYING file in the same web-directory. Add this line to nano.bb:
LIC_FILES_CHKSUM = "${SITE}/v${PV_MAJOR}.${PV_MINOR}/COPYING;md5=f27defe1e96c2e1ecd4e0c9be8967949"
The MD5 checksum in this case came from manually downloading and inspecting the matching file.
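If you prefer not to compute the checksum by hand, something along these lines should work once the file has been downloaded. This is a generic sketch using Python's hashlib; the helper name md5_of_file and the local file name COPYING are assumptions for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at 'path', reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling md5_of_file("COPYING") on the downloaded license file gives the value to put after md5= in the recipe.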
Done!
Now bitbake nano ought to work, and when it is complete you should see it built nano:
/tmp/bbhello $ find ./tmp/deploy/ -name "*nano*.rpm*"
./tmp/deploy/rpm/i586/nano-dbg-2.2.6-r0.i586.rpm
./tmp/deploy/rpm/i586/nano-dev-2.2.6-r0.i586.rpm
I have recently worked on that hands-on hello world project. As far as I am concerned, I think that the source code in the book contains some bugs. Below there is a list of suggested fixes:
Inheriting native class
In fact, when you build with bitbake that you got from poky, it builds only for the target, unless you mention in your recipe that you are building for the host machine (native). You can do the latter by adding this line at the end of your recipe:
inherit native
Adding license information
Note that the LICENSE variable must be set in every recipe; otherwise bitbake raises an error. In our case, we are building version 2.2.6 of the nano editor, whose license is GPLv3, so it should be declared as follows:
LICENSE = "GPLv3"
Using os.system calls
As the book states, you cannot dereference metadata directly from a python function, which means you must access metadata through the d dictionary. Below is a suggestion for the do_unpack python function; you can use the same approach to write the next tasks (do_configure, do_compile):
python do_unpack() {
    # Look up the paths through the datastore instead of ${...} references
    workdir = d.getVar("WORKDIR", True)
    dl_dir = d.getVar("DL_DIR", True)
    p = d.getVar("P", True)
    tarball_name = os.path.join(dl_dir, p+".tar.gz")
    bb.plain("Unpacking tarball")
    os.system("tar -x -C " + workdir + " -f " + tarball_name)
    bb.plain("tarball unpacked successfully")
}
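One thing to keep in mind is that os.system does not raise when tar fails, so the task above reports success either way. As a possible variation (not what the book or the function above does), the unpacking step could use Python's tarfile module, which raises an exception on failure. The function name unpack_tarball is hypothetical, and this sketch runs outside BitBake.

```python
import os
import tarfile


def unpack_tarball(tarball_name, workdir):
    """Extract tarball_name into workdir; errors surface as exceptions."""
    os.makedirs(workdir, exist_ok=True)
    # "r:*" lets tarfile detect the compression (gzip, bzip2, xz) itself.
    with tarfile.open(tarball_name, "r:*") as archive:
        archive.extractall(workdir)
```

Inside a task, the two arguments would be the tarball_name and workdir values obtained from d.getVar as shown above.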
Launching the nano editor
After successfully building your nano editor package, you can find your nano executable in the following directory in case you are using Ubuntu (arch x86_64):
./tmp/work/x86_64-linux/nano/2.2.6-r0/src/nano
Should you have any comments or questions, don't hesitate to ask!
This should be simple, but something isn't quite right. Here is my scenario and then I'll give a brief overview of the commands I'm using. It helps to know that we have 3 specific dev areas, Live, Staging, and of course our own local dev areas.
I developed a new "beta" area of my site which has gone live and had appropriate testing. Now I'm ready to move it from a beta directory, to where it really should be and move out the old. When I do it locally, it seems fine, but when I try to merge my local branch into the staging branch, it doesn't seem to map the files correctly, and gives me a bunch of those use (c)hanged version, (d)elete, or leave (u)nresolved? prompts. The problem comes when my old directory has files that are named the same as the beta directory (like index.php for instance). Here's a quick example of what I mean:
currentDir/index.php
currentDir/update.php
currentDir/another_file.php
currentDir-beta/index.php
currentDir-beta/update.php
currentDir-beta/a_new_file.php
currentDir-beta/another_new_file.php
This is my process.
# creates a new branch from the live branch
hg branch new-branch-name
# move the current directory somewhere else
hg mv currentDir/* currentDir-old/
# commit...
hg com -m "moved current to -old"
# everything is fine up to this point
# move the beta directory to where the old one was
hg mv currentDir-beta/* currentDir/
# when I run hg st, it only shows that files are being removed from the -beta directory and added to the new/old directory
# commit
hg com -m "moved -beta to currentDir"
# when this commits is when the problems start happening.
# At this point when I run this next command, it shows that
# currentDir/index.php and other common files are now "modified" instead of "added"
hg st --rev "max(ancestors('new-branch-name') and branch(live)):'new-branch-name'"
# then try to merge to staging
hg up staging
hg merge new-branch-name
# errors happen with "common" file names like index.php. It treats them as though they were only modified instead of added.
Even if I ignored the above "modified" quirk, when I go to merge this new branch into the staging branch with other changes programmers have done, it complains that "local has this which remote deleted". I really wouldn't care with most of this as I could just throw this live and then any new branches would have this change. The thing I do care about is that any work done in the currentDir-beta folder on those "common" files from other programmers will no longer map to the new location. I can copy/paste the code and commit it, but it basically means that those branches are hosed as it pertains to keeping the changes other programmers did on those common files. To give you an example of what I mean, when I merge and type hg st it might look something like this.
M currentDir/index.php
M currentDir/update.php
M currentDir/a_new_file.php # why is this M? It should be A right?
M currentDir/another_new_file.php # why is this M? It should be A right?
M currentDir-old/another_file.php # why is this M? It should be A right?
R currentDir/another_file.php
R currentDir-beta/index.php
R currentDir-beta/update.php
R currentDir-beta/a_new_file.php
R currentDir-beta/another_new_file.php
Any suggestions on how to get around this? My goal is to make it so existing code changes that took place in currentDir-beta are "forwarded" to currentDir/ in the staging environment. All the other "not common" file changes are mapped, just not these common files.
UPDATE
Forgot to mention, I'm using Mercurial 3.9 on macOS Sierra.
I don't know your version of Mercurial or your OS, but on my Windows box with Mercurial 3.9.1 my impressions (and results) differ.
Clean initial state (folder names shortened out of laziness)
>hg st -A
C Current-beta\a_new_file.php
C Current-beta\another_new_file.php
C Current-beta\index.php
C Current-beta\update.php
C Current\another_file.php
C Current\index.php
C Current\update.php
First rename
>hg mv Current Current-Backup
moving Current\another_file.php to Current-Backup\another_file.php
moving Current\index.php to Current-Backup\index.php
moving Current\update.php to Current-Backup\update.php
...commit details skipped...
Second rename
>hg mv Current-beta Current
moving Current-beta\a_new_file.php to Current\a_new_file.php
moving Current-beta\another_new_file.php to Current\another_new_file.php
moving Current-beta\index.php to Current\index.php
moving Current-beta\update.php to Current\update.php
and working directory after it (as expected)
>hg st
A Current\a_new_file.php
A Current\another_new_file.php
A Current\index.php
A Current\update.php
R Current-beta\a_new_file.php
R Current-beta\another_new_file.php
R Current-beta\index.php
R Current-beta\update.php
...commit details skipped...
If you want to see how Mercurial recorded this, I used the following log template (slightly puzzling at first glance) to make the output easier to interpret:
hg log -T "{rev}:{node|short}\n{if(file_adds,'\tAdded: {join(file_adds,', ')}\n')}{if(file_copies,'\tCopied: {join(file_copies,', ')}\n')}{if(file_dels,'\tDeleted: {join(file_dels,', ')}\n')}{if(file_mods,'\tModified: {join(file_mods,', ')}\n')}\n"
and here is its result:
2:98955fcb7e71
Added: Current/a_new_file.php, Current/another_new_file.php, Current/index.php, Current/update.php
Copied: Current/a_new_file.php (Current-beta/a_new_file.php), Current/another_new_file.php (Current-beta/another_new_file.php), Current/index.php (Current-beta/index.php), Current/update.php (Current-beta/update.php)
Deleted: Current-beta/a_new_file.php, Current-beta/another_new_file.php, Current-beta/index.php, Current-beta/update.php
1:61068c6ba8a7
Added: Current-Backup/another_file.php, Current-Backup/index.php, Current-Backup/update.php
Copied: Current-Backup/another_file.php (Current/another_file.php), Current-Backup/index.php (Current/index.php), Current-Backup/update.php (Current/update.php)
Deleted: Current/another_file.php, Current/index.php, Current/update.php
0:454486bc43e5
Added: Current-beta/a_new_file.php, Current-beta/another_new_file.php, Current-beta/index.php, Current-beta/update.php, Current/another_file.php, Current/index.php, Current/update.php
As you can see, there are no edits ("Modified") at all; here the per-changeset log is more accurate than the aggregated status.
PS: At first glance I couldn't see the purpose of your revset in hg st, or the need for branching and merging.
PPS: OK, I saw
>hg st --rev "0:"
M Current\index.php
M Current\update.php
A Current-Backup\another_file.php
A Current-Backup\index.php
A Current-Backup\update.php
A Current\a_new_file.php
A Current\another_new_file.php
R Current-beta\a_new_file.php
R Current-beta\another_new_file.php
R Current-beta\index.php
R Current-beta\update.php
R Current\another_file.php
A status aggregated between two revisions considers only the boundary states, so it reports (correctly, technically speaking) a file as modified when 1) it is in the same location, 2) it has the same name, and 3) its content has changed.
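The effect can be reproduced without Mercurial. The sketch below is only a model of the idea, not how hg is implemented: it compares two snapshots of path-to-content mappings and ignores rename metadata entirely. The content strings are made-up placeholders.

```python
def aggregated_status(old, new):
    """Compare two {path: content} snapshots, looking only at the endpoints."""
    status = {}
    for path in sorted(set(old) | set(new)):
        if path not in old:
            status[path] = "A"  # only in the newer snapshot
        elif path not in new:
            status[path] = "R"  # only in the older snapshot
        elif old[path] != new[path]:
            status[path] = "M"  # same path, different content
    return status


rev0 = {
    "Current/index.php": "old index",
    "Current/update.php": "old update",
    "Current/another_file.php": "old another",
    "Current-beta/index.php": "beta index",
    "Current-beta/update.php": "beta update",
    "Current-beta/a_new_file.php": "beta new",
    "Current-beta/another_new_file.php": "beta another new",
}
# After both renames: Current -> Current-Backup, Current-beta -> Current.
rev2 = {
    "Current-Backup/index.php": "old index",
    "Current-Backup/update.php": "old update",
    "Current-Backup/another_file.php": "old another",
    "Current/index.php": "beta index",
    "Current/update.php": "beta update",
    "Current/a_new_file.php": "beta new",
    "Current/another_new_file.php": "beta another new",
}

status = aggregated_status(rev0, rev2)
for path, flag in status.items():
    print(flag, path)
```

Current/index.php exists at both endpoints with different content, so it can only show up as M, while paths that exist at just one endpoint show up as A or R.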
I'm using Terraform in a modular fashion in order to build out my infrastructure. I do this by having a configuration file that calls in the different modules. I want to pass an infrastructure variable which picks up what tagged version of the Github repository the application should be building out. Most importantly I'm trying to figure out how to make a concatenation of a string happen in the "source" variable of the configuration file.
module "athenaelb" {
source = "${concat("git::https://github.com/ORG/REPONAME.git?ref=",var.infra_version)}"
aws_access_key = "${var.aws_access_key}"
aws_secret_key = "${var.aws_secret_key}"
aws_region = "${var.aws_region}"
availability_zones = "${var.availability_zones}"
subnet_id = "${var.subnet_id}"
security_group = "${var.athenaelb_security_group}"
branch_name = "${var.branch_name}"
env = "${var.env}"
sns_topic = "${var.sns_topic}"
s3_bucket = "${var.elb_s3_bucket}"
athena_elb_sns_topic = "${var.athena_elb_sns_topic}"
infra_version = "${var.infra_version}"
}
I want it to compile and for the source to look like this (for example): git::https://github.com/ORG/REPONAME.git?ref=v1
Anyone have any thoughts on how to make this work?
Thanks,
Keren
This is not possible currently in Terraform itself.
The only way to achieve something like this is to use a separate script to interact with the git repository that Terraform clones into a subdirectory of the .terraform/modules directory and switch it to a different tag depending on which version you need. This is non-ideal since Terraform organizes these into directories based on a hash of the module path, but if you can identify the module in question it is safe to run git checkout within these repositories as long as you do not run terraform get again afterwards.
For more details and discussion on this issue, see issue #1439 in Terraform's issue tracker, where this feature was requested.
You could use envsubst or Python Jinja: call wrapper scripts like the ones linked below from your pipeline's deploy script to build the Terraform files from .envsubst and .jinja templates before running terraform plan/apply.
https://github.com/uvoo/process-templates/tree/main/scripts
I wish Terraform supported this, but my guess is that it never will, so just add some simple functions/files to your deploy scripts, which is usually the best way to deploy anyway.
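A minimal sketch of that pre-processing idea, using plain string replacement rather than envsubst or Jinja. The placeholder token __INFRA_VERSION__ and the function name render_module are made up for illustration; a plain token is used because Terraform's own ${var.name} syntax would clash with most template engines.

```python
TEMPLATE = '''module "athenaelb" {
  source = "git::https://github.com/ORG/REPONAME.git?ref=__INFRA_VERSION__"
}
'''


def render_module(template, infra_version):
    """Fill in the module version before Terraform ever sees the file."""
    if not infra_version:
        raise ValueError("infra_version must not be empty")
    return template.replace("__INFRA_VERSION__", infra_version)


rendered = render_module(TEMPLATE, "v1")
print(rendered)
```

The deploy script would write the rendered text to a .tf file and then run terraform get and terraform plan/apply as usual.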
In a GitHub repository you can see “language statistics”, which displays the percentage of the project that’s written in a language. It doesn’t, however, display how many lines of code the project consists of. Often, I want to quickly get an impression of the scale and complexity of a project, and the count of lines of code can give a good first impression. 500 lines of code implies a relatively simple project, 100,000 lines of code implies a very large/complicated project.
So, is it possible to get the lines of code written in the various languages from a GitHub repository, preferably without cloning it?
The question “Count number of lines in a git repository” asks how to count the lines of code in a local Git repository, but:
You have to clone the project, which could be massive. Cloning a project like Wine, for example, takes ages.
You would count lines in files that wouldn’t necessarily be code, like i13n files.
If you count just (for example) Ruby files, you’d potentially miss massive amount of code in other languages, like JavaScript. You’d have to know beforehand which languages the project uses. You’d also have to repeat the count for every language the project uses.
All in all, this is potentially far too time-intensive for “quickly checking the scale of a project”.
You can run something like
git ls-files | xargs wc -l
which will give you the total count.
You can also add more instructions. Like just looking at the JavaScript files.
git ls-files | grep '\.js$' | xargs wc -l
Or use this handy little tool → https://line-count.herokuapp.com/
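If you would rather not repeat the grep for each language, the same counting can be grouped by file extension in one pass. This is a rough sketch; the function name count_lines_by_extension is made up, and unlike wc -l it also counts a final line that lacks a trailing newline.

```python
import os
from collections import Counter


def count_lines_by_extension(paths):
    """Return a Counter mapping file extension to total line count."""
    totals = Counter()
    for path in paths:
        extension = os.path.splitext(path)[1] or "(none)"
        # Binary mode avoids decoding errors on non-UTF-8 files.
        with open(path, "rb") as handle:
            totals[extension] += sum(1 for _ in handle)
    return totals
```

The list of paths can be whatever git ls-files prints, read one path per line.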
A shell script, cloc-git
You can use this shell script to count the number of lines in a remote Git repository with one command:
#!/usr/bin/env bash
git clone --depth 1 "$1" temp-linecount-repo &&
printf "('temp-linecount-repo' will be deleted automatically)\n\n\n" &&
cloc temp-linecount-repo &&
rm -rf temp-linecount-repo
Installation
This script requires CLOC (“Count Lines of Code”) to be installed. cloc can probably be installed with your package manager – for example, brew install cloc with Homebrew. There is also a docker image published under mribeiro/cloc.
You can install the script by saving its code to a file cloc-git, running chmod +x cloc-git, and then moving the file to a folder in your $PATH such as /usr/local/bin.
Usage
The script takes one argument, which is any URL that git clone will accept. Examples are https://github.com/evalEmpire/perl5i.git (HTTPS) or git@github.com:evalEmpire/perl5i.git (SSH). You can get this URL from any GitHub project page by clicking “Clone or download”.
Example output:
$ cloc-git https://github.com/evalEmpire/perl5i.git
Cloning into 'temp-linecount-repo'...
remote: Counting objects: 200, done.
remote: Compressing objects: 100% (182/182), done.
remote: Total 200 (delta 13), reused 158 (delta 9), pack-reused 0
Receiving objects: 100% (200/200), 296.52 KiB | 110.00 KiB/s, done.
Resolving deltas: 100% (13/13), done.
Checking connectivity... done.
('temp-linecount-repo' will be deleted automatically)
171 text files.
166 unique files.
17 files ignored.
http://cloc.sourceforge.net v 1.62 T=1.13 s (134.1 files/s, 9764.6 lines/s)
-------------------------------------------------------------------------------
Language files blank comment code
-------------------------------------------------------------------------------
Perl 149 2795 1425 6382
JSON 1 0 0 270
YAML 2 0 0 198
-------------------------------------------------------------------------------
SUM: 152 2795 1425 6850
-------------------------------------------------------------------------------
Alternatives
Run the commands manually
If you don’t want to bother saving and installing the shell script, you can run the commands manually. An example:
$ git clone --depth 1 https://github.com/evalEmpire/perl5i.git
$ cloc perl5i
$ rm -rf perl5i
Linguist
If you want the results to match GitHub’s language percentages exactly, you can try installing Linguist instead of CLOC. According to its README, you need to gem install linguist and then run linguist. I couldn’t get it to work (issue #2223).
I created an extension for Google Chrome browser - GLOC which works for public and private repos.
Counts the number of lines of code of a project from:
project detail page
user's repositories
organization page
search results page
trending page
explore page
If you go to the graphs/contributors page, you can see a list of all the contributors to the repo and how many lines they've added and removed.
Unless I'm missing something, subtracting the aggregate number of lines deleted from the aggregate number of lines added among all contributors should yield the total number of lines of code in the repo. (EDIT: it turns out I was missing something after all. Take a look at orbitbot's comment for details.)
UPDATE:
This data is also available in GitHub's API. So I wrote a quick script to fetch the data and do the calculation:
'use strict';
async function countGithub(repo) {
    const response = await fetch(`https://api.github.com/repos/${repo}/stats/contributors`);
    const contributors = await response.json();
    // Net lines per contributor: additions (a) minus deletions (d), summed over all weeks
    const lineCounts = contributors.map(contributor => (
        contributor.weeks.reduce((lineCount, week) => lineCount + week.a - week.d, 0)
    ));
    const lines = lineCounts.reduce((lineTotal, lineCount) => lineTotal + lineCount, 0);
    window.alert(lines);
}
countGithub('jquery/jquery'); // or count anything you like
Just paste it in a Chrome DevTools snippet, change the repo and click run.
Disclaimer (thanks to lovasoa):
Take the results of this method with a grain of salt, because for some repos (sorich87/bootstrap-tour) it results in negative values, which might indicate there's something wrong with the data returned from GitHub's API.
UPDATE:
Looks like this method to calculate total line numbers isn't entirely reliable. Take a look at orbitbot's comment for details.
You can clone just the latest commit using git clone --depth 1 <url> and then perform your own analysis using Linguist, the same software Github uses. That's the only way I know you're going to get lines of code.
Another option is to use the API to list the languages the project uses. It doesn't give them in lines but in bytes. For example...
$ curl https://api.github.com/repos/evalEmpire/perl5i/languages
{
"Perl": 274835
}
Though take that with a grain of salt, that project includes YAML and JSON which the web site acknowledges but the API does not.
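Since the API reports bytes rather than lines, about the most you can derive from it is each language's share of the code. A small sketch, with the helper name language_shares made up for illustration:

```python
def language_shares(byte_counts):
    """Convert a {language: bytes} mapping into percentages of the total."""
    total = sum(byte_counts.values())
    if total == 0:
        return {}
    return {lang: 100.0 * size / total for lang, size in byte_counts.items()}


# The response shown above for evalEmpire/perl5i.
shares = language_shares({"Perl": 274835})
print(shares)
```

These shares are based on bytes, so they give a sense of proportion but not of line counts.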
Finally, you can use code search to ask which files match a given language. This example asks which files in perl5i are Perl. https://api.github.com/search/code?q=language:perl+repo:evalEmpire/perl5i. It will not give you lines, and you have to ask for the file size separately using the returned url for each file.
Not currently possible on GitHub.com or through its APIs
I have talked to customer support and confirmed that this can not be done on github.com. They have passed the suggestion along to the Github team though, so hopefully it will be possible in the future. If so, I'll be sure to edit this answer.
Meanwhile, Rory O'Kane's answer is a brilliant alternative based on cloc and a shallow repo clone.
From @Tgr's comment, there is an online tool:
https://codetabs.com/count-loc/count-loc-online.html
You can use tokei:
cargo install tokei
git clone --depth 1 https://github.com/XAMPPRocky/tokei
tokei tokei/
Output:
===============================================================================
Language Files Lines Code Comments Blanks
===============================================================================
BASH 4 48 30 10 8
JSON 1 1430 1430 0 0
Shell 1 49 38 1 10
TOML 2 78 65 4 9
-------------------------------------------------------------------------------
Markdown 4 1410 0 1121 289
|- JSON 1 41 41 0 0
|- Rust 1 47 38 5 4
|- Shell 1 19 16 0 3
(Total) 1517 95 1126 296
-------------------------------------------------------------------------------
Rust 19 3750 3123 119 508
|- Markdown 12 358 5 302 51
(Total) 4108 3128 421 559
===============================================================================
Total 31 6765 4686 1255 824
===============================================================================
Tokei has support for badges:
Count Lines
[![](https://tokei.rs/b1/github/XAMPPRocky/tokei)](https://github.com/XAMPPRocky/tokei)
By default the badge shows the repo's LoC (lines of code). You can make it show a different category with the ?category= query string, which can be code, blanks, files, lines, or comments.
Count Files
[![](https://tokei.rs/b1/github/XAMPPRocky/tokei?category=files)](https://github.com/XAMPPRocky/tokei)
You can use GitHub API to get the sloc like the following function
function getSloc(repo, tries) {
    //repo is the repo's path
    if (!repo) {
        return Promise.reject(new Error("No repo provided"));
    }
    //GitHub's API may return an empty object the first time it is accessed
    //We can try several times then stop
    if (tries === 0) {
        return Promise.reject(new Error("Too many tries"));
    }
    let url = "https://api.github.com/repos" + repo + "/stats/code_frequency";
    return fetch(url)
        .then(x => x.json())
        .then(x => x.reduce((total, changes) => total + changes[1] + changes[2], 0))
        .catch(err => getSloc(repo, tries - 1));
}
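The arithmetic in the reduce step can be checked offline. The sketch below assumes, as the function above does, that each row of the code_frequency response is [week timestamp, additions, deletions] with deletions given as a negative number. The sample rows are made-up figures, not real data from any repository.

```python
def net_lines(code_frequency):
    """Sum additions and (negative) deletions over all weekly rows."""
    return sum(
        additions + deletions for _week, additions, deletions in code_frequency
    )


# Hypothetical weekly rows: [unix timestamp, additions, deletions]
sample = [
    [1302998400, 1124, -435],
    [1303603200, 75, -20],
]
print(net_lines(sample))
```

For the sample rows this is (1124 - 435) + (75 - 20) = 744.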
Personally, I made a Chrome extension which shows the number of SLOC on both the GitHub project list and the project detail page. You can also set your personal access token to access private repositories and bypass the API rate limit.
You can download from here https://chrome.google.com/webstore/detail/github-sloc/fkjjjamhihnjmihibcmdnianbcbccpnn
Source code is available here https://github.com/martianyi/github-sloc
Hey all this is ridiculously easy...
Create a new branch from your first commit
When you want to find out your stats, create a new PR from main
The PR will show you the number of changed lines - as you're doing a PR from the first commit all your code will be counted as new lines
And the added benefit is that if you don't approve the PR and just leave it in place, the stats (No of commits, files changed and total lines of code) will simply keep up-to-date as you merge changes into main. :) Enjoy.
Firefox add-on Github SLOC
I wrote a small firefox addon that prints the number of lines of code on github project pages: Github SLOC
npm install sloc -g
git clone --depth 1 https://github.com/vuejs/vue/
sloc ".\vue\src" --format cli-table
rm -rf ".\vue\"
Instructions and Explanation
Install sloc from npm, a command line tool (Node.js needs to be installed).
npm install sloc -g
Clone shallow repository (faster download than full clone).
git clone --depth 1 https://github.com/facebook/react/
Run sloc and specify the path that should be analyzed.
sloc ".\react\src" --format cli-table
sloc supports formatting the output as a cli-table, as json or csv. Regular expressions can be used to exclude files and folders (Further information on npm).
Delete repository folder (optional)
PowerShell: rm -r -force ".\react\" or on Mac/Unix: rm -rf ./react/
It is also possible to get details for every file with the --details option:
sloc ".\react\src" --format cli-table --details
Open terminal and run the following:
curl -L "https://api.codetabs.com/v1/loc?github=username/reponame"
If the question is "can you quickly get NUMBER OF LINES of a github repo", the answer is no as stated by the other answers.
However, if the question is "can you quickly check the SCALE of a project", I usually gauge a project by looking at its size. Of course the size will include deltas from all active commits, but it is a good metric as the order of magnitude is quite close.
E.g.
How big is the "docker" project?
In your browser, enter api.github.com/repos/ORG_NAME/PROJECT_NAME
i.e. api.github.com/repos/docker/docker
In the response hash, you can find the size attribute:
{
...
size: 161432,
...
}
This should give you an idea of the relative scale of the project. The number seems to be in KB, but when I checked it on my computer it's actually smaller, even though the order of magnitude is consistent. (161432KB = 161MB, du -s -h docker = 65MB)
Pipe the output from the number of lines in each file to sort to organize files by line count.
git ls-files | xargs wc -l |sort -n
This is easy if you are using VS Code and clone the project first. Just install the Lines of Code (LOC) VS Code extension and then run LineCount: Count Workspace Files from the Command Palette.
The extension shows summary statistics by file type and it also outputs result files with detailed information by each folder.
There is another online tool that counts lines of code for public and private repos without having to clone/download them - https://klock.herokuapp.com/
None of the answers here satisfied my requirements. I only wanted to use existing utilities. The following script will use basic utilities:
Git
GNU or BSD awk
GNU or BSD sed
Bash
Get total lines added to a repository (subtracts lines deleted from lines added).
#!/bin/bash
git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD | \
sed 's/[^0-9,]*//g' | \
awk -F, '!($2 > 0) {$2="0"};!($3 > 0) {$3="0"}; {print $2-$3}'
Get lines of code filtered by specified file types of known source code (e.g. *.py files or add more extensions, etc).
#!/bin/bash
git diff --shortstat 4b825dc642cb6eb9a060e54bf8d69288fbee4904 HEAD -- '*.py' '*.java' '*.js' | \
sed 's/[^0-9,]*//g' | \
awk -F, '!($2 > 0) {$2="0"};!($3 > 0) {$3="0"}; {print $2-$3}'
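The sed/awk pipeline relies on the position of the numbers in the --shortstat line, which shifts when a diff has deletions but no insertions. As an alternative sketch (not the author's script), the same line can be parsed by label instead of by position. The function name net_lines_from_shortstat is made up for illustration.

```python
import re


def net_lines_from_shortstat(line):
    """Return insertions minus deletions from one 'git diff --shortstat' line."""
    insertions = re.search(r"(\d+) insertions?\(\+\)", line)
    deletions = re.search(r"(\d+) deletions?\(-\)", line)
    added = int(insertions.group(1)) if insertions else 0
    removed = int(deletions.group(1)) if deletions else 0
    return added - removed


print(net_lines_from_shortstat(" 3 files changed, 10 insertions(+), 2 deletions(-)"))
```

Either part may be missing from the line, in which case it is treated as zero.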
4b825dc642cb6eb9a060e54bf8d69288fbee4904 is the id of the "empty tree" in Git and it's always available in every repository.
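The reason this id is the same in every SHA-1 repository is that Git names an object by hashing a small header plus the object's content, and an empty tree has no content. A quick check with Python's hashlib:

```python
import hashlib

# A Git object id is SHA-1 over "<type> <size>\0" followed by the content;
# for the empty tree the content is empty and the size is 0.
empty_tree_id = hashlib.sha1(b"tree 0\0").hexdigest()
print(empty_tree_id)
```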
Sources:
My own scripting
How to get Git diff of the first commit?
Is there a way of having git show lines added, lines changed and lines removed?
shields.io has a badge that can count up all the lines for you; for example, it can count the lines in the Raycast extensions repo.
You can use sourcegraph, an open source search engine for code. It can connect to your GitHub account, index the content, and then on the admin section you would see the number of lines of code indexed.
I made an NPM package specifically for this use case. It gives you a CLI tool that takes the directory path and the folders/files to ignore.
It goes like this:
npm i -g @quasimodo147/countlines
to get the $ countlines command in your terminal
then you can do
countlines . node_modules build dist
I'm doing VPATH builds with automake. I'm now also using generated source, with SWIG. I've got rules in Makefile.am like:
dist_noinst_DATA = whatever.swig
whatever.cpp: whatever.swig
	swig -c++ -php $^
Then the file gets used later:
myprogram_SOURCES = ... whatever.cpp
It works fine when $builddir == $srcdir. But when doing VPATH builds (e.g. mkdir build; cd build; ../configure; make), I get error messages about missing whatever.cpp.
Should generated source files go to $builddir or $srcdir? (I reckon probably $builddir.)
How should dependencies and rules be specified to put generated files in the right place?
Simple answer
You should assume that $srcdir is read-only, so you must not write anything there.
So, your generated source-code will end up in $(builddir).
By default, autotool-generated Makefiles will only look for source-files in $srcdir, so you have to tell it to check $builddir as well. Adding the following to your Makefile.am should help:
VPATH = $(srcdir) $(builddir)
After that you might end up with a no rule to make target ... error, which you should be able to fix by updating your source-generating rule as in:
$(builddir)/whatever.cpp: whatever.swig
# ...
A better solution
You might notice that in your current setup, the release tarball (as created by make dist) will contain the whatever.cpp file as part of your sources, since you added this file to the myprogram_SOURCES.
If you don't want this (e.g. because it might mean that the build-process will really take the pregenerated file rather than generating it again), you might want to use something like the following.
It uses a wrapper source-file (whatever_includer.cpp) that simply includes the generated file, and it uses -I$(builddir) to then find the generated file.
Makefile.am:
dist_noinst_DATA = whatever.swig
whatever.cpp: whatever.swig
	swig -c++ -php $^
whatever_includer.cpp: whatever.cpp
myprogram_SOURCES = ... whatever_includer.cpp
myprogram_CPPFLAGS = ... -I$(builddir)
clean-local::
	rm -f $(builddir)/whatever.cpp
whatever_includer.cpp:
#include "whatever.cpp"
Usually, you want to keep $srcdir readonly, so that if for instance the source is distributed unpacked on a CDROM, you can still run /.../configure from some other part of the file-system.
However, if you are using SWIG to generate source code for a wrapper library, you probably want to distribute that SWIG-generated code as well so that your users do not need to install SWIG to compile your code. Then you do have a choice: you can decide that the SWIG-generated code should end up in $builddir (that's OK: make dist will collect it there and include it in the tarball), or you can decide to output SWIG-generated code in $srcdir, since it is really a source from the point of view of the distributed package. An advantage of keeping it in $srcdir is that when make distcheck attempts to build your package from a read-only source directory, it will fail on any attempt to call SWIG to regenerate the wrapper source. If you have your wrapper source in $builddir, you might not notice that you have some broken rule that causes SWIG to be run on the user's host; by generating in $srcdir you ensure that SWIG is not needed by your users.
So my preference is to output SWIG wrapper sources in $srcdir. My setup for Python wrappers looks as follows:
EXTRA_DIST = spot.i
python_PYTHON = $(srcdir)/spot.py # _PYTHON is distributed by default
pyexec_LTLIBRARIES = _spot.la
MAINTAINERCLEANFILES = $(srcdir)/spot_wrap.cxx $(srcdir)/spot.py
_spot_la_SOURCES = $(srcdir)/spot_wrap.cxx $(srcdir)/spot_wrap.h
_spot_la_LDFLAGS = -avoid-version -module
_spot_la_LIBADD = $(top_builddir)/src/libspot.la
$(srcdir)/spot_wrap.cxx: $(srcdir)/spot.i
	$(SWIG) -c++ -python -I$(srcdir) -I$(top_srcdir)/src $(srcdir)/spot.i
# Handle the multi-file output of SWIG.
$(srcdir)/spot.py: $(srcdir)/spot.i
	$(MAKE) $(AM_MAKEFLAGS) spot_wrap.cxx
Note that I use $(srcdir) for all targets, because of limitations of the VPATH feature on various flavors of make. My setup to deal with the multiple files output by SWIG could be improved, but as these rules are not run by users and it has never caused me any problem, I do not bother.