How would I generate list of dates and git commits? - github

I wrote some code that generates a github contributions-style heatmap in the terminal given a csv file that contains timestamps and some unsigned value.
I'd like to generate a csv that contains dates and the number of github contributions I made on that date.
Is there a simple way to do this?

You could use git log and a custom format:
git log --date=short --format="%an %ad [%h] %s" | cut -d ' ' -f1 -f2 -f3 -f4- | sed -E 's/ /,/' | sed -E 's/ /,/' | sed -E 's/ /,/'
I get:
Lachlan,Miller,2019-03-25,[e20b847] Rename method
Lachlan,Miller,2019-03-25,[6c47dbf] Add a POC using JS
lmiller1990,2018-04-12,[c295307],Add song class
lmiller1990,2018-04-12,[876cbe2],Add timer

You could use grep for this job. Also, flags like i, A and color will help you cleaning things up a bit. Also, output the result in a .csv file using >
use man grep to know a more about its flags.
Try using:
git log | grep -E -A 2 --color "commit|Date" > output.csv
You could also add --summary flag to log.

Related

Why is sed returning more characters than requested

In a part of my script I am trying to generate a list of the year and month that a file was submitted. Since the file contains the timestamp, I should be able to cut the filenames to the month position, and then do a sort+uniq filtering. However sed is generating an outlier for one of the files.
I am using this command sequence
ls -1 service*json | sed -e "s|\(.*201...\).*json$|\1|g" | sort |uniq
And this works for most of time except in some cases it outputs the whole timestamp:
$ ls
service-parent-20181119092630.json service-parent-20181123134132.json service-parent-20181202124532.json service-parent-20190121091830.json service-parent-20190125124209.json
service-parent-20181119101003.json service-parent-20181126104300.json service-parent-20181211095939.json service-parent-20190121092453.json service-parent-20190128163539.json
service-parent-20181120095850.json service-parent-20181127083441.json service-parent-20190107035508.json service-parent-20190122093608.json
service-parent-20181120104838.json service-parent-20181129155835.json service-parent-20190107042234.json service-parent-20190122115053.json
$ ls -1 service*json | sed -e "s|\(.*201...\).*json$|\1|g" | sort |uniq
service-parent-201811
service-parent-201811201048
service-parent-201812
service-parent-201901
I have also tried this variation but the second output line is still returned:
ls -1 service*json | sed -e "s|\(.*201.\{3\}\).*json$|\1|g" | sort |uniq
Can somebody explain why service-parent-201811201048 is returned past the requested 3 characters?
Thanks.
service-parent-201811201048 happens to have 201048 to match 201....
Might try ls -1 service*json | sed -e "s|\(.*-201...\).*json$|\1|g" | sort |uniq to ask for a dash - before 201....
It is not recommended to parse the output of ls. Please try instead:
for i in service*json; do
sed -e "s|^\(service-.*-201[0-9]\{3\}\).*json$|\1|g" <<< "$i"
done | sort | uniq
Your problem is explained at https://stackoverflow.com/a/54565973/1745001 (i.e. .* is greedy) but try this:
$ ls | sed -E 's/(-[0-9]{6}).*/\1/' | sort -u
service-parent-201811
service-parent-201812
service-parent-201901
The above requires a sed that supports EREs via -E, e.g. GNU sed and OSX/BSD sed.

sed with filename from pipe

In a folder I have many files with several parameters in filenames, e.g (just with one parameter) file_a1.0.txt, file_a1.2.txt etc.
These are generated by a c++ code and I'd need to take the last one (in time) generated. I don't know a priori what will be the value of this parameter when the code is terminated. After that I need to copy the 2nd line of this last file.
To copy the 2nd line of the any file, I know that this sed command works:
sed -n 2p filename
I know also how to find the last generated file:
ls -rtl file_a*.txt | tail -1
Question:
how to combine these two operation? Certainly it is possible to pipe the 2nd operation to that sed operation but I dont know how to include filename from pipe as input to that sed command.
You can use this,
ls -rt1 file_a*.txt | tail -1 | xargs sed -n '2p'
(OR)
sed -n '2p' `ls -rt1 file_a*.txt | tail -1`
sed -n '2p' $(ls -rt1 file_a*.txt | tail -1)
Typically you can put a command in back ticks to put its output at a particular point in another command - so
sed -n 2p `ls -rt name*.txt | tail -1 `
Alternatively - and preferred, because it is easier to nest etc -
sed -n 2p $(ls -rt name*.txt | tail -1)
-r in ls is reverse order.
-r, --reverse
reverse order while sorting
But it is not good idea when used it with tail -1.
With below change (head -1 without r option in ls), performance will be better, that you needn't wait to list all files then pipe to tail command
sed -n 2p $(ls -t1 name*.txt | head -1 )
I was looking for a similar solution: taking the file names from a pipe of grep results to feed to sed. I've copied my answer here for the search & replace, but perhaps this example can help as it calls sed for each of the names found in the pipe:
this command to simply find all the files:
grep -i -l -r foo ./*
this one to exclude this_shell.sh (in case you put the command in a script called this_shell.sh), tee the output to the console to see what happened, and then use sed on each file name found to replace the text foo with bar:
grep -i -l -r --exclude "this_shell.sh" foo ./* | tee /dev/fd/2 | while read -r x; do sed -b -i 's/foo/bar/gi' "$x"; done
I chose this method, as I didn't like having all the timestamps changed for files not modified. Feeding the grep result allows only the files with target text to be looked at (thus likely may improve performance / speed as well)
be sure to backup your files & test before using. May not work in some environments for files with embedded spaces. (?)
fwiw - I had some problems using the tail method, it seems that the entire dataset was generated before calling tail on just the last item.

In-place replacement

I have a CSV. I want to edit the 35th field of the CSV and write the change back to the 35th field. This is what I am doing on bash:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g'
so, I am pulling the 35th entry using awk and then replacing the "0" in the starting position in the string with "+91". This one works perfet and I get desired output on the console.
Now I want this new entry to get written in the file. I am thinking of sed's "in -place" replacement feature but this fetuare needs and input file. In above command, I cannot provide input file because my primary command is awk and sed is taking the input from awk.
Thanks.
You should choose one of the two tools. As for sed, it can be done as follows:
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/' test.csv
Not sure about awk, but #shellter's comment might help with that.
The in-place feature of sed is misnamed, as it does not edit the file in place. Instead, it creates a new file with the same name. eg:
$ echo foo > foo
$ ln -f foo bar
$ ls -i foo bar # These are the same file
797325 bar 797325 foo
$ echo new-text > foo # Changes bar
$ cat bar
new-text
$ printf '/new/s//newer\nw\nq\n' | ed foo # Edit foo "in-place"; changes bar
9
newer-text
11
$ cat bar
newer-text
$ ls -i foo bar # Still the same file
797325 bar 797325 foo
$ sed -i s/new/newer/ foo # Does not edit in-place; creates a new file
$ ls -i foo bar
797325 bar 792722 foo
Since sed is not actually editing the file in place, but writing a new file and then renaming it to the old file, you might as well do the same.
awk ... test.csv | sed ... > test.csv.1 && mv test.csv.1 test.csv
There is the misperception that using sed -i somehow avoids the creation of the temporary file. It does not. It just hides the fact from you. Sometimes abstraction is a good thing, but other times it is unnecessary obfuscation. In the case of sed -i, it is the latter. The shell is really good at file manipulation. Use it as intended. If you do need to edit a file in place, don't use the streaming version of ed; just use ed
So, it turned out there are numerous ways to do it. I got it working with sed as below:
sed -i 's/0\([0-9]\{10\}\)/\+91\1/g' test.csv
But this is little tricky as it will edit any entry which matches the criteria. however in my case, It is working fine.
Similar implementation of above logic in perl:
perl -p -i -e 's/\b0(\d{10})\b/\+91$1/g;' test.csv
Again, same caveat as mentioned above.
More precise way of doing it as shown by Lev Levitsky because it will operate specifically on the 35th field
sed -ri 's/^(([^,]*,){34})0([^,]*)/\1+91\3/g' test.csv
For more complex situations, I will have to consider using any of the csv modules of perl.
Thanks everyone for your time and input. I surely know more about sed/awk after reading your replies.
This might work for you:
sed -i 's/[^,]*/+91/35' test.csv
EDIT:
To replace the leading zero in the 35th field:
sed 'h;s/[^,]*/\n&/35;/\n0/!{x;b};s//+91/' test.csv
or more simply:
|sed 's/^\(\([^,]*,\)\{34\}\)0/\1+91/' test.csv
If you have moreutils installed, you can simply use the sponge tool:
awk -F "," '{print $35}' test.csv | sed -i 's/^0/+91/g' | sponge test.csv
sponge soaks up the input, closes the input pipe (stdin) and, only then, opens and writes to the test.csv file.
As of 2015, moreutils is available in package repositories of several major Linux distributions, such as Arch Linux, Debian and Ubuntu.
Another perl solution to edit the 35th field in-place:
perl -i -F, -lane '$F[34] =~ s/^0/+91/; print join ",",#F' test.csv
These command-line options are used:
-i edit the file in-place
-n loop around every line of the input file
-l removes newlines before processing, and adds them back in afterwards
-a autosplit mode – split input lines into the #F array. Defaults to splitting on whitespace.
-e execute the perl code
-F autosplit modifier, in this case splits on ,
#F is the array of words in each line, indexed starting with 0
$F[34] is the 35 element of the array
s/^0/+91/ does the substitution

How to determine date/time when a file was first added to CVS repository

Is there any simple cvs command through which I can get date/time when a file was first added to the module in CVS repository.
I actually want a one line output that can be consumed by my some script.
This is not directly possible in CVS. One can get activity logs for a file and then identify the date from them. Following is the single line command that works like a charm and gives the date when the file was first added in the repository.
cvs -Q -d :pserver:*User*:*Pass*#*HostName*:/cvsroot rlog -N *FilePath* | grep ^date: | sort | head -n 1 | cut -d\; -f1 | sed -e 's/date: //'
Above command looks through the entire repository and gives the date. If one is looking for first activity on that file on a branch use following commands.
For Branch:
cvs -Q -d :pserver:*User*:*Pass*#*HostName*:/cvsroot rlog -N -r*BranchName* *FilePath* | grep ^date: | sort | head -n 1 | cut -d\; -f1 | sed -e 's/date: //'
For Trunk:
cvs -Q -d :pserver:*User*:*Pass*#*HostName*:/cvsroot rlog -N -r::HEAD *FilePath* | grep ^date: | sort | head -n 1 | cut -d\; -f1 | sed -e 's/date: //'

Filter text based in a multiline match criteria

I have the following sed command. I need to execute the below command in single line
cat File | sed -n '
/NetworkName/ {
N
/\n.*ims3/ p
}' | sed -n 1p | awk -F"=" '{print $2}'
I need to execute the above command in single line. can anyone please help.
Assume that the contents of the File is
System.DomainName=shayam
System.Addresses=Fr6
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=AS
System.DomainName=ims5.com
System.DomainName=Ram
System.Addresses=Fr9
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims7.com
System.DomainName=mani
System.Addresses=Hello
System.Trusted=Yes
System.Infrastructure=No
System.NetworkName=Peer
System.DomainName=ims3.com
And after executing the command you will get only peer as the output. Can anyone please help me out?
You can use a single nawk command. And you can lost the useless cat
nawk -F"=" '/NetworkName/{n=$2;getline;if($2~/ims3/){print n} }' file
You can use sed as well as proposed by others, but i prefer less regex and less clutter.
The above save the value of the network name to "n". Then, get the next line and check the 2nd field against "ims3". If matched, then print the value of "n".
Put that code in a separate .sh file, and run it as your single-line command.
cat File | sed -n '/NetworkName/ { N; /\n.*ims3/ p }' | sed -n 1p | awk -F"=" '{print $2}'
Assuming that you want the network name for the domain ims3, this command line works without sed:
grep -B 1 ims3 File | head -n 1 | awk -F"=" '{print $2}'
So, you want the network name where the domain name on the following line includes 'ims3', and not the one where the following line includes 'ims7' (even though the network names in the example are the same).
sed -n '/NetworkName/{N;/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;};}' File
This avoids abuse of felines, too (not to mention reducing the number of commands executed).
Tested on MacOS X 10.6.4, but there's no reason to think it won't work elsewhere too.
However, empirical evidence shows that Solaris sed is different from MacOS sed. It can all be done in one sed command, but it needs three lines:
sed -n '/NetworkName/{N
/ims3/{s/.*NetworkName=\(.*\)\n.*/\1/p;}
}' File
Tested on Solaris 10.
You just need to put -e pretty much everywhere you'd break the command at a newline or have a semicolon. You don't need the extra call to sed or awk or cat.
sed -n -e '/NetworkName/ {' -e 'N' -e '/\n.*ims3/ s/[^\n]*=\(.*\).*/\1/P' -e '}' File