How to filter JSON data based on a key-value pair using jq?

Suppose I have some json data given below:
{"name":"alon","department":"abc","id":"ss12sd"}
{"name":"kate","department":"xyz","id":"ajsj3" }
{"name":"sam","department":"abc","id":"xx1d2"}
I want to filter the data based on a particular department and save it in a different JSON file. From the above data, suppose I want to filter all the records whose department is 'abc' and save them in a new JSON file. How can I do this using jq? I was reading its manual but didn't understand much of it.

Since the file is a stream of objects (not an array), select can be applied to each object directly:
jq 'select(.department == "abc")' yourfile.json > filtered.json
If you want the result as a single array, slurp the stream first:
jq -s 'map(select(.department == "abc"))' yourfile.json > filtered.json

A flexible template might be like this:
jq --arg key department --arg value abc \
  'select(.[$key] == $value)' input_file.json > output_file.json
This way you can change the criteria at the argument stage rather than in the expression. (If the input were a single array instead of a stream of objects, you would prefix the filter with .[] | to iterate the elements first.)
Implementing that into a shell script might look like this:
myscript.sh
#!/usr/bin/env bash
key="$1"
value="$2"
file="$3"
outfile="$4"
jq --arg key "$key" --arg value "$value" \
  'select(.[$key] == $value)' "$file" > "$outfile"
Which you would invoke like so:
./myscript.sh department abc input.json output.json
Edit: Changed ."\($key)" to .[$key] - thanks @peak
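If jq isn't available on a box at all, a rough fallback with grep can approximate the same filter for this exact one-object-per-line layout. This is only a sketch; it depends on there being no whitespace around the colon:

```shell
# Sample data from the question, one JSON object per line.
cat > /tmp/employees.json <<'EOF'
{"name":"alon","department":"abc","id":"ss12sd"}
{"name":"kate","department":"xyz","id":"ajsj3"}
{"name":"sam","department":"abc","id":"xx1d2"}
EOF

# Keep only the lines whose department field is exactly "abc".
grep '"department":"abc"' /tmp/employees.json > /tmp/abc.json

cat /tmp/abc.json
```

This is text matching, not JSON parsing; any reformatting of the input (pretty-printing, extra spaces) breaks it, which is exactly why jq is the better tool here.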


Saving environment variables in gopass

How can I save (sensitive) environment variables in gopass, then retrieve them and set them in my bash terminal from the command line?
I know this can be done in 1Password, where the passwords are stored:
myaccount
key1 - value1
key2 - value2
key3 - value3
Internally this is in JSON format and can be pulled with a command like:
op get item "myaccount" | jq .
source <(op get item "${1}" | jq -r --arg title "${2}" '.details.section[] |
select(.title==$title) | .fields[0].v' | base64 -D)
The whole idea is that my environment variables must be set in an automated way from a secure vault, rather than me exporting them like this:
export key1=value1
export key2=value2
export key3=value3
This is a bit of a late answer, but you could save them into gopass like this:
echo $key1 | gopass insert store/environment/key1
Or if you need them to be base64 encoded:
echo $key1 | base64 -w 0 | gopass insert store/environment/key1
Then you could still export them, sourcing the values from gopass rather than hard-coding them:
export key1=$(gopass store/environment/key1)
Or, for base64-encoded entries:
export key1=$(gopass store/environment/key1 | base64 -d)
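To automate the export step, the lookups can be wrapped in a loop. The gopass shell function below is a stub standing in for the real binary, purely so the loop shape can be shown without a vault; the store path store/environment is also just an example:

```shell
#!/usr/bin/env bash
# Stub in place of the real gopass binary (illustration only):
# prints a fake secret derived from the entry name.
gopass() { echo "secret-${1##*/}"; }

# Export every key under store/environment into the environment.
for key in key1 key2 key3; do
  export "$key=$(gopass "store/environment/$key")"
done

echo "$key1"   # secret-key1
```

With the stub removed and the real gopass on PATH, the same loop exports the actual secrets.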

What kubectl command can I use to get events sorted by specific fields and print only specific details of events?

I need to print only specific fields of Kubernetes Events, sorted by a specific field.
This is to help me gather telemetry and analytics about my namespace
How could I do that?
kubectl get events --sort-by='.lastTimestamp'
The following command does it. It prints the events sorted by creation timestamp, and uses a go-template to pick out specific fields of the Kubernetes event object.
kubectl get events --sort-by='.metadata.creationTimestamp' -o 'go-template={{range .items}}{{.involvedObject.name}}{{"\t"}}{{.involvedObject.kind}}{{"\t"}}{{.message}}{{"\t"}}{{.reason}}{{"\t"}}{{.type}}{{"\t"}}{{.firstTimestamp}}{{"\n"}}{{end}}'
I am using the following command to sort by timestamp:
kubectl get event --all-namespaces --sort-by='.metadata.managedFields[0].time'
For filtering out the information you can of course combine it with the go-template described by @suryakrupa or with jq as described by @Chris Stryczynski.
If you don't mind seeing the output as JSON:
kubectl get event -o json | jq '.items |= sort_by(.lastTimestamp)'
This requires jq.
Here's the Bash function I use:
function kubectl-events {
{
echo $'TIME\tNAMESPACE\tTYPE\tREASON\tOBJECT\tSOURCE\tMESSAGE';
kubectl get events -o json "$@" \
| jq -r '.items | map(. + {t: (.eventTime//.lastTimestamp)}) | sort_by(.t)[] | [.t, .metadata.namespace, .type, .reason, .involvedObject.kind + "/" + .involvedObject.name, .source.component + "," + (.source.host//"-"), .message] | @tsv';
} \
| column -s $'\t' -t \
| less -S
}
You can use it like: kubectl-events -A, kubectl-events -n foo, etc.

Finding a date from string using shell script?

How can I find a date inside a string using shell script?
For example, I have this string "/foo/bar/mxm-20140908.txt"
and the output should be 20140908. Thanks!
You can use egrep with the -o option. First, a version that uses a - separator (as per the original question):
pax> echo /foo/bar/mxm-2014-09-08.txt | egrep -o '[0-9]{4}(-[0-9]{2}){2}'
2014-09-08
Or, with no separator (as per the changes made):
pax> echo /foo/bar/mxm-20140908.txt | egrep -o '[0-9]{8}'
20140908
Just be careful in that latter case, since eight digits may show up somewhere in a non-date context.
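If spawning grep feels heavy, bash's built-in [[ =~ ]] matching can extract the date with no external processes (a sketch assuming the first eight-digit run in the string is the date):

```shell
#!/usr/bin/env bash
path="/foo/bar/mxm-20140908.txt"

# =~ fills the BASH_REMATCH array; index 1 is the first capture group.
if [[ $path =~ ([0-9]{8}) ]]; then
  date_part="${BASH_REMATCH[1]}"
fi

echo "$date_part"   # 20140908
```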

Right tool to filter the UUID from the output of the blkid program (using grep, cut, awk, etc.)

I want to filter the output of the blkid to get the UUID.
The output of blkid looks like
CASE 1:-
$ blkid
/dev/sda2: LABEL="A" UUID="4CC9-0015"
/dev/sda3: LABEL="B" UUID="70CF-169F"
/dev/sda1: LABEL=" NTFS_partition" UUID="3830C24D30C21234"
In some cases the output of blkid looks like
CASE 2:-
$ blkid
/dev/sda1: UUID="d7ec380e-2521-4fe5-bd8e-b7c02ce41601" TYPE="ext4"
/dev/sda2: UUID="fc54f19a-8ec7-418b-8eca-fbc1af34e57f" TYPE="ext4"
/dev/sda3: UUID="6f218da5-3ba3-4647-a44d-a7be19a64e7a" TYPE="swap"
I want to filter out the UUID.
Using the combination of grep and cut it can be done as
/sbin/blkid | /bin/grep 'sda1' | /bin/grep -o -E 'UUID="[a-zA-Z|0-9|\-]*' | /bin/cut -c 7-
I have tried using awk, grep and cut as below for filtering the UUID:
$ /sbin/blkid | /bin/grep 'sda1' | /usr/bin/awk '{print $2}' | /bin/sed 's/\"//g' | cut -c 7-
7ec380e-2521-4fe5-bd8e-b7c02ce41601
The above command(which uses awk) is not reliable since sometimes an extra field such as LABEL may be present in the output of the blkid program as shown in the above output.
What is the best way to create a command using awk which works reliably?
Please post if any other elegant method exists for the job using core utilities. I don't want to use perl or python since this has to run on busybox.
NOTE: I am using busybox blkid, to which /dev/sda1 cannot be passed as an argument (the version I am using does not support it), hence the grep to filter the line.
UPDATE: added the CASE 2 output to show that field positions cannot be relied upon.
Why are you making it so complex?
Try this:
# blkid -s UUID -o value
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
fc54f19a-8ec7-418b-8eca-fbc1af34e57f
6f218da5-3ba3-4647-a44d-a7be19a64e7a
Or this:
# blkid -s UUID -o value /dev/sda1
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
Install proper blkid package if you don't have it:
sudo apt-get install util-linux
sudo yum install util-linux
For all the UUID's, you can do :
$ blkid | sed -n 's/.*UUID=\"\([^\"]*\)\".*/\1/p'
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
fc54f19a-8ec7-418b-8eca-fbc1af34e57f
6f218da5-3ba3-4647-a44d-a7be19a64e7a
Say, only for a specific sda1:
$ blkid | sed -n '/sda1/s/.*UUID=\"\([^\"]*\)\".*/\1/p'
d7ec380e-2521-4fe5-bd8e-b7c02ce41601
The sed command tries to group the contents present within the double quotes after the UUID keyword, and replaces the entire line with the token.
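The sed expression can be sanity-checked against the CASE 2 sample from the question without touching real devices:

```shell
# Captured blkid output (CASE 2 from the question).
sample='/dev/sda1: UUID="d7ec380e-2521-4fe5-bd8e-b7c02ce41601" TYPE="ext4"
/dev/sda2: UUID="fc54f19a-8ec7-418b-8eca-fbc1af34e57f" TYPE="ext4"
/dev/sda3: UUID="6f218da5-3ba3-4647-a44d-a7be19a64e7a" TYPE="swap"'

# Same sed program as above, restricted to the sda1 line.
uuid=$(printf '%s\n' "$sample" | sed -n '/sda1/s/.*UUID="\([^"]*\)".*/\1/p')
echo "$uuid"   # d7ec380e-2521-4fe5-bd8e-b7c02ce41601
```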
Here's a short awk solution:
blkid | awk 'BEGIN{FS="[=\"]"} {print $(NF-1)}'
Output:
4CC9-0015
70CF-169F
3830C24D30C21234
Explanation:
BEGIN{FS="[=\"]"} : Use = and " as delimiters
{print $(NF-1)}: NF stands for Number of Fields; here we print the second-to-last field
This is based on the consistent structure of blkid output: UUID in quotes is at the end of each line.
Alternatively:
blkid | awk 'BEGIN{FS="="} {print $NF}' | sed 's/"//g'
data.txt
/dev/sda2: LABEL="A" UUID="4CC9-0015"
/dev/sda3: LABEL="B" UUID="70CF-169F"
/dev/sda1: LABEL=" NTFS_partition" UUID="3830C24D30C21234"
An awk and sed combination:
awk 'BEGIN{FS="UUID"} {print $2}' data.txt | sed -e 's/=//' -e 's/"//g'
Explanation:
Setting the Field Separator to the string 'UUID' makes $2 the rest of the line after it.
sed then removes the = and the " characters; the -e switch lets you give multiple expressions in one invocation.
All occurrences of " are removed thanks to the trailing g (global) flag.
The question has a "e.t.c" so I'm going to assume python is one of the options ;)
#!/usr/bin/env python3
import subprocess, re, json

# Get blkid output.
blkid = subprocess.check_output(["blkid"]).decode('utf-8')

devices = []
for line in [x for x in blkid.split('\n') if x]:
    parameters = line.split()
    for idx, parameter in enumerate(parameters):
        if idx == 0:
            devices.append({"DEVICE": re.sub(r':$', '', parameter)})
            continue
        key_and_value = parameter.split('=')
        devices[-1].update({
            key_and_value[0]: re.sub(r'"', '', key_and_value[1])
        })

uuids = [{dev['DEVICE']: dev['UUID']} for dev in devices if 'UUID' in dev]
print(json.dumps(uuids, indent=4, sort_keys=True))
Although this is probably overkill, and quite a bit of error handling/optimization is missing from this script XD
I assume you're using busybox in an initramfs and you are waiting for your e.g. USB drive with the rootfs on it to become available.
You could use the following awk script (busybox awk compliant).
# cat get-ruuid.awk
BEGIN {
    ruuid = ENVIRON["RUUID"]
    found = 0
}
/^\/dev\/sd[a-z]/ {
    if (index($0, tolower(ruuid)) || index($0, toupper(ruuid))) {
        split($1, parts, ":")
        printf("%s\n", parts[1])
        found = 1
        exit # Stop further scanning; control jumps to END.
    }
}
END {
    # An exit in the main rule still runs END, so set the real status here.
    exit(found ? 0 : 1)
}
Call it as follows from e.g. the init script; this is not the ideal way.
# The UUID of your root partition
export RUUID="<put proper uuid value here>"
for x in 1 2 3 4 5 ; do
    mdev -s
    found=$(blkid | awk -f ./get-ruuid.awk)
    test -z "$found" || break # If no longer zero length, break the loop.
    sleep 1
done
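Before relying on it in init, the matching logic can be dry-run against captured blkid output; the UUID below is the sample from the question, and the inline awk condenses the same idea as get-ruuid.awk:

```shell
#!/usr/bin/env bash
# Captured line instead of live `blkid` output.
sample='/dev/sda1: UUID="d7ec380e-2521-4fe5-bd8e-b7c02ce41601" TYPE="ext4"'

# Same matching idea as get-ruuid.awk, condensed to one inline program:
# case-insensitive UUID match, then print the device part before the colon.
found=$(printf '%s\n' "$sample" | RUUID="D7EC380E-2521-4FE5-BD8E-B7C02CE41601" \
  awk '/^\/dev\/sd[a-z]/ && (index($0, tolower(ENVIRON["RUUID"])) || index($0, toupper(ENVIRON["RUUID"]))) {
    split($1, parts, ":"); print parts[1]; exit
  }')
echo "$found"   # /dev/sda1
```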
But if this is the only reason why you would want to have an initramfs, I would use the 'root=PARTUUID=... rootwait' Linux kernel command line options. Check the kernel docs and sources.
Get the proper PARTUUID (NOT UUID) of your root partition with the blkid command.