groupby results of a matching function - group-by

Given the sample JSON below, I need to find out whether each value of serialno exists among the serialnopool values. If it does, I need the serialno values that were found in serialnopool grouped together with their corresponding row ids.
Example output for the serialno value 11079851, which exists in serialnopool:
11079851: 098, 798
Each matching serialno value would go on its own line.
Here is the sample JSON:
[
{
"rowid": "098",
"serialno": 11079851,
"serialnopool": 11079851
},
{
"rowid": 110,
"serialno": 11089385,
"serialnopool": 25853201
},
{
"rowid": 118,
"serialno": 11089385,
"serialnopool": 22412115
},
{
"rowid": 798,
"serialno": 11079851,
"serialnopool": 22412115
},
{
"rowid": "",
"serialno": "",
"serialnopool": 5423
},
{
"rowid": "",
"serialno": "",
"serialnopool": 5421312
}
]
How could this be achieved with the use of jq?

Assuming that by "exists in serialnopool" you mean "occurs at least once in the set of serialnopool values", the following is a solution provided that your jq is version 1.5 or higher:
map(.serialnopool) as $snp
| (INDEX( group_by(.serialno)[]; .[0].serialno|tostring)
| map_values(map(.rowid))) as $dict
| (map(.serialno) | unique[]) as $sn
| if IN($sn; $snp[])
then "\($sn): " + ($dict[$sn|tostring] | join(", "))
else empty end
The main point of interest in this solution is the dictionary ($dict), which maps .serialno|tostring values to the corresponding .rowid values.
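For readers less familiar with jq, the same dictionary-and-filter idea can be sketched in plain Python (this is only an illustration of the logic, not part of the original answer; the sample data is copied from the question):

```python
import json
from collections import defaultdict

sample = json.loads("""
[
 {"rowid": "098", "serialno": 11079851, "serialnopool": 11079851},
 {"rowid": 110,   "serialno": 11089385, "serialnopool": 25853201},
 {"rowid": 118,   "serialno": 11089385, "serialnopool": 22412115},
 {"rowid": 798,   "serialno": 11079851, "serialnopool": 22412115},
 {"rowid": "",    "serialno": "",       "serialnopool": 5423},
 {"rowid": "",    "serialno": "",       "serialnopool": 5421312}
]
""")

# the set of all serialnopool values (the jq filter's $snp)
pool = {row["serialnopool"] for row in sample}

# the dictionary mapping serialno -> list of rowids (the jq filter's $dict)
groups = defaultdict(list)
for row in sample:
    groups[row["serialno"]].append(row["rowid"])

# one output line per serialno that occurs in the pool
lines = [f"{sn}: {', '.join(str(r) for r in ids)}"
         for sn, ids in groups.items() if sn in pool]
print("\n".join(lines))  # prints: 11079851: 098, 798
```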

Related

Powershell add column name to array data then convert it to JSON

Basically, I formed an array of data based on certain conditions inside a loop, and the array data looks something like this:
student1 RL123 S12 student2 RL423 S32 student6 RL166 S02 student34 RL993 P12 student99 RL923 S12
The array data above needs to be converted to JSON like this:
{
    "Name" : "student1",
    "RollNo" : "RL123",
    "SubjectID" : "S12"
},
{
    "Name" : "student2",
    "RollNo" : "RL423",
    "SubjectID" : "S32"
},
{
    "Name" : "student6",
    "RollNo" : "RL166",
    "SubjectID" : "S02"
},
{
    "Name" : "student34",
    "RollNo" : "RL993",
    "SubjectID" : "P12"
},
{
    "Name" : "student99",
    "RollNo" : "RL923",
    "SubjectID" : "S12"
}
If your array is already an array of strings (let's call it $students), you can skip the line below.
However, the way you have formatted it in your question makes it one string with space-separated items, so in that case we should first create a proper array from it by splitting on whitespace:
$students = 'student1 RL123 S12 student2 RL423 S32 student6 RL166 S02 student34 RL993 P12 student99 RL923 S12' -split '\s+'
Now just iterate over the values in the array and create objects using 3 array elements on each object:
$data = for ($i = 0; $i -lt $students.Count; $i += 3) {
    [PSCustomObject]@{
        Name      = $students[$i]
        RollNo    = $students[$i + 1]
        SubjectID = $students[$i + 2]
    }
}
# here you convert the object array to JSON
$data | ConvertTo-Json
Output:
[
{
"Name": "student1",
"RollNo": "RL123",
"SubjectID": "S12"
},
{
"Name": "student2",
"RollNo": "RL423",
"SubjectID": "S32"
},
{
"Name": "student6",
"RollNo": "RL166",
"SubjectID": "S02"
},
{
"Name": "student34",
"RollNo": "RL993",
"SubjectID": "P12"
},
{
"Name": "student99",
"RollNo": "RL923",
"SubjectID": "S12"
}
]
The opening [ and closing ] denote that we're dealing with an array in JSON syntax.
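For comparison, the same chunk-of-three idea can be sketched in Python (the data string is copied from the question; this is just an illustration, not the answer's PowerShell code):

```python
import json

raw = ("student1 RL123 S12 student2 RL423 S32 student6 RL166 S02 "
       "student34 RL993 P12 student99 RL923 S12")
tokens = raw.split()  # split on whitespace, like -split '\s+'

# take the tokens three at a time: name, roll number, subject id
records = [
    {"Name": name, "RollNo": roll, "SubjectID": subj}
    for name, roll, subj in zip(tokens[0::3], tokens[1::3], tokens[2::3])
]
print(json.dumps(records, indent=2))
```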

Django Rest Framework and MongoDB - listField does not works It returns None when object is embedded

I'm using MongoDB; the connection is provided by Djongo, and on top of that DRF is used to manage all requests to my API.
My data (profile) is structured like this:
{
"name" : "profile name",
"description" : "this is a description",
"params" : "X1, X2,X3, etc",
"config" : "CONFIG OF DEVICE",
"user" : {
"name" : "user name",
"middle_name" : "test middle name",
"last_name" : "test last name",
"email" : "test@test.com",
"institute" : {
"name" : "MIT",
"place" : {
"coordinates" : [ 30.0, 101.0, 0.0 ],
"type" : "Point"
},
"country" : "US"
}
},
"place" : {
"coordinates" : [ 90.0, 901.0, 10.0 ],
"type" : "Point"
},
"devices" : [
{
"name" : "DEVICE 1",
"verification_code" : "",
"verificated" : 0,
"configuration" : "kjk",
"places" : [
{
"coordinates" : [ 30.0, 101.0, 0.0 ],
"type" : "Point"
},
{
"coordinates" : [ 31.0, 102.0, 1.0 ],
"type" : "Point"
}
]
}
]
}
I know the coordinates are wrong, but this is just for testing.
Well, I send that object to my view and then to the ProfileSerializer, which is responsible for checking the embedded objects (each one has its own serializer). After checking the data, the info is saved without problem.
The problem is when I try to retrieve all profiles: only the coordinates are null. The other embedded objects are retrieved fine; only the Place object is malformed. Next, I'll show you the response:
[
{
"id": 22,
"name": "profile name",
"description": "this is a description",
"params": "X1, X2,X3, etc",
"config": "CONFIG OF DEVICE",
"user": {
"name": "user name",
"middle_name": "test middle name",
"last_name": "test last name",
"email": "test@test.com",
"institute": {
"name": "MIT",
"place": {
"coordinates": null,
"type": "Point"
},
"country": "US",
"created_at": "2019-03-21T20:43:33.928000Z"
},
"created_at": "2019-03-21T20:43:33.959000Z"
},
"place": {
"coordinates": null,
"type": "Point"
},
"devices": [
{
"name": "DEVICE 1",
"verificated": 0,
"configuration": "kjk",
"places": [
{
"coordinates": null,
"type": "Point"
},
{
"coordinates": null,
"type": "Point"
}
],
"created_at": "2019-03-21T20:43:33.898000Z"
}
],
"created_at": "2019-03-21T20:43:33.976000Z"
}
]
For this question I'll only describe/show the serializer of one object, but if you need more info I'll get it to you as soon as possible.
Models
class Place(models.Model):
coordinates = models.ListField(blank=True, null=True, default=[0.0, 0.0, 0.0])
type = models.CharField(max_length=10, default="Point")
objects = models.DjongoManager()
class Profile(models.Model):
name = models.CharField(max_length=200)
description = models.TextField(default="Without Description")
params = models.TextField(default="No params")
config = models.CharField(max_length=200)
user = models.EmbeddedModelField(
model_container=User
)
place = models.EmbeddedModelField(
model_container=Place
)
devices = models.ArrayModelField(
model_container=Device
)
created_at = models.DateTimeField(auto_now_add=True)
objects = models.DjongoManager()
Serializers
class PlaceSerializer(serializers.ModelSerializer):
coordinates = serializers.ListSerializer(
child=serializers.FloatField(),
)
class Meta:
model = Place
fields = ('id', 'coordinates', 'type')
class ProfileSerializer(serializers.ModelSerializer):
user = UserSerializer( )
place = PlaceSerializer()
devices = DeviceSerializer( many=True)
class Meta:
model = Profile
fields = ('id', 'name', 'description', 'params', 'config',
'user', 'place', 'devices', 'created_at')
depth=8
def create(self, validated_data):
# get principal fields
user_data = validated_data.pop('user')
place_data = validated_data.pop('place')
devices_data = validated_data.pop('devices')
# get nested fields
# devices nested fields
devices = []
for device in devices_data:
places = []
places_data = device.pop('places')
for place in places_data:
places.append( Place(coordinates=place['coordinates'], type=place['type']) )
device['places'] = places
devices.append( Device.objects.create(**device) )
validated_data['devices'] = devices
# user nested fields
institute_data = user_data.pop('institute')
place = institute_data.pop('place')
institute_data['place'] = Place(coordinates=place['coordinates'], type=place['type'])
user_data['institute'] = Institute.objects.create(**institute_data)
validated_data['user'] = User.objects.create(**user_data)
profile = Profile.objects.create(**validated_data)
return profile
I've defined PlaceSerializer in many ways, but all of them give the same result. Below I describe these ways.
CASE 1
class PlaceSerializer(serializers.ModelSerializer):
coordinates = serializers.ListSerializer(
child=serializers.FloatField(),
)
CASE 2
class CoordinatesSerializer(serializers.ListSerializer):
child=serializers.FloatField()
class PlaceSerializer(serializers.ModelSerializer):
coordinates = CoordinatesSerializer()
CASE 3
class PlaceSerializer(serializers.ModelSerializer):
coordinates = serializers.ListField(
child=serializers.FloatField()
)
CASE 4
class PlaceSerializer(serializers.ModelSerializer):
coordinates = serializers.ListField()
CASE 5
class PlaceSerializer(serializers.ModelSerializer):
coordinates = serializers.ListSerializer()
# gives an error because child is not present
I have also changed the field types (CharField, IntegerField, FloatField, etc.) with the same results.
Other tests I've done: appending the methods create, update, to_representation, and to_internal_value to the serializer, all in order to better manage the info being saved or retrieved, but none of them works. Another curiosity: if I add a simple ListField like [10, 90, 1] on its own, it is saved and retrieved without problem, in contrast to when the ListField is inside a Place object.
If you know how to solve this, I'd be grateful.

JOLT to index a field based on the name of another

Not sure if this is possible in JOLT.
We are trying to extract a value whose field name is given by another field. Please take a look at the description below.
{
"_src" : {
"SomeName" : 123,
"FName" : "SomeName"
}
}
to
{
"val": "123",
"_src" : {
"SomeName" : 123,
"FName" : "SomeName"
}
}
Any ideas on how to approach this, or if this is even possible in JOLT?
Thanks
Using a shift spec:
Match on _src
Set val using the value of SomeName
The syntax #(1,_src) means go up 1 level and copy _src; & resolves to the name of the current element.
[
{
"operation": "shift",
"spec": {
"_src": {
"SomeName": "val",
"#(1,_src)": "&"
}
}
}
]
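For reference, the lookup the spec performs — read the field name from FName, then fetch that field's value — can be sketched in plain Python (the str() conversion mirrors the quoted "123" in the desired output; this is just an illustration of the transformation, not JOLT itself):

```python
def extract_val(doc):
    """Copy the input and add a top-level 'val' holding the value of the
    field whose name is stored in _src.FName."""
    src = doc["_src"]
    return {"val": str(src[src["FName"]]), **doc}

result = extract_val({"_src": {"SomeName": 123, "FName": "SomeName"}})
# result == {"val": "123", "_src": {"SomeName": 123, "FName": "SomeName"}}
```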

Error on querying FIWARE Orion with orderBy custom attribute

I am facing the following issue while querying Orion with orderBy, asking the results to be returned in chronological order. My data looks like this:
{
"id": "some_unique_id",
"type": "sensor",
"someVariable": {
"type": "String",
"value": "1",
"metadata": {}
},
"datetime": {
"type": "DateTime",
"value": "2018-09-21T00:38:57.00Z",
"metadata": {}
},
"humidity": {
"type": "Float",
"value": 55.9,
"metadata": {}
},
"someMoreVariables": {
"type": "Float",
"value": 6.29,
"metadata": {}
}
}
My call looks like this:
http://my.ip.com:1026/v2/entities?type=sensor&offset=SOMENUMBERMINUS30&limit=30&orderBy=datetime
Unfortunately, the response is the following:
{
    "error": "InternalServerError",
    "description": "Error at querying MongoDB"
}
Both tenant and subtenant are used in the call, while the Orion version is 1.13.0-next and tenant has been indexed inside the MongoDB. I am running Orion and MongoDB from different Docker instances in the same server.
As always, any help will be highly appreciated.
EDIT1: After fgalan's recommendation, I am adding the relevant record from the log (I am sorry I didn't do it from the beginning):
BadInput some.ip
time=2018-10-16T07:47:36.576Z | lvl=ERROR | corr=bf8461dc-d117-11e8-b0f1-0242ac110003 | trans=1539588749-153-00000013561 | from=some.ip | srv=some_tenant | subsrv=/some_subtenant | comp=Orion | op=AlarmManager.cpp[211]:dbError | msg=Raising alarm DatabaseError: nextSafe(): { $err: "Executor error: OperationFailed: Sort operation used more than the maximum 33554432 bytes of RAM. Add an index, or specify a smaller limit.", code: 17144 }
time=2018-10-16T07:47:36.576Z | lvl=ERROR | corr=bf8461dc-d117-11e8-b0f1-0242ac110003 | trans=1539588749-153-00000013561 | from=some.ip | srv=some_tenant | subsrv=/some_subtenant | comp=Orion | op=AlarmManager.cpp[235]:dbErrorReset | msg=Releasing alarm DatabaseError
From the above, it is clear that indexing is required. I have already done that according to fgalan's answer to another question I had in the past: Indexing Orion
EDIT2: Orion response after indexing:
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "orion.entities"
},
{
"v" : 2,
"key" : {
"location.coords" : "2dsphere"
},
"name" : "location.coords_2dsphere",
"ns" : "orion.entities",
"2dsphereIndexVersion" : 3
},
{
"v" : 2,
"key" : {
"creDate" : 1
},
"name" : "creDate_1",
"ns" : "orion.entities"
}
]
You have an index {creDate: 1}, which is fine if you order by entity creation date using dateCreated (or don't specify the orderBy parameter, as creation date is the default ordering):
GET /v2/entities
GET /v2/entities?orderBy=dateCreated
However, if you plan to order by a different attribute defined by you (as I understand datetime is) and you get the OperationFailed: Sort operation used more than the maximum error, then you have to create an index on the value of that attribute. In particular, you have to create this index:
{ attrs.datetime.value: 1 }
EDIT: as suggested by a comment to this answer, the command for creating the above index typically is:
db.entities.createIndex({"attrs.datetime.value": 1});
EDIT2: have a look at this section in the documentation for more detail on this kind of index.

Map Reduce by date with RockMongo admin panel

I am trying to find the best way to clean my MongoDB of old rows. This is the row structure:
{
"_id": ObjectId("52adb7fb12e20f2e2400be38"),
"1": "2013-12-15 06: 07: 20",
"2": "1",
"3": "",
"4": "",
"5": "ID",
"6": "139.195.98.240",
"7": "",
"8": "youtube",
"9": NumberInt(0),
"10": "",
"11": ""
}
The date field is this["1"]. I want to set up a delete method for all rows older than 30 days.
I figured that a map-reduce could help with this task; if there are any other suggestions, please feel free to make them.
This is the map function that I am trying to run:
{
mapreduce : "delete_rows",
map : function () {
var delete_date = new Date();
delete_date.setDate(delete_date.getDate()-7);
row_date = new Date(this.1);
if(row_date < delete_date){
emit(this._id,{date: this.1}, {all_data: this});
}
},
out : {
"delete_rows"
},
keeptemp:false,
jsMode : false,
verbose : false
}
I get the following error in the RockMongo query window:
Criteria must be a valid JSON object
Can anyone help me with this syntax?
Thanks
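(For illustration only, not a RockMongo answer: the date comparison the map function attempts can be sketched in plain Python. Note that the stored format has a space after each colon in the time part, and that the numeric field name "1" forces bracket access — the same reason this.1 is a syntax error in JavaScript. The sample rows below are made up.)

```python
from datetime import datetime, timedelta

# sample rows in the shape shown in the question (field "1" holds the date)
rows = [
    {"_id": "old", "1": "2013-12-15 06: 07: 20"},
    {"_id": "new", "1": "2099-01-01 00: 00: 00"},
]

def parse_row_date(value):
    # the stored format has a space after each colon in the time part
    return datetime.strptime(value, "%Y-%m-%d %H: %M: %S")

cutoff = datetime.now() - timedelta(days=30)

# collect the ids of rows older than the cutoff
ids_to_delete = [r["_id"] for r in rows if parse_row_date(r["1"]) < cutoff]
```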