Kafka topic with Variable nested JSON object as KSQL DB stream - apache-kafka

I'm trying to join two existing Kafka topics in KSQL. Some data samples from Kafka (actual values redacted due to corporate environment):
device topic:
{
"persistTime" : "2020-10-06T13:30:25.373Z",
"previous" : {
"device" : "REDACTED",
"type" : "REDACTED",
"group" : "REDACTED",
"inventoryState" : "unknown",
"managementState" : "registered",
"communicationId" : "REDACTED",
"manufacturer" : "",
"description" : "",
"model" : "",
"location" : {
"geo" : {
"latitude" : "REDACTED",
"longitude" : "REDACTED"
},
"address" : {
"city" : "",
"postalCode" : "",
"street" : "",
"houseNumber" : "",
"floor" : "",
"company" : "",
"country" : "",
"reference" : "",
"timeZone" : "",
"region" : "",
"district" : ""
},
"logicalInstallationPoint" : ""
},
"tags" : [ ]
},
"current" : {
"device" : "REDACTED",
"type" : "REDACTED",
"group" : "REDACTED",
"inventoryState" : "unknown",
"managementState" : "registered",
"communicationId" : "REDACTED",
"manufacturer" : "",
"description" : "",
"model" : "",
"location" : {
"geo" : {
"latitude" : "REDACTED",
"longitude" : "REDACTED"
},
"address" : {
"city" : "",
"postalCode" : "",
"street" : "",
"houseNumber" : "",
"floor" : "",
"company" : "",
"country" : "",
"reference" : "",
"timeZone" : "",
"region" : "",
"district" : ""
},
"logicalInstallationPoint" : ""
},
"tags" : [ ]
}
}
device-event topic (1st sample):
{
"device" : "REDACTED",
"event" : "403151",
"firstOccurrenceTime" : "2020-09-30T11:03:50.000Z",
"lastOccurrenceTime" : "2020-09-30T11:03:50.000Z",
"occurrenceCount" : 1,
"receiveTime" : "2020-09-30T11:03:50.000Z",
"persistTime" : "2020-09-30T14:32:59.580Z",
"state" : "open",
"context" : {
"2" : "25",
"3" : "0",
"4" : "60",
"1" : "REDACTED"
}
}
device-event topic (2nd sample):
{
"device" : "REDACTED",
"event" : "402004",
"firstOccurrenceTime" : "2020-10-07T07:02:48Z",
"lastOccurrenceTime" : "2020-10-07T07:02:48Z",
"occurrenceCount" : 1,
"receiveTime" : "2020-10-07T07:02:48Z",
"persistTime" : "2020-10-07T07:15:28.533Z",
"state" : "open",
"context" : {
"2" : "2020-10-07T07:02:48.0000000Z",
"1" : "REDACTED"
}
}
The issue I'm facing is the varying number of keys inside context in the device-event topic.
I've tried the following statements for creating the events stream on ksqlDB:
CREATE STREAM "events"\
("device" VARCHAR, \
"event" VARCHAR, \
"firstOccurenceTime" VARCHAR, \
"lastOccurenceTime" VARCHAR, \
"occurenceCount" INTEGER, \
"receiveTime" VARCHAR, \
"persistTime" VARCHAR, \
"state" VARCHAR, \
"context" ARRAY<STRING>) \
WITH (KAFKA_TOPIC='device-event', VALUE_FORMAT='JSON');
CREATE STREAM "events"\
("device" VARCHAR, \
"event" VARCHAR, \
"firstOccurenceTime" VARCHAR, \
"lastOccurenceTime" VARCHAR, \
"occurenceCount" INTEGER, \
"receiveTime" VARCHAR, \
"persistTime" VARCHAR, \
"state" VARCHAR, \
"context" STRUCT\
<"1" VARCHAR, \
"2" VARCHAR, \
"3" VARCHAR, \
"4" VARCHAR>) \
WITH (KAFKA_TOPIC='ext_device-event_10195', VALUE_FORMAT='JSON');
The second statement only brings in data that has all four context variables present ("1", "2", "3" and "4").
How would one go about creating the KSQL equivalent stream for the device-event Kafka topic?

You need to use a MAP rather than a STRUCT.
BTW you also don't need the \ line separator any more :)
Here's a working example using ksqlDB 0.12.
Load the sample data into a topic
kafkacat -b localhost:9092 -P -t events <<EOF
{ "device" : "REDACTED", "event" : "403151", "firstOccurrenceTime" : "2020-09-30T11:03:50.000Z", "lastOccurrenceTime" : "2020-09-30T11:03:50.000Z", "occurrenceCount" : 1, "receiveTime" : "2020-09-30T11:03:50.000Z", "persistTime" : "2020-09-30T14:32:59.580Z", "state" : "open", "context" : { "2" : "25", "3" : "0", "4" : "60", "1" : "REDACTED" } }
{ "device" : "REDACTED", "event" : "402004", "firstOccurrenceTime" : "2020-10-07T07:02:48Z", "lastOccurrenceTime" : "2020-10-07T07:02:48Z", "occurrenceCount" : 1, "receiveTime" : "2020-10-07T07:02:48Z", "persistTime" : "2020-10-07T07:15:28.533Z", "state" : "open", "context" : { "2" : "2020-10-07T07:02:48.0000000Z", "1" : "REDACTED" } }
EOF
In ksqlDB, declare the stream:
CREATE STREAM "events" (
"device" VARCHAR,
"event" VARCHAR,
"firstOccurrenceTime" VARCHAR,
"lastOccurrenceTime" VARCHAR,
"occurrenceCount" INTEGER,
"receiveTime" VARCHAR,
"persistTime" VARCHAR,
"state" VARCHAR,
"context" MAP < VARCHAR, VARCHAR >
) WITH (KAFKA_TOPIC = 'events', VALUE_FORMAT = 'JSON');
Query the stream to check things work:
ksql> SET 'auto.offset.reset' = 'earliest';
Successfully changed local property 'auto.offset.reset' to 'earliest'. Use the UNSET command to revert your change.
ksql> SELECT "device", "event", "receiveTime", "state", "context" FROM "events" EMIT CHANGES;
+----------+--------+--------------------------+--------+------------------------------------+
|device |event |receiveTime |state |context |
+----------+--------+--------------------------+--------+------------------------------------+
|REDACTED |403151 |2020-09-30T11:03:50.000Z |open |{1=REDACTED, 2=25, 3=0, 4=60} |
|REDACTED |402004 |2020-10-07T07:02:48Z |open |{1=REDACTED, 2=2020-10-07T07:02:48.0|
| | | | |000000Z} |
Use the [''] syntax to access specific keys within the map:
ksql> SELECT "device", "event", "context", "context"['1'] AS CONTEXT_1, "context"['3'] AS CONTEXT_3 FROM "events" EMIT CHANGES;
+-----------+--------+------------------------------------+-----------+-----------+
|device |event |context |CONTEXT_1 |CONTEXT_3 |
+-----------+--------+------------------------------------+-----------+-----------+
|REDACTED |403151 |{1=REDACTED, 2=25, 3=0, 4=60} |REDACTED |0 |
|REDACTED |402004 |{1=REDACTED, 2=2020-10-07T07:02:48.0|REDACTED |null |
| | |000000Z} | | |
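With context declared as a MAP, the stream joins like any other, which gets back to the original goal of joining the two topics. Below is a minimal sketch of such a join. It assumes a hypothetical devices table whose underlying topic (here called device-by-id) is keyed by the device id; the real device topic nests the id under current, so it may need re-keying first:

```sql
-- Hypothetical: a table over the device data, keyed by device id
CREATE TABLE "devices" (
  "device" VARCHAR PRIMARY KEY,
  "current" STRUCT<"type" VARCHAR, "managementState" VARCHAR>
) WITH (KAFKA_TOPIC = 'device-by-id', VALUE_FORMAT = 'JSON');

-- Stream-table join: enrich each event with device attributes,
-- pulling one context key out of the map along the way
CREATE STREAM "enriched_events" AS
  SELECT e."device", e."event",
         e."context"['1'] AS "context_1",
         d."current"->"type" AS "device_type"
  FROM "events" e
  JOIN "devices" d ON e."device" = d."device";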

Related

How to extract specific fields from mongodb in a specific time range?

I have a huge MongoDB collection and I am trying to export only specific fields, but every field is getting exported to the CSV. I used the following command to return all data in the moduleName, time, and device_ip (a field in the events array) fields:
mongoexport --host host-ipaddress --port 27017 --username admin --password password#123 --authenticationDatabase admin --db servername --collection alert --fields 'originalAlert.moduleName:1,originalAlert.time:1,originalAlert.events.0.device_ip:1' --query '{"receivedTime":{$gte:new Date(1583020800000), $lt:new Date(1585612800000)}}' --out /tmp/test.csv
Below is one object from alert collection in mongodb
{
"_id" : ObjectId("5e4bbb89208a6a8a435e064e"),
"receivedTime" : ISODate("2020-02-18T10:25:13.111Z"),
"status" : "GROUPED_IN_INCIDENT",
"originalHeaders" : {
"name" : "Name of the Alert",
"description" : null,
"version" : 0,
"severity" : 5,
"timestamp" : NumberLong(1582021513108),
"signatureId" : "30a9fedd3a7cb83dd66436057dd11445c6adfd242849c3813b38e62399128fd8",
"deviceVendor" : "ABC",
"deviceProduct" : "XYZ",
"deviceVersion" : "123"
},
"originalAlert" : {
"severity" : 5,
"eventSourceId" : "x.x.x.x:50005:406265417822",
"respondEnabled" : true,
"moduleType" : "BASIC",
"engineUri" : "Some Value",
"moduleName" : "Name of the Alert",
"suppressMessageBus" : false,
"transientAlert" : false,
"notificationReasons" : [
"Some-Value",
"Some-Value.2"
],
"actualEventsCount" : 3,
"instanceId" : "30a9fedd3a7cb83dd66436057dd11445c6adfd242849c3813b38e62399128fd8",
"statement" : "Module_5d7ccff0f28050b535cad89b_Alert",
"id" : "9bef15ce-7dc5-4445-838f-79d78d2d6ea6",
"time" : "Feb 18, 2020 10:25:13 AM UTC",
"moduleId" : "5d7ccff0f28050b535cad89b",
"events" : [
{
"msg" : "sshd[4719444]: Failed password for invalid user ISTOPR from x.x.x.x port 58134 ssh2",
"event_byte_size" : 386,
"ec_activity" : "Logon",
"header_id" : "0013",
"alias_host" : [
"some-hostname"
],
"event_cat_name" : "User.Activity.Failed Logins",
"ip_src" : "x.x.x.x",
"device_type" : "aix",
"sessionid" : NumberLong(406265417822),
"medium" : 32,
"inv_context" : [
"audit",
"compliance",
"authentication"
],
"rid" : NumberLong(444833155418),
"feed_name" : [
"investigation"
],
"event_cat" : 1401030000,
"forward_ip" : "x.x.x.x",
"alert_id" : [
"account:logon-failure"
],
"client" : "sshd",
"com_rsa_asoc_streams_source_trail" : [
"admin#x.x.x.x:50005.deployed-rules-sa-managed"
],
"msg_id" : "00003:05",
"device_disc" : 55,
"com_rsa_asoc_streams_stream" : "deployed-rules-sa-managed-stream",
"lc_cid" : "some-id",
"ec_subject" : "User",
"event_source_id" : "x.x.x.x:50005:406265417822",
"com_rsa_asoc_streams_arrival_sequence" : 1789715,
"esa_time" : NumberLong(1582021513102),
"ec_theme" : "Authentication",
"com_rsa_asoc_streams_arrival_timestamp" : NumberLong(1582021512436),
"device_disc_type" : "aix",
"inv_category" : [
"assurance",
"identity"
],
"device_ip" : "x.x.x.x",
"ip_srcport" : 58134,
"event_desc" : "Password failed",
"user_dst" : "invalid user ISTOPR",
"size" : 210,
"netname" : [
"private src"
],
"device_class" : "Unix",
"time" : NumberLong(1582021395000),
"ec_outcome" : "Failure",
"did" : "some-did"
},
{
"msg" : "sshd[4719444]: Failed password for invalid user ISTOPR from x.x.x.x port 58134 ssh2",
"event_byte_size" : 386,
"ec_activity" : "Logon",
"header_id" : "0013",
"alias_host" : [
"some-hostname"
],
"event_cat_name" : "User.Activity.Failed Logins",
"ip_src" : "x.x.x.x",
"device_type" : "aix",
"sessionid" : NumberLong(406265417824),
"medium" : 32,
"inv_context" : [
"audit",
"compliance",
"authentication"
],
"rid" : NumberLong(444833155420),
"feed_name" : [
"investigation"
],
"event_cat" : 1401030000,
"forward_ip" : "x.x.x.x",
"alert_id" : [
"account:logon-failure"
],
"client" : "sshd",
"com_rsa_asoc_streams_source_trail" : [
"admin#x.x.x.x:50005.deployed-rules-sa-managed"
],
"msg_id" : "00003:05",
"device_disc" : 55,
"com_rsa_asoc_streams_stream" : "deployed-rules-sa-managed-stream",
"lc_cid" : "some-id",
"ec_subject" : "User",
"event_source_id" : "x.x.x.x:50005:406265417824",
"com_rsa_asoc_streams_arrival_sequence" : 1789717,
"esa_time" : NumberLong(1582021513103),
"ec_theme" : "Authentication",
"com_rsa_asoc_streams_arrival_timestamp" : NumberLong(1582021512436),
"device_disc_type" : "aix",
"inv_category" : [
"assurance",
"identity"
],
"device_ip" : "x.x.x.x",
"ip_srcport" : 58134,
"event_desc" : "Password failed",
"user_dst" : "invalid user ISTOPR",
"size" : 210,
"netname" : [
"private src"
],
"device_class" : "Unix",
"time" : NumberLong(1582021395000),
"ec_outcome" : "Failure",
"did" : "some-did"
},
{
"msg" : "sshd[4719444]: Failed password for invalid user ISTOPR from x.x.x.x port 58134 ssh2",
"event_byte_size" : 386,
"ec_activity" : "Logon",
"header_id" : "0013",
"alias_host" : [
"some-hostname"
],
"event_cat_name" : "User.Activity.Failed Logins",
"ip_src" : "x.x.x.x",
"device_type" : "aix",
"sessionid" : NumberLong(406265417826),
"medium" : 32,
"inv_context" : [
"audit",
"compliance",
"authentication"
],
"rid" : NumberLong(444833155422),
"feed_name" : [
"investigation"
],
"event_cat" : 1401030000,
"forward_ip" : "x.x.x.x",
"alert_id" : [
"account:logon-failure"
],
"client" : "sshd",
"com_rsa_asoc_streams_source_trail" : [
"admin#x.x.x.x:50005.deployed-rules-sa-managed"
],
"msg_id" : "00003:05",
"device_disc" : 55,
"com_rsa_asoc_streams_stream" : "deployed-rules-sa-managed-stream",
"lc_cid" : "some-id",
"ec_subject" : "User",
"event_source_id" : "x.x.x.x:50005:406265417826",
"com_rsa_asoc_streams_arrival_sequence" : 1789719,
"esa_time" : NumberLong(1582021513103),
"ec_theme" : "Authentication",
"com_rsa_asoc_streams_arrival_timestamp" : NumberLong(1582021512436),
"device_disc_type" : "aix",
"inv_category" : [
"assurance",
"identity"
],
"device_ip" : "x.x.x.x",
"ip_srcport" : 58134,
"event_desc" : "Password failed",
"user_dst" : "invalid user ISTOPR",
"size" : 210,
"netname" : [
"private src"
],
"device_class" : "Unix",
"time" : NumberLong(1582021395000),
"ec_outcome" : "Failure",
"did" : "some-did"
}
],
"suppressNotification" : false
},
"alert" : {
"groupby_source_device_mac_address" : "",
"user_summary" : [
"invalid user ISTOPR"
],
"source" : "Event Stream Analysis",
"type" : [
"Log"
],
"groupby_user_src" : "",
"groupby_source_country" : "",
"grouby_src_device_dns_domain" : "",
"grouby_detector_dns_hostname" : "",
"groupby_analysis_file" : "",
"groupby_filename" : "",
"groupby_source_username" : "",
"groupby_detector_ip" : "x.x.x.x",
"events" : [
{
"agent_id" : "",
"data" : [
{
"filename" : "",
"size" : 210,
"hash" : ""
}
],
"destination" : {
"path" : "",
"file_SHA256" : "",
"filename" : "",
"launch_argument" : "",
"device" : {
"compliance_rating" : "",
"netbios_name" : "",
"port" : "",
"mac_address" : "",
"criticality" : "",
"asset_type" : "",
"ip_address" : "",
"facility" : "",
"business_unit" : "",
"geolocation" : {
"country" : "",
"city" : "",
"latitude" : null,
"organization" : "",
"domain" : "",
"longitude" : null
}
},
"user" : {
"email_address" : "",
"ad_username" : "",
"ad_domain" : "",
"username" : "invalid user ISTOPR"
},
"hash" : ""
},
"description" : "Password failed",
"domain_src" : "",
"device_type" : "aix",
"event_source" : "x.x.x.x:50005",
"source" : {
"path" : "",
"file_SHA256" : "",
"filename" : "",
"launch_argument" : "",
"device" : {
"compliance_rating" : "",
"netbios_name" : "",
"port" : 58134,
"mac_address" : "",
"criticality" : "",
"asset_type" : "",
"ip_address" : "x.x.x.x",
"facility" : "",
"business_unit" : "",
"geolocation" : {
"country" : "",
"city" : "",
"latitude" : null,
"organization" : "",
"domain" : "",
"longitude" : null
}
},
"user" : {
"email_address" : "",
"ad_username" : "",
"ad_domain" : "",
"username" : ""
},
"hash" : ""
},
"type" : "Log",
"analysis_file" : "",
"enrichment" : "",
"user_src" : "",
"hostname" : "some-hostname",
"analysis_service" : "",
"file" : "",
"detected_by" : "Unix-aix,x.x.x.x",
"process_vid" : "",
"host_src" : "",
"action" : "",
"operating_system" : "",
"alias_ip" : "",
"from" : "x.x.x.x:58134",
"timestamp" : ISODate("2020-02-18T10:23:15.000Z"),
"event_source_id" : "406265417822",
"related_links" : [
{
"type" : "investigate_original_event",
"url" : "/investigation/host/x.x.x.x:50005/navigate/event/AUTO/406265417822"
},
{
"type" : "investigate_destination_domain",
"url" : "/investigation/x.x.x.x:50005/navigate/query/alias.host%3D'some-hostname'%2Fdate%2F2020-02-18T10%3A13%3A15.000Z%2F2020-02-18T10%3A33%3A15.000Z"
}
],
"port_dst" : "",
"domain_dst" : "",
"user_dst" : "invalid user ISTOPR",
"host_dst" : "",
"size" : 210,
"domain" : "some-hostname",
"user_account" : "",
"to" : "",
"category" : "",
"detector" : {
"device_class" : "Unix",
"ip_address" : "x.x.x.x",
"product_name" : "aix"
},
"user" : "invalid user ISTOPR",
"analysis_session" : "",
"username" : ""
},
{
"agent_id" : "",
"data" : [
{
"filename" : "",
"size" : 210,
"hash" : ""
}
],
"destination" : {
"path" : "",
"file_SHA256" : "",
"filename" : "",
"launch_argument" : "",
"device" : {
"compliance_rating" : "",
"netbios_name" : "",
"port" : "",
"mac_address" : "",
"criticality" : "",
"asset_type" : "",
"ip_address" : "",
"facility" : "",
"business_unit" : "",
"geolocation" : {
"country" : "",
"city" : "",
"latitude" : null,
"organization" : "",
"domain" : "",
"longitude" : null
}
},
"user" : {
"email_address" : "",
"ad_username" : "",
"ad_domain" : "",
"username" : "invalid user ISTOPR"
},
"hash" : ""
},
"description" : "Password failed",
"domain_src" : "",
"device_type" : "aix",
"event_source" : "x.x.x.x:50005",
"source" : {
"path" : "",
"file_SHA256" : "",
"filename" : "",
"launch_argument" : "",
"device" : {
"compliance_rating" : "",
"netbios_name" : "",
"port" : 58134,
"mac_address" : "",
"criticality" : "",
"asset_type" : "",
"ip_address" : "x.x.x.x",
"facility" : "",
"business_unit" : "",
"geolocation" : {
"country" : "",
"city" : "",
"latitude" : null,
"organization" : "",
"domain" : "",
"longitude" : null
}
},
"user" : {
"email_address" : "",
"ad_username" : "",
"ad_domain" : "",
"username" : ""
},
"hash" : ""
},
"type" : "Log",
"analysis_file" : "",
"enrichment" : "",
"user_src" : "",
"hostname" : "some-hostname",
"analysis_service" : "",
"file" : "",
"detected_by" : "Unix-aix,x.x.x.x",
"process_vid" : "",
"host_src" : "",
"action" : "",
"operating_system" : "",
"alias_ip" : "",
"from" : "x.x.x.x:58134",
"timestamp" : ISODate("2020-02-18T10:23:15.000Z"),
"event_source_id" : "406265417824",
"related_links" : [
{
"type" : "investigate_original_event",
"url" : "/investigation/host/x.x.x.x:50005/navigate/event/AUTO/406265417824"
},
{
"type" : "investigate_destination_domain",
"url" : "/investigation/x.x.x.x:50005/navigate/query/alias.host%3D'some-hostname'%2Fdate%2F2020-02-18T10%3A13%3A15.000Z%2F2020-02-18T10%3A33%3A15.000Z"
}
],
"port_dst" : "",
"domain_dst" : "",
"user_dst" : "invalid user ISTOPR",
"host_dst" : "",
"size" : 210,
"domain" : "some-hostname",
"user_account" : "",
"to" : "",
"category" : "",
"detector" : {
"device_class" : "Unix",
"ip_address" : "x.x.x.x",
"product_name" : "aix"
},
"user" : "invalid user ISTOPR",
"analysis_session" : "",
"username" : ""
},
{
"agent_id" : "",
"data" : [
{
"filename" : "",
"size" : 210,
"hash" : ""
}
],
"destination" : {
"path" : "",
"file_SHA256" : "",
"filename" : "",
"launch_argument" : "",
"device" : {
"compliance_rating" : "",
"netbios_name" : "",
"port" : "",
"mac_address" : "",
"criticality" : "",
"asset_type" : "",
"ip_address" : "",
"facility" : "",
"business_unit" : "",
"geolocation" : {
"country" : "",
"city" : "",
"latitude" : null,
"organization" : "",
"domain" : "",
"longitude" : null
}
},
"user" : {
"email_address" : "",
"ad_username" : "",
"ad_domain" : "",
"username" : "invalid user ISTOPR"
},
"hash" : ""
},
"description" : "Password failed",
"domain_src" : "",
"device_type" : "aix",
"event_source" : "x.x.x.x:50005",
"source" : {
"path" : "",
"file_SHA256" : "",
"filename" : "",
"launch_argument" : "",
"device" : {
"compliance_rating" : "",
"netbios_name" : "",
"port" : 58134,
"mac_address" : "",
"criticality" : "",
"asset_type" : "",
"ip_address" : "x.x.x.x",
"facility" : "",
"business_unit" : "",
"geolocation" : {
"country" : "",
"city" : "",
"latitude" : null,
"organization" : "",
"domain" : "",
"longitude" : null
}
},
"user" : {
"email_address" : "",
"ad_username" : "",
"ad_domain" : "",
"username" : ""
},
"hash" : ""
},
"type" : "Log",
"analysis_file" : "",
"enrichment" : "",
"user_src" : "",
"hostname" : "some-hostname",
"analysis_service" : "",
"file" : "",
"detected_by" : "Unix-aix,x.x.x.x",
"process_vid" : "",
"host_src" : "",
"action" : "",
"operating_system" : "",
"alias_ip" : "",
"from" : "x.x.x.x:58134",
"timestamp" : ISODate("2020-02-18T10:23:15.000Z"),
"event_source_id" : "406265417826",
"related_links" : [
{
"type" : "investigate_original_event",
"url" : "/investigation/host/x.x.x.x:50005/navigate/event/AUTO/406265417826"
},
{
"type" : "investigate_destination_domain",
"url" : "/investigation/x.x.x.x:50005/navigate/query/alias.host%3D'some-hostname'%2Fdate%2F2020-02-18T10%3A13%3A15.000Z%2F2020-02-18T10%3A33%3A15.000Z"
}
],
"port_dst" : "",
"domain_dst" : "",
"user_dst" : "invalid user ISTOPR",
"host_dst" : "",
"size" : 210,
"domain" : "some-hostname",
"user_account" : "",
"to" : "",
"category" : "",
"detector" : {
"device_class" : "Unix",
"ip_address" : "x.x.x.x",
"product_name" : "aix"
},
"user" : "invalid user ISTOPR",
"analysis_session" : "",
"username" : ""
}
],
"grouby_detector_dns_domain" : "",
"host_summary" : [
"x.x.x.x:58134"
],
"groupby_username" : "",
"grouby_src_device_dns_hostname" : "",
"grouby_dst_usr_ad_username" : "",
"groupby_file_sha_256" : "",
"groupby_user_dst" : "invalid user ISTOPR",
"groupby_os" : "",
"grouby_src_usr_ad_domain" : "",
"name" : "Multiple Failed AIX Logins detected",
"groupby_host_src" : "",
"groupby_analysis_service" : "",
"groupby_destination_device_mac_address" : "",
"groupby_version" : "0",
"grouby_src_device_geolocation_domain" : "",
"destination_country" : [],
"groupby_type" : "Log",
"grouby_src_device_netbios_name" : "",
"groupby_device_type" : "aix",
"groupby_domain" : "some-hostname",
"grouby_dst_device_dns_hostname" : "",
"groupby_destination_country" : "",
"grouby_dst_usr_username" : "invalid user ISTOPR",
"grouby_dst_usr_ad_domian" : "",
"groupby_analysis_session" : "",
"signature_id" : "30a9fedd3a7cb83dd66436057dd11445c6adfd242849c3813b38e62399128fd8",
"groupby_data_hash" : "",
"groupby_domain_dst" : "",
"groupby_destination_ip" : "",
"groupby_host_dst" : "",
"grouby_dst_device_geolocation_domain" : "",
"grouby_dst_device_netbios_name" : "",
"groupby_source_ip" : "x.x.x.x",
"groupby_detector_mac_address" : "",
"timestamp" : ISODate("2020-02-18T10:25:13.108Z"),
"severity" : 50.0,
"related_links" : [
{
"type" : "investigate_session",
"url" : "/investigation/x.x.x.x:50005/navigate/query/sessionid%3D406265417822%7C%7Csessionid%3D406265417824%7C%7Csessionid%3D406265417826"
},
{
"type" : "investigate_device_ip",
"url" : "/investigation/x.x.x.x:50005/navigate/query/device.ip%3D10.192.30.44%2Fdate%2F2020-02-18T10%3A13%3A15.000Z%2F2020-02-18T10%3A33%3A15.000Z"
},
{
"type" : "investigate_src_ip",
"url" : "/investigation/x.x.x.x:50005/navigate/query/ip.src%3D10.192.8.167%2Fdate%2F2020-02-18T10%3A13%3A15.000Z%2F2020-02-18T10%3A33%3A15.000Z"
},
{
"type" : "investigate_destination_domain",
"url" : "/investigation/x.x.x.x:50005/navigate/query/alias.host%3D'some-hostname'%2Fdate%2F2020-02-18T10%3A13%3A15.000Z%2F2020-02-18T10%3A33%3A15.000Z"
}
],
"risk_score" : 50.0,
"grouby_dst_device_dns_domain" : "",
"grouby_src_usr_ad_username" : "",
"groupby_destination_port" : "",
"groupby_c2domain" : "",
"groupby_host_name" : "some-hostname",
"source_country" : [],
"groupby_domain_src" : "",
"numEvents" : 3,
"groupby_agent_id" : ""
},
"partOfIncident" : true,
"_class" : "com.rsa.asoc.respond.commons.domain.Alert",
"incidentCreated" : ISODate("2020-02-18T10:25:47.228Z"),
"incidentId" : "INC-1"
}
Sample Output:
Please let me know where I went wrong.
You are missing the --csv argument to the mongoexport command.
As of MongoDB v4.2.2 the --csv flag is deprecated but still works; the preferred option is to pass --type=csv to mongoexport.
There are also issues in the way the --fields and --query arguments are constructed.
mongoexport --host <host> --port <port> --db <db> --collection <collection> --type=csv --fields 'originalAlert.moduleName,originalAlert.time,originalAlert.events.0.device_ip' --out /tmp/test.csv
which produces:
originalAlert.moduleName,originalAlert.time,originalAlert.events.0.device_ip
Name of the Alert,"Feb 18, 2020 10:25:13 AM UTC",x.x.x.x
To filter by time range, I added --query '{"receivedTime":{"$gte": {"$date" : "<start-date>"}, "$lt": {"$date":"<end-date>"}}}':
mongoexport --host <host> --port <port> --db <db> --collection <collection> --type=csv --fields 'originalAlert.moduleName,originalAlert.time,originalAlert.events.0.device_ip' --query '{"receivedTime":{"$gte": {"$date" : "2020-02-18T00:00:01Z"}, "$lt": {"$date":"2020-02-19T00:00:01Z"}}}' --out /tmp/test.csv
This does filter the records appropriately.

Find distinct field in multiple same collections mongodb

I have 7 collections in one database. The collections have the same structure, and I need to find the distinct values of a field across all of them and either count them or store them in a new collection named result.
I tried $lookup but it did not work. Can anyone guide me on how to do this?
These are the 7 collections: 20200309, 20200310, 20200311, 20200312, 20200313, 20200314, 20200315; result is the collection for storing the query result.
col : 20200309
{
"_id" : ObjectId("5e6f7c7c0371c86b8737628b"),
"sid" : 13328,
"trans-id" : "PROV_158374123364907198165",
"status" : "1",
"base-price-point" : "6000",
"msisdn" : "989115506327",
"keyword" : "",
"validity" : 0,
"next_renewal_date" : "",
"shortcode" : "",
"billed-price-point" : "",
"trans-status" : 0,
"chargeCode" : "AVMREWCAVMAW6000",
"datetime" : "2020-03-09 11:37:13.649",
"event-type" : "1.5",
"channel" : "system"
}
{
"_id" : ObjectId("5e6f7c7c0371c86b8737628c"),
"sid" : 13328,
"trans-id" : "PROV_158374123384007267165",
"status" : "1",
"base-price-point" : "6000",
"msisdn" : "989107351827",
"keyword" : "",
"validity" : 0,
"next_renewal_date" : "",
"shortcode" : "",
"billed-price-point" : "",
"trans-status" : 0,
"chargeCode" : "AVMREWCAVMAW6000",
"datetime" : "2020-03-09 11:37:13.840",
"event-type" : "1.5",
"channel" : "system"
}
col : 20200310
{
"_id" : ObjectId("5e6f7d140371c86b873e6bce"),
"sid" : 13328,
"trans-id" : "PROV_158383144246275616515",
"status" : "1",
"base-price-point" : "6000",
"msisdn" : "989909789746",
"keyword" : "",
"validity" : 0,
"next_renewal_date" : "",
"shortcode" : "",
"billed-price-point" : "",
"trans-status" : 0,
"chargeCode" : "AVMREWCAVMAW6000",
"datetime" : "2020-03-10 12:40:42.462",
"event-type" : "1.5",
"channel" : "system"
}
{
"_id" : ObjectId("5e6f7d140371c86b873e6bcf"),
"sid" : 13328,
"trans-id" : "PROV_158382430015338271227",
"status" : "1",
"base-price-point" : "6000",
"msisdn" : "989901812412",
"keyword" : "",
"validity" : 0,
"next_renewal_date" : "",
"shortcode" : "",
"billed-price-point" : "",
"trans-status" : 0,
"chargeCode" : "AVMREWCAVMAW6000",
"datetime" : "2020-03-10 10:41:40.153",
"event-type" : "1.5",
"channel" : "system"
}
and there are 5 more collections like the above
The following script can be used to get distinct msisdn values in an array and the count. Run the script from the mongo shell.
For this to run efficiently, there needs to be a compound index on the two fields status and msisdn: { status: 1, msisdn: 1 }. The index needs to be created on all the collections. Note that the indexes are not required if the collections have just a few thousand documents.
collections = [ "20200309", "20200310", "20200311", "20200312", "20200313", "20200314", "20200315" ]
combined = [ ]
for (let coll of collections) {
combined = combined.concat( db.getCollection(coll).distinct( "msisdn", { status: "1" } ) )
}
combined = [...new Set(combined) ]
print('Combined count:', combined.length)
The combined = [...new Set(combined) ] statement removes duplicates from the combined distinct values from the seven collections.
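The dedup step relies on a Set keeping only unique values. The same logic can be checked in plain JavaScript, independent of the shell; the arrays below stand in for the per-collection distinct() results and are purely illustrative:

```javascript
// Simulated results of distinct("msisdn", { status: "1" }) from three collections
const perCollection = [
  ["989115506327", "989107351827"], // e.g. 20200309
  ["989909789746", "989107351827"], // e.g. 20200310 (shares one msisdn with the first)
  ["989901812412"]                  // e.g. 20200311
];

// Concatenate all per-collection results, then use a Set to drop
// msisdn values that appeared in more than one collection
let combined = [];
for (const values of perCollection) {
  combined = combined.concat(values);
}
combined = [...new Set(combined)];

console.log("Combined count:", combined.length); // 4 unique msisdn values
```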

how to copy subdocument into the same document

I need to copy one subdocument from a document and insert it back into the same document, but using forEach and findAndModify it does not get inserted into the same document.
Document example:
{
"_id" : ObjectId("59b5e84d71ab5580d643d070"),
"modifiedOn" : ISODate("2019-04-03T14:57:22.177+0000"),
"modifiedBy" : "XXX",
"createdOn" : ISODate("2017-09-09T16:33:34.464+0000"),
"createdBy" : "liuyu",
"channelSales" : [
{
"platform" : "amazon",
"channel" : "amazon_ca",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dc1")
},
{
"platform" : "amazon",
"channel" : "amazon_uk",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dc0")
},
{
"platform" : "amazon",
"channel" : "amazon_us",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dbf")
},
{
"platform" : "amazon",
"channel" : "amazon_jp",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dbe")
},
{
"platform" : "amazon",
"channel" : "amazon_de",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dbd")
},
{
"platform" : "amazon",
"channel" : "amazon_es",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dbc")
},
{
"platform" : "amazon",
"channel" : "amazon_fr",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dbb")
},
{
"platform" : "amazon",
"channel" : "amazon_it",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39dba")
},
{
"platform" : "ebay",
"channel" : "ebay_au",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db9")
},
{
"platform" : "ebay",
"channel" : "ebay_de",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db8")
},
{
"platform" : "ebay",
"channel" : "ebay_es",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db7")
},
{
"platform" : "ebay",
"channel" : "ebay_fr",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db6")
},
{
"platform" : "ebay",
"channel" : "ebay_it",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db5")
},
{
"platform" : "ebay",
"channel" : "ebay_uk",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db4")
},
{
"platform" : "ebay",
"channel" : "ebay_us",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("59b5f86aaa0ee15555a39db3")
},
{
"platform" : "walmart",
"channel" : "walmart_us",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("5a4d9de2bb1aee844f03e1a6")
},
{
"platform" : "walmart",
"channel" : "walmart_ca",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("5a4d9de2bb1aee844f03e1a5")
},
{
"platform" : "amazon",
"channel" : "amazon_au",
"saleStatus" : "T",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("5abe095bb1d48d194f6187c0")
},
{
"platform" : "amazon",
"channel" : "amazon_in",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("5c9af2776f3dcf04491818f2")
}
],
"statusLevel" : "",
"statusType" : "",
"status" : "A",
"skuId" : "abc001",
"__v" : NumberInt(3)
}
I want to copy:
{
"platform" : "walmart",
"channel" : "walmart_us",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("5a4d9de2bb1aee844f03e1a6")
}
and change "channel" to "walmart_dsv", keeping the other fields the same, like this:
{
"platform" : "walmart",
"channel" : "walmart_dsv",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"_id" : ObjectId("5a4d9de2bb1aee844f03e1a6")
}
and insert it into the same document.
I used this command:
db.getCollection("0521").aggregate([
{$unwind: "$channelSales"},
{$project: {platform: "$channelSales.platform",
channel: "$channelSales.channel",
saleStatus: "$channelSales.saleStatus",
type: "$channelSales.type",
url: "$channelSales.url",
remark: "$channelSales.remark",
isCaught: "$channelSales.isCaught",
_id: "$channelSales._id"
}},
{ $match : { "channel" : "amazon_us"} }
]).forEach(function(award_team){
if(award_team != null)
{
db.getCollection("0521").findAndModify(
{
query: {_id: award_team._id},
update: { $push: {channelSales: [ {platform: award_team.platform, channel: "walmart_dsv", saleStatus: award_team.saleStatus, type: award_team.type, url: award_team.url, remark: award_team.remark, isCaught: award_team.isCaught, _id: award_team._id }] } },
upsert: true,
});
}
});
but it adds a new document instead. What can I do?
You are making a mistake in the projection:
You are using the sub-document _id to update the main document, and you are using the option {upsert: true}, so whenever no document matches that _id a new document is inserted.
Update your $project stage as below:
db.collection.aggregate([
{$unwind: "$channelSales"},
{$project: {
_id: "$_id", // Main Document Id
platform: "$channelSales.platform",
channel: "$channelSales.channel",
saleStatus: "$channelSales.saleStatus",
type: "$channelSales.type",
url: "$channelSales.url",
remark: "$channelSales.remark",
isCaught: "$channelSales.isCaught",
channelSalesId: "$channelSales._id" // Sub-Document Id (Channel Sales Id)
}},
{ $match : { "channel" : "amazon_us"} }
])
Now you will get a response like below:
{
"_id" : ObjectId("59b5e84d71ab5580d643d070"),
"platform" : "amazon",
"channel" : "amazon_us",
"saleStatus" : "A",
"type" : "",
"url" : "",
"remark" : "",
"isCaught" : "0",
"channelSalesId" : ObjectId("59b5f86aaa0ee15555a39dbf")
}
Now you can use award_team._id to update the document:
db.getCollection("0521").findAndModify({
  query: { _id: award_team._id },   // now the main document's _id
  update: { $push: { channelSales: { platform: award_team.platform, channel: "walmart_dsv", saleStatus: award_team.saleStatus, type: award_team.type, url: award_team.url, remark: award_team.remark, isCaught: award_team.isCaught, _id: new ObjectId() } } }   // fresh id for the new sub-document
});
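One more detail worth noting in both snippets: in MongoDB, $push with an array value appends that whole array as a single element (you need $each to append its members individually), so pushing `[ {...} ]` leaves a nested array inside channelSales. The difference can be sketched in plain JavaScript:

```javascript
// Mimics $push: { channelSales: [ {...} ] } — the array itself becomes one element
const channelSales = [{ channel: "amazon_us" }];
channelSales.push([{ channel: "walmart_dsv" }]);
console.log(Array.isArray(channelSales[1])); // true — nested array, usually not intended

// Mimics $push: { channelSales: {...} } — the sub-document is appended directly
const fixed = [{ channel: "amazon_us" }];
fixed.push({ channel: "walmart_dsv" });
console.log(Array.isArray(fixed[1])); // false — a plain sub-document, as intended
```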

Issue with firebase query

My query:
let query = recentRef.queryOrderedByChild(FRECENT_GROUPID).queryEqualToValue(group_id)
query.observeSingleEventOfType(.Value, withBlock: { snapshot in
    print(snapshot)
})
My database structure is shown in the complete JSON below, and when I log the query it prints:
(/Recent {
ep = fb343534ca520c70fe35b0a316ea8e4c;
i = groupId;
sp = fb343534ca520c70fe35b0a316ea8e4c;
})
and I get Snap (Recent) <null> when I print(snapshot).
It's strange that it was working fine but suddenly stopped working.
EDIT:
Complete JSON:
{
"Message" : {
"fb343534ca520c70fe35b0a316ea8e4c" : {
"-Kp0jed1EZ5BLllL5_cm" : {
"createdAt" : 1.500046597341153E9,
"groupId" : "fb343534ca520c70fe35b0a316ea8e4c",
"objectId" : "-Kp0jed1EZ5BLllL5_cl",
"senderId" : "lI6SRppSboScWo5xVjcfLL82Ogr2",
"senderName" : "Test1 Test1",
"status" : "",
"text" : "hi",
"type" : "text",
"updatedAt" : 1.50004659734136E9
}
}
},
"Recent" : {
"-Kp0jecwejhzQbbm62CW" : {
"counter" : 0,
"createdAt" : 1.500046600967624E9,
"description" : "Test1 Test1",
"groupId" : "fb343534ca520c70fe35b0a316ea8e4c",
"lastMessage" : "hi",
"members" : [ "lI6SRppSboScWo5xVjcfLL82Ogr2", "fnRvHFpaoDhXqM1se7NoTSiWZIZ2" ],
"objectId" : "-Kp0jecwejhzQbbm62CV",
"picture" : "",
"type" : "private",
"updatedAt" : 1.500046600967647E9,
"userId" : "fnRvHFpaoDhXqM1se7NoTSiWZIZ2"
},
"-Kp0jed-FU1PXt1iPr29" : {
"counter" : 0,
"createdAt" : 1.500046600971885E9,
"description" : "Srikant Root",
"groupId" : "fb343534ca520c70fe35b0a316ea8e4c",
"lastMessage" : "hi",
"members" : [ "lI6SRppSboScWo5xVjcfLL82Ogr2", "fnRvHFpaoDhXqM1se7NoTSiWZIZ2" ],
"objectId" : "-Kp0jed-FU1PXt1iPr28",
"picture" : "https://s3.amazonaws.com/top500golfdev/uploads/profile/srikant.yadav#rootinfosol.com/profilepicture.jpg",
"type" : "private",
"updatedAt" : 1.500046600971896E9,
"userId" : "lI6SRppSboScWo5xVjcfLL82Ogr2"
}
},
"User" : {
"fnRvHFpaoDhXqM1se7NoTSiWZIZ2" : {
"createdAt" : 1.500045753102713E9,
"email" : "srikant.yadav#rootinfosol.com",
"firstname" : "Srikant",
"fullname" : "Srikant Yadav",
"handle" : "Srikant",
"lastname" : "Yadav",
"networkImage" : "https://s3.amazonaws.com/top500golfdev/uploads/profile/srikant.yadav#rootinfosol.com/profilepicture.jpg",
"objectId" : "fnRvHFpaoDhXqM1se7NoTSiWZIZ2",
"online" : false,
"updatedAt" : 1.500045753102731E9
},
"lI6SRppSboScWo5xVjcfLL82Ogr2" : {
"createdAt" : 1.500045791892967E9,
"email" : "test1#gmail.com",
"firstname" : "Test1",
"fullname" : "Test1 Test1",
"handle" : "test1",
"lastname" : "Test1",
"networkImage" : "",
"objectId" : "lI6SRppSboScWo5xVjcfLL82Ogr2",
"online" : false,
"updatedAt" : 1.500046571456235E9
}
}
}
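For reference, a query ordered by the child key "groupId" and equal to the group id should match both children of /Recent in the JSON above, since each has that groupId. This assumes the FRECENT_GROUPID constant resolves to exactly "groupId"; if it does not match the child key, the query returns null snapshots. The server-side filter can be sketched in plain JavaScript:

```javascript
// /Recent node from the question's JSON (reduced to the relevant fields)
const recent = {
  "-Kp0jecwejhzQbbm62CW": { groupId: "fb343534ca520c70fe35b0a316ea8e4c", userId: "fnRvHFpaoDhXqM1se7NoTSiWZIZ2" },
  "-Kp0jed-FU1PXt1iPr29": { groupId: "fb343534ca520c70fe35b0a316ea8e4c", userId: "lI6SRppSboScWo5xVjcfLL82Ogr2" },
};

// What queryOrderedByChild("groupId").queryEqualToValue(groupId) filters on
const groupId = "fb343534ca520c70fe35b0a316ea8e4c";
const matches = Object.entries(recent).filter(([, v]) => v.groupId === groupId);
console.log(matches.length); // 2 — both Recent entries should come back
```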

Need to count field based on value in mongodb

We have Mongo 3.2 and a collection named test5.
It has many fields and a nested array (undaries.couies.ZIPCodes.status).
The status field has a few values like Add1 and Add2.
I want to get the count where status = Add1 or Add2.
"undaries" : [
  {
    "couies" : [
      {
        "ZIPCodes" : [
          {
            "ZIPCode" : "60349",
            "city" : "Test",
            "household" : "Test2",
            "accounts" : "0",
            "SD" : "Y",
            "status" : "Add1",
            "lastUpdateDate" : "2017-01-24T09:39:56.417Z",
            "lastUpdateBy" : "Test"
          },
          {
            "ZIPCode" : "60234",
            "city" : "Test",
            "household" : "test1",
            "accounts" : "0",
            "SD" : "Y",
            "status" : "Add2",
            "lastUpdateDate" : "2017-01-24T09:39:56.417Z",
            "lastUpdateBy" : "Test"
          },
          {
            "ZIPCode" : "60235",
            "city" : "Test",
            "household" : "test1",
            "accounts" : "0",
            "SD" : "Y",
            "status" : "Add1",
            "lastUpdateDate" : "2017-01-24T09:39:56.417Z",
            "lastUpdateBy" : "Test"
          } ...
How do I get the total count of status based on its value?
Thanks & Regards.
You may use the count() method with the $in operator:
db.yourCollection.count({"undaries.couies.ZIPCodes.status":{$in : ["Add1", "Add2"]}})
count(query) is shorthand for .find(query).count().
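Note that count() counts matching documents, not array elements, so a document containing several matching statuses counts once. For a per-value tally you would typically $unwind each array level and $group by status in an aggregation pipeline; the logic can be sketched in plain JavaScript against the sample document above (field names taken from the question):

```javascript
// One document shaped like the question's test5 collection
const doc = {
  undaries: [{
    couies: [{
      ZIPCodes: [
        { ZIPCode: "60349", status: "Add1" },
        { ZIPCode: "60234", status: "Add2" },
        { ZIPCode: "60235", status: "Add1" },
      ],
    }],
  }],
};

// Flatten undaries -> couies -> ZIPCodes (what $unwind does at each level)
const zipCodes = doc.undaries
  .flatMap(u => u.couies)
  .flatMap(c => c.ZIPCodes);

// Tally per status value (what $group with { $sum: 1 } does)
const counts = {};
for (const z of zipCodes) {
  if (z.status === "Add1" || z.status === "Add2") {
    counts[z.status] = (counts[z.status] || 0) + 1;
  }
}

console.log(counts); // { Add1: 2, Add2: 1 }
```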