Remove duplicate characters from a string in MongoDB

I want to remove duplicate characters from strings in MongoDB.
Example:
Input string: xxxyzzxcdv
Output string: xyzcdv

Query
$reduce over the range of indexes of the string (0 up to its length)
keep 2 values in the accumulator: {"previous": [], "string": ""} (the initial value of the reduce)
get the current character with {"$substrCP": ["$mystring", "$$this", 1]}
$$this is the current index in the string, and I take the 1 character that starts there
if the character is already in previous, keep "string" as it is; otherwise append the character to both previous and "string"
Example with the string "heelo":
reduce on (0 1 2 3 4) `{"$range": [0, {"$strLenCP": "$mystring"}]}`
we start from `{"previous": [], "string": ""}`
- get 1 character starting from index 0
`{"$substrCP": ["$mystring", "$$this", 1]}` = "h"
- if this character is already in previous, don't add it
`{"$in": ["$$cur_char", "$$value.previous"]}`
- otherwise add it to previous and to the string (the 2 concats in the code)
Repeat for index (`$$this`) = 1
- get 1 character starting from index 1
`{"$substrCP": ["$mystring", "$$this", 1]}` = "e"
.....
aggregate(
[{"$set":
{"mystring":
{"$getField":
{"field": "string",
"input":
{"$reduce":
{"input": {"$range": [0, {"$strLenCP": "$mystring"}]},
"initialValue": {"previous": [], "string": ""},
"in":
{"$let":
{"vars": {"cur_char": {"$substrCP": ["$mystring", "$$this", 1]}},
"in":
{"$cond":
[{"$in": ["$$cur_char", "$$value.previous"]},
"$$value",
{"previous":
{"$concatArrays": ["$$value.previous", ["$$cur_char"]]},
"string":
{"$concat": ["$$value.string", "$$cur_char"]}}]}}}}}}}}}])
Edit
The second query removes only the duplicates we choose.
Query
it removes duplicates only for the characters in the array, here only ["x"]
I removed $getField because it is only available in MongoDB 5.0+; the final $set stage extracts the string instead
aggregate(
[{"$set":
{"mystring":
{"$reduce":
{"input": {"$range": [0, {"$strLenCP": "$mystring"}]},
"initialValue": {"previous": [], "string": ""},
"in":
{"$let":
{"vars": {"cur_char": {"$substrCP": ["$mystring", "$$this", 1]}},
"in":
{"$cond":
[{"$and":
[{"$in": ["$$cur_char", "$$value.previous"]},
{"$in": ["$$cur_char", ["x"]]}]},
"$$value",
{"previous":
{"$concatArrays": ["$$value.previous", ["$$cur_char"]]},
"string":
{"$concat": ["$$value.string", "$$cur_char"]}}]}}}}}}},
{"$set": {"mystring": "$mystring.string"}}])
Edit
If you need to use this aggregation in an update, you can use it as a pipeline update, like:
update({},
[{"$set": ......])
Check your driver's documentation for how to run an update with a pipeline; in Java it looks like the above. Alternatively, run it as a database command.
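For mongosh, a pipeline update could look roughly like the sketch below. The helper `buildDedupPipeline` and the collection name `mycollection` are hypothetical; the stages are the same as in the second query, without the ["x"] filter.

```javascript
// Hypothetical helper: builds the update pipeline for a given string field.
function buildDedupPipeline(field) {
  const path = "$" + field;
  return [
    { $set: { [field]: { $reduce: {
      input: { $range: [0, { $strLenCP: path }] },
      initialValue: { previous: [], string: "" },
      in: { $let: {
        vars: { cur_char: { $substrCP: [path, "$$this", 1] } },
        in: { $cond: [
          { $in: ["$$cur_char", "$$value.previous"] },
          "$$value",
          { previous: { $concatArrays: ["$$value.previous", ["$$cur_char"]] },
            string: { $concat: ["$$value.string", "$$cur_char"] } },
        ] },
      } },
    } } } },
    // Keep only the de-duplicated string (works without $getField).
    { $set: { [field]: path + ".string" } },
  ];
}

const pipeline = buildDedupPipeline("mystring");
// In mongosh (collection name is an assumption):
// db.mycollection.updateMany({}, pipeline);
console.log(JSON.stringify(pipeline, null, 2));
```

Passing an array as the second argument is what makes it a pipeline update (MongoDB 4.2+).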

Let's assume we have the following collection and the records inside of it:
db.messages.insertMany([
{
"message": "heelllo theeere"
},
{
"message": "may thhhee forrrce be wiithh yyouuu"
},
{
"message": "execute orrrrdder 66"
}
])
Since it is not clear which one you need, I am providing solutions for both cases: removing the duplicates at query time and updating the records permanently.
If you want to remove them while running your aggregation query:
In addition to @Takis's solution, the $function aggregation operator is another option if your MongoDB version is 4.4 or higher.
Further readings on $function operator
// Query via Aggregation Framework
db.messages.aggregate([
{
$match: {
message: {
$ne: null
}
}
},
{
$addFields: {
distinctChars: {
$function: {
body: function (message) {
return message
.split("")
.filter(function (item, pos, self) {
return self.indexOf(item) == pos;
})
.join("");
},
args: ["$message"],
lang: "js"
}
}
}
},
])
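Since the body of $function is ordinary JavaScript, it can be tried outside MongoDB first. Below is a shorter variant as a sketch, assuming the server's JavaScript engine supports `Set` and spread syntax (not verified here for every version). The name `distinctChars` is only used for this example.

```javascript
// Candidate body for $function, runnable in Node for a quick check.
function distinctChars(message) {
  // A Set keeps first-insertion order, and iterating a string yields
  // code points, so characters outside the BMP are not split in half.
  return [...new Set(message)].join("");
}

console.log(distinctChars("heelllo theeere"));                     // helo tr
console.log(distinctChars("may thhhee forrrce be wiithh yyouuu")); // may theforcbwiu
console.log(distinctChars("execute orrrrdder 66"));                // excut ord6
```

The `split("")` version above works on UTF-16 code units instead, which only matters for characters such as emoji.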
If you want to remove them via an update operation:
// Update each document using cursor
db.messages.find({ message: { $ne: null } })
.forEach(doc => {
var distinctChars = doc.message
.split("")
.filter(function (item, pos, self) {
return self.indexOf(item) == pos;
})
.join("");
// Plain update document: the value is stored literally, even if it starts with "$"
db.messages.updateOne({ _id: doc._id }, { $set: { distinctChars: distinctChars } });
});
A quick reminder: the script above just shows a simple way to update the records, without focusing on other details. It can be an expensive operation depending on the size and configuration of your real-world collection (sharding, for instance). Consider adapting it to your own setup.
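One possible way to reduce the number of round trips is to collect the updates and send them with `bulkWrite`. This is only a sketch: `buildBulkOps` and `dedup` are hypothetical helper names, and for a large collection you would probably want to send the operations in batches rather than all at once.

```javascript
// Same de-duplication logic as in the scripts above.
function dedup(message) {
  return message
    .split("")
    .filter((item, pos, self) => self.indexOf(item) === pos)
    .join("");
}

// Hypothetical helper: turns documents into bulkWrite operations.
function buildBulkOps(docs) {
  return docs
    .filter((doc) => typeof doc.message === "string") // skip null or missing
    .map((doc) => ({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { distinctChars: dedup(doc.message) } },
      },
    }));
}

const ops = buildBulkOps([
  { _id: 1, message: "heelllo theeere" },
  { _id: 2, message: null },
]);
console.log(JSON.stringify(ops));
// In mongosh, something like:
// db.messages.bulkWrite(buildBulkOps(db.messages.find().toArray()));
```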
Result
With both approaches, the result should look like the following:
[
{
"_id": {
"$oid": "618d95ccdedc26d80875b75a"
},
"message": "heelllo theeere",
"distinctChars": "helo tr"
},
{
"_id": {
"$oid": "618d95ccdedc26d80875b75b"
},
"message": "may thhhee forrrce be wiithh yyouuu",
"distinctChars": "may theforcbwiu"
},
{
"_id": {
"$oid": "618d95ccdedc26d80875b75c"
},
"message": "execute orrrrdder 66",
"distinctChars": "excut ord6"
}
]

Related

How to insert into the smallest sub-array with MongoDB?

I have a game collection that looks like this:
[
{
_id: ObjectId("6314dc4de4ad4c8141ce0b08"),
status: "started",
channel: "myChannel",
teams: [
{
name: "myFirstTeam",
score: 0,
users: [
{
id: 9082376,
name: "myFirstUser"
},
{
id: 289168,
name: "mySecondUser"
},
]
},
{
name: "mySecondTeam",
score: 0,
users: [
{
id: 898323,
name: "myThirdUser"
}
]
}
]
}
]
I managed to add a user to a team of a specific size:
db.collection.update({
"channel": "myChannel",
"teams.users": {
$size: 1
}
},
{
$push: {
"teams.$.users": {
id: 23424234,
name: "myUserName"
}
}
})
My goal is to add a user to the smallest team of a specific game. I'm new to MongoDB and I don't even know if that's possible with a single request. I saw $min and $count but I can't figure out how to use them.
You can try on this playground
BONUS: Check to make sure the userId added is not already on any team of this game, inside the query (but I can check that before or/and after)
Query
an update pipeline, because you need a more complicated update
it could be smaller with more queries, but this does everything in 1 query
find the number of members of the smallest team, for example size=1
reduce the teams, with initial value {"added": false, "new-teams": []}
if we are at the first team with the smallest size, we add the new member and set added to true; otherwise we just add the team as it is
*I didn't add a way to check whether the user already exists. It's not hard to add, but I didn't know what should happen in each case. The query is big and you are new to MongoDB, so I don't know if it will help you, but I can't think of a smaller way to do it with 1 query right now.
updateOne(
{"channel": {"$eq": "myChannel"}},
[{"$set": {"new-member": {"id": 23424234, "name": "myUserName"}}},
{"$set":
{"min-size":
{"$min":
{"$map":
{"input": "$teams", "as": "t", "in": {"$size": "$$t.users"}}}}}},
{"$set":
{"teams":
{"$reduce":
{"input": "$teams",
"initialValue": {"added": false, "new-teams": []},
"in":
{"$let":
{"vars": {"v": "$$value", "t": "$$this"},
"in":
{"$cond":
[{"$or":
["$$v.added", {"$gt":
[{"$size": "$$t.users"}, "$min-size"]}]},
{"added": "$$v.added",
"new-teams": {"$concatArrays": ["$$v.new-teams", ["$$t"]]}},
{"added": true,
"new-teams":
{"$concatArrays":
["$$v.new-teams",
[{"$mergeObjects":
["$$t",
{"users":
{"$concatArrays":
["$$t.users", ["$new-member"]]}}]}]]}}]}}}}}}},
{"$set":
{"teams": {"$getField": {"field": "new-teams", "input": "$teams"}}}},
{"$unset": ["new-member", "min-size"]}])

mongo - accessing a key-value field while filtering

I have these data:
myMap = {
"2": "facing",
"3": "not-facing",
"1": "hidden"
}
stages = [
{
"k": 1,
"v": "hidden"
},
{
"k": 2,
"v": "facing"
},
{
"k": 3,
"v": "not-facing"
}
]
and an aggregate query, but I'm missing the syntax to dynamically fetch the map data:
db.MyCollection.aggregate()
.addFields({
myMaps: myMap
})
.addFields({
stages: stages
})
.addFields({
process: {
$filter: {
input: stages,
as: stageData,
cond: {$eq: [$$stageData.v, $myMaps[$$stageData.k]]}
}
}
})
As you may have noticed, this syntax: $myMaps[$$stageData.k] doesn't work. How should I access myMaps based on the value of k in stageData?
Query
like in your query, set mymap and stages as extra fields
mymapKeys is an extra field added (you can $unset it at the end)
filter the stages: if stage.k is contained in mymapKeys, we keep that member
*not sure if this is what you need, but it looks like it from your query
In the MongoDB query language we don't have getField(doc, $$k), we only have getField(doc, constant_string), which can't be used in your case,
so here it costs more than a hashmap lookup; it's a linear cost (checking whether a member is in the array). For arrays we have $arrayElemAt(array, $$k), so if those numbers are always in sequence (1, 2, 3, etc.) you might be able to use arrays instead of objects.
aggregate(
[{"$set": {"mymap": {"2": "facing", "3": "not-facing", "1": "hidden"}}},
{"$set":
{"mymapKeys":
{"$map": {"input": {"$objectToArray": "$mymap"}, "in": "$$this.k"}}}},
{"$set":
{"stages":
[{"k": 1, "v": "hidden"}, {"k": 2, "v": "facing"},
{"k": 3, "v": "not-facing"}]}},
{"$set":
{"process":
{"$filter":
{"input": "$stages",
"cond": {"$in": [{"$toString": "$$this.k"}, "$mymapKeys"]}}}}}])

MongoDB - Getting rid of duplicate objects inside an array while keeping their original order

I have two collections with the following documents
Collection #1
{"_id": "1", "posts": [{"text": "all day long I dream about", "datetime": "123"}, {"text": "all day long ", "datetime": "321"}]}
Collection #2
{"_id": "1", "posts": [{"text": "all day long I dream about", "datetime": "123"}, {"text": "all day long ", "datetime": "8888"}, {"text": "I became very hungry after watching this movie...", "datetime": "8885"}]}
I'm merging collection #1 into collection #2
db.collection_1.aggregate([
{
'$merge': {'into': 'collection_2',
'on': '_id',
'whenMatched': [
{'$addFields': {
'posts': {
'$concatArrays': [f'$posts',
f'$$new.posts']}}}],
'whenNotMatched': 'insert'
}
}
])
By this merge the posts field contains all the 5 posts, including duplicates (the post with "text": "all day long I dream about").
At this point I wish to remove duplicates for posts. I'm doing this by using the following function
db.collection_2.aggregate([{"$project": {
"posts": {"$setUnion": ["$posts", "$posts"]}}},
{'$out': collection_2}])
This function works perfectly, all duplicate posts are gone. My problem is that because I'm using $setUnion I'm losing the original order of the posts.
I wish to do everything directly on MongoDB server.
Any suggestions on how I can remove duplicate posts while preserving the original order of them?
Merge the arrays manually and remove the duplicates:
$reduce over the concatenated array (which still has duplicates)
the reducer:
uses $cond to check whether the element $$this is already $in the carried result $$value
if so, returns $$value unchanged
otherwise appends it to the array with $concatArrays
The code (the changed part is in $addFields):
db.collection_1.aggregate([
{
'$merge': {'into': 'collection_2',
'on': '_id',
'whenMatched': [
{
"$addFields": {
"posts": {
$reduce: {
input: {
"$concatArrays": [
"$posts",
"$$new.posts"
]
},
initialValue: [],
in: {
"$cond": {
"if": {
"$in": [
"$$this",
"$$value"
]
},
"then": "$$value",
"else": {
"$concatArrays": [
"$$value",
[
"$$this"
]
]
}
}
}
}
}
}
}
],
'whenNotMatched': 'insert'
}
}
])
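To see what the reducer does to the sample data, here is a small plain-JavaScript sketch. The name `dedupKeepOrder` is made up, and `JSON.stringify` is used as a stand-in for MongoDB's document equality, which (like this stand-in) is sensitive to field order.

```javascript
// Illustrative sketch: order-preserving de-duplication of objects.
function dedupKeepOrder(items) {
  return items.reduce((value, current) => {
    const key = JSON.stringify(current);
    const seen = value.some((v) => JSON.stringify(v) === key); // the $in check
    return seen ? value : [...value, current];                 // the $concatArrays step
  }, []);
}

const existing = [ // posts already in collection_2 ("$posts")
  { text: "all day long I dream about", datetime: "123" },
  { text: "all day long ", datetime: "8888" },
  { text: "I became very hungry after watching this movie...", datetime: "8885" },
];
const incoming = [ // posts from collection_1 ("$$new.posts")
  { text: "all day long I dream about", datetime: "123" },
  { text: "all day long ", datetime: "321" },
];

const merged = dedupKeepOrder([...existing, ...incoming]);
console.log(merged.map((p) => p.datetime)); // [ '123', '8888', '8885', '321' ]
```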

Convert array of objects, to object

I have a mongo db record like this
items: [
0: {
key: name,
value: y
},
1: {
key: module,
value: z
}
]
And per record my expected output is this
{
name : y,
module : z
}
What is the best MongoDB aggregation I should apply to achieve this? I have tried a few ways but I am unable to produce this output.
We can't use unwind otherwise it will break the relationship between name and module.
Query
the items array is almost ready to become a document {}
$map to rename key->k and value->v (those are the field names MongoDB needs)
$arrayToObject to turn items from [] into {}
then merge the items document with the root document
and exclude items with $project (we don't need it anymore, its fields are part of the root document now)
*you can combine the 2 $set stages into 1; I wrote them as 2 to keep it simpler
aggregate(
[{"$set":
{"items":
{"$map":
{"input": "$items", "in": {"k": "$$this.key", "v": "$$this.value"}}}}},
{"$set": {"items": {"$arrayToObject": ["$items"]}}},
{"$replaceRoot": {"newRoot": {"$mergeObjects": ["$items", "$$ROOT"]}}},
{"$project": {"items": 0}}])
aggregate(
[{
$project: {
items: {
$map: {
input: "$items",
"in": {
"k": "$$this.key",
"v": "$$this.value"
}
}
}
}
}, {
$project: {
items: {
$arrayToObject: "$items"
}
}
}])

Comparing number values as strings in MongoDb

I am currently trying to compare two strings which are numbers in a find query. I need to use strings because the numbers would cause an overflow in Javascript if I save them as such.
The value to compare against comes from an API call, and the stored values are inside an array:
const auction = await this.findOne({
$and: [{
$or: [
{ 'bids.amount': amount },
{ 'bids.signature': signature },
{ 'bids.amount': { $gte: amount } }
]
}, { 'tokenId': tokenId }, { 'isActive': true }]
});
How would I change the query in order to handle the strings as numbers, so my comparison would actually be correct?
The query below assumes that bids is an array like
bids=[{"amount" : "1224245" , "signature" : "234454523"} ...]
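If signature is not a number, remove the $toLong from signature. tokenId_VAR, amount_VAR and signature_VAR are placeholders for your own values; pass amount_VAR as a number (for example a long), otherwise the $gte compares a number with a string.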
aggregate(
[{"$match":
{"$expr":
{"$and":
[{"$eq": ["$tokenId", "tokenId_VAR"]},
{"$eq": ["$isActive", true]},
{"$reduce":
{"input": "$bids",
"initialValue": false,
"in":
{"$or":
["$$value",
{"$gte": [{"$toLong": "$$this.amount"}, "amount_VAR"]},
{"$eq": [{"$toLong": "$$this.signature"}, "signature_VAR"]}]}}}]}}}])