mongoimport --mode merge --type csv --collection not working - mongodb

Trying to import the following csv:
_id,receiver,month,accrualMonth,paymentData.bankCode,operation
573378aef3af68090023da7d,547517955021020200599440,2016-05,2016-04,41,Manual
When I run the following with mongoimport (MongoDB version 3.4.5):
mongoimport --db mean-dev --mode=merge --collection fulfilledpayments --type csv --headerline --file ~/Downloads/\Import.csv -vvvv
it prints the following log:
2018-04-04T20:51:25.331-0300 using upsert fields: [_id]
2018-04-04T20:51:25.332-0300 using 0 decoding workers
2018-04-04T20:51:25.332-0300 using 1 insert workers
2018-04-04T20:51:25.332-0300 will listen for SIGTERM, SIGINT, and SIGKILL
2018-04-04T20:51:25.360-0300 filesize: 139 bytes
2018-04-04T20:51:25.361-0300 using fields: _id,receiver,month,accrualMonth,paymentData.bankCode,operation
2018-04-04T20:51:25.381-0300 connected to: localhost
2018-04-04T20:51:25.381-0300 ns: mean-dev.fulfilledpayments
2018-04-04T20:51:25.381-0300 connected to node type: standalone
2018-04-04T20:51:25.381-0300 standalone server: setting write concern w to 1
2018-04-04T20:51:25.381-0300 using write concern: w='1', j=false, fsync=false, wtimeout=0
2018-04-04T20:51:25.381-0300 standalone server: setting write concern w to 1
2018-04-04T20:51:25.381-0300 using write concern: w='1', j=false, fsync=false, wtimeout=0
2018-04-04T20:51:25.382-0300 got line: [573378aef3af68090023da7d 547517955021020200599440 2016-05 2016-04 41 Manual]
2018-04-04T20:51:25.384-0300 imported 1 document
But nothing is actually imported into the database; the existing document remains untouched:
{
"_id" : ObjectId("573378aef3af68090023da7d"),
"creator" : "547517955021020200599440",
"amountTransferred" : 101.79,
"externalId" : "61fa09",
"date" : ISODate("2016-05-06T16:00:00.000-03:00"),
"payments" : [
ObjectId("559363f127c09e0900b679dd"),
ObjectId("55bc4c9170b99e090093e2a8"),
ObjectId("55e5175a3b2a8e090040d4cd"),
ObjectId("560cab8bad3c6a0900275f5a"),
ObjectId("563cc8d3f2db060900a8ba81"),
ObjectId("5661033e57d24c090035b191"),
ObjectId("568eac27eaa71c090074d5b0"),
ObjectId("56b2e691ced93a0900408267"),
ObjectId("56d905cb4c830809007e8355"),
ObjectId("56fee8063cdd4d0900776fa9"),
ObjectId("5732732e5d237d09008c57e2")
],
"__v" : 0
}
If I remove the --collection fulfilledpayments parameter, it imports into a new collection, but of course merge mode is pointless there, because the new collection doesn't contain the _id to be matched.

Maybe you should enclose your _id within ObjectId, like:
_id,receiver,month,accrualMonth,paymentData.bankCode,operation
ObjectId(573378aef3af68090023da7d),547517955021020200599440,2016-05,2016-04,41,Manual
https://docs.mongodb.com/manual/reference/program/mongoimport/#ex-mongoimport-merge
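If your mongoimport does not accept the ObjectId() syntax in CSV input (CSV values may be imported as plain strings, which would never match the existing ObjectId _id), a sketch of an alternative is to import Extended JSON instead, where $oid makes the type explicit. The file below is an assumption built from the question's CSV row, not part of the original answer:
Import.json (one document per line):
{"_id": {"$oid": "573378aef3af68090023da7d"}, "receiver": "547517955021020200599440", "month": "2016-05", "accrualMonth": "2016-04", "paymentData": {"bankCode": 41}, "operation": "Manual"}
mongoimport --db mean-dev --mode=merge --collection fulfilledpayments --file Import.json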

Related

elasticsearch 6 not allowing multiple types when trying to pipeline with mongo-connector

I am trying to push data from MongoDB 3.6 to Elasticsearch 6.1 using mongo-connector.
My records are:
db.administrators.find({}).pretty()
{
"_id" : ObjectId("5701d81893dc484c812b4fc1"),
"name" : "Test Naupada",
"username" : "adminn",
"ward" : "56a6129f44fc869f215fe3fe",
"password" : "nadmin"
}
rs0:PRIMARY> db.sub_ward_master.find({}).pretty()
{
"_id" : ObjectId("56a6129f44fc869f215fe3fe"),
"wardCode" : "3",
"wardName" : "Naupada",
"wardgeoCodes" : [],
"cityName" : "thane"
}
When I run mongo-connector I am getting following error:
OperationFailed: (u'1 document(s) failed to index.', [{u'index': {u'status': 400, u'_type': u'administrators', u'_index': u'smartjn', u'error': {u'reason': u'Rejecting mapping update to [smartjn] as the final mapping would have more than 1 type: [sub_ward_master, administrators]', u'type': u'illegal_argument_exception'}, u'_id': u'5701d81893dc484c812b4fc1', u'data': {u'username': u'adminn', u'ward': u'56a6129f44fc869f215fe3fe', u'password': u'nadmin', u'name': u'Test Naupada'}}}
Any help, anyone?
Thanks
ES 6 does not allow more than one type in a single index.
There's an open issue in the mongo-connector repo to support ES 6. Until that's solved, you should go with ES 5 instead.
You can do it in ES 6 by creating a new index for each document type (i.e. each collection in MongoDB) and using the -g flag to direct it to the new index.
For example:
mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager -n {db}.{collection_name} -g {new_index}.{document_type}
Refer to the mongo-connector wiki.
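For the question's namespaces, a concrete invocation might look like the following sketch (the target index name smartjn_admins is a hypothetical choice, not from the original answer):
mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager -n smartjn.administrators -g smartjn_admins.administrators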

Mongodb Query to CSV dump (mlab hosted mongodb)

I am querying an already populated mlab MongoDB database, and I want to store the resulting multiple documents in one single CSV file.
EDIT: output format of CSV file I hope to get:
uniqueid status date
191b117fcf5c 0 2017-03-01 15:26:28.217000
191b117fcf5c 1 2017-03-01 18:26:28.217000
MongoDB database document format is
{
"_id": {
"$oid": "58b6bcc00bd666355805a3ee"
},
"sensordata": {
"operation": "chgstatus",
"user": {
"status": "1",
"uniqueid": "191b117fcf5c"
}
},
"created_date": {
"date": "2017-03-01 17:51:17.216000"
}
}
Database name: mparking_sensor
Collection name: demo
The Python code to query it is as follows:
# -*- coding: utf-8 -*-
"""
Created on Wed Mar 01 18:55:18 2017
#author: Being_Rohit
"""
import pymongo
uri = 'mongodb://#####:*****#ds157529.mlab.com:57529/mparking_sensor'
client = pymongo.MongoClient(uri)
db = client.get_default_database().demo
print db
results = db.find()
f = open("mytest.csv", "w")
for record in results:
query1 = (record["sensordata"]["user"],record["created_date"])
print query1
print "done"
client.close()
EDIT: output format of query1 I am getting is:
({u'status': u'0', u'uniqueid': u'191b117fcf5c'}, {u'date': u'2017-03-01 17:51:08.263000'})
Does someone know the correct way to dump this data into a .csv file (with pandas or any other means), or another approach that suits further prediction-based analysis, such as linear regression?
Mongoexport will do the job for you. It can, uniquely among native MongoDB tools, export in CSV format, limited to a specific set of fields.
Your mongoexport command would be something like this:
mongoexport.exe \
--db mparking_sensor \
--collection demo \
--type=csv \
--fields sensordata.user.uniqueid,sensordata.user.status,created_date
That will export something like the following:
sensordata.user.uniqueid,sensordata.user.status,created_date
191b117fcf5c,0,2017-03-01T15:26:28.217000Z
191b117fcf5c,1,2017-03-01T18:26:28.217000Z
I was trying to export a collection to CSV using mlab's 'export collection' feature; they make it harder than it needs to be. So I used https://studio3t.com and connected using the standard MongoDB URI.
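Since the question also asks about doing this from Python, here is a minimal sketch using pymongo and the standard csv module. It assumes the document shape shown in the question; the URI credentials are placeholders:
import csv
import pymongo

# Placeholders standing in for the masked mlab credentials from the question.
uri = 'mongodb://user:password@ds157529.mlab.com:57529/mparking_sensor'
client = pymongo.MongoClient(uri)
collection = client.get_default_database()['demo']

with open('mytest.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['uniqueid', 'status', 'date'])  # header matching the desired output
    for record in collection.find():
        user = record['sensordata']['user']
        writer.writerow([user['uniqueid'], user['status'], record['created_date']['date']])

client.close()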

Can i use mongoexport --query <file> where file is a list of conditions

I have an array of ids stored in a file, and I want to retrieve their data from MongoDB,
so I looked into mongoexport. It seems the --query option only accepts inline JSON, rather than reading a large JSON document or array from a file. In my case, there are about 4000 ids stored in the file. Is there a solution to this?
I was able to use
mongoexport --db db --collection collection --fields name --csv -o ~/data.csv
but how do I read query conditions from a file?
For example, with Mongoid in a Rails application, the query would be Data.where(:_id.in => array).
Or is it possible to do this from the mongo shell by executing a JavaScript file?
Thanks
I believe you can use a JavaScript file to output the array you need.
You can use the printjson helper in your script. For example,
create a script.js JavaScript file as follows:
script.js:
printjson( db.albums.find({_id : 18}, {"images" : 1,"_id":0}).toArray() )
Call it as follows:
mongo test script.js > out.txt
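To read the ~4000 ids from a file, as the question asks, a variation on this (a sketch, assuming the ids are stored as a JSON array in a file named ids.json) could use the shell's cat() helper:
var ids = JSON.parse(cat('ids.json'));  // e.g. [1, 2, 3]
// If the file holds string ObjectIds, convert them first:
// ids = ids.map(function (s) { return ObjectId(s); });
printjson(db.albums.find({ _id: { $in: ids } }).toArray());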
In my local environment the albums collection has the following structure:
db.albums.findOne({"_id": 18})
{
"_id" : 18,
"images" : [
2926,
5377,
8036,
9023,
10119,
11543,
12305,
12556,
12576,
13753,
14414,
14865,
15193,
15933,
17156,
17314,
17391,
20168,
21705,
22016,
22348,
23036,
23452,
24112,
27086,
27310,
27864,
28092,
29184,
29190,
29250,
29354,
29454,
29563,
30366,
30619,
31390,
31825,
31906,
32339,
32674,
33307,
33844,
37475,
37976,
38717,
38774,
39801,
41369,
41752,
44977,
45384,
45643,
46918,
47069,
50099,
52755,
54314,
54497,
62338,
63438,
63572,
63600,
65631,
66953,
67160,
67369,
69802,
71087,
71127,
71282,
73123,
73201,
73954,
74972,
76279,
77054,
78397,
78645,
78936,
79364,
79707,
83065,
83142,
83568,
84160,
85391,
85443,
85488,
86143,
86240,
86949,
89406,
89846,
92591,
92639,
92655,
93844,
93934,
94987,
95324,
95431,
95817,
95864,
96230,
96975,
97026
]
}
so the output I got was:
$ cat out.txt
MongoDB shell version: 2.2.1
connecting to: test
[
{
"images" : [
2926,
5377,
8036,
9023,
10119,
11543,
12305,
12556,
12576,
13753,
14414,
14865,
15193,
15933,
17156,
17314,
17391,
20168,
21705,
22016,
22348,
23036,
23452,
24112,
27086,
27310,
27864,
28092,
29184,
29190,
29250,
29354,
29454,
29563,
30366,
30619,
31390,
31825,
31906,
32339,
32674,
33307,
33844,
37475,
37976,
38717,
38774,
39801,
41369,
41752,
44977,
45384,
45643,
46918,
47069,
50099,
52755,
54314,
54497,
62338,
63438,
63572,
63600,
65631,
66953,
67160,
67369,
69802,
71087,
71127,
71282,
73123,
73201,
73954,
74972,
76279,
77054,
78397,
78645,
78936,
79364,
79707,
83065,
83142,
83568,
84160,
85391,
85443,
85488,
86143,
86240,
86949,
89406,
89846,
92591,
92639,
92655,
93844,
93934,
94987,
95324,
95431,
95817,
95864,
96230,
96975,
97026
]
}
]
Regards,
Moacy

mongoexport too many options error while creating changelog

Trying to use mongoexport to export a CSV of the oplog... I have tried all the quote combinations I have read about so far:
../mongodb/bin/mongoexport --csv -d local -c oplog.rs -o export.csv -f {op,ns,o._id} -q "{ts: { \"$gte\": Timestamp(1355100998000,1)} , op :{ \"$nin\" : [\"c\",\"n\"]}"
but I keep getting
ERROR: too many positional options
.....
what could be wrong?
After a lot of screwing around, I tried this:
q="{op: { \$nin: [\"c\",\"n\"]}}"
mongoexport --csv -d local -c oplog.rs -o export.csv -f {op,ns,o._id} -q "$q"
and this works like a charm.
but still this
q="{ts: { \$gte: Timestamp(1355100998000,1)}, op: { \$nin: [\"c\",\"n\"]}}"
../mongodb/bin/mongoexport --csv --db local --collection oplog.rs -o changelog.csv --fields op,ns -q "$q"
does not work. Output:
Assertion: 10340:Failure parsing JSON string near: ts: { $gte
I feel something is wrong with Timestamp()?
So finally this is how it should be done... or how I did it. It is pretty fast: tried it on 30000 records, it takes at most 2 seconds.
All that's happening is that I am storing the results in a new collection by running mongo with the --eval option:
q="db.oplog.rs.find({ ts : { \$gte : Timestamp( $timestamp, 1)}, op : { \$nin : [\"c\",\"n\"] } }, { op : 1 , ns : 1 , \"o._id\" : 1 , h : 1 } ).forEach(function(x){db.changelog.save(x);})"
../mongodb/bin/mongo localhost:27017/local --eval "$q"
and then exporting it as .csv using mongoexport:
../mongodb/bin/mongoexport --csv --db local --collection changelog -o changelog.csv --fields "o._id","op","ns","h"
and removing the temporary collection to support future changelogs:
../mongodb/bin/mongo localhost:27017/local --eval 'db.changelog.remove()'
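For what it's worth, the underlying problem is that the value passed to -q must be parseable JSON, and Timestamp(...) is mongo shell syntax, not JSON. A sketch of a direct export (an assumption: it requires a mongoexport recent enough to accept Extended JSON, and note that the $timestamp t field is in seconds since the epoch, so the original millisecond-looking value needs dividing by 1000):
mongoexport --db local --collection oplog.rs --type=csv --fields op,ns,o._id -o changelog.csv -q '{"ts": {"$gte": {"$timestamp": {"t": 1355100998, "i": 1}}}, "op": {"$nin": ["c", "n"]}}'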

mongo dbname --eval 'db.collection.find()' does not work

Why does this work:
# mongo dbname
MongoDB shell version: 1.8.3
connecting to: nextmuni_staging
> db.collection.find()
{ "foo" : "bar" }
> bye
While this does not work:
# mongo localhost/dbname --eval 'db.collection.find()'
MongoDB shell version: 1.8.3
connecting to: localhost/dbname
DBQuery: dbname.collection -> undefined
It should be exactly the same, no?
Thanks!
The return value of db.collection.find() is a cursor. Executing this command from within the shell will create a cursor and show you the first page of data. You can keep going through the rest by repeating the 'it' command.
I think the scope of variables used during the execution of an eval'd script lasts only for the lifetime of the script (data can be persisted into collections, of course), so once the script terminates those cursor variables no longer exist, and you would not be able to send another eval script to page through the data. So the behaviour you get during a shell session wouldn't really work from an eval script.
To get close to the behaviour you could run something like this:
mongo dbname --eval "db.collection.find().forEach(printjson)"
That shows you that the command does execute and produce a cursor which you can then iterate over sending the output to stdout.
Edit: I think the point I was trying to make was that the command you are issuing is working; it's just that the output is not what you expect.
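Another option along the same lines (a sketch using the standard toArray() and printjson helpers) is to materialize the cursor and print the whole result at once:
mongo dbname --eval 'printjson(db.collection.find().toArray())'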
The printjson function covers a lot of ground when scripting with mongo --eval '...'. Rather than chaining .forEach, you can simply wrap your call:
$ mongo --eval 'db.stats_data.stats()' db_name
MongoDB shell version: 2.4.14
connecting to: db_name
[object Object]
$ mongo --eval 'db.stats_data.stats().forEach(printjson)' db_name
MongoDB shell version: 2.4.14
connecting to: db_name
Tue Jan 10 15:32:11.961 TypeError: Object [object Object] has no method 'forEach'
$ mongo --eval 'printjson(db.stats_data.stats())' db_name
MongoDB shell version: 2.4.14
connecting to: db_name
{
"ns" : "db_name.stats_data",
"count" : 5516290,
"size" : 789938800,
"avgObjSize" : 143.20110073980882,
"storageSize" : 1164914688,
"numExtents" : 18,
"nindexes" : 3,
"lastExtentSize" : 307515392,
"paddingFactor" : 1.0000000000000457,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 1441559616,
"indexSizes" : {
"_id_" : 185292688,
"owner_id_key_idx" : 427678384,
"onwer_metric_key_idx" : 828588544
},
"ok" : 1
}