Is it ideal that MongoDB is using 150 MB of memory?

This is my first project using MongoDB.
I have hosted it on a Linode (a Xen-based VPS) and I'm checking memory usage with "top".
The mongod process seems to use around 150 MB of memory. There were no connections to it when I checked. I use RockMongo to administer it. My main database stats are:
Size - 464m
Storage Size - 83.99m
Data Size - 66.4m
Index Size - 49.33m
Collections - 5
Objects - 584850
A lot of queries happen when the cron job is running, around 75 per minute or even more. But, as I said earlier, when I checked the memory usage, there were no connections.
Output of db.serverStatus();
Note - I had restarted mongod before running db.serverStatus(); and the memory usage was 40 MB.
{
"retval": {
"version": "1.6.5",
"uptime": 790,
"uptimeEstimate": 783,
"localTime": "Mon, 07 Feb 2011 00: 51: 04 -0500",
"globalLock": {
"totalTime": 790027671,
"lockTime": 376381,
"ratio": 0.00047641495838188,
"currentQueue": {
"total": 0,
"readers": 0,
"writers": 0
}
},
"mem": {
"bits": 64,
"resident": 38,
"virtual": 957,
"supported": true,
"mapped": 288
},
"connections": {
"current": 2,
"available": 9598
},
"extra_info": {
"note": "fields vary by platform",
"heap_usage_bytes": 152448,
"page_faults": 0
},
"indexCounters": {
"btree": {
"accesses": 1,
"hits": 1,
"misses": 0,
"resets": 0,
"missRatio": 0
}
},
"backgroundFlushing": {
"flushes": 13,
"total_ms": 1,
"average_ms": 0.076923076923077,
"last_ms": 0,
"last_finished": "Mon, 07 Feb 2011 00: 50: 54 -0500"
},
"cursors": {
"totalOpen": 0,
"clientCursors_size": 0,
"timedOut": 0
},
"opcounters": {
"insert": 0,
"query": 57,
"update": 0,
"delete": 0,
"getmore": 0,
"command": 46
},
"asserts": {
"regular": 0,
"warning": 0,
"msg": 0,
"user": 0,
"rollovers": 0
},
"ok": 1
},
"ok": 1
}
A friend of mine runs his WordPress blog on a Linode with the same amount of RAM (1024 MB). His MySQL usage shows a mere 20.48 MB, and approximately 12 users are "always surfing" (as in always on) on his site.
This makes me feel MongoDB isn't a good choice for me and that I should have stuck with MySQL!
Thank you, all.

"Using" that much memory isn't as bad as it seems ... MongoDB will (at least seem to) use up a lot of available memory, but it leaves it up to the OS's VMM to tell it to release the memory when need. (see Caching in the MongoDB docs.)
For the most part it's "using" that memory for cache, which dramatically speeds things up.
You should be able to release any and all memory by restarting MongoDB.
However, to some extent MongoDB isn't really actively "using" the memory ... read on for a lot more detail in this answer:
How to release the caching which is used by Mongodb?
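To see the difference in your own numbers, here's a minimal mongo shell sketch (it just reads the same mem section shown in your serverStatus() output):

// mongo shell: "resident" is RAM actually held by mongod; "mapped" and
// "virtual" mostly reflect the memory-mapped data files, which the OS can
// reclaim whenever it needs the pages for something else.
var mem = db.serverStatus().mem;
print("resident MB: " + mem.resident);  // 38 in the output above
print("mapped MB:   " + mem.mapped);    // 288
print("virtual MB:  " + mem.virtual);   // 957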

Memory management is solely up to the OS.
Read
http://blog.mongodb.org/post/101911655/mongo-db-memory-usage
There is basically no way right now to influence the memory usage. As mentioned there: learn about memory-mapped files, and don't mix up the size of the memory-mapped files with the actual memory usage.

Why is orientdb oetl import giving me this error

I am trying to import a CSV file into OrientDB 3.0. I have created and tested the JSON config file, and it works with a smaller dataset. But the dataset that I want to import is around a billion rows (six columns).
Following is the user.json file I am using for import with oetl:
{
  "source": { "file": { "path": "d1.csv" } },
  "extractor": { "csv": {} },
  "transformers": [
    { "vertex": { "class": "User" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/databases/magriwebdoc",
      "dbType": "graph",
      "classes": [
        { "name": "User", "extends": "V" }
      ],
      "indexes": [
        { "class": "User", "fields": ["id:string"], "type": "UNIQUE" }
      ]
    }
  }
}
This is the console output from the oetl command:
2019-05-22 14:31:15:484 INFO Windows OS is detected, 262144 limit of open files will be set for the disk cache. [ONative]
2019-05-22 14:31:15:647 INFO 8261029888 B/7878 MB/7 GB of physical memory were detected on machine [ONative]
2019-05-22 14:31:15:647 INFO Detected memory limit for current process is 8261029888 B/7878 MB/7 GB [ONative]
2019-05-22 14:31:15:649 INFO JVM can use maximum 455MB of heap memory [OMemoryAndLocalPaginatedEnginesInitializer]
2019-05-22 14:31:15:649 INFO Because OrientDB is running outside a container 12% of memory will be left unallocated according to the setting 'memory.leftToOS' not taking into account heap memory [OMemoryAndLocalPaginatedEnginesInitializer]
2019-05-22 14:31:15:650 INFO OrientDB auto-config DISKCACHE=6,477MB (heap=455MB os=7,878MB) [orientechnologies]
2019-05-22 14:31:15:652 INFO System is started under an effective user : `lenovo` [OEngineLocalPaginated]
2019-05-22 14:31:15:670 INFO WAL maximum segment size is set to 6,144 MB [OrientDBEmbedded]
2019-05-22 14:31:15:701 INFO BEGIN ETL PROCESSOR [OETLProcessor]
2019-05-22 14:31:15:703 INFO [file] Reading from file d1.csv with encoding UTF-8 [OETLFileSource]
2019-05-22 14:31:15:703 INFO Started execution with 1 worker threads [OETLProcessor]
2019-05-22 14:31:16:008 INFO Page size for WAL located in D:\databases\magriwebdoc is set to 4096 bytes. [OCASDiskWriteAheadLog]
2019-05-22 14:31:16:703 INFO + extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 1001ms [0 warnings, 0 errors] [OETLProcessor]
2019-05-22 14:31:16:770 INFO Storage 'plocal:D:\databases/magriwebdoc' is opened under OrientDB distribution : 3.0.18 - Veloce (build 747595e790a081371496f3bb9c57cec395644d82, branch 3.0.x) [OLocalPaginatedStorage]
2019-05-22 14:31:17:703 INFO + extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 2001ms [0 warnings, 0 errors] [OETLProcessor]
2019-05-22 14:31:17:954 SEVER ETL process has problem: [OETLProcessor]
2019-05-22 14:31:17:956 INFO END ETL PROCESSOR [OETLProcessor]
2019-05-22 14:31:17:957 INFO + extracted 0 rows (0 rows/sec) - 0 rows -> loaded 0 vertices (0 vertices/sec) Total time: 2255ms [0 warnings, 0 errors] [OETLProcessor]
D:\orientserver\bin>
I know the code is right, but I am assuming it's more of a memory issue!
Please advise what I should do.
Have you tried adjusting your memory settings according to the size of the data that you want to process?
From the documentation, you can customize these properties:
Configuration Environmental Variables (see the $ORIENTDB_OPTS_MEMORY parameter)
Performance Tuning - Memory Settings
Maybe this could help you.
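For example (a sketch only; whether the oetl launcher picks this variable up depends on your OrientDB version and launch script, so treat the exact invocation as an assumption), on the Windows machine from the log above you could give the ETL JVM more heap than the auto-detected 455 MB before re-running the import:

REM Windows command prompt, run from the OrientDB bin directory
set ORIENTDB_OPTS_MEMORY=-Xms2g -Xmx4g
oetl.bat user.json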
Your JSON script seems fine, but you can try deleting the indexes part. I have encountered the same problem because of wrong indexes, too. It may be caused by the UNIQUE index constraint. You can try one of the following:
Delete the indexes part of the JSON script (see the sketch after this list).
If you need this index, make sure to clear your database before you import your dataset.
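For the first suggestion, a sketch of the same user.json from the question with only the indexes block removed (everything else unchanged):

{
  "source": { "file": { "path": "d1.csv" } },
  "extractor": { "csv": {} },
  "transformers": [
    { "vertex": { "class": "User" } }
  ],
  "loader": {
    "orientdb": {
      "dbURL": "plocal:/databases/magriwebdoc",
      "dbType": "graph",
      "classes": [
        { "name": "User", "extends": "V" }
      ]
    }
  }
}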

MongoDB Concurrency Bottleneck

Too Long; Didn't Read
The question is about a concurrency bottleneck I am experiencing on MongoDB. If I make one query, it takes 1 unit of time to return; if I make 2 concurrent queries, both take 2 units of time to return; generally, if I make n concurrent queries, all of them take n units of time to return. My question is about what can be done to improve Mongo's response times when faced with concurrent queries.
The Setup
I have an m3.medium instance on AWS running a MongoDB 2.6.7 server. An m3.medium has 1 vCPU (1 core of a Xeon E5-2670 v2), 3.75 GB of RAM, and a 4 GB SSD.
I have a database with a single collection named user_products. A document in this collection has the following structure:
{ user: <int>, product: <int> }
There are 1000 users and 1000 products and there's a document for every user-product pair, totaling a million documents.
The collection has an index { user: 1, product: 1 } and my results below are all indexOnly.
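For reference, a hypothetical sketch of how a matching data set and index could be created in the 2.6 mongo shell (the database and collection names are taken from the ns used in the benchmark script below; the batch size is arbitrary):

// mongo shell: build 1000 users x 1000 products, one document per pair,
// then create the compound index described above.
db = db.getSiblingDB("test_user_products");
var batch = [];
for (var u = 0; u < 1000; u++) {
    for (var p = 0; p < 1000; p++) {
        batch.push({ user: u, product: p });
        if (batch.length === 10000) { db.user_products.insert(batch); batch = []; }
    }
}
if (batch.length) { db.user_products.insert(batch); }
db.user_products.ensureIndex({ user: 1, product: 1 });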
The Test
The test was executed from the same machine where MongoDB is running. I am using the benchRun function provided with Mongo. During the tests, no other accesses to MongoDB were being made and the tests only comprise read operations.
For each test, a number of concurrent clients is simulated, each of them making a single query as many times as possible until the test is over. Each test runs for 10 seconds. The concurrency is tested in powers of 2, from 1 to 128 simultaneous clients.
The command to run the tests:
mongo bench.js
Here's the full script (bench.js):
var
    seconds = 10,
    limit = 1000,
    USER_COUNT = 1000,
    concurrency,
    savedTime,
    res,
    timediff,
    ops,
    results,
    docsPerSecond,
    latencyRatio,
    currentLatency,
    previousLatency;

ops = [
    {
        op: "find",
        ns: "test_user_products.user_products",
        query: {
            user: { "#RAND_INT": [ 0, USER_COUNT - 1 ] }
        },
        limit: limit,
        fields: { _id: 0, user: 1, product: 1 }
    }
];

for (concurrency = 1; concurrency <= 128; concurrency *= 2) {
    savedTime = new Date();
    res = benchRun({
        parallel: concurrency,
        host: "localhost",
        seconds: seconds,
        ops: ops
    });
    timediff = new Date() - savedTime;
    docsPerSecond = res.query * limit;
    currentLatency = res.queryLatencyAverageMicros / 1000;
    if (previousLatency) {
        latencyRatio = currentLatency / previousLatency;
    }
    results = [
        savedTime.getFullYear() + '-' + (savedTime.getMonth() + 1).toFixed(2) + '-' + savedTime.getDate().toFixed(2),
        savedTime.getHours().toFixed(2) + ':' + savedTime.getMinutes().toFixed(2),
        concurrency,
        res.query,
        currentLatency,
        timediff / 1000,
        seconds,
        docsPerSecond,
        latencyRatio
    ];
    previousLatency = currentLatency;
    print(results.join('\t'));
}
Results
Results always look like this (some columns of the output were omitted to make them easier to read):
concurrency queries/sec avg latency (ms) latency ratio
1 459.6 2.153609008 -
2 460.4 4.319577324 2.005738882
4 457.7 8.670418178 2.007237636
8 455.3 17.4266174 2.00989353
16 450.6 35.55693474 2.040380754
32 429 74.50149883 2.09527338
64 419.2 153.7325095 2.063482104
128 403.1 325.2151235 2.115460969
If only 1 client is active, it is capable of doing about 460 queries per second over the 10 second test. The average response time for a query is about 2 ms.
When 2 clients are concurrently sending queries, the query throughput stays at about 460 queries per second, showing that Mongo hasn't increased its overall throughput. The average latency, on the other hand, literally doubled.
For 4 clients, the pattern continues: same query throughput, and the average latency doubles relative to the 2-client run. The latency ratio column is the ratio between the current and previous test's average latency; note that it consistently shows the latency doubling.
Update: More CPU Power
I decided to test with different instance types, varying the number of vCPUs and the amount of available RAM. The purpose is to see what happens when you add more CPU power. Instance types tested:
Type vCPUs RAM(GB)
m3.medium 1 3.75
m3.large 2 7.5
m3.xlarge 4 15
m3.2xlarge 8 30
Here are the results:
m3.medium
concurrency queries/sec avg latency (ms) latency ratio
1 459.6 2.153609008 -
2 460.4 4.319577324 2.005738882
4 457.7 8.670418178 2.007237636
8 455.3 17.4266174 2.00989353
16 450.6 35.55693474 2.040380754
32 429 74.50149883 2.09527338
64 419.2 153.7325095 2.063482104
128 403.1 325.2151235 2.115460969
m3.large
concurrency queries/sec avg latency (ms) latency ratio
1 855.5 1.15582069 -
2 947 2.093453854 1.811227185
4 961 4.13864589 1.976946318
8 958.5 8.306435055 2.007041742
16 954.8 16.72530889 2.013536347
32 936.3 34.17121062 2.043083977
64 927.9 69.09198599 2.021935563
128 896.2 143.3052382 2.074122435
m3.xlarge
concurrency queries/sec avg latency (ms) latency ratio
1 807.5 1.226082735 -
2 1529.9 1.294211452 1.055566166
4 1810.5 2.191730848 1.693487447
8 1816.5 4.368602642 1.993220402
16 1805.3 8.791969257 2.01253581
32 1770 17.97939718 2.044979532
64 1759.2 36.2891598 2.018374668
128 1720.7 74.56586511 2.054769676
m3.2xlarge
concurrency queries/sec avg latency (ms) latency ratio
1 836.6 1.185045183 -
2 1585.3 1.250742872 1.055438974
4 2786.4 1.422254414 1.13712774
8 3524.3 2.250554777 1.58238551
16 3536.1 4.489283844 1.994745425
32 3490.7 9.121144097 2.031759277
64 3527 18.14225682 1.989033023
128 3492.9 36.9044113 2.034168718
Starting with the xlarge type, we begin to see it finally handling 2 concurrent queries while keeping the query latency virtually the same (1.29 ms). It doesn't last too long, though, and for 4 clients it again doubles the average latency.
With the 2xlarge type, Mongo is able to keep handling up to 4 concurrent clients without raising the average latency too much. After that, it starts to double again.
The question is: what could be done to improve Mongo's response times with respect to the concurrent queries being made? I expected to see a rise in query throughput, and I did not expect to see the average latency doubling. It clearly shows Mongo is not able to parallelize the queries that are arriving.
There's some kind of bottleneck somewhere limiting Mongo, but it certainly doesn't help to just keep adding more CPU power, since the cost will be prohibitive. I don't think memory is an issue here, since my entire test database fits in RAM easily. Is there something else I could try?
You're using a server with 1 core and you're using benchRun. From the benchRun page:
This benchRun command is designed as a QA baseline performance measurement tool; it is not designed to be a "benchmark".
The scaling of the latency with the concurrency numbers is suspiciously exact. Are you sure the calculation is correct? I could believe that the ops/sec/runner was staying the same, with the latency/op also staying the same, as the number of runners grew - and then if you added all the latencies, you would see results like yours.
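One way to sanity-check that (a minimal sketch, not part of the original script; it could be dropped into the for-loop of bench.js right after benchRun() returns) is to dump the raw result object and print the per-runner throughput explicitly:

// Dump the raw benchRun result so the latency figure can be checked against
// the raw counters rather than the derived queryLatencyAverageMicros alone.
printjson(res);
// Per-runner throughput: if this stays roughly constant as `parallel` grows,
// the doubling latency is consistent with runners queuing on a single core.
print("ops/sec per client: " + (res.query / concurrency).toFixed(1));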

What are the meaning of the extra values I'm getting from bid_info

For one of my ads, I'm getting the following bid_info:
"bid_type": 6,
"bid_info": {
"1": 24,
"37": 0,
"38": 27,
"44": 26,
"45": 0,
"46": 0,
"48": 0,
"55": 23
}
I came to understand that "37" was also "impressions" (not "Social Impressions"?), but what do the values 45, 46 and 48 map to? It seems I only have that for bid_type 6 (RELATIVE_OCPM), is that normal?
Yes, I know about the October 2nd breaking changes; that is why I'm inquiring about this field. I even tried to find past information using the Wayback Machine, to no avail.
1 => 'clicks',
37 => 'impressions',
38 => 'social',
44 => 'reach',
55 => 'actions'
All I can figure out at this point

Free Rest API to retrieve current datetime as string (timezone irrelevant) [closed]

I am seeking a reliable REST API that can provide world time and time zone information across platforms.
I need the current time as a string. I'd like it to return the result in under a second, regardless of the user's location worldwide.
Among other uses, I want this for a consistent countdown timer that is more accurate than a user's [possibly inaccurate] computer clock. It can be GMT or another time zone, as long as the time zone and offset are specified, like 2012-11-05 16:16:50 EST.
I would build this API myself, but I have concerns about potential latency (as well as inelegance) in funneling a request through a whole big software stack like Rails just to return a simple string.
Excessive latency for users far away from the US east coast would offset the benefit of accuracy that the task requires.
Any suggestions and/or examples are appreciated.
TimezoneDb provides a free API: http://timezonedb.com/api
GeoNames also has a RESTful API available to get the current time for a given location: http://www.geonames.org/export/ws-overview.html.
You can use Greenwich, UK if you'd like GMT.
This API gives you the current time and several formats in JSON - https://market.mashape.com/parsify/format#time. Here's a sample response:
{
"time": {
"daysInMonth": 31,
"millisecond": 283,
"second": 42,
"minute": 55,
"hour": 1,
"date": 6,
"day": 3,
"week": 10,
"month": 2,
"year": 2013,
"zone": "+0000"
},
"formatted": {
"weekday": "Wednesday",
"month": "March",
"ago": "a few seconds",
"calendar": "Today at 1:55 AM",
"generic": "2013-03-06T01:55:42+00:00",
"time": "1:55 AM",
"short": "03/06/2013",
"slim": "3/6/2013",
"hand": "Mar 6 2013",
"handTime": "Mar 6 2013 1:55 AM",
"longhand": "March 6 2013",
"longhandTime": "March 6 2013 1:55 AM",
"full": "Wednesday, March 6 2013 1:55 AM",
"fullSlim": "Wed, Mar 6 2013 1:55 AM"
},
"array": [
2013,
2,
6,
1,
55,
42,
283
],
"offset": 1362534942283,
"unix": 1362534942,
"utc": "2013-03-06T01:55:42.283Z",
"valid": true,
"integer": false,
"zone": 0
}
If you're using Rails, you can just make an empty file in the public folder and use Ajax to fetch it, then parse the response headers for the Date header. Files in the public folder bypass the Rails stack and so have lower latency.
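A minimal sketch of that approach (ping.txt is a hypothetical empty file dropped into public/; any static file works), reading the server's Date header from an Ajax response:

// Browser JavaScript: ask for a static file and read the HTTP Date header,
// which the web server fills in from its own clock.
var xhr = new XMLHttpRequest();
xhr.open("HEAD", "/ping.txt", true);
xhr.onreadystatechange = function () {
    if (xhr.readyState === 4 && xhr.status === 200) {
        var serverTime = new Date(xhr.getResponseHeader("Date"));
        console.log("Server time: " + serverTime.toUTCString());
    }
};
xhr.send();

Note that the Date header only has one-second resolution, which is usually fine for a countdown timer.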

How do I get mongoimport to work with complex json data?

Trying to use the built-in mongoimport utility with MongoDB...
I might be blind, but is there a way to import complex JSON data? For instance, say I need to import instances of the following object: { "bob": 1, "dog": [ 1, 2, 3 ], "beau": { "won": "ton", "lose": 3 } }.
I'm trying the following and it looks like it loads everything into memory but nothing actually gets imported into the db:
$ mongoimport -d test -c testdata -vvvv -file ~/Downloads/jsondata.json
connected to: 127.0.0.1
Tue Aug 10 17:38:38 ns: test.testdata
Tue Aug 10 17:38:38 filesize: 69
Tue Aug 10 17:38:38 got line:{ "bob": 1, "dog": [ 1, 2, 3 ], "beau": { "won": "ton", "lose": 3 } }
imported 0 objects
Any ideas on how to get the json data to actually import into the db?
I did some testing and it looks like you need to have an end-of-line character at the end of the file. Without the end-of-line character the last line is read, but isn't imported.
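For example (a sketch of the fix only, reusing the document and command from the question; echo appends the missing trailing newline for you):

$ echo '{ "bob": 1, "dog": [ 1, 2, 3 ], "beau": { "won": "ton", "lose": 3 } }' > ~/Downloads/jsondata.json
$ mongoimport -d test -c testdata -vvvv -file ~/Downloads/jsondata.json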