How to add arbitrary log data to MongoDB logs from client? - mongodb

I have a node.js web application connecting to MongoDB for which I want to profile DB performance. Each request received by the application is assigned a request-id and can cause multiple queries to be sent to MongoDB. I want to see this request-id value in each log line in MongoDB. Is there a way to do this? I would like to avoid adding always-true fields to each query like "req<id>": null because I suspect this may affect performance.
The docs reference a similar feature called Client Data at https://docs.mongodb.com/manual/reference/log-messages/; however, this appears to be sent only once per connection, and I'm looking for client data that changes many times even within the same connection.

Try using cursor.comment() to attach the request ID to each query; the comment is echoed in the MongoDB log messages (and in the profiler output).
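For example, a rough sketch using the Go driver's SetComment option (database, collection, and field names are made up; the Node.js driver exposes the same idea as a comment option on find). The comment travels with the query, so it shows up in the slow-query log entry and profiler document for that operation:

// Sketch: tag each query with the request ID so it appears in the
// MongoDB slow-query log and profiler output. Names are illustrative.
package main

import (
    "context"
    "fmt"
    "log"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

// findOpenOrders runs a query tagged with the caller's request ID.
func findOpenOrders(ctx context.Context, coll *mongo.Collection, requestID string) error {
    opts := options.Find().SetComment("req-" + requestID)

    cur, err := coll.Find(ctx, bson.M{"status": "open"}, opts)
    if err != nil {
        return err
    }
    defer cur.Close(ctx)

    for cur.Next(ctx) {
        fmt.Println(cur.Current)
    }
    return cur.Err()
}

func main() {
    ctx := context.Background()
    client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(ctx)

    if err := findOpenOrders(ctx, client.Database("app").Collection("orders"), "12345"); err != nil {
        log.Fatal(err)
    }
}

Because the comment rides along per operation rather than per connection, it can change on every request even when the driver reuses the same pooled connection.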

Related

Getting an error "use of closed network connection"

My application stack consists of Go for the backend and MongoDB Atlas Free Tier for the database. I am triggering events from the Stripe CLI, and my Go program updates a number of collections in MongoDB Atlas based on certain validations. This seems to work fine for the most part, but at certain points in the process I hit the error below while writing data to the Mongo collections.
connection(xxxxx-shard-00-02.ka3rc.mongodb.net:xxx[-15])
incomplete read of message header: read tcp
xxx.xxx.x.xx:xxxxx->xx.xx.xxx.xxx:xxxxx: use of closed network connection
I am trying to use the same mongo client, opened when control enters my Go program, to execute all queries within the application.
Does anyone know why we would face this error? Could this be due to Mongo Atlas restricting the number of requests per minute on the free tier? The issue happens randomly and I am not able to determine any pattern in when it occurs.
From the Go driver client options page:
https://pkg.go.dev/go.mongodb.org/mongo-driver@v1.8.0/mongo/options#ClientOptions
Most of the timeouts are 0 by default (ConnectTimeout, MaxConnIdleTime, SocketTimeout).
This means that in some cases the server can close the connection while the application driver is still unaware of it, so it is recommended to set those timeouts explicitly on the client side during connection initialization, as in the sketch below.
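For example, a minimal sketch of setting these explicitly when building the client (the URI is a placeholder and the values are illustrative, not recommendations):

// Sketch: set connection-related timeouts explicitly instead of relying
// on the zero-value defaults. Values are illustrative only.
package main

import (
    "context"
    "log"
    "time"

    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
    "go.mongodb.org/mongo-driver/mongo/readpref"
)

func main() {
    opts := options.Client().
        ApplyURI("mongodb+srv://user:pass@cluster0.example.mongodb.net/app"). // placeholder URI
        SetConnectTimeout(10 * time.Second). // give up on dialing after 10 seconds
        SetSocketTimeout(30 * time.Second).  // fail socket reads/writes that stall
        SetMaxConnIdleTime(5 * time.Minute). // retire idle pooled connections proactively
        SetRetryWrites(true)                 // retry once on transient network errors

    ctx, cancel := context.WithTimeout(context.Background(), 15*time.Second)
    defer cancel()

    client, err := mongo.Connect(ctx, opts)
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(context.Background())

    // Verify the connection actually works before handing the client out.
    if err := client.Ping(ctx, readpref.Primary()); err != nil {
        log.Fatal(err)
    }
}

Setting MaxConnIdleTime below whatever idle limit sits between your application and Atlas means the driver retires stale pooled connections itself, instead of discovering mid-request that the other side has already closed them.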

CodecConfigurationException when using bulk write API

I'm using MongoDB's bulk write API to update/upsert several documents at once in an unordered fashion. I use the response to monitor the number of upserts, updates, deletes, etc. Once in a while my code throws a CodecConfigurationException because it can't encode fields such as BigInteger.
When such an exception is thrown, do the other operations go through successfully? From what I can tell, they do seem to.
If other updates are being applied successfully, then shouldn't such an error be part of the MongoBulkWriteException, which lets successful writes be monitored and pinpoints the unsuccessful writes along with the reason for each failure? How can one determine the failed update if we're updating, say, 50,000 documents at a time?
If no updates go through, is this exception triggered even before the command is executed on the server side? Is this a purely client-side exception?
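For context, this is how per-operation failures are normally surfaced by the bulk write API; a rough sketch with the Go driver, whose BulkWriteException mirrors the Java MongoBulkWriteException (collection and field names are made up):

// Sketch: an unordered bulk write where server-side failures are reported
// per operation, with the index of the offending model.
package main

import (
    "context"
    "errors"
    "fmt"
    "log"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    ctx := context.Background()
    client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(ctx)
    coll := client.Database("app").Collection("accounts")

    // Build upsert models; filters and update documents are illustrative.
    models := []mongo.WriteModel{
        mongo.NewUpdateOneModel().
            SetFilter(bson.M{"_id": 1}).
            SetUpdate(bson.M{"$set": bson.M{"balance": 100}}).
            SetUpsert(true),
        mongo.NewUpdateOneModel().
            SetFilter(bson.M{"_id": 2}).
            SetUpdate(bson.M{"$set": bson.M{"balance": 200}}).
            SetUpsert(true),
    }

    res, err := coll.BulkWrite(ctx, models, options.BulkWrite().SetOrdered(false))
    if err != nil {
        // Server-side write errors arrive as a BulkWriteException; each entry
        // carries the index of the model that failed, so the rest can succeed.
        var bwe mongo.BulkWriteException
        if errors.As(err, &bwe) {
            for _, we := range bwe.WriteErrors {
                fmt.Printf("model %d failed: %s\n", we.Index, we.Message)
            }
        } else {
            log.Fatal(err)
        }
    }
    if res != nil {
        fmt.Printf("matched=%d modified=%d upserted=%d\n",
            res.MatchedCount, res.ModifiedCount, res.UpsertedCount)
    }
}

A client-side encoding failure such as CodecConfigurationException, by contrast, is raised while the request is being built, so it is not reported through this per-index mechanism; batches encoded and sent before the failure may still have been applied.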

FIWARE context broker storing all data to mongodb

I have installed the FIWARE Context Broker and I am sending data to it using the localhost:1026/v1/updateContext endpoint.
Everything is working properly and I am able to get and visualise the data being sent. As Orion is a broker service, only the latest entity values can be retrieved from it.
Question: I need to automatically save the historical data to a MongoDB database. Orion saves only the latest 2 entries. STH and Cygnus are not doing the job for me, since they require a lot of configuration both for sending data and for collecting and storing it.
Is there any way to automatically save all data being sent to Orion, and to group it by service IDs?
Thank you in advance.
I'm afraid that the only way to store historical data in FIWARE is through STH, QuantumLeap (incubated GE) or Cygnus.
Configuring them is not so difficult. Please follow these tutorials:
https://github.com/Fiware/tutorials.Historic-Context
https://github.com/Fiware/tutorials.Time-Series-Data
https://github.com/Fiware/tutorials.Short-Term-History
http://fiwaretourguide.readthedocs.io/en/latest/generating-historical-context-information-sth-comet/how-to-generate-the-history-of-Context-Information-using-STH-Comet/
http://fiwaretourguide.readthedocs.io/en/latest/storing-data-cygnus-mysql/how-to-store-data-cygnus-mysql/
Precisely, orchestrating the persistence of historical context-entity data is the purpose of the Cygnus Generic Enabler. You can then use STH to store historical data for the most recent period of time, or choose another alternative such as Cosmos for Big Data.
You can find examples of configuration files for persisting data to STH in the official Cygnus documentation. In addition, if you are familiar with MongoDB, the official documentation of the MongoDB sink has examples of the different persistence configurations.
If you give me a little more information about how you are configuring Cygnus and STH, I can help you further.
Regards!
Solution
Playing around with the context broker, I altered the way Orion stores data in the auto-generated MongoDB database. When you send data to Orion, the document id is always built from the service path, the entity id, and the type, so new data overwrites the previous entry. We changed that by adding another, incremental element to the id, so new entries are saved as separate documents. I am not sure whether this is a clumsy solution, but it is definitely more scalable, since we don't need subscriptions.

How mongodb handles users requests when multiple insert commands execute

I am new to MongoDB and I want to know how MongoDB handles user requests.
1. What happens if multiple users fire multiple insert or read commands at the same time?
2. When and where does a snapshot come into the picture (in which phase)?
Multiple Inserts and Multiple Reads
MongoDB allows multiple clients to read and write the same data.
In order to ensure consistency, it uses locking and other concurrency control measures to prevent multiple clients from modifying the same piece of data simultaneously.
Read this documentation; it gives you complete information about concurrency:
concurrency reference
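As an illustration, a rough sketch of several writers sharing one client (Go driver; the client is documented as safe for concurrent use, and the server coordinates concurrent modifications to the same documents):

// Sketch: many concurrent writers sharing one client. The driver pools
// connections; the server handles concurrent access per document.
package main

import (
    "context"
    "log"
    "sync"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
)

func main() {
    ctx := context.Background()
    client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(ctx)
    coll := client.Database("test").Collection("docs")

    var wg sync.WaitGroup
    for i := 0; i < 10; i++ {
        wg.Add(1)
        go func(n int) {
            defer wg.Done()
            // Each goroutine inserts independently through the shared client.
            if _, err := coll.InsertOne(ctx, bson.M{"worker": n}); err != nil {
                log.Println(err)
            }
        }(i)
    }
    wg.Wait()
}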
MongoDB allows very fast writes and updates by default. The tradeoff is that you are not explicitly notified of failures. By default, older drivers did asynchronous, 'unsafe' writes, meaning the driver does not return an error directly, similar to INSERT DELAYED with MySQL. If you want to know whether something succeeded, you have to check for errors manually using getLastError.
MongoDB doesn't offer full durability with the default configuration; it only flushes data to disk about once every minute. This can be configured using the j option and the write concern on the insert query, as in the sketch below.
write-concern reference
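For illustration, a rough sketch (Go driver; database and collection names are made up) that requests acknowledged, journaled writes, so the insert call itself reports failure instead of requiring a separate getLastError check:

// Sketch: ask for acknowledged, journaled writes so an insert does not
// return until the write has been committed to the on-disk journal.
package main

import (
    "context"
    "log"

    "go.mongodb.org/mongo-driver/bson"
    "go.mongodb.org/mongo-driver/mongo"
    "go.mongodb.org/mongo-driver/mongo/options"
    "go.mongodb.org/mongo-driver/mongo/writeconcern"
)

func main() {
    ctx := context.Background()
    client, err := mongo.Connect(ctx, options.Client().ApplyURI("mongodb://localhost:27017"))
    if err != nil {
        log.Fatal(err)
    }
    defer client.Disconnect(ctx)

    // w: majority + j: true -- wait for a majority of nodes and the journal.
    wc := writeconcern.New(writeconcern.WMajority(), writeconcern.J(true))
    coll := client.Database("app").Collection("events",
        options.Collection().SetWriteConcern(wc))

    if _, err := coll.InsertOne(ctx, bson.M{"type": "login", "user": "alice"}); err != nil {
        // With an acknowledged write concern, failures surface right here.
        log.Fatal(err)
    }
}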
Snapshot
The $snapshot operator prevents the cursor from returning a document more than once because an intervening write operation results in a move of the document.
Even in snapshot mode, objects inserted or deleted during the lifetime of the cursor may or may not be returned.
snapshot reference
Hope it Helps!!
I am asking this question in the context of journaling in MongoDB. As per the MongoDB documentation, a write operation first goes into the private view. So the question is: if multiple write operations are performed at the same time, will multiple private views be created?
2. Checkpoints and snapshots: at which point in the journaling process is a snapshot of the data available?

Adding user information to centralized logging with ELK stack

I am using ELK stack (first project) to centralize logs of a server and visualize some real-time statistics with Kibana. The logs are stored in an ES index and I have another index with user information (IP, name, demographics). I am trying to:
Join user information with the server logs, matching the IPs. I want to include this information in the Kibana dashboard (e.g. to show in real-time the username of the connected users).
Create new indexes with filtered and processed information (e.g. users that have visited more than 3 times certain url).
What is the best design to solve these problems (e.g. including the username at the Logstash stage through a filter, scheduled jobs, ...)? If the processing task (2) gets more complex, would it be better to use MongoDB instead?
Thank you!
I recently wanted to cross-reference some log data with user data (containing IPs among other fields) and just used Elasticsearch's bulk import API. This meant extracting the data from an RDBMS, converting it to JSON, and outputting a flat file that adhered to the format expected by the bulk import API (basically prefixing each document with a row that describes the index and type).
That should work for an initial import; the deltas could then be handled using triggers in whatever stores your user data. You might simply write to a flat file and process it like the other logs. Another option might be the JDBC River.
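For reference, the flat file is newline-delimited JSON in which each action line is followed by the document itself; the index, type, and field values below are made up, and recent Elasticsearch versions drop the _type entirely:

{ "index": { "_index": "users", "_type": "user", "_id": "42" } }
{ "ip": "10.1.2.3", "name": "alice", "country": "US" }
{ "index": { "_index": "users", "_type": "user", "_id": "43" } }
{ "ip": "10.1.2.4", "name": "bob", "country": "DE" }

Each pair of lines is one document: the first says where it goes, the second is the document source, and the whole file is POSTed to the _bulk endpoint.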
I am also interested to know where the data is stored originally (a DB, pushed straight from a server, ...). I initially used the ELK stack to pull data back from a DB server with a batch file using BCP (running as a scheduled task), store it to a flat file, monitor the file with Logstash, and manipulate the data inside the Logstash config (grok filter). You may also consider a simple console/web application to manipulate the data before grokking it with Logstash.
If possible, I would attempt to pull your data via a SQL Server SPROC/BCP command and match the returned, complete message within Logstash. You can then store the information in a single index.
I hope this helps; I am by no means an expert, but I will be happy to answer more questions if you get a little more specific about the details of your current data storage, namely how the data is entering Logstash. RabbitMQ is another valuable tool to look at for your input source.