How to avoid MongoDB NoSQL blind (sleep) injection

While scanning my application for vulnerabilities, I got one high-risk finding:
Blind MongoDB NoSQL Injection
I checked the request that the scanning tool sent to the database and found that it had added the line below to a GET request:
{"$where":"sleep(181000);return 1;"}
The scan received a "Time Out" response, which indicates that the injected "sleep" command succeeded.
I need help fixing this vulnerability. What do I need to add to my code to perform this check before connecting to the database?
Thanks,
Anshu

As with SQL injection or any other type of code injection, don't copy untrusted content into a string that will be executed as a MongoDB query.
You apparently have some code in your app that naively accepts user input or some other content and runs it as a MongoDB query.
Sorry, it's hard to give a more specific answer, because you haven't shown that code, or described what you intended it to do.
But generally, in every place where you use external content, you have to imagine how it could be misused if the content isn't in the format you assume it is.
You must instead validate the content, so it can only be in the format you intend, or else reject the content if it's not in a valid format.
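As a minimal sketch of that kind of validation (the original code was never shown, so the field name `username`, the pattern, and both function names are hypothetical), the following accepts only a plain string in an expected format and never lets untrusted input become a query operator such as `$where`:

```python
import re

# Hypothetical allow-list pattern for the single field this sketch accepts.
USERNAME_RE = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def contains_operator(value):
    """Return True if any dict key in a decoded JSON value starts with '$'."""
    if isinstance(value, dict):
        return any(
            (isinstance(key, str) and key.startswith("$")) or contains_operator(item)
            for key, item in value.items()
        )
    if isinstance(value, list):
        return any(contains_operator(item) for item in value)
    return False


def build_user_filter(raw_username):
    """Build a query filter from one untrusted value, or raise ValueError."""
    # Only a plain string is acceptable; a dict such as {"$where": ...} is rejected.
    if not isinstance(raw_username, str):
        raise ValueError("username must be a string")
    if not USERNAME_RE.fullmatch(raw_username):
        raise ValueError("username has an unexpected format")
    # The untrusted value is only ever used as a value compared for equality.
    return {"username": raw_username}
```

The resulting dictionary can be passed to the driver as a filter. `contains_operator` is a separate, coarser check for cases where a whole decoded JSON body has to be screened before any part of it is used.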

Related

Where to handle errors on database in mongoDB?

When a user registers on the website, they must provide an e-mail address, which has to be unique. I've created a unique index on the schema's email attribute, so if I try to save a document with a duplicate e-mail, an error with code 11000 is returned. My question, with regard to the business layer and the data layer: should I just pass the document to the database and catch/check the error codes it returns, or should I first check whether a user with that e-mail already exists? I've been told that the business layer should check data integrity before passing data to the database, but I don't see why I should do that, since I believe Mongo would be much faster at raising the exception itself given that it has the index. The only disadvantages I see in error-code checking are that the error codes might change (but I could abstract them) and that the syntax might change.
There is the practical matter of speed and the fragility of "check-then-set" systems. If you check whether an email exists before you write the document keyed on email, there is a chance that, between the time you check and the time you write, another client inserts the same email, so the unique index rejects your write anyhow. This is a classic race condition. Further, check-then-set takes 2 queries, whereas inserting and handling the failure takes only 1. In my application I am having success with just letting the failure occur and reacting to the result.
As @JamesWahlin says, it is the difference between doing this all in one operation and getting mixed results (along with the index check) from potential race conditions caused by the extra client read.
Definitely rely only on the response to the insert from MongoDB here.
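The insert-and-handle-the-failure pattern can be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a server, and the table and the function name `register_user` are made up for the example. With MongoDB the same shape applies, except that the error to catch is the duplicate key error (code 11000, which PyMongo raises as `DuplicateKeyError`).

```python
import sqlite3


def register_user(conn, email):
    """Try the insert and report False if the unique index rejects the email."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
        return True
    except sqlite3.IntegrityError:
        # The unique constraint fired: the email is already registered.
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT NOT NULL UNIQUE)")

first = register_user(conn, "user@example.com")   # new address
second = register_user(conn, "user@example.com")  # same address again
```

There is no separate "does it exist?" read, so there is no window between check and write in which another client can slip in.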

ESAPI for email address "blabla@example.com"

I saw some related questions, but I did not find exactly what I was looking for. Sorry if this turns out to be a silly request. My specific question is this:
I am trying to build a REST API backed by a MySQL database.
I am reading data from a table, which basically means pulling out the valid email addresses of the users.
The output is going to be displayed on an HTML page.
temp = blabla@example.com
temp = ESAPI.encoder().canonicalize(temp);
temp = ESAPI.encoder().encodeForHTML(temp);
OUTPUT: temp = blabla&#x40;gmail.com
How can I avoid this and get blabla@email.com instead?
I think the behavior here is as expected, but I wanted to know if there is a workaround other than conditional handling (if..else).
Also, could someone point me to the reasoning behind some of the design choices in ESAPI? It should be an interesting read.
SHORT ANSWER:
If you can truly trust what's coming from your database, you don't need to perform canonicalize. If you know your data isn't going to be used by a browser, don't encode for HTML. If, however, you suspect your data will be used by a browser, encode it and have the caller deal with the results. If that's deemed unacceptable, expose an "unsafe" version of your webservice, one whose URL explicitly uses warning words to flag it as "potentially malicious," forcing your caller to be aware that they're engaging in unsafe activity.
LONG ANSWER:
Well first, according to your use-case, you're essentially providing data to a calling client. My first instinct upon reading your question is that I don't think you're comfortable with your data contexts.
So, typically you're going to see a call to canonicalize() when you need safe data to perform validation against. So, the first questions to ask are these:
q1: Can I trust the data coming from my database?
Guidelines for q1: If the data is appropriately validated and neutralized, say by using a call to ESAPI.validator().getValidInput( args ); by the process that stores the data, then the application will store a safe email string into the database. If you can provably trust your input data at this point, it should be completely safe for you to not canonicalize your output as you're doing here.
If, however, you cannot trust the data at this point, then you're in a scenario where you'll need to validate it before you pass it along to a downstream system. A call to ESAPI.validator().getValidInput( args ); will BOTH canonicalize the input and ensure that it's a valid email address. However, this comes with the baggage that your caller is going to have to properly transform the neutralized input, which according to your question is what you want to avoid.
If you want to send safe data downstream, and you cannot defensibly trust your data source, you have no choice but to send safe data to your caller and have them work with it on their end--except perhaps to expose an unsafe method, which I will discuss shortly.
q2: Will browsers be used to consume my data?
Guidelines for q2: the encoder.encodeForHTML() method is designed to neutralize browser interpretation. Since you're talking about RESTful web services, I don't understand why you think you need to use it, because a browser should correctly interpret blabla&#x40;gmail.com to the correct canonical form--unless perhaps it's being correctly trapped as a data element, such as in a dropdown box. But this is something I'm guessing you have NO control over?
As you can now tell, there are no fast answers to questions like this. You have to have some idea of how the data will be used by your caller. Since you have the possibility of having your data treated correctly as data by the browser, and the possibility of the data treated as code, you might be forced to offer a "safe" and "unsafe" call to retrieve your data, assuming that you have no control over how the client uses your service. That puts you in a bad spot, because a lazy caller might simply only ever use the unsafe version. When this happens in my industry, I'll usually make it so that the URL to call for an unsafe function looks something like mywebservice.com/unSafeNonPCICompliantMethod or something similar, so that you force your caller to explicitly accept the risk. If it's being used in the correct context on the browser... the unsafe method might actually be safe. You just won't know.
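To make the point about browser interpretation concrete, here is a rough Python sketch. It is not ESAPI: `encode_for_html_sketch` is a hypothetical stand-in that assumes the encoder turns characters outside a small "immune" set into hexadecimal character references, and `html.unescape` from the standard library plays the role of the browser decoding the text for display.

```python
import html

# Assumption: characters (besides letters and digits) the encoder leaves as-is.
IMMUNE = set(",.-_ ")


def encode_for_html_sketch(text):
    """Replace every non-alphanumeric, non-immune character with &#xHH;."""
    return "".join(
        ch if ch.isalnum() or ch in IMMUNE else "&#x{:x};".format(ord(ch))
        for ch in text
    )


encoded = encode_for_html_sketch("blabla@example.com")
# What an HTML parser shows to the user after decoding the reference.
displayed = html.unescape(encoded)
```

Under these assumptions the encoded form only looks different in the raw response; once it is rendered as HTML text, the reader sees the original address again. A caller that consumes the value as something other than HTML would have to decode it itself.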

Marklogic REST API search for latest document version

We need to restrict a MarkLogic search to the latest version of managed documents, using Marklogic's REST api. We're using MarkLogic 6.
Using straight XQuery, you can use dls:documents-query() as an additional-query option (see "Is there any way to restrict marklogic search on specific version of the document").
But the REST api requires XML, not arbitrary xquery. You can turn ordinary cts queries into XML easily enough (execute <some-element>{cts:word-query("hello world")}</some-element> in QConsole).
If I try that with dls:documents-query() I get this:
<cts:properties-query xmlns:cts="http://marklogic.com/cts">
  <cts:registered-query>
    <cts:id>17524193535823153377</cts:id>
  </cts:registered-query>
</cts:properties-query>
Apart from being less than totally transparent... how safe is that number? We'll need to put it in our query options, so it's not something we can regenerate every time we need it. I've looked on two different installations here and the number's the same, but is it guaranteed to be the same, and will it ever change? On, for example, a MarkLogic upgrade?
Also, assuming the number is safe, will the registered-query always be there? The documentation says that registered queries may be cleared by the system at various times, but it's talking about user-defined registered queries, and I'm not sure how much of that applies to internal queries.
Is this even the right approach? If we can't do this we can always set up collections and restrict the search that way, but we'd rather use dls:documents-query if possible.
The number is a registered query id, and is deterministic. That is, it will be the same every time the query is registered. That behavior has been invariant across a couple of major releases, but is not guaranteed. And as you already know, the server can unregister a query at any time. If that happens, any query using that id will throw an XDMP-UNREGISTERED error. So it's best to regenerate the query when you need it, perhaps by calling dls:documents-query again. It's safest to do this in the same request as the subsequent search.
So I'd suggest extending the REST API with your own version of the search endpoint. Your new endpoint could add dls:documents-query to the input query. That way the registered query would be generated in the same request with the subsequent search. For ML6, http://docs.marklogic.com/6.0/guide/rest-dev/extensions explains how to do this.
The call to dls:documents-query() makes sure the query is actually registered (on the fly if necessary), but that won't work from REST api. You could extend the REST api with a custom extension as suggested by Mike, but you could also use the following:
cts:properties-query(
  cts:and-not-query(
    cts:element-value-query(
      xs:QName("dls:latest"),
      "true",
      (),
      0
    ),
    cts:element-query(
      xs:QName("dls:version-id"),
      cts:and-query(())
    )
  )
)
That is the query that is registered by dls:documents-query(). Might not be future proof though, so check at each upgrade. You can find the definition of the function in /Modules/MarkLogic/dls.xqy
HTH!

How to determine the encoding of request query string

Suppose I have a .NET HttpModule that analyzes incoming requests to check for possible attacks like Sql Injection.
Now suppose that a user of my application enters the following in a form field and submits it:
&#039&#032&#079&#082&#032&#049&#061&#049
That is Unicode for ' OR 1=1. So in the request I get something like:
http://example.com/?q=%26%23039%26%23032%26%23079%26%23082%26%23032%26%23049%26%23061%26%23049
Which in my HttpModule looks fine (no Sql Injection), but the server will correctly decode it to q=' OR 1=1 and my filter will fail.
So, my question is: Is there any way to know at that point what is the encoding used by the request query string, so I can decode it and detect the attack?
I guess the browser has to tell the server which encoding the request is in, so it can be correctly decoded. Or am I wrong?
> the server will correctly decode it to q=' OR 1=1
It shouldn't. There is no valid reason(*) an application would HTML-decode the &#039... string before using it in an SQL query. HTML-decoding is a client-side occurrence.
(* there's the invalid reason: that the application author doesn't have the foggiest idea what they're doing, tries to write an input-HTML-escaping function - a misguided idea in the first place - and due to incompetence writes an input-de-escaping function instead... but that would be an unlikely case. Hopefully.)
> Is there any way to know at that point what is the encoding used by the request query string
No. Some Web Application Firewalls attempt to get around this by applying every decoding scheme they can think of to the incoming data, and triggering if any of them match something suspicious, just in case the application happens to have an arbitrary decoder of that type sitting between the input and a vulnerable system.
This can result in a performance hit as well as increased false positives, and doubly so for the WAFs that try all possible combinations of two or more decoders. (e.g., is T1IrMQ a base-64-encoded, URL-encoded OR 1 SQL attack, or just a car numberplate?)
Quite how far you take this idea is a trade-off between how many potential attacks you catch and how much negative impact you have on real users of the app. There's no one 'correct' solution because ultimately you can never provide complete protection against app vulnerabilities in a layer outside the app (aka "WAFs don't work").
What you are seeing is URL encoding, where a percent sign followed by 2 hex digits represents a single encoded byte (octet). In HTML, a character reference starts with an ampersand and ends with a semicolon, and contains either an entity name or an explicit Unicode codepoint value.
What gets sent over the wire between the browser and server is http://example.com/?q=%26%23039%26%23032%26%23079%26%23082%26%23032%26%23049%26%23061%26%23049, but logically it represents http://example.com/?q=&#039&#032&#079&#082&#032&#049&#061&#049 once the server has URL-decoded it. When your code reads the query string, it should receive &#039&#032&#079&#082&#032&#049&#061&#049. The server should not decode that any further to ' OR 1=1; you would have to do that in your own code.
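The two decoding layers can be shown with a short Python sketch (Python rather than .NET, purely for illustration). `parse_qs` performs the URL decoding a server does on the query string, and `html.unescape` is the separate HTML decoding step that only happens if some code asks for it explicitly:

```python
import html
from urllib.parse import parse_qs, urlsplit

url = (
    "http://example.com/?q=%26%23039%26%23032%26%23079%26%23082"
    "%26%23032%26%23049%26%23061%26%23049"
)

# Step 1: URL decoding, which the server applies to the query string.
query_value = parse_qs(urlsplit(url).query)["q"][0]

# Step 2: HTML character-reference decoding, which is NOT automatic.
html_decoded = html.unescape(query_value)
```

After step 1, `query_value` is still the string of character references; only the explicit second step turns it into the quote-and-OR text.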
If you are allowing a URL query string to specify an SQL query filter as-is, then that is a mistake on your part to begin with. That suggests you are building SQL queries dynamically instead of using parameterized SQL queries or stored procedures, so you are leaving yourself open to SQL injection attacks. You should not be doing that. Parameterized SQL queries and stored procedures keep the values separate from the SQL text, so your clients should only be allowed to submit the individual parameter values in the URL. Your server code can then extract the individual values from the URL query and pass them to the SQL parameters as needed. The SQL engine will make sure the values are treated purely as data. You should not be handling that manually.
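A minimal sketch of a parameterized query, again in Python with the built-in sqlite3 module as a stand-in for the real database; the `products` table and the `find_products` helper are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT)")
conn.executemany(
    "INSERT INTO products (name) VALUES (?)", [("apple",), ("pear",)]
)


def find_products(conn, name):
    """Look up products by name; the value is bound, never spliced into SQL."""
    cur = conn.execute("SELECT name FROM products WHERE name = ?", (name,))
    return [row[0] for row in cur.fetchall()]


normal = find_products(conn, "apple")
attack = find_products(conn, "' OR 1=1")  # compared as a literal string
```

Because the value is bound to the `?` placeholder, the attack string is just an unusual product name that matches nothing, whatever encoding it arrived in.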

How do I prevent duplicate values in my read database with CQRS

Say that I have a User table in my read database (SQL Server). In a regular read/write database I can put a unique index on the table to make sure that two users aren't added with the same email address.
So if I try to add a user with an email address that already exists in the table for a different user, SQL Server will throw an exception back.
In CQRS I can't do that: if I decouple the write to my read database from the domain model by putting it on an asynchronous queue, I won't get the exception thrown back to me. I will return "OK" to the UI and the user will think that he has been added to the database, when in fact he will never be added to the read database.
I can search the read database to check whether there is already a user with that email address and, if there is one, throw an exception back to the UI. But if two users press the save button at the same time, I will do two checks against the database, see that there isn't any user with that email address, and report back that it's okay. Both writes go on my queue, and later one will fail (by hitting the unique constraint).
Am I supposed to load all users from my EventSource (it's a SQL Server) and then check that collection to see whether a user already has this email address? That sounds a bit crazy to me...
How have you people solved it?
The only way I can see is to use a synchronous queue instead of an asynchronous one, but that will hurt performance badly, especially when you have many "read storages" to write to...
Need some help here...
Searching for CQRS Set Based Validation will give you solutions to this issue.
Greg Young posted about the business impact of embracing eventual consistency http://codebetter.com/gregyoung/2010/08/12/eventual-consistency-and-set-validation/
Jérémie Chassaing posted about discovering missing aggregate roots in the domain http://thinkbeforecoding.com/post/2009/10/28/Uniqueness-validation-in-CQRS-Architecture
Related stack overflow questions:
How to handle set based consistency validation in CQRS?
CQRS Validation & uniqueness