Mongo ObjectIDs: Safe to use in the wild? - mongodb

I'm designing an API that interacts with MongoDB.
The question is whether it is safe to use the raw ObjectID to query for objects. Could any security issues arise from using the OIDs directly (e.g. in queries), or should I encrypt/decrypt them before they leave my server environment?

Look at the BSON ObjectID specification and you will know whether it is safe for you to use.
If you are trying to protect against users generating different URLs from scripts (fuskators), then it seems to me that it offers weak security. There won't be too many combinations of the 'machine' and 'pid' parts. The 'time' part can be calculated if the attacker has an idea of how the data was inserted (especially if it was inserted in batches). The 'inc' part is very weak.
I wouldn't trust ObjectIDs as the only security measure.
Please note that there can't be a right answer to the question "is it safe" in general. You must decide for yourself.
PS. Keep in mind that such URL-based security will fall to dust as soon as users share the URLs they visited. Even your best encryption won't help.
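
To make the point about guessable parts concrete, here is a minimal sketch that unpacks an ObjectID using the legacy layout the answer refers to (4-byte time, 3-byte machine, 2-byte pid, 3-byte inc). The function name and the sample ID are made up for illustration. Newer drivers replace the machine and pid bytes with a 5-byte random value, so the two middle fields only make sense for IDs generated the old way.

```python
from datetime import datetime, timezone


def decode_legacy_object_id(hex_id: str) -> dict:
    """Split a 24-hex-character ObjectID into the fields named in the answer."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectID is exactly 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # creation time, seconds since epoch
    return {
        "time": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine": raw[4:7].hex(),                 # legacy layout only
        "pid": int.from_bytes(raw[7:9], "big"),    # legacy layout only
        "inc": int.from_bytes(raw[9:12], "big"),   # counter, increments per ID
    }


# Sample ID chosen for illustration, not taken from a real database.
fields = decode_legacy_object_id("507f1f77bcf86cd799439011")
print(fields["time"].isoformat(), fields["inc"])
```

Anyone holding one ID can read its creation time and counter this way, which is why the ID should be treated as an identifier and not as a secret.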

I don't think hiding the ObjectIDs makes things much safer, because an attacker who knows of a security hole could also use a brute-force attack or some other method to get the ObjectIDs.
This question might help you as well.

Related

What's the best & safest encryption for passwords & emails?

I'm going to encrypt & hash (I guess that's the same thing) the emails in my database.
Which hash or encryption is the best & safest, and the hardest to get information out of?
My site is basically a private streaming site which will probably get a lot of attacks and such, and I guess that if a hacker doesn't get into the database, the police will later on. So, what should I use to protect my users to the maximum?
Kind regards.
First of all, as Allan S. said, hashing and encryption aren't the same thing.
A hash function is mainly used to verify the integrity of a file, because small differences in the file produce very different hashes (see: Hash function).
Encryption, in a few words, is used to protect files or data from being stolen or read by someone else (see: Cryptography).
Back to your question:
Here you can find some advice on hashing functions, with a short explanation.
About encryption algorithms, maybe you'll find something useful here and here.
Specifically related to databases, I found this.
Keep in mind that no solution is valid in every case; you need to find the one that best fits your requirements.

Designing a MongoDB schema for a chat server

I wish to design a schema for a chat server. The schema needs to support delivery and reading of messages. Each message needs to have an option of being a private or group message.
I was trying to work out where the data about whether a message has been delivered and read should be stored.
In a relational database this could go in another table. In MongoDB I could put it either in the user document or in the actual message JSON document.
If the message isn't for a specific user but a broadcast message, then I presume it would be better to store the IDs of the users that have seen it as part of the JSON document of the message.
Does anyone know of some good example schemas that are available? I don't fully understand the best way of attacking this issue.
(Too long for a comment. And it kinda answers the question)
Yeah, it's a challenging design. Also it's something we can't do for you, I'm afraid, because we don't know all your requirements, you do. However you design it, you should respect the usual mongodb guidelines. Unfortunately, they conflict with each other:
Don't put too much stuff into one document.
In the classic blog schema exercise, one might be tempted to embed comments into the post document, each comment embedding its user too. This can easily lead to overflowing MongoDB's maximum document size (16 MB). It also leads to write contention. That doesn't matter much for the MMAPv1 engine, but it does matter for the WiredTiger engine (which has document-level locking).
Do not build overly normalized schemas.
Normalized schemas are encouraged in relational databases. In MongoDB they're useless (because of the lack of joins). What you need to do is carefully duplicate some of the data. For example, in the blog/comments example, one might embed the author's id/email in a comment, but not the rest of the author's data (sign-up date, membership status, etc.).
When I decide a place or shape of the data, I generally ask myself these two questions:
How am I going to query this?
Isn't this too much duplication?
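
As one way of applying those two questions to the chat case, here is a rough sketch of a message document written as a plain Python dict. Every field name is an assumption made for illustration, not a recommended or standard schema. It embeds the IDs of users who received and read the message, as the question suggests, and keeps only the sender's id instead of the whole user record.

```python
from datetime import datetime, timezone

# Hypothetical message document: one document per message, with the
# delivery/read state embedded as small arrays of user ids.
message = {
    "_id": "msg-1",
    "room_id": "room-42",              # assumption: None would mean a private message
    "sender_id": "alice",              # only the id is duplicated, not the user record
    "recipient_ids": ["bob", "carol"],
    "body": "Lunch at noon?",
    "sent_at": datetime(2015, 1, 1, 12, 0, tzinfo=timezone.utc),
    "delivered_to": ["bob", "carol"],
    "read_by": ["bob"],
}


def unread_recipients(msg: dict) -> list:
    """Answer 'who has not read this yet?' from the single document."""
    return [user for user in msg["recipient_ids"] if user not in msg["read_by"]]


print(unread_recipients(message))
```

This shape answers "who has read this message?" with one document read. For very large groups the arrays grow with the audience, which runs into the "too much stuff in one document" guideline, so a separate receipts collection may fit better there.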

When to use multiple DBMS

When is it a good idea to use more than one DBMS? What are the possible repercussions, and how do you decide when to do so?
I'm currently building an application which runs an analysis on our users' websites and stores it. This allows me to analyze all the data and give them analytics.
Since the data collected from each site is static and varies greatly from site to site, CouchDB seemed like a great fit. But in order to create this system, I'd need to build a user account system which couch is quite horrible at (reserving names, emails, etc has all sorts of problems).
My first thought was to use MySQL to handle the user accounts and CouchDB for the massive amounts of data. Essentially, trying to use a hammer for a nail and a screwdriver for a screw.
Is this a time when more than one DBMS is a good idea?
I don't see anything wrong with using MySQL for user accounts and CouchDB for crawled information.
For the users, you might even consider something simpler, like GDBM.

Which database have you had the best experience with to handle friending on the back end?

Since my last question was considered subjective :( , I'm trying to make it more specific.
I'm building an application in PHP where users can "friend" each other. This seems to be best suited to a graph datastore... For example, you can have this set of fields in a traditional RDBMS:
id | user1 | user2
and you have to deal with duplicate data (id = 1,user1 = Joe, user2 = Jeff, id=2, user1=Jeff, user2=Joe)...
You also have to search both columns for one user.
When performing certain friend of a friend searches, the recursion can be tricky indeed.
Do you agree a graph database is best?
If so, which one? and why is it best in your experience?
Since the client already has MySQL, is it worth the overhead of adding a graph store, or is there a good approach to the main issues with friending while keeping it in MySQL?
P.S. TO MODERATORS:
If you still have a problem with this post, I'd most appreciate if you could tell me if there's any particular way to ask this question and be considered a "constructive" post? gmail me (joedevon), tweet me (joedevon), add it in a comment. whatever method suits you best...
I just want some input from fellow programmers and I think the problem is common, filled w/ opportunities and issues, and interesting. Amazed that the original wasn't considered good for SO, but thems the rules...
Maybe it would be relevant to mention what language you are using to build that app. If it is Java, I think there is no real competitor to neo4j (my opinion).
Maybe you've already seen this neo4j presentation series, but around the 200-second mark of that presentation you can see a use case for polyglot persistence. It means you could keep your existing data model (or any data model that doesn't need to be type-free or isn't suited to a graph) TOGETHER with neo4j and its graph data model for the part of the application that deals with user bonding and relationships.
The Spring Data project makes this really easy, and I consider it a very promising way to go.

How to overcome fear of user-input (web development)

I'm writing a web application for public consumption... How do you get over or deal with the fear of user input? As a web developer, you know the tricks and holes that can be exploited, particularly on the web, and which are made all the easier by add-ons like Firebug.
Sometimes it's so overwhelming you just want to forget the whole deal (does make you appreciate Intranet Development though!)
Sorry if this isn't a question that can be answered simply, but perhaps ideas or strategies that are helpful...Thanks!
One word: server-side validation (ok, that may have been three words).
There's lots of sound advice in other answers, but I'll add a less "programming" answer:
Have a plan for dealing with it.
Be ready for the contingency that malicious users do manage to sneak something past you. Have plans in place to mitigate damage, restore clean and complete data, and communicate with users (and potentially other interested parties such as the issuers of any credit card details you hold) to tell them what's going on. Know how you will detect the breach and close it. Know that key operational and development personnel are reachable, so that a bad guy striking at 5:01pm on the Friday before a public holiday won't get 72+ clear hours before you can go offline let alone start fixing things.
Having plans in place won't help you stop bad user input, but it should help a bit with overcoming your fears.
If it's "security"-related concerns, you just need to push through them: security and exploits are a fact of life in software, and they need to be addressed head-on as part of the development process.
Here are some suggestions:
Keep it in perspective - Security, Exploits and compromises are going to happen to any application which is popular or useful, be prepared for them and expect them to occur
Test it, then test it again - QA, Acceptance testing and sign off should be first class parts of your design and production process, even if you are a one-man shop. Enlist users to test as a dedicated (and vocal) user will be your most useful tool in finding problems
Know your platform - Make sure you know the technology, and hardware you are deploying on. Ensure that relevant patches and security updates are applied
Research - Look at applications similar to your own and see what issues they experience, surf their forums, read their bug logs, etc.
Be realistic - You are not going to be able to fix every bug and close every hole. Pick the most impactful ones and address those
Lots of eyes - Enlist as many people to review your designs and code as possible. This should be in addition to your QA resources
You don't get over it.
Check everything at server side - validate input again, check permissions, etc.
Sanitize all data.
That's very easy to write in bold letters and a little harder to do in practice.
Something I always did was wrap all user strings in an object, something like StringWrapper, which forces you to call an encoding method to get the string. In other words, just provide access to s.htmlEncode(), s.urlEncode().htmlEncode(), etc. Of course you sometimes need the raw string, so you can have an s.rawString() method, but now you have something you can grep for to review all uses of raw strings.
So when you come to 'echo userString' you will get a type error, and you are then reminded to encode/escape the string through the public methods.
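
A minimal sketch of that wrapper idea, in Python. The class and method names are hypothetical, and because Python is dynamically typed the "type error" shows up at run time when the object is printed or formatted, not at compile time.

```python
import html
from urllib.parse import quote


class UserString:
    """Wraps untrusted text so callers must pick an encoding explicitly."""

    def __init__(self, raw: str):
        self._raw = raw

    def html_encode(self) -> str:
        # Safe for HTML element content and quoted attribute values.
        return html.escape(self._raw, quote=True)

    def url_encode(self) -> str:
        # Percent-encode everything that is not an unreserved character.
        return quote(self._raw, safe="")

    def raw_string(self) -> str:
        # The one escape hatch; grep for raw_string( during code review.
        return self._raw

    def __str__(self):
        # print(), str() and f-strings all end up here and fail loudly.
        raise TypeError("encode this value explicitly or call raw_string()")


comment = UserString('<script>alert("hi")</script>')
safe_html = comment.html_encode()
```

Here `print(comment)` raises `TypeError`, while `comment.html_encode()` returns text that a browser will display instead of execute.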
Some other general things:
Prefer white-lists over black-lists
Don't go overboard with stripping out bad input. I want to be able to use the < character in posts/comments/etc! Just make sure you encode data correctly
Use parameterized SQL queries. If you are SQL escaping user input yourself, you are doing it wrong.
First, I'll try to comfort you a bit by pointing out that it's good to be paranoid. Just as it's good to be a little scared while driving, it's good to be afraid of user input. Assume the worst as much as you can, and you won't be disappointed.
Second, program defensively. Assume any communication you have with the outside world is entirely compromised. Take in only parameters that the user should be able to control. Expose only that data that the user should be able to see.
Sanitize input. Sanitize sanitize sanitize. If it's input that will be displayed on the site (nicknames for a leaderboard, messages on a forum, anything), sanitize it appropriately. If it's input that might be sent to SQL, sanitize that too. In fact, don't even write SQL directly, use an intermediary of some sort.
There's really only one thing you can't defend against if you're using plain HTTP. If you use a cookie to identify somebody, there's nothing you can do to prevent somebody else in a coffeehouse from sniffing that cookie when they're both on the same wireless connection. As long as they're not using a secure connection, nothing can save you from that. Even Gmail isn't safe from that attack. The only thing you can do is make sure an authorization cookie can't last forever, and consider making users log in again before they do something big like changing a password or buying something.
But don't sweat it. A lot of the security details have been taken care of by whatever system you're building on top of (you ARE building on top of SOMETHING, aren't you? Spring MVC? Rails? Struts? ). It's really not that tough. If there's big money at stake, you can pay a security auditing company to try and break it. If there's not, just try to think of everything reasonable and fix holes when they're found.
But don't stop being paranoid. They're always out to get you. That's just part of being popular.
P.S. One more hint. If you have javascript like this:
if( document.forms["myForm"]["payment"].value < 0 ) {
    alert("You must enter a positive number!");
    return false;
}
Then you'd better be sure as hell to have code in the backend that goes:
verify( input.payment >= 0 )
"Quote" everything so that it can not have any meaning in the 'target' language: SQL, HTML, JavaScript, etc.
This will get in the way of course, so you have to be careful to identify when this needs special handling, like through administrative privileges to deal with some of the data.
There are multiple types of injection and cross-site scripting (see this earlier answer), but there are defenses against all of them. You'll clearly want to look at stored procedures, white-listing (e.g. for HTML input), and validation, to start.
Beyond that, it's hard to give general advice. Other people have given some good tips, such as always doing server-side validation and researching past attacks.
Be vigilant, but not afraid.
No validation in web-application layer.
All validations and security checks should be done by the domain layer or business layer.
Throw exceptions with valid error messages and let these exceptions be caught and processed at the presentation layer or web application.
You can use a validation framework to automate validations with the help of custom validation attributes.
http://imar.spaanjaars.com/QuickDocId.aspx?quickdoc=477
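
The layering described in this answer could look roughly like the following sketch. All names (`ValidationError`, `record_payment`, `handle_request`) are hypothetical, and the "presentation layer" is reduced to a function that returns a status code and a message. The payment rule mirrors the positive-number check from the JavaScript example earlier in the thread.

```python
import math


class ValidationError(Exception):
    """Raised by the domain layer when input breaks a business rule."""


def record_payment(amount) -> float:
    # Domain-layer check: it runs no matter which front end called us.
    try:
        value = float(amount)
    except (TypeError, ValueError):
        raise ValidationError("Payment must be a number.") from None
    if not math.isfinite(value) or value <= 0:
        raise ValidationError("Payment must be a positive number.")
    return value


def handle_request(form: dict) -> tuple:
    # Presentation layer: no rules of its own, it only translates errors.
    try:
        value = record_payment(form.get("payment"))
    except ValidationError as err:
        return 400, str(err)
    return 200, f"Recorded payment of {value:.2f}"
```

Because the rule lives in `record_payment`, a second front end (an API, a batch import) gets the same check without repeating it.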
There should be some documentation of known exploits for the language/system you're using. I know the Zend PHP Certification covers that issue a bit and you can read the study guide.
Why not hire an expert to audit your applications from time to time? It's a worthwhile investment considering your level of concern.
Our client always says: "Treat my users as if they can't tell the difference between date and text fields!!"
I code in Java, and my code is full of asserts. I assume everything coming from the client is wrong, and I check it all on the server.
#1 thing for me is to always construct static SQL queries and pass your data as parameters. This limits the quoting issues you have to deal with enormously. See also http://xkcd.com/327/
This also has performance benefits, as you can re-use the prepared queries.
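
A small sketch of the static-query idea, using Python's built-in `sqlite3` module as a stand-in for whatever database driver you actually use. The table, the column names and the hostile input (modeled on the comic linked above) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")

# The SQL text is constant; user data only ever travels as a parameter.
INSERT_STUDENT = "INSERT INTO students (name) VALUES (?)"
FIND_BY_NAME = "SELECT id, name FROM students WHERE name = ?"

hostile_name = "Robert'); DROP TABLE students;--"
conn.execute(INSERT_STUDENT, (hostile_name,))

# The hostile text is stored and matched as plain data, never run as SQL.
rows = conn.execute(FIND_BY_NAME, (hostile_name,)).fetchall()
print(rows)
```

The query strings never change, so there is nothing to quote by hand, and the driver is free to reuse the prepared statement.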
There are actually only 2 things you need to take care with:
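Avoid SQL injection. Use parameterized queries to save user-controlled input in the database. In Java terms: use PreparedStatement. In PHP terms: use prepared statements via PDO (mysql_real_escape_string() only escapes strings, and the old mysql_* extension it belongs to was removed in PHP 7).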
Avoid XSS. Escape user-controlled input during display. In Java/JSP terms: use JSTL <c:out>. In PHP terms: use htmlspecialchars().
That's all. You don't need to worry about the format of the data, just about how you handle it.