How can I tell if deserialization failed in protobuf-net?

I had protobuf.net deserialize invalid (random) bytes into a KeyValuePair (i.e. not nullable). Instead of (as expected) an exception being thrown, an empty struct was returned.
Since this default struct could be valid data, I don't see a way to tell if the source data are actually valid. Is this a bug, or is there a way I'm missing?
(protobuf-net 2.0.0.480, 2011.12.11)

Update:
There were scenarios in v2 where it wouldn't spot this, but would instead terminate as though it had reached the end of the stream - in particular if the "field number", after applying shifts, was non-positive. However, this is not valid in a protobuf stream, and this will be fixed in the next build.
That depends on quite how random it was ;p Actually, getting that to do anything without throwing an error is pretty impressive - the protobuf spec is pretty specific about layout, and normally it will throw a big exception there (probably mentioning "unexpected wire-type" or similar).
Emphasis: in almost all cases it will throw an exception. If you fluke some data of the right spec, but different field numbers, then it will silently ignore the unexpected data, and you'll get an all-zero struct. If you fluke some data of the right spec, but with the right field numbers and layout, you'll get garbage. But that is like saying
if I randomly generate data that by pure chance happens to be {"foo":"0"}, JavascriptSerializer doesn't complain!!! bug!!!
Are you sure you actually deserialized some data here, and that the stream wasn't already at the EOF position? For example, the following won't error, as you haven't rewound the stream - you are effectively deserializing zero bytes:
var ms = new MemoryStream();
ms.Write(randomBytes, 0, randomBytes.Length);
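// note: the stream has not been rewound (ms.Position is still at the end), so the next line reads zero bytes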
var obj = Serializer.Deserialize<Foo>(ms);
(and zero bytes is perfectly valid for a protobuf object)
If you want to test a stream for validity, you can use ProtoReader, just skipping (SkipField() or something similar) every field until ReadNextHeader() (or whatever) returns a non-positive integer.
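A rough sketch of that validity check, assuming the v2-era ProtoReader API (ReadFieldHeader()/SkipField() and a public constructor taking a stream and a TypeModel); names and construction differ in later versions:
using System.IO;
using ProtoBuf;
using ProtoBuf.Meta;

static bool LooksLikeValidProtobuf(byte[] data)
{
    try
    {
        using (var ms = new MemoryStream(data))
        using (var reader = new ProtoReader(ms, RuntimeTypeModel.Default, null))
        {
            // ReadFieldHeader returns the field number, or 0 once the end of the
            // data is reached; malformed input should surface as a ProtoException
            while (reader.ReadFieldHeader() > 0)
            {
                reader.SkipField();
            }
        }
        return true;
    }
    catch (ProtoException)
    {
        return false; // not a well-formed protobuf payload
    }
}
Note that this only checks the bytes are structurally valid protobuf; as discussed above, it cannot tell you that they encode the message you expected.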


Should a wrong parameter passed via REST call throw an error?

I was making REST calls, and when I passed a wrong parameter to a GET request it did not return any HTTP error. Should the design be changed to return an HTTP error, or is it fine for a wrong parameter to be passed to a REST call?
Example 1 (parameters are optional):
https://example.com/api/fruits?fruit=apple
Gives a list of all apple elements.
Example 2:
https://example.com/api/fruits?abc=asb
Gives a list of all fruits.
My question is about example 2: should it throw an error, or is it behaving properly?
It's pretty common to ignore parameters that you aren't necessarily expecting. I think example 2 is behaving as it should.
I know that depending on the browser I would sometimes append an extra variable with a timestamp to make sure that the REST call wouldn't be cached. Something like:
https://example.com/api/fruits?ihateie=2342342342
If you're not explicitly doing anything with the extra parameter then I can't see the harm in allowing it.
For a GET request, the request-line is defined as follows
request-line = 'GET' SP request-target SP HTTP-version CRLF
where request-target "...identifies the target resource upon which to apply the request".
That means that the path /api/fruits, the question-mark ? and the query abc=asb are all part of the identifier.
The fact that your implementation happens to use the path to route the request to a handler, and the query to provide arguments, is an accident of your current implementation.
That leaves you with the freedom to decide that
/api/fruits?abc=asb does exist, and its current state is a list of all fruits
/api/fruits?abc=asb does exist, and its current state is an empty list
/api/fruits?abc=asb does exist, and its current state is something else
/api/fruits?abc=asb does not exist, and attempting to access its current state is an error.
My question is about example 2: should it throw an error, or is it behaving properly?
If abc=asb indicates that there is some sort of error in the client, then you should return a 4xx status to indicate that.
Another way of thinking about the parameter handling is in terms of Must Ignore vs Must Understand.
As a practical matter, if I'm a consumer expecting that my filter is going to result in a small result set, and instead I end up drinking a billion unfiltered records out of a fire hose, I'm not going to be happy.
I'd recommend that in the case of a bad input you find a way to fail safely. On the web, that would probably mean a 404, with an HTML representation explaining the problem, enumerating recognized filters, maybe including a web form that helps resend the query, etc. Translate that into your API in whatever way makes sense.
But choosing to treat that as a successful request and return some representation also works; it's still REST, and the web is going to web. If doing it that way gives your consumers a better experience, thereby increasing adoption and making your API more successful, then the answer is easy.
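As a concrete sketch of the "fail safely" option above, here is an Express-style handler in TypeScript; the route, the KNOWN_FILTERS whitelist and listFruits are made-up names for illustration:
import express from "express";

const app = express();

// recognized query parameters for this endpoint (illustrative)
const KNOWN_FILTERS = new Set(["fruit", "color"]);

// placeholder lookup, stands in for the real data access
const listFruits = (query: Record<string, unknown>) => ["apple", "banana"];

app.get("/api/fruits", (req, res) => {
  const unknown = Object.keys(req.query).filter((k) => !KNOWN_FILTERS.has(k));
  if (unknown.length > 0) {
    // fail safely: 404 as suggested above (a 400 would also be reasonable),
    // and enumerate the filters that are recognized
    res.status(404).json({
      error: "unrecognized parameter(s): " + unknown.join(", "),
      recognizedFilters: Array.from(KNOWN_FILTERS),
    });
    return;
  }
  res.json(listFruits(req.query));
});

app.listen(3000);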

What is the difference between `doc.AddMember("key1",1,document.GetAllocator())` and `doc["key1"]=1`?

I want to create a json object in cocos2d-x 3.4 with rapidjson and convert it to a string:
rapidjson::Document doc;
doc.SetObject();
doc.AddMember("key1",1,doc.GetAllocator());
doc["key2"]=2;
rapidjson::StringBuffer sb;
rapidjson::Writer<rapidjson::StringBuffer> writer(sb);
doc.Accept(writer);
CCLOG("%s",sb.GetString());
but the output is {"key1":1} not {"key1":1,"key2":2}, why?
In old (0.1x) versions of RapidJSON, doc["key2"] returns a Value singleton representing Null. doc["key2"] = 2 actually writes to that singleton.
In newer versions of RapidJSON (v1.0.x), this behavior has been changed. It makes an assertion fail for a key that is not found in a JSON object, in order to solve the exact problem you mentioned.
As a reminder, when an operation potentially requires allocating memory (such as AddMember or PushBack), an Allocator object must be supplied. Since operator[] normally takes only one parameter, it cannot add new members the way STL containers do. This is quite weird and not very user-friendly, but it is a tradeoff in RapidJSON's design for performance and memory overhead.
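So under v1.x, add new members with AddMember (passing the document's allocator) and use operator[] only for keys that already exist. A small sketch, assuming RapidJSON v1.x:
#include "rapidjson/document.h"
#include "rapidjson/stringbuffer.h"
#include "rapidjson/writer.h"
#include <cstdio>

int main() {
    rapidjson::Document doc;
    doc.SetObject();
    rapidjson::Document::AllocatorType& alloc = doc.GetAllocator();
    doc.AddMember("key1", 1, alloc);  // creates the member; needs the allocator
    doc.AddMember("key2", 2, alloc);  // operator[] cannot create "key2", AddMember can
    doc["key1"] = 10;                 // fine: "key1" already exists, so this just updates it
    rapidjson::StringBuffer sb;
    rapidjson::Writer<rapidjson::StringBuffer> writer(sb);
    doc.Accept(writer);
    std::printf("%s\n", sb.GetString());  // {"key1":10,"key2":2}
    return 0;
}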

How to serialize/deserialize objects sent over the network in Haskell?

I see that there are many ways to serialize/deserialize Haskell objects:
Data.Serialize -> encode, decode functions
Data.Binary http://code.haskell.org/binary/
MsgPack, JSON, BSON, etc.
In my application, I want to set up a simple TCP client-server, where the client may send serialized Haskell record objects. How does one decide between these serialization alternatives?
Additionally, when objects serialized into strings are sent over the network using Network.Socket, strings are returned. Is there a slightly higher-level library that works at the level of whole TCP messages? In other words, is there a way to avoid writing parsing code on the receive end that:
collects the results of a sequence of recv() calls,
detects that a whole object has been received, and
then parses it into a Haskell type?
In my application, the objects are not expected to be too large (maybe about ~1MB max).
As for the second part of your question, two things are required:
An incremental parser that doesn't need to have the whole document in memory to start parsing, and which can be fed with the partial chunks of data arriving from the wire. Also, when the parsing succeeds it must return any "leftover data" along with the parsed value.
A source of data with "pushback capabilities", that allows you to "unread" any leftovers so that they are available to the next parsing attempt.
The most popular library providing (1) is attoparsec. As for (2), all three main streaming libraries (conduit, io-streams, and pipes) offer some kind of pushback functionality (the latter using the auxiliary pipes-parse package). All three libraries can integrate with attoparsec parsers as well.
(Another option, of course, is to prepend each message with its length and read only the exact number of bytes.)
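A rough sketch of that length-prefix framing with Data.Binary over a raw socket (the sendMessage/recvMessage names are made up for illustration; this assumes the binary, bytestring and network packages):
import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy as BL
import Data.Binary (Binary, encode, decode)
import Data.Binary.Put (runPut, putWord32be)
import Data.Binary.Get (runGet, getWord32be)
import Network.Socket (Socket)
import Network.Socket.ByteString (recv, sendAll)

-- prefix the payload with its length as a 32-bit big-endian integer
sendMessage :: Binary a => Socket -> a -> IO ()
sendMessage sock x = do
  let payload = encode x
      header  = runPut (putWord32be (fromIntegral (BL.length payload)))
  sendAll sock (BL.toStrict (BL.append header payload))

-- loop over recv until exactly n bytes have arrived
recvExact :: Socket -> Int -> IO B.ByteString
recvExact sock n = go n []
  where
    go 0 acc = return (B.concat (reverse acc))
    go k acc = do
      chunk <- recv sock k
      if B.null chunk
        then error "connection closed mid-message"
        else go (k - B.length chunk) (chunk : acc)

-- read the length header, then exactly that many payload bytes
recvMessage :: Binary a => Socket -> IO a
recvMessage sock = do
  header <- recvExact sock 4
  let len = fromIntegral (runGet getWord32be (BL.fromStrict header))
  body <- recvExact sock len
  return (decode (BL.fromStrict body))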
To answer the first part of your question (about data serialization), I would say that everything you listed sounds fine. Since you are dealing with pretty big (1MB) serializations, I think that the most important thing is laziness. Of the libraries above, cereal (which provides Data.Serialize) produces strict serializations, and you wouldn't want that because you'd need to build the whole thing up in memory before sending it out. I'll give a shout out to aeson (http://hackage.haskell.org/package/aeson-0.8.0.2/docs/Data-Aeson.html), which you can use with GHC Generics to get something simple like this:
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
import Data.Aeson (FromJSON, ToJSON)
data Shape = Rect Int Int | Circle Double | Other String Int
  deriving (Generic)
instance FromJSON Shape -- uses a default
instance ToJSON Shape   -- uses a default
And then, bam!, you've got access to the encode and decode methods. I don't know about a higher level TCP library. Hopefully, someone else will have more insight on that.

Scala: is Either the only Option?

In regard to potential runtime failures, like database queries, it seems that one must use some form of Either[String, Option[T]] in order to accurately capture the following outcomes:
Some (record(s) found)
None (no record(s) found)
SQL Exception
Option simply does not have enough options.
I guess I need to dive into scalaz, but for now it's straight Either, unless I'm missing something in the above.
I have boxed myself into a corner with my DAO implementation by only employing Either for write operations, but I am now seeing that some Either writes depend on Option reads (e.g. checking whether an email exists on new user signup), which is a majorly bad gamble to make.
Before I go all-in on Either, does anyone have alternate solutions for how to handle the runtime trifecta of success/fail/exception?
Try Box from the fantastic Lift framework. It provides exactly what you want.
See this wiki (and the links at the top) for details. Fortunately, the Lift project is well modularized; the only dependency needed to use Box is net.liftweb % lift-common.
Use Option[T] for the "records found" and "no records found" cases, and throw an exception in the case of an SQLException.
Just wrap the exception inside your own exception type, like PersistenceException, so that you don't have a leaky abstraction.
We do it like this because we can't and don't want to recover from unexpected database exceptions. The exception gets caught at the top level and our web service returns a 500 Internal Server Error in such a case.
In cases where we want to recover we use Validation from scalaz, which is much like Lift's Box.
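A rough sketch of that approach (the UserDao, User and PersistenceException names are made up for illustration): reads return Option[T], and unexpected SQLExceptions are rethrown wrapped in a domain exception.
import java.sql.SQLException

case class User(email: String)

class PersistenceException(cause: Throwable) extends RuntimeException(cause)

trait UserDao {
  // Some(user) if a row matched, None if no record was found;
  // unexpected database failures surface as PersistenceException
  def findByEmail(email: String): Option[User]
}

class JdbcUserDao extends UserDao {
  def findByEmail(email: String): Option[User] =
    try runQuery(email)
    catch {
      case e: SQLException => throw new PersistenceException(e) // don't leak java.sql types
    }

  // placeholder for the actual JDBC lookup
  private def runQuery(email: String): Option[User] = None
}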
Here's my revised approach:
Preserve Either-returning query write operations (useful for transactional blocks where we want to roll back on a Left outcome in a for comprehension; see the sketch below).
For Option-returning query reads, however, rather than swallowing the exception with None (and logging it), I have created a 500 error screen and let the exception bubble up.
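A sketch of how those Either-returning writes chain in a for comprehension (method names are made up for illustration; this assumes Scala 2.12+ where Either is right-biased, while older versions need .right projections). A Left from any step short-circuits the block, which is where the transaction can be rolled back:
def signup(email: String, password: String): Either[String, Long] =
  for {
    _      <- ensureEmailFree(email)      // Either[String, Unit]
    userId <- insertUser(email, password) // Either[String, Long]
    _      <- insertProfile(userId)       // Either[String, Unit]
  } yield userId

// placeholders for illustration
def ensureEmailFree(email: String): Either[String, Unit] = Right(())
def insertUser(email: String, password: String): Either[String, Long] = Right(1L)
def insertProfile(userId: Long): Either[String, Unit] = Right(())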
Why not just work with an Either result type by default when dealing with runtime failures like query exceptions? Option[T] reads are a bit more convenient to work with than Either[Why-Fail, Option[T]], which you have to fold/map through to get at T. Leaving Either to write operations simplifies things (all the more so given that's how the application is currently set up, so no refactoring is required ;-)).
The only other change required is for AJAX requests. Rather than displaying the entire 500 error page response in the AJAX status div container, we check for the status type and display a 500 error message accordingly:
if(data.status == 500)
$('#status > div').html("an error occurred, please try again")
Could probably do an isAjax check server-side prior to sending the response, in which case I can send back only the status + message rather than the error page itself.

Error handling vs exception handling in Objective-C

I am not able to understand where error handling and where exception handling should be used. My assumption is this: if it is an existing framework class, there are delegate methods which let the programmer receive an error object reference and handle the error after that. Exception handling is for cases where an operation by a programmer using some framework classes throws an error and I cannot get a reference to an error object.
Is this assumption valid? Or how should I understand them?
You should use exceptions for errors that would never appear if the programmer had checked the parameters to the method that throws the exception, e.g. divide by zero or the well-known "out of bounds" exception you get from NSArray.
NSError is for errors that the programmer can do nothing about, e.g. parsing a plist file. It would be a waste of resources if the program checked whether the file is a valid plist before it tries to read its content: for the validity check the program must parse the whole file, and parsing a file just to report that it is valid so you can parse it again would be a total waste. So the method returns an NSError (or just nil, which tells you that something went wrong) if the file can't be parsed.
The parsing for validity is the "programmer should have checked the parameters" part. It's not applicable to this type of error, so you don't throw an exception.
In theory you could replace the out-of-bounds exception with returning nil, but this would lead to very bad programming.
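A minimal sketch of that NSError out-parameter pattern, using Foundation's stringWithContentsOfFile:encoding:error: (the file path is just an example): check the return value first, and only consult the error object when the call reports failure.
#import <Foundation/Foundation.h>

NSError *error = nil;
NSString *contents = [NSString stringWithContentsOfFile:@"/tmp/example.plist"
                                               encoding:NSUTF8StringEncoding
                                                  error:&error];
if (contents == nil) {
    // the call failed; the error object (not an exception) describes why
    NSLog(@"read failed: %@", error.localizedDescription);
} else {
    NSLog(@"read %lu characters", (unsigned long)contents.length);
}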
Apple says:
Important: In many environments, use of exceptions is fairly commonplace. For example, you might throw an exception to signal that a routine could not execute normally—such as when a file is missing or data could not be parsed correctly. Exceptions are resource-intensive in Objective-C. You should not use exceptions for general flow-control, or simply to signify errors. Instead you should use the return value of a method or function to indicate that an error has occurred, and provide information about the problem in an error object.
I think you are absolutely right with your assumption about errors: for those the framework provides a set of delegate methods (UIWebView error handling, for example). But your assumption about exceptions is only partially right, because an exception occurs when we do something that the framework does not allow and that can be fixed in code (for example, accessing an array element beyond its bounds); left unhandled, it will crash the application.