Decode base64 data found in mongodb change stream to human readable format - mongodb

I am developing a small application to test the change stream functionality in MongoDB.
I have found that if one uses a client session, that information is included in the change stream output (change event)
For instance, here is the output when I insert a document:
{"txnNumber"=>1, "lsid"=>{"id"=><BSON::Binary:0x70310118878160 type=uuid data=0x05d30a0fa4db4f24...>, "uid"=><BSON::Binary:0x70310118878040 type=generic data=0x08e97261f57b1617...>}, "_id"=>{"_data"=>"8262D407C4000000022B022C0100296E5A100483BECD0AF46146E4A271EDAC0922356946645F6964006462D407C48C187B092534BD050004"}, "operationType"=>"insert", "clusterTime"=>#<BSON::Timestamp:0x00007fe4b351d4f8 #seconds=1658062788, #increment=2>, "fullDocument"=>{"_id"=>BSON::ObjectId('62d407c48c187b092534bd05'), "one"=>"one"}, "ns"=>{"db"=>"change_stream_testing", "coll"=>"testing"}, "documentKey"=>{"_id"=>BSON::ObjectId('62d407c48c187b092534bd05')}}
The "lsid"-field contains information about the session from which the write originated. After taking a closer look at this i found that it contains base64 encoded data (just doing a json.parse() on the id and uid fields)
ID IS
{"$binary":{"base64":"BdMKD6TbTySICrHNHE6GBA==","subType":"04"}}
UID IS
{"$binary":{"base64":"COlyYfV7FhdDV8hhDrSY7+10/NVCs/fLwkGrKMztex4=","subType":"00"}}
Now, the problem/question is that i can't decode that base64 string to something readable. Using an online decoder i get
У
¤ЫO$€
±НN†**
and **éraõ{CWÈa´˜ïítüÕB³÷ËÂA«(Ìí{
respectively - when using the "auto detect" feature or UTF-8 which MongoDB uses internally (according to a quick google search)
The reason I ask is that I have a use case where, in some cases, I would like to be able to identify where an event in the change stream originated from ie. and what client issued the write. The only way I've been able to sort of accomplishing that without using the mutateFields operator to actually change the documents themselves and adding some kind of marker I could inspect in the change stream code (which I ideally don't want to do) is to use explicit client sessions which at least lets me know that whatever was writing the document was using an explicit session. But I would like to be able to go further and actually decipher this session information to, if possible, get some kind of unique identifier.

Related

Invalid signature returned when previewing 7digital track

I am attempting to preview a track via the 7digital api. I have utilised the reference app to test the endpoint here:-
http://7digital.github.io/oauth-reference-page/
I have specified what I consider to be the correct format query, as in:-
http://previews.7digital.com/clip/8514023?oauth_consumer_key=MY_KEY&country=gb&oauth_nonce=221946762&oauth_signature_method=HMAC-SHA1&oauth_timestamp=1456932878&oauth_version=1.0&oauth_signature=c5GBrJvxPIf2Kci24pq1qD31U%2Bs%3D
and yet, regardless of what parameters I enter I always get an invalid signature as a response. I have also incorporated this into my javascript code using the same oauth signature library as the reference page and yet still get the same invalid signature returned.
Could someone please shed some light on what I may be doing incorrectly?
Thanks.
I was able to sign it using:
url = http://previews.7digital.com/clip/8514023
valid consumer key & consumer secret
field 'country' = 'GB'
Your query strings parameters look a bit out of order. For OAuth the base string, used to sign, is meant to be in alphabetical order, so country would be first in this case. Once generated it doesn't matter the order in the final request, but the above tool applies them back in the same order (so country is first).
Can you make sure there aren't any spaces around your key/secret? It doesn't appear to strip white space.
If you have more specific problems it may be best to get in touch with 7digital directly - https://groups.google.com/forum/#!forum/7digital-api

How is a mojolicious session token created?

Are mojolicious session tokens created in a "standard" way (in a generic sense), or is this up to the individual application? If it is the former, then what is the format?
What I saw so far is a base64 encoded JSON fragment (which by itself is syntactically incomplete), followed by "---", followed by a random looking 40-digit hex-string.
I'm especially interested in the random looking token. Is it randomly generated, or is it the encoding/encryption of something?
Mojolicious session have base64 string (in the begin, first part) and sign (in the end, second part) which separated by "---".
Sign is main part of session which prevent from changes.
So, make a test:
Add to session some value. Make request which get this value in the session.
Get session and transform first part of them (make base64_decode and change value then make base64_encode and put it before "--" in cookies).
Make query to server with new cookie/session. Your new data must be invalid in session.
So, sign it is IMPORTANT part of session.
Read source code to learn more about it
Read this to know how to set secret key for sign cookies

How to convert all \u**** string into the real readable NSString? [duplicate]

I am trying to support arbitrary unicode from a variety of international users. They have already put a bunch of data into sqlite databases on their iPhones, and now I want to capture the data into a database, then send it back to their device. Right now I am using a php page that is sending data back to from an internet mysql database. The data is saved in the mysql database properly, but when it's sent back it comes out as unicode text, such as
Frank\u00e2\u0080\u0099s iPad
instead of just
Frank's iPad
where the apostrophe should really be a curly apostrophe.
The answer posted to another question indicates that there is no built-in Cocoa methods to convert the "\u00e2\u0080\u0099" portion of the unicode string from the webserver to an NSString object. Is this correct?
That seems really surprising (and scarily disappointing), since Cocoa definitely allows input from many different Unicode characters, and I need to support any arbitrary language that I have never heard of, and all of the possible characters. I save them to and from the local sqlite database just fine now, but once I send it to a web server, then perhaps pull down different data, I want to ensure the data pulled from the web server is correctly formatted.
[...] there is no built-in Cocoa methods to convert [...]. Is this
correct?
It's not correct.
You might be interested in CFStringTransform and it's capabilities. It is a full blown ICU transformation engine, which can (also) perform your requested transformation.
See Using Objective C/Cocoa to unescape unicode characters, ie \u1234
All NSStrings are Unicode.
The problem with the “Frank\u00e2\u0080\u0099s iPad” data isn't that it's Unicode; it's that it's escaped to ASCII. “Frank’s iPad” is valid Unicode in any UTF, and is what you need.
So, you need to see whether the database is returning the data escaped or the PHP layer is escaping it at some point. If either of those is the case, fix it if you can; the PHP resource should return UTF-8/16/32. Only if that approach fails should you seek to unescape the string on the Cocoa side.
You're correct that there is no built-in way to unescape the string in Cocoa. If you get to that point, see if you can find some open-source code to do it; if not, you'll need to do it yourself, probably using NSScanner.
Check that your web service response has Content type and charset. Also that xml has encoding specified. In PHP you need to add the following before printing XML:
header('Content-type: text/xml; charset=UTF-8');
print '<?xml version="1.0" encoding="UTF-8"?>';
I guess there is just no encoding specified.

Passing Perl Data Structures as Serialized GET Strings to a Perl CGI program

I want to pass a serialized Perl data structure as a GET variable to a CGI application. I tried Data::Serializer as my first option. Unfortunately the serialized string is too long for my comfort, in addition to containing options joined by '^' (a caret).
Is there a way I can create short encoded strings from perl data structures so that I can safely pass them as GET variables to a perl CGI application?
I would also appreciate being told that this (serialized, encoded strings) is a bad way to pass complex data structures to web applications and suggestions on how I could accomplish this
If you need to send URL's to your users that contains a few key datapoints and you want to ensure it can't be forged you can do this with a Digest (such as from Digest::SHA) and a shared secret. This lets you put the data out there in your messages without needing to keep a local database to track it all. My example doesn't include a time element, but that would be easy enough to add in if you want.
use Digest::SHA qw(sha1_base64);
my $base_url = 'http://example.com/thing.cgi';
my $email = 'guy#somewhere.com';
my $msg_id = '123411';
my $secret = 'mysecret';
my $data = join(":", $email, $msg_id, $secret);
my $digest = sha1_base64($data);
my $url = $base_url . '?email=' . $email . '&msg_id=' . $msg_id' . '&sign=' . $digest;
Then send it along.
In your "thing.cgi" script you just need to extract the parameters and see if the digest submitted in the script matches the one you locally regenerate (using $email and $msg_id, and of course your $secret). If they don't match, don't authorize them, if they do then you have a legitimately authorized request.
Footnote:
I wrote the "raw" methods into Data::Serializer to make translating between serializers much easier and that in fact does help with going between languages (to a point). But that of course is a separate discussion as you really shouldn't ever use a serializer for exchanging data on a web form.
One of the drawbacks of the approach — using a perl-specific serializer, that is — is that if you ever want to communicate between the client and server using something other than perl it will probably be more work than something like JSON or even XML would be. The size limitations of GET requests you've already run in to, but that's problematic for any encoding scheme.
It's more likely to be a problem for the next guy down the road who maintains this code than it is for you. I have a situation now where a developer who worked on a large system before I did decided to store several important bits of data as perl Storable objects. Not a horrible decision in and of itself, but it's making it more difficult than it should be to access the data with tools that aren't written in perl.
Passing serialized encoded strings is a bad way to pass complex data structures to web applications.
If you are trying to pass state from page to page, you can use server side sessions which would only require you to pass around a session key.
If you need to email a link to someone, you can still create a server-side session with a reasonable expiry time (you'll also need to decide if additional authentication is necessary) and then send the session id in the link. You can/should expire the session immediately once the requested action is taken.

RESTful, efficient way to query List.contains(element)?

Given:
/images: list of all images
/images/{imageId}: specific image
/feed/{feedId}: potentially huge list of some images (not all of them)
How would you query if a particular feed contains a particular image without downloading the full list? Put another way, how would you check whether a resource state contains a component without downloading the entire state? The first thought that comes to mind is:
Alias /images/{imageId} to /feed/{feedId}/images/{imageId}
Clients would then issue HTTP GET against /feed/{feedId}/images/{id} to check for its existence. The downside I see with this approach is that it forces me to hard-code logic into the client for breaking down an image URI to its proprietary id, something that REST frowns upon. Ideally I should be using the opaque image URI. Another option is:
Issue HTTP GET against /feed/{feedId}?contains={imageURI} to check for existence
but that feels a lot closer to RPC than I'd like. Any ideas?
What's wrong with this?
HEAD /images/id
It's unclear what "feed" means, but assuming it contains resources, it'd be the same:
HEAD /feed/id
It's tricky to say without seeing some examples to provide context.
But you could just have clients call HEAD /feed/images/{imageURI} (assuming that you might need to encode the imageURI). The server would respond with the usual HEAD response, or with a 404 error if the resource doesn't exist. You'd need to code some logic on the server to understand the imageURI.
Then the client either uses the image meta info in the head, or gracefully handles the 404 error and does something else (depending on the application I guess)
There's nothing "un-RESTful" about:
/feed/{feedId}?contains={imageURI}[,{imageURI}]
It returns the subset as specified. The resource, /feed/{feedid}, is a list resource containing a list of images. How is the resource returned with the contains query any different?
The URI is unique, and returns the appropriate state from the application. Can't say anything about the caching semantics of the request, but they're identical to whatever the caching semantics are of the original /feed/{feedid}, it simply a subset.
Finally, there's nothing that says that there even exists a /feed/{feedid}/image/{imageURL}. If you want to work with the sub-resources at that level, then fine, but you're not required to. The list coming back will likely just be a list of direct image URLS, so where's the link describing the /feed/{feedid}/image/{imageURL} relationship? You were going to embed that in the payload, correct?
How about setting up a ImageQuery resource:
# Create a new query from form data where you could constrain results for a given feed.
# May or may not redirect to /image_queries/query_id.
POST /image_queries/
# Optional - view query results containing URIs to query resources.
GET /image_queries/query_id
This video demonstrates the idea using Rails.