Extract a JSON key from an incoming stream - Scala

I am currently working on an endpoint for the Postmark service. Postmark delivers received e-mails as JSON and puts the attachments, Base64-encoded, into an "Attachments" array.
My question is whether it is possible to use Akka Streams to extract the attachments while they are being received and "do my things" with them.
In detail: I want to slice the attachments into chunks and hash them as they come in, storing the chunks in my database.
The rest of the incoming JSON (without the attachments) can be processed afterwards.
The reason I want to do this is to use as little memory as possible, because attachments can be up to 100 MB in size and hashing such large files in one go can consume a lot of processing power at once.
I hope to save memory and avoid CPU load spikes by processing the attachments in chunks.
If you know a better way to achieve this, I am open to any suggestion! :)
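
One possible way to sketch this is with Alpakka's JSON streaming connector (akka-stream-alpakka-json-streaming): JsonReader.select pulls each attachment's content out of the byte stream without parsing the whole request into memory. One caveat: each matched value is emitted as a single element, so an individual attachment is still buffered once; the chunking below then bounds the hashing and database work. The JsonPath "$.Attachments[*].Content", the storeChunk call, and SHA-256 as the digest are all assumptions to adapt to your actual payload:

```scala
import akka.actor.ActorSystem
import akka.stream.alpakka.json.scaladsl.JsonReader
import akka.stream.scaladsl.{Sink, Source}
import akka.util.ByteString
import java.security.MessageDigest
import java.util.Base64
import scala.concurrent.Future

object AttachmentEndpoint {
  implicit val system: ActorSystem = ActorSystem("postmark-endpoint")
  import system.dispatcher

  // Hypothetical persistence call: store one decoded chunk in the database.
  def storeChunk(chunk: ByteString): Future[Unit] = Future.unit

  // Decode one Base64 attachment and hash/store it in 64 KiB chunks, so the
  // digest is built incrementally rather than over 100 MB at once.
  def hashAndStore(value: ByteString): Future[String] = {
    // JsonReader emits the raw JSON value, so strip surrounding quotes.
    val b64    = value.utf8String.stripPrefix("\"").stripSuffix("\"")
    val bytes  = Base64.getMimeDecoder.decode(b64) // lenient about line breaks
    val digest = MessageDigest.getInstance("SHA-256")
    Source.fromIterator(() => bytes.grouped(64 * 1024))
      .map(ByteString(_))
      .mapAsync(parallelism = 1) { chunk =>
        digest.update(chunk.toArray) // incremental hashing, chunk by chunk
        storeChunk(chunk)
      }
      .runWith(Sink.ignore)
      .map(_ => digest.digest().map("%02x".format(_)).mkString)
  }

  // `body` would be request.entity.dataBytes in akka-http.
  def process(body: Source[ByteString, Any]): Future[Seq[String]] =
    body
      .via(JsonReader.select("$.Attachments[*].Content")) // one per attachment
      .mapAsync(parallelism = 1)(hashAndStore)
      .runWith(Sink.seq)
}
```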

Related

Using messaging apps to transfer binary data

I was wondering if it's possible to transfer arbitrary binary data through messaging apps such as Telegram. I guess the question is whether binary data can be transferred through text messages. I read somewhere that this is possible if Base64 binary-to-text encoding is used. Telegram is a platform which is not censored in the country I'm living in, so if I can relay binary data through Telegram, it could be used to bypass censorship. Does Telegram support Base64 encoding? What are your thoughts on this?
Well, I'm almost certain that Base64 encoding can be used to carry binary data on Telegram, since the encoded payload is plain text like any other message. However, Telegram limits the rate at which messages can be sent, so the idea of using Telegram as a proxy is not achievable: relaying any significant amount of traffic would require far too many messages.
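
For what it's worth, the encoding step itself is trivial. A minimal round-trip sketch (Scala, JDK only) showing both why Base64 works for this and where the overhead comes from:

```scala
import java.util.Base64

// Base64 turns arbitrary bytes into text that is safe to send as a message,
// at the cost of a ~33% size overhead (3 bytes become 4 characters), which
// is exactly what makes the rate limits bite.
val payload: Array[Byte]   = Array(0x00, 0xFF, 0x7F).map(_.toByte)
val text: String           = Base64.getEncoder.encodeToString(payload) // "AP9/"
val roundTrip: Array[Byte] = Base64.getDecoder.decode(text)
assert(roundTrip.sameElements(payload))
```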

Deserializing multiple objects from asynchronous socket

All,
I'm a bit of a newbie to C# and socket programming and could use some advice. I have been looking on this site and similar sites but haven't really found a solution for my problem.
I am developing a client application and a server application and the two are communicating over an asynchronous socket. The client sends objects to the server, one at a time, by serializing it to a MemoryStream using BinaryFormatter. The resulting byte array is sent over the socket and deserialized by the server.
This works well when the server has time to receive and process the object before the client sends a new one. However when the client sends objects faster than the server can handle them, they queue up at the server side. The next EndReceive() call reads all queued objects from the socket, but the serializer only deserializes the first object and the other ones are lost.
The objects are of variable size, so I guess I can't use the Position property of MemoryStream. Is there a way to detect in the byte array where each object starts?
Also, I have read in other posts that EndReceive() may not receive everything that has been sent in one read, and that further reads may be needed. So I guess that's something else I'll have to deal with?
Any pointers? Any help would be greatly appreciated. :-)
You could read as much as is available and queue it up for processing so that the socket buffer doesn't back up: have the server simply read the data as it arrives and push it into a message queue to be processed asynchronously.
It's concerning that the server can't process fast enough to keep up with the writes, though; you might want to look into optimizing that.
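
The standard fix for "where does each object start?" is to frame the messages yourself, most simply with a length prefix. The question is about C#, but the idea is language-agnostic; here is a sketch in Scala (matching the other examples on this page) using the JDK's data streams. Note that readFully also deals with the partial-read issue mentioned above:

```scala
import java.io.{DataInputStream, DataOutputStream, InputStream, OutputStream}

// Each message is preceded by a 4-byte big-endian length, so the reader
// always knows where one serialized object ends and the next begins,
// no matter how the bytes were split or coalesced across socket reads.
def writeFrame(out: OutputStream, payload: Array[Byte]): Unit = {
  val dos = new DataOutputStream(out)
  dos.writeInt(payload.length) // 4-byte length prefix
  dos.write(payload)
  dos.flush()
}

def readFrame(in: InputStream): Array[Byte] = {
  val dis    = new DataInputStream(in)
  val length = dis.readInt()           // blocks until the 4 length bytes arrive
  val buffer = new Array[Byte](length)
  dis.readFully(buffer)                // loops internally over partial reads
  buffer
}
```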

Email attachments and bandwidth usage

I am working on a module that enables our system users to send bulk emails to all the registered active applicants from the applicant pool. Currently, there are more than 10 million active applicants in the pool to which emails can be sent. I am thinking of creating blocks of emails and waiting a few minutes between sending individual blocks. What I am more concerned about is the attachment.
Since every email can contain an attachment (max. 2 MB), there is a possibility that a huge amount of bandwidth will be consumed, even if the email is sent to only 10,000 applicants (2 MB × 10,000 applicants ≈ 20 GB of bandwidth). My questions are:
Since every attachment is a MIME type, will the size of the email be calculated the way I have calculated above? Or is there a different mechanism, especially in the context of bandwidth usage?
In your opinion, what options do I have if I have to send a document to thousands of people and want to save bandwidth as well? I could put the document on the server and let everybody download it, but won't that consume the same amount of bandwidth? (I don't want to go down the FTP route.)
Somebody suggested moving these kinds of documents to the cloud. Does cloud technology offer solutions that cater to this kind of need?
Many thanks,
The attachment creates a problem of being flagged as spam. Best avoid it if you can.
The attachment is MIME (Base64) encoded rather than gzip compressed, which makes it roughly a third larger on the wire (about 1.33-1.37 times the original size).
It is not easy to see whether the attachment has been opened unless it carries some payload that reports that for you, which again could get it flagged as spam.
Putting these documents on a regular web server makes more sense. You can use normal Google Analytics to see what is going on. You can also use public caching so that the document is cached by ISPs and the like, thereby reducing your outbound traffic. The document can also be served gzip-compressed; browsers will decompress it transparently for your recipients.
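
A quick back-of-the-envelope check of those numbers (the 1.37 factor below is the typical Base64-plus-line-breaks expansion; all figures are illustrative):

```scala
// Assuming a ~1.37x Base64/MIME expansion (4/3 plus CRLF line breaks)
// on a 2 MB attachment sent to 10,000 recipients.
val attachmentBytes = 2L * 1024 * 1024
val recipients      = 10000
val mimeOverhead    = 1.37
val totalGiB = attachmentBytes * recipients * mimeOverhead / math.pow(1024, 3)
println(f"$totalGiB%.1f GiB") // ~26.8 GiB on the wire, not the raw 20 GB
```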

iPhone: Strategies for uploading large files from phone to server

We're running into issues uploading high-res images from the iPhone to our backend (cloud) service. The call is a simple HTTP file upload, and the issue appears to be the connection breaking before the upload is complete; on the server side we're getting IOError: Client read error (Timeout?).
This happens sporadically: most of the time it works, sometimes it fails. When a good connection is present (i.e. Wi-Fi) it always works.
We've tuned various timeout parameters on the client library to make sure we're not hitting any of them. The issue actually seems to be unreliable mobile connectivity.
I'm thinking about strategies for making the upload reliable even when faced with poor connectivity.
The first thing that came to mind was to break the file into smaller chunks and transfer it in pieces, increasing the likelihood of each piece getting there. But that introduces a fair bit of complexity on both the client and server side.
Do you have a cleverer approach? How would you tackle this?
I would use the ASIHTTPRequest library. It has some great features like bandwidth throttling, and it can upload files directly from disk instead of loading the whole file into memory first. I would also break the photo into about 10 parts, so for a 5 MB photo each part would be around 500 KB. You would create each upload as a request in a queue. When the app goes into the background, it can then finish the part it's currently uploading, and if it cannot finish uploading all the parts in the allocated time, it can post a local notification reminding the user that the upload isn't complete. After all the parts have been sent to your server, you make a final request that combines the parts back into the original photo on the server side.
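
The part-by-part retry logic that answer describes boils down to something like this sketch (Scala for consistency with the other examples here; uploadChunk is a hypothetical stand-in for the actual HTTP call):

```scala
import scala.annotation.tailrec

// Split the file into fixed-size parts and retry each part independently,
// so a flaky connection only costs you one 500 KB piece, not the whole file.
def uploadInChunks(file: Array[Byte],
                   uploadChunk: (Array[Byte], Int) => Boolean,
                   chunkSize: Int = 500 * 1024,
                   maxRetries: Int = 3): Boolean =
  file.grouped(chunkSize).zipWithIndex.forall { case (chunk, idx) =>
    @tailrec
    def attempt(retriesLeft: Int): Boolean =
      if (uploadChunk(chunk, idx)) true            // this piece made it
      else if (retriesLeft > 0) attempt(retriesLeft - 1)
      else false                                   // give up; resume from idx later
    attempt(maxRetries)
  }
// Once every chunk succeeds, a final request would ask the server to
// reassemble the parts into the original file.
```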
Yeah, timeouts are tricky in general, and get more complex when dealing with mobile connections.
Here are a couple ideas:
Attempt to upload to your cloud service as you are doing. After a few failures (timeouts), mark the file and ask the user to connect their phone to a Wi-Fi network, or wait until they connect to a computer and have them upload manually via the web. This isn't ideal, as it pushes more work onto your users; the upside is that, implementation-wise, it's pretty straightforward.
Instead of doing an HTTP upload, do a raw socket send. Using a raw socket, you can send binary data in chunks pretty easily, and if any chunk send times out, resend it until the entire image file is sent. This is "more complex" in that you have to manage the binary socket transfer yourself, but I think it's easier than trying to chunk files through an HTTP upload.
Anyway, that's how I would approach it.

How can I retrieve an e-mail, open a .msg attachment, and parse the attachment, in ASP.NET?

I need to be able to make a program that looks through a mailbox of bounced messages, where the messages come back with the initial message in a .msg attachment, and open the .msg attachment for processing in ASP.NET 2.0. Is there any sort of code that might help in this? I've been looking at Reading Email using Pop3 in C# as a starting point, but can't figure out how best to open the attachment from there, or if there's some easier way I'm missing.
From your post, it appears that you are better off getting a third-party component that has already implemented the protocol (POP3 or IMAP).
I just googled and got one and I bet there are a bunch out there.
http://www.jscape.com/articles/retrieving_email_pop3_csharp.html
Parsing bounce messages in general is a huge task, because their formats vary greatly between different mail transport agents. So unless you are on a closed network, or you only care for bounces reported directly from your own transport agent, then you are in for a big job, and you certainly cannot count on the original messages being attached in full to the bounce answers.
If it is possible for you to regenerate the outgoing mails from a few key parameters, then you might want to consider using a VERP addressing scheme instead. Your parsing job would then be reduced to recognizing and deciphering the recipient addresses of the bounce messages, instead of their full content.
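
To make the VERP idea concrete, here is a minimal sketch (hypothetical helper names; it assumes local parts contain no '='):

```scala
// VERP (Variable Envelope Return Path): encode the recipient into the
// bounce address, so a bounce can be attributed from its envelope
// recipient alone, without parsing the bounce body at all.
def verpAddress(recipient: String,
                bounceDomain: String = "bounces.example.com"): String =
  s"bounce+${recipient.replace("@", "=")}@$bounceDomain"

def recipientFromBounce(address: String): Option[String] =
  address.split("@").headOption
    .filter(_.startsWith("bounce+"))
    .map(_.stripPrefix("bounce+").replaceFirst("=", "@"))

// verpAddress("alice@example.org")
//   -> "bounce+alice=example.org@bounces.example.com"
// recipientFromBounce("bounce+alice=example.org@bounces.example.com")
//   -> Some("alice@example.org")
```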
I ended up going with a solution involving reading in the messages using Microsoft.Office.Interop.Outlook ( http://support.microsoft.com/?kbid=310244 ), saving the attached .msg to the drive, then finally reading in that message using an open-source third party solution ( http://www.codeproject.com/KB/office/reading_an_outlook_msg.aspx ). It's probably not the most efficient solution overall, but it handles the specific case we needed to support.