HTTP reading and rewriting bytes - scala

I am getting a blob of binary data by executing a command-line program. This blob can be decoded by the same program.
I would like to send this binary data in an HTTP response. For that I use the "application/octet-stream" MIME type (I also tried "text/plain"). But when the client gets the data on the other end, it is not readable anymore. The format has changed somehow.
Here is my Scala code (but I think it is a generic question):
val command = s"samtools view -h $testbam $region"
val res: String = command.!! // Gets the stdout as string
Result(
header = ResponseHeader(200, Map.empty),
body = HttpEntity.Strict(ByteString(res), Some("application/octet-stream"))
)
The same command, run on the command-line, can be decoded properly. But the result of a "curl" here, although it does return a bunch of unreadable symbols, cannot be decoded.
Sending the same data in its readable (uncompressed) form goes through properly without losing content or formatting.
What kind of character conversion am I missing? Is there an encoding I need to specify?
Edit: Play 2.5.0
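One plausible cause, going only by the snippet above, is that the command's stdout is stored in a String before being wrapped in a ByteString, so the bytes pass through a character decoding step. The following is a minimal Python 3 sketch of the alternative idea, not Play or Scala code: it captures a child process's stdout as raw bytes and never converts it to text. The child command is a hypothetical stand-in for the samtools call.

```python
import subprocess
import sys

# Hypothetical stand-in for "samtools view ...": a child process that writes
# every possible byte value (0-255) to its stdout.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(bytes(range(256)))",
]

# Without text=True or encoding=..., stdout is captured as raw bytes,
# so no character decoding is applied to the data.
raw = subprocess.run(child, stdout=subprocess.PIPE, check=True).stdout

# The captured bytes are identical to what the child wrote.
print(raw == bytes(range(256)))
```

The same principle would apply on the Scala side: read the process output as bytes and hand those bytes to the response body without going through a String.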

Related

How to include file bytes into the POST request body in Jmeter? (What encoding to use)

I have to perform POST requests from Jmeter. I use default HTTPRequest sampler, where I specify the JSON structure that is understandable by the testing app. One part of this JSON has to contain binary data from a pdf file.
For reading the file I use BeanShellSampler in the setUp thread group:
File file = new File(bsh.args[0]);
try {
    FileInputStream fis = new FileInputStream(file);
    byte[] array = new byte[(int)file.length()];
    log.info("String is read.");
    fis.read(array);
    vars.put("fileEntity", new String(array, "cp1252"));
} catch (e) {
    e.printStackTrace();
    log.error(e.getMessage());
}
The problem is that when I inspect the request with Fiddler, the binary object is represented differently from how it appears in Postman's requests:
[Screenshot: Postman request]
[Screenshot: JMeter request]
I think that there is something wrong with the encoding when I create a String object in the BeanShellSampler. What encoding is correct?
I tried to use RawDataSource plugin but it doesn't help for two reasons:
It fails to read my file, saying "Error reading next chunk"
It uses the same approach that I do to read the file, but uses UTF8 encoding. I tried this encoding also, but without any success.
My expectation is that your fis.read(array); call relies on the default value of the file.encoding system property, which may or may not be cp1252.
I would recommend introducing an InputStreamReader and explicitly specifying the encoding there, like:
InputStreamReader isr = new InputStreamReader(fis,"cp1252");
Also be aware that starting from JMeter 3.1 it's recommended to use JSR223 Test Elements and the Groovy language for scripting, mainly because Groovy performs much better compared to Beanshell.

Need to find the requests equivalent of urlopen() from urllib2

I am currently trying to modify a script to use the requests library instead of the urllib2 library. I haven't really used it before and I am looking to do the equivalent of urlopen("http://www.example.org").read(), so I tried the requests.get("http://www.example.org").text function.
This works fine with normal everyday html, however when I fetch from this url (https://gtfsrt.api.translink.com.au/Feed/SEQ) it doesn't seem to work.
So I wrote the below code to print out the responses from the same url using both the requests and urllib2 libraries.
import urllib2
import requests
#urllib2 request
request = urllib2.Request("https://gtfsrt.api.translink.com.au/Feed/SEQ")
result = urllib2.urlopen(request)
#requests request
result2 = requests.get("https://gtfsrt.api.translink.com.au/Feed/SEQ")
print result2.encoding
#urllib2 write to text
open("Output.txt", 'w').close()
text_file = open("Output.txt", "w")
text_file.write(result.read())
text_file.close()
open("Output2.txt", 'w').close()
text_file = open("Output2.txt", "w")
text_file.write(result2.text)
text_file.close()
The urlopen().read() works fine but the requests.get().text doesn't work for the given URL. I suspect it has something to do with encoding, but I don't know what. Any thoughts?
Note: The supplied URL is a feed in the Google protocol buffer format; once I receive the message I give the feed to a Google library that interprets it.
Your issue is that you're making the requests module interpret binary content in a response as text.
A response from the requests library has two main ways to access the body of the response:
Response.content - will return the response body as a bytestring
Response.text - will decode the response body as text and return unicode
Since protocol buffers are a binary format, you should use result2.content in your code instead of result2.text.
Response.content will return the body of the response as-is, in bytes. For binary content this is exactly what you want. For text content that contains non-ASCII characters this means the content must have been encoded by the server into a bytestring using a particular encoding that is indicated by either an HTTP header or a <meta charset="..." /> tag. In order to make sense of those bytes they therefore need to be decoded with that charset after they are received.
Response.text, by contrast, is a convenience accessor that does exactly this for you. It assumes the response body is text, looks at the response headers to find the encoding, and decodes the body for you, returning unicode.
But if your response doesn't contain text, this is the wrong method to use. Binary content doesn't contain characters, because it's not text, so the whole concept of character encoding does not make any sense for binary content - it's only applicable to text composed of characters. (That's also why you're seeing response.encoding == None - it's just bytes, there is no character encoding involved).
See Response Content and Binary Response Content in the requests documentation for more details.
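The difference between the two accessors can be illustrated without a network call. The following is a minimal Python 3 sketch using only the standard library; the byte values and the file name are made up for illustration, and the decode step only approximates what a text accessor does with a binary body.

```python
import os
import tempfile

# Hypothetical stand-in for a binary response body (not real feed data).
body = bytes([0x0A, 0x03, 0xFF, 0xFE, 0x80, 0x10, 0x01])

# Treating the body as text replaces undecodable bytes with U+FFFD,
# so encoding the text again does not give the original bytes back.
as_text = body.decode("utf-8", errors="replace")
text_round_trip = as_text.encode("utf-8")

# Writing the bytes in binary mode ("wb") keeps them unchanged.
path = os.path.join(tempfile.mkdtemp(), "Output2.bin")
with open(path, "wb") as f:
    f.write(body)
with open(path, "rb") as f:
    binary_round_trip = f.read()

print(text_round_trip == body)    # text round trip altered the bytes
print(binary_round_trip == body)  # binary round trip preserved them
```

In the script from the question, the equivalent change would be to write result2.content to a file opened in binary mode.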

Scala Playframework send file

I have a string of data, which I get from data in my database. I want to send it to the user, but without creating a local copy of the file, something like
Ok(MyString).as("file/csv")
But it is not working. How can I do it?
You can do this by using chunked with an Enumerator. I've also used withHeaders to specify the content type and disposition of the Result to "attachment", so that the client will interpret it as a file to download (rather than opening in the browser itself).
import play.api.libs.iteratee.Enumerator
val myString: String = ??? // the String you want to send as a file
Ok.chunked(Enumerator(myString.getBytes("UTF-8")).andThen(Enumerator.eof))
  .withHeaders(
    "Content-Type" -> "text/csv",
    "Content-Disposition" -> "attachment; filename=mystring.csv"
  )
This might not compile right away, depending on the types you're getting from the database.
Come to think of it, this should also work (without the Enumerator):
Ok(myString).withHeaders( /* headers from above */ )

Base64 decoding of MIME email not working (GMail API)

I'm using the GMail API to retrieve an email's contents. I am getting the following base64 encoded data for the body: http://hastebin.com/ovucoranam.md
But when I run it through a base64 decoder, it either returns an empty string (error) or something that resembles the HTML data but with a bunch of weird characters.
Help?
I'm not sure if you've solved it yet, but GmailGuy is correct. You need to convert the body to the standard Base64 alphabet from RFC 4648. The gist is you'll need to replace - with + and _ with /.
I've taken your original input and did the replacement: http://hastebin.com/ukanavudaz
And used base64decode.org to decode it, and it was fine.
You need to use the URL-safe (aka "web-safe") base64 decoding alphabet (see RFC 4648), which it doesn't appear you're doing. Using the standard base64 alphabet may work sometimes but not always (2 of the characters are different).
The docs don't seem to mention this important detail consistently. Here's one page that does, though:
https://developers.google.com/gmail/api/guides/drafts
Also, if your particular library doesn't support the "URL safe" alphabet then you can do string substitution on the string first ("-" with "+" and "_" with "/") and then do normal base64 decoding on it.
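The substitution approach and a URL-safe decoder should give the same result. Here is a small Python 3 sketch of that equivalence; the payload bytes are hypothetical and were chosen only so that the encoded form contains both characters that differ between the two alphabets.

```python
import base64

# Hypothetical payload whose encoding uses both alphabet-specific characters.
original = b"\xfb\xff\xfe"

# URL-safe alphabet uses "-" and "_" where the standard one uses "+" and "/".
url_safe = base64.urlsafe_b64encode(original).decode("ascii")

# Option 1: substitute the two characters, then decode with the standard alphabet.
standard = url_safe.replace("-", "+").replace("_", "/")
decoded_by_substitution = base64.b64decode(standard)

# Option 2: decode directly with the URL-safe alphabet.
decoded_url_safe = base64.urlsafe_b64decode(url_safe)

print(url_safe, standard)
print(decoded_by_substitution == decoded_url_safe == original)
```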
I had the same issue decoding the 'data' fields in the message object response from the Gmail API. The Google Ruby API library wasn't decoding the text correctly either. I found I needed to do a url-safe base64 decode:
@data = Base64.urlsafe_decode64(JSON.parse(@result.data.to_json)["payload"]["body"]["data"])
Hope that helps!
Here is an example for Python 2.x and 3.x:
decodedContents = base64.urlsafe_b64decode(payload["body"]["data"].encode('ASCII'))
If you only need to decode for display purposes, consider using atob to decode the messages in a JavaScript frontend.
While playing with the API result, I found that once I had drilled down to the body, the available methods included one that decodes the data.
val message = mService!!.users().messages().get(user, id).setFormat("full").execute()
println("Message snippet: " + message.snippet)
if (message.payload.mimeType == "text/plain") {
    val body = message.payload.body.decodeData() // getValue("body")
    Log.i("BODY", body.toString(Charset.defaultCharset()))
}
The result:
com.example.quickstart I/BODY: ISOLATE NORMAL: 514471,Fap, South Point Rolleston, 55 Faringdon Boulevard , Rolleston, 30 May 2018 20:59:21
I copied the base64 text to a file (b64.txt), then base64-decoded it using base64 (from coreutils) with the -d option (see http://linux.die.net/man/1/base64) and I got text that was perfectly readable. The command I used was:
cat b64.txt | base64 -d

httprequest encoding mismatch

I'm using a Google Gears Worker to submit a POST httprequest (using var request = google.gears.factory.create('beta.httprequest');) with a parameter containing the string
"bford%20%24%23%26!%3F%40%20%E5%BE%B3%E5%8A%9B%E5%9F%BA%E5%BD%A6"
but the Django HttpRequest is receiving it as "bford $#&!?@ å¾³å\u008a\u009bå\u009fºå½¦"
How do I specify to one or the other of the parties in the transaction to leave it untranslated?
Check the HttpRequest.encoding and the DEFAULT_CHARSET settings. Judging by the encoded value, this should be UTF-8 (which is indeed usually the right thing).
You can get the ‘untranslated’ (with %s still in) value by looking at the input stream (for POST) or environ QUERY_STRING (for GET) and decoding it manually, but it would be better to fix Django's incorrect string-to-unicode decoding really.
As I understand it, Django 1.0 should default to using UTF-8, so I'm not sure why it's not in your case.
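The value Django reports is consistent with the percent-decoded bytes being interpreted as Latin-1 instead of UTF-8. The following Python 3 sketch reproduces that effect with urllib.parse.unquote; it does not involve Django or Gears, and the idea that Latin-1 was the charset actually applied is an assumption based only on the garbled output in the question.

```python
from urllib.parse import unquote

# The parameter exactly as given in the question.
param = "bford%20%24%23%26!%3F%40%20%E5%BE%B3%E5%8A%9B%E5%9F%BA%E5%BD%A6"

# Percent-decode and interpret the resulting bytes as UTF-8 (intended reading).
as_utf8 = unquote(param, encoding="utf-8")

# Percent-decode but interpret the same bytes as Latin-1 (assumed misreading).
as_latin1 = unquote(param, encoding="latin-1")

# The misread string can be repaired by reversing the wrong decoding step.
repaired = as_latin1.encode("latin-1").decode("utf-8")

print(ascii(as_utf8))
print(ascii(as_latin1))
print(repaired == as_utf8)
```

The repair step is shown only to make the diagnosis concrete; fixing the charset used for decoding, as suggested above, avoids the need for it.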