Set Kafka Streams header from Int to Array[Byte] conversion - Scala

I'm using Kafka Streams on a project (in Scala) in which I need to include some headers on the records that I'm sending to a topic. Specifically, a Header requires you to convert the value that you want to put in the header (in my case Int and Long) to an Array[Byte]. If I try to convert an Int to an Array[Byte] using the toByte function, I cannot recover the header from the topic (it appears as null). The only way that seems to work is to convert it to a String and then use getBytes to store it in the header:
val counter: Int = 1
// counter_1 doesn't appear
context.headers().add("counter_1", Array[Byte](counter.toByte))
// counter_2 doesn't appear
context.headers().add("counter_2", ByteBuffer.allocate(4).putInt(counter).array())
// the value 1 appears on the header counter_3.
context.headers().add("counter_3", String.valueOf(counter).getBytes(StandardCharsets.UTF_8))
Converting an Int to a String doesn't seem like the correct way to do this. What's the right way?
I'm recovering the records using the kafkacat command with the %h option. So I would get something like this:
...counter_1=,counter_2=,counter_3=1,....
Any help would be greatly appreciated.

Notice that Kafka Streams sends a stream of bytes without telling the Kafka topic whether those bytes represent an Integer, a Double, a String or some other type. Kafka is not aware of the type of the data that was sent; it just gets some bytes.
When you fetch messages using kafkacat, it assumes the header values are Strings. So if you have sent a String encoded to bytes, it is able to display it. However, when you send an Int encoded to raw bytes, most of those bytes correspond to unprintable characters, and that's why kafkacat won't display them. The header value is still on the record; it just isn't human-readable.
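For completeness, here is a minimal sketch of the round trip, assuming the consumer is your own code rather than kafkacat: encode the Int as 4 big-endian bytes and decode it with ByteBuffer when reading the header back. If the value must stay readable in kafkacat's %h output, the String/getBytes approach is the one that will actually show up there.

import java.nio.ByteBuffer
import org.apache.kafka.common.header.Headers

val counter: Int = 1

// Producer/processor side: encode the Int as 4 big-endian bytes.
def addCounter(headers: Headers): Unit =
  headers.add("counter", ByteBuffer.allocate(4).putInt(counter).array())

// Consumer side: decode those 4 bytes back into an Int.
def readCounter(headers: Headers): Option[Int] =
  Option(headers.lastHeader("counter")).map(h => ByteBuffer.wrap(h.value()).getInt)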

Related

Socket response header as hexadecimal format on two digits

I have a TCP socket connection where I am trying to respond to a request message. The specification I have states that a response to a request should be in the format of a header followed by the response data, where the header contains the data length in hexadecimal format on two digits. My response data is "202000000000000TEST0_OK ".
As the response is 25 characters long, I assumed the header would be the hex representation of 25, so 0x19. However, when sending this, I'm getting an unknown error back from the device I'm replying to.
Using the BlueSocket library, I'm sending my response using:
if(message.contains("00000000000000001")) {
    reply = "202000000000000TEST0_OK "
    var header: UInt8 = 0x19
    try socket.write(from: Data.init(bytes: &header, count: MemoryLayout<UInt8>.size))
    try socket.write(from: reply)
}
Unfortunately, I don't have any further error information. I'm pretty confident my understanding of how to calculate and send the header is incorrect...
Edit: I've also tried the following responses which all failed:
"25XXXXXXXXXXXXXXXXXXXXXXXXX" Started with the decimal length of the data string (25).
"27XXXXXXXXXXXXXXXXXXXXXXXXX" Prepended with the entire length of the data and header string (27).
"19XXXXXXXXXXXXXXXXXXXXXXXXX" Prepended with the hex representation of the length of the data string (25 characters so 0x19).
"1BXXXXXXXXXXXXXXXXXXXXXXXXX" Prepended with the hex representation of the entire length of the data and header string (27 characters so 0x1B).

HTTP reading and rewriting bytes

I am getting a blob of binary data by executing a command-line program. This blob can be decoded by the same program.
I would like to send this binary data in an HTTP response. For that I use the "application/octet-stream" MIME type (I also tried "text/plain"). But when the client gets the data on the other end, it is not readable anymore. The format has changed somehow.
Here is my Scala code (but I think it is a generic question):
val command = s"samtools view -h $testbam $region"
val res: String = command.!! // Gets the stdout as string
Result(
  header = ResponseHeader(200, Map.empty),
  body = HttpEntity.Strict(ByteString(res), Some("application/octet-stream"))
)
The same command, run on the command-line, can be decoded properly. But the result of a "curl" here, although it does return a bunch of unreadable symbols, cannot be decoded.
Sending the same data in its readable (uncompressed) form goes through properly without losing content or formatting.
What kind of character conversion am I missing? Is there an encoding I should specify?
Edit: Play 2.5.0
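No answer is recorded here, but one likely culprit (my assumption, not confirmed in the thread) is the String round trip: command.!! decodes the process's stdout with the platform charset, which is lossy for binary output such as a BAM stream. A sketch that keeps the bytes raw end to end, using Play 2.5 types as in the question:

import java.io.ByteArrayOutputStream
import scala.sys.process._
import akka.util.ByteString
import play.api.http.HttpEntity
import play.api.mvc.{ResponseHeader, Result}

// Hypothetical values standing in for the question's variables.
val testbam = "/data/test.bam"
val region = "chr1:1-1000"

// Capture stdout as raw bytes instead of decoding it into a String.
val out = new ByteArrayOutputStream()
val exitCode = (Seq("samtools", "view", "-h", testbam, region) #> out).!

Result(
  header = ResponseHeader(200, Map.empty),
  body = HttpEntity.Strict(ByteString(out.toByteArray), Some("application/octet-stream"))
)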

How to calculate the offset value for the Jetty 8 WebSocket sendMessage method

I have a project where a proxy sends Base64-encoded messages to the server. The server then decodes the messages into a byte array and sends them to the client. The Jetty 8 WebSocket.Connection sendMessage(data, offset, length) method expects an offset.
Question: how does one determine this offset when decoding from Base64?
Also, is it okay to assume that the length parameter is the decoded byte array's length?
def onMessage(message: String) {
  println("From client: " + message)
  val decoded = Base64.decodeBase64(message)
  println("Decoded and sent to the client: " + decoded)
  serverSocket.connection.sendMessage(decoded, offset???, decoded.length)
}
tl;dr: It's an offset into decoded, supporting the case that only a part of an array is to be sent. Here, this parameter probably should be 0.
The link you've put into your message points to API v7, not 8.
Grep coding for org.eclipse.jetty.websocket.connection, I've followed this one in jetty-websocket. Then you can find types implementing that method -- e.g. WebSocketConnectionD00. You see your data mysteriously disappearing into the addFrame method of another interface, WebSocketGenerator. Now here's finally the real meat.
This is pretty low-level here, you can find the data being put into yet another abstraction:
private Buffer _buffer;
...
_buffer.put(content, offset + (payload - remaining), chunk);
One more down, and here's the info. Wait... no. Either grepcode is showing wrong data here or the devs copy/pasted the comments from the void put(byte b) to the two methods below, just adapting the comment on the returned value.
One more down, and you finally see what's happening:
System.arraycopy(b, offset, dst_array, index, length);
, where b is the decoded byte[]. Unfortunately, using grepcode, one cannot dive into that implementation.
Note that I don't use Jetty. Just wanted to look into some other code...
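Under the assumption stated in the tl;dr above (send the whole decoded array, starting at offset 0), the call from the question would simply become:

import org.apache.commons.codec.binary.Base64

def onMessage(message: String) {
  val decoded: Array[Byte] = Base64.decodeBase64(message)
  // Offset 0 and length = decoded.length: send the entire decoded array.
  // serverSocket is the connection holder from the question's context.
  serverSocket.connection.sendMessage(decoded, 0, decoded.length)
}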

How to read the whole message with Chilkat socket?

I need to get the whole message (response), but socket.ReceiveBytes() returns just part of the message. I tried to loop it, but it fails with a timeout when there are no bytes left to receive.
List<byte> lb = new List<byte>();
byte[] receivedMsg = socket.ReceiveBytes();
while (receivedMsg.Length > 0)
{
    lb.AddRange(receivedMsg);
    receivedMsg = socket.ReceiveBytes();
}
So, how can I check whether there are bytes left to read? How can I read the whole message?
Since it's a Chilkat implementation, you should probably contact the developer. But I found this, which could help: http://www.cknotes.com/?p=302
Ultimately, you need to know how much to read from the socket to constitute a whole message. For example, if the overlying protocol is a portmapper, then you know that you are expecting messages in the format that the RFC specifies (http://tools.ietf.org/html/rfc1833).
If you are rolling your own protocol over a socket connection, then use the method in the Chilkat blog post about putting the size of the total message in the first 4 bytes.
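Chilkat specifics aside, the length-prefix idea from that post looks roughly like this as a generic sketch over a plain socket (not the Chilkat API): the sender writes the payload size as a 4-byte prefix, and the reader first reads exactly those 4 bytes, then exactly that many more.

import java.io.{DataInputStream, DataOutputStream}
import java.net.Socket

// Sender: prefix the payload with its length (4 bytes, big-endian).
def sendMessage(socket: Socket, payload: Array[Byte]): Unit = {
  val out = new DataOutputStream(socket.getOutputStream)
  out.writeInt(payload.length)
  out.write(payload)
  out.flush()
}

// Reader: read the 4-byte length first, then block until the whole payload arrives.
def receiveMessage(socket: Socket): Array[Byte] = {
  val in = new DataInputStream(socket.getInputStream)
  val length = in.readInt()
  val payload = new Array[Byte](length)
  in.readFully(payload)
  payload
}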

How much data to receive from server in SSL handshake before calling InitializeSecurityContext?

In our Windows C++ application I am using InitializeSecurityContext() client-side to open an SChannel connection to a server which is running the stunnel SSL proxy. My code now works, but only with a hack I would like to eliminate.
I started with this sample code: http://msdn.microsoft.com/en-us/library/aa380536%28v=VS.85%29.aspx
In the sample code, look at SendMsg and ReceiveMsg. The first 4 bytes of any message sent or received indicates the message length. This is fine for the sample, where the server portion of the sample conforms to the same convention.
stunnel does not seem to use this convention. When the client is receiving data during the handshake, how does it know when to stop receiving and make another call to InitializeSecurityContext()?
This is how I structured my code, based on what I could glean from the documentation:
1. call InitializeSecurityContext which returns an output buffer
2. Send output buffer to server
3. Receive response from server
4. call InitializeSecurityContext(server_response) which returns an output buffer
5. if SEC_E_INCOMPLETE_MESSAGE, go back to step 3,
if SEC_I_CONTINUE_NEEDED go back to step 2
I expected InitializeSecurityContext in step 4 to return SEC_E_INCOMPLETE_MESSAGE if not enough data was read from the server in step 3. Instead, I get SEC_I_CONTINUE_NEEDED but an empty output buffer. I have experimented with a few ways to handle this case (e.g. go back to step 3), but none seemed to work and more importantly, I do not see this behavior documented.
In step 3 if I add a loop that receives data until a timeout expires, everything works fine in my test environment. But there must be a more reliable way.
What is the right way to know how much data to receive in step 3?
SChannel is different from the Negotiate security package. You need to receive at least 5 bytes, which is the SSL/TLS record header size:
struct {
    ContentType type;
    ProtocolVersion version;
    uint16 length;
    opaque fragment[TLSPlaintext.length];
} TLSPlaintext;
ContentType is 1 byte, ProtocolVersion is 2 bytes, and you have a 2-byte record length. Once you read those 5 bytes, SChannel will return SEC_E_INCOMPLETE_MESSAGE and will tell you exactly how many more bytes to expect:
SEC_E_INCOMPLETE_MESSAGE
Data for the whole message was not read from the wire.
When this value is returned, the pInput buffer contains a SecBuffer structure with a BufferType member of SECBUFFER_MISSING. The cbBuffer member of SecBuffer contains a value that indicates the number of additional bytes that the function must read from the client before this function succeeds.
Once you get this output, you know exactly how much to read from the network.
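As a language-agnostic illustration of what those 5 bytes contain (sketched in Scala here purely for clarity, since it is the record layout and not the SSPI call that matters): the last two bytes of the header are the big-endian fragment length, so the full record is 5 + length bytes.

// Given the 5-byte TLS record header, compute how many more bytes make up the record.
// Layout follows the TLSPlaintext struct quoted above:
// 1 byte ContentType, 2 bytes ProtocolVersion, 2 bytes length (big-endian).
def tlsRecordBodyLength(header: Array[Byte]): Int = {
  require(header.length >= 5, "need the full 5-byte record header")
  ((header(3) & 0xff) << 8) | (header(4) & 0xff)
}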
I found the problem.
I found this sample:
http://www.codeproject.com/KB/IP/sslsocket.aspx
I was missing the handling of SECBUFFER_EXTRA (line 987 SslSocket.cpp)
The SChannel SSP returns SEC_E_INCOMPLETE_MESSAGE from both InitializeSecurityContext and DecryptMessage when not enough data is read.
A SECBUFFER_MISSING message type is returned from DecryptMessage with a cbBuffer value of the amount of desired bytes.
But in practice, I did not use the "missing data" value. The documentation indicates the value is not guaranteed to be correct, and is only a hint that developers can use to reduce the number of calls.
InitalizeSecurityContext MSDN doc:
While this number is not always accurate, using it can help improve performance by avoiding multiple calls to this function.
So I unconditionally read more data into the same buffer whenever SEC_E_INCOMPLETE_MESSAGE was returned, reading multiple bytes at a time from the socket.
Some extra input buffer management was required to append more read data and keep the lengths right. DecryptMessage will modify the input buffers' cbBuffer properties when it fails, which surprised me.
Printing out the buffers and return result after calling InitializeSecurityContext shows the following:
read socket:bytes(5).
InitializeSecurityContext:result(80090318). // SEC_E_INCOMPLETE_MESSAGE
inBuffers[0]:type(2),bytes(5).
inBuffers[1]:type(0),bytes(0). // no indication of missing data
outBuffer[0]:type(2),bytes(0).
read socket:bytes(74).
InitializeSecurityContext:result(00090312). // SEC_I_CONTINUE_NEEDED
inBuffers[0]:type(2),bytes(79). // notice 74 + 5 from before
inBuffers[1]:type(0),bytes(0).
outBuffer[0]:type(2),bytes(0).
And for the DecryptMessage function, input is always in dataBuf[0], with the rest zeroed.
read socket:bytes(5).
DecryptMessage:len 5, bytes(17030201). // SEC_E_INCOMPLETE_MESSAGE
DecryptMessage:dataBuf[0].BufferType 4, 8 // notice input buffer modified
DecryptMessage:dataBuf[1].BufferType 4, 8
DecryptMessage:dataBuf[2].BufferType 0, 0
DecryptMessage:dataBuf[3].BufferType 0, 0
read socket:bytes(8).
DecryptMessage:len 13, bytes(17030201). // SEC_E_INCOMPLETE_MESSAGE
DecryptMessage:dataBuf[0].BufferType 4, 256
DecryptMessage:dataBuf[1].BufferType 4, 256
DecryptMessage:dataBuf[2].BufferType 0, 0
DecryptMessage:dataBuf[3].BufferType 0, 0
read socket:bytes(256).
DecryptMessage:len 269, bytes(17030201). // SEC_E_OK
We can see my TLS Server peer is sending TLS headers (5 bytes) in one packet, and then the TLS message (8 for Application Data), then the Application Data payload in a third.
You must read some arbitrary amount the first time, and when you receive SEC_E_INCOMPLETE_MESSAGE, you must look in the pInput SecBufferDesc for a SECBUFFER_MISSING and read its cbBuffer to find out how many bytes you are missing.
This problem was doing my head in today, as I was attempting to modify my handshake myself, and having the same problem the other commenters were having, i.e. not finding a SECBUFFER_MISSING. I do not want to interpret the tls packet myself, and I do not want to unconditionally read some unspecified number of bytes. I found the solution to that, so I'm going to address their comments, too.
The confusion here is because the API is confusing. Ordinarily, to read the output of InitializeSecurityContext, you look at the content of the pOutput parameter (as defined in the signature). It's that SecBufferDesc that contains the SECBUFFER_TOKEN etc to pass to AcceptSecurityContext.
However, in the case where InitializeSecurityContext returns SEC_E_INCOMPLETE_MESSAGE, the SECBUFFER_MISSING is returned in the pInput SecBufferDesc, in place of the SECBUFFER_ALERT SecBuffer that was passed in.
The documentation does say this, but not in a way that clearly contrasts this case against the SEC_I_CONTINUE_NEEDED and SEC_E_OK cases.
This answer also applies to AcceptSecurityContext.
From MSDN, I'd presume SEC_E_INCOMPLETE_MESSAGE is returned when not enough data has been received from the server at the moment. Instead, SEC_I_CONTINUE_NEEDED is returned, with InBuffers[1] indicating the amount of unread data (note that some data has been processed and must be skipped) and OutBuffers containing nothing.
So the algorithm is:
If SEC_I_CONTINUE_NEEDED is returned, check the type of InBuffers[1]
If it is SECBUFFER_EXTRA, handle it (move InBuffers[1].cbBuffer bytes to the beginning of the input buffer) and jump to the next recv & InitializeSecurityContext iteration
If OutBuffers is not empty, send its contents to the server