NSString parsing of continuous data - iphone

Good morning,
I am retrieving a stream of bytes from a serial device that connects to the iPad. Once connected the supplied SDK will call a delegate method with the bytes that have been forwarded.
The readings forwarded by the serial device via the SDK are in the following format:
!X1:000.0;
Once connected to the serial device the delegate methods will start receiving data immediately, so the first chunk may be a partial reading, e.g.
:000.00;
What I need to do is establish a concrete way of splitting the readings returned from the serial device so that I can manipulate the data.
Some of the tried options are:
Simply concatenate the received strings for a fixed period and then split the NSString on the ";" character. This is a little inefficient though and does not allow me to manipulate the data dynamically
-(void)receivingDelegateMethod:(NSString *)aString {
if(counter < 60){
// stringByAppendingString: returns a new string, so assign the result back
self.PropertyString = [self.PropertyString stringByAppendingString:aString];
}else{
NSArray *readings = [self.PropertyString componentsSeparatedByString:@";"];
}
}
Determine a starting point by looking for the "!" character and then appending the resulting substring to a NSString property. All previous calls to the delegated method will append to this property and then remove the first 10 characters.
I know there are further options such as NSScanner and regular expressions, but I wanted to get the opinion of the community before spending more time on different methods.
Thanks

Make a BOOL flag that indicates that the stream has been initialized, and set it to NO. When you receive the next chunk of data, check the flag first. If it is not set, skip all characters until you see an exclamation point '!'. Once you see it, discard everything in front of it, copy the rest of the string into the buffer, and set the flag to YES. If the "is initialized" flag is already set, append the entire string to the buffer without skipping characters.
Once you finish the append, scan the buffer for ! and ; delimited sections. For each occurrence of that pattern, call a designated method with a complete portion of the pattern. You can get fancy, and define your own "secondary" delegate for processing pre-validated strings.
You may need to detect disconnections, and set the "is initialized" flag back to NO.
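The buffering scheme described above can be sketched as follows. This is a minimal, language-neutral illustration in Python rather than Objective-C; the class name `ReadingFramer` and the `on_reading` callback are hypothetical, and the sketch assumes every reading starts with '!' and ends with ';' as in the question.

```python
class ReadingFramer:
    """Reassembles '!...;' readings from arbitrarily split chunks."""

    def __init__(self, on_reading):
        self.on_reading = on_reading   # hypothetical callback for complete readings
        self.initialized = False       # the "is initialized" flag from the answer
        self.buffer = ""

    def feed(self, chunk):
        if not self.initialized:
            start = chunk.find("!")
            if start == -1:
                return                 # still inside a partial reading: skip the chunk
            chunk = chunk[start:]      # discard everything before the first '!'
            self.initialized = True
        self.buffer += chunk
        # Hand over every complete section that is currently in the buffer.
        while True:
            end = self.buffer.find(";")
            if end == -1:
                break                  # the tail is incomplete; wait for more data
            reading, self.buffer = self.buffer[:end + 1], self.buffer[end + 1:]
            if reading.startswith("!"):
                self.on_reading(reading)

    def reset(self):
        """Call on disconnect so the next stream is resynchronized."""
        self.initialized = False
        self.buffer = ""


readings = []
framer = ReadingFramer(readings.append)
for chunk in [":000.0;", "!X1:12", "3.4;!X1:0", "05.6;"]:
    framer.feed(chunk)
```

After the loop, `readings` holds only whole readings; the leading fragment `:000.0;` is dropped because it arrived before the first '!'.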


Most efficient way to locate the presence of a substring in a dictionary (NLP)

I'm currently working on a speech recognition feature, where a user can say a command and have that command trigger an event.
The way I have it structured now does work, but mostly because the recognition dictionary is small. It likely will never be millions of commands, but that's no reason to be sloppy.
Here is how it works now:
@ObservedObject var speechText
private var matchables: [String:Int] = ["start the action": 0, //formal
"start action": 0, //informal
"star traction": 0, //common misfire by NLU
"stop the action": 1] //different action
//Call processText with lowercase speechText
//when Observed string value changes
//assume text is conversational, such as "Jenny, I like chicken, also device why
//don't you start the action"
func processText(text: String) {
for (key, value) in matchables {
if text.contains(key) {
executeActionByID(value)
}
}
}
This will loop through the matchables collection and search for the contents of each key inside of the text value. This works fine on a small dictionary, but becomes cumbersome at scale.
I could theoretically break text into N-Grams and then access the dictionary directly by key, but this is long running recognition, and text might contain a substantial number of words (hundreds?) which may exceed the maximum practical size of the dictionary.
Is there a third, better way to analyze long running streams of text and quickly pick out commands that match a small substring?
Here is my back-of-the-envelope thinking about this problem:
Searching for keys in a Dictionary is really fast (almost constant time). Searching strings for substrings using String.contains(_:) is slow. (Around O(n) where n is the length of the string.)
As your string length goes up and your number of keys goes up, your time to completion is going to go up by O(n*x) (n=string length, x = number of keys.)
That's likely to get slow for longer search strings, and total time grows with the product of the two if both your number of keys and string length increase.
I'd suggest breaking your string into discrete units to search for (the obvious way is to divide it with spaces and other separators like punctuation.) If you do that you could check to see if each word appears in your dictionary keys. That should get you roughly O(n) time performance, since each search for a key in a dictionary runs in nearly constant time.
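Because the keys in the question are multi-word phrases, looking up single words would not match them directly. One way to keep the dictionary lookup idea is to test every run of consecutive words up to the length of the longest key. The sketch below is in Python rather than Swift, and `find_commands` and `MAX_WORDS` are hypothetical names; it is an illustration under those assumptions, not a tuned implementation.

```python
import re

matchables = {
    "start the action": 0,   # formal
    "start action": 0,       # informal
    "star traction": 0,      # common misfire by NLU
    "stop the action": 1,    # different action
}

# Longest key measured in words; no phrase longer than this needs checking.
MAX_WORDS = max(len(key.split()) for key in matchables)


def find_commands(text):
    """Return the action IDs of every key phrase found in text."""
    words = re.findall(r"[a-z']+", text.lower())
    found = []
    for start in range(len(words)):
        for size in range(1, MAX_WORDS + 1):
            if start + size > len(words):
                break
            phrase = " ".join(words[start:start + size])
            if phrase in matchables:          # near constant-time lookup
                found.append(matchables[phrase])
    return found


ids = find_commands(
    "Jenny, I like chicken, also device why don't you start the action"
)
```

The work is proportional to the number of words times `MAX_WORDS`, independent of how many keys the dictionary holds. Unlike `contains`, this only matches on word boundaries.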

Fetching LogBook descriptors in custom firmware

I'm looking to fetch recorded data using LogBook in a custom Movesense firmware. How do I get the correct byte stream offset for the next GET call when receiving HTTP_CONTINUE?
I'm trying to implement these steps as described in DataStorage.md:
### /Logbook usage ###
To get recording from the Movesense sensors EEPROM storage, you need to:
1. Do **GET** on */Logbook/Entries*. This returns a list of LogEntry objects. If the status was HTTP_OK, the list is complete. If the result code is HTTP_CONTINUE, you must GET again with the parameter StartAfterId set to the Id of the last entry you received and you'll get the next entries.
2. Choose the Log that you are interested in and notice the Id of it.
3. Fetch the descriptors with **GET** to */Logbook/byId/<Id>/Descriptors*. This returns a bytestream with the similar HTTP_CONTINUE handling as above. However you **must** keep re-requesting the **GET** until you get error or HTTP_OK, or the Logbook service will stay "in the middle of the stream" (we hope to remove this limitation in the future).
4. Fetch the data with **GET** to */Logbook/byId/<Id>/Data*. This returns also a bytestream (just like the */Logbook/Descriptors* above).
5. Convert the data using the converter tools or classes. (To Be Continued...)
The problem is basically the same for step 3 and 4. I receive a whiteboard::ByteStream object in the onGetResult callback function but I don't know how to get the correct offset information from it.
I've found a number of different methods seemingly concerning different aspects of number of bytes in ByteStream.h (length, fullSize, transmitted, payloadSize and serializationLength) but I just can't get it working properly.
Basically I would like to do something like this in onGetResult:
if (resultCode == whiteboard::HTTP_CODE_CONTINUE) {
const whiteboard::ByteStream &byteStream = rResultData.convertTo<const whiteboard::ByteStream &>();
currentEntryOffset += byteStream.length();
asyncGet(WB_RES::LOCAL::MEM_LOGBOOK_BYID_LOGID_DESCRIPTORS(), AsyncRequestOptions::Empty, currentEntryIdToFetch, currentEntryOffset);
return;
}
The basic idea is to do the same call again.
So if you do:
asyncGet(WB_RES::LOCAL::MEM_LOGBOOK_BYID_LOGID_DESCRIPTORS(),AsyncRequestOptions::Empty, currentEntryIdToFetch);
and get the response HTTP_CONTINUE, do:
asyncGet(WB_RES::LOCAL::MEM_LOGBOOK_BYID_LOGID_DESCRIPTORS(),AsyncRequestOptions::Empty, currentEntryIdToFetch);
Until you get HTTP_OK or an error.
If the result code is HTTP_CONTINUE, you must GET again with the parameter StartAfterId set to the Id of the last entry you received and you'll get the next entries.
Might be a bit cryptic but do another asyncGet to the exact same resource until you get HTTP_OK or an http error code.
Also, note that you need to decode the data, a python script can be found here in this answer
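The control flow (repeat the identical request, with no offset, until the result is no longer HTTP_CONTINUE) can be sketched like this. It is a synchronous Python simulation, not Movesense code: in the firmware the same loop is spread across `asyncGet` and `onGetResult`. The `get` callable, `make_fake_service`, and the numeric codes (taken from the standard HTTP status numbers) are assumptions for illustration.

```python
HTTP_CONTINUE = 100   # assumed numeric values, following standard HTTP codes
HTTP_OK = 200


def fetch_all(get):
    """Repeat the same request until the service stops answering CONTINUE."""
    parts = []
    while True:
        code, payload = get()          # same resource, same parameters each time
        if code not in (HTTP_OK, HTTP_CONTINUE):
            raise RuntimeError(f"request failed with code {code}")
        parts.append(payload)
        if code == HTTP_OK:            # last chunk: the stream is complete
            return b"".join(parts)


def make_fake_service(chunks):
    """Stand-in for the Logbook service: it tracks the stream position itself."""
    remaining = iter(chunks)
    pending = len(chunks)

    def get():
        nonlocal pending
        pending -= 1
        return (HTTP_CONTINUE if pending > 0 else HTTP_OK), next(remaining)

    return get


data = fetch_all(make_fake_service([b"abc", b"de", b"f"]))
```

The point of the sketch is that the caller never computes an offset; the service remembers where it is in the stream between calls.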

Reading from Socket Stream Blocking After Retrieval

I'm currently attempting to read an incoming message from a client socket, that, prior to the below procedure has already been connected to the server socket. The below procedure outputs the message, one character at a time, as it retrieves it from the stream.
The problem is that, when the stream is out of information, the call to Ada.Streams.Read is blocking, and stops the application flow completely. According to some examples, it would appear as though Offset should be set to 0 automatically at the end of the stream, but that never happens. Instead the application stops at the call to Read.
procedure Read_From (Channel : Sockets.Stream_Access) is
use Ada.Text_IO;
use Ada.Streams;
Data : Stream_Element_Array (1 .. 1);
Offset : Stream_Element_Offset;
begin
loop
Read (Channel.All, Data, Offset);
exit when Offset = 0;
Put (Character'Val (Data (1)));
end loop;
-- The application never reaches this point.
New_Line;
Put_Line ("Finished reading from client!");
end Read_From;
-- @param Channel `GNAT.Sockets.Stream (Client_Socket)`
I've also attempted the same process with GNAT.Sockets.Receive_Socket, but the same issue remains: the application flow is stopped completely, presumably awaiting further information from the stream, even though there is nothing more to retrieve.
Any pointers in the right direction would be highly appreciated!
Normally, you’d read a (binary) message from a stream knowing how much data needed to be read, so you could read until you’d got that much.
But, if you’re reading a text message from an externally-defined source, as it might be an HTTP request, there needs to be some terminator sequence so you can read character-by-character until you’ve read the terminator. In the case of an HTTP request, that’s a CR/LF/CR/LF sequence. Or it could be a null-terminated C string, in which case you’d be looking for the ASCII.NUL.
The Ada way to transfer variable-length text is to use String'Output/String'Input (see ARM 13.13.2(18)ff). What happens for a String (an array of Character) is that first the bounds are sent, then the content; on reception, the bounds are read, a String with those bounds is created, and the required number of bytes are read into the new String, which is then returned.
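The underlying idea, sending the size before the content so the reader knows exactly how much to wait for, can be sketched independently of Ada. The Python example below uses a 4-byte big-endian length prefix; that layout is an assumption for illustration and is not the representation that String'Output produces. The helper names are hypothetical.

```python
import io
import struct


def write_message(stream, text):
    """Send the payload length first, then the payload itself."""
    payload = text.encode("utf-8")
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_exact(stream, count):
    """Keep reading until exactly count bytes have arrived."""
    data = b""
    while len(data) < count:
        piece = stream.read(count - len(data))
        if not piece:
            raise EOFError("stream closed mid-message")
        data += piece
    return data


def read_message(stream):
    """Read the length, then exactly that many bytes, and never ask for more."""
    (length,) = struct.unpack(">I", read_exact(stream, 4))
    return read_exact(stream, length).decode("utf-8")


channel = io.BytesIO()                 # stands in for the socket stream
write_message(channel, "hello")
write_message(channel, "second message")
channel.seek(0)
first = read_message(channel)
second = read_message(channel)
```

Since the reader never requests bytes beyond the announced length, it does not end up blocked waiting for data that the sender has no intention of sending.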
Basically that's how Ada streams work. The end of the stream only comes once you reach the final end of the stream, not just the current end of a buffer.
If you want to be able to interrupt reading, you have to use another representation of the connection than GNAT.Sockets.Stream_Access.

RedPark Cable readBytesAvailable read twice every time

I have not been able to find this information anywhere. How long can a string be when sent with the TTL version of the Redpark cable?
The following delegate method is called twice when I print something through serial from my Arduino. An example of a string is this: 144;480,42;532,40;20e
- (void) readBytesAvailable:(UInt32)length{
When I use the newer method for retrieving available data, [getStringFromBytesAvailable], I only get 144;480,42;532,40; and then the delegate method is called again and the string now contains the rest: 20e
The following method is working for appending the two strings, but only if the rate of data transmission is 'slow' (1 time a second, I would prefer minimum 10 times a second).
- (void) readBytesAvailable:(UInt32)length{
if(string && [string rangeOfString:@"e"].location == NSNotFound){
string = [string stringByAppendingString:[rscMgr getStringFromBytesAvailable]];
NSLog(@"%@", string);
finishedReading = YES;
}
else{
string = [rscMgr getStringFromBytesAvailable];
}
if (finishedReading == YES)
{
//do stuff
}
finishedReading = NO;
string = nil;
}
}
But can you tell me why the method is called twice if I write a "long" string, and how to avoid this issue?
Since your program fragment runs faster than the time it takes to send a string, you need to capture the bytes and append them to a string.
If the serial data is terminated with a carriage return you can test for it to know when you have received the entire string.
Then you can allow your Arduino to send 10 times a second.
That is just how serial ports work. You can't and don't need to avoid those issues. There is no attempt at any level of the SW/HW to keep your serial data stream intact, so making any assumptions about that in your code is just wrong. Serial data is just a stream of bytes, with no concept of packetization. So you have to deal with the fact that you might have to read partial data and read the rest later.
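In practice this means accumulating whatever arrives and only acting once a terminator has been seen. A minimal Python sketch, assuming the trailing 'e' from the question's example is the terminator and never occurs inside a message; `complete_messages` is a hypothetical name.

```python
def complete_messages(chunks, terminator="e"):
    """Yield whole messages from a stream that arrives in arbitrary pieces."""
    pending = ""
    for chunk in chunks:
        pending += chunk
        # One chunk may finish a message, start the next one, or both.
        while terminator in pending:
            message, pending = pending.split(terminator, 1)
            yield message + terminator


messages = list(
    complete_messages(["144;480,42;532,40;", "20e", "7;8", ",9;10e11;"])
)
```

The split between the first two chunks no longer matters, and the trailing `11;` stays buffered until its own terminator arrives.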
The serialPortConfig within the redparkSerial header file provided by RedPark does, in fact, give you more configuration control than you may realize. The readBytesAvailable:length method is abstracted, and is only called when one of two conditions is met: rxForwardingTimeout value is exceeded with data in the primary buffer (default set to 100 ms) or rxForwardCount is reached (default set to 16 characters).
So, in your case it looks like you've still got data in your buffer after your initial read, which means that the readBytesAvailable:length method will be called again (from the main run loop) to retrieve the remaining data. I would propose playing around with the rxForwardingTimeout and rxForwardCount until it performs as you'd expect.
As already mentioned, though, I'd recommend adding a flag (doesn't have to be carriage return) to at least the end of your packet, for identification.
Also, some good advice here: How do you design a serial command protocol for an embedded system?
Good luck!

socket receive loop never returns

I have a loop that reads from a socket in Lua:
socket = nmap.new_socket()
socket:connect(host, port)
socket:set_timeout(15000)
socket:send(command)
repeat
response,data = socket:receive_buf("\n", true)
output = output..data
until data == nil
Basically, the last line of the data does not contain a "\n" character, so it is never read from the socket. But this loop just hangs and never completes. I basically need it to return whenever the "\n" delimiter is not recognised. Does anyone know a way to do this?
Cheers
Updated to include socket code
Update2
OK I have got around the initial problem of waiting for a "\n" character by using the "receive_bytes" method.
New code:
--socket set as above
repeat
data = nil
response,data = socket:receive_bytes(5000)
output = output..data
until data == nil
return output
This works and I get the large complete block of data back. But I need to reduce the buffer size from 5000 bytes, as this is used in a recursive function and memory usage could get very high. I'm still having problems with my "until" condition however, and if I reduce the buffer size to a size that will require the method to loop, it just hangs after one iteration.
Update3
I have gotten around this problem using string.match and receive_bytes. I take in at least 80 bytes at a time. Then string.match checks to see if the data variable contains a certain pattern. If so it exits. It's not the cleanest solution, but it works for what I need it to do. Here is the code:
repeat
response,data = socket:receive_bytes(80)
output = output..data
until string.match(data, "pattern")
return output
I believe the only way to deal with this situation in a socket is to set a timeout.
The following link has a little bit of info, but it's on http socket: lua http socket timeout
There is also this one (9.4 - Non-Preemptive Multithreading): http://www.lua.org/pil/9.4.html
And this question: http://lua-list.2524044.n2.nabble.com/luasocket-howto-read-write-Non-blocking-TPC-socket-td5792021.html
A good discussion on Socket can be found on this link:
http://nitoprograms.blogspot.com/2009/04/tcpip-net-sockets-faq.html
It's .NET but the concepts are general.
See update 3. Because the last part of the data is always the same pattern, I can read in a block of bytes and each time check if that block has the pattern. If it has the pattern it will mean that it is the end of the data, append to the output variable and exit.
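One caveat with checking only the latest block is that the end pattern can be split across two reads, in which case neither block contains it. A hedged sketch of the same loop that searches the accumulated output instead, written in Python rather than Lua; `receive` is a hypothetical callable that returns the next block, or `None` on timeout or close, and a plain substring stands in for the pattern.

```python
def read_until(receive, end_marker):
    """Collect blocks until end_marker appears; return (output, marker_found)."""
    output = ""
    while True:
        data = receive()
        if data is None:               # timeout or closed connection
            return output, False
        # Start the search just before the new block so that a marker
        # straddling the boundary between two blocks is still found.
        search_from = max(0, len(output) - len(end_marker) + 1)
        output += data
        if output.find(end_marker, search_from) != -1:
            return output, True


blocks = iter(["line one\nEN", "D\n"])   # the marker "END" is split across reads
result = read_until(lambda: next(blocks, None), "END")
```

Returning a flag alongside the data lets the caller distinguish a complete response from one that was cut short by a timeout.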