How to get current position of iterator in ByteString?

How to get current position of iterator in ByteString? - scala

I have an instance of ByteString. To read data from it I should use it's iterator() method.
I read some data and then I decide than I need to create a view (separate iterator of some chunk of data).
I can't use slice() of original iterator, because that would make it unusable, because docs says that:
After calling this method, one should discard the iterator it was called on, and use only the iterator that was returned. Using the old
iterator is undefined, subject to change, and may result in changes to
the new iterator as well.
So, it seems that I need to call slice() on ByteString. But slice() has from and until parameters and I don't know from. I need something like this:
ByteString originalByteString = ...; // <-- This is my input data
ByteIterator originalIterator = originalByteString .iterator();
...
read some data from originalIterator
...
int length = 100; // < -- Size of the view
int from = originalIterator.currentPosition(); // <-- I need this
int until = from + length;
ByteString viewOfOriginalByteString = originalByteString.slice(from, until);
ByteIterator iteratorForView = viewOfOriginalByteString.iterator(); // <-- This is my goal
Update:
Tried to do this with duplicate():
ByteIterator iteratorForView = originalIterator.duplicate()._2.take(length);

ByteIterator's from field is private, and none of the methods seems to simply return it. All I can suggest is to use originalIterator.duplicate to get a safe copy, or else to "cheat" by using reflection to read the from field, assuming reflection is available in your deployment environment.

Related

Call a PostgreSQL function and get result back with no loop

I have a simple rust program that interacts with a PostgreSQL database.
The actual code is:
for row in &db_client.query("select magic_value from priv.magic_value();", &[]).unwrap()
{
magic_value = row.get("magic_value");
println!("Magic value is = {}", magic_value);
}
And.. it works. But I don't like it: I know this function will return one and only one value.
From the example I found, for example here: https://docs.rs/postgres/latest/postgres/index.html
and here: https://tms-dev-blog.com/postgresql-database-with-rust-how-to/
You always have a recordset to loop on.
Which is the clean way to call a function without looping?

query returns a Result<Vec<Row>, _>. You are already unwrapping the Vec, so you can just use it directly instead of looping. By turning the Vec into an owning iterator yourself, you can even easily obtain a Row instead of a &Row.
magic_value = db_client.query("select magic_value from priv.magic_value();", &[])
.unwrap() // -> Vec<Row>
.into_iter() // -> impl Iterator<Item=Row>
.next() // -> Option<Row>
.unwrap() // -> Row
.get("magic_value");

How to sequence a task to execute once all Single's in a collection complete

I am using Helidon DBClient transactions and have found myself in a situation where I end up with a list of Singles, List<Single<T>> and want to perform the next task only after completing all of the singles.
I am looking for something of equivalent to CompletableFuture.allOf() but with Single.
I could map each of the single toCompletableFuture() and then do a CompletableFuture.allOf() on top, but is there a better way? Could someone point me in the right direction with this?
--
Why did I end up with a List<Single>?
I have a collection of POJOs which I turn into named insert .execute() all within an open transaction. Since I .stream() the original collection and perform inserts using the .map() operator, I end up with a List when I terminate the stream to collect a List. None of the inserts might have actually been executed. At this point, I want to wait until all of the Singles have been completed before I proceed to the next stage.
This is something I would naturally do with a CompletableFuture.allOf(), but I do not want to change the API dialect for just this and stick to Single/Multi.

Single.flatMap, Single.flatMapSingle, Multi.flatMap will effectively inline the future represented by the publisher passed as argument.
You can convert a List<Single<T>> to Single<List<T>> like this:
List<Single<Integer>> listOfSingle = List.of(Single.just(1), Single.just(2));
Single<List<Integer>> singleOfList = Multi.just(listOfSingle)
.flatMap(Function.identity())
.collectList();
Things can be tricky when you are dealing with Single<Void> as Void cannot be instantiated and null is not a valid value (i.e. Single.just(null) throws a NullPointerException).
// convert List<Single<Void>> to Single<List<Void>>
Single<List<Void>> listSingle =
Multi.just(List.of(Single.<Void>empty(), Single.<Void>empty()))
.flatMap(Function.identity())
.collectList();
// convert Single<List<Void>> to Single<Void>
// Void cannot be instantiated, it needs to be casted from null
// BUT null is not a valid value...
Single<Void> single = listSingle.toOptionalSingle()
// convert Single<List<Void>> to Single<Optional<List<Void>>>
// then Use Optional.map to convert Optional<List<Void>> to Optional<Void>
.map(o -> o.map(i -> (Void) null))
// convert Single<Optional<Void>> to Single<Void>
.flatMapOptional(Function.identity());
// Make sure it works
single.forSingle(o -> System.out.println("ok"))
.await();

want to store promise value in an array

I want to get all the values from a registration page and store all values in an array. How can I do that in protractor?
var arr = new Array(); //declare array
InputName.getAttribute("value")
.then(function(value){
arr[0]=value; // want to store promise value in an array
});
console.log(arr[0]);

If you run your code, it will first log arr[0] and then resolve Promise. Therefore, you may access that array's values in the next Promise. Something like this
var arr = new Array(); // <- this is by the way bad practice, use 'let arr = [];'
InputName.getAttribute("value")
.then(function(value) {
arr[0]=value; // I would use arr.push(value)
});
anotherInput.getAttribute("value")
.then(function(value) {
console.log(arr[0]); // your value should be accessible here
arr.push(value) // push another value
});
But, honestly, I've been working with Protractor fo a while now and I still have difficulties understanding promises... This why I'm using async/await in my tests so if I were to implement something like that I would end up having the following
let arr = [];
let value1 = await InputName.getAttribute("value");
arr.push(value1);
console.log(arr[0]);
Clear, neat code with no hustle. Plus protractor team is actually removing promise_manager, so one day when you update it your code will not work anymore. Then why not switch earlier

Getting line locations with iText

How can one find where are lines located in a document with iText?
Suppose say I have a table in a PDF document, and want to read its contents; I would like to find where exactly the cells are located. In order to do that I thought I might find the intersections of lines.

I think your only option using iText will be to parse the PDF tokens manually. Before doing that I would have a copy of the PDF spec handy.
(I'm a .Net guy so I use iTextSharp but other than some capitalization differences and property declarations they're almost 100% the same.)
You can get the individual tokens using the PRTokeniser object which you feed bytes into from calling getPageContent(pageNum) on your PdfReader.
//Get bytes for page 1
byte[] pageBytes = reader.getPageContent(1);
//Get the tokens for page 1
PRTokeniser tokeniser = new PRTokeniser(pageBytes);
Then just loop through the PRTokeniser:
PRTokeniser.TokType tokenType;
string tokenValue;
while (tokeniser.nextToken()) {
tokenType = tokeniser.tokenType;
tokenValue = tokeniser.stringValue;
//...check tokenValue, do something with it
}
As far a tokenValue, you'd want to probably look for re and l values for rectangle and line. If you see an re then you want to look at the previous 4 values and if you see an l then previous 2 values. This also means that you need to store each tokenValue in an array so you can look back later.
Depending on what you used to create the PDF with you might get some interesting results. For instance, I created a 4 cell table with Microsoft Word and saved as a PDF. For some reason there are two sets of 10 rectangles with many duplicates, but the general idea still works.
Below is C# code targeting iTextSharp 5.1.1.0. You should be able to convert it to Java and iText very easily, I noted the one line that has .Net-specific code that needs to be adjusted from a Generic List (List<string>) to a Java equivalent, probably an ArrayList. You'll also need to adjust some casing, .Net uses Object.Method() whereas Java uses Object.method(). Lastly, .Net accesses properties without gets and sets, so Object.Property is both the getter and setter compared to Java's Object.getProperty and Object.setProperty.
Hopefully this gets you started at least!
//Source file to read from
string sourceFile = "c:\\Hello.pdf";
//Bind a reader to our PDF
PdfReader reader = new PdfReader(sourceFile);
//Create our buffer for previous token values. For Java users, List<string> is a generic list, probably most similar to an ArrayList
List<string> buf = new List<string>();
//Get the raw bytes for the page
byte[] pageBytes = reader.GetPageContent(1);
//Get the raw tokens from the bytes
PRTokeniser tokeniser = new PRTokeniser(pageBytes);
//Create some variables to set later
PRTokeniser.TokType tokenType;
string tokenValue;
//Loop through each token
while (tokeniser.NextToken()) {
//Get the types and value
tokenType = tokeniser.TokenType;
tokenValue = tokeniser.StringValue;
//If the type is a numeric type
if (tokenType == PRTokeniser.TokType.NUMBER) {
//Store it in our buffer for later user
buf.Add(tokenValue);
//Otherwise we only care about raw commands which are categorized as "OTHER"
} else if (tokenType == PRTokeniser.TokType.OTHER) {
//Look for a rectangle token
if (tokenValue == "re") {
//Sanity check, make sure we have enough items in the buffer
if (buf.Count < 4) throw new Exception("Not enough elements in buffer for a rectangle");
//Read and convert the values
float x = float.Parse(buf[buf.Count - 4]);
float y = float.Parse(buf[buf.Count - 3]);
float w = float.Parse(buf[buf.Count - 2]);
float h = float.Parse(buf[buf.Count - 1]);
//..do something with them here
}
}
}

NodeJS: What is the proper way to handling TCP socket streams ? Which delimiter should I use?

From what I understood here, "V8 has a generational garbage collector. Moves objects aound randomly. Node can’t get a pointer to raw string data to write to socket." so I shouldn't store data that comes from a TCP stream in a string, specially if that string becomes bigger than Math.pow(2,16) bytes. (hope I'm right till now..)
What is then the best way to handle all the data that's comming from a TCP socket ? So far I've been trying to use _:_:_ as a delimiter because I think it's somehow unique and won't mess around other things.
A sample of the data that would come would be something_:_:_maybe a large text_:_:_ maybe tons of lines_:_:_more and more data
This is what I tried to do:
net = require('net');
var server = net.createServer(function (socket) {
socket.on('connect',function() {
console.log('someone connected');
buf = new Buffer(Math.pow(2,16)); //new buffer with size 2^16
socket.on('data',function(data) {
if (data.toString().search('_:_:_') === -1) { // If there's no separator in the data that just arrived...
buf.write(data.toString()); // ... write it on the buffer. it's part of another message that will come.
} else { // if there is a separator in the data that arrived
parts = data.toString().split('_:_:_'); // the first part is the end of a previous message, the last part is the start of a message to be completed in the future. Parts between separators are independent messages
if (parts.length == 2) {
msg = buf.toString('utf-8',0,4) + parts[0];
console.log('MSG: '+ msg);
buf = (new Buffer(Math.pow(2,16))).write(parts[1]);
} else {
msg = buf.toString() + parts[0];
for (var i = 1; i <= parts.length -1; i++) {
if (i !== parts.length-1) {
msg = parts[i];
console.log('MSG: '+msg);
} else {
buf.write(parts[i]);
}
}
}
}
});
});
});
server.listen(9999);
Whenever I try to console.log('MSG' + msg), it will print out the whole buffer, so it's useless to see if something worked.
How can I handle this data the proper way ? Would the lazy module work, even if this data is not line oriented ? Is there some other module to handle streams that are not line oriented ?

It has indeed been said that there's extra work going on because Node has to take that buffer and then push it into v8/cast it to a string. However, doing a toString() on the buffer isn't any better. There's no good solution to this right now, as far as I know, especially if your end goal is to get a string and fool around with it. Its one of the things Ryan mentioned # nodeconf as an area where work needs to be done.
As for delimiter, you can choose whatever you want. A lot of binary protocols choose to include a fixed header, such that you can put things in a normal structure, which a lot of times includes a length. In this way, you slice apart a known header and get information about the rest of the data without having to iterate over the entire buffer. With a scheme like that, one can use a tool like:
node-buffer - https://github.com/substack/node-binary
node-ctype - https://github.com/rmustacc/node-ctype
As an aside, buffers can be accessed via array syntax, and they can also be sliced apart with .slice().
Lastly, check here: https://github.com/joyent/node/wiki/modules -- find a module that parses a simple tcp protocol and seems to do it well, and read some code.

You should use the new stream2 api. http://nodejs.org/api/stream.html
Here are some very useful examples: https://github.com/substack/stream-handbook
https://github.com/lvgithub/stick

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

How to get current position of iterator in ByteString? - scala

ByteIterator's from field is private, and none of the methods seems to simply return it. All I can suggest is to use originalIterator.duplicate to get a safe copy, or else to "cheat" by using reflection to read the from field, assuming reflection is available in your deployment environment.

Related

Call a PostgreSQL function and get result back with no loop

How to sequence a task to execute once all Single's in a collection complete

want to store promise value in an array

Getting line locations with iText

NodeJS: What is the proper way to handling TCP socket streams ? Which delimiter should I use?

Categories

Resources