Reading CSV file with Spring Batch and map to domain objects based on the first field and then insert them in DB accordingly [duplicate] - spring-batch

How can we implement pattern matching in Spring Batch? I am using org.springframework.batch.item.file.mapping.PatternMatchingCompositeLineMapper.
I learned that I can only use ? or * here to create my pattern.
My requirement is as follows:
I have a fixed-length record file, and in each record the two characters at the 35th and 36th positions give the record type.
For example, in the record below "05" is the record type at the 35th and 36th positions, and the total length of the record is 400.
0000001131444444444444445589868444050MarketsABNAKKAAAAKKKA05568551456...........
I tried to write a regular expression, but it does not work; I learned that only two special characters can be used, * and ?.
In that case I can only write something like this:
??????????????????????????????????05?????????????..................
but it does not seem to be a good solution.
Please suggest how I can write this solution. Thanks a lot in advance for your help.

The PatternMatchingCompositeLineMapper uses an instance of org.springframework.batch.support.PatternMatcher to do the matching. It's important to note that PatternMatcher does not use true regular expressions. It uses something closer to ant patterns (the code is actually lifted from AntPathMatcher in Spring Core).
That being said, you have two options:
1. Use a pattern like the one you are referring to (there is no shorthand for specifying the number of ? to check, as there is in regular expressions).
2. Create your own composite LineMapper implementation that uses regular expressions to do the mapping.
For the record, if you choose option 2, contributing it back would be appreciated!
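A minimal sketch of what option 2 could look like, written without any Spring Batch dependency so it runs on its own. The class name RegexCompositeMapper and its methods are hypothetical, and the java.util.function.Function values merely stand in for LineMapper delegates; a real implementation would implement Spring Batch's LineMapper interface and delegate to real tokenizers and field set mappers.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Pattern;

/**
 * Sketch of a regex-keyed composite mapper. The Function values stand in for
 * Spring Batch LineMapper delegates; every name here is hypothetical.
 */
final class RegexCompositeMapper<T> {

    // Insertion order matters: the first pattern that matches the line wins.
    private final Map<Pattern, Function<String, T>> delegates = new LinkedHashMap<>();

    RegexCompositeMapper<T> register(String regex, Function<String, T> delegate) {
        delegates.put(Pattern.compile(regex), delegate);
        return this;
    }

    T mapLine(String line) {
        for (Map.Entry<Pattern, Function<String, T>> entry : delegates.entrySet()) {
            // matches() requires the whole line to match the pattern.
            if (entry.getKey().matcher(line).matches()) {
                return entry.getValue().apply(line);
            }
        }
        throw new IllegalStateException("No delegate matches line: " + line);
    }

    public static void main(String[] args) {
        // ".{34}05.*" means: any 34 characters, then "05" at positions 35 and 36.
        RegexCompositeMapper<String> mapper = new RegexCompositeMapper<String>()
                .register(".{34}05.*", line -> "type-05 record")
                .register(".{34}06.*", line -> "type-06 record");

        // Made-up record: 34 filler characters followed by the record type.
        String record = "0".repeat(34) + "05" + "Markets";
        System.out.println(mapper.mapLine(record));
    }
}
```

The quantifier .{34} replaces the run of 34 question marks from the question, which is the shorthand the Ant-style matcher lacks.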

Related

IBM Watson Assistant: Regular expressions with context variables

I am gathering some context variables with slots, and they work just fine.
So I decided, in another node of the conversation, to check whether one of these context variables is a specific number:
I was thinking of enabling multi-responses and checking whether, for example, $dni:1 holds (it is an integer; the pattern accepts a single integer only), or whether it is 2 or 3:
But this is not working. I have been trying to solve it for some days with different approaches, but I really cannot find a way through it.
My guess is that a context variable has a value, and you can print it to use it like responding with the user's name and stuff like that (which indeed is useful!), but comparing values is not possible.
Any insights on this?
Watson Assistant uses a short-hand syntax but also supports the more complex expressions. What you could do is to edit the condition in the JSON editor. There, for the condition, use a function like matches() on the value of the context variable.
Note that it is not recommended to check for context variables in the slot conditions. You can use multi-responses. An alternative way is to put the check into the response itself. There, you can use predicates to generate the answer.
<? context.dni==1 ? 'Very well' : 'Your number is not 1' ?>
You can nest the evaluation to have three different answers. Another way is to build an array of responses and use dni as key.
Instead of matching to specific integers, you could consider using the Numbers system entity. Watson Assistant supports several languages. As a benefit, users could answer "the first one", "the 2nd option", etc., and the bot still would understand and your logic could still route to the correct answer.

MongoDB the difference between db.getCollection.find and db.tablename.find?

What is the difference between:
db.getCollection('booking').find()
and
db.booking.find()
Are they exactly the same, or when should I use which one?
db.getCollection('booking').find({_id:"0J0DR"})
db.booking.find({_id:"0J0DR"})
Yes, they are exactly the same and you can use either.
The first form, db.getCollection(collectionName).find(), comes in handy when your collection name contains special characters that would otherwise make the other syntax invalid.
Example:
Suppose your collection has a name that begins with _, matches a database shell method, or contains a space. Then you can use db.getCollection("booking trips").find() or db["booking trips"].find(), whereas db.booking trips.find() is impossible.
I prefer using db.collection() to either as it will work on nonexistent collections, which is particularly useful when for example creating the first user in a users collection that doesn't yet exist.
db.collection('users').findOneAndUpdate(...) // Won't throw even if the collection doesn't exist yet
In addition to the previous answers: in the shell they might be exactly the same, but in a real IDE (like PyCharm), db.getCollection(collectionName) gives you back the whole document even without the find() method.

LibreOffice, Using Constants as query Parameters

I'm using the VLOOKUP function to move data from one table into another. I need to apply this formula to an entire column, and I need to know how to define certain parameters as variable and some as constant.
Here's my problem:
=VLOOKUP($D8,Sheet2.A1:B20,2)
becomes, when I drag the corner of the cell across multiple rows,
=VLOOKUP($D8,Sheet2.A1:B20,2)
=VLOOKUP($D9,Sheet2.A2:B21,2)
=VLOOKUP($D10,Sheet2.A3:B22,2)
=VLOOKUP($D11,Sheet2.A4:B23,2)
And what I need is
=VLOOKUP($D8,Sheet2.A1:B20,2)
=VLOOKUP($D9,Sheet2.A1:B20,2)
=VLOOKUP($D10,Sheet2.A1:B20,2)
=VLOOKUP($D11,Sheet2.A1:B20,2)
With the first parameter changing and the rest remaining constant. I'm sure there is an easy way to do this, but searching and browsing help topics is returning nothing. I admittedly have zero background in spreadsheets. Thanks for your help.
Add more $ signs to make the range reference absolute, like this:
=VLOOKUP($D8,Sheet2.$A$1:$B$20,2)
https://help.libreoffice.org/Calc/Addresses_and_References,_Absolute_and_Relative

Can ItemReaders just pass in the record read and not need a lineMapper to convert to an object

I'm asking if I can pass into the ItemProcessors the entire delimited record read in the ItemReader as one long string.
I have situations with unpredictable data. The file is pipe-delimited, but even so, a single double-quote causes a parse error in Spring Batch's ItemReader.
In a standalone java application I wrote code using Spring's StringUtils class. I read in the full delimited record as a String (BufferedReader), then call Spring's StringUtils.delimitedListToStringArray(...,...). This gets all the characters whether valid or not, and then I can do a search/replace to get things like any single double-quote or commas in the fields.
My standalone Java program is a down-n-dirty solution. I'm turning it into a Spring Batch job for the long term solution. It's a monthly process, and it's an impractical, if not impossible, task to get SAP users to keep trash out of data fields (i.e. fat-finger city).
It appears that I have to have a domain object for the input record to be mapped into. Is this correct, or can I do a pass-through scenario and handle the parsing myself using StringUtils?
The pipe-delimited records turn into comma-delimited records. There's really no need to create a domain object and do all the field set mapping.
I'm happy to hear ideas if I'm approaching this the wrong way.
Thank you in advance.
Thanks,
Michael
EDIT:
This is the error, and the record. The lone double-quote in column 6 is the problem. I can't control the input, so I'm scrubbing each field (all Strings) for unwanted characters. So, my solution was to skip the line mapping and use StringUtils to do it myself--as I've done as mentioned earlier.
Caused by: org.springframework.batch.item.file.FlatFileParseException: Parsing error at line: 33526 in resource=[URL [file:/temp/comptroller/myfile.txt]], input=[xxx|xxx|xxx|xxx|xxx|xxx x xxx xxxxxxx xxxx xxxx "x|xxx|xxx|xxxxx|xx|xxxxxxxxxxxxx|xxxxxxx|xxx|xx |xxx ]
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:182)
at org.springframework.batch.item.support.AbstractItemCountingItemStreamItemReader.read(AbstractItemCountingItemStreamItemReader.java:85)
at org.springframework.batch.core.step.item.SimpleChunkProvider.doRead(SimpleChunkProvider.java:90)
at org.springframework.batch.core.step.item.FaultTolerantChunkProvider.read(FaultTolerantChunkProvider.java:87)
... 27 more
Caused by: org.springframework.batch.item.file.transform.IncorrectTokenCountException: Incorrect number of tokens found in record: expected 15 actual 6
Since the domain objects you read from ItemReaders, write to ItemWriters, and optionally process with ItemProcessors can be any Object, they can be Strings.
So the short answer is yes, you should be able to use a FlatFileItemReader to read one line at a time, pass it to SomeItemProcessor<String,String>, which replaces your pipes with commas (and handles existing commas) with whatever code you want, and sends those converted lines to a FlatFileItemWriter. Spring Batch includes common implementations of the LineTokenizer and LineAggregator classes which could help.
In this scenario, Spring Batch would be acting like a glorified search-and-replace tool, with saner failure handling. To answer the bigger question of whether you should be using domain objects, or at least beans, think about whether you want to perform other tasks in the conversion process, like validation.
P.S. I wasn't aware that FlatFileItemReader blows up on a single double-quote; you might want to file that as a bug.
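As a rough sketch of the String-in, String-out conversion described above, the per-line logic might look like the following. It is plain Java with no Spring Batch dependency; the class name PipeToCommaConverter and the scrubbing rules (drop double-quotes, turn embedded commas into spaces) are assumptions for illustration only. In a job, the reader would need a line mapper that returns the raw line (Spring Batch ships PassThroughLineMapper for that), and convert() would be called from an ItemProcessor<String, String>.

```java
import java.util.StringJoiner;

/**
 * Sketch of the per-line conversion an ItemProcessor<String, String> could run.
 * The class name and the scrubbing rules are assumptions for illustration.
 */
final class PipeToCommaConverter {

    static String convert(String line) {
        // A limit of -1 keeps trailing empty fields, so the column count stays stable.
        String[] fields = line.split("\\|", -1);
        StringJoiner out = new StringJoiner(",");
        for (String field : fields) {
            out.add(scrub(field));
        }
        return out.toString();
    }

    // Assumed rule: drop stray double-quotes and replace embedded commas with spaces.
    static String scrub(String field) {
        return field.replace("\"", "").replace(',', ' ').trim();
    }

    public static void main(String[] args) {
        // Made-up record with a lone double-quote and an embedded comma.
        System.out.println(convert("xxx|x \"x|a,b|"));
    }
}
```

Because the line never passes through a quote-aware tokenizer, a lone double-quote is just another character to scrub.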

i18n in Symfony Forms

Is there any way I can use the format_number_choice function inside an actions file? In fact, I need to use it for a form error message.
'max_size' => 'File is too large (maximum is %max_size% bytes).',
In English it's simply "bytes", but in other languages the syntax changes after a certain value (for example if the number is greater than 20 it's: "20 of bytes").
I can use parentheses, of course, but if the framework offers support for doing this specific thing, why not use it?
The way it's currently implemented in the 1.4 branch, you can define only one translation per message using i18n XML files.
What you could do is create a custom validator which inherits the current validator (sfValidatorFile in your example) and does the size checking in the doClean method before calling its parent's method.
I suggest you take a look at the source to see how it works: sfValidatorFile
The correct way to handle number ranges for translation is explained here in the Definitive Guide. I won't reproduce it here as the documentation itself is clear and concise. Note however that the string is not extracted automatically by the i18n-extract task, so you need to add it manually - again, the documentation explains this.
So yes, you can use the format_number_choice() function inside an action - you just need to load the helper inside the action like this:
sfContext::getInstance()->getConfiguration()->loadHelpers('I18N');