How to make multiple processors and writers from one table in Spring Batch?

I am trying to perform operations on multiple tables using Spring Batch.
Of course there are ways to create multiple jobs, but in my case it's not possible.
So I want to ask for help.
Here is my situation.
[A]
name, amount
1, 1000
2, 5000
[B]
name, avail_amount
1, 3000
2, 6000
[C]
name, part, avail_amount
1, 1, 300
1, 2, 400
1, 3, 800
1, 4, 1500
2, 1, 6000
[D - History of C]
name, part, amount
1, 1, 300
1, 2, 400
1, 3, 300
[A] Read the name and amount from table A.
[B] Subtract the amount from table A from the total avail_amount in table B.
[C] Subtract the same amount from table C part by part: when a part's avail_amount reaches 0, move on to the next part and keep subtracting.
[D] Every time a row in C is updated, insert a history row into D.
To solve this problem, I tried to process multiple tables using CompositeItemProcessor, but I couldn't get the desired result.
ItemReader - A
ItemProcessor - B - ItemWriter - B
ItemProcessor - C - ItemWriter - D
ItemProcessor - C - ItemWriter - D
Is there any way to make it in the form above?
And, I am using MybatisItemProcessor.
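One reading of steps [A] to [D] can be sketched outside Spring Batch as a plain function. This is only an illustration in Python, not a Spring Batch or MyBatis API; the name `deduct` and the tuple layout are assumptions made for the example.

```python
def deduct(amount, parts):
    """Consume `amount` from C parts in order.

    `parts` is a list of (part, avail_amount) rows for one name.
    Returns (updated C rows, D history rows, amount left unconsumed).
    """
    remaining = amount
    updated = []   # new state of table C
    history = []   # rows to insert into table D
    for part, avail in parts:
        used = min(avail, remaining)
        remaining -= used
        updated.append((part, avail - used))
        if used > 0:
            # Only parts that were actually touched get a history row.
            history.append((part, used))
    return updated, history, remaining


# Name 1 from the tables above: amount 1000, parts 300 / 400 / 800 / 1500.
c_rows = [(1, 300), (2, 400), (3, 800), (4, 1500)]
new_c, d_rows, left = deduct(1000, c_rows)
print(d_rows)  # [(1, 300), (2, 400), (3, 300)], matching table D
```

If this logic lived in a single processor, its result object could carry the B update, the C updates and the D rows together, so that one writer could persist all three. That is a design assumption, not something the question confirms.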

Related

How to create a deterministic finite automata for the "regular" function where states lead to more than one state depending on the value of an int

The MiniZinc manual says that we can pass an array to the "regular" function that represents the transitions between the states of a DFA. For one state machine (figure not shown), it gives this example:
array[STATE,SHIFT] of int: t =
[| 2, 3, 1 % state 1
| 4, 4, 1 % state 2
| 4, 5, 1 % state 3
| 6, 6, 1 % state 4
| 6, 0, 1 % state 5
| 0, 0, 1|]; % state 6
Where the row indicates the state, the first two columns indicate the value of "d" and "n", and the last one is the state that it leads to. However, it doesn't have any examples of how to approach it if we need to make a state machine where a state can lead to more than one state, or where the excitation variables aren't boolean. For instance:
I can't find it in the manual or in Google, thanks.
I am not familiar with Minizinc, but your first question does not depend on that: You are dealing with a deterministic automaton, so each input value can only lead to one other state, otherwise it would be non-deterministic.
As to your second question, if the possible values for x are restricted to 0, 1, 2, and 3, then you could re-phrase this as booleans: in analogy to the d/n/o example, the first column would give the state for x = 0, the second one for x = 1, etc. This does become unwieldy when x can have many values, but should work for your example.
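To illustrate the column-per-value idea, here is a minimal sketch in Python (not MiniZinc). It treats each row as a state and each column as one input value, with 0 meaning "no transition". The first table is the one quoted from the manual. The second table, `T4`, is a made-up example for an input x in 0..3, and `run_dfa` is a hypothetical helper name.

```python
# Transition table quoted from the manual: rows are states 1..6,
# columns are the three input symbols, 0 means "no valid transition".
T = [
    [2, 3, 1],  # state 1
    [4, 4, 1],  # state 2
    [4, 5, 1],  # state 3
    [6, 6, 1],  # state 4
    [6, 0, 1],  # state 5
    [0, 0, 1],  # state 6
]

# Hypothetical table for a non-boolean input x in {0, 1, 2, 3}:
# one column per possible value of x.
T4 = [
    [1, 2, 2, 3],  # state 1
    [2, 2, 3, 0],  # state 2
    [3, 3, 3, 3],  # state 3
]


def run_dfa(table, inputs, start=1):
    """Follow the table; return the final state, or None if the run gets stuck."""
    state = start
    for symbol in inputs:  # symbol is a 0-based column index
        state = table[state - 1][symbol]
        if state == 0:
            return None
    return state


print(run_dfa(T, [0, 0, 0]))   # 6
print(run_dfa(T4, [0, 1, 2]))  # 3
print(run_dfa(T4, [1, 3]))     # None
```

Each cell holds exactly one next state, which is what keeps the automaton deterministic.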

Data structures being used in MongoDB (B-Trees etc)

Assume I am going to insert the following 5 elements in a Mongo db:
{id= 1, name=Bob, age=34}
{id= 2, name=Jane, age=22}
{id= 3, name=Mike, age=44}
{id= 4, name=Sam, age=55}
{id= 5, name=Joe, age=21}
1)
What data structure are these 5 objects stored in (before building an index)?
2)
I now build an index on the age field. As I understand it, a B-Tree will now be created containing those 5 objects. But what happens to the previous data structure? Are the objects still stored in it as well?

Spring Batch - Aggregating multiple lines in Processor

I am trying to write a Spring Batch application that gets lots of data from a database and writes it to Excel. In this process I need to transpose some data that comes in rows into columns.
So imagine a query that returns:
id,name,value
1, me, 1
1, me, 3
1, me, 2
2, you, 4
3, her, 5
My excel would look like:
1, me, 1, 2, 3
2, you, 4
3, her, 5
Note that when transposing the lines into columns, I also sort the values, so doing this transpose in SQL is a bit tricky.
My idea was to create an ItemReader that returns each line as an object, then consolidate the grouped lines into a single object in the Processor, and have an ExcelWriter that takes this DTO and writes it to Excel.
To implement the Processor, I would do something like:
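private ConsolidatedDTO consolidatedDTO;

public ConsolidatedDTO process(AnaliticDTO item) {
    if (consolidatedDTO == null) {
        // First item: start the first group.
        consolidatedDTO = new ConsolidatedDTO(item);
        return null;
    }
    if (consolidatedDTO.getKey().equals(item.getKey())) {
        // Same key: keep accumulating; returning null filters the item out.
        consolidatedDTO.add(item);
        return null;
    }
    // Key changed: emit the finished group and start a new one.
    ConsolidatedDTO result = consolidatedDTO;
    consolidatedDTO = new ConsolidatedDTO(item);
    return result;
}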
The problem is that I'm returning after consolidating everything, when I receive a different item, but how do I deal with the LAST item? I needed a way to know in the Processor when I received the last item so I could return the consolidated immediately instead of waiting for the next line.
Thanks in advance
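One way around the last-item problem is to group where the end of the input is visible, for example in a reader that reads ahead, rather than in a processor that sees one item at a time. The sketch below shows the idea in Python rather than Spring Batch. It assumes the rows arrive ordered by id, and `consolidate` is a hypothetical name.

```python
from itertools import groupby
from operator import itemgetter


def consolidate(rows):
    """Yield (id, name, sorted values) for each run of consecutive rows with the same id and name."""
    for (row_id, name), group in groupby(rows, key=itemgetter(0, 1)):
        # groupby closes the final group when the input ends,
        # so the last id is emitted without waiting for a "next" row.
        yield (row_id, name, sorted(value for _, _, value in group))


rows = [(1, "me", 1), (1, "me", 3), (1, "me", 2), (2, "you", 4), (3, "her", 5)]
for line in consolidate(rows):
    print(line)
# (1, 'me', [1, 2, 3])
# (2, 'you', [4])
# (3, 'her', [5])
```

In Spring Batch terms this would correspond to a reader that returns one consolidated object per group, so the processor and writer never need to know which item is last.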

500000x2 array, find rows meeting specific requirements of 1st and 2nd column, MATLAB

I'm facing a dead end here..
I have collected a huge amount of data and I have isolated only the information that I'm interested in, into a 500K x 2 array of pairs.
1st column contains an ID of, let's say, an Access Point.
2nd column contains a string.
There might be multiple occurrences of an ID in the 1st column, and there can be anything written in the 2nd column. Remember, those are pairs in each row.
What I need to find in those 500K pairs:
I want to find all the IDs, or even the rows, that have 'hello' written in the 2nd column, AND as an additional requirement, there must be more than 2 occurrences of this 'pair'.
Even better, I want to save how many times this happens, whenever it happens more than 2 times.
so for example:
col1 (IDs): [ 1, 2, 6, 2, 1, 2, 3, 1]
col2 (str): [ 'hello', 'go', 'hello', 'piz', 'hello', 'da', 'mn', 'hello']
so the data that I ask is :
[ 1, 3 ], which means ID=1, with 3 occurrences of id=1 with str='hello'
I tried to benchmark it to see if it could do 500,000 rows in a reasonable time.
Generate some test data (about 60 MB in total):
V = 1+round(rand(5E5,1).*1E4);
H = cell(1,length(V));
for ct = 1:length(H)
switch floor(rand(1)*10)
case 0
H{ct} = 'hello';
case 1
H{ct} = 'go';
case 2
H{ct} = 'piz';
case 3
H{ct} = 'da';
case 4
H{ct} = 'mn';
case 5
H{ct} = 'ds';
case 6
H{ct} = 'wf';
case 7
H{ct} = 'sf';
case 8
H{ct} = 'as';
case 9
H{ct} = 'sg';
end
end
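The analysis:
tic
a = ismember(H,{'hello'});   % logical mask of the rows whose string is 'hello'
M = accumarray(V(a),1);      % M(id) = number of 'hello' rows for that ID
idx = find(M>2);             % keep IDs with more than 2 occurrences
result = [idx,M(idx)];
toc
Elapsed time is 0.011699 seconds.
Alternative method with a loop:
tic
M = zeros(max(V),1);
for ct = 1:length(H)
    if strcmp(H{ct},'hello')
        M(V(ct)) = M(V(ct))+1;   % count this 'hello' for its ID
    end
end
idx = find(M>2);             % keep IDs with more than 2 occurrences
result1 = [idx,M(idx)];
toc
Elapsed time is 0.192560 seconds.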
There are many possible solutions. Here is one: use a map structure. The key set of the map contains the IDs (where "hello" appears in the second column), and the value set contains the number of occurrences.
Run over the second column. When you find "hello", check whether the corresponding ID is already a key in the map. If it is, add 1 to the value associated with that key. Otherwise, add a new pair (key, value) = (the ID, 1).
When finished, remove all the pairs from the map whose values are less than or equal to 2. The remaining map is what you are looking for.
Matlab map: https://es.mathworks.com/help/matlab/map-containers.html
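The map approach described above can be sketched as follows. The sketch is in Python rather than MATLAB, purely to show the counting logic. `count_pairs` and its parameters are names made up for this example, and `collections.Counter` plays the role of the map.

```python
from collections import Counter


def count_pairs(ids, strings, target="hello", min_count=3):
    """Map each ID to how often it is paired with `target`, keeping counts >= min_count."""
    # One pass over the pairs: count the ID only when the string matches.
    counts = Counter(i for i, s in zip(ids, strings) if s == target)
    # Drop IDs whose count is 2 or less (for the default min_count of 3).
    return {i: n for i, n in counts.items() if n >= min_count}


ids = [1, 2, 6, 2, 1, 2, 3, 1]
strs = ["hello", "go", "hello", "piz", "hello", "da", "mn", "hello"]
print(count_pairs(ids, strs))  # {1: 3}
```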

Selecting random values in a set in mathematica

I have a set that contains {0} and 8 other elements, 9 elements in total. I want to choose 3 random values from this set and create a 3x1 column matrix. This should be repeated for all possible choices from the set. How can I do that?
As @Picket said in a comment,
The way RandomSample works will ensure it will not output the same choice twice in a single call
If your list is small, you can generate all subsets and sample it.
Example
RandomSample[Subsets[{a, b, c, d, e, f}, {3}], 7]
will generate all (20) subsets with 3 (distinct) elements and then pick 7 different ones uniformly (there are options to weight each member differently, choose the random generator, etc.).
RandomSample[Flatten[Permutations /@ Subsets[{a, b, c, d, e, f}, {3}], 1], 13]
will generate all (120) possible ordered selections of 3 distinct elements among a set of 6 elements and give a sample of 13 distinct elements of this list.
If what you want is a random ordering of all possible subsets of size 3, or of all ordered selections of size 3 without duplicates, ask in the same way but for the exact number of such sets.
myset = { foo, foo2, foo3, foo5 };
RandomSample[Subsets[myset, {3}], Binomial[Length[myset],3 ]]
RandomSample[Flatten[Permutations /@ Subsets[myset, {3}], 1], 3!*Binomial[Length[myset],3 ] ]
(if you ask more than the exact number of possibilities, RandomSample will complain)
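For readers who want to check the counts, the same enumerate-then-sample idea can be sketched in Python (not Mathematica). `random_subsets` and `random_ordered` are hypothetical helper names. `random.sample` also refuses to return more items than exist, much like the complaint mentioned above.

```python
import random
from itertools import combinations, permutations


def random_subsets(items, k, n):
    """Pick n distinct k-element subsets, chosen uniformly without replacement."""
    return random.sample(list(combinations(items, k)), n)


def random_ordered(items, k, n):
    """Pick n distinct ordered selections of k distinct elements."""
    return random.sample(list(permutations(items, k)), n)


items = ["a", "b", "c", "d", "e", "f"]
print(len(list(combinations(items, 3))))   # 20 subsets
print(len(list(permutations(items, 3))))   # 120 ordered selections
print(random_subsets(items, 3, 7))         # 7 distinct subsets, random order
print(random_ordered(items, 3, 13))        # 13 distinct ordered triples
```

Asking for exactly 20 subsets (or 120 ordered selections) returns every possibility in a random order.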
Now, if your initial set is so large that generating the set of all subsets is impractical in time and memory, you can represent each selection by a number instead, even if the result is not perfectly uniform. Say that your initial set has 20 distinct elements. A three-digit number in base 20 can represent any selection of 3. You need to filter out the few numbers in which a digit appears more than once; the ratio of all three-digit numbers to those with distinct digits is
20^3/(3!*Binomial[20, 3]) // N
1.16959
Generating 25% more numbers than you need and filtering out the ones with repeated digits is probably safe:
Cases[IntegerDigits[RandomSample[0 ;; 20^3-1, Ceiling[31*(1 + 1/4)] ], 20, 3], _?(Length[Union[#]] == 3 &), 1, 31]
This generates a random sample of 39 distinct 3-digit numbers in base 20 and selects the first 31 with no repeated digits, returned as a list of 3-coordinate vectors.
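The oversample-and-filter step can be sketched in Python as well. This is an illustration under the same assumptions (base 20, 25% oversampling), and the helper names are made up. As with the Mathematica version, an unlucky draw can return fewer triples than requested.

```python
import math
import random


def to_digits(number, base, width):
    """Return the `width` base-`base` digits of `number`, most significant first."""
    digits = []
    for _ in range(width):
        number, d = divmod(number, base)
        digits.append(d)
    return digits[::-1]


def sample_distinct_triples(base, wanted, oversample=1.25):
    """Draw distinct 3-digit numbers and keep up to `wanted` with no repeated digit."""
    draw = math.ceil(wanted * oversample)            # 39 when wanted is 31
    numbers = random.sample(range(base ** 3), draw)  # distinct numbers in 0 .. base^3 - 1
    triples = [to_digits(n, base, 3) for n in numbers]
    valid = [t for t in triples if len(set(t)) == 3]
    return valid[:wanted]                            # may be shorter than `wanted`


triples = sample_distinct_triples(20, 31)
print(len(triples), triples[:3])
```

Each digit is an index into the original 20-element set, so mapping a triple back to elements is a plain lookup.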