Incorrect JobDataMap when scheduling multiple jobs at once - quartz-scheduler

Given a job definition (excuse my Kotlin) that stores data from the trigger's JobDataMap into the job's JobDataMap:
@DisallowConcurrentExecution
@PersistJobDataAfterExecution
class ProductUpdateEtlJob : RunEtlJob() {
    companion object {
        const val PRODUCT_IDS = "productIds"
    }

    override fun execute(context: JobExecutionContext) {
        val additionalIds =
            context
                .trigger
                .jobDataMap[PRODUCT_IDS]
        if (additionalIds != null) {
            context.jobDetail.jobDataMap.merge(
                PRODUCT_IDS,
                additionalIds,
            ) { old, new -> (old as List<*>) + (new as List<*>) }
        } else {
            super.execute(context)
            context.jobDetail.jobDataMap.remove(PRODUCT_IDS)
        }
    }
}
Superclass details are not relevant; it just processes the data.
If I trigger 3 executions immediately:
scheduler.triggerJob(
    key,
    JobDataMap(
        mapOf(
            PRODUCT_IDS to listOf(123)
        )
    )
)
scheduler.triggerJob(
    key,
    JobDataMap(
        mapOf(
            PRODUCT_IDS to listOf(456)
        )
    )
)
scheduler.triggerJob(
    key,
    JobDataMap(
        mapOf(
            PRODUCT_IDS to listOf(789)
        )
    )
)
Then each execution sees an empty job JobDataMap, so at the end the job's JobDataMap contains only the last element (789).
If I add sleep(100) (long enough for each Quartz execution to finish) between the triggerJob statements, I get the expected result of (123, 456, 789).
How can I update the JobDataMap of an existing job without running into this lost-update problem?
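One direction that might help (shown here as a minimal, untested Java sketch; the helper class name is made up, and PRODUCT_IDS just mirrors the Kotlin constant above) is to stop relying on @PersistJobDataAfterExecution to write back the snapshot each execution started with, and instead re-read the stored JobDetail through the Scheduler and re-store it with replace = true:

import java.util.ArrayList;
import java.util.List;

import org.quartz.JobDataMap;
import org.quartz.JobDetail;
import org.quartz.JobExecutionContext;
import org.quartz.JobKey;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;

// Hypothetical helper; PRODUCT_IDS mirrors the Kotlin constant above.
class JobDataMapUpdater {
    static final String PRODUCT_IDS = "productIds";

    static void appendProductIds(JobExecutionContext context, List<Integer> additionalIds)
            throws SchedulerException {
        Scheduler scheduler = context.getScheduler();
        JobKey key = context.getJobDetail().getKey();

        // Re-read the stored JobDetail so we merge into the latest persisted map,
        // not into the snapshot this execution was started with.
        JobDetail stored = scheduler.getJobDetail(key);
        JobDataMap map = stored.getJobDataMap();

        @SuppressWarnings("unchecked")
        List<Integer> existing = (List<Integer>) map.get(PRODUCT_IDS);
        List<Integer> merged = existing == null ? new ArrayList<>() : new ArrayList<>(existing);
        merged.addAll(additionalIds);
        map.put(PRODUCT_IDS, merged);

        // Overwrite the stored job definition with the updated map
        // (the job may need to be stored durably for addJob to accept it).
        scheduler.addJob(stored, true);
    }
}

Note that this is still a read-modify-write, so it narrows the window but does not make the update atomic; with a clustered JDBC job store two nodes could still race.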

Related

Spark Structured Streaming - Update following groupByKey and mapGroupsWithState giving duplicate key results

I am trying to execute the following stateful aggregation in Databricks (Scala):
sig_df
  .as[InputRow]
  .groupByKey(_.uid)
  .mapGroupsWithState(GroupStateTimeout.NoTimeout)(updateAcrossEvents)
  .writeStream
  .queryName("events_per_window_2")
  .format("memory")
  .outputMode("update")
  .start()
The functions managing the state are these:
def updateAcrossEvents(uid: String,
                       inputs: Iterator[InputRow],
                       oldState: GroupState[UState]): UState = {
  var state: UState = if (oldState.exists) oldState.get else UState(uid, -999999, -999999, -999999)
  for (input <- inputs) {
    state = updateUStateWithEvent(state, input)
    oldState.update(state)
  }
  state
}
And this:
def updateUStateWithEvent(state: UState, input: InputRow): UState = {
  // no timestamp, just ignore it
  if (Option(input.timestamp).isEmpty) {
    return state
  }
  if (input.sig_id == 10) {
    state.front_in = input.sig_value.toInt
  } else if (input.sig_id == 17) {
    state.rear_in = input.sig_value.toInt
  } else if (input.sig_id == 25) {
    state.top_in = input.sig_value.toInt
  }
  // return the updated state
  state
}
The issue I am facing is that the output has duplicates for the key uid. The following query returns plenty of results:
SELECT uid, count(*) FROM events_per_window_2
where front_in <> -999999
or rear_in <> -999999
or top_in <> -999999
group by uid
having count(*) > 1
I was under the impression that since the outputMode is update, we would not get any duplicates. What might be going wrong with my approach here?

How can I read 23 million records from postgres using JDBC? I have to read from a table in postgres and write to another table

When I write simple JPA code to findAll() the data, I run into memory issues. For writing, I can do batch updates. But how do I read 23 million records and save them in a list for storing into another table?
Java is a poor choice for processing "batch" stuff (and I love Java!).
Instead, do it using pure SQL:
insert into target_table (col1, col2, ...)
select col1, col2, ....
from ...
where ...
Or, if you must do some processing in Java that can't be done within the query, open a cursor for the query, read rows one at a time, and write each target row before reading the next one. This approach, however, will take a very long time to finish.
I fully agree with Bohemian's answer.
If you can read from the source and write to the destination table within the same loop, do something in a try-catch block like:
PreparedStatement reader = null;
PreparedStatement writer = null;
ResultSet rs = null;
try {
    reader = sourceConnection.prepareStatement("select....");
    writer = destinationConnection.prepareStatement("insert into...");
    rs = reader.executeQuery();
    int chunksize = 10000; // this is your batch size, depends on your system
    int counter = 0;
    while ( rs.next() ) {
        writer.set.... // set the corresponding value for every field to insert
        writer.addBatch();
        if ( counter++ % chunksize == 0 ) {
            int rowsWritten = writer.executeBatch();
            System.out.println("wrote " + counter + " rows"); // a simple progress message
        }
    }
    // when finished, do not forget to flush the rest of the batch
    writer.executeBatch();
} catch (SQLException sqlex) {
    // an error message to your taste
    System.out.println("SQLException: " + sqlex.getMessage());
} finally {
    try {
        if ( rs != null ) rs.close();
        if ( reader != null ) reader.close();
        if ( writer != null ) writer.close();
        // you probably want to close the connections as well
    } catch (SQLException e) {
        System.out.println("Exception while closing: " + e.getMessage());
    }
}
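One PostgreSQL-specific detail worth adding on the reader side (this reflects the PostgreSQL JDBC driver's documented cursor behaviour; the helper below is only an illustrative sketch, and the fetch size is a placeholder): by default the driver fetches the entire result set into memory, so for 23 million rows you also need to turn off auto-commit and set a fetch size before executing the query.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class CursorReadSettings {
    // Sketch: prepare a streaming reader for the PostgreSQL JDBC driver.
    // Without these two settings the driver buffers the whole result set in memory.
    static PreparedStatement streamingReader(Connection sourceConnection, String sql)
            throws SQLException {
        sourceConnection.setAutoCommit(false); // cursors are only used outside auto-commit
        PreparedStatement reader = sourceConnection.prepareStatement(sql);
        reader.setFetchSize(10000);            // fetch rows in chunks of 10k
        return reader;
    }
}

With that in place, the read/write loop above can work through the table without holding all rows at once.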

Operator CombineLatest - is it possible to determine the stream of the last emitted item?

I use combineLatest to join two streams carrying two types of tasks. Processing of the two types of tasks should be interleaved. Is it possible to determine which stream emitted the last value of the pair?
I use a solution based on timestamps, but it is not correct: each subject contains a default value.
List<Flowable<? extends Timed<? extends Task>>> sources = new ArrayList<>();

Flowable<Timed<TaskModification>> modificationSource = mTaskModificationSubject
        .onBackpressureDrop()
        .observeOn(Schedulers.io(), false, 1)
        .timestamp();

Flowable<Timed<TaskSynchronization>> synchronizationSource = mTaskSynchronizationSubject
        .onBackpressureDrop()
        .observeOn(Schedulers.io(), false, 1)
        .flatMap(TaskSynchronizationWrapper::getSources)
        .timestamp();

sources.add(0, modificationSource);
sources.add(1, synchronizationSource);

return Flowable
        .combineLatest(sources, array -> {
            Timed<TaskModification> taskModification = (Timed<TaskModification>) array[0];
            Timed<TaskSynchronization> taskSynchronization = (Timed<TaskSynchronization>) array[1];
            return (taskModification.time() > taskSynchronization.time())
                    ? taskModification.value()
                    : taskSynchronization.value();
        }, 1)
        .observeOn(Schedulers.io(), false, 1)
        .flatMapSingle(Task::getSource)
        .ignoreElements();
When a modification task is emitted, it should have priority over synchronization tasks.
Without implementing a custom operator, you could introduce queues, merge the signals, then pick items from the priority queue first:
Flowable<X> prioritySource = ...
Flowable<X> source = ...

Flowable<X> output = Flowable.defer(() -> {
    Queue<X> priorityQueue = new ConcurrentLinkedQueue<>();
    Queue<X> queue = new ConcurrentLinkedQueue<>();

    return Flowable.merge(
        prioritySource.map(v -> {
            priorityQueue.offer(v);
            return 1;
        }),
        source.map(v -> {
            queue.offer(v);
            return 1;
        })
    )
    .map(v -> {
        if (!priorityQueue.isEmpty()) {
            return priorityQueue.poll();
        }
        return queue.poll();
    });
});
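For completeness, here is a self-contained toy version of the same idea (an illustrative sketch, assuming RxJava 2's io.reactivex.Flowable and plain strings instead of the real task types; for RxJava 3 only the package name changes):

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import io.reactivex.Flowable;

public class PriorityMergeDemo {
    public static void main(String[] args) {
        // Toy stand-ins for the two task streams; the real sources would be the subjects above.
        Flowable<String> prioritySource = Flowable.just("mod-1", "mod-2");
        Flowable<String> source = Flowable.just("sync-1", "sync-2", "sync-3");

        Flowable<String> output = Flowable.defer(() -> {
            Queue<String> priorityQueue = new ConcurrentLinkedQueue<>();
            Queue<String> queue = new ConcurrentLinkedQueue<>();

            return Flowable.merge(
                    prioritySource.map(v -> { priorityQueue.offer(v); return 1; }),
                    source.map(v -> { queue.offer(v); return 1; }))
                // Each merged signal means "one item arrived somewhere";
                // drain the priority queue first whenever it has something.
                .map(signal -> priorityQueue.isEmpty() ? queue.poll() : priorityQueue.poll());
        });

        output.subscribe(System.out::println);
    }
}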

How to ensure the returned StringList will be ordered: Scala

I am using Scala 2.11.8
I am trying to read queries from my property file. Each query set has multiple parts (explained below), and I have a certain sequence in which these queries must execute.
Code:
import com.typesafe.config.ConfigFactory

object ReadProperty {
  def main(args: Array[String]): Unit = {
    val queryRead = ConfigFactory.load("testqueries.properties").getConfig("select").getStringList("caseInc").toArray()
    val localRead = ConfigFactory.load("testqueries.properties").getConfig("select").getStringList("caseLocal").toArray.toSet
    queryRead.foreach(println)
    localRead.foreach(println)
  }
}
Property file content:
select.caseInc.2 = Select emp_salary, emp_dept_id from employees
select.caseLocal.1 = select one
select.caseLocal.3 = select three
select.caseRemote.2 = Select e1.emp_name, d1.dept_name, e1.salary from emp_1 e1 join dept_1 d1 on(e1.emp_dept_id = d1.dept_id)
select.caseRemote.1 = Select * from departments
select.caseInc.1 = Select emp_id, emp_name from employees
select.caseLocal.2 = select two
select.caseLocal.4 = select four
Output:
Select emp_id, emp_name from employees
Select emp_salary, emp_dept_id from employees
select one
select two
select three
select four
As you can see in the output, the result is sorted. In the property file I have numbered the queries in the sequence in which they should run (passing caseInc, caseLocal as arguments).
With getStringList() I always get the list sorted on the basis of the sequence numbers I provide.
Even when I tried using toArray() and toArray().toSet I got sorted output.
So far so good.
But how can I be sure that it will always return the order I have provided in the property file? I am confused because I cannot find any API documentation that says the returned list will be sorted.
I think you can rely on this fact. Looking into the code of DefaultTransformer you can see the following piece of logic:
} else if (requested == ConfigValueType.LIST && value.valueType() == ConfigValueType.OBJECT) {
    // attempt to convert an array-like (numeric indices) object to a
    // list. This would be used with .properties syntax for example:
    // -Dfoo.0=bar -Dfoo.1=baz
    // To ensure we still throw type errors for objects treated
    // as lists in most cases, we'll refuse to convert if the object
    // does not contain any numeric keys. This means we don't allow
    // empty objects here though :-/
    AbstractConfigObject o = (AbstractConfigObject) value;
    Map<Integer, AbstractConfigValue> values = new HashMap<Integer, AbstractConfigValue>();
    for (String key : o.keySet()) {
        int i;
        try {
            i = Integer.parseInt(key, 10);
            if (i < 0)
                continue;
            values.put(i, o.get(key));
        } catch (NumberFormatException e) {
            continue;
        }
    }
    if (!values.isEmpty()) {
        ArrayList<Map.Entry<Integer, AbstractConfigValue>> entryList = new ArrayList<Map.Entry<Integer, AbstractConfigValue>>(
                values.entrySet());
        // sort by numeric index
        Collections.sort(entryList,
                new Comparator<Map.Entry<Integer, AbstractConfigValue>>() {
                    @Override
                    public int compare(Map.Entry<Integer, AbstractConfigValue> a,
                            Map.Entry<Integer, AbstractConfigValue> b) {
                        return Integer.compare(a.getKey(), b.getKey());
                    }
                });
        // drop the indices (we allow gaps in the indices, for better or
        // worse)
        ArrayList<AbstractConfigValue> list = new ArrayList<AbstractConfigValue>();
        for (Map.Entry<Integer, AbstractConfigValue> entry : entryList) {
            list.add(entry.getValue());
        }
        return new SimpleConfigList(value.origin(), list);
    }
}
Note how the keys are parsed as integer values and then sorted using Integer.compare.
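As a quick illustration (a small self-contained check, not from the original post; the keys mirror the caseInc entries above), you can feed indexed .properties keys in a scrambled insertion order and getStringList will go through exactly this object-to-list conversion:

import java.util.List;
import java.util.Properties;

import com.typesafe.config.Config;
import com.typesafe.config.ConfigFactory;

public class IndexedListOrderCheck {
    public static void main(String[] args) {
        // Indices inserted deliberately out of order, with a gap at 3.
        Properties props = new Properties();
        props.setProperty("select.caseInc.2", "Select emp_salary, emp_dept_id from employees");
        props.setProperty("select.caseInc.4", "Select * from departments");
        props.setProperty("select.caseInc.1", "Select emp_id, emp_name from employees");

        Config config = ConfigFactory.parseProperties(props);

        // The numeric-keyed object is converted to a list sorted by index (1, 2, 4),
        // i.e. the Integer.compare-based sort shown in the DefaultTransformer snippet above.
        List<String> queries = config.getConfig("select").getStringList("caseInc");
        queries.forEach(System.out::println);
    }
}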

Test Coverage fails on before insert / before update Apex trigger

I have this very simple before insert / update trigger on Opportunity that auto-selects the Price Book based on a dropdown value containing Sales Office (State) location info.
Here's my Trigger:
trigger SelectPriceBook on Opportunity ( before insert, before update ) {
    for( Opportunity opp : Trigger.new ) {
        // Change Price Book
        // New York
        if( opp.Campus__c == 'NYC' )
            opp.Pricebook2Id = PB_NYC; // contains a Pricebook's ID
        // Atlanta
        if( opp.Campus__c == 'ATL' )
            opp.Pricebook2Id = PB_ATL; // contains another Pricebook's ID
    }
}
Here's my Test Class:
@isTest (SeeAllData = true)
public class SelectPriceBookTestClass {
    static testMethod void validateSelectPriceBook() {
        // Pricebook IDs
        ID PB_NYC = 'xxxx';
        ID PB_ATL = 'xxxx';
        // New Opp
        Opportunity opp = new Opportunity();
        opp.Name = 'Test Opp';
        opp.Office__c = 'NYC';
        opp.StageName = 'Quote';
        // Insert
        insert opp;
        // Retrieve inserted opportunity
        opp = [SELECT Pricebook2Id FROM Opportunity WHERE Id = :opp.Id];
        System.debug( 'Retrieved Pricebook Id: ' + opp.Pricebook2Id );
        // Change Campus
        opp.Office__c = 'ATL';
        // Update Opportunity
        update opp;
        // Retrieve updated opportunity
        opp = [SELECT Pricebook2Id FROM Opportunity WHERE Id = :opp.Id];
        System.debug( 'Retrieved Updated Pricebook Id: ' + opp.Pricebook2Id );
        // Test
        System.assertEquals( PB_ATL, opp.Pricebook2Id );
    }
}
The test runs report 0% test coverage.
Also, along similar lines, I have another before insert trigger that sets the Owner of an Event to the Owner of the parent Lead. Here's the code:
trigger AutoCampusTourOwner on Event( before insert ) {
    for( Event evt : Trigger.new ) {
        // Abort if other kind of Event
        if( evt.Subject != 'Visit' )
            return;
        // Set Owner Id
        Lead parentLead = [SELECT OwnerId FROM Lead WHERE Id = :evt.WhoId];
        evt.OwnerId = parentLead.OwnerId;
    }
}
This, too, is showing 0% coverage. My guess is that it has something to do with the for loops in both. I know I'm flouting best practices by running a SOQL query inside a for loop, but for my purposes it should be fine, as these Events are created manually and only one at a time, so there is no risk of governor limits kicking in due to bulk inserts.
The code in both cases works 100%. Please suggest a fix for the test cases.
Have you tried Trigger.old? My thinking is, when you update the office in your test class from NYC to ATL, the value 'NYC' will be in Trigger.old, and that's what you want to check in your trigger.
I could be wrong since I'm new to Apex too, but try it and let me know what happens.
For the first trigger, don't do anything special; just create Opportunities and execute the trigger like this.
Test class for SelectPriceBook
@isTest
private class TriggerTestClass {
    static testmethod void selectPriceTest(){
        Opportunity opps = new Opportunity(
            Name = 'Test Opps',
            CloseDate = System.today().addDays(30),
            StageName = 'Prospecting',
            ForecastCategoryName = 'Pipeline',
            Office__c = 'NYC');
        insert opps;
        Opportunity opps2 = new Opportunity(
            Name = 'Test Opps 2',
            CloseDate = System.today().addDays(28),
            StageName = 'Prospecting',
            ForecastCategoryName = 'Pipeline',
            Office__c = 'ATL');
        insert opps2;
    }
}
It will give you good test coverage. I don't know what you are trying to do in AutoCampusTourOwner!
I have a test class for these:
trigger ClientEmailTrigger on inflooens__Client_Email__c (after insert, after update, before insert, before update) {
    ApexTriggerSettings__c setting = ApexTriggerSettings__c.getValues('Inflooens Trigger Settings');
    if(setting != NULL && setting.ClientEmailTrigger__c == TRUE){
        AuditTrailController objAdt = new AuditTrailController();
        if(Trigger.isAfter){
            if(Trigger.isUpdate){
                System.debug('In Update Client Email Record');
                objAdt.insertAuditRecord(Trigger.newMap, 'inflooens__Client_Email__c', Trigger.new.get(0).Id, Trigger.oldMap);
            }
            if(Trigger.isInsert){
                objAdt.insertAuditRecord(Trigger.newMap, 'inflooens__Client_Email__c', null, null);
            }
        }
    }
}