How to dig further into this memory leak with Eclipse MAT

I have an issue where a ScheduledThreadPoolExecutor ends up with 3 million future tasks. I am trying to find out what type of task they are so I can go to where that task is scheduled, but I am not sure how to get that information from this screen (I have tried right-clicking those future tasks and selecting various choices in the menu). It seems like something is missing in the GUI, such as links to the actual runnables.
Any ideas on how to drill in further?

Some General Stuff
You need to know that if you have a portable heap dump (PHD), it does not contain actual data (primitives), so you can base your findings only on the reference map (which types hold a reference to which other types).
You can give OQL a try. It is an SQL-like language for querying the objects in the heap dump.
One example:
select * from java.lang.String s where s.@retainedHeapSize>10000
This returns all strings whose retained heap size is larger than ~10k.
You can also use functions in queries (for example, for aggregation).
As for the current problem
If you check the FutureTask source (here is JDK6 below):
public class FutureTask<V> implements RunnableFuture<V> {
    /** Synchronization control for FutureTask */
    private final Sync sync;
    ...
    public FutureTask(Callable<V> callable) {
        if (callable == null)
            throw new NullPointerException();
        sync = new Sync(callable);
    }
    ...
    public FutureTask(Runnable runnable, V result) {
        sync = new Sync(Executors.callable(runnable, result));
    }
The actual Runnable is referenced by the Sync object:
private final class Sync extends AbstractQueuedSynchronizer {
    private static final long serialVersionUID = -7828117401763700385L;
    /** State value representing that task is running */
    private static final int RUNNING = 1;
    /** State value representing that task ran */
    private static final int RAN = 2;
    /** State value representing that task was cancelled */
    private static final int CANCELLED = 4;
    /** The underlying callable */
    private final Callable<V> callable;
    /** The result to return from get() */
    private V result;
    /** The exception to throw from get() */
    private Throwable exception;
    /**
     * The thread running task. When nulled after set/cancel, this
     * indicates that the results are accessible. Must be
     * volatile, to ensure visibility upon completion.
     */
    private volatile Thread runner;
    Sync(Callable<V> callable) {
        this.callable = callable;
    }
So in the GUI, expand the Sync object (it is not expanded in your picture); from there you can inspect the Runnables.
I don't know whether you can change the code, but in general it is better to always limit the size of the queue used by an executor, since this way you can avoid leaks. Alternatively, you can use a persisted queue. If you apply a limit, you can define the rejection policy, for example reject, run in caller, and so on. See http://docs.oracle.com/javase/1.5.0/docs/api/java/util/concurrent/ThreadPoolExecutor.html for details.
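As a rough illustration of the bounded-queue advice above, here is a minimal sketch. It assumes a plain ThreadPoolExecutor: ScheduledThreadPoolExecutor manages its own internal delay queue and does not accept a custom one, so this does not bound scheduled tasks directly. The names BoundedExecutors and newBoundedPool are made up for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of the JDK.
class BoundedExecutors {
    // Builds a fixed-size pool whose work queue holds at most queueCapacity tasks.
    // Once the queue is full, CallerRunsPolicy makes the submitting thread run the
    // task itself, which slows producers down instead of letting a backlog grow.
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,                        // core and maximum pool size
                0L, TimeUnit.MILLISECONDS,               // keep-alive for idle extra threads
                new ArrayBlockingQueue<>(queueCapacity), // bounded work queue
                new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

Other built-in handlers (AbortPolicy, DiscardPolicy, DiscardOldestPolicy) can be swapped in if running the task in the caller is not acceptable.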


What is a good way/pattern to use Temporal/Cadence versioning API

The Versioning API is powerful. However, the more it is used, the faster the code gets messy and hard to read and maintain.
Over time, the product needs to move fast to introduce new business requirements. Is there any advice on using this API wisely?
I would suggest using a Global Version Provider design pattern in Cadence/Temporal workflow if possible.
Key Idea
The versioning API is very powerful: it lets you change the behavior of existing workflow executions in a deterministic (backward-compatible) way. In the real world, you may only care about adding new behavior, and be okay with introducing it only to newly started workflow executions. In this case, you can use a global version provider to unify the versioning for the whole workflow.
The key idea is that we version the whole workflow (that's why it's called GlobalVersionProvider). Every time we add a new version, we update the version provider and provide a new version.
Example In Java
import com.google.common.annotations.VisibleForTesting;
import com.google.common.collect.ImmutableMap;
import io.temporal.workflow.Workflow;
import java.util.HashMap;
import java.util.Map;
public class GlobalVersionProvider {
    private static final String WORKFLOW_VERSION_CHANGE_ID = "global";
    private static final int STARTING_VERSION_USING_GLOBAL_VERSION = 1;
    private static final int STARTING_VERSION_DOING_X = 2;
    private static final int STARTING_VERSION_DOING_Y = 3;
    private static final int MAX_STARTING_VERSION_OF_ALL =
        STARTING_VERSION_DOING_Y;
    // Workflow.getVersion can release a thread and subsequently cause a non-deterministic error.
    // We're introducing this map in order to cache our versions on the first call, which should
    // always occur at the beginning of a workflow
    private static final Map<String, GlobalVersionProvider> RUN_ID_TO_INSTANCE_MAP =
        new HashMap<>();
    private final int versionOnInstantiation;
    private GlobalVersionProvider() {
        versionOnInstantiation =
            Workflow.getVersion(
                WORKFLOW_VERSION_CHANGE_ID,
                Workflow.DEFAULT_VERSION,
                MAX_STARTING_VERSION_OF_ALL);
    }
    private int getVersion() {
        return versionOnInstantiation;
    }
    public boolean isAfterVersionOfUsingGlobalVersion() {
        return getVersion() >= STARTING_VERSION_USING_GLOBAL_VERSION;
    }
    public boolean isAfterVersionOfDoingX() {
        return getVersion() >= STARTING_VERSION_DOING_X;
    }
    public boolean isAfterVersionOfDoingY() {
        return getVersion() >= STARTING_VERSION_DOING_Y;
    }
    public static GlobalVersionProvider get() {
        String runId = Workflow.getInfo().getRunId();
        GlobalVersionProvider instance;
        if (RUN_ID_TO_INSTANCE_MAP.containsKey(runId)) {
            instance = RUN_ID_TO_INSTANCE_MAP.get(runId);
        } else {
            instance = new GlobalVersionProvider();
            RUN_ID_TO_INSTANCE_MAP.put(runId, instance);
        }
        return instance;
    }
    // NOTE: this should be called at the beginning of the workflow method
    public static void upsertGlobalVersionSearchAttribute() {
        int workflowVersion = get().getVersion();
        Workflow.upsertSearchAttributes(
            ImmutableMap.of(
                WorkflowSearchAttribute.TEMPORAL_WORKFLOW_GLOBAL_VERSION.getValue(),
                workflowVersion));
    }
    // Call this API in each replay test to clear the cache
    @VisibleForTesting
    public static void clearInstances() {
        RUN_ID_TO_INSTANCE_MAP.clear();
    }
}
Note that because of a bug in the Temporal/Cadence Java SDK, Workflow.getVersion can release a thread and subsequently cause a non-deterministic error.
The map caches the version on the first call, which should always occur at the beginning of the workflow execution.
Call the clearInstances API in each replay test to clear the cache.
Therefore, in the workflow code:
public class HelloWorldImpl {
    private GlobalVersionProvider globalVersionProvider;
    @VisibleForTesting
    public HelloWorldImpl(final GlobalVersionProvider versionProvider) {
        this.globalVersionProvider = versionProvider;
    }
    public HelloWorldImpl() {
        this.globalVersionProvider = GlobalVersionProvider.get();
    }
    @Override
    public void start(final Request request) {
        if (globalVersionProvider.isAfterVersionOfUsingGlobalVersion()) {
            GlobalVersionProvider.upsertGlobalVersionSearchAttribute();
        }
        ...
        ...
        if (globalVersionProvider.isAfterVersionOfDoingX()) {
            // doing X here
            ...
        }
        ...
        if (globalVersionProvider.isAfterVersionOfDoingY()) {
            // doing Y here
            ...
        }
        ...
    }
}
Best practice with the pattern
How to add a new version
For every new version:
Add the new constant STARTING_VERSION_XXXX
Add a new API public boolean isAfterVersionOfXXX()
Update MAX_STARTING_VERSION_OF_ALL
Apply the new API in the workflow code where you want to add the new logic
Maintain the replay test JSON files with a naming pattern like HelloWorldWorkflowReplaytest-version-x-description.json. Always add a new replay test for every new version you introduce to the workflow. When generating the JSON from a workflow execution, make sure it exercises the new code path; otherwise it won't be able to protect determinism. If it takes more than one workflow execution to exercise all branches, make multiple JSON files for replay.
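The provider-side steps above could look roughly like the sketch below. It is a simplified stand-in, not the real provider: the pinned version is passed into the constructor instead of being read from Workflow.getVersion, so it has no SDK dependency. VersionGateSketch, STARTING_VERSION_DOING_Z and isAfterVersionOfDoingZ are hypothetical names for a newly added behavior Z.

```java
// Simplified stand-in for GlobalVersionProvider (hypothetical class name).
class VersionGateSketch {
    static final int STARTING_VERSION_USING_GLOBAL_VERSION = 1;
    static final int STARTING_VERSION_DOING_X = 2;
    static final int STARTING_VERSION_DOING_Y = 3;
    // New constant for the new behavior (hypothetical "Z").
    static final int STARTING_VERSION_DOING_Z = 4;
    // The maximum is updated so newly started executions get the newest version.
    static final int MAX_STARTING_VERSION_OF_ALL = STARTING_VERSION_DOING_Z;

    // In the real provider this value would come from Workflow.getVersion.
    private final int versionOnInstantiation;

    VersionGateSketch(int versionOnInstantiation) {
        this.versionOnInstantiation = versionOnInstantiation;
    }

    boolean isAfterVersionOfDoingY() {
        return versionOnInstantiation >= STARTING_VERSION_DOING_Y;
    }

    // New API guarding the new code path.
    boolean isAfterVersionOfDoingZ() {
        return versionOnInstantiation >= STARTING_VERSION_DOING_Z;
    }
}
```

An execution pinned to version 3 keeps skipping Z, while one started after the change (version 4) runs it.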
How to remove an old version
To remove an old code path (version), add a new version that does not execute the old code path, then later use a search attribute query like
GlobalVersion>=STARTING_VERSION_DOING_X AND GlobalVersion<STARTING_VERSION_NOT_DOING_X to find out whether any workflow executions are still running with those versions.
Instead of waiting for workflows to close, you can terminate or reset them.
Example of deprecating a code path DoingX:
In the workflow code:
public class HelloWorldImpl implements Helloworld {
    ...
    @Override
    public void start(final Request request) {
        ...
        ...
        if (globalVersionProvider.isAfterVersionOfDoingX() && !globalVersionProvider.isAfterVersionOfNotDoingX()) {
            // doing X here
            ...
        }
    }
}
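The workflow snippet above relies on an isAfterVersionOfNotDoingX() check that the provider listing does not show. A minimal sketch of what the provider side might look like follows, with the pinned version passed in directly rather than read from Workflow.getVersion. The value 4 for STARTING_VERSION_NOT_DOING_X, the class name DeprecationGateSketch and the helper shouldDoX() are assumptions for illustration.

```java
// Simplified stand-in for the provider (hypothetical class name).
class DeprecationGateSketch {
    static final int STARTING_VERSION_DOING_X = 2;
    static final int STARTING_VERSION_DOING_Y = 3;
    // Assumed value: the version from which X is no longer executed.
    static final int STARTING_VERSION_NOT_DOING_X = 4;

    private final int version;

    DeprecationGateSketch(int version) {
        this.version = version;
    }

    boolean isAfterVersionOfDoingX() {
        return version >= STARTING_VERSION_DOING_X;
    }

    boolean isAfterVersionOfNotDoingX() {
        return version >= STARTING_VERSION_NOT_DOING_X;
    }

    // X runs only for executions pinned to a version in
    // [STARTING_VERSION_DOING_X, STARTING_VERSION_NOT_DOING_X).
    boolean shouldDoX() {
        return isAfterVersionOfDoingX() && !isAfterVersionOfNotDoingX();
    }
}
```

This is the same half-open range that the search attribute query selects, so once that query returns no running executions the X branch can be deleted.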
###TODO Example In Golang
Benefits
Prevents the spaghetti code that comes from using the native Temporal versioning API everywhere in the workflow code
Provides a search attribute to find workflows of a particular version. This fills the gap left by the Temporal Java SDK, which is missing the TemporalChangeVersion feature.
Even though the Cadence Java/Golang SDK has CadenceChangeVersion, this global version search attribute is much better for queries, because it is an integer instead of a keyword.
Provides a pattern for maintaining replay tests easily
Provides a way to test different versions without this missing feature
Cons
There shouldn't be any cons. Using this pattern doesn't stop you from using the raw versioning API directly in the workflow. You can combine this pattern with others together.

How can I have multiple instances of a Spring Boot repository (interface), to have complete test-state isolation?

1) Contextualization:
To have complete test-state isolation across all tests in my test class,
I would like to have a new repository (DAO) instance for each individual test.
My repository is an interface, which is why I cannot simply instantiate it.
My goal is:
Run all tests in parallel, meaning at the same time.
That is why I need individual/multiple instances of the repository (DAO), one in each test.
Those multiple instances will make sure that a test that finishes does not interfere with those that are still running.
Below is the code for the above situation:
1.1) Code:
Current working status: working, BUT with the SAME REPOSITORY INSTANCE.
Current behaviour:
The tests are not stable.
SOMETIMES they interfere with each other,
meaning that a test that finishes early destroys the repository bean that is still being used by a test that is still running.
public class ServiceTests2 extends ConfigTests {
    private List<Customer> customerList;
    private Flux<Customer> customerFlux;
    @Lazy
    @Autowired
    private ICustomerRepo repo;
    private ICustomerService service;
    @BeforeEach
    public void setUp() {
        service = new CustomerService(repo);
        Customer customer1 = customerWithName().create();
        Customer customer2 = customerWithName().create();
        customerList = Arrays.asList(customer1, customer2);
        customerFlux = service.saveAll(customerList);
    }
    @Test
    @DisplayName("Save")
    public void save() {
        StepVerifier.create(customerFlux)
            .expectNextSequence(customerList)
            .verifyComplete();
    }
    @Test
    @DisplayName("Find: Objects")
    public void find_object() {
        StepVerifier
            .create(customerFlux)
            .expectNext(customerList.get(0))
            .expectNext(customerList.get(1))
            .verifyComplete();
    }
}
2) The ERROR happening:
This ERROR happens in the failed-Tests:
3) Question:
How can I create multiple instances of the repository,
even though it is an interface (which does not allow instantiation),
in order to have COMPLETE TEST ISOLATION,
meaning ONE different instance of the repository in each test?
Thanks a lot for any help or ideas.
You can use the @DirtiesContext annotation on the test class that modifies the application context.
By default, this will mark the application context as dirty after the entire test class is run. If you would like to mark the context as dirty after a single test method, then you can either annotate the test method instead or set the classMode property to AFTER_EACH_TEST_METHOD at your class level annotation.
@DirtiesContext(classMode = ClassMode.AFTER_EACH_TEST_METHOD)
When an application context is marked dirty, it is removed from the
testing framework's cache and closed; thus the underlying Spring
container is rebuilt for any subsequent test that requires a context
with the same set of resource locations.

Score corruption when using computed values to calculate score

I have a use case where:
A job can be of many types, say A, B and C.
A tool can be configured to be one of the types A, B and C.
A job can be assigned to a tool. The end time of the job depends on the current configured type of the tool. If the tool's current configured type is different from the type of the job, then time needs to be added to change the current tool configuration.
My @PlanningEntity is Allocation, with startTime and tool as @PlanningVariable. I tried to add the currentConfiguredToolType in the Allocation as a @CustomShadowVariable and update the toolType in the shadow listener's afterVariableChanged() method, so that I have the correct toolType for the next job assigned to the tool. However, it is giving me inconsistent results.
[EDIT]: I did some debugging to see if the toolType is set correctly. I found that the toolType is being set correctly in the afterVariableChanged() method. However, when I looked at the next job assigned to the tool, I saw that the toolType had not changed. Is it because of multiple threads executing this flow? Is one thread changing the toolType of the tool the first time, and then a second thread simultaneously assigning the times the second time without taking into account the changes done by the first thread?
[EDIT]: I was using 6.3.0 Final earlier (until yesterday). I switched to 6.5.0 Final today. There too I am seeing similar results, where the toolType seems to be set properly in the afterVariableChanged() method, but is not taken into account for the next allocation on that tool.
[EDIT]: Domain code looks something like below:
@PlanningEntity
public class Allocation {
    private Job job;
    // planning variables
    private LocalDateTime startTime;
    private Tool tool;
    // shadow variable
    private ToolType toolType;
    private LocalDateTime endTime;
    @PlanningVariable(valueRangeProviderRefs = TOOL_RANGE)
    public Tool getTool() {
        return this.tool;
    }
    @PlanningVariable(valueRangeProviderRefs = START_TIME_RANGE)
    public LocalDateTime getStartTime() {
        return this.startTime;
    }
    @CustomShadowVariable(variableListenerClass = ToolTypeVariableListener.class,
        sources = {@CustomShadowVariable.Source(variableName = "tool")})
    public ToolType getCurrentToolType() {
        return this.toolType;
    }
    private void setToolType(ToolType type) {
        this.toolType = type;
        this.tool.setToolType(type);
    }
    private void setStartTime(LocalDateTime startTime) {
        this.startTime = startTime;
        this.endTime = getTimeTakenForJob() + getTypeChangeTime();
        ...
    }
    private LocalDateTime getTypeChangeTime() {
        // typeChangeTimeMap is available and is populated with data
        return typeChangeTimeMap.get(tool.getType);
    }
}
public class Tool {
    ...
    private ToolType toolType;
    // getter and setter for this
    public void setToolType() { ...}
    public ToolType getToolType() { ...}
}
public class ToolTypeVariableListener implements VariableListener<Allocation> {
    ...
    @Override
    public void afterVariableChanged(ScoreDirector scoreDirector, Allocation entity) {
        scoreDirector.afterVariableChanged(entity, "currentToolType");
        if (entity.getTool() != null && entity.getStartTime() != null) {
            entity.setCurrentToolType(entity.getJob().getType());
        }
        scoreDirector.afterVariableChanged(entity, "currentToolType");
    }
[EDIT]: When I did some debugging, it looked like the toolType set in the machine for one allocation is used in calculating the type change time for an allocation belonging to a different evaluation set. I am not sure how to avoid this.
If this is indeed the case, what is a good way to model problems like this, where the state of an item affects the time taken? Or am I totally off? I guess I am totally lost here.
[EDIT]: This is not an issue with how Optaplanner is invoked, but score corruption, when the rule to penalize it based on endTime is added. More details in comments.
[EDIT]: I commented out the rules specified in rules one-by-one and saw that the score corruption occurs only when the score computed depends on the computed values: endTime and toolTypeChange. It is fine when the score depends on the startTime, which is a planningVariable alone. However, that does not give me the best results. It gives me a solution which has a negative hard score, which means it violated the rule of not assigning the same tool during the same time to different jobs.
Can computed values not be used for score calculations?
Any help or pointer is greatly appreciated.
best,
Alice
The ToolTypeVariableListener seems to lack calls to the before/after methods (it calls afterVariableChanged twice and never beforeVariableChanged), which can cause score corruption. Turn on FULL_ASSERT to verify.
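A minimal sketch of the before/after pairing, assuming the fix is to notify the score director before the shadow variable is written and again afterwards. So that it compiles without OptaPlanner on the classpath, ScoreDirectorStandIn, AllocationStandIn and RecordingScoreDirector are hypothetical stand-ins for the real types; only the two method names mirror the real ScoreDirector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; the real ScoreDirector has many more methods.
interface ScoreDirectorStandIn {
    void beforeVariableChanged(Object entity, String variableName);
    void afterVariableChanged(Object entity, String variableName);
}

// Hypothetical stand-in for the Allocation planning entity.
class AllocationStandIn {
    String jobType;         // type required by the assigned job
    String currentToolType; // shadow variable
}

class ToolTypeListenerSketch {
    // Brackets the shadow-variable write with a before/after pair, in that order.
    static void updateShadow(ScoreDirectorStandIn scoreDirector, AllocationStandIn entity) {
        scoreDirector.beforeVariableChanged(entity, "currentToolType");
        entity.currentToolType = entity.jobType;
        scoreDirector.afterVariableChanged(entity, "currentToolType");
    }
}

// Records notifications so their order can be inspected.
class RecordingScoreDirector implements ScoreDirectorStandIn {
    final List<String> events = new ArrayList<>();

    @Override
    public void beforeVariableChanged(Object entity, String variableName) {
        events.add("before:" + variableName);
    }

    @Override
    public void afterVariableChanged(Object entity, String variableName) {
        events.add("after:" + variableName);
    }
}
```

In the real listener, the same bracketing would go around entity.setCurrentToolType(...) inside afterVariableChanged.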

How to cancel a recurring job in firebase job dispatcher

I have created a recurring job, and I want to cancel it when some conditions are met.
final Job.Builder builder = dispatcher.newJobBuilder()
    .setTag("myJob")
    .setService(myJobService.class)
    .setRecurring(true)
    .setTrigger(Trigger.executionWindow(30, 60));
How can I cancel a job in Firebase?
The readme on GitHub says:
Driver is an interface that represents a component that can schedule,
cancel, and execute Jobs. The only bundled Driver is the
GooglePlayDriver, which relies on the scheduler built-in to Google
Play services.
So cancelling is part of the driver you are using. Looking at the code of the Driver interface, there are two methods to cancel a job:
/**
 * Cancels the job with the provided tag and class.
 *
 * @return one of the CANCEL_RESULT_ constants.
 */
@CancelResult
int cancel(@NonNull String tag);
/**
 * Cancels all jobs registered with this Driver.
 *
 * @return one of the CANCEL_RESULT_ constants.
 */
@CancelResult
int cancelAll();
So in your case you have to call:
dispatcher.cancel("myJob");
or
dispatcher.cancelAll();
The dispatcher will call the corresponding method of the driver for you. If you want, you can also call the methods directly on your driver, for example myDriver.cancelAll(), as is done in the sample app that comes with the GitHub project.
The chosen method will return one of the following constants:
public static final int CANCEL_RESULT_SUCCESS = 0;
public static final int CANCEL_RESULT_UNKNOWN_ERROR = 1;
public static final int CANCEL_RESULT_NO_DRIVER_AVAILABLE = 2;
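If you want to react to the outcome, one option is to map the returned code to a message. The sketch below copies the three values listed above into a hypothetical helper class (CancelResults, with a made-up describe method) so that it is self-contained; in an app you would compare against the library's own constants instead.

```java
// Hypothetical helper; the constant values are copied from the list above.
class CancelResults {
    static final int CANCEL_RESULT_SUCCESS = 0;
    static final int CANCEL_RESULT_UNKNOWN_ERROR = 1;
    static final int CANCEL_RESULT_NO_DRIVER_AVAILABLE = 2;

    // Turns a cancel result code into a short message suitable for logging.
    static String describe(int cancelResult) {
        switch (cancelResult) {
            case CANCEL_RESULT_SUCCESS:
                return "cancelled";
            case CANCEL_RESULT_UNKNOWN_ERROR:
                return "unknown error";
            case CANCEL_RESULT_NO_DRIVER_AVAILABLE:
                return "no driver available";
            default:
                return "unexpected result code " + cancelResult;
        }
    }
}
```

Usage would be something like CancelResults.describe(dispatcher.cancel("myJob")).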

How can we persist states and transitions in stateless4j based State Machine?

I am working on implementing a state machine for a workflow management system based on the Stateless4j API. However, I am not able to find an effective way to persist the states and transitions in Stateless4j.
As part of our use cases, we need to keep states alive for more than 3-4 days, until the user returns to the workflow. We will also have more than one workflow running concurrently.
Can you please share your insights on the best practices to persist states in Stateless4j based State Machine implementation?
It looks like what you need to do is construct your StateMachine with a custom accessor and mutator, something like this:
public class PersistentMutator<S> implements Action1<S> {
    Foo foo = null;
    @Inject
    FooRepository fooRepository;
    public PersistentMutator(Foo foo) {
        this.foo = foo;
    }
    @Override
    public void doIt(S s) {
        // Update the entity and write the new state through to storage
        foo.setState(s);
        fooRepository.save(foo);
    }
}
Then you want to call the constructor with your accessors and mutators:
/**
 * Construct a state machine with external state storage.
 *
 * @param initialState The initial state
 * @param stateAccessor State accessor
 * @param stateMutator State mutator
 */
public StateMachine(S initialState, Func<S> stateAccessor, Action1<S> stateMutator, StateMachineConfig<S, T> config) {
    this.config = config;
    this.stateAccessor = stateAccessor;
    this.stateMutator = stateMutator;
    stateMutator.doIt(initialState);
}
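To make the accessor/mutator idea concrete without pulling in stateless4j, here is a minimal sketch that uses java.util.function.Supplier and Consumer as stand-ins for Func<S> and Action1<S>, and an in-memory map in place of a real repository. WorkflowStateStore, accessorFor and mutatorFor are hypothetical names; a real implementation would read from and write to a database.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Supplier;

// In-memory stand-in for a persistent store, keyed by workflow id.
class WorkflowStateStore<S> {
    private final Map<String, S> states = new ConcurrentHashMap<>();

    // Accessor: reads the stored state for one workflow,
    // falling back to initialState if nothing has been saved yet.
    Supplier<S> accessorFor(String workflowId, S initialState) {
        return () -> states.getOrDefault(workflowId, initialState);
    }

    // Mutator: writes every new state through to the store.
    Consumer<S> mutatorFor(String workflowId) {
        return state -> states.put(workflowId, state);
    }
}
```

Because each workflow gets its own accessor/mutator pair keyed by id, several workflows can run concurrently and a state machine can be rebuilt days later from whatever the store holds.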
Alternatively, you may want to look at StatefulJ. It has built in support for atomically updating state in both JPA and Mongo out of the box. This may save you some time.
Disclaimer: I'm the author of StatefulJ