IBM Watson Retrieve and Rank tasks are repeating after being completed - ibm-cloud

I have two collections in my Retrieve and Rank cluster.
I have already completed TASKS for one collection and trained a ranker for it too.
But now, when I am training the second collection, my TASKS' QUESTIONS are repeating. This did not happen when I completed the tasks for the first collection.
All my responses are being saved, though. But due to the repeated questions, the Ranker Performance Status does not increase.
Even when I skip those questions, the TASK starts again with the same questions.
This is what my Priority Training task looks like, even after being completed:
E.g. the PRIORITY TRAINING TASK has 46 questions (let's say). After completing the whole task, the PRIORITY TRAINING TASK starts again with the same questions and the same recommended answers.

I am aware that this was happening for some time yesterday as a result of problems caused by Bluemix maintenance, which prevented communication between the question manager and the tasks service.
My understanding is that this has been resolved. Please try it again and let me know. If it's working now, please just mark this as accepted; if it's still a problem, add a comment and I'll look into it further.

Related

ADO: Analyzing Sprint performance

I am a PO leading a small development team for enhancements to our PeopleSoft Campus Solutions application for a Medical School.
We are using the Sprint functionality in ADO to assign stories from our backlog to the Sprint, create the relevant tasks for each story (mainly development, testing, deployment), and assign the tasks to resources who, in turn, provide effort values (original estimate, remaining, completed). We also make sure our capacity is properly set, with resources' OOO time and school holidays configured, to get an accurate team and resource capacity. The team updates their effort numbers daily to ensure we are tracking burndown.
While we always start the Sprint with the remaining work hours under team capacity (and the same at the resource level), we have historically left a lot of remaining work on the table at the end of the Sprint.
My leadership wants to answer the question "Why was the work left on the table?". Of course, there could be MANY reasons: we underestimated the effort, we were blocked on a task (for example, we can't start the testing task until the development is done), the resource didn't actually have the calculated capacity due to being pulled into other meetings or initiatives, or (and I don't think this is the case) people were just plain lazy.
What reports/analytics can I leverage to help answer this question? Even just seeing a list of remaining tasks per resource with remaining task effort and with a total amount of work remaining per resource overall would be helpful, but I can't seem to find anything.
Any suggestions or guidance is appreciated!
You can use Queries to find the remaining tasks (Column Options -> add Remaining Work) and save the query into Shared Queries.
There is a Query Results widget for the dashboard to display a query from Shared Queries. Do not forget to add Remaining Work to the widget's columns. (A scripted version of the same query, via the REST API, is sketched after the document link below.)
You could refer to the document: Adjust work to fit sprint capacity
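If you prefer to pull the same data programmatically (for example to build your own "remaining work per resource" report), the query can also be run against the Azure DevOps WIQL REST endpoint. The sketch below is only an illustration using the JDK's HTTP client; the organization, project, and PAT values are placeholders, and the field reference names are the standard ones for Remaining Work and Assigned To.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class RemainingWorkQuery {
    public static void main(String[] args) throws Exception {
        // WIQL for "all Task work items that still have Remaining Work".
        // Add an [System.IterationPath] UNDER '<project>\<sprint>' clause to scope it to one sprint.
        String wiql = "{ \"query\": \"SELECT [System.Id], [System.Title], [System.AssignedTo], "
                + "[Microsoft.VSTS.Scheduling.RemainingWork] "
                + "FROM WorkItems "
                + "WHERE [System.WorkItemType] = 'Task' "
                + "AND [Microsoft.VSTS.Scheduling.RemainingWork] > 0\" }";

        // Azure DevOps accepts a personal access token via basic auth with an empty user name.
        String pat = System.getenv("AZDO_PAT");
        String auth = Base64.getEncoder().encodeToString((":" + pat).getBytes());

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://dev.azure.com/your-org/your-project/_apis/wit/wiql?api-version=7.0"))
                .header("Authorization", "Basic " + auth)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(wiql))
                .build();

        // The WIQL endpoint returns the matching work item IDs; fetch the actual field values
        // (including Remaining Work per assignee) with a follow-up call to the work items endpoint.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}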

Architecture/Optimization of a scheduling problem on OptaPlanner

I'd like to generate a schedule with OptaPlanner, for the following problem:
"Task" is the planning entity
It has a fixed duration
It has an employee (planning variable) which must match required skills
It has a workstation (planning variable) which must match required attributes
It has dependencies: some tasks must start after some others
Some tasks have a deadline
At first, I tried chained/shadow variables, based on the OptaPlanner Task Assignment example. On my first attempt, I kept the employee as the anchor and left workstation management to the solver.
I quickly saw that there was a problem with this approach. Here, Task A and Task B (and all the others) influence one another even though they are not on the same chain, and the previous element of a chain does not have enough information to determine the start time of the task. Worst of all, since workstation changes are not tracked in this model, the solutions are just nonsense, as a workstation can be used by several tasks at the same time.
To fix the workstation-tracking problem, I added workstations to the anchors and created anchors for all employee/workstation combinations.
This way I know the employee and workstation of each chain. However, this does not solve the problem of the start time of tasks on one chain depending on tasks on other chains (e.g., Task A and Task D share the same workstation), and therefore task insertions/removals have repercussions on other chains, which is not the spirit of chained variables.
I ended up giving up on chained variables, as their usage does not seem to fit my problem.
So I modified my classes: the start time of each task is now a planning variable, resolved entirely by OptaPlanner's solver, driven by pure Drools rules and a ValueRangeProvider.
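For illustration, a minimal version of that non-chained model could look like the sketch below (class and field names are mine, not from the original project); the start time is a plain planning variable picked from a 1-minute grid supplied by a ValueRangeProvider on the solution class.

import java.util.List;

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.variable.PlanningVariable;

@PlanningEntity
public class Task {

    private int durationMinutes;              // fixed duration
    private List<Task> dependencies;          // tasks that must end before this one starts
    private Integer deadlineMinute;           // optional deadline, in planning minutes

    @PlanningVariable(valueRangeProviderRefs = "employeeRange")
    private Employee employee;                // must match required skills

    @PlanningVariable(valueRangeProviderRefs = "workstationRange")
    private Workstation workstation;          // must match required attributes

    @PlanningVariable(valueRangeProviderRefs = "startTimeRange")
    private Integer startMinute;              // resolved by the solver

    public Employee getEmployee() { return employee; }
    public Integer getStartMinute() { return startMinute; }
    public Integer getDeadlineMinute() { return deadlineMinute; }
    public List<Task> getDependencies() { return dependencies; }
    public int getDurationMinutes() { return durationMinutes; }

    public Integer getEndMinute() {
        return startMinute == null ? null : startMinute + durationMinutes;
    }
    // remaining getters/setters omitted
}

/*
 * On the @PlanningSolution class, the start-time grid (10,000 minutes, 1-minute granularity):
 *
 *   @ValueRangeProvider(id = "startTimeRange")
 *   public CountableValueRange<Integer> getStartTimeRange() {
 *       return ValueRangeFactory.createIntValueRange(0, 10_000);
 *   }
 */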
When I have only the following rules (two of them are sketched in code right after the list):
No employee overlap (Hard)
No workstation overlap (Hard)
Skills and attributes requirements (Hard)
Tasks with deadlines ending before deadline (Soft, could be Medium)
Tasks ending as soon as possible (sum of squared end times, Soft)
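For reference, here is how two of those rules could be written, not as the original DRL but with OptaPlanner's Constraint Streams API (recent 8.x versions), assuming the Task fields sketched earlier. This is only an illustration of the constraints, not the author's actual rules.

import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
import org.optaplanner.core.api.score.stream.Constraint;
import org.optaplanner.core.api.score.stream.ConstraintFactory;
import org.optaplanner.core.api.score.stream.ConstraintProvider;
import org.optaplanner.core.api.score.stream.Joiners;

public class SchedulingConstraintProvider implements ConstraintProvider {

    @Override
    public Constraint[] defineConstraints(ConstraintFactory factory) {
        return new Constraint[] { noEmployeeOverlap(factory), finishBeforeDeadline(factory) };
    }

    // Hard: the same employee cannot work on two tasks whose time windows overlap.
    private Constraint noEmployeeOverlap(ConstraintFactory factory) {
        return factory.forEachUniquePair(Task.class, Joiners.equal(Task::getEmployee))
                .filter((a, b) -> a.getStartMinute() < b.getEndMinute()
                        && b.getStartMinute() < a.getEndMinute())
                .penalize(HardSoftScore.ONE_HARD)
                .asConstraint("No employee overlap");
    }

    // Soft: tasks with a deadline should end before it, penalized by the overshoot.
    private Constraint finishBeforeDeadline(ConstraintFactory factory) {
        return factory.forEach(Task.class)
                .filter(t -> t.getDeadlineMinute() != null && t.getEndMinute() > t.getDeadlineMinute())
                .penalize(HardSoftScore.ONE_SOFT, t -> t.getEndMinute() - t.getDeadlineMinute())
                .asConstraint("Finish before deadline");
    }
}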
I can quite quickly get a solution that seems to be the best.
However, when I add dependencies between tasks (with a hard rule that walks down the task dependencies to check that no dependency ends after the task starts), the complexity seems to increase dramatically. For a few dozen tasks, with only 2 operators and 3 workstations, a satisfying solution (without unexpected holes between tasks) can take an hour to appear, with the following parameters:
Experiment length: 10,000 min
Granularity: 1 min
I also have a TaskDifficultyComparator to help the solver place the hardest tasks first (a possible sketch follows).
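For what it's worth, a difficulty comparator along those lines could be as simple as the sketch below; the ranking criteria are assumptions, so adapt them to whatever "hardest" means in your model.

import java.util.Comparator;

public class TaskDifficultyComparator implements Comparator<Task> {

    // "Harder" tasks here: more dependencies and longer durations, so the
    // construction heuristic places them first.
    @Override
    public int compare(Task a, Task b) {
        return Comparator.comparingInt((Task t) -> t.getDependencies().size())
                .thenComparingInt(Task::getDurationMinutes)
                .compare(a, b);
    }
}

// Registered on the entity:
// @PlanningEntity(difficultyComparatorClass = TaskDifficultyComparator.class)
// public class Task { ... }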
This is very long for so few tasks, and it will be far worse once I add a notion of user availability; I even suspect it will never converge, as task end times will "jump" depending on availabilities.
So my questions are:
Are there solutions leveraging chained variables that would suit my problem?
Are there any optimizations to the solver/rules/something else that could grant me a precious speed-up?

Azure Batch: Do I get charged after deleting job but still keep pool?

I found it takes a few minutes to create the pool and prepare the job, so I am thinking of keeping the pool to avoid that overhead.
Thanks
Lidong
I can try to answer this, but folks can correct me. AFAIK: refer to this document: https://azure.microsoft.com/en-us/pricing/details/batch/
As the Batch doc states: “Virtual machines are billed per-second rounded down to the last minute.”
Most of the pricing structure is well documented, and it seems like it's just a few cents per hour.
One general suggestion: once you are done with your compute, you can always re-scale your pool back to 0 nodes.
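As a rough, untested sketch with the classic Azure Batch Java SDK (com.microsoft.azure.batch), scaling a pool down to 0 dedicated and 0 low-priority nodes looks roughly like this; the account URL, account name, and pool id are placeholders, and the exact method signature may vary by SDK version.

import com.microsoft.azure.batch.BatchClient;
import com.microsoft.azure.batch.auth.BatchSharedKeyCredentials;

public class ShrinkPool {
    public static void main(String[] args) throws Exception {
        BatchSharedKeyCredentials cred = new BatchSharedKeyCredentials(
                "https://myaccount.westus.batch.azure.com",   // placeholder account URL
                "myaccount",                                  // placeholder account name
                System.getenv("AZURE_BATCH_KEY"));            // account key from the environment

        BatchClient client = BatchClient.open(cred);

        // Keep the pool definition but release its compute nodes so VM billing stops;
        // resize it back up when the next job needs to run.
        client.poolOperations().resizePool("mypool", 0, 0);
    }
}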
You probably want to add more to your question if you are keen to discuss a specific scenario, so that the right folks can answer in detail.
Hope this helps; most of the content above is general purpose, so add more detail if you need specifics. Thanks
Update: https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/

Job Shop: Arena

I'll try and keep it simple: I've started using Arena Simulation for study purposes, and up until now I've been unable to find any conclusive documentation or tutorial on how to create a job shop. If you could direct me to specific, practical documentation, or otherwise a helpful example that could get me started, that would be most helpful.
My problem: a given number of jobs must be processed through a given number of resources (machines); each job has a different route to take, and each has a different work time depending on the resource it is using.
Ex: for job_1 to be finished, it must first use resource_1 with a 5-second execution time, then resource_3 with a 3-second execution time, and finally resource_9 with a 1-second execution time. Of course, a different job has a totally different route and different execution times.
Here's an MS thesis I found...
http://www.scribd.com/doc/54342479/Simulation-of-Job-Shop-using-Arena-Mini-Project-Report
ADDENDUM:
The basic idea is to use ASSIGN to label the jobs with attribute variables reflecting their routing requirements. Those attributes can be read and used by decision blocks to route the job to the appropriate next workstation or to the exit. Perhaps these notes will be more useful to you than the MS thesis cited above. That's about all I can give you since I haven't used Arena for several years now -- I no longer have access to it and can't put together any specific examples.
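Arena models are wired up graphically, so there is no code to paste, but the routing idea above (ASSIGN writes the route as attributes, the decision blocks read them) can be illustrated in plain Java, purely as a sketch of the data those attributes would carry (names like resource_1 are placeholders taken from the example in the question).

import java.util.List;

record Step(String resourceId, int processingSeconds) {}

record Job(String name, List<Step> route) {}

public class JobShopRoutingSketch {
    public static void main(String[] args) {
        // The "attributes" an ASSIGN block would stamp on job_1: its route and per-step times.
        Job job1 = new Job("job_1", List.of(
                new Step("resource_1", 5),
                new Step("resource_3", 3),
                new Step("resource_9", 1)));

        // The equivalent of each decision block reading the next routing attribute.
        for (Step step : job1.route()) {
            System.out.printf("%s -> %s for %d s%n",
                    job1.name(), step.resourceId(), step.processingSeconds());
        }
        // After the last step, the job is routed to the exit.
    }
}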

WF performance with new 20,000 persisted workflow instances each month

Windows Workflow Foundation has a problem: it is slow when persisting WF instances.
I'm planning a project whose business layer will be based on WF-exposed WCF services. The project will have 20,000 new workflow instances created each month, and each instance could take up to 2 months to finish.
I was led to believe that, given WF's slowness at persistence, my problem would be unattainable for performance reasons.
I have the following questions:
Is this true? Will my performance be crap with that load (given WF persistence speed limitations)?
How can I solve the problem?
We currently have two possible solutions:
1. Each new business process request (e.g. "give me a new driver's license") will be a new WF instance, and the number of persistence operations will be limited by forwarding all status request operations to saved state values in a separate database.
2. Have only a small number of workflow instances up at any given time, without any persistence whatsoever (only in case of system crashes etc.), by breaking each workflow step into a separate workflow, with that workflow handling every business process request instance in the system that is currently at that step (e.g. I'm submitting my driver's license request form, which is step one... we have 100 cases of that, and my step-one workflow will handle every case simultaneously).
I'm very interested in a solution to this problem. If you want to discuss it, please feel free to mail me at nstjelja#gmail.com
The number of hydrated, executing workflows will be determined by environmental factors: memory, server throughput, etc. Persistence issues really only come into play if you are loading and unloading workflows all the time, i.e. in real(ish) time; in that case workflow may not be the best solution.
In my current project we also use WF with persistence. We don't have quite the same volume (perhaps ~2000 instances/month), and they are usually not as long to complete (they are normally done within 5 minutes, in some cases a few days). We did decide to split up the main workflow in two parts, where the normal waiting state would be. I can't say that I have noticed any performance difference in the system due to this, but it did simplify it, since our system sometimes had problems matching incoming signals to the correct workflow instance (that was an issue in our code; not in WF).
I think that if I were to start a new project based on WF I would rather go for smaller workflows that are invoked in sequence, than to have big workflows handling the full process.
To be honest I am still investigating the performance characteristics of workflow foundation.
However, if it helps, I have heard the WF team has made many performance improvements in the new release of WF 4.
Here are a couple of links that might help (if you haven't seen them already):
A Developer's Introduction to Windows Workflow Foundation (WF) in .NET 4 (discusses performance improvements)
Performance Characteristics of Windows Workflow Foundation (applies to WF 3.0)
WF on 3.5 had a performance problem. WF4 does not: 20,000 WF instances per month is nothing. If you were talking per minute, I'd be worried.