Can I delete a Cadence workflow domain, and if so, how? - cadence-workflow

I accidentally created a domain with the wrong name, or after some testing I want to delete the domain.
Should I do that? And if so, how?

It's strongly recommended not to delete a domain.
There can be data, such as tasks, associated with a domain in the Cadence system, and there is no tooling to clean them up yet. Simply deleting the domain will lead to corruption. For example, there may be a timer task scheduled for one year later in that domain. If the domain is deleted, everything may look fine right now, but one year later, when the timer fires, the system will be corrupted. By design, Cadence needs to be strongly consistent, so the server cannot simply skip a timer task.
In most cases you don't need to delete an existing domain; as long as you don't use it, you are fine. This includes cases where you created a domain with the wrong name or want to deprecate a domain. In those cases, just leave it alone.
Another case: you created a local domain but later realized it should be a global domain. It's recommended to simply ignore the local domain.
There are some slightly better reasons to delete a domain, for example, if in the above case you want to keep using the same domain name.
!!Danger Zone!!
The ONLY case in which you can delete a domain is when you are sure the domain has never been used at all.
The operation is as follows, using your database tool:
For SQL:
DELETE FROM domains WHERE name = '<yourDomain>' LIMIT 1;
For Cassandra:
SELECT domain FROM domains_by_name_v2 WHERE domains_partition=0 AND name = '<yourDomain>';
This will return the domainUUID.
Then delete the records from both tables:
DELETE FROM domains_by_name_v2 WHERE domains_partition=0 AND name = '<yourDomain>';
DELETE FROM domains WHERE id = <domainUUID>;

Related

Ansible: Check (GET) before Applying (POST/PUT) if applying is idempotent

I am writing roles in Ansible that use the ansible.builtin.uri module to do my bidding against the API of the service.
As I don't want to POST/PUT every time I run the playbook, I check whether the item I want to create already exists.
Does this make sense? In the end I introduce an extra step that GETs the status in order to skip the POST/PUT, when the POST/PUT itself would simply set what I want in the end anyway.
For example, I wrote an Ansible role that gives a user a role in Nexus.
Every time I run the role, it first checks whether the user already has the role and, if not, assigns it.
If I didn't check first and the user already had the role, it would simply be applied again.
But as I would like to know exactly what's going to happen, I believe it is better to explicitly check before applying.
What is the best practice for my scenario, and are there any arguments against checking first or for directly applying the changes?
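Purely to make that control flow concrete (this is a plain Java sketch against a hypothetical REST endpoint, not Ansible; the URL and resource path are made up), the check-before-write pattern amounts to a GET followed by a conditional PUT:
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CheckBeforeApply {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // Hypothetical endpoint representing "does the user already have the role?"
        URI resource = URI.create("https://nexus.example.com/service/rest/v1/users/jdoe/roles/deployer");

        HttpRequest check = HttpRequest.newBuilder(resource).GET().build();
        int status = client.send(check, HttpResponse.BodyHandlers.ofString()).statusCode();

        if (status == 404) {
            // Only write when the desired state is not already in place.
            HttpRequest apply = HttpRequest.newBuilder(resource)
                    .PUT(HttpRequest.BodyPublishers.noBody())
                    .build();
            client.send(apply, HttpResponse.BodyHandlers.discarding());
        }
        // If the GET returned 200, the role is already assigned and nothing changes.
    }
}
The ansible.builtin.uri equivalent is two tasks: a GET registered into a variable, and the POST/PUT task guarded by a when: condition on that result.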

How to effectively use Worker, WorkflowClient

Product use case - our product has a typical use case where we will have n users. Each user will have n workflows, and each workflow can be run any number of times.
I believe this is a typical use case for any workflow product.
Can I use a domain to differentiate users (I mean, creating a domain per user)?
Can I create one WorkflowClient per user to serve all of their workflow executions, or should I create one WorkflowClient per request? Which is the recommended approach?
What is the recommended approach for creating Worker objects to poll a task list?
Please forgive me if I have asked anything meaningless.
Can I use a domain to differentiate users (I mean, creating a domain per user)?
Yes, especially when these users work in different teams or products; using different domains prevents workflow names/IDs from conflicting with each other and also gives each domain independent quotas for managing traffic.
Can I create one WorkflowClient per user to serve all of their workflow executions, or should I create one WorkflowClient per request? Which is the recommended approach?
Use one WorkflowClient per domain, but let all WorkflowClients on the same instance share the same TChannel service connection to save TCP connections.
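For illustration, here is a minimal sketch using the Cadence Java client (assuming a recent cadence-java-client; the domain name, task list name, and the Echo workflow types are placeholders) that wires up one shared service connection, a WorkflowClient bound to a domain, and a Worker polling a task list:
import com.uber.cadence.client.WorkflowClient;
import com.uber.cadence.client.WorkflowClientOptions;
import com.uber.cadence.serviceclient.ClientOptions;
import com.uber.cadence.serviceclient.IWorkflowService;
import com.uber.cadence.serviceclient.WorkflowServiceTChannel;
import com.uber.cadence.worker.Worker;
import com.uber.cadence.worker.WorkerFactory;
import com.uber.cadence.workflow.WorkflowMethod;

public class WorkerSetup {

    // Placeholder workflow interface and implementation.
    public interface EchoWorkflow {
        @WorkflowMethod
        String echo(String input);
    }

    public static class EchoWorkflowImpl implements EchoWorkflow {
        @Override
        public String echo(String input) {
            return input;
        }
    }

    public static void main(String[] args) {
        // One connection to the Cadence frontend, shared by every client in this process.
        IWorkflowService service = new WorkflowServiceTChannel(ClientOptions.defaultInstance());

        // One WorkflowClient per domain, reusing the shared service connection.
        WorkflowClient client = WorkflowClient.newInstance(
                service, WorkflowClientOptions.newBuilder().setDomain("my-domain").build());

        // Workers poll a task list; register the workflow implementations they should execute.
        WorkerFactory factory = WorkerFactory.newInstance(client);
        Worker worker = factory.newWorker("my-task-list");
        worker.registerWorkflowImplementationTypes(EchoWorkflowImpl.class);
        factory.start();
    }
}
If you do serve multiple domains from the same host, you would create additional WorkflowClients against the same service instance rather than opening new connections.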
I would start with a single namespace (domain) for all users. Unless your users directly operate their workflow implementations it doesn't buy you much to use multiple namespaces.

Accessing multi-level fields in a CRM 2011 Workflow

Sorry if this is confusing; I'm not sure how to word it. I am trying to create a workflow that runs on Accounts in Microsoft CRM 2011. One part of this workflow requires me to retrieve a field contained in the Business Unit of the User in the Account's "Created By" field. However, the workflow will only allow me to access the Business Unit itself, not any of its fields.
I'm wondering if there is a simple trick or work-around that will allow me access to this data.
Thanks!
For reference, the Account has a User, who has a Business Unit, and the Business Unit has a field I need to access. CRM, however, doesn't want to let me get more than 2 levels deep when accessing fields.
Clunky but do-able if you accept a bit of denormalisation (temporarily or otherwise). I'll assume for the sake of example you want to get at the "cost centre" field from the BU.
Add a field on the User entity to temporarily hold the value from the BU (so make it the same type and length, text(100) in this case), and optionally put it on the form.
Create a child workflow for the User entity that updates the user with the "cost centre" value from their BU. Make it available to run only as a child workflow, not on demand or anything else. Activate it.
In your Account workflow, add a step to call the child workflow against the relevant user (eg Created By in your case).
Add a step to wait until the new cost centre field on the user record contains data.
Now do whatever you need to with the value from the user record, such as update the Account, or do some branched logic.
Whatever you do, once you have used the value, clear the field on the user record, or do this as the last step of the workflow.
Now, since Users don't change BU very often, you might actually just go ahead and keep that value on the User record permanently, and instead of a child workflow, simply run this on create of a new user, or on change of BU, and store the value permanently on the User record. Yes, it is 'denormalised' and not purest SQL design, but then you don't need a child workflow, you don't need a wait state and you don't have to clear the value at the end, or worry about what happens when two Accounts need to run their workflow at the same time. I include the more general approach above as this might apply to other records which do change their parent quite often.
Just an additional thought - you can access the "owning business unit" of the Account, but this will be the BU of the Owning User rather than the Created By; is your business process such that this would normally be the same person? (eg users only have Create privilege to "user owned" depth, so can only create records they own).
If so, then you could get at the BU directly from the Account, and then any fields on it too (in a condition or to update the Account)
Alternative which is less ideal but a similar approach - add a relationship from Account to BU (eg "created BU"). Now you can update the Account with this by referring to the Created By User's BU, then in the next step, reference this value from the Account. This is again denormalised, and less preferable since number of Accounts is far greater than number of users, so the level of duplicate information is much higher.
You can't get deeper with the standard steps of a workflow.
The solution is to create a custom workflow activity, you can start from this article:
http://msdn.microsoft.com/en-us/library/gg328515.aspx

Creation Concurrency with CQRS and EventStore

Baseline info:
I'm using an external OAuth provider for login. If the user logs into the external OAuth, they are OK to enter my system. However this user may not yet exist in my system. It's not really a technology issue, but I'm using JOliver EventStore for what it's worth.
Logic:
I'm not given a GUID for new users; I just have an email address.
I check my read model before sending a command. If the user email exists, I issue a Login command with the ID; if not, I issue a CreateUser command with a generated ID. My issue is the case of a new user.
A save occurs in the event store with the new ID.
Issue:
Assume two create commands are somehow issued before the read model is updated, due to a browser refresh or some other anomaly that occurs before consistency with the read model is achieved. That's OK; that's not my problem.
What Happens:
Because the new ID is a Guid comb, there's no chance the event store will know that these two CreateUser commands represent the same user. By the time they get to the read model, the read model will know (because they have the same email) and can merge the two records or take some other compensating action. But now my read model is out of sync with the event store which still thinks these are two separate entities.
Perhaps it doesn't matter because:
Replaying the events will have the same effect on the read model, so that should be OK.
Both commands are duplicate "Create" commands, so they should contain identical information; it's not like I'm losing anything in the event store.
Can anybody illuminate how they handled similar issues? If some compensating action needs to occur does the read model service issue some kind of compensation command when it realizes it's got a duplicate entry? Is there a simpler methodology I'm not considering?
You're very close to what I'd consider a proper possible solution. The scenario, if I may summarize, is somewhat like this:
Perform the OAuth-entication.
Using the read model decide between a recurring visitor and a new visitor, based on the email address.
In case of a new visitor, send a RegisterNewVisitor command message that gets handled and stored in the eventstore.
Assume there is some concurrency going on that, for the same email address, causes two RegisterNewVisitor messages, each containing what the system thinks is the key associated with the email address. These keys (guids) are different.
Detect this duplicate key issue in the read model and merge both read model records into one record.
Now, instead of merging the records in the read model, why not send a ResolveDuplicateVisitorEmailAddress { Key1, Key2 } command towards your domain model, leaving it up to the domain model (the codified form of the business decision to be taken) to resolve this issue. You could even have a dedicated read model to deal with these kinds of issues; the other read model would just get a DuplicateVisitorEmailAddressResolved event and project it into the proper records.
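As a rough sketch (every type name below is hypothetical and not tied to JOliver EventStore or any other framework), the domain-side handling of that command could look like this, with the read model later projecting the resulting event:
import java.util.UUID;

// Hypothetical command carrying the two keys the read model found for one email address.
final class ResolveDuplicateVisitorEmailAddress {
    final UUID key1;
    final UUID key2;
    ResolveDuplicateVisitorEmailAddress(UUID key1, UUID key2) {
        this.key1 = key1;
        this.key2 = key2;
    }
}

// Hypothetical event recording which registration survived the merge.
final class DuplicateVisitorEmailAddressResolved {
    final UUID survivingKey;
    final UUID retiredKey;
    DuplicateVisitorEmailAddressResolved(UUID survivingKey, UUID retiredKey) {
        this.survivingKey = survivingKey;
        this.retiredKey = retiredKey;
    }
}

// Hypothetical port for appending events to the store.
interface EventPublisher {
    void publish(Object event);
}

// The business decision (which registration survives) lives in the domain model,
// not in the read model.
final class VisitorDeduplicationHandler {
    private final EventPublisher publisher;

    VisitorDeduplicationHandler(EventPublisher publisher) {
        this.publisher = publisher;
    }

    void handle(ResolveDuplicateVisitorEmailAddress cmd) {
        // Any stable rule works; here the smaller UUID survives so the outcome is deterministic.
        UUID surviving = cmd.key1.compareTo(cmd.key2) <= 0 ? cmd.key1 : cmd.key2;
        UUID retired = surviving.equals(cmd.key1) ? cmd.key2 : cmd.key1;
        publisher.publish(new DuplicateVisitorEmailAddressResolved(surviving, retired));
    }
}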
Word of warning: You've asked a technical question and I gave you a technical, possible solution. In general, I would not apply this technique unless I had some business indicator that this is worth investing in (what's the frequency of a user logging in concurrently for the first time - maybe solving it this way is just a way of ignoring the root cause (flakey OAuth, no register new visitor process in place, etc)). There are other technical solutions to this problem but I wanted to give you the one closest to what you already have in place. They range from registering new visitors sequentially to keeping an in-memory projection of the visitors not yet in the read model.

How to Implement a Reliable Web Page Counter?

What's a good way to implement a Web Page counter?
On the surface this is a simple problem, but it gets tricky when dealing with search engine crawlers and robots, multiple clicks by the same user, and refresh clicks.
Specifically, what is a good way to ensure links aren't just 'clicked up' by a user repeatedly clicking? IP address? Cookies? Both of these have drawbacks (IP addresses aren't necessarily unique, cookies can be turned off).
Also, what is the best way to store the data? Increment a counter individually, or store each click as a record in a log table and summarize occasionally?
Any live experience would be helpful,
+++ Rick ---
Use IP Addresses in conjunction with Sessions. Count every new session for an IP address as one hit against your counter. You can store this data in a log database if you think you'll ever need to look through it. This can be useful for calculating when your site gets the most traffic, how much traffic per day, per IP, etc.
So I played around with this a bit based on the comments here. What I came up with is incrementing a counter stored in a simple field. In my app I have code snippet entities with a Views property.
When a snippet is viewed, a method filters out (white list) what should hopefully be just browsers:
public bool LogSnippetView(string snippetId, string ipAddress, string userAgent)
{
    // Ignore requests that don't send a user agent at all.
    if (string.IsNullOrEmpty(userAgent))
        return false;

    userAgent = userAgent.ToLower();

    // White list: only count user agents that look like real browsers.
    if (!(userAgent.Contains("mozilla") || userAgent.StartsWith("safari") ||
          userAgent.StartsWith("blackberry") || userAgent.StartsWith("t-mobile") ||
          userAgent.StartsWith("htc") || userAgent.StartsWith("opera")))
        return false;

    // Delegate to the stored procedure that throttles repeat views per IP.
    this.Context.LogSnippetClick(snippetId, ipAddress);
    return true;
}
The stored procedure then uses a separate table to temporarily hold the latest views, storing the snippet ID, entered date, and IP address. Each view is logged, and when a new view comes in it's checked to see whether the same IP address has accessed this snippet within the last 2 minutes. If so, nothing is logged.
If it's a new view, the view is logged (again SnippetId, IP, Entered) and the actual Views field is updated on the Snippets table.
If it's not a new view, the table is cleaned up by removing any logged views older than 4 minutes. This should result in a minimal number of entries in the view log table at any time.
Here's the stored proc:
ALTER PROCEDURE [dbo].[LogSnippetClick]
    -- Parameters for the stored procedure
    @SnippetId AS VARCHAR(MAX),
    @IpAddress AS VARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;

    -- Don't allow updating if this IP address has already
    -- clicked on this snippet in the last 2 minutes
    SELECT Id FROM SnippetClicks
    WHERE SnippetId = @SnippetId AND IpAddress = @IpAddress AND
          DATEDIFF(minute, Entered, GETDATE()) < 2

    IF @@ROWCOUNT = 0
    BEGIN
        INSERT INTO SnippetClicks
            (SnippetId, IpAddress, Entered) VALUES
            (@SnippetId, @IpAddress, GETDATE())
        UPDATE CodeSnippets SET Views = Views + 1
        WHERE Id = @SnippetId
    END
    ELSE
    BEGIN
        -- Clean up entries older than 4 minutes
        DELETE FROM SnippetClicks WHERE DATEDIFF(minute, Entered, GETDATE()) > 4
    END
END
This seems to work fairly well. As others mentioned this isn't perfect but it looks like it's good enough in initial testing.
If you get to use PHP, you may use sessions to track activity from particular users. In conjunction with a database, you may track activity from particular IP addresses, which you may assume are the same user.
Use timestamps to limit hits (assume no more than 1 hit per 5 seconds, for example), and to tell when new "visits" to the site occur (if the last hit was over 10 minutes ago, for example).
You may find $_SERVER[] properties that aid you in detecting bots or visitor trends (such as browser usage).
edit:
I've tracked hits & visits before, counting a page view as a hit, and +1 to visits when a new session is created. It was fairly reliable (more than reliable enough for the purposes I used it for). Browsers that don't support cookies (and thus don't support sessions) and users that disable sessions are fairly uncommon nowadays, so I wouldn't worry about it unless there is reason to be excessively accurate.
If I were you, I'd give up on my counter being accurate in the first place. Every solution (e.g. cookies, IP addresses, etc.), like you said, tends to be unreliable. So, I think your best bet is to use redundancy in your system: use cookies, "Flash-cookies" (shared objects), IP addresses (perhaps in conjunction with user-agents), and user IDs for people who are logged in.
You could implement some sort of scheme where any unknown client is given a unique ID, which gets stored (hopefully) on the client's machine and re-transmitted with every request. Then you could tie an IP address, user agent, and/or user ID (plus anything else you can think of) to every unique ID and vice-versa. The timestamp and unique ID of every click could be logged in a database table somewhere, and each click (at least, each click to your website) could be let through or denied depending on how recent the last click was for the same unique ID. This is probably reliable enough for short term click-bursts, and long-term it wouldn't matter much anyway (for the click-up problem, not the page counter).
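As a rough illustration of that throttling idea (a Java sketch; the in-memory map stands in for the database table described above, and ClickThrottle is a made-up name):
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Decides whether a click should be counted, based on how recently the same
// unique client ID was last counted.
public final class ClickThrottle {
    private final ConcurrentMap<String, Instant> lastCounted = new ConcurrentHashMap<>();
    private final Duration window;

    public ClickThrottle(Duration window) {
        this.window = window;
    }

    // Not strictly race-free, but close enough for a sketch.
    public boolean shouldCount(String clientId, Instant now) {
        Instant previous = lastCounted.get(clientId);
        if (previous != null && Duration.between(previous, now).compareTo(window) < 0) {
            // Same client inside the throttle window: deny and keep the original timestamp.
            return false;
        }
        lastCounted.put(clientId, now);
        return true;
    }
}
For example, new ClickThrottle(Duration.ofMinutes(2)).shouldCount(uniqueId, Instant.now()) mirrors the 2-minute rule above; a real implementation would also evict old entries, the way the stored procedure deletes rows older than a few minutes.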
Friendly robots should have their user agent set appropriately and can be checked against a list of known robot user agents (I found one here after a simple Google search) in order to be properly identified and dealt with separately from real people.