Configuring the Node Identifier Option in Exhaustive CHAID in SPSS Data Modeler - spss-modeler

According to IBM's online help:
Optionally, for CHAID, QUEST, and C&R Tree models, an additional field can be added that indicates the ID for the node to which each record is assigned.
I cannot find that option. I am using an (exhaustive) CHAID which adds the $R- (prediction field) variable but there is no $RI- (node identifier field) variable. Just in case IBM was being literal I checked running a regular CHAID (not exhaustive) but still without getting the $RI-variable I need.
I know that in SPSS v. 25 this is easily configured so is IBM just confused in their online help for modeler, or am I missing something obvious? Thanks in advance for any help.

To get the rule identifier added to the data set, you first need to train the model to generate the model nugget.
You can then edit (or open) the model nugget and select the "Settings" tab. Here you will find the option "Rule identifier", which must be checked to include the ID of the node to which each record is assigned.
It is important to realize that this is a setting in the generated model nugget and not in the modeling node. This also means that the setting must be rechecked each time the model is retrained and the nugget is regenerated.

Related

What do we enter in the parameter field when we use "most trusted source" as the survivorship function (i.e., using t-swoosh algorithm in tMatchGroup)

I would like to create a master record from customer listings in multiple sources (i.e., Golden Customer Record / Master Data Management) using a Talend job. Research indicates that tMatchGroup is the best component for this, as it is capable of merging records based on survivorship rules.
My question is, if I would like to use the "most trusted source" survivorship function, how do I list the source ranking in the parameter field when I use the t-swoosh algorithm? The documentation does not show how to do this and I can't find anything online.
This is the documentation I am referring to. Any advice would be much appreciated.

How to bring dropdown at top

The dropdown does not appear in front. I have created a custom combo control with tree behaviour.
According to the doc Add a custom field to a work item type (Inheritance process), you can add a custom field to track data requirements that aren’t met by the existing set of fields.
Currently, the Area Path and Iteration Path fields exist with the Tree path data type, but their values are defined on the Project configuration and Team configuration pages; see Define area paths and assign to a team and Define iteration paths and configure team iterations.
However, the tree type is not available for a new field, as below.
That said, this is a good suggestion. You can create a suggestion ticket here. The product group reviews these tickets regularly and will consider adding it to the roadmap.

How to create a Derived Column in IIDR CDC for Kafka Topics?

We are currently working on a project to get data from an IBM i (formerly known as AS400) system with IBM IIDR CDC to Apache Kafka (Confluent Platform).
So far everything has worked fine: everything gets replicated and appears in the topics.
Now we are trying to create a derived column in a table mapping which gives us the journal entry type from the source system (IBM i).
We would like to have this information to see whether it was an insert, update, or delete operation.
Therefore we created a derived column called OPERATION as Char(2) with the expression &ENTTYP.
But unfortunately the Kafka Topic doesn't show the value.
Can someone tell me what we were missing here?
Best regards,
Michael
I own the IBM IDR Kafka target, so let's see if I can help a bit.
So you have two options. The recommended way to see audit information would be to use one of the audit KCOPs. For instance you might use this one...
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/tasks/kcopauditavroformat.html#kcopauditavroformat
You'll note that the audit.jcf property in the example is set to CCID and ENTTYP, so you get both the operation type and the transaction id.
Now if you are using derived columns I believe you would follow the following procedure... https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.mcadminguide.doc/tasks/addderivedcolumn.html
If this is not working out, open a ticket and the L2 folks will provide a deeper debug. Oh also if you end up adding one, does the actual column get created in the output, just with no value in it?
Cheers,
Shawn
Your colleagues told me how to do it:
DR Management Console -> Go to the "Filtering" tab -> find the "Derived Column" column in the "Filter Columns" (Source Columns) section and mark "replicate" next to the column. Save the table mapping afterwards and see if it appears now.
Unfortunately a derived column isn't automatically selected for replication, but now I know how to select it.
You need to duplicate the new column on filter:
https://www.ibm.com/docs/en/idr/11.4.0?topic=mstkul-mapping-audit-fields-journal-control-fields-kafka-targets

Auto-Numeric Nugget Ignores Splits in SPSS Modeler

I'm trying to explore a continuous target variable in SPSS Modeler v. 18.2, using a split variable ("Cohort"). In other models that have a nominal target variable, I'm able to use the auto-classifier to generate models on each split---but in this model when I use the auto-numeric node it ignores the splits entirely. Here is the stream:
In the data file, I have "Cohort" set to Split:
In the node, in the Fields tab, I have added Cohort to the splits...
...and in the Model tab I have checked the build model for each split box:
The nugget doesn't include the splits---in the Summary tab it doesn't look like it's in the model at all:
My work-around is to use Select nodes for each split but that has disadvantages---thank you in advance for any help/corrections.
I am currently using IBM SPSS Modeler 18.0 but I am seeing the exact same behavior when using one of the demo data sets supplied with Modeler. I would consider this to be a defect and something that would need to be addressed by IBM's development team.
I suggest that you replicate the issue with one of the data sets from the "Demos" folder, such as "car_insurance_claims.sav", and then open a support ticket with IBM SPSS technical support to have this resolved.

Enterprise Architect: Setting run state from initial attribute values when creating instance

I am on Enterprise Architect 13.5, creating a deployment diagram. I am defining our servers as nodes, and using attributes on them so that I can specify their details, such as Disk Controller = RAID 5 or Disks = 4 x 80 GB.
When dragging instances of these nodes onto a diagram, I can select "Set Run State" on them and set values for all attributes I have defined - just like it is done in the deployment diagram in the EAExample project:
Since our design will have several servers using the same configuration, my plan was to use the "initial value" column in the attribute definition on the node to specify the default configuration so that all instances I create automatically come up with reasonable values, and when the default changes, I would only change the Initial Values on the original node instead of having to go to all instances:
My problem is that even though I define initial values, all instances I create do not show any values when I drag them onto the diagram. Only by setting the Run State on each instance, I can get them to show the values I want:
Is this expected behavior? Btw, I can reproduce the same using classes and instances of them, so this is not merely a deployment diagram issue.
Any ideas are greatly appreciated! I'm also thankful if you can describe a better way to achieve the same result with EA, in case I am doing it wrong.
What you could do is to either write a script to assist with it or even create an add-in to bring in more automation. Scripting is easier to implement but you need to run the script manually (which however can add the values in a batch for newly created diagram objects). Using an add-in could do this on element creation if you hook to EA_OnPostNewElement.
What you need to do is to first get the classifier of the object. Using
Repository.GetElementByID(object.ClassifierID)
will return that. You then can check the attributes of that class and make a list of those with an initial value. Finally you add the run states of the object by assigning object.RunState with a crude string. E.g. for a != 33 it would be
#VAR;Variable=a;Value=33;Op=!=;#ENDVAR;
Just join as many as you need for multiple run states.
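The string-building step can be sketched on its own, outside of EA. The following is a minimal Python illustration, assuming the entry format shown above; the function names are hypothetical, and so is the use of "=" as the operator for a plain default value. Reading the classifier's attributes and assigning the result to object.RunState would still have to happen in an EA script or add-in.

```python
def run_state_entry(name, value, op="="):
    """Build one run state entry in the format quoted in the answer above.

    The default operator "=" is an assumption for plain default values.
    """
    return f"#VAR;Variable={name};Value={value};Op={op};#ENDVAR;"


def run_state_from_defaults(attributes):
    """Join entries for every attribute that has a non-empty initial value.

    `attributes` is a list of (name, initial_value) pairs, standing in for
    the attributes read from the classifier element (hypothetical input).
    """
    return "".join(
        run_state_entry(name, default)
        for name, default in attributes
        if default  # skip attributes without an initial value
    )


if __name__ == "__main__":
    # The single-entry example from the answer: a != 33
    print(run_state_entry("a", 33, "!="))
    # Defaults taken from the question's server node; "Notes" has no default
    print(run_state_from_defaults([
        ("Disk Controller", "RAID 5"),
        ("Disks", "4 x 80 GB"),
        ("Notes", ""),
    ]))
```

The joined result is what would be assigned to the object's run state, one entry per attribute that carries an initial value.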