Is it even possible to really monitor links in a DataStage job?

DS Version 11.7
In the DSODBConfig.cfg file, we set MonitorLinks=1.
The documentation tells us that this setting "Controls if stage-level and link-level statistics, and references to data locators, are captured at the end of each job run."
But when we inspect the DSODB tables, we only find data for links that are either a data source or a data target; all data for intermediate links is missing.
Q:
Is there a way to enable logging for all links, including the intermediate ones?
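
One way to see exactly what was captured is to query the operations database directly. Below is a minimal Python sketch using pyodbc; the DSN, credentials, and the DSODB.JOBRUNLINK table name are assumptions based on the DSODB schema documentation, so verify them against your installation before relying on this.

    # Minimal sketch: dump the link-level rows DSODB captured, to confirm
    # whether only source/target links are present. The DSN and the
    # DSODB.JOBRUNLINK table name are assumptions -- check them against
    # your own DSODB schema.
    import pyodbc

    conn = pyodbc.connect("DSN=DSODB;UID=dsodb;PWD=...")  # connection details illustrative
    cur = conn.cursor()
    cur.execute("SELECT * FROM DSODB.JOBRUNLINK")  # link-level statistics table
    cols = [d[0] for d in cur.description]         # column names from the driver
    for row in cur.fetchall():
        print(dict(zip(cols, row)))                # one captured link per row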

Related

How to quickly locate which sheets/dashboards contain a field?

I am creating a data dictionary, and I am supposed to track the location of every used field in a workbook. For example (Superstore sample data), I need to specify which sheets/dashboards contain the [Sub-Category] field.
My dataset has hundreds of measures/dimensions/calculated fields, so it's incredibly time-consuming to click into every single sheet/dashboard just to see if a field is used there. Is there a quicker way to do this?
One robust, but not free, approach is to use Tableau's Data Catalog, which is part of the Tableau Server Data Management Add-On.
Another option is to build your own cross-reference. You could start with Chris Gerrard's Ruby libraries, described in the article at http://tableaufriction.blogspot.com/2018/09/documenting-dashboards-and-their.html
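
If you have the workbook as a .twb file (or extract the .twb from a .twbx), it is plain XML, so a rough cross-reference can be scripted. Here is a hedged Python sketch along those lines; the file and field names are illustrative, and a plain substring match will miss renamed or aliased fields.

    # Rough sketch: list the worksheets in a .twb that reference a given field.
    # A .twb is XML with one <worksheet> element per sheet; this does a simple
    # substring check on each worksheet's XML, so calculated-field aliases and
    # renamed fields may need extra handling.
    import xml.etree.ElementTree as ET

    def sheets_using_field(twb_path, field_name):
        root = ET.parse(twb_path).getroot()
        hits = []
        for ws in root.iter("worksheet"):              # one element per sheet
            xml_text = ET.tostring(ws, encoding="unicode")
            if field_name.lower() in xml_text.lower(): # crude but effective match
                hits.append(ws.get("name"))
        return hits

    print(sheets_using_field("Superstore.twb", "Sub-Category"))  # illustrative names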

How to know what variables can be accessed in Mirth?

How to know what variables can be accessed for any given screen in Mirth (v3.10)?
E.g. when setting up a Channel Source with the Database Reader connector type, I realized purely by experimentation that the Post-Process SQL query can access variables for the fields being read by the main SQL query; nothing on the screen indicated this was possible. I see a blank box to the right of the Post-Process SQL query box, but it's empty, so I don't know what it's for (unlike the variables I can drag from the panel to the right of the File Writer channel destination connector).
The "Velocity Variable Replacement" and "Variable Maps" sections of the Mirth user manual did not have much detail on this question either.
Does anyone with more experience know where I could find this information (a link, something in the UI itself, documentation somewhere)?

AnyLogic - automatic model creation

Can AnyLogic create a model automatically if we provide the initial data in an Excel file or database, so that the model is populated with blocks like Delay, Resource Pool, etc.?
Sure, but it is not for beginners. I recently gave a full webinar about it; see https://youtu.be/casVdmKC-S0

Best practices for parameterizing load of multiple CSV files in Data Factory

I am experimenting with Azure Data Factory to replace some other data-load solutions we currently have, and I'm struggling to find the best way to organize and parameterize the pipelines to provide the scalability we need.
Our typical pattern is that we build an integration for a particular Platform. This "integration" is essentially the mapping and transformation of fields from their data files (CSVs) into our Stage1 SQL database, and by the time the data lands there, the data types and indexes should be set properly.
Within each Platform, we have Customers. Each Customer has their own set of data files that get processed in that Customer's context -- within the scope of a Platform, all Customer files follow the same schema (or close to it), but they all get sent to us separately. If you looked at our incoming file store, it might look like this (simplified; there are 20-30 source datasets per customer, depending on platform):
Platform
    Customer A
        Employees.csv
        PayPeriods.csv
        etc.
    Customer B
        Employees.csv
        PayPeriods.csv
        etc.
Each customer lands in their own SQL schema. So after processing the above, I should have CustomerA.Employees and CustomerB.Employees tables. (This allows a little bit of schema drift between customers, which does happen on some platforms. We handle it later in our stage 2 ETL process.)
What I'm trying to figure out is:
What is the best way to set up ADF so I can effectively manage one set of mappings per platform and automatically accommodate any new customers we add to that platform, without having to change the pipeline/flow?
My current thinking is to have one pipeline per platform, and one dataflow per file per platform. The pipeline has a variable, "schemaname", which is set from the path of the file that triggered it (e.g. "CustomerA"). Then, depending on the file name, a branching conditional fires the right dataflow: if it's "employees.csv" it runs one dataflow, if it's "payperiods.csv" it runs a different one. They would all use the same generic target sink dataset, with the table name parameterized and those parameters set in the pipeline from the schema variable and the file name from the conditional branch.
Are there any pitfalls to setting it up this way? Am I thinking about this correctly?
This sounds solid. Just be aware that if you define column-specific mappings with expressions that expect those columns to be present, you may get data flow execution failures when those columns are missing from a customer's source files.
The way to protect against that in ADF Data Flows is to use column patterns, which let you define mappings that are generic and more flexible.
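
To make the routing idea concrete, here is a small Python sketch of the logic the pipeline would encode: derive the schema name from the trigger path, choose a dataflow by file name, and build the parameterized target table name. This is plain pseudologic, not ADF expression syntax, and all names are illustrative.

    # Sketch of the routing described above (not ADF syntax): the schema name
    # comes from the customer folder in the trigger path, the dataflow is
    # chosen by file name, and the sink table is parameterized from both.
    from pathlib import PurePosixPath

    DATAFLOW_BY_FILE = {                    # one dataflow per file per platform
        "employees.csv": "LoadEmployees",
        "payperiods.csv": "LoadPayPeriods",
    }

    def route(trigger_path: str):
        p = PurePosixPath(trigger_path)
        schema_name = p.parent.name         # e.g. "CustomerA" from the folder path
        dataflow = DATAFLOW_BY_FILE[p.name.lower()]
        target_table = f"{schema_name}.{p.stem}"  # e.g. "CustomerA.Employees"
        return dataflow, target_table

    print(route("Platform/CustomerA/Employees.csv"))  # ('LoadEmployees', 'CustomerA.Employees')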

IBM Datastage reports failure code 262148

I realize this is a bad question, but I don't know where else to turn.
Can someone point me to where I can find the list of reports failure codes for IBM? I've tried searching for it in the IBM documentation and in general Google searches, but this particular error is unusual and I've never seen it before.
I'm trying to find out what code 262148 means.
Background:
I built a datastage job that has:
ORACLE CONNECTOR --> TRANSFORMER --> HIERARCHICAL DATA
The intent is to pull data from an Oracle table and output the result of the select statement to a JSON file. I'm using the Hierarchical Data stage to build the JSON. When tested within the stage, there are no problems; I see the JSON output.
However, when I run the job, it squawks:
reports failure code 262148
then the job aborts. There are no warnings, no signs, no errors prior to this line.
Until I know what it is, I can't troubleshoot.
If someone can point me to where the list of failure codes is, I can proceed.
Thanks!
"Can someone point me to where I can find the list of reports failure codes for IBM?"
Here you go:
https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_73/rzahb/rzahbsrclist.htm
While this list does not include your specific error code, it does categorize many other codes and explains how the code breakdown works. The list is not specifically for DataStage, but in my experience IBM conventions are generally consistent across products. In this list, every code that starts with a 2 is a disk failure, so maybe run a disk checker. That's the best I've got as far as error codes.
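
One other angle worth trying: large codes like this are sometimes composite values, and the hex form makes the parts visible. 262148 is 0x00040004, i.e. a high 16-bit word of 4 and a low word of 4. Whether DataStage actually packs its failure codes this way is an assumption, but the arithmetic itself is easy to check:

    # Decompose the failure code into hex and 16-bit words. That 262148 equals
    # 0x00040004 is plain arithmetic; whether DataStage packs two words into
    # one code this way is an assumption worth confirming with IBM support.
    code = 262148
    print(hex(code))                   # 0x40004
    print(code >> 16, code & 0xFFFF)   # high word: 4, low word: 4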
Without knowledge of the inner workings of the product, there is not much more you can do beyond checking system health in general (especially disk, network, and permissions in this case). Personally, I prefer to go after internal knowledge whenever external knowledge proves insufficient. I would start with a network capture, as I'm sure there's a socket involved in the connection between the layers. Compare a capture taken while the select statement runs from within the Hierarchical Data stage's test mode with one taken while the full job runs; there may be clues in there, like reset or refused connections.