Customize logName in GKE - kubernetes

We recently had to upgrade our GKE logging to Cloud Operations for GKE (per https://cloud.google.com/stackdriver/docs/solutions/gke/migration#what-is-changing). Per that document, an interesting change to the logName field occurred: it was previously based on the container name, but is now just "projects/{PROJECTID}/logs/stdout".
This normally would not be a problem, but we rely heavily on Logging-to-BigQuery sinks to analyze our log data. Since BigQuery log sinks use the logName for the tables they generate, every table we produce is now "stdout_*" instead of the container name. This is very confusing, makes it harder to use shared datasets, and is generally bad from a naming point of view. I have already filed a feature request with Google to be able to customize the BQ sink table name, but that does not help our use case right now.
If we could change the logName, that would change the BQ table name as well. I have gone through Google's documentation for their logging agent but have not found a way to edit this logName field.
Options I have considered:
A cron job that copies the tables to new table names (maintenance overhead and a small increase in cost; see the sketch after this list)
Disabling default logging and using a custom logging solution (a large time investment, and it is not clear that it would help here)
Biting the bullet and just using the stdout_* table names (very close to choosing this option)
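For what it's worth, the first option can be a fairly small amount of code. Below is a rough sketch in Python using the google-cloud-bigquery client; the project, dataset, and container names are placeholders, and it assumes the standard Logging-to-BigQuery export layout (date-sharded stdout_YYYYMMDD tables with the container name exposed as resource.labels.container_name), so verify those details against your own dataset. Rather than a straight copy, it filters each container's rows out of yesterday's stdout shard into its own table, since the stdout table now mixes all containers.

# Rough sketch: split yesterday's stdout_YYYYMMDD shard into per-container
# tables. Assumes the standard Cloud Logging -> BigQuery export schema,
# where the GKE container name appears as resource.labels.container_name.
# Project, dataset, and container names are placeholders.
from datetime import date, timedelta

from google.cloud import bigquery

PROJECT = "my-project"          # placeholder
DATASET = "gke_logs"            # dataset the log sink writes to (placeholder)
CONTAINERS = ["api", "worker"]  # containers to split out (placeholder)

client = bigquery.Client(project=PROJECT)
shard = (date.today() - timedelta(days=1)).strftime("%Y%m%d")

for container in CONTAINERS:
    job_config = bigquery.QueryJobConfig(
        destination=f"{PROJECT}.{DATASET}.{container}_{shard}",
        write_disposition="WRITE_TRUNCATE",
        query_parameters=[
            bigquery.ScalarQueryParameter("container", "STRING", container)
        ],
    )
    sql = f"""
        SELECT *
        FROM `{PROJECT}.{DATASET}.stdout_{shard}`
        WHERE resource.labels.container_name = @container
    """
    client.query(sql, job_config=job_config).result()

A BigQuery scheduled query could do much the same without an external cron job, at the cost of maintaining the SQL yourself.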
Is there any way to customize the logName from a k8s container?


Is it even possible to really monitor links in a DataStage job?

DS Version 11.7
In the DSODBConfig.cfg file, we set MonitorLinks=1.
The documentation tells us that this setting "controls if stage-level and link-level statistics, and references to data locators, are captured at the end of each job run."
But when we inspect the DSODB tables, we only find data for links that are either a data source or a data target. All data for intermediate links is missing.
Q: Is it somehow possible to enable logging for really all links, including the intermediate ones?

How to create a Derived Column in IIDR CDC for Kafka Topics?

We are currently working on a project to get data from an IBM i (formerly known as AS/400) system to Apache Kafka (Confluent Platform) with IBM IIDR CDC.
So far everything has been working fine: everything gets replicated and appears in the topics.
Now we are trying to create a derived column in a table mapping that gives us the journal entry type from the source system (IBM i).
We would like that information so we can see whether it was an insert, update, or delete operation.
Therefore we created a derived column called OPERATION as Char(2) with the expression &ENTTYP.
Unfortunately, the Kafka topic doesn't show the value.
Can someone tell me what we are missing here?
Best regards,
Michael
I own the IBM IDR Kafka target, so let's see if I can help a bit.
You have two options. The recommended way to see audit information would be to use one of the audit KCOPs. For instance, you might use this one:
https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdckafka.doc/tasks/kcopauditavroformat.html#kcopauditavroformat
You'll note that the audit.jcf property in the example is set to CCID and ENTTYP, so you get both the operation type and the transaction id.
If you are using derived columns, I believe you would follow this procedure: https://www.ibm.com/support/knowledgecenter/en/SSTRGZ_11.4.0/com.ibm.cdcdoc.mcadminguide.doc/tasks/addderivedcolumn.html
If this is not working out, open a ticket and the L2 folks will provide deeper debugging. Also, if you do end up adding one, does the actual column get created in the output, just with no value in it?
Cheers,
Shawn
Your colleagues told me how to do it:
In the IDR Management Console, go to the "Filtering" tab, find the derived column in the "Filter Columns" (Source Columns) section, and mark "replicate" next to it. Save the table mapping afterwards and see if it appears now.
Unfortunately a derived column isn't automatically selected for replication, but now I know how to select it.
You need to select the new column on the filter as well:
https://www.ibm.com/docs/en/idr/11.4.0?topic=mstkul-mapping-audit-fields-journal-control-fields-kafka-targets

Get Incident details such as assigned to, analyst comments, Incident ID, etc. using a query in Logs

I am investigating incidents, but I need to tie them to the SOC analyst who worked on them and the comments they added. I am not able to find these details in any table.
This would be helpful for pulling together metrics for the SOC team.
Where can I find this information?
Understandably, these are difficult to expose right now. They are located in the AzureActivity table (with the Azure Activity Data Connector enabled).
We will be making this much easier very soon with a new table specific to Incidents.
In the interim, here's a KQL snippet that you can use to start sifting through the results for Incidents in the AzureActivity table:
AzureActivity
| where _ResourceId has "Microsoft.SecurityInsights" and _ResourceId has "incidents"
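If it helps, here is a small, hedged Python sketch that runs the same query with the azure-monitor-query client and projects a couple of standard AzureActivity columns (Caller, OperationNameValue) to show who acted on an incident and what the operation was. The workspace ID is a placeholder, and you should check the projected columns against your own workspace schema.

# Sketch only: run the AzureActivity incident query from Python.
# Requires the azure-monitor-query and azure-identity packages; the
# workspace ID is a placeholder.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

WORKSPACE_ID = "<log-analytics-workspace-id>"  # placeholder

QUERY = """
AzureActivity
| where _ResourceId has "Microsoft.SecurityInsights" and _ResourceId has "incidents"
| project TimeGenerated, Caller, OperationNameValue, _ResourceId
"""

client = LogsQueryClient(DefaultAzureCredential())
result = client.query_workspace(WORKSPACE_ID, QUERY, timespan=timedelta(days=7))

for table in result.tables:
    for row in table.rows:
        # Each row pairs up with the column names of its table.
        print(dict(zip(table.columns, row)))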

Two types of user in a database: how might I redesign my database?

I have a database in production, with two types of users. Similar to Uber, we have a 'Rider' and a 'Driver' user.
In my database, I have Users, Rider and Driver tables. The User table contains shared data between the two user types, and the Driver and Rider tables contain the remaining data.
When the database was originally designed, it was not anticipated that a Driver might also want to be a Rider. This use case has now arisen and I am unsure how to handle the database tables.
Currently, the email_address field has a unique constraint. The user also has a user_type field, which is either Rider or Driver.
My current thought is to remove the unique constraint on email_address and create a unique index on (email_address, user_type), which would allow users to use both sides of the application.
This does create the problem of needing to specify which type of user is being worked with, for example, when calling /login. I think I would now need to do something like /login?type=rider.
Are there any better approaches to this? We are in production, so I can't just scrap and recreate, but I am open to migrating data if there is a better solution.
My suggestion is a bit too big for a comment, so hear me out.
An exact solution is difficult, as I have not seen how you use these tables in the application, so this design rests on some assumptions.
At login time, you need to determine whether the user wants to log on as a driver or as a rider. Typically, the link for drivers should be different from the link for riders, or there should be a flag.
For riders, you don't need additional validation, but not everyone should be allowed to log in as a driver (e.g. if a user tries to log on as a driver by mistake, the system should not allow it).
That means your existing user and driver tables do not need any modification. If the user is not a driver, the role should default to rider.
If you are a rider logging in as a rider, no change is needed.
I am assuming you have a role-setting mechanism somewhere in your code base.
However, if you are a driver logging in as a driver, your existing logic should continue with this additional flag, i.e. the code should check that the flag is present and only then look up the driver in your tables. If you are a driver coming in as a rider, the flag has to be set to no.
Now you have two options. The first is to read the driver-coming-in-as-rider data from the driver table only; the second is to make an entry in the rider table and use that. Both approaches will work (you will have to tweak your application logic based on the flag to reach the correct table). My suggestion is to go with the rider table, which should reduce further complications in the system. So basically, one user who is sometimes a driver and sometimes a rider would have entries in both the driver and rider tables (assuming your unique constraint is only on the user table), and you should use the correct entry based on the role the person logs in with. Hopefully, once the login uses the correct table, changes to the rest of the system will be minimal.
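To make that concrete, here is a rough sketch of the login flow described above, written in Python against SQLite purely for illustration. The table and column names (users, drivers, riders, email_address) are placeholders based on the question's schema, and the lazy rider-record creation assumes riders.user_id has a unique constraint.

# Rough illustration of the role-flag login flow; schema names are
# placeholders based on the question, not a real implementation.
import sqlite3

def login(conn: sqlite3.Connection, email: str, requested_role: str) -> dict:
    user = conn.execute(
        "SELECT id FROM users WHERE email_address = ?", (email,)
    ).fetchone()
    if user is None:
        raise ValueError("unknown user")
    user_id = user[0]

    if requested_role == "driver":
        # Only honour the driver flag if a driver record actually exists.
        driver = conn.execute(
            "SELECT id FROM drivers WHERE user_id = ?", (user_id,)
        ).fetchone()
        if driver is None:
            raise PermissionError("this account is not registered as a driver")
        return {"user_id": user_id, "role": "driver"}

    # Anyone else, including a driver riding as a passenger, defaults to
    # rider; create the rider record lazily (assumes riders.user_id is unique).
    conn.execute("INSERT OR IGNORE INTO riders (user_id) VALUES (?)", (user_id,))
    conn.commit()
    return {"user_id": user_id, "role": "rider"}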

Billing by tag in Google Compute Engine

Google Compute Engine allows for a daily export of a project's itemized bill to a storage bucket (.csv or .json). In the daily file I can see X-number of seconds of N1-Highmem-8 VM usage. Is there a mechanism for further identifying costs, such as per tag or instance group, when a project has many of the same resource type deployed for different functional operations?
As an example, say 10 N1-Highmem-8 VMs are deployed to a region in a project. In the daily bill they just display as X seconds of N1-Highmem-8.
Functionally:
2 VMs might run a database 24x7
3 VMs might run a batch analytics operation averaging 2-5 hrs each night
5 VMs might perform a batch operation that runs in sporadic 10-minute intervals through the day
The final operation writes data to a specific GCS bucket, while the other operations read/write to different buckets.
How might costs be broken out across these four operations each day?
The usage logs do not provide per-tag granularity at this time, and they can be a little tricky to work with, but here is what I recommend.
To break the usage logs down further and get better information out of them, I'd suggest working like this:
Your usage logs provide the following fields:
Report Date
MeasurementId
Quantity
Unit
Resource URI
ResourceId
Location
If you look at the MeasurementId, you can filter by the type of image you want to verify. For example, VmimageN1Standard_1 represents an n1-standard-1 machine type.
You can then use the MeasurementId in combination with the Resource URI to find out what your usage is on a more granular (per-instance) scale. For example, the Resource URI for my test machine would be:
https://www.googleapis.com/compute/v1/projects/MY_PROJECT/zones/ZONE/instances/boyan-test-instance
Note: I've replaced "MY_PROJECT" and "ZONE" here; those would be specific to your output, along with the name of the instance.
If you look at the end of the URI, you can clearly see which instance it is for. You could then use this to look for a specific instance you're checking.
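If you'd rather script it than eyeball the CSV, something along these lines works as a starting point. This is just a sketch: the file name is a placeholder, and the column headers follow the field list above (MeasurementId, Quantity, Resource URI), so adjust them to match your actual export.

# Sketch: total usage per instance for one machine type from a usage-log CSV.
# File name is a placeholder; column headers follow the fields listed above.
import csv
from collections import defaultdict

USAGE_FILE = "usage_gce_20170901.csv"   # placeholder
MEASUREMENT = "VmimageN1Standard_1"     # machine type to filter on

seconds_per_instance = defaultdict(float)

with open(USAGE_FILE, newline="") as f:
    for row in csv.DictReader(f):
        if MEASUREMENT not in row["MeasurementId"]:
            continue
        # The instance name is the last path segment of the Resource URI.
        instance = row["Resource URI"].rstrip("/").rsplit("/", 1)[-1]
        seconds_per_instance[instance] += float(row["Quantity"])

for instance, seconds in sorted(seconds_per_instance.items()):
    print(f"{instance}: {seconds:.0f}")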
If you are skilled with Excel or other spreadsheet/analysis software, you may be able to do even better; this is just one idea for how you could use the logs. At that point it becomes somewhat a question of creativity, and I am sure you can find good ways to work with the data you get from an export.
Update, September 2017:
It is now possible to add user-defined labels and then track usage and billing by those labels for Compute Engine and GCS.
Additionally, by enabling the billing export to BigQuery, it is possible to create custom views or query BigQuery from a tool friendlier to finance people, such as Google Docs, Data Studio, or anything that can connect to BigQuery. Here is a great example of using labels across multiple projects to split costs into something friendlier to organizations, in this case a Data Studio report.
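To make the label-based breakdown concrete, here is a hedged sketch that groups exported costs by a user-defined label using the BigQuery billing export and the Python client. The table name and label key are placeholders, and the column names (usage_start_time, labels, cost) follow the standard billing export schema, so double-check them against your own export before relying on the numbers.

# Sketch: daily cost per user-defined label from the BigQuery billing export.
# Table name and label key are placeholders; verify column names against
# your own export before relying on the numbers.
from google.cloud import bigquery

BILLING_TABLE = "my-project.billing.gcp_billing_export_v1_XXXXXX"  # placeholder
LABEL_KEY = "operation"  # e.g. database / analytics / batch, per the question

sql = f"""
    SELECT
      DATE(usage_start_time) AS usage_day,
      l.value AS label_value,
      ROUND(SUM(cost), 2) AS total_cost
    FROM `{BILLING_TABLE}`, UNNEST(labels) AS l
    WHERE l.key = @label_key
    GROUP BY usage_day, label_value
    ORDER BY usage_day, total_cost DESC
"""

client = bigquery.Client()
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("label_key", "STRING", LABEL_KEY)]
)

for row in client.query(sql, job_config=job_config).result():
    print(row.usage_day, row.label_value, row.total_cost)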