Having trouble getting usable results from Watson's Document Conversion service - ibm-cloud

When I try to convert this document
https://public.dhe.ibm.com/common/ssi/ecm/po/en/poq12347usen/POQ12347USEN.PDF
with Watson's Document Conversion service, all I get is four answer units, one for each level-4 heading. What I really need is 47 answer units, one for each FAQ question. How can I achieve this?

A custom configuration can often produce more usable results for a document such as this one.
The custom configuration can be passed to Document Conversion in a config form part on the request.
Please refer to the documentation (https://www.ibm.com/watson/developercloud/doc/document-conversion/customizing.shtml)
for more details on the options available. In this particular case, the following seems to give improved results:
{
  "conversion_target": "ANSWER_UNITS",
  "pdf": {
    "heading": {
      "fonts": [
        {"level": 1, "min_size": 14, "max_size": 80},
        {"level": 2, "min_size": 11, "max_size": 12, "bold": true},
        {"level": 3, "min_size": 9, "max_size": 11, "bold": true}
      ]
    }
  }
}
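For reference, here is a minimal sketch of sending that configuration as the config form part from Python with the requests library. The service URL, credentials, and version date below are placeholders/assumptions (the Document Conversion service has since been retired), so take the exact values from your own service instance rather than from this sketch:

# Sketch: POST the PDF plus the custom configuration to Document Conversion.
# The URL, credentials, and version date are placeholders, not values from the question.
import json
import requests

DC_URL   = "https://gateway.watsonplatform.net/document-conversion/api"  # placeholder
USERNAME = "your-service-username"                                        # placeholder
PASSWORD = "your-service-password"                                        # placeholder

config = {
    "conversion_target": "ANSWER_UNITS",
    "pdf": {
        "heading": {
            "fonts": [
                {"level": 1, "min_size": 14, "max_size": 80},
                {"level": 2, "min_size": 11, "max_size": 12, "bold": True},
                {"level": 3, "min_size": 9, "max_size": 11, "bold": True},
            ]
        }
    },
}

with open("POQ12347USEN.PDF", "rb") as pdf:
    response = requests.post(
        f"{DC_URL}/v1/convert_document",
        params={"version": "2015-12-15"},  # API version date (assumed)
        auth=(USERNAME, PASSWORD),
        files={
            "config": ("config.json", json.dumps(config), "application/json"),
            "file": ("POQ12347USEN.PDF", pdf, "application/pdf"),
        },
    )

print(response.status_code)
print(response.json())  # list of answer units on success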

Related

Duplicate builds leading to wrong commit status in GitHub

This issue is also described in https://issues.jenkins.io/browse/JENKINS-70459
When using Jenkins, we noticed that the wrong pipeline status is often reported in GitHub PRs.
Further investigation showed very odd behavior. We have not yet found the cause of this problem (random?).
The 'Detail' link leads to the build which is successful.
Now comes the odd thing: the Jenkins log shows that the same build id was built twice!
First, it runs successfully (trigger: PR Update). Here is an excerpt from the log:
{
   build_number: 2
   build_url: job/(...)/PR-2906/2/
   event_tag: job_event
   job_duration: 1108.635
   job_name: (...)/PR-2906
   job_result: SUCCESS
   job_started_at: 2023-01-19T14:41:14Z
   job_type: Pipeline
   label: master
   metadata: { ... }
   node: (built-in)
   queue_id: 1781283
   queue_time: 5.063
   scm: git
   test_summary: { ... }
   trigger_by: Pull request #2906 updated
   type: completed
   upstream:
   user: anonymous
}
Then another run, under the exact same build id / URL, appears in the log:
{
   build_number: 2
   build_url: job/(...)/PR-2906/2/
   event_tag: job_event
   job_duration: 1.959
   job_name: (...)/PR-2906
   job_result: FAILURE
   job_started_at: 2023-01-20T07:14:50Z
   job_type: Pipeline
   label: master
   node: (built-in)
   queue_id: 2261495
   queue_time: 7.613
   test_summary: { ... }
   trigger_by: Branch indexing
   type: completed
   upstream:
   user: anonymous
}
Notice that the trigger is now "Branch indexing". We do not know why this build happens, but it is likely the root cause of this issue.
The failed build is not displayed in the Jenkins UI, and the script console also returns #2 as the last successful build. We assume that this "corrupt" build is the one reported to GitHub. Does anyone have any idea how this may happen? Any ideas are very welcome!
We checked our logs and tried to reproduce this behaviour - unsuccessfully, so far.
Are you using the Multibranch Pipeline plugin?
By default, Jenkins will not automatically re-index the repository for branch additions or deletions (unless using an Organization Folder), so it is often useful to configure a Multibranch Pipeline to periodically re-index in the configuration
Source: https://www.jenkins.io/doc/book/pipeline/multibranch/
Maybe this can also help: What are "Branch indexing" activities in Jenkins BlueOcean

Can APIM policy fragments be imported/exported?

I've read the documentation, and while the policy fragment idea seems good for code reuse, the system doesn't seem to provide a way to deploy them in an automated way.
I've even exported the entire configuration of the APIM to Git and could not find my policy fragment.
It seems to be a very recent feature. We had the same problem, and as a first approach we decided to use Terraform to deploy policy fragments from the dev environment to the staging and production environments.
https://learn.microsoft.com/es-mx/azure/templates/microsoft.apimanagement/2021-12-01-preview/service/policyfragments?pivots=deployment-language-terraform
$computer> cat main.tf
terraform {
  required_providers {
    azapi = {
      source = "azure/azapi"
    }
  }
}
provider "azapi" {
}
resource "azapi_resource" "symbolicname" {
  type = "Microsoft.ApiManagement/service/policyFragments#2021-12-01-preview"
  name = “fragmentpolicyname”
  parent_id = "/subscriptions/[subscriptionid]/resourceGroups/[resourcegroupname]/providers/Microsoft.ApiManagement/service/[apimanagementservicename]”
  body = jsonencode({
    properties = {
      description = “fragment policy description”
      format = "xml" # it could also be rawxml
      value = <<EOF
<!--
    IMPORTANT:
    - Policy fragment are included as-is whenever they are referenced.
    - If using variables. Ensure they are setup before use.
    - Copy and paste your code here or simply start coding
 -->
 <fragment>
        //some magical code here that you will use in a lot of policies
 </fragment>
EOF
    }
  })
}
terraform init
terraform plan
terraform apply
You can integrate this step into your Azure DevOps pipeline.
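If you prefer to script it without Terraform, the same resource can also be created directly through the Azure Resource Manager REST API. Below is only a hedged sketch in Python (using requests and azure-identity); the subscription, resource group, service name, fragment name, and XML value are placeholders, and the api-version is the same preview version referenced in the link above:

# Sketch: create/update an API Management policy fragment via the ARM REST API.
# The bracketed names are placeholders (assumptions), not values from the thread.
import requests
from azure.identity import DefaultAzureCredential  # pip install azure-identity

SUBSCRIPTION_ID = "[subscriptionid]"
RESOURCE_GROUP  = "[resourcegroupname]"
APIM_SERVICE    = "[apimanagementservicename]"
FRAGMENT_NAME   = "fragmentpolicyname"

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{SUBSCRIPTION_ID}"
    f"/resourceGroups/{RESOURCE_GROUP}/providers/Microsoft.ApiManagement"
    f"/service/{APIM_SERVICE}/policyFragments/{FRAGMENT_NAME}"
)

body = {
    "properties": {
        "description": "fragment policy description",
        "format": "xml",  # or "rawxml"
        "value": "<fragment>\n  <!-- reusable policy configuration -->\n</fragment>",
    }
}

resp = requests.put(
    url,
    params={"api-version": "2021-12-01-preview"},
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
print(resp.status_code, resp.text)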

Azure DevOps REST API - Run pipeline with variables

I have a pipeline on Azure DevOps that I'm trying to run programmatically/headless using the REST API: https://learn.microsoft.com/en-us/rest/api/azure/devops/pipelines/runs/run%20pipeline?view=azure-devops-rest-6.0
So far so good, I can auth and start a run. I would like to pass data to this pipeline which the docs suggests is possible using variables in the request body. My request body:
{
    "variables": {
        "HELLO_WORLD": {
            "isSecret": false,
            "value": "HelloWorldValue"
        }
    }
}
My pipeline YAML looks like this:
trigger: none
pr: none
pool:
  vmImage: 'ubuntu-latest'
steps:
- task: Bash@3
  inputs:
    targetType: 'inline'
    script: |
      KEY=$(HELLO_WORLD)
      echo "Hello world key: " $KEY
This, however, gives me an error: "HELLO_WORLD: command not found".
I have tried adding a "HELLO_WORLD" variable to the pipeline and enabling the "Let users override this value when running this pipeline" setting. This results in the HELLO_WORLD variable no longer being unknown, but instead it's stuck on its initial value and not set when I trigger a run with the REST API.
How do you pass variables to a pipeline using the REST API? It is important that the variable value is set only for a specific run/build.
I found another API to run a build, but it seems like you cannot use Personal Access Token auth with it like you can with the pipeline API - only OAuth2: https://learn.microsoft.com/en-us/rest/api/azure/devops/build/builds/queue?view=azure-devops-rest-6.0
You can do it with both the Runs API and the Build Queue API; both work with Personal Access Tokens. For which one is the better/preferred option, see this question: Difference between Azure Devops Builds - Queue vs run pipeline REST APIs. In short, the Runs API is the more future-proof option.
Option 1: Runs API
POST https://dev.azure.com/{{organization}}/{{project}}/_apis/pipelines/{{PipelineId}}/runs?api-version=6.0-preview.1
Your body will be of type application/json (the HTTP header Content-Type is set to application/json) and similar to the below; just replace resources.repositories.self.refName with the appropriate value:
{
    "resources": {
        "repositories": {
            "self": {
                "refName": "refs/heads/main"
            }
        }
    },
    "variables": {
        "HELLO_WORLD": {
            "isSecret": false,
            "value": "HelloWorldValue"
        }
    }
}
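For completeness, here is a sketch of that same call from Python with the requests library; the organization, project, pipeline id, and PAT below are placeholders, not values from this thread:

# Sketch: trigger a pipeline run with per-run variables via the Runs API.
# ORGANIZATION, PROJECT, PIPELINE_ID and PAT are placeholders (assumptions).
import requests

ORGANIZATION = "myOrg"
PROJECT      = "myProject"
PIPELINE_ID  = 17
PAT          = "<personal-access-token>"

url = (
    f"https://dev.azure.com/{ORGANIZATION}/{PROJECT}"
    f"/_apis/pipelines/{PIPELINE_ID}/runs"
)

body = {
    "resources": {"repositories": {"self": {"refName": "refs/heads/main"}}},
    "variables": {"HELLO_WORLD": {"isSecret": False, "value": "HelloWorldValue"}},
}

# PAT auth is plain HTTP basic auth; the username can be any string (even empty).
resp = requests.post(
    url,
    params={"api-version": "6.0-preview.1"},
    auth=("", PAT),
    json=body,  # sets Content-Type: application/json
)
print(resp.status_code, resp.json())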
Option 2: Build API
POST https://dev.azure.com/{{organization}}/{{project}}/_apis/build/builds?api-version=6.0
Your body will be of type application/json (the HTTP header Content-Type is set to application/json), something similar to the below; just replace definition.id and sourceBranch with appropriate values. Please also note the "stringified" content of the parameters section (it must be a string representation of a JSON map):
{
    "parameters": "{\"HELLO_WORLD\":\"HelloWorldValue\"}",
    "definition": {
        "id": 1
    },
    "sourceBranch": "refs/heads/main"
}
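The only extra subtlety with this option is the double encoding of parameters: the value must itself be a JSON string, which in Python simply means passing the map through json.dumps. A short sketch with the same placeholder organization, project, and PAT:

# Sketch: queue a build with parameters via the Build API.
# The parameters field must be a JSON-encoded string, hence json.dumps.
import json
import requests

ORGANIZATION  = "myOrg"      # placeholder
PROJECT       = "myProject"  # placeholder
DEFINITION_ID = 1            # placeholder build definition id
PAT           = "<personal-access-token>"

body = {
    "parameters": json.dumps({"HELLO_WORLD": "HelloWorldValue"}),
    "definition": {"id": DEFINITION_ID},
    "sourceBranch": "refs/heads/main",
}

resp = requests.post(
    f"https://dev.azure.com/{ORGANIZATION}/{PROJECT}/_apis/build/builds",
    params={"api-version": "6.0"},
    auth=("", PAT),
    json=body,
)
print(resp.status_code, resp.json())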
Here's the way I solved it....
The REST call:
POST https://dev.azure.com/<myOrg>/<myProject>/_apis/pipelines/17/runs?api-version=6.0-preview.1
 
The body of the request:
{
    "resources": {
        "repositories": {
            "self": {
                "refName": "refs/heads/main"
            }
        }
    },
    "templateParameters": {
        "A_Parameter": "And now for something completely different."
    }
}
Note: I added an Authorization header with basic auth containing a username (any name will do) and a password (your PAT token value). I also added a Content-Type: application/json header.
 
Here's the entire yaml pipeline I used:
 
parameters:
- name: A_Parameter
  displayName: A parameter
  default: noValue
  type: string
 
trigger:
- none
 
pool:
  vmImage: ubuntu-latest
 
steps:
 
- script: |
    echo '1 - using dollar sign parens, p dot A_Parameter is now: ' $(parameters.A_Parameter)
    echo '2 - using dollar sign double curly braces, p dot A_Parameter is now::' ${{ parameters.A_Parameter }} '::'
    echo '3 - using dollar sign and only the var name: ' $(A_Parameter)
  displayName: 'Run a multi-line script'
 
 
And here's the output from the pipeline log. Note that only the second way properly displayed the value.  
 
1 - using dollar sign parens, p dot A_Parameter is now: 
2 - using dollar sign double curly braces, p dot A_Parameter is now:: And now for something completely different. :: 
3 - using dollar sign and only the var name:

Why error, "alias target name does not lie within the target zone" in Terraform aws_route53_record?

With Terraform 0.12, I am creating a static web site in an S3 bucket:
...
resource "aws_s3_bucket" "www" {
bucket = "example.com"
acl = "public-read"
policy = <<-POLICY
{
"Version": "2012-10-17",
"Statement": [{
"Sid": "AddPerm",
"Effect": "Allow",
"Principal": "*",
"Action": ["s3:GetObject"],
"Resource": ["arn:aws:s3:::example.com/*"]
}]
}
POLICY
website {
index_document = "index.html"
error_document = "404.html"
}
tags = {
Environment = var.environment
Terraform = "true"
}
}
resource "aws_route53_zone" "main" {
name = "example.com"
tags = {
Environment = var.environment
Terraform = "true"
}
}
resource "aws_route53_record" "main-ns" {
zone_id = aws_route53_zone.main.zone_id
name = "example.com"
type = "A"
alias {
name = aws_s3_bucket.www.website_endpoint
zone_id = aws_route53_zone.main.zone_id
evaluate_target_health = false
}
}
I get the error:
Error: [ERR]: Error building changeset: InvalidChangeBatch:
[Tried to create an alias that targets example.com.s3-website-us-west-2.amazonaws.com., type A in zone Z1P...9HY, but the alias target name does not lie within the target zone,
Tried to create an alias that targets example.com.s3-website-us-west-2.amazonaws.com., type A in zone Z1P...9HY, but that target was not found]
status code: 400, request id: 35...bc
on main.tf line 132, in resource "aws_route53_record" "main-ns":
132: resource "aws_route53_record" "main-ns" {
What is wrong?
The zone_id inside the alias block must be the S3 bucket's hosted zone ID, not the Route 53 zone ID. The corrected aws_route53_record resource is:
resource "aws_route53_record" "main-ns" {
zone_id = aws_route53_zone.main.zone_id
name = "example.com"
type = "A"
alias {
name = aws_s3_bucket.www.website_endpoint
zone_id = aws_s3_bucket.www.hosted_zone_id # Corrected
evaluate_target_health = false
}
}
Here is an example for CloudFront. The variables are:
base_url                = "example.com"
cloudfront_distribution = "EXXREDACTEDXXX"
domain_names            = ["example.com", "www.example.com"]
The Terraform code is:
data "aws_route53_zone" "this" {
name = var.base_url
}
data "aws_cloudfront_distribution" "this" {
id = var.cloudfront_distribution
}
resource "aws_route53_record" "this" {
for_each = toset(var.domain_names)
zone_id = data.aws_route53_zone.this.zone_id
name = each.value
type = "A"
alias {
name = data.aws_cloudfront_distribution.this.domain_name
zone_id = data.aws_cloudfront_distribution.this.hosted_zone_id
evaluate_target_health = false
}
}
Many users specify CloudFront zone_id = "Z2FDTNDATAQYW2" because it's always Z2FDTNDATAQYW2...until some day maybe it isn't. I like to avoid the literal string by computing it using data source aws_cloudfront_distribution.
For anyone like me who came here from Google hoping to find the syntax for CloudFormation in YAML, here is how you can achieve it for your subdomains.
Here we add a DNS record to Route 53 and redirect all the subdomains of example.com to this ALB:
AlbDnsRecord:
  Type: "AWS::Route53::RecordSet"
  DependsOn: [ALB_LOGICAL_ID]
  Properties:
    HostedZoneName: "example.com."
    Type: "A"
    Name: "*.example.com."
    AliasTarget:
      DNSName: !GetAtt [ALB_LOGICAL_ID].DNSName
      EvaluateTargetHealth: False
      HostedZoneId: !GetAtt [ALB_LOGICAL_ID].CanonicalHostedZoneID
    Comment: "A record for Stages ALB"
My mistakes were:
not adding . at the end of my HostedZoneName
under AliasTarget.HostedZoneId, the "ID" at the end of CanonicalHostedZoneID is all uppercase
Replace [ALB_LOGICAL_ID] with the actual logical name of your ALB; for me it was something like ALBStages.DNSName.
You should already have the zone in Route 53.
So for us, all the addresses below resolve to this ALB:
dev01.example.com
dev01api.example.com
dev02.example.com
dev02api.example.com
qa01.example.com
qa01api.example.com
qa02.example.com
qa02api.example.com
uat.example.com
uatapi.example.com

Not able to retrieve RedShift cluster Capacity details like Storage, Memory using Python script

I have tried to fetch my Redshift cluster details. I'm able to see many details about the cluster, but a few details are missing.
For example, details like Storage and Memory.
Below is the code:
import boto3

redshiftClient = boto3.client('redshift',
                              aws_access_key_id=role.credentials.access_key,
                              aws_secret_access_key=role.credentials.secret_key,
                              aws_session_token=role.credentials.session_token,
                              region_name='us-west-2')

# Getting all the clusters
clusters = redshiftClient.describe_clusters()
Can you please suggest a way to get this information?
Thanks.
The describe-clusters command does not return that type of information. The output of that command is:
{
    "Clusters": [
        {
            "NodeType": "dw.hs1.xlarge",
            "Endpoint": {
                "Port": 5439,
                "Address": "mycluster.coqoarplqhsn.us-east-1.redshift.amazonaws.com"
            },
            "ClusterVersion": "1.0",
            "PubliclyAccessible": "true",
            "MasterUsername": "adminuser",
            "ClusterParameterGroups": [
                {
                    "ParameterApplyStatus": "in-sync",
                    "ParameterGroupName": "default.redshift-1.0"
                }
            ],
            "ClusterSecurityGroups": [
                {
                    "Status": "active",
                    "ClusterSecurityGroupName": "default"
                }
            ],
            "AllowVersionUpgrade": true,
            "VpcSecurityGroups": [],
            "AvailabilityZone": "us-east-1a",
            "ClusterCreateTime": "2013-01-22T21:59:29.559Z",
            "PreferredMaintenanceWindow": "sat:03:30-sat:04:00",
            "AutomatedSnapshotRetentionPeriod": 1,
            "ClusterStatus": "available",
            "ClusterIdentifier": "mycluster",
            "DBName": "dev",
            "NumberOfNodes": 2,
            "PendingModifiedValues": {}
        }
    ],
    "ResponseMetadata": {
        "RequestId": "65b71cac-64df-11e2-8f5b-e90bd6c77476"
    }
}
You will need to retrieve Memory and Storage statistics from Amazon CloudWatch.
See your other question: Amazon CloudWatch is not returning Redshift metrics
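As a starting point, here is a hedged boto3 sketch that pulls one of the Redshift CloudWatch metrics (PercentageDiskSpaceUsed; CPUUtilization and the other AWS/Redshift metrics work the same way). The cluster identifier and region below are placeholders:

# Sketch: fetch disk-usage statistics for a Redshift cluster from CloudWatch.
# 'mycluster' and the region are placeholders, not values from the question.
import datetime
import boto3

cloudwatch = boto3.client('cloudwatch', region_name='us-west-2')

now = datetime.datetime.utcnow()
response = cloudwatch.get_metric_statistics(
    Namespace='AWS/Redshift',
    MetricName='PercentageDiskSpaceUsed',
    Dimensions=[{'Name': 'ClusterIdentifier', 'Value': 'mycluster'}],
    StartTime=now - datetime.timedelta(hours=1),
    EndTime=now,
    Period=300,              # 5-minute datapoints
    Statistics=['Average'],
)

for point in sorted(response['Datapoints'], key=lambda p: p['Timestamp']):
    print(point['Timestamp'], round(point['Average'], 2), '%')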
If you actually want to retrieve information about a standard cluster (that is, the amount of storage and memory assigned to each node, rather than current memory and storage usage), that is not available from an API call. Instead see: Amazon Redshift Clusters