wget command line - define format and encoding - rest

I am trying to download all categories from an e-commerce website using its REST API and wget (curl gives the same result), but I cannot get a readable file. This is the line I am executing:
...>wget https://api.mercadolibre.com/sites/MLB/categories/all --no-check-certificate
I receive output like this:
½Û’Û8².ü*_[6q Hö]>t{\=¶ÇëðÇŽˆ¢ªè–ÄÜmïXûÖôeÇŽ¹˜˜»û®^ìHU €()‰dåŠ1]ì®,$&
I expected something like:
, {
    "id": "MLA1743",
    "name": "Autos, Motos y Otros"
}, {
    "id": "MLA1384",
    "name": "Bebés"
}, {
    "id": "MLA1039",
    "name": "Cámaras y Accesorios"
}, {
    "id": "MLA1051",
    "name": "Celulares y Teléfonos"
}, {
    "id": "MLA1798",
    "name": "Coleccionables y Hobbies"
}
Sorry if it's a newbie question, but I cannot find a proper tutorial. Best regards.

The content is gzip-encoded. You can figure this out by looking at the Content-Encoding header the server sends with the response. You can access the data like this:
wget -O - https://api.mercadolibre.com/sites/MLB/categories/all | zcat
Or just save it to a file first:
wget -O all.gz https://api.mercadolibre.com/sites/MLB/categories/all
gunzip all.gz
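Alternatively, curl can handle the compression for you; a quick sketch with the same URL, using standard curl flags:
# print only the response headers, to confirm the Content-Encoding
curl -s -D - -o /dev/null https://api.mercadolibre.com/sites/MLB/categories/all
# request a compressed response and let curl decompress the body
curl --compressed https://api.mercadolibre.com/sites/MLB/categories/all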

Related

Batch create transcription always results in: The recordings URI contains invalid data

I would like to use the Azure Speech Services Batch Transcription APIs to create a transcription of my audio file. I've already had success using the Speech Service SDK (for Node.js), but I was interested in trying out one of the newer features available in the v3.1 preview version of the API (displayFormWordLevelTimestampsEnabled), so I figured I had to use the REST API to do that.
Overall, my problem is that whatever input I feed the Create Transcription API for contentUrls, I always end up getting the same error:
"error": {
"code": "InvalidData",
"message": "The recordings URI contains invalid data."
}
After a little digging, I found some tips in the Azure portal about using sox to transcode the audio file into the specific format requested.
The format requirements in the portal documentation read:
If you are using the REST API, make sure that it uses one of the formats in this table:

Format   Codec   Bit rate   Sample Rate
WAV      PCM     256 kbps   16 kHz, mono
OGG      OPUS    256 kbps   16 kHz, mono
With the sox-specific commands being:

Activity: Check the audio file format.
SoX command: sox --i <filename>

Activity: Convert the audio file to single channel, 16-bit, 16 kHz.
SoX command: sox <input> -b 16 -e signed-integer -c 1 -r 16k -t wav <output>.wav
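Filled in with real file names, that second command looks roughly like this (input.mp3 and out5.wav stand in for my files, and this assumes a sox build with MP3 support):
sox input.mp3 -b 16 -e signed-integer -c 1 -r 16k -t wav out5.wav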
I ran my MP3 through the second command and verified the result with the first; the output looks like:
Input File : 'out5.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:30.09 = 481488 samples ~ 2256.97 CDDA sectors
File Size : 963k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
Finally, I uploaded the file to a public S3 bucket, to use as my content url for my request:
POST https://westus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions
{
    "contentUrls": [
        "https://s3.us-west-1.amazonaws.com/xxxx/out5.wav"
    ],
    "locale": "en-US",
    "displayName": "Test"
}
It still failed with the same error I posted above. Any insights into what might be wrong? Thanks!
Update:
The answer below mentions being able to reference a report.json file via the Get Transcript / Create Transcript API calls.
When I use the Create Transcript API, the response payload is:
{
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95",
    "model": {
        "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/models/base/c3b008fa-eb47-4f6d-a5b9-71dd37870bb7"
    },
    "links": {
        "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95/files"
    },
    "properties": {
        "diarizationEnabled": false,
        "wordLevelTimestampsEnabled": false,
        "displayFormWordLevelTimestampsEnabled": false,
        "channels": [
            0,
            1
        ],
        "punctuationMode": "DictatedAndAutomatic",
        "profanityFilterMode": "Masked"
    },
    "lastActionDateTime": "2022-09-13T23:37:09Z",
    "status": "NotStarted",
    "createdDateTime": "2022-09-13T23:37:09Z",
    "locale": "en-US",
    "displayName": "Test"
}
Calling the Get Transcript I see:
{
    "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95",
    "model": {
        "self": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/models/base/c3b008fa-eb47-4f6d-a5b9-71dd37870bb7"
    },
    "links": {
        "files": "https://westus.api.cognitive.microsoft.com/speechtotext/v3.1-preview.1/transcriptions/02815462-e9c0-4fdc-8bbe-7b0e78152f95/files"
    },
    "properties": {
        "diarizationEnabled": false,
        "wordLevelTimestampsEnabled": false,
        "displayFormWordLevelTimestampsEnabled": false,
        "channels": [
            0,
            1
        ],
        "punctuationMode": "DictatedAndAutomatic",
        "profanityFilterMode": "Masked",
        "error": {
            "code": "InvalidData",
            "message": "The recordings URI contains invalid data."
        }
    },
    "lastActionDateTime": "2022-09-13T23:37:22Z",
    "status": "Failed",
    "createdDateTime": "2022-09-13T23:37:09Z",
    "locale": "en-US",
    "displayName": "Test"
}
And finally, looking at the transcription files, I'm getting an empty list:
{
    "values": []
}
I see no reference to a report.json, or any data populated here at all.
In many cases you can get detailed error information by doing a GET on https://westus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions/<transcription_id>/files and looking at the report.json that is referenced there.
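For example, a sketch of that GET with curl (substitute your own region, transcription id, and key; Ocp-Apim-Subscription-Key is the standard Cognitive Services auth header):
curl -s -H "Ocp-Apim-Subscription-Key: <your-speech-key>" \
  "https://westus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions/<transcription_id>/files"
Among the returned files there should be an entry pointing at report.json; fetching that URL gives the per-file error details.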
If that doesn't help, you could post the transcription id(s) of the failed transcriptions so someone from the team (I am one of them) can look at the service logs.

Can't build customised Slack notifications

After much research, I discovered this .sh script, which allows me to send a very simple personalized notification to a Slack channel (I failed to send it to a particular member).
The problem I am facing is formatting the notification message. The formatting tags described on the Slack site do not work correctly (or I do not know how to use them!).
The notification must be built "from scratch" at the end of each execution of the Rundeck job.
Rundeck's "On Failure" and "On Success" notifications are not suitable for users.
This is the .sh script:
#!/bin/bash
#Usage: slackpost <channel> <message>
slackhost="https://hooks.slack.com/services"
token="XXXXXXXXXX/XXXXXXXXXX/XXXXXXXXXXXXXXXXXXXXXX"
slack_username="Someone-who-loves-you"
slack_icon="smile"
channel=$1
if [[ $channel == "" ]]
then
echo "No channel specified"
exit 1
fi
text="$2"
if [[ $text == "" ]]
then
echo "No text specified"
exit 1
fi
escapedText=$(echo $text | sed 's/"/\"/g' | sed "s/'/\'/g" )
json="{\"channel\": \"#$channel\", \"username\":\"${slack_username}\", \"icon_emoji\":\":${slack_icon}:\", \"text\": \"$escapedText\"}"
curl -s -d "payload=$json" "$slackhost/$token"
The variable $text is filled with the content of a .txt file (myFileText), whose content is as follows:
text="",
{
blocks=[
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "Hello world! <#UNVD64N02> :tada: \n\n - a \n-b"
}
}
]
}
{
"blocks": [
{
"type": "section",
"text": {
"type": "mrkdwn",
"text": "Hello, Assistant to the Regional Manager John Doe! *Martin Scott* wants to know where you'd like to take the Paper Company investors to dinner tonight.\n\n *Please select a restaurant:*"
}
}
]
}
The .sh script is run as follows:
sh slack_post.sh "channel-receiving-the-notification" "$(< myFileText)"
As a result I get:
invalid_payload
Any idea would be helpful.

How to get the top-n files sorted by cognitive complexity in a project using REST API?

How to get the top 500 files in a project sorted by cognitive complexity (using the REST API)? The intent is to export the metric for use with another tool.
On a current SonarQube (8.2, though according to the documentation this would work with earlier versions as well), presuming your instance is on localhost:9000 and the project's key is project1, this bash script curls SonarQube for the top 500 files and their cognitive complexity values, sorted by cognitive complexity, pretty-prints the result with jq, and displays it in less:
#!/bin/bash
curl \
"localhost:9000"\
"/api/measures/component_tree?"\
"component=project1&"\
"strategy=leaves&"\
"metricKeys=cognitive_complexity&"\
"s=metric&"\
"metricSort=cognitive_complexity&"\
"asc=false&"\
"ps=500" \
| jq "[.components[] | {path: .path, cognitive_complexity: .measures[0].value}]" \
| less
The above script produces output like this:
[
  {
    "path": "desktop/src/main/java/bisq/desktop/main/offer/MutableOfferViewModel.java",
    "cognitive_complexity": "319"
  },
  {
    "path": "desktop/src/main/java/bisq/desktop/main/offer/offerbook/OfferBookView.java",
    "cognitive_complexity": "304"
  },
  {
    "path": "p2p/src/main/java/bisq/network/p2p/network/Connection.java",
    "cognitive_complexity": "228"
  },
  {
    "path": "desktop/src/main/java/bisq/desktop/main/support/dispute/DisputeView.java",
    "cognitive_complexity": "225"
  },
  {
    "path": "desktop/src/main/java/bisq/desktop/util/GUIUtil.java",
    "cognitive_complexity": "192"
  },
  ...
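Since the goal is to export the metric for another tool, the same call can emit CSV instead of pretty-printed JSON; a sketch (complexity.csv is just an example output name):
curl -s \
"localhost:9000"\
"/api/measures/component_tree?"\
"component=project1&"\
"strategy=leaves&"\
"metricKeys=cognitive_complexity&"\
"s=metric&"\
"metricSort=cognitive_complexity&"\
"asc=false&"\
"ps=500" \
| jq -r '.components[] | [.path, .measures[0].value] | @csv' \
> complexity.csv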

Get task IDs from the Kafka Connect API to print in logs

I have a Kafka Connect sink for which the JSON below is passed via a curl command to register the connector.
Please let me know if anyone has an idea of how to get the task IDs of my connector. In the example below, tasks.max is set to 3, so I need to know the names of the 3 tasks for logging, i.e. I need to know which line of my log belongs to which task.
Based on the Kafka Connect logs I know I have 3 tasks - TestCheck-1, TestCheck-2 and TestCheck-3 - but I want to know how to get the task names programmatically so that I can print them in my Kafka Connect log lines.
{
    "name": "TestCheck",
    "config": {
        "topics": "topic1",
        "connector.class": "ApplicationSinkTask Class package",
        "tasks.max": "3",
        "key.converter": "org.apache.kafka.connect.storage.StringConverter",
        "value.converter": "org.apache.kafka.connect.storage.StringConverter",
        "connector.url": "jdbc connection url",
        "driver.name": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "username": "myusername",
        "password": "mypassword",
        "table.name": "test_table",
        "database.name": "test"
    }
}
When I register it, I get the details below.
curl -X POST -H "Content-Type: application/json" --data @myjson.json http://service:8082/connectors
{"name":"TestCheck","config":{"topics":"topic1","connector.class":"ApplicationSinkTask Class package","tasks.max":"3","key.converter":"org.apache.kafka.connect.storage.StringConverter","value.converter":"org.apache.kafka.connect.storage.StringConverter","connector.url":"jdbc:sqlserver://datahubprod.database.windows.net:1433;","driver.name":"jdbc connection url","username":"myuser","password":"mypassword","table.name":"test_table","database.name":"test","name":"TestCheck"},"tasks":[{"connector":"TestCheck","task":0},{"connector":"TestCheck","task":1},{"connector":"TestCheck","task":2}],"type":null}
You can manage the connectors with the Kafka Connect REST API. There's a whole heap of commands, which you can find here.
The example given in the above link shows you can retrieve all tasks for a given connector using the command:
$ curl localhost:8083/connectors/local-file-sink/tasks
[
    {
        "id": {
            "connector": "local-file-sink",
            "task": 0
        },
        "config": {
            "task.class": "org.apache.kafka.connect.file.FileStreamSinkTask",
            "topics": "connect-test",
            "file": "test.sink.txt"
        }
    }
]
You can use a language of your choice to send the request and load the JSON response into a variable/dictionary for further use, such as printing to a log. Here's a very simple example using Python which assigns the whole response to a variable:
import requests

# endpoint that lists the registered connectors on the local worker
connectors = 'http://localhost:8083/connectors'
p = requests.get(connectors)
data = p.json()  # parsed JSON response
If you parse the data variable into a dictionary, you can then access each element, i.e. the task id.
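If you would rather pull just the IDs from the shell, a quick sketch against the connector and host from the question (TestCheck registered on service:8082), using jq to pick out each task's id object:
curl -s http://service:8082/connectors/TestCheck/tasks | jq '.[].id'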
I hope this helps!

Installing Slack Plugin to Sensu NON-enterprise edition

I have Sensu running and followed the instructions the best I could to install the Slack plugin. I'm attempting to just do a "hello-world" to get started, but the documentation seems lacking to me.
I followed the "getting started" with checks:
https://sensuapp.org/docs/0.20/getting-started-with-checks
and everything seems to be in the correct place on the server.
I am attempting to install the following community plugin, but they have a catch-all instruction for all community plugins. There is a JSON file in the plugin instructions, but it doesn't say where to put it...
https://github.com/sensu-plugins/sensu-plugins-slack
Here is what my check_cron.json looks like (I tried 2 methods, 1 from a source other than Sensu):
{
    "checks": {
        "cron_checks": {
            "handlers": ["default", "slack"],
            "command": "/etc/sensu/plugins/check-procs.rb -p cron -C 1 ",
            "interval": 60,
            "subscribers": ["webservers"]
        },
        "cron": {
            "handlers": ["default", "slack"],
            "command": "/etc/sensu/plugins/check-procs.rb -p cron",
            "subscribers": [
                "production",
                "webservers",
            ],
            "interval": 60
        }
    }
}
I restarted my server after making the changes. I'm assuming that this cron check will run every minute and call the Slack notification plugin, but I don't know what I'm missing, or where to put the .json doc from the Slack plugin "documentation":
https://github.com/sensu-plugins/sensu-plugins-slack
Any help getting me in the right direction?
You need a handler on the Sensu server that will fire the request to Slack. Have you created that? If yes, please post its content.
So I just solved this. benishkey did provide the solution in the link; however, just in case anyone comes across this and the link is broken, I thought I would add the solution here.
- GitHub user eugene-chow:
The Slack handler's config needs to be named differently. Try the JSON below. I renamed the Slack config for each environment, then pointed the handler to the respective config with -j config_name.
{
    "handlers": {
        "slack-staging": {
            "type": "pipe",
            "command": "/usr/local/bin/handler-slack.rb -j slack-staging",
            "severities": ["critical", "unknown"]
        }
    },
    "slack-staging": {
        "webhook_url": "https://hooks.slack.com/services/...",
        "template": ""
    }
}
{
    "handlers": {
        "slack-production": {
            "type": "pipe",
            "command": "/usr/local/bin/handler-slack.rb -j slack-production",
            "severities": ["critical", "unknown"]
        }
    },
    "slack-production": {
        "webhook_url": "https://hooks.slack.com/services/...",
        "template": ""
    }
}
I dropped the handler-slack.rb file in with my checks and referenced it from there because it wasn't in my /usr/local/bin/ folder
I was facing the same issue; the answer is already given above, but maybe this will help someone in the future.
First, install the Sensu Slack plugin:
/opt/sensu/embedded/bin/gem install sensu-plugins-slack
Then create a handler config file:
vim /etc/sensu/conf.d/slack-handler.json
The handler script is handler-slack.rb: https://github.com/sensu-plugins/sensu-plugins-slack/blob/master/bin/handler-slack.rb
{
    "handlers": {
        "slack": {
            "type": "pipe",
            "command": "/opt/sensu/embedded/bin/handler-slack.rb",
            "severities": ["critical", "unknown"]
        }
    },
    "slack": {
        "webhook_url": "https://your_webhook.com/abc",
        "template": ""
    }
}
I found the answer in the "issues" section on GitHub:
https://github.com/sensu-plugins/sensu-plugins-slack/issues/7