I want to use bpftrace to get the full HTTP request content of my program.
cat /etc/redhat-release
CentOS Linux release 8.0.1905 (Core)
uname -a
Linux infra-test4.18.0-305.12.1.el8_4.x86_64 #1 SMP Wed Aug 11
01:59:55 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
My bpftrace script (http.bt):
printf("Welcome to Offensive BPF... Use Ctrl-C to exit.\n");
@sk[tid] = args->upeer_sockaddr;
/ @sk[tid] /
@sys_accepted[tid] = @sk[tid];
/ @sys_accepted[tid] /
printf("->sys_enter_read for allowed thread (fd: %d)\n", args->fd);
@sys_read[tid] = args->buf;
if (@sys_read[tid] != 0)
$len = args->ret;
$cmd = str(@sys_read[tid], $len);
printf("*** Command: %s\n", $cmd);
printf("Exiting. Bye.\n");
I start my server on port 8080 and then start bpftrace:
Attaching 8 probes...
Welcome to Offensive BPF... Use Ctrl-C to exit.
Then I run curl:
curl -H "traceparent: 00-123-456-01" -lv
bpftrace only outputs:
bpftrace --unsafe http.bt
Attaching 8 probes...
Welcome to Offensive BPF... Use Ctrl-C to exit.
->sys_enter_read for allowed thread (fd: 15)
*** Command: GET /misc/ping HTTP/1.1
User-Agent: curl
->sys_enter_read for allowed thread (fd: 15)
*** Command: GET /misc/ping HTTP/1.1
User-Agent: curl
The output is not the whole request that curl sent. I don't know why. Can anyone help?
I want to create a REST service in XQuery that can receive a zip file of documents.
I tried to use the REST API extension, but its input is a document-node and it cannot receive a zip file.
Is there a way to do that?
I have to receive a very large number of files, and I want to zip them in batches of 1000 documents to limit network time.
I use ml-gradle to create my database services.
I made another attempt:
XQ :
declare function he.insertMedia:put(
  $context as map:map,
  $params as map:map,
  $input as document-node()*
) as document-node()?
{
  let $id := map:get($params, "id")
  let $uri := map:get($params, "uri")
  return (
    xdmp:document-insert($uri, $input),
    document { <ok/> }
  )
};
CURL call :
curl --location --request PUT 'http://localhost:8175/LATEST/resources/he.insertMedia?rs:id=TestMarc&rs:uri=/test/testMarc' \
--header 'Content-type: application/xslt+xml' \
--data-binary '@/D:/Travail/3-1-ELS/cep-lss-octo-exampledata/DATA/ACTUEL_ARTICLE/media/000_1wy606.jpg'
Result :
"errorResponse": {
"statusCode": 400,
"status": "Bad Request",
"messageCode": "XDMP-DOCUTF8SEQ",
"message": "XDMP-DOCUTF8SEQ: xdmp:get-request-body() -- Invalid UTF-8 escape sequence at line 1 -- document is not UTF-8 encoded"
ML version : 10.0-6.1
You are sending binary data but telling the server it is XML (application/xslt+xml is treated as XML). Try the following header in cURL instead:
Content-Type: application/octet-stream
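To illustrate the content-type point, here is a minimal Python sketch (standard library only) that builds the same PUT request with a binary-safe header. The function name build_binary_put and the four-byte sample payload are hypothetical; the URL is the one from the question. The request is only constructed here, not sent.

```python
import urllib.request


def build_binary_put(url: str, payload: bytes) -> urllib.request.Request:
    """Build a PUT request whose body is declared as opaque binary."""
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )


# First bytes of a typical JPEG file. They are not valid UTF-8, which is
# why a body declared as XML fails when the server tries to decode it.
jpeg_prefix = b"\xff\xd8\xff\xe0"

req = build_binary_put(
    "http://localhost:8175/LATEST/resources/he.insertMedia"
    "?rs:id=TestMarc&rs:uri=/test/testMarc",
    jpeg_prefix,
)
print(req.get_method(), req.get_header("Content-type"))
```

Whether the extension then receives the body as a binary node depends on the server side; the sketch only shows the client declaring the payload correctly.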
I have a strange issue where I need to remove JSON text from a tilde-delimited file (the JSON breaks the import because of the CRLF at the end of each JSON line). Example line:
Test Plan Work~Response Status: BadRequest Bad Request,Response Content: {
"trace": "0HM5285F2",
"errors": [
"code": "server_error",
"message": "Couldn't access service ",
"moreInfoUrl": null,
"target": {
"type": null,
"name": null
},Request: https://www.test.com Headers: Accept: application/json
~87c5de00-5906-4d2d-b65f-4asdfsdfsdfa29~3/17/2020 1:54:08 PM
or ones like these that don't have JSON but still have the same pattern I need:
Test Plan Pay Work~Response Status: InternalServerError Internal Server Error,Response Content: Error,Request: https://api.test.com Headers: Accept: application/json
Authorization: Bearer eyJhbGciOiJSUzI1NiIsInR5c
SubscriberId: eb7aee
~9d05b16e-e57b-44be-b028-b6ddsdfsdf62a5~1/20/2021 7:07:53 PM
I need both of these types of records to end up in this format:
Test Plan Work~Response Status: BadRequest Bad Request~87c5de00-5906-4d2d-b65f-4asdfsdfsdfa29~3/17/2020 1:54:08 PM
The JSON (including the CRLFs at the end of each JSON line) is breaking the import of the data into PowerShell. Any help or insight would be appreciated!
PowerShell (or rather, .NET) has two peculiar features in its regex engine that might be perfect for this use case - balancing groups and conditionals!
Balancing groups is a complicated feature to fully explain, but it essentially allows us to "keep count" of occurrences of specific named subexpressions in a regex pattern, and looks like this when applied:
PS ~> $string = 'Here is text { but wait { it has } nested { blocks }} here is more text'
PS ~> $string -replace '\{(?>\{(?<depth>)|[^{}]+|\}(?<-depth>))*(?(depth)(?!))\}'
Here is text  here is more text
Let's break down the regex pattern:
\{ # match literal '{'
(?> # begin atomic group*
\{(?<depth>) # match literal '{' and increment counter
| [^{}]+ # OR match any sequence of characters that are NOT '{' or '}'
| \}(?<-depth>) # OR match literal '}' and decrement counter
)* # end atomic group, whole group should match 0 or more times
(? # begin conditional group*
(depth)(?!) # if the 'depth' counter > 0, then FAIL!
) # end conditional group
\} # match literal '}' (corresponding to the initial '{')
*) The (?>...) atomic grouping prevents backtracking - a safeguard against accidentally counting anything more than once.
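Balancing groups are specific to .NET. For comparison, the same "keep count" idea can be sketched with an explicit depth counter. The Python function below is a swapped-in illustration of the counting logic, not the regex itself, and strip_balanced_braces is a hypothetical name. It assumes that text from an unmatched '{' onward should be left untouched.

```python
def strip_balanced_braces(text: str) -> str:
    """Remove every top-level {...} span, honoring nested braces."""
    out = []
    depth = 0   # how many '{' are currently open
    start = 0   # index of the '{' that opened the current top-level span
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i
            depth += 1          # same role as (?<depth>)
        elif ch == "}" and depth > 0:
            depth -= 1          # same role as (?<-depth>)
        elif depth == 0:
            out.append(ch)      # outside any braces: keep the character
    if depth > 0:
        # Unbalanced input: put back everything from the unmatched '{'.
        out.append(text[start:])
    return "".join(out)


sample = "Here is text { but wait { it has } nested { blocks }} here is more text"
print(strip_balanced_braces(sample))
```

Note that two spaces remain where the braces were removed, one from each side of the deleted span; the .NET pattern above behaves the same way.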
For the CRLF characters in the remaining fields, we can prefix the pattern with (?s) - this makes the regex engine include new lines when matching the . "any" metacharacter, up until we reach the position just before ~87c5...:
(?s),Response Content:\s*\{(?>\{(?<depth>)|[^{}]+|\}(?<-depth>))*(?(depth)(?!))\}.*?(?=~)
Or we can, perhaps more accurately, describe the fields following the JSON as repeating pairs of , and "not ,":
,Response Content:\s*(?:\{(?>\{(?<depth>)|[^{}]+|\}(?<-depth>))*(?(depth)(?!))\})?\s*(?:,[^,]+?)*(?=~)
Let's give it a try against your multi-line input string:
$string = @'
Test Plan Work~Response Status: BadRequest Bad Request,Response Content: {
"trace": "0HM5285F2",
"errors": [
"code": "server_error",
"message": "Couldn't access service ",
"moreInfoUrl": null,
"target": {
"type": null,
"name": null
},Request: https://www.test.com Headers: Accept: application/json
~87c5de00-5906-4d2d-b65f-4asdfsdfsdfa29~3/17/2020 1:54:08 PM
'@
$string -replace ',Response Content:\s*(?:\{(?>\{(?<depth>)|[^{}]+|\}(?<-depth>))*(?(depth)(?!))\})?\s*(?:,[^,]+?)*(?=~)'
Test Plan Work~Response Status: BadRequest Bad Request~87c5de00-5906-4d2d-b65f-4asdfsdfsdfa29~3/17/2020 1:54:08 PM
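If the cleanup has to happen outside PowerShell, note that Python's re module has no balancing groups. A simpler substitute, sketched below, relies only on the tilde structure: it drops everything from ",Response Content:" up to the next "~". This is a different technique from the one above, and it assumes that "~" never occurs inside the removed text. The names RESPONSE_TAIL and strip_response_content are hypothetical, and the sample record is a shortened version of the one in the question.

```python
import re

# Assumption: "~" never appears inside the JSON or header text being removed.
# re.DOTALL lets "." cross the CRLF line breaks inside the record.
RESPONSE_TAIL = re.compile(r",Response Content:.*?(?=~)", re.DOTALL)


def strip_response_content(record: str) -> str:
    """Drop the response body and request details, keeping the tilde fields."""
    return RESPONSE_TAIL.sub("", record)


sample = (
    "Test Plan Work~Response Status: BadRequest Bad Request,Response Content: {\r\n"
    '"trace": "0HM5285F2",\r\n'
    '"target": {\r\n'
    '"type": null\r\n'
    "},Request: https://www.test.com Headers: Accept: application/json\r\n"
    "~87c5de00-5906-4d2d-b65f-4asdfsdfsdfa29~3/17/2020 1:54:08 PM"
)
print(strip_response_content(sample))
```

The trade-off is that this version does not verify that the braces balance; it trusts the delimiter instead.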
I was trying to create some basic inspec tests to validate a set of HTTP URLs. The way I started is like this -
control 'http-url-checks' do
  impact 1.0
  title 'http-url-checks'
  desc '
    Specify the URLs which need to be up and working.
  '
  tag 'http-url-checks'
  describe http('http://example.com') do
    its('status') { should eq 200 }
    its('body') { should match /abc/ }
    its('headers.name') { should eq 'header' }
  end
  describe http('http://example.net') do
    its('status') { should eq 200 }
    its('body') { should match /abc/ }
    its('headers.name') { should eq 'header' }
  end
end
Note that the URLs are hard-coded in the controls, which isn't a lot of fun. I'd like to move them to an 'attributes' file of some sort and loop through them in the control file.
My attempt was to use the 'files' folder structure inside the profile. I created a file, httpurls.yml, with the following content -
- url: http://example.com
- url: http://example.net
..and in my control file, I had the construct -
my_urls = yaml(content: inspec.profile.file('httpurls.yml')).params
my_urls.each do |s|
  describe http(s['url']) do
    its('status') { should eq 200 }
  end
end
However, when I execute the compliance profile, I get an error - 'httpurls.yml not found' (I'm not sure about the exact error message, though). The following is the folder structure I had for my compliance profile.
What am I doing wrong?
Is there a better way to achieve what I am trying to do?
The secret is to use profile attributes, as defined near the bottom of this page:
First, create a profile attributes YML file. I name mine profile-attribute.yml.
Second, put your array of values in the YML file, like so:
urls:
  - http://example.com
  - http://example.net
Third, create an attribute at the top of your InSpec tests:
my_urls = attribute('urls', description: 'The URLs that I am validating.')
Fourth, use your attribute in your InSpec test:
my_urls.each do |s|
  # Each element is a plain URL string, so pass it directly.
  describe http(s) do
    its('status') { should eq 200 }
  end
end
Finally, when you call your InSpec test, point to your YML file using --attrs:
inspec exec mytest.rb --reporter=cli --attrs profile-attribute.yml
There is another way to do this using files (instead of the profile attributes and the --attrs flag). You can use JSON or YAML.
First, create the JSON and/or YAML file and put it in the files directory. A simple example of the JSON file might look like this:
{ "urls": ["https://www.google.com", "https://www.apple.com"] }
And a simple example of the YAML file might look like this:
urls:
  - https://www.google.com
  - https://www.apple.com
Second, include code at the top of your InSpec file to read and parse the JSON and/or YAML, like so:
jsoncontent = inspec.profile.file("tmp.json")
jsonparams = JSON.parse(jsoncontent)
jsonurls = jsonparams['urls']
yamlcontent = inspec.profile.file("tmp.yaml")
yamlparams = YAML.load(yamlcontent)
yamlurls = yamlparams['urls']
Third, use the variables in your InSpec tests, like so:
jsonurls.each do |jsonurl|
  describe http(jsonurl) do
    puts "json url is " + jsonurl
    its('status') { should eq 200 }
  end
end
yamlurls.each do |yamlurl|
  describe http(yamlurl) do
    puts "yaml url is " + yamlurl
    its('status') { should eq 200 }
  end
end
(NOTE: the puts line is for debugging.)
The result is what you would expect:
json url is https://www.google.com
json url is https://www.apple.com
yaml url is https://www.google.com
yaml url is https://www.apple.com
Profile: InSpec Profile (inspec-file-test)
Version: 0.1.0
Target: local://
http GET on https://www.google.com
✔ status should eq 200
http GET on https://www.apple.com
✔ status should eq 200
http GET on https://www.google.com
✔ status should eq 200
http GET on https://www.apple.com
✔ status should eq 200
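Because the files approach depends on the data file having the expected shape, it can help to check the file before running the profile. The following is a minimal Python sketch that runs outside InSpec, assuming a JSON file with a top-level "urls" array as in the example above. load_urls is a hypothetical helper, not part of InSpec.

```python
import json
from urllib.parse import urlparse


def load_urls(json_text: str) -> list:
    """Parse the JSON text and return its 'urls' list after basic checks."""
    data = json.loads(json_text)
    urls = data["urls"]
    if not isinstance(urls, list):
        raise ValueError("'urls' must be a list")
    for url in urls:
        if not isinstance(url, str):
            raise ValueError(f"not a string: {url!r}")
        parts = urlparse(url)
        # Require an http(s) scheme and a host part.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {url!r}")
    return urls


content = '{ "urls": ["https://www.google.com", "https://www.apple.com"] }'
for url in load_urls(content):
    print("url is", url)
```

A check like this catches a missing "urls" key or a malformed entry before the InSpec run reports a less obvious failure.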
Good morning,
I have the following set of virtual machines:
Generic Enablers Orion and Cygnus
Cygnus configuration is:
# Configuration file for apache-flume
# Copyright 2014 Telefonica Investigación y Desarrollo, S.A.U
# This file is part of fiware-connectors (FI-WARE project).
# cosmos-injector is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General
# Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any
# later version.
# cosmos-injector is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
# details.
# You should have received a copy of the GNU Affero General Public License along with fiware-connectors. If not, see
# http://www.gnu.org/licenses/.
# For those usages not covered by the GNU Affero General Public License please contact with iot_support at tid dot es
# Who to run cygnus as. Note that you may need to use root if you want
# to run cygnus in a privileged port (<1024)
# Where is the config folder
# Which is the config file
# Name of the agent. The name of the agent is not trivial, since it is the base for the Flume parameters
# naming conventions, e.g. it appears in .sources.http-source.channels=...
# Name of the logfile located at /var/log/cygnus. It is important to put the extension '.log' in order to the log rotation works properly
# Administration port. Must be unique per instance
# Polling interval (seconds) for the configuration reloading
# Copyright 2014 Telefónica Investigación y Desarrollo, S.A.U
# This file is part of fiware-connectors (FI-WARE project).
# fiware-connectors is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General
# Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any
# later version.
# fiware-connectors is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
# details.
# You should have received a copy of the GNU Affero General Public License along with fiware-connectors. If not, see
# http://www.gnu.org/licenses/.
# For those usages not covered by the GNU Affero General Public License please contact with iot_support at tid dot es
# To be put in APACHE_FLUME_HOME/conf/agent.conf
# General configuration template explaining how to setup a sink of each of the available types (HDFS, CKAN, MySQL).
# The next tree fields set the sources, sinks and channels used by Cygnus. You could use different names than the
# ones suggested below, but in that case make sure you keep coherence in properties names along the configuration file.
# Regarding sinks, you can use multiple types at the same time; the only requirement is to provide a channel for each
# one of them (this example shows how to configure 3 sink types at the same time). Even, you can define more than one
# sink of the same type and sharing the channel in order to improve the performance (this is like having
# multi-threading).
cygnusagent.sources = http-source
cygnusagent.sinks = mongo-sink
cygnusagent.channels = mongo-channel
# source configuration
# channel name where to write the notification events
cygnusagent.sources.http-source.channels = mongo-channel
# source class, must not be changed
cygnusagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# listening port the Flume source will use for receiving incoming notifications
cygnusagent.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.OrionRestHandler
# URL target
cygnusagent.sources.http-source.handler.notification_target = /notify
# Default service (service semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service = def_serv
# Default service path (service path semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service_path = def_servpath
# Number of channel re-injection retries before a Flume event is definitely discarded (-1 means infinite retries)
cygnusagent.sources.http-source.handler.events_ttl = 10
# Source interceptors, do not change
cygnusagent.sources.http-source.interceptors = ts gi
# Timestamp interceptor, do not change
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
# Destination extractor interceptor, do not change
cygnusagent.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.GroupingInterceptor$Builder
# Matching table for the destination extractor interceptor, put the right absolute path to the file if necessary
# See the doc/design/interceptors document for more details
cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf
# ============================================
# OrionMongoSink configuration
# channel name from where to read notification events
cygnusagent.sinks.mongo-sink.channel = mongo-channel
# sink class, must not be changed
cygnusagent.sinks.mongo-sink.type = com.telefonica.iot.cygnus.sinks.OrionMongoSink
# true if the grouping feature is enabled for this sink, false otherwise
cygnusagent.sinks.mongo-sink.enable_grouping = false
# the FQDN/IP address where the MySQL server runs (standalone case) or comma-separated list of FQDN/IP:port pairs where the MongoDB replica set members run
cygnusagent.sinks.mongo-sink.mongo_host =
# a valid user in the MongoDB server
cygnusagent.sinks.mongo-sink.mongo_username =
# password for the user above
cygnusagent.sinks.mongo-sink.mongo_password =
# prefix for the MongoDB databases
cygnusagent.sinks.mongo-sink.db_prefix = hvds_
# prefix for the MongoDB collections
cygnusagent.sinks.mongo-sink.collection_prefix = hvds_
# true is collection names are based on a hash, false for human redable collections
cygnusagent.sinks.mongo-sink.should_hash = false
# specify if the sink will use a single collection for each service path, for each entity or for each attribute
cygnusagent.sinks.mongo-sink.data_model = collection-per-entity
# how the attributes are stored, either per row either per column (row, column)
cygnusagent.sinks.mongo-sink.attr_persistence = column
# mongo-channel configuration
# channel type (must not be changed)
cygnusagent.channels.mongo-channel.type = memory
# capacity of the channel
cygnusagent.channels.mongo-channel.capacity = 1000
# amount of bytes that can be sent per transaction
cygnusagent.channels.mongo-channel.transactionCapacity = 100
When performing the following steps:
I subscribe to the sensor whose data I want to persist:
(curl -s -S --header 'Content-Type: application/json' --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
"entities": [
"type": "Sensor",
"isPattern": "false",
"id": "sensor003"
"attributes": [
"reference": "http://localhost:5050/notify",
"duration": "P1M",
"notifyConditions": [
"condValues": [
Then I create or update that data:
(curl -s -S --header 'Content-Type: application/json' --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
"contextElements": [
"type": "Sensor",
"isPattern": "false",
"id": "sensor003",
"attributes": [
"updateAction": "APPEND"
The expected result is obtained, but when we access the database on VM B to check whether the data has been created and saved, we see that it has not:
admin (empty)
local 0.078GB
localhost (empty)
If we go to the database on VM A, we can see that the databases have been created there:
admin (empty)
hvds_def_serv 0.078GB
hvds_qsg 0.078GB
local 0.078GB
orion 0.078GB
Could you tell me how this can be solved?
Thank you in advance for your help
I subscribe to sensor005:
(curl -s -S --header 'Content-Type: application/json' --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
"entities": [{
"type": "Sensor",
"isPattern": "false",
"id": "sensor005"
"attributes": [
"reference": "http://localhost:5050/notify",
"duration": "P1M",
"notifyConditions": [{
"type": "ONCHANGE",
"condValues": [
"throttling": "PT1S"
Then I edit the data:
(curl -s -S --header 'Content-Type: application/json' --header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
"contextElements": [
"type": "Sensor",
"isPattern": "false",
"id": "sensor005",
"attributes": [
"value": {
"tiempo": [
"kw": [
"updateAction": "APPEND"
These are the two logs generated:
/var/log/cygnus/cygnus.log with DEBUG