OPC UA input plugin for Telegraf reads successfully but does not show any data - opc-ua

My plan is to use Telegraf to read OPC UA data from a WAGO PLC as an input and send the data as output to my InfluxDB database.
Other plugins, such as the MQTT plugin, do work, so I can verify that InfluxDB is set up correctly.
In my telegraf.conf:
# Retrieve data from OPCUA devices
[[inputs.opcua]]
name = "opcua"
endpoint = "opc.tcp://192.168.178.97:4840"
connect_timeout = "10s"
request_timeout = "5s"
security_policy = "None"
security_mode = "None"
auth_method = "UserName"
username = "admin"
password = "wago"
[[inputs.opcua.group]]
namespace ="4"
identifier_type ="s"
nodes = [
{name="IIoTgateway_xHeartbeat", namespace="4", identifier_type="s", identifier="|var|WAGO 750-8212 PFC200 G2 2ETH RS.Application.GVL_STATUS_PRG.IIoTgateway_xHeartbeat"},
]
Using the tool UA Expert, I can verify that the xHeartbeat changes every 1s.
The logs of the Telegraf plugin also look good.
Logging into the database (InfluxDB), I see the variable but no changes.
What is wrong here?
In the Telegraf docs there is a statement about setting the namespace index to a number from 0 to 3.
Could this be the problem, since my namespace index is 4?
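One way to double-check the index is to read the server's namespace array and see which entry the WAGO variables belong to. A minimal sketch, assuming the python-opcua package (the package choice is my assumption, not part of Telegraf), using the endpoint and credentials from the config below:

from opcua import Client  # pip install opcua (python-opcua, assumed here)

client = Client("opc.tcp://192.168.178.97:4840")
client.set_user("admin")
client.set_password("wago")
client.connect()
try:
    # the position of each URI in this list is the namespace index used in node ids
    for index, uri in enumerate(client.get_namespace_array()):
        print(index, uri)
finally:
    client.disconnect()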

The full config file:
[global_tags]
[agent]
interval = "500ms"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = "0s"
hostname = ""
omit_hostname = false
###############################################################################
# OUTPUT PLUGINS #
###############################################################################
[[outputs.influxdb_v2]]
urls = ["http://${DOCKER_INFLUXDB_INIT_HOST}:${DOCKER_INFLUXDB_INIT_PORT}"]
token = "$DOCKER_INFLUXDB_INIT_ADMIN_TOKEN"
organization = "$DOCKER_INFLUXDB_INIT_ORG"
bucket = "$DOCKER_INFLUXDB_INIT_BUCKET"
###############################################################################
# AGGREGATOR PLUGINS #
###############################################################################
###############################################################################
# INPUT PLUGINS #
###############################################################################
# Retrieve data from OPCUA devices
[[inputs.opcua]]
name = "opcua"
endpoint = "opc.tcp://192.168.178.97:4840"
connect_timeout = "10s"
request_timeout = "5s"
security_policy = "None"
security_mode = "None"
auth_method = "UserName"
username = "admin"
password = "wago"
[[inputs.opcua.group]]
namespace ="4"
identifier_type ="s"
nodes = [
{name="IIoTgateway_xHeartbeat", namespace="4", identifier_type="s", identifier="|var|WAGO 750-8212 PFC200 G2 2ETH RS.Application.GVL_STATUS_PRG.IIoTgateway_xHeartbeat"},
{name="IIoTgateway_xDoorSwitch", namespace="4", identifier_type="s", identifier="|var|WAGO 750-8212 PFC200 G2 2ETH RS.Application.GVL_IIOT_BOX_INPUTS.IIoTgateway_xDoorSwitch"},
]
[[outputs.file]]
## Files to write to; "stdout" is a specially handled file.
files = ["./tmp/metrics.out"]
data_format = "influx"
rotation_interval = "24h"
rotation_max_archives = 10
After checking the ./tmp/metrics.out file by logging into the container, the data comes in correctly.
Make sure that in the Flux query:
from(bucket: "telegrafmqtt")
|> range(start: v.timeRangeStart, stop: v.timeRangeStop)
|> filter(fn: (r) => r["_measurement"] == "opcua")
|> filter(fn: (r) => r["_field"] == "IIoTgateway_xHeartbeat")
|> toInt()
toInt() finally did the trick.
Now I see the 1-second heartbeat.
Last but not least: do not use the standard credentials admin/wago, they are only for testing. Also make sure to secure the OPC UA connection.

Related

Remove trailing bits from hex pyModBus

I want to build a function that sends a request from Modbus to serial in hex. I more or less have a working function but have two issues.
Issue 1
[b'\x06', b'\x1c', b'\x00!', b'\r', b'\x1e', b'\x1d\xd3', b'\r', b'\n', b'\x1e', b'\x1d']
I can't remove the b'\r', b'\n' part using the .split('\r \n') method, since it's not a string.
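For reference, since the value is a list of bytes objects rather than one string, the separator for split has to be bytes as well, or the CR/LF elements can simply be filtered out of the list. A quick sketch:

frame = [b'\x06', b'\x1c', b'\x00!', b'\r', b'\x1e', b'\x1d\xd3', b'\r', b'\n', b'\x1e', b'\x1d']
raw = b''.join(frame)                                  # collapse the list into a single bytes object
print(raw.split(b'\r\n'))                              # bytes need a bytes separator such as b'\r\n'
print([b for b in frame if b not in (b'\r', b'\n')])   # or drop the CR/LF elements from the list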
Issue 2
When getting a value from holding register 40 (33) and trying to use the .to_bytes() method, I keep getting b'\x00!', b'\r', but I'm expecting b'\x21'.
r = client.read_holding_registers(40)
re = r.registers[0]
req = re.to_bytes(2, 'big')
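Note that b'\x00!' is simply how Python displays the two bytes b'\x00\x21', because 0x21 is the ASCII code for !. A quick check:

value = 33
print(value.to_bytes(2, 'big'))                 # b'\x00!'  (same bytes as b'\x00\x21')
print(value.to_bytes(1, 'big'))                 # b'!'      (same byte as b'\x21')
print(value.to_bytes(2, 'big') == b'\x00\x21')  # True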
My functions to generate my request and to send it through pySerial:
def scanned_code():
    code = client.read_holding_registers(0)
    # code2= client.re
    r = code.registers[0]
    return r

def send_request(data):
    """ Takes input from create_request() and sends data to serial port"""
    try:
        for i in range(data):
            serial_client.write(data[i])
            # serial_client.writelines(data[i])
    except:
        print('could not send the packet <<<--------------------')

def create_request(job):
    """ Request type is 33 looks for job
    [06]
    [1c]
    req=33[0d][0a]
    job=30925[0d][0a][1e]
    [1d]
    """
    r = client.read_holding_registers(40)
    re = r.registers[0]
    req = re.to_bytes(2, 'big')
    num = job.to_bytes(2, 'big')
    data = [
        b'\x06',
        b'\x1C',
        req,
        b'\x0D',
        b'\x1E',
        num,
        b'\x0D',
        b'\x0A',
        b'\x1E',
        b'\x1D'
    ]
    print(data)

while True:
    # verify order_trigger() is True.
    while order_trigger() != False:
        print('inside while loop')
        # set flag coil back to 0
        reset_trigger()
        # get Job no.
        job = scanned_code()
        # check for JOB No. dif. than 0
        if job != 0:
            print(scanned_code())
            send_request(create_request(job))
            # send job request to host to get job data
            # send_request()
            # if TRUE send job request by serial to DVI client
            # get job request data
            # translate job request data to modbus
            # send data to plc
        else:
            print(' no scanned code')
            break
        time.sleep(INTERNAL_SLEEP_TIME)
    print('outside loop')
    time.sleep(EXTERNAL_SLEEP_TIME)
As an additional question, is this the proper way of doing things?

Problems running Cygnus with Postgresql

So I have installed Cygnus, and in the simple test configuration case, which I took from here (https://github.com/telefonicaid/fiware-cygnus/blob/master/cygnus-ngsi/README.md), everything works fine.
But I need Postgresql as a backend for my application.
For this I adjusted the agent_1.conf file with all postgresql parameters found from http://fiware-cygnus.readthedocs.io/en/latest/cygnus-ngsi/installation_and_administration_guide/ngsi_agent_conf/
cygnus-ngsi.sources = http-source
cygnus-ngsi.sinks = postgresql-sink
cygnus-ngsi.channels = postgresql-channel
cygnus-ngsi.sources.http-source.channels = hdfs-channel mysql-channel ckan-channel mongo-channel sth-channel kafka-channel dynamo-channel postgresql-channel
cygnus-ngsi.sources.http-source.type = org.apache.flume.source.http.HTTPSource
cygnus-ngsi.sources.http-source.port = 5050
cygnus-ngsi.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
cygnus-ngsi.sources.http-source.handler.notification_target = /notify
cygnus-ngsi.sources.http-source.handler.default_service = default
cygnus-ngsi.sources.http-source.handler.default_service_path = /
cygnus-ngsi.sources.http-source.interceptors = ts gi
cygnus-ngsi.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
cygnus-ngsi.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf
cygnus-ngsi.sinks.postgresql-sink.channel = postgresql-channel
cygnus-ngsi.sinks.postgresql-sink.type = com.telefonica.iot.cygnus.sinks.NGSIPostgreSQLSink
cygnus-ngsi.sinks.postgresql-sink.postgresql_host = 127.0.0.1
cygnus-ngsi.sinks.postgresql-sink.postgresql_port = 5432
cygnus-ngsi.sinks.postgresql-sink.postgresql_database = myUser
cygnus-ngsi.sinks.postgresql-sink.postgresql_username = mydb
cygnus-ngsi.sinks.postgresql-sink.postgresql_password = xxxx
cygnus-ngsi.sinks.postgresql-sink.attr_persistence = row
cygnus-ngsi.sinks.postgresql-sink.batch_size = 100
cygnus-ngsi.sinks.postgresql-sink.batch_timeout = 30
cygnus-ngsi.sinks.postgresql-sink.batch_ttl = 10
# postgresql-channel configuration
cygnus-ngsi.channels.postgresql-channel.type = memory
cygnus-ngsi.channels.postgresql-channel.capacity = 1000
cygnus-ngsi.channels.postgresql-channel.transactionCapacity = 100
I didn't really find any information about other files I am supposed to change and am not really sure if all the parameters are correct.
I also tried the sample configuration from here: http://fiware-cygnus.readthedocs.io/en/latest/cygnus-ngsi/flume_extensions_catalogue/ngsi_postgresql_sink/index.html
Cygnus seems to start correctly, but if I try to send a notification I get connection refused.
Before doing anything, please create the database, the user and the password in PostgreSQL.
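As a minimal sketch, the role and database expected by the agent_1.conf below can be created like this (assuming psycopg2 and a local superuser connection; adapt the superuser credentials to your installation):

import psycopg2

conn = psycopg2.connect(host='127.0.0.1', port=5432, user='postgres', password='postgres')
conn.autocommit = True  # CREATE DATABASE cannot run inside a transaction
with conn.cursor() as cur:
    cur.execute("CREATE USER cygnus WITH PASSWORD 'cygnusdb'")
    cur.execute("CREATE DATABASE cygnusdb OWNER cygnus")
conn.close()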
The cygnus configuration is like this one. The /etc/cygnus/conf/cygnus_instance_1.conf file:
CYGNUS_USER=cygnus
CONFIG_FOLDER=/usr/cygnus/conf
CONFIG_FILE=/usr/cygnus/conf/agent_1.conf
AGENT_NAME=cygnus-ngsi
LOGFILE_NAME=cygnus.log
ADMIN_PORT=8081
POLLING_INTERVAL=30
So, the other file /usr/cygnus/conf/agent_1.conf is like this one (please change PostgreSQL parameters):
cygnus-ngsi.sources = http-source
cygnus-ngsi.sinks = postgresql-sink
cygnus-ngsi.channels = postgresql-channel
cygnus-ngsi.sources.http-source.channels = postgresql-channel
cygnus-ngsi.sources.http-source.type = org.apache.flume.source.http.HTTPSource
cygnus-ngsi.sources.http-source.port = 5050
cygnus-ngsi.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
cygnus-ngsi.sources.http-source.handler.notification_target = /notify
cygnus-ngsi.sources.http-source.handler.default_service = default
cygnus-ngsi.sources.http-source.handler.default_service_path = /
cygnus-ngsi.sources.http-source.handler.events_ttl = 10
cygnus-ngsi.sources.http-source.interceptors = ts gi
cygnus-ngsi.sources.http-source.interceptors.ts.type = timestamp
cygnus-ngsi.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
#cygnus-ngsi.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf
# =============================================
# postgresql-channel configuration
# channel type (must not be changed)
cygnus-ngsi.channels.postgresql-channel.type = memory
# capacity of the channel
cygnus-ngsi.channels.postgresql-channel.capacity = 1000
# amount of bytes that can be sent per transaction
cygnus-ngsi.channels.postgresql-channel.transactionCapacity = 100
# ============================================
# NGSIPostgreSQLSink configuration
# channel name from where to read notification events
cygnus-ngsi.sinks.postgresql-sink.channel = postgresql-channel
# sink class, must not be changed
cygnus-ngsi.sinks.postgresql-sink.type = com.telefonica.iot.cygnus.sinks.NGSIPostgreSQLSink
# true applies the new encoding, false applies the old encoding.
# cygnus-ngsi.sinks.postgresql-sink.enable_encoding = false
# true if the grouping feature is enabled for this sink, false otherwise
cygnus-ngsi.sinks.postgresql-sink.enable_grouping = false
# true if name mappings are enabled for this sink, false otherwise
cygnus-ngsi.sinks.postgresql-sink.enable_name_mappings = false
# true if lower case is wanted to forced in all the element names, false otherwise
# cygnus-ngsi.sinks.postgresql-sink.enable_lowercase = false
# the FQDN/IP address where the PostgreSQL server runs
cygnus-ngsi.sinks.postgresql-sink.postgresql_host = 127.0.0.1
# the port where the PostgreSQL server listens for incoming connections
cygnus-ngsi.sinks.postgresql-sink.postgresql_port = 5432
# the name of the postgresql database
cygnus-ngsi.sinks.postgresql-sink.postgresql_database = cygnusdb
# a valid user in the PostgreSQL server
cygnus-ngsi.sinks.postgresql-sink.postgresql_username = cygnus
# password for the user above
cygnus-ngsi.sinks.postgresql-sink.postgresql_password = cygnusdb
# how the attributes are stored, either per row or per column (row, column)
cygnus-ngsi.sinks.postgresql-sink.attr_persistence = row
# select the data_model: dm-by-service-path or dm-by-entity
cygnus-ngsi.sinks.postgresql-sink.data_model = dm-by-entity
# number of notifications to be included within a processing batch
cygnus-ngsi.sinks.postgresql-sink.batch_size = 1
# timeout for batch accumulation
cygnus-ngsi.sinks.postgresql-sink.batch_timeout = 30
# number of retries upon persistence error
cygnus-ngsi.sinks.postgresql-sink.batch_ttl = 0
# true enables cache, false disables cache
cygnus-ngsi.sinks.postgresql-sink.backend.enable_cache = true

Not able to connect to MongoDB with Auth - FIWARE Cygnus

We have been trying today to put a Cygnus container in production and we haven't been able to connect it to MongoDB. In our case, we have installed MongoDB with the Auth flag, and we created different users in order to test that everything works.
However, we didn't find a way to connect Cygnus. It tries to connect to the sth_default database, but it requires enough privileges to create other databases.
The workaround was to start the MongoDB service without the Auth flag, allowing us to check that everything worked when accessing as the admin user without logging in, which is not the way we would like to work, since it is insecure.
Are we missing anything?
Thanks in advance!
UPDATE
I'm adding here the Cygnus agent.conf file. Moreover, I'm using the Docker Image (docker-ngsi: https://hub.docker.com/r/fiware/cygnus-ngsi/) in its latest version.
cygnus-ngsi.sources = http-source
# Using both, Mongo and Postgres sinks
cygnus-ngsi.sinks = mongo-sink postgresql-sink
cygnus-ngsi.channels = mongo-channel postgresql-channel
cygnus-ngsi.sources.http-source.type = org.apache.flume.source.http.HTTPSource
cygnus-ngsi.sources.http-source.channels = mongo-channel postgresql-channel
cygnus-ngsi.sources.http-source.port = 5050
cygnus-ngsi.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.NGSIRestHandler
cygnus-ngsi.sources.http-source.handler.notification_target = /notify
cygnus-ngsi.sources.http-source.handler.default_service = default
cygnus-ngsi.sources.http-source.handler.default_service_path = /
cygnus-ngsi.sources.http-source.interceptors = ts gi
cygnus-ngsi.sources.http-source.interceptors.ts.type = timestamp
cygnus-ngsi.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.NGSIGroupingInterceptor$Builder
cygnus-ngsi.sources.http-source.interceptors.gi.grouping_rules_conf_file = /opt/apache-flume/conf/grouping_rules.conf
cygnus-ngsi.sinks.mongo-sink.type = com.telefonica.iot.cygnus.sinks.NGSIMongoSink
cygnus-ngsi.sinks.mongo-sink.channel = mongo-channel
#cygnus-ngsi.sinks.mongo-sink.enable_encoding = false
#cygnus-ngsi.sinks.mongo-sink.enable_grouping = false
#cygnus-ngsi.sinks.mongo-sink.enable_name_mappings = false
#cygnus-ngsi.sinks.mongo-sink.enable_lowercase = false
#cygnus-ngsi.sinks.mongo-sink.data_model = dm-by-entity
#cygnus-ngsi.sinks.mongo-sink.attr_persistence = row
cygnus-ngsi.sinks.mongo-sink.mongo_hosts = MyIP:MyPort
cygnus-ngsi.sinks.mongo-sink.mongo_username = MyUsername
cygnus-ngsi.sinks.mongo-sink.mongo_password = MyPassword
#cygnus-ngsi.sinks.mongo-sink.db_prefix = sth_
#cygnus-ngsi.sinks.mongo-sink.collection_prefix = sth_
#cygnus-ngsi.sinks.mongo-sink.batch_size = 1
#cygnus-ngsi.sinks.mongo-sink.batch_timeout = 30
#cygnus-ngsi.sinks.mongo-sink.batch_ttl = 10
#cygnus-ngsi.sinks.mongo-sink.data_expiration = 0
#cygnus-ngsi.sinks.mongo-sink.collections_size = 0
#cygnus-ngsi.sinks.mongo-sink.max_documents = 0
#cygnus-ngsi.sinks.mongo-sink.ignore_white_spaces = true
Thanks
The following configuration lines are missing:
cygnus-ngsi.sinks.mongo-sink.type = com.telefonica.iot.cygnus.sinks.NGSIMongoSink
cygnus-ngsi.sinks.mongo-sink.channel = mongo-channel
I.e. you have to specify the Java class implementing the MongoDB sink, and the channel that connects the source with such a sink.
If the configuration you are showing is the default one when Cygnus is installed through Docker, then the development team must be warned.
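When auth is enabled it can also help to confirm, outside Cygnus, that the configured user can actually reach the databases Cygnus will write to. A quick sketch with pymongo (host, credentials and authSource are placeholders following the notation above):

from pymongo import MongoClient

client = MongoClient('mongodb://MyUsername:MyPassword@MyIP:MyPort/?authSource=admin')
# listing databases needs privileges similar to the ones Cygnus requires
print(client.list_database_names())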

get_process_lines in liquidsoap 1.3.0

I've just updated Liquidsoap to 1.3.0 and now get_process_lines does not return anything.
def get_request() =
  # Get the URI
  lines = get_process_lines("curl http://localhost:3000/api/v1/liquidsoap/next/my-radio")
  log("liquidsoap curl returns #{lines}")
  uri = list.hd(default="",lines)
  log("liquidsoap will try and play #{uri}")
  # Create a request
  request.create(uri)
end
I read in the CHANGELOG:
- Moved get_process_lines and get_process_output to utils.liq, added optional env parameter
Does it mean I have to do something to use utils.liq in my script now?
The full script is as follows:
set("log.file",false)
set("log.stdout",true)
set("log.level",3)
def apply_metadata(m) =
title = m["title"]
artist = m["artist"]
log("Now playing: #{title} by #{artist}")
end
# Our custom request function
def get_request() =
# Get the URI
lines = get_process_lines("curl http://localhost:3000/api/v1/liquidsoap/next/my-radio")
log("liquidsoap curl returns #{lines}")
uri = list.hd(default="",lines)
log("liquidsoap will try and play #{uri}")
# Create a request
request.create(uri)
end
def my_safe(s) =
security = sine()
fallback(track_sensitive=false,[s,security])
end
s = request.dynamic(id="s",get_request)
s = on_metadata(apply_metadata,s)
s = crossfade(s)
s = my_safe(s)
# We output the stream to an icecast
# server, in ogg/vorbis format.
log("liquidsoap starting")
output.icecast(
%mp3(id3v2=true,bitrate=128,samplerate=44100),
host = "localhost",
port = 8000,
password = "PASSWORD",
mount = "myradio",
genre="various",
url="http://www.myradio.fr",
description="My Radio",
s
)
Of course the API is working
$ curl http://localhost:3000/api/v1/liquidsoap/next/my-radio
annotate:title="Chamakay",artist="Blood Orange",album="Cupid Deluxe":http://localhost/stream/3.mp3
A simpler example:
lines = get_process_lines("echo hi")
log("lines = #{lines}")
line = list.hd(default="",lines)
log("line = #{line}")
returns the following logs
2017/05/05 15:24:42 [lang:3] lines = []
2017/05/05 15:24:42 [lang:3] line =
Many thanks in advance for your help!
geoffroy
The issue was fixed in liquidsoap 1.3.1
Fixed:
Fixed run_process, get_process_lines, get_process_output when compiling with OCaml <= 4.03 (#437, #439)
https://github.com/savonet/liquidsoap/blob/1.3.1/CHANGES#L12

Match a running ipython notebook to a process

My server runs many long-running notebooks, and I'd like to monitor the notebooks' memory.
Is there a way to match between the pid or process name and a notebook?
Since the question is about monitoring notebooks' memory, I've written a complete example showing the memory consumption of the running notebooks. It is based on the excellent answer by @jcb91 and a few other answers (1, 2, 3, 4).
import json
import os
import os.path
import posixpath
import subprocess
import urllib2
import pandas as pd
import psutil


def show_notebooks_table(host, port):
    """Show table with info about running jupyter notebooks.

    Args:
        host: host of the jupyter server.
        port: port of the jupyter server.

    Returns:
        DataFrame with rows corresponding to running notebooks and following columns:
            * index: notebook kernel id.
            * path: path to notebook file.
            * pid: pid of the notebook process.
            * memory: notebook memory consumption in percentage.
    """
    notebooks = get_running_notebooks(host, port)
    prefix = long_substr([notebook['path'] for notebook in notebooks])
    df = pd.DataFrame(notebooks)
    df = df.set_index('kernel_id')
    df.index.name = prefix
    df.path = df.path.apply(lambda x: x[len(prefix):])
    df['pid'] = df.apply(lambda row: get_process_id(row.name), axis=1)
    # same notebook can be run in multiple processes
    df = expand_column(df, 'pid')
    df['memory'] = df.pid.apply(memory_usage_psutil)
    return df.sort_values('memory', ascending=False)


def get_running_notebooks(host, port):
    """Get kernel ids and paths of the running notebooks.

    Args:
        host: host at which the notebook server is listening. E.g. 'localhost'.
        port: port at which the notebook server is listening. E.g. 8888.
        username: name of the user who runs the notebooks.

    Returns:
        list of dicts {kernel_id: notebook kernel id, path: path to notebook file}.
    """
    # find which kernel corresponds to which notebook
    # by querying the notebook server api for sessions
    sessions_url = posixpath.join('http://%s:%d' % (host, port), 'api', 'sessions')
    response = urllib2.urlopen(sessions_url).read()
    res = json.loads(response)
    notebooks = [{'kernel_id': notebook['kernel']['id'],
                  'path': notebook['notebook']['path']} for notebook in res]
    return notebooks


def get_process_id(name):
    """Return process ids found by (partial) name or regex.

    Source: https://stackoverflow.com/a/44712205/304209.

    >>> get_process_id('kthreadd')
    [2]
    >>> get_process_id('watchdog')
    [10, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61] # ymmv
    >>> get_process_id('non-existent process')
    []
    """
    child = subprocess.Popen(['pgrep', '-f', name], stdout=subprocess.PIPE, shell=False)
    response = child.communicate()[0]
    return [int(pid) for pid in response.split()]


def memory_usage_psutil(pid=None):
    """Get memory usage percentage by current process or by process specified by id, like in top.

    Source: https://stackoverflow.com/a/30014612/304209.

    Args:
        pid: pid of the process to analyze. If None, analyze the current process.

    Returns:
        memory usage of the process, in percentage like in top, values in [0, 100].
    """
    if pid is None:
        pid = os.getpid()
    process = psutil.Process(pid)
    return process.memory_percent()


def long_substr(strings):
    """Find longest common substring in a list of strings.

    Source: https://stackoverflow.com/a/2894073/304209.

    Args:
        strings: list of strings.

    Returns:
        longest substring which is found in all of the strings.
    """
    substr = ''
    if len(strings) > 1 and len(strings[0]) > 0:
        for i in range(len(strings[0])):
            for j in range(len(strings[0])-i+1):
                if j > len(substr) and all(strings[0][i:i+j] in x for x in strings):
                    substr = strings[0][i:i+j]
    return substr


def expand_column(dataframe, column):
    """Transform iterable column values into multiple rows.

    Source: https://stackoverflow.com/a/27266225/304209.

    Args:
        dataframe: DataFrame to process.
        column: name of the column to expand.

    Returns:
        copy of the DataFrame with the following updates:
            * for rows where column contains only 1 value, keep them as is.
            * for rows where column contains a list of values, transform them
              into multiple rows, each of which contains one value from the list in column.
    """
    tmp_df = dataframe.apply(
        lambda row: pd.Series(row[column]), axis=1).stack().reset_index(level=1, drop=True)
    tmp_df.name = column
    return dataframe.drop(column, axis=1).join(tmp_df)
Here is an example output of show_notebooks_table('localhost', 8888):
I came here looking for the simple answer to this question, so I'll post it for anyone else looking.
import os
os.getpid()
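From inside a notebook, that pid can be fed straight to psutil to check the memory of the kernel the cell is running in (a small sketch, assuming psutil is installed):

import os
import psutil

proc = psutil.Process(os.getpid())
print(proc.pid, proc.memory_info().rss / 1024 ** 2, 'MiB')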
This is possible, although I could only think of the rather hackish solution I outline below. In summary:
1. Get the ports each kernel (id) is listening on from the corresponding json connection files residing in the server's security directory.
2. Parse the output of a call to netstat to determine which pid is listening to the ports found in step 1.
3. Query the server's sessions url to find which kernel id maps to which session, and hence to which notebook. See the ipython wiki for the api. Although not all of it works for me, running IPython 2.1.0, the sessions url does.
I suspect there is a much simpler way, but I'm not sure as yet where to find it.
import glob
import os.path
import posixpath
import re
import json
import subprocess
import urllib2

# the url and port at which your notebook server listens
server_path = 'http://localhost'
server_port = 8888

# the security directory of the notebook server, containing its connections files
server_sec_dir = 'C:/Users/Josh/.ipython/profile_default/security/'

# part 1 : open all the connection json files to find their port numbers
kernels = {}
for json_path in glob.glob(os.path.join(server_sec_dir, 'kernel-*.json')):
    control_port = json.load(open(json_path, 'r'))['control_port']
    key = os.path.basename(json_path)[7:-5]
    kernels[control_port] = {'control_port': control_port, 'key': key}

# part2 : get netstat info for which processes use which tcp ports
netstat_ouput = subprocess.check_output(['netstat', '-ano'])

# parse the netstat output to map ports to PIDs
netstat_regex = re.compile(
    "^\s+\w+\s+"              # protocol word
    "\d+(\.\d+){3}:(\d+)\s+"  # local ip:port
    "\d+(\.\d+){3}:(\d+)\s+"  # foreign ip:port
    "LISTENING\s+"            # connection state
    "(\d+)$"                  # PID
)
for line in netstat_ouput.splitlines(False):
    match = netstat_regex.match(line)
    if match and match.lastindex == 5:
        port = int(match.group(2))
        if port in kernels:
            pid = int(match.group(5))
            kernels[port]['pid'] = pid

# reorganize kernels to use 'key' as keys
kernels = {kernel['key']: kernel for kernel in kernels.values()}

# part 3 : find which kernel corresponds to which notebook
# by querying the notebook server api for sessions
sessions_url = posixpath.join('%s:%d' % (server_path, server_port),
                              'api', 'sessions')
response = urllib2.urlopen(sessions_url).read()
for session in json.loads(response):
    key = session['kernel']['id']
    if key in kernels:
        nb_path = os.path.join(session['notebook']['path'],
                               session['notebook']['name'])
        kernels[key]['nb_path'] = nb_path

# now do what you will with the dict. I just print a pretty list version:
print json.dumps(kernels.values(), sort_keys=True, indent=4)
outputs (for me, at the moment):
[
    {
        "key": "9142896a-34ca-4c01-bc71-e5709652cac5",
        "nb_path": "2015/2015-01-16\\serhsdh.ipynb",
        "pid": 11436,
        "port": 56173
    },
    {
        "key": "1ddedd95-5673-45a6-b0fb-a3083debb681",
        "nb_path": "Untitled0.ipynb",
        "pid": 11248,
        "port": 52191
    },
    {
        "key": "330343dc-ae60-4f5c-b9b8-e5d05643df19",
        "nb_path": "ipynb\\temp.ipynb",
        "pid": 4680,
        "port": 55446
    },
    {
        "key": "888ad49b-5729-40c8-8d53-0e025b03ecc6",
        "nb_path": "Untitled2.ipynb",
        "pid": 7584,
        "port": 55401
    },
    {
        "key": "26d9ddd2-546a-40b4-975f-07403bb4e048",
        "nb_path": "Untitled1.ipynb",
        "pid": 10916,
        "port": 55351
    }
]
Adding to Dennis Golomazov's answer to:
Make the code compatible with Python 3
Allow logging into a password-protected session
I replaced the get_running_notebooks function with this one (source):
import requests
import posixpath
import json

def get_running_notebooks(host, port, password=''):
    """
    Get kernel ids and paths of the running notebooks.

    Args:
        host: host at which the notebook server is listening. E.g. 'localhost'.
        port: port at which the notebook server is listening. E.g. 8888.

    Returns:
        list of dicts {kernel_id: notebook kernel id, path: path to notebook file}.
    """
    BASE_URL = 'http://{0}:{1}/'.format(host, port)

    # Get the cookie data
    s = requests.Session()
    url = BASE_URL + 'login?next=%2F'
    resp = s.get(url)
    xsrf_cookie = resp.cookies['_xsrf']

    # Login with the password
    params = {'_xsrf': xsrf_cookie, 'password': password}
    res = s.post(url, data=params)

    # Find which kernel corresponds to which notebook
    # by querying the notebook server api for sessions
    url = posixpath.join(BASE_URL, 'api', 'sessions')
    ret = s.get(url)
    # print('Status code:', ret.status_code)

    # Get the notebook list
    res = json.loads(ret.text)
    notebooks = [{'kernel_id': notebook['kernel']['id'],
                  'path': notebook['notebook']['path']} for notebook in res]
    return notebooks
Here is a solution that solves the access issue mentioned in other posts by first obtaining the access-token via jupyter lab list.
import requests
import psutil
import re
import os
import pandas as pd
# get all processes that have a ipython kernel and get kernel id
dfp = pd.DataFrame({'p': [p for p in psutil.process_iter() if 'ipykernel_launcher' in ' '.join(p.cmdline())]})
dfp['kernel_id'] = dfp.p.apply(lambda p: re.findall(r".+kernel-(.+)\.json", ' '.join(p.cmdline()))[0])
# get url to jupyter server with token and open once to get access
urlp = requests.utils.parse_url([i for i in os.popen("jupyter lab list").read().split() if 'http://' in i][0])
s = requests.Session()
res = s.get(urlp)
# read notebook list into dataframe and get kernel id
resapi = s.get(f'http://{urlp.netloc}/api/sessions')
dfn = pd.DataFrame(resapi.json())
dfn['kernel_id'] = dfn.kernel.apply(lambda item: item['id'])
# merge the process and notebook dataframes
df = dfn.merge(dfp, how = 'inner')
# add process info as desired
df['pid'] = df.p.apply(lambda p: p.pid)
df['mem [%]'] = df.p.apply(lambda p: p.memory_percent())
df['cpu [%]'] = df.p.apply(lambda p: p.cpu_percent())
df['status'] = df.p.apply(lambda p: p.status())
# reduce to columns of interest and sort
dfout = df.loc[:,['name','pid','mem [%]', 'cpu [%]','status']].sort_values('mem [%]', ascending=False)
I have asked a similar question, and in order to make it a duplicate I "reverse engineered" Dennis Golomazov's answer with a focus on matching notebooks in a generic way (also manually).
1. Get the json from the api/sessions path of your Jupyter server (e.g. https://localhost:8888/api/sessions in most cases).
2. Parse the json. It is a sequence of session objects (nested dicts if parsed with the json module). Their .path attributes point to the notebook file, and .kernel.id is the kernel id (which is part of the path passed as an argument to python -m ipykernel_launcher, in my case {PATH}/python -m ipykernel_launcher -f {HOME}/.local/share/jupyter/runtime/kernel-{ID}.json).
3. Find the PID of the process run with that path (e.g. with pgrep -f {ID}), as in the sketch below.
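Putting the three steps together, a minimal sketch (assuming the server needs no token or password and that pgrep is available; adjust the URL to your setup):

import json
import subprocess
import urllib.request

SESSIONS_URL = 'http://localhost:8888/api/sessions'

sessions = json.loads(urllib.request.urlopen(SESSIONS_URL).read())
for session in sessions:
    kernel_id = session['kernel']['id']
    path = session['path']
    # pgrep -f matches the kernel id inside the "-f .../kernel-<id>.json" argument
    result = subprocess.run(['pgrep', '-f', kernel_id], capture_output=True, text=True)
    pids = [int(pid) for pid in result.stdout.split()]
    print(path, pids)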