How to calculate bandwidth consumption in pjsip - bandwidth

I see that Jitsi (which uses jain-sip) can display bandwidth information while a call is in progress.
How can I get bandwidth information in pjsip? Is there a callback for it, or how can I calculate it myself?
P.S.: By bandwidth I mean the actual bandwidth used by the media layer.
Update
When calling pjsua_call_dump, I receive the info below. Is this enough to calculate it? If yes, how?
Call time: 00h:00m:06s, 1st res in 1592 ms, conn in 6512ms
#0 audio PCMU #8kHz, sendrecv, peer=130.148.79.127:33380
SRTP status: Not active Crypto-suite: (null)
RX pt=0, last update:00h:00m:01.891s ago
total 337pkt 53.9KB (67.4KB +IP hdr) #avg=62.5Kbps/78.1Kbps
pkt loss=2 (0.6%), discrd=0 (0.0%), dup=0 (0.0%), reord=0 (0.0%)
(msec) min avg max last dev
loss period: 20.000 20.000 20.000 20.000 0.000
jitter : 1.125 19.350 27.000 18.875 3.951
TX pt=0, ptime=20, last update:00h:00m:02.260s ago
total 344pkt 55.0KB (68.8KB +IP hdr) #avg=63.8Kbps/79.7Kbps
pkt loss=2 (0.6%), dup=0 (0.0%), reorder=0 (0.0%)
(msec) min avg max last dev
loss period: 40.000 40.000 40.000 40.000 0.000
jitter : 0.000 7.375 14.750 14.750 7.375
RTT msec : 203.000 203.000 203.000 203.000 0.000
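For reference, the #avg figures in the dump are just average bitrates: bytes times 8, divided by the elapsed stream time. Programmatically the same RX/TX byte and packet counters should be available from the call's stream statistics (pjsua_call_get_stream_stat() in the C API, if I remember correctly), so you can sample them periodically and compute a rate yourself. A rough check against the RX line above, in Python; the ~6.9 s stream duration is an assumption inferred from the reported averages, since the dump only prints the call time:
# Back-of-the-envelope check of the "#avg=62.5Kbps/78.1Kbps" RX figures above,
# assuming the dump counts KB as 1000 bytes.
def avg_kbps(byte_count, elapsed_s):
    """Average bitrate in kbit/s for byte_count bytes over elapsed_s seconds."""
    return byte_count * 8 / 1000.0 / elapsed_s

elapsed = 6.9                                # approximate stream duration (assumed)
print(round(avg_kbps(53.9e3, elapsed), 1))   # payload + RTP/UDP headers: ~62.5 Kbps
print(round(avg_kbps(67.4e3, elapsed), 1))   # including IP headers:      ~78.1 Kbps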

Related

Twitter Rate Limit for search_all for full archive data

I am using tweepy to get replies to a given user name, filtered by since_id and until_id. I get rate-limited after roughly every three requests. My code is like below:
q = "to:" + screen_name
reply_ls = []
tweets = client.search_all_tweets(
    query=q,
    since_id=since_id,
    until_id=until_id,
    tweet_fields=['in_reply_to_user_id', 'author_id', 'created_at',
                  'conversation_id', 'referenced_tweets'],
    expansions="referenced_tweets.id",
)
Here is the rate-limit output I got:
10%|█████▌ | 10/100 [00:10<01:28, 1.02it/s]Rate limit exceeded. Sleeping for 607 seconds.
13%|███████ | 13/100 [10:20<2:07:02, 87.62s/it]Rate limit exceeded. Sleeping for 898 seconds.
17%|█████████ | 17/100 [25:22<2:37:38, 113.95s/it]Rate limit exceeded. Sleeping for 897 seconds.
23%|████████████▍ | 23/100 [40:26<1:16:23, 59.52s/it]Rate limit exceeded. Sleeping for 894 seconds.
I thought the API allows 300 requests every 15 minutes, but now I can only make about three requests every 15 minutes. Is that expected?
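For what it's worth, a paced version of the same call is sketched below; whether it helps depends on which limit is actually being hit. The Twitter docs list a 1 request/second app limit for full-archive search in addition to the 300 requests per 15-minute window, and the client construction here is a placeholder since the question does not show it:
import time
import tweepy

# Placeholder client; the question does not show how `client` is built.
client = tweepy.Client(bearer_token="...", wait_on_rate_limit=True)

q = "to:" + screen_name          # screen_name, since_id, until_id as in the question
reply_ls = []
for page in tweepy.Paginator(client.search_all_tweets,
                             query=q,
                             since_id=since_id,
                             until_id=until_id,
                             tweet_fields=['in_reply_to_user_id', 'author_id',
                                           'created_at', 'conversation_id',
                                           'referenced_tweets'],
                             expansions="referenced_tweets.id",
                             max_results=500):
    reply_ls.extend(page.data or [])
    time.sleep(1)                # stay under the 1 request/second limit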

Minimalmodbus: read multiple slaves on same serial port

I have a setup with 2 SDM120 kWh energy meters daisy-chained on the same serial port (in the future I want to add an SDM630). I found "Using multiple instruments" in the MinimalModbus documentation. I succeed in reading registers on the SDM120 at address 1, but I get an error when reading address 2. The error: minimalmodbus.NoResponseError: No communication with the instrument (no answer).
I can work around it by adding time.sleep(0.1), but I would think that RS485 allows reading the registers of a second address immediately after the first transaction has completed. I also tried lower values, but e.g. time.sleep(0.01) also gave a NoResponseError.
I thought the setting instrument.serial.timeout = 1 would already have had the desired effect, but apparently I really need the time.sleep. Is time.sleep(0.1) the correct way of doing this? If so, how can I know the lowest value that avoids a NoResponseError? Trial and error? Could my script be optimized? Timing matters here, e.g. to avoid injecting into the grid (PV diverter, ...). Thanks in advance!
The script:
#!/usr/bin/env python3
import minimalmodbus
import time
instrumentA = minimalmodbus.Instrument('/dev/ttyUSB0', 1, debug = True) # port name, slave address (in decimal)
instrumentA.serial.baudrate = 9600
instrumentA.serial.timeout = 1 # seconds
instrumentA.serial.bytesize = 8
instrumentA.serial.parity = minimalmodbus.serial.PARITY_NONE
instrumentA.serial.stopbits = 1
instrumentA.mode = minimalmodbus.MODE_RTU
instrumentB = minimalmodbus.Instrument('/dev/ttyUSB0', 2, debug = True)
instrumentB.mode = minimalmodbus.MODE_RTU
print ("====== SDM120 instrumentA on address 1 ======")
print (instrumentA)
P = instrumentA.read_float(12, 4, 2)
print ("Active Power in Watts:", P)
#time.sleep(0.1) #workaround to avoid NoResponseError
print ("====== SDM120 instrumentB on address 2 ======")
print (instrumentB)
P = instrumentB.read_float(12, 4, 2)
print ("Active Power in Watts:", P)
Output without the time.sleep(0.1):
MinimalModbus debug mode. Create serial port /dev/ttyUSB0
MinimalModbus debug mode. Serial port /dev/ttyUSB0 already exists
====== SDM120 instrumentA on address 1 ======
minimalmodbus.Instrument<id=0x7f36e3dc0df0, address=1, mode=rtu, close_port_after_each_call=False, precalculate_read_size=True, clear_buffers_before_each_transaction=True, handle_local_echo=False, debug=True, serial=Serial<id=0x7f36e3dd90d0, open=True>(port='/dev/ttyUSB0', baudrate=9600, bytesize=8, parity='N', stopbits=1, timeout=1, xonxoff=False, rtscts=False, dsrdtr=False)>
MinimalModbus debug mode. Will write to instrument (expecting 9 bytes back): '\x01\x04\x00\x0c\x00\x02±È' (01 04 00 0C 00 02 B1 C8)
MinimalModbus debug mode. Clearing serial buffers for port /dev/ttyUSB0
MinimalModbus debug mode. No sleep required before write. Time since previous read: 190954.73 ms, minimum silent period: 4.01 ms.
MinimalModbus debug mode. Response from instrument: '\x01\x04\x04\x00\x00\x00\x00û\x84' (01 04 04 00 00 00 00 FB 84) (9 bytes), roundtrip time: 53.3 ms. Timeout for reading: 1000.0 ms.
Active Power in Watts: 0.0
====== SDM120 instrumentB on address 2 ======
minimalmodbus.Instrument<id=0x7f36e3c55940, address=2, mode=rtu, close_port_after_each_call=False, precalculate_read_size=True, clear_buffers_before_each_transaction=True, handle_local_echo=False, debug=True, serial=Serial<id=0x7f36e3dd90d0, open=True>(port='/dev/ttyUSB0', baudrate=9600, bytesize=8, parity='N', stopbits=1, timeout=1, xonxoff=False, rtscts=False, dsrdtr=False)>
MinimalModbus debug mode. Will write to instrument (expecting 9 bytes back): '\x02\x04\x00\x0c\x00\x02±û' (02 04 00 0C 00 02 B1 FB)
MinimalModbus debug mode. Clearing serial buffers for port /dev/ttyUSB0
MinimalModbus debug mode. Sleeping 2.31 ms before sending. Minimum silent period: 4.01 ms, time since read: 1.70 ms.
MinimalModbus debug mode. Response from instrument: '' () (0 bytes), roundtrip time: 1001.3 ms. Timeout for reading: 1000.0 ms.
Traceback (most recent call last):
File "./sdm120-daisychain_v3.py", line 25, in <module>
P = instrumentB.read_float(12, 4, 2)
File "/home/mattias/.local/lib/python3.8/site-packages/minimalmodbus.py", line 662, in read_float
return self._generic_command(
File "/home/mattias/.local/lib/python3.8/site-packages/minimalmodbus.py", line 1170, in _generic_command
payload_from_slave = self._perform_command(functioncode, payload_to_slave)
File "/home/mattias/.local/lib/python3.8/site-packages/minimalmodbus.py", line 1240, in _perform_command
response = self._communicate(request, number_of_bytes_to_read)
File "/home/mattias/.local/lib/python3.8/site-packages/minimalmodbus.py", line 1406, in _communicate
raise NoResponseError("No communication with the instrument (no answer)")
minimalmodbus.NoResponseError: No communication with the instrument (no answer)
Output with the time.sleep(0.1):
MinimalModbus debug mode. Create serial port /dev/ttyUSB0
MinimalModbus debug mode. Serial port /dev/ttyUSB0 already exists
====== SDM120 instrumentA on address 1 ======
minimalmodbus.Instrument<id=0x7f91feddcdf0, address=1, mode=rtu, close_port_after_each_call=False, precalculate_read_size=True, clear_buffers_before_each_transaction=True, handle_local_echo=False, debug=True, serial=Serial<id=0x7f91fedf50d0, open=True>(port='/dev/ttyUSB0', baudrate=9600, bytesize=8, parity='N', stopbits=1, timeout=1, xonxoff=False, rtscts=False, dsrdtr=False)>
MinimalModbus debug mode. Will write to instrument (expecting 9 bytes back): '\x01\x04\x00\x0c\x00\x02±È' (01 04 00 0C 00 02 B1 C8)
MinimalModbus debug mode. Clearing serial buffers for port /dev/ttyUSB0
MinimalModbus debug mode. No sleep required before write. Time since previous read: 176619.62 ms, minimum silent period: 4.01 ms.
MinimalModbus debug mode. Response from instrument: '\x01\x04\x04\x00\x00\x00\x00û\x84' (01 04 04 00 00 00 00 FB 84) (9 bytes), roundtrip time: 53.3 ms. Timeout for reading: 1000.0 ms.
Active Power in Watts: 0.0
====== SDM120 instrumentB on address 2 ======
minimalmodbus.Instrument<id=0x7f91fec70940, address=2, mode=rtu, close_port_after_each_call=False, precalculate_read_size=True, clear_buffers_before_each_transaction=True, handle_local_echo=False, debug=True, serial=Serial<id=0x7f91fedf50d0, open=True>(port='/dev/ttyUSB0', baudrate=9600, bytesize=8, parity='N', stopbits=1, timeout=1, xonxoff=False, rtscts=False, dsrdtr=False)>
MinimalModbus debug mode. Will write to instrument (expecting 9 bytes back): '\x02\x04\x00\x0c\x00\x02±û' (02 04 00 0C 00 02 B1 FB)
MinimalModbus debug mode. Clearing serial buffers for port /dev/ttyUSB0
MinimalModbus debug mode. No sleep required before write. Time since previous read: 102.09 ms, minimum silent period: 4.01 ms.
MinimalModbus debug mode. Response from instrument: '\x02\x04\x04\x00\x00\x00\x00È\x84' (02 04 04 00 00 00 00 C8 84) (9 bytes), roundtrip time: 52.8 ms. Timeout for reading: 1000.0 ms.
Active Power in Watts: 0.0
There seems to be nothing wrong with your code or the library you are using (minimalmodbus).
As you probably know, Modbus works in a query-response mode over a half-duplex link. In plain English: you first send a query and the device that query is addressed to answers with the data you asked for.
Both parts of the transaction (queries and responses) travel over the same bus. But the bus is a shared medium and only one device is allowed to take control of the bus (to talk) at any time.
When you have a single master and one or multiple slaves this process works with no issues as long as you guarantee a short silent period after any device writes to the bus. The Modbus specification established this value at 3.5 characters (the time it takes to send 3 and a half characters serially on the bus at the baud rate you are using).
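For a concrete number: Modbus RTU counts 11 bits per character, so at 9600 baud the 3.5-character silent period works out to about 4 ms, which matches the "minimum silent period: 4.01 ms" that minimalmodbus prints in the debug output above. A quick back-of-the-envelope check in Python:
# 3.5-character silent period at 9600 baud; Modbus RTU frames are 11 bits per
# character (start + 8 data + parity/stop bits).
baudrate = 9600
bits_per_char = 11

char_time = bits_per_char / baudrate      # ~1.15 ms per character
silent_period = 3.5 * char_time           # ~4.01 ms

print(f"minimum silent period: {silent_period * 1000:.2f} ms")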
Unfortunately, some manufacturers do not stick to this rule. So some of those devices just take longer than 3.5 characters time to release control of the bus.
This seems to be the case with at least one of your devices. The device's manual might give you some clues.
My bet is out of your two devices one of them takes significantly less than the other to release the bus, but that's something you will have to confirm with the debug details. It might even be that the device takes longer to release the bus if you query 20 or 40 registers instead of 4 or 8...
What can you do about it? Well, from the device side, not much; it is what it is. In your software you can do several things. As I said in the comments above, you should not feel bad about using time.sleep(), considering that is how minimalmodbus itself copes with the bus contention problem.
To make your code more robust you can add try: ... except:. This approach is explained in the documentation. You can keep retrying the read in a loop for a number of attempts, or add a small delay in the except block. Maybe something like this.
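For instance, a minimal sketch of that retry-with-delay pattern (the helper name and the attempt/delay values are illustrative, not from the original answer):
import time
import minimalmodbus

def read_with_retry(instrument, register, attempts=5, delay=0.1):
    """Read a float register, retrying after a short pause on NoResponseError."""
    for attempt in range(attempts):
        try:
            return instrument.read_float(register, 4, 2)
        except minimalmodbus.NoResponseError:
            time.sleep(delay)  # give the previous transaction time to clear the bus
    raise minimalmodbus.NoResponseError("no answer after %d attempts" % attempts)

# e.g.: P = read_with_retry(instrumentB, 12)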
The answer of Marcos G. (answered Apr 13 at 9:24) includes some background details. In short:
With some trial and error, one can find a value for time.sleep so that minimalmodbus can cope with the bus contention problem.
You can make the code more robust with try: ... except:. It might be a good idea to only retry a limited number of times, to avoid an infinite loop.
I include two scripts that apply those approaches to my posted problem. Compared to my original question, a for loop and an array of addresses are used.
The first one uses time.sleep:
#!/usr/bin/env python3
import minimalmodbus
import time
addr = 1
instrument = minimalmodbus.Instrument('/dev/ttyUSB0', addr) # port name, slave address (in decimal)
instrument.serial.baudrate = 9600 # Baud
instrument.serial.bytesize = 8
instrument.serial.parity = minimalmodbus.serial.PARITY_NONE
instrument.serial.stopbits = 1
instrument.serial.timeout = 1 # seconds
instrument.mode = minimalmodbus.MODE_RTU # rtu or ascii mode
addresses = [1,2]
registers = [ 0, 6, 12, 18, 24, 30, 70, 72, 74, 76, 78, 84, 86, 88, 90, 92, 94, 258, 264, 342, 344]
names = ["V","I","P","S", "Q","PF","f","IAE","EAE", "IRE", "ERE","TSP","MSP","ISP","MIP","ESP","MEP","ID","MID","TAE", "TRE"]
units = ["V","A","W","VA","var", "","Hz","kWh","kWh","kvarh","kvarh", "W", "W", "W", "W", "W", "W", "A", "A","kWh","kvarh"]
info = [
"(V for Voltage in volt)",
"(I for Current in ampere)",
"(P for Active Power in watt)",
"(S for Apparent power in volt-ampere)",
"(Q for Reactive power in volt-ampere reactive)",
"(PF for Power Factor)",
"(f for Frequency in hertz)",
"(IAE for Import active energy in kilowatt-hour)",
"(EAE for Export active energy in kilowatt-hour)",
"(IRE for Import reactive energy in kilovolt-ampere reactive hours)",
"(ERE for Export reactive energy in kilovolt-ampere reactive hours)",
"(TSP for Total system power demand in watt)",
"(MSP for Maximum total system power demand in watt)",
"(ISP for Import system power demand in watt)",
"(MIP for Maximum import system power demand in watt)",
"(ESP for Export system power demand in watt)",
"(MEP for MaximumExport system power demand in watt)",
"(ID for current demand in ampere)",
"(MID for Maximum current demand in ampere)",
"(TAE for Total active energy in kilowatt-hour)",
"(TRE for Total reactive energy in kilovolt-ampere reactive hours)",
]
for addr in addresses:
    instrument.address = addr
    print("=== General info about address", addr, "===")
    print(instrument)
    print("=== The registers for address", addr, "===")
    for i in range(len(registers)):
        value = instrument.read_float(registers[i], 4, 2)
        print(str(registers[i]).rjust(3), str(value).rjust(20), units[i].ljust(5), info[i])
        time.sleep(0.1)  # To avoid minimalmodbus.NoResponseError
    print("")
The second one uses try: ... except:
#!/usr/bin/env python3
import minimalmodbus
# This alternative script `sdm120-daisy-alt.py` will try to reread a device an extra 9 times before giving up and continuing with the other addresses in the array.
addr = 1
instrument = minimalmodbus.Instrument('/dev/ttyUSB0', addr) # port name, slave address (in decimal)
instrument.serial.baudrate = 9600 # Baud
instrument.serial.bytesize = 8
instrument.serial.parity = minimalmodbus.serial.PARITY_NONE
instrument.serial.stopbits = 1
instrument.serial.timeout = 1 # seconds
instrument.mode = minimalmodbus.MODE_RTU # rtu or ascii mode
addresses = [1,2]
registers = [ 0, 6, 12, 18, 24, 30, 70, 72, 74, 76, 78, 84, 86, 88, 90, 92, 94, 258, 264, 342, 344]
names = ["V","I","P","S", "Q","PF","f","IAE","EAE", "IRE", "ERE","TSP","MSP","ISP","MIP","ESP","MEP","ID","MID","TAE", "TRE"]
units = ["V","A","W","VA","var", "","Hz","kWh","kWh","kvarh","kvarh", "W", "W", "W", "W", "W", "W", "A", "A","kWh","kvarh"]
info = [
"(V for Voltage in volt)",
"(I for Current in ampere)",
"(P for Active Power in watt)",
"(S for Apparent power in volt-ampere)",
"(Q for Reactive power in volt-ampere reactive)",
"(PF for Power Factor)",
"(f for Frequency in hertz)",
"(IAE for Import active energy in kilowatt-hour)",
"(EAE for Export active energy in kilowatt-hour)",
"(IRE for Import reactive energy in kilovolt-ampere reactive hours)",
"(ERE for Export reactive energy in kilovolt-ampere reactive hours)",
"(TSP for Total system power demand in watt)",
"(MSP for Maximum total system power demand in watt)",
"(ISP for Import system power demand in watt)",
"(MIP for Maximum import system power demand in watt)",
"(ESP for Export system power demand in watt)",
"(MEP for MaximumExport system power demand in watt)",
"(ID for current demand in ampere)",
"(MID for Maximum current demand in ampere)",
"(TAE for Total active energy in kilowatt-hour)",
"(TRE for Total reactive energy in kilovolt-ampere reactive hours)",
]
for addr in addresses:
    instrument.address = addr
    print("=== General info about address", addr, "===")
    print(instrument)
    print("=== The registers for address", addr, "===")
    for attempt in range(10):
        try:
            for i in range(len(registers)):
                value = instrument.read_float(registers[i], 4, 2)
                print(str(registers[i]).rjust(3), str(value).rjust(20), units[i].ljust(5), info[i])
        except minimalmodbus.NoResponseError:
            print("NoResponseError: didn't work on attempt", attempt + 1, "- I will retry")
        else:
            print("Succeeded on attempt", attempt + 1)
            break
    print("")

Kafka consumer test and reported metrics

I want to understand how the Kafka consumer performance test works and how to interpret some of the numbers it reports.
Below is the test I ran and the output I got. My questions are:
The values reported for rebalance.time.ms, fetch.time.ms, fetch.MB.sec and fetch.nMsg.sec are 1593109326098, -1593108732333, -0.0003 and -0.2800. Can you explain how it can report such huge and negative numbers? They don't make sense to me.
Everything from the "Metric Name Value" line onwards is reported because of the --print-metrics flag. What is the difference between the metrics reported by default and those reported with this flag? How are they calculated, and where can I read about what they mean?
No matter whether I scale the number of consumers running in parallel or the network and I/O threads at the broker, the consumer-fetch-manager-metrics:fetch-latency-avg metric remains almost the same. Can you explain this? With more consumers pulling data, fetch latency should go higher; similarly, for a given consume rate, if I reduce the I/O and network thread counts at the broker, shouldn't latency scale higher?
Here is the command I ran:
[root@oak-clx17 kafka_2.12-2.5.0]# bin/kafka-consumer-perf-test.sh --topic topic_test8_cons_test1 --threads 1 --broker-list clx20:9092 --messages 500000000 --consumer.config config/consumer.properties --print-metrics
And the results:
start.time, end.time, data.consumed.in.MB, MB.sec, data.consumed.in.nMsg, nMsg.sec, rebalance.time.ms,fetch.time.ms, fetch.MB.sec, fetch.nMsg.sec
WARNING: Exiting before consuming the expected number of messages: timeout (10000 ms) exceeded. You can use the --timeout option to increase the timeout.
2020-06-25 11:22:05:814, 2020-06-25 11:31:59:579, 435640.7686, 733.6922, 446096147, 751300.8463, 1593109326098, -1593108732333, -0.0003, -0.2800
 
Metric Name Value
consumer-coordinator-metrics:assigned-partitions:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:commit-latency-avg:{client-id=consumer-perf-consumer-25533-1} : 2.700
consumer-coordinator-metrics:commit-latency-max:{client-id=consumer-perf-consumer-25533-1} : 4.000
consumer-coordinator-metrics:commit-rate:{client-id=consumer-perf-consumer-25533-1} : 0.230
consumer-coordinator-metrics:commit-total:{client-id=consumer-perf-consumer-25533-1} : 119.000
consumer-coordinator-metrics:failed-rebalance-rate-per-hour:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:failed-rebalance-total:{client-id=consumer-perf-consumer-25533-1} : 1.000
consumer-coordinator-metrics:heartbeat-rate:{client-id=consumer-perf-consumer-25533-1} : 0.337
consumer-coordinator-metrics:heartbeat-response-time-max:{client-id=consumer-perf-consumer-25533-1} : 6.000
consumer-coordinator-metrics:heartbeat-total:{client-id=consumer-perf-consumer-25533-1} : 197.000
consumer-coordinator-metrics:join-rate:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:join-time-avg:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:join-time-max:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:join-total:{client-id=consumer-perf-consumer-25533-1} : 1.000
consumer-coordinator-metrics:last-heartbeat-seconds-ago:{client-id=consumer-perf-consumer-25533-1} : 2.000
consumer-coordinator-metrics:last-rebalance-seconds-ago:{client-id=consumer-perf-consumer-25533-1} : 593.000
consumer-coordinator-metrics:partition-assigned-latency-avg:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:partition-assigned-latency-max:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:partition-lost-latency-avg:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:partition-lost-latency-max:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:partition-revoked-latency-avg:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:partition-revoked-latency-max:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:rebalance-latency-avg:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:rebalance-latency-max:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:rebalance-latency-total:{client-id=consumer-perf-consumer-25533-1} : 83.000
consumer-coordinator-metrics:rebalance-rate-per-hour:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:rebalance-total:{client-id=consumer-perf-consumer-25533-1} : 1.000
consumer-coordinator-metrics:sync-rate:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-coordinator-metrics:sync-time-avg:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:sync-time-max:{client-id=consumer-perf-consumer-25533-1} : NaN
consumer-coordinator-metrics:sync-total:{client-id=consumer-perf-consumer-25533-1} : 1.000
consumer-fetch-manager-metrics:bytes-consumed-rate:{client-id=consumer-perf-consumer-25533-1, topic=topic_test8_cons_test1} : 434828205.989
consumer-fetch-manager-metrics:bytes-consumed-rate:{client-id=consumer-perf-consumer-25533-1} : 434828205.989
consumer-fetch-manager-metrics:bytes-consumed-total:{client-id=consumer-perf-consumer-25533-1, topic=topic_test8_cons_test1} : 460817319851.000
consumer-fetch-manager-metrics:bytes-consumed-total:{client-id=consumer-perf-consumer-25533-1} : 460817319851.000
consumer-fetch-manager-metrics:fetch-latency-avg:{client-id=consumer-perf-consumer-25533-1} : 58.870
consumer-fetch-manager-metrics:fetch-latency-max:{client-id=consumer-perf-consumer-25533-1} : 503.000
consumer-fetch-manager-metrics:fetch-rate:{client-id=consumer-perf-consumer-25533-1} : 48.670
consumer-fetch-manager-metrics:fetch-size-avg:{client-id=consumer-perf-consumer-25533-1, topic=topic_test8_cons_test1} : 9543108.526
consumer-fetch-manager-metrics:fetch-size-avg:{client-id=consumer-perf-consumer-25533-1} : 9543108.526
consumer-fetch-manager-metrics:fetch-size-max:{client-id=consumer-perf-consumer-25533-1, topic=topic_test8_cons_test1} : 11412584.000
consumer-fetch-manager-metrics:fetch-size-max:{client-id=consumer-perf-consumer-25533-1} : 11412584.000
consumer-fetch-manager-metrics:fetch-throttle-time-avg:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-fetch-manager-metrics:fetch-throttle-time-max:{client-id=consumer-perf-consumer-25533-1} : 0.000
consumer-fetch-manager-metrics:fetch-total:{client-id=consumer-perf-consumer-25533-1} : 44889.000
Exception in thread "main" java.util.IllegalFormatConversionException: f != java.lang.Integer
at java.base/java.util.Formatter$FormatSpecifier.failConversion(Formatter.java:4426)
at java.base/java.util.Formatter$FormatSpecifier.printFloat(Formatter.java:2951)
at java.base/java.util.Formatter$FormatSpecifier.print(Formatter.java:2898)
at java.base/java.util.Formatter.format(Formatter.java:2673)
at java.base/java.util.Formatter.format(Formatter.java:2609)
at java.base/java.lang.String.format(String.java:2897)
at scala.collection.immutable.StringLike.format(StringLike.scala:354)
at scala.collection.immutable.StringLike.format$(StringLike.scala:353)
at scala.collection.immutable.StringOps.format(StringOps.scala:33)
at kafka.utils.ToolsUtils$.$anonfun$printMetrics$3(ToolsUtils.scala:60)
at kafka.utils.ToolsUtils$.$anonfun$printMetrics$3$adapted(ToolsUtils.scala:58)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at kafka.utils.ToolsUtils$.printMetrics(ToolsUtils.scala:58)
at kafka.tools.ConsumerPerformance$.main(ConsumerPerformance.scala:82)
at kafka.tools.ConsumerPerformance.main(ConsumerPerformance.scala)
I've written a post on this: https://medium.com/metrosystemsro/apache-kafka-how-to-test-performance-for-clients-configured-with-ssl-encryption-3356d3a0d52b
The expectation, according to this KIP, is for rebalance.time.ms and fetch.time.ms to show the total rebalance time for the consumer group and the total fetching time excluding the rebalance time.
As far as I can tell, as of Apache Kafka version 2.6.0 this is still a work in progress, and currently the output values are Unix epoch timestamps.
fetch.MB.sec and fetch.nMsg.sec are intended to show the average quantity of data consumed per second (in MB and as a message count).
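The numbers in the output above are consistent with that: if rebalance.time.ms actually holds an epoch timestamp, the derived fields go negative. A rough reconstruction in Python (the formulas are inferred from the output, not taken from the Kafka source):
# start 11:22:05.814 -> end 11:31:59.579 is 593765 ms of elapsed test time
total_elapsed_ms = 593765
rebalance_time_ms = 1593109326098          # actually a Unix epoch timestamp (2020-06-25)

fetch_time_ms = total_elapsed_ms - rebalance_time_ms
print(fetch_time_ms)                                 # -1593108732333, as reported

data_mb = 435640.7686
messages = 446096147
print(round(data_mb / (fetch_time_ms / 1000), 4))    # ~ -0.0003 (fetch.MB.sec)
print(round(messages / (fetch_time_ms / 1000), 2))   # ~ -0.28   (fetch.nMsg.sec)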
See https://kafka.apache.org/documentation/#consumer_group_monitoring for the consumer group metrics listed with the --print-metrics flag.
fetch-latency-avg (the average time taken for a fetch request) will vary, but this depends a lot on the test setup.

haproxy stats: qtime,ctime,rtime,ttime?

Running a web app behind HAProxy 1.6.3-1ubuntu0.1, I'm getting haproxy stats qtime,ctime,rtime,ttime values of 0,0,0,2704.
From the docs (https://www.haproxy.org/download/1.6/doc/management.txt):
58. qtime [..BS]: the average queue time in ms over the 1024 last requests
59. ctime [..BS]: the average connect time in ms over the 1024 last requests
60. rtime [..BS]: the average response time in ms over the 1024 last requests
(0 for TCP)
61. ttime [..BS]: the average total session time in ms over the 1024 last requests
I'm expecting response times in the 0-10 ms range. A ttime of 2704 ms seems unrealistically high. Is it possible the units are off and this is 2704 microseconds rather than 2704 milliseconds?
Secondly, it seems suspicious that ttime isn't even close to qtime+ctime+rtime. Is the total session time not the sum of the time to queue, connect, and respond? What other time is included in the total but not in queue/connect/response? Why can my response times be <1 ms, but my total session times be ~2704 ms?
Here is my full csv stats:
$ curl "http://localhost:9000/haproxy_stats;csv"
# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,
http-in,FRONTEND,,,4707,18646,50000,5284057,209236612829,42137321877,0,0,997514,,,,,OPEN,,,,,,,,,1,2,0,,,,0,4,0,2068,,,,0,578425742,0,997712,22764,1858,,1561,3922,579448076,,,0,0,0,0,,,,,,,,
servers,server1,0,0,0,4337,20000,578546476,209231794363,41950395095,,0,,22861,1754,95914,0,no check,1,1,0,,,,,,1,3,1,,578450562,,2,1561,,6773,,,,0,578425742,0,198,0,0,0,,,,29,1751,,,,,0,,,0,0,0,2704,
servers,BACKEND,0,0,0,5919,5000,578450562,209231794363,41950395095,0,0,,22861,1754,95914,0,UP,1,1,0,,0,320458,0,,1,3,0,,578450562,,1,1561,,3922,,,,0,578425742,0,198,22764,1858,,,,,29,1751,0,0,0,0,0,,,0,0,0,2704,
stats,FRONTEND,,,2,5,2000,5588,639269,8045341,0,0,29,,,,,OPEN,,,,,,,,,1,4,0,,,,0,1,0,5,,,,0,5374,0,29,196,0,,1,5,5600,,,0,0,0,0,,,,,,,,
stats,BACKEND,0,0,0,1,200,196,639269,8045341,0,0,,196,0,0,0,UP,0,0,0,,0,320458,0,,1,4,0,,0,,1,0,,5,,,,0,0,0,0,196,0,,,,,0,0,0,0,0,0,0,,,0,0,0,0,
In HAProxy 2 and later you now get two values, n / n, which are the maximum within a sliding window and the average for that window. The max value remains the max across all sample windows until a higher value is found. On 1.8 you only get the average.
Example of HAProxy 2 vs 1.8 (note these proxies are used very differently and with dramatically different loads).
So it looks like the average response times, at least since the last reboot, are 66 ms and 275 ms.
The average is computed as:
(sum of data time + cumulative HTTP connections - 1) / cumulative HTTP connections
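A hypothetical illustration of that rounding-up average in Python (the sum value here is made up; HAProxy keeps it internally per server/backend, and the walkthrough below shows where the window size comes from):
# Sliding-window average as the stats code computes it: (sum + n - 1) / n.
def swrate_avg(sum_ms, n):
    return (sum_ms + n - 1) // n

window = 1024            # "the 1024 last requests" per the management doc quoted above
total_time_sum = 2768000 # hypothetical sliding sum of total_time, in ms

print(swrate_avg(total_time_sum, window))   # 2704 ms, i.e. a ttime like the one above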
This might not be a perfect analysis, so if anyone has improvements it'd be appreciated. It is meant to show how I came to the answer above, so you can use it to gather more insight into the other counters you asked about. Most of this information was gathered from reading stats.c. The counters you asked about are defined here:
unsigned int q_time, c_time, d_time, t_time; /* sums of conn_time, queue_time, data_time, total_time */
unsigned int qtime_max, ctime_max, dtime_max, ttime_max; /* maximum of conn_time, queue_time, data_time, total_time observed */
The stats page values are built from this code:
if (strcmp(field_str(stats, ST_F_MODE), "http") == 0)
    chunk_appendf(out, "<tr><th>- Responses time:</th><td>%s / %s</td><td>ms</td></tr>",
                  U2H(stats[ST_F_RT_MAX].u.u32), U2H(stats[ST_F_RTIME].u.u32));
chunk_appendf(out, "<tr><th>- Total time:</th><td>%s / %s</td><td>ms</td></tr>",
              U2H(stats[ST_F_TT_MAX].u.u32), U2H(stats[ST_F_TTIME].u.u32));
You asked about all the counters, but I'll focus on one. As can be seen in the snippet above for "Responses time:", ST_F_RT_MAX and ST_F_RTIME are the values displayed on the stats page as n (rtime_max) / n (rtime) respectively. They are defined as follows:
[ST_F_RT_MAX] = { .name = "rtime_max", .desc = "Maximum observed time spent waiting for a server response, in milliseconds (backend/server)" },
[ST_F_RTIME] = { .name = "rtime", .desc = "Time spent waiting for a server response, in milliseconds, averaged over the 1024 last requests (backend/server)" },
These set a "metric" value (among other things) in a case statement further down in the code:
case ST_F_RT_MAX:
    metric = mkf_u32(FN_MAX, sv->counters.dtime_max);
    break;
case ST_F_RTIME:
    metric = mkf_u32(FN_AVG, swrate_avg(sv->counters.d_time, srv_samples_window));
    break;
These metric values give us a good look at what the stats page is telling us. The first value in "Responses time: 0 / 0", ST_F_RT_MAX, is the maximum time spent waiting. The second value, ST_F_RTIME, is the average time taken per connection. These are the max and the average taken within a window of time, i.e. however long it takes for you to get 1024 connections.
For example, "Responses time: 10000 / 20" means:
maximum time spent waiting (the highest value ever reached, including HTTP keep-alive time) over the last 1024 connections: 10 seconds
average time over the last 1024 connections: 20 ms
So for all intents and purposes
rtime_max = dtime_max
rtime = swrate_avg(d_time, srv_samples_window)
Which begs the question: what are dtime_max, d_time and srv_samples_window? These are the data-time values; I couldn't actually figure out how they are being set, but at face value it's "some time" over the last 1024 connections. As pointed out here, keep-alive times are included in the max totals, which is why the numbers are high.
Now that we know ST_F_RT_MAX is a max value and ST_F_RTIME is an average, an average of what?
/* compute time values for later use */
if (selected_field == NULL || *selected_field == ST_F_QTIME ||
    *selected_field == ST_F_CTIME || *selected_field == ST_F_RTIME ||
    *selected_field == ST_F_TTIME) {
    srv_samples_counter = (px->mode == PR_MODE_HTTP) ? sv->counters.p.http.cum_req : sv->counters.cum_lbconn;
    if (srv_samples_counter < TIME_STATS_SAMPLES && srv_samples_counter > 0)
        srv_samples_window = srv_samples_counter;
}
The TIME_STATS_SAMPLES value is defined as:
#define TIME_STATS_SAMPLES 512
unsigned int srv_samples_window = TIME_STATS_SAMPLES;
In HTTP mode, srv_samples_counter is sv->counters.p.http.cum_req. http.cum_req is defined as ST_F_REQ_TOT:
[ST_F_REQ_TOT] = { .name = "req_tot", .desc = "Total number of HTTP requests processed by this object since the worker process started" },
For example, if the value of http.cum_req is 10, then srv_samples_counter will be 10. The sample appears to be the number of successful requests, over a given sample window, for a given backend server. d_time (data time) is passed as "sum" and is computed as some non-negative value, or else it's counted as an error. I thought I found the code for how d_time is created, but I wasn't sure, so I haven't included it.
/* Returns the average sample value for the sum <sum> over a sliding window of
* <n> samples. Better if <n> is a power of two. It must be the same <n> as the
* one used above in all additions.
*/
static inline unsigned int swrate_avg(unsigned int sum, unsigned int n)
{
    return (sum + n - 1) / n;
}

MongoDB Concurrency Bottleneck

Too Long; Didn't Read
The question is about a concurrency bottleneck I am experiencing on MongoDB. If I make one query, it takes 1 unit of time to return; if I make 2 concurrent queries, both take 2 units of time to return; generally, if I make n concurrent queries, all of them take n units of time to return. My question is about what can be done to improve Mongo's response times when faced with concurrent queries.
The Setup
I have an m3.medium instance on AWS running a MongoDB 2.6.7 server. An m3.medium has 1 vCPU (1 core of a Xeon E5-2670 v2), 3.75 GB of RAM and a 4 GB SSD.
I have a database with a single collection named user_products. A document in this collection has the following structure:
{ user: <int>, product: <int> }
There are 1000 users and 1000 products, and there's a document for every user-product pair, totaling a million documents.
The collection has an index { user: 1, product: 1 } and my results below are all indexOnly.
The Test
The test was executed from the same machine where MongoDB is running. I am using the benchRun function provided with Mongo. During the tests, no other accesses to MongoDB were being made and the tests only comprise read operations.
For each test, a number of concurrent clients is simulated, each of them making a single query as many times as possible until the test is over. Each test runs for 10 seconds. The concurrency is tested in powers of 2, from 1 to 128 simultaneous clients.
The command to run the tests:
mongo bench.js
Here's the full script (bench.js):
var
    seconds = 10,
    limit = 1000,
    USER_COUNT = 1000,
    concurrency,
    savedTime,
    res,
    timediff,
    ops,
    results,
    docsPerSecond,
    latencyRatio,
    currentLatency,
    previousLatency;
ops = [
    {
        op: "find",
        ns: "test_user_products.user_products",
        query: {
            user: { "#RAND_INT": [0, USER_COUNT - 1] }
        },
        limit: limit,
        fields: { _id: 0, user: 1, product: 1 }
    }
];
for (concurrency = 1; concurrency <= 128; concurrency *= 2) {
    savedTime = new Date();
    res = benchRun({
        parallel: concurrency,
        host: "localhost",
        seconds: seconds,
        ops: ops
    });
    timediff = new Date() - savedTime;
    docsPerSecond = res.query * limit;
    currentLatency = res.queryLatencyAverageMicros / 1000;
    if (previousLatency) {
        latencyRatio = currentLatency / previousLatency;
    }
    results = [
        savedTime.getFullYear() + '-' + (savedTime.getMonth() + 1).toFixed(2) + '-' + savedTime.getDate().toFixed(2),
        savedTime.getHours().toFixed(2) + ':' + savedTime.getMinutes().toFixed(2),
        concurrency,
        res.query,
        currentLatency,
        timediff / 1000,
        seconds,
        docsPerSecond,
        latencyRatio
    ];
    previousLatency = currentLatency;
    print(results.join('\t'));
}
Results
Results are always looking like this (some columns of the output were omitted to facilitate understanding):
concurrency queries/sec avg latency (ms) latency ratio
1 459.6 2.153609008 -
2 460.4 4.319577324 2.005738882
4 457.7 8.670418178 2.007237636
8 455.3 17.4266174 2.00989353
16 450.6 35.55693474 2.040380754
32 429 74.50149883 2.09527338
64 419.2 153.7325095 2.063482104
128 403.1 325.2151235 2.115460969
If only 1 client is active, it is capable of doing about 460 queries per second over the 10 second test. The average response time for a query is about 2 ms.
When 2 clients are concurrently sending queries, the query throughput stays at about 460 queries per second, showing that Mongo hasn't increased its response throughput. The average latency, on the other hand, literally doubled.
For 4 clients, the pattern continues: same query throughput, and the average latency doubles relative to the run with 2 clients. The latency ratio column is the ratio between the current and previous test's average latency. Notice that it always shows the latency doubling.
Update: More CPU Power
I decided to test with different instance types, varying the number of vCPUs and the amount of available RAM. The purpose is to see what happens when you add more CPU power. Instance types tested:
Type vCPUs RAM(GB)
m3.medium 1 3.75
m3.large 2 7.5
m3.xlarge 4 15
m3.2xlarge 8 30
Here are the results:
m3.medium
concurrency queries/sec avg latency (ms) latency ratio
1 459.6 2.153609008 -
2 460.4 4.319577324 2.005738882
4 457.7 8.670418178 2.007237636
8 455.3 17.4266174 2.00989353
16 450.6 35.55693474 2.040380754
32 429 74.50149883 2.09527338
64 419.2 153.7325095 2.063482104
128 403.1 325.2151235 2.115460969
m3.large
concurrency queries/sec avg latency (ms) latency ratio
1 855.5 1.15582069 -
2 947 2.093453854 1.811227185
4 961 4.13864589 1.976946318
8 958.5 8.306435055 2.007041742
16 954.8 16.72530889 2.013536347
32 936.3 34.17121062 2.043083977
64 927.9 69.09198599 2.021935563
128 896.2 143.3052382 2.074122435
m3.xlarge
concurrency queries/sec avg latency (ms) latency ratio
1 807.5 1.226082735 -
2 1529.9 1.294211452 1.055566166
4 1810.5 2.191730848 1.693487447
8 1816.5 4.368602642 1.993220402
16 1805.3 8.791969257 2.01253581
32 1770 17.97939718 2.044979532
64 1759.2 36.2891598 2.018374668
128 1720.7 74.56586511 2.054769676
m3.2xlarge
concurrency queries/sec avg latency (ms) latency ratio
1 836.6 1.185045183 -
2 1585.3 1.250742872 1.055438974
4 2786.4 1.422254414 1.13712774
8 3524.3 2.250554777 1.58238551
16 3536.1 4.489283844 1.994745425
32 3490.7 9.121144097 2.031759277
64 3527 18.14225682 1.989033023
128 3492.9 36.9044113 2.034168718
Starting with the xlarge type, we begin to see it finally handling 2 concurrent queries while keeping the query latency virtually the same (1.29 ms). It doesn't last too long, though, and for 4 clients it again doubles the average latency.
With the 2xlarge type, Mongo is able to keep handling up to 4 concurrent clients without raising the average latency too much. After that, it starts to double again.
The question is: what could be done to improve Mongo's response times with respect to the concurrent queries being made? I expected to see a rise in the query throughput and I did not expect to see the average latency doubling. It clearly shows Mongo is not able to parallelize the queries that are arriving.
There's some kind of bottleneck somewhere limiting Mongo, but it certainly doesn't help to keep adding more CPU power, since the cost would be prohibitive. I don't think memory is an issue here, since my entire test database fits in RAM easily. Is there something else I could try?
You're using a server with 1 core and you're using benchRun. From the benchRun page:
This benchRun command is designed as a QA baseline performance measurement tool; it is not designed to be a "benchmark".
The scaling of the latency with the concurrency numbers is suspiciously exact. Are you sure the calculation is correct? I could believe that the ops/sec/runner was staying the same, with the latency/op also staying the same, as the number of runners grew - and then if you added all the latencies, you would see results like yours.
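For what it's worth, a flat-throughput / linearly-growing-latency pattern is also exactly what a single serialized resource would produce: if one core can serve roughly 460 queries per second, n concurrent clients each see their request queued behind the others, so throughput stays flat while per-request wall-clock latency grows with n. A toy model in Python (illustrative numbers only, not an analysis of the benchRun internals):
# Toy queueing model of a single-core server that can process ~465 queries/sec.
service_time_ms = 2.15                      # time the core spends on one query
capacity_qps = 1000 / service_time_ms       # ~465 queries/sec in total

for clients in [1, 2, 4, 8, 16, 32]:
    throughput = capacity_qps               # the single core is the bottleneck
    latency_ms = clients * service_time_ms  # each request also waits for the others
    print(f"{clients:3d} clients: ~{throughput:.0f} q/s, ~{latency_ms:.1f} ms avg latency")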