How can you write a supply with a dynamic throttle?

For a chat bot I'm refactoring so that it doesn't require locks for managing most of its state, the website it connects to via WebSocket throttles messages received from regular users to one every 0.6s and messages from voiced users to one every 0.3s, while administrators have no throttle. Whether a user is voiced or an administrator isn't known until some point well after the connection is made; before then, everyone is considered a regular user.
Currently, I handle throttled messages by putting the react block that listens for messages in a loop that exits once the connection has been forcibly closed. When the throttle gets updated, I call done to enter the next iteration, which rebuilds the whenever block for the supply of outgoing messages with the updated throttle. This is terrible, racy code!
What can I do to (a) ensure the connection starts out with a 0.6s throttle that can be used immediately after a WebSocket connection is made (since everyone starts out as a regular user), (b) make it possible to call a method that updates this throttle when needed, and (c) avoid keeping any state related to this throttle (which can be inferred by other means)?
Edit: I forgot to mention this earlier, but there are unthrottled messages for all types of users, not just administrators.

I found a way to do this using a combination of Supply.migrate, Supply.map, and Supply.merge. If I create suppliers for throttled messages, unthrottled messages, and throttle updates, I can map over the throttle-updates supply to produce a throttled message supply for each throttle emitted, call Supply.migrate on the resulting supply, and merge that with the unthrottled-messages supply. The result is one supply I can use to handle sending all types of messages:
react {
    my Supplier:D $unthrottled .= new;
    my Supplier:D $throttled   .= new;
    my Supplier:D $throttles   .= new;

    whenever Supply.merge(
        $unthrottled.Supply,
        $throttles.Supply.map({ $throttled.Supply.throttle: 1, $_ }).migrate,
    ) -> Str:D $message {
        say sprintf '%s # %f', $message, now;
        done if ++$ == 12;
    }

    $throttles.emit: 1.2;
    $throttled.emit: "$_" for 1..3;
    whenever Promise.in(1) {
        $unthrottled.emit: "$_" for 7..9;
    }
    whenever Promise.in(5) {
        $throttles.emit: 0.6;
        $throttled.emit: "$_" for 4..6;
        whenever Promise.in(1) {
            $unthrottled.emit: "$_" for 10..12;
        }
    }
}
# OUTPUT:
# 1 # 1579731916.713831
# 7 # 1579731917.764047
# 8 # 1579731917.769012
# 9 # 1579731917.774584
# 2 # 1579731917.913512
# 3 # 1579731919.123057
# 4 # 1579731921.749377
# 5 # 1579731922.353073
# 10 # 1579731922.768212
# 11 # 1579731922.773777
# 12 # 1579731922.780446
# 6 # 1579731922.963087
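
To cover point (b) of the question, the same map/migrate trick can be wrapped in a class so that callers update the throttle through a method instead of emitting on the supplier directly. A minimal sketch under the same approach; the class and method names are my own, and the unthrottled supply is omitted (it would merge in exactly as above):

class ThrottledSender {
    has Supplier:D $!messages  .= new;
    has Supplier:D $!throttles .= new;

    # Each emitted delay re-throttles the message supply; migrate
    # switches the output over to the most recently produced supply.
    method Supply(--> Supply:D) {
        $!throttles.Supply.map({ $!messages.Supply.throttle: 1, $_ }).migrate
    }

    method send(Str:D $message) { $!messages.emit: $message }

    # Call this once the user's status (regular/voiced/admin) is known.
    method set-throttle(Real:D $seconds) { $!throttles.emit: $seconds }
}

Because Supplier-backed supplies are live, tap .Supply first (e.g. in a whenever) and then call set-throttle with the initial 0.6s rate; a throttle emitted before anything taps the supply would be lost.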

Interesting. I gave it a go and from what I read in the documentation, which is sparse, something like the following should work:
my $s = Supply.from-list(^Inf);
my $control = Supplier.new.Supply;
my $throttle = $s.throttle( 10, 1, :$control );

react
{
    whenever $throttle -> $n
    {
        $n.say
    };

    # change speed every 5 seconds
    whenever Supply.interval(5) -> $x
    {
        my $limit = (1..10).pick;
        "limit: $limit".say;
        $control.emit( "limit: $limit" );
    }
}
But it doesn't. The program freezes when it hits $control.emit( ... ). If you comment that out, it runs as expected. Relevant doc parts of the throttle method:
Produces a Supply from a given Supply, but makes sure the number of messages passed through, is limited.
[ ... ]
If the second positional parameter is a numeric value, it is interpreted as the time-unit (in seconds). If you specify .1 as the value, then it makes sure you don't
exceed the limit for every tenth of a second.
[ ... ]
The :control named parameter optionally specifies a Supply that you can use to
control the throttle while it is in operation. Messages that can be
sent, are strings in the form of "key:value".
[ ... ]
These messages can be sent to the :control Supply. A control message consists of a string of the form "key: value", e.g. "limit: 4".
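
One thing worth checking in the snippet above: $control is the Supply side (Supplier.new.Supply), yet the control messages are emitted on it directly, and emit lives on Supplier, not Supply. A sketch of the pattern with the Supplier kept in hand, assuming that is the intended usage of :control (I haven't verified that this resolves the freeze):

my $s        = Supply.from-list(^Inf);
my $supplier = Supplier.new;        # keep the Supplier so we can emit on it
my $control  = $supplier.Supply;    # pass the Supply side to :control
my $throttle = $s.throttle( 10, 1, :$control );

react {
    whenever $throttle -> $n { $n.say }

    # change speed every 5 seconds
    whenever Supply.interval(5) {
        my $limit = (1..10).pick;
        "limit: $limit".say;
        $supplier.emit( "limit: $limit" );  # emit on the Supplier, not the Supply
    }
}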


How would you make abstracted nested dictionary/hashtable objects readable (style)?

I'm parsing a database table exported into csv where there are embedded fields in what is essentially a memo field.
The database also contains version history, and the csv contains all versions.
Basic structure of the data is Index (sequential record number), Reference (specific foreign key), Sequence (order of records for a given reference), and Data (the memo field with the data to parse).
You could think of the "Data" field as text documents limited to 80 chars wide and 40 chars deep, and then sequenced in the order they would print. Every record entry is assigned an ascending index.
For reference, $myParser is a [Microsoft.VisualBasic.FileIO.TextFieldParser], so ReadFields() returns a row of fields as an array/list.
My ultimate question is: how can this be formatted to be more intuitive to the reader? The code below is PowerShell, but I'd be interested in answers relating to C# as well, since it's something of a language-agnostic style problem, though I think get/set would trivialize this to some degree.
Consider the following code (an insert/update routine in a 2 deep nested dictionary/hash):
enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

$myRecords = [System.Collections.Generic.Dictionary[int,System.Collections.Generic.Dictionary[int,string]]]::new() # this could be a hash table, but is more verbose this way

While($true) # there's actually control here, but this provides a simple loop assuming infinite data
{
    $myFields = $myParser.ReadFields() # read a line from the csv file and return an array/list of fields for that line

    if(!$myRecords.ContainsKey($myFields[[cmtField]::Reference])) # if the reference of the current record is new
    {
        $myRecords.Add($myFields[[cmtField]::Reference],[System.Collections.Generic.Dictionary[int,CommentRecord]]::new()) # create tier 1 reference index
        $myRecords[$myFields[[cmtField]::Reference]].Add($myFields[[cmtField]::Sequence],$myFields[[cmtField]::Data]) # create tier 2 sequence reference and data
    }
    else # if the reference already exists in the dictionary
    {
        if(!$myRecords[$myFields[[cmtField]::Reference]].ContainsKey($myFields[[cmtField]::Sequence])) # if the sequence ID of the current record is new
        {
            $myRecords[$myFields[[cmtField]::Reference]].Add($myFields[[cmtField]::Sequence],$myFields[[cmtField]::Data]) # add record at [reference][sequence]
        }
        else # if the sequence already exists for this reference
        {
            if($myRecords[$myFields[[cmtField]::Reference]][$myFields[[cmtField]::Sequence]].Index -lt $myFields[[cmtField]::Index]) # if the index of the record just read is higher than the stored index, it must be newer
            {
                $myRecords[$myFields[[cmtField]::Reference]][$myFields[[cmtField]::Sequence]] = $myFields[[cmtField]::Data] # replace with new data
            }
            # else discard the record just read (do nothing)
        }
    }
}
Frankly, trying to make this readable both makes my head hurt and my eyes bleed a little. It only gets messier and messier the deeper the dictionary goes. I'm stuck between the bracket soup and no self-documentation.
My ultimate question is, how can this be formatted to be more intuitive to the reader?
That... ultimately depends on who "the reader" is - is it your boss? Your colleagues? Me? Will you use this code sample to teach programming to someone?
In terms of making it less "messy", there are a couple of immediate steps you can take.
The first thing I would change to make your code more readable, would be to add a using namespace directive at the top of the file:
using namespace System.Collections.Generic
Now you can create nested dictionaries with:
[Dictionary[int,Dictionary[int,string]]]::new()
... as opposed to:
[System.Collections.Generic.Dictionary[int,System.Collections.Generic.Dictionary[int,string]]]::new()
The next thing I would reduce is repeated index access patterns like $myFields[[cmtField]::Reference] - you never modify $myFields after initial assignment at the top of the loop, so there's no need to delay resolution of it.
while($true)
{
    $myFields = $myParser.ReadFields()

    $Reference = $myFields[[cmtField]::Reference]
    $Data      = $myFields[[cmtField]::Data]
    $Sequence  = $myFields[[cmtField]::Sequence]
    $Index     = $myFields[[cmtField]::Index]

    if(!$myRecords.ContainsKey($Reference)) # if the reference of the current record is new
    {
        $myRecords.Add($Reference,[Dictionary[int,CommentRecord]]::new()) # create tier 1 reference index
        $myRecords[$Reference].Add($Sequence,$Data) # create tier 2 sequence reference and data
    }
    else
    {
        # ...
Finally, you can simplify the code considerably by abandoning the nested if/else statements and instead breaking the logic down into a succession of steps that each record has to pass through one by one. You end up with something like this:
using namespace System.Collections.Generic

enum cmtField
{
    Index = 0
    Sequence = 1
    Reference = 2
    Data = 4
}

$myRecords = [Dictionary[int,Dictionary[int,CommentRecord]]]::new()

while($true)
{
    $myFields = $myParser.ReadFields()

    $Reference = $myFields[[cmtField]::Reference]
    $Data      = $myFields[[cmtField]::Data]
    $Sequence  = $myFields[[cmtField]::Sequence]
    $Index     = $myFields[[cmtField]::Index]

    # Step 1 - ensure tier 1 dictionary is present
    if(!$myRecords.ContainsKey($Reference))
    {
        $myRecords.Add($Reference,[Dictionary[int,CommentRecord]]::new())
    }

    # (now we only need to resolve `$myRecords[$Reference]` once)
    $record = $myRecords[$Reference]

    # Step 2 - ensure sequence entry exists
    if(!$record.ContainsKey($Sequence))
    {
        $record.Add($Sequence, $Data)
    }

    # Step 3 - handle superseding comment records
    if($record[$Sequence].Index -lt $Index)
    {
        $record[$Sequence] = $Data
    }
}
I personally find this easier on the eyes (and mind) than the original if/else approach.
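
One gap in both the question and the answer: they reference a CommentRecord type that is never defined, and step 3 reads .Index from it while the Add calls store the raw Data string. A minimal sketch of what such a type could look like; the property names are inferred from the surrounding code, not taken from the original post:

class CommentRecord
{
    [int]$Index
    [string]$Data
}

With that in place, steps 2 and 3 would store [CommentRecord]@{ Index = $Index; Data = $Data } instead of the bare $Data string, so the .Index comparison in step 3 has something to read.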

Invoke-RestMethod and foreach loop

Via PowerShell I'm trying to get the last ticker data from all currency pairs via the public API of a cryptocurrency exchange.
For this I first get all markets and then I want to loop through these, but for some reason only the first currency pair is being returned.
Anyone knows what I'm missing?
$bt_baseapi_url = "https://bittrex.com/api/v1.1/"
$getmarkets = $bt_baseapi_url + "public/getmarkets"
$getticker = $bt_baseapi_url + "public/getticker"

$markets = Invoke-RestMethod -Uri $getmarkets
$marketnames = $markets.result

foreach ($marketname in $marketnames.marketname) {
    $tickerurl = $getticker + "?market=" + $marketname
    $ticker = Invoke-RestMethod -Uri $tickerurl
    return $ticker.result.last
}
As Ansgar Wiechers suggests in a comment on the question, do not use return inside a foreach statement's body in an attempt to return (output) a value while continuing the loop; return would return from any enclosing function or script.
Instead, rely on PowerShell's implicit output behavior, as demonstrated in this simple example:
> foreach ($el in 1, 2, 3) { $el }
1
2
3
Simply referencing $el without assigning it to a variable or piping / redirecting it elsewhere causes its value to be output.
If needed at all, use continue to prevent execution of subsequent statements in the loop body while continuing the loop overall; use break to exit the loop.
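For instance (a tiny illustration of my own, not from the original answer):
foreach ($el in 1, 2, 3, 4, 5) {
    if ($el -eq 3) { continue }  # skip 3, keep looping
    if ($el -eq 5) { break }     # stop the loop entirely
    $el
}
# Outputs: 1 2 4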
By contrast - and this may be the source of the confusion - inside the script block of a ForEach-Object cmdlet call (as part of a pipeline), rather than a foreach statement, the rules change: return only exits the iteration at hand and proceeds with the next input object:
> 1, 2, 3 | ForEach-Object { return $_ }
1
2
3
Note that even in this case return $_ is just syntactic sugar for $_; return - i.e., an output generating statement followed by a control-flow statement, and simply using $_ may be enough.
Do NOT use break / continue with the ForEach-Object cmdlet, as these statements would look for an enclosing loop statement (such as foreach, do, or while) and - in the absence of one - exit the entire script.
Unfortunately, there is no direct way to exit a pipeline prematurely - see https://github.com/PowerShell/PowerShell/issues/3821; make your voice heard there if you think this should change.
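
Applied to the code in the question, the fix is simply to drop return and let each ticker value be emitted implicitly (same URLs and properties as in the question):

$bt_baseapi_url = "https://bittrex.com/api/v1.1/"
$getmarkets = $bt_baseapi_url + "public/getmarkets"
$getticker = $bt_baseapi_url + "public/getticker"

$markets = Invoke-RestMethod -Uri $getmarkets

foreach ($marketname in $markets.result.marketname) {
    $tickerurl = $getticker + "?market=" + $marketname
    $ticker = Invoke-RestMethod -Uri $tickerurl
    $ticker.result.last  # implicit output; the loop keeps going
}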

Perl SNMP get_bulk_request returns data for all indexes even when an index is added to the OID; snmpwalk on the terminal returns only the requested index

my $UsRx = '1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288';

my %table; # Hash to store the results

my $res = $session->get_bulk_request(
    -varbindlist    => [ $UsRx ],
    -callback       => [ \&get_callback, \%table ],
    -maxrepetitions => 80,
);

snmp_dispatcher();

if (!defined $res) {
    printf "ERROR: %s\n", $session->error();
    $session->close();
    exit 1;
}

for my $oid (oid_lex_sort(keys %table)) {
    printf "%s,%s,\n", $oid, $table{$oid};
}
Note: the callback function is not shown here, but assume it works correctly. The issue seems to be with get_bulk_request: when I need data for a single index, it ignores the given index and returns data for other indexes too. Any alternative solution would also be appreciated.
Output:
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1337,-70
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1338,-75
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1339,-55
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1340,-60
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737289.1337,-75
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737289.1338,-75
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737289.1339,-60
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737289.1340,-65
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737290.1337,-80
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737290.1338,-70
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737290.1339,-65
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737290.1340,-65
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737291.1337,-65
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737291.1338,-55
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737291.1339,-50
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737291.1340,-45
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737293.1337,-15
Expected output:
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1337,-70
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1338,-75
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1339,-55
1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1340,-60
Meanwhile, this works fine with snmpwalk on the terminal:
system#new:~$ snmpwalk -v2c -c #543%we 23.9.4.67 1.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288
iso.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1337 = INTEGER: -70
iso.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1338 = INTEGER: -75
iso.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1339 = INTEGER: -55
iso.3.6.1.4.1.4491.2.1.20.1.4.1.3.737288.1340 = INTEGER: -60
I'm not sure I'm interpreting your question correctly, but it sounds like you are asking why snmpwalk (the CLI tool) returns only OIDs that share the prefix you specified, while a get-bulk from your Perl code returns OIDs beyond the subtree you requested.
That is expected behavior. "snmpwalk" is not an SNMP request type; get-bulk and get-next are. Rather, snmpwalk is a specialized tool that issues get-next or get-bulk requests and itself detects when a retrieved OID falls outside the subtree you specified, terminating the walk at that point. Unless the API you're using provides a similar function, you have to implement this logic in your own code. The agent is just doing what was requested: returning up to 80 (per your code) varbinds lexicographically greater than the request OID. SNMP has no built-in request type that retrieves only a subtree.
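
Net::SNMP does ship helpers that make the boundary check easy, e.g. oid_base_match() alongside the oid_lex_sort() already used above. A sketch that filters the collected table down to the requested subtree before printing; in a full walk implementation you would also stop issuing further get-bulk requests once the last returned OID leaves the subtree:

use Net::SNMP qw(oid_base_match oid_lex_sort);

# Print only OIDs inside the requested subtree.
for my $oid (oid_lex_sort(keys %table)) {
    next unless oid_base_match($UsRx, $oid);
    printf "%s,%s\n", $oid, $table{$oid};
}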

Real-time output from engines in IPython parallel?

I am running a bunch of long-running tasks with IPython's great parallelization functionality.
How can I get real-time output from the ipengines' stdout in my IPython client?
E.g., I'm running dview.map_async(fun, lots_of_args) and fun prints to stdout. I would like to see the outputs as they are happening.
I know about AsyncResult.display_output(), but it's only available after all tasks have finished.
You can see stdout in the meantime by accessing AsyncResult.stdout, which returns a list of strings, one per engine's stdout.
The simplest case being:
print ar.stdout
You can wrap this in a simple function that prints stdout while you wait for the AsyncResult to complete:
import sys
import time
from IPython.display import clear_output

def wait_watching_stdout(ar, dt=1, truncate=1000):
    while not ar.ready():
        stdouts = ar.stdout
        if not any(stdouts):
            continue
        # clear_output doesn't do much in terminal environments
        clear_output()
        print '-' * 30
        print "%.3fs elapsed" % ar.elapsed
        print ""
        for eid, stdout in zip(ar._targets, ar.stdout):
            if stdout:
                print "[ stdout %2i ]\n%s" % (eid, stdout[-truncate:])
        sys.stdout.flush()
        time.sleep(dt)
An example notebook illustrating this function.
Now, if you are using older IPython, you may see an artificial restriction on access of the stdout attribute ('Result not ready' errors).
The information is available in the metadata, so you can still get at it while the task is not done:
rc.spin()
stdout = [ rc.metadata[msg_id]['stdout'] for msg_id in ar.msg_ids ]
This is essentially the same thing that the ar.stdout attribute access does.
In case somebody is still struggling with getting ordinary print output from the individual kernels: I adapted minrk's answer so that I get the output of each kernel as if it were local, by continually checking whether the stdout of each kernel has changed while the program is running.
asdf = dview.map_async(function, arguments)

# initialize a stdout0 array for comparison
stdout0 = asdf.stdout

while not asdf.ready():
    # check if stdout changed for any kernel
    if asdf.stdout != stdout0:
        for i in range(0, len(asdf.stdout)):
            if asdf.stdout[i] != stdout0[i]:
                # print only new stdout, without the previous message and the trailing '\n'
                print('kernel ' + str(i) + ': ' + asdf.stdout[i][len(stdout0[i]):-1])
        # set stdout0 to last output for new comparison
        stdout0 = asdf.stdout
    else:
        continue

asdf.get()
The output will then look something like:
kernel 0: message 1 from kernel 0
kernel 1: message 1 from kernel 1
kernel 0: message 2 from kernel 0
kernel 0: message 3 from kernel 0
kernel 1: message 2 from kernel 1
...
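
One caveat about the loop above (my note, not part of the original answer): as written it polls asdf.stdout in a tight busy-loop. A short sleep keeps it from pinning a CPU core while waiting:

import time

while not asdf.ready():
    if asdf.stdout != stdout0:
        # ... same per-kernel comparison and printing as above ...
        stdout0 = asdf.stdout
    time.sleep(0.1)  # check ~10 times per second instead of spinning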

Trouble using 'while' loop to evaluate multiple lines, Perl

Thank you in advance for indulging an amateur Perl question. I'm extracting some data from a large, unformatted text file, and am having trouble combining the use of a 'while' loop and regular expression matching over multiple lines.
First, a sample of the data:
01-034575 18/12/2007 258,750.00 11,559.00 36 -2 0 6 -3 2 -2 0 2 1 -1 3 0 5 15
-13 -44 -74 -104 -134 -165 -196 -226 -257 -287 -318 -349 -377 -408 -438
-469 -510 -541 -572 -602 -633 -663
Atraso Promedio ---> 0.94
The first sequence, XX-XXXXXX is a loan ID number. The date and the following two numbers aren't important. '36' is the number of payments. The following sequence of positive and negative numbers represent how late/early this client was for this loan at each of the 36 payment periods. The '0.94' following 'Atraso Promedio' is the bank's calculation for average delay. The problem is it's wrong, since they substitute all negative (i.e. early) payments in the series with zeros, effectively over-stating how risky a client is. I need to write a program that extracts ID and number of payments, and then dynamically calculates a multi-line average delay.
Here's what I have so far:
# Create an output file
open(OUT, ">out.csv");
print OUT "Loan_ID,Atraso_promedio,Atraso_alt,N_payments,\n";

open(MYINPUTFILE, "<DATA.txt");

while(<MYINPUTFILE>){
    chomp($_);
    if($ID_select != 1 && m/(\d{2}\-\d{6})/){ $Loan_ID = $1, $ID_select = 1 }
    if($ID_select == 1 && m/\d{1,2},\d{1,3}\.00\s+\d{1,2},\d{1,3}\.00\s+(\d{1,2})/){ $N_payments = $1, $Payment_find = 1 };
    if($Payment_find == 1 && $ID_select == 1){
        while(m/\s{2,}(\-?\d{1,3})/g){
            $N++;
            $SUM = $SUM + $1;
            print OUT "$Loan_ID,$1\n"; # this shows me what numbers the code is grabbing; actual output will be written below
            print $Loan_ID,"\n";
        }
        if(m/---> *(\d*.\d*)/){ $Atraso = $1, $Atraso_select = 1 }
        if($ID_select == 1 && $Payment_find == 1 && $Atraso_select == 1){
            ...
There's more, but the while loop is where the program is breaking down. The problem is with the pattern modifier 'g', which performs a global search of the string. This makes the program grab numbers that I don't want, such as the '1' in the loan ID and the '36' for the number of payments. I need the while loop to start from wherever the previous line in the code left off, which should be right after it has identified the number of payments. I've tried every pattern modifier that I've been able to look up, and only 'g' keeps me out of an infinite loop. I need the while loop to go to the end of the line, then start on the next one without combing over the parts of the string already fed through the program.
Thoughts? Does this make sense? Would be immensely grateful for any help you can offer. This work is pro-bono, unpaid: just trying to help out some friends in a micro-lending institution conduct a risk analysis.
Cheers,
Aaron
The problem is probably easier using split, for instance something like this:
use strict;
use warnings;

open DATA, "<DATA.txt" or die "$!";

my @payments;
my $numberOfPayments;
my $loanNumber;

while(<DATA>)
{
    if(/\b\d{2}-\d{6}\b/)
    {
        ($loanNumber, undef, undef, undef, $numberOfPayments, @payments) = split;
    }
    elsif(/Atraso Promedio/)
    {
        my (undef, undef, undef, $atrasoPromedio) = split;
        # Calculate average of payments and print results
    }
    else
    {
        push(@payments, split);
    }
}
If the data's clean enough, I might approach it by using split instead of regular expressions. The first line is identifiable if field[0] matches the form of a loan number and field[1] matches the format of a date; then the payment dates are an array slice of field[5..-1]. Similarly testing the first field of each line tells you where you are in the data.
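A sketch of that field-test idea (my illustration of the approach just described, assuming $line holds the current input line):
my @fields = split ' ', $line;
if ($fields[0] =~ /^\d{2}-\d{6}$/ && $fields[1] =~ m{^\d{2}/\d{2}/\d{4}$}) {
    my $loan_id    = $fields[0];
    my $n_payments = $fields[4];
    my @delays     = @fields[5 .. $#fields];  # the per-period delays
}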
Peter van der Heijden's answer is a nice simplification for a solution.
To answer the OP's question about getting the regexp to continue where it left off, see Perl operators - regexp-quote-like operators, specifically the section "Matching in list context" and the "\G assertion" section just after that.
Essentially, you can use m//gc along with the \G assertion to have regexp matches continue where previous matches left off.
The example in the "\G assertion" section about lex-like scanners would seem to apply to this question.
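
To make the \G approach concrete, here is a minimal sketch against a single hard-coded sample line (my own example, not the OP's full program): each m//gc match in scalar context remembers its end position, and \G anchors the next match there, so the number scan below cannot reach back and grab digits from the loan ID or the payment count.

use strict;
use warnings;

my $line = '01-034575 18/12/2007 258,750.00 11,559.00 36 -2 0 6 -3 2';

if ($line =~ m/(\d{2}-\d{6})/gc) {       # match the loan ID and remember pos()
    my $loan_id = $1;

    # Skip the date and the two unimportant amounts, capture the payment count.
    $line =~ m/\G\s+\S+\s+\S+\s+\S+\s+(\d+)/gc or die "no payment count";
    my $n_payments = $1;

    # Continue from where the last match left off, collecting the delays.
    my @delays;
    push @delays, $1 while $line =~ m/\G\s+(-?\d+)/gc;

    print "$loan_id: $n_payments payments, delays: @delays\n";
}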