I have this IPython code:
import ipywidgets as widgets
from IPython.display import display
import time

w = widgets.Dropdown(
    options=['Addition', 'Multiplication', 'Subtraction'],
    value='Addition',
    description='Task:',
)

def on_change(change):
    print("changed to %s" % change['new'])

w.observe(on_change)
display(w)
It works as expected. When the value of the widget changes, the on_change function gets triggered. However, I want to run a long computation and periodically check for updates to the widget. For example:
for i in range(100):
    time.sleep(1)
    # poll for changes to w here
    # if w.has_changed:
    #     print(w.value)
How can I achieve this?
For reference, I seem to be able to do the desired polling with
import IPython
ipython = IPython.get_ipython()
ipython.kernel.do_one_iteration()
(I'd still love to have some feedback on whether this works by accident or design.)
I think you need to use threads and hook into the ZMQ event loop. This gist illustrates an example:
https://gist.github.com/maartenbreddels/3378e8257bf0ee18cfcbdacce6e6a77e
Also see https://github.com/jupyter-widgets/ipywidgets/issues/642.
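In the spirit of that approach, here is a minimal sketch (my own, not the gist's code) that runs the long computation in a background thread so the kernel's event loop stays free to deliver widget messages and fire on_change as usual; it assumes the widget w from the question:

import threading
import time

def long_computation():
    for i in range(100):
        time.sleep(1)
        # w.value stays current because the main thread keeps
        # processing comm messages while this thread sleeps
        print(w.value)

threading.Thread(target=long_computation, daemon=True).start()

Be aware that output printed from a background thread may land in whichever cell is currently active, which is part of what the gist works around.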
To elaborate on the OP's self-answer: this does work. It forces the widgets to sync with the kernel at an arbitrary point in the loop, which can be done right before accessing widget.value.
So the full solution would be:
import IPython

ipython = IPython.get_ipython()

old_val = 0
for i in range(100):
    time.sleep(1)
    ipython.kernel.do_one_iteration()  # sync widget state with the kernel
    new_val = w.value
    if new_val != old_val:
        print(new_val)
        old_val = new_val
A slight improvement to the ipython.kernel.do_one_iteration call used above:
# Max iteration limit, in case I don't know what I'm doing here...
for _ in range(100):
    ipython.kernel.do_one_iteration()
    if ipython.kernel.msg_queue.empty():
        break
In my case, I had a number of UI elements that could be clicked multiple times between do_one_iteration calls. A single call processes them one at a time, and with a 1-second delay between calls that could get annoying. The loop above processes at most 100 at a time. I tested it by mashing a button repeatedly, and now they all get processed as soon as the sleep(1) ends.
I have never used multithreading before, so I searched for an easy example and found this approach from 8 years ago, which opens 10 threads and joins them later (I liked it because I also needed to save the results of my thread functions):
How to get the return value from a thread in python?
def foo(bar, result, index):
    print 'hello {0}'.format(bar)
    result[index] = "foo"

from threading import Thread

threads = [None] * 10
results = [None] * 10

for i in range(len(threads)):
    threads[i] = Thread(target=foo, args=('world!', results, i))
    threads[i].start()

# do some other stuff

for i in range(len(threads)):
    threads[i].join()

print " ".join(results)
I managed to implement this in my code (which I know works correctly without multi-threading) and the implementation works: no errors, and it seems everything finishes and does what it should.
Just afterwards, iPython hardly responds anymore (super slow), so I can't continue working.
(I can hardly type, and new code I enter in iPython runs super slow.)
Any ideas what might cause the problem?
According to the Resource Manager the CPUs are still processing something. But why? The script has finished! At least I am 99% sure it did.
I am truly thankful for any tips!
What is there in "# do some other stuff"? You can add is_alive() towards the end to check whether it is the threads or some other stuff that is causing this lag.
Minimally edited code (with respect to yours) is below.
def foo(bar, result, index):
    print 'hello {0}'.format(bar)
    result[index] = "foo"

from threading import Thread

threads = [None] * 10
results = [None] * 10

for i in range(len(threads)):
    threads[i] = Thread(target=foo, args=('world!', results, i))
    threads[i].start()

# do some other stuff

for i in range(len(threads)):
    threads[i].join()

print " ".join(results)

for i in range(len(threads)):
    print(threads[i].is_alive())
If it prints False 10 times at the end, then none of your threads are alive: they have done their job and are out of the way. Otherwise they are still working.
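As an aside (my own addition, not part of the answer above), you can also list every thread the interpreter still considers alive; in an iPython session this will include iPython's own worker threads as well as any of yours:

import threading

for t in threading.enumerate():
    print("%s alive=%s" % (t.name, t.is_alive()))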
I have a lot of data and I have experimented with partitions of cardinality in the range [20k, 200k+].
I call it like this:
from pyspark.mllib.clustering import KMeans, KMeansModel
C0 = KMeans.train(first, 8192, initializationMode='random', maxIterations=10, seed=None)
C0 = KMeans.train(second, 8192, initializationMode='random', maxIterations=10, seed=None)
and I see that initRandom() calls takeSample() once.
Then the takeSample() implementation doesn't seem to call itself recursively or anything like that, so I would expect KMeans() to call takeSample() once. So why does the monitor show two takeSample()s per KMeans()?
Note: I execute more KMeans() runs and they all invoke two takeSample()s, regardless of whether the data is .cache()'d or not.
Moreover, the number of partitions doesn't affect the number of times takeSample() is called; it's constant at 2.
I am using Spark 1.6.2 (and I cannot upgrade) and my application is in Python, if that matters!
I brought this to the mailing list of the Spark devs, so I am updating:
Details of the 1st takeSample() and of the 2nd takeSample() (screenshots omitted), where one can see that the same code is executed.
As suggested by Shivaram Venkataraman in Spark's mailing list:
I think takeSample itself runs multiple jobs if the amount of samples
collected in the first pass is not enough. The comment and code path
at GitHub
should explain when this happens. Also you can confirm this by
checking if the logWarning shows up in your logs.
// If the first sample didn't turn out large enough, keep trying to take samples;
// this shouldn't happen often because we use a big multiplier for the initial size
var numIters = 0
while (samples.length < num) {
  logWarning(s"Needed to re-sample due to insufficient sample size. Repeat #$numIters")
  samples = this.sample(withReplacement, fraction, rand.nextInt()).collect()
  numIters += 1
}
However, as one can see, the second comment says it shouldn't happen often, yet it always happens to me, so if anyone has another idea, please let me know.
It was also suggested that this was a problem with the UI and that takeSample() was actually called only once, but that was just hot air.
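One way to check the re-sampling explanation (my own suggestion, not from the mailing list) is to make sure WARN-level messages are visible before calling KMeans.train and then search the driver/executor logs for the warning from the code above; this sketch assumes sc is your SparkContext:

sc.setLogLevel("WARN")  # make sure WARN-level messages are emitted
C0 = KMeans.train(first, 8192, initializationMode='random', maxIterations=10, seed=None)
# then grep the logs for "Needed to re-sample due to insufficient sample size"

If the warning never appears even though two takeSample() jobs show up, then the re-sampling path is probably not what you are seeing.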
I want to make a progress bar that updates asynchronously within the Jupyter notebook (with an ipython kernel)
Example
In [1]: ProgressBar(...)
Out[1]: [|||||||||------------------] 35% # this keeps moving
In [2]: # even while I do other stuff
I plan to spin up a background thread to check and update progress. I'm not sure how to update the rendered output, though (or even whether this is possible).
This might help put you on the right track. The code is taken from lightning-viz, which borrowed a lot of it from matplotlib. Warning: this is all pretty under-documented.
In Python you have to instantiate a Comm object:
from IPython.kernel.comm import Comm
comm = Comm('comm-target-name', {'id': self.id})
Full code here: https://github.com/lightning-viz/lightning-python/blob/master/lightning/visualization.py#L15-L19. The id is just there in case you want to manage multiple different progress bars, for example.
Then do the same in JavaScript:
var IPython = window.IPython;
IPython.notebook.kernel.comm_manager.register_target('comm-target-name', function(comm, data) {
    // the data here contains the id you set above, useful for managing
    // state w/ multiple comm objects
    // register the event handler here
    comm.on_msg(function(msg) {
    });
});
Full example here. Note that the JavaScript on_msg code is untested, as I've only used comm to go from JS -> Python. To see what that handler looks like, see https://github.com/lightning-viz/lightning-python/blob/master/lightning/visualization.py#L90
Finally, to send a message in Python:
comm.send(data=data)
https://ipython.org/ipython-doc/3/api/generated/IPython.kernel.comm.comm.html#IPython.kernel.comm.comm.Comm.send
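Tying this back to the original progress-bar question, here is a rough sketch under the assumptions above (a Comm registered as 'comm-target-name' and a JS on_msg handler that redraws the bar from the message data; the 'percent' key and the 'progress-1' id are made up): a background thread can push updates through the comm while you keep working in other cells.

import threading
import time

def report_progress(comm, total=100):
    for done in range(total + 1):
        comm.send(data={'id': 'progress-1', 'percent': done})  # made-up payload shape
        time.sleep(0.1)  # stand-in for the real work being tracked

threading.Thread(target=report_progress, args=(comm,), daemon=True).start()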
from random import randrange
from time import sleep
#import thread
from threading import Thread
from Queue import Queue

'''The idea is that there is a Seeker method that would search a location
for task, I have no idea how many task there will be, could be 1 could be 100.
Each task needs to be put into a thread, does its thing and finishes. I have
stripped down a lot of what this is really suppose to do just to focus on the
correct queuing and threading aspect of the program. The locking was just
me experimenting with locking'''

class Runner(Thread):
    current_queue_size = 0

    def __init__(self, queue):
        self.queue = queue
        data = queue.get()
        self.ID = data[0]
        self.timer = data[1]
        #self.lock = data[2]
        Runner.current_queue_size += 1
        Thread.__init__(self)

    def run(self):
        #self.lock.acquire()
        print "running {ID}, will run for: {t} seconds.".format(ID = self.ID,
                                                                t = self.timer)
        print "Queue size: {s}".format(s = Runner.current_queue_size)
        sleep(self.timer)
        Runner.current_queue_size -= 1
        print "{ID} done, terminating, ran for {t}".format(ID = self.ID,
                                                           t = self.timer)
        print "Queue size: {s}".format(s = Runner.current_queue_size)
        #self.lock.release()
        sleep(1)
        self.queue.task_done()

def seeker():
    '''Gathers data that would need to enter its own thread.
    For now it just uses a count and random numbers to assign
    both a task ID and a time for each task'''
    queue = Queue()
    queue_item = {}
    count = 1
    #lock = thread.allocate_lock()
    while (count <= 40):
        random_number = randrange(1,350)
        queue_item[count] = random_number
        print "{count} dict ID {key}: value {val}".format(count = count, key = random_number,
                                                          val = random_number)
        count += 1
    for n in queue_item:
        #queue.put((n,queue_item[n],lock))
        queue.put((n,queue_item[n]))
        '''I assume it is OK to put a tuple in and pull it out later'''
        worker = Runner(queue)
        worker.setDaemon(True)
        worker.start()
    worker.join()
    '''Which one of these is necessary and why? The queue object
    joining or the thread object'''
    #queue.join()

if __name__ == '__main__':
    seeker()
I have put most of my questions in the code itself, but to go over the main points (Python 2.7):
I want to make sure I am not creating some massive memory leak for myself later.
I have noticed that when I run it at a count of 40 in PuTTY or VNC on my Linux box, I don't always get all of the output, but when I use IDLE and Aptana on Windows, I do.
Yes, I understand that the point of Queue is to stagger out your threads so you are not flooding your system's memory, but the tasks at hand are time-sensitive, so they need to be processed as soon as they are detected regardless of how many or how few there are; I have found that with Queue I can clearly dictate when a task has finished, as opposed to letting the garbage collector guess.
I still don't know why I am able to get away with using .join() on either the thread or the queue object.
Tips, tricks, and general help are welcome.
Thanks for reading.
If I understand you correctly, you need a thread to monitor something to see if there are tasks that need to be done. If a task is found, you want it to run in parallel with the seeker and other currently running tasks.
If this is the case, then I think you might be going about this the wrong way. Take a look at how the GIL works in Python. I think what you might really want here is multiprocessing.
Take a look at this from the pydocs:
CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously.
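As a rough illustration of that suggestion (a sketch of my own, not a drop-in replacement for your Runner class), the tasks the seeker finds could be handed to a pool of worker processes, so CPU-bound work is not serialized by the GIL:

from multiprocessing import Pool
from time import sleep

def run_task(task):
    task_id, timer = task
    print("running %s for %s seconds" % (task_id, timer))
    sleep(timer)
    return task_id

if __name__ == '__main__':
    # stand-in for whatever seeker() would actually discover
    tasks = [(n, n % 5) for n in range(1, 41)]
    pool = Pool(processes=4)
    finished = pool.map(run_task, tasks)
    print("finished: %s" % finished)

That said, if your tasks mostly wait (sleeping, network, serial I/O), the quoted paragraph's last sentence applies and threads remain a reasonable choice.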
I'm having a problem with serial IO under both Windows and Linux using pySerial. With this code the device never receives the command and the read times out:
import serial
ser = serial.Serial('/dev/ttyUSB0',9600,timeout=5)
ser.write("get")
ser.flush()
print ser.read()
This code times out the first time through, but subsequent iterations succeed:
import serial
ser = serial.Serial('/dev/ttyUSB0',9600,timeout=5)
while True:
ser.write("get")
ser.flush()
print ser.read()
Can anyone tell me what's going on? I tried to add a call to sync(), but it wouldn't take a serial object as its argument.
Thanks,
Robert
Put some delay between the write and the read, e.g.:
import serial
from time import sleep

ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=5)
ser.flushInput()
ser.flushOutput()
ser.write("get")
# 100 ms delay to give the device time to respond
sleep(.1)
print ser.read()
The question is really old, but I feel this might be a relevant addition.
Some devices (such as the Agilent E3631, for example) rely on DTR. Some ultra-cheap adapters do not have a DTR line (or do not have it broken out), and with those, such devices may never behave in the expected manner (delays between reads and writes get ridiculously long).
If you find yourself wrestling with such a device, my recommendation is to get an adapter with DTR.
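If your adapter does break out DTR, pySerial also lets you assert it explicitly after opening the port; a small sketch (check what your particular device actually expects):

import serial

ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=5)
ser.setDTR(True)  # assert DTR; on pySerial 3.x you can also write ser.dtr = True
ser.write("get")
print(ser.read())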
This is because pySerial returns from opening the port before it is actually ready. I've noticed that things like flushInput() don't actually clear the input buffer if called immediately after open(), for example. The following code demonstrates this:
import unittest
import serial
import time

"""
1) create a virtual or real connection between COM12 and COM13
2) in a terminal connected to COM12 (at 9600, N81), enter some junk text (e.g. 'sdgfdsgasdg')
3) then execute this unit test
"""

class Test_test1(unittest.TestCase):
    def test_A(self):
        with serial.Serial(port='COM13', baudrate=9600) as s:  # open serial port
            print("Read ASAP: {}".format(s.read(s.in_waiting)))
            time.sleep(0.1)  # wait 100 ms for the pyserial port to actually be ready
            print("Read after delay: {}".format(s.read(s.in_waiting)))

if __name__ == '__main__':
    unittest.main()

"""
output will be:
Read ASAP: b''
Read after delay: b'sdgfdsgasdg'
.
----------------------------------------------------------------------
Ran 1 test in 0.101s
"""
My workaround has been to implement a 100ms delay after opening before doing anything.
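A tiny helper along those lines (the function name and the 100 ms default are my own choices, not anything from pySerial):

import time
import serial

def open_serial_settled(port, baudrate, settle=0.1, **kwargs):
    s = serial.Serial(port=port, baudrate=baudrate, **kwargs)
    time.sleep(settle)      # give the port time to actually become ready
    s.reset_input_buffer()  # flushing now really clears any pending junk
    return s

(reset_input_buffer() is the pySerial 3.x name for flushInput().)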
Sorry that this is old and obvious to some, but I didn't see this option mentioned here: I ended up calling read_all() when flush wasn't doing anything with my hardware.
# Stopped reading for a while on the connection, so things built up.
# Neither of these were working:
conn.flush()
conn.flushInput()
# This did the trick; the return value is ignored
conn.read_all()
# Waits for the next line
conn.readline()