By that I mean, is it acceptable to add to or subtract from semaphores? The example I have is the following:
semaphore secureTarget = 7;
semaphore allClearAlert = 0;
semaphore bellAlert = 0;
Archer:
start();
wait(secureTarget);
wait(allClearAlert);
fireAtTarget();
signal(secureTarget);
wait(secureTarget - 7);
signal(bellAlert);
end();
Boy:
start();
signal(allClearAlert);
wait(bellAlert);
end();
Does that seem acceptable? If it helps, the initial question I'm trying to answer is:
An archery club has seven targets. Archers in the club must compete
to secure a target. Once an archer secures her target she must wait
until the all-clear has been sounded before she can fire. Once an
archer finishes firing she leaves her target. The last archer to finish
sounds the bell that signifies that all have finished firing. Only then
is it safe for the small boy who collects the arrows to venture forth.
When all the arrows have been collected the boy gets out of the firing
lines and sounds the all-clear to the archers.
Semaphores can only be incremented using the signal() and wait() methods, you can't explictly change the variable as you describe. I can't give the solution explictly - looking at your history I think I'm doing the same coursework for the same module and I don't want to be done for plagiarism, but you may find the Little Book of Semaphores useful.
EDIT: you don't have to just use semaphores. You can use other types of shared data, as long as you use a mutex semaphore to control concurrent access to those variables.
Related
I'm using the blue pill, and trying to figure out interrupts. I have an interrupt handler:
void __attribute__ ((interrupt ("TIM4_IRQHandler"))) myhandler()
{
puts("hi");
TIM4->EGR |= TIM_EGR_UG; // send an update even to reset timer and apply settings
TIM4->SR &= ~0x01; // clear UIF
TIM4->DIER |= 0x01; // UIE
}
I set up the timer:
RCC_APB1ENR |= RCC_APB1ENR_TIM4EN;
TIM4->PSC=7999;
TIM4->ARR=1000;
TIM4->EGR |= TIM_EGR_UG; // send an update even to reset timer and apply settings
TIM4->EGR |= (TIM_EGR_TG | TIM_EGR_UG);
TIM4->DIER |= 0x01; // UIE enable interrupt
TIM4->CR1 |= TIM_CR1_CEN;
My timer doesn't seem to activate. I don't think I've actually enabled it though. Have I??
I see in lots of example code commands like:
NVIC_EnableIRQ(USART1_IRQn);
What is actually going in NVIC_EnableIRQ()?
I've googled around, but I can't find actual bare-metal code that's doing something similar to mine.
I seem to be missing a crucial step.
Update 2020-09-23 Thanks to the respondents to this question. The trick is to set the bit for the interrupt number in an NVIC_ISER register. As I pointed out below, this doesn't seem to be mentioned in the STM32F101xx reference manual, so I probably would never have been able to figure this out on my own; not that I have any real skill in reading datasheets.
Anyway, oh joy, I managed to get interrupts working! You can see the code here: https://github.com/blippy/rpi/tree/master/stm32/bare/04-timer-interrupt
Even if you go bare metal, you might still want to use the CMSIS header files that provide declarations and inline version of very basic ARM Cortex elements such NVIC_EnableIRQ.
You can find NVIC_EnableIRQ at https://github.com/ARM-software/CMSIS_5/blob/develop/CMSIS/Core/Include/core_cm3.h#L1508
It's defined as:
#define NVIC_EnableIRQ __NVIC_EnableIRQ
__STATIC_INLINE void __NVIC_EnableIRQ(IRQn_Type IRQn)
{
if ((int32_t)(IRQn) >= 0)
{
__COMPILER_BARRIER();
NVIC->ISER[(((uint32_t)IRQn) >> 5UL)] = (uint32_t)(1UL << (((uint32_t)IRQn) & 0x1FUL));
__COMPILER_BARRIER();
}
}
If you want to, you can ignore __COMPILER_BARRIER(). Previous versions didn't use it.
The definition is applicable to Cortex M-3 chips. It's different for other Cortex versions.
With the libraries is still considered bare metal. Without operating system, but anyway, good that you have a desire to learn at this level. Someone has to write the libraries for others.
I was going to do a full example here, (it really takes very little code to do this), but will take from my code for this board that uses timer1.
You obviously need the ARM documentation (technical reference manual for the cortex-m3 and the architectural reference manual for armv7-m) and the data sheet and reference manual for this st part (no need for programmers manual from either company).
You have provided next to no information related to making the part work. You should never dive right into a interrupt, they are advanced topics and you should poll your way as far as possible before finally enabling the interrupt into the core.
I prefer to get a uart working then use that to watch the timer registers when the roll over, count, etc. Then see/confirm the status register fired, learn/confirm how to clear it (sometimes it is just a clear on read).
Then enable it into the NVIC and by polling see the NVIC sees it, and that you can clear it.
You didn't show your vector table this is key to getting your interrupt handler working. Much less the core booting.
08000000 <_start>:
8000000: 20005000
8000004: 080000b9
8000008: 080000bf
800000c: 080000bf
...
80000a0: 080000bf
80000a4: 080000d1
80000a8: 080000bf
...
080000b8 <reset>:
80000b8: f000 f818 bl 80000ec <notmain>
80000bc: e7ff b.n 80000be <hang>
...
080000be <hang>:
80000be: e7fe b.n 80000be <hang>
...
080000d0 <tim1_handler>:
The first word loads the stack pointer, the rest are vectors, the address to the handler orred with one (I'll let you look that up).
In this case the st reference manual shows that interrupt 25 is TIM1_UP at address 0x000000A4. Which mirrors to 0x080000A4, and that is where the handler is in my binary, if yours is not then two things, one you can use VTOR to find an aligned space, sometimes sram or some other flash space that you build for this and point there, but your vector table handler must have the proper pointer or your interrupt handler won't run.
volatile unsigned int counter;
void tim1_handler ( void )
{
counter++;
PUT32(TIM1_SR,0);
}
volatile isn't necessarily the right way to share a variable between interrupt handler and foreground task, it happens to work for me with this compiler/code, you can do the research and even better, examine the compiler output (disassemble the binary) to confirm this isn't a problem.
ra=GET32(RCC_APB2ENR);
ra|=1<<11; //TIM1
PUT32(RCC_APB2ENR,ra);
...
counter=0;
PUT32(TIM1_CR1,0x00001);
PUT32(TIM1_DIER,0x00001);
PUT32(NVIC_ISER0,0x02000000);
for(rc=0;rc<10;)
{
if(counter>=1221)
{
counter=0;
toggle_led();
rc++;
}
}
PUT32(TIM1_CR1,0x00000);
PUT32(TIM1_DIER,0x00000);
A minimal init and runtime for tim1.
Notice that the NVIC_ISER0 is bit 25 that is set to enable interrupt 25 through.
Well before trying this code, I polled the timer status register to see how it works, compare with docs, clear the interrupt per the docs. Then with that knowledge confirmed with the NVIC_ICPR0,1,2 registers that it was interrupt 25. As well as there being no other gates between the peripheral and the NVIC as some chips from some vendors may have.
Then released it through to the core with NVIC_ISER0.
If you don't take these baby steps and perhaps you have already, it only makes the task much worse and take longer (yes, sometimes you get lucky).
TIM4 looks to be interrupt 30, offset/address 0x000000B8, in the vector table. NVIC_ISER0 (0xE000E100) covers the first 32 interrupts so 30 would be in that register. If you disassemble the code you are generating with the library then we can see what is going on, and or look it up in the library source code (as someone already did for you).
And then of course your timer 4 code needs to properly init the timer and cause the interrupt to fire, which I didn't check.
There are examples, you need to just keep looking.
The minimum is
vector in the table
set the bit in the interrupt set enable register
enable the interrupt to leave the peripheral
fire the interrupt
Not necessarily in that order.
I'm simulating a security control process, and i can't do that each passenger pickup their baggage. I have tried with Match, Combine, Pickup, but I still can't execute the commands correctly.
I've created the follow flowchart, and the problem is in the wReclaimPax, pickup and wReclaimBags blocks (you can see them in the picture).
https://ibb.co/v3V57Tm
I saw this link Anylogic - Combined multiple items back to original owner to understand something, but I still need help.
I've created 3 functions:
isMatch:
if(equipaje.pasajeroLink.equals(pasajero.equipajeLink)){
return true;
}
return false;
paxBags:
for(int i=0;i<wait.size();i++){
Pasajero p=(Pasajero)wait.get(i);
if(isMatch(p,bag))
return p;
}
return null;
bagsPax:
for(int i=0;i<wait.size();i++){
Equipaje e=(Equipaje)wait.get(i);
if(isMatch(pasajero,e))
return e;
}
return null;
Assumed context
You haven't really explained how your code is related to your process but I'm assuming the following:
Because this is luggage-retrieval, you want to ensure that a passenger
agent (Pasajero) only enters the Pickup block (representing taking bag from
carousel) when his bag (Equipaje agent by the look of it) has
arrived into the wReclaimBag Wait, and been released from it to
queue4 Queue.
For this you need triggers (to remove agents from Wait blocks) when
either a passenger (Pasajero) arrives in wReclaimPax Wait, or a bag (Equipaje) arrives
in the wReclaimBag Wait (because you don't know whether the passenger or their bag will get to their respective Wait blocks first).
So your paxBags function is called in on-entry action of the wReclaimBag Wait, and your bagsPax function in the on-entry action of the wReclaimPax Wait.
Possible problems with current approach
Without knowing more of your model it's hard to say but problems I can think of based on what you've supplied are:
Your functions return the Pasajero or Equipaje if there is one that matches. Your match check relies seemingly on bidirectional connections (links) between Pasajero and Equipaje. Obviously if they're not setup properly the model won't work and, if you're using bidirectional connections you shouldn't need to check both ends.
Your functions need calling so that, if they return non null, they then free the matching agent from the other Wait block, and free themselves. Are you doing that? Without checking, there may be issues with calling free for yourself as you enter a Wait block (since this kind of depends on AnyLogic internals as to whether you count as being 'in' the block at this stage and can be freed). If this seems to be the problem you could create a timeout 0 dynamic event instance to do the free so that you're not doing it within the scope of the on-enter action.
Your pickup block (since it's been setup so that the entering agent will always want to pickup the first agent (Equipaje) in queue4) just needs to be set as waiting for quantity 1 (though see below).
If you've done all this the most likely problem is that the underlying events ordering of AnyLogic is affecting things. When you free agents I'm fairly sure the freeing actually happens in a timeout 0 event scheduled under-the-covers. So it may be that the passenger arrives at the Pickup before their Equipment arrives in queue4 though, if you set the Pickup to be "Exact quantity (wait for)", with quantity of 1, it should handle that.
The animation of the process (numbers in/out/within each block and details when clicking on blocks) should also help you debug what is going wrong; e.g., are bags being left in the Wait when they should have been released, etc.
P.S. With this kind of thing you should always create a minimal example model to make testing the issue/solution easier (and for sharing in help forums such as this where the rest of the complexity of your model is irrelevant). Often you find the problem 'naturally' in the process of trying to construct such a model that reproduces your problem in a minimal way.
I'm creating a server side app in Swift 3. I've chosen libevent for implementing networking code because it's cross-platform and doesn't suffer from C10k problem. Libevent implements it's own event loop, but I want to keep CFRunLoop and GCD (DispatchQueue.main.after etc) functional as well, so I need to glue them somehow.
This is what I've came up with:
var terminated = false
DispatchQueue.main.after(when: DispatchTime.now() + 3) {
print("Dispatch works!")
terminated = true
}
while !terminated {
switch event_base_loop(eventBase, EVLOOP_NONBLOCK) { // libevent
case 1:
break // No events were processed
case 0:
print("DEBUG: Libevent processed one or more events")
default: // -1
print("Unhandled error in network backend")
exit(1)
}
RunLoop.current().run(mode: RunLoopMode.defaultRunLoopMode,
before: Date(timeIntervalSinceNow: 0.01))
}
This works, but introduces a latency of 0.01 sec. While RunLoop is sleeping, libevent won't be able to process events. Lowering this timeout increases CPU usage significantly when the app is idle.
I was also considering using only libevent, but third party libs in the project can use dispatch_async internally, so this can be problematic.
Running libevent's loop in a different thread makes synchronization more complex, is this the only way of solving this latency issue?
LINUX UPDATE. The above code does not work on Linux (2016-07-25-a Swift snapshot), RunLoop.current().run exists with an error. Below is a working Linux version reimplemented with a timer and dispatch_main. It suffers from the same latency issue:
let queue = dispatch_get_main_queue()
let timer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, queue)
let interval = 0.01
let block: () -> () = {
guard !terminated else {
print("Quitting")
exit(0)
}
switch server.loop() {
case 1: break // Just idling
case 0: break //print("Libevent: processed event(s)")
default: // -1
print("Unhandled error in network backend")
exit(1)
}
}
block()
let fireTime = dispatch_time(DISPATCH_TIME_NOW, Int64(interval * Double(NSEC_PER_SEC)))
dispatch_source_set_timer(timer, fireTime, UInt64(interval * Double(NSEC_PER_SEC)), UInt64(NSEC_PER_SEC) / 10)
dispatch_source_set_event_handler(timer, block)
dispatch_resume(timer)
dispatch_main()
A quick search of the Open Source Swift Foundation libraries on GitHub reveals that the support in CFRunLoop is (perhaps obviously) implemented differently on different platforms. This means, in essence, that RunLoop and libevent, with respect to cross-platform-ness, are just different ways to achieve the same thing. I can see the thinking behind the thought that libevent is probably better suited to server implementations, since CFRunLoop didn't grow up with that specific goal, but as far as being cross-platform goes, they're both barking up the same tree.
That said, the underlying synchronization primitives used by RunLoop and libevent are inherently private implementation details and, perhaps more importantly, different between platforms. From the source, it looks like RunLoop uses epoll on Linux, as does libevent, but on macOS/iOS/etc, RunLoop is going to use Mach ports as its fundamental primitive, but libevent looks like it's going to use kqueue. You might, with enough effort, be able to make a hybrid RunLoopSource that ties to a libevent source for a given platform, but this would likely be very fragile, and generally ill-advised, for a couple of reasons: Firstly, it would be based on private implementation details of RunLoop that are not part of the public API, and therefore subject to change at any time without notice. Second, assuming you didn't go through and do this for every platform supported by both Swift and libevent, you would have broken the cross-platform-ness of it, which was one of your stated reasons for going with libevent in the first place.
One additional option you might not have considered would be to use GCD by itself, without RunLoops. Look at the docs for dispatch_main. In a server application, there's (typically) nothing special about a "main thread," so dispatching to the "main queue", should be good enough (if needed at all). You can use dispatch "sources" to manage your connections, etc. I can't personally speak to how dispatch sources scale up to the C10K/C100K/etc. level, but they've seemed pretty lightweight and low-overhead in my experience. I also suspect that using GCD like this would likely be the most idiomatic way to write a server application in Swift. I've written up a small example of a GCD-based TCP echo server as part of another answer here.
If you were bound and determined to use both RunLoop and libevent in the same application, it would, as you guessed, be best to give libevent it's own separate thread, but I don't think it's as complex as you might think. You should be able to dispatch_async from libevent callbacks freely, and similarly marshal replies from GCD managed threads to libevent using libevent's multi-threading mechanisms fairly easily (i.e. either by running with locking on, or by marshaling your calls into libevent as events themselves.) Similarly, third party libraries using GCD should not be an issue even if you chose to use libevent's loop structure. GCD manages its own thread pools and would have no way of stepping on libevent's main loop, etc.
You might also consider architecting your application such that it didn't matter what concurrency and connection handling library you used. Then you could swap out libevent, GCD, CFStreams, etc. (or mix and match) depending on what worked best for a given situation or deployment. Choosing a concurrency approach is important, but ideally you wouldn't couple yourself to it so tightly that you couldn't switch if circumstances called for it.
When you have such an architecture, I'm generally a fan of the approach of using the highest level abstraction that gets the job done, and only driving down to lower level abstractions when specific circumstances require it. In this case, that would probably mean using CFStreams and RunLoops to start, and switching out to "bare" GCD or libevent later, if you hit a wall and also determined (through empirical measurement) that it was the transport layer and not the application layer that was the limiting factor. Very few non-trivial applications actually get to the C10K problem in the transport layer; things tend to have to scale "out" at the application layer first, at least for apps more complicated than basic message passing.
I understand that some indeterminism stems from parfor's parallel nature but I don't understand why it should be entirely random. Is there any way to force parfor to respect (at least loosely) the order of the loop? More specifically I would like that in the case of:
parfor i=1:100
do_independent_stuff()
end
each worker of the pool when asking for a new task (i.e. in this case a new iteration of the loop) to be affected the lowest i that hasn't been computed or affected to a worker yet.
I think its by design that running something in parallel assumes that order is not important, at least in Matlab. Each thread/worker should be independent of each other. However, as indicated in this question, you could try job and task control interface to give you some level of control.
Firstly, in practice, PARFOR isn't "entirely random" - you can easily observe that it sends out chunks of loop iterates in reverse order. In R2013b and later, if you need more control over ordering (if, for example, you know that certain of your independent things are likely to take a long time, and therefore wish to start computing them first), you can use PARFEVAL.
If you need to loosely synchronize things, for instance wait until some thread as finished or has reach some point before to start another one, best should be to use semaphores, locks, mutex, etc...
I don't know if 'Parallel toolbox' includes such synchronization objects, but here is some workaround to create semaphore for instance:
https://stackoverflow.com/a/22874669/684399
You can also use objects in 'System.Threading' namespace (requires .NET):
Init:
someResultAvailable = System.Threading.ManualResetEvent(false);
In some job:
... do work ...
someResultAvailable .Set();
... continue ...
In another one:
... do work ...
if (!someResultAvailable.WaitOne(10000))
{
error('Timeout waiting for result from other thread');
}
... continue now knowing that results are available ...
When implementing algorithms and other things while trying to maintain reusability and separation patterns, I regularly get stuck in situations like this:
I communicate back and forth with an delegate while traversing a big graph of objects. My concern is how much all this messaging hurts, or how much I must care about Objective-C messaging overhead.
The alternative is to not separate anything and always put the individual code right into the algorithm like for example this graph traverser. But this would be nasty to maintain later and is not reusable.
So: Just to get an idea of how bad it really is: How many Objective-C messages can be sent in one second, on an iPhone 4?
Sure I could write a test but I don't want to get it biased by making every message increment a variable.
There's not really a constant number to be had. What if the phone is checking email in the background, or you have a background thread doing IO work?
The approach to take with things like this is, just do the simple thing first. Call delegates as you would, and see if performance is OK.
If it's not, then figure out how to improve things. If messaging is the overhead you could replace it with a plan C function call.
Taking the question implicitly to be "at what point do you sacrifice good design patterns for speed?", I'll add that you can eliminate many of the Objective-C costs while keeping most of the benefits of good design.
Objective-C's dynamic dispatch consults a table of some sort to map Objective-C selectors to the C-level implementations of those methods. It then performs the C function call, or drops back onto one of the backup mechanisms (eg, forwarding targets) and/or ends up throwing an exception if no such call exists. In situations where you've effectively got:
int c = 1000000;
while(c--)
{
[delegate something]; // one dynamic dispatch per loop iteration
}
(which is ridiculously artificial, but you get the point), you can instead perform:
int c = 1000000;
IMP methodToCall = [delegate methodForSelector:#selector(something)];
while(c--)
{
methodToCall(delegate, #selector(something));
// one C function call per loop iteration, and
// delegate probably doesn't know the difference
}
What you've done there is taken the dynamic part of the dispatch — the C function lookup — outside the inner loop. So you've lost many dynamic benefits. 'delegate' can't method swizzle during the loop, with the side effect that you've potentially broken key-value observing, and all of the backup mechanisms won't work. But what you've managed to do is pull the dynamic stuff out of the loop.
Since it's ugly and defeats many of the Objective-C mechanisms, I'd consider this bad practice in the general case. The main place I'd recommend it is when you have a tightly constrained class or set of classes hidden somewhere behind the facade pattern (so, you know in advance exactly who will communicate with whom and under what circumstances) and you're able to prove definitively that dynamic dispatch is costing you significantly.
For full details of the inner workings at the C level, see the Objective-C Runtime Reference. You can then cross-check that against the NSObject class reference to see where convenience methods are provided for getting some bits of information (such as the IMP I use in the example).