Bounded Buffers (Producer Consumer) - operating-system

In the shared-buffer (bounded buffer) problem, why can we have at most (n-1) items in the buffer at the same time, where 'n' is the buffer's size?
Thanks!

In an OS development class in college, I had an adjunct teacher who claimed it was impossible to have a software-only solution that could use all N elements in the buffer.
I proved him wrong with something I decided to call the race track solution (inspired by the fact that I like to run track).
On a race track, you are not limited to a 400 meter race; a race can consist of more than one lap. What happens if two runners are neck and neck in a race? How do you know whether they are tied, or whether one runner has lapped the other? The answer is simple: in a race, we don't monitor a runner's position on the track; we monitor the distance each runner has traversed. Thus, when two runners are neck and neck, we can distinguish a tie from one runner having lapped the other.
So, our algorithm has an N-element array and manages a 2N race. We don't reset the producer's or the consumer's counter to zero until it finishes its own 2N race.
We don't allow the producer to be more than one lap ahead of the consumer, and we don't allow the consumer to be ahead of the producer.
Actually, we only have to monitor the distance between the producer and consumer.
The code is as follows:
Item track[LAP];
int consIdx = 0;
int prodIdx = 0;
void consumer()
{
    while (true)
    {
        int diff = abs(prodIdx - consIdx);
        if (0 < diff) // If the consumer isn't tied
        {
            track[consIdx % LAP] = null;
            consIdx = (consIdx + 1) % (2 * LAP);
        }
    }
}
void producer()
{
    while (true)
    {
        int diff = (prodIdx - consIdx);
        if (diff < LAP) // If prod hasn't lapped cons
        {
            track[prodIdx % LAP] = Item();       // Advance on the 1-lap track.
            prodIdx = (prodIdx + 1) % (2 * LAP); // Advance in the 2-lap race.
        }
    }
}
It's been a while since I originally solved the problem, so this is according to my best recollection. Hopefully I didn't overlook any bugs.
Hope this helps!

Oops, here's a bug fix:
Item track[LAP];
int consIdx = 0;
int prodIdx = 0;
void consumer()
{
    while (true)
    {
        int diff = prodIdx - consIdx;              // When prodIdx wraps to 0 before consIdx,
        diff = 0 <= diff ? diff : diff + (2 * LAP); // think in 3 laps until consIdx wraps to 0.
        if (0 < diff) // If the consumer isn't tied
        {
            track[consIdx % LAP] = null;
            consIdx = (consIdx + 1) % (2 * LAP);
        }
    }
}
void producer()
{
    while (true)
    {
        int diff = prodIdx - consIdx;
        diff = 0 <= diff ? diff : diff + (2 * LAP);
        if (diff < LAP) // If prod hasn't lapped cons
        {
            track[prodIdx % LAP] = Item();       // Advance on the 1-lap track.
            prodIdx = (prodIdx + 1) % (2 * LAP); // Advance in the 2-lap race.
        }
    }
}
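As a rough check of the race-track idea, here is a minimal single-threaded Python sketch. The class name RaceTrackBuffer and the methods try_put / try_get are made up for this illustration and are not part of the code above; the sketch only exercises the index arithmetic and says nothing about memory ordering between real threads.

```python
class RaceTrackBuffer:
    """Ring buffer that can hold all `lap` items by counting positions modulo 2*lap."""

    def __init__(self, lap):
        self.lap = lap
        self.track = [None] * lap
        self.cons_idx = 0  # consumer's distance, modulo 2*lap
        self.prod_idx = 0  # producer's distance, modulo 2*lap

    def _lead(self):
        # Number of stored items. Python's % always returns a value in
        # [0, 2*lap), so no separate "add 2*LAP if negative" step is needed.
        return (self.prod_idx - self.cons_idx) % (2 * self.lap)

    def try_put(self, item):
        if self._lead() == self.lap:  # producer is a full lap ahead: buffer full
            return False
        self.track[self.prod_idx % self.lap] = item
        self.prod_idx = (self.prod_idx + 1) % (2 * self.lap)
        return True

    def try_get(self):
        if self._lead() == 0:  # tied: buffer empty
            return None
        slot = self.cons_idx % self.lap
        item = self.track[slot]
        self.track[slot] = None
        self.cons_idx = (self.cons_idx + 1) % (2 * self.lap)
        return item


buf = RaceTrackBuffer(4)
accepted = [buf.try_put(i) for i in range(5)]  # the fifth put is refused
drained = [buf.try_get() for _ in range(5)]    # the fifth get finds it empty
```

With a 4-slot track, four puts in a row succeed, so all N slots are in use at once; the full/empty ambiguity is resolved by the extra lap of counting rather than by leaving a slot free.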

Well, theoretically a bounded buffer can hold as many elements as its size. What you describe is probably an implementation quirk: leaving one slot empty is a clean way of telling an empty buffer from a full one when only the two indices are compared. This question -> Empty element in array-based bounded buffer deals with a similar thing. See if it helps.
However, you can of course have implementations that fill all n slots. That's how the bounded buffer problem is defined anyway.

Related

Decoding delimited frames from byte arrays

I have frames that are delimited by bytes to start and stop the frame (they do not appear in the stream).
I read a chunk from disk or a network socket, and I then need to pass it to a deserializer, but only after I have de-framed the packet.
Frames may span multiple chunks that have been read; note how frame 3 is split across array 1 and array 2.
Rather than reinvent the wheel for this common problem, do any GitHub or similar projects exist?
I am investigating ReadOnlySequenceSegment<T> from https://www.codemag.com/article/1807051/Introducing-.NET-Core-2.1-Flagship-Types-Span-T-and-Memory-T and will post updates as I work out the requirements.
Update
Further to Stephen Cleary's link (thank you!!) to https://github.com/davidfowl/TcpEcho/blob/master/src/Server/Program.cs I have the below.
My data is JSON, so unlike in the original question the delimiter tokens will appear in the stream. Therefore I have to count the array delimiters and only declare a frame when I have found the outermost [ and ] characters.
The code below works and does fewer manual copies (I'm not sure whether copies are still made behind the scenes); it is also much neater using David Fowler's approach.
However, I am converting to an array instead of using buffer.PositionOf((byte)'[') since I was unable to see how I could call PositionOf with an offset applied (i.e. scan deeper into the frame, past previously found delimiter tokens).
Am I using/butchering the library in a brute-force way, or is the below good to go with the array conversion?
using System;
using System.Buffers;
using System.IO;
using System.IO.Pipelines;
using System.Text;
using System.Threading.Tasks;
class Program
{
    static async Task Main(string[] args)
    {
        using var stream = File.Open(args[0], FileMode.Open);
        var reader = PipeReader.Create(stream);
        while (true)
        {
            ReadResult result = await reader.ReadAsync();
            ReadOnlySequence<byte> buffer = result.Buffer;
            while (TryDeframe(ref buffer, out ReadOnlySequence<byte> line))
            {
                // Process the line.
                var str = Encoding.UTF8.GetString(line.ToArray());
                Console.WriteLine(str);
            }
            // Tell the PipeReader how much of the buffer has been consumed
            // (up to buffer.Start) and examined (up to buffer.End).
            reader.AdvanceTo(buffer.Start, buffer.End);
            // Stop reading if there's no more data coming.
            if (result.IsCompleted)
            {
                break;
            }
        }
        // Mark the PipeReader as complete.
        await reader.CompleteAsync();
    }
    private static bool TryDeframe(ref ReadOnlySequence<byte> buffer, out ReadOnlySequence<byte> frame)
    {
        int frameCount = 0; // current nesting depth of [ ... ]
        int start = -1;     // index of the outermost '['
        int end = -1;       // index of the matching outermost ']'
        var bytes = buffer.ToArray();
        for (var i = 0; i < bytes.Length; i++)
        {
            var b = bytes[i];
            if (b == (byte)'[')
            {
                if (start == -1)
                    start = i;
                frameCount++;
            }
            else if (b == (byte)']' && start != -1) // ignore a stray ']' before any '['
            {
                frameCount--;
                if (frameCount == 0)
                {
                    end = i;
                    break;
                }
            }
        }
        if (start == -1 || end == -1) // no complete frame found
        {
            frame = default;
            return false;
        }
        // Slice takes (start, length), not (start, end).
        frame = buffer.Slice(start, end - start + 1);
        // Drop everything up to and including the frame's closing ']'.
        buffer = buffer.Slice(end + 1);
        return true;
    }
}
do any github or similar projects exist?
David Fowler has an echo server that uses Pipelines to implement delimited frames.
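Independently of Pipelines, the bracket-counting idea from the question can be sketched as a small stateful deframer that keeps its nesting depth between chunks, so a frame split across reads is reassembled without rescanning. This is a Python sketch under stated assumptions: the class name BracketDeframer and its feed method are hypothetical, and brackets that appear inside JSON string literals are not handled.

```python
class BracketDeframer:
    """Accumulates chunks and returns complete outermost [...] frames."""

    def __init__(self):
        self.pending = bytearray()  # bytes of the frame currently being assembled
        self.depth = 0              # current bracket nesting depth

    def feed(self, chunk):
        frames = []
        for b in chunk:
            if b == ord("["):
                self.depth += 1
            if self.depth > 0:
                # Inside a frame (including its outer brackets): keep the byte.
                self.pending.append(b)
            if b == ord("]") and self.depth > 0:
                self.depth -= 1
                if self.depth == 0:
                    # Outermost bracket closed: emit a copy and start afresh.
                    frames.append(bytes(self.pending))
                    self.pending.clear()
        return frames


deframer = BracketDeframer()
first = deframer.feed(b"[1,[2]]x[3")  # one complete frame, one partial
second = deframer.feed(b",4]")        # completes the partial frame
```

Bytes outside any frame (the x above) and a stray ] before any [ are skipped, which mirrors what the question's TryDeframe is trying to do.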

send message to set of channels in non-deterministic order

I'm building a Promela model in which one process sends a request to N other processes, waits for the replies, and then computes a value. Basically a typical map-reduce style execution flow. Currently my model sends requests in a fixed order. I'd like to generalize this to send them in a non-deterministic order. I've looked at the select statement, but that appears to select a single element non-deterministically.
Is there a good pattern for achieving this? Here's the basic structure of what I'm working with:
#define NUM_OBJECTS 2
chan obj_req[NUM_OBJECTS] = [0] of { mtype, chan };
This is the object process that responds to msgtype messages with some value that it computes.
proctype Object(chan request) {
    chan reply;
end:
    do
    :: request ? msgtype(reply) ->
        int value = 23
        reply ! value
    od;
}
This is the client. It sends a request to each of the objects in order 0, 1, 2, ..., and collects all the responses and reduces the values.
proctype Client() {
    chan obj_reply = [0] of { int };
    int value
    // WOULD LIKE NON-DETERMINISM HERE
    for (i in obj_req) {
        obj_req[i] ! msgtype(obj_reply)
        obj_reply ? value
        // do something with value
    }
}
And I start up the system like this
init {
    atomic {
        run Object(obj_req[0]);
        run Object(obj_req[1]);
        run Client();
    }
}
From your question I gather that you want to assign a task to a given process in a randomised order, as opposed to simply assigning a random task to an ordered sequence of processes.
All in all, the solution for both approaches is very similar. I don't know whether the one I am going to propose is the most elegant approach, though.
#define NUM_OBJECTS 10
mtype = { ASSIGN_TASK };
chan obj_req[NUM_OBJECTS] = [0] of { mtype, chan, int };
init
{
    byte i;
    for (i in obj_req) {
        run Object(i, obj_req[i]);
    }
    run Client();
};
proctype Client ()
{
    byte i, id;
    int value;
    byte map[NUM_OBJECTS];
    int data[NUM_OBJECTS];
    chan obj_reply = [NUM_OBJECTS] of { byte, int };
    d_step {
        for (i in obj_req) {
            map[i] = i;
        }
    }
    // scramble task assignment map
    for (i in obj_req) {
        byte j;
        select(j : 0 .. (NUM_OBJECTS - 1));
        byte tmp = map[i];
        map[i] = map[j];
        map[j] = tmp;
    }
    // assign tasks
    for (i in obj_req) {
        obj_req[map[i]] ! ASSIGN_TASK(obj_reply, data[i]);
    }
    // out-of-order wait of data
    for (i in obj_req) {
        obj_reply ? id(value);
        printf("Object[%d]: end!\n", id, value);
    }
    printf("client ends\n");
};
proctype Object(byte id; chan request)
{
    chan reply;
    int in_data;
end:
    do
    :: request ? ASSIGN_TASK(reply, in_data) ->
        printf("Object[%d]: start!\n", id)
        reply ! id(id)
    od;
};
The idea is to have an array which acts like a map from the set of indexes to the starting position (or, equivalently, to the assigned task).
The map is then scrambled through a finite number of swap operations. After that, each object is assigned its own task in parallel, so they can all start more-or-less at the same time.
In the following output example, you can see that:
Objects are being assigned a task in a random order
Objects can complete the task in a different random order
~$ spin test.pml
Object[1]: start!
Object[9]: start!
Object[0]: start!
Object[6]: start!
Object[2]: start!
Object[8]: start!
Object[4]: start!
Object[5]: start!
Object[3]: start!
Object[7]: start!
Object[1]: end!
Object[9]: end!
Object[0]: end!
Object[6]: end!
Object[2]: end!
Object[4]: end!
Object[8]: end!
Object[5]: end!
Object[3]: end!
Object[7]: end!
client ends
timeout
#processes: 11
...
If one wants to assign a random task to each object rather than starting them randomly, then it suffices to change:
obj_req[map[i]] ! ASSIGN_TASK(obj_reply, data[i]);
into:
obj_req[i] ! ASSIGN_TASK(obj_reply, data[map[i]]);
Obviously, data should be initialised to some meaningful content first.
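One thing worth checking about the swap-based scramble is that every send order is actually reachable through some sequence of select choices, since a verifier explores all of them. The following Python sketch brute-forces this for three objects; the function name scramble and the size n = 3 are chosen only for this illustration.

```python
from itertools import permutations, product


def scramble(n, choices):
    """Apply the swap loop from the model: swap slot i with slot choices[i]."""
    mapping = list(range(n))
    for i, j in enumerate(choices):
        mapping[i], mapping[j] = mapping[j], mapping[i]
    return tuple(mapping)


n = 3
# Every possible outcome of the n nondeterministic select statements.
reachable = {scramble(n, choices) for choices in product(range(n), repeat=n)}
all_orders = set(permutations(range(n)))
```

For n = 3 the two sets coincide, so no ordering is missed. Under random simulation the orders are not equally likely with this kind of shuffle, but that does not matter for exhaustive verification.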

Atomically setting a variable without comparing first

I've been reading up on and experimenting with atomic memory access for synchronization, mainly for educational purposes. Specifically, I'm looking at Mac OS X's OSAtomic* family of functions. Here's what I don't understand: Why is there no way to atomically set a variable instead of modifying it (adding, incrementing, etc.)? OSAtomicCompareAndSwap* is as close as it gets -- but only the swap is atomic, not the whole function itself. This leads to code such as the following not working:
#include <libkern/OSAtomic.h>
#include <pthread.h>
#include <stdio.h>
const int N = 100000;
void* threadFunc(void *data) {
    int *num = (int *)data;
    // Wait for main thread to start us so all spawned threads start
    // at the same time.
    while (0 == *(volatile int *)num) { }
    for (int i = 0; i < N; ++i) {
        // The return value (whether the swap happened) is ignored here.
        OSAtomicCompareAndSwapInt(*num, *num+1, num);
    }
    return NULL;
}
// called from main thread
void test() {
    int num = 0;
    pthread_t threads[5];
    for (int i = 0; i < 5; ++i) {
        pthread_create(&threads[i], NULL, threadFunc, &num);
    }
    num = 1;
    for (int i = 0; i < 5; ++i) {
        pthread_join(threads[i], NULL);
    }
    printf("final value: %d\n", num);
}
When run, this example would ideally produce 500,001 as the final value. However, it doesn't; even when the comparison in OSAtomicCompareAndSwapInt in thread X succeeds, another thread Y can come in and set the variable first, before X has a chance to change it.
I am aware that in this trivial example I could (and should!) simply use OSAtomicAdd32, in which case the code works. But, what if, for example, I wanted to set a pointer atomically so it points to a new object that another thread can then work with?
I've looked at other APIs, and they seem to be missing this feature as well, which leads me to believe that there is a good reason for it and my confusion is just based on lack of knowledge. If somebody could enlighten me, I'd appreciate it.
I think that you have to check the OSAtomicCompareAndSwapInt result to guarantee that the int was actually set.
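Checking the result usually means retrying: read the current value, attempt the swap, and loop if another thread got there first. Below is a minimal Python sketch of that retry loop. Python has no hardware compare-and-swap, so the AtomicInt class here is a hypothetical stand-in whose internal lock exists only to make compare_and_swap atomic; it is not the OSAtomic API.

```python
import threading


class AtomicInt:
    """Stand-in for a CAS cell; the lock only makes compare_and_swap atomic."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        return self._value

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self._value != expected:
                return False  # another thread changed the value first
            self._value = new
            return True


def increment(cell):
    # Retry until our own compare-and-swap is the one that succeeds.
    while True:
        old = cell.load()
        if cell.compare_and_swap(old, old + 1):
            return


def worker(cell, times):
    for _ in range(times):
        increment(cell)


counter = AtomicInt(0)
threads = [threading.Thread(target=worker, args=(counter, 1000)) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because a failed swap is retried rather than dropped, each of the 5 x 1000 increments lands exactly once.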

Why do threads give different numbers in my program using ThreadPool?

Why does number have a different value each time?
Thanks
using System;
using System.Threading;
class Program
{
    static DateTime dt1;
    static DateTime dt2;
    static Int64 number = 0;
    public static void Main()
    {
        dt1 = DateTime.Now;
        for (int i = 0; i < 10; i++)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(WorkThread), DateTime.Now);
        }
        dt2 = DateTime.Now;
        Console.WriteLine("***");
        Console.ReadLine();
    }
    public static void WorkThread(object queuedAt)
    {
        number = 0;
        for (Int64 i = 0; i < 2000000; i++)
        {
            number += i;
        }
        Console.WriteLine("number is:{0} and time:{1}",number,DateTime.Now - dt1);
    }
}
number is being shared between all of your threads, and you're not doing anything to synchronize access to it from each thread. So one thread might not have even started its i loop (it may or may not have reset number to 0 at this point), while another can be halfway through, and another might have finished its loop completely and be at the Console.WriteLine part.
Here you have 10 threads acting on the static variable number at indeterminate times. One thread could be on its 10,000th iteration while another could just be beginning execution. And your routine begins by resetting number to 0. This logic would produce interesting results but nothing predictable.
If multiple threads access the same variable all at once, there is a risk of race conditions. A race condition is basically when the operations of two threads are interleaved such that they interfere with each other. To add a value to "number", the old value must be read, the sum computed, and the new value written. If those steps are being done by many threads at the same time, a write can overwrite work done by other threads, and the final result can change. You must use a lock (also called a critical section, mutex, or monitor) to protect the variable so this can't happen.
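As a rough illustration, here is a Python sketch in which each worker sums into a local variable and takes a lock only to publish its result. Note that this swaps in local accumulation alongside the lock the answers recommend; the names work, results and lock are made up for the example.

```python
import threading


def work(n, results, lock):
    total = 0  # local to this call, so no other thread can reset or overwrite it
    for i in range(n):
        total += i
    with lock:  # the lock protects only the shared list
        results.append(total)


n = 2000
results = []
lock = threading.Lock()
threads = [threading.Thread(target=work, args=(n, results, lock)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every worker now reports the same sum, n * (n - 1) / 2, regardless of how the threads are scheduled. Locking around each individual addition to a shared number would also remove the race on the addition itself, but the threads would still reset and clobber each other's totals.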

Mutual Exclusion Problem

Please take a look at the following pseudo-code:
boolean blocked[2];
int turn;
void P(int id) {
    while(true) {
        blocked[id] = true;
        while(turn != id) {
            while(blocked[1-id])
                /* do nothing */;
            turn = id;
        }
        /* critical section */
        blocked[id] = false;
        /* remainder */
    }
}
void main() {
    blocked[0] = false;
    blocked[1] = false;
    turn = 0;
    parbegin(P(0), P(1)); // run P0 and P1 in parallel
}
I thought that I could implement a simple mutual exclusion solution using the code above. But it's not working. Has anyone got an idea why?
Any help would really be appreciated!
Mutual exclusion is not guaranteed in this example, for the following reason.
We begin with the following situation:
blocked = {false, false};
turn = 0;
P1 now executes: it sets blocked[1] = true, finds turn != 1, passes the inner loop because blocked[0] is false, and is preempted just before:
turn = id; // Not yet executed.
The situation is now:
blocked = {false, true};
turn = 0;
Now P0 executes. It sets blocked[0] = true and, since turn is already 0, skips the while loop entirely, ready to execute the critical section. When P1 resumes, it sets turn to 1, leaves its loop, and is also ready to execute the critical section.
Btw, this method was originally invented by Hyman. He sent it to Communications of the ACM in 1966.
Mutual exclusion is not guaranteed in this example, for the following reason.
We begin with the following situation:
turn= 1;
blocked = {false, false};
The execution runs as follows:
P0: while (true) {
P0: blocked[0] = true;
P0: while (turn != 0) {
P0: while (blocked[1]) {
P0: }
P1: while (true) {
P1: blocked[1] = true;
P1: while (turn != 1) {
P1: }
P1: criticalSection(P1);
P0: turn = 0;
P0: while (turn != 0)
P0: }
P0: criticalSection(P0);
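The interleaving above can be replayed deterministically. The Python sketch below models one pass of P(id) as a generator that yields after every shared-memory step, then runs the schedule from the trace starting with turn = 1. The names process, state and the step labels are invented for this illustration; it is a replay of one schedule, not a model checker.

```python
def process(pid, state):
    """One pass of P(id), yielding after every read or write of shared state."""
    other = 1 - pid
    state["blocked"][pid] = True
    yield "set blocked"
    while True:
        turn_is_mine = state["turn"] == pid
        yield "test turn"
        if turn_is_mine:
            break
        while True:
            other_blocked = state["blocked"][other]
            yield "test blocked"
            if not other_blocked:
                break
        # The other process may have run since blocked[other] was read.
        state["turn"] = pid
        yield "set turn"
    state["in_cs"].add(pid)
    yield "enter critical section"


state = {"blocked": [False, False], "turn": 1, "in_cs": set()}
procs = [process(0, state), process(1, state)]
# P0 runs up to (but not including) "turn = 0", then P1 enters, then P0 resumes.
schedule = [0, 0, 0, 1, 1, 1, 0, 0, 0]
trace = [(pid, next(procs[pid])) for pid in schedule]
```

After the schedule runs, both process ids are in state["in_cs"] while neither has reset its blocked flag, which is exactly the violation described: P0 acts on a stale reading of blocked[1].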
Is this homework, or some embedded platform? Is there any reason why you can't use pthreads or Win32 (as relevant) synchronisation primitives?
Maybe you need to declare blocked and turn as volatile, but without specifying the programming language there is no way to know.
Concurrency can not be implemented like this, especially in a multi-processor (or multi-core) environment: different cores/processors have different caches. Those caches may not be coherent. The pseudo-code below could execute in the order shown, with the results shown:
get blocked[0] -> false // cpu 0
set blocked[0] = true // cpu 1 (stored in CPU 1's L1 cache)
get blocked[0] -> false // cpu 0 (retrieved from CPU 0's L1 cache)
get blocked[0] -> false // cpu 2 (retrieved from main memory)
You need hardware knowledge to implement concurrency.
The compiler might have optimized out the "empty" while loop. Declaring the variables as volatile might help, but is not guaranteed to be sufficient on multiprocessor systems.