I am using the Veins framework with OMNeT++ to simulate a highway scenario.
I am using cOutVector to collect the results from my experiments.
I have more than 1000 nodes (vehicles), and the cOutVector collects results individually for each module (node). However, I need to collect the overall results.
For example, how many beacons were received by all nodes? Is there any way of collecting such results?
In OMNeT++ the output results can be saved in two different types, and thus file formats:
Scalars (*.sca) - contain summary data (mean, sum, count, max, min) for the whole simulation run
Vectors (*.vec) - contain fine-grained data (in the form of time series) recorded over the course of the simulation run
The output file formats are tightly coupled with the statistics mechanisms of OMNeT++. Statistics allow you to use different result recording modes, such as count, sum, mean, or vector.
In your case you would need to look at the sum for each of your nodes.
@statistic[foo](record=count,mean,vector);
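On the C++ side of the module, values are emitted on a signal that the @statistic property then records. Here is a minimal sketch assuming a hypothetical module and the signal name "foo" (these names are illustrative, not taken from your code):

#include <omnetpp.h>
using namespace omnetpp;

class BeaconApp : public cSimpleModule {
  protected:
    simsignal_t fooSignal;

    virtual void initialize() override {
        // the name must match the @statistic[foo] declaration in the NED file
        fooSignal = registerSignal("foo");
    }

    virtual void handleMessage(cMessage *msg) override {
        emit(fooSignal, 1L); // record one received beacon
        delete msg;
    }
};

Define_Module(BeaconApp);

The statistics recorders then aggregate the emitted values per module into the .sca and .vec files.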
These OMNeT++ mechanisms seem complicated in the beginning, but they are rather easy once you wrap your head around them. They are also very powerful, giving insight into many aspects of your simulations.
To understand the difference between scalars and vectors read this.
To understand results recording using signals read this.
For a concrete example of how to use the signals and record a specific statistic & metric, check my detailed answer here.
Unfortunately, it is impossible to provide a "ready-to-use" solution for your case without knowing your code.
For example, how many beacons were received by all nodes? Is there any way of collecting such results?
You can create a static variable and, every time a node receives a beacon, increment this variable.
For example:
(in app_name.h)
static int beaconCount; // in the protected section
int app_name::beaconCount = 0; // at the end of app_name.h, before #endif
(in app_name.cc)
void app_name::onBeacon(WaveShortMessage* wsm) {
    app_name::beaconCount++; // received one beacon
}
After this you can print beaconCount in the finish() function or save it to a file.
void app_name::finish() {
    if (strcmp(findHost()->getFullName(), "car[0]") == 0) { // print the final value from car[0] only
        cout << "Count of beacons received by all nodes: " << beaconCount << endl;
    }
}
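If you prefer the total to end up in the result files rather than on stdout, a hedged variant of the same finish() (an untested sketch using OMNeT++'s recordScalar(); the scalar name is made up) would be:

void app_name::finish() {
    // record the global total once, from a single node, into the scalar (.sca) file
    if (strcmp(findHost()->getFullName(), "car[0]") == 0) {
        recordScalar("totalBeaconsReceived", beaconCount);
    }
}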
Q: Do you mean you want to collect aggregate stats for all nodes?
If so, then I suggest you use R, which provides more functionality and customization, though you'll need some time to learn the basic operations.
There is a tutorial on the omnetpp-resultfile GitHub page.
I noticed a HashSet<int> performing very slowly when working on a Flutter project. I had about 20,000 integers in a Set, and checking set.contains() took a very long time. But when I used toString() to convert all items to strings, it performed 1000x faster.
I then tried to create a minimal reproducible code with 10 million random integers, but I couldn't reproduce the issue. Turns out, something special about these data caused the extreme slowness. I've attached a test code (and data) at the end of this question.
How to reproduce:
First, click the "add int" button to add 14790 integers to a set. Then click "query int" (runs set.contains(123)) and "query string" (runs set.contains('123')). Observe that: 1. both operations are super slow; 2. the int version is slower than the string version.
Then click "clear items", then "add string" to add the toString() version of the data. Then click "query int" and "query string" again, notice how much faster it becomes. Picture:
Lastly, click both "add int" and "add string" to create a mixed set (with twice the entries). Observe that the query times drop by half for the int version, as if the faster strings helped "dilute" the problem.
I've had several friends run the same test code on various machines (Intel i5, Apple M1, Snapdragon); the timings differ, but the conclusions are the same.
What's not the answer here:
Here are some things I considered, but further tests showed they couldn't explain what's happening.
Maybe int needs boxing, whereas string is already an object?
That's probably not the issue here. With 1 million randomly generated values, ints performed faster than strings.
Strings are immutable, so their hash values could be cached?
I don't know if they are cached, but this doesn't explain the results observed with 1 million randomly generated values.
int hash resulted in a lot of collisions?
I tried to print out .hashCode for all ints and strings in the data set, and verified they are all unique.
Test code:
The full test code with data is too long for StackOverflow, I've put it here https://pastebin.com/raw/4fm2hKQB instead.
So yeah, I'm lost, if anyone could help me understand what's going on that'll be greatly appreciated!
I commented on the issue in the Dart repo. For completeness I will mention the 'answer' part of the comment here.
The implementations of HashSet and LinkedHashSet make the assumption that the key.hashCode values are 'good' hash codes that are reasonably distributed over a range of integers so that the lower N bits do not collide or nearly collide to 'bunch up' in the hash-table. Unfortunately int.hashCode does not have this property as it is effectively the identity function.
Things go wrong when the lower bits of all the keys are the same (or take only a few of the possible values), so taking the lower N bits gives the same effective hash code value. This is just the power-of-two version of the % 1000 example mentioned by @ch271828n.
@ch271828n mentions using a different hashCode. This is probably the best short-term solution. Use LinkedHashSet(hashCode: dispersedHashCode) with something like this:
int dispersedHashCode(e) { // untested!
  int hash = e.hashCode;
  // Multiply by an odd number with 30%-50% of the bits set in an irregular pattern.
  hash *= 0x1736B4D29;
  hash += hash >>> 20;
  // Maybe do it again to let bits higher than 20 influence the low bits.
  return hash;
}
Something like this would ideally be built into the core library's hashed structures. That might take a long time since, realistically, a performance issue with a simple workaround will likely be prioritized behind security bugs, incorrect-behaviour bugs, performance issues with no workaround, and new features that enable customers to do things that are otherwise impossible or difficult to do.
A completely different approach would be to use an ordered Set like SplayTreeSet.
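For comparison outside Dart, here is a small C++ sketch of the same idea: a tree-based ordered set never consults a hash function, so degenerate hash codes cannot hurt it (lookups become O(log n) instead of O(1) average, which is the tradeoff):

#include <cstdio>
#include <set>

int main() {
    std::set<int> ordered; // red-black tree; no hashing involved
    for (int i = 0; i < 20000; ++i)
        ordered.insert(i * 4096); // keys that would bunch up in a power-of-two hash table
    printf("contains 123: %d\n", (int)ordered.count(123)); // O(log n) lookup
    return 0;
}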
I am also considering the hash collision problem.
int hash resulted in a lot of collisions?
I tried to print out .hashCode for all ints and strings in the data set, and verified they are all unique.
Well, "all unique" does not mean "there is no collision". For a hash set, the number of bins are much less than the number of hashcode. For example, suppose you have a hash set with 1000 bins, and the mapping from hashCode to bin index is a simple bin index := hashCode % 1000, and suppose your data has hashCode like 0, 1000, 2000, 3000 etc. In this artificial case, your data has all unique hashCode, but they all fall into the first bin out of the 1000 bins. Huge collision!
A simple approach to debug whether the hashCode is the problem: re-run the program with LinkedHashSet(hashCode: (e) => some_other_hash_approach(e), equals: ...). With such a hash set you can test other hashCode-generating functions. If some of them do not show the same extremely slow speed, the original hashCode function is very likely causing collisions.
In addition, you can use the same hashCode method for both the int and the String case. That guarantees both cases have exactly the same collision behavior, so it is easy to see whether collision is the cause or is unrelated.
Another debug approach: look at the C++ source code of LinkedHashSet and see what algorithm it uses to assign data to bins. Then check whether the collisions mentioned above happen or not.
A third debug method: compile the pure-Dart program into an executable and run it under a profiler such as perf. Then you can see which code is hottest and consumes most of the time. You may need the debug symbols of Dart's native C++ code, which should be fetchable.
Suppose that I have a program that takes a pointer as its input. Without prior knowledge about the structure of the pointee, how does a fuzzer create valid inputs that can actually hit the internals of the program? To make this more concrete, imagine an artificial C program:
int myprogram(unknown_pointer* input) {
    printf("%s", input->name);
}
In some situations, the tested program first checks the input format. If the input format is not good, it raises an exception. In such situations, how can a fuzzer reach program points beyond that exception-raising statement?
Most fuzzers don't know anything about the internal structure of the program. Different fuzzers deal with this in various ways:
Not dealing with it at all: just throw random inputs and hope to produce an input that will pass some/all checks (for example, radamsa).
Mutating a valid input: take a known valid input and mutate it (flip bits, remove parts, add parts, etc.); in many cases the result will be valid enough to pass some or all of the checks. For example, to fuzz VLC you would take a valid movie file as the input for the fuzzer, which then feeds mutations of it to VLC. These are often called mutation-based fuzzers (for example, zzuf); see the sketch after this list.
If you have prior knowledge of the input's structure: build a model of the input, then mutate specific fields within it. A big advantage of this method is the ability to deal with very specific kinds of fields - checksums, hashes, sizes, etc. These are often called generation-based fuzzers (for example, SPIKE, Sulley and their successors, Peach).
However, in recent years a new kind of fuzzer has evolved - feedback-based fuzzers. These perform mutations on a valid (or invalid) input and, based on feedback received from the fuzzed program, decide how and what to mutate next. The feedback is obtained by instrumenting the program's execution: injecting tracing at compile time, patching tracing code into the program at runtime, or using hardware tracing mechanisms. First among them is AFL (you can read more about it here).
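To make the mutation idea from the list concrete, here is a toy C++ sketch (illustrative only; real mutation-based fuzzers such as zzuf or AFL are far more sophisticated) that flips random bits in a known-valid seed input:

#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>

// Flip a few random bits in a copy of a known-valid input.
std::vector<unsigned char> mutate(std::vector<unsigned char> input) {
    for (int i = 0; i < 4; ++i) {
        size_t pos = rand() % input.size();
        input[pos] ^= 1u << (rand() % 8); // flip one random bit
    }
    return input;
}

int main() {
    srand(time(nullptr));
    // A known-valid seed input, e.g. a file starting with a magic header.
    std::vector<unsigned char> seed = {'H', 'D', 'R', '1', 'd', 'a', 't', 'a'};
    for (int run = 0; run < 10; ++run) {
        std::vector<unsigned char> testcase = mutate(seed);
        // A real fuzzer would feed 'testcase' to the target and watch for crashes.
        fwrite(testcase.data(), 1, testcase.size(), stdout);
        putchar('\n');
    }
    return 0;
}

Mutants that happen to leave the magic header intact are the ones that get past shallow format checks, which is why seeding with valid inputs matters.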
A fuzzer throws every sort of random input combination at the attack surface. The intention is to look for any opportunity for a "golden BB" to get past the input checks and produce a response that can be explored further.
OK, I've been struggling with this for a while. What is the best way to accomplish the following (model screenshot omitted), where Reaction Wheel 1-4 are links to the same block in a library? When the Speed Counter, Speed Direction and Current signals are added to the final bus output as shown, MATLAB (rightfully) complains:
Warning: Signals 9, 10, 11, 12 entering Bus Creator
'myAwesomeModel' have duplicated names 'Current'. These are being made unique
by appending "(signal #)" to the signals within the resulting bus. Please
update the labels of the signals such that they are all unique.
Until now I've been using a "solution" like this (screenshot omitted): placing a size-1 mux / gain-of-1 / other dummy block in the middle, so the signals can be renamed to something unique. However, I'd really like to believe that The MathWorks has thought of a better way to do this...
What is the "proper" way to construct bus signals like this? It feels rather like I'm being pushed to adopt a particular design/architecture, but what that is precisely, eludes me for the moment...
It was quite a challenge for me, but it looks like I've more or less sorted it out. MATLAB R2007a here. I'll do the example with an already-built subsystem, with its inputs, outputs, ...
1- In Block Properties, add a tag to the block. This will be used to identify the block and its "siblings" in the system; MY_SUBSYSTEM for this example.
2- Block Properties again. Add the following snippet to the CopyFcn callback:
% Find the total number of copies of the block in the system
len = length(find_system(gcs,'Tag','MY_SUBSYSTEM'));
% Get the handle of the copied/added block and name the desired signal accordingly
v = get_param(gcb,'PortHandles');
set(v.Outport(_INDEX_OF_PORT_TO_BE_RENAMED_),'SignalNameFromLabel',['BASENAME_HERE' num2str(len)]);
3- In _INDEX_OF_PORT_TO_BE_RENAMED_ you should put the index (starting from 1) of the port signal you want renamed for each copy of the block. For a single-output block this should be 1. BASENAME_HERE should be the port's base name, in this case "Current" for you.
4- Add the block to the desired library, and delete the instance you used to create this example. From then on, as you add the block from the library or copy an existing one, the outport will be named Current1, Current2, Current3, and so on. Note that you could apply any naming convention or formatting.
Hope this helps. It worked for me, don't hesitate to ask/criticize!
Note: obviously, as the model grows, this method may become computationally demanding, since find_system has to loop through the entire model. However, it looks like a good workaround for small- to medium-sized systems.
Connect a Bus Selector to each Data Output. Select the signals you want and set "Output as bus". Then connect all Bus Selectors to a Bus Creator.
(screenshot of the resulting Simulink model omitted)
I've written some code that has an RTC component in it. It's a bit difficult to do proper emulation of the code because the clock speed is set to 50 MHz, so seeing any 'real time' events take place would take forever. I did try simulating 2 seconds in ModelSim, but it ended up crashing.
What would be a better way to test it if I don't have an evaluation board to burn it to and probe with a scope?
If you could provide a more specific example of exactly what you're trying to test and what is chewing up your simulation cycles, that would be helpful.
In general, if you have a lot of code to test in simulation, it helps to create testbenches for the sub-modules and test them first. If you simulate at the top (chip) level and try to stimulate sub-modules buried deep in the hierarchy of a design, it often takes many clock ticks just to get data into and out of a sub-module. If you simulate the sub-module directly, you have direct access to its I/O and can test the things you want to test in fewer cycles than from the top level.
If you are trying to test logic with very deep FIFOs that you need to fill, or a specific count of a large counter that you're trying to hit, you can either add logic to your code to help create those conditions in fewer cycles (like a load input on the counter), or force the values of internal signals of your design from the testbench itself.
These are just a couple of general ideas. Again, if you provide more detail about what it is you're simulating there are probably people on this forum that can provide help that is more specific to your problem.
As Ciano already mentioned, if you provided more information about your design we would be able to give a more accurate answer. However, there are several tips that hardware designers should follow, especially for complex system simulation. Some of the ones I use most are listed below:
Hierarchical simulation (as Ciano already posted): instead of simulating the entire system, try to simulate smaller sets of modules.
Selective configuration: most systems require initialization processes such as reset initialization time, external chip register initialization, etc. Usually a few of these are not required for simulation purposes, and you may use a global constant to skip those stages when simulating, like:
constant SIMULATION_ENABLE : STD_LOGIC := '1';
...;
-- in the reset condition:
if SIMULATION_ENABLE = '1' then
    currentState <= state_executeSystem; -- skip the initialization procedures
else
    currentState <= state_initializeSystem;
end if;
Be careful not to modify your code directly (hard-coded changes). As the system grows, it becomes impossible to remember which parts of it you modified for simulation. Use constants instead, as in the above example, to configure modules for a simulation profile.
Scaled time/size constants: instead of always using the real values for times and sizes (such as time events, memory sizes, register file sizes, etc.), use scaled values whenever possible. For example, if you are building an RTC that generates an interrupt to the main system every 60 seconds, scale your constants (if possible) so that interrupts are generated about every 6 ms or 60 us. Of course, the scale choice depends on your system. In my designs I use two global configuration files, one for simulation and the other for synthesis. Most constant values are scaled down to reduce simulation time.
Increase the abstraction: for bigger modules it might be useful to create a simplified, more abstract model of the module. For example, if you have a processor with this RTC (you mentioned) as a peripheral, you may create a simplified model of it. Pretending that you only need its interrupt, you could model it like this:
type time_array is array (1 to 2) of time;
constant INTERRUPT_EVENTS : time_array := (
    32 ns,
    100 ms
);

process
begin
    for i in 1 to INTERRUPT_EVENTS'length loop
        rtcInterrupt <= '0';
        wait for INTERRUPT_EVENTS(i);
        rtcInterrupt <= '1';
        wait until clk'event and clk = '1';
    end loop;
    wait;
end process;
I have recently been diving into Scala and (perhaps predictably) have spent quite a bit of time studying the immutable collections API in the Scala standard library.
I am writing an application that necessarily does many +/- operations on large sets. For this reason, I want to ensure that the implementation I choose is a so-called "persistent" data structure so that I avoid doing copy-on-write. I saw this answer by Martin Odersky, but it didn't really clear up the issue for me.
I wrote the following test code to compare the performance of ListSet and HashSet for add operations:
import scala.collection.immutable._
object TestListSet extends App {
  var set = new ListSet[Int]
  for (i <- 0 to 100000) {
    set += i
  }
}

object TestHashSet extends App {
  var set = new HashSet[Int]
  for (i <- 0 to 100000) {
    set += i
  }
}
Here is a rough runtime measurement of the HashSet:
$ time scala TestHashSet
real 0m0.955s
user 0m1.192s
sys 0m0.147s
And ListSet:
$ time scala TestListSet
real 0m30.516s
user 0m30.612s
sys 0m0.168s
Consing onto a singly linked list is a constant-time operation, but this performance looks linear or worse. Is the performance hit related to the need to check each element of the set for object equality, to maintain the no-duplicates invariant of Set? If so, I realize it's not related to "persistence".
As for official documentation, all I could find was the following page, but it seems incomplete: Scala 2.8 Collections API -- Performance Characteristics. Since ListSet seems initially to be a good choice for its memory footprint, maybe there should be some information about its performance in the API docs.
An old question, but also a good example of conclusions being drawn from the wrong foundation.
Connor, basically you're trying to do a microbenchmark. That is generally not recommended and damn hard to do properly.
Why? Because the JVM does many things other than execute the code in your examples: it loads classes, does garbage collection, compiles bytecode to native code, etc. All dynamically, based on different metrics sampled at runtime.
So you cannot conclude anything about the performance of the two collections from the above test code. For example, what you could actually be measuring is the compilation time of the += method of HashSet and the garbage collection time of ListSet. It's a comparison between apples and pears.
To do a microbenchmark properly, you should:
Warm up the JVM: Load all classes, ensure all code paths in the benchmark are run and hot spots in the code are compiled (e.g. the += method).
Run the benchmark and ensure that neither the GC nor the compiler runs during the test (use the JVM flags -XX:+PrintCompilation and -XX:+PrintGC to detect this). If either runs during the test, discard the result.
Repeat step 2 and sample 10-15 good measurements. Calculate variance and standard deviation.
Evaluate: if the mean of each benchmark +/- 3 standard deviations do not overlap, you can draw a conclusion about which is faster. Otherwise it's a blurry result (depending on the amount of overlap).
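As a language-neutral sketch of that evaluation step (the timing samples below are made up), one can compute the mean and standard deviation of each timing series and test the +/- 3 std intervals for overlap:

#include <cmath>
#include <cstdio>
#include <vector>

struct Stats { double mean; double sd; };

// Mean and sample standard deviation of a series of measurements.
Stats stats(const std::vector<double>& xs) {
    double mean = 0;
    for (double x : xs) mean += x;
    mean /= xs.size();
    double var = 0;
    for (double x : xs) var += (x - mean) * (x - mean);
    var /= xs.size() - 1; // sample variance
    return {mean, std::sqrt(var)};
}

int main() {
    std::vector<double> hashSetMs = {42, 45, 41, 44, 43};      // hypothetical samples
    std::vector<double> listSetMs = {950, 940, 960, 955, 948}; // hypothetical samples
    Stats a = stats(hashSetMs), b = stats(listSetMs);
    bool overlap = a.mean + 3 * a.sd >= b.mean - 3 * b.sd &&
                   b.mean + 3 * b.sd >= a.mean - 3 * a.sd;
    printf("%s\n", overlap ? "blurry result" : "clear difference");
    return 0;
}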
I can recommend reading Oracle's recommendations for doing microbenchmarks and a great article about benchmark pitfalls by Brian Goetz.
Also, if you want a good tool that does all of the above for you, try Caliper by Google.
The key line from the ListSet source is (within subclass Node):
override def +(e: A): ListSet[A] = if (contains(e)) this else new Node(e)
where you can see that an item is added only if it is not already contained. So adding to the set is O(n). You can generally assume that XMap has similar performance characteristics to XSet, and ListMap is listed as linear time all the way along. This is why ListSet is slow here, and it is how such a set is supposed to behave.
P.S. In the TestHashSet case you're mostly measuring JVM startup time; the real difference is way more than 30x.
Since a set must contain no duplicates, before adding an element a Set must check whether it already contains that element. In a list that gives no guarantee about an element's position, this search takes O(N) linear time. The same general idea applies to the remove operation.
With a HashSet, the class defines a function that picks a location for any element in O(1), which makes the contains(element) check much quicker, at the expense of extra space to decrease the chance of element-location collisions.
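As a language-neutral illustration of that tradeoff, here is a short C++ sketch (timings will vary by machine) comparing an average-O(1) hash lookup against an O(n) linear scan of a list:

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <list>
#include <unordered_set>

int main() {
    const int n = 100000;
    std::unordered_set<int> hashed;
    std::list<int> linked;
    for (int i = 0; i < n; ++i) { hashed.insert(i); linked.push_back(i); }

    auto t0 = std::chrono::steady_clock::now();
    volatile bool inHash = hashed.count(n - 1) > 0; // O(1) average: hash straight to a bucket
    auto t1 = std::chrono::steady_clock::now();
    volatile bool inList = std::find(linked.begin(), linked.end(), n - 1)
                           != linked.end();         // O(n): scan every node
    auto t2 = std::chrono::steady_clock::now();

    auto ns = [](auto d) {
        return (long long)std::chrono::duration_cast<std::chrono::nanoseconds>(d).count();
    };
    printf("hash lookup: %lld ns, list scan: %lld ns (found: %d %d)\n",
           ns(t1 - t0), ns(t2 - t1), (int)inHash, (int)inList);
    return 0;
}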