Does the e language provide a more efficient solution or data structure for FIFO management? - specman

I am using lists extensively in our UVC's monitor; due to our protocol specification, a lot is modeled with the FIFO operations list.push() and list.pop0(). Since pop0() is a very expensive operation in e on large lists,
does the e language provide a more efficient solution or data structure for FIFO management?

You can use eTL.
These are templates that implement common cases such as FIFO and more.
The way they are implemented is more efficient and focused on performance.

Yes. Use eTL. For example, to overcome the poor performance of the list.pop0() function, you can use a deque of uint (instead of a list of uint).
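
To make the deque's advantage concrete, here is a sketch of the same idea in Scala (not e code; the container names and sizes are illustrative): removing from the front of an array-backed list shifts every remaining element, while a ring-buffer deque only moves a head index.

```scala
import scala.collection.mutable.{ArrayBuffer, ArrayDeque}

object FifoDemo {
  def main(args: Array[String]): Unit = {
    val n = 50000

    // pop0()-style FIFO: removing index 0 from an array-backed buffer
    // shifts every remaining element, so draining it is O(n^2) overall
    val buf = ArrayBuffer.from(0 until n)
    var t0 = System.nanoTime()
    while (buf.nonEmpty) buf.remove(0)
    println(s"front-removal drain: ${(System.nanoTime() - t0) / 1e6} ms")

    // deque-style FIFO: a ring buffer advances a head index instead of
    // shifting elements, so each removal is O(1) and the drain is O(n)
    val dq = ArrayDeque.from(0 until n)
    t0 = System.nanoTime()
    while (dq.nonEmpty) dq.removeHead()
    println(s"deque drain: ${(System.nanoTime() - t0) / 1e6} ms")
  }
}
```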

Yes, e supports solutions for these kinds of problems.
One solution is to use eTL; another is to redefine the existing methods.

Related

Hash Table With Adaptive Hash Function

The performance of a particular hash table depends heavily on both the keys and the hash function. Obviously one can improve the performance greatly by trying different hash functions based on the incoming elements and picking the one resulting in the fewest collisions. Are there any publications on this subject exploring methods of selecting such functions dynamically, with or without user guidance?
I doubt there is a formal process to choose the best. There are too many moving parts. Especially when it comes to performance - there is no single "best performance" approach. Is it best latency? throughput? memory usage? cpu usage? More reads? More writes? Concurrent access? etc, etc, etc.
The only sensible way is to run performance tests for your specific code and use cases and choose what works for you.
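
There is no standard algorithm for this, but the empirical approach above is easy to sketch: count collisions for a few candidate functions on a sample of your real keys and keep the winner. A minimal Scala sketch (the candidate functions and key shapes are purely illustrative):

```scala
object HashPicker {
  // candidate hash functions over strings; illustrative choices only
  val candidates: Map[String, String => Int] = Map(
    "jvmDefault" -> (s => s.hashCode),
    "fnv1a" -> { s =>
      var h = 0x811c9dc5                        // FNV-1a offset basis
      for (c <- s) { h ^= c.toInt; h *= 0x01000193 }
      h
    },
    "sumOfChars" -> (s => s.map(_.toInt).sum)   // deliberately weak baseline
  )

  // how many keys land in an already-occupied bucket for a given table size
  def collisions(keys: Seq[String], hash: String => Int, buckets: Int): Int = {
    val seen = new Array[Boolean](buckets)
    var n = 0
    for (k <- keys) {
      val b = Math.floorMod(hash(k), buckets)
      if (seen(b)) n += 1 else seen(b) = true
    }
    n
  }

  def main(args: Array[String]): Unit = {
    val keys = (1 to 10000).map(i => s"user-$i") // stand-in for real keys
    candidates.foreach { case (name, h) =>
      println(f"$name%-12s ${collisions(keys, h, 16384)}%d collisions")
    }
    val best = candidates.minBy { case (_, h) => collisions(keys, h, 16384) }
    println(s"picked: ${best._1}")
  }
}
```

Note that the fewest collisions on one sample is only a proxy; as said above, the deciding test is the performance of your actual workload.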

Concurrent hash-tables

I have been investigating hash-table implementations for multi-threaded environments. Are there any new articles or research from the 2010-2014 period? Perhaps you know of such scientific papers?
So far I have not found much information about it:
1. Relativistic Causal Ordering: A Memory Model for Scalable Concurrent Data Structures
2. CPHash: A Cache-Partitioned Hash Table
Have you looked into Cuckoo Hashing and Hopscotch Hashing?
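
One result from exactly that period is Prokopec et al.'s "Concurrent Tries with Efficient Non-Blocking Snapshots" (PPoPP 2012), which ships in the Scala standard library as scala.collection.concurrent.TrieMap. Its distinguishing feature is cheap, consistent snapshots taken while writers keep running; a minimal sketch (thread counts and key names are illustrative):

```scala
import scala.collection.concurrent.TrieMap

object ConcurrentMapDemo {
  def main(args: Array[String]): Unit = {
    val m = TrieMap.empty[String, Int]

    // several writer threads; TrieMap handles all synchronization internally
    val writers = (1 to 4).map { t =>
      new Thread(() => for (i <- 1 to 1000) m.put(s"k-$t-$i", i))
    }
    writers.foreach(_.start())

    // a consistent read-only view, unaffected by the writes still in flight
    val snap = m.readOnlySnapshot()
    println(s"snapshot size: ${snap.size}, live map keeps growing: ${m.size}")

    writers.foreach(_.join())
    println(s"final size: ${m.size}")
  }
}
```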

What are the real advantages of immutable collections?

Scala provides immutable collections, such as Set, List, and Map. I understand that immutability has advantages in concurrent programs. However, what exactly are the advantages of immutability in regular data processing?
What if I enumerate subsets, permutations, and combinations, for example? Do immutable collections have any advantage there?
What exactly are the advantages of immutability in regular data processing?
Generally speaking, immutable objects are easier/simpler to reason about.
They do. Since you're enumerating over a collection, presumably you'd want to be certain that elements are not inadvertently added or removed while you're enumerating.
Immutability is very much a paradigm in functional programming. Making collections immutable allows one to think of them much like primitive data types (i.e. modifying a collection or any other object results in creating a different object, just as adding 2 to 3 doesn't modify 3 but creates 5).
To expand Matt's answer: From my personal experience I can say that implementations of algorithms based on search trees (e.g. breadth first, depth first, backtracking) using mutable collections end up regularly as a steaming pile of crap: Either you forget to copy a collection before a recursive call, or you fail to take back changes correctly if you get the collection back. In that area immutable collections are clearly superior. I ended up writing my own immutable list in Java when I couldn't get a problem right with Java's collections. Lo and behold, the first "immutable" implementation worked immediately.
If your data doesn't change after creation, use immutable data structures. The type you choose will identify the intent of usage. Anything more specific would require knowledge about your particular problem space.
You may really be looking for a subset, permutation, or combination generator, and then the discussion of data structures is moot.
Also, you mentioned that you understand the concurrent advantages. Presumably, you're throwing some algorithm at permutations and subsets, and there's a good chance that algorithm can be parallelized to some extent. If that's the case, using immutable structures up front ensures your initial implementation of algorithm X will be easily transformed into concurrent algorithm X.
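
To make the enumeration point concrete: Scala's immutable collections already ship such generators, and nothing the consuming code does can disturb the source collection. A small sketch:

```scala
object EnumDemo {
  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3)

    // built-in generators on immutable sequences; xs itself never changes
    println(xs.permutations.toList)        // all 6 orderings
    println(xs.combinations(2).toList)     // List(1, 2), List(1, 3), List(2, 3)
    println(Set(1, 2, 3).subsets().toList) // all 8 subsets

    // each generated value is itself immutable, so it can be handed to
    // other code (or other threads) with no defensive copying
    xs.permutations.foreach(p => println(p.sum))
  }
}
```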
I have a couple advantages to add to the list:
Immutable collections can't be invalidated out from under you
That is, it's totally fine to have immutable public val members of a Scala class. They are read-only by definition. Compare to Java where not only do you have to remember to make the member private but also write a get method that returns a copy of the object so the original is not modified by the calling code.
Immutable data structures are persistent. This means that the immutable collection obtained by calling filter on your TreeSet actually shares some of its nodes with the original. This translates to time and space savings and offsets some of the penalties incurred by using immutability.
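
The sharing is easy to observe directly with the immutable List, where prepending allocates a single new cell and physically reuses the rest (a minimal sketch; eq tests reference identity):

```scala
object SharingDemo {
  def main(args: Array[String]): Unit = {
    val xs = List(1, 2, 3)
    val ys = 0 :: xs // prepend: one new cons cell, the rest is reused

    println(xs)            // List(1, 2, 3) -- unchanged
    println(ys)            // List(0, 1, 2, 3)
    println(ys.tail eq xs) // true: ys physically shares xs's cells
  }
}
```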
Some of immutability's advantages:
1. A smaller margin for error (you always know what's in your collections and read-only variables).
2. You can write concurrent programs without worrying about threads stepping on each other when modifying variables and collections.

why does memcached not support "multi set"

Can anyone explain why the memcached folks decided to support multi-get but not multi-set?
By multi I mean operation involving more than one key (see protocol at http://code.google.com/p/memcached/wiki/NewCommands).
So you can get multiple keys in one shot (the basic advantage being the standard saving you get from doing fewer round trips), but why can't you do bulk sets?
My theory is that it was meant to encourage fewer sets, done individually (e.g. on a cache read and miss). But I still do not see how multi-set conflicts with the general philosophy of memcached.
I looked at the client features at http://code.google.com/p/memcached/wiki/NewCommonFeatures, and it seems that some clients do potentially support "Multi-Set" (why only in the binary protocol?). I am using the Java spymemcached client, by the way.
It's not supported in the text protocol because it'd be very, very complicated to express, no clients would support it, and it would provide very little that you can't already do from the text protocol.
It's supported in the binary protocol because it's a trivial use case of binary operations.
spymemcached supports it implicitly -- just do a bunch of sets and magic happens:
http://dustin.github.com/2009/09/23/spymemcached-optimizations.html
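
For illustration, here is what that looks like from Scala on top of the Java spymemcached client (a sketch only: it assumes a memcached server on localhost:11211, and the key names are made up):

```scala
import java.net.InetSocketAddress
import net.spy.memcached.MemcachedClient
import scala.jdk.CollectionConverters._

object MultiSetDemo {
  def main(args: Array[String]): Unit = {
    val client = new MemcachedClient(new InetSocketAddress("localhost", 11211))

    // "multi-set": just issue the sets back to back; spymemcached queues the
    // asynchronous operations and packs them onto the connection for you
    val futures = (1 to 100).map(i => client.set(s"key-$i", 3600, s"value-$i"))
    futures.foreach(_.get()) // block until every set is acknowledged

    // multi-get, by contrast, is a single bulk protocol operation
    val bulk = client.getBulk((1 to 100).map(i => s"key-$i").asJava)
    println(s"fetched ${bulk.size()} values")

    client.shutdown()
  }
}
```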
I don't know a lot about memcached internals, but I assume writes have to be blocking, atomic operations. I assume that by allowing multiple set operations to be batched, you could block all reads for a long time (or risk a get occurring while only half of a write had been applied). Forcing writes to be done individually allows them to be interleaved fairly with gets.
I would imagine that the restriction against using multi sets is to avoid collisions when writing cached values to the memcache.
As an object cache, I can't foresee an example of when you would need transactional type writes. This use case seems less suited for a caching layer, but better suited for the underlying database.
If sets come in interleaved from different clients, it is most likely the case that for one key, the last one wins, or is at least close enough, until the cache is invalidated and a newer value is written.
As Gian mentions, there don't seem to be any good reasons to block reads from the cache while several or many writes to the cache happen.

Why are “set-based approaches” better than “procedural approaches”?

I am very eager to know the real reason, though I have picked up some knowledge from googling.
Thanks in advance.
Because SQL is a really poor language for writing procedural code, and because the SQL engine, storage, and optimizer are designed to make it efficient to assemble and join sets of records.
(Note that this isn't just applicable to SQL Server, but I'll leave your tags as they are)
Because, in general, the hundreds of man-years of development time that have gone into the database engine and optimizer, plus the fact that it has access to real-time statistics about the data, have made it better than the user at working out the best way to process the data for a given request.
Therefore, by saying what we want to achieve (with a set-based approach) and letting it decide how to do it, we generally achieve better results than by spelling out exactly how to process the data, line by line.
For example, suppose we have a simple inner join from table A to table B. At design time, we generally don't know 'which way round' will be most efficient to process: keep a list of all the values on the A side, and go through B matching them, or vice versa. But the query optimizer will know at runtime both the numbers of rows in the tables, and also the most recent statistics may provide more information about the values themselves. So this decision is obviously better made at runtime, by the optimizer.
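
That runtime choice is easy to sketch outside of a database. The toy Scala hash join below builds its in-memory index on whichever side turns out to be smaller, which is the essence of the decision the optimizer makes from its statistics (an analogue only, not what a database engine literally does):

```scala
object JoinDemo {
  // toy inner join: index whichever input is smaller at runtime
  def hashJoin[A, B, K](left: Seq[A], right: Seq[B])
                       (leftKey: A => K, rightKey: B => K): Seq[(A, B)] =
    if (left.size <= right.size) {
      val index = left.groupBy(leftKey) // small side held in memory
      right.flatMap(b => index.getOrElse(rightKey(b), Nil).map(a => (a, b)))
    } else {
      val index = right.groupBy(rightKey)
      left.flatMap(a => index.getOrElse(leftKey(a), Nil).map(b => (a, b)))
    }

  def main(args: Array[String]): Unit = {
    val customers = Seq((1, "Ann"), (2, "Bob"))
    val orders = Seq((10, 1), (11, 1), (12, 2), (13, 2)) // (orderId, custId)
    // joins on customer id; with these sizes the customers side gets indexed
    hashJoin(customers, orders)(_._1, _._2).foreach(println)
  }
}
```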
Finally, note that I have put a number of 'generally's in this post - there will always be times when we know better than the optimizer will, and for such times we can provide hints (NOLOCK etc).
Set-based approaches are declarative, so you don't describe the way the work will be done, only what you want the result to look like. The server can decide between several strategies for how to comply with your request, and hopefully choose one that is efficient.
If you write procedural code, that code will at best be less than optimal in some situations.
Because using a set-based approach to SQL development conforms to the design of the data model. SQL is a very set-based language, used to build sets, subsets, unions, etc, from data. Keeping that in mind while developing in TSQL will generally lead to more natural algorithms. TSQL makes many procedural commands available that don't exist in plain SQL, but don't let that switch you to a procedural methodology.
This makes me think of one of my favorite quotes from Rob Pike in Notes on Programming in C:
Data dominates. If you have chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming.
SQL databases and the way we query them are largely set-based. Thus, so should our algorithms be.
From an even more tangible standpoint, SQL servers are optimized with set-based approaches in mind. Indexing, storage systems, query optimizers, and other optimizations made by various SQL database implementations will do a much better job if you simply tell them what data you need, through a set-based approach, rather than dictating how you want to get it procedurally. Let the SQL engine worry about the best way to get you the data; you just worry about telling it what data you want.
As everyone has explained, let the SQL engine help you; believe me, it is very smart.
If you are used to developing procedural code rather than writing set-based solutions, you will have to spend some time before you can write well-formed set-based solutions. This is a barrier for most people. A tip if you wish to start coding set-based solutions: stop thinking about what you can do with rows, start thinking about what you can do with columns, and practice functional languages.