How to enumerate permutations in Maple without the built-in function

I want to write Maple code that, without using the permute function, gives me one permutation at a time, because I do not want to store all the permutations in a list; that would take too much memory when n is large.
For example, to permute [1,2,3]: first I get 1,2,3 and discard it, next I get 2,1,3 and discard it, then 3,2,1, and so on, never storing a permutation after I have used it.
The same goes for combinations, without using combinat to store the whole list: say I choose 3 elements at a time from 1,2,3,4.
First I choose 1,2,3 and discard it; next I choose, say, 1,2,4 and discard it, never storing a combination after I have just used it.

You can use either the combinat:-nextperm command or the Iterator:-Permute command (in more recent versions of Maple) to iterate over permutations one at a time.
In other words, these let you construct (and inspect or manipulate) each permutation individually, without the full set, which is prohibitively expensive in memory, ever being stored at once. In fact, avoiding that is the primary purpose of these commands.
Similarly, there are commands such as combinat:-nextcomb for combinations.
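If you want the underlying idea rather than the Maple-specific commands: both are iterators built around a classic "next" step that needs only the current permutation or combination to produce its successor, so nothing but the current element is ever held in memory. A minimal sketch of that step (in Scala rather than Maple, purely to illustrate the algorithm; nextPerm and nextComb are made-up names):

// Advance `a` in place to the next lexicographic permutation.
// Returns false when `a` was the last (fully descending) permutation.
def nextPerm(a: Array[Int]): Boolean = {
  var i = a.length - 2
  while (i >= 0 && a(i) >= a(i + 1)) i -= 1        // rightmost ascent
  if (i < 0) return false
  var j = a.length - 1
  while (a(j) <= a(i)) j -= 1                      // rightmost element > a(i)
  val t = a(i); a(i) = a(j); a(j) = t              // swap pivot and successor
  var lo = i + 1; var hi = a.length - 1
  while (lo < hi) {                                // reverse the tail
    val u = a(lo); a(lo) = a(hi); a(hi) = u; lo += 1; hi -= 1
  }
  true
}

// Advance an ascending k-combination drawn from 1..n; false when exhausted.
def nextComb(c: Array[Int], n: Int): Boolean = {
  val k = c.length
  var i = k - 1
  while (i >= 0 && c(i) == n - k + i + 1) i -= 1   // rightmost slot below its max
  if (i < 0) return false
  c(i) += 1
  var j = i + 1
  while (j < k) { c(j) = c(j - 1) + 1; j += 1 }    // reset the tail minimally
  true
}

// Visit every permutation of [1,2,3], one at a time, storing none of them.
val a = Array(1, 2, 3)
var more = true
while (more) { println(a.mkString(",")); more = nextPerm(a) }

Starting nextComb from Array(1,2,3) with n = 4 yields 1,2,4 then 1,3,4 then 2,3,4, matching the behaviour asked for above.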

Related

Is it more efficient to assign data from a structure once, or to extract it multiple times?

I have to run a certain function many times; this function takes a structure sc as input. Within the function, certain values from the structure (say sc.a and sc.b) are used multiple times.
I have two options:
1. Assign a = sc.a and use a every time it is needed within the function;
2. Extract sc.a every time I need it within the function.
Which of these is more efficient? In (1) I am using extra memory to assign a, while in (2) I am extracting sc.a multiple times.
Arrays will be quite a bit faster if you have plenty of operations.
This is almost language-agnostic. Arrays are faster to access because their elements sit next to each other in memory, while struct fields break that memory pattern, defeating caching and costing extra time on memory reads. On top of that, MATLAB's OpenMP/multi-threaded operations work well on arrays, but not on structs.
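For the original question (hoist sc.a into a local versus re-extracting it), the difference is easiest to see side by side. A sketch, in Scala rather than MATLAB, with Sc, fWithLocals and fWithFields invented for illustration; a JIT-compiled language usually optimizes the two to the same code, whereas an interpreter may pay for every extraction:

case class Sc(a: Double, b: Double)

// Option (1): read each field once, then reuse the locals in the hot loop.
def fWithLocals(sc: Sc, n: Int): Double = {
  val a = sc.a
  val b = sc.b
  var acc = 0.0
  var i = 0
  while (i < n) { acc += a * i + b; i += 1 }
  acc
}

// Option (2): extract sc.a and sc.b again on every iteration.
def fWithFields(sc: Sc, n: Int): Double = {
  var acc = 0.0
  var i = 0
  while (i < n) { acc += sc.a * i + sc.b; i += 1 }
  acc
}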

What is the efficiency of an ASSIGN statement in Progress 4GL?

Why is an ASSIGN statement more efficient than not using ASSIGN?
Co-workers say that:
assign
a=3
v=7
w=8.
is more efficient than:
a=3.
v=7.
w=8.
Why?
You could always test it yourself and see... but, yes, it is slightly more efficient. Or it was the last time I tested it. The reason is that the compiler combines the statements and the resulting r-code is a bit smaller.
But efficiency is almost always a poor reason to do it. Saving a microsecond here and there pales next to avoiding disk IO or picking a more efficient algorithm. Good reasons:
Back in the dark ages there was a limit of 63k of r-code per program. Combining statements with ASSIGN was a way to reduce the size of r-code and stay under that limit (ok, that might not be a "good" reason). One additional way this helps is that you could also often avoid a DO ... END pair and further reduce r-code size.
When creating or updating a record the fields that are part of an index will be written back to the database as they are assigned (not at the end of the transaction) -- grouping all assignments into a single statement helps to avoid inconsistent dirty reads. Grouping the indexed fields into a single ASSIGN avoids writing the index entries multiple times. (This is probably the best reason to use ASSIGN.)
Readability -- you can argue that grouping consecutive assignments more clearly shows your intent and is thus more readable. (I like this reason but not everyone agrees.)
Basically, doing:
a=3.
v=7.
w=8.
is the same as:
assign a=3.
assign v=7.
assign w=8.
which is 3 separate statements, so there is a little more overhead; therefore it is less efficient.
Progress does an ASSIGN as one statement whether there is one variable or several being assigned. If you do not say ASSIGN, it is assumed, so you will do 3 statements instead of 1. There is a 20%-40% reduction in r-code and a 15%-20% performance improvement when using one ASSIGN statement. Why this is can only be speculated on, as I cannot find any source with information on it. For database fields, and especially key/index fields, it makes perfect sense. For variables I can only assume it has to do with how Progress manages its buffers and copies data to and from them.
ASSIGN will combine multiple statements into one. If a, v and w are fields in your db, that means it will do something like INSERT INTO (a,v,w)...
rather than
INSERT INTO (a)...
INSERT INTO (v)...
etc.

Find global subscript midpoint

In Caché ObjectScript (InterSystems' dialect of MUMPS), is there a way to efficiently skip to the approximate midpoint, or to evenly spaced points, of a global subscript's key range? Equal, that is, in terms of the number of records.
I want to divide the subscript key range into approximately equal chunks and then process each chunk in parallel.
Knowing that the keys in a global are arranged in a binary tree of some kind, this should be a simple operation for the underlying data storage engine, but I'm not sure whether there is an interface to do it.
I can do it by scanning the global's whole keyspace, but that would defeat the purpose of trying to run the operation in parallel; a sequential scan takes hours on this global. I need the keyspace divided up BEFORE I begin scanning.
I want each thread to take an approximately equal-sized contiguous chunk of the keyspace and scan it individually; the problem is calculating which key range to give each thread.
You can use the second parameter, direction (1 or -1), of the $order or $query function.
For my particular need, I found that the application I'm using has what I would call an index global: another global, maintained by the app, with different keys linking back to the main table. I can scan that in a fraction of the time and break up the keyset from there.
If someone comes up with a way to do what I want given only the main global, I'll change the accepted answer to that.
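The general pattern behind that accepted answer, independent of ObjectScript: read the (much cheaper) index keys once, cut them into contiguous ranges, and give each worker one range of the main global to scan. A sketch in Scala; keys, nWorkers and scanRange are hypothetical stand-ins for the index keyset, the thread count, and the per-range scan:

import scala.concurrent._
import scala.concurrent.duration.Duration
import ExecutionContext.Implicits.global

// Split a sorted key list into roughly equal contiguous ranges,
// then scan each (lo, hi) range of the main global in parallel.
def parallelScan(keys: Vector[String], nWorkers: Int)
                (scanRange: (String, String) => Unit): Unit = {
  val chunk = math.max(1, keys.length / nWorkers)
  val bounds = keys.grouped(chunk).map(g => (g.head, g.last)).toVector
  val work = bounds.map { case (lo, hi) => Future(scanRange(lo, hi)) }
  Await.result(Future.sequence(work), Duration.Inf)
}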

Merging huge sets (HashSet) in Scala

I have two huge (as in millions of entries) sets (HashSet) that have some (<10%) overlap between them. I need to merge them into one set (I don't care about maintaining the original sets).
Currently, I am adding all items of one set to the other with:
setOne ++= setTwo
This takes several minutes to complete (after several attempts at tweaking hashCode() on the members).
Any ideas how to speed things up?
You can get slightly better performance with the Parallel Collections API in Scala 2.9.0+:
setOne.par ++ setTwo
or
(setOne.par /: setTwo)(_ + _)
There are a few things you might want to try:
Use the sizeHint method to keep your sets at the expected size.
Call useSizeMap(true) on it to get better hash table resizing.
It seems to me that the latter option gives better results, though both show improvements on tests here.
Can you tell me a little more about the data inside the sets? The reason I ask is that for this kind of thing you usually want something a bit specialized. Here are a few things that can be done:
If the data is (or can be) sorted, you can walk pointers to do a merge, similar to what's done in merge sort (see the sketch after this list). This operation is pretty trivially parallelizable, since you can partition one data set and then partition the second data set using binary search to find the matching boundaries.
If the data is within a certain numeric range, you can instead use a bitset and just set bits whenever you encounter that number.
If one of the data sets is smaller than the other, you could put it in a hash set and loop over the other dataset quickly, checking for containment.
I have used the first strategy to create a gigantic set of about 8 million integers from about 40k smaller sets in about a second (on beefy hardware, in Scala).
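A minimal sketch of that first strategy, assuming Int elements already held in sorted, duplicate-free arrays (mergeSorted is a made-up name); the parallel version would partition xs and binary-search ys for matching boundaries:

// Merge two sorted, duplicate-free arrays, keeping one copy of
// elements that occur in both (the <10% overlap).
def mergeSorted(xs: Array[Int], ys: Array[Int]): Array[Int] = {
  val out = new scala.collection.mutable.ArrayBuilder.ofInt
  out.sizeHint(xs.length + ys.length)
  var i = 0; var j = 0
  while (i < xs.length && j < ys.length) {
    if (xs(i) < ys(j))      { out += xs(i); i += 1 }
    else if (xs(i) > ys(j)) { out += ys(j); j += 1 }
    else                    { out += xs(i); i += 1; j += 1 } // in both: keep once
  }
  while (i < xs.length) { out += xs(i); i += 1 }
  while (j < ys.length) { out += ys(j); j += 1 }
  out.result()
}

The result can then be wrapped in a set, or used directly if sorted order is acceptable downstream.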

Merge of key-value stores

Is there some merge strategy or program which is aware of key-value stores, in the sense that the sequence of the lines does not matter*? For a real example, jEdit does not keep the order of options, so there are hundreds of lines which are shuffled around. It would be nice to diff/merge these without having to sort the file first, for example to see how values are changed and keys are added/removed by configuration modifications while the program is running.
* I know it matters for some file types, like shell scripts where you can have references to other keys. These of course should be merged normally.
If the stores are unsorted, then comparing them naively costs O(n·m) time. If you first sort them, you can do it in O(n log n + m log m) for the sorts plus O(n + m) for the comparison, so if the stores are reasonably large, sorting first is far faster.
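A small sketch of an order-insensitive diff in Scala; here the stores are parsed into hash maps, which brings the comparison itself down to O(n + m) expected time, and the keys are sorted only to make the report stable (diffStores and the key=value line format are assumptions):

// Diff two key=value stores, ignoring the order of their lines.
def diffStores(linesA: Seq[String], linesB: Seq[String]): Unit = {
  def parse(lines: Seq[String]): Map[String, String] =
    lines.collect { case l if l.contains('=') =>
      val Array(k, v) = l.split("=", 2)
      (k.trim, v.trim)
    }.toMap

  val a = parse(linesA)
  val b = parse(linesB)
  for (k <- (a.keySet ++ b.keySet).toSeq.sorted) (a.get(k), b.get(k)) match {
    case (Some(x), None)              => println(s"removed: $k=$x")
    case (None, Some(y))              => println(s"added:   $k=$y")
    case (Some(x), Some(y)) if x != y => println(s"changed: $k: $x -> $y")
    case _                            => // unchanged
  }
}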