Diff, compare and merge files - general concepts

Diff, compare and merge files - general concepts - merge

Consider the example below without any assumption about language or file types. I have:
M_old : [A,B,C]
M_new : [A,B,X]
and now I would like to create:
M_result : [A,B,C,X]
which is basically the union of M_old and M_new. Are there any compare/diff/merge tool that supports this kind of operation - that can take M_old and M_new and produce M_result based on the above instances?
I have had success with computing (using a simple diff/merge tool) M_result when:
M_old : [A,B,C]
M_new : [A,B,C,X]
since all elements in M_old is contained in M_new. But I don't see how this possible when some of the elements in M_old is not present in M_new.
So in more general terms does a "merge" operation only support union under special conditions as described above?

If you don't care about the exact order of the resulting list, you can compute M_result by appending every element in M_old that is not in M_new to M_new, or, alternatively, by concatenating M_new and M_old and then removing duplicate elements (These yield the exact same result if neither input list contains any duplicate elements to begin with).

correct merge tools will propose you the solution that you want amongst other solutions, this is what our tool ECMerge does.
(I presume you have an ancestor: M_ancestor: [A,B])
your problem here is that there are not enough data to determine that is expected when there is a replacement (should the merge result be ',C' then ',X', ',X' then ',C' or only one or the other).
(I presume you have an ancestor: M_ancestor: [A,B,C])
the other case when the change is an insertion of ',X' (M_old : [A,B,C] versus M_new : [A,B,C,X]) is generally resolved by adding ',X' as there is no particular choice. if the ancestor is [A,B] only then there is still a question: should the merger consider adding ',C' or ',C,X' ? this is not obvious then as the version with only ',C' might have taken into account the knowledge of ',C,X' from outside the SCC system and already deleted the ',X'.
even 'clever' merger cannot do divination :)

Related

Drools RETE algorithm confusion

I am having an issue understanding RETE algorithm Beta node JoinNode and notNode?
Documentation says :
There are two two-input nodes, JoinNode and NotNode, and both are
types of BetaNodes. BetaNodes are used to compare 2 objects, and their
fields, to each other. The objects may be the same or different types.
By convention, we refer to the two inputs as left and right. The left
input for a BetaNode is generally a list of objects; in Drools this is
a Tuple. The right input is a single object. Two Nodes can be used to
implement 'exists' checks. BetaNodes also have memory. The left input
is called the Beta Memory and remembers all incoming tuples. The right
input is called the Alpha Memory and remembers all incoming objects.
I understood, Alpha Node: Various literal conditions for drl rules but above documentation for BetaNodes is confusing me a bit.
say below is drl condition for above diagram:
$person : Person( favouriteCheese == $cheddar )
Query: 1) what are these left and right inputs to two-input Beta Nodes exactly as explained in above documentation? I believe it's referring to facts and rules where I believe tuples would be facts?
2) notNode would be basically drl condition matching literal condition with not?
Updated question on 6Sep17:
3) I believe above diagram represent joinNode, how would notNode be represented , if above workflow is altered to suit notNode?

The condition corresponding to the diagram would be
Cheese( $name: name == "Cheddar" )
Person( favouriteCheese == $name )
Once there is a match, a new tuple consisting of the matching Cheese and Person is composed and can act as a new tuple for further matches if there is a third pattern in the condition.
A not-node would be one that asserts the non-existence of some fact. It would fire only once.
You might find a much better description of "rete" on the web.

What is the correct way to select real solutions?

Suppose one needs to select the real solutions after solving some equation.
Is this the correct and optimal way to do it, or is there a better one?
restart;
mu := 3.986*10^5; T:= 8*60*60:
eq := T = 2*Pi*sqrt(a^3/mu):
sol := solve(eq,a);
select(x->type(x,'realcons'),[sol]);
I could not find real as type. So I used realcons. At first I did this:
select(x->not(type(x,'complex')),[sol]);
which did not work, since in Maple 5 is considered complex! So ended up with no solutions.
type(5,'complex');
(* true *)
Also I could not find an isreal() type of function. (unless I missed one)
Is there a better way to do this that one should use?
update:
To answer the comment below about 5 not supposed to be complex in maple.
restart;
type(5,complex);
true
type(5,'complex');
true
interface(version);
Standard Worksheet Interface, Maple 18.00, Windows 7, February
From help
The type(x, complex) function returns true if x is an expression of the form
a + I b, where a (if present) and b (if present) are finite and of type realcons.

Your solutions sol are all of type complex(numeric). You can select only the real ones with type,numeric, ie.
restart;
mu := 3.986*10^5: T:= 8*60*60:
eq := T = 2*Pi*sqrt(a^3/mu):
sol := solve(eq,a);
20307.39319, -10153.69659 + 17586.71839 I, -10153.69659 - 17586.71839 I
select( type, [sol], numeric );
[20307.39319]
By using the multiple argument calling form of the select command we here can avoid using a custom operator as the first argument. You won't notice it for your small example, but it should be more efficient to do so. Other commands such as map perform similarly, to avoid having to make an additional function call for each individual test.
The types numeric and complex(numeric) cover real and complex integers, rationals, and floats.
The types realcons and complex(realcons) includes the previous, but also allow for an application of evalf done during the test. So Int(sin(x),x=1..3) and Pi and sqrt(2) are all of type realcons since following an application of evalf they become floats of type numeric.
The above is about types. There are also properties to consider. Types are properties, but not necessarily vice versa. There is a real property, but no real type. The is command can test for a property, and while it is often used for mixed numeric-symbolic tests under assumptions (on the symbols) it can also be used in tests like yours.
select( is, [sol], real );
[20307.39319]
It is less efficient to use is for your example. If you know that you have a collection of (possibly non-real) floats then type,numeric should be an efficient test.
And, just to muddy the waters... there is a type nonreal.
remove( type, [sol], nonreal );
[20307.39319]

The one possibility is to restrict the domain before the calculation takes place.
Here is an explanation on the Maplesoft website regarding restricting the domain:
4 Basic Computation
UPD: Basically, according to this and that, 5 is NOT considered complex in Maple, so there might be some bug/error/mistake (try checking what may be wrong there).
For instance, try putting complex without quotes.
Your way seems very logical according to this.
UPD2: According to the Maplesoft Website, all the type checks are done with type() function, so there is rather no isreal() function.

How to check if the value is a number in Prolog manually?

How to check if the given value is a number in Prolog without using built-in predicates like number?
Let's say I have a list [a, 1, 2, 3]. I need a way to check if every element within this list is a number. The only part of the problem that bothers me is how to do the check itself without using the number predicate.
The reason why I'm trying to figure this out is that I've got a college assignment where it's specifically said not to use any of the built-in predicates.

You need some built-in predicate to solve this problem - unless you enumerate all numbers explicitly (which is not practical since there are infinitely many of them).
1
The most straight-forward would be:
maplist(number, L).
Or, recursively
allnumbers([]).
allnumbers([N|Ns]) :-
number(N),
allnumbers(Ns).
2
In a comment you say that "the value is given as an atom". That could mean that you get either [a, '1', '2'] or '[a, 1, 2]`. I assume the first. Here again, you need a built-in predicate to analyze the name. Relying on ISO-Prolog's errors we write:
numberatom(Atom) :-
atom_chars(Atom, Chs),
catch(number_chars(_, Chs), error(syntax_error(_),_), false).
Use numberatom/1 in place of number/1, So write a recurse rule or use maplist/2
3
You might want to write a grammar instead of the catch... goal. There have been many such definitions recently, you may look at this question.
4
If the entire "value" is given as an atom, you will need again atom_chars/2or you might want some implementation specific solution like atom_to_term/3 and then apply one of the solutions above.

Why doesn't scala's parallel sequences have a contains method?

Why does
List.range(0,100).contains(2)
Work, while
List.range(0,100).par.contains(2)
Does not?
This is planned for the future?

The non-teleological answer is that it's because contains is defined in SeqLike but not in ParSeqLike.
If that doesn't satisfy your curiosity, you can find that SeqLike's contains is defined thus:
def contains(elem: Any): Boolean = exists (_ == elem)
So for your example you can write
List.range(0,100).par.exists(_ == 2)
ParSeqLike is missing a few other methods as well, some of which would be hard to implement efficiently (e.g. indexOfSlice) and some for less obvious reasons (e.g. combinations - maybe because that's only useful on small datasets). But if you have a parallel collection you can also use .seq to get back to the linear version and get your methods back:
List.range(0,100).par.seq.contains(2)
As for why the library designers left it out... I'm totally guessing, but maybe they wanted to reduce the number of methods for simplicity's sake, and it's nearly as easy to use exists.
This also raises the question, why is contains defined on SeqLike rather than on the granddaddy of all collections, GenTraversableOnce, where you find exists? A possible reason is that contains for Map is semantically a different method to that on Set and Seq. A Map[A,B] is a Traversable[(A,B)], so if contains were defined for Traversable, contains would need to take a tuple (A,B) argument; however Map's contains takes just an A argument. Given this, I think contains should be defined in GenSeqLike - maybe this is an oversight that will be corrected.
(I thought at first maybe parallel sequences don't have contains because searching where you intend to stop after finding your target on parallel collections is a lot less efficient than the linear version (the various threads do a lot of unnecessary work after the value is found: see this question), but that can't be right because exists is there.)

How to delete elements from a transformed collection using a predicate?

If I have an ArrayList<Double> dblList and a Predicate<Double> IS_EVEN I am able to remove all even elements from dblList using:
Collections2.filter(dblList, IS_EVEN).clear()
if dblList however is a result of a transformation like
dblList = Lists.transform(intList, TO_DOUBLE)
this does not work any more as the transformed list is immutable :-)
Any solution?

Lists.transform() accepts a List and helpfully returns a result that is RandomAccess list. Iterables.transform() only accepts an Iterable, and the result is not RandomAccess. Finally, Iterables.removeIf (and as far as I see, this is the only one in Iterables) has an optimization in case that the given argument is RandomAccess, the point of which is to make the algorithm linear instead of quadratic, e.g. think what would happen if you had a big ArrayList (and not an ArrayDeque - that should be more popular) and kept removing elements from its start till its empty.
But the optimization depends not on iterator remove(), but on List.set(), which is cannot be possibly supported in a transformed list. If this were to be fixed, we would need another marker interface, to denote that "the optional set() actually works".
So the options you have are:
Call Iterables.removeIf() version, and run a quadratic algorithm (it won't matter if your list is small or you remove few elements)
Copy the List into another List that supports all optional operations, then call Iterables.removeIf().

The following approach should work, though I haven't tried it yet.
Collection<Double> dblCollection =
Collections.checkedCollection(dblList, Double.class);
Collections2.filter(dblCollection, IS_EVEN).clear();
The checkCollection() method generates a view of the list that doesn't implement List. [It would be cleaner, but more verbose, to create a ForwardingCollection instead.] Then Collections2.filter() won't call the unsupported set() method.
The library code could be made more robust. Iterables.removeIf() could generate a composed Predicate, as Michael D suggested, when passed a transformed list. However, we previously decided not to complicate the code by adding special-case logic of that sort.

Maybe:
Collection<Double> odds = Collections2.filter(dblList, Predicates.not(IS_EVEN));
or
dblList = Lists.newArrayList(Lists.transform(intList, TO_DOUBLE));
Collections2.filter(dblList, IS_EVEN).clear();

As long as you have no need for the intermediate collection, then you can just use Predicates.compose() to create a predicate that first transforms the item, then evaluates a predicate on the transformed item.
For example, suppose I have a List<Double> from which I want to remove all items where the Integer part is even. I already have a Function<Double,Integer> that gives me the Integer part, and a Predicate<Integer> that tells me if it is even.
I can use these to get a new predicate, INTEGER_PART_IS_EVEN
Predicate<Double> INTEGER_PART_IS_EVEN = Predicates.compose(IS_EVEN, DOUBLE_TO_INTEGER);
Collections2.filter(dblList, INTEGER_PART_IS_EVEN).clear();

After some tries, I think I've found it :)
final ArrayList<Integer> ints = Lists.newArrayList(1, 2, 3, 4, 5);
Iterables.removeIf(Iterables.transform(ints, intoDouble()), even());
System.out.println(ints);
[1,3,5]

I don't have a solution, instead I found some kind of a problem with Iterables.removeIf() in combination with Lists.TransformingRandomAccessList.
The transformed list implements RandomAccess, thus Iterables.removeIf() delegates to Iterables.removeIfFromRandomAccessList() which depends on an unsupported List.set() operation.
Calling Iterators.removeIf() however would be successful, as the remove() operation IS supported by Lists.TransformingRandomAccessList.
see: Iterables: 147
Conclusion: instanceof RandomAccess does not guarantee List.set().
Addition:
In special situations calling removeIfFromRandomAccessList() even works:
if and only if the elements to erase form a compact group at the tail of the List or all elements are covered by the Predicate.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Diff, compare and merge files - general concepts - merge

Related

Drools RETE algorithm confusion

What is the correct way to select real solutions?

How to check if the value is a number in Prolog manually?

Why doesn't scala's parallel sequences have a contains method?

How to delete elements from a transformed collection using a predicate?

Categories

Resources