How to only emit consistent calculations? - system.reactive

I'm using reactive programming to do a bunch of calculations. Here is a simple example that tracks two numbers and their sum:
static void Main(string[] args) {
BehaviorSubject<int> x = new BehaviorSubject<int>(1);
BehaviorSubject<int> y = new BehaviorSubject<int>(2);
var sum = Observable.CombineLatest(x, y, (num1, num2) => num1 + num2);
Observable
.CombineLatest(x, y, sum, (xx, yy, sumsum) => new { X = xx, Y = yy, Sum = sumsum })
.Subscribe(i => Console.WriteLine($"X:{i.X} Y:{i.Y} Sum:{i.Sum}"));
x.OnNext(3);
Console.ReadLine();
}
This generates the following output:
X:1 Y:2 Sum:3
X:3 Y:2 Sum:3
X:3 Y:2 Sum:5
Notice how the second output line is "incorrect": it shows 3+2=3. I understand why this happens (x is updated before the sum is updated), but I want my output calculations to be atomic/consistent: no value should be emitted until all dependent calculations are complete. My first approach was this...
Observable.When(sum.And(Observable.CombineLatest(x, y)).Then((s, xy) => new { Sum = s, X = xy[0], Y = xy[1] } ));
This seems to work for my simple example. But my actual code has LOTS of calculated values and I couldn't figure out how to scale it. For example, if there was a sum and squaredSum, I don't know how to wait for each of these to emit something before taking action.
One method that should work (in theory) is to timestamp all the values I care about, as shown below.
Observable
.CombineLatest(x.Timestamp(), y.Timestamp(), sum.Timestamp(), (xx, yy, sumsum) => new { X = xx, Y = yy, Sum = sumsum })
.Where(i=>i.Sum.Timestamp>i.X.Timestamp && i.Sum.Timestamp>i.Y.Timestamp)
// do the calculation and subscribe
This method could work for very complicated models. All I have to do is ensure that no calculated value is emitted that is older than any core data value. I find this to be a bit of a kludge. It didn't actually work in my console app. When I replaced Timestamp with a custom extension that assigned a sequential int64 it did work.
What is a simple, clean way to handle this kind of thing in general?
=======
I'm making some progress here. This waits for a sum and sumSquared to emit a value before grabbing the data values that triggered the calculation.
var all = Observable.When(sum.And(sumSquared).And(Observable.CombineLatest(x, y)).Then((s, q, data)
=> new { Sum = s, SumSquared = q, X = data[0], Y = data[1] }));

This should do what you want:
Observable.CombineLatest(x, y, sum)
.DistinctUntilChanged(list => list[2])
.Subscribe(list => Console.WriteLine("{0}+{1}={2}", list[0], list[1], list[2]));
It waits until the sum has been updated, which means that all its sources must have been updated too.

Your problem isn't that x is updated before the sum is updated per se. It's really about the way that you've constructed your query.
You've effectively created two queries: Observable.CombineLatest(x, y, (num1, num2) => num1 + num2) and Observable.CombineLatest(x, y, sum, (xx, yy, sumsum) => new { X = xx, Y = yy, Sum = sumsum }). Since each of them subscribes to x, you've created two subscriptions to x, which means that when x updates, two lots of updates occur.
You need to avoid creating two subscriptions.
If you write your code like this:
BehaviorSubject<int> x = new BehaviorSubject<int>(1);
BehaviorSubject<int> y = new BehaviorSubject<int>(2);
Observable
.CombineLatest(x, y, (num1, num2) => new
{
X = num1,
Y = num2,
Sum = num1 + num2
})
.Subscribe(i => Console.WriteLine($"X:{i.X} Y:{i.Y} Sum:{i.Sum}"));
x.OnNext(3);
...then you correctly get this output:
X:1 Y:2 Sum:3
X:3 Y:2 Sum:5
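The same single-query idea extends to several derived values: compute all of them inside one projection, so every emitted snapshot is consistent by construction. Below is a minimal sketch, assuming the System.Reactive package is referenced. The `ConsistentSnapshots` class, the `Snapshots` and `Demo` names, and the reading of "sumSquared" as the square of the sum are assumptions made for illustration; they are not from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class ConsistentSnapshots
{
    // Hypothetical helper: every derived value is computed in the same
    // projection, so there is only one subscription to x and one to y.
    public static IObservable<(int X, int Y, int Sum, int SumSquared)> Snapshots(
        IObservable<int> x, IObservable<int> y) =>
        x.CombineLatest(y, (a, b) =>
            (X: a, Y: b, Sum: a + b, SumSquared: (a + b) * (a + b)));

    // Replays the scenario from the question and records each snapshot.
    public static List<string> Demo()
    {
        var x = new BehaviorSubject<int>(1);
        var y = new BehaviorSubject<int>(2);
        var seen = new List<string>();

        using (Snapshots(x, y).Subscribe(s =>
            seen.Add($"X:{s.X} Y:{s.Y} Sum:{s.Sum} SumSquared:{s.SumSquared}")))
        {
            x.OnNext(3);
        }
        return seen;
    }
}
```

Calling `ConsistentSnapshots.Demo()` should record two snapshots, one for the initial values and one after `x.OnNext(3)`, with no intermediate "3+2=3" entry. The trade-off is that the calculations are no longer defined as independent observables, which may or may not suit a larger model.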

I've started to get my head around this some more. Here is a more detailed example of what I'm trying to accomplish. This is some code that validates a first and last name, and should only generate a whole name when both parts are valid. As you can see I'm trying to use a bunch of small independently defined functions, like "firstIsValid", and then compose them together to calculate something more complex.
It seems like the challenge I'm facing here is trying to correlate inputs and outputs in my functions. For example, "firstIsValid" generates an output that says some first name was valid, but doesn't tell you which one. In option 2 below, I'm able to correlate them using Zip.
This strategy won't work if a validation function does not generate exactly one output for each input. For example, if the user is typing web addresses and we're trying to validate them on the web, maybe we'd do a Throttle and/or Switch. There might be 10 web addresses for a single "webAddressIsValid". In that situation, I think I have to include the output with the input. Maybe have an IObservable of string/bool pairs, where the string is the web address and the bool is whether it is valid or not.
static void Main(string[] args) {
var first = new BehaviorSubject<string>(null);
var last = new BehaviorSubject<string>(null);
var firstIsValid = first.Select(i => string.IsNullOrEmpty(i) || i.Length < 3 ? false : true);
var lastIsValid = last.Select(i => string.IsNullOrEmpty(i) || i.Length < 3 ? false : true);
// OPTION 1 : Does not work
// Output: bob smith, bob, bob roberts, roberts
// firstIsValid and lastIsValid are not in sync with first and last
//var whole = Observable
// .CombineLatest(first, firstIsValid, last, lastIsValid, (f, fv, l, lv) => new {
// First = f,
// Last = l,
// FirstIsValid = fv,
// LastIsValid = lv
// })
// .Where(i => i.FirstIsValid && i.LastIsValid)
// .Select(i => $"{i.First} {i.Last}");
// OPTION 2 : Works as long as every change in a core data value generates one calculated value
// Output: bob smith, bob roberts
var firstValidity = Observable.Zip(first, firstIsValid, (f, fv) => new { Name = f, IsValid = fv });
var lastValidity = Observable.Zip(last, lastIsValid, (l, lv) => new { Name = l, IsValid = lv });
var whole =
Observable.CombineLatest(firstValidity, lastValidity, (f, l) => new { First = f, Last = l })
.Where(i => i.First.IsValid && i.Last.IsValid)
.Select(i => $"{i.First.Name} {i.Last.Name}");
whole.Subscribe(i => Console.WriteLine(i));
first.OnNext("bob");
last.OnNext("smith");
last.OnNext(null);
last.OnNext("roberts");
first.OnNext(null);
Console.ReadLine();
}
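The idea of carrying the input along with its validation result, mentioned before the listing above, can be sketched without Zip by producing the pair inside a single Select. This is only a sketch, assuming the System.Reactive package is referenced; `PairedValidation`, `Validate`, `WholeNames` and `Demo` are hypothetical names, and the "at least 3 characters" rule is copied from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class PairedValidation
{
    // Each output carries the input it was computed from, so the two can
    // never get out of step, even if some inputs were dropped upstream.
    public static IObservable<(string Name, bool IsValid)> Validate(
        IObservable<string> names) =>
        names.Select(n => (Name: n, IsValid: !string.IsNullOrEmpty(n) && n.Length >= 3));

    public static IObservable<string> WholeNames(
        IObservable<string> first, IObservable<string> last) =>
        Validate(first)
            .CombineLatest(Validate(last), (f, l) => (First: f, Last: l))
            .Where(p => p.First.IsValid && p.Last.IsValid)
            .Select(p => $"{p.First.Name} {p.Last.Name}");

    // Replays the sequence of inputs used in the question.
    public static List<string> Demo()
    {
        var first = new BehaviorSubject<string>(null);
        var last = new BehaviorSubject<string>(null);
        var seen = new List<string>();

        using (WholeNames(first, last).Subscribe(s => seen.Add(s)))
        {
            first.OnNext("bob");
            last.OnNext("smith");
            last.OnNext(null);
            last.OnNext("roberts");
            first.OnNext(null);
        }
        return seen;
    }
}
```

With the inputs from the question this should yield "bob smith" and "bob roberts", the same as option 2, but it does not rely on every input producing exactly one validation result.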
Another approach here. Each value gets a version number (like a timestamp). Any time a calculated value is older than the data (or other calculated values it relies upon) we can ignore it.
public class VersionedValue {
static long _version;
public VersionedValue() { Version = Interlocked.Increment(ref _version); }
public long Version { get; }
}
public class VersionedValue<T> : VersionedValue {
public VersionedValue(T value) { Value = value; }
public T Value { get; }
public override string ToString() => $"{Value} {Version}";
}
public static class ExtensionMethods {
public static IObservable<VersionedValue<T>> Versioned<T>(this IObservable<T> values) => values.Select(i => new VersionedValue<T>(i));
public static VersionedValue<T> AsVersionedValue<T>(this T obj) => new VersionedValue<T>(obj);
}
static void Main(string[] args) {
// same as before
//
var whole = Observable
.CombineLatest(first.Versioned(), firstIsValid.Versioned(), last.Versioned(), lastIsValid.Versioned(), (f, fv, l, lv) => new {
First = f,
Last = l,
FirstIsValid = fv,
LastIsValid = lv
})
.Where(i => i.FirstIsValid.Version > i.First.Version && i.LastIsValid.Version > i.Last.Version)
.Where(i => i.FirstIsValid.Value && i.LastIsValid.Value)
.Select(i => $"{i.First.Value} {i.Last.Value}");

Related

Rx operator that starts like combineLatest but then acts like withLatestFrom

I am looking for an operator that combines two Observables by not emitting anything until both Observables have emitted an element (similar to combineLatest), but then only emits elements by combining elements from one Observable with the most recently emitted element of the other Observable (similar to withLatestFrom). The results would look like this (y observable is the "control"):
Does an operator like this exist?
I solved this in Java, but the same theory should work for you.
What you have is really two basic patterns; A combineLatest value followed by withLatestFrom values. If withLatestFrom triggers first then you want to skip the combineLatest value.
Start by making the withLatestFrom observable:
Observable<Result> wlf = o1.withLatestFrom(o2, f::apply);
Next we want to create the combineLatest observable that emits a single value. We also want to stop this observable when wlf triggers:
Observable<Result> cl = Observable.combineLatest(o1, o2, f::apply)
.take(1).takeUntil(wlf);
Finally, merge these two observables together. For convenience I made a helper method that accepts any two observables and a bi-function operator:
public static <Result,
Param1, Source1 extends Param1,
Param2, Source2 extends Param2>
Observable<Result> combineThenLatestFrom(
final Observable<Source1> o1,
final Observable<Source2> o2,
final BiFunction<Param1, Param2, Result> f
) {
final Observable<Result> base = o1
.withLatestFrom(o2, f::apply);
return Observable
.combineLatest(o1, o2, f::apply)
.take(1).takeUntil(base)
.mergeWith(base);
}
And here's the test code I used to verify the method:
public static void main(final String[] args) {
final TestScheduler scheduler = new TestScheduler();
final TestSubject<String> o1 = TestSubject.create(scheduler);
final TestSubject<String> o2 = TestSubject.create(scheduler);
final Observable<String> r = combineThenLatestFrom(o1, o2, (a, b) -> a + b);
r.subscribe(System.out::println);
o1.onNext("1");
o1.onNext("2");
o2.onNext("A");
o2.onNext("B");
o2.onNext("C");
o2.onNext("D");
o1.onNext("3");
o2.onNext("E");
scheduler.triggerActions();
}
Which outputs:
2A
3D
It ain't pretty, but this works (in C#):
var xs = new Subject<string>();
var ys = new Subject<int>();
var query =
Observable
.Merge(
xs.Select(x => new { xt = true, yt = false, x, y = default(int) }),
ys.Select(y => new { xt = false, yt = true, x = default(string), y }))
.StartWith(new { xt = false, yt = false, x = default(string), y = default(int) })
.Scan((a, b) => new
{
xt = a.xt && a.yt ? false : a.xt || b.xt,
yt = a.xt && a.yt ? false : a.yt || b.yt,
x = b.xt ? b.x : a.x,
y = b.yt ? b.y : a.y
})
.Where(z => z.xt & z.yt)
.Select(z => z.y + z.x);
query.Subscribe(v => Console.WriteLine(v));
ys.OnNext(1);
ys.OnNext(2);
xs.OnNext("A");
xs.OnNext("B");
xs.OnNext("C");
xs.OnNext("D");
ys.OnNext(3);
xs.OnNext("E");
It gives:
2A
3D

Confusion over behavior of Publish().Refcount()

I've got a simple program here that displays the number of letters in various words. It works as expected.
static void Main(string[] args) {
var word = new Subject<string>();
var wordPub = word.Publish().RefCount();
var length = word.Select(i => i.Length);
var report =
wordPub
.GroupJoin(length,
s => wordPub,
s => Observable.Empty<int>(),
(w, a) => new { Word = w, Lengths = a })
.SelectMany(i => i.Lengths.Select(j => new { Word = i.Word, Length = j }));
report.Subscribe(i => Console.WriteLine($"{i.Word} {i.Length}"));
word.OnNext("Apple");
word.OnNext("Banana");
word.OnNext("Cat");
word.OnNext("Donkey");
word.OnNext("Elephant");
word.OnNext("Zebra");
Console.ReadLine();
}
And the output is:
Apple 5
Banana 6
Cat 3
Donkey 6
Elephant 8
Zebra 5
I used Publish().RefCount() because "wordPub" is included in "report" twice. Without it, when a word is emitted, first one part of the report would get notified by a callback and then the other part, doubling the notifications. That is kind of what happens; the output ends up having 11 items rather than 6. At least that is what I think is going on. I think of using Publish().RefCount() in this situation as simultaneously updating both parts of the report.
However if I change the length function to ALSO use the published source like this:
var length = wordPub.Select(i => i.Length);
Then the output is this:
Apple 5
Apple 6
Banana 6
Cat 3
Banana 3
Cat 6
Donkey 6
Elephant 8
Donkey 8
Elephant 5
Zebra 5
Why can't the length function also use the same published source?
This was a great challenge to solve!
The conditions under which this happens are subtle.
Apologies in advance for the long explanation, but bear with me!
TL;DR
Subscriptions to the published source are processed in order, but before any other subscription directly to the unpublished source. i.e. you can jump the queue!
With GroupJoin subscription order is important to determine when windows open and close.
My first concern would be that you are publish refcounting a subject.
This should be a no-op.
Subject<T> has no subscription cost.
So when you remove the Publish().RefCount() :
var word = new Subject<string>();
var wordPub = word;//.Publish().RefCount();
var length = word.Select(i => i.Length);
then you get the same issue.
So then I look to the GroupJoin (because my intuition suggests that Publish().RefCount() is a red herring).
For me, eyeballing this alone was too hard to rationalise, so I lean on a simple debugging tool I have used dozens of times over the years: a Trace or Log extension method.
public interface ILogger
{
void Log(string input);
}
public class DumpLogger : ILogger
{
public void Log(string input)
{
//LinqPad `Dump()` extension method.
// Could use Console.Write instead.
input.Dump();
}
}
public static class ObservableLoggingExtensions
{
private static int _index = 0;
public static IObservable<T> Log<T>(this IObservable<T> source, ILogger logger, string name)
{
return Observable.Create<T>(o =>
{
var index = Interlocked.Increment(ref _index);
var label = $"{index:0000}{name}";
logger.Log($"{label}.Subscribe()");
var disposed = Disposable.Create(() => logger.Log($"{label}.Dispose()"));
var subscription = source
.Do(
x => logger.Log($"{label}.OnNext({x.ToString()})"),
ex => logger.Log($"{label}.OnError({ex})"),
() => logger.Log($"{label}.OnCompleted()")
)
.Subscribe(o);
return new CompositeDisposable(subscription, disposed);
});
}
}
When I add the logging to your provided code it looks like this:
var logger = new DumpLogger();
var word = new Subject<string>();
var wordPub = word.Publish().RefCount();
var length = word.Select(i => i.Length);
var report =
wordPub.Log(logger, "lhs")
.GroupJoin(word.Select(i => i.Length).Log(logger, "rhs"),
s => wordPub.Log(logger, "lhsDuration"),
s => Observable.Empty<int>().Log(logger, "rhsDuration"),
(w, a) => new { Word = w, Lengths = a })
.SelectMany(i => i.Lengths.Select(j => new { Word = i.Word, Length = j }));
report.Subscribe(i => ($"{i.Word} {i.Length}").Dump("OnNext"));
word.OnNext("Apple");
word.OnNext("Banana");
word.OnNext("Cat");
word.OnNext("Donkey");
word.OnNext("Elephant");
word.OnNext("Zebra");
This will then output in my log something like the following
Log with Publish().RefCount() used
0001lhs.Subscribe()
0002rhs.Subscribe()
0001lhs.OnNext(Apple)
0003lhsDuration.Subscribe()
0002rhs.OnNext(5)
0004rhsDuration.Subscribe()
0004rhsDuration.OnCompleted()
0004rhsDuration.Dispose()
OnNext
Apple 5
0001lhs.OnNext(Banana)
0005lhsDuration.Subscribe()
0003lhsDuration.OnNext(Banana)
0003lhsDuration.Dispose()
0002rhs.OnNext(6)
0006rhsDuration.Subscribe()
0006rhsDuration.OnCompleted()
0006rhsDuration.Dispose()
OnNext
Banana 6
...
However, when I remove the use of Publish().RefCount(), the new log output is as follows:
Log with only Subject (no Publish().RefCount())
0001lhs.Subscribe()
0002rhs.Subscribe()
0001lhs.OnNext(Apple)
0003lhsDuration.Subscribe()
0002rhs.OnNext(5)
0004rhsDuration.Subscribe()
0004rhsDuration.OnCompleted()
0004rhsDuration.Dispose()
OnNext
Apple 5
0001lhs.OnNext(Banana)
0005lhsDuration.Subscribe()
0002rhs.OnNext(6)
0006rhsDuration.Subscribe()
0006rhsDuration.OnCompleted()
0006rhsDuration.Dispose()
OnNext
Apple 6
OnNext
Banana 6
0003lhsDuration.OnNext(Banana)
0003lhsDuration.Dispose()
...
This gives us some insight, but the issue really becomes clear when we start annotating our logs with a logical list of subscriptions.
In the original (working) code with the RefCount our annotations might look like this
//word.Subscribers.Add(wordPub)
0001lhs.Subscribe() //wordPub.Subscribers.Add(0001lhs)
0002rhs.Subscribe() //word.Subscribers.Add(0002rhs)
0001lhs.OnNext(Apple)
0003lhsDuration.Subscribe() //wordPub.Subscribers.Add(0003lhsDuration)
0002rhs.OnNext(5)
0004rhsDuration.Subscribe()
0004rhsDuration.OnCompleted()
0004rhsDuration.Dispose()
OnNext
Apple 5
0001lhs.OnNext(Banana)
0005lhsDuration.Subscribe() //wordPub.Subscribers.Add(0005lhsDuration)
0003lhsDuration.OnNext(Banana)
0003lhsDuration.Dispose() //wordPub.Subscribers.Remove(0003lhsDuration)
0002rhs.OnNext(6)
0006rhsDuration.Subscribe()
0006rhsDuration.OnCompleted()
0006rhsDuration.Dispose()
OnNext
Banana 6
So in this example, when word.OnNext("Banana"); is executed the chain of observers is linked in this order
wordPub
0002rhs
However, wordPub has child subscriptions!
So the real subscription list looks like
wordPub
    0001lhs
    0003lhsDuration
    0005lhsDuration
0002rhs
If we annotate the Subject only log we see where the subtlety lies
0001lhs.Subscribe() //word.Subscribers.Add(0001lhs)
0002rhs.Subscribe() //word.Subscribers.Add(0002rhs)
0001lhs.OnNext(Apple)
0003lhsDuration.Subscribe() //word.Subscribers.Add(0003lhsDuration)
0002rhs.OnNext(5)
0004rhsDuration.Subscribe()
0004rhsDuration.OnCompleted()
0004rhsDuration.Dispose()
OnNext
Apple 5
0001lhs.OnNext(Banana)
0005lhsDuration.Subscribe() //word.Subscribers.Add(0005lhsDuration)
0002rhs.OnNext(6)
0006rhsDuration.Subscribe()
0006rhsDuration.OnCompleted()
0006rhsDuration.Dispose()
OnNext
Apple 6
OnNext
Banana 6
0003lhsDuration.OnNext(Banana)
0003lhsDuration.Dispose()
So in this example, when word.OnNext("Banana"); is executed the chain of observers is linked in this order
1. 0001lhs
2. 0002rhs
3. 0003lhsDuration
4. 0005lhsDuration
Because the 0003lhsDuration subscription was added after 0002rhs, it won't see the "Banana" value that terminates the window until after the rhs has been sent the value, so the rhs value is yielded into the still-open window.
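The ordering that the explanation above relies on can be checked in isolation. The following is a small sketch, assuming the System.Reactive package is referenced; `SubscriptionOrderDemo` and the log labels are hypothetical. It relies on the observed behaviour of Subject<T> in Rx.NET (observers are notified in the order they subscribed), which, as far as I know, is an implementation detail rather than a documented guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class SubscriptionOrderDemo
{
    public static List<string> Run()
    {
        var word = new Subject<string>();
        var log = new List<string>();

        // Subscribed first, so notified first.
        word.Subscribe(w => log.Add($"lhs:{w}"));
        // Subscribed second.
        word.Select(w => w.Length).Subscribe(n => log.Add($"rhs:{n}"));
        // Subscribed last, like the duration subscription in the logs above:
        // it only sees the value after the rhs has already handled it.
        word.Subscribe(w => log.Add($"lhsDuration:{w}"));

        word.OnNext("Banana");
        return log;
    }
}
```

The returned log should list the lhs entry, then the rhs entry, then the lhsDuration entry, which mirrors the "Subject only" case where the window-closing notification arrives too late.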
Whew
As @francezu13k50 points out, the obvious and simple solution to your problem is to just use word.Select(x => new { Word = x, Length = x.Length });, but as I think you have given us a simplified version of your real problem (appreciated), I understand why this isn't suitable.
However, as I don't know what your real problem space is, I am not sure what to suggest as a solution, except that you have one with your current code, and now you should know why it works the way it does.
RefCount returns an Observable that stays connected to the source as long as there is at least one subscription to the returned Observable. When the last subscription is disposed, RefCount disposes its connection to the source, and reconnects when a new subscription is made. It might be the case with your report query that all subscriptions to 'wordPub' are disposed before the query is fulfilled.
Instead of the complicated GroupJoin query you could simply do :
var report = word.Select(x => new { Word = x, Length = x.Length });
Edit:
Change your report query to this if you want to use the GroupJoin operator :
var report =
wordPub
.GroupJoin(length,
s => wordPub,
s => Observable.Empty<int>(),
(w, a) => new { Word = w, Lengths = a })
.SelectMany(i => i.Lengths.FirstAsync().Select(j => new { Word = i.Word, Length = j }));
Because GroupJoin seems to be very tricky to work with, here is another approach for correlating the inputs and outputs of functions.
static void Main(string[] args) {
var word = new Subject<string>();
var length = new Subject<int>();
var report =
word
.CombineLatest(length, (w, l) => new { Word = w, Length = l })
.Scan((a, b) => new { Word = b.Word, Length = a.Word == b.Word ? b.Length : -1 })
.Where(i => i.Length != -1);
report.Subscribe(i => Console.WriteLine($"{i.Word} {i.Length}"));
word.OnNext("Apple"); length.OnNext(5);
word.OnNext("Banana");
word.OnNext("Cat"); length.OnNext(3);
word.OnNext("Donkey");
word.OnNext("Elephant"); length.OnNext(8);
word.OnNext("Zebra"); length.OnNext(5);
Console.ReadLine();
}
This approach works if every input has 0 or more outputs subject to the constraints that (1) outputs only arrive in the same order as the inputs AND (2) each output corresponds to its most recent input. This is like a LeftJoin - each item in the first list (word) is paired with items in the right list (length) that subsequently arrive, up until another item in the first list is emitted.
Trying to use regular Join instead of GroupJoin. I thought the problem was that when a new word was created there was a race condition inside Join between creating a new window and ending the current one. So here I tried to eliminate that by pairing every word with a null signifying the end of the window. It doesn't work, just like the first version did not. How is it possible that a new window is created for each word without the previous one being closed first? Completely confused.
static void Main(string[] args) {
var lgr = new DelegateLogger(Console.WriteLine);
var word = new Subject<string>();
var wordDelimited =
word
.Select(i => Observable.Return<string>(null).StartWith(i))
.SelectMany(i => i);
var wordStart = wordDelimited.Where(i => i != null);
var wordEnd = wordDelimited.Where(i => i == null);
var report = Observable
.Join(
wordStart.Log(lgr, "word"), // starts window
wordStart.Select(i => i.Length),
s => wordEnd.Log(lgr, "expireWord"), // ends current window
s => Observable.Empty<int>(),
(l, r) => new { Word = l, Length = r });
report.Subscribe(i => Console.WriteLine($"{i.Word} {i.Length}"));
word.OnNext("Apple");
word.OnNext("Banana");
word.OnNext("Cat");
word.OnNext("Zebra");
word.OnNext("Elephant");
word.OnNext("Bear");
Console.ReadLine();
}

Merging Observables

Here we have an Observable Sequence... in .NET using Rx.
var aSource = new Subject<int>();
var bSource = new Subject<int>();
var paired = Observable
.Merge(aSource, bSource)
.GroupBy(i => i).SelectMany(g => g.Buffer(2).Take(1));
paired.Subscribe(g => Console.WriteLine("{0}:{1}", g.ElementAt(0), g.ElementAt(1)));
aSource.OnNext(4);
bSource.OnNext(1);
aSource.OnNext(2);
bSource.OnNext(5);
aSource.OnNext(3);
bSource.OnNext(3);
aSource.OnNext(5);
bSource.OnNext(2);
aSource.OnNext(1);
bSource.OnNext(4);
Output:
3:3
5:5
2:2
1:1
4:4
We will get events every time a pair of numbers arrive with the same id.
Perfect! Just what I want.
Groups of two, paired by value.
Next question....
How to get a selectmany/buffer for sequences of values.
So 1,2,3,4,5 arrives at both aSource and bSource via OnNext(). Then fire Console.WriteLine() for 1-5. Then when 2,3,4,5,6 arrives, we get another Console.WriteLine(). Any clues anyone?
Immediately, the Rx forum suggests looking at .Window()
http://introtorx.com/Content/v1.0.10621.0/17_SequencesOfCoincidence.html
Which on the surface looks perfect. In my case I need a window of 4 values.
Where in the query sequence does it belong to get this effect?
var paired = Observable.Merge(aSource, bSource).GroupBy(i => i).SelectMany(g => g.Buffer(2).Take(1));
Output
1,2,3,4,5 : 1,2,3,4,5
2,3,4,5,6 : 2,3,4,5,6
Regards,
Daniel
Assuming events arrive randomly at the sources, use my answer to "Reordering events with Reactive Extensions" to get the events in order.
Then use Observable.Buffer to create a sliding buffer:
// get this using the OrderedCollect/Sort in the referenced question
IObservable<int> orderedSource;
// then subscribe to this
orderedSource.Buffer(5, 1);
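To make the sliding buffer concrete, here is a minimal sketch, assuming the System.Reactive package is referenced and assuming the values already arrive in order. `SlidingBufferDemo` is a hypothetical name, and the `Where` filter is an addition of this sketch: it keeps only full buffers, because Buffer(count, skip) can also emit shorter trailing buffers when the source completes.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class SlidingBufferDemo
{
    public static List<string> Run()
    {
        // Stand-in for the ordered source described above.
        var orderedSource = new Subject<int>();
        var windows = new List<string>();

        using (orderedSource
            .Buffer(5, 1)                // 5 items per buffer, a new buffer every 1 item
            .Where(b => b.Count == 5)    // ignore partial buffers
            .Subscribe(b => windows.Add(string.Join(",", b))))
        {
            for (var i = 1; i <= 6; i++)
            {
                orderedSource.OnNext(i);
            }
        }
        return windows;
    }
}
```

Feeding 1 through 6 should produce the two windows asked for in the question: 1,2,3,4,5 and then 2,3,4,5,6.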
Here is an extension method that fires when it has n inputs of the same ids.
public static class RxExtension
{
public static IObservable<TSource> MergeBuffer<TSource>(this IObservable<TSource> source, Func<TSource, int> keySelector, Func<IList<TSource>,TSource> mergeFunction, int bufferCount)
{
return Observable.Create<TSource>(o => {
var buffer = new Dictionary<int, IList<TSource>>();
return source.Subscribe<TSource>(i =>
{
var index = keySelector(i);
if (buffer.ContainsKey(index))
{
buffer[index].Add(i);
}
else
{
buffer.Add(index, new List<TSource>(){i});
}
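// Fire once this key has collected bufferCount items
// (count the items for the key, not the number of keys).
if (buffer[index].Count == bufferCount)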
{
o.OnNext(mergeFunction(buffer[index]));
buffer.Remove(index);
}
});
});
}
}
Calling the extension.
mainInput = Observable.Merge(inputNodes.ToArray()).MergeBuffer<NodeData>(x => x.id, x => MergeData(x), 1);

How to write left outer join using MethodCallExpressions?

The code block below answers the question: "How do you perform a left outer join using linq extension methods?"
var qry = Foo.GroupJoin(
Bar,
foo => foo.Foo_Id,
bar => bar.Foo_Id,
(x,y) => new { Foo = x, Bars = y })
.SelectMany(
x => x.Bars.DefaultIfEmpty(),
(x,y) => new { Foo = x, Bar = y});
How do you write this GroupJoin and SelectMany as MethodCallExpressions? All of the examples that I've found are written using DynamicExpressions translating strings into lambdas (another example). I'd like to avoid taking a dependency on that library if possible.
Can the query above be written with Expressions and associated methods?
I know how to construct basic lambda expressions like foo => foo.Foo_Id using ParameterExpressions, MemberExpressions and Expression.Lambda(), but how do you construct (x,y) => new { Foo = x, Bars = y } so that you have the necessary parameters to create both calls?
MethodCallExpression groupJoinCall =
Expression.Call(
typeof(Queryable),
"GroupJoin",
new Type[] {
typeof(Customers),
typeof(Purchases),
outerSelectorLambda.Body.Type,
resultsSelectorLambda.Body.Type
},
c.Expression,
p.Expression,
Expression.Quote(outerSelectorLambda),
Expression.Quote(innerSelectorLambda),
Expression.Quote(resultsSelectorLambda)
);
MethodCallExpression selectManyCall =
Expression.Call(typeof(Queryable),
"SelectMany", new Type[] {
groupJoinCall.ElementType,
resultType,
resultsSelectorLambda.Body.Type
}, groupJoinCall.Expression, Expression.Quote(lambda),
Expression.Quote(resultsSelectorLambda));
Ultimately, I need to create a repeatable process that will left join n Bars to Foo. Because we have a vertical data structure, a left-joined query is required to return what is represented as Bars, to allow the user to sort Foo. The requirement is to allow the user to sort by 10 Bars, but I don't expect them to ever use more than three. I tried writing a process that chained the code in the first block above up to 10 times, but once I got past 5, Visual Studio 2012 started to slow down, and around 7 it locked up.
Therefore, I'm now trying to write a method that returns the selectManyCall and calls itself recursively as many times as is requested by the user.
Based upon the query below, which works in LinqPad, the process that needs to be repeated only requires manually handling the transparent identifiers in Expression objects. The query returns Foos sorted by Bars (3 Bars in this case).
A side note. This process is significantly easier doing the join in the OrderBy delegate, however, the query it produces includes the T-SQL "OUTER APPLY", which isn't supported by Oracle which is required.
I'm grateful for any ideas on how to write the projection to anonymous type or any other out-of-the-box idea that may work. Thank you.
var q = Foos
.GroupJoin (
Bars,
g => g.FooID,
sv => sv.FooID,
(g, v) =>
new
{
g = g,
v = v
}
)
.SelectMany (
s => s.v.DefaultIfEmpty (),
(s, v) =>
new
{
s = s,
v = v
}
)
.GroupJoin (
Bars,
g => g.s.g.FooID,
sv => sv.FooID,
(g, v) =>
new
{
g = g,
v = v
}
)
.SelectMany (
s => s.v.DefaultIfEmpty (),
(s, v) =>
new
{
s = s,
v = v
}
)
.GroupJoin (
Bars,
g => g.s.g.s.g.FooID,
sv => sv.FooID,
(g, v) =>
new
{
g = g,
v = v
}
)
.SelectMany (
s => s.v.DefaultIfEmpty (),
(s, v) =>
new
{
s = s,
v = v
}
)
.OrderBy (a => a.s.g.s.g.v.Text)
.ThenBy (a => a.s.g.v.Text)
.ThenByDescending (a => a.v.Date)
.Select (a => a.s.g.s.g.s.g);
If you're having trouble figuring out how to generate the expressions, you could always get an assist from the compiler. What you could do is declare a lambda expression with the types you are going to query with and write the lambda. The compiler will generate the expression for you and you can examine it to see what expressions make up the expression tree.
e.g., your expression is equivalent to this using the query syntax (or you could use the method call syntax if you prefer)
Expression<Func<IQueryable<Foo>, IQueryable<Bar>, IQueryable>> expr =
(Foo, Bar) =>
from foo in Foo
join bar in Bar on foo.Foo_Id equals bar.Foo_Id into bars
from bar in bars.DefaultIfEmpty()
select new
{
Foo = foo,
Bar = bar,
};
To answer your question, you can't really generate an expression that creates an anonymous object, the actual type isn't known at compile time. You can cheat kinda by creating a dummy object and use GetType() to get its type which you could then use to create the appropriate new expression, but that's more of a dirty hack and I wouldn't recommend doing this. Doing so, you won't be able to generate strongly typed expressions since you don't know the type of the anonymous type.
e.g.,
var dummyType = new
{
foo = default(Foo),
bars = default(IQueryable<Bar>),
}.GetType();
var fooExpr = Expression.Parameter(typeof(Foo), "foo");
var barsExpr = Expression.Parameter(typeof(IQueryable<Bar>), "bars");
var fooProp = dummyType.GetProperty("foo");
var barsProp = dummyType.GetProperty("bars");
var ctor = dummyType.GetConstructor(new Type[]
{
fooProp.PropertyType,
barsProp.PropertyType,
});
var newExpr = Expression.New(
ctor,
new Expression[] { fooExpr, barsExpr },
new MemberInfo[] { fooProp, barsProp }
);
// the expression type is unknown, just some lambda
var lambda = Expression.Lambda(newExpr, fooExpr, barsExpr);
Whenever you need to generate an expression that involves an anonymous object, the right thing to do would be to create a known type and use that in place of the anonymous type. It will have limited use, yes, but it's a much cleaner way to handle such a situation. Then at least you'll be able to get the type at compile time.
// use this type instead of the anonymous one
public class Dummy
{
public Foo foo { get; set; }
public IQueryable<Bar> bars { get; set; }
}
var dummyType = typeof(Dummy);
var fooExpr = Expression.Parameter(typeof(Foo), "foo");
var barsExpr = Expression.Parameter(typeof(IQueryable<Bar>), "bars");
var fooProp = dummyType.GetProperty("foo");
var barsProp = dummyType.GetProperty("bars");
var ctor = dummyType.GetConstructor(Type.EmptyTypes);
var newExpr = Expression.MemberInit(
Expression.New(ctor),
Expression.Bind(fooProp, fooExpr),
Expression.Bind(barsProp, barsExpr)
);
// lambda's type is known at compile time now
var lambda = Expression.Lambda<Func<Foo, IQueryable<Bar>, Dummy>>(
newExpr,
fooExpr,
barsExpr);
Or, instead of creating and using a dummy type, you might be able to use tuples in your expressions instead.
static Expression<Func<T1, T2, Tuple<T1, T2>>> GetExpression<T1, T2>()
{
var type1 = typeof(T1);
var type2 = typeof(T2);
var tupleType = typeof(Tuple<T1, T2>);
var arg1Expr = Expression.Parameter(type1, "arg1");
var arg2Expr = Expression.Parameter(type2, "arg2");
var arg1Prop = tupleType.GetProperty("Item1");
var arg2Prop = tupleType.GetProperty("Item2");
var ctor = tupleType.GetConstructor(new Type[]
{
arg1Prop.PropertyType,
arg2Prop.PropertyType,
});
var newExpr = Expression.New(
ctor,
new Expression[] { arg1Expr, arg2Expr },
new MemberInfo[] { arg1Prop, arg2Prop }
);
// lambda's type is known at compile time now
var lambda = Expression.Lambda<Func<T1, T2, Tuple<T1, T2>>>(
newExpr,
arg1Expr,
arg2Expr);
return lambda;
}
Then to use it:
var expr = GetExpression<Foo, IQueryable<Bar>>();
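Putting the tuple suggestion together with the original question, the whole GroupJoin + SelectMany left join can be assembled from MethodCallExpressions. The following is a sketch under stated assumptions, not a definitive implementation: the `Foo` and `Bar` classes (including the `Name` and `Text` properties), `LeftJoinBuilder`, `LeftJoin` and `Demo` are hypothetical, and it has only been reasoned through against LINQ to Objects via AsQueryable(), not against a database provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical entity types standing in for the real ones.
public class Foo { public int Foo_Id { get; set; } public string Name { get; set; } }
public class Bar { public int Foo_Id { get; set; } public string Text { get; set; } }

public static class LeftJoinBuilder
{
    public static IQueryable<Tuple<Foo, Bar>> LeftJoin(IQueryable<Foo> foos, IQueryable<Bar> bars)
    {
        var groupType = typeof(Tuple<Foo, IEnumerable<Bar>>);
        var resultType = typeof(Tuple<Foo, Bar>);

        // foo => foo.Foo_Id  and  bar => bar.Foo_Id
        var foo = Expression.Parameter(typeof(Foo), "foo");
        var outerKey = Expression.Lambda<Func<Foo, int>>(Expression.Property(foo, "Foo_Id"), foo);
        var bar = Expression.Parameter(typeof(Bar), "bar");
        var innerKey = Expression.Lambda<Func<Bar, int>>(Expression.Property(bar, "Foo_Id"), bar);

        // (x, ys) => new Tuple<Foo, IEnumerable<Bar>>(x, ys)
        var x = Expression.Parameter(typeof(Foo), "x");
        var ys = Expression.Parameter(typeof(IEnumerable<Bar>), "ys");
        var groupCtor = groupType.GetConstructor(new[] { typeof(Foo), typeof(IEnumerable<Bar>) });
        var groupSelector = Expression.Lambda<Func<Foo, IEnumerable<Bar>, Tuple<Foo, IEnumerable<Bar>>>>(
            Expression.New(groupCtor, x, ys), x, ys);

        var groupJoin = Expression.Call(
            typeof(Queryable), "GroupJoin",
            new[] { typeof(Foo), typeof(Bar), typeof(int), groupType },
            foos.Expression, bars.Expression,
            Expression.Quote(outerKey), Expression.Quote(innerKey), Expression.Quote(groupSelector));

        // g => g.Item2.DefaultIfEmpty()
        var g = Expression.Parameter(groupType, "g");
        var collectionSelector = Expression.Lambda<Func<Tuple<Foo, IEnumerable<Bar>>, IEnumerable<Bar>>>(
            Expression.Call(typeof(Enumerable), "DefaultIfEmpty", new[] { typeof(Bar) },
                Expression.Property(g, "Item2")),
            g);

        // (g2, b) => new Tuple<Foo, Bar>(g2.Item1, b)
        var g2 = Expression.Parameter(groupType, "g2");
        var b = Expression.Parameter(typeof(Bar), "b");
        var resultCtor = resultType.GetConstructor(new[] { typeof(Foo), typeof(Bar) });
        var resultSelector = Expression.Lambda<Func<Tuple<Foo, IEnumerable<Bar>>, Bar, Tuple<Foo, Bar>>>(
            Expression.New(resultCtor, Expression.Property(g2, "Item1"), b), g2, b);

        var selectMany = Expression.Call(
            typeof(Queryable), "SelectMany",
            new[] { groupType, typeof(Bar), resultType },
            groupJoin, Expression.Quote(collectionSelector), Expression.Quote(resultSelector));

        return foos.Provider.CreateQuery<Tuple<Foo, Bar>>(selectMany);
    }

    // Runs the join over in-memory data; a Foo without Bars is paired with null.
    public static List<string> Demo()
    {
        var foos = new[] { new Foo { Foo_Id = 1, Name = "a" }, new Foo { Foo_Id = 2, Name = "b" } };
        var bars = new[] { new Bar { Foo_Id = 1, Text = "p" }, new Bar { Foo_Id = 1, Text = "q" } };
        return LeftJoin(foos.AsQueryable(), bars.AsQueryable())
            .AsEnumerable()
            .Select(t => t.Item1.Name + ":" + (t.Item2 == null ? "(none)" : t.Item2.Text))
            .ToList();
    }
}
```

Because the result of each step is a named tuple type rather than an anonymous type, the output of `LeftJoin` could in principle be fed into another GroupJoin built the same way, which is the repeatable step the question is after. Whether a given provider translates Tuple constructors to SQL is a separate question that this sketch does not answer.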

Reactive Framework / DoubleClick

I know that there is an easy way to do this - but it has beaten me tonight ...
I want to know if two events occur within 300 milliseconds of each other, as in a double click.
Two left-button mouse-down clicks in 300 milliseconds. I know this is what the reactive framework was built for, but damn if I can find a good doc that has simple examples for all the extension operators (Throttle, BufferWithCount, BufferWithTime), all of which just weren't doing it for me...
The TimeInterval method will give you the time between values.
public static IObservable<TSource> DoubleClicks<TSource>(
this IObservable<TSource> source, TimeSpan doubleClickSpeed, IScheduler scheduler)
{
return source
.TimeInterval(scheduler)
.Skip(1)
.Where(interval => interval.Interval <= doubleClickSpeed)
.RemoveTimeInterval();
}
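For readers on a current System.Reactive release, here is a hedged variant of the same TimeInterval idea. RemoveTimeInterval appears to come from an early Rx build, so this sketch swaps it for a plain Select that yields Unit. `DoubleClickSketch` and `CountDoubleClicks` are hypothetical names, and the use of HistoricalScheduler to control virtual time in the demo is a choice made for this sketch.

```csharp
using System;
using System.Reactive;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Subjects;

public static class DoubleClickSketch
{
    // Emits Unit whenever two consecutive source values arrive within
    // doubleClickSpeed of each other, as measured by the given scheduler.
    public static IObservable<Unit> DoubleClicks<TSource>(
        this IObservable<TSource> source, TimeSpan doubleClickSpeed, IScheduler scheduler) =>
        source
            .TimeInterval(scheduler)
            .Skip(1) // the first interval is measured from subscription, not from a click
            .Where(t => t.Interval <= doubleClickSpeed)
            .Select(_ => Unit.Default);

    // Simulates clicks at 0 ms, 100 ms, 600 ms and 650 ms of virtual time.
    public static int CountDoubleClicks()
    {
        var scheduler = new HistoricalScheduler();
        var clicks = new Subject<Unit>();
        var count = 0;

        using (clicks.DoubleClicks(TimeSpan.FromMilliseconds(300), scheduler)
            .Subscribe(_ => count++))
        {
            clicks.OnNext(Unit.Default);
            scheduler.AdvanceBy(TimeSpan.FromMilliseconds(100));
            clicks.OnNext(Unit.Default); // 100 ms after the previous click
            scheduler.AdvanceBy(TimeSpan.FromMilliseconds(500));
            clicks.OnNext(Unit.Default); // 500 ms after the previous click
            scheduler.AdvanceBy(TimeSpan.FromMilliseconds(50));
            clicks.OnNext(Unit.Default); // 50 ms after the previous click
        }
        return count;
    }
}
```

In this scenario the gaps of 100 ms and 50 ms fall inside the 300 ms threshold and the 500 ms gap does not, so the count should be 2. Like the version above, this treats three fast clicks as two double clicks.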
If you want to be sure that triple clicks don't trigger values, you could just use Repeat on a hot observable (I've used a FastSubject here as the clicks will all come on one thread and therefore don't require the heaviness of the normal Subjects):
public static IObservable<TSource> DoubleClicks<TSource>(
this IObservable<TSource> source, TimeSpan doubleClickSpeed, IScheduler scheduler)
{
return source.Multicast<TSource, TSource, TSource>(
() => new FastSubject<TSource>(), // events won't be multithreaded
values =>
{
return values
.TimeInterval(scheduler)
.Skip(1)
.Where(interval => interval.Interval <= doubleClickSpeed)
.RemoveTimeInterval()
.Take(1)
.Repeat();
});
}
Edit - Use TimeInterval() instead.
The Zip() and Timestamp() operators might be a good start.
var ioClicks = Observable.FromEvent<MouseButtonEventHandler, RoutedEventArgs>(
h => new MouseButtonEventHandler(h),
h => btn.MouseLeftButtonDown += h,
h => btn.MouseLeftButtonDown -= h);
var ioTSClicks = ioClicks.Timestamp();
var iodblClicks = ioTSClicks.Zip(ioTSClicks.Skip(1),
(r, l) => l.Timestamp - r.Timestamp)
.Where(tspan => tspan.TotalMilliseconds < 300);
Probably best to test this via the test scheduler, so you know exactly what you're getting:
[Fact]
public void DblClick()
{
// setup
var ioClicks = _scheduler.CreateHotObservable(
OnNext(210, "click"),
OnNext(220, "click"),
OnNext(300, "click"),
OnNext(365, "click"))
.Timestamp(_scheduler);
// act
Func<IObservable<TimeSpan>> target =
() => ioClicks.Zip(ioClicks.Skip(1),
(r, l) => l.Timestamp - r.Timestamp)
.Where(tspan => tspan.Ticks < 30);
var actuals = _scheduler.Run(target);
// assert
Assert.Equal(1, actuals.Count());
// + more
}
public static Recorded<Notification<T>> OnNext<T>(long ticks, T value)
{
return new Recorded<Notification<T>>(
ticks,
new Notification<T>.OnNext(value));
}