I see that the question of how to set the command timeout for long-running queries with Entity Framework has been asked several times. My problem is different: the query that gets run against the server doesn't really take that long to execute and return.
Here's the code that runs the query:
var records = (from c in _context.Set<CompletedQuiz>()
where c.FarmId == _entityId && c.ToolboxId == _toolboxId
group c by new { c.UserId, c.LessonId } into g
select g).ToList()
.Select(c => new {
UserId = c.Key.UserId,
LessonId = c.Key.LessonId,
NumQuestions = c.Max(n => n.TotalNumQuestions),
NumLessons = c.Select(l => l.LessonId).Distinct().Count(),
Start = c.Min(s => s.LogonDateTime),
End = c.Max(e => e.LogoffDateTime),
MaxScore = c.Max(s => s.Score),
Passed = c.Any(p => p.Passed)
});
I'm selecting from a fairly simple view called CompletedQuizzes, and grouping on the record IDs for users and lessons. I ran this with SQL Profiler capturing the actual query that's executed; if I run that exact same query in SSMS, it returns almost instantly (in under a second). However, running it from my application will often exceed the default command timeout of 30 seconds. I put a breakpoint on the line that's shown above, and I added the call to .ToList() to make sure that the query is executed immediately.
What else should I be checking as a possible culprit here?
EDIT:
I still don't understand why the code above takes so long to execute, but I reworked it using LINQ extension methods, and now it runs as fast as I would expect. Here's what it looks like now:
var records = _context.Set<CompletedQuiz>()
.Where(c => c.FarmId == _entityId && c.ToolboxId == _toolboxId)
.GroupBy(c => new { c.UserId, c.LessonId })
.Select(c => new {
UserId = c.Key.UserId,
LessonId = c.Key.LessonId,
NumQuestions = c.Max(n => n.TotalNumQuestions),
NumLessons = c.Select(l => l.LessonId).Distinct().Count(),
Start = c.Min(s => s.LogonDateTime),
End = c.Max(e => e.LogoffDateTime),
MaxScore = c.Max(s => s.Score),
Passed = c.Any(p => p.Passed)
});
I guess at this point I would adjust my question to why is the query generated by the second block of code executed so much more quickly from my application?
I think it is quite obvious: your first query calls .ToList() before the projection. That means you load every CompletedQuiz instance satisfying your condition into your application and execute all the aggregations in memory. If those aggregations touch navigation properties, it also means a lot of follow-up queries to lazy load the relations.
In your second query there is no ToList before the projection, so the whole query, including all the aggregations, is performed in the database.
I am trying to loop over inputs and process them to produce scores.
Just for the first input, I want to do some processing that takes a while.
The function ends up returning just the values from the 'else' part. The 'if' part is done executing after the function returns the value.
I am new to Scala and understand the behavior, but I am not sure how to fix it.
I've tried inputs.zipWithIndex.map instead of foreach but the result is the same.
def getscores(
inputs: inputs
): Future[Seq[scoreInfo]] = {
var scores: Seq[scoreInfo] = Seq()
inputs.zipWithIndex.foreach {
case (f, i) => {
if (i == 0) {
// long operation that returns Future[Option[scoreInfo]]
getgeoscore(f).foreach(gso => {
gso.foreach(score => {
scores = scores.:+(score)
})
})
} else {
scores = scores.:+(
scoreInfo(
id = "",
score = 5
)
)
}
}
}
Future {
scores
}
}
For what you need, I would drop the mutable variable and replace foreach with map to obtain an immutable list of Futures and recover to handle exceptions, followed by a sequence like below:
def getScores(inputs: Inputs): Future[List[ScoreInfo]] = Future.sequence(
  inputs.zipWithIndex.map { case (input, idx) =>
    if (idx == 0)
      getGeoScore(input).map(_.getOrElse(defaultScore)).recover { case e => errorHandling(e) }
    else
      Future.successful(ScoreInfo("", 5))
  })
To capture/print the result, one way is to use onComplete:
getScores(inputs).onComplete(println)
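For anyone who wants to try this outside the original project, here is a minimal self-contained sketch of the same idea. The names `ScoreInfo`, `defaultScore` and the stubbed `getGeoScore` are assumptions made up for illustration (the question doesn't show the real definitions), and the inputs are plain strings rather than the asker's `Inputs` type.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceSketch {
  // Hypothetical stand-ins for the types and helpers in the question.
  final case class ScoreInfo(id: String, score: Int)
  val defaultScore: ScoreInfo = ScoreInfo("default", 0)

  // Stub for the long-running call; a real one would do I/O.
  def getGeoScore(input: String): Future[Option[ScoreInfo]] =
    Future(Some(ScoreInfo(input, 9)))

  // Build one Future per input, then turn List[Future[_]] into Future[List[_]].
  def getScores(inputs: List[String]): Future[List[ScoreInfo]] =
    Future.sequence(
      inputs.zipWithIndex.map { case (input, idx) =>
        if (idx == 0)
          getGeoScore(input)
            .map(_.getOrElse(defaultScore))
            .recover { case _ => defaultScore } // fall back if the call fails
        else
          Future.successful(ScoreInfo("", 5))
      }
    )

  def main(args: Array[String]): Unit = {
    // Blocking here is only so the demo prints before the JVM exits.
    val scores = Await.result(getScores(List("a", "b", "c")), 5.seconds)
    println(scores)
  }
}
```

Future.sequence keeps the results in the same order as the inputs, so the slow first element still ends up at the head of the list even though the other futures are already complete.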
The part you're missing is a tricky element of concurrency: the order of execution when using multiple futures is not guaranteed.
If your block here is long running, it will take a while before appending the score to scores
// long operation that returns Future[Option[scoreInfo]]
getgeoscore(f).foreach(gso => {
gso.foreach(score => {
// stick a println("here") in here to see what happens, for demonstration purposes only
scores = scores.:+(score)
})
})
Since that executes concurrently, your getscores function will simultaneously continue its work iterating over the rest of inputs in your zipWithIndex. This iteration, especially since it's trivial work, likely finishes well before the long-running getgeoscore(f) completes the Future it scheduled, and the code will exit the function, moving on to whatever code comes next after your call to getscores:
import scala.util.{Failure, Success}
val futureScores: Future[Seq[scoreInfo]] = getscores(inputs)
futureScores.onComplete {
  case Success(scoreInfoSeq) => println(s"Here's the scores: ${scoreInfoSeq.mkString(",")}")
  case Failure(e) => println(s"getscores failed: $e")
}
// at this point the call to getgeoscore(f) could still be running and finish later, but you will never know
doSomeOtherWork()
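To see the effect without relying on timing, here is a small sketch that uses a Promise to control exactly when the "long operation" finishes. Everything in it (`LostUpdateDemo`, the `Int` scores, the values 5 and 42) is made up for the demonstration; it only mirrors the shape of the code in the question.

```scala
import scala.concurrent.{Await, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object LostUpdateDemo {
  // Returns (what the function would have returned, what the var holds later).
  def run(): (Seq[Int], Seq[Int]) = {
    val slow = Promise[Int]()          // stands in for the long-running call
    var scores: Seq[Int] = Seq()

    // Same shape as getgeoscore(f).foreach(...): append when the future completes.
    val appended = slow.future.map { s => scores = scores :+ s }

    scores = scores :+ 5               // the cheap 'else' branch runs right away
    val snapshot = scores              // this is what Future { scores } would capture

    slow.success(42)                   // the long operation finishes only now
    Await.result(appended, 2.seconds)  // wait for the late append, for the demo only
    (snapshot, scores)
  }

  def main(args: Array[String]): Unit = {
    val (returned, later) = run()
    println(s"returned: $returned, later: $later")
  }
}
```

The snapshot contains only the cheap score, while the var picks up the slow score afterwards, which is the same symptom as in the question: the value was returned before the callback had a chance to run.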
Now to clean this up: since you can run zipWithIndex on your inputs parameter, I assume it's something like inputs: Seq[Input]. If all you want to do is operate on the first input, then use the head function to retrieve only the first element, so getgeoscore(inputs.head); you don't need the rest of the code you have there.
Also, as a note, if you are using Scala, get out of the habit of using mutable vars, especially if you're working with concurrency. Scala is built around supporting immutability, so if you find yourself wanting to use a var, try using a val and look up how to work with Scala's collection library to make it work.
In general, that is, when you have several concurrent futures, I would say Leo's answer describes the right way to do it. However, you want only the first element transformed by a long-running operation. So you can take the future returned by that function and append the other elements when the long-running call returns, by mapping the future's result:
def getscores(inputs: Inputs): Future[Seq[ScoreInfo]] =
  getgeoscore(inputs.head)
    .map { optInfo =>
      // toSeq turns the Option into a Seq so the result type is Seq[ScoreInfo]
      optInfo.toSeq ++ inputs.tail.map(_ => ScoreInfo(id = "", score = 5))
    }
So you neither need zipWithIndex nor do you need an additional future or join the results of several futures with sequence. Mapping the future just gives you a new future with the result transformed by the function passed to .map().
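A runnable variant of that idea might look like the sketch below. Again `ScoreInfo` and the stubbed `getGeoScore` are assumptions for illustration, the inputs are plain strings, and a pattern match is used instead of head/tail so that an empty input doesn't throw.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object MapFirstSketch {
  // Hypothetical stand-ins for the types and helpers in the question.
  final case class ScoreInfo(id: String, score: Int)

  // Stub for the long-running call.
  def getGeoScore(input: String): Future[Option[ScoreInfo]] =
    Future(Some(ScoreInfo(input, 9)))

  def getScores(inputs: Seq[String]): Future[Seq[ScoreInfo]] =
    inputs match {
      case Seq() =>
        Future.successful(Seq.empty)   // nothing to score
      case first +: rest =>
        // Only one future is involved; the cheap scores are appended
        // inside map, once the slow result is available.
        getGeoScore(first).map { optInfo =>
          optInfo.toSeq ++ rest.map(_ => ScoreInfo("", 5))
        }
    }

  def main(args: Array[String]): Unit =
    println(Await.result(getScores(Seq("a", "b", "c")), 5.seconds))
}
```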
I am failing to translate a SQL query into a LINQ query that calculates some stock.
This is my test query that I'm trying to convert to a linq query.
SELECT
i.*,
(SELECT COUNT(t.*) FROM tickets t
WHERE t.starttime::time = i.sessionstarttime::time
AND t.starttime::date = '2018-04-06'::date)
as stock
FROM items I
-- note that the hardcoded date ('2018-04-06') is a function parameter
( tl;dr; how would you convert this PostgreSQL query to LINQ? )
My attempts so far are the variations of the following query:
var items = await _context.Items.Select(x => new Item
{
Id = x.Id,
IsTicket = x.IsTicket,
Name = x.Name,
Price = x.Price,
SaleItems = x.SaleItems,
SessionStartTime = x.SessionStartTime,
DateCreated = x.DateCreated,
DateEdit = x.DateEdit,
UserIdCreated = x.UserIdCreated,
UserIdEdited = x.UserIdEdited,
// calculate stock in subquery
Stock = _context.Tickets.Count(
t => t.StartTime.Date == ticketDate
&& x.SessionStartTime.HasValue
&& t.StartTime.Hour == x.SessionStartTime.Value.Hours // this is the part that is failing
&& t.State != TicketState.Canceled)
}).ToListAsync();
t.StartTime is a DateTime and x.SessionStartTime is a Nullable<TimeSpan>.
So when I comment the line && t.StartTime.Hour == x.SessionStartTime.Value.Hours everything is fine, but with it I get warnings that it could not be translated and will be evaluated locally. But I don't want to download the whole ticket table just to count them.
The t.StartTime.Hour part is fine, I tried to perform static comparisons with both parameters. t.StartTime.Hour == 5 was translated without any problems, but x.SessionStartTime.Value.Hours == 5 failed to translate.
Also the problematic part in the application output:
([t].StartTime.Hour == Convert([x].SessionStartTime, TimeSpan).Hours))
So I guess that convert part is failing.
So what am I missing, and how could I work around this problem? Any help will be appreciated.
Update:
After experimenting a bit I have found two workarounds, which I wouldn't call answers.
First I noticed that EF is trying to convert Nullable<TimeSpan> to a regular TimeSpan from the mentioned output: ([t].StartTime.Hour == Convert([x].SessionStartTime, TimeSpan).Hours))
I thought I could prevent that conversion by converting to a string and comparing the strings (I have a feeling this will bite me in the future):
t.StartTime.ToString().Contains(x.SessionStartTime.ToString())
The second workaround is only viable for my scenario since I know the items query is final and I can materialise it without calculated Stock, and then loop through the results and calculate it on a separate query. But this seems to add additional calls to the database and sacrifice some performance.
foreach(var x in items.Where(x=>x.SessionStartTime.HasValue))
{
// accessing the t.StartTime.TimeOfDay property seems to fail the LINQ to SQL as well
var hours = x.SessionStartTime.Value.Hours;
var minutes = x.SessionStartTime.Value.Minutes;
x.Stock = _context.Tickets.Count(t => t.StartTime.Date == ticketDate
&& t.StartTime.Hour == hours
&& t.StartTime.Minute == minutes);
}
To load a machine's parts from the database into the machine's list of parts, I execute this:
Machine ma = new Machine();
ma = dbcontext.Machine.Where(s => s.Guid == guid).ToList()[0];
IQueryable<Part> PartsQuery = from m in db.Machines
where m.Guid == guid
from p in m.Parts
select p;
ma.parts.AddRange(PartsQuery.ToList());
I end up with twice as many parts in the machine's parts list as are actually in the database!
If I do this instead of the last line:
List<Part> partsFromDb = PartsQuery.ToList();
ma.parts.AddRange(partsFromDb);
the amount of parts in the ma.parts list is correct. Can someone explain that to me please?
You can achieve what you are trying to do in one round trip to your database:
Machine ma = context.Machine.Include(m => m.Parts).FirstOrDefault(m => m.Guid == guid);
As for your issue, it is probably due to EF's caching of tracked entities, and lazy loading may be involved too. I don't know how you are testing your code, but try the following:
Machine ma = context.Machine.FirstOrDefault(m=> m.Guid == guid);
IQueryable<Part> PartsQuery = from m in db.Machines
where m.Guid == guid
from p in m.Parts
select p;
PartsQuery.ToList(); //materialize your query but don't save the result;
var parts=ma.parts;// now take a look here and you will see the related parts were loaded
That should be the reason why the data is duplicated: when you materialize your query and later consult the navigation property (ma.parts), the related entities are already there. In any case, the best way to get what you need is the query I showed at the beginning of my answer.
Machine ma = new Machine();
ma = dbcontext.Machine.Where(s => s.Guid == guid).ToList()[0];
IQueryable<Part> PartsQuery = from m in db.Machines
where m.Guid == guid
from p in m.Parts
select p;
ma.parts.AddRange(PartsQuery.ToList());
Is 100% equivalent to:
// 1. Find and retrieve the first machine with the given GUID
Machine machine = dbcontext.Machine.First(s => s.Guid == guid);
// 2. Again, find and retrieve the machines with the given GUID, select the parts of each machine that matches and flatten it down to a single list.
IList<Part> machineParts = db.Machines
.Where(m => m.Guid == guid)
.SelectMany(m => m.Parts)
.ToList();
// 3. Add.. all of the parts to that machine again?
machine.parts.AddRange(machineParts);
So it makes sense that you end up with double the parts inside the retrieved machine.
To be honest, I don't believe that the last change you speak about, i.e. capturing the 'PartsQuery' into a temporary variable, makes any difference with regards to the end result of your machine.
Something else must be going on there.
For some reason I am unable to use Select() after a Skip()/Take() unless I do this in a certain way. The following code works and allows me to use result as part of a sub query.
var query = QueryOver.Of<MyType>();
query.Skip(1);
var result = query.Select(myType => myType.Id);
However, if I attempt to create the query on one line as below I can't compile.
var query = QueryOver.Of<MyType>().Skip(1);
var result = query.Select(myType => myType.Id);
It looks like the first version leaves query typed as QueryOver<MyType, MyType>, while the second leaves it typed as QueryOver<MyType>.
It also works if written like this.
var query = QueryOver.Of<MyType>().Select(myType => myType.Id).Skip(1);
Any ideas why the second version fails horribly when the first and third versions work? It seems like odd behavior.
You have a typo in the second version...
var query = QueryOver.Of<MyType().Skip(1);
is missing the >
var query = QueryOver.Of<MyType>().Skip(1);
Not sure if that's what you were looking for.
Let me explain what I want to achieve first.
Let's say I have the following data coming in from the event stream:
var data = new string[] {
"hello",
"Using",
"ok:michael",
"ok",
"begin:events",
"1:232",
"2:343",
"end:events",
"error:dfljsdf",
"fdl",
"error:fjkdjslf",
"ok"
};
When I subscribe to the data source, I would like to get the following result:
"ok:michael"
"ok"
"begin:events 1:232 2:343 end:events"
"error:dfljsdf"
"error:fjkdjslf"
"ok"
Basically, I want to get any data that starts with ok or error, plus the data between begin and end joined into a single value.
I have tried this so far..
var data = new string[] {
"hello",
"Using",
"ok:michael",
"ok",
"begin:events",
"1:232",
"2:343",
"end:events",
"error:dfljsdf",
"fdl",
"error:fjkdjslf",
"ok"
};
var dataStream = Observable.Generate(
data.GetEnumerator(),
e => e.MoveNext(),
e => e,
e => e.Current.ToString(),
e => TimeSpan.FromSeconds(0.1));
var onelineStream = from d in dataStream
where d.StartsWith("ok") || d.StartsWith("error")
select d;
// ???
// may be need to buffer? I want to get data like "begin:events 1:232 2:343 end:events"
// but it is not working...
var multiLineStream = from list in dataStream.Buffer<string, string, string>(
bufferOpenings: dataStream.Where(d => d.StartsWith("begin")),
bufferClosingSelector: b => dataStream.Where(d => d.StartsWith("end")))
select String.Join(" ", list);
// merge two stream????
// but I have no clue how to merge these twos :(
mergeStream .Subscribe(d =>
{
Console.WriteLine(d);
Console.WriteLine();
});
Since I'm very new to reactive programming, I can't get myself to think in a reactive way. :(
Thanks in advance.
You were so, so very close to the right answer!
Essentially you had the onelineStream & multiLineStream queries just about right.
Merging them together is very easy. Just do this:
onelineStream.Merge(multiLineStream)
However, where your queries fell short was in the Observable.Generate that you used to introduce the delay between values. This creates an observable that, if you have multiple subscribers, kind of "fans out" the values.
Given your data and your definition for dataStream, look at how this code behaves:
dataStream.Select(x => "!" + x).Subscribe(Console.WriteLine);
dataStream.Select(x => "#" + x).Subscribe(Console.WriteLine);
You get these values:
!hello
#Using
!ok:michael
#ok
#1:232
!begin:events
#2:343
!end:events
!fdl
#error:dfljsdf
!error:fjkdjslf
#ok
Notice that some got handled by one subscription and the others got handled by the other. This means that even though your onelineStream & multiLineStream queries were just about right they would only see some of the data each and thus not behave as you expect.
You can also get race conditions that can skip and duplicate values. So it's best to avoid this kind of observable.
A better approach to introduce a delay between values is to do this:
var dataStream = data.ToObservable().Do(_ => Thread.Sleep(100));
Now this creates a "cold" observable, meaning that every new subscriber will get a fresh subscription of the observable so starting from the first value.
Your multiLineStream query will not work correctly on a cold observable.
To make the data stream a "hot" observable (which shares values amongst the subscribers) we use the Publish operator.
So, multiLineStream now looks like this:
var multiLineStream =
dataStream.Publish(ds =>
from list in ds.Buffer(
ds.Where(d => d.StartsWith("begin")),
b => ds.Where(d => d.StartsWith("end")))
select String.Join(" ", list));
You can then get your results like so:
onelineStream.Merge(multiLineStream).Subscribe(d =>
{
Console.WriteLine(d);
Console.WriteLine();
});
This is what I got:
ok:michael
ok
begin:events 1:232 2:343 end:events
error:dfljsdf
error:fjkdjslf
ok
Let me know if that works for you.