Join and group by in LINQ - entity-framework

Here are the entities and the table structure:
class Person { int PersonId; string PersonName; }
class Report { int ReportId; DateTime ReportTime; int PersonId; }
Table: Persons
----------------------------
| PersonId | PersonName |
----------------------------
| 1 | Abc |
----------------------------
| 2 | Xyz |
----------------------------
Table: Reports
----------------------------------------------
| ReportId | ReportTime | PersonId |
----------------------------------------------
| 10 | 2017-02-27 11:12 | 1 |
----------------------------------------------
| 14 | 2017-02-27 15:23 | 1 |
----------------------------------------------
I want to select data as follows (PersonName from Persons, plus the last ReportTime recorded for that PersonId in the Reports table):
-------------------------------------
| PersonName | ReportTime |
-------------------------------------
| Abc | 2017-02-27 15:23 |
-------------------------------------
How can I do this with lambda syntax or LINQ query syntax?

Use Queryable.GroupJoin:
from p in db.Persons
join r in db.Reports on p.PersonId equals r.PersonId into g
where g.Any() // if some persons do not have reports
select new {
p.PersonName,
ReportTime = g.Max(r => r.ReportTime)
}
Lambda syntax (note that this returns a nullable ReportTime, which is null for persons who don't have any reports):
db.Persons.GroupJoin(
db.Reports,
p => p.PersonId,
r => r.PersonId,
(p,g) => new { p.PersonName, ReportTime = g.Max(r => (DateTime?)r.ReportTime) })

Try this:
List<Person> people = new List<Person>
{
new Person {PersonId = 1, PersonName = "AB" },
new Person {PersonId = 2, PersonName = "CD" },
new Person {PersonId = 3, PersonName = "EF" },
};
List<Report> reports = new List<Report>()
{
new Report {ReportId = 1, ReportTime = DateTime.Now, PersonId = 1 },
new Report {ReportId = 2, ReportTime = DateTime.Now.AddHours(-1), PersonId = 1 },
new Report {ReportId = 3, ReportTime = DateTime.Now.AddHours(-2), PersonId = 1 },
new Report {ReportId = 4, ReportTime = DateTime.Now.AddMinutes(-3), PersonId = 2 },
new Report {ReportId = 5, ReportTime = DateTime.Now, PersonId = 2 }
};
var res = (from rep in reports
group rep by rep.PersonId into repGrp
// The group key is the PersonId, so there is no need to read it from the first element.
join ppl in people on repGrp.Key equals ppl.PersonId
select new
{
PersonName = ppl.PersonName,
ReportDate = repGrp.Max(r => r.ReportTime)
}).ToList();

Related

Tabular data from DB to MAP Data structure

I am fetching a few values from a DB and want to create a nested map data structure out of them. The tabular data looks like this:
+---------+--------------+----------------+------------------+----------------+-----------------------+
| Cube_ID | Dimension_ID | Dimension_Name | Partition_Column | Display_name | Dimension_Description |
+---------+--------------+----------------+------------------+----------------+-----------------------+
| 1 | 1 | Reporting_Date | Y | Reporting_Date | Reporting_Date |
| 1 | 2 | Platform | N | Platform | Platform |
| 1 | 3 | Country | N | Country | Country |
| 1 | 4 | OS_Version | N | OS_Version | OS_Version |
| 1 | 5 | Device_Version | N | Device_Version | Device_Version |
+---------+--------------+----------------+------------------+----------------+-----------------------+
I want to create a nested structure something like this
{
CubeID = "1": {
Dimension ID = "1": [
{
"Name": "Reporting_Date",
"Partition_Column": "Y"
"Display": "Reporting_Date"
}
]
Dimension ID = "2": [
{
"Name": "Platform",
"Column": "N"
"Display": "Platform"
}
]
},
CubeID = "2": {
Dimension ID = "1": [
{
"Name": "Reporting_Date",
"Partition_Column": "Y"
"Display": "Reporting_Date"
}
]
Dimension ID = "2": [
{
"Name": "Platform",
"Column": "N"
"Display": "Platform"
}
]
}
}
I get the result set from the DB using the following. I am able to read the individual columns, but I am not sure how to create a map for later computation:
while (rs.next()) {
val Dimension_ID = rs.getInt("Dimension_ID")
val Dimension_Name = rs.getString("Dimension_Name")
val Partition_Column = rs.getString("Partition_Column")
val Display_name = rs.getString("Display_name")
val Dimension_Description = rs.getString("Dimension_Description")
}
I believe I should write a case class for this, but I am not sure how to create a case class and load values to the case class.
Thanks for the help. I can provide any other info needed. Let me know

Background
You can define the data classes along these lines:
case class Dimension(
dimensionId: Long,
name: String,
partitionColumn: String,
display: String
)
case class Record(
cubeId: Int,
dimension: Dimension
)
case class Data(records: List[Record])
And this is how you can construct the data:
val data =
Data(
List(
Record(
cubeId = 1,
dimension = Dimension(
dimensionId = 1,
name = "Reporting_Date",
partitionColumn = "Y",
display = "Reporting_Date"
)
),
Record(
cubeId = 2,
dimension = Dimension(
dimensionId = 1,
name = "Platform",
partitionColumn = "N",
display = "Platform"
)
)
)
)
Now to your question: since you are using JDBC, you have to build the list of records either in a mutable way or with a Scala Iterator. Below is the mutable way to build the data class above, but you can explore other options.
import scala.collection.mutable.ListBuffer
// The buffer itself is mutable, so the reference can be a val.
val mutableData = new ListBuffer[Record]()
while (rs.next()) {
mutableData += Record(
cubeId = rs.getInt("Cube_ID"),
dimension = Dimension(
dimensionId = rs.getInt("Dimension_ID"),
name = rs.getString("Dimension_Name"),
partitionColumn = rs.getString("Partition_Column"),
display = rs.getString("Display_name")
)
)
}
val data = Data(records = mutableData.toList)
Also read - Any better way to convert SQL ResultSet to Scala List

merge result of multiple select using linq and entityframework

How can I merge the results of these three queries?
var newscatid = Dbcontext.tbl_NewsPosition.Where(x => x.Fk_NewsID == 4 && x.IsMainPosition).Select(x => x.Fk_NewsCatId).FirstOrDefault();
var parent = from p in Dbcontext.tbl_cat.Where(x => x.Id == newscatid) select new { parentCat = p.CatName };
var children = from ch in Dbcontext.tbl_cat.Where(x => x.Fk_ParentId == newscatid) select new { childCat = ch.CatName };
This is what I’m trying to obtain:
+-----------+----------+
| parentCat | childCat |
+-----------+----------+
| Sport | Footbal |
| | |
+-----------+----------+
and these are my tables:

Try this one:
var result = from p in Dbcontext.tbl_cat
join ch in Dbcontext.tbl_cat on p.Id equals ch.Fk_ParentId
join np in Dbcontext.tbl_NewsPosition on p.Id equals np.Fk_NewsCatId
where np.Fk_NewsID==4 && np.IsMainPosition
select new { parentCat = p.CatName, childCat = ch.CatName };

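Please refer to the code below:
// FirstOrDefault() turns the query into a single id that can be compared with p.Id.
var newscatid = Dbcontext.tbl_NewsPosition.Where(x => x.Fk_NewsID == 4 &&
x.IsMainPosition).Select(x => x.Fk_NewsCatId).FirstOrDefault();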
var data =
from p in Dbcontext.tbl_cat
join ch in Dbcontext.tbl_cat on p.Id equals ch.Fk_ParentId
where p.Id==newscatid
select new
{
parentCat = p.CatName ,
childCat = ch.CatName
};

How to do 2 distinct groupby conditions on the same data frame in Scala?

I have a data frame, and I need to run two different groupBys on it.
+----+------+-------+-------+----------------------+
| id | type | item  | value | timestamp            |
+----+------+-------+-------+----------------------+
| 1  | rent | dvd   | 12    | 2016-09-19T00:00:00Z |
| 1  | rent | dvd   | 12    | 2016-09-19T00:00:00Z |
| 1  | buy  | tv    | 12    | 2016-09-20T00:00:00Z |
| 1  | rent | movie | 12    | 2016-09-20T00:00:00Z |
| 1  | buy  | movie | 12    | 2016-09-18T00:00:00Z |
| 1  | buy  | movie | 12    | 2016-09-18T00:00:00Z |
+----+------+-------+-------+----------------------+
I would like to get the result as :
id : 1
totalValue : 72 --- group by based on id
typeCount : {"rent" : 3, "buy" : 3} --- group by based on id
itemCount : {"dvd" : 2, "tv" : 1, "movie" : 3 } --- group by based on id
typeForDay : {"rent" : 2, "buy" : 2 } --- group by based on id and dayofmonth(col("timestamp")), at most 1 type per day
I tried :
val count_by_value = udf {( listValues :scala.collection.mutable.WrappedArray[String]) => if (listValues == null) null else listValues.groupBy(identity).mapValues(_.size)}
val group1 = df.groupBy("id").agg(collect_list("type"),sum("value") as "totalValue", collect_list("item"))
val group1Result = group1.withColumn("typeCount", count_by_value($"collect_list(type)"))
.drop("collect_list(type)")
.withColumn("itemCount", count_by_value($"collect_list(item)"))
.drop("collect_list(item)")
val group2 = df.groupBy("id", dayofmonth(col("timestamp"))).agg(collect_set("type"))
val group2Result = group2.withColumn("typeForDay", count_by_value($"collect_set(type)"))
.drop("collect_set(type)")
val groupedResult = group1Result.join(group2Result, "id").show()
But this takes a long time. Is there a more efficient way of doing this?

A better approach is to put each grouping field into the key and reduce by key instead of using groupBy(). You can use these:
df1.map(rec => (rec(0), rec(3).toString().toInt)).
reduceByKey(_+_).take(5).foreach(println)
=> (1,72)
df1.map(rec => ((rec(0), rec(1)), 1)).
reduceByKey(_+_).map(x => (x._1._1, x._1._2, x._2)).
take(5).foreach(println)
=>(1,rent,3)
(1,buy,3)
df1.map(rec => ((rec(0), rec(2)), 1)).
reduceByKey(_+_).map(x => (x._1._1, x._1._2, x._2)).
take(5).foreach(println)
=>(1,dvd,2)
(1,tv,1)
(1,movie,3)
df1.map(rec => ((rec(0), rec(1), rec(4).toString().substring(8,10)), 1)).
reduceByKey(_+_).map(x => (x._1._1, x._1._2,x._1._3,x._2)).
take(5).foreach(println)
=>(1,rent,19,2)
(1,buy,20,1)
(1,buy,18,2)
(1,rent,20,1)

Linq: dynamic Where clause inside a nested subquery get latest records of each group

I am currently having a problem with the dynamic LINQ expression below.
My models:
public class Orders
{
public int OrderId ;
public ICollection<OrderStatuses> Statuses;
}
public class Statuses
{
public int StatusId;
public int OrderId;
public string Note;
public DateTime Created;
}
My Sample data :
Orders
| ID | Name |
----------------------
| 1 | Order 01 |
| 2 | Order 02 |
| 3 | Order 03 |
Statuses
|ID | OrderId | Note | Created |
---------------------------------------
| 1 | 1 | Ordered | 2016-03-01|
| 2 | 1 | Pending | 2016-04-02|
| 3 | 1 | Completed | 2016-05-19|
| 4 | 1 | Ordered | 2015-05-19|
| 5 | 2 | Ordered | 2016-05-20|
| 6 | 2 | Completed | 2016-05-19|
| 7 | 3 | Completed | 2016-05-19|
I'd like to get the orders which have a note value equal to 'Ordered' and the max created time.
Below is the sample result that I expect from the query:
| Name | Note | Last Created|
-------------------------------------|
| Order 01 | Ordered | 2016-03-01 |
| Order 02 | Ordered | 2016-05-20 |
Here is my idea, but it seems to be the wrong way:
var outer = PredicateBuilder.True<Order>();
var orders = _entities.Orders
.GroupBy(x => x.OrderId)
.Select(x => new { x.Key, Created = x.Max(g => g.Created) })
.ToArray();
var predicateStatuses = PredicateBuilder.False<Status>();
foreach (var item in orders)
{
predicateStatuses = predicateStatuses.Or(x => x.OrderId == item.Key && x.Created == item.Created);
}
var predicateOrders = PredicateBuilder.False<JobOrder>();
predicateOrders = predicateOrders.Or(predicateStatuses); // I don't know how to pass an expression of a different object type (Order vs. Status) here, or whether I have to write an extension method or something
outer = outer.And(predicateOrders);
Please suggest how to solve this dynamic LINQ expression in this case.
Thanks in advance.

There's nothing dynamic about your query, at least, it doesn't need to be. You can express it as a regular query.
var query =
from o in db.Orders
join s in db.Statuses on o.Id equals s.OrderId
where s.Note == "Ordered"
orderby s.Created descending
group new { o.Name, s.Note, LastCreated = s.Created } by o.Id into g
select g.First();
P.S. Your models don't seem to match the data at all, so I'm ignoring that. Adjust as necessary.

Thanks so much for @Jeff Mercado's answer. In the end, I adapted it to solve my problem as shown below:
var predicateStatuses = PredicateBuilder.False<Order>();
predicateStatuses = predicateStatuses.Or(p => (
from j in db.Statuses
where j.OrderId == p.ID
group j by j.OrderId into g
select g.OrderByDescending(t=>t.Created)
.FirstOrDefault()
).FirstOrDefault().Note == "Ordered"
);

How to map-reduce group, sort and count sort values

I have some problems with mapReduce.
I want to group, sort and count some values in a collection. I have a collection such as:
----------------------------
| item_id | date |
----------------------------
| 1 | 01/15/2012 |
----------------------------
| 2 | 01/01/2012 |
----------------------------
| 1 | 01/15/2012 |
----------------------------
| 1 | 01/01/2012 |
----------------------------
| 2 | 01/03/2012 |
----------------------------
| 2 | 01/03/2012 |
----------------------------
| 1 | 01/01/2012 |
----------------------------
| 1 | 01/01/2012 |
----------------------------
| 2 | 01/01/2012 |
----------------------------
| 2 | 01/01/2012 |
----------------------------
I want to group by item_id, count the documents per day for each item, sort the dates for each item, and get a result such as:
value: {{item_id:1, date:{01/01/2012:3, 01/15/2012:2 }},{item_id:2, date:{01/01/2012:3, 01/03/2012:2 }}}
I use mapReduce:
m=function()
{
emit(this.item_id, this.date);
}
r=function(key, values)
{
var res={};
values.forEach(function(v)
{
res[v] = (typeof res[v] != 'undefined') ? res[v] + 1 : 1;
});
return res;
}
But I don't get a result such as:
{{item_id:1, date:{01/01/2012:3, 01/15/2012:2 }},{item_id:2, date:{01/01/2012:3, 01/03/2012:2 }}}
Any ideas?

Given input documents of the form:
> db.dates.findOne()
{ "_id" : 1, "item_id" : 1, "date" : "1/15/2012" }
>
The following map and reduce functions should produce the output that you are looking for:
var map = function(){
var myDate = this.date;
var value = {"item_id":this.item_id, "date":{}};
value.date[myDate] = 1;
emit(this.item_id, value);
}
var reduce = function(key, values){
var output = {"item_id":key, "date":{}};
for(var v in values){
for(var thisDate in values[v].date){
if(output.date[thisDate] == null){
// Start from the incoming count (not 1) so re-reducing partial results stays correct.
output.date[thisDate] = values[v].date[thisDate];
}else{
output.date[thisDate] += values[v].date[thisDate];
}
}
}
return output;
}
> db.runCommand({"mapReduce":"dates", map:map, reduce:reduce, out:{replace:"dates_output"}})
> db.dates_output.find()
{ "_id" : 1, "value" : { "item_id" : 1, "date" : { "1/15/2012" : 2, "1/01/2012" : 3 } } }
{ "_id" : 2, "value" : { "item_id" : 2, "date" : { "1/01/2012" : 3, "1/03/2012" : 2 } } }
Hopefully the above will do what you need it to, or at least get you pointed in the right direction.
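One property worth checking for any reduce function: MongoDB may call reduce more than once for the same key and feed earlier reduce outputs back in as values. The emitted value and the reduce output therefore need the same shape, and the reducer has to add the incoming count instead of assuming it is 1. The following is only a plain-JavaScript sketch of that contract and does not talk to MongoDB. The names mapDoc, reduceValues and runMapReduce, and the chunked driver, are hypothetical and exist only for this illustration. The sample documents use the date strings from the question.

```javascript
// Plain-JavaScript simulation of the map/reduce contract (no MongoDB needed).
const docs = [
  { item_id: 1, date: "01/15/2012" },
  { item_id: 2, date: "01/01/2012" },
  { item_id: 1, date: "01/15/2012" },
  { item_id: 1, date: "01/01/2012" },
  { item_id: 2, date: "01/03/2012" },
  { item_id: 2, date: "01/03/2012" },
  { item_id: 1, date: "01/01/2012" },
  { item_id: 1, date: "01/01/2012" },
  { item_id: 2, date: "01/01/2012" },
  { item_id: 2, date: "01/01/2012" },
];

// Map step: one value per document, already in the reducer's output shape.
function mapDoc(doc) {
  const value = { item_id: doc.item_id, date: {} };
  value.date[doc.date] = 1;
  return [doc.item_id, value];
}

// Reduce step: sums the incoming counts, so partially reduced values are fine.
function reduceValues(key, values) {
  const output = { item_id: key, date: {} };
  for (const value of values) {
    for (const day of Object.keys(value.date)) {
      output.date[day] = (output.date[day] || 0) + value.date[day];
    }
  }
  return output;
}

// Hypothetical driver: groups mapped values by key, reduces them in chunks of
// `chunkSize`, then re-reduces the partial results into one value per key.
function runMapReduce(documents, chunkSize) {
  const groups = new Map();
  for (const doc of documents) {
    const [key, value] = mapDoc(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  const results = {};
  for (const [key, values] of groups) {
    const partials = [];
    for (let i = 0; i < values.length; i += chunkSize) {
      partials.push(reduceValues(key, values.slice(i, i + chunkSize)));
    }
    results[key] = reduceValues(key, partials);
  }
  return results;
}

const oneShot = runMapReduce(docs, docs.length); // a single reduce pass per key
const chunked = runMapReduce(docs, 2);           // forces re-reduction
console.log(JSON.stringify(oneShot));
console.log(JSON.stringify(chunked));
```

If the reducer set a date's count to 1 the first time it saw that date, the chunked run would undercount, because a partial result carrying a count of 2 would be recorded as 1. Summing the incoming counts keeps both runs identical.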
For more information on using Map Reduce with MongoDB, please see the Mongo Documentation:
http://www.mongodb.org/display/DOCS/MapReduce
There are some additional Map Reduce examples in the MongoDB Cookbook:
http://cookbook.mongodb.org/
For a step-by-step walkthrough of how a Map Reduce operation is run, please see the "Extras" section of the MongoDB Cookbook recipe "Finding Max And Min Values with Versioned Documents" http://cookbook.mongodb.org/patterns/finding_max_and_min/
Good luck!
