I have two collections in MongoDB. I am retrieving data from the two collections independently and it works fine. But when I implement paging using the Skip and Take methods, I get data from both collections, like this:
paging = new Pagination() { CurrentPage = pageNumber, ItemsPerPage = 16 };
var results = dataTable.FindAs<TradeInfo>(queryAll).Skip(paging.Skip).Take(paging.Take).ToList<TradeInfo>();
paging.TotalCount = Convert.ToInt32(dataTable.Find(query).Count());
var results2 = new List<TradeInfo>();
if (dataTable2 != null)
{
    results2 = dataTable2.FindAs<TradeInfo>(queryAll).Skip(paging.Skip).Take(paging.Take).ToList<TradeInfo>();
    int count = Convert.ToInt32(dataTable2.Find(query).Count());
    paging.TotalCount = paging.TotalCount + count;
    results.AddRange(results2);
}
I am setting results as the ItemsSource of a DataGrid, and I am getting 32 items per page in total. How can I fix that? Is there any join concept in MongoDB? The two collections have the same columns. How can I do it?
Please help me with this.
Thanks,
jan
I believe what you are looking for here is more of a Union than a Join.
Unfortunately there is no such concept in MongoDB. If your paging is dependent on a query, which in this case it seems it might be, your only real option is to create and maintain a single merged collection which gets updated every time a document is added or saved to either of these two collections. Then you can skip and take on the single collection after applying the query to it.
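As a rough illustration with the legacy C# driver, paging the merged collection would then look something like the sketch below; mergedTable is a hypothetical MongoCollection kept in sync with both source collections, not something from the original code:
// Minimal sketch, assuming a hypothetical "mergedTable" collection
// that receives a Save() whenever either source collection is written.
var results = mergedTable.FindAs<TradeInfo>(queryAll)
    .SetSkip(paging.Skip)
    .SetLimit(paging.Take)
    .ToList();
// A single count over the merged collection replaces the two separate counts.
paging.TotalCount = Convert.ToInt32(mergedTable.Count(query));
This yields one page of 16 items in total, instead of 16 from each collection.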
I have a list of Guids. What is the best way to check whether they all exist in a table using EF Core?
I am currently using the code below, but the performance is very bad; assume the User table has 1 million records.
For example:
public async Task<bool> IsIdListValid(IEnumerable<int> idList)
{
    var validIds = await _context.User.Select(x => x.Id).ToListAsync();
    return idList.All(x => validIds.Contains(x));
}
The performance is bad because you are reading every row of the table into memory and then iterating through it (ToListAsync materializes the query). Try using the Any() method to take advantage of the strength of the database. Use something like the following: bool exists = _context.User.Any(u => idList.Contains(u.Id));. This should translate to an SQL IN clause.
Provided you assert that the # of IDs being sent in is kept reasonable, you could do the following:
var idCount = _context.User.Where(x => idList.Contains(x.Id)).Count();
return idCount == idList.Count();
This assumes that you are comparing on a unique constraint like the PK. We get a count of how many rows have a matching ID from the list, then compare that to the count of IDs sent.
If you're passing a large # of IDs, you would need to break the list up into reasonable sets as there are limits to what you can do with an IN clause and potential performance costs as well.
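A minimal sketch of that count-based check with batching might look like the following. The 1,000-ID batch size, Enumerable.Chunk (.NET 6+), and EF Core's CountAsync are my assumptions here, not part of the original answer:
public async Task<bool> IsIdListValid(IEnumerable<int> idList)
{
    var ids = idList.Distinct().ToList();
    const int batchSize = 1000; // keep each IN clause reasonably sized
    var found = 0;
    foreach (var chunk in ids.Chunk(batchSize))
    {
        // Translates to SELECT COUNT(*) ... WHERE Id IN (...)
        found += await _context.User.CountAsync(u => chunk.Contains(u.Id));
    }
    // Relies on Id being unique, so each match is counted exactly once.
    return found == ids.Count;
}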
UPDATE:
I understood that the solution to my problem is nested subqueries, each applying a different filter to a progressively reduced result set. But I can't find a way to express that in MyBatis. Here is my query code:
List<IstanzaMetadato> res = null;
SqlSession sqlSession = ConnectionFactory.getSqlSessionFactory().openSession(true);
try {
    IstanzaMetadatoMapper mapper = sqlSession.getMapper(IstanzaMetadatoMapper.class);
    IstanzaMetadatoExample example = new IstanzaMetadatoExample();
    Iterator<Map.Entry<Integer, String>> it = map.entrySet().iterator();
    while (it.hasNext()) {
        Map.Entry<Integer, String> entry = it.next();
        example.createCriteria().andIdMetadatoEqualTo(entry.getKey()).andValoreEqualTo(entry.getValue());
    }
    example.setDistinct(true);
    res = mapper.selectByExample(example);
} finally {
    sqlSession.close();
}
I need to execute a new selectByExample inside the while loop, and it has to query the previous "SELECTED" results. Is there a solution?
ORIGINAL QUESTION:
I have this table structure.
I have to select rows from the table with different filters, specified by the end user.
Each filter is a pair (id_metadato, valore); for example, you could have id_metadato = 3 and valore = "pippo".
The user can specify 0-n filters on the web page by typing 0-n values into the search boxes, which are keyed on id_metadato.
Obviously, the more filters the user specifies, the more restrictive the final query becomes.
For example, if the user fills in only the first search box, the query will have only one filter and will return all the rows that match the pair (id_metadato, valore) specified by the user.
If he uses two search boxes, then the query will have two filters, and it will return all the rows that satisfy the first condition AND the second one, after the "first subquery" is done.
I need to do this dynamically and in the most efficient way. I can't simply add AND clauses to my query; each filter has to further reduce the result set.
I can't do 0-n subqueries (SELECT * FROM ... IN (SELECT * FROM ...)) efficiently.
Is there a more elegant way to do this? I'm reading tutorials on dynamic SQL queries with MyBatis, but I'm not sure that is the correct approach. I'm still trying to figure out the logic of the solution; then I will try to implement it with MyBatis.
Thanks for the answers.
MyBatis simplifies this process of nesting subqueries a lot: it was sufficient to concatenate the filter criteria and, on each pass, add the IDs selected so far as an IN filter.
The excerpt of the code is the following:
try {
    IstanzaMetadatoMapper mapper = sqlSession.getMapper(IstanzaMetadatoMapper.class);
    IstanzaMetadatoExample example = new IstanzaMetadatoExample();
    Iterator<Map.Entry<Integer, String>> it = map.entrySet().iterator();
    while (it.hasNext()) {
        Map.Entry<Integer, String> entry = it.next();
        if (listaIdUd.isEmpty()) {
            // First filter: a plain select seeds the ID list.
            example.createCriteria().andIdMetadatoEqualTo(entry.getKey()).andValoreEqualTo(entry.getValue());
            example.setDistinct(true);
            listaIdUd = mapper.selectDynamicNested(example);
            continue;
        }
        // Each further filter restricts to the IDs selected so far.
        example.clear();
        example.createCriteria().andIdMetadatoEqualTo(entry.getKey()).andValoreEqualTo(entry.getValue()).andIdUdIn(listaIdUd);
        example.setDistinct(true);
        listaIdUd = mapper.selectDynamicNested(example);
    }
} finally {
    sqlSession.close();
}
I like the idea of Named Queries in JPA for the static queries I'm going to run, but I often want the count result for a query as well as a result list from some subset of it. I'd rather not write two nearly identical NamedQueries. Ideally, what I'd like to have is something like:
@NamedQuery(name = "getAccounts", query = "SELECT a FROM Account a")
.
.
Query q = em.createNamedQuery("getAccounts");
List r = q.setFirstResult(s).setMaxResults(m).getResultList();
int count = q.getCount();
So let's say m is 10, s is 0 and there are 400 rows in Account. I would expect r to have a list of 10 items in it, but I'd want to know there are 400 rows total. I could write a second #NamedQuery:
#NamedQuery(name = "getAccountCount", query = "SELECT COUNT(a) FROM Account")
but it seems a DRY violation to do that if I'm always just going to want the count. In this simple case it is easy to keep the two in sync, but if the query changes, it seems less than ideal that I have to update both #NamedQueries to keep the values in line.
A common use case here would be fetching some subset of the items, but needing some way of indicating total count ("Displaying 1-10 of 400").
So the solution I ended up using was to create two @NamedQuery annotations, one for the result set and one for the count, capturing the base query in a static string to stay DRY and ensure that the two queries remain consistent. So for the above, I'd have something like:
@NamedQuery(name = "getAccounts", query = "SELECT a" + accountQuery)
@NamedQuery(name = "getAccounts.count", query = "SELECT COUNT(a)" + accountQuery)
.
static final String accountQuery = " FROM Account a";
.
Query q = em.createNamedQuery("getAccounts");
List r = q.setFirstResult(s).setMaxResults(m).getResultList();
int count = ((Long)em.createNamedQuery("getAccounts.count").getSingleResult()).intValue();
Obviously, with this example the query body is trivial and this is overkill. But with much more complex queries, you end up with a single definition of the query body and can ensure the two queries stay in sync. You also get the advantage that the queries are precompiled, and at least with EclipseLink, you get validation at startup time instead of when you call the query.
With consistent naming between the two queries, it is also possible to wrap the code that runs both, parameterized just by the base name of the query.
setFirstResult/setMaxResults do not take a subset of an already-fetched result set; the query hasn't even been run when you call these methods. They affect the generated SELECT query that is executed when you call getResultList. If you want the total record count, you'll have to SELECT COUNT your entities in a separate query (typically before paginating).
For a complete example, check out Pagination of Data Sets in a Sample Application using JSF, Catalog Facade Stateless Session, and Java Persistence APIs.
You can also use introspection to read the named query annotations, like this:
String getNamedQueryCode(Class<? extends Object> clazz, String namedQueryKey) {
    NamedQueries namedQueriesAnnotation = clazz.getAnnotation(NamedQueries.class);
    NamedQuery[] namedQueryAnnotations = namedQueriesAnnotation.value();
    String code = null;
    for (NamedQuery namedQuery : namedQueryAnnotations) {
        if (namedQuery.name().equals(namedQueryKey)) {
            code = namedQuery.query();
            break;
        }
    }
    if (code == null) {
        // Not found here; look on a mapped superclass, if any.
        if (clazz.getSuperclass().getAnnotation(MappedSuperclass.class) != null) {
            code = getNamedQueryCode(clazz.getSuperclass(), namedQueryKey);
        }
    }
    return code; // null if not found
}
I am new to Entity Framework and curious what the best way would be to update all tables with records of new data. I have a method that returns a list of objects with updated records. Most of the information stays the same; just two fields will be updated.
Currently I have created two ways of doing that update.
The first one is to get the data from the database table and iterate over both lists to find a match and update it:
var previousDatafromTable = db.Widgets.ToList();
var newDataReturnedFromMethod = .......
foreach (var d in previousDatafromTable)
{
    foreach (var l in newDataReturnedFromMethod)
    {
        if (d.id == l.id)
        {
            d.PositionColumn = l.PositionColumn;
            d.PositionRow = l.PositionRow;
        }
    }
}
The second one is:
foreach (var item in newDataReturnedFromMethod)
{
    var model = db.Widgets.Find(item.id);
    model.PositionColumn = item.PositionColumn;
    model.PositionRow = item.PositionRow;
}
Here I iterate through the updated data and update my database table by ID.
So I am interested to know which method is the better way of doing this, and whether there is an option in Entity Framework to measure the performance of these two approaches. Thanks for your time.
Neither is really efficient.
The first option loops through newDataReturnedFromMethod for each iteration of previousDatafromTable. That's a lot of iterations.
The second option probably executes a database query for each iteration of newDataReturnedFromMethod.
It's far more efficient to join:
var query = from n in newDataReturnedFromMethod
            join p in previousDatafromTable on n.id equals p.id
            select new { n, p };
foreach (var pair in query)
{
    pair.p.PositionColumn = pair.n.PositionColumn;
    pair.p.PositionRow = pair.n.PositionRow;
}
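If you prefer not to use query syntax, a dictionary lookup gives the same O(n + m) behavior; here is a small sketch using the question's names:
// Build an id -> item map once, then probe it for each existing row.
var newById = newDataReturnedFromMethod.ToDictionary(x => x.id);
foreach (var p in previousDatafromTable)
{
    if (newById.TryGetValue(p.id, out var n))
    {
        p.PositionColumn = n.PositionColumn;
        p.PositionRow = n.PositionRow;
    }
}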
EF doesn't have built-in performance measurements. You'd typically use a profiler for that, or the Stopwatch class.
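For example, a rough timing with Stopwatch could look like this (just a sketch, not a substitute for a real profiler):
var sw = System.Diagnostics.Stopwatch.StartNew();
// ... run one of the two update strategies here ...
db.SaveChanges(); // persist, so database time is included in the measurement
sw.Stop();
Console.WriteLine($"Update took {sw.ElapsedMilliseconds} ms");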
We load a large object graph from the DB.
The query has many Includes, and Where() uses Contains() to filter the final result.
Contains is called with a collection of about a thousand entries.
The profiler shows monstrous, human-unreadable SQL.
The query cannot be precompiled because of Contains().
Are there any ways to optimize such queries?
Update
public List<Vulner> GetVulnersBySecurityObjectIds(int[] softwareIds, int[] productIds)
{
    var sw = new Stopwatch();
    var query = from vulner in _businessModel.DataModel.VulnerSet
                join vt in _businessModel.DataModel.ObjectVulnerTieSet.Where(ovt => softwareIds.Contains(ovt.SecurityObjectId))
                    on vulner.Id equals vt.VulnerId
                select vulner;
    var result = ((ObjectQuery<Vulner>)query.OrderBy(v => v.Id).Distinct())
        .Include("Descriptions")
        .Include("Data")
        .Include("VulnerStatuses")
        .Include("GlobalIdentifiers")
        .Include("ObjectVulnerTies")
        .Include("Object.ProductObjectTies.Product")
        .Include("VulnerComment");
    // If specific products were passed in, add that filter as well.
    if (productIds.HasValues())
        result = (ObjectQuery<Vulner>)result.Where(v => v.Object.ProductObjectTies.Any(p => productIds.Contains(p.ProductId)));
    sw.Start();
    var str = result.ToTraceString();
    sw.Stop();
    Debug.WriteLine("Building the query took {0} seconds.", sw.Elapsed.TotalSeconds);
    sw.Restart();
    var list = result.ToList();
    sw.Stop();
    Debug.WriteLine("Fetching the vulnerabilities took {0} seconds.", sw.Elapsed.TotalSeconds);
    return list;
}
It's almost certain that splitting the query into pieces will perform better, in spite of the extra database round trips. It is always advisable to limit the number of Includes, because they not only blow up the size and complexity of the query (as you noticed) but also blow up the result set, both in length and in width. Moreover, they often get translated into outer joins.
Apart from that, using Contains the way you do is OK.
Sorry, it is hard to be more specific without knowing your data model and the size of the tables involved.
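As an illustration, the split could look roughly like the sketch below. It relies on the ObjectContext's relationship fix-up to attach the separately loaded rows to the tracked Vulner entities; DescriptionSet and its VulnerId foreign key are assumed names, since the real model isn't shown:
// Sketch: load the root entities first, without any Includes.
var vulners = ((ObjectQuery<Vulner>)query.OrderBy(v => v.Id).Distinct()).ToList();
var ids = vulners.Select(v => v.Id).ToArray();
// Then load each related set with its own, much simpler query.
// Relationship fix-up wires the rows onto the tracked Vulner entities.
_businessModel.DataModel.DescriptionSet
    .Where(d => ids.Contains(d.VulnerId))
    .ToList();
// ...repeat for the other navigation properties as needed...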