How to remove column-duplicates from the query result using entity-framework?

In my database table I have:
Key | Value
a | 1
a | 2
b | 11
c | 1
d | 2
b | 3
But I only need the items whose keys do not duplicate the key of an earlier row. The desired result should be:
Key | Value
a | 1
b | 11
c | 1
d | 2
How could we get the desired result using entity-framework?
Note: we need the first value. Thank you very much.

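var q = from e in Context.MyTable
        group e by e.Key into g
        select new
        {
            Key = g.Key,
            // Picks the smallest Value per key and projects the scalar, not the whole entity.
            // To get the first-inserted row instead, order by a column that records
            // insertion order (for example an identity column), if the table has one.
            Value = g.OrderBy(v => v.Value).Select(v => v.Value).FirstOrDefault()
        };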
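The difference between "first value per key" (what the question asks for) and "smallest value per key" (what ordering by Value yields) can be illustrated with a small in-memory sketch. This is plain Python rather than Entity Framework; the `rows` list and both helper names are hypothetical and only mirror the sample table from the question.

```python
# Hypothetical in-memory rows mirroring the table in the question.
rows = [("a", 1), ("a", 2), ("b", 11), ("c", 1), ("d", 2), ("b", 3)]


def first_per_key(pairs):
    """Keep the first value seen for each key, preserving row order."""
    seen = {}
    for key, value in pairs:
        # setdefault only stores the value if the key is not present yet.
        seen.setdefault(key, value)
    return list(seen.items())


def min_per_key(pairs):
    """Keep the smallest value for each key (what ordering by Value yields)."""
    best = {}
    for key, value in pairs:
        best[key] = min(value, best.get(key, value))
    return list(best.items())


first_rows = first_per_key(rows)
min_rows = min_per_key(rows)
```

For key b the two helpers disagree (11 versus 3), which is why a query that must return the first row needs some column that records row order.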

You should look at either writing a View in the database and mapping your entity to that,
or creating a DefiningQuery in the storage model part of your EDMX (aka the bit that ends up in the SSDL file).
See Tip 34 for more information.
Conceptually both approaches allow you to write a view that excludes the 'duplicate rows'. The difference is just where the view lives.
If you have control over the database, I'd put the view in the database.
If not, you can put the view in a DefiningQuery inside the EDMX and then map to that.
Hope this helps
Alex

Related

How to convert nested json to data frame with kdb+

I am trying to get the data from cryptostats as shown below; it gives me back nested JSON. I want it in a table format. How do I do that?
query:"https://api.cryptostats.community/api/v1/fees/oneDayTotalFees/2023-02-07";
raw:.Q.hg query;
res:.j.k raw;
To get the JSON file, use https://api.cryptostats.community/api/v1/fees/oneDayTotalFees/2023-02-07
To view the JSON in a table format, use https://jsongrid.com/json-grid
The final result should be a kdb+ table that has all the columns from the nested JSON output.
They are all dictionaries
q)distinct type each res[`data]
,99h
But they do not collapse to a table because they do not all have matching keys
q)distinct key each res[`data]
`id`bundle`results`metadata`errors
`id`bundle`results`metadata
Looking at a row where errors is populated we can see it is a dictionary
q)res[`data;0;`errors]
oneDayTotalFees| "Error executing oneDayTotalFees on compound: Date incomplete"
You can create a prototype dictionary with a blank errors key in it and join (,) each piece of data onto it. This results in uniform dictionaries, which are promoted to a table (type 98h):
q)table:(enlist[`errors]!enlist (`$())!()),/:res`data
q)type table
98h
Row which already had errors is unaffected:
q)table 0
errors | (,`oneDayTotalFees)!,"Error executing oneDayTotalFees..
id | "compound"
bundle | 0n
results | (,`oneDayTotalFees)!,0n
metadata| `source`icon`name`category`description`feeDescription;..
Row which previously did not have errors now has a valid empty dictionary
q)table 1
errors | (`symbol$())!()
id | "swapr-ethereum"
bundle | "swapr"
results | (,`oneDayTotalFees)!,24.78725
metadata| `category`name`icon`bundle`blockchain`description`feeDescription..
https://kx.com/blog/kdb-q-insights-parsing-json-files/
https://code.kx.com/q/ref/join/
https://code.kx.com/q/kb/faq/#construction
https://code.kx.com/q/basics/datatypes/
https://code.kx.com/q/ref/maps/#each-left-and-each-right
If you want to explore nested objects you can index at depth (see the blog post linked above). If you have many sparse keys, leaving the data like this is efficient for storage:
q)select tokenSymbol:metadata[::;`tokenSymbol] from table where not ""~/:metadata[::;`tokenSymbol]
tokenSymbol
-----------
"HNY"
If you do wish to explode a nested field you can run something similar to:
q)table:table,'{flip c!flip table[`metadata]#\:(c:distinct raze key each table[`metadata])}[]
q)meta table
c | t f a
----------------| -----
errors |
id | C
bundle | C
results |
metadata |
source | C
icon | C
name | C
category | C
description | C
feeDescription | C
blockchain | C
website | C
tokenTicker | C
tokenCoingecko | C
protocolLaunch | C
tokenLaunch | C
adapter | C
subtitle | C
events | C
shortName | C
protocolShutdown| C
tokenSymbol | C
subcategory | C
tokenticker | C
tokencoingecko | C
Care needs to be taken when filling in nulls and keeping consistent types of data in each column. In this dataset the events tag inside metadata is tabular data:
q)select distinct type each events from table
events
------
10
98
0
This would need to be cleaned with something similar to:
q)table:update events:count[i]#enlist ([] date:();description:()) from table where not 98h=type each events
The data returned from the API contains dictionaries with two distinct sets of keys:
q)distinct key each res`data
`id`bundle`results`metadata`errors
`id`bundle`results`metadata
One simple way to convert this to a table is to enlist each dictionary first, converting them to tables, then joining with uj:
q)(uj/)enlist each res`data
id bundle results metadata ..
-----------------------------------------------------------------------------..
"compound" 0n (,`oneDayTotalFees)!,0n `source`i..
"swapr-ethereum" "swapr" (,`oneDayTotalFees)!,24.78725 `category..
...
This works because uj generalises the join operator (,), allowing different schemas with common elements to be combined.
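For readers less familiar with q, the prototype-dictionary idea from the first answer can be sketched in Python: merge a default record into every row so that all rows end up with the same keys. The `records` list below is hypothetical sample data shaped like the API output, not the real response, and `prototype`, `uniform` and `columns` are illustrative names.

```python
# Hypothetical records shaped like the API data: one has "errors", one does not.
records = [
    {"id": "compound", "bundle": None, "errors": {"oneDayTotalFees": "Date incomplete"}},
    {"id": "swapr-ethereum", "bundle": "swapr"},
]

# Prototype with an empty "errors" entry, like the blank errors key in the q answer.
prototype = {"errors": {}}

# Keys from each record override the prototype, so existing errors are kept
# and missing ones are filled with the empty default.
uniform = [{**prototype, **record} for record in records]

# Every row now has the same keys, so the rows can be treated as a table.
columns = list(uniform[0].keys())
```

As with the q version, a row that already had errors is unaffected and a row without them gains an empty dictionary.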

Relational database design to represent similarity between rows of same table

For background purposes: I'm using PostgreSQL with SQLAlchemy (Python).
Given a table of unique references as such:
references_table
-----------------------
id | reference_code
-----------------------
1 | CODEABCD1
2 | CODEABCD2
3 | CODEWXYZ9
4 | CODEPOIU0
...
In a typical scenario, I would have a separate items table:
items_table
-----------------------
id | item_descr
-----------------------
1 | `Some item A`
2 | `Some item B`
3 | `Some item C`
4 | `Some item D`
...
In such typical scenario, the many-to-many relationship between references and items is set in a junction table:
references_to_items
-----------------------
ref_id (FK) | item_id (FK)
-----------------------
1 | 4
2 | 1
3 | 2
4 | 1
...
In that scenario, it is easy to model and obtain all references that are associated to the same item, for instance item 1 has references 2 and 4 as per table above.
However, in my scenario, there is no items_table. But I would still want to model the fact that some references refer to the same (non-represented) item.
I see a possibility to model that via a many-to-many junction table as such (associating FKs of the references table):
reference_similarities
-----------------------
ref_id (FK) | ref_id_similar (FK)
-----------------------
2 | 4
2 | 8
2 | 9
...
Where references with ID 2, 4, 8 and 9 would be considered 'similar' for the purposes of my data model.
However, the inconvenience here is that such a model requires choosing one reference (above, id=2) as a 'pivot', to which multiple others can be declared 'similar' in the reference_similarities table. Ref 2 is similar to 4 and ref 2 is similar to 8 ==> thus 4 is similar to 8.
So the question is: is there a better design that doesn't involve having a 'pivot' FK as above?
Ideally, I would store the 'similarity' as an Array of FKs as such:
reference_similarities
------------------------
id | ref_ids (Array of FKs)
------------------------
1 | [2, 4, 8, 9]
2 | [1, 3, 5]
..but I understand from https://dba.stackexchange.com/questions/60132/foreign-key-constraint-on-array-member that it is currently not possible to have foreign keys in PostgreSQL arrays. So I'm trying to figure out a better design for this model.
As I understand it, you want to group items into a set and be able to query the set from any item in it.
You can use a hash function to hash the set, then use the hash as the pivot value.
For example, a set of values (2, 4, 8, 9) is hashed like this:
hash = (((31*1 + 2)*31 + 4)*31 + 8)*31 + 9 = 987204
You can refer to Arrays.hashCode in Java to see how to hash a list of values:
int result = 1;
for (Object element : a)
    result = 31 * result + (element == null ? 0 : element.hashCode());
Table reference_similarities:
reference_similarities
-----------------------
ref_id (FK) | hash_value
-----------------------
2 | hash(2, 4, 8, 9) = 987204
4 | 987204
8 | 987204
9 | 987204
To query the set, first look up the hash_value for a ref_id, then get all ref_ids with that hash_value.
The drawback of this solution is that every time you add a new value to a set, you have to rehash the set.
Another solution is to write a function in Python that produces a unique hash_value whenever a new set is created.
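The hashing scheme described in this answer could be sketched in Python roughly as follows. The function name `set_hash` is hypothetical. Two choices here are assumptions, not part of the answer: the ids are sorted first so the result does not depend on insertion order, and the value is not wrapped to 32 bits the way Java's int arithmetic would wrap it.

```python
def set_hash(ref_ids):
    """Order-independent hash for a group of integer reference ids.

    Uses the same 31-multiplier scheme as Java's Arrays.hashCode, but sorts
    the ids first and does not wrap to 32 bits (both are assumptions).
    """
    result = 1
    for ref_id in sorted(ref_ids):
        result = 31 * result + ref_id
    return result


# Same group as in the example table above.
group_hash = set_hash({2, 4, 8, 9})
```

For the group (2, 4, 8, 9) this gives 987204, matching the hash_value in the table. Adding a member still changes the hash, so every row of the group has to be updated, as the answer notes.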

Have unique constraint between tables

I need to validate the uniqueness of an element, but each element can have n codes and I need to validate those too. An element should be unique by type, provider, time and code. When an element is created it has a code array; if any of those codes overlaps with a code of an existing element, it means the element is the same.
E.g Having
codes
| id | code|
|:-------| ---:|
| 1 | 123 |
| 2 | 456 |
elements
| id | type | provider_id | time |
|:-------|:-------------:| ------------:|-----------:|
| 1 | A | 1 | 01/01/2016 |
codes_elements
| code_id | element_id |
|:--------|------------:|
| 1 | 1 |
Expectations:
When I try to insert:
`Element = type: A, provider: 1, codes: [1], time: 01/01/2015`
I expect this to fail because it violates the constraint.
When I try to insert:
`Element = type: B, provider: 1, codes: [1], time: 01/01/2015`
I expect this to create a new element with the code relation.
When I try to insert:
`Element = type: A, provider: 1, codes: [2], time: 01/01/2015`
I expect this to create a new element: the other fields are the same, but the code array does not overlap with any other element.
When I try to insert:
`Element = type: A, provider: 1, codes: [1, 2], time: 01/01/2015`
I expect this to add only a relation in codes_elements, so element 1 will now have 2 codes: code 1 overlaps with an existing element, which means this is the same element, so I just need to add code 2.
Options that I see:
Change the join table to an int array in the elements table. In this case I can search by the other columns and append the code id, and also create a constraint on the codes array to prevent duplication. The problem is that querying the elements table by code is slow, e.g. getting all elements with code 1.
Create a trigger that checks whether the codes overlap with another element: if so, add just the code relation; if not, create a new element.
I'm using Ruby on Rails, but model validations are not enough because I have multiple threads doing inserts at the same time and could end up with duplicates.
I would like to hear about better options, or comments on the current ideas, with a view to improving performance, preventing duplication, etc.
Why not set the validation in RoR?
If the element is repeated you are going to get an error message anyway. You can raise and handle the error in RoR, adding a character like "1" to the end of the element if it is repeated. Another option is to create a "consolidate" column in the codes_elements table and save the element+code value in it.
EDIT:
You can avoid saving duplicated rows by using a transaction:
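ActiveRecord::Base.transaction do
  # e is the element being saved; e.codes is assumed to hold its codes
  e.codes.each do |c|
    # Either create the row only if it does not exist yet...
    Element.find_or_create_by(code: c, element: e)
    # ...or just check whether it already exists:
    # Element.exists?(code: c, element: e)
  end
end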

Check if field value is in a list of strings in SSRS report

I'm using SSRS (VS2008) and creating a report of work orders. In the detail line of the report table, I have the following columns (with some fake data)
WONUM | A | B | Hours
ABC123 | 3 | 0 | 3
SPECIAL| 0 | 6 | 6
DEF456 | 5 | 0 | 5
GHI789 | 4 | 0 | 4
OTHER | 0 | 2 | 2
As you can kind of see, all work orders have a work order number (WONUM) as well as a total # of hours (HOURS). I need to put the hours into either column A or column B based on WONUM. I have a list of specifically named work orders (in the example, they would be "SPECIAL" and "OTHER") which would cause the HOURS value to be put in column B. If the WONUM is NOT a special named one, then it goes in column A. Here's what I WANTED to put as the expression for column A and column B:
Column A: =IIF(Fields!WONUM.Value IN ("SPECIAL","OTHER"), 0, Fields!Hours.Value)
Column B: =IIF(Fields!WONUM.Value IN ("SPECIAL","OTHER"), Fields!Hours.Value, 0)
But as you're probably aware, Fields!WONUM.Value IN ("SPECIAL","OTHER") is not a valid method of doing this! What is the best way to make this work? I cannot flag it in the SQL query in any other way for other reasons so it must be done in the table.
Thanks in advance for any and all help!
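Try this (using the InStr() function):
Column A: =IIF(InStr(Fields!WONUM.Value,"SPECIAL")>0 OR InStr(Fields!WONUM.Value,"OTHER")>0, 0, Fields!Hours.Value)
Column B: =IIF(InStr(Fields!WONUM.Value,"SPECIAL")>0 OR InStr(Fields!WONUM.Value,"OTHER")>0, Fields!Hours.Value, 0)
Note that InStr matches substrings, so a WONUM that merely contains "SPECIAL" or "OTHER" would also match.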
If it's just the two WONUMs then you can do this:
Column A:
=IIF((Fields!WONUM.Value <> "SPECIAL") AND (Fields!WONUM.Value <> "OTHER"), Fields!Hours.Value, 0)
Column B:
=IIF((Fields!WONUM.Value = "SPECIAL") OR (Fields!WONUM.Value = "OTHER"), Fields!Hours.Value, 0)
or use the same formula in each column for consistency and swap the field/0 at the end.

EF 4 - associations with keys that don't match

We're using POCOs and have 2 entities: Item and ItemContact. There are 1 or more contacts per item.
Item has as a primary key:
ItemID
LanguageCode
ItemContact has:
ItemID
ContactID
We can't add an association with a referential constraint as they have differing keys. There isn't a strict primary / foreign key as LanguageCode isn't in ItemContact and ContactID isn't in Item.
How can we go about mapping an association for the contacts of an item if there isn't a direct link, but we still want to see the contacts for an item?
One of the entities originates in a database view so it is not possible to add foreign keys to the database
Thanks
Stephen Ward
In order to create any relationship (in EF or any ORM for that matter) you have to have something to join on.
Because at the moment you don't, you need to fabricate something...
The only option I can think of is to create a relationship, using some of the same techniques described here, to create an SSDL view to back the relationship using a <DefiningQuery> based on a cross product join.
So if you have data like this:
ItemID | LanguageCode
1 | a
and this:
ItemID | ContactID
1 | x
1 | y
1 | z
Then your <DefiningQuery> should have T-SQL that produces something like this:
Item_ItemID | Item_LanguageCode | ItemContact_ItemID | ItemContact_ContactID
1 | a | 1 | x
1 | a | 1 | y
1 | a | 1 | z
Now because this is technically an Independent Association - as opposed to an FK association - you should be able to claim in the CSDL that the cardinality is 1 - * even though there is nothing in the SSDL to constrain it - and stop it from being a * - *.
Hope this helps
Alex