How do I restrict the number of columns used in an INSERT when using EntityFramework? - entity-framework

Let's say you have an Entity with 26 columns. It matches the corresponding table which also has 26 columns.
From time to time I would like to be able to send fewer columns in an INSERT (Add) operation than are specified in the entity because of certain business rules (In our case we have a trigger on a table that will automatically populate certain fields with data. We routinely leave those columns out of our INSERT statements)
I know that I can use DTOs to restrict the number of columns returned, but how do I restrict the number of columns sent?

If there are operations that would insert entities that only provide a subset of columns (non-null-able for example) then you can consider using a bounded context with an entity declaration for just those applicable columns. The bounded context is a smaller, single-purpose context for reading and writing data since a single EF context does not support multiple entity definitions to a single table.

Related

Postgresql - Return column subset from cursor

I have a legacy stored procedure returning a number (row count) a cursor with many columns; I need to retrieve a subset of the selected columns. I can think of three ways of doing it:
Invoke the existing procedure from the outside, and map columns to my own data structures trimming unneeded columns;
Write a new stored procedure, mostly identical to the existing one but returning different columns;
Write a new stored procedure, invoking the old one internally and filtering columns (the referenced entities and thus the number of rows are exactly the same as the existing procedure).
Number 2 is obviously a no-go.
Number 1 is viable. As far as I know, there is little difference in the computing cost between retrieving one or more columns, in that the engine has to read full rows regardless, before filtering unrequired columns; I do have a feeling it would be heavier on the runtime invoking the procedure from the outside, as objects representing unneeded columns would exist on returning from the DB call.
I would be interested in implementing Number 3, but I would prefer to maintain the same return type as the existing function (count + refcursor) for conformity.
I think I could transfer all the rows in the cursor returned by the existing function into a temporary table as described e.g. in this question, and use it as a source for the output cursor but:
I am not sure of how the output cursor would behave with a temporary table created with a drop-on-commit clause (would the results exist reliably after the procedure has terminated? Would the temporary table be dropped as expected?);
I read that temporary tables are expensive to use, and it feels like overkill for what in the end is a filtering of columns on the same rows from a pre-computed result.
Is there a way to query the existing cursor so that it may be used as a source for the output cursor, while filtering columns?

MATLAB- Joining tables w/ overlapping data using key variable WHERE neither table contains all data points from the other one

I am working on combining 2 tables with different types of patient information using the PID (Patient Identity) feature present in both tables. Usually the function "join" (https://www.mathworks.com/help/matlab/ref/table.join.html) does the trick when one of the tables have information on all the patients from the other one. But in my case, both tables have certain values of PID (or information for new patients) that isn't present in the other one. How do I create a new table for using patient info from both tables that only contains info on the patients present in both tables?
I could probably write some long, clunky code to do this manually, but I was wondering if there's a function (or a few functions) that can do the task more efficiently. Thank you
The solution is to use either innerjoin or outerjoin.

What is the purpose of dividing rows into columnfamilies if they can have different number/types of columns anyway?

Given that a column family can have rows with arbitrary structure we could store all rows in a single "store" (avoiding the name 'columnfamily/table' on purpose).
What is the purpose of column families then?
The simplest of all reasons is evident in the name itself "Column Family". A Column Family groups a bunch of related columns together. You could consider it as a namespace containing related columns.
For example the Column "Name" by itself lacks context, which can be provided by ColumnFamilies like "Employees" or "Cities". Or each Column would need to carry all of it's context by itself with no concept of related Columns.
Atomicity
In Cassandra 1.1 and below, the only atomic guarantee you have is that writes to the same row (i.e. with the same key) will be atomic.
Thus, you think very carefully about what you want in your columns, and what row those columns should be in so that your application will behave appropriately if a write fails.
Reasons:
To have a different sort order for the columns within a row. The comparator is specified at column family creation time and can't be changed afterwards. So if you have rows which columns must be sorted alphabetically or numerically you have to create different column families.
Customize the storage options that can be set on per column family basis. E.g. caching or rows, compaction, deletion of expired columns, etc. Per column family storage options can be found here
Can't mix counter and non-counter columns in the same column family
As mentioned in other answers, due to logical cohesion - columns represent attributes of some entity identified by the row id.

stored procedure returns data of varying dimensions Entity Framework 4

for the reporting of a survey system I am working on, we developed a stored procedure that returns data with varying number of columns.
we show the operator all the columns from these tables: Users, Questions, Answers.
the user selects the columns from each of the tables that the report should show.
for example:
User: Name, Age, zipcode.
Questions: question2, question 4
Answers: answer2, answer3, answer4.
we then pass the parameters to the stored procedure and the stored procedure returns:
one column for each user property, question or answer.
and a row for each user in the DB.
example:
as you can see, the stored procedure can return anything between 3 rows of 2 columns to 500 rows of 50 columns. Is there a way to use the stored procedure with entity framework? at first I tried with a complex return type, but it appears that that approach will not work in this case.
EF supports only stored procedures with fixed number of columns defined at design time. To execute this procedure you need to use plain old ADO.NET.
EDIT: If you have fixed total nuber of colums (you mentioned 50) you can create single class containing all these columns and use it as result for execution. EF will fill only properties existing in the result set.

DB2 Auto generated Column / GENERATED ALWAYS pros and cons over sequence

Earlier we were using 'GENERATED ALWAYS' for generating the values for a primary key. But now it is suggested that we should, instead of using 'GENERATED ALWAYS' , use sequence for populating the value of primary key. What do you think can be the reason of this change? It this just a matter of choice?
Earlier Code:
CREATE TABLE SCH.TAB1
(TAB_P INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY (START WITH 1, INCREMENT BY 1, NO CACHE),
.
.
);
Now it is
CREATE TABLE SCH.TAB1
(TAB_P INTEGER ),
.
.
);
now while inserting, generate the value for TAB_P via sequence.
I tend to use identity columns more than sequences, but I'll compare the two for you.
Sequences can generate numbers for any purpose, while an identity column is strictly attached to a column in a table.
Since a sequence is an independent object, it can generate numbers for multiple tables (or anything else), and is not affected when any table is dropped. When a table with a identity column is dropped, there is no memory of what value was last assigned by that identity column.
A table can have only one identity column, so if you want to want to record multiple sequential numbers into different columns in the same table, sequence objects can handle that.
The most common requirement for a sequential number generator in a database is to assign a technical key to a row, which is handled well by an identity column. For more complicated number generation needs, a sequence object offers more flexibility.
This might probably be to handle ids in case there are lots of deletes on the table.
For eg: In case of identity, if your ids are
1
2
3
Now if you delete record 3, your table will have
1
2
And then if your insert a new record, the ids will be
1
2
4
As opposed to this, if you are not using an identity column and are generating the id using code, then after delete for the new insert you can calculate id as max(id) + 1, so the ids will be in order
1
2
3
I can't think of any other reason, why an identity column should not be used.
Heres something I found on the publib site:
Comparing IDENTITY columns and sequences
While there are similarities between IDENTITY columns and sequences, there are also differences. The characteristics of each can be used when designing your database and applications.
An identity column has the following characteristics:
An identity column can be defined as
part of a table only when the table
is created. Once a table is created,
you cannot alter it to add an
identity column. (However, existing
identity column characteristics might
be altered.)
An identity column
automatically generates values for a
single table.
When an identity
column is defined as GENERATED
ALWAYS, the values used are always
generated by the database manager.
Applications are not allowed to
provide their own values during the
modification of the contents of the
table.
A sequence object has the following characteristics:
A sequence object is a database
object that is not tied to any one
table.
A sequence object generates
sequential values that can be used in
any SQL or XQuery statement.
Since a sequence object can be used
by any application, there are two
expressions used to control the
retrieval of the next value in the
specified sequence and the value
generated previous to the statement
being executed. The PREVIOUS VALUE
expression returns the most recently
generated value for the specified
sequence for a previous statement
within the current session. The NEXT
VALUE expression returns the next
value for the specified sequence. The
use of these expressions allows the
same value to be used across several
SQL and XQuery statements within
several tables.
While these are not all of the characteristics of these two items, these characteristics will assist you in determining which to use depending on your database design and the applications using the database.
I don't know why anyone would EVER use an identity column rather than a sequence.
Sequences accomplish the same thing and are far more straight forward. Identity columns are much more of a pain especially when you want to do unloads and loads of the data to other environments. I not going to go into all the differences as that information can be found in the manuals but I can tell you that the DBA's have to almost always get involved anytime a user wants to migrate data from one environment to another when a table with an identity is involved because it can get confusing for the users. We have no issues when a sequence is used. We allow the users to update any schema objects so they can alter their sequences if they need to.