In what scenarios would I need to use the CREATEREF, DEREF and REF keywords? - entity-framework

This question is about why I would use the above keywords. I've found plenty of MSDN pages that explain how. I'm looking for the why.
What query would I be trying to write that means I need them? I ask because the examples I have found appear to be achievable in other ways...
To try and figure it out myself, I created a very simple entity model using the Employee and EmployeePayHistory tables from the AdventureWorks database.
One example I saw online demonstrated something similar to the following Entity SQL:
SELECT VALUE
DEREF(CREATEREF(AdventureWorksEntities3.Employee, row(h.EmployeeID))).HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
This seems to pull back the HireDate without having to specify a join?
Why is this better than the SQL below (that appears to do exactly the same thing)?
SELECT VALUE
h.Employee.HireDate
FROM
AdventureWorksEntities3.EmployeePayHistory as h
Looking at the above two statements, I can't work out what extra the CREATEREF, DEREF bit is adding since I appear to be able to get at what I want without them.
I'm assuming I have just not found the scenarios that demostrate the purpose. I'm assuming there are scenarios where using these keywords is either simpler or is the only way to accomplish the required result.
What I can't find is the scenarios....
Can anyone fill in the gap? I don't need entire sets of SQL. I just need a starting point to play with i.e. a brief description of a scenario or two... I can expand on that myself.

Look at this post
One of the benefits of references is that it can be thought as a ‘lightweight’ entity in which we don’t need to spend resources in creating and maintaining the full entity state/values until it is really necessary. Once you have a ref to an entity, you can dereference it by using DEREF expression or by just invoking a property of the entity

TL;DR - REF/DEREF are similar to C++ pointers. It they are references to persisted entities (not entities which have not be saved to a data source).
Why would you use such a thing?: A reference to an entity uses less memory than having the DEFEF'ed (or expanded; or filled; or instantiated) entity. This may come in handy if you have a bunch of records that have image information and image data (4GB Files stored in the database). If you didn't use a REF, and you pulled back 10 of these entities just to get the image meta-data, then you'd quickly fill up your memory.
I know, I know. It'd be easier just to pull back the metadata in your query, but then you lose the point of what REF is good for :-D

Related

TYPO3 backend workflow when avoiding the storage of data in intermediate table

I have a situation as described in the ExtbaseFluid book:
I would like to store information in the intermediate table which is not recommended at all.
Here is a cite from the warning box of the above linked book chapter:
Do not store data in the Intermediate Table that concern the Domain. Though TYPO3 supports this (especially in combination with Inline Relational Record Editing (IRRE) but this is always a sign that further improvements can be made to your Domain Model. Intermediate Tables are and should always be tools for storing relationships and nothing else.
Let’s say you want to store a CD with its containing music tracks: CD -- m:n (Intermediate Table) -- Song. The track number may be stored in a field of the Intermediate Table. However, the track should be stored as a separate domain object, and the connection be realized as CD -- 1:n -- Track -- n:1 -- Song.
So I want not to do what is not recommended. But thinking about the workflow for the editor that results of the recommended solution rises a few question for me.
To stay with this example I would need the following tables:
tx_extname_domain_model_cd
tx_extname_domain_model_cd_track_mm
tx_extname_domain_model_track (which holds the track number)
tx_extname_domain_model_track_song_mm
tx_extname_domain_model_song
From what I know this would end in the situation that the editor would need to create following records:
one record for the cd
one record for the song
now the editor can create one record for the track.
There the track number is added.
Furthermore the cd record needs to be assigned as well as the song.
So here are my questions:
I guess this workflow cannot be improved with some (to me unknown) TCA setup?
An editor cannot directly reach the song when the cd record is opened?
Instead first she / he has to open the track record and can from there navigate to the song?
Is it really that bad to store data in the intermediate table? The TYPO3 table sys_file_reference does the same!? But I wonder how those data could be shown (because IRRE is not possible because it shall only be used for 1:n relations (source).
The question you have to ask yourself is: Do I want to do coding by the book, or do I want to create a pragmatic approach to solve a customer's problem?
In this specific case the additional problem is, that the people who originally invented Extbase had a quite sophisticated and academic approach, but when it comes to a pragmatic use and performance, they were blocked by their own rules and stuck with coding by the book.
Especially this example and the warning message shows a way of thinking that was one of the reasons, why I never actually used Extbase but went for Core-API methods to create performant and pragmatic queries to get the desired result sets. Now that we've got Doctrine under the hood, this works like a charm even with quite exotic DB flavors.
Of course intermediate tables are a good idea and of course those intermediate tables can and should be enriched with additional data fields, that do not require a 3rd, 4th or nth table to store i.e. a simple set of dropdown options, since this can easily be handled with attributes configured in TCA, as it is shown here: https://docs.typo3.org/m/typo3/reference-tca/master/en-us/ColumnsConfig/Type/Inline/Examples.html
sys_file_reference is the most prominent example since it provides exactly that kind of additional information that should not be pumped into additional tables - and guess what, the TYPO3 core does not make use of a single line of Extbase code to deal with that data or almost any other data of the core tables.
To answer your last question: Take a look at the good old IRRE Tutorial to get a clue how to do m:n connections with intermediate inline tables.
https://docs.typo3.org/typo3cms/extensions/irre_tutorial/0.4.0/Manual/Index.html#intermediate-tables-for-m-n-relations
Depends on the issue, sometimes the intermediate table is an entity, sometimes not. In this example the intermediate table is the track, which would contain: [uid, cd, song, track_no, ... (whatever else needed to define the track)]
Be carefull when you define your data, that you do not make it too advanced.

SQL Database Design - Flag or New Table?

Some of the Users in my database will also be Practitioners.
This could be represented by either:
an is_practitioner flag in the User table
a separate Practitioner table with a user_id column
It isn't clear to me which approach is better.
Advantages of flag:
fewer tables
only one id per user (hence no possibility of confusion, and also no confusion in which id to use in other tables)
flexibility (I don't have to decide whether fields are Practitioner-only or not)
possible speed advantage for finding User-level information for a practitioner (e.g. e-mail address)
Advantages of new table:
no nulls in the User table
clearer as to what information pertains to practitioners only
speed advantage for finding practitioners
In my case specifically, at the moment, practitioner-related information is generally one-to-many (such as the locations they can work in, or the shifts they can work, etc). I would not be at all surprised if it turns I need to store simple attributes for practitioners (i.e., one-to-one).
Questions
Are there any other considerations?
Is either approach superior?
You might want to consider the fact that, someone who is a practitioner today, is something else tomorrow. (And, by that I don't mean, not being a practitioner). Say, a consultant, an author or whatever are the variants in your subject domain, and you might want to keep track of his latest status in the Users table. So it might make sense to have a ProfType field, (Type of Professional practice) or equivalent. This way, you have all the advantages of having a flag, you could keep it as a string field and leave it as a blank string, or fill it with other Prof.Type codes as your requirements grow.
You mention, having a new table, has the advantage for finding practitioners. No, you are better off with a WHERE clause on the users table for that.
Your last paragraph(one-to-many), however, may tilt the whole choice in favour of a separate table. You might also want to consider, likely number of records, likely growth, criticality of complicated queries etc.
I tried to draw two scenarios, with some notes inside the image. It's really only a draft just to help you to "see" the various entities. May be you already done something like it: in this case do not consider my answer please. As Whirl stated in his last paragraph, you should consider other things too.
Personally I would go for a separate table - as long as you can already identify some extra data that make sense only for a Practitioner (e.g.: full professional title, University, Hospital or any other Entity the Practitioner is associated with).
So in case in the future you discover more data that make sense only for the Practitioner and/or identify another distinct "subtype" of User (e.g. Intern) you can just add fields to the Practitioner subtable, or a new Table for the Intern.
It might be advantageous to use a User Type field as suggested by #Whirl Mind above.
I think that this is just one example of having to identify different type of Objects in your DB, and for that I refer to one of my previous answers here: Designing SQL database to represent OO class hierarchy

Can I avoid a relation loop in my database design?

I try to design database tables for the case shown below. I also have an account defined, but it's not important regarding my problem.
There is a list of operations (expenses). Each operation can take place in specified POI, places can be grouped in chains (optional). Each operation can have a recipient, specifically a shop chain.
My current design looks like below. I could even remove chain table in favor of direct reference to recipient, but it still leaves a loop between tables. Effectively, single row could contain references to place and receiving account having different recipient defined.
The only solution I can see is a table check to exclude described case, but I'm wondering: is there a better fix?
As far as I can tell there isn't anything fundamentally wrong with your design. There's no need to change it just because it contains a loop. The loop in this case doesn't even appear to be a circular dependency. If you believe your current design accurately models what it is intended to then I see no need to change it.

Core Data Structure - avoiding circular reference?

I just wanted to validate my data structure.
It seems a bit convoluted to me, maybe it can be simplified?
Questions are grouped into chapters.
For each question, only one answer per session is possible.
The purpose is to be able to compare / analyze answers to the same questions (by different users or by the same users at different times, i.e. with different sessions).
A template, being a collection of chapters & questions, should not have to be replicated, if chapters and questions are the same.
(That would be necessary if Answer did not have a relationship to Session.)
Is the relationship from Answer back to Session the right strategy?
What else would you improve to simplify the model?
Thank you!
EDIT
Follow-up clarification:
The Answer is not static (e.g. "right" answer, "solution"), but some text the user inputs. It is more like a "questionnaire" than a "quiz". The answer has quantitative attributes that can be analyzed.
As stated, one question can have only one answer within a session. Because questions can indirectly belong to more than one session (via (NSSet*) question.chapter.template.sessions), they could have more than one answers and thus need a to-many relationship.
The typical scenario: User starts a new session with a certain template and fills out the answers. Then he can look at the analysis of the results and compare those with the results of other sessions that use the same template.
EDIT 2
The snapshot of the data model including attributes
honestly, this is what I would do instead of your structure, but I don't know what the purpose of the each entity because I'm not able to find out from their simple names.
this is just an idea to resolve the loop.
you can still reach all templates and all answers from the session, not directly but it does not make your life much harder.
UPDATE:
at the first and second sight, for me, it seems the Session entity is just an extra entity only here. honestly you would not need it, if you concatenate with the Template (aka Questionnaire) entity.
you have to add a many-to-many relationship between the Template and User (you can do it, don't worry about it). using this way, from each template you can reach all answers as well, and you won't have any loop.
Despite the really helpful effort by the part of #holex - the best way still seems to be to stick with my design. The simplifications I had hoped for have not materialized.

Multiple Models

I like knockoutjs, the sooner we get rid of coding directly toward the DOM the better. I'm having trouble understanding how I would do something which I'm going to explain in terms of a question/answer site. (This is probably a general MVC/MVVM question)
In my data model I have a question[id, description] and answer[id, question_id, text]. The browser requests a list of questions which is bound to a tbody, one column will display the question description, while the other should be bound to an answer textbox.
One obvious way of doing this is to have a QuestionAnswer[question_id, answer_id, question_descrition, answer_text] model. Ideally I'd like to keep them separate to minimize transformation when sending/receiving/storing, if there isn't some way of keeping them separate then I have the following question:
Where is the ideal place to create the QuestionAnswer model ? My bet is that by convention its created on the server.
If there is such an example somewhere please point me to it, otherwise I think it would make a good example.
Please help me wrap my head around this, Thanks!
What you could do is to create the combined model on the server, serialize it to json and then use the mapping plugin to add the serialized list to the view model.
I'm doing that here only it isn't a combined model, but shouldn't make any difference. Especially since it seems like your relation is one-to-one.
If you need to create an "object" in your view model, you can use the mapping definition to do so, like I do here.
I use C# to build my model on the server, but I guess you can use whatever you are comfortable with.
The cool thing with the mapping plugin is that it adds the data to the view model so that you can focus on behaviour.
Ok,
I'v gathered my thoughts on what my question is actually asking.
To do data binding on the client side you obviously need your data model there as well. I was conflicted on what I needed to send over and at what time.
To continue with the Question/Answer site idea: Sending down a list of answers each of which have a question in them is what should be done. That way you can bind to the answer list and simply bind the question description of each answer to the first table column.
If later I want to make a question editor I would potentially send a complete different data structure down and not reuse the Answer has a Question structure previously used.
I thought there might be a way of sending down a more complex data structure that references itself. Which apparently is possible in JSon with some extra libraries.