Using Views and CTE on DB2/AS400 - db2

A general question.
I have an employee table (EMPMAST) that holds both current and former employee data. A flag column called Current? is 'Y' if the person is a current employee.
Now I have to select only the current records in my SQLRPGLE program, along with some other criteria (for example, EMPNAME = 'SAM'). What is the best way to handle this in terms of performance and system usage?
1. Create a view over EMPMAST with Current? = 'Y', then use it in the program with the other conditions.
2. Use a CTE (WITH ... AS) in the program that contains the condition Current = 'Y', and query that.
3. Use the table directly, without a CTE or a view.
4. Any other option.

Options 1, 2 and 3 would all perform the same. They would likely all have the same optimized query and access plan.

A CTE and a view are two different things. A view is appropriate for a query that is going to be used in multiple locations; a CTE is only available in the query in which it is defined. I usually don't use a CTE except to replace a complex subquery. In your case the condition is simple enough to be contained in the WHERE clause, so I don't see the need to introduce additional complexity.
Some folks will tell you not to query the table directly in the program, but to always use a view. That way you add an extra layer of insulation between the program and the database, and you can still define record structures with ExtName, and not have to worry about changes to the table unless they affect the view itself. In this case you would likely have a dedicated view for each program that uses the table.
I tend to use a hybrid of these techniques. I query tables, CTEs, or views depending on the situation, and define my record structures explicitly in the program. I prefer to just query the table, but if I have some complex query logic that is unique to the program, I will use a CTE. I do have a few views, but these are limited to queries that occur in multiple programs where I want to ensure the same logic is applied consistently.
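To make the three options concrete, here is a minimal sketch that runs the same filter through a view, a CTE, and a direct query. It uses Python's sqlite3 module as a stand-in for DB2, so it only shows that the three forms return the same rows; it says nothing about the DB2 optimizer. The column name CURRENT_FLAG (in place of Current?), the view name EMPCURR, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# SQLite stands in for DB2 here; table layout and sample rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMPMAST (EMPNAME TEXT, CURRENT_FLAG TEXT);
    INSERT INTO EMPMAST VALUES ('SAM', 'Y'), ('SAM', 'N'), ('ANN', 'Y');
    -- Option 1: a view that hides former employees (hypothetical name)
    CREATE VIEW EMPCURR AS
        SELECT EMPNAME, CURRENT_FLAG FROM EMPMAST WHERE CURRENT_FLAG = 'Y';
""")

# Option 1: query the view and add the program-specific condition.
via_view = conn.execute(
    "SELECT EMPNAME FROM EMPCURR WHERE EMPNAME = ?", ("SAM",)
).fetchall()

# Option 2: put the "current only" condition in a CTE inside the statement.
via_cte = conn.execute(
    """
    WITH CURR AS (SELECT EMPNAME FROM EMPMAST WHERE CURRENT_FLAG = 'Y')
    SELECT EMPNAME FROM CURR WHERE EMPNAME = ?
    """,
    ("SAM",),
).fetchall()

# Option 3: query the table directly with both conditions in the WHERE clause.
direct = conn.execute(
    "SELECT EMPNAME FROM EMPMAST WHERE CURRENT_FLAG = 'Y' AND EMPNAME = ?",
    ("SAM",),
).fetchall()

print(via_view == via_cte == direct)
```

All three return only the current 'SAM' row, so the choice comes down to reuse and insulation rather than results.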

Active Record efficient querying on multiple different tables

Let me give a summary of what I've been attempting to do and the efficiency issues I've been running into:
Essentially I want my users to be able to select parameters to filter data from my database, then I want to pass relevant data which passes those filters from the controller.
However, these filters query data from multiple tables (about 5-6 of them), some of which are quite large (100k+ rows). These tables are all related to what I want to show, e.g. here is a bond that meets such-and-such criteria, which is issued by such-and-such issuer, which must meet these criteria, and so on.
In the end I only really need about 100 rows after querying with the user's parameters, but it feels like I need to look at everything in every table because I don't know beforehand how strict the filters will be. For example, with a starting universe of 100k sets of data, passing filters f1, f2 of table 1 might leave 90k, but after passing through filter f3 of table 2, f4, f5, f6 of table 3, and so on, we might end up with 100 or fewer sets of data, because the last filters checked might be quite strict.
How can I go about querying through these multiple different tables efficiently?
Doing a join between them seems like it'd yield some time complexity of |T_1||T_2||T_3||T_4||T_5||T_6| where T_i is the "size" of table i.
On the other hand, just looking through the other tables based off the ids of the ones that pass the previous filter (as in, id 5,7,8 pass filters in T_1, which of those ids then pass filters in T_2, then which of those pass filters in T_3 and so on) looks like it might(?) have time complexity of |T_1| + |T_2| + ... + |T_6|.
I'm relatively new to Ruby on Rails, so I'm not entirely sure of all the tools at my disposal that could help with optimizing this, and I'm also not sure how best to approach this algorithmically.

Understanding SQL query complexity

I'm currently having trouble understanding why a seemingly simple query is taking much longer to return results than a much more complicated (looking) query.
I have a view, performance_summary (which in turn selects from another view). Currently, within psql, when I run a query like
SELECT section
FROM performance_summary
LIMIT 1;
it takes a minute or so to return a result, whereas a query like
SELECT section, version, weighted_approval_rate
FROM performance_summary
WHERE version in ('1.3.10', '1.3.11') AND section ~~ '%WEST'
ORDER BY 1,2;
gets results almost instantly. Without knowing how the view is defined, is there any obvious or common reason why this is?
Not really, without knowing how the view is defined. It could be that the "more complex" query uses an index to select just two rows and then performs some trivial grouping and sorting on them. The query without the WHERE clause might have PostgreSQL operating on millions of rows, performing trillions of operations, and producing a single row after discarding 999999999 rows. We just don't know unless you post the view definition and the EXPLAIN output for both queries.
You might be falling into the trap of thinking that a view is somehow a cache of info. It isn't. It's a stored query that is inserted into the larger query when you select from it or include it in another query, which means the whole thing must be planned and executed from scratch. Creating a view does not do any pre-planning that later queries then build on. It's more like the definition of the view is pasted into any query that uses it, and the query is then run as if it had just been written there and then.
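A small sketch of the "stored query, not a cache" point, using Python's sqlite3 module as a stand-in for PostgreSQL. It covers ordinary (non-materialized) views only. The table scores, the view section_totals, and the data are invented for illustration and are not the asker's performance_summary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scores (section TEXT, approved INTEGER);
    INSERT INTO scores VALUES ('NORTHWEST', 1);
    -- An ordinary view: only the query text is stored, no rows.
    CREATE VIEW section_totals AS
        SELECT section, SUM(approved) AS approvals FROM scores GROUP BY section;
""")

before = conn.execute("SELECT * FROM section_totals ORDER BY section").fetchall()

# Change the base table after the view already exists.
conn.execute("INSERT INTO scores VALUES ('NORTHWEST', 1), ('SOUTHWEST', 1)")

# Selecting from the view again re-runs its query against the current data.
after = conn.execute("SELECT * FROM section_totals ORDER BY section").fetchall()

print(before)
print(after)
```

Because the view holds no rows of its own, the second SELECT sees the new data without any refresh step. That is also why every SELECT against it pays the full cost of the underlying query.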

ormlite select count(*) as typeCount group by type

I want to do something like this in OrmLite
SELECT *, COUNT(title) as titleCount from table1 group by title;
Is there any way to do this via QueryBuilder without the need for queryRaw?
The documentation states that the use of COUNT() and the like necessitates the use of selectRaw(). I hoped for a way around this - not having to write my SQL as strings is the main reason I chose to use ORMLite.
http://ormlite.com/docs/query-builder
selectRaw(String... columns):
Add raw columns or aggregate functions (COUNT, MAX, ...) to the query. This will turn the query into something only suitable for using as a raw query. This can be called multiple times to add more columns to select. See section Issuing Raw Queries.
Further information on the use of selectRaw() as I was attempting much the same thing:
Documentation states that if you use selectRaw() it will "turn the query into" one that is supposed to be called by queryRaw().
What it does not explain is that, while multiple calls to selectColumns() or selectRaw() are normally valid (if you exclusively use one or the other), use of selectRaw() after selectColumns() has a 'hidden' side-effect of wiping out any selectColumns() you called previously.
I believe that the ORMLite documentation for selectRaw() would be improved by a note that its use is not intended to be mixed with selectColumns().
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectColumns("emailAddress"); // This column is not selected due to later use of selectRaw()!
qb.selectRaw("COUNT (emailAddress)");
ORMLite examples are not as plentiful as I'd like, so here is a complete example of something that works:
QueryBuilder<EmailMessage, String> qb = emailDao.queryBuilder();
qb.selectRaw("emailAddress"); // This can also be done with a single call to selectRaw()
qb.selectRaw("COUNT (emailAddress)");
qb.groupBy("emailAddress");
GenericRawResults<String[]> rawResults = qb.queryRaw(); // Returns results with two columns
Is there any way to do this via QueryBuilder without the need for queryRaw(...)?
The short answer is no because ORMLite wouldn't know what to do with the extra count value. If you had a Table1 entity with a DAO definition, what field would the COUNT(title) go into? Raw queries give you the power to select various fields but then you need to process the results.
With the code right now (v5.1), you can define a custom RawRowMapper and then use the dao.getRawRowMapper() method to process the results for Table1 and tack on the titleCount field by hand.
I've got an idea how to accomplish this in a better way in ORMLite. I'll look into it.
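The idea of processing raw rows and tacking on the count by hand can be sketched outside ORMLite. The following is a Python analogy using sqlite3, not ORMLite's API. The EmailCount class, the map_row helper, and the sample data are all hypothetical. It only illustrates why the extra aggregate column needs a hand-written mapping step.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class EmailCount:
    """Hypothetical result type: an entity column plus the aggregate."""
    email_address: str
    count: int


def map_row(row):
    """Map one raw result row (address, count) onto the result type by hand."""
    address, count = row
    return EmailCount(email_address=address, count=int(count))


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE emailmessage (emailAddress TEXT);
    INSERT INTO emailmessage VALUES
        ('a@example.com'), ('a@example.com'), ('b@example.com');
""")

# The raw query returns plain tuples; nothing knows where the count belongs.
raw_rows = conn.execute(
    "SELECT emailAddress, COUNT(emailAddress) FROM emailmessage "
    "GROUP BY emailAddress ORDER BY emailAddress"
).fetchall()

# The mapping step decides where the extra column goes.
results = [map_row(r) for r in raw_rows]
print(results)
```

The count column has no home in the table's entity, which is why the mapping has to be spelled out explicitly.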

Why to create empty (no rows, no columns) table in PostgreSQL

In an answer to this question I learned that you can create an empty table in PostgreSQL.
create table t();
Is there any real use case for this? Why would you create an empty table? Because you don't know yet what columns it will have?
From my point of view, these are the things a column-less table is good for. They probably fall more into the warm and fuzzy category.
1. One practical use of creating a table before you add any user-defined columns to it is that it lets you iterate fast when creating a new system, or when doing rapid dev iterations in general.
2. Much the same as 1, but it lets you stub out tables that your app logic or procedures can reference, even if the columns have yet to be put in place.
3. I could see it coming in handy where you're at a big company with lots of developers. Maybe you want to reserve a name months in advance, before your work is complete. Just add the new column-less table to the build. Of course they could still hijack it, but you may be able to win the argument that you had it in use well before they came along with their other plans. Kind of fringe, but a valid benefit.
All of these are handy and I miss them when I'm not working in PostgreSQL.
I don't know the precise reason for its inclusion in PostgreSQL, but a zero-column table - or rather a zero-attribute relation - plays a role in the theory of relational algebra, on which SQL is (broadly) based.
Specifically, a zero-attribute relation with no tuples (in SQL terms, a table with no columns and no rows) is the relational equivalent of zero or false, while a relation with no attributes but one tuple (SQL: no columns, but one row, which PostgreSQL does accept, for example via INSERT INTO t DEFAULT VALUES) is true or one. Hugh Darwen, an outspoken advocate of relational theory and critic of SQL, dubbed these "Table Dum" and "Table Dee", respectively.
In normal algebra x + 0 == x and x * 0 == 0, whereas x * 1 == x; the idea is that in relational algebra, Table Dum and Table Dee can be used as similar primitives for joins, unions, etc.
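The join half of that analogy can be checked with a toy model of relations. This is a minimal sketch in plain Python, not how PostgreSQL represents anything. The relation and natural_join helpers and the emp sample relation are made up for illustration.

```python
def relation(attrs, rows):
    """Build a relation: a heading (attribute names) plus a set of tuples."""
    return (frozenset(attrs), {frozenset(r.items()) for r in rows})


def natural_join(r, s):
    """Join two relations on whatever attributes they share."""
    (r_attrs, r_rows), (s_attrs, s_rows) = r, s
    shared = r_attrs & s_attrs
    joined = set()
    for a in r_rows:
        for b in s_rows:
            da, db = dict(a), dict(b)
            # Tuples combine only if they agree on every shared attribute.
            if all(da[k] == db[k] for k in shared):
                joined.add(frozenset({**da, **db}.items()))
    return (r_attrs | s_attrs, joined)


TABLE_DEE = relation([], [{}])  # no attributes, one empty tuple: "true" / 1
TABLE_DUM = relation([], [])    # no attributes, no tuples: "false" / 0

emp = relation(
    ["name", "dept"],
    [{"name": "SAM", "dept": "IT"}, {"name": "ANN", "dept": "HR"}],
)

# Joining with Table Dee behaves like x * 1 == x.
print(natural_join(emp, TABLE_DEE) == emp)
# Joining with Table Dum behaves like x * 0 == 0: same heading, no tuples.
print(natural_join(emp, TABLE_DUM) == relation(["name", "dept"], []))
```

With no shared attributes, every tuple pairs with Dee's single empty tuple and comes back unchanged, while Dum offers nothing to pair with, so the result is empty.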
PostgreSQL internally refers to tables (as well as views and sequences) as "relations", so although it is geared around implementing SQL, which isn't defined by this kind of pure relational algebra, there may be elements of that in its design or history.
It is not an empty table, only an empty result. PostgreSQL rows contain some columns that are invisible by default. I am not sure, but it may be an artifact from the early days, when Postgres was an object-relational database and supported the POSTQUEL language. Such an empty table can serve as an abstract ancestor in a class hierarchy.
List of system columns
I don't think mine is the intended usage, but recently I've used an empty table as a lock for a view that I create and change dynamically with EXECUTE. The function that creates or replaces the view takes ACCESS EXCLUSIVE on the empty table, and the other functions that use the view take ACCESS.

Changed property value when selecting in EF4

I need to change the value of a property when I query the database using EF4. I have a company code that gets returned and I need to translate it to another company code, if needed. So, there is a stored procedure that is used to do this currently. Here's the old select statement.
SELECT companyName, TranslateCompanyCode(companyCode) as newCompanyCode FROM companyTable where companyCode = 'AA';
TranslateCompanyCode is the stored proc that does the translation. I'd like to do this in my new code when needed. I think I might need to use a Model-Defined Function. Anyone know how I can do this?
For your scenario, I would use a JOIN. Model-defined functions are cool when you need to perform a quick function on a value (particularly without an additional query). From a performance standpoint, a JOIN will be faster and more efficient than trying to put the sub-query in a model-defined function - particularly if you are selecting more than 1 row at a time.
However, if you still want to use model-defined functions, this example should point you in the right direction on how to run a query within the function. That implementation will be more complex than a join, but it is an alternative.
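As a rough illustration of the JOIN approach, here is a sketch using Python's sqlite3 module. It assumes the translation can be expressed as a lookup table, which the question does not confirm. The codeTranslation table, its columns, and the sample rows are hypothetical stand-ins for whatever TranslateCompanyCode actually does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companyTable (companyName TEXT, companyCode TEXT);
    INSERT INTO companyTable VALUES ('Acme', 'AA'), ('Bolt', 'BB');
    -- Hypothetical lookup table standing in for TranslateCompanyCode
    CREATE TABLE codeTranslation (oldCode TEXT PRIMARY KEY, newCode TEXT);
    INSERT INTO codeTranslation VALUES ('AA', 'A1');
""")

# LEFT JOIN keeps companies that have no translation entry;
# COALESCE falls back to the original code for those rows.
rows = conn.execute("""
    SELECT c.companyName,
           COALESCE(t.newCode, c.companyCode) AS newCompanyCode
    FROM companyTable AS c
    LEFT JOIN codeTranslation AS t ON t.oldCode = c.companyCode
    ORDER BY c.companyName
""").fetchall()

print(rows)
```

The translation happens in the same statement for every selected row, rather than as one function call per row.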