What's the use Overriding Table Setup Methods? - zend-framework

In Zend Framework you can override table setup methods as explained here:
http://framework.zend.com/manual/en/zend.db.table.html#zend.db.table.defining.setup
My question is what is the use of this? Under what circumstances would this be used?
It seems pointless to me because why would you want to change the primarykey value or tablename of your class to something other than what they actually are?

You might want to use the same Zend_Db_Table class with different scenarios. In my case, I needed to use the same table structure across different schemas and table names for a project. So, I could use the same class to access, for instance:
In schema db1:
table tableA
table tableB
...
In schema db2:
table tableC
table tableD
...
(All tables having the same structure, but different names).
Actually, this flexibility was the main reason why I chose to use Zend Framework for this project.
Hope that helps,

Related

Is it possible to have database-wide table aliases?

I am about to model a PostgreSQL database, based on an Oracle database. The latter is old and its tables have been named after a 3-letter-scheme.
E.g. a table that holds parameters for tasks would be named TSK_PAR.
As I model the new database, I'd like to rename those tables to a more descriptive name using actual words. My problem is, that some parts of the software might rely on these old names until they're rewritten and adapted to the new scheme.
Is it possible to create something like an alias that's being used for the whole database?
E.g. I create a new task_parameters database, but add a TSK_PAR alias to it, so if a SELECT * FROM TSK_PAR is being used, it automatically refers to the new name?
Postgres has no synonyms like Oracle.
But for your intended use case, views should do just fine. A view that simply does select * from taks_parameters is automatically updateable (see here for an online example).
If you don't want to clutter your default schema (usually public) with all those views, you can create them in a different schema, and then adjust the user's search path to include that "synonym schema".
For example:
create schema synonyms;
create table public.task_parameters (
id integer primary key,
....
);
create view synonyms.task_par
as
select *
from public.task_parameters;
However, that approach has one annoying drawback: if a table is used by a view, the allowed DDL statements on it are limited, e.g. you can't drop a column or rename it.
As we manage our schema migrations using Liquibase, we always drop all views before applying "normal" migrations, then once everything is done, we simply re-create all views (by running the SQL scripts stored in Git). With that approach, ALTER TABLE statements never fail because there are not views using the tables. As creating a view is really quick, it doesn't add overhead when deploying a migration.

Creating a "table of tables" in PostgreSQL or achieving similar functionality?

I'm just getting started with PostgreSQL, and I'm new to database design.
I'm writing software in which I have various plugins that update a database. Each plugin periodically updates its own designated table in the database. So a plugin named 'KeyboardPlugin' will update the 'KeyboardTable', and 'MousePlugin' will update the 'MouseTable'. I'd like for my database to store these 'plugin-table' relationships while enforcing referential integrity. So ideally, I'd like a configuration table with the following columns:
Plugin-Name (type 'text')
Table-Name (type ?)
My software will read from this configuration table to help the plugins determine which table to update. Originally, my idea was to have the second column (Table-Name) be of type 'text'. But then, if someone mistypes the table name, or an existing relationship becomes invalid because of someone deleting a table, we have problems. I'd like for the 'Table-Name' column to act as a reference to another table, while enforcing referential integrity.
What is the best way to do this in PostgreSQL? Feel free to suggest an entirely new way to setup my database, different from what I'm currently exploring. Also, if it helps you answer my question, I'm using the pgAdmin tool to setup my database.
I appreciate your help.
I would go with your original plan to store the name as text. Possibly enhanced by additionally storing the schema name:
addin text
,sch text
,tbl text
Tables have an OID in the system catalog (pg_catalog.pg_class). You can get those with a nifty special cast:
SELECT 'myschema.mytable'::regclass
But the OID can change over a dump / restore. So just store the names as text and verify the table is there by casting it like demonstrated at application time.
Of course, if you use each tables for multiple addins it might pay to make a separate table
CREATE TABLE tbl (
,tbl_id serial PRIMARY KEY
,sch text
,name text
);
and reference it in ...
CREATE TABLE addin (
,addin_id serial PRIMARY KEY
,addin text
,tbl_id integer REFERENCES tbl(tbl_id) ON UPDATE CASCADE ON DELETE CASCADE
);
Or even make it an n:m relationship if addins have multiple tables. But be aware, as #OMG_Ponies commented, that a setup like this will require you to execute a lot of dynamic SQL because you don't know the identifiers beforehand.
I guess all plugins have a set of basic attributes and then each plugin will have a set of plugin-specific attributes. If this is the case you can use a single table together with the hstore datatype (a standard extension that just needs to be installed).
Something like this:
CREATE TABLE plugins
(
plugin_name text not null primary key,
common_int_attribute integer not null,
common_text_attribute text not null,
plugin_atttributes hstore
)
Then you can do something like this:
INSERT INTO plugins
(plugin_name, common_int_attribute, common_text_attribute, hstore)
VALUES
('plugin_1', 42, 'foobar', 'some_key => "the fish", other_key => 24'),
('plugin_2', 100, 'foobar', 'weird_key => 12345, more_info => "10.2.4"');
This creates two plugins named plugin_1 and plugin_2
Plugin_1 has the additional attributes "some_key" and "other_key", while plugin_2 stores the keys "weird_key" and "more_info".
You can index those hstore columns and query them very efficiently.
The following will select all plugins that have a key "weird_key" defined.
SELECT *
FROM plugins
WHERE plugin_attributes ? 'weird_key'
The following statement will select all plugins that have a key some_key with the value the fish:
SELECT *
FROM plugins
WHERE plugin_attributes #> ('some_key => "the fish"')
Much more convenient than using an EAV model in my opinion (and most probably a lot faster as well).
The only drawback is that you lose type-safety with this approach (but usually you'd lose that with the EAV concept as well).
You don't need an application catalog. Just add the application name to the keys of the table. This of course assumes that all the tables have the same structure. If not: use the application name for a table name, or as others have suggested: as a schema name( which also would allow for multiple tables per application).
EDIT:
But the real issue is of course that you should first model your data, and than build the applications to manipulate it. The data should not serve the code; the code should serve the data.

Entity Framework 4 - Use a single mapping table for all many-to-many relations

I would like to map all many-to-may relations through a single table in my database.
Meaning that I have numerous tables (entities) that have various many-to-many relations. Instead of having a separate mapping table for every relation I would like to use one "master mapping" table having to columns: End1Id & End2Id.
Don't ask why ;) It's required by my customer...
How would I set this up in the model designer, or do I have to edit the edmx xml directly....or is it just not possible?
Thanx for your help!
In such a scenario you can't have explicit foreign keys, because a table like this normally has at least three rows:
PK of table 1
PK of table 2
Type of mapping, which specifies the exact tables to use.
Because of that, you can just create a table in EF, but it will also have no connections to other tables and you will have to do the joins manually.
You would need to set this Master Mappings table manually. The designer doesn't do it for you automatically.
However - if denormalized entities are what you are looking for, better have those denormalized in DB level rather than in EF/code level.

When to use inherited tables in PostgreSQL?

In which situations you should use inherited tables? I tried to use them very briefly and inheritance didn't seem like in OOP world.
I thought it worked like this:
Table users has all fields required for all user levels. Tables like moderators, admins, bloggers, etc but fields are not checked from parent. For example users has email field and inherited bloggers has it now too but it's not unique for both users and bloggers at the same time. ie. same as I add email field to both tables.
The only usage I could think of is fields that are usually used, like row_is_deleted, created_at, modified_at. Is this the only usage for inherited tables?
There are some major reasons for using table inheritance in postgres.
Let's say, we have some tables needed for statistics, which are created and filled each month:
statistics
- statistics_2010_04 (inherits statistics)
- statistics_2010_05 (inherits statistics)
In this sample, we have 2.000.000 rows in each table. Each table has a CHECK constraint to make sure only data for the matching month gets stored in it.
So what makes the inheritance a cool feature - why is it cool to split the data?
PERFORMANCE: When selecting data, we SELECT * FROM statistics WHERE date BETWEEN x and Y, and Postgres only uses the tables, where it makes sense. Eg. SELECT * FROM statistics WHERE date BETWEEN '2010-04-01' AND '2010-04-15' only scans the table statistics_2010_04, all other tables won't get touched - fast!
Index size: We have no big fat table with a big fat index on column date. We have small tables per month, with small indexes - faster reads.
Maintenance: We can run vacuum full, reindex, cluster on each month table without locking all other data
For the correct use of table inheritance as a performance booster, look at the postgresql manual.
You need to set CHECK constraints on each table to tell the database, on which key your data gets split (partitioned).
I make heavy use of table inheritance, especially when it comes to storing log data grouped by month. Hint: If you store data, which will never change (log data), create or indexes with CREATE INDEX ON () WITH(fillfactor=100); This means no space for updates will be reserved in the index - index is smaller on disk.
UPDATE:
fillfactor default is 100, from http://www.postgresql.org/docs/9.1/static/sql-createtable.html:
The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default
"Table inheritance" means something different than "class inheritance" and they serve different purposes.
Postgres is all about data definitions. Sometimes really complex data definitions. OOP (in the common Java-colored sense of things) is about subordinating behaviors to data definitions in a single atomic structure. The purpose and meaning of the word "inheritance" is significantly different here.
In OOP land I might define (being very loose with syntax and semantics here):
import life
class Animal(life.Autonomous):
metabolism = biofunc(alive=True)
def die(self):
self.metabolism = False
class Mammal(Animal):
hair_color = color(foo=bar)
def gray(self, mate):
self.hair_color = age_effect('hair', self.age)
class Human(Mammal):
alcoholic = vice_boolean(baz=balls)
The tables for this might look like:
CREATE TABLE animal
(name varchar(20) PRIMARY KEY,
metabolism boolean NOT NULL);
CREATE TABLE mammal
(hair_color varchar(20) REFERENCES hair_color(code) NOT NULL,
PRIMARY KEY (name))
INHERITS (animal);
CREATE TABLE human
(alcoholic boolean NOT NULL,
FOREIGN KEY (hair_color) REFERENCES hair_color(code),
PRIMARY KEY (name))
INHERITS (mammal);
But where are the behaviors? They don't fit anywhere. This is not the purpose of "objects" as they are discussed in the database world, because databases are concerned with data, not procedural code. You could write functions in the database to do calculations for you (often a very good idea, but not really something that fits this case) but functions are not the same thing as methods -- methods as understood in the form of OOP you are talking about are deliberately less flexible.
There is one more thing to point out about inheritance as a schematic device: As of Postgres 9.2 there is no way to reference a foreign key constraint across all of the partitions/table family members at once. You can write checks to do this or get around it another way, but its not a built-in feature (it comes down to issues with complex indexing, really, and nobody has written the bits necessary to make that automatic). Instead of using table inheritance for this purpose, often a better match in the database for object inheritance is to make schematic extensions to tables. Something like this:
CREATE TABLE animal
(name varchar(20) PRIMARY KEY,
ilk varchar(20) REFERENCES animal_ilk NOT NULL,
metabolism boolean NOT NULL);
CREATE TABLE mammal
(animal varchar(20) REFERENCES animal PRIMARY KEY,
ilk varchar(20) REFERENCES mammal_ilk NOT NULL,
hair_color varchar(20) REFERENCES hair_color(code) NOT NULL);
CREATE TABLE human
(mammal varchar(20) REFERENCES mammal PRIMARY KEY,
alcoholic boolean NOT NULL);
Now we have a canonical reference for the instance of the animal that we can reliably use as a foreign key reference, and we have an "ilk" column that references a table of xxx_ilk definitions which points to the "next" table of extended data (or indicates there is none if the ilk is the generic type itself). Writing table functions, views, etc. against this sort of schema is so easy that most ORM frameworks do exactly this sort of thing in the background when you resort to OOP-style class inheritance to create families of object types.
Inheritance can be used in an OOP paradigm as long as you do not need to create foreign keys on the parent table. By example, if you have an abstract class vehicle stored in a vehicle table and a table car that inherits from it, all cars will be visible in the vehicle table but a foreign key from a driver table on the vehicle table won't match theses records.
Inheritance can be also used as a partitionning tool. This is especially usefull when you have tables meant to be growing forever (log tables etc).
Main use of inheritance is for partitioning, but sometimes it's useful in other situations. In my database there are many tables differing only in a foreign key. My "abstract class" table "image" contains an "ID" (primary key for it must be in every table) and PostGIS 2.0 raster. Inherited tables such as "site_map" or "artifact_drawing" have a foreign key column ("site_name" text column for "site_map", "artifact_id" integer column for the "artifact_drawing" table etc.) and primary and foreign key constraints; the rest is inherited from the the "image" table. I suspect I might have to add a "description" column to all the image tables in the future, so this might save me quite a lot of work without making real issues (well, the database might run little slower).
EDIT: another good use: with two-table handling of unregistered users, other RDBMSs have problems with handling the two tables, but in PostgreSQL it is easy - just add ONLY when you are not interrested in data in the inherited "unregistered user" table.
The only experience I have with inherited tables is in partitioning. It works fine, but it's not the most sophisticated and easy to use part of PostgreSQL.
Last week we were looking the same OOP issue, but we had too many problems with Hibernate - we didn't like our setup, so we didn't use inheritance in PostgreSQL.
I use inheritance when I have more than 1 on 1 relationships between tables.
Example: suppose you want to store object map locations with attributes x, y, rotation, scale.
Now suppose you have several different kinds of objects to display on the map and each object has its own map location parameters, and map parameters are never reused.
In these cases table inheritance would be quite useful to avoid having to maintain unnormalised tables or having to create location id’s and cross referencing it to other tables.
I tried some operations on it, I will not point out if is there any actual use case for database inheritance, but I will give you some detail for making your decision. Here is an example of PostgresQL: https://www.postgresql.org/docs/15/tutorial-inheritance.html
You can try below SQL script.
CREATE TABLE IF NOT EXISTS cities (
name text,
population real,
elevation int -- (in ft)
);
CREATE TABLE IF NOT EXISTS capitals (
state char(2) UNIQUE NOT NULL
) INHERITS (cities);
ALTER TABLE cities
ADD test_id varchar(255); -- Both table would contains test col
DROP TABLE cities; -- Cannot drop because capitals depends on it
ALTER TABLE cities
ADD CONSTRAINT fk_test FOREIGN KEY (test_id) REFERENCES sometable (id);
As you can see my comments, let me summarize:
When you add/delete/update fields -> the inheritance table would also be affected.
Cannot drop the parent table.
Foreign keys would not be inherited.
From my perspective, in growing applications, we cannot easily predict the changes in the future, for me I would avoid applying this to early database developing.
When features are stable as well and we want to create some database model which much likely the same as the existing one, we can consider that use case.
Use it as little as possible. And that usually means never, it boiling down to a way of creating structures that violate the relational model, for instance by breaking the information principle and by creating bags instead of relations.
Instead, use table partitioning combined with proper relational modelling, including further normal forms.

Why EclipseLink is adding discriminator column for joined inheritance strategy?

I am using JOINED inheritance strategy with EclipseLink JPA implementation. I have noticed that EclipseLink is adding discriminator column, named by default DTYPE, to the database schema. I understand, that discriminator is needed for one table inheritance strategy, but why for JOINED strategy?
EclipseLink needs this column because I've got errors after removing it. Is this column added for performance reasons, etc? I am not particularly happy about that since from the point of view of database schema this column is just unnecessary clutter.
Hibernate based JPA does not do anything similar.
From Joined Table Inheritance:
In the joined table inheritance, each
class shares data from the root table.
In addition, each subclass defines its
own table that adds its extended
state. The following example shows two
tables, PROJECT and L_PROJECT, as well
as two classes, Project and
LargeProject:
...
The discriminator column is what determines the type and thus what joined table to use so you need a discriminator column in the parent table.