Enterprise Architect DDL generation order - code-generation

I currently work on a pretty large project with a pretty large database. There are many constraint requirements between tables, so when I generate DDL from EA, the statements come out in the wrong order, leading to errors.
I have the tables separated into folders by module, and the modules and the tables inside those folders are sorted in the required order, like:
MODULE 1
|_ Table 1
|_ Table 2
MODULE 2
|_ Table 3
|_ Table 4
but when I try to generate DDL, the order comes out all messed up, like:
Table 2
Table 4
Table 3
Table 1
Except that in reality there are 66 of these tables, so manually sorting them in the DDL Generation GUI would be a real pain.
In addition, the generated order does not seem to follow any intuitive sorting criteria. The list is not sorted by name, not by module (folder), and not even by time of creation, as some of the newest tables sit comfortably in the middle. The only pattern I found is that sequences are generated first, but they are not even a real concern to me.
Is there any way to avoid manually sorting the generated output and follow the Project Browser order automatically (without resorting to a workaround that is even more complicated, like writing my own parser)?
Thank you in advance.

Related

Ignoring space characters when linking tables

I’m experiencing a problem when trying to link two tables in the Database Expert. The two fields that link the tables have exactly the same information, except one table always has an additional space. For example:
Table 1 = Multivitamin/Tablets
Table 2 = Multivitamin//Tablets
(‘/’ represents a space)
Formulas won’t help (e.g. ExtractString etc.), as it’s the tables themselves I need to link together.
This is preventing me from retrieving the information I need. Any advice on how I can get around this?
There are a few ways to work around this:
Consider using a command as the datasource instead of tables. When writing the command's query you can define the join condition yourself (see the sketch at the end of this list).
If you have access to the data source, you could add a calculated field to the tables to contain the normalized field values and then use these for linking in CR.
Alternatively, one could create views in the database, either adding normalized "linking fields" or providing the joined tables results.
If it's only a few rows in CR, you could consider using SQL fields or subreports to retrieve data from Table 2.
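For the first option, the command's query can normalize the extra space directly in the join condition. A rough sketch, where Table1, Table2 and ProductName are placeholder names and a double space is collapsed to a single one:

SELECT t1.*, t2.*
FROM Table1 t1
INNER JOIN Table2 t2
    ON REPLACE(t1.ProductName, '  ', ' ') = REPLACE(t2.ProductName, '  ', ' ')

This only handles a single extra space; if the data can vary more, a view with a fully normalized linking field (the third option) is more robust.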

Feedback about my database design (multi tenancy)

The idea of the SaaS tool is to have dynamic tables with dynamic custom fields and values of different types. We were thinking of following the "force.com/salesforce.com" example, but it seems too complicated to maintain going forward and would make reporting require a huge level of abstraction, so we came up with a simpler idea, but we want to be sure it is a reasonably good approach.
This is the architecture we have today (in a few steps).
Each tenant has its own separate database on the cluster (Postgres 12).
A TABLE table, used to keep all of those dynamic tables as references; this entity has a ManyToOne relation to the META table and a OneToMany relation with the DATA table.
The META table is used for metadata configuration and has a OneToMany relation with FIELDS (which holds the name of each field, its type, e.g. TEXT/INTEGER/BOOLEAN/DATETIME etc., and the attribute it maps to - stored as a string, only as a reference).
The DATA table has a ManyToOne relation to TABLES and 50 character varying columns named attribute1...attribute50, all of which are NULL-able.
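Roughly, the layout looks like this (a simplified sketch; the table and column names and types here are approximate, not our exact schema):

CREATE TABLE meta (
    id bigserial PRIMARY KEY
);

CREATE TABLE fields (
    id            bigserial PRIMARY KEY,
    meta_id       bigint NOT NULL REFERENCES meta(id),
    name          text   NOT NULL,   -- e.g. 'Brand'
    type          text   NOT NULL,   -- TEXT / INTEGER / BOOLEAN / DATETIME ...
    attribute_ref text   NOT NULL    -- e.g. 'attribute2', kept only as a string reference
);

CREATE TABLE tables (
    id      bigserial PRIMARY KEY,
    meta_id bigint NOT NULL REFERENCES meta(id),
    name    text   NOT NULL           -- e.g. 'CARS'
);

CREATE TABLE data_50 (
    id          bigserial,
    tableid     bigint NOT NULL REFERENCES tables(id),
    createdat   timestamptz NOT NULL DEFAULT now(),
    attribute1  varchar,
    attribute2  varchar,
    -- ... attribute3 through attribute49 ...
    attribute50 varchar,
    PRIMARY KEY (id, tableid)
);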
Example flow today:
When a user wants to open the data of a TABLE, e.g. "CARS", we load the META table with all of its FIELDS (to get the fields for this query). Say the user wants to query against the Brand, Class, Year and Price columns.
Our logic looks up the references for Brand, Class, Year and Price in the META > FIELDS table, so we know that Brand = attribute2, Class = attribute5, Year = attribute6 and Price = attribute7.
We parse the request into a query, e.g. SELECT [attr...2,5,6,7] FROM DATA, and then show the results to the user. If the user decides to filter on this data, e.g. Year > 2017 AND Class = 'A', we use SQL's CAST() functionality, for example SELECT CAST(attribute6 AS int), attribute5 FROM DATA WHERE CAST(attribute6 AS int) > 2017 AND attribute5 = 'A';, so we can actually support most SQL principles.
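As a concrete sketch of what that flow produces (using the approximate names from the sketch above; the tableid value is made up):

-- 1. resolve logical field names to physical attribute columns for CARS
SELECT f.name, f.attribute_ref
FROM fields f
JOIN tables t ON t.meta_id = f.meta_id
WHERE t.name = 'CARS'
  AND f.name IN ('Brand', 'Class', 'Year', 'Price');

-- 2. generated query, restricted to the CARS rows in the shared DATA table
SELECT attribute2              AS brand,
       attribute5              AS class,
       CAST(attribute6 AS int) AS year,
       attribute7              AS price
FROM data_50
WHERE tableid = 42             -- placeholder id of the CARS entry in TABLES
  AND CAST(attribute6 AS int) > 2017
  AND attribute5 = 'A';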
However, moving forward we have a few concerns:
Managing such an environment for more tenants as the number of tables grows (e.g. 50 per customer, with roughly 1-5 million rows per TABLE; 5 million is the maximum we allow, for bigger data we have BigQuery), which gives us 50-250 million rows in a single DATA_X table. This might affect query performance, especially since we expose simple WHERE statements (less-than, equal, null, etc.) through an abstraction language, e.g. GET CARS [BRAND,CLASS,PRICE...] FILTER [EQ(CLASS,A),MT(YEAR,2017)], developed to be similar to JQL (Jira Query Language).
Transaction locking, as we allow batch uploads of CSV files into DATA_X; once someone loads e.g. 1 GB of data, it effectively locks the DATA table for other systems.
Keeping many NULL columns, which can affect storage a bit (for now we are not too worried: at TABLE creation the customer decides how many columns they want, and based on that we assign the TABLE to one of the hardcoded entities DATA_5, DATA_10, DATA_15, DATA_20, DATA_30 or DATA_50, where the number corresponds to the limit on attribute columns; we also support a migration option if they decide to switch from 5 to 10 attributes, etc.).
We are at a super early stage, so we can/should make these changes before we scale. We knew this is most likely not the best approach, but we kept it to get the project running for small customers, and for now it is working just fine.
We also thought about JSONB objects, but that is not an option, as we want to keep data retrieval simple.
What do you think about this solution? (FYI, DATA has a composite PRIMARY KEY of two columns (ID, TABLEID) and a built-in CreatedAt column which is used in most of the queries, so there will be a maximum of 3 indexes.)
If it seems bad, what would you recommend as an alternative based on the details I shared (basically a schema-less RDBMS)?
IMHO, I anticipate issues when you want to join tables and also with the use of CAST, etc.
We followed the approach below, which may be of help to you.
We have a table called Cars and also a couple of companion tables, CarsMeta and CarsExtension. The underlying Cars table has all the common fields for all tenants. The CarsMeta table describes which types of columns are available for extending the Cars entity. In the CarsExtension table, you have columns like StringCol1...5, IntCol1...5, LongCol1...10.
In this way, you can easily filter data as well:
If you have a filter on the base table, perform the search there; if results are found, match the ids against the CarsExtension table to get the extended rows for this entity.
If the filter is on the extended fields, search the extension table and match against the base entity ids.
The extension table is organized like below:
id - UniqueId
entityid - uniqueid (points to the primary key of the entity)
StringCol1 - string,
...
IntCol1 - int,
...
In this case, it is easy to do a join on the entity and then get the data along with the extension fields.
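A rough sketch of that join (table and column names are only illustrative):

SELECT c.*, e.StringCol1, e.IntCol1
FROM Cars c
JOIN CarsExtension e ON e.entityid = c.id
WHERE c.Brand = 'BMW'      -- filter on a common/base column
  AND e.IntCol1 > 2017;    -- filter on an extension column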
If the table metadata and the data have to be inferred from separate tables, it will be a difficult task to maintain over a long period of time and with a huge volume of data.
HTH

Is there a way to include a column from one table in many other tables (while maintaining consistency) in PostgreSQL?

I'm trying to build a database (in PostgreSQL 9.6.6) that allows one "master column" (items.id) to be replicated into many (automatically generated) tables (e.g. rank1.id, rank2.id, rank3.id, ...). Only items will have INSERTs (or DELETEs) performed on it, and when that happens the newly added ids should also show up in (or be removed from) the rankX table(s). To be more concrete:
items:
id | name | description
rank1:
id | rank
rank2:
id | rank
...
Where the id's are always the same, and there is always the same number of rows in each of the tables. The rankX.rank values, however, will be different (imagine users ranking how funny a series of images are -- the images all have the same id's but different users might rank them differently).
What I was thinking was that when a new user was added and a new rankX table created I would do the following:
Have rankX.id reference items.id as a foreign key (with ON DELETE CASCADE)
Copy any items.id values that already exist
Auto-generate a trigger function that mirrors INSERTs on items to the rankX table
This seems cumbersome and wasteful of space since all of the xxxx.id columns are identical and I will end up with hundreds or thousands of trigger functions. As someone new to relational databases I was hoping there was an easier way to achieve this.
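For concreteness, the per-table setup I have in mind looks roughly like this (a sketch only; the trigger and function names are made up):

CREATE TABLE rank1 (
    id   integer PRIMARY KEY REFERENCES items(id) ON DELETE CASCADE,
    rank integer
);

-- copy the ids that already exist
INSERT INTO rank1 (id)
SELECT id FROM items;

-- mirror future INSERTs on items into rank1
CREATE FUNCTION mirror_items_to_rank1() RETURNS trigger AS $$
BEGIN
    INSERT INTO rank1 (id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER items_insert_to_rank1
AFTER INSERT ON items
FOR EACH ROW EXECUTE PROCEDURE mirror_items_to_rank1();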
So, I have a few questions:
Is there a more efficient way to define my tables such that all of this copying isn't necessary?
If this is the best way, can you give an example of how you would set up the triggers (and associated functions)?
Do I need to worry about running out of space on the server as I create (potentially many) sets of triggers of this type?

Silverlight WCF RIA Service select from SQL View vs SQL Table

I have arrived at this dilemma via a tortuous and frustrating route, but I'll start with where I am right now. For information I'm using VS2010, Silverlight 5 and the latest versions of the Silverlight and RIA Toolkits, SDKs etc.
I have a view in my database (it's actually now an indexed view, but that has made no difference to the behaviour). For testing purposes (and that includes testing my sanity) I have duplicated the view as a Table (ie identical column names and definitions), and inserted all the view rows into the table. So if I SELECT * from the view or the table in Query Analyzer, I get identical results. So far so good.
I create an EDM (Entity Data Model) in my Silverlight Business Application web project, including all objects.
I create a Domain Service based on the model, and it creates ContextTypes and metadata for both the View and the Table, and associated Query objects.
If I populate a Silverlight ListBox in my Silverlight project via the Table Query, it returns all the data in the table.
If I populate the same ListBox via the View Query, it returns one row only, always the first row in the collection, however it is ordered. In fact, if I delve into the inner workings via the debugger, when it executes the ObjectContext Query in the service, it returns a result set of the correct number of rows, but all the rows are identical! If I order ascending I get n copies of the first row, descending I get n copies of the last row.
Can anyone put me out of my misery here, and tell me why the View doesn't work?
Ade
OK, well that was predictable - nearly every time I ask a question on a forum I stumble across the answer while I'm waiting for responses to flood in!
Despite having been through the metadata and model.designer files and made sure that all "view" and "table" class/method definitions etc were identical, it was still showing the exasperating difference in behaviour between view and table queries. So the problem just had to be caused by the database, right?
Sure enough, I hadn't noticed myself creating NOT NULL columns when I created the "identical" Table version of my view! Even though I was using a SELECT NEWID() to create a unique key column on the view, the database insisted that the ID column in the view was NULLABLE, and it was apparently this which was causing the problem.
To save some storage space I switched from using NEWID() to using ROW_NUMBER() to create my key column, but still had the "NULLABLE" property problem. So I then changed it to
SELECT ISNULL(ROW_NUMBER() OVER (...), -1)
for the ID column, and at last the column in the view was created NOT NULL! Even though neither NEWID() nor ROW_NUMBER() can ever generate NULL output, it seems you have to hold SQL Server's hand and reassure it by using the ISNULL operator before it will believe itself.
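For illustration, the pattern that finally gave me a NOT NULL key column in the view looked roughly like this (view, table and column names here are placeholders, not my real schema):

CREATE VIEW dbo.MyView AS
SELECT ISNULL(ROW_NUMBER() OVER (ORDER BY t.SomeColumn), -1) AS ID,
       t.SomeColumn,
       t.AnotherColumn
FROM dbo.MyTable AS t;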
Having done this, deleted/recreated my model and service files, everything burst into glorious technicolour life without any manual additions of [Key()] properties or anything else. The problem had been with the database all along, and NOT with the Model/Service/Metadata definitions.
Hope this saves someone some time. Now all I need to do is work out why the original stored procedure method I started with two days ago doesn't work - but at least I now have a hint!
Ade

Postgres full text search across multiple related tables

This may be a very simplistic question, so apologies in advance, but I am very new to database usage.
I'd like to have Postgres run its full text search across multiple joined tables. Imagine something like a model User, with related models UserProfile and UserInfo. The search would only be for Users, but would include information from UserProfile and UserInfo.
I'm planning on using a GIN index for the search. I'm unclear, however, on whether I'm going to need a separate tsvector column in the User table to hold the aggregated tsvectors from across the tables, and to set up triggers to keep it up to date, or whether it's possible to create an index without a tsvector column that keeps itself up to date whenever any of the relevant fields in any of the relevant tables change. Also, any tips on the syntax of the commands to create all this would be much appreciated.
Your best answer is probably to have a separate tsvector column in each table (with an index on it, of course). If you aggregate the data up into a shared tsvector, that will create a lot of updates on that shared column whenever the individual ones update.
You will need one index per table. Then when you query, you need multiple match clauses, one for each field. PostgreSQL will automatically figure out which combination of indexes to use to give you the quickest results - likely using bitmap scanning. It will make your queries a little more complex to write (since you need multiple column-matching clauses), but it keeps the flexibility to query only some of the fields when you want to.
You cannot create one index that tracks multiple tables. To do that you need the separate tsvector column and triggers on each table to update it.
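A rough sketch of that per-table setup; the table and column names (users, user_profiles, name, email, bio, user_id) are assumptions standing in for your actual schema:

-- a tsvector column and a GIN index on each table
ALTER TABLE users ADD COLUMN search tsvector;
ALTER TABLE user_profiles ADD COLUMN search tsvector;

CREATE INDEX users_search_idx ON users USING gin (search);
CREATE INDEX user_profiles_search_idx ON user_profiles USING gin (search);

-- keep each column up to date with the built-in trigger helper
CREATE TRIGGER users_search_update
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(search, 'pg_catalog.english', name, email);

CREATE TRIGGER user_profiles_search_update
BEFORE INSERT OR UPDATE ON user_profiles
FOR EACH ROW EXECUTE PROCEDURE
    tsvector_update_trigger(search, 'pg_catalog.english', bio);

-- query users, matching against either table's tsvector
SELECT u.*
FROM users u
JOIN user_profiles p ON p.user_id = u.id
WHERE u.search @@ to_tsquery('english', 'example')
   OR p.search @@ to_tsquery('english', 'example');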