Automating database creation using psql


I have a file that contains SQL commands for the creation of a database.
I also have a set of separate files, each of which contains SQL commands for creating functions. For example:
funcs_algebra.sql
funcs_trigonometry.sql
funcs_geometry.sql
Let's say my main SQL script looks like this:
CREATE DATABASE mydb;
CREATE TABLE foo(id INT, name VARCHAR(32));
CREATE TABLE foobar(id INT, age REAL);
-- Commands below to import the functions in the separate files
-- funcs_algebra.sql
-- funcs_trigonometry.sql
-- funcs_geometry.sql
I want to know how to include the files in my 'main SQL' so that I have only one file to pass to psql.
The objective is to be able to use psql to create the database (complete with functions) by merely passing the commands in this file to psql.
Does anyone know how?
[Edit]
I probably should have stated what I thought was obvious. I want to keep the function-related SQL in SEPARATE files so that I have only one source to modify. It is my way of partitioning logic and keeping things DRY.
The function definitions are used for creating other template databases - so I want to keep them in separate files, but reference the files from within my SQL script.
The only other way I can think of doing this is writing a bash script that makes successive calls to psql - first create the database, and then add the functions - not very elegant, and not very DRY. Is there another (more elegant) way?

You can write:
\i funcs_algebra.sql
in your main SQL script to include the extra functions script.
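A minimal sketch of what the complete main script could look like, assuming it is saved as main.sql (a hypothetical file name) and that psql is started from the directory holding the function files. The \connect line is an addition that is not in the question: without it, the tables and functions would be created in whatever database psql was originally connected to, not in mydb.

```sql
-- main.sql (hypothetical name); run with:  psql -f main.sql postgres
-- Abort on the first error instead of carrying on with a half-built database.
\set ON_ERROR_STOP on

CREATE DATABASE mydb;

-- Switch the session to the new database before creating anything in it.
\connect mydb

CREATE TABLE foo(id INT, name VARCHAR(32));
CREATE TABLE foobar(id INT, age REAL);

-- \i reads each file as if its contents had been typed here.
-- Paths are resolved relative to the directory psql was started from.
\i funcs_algebra.sql
\i funcs_trigonometry.sql
\i funcs_geometry.sql
```

If the script may be run from other directories, \ir (available since psql 9.1) resolves the path relative to the including script instead. The function files stay untouched, so other template databases can include them the same way.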

Related

Auto generate script for CREATE TABLE including all indices, constraints, etc (not via SSMS)

I have a data anonymization process that takes a production copy of a database and turns it into an anonymized copy by UPDATE-ing some columns.
Some of the tables contain several million rows, so instead of UPDATE-ing the columns, which is very log intensive, I went with
SELECT
Id,
CAST('Redacted' AS NVARCHAR(255)) [ColumnRequiringAnonymization]
INTO MyTable_New
FROM MyTable
EXEC sp_rename MyTable, MyTable_old
EXEC sp_rename MyTable_new, MyTable
DROP TABLE MyTable_old
The problem with this approach is that the "new" table no longer has any of the keys, indices and other dependent objects. I have figured out the keys and indices using SPs to generate the DROP and CREATE scripts. The SPs are based on manually written SQL as can be seen e.g. in this answer.
The next problem is that we have a schemabound view on top of this table, which has indices and a full-text index on its own. The number of SPs to generate scripts is growing and I am sure there will be mistakes.
Is there a way to completely script a table/view by using SQL commands only? I.e. just like SSMS does when you click "Script table as - CREATE to", but within a stored procedure?

Right-click on the database, select Tasks; there is Generate Scripts there. Just follow prompts or Google for additional information.

Is it possible to have database-wide table aliases?

I am about to model a PostgreSQL database, based on an Oracle database. The latter is old and its tables have been named using a 3-letter scheme.
E.g. a table that holds parameters for tasks would be named TSK_PAR.
As I model the new database, I'd like to rename those tables to more descriptive names using actual words. My problem is that some parts of the software might rely on these old names until they're rewritten and adapted to the new scheme.
Is it possible to create something like an alias that's being used for the whole database?
E.g. I create a new task_parameters table, but add a TSK_PAR alias to it, so if a SELECT * FROM TSK_PAR is being used, it automatically refers to the new name?

Postgres has no synonyms like Oracle.
But for your intended use case, views should do just fine. A view that simply does select * from task_parameters is automatically updatable.
If you don't want to clutter your default schema (usually public) with all those views, you can create them in a different schema, and then adjust the user's search path to include that "synonym schema".
For example:
create schema synonyms;
create table public.task_parameters (
id integer primary key,
....
);
create view synonyms.tsk_par
as
select *
from public.task_parameters;
However, that approach has one annoying drawback: if a table is used by a view, the allowed DDL statements on it are limited, e.g. you can't drop a column or change its data type.
As we manage our schema migrations using Liquibase, we always drop all views before applying "normal" migrations, then once everything is done, we simply re-create all views (by running the SQL scripts stored in Git). With that approach, ALTER TABLE statements never fail because there are no views using the tables. As creating a view is really quick, it doesn't add overhead when deploying a migration.
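A self-contained sketch of the whole approach, under a few assumptions: the label column is hypothetical, the view is named tsk_par after the legacy table from the question, and PostgreSQL 9.3 or later is used (that is when simple views became automatically updatable). Unquoted identifiers are folded to lower case, so legacy code that says TSK_PAR resolves to tsk_par.

```sql
CREATE SCHEMA synonyms;

CREATE TABLE public.task_parameters (
    id    integer PRIMARY KEY,
    label text                      -- hypothetical column for illustration
);

-- The view carries the legacy name and lives outside the public schema.
CREATE VIEW synonyms.tsk_par AS
SELECT * FROM public.task_parameters;

-- Let unqualified legacy names resolve in this session. To make it permanent
-- for a role:  ALTER ROLE some_role SET search_path = public, synonyms;
SET search_path = public, synonyms;

-- The view is a simple single-table SELECT, so writes through the legacy
-- name land in the new table.
INSERT INTO TSK_PAR (id, label) VALUES (1, 'timeout');
SELECT id, label FROM task_parameters;

-- Migration pattern: drop the view, change the table, re-create the view.
-- DDL is transactional in PostgreSQL, so readers never see the gap.
BEGIN;
DROP VIEW synonyms.tsk_par;
ALTER TABLE public.task_parameters ALTER COLUMN label TYPE varchar(100);
CREATE VIEW synonyms.tsk_par AS
SELECT * FROM public.task_parameters;
COMMIT;
```

Running the ALTER COLUMN ... TYPE statement while the view still exists fails with an error about the column being used by a view, which is the restriction the drop-and-re-create step works around.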

Statement to display create table SQL

Is there a stored procedure or some SQL that I could run that would display the SQL for creating a table from an existing table? Like sp_helptext to display the contents of a function or stored procedure. Basically, is there a way to do the Script Table As->CREATE TO method?

The answer is no. If you start Profiler and run [script table]>[create to] in SSMS, you'll see a series of sp_executesql calls being run against sys.* tables. This means that no CREATE TABLE commands are stored anywhere in SQL Server, and SSMS assembles the CREATE statement for a table from a lot of different sources.
On the other hand, if you run [script view]>[create to], you'll see a simple query against sys.all_objects, sys.sql_modules and sys.system_sql_modules, where the definitions are stored.
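The difference can be seen directly in the catalog views. The sketch below assumes two hypothetical objects, dbo.MyView and dbo.MyTable. For the view the stored text comes back in one piece, while for the table only metadata is available and a CREATE TABLE statement would have to be assembled from it (plus sys.indexes, sys.key_constraints and so on).

```sql
-- Views, procedures, functions and triggers: the source text is stored.
SELECT m.definition
FROM sys.sql_modules AS m
WHERE m.object_id = OBJECT_ID(N'dbo.MyView');   -- hypothetical view name

-- Shortcut for the same lookup.
SELECT OBJECT_DEFINITION(OBJECT_ID(N'dbo.MyView')) AS definition;

-- Tables: no stored text, only per-column metadata to rebuild it from.
SELECT c.column_id,
       c.name        AS column_name,
       t.name        AS type_name,
       c.max_length,
       c.precision,
       c.scale,
       c.is_nullable,
       c.is_identity
FROM sys.columns AS c
JOIN sys.types   AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'dbo.MyTable')   -- hypothetical table name
ORDER BY c.column_id;
```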

Creating Stored Procedures that can work with different tables

I need to use the same stored procedures against many tables, all with the same structure, in my DB. This is data loaded from customers, with one table per customer, and the data needs calculations/checks run before it's loaded into our data warehouse.
So far these are the options and issues I've found and I'm looking for a better pattern/approach.
1. Create a view that points to the table I want to process; the SPs then talk to that view. This works well (especially once I'd worked out how to create views 'automagically' based on their columns). But the view can only be used with one table at a time, forcing the system to deal with one customer at a time.
2. Use dynamic SQL within each SP. This makes the SPs much harder to read/debug and for those reasons has been ruled out.
3. Create a partitioned view across all the tables and then use a parameterised table function to return just the data we're interested in. But then I can't update the data, as the function returns a table that can only be used for SELECT.
4. Use dynamic SQL inside a function (can't be done) to create a view (which also can't be done) ... give up.
5. Within the SP, create a temp table over the target table using dynamic SQL. But then the temp table only exists in the session that runs the dynamic SQL, not the 'parent' session that's running the SP ... give up.
6. Create a global temp table using dynamic SQL to avoid the scope issue of 5, then run the SP against the global temp table. Still runs into the single-customer issue.
7. Create the view as in 1 within a transaction, then run all the SPs and commit. Works fine for one user, but any others are now blocked trying to create a new view of the same name.
8. Use a temporary view ... can't in T-SQL.
9. Move all the code into .NET, but we have environment issues where T-SQL is much easier to host/run.
I know I'm not the only person who has this problem. If any of you good people have solved it, please help.

Maybe your approach is wrong. I will go into more detail in a while, but it seems that your problem can be solved using SSIS.
-- Updated answer:
First, the big picture:
The most affordable way to process the tables dynamically is to use a script instead of a stored procedure. If the table being accessed changes from call to call, you lose the main performance advantage of stored procedures anyway, i.e. cached execution plans. A SQL script can easily be adapted to point to a given table at runtime by using placeholders and replacing them before execution.
The script can be loaded from the filesystem, a variable, a text column in a table, etc. The loading process consists of reading the script content into a string variable. This step occurs once.
The next step is the preparation stage. This step is executed for each table to be processed. Its main job is to replace the table placeholders with the name of the table currently being processed. It is also possible to set parameter values here, such as any parameter you need to pass into the SP that you already wrote.
The last step is the execution of the script. As it is already loaded into a variable and the placeholders have been set to the current table name, you can safely call an ExecuteSQLTask with the SQL variable as the input. This of course happens for each table you want to process.
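The three steps above (load once, prepare per table, execute per table) can also be sketched in plain T-SQL, which may help to see the mechanics before the SSIS version. This is only an illustration under stated assumptions, not the answer's method: the template is held in a variable instead of a file, and the tables are assumed to be named t_1, t_2, ... in the default schema.

```sql
-- Load: the script template, read once. [$table_name] is the placeholder.
DECLARE @template nvarchar(max) =
    N'SET NOCOUNT ON; SELECT * FROM [$table_name];';

DECLARE @table sysname;
DECLARE @sql   nvarchar(max);

-- Assumption: every table whose name starts with t_ is a customer table.
DECLARE table_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.tables WHERE name LIKE N't[_]%';

OPEN table_cursor;
FETCH NEXT FROM table_cursor INTO @table;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Prepare: swap the placeholder for the current, safely quoted table name.
    SET @sql = REPLACE(@template, N'[$table_name]', QUOTENAME(@table));

    -- Execute: run the prepared script for this table.
    EXEC sys.sp_executesql @sql;

    FETCH NEXT FROM table_cursor INTO @table;
END

CLOSE table_cursor;
DEALLOCATE table_cursor;
```

QUOTENAME wraps the name in brackets and escapes any closing bracket inside it, which keeps an odd table name from breaking out of the statement.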
Ok. Now let's see this in action.
This is a sample database model:
CREATE TABLE [dbo].[t_n](
[id] [int] IDENTITY(1,1) NOT NULL,
[name] [varchar](50) NOT NULL,
[start] [datetime] NULL,
CONSTRAINT [PK_t_n] PRIMARY KEY CLUSTERED ([id] ASC)
) ON [PRIMARY]
where t_n represents any table (t_1, t_2, t_3, etc).
This is your current stored procedure:
CREATE PROCEDURE SpProcessT_n
AS
BEGIN
SET NOCOUNT ON;
SELECT * FROM [t_1];
END
GO
Now, transform this stored procedure into a SQL script, placing a placeholder where the table name was:
SET NOCOUNT ON;
SELECT * FROM [$table_name];
I chose to save this in a .sql file on the filesystem to keep the proof of concept as simple as possible.
Next, create an SSIS package containing a loop over the tables, a Script Task, and an ExecuteSqlTask (the original screenshots of the package, the loop settings, and the variable mapping are not preserved).
In the loop, assign the current table name to a variable called table_name.
In the Script Task setup, the variable table_name has read-only access, while a new variable called SqlExec has read/write access.
And this is its Main function:
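// Requires: using System.IO; using System.Text.RegularExpressions;
public void Main()
{
    // Table name for the current loop iteration (read-only SSIS variable).
    String Table_Name = Dts.Variables["table_name"].Value.ToString();
    String SqlScript;
    Regex reg = new Regex(@"\$table_name", RegexOptions.Compiled);
    // Load the script template; the using block closes the file.
    using (var f = File.OpenText(@"c:\sqlscript.sql")) {
        SqlScript = f.ReadToEnd();
    }
    // Replace the placeholder and store the result for the ExecuteSqlTask.
    SqlScript = reg.Replace(SqlScript, Table_Name);
    Dts.Variables["SqlExec"].Value = SqlScript;
    Dts.TaskResult = (int)ScriptResults.Success;
}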
Note that the Dts variable SqlExec now contains the SQL script that will be executed. Configure your ExecuteSqlTask to take its SQL statement from that variable.
Successfully tested on MSSQL 2008. If you put an INSERT inside the script file, you will see new rows in each table.
Hope this helps!

If your application can tolerate data that is one day old, you can schedule a nightly job that runs an SSIS package to consolidate all 150+ tables into one single huge table. Queries against that huge table will then lag by up to one day, so they will not include rows loaded since the last run.
You can also time how long this package takes to run. If it is fast enough, say under 30 minutes, you can run it every few hours, for example at the start of the work day, at lunch break, and at the end of the day. That way you have nearly fresh data to query.

Write a partitioned view including table names?
SELECT 'TableName', t.* FROM TableName t
UNION ALL
SELECT 'TableName2', t.* FROM TableName2 t
Then write a single INSTEAD OF trigger which uses dynamic SQL for writing (there is less testing involved with that use of dynamic SQL, because you'd just write the simple CRUD operations once for all tables, I'd think).
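A rough sketch of the idea, assuming two hypothetical customer tables dbo.t_1 and dbo.t_2 that each have the columns id, name and start. To keep it short and easy to check, the trigger below routes updates with static statements, one per table, where the suggestion above would generate them with dynamic SQL. The view name, trigger name and source_table column are made up for illustration.

```sql
-- One row source for all customers, tagged with the table each row came from.
CREATE VIEW dbo.AllCustomerRows
AS
SELECT CAST(N't_1' AS sysname) AS source_table, t.id, t.[name], t.[start]
FROM dbo.t_1 AS t
UNION ALL
SELECT CAST(N't_2' AS sysname), t.id, t.[name], t.[start]
FROM dbo.t_2 AS t;
GO

-- A UNION ALL view with a constant column is not directly updatable,
-- so an INSTEAD OF trigger sends each changed row back to its own table.
CREATE TRIGGER dbo.AllCustomerRows_Update
ON dbo.AllCustomerRows
INSTEAD OF UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE t
    SET    t.[name] = i.[name], t.[start] = i.[start]
    FROM   dbo.t_1 AS t
    JOIN   inserted AS i ON i.id = t.id
    WHERE  i.source_table = N't_1';

    UPDATE t
    SET    t.[name] = i.[name], t.[start] = i.[start]
    FROM   dbo.t_2 AS t
    JOIN   inserted AS i ON i.id = t.id
    WHERE  i.source_table = N't_2';
END;
GO

-- Usage: stored procedures can now read and update across all customers.
UPDATE dbo.AllCustomerRows
SET    [name] = N'checked'
WHERE  source_table = N't_1' AND id = 1;
```

The source_table column matters because id values are only unique within one customer table. With many tables, both the view and the trigger body would normally be generated from a list of table names.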

I would not do this with SQL. What you are describing sounds like a traditional ETL situation.
Since all of the customer tables are the same, I would create a table in the data warehouse with all the columns from the client table, a surrogate key column, and a type identifier. You have the option of creating a "staging" table here that will only have data in it during the ETL process, or just working on a single "live" table. I would create the staging table.
Then within SSIS package (don't worry you can still schedule from SQL Server agent, it hasn't totally left the DB server), start the ETL process...
E(xtract): copy the data from your source into the staging table in the data warehouse. You most likely want to use a sub-package within a foreach loop, reading the name of the table to process from an external store (most people would say put this in the warehouse, but it's up to you).
T(ransform): run the calculations/checks you were talking about, but do it on the whole set...
L(oad): Copy it to your real table within the data warehouse.
There are a couple things I would NOT do.
1. Modify the data in the source table.
2. Try to do this in T-SQL. It's just not what T-SQL is good at.
If you need more detail on this approach, I would probably ask the question with some Business Intelligence tags. I'll be traveling for the next week or so, but I will try to look at the comments to clear anything up if you need me to.

I am fairly certain that the standard way to solve this is using dynamic SQL in each sp (your option 2), which has already been ruled out.
Your goal is to make generic, multi-table SQL. I don't see how you intend to accomplish that without sacrificing some efficiency and readability.

SQL query to list all dependent entities

A SQL Server database has hundreds of tables, stored procedures and functions.
I am trying to put together an SQL query that will return all the dependencies of a given set of tables. Is there a way to accomplish this using SQL Server Management Studio without writing queries?
Updated: Simplified the question to the point.

In SSMS, just right click on the table and choose "View Dependencies". As far as scripting, take a look at this article.
EDIT: In SSMS, you can only see it for one object at a time. The reason is that the stored procedure that is run to view them only takes one database object. So to script multiple, you'd simply need to use multiple lines of EXEC sp_depends @objname = N'DATABASE.OBJECT'; for the tables/views/stored procedures/functions that you want to get dependencies for. One approach would be to use a script like the following to get the unique list of all dependent objects that will have to be included:
CREATE TABLE #dependents (obj_name nvarchar(255), obj_type nvarchar(255))
-- Do this for every primary object you're concerned with finding dependents for
INSERT INTO #dependents (obj_name, obj_type)
EXEC sp_depends @objname = N'DATABASE.OBJECT'
-- ...
SELECT DISTINCT obj_name, obj_type
FROM #dependents
DROP TABLE #dependents
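sp_depends is deprecated in current SQL Server versions in favour of the dynamic management functions sys.dm_sql_referencing_entities and sys.dm_sql_referenced_entities (available since SQL Server 2008). As an alternative sketch, the query below collects the dependents of several tables in one pass; dbo.Orders and dbo.Customers are hypothetical names standing in for the set of tables of interest.

```sql
-- List every object that references any table in the given set.
SELECT DISTINCT
       t.table_name                AS referenced_table,
       r.referencing_schema_name,
       r.referencing_entity_name
FROM (VALUES (N'dbo.Orders'),       -- hypothetical table names;
             (N'dbo.Customers')     -- use schema-qualified names
     ) AS t(table_name)
CROSS APPLY sys.dm_sql_referencing_entities(t.table_name, N'OBJECT') AS r
ORDER BY referenced_table,
         r.referencing_schema_name,
         r.referencing_entity_name;
```

This reports direct references only (views, procedures, functions and triggers that mention the table). To follow the chain further, the same query can be run again with the returned objects as input.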

I just blogged about something similar to this that might help:
Knowing What to Test When Changing a SQL Server Object.
Another approach would be to right click the database and select "Tasks" and then "Generate Scripts...", check the checkbox "Script all objects in the selected database". This will give you a giant text file that you can then search.
