How can I see the call tree for SQL stored procedures offline (without actually creating them) - tsql

I have a huge SQL script which I need to analyse. It would be really helpful if I could find a way to generate a call tree, i.e. to see which procedures are called from a particular procedure. A Perl-based example is here: http://sqlblog.com/blogs/linchi_shea/archive/2009/10/23/find-the-complete-call-tree-for-a-stored-procedure.aspx
But I need a tool that analyses the text file (the .sql file), not the procedures stored in the database. For various reasons I will not be able to create the whole set of procedures in the database and use the above-mentioned tool.
Please respond if you have come across any IDE/tool with this feature.

Probably not very helpful, as it violates your request for an offline, text-based tool that parses the .sql file, but I wanted to throw out this Redgate tool that I have used with great success in the past: Redgate SQL Dependency Tracker. It works very well and does a good job of mapping out your objects and all their dependencies (you can define what gets mapped). But it does require a database with all of the existing objects in place to work properly. :(
If you can't find such a tool out there, I guess you could do some script/macro text parsing, provided all the procedure calls are easily identified and predictable in the file. AutoHotkey is a great general-purpose scripting tool/framework, and there are a few SQL-based scripts out there... just none exactly like what you are looking for, as far as I have seen.

Related

What functions are called when working with the Postgres database

I need to implement transparent data encryption (TDE) in Postgres. To do this, I have been finding out which functions are called when INSERT and SELECT are executed; I used LLDB (from the LLVM project) on a SELECT.
I'm trying to do the same with INSERT, but it does not work: the backend process stops and does not allow the insert. I followed roughly this guide: https://eax.me/lldb/.
What could be wrong? How can I find out which functions are called on insert (for SELECT it is secure_read, etc.)? And does anyone know how to change the function code in the source?
Note that the client and server are located on the same machine, and the same user adds the data and reads it.
Unfortunately I do not have enough reputation to add screenshots.
The SQL statements are the wrong level to start debugging. You should look at the code where blocks are read and written. That would be in src/backend/storage/smgr.
Look at the functions mdread and mdwrite in md.c. This is probably where you'd start hacking.
PostgreSQL v12 introduced “pluggable storage”, so you can write your own table storage manager; see the documentation. If you don't want to patch PostgreSQL but would rather ship an extension that works with standard PostgreSQL, that is the direction to take.
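For illustration, the SQL-visible side of that looks like the sketch below. This is only a minimal sketch: heap2 is a made-up name, and the built-in heap_tableam_handler stands in for the C handler function you would have to write yourself.

    -- Register a table access method (PostgreSQL 12+). A real TDE implementation
    -- would supply its own handler function written in C; heap_tableam_handler is
    -- the stock heap handler, used here only as a stand-in.
    CREATE ACCESS METHOD heap2 TYPE TABLE HANDLER heap_tableam_handler;

    -- Individual tables can then opt in to the custom storage.
    CREATE TABLE secret_stuff (id integer, payload text) USING heap2;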
So far this only covers block storage, but don't forget the WAL; encrypting that will require hacking PostgreSQL itself.
This is a complex topic which you should raise on the pgsql-hackers mailing list: https://www.postgresql.org/list/pgsql-hackers/.
You could start by setting a GDB breakpoint on ExecutorStart in execMain.c.

How to set up background worker for PostgreSQL 11 server?

Recently I've been assigned to migrate part of a database from Oracle to a PostgreSQL environment as a testing experiment. The major drawback I ran into during that process was the lack of a simple way to implement parallelism, which is required for several design reasons that aren't relevant here. I then discovered background workers (https://www.postgresql.org/docs/11/bgworker.html), which looked like a way to solve my problems.
Not quite, though: I couldn't easily find any tutorial or example of how to implement one, even for a task as simple as writing debug messages to the logger while the process is running. I've tried some older approaches presented in extension documentation from version 9.3, but they weren't much help.
I would like to know how to set those workers up properly. Any help would be appreciated.
PS: Also, if some good soul has found a workaround to implement BULK COLLECT for cursors in PostgreSQL, it would be most kind of you to share it.
The documentation for background workers that you linked to is for writing C code, which is probably not what you want. You can use the pg_background extension instead, which will do what you want; ora2pg will optionally use pg_background when converting Oracle procedures with the autonomous transaction pragma. The other option is to use dblink to open a connection to the current database.
Neither solution is great, but they are the only way to go if you need to store data in a table whether or not the enclosing transaction succeeds. If you can get by with just putting messages into the logs, you can use RAISE NOTICE instead.
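A minimal sketch of both options, assuming pg_background is installed and using a made-up audit_log table:

    -- Run a statement in a background worker; it commits independently of the
    -- calling transaction (requires the pg_background extension).
    CREATE EXTENSION IF NOT EXISTS pg_background;
    SELECT * FROM pg_background_result(
        pg_background_launch('INSERT INTO audit_log VALUES (now(), ''step 1 done'')')
    ) AS (result text);

    -- If writing to the server log is enough, plain RAISE NOTICE needs no extension.
    DO $$
    BEGIN
        RAISE NOTICE 'step 1 done at %', now();
    END;
    $$;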
As far as BULK COLLECT for cursors goes, I'm not sure exactly how you are using them, but set-returning functions may help you: functions in Postgres can return multiple rows without fiddling with cursors.
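A minimal sketch of a set-returning function, with made-up table and column names:

    -- Returns a result set directly; callers treat it like a table, so there is
    -- no explicit cursor handling and no BULK COLLECT equivalent needed.
    CREATE FUNCTION recent_orders(since date)
    RETURNS TABLE (order_id integer, total numeric)
    LANGUAGE sql STABLE
    AS $$
        SELECT o.order_id, o.total
        FROM orders AS o
        WHERE o.order_date >= since;
    $$;

    SELECT * FROM recent_orders(current_date - 30);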

Can I Extract, Translate, and Compare using Talend?

I am evaluating Talend and have little experience with the tool. My question is about auditing. Does Talend offer a way to run with an audit option that extracts the source data, transforms it, and then compares the transformed data to the existing target data? I would like a report that tells me which records would have triggered an update if the audit option had not prevented it.
I know that ETLs are triggered on change, but I want to run this audit without regard to change timestamps; that is, a query would specify which source records to audit.
I have searched the Talend documentation but nothing jumped out at me.
Thank you.
EDIT
Looking outside of Talend, I found where Tim Mitchell wrote:
While I have not yet found an auditing framework with enough customization to suit me, I have learned that it is possible to audit ETL processes in a customized way without reinventing the wheel every time. I keep on hand a common (but never set in stone) set of table definitions and supporting logic to shorten the path to the following auditing objectives:
Simple aggregate auditing of row counts, financial data, and other key metrics
Documentation templates for source-to-target mappings and transformation logic
Lightweight auditing of batch-level changes
Full auditing of every change made in the transformation process
So I guess that auditing is on me.
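If it is on you, the comparison step itself is not much SQL. A rough sketch, assuming the transformed source data can be landed in a staging table first (stg_customer, dim_customer and the column names are made up for illustration):

    -- Records whose transformed values differ from what is already in the target,
    -- i.e. the rows that would have triggered an update.
    SELECT s.customer_id,
           s.name    AS source_name,
           t.name    AS target_name,
           s.balance AS source_balance,
           t.balance AS target_balance
    FROM stg_customer s
    JOIN dim_customer t ON t.customer_id = s.customer_id
    WHERE (s.name, s.balance) IS DISTINCT FROM (t.name, t.balance);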

Can COPY FROM tolerantly consume bad CSV?

I am trying to load text data into a PostgreSQL database via COPY FROM. The data is definitely not clean CSV.
The input isn't always consistent: sometimes there are excess fields (the separator is part of a field's content), or there are NULLs instead of 0s in integer fields.
The result is that PostgreSQL throws an error and stops loading.
Currently I am trying to massage the data into consistency with Perl.
Is there a better strategy?
Can PostgreSQL be asked to be as tolerant as MySQL or SQLite in this respect?
Thanks
PostgreSQL's COPY FROM isn't designed to handle malformed data and is quite strict; there is little built-in tolerance for dodgy input.
I thought there was little interest in adding any until I saw a patch proposed just a few days ago for possible inclusion in PostgreSQL 9.3. The patch was resoundingly rejected, but it shows that there is some interest in the idea; read the thread.
It's sometimes possible to COPY FROM into a staging TEMPORARY table that has all-text fields and no constraints, then massage the data from there using SQL (sketched below). That'll only work if the input is at least well-formed and regular, though, and it doesn't sound like yours is.
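A minimal sketch of that staging approach, with made-up file, table and column names (note that COPY will still reject rows with the wrong number of fields):

    -- Land everything as text first so nothing is rejected for type reasons.
    CREATE TEMPORARY TABLE staging (col_a text, col_b text, col_c text);

    -- Use \copy from psql instead if the file lives on the client.
    COPY staging FROM '/path/to/input.csv' WITH (FORMAT csv);

    -- Then repair and convert in SQL, e.g. turning empty strings into 0.
    INSERT INTO target_table (a, b, c)
    SELECT col_a,
           COALESCE(NULLIF(col_b, '')::integer, 0),
           col_c
    FROM staging;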
If the data isn't clean, you need to pre-process it with a script in a suitable scripting language.
Have that script:
Connect to PostgreSQL and INSERT rows;
Connect to PostgreSQL and use the scripting language's Pg APIs to COPY rows in; or
Write out clean CSV that you can COPY FROM
Python's csv module can be handy for this, but you can use any language you like: Perl, Python, PHP, Java, C, whatever.
If you were enthusiastic you could write it in PL/PerlU or PL/PythonU, inserting the data as you read and clean it. I wouldn't bother.

Data Warehousing Postgres

We're considering using SSIS to maintain a PostgreSQL data warehouse. I've used it before between SQL Server instances with no problems, but I'm having a lot of difficulty getting it to play nicely with Postgres. I'm using the evaluation version of the PGNP OLE DB data provider (http://www.postgresql.org/about/news.1004).
I wanted to start with something simple like an UPSERT on the fact table (10k-15k rows are updated/inserted daily), but this is proving very difficult (not to mention that I'll want to use surrogate keys in the future).
I've attempted (Link) and (http://consultingblogs.emc.com/jamiethomson/archive/2006/09/12/SSIS_3A00_-Checking-if-a-row-exists-and-if-it-does_2C00_-has-it-changed.aspx), which are effectively the same (except that I don't really understand the UNION ALL at the end when I'm trying to upsert). But I run into the same problem with parameters when doing the update with an OLE DB Command, which I tried to overcome using (http://technet.microsoft.com/en-us/library/ms141773.aspx), but that just doesn't seem to work; I get a validation error:
The external columns for complent.... are out of sync with the datasource columns... external column “Param_2” needs to be removed from the external columns.
(This error is repeated for the first two parameters as well; I never came across it with the SQL Server connection, as that supports named parameters.)
Has anyone come across this?
AND:
The fact that this simple task is apparently so difficult to do in SSIS suggests I'm using the wrong tool for the job. Is there a better (and still flexible) way of doing this, or would another ETL package be better for use between two Postgres databases? Other options include anything listed at http://en.wikipedia.org/wiki/Extract,_transform,_load#Open-source_ETL_frameworks. I could just write a load of SQL to do this myself, but I wanted a neat and easily maintainable solution.
I have used the Slowly Changing Dimension wizard for this with good success. It may give you what you are looking for, especially with the wizard:
http://msdn.microsoft.com/en-us/library/ms141715.aspx
On the external columns being out of sync: SSIS is case sensitive. I've encountered this issue multiple times and it makes me want to pull my hair out.
This simple task is going to take some work either way. SSIS is by no means an enterprise-class ETL product yet, but it does give you some quick and easy functionality and is sufficient for most ETL work. I guess it also comes down to your level of comfort with it.
SCD is way too slow for what I want; I need to use set-based SQL.
It turned out that a lot of my problems were with bugs in the provider.
I opened a forum topic (http://www.pgoledb.com/forum/viewtopic.php?f=4&t=49) and had a useful discussion with the moderator/support/developer person.
Also, Postgres doesn't let you do cross-database queries, so I solved the problem this way:
Data source from the production DB into a temp table in the archive DB
Run a set-based query between the temp table and the archive table (sketched below)
Truncate the temp table
Note that the "temp" table is not actually a temporary table, but a copy of the archive table's schema used to store data in temporarily.
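A sketch of the set-based step, with made-up names (fact_load is the staging copy, fact_archive the warehouse table, fact_key the business key):

    -- Update rows that already exist and have changed.
    UPDATE fact_archive a
    SET    amount   = l.amount,
           quantity = l.quantity
    FROM   fact_load l
    WHERE  a.fact_key = l.fact_key
      AND  (a.amount, a.quantity) IS DISTINCT FROM (l.amount, l.quantity);

    -- Insert rows that are new.
    INSERT INTO fact_archive (fact_key, amount, quantity)
    SELECT l.fact_key, l.amount, l.quantity
    FROM   fact_load l
    WHERE  NOT EXISTS (
        SELECT 1 FROM fact_archive a WHERE a.fact_key = l.fact_key
    );

    -- Clear the staging table for the next run.
    TRUNCATE fact_load;

(On PostgreSQL 9.5 and later, INSERT ... ON CONFLICT DO UPDATE can replace the update/insert pair.)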
Took a while, but I got there in the end.
If SSIS is not an enterprise-class ETL product, what enterprise ETL solution would you suggest?