My manager wants to be able to run a script/job to find the total number of databases currently on all instances/servers.
I know to use: select COUNT(*) from sys.databases
But what's the easiest way to get this to run against all instances so that when he runs it, it counts all for him as opposed to running it against each instance separately?
To query data from different databases/servers, you need Linked Servers. You can get to them in SQL Server Management Studio under
Server Objects-->Linked Servers
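If the linked servers don't exist yet, they can also be registered from T-SQL rather than through the UI. A minimal sketch, using a hypothetical instance name:
-- Minimal sketch (hypothetical name): register another SQL Server
-- instance as a linked server under its network name.
EXEC sp_addlinkedserver
    @server = N'OtherServerName',
    @srvproduct = N'SQL Server';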
Once you have that, you can call data from other servers like so:
select name from sys.databases
union all
select name from [OtherServerName].[master].[sys].[databases]
Then build a query to cover all your instances.
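To get the single total your manager wants, a sketch along these lines should work (ServerB and ServerC stand in for your linked server names):
SELECT SUM(db_count) AS total_databases
FROM (
    SELECT COUNT(*) AS db_count FROM sys.databases
    UNION ALL
    SELECT COUNT(*) FROM [ServerB].master.sys.databases
    UNION ALL
    SELECT COUNT(*) FROM [ServerC].master.sys.databases
) AS all_instances;
Add one UNION ALL branch per linked server, and he only has to run it once.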
I have a question about how it works when several SQL statements are executed in parallel in Snowflake.
For example, if I execute 10 insert statements on 10 different tables with the same base table - will the tables be loaded in parallel?
Since COPY and INSERT statements write only new micro-partitions, they can run in parallel with other COPY or INSERT statements.
The Snowflake documentation on transactions (https://docs.snowflake.com/en/sql-reference/transactions.html#transaction-commands-and-functions) states: "Most INSERT and COPY statements write only new partitions. Those statements often can run in parallel with other INSERT and COPY operations,..."
I assume that statements cannot run in parallel when they want to insert into the same micro partition. Is that correct or is there another explanation why locks on INSERTs can happen?
I execute 10 insert statements on 10 different tables
with the same base table - will the tables be loaded in parallel?
YES!
Look at multi-table INSERT in Snowflake: https://docs.snowflake.com/en/sql-reference/sql/insert-multi-table.html
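For illustration, a minimal sketch of an unconditional multi-table INSERT (all table and column names here are hypothetical):
INSERT ALL
    INTO t1 (id, val) VALUES (id, val)
    INTO t2 (id, val) VALUES (id, val)
    INTO t3 (id, val) VALUES (id, val)
SELECT id, val FROM base_table;
Each row returned by the SELECT is written to every listed table in a single statement.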
You can execute queries in parallel from snowsql by separating them with a ">" symbol.
For example:
The statement below submits all of the listed queries to Snowflake in parallel. Note that it will not exit if an error is encountered in any of the queries, despite exit_on_error=true.
snowsql -o log_level=DEBUG -o exit_on_error=true -q "select 1;>select * from SNOWSQLTABLE;>select 2;>select 3;>insert into SNOWSQLTABLE values (1);>select * from SNOWSQLTABLE;>select 5;"
The statement below runs the queries one at a time and exits if any error is encountered.
snowsql -o log_level=DEBUG -o exit_on_error=true -q "select 1;select * from SNOWSQLTABLE;select 2;select 3;insert into SNOWSQLTABLE values (1);select * from SNOWSQLTABLE;select 5;"
Concurrency in Snowflake is managed either with multiple warehouses (compute resources) or by enabling multi-clustering on a warehouse (one virtual warehouse with more than one cluster of servers).
https://docs.snowflake.com/en/user-guide/warehouses-multicluster.html
I'm working with a customer today that runs millions of SQL commands a day; they have many different warehouses, and most of them are set to multi-cluster "auto-scale" mode.
Specifically for your question, it sounds like you have ten sessions connected, running inserts into ten tables from a single base table. I'd probably begin testing this with one virtual warehouse, configured with a minimum of one cluster and a maximum of three or four, then run tests and review the results.
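As a starting point, that test setup might look roughly like this (the warehouse name and settings are placeholders to experiment with; multi-cluster warehouses require Enterprise edition or higher):
CREATE WAREHOUSE IF NOT EXISTS load_test_wh
    WAREHOUSE_SIZE    = 'MEDIUM'
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 4
    SCALING_POLICY    = 'STANDARD'
    AUTO_SUSPEND      = 300
    AUTO_RESUME       = TRUE;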
The size of the warehouse would mostly be determined by how large the query (the SELECT portion) is; you can start with something like a Medium and review the performance and query plans of the inserts to see whether that size is appropriate.
When reviewing the plans, check the queuing time to see whether three or four clusters is enough; it probably will be fine.
Your query history will also indicate which "cluster_number" each query ran on within the virtual warehouse. This is one way to check how many clusters were running (the maximum cluster_number); another is to view the Warehouses tab in the web UI or to execute the "show warehouses;" command.
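For example, one quick way to see the cluster assignment from SQL (the warehouse name is a placeholder):
SELECT query_id, warehouse_name, cluster_number, start_time
FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
WHERE warehouse_name = 'LOAD_TEST_WH'
ORDER BY start_time DESC;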
Some additional links that might help you:
https://www.snowflake.com/blog/auto-scale-snowflake-major-leap-forward-massively-concurrent-enterprise-applications/
https://community.snowflake.com/s/article/Putting-Snowflake-s-Automatic-Concurrency-Scaling-to-the-Test
https://support.snowflake.net/s/question/0D50Z00009T2QTXSA3/what-is-the-difference-in-scale-out-vs-scale-up-why-is-scale-out-for-concurrency-and-scale-up-is-for-large-queries-
I have been working on a reporting database in DB2 for a month or so, and I have it set up to a pretty decent degree of what I want. I am, however, noticing small inconsistencies that I have not been able to work out.
Less important, but still annoying:
1) Users claim it takes two login attempts to connect; the first always fails, the second succeeds. (Is there a recommendation for what to check for this?)
More importantly:
2) Whenever I want to refresh the data (which will be nightly), I have a script that drops and then recreates all of the tables. There are 66 tables, each ranging from tens of records to just under 100,000 records. The data is not massive, and it takes about 2 minutes to run all 66 tables.
The issue is that once it says it completed, there are usually at least 3-4 tables that did not load any data. So the table is dropped and then created, but is empty. The log shows that the command completed successfully, and if I run them independently they populate just fine.
If it helps, 95% of the commands are just CAST functions.
While I am sure I am not doing it the recommended way, is there a reason why a number of my tables are not populating? Are the commands executing too fast? Should I delay the CREATE after the DROP?
(This is DB2 Express-C 11.1 on Windows 2012 R2, The source DB is remote)
Example of my SQL:
DROP TABLE TEST.TIMESHEET;
CREATE TABLE TEST.TIMESHEET AS (
    SELECT NAME00, CAST(TIMESHEET_ID AS INTEGER(34)) TIMESHEET_ID ....
    .. (for 5-50 more columns)
    FROM REMOTE_DB.TIMESHEET
) WITH DATA;
It is possible to configure DB2 to tolerate certain SQL errors in nested table expressions.
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.5.0/com.ibm.data.fluidquery.doc/topics/iiyfqetnint.html
When the federated server encounters an allowable error, the server allows the error and continues processing the remainder of the query rather than returning an error for the entire query. The result set that the federated server returns can be a partial or an empty result.
However, I assume that your REMOTE_DB.TIMESHEET is simply a nickname, and not a view with nested table expressions, and so any errors when pulling data from the source should be surfaced by DB2. Taking a look at the db2diag.log is likely the way to go - you might even be hitting a Db2 issue.
It might be useful to change your script to TRUNCATE and INSERT into your local tables instead, and see if that avoids the issue.
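Using the names from your example, that would look roughly like this (column list abbreviated as in the question; note that in Db2, TRUNCATE must be the first statement of its transaction):
TRUNCATE TABLE TEST.TIMESHEET IMMEDIATE;
INSERT INTO TEST.TIMESHEET
    SELECT NAME00, CAST(TIMESHEET_ID AS INTEGER) TIMESHEET_ID -- ... remaining columns
    FROM REMOTE_DB.TIMESHEET;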
As you say, you are maybe not doing things the most efficient way. You could consider using cache tables to take a periodic copy of your remote data: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.5.0/com.ibm.data.fluidquery.doc/topics/iiyvfed_tuning_cachetbls.html
We recently migrated a large DB2 database to a new server. It got trimmed a lot in the migration; for instance, 10 years of data was chopped down to 3. But now I find that I need certain data from the old server until after tax season.
How can I run a UNION query in DBeaver that pulls data from two different connections? And what's the proper syntax for the table identifiers in the FROM and JOIN clauses?
I use DBeaver for my regular SQL work, and I cannot determine how to span a UNION query across two different connections. However, I also use Microsoft Access, and I easily did it there with two Pass-Through queries that are fed to a native Microsoft Access union query.
But how do I do it in DBeaver? I can't see how to use two connections at the same time.
For instance, my two connections are ASP7 (the new server) and OLD (the old one), and I need something like this...
SELECT *
FROM ASP7.F_CERTOB.LDHIST
UNION
SELECT *
FROM OLD.VIPDTAB.LDHIST
...but I get the following error, to which I say "No kidding! That's what I want!", lol... =-)
SQL Error [56023]: [SQL0512] Statement references objects in multiple databases.
How can this be done?
This is not a feature of DBeaver. DBeaver can only access the data that the DB gives it, and that is restricted to a single connection at a time (save for import/export operations). This feature is being considered for development, so keep an eye out for this answer to become outdated sometime in 2019.
You can export data from your OLD database and import it into ASP7 using DBeaver (although the vendor's tools are typically more efficient for this). Then you can do your union as suggested.
Many RDBMS offer a way to logically access foreign databases as if they were local, in which case DBeaver would then be able to access the data from the OLD database (as far as DBeaver is concerned in this situation, all the data is coming from a single connection). In Postgres, for example, one can use a foreign data wrapper to access foreign data.
I'm not familiar with DB2, but a quick Google search suggests that you can set up foreign connections within DB2 using nicknames or three-part names.
If you check this GitHub issue:
https://github.com/dbeaver/dbeaver/issues/3605
the suggested solution is to create a task and execute it against different connections:
https://github.com/dbeaver/dbeaver/issues/3605#issuecomment-590405154
I use the System i Navigator for ad-hoc data extraction from the DB2 database. The only issue is automation: is there a way I could schedule the SQL code to run at a specific time? I know there is the Advanced Job Scheduler, but I'm not sure how the SQL can be added to the scheduler. Can anyone help?
IBM added a Run SQL (RUNSQL) CL command at IBM i 7.1.
Prior to that, you could store SQL statements in source files and run them with the Run SQL Statements (RUNSQLSTM) command.
Neither of the above allows an SQL SELECT to be run by itself. For data extraction, you'd want INSERT INTO tbl (SELECT <...> FROM <...>), as sketched below.
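For instance, a sketch in CL (the job, library, and table names are hypothetical): RUNSQL runs the extract, and ADDJOBSCDE registers it with the job scheduler to run nightly at 02:00.
/* Sketch with hypothetical names; FRQ(*WEEKLY) SCDDAY(*ALL) = every day */
ADDJOBSCDE JOB(NIGHTSQL) +
  CMD(RUNSQL SQL('INSERT INTO MYLIB.EXTRACT (SELECT * FROM MYLIB.SOURCE)') COMMIT(*NONE)) +
  FRQ(*WEEKLY) SCDDAY(*ALL) SCDTIME(0200)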
For reporting SELECTs, your best bet is to create a Query Manager query (*QMQRY object) and form (*QMFORM object) via Start DB2 UDB Query Manager (STRQM), which can then be run by the Start Query Management Query (STRQMQRY) command. Query Manager (QM) is SQL-based, unlike the older Query/400 product. The QM manual is here
One last option is the db2 utility available via QShell.
Don't waste effort creating a day-late, going-out-of-business report just because the job scheduler hasn't updated the file.
Real businesses need real-time data.
Just make an SQL view on the iSeries that pulls the information you need together.
Query the view externally in real time. Even if you need the last 30 days, the last month, or year-to-date, these are all simple views to create.
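A sketch of that idea, with hypothetical library, table, and column names:
-- A view that always reflects the last 30 days, queried externally
-- in real time; no nightly refresh job needed.
CREATE VIEW MYLIB.TIMESHEET_LAST30 AS
    SELECT *
    FROM MYLIB.TIMESHEET
    WHERE ENTRY_DATE >= CURRENT DATE - 30 DAYS;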
I need to run a query against Active Directory to identify all the unique (distinct) operating systems + service packs in the domain. I can do that pretty easily via the ADsDSOObject provider and a SQL statement. But I need to also tally up how many accounts for each distinct combination. I can do this against a SQL Server or Oracle database very easily using COUNT(field) AS X and GROUP BY field. But with an AD query I can't use GROUP BY (as far as I know), so I'm funneling the recordset into a new disconnected recordset, but how can I run a COUNT() and GROUP BY statement against that? Is there a better way than this?
If you have a SQL Server available, you could insert the results into a temporary table and then do the aggregation in T-SQL. Not pretty, but that is what I would try.
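A sketch of that idea, assuming a linked server named ADSI has been set up over the ADsDSOObject provider (the domain DN is a placeholder); this skips the temporary table and lets T-SQL do the COUNT/GROUP BY directly:
SELECT operatingSystem, operatingSystemServicePack, COUNT(*) AS accounts
FROM OPENQUERY(ADSI,
    'SELECT operatingSystem, operatingSystemServicePack
     FROM ''LDAP://DC=example,DC=com''
     WHERE objectCategory = ''computer''')
GROUP BY operatingSystem, operatingSystemServicePack;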
If I had SQL Server available I wouldn't bother with a disconnected recordset. Apparently the GROUP BY option is unavailable to ADO disconnected recordsets.