What is a "batch", and why is GO used? - tsql

I have read and read over MSDN, etc. Ok, so it signals the end of a batch.
What defines a batch? I don't see why I need go when I'm pasting in a bunch of scripts to be run all at the same time.
I've never understood GO. Can anyone explain this better and when I need to use it (after how many or what type of transactions)?
For example why would I need GO after each update here:
UPDATE [Country]
SET [CountryCode] = 'IL'
WHERE code = 'IL'
GO
UPDATE [Country]
SET [CountryCode] = 'PT'
WHERE code = 'PT'

GO is not properly a TSQL command.
Instead it's a command to the specific client program which connects to an SQL server (Sybase or Microsoft's - not sure about what Oracle does), signalling to the client program that the set of commands that were input into it up till the "go" need to be sent to the server to be executed.
Why/when do you need it?
GO in MS SQL server has a "count" parameter - so you can use it as a "repeat N times" shortcut.
Extremely large updates might fill up the SQL server's log. To avoid that, they might need to be separated into smaller batches via go.
In your example, if updating for a set of country codes has such a volume that it will run out of log space, the solution is to separate each country code into a separate transaction - which can be done by separating them on the client with go.
Some SQL statements MUST be separated by GO from the following ones in order to work.
For example, you can't drop a table and re-create the same-named table in a single transaction, at least in Sybase (ditto for creating procedures/triggers):
> drop table tempdb.guest.x1
> create table tempdb.guest.x1 (a int)
> go
Msg 2714, Level 16, State 1
Server 'SYBDEV', Line 2
There is already an object named 'x1' in the database.
> drop table tempdb.guest.x1
> go
> create table tempdb.guest.x1 (a int)
> go
>

GO is not a statement, it's a batch separator.
The blocks separated by GO are sent by the client to the server for processing and the client waits for their results.
For instance, if you write
DELETE FROM a
DELETE FROM b
DELETE FROM c
, this will be sent to the server as a single 3-line query.
If you write
DELETE FROM a
GO
DELETE FROM b
GO
DELETE FROM c
, this will be sent to the server as 3 one-line queries.
GO itself does not go to the server (no pun intended). It's a pure client-side reserved word and is only recognized by SSMS and osql.
If you will use a custom query tool to send it over the connection, the server won't even recognize it and issue an error.

Many command need to be in their own batch, like CREATE PROCEDURE
Or, if you add a column to a table, then it should be in its own batch.
If you try to SELECT the new column in the same batch it fails because at parse/compile time the column does not exist.
GO is used by the SQL tools to work this out from one script: it is not a SQL keyword and is not recognised by the engine.
These are 2 concrete examples of day to day usage of batches.
Edit: In your example, you don't need GO...
Edit 2, example. You can't drop, create and permission in one batch... not least, where is the end of the stored procedure?
IF OBJECT_ID ('dbo.uspDoStuff') IS NOT NULL
DROP PROCEDURE dbo.uspDoStuff
GO
CREATE PROCEDURE dbo.uspDoStuff
AS
SELECT Something From ATable
GO
GRANT EXECUTE ON dbo.uspDoStuff TO RoleSomeOne
GO

Sometimes there is a need to execute the same command or set of commands over and over again. This may be to insert or update test data or it may be to put a load on your server for performance testing. Whatever the need the easiest way to do this is to setup a while loop and execute your code, but in SQL 2005 there is an even easier way to do this.
Let's say you want to create a test table and load it with 1000 records. You could issue the following command and it will run the same command 1000 times:
CREATE TABLE dbo.TEST (ID INT IDENTITY (1,1), ROWID uniqueidentifier)
GO
INSERT INTO dbo.TEST (ROWID) VALUES (NEWID())
GO 1000
source:
http://www.mssqltips.com/tip.asp?tip=1216
Other than that it marks the "end" of an SQL block (e.g. in a stored procedure)... Meaning you're on a "clean" state again... e.G: Parameters used in the statement before the code are reset (not defined anymore)

As everyone already said, "GO" is not part of T-SQL. "GO" is a batch separator in SSMS, a client application used to submit queries to the database. This means that declared variables and table variables will not persist from code before the "GO" to code following it.
In fact, GO is simply the default word used by SSMS. This can be changed in the options if you want. For a bit of fun, change the option on someone else's system to use "SELECT" as a batch seperator instead of "GO". Forgive my cruel chuckle.

It is used to split logical blocks. Your code is interpreted into sql command line and this indicate next block of code.
But it could be used as recursive statement with specific number.
Try:
exec sp_who2
go 2
Some statement have to be delimited by GO:
use DB
create view thisViewCreationWillFail

Related

Downloading KDB tables into a [q] session from a web-facing database

What I have: the hostname/port number of an always-running [q] session that exposes several KDB tables via our internal web. I can easily run [q] commands against it in a browser (or even, through the use of [hopen], via a local [q] session invoked on the command line).
What I need: a [q] script, or the knowledge of how to write one, that will automatically connect to the web-facing database, and copy over all of its tables into the localhost [q] session's working memory (without knowing all the table names in advance).
Concerns include:
The tables are huge. I'm prepared to wait on my machine if need be, but I do need this to work eventually.
While I can get a legible list of all the server's table-names, I can never get it in a useful format (ideally it'd be a List, rather than the Symbol that the hopen-ed [tables] command always gives me). Also, I'm told that it may be possible to accomplish the transfers without ever explicitly querying the table names, though I can't imagine how; bonus points if you manage that.
Yo can implement smoething like this :
.data.oc:1000;
/connect to the session using hopen
h:hopen `::1234;
/get the table names
tabs:h"tables[]";
/create local tables with the same names
{ .[x;();:;()] } each tabs;
/for each table name
{[tab]
/get the table count
c:h({count value x};tab);
oc:.data.oc;
/cut the table count to some optimal value, say 10,000 (0-99999; 10000-19999).
idxl:$[c>oc; [ l: c div oc; ( (0;oc-1)+/:oc*til l),enlist (l*oc;c-1) ] ; enlist (0; c-1)];
/now iterate over the list and use them as indexes to query the table.
{[t;idx] t upsert h ({[t;y] ?[t; enlist (within;`i;y);0b;()] } ; t;idx ) }[tab] each idxl;
}each tabs

Postgres 'if not exists' fails because the sequence exists

I have several counters in an application I am building, as am trying to get them to be dynamically created by the application as required.
For a simplistic example, if someone types a word into a script it should return the number of times that word has been entered previously. Here is an example of sql that may be executed if they typed the word example.
CREATE SEQUENCE IF NOT EXISTS example START WITH 1;
SELECT nextval('example')
This would return 1 the first time it ran, 2 the second time, etc.
The problem is when 2 people click the button at the same time.
First, please note that a lot more is happening in my application than just these statements, so the chances of them overlapping is much more significant than it would be if this was all that was happening.
1> BEGIN;
2> BEGIN;
1> CREATE SEQUENCE IF NOT EXISTS example START WITH 1;
2> CREATE SEQUENCE IF NOT EXISTS example START WITH 1; -- is blocked by previous statement
1> SELECT nextval('example') -- returns 1 to user.
1> COMMIT; -- unblocks second connection
2> ERROR: duplicate key value violates unique constraint
"pg_type_typname_nsp_index"
DETAIL: Key (typname, typnamespace)=(example, 109649) already exists.
I was under the impression that by using "IF NOT EXISTS", the statement should just be a no-op if it does exist, but it seems to have this race condition where that is not the case. I say race condition because if these two are not executed at the same time, it works as one would expect.
I have noticed that IF NOT EXISTS is fairly new to postgres, so maybe they haven't worked out all of the kinks yet?
EDIT:
The main reason we were considering doing things this way was to avoid excess locking. The thought being that if two people were to increment at the same time, using a sequence would mean that neither user should have to wait for the other (except, as in this example, for the initial creation of that sequence)
Sequences are part of the database schema. If you find yourself modifying the schema dynamically based on the data stored in the database, you are probably doing something wrong. This is especially true for sequences, which have special properties e.g. regarding their behavior with respect to transactions. Specifically, if you increment a sequence (with the help of nextval) in the middle of a transaction and then you rollback that transaction, the value of the sequence will not be rolled back. So most likely, this kind of behavior is something that you don't want with your data. In your example, imagine that a user tries to add word. This results in the corresponding sequence being incremented. Now imagine that the transaction does not complete for reason (e.g. maybe the computer crashes) and it gets rolled back. You would end up with the word not being added to the database but with the sequence being incremented.
For the particular example that you mentioned, there is an easy solution; create an ordinary table to store all the "sequences". Something like that would do it:
CREATE TABLE word_frequency (
word text NOT NULL UNIQUE,
frequency integer NOT NULL
);
Now I understand that this is just an example, but if this approach doesn't work for your actual use case, let us know and we can adjust it to your needs.
Edit: Here's how you the above solution works. If a new word is added, run the following query ("UPSERT" syntax in postgres 9.5+ only):
INSERT INTO word_frequency(word,frequency)
VALUES ('foo',1)
ON CONFLICT (word)
DO UPDATE
SET frequency = word_frequency.frequency + excluded.frequency
RETURNING frequency;
This query will insert a new word in word_frequency with frequency 1, or if the word exists already it will increment the existing frequency by 1. Now what happens if two transaction try to do that at the same time? Consider the following scenario:
client 1 client 2
-------- --------
BEGIN
BEGIN
UPSERT ('foo',1)
UPSERT ('foo',1) <====
COMMIT
COMMIT
What will happen is that as soon as client 2 tries increment the frequency for foo (marked with the arrow above), that operation will block because the row was modified by a different transaction. When client 1 commits, client 2 will get unblocked and continue without any errors. This is exactly how we wanted it to work. Also note, that postgresql will use row-level locking to implement this behavior, so other insertions will not be blocked.
EDIT: The main reason we were considering doing things this way was to
avoid excess locking. The thought being that if two people were to
increment at the same time, using a sequence would mean that neither
user should have to wait for the other (except, as in this example,
for the initial creation of that sequence)
It sounds like you're optimizing for a problem that likely does not exist. Sure, if you have 100,000 simultaneous users that are only inserting rows (since a sequence will only be used then normally) there is the possibility of some contention with the sequence but realistically there will be other bottle necks long before the sequence gets in the way.
I'd advise you to first prove that the sequence is an issue. With a proper database design (which dynamic DDL is not) the sequence will not be the bottle neck.
As a reference, DDL is not transaction safe in most databases.

T-SQL - Trying to query something across all databases on my server

I've got an environment where my server is hosting a variable number of databases, all of which utilize the same table structures/schemas. I need to pull a sum of customers that meet a certain series of constraints with say, the user table. I also need to show which database I am showing the sum for.
I already know all I need to get the sum in a db by db query, but what I'm really looking to do is have one script that hits all of the non-system DBs currently on my server to grab this info.
Please forgive my ignorance in this, just starting out.
Update-
So, to clarify things somewhat; I'm using MS SQL 2014. I know how to pull a listing of the dbs I want to hit by using:
SELECT name
FROM sys.databases
WHERE name not in ('master', 'model', 'msdb', 'tempdb')
AND state = 0
And for the purposes of gathering the data I need from each, let's just say I've got something like:
select count(u.userid)
from users n
join UserAttributes ua on u.userid = ua.userid
where ua.status = 2
New Update:
So, I went ahead and added the ps sp_foreachdb as suggested by #Philip Kelley, and I'm now running into a problem when trying to run this (admittedly, I can tell I'm closer to a solution). So, this is what I'm using to call the sp:
USE [master]
GO
DECLARE #return_value int
EXEC #return_value = [dbo].[sp_foreachdb]
#command = N'select count(userid) as number from ?..users',
#print_dbname = 1,
#user_only = 1
SELECT 'Return Value' = #return_value
GO
This provides a nice and clean output showing a count, but what I'd like to see is the db name in addition to the count, something like this:
|[DB_NAME]|[COUNT]|
But for each DB
Is this even possible?
Source Code: https://codereview.stackexchange.com/questions/113063/executing-dynamic-sql-programmatically
Example Usage:
declare #options int = (
select a.ExcludeSystemDatabases
from dbo.ForEachDatabaseOptions() as a
);
execute dbo.usp_ForEachDatabase
#Command = N'print Db_Name();'
, #Options = #options;
#Command can be anything you want but obviously it needs to be a query that every single database can understand. #Options currently has 3 built-in settings but can be expanded however you see fit.
I wrote this to mimic/expand upon the master.sys.sp_MSforeachdb procedure but it could still use a little bit of polish (especially around the "logic" that replaces ? with the current database name).
Enumerate the databases from schema / sysdatabases. At least in situations without replication, excluding db_ids 1 to 4 as system databases should be reasonably robust:
SELECT [name] FROM master.dbo.sysdatabases WHERE dbid NOT IN (1,2,3,4)
Other methods exist, see here: Get list of databases from SQL Server and here: SQL Server: How to tell if a database is a system database?
Then prefix the query or stored procedure call with the database name, and in a cursor loop over the resultset of the first query, store that in a sysname variable to construct a series of statements like that:
SELECT column FROM databasename.schema.Viewname WHERE ...
and call that using the string execute function
EXECUTE('SELECT ... FROM '+##fully_qualified_table_name+' WHERE ...')
There’s the undocumented sytem procedure, sp_msForEachDB, as found in the master database. Many pundits on the internet recommend not using this, as under obscure fringe cases it can be unreliable and somehow skip random databases. Count me as one of them, this caused me serious grief a few months back.
You can write your own routine to provide this kind of functionality. This is a common task, however, and many people have already done it and posted their code online… so why re-invent the wheel?
#kittoes0124 posted a link to “usp_ForEachDatabse”. This probably works, though pro forma I hate any stored procedures that beings with usp_. I ended up with Aaron Bertrand’s utility, which can be found at http://www.mssqltips.com/sqlservertip/2201/making-a-more-reliable-and-flexible-spmsforeachdb/.
Install a version of this routine, figure out how it works, plug in your script, and go!

Add multiple PostgreSQL enum values with one command

Since PostgreSQL 9.1 enum values can be added using
ALTER TYPE my_type ADD VALUE new_value;
However, trying to run this as part of a bigger script gives an error:
ALTER TYPE ... ADD cannot be executed from a function or multi-command string
This makes it a real hassle to script changes to be applied in production, because support staff have to remember that, while most scripts can be run "normally", there are a few "special" scripts that need to be opened in pgAdmin and run manually, piece by piece. I've Googled this and I understand the limitation - enum values cannot be added inside a transaction or part of a "multi-command string". That's OK. I don't need to do that. I just want to add multiple enum values and execute other, unrelated, SQL statements without having to feed Postgres one statement at a time.
In other words: I want to be able to run a single script, both from pgAdmin and from psql, that simply does the same thing that pgAdmin does when I highlight one line at a time and press F5 (run). Is there a way to do this, maybe with plpgsql?
Looks like this will finally be fixed in PostgreSQL 12.
https://www.postgresql.org/docs/12/sql-altertype.html says
If ALTER TYPE ... ADD VALUE (the form that adds a new value to an enum type) is executed inside a transaction block, the new value cannot be used until after the transaction has been committed.

Updating the text of a large number of stored procedures

The question pretty much sums it up. I've got to replace text in a large number for store procedures. Its not so many that doing it manually is impossible, but enough that I'm asking the question. I also prefer automation as it reduces the change of user error when we make the change in production.
I can Identify them like this:
select OBJECT_DEFINITION(object_id), *
from sys.procedures
where OBJECT_DEFINITION(object_id) like '%''MyExampleLiteral''%'
order by name
Is there any way to mass update them all to change 'MyExampleLiteral' to 'MyOtherExampleLiteral'?
I'd even settle for a way to open all the stored procs. Just Finding these store procs in a larger list will take some time.
I thought about generating alter statements using the above select statements, but then I lose line breaks.
Thanks in advance,
This is a Microsoft SQL Server.
There are different tools to use depending on the database in question. For example, Microsoft SQL Server Data Tools integrates with Visual Studio, and allows you to do these types of operations fairly easily. The database is stored in your solution as scripts, which you can then search and replace any keyword you wish. I'm assuming there would be similar tools available for other platforms.
You could do this with dynamic sql. Query the system tables to get all the SPs containing your "MyExampleLiteral":
SELECT [object_id] FROM sys.objects o
WHERE type_desc = 'SQL_STORED_PROCEDURE'
AND is_ms_shipped = 0
AND OBJECT_DEFINITION(o.[object_id]) LIKE '%<search string>%'
Then, write a while loop to go through those object_ids. In the while loop, get the OBJECT_DEFINITION() into a string and replace the "MyExampleLiteral", then replace CREATE PROCEDURE with ALTER PROCEDURE and execute the string using sp_executesql.
Doing something this crazy, make sure you backup the database first.