I would like to get the query execution time in iSQL.
For instance:
SELECT * FROM students;
How do I get the query execution time?
Use SET STATS:
SQL> SET STATS;
SQL> SELECT * FROM RDB$DATABASE;
... query output removed ....
Current memory = 34490656
Delta memory = 105360
Max memory = 34612544
Elapsed time= 0.59 sec
Buffers = 2048
Reads = 17
Writes 0
Fetches = 270
SQL>
Note: I'm running Postgres 11.7 and Python 2.7.17.
It appears that time.sleep() has a 60-second limit inside a plpythonu function. It works as expected for values up to 60 seconds, but if passed a value greater than 60 it stops at 60 seconds.
CREATE OR REPLACE FUNCTION test_sleep(interval_str INTERVAL)
RETURNS INTERVAL
AS $$
try:
    import time
    import datetime
    start_time = time.time()
    t = datetime.datetime.strptime(interval_str, "%H:%M:%S")
    tdelta = datetime.timedelta(hours=t.hour, minutes=t.minute, seconds=t.second)
    interval_secs = tdelta.total_seconds()
    print 'interval_secs=%s' % repr(interval_secs)
    time.sleep(interval_secs)
    elapsed_secs = time.time() - start_time
    return datetime.timedelta(seconds=elapsed_secs)
except:
    import traceback
    print traceback.format_exc()
    raise
$$ LANGUAGE PLPYTHONU;
Here's a test on the command line using psql. The first test runs as expected: I told it to sleep for two seconds and it did. The second test only sleeps for 60 seconds even though I requested a sleep of 120 seconds.
$ echo "select now(), test_sleep('2 sec'); select now();" | psql
now | test_sleep
------------------------------+-----------------
2021-05-07 11:10:27.63041-04 | 00:00:02.005745
(1 row)
now
-------------------------------
2021-05-07 11:10:29.652542-04
(1 row)
$ echo "select now(), test_sleep('2 min'); select now();" | psql
now | test_sleep
-------------------------------+-----------------
2021-05-07 11:10:36.056578-04 | 00:00:59.977787
(1 row)
now
-------------------------------
2021-05-07 11:11:36.050637-04
(1 row)
$
Here's output from the Postgres log file.
interval_secs=2.0
interval_secs=120.0
This is not unexpected (although I can't reliably reproduce it).
https://realpython.com/python-sleep/
Note: In Python 3.5, the core developers changed the behavior of
time.sleep() slightly. The new Python sleep() system call will last at
least the number of seconds you’ve specified, even if the sleep is
interrupted by a signal. This does not apply if the signal itself
raises an exception, however.
So in Python 2, time.sleep() can return early if interrupted by a signal. If you don't like that, have it loop to sleep again for the remainder of the time.
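A minimal sketch of such a retry loop in plain Python 2 (the helper name sleep_full is my own, not part of any library):

import time

def sleep_full(seconds):
    # Sleep for at least `seconds`, going back to sleep if time.sleep()
    # returns early (e.g. because the call was interrupted by a signal).
    deadline = time.time() + seconds
    remaining = seconds
    while remaining > 0:
        time.sleep(remaining)
        remaining = deadline - time.time()

sleep_full(120)  # keeps sleeping until the full 120 seconds have elapsed

Inside the plpythonu function you could call a loop like this in place of the single time.sleep(interval_secs) call.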
Basically I need to automate all of the below in a Snowflake TASK:
Create/replace a csv file format and stage in Snowflake
Run the task query (which runs every few days to pull some stats)
Unload the query results each time it runs into the Stage csv
Download the contents of the stage csv to a local file on my machine
What I can't get right is the COPY INTO the stage: how do I unload the results of the task into the stage each time it runs?
I don't know what to put in the FROM clause; TITANLOADSUCCESSVSFAIL is not recognized, but this is the name of the TASK:
COPY INTO #TitanLoadStage/unload/ FROM TITANLOADSUCCESSVSFAIL FILE_FORMAT = TitanLoadSevenDays
This is my first time using a stage and downloading locally with Snowflake, so I appreciate any advice on how to get this up and running!
Thanks,
Nick
Full Code:
-- create a csv file format
CREATE OR REPLACE FILE FORMAT TitanLoadSevenDays
type = 'CSV'
field_delimiter = '|';
--create a snowflake staging table using the csv
CREATE OR REPLACE STAGE TitanLoadStage
file_format = TitanLoadSevenDays;
CREATE TASK IF NOT EXISTS TitanLoadSuccessVsFail
WAREHOUSE = ITSM_LWH
SCHEDULE = 'USING CRON 1 * * * * Australia/Canberra' --every minute for testing purposes
COMMENT = 'Last 7 days of Titan game success vs fail load %'
AS
WITH SUCCESSCTE AS (
SELECT CLIENTNAME
, COUNT(EVENTTYPE) AS SuccessLoad --count success load events for that game
FROM vw_fact_gameload60
WHERE EVENTTYPE = 103 --success load events
AND USERTYPE = 1 --real users
AND APPID = 2 --titan games
AND EVENTARRIVALDATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE)) --only looking at the last week
GROUP BY CLIENTNAME
),
FAILCTE AS ( --same as above but for failed loads
SELECT CLIENTNAME
, COUNT(EVENTTYPE) AS FailedLoads -- count failed load events for that game
FROM vw_fact_gameload60
WHERE EVENTTYPE = 106 -- failed load events
AND USERTYPE = 1 -- real users
AND APPID = 2 -- Titan games
AND EVENTARRIVALDATE >= DATEADD(DAY, -7, CAST(GETDATE() AS DATE)) -- last 7 days
--AND FACTEVENTARRIVALDATE BETWEEN DATEADD(DAY, -7, GETDATE())AND GETDATE() -- last 7 days
GROUP BY CLIENTNAME
)
SELECT COALESCE(s.CLIENTNAME, f.CLIENTNAME) AS ClientName
, ZEROIFNULL(s.SuccessLoad) + ZEROIFNULL(f.FailedLoads) AS TotalLoads --sum the success and failed loads found for 103, 106 events only, calculated in CTEs
, ZEROIFNULL(s.SuccessLoad) AS Cnt_SuccessLoad --count from success cte
, ZEROIFNULL(f.FailedLoads) AS Cnt_FailedLoads --count from fail cte
, CONCAT(ZEROIFNULL(ROUND(s.SuccessLoad * 100.0 / TotalLoads,2)) , '%') As Pct_Success --percentage of SuccessLoads against total
, CONCAT(ZEROIFNULL(ROUND(f.FailedLoads * 100.0 / TotalLoads,2)), '%') AS Pct_Fail -- percentage of failedLoads against total
FROM SUCCESSCTE s
FULL OUTER JOIN FAILCTE f -- outer join on the fail CTE by game name; outer join required because some Titan games' success or fail events are NULL
ON s.CLIENTNAME = f.Clientname
ORDER BY CLIENTNAME ASC
--copy the results from the query to the snowflake staging table created above
COPY INTO #TitanLoadStage/unload/ FROM TITANLOADSUCCESSVSFAIL FILE_FORMAT = TitanLoadSevenDays
-- export the stage data to csv located in common folder
GET #TitanLoadStage/unload/data_0_0_0.csv.gz file:\\itsm\group\ITS%20Management\Common\All%20Staff\SMD\Games\Snowflake%20and%20GamesDNA\Snowflake\SnowflakeCSV\TitanLoad.csv
-- start the task
ALTER TASK IF EXISTS TitanLoadSuccessVsFail RESUME
If you want to get the results of a query run through a task, you need to materialize the results of said query to a table.
What you have now:
CREATE TASK mytask_minute
WAREHOUSE = mywh
SCHEDULE = '5 MINUTE'
AS
SELECT 1 x;
COPY INTO #TitanLoadStage/unload/
FROM mytask_minute;
(mytask_minute is not a table, so you can't select from it)
What you should do instead:
CREATE TASK mytask_minute
WAREHOUSE = mywh
SCHEDULE = '5 MINUTE'
AS
CREATE OR REPLACE TABLE task_results_table
AS
SELECT 1 x;
COPY INTO #TitanLoadStage/unload/
FROM (SELECT * FROM task_results_table);
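Applied to the names from the question, a rough sketch could look like the following (the results table name TITAN_LOAD_RESULTS is my own; I'm also assuming the named stage is referenced with @ and that the COPY INTO runs as a separate statement, e.g. from a second task or from SnowSQL, since a task body holds a single SQL statement):

-- task: materialize the report query into a table on each run
CREATE OR REPLACE TASK TitanLoadSuccessVsFail
  WAREHOUSE = ITSM_LWH
  SCHEDULE = 'USING CRON 1 * * * * Australia/Canberra'
  COMMENT = 'Last 7 days of Titan game success vs fail load %'
AS
  CREATE OR REPLACE TABLE TITAN_LOAD_RESULTS AS
  -- the full WITH SUCCESSCTE ... / FAILCTE ... SELECT from the question goes here
  SELECT ...;

-- unload the materialized results into the named stage (run separately)
COPY INTO @TitanLoadStage/unload/
FROM (SELECT * FROM TITAN_LOAD_RESULTS)
FILE_FORMAT = (FORMAT_NAME = 'TitanLoadSevenDays')
OVERWRITE = TRUE;

GET (to download the unloaded file to your machine) is a client command, so it is still issued from SnowSQL rather than from inside the task.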
I use an H2 database file in my Java app. The database is relatively small, 5 MB, on an SSD. The biggest table has about 25000 rows (25 columns). A simple select query on this table takes around 1.3 seconds, which seems very slow. The table has a primary key on ID. Here is the test code:
long tic = System.nanoTime();
final String sqlCmd = "select * from Transactions order by ID";
//final String sqlCmd = "select ID from TRANSACTIONS order by ID";
try (Statement statement = mConnection.createStatement();
     ResultSet resultSet = statement.executeQuery(sqlCmd)) {
} catch (SQLException e) {
    mLogger.error("SQLException " + e.getSQLState(), e);
}
long toc = System.nanoTime();
System.err.println("transactionTimingTest: " + (toc - tic) / 1e6);
Repeating the same select query runs considerably faster, about 0.2 seconds. Is there any way to improve the timing of the first run of the select query on the table?
With PostgreSQL 9.5, I would like to track the total amount of bytes written (since DB cluster start) to:
WAL
temp files
temp tables
For 1.:
select
pg_size_pretty(archived_count * 16*1024*1024) wal_bytes,
(now() - stats_reset)::text uptime
from pg_stat_archiver;
For 2.:
select
(now() - stats_reset)::text uptime,
pg_size_pretty(temp_bytes) temp_bytes
from pg_stat_database where datname = 'mydb';
How do I get 3.?
In response to a comment below, I did some tests to check where temp tables are actually written.
First, the DB parameter temp_buffers is at 8GB on this cluster:
select pg_size_pretty(setting::bigint*8192) from pg_settings
where name = 'temp_buffers';
-- "8192 MB"
Let's create a temp table:
drop table if exists foo;
create temp table foo as
select random() from generate_series(1, 1000000000);
-- Query returned successfully: 1000000000 rows affected, 10:22 minutes execution time.
Check the PostgreSQL backend pid and OID of the created temp table:
select pg_backend_pid(), 'pg_temp.foo'::regclass::oid;
-- 46573;398695055
Check the RSS size of the backend process
~$ grep VmRSS /proc/46573/status
VmRSS: 9246276 kB
As can be seen, this is only slightly above the 8GB set with temp_buffers.
The data inserted into the temp table is, however, written immediately, and it is written to the normal tablespace directories, not to temp files:
select * from pg_relation_filepath('pg_temp.foo')
-- "base/16416/t3_398695055"
Here is the number of files and amount written:
with temp_table_files as
(
select * from pg_ls_dir('base/16416/') fn
where fn like 't3_398695055%'
)
select
count(*) as cnt,
pg_size_pretty(sum((pg_stat_file('base/16416/' || fn)).size)) as size
from temp_table_files;
-- 34;"34 GB"
And finally verify that the set of temp files owned by this backend PID is indeed empty:
with temp_files_per_pid as
(
with temp_files as
(
select
temp_file,
(regexp_replace(temp_file, $r$^pgsql_tmp(\d+)\..*$$r$, $rr$\1$rr$, 'g'))::int as pid,
(pg_stat_file('base/pgsql_tmp/' || temp_file)).size as size
from pg_ls_dir('base/pgsql_tmp') temp_file
)
select pid, pg_size_pretty(sum(size)) from temp_files group by pid order by pid
)
select * from temp_files_per_pid where pid = 46573;
Returns nothing.
What is also "interesting", after dropping the temp table
DROP TABLE foo;
the RSS of the backend process does not decrease:
~$ grep VmRSS /proc/46573/status
VmRSS: 9254544 kB
Doing the following will also not free the RSS again:
RESET ALL;
DEALLOCATE ALL;
DISCARD TEMP;
As far as I know, there is no special metric for temp tables. Temp tables use session (process) memory up to temp_buffers size (8MB by default). When these temp buffers are full, temporary files are generated.
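If a point-in-time figure is enough (rather than a cumulative counter of bytes written), one sketch, limited to the current session's temp tables, is to sum the on-disk size of temporary relations via the catalogs:

-- On-disk size of this session's temp tables (a snapshot, not a
-- cumulative "bytes written" statistic)
SELECT c.relname,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS size
FROM pg_class c
WHERE c.relpersistence = 't'                 -- temporary relations
  AND c.relnamespace = pg_my_temp_schema();  -- only this session's temp schema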
I'm using Esper (the event processing engine), the EPL query is:
select * from Event.ext:time_order(timestamp_event, 10000 minutes) where duration > 10
But the output is not ordered by "timestamp_event":
id int = 1, timestamp_event= 1412686800000, duration = 30
id int = 4, timestamp_event= 1412685900000, duration = 70
id int = 2, timestamp_event= 1412688600000, duration = 45
id int = 3, timestamp_event= 1412689500000, duration = 60
id int = 5, timestamp_event= 1412636400000, duration = 15
Why doesn't the "time_order(timestamp_event, 10000 minutes)" instruction work?
I think the problem is in the Esper configuration; let's consider a simple query:
select * from Event.win:time(10 sec) order by id_event
This is the code of the "update" method of the UpdateListener:
public void update(EventBean[] newEvents, EventBean[] oldEvents) {
    EventBean event = newEvents[0];
    System.out.println("id int = " + event.get("id_event") + ", timestamp_event = " + ((Long) event.get("timestamp_event")).toString());
}
But the output is not ordered by "id_event"!
id event = 1, timestamp_event = 1412686800000
id event = 4, timestamp_event = 1412687700000
id event = 2, timestamp_event = 1412687100000
id event = 3, timestamp_event = 1412687400000
id event = 5, timestamp_event = 1412688000000
It seems the "order by" instruction doesn't work either; how is that possible?
The documentation says to select the rstream, as it is the events leaving the window that are ordered, not the events entering it. See http://esper.codehaus.org/esper-5.0.0/doc/reference/en-US/html_single/index.html#view-time-order
You need to define some sort of time or length constraint. Your statement simply returns all the events entering the time_order window.
A statement like this for example would give you all the events in the right order, every 1 minute:
select * from Event.ext:time_order(timestamp_event, 10000 minutes)
where duration > 10 output snapshot every 1 minute
Or, you could define a data window and insert events into it like this:
create window OrderedEvents.ext:time_order(timestamp_event, 10000 minutes) as select * from Event;
insert into OrderedEvents select * from Event;
You can then use ad-hoc queries against it, and they will return the events in the correct order (although you could achieve the same with a win:time(10000 minutes) window and an order by timestamp_event in your ad-hoc query).
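For illustration, a sketch of that alternative (the window name RecentEvents is my own; it assumes the same Event type with timestamp_event and duration properties):

// plain time window holding the last 10000 minutes of events
create window RecentEvents.win:time(10000 minutes) as select * from Event;
insert into RecentEvents select * from Event;

// fire-and-forget (ad-hoc) query, ordered explicitly at query time
select * from RecentEvents where duration > 10 order by timestamp_event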