How to check how much space a table is using in DB2

Is there a way in DB2 to identify how much of the total space allocated to the underlying table space a table is consuming?
Thanks

DB2 queries to check the table size:
-- estimated size: rows (CARD) x (sum of average column lengths + 10 bytes
-- of per-row overhead), in MB; statistics must be current (run RUNSTATS first)
SELECT a.card * (SUM(b.avgcollen) + 10) / 1024 / 1024 AS tablesize_in_mb
FROM   syscat.tables a
JOIN   syscat.columns b
  ON   a.tabschema = b.tabschema AND a.tabname = b.tabname
WHERE  a.tabname = 'TABLE_NAME'
GROUP  BY a.card
Data size:
SELECT char(date(t.stats_time)) || ' ' || char(time(t.stats_time)) AS statstime,
       substr(t.tabschema, 1, 8) || '.' || substr(t.tabname, 1, 24) AS tabname,
       t.card AS rows_per_table,
       decimal(float(t.npages) / (1024 / (b.pagesize / 1024)), 9, 2) AS used_mb,
       decimal(float(t.fpages) / (1024 / (b.pagesize / 1024)), 9, 2) AS allocated_mb
FROM   syscat.tables t
JOIN   syscat.tablespaces b ON t.tbspace = b.tbspace
ORDER  BY 5 DESC
WITH   UR
Index size:
SELECT rtrim(substr(i.tabschema, 1, 8)) || '.' || rtrim(substr(i.tabname, 1, 24)) AS tabname,
       decimal(SUM(i.nleaf) / (1024 / (b.pagesize / 1024)), 12, 2) AS indx_used_per_table_mb
FROM   syscat.indexes i
JOIN   syscat.tables t
  ON   i.tabname = t.tabname AND i.tabschema = t.tabschema
JOIN   syscat.tablespaces b ON t.tbspace = b.tbspace
WHERE  i.tabschema IS NOT NULL
GROUP  BY i.tabname, i.tabschema, b.pagesize
ORDER  BY 2 DESC
WITH   UR
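If you want one figure per table, the data and index queries above can be combined. A hedged sketch (untested; same caveat that FPAGES and NLEAF are only as fresh as the last RUNSTATS):
SELECT substr(t.tabschema, 1, 8) || '.' || substr(t.tabname, 1, 24) AS tabname,
       -- data pages plus index leaf pages, converted to MB via the page size
       decimal((t.fpages + COALESCE(SUM(i.nleaf), 0)) * b.pagesize / 1048576.0, 12, 2) AS total_mb
FROM   syscat.tables t
JOIN   syscat.tablespaces b ON t.tbspace = b.tbspace
LEFT OUTER JOIN syscat.indexes i
  ON   i.tabschema = t.tabschema AND i.tabname = t.tabname
GROUP  BY t.tabschema, t.tabname, t.fpages, b.pagesize
ORDER  BY 2 DESC
WITH   UR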

Your question contains a false premise: a tablespace may be created without a fixed limit on its space, in which case the limit is rather the drive or share that the tablespace is on. But if you know the space allocated to a tablespace, or only need the percentage of a tablespace's current size that a table is using, then yes, there is a way to know.
SELECT * FROM SYSCAT.TABLES WHERE tabname = 'MYTABLE';
will tell you how many pages a table is using (the NPAGES and FPAGES columns).
Then at the command line:
LIST TABLESPACES SHOW DETAIL
will tell you how many total pages are in the tablespace and the size of a page in bytes.
SELECT * FROM sysibmadm.tbsp_utilization WHERE tbsp_name = 'MYTBLSPACE';
will give you the maximum size of the tablespace, if it has one.
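Putting those pieces together, a hedged sketch of the percentage calculation (untested; it assumes the FPAGES figure in SYSCAT.TABLES is current, i.e. RUNSTATS has been run, and the TBSP_TOTAL_PAGES column of SYSIBMADM.TBSP_UTILIZATION):
SELECT t.tabname,
       t.fpages,                        -- pages allocated to the table
       u.tbsp_total_pages,              -- total pages in the tablespace
       decimal(t.fpages * 100.0 / u.tbsp_total_pages, 5, 2) AS pct_of_tablespace
FROM   syscat.tables t
JOIN   sysibmadm.tbsp_utilization u ON t.tbspace = u.tbsp_name
WHERE  t.tabname = 'MYTABLE'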

You can get the underlying table's physical size from the SYSIBMADM.ADMINTABINFO administrative view.
The command is given below.
db2 "SELECT SUBSTR(TABSCHEMA,1,15) as SCHEMA, SUBSTR(TABNAME,1,20) as
TABLENAME, DATA_OBJECT_P_SIZE ,INDEX_OBJECT_P_SIZE ,
LONG_OBJECT_P_SIZE , LOB_OBJECT_P_SIZE , XML_OBJECT_P_SIZE,
(DATA_OBJECT_P_SIZE + INDEX_OBJECT_P_SIZE + LONG_OBJECT_P_SIZE +
LOB_OBJECT_P_SIZE + XML_OBJECT_P_SIZE ) as TOTAL_P_SIZE from
SYSIBMADM.ADMINTABINFO where TABNAME='table_name'"
The total physical size is the sum of the DATA, INDEX, LONG, LOB and XML object physical sizes (the _P_SIZE suffix indicates a physical size, reported in KB).
You cannot get the tablespace name (only the tablespace id) from SYSIBMADM.ADMINTABINFO; for that you need to join the above query with SYSCAT.TABLES.
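A minimal sketch of that join (untested; the _P_SIZE columns are in KB):
SELECT a.tabschema,
       a.tabname,
       t.tbspace,  -- tablespace name comes from SYSCAT.TABLES
       a.data_object_p_size + a.index_object_p_size + a.long_object_p_size
         + a.lob_object_p_size + a.xml_object_p_size AS total_p_size_kb
FROM   sysibmadm.admintabinfo a
JOIN   syscat.tables t
  ON   a.tabschema = t.tabschema AND a.tabname = t.tabname
WHERE  a.tabname = 'table_name'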

Related

How to get the size of each row of a particular table in PostgreSQL?

Is it possible to get the size of each row of a particular table in Postgres?
For example, a table has 50 rows and the total table size is 10 GB, but I want to know each row's size rather than the complete table size.
A query to calculate the total size and row count of a Postgres table looks something like this:
SELECT sum(pg_column_size(t.*)) AS filesize, count(*) AS filerow
FROM TABLE_NAME AS t;
Replace TABLE_NAME with your table name; to restrict it to particular rows, add a condition such as WHERE t.COLUMN_NAME = 'YOUR_VALUE'.
There is no function that calculates the size of a row, only a function that calculates the size of a column value. So you can do something like this:
select pg_column_size(column_1) +
pg_column_size(column_2) +
pg_column_size(column_3) as row_size
from the_table;
This could be automated using dynamic SQL and a PL/pgSQL function if you need it.
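For illustration, a minimal sketch of such a function (row_sizes is a made-up name, not a built-in; it assembles the per-column pg_column_size() sum from the catalog at run time):
CREATE OR REPLACE FUNCTION row_sizes(p_table regclass)
RETURNS SETOF bigint
LANGUAGE plpgsql AS
$$
DECLARE
    col_expr text;
BEGIN
    -- build "pg_column_size(col1) + pg_column_size(col2) + ..." from the catalog
    SELECT string_agg(format('pg_column_size(%I)', attname), ' + ')
    INTO   col_expr
    FROM   pg_attribute
    WHERE  attrelid = p_table
      AND  attnum > 0
      AND  NOT attisdropped;

    -- one size per row of the target table
    RETURN QUERY EXECUTE format('SELECT (%s)::bigint FROM %s', col_expr, p_table);
END;
$$;
-- usage: SELECT row_sizes('the_table');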

Conditional OR in the SQL Server Join – Multi-Value Parameters

I have an SSRS report with 4 parameters, two of which are multi-value parameters (@material and @color, VARCHAR(MAX) in SQL Server 2008 R2). I am using a split function to turn each comma-separated value into rows:
SELECT *
FROM MyView
WHERE height > 200
  AND width > 100
  AND (
      material IN (SELECT Item FROM [dbo].[MySplitFunction] (@material, ',')) OR
      color IN (SELECT Item FROM [dbo].[MySplitFunction] (@color, ','))
  )
(The code above would return 50 records)
The problem with this approach is that these two multi-value parameters have around 1,500 different colors and materials between them, which degrades performance. Sometimes it takes more than 40 minutes to return the results (the view has around 600,000 rows).
I tried a different approach, where I used a temp table in the JOIN instead of the WHERE clause:
SELECT Item
INTO #TempTable
FROM [dbo].[MySplitFunction] (@material, ',')

SELECT *
FROM MyView
INNER JOIN #TempTable ON MyView.material = #TempTable.Item
WHERE height > 200
  AND width > 100
  AND material IN (SELECT Item FROM [dbo].[MySplitFunction] (@material, ','))
(The code above returns only 7 records, but the performance is much better.)
My question is: how can I return the same number of records (50 rows) using the second approach, adding the other @color parameter and allowing the OR condition? In the SSRS report the user can multi-select these two parameters, and the query should return rows where @material matches OR @color matches.
I am open to any other approach as long as it speeds up the query and allows the OR condition for the two multi-value parameters (@material, @color).
Thanks!
Something like the following might do the trick. I'm not sure I have the syntax precisely right, and it wants further testing and analysis that I can't do without the proper structures and data...
SELECT *
FROM MyView
WHERE height > 200
  AND width > 100
  AND (EXISTS (SELECT Item
               FROM dbo.MySplitFunction(@material, ',')
               WHERE Item = material)
    OR EXISTS (SELECT Item
               FROM dbo.MySplitFunction(@color, ',')
               WHERE Item = color)
      )
This performs two correlated subqueries on nested function calls. EXISTS checks are generally faster than IN lookups in these situations. The syntax bit that worries me is the "AND (EXISTS" bit -- that's the parenthesis for the OR clause, and combined with EXISTS it looks a bit wonky.
I think it should do what you want, but testing is definitely called for.
I mistrust that OR clause. To get rid of it, try this and see what happens:
SELECT * -- better with specific columns
FROM MyView
WHERE height > 200
  AND width > 100
  AND EXISTS (SELECT Item
              FROM dbo.MySplitFunction(@material, ',')
              WHERE Item = material)
UNION
SELECT *
FROM MyView
WHERE height > 200
  AND width > 100
  AND EXISTS (SELECT Item
              FROM dbo.MySplitFunction(@color, ',')
              WHERE Item = color)
This runs and combines two queries, removing all duplicates -- pretty much the same as the OR clause would.
The next thing to check would be table sizes and indexes. You're filtering results on (only!) the columns height, width, material, and color; if the table is huge, an appropriate index would help, as sketched below.
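For example, a hedged sketch of such an index (the base table name MyBaseTable is an assumption -- MyView is a view, so the index goes on its underlying table):
-- hypothetical: filter columns as keys, correlated columns covered via INCLUDE
CREATE NONCLUSTERED INDEX IX_MyBaseTable_Filter
ON dbo.MyBaseTable (height, width)
INCLUDE (material, color);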

Inserting rows into a temp table

I have a performance problem when creating a temporary table. The following code is part of a plpgsql function:
StartTime := clock_timestamp();
CREATE TEMP TABLE wo_tmp WITH (OIDS) AS
SELECT workorders1_.woid AS w_id, workorders1_.woid4seg AS w_id4seg
FROM common.workorders workorders1_
INNER JOIN common.lines lines2_ ON workorders1_.wolineid=lines2_.lineid
INNER JOIN common.products products2_ ON workorders1_.woprodid=products2_.prodid
INNER JOIN common.depts depts3_ ON lines2_.linedeptid=depts3_.deptid
WHERE workorders1_.wostatus='F'
AND workorders1_.wotypestatus = ANY ('{R,C,I,D}'::text[])
AND (p_deptid = 0 OR (depts3_.deptid = p_deptid AND ((p_deptid = 5 AND workorders1_.wosegid = 1) OR workorders1_.wosegid = 4)))
AND (p_lineid = 0 OR lines2_.lineid = p_lineid)
AND (p_prodid = 0 OR products2_.prodid = p_prodid)
AND (p_nrkokili = 0 OR workorders1_.wonrkokili = p_nrkokili)
AND (p_accepted = TRUE OR workorders1_.worjacceptstatus = 'Y')
AND workorders1_.wodateleaverr BETWEEN p_dfr AND p_dto
AND lines2_.status <> 'D';
CREATE INDEX wo_tmp_w_id_idx
ON wo_tmp USING btree (w_id ASC NULLS LAST);
CREATE INDEX wo_tmp_w_id4seg_idx
ON wo_tmp USING btree (w_id4seg ASC NULLS LAST);
EndTime := clock_timestamp();
Delta := extract('epoch' from EndTime)::bigint - extract('epoch' from StartTime)::bigint;
RAISE NOTICE 'Duration [0] in seconds=%', Delta;
Here's an explain analyze report: http://explain.depesz.com/s/uerF
It's strange: when I execute this function I get the notice Duration [0] in seconds=11. I checked the query without creating the temp table and the result time is ~300 ms.
Is it possible that inserting ~73k records into a temporary table takes 11 seconds? Can I speed it up?
When you fill a temp table inside a function, you can run into more than one issue:
locking issues - every temp table is a table with entries in the system catalog. Intensively creating and dropping these tables creates high overhead with a lot of locking. Sometimes temp tables can be replaced by arrays. That is not your case, because you need the indexes.
blind optimization - embedded SQL in plpgsql functions is optimized for the most common values (this mechanism was slightly enhanced in PostgreSQL 9.2, but still with possible performance issues). It is not optimized for the current values, and this fact can cause a performance issue. Then dynamic SQL is necessary; a sketch follows after the timings below. Some links on this issue (one and second).
some hw or file system issues - I am a little bit confused that WITHOUT OIDS helps. It looks like your file system is a terrible bottleneck for you. Temp tables are stored in the file system cache - storing 73k rows there should be fast .. removing four bytes (from 35) is not a big change:
postgres=# create table t1 with (oids) as select 1 a,2 b,3 c from generate_series(1,73000);
SELECT 73000
Time: 302.083 ms
postgres=# create table t2 as select 1 a,2 b,3 c from generate_series(1,73000);
SELECT 73000
Time: 267.459 ms
postgres=# create temp table t3 with (oids) as select 1 a,2 b,3 c from generate_series(1,73000);
SELECT 73000
Time: 154.431 ms
postgres=# create temp table t4 as select 1 a,2 b,3 c from generate_series(1,73000);
SELECT 73000
Time: 153.085 ms
postgres=# \dt+ t*
           List of relations
  Schema   | Name | Type  | Owner | Size    | Description
-----------+------+-------+-------+---------+-------------
 pg_temp_2 | t3   | table | pavel | 3720 kB |
 pg_temp_2 | t4   | table | pavel | 3160 kB |
 public    | t1   | table | pavel | 3720 kB |
 public    | t2   | table | pavel | 3160 kB |
(4 rows)
Writing a 3 MB file to the file system should take significantly less than 1 sec, so an 11 sec overhead is strange. P.S. The default temp_buffers is 8 MB, so your result should be stored in memory only - which makes this hypothesis probably false, and the blind optimization hypothesis more probable.
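To illustrate the dynamic SQL workaround from the blind optimization point above, a hedged sketch (abbreviated to two of the question's predicates; the point is that EXECUTE re-plans the statement for the actual parameter values):
-- inside the plpgsql function, replacing the static CREATE TEMP TABLE ... AS
EXECUTE format(
    'CREATE TEMP TABLE wo_tmp AS
     SELECT w.woid AS w_id, w.woid4seg AS w_id4seg
     FROM   common.workorders w
     WHERE  w.wostatus = %L
     AND    w.wonrkokili = %L',
    'F', p_nrkokili);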
For starters, don't use WITH (OIDS) for temporary tables. Ever. The use of OIDs in regular tables is discouraged; that goes doubly for temp tables. Dropping them also reduces the required RAM / space on disk, which is probably the main bottleneck here. Switch to WITHOUT OIDS.
Next, a likely cause (educated guess) is a lack of temp buffers, which forces the temp table to spill to disk. Check the actual size of the temp table with:
SELECT pg_size_pretty(pg_relation_size('wo_tmp'));
and set temp_buffers accordingly, possibly for the session only - round up generously, enough to avoid writing to disk.
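For instance, a session-local setting (a sketch; the 32MB figure is a guess rounded up from the few MB measured above, and temp_buffers must be set before the session touches any temporary table):
SET temp_buffers = '32MB';  -- takes effect only if no temp table has been accessed yet in this session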
Details:
How to delete duplicate entries?

Simple TSQL union view won't respect indexes of tables

I have 2 tables with identical structure & a view that unions them together. The view, however, isn't using the indexes on the underlying tables. Is there some way to make SQL Server respect the indexes without knowing which particular indexes will be needed by the caller? The plan for the view should be the same as the plan for a query that mimics the view's query...
Update: Even adding the query hint to use the index in the view's query does nothing!
Reproduce:
create TABLE Uno (
    Id int NOT NULL primary key identity(1,1),
    SomeInt int NULL,
    SomeOtherCrap varchar(max) null
)
CREATE NONCLUSTERED INDEX IX_SomeInt ON dbo.Uno (SomeInt)
create TABLE Dos (
    Id int NOT NULL primary key identity(1,1),
    SomeInt int NULL,
    SomeOtherCrap varchar(max) null
)
CREATE NONCLUSTERED INDEX IX_SomeInt2 ON dbo.Dos (SomeInt)
go
insert Uno values (1, 'some other crap')
go
declare @i int = 0;
while @i < 500000
begin
    insert Dos values (@i, 'a bunch of useless crap')
    set @i = @i + 1
end
go
update statistics Uno IX_SomeInt
update statistics Dos IX_SomeInt2
go
create View Both as
select * from Uno
union
select * from Dos
Query against view: (execution plan screenshot not included)
Query against individual tables: (execution plan screenshot not included)
The solution is to use the following view:
create View Both as
select * from Uno
union all
select * from Dos
Why does this work? As @BaconBits points out, when you do the following:
select *
from (select * from Uno
      union
      select * from Dos) as u
where SomeInt > 1
the optimizer can't tell that it can use the index on each table. However, UNION ALL works differently, and a good optimizer can leverage that. Since UNION ALL has no cohesion (nothing in one select will affect the other select), the optimizer knows it can split the analysis as if you were applying the WHERE clause to each select.
This optimization effect works within a view. Views are exactly the same as if you had typed out the whole query with a sub-select. They are useful for clarity, documentation, and security, but there is no difference between having a view and typing in the full select.
... except on systems where you can add indexes to views, but that is another story and out of scope for this question.
Your non-view query specifies something that an index covers, while no index covers the view. The view says "union these then filter". The query says "filter these two tables then union them".
Compare:
select * from Uno where SomeInt > 1
union
select * from Dos where SomeInt > 1
To:
select *
from (select * from Uno
      union
      select * from Dos) as u
where SomeInt > 1
How is the query engine supposed to know that those are equivalent? Once the query engine does the union, it can't really use either index since, at the very least, the implicit DISTINCT has changed which rows appear in the result set (some from Uno and some from Dos). Any information gained by scanning the index wouldn't deterministically determine which rows should appear in the result set, so no index can be used.

Using two different rows from the same table in an expression

I'm using PostgreSQL + PostGIS.
In one table I have a point and a line geometry in the same column, in different rows. To get the line I run:
SELECT the_geom
FROM filedata
WHERE id=3
If I want the point I run:
SELECT the_geom
FROM filedata
WHERE id=4
I want to use the point and the line together, as shown in this WITH expression, but using a real query against the table instead:
WITH data AS (
SELECT 'LINESTRING (50 40, 40 60, 50 90, 30 140)'::geometry AS road,
'POINT (60 110)'::geometry AS poi)
SELECT ST_AsText(
ST_Line_Interpolate_Point(road, ST_Line_Locate_Point(road, poi))) AS projected_poi
FROM data;
You can see that in this example the data comes from a hand-written WITH expression; I want to take it from my filedata table. My problem is that I don't know how to work with data from two different rows of one table at the same time.
One possible way:
A subquery to retrieve another value from a different row.
SELECT ST_AsText(
ST_Line_Interpolate_Point(
the_geom
,ST_Line_Locate_Point(
the_geom
,(SELECT the_geom FROM filedata WHERE id = 4)
)
)
) AS projected_poi
FROM filedata
WHERE id = 3;
Use a self-join:
SELECT ST_AsText(
    ST_Line_Interpolate_Point(
        fd_road.the_geom,
        ST_Line_Locate_Point(fd_road.the_geom, fd_poi.the_geom)
    )
) AS projected_poi
FROM filedata fd_road, filedata fd_poi
WHERE fd_road.id = 3 AND fd_poi.id = 4;
Alternatively, use a subquery to fetch the other row, as Erwin pointed out.
The main options for using multiple rows from one table in a single expression are:
Self-join the table with two different aliases as shown above, then filter the rows;
Use a subquery expression to get a value for all but one of the rows, as Erwin's answer shows;
Use a window function like lag() or lead() to get a row relative to the current row within the query result (see the sketch after this list); or
JOIN on a subquery that returns a table
The latter two are more advanced options that solve problems that are difficult or inefficient to solve with the simpler self-join or subquery expression.
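As an illustration of the window-function option, a minimal sketch against the filedata table from the question (ordering by id is an assumption):
SELECT id,
       the_geom,
       lag(the_geom) OVER (ORDER BY id) AS prev_geom  -- geometry from the previous row
FROM   filedata;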