Some time ago I resolved a PG-related problem with this SO question of mine.
Basically it's about using row_number over a partition in 8.4.
Sadly, I now have to create the same thing for 8.2, since one of my customers is on
8.2 and needs it desperately.
What I do now (on 8.4) is the following:
SELECT custId, custName, 'xyz-' || row_number() OVER (PARTITION by custId)
AS custCode
Basically, this counts the occurrences of custId and assigns custCodes from that.
(This is just an example to show what I do; of course the real query is far more complex.)
I looked at the solutions provided for the question mentioned above, but didn't get them
working, since there's one more hurdle to clear: I don't run SQL directly. I have to
embed it in an XML-based config file, which creates a certain XML format from the query
results. So creating temp objects or procedures is not really an option.
So here's the question: does anyone have an idea how to port that solution of
mine to PG 8.2?
TIA
K
Use depesz's solution: http://www.depesz.com/index.php/2007/08/17/rownum-anyone-cumulative-sum-in-one-query/
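For a query that has to live in a single statement, one commonly used alternative is a correlated COUNT(*) subquery that numbers rows within each custId. This is only a sketch and not necessarily what the linked post does. It assumes a hypothetical table cust with a unique id column to order by (the original query names no table and no ordering), and it runs against an in-memory SQLite database purely so that it is executable; the SQL itself uses no window functions, but I have not tested it on 8.2.

```python
import sqlite3

# Hypothetical table and data; "id" is an assumed unique column used for ordering.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cust (id INTEGER PRIMARY KEY, custId INTEGER, custName TEXT)"
)
conn.executemany(
    "INSERT INTO cust (id, custId, custName) VALUES (?, ?, ?)",
    [(1, 10, "Ann"), (2, 10, "Bob"), (3, 20, "Cid"), (4, 10, "Dee")],
)

# For each row, count the rows of the same custId with an id up to this one.
# That count plays the role of row_number() OVER (PARTITION BY custId ORDER BY id).
query = """
SELECT c1.custId, c1.custName,
       'xyz-' || (SELECT COUNT(*) FROM cust c2
                  WHERE c2.custId = c1.custId AND c2.id <= c1.id) AS custCode
FROM cust c1
ORDER BY c1.custId, c1.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The subquery is evaluated once per row, so on large partitions this can be slow; an index on (custId, id) would be the first thing to try.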
I have a DB2 data source and an Oracle 12c target.
The Oracle database has a DB link to the DB2 database defined, which is working in general.
Now I have a huge table in DB2 which has a timestamp column (let's call it ROW_CHANGED) for row changes. I want to retrieve rows which have changed after a particular time.
Running
SELECT * FROM lib.tbl WHERE ROW_CHANGED >'2016-08-01 10:00:00'
on DB2 returns exactly 1 row after about 90 seconds, which is fine.
Now I try the same query from Oracle via the DB link:
SELECT * FROM lib.tbl@dblink_name WHERE ROW_CHANGED >TO_TIMESTAMP('2016-08-01 10:00:00')
This runs for hours and ends in a timeout.
I read some Oracle docs and found distributed query optimization tips, but most of them refer to joining a local table to a remote table, which is not my case.
In my desperation, I tried the DRIVING_SITE hint, without effect.
Now I wonder when the WHERE part of the query is evaluated. Since I have to use Oracle syntax and not DB2 syntax for the query, is it possible that Oracle first tries to copy the full table and applies the WHERE clause afterwards? I did some research but did not find anything that would help me in this direction.
ROW_CHANGED is a hidden column in DB2, if that matters.
Thanks in advance for any hints.
Update
Thanks @all for the help. I'll share what did the trick for me.
First of all, I used TO_TIMESTAMP because the DB2 column is also a timestamp (not a date), and I expected this to avoid implicit conversions.
Without the explicit conversion I ran into ORA-28534: Heterogeneous Services preprocessing error, and I have no hope of touching the DB config within a reasonable time.
The explain plan, by the way, did not reveal much. It showed a FULL hint and no conversion on the predicates. It did show the ROW_CHANGED column as a date, though; I wonder why.
I tried Justin's suggestion to use a bind variable, but I got ORA-28534 again. The next thing I did was wrap it in a PL/SQL block (it will run in a stored procedure later anyway).
declare
v_tmstmp TIMESTAMP := '01.08.16 10:00:00';
begin
INSERT INTO ORAUSER.TMP_TBL (SRC_PK,ROW_CHANGED)
SELECT SRC_PK,ROW_CHANGED
FROM lib.tbl@dblink_name
WHERE ROW_CHANGED > v_tmstmp;
end;
This executed in about the same time as the query on DB2 itself. The date format is DD.MM.YY here because that is, unfortunately, the default.
When changing the variable assignment to
v_tmstmp TIMESTAMP := TO_TIMESTAMP('01.08.16 10:00:00','DD.MM.YY HH24:MI:SS');
I got the same problem as before.
Meanwhile, the DB2 operators have created an index on the ROW_CHANGED column, which I had requested earlier that day. This seems to have solved the problem in general. Even my original query finishes in no time now.
If you are actually using an Oracle-specific conversion function like to_timestamp, that forces the predicate to be evaluated on the Oracle side. Oracle isn't going to know how to convert a built-in function like to_timestamp into an exactly equivalent function call in DB2.
If you used a bind variable, that would be more likely to get evaluated on the DB2 side. But that may be complicated by the data type mapping between different databases-- there may not be a perfect mapping between one engine's date and another engine's timestamp data type. If this was a numeric column, a bind variable would be almost certain to get pushed. In this case, it probably involves playing around a bit to figure out exactly what data type to use for your variable that works for your framework, Oracle, and DB2.
If using a bind variable doesn't work, you can force the predicate to be evaluated on the remote server using the dbms_hs_passthrough package. That lets you send a query verbatim to the remote server which allows you to do things like use functions defined in your DB2 database. That's a bit of overkill in this situation, hopefully, but it's nice to have the hammer as your backup if the simpler solution doesn't work quickly enough.
Note that the PostgreSQL website mentions a limit on the number of columns per table of between 250 and 1600, depending on column types.
Scenario:
Say I have data in 17 tables each table having around 100 columns. All are joinable through primary keys. Would it be okay if I select all these columns in a single select statement? The query would be pretty complex but can be programmatically generated. The reason for doing this is to get denormalised data to populate a web page. Please do not ask why though :)
Quite obviously if I do create table table1 as (<the complex select statement>), I will be hitting the limit mentioned in the website. But do simple queries also face the same restriction?
I could probably find this out by doing the exercise myself. In the next few days I probably will. However, if someone has an idea about this and the problems I might face by doing a single query, please share the knowledge.
I can't find definitive documentation to back this up, but I have
received the following error using JDBC on PostgreSQL 9.1 before.
org.postgresql.util.PSQLException: ERROR: target lists can have at most 1664 entries
As I say though, I can't find the documentation for that so it may
vary by release.
I've found the confirmation. The maximum is 1664.
This is one of the metrics that is available for confirmation in the INFORMATION_SCHEMA.SQL_SIZING table.
SELECT * FROM INFORMATION_SCHEMA.SQL_SIZING
WHERE SIZING_NAME = 'MAXIMUM COLUMNS IN SELECT';
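To put the numbers from the question against that limit: 17 tables of about 100 columns each gives 1700 select-list entries, which is more than 1664. The sketch below does the arithmetic and splits a generated column list into batches that each fit. The table and column names (t1.c1 and so on) are made up, and splitting into several queries is only one possible workaround; each query would still need to repeat the key columns so the results can be matched up again.

```python
# 1664 is the target-list limit reported in the error message above;
# the 17 x 100 layout comes from the question. Names are hypothetical.
MAX_TARGET_LIST = 1664

columns = [f"t{t}.c{c}" for t in range(1, 18) for c in range(1, 101)]


def chunk(items, size):
    """Split items into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


batches = chunk(columns, MAX_TARGET_LIST)
print(len(columns), len(columns) > MAX_TARGET_LIST)  # 1700 True
print([len(batch) for batch in batches])  # [1664, 36]
```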
I'm trying to update a table (in pgsql) with a complex expression that needs to occur several times in the UPDATE statement. WITH seems perfect for this:
WITH newtz AS (SELECT timezone FROM timezonebyzipcode WHERE zip=(SELECT zip_code FROM company WHERE id=company_id))
UPDATE cross_rental
SET return_timezone=newtz
return_time=(return_time AT TIME ZONE return_timezone) AT TIME ZONE newtz
WHERE return_to='Vendor' AND return_timezone<>newtz
Unfortunately, it doesn't work:
ERROR: syntax error at or near "UPDATE"
LINE 2: UPDATE cross_rental
^
I've searched and couldn't find any examples of using WITH with UPDATE in this way, but I also don't see anything indicating it shouldn't work. Is this just unsupported, or am I making some silly mistake?
And, if it's unsupported, should I just copy that nasty long expression into each of the three places where I'm using "newtz" in the UPDATE clause? Or is there some better way to accomplish this update?
ERROR: syntax error at or near "UPDATE"
LINE 2: UPDATE cross_rental
This specific error message reveals you're using a PostgreSQL version 9.0 or older. The two major versions before 9.1 featured CTEs and WITH, but not in the context of data modifying queries.
Support for that appeared in 9.1. See section 7.8.2, "Data-Modifying Statements in WITH", in the PostgreSQL 9.1 documentation.
Assuming a newer version, a CTE must be used as a table with rows and columns (not as a scalar variable), so the query should be fixed as mentioned in Richard Huxton's answer.
It's a table (or a from-recordset-source anyway).
WITH calculated AS (
SELECT ....
)
UPDATE foo
SET bar = calculated.something
FROM calculated
WHERE ...
WITH (SELECT timezone FROM timezonebyzipcode WHERE zip=(SELECT zip_code FROM company WHERE id=company_id)) AS newtz
You have the alias and referent backwards...
I would like to run a query on a large table along the lines of:
SELECT DISTINCT user FROM tasks
WHERE ctime >= '2012-01-01' AND ctime < '2013-01-01' AND parent IS NULL;
There is already an index on tasks(ctime), but most (75%) of rows have a non-NULL parent, so that's not very effective.
I attempted to create a partial index for those rows:
CREATE INDEX CONCURRENTLY task_ctu_np ON tasks (ctime, user)
WHERE parent IS NULL;
but the query planner continues to choose the tasks(ctime) index instead of my partial index.
I'm using PostgreSQL 8.2 on the server, and my psql client is 8.1.
First, I second Richard's suggestion that upgrading should be at the top of your priorities. Partial indexes and related areas have, as I understand it, improved significantly since 8.2.
The second thing is that you really need the actual query plans with timing information (EXPLAIN ANALYZE), because without these we can't talk about selectivity, etc.
So my order of business, if I were you, would be to upgrade first and tune after that.
Now, I understand that 8.3 is a big upgrade (it is the only one that caused us issues in LedgerSMB). You may need some time to address that, but the alternative is to fall further behind and to keep asking questions about a version that fewer and fewer people remember well as time goes on.
I am currently using this SQLite query in my application. Two tables are used in this query:
UPDATE table1 set visited = (SELECT COUNT(DISTINCT table1.itemId) from 'table2' WHERE table2.itemId = table1.itemId AND table2.sessionId ='eyoge2avao');
It works correctly. My problem is that it takes around 10 seconds to execute this query and retrieve the result, and I don't know what to do. Almost all other processes run fine, so the problem seems to be how this query is formed.
Could someone please help me optimize this query?
Regards,
Brian
Make sure you have indexes on the following (combinations of) fields:
table1.itemId
(This will speed up the DISTINCT clause, since the itemId will already be in the correct order).
table2.itemId, table2.sessionId
This will speed up the WHERE clause of your SELECT statement.
How many rows are there in these tables?
Also try doing an EXPLAIN on your SELECT command to see if it gives you any helpful advice.
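As a rough sketch of that advice, the snippet below builds two tiny tables with the same column names as in the question, adds the composite index on table2, and asks SQLite for the query plan of the inner lookup. The schema, the sample rows and the index name idx_table2_item_session are assumptions, since the question does not show them. The UPDATE counts table2.itemId rather than table1.itemId; inside the subquery the two are equal, so the result is the same 0 or 1 per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and sample data; only the column names come from the question.
conn.executescript("""
CREATE TABLE table1 (itemId INTEGER PRIMARY KEY, visited INTEGER DEFAULT 0);
CREATE TABLE table2 (itemId INTEGER, sessionId TEXT);
INSERT INTO table1 (itemId) VALUES (1), (2), (3);
INSERT INTO table2 VALUES
    (1, 'eyoge2avao'), (1, 'eyoge2avao'), (2, 'other'), (3, 'eyoge2avao');
CREATE INDEX idx_table2_item_session ON table2 (itemId, sessionId);
""")

# Ask SQLite how it would run the per-row lookup; the last column of each
# plan row is the human-readable detail text.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT COUNT(*) FROM table2 WHERE itemId = ? AND sessionId = ?",
    (1, "eyoge2avao"),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)

# The UPDATE from the question, with the session id passed as a parameter.
conn.execute(
    """
    UPDATE table1
    SET visited = (SELECT COUNT(DISTINCT table2.itemId)
                   FROM table2
                   WHERE table2.itemId = table1.itemId
                     AND table2.sessionId = ?)
    """,
    ("eyoge2avao",),
)
result = conn.execute(
    "SELECT itemId, visited FROM table1 ORDER BY itemId"
).fetchall()
print(result)
```

If the plan text mentions the index rather than a full scan of table2, the lookup inside the UPDATE should no longer have to read the whole table for every row of table1.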