PostgreSQL update certain row - postgresql

I want to update verified from 'f' to 't' for certain rows, but the following statement updates all the rows:
UPDATE
news
SET
verified = 't'
FROM
(
SELECT
verified
, ROW_NUMBER() OVER () AS rownum
FROM
news
) AS foo
WHERE
rownum = 1 or rownum = 15 or rownum = 32 or rownum = 54;
Can someone tell me where the problem is? Thanks

From the PostgreSQL documentation for the UPDATE clause:
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table.
So, your query creates the cross product of the news table with the results of the SELECT clause, which of course is also the news table (which I will call "foo" because of your AS clause). Every single row from the news table is paired with every single row from the foo result table, creating n × n rows, where n is the number of rows in news. Because the WHERE clause filters only on rownum and never relates foo to news, every news row is paired with at least one foo row that passes the filter. Thus, every row of the news table is modified.
It's not clear to me what you are trying to do. (Can the OVER parameter be blank? And why select verified in the foo query?) But you may want to join each foo row to the proper news row in the WHERE clause, perhaps by primary key.
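As a minimal sketch of the join suggested above, assuming news has a primary key column named id (a hypothetical name, not taken from the question), the subquery can carry the key along so that each foo row is tied back to exactly one news row. The ORDER BY inside OVER is also an assumption: without it the numbering is arbitrary.

```sql
-- Sketch only: "id" is an assumed primary key column on news.
UPDATE news
SET verified = 't'
FROM (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY id) AS rownum  -- a defined order makes the numbering repeatable
    FROM news
) AS foo
WHERE news.id = foo.id                  -- ties each foo row to exactly one news row
  AND foo.rownum IN (1, 15, 32, 54);    -- only the chosen row numbers are updated
```

With the join condition in place, the FROM clause no longer produces a cross product, so only the four selected rows are touched.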

Related

how to sort a column in postgresql and set it to sorted values

I am new to PostgreSQL and would like to know how to set a column in a table to its sorted version. For example:
(table: t1, column: points)
5
7
3
9
8
4
(table: t1, column: points) // pls note it is sorted
3
4
5
7
8
9
My incorrect version:
UPDATE outputTable SET points_count = (SELECT points_count FROM outputTable ORDER BY points_count ASC)
Try with this :
UPDATE outputTable
SET points_count = s.points_count
FROM (SELECT points_count, ctid FROM outputTable ORDER BY points_count ASC) s
WHERE outputTable.ctid = s.ctid;
Because you are updating a table from a subquery on the same table, you need a row-level equality criterion such as ctid to match each row.
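Note that matching on ctid alone pairs every row with itself, so the statement above writes each row's existing value back. If the intent is to hand out the sorted values position by position, one possible sketch is to number the rows twice, once by ctid and once by points_count, and join the two numberings. The aliases target, sorted, row_id and pos are hypothetical, and physical order is not something PostgreSQL guarantees to preserve afterwards.

```sql
-- Sketch only: pairs the n-th row in ctid order with the n-th smallest value.
UPDATE outputTable AS o
SET points_count = sorted.points_count
FROM (
    SELECT ctid AS row_id,
           row_number() OVER (ORDER BY ctid) AS pos          -- position by current physical location
    FROM outputTable
) AS target
JOIN (
    SELECT points_count,
           row_number() OVER (ORDER BY points_count) AS pos  -- position in sorted order
    FROM outputTable
) AS sorted ON sorted.pos = target.pos
WHERE o.ctid = target.row_id;
```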
It seems like you want to sort the rows in a table.
Now this is normally a pointless exercise, since tables have no fixed order. In fact, any UPDATE can change the physical order of rows in a PostgreSQL table, because it writes a new version of each updated row.
The only way you can get a certain order in the result rows of a query is by using the ORDER BY clause, which will sort the rows regardless of their physical order in the table (which is not dependable, as mentioned above).
There is one use case for physically reordering a table: an index range scan using an index on points_count will be much more efficient if the table is physically sorted like the index. The reason is that far fewer table blocks will be accessed.
Therefore, there is a way to rewrite the table in a certain order as long as you have an index on the column:
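-- CLUSTER ... USING takes an index name, not a column name, so name the index explicitly
CREATE INDEX outputtable_points_count_idx ON outputtable (points_count);
CLUSTER outputtable USING outputtable_points_count_idx;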
But – as I said above – unless you plan a range scan on that index, the exercise is pointless.

ROWID equivalent in postgres 9.2

Is there any way to get the rowid of a record in Postgres?
In Oracle I can use something like
SELECT MAX(BILLS.ROWID) FROM BILLS
Yes, there is the ctid column, which is the equivalent of rowid. But it is useless for you: rowid and ctid are physical row/tuple identifiers, so they can change after a rebuild or vacuum.
See: Chapter 5. Data Definition > 5.4. System Columns
The PostgreSQL row_number() window function can be used for most purposes where you would use rowid. Whereas in Oracle the rowid is an intrinsic numbering of the result data rows, in Postgres row_number() computes a numbering within a logical ordering of the returned data. Normally if you want to number the rows, it means you expect them in a particular order, so you would specify which column(s) to order the rows when numbering them:
select client_name, row_number() over (order by date) from bills;
If you just want the rows numbered arbitrarily you can leave the over clause empty:
select client_name, row_number() over () from bills;
If you want to calculate an aggregate over the row number you'll have to use a subquery:
select max(rownum) from (
select row_number() over () as rownum from bills
) r;
If all you need is the last item from a table, and you have a column to sort sequentially, there's a simpler approach than using row_number(). Just reverse the sort order and select the first item:
select * from bills
order by date desc limit 1;
Use a Sequence. You can choose 4 or 8 byte values.
http://www.neilconway.org/docs/sequences/
Add any unique column to your table (named rowid, for example).
Prevent it from being changed by creating a BEFORE UPDATE trigger that raises an exception if someone tries to update it.
You may populate this column with a sequence, as @JohnMudd mentioned.
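The suggestion above could look roughly like the following in PostgreSQL. This is a sketch under assumptions: the table name bills comes from the question, while the sequence, constraint, function and trigger names are made up for illustration.

```sql
-- Sketch only: all object names except "bills" are hypothetical.
CREATE SEQUENCE bills_rowid_seq;

-- The volatile default is evaluated per row, so existing rows are numbered too.
ALTER TABLE bills
    ADD COLUMN rowid bigint NOT NULL DEFAULT nextval('bills_rowid_seq');
ALTER TABLE bills
    ADD CONSTRAINT bills_rowid_key UNIQUE (rowid);

-- Trigger function that rejects any attempt to change rowid.
CREATE FUNCTION bills_rowid_immutable() RETURNS trigger AS $$
BEGIN
    IF NEW.rowid IS DISTINCT FROM OLD.rowid THEN
        RAISE EXCEPTION 'rowid must not be changed';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER bills_rowid_no_update
    BEFORE UPDATE ON bills
    FOR EACH ROW EXECUTE PROCEDURE bills_rowid_immutable();
```

Unlike ctid, a column maintained this way stays stable across updates and vacuums, at the cost of one extra column and a trigger call per update.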

How to update a large table in optimized way using postgres

I am using PostgreSQL. I have a table with about 10 million records. I need to update a column of the table, say 'a', using a sequence. This column needs to be updated in the order of another column, say 'b'. So, for any two records r1 and r2, if the value of 'a' for r1 is less than the value of 'a' for r2, then the value of 'b' for r1 must be less than the value of 'b' for r2.
I am using something like this:
UPDATE table
SET col1 = nextval('myseq')
WHERE key IN (SELECT key
FROM table
ORDER BY col2);
key is the primary key of the table.
But it is taking too much time. Can anyone help me do this in an optimized way?
Thanks
Try something like:
UPDATE table t
SET col1 = t2.new_col1
FROM (SELECT t2.key, nextval('myseq') AS new_col1
      FROM table t2
      ORDER BY t2.col2) t2
WHERE t.key = t2.key;  -- the target table's alias is t
Or better something like:
UPDATE table t
SET col1 = t2.new_col1
FROM (SELECT t2.key,
             row_number() OVER (ORDER BY t2.col2) AS new_col1
      FROM table t2) t2
WHERE t.key = t2.key;  -- the target table's alias is t
Don't use update at all.
Use a SELECT INTO like this:
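-- List every column except the old col1 here; SELECT * would yield two columns named col1
SELECT key, col2, nextval('myseq') AS col1
INTO new_table
FROM
(
SELECT key, col2
FROM table
ORDER BY col2  -- number the rows in col2 order, as the question requires
) AS sorted;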
Then replace the old table with the new one. You'll have to re-create all your indexes, your primary key, and any other constraints.
Postgres doesn't overwrite a row in place when it updates it; it writes a new version of the row and marks the old one as dead. So millions of updates leave millions of dead row versions behind, which bloats the table and slows access until they are vacuumed. Replacing the whole table is usually your best option.
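The swap itself might be sketched as follows. The name my_table stands in for the real table (the question's placeholder name, table, is a reserved word) and my_table_old is likewise hypothetical. DDL is transactional in PostgreSQL, so the two renames can be made visible together.

```sql
-- Sketch only: my_table and my_table_old are placeholder names.
BEGIN;
ALTER TABLE my_table  RENAME TO my_table_old;
ALTER TABLE new_table RENAME TO my_table;
ALTER TABLE my_table  ADD PRIMARY KEY (key);  -- SELECT INTO does not copy constraints or indexes
COMMIT;

-- Once the new table has been checked, the old copy can be dropped.
DROP TABLE my_table_old;
```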

To delete records beyond 20 from a table

At any time, I want my table to display the latest 20 rows and delete the rest.
I tried rownum > 20, but it said "0 rows deleted" even when my table had 50 records. However, on trying rownum < 20, the first 19 records were deleted.
Please help.
ROWNUM is a pseudo-column which is assigned 1 for the first row produced by the query, 2 for the next, and so on. If you say "WHERE ROWNUM > 20", no row will ever match: the first candidate row would be assigned ROWNUM=1, the predicate rejects it, and because it was rejected the next candidate is again assigned ROWNUM=1. The counter never advances, so the query returns no rows.
If you want to query just the latest 20 rows, you'd need some way of determining what order they were inserted into the table. For example, if each row gets a timestamp when it is inserted, this would usually be pretty reliable (unless you get thousands of rows inserted every second).
For example, given a table defined as MYTABLE(ts TIMESTAMP, mycol NUMBER), you could query the latest 20 rows like this:
SELECT * FROM (
SELECT ts, mycol FROM MYTABLE ORDER BY ts DESC
)
WHERE ROWNUM <= 20;
Note that if two or more rows with exactly the same timestamp are tied for the 20th spot, this query picks among them non-deterministically.
If you have an index on ts it is likely to use the index to avoid a sort, and Oracle will use stopkey optimisation to halt the query once it's found the 20th row.
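For completeness, a filter such as "beyond the 20th row" can work once the ROWNUM value has been materialized under an alias in an inner query, because the outer WHERE then compares against an ordinary column. A sketch using the same MYTABLE example; the aliases sorted and rn are hypothetical:

```sql
-- Sketch only: returns every row after the 20 most recent ones.
SELECT ts, mycol
FROM (
    SELECT sorted.ts, sorted.mycol, ROWNUM AS rn  -- rn is fixed here, before the outer filter runs
    FROM (
        SELECT ts, mycol FROM MYTABLE ORDER BY ts DESC
    ) sorted
)
WHERE rn > 20;
```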
If you want to delete the older rows, you could do something like this, assuming mycol is unique:
DELETE MYTABLE
WHERE mycol NOT IN (
SELECT mycol FROM (
SELECT ts, mycol FROM MYTABLE ORDER BY ts DESC
)
WHERE ROWNUM <= 20
);
The performance of this delete, if the number of rows to be deleted is large, will probably be helped by an index on mycol.

Select only half the records

I am trying to figure out how to select half the records where an ID is null. I want half because I am going to use that result set to update another ID field. Then I am going to update the rest with another value for that ID field.
So essentially I want to set someFieldID to one value for half of the records and to another value for the rest.
In Oracle you can use the ROWNUM pseudocolumn. I believe in SQL Server you can use TOP.
Example:
select TOP 50 PERCENT * from table
You can select by percent:
SELECT TOP 50 PERCENT *fields* FROM YourTable WHERE ...
update x set id=@value from (select top 50 percent * from table where id is null) x
The following SQL will return the col_ids of the first half of the table.
SELECT col_id FROM table
WHERE rownum <= (SELECT count(col_id)/2 FROM table);
If the total number of col_ids is odd, you will get the smaller half. For instance, with 51 total records, count(col_id)/2 returns 25.5, and since no rownum equals 25.5, we get every row with rownum 25 and below. That means the other 26 are not returned.
However, I have not seen the reverse statement working:
SELECT col_id FROM table
WHERE rownum > (SELECT count(col_id)/2 FROM table);
So if you want the other half of the table, you could store the first results in a temp table, let's call it TABLE_A, and then subtract it from the original table with MINUS:
SELECT col_id FROM table
MINUS
SELECT col_id FROM table_a
Hopefully this helps someone.
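The reverse statement returns nothing because a ROWNUM > n predicate can never be satisfied by the first candidate row, so the counter never advances. A possible alternative that avoids the temp table is to number the rows in a subquery and filter on that number from outside. This is a sketch: my_table stands in for the real table name, and the aliases rn and total are hypothetical.

```sql
-- Sketch only: second half of the rows, ordered by col_id.
SELECT col_id
FROM (
    SELECT col_id,
           ROW_NUMBER() OVER (ORDER BY col_id) AS rn,  -- stable numbering by col_id
           COUNT(*) OVER () AS total                   -- total row count, repeated on every row
    FROM my_table
)
WHERE rn > total / 2;
```

Changing the final condition to rn <= total / 2 gives the complementary first half, so the two queries together cover every row exactly once.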