In oracle, Can i run index rebuild in parallel , all indexes belonging to same table.
We have a dyanmic sql procedure for rebuilding indexes, so can call that procedure in parallel for 3 indexes belonging to same table.
Regards,
Shi
I think it should cause no problems, you might want to check if the columns of the index overlap or not.
refer this for more info:
http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:2913600659112
Thanks
You must create indexes on existing table with data with PARALLEL ON and you can recreate them afterwards with NO-PARALLEL to avoid further processing degradation while insert data.
Related
Looking here, it is clear that Oracle supports execution of DDL commands in parallel with scenarios clearly listed. I was wondering whether Postgres does indeed offer such functionality? I can find a lot of material on "parallel queries" for PostgreSQL but not so much when DDL is involved.
For example, can I execute multiple 'CREATE TABLE...AS SELECT' in parallel? And if not, how can I achieve such functionality? What happens if I have a temporary table (CREATE TEMP TABLE)? Do I need to configure something for locks?
From here:
Even when it is in general possible for parallel query plans to be generated, the planner will not generate them for a given query if any of the following are true:
The query writes any data or locks any database rows. If a query contains a data-modifying operation either at the top level or within
a CTE, no parallel plans for that query will be generated.
(emphasis mine).
Which seems to suggest that Postgres will not "parallelize" any query that modifies the database structure, under any circumstances.
Running multiple queries simultaneously in Postgres requires one connection per running query.
Those are generic DDL statements, they are index operations and partition operations that can be parallelized.
If you check the Notes section of the CREATE INDEX statement, you'll see that parallel index building is supported :
PostgreSQL can build indexes while leveraging multiple CPUs in order to process the table rows faster. This feature is known as parallel index build. For index methods that support building indexes in parallel (currently, only B-tree), maintenance_work_mem specifies the maximum amount of memory that can be used by each index build operation as a whole, regardless of how many worker processes were started. Generally, a cost model automatically determines how many worker processes should be requested, if any.
Update
I suspect the real question is about CREATE TABLE ... AS though.
This is essentially a CREATE TABLE followed by an INSERT .. SELECT. The CREATE TABLE part can't be parallelized and doesn't have to - it's essentially a metadata operation. The SELECT on the other hand, could be parallelized easily. INSERT is a bit harder, but it's a matter of implementation.
As a_horse_with_no_name explains in a comment to this question, parallelization for CREATE TABLE AS was added in PostgreSQL 11 :
Improvements to parallelism, including:
CREATE INDEX can now use parallel processing while building a B-tree index
Parallelization is now possible in CREATE TABLE ... AS, CREATE MATERIALIZED VIEW, and certain queries using UNION
Parallelized hash joins and parallelized sequential scans now perform better
I have a table which index size is too big (about 2G). When I restore the database to a VM, the size is only 200M so I need to rebuild/recreate the index and I will probably do this online.
What is the difference between re-building (reindex) and re-creating the index, and which one is better when I do it online? Particularly, which option allows querying the DB during the operation?
The REINDEX command requires an exclusive table lock, which means that it'll stall any accesses to the table until the command has completed. If you can afford that kind of maintenance window it's perfectly fine.
The alternative for online rebuilding is to create a new index using CREATE INDEX CONCURRENTLY, then drop the old one. This will take longer to complete, but allows access to the table while rebuilding the index.
Postgres 12 has added a REINDEX INDEX CONCURRENTLY command, which does what you want here. https://paquier.xyz/postgresql-2/postgres-12-reindex-concurrently/ https://www.depesz.com/2019/03/29/waiting-for-postgresql-12-reindex-concurrently/
In PostgreSQL,
when I create a table, and doesn't create any index for it, will PostgreSQL automatically create some default index for the table?
When I later update and query the table several times, will PostgreSQL be smart enough to automatically create an index for me based on how and how often I update and query the table?
If not, what commands in PostgreSQL can help me manually choose an index that will improve the performance of the table?
Thanks.
No database engine will create indexes on its own. Indexes have an important impact on performance (when modifying the records), and it's your role to know and calculate the performance gain or drop to take a clever/informed decision. The only index which is automatically created is the PrimaryKey index.
The only thing your database engine will be "smart" about, is when and how to use the indexes which already exists. This is called the query optimizer, and it bases its decision on complex algorithms and internal statistics.
There are tools to analyze how the database works to suggest some indexes. But the best, and simplest way, is to use an EXPLAIN.
https://www.postgresql.org/docs/9.5/static/sql-explain.html
If I index a PostgreSQL table and then update it, do I need to re-index the table or is it automatically re-indexed?
Can someone provide a link to PostgreSQL documentation for further reading? I've got this so far:
https://www.postgresql.org/docs/9.1/static/sql-createindex.html
indexes in PostgreSQL do not need maintenance or tuning
You do not need to re-index manually.
For more details, please also read
https://www.postgresql.org/docs/current/static/monitoring-stats.html
From further reading in the PostgreSQL documentation:
Once an index is created, no further intervention is required: the
system will update the index when the table is modified, and it will
use the index in queries when it thinks doing so would be more
efficient than a sequential table scan. But you might have to run the
ANALYZE command regularly to update statistics to allow the query
planner to make educated decisions. See Chapter 14 for information
about how to find out whether an index is used and when and why the
planner might choose not to use an index.
See:
https://www.postgresql.org/docs/current/static/indexes-intro.html
I am working with a database of a million rows approx.. using python to parse documents and populate the table with terms.. The insert statements work fine but the update statements get extremely time consuming as the table size grows..
It would be great if some can explain this phenomenon and also tell me if there is a faster way to do updates.
Thanks,
Arnav
Sounds like you have an indexing problem. Whenever I hear about problems getting worse as table size grows, it makes me wonder if you're doing a table scan whenever you interact with a table.
Check to see if you have a primary key and meaningful indexes on that table. Look at the WHERE clause you have on that UPDATE and make sure there's an index on those columns to make finding that record as fast as possible.
UPDATE: Write a SELECT query using the WHERE clause you use to UPDATE and ask the database engine to EXPLAIN PLAN. If you see a TABLE SCAN, you'll know what to do.