How to add an index to CitusDB's cstore_fdw? - postgresql

I'm currently building an OLAP database in postgres and want to compare the performance of a column-store vs row-store database. CitusDB open-sourced its columnar-store extension cstore_fdw so I'm comparing database performance with and without this extension.
The example shows how to make a test db and query it. I have that example running. But when I try to add indices to it, I get the error ERROR: cannot create index on foreign table "table_name". It makes sense that I can't add indices to a foreign table. Yet I still need to index that table, or else there's no way it will do well at slicing or drilling into the data. How do I do this?

cstore_fdw currently doesn't support PostgreSQL indexes. However, it automatically stores min/max statistics in skip indexes, which make the execution of some queries much more efficient.
To learn more about how to use skip indexes, please consult the documentation.
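As a rough illustration of why min/max statistics help, here is a minimal Python sketch of block skipping. It is a conceptual model only, not cstore_fdw's actual implementation; the names Block, build_blocks and range_scan and the block size are made up for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A run of column values plus the min/max recorded for it (hypothetical model)."""
    values: List[int]
    minimum: int
    maximum: int


def build_blocks(column: List[int], block_size: int) -> List[Block]:
    """Split a column into fixed-size blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(column), block_size):
        chunk = column[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def range_scan(blocks: List[Block], low: int, high: int) -> Tuple[List[int], int]:
    """Return the values in [low, high] and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Skip the block when its min/max range cannot overlap the filter.
        if block.maximum < low or block.minimum > high:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if low <= v <= high)
    return matches, blocks_read


if __name__ == "__main__":
    column = list(range(1, 101))  # sorted sample data, made up for the demo
    blocks = build_blocks(column, 10)
    rows, read = range_scan(blocks, 42, 47)
    print(rows, "blocks read:", read, "of", len(blocks))
```

In this model, skipping only pays off when each block covers a narrow range of values, so it tends to work best when the data is loaded sorted (or roughly clustered) on the column you filter by.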

Related

Does PostgreSQL automatically create an index of a table?

In PostgreSQL,
when I create a table and don't create any index for it, will PostgreSQL automatically create some default index for the table?
When I later update and query the table several times, will PostgreSQL be smart enough to automatically create an index for me based on how and how often I update and query the table?
If not, what commands in PostgreSQL can help me manually choose an index that will improve the performance of the table?
Thanks.
No database engine will create indexes on its own based on your workload. Indexes have a significant performance cost when records are modified, so it is up to you to weigh the gain against that cost and make an informed decision. The only indexes PostgreSQL creates automatically are the ones backing PRIMARY KEY and UNIQUE constraints.
The only thing your database engine will be "smart" about is when and how to use the indexes that already exist. This is the job of the query optimizer, which bases its decisions on complex algorithms and internal statistics.
There are tools that analyze how the database is used and suggest indexes. But the best and simplest way is to use EXPLAIN.
https://www.postgresql.org/docs/9.5/static/sql-explain.html

Getting most used queries in mongodb

I'd like to analyze our db and create better indices for it.
Because our app is very complex and we don't know which parts of it are used most, I'd like to somehow see which read queries we hit our db with most often.
That would make it very easy for me to analyze and create the right indices for them.
Any ideas on how to do that?
You can enable database profiling for this.
Details are here: https://docs.mongodb.com/v3.2/tutorial/manage-the-database-profiler/
Alternatively, a simpler way would be to use mongostat (details here: https://docs.mongodb.com/v3.2/administration/monitoring/), which captures and returns the counts of database operations by type (e.g. insert, query, update, delete).
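Once profiling data has been collected, the entries still need to be grouped to find the most frequent reads. The following Python sketch shows one way to do that. It assumes each entry has already been reduced to a plain dict with "op", "ns" and a "filter" key; that simplified record shape is an assumption made for this example (real profiler documents nest the filter differently depending on the MongoDB version), and shape_of and most_used_reads are hypothetical helper names.

```python
import json
from collections import Counter
from typing import Any, Dict, Iterable, List, Tuple


def shape_of(value: Any) -> Any:
    """Replace literal values with "?" so queries differing only in values match."""
    if isinstance(value, dict):
        return {key: shape_of(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [shape_of(item) for item in value]
    return "?"


def most_used_reads(
    entries: Iterable[Dict[str, Any]], top: int = 5
) -> List[Tuple[Tuple[str, str], int]]:
    """Count read operations per (namespace, query shape), most frequent first."""
    counts: Counter = Counter()
    for entry in entries:
        # Only reads are of interest here; other operation types are ignored.
        if entry.get("op") != "query":
            continue
        shape = json.dumps(shape_of(entry.get("filter", {})), sort_keys=True)
        counts[(entry["ns"], shape)] += 1
    return counts.most_common(top)


if __name__ == "__main__":
    # Made-up sample entries in the simplified shape assumed above.
    sample = [
        {"op": "query", "ns": "app.users", "filter": {"email": "a@example.com"}},
        {"op": "query", "ns": "app.users", "filter": {"email": "b@example.com"}},
        {"op": "insert", "ns": "app.users"},
    ]
    for (namespace, shape), count in most_used_reads(sample):
        print(count, namespace, shape)
```

The fields that appear in the most frequent shapes are the natural candidates for indexes.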

Are PostgreSQL tables automatically re-indexed after an update?

If I index a PostgreSQL table and then update it, do I need to re-index the table or is it automatically re-indexed?
Can someone provide a link to PostgreSQL documentation for further reading? I've got this so far:
https://www.postgresql.org/docs/9.1/static/sql-createindex.html
indexes in PostgreSQL do not need maintenance or tuning
You do not need to re-index manually.
For more details, please also read
https://www.postgresql.org/docs/current/static/monitoring-stats.html
From further reading in the PostgreSQL documentation:
Once an index is created, no further intervention is required: the
system will update the index when the table is modified, and it will
use the index in queries when it thinks doing so would be more
efficient than a sequential table scan. But you might have to run the
ANALYZE command regularly to update statistics to allow the query
planner to make educated decisions. See Chapter 14 for information
about how to find out whether an index is used and when and why the
planner might choose not to use an index.
See:
https://www.postgresql.org/docs/current/static/indexes-intro.html

Perl: Programmatically drop PostgreSQL table index then re-create after COPY using DBD::Pg

I'm copying several tables (~1.5M records) from one data source to another, but it is taking a long time. I'm looking to speed up my use of DBD::Pg.
I'm currently using pg_getcopydata/pg_putcopydata, but I suppose that the indexes on the destination tables are slowing the process down.
I found that I can get some information on a table's indexes using $dbh->statistics_info, but I'm curious if anyone has a programmatic way to dynamically drop/recreate indexes based on this information.
The programmatic way, I guess, is to submit the appropriate CREATE INDEX SQL statements via DBI that you would enter into psql.
Sometimes when copying a large table it's better to do it in this order:
create table without indexes
copy data
add indexes
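For a table that already exists with indexes, the same order can be sketched as a small planning helper. This is written in Python rather than Perl to keep it short; the resulting statements could just as well be sent through DBI's $dbh->do. The helper name bulk_load_plan is hypothetical, and the sketch assumes you have already read each index's name and full definition, for example from the indexname and indexdef columns of PostgreSQL's pg_indexes view.

```python
from typing import Dict, List


def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, escaping embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def bulk_load_plan(table: str, indexes: Dict[str, str], copy_sql: str) -> List[str]:
    """Build the statement order: drop indexes, load the data, recreate indexes.

    `indexes` maps index name -> full CREATE INDEX statement (assumed to come
    from pg_indexes.indexname / pg_indexes.indexdef).
    """
    plan = [f"DROP INDEX IF EXISTS {quote_ident(name)}" for name in indexes]
    plan.append(copy_sql)            # the bulk load runs with no indexes in place
    plan.extend(indexes.values())    # recreate every index from its saved definition
    plan.append(f"ANALYZE {quote_ident(table)}")  # refresh planner statistics
    return plan


if __name__ == "__main__":
    # Made-up table and index names for illustration.
    saved = {"terms_word_idx": "CREATE INDEX terms_word_idx ON terms (word)"}
    for statement in bulk_load_plan("terms", saved, "COPY terms FROM STDIN"):
        print(statement)
```

Note that indexes backing PRIMARY KEY or UNIQUE constraints cannot be removed with DROP INDEX; those would have to be handled by dropping and re-adding the constraint, which this sketch does not attempt. It also ignores schema-qualified names.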

UPDATE vs INSERT statement performance in PostgreSQL

I am working with a database of approximately a million rows, using Python to parse documents and populate the table with terms. The insert statements work fine, but the update statements get extremely time-consuming as the table size grows.
It would be great if someone could explain this phenomenon and also tell me if there is a faster way to do updates.
Thanks,
Arnav
Sounds like you have an indexing problem. Whenever I hear about problems getting worse as table size grows, it makes me wonder if you're doing a table scan whenever you interact with a table.
Check to see if you have a primary key and meaningful indexes on that table. Look at the WHERE clause you have on that UPDATE and make sure there's an index on those columns to make finding that record as fast as possible.
UPDATE: Write a SELECT query using the WHERE clause you use in the UPDATE and ask the database engine to EXPLAIN it. If you see a full table scan (a Seq Scan in PostgreSQL's output), you'll know what to do.
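To make that check concrete, here is a small self-contained Python sketch. It uses SQLite as a stand-in because it ships with Python and needs no server; SQLite's EXPLAIN QUERY PLAN output is worded differently from PostgreSQL's EXPLAIN, but the before/after comparison is the same idea. The table, column and index names are made up for the example.

```python
import sqlite3


def plan_for(conn: sqlite3.Connection, query: str, params: tuple = ()) -> str:
    """Return the human-readable part of SQLite's EXPLAIN QUERY PLAN output."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + query, params).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE terms (doc_id INTEGER, term TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO terms VALUES (?, ?, ?)",
    [(i, f"term{i}", 0) for i in range(1000)],
)

# The same WHERE clause the UPDATE would use, written as a SELECT.
query = "SELECT * FROM terms WHERE term = ?"

before = plan_for(conn, query, ("term500",))   # no index yet: full scan
conn.execute("CREATE INDEX terms_term_idx ON terms (term)")
after = plan_for(conn, query, ("term500",))    # index now available

print("before:", before)
print("after: ", after)
```

Before the index exists the plan reports a scan of the whole table; afterwards it reports a search using terms_term_idx. With PostgreSQL you would run EXPLAIN on the SELECT in psql and look for Seq Scan versus Index Scan instead.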