When I add a lot of records into JavaDB, it becomes very sluggish - javadb

Do I have to index it? But how do I keep an index consistent with the main table? I heard about re-indexing, but when is a re-index appropriate?
Thanks
Jack

What is becoming sluggish? The inserts? In that case, adding an index is probably not going to help. You could look up bulk insert in the docs.
If queries against the large table are getting slow, an index may help. Whether, and how much, it helps depends on how your table is defined and how you access it.
To create an index, consult the CREATE INDEX statement in the reference manual. The index is kept consistent with the base table automatically.
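For example, if your queries filter on one particular column, a minimal sketch of such an index (the table and column names here are made up for illustration) would be:
CREATE INDEX ORDERS_CUSTOMER_IDX ON ORDERS (CUSTOMER_ID);
JavaDB (Derby) maintains the index on every insert, update and delete, so there is no separate re-indexing step to schedule.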

Related

Should I create an index on a column with repetitive values and where in lookup

I have a materialized view (which is essentially a table) against which I need to make WHERE ... IN kind of queries.
The column I want to query (say view_id) definitely has repeated values (15-20 repetitions each).
The WHERE ... IN queries would also be very large, i.e. they would contain a lot of view_id values to match.
Should I go ahead and create an index on this column?
Will it give me some performance improvements?
I have another column which would help me build a multi-column (unique) index. Would that be a better option?
With questions such as these on performance, there is no substitute for testing it with your exact case. There's little harm in trying it out (even on a production system, but utilize a test system if you can!), other than perhaps slowing performance until you undo what you did. Postgres makes this kind of tinkering safe.
@tim-biegeleisen's first comment is spot on: with your setup, your cardinality is reduced, but that doesn't mean it's not a win.
In short, try it and see. There is no better answer you will get than what your own dataset and access patterns will give you.
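A minimal way to try it, assuming hypothetical names for the materialized view and its column:
CREATE INDEX mv_view_id_idx ON my_matview (view_id);
EXPLAIN ANALYZE SELECT * FROM my_matview WHERE view_id IN (101, 102, 103);
DROP INDEX mv_view_id_idx; -- easy to undo if it does not help
Comparing the EXPLAIN ANALYZE output with and without the index (and again with a two-column unique index) shows directly whether the planner uses it and how much time it saves.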

postgres many tables vs one huge table

I am using a PostgreSQL database.
My application manages many objects of the same type.
For each object the application performs intense writing: each object has a row inserted at least once every 30 seconds. I also need to retrieve the data by object id.
My question is how best to design the database: use one huge table for all the objects (slower inserts) or one table per object (more complicated retrievals)?
Tables are meant to hold a huge number of objects of the same type. So your second option, one table per object, doesn't seem right. But of course, more information is needed.
My tip: start with one table. If you run into problems - mainly performance - try to split it up. It's not that hard.
Logically, you should use one table.
However, the so-called "write amplification" problem exhibited by PostgreSQL seems to have been one of the main reasons why Uber switched from PostgreSQL to MySQL. Quote:
"For tables with a large number of secondary indexes, these
superfluous steps can cause enormous inefficiencies. For instance, if
we have a table with a dozen indexes defined on it, an update to a
field that is only covered by a single index must be propagated into
all 12 indexes to reflect the ctid for the new row."
Whether this is a problem for your workload, only measurement can tell - I'd recommend starting with one table, measuring performance, and then switching to multi-table (or partitioning, or perhaps switching the DBMS altogether) only if the measurements justify it.
A single table is probably the best solution if you are certain that all objects will continue to have the same attributes.
INSERT does not get significantly slower as the table grows – it is the number of indexes that slows down data modification.
I'd rather be worried about data growth. Do you have a design for getting rid of old data? Big DELETEs can be painful; sometimes partitioning helps.
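As a hedged sketch of what that could look like with declarative partitioning (PostgreSQL 10 or later; the table and column names are hypothetical):

CREATE TABLE readings (
    object_id   bigint      NOT NULL,
    recorded_at timestamptz NOT NULL,
    payload     text
) PARTITION BY RANGE (recorded_at);

CREATE TABLE readings_2024_01 PARTITION OF readings
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

-- Getting rid of a month of old data is then a cheap metadata operation:
DROP TABLE readings_2024_01;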

Performance Tuning: Create index for boolean column

I have written a daemon process which fetches rows from one database and inserts them into another for synchronization. It fetches rows based on a boolean indicator flag, sync_done.
My table has hundreds of thousands of rows. When I select all rows where sync_done is false, will it cause any database performance issues? Should I apply indexing to that sync_done column to improve performance, since only rows with a sync_done value of false are fetched?
Say, I have 10000 rows. Of those, 9500 have already been synchronized (sync_done is true) and will not be selected.
Please suggest how I might proceed.
For a query like this, a partial index covering only unsynced rows would serve best.
CREATE INDEX ON tbl (id) WHERE sync_done = FALSE;
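A query of this shape (the table and flag names are assumed to match yours) can then be answered from that much smaller index:
SELECT id FROM tbl WHERE sync_done = FALSE;
Once the daemon sets sync_done = TRUE for a row, the row no longer satisfies the index predicate, so the index stays small.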
However, for a use case like this, other synchronization methods may be preferable to begin with:
Have a look at LISTEN / NOTIFY (see the sketch after this list of options).
Or use a trigger in combination with dblink or a foreign data wrapper like postgres_fdw (preferably).
Or one of the many available replication methods.
Streaming Replication was added with Postgres 9.0 and has become increasingly popular.
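A minimal LISTEN / NOTIFY sketch, assuming a hypothetical channel name and using the tbl table from above:

-- In the source database: raise a notification for every new row to sync
CREATE OR REPLACE FUNCTION notify_sync() RETURNS trigger AS $$
BEGIN
    PERFORM pg_notify('sync_channel', NEW.id::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tbl_notify_sync AFTER INSERT ON tbl
    FOR EACH ROW EXECUTE PROCEDURE notify_sync();

-- The daemon issues LISTEN sync_channel; and reacts to notifications
-- instead of polling with SELECT.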
I suggest that you do not index the table (the boolean is a low cardinality field), but partition it instead on the boolean value.
See: http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
A table partitioned on that boolean field should be the way to do it.
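That page describes the older inheritance-based partitioning; with current declarative partitioning the same idea could be sketched roughly like this (names are hypothetical, and moving rows between partitions on update requires PostgreSQL 11 or later):

CREATE TABLE sync_queue (
    id        bigint,
    sync_done boolean NOT NULL,
    payload   text
) PARTITION BY LIST (sync_done);

CREATE TABLE sync_queue_pending PARTITION OF sync_queue FOR VALUES IN (FALSE);
CREATE TABLE sync_queue_done    PARTITION OF sync_queue FOR VALUES IN (TRUE);

-- The daemon then scans only the small "pending" partition:
SELECT * FROM sync_queue WHERE sync_done = FALSE;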
Here is something which I believe might help you...
Bitmap Index
Alternative of Bitmap Index in PostgreSQL
An index will certainly help, but rather than polling, which can impose load and concurrency issues if your database is heavily used, it might be worth considering a notification method such as AMQP, or a trigger/database-queue-based approach like Slony or Skytools Londiste.
I have used both Slony and Londiste for trigger-based replication and have found both excellent. My preference is for Londiste, as it is much simpler to set up and manage (and if you have a simple use case, stick to the older 2. branch).

How to make substring-matching query work fast on a large table?

I have a large table with a text field, and want to make queries to this table to find records that contain a given substring, using ILIKE. It works perfectly on small tables, but in my case it is a rather time-consuming operation, and I need it to work fast, because I use it in a live-search field on my website. Any ideas would be appreciated...
Check the "Waiting for 9.1 – Faster LIKE/ILIKE" blog post from depesz for a solution using trigrams.
You'd need to use the as-yet-unreleased PostgreSQL 9.1 for this. And your writes would be much slower then, as trigram indexes are huge.
Full text search suggested by user12861 would help only if you're searching for words, not substrings.
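Once 9.1 (or later) is available, the trigram approach amounts to something like this (assuming a hypothetical table articles with a text column body):
CREATE EXTENSION pg_trgm;
CREATE INDEX articles_body_trgm_idx ON articles USING gin (body gin_trgm_ops);
-- ILIKE '%substring%' searches on body can then use this index:
SELECT * FROM articles WHERE body ILIKE '%needle%';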
You probably want to look into full text indexing. It's a bit complicated, maybe someone else can give a better description, or you might try some links, like this one for example:
http://wiki.postgresql.org/wiki/Full_Text_Indexing_with_PostgreSQL
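In current PostgreSQL terms, the basic setup from that page looks roughly like this (table and column names are assumed):
CREATE INDEX articles_fts_idx ON articles USING gin (to_tsvector('english', body));
SELECT * FROM articles WHERE to_tsvector('english', body) @@ to_tsquery('english', 'word');
As noted above, this matches whole words (lexemes), not arbitrary substrings.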

Slow SQLite access on iPhone

I have quite slow data retrieval from an SQLite database on my iPhone, and perhaps someone has an alternative idea to explain this. From what I have tracked down so far, sqlite3_step(statement) is sometimes unusually slow. While retrieving e.g. 50 rows from the database, executing this step normally takes a few milliseconds, but sometimes it takes several seconds.
My database is not small (80 MB) and my theory is that the reason is paging. But can someone think of another reason for this?
Do you have a proper index on that table? Queries can be very slow if a full table scan is required to perform your query. See this page for example for some guidance on how to optimize your SQLite queries.
You may simply need to add an index to your table. Don't forget to reindex the table after you add the index (you'll need a tool; there's a free Firefox add-on called "SQLite Manager" that does a pretty decent job of this).
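As a sketch only, since your actual table and query are unknown: an index matching the columns in your WHERE clause (or JOIN) is what avoids the full table scan, e.g.
CREATE INDEX items_category_idx ON items (category);
-- Verify the index is used instead of a full scan:
EXPLAIN QUERY PLAN SELECT * FROM items WHERE category = 'books';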