How to get total unused pages from firebird nbackup - firebird

I'm using the Firebird nbackup method to back up my database with this command:
nbackup -u systemdab -p masterskeys -B 0 DB01.FDB DB01.NBK
After the backup completes, the backup file is smaller than the original database. This might be caused by the removal of unused pages (I'm new to Firebird, please correct me if I'm wrong).
Is there any way to prevent unused pages from being removed, or any way to get the number of unused pages in my database?
Please let me know if you have any idea. Thanks.

If GUI tools are okay, you may install the free edition of FirstAID from https://ib-aid.com/en/ibsurgeon-firstaid/
After you "open corrupt database" (make sure the Firebird server itself does not have it open at that moment), the "Pages Summary" tab will show you the count and percentage of "free (unused) pages".
For the command line there is the isql tool, which, while named Interactive SQL, has options to run from a pre-made script: https://firebirdsql.org/manual/isql.html
Starting with Firebird 3, its SHOW DATABASE command includes an unused-page counter: http://tracker.firebirdsql.org/browse/CORE-5063
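A minimal sketch of a non-interactive run (using the user, password and file name from the question; yours will differ):
echo "SHOW DATABASE; EXIT;" | isql -user systemdab -password masterskeys DB01.FDB
The relevant part of the output looks like this: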
PAGE_SIZE 8192
Number of DB pages allocated = 2095520
Number of DB pages used = 2083825
Number of DB pages free = 11695
As for forcing the incremental backup tool to also copy garbage pages: I cannot even imagine why anyone would want that.

Related

DB2 append backups into one unique db

I am using restore db to import a backup into a test environment database, and that works fine, but I need to extend this import process to several backups from several dates into a single test environment db... What is the command to append backups into a single database?
thanks
Phil
You write "I get only full offline database backup files from several time date (2 weeks of production) that I need to restore in a test database for analysis... example 4 backups files of 2 weeks of data = 2 months of data ...."
and you also write "What is the command to append backups into a unique database..."
While Db2-LUW has no explicit method for combining full offline backup images, there's always another way to get what you need... given the right skills and tools.
If you have a FULL backup image, it can either be restored to a new database or fully overwrite an existing database. If you have 4 FULL backup images, each can be restored into its own (uniquely named) database, or each can overwrite one of 4 existing databases.
You can also restore specific tablespaces from a backup image, if properly configured. Some sites have designed discrete tablespaces for specific time periods (one per day/week/month) to help with such activities. Some sites have designed their tables to be range partitioned, with each time period having its own partition (and sometimes dedicated tablespaces), which makes subsequent merging of content easier with the right skills.
If you are competent with scripting, you can restore the first (earliest) image, export the relevant table contents to flat-files, restore the next backup image and export the relevant tables to new flat-files (repeat as needed), then load these flat-files into a table for analysis. If your database size is small then this can be considered a keep-it-simple approach.
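A rough sketch of that loop with the Db2 command line (database names, timestamp, file paths and table names are placeholders; credentials and EXPORT/LOAD options are omitted):
# restore the earliest image into a scratch database and export the table of interest
db2 "restore database PRODDB from /backups taken at 20240101120000 into SCRATCHDB"
db2 connect to SCRATCHDB
db2 "export to /tmp/orders_20240101.del of del select * from app.orders"
db2 connect reset
# repeat the restore/export for each backup image, then load every file into the analysis table
db2 connect to ANALYSISDB
db2 "load from /tmp/orders_20240101.del of del insert into analysis.orders"
db2 connect reset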
You can also do clever things with federation if you restore to discrete databases.
Separately purchasable tools exist to let you extract selected content from a backup image (which can then be loaded into a Db2 database), without needing to do a restore action. These are not included with the Db2 product. So you could extract specific table contents from a backup image if you pay for the right tools and learn how to use them. Speak with your IBM Salesperson. Such tools may require currently supported versions of Db2 however.

CKAN - file size appearing null in Postgres database

I've installed CKAN on Ubuntu using the Postgres database. Some uploaded files are showing as having null size in the database (Resource table). However I can download the file from the URL and it is a couple of MB. Does anyone know why the size value is not showing?
After loading a large amount of data, you should always run VACUUM ANALYZE to make sure that the system knows about the new data. VACUUM ANALYZE will update statistics and that is reflected in the system catalogs. Those statistics are also used for query optimizations so it is crucial that you do the VACUUM as soon as possible after loading the data.
Vacuuming takes place automatically but on a schedule, so unless you do a manual vacuum it takes a while before the new data is properly accounted for in the system catalogs.
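For example, run in psql (the table name here is only a guess at CKAN's resource table; adjust it to your schema, or simply vacuum the whole database):
VACUUM ANALYZE resource;
-- or, for the whole database:
VACUUM ANALYZE;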
CKAN actually doesn't store the file size yet. The old filestore was incapable of doing this. From CKAN 2.2 and above, the new filestore is able to do this and we're tracking this in Issue #1517. I'll probably fix it myself when I get a chance.

postgres copy database to another server reduces database size

PostgreSQL 9.1 is installed on both machines.
Initially the DB size is 7052 MB; then I used the following command to copy it to another server:
pg_dump -C dbname | bzip2 | ssh remoteuser@remotehost "bunzip2 | psql dbname"
After the copy completed successfully, I checked the size on the destination machine and it shows 6653 MB.
Then I checked the table count; it is the same.
Has there been data loss? Is there missing data?
Note:
The two machines have the same hardware and software configuration.
I used:
SELECT pg_size_pretty(pg_database_size('dbname'));
One of PostgreSQL's most sophisticated features is the so-called Multi-Version Concurrency Control (MVCC), a standard technique for avoiding conflicts between reads and writes of the same object in a database. MVCC guarantees that each transaction sees a consistent view of the database by reading non-current data for objects modified by concurrent transactions. Thanks to MVCC, PostgreSQL has great scalability, a robust hot backup tool and many other nice features comparable to the most advanced commercial databases.
Unfortunately, there is one downside to MVCC: databases tend to grow over time, and sometimes that can be a problem. In recent versions of PostgreSQL there is a separate server process called the autovacuum daemon (pg_autovacuum), whose purpose is to keep the database size reasonable. It does that by trying to recover reusable chunks of the database files. Still, there are many scenarios that will force the database to grow, even if the amount of useful data in it doesn't really change. That happens typically if you have lots of UPDATE and/or DELETE statements in the applications that are using the database.
When you do a COPY, you recover extraneous space and so your copied DB appears smaller.
That looks normal. Databases are often smaller after restore, because a newly created b-tree index is more compact than one that's been progressively built by inserts. Additionally, UPDATEs and DELETEs leave empty space in the tables.
So you have nothing to worry about. You'll find that if you diff an SQL dump from the old DB and a dump taken from the just-restored DB, they'll be the same except for comments.
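If you want to see how much of the difference on the source server is reclaimable dead-row space, a quick check against the statistics views helps (a sketch; run it on the original database):
SELECT relname, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;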

Esent database engine limited to specific page sizes?

I had a problem opening an ESENT database (Windows.edb) due to some problem with the page size. The page size of the Windows.edb on my system is 32K. When I set this via JET_paramDatabasePageSize, JetInit would return error -1213 (the database page size does not match the engine). Laurion Burchall suggested turning off JET_paramRecovery since I only need read-only access to the database. That solved my problem.
Until now, that is. I have a database that was not shut down cleanly. I assumed that, with JET_paramRecovery=On, JetInit would automagically do the recovery and let me read the database. But if I try that, I get that old -1213 error.
Now I can fix my file with ESENTUTL, but an ordinary user of my app won't be able to. Is there some way to have recovery on and still be able to define ANY DatabasePageSize? There are no log files present at the location of the database (and I set the log path to the same directory to make sure they aren't written anywhere else).
Does this mean that the engine on my machine does not support the page size of the database? Or could I solve the problem by setting another magic switch?
Running recovery on another application's database is tricky. ESENT is an embedded engine and each application can have its own settings. Before you run recovery you need to know:
Where the logfiles are located (JET_paramLogFilePath)
The logfile size (JET_paramLogFileSize)
The database page size (JET_paramDatabasePageSize)
The logfile basename (JET_paramBaseName)
If you set all those parameters correctly, recovery will work properly. If you don't, the other application may have problems recovering its database afterwards!
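A minimal sketch of that setup in code (all paths, sizes and names below are assumptions for illustration; they must match whatever the application that owns the database actually used; link with esent.lib):
#include <windows.h>
#include <esent.h>

int main(void)
{
    JET_INSTANCE instance = 0;
    JET_ERR err;

    JetCreateInstance(&instance, "recovery-example");
    // The page size must match the database (32K in the question's case).
    JetSetSystemParameter(&instance, JET_sesidNil, JET_paramDatabasePageSize, 32 * 1024, NULL);
    // Logfile/checkpoint location, logfile size (in KB) and basename must match
    // whatever the owning application used.
    JetSetSystemParameter(&instance, JET_sesidNil, JET_paramLogFilePath, 0, "C:\\SomeApp\\Logs\\");
    JetSetSystemParameter(&instance, JET_sesidNil, JET_paramSystemPath, 0, "C:\\SomeApp\\Logs\\");
    JetSetSystemParameter(&instance, JET_sesidNil, JET_paramLogFileSize, 10240, NULL);
    JetSetSystemParameter(&instance, JET_sesidNil, JET_paramBaseName, 0, "edb");
    // With matching parameters, JetInit replays the logs and brings the database
    // to a clean state; real code should check err after every call.
    err = JetInit(&instance);
    JetTerm(instance);
    return err;
}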
There is a simple (but hacky) way to "fix" an EDB database which wasn't shut down gracefully. There is a state flag in the header at offset 52. It's a 4-byte integer which should be set to 3 (if the database was not closed gracefully, the value you will find there is probably 2).
You will probably need to repeat this change on the 2nd database page, which contains a copy of the database header. You can find that page simply by seeking ahead by the page size of the database (usually 4096, 8192, etc.).
As this is really a hack, you should use it at your own RISK!
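Either way, you can check the shutdown state before and after with the built-in header dump (the file name is just an example):
esentutl /mh Windows.edb
Look for the "State:" line, which reads "Dirty Shutdown" or "Clean Shutdown".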

Is there a PostgreSQL equivalent of SQL Server profiler?

I need to see the queries submitted to a PostgreSQL server. Normally I would use SQL Server Profiler to perform this action in SQL Server land, but I'm yet to find how to do this in PostgreSQL. There appear to be quite a few pay-for tools; I am hoping there is an open-source variant.
You can use the log_statement config setting to get a list of all the queries sent to a server:
https://www.postgresql.org/docs/current/static/runtime-config-logging.html#guc-log-statement
Just set that and the log file path, and you'll have the list. You can also configure it to log only long-running queries.
You can then take those queries and run EXPLAIN on them to find out what's going on with them.
https://www.postgresql.org/docs/9.2/static/using-explain.html
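A minimal sketch of the relevant settings (the values are examples; on versions with ALTER SYSTEM you can avoid editing postgresql.conf by hand):
-- in postgresql.conf:
--   log_statement = 'all'                  -- log every statement
--   log_min_duration_statement = 500       -- or: log only statements slower than 500 ms
-- or, from a superuser session:
ALTER SYSTEM SET log_statement = 'all';
SELECT pg_reload_conf();
-- then take a captured query from the log and inspect its plan:
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 123;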
Adding to Joshua's answer, to see which queries are currently running simply issue the following statement at any time (e.g. in PGAdminIII's query window):
SELECT datname,procpid,current_query FROM pg_stat_activity;
Sample output:
    datname    | procpid |           current_query
---------------+---------+------------------------------------
 mydatabaseabc |    2587 | <IDLE>
 anotherdb     |   15726 | SELECT * FROM users WHERE id=123 ;
 mydatabaseabc |   15851 | <IDLE>
(3 rows)
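Note that on PostgreSQL 9.2 and later the columns were renamed (procpid is now pid, current_query is now query), so the equivalent query is:
SELECT datname, pid, query FROM pg_stat_activity;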
I discovered pgBadger (https://pgbadger.darold.net/) and it is a fantastic tool that has saved my life many times. Here is an example of a report.
If you open it and go to the 'Top' menu you can see the slowest queries and the most time-consuming queries.
Then you can drill into the details and see nice graphs that show the queries by hour, and if you use the detail button you can see the SQL text in a pretty form. The tool is free and works really well for this.
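Once query logging is enabled (see the log_statement / log_min_duration_statement settings above), generating a report is a one-liner; the log path and output file here are examples:
pgbadger /var/log/postgresql/postgresql-*.log -o /tmp/report.html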
"I need to see the queries submitted to a PostgreSQL server"
As an option, if you use pgAdmin (in my case pgAdmin 4 v2.1), you can observe queries via the "Dashboard" tab.
Update, June 2022: answering the questions from the comments.
Question 1: My long SQL query gets truncated; is there any workaround?
Follow the steps below:
Close pgAdmin.
Find the postgresql.conf file. On my computer it is located at c:\Program Files\PostgreSQL\13\data\postgresql.conf. If you can't find it, read this answer for more details.
Open postgresql.conf and find the property called track_activity_query_size. By default the value is 1024, which means any query longer than 1024 characters will be truncated. Uncomment this property and set a new value, for example:
track_activity_query_size = 32768
Restart the PostgreSQL service on your computer.
P.S.: now everything is ready, but keep in mind that this change can slightly decrease performance. From a development/debugging standpoint you won't see any difference, but don't forget to revert this property in a production environment. For more details read this article.
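On recent versions you can also make the same change without editing the file by hand; a restart is still required, since this setting only takes effect at server start:
ALTER SYSTEM SET track_activity_query_size = 32768;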
Question 2: I ran a function/method that triggers a SQL query, but I still can't see it in pgAdmin; or sometimes I see it, but it runs so quickly that I can't even expand the session on the 'Dashboard' tab.
Answer: Try running your application in 'debug' mode and set a breakpoint right before you close the connection to the database. At the same time (while you are debugging), click the 'refresh' button on the 'Dashboard' tab in pgAdmin.
You can use the pg_stat_statements extension.
If you are running the db in Docker, just add this command in docker-compose.yml; otherwise look at the installation instructions for your setup:
command: postgres -c shared_preload_libraries=pg_stat_statements -c pg_stat_statements.track=all -c max_connections=200
And then in the db run this query:
CREATE EXTENSION pg_stat_statements;
Now, to see the operations that took the most time per call, run:
SELECT * FROM pg_stat_statements ORDER BY total_time/calls DESC LIMIT 10;
Or play with other queries over that view to find what you are looking for.
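On PostgreSQL 13 and later the timing columns were renamed (total_time became total_exec_time, and there is a ready-made per-call average), so the equivalent query is:
SELECT * FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;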
All those tools like pgBadger or pg_stat_statements require access to the server and/or altering the server settings or log settings, which is not such a good idea, especially if it requires a server restart, and because logging slows everything down, including production use.
In addition, extensions such as pg_stat_statements don't really show the individual queries, let alone in chronological order; pg_stat_activity doesn't show you anything that isn't running right now, and it doesn't show queries from users other than you.
Instead of any of that, you can add a TCP proxy between your application and the PostgreSQL server.
The proxy reads all the SQL statements from what goes over the wire from your application to the server and writes them to the console (or wherever), while forwarding everything to PostgreSQL and returning the answers to your application.
This way, you don't need to stop/start/restart your db server, you don't need admin/root rights on the db server to change the config file, and you don't need any access to the db server at all. All you need to do is change the db connection string in your application (e.g. in your dev environment) to point to the proxy instead of the SQL server (the proxy then needs to point to the real server). Then you can see, in chronological order, what your <insert_profanity_here> application does on the database, and other people's queries don't show up, which makes it even better than SQL Server Profiler. (Of course, you can also see what other people do if you put the proxy on the db server on the old db port and assign the db a new port.)
I have implemented this with pg_proxy_net (it runs on Windows, Linux and Mac and doesn't require OS dependencies, as it is a self-contained .NET Core deployment).
That way, you get approximately "the same" as you get with SQL Server Profiler. In fact, since you aren't disturbed by other people's queries, what you get with pg_proxy_net is arguably better.
Also, on GitHub, I have a command-line MS SQL Server profiler that works on Linux/Mac, and a GUI MS-SQL-Express-Profiler for Windows. The funny thing is, once you have written one such tool, writing more is a piece of cake and done in under a day.
Also, if you want to get pg_stat_statements to work, you need to alter the config file (postgresql.conf), adding the preload library and the tracking setting, and then restart the server:
-- in postgresql.conf (on my machine: D:\Programme\LessPortableApps\SQL_PostGreSQL\PostgreSQLPortable\Data\data\postgresql.conf)
shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
-- then, after the restart, in the database:
CREATE EXTENSION pg_stat_statements;
You find the documentation for the PostgreSQL protocol here:
https://www.postgresql.org/docs/current/protocol-overview.html
You can see how the data is written into the TCP buffer by looking at the source code of a PostgreSQL client, e.g. the FrontendMessages of Npgsql on GitHub:
https://github.com/npgsql/npgsql/blob/main/src/Npgsql/Internal/NpgsqlConnector.FrontendMessages.cs
Also, just in case you have a .NET application (with source code) that uses Npgsql, you might want to have a look at Npgsql.OpenTelemetry.
PS:
To configure the logs, see the ChartIO tutorial and TablePlus.
Cheers!
Happy "profiling"!