Unit tests producing different results when using PostgreSQL

Been working on a module that works pretty well when using MySQL, but when I try to run the unit tests under PostgreSQL (using Travis) I get an error.
The module itself is here: https://github.com/silvercommerce/taxable-currency
An example failed build is here: https://travis-ci.org/silvercommerce/taxable-currency/jobs/546838724
I don't have a huge amount of experience with PostgreSQL, and I'm not really sure why this might be happening. The only thing I can think of that might cause this is that I am manually setting the IDs in my fixtures file, and maybe PostgreSQL doesn't support this?
If this is not the case, does anyone have an idea what might be causing this issue?
Edit: I have looked into this again and the errors appear to be caused by this assertion, which should find the Tax Rate vat but instead finds the Tax Rate reduced.
I am guessing there is an issue in my logic that is causing the incorrect rate to be returned, though I am unsure why...

In the end it appears that Postgres orders unsorted results differently from MySQL by default (https://www.postgresql.org/docs/9.1/queries-order.html). The line of interest is:
The actual order in that case will depend on the scan and join plan types and the order on disk, but it must not be relied on
Since I didn't really need to test a list with multiple items, I just removed the additional items.
If you are working on something that needs to support MySQL and Postgres though, you might need to consider defining a consistent sort order as part of your query.
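For illustration (the module above is SilverStripe/PHP, so the table and column names here are just placeholders), a minimal Perl DBI sketch of pinning the order explicitly so both engines return rows in the same sequence:

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=mydb', 'user', 'pass', { RaiseError => 1 });

# Without an ORDER BY, MySQL and Postgres are both free to return rows in
# whatever order the plan happens to produce; an explicit ORDER BY keeps
# test assertions stable across databases.
my $rates = $dbh->selectall_arrayref(
    'SELECT id, title, rate FROM tax_rate ORDER BY id',
    { Slice => {} },
);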

How to debug DBIx::Class?

I was recently working on a Perl project that required me to use DBIx::Class as the ORM to interact with a database. One of the things I found most annoying and time-consuming was trying to debug and understand what was happening.
I was especially frustrated with an error I was getting, Column 'XXXXXX' in where clause is ambiguous, and I eventually figured out what was causing it. It came down to the fact that I was requesting columns from two different tables which were joined on the XXXXXX attribute, and in the WHERE clause the column wasn't being aliased. This led to DBIx::Class not knowing which column to use.
The most frustrating thing was not knowing what DBIx::Class was doing, leading me to have many doubts about where the error was coming from.
How can I efficiently debug this kind of DBIx::Class error?
You enable debugging by setting the DBIC_TRACE environment variable to 1 or a filename.
That's documented at the very top of DBIx::Class::Manual::Troubleshooting
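For example, from the shell (the script name is a placeholder):

DBIC_TRACE=1 perl my_script.pl

or from inside the program itself, before the schema connects:

BEGIN { $ENV{DBIC_TRACE} = 1 }   # generated SQL and bind values are written to STDERR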
So I knew what the error was, but I didn't know exactly where it was being caused. If this was plain old SQL I would've simply added the aliases myself but I didn't know how to do that in DBIx::Class. Moreover, I had no clue what SQL query was actually being executed, which made things even worse.
That's when I found out about the as_query method (https://metacpan.org/pod/DBIx::Class::ResultSet#as_query) which, when logged, prints out the SQL query that DBIx::Class executes. LIFE CHANGER. Saved me from so much trouble and gave me exactly what I needed. Shortly after I was able to solve the issue.
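A minimal sketch of what that looks like (the resultset, its conditions, and $schema are placeholders for whatever you are debugging):

use strict;
use warnings;

# $schema is an already-connected DBIx::Class schema object
my $rs = $schema->resultset('Order')->search(
    { 'me.customer_id' => 42 },
    { join => 'customer' },
);

# as_query returns a reference to [ $sql, @bind_info ]
my ($sql, @bind_info) = @{ ${ $rs->as_query } };
warn "DBIx::Class will run: $sql\n";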
Moral of the story: if you're having trouble seeing what DBIx::Class is doing, use this and save yourself from countless headaches.

Postgres query plan changes for identical query when using pg_hint_plan

(Postgres 11.7)
I'm using the Rows pg_hint_plan hint to dynamically fix a bad row-count estimate.
My query accepts an array of arguments, which get unnested and joined onto the rest of the query as a predicate. By default, the query planner always assumes this array-argument contains 100 records, whereas in reality this number could be very different. This bad estimate was resulting in poor query plans. I set the number of rows definitively from within the calling application, by changing the hint text per query.
This approach seems to work sometimes, but I see some strange behaviour testing the query (in DBeaver).
If I start with a brand new connection, when I explain the query (or indeed just run it), the hint seems to be ignored for the first 6 executions, but thereafter it starts getting interpreted correctly. This is consistently reproducible: I see the offending row count estimates change on the 7th execution on a new connection.
More interestingly, the query also uses some (immutable) functions to do some lookup operations. If I remove these and replace them with an equivalent CTE or sub-select, this strange behaviour seems to disappear, and the hints are evaluated correctly all the time, even on a brand new connection.
What could be causing it to not honour the pg_hint_plan hints until after 6 requests have been made in that session? Why does the presence of the functions have a bearing on the hints?
Since you are using JDBC, try setting the prepareThreshold connection parameter to 1, as detailed in the documentation.
That will make the driver use a server-side prepared statement as soon as possible, and it seems like this extension only works in that case.
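For example, assuming a standard pgJDBC connection URL (host and database names are placeholders), the parameter can go directly on the URL:

jdbc:postgresql://dbhost:5432/mydb?prepareThreshold=1

It can also be supplied through the Properties object passed to DriverManager.getConnection.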

Time of first occurrence of the query from pg_stat_statements. Possible to get?

I'm debugging a DB performance issue. There's a lead that the issue was introduced after a certain deploy, i.e. when the DB started to serve some new queries.
I'm looking to correlate deployment time with the performance issues, and would like to identify the queries that are causing this.
Using pg_stat_statements has been very handy so far. Unfortunately it does not store the time stamp of the first occurrence of each query.
Are there any auxiliary tables I could look into to see the time of first occurrence of queries?
Ideally, if this information were available in pg_stat_statements, I'd write a query like this:
select queryid from pg_stat_statements where date(first_run) = '2020-04-01';
Additionally, it'd be cool to see last_run as well, to filter out some old queries that no longer execute at all but remain in pg_stat_statements. That's more of a nice-to-have than a necessity, though.
This information is not stored anywhere, and indeed it would not be very useful. If the problem statement is a new one, you can easily identify it in your application code. If it is not a new statement, but something made the query slower, the first time the query was executed won't help you.
Is your source code not under version control?
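Not part of the answer above, but if you do want this information going forward, one possible workaround is to snapshot pg_stat_statements periodically (e.g. from cron) into your own table and derive first-seen times from that. A rough Perl sketch, assuming the extension is installed and with all names purely illustrative:

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=mydb', 'user', 'pass',
    { RaiseError => 1, AutoCommit => 1 });

# One-off setup: a table recording when each queryid was first observed
$dbh->do(<<'SQL');
CREATE TABLE IF NOT EXISTS query_first_seen (
    queryid    bigint PRIMARY KEY,
    query      text,
    first_seen timestamptz NOT NULL DEFAULT now()
)
SQL

# On each run, record only the queryids we have not seen before
$dbh->do(<<'SQL');
INSERT INTO query_first_seen (queryid, query)
SELECT s.queryid, s.query
FROM pg_stat_statements s
ON CONFLICT (queryid) DO NOTHING
SQL

$dbh->disconnect;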

Why is Crystal Reports Query so slow?

I have many Crystal Reports against the same database. Some execute quickly given the same date parameters, and many fields are the same, as are the tables they access. One of my reports that used to run quickly is now running very slowly, and I can see it looking through all the records - shown at the bottom as 0 of 100000 until it finds records. I have no idea what I may have changed to make it do this. Some reports still run fast and some do not, and these findings are consistent across the reports I am talking about. Does anyone know what setting might be causing this?
I have tried looking for any subtle differences between them - I cannot see anything. Many of them were clones of the original (which still runs fast).
The performance section of my Crystal Reports book states that if the WHERE clause cannot be translated it will be ignored and all records will be processed - which is what this looks like - though I have a valid WHERE clause when I check it in the report.
Use Indexes Or Server For Speed is checked. All other settings in Report Options are identical.
Thanks
You can do some troubleshooting:
Try running your query directly against the database and see how long it takes.
Check whether any business logic has been added in your report.
Try putting the same query in a fresh report and see if it takes a similar amount of time.
Debug your application and see if some part of your code is making the report slow.
Are you running it against a local database or on a server?
Also, if you can share your query, I can take a look.
Let me know if you need more help.

Perl script fails when selecting data from big PostgreSQL table

I'm trying to run a SELECT statement on PostgreSQL database and save its result into a file.
The code runs in my environment but fails once I run it on a lightweight server.
I monitored it and saw that the reason it fails after several seconds is due to a lack of memory (the machine has only 512MB RAM). I didn't expect this to be a problem, as all I want to do is to save the whole result set as a JSON file on disk.
I was planning to use fetchrow_array or fetchrow_arrayref functions hoping to fetch and process only one row at a time.
Unfortunately I discovered that, when you use DBD::Pg, there's no difference in the actual fetch behaviour between the two above and fetchall_arrayref. My script fails at the $sth->execute() call, before it even has a chance to call any fetch... function.
This suggests to me that the implementation of execute in DBD::Pg actually fetches ALL the rows into memory, leaving only the formatting of the returned data to the fetch... functions.
A quick look at the DBI documentation gives a hint:
If the driver supports a local row cache for SELECT statements, then this attribute holds the number of un-fetched rows in the cache. If the driver doesn't, then it returns undef. Note that some drivers pre-fetch rows on execute, whereas others wait till the first fetch.
So in theory I would just need to set the RowCacheSize parameter. I've tried, but this feature doesn't seem to be implemented by DBD::Pg:
Not used by DBD::Pg
I find this limitation a huge general problem (does the execute() call really pre-fetch all rows?) and am more inclined to believe that I'm missing something here than that this is actually a true limitation of interacting with PostgreSQL databases from Perl.
Update (2014-03-09): My script works now, thanks to the workaround described in my comment on Borodin's answer. The maintainer of the DBD::Pg library got back to me on the issue, saying the root cause is deeper and lies within libpq, the PostgreSQL client library used by DBD::Pg. I also think a very similar issue to the one described here affects pgAdmin. Despite being a native PostgreSQL tool, it still doesn't offer, in its Options, a way to define a default limit on the size of the result set. This is probably why the Query tool sometimes waits a good while before presenting results from bulky queries, potentially breaking the app in some cases too.
In the section Cursors, the documentation for the database driver says this
Therefore the "execute" method fetches all data at once into data structures located in the front-end application. This fact must to be considered when selecting large amounts of data!
So your supposition is correct. However the same section goes on to describe how you can use cursors in your Perl application to read the data in chunks. I believe this would fix your problem.
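A minimal sketch of that cursor approach over plain DBI (the cursor name, table name, and chunk size are placeholders; note that cursors have to live inside a transaction):

use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=mydb', 'user', 'pass',
    { RaiseError => 1, AutoCommit => 0 });   # cursors require a transaction

$dbh->do('DECLARE big_cur CURSOR FOR SELECT * FROM big_table');

while (1) {
    # Only this chunk of rows is ever held in client memory
    my $sth = $dbh->prepare('FETCH 1000 FROM big_cur');
    $sth->execute;
    last if $sth->rows == 0;
    while (my @row = $sth->fetchrow_array) {
        # process one row here, e.g. append it to the JSON output
    }
}

$dbh->do('CLOSE big_cur');
$dbh->commit;
$dbh->disconnect;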
Another alternative is to use OFFSET and LIMIT clauses on your SELECT statement to emulate cursor functionality. If you write
my $select = $dbh->prepare('SELECT * FROM table OFFSET ? LIMIT 1');
then you can say something like (all of this is untested)
my $i = 0;
while ($select->execute($i++) > 0) {   # with DBD::Pg, execute returns "0E0" (numerically zero) once OFFSET passes the last row
    my @data = $select->fetchrow_array;
    # Process @data
}
to read your tables one row at a time.
You may find that you need to increase the chunk size to get an acceptable level of efficiency.