Progress Database Performance issue? - progress-4gl

We have recently upgraded to OE 11.3. The application and database appear to be slow in one particular location, but we have not faced any performance issues in the application or the databases anywhere else. I have checked a few parameters in promon, such as buffer hits, the number of database buffers, and the -spin parameter:
Buffer hits - 97%
Number of database buffers (-B) - 50000
-spin before timeout - 2000, which looks very low
Is there any way we can find out why the database and application are slow in only that location? We are not facing any performance issues from other locations.
Would increasing the -spin value improve performance in that location?
Location refers to geographical location.

You are not providing very much information:
A) About your intentions. Do you just want everything to be "faster"? Or are there other needs - like servers running out of memory or being under heavy load, etc.
B) About your system. How many users, databases, tables, indexes, etc.?
C) When you say location - what do you really mean? Is it a specific program, a specific query/search or a specific (geographical) location?
Buffer hits
A 97% buffer hit ratio doesn't say that much on its own:
Are there 1 000 record lookups or 1 000 000 000?
"Primary Buffer hits" says nothing about individual tables. Perhaps all the "buffer misses" come from a single table (or very few).
A simple explanation of buffer hits:
A record read from the buffer (memory) is a "hit"; a record read from disk is not.
1 000 record lookups with 97% buffer hits means:
970 records are read from buffer (memory). (0.97 x 1 000)
30 records are read from disk. (0.03 x 1 000)
Increasing to 99% buffer hits means you will remove:
20 disk reads. (0.02 x 1 000)
1 000 000 000 record lookups with 97% buffer hits means:
970 000 000 records are read from buffer (memory).
30 000 000 are read from disk.
Increasing to 99% buffer hits means you will remove:
20 000 000 disk reads.
In the first case you most likely won't notice anything at all when going from 97% to 99%. In the second case the load on the disks will decrease a lot.
Conclusion
Increasing -B might affect your performance as well as buffer hits. Changing -spin might also affect your performance by utilizing more of your CPU. It all depends on how your system works. The best way really is to try (with a test setup).
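To give a concrete (and purely illustrative) example of where those knobs live: both are startup parameters, typically kept in the database's .pf file, something like:
# mydb.pf - illustrative values only, not a recommendation
# -B is in database blocks; 500000 blocks of 8 KB is roughly 4 GB of buffer pool
-B 500000
# spins before a latch wait naps; watch latch activity in promon while tuning
-spin 10000
The right values depend entirely on your RAM, database block size and CPU count, so treat these numbers as placeholders to experiment with, not targets.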
The first thing you really should do is to look at your application and the most frequently run queries - do they utilize optimal indexes? If not, you can most likely tune the startup parameters a lot without seeing much difference. Read up on index usage, XREF compiling, and the various VSTs (virtual system tables) you can use to check table and index activity, etcetera.
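For example, a minimal ABL sketch against the _TableStat and _File VSTs can show which tables draw the most record reads (this assumes the default -basetable and a -tablerangesize large enough to cover your application tables):
/* list tables by record reads since startup, busiest first */
FOR EACH _TableStat NO-LOCK,
    FIRST _File NO-LOCK WHERE _File._File-Number = _TableStat._TableStat-id
    BY _TableStat._TableStat-read DESCENDING:
    DISPLAY _File._File-Name FORMAT "x(32)"
            _TableStat._TableStat-read LABEL "Record reads".
END.
A similar query against _IndexStat (joined to _Index) shows per-index activity.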
This is a good place to start:
Top 10 (really more) Performance Tuning Tips For The Progress Database
Also, you can try the excellent free ProTop software to get some guesstimates for -B:
ProTop

This question is very vague. You would be much better off asking it in a forum where some "back and forth" can take place and where you can be guided to a more complete answer.
You might try:
http://progresstalk.com
https://community.progress.com
http://peg.com
These forums all have dedicated DBA focused areas where many people routinely chip in to help.

We have found that adding -T /dev/shm (on Linux servers) made a big performance improvement, since -T controls where temporary files are written:
/oe116> cat startup.pf
-T /dev/shm
You can also add this to your common.pf files.
You can see the before and after of this by running (with the database running):
lsof | grep delete
You should see a lot of temporary file locations on your hard disk; after you add the parameter and restart your database, they will be in shared memory instead.

Related

No data available in Performance Insights

I have a critical application deployed in AWS RDS; the DB engine is PostgreSQL version 10.18. The architecture is unusual because we're dealing with medical data. This means that all the doctors connecting to the database (through PgBouncer) have their own schema; around 4,000 doctors means around 4,000 schemas, with the same structure but obviously different data. Around 2,000 doctors actually connect every day.
The instance type is db.r5.4xlarge and there's a total buffer of around 100 GB. Still, there are a lot of hits on the disk: on the Performance Insights side I can see that the largest AAS contribution comes from a wait event called "DataFileRead", which (as far as I know) means that the data couldn't be fetched from the buffer and the engine went to disk. There's an average value of 60 AAS on DataFileRead.
That is not really the problem; I'm trying to apply some optimizations, creating the right indexes for example. The problem is that on the Top SQL tab I cannot see any data next to the query (like Calls/sec, Rows/sec, Blk hits/sec, etc.).
Does this mean that the pg_stat_statements limit of 5,000 rows is too low? Also, I can't find any information about the performance impact of having these statistics enabled. Does raising the limit beyond 5,000 records hurt performance significantly? Can I go up to 50,000, for example?
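For reference, the limit in question can be checked from psql (a hedged sketch; note that on RDS, pg_stat_statements.max can only be changed via the DB parameter group and requires a restart, and Performance Insights' per-SQL statistics depend on pg_stat_statements being loaded):
-- current limit (default 5000)
SHOW pg_stat_statements.max;
-- if this count sits at the limit, the least-used statements are being recycled
SELECT count(*) FROM pg_stat_statements;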

What about expected performance in Pentaho?

I am using Pentaho to create ETLs and I am very focused on performance. I developed an ETL process that copies 163,000,000 rows from SQL Server 2008 to PostgreSQL and it takes 17 hours.
I do not know how good or bad this performance is. Do you know how to measure whether the time a process takes is reasonable? At least as a reference, so I know whether I need to keep working heavily on performance or not.
Furthermore, I would like to know if it is normal that in the first 2 minutes the ETL process loads 2M rows. From that I calculated how long it would take to load all the rows: the expected result was 6 hours, but then the performance decreases and it ends up taking 17 hours.
I have been searching on Google and cannot find any time references or explanations about performance.
Divide and conquer, and proceed by elimination.
First, add a LIMIT to your query (TOP on the SQL Server side) so it takes 10 minutes instead of 17 hours; this will make it a lot easier to try different things.
Are the processes running on different machines? If so, measure network bandwidth utilization to make sure it isn't a bottleneck. Transfer a huge file, make sure the bandwidth is really there.
Are the processes running on the same machine? Maybe one is starving the other for IO. Are source and destination the same hard drive? Different hard drives? SSDs? You need to explain...
Examine IO and CPU usage of both processes. Does one process max out one cpu core?
Does a process max out one of the disks? Check iowait, iops, IO bandwidth, etc.
How many columns? Two INTs, 500 FLOATs, or a huge BLOB with a 12 megabyte PDF in each row? Performance would vary between these cases...
Now, I will assume the problem is on the POSTGRES side.
Create a dummy table, identical to your target table, which has:
Exact same columns (CREATE TABLE dummy (LIKE original_table))
No indexes, no constraints (I think that is the default, but double-check the created table)
A BEFORE INSERT trigger on it which returns NULL and drops the row (a sketch follows below).
The rows will be processed, just not inserted.
Is it fast now? OK, so the problem was insertion.
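A minimal sketch of that setup, assuming a hypothetical target table called original_table (the function and trigger names are placeholders too):
-- throwaway copy of the target: same columns, no indexes, no constraints
CREATE TABLE dummy (LIKE original_table INCLUDING DEFAULTS);
-- BEFORE INSERT trigger that returns NULL, so every row is processed but discarded
CREATE FUNCTION discard_row() RETURNS trigger AS $$
BEGIN
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER drop_all_rows
    BEFORE INSERT ON dummy
    FOR EACH ROW EXECUTE PROCEDURE discard_row();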
Do it again, but this time using an UNLOGGED TABLE (or a TEMPORARY TABLE). These have no crash-resistance because they are not written to the WAL (the journal), but for importing data that's OK... if it crashes during the insert you're going to wipe it out and restart anyway.
Still No indexes, No constraints. Is it fast?
If slow => IO write bandwidth issue, possibly caused by something else hitting the disks
If fast => IO is OK, problem not found yet!
With the table loaded with data, add the indexes and constraints one by one and find out if you've got, say, a CHECK that uses a slow SQL function, or an FK into a table which has no index, that kind of stuff. Just check how long it takes to create each constraint.
Note: on an import like this you would normally add indices and constraints after the import.
My gut feeling is that PG is checkpointing like crazy because of the large data volume combined with too-low checkpoint settings in the config. Or some issue like that, probably related to random IO writes. You put the WAL on a fast SSD, right?
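If you want to test that hunch, the usual knobs are in postgresql.conf (a hedged sketch; the names depend on the version - max_wal_size replaced checkpoint_segments in 9.5 - and the values are only illustrative):
log_checkpoints = on                  # confirm whether checkpoints really are the problem
max_wal_size = 8GB                    # let more WAL accumulate between checkpoints
checkpoint_completion_target = 0.9    # spread checkpoint writes out over time
checkpoint_timeout = 15min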
17 hours is too much. Far too much. For 200 million rows, even 6 hours is a lot.
Hints for optimization:
Check the memory size: edit spoon.bat, find the line containing -Xmx and change it to half your machine's memory size. Details vary with the Java and PDI version; a hedged example for PDI 7.x follows after this list.
Check that the query from the source database is not taking too long (because it is too complex, or because of the server memory size, or ?).
Check the target commit size (try 25000 for PostgreSQL), that Use batch update for inserts is on, and also that the indexes and constraints are disabled.
Play with Enable lazy conversion in the Table input step. Warning: you may produce errors due to data casting that are difficult to identify and debug.
In the transformation properties you can tune the Nr of rows in rowset (click anywhere, select Properties, then the Miscellaneous tab). On the same tab, check that the transformation is NOT transactional.
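On the first point, the heap line in spoon.bat looks roughly like this in PDI 7.x (a hedged example; the exact line and defaults vary by PDI and Java version, and on Linux the equivalent lives in spoon.sh):
REM give the JVM about half the machine's RAM, e.g. 8 GB on a 16 GB box
if "%PENTAHO_DI_JAVA_OPTIONS%"=="" set PENTAHO_DI_JAVA_OPTIONS="-Xms2048m" "-Xmx8192m" "-XX:MaxPermSize=256m"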

MongoDB consumes a lot of memory

I have been at war with MongoDB for more than a month now. Until I lose =] ...
Battle 1. Battle 2.
And now a new problem. Again, not enough memory.
Initially this was solved by simply moving to a larger VPS plan. Then journal = false. But now I have reached the top plan and increasing the memory further is not possible.
My database is about 4 GB short of memory.
When I was choosing a database for the project, it was nowhere written that MongoDB needs this much memory. With about 10 million records MongoDB is 4 GB short of memory, while my MySQL database with 10 million records copes easily with 1.4 GB.
As I understand it, the problem is the large number of indexed fields. But since I cannot even get the database started, I cannot remove them. I needed them in the early stages of development; now they are not important to me.
Please tell me, can I remove them somehow?
I have a dump of the database - the complete /data/db folder.
The database does not start on my PC with 4 GB of memory, nor on a VPS with the same 4 GB.
As an alternative, I am thinking of taking a trial period on some VPS/VDS with more memory, running mongo there and deleting the indexes.
Do you know a web host with a trial period and 6 GB of memory?
Or if there is another option, could you say what it is?
The issue has very little to do with the size of your data set. MongoDB uses memory-mapped files for its storage engine. As such it will swap pages of hot data into memory when it can, and it does so fairly aggressively (or, more accurately, the OS memory management does).
Basically it uses as much memory as is available to it and there's very little you can do to avoid that. All data pages (be it actual data or indexes) that are accessed during operation will be swapped into memory if there is space available.
There are plenty of references to this on the internet and on mongodb.org, by the way. Saying it isn't mentioned anywhere isn't really true.
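That said, once you do get mongod running on a box with enough memory, dropping the indexes you no longer need is simple from the mongo shell (the collection and index names below are placeholders):
// list the indexes on a collection, with their names and keys
db.mycollection.getIndexes()
// drop a single index by name, or dropIndexes() for everything except _id
db.mycollection.dropIndex("somefield_1")
// check how much space the remaining indexes use
db.mycollection.totalIndexSize()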

MongoDB Insert performance - Huge table with a couple of Indexes

I am testing MongoDB to be used in a database with a huge table of about 30 billion records of about 200 bytes each. I understand that sharding is needed for that kind of volume, so I am trying to get 1 to 2 billion records on one machine. I have reached 1 billion records on a machine with 2 CPUs / 6 cores each and 64 GB of RAM. I mongoimport-ed without indexes, and the speed was okay (average 14k records/s). I added indexes, which took a very long time, but that is okay as it is a one-time thing. Now inserting new records into the database takes a very long time. As far as I can tell, the machine is not loaded while inserting records (CPU, RAM, and I/O are in good shape). How is it possible to speed up inserting new records?
I would recommend adding this host to MMS (http://mms.10gen.com/help/overview.html#installation) - make sure you install with munin-node support and that will give you the most information. This will allow you to track what might be slowing you down. Sorry I can't be more specific in the answer, but there are many, many possible explanations here. Some general points:
Adding indexes means that the indexes as well as your working data set need to be in RAM now; this may have strained your resources (look for page faults)
Now that you have indexes, they must be updated when you are inserting - if everything fits in RAM this should be OK, see the first point
You should also check your disk IO to see how that is performing - how does your background flush average look? (a couple of example commands follow after this list)
Are you running the correct filesystem (XFS, ext4) and a kernel version later than 2.6.25? (earlier versions have issues with fallocate())
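For the page-fault and background-flush points above, a couple of hedged command-line checks (column names vary a little between MongoDB versions):
# watch the 'faults' and 'flushes' columns while you insert
mongostat 5
# %util and await show whether the disks are saturated
iostat -x 5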
Some good general information for follow up can be found here:
http://www.mongodb.org/display/DOCS/Production+Notes

Cassandra random read speed

We're still evaluating Cassandra for our data store. As a very simple test, I inserted a value for 4 columns into the Keyspace1/Standard1 column family on my local machine amounting to about 100 bytes of data. Then I read it back as fast as I could by row key. I can read it back at 160,000/second. Great.
Then I put in a million similar records, all with keys in the form X.Y where X in (1..10) and Y in (1..100,000), and I queried for random records. Performance fell to 26,000 queries per second. This is still well above the number of queries we need to support (about 1,500/sec).
Finally I put ten million records in from 1.1 up through 10.1000000 and randomly queried for one of the 10 million records. Performance is abysmal at 60 queries per second and my disk is thrashing around like crazy.
I also verified that if I ask for a subset of the data, say the 1,000 records between 3,000,000 and 3,001,000, it returns slowly at first and then as they cache, it speeds right up to 20,000 queries per second and my disk stops going crazy.
I've read all over that people are storing billions of records in Cassandra and fetching them at 5-6k per second, but I can't get anywhere near that with only 10 million records. Any idea what I'm doing wrong? Is there some setting I need to change from the defaults? I'm on an overclocked Core i7 box with 6 GB of RAM, so I don't think it's the machine.
Here's my code to fetch records which I'm spawning into 8 threads to ask for one value from one column via row key:
ColumnPath cp = new ColumnPath();
cp.Column_family = "Standard1";
cp.Column = utf8Encoding.GetBytes("site");
string key = (1+sRand.Next(9)) + "." + (1+sRand.Next(1000000));
ColumnOrSuperColumn logline = client.get("Keyspace1", key, cp, ConsistencyLevel.ONE);
Thanks for any insights
Purely random reads are about the worst-case behavior for the caching that your OS (and Cassandra, if you set up the key or row cache) tries to do (a config sketch follows below).
If you look at contrib/py_stress in the Cassandra source distribution, it has a configurable stdev to perform random reads but with some keys hotter than others. This will be more representative of most real-world workloads.
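If you do want to try the key/row cache on that 0.6-era setup, it is configured per column family in storage-conf.xml; a hedged, illustrative sketch (the settings moved into cassandra.yaml as keys_cached/rows_cached in 0.7):
<ColumnFamily Name="Standard1" CompareWith="BytesType"
              KeysCached="100%" RowsCached="10000"/>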
Add more Cassandra nodes and give them lots of memory (-Xms / -Xmx). The more Cassandra instances you have, the more the data will be partitioned across the nodes and the more likely it is to be in memory or easily accessed from disk. You'll be very limited trying to scale on a single workstation-class CPU. Also, check the default -Xms/-Xmx settings; I think the default is 1 GB.
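For the heap on that era of Cassandra, the setting lives in conf/cassandra.in.sh (later versions moved it to cassandra-env.sh); a hedged, illustrative example:
# fixed heap size; leave the rest of the RAM to the OS page cache
JVM_OPTS="$JVM_OPTS -Xms4G -Xmx4G"
On a 6 GB box there is not much headroom, though - the page cache needs memory too, which is another reason to spread the data over more nodes.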
It looks like you haven't got enough RAM to store all the records in memory.
If you swap to disk then you are in trouble, and performance is expected to drop significantly, especially if you are random reading.
You could also try benchmarking some other popular alternatives, like Redis or VoltDB.
VoltDB can certainly handle this level of read performance as well as writes and operates using a cluster of servers. As an in-memory solution you need to build a large enough cluster to hold all of your data in RAM.