How can I calculate the total number of records using Progress 4GL?

How can I calculate the total number of records in a table? I want to show all table names in a DB along with the number of records in each table.

The fastest method is:
proutil dbname -C tabanalys > dbname.tab
This is an external utility that analyzes the database.
You can also, of course, read every record and count them, but that tends to be a lot slower.

The way to get the number of records depends on the application you are planning to build.
Our DBAs just use the Progress utilities. On Unix, /usr/dlc/bin/proutil -C dbanalys (or some variation) gets the database information, and they just dump that to a file.
To get the schema information from Progress itself you can use the VST tables. Specifically, within a particular database you can use the _file table to retrieve all of the table names.
Once you have the table names, you can use queries to get the number of records in each table. The fastest way to query a particular table for a record count is to use a PRESELECT.
This requires a dynamic buffer and query.
So you can do something like the following:
CREATE WIDGET-POOL.

DEF VAR h_predicate AS CHAR   NO-UNDO.
DEF VAR h_qry       AS HANDLE NO-UNDO.
DEF VAR h_buffer    AS HANDLE NO-UNDO.

FOR EACH _file NO-LOCK:
    /* Build a PRESELECT for the current table name taken from the _file table. */
    h_predicate = "PRESELECT EACH " + _file._file-name + " NO-LOCK".

    CREATE BUFFER h_buffer FOR TABLE _file._file-name.
    CREATE QUERY h_qry.
    h_qry:SET-BUFFERS( h_buffer ).
    h_qry:QUERY-PREPARE( h_predicate ).
    h_qry:QUERY-OPEN().

    /* After opening a PRESELECT query, NUM-RESULTS holds the record count. */
    DISP _file._file-name h_qry:NUM-RESULTS.

    h_qry:QUERY-CLOSE().
    DELETE OBJECT h_qry.
    DELETE OBJECT h_buffer.
END.

An easy one:
SELECT COUNT(*) FROM tablename.
A bit more complex:
DEF VAR i AS INT NO-UNDO.
FOR EACH tablename NO-LOCK:
    i = i + 1.
END.
DISPLAY i.
For anything more complex, see the other answers.

Use the CURRENT-RESULT-ROW function with a SCROLLING query (DEFINE QUERY) and GET LAST to get the total number of records:
e.g.
DEFINE QUERY qCustomer FOR Customer SCROLLING.
OPEN QUERY qCustomer FOR EACH Customer NO-LOCK.
GET LAST qCustomer.
DISPLAY CURRENT-RESULT-ROW("qCustomer") LABEL "Total number of rows".
...
CLOSE QUERY qCustomer.

Related

Statistics of all/many tables in FileMaker

I'm writing a kind of summary page for my FileMaker solution.
For this, I have defined a "statistics" table, which uses formula fields with ExecuteSQL to gather info from most tables, such as the number of records, recently changed records, etc.
This strangely takes a long time: around 10 seconds for a total of about 20k records in about 10 tables. The same SQL on any database system shouldn't take more than a fraction of a second.
What could the reason be, what can I do about it and where can I start debugging to figure out what's causing all this time?
The actual code is like this:
SQLAusführen ( "SELECT COUNT(*) FROM " & _Stats::Table ; "" ; "" )
SQLAusführen ( "SELECT SUM(\"some_field_name\") FROM " & _Stats::Table ; "" ; "" )
Where "_Stats" is my statistics table, and it has a string field "Table" where I store the name of the other tables.
So each row in this _Stats table should have the stats for the table named in the "Table" field.
Update: I'm not using FileMaker server, this is a standalone client application.
We can definitely talk about why it may be slow. Usually this has mostly to do with the size and complexity of your schema. That is "usually", as you have found.
Can you use the DDR (Database Design Report) instead? Much will depend on what you are actually doing with this data. Tools like FMPerception will also give you many of the stats you are looking for. Again, it depends on what you are doing with it.
Also, can you post your actual calculation? Is the statistics table using unstored calculations? Is the statistics table related to any of the other tables? These are a couple of things that will affect how ExecuteSQL performs.
One thing to keep in mind, whether ExecuteSQL, a Perform Find, or relationship, it's all the same basic query under-the-hood. So if it would be slow doing it one way, it's going to likely be slow with any other directly related approach.
Taking these one at a time:
All records count.
Placing an unstored calc in the target table allows you to get the count of the records through the relationship, without triggering a transfer of all records to the client. You can get the value from the first record in the relationship. Super light way to get that info vs using Count which requires FileMaker to touch every record on the other side.
Sum of Records Matching a Value.
Using a field on the _Stats table with a relationship to the target table will reduce how much work FileMaker has to do to give you an answer.
Then having a Summary field in the target table to sum the records may prove to be more efficient than using an aggregate function. The Summary field will also only sum the records that match the relationship. (Just don't show that field on any of your layouts if you don't need it.)
ExecuteSQL is fastest when it can just rely on a simple index lookup. Once you get outside of that, it's primarily about testing to find the sweet spot. Typically, I will use ExecuteSQL for retrieving either a JSON object from a user table or verifying a single field value. Once you get into sorting and aggregate functions, you step outside of the optimizations of the function.
Also note, if you have an open record (that means you, as the current user), FileMaker Server doesn't know what data you have on the client side, and so it sends ALL of the records. That's why I asked if you were using unstored calcs with ExecuteSQL. It can seem slow when you can't control when the calculations fire. Often I will put the updating of that data into a scheduled script.

How to get Talend (tMap) to load lookup data and incoming data at the same time

I have a Talend job that requires a lookup at the target table.
Naturally the target table is large (a fact table), so I don't want to have to wait for the whole thing to load before running lookups, as in the picture below:
Is there a way to have the lookup work DURING the pull from the main source?
The aim is to speed up the initial loads so things move fast, and to save on memory. As you can see, the lookup has already passed 3 million rows.
The tLogRow represents the same table as the lookup.
You can achieve what you're looking for by configuring the lookup in your tMap to use the "Reload at each row" lookup model instead of "Load Once". This lookup model re-executes your lookup query for each incoming row instead of loading your whole lookup table at once, which is useful for lookups on large tables.
When you select the reload at each row model, you will have to specify a lookup key in the global map section that appears under the settings. Create a key with a name like "ORDER_ID" and map it to the FromExt.ORDER_ID column. Then modify your lookup query so that it returns a single match for the ORDER_ID, like so:
"SELECT col1, col2, ... FROM lookup_table WHERE id = '" + (String)globalMap.get("ORDER_ID") + "'".
This is supposing your id column is a string.
What this does is create a global variable called "ORDER_ID" containing the order id for every incoming row from your main connection, then executes the lookup query filtering for that id.
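For illustration only, here is a rough Java sketch of what the "Reload at each row" model effectively does for each incoming row. This is not code generated by Talend; the connection handling and the lookup_table, id and description names are assumptions. It also uses a PreparedStatement instead of string concatenation, which sidesteps the quoting concern for the id value:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ReloadAtEachRowSketch {
    // Re-runs the lookup for a single incoming ORDER_ID, mirroring the per-row
    // lookup that "Reload at each row" performs with the globalMap key above.
    static String lookupDescription(Connection lookupConn, String orderId) throws SQLException {
        String sql = "SELECT description FROM lookup_table WHERE id = ?";
        try (PreparedStatement ps = lookupConn.prepareStatement(sql)) {
            ps.setString(1, orderId); // the value Talend would expose as globalMap "ORDER_ID"
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("description") : null;
            }
        }
    }
}

Keep in mind that reloading at each row trades memory for many small queries, so an index on the lookup key column matters.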

Why does `FOR ALL ENTRIES` lower performance of CDS view on DB6?

I'm reading data from a SAP Core Data Service (CDS view, SAP R/3, ABAP 7.50) using a WHERE clause on its primary (and only) key column. There is a massive performance decrease when using FOR ALL ENTRIES (about a factor of 5):
Reading data using a normal WHERE clause takes about 10 seconds in my case:
SELECT DISTINCT *
  FROM ZMY_CDS_VIEW
  WHERE prim_key_col EQ 'mykey'
  INTO TABLE @DATA(lt_table1).
Reading data using FOR ALL ENTRIES with the same WHERE takes about 50 seconds in my case:
" Boilerplate code that creates a table with one entry holding the same key value as above
TYPES: BEGIN OF t_kv,
         key_value LIKE ZMY_CDS_VIEW-prim_key_col,
       END OF t_kv.
DATA lt_key_values TYPE TABLE OF t_kv.
DATA ls_key_value  TYPE t_kv.

ls_key_value-key_value = 'mykey'.
APPEND ls_key_value TO lt_key_values.

SELECT *
  FROM ZMY_CDS_VIEW
  FOR ALL ENTRIES IN @lt_key_values
  WHERE prim_key_col EQ @lt_key_values-key_value
  INTO TABLE @DATA(lt_table2).
I do not understand why the same selection takes five times as long when utilising FOR ALL ENTRIES. Since the table lt_key_values has only 1 entry, I'd expect the database (sy-dbsys is 'DB6' in my case) to do exactly the same operations, plus maybe some small negligible overhead ≪ 40s.
Selecting from the underlying SQL view instead of the CDS (with its Access Control and so on) makes no difference at all, and neither does adding or removing the DISTINCT keyword (because FOR ALL ENTRIES implies DISTINCT).
A colleague guessed that FOR ALL ENTRIES actually selects the entire content of the CDS and compares it with the internal table lt_key_values at runtime. This seems about right.
Using transaction ST05, I recorded an SQL trace that looks like the following in the FOR ALL ENTRIES case:
SELECT
DISTINCT "ZMY_UNDERLYING_SQL_VIEW".*
FROM
"ZMY_UNDERLYING_SQL_VIEW",
TABLE( SAPTOOLS.MEMORY_TABLE( CAST( ? AS BLOB( 2G )) ) CARDINALITY 1 ) AS "t_00" ( "C_0" VARCHAR(30) )
WHERE
"ZMY_UNDERLYING_SQL_VIEW"."MANDT" = ?
AND "ZMY_UNDERLYING_SQL_VIEW"."PRIM_KEY_COL" = "t_00"."C_0"
[...]
Variables
A0(IT,13) = ITAB[1x1(20)]
A1(CH,10) = 'mykey'
A2(CH,3) = '100'
So what actually happens is: ABAP selects the entire CDS content and puts the value from the internal table into something like an additional column. Then it only keeps those rows where the internal table entry and the SQL result match. ==> No optimization on the database level => bad performance.

Select/Insert from identical table on two different DB connections

I have to do a select from a table and insert into another identical (same structure) table on two different DB connections.
This is my code:
from("direct:" + getId)
    .toD("sql:classpath:" + getSql1 + "?datasource=DataSourse1&usePlaceHolder=true")
    .setHeader("Results", simple("${body}"))
    .toD("sql:classpath:" + getSql2 + "?datasource=DataSourse2&usePlaceHolder=true");
where
getSql1: SELECT * FROM Product1
and
getSql2: INSERT INTO Product2 VALUES(${headers.results})
It does not work because of the data format I'm trying to insert, I suppose. What I get from the Select is something like this:
[{ID=130, DESCRIPTION=Product130}]
So, I need to clean my data and get only 130, 'Product130'
Any help? Thanks.
Assuming your actual code works and you're just transferring one record, change the getSql1 select option part to this:
"?datasource=DataSourse1&usePlaceHolder=true&outputType=SelectOne"
That puts a Map into your message body as the result, instead of a List of Maps. There is no need to copy the result into a header: with Camel's SQL component, named query parameters are looked up in your message body if the body type is a Java Map.
Next change your insert to use the parameters within the Map:
insert into Product2 (ID, DESCRIPTION) values (:#ID, :#DESCRIPTION)
Notice I included the column names. This is for safety and good practice. If by chance the column order isn't the same as the origin table, this will still work.
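As a sketch of the whole route under those assumptions (Camel's Java DSL, hypothetical registry bean names ds1 and ds2 for the two datasources, SQL inlined instead of loaded from classpath files, and the dataSource/outputType option spellings from the SQL component), it could look roughly like this:

import org.apache.camel.builder.RouteBuilder;

public class CopyProductRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("direct:copyProduct")
            // outputType=SelectOne puts a single Map in the body instead of a List of Maps
            .to("sql:SELECT ID, DESCRIPTION FROM Product1?dataSource=#ds1&outputType=SelectOne")
            // :#ID and :#DESCRIPTION are resolved from the keys of that Map
            .to("sql:INSERT INTO Product2 (ID, DESCRIPTION) VALUES (:#ID, :#DESCRIPTION)?dataSource=#ds2");
    }
}

If the select can return more than one row, one common approach is to drop outputType=SelectOne and split the resulting list (split(body())) so each Map is inserted individually.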

C# Comparing lists of data from two separate databases using LINQ to Entities

I have 2 SQL Server databases, hosted on two different servers. I need to extract data from the first database, which is going to be a list of integers. Then I need to compare this list against data in multiple tables in the second database. Depending on some conditions, I need to update or insert some records in the second database.
My solution:
(WCF Service/Entity Framework using LINQ to Entities)
Get the list of integers from the 1st DB; this takes less than a second and returns 20,942 records.
I use the list of integers to compare against a table in the second DB using the following query:
List<int> pastDueAccts; // Assuming this is the list from Step #1
var matchedAccts = from acct in context.AmAccounts
                   where pastDueAccts.Contains(acct.ARNumber)
                   select acct;
The above query is taking so long that it gives a timeout error, even though the AmAccount table only has ~400 records.
After I get these matchedAccts, I need to update or insert records in a separate table in the second DB.
Can someone help me do step #2 more efficiently? I think the Contains function makes it slow. I also tried brute force, putting a foreach loop in which I extract one record at a time and do the comparison. It still takes too long and gives a timeout error. The database server shows only 30% of the memory has been used.
Profile the SQL query being sent to the database by using SQL Profiler. Capture the SQL statement sent to the database and run it in SSMS. You should be able to capture the overhead imposed by Entity Framework at this point. Can you paste the SQL statement emitted in step #2 into your question?
The query itself is going to have all 20,942 integers in it.
If your AmAccount table will always have a low number of records like that, you could just return the entire list of ARNumbers, compare them to the list, then be specific about which records to return:
List<int> pastDueAccts; // Assuming this is the list from Step #1
List<int> amAcctNumbers = (from acct in context.AmAccounts
                           select acct.ARNumber).ToList();

// Get the integers that appear in both lists
var pastDueAmAcctNumbers = pastDueAccts.Intersect(amAcctNumbers);

var pastDueAmAccts = from acct in context.AmAccounts
                     where pastDueAmAcctNumbers.Contains(acct.ARNumber)
                     select acct;
You'll still have to worry about how many ids you are supplying to that query, and you might end up needing to retrieve them in batches.
UPDATE
Hopefully somebody has a better answer than this, but with so many records and doing this purely in EF, you could try batching it like I stated earlier:
// Suggest disabling auto detect changes,
// otherwise you will probably have some serious memory issues
// with 2MM+ records
context.Configuration.AutoDetectChangesEnabled = false;

List<int> pastDueAccts; // Assuming this is the list from Step #1
const int batchSize = 100;

for (int i = 0; i < pastDueAccts.Count; i += batchSize)
{
    // Don't let GetRange run past the end of the list
    var batch = pastDueAccts.GetRange(i, Math.Min(batchSize, pastDueAccts.Count - i));

    var pastDueAmAccts = (from acct in context.AmAccounts
                          where batch.Contains(acct.ARNumber)
                          select acct).ToList();

    // ...update or insert using this batch of matched accounts...
}