Filemaker summarize related field if - filemaker

I have a FM13 DB with a table "machines" and its related table "consumption" connected with the IDs.
My layout is showing the machines data and a portal with the related consumption entries.
Now I want to summarize the fields "amount" of all related entries where "fuelType" is "diesel" and the "year" is "2015" into one calculted field within the machines table.
Can anyone give me a clue how to do that?
thx
dan

In addition to the ways Michael suggested, FM13 introduced the ExecuteSQL command, which can used in a calculated field. The calculation would look something like this:
ExecuteSQL (
"SELECT SUM(amount)" & ¶ &
"FROM consumption" & ¶ &
"WHERE FuelType = 'diesel' AND \"Year\" = 2015" ;
"" ; ""
)

Create another relationship based on a calculation field in both tables based on concatenate fields:
id_fueltype_year: 12345_diesel_2015
You can then create a Calculation field in Machines containing Sum(newrelation: amount)
If you hard code the fueltype in the calc field machines table, then the result will always be the specified fuel type (e.g. diesel). Thus you would need to create multiple fields for each field type. Or you could create a fuel type field and as you change it the summary calc field would update upon selection.

If this is for display only, you could simply create a one-row portal based on the existing relationship and filter it to show only records where =
Consumption::FuelType = "diesel" and Consumption::Year = 2015
or (preferably, IMHO) =
Consumption::FuelType = Machines::gFuelType
and
Consumption::Year = Machines::gYear
with Machines::gFuelType and Machines::gYear being global fields where you can select any type/year to summarize.
Place a summary field defined (in the Consumption table) as Total of [Amount] inside the filtered portal.
If you need the result as data for further processing, then you will need to add a dedicated relationship (using another occurrence of the Consumption table) as:
Machines::MachineID = Consumption 2::MachineID
AND
Machines::gFuelType = Consumption 2::FuelType
AND
Machines::gYear = Consumption 2::Year
and use Sum ( Consumption 2::Amount ) to summarize the relevant entries.

Related

Statistics of all/many tables in FileMaker

I'm writing a kind of summary page for my FileMaker solution.
For this, I have define a "statistics" table, which uses formula fields with ExecuteSQL to gather info from most tables, such as number of records, recently changed records, etc.
This strangely takes a long time - around 10 seconds when I have a total of about 20k records in about 10 tables. The same SQL on any database system shouldn't take more than some fractions of a second.
What could the reason be, what can I do about it and where can I start debugging to figure out what's causing all this time?
The actual code is, like this:
SQLAusführen ( "SELECT COUNT(*) FROM " & _Stats::Table ; "" ; "" )
SQLAusführen ( "SELECT SUM(\"some_field_name\") FROM " & _Stats::Table ; "" ; "" )
Where "_Stats" is my statistics table, and it has a string field "Table" where I store the name of the other tables.
So each row in this _Stats table should have the stats for the table named in the "Table" field.
Update: I'm not using FileMaker server, this is a standalone client application.
We can definitely talk about why it may be slow. Usually this has mostly to do with the size and complexity of your schema. That is "usually", as you have found.
Can you instead use the DDR ( database design report ) instead? Much will depend on what you are actually doing with this data. Tools like FMPerception also will give you many of the stats you are looking for. Again, depends on what you are doing with it.
Also, can you post your actual calculation? Is the statistic table using unstored calculations? Is the statistics table related to any of the other tables? These are a couple things that will affect how ExecuteSQL performs.
One thing to keep in mind, whether ExecuteSQL, a Perform Find, or relationship, it's all the same basic query under-the-hood. So if it would be slow doing it one way, it's going to likely be slow with any other directly related approach.
Taking these one at a time:
All records count.
Placing an unstored calc in the target table allows you to get the count of the records through the relationship, without triggering a transfer of all records to the client. You can get the value from the first record in the relationship. Super light way to get that info vs using Count which requires FileMaker to touch every record on the other side.
Sum of Records Matching a Value.
using a field on the _Stats table with a relationship to the target table will reduce how much work FileMaker has to do to give you an answer.
Then having a Summary field in the target table so sum the records may prove to be more efficient than using an aggregate function. The summary field will also only sum the records that match the relationship. ( just don't show that field on any of your layouts if you don't need it )
ExecuteSQL is fastest when it can just rely on a simple index lookup. Once you get outside of that, it's primarily about testing to find the sweet-spot. Typically, I will use ExecuteSQL for retrieving either a JSON object from a user table, or verifying a single field value. Once you get into sorting and aggregate functions, you step outside of the optimizations of the function.
Also note, if you have an open record ( that means you as the current user ), FileMaker Server doesn't know what data you have on the client side, and so it sends ALL of the records. That's why I asked if you were using unstored calcs with ExecuteSQL. It can seem slow when you can't control when the calculations fire. Often I will put the updating of that data into a scheduled script.

Apex query optimization

I am trying this query:
List<Account> onlyRRCustomer = [SELECT
ac.rr_First_Name__c,
ac.rr_Last_Name__c,
ac.rr_National_Insurance_Number__c,
ac.id,
ac.rr_Date_of_Birth__c
FROM
Account ac
WHERE
ac.rr_National_Insurance_Number__c IN :uniqueNiInputSet
AND RecordTypeId = :recordTypeId];
It gives me an error:
SELECT ac.rr_First_Name__c, ac.rr_Last_Name__c,
ac.rr_National_Insurance_Number__c, ac.id, ac.rr_Date_of_Birth__c FROM
Account ac WHERE (ac.rr_National_Insurance_Number__c = :tmpVar1 AND
RecordTypeId = :tmpVar2) 10:12:05.0
(11489528)|EXCEPTION_THROWN|[49]|System.QueryException: Non-selective
query against large object type (more than 200000 rows). Consider an
indexed filter or contact salesforce.com about custom indexing.
I understand uniqueNiInputSet.size() ~ 50, so, it's not an issue but for that record type, it might contains more records.
So, if i changed the position will that work? Means, first the recordtype and then the NIset in where clause. Is there any order how where clause are selected in SF. So, it will only look for 50 member and then within 50 it will serach for the particular record type?
That just means that the script is taking too long to execute. You may need to move this to a #future method or make execute it using Database.Batchable.
I don't think the order matters in SOQL, I think it's just trying to return too many records.
A non-selective query means you are performing a query against a table that has a large number of records and your query is not specific enough. You can work with Salesforce support to try to resolve this, either through the creation of additional backend indexes or by making the query more selective.
To be honest, your query looks very selective already, you're not using LIKE or IN. You should also put your most selective conditions first (resulting in a more focused query against your records).
I know it should'nt matter, but I would also move your conditions out of the parenthesis.
If there are any other fields you can filter on, that may help. Sometimes, you have to actually create new fields and populate them just to help make your queries more selective.
Also, if rr_National_Insurance_Number__c is a formula field, you will want to change it to a text field and populate workflow or apex instead. Formula fields require additional time on the servers to calculate.
SELECT rr_First_Name__c, rr_Last_Name__c, rr_National_Insurance_Number__c, id, rr_Date_of_Birth__c
FROM Account
WHERE new_custom_field__c = TRUE
AND rr_National_Insurance_Number__c = :tmpVar1
AND RecordTypeId = :tmpVar2
Your query is non-selective. For a standard indexes is 30% for the fist million records and 15% of records over a million up to 1 million records total. For and "AND" query each individual where criteria must itself be selective see this quick reference cheat sheet. In general try making
rr_National_Insurance_Number__c
an external id which will make it an indexed by salesforce by default and retry you query. Record Types are already indexed by default. If the result is still non-selective because of the number of results returned, try limiting the number of results using a field like CreatedDate to limit the scope of the query.

Converting complex query with inner join to tableau

I have a query like this, which we use to generate data for our custom dashboard (A Rails app) -
SELECT AVG(wait_time) FROM (
SELECT TIMESTAMPDIFF(MINUTE,a.finished_time,b.start_time) wait_time
FROM (
SELECT max(start_time + INTERVAL avg_time_spent SECOND) finished_time, branch
FROM mytable
WHERE name IN ('test_name')
AND status = 'SUCCESS'
GROUP by branch) a
INNER JOIN
(
SELECT MIN(start_time) start_time, branch
FROM mytable
WHERE name IN ('test_name_specific')
GROUP by branch) b
ON a.branch = b.branch
HAVING avg_time_spent between 0 and 1000)t
GROUP BY week
Now I am trying to port this to tableau, and I am not being able to find a way to represent this data in tableau. I am stuck at how to represent the inner group by in a calculated field. I can also try to just use a custom sql data source, but I am already using another data source.
columns in mytable -
start_time
avg_time_spent
name
branch
status
I think this could be achieved new Level Of Details formulas, but unfortunately I am stuck at version 8.3
Save custom SQL for rare cases. This doesn't look like a rare case. Let Tableau generate the SQL for you.
If you simply connect to your table, then you can usually write calculated fields to get the information you want. I'm not exactly sure why you have test_name in one part of your query but test_name_specific in another, so ignoring that, here is a simplified example to a similar query.
If you define a calculated field called worst_case_test_time
datediff(min(start_time), dateadd('second', max(start_time), avg_time_spent)), which seems close to what your original query says.
It would help if you explained what exactly you are trying to compute. It appears to be some sort of worst case bound for avg test time. There may be an even simpler formula, but its hard to know without a little context.
You could filter on status = "Success" and avg_time_spent < 1000, and place branch and WEEK(start_time) on say the row and column shelves.
P.S. Your query seems a little off. Don't you need an aggregation function like MAX or AVG after the HAVING keyword?

FileMaker database design with calculated fields and filtering

I am trying out Filemaker Pro 12 right now with no previous FM experience, although other basic DB experience. The issue I have is trying to do filtered queries for a report that span one-to-many relationships. Here is an example;
The 2 tables:
Sample_Replicate
PK
Sample FK
other fields
Weights
Sample_Replicate_FK (linked to PK of Sample_Replicate)
Weight
Measurement type (tare, gross, dry, ash)
Wash type (null or from list of lab assays)
I want to create a report that displays: (gross-tare), (dry-tare)/(gross-tare), (ash-tare)/(gross-tare), and (dry-tare)/(gross-tare) for all dry weights with non null wash types.
It seems that FM wants me to create columns for each of these values (which is doable as the list of lab assays changes minimally and updating the database would be acceptable, though not preferred). I have tried to add a gross wt, tare wt, etc to the Sample_Replicate table, but it only is returning the first record (tare wt) when I use calculated field and method:
tare wt field = Case ( Weights::Measurement type = "Tare"; Weights::Weights )
gross wt field = Case ( Weights::Measurement type = "Gross"; Weights::Weights )
etc...
It also seems to be failing when I add the criteria:
and Is Empty(Weights::Wash type )
Could someone point me in the right direction on this issue. Thanks
EDIT:
I came across this: http://www.filemakertoday.com/com/showthread.php/14084-Calculation-based-on-1-to-many-relationship
It seems that I can create ~15 calculated fields for each combination of measurement and wash type on the weights table, then do a sum of these columns in the sample_replicate after adding these 15 columns to the table. This seems absolutely asinine. Isn't there a better way to filter results of a one-to-many relationship in FM?
What about the following structure:
Replicate
ID
Wash Weight
Replicate ID
Type (null or from list of lab assays)
Tare
Gross
Dry
Ash
+ calculated fields
I assume you only calculate weight ratios of the same wash type. The weight types (tare, gross, etc.) are not just labels here; since you use them in formulas in specific places, they are more like roles, so I think they deserve their own fields.
add tare wt field, etc. in the Weights table but then add a calc field in your Sample_Replicate table to get the sum of all related values
ex: add field "total tare wt" to be "sum ( Weights::tare wt)"

simplest example of a query by date range in cassandra 1.x

I want to store an ID and a date and I want to retrieve all entries from dateA up to dateB, what exactly do I need to be able to perform select from my_column_family where date >= dateA and date < dateB; ?
the guys at #cassandra (IRC) helped me find a way, there's many subtle details so I'd like to document that here.
first you need to declare a column family similar to this (examples from cassandra-cli):
create column family users with comparator=UTF8Type and key_validation_class=UTF8Type and column_metadata=[
{column_name: id, validation_class: LongType}
{column_name: name, validation_class: UTF8Type, index_type: KEYS}
{column_name: age, validation_class: LongType}
];
few important things about this declaration:
the comparator and key_validation_class are there to be able to use strings as key names
the first declared column is special, it's the "row key" which is used to address each row and therefore cannot contain duplicate values (the INSERT is really an UPSERT so when there's duplicates the new values overwrite the old ones)
the second column declares a "secondary index" on its values (more on that below)
the dates are stored as Long datatypes, interpretation is up to the client
now let's add some values:
set users[1][name] = john;
set users[1][age] = 19;
set users[2][name] = jane;
set users[2][age] = 21;
set users[3][name] = john;
set users[3][age] = 32;
according to this: http://pkghosh.wordpress.com/2011/03/02/cassandra-secondary-index-patterns/ Cassandra does not support the < operators, what it does is to manually exclude the rows that don't match but it does that AFTER there's a resultset and it also refuses to do so unless and actual filtering has taken place.
what that means is that a query like get users where age > 20; will return null but if we add a predicate that includes = it'll magically work.
here's where the secondary index is important, without it you can't use = so on this example I can do get users where name = jane; but I cannot ask for get users where age = 21;
the funny thing is that, after using = the < works so having a secondary index allows you to ask for get users where name = john and age > 20; and it'll filter correctly.
There are a few ways to solve this. The simplest is probably the secondary index solution with the equality limitation mentioned in your own answer. I've used this method, adding an additional column called 'valid', setting the value to 1. Then the queries can become where valid=1 and date>nnnn
The other solutions require additional column families and additional queries.
When loading the data, create and add to a column family which contains the timestamps as keys, and each entry would list all the user ids as column names.
If the partitioning strategy is ordered, then a single RangeSliceQuery can specify the date range as a key range and get all the columns for each key. Then iterate through the result keys, using the column values for each user id and if needed, query the original column family for the data associated with each id. Cassandra always stores the column names sorted, and can be reversed when reading.
But, as documented, the ordered partitioner is not ideal, leading to hot spots and difficulty in load balancing the nodes.
Without the ordered partitioner, still keeping the timestamp column family, you would have to create another column family while loading data where you can store all the timestamps as the columns under one or more known keys (e.g. 'created' or 'updated'). The first query would be a SliceQuery for a known key, and then the column names (as timestamps) would provide the keys for the MultigetSliceQuery to the timestamp column family.
I've used variations on this, usually adding Composite keys or columns for additional flexibility.