Find First and First Difference in Progress 4GL - progress-4gl

I'm not clear about below queries and curious to know what is the different between them even though both retrieves same results. (Database used sports2000).
FOR EACH Customer WHERE State = "NH",
FIRST Order OF Customer:
DISPLAY Customer.Cust-Num NAME Order-Num Order-Date.
END.
FOR EACH Customer WHERE State = "NH":
FIND FIRST Order OF Customer NO-ERROR.
IF AVAILABLE Order THEN
DISPLAY Customer.Cust-Num NAME Order-Num Order-Date.
END.
Please explain me
Regards
Suga

As AquaAlex says your first snippet is a join (the "," part of the syntax makes it a join) and has all of the pros and cons he mentions. There is, however, a significant additional "con" -- the join is being made with FIRST and FOR ... FIRST should never be used.
FOR LAST - Query, giving wrong result
It will eventually bite you in the butt.
FIND FIRST is not much better.
The fundamental problem with both statements is that they imply that there is an order which your desired record is the FIRST instance of. But no part of the statement specifies that order. So in the event that there is more than one record that satisfies the query you have no idea which record you will actually get. That might be ok if the only reason that you are doing this is to probe to see if there is one or more records and you have no intention of actually using the record buffer. But if that is the case then CAN-FIND() would be a better statement to be using.
There is a myth that FIND FIRST is supposedly faster. If you believe this, or know someone who does, I urge you to test it. It is not true. It is true that in the case where FIND returns a large set of records adding FIRST is faster -- but that is not apples to apples. That is throwing away the bushel after randomly grabbing an apple. And if you code like that your apple now has magical properties which will lead to impossible to cure bugs.
OF is also problematic. OF implies a WHERE clause based on the compiler guessing that fields with the same name in both tables and which are part of a unique index can be used to join the tables. That may seem reasonable, and perhaps it is, but it obscures the code and makes the maintenance programmer's job much more difficult. It makes a good demo but should never be used in real life.

Your first statement is a join statement, which means less network traffic. And you will only receive records where both the customer and order record exist so do not need to do any further checks. (MORE EFFICIENT)
The second statement will retrieve each customer and then for each customer found it will do a find on order. Because there may not be an order you need to do an additional statement (If Available) as well. This is a less efficient way to retrieve the records and will result in much more unwanted network traffic and more statements being executed.

Related

Check for object ownership with Prisma

I'm new to working with Prisma. One aspect that is unclear to me is the right way to check if a user has permission on an object. Let's assume we have Book and Author models. Every book has an author (one-to-many). Only authors have permission to delete books.
An easy way to enforce this would be this:
prismaClient.book.deleteMany({
id: bookId, <-- id is known
author: {
id: userId <-- id is known
}
})
But this way it's very hard to show an UnauthorizedError to the user. Instead, the response will be a 500 status code since we can't know the exact reason why the query failed.
The other approach would be to query the book first and check the author of the book instance, which would result in one more query.
Is there a best practice for this in Prisma?
Assuming you are using PostgreSQL, the best approach would be to use row-level-security(RLS) - but unfortunately, it is not yet officially supported by Prisma.
There is a discussion about this subject here
https://github.com/prisma/prisma/issues/5128
As for the current situation, to my opinion, it is better to use an additional query and provide the users with informative feedback rather than using the other method you suggested without knowing why it was not deleted.
Eventually, it is up to you to decide based on your use case - whether or not it is important for you to know the reason for failure.
So this question is more generic than prisma - it is also true when running updates/deletes in raw SQL.
When you have extra where clauses to check for ownership, it's difficult to infer which of the clause(s) caused that if the update does not happen, without further queries.
You can achieve this with row level security in postgres, but even that does not come out the box and involves custom configuration to throw specific exceptions when rows are not found due to row level security rules. See this answer for more detail.
I tend to think that doing customised stuff like this is rarely worth the tradeoff, unless you need specialised UX for an uncommon circumstance.
What I would suggest instead in this case is to keep it simple and just use extra queries to check for ownership, but optimise the UX optimistically for the case where the user does own the entity and keep that common and legitimate usecase to a single query.
That is, catch the exception from primsa (or the fact that the update returns 0 rows, or whatever it is in different cases), and only then run a specific select for ownership, to check if that was the reason the update failed.
This is a nice tradeoff because it keeps things simple, and only runs extra queries in the (usually) far less common failure case.
Even having said all that, the broader caveat as always is that probably the extra queries simply won't matter regardless! It's, in 99% of cases, probably best to just always run the extra ownership query upfront as a pattern to keep things as simple as possible, which is what you really want to optimise for over performance until you're running at significant scale.

Regarding Microservices Fuzzy Boundaries

I am reading Microservices Patterns by Chris. In his book, he gave some example, which I could not able to understand section 5.2.1. The problem with fuzzy boundaries
Here is the link to read online. Can you someone please look into section 5.2.1 and help me understand what exactly the issue with fuzzy boundaries?
I didn't get clearly especially below statement:
In this scenario, Sam reduces the order total by $X and Mary reduces it by $Y. As a result, the Order is no longer valid, even though the application verified that the order still satisfied the order minimum after each consumer’s update
In above statement, can someone please explain me, why Order is no longer valid?
In above statement, can someone please explain me, why Order is no longer valid?
The business problem that Chris Richardson is using in this example assumes that (a) the system should ensure that orders are always valid, and (b) that valid orders exceed some minimum amount.
Minimum amount is determined by a sum of the order_items associated with a specific order.
The "fuzzy boundary" issue comes about because the code in question allows Sam and Mary to manipulate order_items directly; in other words, writing changes to order items does not lock the other items of the order.
If Sam and Mary were forced to acquire a lock on the entire order before validating their changes, then you wouldn't have a problem; the second person would see the changes made by the first.
Alternatively, locking at the level of the order_item would be fine if you weren't trying to ensure that the set of order items satisfy some property. Take away the constraint on the total order cost, and Sam and Mary only need to get locks on their specific item.

is this an SQL injection

In the apache access logs I found the following code as query string (GET), submitted multiple times each second for quite a while from one IP:
**/OR/**/ASCII(SUBSTRING((SELECT/**/COALESCE(CAST(LENGTH(rn)/**/AS/**/VARCHAR(10000))::text,(CHR(32)))/**/FROM/**/"public".belegtable/**/ORDER/**/BY/**/lv/**/OFFSET/**/1492/**/LIMIT/**/1)::text/**/FROM/**/1/**/FOR/**/1))>9
What does it mean?
Is this an attempt of breaking in via injection?
I have never seen such a statement and I don't understand its meaning. PostgreSQL is used on the server.
rn and belegtable exist. Some other attempts contain other existing fields/tables. Since the application is very costum, I don't know how the information on existing SQL fields can be known to strangers. Very weird.
**/
OR ASCII(
SUBSTRING(
( SELECT COALESCE(
CAST(LENGTH(rn) AS VARCHAR(10000))::text,
(CHR(32))
)
FROM "public".belegtable
ORDER BY lv
OFFSET 1492
LIMIT 1
)::text
FROM 1
FOR 1
)
) > 9
Is this an attempt of breaking in via injection?
The query in question does not have too many characteristics of an attempted SQL injection.
An SQL injection typically involves inserting an unwanted action into some section of a bigger query, under the disguise of a single value. Typically the injected part tries to guess what comes before it, neutralise it, do something malicious and secure the entire query from syntax errors by also neutralising what comes after the injected piece, which might not be visible to the attacker.
I don't see anything that could work as an escape sequence at the beginning or anything that would neutralise the remains of the query coming in after the injection. What this query does also isn't malicious. An SQL injection would attempt to extract some additional information about the database security, structure and configuration, or - if the attacker already gathered enough data - it would try to steal the data, encrypt it or tamper with it otherwise, depending on the aim and strategy of the attacker as well as the type of data found in the database. There also wouldn't be much point looping it like that.
As to the looping part: if someone attempted to put load on the database - as in DDoS - you'd likely see more than one node doing that and probably in a more elaborate and well disguised manner, using different and more demanding queries sent at different frequencies.
What does it mean?
It's likely someone's buggy code stuck in an unterminated loop, judging by the LIMIT and OFFSET mechanism I've seen used for looping through some set of records by taking one at a time (LIMIT 1) and incrementing which one to get next (OFFSET n). The whole expression always returns true because ASCII() returns the character code of the first character in the string. That string defaults to a space ' ', ASCII code 32, or some text representation of a number between 0 and 99999. Sice all ASCII digits are between code 48 and 57, it's effectively always comparing some bigger number than 9 to a 9, checking if it indeed is bigger.
The author of that code might not have predicted the loop to be able to run infinitely and might have misinterpreted what some of the functions used in that query do. Regardless of what really happened, I think it was a good idea to cut off that IP avoiding needless stress on the database. Double-checking your security setup is always a good idea but I wouldn't call this an attempted attack. At least not this query alone, as it might be a harmless piece of a bigger, more malicious operation - but that could be said about any query.

SQL Database Design - Flag or New Table?

Some of the Users in my database will also be Practitioners.
This could be represented by either:
an is_practitioner flag in the User table
a separate Practitioner table with a user_id column
It isn't clear to me which approach is better.
Advantages of flag:
fewer tables
only one id per user (hence no possibility of confusion, and also no confusion in which id to use in other tables)
flexibility (I don't have to decide whether fields are Practitioner-only or not)
possible speed advantage for finding User-level information for a practitioner (e.g. e-mail address)
Advantages of new table:
no nulls in the User table
clearer as to what information pertains to practitioners only
speed advantage for finding practitioners
In my case specifically, at the moment, practitioner-related information is generally one-to-many (such as the locations they can work in, or the shifts they can work, etc). I would not be at all surprised if it turns I need to store simple attributes for practitioners (i.e., one-to-one).
Questions
Are there any other considerations?
Is either approach superior?
You might want to consider the fact that, someone who is a practitioner today, is something else tomorrow. (And, by that I don't mean, not being a practitioner). Say, a consultant, an author or whatever are the variants in your subject domain, and you might want to keep track of his latest status in the Users table. So it might make sense to have a ProfType field, (Type of Professional practice) or equivalent. This way, you have all the advantages of having a flag, you could keep it as a string field and leave it as a blank string, or fill it with other Prof.Type codes as your requirements grow.
You mention, having a new table, has the advantage for finding practitioners. No, you are better off with a WHERE clause on the users table for that.
Your last paragraph(one-to-many), however, may tilt the whole choice in favour of a separate table. You might also want to consider, likely number of records, likely growth, criticality of complicated queries etc.
I tried to draw two scenarios, with some notes inside the image. It's really only a draft just to help you to "see" the various entities. May be you already done something like it: in this case do not consider my answer please. As Whirl stated in his last paragraph, you should consider other things too.
Personally I would go for a separate table - as long as you can already identify some extra data that make sense only for a Practitioner (e.g.: full professional title, University, Hospital or any other Entity the Practitioner is associated with).
So in case in the future you discover more data that make sense only for the Practitioner and/or identify another distinct "subtype" of User (e.g. Intern) you can just add fields to the Practitioner subtable, or a new Table for the Intern.
It might be advantageous to use a User Type field as suggested by #Whirl Mind above.
I think that this is just one example of having to identify different type of Objects in your DB, and for that I refer to one of my previous answers here: Designing SQL database to represent OO class hierarchy

Can I avoid a relation loop in my database design?

I try to design database tables for the case shown below. I also have an account defined, but it's not important regarding my problem.
There is a list of operations (expenses). Each operation can take place in specified POI, places can be grouped in chains (optional). Each operation can have a recipient, specifically a shop chain.
My current design looks like below. I could even remove chain table in favor of direct reference to recipient, but it still leaves a loop between tables. Effectively, single row could contain references to place and receiving account having different recipient defined.
The only solution I can see is a table check to exclude described case, but I'm wondering: is there a better fix?
As far as I can tell there isn't anything fundamentally wrong with your design. There's no need to change it just because it contains a loop. The loop in this case doesn't even appear to be a circular dependency. If you believe your current design accurately models what it is intended to then I see no need to change it.