Rails 4+: only select specific columns from an included association - postgresql

I have a user model that belongs to an account (pretty standard). On every request, I need to retrieve the current_user, but I also want to load the account in the same step to reduce DB queries:
# ApplicationController
def current_user
  # Memoize the user and eager-load the account alongside it
  @current_user ||= User.includes(:account).find_by(auth_token: cookies[:auth_token])
end
def current_account
  @current_account ||= current_user.account
end
This works just fine. However, the account record has a lot of extra text data that I don't always want to return. A more permanent solution would be to move those extra fields to a new table called account_details and have a one-to-one between an account and account_details.
However, I need a shorter-term solution that will provide some optimizations until I can refactor that (perfection is an iterative process!).
I want to return ALL columns from the current_user, but I only want to select a few columns from the current_account.
I tried this and it didn't work:
@current_user ||= User.includes(:account).references(:account).select("users.*, account.id, account.uuid").find_by(auth_token: cookies[:auth_token])
When I look at @current_user.account it still displays all the columns with their values, and I only wanted the id and uuid columns.
I couldn't find any clear syntax on using select(...) with includes(:association); is this something that's possible?

You need to add an extra association in your User model.
belongs_to :account_columns, -> { select(:id, :uuid) }, class_name: 'Account', foreign_key: :account_id
This association should be used in your query.
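This is because Rails doesn't have a facility for passing options such as select to the query generated by includes.
Also consider using joins instead of includes in your query, as it also covers your need. The extra selected columns then become attributes on the returned User objects (here, account_uuid):
User.joins(:account).select("users.*, accounts.uuid AS account_uuid")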


How to execute graphql query for a specific schema in hasura?

As it can be seen in the following screenshot, the current project database (postgresql)
named default has these 4 schema - public, appcompany1, appcompany2 and appcompany3.
They share some common tables. Right now, when I want to fetch data for customers, I write a query like this:
query getCustomerList {
customer {
And it fetches the required data from public schema.
But according to the requirements, depending on user interactions in front-end, that query will be executed for appcompanyN (N=1,2,3,..., any positive integer). How do I achieve this goal?
NOTE: Whenever the user creates a new company, a new schema is created for that company. So the total number of schema is not limited to 4.
I suspect that you see a problem where none actually exists.
Everything is much simpler than it may seem.
A. Where are all those tables?
There are a lot of schemas with identical (or almost identical) objects inside them.
All tables are registered in hasura.
Hasura can't register different tables under the same name, so by default the names will be [schema_name]_[table_name] (except for public).
So the table customer will be registered as:
customer (from public)
appcompany1_customer (from appcompany1), and likewise for every other schema
It's possible to customize entity name in GraphQL-schema with "Custom GraphQL Root Fields".
B. The problem
But according to the requirements, depending on user interactions in front-end, that query will be executed for appcompanyN (N=1,2,3,..., any positive integer). How do I achieve this goal?
There are identical objects whose names differ only in the schema-name prefix.
So the solutions are trivial:
1. Dynamic GraphQL query
Application stores templates of GraphQL-queries and replaces prefix with real schema name before request.
query getCustomerList{
Substitute [schema] with appcompany1, appcompany2, ..., appcompanyZ and execute.
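A minimal sketch of that substitution step, written in Python purely for illustration. The template body (the `id` field), the `render_query` helper and the name pattern are assumptions, not part of the original answer. Because the schema name comes from user interaction, the sketch validates it before splicing it into the query text.

```python
import re

# Hypothetical template: the [schema] placeholder follows the convention above,
# and the selected field (id) is only an assumed example.
CUSTOMER_LIST_TEMPLATE = """
query getCustomerList {
  [schema]_customer {
    id
  }
}
"""

# Accept only names shaped like the schemas in the question (appcompany1, appcompany2, ...).
SCHEMA_NAME = re.compile(r"appcompany[1-9][0-9]*")


def render_query(template: str, schema: str) -> str:
    """Return the template with [schema] replaced by a validated schema name."""
    if not SCHEMA_NAME.fullmatch(schema):
        raise ValueError(f"unexpected schema name: {schema!r}")
    return template.replace("[schema]", schema)


query_text = render_query(CUSTOMER_LIST_TEMPLATE, "appcompany2")
```

The public schema carries no prefix, so it would need its own template or a special case in the helper.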
2. SQL view for all data
If tables are 100% identical then it's possible to create an sql view as:
CREATE VIEW all_customer AS
SELECT 'public' AS schema, * FROM public.customer
UNION ALL
SELECT 'appcompany1' AS schema, * FROM appcompany1.customer
UNION ALL
SELECT 'appcompany2' AS schema, * FROM appcompany2.customer
UNION ALL
SELECT 'appcompanyZ' AS schema, * FROM appcompanyZ.customer;
This way: no need for dynamic query, no need to register all objects in all schemas.
You need only to register view with combined data and use one query
query getCustomerList($schema: String) {
all_customer(where: {schema: {_eq: $schema}}){
About both solutions: it's hard to call them elegant.
I myself dislike them both ;)
So decide yourself which is more suitable in your case.
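For solution 2, the view has to be redefined every time a company schema is created. The following Python sketch shows one way the statement could be generated; the `build_all_customer_view` helper is hypothetical, and it assumes every schema has an identical customer table, as the answer requires.

```python
import re
from typing import Sequence

# Conservative identifier check, since schema names are spliced into SQL text.
IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def build_all_customer_view(schemas: Sequence[str]) -> str:
    """Build the view statement that unions customer tables across schemas."""
    if not schemas:
        raise ValueError("at least one schema is required")
    selects = []
    for name in schemas:
        if not IDENTIFIER.fullmatch(name):
            raise ValueError(f"unexpected schema name: {name!r}")
        # One branch per schema, tagged with the schema it came from.
        selects.append(f"SELECT '{name}' AS schema, * FROM {name}.customer")
    return (
        "CREATE OR REPLACE VIEW all_customer AS\n"
        + "\nUNION ALL\n".join(selects)
        + ";"
    )


ddl = build_all_customer_view(["public", "appcompany1", "appcompany2"])
```

The generated statement would be run against the database right after the new schema and its tables exist.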

Filter and display database audit / changelog (activity stream)

I'm developing an application with SQLAlchemy and PostgreSQL. Users of the system modify data in 8 or so tables. Consider this contrived example schema:
I want to add visible logging to the system to record what has changed, but not necessarily how it has changed. For example: "User A modified product Foo", "User A added user B" or "User C purchased product Bar". So basically I want to store:
Who made the change
A message describing the change
Enough information to reference the object that changed, e.g. the product_id and customer_id when an order is placed, so the user can click through to that entity
I want to show each user a list of recent and relevant changes when they log in to the application (a bit like the main timeline in Facebook etc). And I want to store subscriptions, so that users can subscribe to changes, e.g. "tell me when product X is modified", or "tell me when any products in store S are modified".
I have seen the audit trigger recipe, but I'm not sure it's what I want. That audit trigger might do a good job of recording changes, but how can I quickly filter it to show recent, relevant changes to the user? Options that I'm considering:
Have one column per ID type in the log and subscription tables, with an index on each column
Use full text search, combining the ID types as a tsvector
Use an hstore or json column for the IDs, and index the contents somehow
Store references as URIs (strings) without an index, and walk over the logs in reverse date order, using application logic to filter by URI
Any insights appreciated :)
Edit: It seems what I'm talking about is an activity stream. The suggestion in this answer to filter by time first is sounding pretty good.
Since the objects all use uuid for the id field, I think I'll create the activity table like this:
Have a generic reference to the target object, with a uuid column with no foreign key, and an enum column specifying the type of object it refers to.
Have an array column that stores generic uuids (maybe as text[]) of the target object and its parents (e.g. parent categories, store and organisation), and search the array for matching subscriptions. That way a subscription for a parent category can match a child in one step (denormalised).
Put a btree index on the date column, and (maybe) a GIN index on the array UUID column.
I'll probably filter by time first to reduce the amount of searching required. Later, if needed, I'll look at using GIN to index the array column (this partially answers my question "Is there a trick for indexing an hstore in a flexible way?")
Update: this is working well. The SQL to fetch a timeline looks something like this:
SELECT *
FROM (
    SELECT DISTINCT ON (activity.created, activity.id)
        activity.*, subscription.subscribed
    FROM activity
    LEFT OUTER JOIN unnest(activity.object_ref) WITH ORDINALITY AS act_ref
        ON true
    LEFT OUTER JOIN subscription
        ON subscription.object_id = act_ref.act_ref
    WHERE activity.created BETWEEN :lower_date AND :upper_date
        AND subscription.user_id = :user_id
    ORDER BY activity.created DESC,
        activity.id,
        act_ref.ordinality DESC
) AS sub
WHERE sub.subscribed = true;
Joining with unnest(...) WITH ORDINALITY, ordering by ordinality, and selecting distinct on the activity ID filters out activities that have been unsubscribed from at a deeper level. If you don't need to do that, then you could avoid the unnest and just use the array containment @> operator, and no subquery:
SELECT activity.*
FROM activity
JOIN subscription ON activity.object_ref @> ARRAY[subscription.object_id]
WHERE subscription.user_id = :user_id
AND activity.created BETWEEN :lower_date AND :upper_date
ORDER BY activity.created DESC;
You could also join with the other object tables to get the object titles - but instead, I decided to add a title column to the activity table. This is denormalised, but it doesn't require a complex join with many tables, and it tolerates objects being deleted (which might be the action that triggered the activity logging).
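The "deepest subscription wins" rule that the first query implements can be sketched in plain Python. This is only an illustration of the matching logic, not the query itself. It assumes the reference array is ordered from the most general parent to the target object (so a higher ordinality means a deeper level), and the `is_subscribed` helper and the sample ids are hypothetical.

```python
from typing import Mapping, Sequence


def is_subscribed(object_ref: Sequence[str], subscriptions: Mapping[str, bool]) -> bool:
    """Decide whether an activity is visible to one user.

    object_ref: ids from the most general parent down to the target object.
    subscriptions: that user's rows, keyed by object id, with the subscribed flag.
    """
    # Walk from the deepest level upwards; the first matching row decides.
    for object_id in reversed(object_ref):
        if object_id in subscriptions:
            return subscriptions[object_id]
    # No subscription row at any level: the activity is not shown.
    return False


user_subscriptions = {"store-1": True, "product-9": False}
visible = is_subscribed(["org-1", "store-1", "product-7"], user_subscriptions)
```

With these sample rows, activity on product-7 is shown through the store subscription, while activity on product-9 is hidden because the product-level row overrides the store-level one.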

How do I do conditional check, return error, or continue?

A user wants to invite a friend but I want to do a check first. For example:
SELECT friends_email from invites where friends_email = $1 limit 1;
If that finds one then I want to return a message such as "This friend already invited."
If that does not find one then I want to do an insert
INSERT INTO invites etc...
but then I need to return the primary user's region_id
SELECT region_id from users where user_id = $2
What's the best way to do this?
EDIT --------------------------------------------------------------
After many hours below is what I ended up with in 'plpgsql'.
IF EXISTS (SELECT * FROM invitations WHERE email = friends_email) THEN
    RETURN 'Already Invited';
ELSE
    INSERT INTO invitations (email) VALUES (friends_email);
    RETURN 'Invited';
END IF;
I understand that there are probably dozens of better ways, but this worked for me.
Without writing the exact code snippet for you...
Consider solving this problem by shaping your data to conform to your business rules. If you can only invite someone once, then you should have an "invites" table that reflects this by a UNIQUE rule across whatever columns define a unique invite. If it is just an email address, then declare the "invites.email" as a unique column.
Then do an INSERT. Write the insert so that it takes advantage of Postgres' RETURNING clause to give an answer on success. If the INSERT fails (because you already have that email address -- which was the point of the check you wanted to do), then catch the failure in your application code, and return the appropriate response.
catch error.UniqueFail
return "He's already been invited"
# ...do other stuff
(data fields + SELECT region thingy)
(some arrangement of data that includes "region_id")
RETURNING region_id
If that's hard to make work the first time you try it, phrasing the insert target as a CTE may be helpful. If all else fails, write it procedurally in plpgsql for the time being, making sure the external interface accepts a normal INSERT (so you don't have to change application code later) and sort it out once you know whether or not performance is an issue.
The basic idea here is to let the relational shape of your data obviate the need for any procedural checking wherever you can. That's at the heart of relational data modeling ...somewhat of a lost art these days.
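A small runnable sketch of this approach, under stated assumptions: it uses Python's built-in sqlite3 module in place of Postgres so that it is self-contained, the table layout and the `invite_friend` helper are hypothetical, and it looks up region_id with a second query rather than a RETURNING clause.

```python
import sqlite3


def invite_friend(conn: sqlite3.Connection, user_id: int, friends_email: str):
    """Insert an invite; return the inviter's region_id, or None if already invited."""
    try:
        with conn:  # commits on success, rolls back if the insert fails
            conn.execute(
                "INSERT INTO invites (user_id, friends_email) VALUES (?, ?)",
                (user_id, friends_email),
            )
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected a duplicate email address.
        return None
    row = conn.execute(
        "SELECT region_id FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


# In-memory database with an assumed minimal schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, region_id INTEGER NOT NULL);
CREATE TABLE invites (
    user_id INTEGER NOT NULL REFERENCES users(user_id),
    friends_email TEXT NOT NULL UNIQUE
);
INSERT INTO users (user_id, region_id) VALUES (1, 42);
""")
```

The application maps a None result to the "This friend already invited." message; no separate existence check is needed, and two concurrent requests cannot both succeed.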
You can create an SQL stored procedure to implement the functionality described above.
But it is wrong from an architecture point of view. See: Direct database manipulation an anti-pattern?
The DB has one scope of responsibility: storing data.
You should put business logic into your business layer.

FileMaker Pro 12 Auto-populating Tables

I'm new to FileMaker and need some advice on auto-populating tables.
Part 1:
I have TableA which includes many records with client information. I want a separate TableB which is identical to TableA except that it is "de-identified"; that is, it does not contain two of the fields, first name and last name.
I would like the two tables to interact such that if I add a new record to TableA, that same record (sans first and last name) appears automatically in TableB.
Part 2:
In addition to the above functionality, I would also like said functionality to depend on the value of a specific field in TableA. For example, I enter a new record, which has a "status" field set to "active," into TableA. I then want that record to be auto-populated into TableB; however, if I add another record with a "status" of "inactive," I want that record auto-populated into a TableC but not into TableB.
FileMaker can perform this with script triggers so long as every layout where TableA will be edited has a layout script trigger of OnRecordCommit connected to it. When the record is committed (which can happen in a number of ways), the attached script will run, which you can use to create the appropriate record in the appropriate table.
The script could create the record in a number of ways. If the primary keys for both records are the same, you could use lookups. You could export the record in TableA and then import it into the correct table. You could pass the field information as a parameter to the script. The best choice really depends on your needs.
Having said that, I would question the wisdom of this approach. It brings up a few questions that would seem to complicate matters. For example, what happens when the status changes? When a record in TableA is deleted? When fields in TableA are modified? Each of these contingencies (and others) will require thought and more complicated scripts.
So I would ask what problem you're really trying to solve. My best guess is that you are trying to keep the name information private from certain users. User accounts and privileges with dedicated layouts for each privilege can solve this without the need for duplicate tables. FileMaker privilege sets can be quite granular.
For example, you can specify that users with PrivilegeA can create records and view names, but PrivilegeB users can only view records if the status is "active" and the name fields are not available to them, while PrivilegeC users can view records if the status is "inactive" and the name fields are also not available to them.
I would definitely use filters and permissions on the "status" field to achieve this, and not two mirroring tables. Unless the inactive information is drastically different, you would be complicating your solution and creating more possible pitfalls.

Stored procedure list and parameter number used in conjunction with ComboBox

I'm trying to get a list of all the user defined stored procedures to populate a combobox with. The idea was to manually create a table with the following columns:
SProc Name, Number of Inputs, Parameter 1, Parameter 2 ...
The user is meant to click a button and a SProc selects all this data from that table, loads it into an array and populates the combo box.
The user is then meant to select a stored proc name from the combobox, and the number of parameters required is shown (with the relevant names).
As per our discussion in SO chat:
--AND PATINDEX('/*<SomeKeyToSearch>*/', sprocs.ROUTINE_DEFINITION) > 0
This will give you the list of all sprocs with their parameters and data types. Just be warned that you also need to pay close attention to data types (precision, scale, max length, etc), since this will be used to allow a user to call an arbitrary stored procedure. Once you get this entire table in your C# application, you can group/sort/limit based on whatever criteria you want. If you want to ensure only specific sprocs get returned from the above query, just add a top-level comment to the sproc with some sort of key that you can search on.
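The grouping step on the application side could look like the following sketch. It is written in Python rather than C# purely for illustration; the row shape (procedure name, ordinal position, parameter name, data type) and the `group_parameters` helper are assumptions about what the query returns.

```python
from collections import defaultdict


def group_parameters(rows):
    """Group flat (procedure, position, parameter, data_type) rows by procedure.

    A procedure without parameters is expected as a single row whose
    parameter fields are None; it maps to an empty list.
    """
    grouped = defaultdict(list)
    # Sort by procedure name, then by parameter position (None sorts as 0).
    for proc, position, param, data_type in sorted(
        rows, key=lambda r: (r[0], r[1] or 0)
    ):
        params = grouped[proc]  # registers the procedure even if it has no parameters
        if param is not None:
            params.append((param, data_type))
    return dict(grouped)


sample_rows = [
    ("usp_b", 2, "@y", "int"),
    ("usp_b", 1, "@x", "nvarchar"),
    ("usp_a", None, None, None),
]
procedures = group_parameters(sample_rows)
```

The keys of the result can populate the combobox, and the length of each list gives the number of inputs to prompt for.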
Good luck.