I have the following strategy for full text search in my web app, which uses PostgreSQL for relational data storage. I'll take the invoices table as an example.
In each such table I have one additional column, added with ALTER TABLE invoices ADD COLUMN tsv tsvector, and the full text search query runs against it like this: ... WHERE tsv @@ to_tsquery('query:*') ...
On every full text search table I have an update trigger that refreshes the tsv column on every change to the record. The trigger concatenates data from different fields into tsv, sets the right weights, etc.
The data that goes into the tsv column can also be relational data from other tables. For example, the invoices table has a client_id field, but since I want to search invoices by client name as well, I also include clients.client_name in invoices.tsv.
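A simplified sketch of what such a trigger looks like (column names like invoice_number are just for illustration, and the weights are arbitrary):

CREATE FUNCTION invoices_tsv_refresh() RETURNS trigger AS $$
BEGIN
  -- rebuild tsv from the invoice's own fields plus the client's name
  SELECT setweight(to_tsvector(coalesce(NEW.invoice_number, '')), 'A') ||
         setweight(to_tsvector(coalesce(c.client_name, '')), 'B')
    INTO NEW.tsv
    FROM clients c
    WHERE c.id = NEW.client_id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER invoices_tsv_refresh
  BEFORE INSERT OR UPDATE ON invoices
  FOR EACH ROW EXECUTE PROCEDURE invoices_tsv_refresh();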
My question is: what is the best strategy for keeping the relational data in the tsv columns in sync? In the scenario above, if a client's name changes I would need to update the tsv field of every one of that client's invoices...
Should I set up a cron job that does this every night? It could also be done with triggers, but since my database schema is very large I am scared it might get out of control if I have triggers all over the place.
If you add the client's name into the tsv field you will end up with more complexity. You might want to look into materialized views, as mentioned in this article. The trade-off might be speed in showing results versus the need to refresh the view periodically. As of Postgres 9.4 you can refresh a materialized view concurrently.
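For example, a sketch of a materialized view carrying the search data (names are illustrative; the unique index is required for a concurrent refresh):

CREATE MATERIALIZED VIEW invoice_search AS
SELECT i.id,
       setweight(to_tsvector(coalesce(i.invoice_number, '')), 'A') ||
       setweight(to_tsvector(coalesce(c.client_name, '')), 'B') AS tsv
  FROM invoices i
  JOIN clients c ON c.id = i.client_id;

CREATE UNIQUE INDEX invoice_search_id_idx ON invoice_search (id);
CREATE INDEX invoice_search_tsv_idx ON invoice_search USING gin (tsv);

-- run periodically, e.g. from a nightly cron job;
-- CONCURRENTLY avoids locking out readers during the refresh
REFRESH MATERIALIZED VIEW CONCURRENTLY invoice_search;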
Another thing you could do is create an update trigger on the clients table that updates the data in the invoices table whenever a client changes.
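A sketch of that approach, assuming the invoices trigger from the question rebuilds tsv on any update of the row:

CREATE FUNCTION clients_name_sync() RETURNS trigger AS $$
BEGIN
  -- a no-op update still fires the invoices row trigger,
  -- which recomputes tsv with the new client name
  UPDATE invoices SET client_id = client_id WHERE client_id = NEW.id;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER clients_name_sync
  AFTER UPDATE OF client_name ON clients
  FOR EACH ROW EXECUTE PROCEDURE clients_name_sync();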
My flow is simple: I am just reading a raw file into a SQL table.
At times the raw file contains data corresponding to existing records. In that case I do not want to insert a new record, only update the existing record in the SQL table. The challenge is that there is a 'record creation date' column which I initialize at the time of record creation, and the update operation overwrites that column too. I just want to avoid overwriting that column while updating the other columns with the information coming from the raw file.
So far I have no idea how to do that. Could someone make a recommendation?
I set the creation column to auto-populate via a default in the SQL database itself, and I changed my flow to just update the remaining columns. The Talend job now doesn't touch that column. Problem solved.
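In SQL terms the idea looks roughly like this (a sketch; the names are made up, and the exact default syntax varies by DBMS, e.g. SQL Server uses ADD without COLUMN and GETDATE()):

ALTER TABLE target_table
  ADD COLUMN created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP;

-- the creation date fills itself in on INSERT and is simply
-- never listed in the UPDATE afterwards
UPDATE target_table
  SET col_a = 'new value from raw file', col_b = 42
  WHERE business_key = 'key from raw file';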
Yet another reminder of 'Simplification is underrated'. :)
I'm creating a compliance mailing for my organization. The mailing will include merge fields that identify the office location, physician, and SiteId, and it will also include a table of information that depends on the particular SiteId.
I'd like to use the import table function of MS Word and set up a query that references a merge field (SiteId) so that the inserted tables are populated with the appropriate data for the particular site. I'm unable to do this.
How can I set up this document so that I can import only records from my source (an MS Access query) that match the SiteId merge field?
Word's mail merge does not support one-to-many relationships. There are ways to coerce it, but only one of them can yield a table as a result, and over the years that approach has become less and less reliable, as Microsoft has not regarded it as important enough to maintain.
What you need to do is set up a query that provides ONLY the information you want displayed in the table, plus the key (SiteId). It's best to sort it so that all the entries for a given SiteId are listed together, in the order the data will come through in the mail merge data source.
On the Insert tab go to Text/Quick Parts/Insert Field and select the Database field from the list in the dialog box. Click "Insert Database" and follow the instructions in the dialog box to link in the data. Be sure to set the Query Options to filter on the first SiteId from the data source. When you "Insert Data" make sure to choose the option to "Insert as a field".
This inserts a DATABASE field in the document which you can see by toggling field codes (Alt+F9). The field code can be edited and what you need to do is substitute the literal SiteId value you entered for the query with its corresponding MergeField.
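Toggled open, the field looks something like this (a schematic example; the path, query, and switches will differ in your document, and the nested field braces must themselves be inserted with Ctrl+F9, not typed):

{ DATABASE \d "C:\\Data\\Compliance.accdb" \s "SELECT * FROM SiteQuery WHERE SiteId = '{ MERGEFIELD SiteId }'" }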
When you execute the merge to a new document, that should generate, for each data record, a table corresponding to that record's SiteId. But, as I said, Microsoft hasn't done a great job of maintaining this, so it may require quite a bit of tweaking and experimenting.
If the results are not satisfactory then you should give up the idea of mail merge and use automation code to generate and populate the documents.
You can find more (albeit somewhat out-dated) information on this topic at http://homepage.swissonline.ch/cindymeister/mergfaq1.htm
I'm using Play 2.2.1 with Scala, and I'm managing the database with evolutions. When I run my application after making changes to the database, it drops all my data.
At the moment the line evolutionplugin = disabled is commented out. If I uncomment it, my changes are not applied.
For example, I have a users table with the columns id, f_name, l_name:
id | f_name | l_name
---+--------+-------
 1 | khazo  | rasp
And I want to add an age field to this table without losing data. I've added this field in the Scala files and it works properly. I'm assuming I need to write a script with an ALTER command in 1.sql, but I don't want to write the script by hand.
How do I apply new changes without dropping the current data in the db? I've read this documentation. Thanks in advance.
I've added this field in the Scala files
In Slick (you have the play-slick tag), you can specify a default value in your Table definition.
See the documentation here, under Tables:
Default[T](defaultValue: T)
Specify a default value for inserting data into the table without this column. This information is only used for creating DDL statements so that the database can fill in the missing information.
I am not sure whether it gets translated into an ALTER statement if the table already exists. You will have to test it.
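If it doesn't, the usual evolutions route is to add a new evolution script rather than editing 1.sql, e.g. a conf/evolutions/default/2.sql along these lines (a sketch for the users table above):

# --- !Ups
ALTER TABLE users ADD COLUMN age INT;

# --- !Downs
ALTER TABLE users DROP COLUMN age;

Because the new script only alters the table, applying it preserves the existing rows.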
I have two tables in APEX that are linked by their primary key. One table (APEX_MAIN) holds the basic metadata of a document in our system and the other (APEX_DATES) holds important dates related to that document's processing.
For my team I have created a control panel where they can interact with all of this data. The issue is that right now they alter the information in APEX_MAIN on one page, then alter APEX_DATES on another. I would really like to have these forms on the same page and submit updates to their respective tables and rows with a single submit button. I have currently set this up using two different regions on the same page, but I am getting errors both with the initial fetching of the rows (whichever row is fetched second seems to work, but then the page items in the form that was fetched first are empty) and with submitting (it gives an error about information in the DB having been altered since the update request was sent). Can anyone help me?
It is a limitation of the built-in Apex forms that you can only have one automated row fetch process per page, unfortunately. You can have more than one form region per page, but you have to code all the fetch and submit processing yourself if you do (not that difficult really, but you need to take care of optimistic locking etc. yourself too).
Splitting one table's form over several regions is perfectly possible, even using the built-in form functionality, because the region itself is just a layout object, it has no functionality associated with it.
Building forms manually is quite straightforward but a bit more work.
Items
These should have the source set to "Static Text" rather than database column.
Buttons
You will need buttons like Create, Apply Changes, and Delete that submit the page. These need unique request values so that you know which table is being processed, e.g. CREATE_EMP. You can make the buttons display conditionally, e.g. Create only when the PK item is null.
Row Fetch Process
This will be a simple PL/SQL process like:
select ename, job, sal
  into :p1_ename, :p1_job, :p1_sal
  from emp
  where empno = :p1_empno;
It will need to be conditional so that it only fires on entry to the form and not after every page load - otherwise if there are validation errors any edits will be lost. This can be controlled by a hidden item that is initially null but set to a non-null value on page load. Only fetch the row if the hidden item is null.
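For example, guarded by a hidden item P1_FETCHED (a hypothetical name, set to a non-null value after the first fetch):

if :p1_fetched is null then
  select ename, job, sal
    into :p1_ename, :p1_job, :p1_sal
    from emp
    where empno = :p1_empno;
  :p1_fetched := 'Y';
end if;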
Submit Process(es)
You could have 3 separate processes for insert, update, delete associated with the buttons, or a single process that looks at the :request value to see what needs doing. Either way the processes will contain simple DML like:
insert into emp (empno, ename, job, sal)
  values (:p1_empno, :p1_ename, :p1_job, :p1_sal);
Optimistic Locking
I omitted this above for simplicity, but one thing the built-in forms do for you is handle "optimistic locking" to prevent 2 users updating the same record simultaneously, with one's update overwriting the other's. There are various methods you can use to do this. A common one is to use OWA_OPT_LOCK.CHECKSUM to compare the record as it was when selected with as it is at the point of committing the update.
In fetch process:
select ename, job, sal, owa_opt_lock.checksum('SCOTT','EMP',ROWID)
  into :p1_ename, :p1_job, :p1_sal, :p1_checksum
  from emp
  where empno = :p1_empno;
In submit process for update:
update emp
  set job = :p1_job, sal = :p1_sal
  where empno = :p1_empno
  and owa_opt_lock.checksum('SCOTT','EMP',ROWID) = :p1_checksum;

if sql%rowcount = 0 then
  -- handle the fact that the update failed, e.g. raise_application_error
end if;
Another, easier solution for the fetching part is creating a view with all the fields that you need.
The weak point is that you later need to alter the "submit" code to insert into the tables that are the source of the view's data.
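A sketch of such a view for the tables in the question (the column names are invented):

create or replace view doc_panel_v as
select m.doc_id, m.doc_title, d.received_date, d.processed_date
  from apex_main m
  join apex_dates d on d.doc_id = m.doc_id;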
This may be a very simplistic question, so apologies in advance, but I am very new to database usage.
I'd like to have Postgres run its full text search across multiple joined tables. Imagine something like a model User, with related models UserProfile and UserInfo. The search would only be for Users, but would include information from UserProfile and UserInfo.
I'm planning on using a GIN index for the search. I'm unclear, however, on whether I'm going to need a separate tsvector column in the User table to hold the aggregated tsvectors from across the tables, with triggers set up to keep it up to date, or whether it's possible to create an index without a tsvector column that will keep itself up to date whenever any of the relevant fields in any of the relevant tables change. Any tips on the syntax of the commands to create all this would be much appreciated as well.
Your best answer is probably to have a separate tsvector column in each table (each with an index on it, of course). If you aggregate the data up into a shared tsvector, that will generate a lot of updates on the shared column whenever any of the individual ones change.
You will need one index per table. When you query, you then need multiple WHERE clauses, one for each field. PostgreSQL will automatically figure out which combination of indexes to use to give you the quickest results, likely using bitmap scanning. This makes your queries a little more complex to write (since you need multiple column-matching clauses), but it keeps the flexibility to query only some of the fields when you want to.
You cannot create one index that tracks multiple tables. To do that you need the separate tsvector column and triggers on each table to update it.
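A sketch of that setup for the models in the question (the tsv columns are assumed to be maintained by per-table triggers, as described above):

-- one GIN index per table's tsvector column
CREATE INDEX users_tsv_idx ON users USING gin (tsv);
CREATE INDEX user_profiles_tsv_idx ON user_profiles USING gin (tsv);
CREATE INDEX user_infos_tsv_idx ON user_infos USING gin (tsv);

-- the search for users then matches each column with its own clause
SELECT u.id
  FROM users u
  LEFT JOIN user_profiles p ON p.user_id = u.id
  LEFT JOIN user_infos i ON i.user_id = u.id
  WHERE u.tsv @@ to_tsquery('query:*')
     OR p.tsv @@ to_tsquery('query:*')
     OR i.tsv @@ to_tsquery('query:*');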