I'm new to PostgreSQL and I am working on a function that returns the word locations for a searched word.
I want to first narrow down the text fields the search has to go through, to make sure each result from the database is relevant.
My table is named 'testing', the text column is called 'context', and the column holding the line number is called 'line_number'. Each context text is associated with a specific line_number.
Right now my ranking code looks like this:
select line_number into lineLocation
from (
SELECT
testing.line_number,
ts_rank_cd(to_tsvector('english', testing.context),
to_tsquery('Cats & Dogs & Kids')) AS score
FROM Testing
) ranking
WHERE score >0
ORDER BY score DESC;
Return QUERY select * from lineLocation;
When I try to print out lineLocation as a return query, it works in reporting the new ranked line numbers 22,19,21,20,17,13 each returned in their own column.
My problem now is that I want to search each of those lines (22 ... 13) for a keyword like "dog" and return its position.
I obtain the text for a line by using:
select context into sample from testing
where testing.line_number = lineLocation;
If I try to just decrement lineLocation in a loop, like lineLocation - i, it goes out of order and will eventually search context that is not relevant.
Is there any type of 'read next line' function I could use?
I am looking for a way to loop through the ranked result line numbers.
EDIT: I then go on to use a for loop, where I want it to read through all of the rows of text in the column 'context' from the ranked results.
The problem I am having is that it only reads the first row of text in the column 'context', and I need it to look at all of the rows that are returned by the ranked search.
I ended up creating a separate ranking function and inserting the results of that text search into another table with a serial increment column.
I filled the new table (ranked_results) with this code:
INSERT INTO ranked_results(sentence) VALUES (columnRanking());
I also had to create a function that empties the new table and resets its serial column whenever more lines are inserted:
TRUNCATE table ranked_results RESTART IDENTITY;
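For anyone who wants to see the "loop through the ranked results in order" idea outside the database, here is a minimal Python sketch. It is not PL/pgSQL and it does not reimplement ts_rank_cd: the score is a plain word count, matching is on exact lowercase words with no stemming, and the function names rank_lines and keyword_positions are made up for this illustration. The point is only that you iterate over the ranked rows themselves rather than decrementing a line number.

```python
def rank_lines(rows, required_words):
    """Return (line_number, context) pairs containing every required word,
    best score first. The score is a simple occurrence count, used here as a
    stand-in for ts_rank_cd (assumption, not the real ranking)."""
    ranked = []
    for line_number, context in rows:
        words = context.lower().split()
        counts = [words.count(w.lower()) for w in required_words]
        if all(counts):  # every required word appears at least once
            ranked.append((sum(counts), line_number, context))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [(line_number, context) for _, line_number, context in ranked]


def keyword_positions(rows, required_words, keyword):
    """Visit the ranked rows in rank order and collect the 1-based word
    positions of keyword in each row's context."""
    results = []
    for line_number, context in rank_lines(rows, required_words):
        positions = [
            i
            for i, word in enumerate(context.lower().split(), start=1)
            if word == keyword.lower()
        ]
        results.append((line_number, positions))
    return results
```

Each ranked row is visited exactly once and in score order, so no irrelevant context is ever searched.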
I want to update the content of a table called VIDEO_TAGS, which contains 3 columns: fk_video_id, principal (not too important), and tag_value.
I want to do an update as follows:
"UPDATE VIDEO_TAGS
SET tag_value= :newTag
where tag_value= :oldTag
and fk_video_id = (SELECT fk_video_id FROM VIDEO_TAGS where tag_value= :productName)"
But the sub-query returns many elements while I only need one, so I get the error "more than one row returned by a subquery used as an expression".
My question is: how do I edit the sub-query to get the one element that I need? Thank you
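One common way around this error is to compare with IN instead of =, so that the sub-query may return several ids. The sketch below tries that idea with Python's sqlite3 module as a stand-in for the real database. It assumes the intent is to rename the tag on every video that also carries the product tag. If only one specific video is meant, the sub-query needs an extra condition instead. The function name rename_tag is invented for the example.

```python
import sqlite3


def rename_tag(conn, old_tag, new_tag, product_name):
    """Rename old_tag to new_tag on every video that also has product_name.

    Returns the number of rows updated. Uses IN so the sub-query may
    return more than one fk_video_id (assumed intent, see lead-in).
    """
    cur = conn.execute(
        """
        UPDATE VIDEO_TAGS
        SET tag_value = :new_tag
        WHERE tag_value = :old_tag
          AND fk_video_id IN (
              SELECT fk_video_id FROM VIDEO_TAGS WHERE tag_value = :product_name
          )
        """,
        {"new_tag": new_tag, "old_tag": old_tag, "product_name": product_name},
    )
    return cur.rowcount
```

Videos that have the old tag but not the product tag are left untouched.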
I have been searching endlessly for the answer to this problem I have been having:
Our team uses a query that returns a dataset with 13 columns. We want to narrow down the results by returning only rows where the string value in column "Actual Collection" appears in the adjacent column "PrvPrComments". Additionally, we want to do the same thing for columns "Actual Manufacturer" and "PrvPrComments". If the string value in either "Actual Collection" or "Actual Manufacturer" exists in "PrvPrComments", we want to return that row; if it does not, we want to exclude it.
The tricky part is that PrvPrComments is a column that has long text strings in them and so the query needs to parse through to find and match the string. They also need to be exact matches so "Pillow Perfect" and "pillow" would not be the same thing.
Here is an example posted below. I would want to return rows that contains "cowboy" and "chandelier" because there is a match but not the others:
Example of data
My initial guess would be to write a query that uses Full Text Index and/or contains. Any help would be greatly appreciated and I apologize for not having a foundation code to post here, I'm fairly new to this and am having trouble with where to start.
Thank you
where PrvPrComments like '%' + actualCollection + '%'
If there is not much data, you can use a LIKE expression to return the rows:
WHERE PrvPrComments LIKE '%' + actualCollection + '%'
But if the data set is huge and full-text search does not help much, you could add another column as a flag and populate it at insertion time (set the flag to 1 when PrvPrComments contains actualCollection). Later you only need to query the rows whose flag is 1.
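To try the LIKE approach on a small sample, here is a rough sketch using Python's sqlite3 module. The table name products and the space-free column names are assumptions for the demo, and SQLite concatenates with || where SQL Server uses +. Note that LIKE is a substring test, so "pillow" would also be found inside "pillowcase", and any % or _ inside the column values act as wildcards.

```python
import sqlite3


def rows_mentioned_in_comments(conn):
    """Return rows whose collection or manufacturer text occurs in the comments."""
    return conn.execute(
        """
        SELECT ActualCollection, ActualManufacturer
        FROM products
        WHERE PrvPrComments LIKE '%' || ActualCollection || '%'
           OR PrvPrComments LIKE '%' || ActualManufacturer || '%'
        ORDER BY ActualCollection
        """
    ).fetchall()
```

A row such as "Pillow Perfect" is excluded when the comment only says "pillow", because the whole column value has to occur in the comment text.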
This is a follow-up to another question I recently asked.
I currently have a SphinxQL query like this:
SELECT * FROM my_index
WHERE MATCH('@field1 "a few words"/1 @field2 "more text here"/1')
However, I would still like it to match rows in the case where one of the fields in the row is empty.
For example, let's say the following rows exist in the database:
field1 | field2
-----------------------
words in here | text in here
| text in here
The above query would match the first row, but it would not match the second row because the quorum operator specifies that there has to be one or more matches for each field.
Is what I'm asking possible?
The actual query I'm trying to make this work with was provided in Barry Hunter's answer to my previous question:
sphinxQL> SELECT *, WEIGHT() AS w FROM index
WHERE MATCH('@tags "cute hairy happy"/1 @tags2 "one two thee"/1') AND w = 2
OPTION ranker=expr('SUM(IF(word_count>=IF(user_weight=2,tags2_len,tags_len),1,0))'),
field_weights=(tags=1,tags2=2);
The first problem is that Sphinx doesn't index "empty", so you can't search for it. (Well, actually the field_len attribute will be zero, but it can be hard to combine an attribute filter with MATCH().)
... so arrange for empty to be something you can index:
sql_query = SELECT id,...,IF(tags='','_empty_',tags) AS tags FROM ...
Then modify the query. As it happens, your quorum search is easy!
@field1 "a few words _empty_"/1
It's just another word. A more complex query would just have to be OR'ed with the word.
Then there is making it work within your complex query. But as luck would have it, it's really easy. _empty_ is just another word, and when the field is empty, one word will match (i.e. there are no other words in the field, so none of the query's other words can match).
So just add _empty_ into the two quorums and you're done!
Sorry if this has already been asked. I couldn't see it in previously asked questions.
I have a table - 'eightks'.
This table contains 1,000,000 text documents.
I only need those that mention the phrase 'other events'. So I am trying to do some text matching and then output the matching rows into a new table.
My current code is:
SELECT * FROM eightks
WHERE to_tsvector(text) @@ to_tsquery('other_events');
When I run this I get the following error:
string is too long for tsvector (2368732 bytes, max 1048575 bytes)
Also, how do I output the matching rows into a new table?
Any help is appreciated.
That's a documented limitation.
The length of a tsvector (lexemes + positions) must be less than 1 megabyte
It might be possible to change the source code and recompile. See ts_type.h. I suspect it won't be simple, though.
You might need to break the documents up into smaller pieces for searching, then combine the pieces for presentation to the user.
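As a rough illustration of breaking a document up before indexing, here is a small Python sketch. The function name split_document and the 500,000-byte default are assumptions: the 1 MB limit applies to the tsvector (lexemes plus positions), not to the raw text, so the right chunk size has to be found by experiment. Splitting on whitespace also means a phrase that straddles two pieces will not match as a phrase.

```python
def split_document(text, max_bytes=500_000):
    """Split text on whitespace into pieces whose UTF-8 size stays within
    max_bytes. A single word longer than max_bytes still becomes its own
    (oversized) piece."""
    pieces, current, current_size = [], [], 0
    for word in text.split():
        size = len(word.encode("utf-8")) + 1  # +1 for the joining space
        if current and current_size + size > max_bytes:
            pieces.append(" ".join(current))
            current, current_size = [], 0
        current.append(word)
        current_size += size
    if current:
        pieces.append(" ".join(current))
    return pieces
```

Each piece can then be stored as its own row, with a reference back to the original document, and searched separately.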
As for inserting the rows into another table, you can insert the result of a SELECT statement directly. Basically . . .
insert into table_name
select ...
You might need to supply column names.
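Here is a minimal sketch of the INSERT ... SELECT pattern, run against SQLite from Python so it can be tried without a PostgreSQL server. The target table eightks_matches and the id column are assumptions, and a plain LIKE filter stands in for the full-text condition.

```python
import sqlite3


def copy_matching_rows(conn, phrase):
    """Copy rows of eightks whose text contains phrase into eightks_matches.

    Returns the number of rows inserted. The target table and the id
    column are hypothetical names chosen for this example.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS eightks_matches (id INTEGER, text TEXT)")
    cur = conn.execute(
        "INSERT INTO eightks_matches (id, text) "
        "SELECT id, text FROM eightks WHERE text LIKE '%' || ? || '%'",
        (phrase,),
    )
    return cur.rowcount
```

Listing the column names on both sides keeps the statement valid even if the two tables do not have identical layouts.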
I have a FileMaker table with multiple records, each with a value in fieldA. How can I set fieldB to the number of records that have the same value in fieldA?
For example, if fieldA is a;b;b;c I want fieldB to read 1;2;2;1.
The simplest way is to make a self-relationship from the table to another occurrence of the same table by fieldA. Then fieldB can be something like Count( sameFieldA::fieldA ).
You'll want a recursive custom function that you pass the fieldA contents to.
It takes as parameters:
the text being parsed
the current position being parsed (starting at 1)
the output text being built
Grab the fieldA value (e.g. "a") at the supplied position, then count the number of occurrences of "a" in the text being parsed. Append this count to the output text. If there are more values to process, call the recursive function again with an incremented position and return its result. Otherwise, return the output text.
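The recursion described above can be sketched in Python to check the logic before writing the FileMaker custom function. This is not FileMaker calculation syntax, the function name count_occurrences is made up, and it assumes the values arrive as one semicolon-separated string, as in the question's example.

```python
def count_occurrences(text, position=1, output=""):
    """Recursively build a ';'-separated list of how often each value occurs.

    text     -- the text being parsed, e.g. "a;b;b;c"
    position -- the current 1-based position being parsed
    output   -- the output text built so far
    """
    values = text.split(";")
    if text == "" or position > len(values):
        return output  # nothing left to process
    current = values[position - 1]
    count = values.count(current)
    # The first count starts the output; later counts are appended after ';'.
    output = str(count) if position == 1 else output + ";" + str(count)
    return count_occurrences(text, position + 1, output)
```

Calling count_occurrences("a;b;b;c") returns "1;2;2;1". Python limits recursion depth to roughly a thousand calls by default, so a very long value list would need a loop instead.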