SPARQL: Choose one triple from multiple match? - match

I have a kind of interesting problem with SPARQL query. I am querying for a triple and I have multiple match - maybe 2. But I need the code to choose only one. It absolutely doesn't matter which one.
For example: I am querying for a book for 5-years old and it founds 2 books in the database but I need to have only one saved in variable (doesn't matter which one).
Is this even possible in SPARQL? Please let me know if you need any other information. Thank you all in advance for your time!
The change in input database is the last thing I want to do.
EXAMPLE:
<Book rdf:ID = 0001>
<Book.Title>Title1</Book.Title>
<Book.For>5 years old</Book.For>
<Book.Date>2000</Book.Date>
</Book>
<Book rdf:ID = 0002>
<Book.Title>Title2</Book.Title>
<Book.For>5 years old</Book.For>
<Book.Date>2005</Book.Date>
</Book>
Construct {?NewDatabase Example_of_book_for_5_years_old ?Title .}
Where {?Example Book.For "5 years old" .
?Example Book.Title ?Title .
};
Both books match the Where clause but I need only one to be in the new database output. Doesn't matter which one.

Not considering the apparent errors in your example query, the solution is to use SELECT instead of CONSTRUCT. The former allows you to limit the number of results.
If you totally need a triple as result, you can use limited SELECT as a subquery
To select only one triple from a CONSTRUCT query you can use the LIMIT clause just like with SELECT
Construct { ?Example <urn:Book:Title> ?Title }
Where {
?Example <urn:Book:For> "5 years old" .
?Example <urn:Book:Title> ?Title .
}
limit 1
This will select the first result (order undetermined).
And no, the judging by the SPARQL Grammar, it is not possible to use the LIMIT clause with SPARQL Updates.

Related

Postgres...how to improve ilike results (quality not speed)

I have a list of chemicals in my database and I provide our users with the ability to do a live search via our website. I use SQLAlchemy and the query I use looks something like this:
Compound.query.filter(Compound.name.ilike(f'%{name}%')).limit(50).all()
When someone searches for toluene, for example, they don't get the result they're looking for because there are many chemicals that have the word toluene in them, such as:
2, 4 Dinitrotoluene
2-Chloroethyl-p-toluenesulfonate
4-Bromotoluene
6-Amino-m-toluenesulfonic acid
a,2,4-trichlorotoluene
a,o-Dichlorotoluene
a-Bromtoluene
etc...
I realize I could increase my limit but I feel like 50 is more than enough. Or, I could change the ilike(f'%{name}%')) to something like ilike(f'{name}%')) but our business requirements don't want this. What I'd rather do is improve the ability for Postgres to return results so that toluene is at the top of the search results.
Any ideas on how Postgres' ilike capability?
Thanks in advance.
One option is to better rank the results. Postgres text search allows you to rank the results.
A cheap and dirty version of preferential ranking is to do multiple queries for name = ?, ilike(f'{name}%')), and ilike(f'%{name}%')) using a union. That way the ilike(f'{name}%')) results come first.
And rather than a hard limit, offer pagination. SQLAlchemy has paginate to help.
ILIKE yields a boolean. It doesn't specify what order to return the results, just whether to return them at all (you can order by a boolean, but if you only return trues there is nothing left to order by). So by the time you are done improving it, it would no longer be ILIKE at all but something else completely.
You might be looking for something like <-> from pg_trgm, which provides a distance score which can be sorted on. Although really, you could just order the result based on the length of the compound name, and return the shortest 50 that contain the target.
something like ilike(f'{name}%')) but our business requirements don't want this
Isn't your business requirement to get better results?
But at least in my database, this could just return a bunch of names in inverted format, like toluene, 2,4-dinitro, so the results might not be much better, unless you avoid storing such inverted names. Sorting by either <-> or by length would overcome that problem. But they would also penalize toluene, ACS reagent grade 99.99% by HPLC, should you have names like that.

Is there a SQLite for Swift method for more complex ORDER BY statements?

I have a query similar to the following that I'd like to perform on an sqlite database:
SELECT name
FROM table
WHERE name LIKE "%John%"
ORDER BY (CASE WHEN name = "John" THEN 1 WHEN name LIKE "John%" THEN 2 ELSE 3 END),name LIMIT 10
I'd like to use SQLite for Swift to chain the query together, but I'm stumped as to how to (or if it is even possible to) use the .order method.
let name = "John"
let filter = "%" + name + "%"
table.select(nameCOL).filter(nameCOL.like(filter)).order(nameCOL)
Gets me
SELECT name
FROM table
WHERE name LIKE %John%
ORDER BY name
Any ideas on how to add to the query to get the more advanced sorting where names that start with John come first, followed by names with John in them?
I saw the sqlite portion of the solution here: SQLite LIKE & ORDER BY Match query
Now I'd just like to implement it using SQlite for Swift
Seems it may be too restrictive for that given the limited examples, anyone else have any experience with more complicated ORDER BY clauses?
Thanks very much.
Sqlite.swift can handle that pretty cleanly with two .order statements:
let name = "John"
let filter = "%" + name + "%"
table.select(nameCol).filter(nameCol.like(filter))
.order(nameCol.like("\(name)%").desc
.order(nameCol)
The .order statements are applied in the order they are listed, the first being primary.
The filter will reduce the results to only those with "John" in them. SQlite.swift can do many complex things. I thought I would need a great deal of raw sql queries when I ported 100s of complex queries over to it, but I have yet to use raw sql.

Postgresql Query - Return all matching search terms for each result row when using an ANY query and LIKE

Essentially what I'm trying to figure out is if there is a way to return all matching search terms in addition to the matched row when running a query that looks up a list of items using ANY or IN. In most cases the search term will exactly match the returned column value but in cases such as text search or with certain extensions like IP4r this is not always the case. In addition, you can have multiple search terms match on a single row.
To make this concrete suppose this is my query:
SELECT id, item_name, description FROM items WHERE description LIKE ANY('{%gaming%, %computer%, %socks%, %men%}');
and it returns the following two rows:
id, item_name, description
1, 'computer', 'super fast gaming computer that will help you win'
5, 'socks', 'These socks are sure to please the men in your family'
What I'd like to know is which original search terms map to the result row that was returned. In other words, I'd like the returned rows to look like this:
id, search_terms, item_name, description
1, '{%gaming%, %computer%}', 'computer', 'super fast gaming computer that will help you win'
5, '{%socks%, %men%}', 'socks', 'These socks are sure to please the men in your family'
Is there a way to efficiently do this in PostgreSQL? In the example above we're using LIKE with strings but in my real-world scenario I'm using the IP4r extension to do IP lookups against CIDR ranges where you can have multiple IP addresses in the same returned CIDR range.
I previously asked this question: PostgreSQL 9.5: Return matching search terms in each result row when using LIKE which used a CASE statement to almost solve the problem I'm describing here.
The added complexity in the scenario above is that you can have multiple search terms match a single row (e.f., gaming and computer are both matches for the description super fast gaming computer that will help you win). If you use a CASE statement then only the first match in the CASE statement gets set as the search term and you miss any other matching search terms.
Thank you for your help!
This would be a way using VALUES:
SELECT i.id, i.item_name, i.description, m.pat
FROM items AS i
JOIN (VALUES ('%gaming%'), ('%computer%'), ('%socks%'), ('%men%')) AS m(pat)
ON i.description LIKE m.pat;

Oracle 10g, how to query numerical value, (years) with specific limits on results

So the question I am posed with is to take the years produced of all of the movies in two genre's, (SH and CH) an then print out a list of all the movies, (title and year), that were produced before any of the movies in my specific genre were produced. I have this:
SELECT x.title "Title", x.yr "Year"
FROM movies x
WHERE EXISTS (SELECT x FROM movies y
WHERE y.genre IN ('SH', 'CH') AND y.yr < x.yr)
ORDER BY yr;
but it's producing all sorts of titles that were produced during and after the two genres had any of their movies produced. I would think that the less than would limit the results to anything under 1965, (the oldest move in either genre), but it doesn't, but if I use the greater than operator it does, (although it still pumps out newer results as well, so that doesn't work either)
Does anybody see what it is I am missing here? Thanks for any help.
Got it!
I just needed to use the operator differently
WHERE yr <
(SELECT insert stuff here); // this is homework so I can't post the full code
Seems I was just over complicating things.

T-SQL speed comparison between LEFT() vs. LIKE operator

I'm creating result paging based on first letter of certain nvarchar column and not the usual one, that usually pages on number of results.
And I'm not faced with a challenge whether to filter results using LIKE operator or equality (=) operator.
select *
from table
where name like #firstletter + '%'
vs.
select *
from table
where left(name, 1) = #firstletter
I've tried searching the net for speed comparison between the two, but it's hard to find any results, since most search results are related to LEFT JOINs and not LEFT function.
"Left" vs "Like" -- one should always use "Like" when possible where indexes are implemented because "Like" is not a function and therefore can utilize any indexes you may have on the data.
"Left", on the other hand, is function, and therefore cannot make use of indexes. This web page describes the usage differences with some examples. What this means is SQL server has to evaluate the function for every record that's returned.
"Substring" and other similar functions are also culprits.
Your best bet would be to measure the performance on real production data rather than trying to guess (or ask us). That's because performance can sometimes depend on the data you're processing, although in this case it seems unlikely (but I don't know that, hence why you should check).
If this is a query you will be doing a lot, you should consider another (indexed) column which contains the lowercased first letter of name and have it set by an insert/update trigger.
This will, at the cost of a minimal storage increase, make this query blindingly fast:
select * from table where name_first_char_lower = #firstletter
That's because most database are read far more often than written, and this will amortise the cost of the calculation (done only for writes) across all reads.
It introduces redundant data but it's okay to do that for performance as long as you understand (and mitigate, as in this suggestion) the consequences and need the extra performance.
I had a similar question, and ran tests on both. Here is my code.
where (VOUCHER like 'PCNSF%'
or voucher like 'PCLTF%'
or VOUCHER like 'PCACH%'
or VOUCHER like 'PCWP%'
or voucher like 'PCINT%')
Returned 1434 rows in 1 min 51 seconds.
vs
where (LEFT(VOUCHER,5) = 'PCNSF'
or LEFT(VOUCHER,5)='PCLTF'
or LEFT(VOUCHER,5) = 'PCACH'
or LEFT(VOUCHER,4)='PCWP'
or LEFT (VOUCHER,5) ='PCINT')
Returned 1434 rows in 1 min 27 seconds
My data is faster with the left 5. As an aside my overall query does hit some indexes.
I would always suggest to use like operator when the search column contains index. I tested the above query in my production environment with select count(column_name) from table_name where left(column_name,3)='AAA' OR left(column_name,3)= 'ABA' OR ... up to 9 OR clauses. My count displays 7301477 records with 4 secs in left and 1 second in like i.e where column_name like 'AAA%' OR Column_Name like 'ABA%' or ... up to 9 like clauses.
Calling a function in where clause is not a best practice. Refer http://blog.sqlauthority.com/2013/03/12/sql-server-avoid-using-function-in-where-clause-scan-to-seek/
Entity Framework Core users
You can use EF.Functions.Like(columnName, searchString + "%") instead of columnName.startsWith(...) and you'll get just a LIKE function in the generated SQL instead of all this 'LEFT' craziness!
Depending upon your needs you will probably need to preprocess searchString.
See also https://github.com/aspnet/EntityFrameworkCore/issues/7429
This function isn't present in Entity Framework (non core) EntityFunctions so I'm not sure how to do it for EF6.