Filtering results from an is_in() Overpass query - overpass-api

I'm new to the Overpass API.
I would like to get the country element where a certain point is contained.
As a first step I tried this:
is_in(48.856089,2.29789);
out;
It gives me all the areas which contain the given coordinate, including regions, provinces...
So now, I would like to filter the country only. In the results, I can see that the country element is determined by the admin_level attribute, which must be equal to 2.
So, in order to filter my first request, I tried this:
is_in(48.856089,2.29789)[admin_level="2"];
out;
But in Overpass Turbo, it gives me the following error:
Error: line 1: parse error: ';' expected - '[' found.
I read that areas are an extended data type (compared to nodes, ways, and relations). Is it the reason why I can't filter my results?
How can I filter results from the is_in query, by [admin_level="2"]?

You cannot combine is_in with any additional filter criteria. The proper way to do this is as follows, where ._ refers to the area result returned by is_in.
is_in(48.856089,2.29789);
area._[admin_level="2"];
out;
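Putting it together, a complete lookup that also prints the country's tags might look like this (a sketch: `out tags;` restricts the output to ids and tags, and relies on the OSM convention that country boundaries carry admin_level=2):

```
is_in(48.856089,2.29789);
area._[admin_level="2"];
out tags;
```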

Related

Is there a way to get results for an overpass query paginated?

Let's say I want to get restaurants in Berlin and I have this query:
[out:json];
area["boundary"="administrative"]["name"="Berlin"] -> .a;
(
node(area.a)["amenity"="restaurant"];
); out center;
Let's say this result set is too big to extract in just one request to overpass. I would like to be able to use something like SQL's OFFSET and LIMIT arguments to get the first 100 results (0-99), process them, then get the next 100 (100-199) and so on.
I can't find an option to do that in the API, is it possible at all? If not, how should I query my data to get it divided into smaller sets?
I know I can increase the memory limit or the timeout, but this still leaves me handling one massive request instead of n small ones, which is how I would like to do it.
OFFSET is not supported by the Overpass API, but you can limit the number of results returned by a query via an additional parameter in the out statement. The following example returns only 100 restaurants in Berlin:
[out:json];
area["boundary"="administrative"]["name"="Berlin"] -> .a;
(
node(area.a)["amenity"="restaurant"];
); out center 100;
One approach to limiting the overall data volume is to count the number of objects in a bounding box and, if that number is too large, split the bounding box into 4 parts. Counting is supported via out count;. Once the number of objects is feasible, just use out; to get the results.
node({{bbox}})["amenity"="restaurant"];
out count;
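The split-into-quadrants step can be sketched in plain code. This is a hypothetical helper, not part of the Overpass API; count_objects stands in for running an out count; query against the server, and the sketch assumes the objects are spread out enough that recursion terminates:

```python
def split_bbox(south, west, north, east):
    """Split a bounding box into four equal quadrants, in the
    (south, west, north, east) order used by Overpass {{bbox}}."""
    mid_lat = (south + north) / 2.0
    mid_lon = (west + east) / 2.0
    return [
        (south, west, mid_lat, mid_lon),    # south-west quadrant
        (south, mid_lon, mid_lat, east),    # south-east quadrant
        (mid_lat, west, north, mid_lon),    # north-west quadrant
        (mid_lat, mid_lon, north, east),    # north-east quadrant
    ]

def partition(bbox, count_objects, limit=100):
    """Recursively split bbox until each part holds at most `limit` objects.

    `count_objects(bbox)` is a stand-in for issuing an `out count;`
    query to Overpass for that bounding box.
    """
    if count_objects(bbox) <= limit:
        return [bbox]
    parts = []
    for quadrant in split_bbox(*bbox):
        parts.extend(partition(quadrant, count_objects, limit))
    return parts
```

Each returned bounding box can then be queried with a normal out; statement, giving n small requests instead of one massive one.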

Querying on multiple LINKMAP items with OrientDB SQL

I have a class that contains a LINKMAP field called links. This class is used recursively to create arbitrary hierarchical groupings (something like the time-series example, but not with the fixed year/month/day structure).
A query like this:
select expand(links['2017'].links['07'].links['15'].links['10'].links) from data where key='AAA'
Returns the actual records contained in the last layer of "links". This works exactly as expected.
But a query like this (note the 10,11 in the second to last layer of "links"):
select expand(links['2017'].links['07'].links['15'].links['10','11'].links) from data where key='AAA'
Returns two rows of the last layer of "links" instead:
{"1000":"#23:0","1001":"#24:0","1002":"#23:1"}
{"1003":"#24:1","1004":"#23:2"}
Using unionAll or intersect (with or without UNWIND) results in this single record:
[{"1000":"#23:0","1001":"#24:0","1002":"#23:1"},{"1003":"#24:1","1004":"#23:2"}]
But nothing I've tried (including various attempts at "compound" SELECTs) will get the expand to work as it does with the original example (i.e. return the actual records represented in the last LINKMAP).
Is there a SQL syntax that will achieve this?
Note: Even this (slightly modified) example from the ODB docs does not result in a list of linked records:
select expand(records) from
(select unionAll(years['2017'].links['07'].links['15'].links['10'].links, years['2017'].links['07'].links['15'].links['11'].links) as records from data where key='AAA')
Ref: https://orientdb.com/docs/2.2/Time-series-use-case.html
I'm not sure what you want to achieve, but I think it's worth trying values():
select expand(links['2017'].links['07'].links['15'].links['10','11'].links.values()) from data where key='AAA'

TSQL / SSRS : Using LIKE with multi-valued parameters

I am trying to determine a solution to filter records using LIKE with a multi-valued parameter. In a simple example, a user wants to return a set of 5-digit Object codes by entering the following in a parameter window in an SSRS report:
#parm_Object
1,24,333,34567
This ideally would return Object codes satisfying the following criteria:
1 : All Object codes starting with '1'
24: All Object codes starting with '24'
333: Similar
34567: Object code '34567'
I guess a starting point for me would be to determine whether this could be handled in the actual query, or should I do it on the SSRS side.
General good practice is to get rid of the data you don't need as soon as possible, so the filtering belongs in the query.
An SSRS multi-valued parameter will show up as a comma-separated list in the query.
The first step is to get from this comma-separated list to a table (or table function); then you can join against this table and apply LIKE operators.
for example
SELECT *
FROM INFORMATION_SCHEMA.COLUMNS a
INNER JOIN dbo.split1('test,fafa,tata') b
ON 1=1
WHERE a.COLUMN_NAME like b.Value + '%'
This will return rows with column names starting with test, fafa, or tata. You have to write the dbo.split1 function yourself or find one on the internet, for example via the link suggested by Tab Alleman.
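If you are on SQL Server 2016 or later, the built-in STRING_SPLIT function can replace a hand-written splitter. This is a sketch: dbo.Objects and its ObjectCode column are hypothetical stand-ins for your actual table, and the literal stands in for the joined SSRS parameter value:

```sql
SELECT o.*
FROM dbo.Objects AS o                              -- hypothetical table of Object codes
INNER JOIN STRING_SPLIT('1,24,333,34567', ',') AS p
    ON o.ObjectCode LIKE p.value + '%'             -- prefix match per parameter value
```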

Calculate sum in script in ABBYY Flexicapture

I would like to perform the function of a Calculate Sum rule with a Script rule in ABBYY Flexicapture, because I want to only perform the calculation based on evaluation of an if statement.
I have tried the following in a C# script rule:
IFields AllTaxes = Context.Field("cu_location_taxes").Rows;
which gives me the error "Field is not a table."
I have also tried
IFields AllTaxes = Context.Field("cu_location_taxes").Children;
which gives me the error "Cannot access fields not specified in rule settings." Even though I have added the repeating group cu_location_taxes to the c# script rule.
Once I am able to get them in some kind of array or list or IFields variable, I would like to sum the children values in some way. I am open to doing this with JScript or C#.
The reasons for the errors you are facing can be found in the ABBYY FlexiCapture Help.
In the description of IField class you can find the following descriptions of properties:
Rows - A set of table rows. Unavailable for non-table fields.
Well, it seems that "cu_location_taxes" is not a table. You said it is a repeating group.
Children - Child items of the field (cells for tables). Unavailable in script rules.
But as I understand it, script rules are exactly what you are using.
To achieve correct results, use the Items property of the exact fields you are summing.
For example, suppose you have a repeating group that contains number fields field_1 and field_2, and you want to calculate the sum of all field_1 instances.
Then you can use the following code (JScript):
var sum = 0;
for (var i = 0; i < this.Field("field_1").Items.Count; ++i)
{
    sum += this.Field("field_1").Items.Item(i).Value;
}
Also do not forget to add the field_1 in available fields of your rule settings.
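Since the original goal was to sum only when a condition holds, the same loop can be guarded with an if statement. This is a sketch in JScript that only runs inside the FlexiCapture script-rule context; field_2 and the "Taxable" value are hypothetical stand-ins for whatever drives your condition, and both fields must be listed in the rule settings:

```
var sum = 0;
for (var i = 0; i < this.Field("field_1").Items.Count; ++i)
{
    // Hypothetical condition: include only instances whose sibling
    // field_2 holds the value "Taxable".
    if (this.Field("field_2").Items.Item(i).Value == "Taxable")
    {
        sum += this.Field("field_1").Items.Item(i).Value;
    }
}
```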
Hope this will help.

Postgresql full text search on really short documents (filename)

I have a database of filenames that I'm searching using Postgres's full-text search facility. I'm running the search query on a table of filenames; the problem is that the ranking functions are not ranking the results as I'd like them to. For the sake of argument, let's assume the schema looks like this:
create table files (
id serial primary key,
filename text,
filename_ft tsvector
);
The query that I run looks something like this:
select filename, ts_rank(filename_ft, query) as rank
from files, to_tsquery('simple', 'a|b|c') as query
where query @@ filename_ft
order by rank desc limit 5;
This will return the 5 results with the highest rank. However, those search queries are coming from another process, and in most cases the queries have some 'garbage' in them. For instance, a query for 'a xxxx' might be executed, where xxxx is just a bunch of other terms. In most cases this still returns the correct results, because the suffix is simply not in the database.
However, sometimes a query contains extraneous information that screws with the ranking function. For instance, a query for 'a b c' will return a filename containing the tokens 'b c' as the first result, and an exact match on 'a' as the second result. My guess is that this happens because the first result contains a larger percentage of the actual search tokens.
In most cases (if not all) the most important token appears as the first token in the query, so my question is, is there a way to give the tokens in the query a weight?
is there a way to give the tokens in the query a weight?
Yes, there is. See the documentation; search for "weight".
Whether assigning weights is the right choice is another matter. It sounds to me like you really want to exclude some of the data from the inputs to to_tsvector in index creation and searching, so you just don't include that garbage in the index.
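As a sketch of the weight mechanism (assuming the schema above): lexemes get a weight class via setweight when the tsvector is built, query terms can be restricted to a class with a ':' suffix, and ts_rank accepts an array mapping the classes {D, C, B, A} to numeric weights:

```sql
-- Label every filename lexeme with weight class B when building the tsvector.
UPDATE files SET filename_ft = setweight(to_tsvector('simple', filename), 'B');

-- ts_rank's optional first argument gives numeric weights for {D, C, B, A};
-- 'a:B' restricts that query term to lexemes labelled B.
SELECT filename,
       ts_rank('{0.1, 0.2, 0.4, 1.0}', filename_ft, query) AS rank
FROM files, to_tsquery('simple', 'a:B | b | c') AS query
WHERE query @@ filename_ft
ORDER BY rank DESC
LIMIT 5;
```

Note that weight classes attach to document lexemes, not to individual query tokens, which is part of why assigning weights may not be the right tool here.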