How to find IDs of all authors with a given affiliation, including those who were affiliated in the past? - pybliometrics

I have institution X. For that institution, I need to find the IDs of all authors who have had such an affiliation at any time during their careers. I tried the following code, where XID is the affiliation ID of institution X:
import pandas as pd
from pybliometrics.scopus import AuthorSearch
query = 'AF-ID(XID)'  # XID stands for the numeric affiliation ID
s = AuthorSearch(query)
authors = s.authors
authors = pd.DataFrame(authors)
When I inspect the list of authors, several scientists, who I know for sure have been affiliated with institution X in the past and have moved to a different place during their career, simply do not appear in the list. It looks as if the query is returning only authors who have a current affiliation with institution X, but not those who had that affiliation in the past.
How could I collect all authors with current and past affiliations to institution X?
Thank you.

For XID = 60105007 (my institute), I find both current and former colleagues.
One can see this from the fact that s.authors includes information on the current affiliation ID, which might differ:
import pandas as pd
from pybliometrics.scopus import AuthorSearch
XID = "60105007"
query = f'AF-ID({XID})'
s = AuthorSearch(query)
authors = pd.DataFrame(s.authors)
different_aff = authors["affiliation_id"] != XID
different_aff.sum()
14 out of 96 researchers somehow affiliated with institution 60105007 published at least their most recent paper while affiliated with another one.
It is very likely that the authors you expect to be part of your institution are not properly assigned to that institution's profile. Often there are duplicate affiliation profiles, or authors are assigned to some non-org affiliation profile. That is due to difficulties resulting from string parsing. But each of these profiles is small in terms of associated authors.
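The comparison on affiliation_id can also be used to split a result list into current and former members. This is a minimal sketch, assuming each record exposes affiliation_id the way the entries of s.authors do. The Author tuple and the sample rows are made-up stand-ins so that the snippet runs without Scopus access:

```python
from collections import namedtuple

# Stand-in for the records in s.authors; only the fields used here (assumption).
Author = namedtuple("Author", ["eid", "surname", "affiliation_id"])


def split_by_current_affiliation(authors, xid):
    """Split search results into authors currently at `xid` and all others."""
    current = [a for a in authors if a.affiliation_id == xid]
    former = [a for a in authors if a.affiliation_id != xid]
    return current, former


# Made-up sample rows, not real Scopus data.
sample = [
    Author("9-s2.0-1", "Alpha", "60105007"),
    Author("9-s2.0-2", "Beta", "60000001"),
    Author("9-s2.0-3", "Gamma", "60105007"),
]

current, former = split_by_current_affiliation(sample, "60105007")
print(len(current), "current,", len(former), "former")
```

With real data, the list passed in would be s.authors, and the "former" list holds the authors whose current affiliation ID differs from the one searched for.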

Workflow permissioning criteria matching

Currently, I have a fairly complex workflow table containing multiple values that combine with each other. It is basically a workflow for finding the right approver for a given user in a permission model.
Here is a simplified example of the workflow table:
| Country | Department | Role      | Approver |
|---------|------------|-----------|----------|
| ALL     | ALL        | ALL       | Bob      |
| UK      | IT         | Developer | Tim      |
| US      | IT         | Developer | Mike     |
| ALL     | ALL        | Analyst   | John     |
These workflow steps always satisfy the following prerequisites:
A default approver always exists, with ALL for every criterion.
A match on a specific criterion takes precedence over ALL.
If no criterion matches, the user falls through to the default approver.
I have users who need to be approved according to the matrix above:
| User | Country | Department | Role        |
|------|---------|------------|-------------|
| U1   | UK      | IT         | UX designer |
| U2   | US      | HR         | Analyst     |
I am trying to figure out how I can derive the following matching:
| User | Approver | Reason |
|------|----------|--------|
| U1   | Bob      | User U1 does not match any criteria, so it falls back to the default approver |
| U2   | Tim      | User U2 matches the workflow step because Tim validates all UK Developers belonging to the IT department |
Of course, this example has only a few criteria, so a simple if/else would solve it.
But I have 6 criteria, which I believe would lead to a large number of combinations with this naive approach.
In this situation, would a rule engine (for instance Drools) be a good fit for this problem?
I would think the engine would take the user and the list of workflow steps as facts.
Would a decision table be more applicable in this situation?
Any help in structuring the problem would be more than appreciated :)
Here is a very simple example that matches a user to an approver by country:
rule "Has country approver"
when
    $user : User( $country : country )
    $wf : WorkflowStep( country == $country ) from $userAccess.steps
then
    // take the approver for the matching step
    $userAccess.setApprover($wf.getApprover());
    // insertLogical ?
end
A rule engine is a perfect fit for your problem, but you have to decide how to use it.
The example you provided uses Drools directly, which is very low-level: it works, but it requires you to write the rules in DRL.
Since you already know your evaluation is stateless and your input is literally formatted as a table, a decision table would be a better fit in my opinion. There are two flavours of decision tables: Drools' and DMN's.
I suggest trying DMN's, as it's easier to get started with. Kogito provides a quick-start that lets you experiment with the whole system and even write a test scenario for it, which is basically your second table.
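To make the matching semantics concrete before encoding them in a decision table, here is a minimal plain-Python sketch of the precedence described in the question. It assumes one possible reading of that precedence: among all matching steps, the one with the most non-ALL criteria wins, and ties go to the first listed step. The function names and the user U3 are hypothetical, and this is not how Drools or DMN evaluate rules internally.

```python
ALL = "ALL"
CRITERIA = ("country", "department", "role")

# Workflow rows from the question: criteria -> approver.
WORKFLOW = [
    ({"country": ALL, "department": ALL, "role": ALL}, "Bob"),
    ({"country": "UK", "department": "IT", "role": "Developer"}, "Tim"),
    ({"country": "US", "department": "IT", "role": "Developer"}, "Mike"),
    ({"country": ALL, "department": ALL, "role": "Analyst"}, "John"),
]


def matches(step, user):
    """A step matches when every criterion is ALL or equals the user's value."""
    return all(step[c] in (ALL, user[c]) for c in CRITERIA)


def specificity(step):
    """Number of criteria pinned to a concrete value instead of ALL."""
    return sum(step[c] != ALL for c in CRITERIA)


def find_approver(user, workflow=WORKFLOW):
    """Return the approver of the most specific matching step (assumed precedence)."""
    candidates = [(step, approver) for step, approver in workflow if matches(step, user)]
    # The all-ALL default row guarantees at least one candidate.
    _, approver = max(candidates, key=lambda pair: specificity(pair[0]))
    return approver


u1 = {"country": "UK", "department": "IT", "role": "UX designer"}
u3 = {"country": "UK", "department": "IT", "role": "Developer"}  # hypothetical user
print(find_approver(u1))  # falls back to the default row
print(find_approver(u3))  # matches the UK / IT / Developer row
```

Adding a seventh criterion only means extending CRITERIA and the rows, which is the same property a decision table gives you: the number of rows grows with the rules you actually have, not with the number of possible combinations.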

Project Academic Knowledge | Query for and list papers by AA.AuId?

I've got a list of author names but I don't have Id's for any of them.
I'd like to:
Query by author name and store the most probable AuId.
List all papers written by a given AuId.
Is there any way to do this with the current interpret/evaluate APIs? It seems like everything is tied to a paper entity and I want to be sure I am only ever selecting and using one AuId.
Thanks.
I am not aware of such a feature. But indirectly, you could first search for the author name (AA.AuN in the expr-field), obtain all the (unique) various author IDs (AA.AuId in the attributes field), and search for their publications.
(You could even add orderby=logprob:desc, but to be honest, I am not 100% sure what logprob does.)
So, the first step could be to search for the author name (e.g. John Smith) like this and fetch all those AA.AuId where the names (AA.AuN) seem to fit John Smith (let's just add the orderby=logprob:desc):
https://api.labs.cognitive.microsoft.com/academic/v1.0/evaluate?&expr=Composite(AA.AuN=%27john%20smith%27)&count=100&attributes=AA.AuN,AA.AuId&orderby=logprob:desc&subscription-key={YOUR-KEY}
As a second step, if you have an Author ID AA.AuId (here, for example, 3038752200), use this to list their papers (ordered by year, in a descending manner orderby=Y:desc):
https://api.labs.cognitive.microsoft.com/academic/v1.0/evaluate?&expr=Composite(AA.AuId=3038752200)&count=100&attributes=AA.AuN,AA.AuId,DOI,Ti,VFN,Y&orderby=Y:desc&subscription-key={YOUR-KEY}
The approach would be more promising if you had an institutional affiliation as well. Then you could change the expr field to Composite(And(AA.AuN='{AUTHOR-NAME}',AA.AfId={AFFILIATION-ID})) so as to search for all authors named {AUTHOR-NAME} affiliated with {AFFILIATION-ID}.
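The two steps above can be sketched in Python without making any network call. The helper names are hypothetical, and the response shape used in author_ids_for_name (an "entities" list whose items carry an "AA" list of author objects) is an assumption rather than something confirmed in this thread. The sample response is made up.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.labs.cognitive.microsoft.com/academic/v1.0/evaluate"


def build_evaluate_url(expr, attributes, orderby, key, count=100):
    """Build an evaluate URL with the same parameters as the examples above."""
    params = {
        "expr": expr,
        "count": count,
        "attributes": ",".join(attributes),
        "orderby": orderby,
        "subscription-key": key,
    }
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


def author_ids_for_name(response, name):
    """Collect unique AA.AuId values whose AA.AuN equals `name` (assumed shape)."""
    ids = []
    for entity in response.get("entities", []):
        for author in entity.get("AA", []):
            au_id = author.get("AuId")
            if author.get("AuN") == name and au_id is not None and au_id not in ids:
                ids.append(au_id)
    return ids


# Step 1: URL for the name search.
step1_url = build_evaluate_url(
    "Composite(AA.AuN='john smith')", ["AA.AuN", "AA.AuId"], "logprob:desc", "{YOUR-KEY}"
)

# Made-up response used only to exercise the parsing helper.
sample_response = {
    "entities": [
        {"AA": [{"AuN": "john smith", "AuId": 3038752200}, {"AuN": "jane doe", "AuId": 1}]},
        {"AA": [{"AuN": "john smith", "AuId": 3038752200}]},
    ]
}
candidate_ids = author_ids_for_name(sample_response, "john smith")

# Step 2: URL listing the papers of the first candidate ID, newest first.
step2_url = build_evaluate_url(
    f"Composite(AA.AuId={candidate_ids[0]})",
    ["AA.AuN", "AA.AuId", "DOI", "Ti", "VFN", "Y"],
    "Y:desc",
    "{YOUR-KEY}",
)
```

Because several distinct authors can share a name, the list returned by author_ids_for_name may contain more than one ID. Picking the "most probable" one still needs an extra signal such as the affiliation filter.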

Demandware: Find Product's Category Position?

I'm updating a data feed export, which links a Product to a given Category. I also want to include that product's merchandising position within that category, which currently exists in Business Manager and is used to control sorting on Product listing pages.
I'm digging through the API docs, and the logical place for this information to be exposed is dw.catalog.CategoryAssignment, but it's not there. I'm currently inferring the position by essentially doing this:
// assume var product, category
var position = category.products.firstIndex(p => p.ID == product.ID);
However, this tells me where the Product got sorted to, not what the actual Position value is within Demandware. It works for now as an expedient hack, but I really want to replace it with something that pulls the actual value from DW.
Where in the Commerce Cloud API can I find the merchandising position for a given Product in a given Category?
I think you wouldn't get the actual position from the product's index, as you may have multiple sorting rules that display different outputs on the category listing pages. These sorting rules can be created as and when required, based on certain rules. I don't think this can be reflected in the product feed.
It took some digging, but I managed to find that the "Position" field for Products in the BM is stored as Product.searchPlacement. To find it, you have to look in Category.products, find the Product you want, and grab the searchPlacement property of that product.
In effect, I used:
// assume var product, category
var position = category.products.find(p => p.ID == product.ID).searchPlacement;
For Products that don't have a Position assigned in the Business Manager, searchPlacement is 0. Otherwise, it reflects the value entered in the BM.

Determine the existence of a path between relations in PostgreSQL

I have a PostgreSQL 9.4 database storing organization and person tables.
There is an organization_person table linking organizations to people.
Organizations have multiple people, and people may belong to more than one
organization.
I want to be able to efficiently answer the query: for a given
organization X, does there exist a path between that organization, through
at most one person, to any organization in set S, the set of organizations
having a certain boolean field set to true?
Only two-hop connections through people need be found.
Finding X -- Person -- Y -- Person -- s isn't necessary.
The set S has about 10,000 entries. Most organizations aren't in S.
This is for online query purposes, not offline analytics or other batch processing. Updates to S are rarer; about 150 additions per day, with a few removals.
I'm willing to use advanced features or extensions of PostgreSQL, or
other database technologies if they're simply far more suited to the task.
I only need to know whether such a path exists, not its members. I'm
willing to do this with a certain amount of denormalisation in PostgreSQL,
but I'm not sure how to integrate changes, such as changes to the
membership of S, in a sane and efficient
manner.
If I understand correctly, you are looking for a person who is in organization X and an organization in S. This can be expressed as a SQL query:
select 1
from s
join organization_person op
  on op.organization_id = s.organization_id
join organization_person opx
  on opx.organization_id = 'x'
 and opx.person_id = op.person_id
limit 1;
This will benefit from indexes on organization_person(organization_id, person_id) and organization_person(person_id, organization_id).
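The same two-hop check can be tried end to end with a small self-contained sketch. It uses Python's sqlite3 module as a stand-in for PostgreSQL so that it runs without a server. The in_s flag column on organization, the function name, and the sample rows are assumptions for illustration. Like the query above, it does not exclude X itself from S.

```python
import sqlite3


def has_two_hop_path(conn, org_id):
    """True if `org_id` shares at least one person with an organization in S."""
    row = conn.execute(
        """
        SELECT EXISTS (
            SELECT 1
            FROM organization_person opx
            JOIN organization_person op ON op.person_id = opx.person_id
            JOIN organization o ON o.id = op.organization_id
            WHERE opx.organization_id = ?
              AND o.in_s = 1
        )
        """,
        (org_id,),
    ).fetchone()
    return bool(row[0])


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE organization (id INTEGER PRIMARY KEY, in_s INTEGER NOT NULL);
    CREATE TABLE organization_person (organization_id INTEGER, person_id INTEGER);
    -- The two indexes suggested above.
    CREATE INDEX op_org_person ON organization_person (organization_id, person_id);
    CREATE INDEX op_person_org ON organization_person (person_id, organization_id);
    """
)
# Made-up data: organization 2 is the only member of S.
conn.executemany("INSERT INTO organization VALUES (?, ?)", [(1, 0), (2, 1), (3, 0), (4, 0)])
conn.executemany(
    "INSERT INTO organization_person VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 11), (4, 11)],
)

print(has_two_hop_path(conn, 1))  # person 10 links organization 1 to organization 2
print(has_two_hop_path(conn, 3))  # person 11 only links to organization 4, not in S
```

Wrapping the join in EXISTS plays the same role as limit 1: the database can stop at the first qualifying row, since only the existence of a path matters.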

Documentation for CONTAINS() in FQL?

There have recently been several questions posted on Facebook.SO using CONTAINS() in the WHERE clause. It seems to work like the Graph API search function, AND functions as an indexed field. All great things for the FQL developer.
SELECT name,
username,
type
FROM profile
WHERE CONTAINS("Facebook")
However, the only official mention of the CONTAINS function appears in the unified_thread documentation. It is mentioned in passing, as a way to search for text contained in a message. It also appeared in this fbrell code sample.
But Contains doesn't seem to be a straightforward search. For example, this query:
SELECT name
FROM user
WHERE CONTAINS("Joe Biden")
returns "Joe Biden" and also "Joseph Biden" and "Biden Joe". But it also returns "Joe Scardino", "Lindsay Noyan" and "Mehmad Moha" among others. What relationship do these people have with the VP of the USA? They aren't my friends, so I'll never know.
There also appears to be the ability to pass CONTAINS a field to search on; however, changing the end of my first query to CONTAINS("Facebook", name) returns an OAuth error:
(#615) 'name' is not a valid search field for the profile table.
In my not-so rigorous testing, I have yet to find a field/table combination that does not return this error.
So what is this mystery function? How does it work? Can it allow us to do things to date impossible in FQL like traversing arrays and filtering data stored in strings?
An answer here would be great, but a description on an FQL functions & methods reference page on the official developer documentation site would be better still.
I don't think that I have any great answers here, but I can give a workaround for the issue of unrelated names being returned, which I suspect happens because people have made public posts about Joe Biden, liked him, and so on. If you do the following:
SELECT name
FROM user
WHERE CONTAINS("Joe Biden")
AND strpos(lower(name),lower("Joe Biden")) >=0
You will get a resultset that only contains the right names- though it removes the advantage of also returning Joseph Biden, etc. etc.
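The effect of the extra strpos condition can be illustrated outside FQL. This is a minimal Python sketch of the equivalent case-insensitive substring test, applied client-side to names taken from the question. The function name is hypothetical.

```python
def filter_exact_phrase(names, phrase):
    """Keep only names containing `phrase` as a case-insensitive substring."""
    needle = phrase.lower()
    return [name for name in names if needle in name.lower()]


names = ["Joe Biden", "Joseph Biden", "Biden Joe", "Joe Scardino"]
kept = filter_exact_phrase(names, "Joe Biden")
print(kept)
```

"Joseph Biden" and "Biden Joe" are dropped because neither contains the literal substring "joe biden", which is the trade-off described above.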
My personal point of pain is that CONTAINS() appears to work with partial strings (e.g. "Joe Bide") on the profile table, but not on the user table. Very frustrating.