list of all valid parameters and criteria that can be used in RMA queries - allen-sdk

I would like to get specific neuron models, and even though I believe I understand the RMA query system, I cannot find a list of the valid keywords/arguments/criteria/parameters that correspond to what I am looking for.
For example 'homo sapiens' as donor species is valid, and makes sense.
But if 'm__biophys_perisomatic' returns all cells with perisomatic biophysical models, what about 'all active' ones (just an example, I would be interested in many other categories)?
I assume the answer is obvious, but I know I will not stumble upon it until I have posted this question.

Thanks for your question. You can see what fields and associations are available for a table using the describe route. For example:
http://api.brain-map.org/api/v2/data/NeuronalModel/describe.xml
From your question, I believe you're looking at this table:
http://api.brain-map.org/api/v2/data/ApiCellTypesSpecimenDetail/describe.xml
You can use m__biophys_all_active to see if a cell in that table has an all-active model.
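For example, a query along these lines should work (a sketch only; I'm assuming m__biophys_all_active holds a model count like m__biophys_perisomatic, so $gt0 means "has at least one all-active model"):
http://api.brain-map.org/api/v2/data/query.xml?criteria=model::ApiCellTypesSpecimenDetail,rma::criteria,[m__biophys_all_active$gt0]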
FYI: The ApiCellTypesSpecimenDetail table is a denormalized table, which means it combines a complex set of relationships among tables into a single flat table.
You could similarly use the following, more generic query to find the all-active models.
http://api.brain-map.org/api/v2/data/query.xml?criteria=model::NeuronalModel,rma::criteria,neuronal_model_template[name$eq'Biophysical - all active']&num_rows=150

Related

Feedback about my database design (multi tenancy)

The idea of the SaaS tool is to have dynamic tables with dynamic custom fields and values of different types. We were thinking of following the force.com/salesforce.com example, but it seems too complicated to maintain going forward, and it puts a huge abstraction level in the way of building reports, so we came up with a simpler idea. We just want to be sure this is a reasonably good approach.
This is the architecture we have today (in a few steps).
Each tenant has its own separate database on the cluster (Postgres 12).
A TABLE table is used to keep all of those tables as references; this entity has a ManyToOne relation to the META table and a OneToMany relation with the DATA table.
The META table is used for metadata configuration and has a OneToMany relation with FIELDS (which holds the name of each field, its type, e.g. TEXT/INTEGER/BOOLEAN/DATETIME, and the attribute it maps to, stored as a string, only as a reference).
The DATA table has a ManyToOne relation to TABLES and 50 character varying columns named attribute1...attribute50, all NULL-able.
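In SQL terms, the setup looks roughly like this (a simplified sketch; names are illustrative, each tenant has its own copy, and the real DATA table runs up to attribute50):

CREATE TABLE meta (
    id bigint PRIMARY KEY                               -- metadata configuration (other columns omitted)
);

CREATE TABLE fields (
    id        bigint PRIMARY KEY,
    meta_id   bigint NOT NULL REFERENCES meta(id),      -- META 1:N FIELDS
    name      text   NOT NULL,                          -- e.g. 'Year'
    type      text   NOT NULL,                          -- TEXT / INTEGER / BOOLEAN / DATETIME
    attribute text   NOT NULL                           -- e.g. 'attribute6', a reference only
);

CREATE TABLE tables (
    id      bigint PRIMARY KEY,
    meta_id bigint NOT NULL REFERENCES meta(id)         -- TABLE N:1 META
);

CREATE TABLE data (
    id         bigint NOT NULL,
    tableid    bigint NOT NULL REFERENCES tables(id),   -- DATA N:1 TABLE
    createdat  timestamptz NOT NULL DEFAULT now(),
    attribute1 varchar,
    attribute2 varchar,
    attribute3 varchar,                                 -- ...and so on up to attribute50
    PRIMARY KEY (id, tableid)
);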
Example flow today:
When a user wants to open a table's data, e.g. "CARS", we load the META table with all the FIELDS (to get the fields for this query). Say the user wants to query against the Brand, Class, Year, and Price columns.
Our logic looks up the references for Brand, Class, Year, and Price in the META > FIELDS table, so we know that Brand = attribute2, Class = attribute5, Year = attribute6, and Price = attribute7.
We parse the request into a query, e.g. SELECT attribute2, attribute5, attribute6, attribute7 FROM DATA, and show the results to the user. If the user then applies filters, e.g. Year > 2017 AND Class = 'A', we use SQL's CAST(), for example:
SELECT CAST(attribute6 AS int), attribute5 FROM DATA WHERE CAST(attribute6 AS int) > 2017 AND attribute5 = 'A';
This way we can support most principles of SQL.
However, moving forward we are a bit worried about:
Managing such an environment for more tenants as the number of tables grows (e.g. 50 per customer, with roughly 1-5 million rows per TABLE; 5 million is the maximum we allow, and for bigger data we have BigQuery), which gives us 50-250 million rows in a single DATA_X table. This might affect query performance, especially since we expose simple WHERE statements (less than, equal, null, etc.) through an abstraction language, e.g. GET CARS [BRAND,CLASS,PRICE...] FILTER [EQ(CLASS,A),MT(YEAR,2017)], developed to be similar to JQL (Jira Query Language).
Transaction locks: we allow batch-uploading CSVs into DATA_X, so when someone loads e.g. 1 GB of data, it effectively locks the DATA table against other systems that need to access it.
Keeping multiple NULL columns, which can affect storage a bit. (For now we are not too worried: at TABLE creation the customer decides how many columns they want, and based on that we assign the TABLE to one of the hardcoded entities DATA_5, DATA_10, DATA_15, DATA_20, DATA_30, or DATA_50, where the number corresponds to the limit on attribute columns; these entities are distinct, and we also support migration if they decide to switch from 5 to 10 attributes, etc.)
We are at a very early stage, so we can and should make these changes before we scale. We knew this was most likely not the best approach, but we kept it to get the project running for small customers, and for now it is working just fine.
We also thought about JSONB objects, but that is not an option, as we want to keep reading the data simple.
What do you think about this solution? (FYI: DATA has a composite PRIMARY KEY of (ID, TABLEID) and a built-in CreatedAt column that is used by most of the queries, so there will be at most 3 indexes.)
If it seems bad, what would you recommend as an alternative, based on the details I shared (basically a schema-less RDBMS)?
IMHO, I anticipate issues when you want to join tables, and also with the use of CAST etc.
We followed the approach below, which may be of help to you.
We have a table called Cars and a couple of supporting tables, CarsMeta and CarsExtension. The underlying Cars table has all the common fields for all tenants. The CarsMeta table records which types of columns are available for extending the Cars entity. The CarsExtension table has columns like StringCol1...5, IntCol1...5, LongCol1...10.
In this way, you can easily filter the data as well:
If you have a filter on the base table, perform the search there; if results are found, match the ids against the CarsExtension table to get the list of extended rows for this entity.
If the filter is on the extended fields, search the extension table and match against the base entity ids.
We organize the extension table like below:
id - UniqueId
entityid - uniqueid (points to the primary key of the entity)
StringCol1 - string,
...
IntCol1 - int,
...
This makes it easy to do a join on the entity and then get the data along with the extension fields.
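For example, sketched in SQL (base-table columns such as Brand are placeholders here; we assume one extension row per entity):

-- Filter on the base table first, then pick up the extension row:
SELECT c.id, c.Brand, e.StringCol1, e.IntCol1
FROM Cars c
JOIN CarsExtension e ON e.entityid = c.id
WHERE c.Brand = 'Audi';

-- Filter on an extended field first, then match back to the base entity:
SELECT c.id, c.Brand, e.IntCol1
FROM CarsExtension e
JOIN Cars c ON c.id = e.entityid
WHERE e.IntCol1 > 2017;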
If the table metadata and the data have to be inferred from separate tables, it will be a difficult task to maintain over a long period of time and with a huge volume of data.
HTH

Group By Grouping Sets Returning Unexpected Result

I have a table on which I'm using GROUP BY GROUPING SETS, and it is returning one row of data that I do not understand. I was hoping you all could help me make sense of it:
The first row that is returned contains NULL for both Balance and WarehouseNo, but I know that the Total Value corresponds to WarehouseNo WW-COI with a NULL Balance (see the second image proving this).
Why does it appear as null when using Group By Grouping Sets?
I think you have a couple different confusions going on here.
Grouping sets are usually used for getting rid of Union All where you need different groupings on the same table.
In your case, you are keeping your Union All because it is on two different tables.
So, from what it seems, you probably just want to use a normal Group By to keep your groupings linked together. It's not clear to me why you'd need grouping sets here.
Now... to answer your question:
Since you are using grouping sets on this unioned dataset, it performs a separate grouping for each set you provided.
That is, it groups by just WarehouseNo, and, independently, it does a different grouping by just Balance. In each output row, any column that is not part of the current grouping set comes back as NULL.
Without seeing your original data, this is probably the reason you are getting NULLs in places you didn't expect.
If you want the two columns to be linked, you would need to include them both in the same set, as in:
Group By Grouping Sets ((WarehouseNo, Balance), (another grouping you may want))
The "other grouping" could well be just (WarehouseNo) or (Balance) or even no grouping (). But only you can decide why that information might be important to you.
So, from the looks of it, you probably just want to use a normal Group By here. But quite possibly I'm missing something about your data and what you are trying to achieve with it.
Hope that helps. :)

REST API structure for multiple countries

I'm designing a REST API where you can search for data in different countries, but since you can search for the same thing, at the same time, in different countries (max 4), I am unsure of the best/correct way to do it.
This would work to start with to get data (I'm using cars as an example):
/api/uk,us,nl/car/123
That request could return different ids for the different countries (uk=1,us=2,nl=3), so what do I do when data is requested for those 3 countries?
For a nice structure I could get the data one at a time:
/api/uk/car/1
/api/us/car/2
/api/nl/car/3
But that is not very efficient since it hits the backend 3 times.
I could do this:
/api/car/?uk=1&us=2&nl=3
But that doesn't work very well if I want to add to that path:
/api/uk/car/1/owner
Because that would then turn into:
/api/car/owner/?uk=1&us=2&nl=3
Which doesn't look good.
Anyone got suggestions on how to structure this in a good way?
I answered a similar question before, so I will stick to that idea:
You have a set of elements (cars) and you want to filter it in some way. My advice is to add any filter as a field. If the field is not present, then choose one country based on the client's locale:
mydomain.com/api/v1/car?countries=uk,us,nl
This field should disappear when you look for a specific car or its owner
mydomain.com/api/v1/car/1/owner
because the country is not needed (unless the car ID 1 is reused for each country)
Update:
I really did not expect that the id of a car could be shared by several cars; an ID should be unique (like a primary key in a database). In that case, it makes sense to keep the country parameter in the owner search:
mydomain.com/api/v1/car/1/owner?countries=uk,us
This should return a list of people who own a car with the id 1... but to me this makes little sense as functionality; for this search I would only allow one country:
mydomain.com/api/v1/car/1/owner?country=uk

Automating a data feed into a PostgreSQL table when the number of columns could change and there are duplicate names

My company uses a third-party vendor to get all of our NPS information. I'm trying to set up a data feed from this vendor into our data warehouse, which runs PostgreSQL.
The feed is in the form of 2 tab-separated text files: the "question mapping" and the responses. The question map is one row per question, with columns for question id, question text, question label, question type, etc. - straightforward. The responses are one row per survey response, with a column for each question plus things like user id, etc. Here are the 2 biggest problems:
The survey questions sometimes use the same question ID for different questions, resulting in multiple columns in the response data having the same name but not being the same question.
The number of questions could change, resulting in a different number of columns in the data.
Both of these things make it a real headache to automate a data feed into a single table.
I'm afraid I don't quite know how to phrase my real question other than, "Does anyone have any ideas how I can accomplish this?" If I think of something better than that, I'll come and update this, so for now:
Does anyone have any ideas at all about how I can efficiently set up my automated data feed without having to always drop and recreate everything?
If your data is a mess and doesn't really have well-defined columns, you can use the entity-attribute-value (EAV) pattern, where you turn each fact into a set of rows with 4 columns: a unique row id; an entity id, the same for every row extracted from one response; an attribute column (holding what would have been the column name), taken from the key in the question map; and a value column holding the value itself. It's not that neat, but you can still query it, and you won't have to drop anything when the feed arrives with a new column.
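A minimal sketch of that layout in PostgreSQL (table and attribute names here are made up):

CREATE TABLE response_eav (
    row_id    bigserial PRIMARY KEY,   -- unique row id
    entity_id text NOT NULL,           -- the survey response this row came from
    attribute text NOT NULL,           -- what would have been the column name
    value     text                     -- the raw value from the feed
);

-- New questions simply become new attribute values; no DDL change needed.
-- Example: pull every answer to one question across all responses.
SELECT entity_id, value
FROM response_eav
WHERE attribute = 'q_nps_score';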

Designing Fact table associated with multiple attribute hierarchy members

I am designing a warehouse to accommodate a movie-related database. I have a table with the columns Title, Genre, SalesAmount, ProductionAmount.
One such row might be: GodFather, Crime|Drama, 1000000, 20000.
I want to move this to the DW. I am looking at loading it into a fact table, say FactSale, with linkage to a Genre dimension.
My objective is to analyze revenues by genre. In this case, how would I build the cube matrix? I also have a mapping table with TitleId and GenreId.
Also, would it be possible to create a dynamic hierarchy, say under Action, Drama, Romance, etc.? The idea is to gather info on a single genre or a combination of genres.
Can someone guide me on how to go about it?
I have found the answer, thanks to this whitepaper from SQLBI, which has extensive examples and explanations: http://www.sqlbi.com/wp-content/uploads/The_Many-to-Many_Revolution_2.0.pdf
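In short, the core of the design is a bridge table between the fact and the Genre dimension. A rough sketch (names are illustrative; the whitepaper covers the full treatment, including how to weight overlapping genre totals):

CREATE TABLE DimGenre (
    GenreId   int PRIMARY KEY,
    GenreName varchar(50) NOT NULL       -- 'Crime', 'Drama', ...
);

CREATE TABLE DimTitle (
    TitleId int PRIMARY KEY,
    Title   varchar(200) NOT NULL        -- 'GodFather', ...
);

CREATE TABLE BridgeTitleGenre (          -- the TitleId/GenreId mapping table
    TitleId int NOT NULL REFERENCES DimTitle(TitleId),
    GenreId int NOT NULL REFERENCES DimGenre(GenreId),
    PRIMARY KEY (TitleId, GenreId)
);

CREATE TABLE FactSale (
    TitleId          int NOT NULL REFERENCES DimTitle(TitleId),
    SalesAmount      decimal(18, 2),
    ProductionAmount decimal(18, 2)
);

-- Revenue by genre: a title tagged with N genres contributes to all N.
SELECT g.GenreName, SUM(f.SalesAmount) AS Revenue
FROM FactSale f
JOIN BridgeTitleGenre b ON b.TitleId = f.TitleId
JOIN DimGenre g ON g.GenreId = b.GenreId
GROUP BY g.GenreName;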