Tableau rawsqlagg_real - tableau-api

Could somebody please give me a little guidance on rawsqlagg_real function in Tableau. What is right syntax for it when it is used to get data from MySQL.
I used it as per my understanding but I am getting an error "No such column [__measure__3]".
Code:
RAWSQLAGG_REAL("select count(Film Id) from flavia.TableforThe_top_10percent_of_the_user where count(distinct(User Id)) = %1",[it sucks])

I see a few issues here
Instead of WHERE, use HAVING
You have column names like Film Id, you should write them as 'Film Id' instead
Though I must say that it is better to do with LOD calculations as Tableau will be able to do better query optimizations that way. Plus it is less error prone and much easier to write.

I find another issue here in addition to using having instead of where. The filter value should be numeric, or the operator should be like and not =.
where count(distinct(User Id)) = **%1**

Related

Postgres...how to improve ilike results (quality not speed)

I have a list of chemicals in my database and I provide our users with the ability to do a live search via our website. I use SQLAlchemy and the query I use looks something like this:
Compound.query.filter(Compound.name.ilike(f'%{name}%')).limit(50).all()
When someone searches for toluene, for example, they don't get the result they're looking for because there are many chemicals that have the word toluene in them, such as:
2, 4 Dinitrotoluene
2-Chloroethyl-p-toluenesulfonate
4-Bromotoluene
6-Amino-m-toluenesulfonic acid
a,2,4-trichlorotoluene
a,o-Dichlorotoluene
a-Bromtoluene
etc...
I realize I could increase my limit but I feel like 50 is more than enough. Or, I could change the ilike(f'%{name}%')) to something like ilike(f'{name}%')) but our business requirements don't want this. What I'd rather do is improve the ability for Postgres to return results so that toluene is at the top of the search results.
Any ideas on how Postgres' ilike capability?
Thanks in advance.
One option is to better rank the results. Postgres text search allows you to rank the results.
A cheap and dirty version of preferential ranking is to do multiple queries for name = ?, ilike(f'{name}%')), and ilike(f'%{name}%')) using a union. That way the ilike(f'{name}%')) results come first.
And rather than a hard limit, offer pagination. SQLAlchemy has paginate to help.
ILIKE yields a boolean. It doesn't specify what order to return the results, just whether to return them at all (you can order by a boolean, but if you only return trues there is nothing left to order by). So by the time you are done improving it, it would no longer be ILIKE at all but something else completely.
You might be looking for something like <-> from pg_trgm, which provides a distance score which can be sorted on. Although really, you could just order the result based on the length of the compound name, and return the shortest 50 that contain the target.
something like ilike(f'{name}%')) but our business requirements don't want this
Isn't your business requirement to get better results?
But at least in my database, this could just return a bunch of names in inverted format, like toluene, 2,4-dinitro, so the results might not be much better, unless you avoid storing such inverted names. Sorting by either <-> or by length would overcome that problem. But they would also penalize toluene, ACS reagent grade 99.99% by HPLC, should you have names like that.

Sequelize how to use aggregate function on Postgres JSONB column

I have created one table with JSONB column as "data"
And the sample value of that column is
[{field_id:1, value:10},{field_id:2, value:"some string"}]
Now there are multiple rows like this..
What i want ?
I want to use aggregate function on "data" column such that, i should
get
Sum of all value where field_id = 1;
Avg of value where field_id = 1;
I have searched alot on google but not able to find a proper solution.
sometimes it says "Field doesn't exist" and some times it says "from clause missing"
I tried referring like data.value & also data -> value lastly data ->> value
But nothing is working.
Please let me know the solution if any one knows,
Thanks in advance.
Your attributes should be something like this, so you instruct it to run the function on a specific value:
attributes: [
[sequelize.fn('sum', sequelize.literal("data->>'value'")), 'json_sum'],
[sequelize.fn('avg', sequelize.literal("data->>'value'")), 'json_avg']
]
Then in WHERE, you reference field_id in a similar way, using literal():
where: sequelize.literal("data->>'field_id' = 1")
Your example also included a string for the value of "value" which of course won't work. But if the basic Sequelize setup works on a good set of data, you can enhance the WHERE clause to test for numeric "value" data, there are good examples here: Postgres query to check a string is a number
Hopefully this gets you close. In my experience with Sequelize + Postgres, it helps to run the program in such a way that you see what queries it creates, like in a terminal where the output is streaming. On the way to a working statement, you'll either create objects which Sequelize doesn't like, or Sequelize will create bad queries which Postgres doesn't like. If the query looks close, take it into pgAdmin for further work, then try to reproduce your adjustments in Sequelize. Good luck!

Redshift limitation on LIKE operator in CASE statement

I am running into an issue where if I have more than 15 LIKE operators within a case statement, I get an error java.lang.StackOverflowError.
Here is an example of what I am doing against a table with 60 million rows:
SELECT
CASE WHEN field LIKE '%value%' THEN 'result'
WHEN field LIKE '%value2%' THEN 'result2'
.... 14 more of those
END
I haven't seen this limitation documented anywhere. Any ideas how to get around this?
It sounds like it's an out-of-memory error.
I think you have some options:
use an intermediate table before doing the like processing (or use intermediate tables to process subsets of your initial data)
bump up the number of queue slots that you're using for this query to have more memory available https://docs.aws.amazon.com/redshift/latest/dg/r_wlm_query_slot_count.html
take a look at the explain output to see if it gives you clues about what's going wrong
You could Create a Scalar Python User-Defined Function to replace the LIKE comparisons.
Then, just use:
SELECT f_myfunc(field)
This turned out to be a driver issue. I was initially using 1.2.16.1027 and upgraded to 1.2.20.1043 and I am no longer receiving the error.

FsSql Not working when Parameterizing Columns

Using F# , FsSql and PostGres
So I'm using this function
let getSqlParameter value =
let uniqueKey = Guid.NewGuid().ToString("N")
let key = (sprintf "#%s" uniqueKey)
(key,Sql.Parameter.make(key,value))
to get me a parameter of anything I pass in dynamically
Which I then append to a query and I get something like this
select * from (select * from mytable) as innerQuery where #a29c575b69bb4629a9971dac2808b445 LIKE '%#9e3485fdf99249e5ad6adb6405f5f5ca%'
Then I take a collection of these and pass them off
Sql.asyncExecReader connectionManager query parameters
The problem that I'm having is that when I don't run this through my parameterization engine, it works fine. When I do, it doesn't work. It just returns empty sets.
The only thing I can think of is that the column names can't be parameterized. This is a problem because they're coming from the client. Is there a way to do this?
Okay so the answer here is that you can't parameterize column names as far as I can tell.
What I ended up doing was creating a whitelist of acceptable column names and then compare what was coming in to my whitelist. If it doesn't exist then I drop it.
By far a sub-optimal solution. I really wish there was a way to do this.

Tableau: Create a table calculation that sums distinct string values (names) when condition is met

I am getting my data from denormalized table, where I keep names and actions (apart from other things). I want to create a calculated field that will return sum of workgroup names but only when there are more than five actions present in DB for given workgroup.
Here's how I have done it when I wanted to check if certain action has been registered for workgroup:
WINDOW_SUM(COUNTD(IF [action] = "ADD" THEN [workgroup_name] END))
When I try to do similar thing with count, I am getting "Cannot mix aggregate and non-aggregate arguments":
WINDOW_SUM(COUNTD(IF COUNT([Number of Records]) > 5 THEN [workgroup_name] END))
I know that there's problem with the IF clause, but don't know how to fix it.
How to change the IF to be valid? Maybe there's an easier way to do it, that I am missing?
EDIT:
(after Inox's response)
I know that my problem is mixing aggregate with non-aggregate fields. I can't use filter to do it, because I want to use it later as a part of more complicated view - filtering would destroy the whole idea.
No, the problem is to mix aggregated arguments (e.g., sum, count) with non aggregate ones (e.g., any field directly). And that's what you're doing mixing COUNT([Number of Records]) with [workgroup_name]
If your goal is to know how many workgroup_name (unique) has more than 5 records (seems like that by the idea of your code), I think it's easier to filter then count.
So first you drag workgroup_name to Filter, go to tab conditions, select By field, Number of Records, Count, >, 5
This way you'll filter only the workgroup_name that has more than 5 records.
Now you can go with a simple COUNTD(workgroup_name)
EDIT: After clarification
Okay, than you need to add a marker that is fixed in your database. So table calculations won't help you.
By definition table calculation depends on the fields that are on the worksheet (and how you decide to use those fields to partition or address), and it's only calculated AFTER being called in a sheet. That way, each time you call the function it will recalculate, and for some analysis you may want to do, the fields you need to make the table calculation correct won't be there.
Same thing applies to aggregations (counts, sums,...), the aggregation depends, well, on the level of aggregation you have.
In this case it's better that you manipulate your data prior to connecting it to Tableau. I don't see a direct way (a single calculated field that would solve your problem). What can be done is to generate a db from Tableau (with the aggregation of number of records for each workgroup_name) then export it to csv or mdb and then reconnect it to Tableau. But if you can manipulate your database outside Tableau, it's usually a better solution