How to partition by LIST COLUMNS / RANGE COLUMNS in MySQL Workbench partition tab? - mysql-workbench

It looks like the option to partition by LIST COLUMNS or RANGE COLUMNS is missing from the MySQL Workbench partition tab, or am I overlooking something?
Is there some format/syntax to accomplish this, or is it simply not available?
A workaround is to generate the SQL from the model and modify it by hand, but I wanted to do it directly in MySQL Workbench. Is there any way to accomplish this?
PS: I was trying to partition by COLUMNS, see here: https://dev.mysql.com/doc/refman/5.5/en/partitioning-columns.html

Only LIST and RANGE are available for "Partition by":
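Since the COLUMNS variants are not offered, the practical route is the workaround mentioned in the question: forward-engineer the SQL script from the model and edit the generated DDL by hand. A minimal sketch of the clause you would add, using a hypothetical orders table (per the COLUMNS partitioning syntax in the linked manual page):

```sql
-- Hypothetical table; the PARTITION BY RANGE COLUMNS clause is added
-- manually to the script generated by Workbench.
CREATE TABLE orders (
    order_id   INT  NOT NULL,
    order_date DATE NOT NULL
)
PARTITION BY RANGE COLUMNS (order_date) (
    PARTITION p2011 VALUES LESS THAN ('2012-01-01'),
    PARTITION p2012 VALUES LESS THAN ('2013-01-01'),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);
```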

Related

Pivot data in Talend

I have some data which I need to pivot in Talend. This is a sample:
brandname,metric,value
A,xyz,2
B,xyz,2
A,abc,3
C,def,1
C,ghi,6
A,ghi,1
Now I need this data to be pivoted on the metric column like this:
brandname,abc,def,ghi,xyz
A,3,null,1,2
B,null,null,null,2
C,null,1,6,null
Currently I am using tPivotToColumnsDelimited to pivot the data to a file and then reading it back from that file. However, having to store data in an external file and read it back is messy and adds unnecessary overhead.
Is there a way to do this in Talend without writing to an external file? I tried tDenormalize, but as far as I understand it returns the rows as a single column, which is not what I need. I also looked for a third-party component on Talend Exchange but couldn't find anything useful.
Thank you for your help.
Assuming that your metrics are fixed, you can use their names as the columns of the output. The pivot then has two parts: first, a tMap that transposes the value of each input row (in) into the corresponding column of the output row (out); second, a tAggregateRow that groups the tMap's output rows by brandname.
In the tMap you'd fill each output column conditionally, for example for the output column named "abc":
out.abc = "abc".equals(in.metric) ? in.value : null
In the tAggregateRow you'd group by out.brandname and aggregate each metric column with sum, ignoring nulls.
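For comparison, this two-step combination implements the same conditional-aggregation pivot you would write in SQL (input_rows is a hypothetical table holding the sample data; SUM ignores nulls just like the aggregation above):

```sql
SELECT brandname,
       SUM(CASE WHEN metric = 'abc' THEN value END) AS abc,
       SUM(CASE WHEN metric = 'def' THEN value END) AS def,
       SUM(CASE WHEN metric = 'ghi' THEN value END) AS ghi,
       SUM(CASE WHEN metric = 'xyz' THEN value END) AS xyz
FROM input_rows
GROUP BY brandname;
```

Each CASE plays the role of one tMap expression, and the GROUP BY plays the role of the aggregation component.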

apache cassandra - Inconsistency between number of records returned and count(*) result

I am importing some data into a table in Apache Cassandra using the COPY command. I have 7 rows in my CSV file, but after importing I have just 1 row instead of 7. What could cause this inconsistency?
attached is the image of my cqlsh screen
Possible issue:
the rows share the same primary key, so each imported row upserts (overwrites) the previous one and only the last survives.
Solution
try adding another column (something domain-specific) to the clustering key so that every row is unique.
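As a CQL sketch with hypothetical table and column names: if domain alone is the primary key, COPY upserts the same key seven times and only the last row survives; adding a second column to the key keeps the rows distinct.

```sql
-- Before (hypothetical): every CSV row carries the same key value,
-- so each import overwrites the previous row.
CREATE TABLE sites_old (
    domain text PRIMARY KEY,
    hits   int
);

-- After: page_id (a domain-specific column) joins the clustering key,
-- making each imported row unique.
CREATE TABLE sites (
    domain  text,
    page_id int,
    hits    int,
    PRIMARY KEY (domain, page_id)
);
```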

HBase - MultiGet and Selective Columns

Using multiget, I can selectively query multiple random rows from HBase over the REST interface:
http://hostname:port/tablename/multiget/?row=row1&row=row2
And for selecting a few columns of a single row:
http://hostname:port/tablename/rowkey/columnFamily:columnName
How can I use multiget and select only a few columns at the same time?
Looking at the JIRA issue for multi-gets (HBASE-3541), it seems there is no option to specify columns when using multiget. However, it also looks like it would be fairly simple to add this functionality.
EDIT: the issue I opened regarding this problem (linked here) was resolved and included in HBase's 1.3.0 release, so it is now possible to select only specific columns.

Can I have more than 250 columns in the result of a PostgreSQL query?

Note that the PostgreSQL website mentions a limit on the number of columns of between 250 and 1600, depending on column types.
Scenario:
Say I have data in 17 tables each table having around 100 columns. All are joinable through primary keys. Would it be okay if I select all these columns in a single select statement? The query would be pretty complex but can be programmatically generated. The reason for doing this is to get denormalised data to populate a web page. Please do not ask why though :)
Quite obviously, if I do create table table1 as (<the complex select statement>), I will hit the limit mentioned on the website. But do plain SELECT queries face the same restriction?
I could probably find this out by doing the exercise myself. In the next few days I probably will. However, if someone has an idea about this and the problems I might face by doing a single query, please share the knowledge.
I can't find definitive documentation to back this up, but I have received the following error using JDBC on PostgreSQL 9.1 before:
org.postgresql.util.PSQLException: ERROR: target lists can have at most 1664 entries
As I say though, I can't find the documentation for that, so it may vary by release.
I've found the confirmation: the maximum is 1664. It is one of the metrics exposed in the INFORMATION_SCHEMA.SQL_SIZING table:
SELECT * FROM INFORMATION_SCHEMA.SQL_SIZING
WHERE SIZING_NAME = 'MAXIMUM COLUMNS IN SELECT';
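For the scenario in the question the cap would in fact be hit: 17 tables with about 100 columns each put roughly 1700 entries in the target list, which exceeds 1664. A sketch with hypothetical table names:

```sql
-- Hypothetical: selecting every column of all 17 joined tables yields
-- about 17 * 100 = 1700 target-list entries, over the 1664 cap, so the
-- query fails whether or not it is wrapped in CREATE TABLE ... AS.
SELECT t1.*, t2.*, t3.* /* ... through t17.* */
FROM t1
JOIN t2 USING (id)
JOIN t3 USING (id)
/* ... further joins through t17 ... */;
```

Selecting only the subset of columns the web page actually needs keeps the query under the limit.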

How to select distinct values in a column in Talend

I am importing an excel file in Talend.
I want to select all the distinct values in column "A" and then dump that data into the database. Is it possible to do that with Talend?
If not, what alternatives are available? Any help is appreciated.
Yes, you can do that easily with Talend Open Studio.
Create a new job like this one:
You can replace the tOracleOutput component with the component corresponding to your database.
Then parameterize the tAggregateRow component like this:
Distinct values of ColumnA will be transferred to distinctColumnA in the output schema.
You can also get the number of occurrences by adding a count of columnB in the operations table.
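The tAggregateRow configuration corresponds to an ordinary group-by; for comparison, here is the equivalent SQL, assuming the Excel data were loaded into a hypothetical table named sheet1:

```sql
-- Distinct values only:
SELECT DISTINCT columnA AS distinctColumnA
FROM sheet1;

-- Distinct values plus occurrence counts, matching the optional
-- count-of-columnB operation:
SELECT columnA AS distinctColumnA, COUNT(columnB) AS occurrences
FROM sheet1
GROUP BY columnA;
```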
Using tUniqRow in Talend Open Studio 6.3 works very well, and you get to keep all your columns.