Using a group by clause with ols() gives "getMember method not supported" in DolphinDB

I followed the manual's example of using the ols() function with group by, but got the error message: getMember method not supported.
This is the example provided by the DolphinDB manual:
def myols(y,x) {
r=ols(y,x,true,2)
return r.Coefficient.beta join r.RegressionStat.statistics[0]
}
select myols(y,(factor1,factor2)) as `int`factor1`factor2`R2 from t group by id;
This is what I wrote:
def myols(y,x) {
r=ols(y,x,true,2)
return r.RegressionStat.statistics[1]
}
select myols(price, volume) as r2 from t1 group by date, wind_code

You can use try-catch and return double() when an exception is thrown:
t1 = select * from table_raw where date>2021.12.08
def myols(y,x) {
r=ols(y,x,true,2)
try {return r.RegressionStat.statistics[1]} catch(ex) {return double()}
}
select myols(price, volume) as r2 from t1 group by date, wind_code


What are the relevant rules for Flink Window TVF and CEP SQL?

I am trying to parse column-level lineage for Flink windowing TVF SQL. I initialize a custom FlinkChainedProgram and set some optimization rules.
This mostly works fine, except for Window TVF SQL and CEP SQL.
For example, here is a statement and the logical plan I get for it:
insert into sink_table(f1, f2, f3, f4)
SELECT cast(window_start as String),
cast(window_start as String),
user_id,
cast(SUM(price) as Bigint)
FROM TABLE(TUMBLE(TABLE source_table, DESCRIPTOR(event_time), INTERVAL '10' MINUTES))
GROUP BY window_start, window_end, GROUPING SETS ((user_id), ());
rel#1032:FlinkLogicalCalc.LOGICAL.any.None: 0.[NONE].[NONE](input=FlinkLogicalAggregate#1030,select=CAST(window_start) AS EXPR$0, CAST(window_start) AS EXPR$1, null:BIGINT AS EXPR$2, user_id, null:VARCHAR(2147483647) CHARACTER SET "UTF-16LE" AS EXPR$4, CAST($f4) AS EXPR$5)
As we can see, the optimized RelNode tree contains null columns, so MetadataQuery can't get the origin column info.
What rules should I set in the logical optimization phase to parse Window TVF SQL and CEP SQL? Thanks.
I solved column lineage for Flink CEP SQL by adding a getColumnOrigins(Match rel, RelMetadataQuery mq, int iOutputColumn) method to org.apache.calcite.rel.metadata.RelMdColumnOrigins:
/**
* Support column lineage for CEP.
* The first column is the field after PARTITION BY, and the other columns come from the measures in Match
*/
public Set<RelColumnOrigin> getColumnOrigins(Match rel, RelMetadataQuery mq, int iOutputColumn) {
if (iOutputColumn == 0) {
return mq.getColumnOrigins(rel.getInput(), iOutputColumn);
}
final RelNode input = rel.getInput();
RexNode rexNode = rel.getMeasures().values().asList().get(iOutputColumn - 1);
RexPatternFieldRef rexPatternFieldRef = searchRexPatternFieldRef(rexNode);
if (rexPatternFieldRef != null) {
return mq.getColumnOrigins(input, rexPatternFieldRef.getIndex());
}
return null;
}
private RexPatternFieldRef searchRexPatternFieldRef(RexNode rexNode) {
if (rexNode instanceof RexCall) {
RexNode operand = ((RexCall) rexNode).getOperands().get(0);
if (operand instanceof RexPatternFieldRef) {
return (RexPatternFieldRef) operand;
} else {
// recursive search
return searchRexPatternFieldRef(operand);
}
}
return null;
}
Source address: https://github.com/HamaWhiteGG/flink-sql-lineage/blob/main/src/main/java/org/apache/calcite/rel/metadata/RelMdColumnOrigins.java
I have given detailed test cases, you can refer to: https://github.com/HamaWhiteGG/flink-sql-lineage/blob/main/src/test/java/com/dtwave/flink/lineage/cep/CepTest.java
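One thing to note about searchRexPatternFieldRef above: it only descends into the first operand of each call, so a reference sitting in a later operand would be missed, and a call with no operands would make get(0) throw. Below is a minimal sketch of a search that visits every operand. The Node, FieldRef, Literal and Call types are hypothetical stand-ins invented for this illustration, not Calcite's RexNode classes, so treat it as an outline of the traversal rather than a drop-in replacement.

```java
import java.util.Arrays;
import java.util.List;

class PatternRefSearch {
    // Hypothetical stand-ins for an expression tree; NOT Calcite's RexNode classes.
    interface Node {}

    static final class FieldRef implements Node {
        final int index;
        FieldRef(int index) { this.index = index; }
    }

    static final class Literal implements Node {
        final Object value;
        Literal(Object value) { this.value = value; }
    }

    static final class Call implements Node {
        final String op;
        final List<Node> operands;
        Call(String op, Node... operands) {
            this.op = op;
            this.operands = Arrays.asList(operands);
        }
    }

    // Depth-first search over every operand; returns the first FieldRef found, or null.
    static FieldRef findFieldRef(Node node) {
        if (node instanceof FieldRef) {
            return (FieldRef) node;
        }
        if (node instanceof Call) {
            for (Node operand : ((Call) node).operands) {
                FieldRef found = findFieldRef(operand);
                if (found != null) {
                    return found;
                }
            }
        }
        return null; // literal, or a call without any field reference
    }

    public static void main(String[] args) {
        // Models something like PLUS(1, LAST(ref #3)): the reference is in the second operand.
        Node expr = new Call("PLUS", new Literal(1), new Call("LAST", new FieldRef(3)));
        FieldRef ref = findFieldRef(expr);
        System.out.println(ref == null ? "no reference" : "input column " + ref.index);
    }
}
```

Returning only the first reference mirrors the original helper; a measure that combines several input columns would need the search to collect all references instead.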
Flink CEP SQL test case:

Sometimes postgres does not return records when queried remotely from JDBC Client

public static ArrayList<String[]> getDailyRecords () throws ClassNotFoundException {
ArrayList<String[]> list=new ArrayList<String[]> ();
String[] header = {"City", "Location", "Asset", "Number of Alerts", "Time spent in alerts", "Last seen temparature", "Limit"};
list.add (header);
String myDriver = "org.postgresql.Driver";
Class.forName (myDriver);
try( Connection conn = DriverManager.getConnection( ApplicationSettings.DATABASE_URL, ApplicationSettings.DATABASE_USER, ApplicationSettings.DATABASE_PASSWORD);
Statement st = conn.createStatement ();) {
conn.setAutoCommit (true);
ResultSet rs=null;
st.setFetchSize (200);
String dailyQuery="select * from sch.reports();";
rs= st.executeQuery (dailyQuery);
while (rs.next ()) {
String[] ar = new String[7];
ar[0] = rs.getString ("location");
ar[1] = rs.getString ("sublocation");
ar[2] = rs.getString ("zone");
ar[3] = rs.getString ("total_count");
ar[4] = rs.getString ("period");
String temp = rs.getString ("temparature");
ar[5] = temp;
list.add (ar);
}
conn.close ();
st.close ();
rs.close ();
}catch(Exception e){
e.printStackTrace ();
}catch(Error e){
e.printStackTrace ();
}
finally {
if(list.size ()==1){
System.out.println ("NO records found");
}else{
System.out.println ("Found some records");
}
return list;
}
}
I have a SQL function that does return records when queried locally from Postgres clients. I invoke this function on a schedule from a Java application. This worked fine for a few months, but all of a sudden st.executeQuery() started returning an empty result set occasionally: about 4 out of 10 executeQuery() calls in a day return an empty result set.
Things I have tried:
Made sure the query always returns some records. It should return 60 records.
Caught both Errors and Exceptions. None were seen.
Made sure the JDBC connection is closed.
Debugged the Postgres JDBC driver classes. Found that when the result set is empty, no data was received from the Postgres data stream.
Can the size of the data in the tables influence the query response?
Any help is kindly appreciated!
Update: SQL function definition added.
DECLARE
exception_error_code text;
exception_message text;
exception_detail text;
exception_hint text;
exception_context text;
BEGIN
RETURN QUERY
select l.name as location,sl.name as sublocation,z.name as zone,al.lim as lim,count(al.id) total_count,sum(al.timestamp_to-al.timestamp_from) period,t2.temperature temparature
from
map.location l
left join
map.sub_location sl
on l.id=sl.location_id
left join
map.zone z
on sl.id=z.sub_location_id
left join
map.subzone sz
on z.id=sz.zone_id
inner join
(select *,(info::json -> 'breach_type')::text as lim from alr.live where date(timestamp_from)= date(now() + INTERVAL '5 hours 30 minutes')) al
on sz.id= al.sub_zone_id
left join
(select subzone_id,max(nd_timestamp) as nd_timestamp from tel.temperature_sh group by subzone_id) t1
on al.sub_zone_id=t1.subzone_id
left join
tel.temperature_sh t2
on t1.subzone_id=t2.subzone_id
and t1.nd_timestamp=t2.nd_timestamp
group by l.name ,sl.name,z.name,t2.temperature , al.lim
order by l.name ,sl.name,z.name;
-- if something "breaks" do the following
EXCEPTION WHEN others THEN
get stacked diagnostics
exception_error_code = RETURNED_SQLSTATE
,exception_message = MESSAGE_TEXT
,exception_detail = PG_EXCEPTION_DETAIL
,exception_hint = PG_EXCEPTION_HINT
,exception_context = PG_EXCEPTION_CONTEXT
;
-- log exception for debugging
PERFORM public.insert_db_exception(
exception_error_code
,exception_message
,exception_detail
,exception_hint
,exception_context
);
END;

Column must appear in the GROUP BY clause or be used in an aggregate function

I'm updating a Qt application to make it compatible with both SQLite and PostgreSQL.
I have a C++ method that is used to count elements of a given table with given clauses.
In SQLite, the following worked and gave me a number N (the count).
SELECT COUNT(*) FROM table_a
INNER JOIN table_b
ON table_b.fk_table_a = table_a.id
WHERE table_a.start_date_time <> 0
ORDER BY table_a.creation_date_time DESC
With PostgreSQL (I'm using 9.3), I get the following error:
ERROR: column "table_a.creation_date_time" must appear in the
GROUP BY clause or be used in an aggregate function
LINE 5: ORDER BY
table_a.creation_date_time DESC
If I add GROUP BY table_a.creation_date_time, it gives me a table with N rows.
I've read a lot of stuff about how different DBMS allow you to omit columns in the GROUP BY clause. Now, I'm just confused.
For those who are curious, the C++ method is:
static int count(const QString &table, const QString &clauses = QString(""))
{
int success = -1;
if (!table.isEmpty())
{
QString statement = QString("SELECT COUNT(*) FROM ");
statement.append(table);
if (!clauses.isEmpty())
{
statement.append(" ").append(clauses) ;
}
QSqlQuery query;
if(!query.exec(statement))
{
qWarning() << query.lastError();
qWarning() << statement;
}
else
{
if (query.isActive() && query.isSelect() && query.first())
{
bool ok = false;
success = query.value(0).toInt(&ok);
if (ok == false)
{
success = -1;
return success;
}
}
}
}
return success;
}
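If you're just doing a count(*) on the table to get a single scalar result, then surely the order by is redundant? The aggregate collapses the rows into a single row, so there is nothing left to sort, and PostgreSQL rejects an ORDER BY on a column that is neither grouped nor aggregated.
Solution
Remove the redundant order by to get "standard" query behavior across multiple DBMSs.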
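Since the count() helper in the question receives its clauses as a free-form string, one way to apply this advice is to drop a trailing ORDER BY before building the COUNT statement. The following is a rough sketch of that idea, written in Java rather than Qt C++; the CountQueryBuilder class and buildCount method are hypothetical names made up for this illustration. The regular expression is deliberately naive: it cuts everything from the first ORDER BY onward, so it assumes the clauses contain no subquery or string literal with its own ORDER BY.

```java
import java.util.regex.Pattern;

class CountQueryBuilder {
    // Matches from the first ORDER BY to the end of the input (case-insensitive, may span lines).
    private static final Pattern TRAILING_ORDER_BY =
            Pattern.compile("(^|\\s+)ORDER\\s+BY\\b.*$", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Builds "SELECT COUNT(*) FROM <table> <clauses>" with any trailing ORDER BY removed.
    static String buildCount(String table, String clauses) {
        String cleaned = TRAILING_ORDER_BY.matcher(clauses).replaceFirst("").trim();
        return cleaned.isEmpty()
                ? "SELECT COUNT(*) FROM " + table
                : "SELECT COUNT(*) FROM " + table + " " + cleaned;
    }

    public static void main(String[] args) {
        String clauses = "WHERE table_a.start_date_time <> 0\n"
                + "ORDER BY table_a.creation_date_time DESC";
        System.out.println(buildCount("table_a", clauses));
    }
}
```

A more robust alternative is to have callers pass the filter and the ordering as separate arguments, so the count path never sees the ordering at all.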

JPA criteria: count from multiselect query

I want to implement a table component with pagination. The result in the table is retrieved by a multiselect-query like this:
SELECT DISTINCT t0.userId,
t0.userName,
t1.rolleName
FROM userTable t0
LEFT OUTER JOIN roleTable t1 ON t0.userId = t1.fkUser
WHERE(t0.userType = 'normalUser' AND t1.roleType = 'loginRole')
I can get this result via a multiselect query.
Now, for the pagination, I first have to retrieve the total row count.
Can anybody define a CriteriaQuery for one of these SQL statements? I failed because a subquery does not support multiselect, and I do not know how to get the DISTINCT into a count statement.
SELECT COUNT(*) FROM
(
SELECT DISTINCT t0.userId,
t0.userName,
t1.rolleName
FROM userTable t0
LEFT OUTER JOIN roleTable t1 ON t0.userId = t1.fkUser
WHERE(t0.userType = 'normalUser' AND t1.roleType = 'loginRole')
)
or
SELECT COUNT(DISTINCT t0.userId || t0.userName || t1.rolleName)
FROM userTable t0
LEFT OUTER JOIN roleTable t1 ON t0.userId = t1.fkUser
WHERE(t0.userType = 'normalUser' AND t1.roleType = 'loginRole')
Thanks in advance!
By the way, I am using OpenJPA on a WebSphere application server.
The following is not tested but should work:
CriteriaBuilder builder = em.getCriteriaBuilder();
CriteriaQuery<Long> query = builder.createQuery(Long.class);
Root<User> t0 = query.from(User.class);
Join<User, Role> t1 = t0.join("roles", JoinType.LEFT);
// Count the distinct concatenations; concat() needs String attributes, so userId is assumed to be a String here.
query.select(builder.countDistinct(builder.concat(t0.get(User_.userId), builder.concat(t0.get(User_.userName), t1.get(Role_.rolleName)))));
query.where(builder.equal(t0.get("userType"), "normalUser"), builder.equal(t1.get("roleType"), "loginRole"));
TypedQuery<Long> tq = em.createQuery(query);
Long total = tq.getSingleResult();
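One caveat with the concatenation approach, in both the SQL and the Criteria version: joining the columns without a separator can make two different rows look identical, which undercounts. The sketch below is plain Java with made-up sample rows (the DistinctCountDemo class and its methods are hypothetical names used only for this illustration), and it compares counting whole tuples against counting concatenated strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DistinctCountDemo {
    // Counts distinct rows by comparing whole tuples, like SELECT DISTINCT a, b, c.
    static int countDistinctTuples(List<String[]> rows) {
        Set<List<String>> seen = new HashSet<>();
        for (String[] row : rows) {
            seen.add(Arrays.asList(row));
        }
        return seen.size();
    }

    // Counts distinct rows after joining the columns, like COUNT(DISTINCT a || sep || b || sep || c).
    static int countDistinctJoined(List<String[]> rows, String separator) {
        Set<String> seen = new HashSet<>();
        for (String[] row : rows) {
            seen.add(String.join(separator, row));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // Made-up sample rows (userId, userName, rolleName) that differ only in where the columns split.
        List<String[]> rows = new ArrayList<>();
        rows.add(new String[] {"1", "2a", "admin"});
        rows.add(new String[] {"12", "a", "admin"});

        System.out.println(countDistinctTuples(rows));      // 2
        System.out.println(countDistinctJoined(rows, ""));  // 1, both rows become "12aadmin"
        System.out.println(countDistinctJoined(rows, "|")); // 2
    }
}
```

In the Criteria API the same idea would mean placing a literal separator between the concat operands, for example through the builder.concat(Expression<String>, String) overload, using a separator that cannot occur in the data.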
Due to the known issue https://jira.spring.io/browse/DATAJPA-1532, multiselect does not work with the repo.findAll method. I handled this by autowiring the entity manager into the service class.
@Autowired
EntityManager entityManager;
public List<?> getResults() throws ParseException
{
//ModelSpecification modelSpecification = new ModelSpecification();
CriteriaQuery<DAO> query = modelSpecification.getSpecQuery();
TypedQuery<DAO> typedQuery = entityManager.createQuery(query);
List<?> resultList = typedQuery.getResultList();
//List<DAO> allData = entityManager.createQuery(query).getResultList();
return resultList;
}
You can find working code here https://github.com/bsridharpatnaik/CriteriaMultiselectGroupBy

Extending Zend_Db

I apologize if my title is a bit misleading and it turns out that it's some other class under Zend_Db.
I use the following method of extracting data from an MSSQL database:
// $_config contains information about how to connect to my MSSQL server
$config = new Zend_Config($_config);
$db = Zend_Db::factory($config->database);
$sql = "SELECT * FROM table1";
print_r($db->fetchAll($sql));
So far no problem and everything runs smooth :).
Now I need to run some more complex queries with multiple rowsets:
$sql2 = <<<EOD
DECLARE @test INT
SET @test = 42
SELECT * FROM table1 WHERE col1 = @test
SELECT col2 FROM table2 UNION
SELECT col3 FROM table2
SELECT * FROM table3
EOD;
print_r($db->fetchAll($sql2));
I hope you get the idea.
Using $db->fetchAll($sql2); will result in
Fatal error: Uncaught exception
'Zend_Db_Statement_Exception' with
message 'SQLSTATE[HY000]: General
error: 10038 Attempt to initiate a new
SQL Server operation with results
pending. [10038] (severity 7)
[(null)]' in
\Sacp026a\sebamweb$\prod\includes\Zend\Db\Statement\Pdo.php:234
The following function will return all the correct rowsets:
function sqlquery_multiple($zdb, $sql) {
$stmt = $zdb->query($sql);
$rowsets = array();
do {
if ($stmt->columnCount() > 0) {
$rowsets[] = $stmt->fetchAll(PDO::FETCH_ASSOC);
}
} while ($stmt->nextRowset());
return $rowsets;
}
print_r(sqlquery_multiple($db, $sql2));
Now my big question is:
How do I extend Zend_Db, so I can implement and use the function above as $db->fetchMulti($sql2); instead of sqlquery_multiple($db, $sql2)?
Thanks in advance :)
NB: It's worth mentioning that I'm using the ODBC-patch in order to be able to fetch multiple rowsets in the first place.