Joining Enumerated and Symbol Column in kdb+ using Over - kdb

Im looking at two tables, one with an enumerated column the other is not. Im using (,/) to join them together and it doesn't unenumerate the data. (uj/) and last (,\) both do. Does anyone why this might be the case?
KDB+ 4.0 2021.04.26 Copyright (C) 1993-2021 Kx Systems
q)enum:`a`b
q)t1:([]c1:`enum$`a`b)
q)t2:([]c1:`a`b)
q)(,/) (t1;t2)
c1
--------
`enum$`a
`enum$`b
`a
`b
q)(uj/) (t1;t2)
c1
--
a
b
a
b
q)last(,\) (t1;t2)
c1
--
a
b
a
b
q)```

I suspect it's because ,/ (along with raze) is given special treatment by the interpreter while the others aren't.
An actual append-over gives you the result you want:
q)0N!({x,y}/)(t1;t2);
+(,`c1)!,`a`b`a`b
So even though {x,y}/ appears to be the same as ,/, that isn't always the case due to the "special treatment" under the covers

Related

Given an existing table, generate code for defining an empty table with the same schema

I would like to take an existing table that I deserialized from a binary file or obtained from a remote process and generate code that will create an empty copy of the table so that I can have a human-readable representation of the schema that I can use to easily re-create the table.
For example, assume I have a trade table in memory and I want to generate code that will return an empty table of the same schema.
q)show meta trade
c | t f a
-----| -----
time | n
sym | s g
price| f
size | i
stop | b
cond | c
ex | c
I'm aware I can obtain an empty copy of trade by running 0#trade. However, I'd like to have a general function (let's say it's called getSchema) that will behave something like this:
q) getSchema trade
"trade:([]time:`timespan$(); sym:`g#`symbol$(); price:`float$(); size:`int$(); stop:`boolean$(); cond:`char$(); ex:`char$())"
I think it would be straightforward to implement this by processing the result of meta trade, but I was wondering if there was was a more straightforward or publicly available implementation of this function. Thanks.
I haven't seen such function available in public. An example is below, but it does not cover all cases (this is left for an exercise)
getSchema: {
typeMapping: "nsfibc"!("timespan";"symbol";"float";"int";"boolean";"char");
c: exec c from meta x;
t: exec t from meta x;
statement: string[x],":([]";
statement,: "; " sv string[c],'": `",/:(typeMapping#t),\:"$()";
statement,: ")";
statement
};
//Expects table's name as symbol
getSchema`trade
What is not covered:
attributes: Attribute should go in the middle of "; " sv string[c],'": _code_for_attribute_ `",/:(typeMapping#t),\:"$()" statement
types: typeMapping must be enriched to cover the rest of Q types
keys. If table is keyed, then keyed columns are listed inside square brackets t: ([keyed_columns] other_columns)
foreign keys. To be fair they are seldom in use, so I would ignore them
The following should work nicely, however it will only render types atomic types, which will conform to how tables are normally defined. This method just utilizes the string representation of the empty schema that you can achieve via -3!
This should handle keyed tables and attributes but not foreign key.
getSchema:{[x]
/ Render the types with -3!
typs:ssr[-3!value flip 0!0#get x;"\"\"";"`char$()"];
/ Append the column names to the types
colNameType:(string[cols 0!get x],\:":"),'";" vs 1_-1_typs;
/ Ensure the correct keys are shown, add in ;
colnametyp:#[colNameType;count[keys x] - 1;,;"]"],\:";";
/ Combine
raze string[x],":([",(-1_raze colnametyp),")"
}
q)trade:2!([]time:`timespan$(); sym:`g#`symbol$(); price:`float$(); size:`int$(); stop:`boolean$(); cond:`char$(); ex:`char$());
q)getSchema `trade
"trade:([time:`timespan$();sym:`g#`symbol$()];price:`float$();size:`int$();stop:`boolean$();cond:`char$();ex:`char$())"

Answer Set Programming: Group into two sets so that those who like each other are in same set, and dislike = different set

I'm basically a beginner to Answer Set Programming (CLINGO), so I've been attempting this problem for hours now.
person(a;b;c;d;e;f).
likes(b,e; d,f).
dislikes(a,b; c,e).
People who like each other must be in the same set, and cannot be in the same set as someone they dislike.
So the output should be:
b,e | a, c, d,f
I know the logic behind it; partition it so that if an element is in both likes & dislikes, then it should be in its own set, and everything else in the other. But this is declarative programming, so I'm not sure how to tackle this. Any help would be appreciated.
Try this one, it should work for you:
person(a;b;c;d;e;f).
like(b,e; d,f).
dislike(a,b; c,e).
group(1..2).
% every person belongs to one group only.
1{in(S,G): group(G)}1 :- person(S).
% no two persons who do dislike each other are in the same group
:- in(X, G), in(Y, G), dislike(X,Y).
#show in/2.
The result you'll get is:
a & b are in different group.
and c & e are in different group.
The result you can get is like:

Assigning a whole DataStructure its nullind array

Some context before the question.
Imagine file FileA having around 50 fields of different types. Instead of all programs using the file, I tried having a service program, so the file could only be accessed by that service program. The programs calling the service would then receive a DataStructure based on the file structure, as an ExtName. I use SQL to recover the information, so, basically, the procedure would go like this :
Datastructure shared by service program :
D FileADS E DS ExtName(FileA) Qualified
Procedure called by programs :
P getFileADS B Export
D PI N
D PI_IDKey 9B 0 Const
D PO_DS LikeDS(FileADS)
D LocalDS E DS ExtName(FileA) Qualified
D NullInd S 5i 0 Array(50) <-- Since 50 fields in fileA
//Code
Clear LocalDS;
Clear PO_DS;
exec sql
SELECT *
INTO :LocalDS :nullind
FROM FileA
WHERE FileA.ID = :PI_IDKey;
If SqlCod <> 0;
Return *Off;
EndIf;
PO_DS = LocalDS;
Return *On;
P getFileADS E
So, that procedure will return a datastructure filled with a record from FileA if it finds it.
Now my question : Is there any way I can assign the %nullind(field) = *On without specifying EACH 50 fields of my file?
Something like a loop
i = 1;
DoW (i <= 50);
if nullind(i) = -1;
%nullind(datastructure.field) = *On;
endif;
i++;
EndDo;
Cause let's face it, it'd be a pain to look each fields of each file every time.
I know a simple chain(n) could do the trick
chain(n) PI_IDKey FileA FileADS;
but I really was looking to do it with SQL.
Thank you for your advices!
OS Version : 7.1
First, you'll be better off in the long run by eliminating SELECT * and supplying a SELECT list of the 50 field names.
Next, consider these two web pages -- Meaningful Names for Null Indicators and Embedded SQL and null indicators. The first shows an example of assigning names to each null indicator to match the associated field names. It's just a matter of declaring a based DS with names, based on the address of your null indicator array. The second points out how a null indicator array can be larger than needed, so future database changes won't affect results. (Bear in mind that the page shows a null array of 1000 elements, and the memory is actually relatively tiny even at that size. You can declare it smaller if you think it's necessary for some reason.)
You're creating a proc that you'll only write once. It's not worth saving the effort of listing the 50 fields. Maybe if you had many programs using this proc and you had to create the list each time it'd be a slight help to use SELECT *, but even then it's not a great idea.
A matching template DS for the 50 data fields can be defined in the /COPY member that will hold the proc prototype. The template DS will be available in any program that brings the proc prototype in. Any program that needs to call the proc can simply specify LIKEDS referencing the template to define its version in memory. The template DS should probably include the QUALIFIED keyword, and programs would then use their own DS names as the qualifying prefix. The null indicator array can be handled similarly.
However, it's not completely clear what your actual question is. You show an example loop and ask if it'll work, but you don't say if you had a problem with it. It's an array, so a loop can be used much like you show. But it depends on what you're actually trying to accomplish with it.
for old school rpg just include the nulls in the data structure populated with the select statement.
select col1, ifnull(col1), col2, ifnull(col2), etc. into :dsfilewithnull where f.id = :id;
for old school rpg that can't handle nulls remove them with the select statement.
select coalesce(col1,0), coalesce(col2,' '), coalesce(col3, :lowdate) into :dsfile where f.id = :id;
The second method would be easier to use in a legacy environment.
pass the key by value to the procedure so you can use it like a built in function.
One answer to your question would be to make the array part of a data structure, and assign *all'0' to the data structure.
dcl-ds nullIndDs;
nullInd Ind Dim(50);
end-ds;
nullIndDs = *all'0';
The answer by jmarkmurphy is an example of assigning all zeros to an array of indicators. For the example that you show in your question, you can do it this way:
D NullInd S 5i 0 dim(50)
/free
NullInd(*) = 1 ;
Nullind(*) = 0 ;
*inlr = *on ;
return ;
/end-free
That's a complete program that you can compile and test. Run it in debug and stop at the first statement. Display NullInd to see the initial value of its elements. Step through the first statement and display it again to see how the elements changed. Step through the next statement to see how things changed again.
As for "how to do it in SQL", that part doesn't make sense. SQL sets the values automatically when you FETCH a row. Other than that, the array is used by the host language (RPG in this case) to communicate values back to SQL. When a SQL statement runs, it again automatically uses whatever values were set. So, it either is used automatically by SQL for input or output, or is set by your host language statements. There is nothing useful that you can do 'in SQL' with that array.

SQL -302 on OPEN cursor in SQLRPGLE

The problem
I've got a SQLRPGLE program that executes queries that look like this:
SELECT orapdt, oraptm, orodr#, c.ccctls, orbill, b.cuslmn, b.cusvrp, orocty, orost, o.cubzip, o.cucnty, ordcty, ordst, d.cubzip, d.cucnty
FROM order
LEFT JOIN cmtctlf c ON orbill = c.cccode
LEFT JOIN custmast b ON orbill = b.cucode
LEFT JOIN custmast o ON orldat = o.cucode
LEFT JOIN custmast d ON orcons = d.cucode
WHERE
orstat != 'C' AND
orbill IN ('ABCDE', 'VWXYZ', 'JKFRTE') AND
orapdt BETWEEN 2012365 AND 2013362 AND
o.cucnty = 'USA' AND
(o.cubzip LIKE '760%' OR o.cubzip LIKE '761%' OR o.cubzip LIKE '762%') AND
d.cubzip = '38652' AND
ordcty = 'NA' AND
ordst = 'MS' AND
d.cucnty = 'USA'
ORDER BY orapdt, oraptm, orodr#
Field definitions:
orapdt 7 0
oraptm 4a
orodr# 7a
c.ccctls 6a
orbill 6a
b.cuslmn 2a
b.cusvrp 3a
orocty 4a
orost 2a
o.cubzip 5a
o.cucnty 3a
ordcty 4a
ordst 2a
d.cubzip 5a
d.cucnty 3a
c.cccode 6a
b.cucode 6a
o.cucode 6a
d.cucode 6a
I see the following errors in my job log:
Field HVR0001 and value 1 not compatible. Reason 7.
Conversion error on host variable or parameter *N.
When I prompt for additional message information I'm told:
The attributes of variable field HVR0001 in query record format FORMAT0001 are not compatible with the attributes of value number 1. The value is *N. The reason code is 7.
7 -- Value contains numeric data that is not valid
and
Host variable or parameter *N or entry 1 in a descriptor area contains a value that cannot be converted to the attributes required by the statement. Error type 6 occurred.
6 -- Numeric data that is not valid.
These errors are triggered by opening the cursor:
...
exec sql PREPARE S1 FROM :sql_stmt;
exec sql DECLARE C1 SCROLL CURSOR FOR S1;
exec sql OPEN C1;
...
I also have QSQSVCDMP files in my outq filled with dump information. The only useful thing I see in there is a reference to CPF4278 and CPD4374
CPF4278 means Query definition template &1 not valid.
CPD4374 means Field &1 and value &3 not compatible. Reason &5.
Unfortunately the error message itself isn't there, only the strings "CPF4278" and "CPD4374".
In the program I monitor for SQL error codes and they are all the same:
SQLSTATE: 22023
SQLCODE: -302
SQLERRMC: <non-displayable character>*N
The error state/code means "A parameter or variable value is invalid."
What I've tried...
After much Googling I've tried:
removing the ORDER BY clause (on OPEN, data is fetched and
ordered when there is an ORDER BY clause)
changing all LEFT JOIN's to INNER JOIN's (did this to make sure there were no NULL's
in the result records from the right side)
adding " AND orapdt IS NOT NULL" to the WHERE clause
many more things that I've forgotten
What I'm asking...
How do I find out which field has bad data in it? I know that HVR0001 is invalid but which field is represented by HVR0001? I tried SELECTing fields in a different order but it's always HVR0001 that has an invalid value.
Ideally I'd like to be able to print out all HVR* fields/values so I can inspect them.
When I look at the compile listing there are no HVR* fields listed. There are some SQL_* fields listed and I can see that SQL_00011 is used to temporarily hold data that gets put into orapdt. SQL_00011 is defined exactly like orapdt (7,0 packed). That's the only numeric field in my query...
I feel like my problem is being caused by how the files are being joined, that somehow an invalid value (probably NULL) is being placed into my orapdt field.
I also think my problem has something to do with executing many of these queries one after the other (some of the WHERE specifics change for each query) because I can take one of the queries that fail and put it into it's own program and run it and it works fine.
This is on DB2 for i (V6R1) and all files involved were created using DDS
Edit:
Here is the host variable (data structure) and the two external data structures needed for the LIKE statements:
d eds_custmast e ds extname('CUSTMAST') inz
d eds_order e ds extname('ORDER') inz
d o ds
d orapdt like(ORAPDT)
d oraptm like(ORAPTM)
d orodr# like(ORODR#)
d orctls like(CUCODE)
d orbill like(ORBILL)
d orslmn like(CUSLMN)
d orcsr like(CUSVRP)
d orocty like(OROCTY)
d orost like(OROST)
d orozip like(CUBZIP)
d orocntry like(CUCNTY)
d ordcty like(ORDCTY)
d ordst like(ORDST)
d ordzip like(CUBZIP)
d ordcntry like(CUCNTY)
// Define an array to indicate nulls...
d o1nv s 3i 0 dim(15)
And here's the fetch statement that actually gets the data:
dow sqlcode = *zeros;
exec sql FETCH NEXT FROM C1 INTO :o :o1nv;
if sqlcode = *zeros;
// process the data.
endif;
enddo;
exec sql CLOSE C1;
I didn't include this before simply because the error occurs when I'm OPENing the cursor, not FETCHing a row. The OPEN statement shouldn't know anything about the o data structure.
As for what changes in the WHERE clause - all of it is dynamically built (and thus can change) other than:
orstat != 'C' AND orapdt BETWEEN 2012365 AND 2013362
It's not at all easy to find out what the actual error is. I tend to copy statements like these into IBM i Navigator and use Visual Explain to try to get a grasp of what decisions the optimiser is making. Another way to do this is to do a STRDBG and look at the job log. When STRDBG is in effect, the optimiser puts informational messages into the job log. But even then, it cam be tough to puzzle out.
In this case, there's only one numeric column, orapdt. Try the query without that column and see if that's the culprit.
Since ORAPDT is your only numeric column, so the problem must lie there.
The issue is in the way DDS defined files work. The validity of values is not checked when being written into DDS defined files, so it appears you have non-numeric data in ORAPDT on one or more records. SQL does not like this, and throws an error.
SQL (DDL) defined tables validate the values before they are written, thus protecting the integrity of your database better.
To solve your problem, find the offending record(s) and fix them or delete them.
Assuming error comes from orapdt, you could monitor it by creating new varible or replacing null or garbage values with other number e.g. null = 9999999, non-numeric = 8888888
SELECT case when orapdt is null
then 9999999
when TRANSLATE(SUBSTR(orapdt,1,LENGTH(orapdt)-1),' ','0123456789',' ') <>' '
then 8888888
else orapdt
end
, oraptm,
or check thru strsql or run sql script for offending records
SELECT orapdt, oraptm, orodr#,
...
WHERE ( orapdt is null or TRANSLATE(SUBSTR(orapdt,1,LENGTH(orapdt)-1),' ','0123456789',' ') <>' ' ) AND
orstat != 'C' AND
......
What seems to be the issue...
The code I posted in my question is in program A. Program A calls (via CALLP) program B. Nothing out of the ordinary there.
Program A uses embedded SQL declaring a prepared statement called S1 and a scrollable cursor called C1. Program B also happens to declare a prepared statement called S1 and a scrollable cursor called C1.
What appears to be happening is the cursor's are interfering with each other because they have the same name. My belief is the query being executed in program B is fetching data that is valid for itself – but is invalid for the query defined in program A. So when program A scrolls through the results of it's query and calls program B the query executed by program B attempts to put invalid values in fields associated with program A – and this only happens when the cursor names are the same in both programs.
All I did was give the cursors in both programs unique names (PGMA_C1 and PGMB_C1 for instance) and the errors stopped happening. Nothing else changed, just the cursor names. This goes against the information I found here (http://pic.dhe.ibm.com/infocenter/iseries/v6r1m0/index.jsp?topic=/rzala/rzalaccl.htm)
“Scope of a cursor: The scope of cursor-name is the source program in which it is defined; that is, the program submitted to the precompiler. Thus, a cursor can only be referenced by statements that are precompiled with the cursor declaration. For example, a program called from another separately compiled program cannot use a cursor that was opened by the calling program.”
Of course that statement seems to be contradicted by this one:
A cursor can only be referred to in the same instance of the program
in the program stack unless CLOSQLCSR(*ENDJOB), CLOSQLCSR(*ENDSQL), or
CLOSQLCSR(*ENDACTGRP) is specified on the CRTSQLxxx commands.
If CLOSQLCSR(*ENDJOB) is specified, the cursor can be referred to by any instance of the program on the program stack.
If CLOSQLCSR(*ENDSQL) is specified, the cursor can be referred to by any instance of the program on the program stack until the last
SQL program on the program stack ends.
If CLOSQLCSR(*ENDACTGRP) is specified, the cursor can be referred to by all instances of the module in the activation group until the
activation group ends.
But in our case both program A and B have CLOSQLCSR(*ENDMOD) – so the two cursors shouldn't be aware of each other.
Unfortunately I don't have the time to dig into this any deeper. I have confirmed that simply giving each program a unique cursor name solves our problem.
Before I figured out that using unique cursor names would fix our problem I did comprehensive testing of all our data. Every field in every record in every file used by these two programs contains valid data. Based on the error message I was expecting there to be a NULL or some other invalid character somewhere but that wasn't the case.
I appreciate your replies and suggestions, +1 all around :-)

How to express queries in Tuple Relational Calculus?

Problem:
Consider a relation of scheme Building(Street, Number, No.Apartments, Color, Age).
TRC: find the oldest building in Downing Street.
The associated SQL statement would be:
SELECT MAX(Age) AS ‘Oldest building’, Street FROM Building WHERE Street = ‘Downing Street’;
My answer using TRC: (B stands for Building relation)
{V.*|V(B) | V.BAge >=Age ^ V.Bstreet = ‘Downing Street’}
V.* (it returns evry single tuple of Building)
V(B) (it maps variables V to Building’s tuples)
V.BAge >=Age ^ V.Bstreet = ‘Downing Street’(here I set the condition…maybe..)
If this is still relevant: the hint would be to realize that the oldest building is the one such that no other building is older than it.