kdb+ tests to check we're not flipping back and forth - kdb

How do we find out if the sym is flipping back and forth or not?
Once the syms changes to some other sym, it should not go back to previously occurred symbols in history.
Everytime the symbols flips/changes to new symbols, then new symbol should be unique and should not occurred in previous sym.
Things we want here is:
We want to have a Boolean column 1b if the rollover happened
We want to have another Boolean column 1b is rollover happned and the sym has flipped back and forth. (the current sym has occurred in previous or future date syms.)
//To create a sample table
tmp:([]sdate:`date$();sym:`symbol$();name:`symbol$());
{`tmp insert (x;`VXV2;`someName1)} each tdate;
{`tmp insert (x;`VXJ2;`someName2)} each tdate;
{`tmp insert (x;`VXG8;`someName3)} each tdate;
{`tmp insert (x;`VXZ4;`someName4)} each tdate;
//update the dataset to get the scenario
tmp:`sdate`sym xasc tmp;
tmp:update sym:`VXG8 from tmp where sdate=2020.04.11;
tmp:update sym:`VXN6 from tmp where sdate>=2020.04.21;
//the below query is we are interested in to find the rollover and sym flipping back and forth.
select from tmp where name=`someName1
//notes
from 2020-03-30 till 2020-04-10 then sym is VXV2
from 2020-04-11 till 2020-04-11 then sym is VXG8 // the sym was updated to VXG8
from 2020-04-12 till 2020-04-20 then sym is VXV2 // the sym was flipped back to VXV2 from VXG8
from 2020-04-21 till 2021-03-29 then sym is VXN6 // the sym was updated to VXN6

You can use differ on a list to return where it changes. For your second column, you can also use differ and filter out elements that are the first occurrence in the list.
q)list:`A`A`A`A`B`A`A`A`C`C
q)differ list
1000110010b
q)differ[list] and not #[count[list]#0b; list?distinct list; :; 1b]
0000010000b

It depends on exactly how you want the booleans to look - only one true boolean on the flip or all true booleans for every duplicate sym - but this does the former:
update roll:differ sym,dup:differ[sym]&not i=group[sym][;0]sym from select from tmp where name=`someName1
Similar to Cathans answer

Related

Why is Arrayformula returning only the first row

Update: sample sheet provided here: https://docs.google.com/spreadsheets/d/1BapXdaVOUL634SstNJXqYNocsD_EvvtlbJ77vlElmZs/edit?usp=drivesdk. Any help will be appreciated!
Hi fellow nerds.
I'm trying to make the current column (most recent interaction date with client) display the max values (most recent dates) from ContactLog!b:b (dates of all recorded interactions), when the client name in ContactLog!A:A matches to the client name in current row column A.
After many days of trying, I've found several formulas to successfully achieve this result for the current cell only.
=MAXIFS(ContactLog!B:B, ContactLog!A:A, A:A)
=MAX(FILTER(ContactLog!B4:B, ContactLog!A4:A=VLOOKUP(A2, ContactLog!A4:B, 1, FALSE)))
=MAX(QUERY(ContactLog!A4:B, ""SELECT B WHERE A = '""&VLOOKUP(A2, ContactLog!A4:B, 1, FALSE)&""'"", 0))
=IF(COUNTIF(ContactLog!A:A, A2),MAX(FILTER(ContactLog!B:B, ContactLog!A:A = A2)),"")
But none of these seem to work with arrayformula, to spread to the entire column. I'd like this result to apply automatically to the entire column (wherever column A is not blank).
It's displaying the correct max value for the first cell (in which the formula is written), and I could drag the formula down, but not spreading automatically as an array.
I've tried using =match with =filter, but that keeps running into mismatched range row sizes. (I've previously solved that by using filter within a filter, but can't figure that out here).
[I have a similar issue for the nearby columns also, "most recent interaction method", and "reminders & goals". The formula there is:
=INDEX(ContactLog!C:C, MATCH(MAX(IF(ContactLog!A:A=A2, IF(ContactLog!B:B=MAX(IF(ContactLog!A:A=A2, ContactLog!B:B)), ROW(ContactLog!B:B)))), ROW(ContactLog!B:B), 0))
And
=IFERROR(CONCATENATE(JOIN(" • ",FILTER(ContactLog!D:D,ContactLog!A:A=A2, ContactLog!D:D<>"")),IF(INDEX(ContactLog!D:D,MAX(IF(ContactLog!A:A=A2,ROW(ContactLog!D:D))))="","","")),"")
They both work great, but I can't get them to work with arrayformula...]
What am I missing?
You can do something like this with BYROW, that allows you to expand your formula through the column and be calculated "row by row". Using your first option:
=BYROW(A:A, LAMBDA (each,IF(each="","",MAXIFS(ContactLog!B:B, ContactLog!A:A, each))))

Should I be able to assign in nested dictionaries in KDB?

According to the docs, the assignment of `lt below should have upsert mechanics:
s:()!()
s[`MSFT]:(`state`sym)!(`init`MSFT)
| state sym
----| ----------
MSFT| init MSFT
s[`MSFT][`lt]: 3
'assign
[0] s[`MSFT][`lt]: 3
^
But instead I get an error.
Wham I doing wrong?
This goes back to the same problem you had before with typed dictionaries (dictionaries whose values are all the same type so kdb tries to keep it that way) - this time it's happening twice at two depths!
If you define:
s:()!()
s[`MSFT]:(`state`sym)!(`init`MSFT)
then kdb assumes the shape of the "values" based on your first insert to the dictionary. In this case, kdb enforces that any value in that dictionary (even the one for MSFT) is a dictionary with keys state and sym. That means you can't force a new shape on it by adding a third key (at least not in the way you're attempting to).
On top of that - the sub-dictionary that you've created is itself a typed dictionary whose values are all symbol so kdb will force it to stay symbol values (aka you can't suddenly make "3" a value).
The final issue here is the one Matthew pointed out - you can't assign using double brackets [][] you can only assign with one (and use depth if necessary).
Putting all of this together:
/define s to allow generic datatype
q)s:(1#`)!enlist[::]
/also don't allow the inner dictionary to be typed
q)s[`MSFT]:(``state`sym)!(::;`init;`MSFT)
/now you can assign
q)s[`MSFT;`lt]:3
q)s
| ::
MSFT| ``state`sym`lt!(::;`init;`MSFT;3)
A keyed table is a map from a table to a table, so what you're indexing in to s with needs to itself be a table. So
(enlist `) ! enlist `MSFT
Second, if you are starting with an empty keyed table, you need to enlist the key and value.
q)s: () ! ()
q)s[enlist (enlist `) ! enlist `MSFT]: enlist (`state`sym) ! `init`MSFT
q)s
| state sym
----| ----------
MSFT| init MSFT
When your table is no longer empty, you don't need to enlist the key and value.
q)s[(enlist `) ! enlist `GOOG]: (`state`sym) ! `init`GOOG
q)s
| state sym
----| ----------
MSFT| init MSFT
GOOG| init GOOG

JCL SORT - Need to add fields from different rows of input to a single output row

I have a 6 row input file which consists of a field(Position 1 to 6) that contains a different value on every line. Based on the different values contained in this field the other fields (From position 7 -80) will be moved to to a single row in output.
E.G.
Input:
035MI 88122
035ST 72261
035SU 317786762
105 06616858
1601 11
1651 0000000140006PC
Output:
1 8812272261317786762 06616858 11 0000000140006PC
I need to find out how to read these all in as different rows and then output to a single row. I've tried using something similar to the code to this:
SORT FIELDS=COPY
INREC IFTHEN=(WHEN=(1,6,CH,EQ,C'035MI '),
OVERLAY=(3:7,5)),
But this will move the data onto the correct position on seperate rows like this:
1 8812272261317786762
1 06616858
1 11
1 0000000140006PC
So now I think I need to do a sort in one step and a merge in another step. I would prefer to do it in one if it's possible however. I'd appreciate any help on this. Thanks.
You add a sequence-number to each record.
The use WHEN=GROUP to copy data from one record to one or more subsequent records.
You use OUTFIL INLUDE= to just pick up the final record.
OPTION COPY
INREC IFTHEN=(WHEN=INIT,
OVERLAY=(81:SEQNUM,1,ZD)),
IFTHEN=(WHEN=GROUP,
BEGIN=(81,1,CH,EQ,C'1'),
PUSH=(somestuff),
RECORDS=6),
IFTHEN=(WHEN=GROUP,
BEGIN=(81,1,CH,EQ,C'2'),
PUSH=(somestuff),
RECORDS=5),
IFTHEN=(WHEN=GROUP,
BEGIN=(81,1,CH,EQ,C'3'),
PUSH=(somestuff),
RECORDS=4),
IFTHEN=(WHEN=GROUP,
BEGIN=(81,1,CH,EQ,C'4'),
PUSH=(somestuff),
RECORDS=3),
IFTHEN=(WHEN=GROUP,
BEGIN=(81,1,CH,EQ,C'5'),
PUSH=(somestuff),
RECORDS=2),
OUTFIL INCLUDE=(81,1,CH,EQ,C'6'),
BUILD=(1,80)
You need to do a bit of planning. The sixth record will contain all the data, but perhaps not yet in the order that you want. Either with an IFTHEN=(WHEN=(logical expression) (to identify the sixth record) on the INREC or with the BUILD on the OUTREC, you can do your final formatting.
You need to change the something each time, it will be receivingposition:sourceposition,length
The DFSORT manuals are very good, there is a Getting Started for those new to the product, and everything you'll ever need is in the Application Programming Guide,

SQL -302 on OPEN cursor in SQLRPGLE

The problem
I've got a SQLRPGLE program that executes queries that look like this:
SELECT orapdt, oraptm, orodr#, c.ccctls, orbill, b.cuslmn, b.cusvrp, orocty, orost, o.cubzip, o.cucnty, ordcty, ordst, d.cubzip, d.cucnty
FROM order
LEFT JOIN cmtctlf c ON orbill = c.cccode
LEFT JOIN custmast b ON orbill = b.cucode
LEFT JOIN custmast o ON orldat = o.cucode
LEFT JOIN custmast d ON orcons = d.cucode
WHERE
orstat != 'C' AND
orbill IN ('ABCDE', 'VWXYZ', 'JKFRTE') AND
orapdt BETWEEN 2012365 AND 2013362 AND
o.cucnty = 'USA' AND
(o.cubzip LIKE '760%' OR o.cubzip LIKE '761%' OR o.cubzip LIKE '762%') AND
d.cubzip = '38652' AND
ordcty = 'NA' AND
ordst = 'MS' AND
d.cucnty = 'USA'
ORDER BY orapdt, oraptm, orodr#
Field definitions:
orapdt 7 0
oraptm 4a
orodr# 7a
c.ccctls 6a
orbill 6a
b.cuslmn 2a
b.cusvrp 3a
orocty 4a
orost 2a
o.cubzip 5a
o.cucnty 3a
ordcty 4a
ordst 2a
d.cubzip 5a
d.cucnty 3a
c.cccode 6a
b.cucode 6a
o.cucode 6a
d.cucode 6a
I see the following errors in my job log:
Field HVR0001 and value 1 not compatible. Reason 7.
Conversion error on host variable or parameter *N.
When I prompt for additional message information I'm told:
The attributes of variable field HVR0001 in query record format FORMAT0001 are not compatible with the attributes of value number 1. The value is *N. The reason code is 7.
7 -- Value contains numeric data that is not valid
and
Host variable or parameter *N or entry 1 in a descriptor area contains a value that cannot be converted to the attributes required by the statement. Error type 6 occurred.
6 -- Numeric data that is not valid.
These errors are triggered by opening the cursor:
...
exec sql PREPARE S1 FROM :sql_stmt;
exec sql DECLARE C1 SCROLL CURSOR FOR S1;
exec sql OPEN C1;
...
I also have QSQSVCDMP files in my outq filled with dump information. The only useful thing I see in there is a reference to CPF4278 and CPD4374
CPF4278 means Query definition template &1 not valid.
CPD4374 means Field &1 and value &3 not compatible. Reason &5.
Unfortunately the error message itself isn't there, only the strings "CPF4278" and "CPD4374".
In the program I monitor for SQL error codes and they are all the same:
SQLSTATE: 22023
SQLCODE: -302
SQLERRMC: <non-displayable character>*N
The error state/code means "A parameter or variable value is invalid."
What I've tried...
After much Googling I've tried:
removing the ORDER BY clause (on OPEN, data is fetched and
ordered when there is an ORDER BY clause)
changing all LEFT JOIN's to INNER JOIN's (did this to make sure there were no NULL's
in the result records from the right side)
adding " AND orapdt IS NOT NULL" to the WHERE clause
many more things that I've forgotten
What I'm asking...
How do I find out which field has bad data in it? I know that HVR0001 is invalid but which field is represented by HVR0001? I tried SELECTing fields in a different order but it's always HVR0001 that has an invalid value.
Ideally I'd like to be able to print out all HVR* fields/values so I can inspect them.
When I look at the compile listing there are no HVR* fields listed. There are some SQL_* fields listed and I can see that SQL_00011 is used to temporarily hold data that gets put into orapdt. SQL_00011 is defined exactly like orapdt (7,0 packed). That's the only numeric field in my query...
I feel like my problem is being caused by how the files are being joined, that somehow an invalid value (probably NULL) is being placed into my orapdt field.
I also think my problem has something to do with executing many of these queries one after the other (some of the WHERE specifics change for each query) because I can take one of the queries that fail and put it into it's own program and run it and it works fine.
This is on DB2 for i (V6R1) and all files involved were created using DDS
Edit:
Here is the host variable (data structure) and the two external data structures needed for the LIKE statements:
d eds_custmast e ds extname('CUSTMAST') inz
d eds_order e ds extname('ORDER') inz
d o ds
d orapdt like(ORAPDT)
d oraptm like(ORAPTM)
d orodr# like(ORODR#)
d orctls like(CUCODE)
d orbill like(ORBILL)
d orslmn like(CUSLMN)
d orcsr like(CUSVRP)
d orocty like(OROCTY)
d orost like(OROST)
d orozip like(CUBZIP)
d orocntry like(CUCNTY)
d ordcty like(ORDCTY)
d ordst like(ORDST)
d ordzip like(CUBZIP)
d ordcntry like(CUCNTY)
// Define an array to indicate nulls...
d o1nv s 3i 0 dim(15)
And here's the fetch statement that actually gets the data:
dow sqlcode = *zeros;
exec sql FETCH NEXT FROM C1 INTO :o :o1nv;
if sqlcode = *zeros;
// process the data.
endif;
enddo;
exec sql CLOSE C1;
I didn't include this before simply because the error occurs when I'm OPENing the cursor, not FETCHing a row. The OPEN statement shouldn't know anything about the o data structure.
As for what changes in the WHERE clause - all of it is dynamically built (and thus can change) other than:
orstat != 'C' AND orapdt BETWEEN 2012365 AND 2013362
It's not at all easy to find out what the actual error is. I tend to copy statements like these into IBM i Navigator and use Visual Explain to try to get a grasp of what decisions the optimiser is making. Another way to do this is to do a STRDBG and look at the job log. When STRDBG is in effect, the optimiser puts informational messages into the job log. But even then, it cam be tough to puzzle out.
In this case, there's only one numeric column, orapdt. Try the query without that column and see if that's the culprit.
Since ORAPDT is your only numeric column, so the problem must lie there.
The issue is in the way DDS defined files work. The validity of values is not checked when being written into DDS defined files, so it appears you have non-numeric data in ORAPDT on one or more records. SQL does not like this, and throws an error.
SQL (DDL) defined tables validate the values before they are written, thus protecting the integrity of your database better.
To solve your problem, find the offending record(s) and fix them or delete them.
Assuming error comes from orapdt, you could monitor it by creating new varible or replacing null or garbage values with other number e.g. null = 9999999, non-numeric = 8888888
SELECT case when orapdt is null
then 9999999
when TRANSLATE(SUBSTR(orapdt,1,LENGTH(orapdt)-1),' ','0123456789',' ') <>' '
then 8888888
else orapdt
end
, oraptm,
or check thru strsql or run sql script for offending records
SELECT orapdt, oraptm, orodr#,
...
WHERE ( orapdt is null or TRANSLATE(SUBSTR(orapdt,1,LENGTH(orapdt)-1),' ','0123456789',' ') <>' ' ) AND
orstat != 'C' AND
......
What seems to be the issue...
The code I posted in my question is in program A. Program A calls (via CALLP) program B. Nothing out of the ordinary there.
Program A uses embedded SQL declaring a prepared statement called S1 and a scrollable cursor called C1. Program B also happens to declare a prepared statement called S1 and a scrollable cursor called C1.
What appears to be happening is the cursor's are interfering with each other because they have the same name. My belief is the query being executed in program B is fetching data that is valid for itself – but is invalid for the query defined in program A. So when program A scrolls through the results of it's query and calls program B the query executed by program B attempts to put invalid values in fields associated with program A – and this only happens when the cursor names are the same in both programs.
All I did was give the cursors in both programs unique names (PGMA_C1 and PGMB_C1 for instance) and the errors stopped happening. Nothing else changed, just the cursor names. This goes against the information I found here (http://pic.dhe.ibm.com/infocenter/iseries/v6r1m0/index.jsp?topic=/rzala/rzalaccl.htm)
“Scope of a cursor: The scope of cursor-name is the source program in which it is defined; that is, the program submitted to the precompiler. Thus, a cursor can only be referenced by statements that are precompiled with the cursor declaration. For example, a program called from another separately compiled program cannot use a cursor that was opened by the calling program.”
Of course that statement seems to be contradicted by this one:
A cursor can only be referred to in the same instance of the program
in the program stack unless CLOSQLCSR(*ENDJOB), CLOSQLCSR(*ENDSQL), or
CLOSQLCSR(*ENDACTGRP) is specified on the CRTSQLxxx commands.
If CLOSQLCSR(*ENDJOB) is specified, the cursor can be referred to by any instance of the program on the program stack.
If CLOSQLCSR(*ENDSQL) is specified, the cursor can be referred to by any instance of the program on the program stack until the last
SQL program on the program stack ends.
If CLOSQLCSR(*ENDACTGRP) is specified, the cursor can be referred to by all instances of the module in the activation group until the
activation group ends.
But in our case both program A and B have CLOSQLCSR(*ENDMOD) – so the two cursors shouldn't be aware of each other.
Unfortunately I don't have the time to dig into this any deeper. I have confirmed that simply giving each program a unique cursor name solves our problem.
Before I figured out that using unique cursor names would fix our problem I did comprehensive testing of all our data. Every field in every record in every file used by these two programs contains valid data. Based on the error message I was expecting there to be a NULL or some other invalid character somewhere but that wasn't the case.
I appreciate your replies and suggestions, +1 all around :-)

Display table field through query using form

Just some background information. My table, (HireHistory) has 50 columns in it (horizontal). I have a Form (HireHistoryForm) which has a 2 text boxes (HistoryMovieID and HistoryCustomerID) and a button (the button runs the query 'HireHistoryQuery')
Here's an excerpt of my data (the CustomerID's are along the top):
So what I need is so that if a number is entered into the HistoryCustomerID box, then it displays that column. e.g. if the value entered is '1', then in my query it will show all records from column 1.
If a number is entered into the HistoryMovieID box (e.g. 0001) then it displays all instances of that MovieID for the specific CustomerID's. i.e. In column 1 is the ID's, so for ID=1 it will show "0001 on 19/05/2006" then will go on to find the next instance of '0001' etc.
For the HistoryCustomerID I tried to put this into my 'Field' for the query:
=[Forms]![HireHistoryForm]![HistoryCustomerID]
But it didn't work. My query just returned a column labelled '10' and the rows were just made up of '10'.
If you could help I'd greatly appreciate it. :)
No offense intended (or as little as possible, anyway), but that is a horrible way to structure your data. You really need to restructure it like this:
CustomerID MovieID HireDate
---------- ------- --------
1 0001 19/05/2006
1 0003 20/10/2003
1 0007 13/08/2003
...
2 0035 16/08/2012
2 0057 06/10/2012
...
If you keep your current data structure then
You'll go mad, and
It's extremely unlikely that anyone else will go anywhere near this problem.
Edit
Your revised data structure is a very slight improvement, but it still works against you. Consider that in your other question here you are essentially asking for a way to "fix" your data structure "on the fly" when you do a query.
The good news is that you can run a bit of VBA code once to convert your data structure to something workable. Start by creating your new table, which I'll call "HireHistoryV2"
ID - AutoNumber, Primary Key
CustomerID - Number(Long Integer), Indexed (duplicates OK)
MovieID - Text(4), Indexed (duplicates OK)
HireDate - Date/Time, Indexed (duplicates OK)
The VBA code to copy your data to the new table would look something like this:
Function RestructureHistory()
Dim cdb As DAO.Database, rstIn As DAO.Recordset, rstOut As DAO.Recordset
Dim fld As DAO.Field, a() As String
Set cdb = CurrentDb
Set rstIn = cdb.OpenRecordset("HireHistory", dbOpenTable)
Set rstOut = cdb.OpenRecordset("HireHistoryV2", dbOpenTable)
Do While Not rstIn.EOF
For Each fld In rstIn.Fields
If fld.Name Like "Hire*" Then
If Not IsNull(fld.Value) Then
a = Split(fld.Value, " on ", -1, vbBinaryCompare)
rstOut.AddNew
rstOut!CustomerID = rstIn!CustomerID
rstOut!MovieID = a(0)
rstOut!HireDate = CDate(a(1))
rstOut.Update
End If
End If
Next
Set fld = Nothing
rstIn.MoveNext
Loop
rstOut.Close
Set rstOut = Nothing
rstIn.Close
Set rstIn = Nothing
Set cdb = Nothing
MsgBox "Done!"
End Function
Note: You appear to be using dd/mm/yyyy date formatting, so check the date conversions carefully to make sure that they converted properly.