I want a JCL Sort card to write all the records with column n to n+k in a dataset into a newfile. How to achieve this? - jcl

1234 ABCD
3991 ABCD
3818 ABCD
1939 PQRS
2838 PQRS
1939 ABCD
2819 PQRS
2102 FILQ
2911 ABCD
3912 FILQ
I want to write all the records with ABCD to one file, all the records with PQRS to another, all the records with FILQ to another, and so on. I don't know beforehand what the values in these columns are going to be.

If you want to split into multiple files based on a fixed position with values that you don't know, you are going to need to SORT the data (on the field to split by) and use WHEN=GROUP on OUTREC to have something to be able to INCLUDE on in multiple OUTFILs.
//SYSIN DD *
 SORT FIELDS=(6,4,CH,A)
 OUTREC IFTHEN=(WHEN=GROUP,
                KEYBEGIN=(6,4),
                PUSH=(81:ID=2))
 OUTFIL INCLUDE=(81,2,CH,EQ,C'01'),
        FNAMES=OUT1,
        BUILD=(1,80)
 OUTFIL INCLUDE=(81,2,CH,EQ,C'02'),
        FNAMES=OUT2,
        BUILD=(1,80)
 OUTFIL SAVE,
        FNAMES=OUTA,
        BUILD=(1,80)
//SORTIN DD *
1234 ABCD
3991 ABCD
3818 ABCD
1939 PQRS
2838 PQRS
1939 ABCD
2819 PQRS
2102 FILQ
2911 ABCD
3912 FILQ
Gives:
OUT1
1234 ABCD
1939 ABCD
2911 ABCD
3991 ABCD
3818 ABCD
OUT2
3912 FILQ
2102 FILQ
OUTA
1939 PQRS
2819 PQRS
2838 PQRS
I've used 80-byte fixed-length records for testing. If your record-length is different, change all the references to 81 to your-record-length-plus-one and all references to 80 to your record-length.
If your data is on variable-length records, you should have mentioned it earlier. The code is different.
WHEN=GROUP defines a group, and allows information from the definition of the group to be applied to all records in the group (using PUSH). There are two special fields available, ID (a sequence number for groups) and SEQ (a sequence number within the group). ID=2 means a two-digit sequence number for groups. This allows up to 100 groups before things start going wrong with the code.
There will be 10 OUTFILs (I show three). For the final OUTFIL (I've called it OUTA) I'd suggest using SAVE instead of INCLUDE. SAVE means "all records which aren't on another OUTFIL go here". Even if you get more than 10 groups, at least you'll have all the data (until you exceed 100 groups).
PUSH is similar to OVERLAY, except it cannot use literal values of any type, only the special fields mentioned above and any data from the record which defines the group.
This PUSH will extend the records. To make it a temporary extension, the BUILD in each OUTFIL returns each record to its original size.
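To make the routing logic easier to follow, here is a minimal Python sketch of the same idea. It is only an illustration, not a replacement for the SORT step: the function name split_by_key and the max_named parameter are made up for this example, and the stable sort keeps equal keys in input order, which the SORT above does not promise.

```python
from itertools import groupby


def split_by_key(records, start, length, max_named=2):
    """Route records to OUT1..OUTn by key value; later groups go to OUTA.

    start is a 1-based column, as on the SORT statement. max_named is a
    hypothetical knob standing in for the number of INCLUDE OUTFILs.
    """
    def key(rec):
        return rec[start - 1:start - 1 + length]

    outputs = {}
    # sorted() is stable, so records with equal keys keep their input order.
    ordered = sorted(records, key=key)
    for group_id, (_, group) in enumerate(groupby(ordered, key=key), start=1):
        # group_id plays the role of the ID field pushed by WHEN=GROUP.
        name = f"OUT{group_id}" if group_id <= max_named else "OUTA"
        # OUTA acts like OUTFIL SAVE: everything not claimed elsewhere.
        outputs.setdefault(name, []).extend(group)
    return outputs


records = [
    "1234 ABCD", "3991 ABCD", "3818 ABCD", "1939 PQRS", "2838 PQRS",
    "1939 ABCD", "2819 PQRS", "2102 FILQ", "2911 ABCD", "3912 FILQ",
]
result = split_by_key(records, start=6, length=4)
for name, recs in result.items():
    print(name, recs)
```

As in the SORT version, the third group (PQRS) lands in the catch-all output because only two named outputs are defined.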

Assuming that values you need to sort on will not change on the fly, you could use something like this:
//STEP1 EXEC PGM=ICEMAN
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=YOUR.INPUT.FILE,DISP=OLD
//OUT1 DD DSN=OUTPUT.FILE.ONE,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//OUT2 DD DSN=OUTPUT.FILE.TWO,DISP=(NEW,CATLG),
// SPACE=(CYL,(5,5)),UNIT=SYSDA
//SYSIN DD *
 OPTION COPY
 OUTFIL INCLUDE=(6,4,CH,EQ,C'ABCD'),FNAMES=OUT1
 OUTFIL INCLUDE=(6,4,CH,EQ,C'PQRS'),FNAMES=OUT2
/*
You can repeat the OUTFIL statement (with a matching DD statement) as many times as you need.
Basically, you just need to search that specific location (in this case position 6 for a length of 4) for the literal, then specify the output dataset.
Obviously your space parameters will probably be different from what I used, but that should be more than enough to get you going!
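For comparison, the known-values approach needs no sort at all. A rough Python sketch of the same routing, where route_by_literal and the routes mapping are names invented for this illustration:

```python
def route_by_literal(records, start, length, routes):
    """Copy each record to the output whose literal matches the field.

    start is a 1-based column. routes maps a literal (e.g. 'ABCD') to an
    output name (e.g. 'OUT1'); both are assumptions for this sketch.
    """
    outputs = {name: [] for name in routes.values()}
    for rec in records:
        field = rec[start - 1:start - 1 + length]
        name = routes.get(field)
        if name is not None:
            # Records matching no literal are written nowhere, just as
            # with INCLUDE-only OUTFILs.
            outputs[name].append(rec)
    return outputs


records = [
    "1234 ABCD", "3991 ABCD", "3818 ABCD", "1939 PQRS", "2838 PQRS",
    "1939 ABCD", "2819 PQRS", "2102 FILQ", "2911 ABCD", "3912 FILQ",
]
routed = route_by_literal(records, 6, 4, {"ABCD": "OUT1", "PQRS": "OUT2"})
print(routed)
```

Note that the FILQ records are dropped here, because no output was defined for them.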


SAS - how can I read in date data?

I am trying to read in some data in date format and the solution is eluding me. Here are four of my tries using the simplest self-contained examples I could devise. (And the site is making me boost my text-to-code ratio in order for this to post, so please ignore this sentence).
*EDIT - my example was too simplistic. I have spaces in my variables, so I do need to specify positions (the original answer said to ignore positions entirely). The solution below works, but the date variable is not a date.
data clinical;
input
name $ 1-13
visit_date $ 14-23
group $ 25
;
datalines;
John Turner 03/12/1998 D
Mary Jones 04/15/2008 P
Joe Sims 11/30/2009 J
;
run;
No need to specify the lengths. datalines already assumes space-delimited values. A simple way to specify an informat is to use a : after each input variable.
data clinical;
input ID$ visit_date:mmddyy10. group$;
format visit_date mmddyy10.; * Make the date look human-readable;
datalines;
01 03/12/1998 D
02 04/15/2008 P
03 11/30/2009 J
;
run;
Output:
ID visit_date group
01 03/12/1998 D
02 04/15/2008 P
03 11/30/2009 J
A friend of mine suggested this, but it seems odd to have to switch syntax markedly depending on whether the variable is a date or not.
data clinical; 
input
name $ 1-12
@13 visit_date MMDDYY10.
group $ 25 ;
datalines;
John Turner 03/12/1998 D
Mary Jones  04/15/2008 P
Joe Sims    11/30/2009 J
;
run;
SAS provides a lot of different ways to input data, just depending on what you want to do.
Column input, which is what you start with, is appropriate when this is true:
To read with column input, data values must have these attributes:
appear in the same columns in all the input data records
consist of standard numeric form or character form
Your data does not meet this in the visit_date column. So, you need to use something else.
Formatted input is appropriate to use when you want these features:
With formatted input, an informat follows a variable name and defines how SAS reads the values of this variable. An informat gives the data type and the field width of an input value. Informats also read data that is stored in nonstandard form, such as packed decimal, or numbers that contain special characters such as commas.
Your visit_date column matches this requirement, as you have a specific informat (mmddyy10.) you would like to use to read in the data into date format.
List input would also work, especially in modified list format, in some cases, though in your example of course it wouldn't due to the spaces in the name. Here's when you might want to use it:
List input requires that you specify the variable names in the INPUT statement in the same order that the fields appear in the input data records. SAS scans the data line to locate the next value but ignores additional intervening blanks. List input does not require that the data is located in specific columns. However, you must separate each value from the next by at least one blank unless the delimiter between values is changed. By default, the delimiter for data values is one blank space or the end of the input record. List input does not skip over any data values to read subsequent values, but it can ignore all values after a given point in the data record. However, pointer controls enable you to change the order that the data values are read.
(For completeness, there is also Named input, though that's more rare to see, and not helpful here.)
You can mix Column and Formatted inputs, but you don't want to mix List input as it doesn't have the same concept of pointer control exactly so it can be easy to end up with something you don't want. In general, you should use the input type that's appropriate to your data - use Column input if your data is all text/regular numerics, use formatted input if you have particular formats for your data.
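As a language-neutral illustration of the difference, here is a small Python sketch that reads the name and group by fixed column positions (like column input) and the date through an explicit date pattern (like an informat). The layout is an assumption taken from the spacing of the sample lines: name in columns 1-12, date in columns 13-22, group in column 24. The function name read_clinical is made up for this example.

```python
from datetime import date, datetime


def read_clinical(line):
    """Parse one fixed-layout line into (name, visit_date, group)."""
    name = line[0:12].rstrip()          # columns 1-12, plain character data
    # Columns 13-22 hold mm/dd/yyyy text, so it needs an explicit pattern,
    # which is the job the mmddyy10. informat does in SAS.
    visit_date = datetime.strptime(line[12:22], "%m/%d/%Y").date()
    group = line[23:24]                 # column 24, plain character data
    return name, visit_date, group


lines = [
    "John Turner 03/12/1998 D",
    "Mary Jones  04/15/2008 P",
    "Joe Sims    11/30/2009 J",
]
rows = [read_clinical(line) for line in lines]
for row in rows:
    print(row)
```

The date comes back as a real date value rather than a string, so it can be compared and formatted later.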

KDB: How to serialize a table for a union join within kdb-tick architecture?

I'm trying to modify the kdb-tick architecture to support a union join on incoming data and the local rdb table.
I have modified the upd function in the tick.q file to the following:
ups:{[t;x]ts"d"$a:.z.P;
if[not -16=type first first x;a:"n"$a;x:$[0>type first x;a,x;(enlist(count first x)#a),x]];
f:key flip value t;pub[t;$[0>type first x;enlist f!x;flip f!x]];if[l;l enlist (`ups;t;x);i+:1];};
With ups:uj subsequently set in the subscriber files.
My question relates to how one might serialize a table row before publishing it within the .u.ups[] function.
I.e. given a table:
second | amount price
-----------|----------------
02:46:01 | 54 9953.5
02:46:02 | 54 9953.5
02:46:03 | 54 9953.5
02:46:04 | 150 9953.5
02:46:05 | 150 9954.5
How should one serialize the first row 02:46:01 | 54 9953.5 so that it can be sent via the .u.ups function to subscribers, where uj will be run between the row and the local table?
Thanks in advance for your advice.
Some of this might help:
You can't set ups:uj in the subscribers because the table name is being passed as a symbol so the subscriber will effectively try to do
uj[`tab1;tab2]
which won't work because uj doesn't accept table names (symbols) as input. You would have to instead set ups to
ups:{x set value[x] uj y}
A standard tickerplant is not designed to handle variable/changing schema - for good reason, it's generally not a good idea to have a schema that changes intraday. However your situation might warrant it so in that case you'd need to modify your .u.ups function to something like
\d .u
ups:{[t;x]ts"d"$a:.z.P;
x:`time xcols update time:"n"$a from x;
pub[t;$[98h=type x;x;1=count last x;enlist x;flip x]];if[l;l enlist (`ups;t;x);i+:1];};
\d .
and your feeder process would have to send kdb tables or kdb dictionaries to the .u.ups function. Since a feedhandler process is usually not a kdb process, it may or may not be possible to send tables/dictionaries to the tickerplant as normally the feedhandler would send lists (without column metadata). In your case you need to somehow supply the column metadata to the tickerplant on each update (or maybe you're doing that already?), as otherwise it won't know which columns are which.
In other words your feeder process could send either of the following:
(`.u.upd;`tab;([]col1:`a`b`c;col2:1 2 3))
(`.u.upd;`tab;`col1`col2!(`a;1))
(`.u.upd;`tab;`col1`col2!(`a`b;1 2))
I'm going to assume this is related to your previous few questions about disparate schemas. I'd like to suggest an alternative solution, which is only truly viable if you are using kdb version 3.6, which uses anymap. If you can narrow your schemas down to a minimal list of common columns, all other columns can be placed as dictionaries into a general column.
q)tab:([]sym:`$();col1:`float$();colGeneral:(::))
q)`tab upsert (`AAPL;3.454;(`colX`colY`colZ!(1;2.3;"abc")))
`tab
q)`tab upsert (`MSFT;3.0;(`colX`colY!(2;100.0)))
`tab
q)`tab upsert (`AMZN;100.0;((enlist `colX)!(enlist 10)))
`tab
q)tab
sym col1 colGeneral
----------------------------------------
AAPL 3.454 `colX`colY`colZ!(1;2.3;"abc")
MSFT 3 `colX`colY!(2;100f)
AMZN 100 (,`colX)!,10
q)select colGeneral from tab
colGeneral
-----------------------------
`colX`colY`colZ!(1;2.3;"abc")
`colX`colY!(2;100f)
(,`colX)!,10
q)select sym, colGeneral #\: `colX from tab
sym x
-------
AAPL 1
MSFT 2
AMZN 10
q)select sym, colGeneral #\: `colY from tab
sym x
---------
AAPL 2.3
MSFT 100f
AMZN 0N
With 3.6 you can be saving this to disk in any splayed format (splayed, partitioned, segmented) and still easily query the data. The storage of such a table will likely be sub-optimal due to poor compression characteristics of the general column (assuming you wish to compress data), but it will be perfectly functional.
Integrating uj into the standard ingestion procedure with each update will be computationally expensive. Using a general column and dictionary method will massively improve your ingestion speed. Below I've given a demonstration using the example given in a previous answer to a related question of yours
q)table:()
q)row1:enlist `x`y`colX!(`AMZN;100.0;10)
q)table:table uj row1
q)\ts:100000 table:table uj row1
13828 6292352
q)\ts:100000 `tab upsert (`AMZN;100.0;((enlist `colX)!(enlist 10)))
117 12746880
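The general-column idea is not specific to q. A minimal Python sketch of the same layout, assuming sym and col1 are the common columns and everything else is folded into a dictionary (to_row, project and COMMON are names invented for this illustration):

```python
COMMON = ("sym", "col1")  # assumed minimal set of shared columns


def to_row(update):
    """Split an incoming update into common columns plus a general dict."""
    row = {col: update.get(col) for col in COMMON}
    row["colGeneral"] = {k: v for k, v in update.items() if k not in COMMON}
    return row


def project(table, key):
    """Pull one optional column out of the general dict; None if absent."""
    return [(row["sym"], row["colGeneral"].get(key)) for row in table]


updates = [
    {"sym": "AAPL", "col1": 3.454, "colX": 1, "colY": 2.3, "colZ": "abc"},
    {"sym": "MSFT", "col1": 3.0, "colX": 2, "colY": 100.0},
    {"sym": "AMZN", "col1": 100.0, "colX": 10},
]
tab = [to_row(u) for u in updates]  # appending never changes the schema
print(project(tab, "colX"))
print(project(tab, "colY"))
```

Appending a row never has to reconcile schemas, which is the reason the upsert above is so much cheaper than a uj per update.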

How to delete a selected number of records in a big file using JCL sort cards

I want to delete 501 records, based on a 5-character activity code, from a test file with 38,792 records.
As there are 501 records, I can't write an OMIT condition.
I need to use a sort JOIN card, but my problem is that this 5-character activity code starts in the 46th column for some records and the 47th column for others.
So what can I do?
The question is unclear with many details missing, but here's something which may help another searcher:
//SYSIN DD *
 JOINKEYS F1=INA,FIELDS=(1,5,A),SORTED,NOSEQCK
 JOINKEYS F2=INB,FIELDS=(1,5,A)
 JOIN UNPAIRED,F1
 REFORMAT FIELDS=(F1:1,80,?)
 OPTION COPY
 INREC IFTHEN=(WHEN=(81,1,CH,EQ,C'B'),
               OVERLAY=(82:SEQNUM,9,ZD))
 OUTFIL OMIT=(82,9,CH,LE,C'000000501',
              AND,
              81,1,CH,EQ,C'B')
//JNF2CNTL DD *
 INREC IFTHEN=(WHEN=(1,1,CH,EQ,C'0'),
               BUILD=(3,5)),
       IFTHEN=(WHEN=NONE,
               BUILD=(2,5))
//INA DD *
11111 IN
22222 KEEP UNMATCHED
33333 OUT
66666 IN
66667 KEEP UNMATCHED
66668 KEEP UNMATCHED
77777 OUT
88888 SHAKE IT ALL ABOUT
//INB DD *
0X11111
0X66666
0X88888
133333
799999
877777
This is using two input files, INA and INB.
INA is already in sequence (so specify SORTED,NOSEQCK on the JOINKEYS for it), and is fixed-length 80-byte records.
INB is not already in sequence, because it is a mixture of different files; all of them are fixed-length 80-byte records.
In JNF2CNTL, only the key from the second file is extracted, as no other data is required from that file. The key is sourced from different places depending on the record-type. The file will be sorted automatically (with OPTION EQUALS set) before the JOIN itself.
The JOIN is for matches, and unmatched records from F1 (INA).
The ? in the REFORMAT statement is the "match marker" and it will be automatically set to B (both) for a match and 1 for an unmatched record from F1 (in this case, 1 is the only other possible value, because the JOIN statement specifies UNPAIRED,F1).
Of those that match, you want to ignore the first 501. So, set up a sequence number which is only incremented for the matching records.
Then on OUTFIL, OMIT= for those matched records which have a sequence less than or equal to the 501 count.
The output on SORTOUT will be all the records from the INA file, except the first 501 which matched.
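The flow above can be sketched in Python to check the logic against the sample data. This is an illustration only: extract_key and drop_first_matches are names made up for the example, and the limit is set to 2 instead of 501 so the effect is visible on eight records.

```python
def extract_key(rec):
    """Mimic JNF2CNTL: the key position depends on the record type."""
    # Records starting with '0' carry the key in columns 3-7, others in 2-6.
    return rec[2:7] if rec[0] == "0" else rec[1:6]


def drop_first_matches(ina, inb, limit):
    """Keep all INA records except the first `limit` that match an INB key."""
    keys = {extract_key(rec) for rec in inb}
    matched = 0
    kept = []
    for rec in ina:                    # INA is already in key sequence
        if rec[:5] in keys:            # match marker would be 'B'
            matched += 1               # sequence number for matches only
            if matched <= limit:
                continue               # OUTFIL OMIT drops these
        kept.append(rec)               # unmatched, or match beyond the limit
    return kept


ina = [
    "11111 IN", "22222 KEEP UNMATCHED", "33333 OUT", "66666 IN",
    "66667 KEEP UNMATCHED", "66668 KEEP UNMATCHED", "77777 OUT",
    "88888 SHAKE IT ALL ABOUT",
]
inb = ["0X11111", "0X66666", "0X88888", "133333", "799999", "877777"]
kept = drop_first_matches(ina, inb, limit=2)
for rec in kept:
    print(rec)
```

With a limit of 2, the first two matching records (keys 11111 and 33333) are dropped and everything else is kept.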

crystal reports group name formula using a record from a different line in table

I am trying to create a group name formula using a record from a different line in the table than the line that the records contained in this group are linked to.
First off, my data has the word group in it, so to avoid confusion I have italicized it to differentiate it from groups in Crystal Reports.
My table that I am pulling the records for this group name from has:
{GroupSection.Group} Groups of items in our inventory database
{GroupSection.Section} Sections (these are like subgroups nested within each inventory group).
{GroupSection.Description} The problems start here because the descriptions for {GroupSection.Group} and {GroupSection.Section} are both stored here.
This is a sample of my table:
{GroupSection.Group} {GroupSection.Section} {GroupSection.Description}
3.00 0.00 PRECAST CONCRETE PRODUCT
3.00 50.00 MISC PRECAST CONCRETE PRODUCT
3.00 99.00 *Z* MISC PRECAST CONCRETE PRODUC
4.00 0.00 CEMENT SUPPLIES
4.00 50.00 MISC CEMENT SUPPLIES
4.00 99.00 *Z* MISC CEMENT SUPPLIES
The first and fourth line in this table are descriptions for {GroupSection.Group} (they have a 0.00 in the {GroupSection.Section} line) and the rest are descriptions for {GroupSection.Section}. The actual data that is contained in this report is in a different table and has the same two fields as the first two in this table, but not the third field, hence the need to link to and use this table to make the descriptions of the group names. The other table has no records that link to the lines with 0.00.
I want my Group Tree to look like this:
3. PRECAST CONCRETE PRODUCT
50 MISC PRECAST CONCRETE
99 *Z* MISC PRECAST CONCRETE
4. CEMENT SUPPLIES
50 MISC CEMENT SUPPLIES
99 *Z* MISC CEMENT SUPPLIES
This is the flawed formula I am using now in the Group Name formula for the top group:
ToText (left(Cstr({GroupSection.Group}),2))+ " " + ToText (If {GroupSection.Section} <> 0 then {GroupSection.Description} Else " ")
This is the Group Name formula for the inner nested group (working fine):
ToText (left(Cstr({GroupSection.Section}),2))+ " " + ToText ({GroupSection.Description})
This is what my Group Tree looks like now:
3. MISC PRECAST CONCRETE
50 MISC PRECAST CONCRETE
99 *Z* MISC PRECAST CONCRETE
4. MISC CEMENT SUPPLIES
50 MISC CEMENT SUPPLIES
99 *Z* MISC CEMENT SUPPLIES
As you can see, I need to get the name for the outer group from the first row in the table even though it's linking the records that are in the group to the second and third row in the table. Make sense?
I have tried using previous() in the formula but it gives the following error: This function cannot be used because it must be evaluated later.
Edit: I also just tried adding the same table again and linking it once to the group and once to the section and it worked for the group names, but now I have 1.9 million records, something really duplicated. So that won't work unless I can figure out how to fix the multiple records.
The "." in the group name description is an unrelated problem. It's the best I can do to get 3.00 to display as 3 and 50.00 to display as 50.
Thanks for any help!
Edit Dec 30, 2013 for user @Promethean:
Sorry I am new at SQL Command Tables. I took a command table from another report and did some changes to it. This is how far I got:
SELECT
"GroupSection"."Group",
"GroupSection"."Section",
"GroupSection"."Description"
FROM ("SpruceDotNet"."dbo"."InventoryCommon" "InventoryCommon" with (nolock)
INNER JOIN
"SpruceDotNet"."dbo"."GroupSection" "GroupSection" with (nolock)
ON "InventoryCommon"."Group"="GroupSection"."Group")
WHERE
"GroupSection"."Section"=0
ORDER BY
"InventoryCommon"."Group"
Is there any chance you could use this info to modify your command table so a dummy like me can follow it? It looks like you are adding one of the fields twice, but I don't follow everything.
And then how would you join the command table to the bigger table that contains the more detailed data? Would I have to use the right kind of join so I don't end up duplicating the records?
Any help is greatly appreciated.
This may be very straightforward if you do this with SQL in a command table:
select a.Group, a.Section, a.Description,
c.GroupName
from GroupSection a
left outer join
(
select concat( format(b.Group, 0), '. ', b.Description) as GroupName, b.Group
from GroupSection b
where b.Section = 0
) c
on a.Group = c.Group
The concat may change depending on your database provider. This reflects MySQL. Performing this against a table that reflects your sample data would produce the following:
{GroupSection.Group} | {GroupSection.Section} | {GroupSection.Description}      | {GroupSection.GroupName}
3.00                 | 0.00                   | PRECAST CONCRETE PRODUCT        | 3. PRECAST CONCRETE PRODUCT
3.00                 | 50.00                  | MISC PRECAST CONCRETE PRODUCT   | 3. PRECAST CONCRETE PRODUCT
3.00                 | 99.00                  | *Z* MISC PRECAST CONCRETE PRODUC | 3. PRECAST CONCRETE PRODUCT
4.00                 | 0.00                   | CEMENT SUPPLIES                 | 4. CEMENT SUPPLIES
4.00                 | 50.00                  | MISC CEMENT SUPPLIES            | 4. CEMENT SUPPLIES
4.00                 | 99.00                  | *Z* MISC CEMENT SUPPLIES        | 4. CEMENT SUPPLIES
This can easily be grouped in Crystal to produce your desired results.
BEGIN EDITS:
Check this out. It's a working SQLfiddle using your table structure. Only GroupSection is necessary to accomplish this part, so I ignored the other. You can play around with different queries to see what happens, but as-is it will pass to Crystal the necessary derived field to group on. You'll still need to group on the new field, GroupName, in Crystal.
As for the other tables, you can treat the command table the same way you would any other table. Just make sure you add the key/linking field to the command if it's not there in your mockup.
Here's the fiddle: http://sqlfiddle.com/#!2/aa88d/5/0
and just in case the comments there make it too confusing to see the code, here it is with my annotations stripped out:
SELECT gs2.GroupName,
gs.sGroup, gs.Section, gs.Description
FROM GroupSection gs
left outer join
(
SELECT concat( format(inn.sGroup, 0), '. ', inn.Description) as GroupName, inn.sGroup
FROM GroupSection inn
WHERE inn.Section = 0
) gs2
ON gs.sGroup = gs2.sGroup
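If you want to try the self-join without a MySQL server, here is a small sketch using Python's built-in sqlite3 module. It is an approximation under stated assumptions: the table is recreated in memory from the sample data, and because SQLite has no MySQL-style format(), the group number is produced with CAST and the || concatenation operator instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GroupSection (sGroup REAL, Section REAL, Description TEXT)"
)
conn.executemany(
    "INSERT INTO GroupSection VALUES (?, ?, ?)",
    [
        (3.0, 0.0, "PRECAST CONCRETE PRODUCT"),
        (3.0, 50.0, "MISC PRECAST CONCRETE PRODUCT"),
        (3.0, 99.0, "*Z* MISC PRECAST CONCRETE PRODUC"),
        (4.0, 0.0, "CEMENT SUPPLIES"),
        (4.0, 50.0, "MISC CEMENT SUPPLIES"),
        (4.0, 99.0, "*Z* MISC CEMENT SUPPLIES"),
    ],
)

# Join every row to the Section = 0 row of its own group, so each row
# carries the group-level description as GroupName.
rows = conn.execute(
    """
    SELECT gs2.GroupName, gs.sGroup, gs.Section, gs.Description
    FROM GroupSection gs
    LEFT OUTER JOIN (
        SELECT CAST(inn.sGroup AS INTEGER) || '. ' || inn.Description
                   AS GroupName,
               inn.sGroup
        FROM GroupSection inn
        WHERE inn.Section = 0
    ) gs2 ON gs.sGroup = gs2.sGroup
    ORDER BY gs.sGroup, gs.Section
    """
).fetchall()

for row in rows:
    print(row)
```

Every row in a group gets the same GroupName, and the row count stays at six because the subquery returns exactly one row per group.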
This is quite a handful! It seems that your problem is in the database, as you pointed out at the beginning.
If you group by {GroupSection.Group} and put the {GroupSection.Description} in the Group Header next to it, you should get 3. MISC PRECAST CONCRETE. And then if you put {GroupSection.Section} in the Detail Section and {GroupSection.Description} next to it, you should get 50 MISC PRECAST CONCRETE. So at this point I would not use a formula to group by, but use the fields. So only one group ({GroupSection.Group}) and the rest in the Detail Section. Please try that suggestion and let me know how that works.
EDIT:
I don't think it's going to work. Because everything is in the same table you will always get the wrong description unless the first item in the second group happens to be the same item as the one in the first group. Two things come to mind to remedy that:
1. Create a new table and split the descriptions up.
2. Add the table to your report a second time and use one alias for the main group and the second alias for the second group.
CR will give you a message that you already have the table in your report, but you can ignore that. I would give #2 a shot and see how it works. It's definitely less work than creating a brand new table.

Subreports in Crystal Reports

Is it possible to create a single report in Crystal Reports showing 3 lists that get their information from only 1 main table?
This is my main table which I pulled out from a database and used full outer joins to AccountNum1 and AccountNum2, leading to the blank values in some rows:
AccountNum1 ActDate SuspDate AccountNum2 EntryDate Charge
12345 01/01/2001 12/12/2012 12345 01/01/2012 1.00
67890 02/02/2002 11/11/2011 67890 02/02/2012 1.00
<Blank> <Blank> <Blank> 23456 03/03/2012 1.00
34567 04/04/2004 12/12/2012 <Blank> <Blank> <Blank>
For the 1st report, I want to display all records with complete entries:
AccountNum ActDate SuspDate EntryDate Charge
12345 01/01/2001 12/12/2012 01/01/2012 1.00
67890 02/02/2002 11/11/2011 02/02/2012 1.00
For the 2nd report, I want to display all records that have entries for AccountNum2, EntryDate, Charge only
AccountNum EntryDate Charge
23456 03/03/2012 1.00
For the 3rd report, I want to display all records that have entries for AccountNum1, ActDate, SuspDate only
AccountNum ActDate SuspDate
34567 04/04/2004 12/12/2012
I need to be able to show the information in a single report and also summarize the count of entries in report1, report2 and report3.
Thanks for all your help.:)
This IS possible in Crystal via a workaround:
Add a formula that defines which section you want the row in, eg SectionNo:
Formula might need changing depending on your logic
If (Not Isnull(AccountNum) and Not Isnull(ActDate) and Not Isnull(SuspDate) and Not Isnull(EntryDate) and Not Isnull(Charge)) then
1
else if (Not Isnull(ActDate)) then
2
else
3
Now you can add a group by the new formula, this will separate the rows into the three sections.
Next add two new detail sections and setup detaila, detailb and detailc to show the fields you want in sections 1, 2 and 3.
Finally, add a suppression formula to each of the three detail sections:
DetailA enter "SectionNo <> 1"
DetailB enter "SectionNo <> 2"
DetailC enter "SectionNo <> 3"
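To check the classification and the per-section counts outside Crystal, here is a small Python sketch. It follows the three reports described in the question rather than the exact formula above (which, as noted, may need changing): section 1 has both account numbers, section 2 has only AccountNum2, and section 3 has only AccountNum1. The function name section_no and the use of None for blank values are assumptions for this illustration.

```python
from collections import Counter


def section_no(row):
    """Return 1, 2 or 3 depending on which side of the join is populated."""
    has_acct1 = row["AccountNum1"] is not None
    has_acct2 = row["AccountNum2"] is not None
    if has_acct1 and has_acct2:
        return 1          # complete entries
    if has_acct2:
        return 2          # AccountNum2, EntryDate, Charge only
    return 3              # AccountNum1, ActDate, SuspDate only


rows = [
    {"AccountNum1": "12345", "ActDate": "01/01/2001", "SuspDate": "12/12/2012",
     "AccountNum2": "12345", "EntryDate": "01/01/2012", "Charge": 1.00},
    {"AccountNum1": "67890", "ActDate": "02/02/2002", "SuspDate": "11/11/2011",
     "AccountNum2": "67890", "EntryDate": "02/02/2012", "Charge": 1.00},
    {"AccountNum1": None, "ActDate": None, "SuspDate": None,
     "AccountNum2": "23456", "EntryDate": "03/03/2012", "Charge": 1.00},
    {"AccountNum1": "34567", "ActDate": "04/04/2004", "SuspDate": "12/12/2012",
     "AccountNum2": None, "EntryDate": None, "Charge": None},
]

sections = [section_no(row) for row in rows]
counts = Counter(sections)   # the summary count per report
print(sections)
print(dict(counts))
```

The counts correspond to the summary the question asks for: how many rows fall into each of the three lists.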
If you need a hand setting it up let me know.
No, this is not possible in Crystal Reports; you have to create two subreports, one each for the second and third listings.