Kaitai Struct - Optional block or attribute

My system has to be able to parse two types of very similar input data.
If the data comes from a queue it has the following structure:
record
record
...
record
If the data comes from a file it has the following structure:
header
record
record
...
record
My current code is as follows:
seq:
  - id: file_header
    type: file_header
  - id: record
    type: record
    repeat: eos
types:
  file_header:
    seq:
      - id: main_version
        type: u1
      - id: sub_version
        type: u1
      - id: spare
        type: str
        size: 30
        encoding: UTF-8
  record:
    seq:
      - id: event_id
        type: u2
        # enum: event_types
      - id: event_length
        type: u4
      - id: enb_id
        type: u4
      - id: cell_id
        type: u1
      - id: call_id
        type: u4
      - id: date_time
        type: date_time_record
      - id: spare
        type: str
        size: 2
        encoding: UTF-8
      - id: crnti
        type: u2
      - id: body
        size: event_length - 21
My idea is to create only one .ksy file that works for both approaches.
How can I get it?
It would basically be making file_header optional, but I don't see a way to do it.
Can somebody please help me on this?

Affiliate disclaimer: I'm a Kaitai Struct maintainer (see my GitHub profile).
My idea is to create only one .ksy file that works for both
approaches. How can I get it?
You can define a boolean parameter is_file on the top-level type and pass true when the data comes from a file, otherwise false. Like this:
params:
  - id: is_file
    type: bool
seq:
  - id: file_header
    type: file_header
    if: is_file
  - id: record
    type: record
    repeat: eos
types:
  # ...
Note that the is_file param is mandatory, and you won't be able to instantiate the class without passing a value for it. For that reason, the fromFile(…) helper method will no longer be available, and you'll need to create the parser object normally using new (or its closest equivalent in your target language).
I don't know what language you're targeting, but in C++, C#, Lua, PHP and Python the custom params come first (before _io, _root and _parent), and in Java, JavaScript and Nim they come second. For example, in Python you would do:
from io import BytesIO
from kaitaistruct import KaitaiStream
# Also import your generated class; the module name depends on your .ksy meta/id.
data = b"\x00\x01\x02..."
data_comes_from_file = True
f = YourFormat(data_comes_from_file, KaitaiStream(BytesIO(data)))
In Java, for instance:
import io.kaitai.struct.ByteBufferKaitaiStream;
byte[] data = new byte[] { 0x00, 0x01, 0x02, ... };
boolean dataComesFromFile = true;
YourFormat f = new YourFormat(new ByteBufferKaitaiStream(data), dataComesFromFile);
Check the generated code if unsure.
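As a rough illustration of what `if: is_file` and `repeat: eos` do at parse time, here is a hand-written Python sketch; it is not generated Kaitai code. The header layout follows the .ksy above, while the record layout (u2 event_id, u1 body length, then the body) and the big-endian byte order are simplifying assumptions made only for this example.

```python
import struct
from io import BytesIO


def parse(data: bytes, is_file: bool):
    """Hand-written stand-in for the generated parser (illustration only)."""
    stream = BytesIO(data)
    header = None
    if is_file:  # mirrors `if: is_file` on the file_header attribute
        main_version, sub_version = struct.unpack(">BB", stream.read(2))
        stream.read(30)  # skip the 30-byte spare field
        header = (main_version, sub_version)
    records = []
    while True:  # mirrors `repeat: eos`
        prefix = stream.read(3)
        if not prefix:
            break
        # Assumed toy record: u2 event_id, u1 body length, then the body.
        event_id, body_len = struct.unpack(">HB", prefix)
        records.append((event_id, stream.read(body_len)))
    return header, records


record = struct.pack(">HB", 7, 2) + b"ab"
file_header = bytes([1, 4]) + b" " * 30

queue_result = parse(record * 2, is_file=False)
file_result = parse(file_header + record * 2, is_file=True)
```

The same record loop serves both inputs; only the header step is conditional, which is the point of using one .ksy with a parameter.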

Related

Combining two choices in a single object

In my service schema I've got a single object that has two choices of properties, meaning:
Object A needs: property 1 or 2, and property 3 or 4.
To realise this in OAS I'm using an object with an allOf property, which contains two items containing a oneOf property. I don't see a reason why this construction would be illegal. However, when using the swagger editor (https://editor.swagger.io/) and the Swagger Viewer extension in VSCode, it merges the two oneOf properties into a single one, effectively instructing the user to either include property 1, 2, 3 or 4.
One other way to achieve the same is to define a schema for every combination of choice 1 and 2, but this gets very tedious as the number of options grows (the schema count multiplies with every combination of the two choices).
Is my interpretation of the way this should work correct? If not, how can I achieve my goal without the spec becoming too verbose? If yes, I take it this is an issue in both tools I'm using; in that case I'll raise an issue in the respective issue trackers.
Example OAS spec:
openapi: 3.0.3
info:
  description: Test API
  version: 0.1.0
  title: Test API
paths:
  /test:
    post:
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/TestComplexObject'
      responses:
        '200':
          description: OK
components:
  schemas:
    TestComplexObject:
      type: object
      allOf:
        - type: object
          properties:
            defaultString:
              type: string
        - oneOf:
            - $ref: '#/components/schemas/TestStr1'
            - $ref: '#/components/schemas/TestStr2'
        - oneOf:
            - $ref: '#/components/schemas/TestStr3'
            - $ref: '#/components/schemas/TestStr4'
    TestStr1:
      type: object
      properties:
        testString1:
          type: string
      required: [testString1]
    TestStr2:
      type: object
      properties:
        testString2:
          type: string
      required: [testString2]
    TestStr3:
      type: object
      properties:
        testString3:
          type: string
      required: [testString3]
    TestStr4:
      type: object
      properties:
        testString4:
          type: string
      required: [testString4]
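To make the intended semantics concrete, here is a small Python sketch of what "allOf containing two oneOf groups" is expected to accept. It is a hand-written check, not a JSON Schema validator: it only looks at which properties are present and ignores the string type constraint. The function names are made up for this example.

```python
def exactly_one(obj: dict, names: list) -> bool:
    """True when exactly one of the given property names is present."""
    return sum(name in obj for name in names) == 1


def is_valid_test_complex_object(obj: dict) -> bool:
    # Each oneOf group must match exactly one branch;
    # allOf requires both groups to pass at the same time.
    return (
        exactly_one(obj, ["testString1", "testString2"])
        and exactly_one(obj, ["testString3", "testString4"])
    )


ok = is_valid_test_complex_object({"testString1": "a", "testString4": "b"})
both_in_group = is_valid_test_complex_object(
    {"testString1": "a", "testString2": "b", "testString3": "c"}
)
missing_group = is_valid_test_complex_object({"testString1": "a"})
```

Under this reading, a tool that flattens the two groups into a single "one of 1, 2, 3 or 4" choice would accept `{"testString1": "a"}`, which the sketch rejects.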

Shall I use oneOf or multiple attributes on the same object for a component of OpenAPI specification?

In short, here's a fragment of an OpenAPI specification:
# Consider an imaginary Internet Service Provider that needs to limit
# one or multiple customers in different ways: bandwidth (inbound, outbound), media type (text / music / video), volume (total traffic volume, for example, 5 GB).
# per specific website (a user can have different limit for YouTube and Amazon).
# You'll likely want to rename some of the attributes but
# that's not the point (this API is fake and that's the best example I came up with to reproduce the issue).
components:
  schemas:
    ProviderLimit:
      type: object
      properties:
        name:
          type: string
        website_id:
          type: string
        users:
          type: array
          items:
            type: string
          description: List of user IDs
          minItems: 1
        bandwidth:
          type: object
          $ref: '#/components/schemas/BandwidthLimit'
        volume:
          type: object
          $ref: '#/components/schemas/VolumeLimit'
        media:
          type: string
          enum:
            - TEXT
            - MUSIC
            - VIDEO
    BandwidthLimit:
      properties:
        incoming_speed:
          type: string
          format: int64
        outcoming_speed:
          type: string
          format: int64
    VolumeLimit:
      properties:
        target:
          type: string
          format: int64
The question is which approach I should take:
1. Merge all the possible limits (bandwidth, volume, media), make them all optional, and agree to specify just one of them on the client.
2. Use oneOf.
# Example of using oneOf
components:
  schemas:
    ProviderLimit:
      type: object
      properties:
        name:
          type: string
        website_id:
          type: string
        users:
          type: array
          items:
            type: string
          description: List of user IDs
          minItems: 1
        limit_type:
          type: object
          oneOf:
            - $ref: '#/components/schemas/BandwidthLimit'
            - $ref: '#/components/schemas/VolumeLimit'
            - $ref: '#/components/schemas/MediaLimit'
          discriminator:
            propertyName: type # or something else
Option #2 looks a little bit better, but one could argue that option #1 is also reasonable (it's very simple and doesn't overcomplicate the API). Is there a strong argument for #2 besides its looking a little bit better (for example, a use case where #1 might not lead to the expected results)?
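One practical difference worth spelling out: with option #1, a schema in which bandwidth, volume and media are all optional does not by itself express "exactly one limit", so that rule has to be enforced in code. A minimal Python sketch of such a check follows; the function name and error message are made up for this example.

```python
LIMIT_FIELDS = ("bandwidth", "volume", "media")


def chosen_limit(provider_limit: dict) -> str:
    """Return the single limit field that is set, or raise ValueError."""
    present = [f for f in LIMIT_FIELDS if provider_limit.get(f) is not None]
    if len(present) != 1:
        raise ValueError(f"expected exactly one of {LIMIT_FIELDS}, got {present}")
    return present[0]


single = chosen_limit({"name": "evening cap", "volume": {"target": "5"}})

try:
    chosen_limit({"name": "ambiguous", "volume": {"target": "5"}, "media": "VIDEO"})
    rejected_two = False
except ValueError:
    rejected_two = True
```

With oneOf, the equivalent constraint lives in the specification itself, so validators and generated clients can see it.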

go-swagger injects digits into properties names

The models.yaml file I have is:
baseStorePatch:
  title: Store
  type: object
  required:
    - scalePolicy
  properties:
    scalePolicy:
      $ref: "#/definitions/scalePolicy"
StorePatch:
  allOf:
    - $ref: "#/definitions/baseStorePatch"
    - type: object
      properties:
However, when I use go-swagger to generate the clients, the output is:
type StorePatch struct {
    ScalePolicy *StorePatchAO0ScalePolicy `json:"scalePolicy,omitempty"`
}
Why does go-swagger automatically generate StorePatchAO0 as the prefix, and how can I get rid of it?

kaitai instance value ternary: can't combine output types

I've created a Kaitai Struct .ksy for two very similar Digilent log file formats. The second format (openlogger) is an extension of the first (openscope) with two additional fields in the struct. The scope is basically a single-channel logger; the additional logger fields describe the number of active channels (a u1, max 8) and the channel to sample order map (a u1 x 8).
I'm attempting to harmonise the interface for the two formats by synthesising always-present fields for the num_channels and channel_map; this has worked fine for the num_channels instance. However, I can't figure out how to create a suitable value for the channel map; the .ksy below reports an error:
/types/body/types/header/instances/channel_order/value: can't combine output types: ArrayType(Int1Type(false)) vs CalcBytesType
I can't figure out how I can represent the if_false part ([0]) as an ArrayType.
Is there a better way to approach this?
meta:
  id: dlog
  file-extension: dlog
seq:
  - id: endianness
    type: u1
    doc: 0 - little endian 1 - big endian
  - id: body
    type: body
types:
  body:
    meta:
      endian:
        switch-on: _root.endianness
        cases:
          0: le
          1: be
    seq:
      - id: header
        type: header
    instances:
      data:
        pos: header.start_of_data
        type: data
    types:
      header:
        seq:
          - id: sample_size
            type: u1
          - id: header_size
            type: u2
          - id: start_of_data
            type: u2
          - id: dlog_format
            type: u2
            enum: dlog_formats
          - id: dlog_version
            type: u4
          - id: voltage_units
            type: u8
          - id: stop_reason
            type: u4
            enum: stop_reasons
          #...
          - id: num_openlogger_channels
            type: u4
            if: dlog_format == dlog_formats::openlogger
            doc: number of channels per sample
          - id: openlogger_channel_map
            type: u1
            repeat: expr
            repeat-expr: 8
            if: dlog_format == dlog_formats::openlogger
            doc: channel order
        instances:
          num_channels:
            value: 'dlog_format == dlog_formats::openlogger ? num_openlogger_channels : 1'
          channel_map:
            value: 'dlog_format == dlog_formats::openlogger ? openlogger_channel_map : [0]'
      data:
        seq:
          - id: samples
            type: sample
            repeat: eos
        types:
          sample:
            seq:
              - id: channel
                type:
                  switch-on: _root.body.header.sample_size
                  cases:
                    1: s1
                    2: s2
                    4: s4
                repeat: expr
                repeat-expr: _root.body.header.num_channels
enums:
  dlog_formats:
    1: openscope
    3: openlogger
  stop_reasons:
    0: normal
    1: forced
    2: error
    3: overflow
    4: unknown
The literal [0] gets parsed as a byte array. This is the default behavior that people typically depend on: the heuristic array-literal parser treats every array whose values all fit in the range 0..255 as a byte array, not a true array.
You can still get a true array literal by forcing it with a typecast:
[0].as<u1[]>
Note that it will likely cause problems with C++98, which lacks one-line initializers for true arrays (std::vector).
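For intuition about why the two branches of the ternary cannot be combined, the distinction maps onto two different Python types. This sketch assumes that, in the Python target, a byte array corresponds to bytes and a true array of u1 to a list of integers; check the generated code if unsure.

```python
byte_array = bytes([0])  # analogue of the literal [0] read as a byte array
true_array = [0]         # analogue of [0].as<u1[]>, a list of integers

same_type = type(byte_array) is type(true_array)
equal = byte_array == true_array
same_elements = list(byte_array) == true_array
```

The elements are the same, but the containers are not interchangeable, which is why the compiler asks for the explicit cast.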

Specify an array of strings as body parameter in swagger API

I would like to post an array of strings like
[
  "id1",
  "id2"
]
to a Swagger based API. In my swagger file, I have those lines:
paths:
  /some_url:
    post:
      parameters:
        - name: ids
          in: body
          required: true
What is the correct way to specify the type of ids as an array of strings?
Update:
According to the specification, the following should work in my opinion:
parameters:
  - in: body
    description: xxx
    required: true
    schema:
      type: array
      items:
        type: string
https://github.com/Yelp/swagger_spec_validator does not accept it and returns a long list of convoluted errors, which look like the code expects some $ref.
Your description of an array of strings is correct, but the parameter definition is missing the name property, which it needs to be valid.
Here's a full working example:
swagger: "2.0"
info:
  title: A dummy title
  version: 1.0.0
paths:
  /path:
    post:
      parameters:
        - in: body
          description: xxx
          required: true
          name: a name
          schema:
            type: array
            items:
              type: string
      responses:
        default:
          description: OK
Try the online editor to check your OpenAPI (fka. Swagger) specs: http://editor.swagger.io/
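For a quick sanity check on the client side, the request body that this schema describes can be produced and verified with a few lines of Python. This is only a sketch: the helper name is_string_array is made up here, and it mirrors just the `type: array` / `items: type: string` part of the schema.

```python
import json


def is_string_array(value) -> bool:
    """Mirror of `type: array` with `items: {type: string}`."""
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


body = json.dumps(["id1", "id2"])  # the JSON text sent as the request body
decoded = json.loads(body)

valid = is_string_array(decoded)
mixed = is_string_array(["id1", 2])
not_a_list = is_string_array("id1")
```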
I have created a Swagger issue, because the definition provided by Arnaud, although valid YAML, will give you NPEs when you try to generate code. You will need to provide an object like the following:
myDataItem:
  type: object
  description: A list of values
  required:
    - values
  properties:
    values:
      type: array
      items:
        type: string
And then refer to it (in your post item etc):
schema:
  $ref: "#/definitions/myDataItem"
For reference the github issue:
https://github.com/swagger-api/swagger-codegen/issues/6745
Note: the issue has been fixed in version 2.3.0 and higher, so ideally you should upgrade to that version.
None of the answers worked for me. As it is stated in the following Baeldung article:
To better document the API and instruct the user, we can use the example label of how to insert values
So the full working example would be something like that:
swagger: "2.0"
info:
  title: A dummy title
  version: 1.0.0
paths:
  /path:
    post:
      parameters:
        - in: body
          description: xxx
          required: true
          name: a name
          schema:
            type: array
            items:
              type: string
            example: ["str1", "str2", "str3"]
      responses:
        default:
          description: OK
You can check how the Example Value is now better informed in the Swagger editor.
For an array containing objects as its content, the definition of the object can also be expressed using definitions and $ref.
Example:
schema:
  type: array
  items:
    $ref: '#/definitions/ObjectSchemaDefinition'
definitions:
  ObjectSchemaDefinition:
    type: string
The answer with the most votes got me in the right direction. I just needed an example of an array of objects where each one of them had a property which was an array of strings with more than one value in the strings array. Thanks to the documentation I got it working like this:
MyObject:
  type: object
  properties:
    body:
      type: array
      items:
        type: object
        properties:
          type:
            type: string
          values:
            type: array
            items:
              type: string
      example:
        - type: "firstElement"
          values: ["Active", "Inactive"]
        - type: "SecondElement"
          values: ["Active", "Inactive"]
One thing to keep in mind is that indentation is of paramount importance to swagger. If you don't indent things well, swagger will give you strange error messages.