Scalable HEVC encoding: How to setup cfg files for quality scalability? - configuration-files

I downloaded SHM12.3 and I started scalable encoding.
This is the script I use in terminal:
./TAppEncoderStatic -c cfg/encoder_scalable_journal_B2.cfg -c cfg/per-sequence-svc/C_L-SNR.cfg -c cfg/layers_journal.cfg -b C_L_SSIM_B2.bin -o0 rec/C_L_B2_l0_rec.yuv -o1 rec/C_L_B2_l1_rec.yuv >> results_B2_26_06_2017.txt
This is the example script given in software description.
I need to perform scalable encoding that produces one video at several quality levels (i.e., several bitrates).
Can anyone help me edit the configuration files to support quality scalability?
Thank you in advance!

I found the solution. The first configuration file is encoder_scalable_journal_B2.cfg.
My setup uses 3 layers for SNR scalability.
#======== File I/O =====================
BitstreamFile : str.bin
#ReconFile : rec.yuv
#======== Profile ================
NumProfileTierLevel : 3
Profile0 : main # Profile for BL (NOTE01: this profile applies to whole layers but only BL is outputted)
# (NOTE02: this profile has no effect when NonHEVCBase is set to 1)
Profile1 : main # Profile for BL (NOTE01: this profile applies to HEVC BL only)
# (NOTE02: When NonHEVCBase is set to 1, this profile & associated level should be updated appropriately)
Profile2 : scalable-main # Scalable profile
#======== Unit definition ================
#MaxCUWidth : 64 # Maximum coding unit width in pixel
#MaxCUHeight : 64 # Maximum coding unit height in pixel
#MaxPartitionDepth : 16 # Maximum coding unit depth
#QuadtreeTULog2MaxSize : 5 # Log2 of maximum transform size for
# quadtree-based TU coding (2...6)
#QuadtreeTULog2MinSize : 2 # Log2 of minimum transform size for
# quadtree-based TU coding (2...6)
#QuadtreeTUMaxDepthInter : 3
#QuadtreeTUMaxDepthIntra : 3
#======== Coding Structure =============
MaxNumMergeCand : 2
#IntraPeriod : 4 # Period of I-Frame ( -1 = only first)
DecodingRefreshType : 2 # Random Access 0:none, 1:CRA, 2:IDR, 3:Recovery Point SEI
GOPSize : 6 # GOP Size (number of B slice = GOPSize-1)
# Type POC QPoffset CbQPoffset CrQPoffset QPfactor tcOffsetDiv2 betaOffsetDiv2 temporal_id #ref_pics_active #ref_pics reference pictures predict deltaRPS #ref_idcs reference idcs
Frame1: B 1 2 0 0 0.4624 0 0 0 1 1 -1 0
Frame2: B 2 1 0 0 0.4624 0 0 0 1 1 -2 2 1
Frame3: P 3 0 0 0 0.4624 0 0 0 1 1 -3 2 2
Frame4: B 4 2 0 0 0.4624 0 0 0 1 1 -1 2 2
Frame5: B 5 1 0 0 0.4624 0 0 0 1 1 -2 2 3
Frame6: P 6 0 0 0 0.4624 0 0 0 1 1 -3 2 3
#=========== Motion Search =============
FastSearch : 1 # 0:Full search 1:TZ search
SearchRange : 25 # (0: Search range is a Full frame)
BipredSearchRange : 4 # Search range for bi-prediction refinement
HadamardME : 1 # Use of hadamard measure for fractional ME
FEN : 1 # Fast encoder decision
FDM : 1 # Fast Decision for Merge RD cost
#======== Quantization =============
#QP : 32 # Quantization parameter(0-51)
MaxDeltaQP : 0 # CU-based multi-QP optimization
#MaxCuDQPDepth : 0 # Max depth of a minimum CuDQP for sub-LCU-level delta QP
DeltaQpRD : 0 # Slice-based multi-QP optimization
RDOQ : 1 # RDOQ
RDOQTS : 1 # RDOQ for transform skip
SliceChromaQPOffsetPeriodicity: 0 # Used in conjunction with Slice Cb/Cr QpOffsetIntraOrPeriodic. Use 0 (default) to disable periodic nature.
SliceCbQpOffsetIntraOrPeriodic: 0 # Chroma Cb QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
SliceCrQpOffsetIntraOrPeriodic: 0 # Chroma Cr QP Offset at slice level for I slice or for periodic inter slices as defined by SliceChromaQPOffsetPeriodicity. Replaces offset in the GOP table.
#=========== Deblock Filter ============
#DeblockingFilterControlPresent: 0 # Dbl control params present (0=not present, 1=present)
LoopFilterOffsetInPPS : 0 # Dbl params: 0=varying params in SliceHeader, param = base_param + GOP_offset_param; 1=constant params in PPS, param = base_param)
LoopFilterDisable : 0 # Disable deblocking filter (0=Filter, 1=No Filter)
LoopFilterBetaOffset_div2 : 0 # base_param: -6 ~ 6
LoopFilterTcOffset_div2 : 0 # base_param: -6 ~ 6
DeblockingFilterMetric : 0 # blockiness metric (automatically configures deblocking parameters in bitstream)
#=========== Misc. ============
#InternalBitDepth : 8 # codec operating bit-depth
#=========== Coding Tools =================
#SAO : 0 # Sample adaptive offset (0: OFF, 1: ON)
AMP : 0 # Asymmetric motion partitions (0: OFF, 1: ON)
TransformSkip : 0 # Transform skipping (0: OFF, 1: ON)
TransformSkipFast : 0 # Fast Transform skipping (0: OFF, 1: ON)
SAOLcuBoundary : 0 # SAOLcuBoundary using non-deblocked pixels (0: OFF, 1: ON)
#============ Slices ================
SliceMode : 0 # 0: Disable all slice options.
# 1: Enforce maximum number of LCU in an slice,
# 2: Enforce maximum number of bytes in an 'slice'
# 3: Enforce maximum number of tiles in a slice
SliceArgument : 1500 # Argument for 'SliceMode'.
# If SliceMode==1 it represents max. SliceGranularity-sized blocks per slice.
# If SliceMode==2 it represents max. bytes per slice.
# If SliceMode==3 it represents max. tiles per slice.
LFCrossSliceBoundaryFlag : 1 # In-loop filtering, including ALF and DB, is across or not across slice boundary.
# 0:not across, 1: across
#============ PCM ================
PCMEnabledFlag : 0 # 0: No PCM mode
PCMLog2MaxSize : 5 # Log2 of maximum PCM block size.
PCMLog2MinSize : 3 # Log2 of minimum PCM block size.
PCMInputBitDepthFlag : 1 # 0: PCM bit-depth is internal bit-depth. 1: PCM bit-depth is input bit-depth.
PCMFilterDisableFlag : 0 # 0: Enable loop filtering on I_PCM samples. 1: Disable loop filtering on I_PCM samples.
#============ Tiles ================
TileUniformSpacing : 0 # 0: the column boundaries are indicated by TileColumnWidth array, the row boundaries are indicated by TileRowHeight array
# 1: the column and row boundaries are distributed uniformly
NumTileColumnsMinus1 : 0 # Number of tile columns in a picture minus 1
TileColumnWidthArray : 2 3 # Array containing tile column width values in units of CTU (from left to right in picture)
NumTileRowsMinus1 : 0 # Number of tile rows in a picture minus 1
TileRowHeightArray : 2 # Array containing tile row height values in units of CTU (from top to bottom in picture)
LFCrossTileBoundaryFlag : 1 # In-loop filtering is across or not across tile boundary.
# 0:not across, 1: across
#============ WaveFront ================
#WaveFrontSynchro : 0 # 0: No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
# >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.
#=========== Quantization Matrix =================
#ScalingList : 0 # ScalingList 0 : off, 1 : default, 2 : file read
#ScalingListFile : scaling_list.txt # Scaling List file name. If file is not exist, use Default Matrix.
#============ Lossless ================
#TransquantBypassEnableFlag : 0 # Value of PPS flag.
#CUTransquantBypassFlagForce: 0 # Force transquant bypass mode, when transquant_bypass_enable_flag is enabled
#============ Rate Control ======================
#RateControl : 0 # Rate control: enable rate control
#TargetBitrate : 1000000 # Rate control: target bitrate, in bps
#KeepHierarchicalBit : 2 # Rate control: 0: equal bit allocation; 1: fixed ratio bit allocation; 2: adaptive ratio bit allocation
#LCULevelRateControl : 1 # Rate control: 1: LCU level RC; 0: picture level RC
#RCLCUSeparateModel : 1 # Rate control: use LCU level separate R-lambda model
#InitialQP : 0 # Rate control: initial QP
#RCForceIntraQP : 0 # Rate control: force intra QP to be equal to initial QP
### DO NOT ADD ANYTHING BELOW THIS LINE ###
### DO NOT DELETE THE EMPTY LINE BELOW ###
The second configuration file is C_L-SNR.cfg.
FrameSkip : 0 # Number of frames to be skipped in input
FramesToBeEncoded : 480 # Number of frames to be coded
Level0 : 3 # Level of the whole bitstream
Level1 : 3 # Level of the base layer
Level2 : 3 # Level of the enhancement layer
Level3 : 3 # Level of the enhancement layer
#======== File I/O ===============
InputFile0 : C_L_560x448_40.yuv
FrameRate0 : 40 # Frame Rate per second
InputBitDepth0 : 8 # Input bitdepth for layer 0
SourceWidth0 : 560 # Input frame width
SourceHeight0 : 448 # Input frame height
RepFormatIdx0 : 0 # Index of corresponding rep_format() in the VPS
IntraPeriod0 : 96 # Period of I-Frame ( -1 = only first)
ConformanceMode0 : 1 # conformance mode
QP0 : 31
LayerPTLIndex0 : 1
InputFile1 : C_L_560x448_40.yuv
FrameRate1 : 40 # Frame Rate per second
InputBitDepth1 : 8 # Input bitdepth for layer 1
SourceWidth1 : 560 # Input frame width
SourceHeight1 : 448 # Input frame height
RepFormatIdx1 : 1 # Index of corresponding rep_format() in the VPS
IntraPeriod1 : 96 # Period of I-Frame ( -1 = only first)
ConformanceMode1 : 1 # conformance mode
QP1 : 26
LayerPTLIndex1 : 2
InputFile2 : C_L_560x448_40.yuv
FrameRate2 : 40 # Frame Rate per second
InputBitDepth2 : 8 # Input bitdepth for layer 2
SourceWidth2 : 560 # Input frame width
SourceHeight2 : 448 # Input frame height
RepFormatIdx2 : 2 # Index of corresponding rep_format() in the VPS
IntraPeriod2 : 96 # Period of I-Frame ( -1 = only first)
ConformanceMode2 : 1 # conformance mode
QP2 : 23
LayerPTLIndex2 : 3
And the last configuration files is layers_journal.cfg.
NumLayers : 3
NonHEVCBase : 0
ScalabilityMask1 : 0 # Multiview
ScalabilityMask2 : 1 # Scalable
ScalabilityMask3 : 0 # Auxiliary pictures
AdaptiveResolutionChange : 0 # Resolution change frame (0: disable)
SkipPictureAtArcSwitch : 0 # Code higher layer picture as skip at ARC switching (0: disable (default), 1: enable)
MaxTidRefPresentFlag : 1 # max_tid_ref_present_flag (0=not present, 1=present(default))
CrossLayerPictureTypeAlignFlag: 1 # Picture type alignment across layers
CrossLayerIrapAlignFlag : 1 # Align IRAP across layers
SEIDecodedPictureHash : 1
#============= LAYER 0 ==================
#QP0 : 22
MaxTidIlRefPicsPlus10 : 7 # max_tid_il_ref_pics_plus1 for layer0
#============ Rate Control ==============
RateControl0 : 0 # Rate control: enable rate control for layer 0
TargetBitrate0 : 1000000 # Rate control: target bitrate for layer 0, in bps
KeepHierarchicalBit0 : 1 # Rate control: keep hierarchical bit allocation for layer 0 in rate control algorithm
LCULevelRateControl0 : 1 # Rate control: 1: LCU level RC for layer 0; 0: picture level RC for layer 0
RCLCUSeparateModel0 : 1 # Rate control: use LCU level separate R-lambda model for layer 0
InitialQP0 : 0 # Rate control: initial QP for layer 0
RCForceIntraQP0 : 0 # Rate control: force intra QP to be equal to initial QP for layer 0
#============ WaveFront ================
WaveFrontSynchro0 : 0 # 0: No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
# >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.
#=========== Quantization Matrix =================
ScalingList0 : 0 # ScalingList 0 : off, 1 : default, 2 : file read
ScalingListFile0 : scaling_list0.txt # Scaling List file name. If file is not exist, use Default Matrix.
#============= LAYER 1 ==================
#QP1 : 20
NumSamplePredRefLayers1 : 1 # number of sample pred reference layers
SamplePredRefLayerIds1 : 0 # reference layer id
NumMotionPredRefLayers1 : 1 # number of motion pred reference layers
MotionPredRefLayerIds1 : 0 # reference layer id
NumActiveRefLayers1 : 1 # number of active reference layers
PredLayerIds1 : 0 # inter-layer prediction layer index within available reference layers
#============ Rate Control ==============
RateControl1 : 0 # Rate control: enable rate control for layer 1
TargetBitrate1 : 1000000 # Rate control: target bitrate for layer 1, in bps
KeepHierarchicalBit1 : 1 # Rate control: keep hierarchical bit allocation for layer 1 in rate control algorithm
LCULevelRateControl1 : 1 # Rate control: 1: LCU level RC for layer 1; 0: picture level RC for layer 1
RCLCUSeparateModel1 : 1 # Rate control: use LCU level separate R-lambda model for layer 1
InitialQP1 : 0 # Rate control: initial QP for layer 1
RCForceIntraQP1 : 0 # Rate control: force intra QP to be equal to initial QP for layer 1
#============ WaveFront ================
WaveFrontSynchro1 : 0 # 0: No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
# >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.
#=========== Quantization Matrix =================
ScalingList1 : 0 # ScalingList 0 : off, 1 : default, 2 : file read
ScalingListFile1 : scaling_list1.txt # Scaling List file name. If file is not exist, use Default Matrix.
#============= LAYER 2 ==================
#QP1 : 20
NumSamplePredRefLayers2 : 1 # number of sample pred reference layers
SamplePredRefLayerIds2 : 0 # reference layer id
NumMotionPredRefLayers2 : 1 # number of motion pred reference layers
MotionPredRefLayerIds2 : 0 # reference layer id
NumActiveRefLayers2 : 1 # number of active reference layers
PredLayerIds2 : 0 # inter-layer prediction layer index within available reference layers
#============ Rate Control ==============
RateControl2 : 0 # Rate control: enable rate control for layer 2
TargetBitrate2 : 1000000 # Rate control: target bitrate for layer 2, in bps
KeepHierarchicalBit2 : 1 # Rate control: keep hierarchical bit allocation for layer 2 in rate control algorithm
LCULevelRateControl2 : 1 # Rate control: 1: LCU level RC for layer 2; 0: picture level RC for layer 2
RCLCUSeparateModel2 : 1 # Rate control: use LCU level separate R-lambda model for layer 2
InitialQP2 : 0 # Rate control: initial QP for layer 2
RCForceIntraQP2 : 0 # Rate control: force intra QP to be equal to initial QP for layer 2
#============ WaveFront ================
WaveFrontSynchro2 : 0 # 0: No WaveFront synchronisation (WaveFrontSubstreams must be 1 in this case).
# >0: WaveFront synchronises with the LCU above and to the right by this many LCUs.
#=========== Quantization Matrix =================
ScalingList2 : 0 # ScalingList 0 : off, 1 : default, 2 : file read
ScalingListFile2 : scaling_list1.txt # Scaling List file name. If file is not exist, use Default Matrix.
NumLayerSets : 3 # Include default layer set, value of 0 not allowed
NumLayerInIdList1 : 2 # 0-th layer set is default, need not specify LayerSetLayerIdList0 or NumLayerInIdList0
LayerSetLayerIdList1 : 0 1
NumLayerInIdList2 : 3 # 0-th layer set is default, need not specify LayerSetLayerIdList0 or NumLayerInIdList0
LayerSetLayerIdList2 : 0 1 2
NumAddLayerSets : 0
NumOutputLayerSets : 3 # Include default OLS, value of 0 not allowed
DefaultTargetOutputLayerIdc : 2
NumOutputLayersInOutputLayerSet : 1 1 # The number of layers in the 0-th OLS should not be specified,
# ListOfOutputLayers0 need not be specified
ListOfOutputLayers1 : 1
ListOfProfileTierLevelOls1 : 1 2
ListOfOutputLayers2 : 2
ListOfProfileTierLevelOls2 : 1 2 2

Related

How Can I Change Mime Type In the Metadata on File

I want to change the "MIME Type" in the file:
└──╼ $exiftool realshort.mp4
ExifTool Version Number : 12.10
File Name : realshort.mp4
Directory : .
File Size : 98 kB
File Modification Date/Time : 2021:01:19 23:53:01+00:00
File Access Date/Time : 2021:01:19 23:53:01+00:00
File Inode Change Date/Time : 2021:01:19 23:53:01+00:00
File Permissions : rw-r--r--
File Type : MP4
File Type Extension : mp4
MIME Type : video/mp4
Major Brand : MP4 Base Media v1 [IS0 14496-12:2003]
Minor Version : 0.0.0
Compatible Brands : isom, 3gp4
Movie Header Version : 0
Create Date : 2014:11:05 13:51:33
Modify Date : 2014:11:05 13:51:33
Time Scale : 1000
Duration : 1.20 s
Preferred Rate : 1
Preferred Volume : 100.00%
Preview Time : 0 s
Preview Duration : 0 s
Poster Time : 0 s
Selection Time : 0 s
Selection Duration : 0 s
Current Time : 0 s
Next Track ID : 3
Track Header Version : 0
Track Create Date : 2014:11:05 13:51:33
Track Modify Date : 2014:11:05 13:51:33
Track ID : 1
Track Duration : 1.20 s
Track Layer : 0
Track Volume : 0.00%
Image Width : 320
Image Height : 240
Graphics Mode : srcCopy
Op Color : 0 0 0
Compressor ID : avc1
Source Image Width : 320
Source Image Height : 240
X Resolution : 72
Y Resolution : 72
Compressor Name :
Bit Depth : 24
Video Frame Rate : 30.02
Matrix Structure : 1 0 0 0 1 0 0 0 1
Media Header Version : 0
Media Create Date : 2014:11:05 13:51:33
Media Modify Date : 2014:11:05 13:51:33
Media Time Scale : 48000
Media Duration : 1.17 s
Handler Type : Audio Track
Handler Description : SoundHandle
Balance : 0
Audio Format : mp4a
Audio Channels : 1
Audio Bits Per Sample : 16
Audio Sample Rate : 48000
XMP Toolkit : Image::ExifTool 12.10
Media Data Size : 95268
Media Data Offset : 4610
Image Size : 320x240
Megapixels : 0.077
Avg Bitrate : 636 kbps
Rotation : 0
If I do:
exiftool -artist=ii realshort.mp4
I can add the artist tag with the value ii.
But if I do exiftool -"mime type"=ii realshort.mp4, it won't work.
I looked at: https://libre-software.net/edit-metadata-exiftool/
And also here: How do you change the MIME type of a file from the terminal?
But I can't find an answer there.
How can I make it work?
You can't change the MIME type. It's not embedded data; it's a tag derived from what type of file it is.
You could try editing the .ExifTool_config file if you have one (see the example config) to override the base definition, but that will only change what exiftool displays. Another program or another computer will still report video/mp4 as the MIME type.

Q-Learning neural network implementation

I was trying to implement Q-learning with neural networks. I've got Q-learning with a Q-table working perfectly fine.
I am playing a little "catch the cheese" game.
It looks something like this:
# # # # # # # #
# . . . . . . #
# . $ . . . . #
# . . . P . . #
# . . . . . . #
# . . . . . . #
# . . . . . . #
# # # # # # # #
The player P spawns somewhere on the map. If it hits a wall, the reward is negative; let's call that reward -R for now.
If the player hits the dollar sign, the reward is positive: +R.
In both cases, the game resets and the player spawns at a random position on the map.
My neural network architecture looks like this:
-> Inputsize: [1, 8, 8]
Flattening: [1, 1, 64] (So I can use Dense layers)
Dense Layer: [1, 1, 4]
-> Outputsize: [1, 1, 4]
For learning, I store game samples in a buffer whose maximum size is b_max.
So my training looks like this:
1. Pick a random number between 0 and 1.
2. If the number is greater than the threshold, choose a random action.
3. Otherwise pick the action with the highest predicted reward.
4. Take that action and observe the reward.
5. Update the neural network using a batch of game samples from the buffer:
5.1 Iterate through the batch and train the network as follows.
5.2 For each sample, the input to the network is the game state (0 everywhere, except at the player's position).
5.3 The output error of the output layer is 0 everywhere except at the output neuron corresponding to the action taken in that sample.
5.4 There, the expected output is (the reward) + (discount_factor * future_reward), where future_reward = max(neuralNetwork(nextState)).
5.5 Repeat from the beginning.
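The steps above can be sketched in a few lines of Python. This is only an illustration of the action-selection and target-computation logic; all names and hyperparameter values (GAMMA, b_max, the epsilon parameter) are illustrative stand-ins, not taken from the question:

```python
import random
from collections import deque

import numpy as np

GAMMA = 0.9                    # discount_factor (illustrative value)
buffer = deque(maxlen=10_000)  # replay buffer of (state, action, reward, next_state, done)

def choose_action(q_values, epsilon):
    # Steps 1-3: epsilon-greedy selection over the network's Q-value outputs.
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return int(np.argmax(q_values))

def td_target(reward, next_q_values, done):
    # Step 5.4: the expected output for the taken action's output neuron.
    if done:  # hitting a wall or the cheese ends the episode, so no future reward
        return reward
    return reward + GAMMA * float(np.max(next_q_values))
```

For example, choose_action([0.1, 0.9, 0.2, 0.0], 0.0) always exploits and returns 1, while td_target(1.0, [0.5, 2.0], False) gives 1.0 + 0.9 * 2.0 = 2.8.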
The thing is, it just doesn't seem to work properly.
I have an idea of how I could change this, but I'm not sure whether it is "allowed":
each game decision could be trained until the network does exactly what it is supposed to do.
Then I would move on to the next decision, train on that, and so on. How is the training usually done?
I would be very happy if someone could help and give a detailed explanation of how the training works, especially regarding how many times each loop is run.
Greetings,
Finn
This is a map that shows what decision the neural network would like to do on each field:
# # # # # # # # # #
# 1 3 2 0 2 3 3 3 #
# 1 1 1 1 0 2 2 3 #
# 0 0 $ 1 3 0 1 1 #
# 1 0 1 2 1 0 3 3 #
# 0 1 2 3 1 0 3 0 # //The map is a little bit bigger but still it can be seen that it is wrong
# 2 0 1 3 1 0 3 0 # //0: right, 1 bottom, 2 left, 3 top
# 1 0 1 0 2 3 2 1 #
# 0 3 1 3 1 3 1 0 #
# # # # # # # # # #

An issue with argument "sortv" of function seqIplot()

I'm trying to plot individual sequences with the function seqIplot() in TraMineR. These sequences represent the work trajectories of a school's former graduates, collected via a web questionnaire.
Using the argument "sortv", I'd like to sort my sequences according to the order of the levels of one covariate, the year of graduation, named "PROMO".
"PROMO" is a factor variable contained in a data frame named "covariates.seq", which gathers the covariates together:
str(covariates.seq)
'data.frame': 733 obs. of 6 variables:
 $ ID_SQ           : Factor w/ 733 levels "1","2","3","5",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ SEXE            : Factor w/ 2 levels "Féminin","Masculin": 1 1 1 1 2 1 1 2 2 1 ...
 $ PROMO           : Factor w/ 6 levels "1997","1998",..: 1 2 2 4 4 3 2 2 2 2 ...
 $ DEPARTEMENT     : Factor w/ 10 levels "BC","GCU","GE",..: 1 4 7 8 7 9 9 7 7 4 ...
 $ NIVEAU_ADMISSION: Factor w/ 2 levels "En Premier Cycle",..: NA 1 1 1 1 1 NA 1 1 1 ...
 $ FILIERE_SECTION : Factor w/ 4 levels "Cursus Classique",..: NA 4 2 NA 1 1 NA NA 4 3 ...
I'm also using "SEXE", the graduates' gender, as a grouping variable. To plot the individual sequences, my command is as follows:
seqIplot(sequences, group = covariates.seq$SEXE,
sortv = covariates.seq$PROMO,
cex.axis = 0.7, cex.legend = 0.7)
I expected that, by using a process time axis (with the year of graduation as sequence-dependent origin), sorting the sequences according to the order of the levels of "PROMO" would give a plot with groups of sequences from the longest (for the older graduates) to the shortest (for the younger graduates).
But I've got an issue: in the output plot, the sequences don't appear to be correctly sorted according to the levels of "PROMO". Indeed, by using "sortv = covariates.seq$PROMO" as in the command above, the plot doesn't show groups of sequences from the longest to the shortest, as expected. It looks like the plot obtained without using the argument "sortv" (see Figures below).
Without using argument "sortv"
Using "sortv = covariates.seq$PROMO"
Note that I have 733 individual sequences in my object "sequences", created as follows:
labs <- c("En poste", "Au chômage (d'au moins 6 mois)",
          "Autre situation (d'au moins 6 mois)",
          "En poursuite d'études (thèse ou hors thèse)",
          "En reprise d'études / formation (d'au moins 6 mois)")
codes <- c("En poste", "Au chômage", "Autre situation",
           "En poursuite d'études", "En reprise d'études / formation")
sequences <- seqdef(situations, alphabet = labs, states = codes,
                    left = NA, right = "DEL", missing = NA,
                    cnames = as.character(seq(0, 7400/365, 1/365)),
                    xtstep = 365)
The values of the covariates are sorted in the same order as the individual sequences. The covariate "PROMO" doesn't contain any missing value.
Something's going wrong, but what?
Thank you in advance for your help,
Best,
Arnaud.
Using a factor as the sortv argument in seqIplot works fine, as illustrated by the example below:
sdc <- c("aabbccdd","bbbccc","aaaddd","abcabcab")
sd <- seqdecomp(sdc, sep="")
seq <- seqdef(sd)
fac <- factor(c("2000","2001","2001","2000"))
par(mfrow=c(1,3))
seqIplot(seq, with.legend=FALSE)
seqIplot(seq, sortv=fac, with.legend=FALSE)
seqlegend(seq)

about torch.nn.CrossEntropyLoss parameter shape

I'm learning PyTorch and, as an exercise, I'm porting the ANPR project, which is based on TensorFlow
(https://github.com/matthewearl/deep-anpr,
http://matthewearl.github.io/2016/05/06/cnn-anpr/)
to the PyTorch platform.
There is a problem: I'm using nn.CrossEntropyLoss() as the loss function:
criterion=nn.CrossEntropyLoss()
The output.data of the model is:
1.00000e-02 *
 2.5552  2.7582  2.5368 ...  5.6184  1.2288 -0.0076
 0.7033  1.3167 -1.0966 ...  4.7249  1.3217  1.8367
 0.7592  1.4777  1.8095 ...  0.8733  1.2417  1.1521
 0.1040 -0.7054 -3.4862 ...  4.7703  2.9595  1.4263
[torch.FloatTensor of size 4x253]
and targets.data is:
1 0 0 ... 0 0 0
1 0 0 ... 0 0 0
1 0 0 ... 0 0 0
1 0 0 ... 0 0 0
[torch.DoubleTensor of size 4x253]
When I call:
loss=criterion(output,targets)
an error occurs:
TypeError: FloatClassNLLCriterion_updateOutput received an invalid combination of arguments - got (int, torch.FloatTensor, **torch.DoubleTensor**, torch.FloatTensor, bool, NoneType, torch.FloatTensor), but expected (int state, torch.FloatTensor input, **torch.LongTensor** target, torch.FloatTensor output, bool sizeAverage, [torch.FloatTensor weights or None], torch.FloatTensor total_weight)
It expected a torch.LongTensor but got a torch.DoubleTensor. But if I convert the targets into a LongTensor:
torch.LongTensor(numpy.array(targets.data.numpy(),numpy.long))
and call loss=criterion(output,targets), the error is:
RuntimeError: multi-target not supported at /data/users/soumith/miniconda2/conda-bld/pytorch-0.1.10_1488752595704/work/torch/lib/THNN/generic/ClassNLLCriterion.c:20
My last exercise was MNIST, an example from PyTorch; I made a small modification, with batch_size 4. The loss function is:
loss = F.nll_loss(outputs, labels)
outputs.data:
-2.3220 -2.1229 -2.3395 -2.3391 -2.5270 -2.3269 -2.1055 -2.2321 -2.4943 -2.2996
-2.3653 -2.2034 -2.4437 -2.2708 -2.5114 -2.3286 -2.1921 -2.1771 -2.3343 -2.2533
-2.2809 -2.2119 -2.3872 -2.2190 -2.4610 -2.2946 -2.2053 -2.3192 -2.3674 -2.3100
-2.3715 -2.1455 -2.4199 -2.4177 -2.4565 -2.2812 -2.2467 -2.1144 -2.3321 -2.3009
[torch.FloatTensor of size 4x10]
labels.data:
8
6
0
1
[torch.LongTensor of size 4]
For one input image, the label must be a single element. In the example above there are 253 numbers per target row, while in MNIST there is only one number; the shape of the outputs differs from that of the labels.
I reviewed the TensorFlow manual for tf.nn.softmax_cross_entropy_with_logits:
'Logits and labels must have the same shape [batch_size, num_classes] and the same dtype (either float32 or float64).'
Does PyTorch support the same function as TensorFlow?
Many thanks.
You can convert the targets that you have to a categorical representation.
In the example that you provide, you would have 1 0 0 0.. 0 if the class is 0, 0 1 0 0 ... if the class is 1, 0 0 1 0 0 0... if the class is 2 etc.
One quick way that I can think of is to first convert the target tensor to a numpy array, then convert it from one-hot to a categorical array, and convert it back to a PyTorch tensor. Something like this:
import numpy as np

targetnp = targets.numpy()
idxs = np.where(targetnp > 0)[1]      # column index of the nonzero entry in each row
new_targets = torch.LongTensor(idxs)
loss = criterion(output, new_targets)
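In more recent PyTorch versions the same one-hot-to-index conversion can be done without the numpy round-trip, using argmax along the class dimension. A minimal sketch with a hypothetical 4x5 one-hot matrix standing in for the 4x253 targets above:

```python
import torch

# Hypothetical 4x5 one-hot targets (stand-in for the 4x253 matrix above)
one_hot = torch.tensor([
    [0., 1., 0., 0., 0.],
    [1., 0., 0., 0., 0.],
    [0., 0., 0., 1., 0.],
    [0., 0., 1., 0., 0.],
])
new_targets = one_hot.argmax(dim=1)        # LongTensor of shape [4]
logits = torch.randn(4, 5)                 # model output, one row of logits per sample
loss = torch.nn.CrossEntropyLoss()(logits, new_targets)
print(new_targets.tolist())                # → [1, 0, 3, 2]
```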
CrossEntropyLoss is equivalent to tf.nn.softmax_cross_entropy_with_logits. The target for CrossEntropyLoss is a categorical (class-index) vector of shape [batch_size]. Use .view() to change the tensor shapes:
labels = labels.view(-1)
output = output.view(labels.size(0), -1)
loss = criterion(output, labels)
Calling .view(x, y, -1) makes the tensor use the remaining elements to fill the -1 dimension; it raises an error if there are not enough elements to make a full dimension.
labels.size(0) gives the size of the 0th dimension of the label tensor.
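A minimal runnable sketch of that reshaping (the shapes and the 5-class setup are illustrative, not from the question):

```python
import torch

criterion = torch.nn.CrossEntropyLoss()
labels = torch.tensor([[2], [0], [1], [3]])   # shape [4, 1]
output = torch.randn(4, 1, 5)                 # shape [4, 1, 5]: 4 samples, 5 classes

labels = labels.view(-1)                      # -> shape [4]
output = output.view(labels.size(0), -1)      # -> shape [4, 5]
loss = criterion(output, labels)              # scalar, always >= 0
```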
Additional
To convert between tensor types you can call the corresponding type method on the tensor, for example `labels = labels.long()`.
Second Additional
If you unpack the data from a variable like output.data, you will lose the gradients for that output and be unable to backprop when the time comes.
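To illustrate that last point, here is a small sketch using detach(), the modern equivalent of .data (this is an illustration, not the asker's code):

```python
import torch

x = torch.ones(3, requires_grad=True)
y = (x * 2).sum()          # y is part of the autograd graph
z = y.detach()             # like y.data: same value, but cut off from the graph
print(z.requires_grad)     # False -> no gradient can flow through z
y.backward()               # backprop still works through y itself
print(x.grad)              # tensor([2., 2., 2.])
```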

Understanding --readable_model and --invert_hash in vowpal wabbit for Neural Networks

I was trying to draw a diagram of the weights that Vowpal Wabbit has learned, to understand the architecture better, and got very confused about what was happening. I couldn't figure out where all the weights reported by Vowpal Wabbit fit in the structure.
My data:
$ cat dat1.vw
1 | a b
2 | a c
When doing a neural network with 2 nodes:
vw --nn 2 --invert_hash dat.nn2.ih --readable_model dat.nn2.rm dat1.vw
it gives dat.nn2.ih and dat.nn2.rm with some information like max, min, checksum etc and the weights as:
From dat.nn2.ih (from --invert_hash):
:29015:-0.3161
Constant:202096:-0.270493
Constant[1]:202097:0.214776
[1]:29016:-0.302343
[2]:156909:-0.479347
a:108232:-0.270493
a[1]:108233:0.214776
b:129036:-0.0849519
b[1]:129037:0.0473027
c:219516:-0.196927
c[1]:219517:0.172029
And from dat.nn2.rm (--readable_model):
29015:-0.3161 # <blank> ?
29016:-0.302343 # [1] ?
108232:-0.270493 # a (from input "a" to hidden node 0)
108233:0.214776 # a[1] (from input "a" to hidden node 1)
129036:-0.0849519 # b (from input "b" to hidden node 0)
129037:0.0473027 # b[1] (from input "b" to hidden node 1)
156909:-0.479347 # [2] ?
156910:0.394566 # <nonexistent> not there in .ih file ?
156911:0.69414 # <nonexistent> not there in .ih file ?
202096:-0.270493 # Constant (bias for hidden node 0)
202097:0.214776 # Constant[1] (bias for hidden node 1)
219516:-0.196927 # c (from input "c" to hidden node 0)
219517:0.172029 # c[1] (from input "c" to hidden node 1)
So I can understand a, a[1], b, b[1], c, c[1], Constant, and Constant[1], but I am unable to figure out what the rest of the hashes are for.
From my understanding, there should be 3 more weights/hashes:
- From hidden node 0 to output node
- From hidden node 1 to output node
- Bias for output node
But I see <blank>, [1], [2], and 2 hashes which are in the .rm file but not in the .ih file. What exactly do these weights represent?