What does flag `pIOK` mean? - perl

When dumping a Perl SV with Devel::Peek I see:
SV = IV(0x1c13168) at 0x1c13178
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 2
But I cannot find a description of what pIOK means.
I tried looking in Devel::Peek, perlapi, perlguts, perlxs ...
In the sources I found this:
{SVp_IOK, "pIOK,"}
But I still cannot find what SVp_IOK is. What is it?
UPD
I found this document. It sheds some light on what the flags mean and where they are located. (Beware: this document is a bit outdated.)
This flag indicates that the object has a valid non-public IVX field value. It can only be set for value type SvIV or subtypes of it.
UPD
Why private and public flags differ

pIOK is how Devel::Peek represents the bit corresponding to bit mask SVp_IOK. The p indicates a "private" flag, and it forms a pair with the "public" flag IOK (bit mask SVf_IOK).
The exact meaning of the private flags has changed across perl versions, but in general terms they mean that the IV (or NV or PV) field of the SV is "inaccurate" in some way.
The most common situation where pIOK is set on its own (pIOK is always set if IOK is set) is where a PV has been converted to a numeric NV value. The NV and IV fields are both populated, but if the IV value isn't an accurate representation of the number (i.e. it has been truncated) then pIOK is set but IOK is cleared.
This code shows a way to reach that state. Variable $pi_str is set to a string value for π, and it is converted to a floating-point value by adding 0.0 and storing the result in $pi_num. Devel::Peek now shows that NOK/pNOK and POK/pPOK are set on $pi_str, but only pIOK is set while IOK remains clear. Looking at the IV value we can see why: it is set to 3, which is the cached value of int $pi_str in case we need it again, but it is not an accurate representation of the string "3.14159" in integer form.
use strict;
use warnings 'all';
use Devel::Peek 'Dump';
my $pi_str = "3.14159";
my $pi_num = $pi_str + 0.0;
Dump $pi_str;
output
SV = PVNV(0x28fba68) at 0x3f30d30
REFCNT = 1
FLAGS = (NOK,POK,IsCOW,pIOK,pNOK,pPOK)
IV = 3
NV = 3.14159
PV = 0x3fb7ab8 "3.14159"\0
CUR = 7
LEN = 10
COW_REFCNT = 1
Perl v5.16 and earlier used the flag to indicate "magic" variables (such as tied values), because the value in the IV field could not be used directly. That changed in v5.18, and magic values now use pIOK in the same way as any other value.
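The public/private pairing described above can also be inspected from plain Perl, without reading a Devel::Peek dump. The following is a minimal sketch, assuming the core B module (its svref_2object function, the FLAGS method, and the SVf_*/SVp_* constants that mirror the C bit masks). The helper flag_names is a hypothetical name made up for this illustration. Which flags show up for the string case depends on the perl version and build, so no output is claimed for it.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list the public (SVf_*) and private (SVp_*) "OK"
# flags currently set on the scalar that $ref points to.
sub flag_names {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    my @pairs = (
        [ IOK  => B::SVf_IOK() ],
        [ NOK  => B::SVf_NOK() ],
        [ POK  => B::SVf_POK() ],
        [ pIOK => B::SVp_IOK() ],
        [ pNOK => B::SVp_NOK() ],
        [ pPOK => B::SVp_POK() ],
    );
    return join ',', map { $_->[0] } grep { $flags & $_->[1] } @pairs;
}

# A plain integer: public and private integer flags are both set.
my $count = 2;
print 'count:  ', flag_names(\$count), "\n";

# A string used in numeric context, as in the answer above.
my $pi_str = "3.14159";
my $pi_num = $pi_str + 0.0;
print 'pi_str: ', flag_names(\$pi_str), "\n";
```

For the plain integer this reports IOK,pIOK, matching the dump in the question. For $pi_str, compare the result with the FLAGS line that Devel::Peek prints on your own perl.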

Related

Systemverilog expansion rule

While reviewing some code, I found something strange.
It seems to come from expansion and operator precedence.
(I know that because "sig" is declared 'signed', $signed is unnecessary and '-sig' is the correct form, but anyway.)
reg signed [9:0] sig;
reg signed [11:0] out;
initial
begin
    $monitor ("%0t] sig=%0d, out=%0h", $time, sig, out);
    sig = 64;
    out = $signed(-sig);
    #1
    out = -$signed(sig);
    #1
    sig = -512;
    out = $signed(-sig);
    #1
    out = -$signed(sig);
    #1
    $finish;
end
The simulation result for the above code is:
0] sig=64, out=-64
2] sig=-512, out=-512
3] sig=-512, out=512
When sig=-512, I expected the 10-bit sig to be expanded to 12 bits before negation, but it was expanded after negation.
Therefore the negation of -512 was still -512, and after expansion it was still -512.
I guess $signed() blocks expansion. Any idea what happens?
First of all, -512 and 512 have the same bit pattern in a 10-bit representation, because 10 bits can only hold signed values from -512 to 511. Under this scheme, negating -512 behaves oddly. I was not able to locate anything about it in the LRM, so this is probably undefined behavior.
However, it is logical to assume that, in this scheme, simply dropping the signedness is sufficient to represent the negated value of -512. It seems that all commercial compilers on EDA Playground do this. So the result of the unary - operator in this case will be the unsigned value 512.
So, in out = $signed(-(-512)), the negation operator returns an unsigned value of 512, which the system function converts to signed. Therefore it gets sign-extended in out.
In out = -$signed(-512), the outermost negation operator returns an unsigned value of 512 for the same reason. No sign extension happens here.
You can make it signed again by wrapping it in yet another $signed: out = $signed(-$signed(-512))
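The two results in the question can be reproduced with plain two's complement arithmetic. The following Perl sketch is only an arithmetic model, not a simulation of SystemVerilog's expression sizing rules. It assumes that the only difference between the two statements is whether the negation happens at 10 bits or at 12 bits. The helper names to_bits and as_signed are made up for this illustration.

```perl
use strict;
use warnings;

# Reduce $value to a $width-bit pattern (two's complement wrap-around).
sub to_bits {
    my ($value, $width) = @_;
    return $value % (1 << $width);
}

# Read a $width-bit pattern back as a signed number.
sub as_signed {
    my ($bits, $width) = @_;
    return $bits >= (1 << ($width - 1)) ? $bits - (1 << $width) : $bits;
}

my $sig = -512;

# Order 1: negate inside 10 bits, then sign-extend the result to 12 bits.
my $neg10        = to_bits(-$sig, 10);                  # +512 does not fit: wraps to 0x200
my $narrow_first = to_bits(as_signed($neg10, 10), 12);  # sign-extend the 10-bit value

# Order 2: sign-extend to 12 bits first, then negate inside 12 bits.
my $wide        = to_bits($sig, 12);
my $widen_first = to_bits(-as_signed($wide, 12), 12);

printf "negate then extend: %d (0x%03X)\n", as_signed($narrow_first, 12), $narrow_first;
printf "extend then negate: %d (0x%03X)\n", as_signed($widen_first, 12),  $widen_first;
```

This prints -512 (0xE00) for the first ordering and 512 (0x200) for the second, which are the two values of out that the simulation shows for sig=-512.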

Perl variable assignment side effects

I'll be the first to admit that Perl is not my strong suit. But today I ran across this bit of code:
my $scaledWidth = int($width1x * $scalingFactor);
my $scaledHeight = int($height1x * $scalingFactor);
my $scaledSrc = $Media->prependStyleCodes($src, 'SX' . $scaledWidth);
# String concatenation makes this variable into a
# string, so we need to make it an integer again.
$scaledWidth = 0 + $scaledWidth;
I could be missing something obvious here, but I don't see anything in that code that could make $scaledWidth turn into a string. Unless somehow the concatenation in the third line causes Perl to permanently change the type of $scaledWidth. That seems ... wonky.
I searched a bit for "perl assignment side effects" and similar terms, and didn't come up with anything.
Can any of you Perl gurus tell me if that commented line of code actually does anything useful? Does using an integer variable in a concatenation expression really change the type of that variable?
It is only a little bit useful.
Perl can store a scalar value as a number or a string or both, depending on what it needs.
use Devel::Peek;
Dump($x = 42);
Dump($x = "42");
Outputs:
SV = PVIV(0x139a808) at 0x178a0b8
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 42
PV = 0x178d9e0 "0"\0
CUR = 1
LEN = 16
SV = PVIV(0x139a808) at 0x178a0b8
REFCNT = 1
FLAGS = (POK,pPOK)
IV = 42
PV = 0x178d9e0 "42"\0
CUR = 2
LEN = 16
The IV and IOK tokens refer to how the value is stored as a number and whether the current integer representation is valid, while PV and POK indicate the string representation and whether it is valid. Using a numeric scalar in a string context can change the internal representation.
use Devel::Peek;
$x = 42;
Dump($x);
$y = "X" . $x;
Dump($x);
SV = IV(0x17969d0) at 0x17969e0
REFCNT = 1
FLAGS = (IOK,pIOK)
IV = 42
SV = PVIV(0x139aaa8) at 0x17969e0
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 42
PV = 0x162fc00 "42"\0
CUR = 2
LEN = 16
Perl will seamlessly convert one to the other as needed, and there is rarely a need for the Perl programmer to worry about the internal representation.
I say rarely because there are some known situations where the internal representation matters.
Perl variables are not typed. Any scalar can be either a number or a string depending how you use it. There are a few exceptions where an operation is dependent on whether a value seems more like a number or string, but most of them have been either deprecated or considered bad ideas. The big exception is when these values must be serialized to a format that explicitly stores numbers and strings differently (commonly JSON), so you need to know which it is "supposed" to be.
The internal details are that an SV (scalar value) contains any of the values that have been relevant to its usage during its lifetime. So your $scaledWidth first contains only an IV (integer value) as the result of the int function. When it is concatenated, it is used as a string, so Perl generates a PV (pointer value, used for strings). The variable then contains both; it is not one type or the other. So when something like a JSON encoder needs to determine whether it's supposed to be a number or a string, it sees both in the internal state.
There have been three strategies that JSON encoders have taken to resolve this situation. Originally, JSON::PP and JSON::XS would simply consider it a string if it contains a PV, or in other words, if it's ever been used as a string; and as a number if it only has an IV or NV (double). As you alluded to, this leads to an inordinate amount of false positives.
Cpanel::JSON::XS, a fork of JSON::XS that fixes a large number of issues, along with more recent versions of JSON::PP, use a different heuristic. Essentially, a value will still be considered a number if it has a PV but the PV matches the IV or NV it contains. This, of course, still results in false positives (example: you have the string '5', and use it in a numerical operation), but in practice it is much more often what you want.
The third strategy is the most useful if you need to be sure what types you have: be explicit. You can do this by reassigning every value to explicitly be a number or string as in the code you found. This assigns a new SV to $scaledWidth that contains only an IV (the result of the addition operation), so there is no ambiguity. Another method of being explicit is using an encoding method that allows specifying the types you want, like Cpanel::JSON::XS::Type.
The details of course vary if you're not talking about the JSON format, but that is where this issue has been most deliberated. This distinction is invisible in most Perl code, where the operation, not the value, determines the type.
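The effect of the `0 + $scaledWidth` line can be watched directly on the scalar's flags. This is a minimal sketch, assuming the core B module. The helpers has_string_slot and has_integer are hypothetical names for this illustration, and $scaled_width stands in for the variable from the question. Whether the middle step sets the public POK flag or only the private one varies between perl versions, so the sketch tests the private flag and claims no output for that line.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helpers: does the scalar behind $ref carry a (private) string
# value, and does it carry a valid public integer value?
sub has_string_slot { return (B::svref_2object($_[0])->FLAGS & B::SVp_POK()) ? 1 : 0 }
sub has_integer     { return (B::svref_2object($_[0])->FLAGS & B::SVf_IOK()) ? 1 : 0 }

sub report {
    my ($label, $ref) = @_;
    printf "%-12s string=%d integer=%d\n", $label, has_string_slot($ref), has_integer($ref);
}

my $scaled_width = 42;
report('fresh:', \$scaled_width);

# Using the number in string context may cache a string form inside the scalar.
my $label = 'SX' . $scaled_width;
report('after concat:', \$scaled_width);

# Assigning the result of a numeric operation stores a value with no string form.
$scaled_width = 0 + $scaled_width;
report('after 0 + x:', \$scaled_width);
```

The first and last lines report string=0 integer=1. If the middle line reports string=1, that is the state the comment in the question's code is trying to undo before the value reaches a serializer.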

Boolean size in Ada

In my Ada project I have two different libraries with base types. I found two different definitions for a Boolean:
Library A :
type Bool_Type is new Boolean;
Library B :
type T_BOOL8 is new Boolean;
for T_BOOL8'Size use 8;
So my question is: what size is used for Bool_Type?
Bool_Type will inherit the 'Size of Boolean, which is required to be 1,
see RM 13.3(49)
Compile with switch -gnatR2 to see its representation clause. For example:
main.adb
with Ada.Text_IO; use Ada.Text_IO;
procedure Main is
   type Bool_Type is new Boolean;
   type T_BOOL8 is new Boolean;
   for T_BOOL8'Size use 8;
begin
   Put_Line ("Bool_Type'Object_Size = " & Integer'Image (Bool_Type'Object_Size));
   Put_Line ("Bool_Type'Value_Size = " & Integer'Image (Bool_Type'Value_Size));
   Put_Line ("Bool_Type'Size = " & Integer'Image (Bool_Type'Size));
   New_Line;
   Put_Line ("T_BOOL8'Object_Size = " & Integer'Image (T_BOOL8'Object_Size));
   Put_Line ("T_BOOL8'Value_Size = " & Integer'Image (T_BOOL8'Value_Size));
   Put_Line ("T_BOOL8'Size = " & Integer'Image (T_BOOL8'Size));
   New_Line;
end Main;
compiler output (partial):
Representation information for unit Main (body)
-----------------------------------------------
for Bool_Type'Object_Size use 8;
for Bool_Type'Value_Size use 1;
for Bool_Type'Alignment use 1;
for T_Bool8'Size use 8;
for T_Bool8'Alignment use 1;
program output
Bool_Type'Object_Size = 8
Bool_Type'Value_Size = 1
Bool_Type'Size = 1
T_BOOL8'Object_Size = 8
T_BOOL8'Value_Size = 8
T_BOOL8'Size = 8
As can be seen, the number returned by the 'Size / 'Value_Size attribute for Bool_Type is equal to 1 (as required by the RM; see egilhh's answer). The attribute 'Size / 'Value_Size states the number of bits used to represent a value of the type. The 'Object_Size attribute, on the other hand, equals 8 bits (1 byte) and states the amount of bits used to store a value of the given type in memory (see Simon Wright's comment). See here and here for details.
Note that the number of bits indicated by 'Size / 'Value_Size must be sufficient to uniquely represent all possible values within the (discrete) type. For Boolean-derived types, at least 1 bit is required; for an enumeration type with 3 values, for example, you need at least 2 bits.
An effect of explicitly setting the 'Size / 'Value_Size attribute can be observed when defining a packed array (as mentioned in G_Zeus’ answer):
type Bool_Array_Type is
array (Natural range 0 .. 7) of Bool_Type with Pack;
type T_BOOL8_ARRAY is
array (Natural range 0 .. 7) of T_BOOL8 with Pack;
compiler output (partial):
Representation information for unit Main (body)
-------------------------------------------------
[...]
for Bool_Array_Type'Size use 8;
for Bool_Array_Type'Alignment use 1;
for Bool_Array_Type'Component_Size use 1;
[...]
for T_Bool8_Array'Size use 64;
for T_Bool8_Array'Alignment use 1;
for T_Bool8_Array'Component_Size use 8;
Because the number of bits used to represent a value of type T_BOOL8 is forced to be 8, the size of a single component of a packed array of T_BOOL8s will also be 8, and the total size of T_BOOL8_ARRAY will be 64 bits (8 bytes). Compare this to the total length of 8 bits (1 byte) for Bool_Array_Type.
You should find your answer (or enough information to find the answer to your specific question) in the Ada wikibooks entry for 'Size attribute.
Most likely Bool_Type has the same size as Boolean: 1 bit for the type (meaning you can pack Bool_Type elements in an array, for example) and 8 bits for instances (rounded up to a full byte).
Whatever size the compiler wants, unless you override as in library B. Probably 8 bits but on some 32 bit RISC targets, 32 bits may be faster than 8. On a tiny microcontroller, 1 bit may save space.
The other answers let you find out for the specific target you compile for.
As your booleans are separate types, you need type conversions between them, providing hooks for the compiler to handle any format or size conversion without any further ado.

Tracking leak in a big async Perl process

I apologize in advance: this post does not contain a code sample.
I was assigned the task of debugging a memory leak in a module.
In this program I have a management object that holds data and other objects. The program uses async methods that update the management object from time to time.
I used the Perl module Devel::Peek to dump the object, and I was curious about the reference count.
Since I am using a local variable to print this object, the parent refcount is always 1, as expected.
At the second level, the refcount of the real management object is always greater than 1.
All other levels are always 1, as expected.
Here is an example:
SV = RV(0xbb3e244) at 0xbb3e238
REFCNT = 1
FLAGS = (PADMY,ROK)
RV = 0xcf19478
SV = PVHV(0xd0e1f98) at 0xcf19478
REFCNT = 6
FLAGS = (PADMY,OBJECT,OOK,SHAREKEYS)
STASH = 0x9b116a0 "<XXXXX>"
ARRAY = 0xd0ff190 (0:106, 1:104, 2:34, 3:10, 4:2)
hash quality = 105.4%
KEYS = 210
FILL = 150
MAX = 255
RITER = -1
EITER = 0x0
Elt "<XXXXX>" HASH = 0x10b5af01
SV = PVIV(0xce05510) at 0xcf07ba8
REFCNT = 1
FLAGS = (IOK,POK,pIOK,pPOK)
IV = 16200
PV = 0xd0fc0d8 "16200"\0
CUR = 5
LEN = 8
Elt "<XXXXX>" HASH = 0x3ebbb602
SV = PV(0xd10c810) at 0xcfb4350
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0xd2008d8 "<XXX>"\0
CUR = 4
LEN = 8
Elt "<XXXXX>" HASH = 0x1c7c0002
SV = RV(0xcf197f4) at 0xcf197e8
REFCNT = 1
FLAGS = (ROK)
RV = 0xd456ba0
SV = PVHV(0xd66a11c) at 0xd456ba0
REFCNT = 1
FLAGS = (PADMY,OOK,SHAREKEYS)
ARRAY = 0xd19a8d8 (0:3, 1:3, 2:2)
hash quality = 111.4%
KEYS = 7
FILL = 5
MAX = 7
RITER = -1
EITER = 0x0
Elt "<XXXXX>" HASH = 0x2d2f24a1
SV = RV(0xc2e3fcc) at 0xc2e3fc0
REFCNT = 1
FLAGS = (ROK)
RV = 0xd550548
I want to understand how reference counting works.
If I understand correctly, the management object pointer is referenced from several locations, while the internal objects are referenced only once, from the management object.
Is it possible that updating internal fields of the management object from several locations causes a memory leak?
A typical problem within async (event driven) programs is that objects are often referenced from within callbacks which are attached to some event loop and that one has to be really careful to clean everything up on error. Strategic uses of weaken from Scalar::Util helps here a lot.
But once you have the mess, it is really hard to debug. I usually use my own module Devel::TrackObjects to track down objects that do not get destroyed as expected, and I consider it easier to use for this purpose than Devel::Peek. But Devel::TrackObjects can only deal with objects and does not help with other kinds of circular references.
Well, it's hard to answer directly, without any sort of idea what you're actually doing.
But yes - perl uses reference counting to determine if memory is still 'in use'. It's perfectly possible to cause a circular reference, and thus that memory will never be eligible to 'free' and thus it will leak.
The way you can avoid this is via the Scalar::Util module, and the weaken function call - that allows a reference to exist, but not 'count' for reference counting.
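The difference between a strong cycle and a weakened one can be shown with a small, self-contained sketch. It assumes only the core Scalar::Util module. The class Demo::Node, the helper build_pair, and the "manager"/"worker" names are hypothetical stand-ins for the management object and one of its internal objects, not the poster's actual code.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Records which objects have had their destructor run.
my %destroyed;

{
    package Demo::Node;
    sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub DESTROY { $destroyed{ $_[0]{name} } = 1 }
}

# Build a manager and a worker that point at each other, then let the
# lexical variables go out of scope.
sub build_pair {
    my ($prefix, $use_weaken) = @_;
    my $manager = Demo::Node->new("$prefix-manager");
    my $worker  = Demo::Node->new("$prefix-worker");
    $manager->{worker} = $worker;
    $worker->{manager} = $manager;               # back-reference closes the cycle
    weaken($worker->{manager}) if $use_weaken;   # back-reference no longer counted
    return;
}

build_pair('strong', 0);
build_pair('weak',   1);

for my $prefix (qw(strong weak)) {
    my $freed = $destroyed{"$prefix-manager"} && $destroyed{"$prefix-worker"};
    printf "%-6s cycle freed: %s\n", $prefix, $freed ? 'yes' : 'no';
}
```

The strong pair is reported as not freed because each object keeps the other's reference count above zero; the weakened pair is freed as soon as build_pair returns. Callbacks registered with an event loop can create the same kind of cycle when the closure captures the object that owns it.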

Why does sv_setref_pv() store its void * argument in the IV slot?

When looking at the Perl API, I was wondering why
sv_setref_iv() stores its IV argument in the IV slot,
sv_setref_nv() stores its NV argument in the NV slot,
but sv_setref_pv() stores its void * argument in the IV slot, instead of the PV slot?
I have a hunch (the CUR and LEN fields wouldn't make sense for such a variable), but I'd like to have the opinion of someone knowledgeable in XS :-)
There are many different types of scalars.
SvNULL isn't capable of holding any value except undef.
SvIV is capable of holding an IV, UV or RV.
SvNV is capable of holding an NV.
SvPV is capable of holding a PV.
SvPVIV is capable of holding a PV, as well as an IV, UV or RV.
...
AV, HV, CV, GV are really just types of scalar too.
Note I said "capable" of holding. You can think of scalars as being objects, and the above as being classes and subclasses. Each of the above has a different structure.
Having these different types of scalars allows one to save memory.
An SvIV (the smallest scalar type capable of holding an IV) is smaller than an SvPV (the smallest scalar type capable of holding a PV).
$ perl -le'
use Devel::Size qw( total_size );
use Inline C => <<'\''__EOI__'\'';
void upgrade_to_iv(SV* sv) {
SvUPGRADE(sv, SVt_IV);
}
void upgrade_to_pv(SV* sv) {
SvUPGRADE(sv, SVt_PV);
}
__EOI__
{ my $x; upgrade_to_iv($x); print total_size($x); }
{ my $x; upgrade_to_pv($x); print total_size($x); }
'
24
40
Using an SvIV instead of an SvPV is a savings of 16 bytes per reference.