Search objects with size larger than a threshold - WinDbg

One class has many objects present in the .NET heap, as discovered through the following SOS command.
!dumpheap -stat -type MyClass
Statistics:
MT Count TotalSize Class Name
00007ff8e6253494 1700 164123 MyNameSpace.MyClass
I need to find the instances whose ObjSize is greater than 5 MB. I know I can list the objsize of all 1700 instances of MyClass using the following.
.foreach (res {!DumpHeap -short -MT 00007ff8e6253494 }) {.if ( (!objsize res) > 41943040) {.echo res; !objsize res}}
With the script above I don't get any results, although there are object instances larger than 5 MB. I think the problem may be that the output of !objsize looks as follows:
20288 (0x4f40) bytes
It's a string, which makes it harder to compare against a threshold. How can I get this script to list only the objects whose objsize is larger than 5 MB?

Creating complex scripts in WinDbg is quite error prone. In such situations, I switch to PyKd, which is a WinDbg extension that uses Python.
In the following, I'll only cover the missing piece of your puzzle, which is the part that does not work:
.if ( (!objsize res) > 41943040) {.echo res; !objsize res}
Here's my starting point:
0:009> !dumpheap -min 2000
Address MT Size
00000087c6041fe8 000007f81ea5f058 10158
00000087d6021018 000007f81ea3f1b8 8736
00000087d6023658 000007f81ea3f1b8 8192
00000087d6025658 000007f81ea3f1b8 16352
00000087d6029638 000007f81ea3f1b8 32672
You can write a script like this (no error handling!)
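from pykd import *
import re
import sys
# Run !objsize on the address passed as the first script argument
objsizeStr = dbgCommand("!objsize " + sys.argv[1])
# Capture the decimal size between "= " and "(0x"
number = re.search(r"= (.*)\(0x", objsizeStr)
size = int(number.group(1))
if size > 10000:
    # Print address and size; this form works on Python 2 and Python 3
    print("%s %d" % (sys.argv[1], size))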
and use it within your loop:
0:009> .foreach (res {!dumpheap -short -min 2000}) { !py c:\tmp\size.py ${res}}
00000087c6041fe8 10160
00000087d6021018 37248
00000087d6023658 27360
00000087d6025658 54488
00000087d6029638 53680
Note how the size of !objsize differs from that of !dumpheap. Just for cross-checking:
0:009> !objsize 00000087d6023658
sizeof(00000087d6023658) = 27360 (0x6ae0) bytes (System.Object[])
See also this answer on how to improve the script using expr() so that you can pass expressions etc. The way I did it now outputs the size in decimal, but that's not explicit. Maybe you want to output a 0n prefix to make it clear.
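The parsing step of the script above can be tried outside the debugger. The following is only a sketch: it assumes the !objsize output has the "N (0x...) bytes" shape shown in this thread, and the names parse_objsize and exceeds are made up for illustration (they are not part of PyKd or SOS).

```python
import re

# Matches the decimal size in lines such as:
#   sizeof(00000087d6023658) = 27360 (0x6ae0) bytes (System.Object[])
#   20288 (0x4f40) bytes
# Assumption: the decimal value is directly followed by "(0x...) bytes".
OBJSIZE_RE = re.compile(r"(\d+)\s*\(\s*0x[0-9a-fA-F]+\)\s*bytes")

def parse_objsize(output):
    """Return the decimal byte count from !objsize output, or None if absent."""
    match = OBJSIZE_RE.search(output)
    return int(match.group(1)) if match else None

def exceeds(output, threshold):
    """True when a size was found and it is larger than threshold (bytes)."""
    size = parse_objsize(output)
    return size is not None and size > threshold

sample = "sizeof(00000087d6023658) = 27360 (0x6ae0) bytes (System.Object[])"
print(parse_objsize(sample))             # decimal size as an int
print(exceeds(sample, 10000))            # threshold used in the script above
print(exceeds(sample, 5 * 1024 * 1024))  # 5 MB expressed in bytes
```

Note that 5 MB in bytes is 5 * 1024 * 1024 = 5242880, whereas the 41943040 used in the question's script equals 40 * 1024 * 1024.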

Well, as Steve commented, !dumpheap takes -min and -max parameters, and with those it should be possible to do this natively.
0:004> !DumpHeap -type System.String -stat
Statistics:
MT Count TotalSize Class Name
6588199c 1 12 System.Collectionsxxxxx
65454aec 1 48 System.Collectionsxxxxx
65881aa8 1 60 System.Collectionsxxxxx
6587e388 17 596 System.String[]
6587d834 168 5300 System.String
Total 188 objects
0:004> !DumpHeap -type System.String -stat -min 0n64 -max 0n100
Statistics:
MT Count TotalSize Class Name
6587e388 3 212 System.String[]
6587d834 9 684 System.String
Total 12 objects
0:004> !DumpHeap -type System.String -min 0n64 -max 0n100
Address MT Size
01781280 6587d834 76
01781354 6587d834 78
01781478 6587e388 84
017816d8 6587d834 64
01781998 6587d834 78
017819e8 6587d834 70
01781a30 6587d834 82
01782974 6587d834 78
01782a6c 6587d834 90
01782c7c 6587d834 68
01783720 6587e388 64
01783760 6587e388 64
Statistics:
MT Count TotalSize Class Name
6587e388 3 212 System.String[]
6587d834 9 684 System.String
Total 12 objects
By manipulating -min and -max we can fine-tune the result down to just one or two objects.
Here is an example that picks up 1 extra object on the upper side and 2 extra objects on the lower side
compared with the preceding output (15 objects versus 12 objects):
0:004> !DumpHeap -type System.String -min 0n62 -max 0n106
Address MT Size
01781280 6587d834 76
01781354 6587d834 78
017813e8 6587d834 62
01781478 6587e388 84
017816d8 6587d834 64
01781898 6587d834 106
01781998 6587d834 78
017819e8 6587d834 70
01781a30 6587d834 82
01782974 6587d834 78
01782a6c 6587d834 90
01782c7c 6587d834 68
01783720 6587e388 64
01783760 6587e388 64
01783e4c 6587d834 62
Statistics:
MT Count TotalSize Class Name
6587e388 3 212 System.String[]
6587d834 12 914 System.String
Total 15 objects
If you need both the address and the size for some reason, you can always awk it:
0:004> .shell -ci "!DumpHeap -type System.String -min 0n62 -max 0n106" awk "{print $1,$3}"
Address Size
01781280 76
01781354 78
017813e8 62
01781478 84
017816d8 64
01781898 106
01781998 78
017819e8 70
01781a30 82
01782974 78
01782a6c 90
01782c7c 68
01783720 64
01783760 64
01783e4c 62
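If awk is not at hand, the same column extraction can be sketched in Python on captured !dumpheap text. This assumes the three-column "Address MT Size" layout shown above, with hexadecimal address and MT and a decimal size; parse_dumpheap is a hypothetical helper name. Keep in mind, as the first answer notes, that these sizes differ from what !objsize reports.

```python
def parse_dumpheap(text):
    """Yield (address, size) pairs from the 'Address MT Size' rows."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # statistics rows and blank lines have other shapes
        address, mt, size = parts
        try:
            int(address, 16)   # address must be hexadecimal
            int(mt, 16)        # method table must be hexadecimal
            yield address, int(size)
        except ValueError:
            continue           # header line or "Total N objects"

# A few rows copied from the output above.
sample = """Address MT Size
01781280 6587d834 76
01781478 6587e388 84
01781898 6587d834 106
01783e4c 6587d834 62
Statistics:
MT Count TotalSize Class Name
"""

rows = list(parse_dumpheap(sample))
large = [(addr, size) for addr, size in rows if size > 80]
for addr, size in large:
    print(addr, size)
```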

Related

AVX512 or AVX2 vector permutations, group uint16 elements by strides of 16 (32 bytes)

I have 64 two-byte (short) numbers in memory like this: 0 1 2 3 ... 63. I want to shuffle them so that they look like this in memory:
0 16 32 48 1 17 33 49 2 18 34 50 ... 15 31 47 63
What is the most effective way to do this using avx2 or avx512?
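To pin down the requested layout: it is the transpose of the 64 values viewed as a 4 x 16 matrix. The sketch below only derives that index mapping in Python (stride_permutation is a made-up name); it says nothing about which AVX2 or AVX-512 instruction sequence is fastest.

```python
ROWS, COLS = 4, 16  # 64 uint16 values viewed as 4 rows of 16

def stride_permutation(rows=ROWS, cols=COLS):
    """Source index for each output slot, i.e. out[k] = src[perm[k]]."""
    return [r * cols + c for c in range(cols) for r in range(rows)]

perm = stride_permutation()
src = list(range(ROWS * COLS))
out = [src[i] for i in perm]
print(out[:12])   # first twelve output elements
print(out[-4:])   # last four output elements
```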

How to apply an "interface" method to a set of rows in kdb?

Sorry if this is a newbie question again.
I am trying to replicate the functionality of interfaces as seen in c++, rust etc. in kdb as is shown in a simple demonstration below:
q).iface.a.fun:{x*y+z}
q).iface.b.fun:{x*x+y+z}
q)ifaces:`a`b; // for demonstration purposes
q)tab:([]time:`datetime$();kind:`ifaces$();x:`long$();y:`long$();z:`long$());
q)n:10;
q)tab,:flip(n#.z.z;n?ifaces;n?10;n?10;n?10)
Now you would assume that the kind would be able to reference the `a`b fun methods of the iface interface as follows:
q)?[`tab;();0b;`max`ifaceval!((max;`x);(`.iface;`kind;`fun;`x;`y;`z))]
evaluation error:
fun
[0] ?[`tab;();0b;`max`ifaceval!((max;`x);(`.iface;`kind;`fun;`x;`y;`z))]
^
Obviously the functional nature of the select inhibits referencing the fun method on account of the symbol type field declarations.
You can avert this error by using enlist as follows:
q)?[`tab;();0b;`max`ifaceval!((max;`x);(`.iface;`kind;enlist`fun;`x;`y;`z))]
max ifaceval ..
-----------------------------------------------------------------------------..
9 77 154 95 65 0 128 153 126 60 49 77 154 95 65 0 128 153 126 60 49 77 154 ..
However this duplicates the result of fun for each row.
How might one effectively go about this without getting the above malformed responses?
Thanks again.
Selecting ifaceval first will ensure each row is returned. max x is a scalar, which forces all the ifaceval entries into one row. The scalar will be expanded across all rows if a vector column precedes it.
q)?[`tab;();0b;`ifaceval`max!((`.iface;`kind;enlist`fun;`x;`y;`z);(max;`x))]
ifaceval max
-------------------------------------
160 11 126 28 32 60 76 10 112 168 8
96 10 77 24 16 35 60 6 63 104 8
96 10 77 24 16 35 60 6 63 104 8
96 10 77 24 16 35 60 6 63 104 8
96 10 77 24 16 35 60 6 63 104 8
160 11 126 28 32 60 76 10 112 168 8
96 10 77 24 16 35 60 6 63 104 8
160 11 126 28 32 60 76 10 112 168 8
160 11 126 28 32 60 76 10 112 168 8
160 11 126 28 32 60 76 10 112 168 8
I'm not sure if this is exactly what you're looking for though. If you want to calculate ifaceval for each row in the table, this should work.
q)?[tab;();0b;`ifaceval`max!(((';(`.iface;::;enlist`fun));`kind;`x;`y;`z);(max;`x))]
ifaceval max
------------
160 8
10 8
77 8
24 8
16 8
60 8
60 8
10 8
112 8
168 8
One point to make is that it's probably best to avoid using kdb keywords for column names. Although it works in functional queries, it does not for qSQL ones.
q)select max:max x from tab
'assign
[0] select max:max x from tab
^
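For readers less familiar with q, the per-row dispatch that the second functional select performs can be mirrored in Python. This is a sketch only: the rows are hypothetical (not the random table from the question) and IFACE and iface_vals are illustrative names. Since q evaluates right to left, {x*y+z} means x*(y+z) and {x*x+y+z} means x*(x+(y+z)).

```python
# One "fun" per kind, written with explicit parentheses to match q's
# right-to-left evaluation order.
IFACE = {
    "a": lambda x, y, z: x * (y + z),
    "b": lambda x, y, z: x * (x + (y + z)),
}

def iface_vals(rows):
    """Apply the fun belonging to each row's kind to that row's x, y, z."""
    return [IFACE[kind](x, y, z) for kind, x, y, z in rows]

# Hypothetical rows in the order (kind, x, y, z).
rows = [("a", 2, 3, 4), ("b", 2, 3, 4), ("a", 5, 0, 1)]

vals = iface_vals(rows)                # one value per row
max_x = max(x for _, x, _, _ in rows)  # scalar, repeated on every row in q
for val in vals:
    print(val, max_x)
```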

How can I interrupt a 'loop' in kdb?

numb is a list of numbers:
q))input
42 58 74 51 63 23 41 40 43 16 64 29 35 37 30 3 34 33 25 14 4 39 66 49 69 13..
31 41 39 27 9 21 7 25 34 52 60 13 43 71 10 42 19 30 46 50 17 33 44 28 3 62..
15 57 4 55 3 28 14 21 35 29 52 1 50 10 39 70 43 53 46 68 40 27 13 69 20 49..
3 34 11 53 6 5 48 51 39 75 44 32 43 23 30 15 19 62 64 69 38 29 22 70 28 40..
18 30 60 56 12 3 47 46 63 19 59 34 69 65 26 61 50 67 8 71 70 44 39 16 29 45..
I want to iterate through each row and calculate the sum of the first 2, then 3, then 4 numbers, and so on. If that sum is greater than 1000 I want to stop the iteration on that particular row, jump to the next row and do the same thing. This is my code:
{[input]
tot::tot+{[x;y]
if[1000<sum x;:count x;x,y]
}/[input;input]
}each numb
My problem here is that after the count of x is added to tot the over keeps going on the same row. How can I exit over and jump on the next row?
UPDATE: (QUESTION STILL OPEN) I do appreciate all the answers so far but I am not looking for an efficient way to sum the first n numbers. My question is how do I break the over and jump on the next line. I would like to achieve the same thing as with those small scripts:
C++
for (int i = 0; i <= 100; i++) {
    if (i == 50) { printf("for loop exited at: %i ", i); break; }
}
Python
for i in range(100):
    if i == 50:
        print(i)
        break
R
for(i in 1:100){
    if(i == 50){
        print(i)
        break
    }
}
I think this is what you are trying to accomplish.
sum {(x & sums y) ? x}[1000] each input
It takes a cumulative sum of each row and takes an element wise minimum between that sum and the input limit thereby capping the output at the limit like so:
q)(100 & sums 40 43 16 64 29)
40 83 99 100 100
It then uses the ? operator to find the first occurrence of that limit (i.e. the element where the limit was equaled or passed). Because the index is 0-based, it equals the number of elements seen before the limit was reached. In the example the first 100 occurs after 3 elements. You might want to add one to include the element that reaches the limit in the count.
q)40 83 99 100 100 ? 100
3
And then it sums this count over all rows of the input.
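The cap-and-find idea above, next to the explicit early exit the question asks about, can be sketched in Python as follows. The function names are made up for illustration, and returning the row length when the limit is never reached mirrors what q's ? does for a missing item.

```python
from itertools import accumulate

def first_reach(limit, row):
    """0-based index of the first running sum >= limit, or len(row) if none."""
    capped = [min(limit, s) for s in accumulate(row)]  # like limit & sums row
    return capped.index(limit) if limit in capped else len(row)

def first_reach_loop(limit, row):
    """Same result, written as a loop that breaks out early."""
    running = 0
    for i, value in enumerate(row):
        running += value
        if running >= limit:
            break          # leave this row, the caller moves on to the next
    else:
        return len(row)    # limit never reached
    return i

def total(limit, rows):
    """Sum the per-row indexes, as the q one-liner does."""
    return sum(first_reach(limit, row) for row in rows)

print(first_reach(100, [40, 43, 16, 64, 29]))
print(first_reach_loop(100, [40, 43, 16, 64, 29]))
```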
You could use converge in this case to exit when you fail to satisfy a condition:
https://code.kx.com/q/ref/adverbs/#converge-repeat
The first parameter would be a function that does your check based on the current value of x which will be the next value to be passed in the main function.
For your example I've made a projection using the main input line, then increased the index of what I am summing each time:
q)numb
98 11 42 97 89 80 73 35 4 30
86 33 38 86 26 15 83 71 21 22
23 43 41 80 56 11 22 28 47 57
q){[input] {x+1}/[{100>sum (y+1)#x}[input;];0] }each numb
1 1 2
This returns the first index in each row at which the running sum is no longer below 100.
However, this isn't really an ideal use case for kdb.
It could instead be done with something like:
(sums#/:numb) binr\: 100
Maybe your real example makes more sense.
You can use while loops in KDB although all KDB developers are generally too afraid of being openly mocked and laughed at for doing so
q){i:0;while[i<>50;i+:1];:"loop exited at ",string i}`
"loop exited at 50"
Kdb does have a "stop loop" mechanism but only in the case of a monadic function with single seed value
/keep squaring until number is no longer less than 1000, starting at 2
q){x*x}/[{x<1000};2]
65536
/keep dealing random numbers under 20 until you get an 18 (seed value 0 is irrelevant)
q){first 1?20}\[18<>;0]
0 19 17 12 15 10 18
However this doesn't really fit your use case and as other people have pointed out, this is not how you would/should solve this problem in kdb.
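For comparison, the "keep applying while the condition holds" form used in {x*x}/[{x<1000};2] can be mirrored in Python like this. It is only a sketch with illustrative names, not a description of how kdb implements it.

```python
def iterate_while(func, cond, seed):
    """Apply func repeatedly while cond(current value) holds; return the last value."""
    value = seed
    while cond(value):
        value = func(value)
    return value

def scan_while(func, cond, seed):
    """Same loop, but keep every intermediate value, seed included."""
    values = [seed]
    while cond(values[-1]):
        values.append(func(values[-1]))
    return values

# Keep squaring until the number is no longer less than 1000, starting at 2.
print(iterate_while(lambda x: x * x, lambda x: x < 1000, 2))
print(scan_while(lambda x: x * x, lambda x: x < 1000, 2))
```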

Equivalent of *nix fold in PowerShell

Today I had a few hundred items (IDs from a SQL query) and needed to paste them into another query in a form readable by an analyst. I needed the *nix fold command. I wanted to take the 300 lines and reformat them as multiple numbers per line separated by a space. I would have used fold -w 100 -s.
Similar tools on *nix include fmt and par.
On Windows, is there an easy way to do this in PowerShell? I expected one of the Format-* cmdlets to do it, but I couldn't find one. I'm using PowerShell v4.
See https://unix.stackexchange.com/questions/25173/how-can-i-wrap-text-at-a-certain-column-size
# Input Data
# simulate a set of 300 numeric IDs from 100,000 to 150,000
100001..100330 |
Out-File _sql.txt -Encoding ascii
# I want output like:
# 100001, 100002, 100003, 100004, 100005, ... 100010, 100011
# 100012, 100013, 100014, 100015, 100016, ... 100021, 100021
# each line less than 100 characters.
Depending on how big the file is you could read it all into memory, join it with spaces and then split on 100* characters or the next space
(Get-Content C:\Temp\test.txt) -join " " -split '(.{100,}?[ |$])' | Where-Object{$_}
That regex looks for at least 100 characters and then the first space after that. The -split operator splits on that match, but since the pattern is wrapped in parentheses the match is returned instead of discarded. The Where-Object removes the empty entries that are created in between the matches.
Small sample to prove theory
@"
134
124
1
225
234
4
34
2
42
342
5
5
2
6
"@.split("`n") -join " " -split '(.{10,}?[ |$])' | Where-Object{$_}
The above splits on 10 characters where possible. If it cannot the numbers are still preserved. Sample is based on me banging on the keyboard with my head.
134 124 1
225 234 4
34 2 42
342 5 5
2 6
You could then make this into a function to get the simplicity back that you are most likely looking for. It can get better but this isn't really the focus of the answer.
Function Get-Folded{
    Param(
        [string[]]$Strings,
        [int]$Wrap = 50
    )
    $strings -join " " -split "(.{$wrap,}?[ |$])" | Where-Object{$_}
}
Again with the samples
PS C:\Users\mcameron> Get-Folded -Strings (Get-Content C:\temp\test.txt) -wrap 40
"Lorem ipsum dolor sit amet, consectetur
adipiscing elit, sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua. Ut enim
ad minim veniam, quis nostrud exercitation
... output truncated...
You can see that it was supposed to split on 40 characters but the second line is longer. It split on the next space after 40 to preserve the word.
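For comparison with fold -s, here is a minimal Python sketch using the standard textwrap module. Unlike the regex above, which splits at the first space after the requested width, textwrap breaks before the width is exceeded. The fold helper name and the trailing-comma decoration are just for illustration.

```python
import textwrap

def fold(items, width=100, sep=" "):
    """Join items with sep and wrap at whitespace so no line exceeds width."""
    text = sep.join(str(item) for item in items)
    return textwrap.wrap(text, width=width,
                         break_long_words=False, break_on_hyphens=False)

# The IDs from the question, each followed by a comma.
tokens = [f"{i}," for i in range(100001, 100331)]
lines = fold(tokens, width=100)

print(lines[0])
print(max(len(line) for line in lines))  # longest line, never above the width
```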
If it's one item per line, and you want to join every 100 items onto a single line separated by a space you could put all the output into a text file then do this:
gc c:\count.txt -readcount 100 | % {$_ -join " "}
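The same "join every N items" grouping can be sketched in Python. The helper name join_in_groups is made up, and the input is assumed to be one item per element.

```python
def join_in_groups(items, count, sep=" "):
    """Join every `count` consecutive items into one line."""
    items = list(items)
    return [sep.join(str(item) for item in items[start:start + count])
            for start in range(0, len(items), count)]

# Seven items in groups of three; the last line holds the remainder.
grouped = join_in_groups(range(1, 8), 3)
for line in grouped:
    print(line)
```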
When I saw this, the first thing that came to my mind was abusing Format-Table to do this, mostly because it knows how to break the lines properly when you specify a width. After coming up with a function, it seems that the other solutions presented are shorter and probably easier to understand, but I figured I'd still go ahead and post this solution anyway:
function fold {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline)]
        $InputObject,
        [Alias('w')]
        [int] $LineWidth = 100,
        [int] $ElementWidth
    )
    begin {
        $SB = New-Object System.Text.StringBuilder
        if ($ElementWidth) {
            $SBFormatter = "{0,$ElementWidth} "
        }
        else {
            $SBFormatter = "{0} "
        }
    }
    process {
        foreach ($CurrentObject in $InputObject) {
            [void] $SB.AppendFormat($SBFormatter, $CurrentObject)
        }
    }
    end {
        # Format-Table wanted some sort of an object assigned to it, so I
        # picked the first static object that popped in my head:
        ([guid]::Empty | Format-Table -Property @{N="DoesntMatter"; E={$SB.ToString()}; Width = $LineWidth } -Wrap -HideTableHeaders |
            Out-String).Trim("`r`n")
    }
}
Using it gives output like this:
PS C:\> 0..99 | Get-Random -Count 100 | fold
1 73 81 47 54 41 17 87 2 55 30 91 19 50 64 70 51 29 49 46 39 20 85 69 74 43 68 82 76 22 12 35 59 92
13 3 88 6 72 67 96 31 11 26 80 58 16 60 89 62 27 36 37 18 97 90 40 65 42 15 33 24 23 99 0 32 83 14
21 8 94 48 10 4 84 78 52 28 63 7 34 86 75 71 53 5 45 66 44 57 77 56 38 79 25 93 9 61 98 95
PS C:\> 0..99 | Get-Random -Count 100 | fold -ElementWidth 2
74 89 10 42 46 99 21 80 81 82 4 60 33 45 25 57 49 9 86 84 83 44 3 77 34 40 75 50 2 18 6 66 13
64 78 51 27 71 97 48 58 0 65 36 47 19 31 79 55 56 59 15 53 69 85 26 20 73 52 68 35 93 17 5 54 95
23 92 90 96 24 22 37 91 87 7 38 39 11 41 14 62 12 32 94 29 67 98 76 70 28 30 16 1 61 88 43 8 63
72
PS C:\> 0..99 | Get-Random -Count 100 | fold -ElementWidth 2 -w 40
21 78 64 18 42 15 40 99 29 61 4 95 66
86 0 69 55 30 67 73 5 44 74 20 68 16
82 58 3 46 24 54 75 14 11 71 17 22 94
45 53 28 63 8 90 80 51 52 84 93 6 76
79 70 31 96 60 27 26 7 19 97 1 59 2
65 43 81 9 48 56 25 62 13 85 47 98 33
34 12 50 49 38 57 39 37 35 77 89 88 83
72 92 10 32 23 91 87 36 41
This is what I ended up using.
# simulate a set of 300 SQL IDs from 100,000 to 150,000
100001..100330 |
%{ "$_, " } | # I'll need this decoration in the SQL script
Out-File _sql.txt -Encoding ascii
gc .\_sql.txt -ReadCount 10 | %{ $_ -join ' ' }
Thanks everyone for the effort and the answers. I'm really surprised there wasn't a way to do this with Format-Table without the use of [guid]::Empty in Rohn Edward's answer.
My IDs are much more consistent than the example I gave, so Noah's use of gc -ReadCount is by far the simplest solution in this particular data set, but in the future I'd probably use Matt's answer or the answers linked to by Emperor in comments.
I came up with this:
$array =
(@'
1
2
3
10
11
100
101
'@).split("`n") |
    foreach {$_.trim()}
$array = $array * 40
$SB = New-Object Text.StringBuilder(100,100)
foreach ($item in $array) {
    Try { [void]$SB.Append("$item ") }
    Catch {
        $SB.ToString()
        [void]$SB.Clear()
        [Void]$SB.Append("$item ")
    }
}
#don't forget the last line
$SB.ToString()
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101 1 2 3 10 11 100 101
Maybe not as compact as you were hoping for, and there may be better ways to do it, but it seems to work.

Function readmtx on matlab

I want to read a matrix that is on my matlab path. I was using the function readmtx but I don't know what to put on 'precision' (mtx = readmtx(fname,nrows,ncols,precision)).
I was wondering if you could help me with that. Or suggest a better way to read the matrix
You can read a matrix from a text file with the load command. If the first line contains text, it should start with %.
Note that each row of the text file should hold the values of one matrix row, separated by spaces, for example:
%C1 C2 C3
1 2 3
4 5 6
7 8 9
Then, if you use load command you can read the text file into a matrix, something like:
myMatrix = load('textFileName.txt')
Now, Let's talk about readmtx ;)
About precision as described here:
Both binary and formatted data files can be read. If the file is binary, the precision argument is a format string recognized by fread. Repetition modifiers such as '40*char' are not supported. If the file is formatted, precision is a fscanf and sscanf-style format string of the form '%nX', where n is the number of characters within which the formatted data is found, and X is the conversion character such as 'g' or 'd'. Fortran-style double-precision output such as '0.0D00' can be read using a precision string such as '%nD', where n is the number of characters per element. This is an extension to the C-style format strings accepted by sscanf. Users unfamiliar with C should note that '%d' is preferred over '%i' for formatted integers. MATLAB syntax follows C in interpreting '%i' integers with leading zeros as octal. Formatted files with line endings need to provide the number of trailing bytes per row, which can be 1 for platforms with carriage returns or linefeed (Macintosh, UNIX®), or 2 for platforms with carriage returns and linefeeds (DOS).
Check this example also:
Write and read a binary matrix file:
fid = fopen('binmat','w');
fwrite(fid,1:100,'int16');
fclose(fid);
mtx = readmtx('binmat',10,10,'int16')
mtx =
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
mtx = readmtx('binmat',10,10,'int16',[2 5],3:2:9)
mtx =
13 15 17 19
23 25 27 29
33 35 37 39
43 45 47 49
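To illustrate what the 'int16' precision means for a binary file, the example above can be mirrored in Python with the standard array module: each element is a 2-byte signed integer, and the values are laid out row by row, as the readmtx output shows. This is only a sketch of the file layout under those assumptions, not of how readmtx is implemented; the helper names and the temporary path are made up.

```python
import os
import tempfile
from array import array

def write_int16(path, values):
    """Write values as consecutive 16-bit signed integers (native byte order)."""
    with open(path, "wb") as handle:
        array("h", values).tofile(handle)

def read_matrix_int16(path, nrows, ncols):
    """Read nrows*ncols int16 values and return them as a list of rows."""
    data = array("h")
    with open(path, "rb") as handle:
        data.fromfile(handle, nrows * ncols)
    return [list(data[r * ncols:(r + 1) * ncols]) for r in range(nrows)]

path = os.path.join(tempfile.mkdtemp(), "binmat")
write_int16(path, range(1, 101))
mtx = read_matrix_int16(path, 10, 10)

# Rows 2 to 5 and columns 3, 5, 7, 9 (1-based), as in the second readmtx call.
sub = [[row[c - 1] for c in range(3, 10, 2)] for row in mtx[1:5]]
print(mtx[0])
print(sub[0])
```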