Windbg - How can I Dump Strings which match a given filter - windbg

One can dump all the string using the following command
!dumpheap -type System.string
How can dump or print only those string which starts or contains a specific "string"
Example. I am only intrested to view the string which contains "/my/app/request"

Use sosex instead of sos for this. It has a !strings command which allows you to filter strings using the /m:<filter> option.

Use !sosex.strings. See !sosex.help for options to filter strings based on content and/or length.

Not sure if !dumpheap supports that. You can always use .logopen to redirect the output to a file and post-process that. For a more elegant (and thus more complicated) solution, you can also use .shell to redirect the command output to a shell process for parsing. Here's an example:
http://blogs.msdn.com/b/baleixo/archive/2008/09/06/using-shell-to-search-text.aspx
You can also see the .shell documentation for more details:
http://msdn.microsoft.com/en-us/library/windows/hardware/ff565339(v=vs.85).aspx

If you really want to go without SOSEX, then try
.foreach (string {!dumpheap -short -type System.String}) { .foreach (search {s -u ${string}+c ${string}+c+2*poi(${string}+8) "mySearchTerm"}) { du /c80 ${string}+c }}
It uses
!dumpheap to get all Strings on .NET heap
.foreach to iterate over them
s to search for a substring
.foreach again to find out if s found something
some offset calculations to get the first character (+c) of the string and the string length (+8) (multiplied by 2 to get bytes instead of characters). Those need to be adapted in case of 64 bit applications
The /c80 is just for nicer output. You could also use !do ${string} instead of du /c80 ${string}+c if you like the .NET details of the String.

Related

Perl interface with Aspell

I am trying to identify misspelled words with Aspell via Perl. I am working on a Linux server without administrator privileges which means I have access to Perl and Aspell but not, for example, Text::Aspell which is a Perl interface for Aspell.
I want to do the very simple task of passing a list of words to Aspell and having it return the words that are misspelled. If the words I want to check are "dad word lkjlkjlkj" I can do this through the command line with the following commands:
aspell list
dad word lkjlkjlkj
Aspell requires CTRL + D at the end to submit the word list. It would then return "lkjlkjlkj", as this isn't in the dictionary.
In order to do the exact same thing, but submitted via Perl (because I need to do this for thousands of documents) I have tried:
my $list = q(dad word lkjlkjlkj):
my #arguments = ("aspell list", $list, "^D");
my $aspell_out=`#arguments`;
print "Aspell output = $aspell_out\n";
The expected output is "Aspell output = lkjlkjlkj" because this is the output that Aspell gives when you submit these commands via the command line. However, the actual output is just "Aspell output = ". That is, Perl does not capture any output from Aspell. No errors are thrown.
I am not an expert programmer, but I thought this would be a fairly simple task. I've tried various iterations of this code and nothing works. I did some digging and I'm concerned that perhaps because Aspell is interactive, I need to use something like Expect, but I cannot figure out how to use it. Nor am I sure that it is actually the solution to my problem. I also think ^D should be an appropriate replacement for CTRL+D at the end of the commands, but all I know is it doesn't throw an error. I also tried \cd instead. Whatever it is, there is obviously an issue in either submitting the command or capturing the output.
The complication with using aspell out of a program is that it is an interactive and command-line driver tool, as you suspect. However, there is a simple way to do what you need.
In order to use aspell's command list one needs to pass it words via STDIN, as its man page says. While I find the GNU Aspell manual a little difficult to get going with, passing input to a program via its STDIN is easy enough and we can rewrite the invocation as
echo dad word lkj | aspell list
We get lkj printed back, as due. Now this can run out of a program just as it stands
my $word_list = q(word lkj good asdf);
my $cmd = qq(echo $word_list | aspell list);
my #aspell_out = qx($cmd);
print for #aspell_out;
This prints lines lkj and asdf.
I assemble the command in a string (as opposed to an array) for specific reasons, explained below. The qx is the operator form of backticks, which I prefer for its far superior readability.
Note that qx can return all output in a string, if in scalar context (assigned to a scalar for example), or in a list when in list context. Here I assign to an array so you get each word as an element (alas, each also comes with a newline, so may want to do chomp #aspell_out;).
Comment on a list vs string form of a command
I think that it's safe to recommend to use a list-form for a command, in general. So we'd say
my #cmd = ('ls', '-l', $dir); # to be run as an external command
instead of
my $cmd = "ls -l $dir"; # to be run as an external command
The list form generally makes it easier to manage the command, and it avoids the shell altogether.
However, this case is a little different
The qx operator doesn't really behave differently -- the array gets concatenated into a string, and that runs. The very fact that we can pass it an array is incidental, and not even documented
We need to pipe input to aspell's STDIN, and shell does that for us simply. We can use a shell with command's LIST form as well, but then we'd need to invoke it explicitly. We can also go for aspell's STDIN by means other than the shell but that's more complex
With a command in a list the command name must be the first word, so that "aspell list" from the question is wrong and it should fail (there is no command named that) ... except that in this case it wouldn't (if the rest were correct), since for qx the array gets collapsed into a string
Finally, apsell nicely exposes its API in a C library and that's been utilized for the module you mention. I'd suggest to install it as a user (no privileges needed) and use that.
You should take a step back and investigate if you can install Text::Aspell without administrator privilige. In most cases that's perfectly possible.
You can install modules into your home directory. If there is no C-compiler available on the server you can install the module on a compatible machine, compile and copy the files.

use a variable with whitespace Perl

I am currently working on a project but I have one big problem. I have some picture with a whitespace in the name and I want to do a montage. The problem is that I can't rename my picture and my code is like that :
$pic1 = qq(picture one.png);
$pic2 = qq(picture two.png);
my $cmd = "C:\...\montage.exe $pic1 $pic2 output.png";
system($cmd);
but because of the whitespace montage.exe doesn't work. How can I execute my code without renaming all my pictures?
Thanks a lot for your answer!
You can properly quote the filenames within the string you pass to system, as #Borodin shows in his answer. Something like: system("montage.exe '$pic1' '$pic2'")
However, A more reliable and safer solution is to pass the arguments to montage.exe as extra parameters in the system call:
system('montage.exe', $pic2, $pic2, 'output.png')
Now you don't have to worry about nesting the correct quotes, or worry about files with unexpected characters. Not only is this simpler code, but it avoids malicious injection issues, should those file names ever come from a tainted source. Someone could enter | rm *, but your system call will not remove all your files for you.
Further, in real life, you probably are not going to have a separate scalar variable for each file name. You'll have them in an array. This makes your system call even easier:
system('montage.exe', #filenames, 'output.png')
Not only is that super easy, but it avoids the pitfall of having a command line too long. If your filenames have nice long paths (maybe 50-100 characters), a Windows command line will exceed the max command length after around 100 files. Passing the arguments through system() instead of in one big string avoids that limitation.
Alternatively, you can pass the arguments to montage.exe as a list (instead of concatenating them all into a string):
use strict;
use warnings;
my $pic1 = qq(picture one.png);
my $pic2 = qq(picture two.png);
my #cmd = ("C:\...\montage.exe", $pic1, $pic2, "output.png");
system(#cmd);
You need to put quotes around the file names that have spaces. You also need to escape the backslashes
my $cmd = qq{C:\\...\\montage.exe "$pic1" "$pic2" output.png};
In unix systems, the best approach is the multi-argument form of system because 1) it avoids invoking a shell, and 2) that's the format accepted by the OS call. Neither of those are true in Windows. The OS call to spawn a program expects a command line, and system's attempt to form this command line is sometimes incorrect. The safest approach is to use Win32::ShellQuote.
use Win32::ShellQuote qw( quote_system );
system quote_system("C:\\...\\montage.exe", $pic1, $pic2, "output.png");

Substitute only one part of a string using perl

I have an array that have some symbols that I want to remove and even thought I find a solution, I will like to know if this is the right way because I'm afraid if I use it with array will remove the character that I might need on future arrays.
Here is an example item on my array:
$string1='22 | logging monitor informational';
so I try the following:
$string1=~ s/\s{6}\|(?=\s{6})//;
So my output is:
22 logging monitor informational
Is the other way that best match "|". I just want to remove the pipe character.
Thanks in advance
"I want to remove just the pipe character."
OK, then do this:
$string1 =~ s/\|//;
This will remove the first pipe character in the string. (You said in another comment that you don't want to remove any additional pipe characters.) If that's not what you want, then I'd suggest telling us exactly what you do want. We can't read minds, you know.
In the mean time, I'd also strongly recommend reading the Perl regular expressions tutorial.

Can kdb read from a named pipe?

I hope I'm doing something wrong, but it seems like kdb can't read data from named pipes (at least on Solaris). It blocks until they're written to but then returns none of the data that was written.
I can create a text file:
$ echo Mary had a little lamb > lamb.txt
and kdb will happily read it:
q) read0 `:/tmp/lamb.txt
enlist "Mary had a little lamb"
I can create a named pipe:
$ mkfifo lamb.pipe
and trying to read from it:
q) read0 `:/tmp/lamb.pipe
will cause kdb to block. Writing to the pipe:
$ cat lamb.txt > lamb.pipe
will cause kdb to return the empty list:
()
Can kdb read from named pipes? Should I just give up? I don't think it's a permissions thing (I tried setting -m 777 on my mkfifo command but that made no difference).
With release kdb+ v3.4 Q has support for named pipes: Depending on whether you want to implement a streaming algorithm or just read from the pipe use either .Q.fps or read1 on a fifo pipe:
To implement streaming you can do something like:
q).Q.fps[0N!]`:lamb.pipe
Then $ cat lamb.txt > lamb.pipe
will print
,"Mary had a little lamb"
in your q session. More meaningful algorithms can be implemented by replacing 0N! with an appropriate function.
To read the context of your file into a variable do:
q)h:hopen`:fifo://lamb.pipe
q)myText: `char$read1(h)
q)myText
"Mary had a little lamb\n"
See more about named pipes here.
when read0 fails, you can frequently fake it with system"cat ...". (i found this originally when trying to read stuff from /proc that also doesn't cooperate with read0.)
q)system"cat /tmp/lamb.pipe"
<blocks until you cat into the pipe in the other window>
"Mary had a little lamb"
q)
just be aware there's a reasonably high overhead (as such things go in q) for invoking system—it spawns a whole shell process just to run whatever your command is
you might also be able to do it directly with a custom C extension, probably calling read(2) directly…
The algorithm for read0 is not available to see what it is doing under the hood but, as far as I can tell, it expects a finite stream and not a continuous one; so it will will block until it receives an EOF signal.
Streaming from pipe is supported from v3.4
Details steps:
Check duplicated pipe file
rm -f /path/dataPipeFileName
Create named pipe
mkfifo /path/dataPipeFileName
Feed data
q).util.system[$1]; $1=command to fetch data > /path/dataPipeFileName &
Connect pipe using kdb .Q.fps
q).Q.fps[0N!]`$":/path/",dataPipeFileName;
Reference:
.Q.fps (streaming algorithm)
Syntax: .Q.fps[x;y] Where x is a unary function and y is a filepath
.Q.fs for pipes. (Since V3.4) Reads conveniently sized lumps of complete "\n" delimited records from a pipe and applies a function to each record. This enables you to implement a streaming algorithm to convert a large CSV file into an on-disk kdb+ database without holding the data in memory all at once.

Limit !dumpheap (windbg) output to n objects

When using windbg and running !dumpheap command to see the addresses of objects, how can you limit to a specific number of objects. The only way I found was using CTRL+BREAK
and a command line on a blog http://dotnetdebug.net/2005/07/04/dumpheap-parameters-and-some-general-information-on-sos-help-system/
-l X - Prints out only X items from each heap instead of all the objects.
Apparently -l no longer exists in SOS.dll
What are you actually looking for? Before looking at individual objects, it's usual to narrow the area of interest.
The –stat switch shows a summary, per type of the objects on the heap.
DumpHeap [-stat] [-min ][-max ] [-thinlock] [-mt ] [-type ][start [end]]
The -stat option restricts the output to the statistical type summary.
The -min option ignores objects that are less than the size parameter, specified in bytes.
The -max option ignores objects that are larger than the size parameter, specified in bytes.
The -thinlock option reports ThinLocks. For more information, see the SyncBlk command.
The -mt option lists only those objects that correspond to specified the MethodTable structure.
The -type option lists only those objects whose type name is a substring match of the specified string.
The start parameter begins listing from the specified address. The end parameter stops listing at the specified address.
Ref.
According which criteria would you like to limit the number of outputs?
The -l option just limits the output according to line numbers. This is useless: let's say it shows only the first 10 objects, maybe the object you're looking for is not even listed.
If the output is too long for WinDbgs output window, use .logopen to dump the objects into a file and then review the file with a text editor.
If you have other ideas how your object looks like, you can perform a loop over all objects
.foreach ( obj { !dumpheap -short -type MyType} )
and then decide with .if whether or not your object matches this criteria.
As an example, I was looking for a needle in a haystack. I was searching a specific Hashtable in a program with more than 3000 Hashtables on the heap. The command I tried to use was
.foreach ( obj { !dumpheap -short -type Hashtable }) {.if (poi(poi(${obj}+1c)) > 100) {!do ${obj}} }
1C is the offset of the count member of the hashtable.
100 is the number of items the Hashtable was expected to have at least.
Unfortunately it didn't work for Hashtables immediately, because !dumpheap -type also listed HashtableEnumerators which somehow crashed the debugger.
To dump hashtables only, run !dumpheap -stat and figure out the method table of hashtables and run the command with -mt <methodtable> instead of -type <classname>, which gives
.foreach ( obj { !dumpheap -short -mt <MT of Hashtable> }) {.if (poi(poi(${obj}+1c)) > 100) {!do ${obj}} }