PowerShell 5.1 - the filename or extension is too long. How to split 1 command with dynamic arguments into multiple sequential calls?

Problem:
In PowerShell 5.1 I run a command, myProgram, and pass a comma-separated list of dynamic length, $commaSeparatedList, to its -itemsToProcess flag. The list is often very long, sometimes 125,000 characters or more, which can cause the error shown below.
$commaSeparatedList = 'file1,file2,file3,file4 ... fileX'
myProgram -itemsToProcess $commaSeparatedList
Question:
How might I avoid the error? How might I split this into multiple calls so that the error is never thrown? The calls must be sequential, not parallel.
The pseudocode above succeeds when $commaSeparatedList is short, but if it's too long it fails with the error:
"the filename or extension is too long"
For example, the dynamically generated $commaSeparatedList has been 125k characters long and could be longer or shorter.
How might we detect and avoid the error? Perhaps by splitting it into multiple myProgram -itemsToProcess calls? What would that look like?

Thanks for the comments, I split the input into smaller strings and made more calls
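The splitting approach mentioned above could look roughly like the sketch below. It is written in Python only to show the batching logic, not as the asker's actual PowerShell solution. The 8000-character budget is an arbitrary assumption chosen to stay well under the Windows limit of roughly 32K characters per command line, and the helper names are hypothetical; myProgram and -itemsToProcess are taken from the question.

```python
import subprocess

def split_into_batches(comma_separated, max_chars):
    """Group items into comma-joined strings no longer than max_chars each."""
    batches, current, current_len = [], [], 0
    for item in (s for s in comma_separated.split(",") if s):
        if len(item) > max_chars:
            raise ValueError(f"single item exceeds max_chars: {item!r}")
        # The extra 1 accounts for the comma joining this item to the previous one.
        added = len(item) + (1 if current else 0)
        if current and current_len + added > max_chars:
            batches.append(",".join(current))
            current, current_len = [item], len(item)
        else:
            current.append(item)
            current_len += added
    if current:
        batches.append(",".join(current))
    return batches

def run_sequentially(comma_separated, max_chars=8000):
    """Call myProgram once per batch (max_chars is an assumed, conservative budget)."""
    for batch in split_into_batches(comma_separated, max_chars):
        # subprocess.run blocks until the call finishes, so calls never overlap.
        subprocess.run(["myProgram", "-itemsToProcess", batch], check=True)
```

Because each call waits for the previous one to finish, the calls stay sequential; the same loop structure should carry over to a PowerShell foreach.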

Related

How can I avoid package/character errors in (read) in Common Lisp?

I'm getting some surprising errors when I try to input a string using (read). Context: I'm building a mini language, with inputs delimited using characters like {, }, :, etc.
Here's what happens: I run (read) and enter {9.I:{8.II:hello}{8.III:hi}} (an example input string from my mini language).
I then get 2 errors:
1:
too many colons in "{8.II"
2:
Package HELLO}{8.III does not exist.
It seems as though there's something extra going on in the (read) function that's tripping me up. Can someone point me in the right direction?
read is designed to read a valid Lisp S-expression. It's going to use Common Lisp's parser. If your language is sufficiently Lisp-like, you may be able to make it work for you, but given the example input you've shown, I doubt it's what you want.
You're probably looking for read-line, which reads a single line of text as a string and does not perform any additional parsing on it.
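To illustrate the "read the raw line, then parse it yourself" approach, here is a small Python sketch (Python's input() plays the role of read-line). The token rules are an assumption based only on the delimiters named in the question ({, } and :); the real mini language may need more.

```python
import re

# Delimiters come from the question's example; any other run of characters
# is treated as a single atom (an assumption, not the language's real grammar).
TOKEN_PATTERN = re.compile(r"[{}:]|[^{}:]+")

def tokenize(line):
    """Split a raw input line into delimiter and atom tokens."""
    return TOKEN_PATTERN.findall(line)

if __name__ == "__main__":
    # input() returns the line as a plain string with no further parsing.
    print(tokenize(input()))
```

Since the line is never handed to a Lisp-style reader, colons and braces carry no package or symbol meaning and can be interpreted however the mini language requires.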

Subscript multiple characters in Julia variable name?

I can write:
x\_m<TAB> = 5
to get x subscript m as a variable name in Julia. What if I want to subscript a word instead of a single character? This
x\_max<TAB> = 5
doesn't work. However,
x\_m<TAB>\_a<TAB>\_x<TAB> = 5
does work, it's just very uncomfortable. Is there a better way?
As I noted in my comment, not all ASCII characters exist as Unicode superscripts or subscripts. In addition, another difficulty in generalizing this tab completion will be determining what \_phi<TAB> should mean: is it ₚₕᵢ or ᵩ? Finally, I'll note that since these characters are cobbled together from different ranges for different uses, they look pretty terrible when used together.
A simple hack to support common words you use would be to add them piecemeal to the Base.REPLCompletions.latex_symbols dictionary:
Base.REPLCompletions.latex_symbols["\\_max"] = "ₘₐₓ"
Base.REPLCompletions.latex_symbols["\\_min"] = "ₘᵢₙ"
You can put these additions in your .juliarc.jl file (~/.julia/config/startup.jl in Julia 1.0 and later) to load them every time on startup. A comprehensive solution may be possible, but it would take much more work.
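The point about missing characters can be made concrete with a short sketch, written in Python purely for illustration (the function name is hypothetical). To the best of my knowledge the table below lists every lowercase Latin letter that has a Unicode subscript form; note how the code points come from several different ranges.

```python
# Lowercase Latin letters with a Unicode subscript form. The remaining
# letters (b, c, d, f, g, q, w, y, z) have none, as far as I know.
SUBSCRIPT_LETTERS = {
    "a": "\u2090", "e": "\u2091", "h": "\u2095", "i": "\u1d62",
    "j": "\u2c7c", "k": "\u2096", "l": "\u2097", "m": "\u2098",
    "n": "\u2099", "o": "\u2092", "p": "\u209a", "r": "\u1d63",
    "s": "\u209b", "t": "\u209c", "u": "\u1d64", "v": "\u1d65",
    "x": "\u2093",
}

def to_subscript(word):
    """Return word in subscript letters, or raise if a letter has no subscript."""
    missing = sorted({ch for ch in word if ch not in SUBSCRIPT_LETTERS})
    if missing:
        raise ValueError(f"no Unicode subscript for: {', '.join(missing)}")
    return "".join(SUBSCRIPT_LETTERS[ch] for ch in word)
```

Words such as max and min convert cleanly, while a word like big cannot be written in subscripts at all, which is why a fully general completion is hard.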
Since Julia 1.6, this works for both subscripts (\_) and superscripts (\^) in the Julia REPL:
x\_max<TAB> produces xₘₐₓ.
x\^max<TAB> produces xᵐᵃˣ.

Can the MATLAB editor show the file from which text is displayed? [duplicate]

In MATLAB, how do you tell where in the code a variable is getting output?
I have about 10K lines of MATLAB code with about 4 people working on it. Somewhere, someone has dumped a variable in a MATLAB script in the typical way:
foo
Unfortunately, I do not know which variable is being output, and the output is crowding out other, more important output.
Any ideas?
P.S. Has anyone tried overriding System.out? Since MATLAB and Java integration is so tight, would that work? A trick I've used in Java when faced with this problem is to replace System.out with my own version.
Ooh, I hate this too. I wish Matlab had a "dbstop if display" to stop on exactly this.
The mlint traversal from weiyin is a good idea. Mlint can't see dynamic code, though, such as arguments to eval() or string-valued figure handle callbacks. I've run into output like this from callbacks such as the following, where update_table() returns something under some conditions.
uicontrol('Style','pushbutton', 'Callback','update_table')
You can "duck-punch" a method into built-in types to give you a hook for dbstop. In a directory on your Matlab path, create a new directory named "@double", and make a @double/display.m file like this.
function display(varargin)
builtin('display', varargin{:});
Then you can do
dbstop in double/display at 2
and run your code. Now you'll be dropped into the debugger whenever display is implicitly called by the omitted semicolon, including from dynamic code. Doing it for @double seems to cover char and cells as well. If it's a different type being displayed, you may have to experiment.
You could probably override the built-in disp() the same way. I think this would be analogous to a custom replacement for Java's System.out stream.
Needless to say, adding methods to built-in types is nonstandard, unsupported, very error-prone, and something to be very wary of outside a debugging session.
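The general idea behind both the display override and the System.out replacement is to hook the output path and look at the call stack when something is written. The sketch below shows that idea in Python as a stand-in; it is not MATLAB or Java code, and TracingStream, noisy and run_traced are hypothetical names made up for the illustration.

```python
import io
import sys
import traceback

class TracingStream:
    """Wraps a text stream and records which code wrote non-blank text to it."""

    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.callers = []

    def write(self, text):
        if text.strip():
            # With limit=2 the first entry is the frame that triggered this write.
            frame = traceback.extract_stack(limit=2)[0]
            self.callers.append((frame.name, frame.lineno))
        return self.wrapped.write(text)

    def flush(self):
        self.wrapped.flush()

def noisy():
    print("foo = 2")  # stand-in for the stray, unwanted output

def run_traced(func):
    """Run func with stdout hooked; return the (function, line) pairs that printed."""
    tracer = TracingStream(io.StringIO())
    original, sys.stdout = sys.stdout, tracer
    try:
        func()
    finally:
        sys.stdout = original  # always restore the real stream
    return tracer.callers
```

Calling run_traced(noisy) should report noisy as the source of the output, which is the same information the dbstop hook gives you through dbstack.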
This is a typical pattern that mLint will help you find:
So, look on the right hand side of the editor for the orange lines. This will help you find not only this optimization, but many, many more. Notice also that your variable name is highlighted.
If you have a line such as:
foo = 2
and there is no ";" on the end, then the output will be dumped to the screen with the variable name appearing first:
foo =
2
In this case, you should search the file for the string "foo =" and find the line missing a ";".
If you are seeing output with no variable name appearing, then the output is probably being dumped to the screen using either the DISP or FPRINTF function. Searching the file for "disp" or "fprintf" should help you find where the data is being displayed.
If you are seeing output with the variable name "ans" appearing, this is a case when a computation is being done, not being put in a variable, and is missing a ';' at the end of the line, such as:
size(foo)
In general, this is a bad practice for displaying what's going on in the code, since (as you have found out) it can be hard to find where these have been placed in a large piece of code. In this case, the easiest way to find the offending line is to use MLINT, as other answers have suggested.
I like the idea of "dbstop if display"; however, this is not a dbstop option that I know of.
If all else fails, there is still hope. Mlint is a good idea, but if there are many thousands of lines and many functions, then you may never find the offender. Worse, if this code has been sloppily written, there will be zillions of mlint flags that appear. How will you narrow it down?
A solution is to display your way there. I would overload the display function. Only temporarily, but this will work. If the output is being dumped to the command line as
ans =
stuff
or as
foo =
stuff
Then it has been written out with display. If it is coming out as just
stuff
then disp is the culprit. Why does it matter? Overload the offender. Create a new directory, called @double, inside some directory that is at the top of your MATLAB search path (this assumes that the output is a double variable; if it is a character array, you will need a @char directory instead). Do NOT put the @double directory itself on the MATLAB search path; just put it in some directory that is on your path.
Inside this directory, put a new m-file called disp.m or display.m, depending upon your determination of what has done the command line output. The contents of the m-file will be a call to the function builtin, which will allow you to then call the builtin version of disp or display on the input.
Now, set a debugging point inside the new function. Every time output is generated to the screen, this function will be called. If there are multiple events, you may need to use the debugger to allow processing to proceed until the offender has been trapped. Eventually, this process will trap the offensive line. Remember, you are in the debugger! Use the debugger to determine which function called disp, and where. You can step out of disp or display, or just look at the contents of dbstack to see what has happened.
When all is done and the problem repaired, delete this extra directory, and the disp/display function you put in it.
You could run mlint as a function and interpret the results.
>> I = mlint('filename','-struct');
>> isErrorMessage = arrayfun(@(S)strcmp(S.message,...
'Terminate statement with semicolon to suppress output (in functions).'),I);
>> I(isErrorMessage).line
This will only find missing semicolons in that single file. So this would have to be run on a list of files (functions) that are called from some main function.
If you wanted to find calls to disp() or fprintf(), you would need to read in the text of the file and use regular expressions to find the calls.
Note: If you are using a script instead of a function you will need to change the above message to read: 'Terminate statement with semicolon to suppress output (in scripts).'
Andrew Janke's overloading is a very useful tip.
The only thing I would change is to use the following instead of dbstop, for the simple reason that putting a stop in display.m pauses execution every time display.m is called, even if nothing is written.
This way, the stop is only triggered when display is called to write a non-empty value, and you won't have to step through a potentially very large number of useless display calls.
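function display(varargin)
% Call the real display first so the output still appears.
builtin('display', varargin{:});
% Drop into the debugger only when something non-empty was displayed.
if ~isempty(varargin{1})
    keyboard
end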
A foolproof way of locating such things is to iteratively step through the code in the debugger observing the output. This would proceed as follows:
1) Add a break point at the first line of the highest-level script/function which produces the undesired output. Run the function/script.
2) Step over the lines (not stepping in) until you see the undesired output.
3) When you find the line/function which produces the output, either fix it, if it's in this file, or open the subfunction/script which is producing the output. Remove the break point from the higher-level function, and put a break point in the first line of the lower-level function. Repeat from step 1 until the line producing the output is located.
Although a pain, you will find the line relatively quickly this way unless you have huge functions/scripts, which is bad practice anyway. If the scripts are like this, you could use a bisection approach to locate the line in the function in a similar manner. This would involve putting a break point at the start, then one halfway through, noting which half of the function produces the output, then halving again and so on until the line is located.
I had this problem with much smaller code and it's a bugger, so even though the OP found their solution, I'll post a small cheat I learned.
1) In the Matlab command prompt, turn on 'more'.
more on
2) Resize the prompt-y/terminal-y part of the window to a mere line of text in height.
3) Run the code. It will stop wherever it needs to print, as there isn't enough space to print it (more is blocking until you press [space] or [down]).
4) Press [ctrl]-[C] to kill your program at the spot where it couldn't print.
5) Return your prompt-y area to normal size. Starting at the top of the trace, click on the clickable bits in the red text. These are your potential culprits. (Of course, you may need to have pressed [down], etc., to get past parts where the code was actually intended to print things.)
You'll need to traverse all your m-files (probably using a recursive function, or unix('find -type f -iname *.m') ). Call mlint on each filename:
r = mlint(filename);
r will be a (possibly empty) structure with a message field. Look for the message that starts with "Terminate statement with semicolon to suppress output".

Translating DOS batch file to PowerShell

I am trying to translate a .bat file to PowerShell and am having trouble understanding what a few snippets of code are doing:
set MY_VARIABLE = "some\path\here"
"!MY_VARIABLE:\=/!"
What is line 2 above doing? Specifically, I don't understand what the :\=/ is doing, since I have seen the variable elsewhere in the code being referenced like !MY_VARIABLE!.
The other point of confusion is the below code.
set SOME_VARIABLE=!SOME_ARGUMENTS:\=\\!
set SOME_VARIABLE=!SOME_ARGUMENTS:"=\"!
Also, can you tell me what is going on in lines 3 and 4 above as well?
What would the below variables translate into PowerShell as well?
set TN0=%~n0
set TDP0=%~dp0
set STAR=%*
Any help on this is much appreciated. Thanks.
The !var:find=replace! syntax is string substitution for a variable that is delay-expanded.
http://www.robvanderwoude.com/ntset.php#StrSubst
When you use ! instead of % for a variable, you want DOS to do the variable replacement at execution time (which is probably what you think it does with %, but it doesn't). With %, the variable is substituted at the point that the command is parsed (before it's run) -- so if the variable changes as part of the command, it won't be seen. I think some switch to using ! all of the time, because it gives "normal" behavior.
You can read more about delayed expansion here
http://www.robvanderwoude.com/ntset.php#DelayedExpansion
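To make the substitution syntax concrete, here is a rough Python sketch of what the three expressions from the question compute. The helper name and the sample values are assumptions for illustration; in a PowerShell translation the string .Replace() method would play the same role.

```python
def batch_substitute(value, find, replace):
    """Rough equivalent of !VAR:find=replace! : replace every occurrence of find."""
    return value.replace(find, replace)

path = r"some\path\here"

# !MY_VARIABLE:\=/!  turns backslashes into forward slashes.
forward = batch_substitute(path, "\\", "/")

# !SOME_ARGUMENTS:\=\\!  doubles every backslash.
doubled = batch_substitute(path, "\\", "\\\\")

# !SOME_ARGUMENTS:"=\"!  puts a backslash in front of every double quote.
escaped = batch_substitute('say "hi"', '"', '\\"')
```

Note that in the question both set lines read from SOME_ARGUMENTS, so the second assignment overwrites the result of the first rather than combining the two replacements.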
The first two set variableName= commands use modifiers to expand on the name of the batch file, represented as %0.
%~n0 expands it to a file name, and
%~dp0 expands it to include a drive letter and path.
The final one, %*, represents all arguments passed to the batch file.
Additional information can be found in answers here or here.
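As a rough illustration of what these three expansions yield, the Python sketch below computes the same pieces from a script path and an argument list. The function name and the sample path are hypothetical, and joining the arguments with spaces only approximates %*, which keeps the original command-line text.

```python
from pathlib import PureWindowsPath

def batch_script_parts(script_path, arguments):
    """Return rough equivalents of %~n0, %~dp0 and %* for a given script path."""
    path = PureWindowsPath(script_path)
    name = path.stem                  # like %~n0: file name without extension
    folder = str(path.parent)
    if not folder.endswith("\\"):
        folder += "\\"                # %~dp0 ends with a trailing backslash
    all_args = " ".join(arguments)    # approximation of %*
    return name, folder, all_args
```

For a script at C:\tools\build.bat called with the arguments one and two, this gives build, C:\tools\ and "one two".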
Exclamation points (!) in DOS batch files reference the intermediate value, useful if you are in a for-loop. If you were to use a % instead (in a loop), it would return the same value over and over.
Lines 3 and 4 set SOME_VARIABLE to the delayed-expansion value of SOME_ARGUMENTS with each \ replaced by \\, and with each " replaced by \", respectively. Again, I'm guessing that these lines are from a loop.
As for the variable assignments, Powershell variable assignments work like this:
$myVariable = "my string"
~dp0 (in DOS batch) expands to the drive letter and directory of the current bat file. Note that Get-Location is not an equivalent: it returns the current working directory, which matches only when the script is run from its own folder. In PowerShell 3.0 and later, $PSScriptRoot holds the directory of the running script.
%* expands to all of the arguments passed to the batch file (it is not a literal asterisk); the closest PowerShell equivalent is $args.
~n0 expands to the batch file's name without its extension.

Can the C preprocessor perform simple string manipulation?

This is a C macro weirdness question.
Is it possible to write a macro that takes a string constant X ("...") as an argument and evaluates to a string Y of the same length, such that each character of Y is a [constant] arithmetic expression of the corresponding character of X?
This is not possible, right?
No, the C preprocessor considers string literals to be a single token and therefore it cannot perform any such manipulation.
What you are asking for should be done in actual C code. If you are worried about runtime performance and wish to delegate this fixed task at compile time, modern optimising compilers should successfully deal with code like this - they can unroll any loops and pre-compute any fixed expressions, while taking code size and CPU cache use patterns into account, which the preprocessor has no idea about.
On the other hand, you may want your code to include such a modified string literal, but do not want or need the original - e.g. you want to have obfuscated text that your program will decode and you do not want to have the original strings in your executable. In that case, you can use some build-system scripting to do that by, for example, using another C program to produce the modified strings and defining them as macros in the C compiler command line for your actual program.
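The build-step idea can be sketched as follows. The answer suggests another C program and macros on the compiler command line; this variation uses Python and emits a #define line for a generated header instead, which is an assumption made for illustration. The function names, the macro name and the per-character transform are all hypothetical.

```python
def c_string_literal(text, transform):
    """Apply transform to each byte of text and return a C string literal."""
    encoded = bytes(transform(b) & 0xFF for b in text.encode("ascii"))
    # Three-digit octal escapes are used because a C octal escape stops after
    # three digits, so a following character can never be absorbed into it.
    return '"' + "".join(f"\\{b:03o}" for b in encoded) + '"'

def define_line(name, text, transform):
    """Return a #define line suitable for writing into a generated header."""
    return f"#define {name} {c_string_literal(text, transform)}"

if __name__ == "__main__":
    # Example transform: XOR each character with a fixed key (an assumption).
    print(define_line("SECRET", "hello", lambda b: b ^ 0x2A))
```

The original text never reaches the compiler this way; the program only sees the transformed literal and reverses the arithmetic at run time. A transform that maps some character to zero would embed a NUL in the literal, so the decoding side would need the length from elsewhere.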
As already said by others, the preprocessor sees entire strings as tokens. There is only one exception: the _Pragma operator, which takes a string as its argument and tokenizes its contents to pass them to a #pragma directive.
So unless you're targeting a _Pragma, the only way to do things in the preprocessing phases is to have them written as token sequences, manipulate them, and stringify them at the end.