How to increase the maximum input line length in Windows? - windows-xp

In one of my batch files I set a variable.
When I run the batch file, it shows the error
"The input line is too long."
How can I overcome this problem?
Is it possible to increase the maximum input line length in Windows?

As per my comment, to continue a command across several lines in a batch file, append the ^ character to each line that is continued. E.g.:
somelong command ^
carries on here ^
and finally ends here
This will behave as one line.
I do not know, however, whether this will overcome the input length limitation.

How to parse sed regex syntax?

sed -i "0,/test/s//#test/g" file.txt
I do not know how to parse this regex. It comments out test by putting # in front of it, but my questions are:
What is the "0," at the beginning?
Why is it not written like "s/test/#test/g"? In other words, why is the s in the middle?
Any help is appreciated.
Let's break it down into smaller pieces:
https://www.gnu.org/software/sed/manual/sed.html#sed-script-overview
sed commands follow this syntax:
[addr]X[options]
X is a single-letter sed command. [addr] is an optional line address. If [addr] is specified, the command X will be executed only on the matched lines.
And
https://www.gnu.org/software/sed/manual/sed.html#Range-Addresses
An address range can be specified by specifying two addresses separated by a comma (,). An address range matches lines starting from where the first address matches, and continues until the second address matches (inclusively)
In the case of 0,/test/s//#test/g the address part is 0,/test/ because s is the command. An address part of 0,/test/ means the s command is only executed on lines inside that range. If the sed command was s/test/#test/g there wouldn't be an address part and the s command would be attempted on every line in the file.
https://www.gnu.org/software/sed/manual/sed.html#index-addr1_002c_002bN
A line number of 0 can be used in an address specification like 0,/regexp/ so that sed will try to match regexp in the first input line too. In other words, 0,/regexp/ is similar to 1,/regexp/, except that if addr2 matches the very first line of input the 0,/regexp/ form will consider it to end the range, whereas the 1,/regexp/ form will match the beginning of its range and hence make the range span up to the second occurrence of the regular expression.
Note that this is the only place where the 0 address makes sense; there is no 0-th line and commands which are given the 0 address in any other way will give an error.
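So in 0,/test/s//#test/g, the address part 0,/test/ restricts the s command to the lines from the start of the input through the first line that matches /test/, even if that is the very first line. Since no earlier line contains test, only that first matching line is changed.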
https://www.gnu.org/software/sed/manual/sed.html#index-empty-regular-expression
The empty regular expression ‘//’ repeats the last regular expression match (the same holds if the empty regular expression is passed to the s command).
So 0,/test/s//#test/g is the same as 0,/test/s/test/#test/g because the empty regular expression reuses the regex from the address part. It can be left out because writing the same regex twice just makes the whole command less readable.
In conclusion:
s/test/#test/g does the replacement on every line in the file that contains test
0,/test/s//#test/g does the replacement only on the first line in the file that contains test
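To see the difference in practice, here is a small sketch. It assumes GNU sed (the 0 address is a GNU extension), and the sample input and variable names are made up for illustration. The 1,/test/ variant spells out the regex in the s command rather than relying on the empty regex.

```shell
# Sample input (hypothetical): "test" occurs on the first line and again on the third.
input='test one
other
test two'

# No address: the substitution is attempted on every line.
all=$(printf '%s\n' "$input" | sed 's/test/#test/g')

# 0,/test/ : the range may end on line 1, so only the first match is changed (GNU sed).
first=$(printf '%s\n' "$input" | sed '0,/test/s//#test/g')

# 1,/test/ : the end address is only checked from line 2 on, so the range
# runs through the second occurrence of test.
from_one=$(printf '%s\n' "$input" | sed '1,/test/s/test/#test/g')

printf '%s\n---\n%s\n---\n%s\n' "$all" "$first" "$from_one"
```

With this input, only the 0,/test/ form leaves "test two" untouched; the other two commands change both lines that contain test.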

Tesseract OCR line breaks on command line

I am using tesseract.exe on Windows 7 from the command line, and when I run OCR on an image I get the output in continuous lines. I want the line breaks to appear exactly as they do in the image. Is there a command-line argument for this? Any help will be appreciated.
This is because Tesseract puts just line feeds at the end of a line instead of carriage returns + line feeds as expected by Windows' Notepad. An easy workaround is to output the results to stdout and redirect this output into a file:
tesseract.exe eurotext.tif - > result.txt
instead of
tesseract.exe eurotext.tif result

command line find instances of substring in file split by carriage return only

I have a file that logs the verbose output of an ffmpeg encoding, including any errors. However, each "line" is indicated only by a CR (0x0D), instead of a CRLF. FINDSTR apparently doesn't think that CR indicates a new line, so I basically have a single monolithic string. I need to count how many instances there are of the substring "aresample" in the log file, but even if there are fifty of them, since they are in a single "line", it only returns 1. What can I do to count multiple instances of a substring in a single "line"? Using Windows.

SED program works but when trying to execute as script it doesn't

My sed command works when I enter it followed by the file name, but when I saved it in a file and used chmod u+rx to make it executable, it doesn't work. The command is sed 's/\.\s*$/.\n/'.
Here is what happens
tim@tim-desktop:~$ ./dlsp lines
THIS IS Just A BLANK LINE.
If I enter it followed by a file name, it does what it is supposed to do
tim@tim-desktop:~$ sed 's/\.\s*$/.\n/' lines
Line one.
The second line.
The third.
This is line four.
five.
This is the sixth sentence.
This is line seven.
Eighth and last.
Are you certain that your dlsp script is passing the filename argument on to sed?
sed 's/\.\s*$/.\n/' "$1"
#                    ^^^^ This is the important bit!
If you don't do that, it will seem to sit there forever, because without a file argument sed reads from standard input and is waiting for you to type the input.
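For reference, a minimal sketch of what the whole script could look like. The shebang line, the use of "$@" (which forwards every argument rather than just the first), and the scratch directory are my assumptions, not something shown in the question. It relies on GNU sed for \s and for \n in the replacement.

```shell
# Work in a scratch directory so no existing file is overwritten.
workdir=$(mktemp -d)

# Hypothetical contents of dlsp: "$@" passes all arguments on to sed.
cat > "$workdir/dlsp" <<'EOF'
#!/bin/sh
sed 's/\.\s*$/.\n/' "$@"
EOF
chmod u+rx "$workdir/dlsp"

# A two-line sample file standing in for the "lines" file from the question.
printf 'Line one.\nThe second line.\n' > "$workdir/lines"

result=$("$workdir/dlsp" "$workdir/lines")
printf '%s\n' "$result"
```

Run this way, the script reads the named file instead of waiting on standard input.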

At what stage is sed's pattern space printed?

I have heard that for the pattern space, the maximum number of addresses is two.
And that sed goes through each line of the text file, and for each of them, runs through all the commands in the script expression or script file.
When does sed print the pattern space? Is it at the end of the text file, after it has processed the last line? Or is it at the end of processing each line, just after it has run through all the commands?
Can anybody demonstrate
a) the maximum limit of two for the pattern space?
b) when the pattern space is printed? If you can, please also provide a textual source that says so.
And why, in my attempt to see the size of the pattern space, does it look like it can hold a lot,
when this tutorial says
http://www.thegeekstuff.com/2009/12/unix-sed-tutorial-7-examples-for-sed-hold-and-pattern-buffer-operations/
Sed G function
The G function appends the contents of the holding area to the contents of the pattern space. The former and new contents are separated by a newline. The maximum number of addresses is two.
Here is what I found about the size of the pattern space while trying, unsuccessfully, to hit its limit of two.
abc.txt is a text file containing just the character z
sed h;G;G;G;G;G;G;G;G abc.txt
prints many zs, so I guess it can hold more than 2.
So I've misunderstood something.
An address is a way of selecting lines. Lines can be selected using zero, one or two addresses. This has nothing to do with the capacity of pattern space.
Consider the following input file:
aaa
bbb
ccc
ddd
eee
This sed command has zero addresses, so it processes every line:
s/./X/
Result:
Xaa
Xbb
Xcc
Xdd
Xee
This command has one address; it selects only the third line:
3s/./X/
Result:
aaa
bbb
Xcc
ddd
eee
An address of $ as in $s/./X/ would function the same way, but for the last line (regardless of the number of lines).
Here is a two-address command. In this case, it selects the lines based on their content. A single address command can do this, too.
/b/,/d/s/./X/
Result:
aaa
Xbb
Xcc
Xdd
eee
Pattern space is printed when an explicit p or P command is given. Unless the -n (suppress automatic printing) option is in effect, it is also printed when the script is complete for the current line of the input file, which includes ending the processing of the file with the q command.
Here's a demonstration of sed printing each line immediately upon receiving and processing it:
for i in {1..3}; do echo aaa$i; sleep 2; done | sed 's/./X/'
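A minimal sketch of the printing rules described above, using a two-line input fed through a pipe. The variable names are only for illustration.

```shell
# Default: pattern space is printed automatically at the end of each cycle.
auto=$(printf 'aaa\nbbb\n' | sed 's/./X/')

# An explicit p prints in addition to the automatic print, so every line appears twice.
twice=$(printf 'aaa\nbbb\n' | sed 'p')

# -n suppresses the automatic print; only the explicit p on line 2 produces output.
quiet=$(printf 'aaa\nbbb\n' | sed -n '2p')

printf '%s\n---\n%s\n---\n%s\n' "$auto" "$twice" "$quiet"
```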
The capacity of pattern space (and hold space) has to do with the number of characters it can hold (and is implementation dependent) rather than the number of input lines. The newlines separating those lines are simply another character in that total. The G command simply appends a copy of hold space onto the end of what's in pattern space. Multiple applications of the G command append that many copies.
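As a small sketch of that behaviour, here is a shortened version of the command from the question, with the single z piped in instead of read from abc.txt (that substitution is mine):

```shell
# h copies pattern space ("z") into hold space; each G then appends
# a newline followed by a copy of hold space.
out=$(printf 'z\n' | sed 'h;G;G;G')
printf '%s\n' "$out"

# One original line plus one line per G command.
count=$(printf '%s\n' "$out" | wc -l)
```

Three G commands give four lines of z, all held in pattern space at once before being printed at the end of the cycle.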
In the tutorial that you linked to, the statement "The maximum number of addresses is two." is somewhat ambiguous. What it indicates is that you can use zero, one or two addresses to select the lines that command is applied to. As in the above examples, you could apply G to all lines, one line or a range of lines. Depending on the command, sed accepts no addresses, at most one address, or up to two addresses. See man sed under the Synopsis section for subheadings that group the commands by the number of addresses they accept.
From info sed:
3.1 How `sed' Works
'sed' maintains two data buffers: the active pattern space, and the
auxiliary hold space. Both are initially empty.
'sed' operates by performing the following cycle on each lines of
input: first, 'sed' reads one line from the input stream, removes any
trailing newline, and places it in the pattern space. Then commands
are executed; each command can have an address associated to it:
addresses are a kind of condition code, and a command is only executed
if the condition is verified before the command is to be executed.
When the end of the script is reached, unless the '-n' option is in
use, the contents of pattern space are printed out to the output
stream, adding back the trailing newline if it was removed.(1) Then the
next cycle starts for the next input line.
Unless special commands (like 'D') are used, the pattern space is
deleted between two cycles. The hold space, on the other hand, keeps
its data between cycles (see commands 'h', 'H', 'x', 'g', 'G' to move
data between both buffers).
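The last paragraph of that excerpt can be illustrated with a short sketch. The three-line input and the variable name are made up for the example.

```shell
# H appends each input line to hold space, which survives from one cycle to the next.
# On the last line ($), x swaps the accumulated hold space into pattern space,
# the embedded newlines are turned into commas, and p prints the result.
joined=$(printf 'aaa\nbbb\nccc\n' | sed -n 'H;${x;s/\n/,/g;p;}')
printf '%s\n' "$joined"
```

The leading comma in the result comes from the first H, which appends a newline to the initially empty hold space before adding the first line.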