ImageMagick crop with row/column in file name only saving last image - powershell

I'm attempting to crop an image using ImageMagick via PowerShell. I can crop the image fine with the following command, and it creates the 2000+ images:
convert -crop 16x16 .\original.png tileOut%d.png
However, I would like to take advantage of ImageMagick's ability to dynamically set the file name.
According to a post on their forums I should be able to run something like the following via a batch file:
convert ^
bigimage.jpg ^
-crop 256x256 ^
-set filename:tile "%%[fx:page.x/256+1]_%%[fx:page.y/256+1]" ^
+repage +adjoin ^
tiled_%%[filename:tile].gif
I shouldn't need to escape the % since I'm running this in PowerShell directly, so I used the following:
convert -crop 16x16 .\original.png -set filename:tile "%[fx:page.x/16+1]_%[fx:page.y/16+1]" +repage +adjoin directory\tiled_%[filename:tile].png
However, when I run this command I end up with one file called tiled_%[filename and another called tiled_45_47.png.
So while it does seem to create the last file, it only creates the one. The first file is 0 bytes in size, but takes up over 8 MB of space on disc, according to properties on the file.
Trying to run the command in a batch file results in the same behavior, which makes me think PowerShell itself isn't the issue, but rather the command is.
According to the documentation +adjoin is required since I want different images. +repage doesn't make much sense to me, but I've kept it in the command since the original had it, and excluding it doesn't seem to change the output. -set filename seems pretty straightforward.
The large size of the first file leads me to believe that all the previous images might be getting added to it. However, the file name also suggests it's getting hung up on the :, which doesn't appear to be a special character in PowerShell. It's also creating an image for the very last crop. Baffling.
So what am I doing wrong?
Thanks in advance!
EDIT:
PowerShell 5.0.10586.0, on Windows 10.
ImageMagick 6.9.2 Q16 (64-bit)
From the comments, I'm thinking the issue might be with the ImageMagick command.

I'm not using PowerShell, but I think you will have more success by specifying your image first, then the crop, then setting the filename:
convert original.png -crop 16x16 -set filename:tile "%[fx:page.x/16+1]_%[fx:page.y/16+1]" +repage "tiled_%[filename:tile].png"

So in the past I was using the following command to crop images, with the %d being automatically converted to a number based upon the sequence.
convert -crop 16x16 .\original.png directory\tileOut%d.png
That works perfectly fine. However, the example provided on that forum had the original file name listed as the first argument to the convert command. Changing my command so that it was listed first results in the expected behavior.
convert .\original.png -crop '16x16' -set 'filename:tile' '%[fx:page.x/16+1]_%[fx:page.y/16+1]' +repage +adjoin 'directory\tiled_%[filename:tile].png'
The use of single quotes in so many locations may not be required, but it works.
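To make the naming scheme concrete, here is a minimal Python sketch of what the two fx expressions evaluate to for each tile. It does not call ImageMagick; the function name tile_names and the image sizes are made up for illustration, and the loop order is only illustrative.

```python
def tile_names(width, height, tile=16, prefix="tiled_", ext=".png"):
    """List the names that "%[fx:page.x/16+1]_%[fx:page.y/16+1]" would give.

    Hypothetical helper for illustration only; it mirrors the arithmetic
    of the fx expressions and does not run ImageMagick.
    """
    names = []
    # Each cropped tile keeps its page offset (page.x, page.y), i.e. the
    # pixel position of its top-left corner in the original image.
    for page_y in range(0, height, tile):
        for page_x in range(0, width, tile):
            col = page_x // tile + 1  # page.x/16+1
            row = page_y // tile + 1  # page.y/16+1
            names.append(f"{prefix}{col}_{row}{ext}")
    return names


if __name__ == "__main__":
    # An assumed 48x32 image: 3 columns and 2 rows of 16x16 tiles.
    print(tile_names(48, 32))
```

Under this arithmetic, a last file named tiled_45_47.png would correspond to 45 columns and 47 rows of tiles, assuming the image dimensions are exact multiples of 16.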

Related

ImageMagick: How to batch append 4 parts of images into one (2 rows, 2 columns) (I have 500+ images that need to be combined like this)

everyone!
I am using ImageMagick-7.0.10-Q16 on Windows 10. I’ve tried Googling for answers, but I’m still very confused about how to do this. Most of the answers I found were for UNIX rather than Windows, used terms I don’t understand, or gave me errors. I don’t have any experience with coding or Windows PowerShell, so forgive my slowness.
I have scanned pages of books that have been split into four JPG files each. The images are named after the orientation of the piece and the page number. BL=Bottom left. BR=Bottom right. TR=Top right. TL=Top left. (BM=Bottom pieces merged. TM=Top pieces merged.) So “BL0001.jpg” is the bottom-left piece of page 1. I’m not mentioning their sizes because I don’t want them to be resized or whatever. I just want them to be combined via append, like a puzzle, like this:
Combined jpg pieces.
The borders and the text-boxes there are just to demonstrate, and are not to be included
So the files are for example like this:
BL0001.jpg
BR0001.jpg
TL0001.jpg
TR0001.jpg
BL0002.jpg
BR0002.jpg
TL0002.jpg
TR0002.jpg
And so on...
This was the last thing I’ve tried in Windows PowerShell:
magick convert B*0001.jpg +append 0001BM.jpg
magick convert T*0001.jpg +append 0001TM.jpg
magick convert 0001*.jpg +swap -append 0001merged.jpg
This combines 4 parts into one image just like I want it to. I found out that adding * works like a wildcard and merges all the matching images, such as BL and BR, together in one go. But I can’t do that for the page number (in this case ‘0001’ in ‘B*0001.jpg’), because that would merge all the files in the folder into the same image, which I don’t want. So what I want to figure out is how to “batch” run this command with a sequential numbering system for the different pages. In other words, I want one command that combines the pieces of each page into one image, for all the scanned pages in the folder. I know the commands above create additional files with the merged top and bottom parts before the final merge, but I don’t know how to write this command otherwise. I'm willing to try other commands/things too.
Using ImageMagick v7 in a simple Windows BAT script you could do something like this...
@echo off
setlocal EnableDelayedExpansion
for /l %%n in ( 1 1 9999 ) do (
set V1=000%%n
set V1=!V1:~-4!
magick *!V1!.jpg +append -crop 2x1@ +swap -append +repage !V1!merged.jpg
)
exit /b
That uses a "for" loop to read all four "*0001.jpg" images at a time into an ImageMagick command. The "set V1=" lines are to make sure the variables have the correct number of leading zeros.
The IM command appends, crops, and appends the four images into the properly ordered output, and writes the image as "0001merged.jpg". Then it moves on to process "*0002.jpg" and so on.
I put a top limit on the number of image sets to process with that "9999" in the "for" command to work with the number of leading zeros. Make sure that number is the same or more than the number of image sets you have. It will just print an error for each loop after it goes over the number of image sets, but no harm done.
Note: Using ImageMagick v7 you should just use "magick" because when you use "magick convert" it emulates IMv6 behavior. You probably won't usually want that.
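For readers who would rather script this outside a BAT file, the following is a rough Python sketch of the same batch idea. It is not the approach above: instead of the -crop/+swap trick it names the four pieces explicitly and uses parenthesized sub-lists. The helper names (group_pages, merge_command, merge_commands) are made up for illustration, and the file naming follows the question (BL/BR/TL/TR plus a four-digit page number).

```python
import re
from collections import defaultdict

# Piece names as described in the question: position code + 4-digit page.
PIECE = re.compile(r"^(BL|BR|TL|TR)(\d{4})\.jpg$")


def group_pages(filenames):
    """Map page number -> {position code: file name}."""
    pages = defaultdict(dict)
    for name in filenames:
        match = PIECE.match(name)
        if match:
            pages[match.group(2)][match.group(1)] = name
    return pages


def merge_command(page, pieces):
    """Build one magick call: top row over bottom row."""
    return [
        "magick",
        "(", pieces["TL"], pieces["TR"], "+append", ")",  # top row
        "(", pieces["BL"], pieces["BR"], "+append", ")",  # bottom row
        "-append", f"{page}merged.jpg",                   # stack the rows
    ]


def merge_commands(filenames):
    """Return one command per page that has all four pieces."""
    commands = []
    for page, pieces in sorted(group_pages(filenames).items()):
        if len(pieces) == 4:  # skip incomplete pages instead of failing
            commands.append(merge_command(page, pieces))
    return commands
```

Each returned list could be handed to subprocess.run(cmd, check=True) from the folder that holds the images, with os.listdir(".") as the input. Passing the arguments as a list avoids shell quoting of the parentheses.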

pytesseract results different from tesseract command line results

I am trying to convert a scanned page to text using both pytesseract and the tesseract command line on Ubuntu. The results are remarkably different (pytesseract performs way better than the tesseract command line) and I am unable to understand why. I looked at the default values for the parameters and tried altering some of them on the tesseract command line (like psm), but I am unable to get the same result as pytesseract. Because pytesseract lacks proper documentation, I am not able to figure out which default parameter values it uses.
Here is my pytesseract code
print(pytesseract.image_to_string(Image.open('test.tiff')))
Looking at the source code of pytesseract, it seems the image is always converted into a .bmp file.
Working with a .bmp file and psm of 6 at the command line with Tesseract gives same result as pytesseract.
Also, tesseract can work with uncompressed bmp files only. Hence, if ImageMagick is used to convert .pdf to .bmp, the following will work
convert -density 300 -quality 100 mypdf.pdf BMP3:mypdf.bmp
tesseract mypdf.bmp -psm 6 mypdf txt
In Tesseract v5 3.0+
pytesseract does not convert images to BMP. You can verify this by commenting out cleanup(f.name) in the save context manager, which is found in the source file /pytesseract/pytesseract.py. You will also need to retrieve the file name of the temp file (pytesseract was saving files in the user's temp directory, i.e. "[path-to-user]\AppData\Local[file-name]"). What pytesseract actually does to the image can be found in the prepare function.
Basically, taking the temp file and using that same file with the tesseract command directly will yield the same results.
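When comparing the two routes, it can help to pin the page segmentation mode explicitly on the command-line side. Below is a small sketch that only builds the argument list; the helper name tesseract_command and its legacy_flag parameter are made up for illustration. It assumes the usual "tesseract image outputbase [options]" layout, with -psm as the Tesseract 3.x spelling and --psm as the spelling in 4.0 and later.

```python
def tesseract_command(image_path, output_base, psm=6, legacy_flag=False):
    """Build a tesseract command line with an explicit page segmentation mode.

    Hypothetical helper: legacy_flag=True selects the 3.x spelling (-psm),
    otherwise the 4.0+ spelling (--psm) is used.
    """
    flag = "-psm" if legacy_flag else "--psm"
    return ["tesseract", image_path, output_base, flag, str(psm)]


if __name__ == "__main__":
    # Prints the command instead of running it, so tesseract is not required.
    print(" ".join(tesseract_command("mypdf.bmp", "mypdf")))
```

On the pytesseract side, the same option can be passed through the config argument, for example image_to_string(image, config='--psm 6'), so both runs use the same mode.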

ghostscript not creating exact images

I am running the script below to create images from a PostScript file. The images are created, but the watermark is missing on the first page.
gs -dUseCIEColor -dNOPAUSE -sDEVICE=jpeg -dFirstPage="1" -dLastPage=2 -sOutputFile=outputImage_%0d_A.gif -dJPEGQ=100 -r300 -q inputFile.ps -c quit;'
Here is a link to the PS file I am using.
http://speedy.sh/Y7vWj/inputFile.ps
Can anybody please help!!!!
Thanks in advance...
OK, you haven't stated what version of Ghostscript you are using, nor have you been very clear about what is missing. By 'watermark' do you mean the dark grey text 'PAULDAVIS' written diagonally across the very dark grey rectangle?
If so, then I can see that with the current version of Ghostscript and your command line, it's not missing.
A few observations on your command line:
-dUseCIEColor - Don't use this unless you know exactly what you are doing and why you want this, I'm guessing you don't (because you have not set any Color Rendering Dictionary). With this you get very dark grey text which is nearly invisible against the very dark grey rectangle. Not surprising since this relates to colour management.
You've set the device to jpeg, but you've set the output file to have a .gif extension.
You are using -dFirstPage and -dLastPage which have no effect when the input is not PDF (though this is added as a new feature in unreleased code).
You've set FirstPage=1 and LastPage=2 on a 2 page file.....
You have set -dFirstPage="1", which isn't going to work for any code which parses and uses it. The quotes won't work.
I'd recommend you do not set -q or -dQUIET when trying to diagnose problems, telling Ghostscript to be quiet will potentially mean you miss useful information.
-c quit; -c means 'process the next part of the command line as PostScript'. But quit; isn't valid PostScript (the semicolon should not be present) and will throw a PostScript error. If you want GS to exit after processing, consider simply using -dBATCH.
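Putting those observations together, a cleaned-up invocation might look like the sketch below. It only assembles the argument list. The helper name ghostscript_command and the .jpg output pattern are assumptions for illustration; the switches themselves (-dNOPAUSE, -dBATCH, -sDEVICE, -dJPEGQ, -r, -sOutputFile) are standard Ghostscript options.

```python
def ghostscript_command(input_ps, output_pattern="outputImage_%d_A.jpg",
                        dpi=300, quality=100):
    """Build a Ghostscript call that renders every page to a JPEG.

    Changes relative to the command in the question:
    - no -dUseCIEColor (no colour rendering dictionary is set up)
    - no -dFirstPage/-dLastPage (no effect for non-PDF input)
    - no -q, so diagnostics stay visible
    - -dBATCH instead of "-c quit;"
    - output extension matches the jpeg device (assumed name pattern)
    """
    return [
        "gs",
        "-dNOPAUSE",
        "-dBATCH",
        "-sDEVICE=jpeg",
        f"-dJPEGQ={quality}",
        f"-r{dpi}",
        f"-sOutputFile={output_pattern}",
        input_ps,
    ]


if __name__ == "__main__":
    print(" ".join(ghostscript_command("inputFile.ps")))
```

The %d in the output pattern is replaced by Ghostscript with the page number, so a two-page file yields two images.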

Error in calling ImageMagick from Matlab

I have installed ImageMagick on my system (Windows), and its commands are on the system PATH. It works absolutely fine from the command line.
I want to call the "convert" function of ImageMagick from Matlab using system command.
'C:\Users\Vivek' is the path to the image. I have to test that ImageMagick works from Matlab, as I need it for further processing (making the input suitable for Tesseract OCR).
cmd= ['convert ' 'C:\Users\Vivek\208.jpg ' 'C:\Users\Vivek\208.png']
system(cmd);
It says Invalid Parameter - C:\Users\Vivek\208.png. I tried some other ways, but the problem is always with the second parameter.
Need Help
Thanks
Windows comes with its own convert program, and it looks like you're calling that one because it's first on the path in this context. It's described here on ImageMagick's site: http://www.imagemagick.org/Usage/windows/#convert_issue
I do not have ImageMagick installed, and I get the same error message when I try calling convert. That's consistent with your code getting the wrong convert program.
C:\Users\janke>convert C:\Users\Vivek\286.jpg C:\Users\Vivek\208.png
Invalid Parameter - C:\Users\Vivek\208.png
Specify the full path to ImageMagick's convert program and it should work for you.
The solution mentioned in the last post is the standard way to solve the issue, but the simplest way is to just rename ImageMagick's convert.exe file to something else, like convert1.exe, and use that file name in your scripts.
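The underlying cause is search order: the first directory on the PATH that contains a matching program wins. As a language-neutral illustration (in Python rather than Matlab), the sketch below shows how to check which executable a given search order resolves to. The helper name first_on_path is made up for illustration; shutil.which is the standard-library lookup.

```python
import os
import shutil


def first_on_path(program, directories):
    """Return the full path of the first match for program, or None.

    Hypothetical helper: directories are searched in the given order,
    which is how an ambiguous name like "convert" gets resolved.
    """
    return shutil.which(program, path=os.pathsep.join(directories))


if __name__ == "__main__":
    # Shows which "convert" the current environment would actually run.
    print(shutil.which("convert"))
```

If the printed path points into the Windows system directory rather than the ImageMagick install folder, the wrong program is being picked up, and calling ImageMagick by its full path avoids the ambiguity.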

PDF output from MATLAB and inclusion in LaTeX

I'm printing some figures in MATLAB in PDF form, and can view them fine with the Evince PDF viewer on Fedora 16.
When I try to include them in LaTeX (TeXLive 2011), however, I get an error
!pdfTeX error: /usr/local/texlive/2011/bin/x86_64-linux/pdflatex (file ./caroti
d_amp_mod_log.pdf): xpdf: reading PDF image failed
However, I can take an example PDF image generated in Mathematica and include it just fine, which tells me that the problem is with the PDF's generated by MATLAB and not with PDF's in general.
Might it have something to do with the set(0,'defaultfigurepaperpositionmode','auto') I put in my startup.m file so that pages would auto-fit the images?
EDIT: I just tried using saveas(figure(1), 'filename.pdf') instead of print(figure(1), 'filename.pdf') and it worked fine, but the PaperPositionMode property is ignored. Any way around this?
Finally found the problem. The correct way to print images is to use the print(handle, '-dformat', 'filename') syntax.
So, for PDF's, we need print(figure(1), '-dpdf', 'myfigure'). See MATLAB documentation on graphics file formats for more information.
Using print(figure(1), 'filename.pdf') still produces a valid PDF for viewing, but it can't be included in LaTeX.
You can try using pdfpages or pgf to include PDF files. However, you need to use pdflatex only, as you are doing right now.