I am pretty new to batch image processing in ImageJ. I have a macro that processes several images in a single directory. The problem is that the macro generates an individual summary window for each processed image, leaving me to manually compile all the output data into a single .csv or .xls file. I would prefer that all the summary data be compiled automatically into one file. I have found several sources showing how to do this, but none of them has been particularly helpful in my situation.
If you could help, I'd be very grateful.
Here is an abbreviated example of the code:
dir1 = getDirectory("Choose Source Directory ");
dir2 = getDirectory("Choose Destination directory");
list = getFileList(dir1);
setBatchMode(true);
for (i = 0; i < list.length; i++) {
    print(list[i]);
    open(dir1 + list[i]);
    name = File.nameWithoutExtension;
    // Prepare the image by removing any scale and making it 8-bit
    run("Set Scale...", "distance=237.3933 known=1 pixel=1 unit=cm global");
    makeRectangle(4068, 5940, 1572, 1320);
    run("Crop");
    // Convert the image into RGB channels for proper thresholding
    run("RGB Stack");
    setSlice(3);
    // Threshold
    setAutoThreshold("Default");
    // Analyze particles
    // Provides total area and number of cotyledons in the image
    run("Analyze Particles...", "size=60-Infinity pixel display include summarize");
    run("Revert");
}
// Save the results
selectWindow("Summary");
saveAs("Results", dir2 + "Results.xls");
In my tests (with ImageJ 2.0.0-rc-61/1.51n on macOS 10.12.6), repeatedly executing Analyze Particles with the summarize option checked appends the summary to the existing Summary window, which can then be saved as a single file at the end.
For example, the following macro generates two summary lines and then saves them:
setBatchMode(true);
run("Blobs (25K)");
setAutoThreshold("Default");
run("Analyze Particles...", "size=60-Infinity pixel display include summarize");
run("Boats (356K)");
setAutoThreshold("Default");
run("Analyze Particles...", "size=60-Infinity pixel display include summarize");
selectWindow("Summary");
saveAs("Results", "/Users/curtis/Desktop/Results.xls");
I tried searching around the internet, GitHub issues, and so on, but could not find out whether it is possible to get results with alternative character candidates when using Tesseract.
For example, when running tesseract -l jpn --psm 10 input.png - on this image I get the output 白, but I would also like to see the other possibilities, ideally with their confidence coefficients.
This is especially useful when trying to recognize a single character, as tesseract --psm 10 will give wrong but close results for complex kanji.
For example, one character was being recognized as 側. So if I could get, say, the five most probable candidates from the command line, that would be great. And if it's not possible through the command line, I'm also willing to look at a direct programming approach using the API.
EDIT:
Running tesseract -l jpn --psm 10 iu.png - on one image results in 雨. Running the code given in the answer on the same image, I can see that the confidence is 93.68%, and it shows only one result. If I run the same on another image, I get 言 (99.46%), which is a sensible result, but again only a single result, with the others ignored. I hypothesized that this happens because the confidence is high: when I run the same command on a third image, I get 遊, but when I run the code, I get
遊 (71.77%)
遮 (67.41%)
遭 (66.76%)
避 (65.36%)
遷 (65.00%)
選 (64.70%)
透 (64.55%)
進 (64.52%)
適 (63.95%)
週 (63.22%)
Hence, I assume it gives a single result for the previous images because it is confident.
Furthermore, running tesseract -l jpn_vert screenshot.png - on this image gives the output 言わない, which is correct. That means that even when the cropped character from the same image gave me 雨, there was a 言 match with a lower coefficient, and it won out when the dictionary match over the whole word ruled out 雨 as a possibility. That is why, when trying to identify a single character, I want the output to include all those matches (up to a fixed number, or above a threshold I decide myself).
The code I have is almost identical to the one given in the example; I have only added whitelist characters (around 2000+ kanji). I had to remove the api->SetVariable("tessedit_char_whitelist", ...) line because SO thought it was spam.
#include <cstdio>
#include <cstdlib>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>
#include <tesseract/publictypes.h>

int main(int argc, char *argv[]) {
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    // Initialize tesseract-ocr with Japanese, without specifying a tessdata path
    if (api->Init(NULL, "jpn")) {
        fprintf(stderr, "Could not initialize tesseract.\n");
        exit(1);
    }
    // Open the input image with the Leptonica library
    Pix *image = pixRead(argv[1]);
    api->SetImage(image);
    // Ask Tesseract to keep the classifier's alternative choices
    api->SetVariable("save_blob_choices", "T");
    api->SetPageSegMode(tesseract::PSM_SINGLE_CHAR);
    api->Recognize(NULL);
    tesseract::ResultIterator* ri = api->GetIterator();
    tesseract::PageIteratorLevel level = tesseract::RIL_SYMBOL;
    if (ri != 0) {
        do {
            const char* symbol = ri->GetUTF8Text(level);
            if (symbol != 0) {
                // Iterate over the alternative choices for this symbol
                tesseract::ChoiceIterator ci(*ri);
                do {
                    const char* choice = ci.GetUTF8Text();
                    printf("%s (%.2f%%)\n", choice, ci.Confidence());
                } while (ci.Next());
            }
            delete[] symbol;
        } while (ri->Next(level));
    }
    // Destroy the used object and release memory
    api->End();
    pixDestroy(&image);
    return 0;
}
IMHO you will need to use the Tesseract API; see the example of iterating over the classifier choices for a single symbol: https://github.com/tesseract-ocr/tessdoc/blob/master/APIExample.md#example-of-iterator-over-the-classifier-choices-for-a-single-symbol
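One caveat, stated as an assumption rather than a tested fact: that example was written against the classic (pre-LSTM) classifier, so on Tesseract 4+ you may get more alternatives by selecting the legacy engine explicitly (this requires legacy-capable jpn traineddata). A minimal sketch of the changed initialization, everything else as in the question's code:

#include <cstdio>
#include <tesseract/baseapi.h>
#include <tesseract/publictypes.h>

int main() {
    tesseract::TessBaseAPI api;
    // Assumption: OEM_TESSERACT_ONLY selects the legacy classifier,
    // which is what the choice-iterator example was written against.
    if (api.Init(NULL, "jpn", tesseract::OEM_TESSERACT_ONLY)) {
        fprintf(stderr, "Could not initialize the legacy engine.\n");
        return 1;
    }
    // ... SetImage / Recognize / ChoiceIterator exactly as in the question ...
    api.End();
    return 0;
}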
I have a weird problem: I want to create a video with the extension .tif containing a lot of frames. My script runs fine about two times out of three, but sometimes it crashes at random.
I have a loop over the total number of frames in the video, and on each iteration I append one frame to my multipage TIFF.
Here is my code to create the new video:
% --- Create the new frame
newVid.cData = iL(y0:y0end, x0:x0end);
% --- Create the new video
if nbrFrames == 1
    imwrite(newVid.cData, dataOutVid);
else
    imwrite(newVid.cData, dataOutVid, 'WriteMode', 'append');
end
On each iteration I change the value of newVid.cData. In fact, the new video is a portion of an original video, cropped around a specific object (a mouse, in my case). dataOutVid is the path where I store the new video; the extension of the path is .tif.
Here is how I obtain the path:
disp('Where do you want to save the new video and under which name ?');
[name, path] = uiputfile({'.tif'}, 'Save Video');
dataOutVid = strcat(path,name);
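As a side note, here is a slightly hardened version of that path block (a sketch; the guard matters because uiputfile returns 0 when the dialog is cancelled, and fullfile avoids separator mistakes):

disp('Where do you want to save the new video and under which name ?');
[name, path] = uiputfile({'*.tif'}, 'Save Video');
if isequal(name, 0)
    error('No output file selected.');   % dialog was cancelled
end
dataOutVid = fullfile(path, name);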
Here is the error I sometimes get, at random:
Error using imwrite (line 454)
Unable to open file "D:\Matlab\Traitement Vidéo\test.tif" for writing. You might not have write permission.
Error in mouseExtraction(line 164)
imwrite(newVid.cData,dataOutVid,'WriteMode','append');
Well, I don't understand why this error appears randomly (one time at frame 270, another time at frame 1250, and so on). How is it possible that I suddenly lose the right to overwrite my own file?
Edit: I already checked that it isn't a RAM problem; I only use 20% of it during the execution of the script.
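A workaround sketch, under the assumption that the failure is a transient lock on the file (antivirus scanners and indexing services are known to briefly lock files that are being appended to): retry the write a few times before giving up. Variable names follow the question's code.

% Retry imwrite on transient failures (assumption: the file is briefly
% locked by another process, e.g. an antivirus scanner).
maxTries = 5;
for attempt = 1:maxTries
    try
        imwrite(newVid.cData, dataOutVid, 'WriteMode', 'append');
        break;                    % success: leave the retry loop
    catch err
        if attempt == maxTries
            rethrow(err);         % still failing: give up
        end
        pause(0.2);               % wait briefly, then try again
    end
end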
I have a ".mat" file supposedly containing a [30720000x4 double] matrix (values from accelerometers). When I try to open this file with "Import data" in Matlab I get the following error:
Error using load
Can't read file F:\vibration_exp_2\GR_UB50n\bearing1\GR_UB50n_1_2.mat.
Error using load
Unknown text on line number 1 of ASCII file
F:\vibration_exp_2\GR_UB50n\bearing1\GR_UB50n_1_2.mat
"MATLAB".
Error in uiimport/runImportdata (line 456)
datastruct = load('-ascii', fileAbsolutePath);
Error in uiimport/gatherFilePreviewData (line 424)
[datastruct, textDelimiter, headerLines]= runImportdata(fileAbsolutePath,
type);
Error in uiimport (line 240)
[ctorPreviewText, ctorHeaderLines, ctorDelim] = ...
The file size is 921 MB, which is the same as my other files that do open. I also tried opening the file using Python, but without success. Any suggestions? I use MATLAB R2013b.
More info:
How the file was created:
%% acquisition of vibration data
% input:
% sample rate in Hz (max. 51200 Hz, should be used as bearing
% faults are high-frequent)
% time in seconds, stating the duration of the measurement
% (e.g. 600 seconds = 10 minutes)
% filename for the file to be saved
%
% examples:
% data = DAQ(51200, 600, 'NF1_1.mat');
% data = DAQ(51200, 600, 'NF1_2.mat');
function data = DAQ(samplerate,time,filename)
s = daq.createSession('ni'); % Creates the DAQ session
%%% Add the channels as accelerometer channels (meaning IEPE is turned on)
s.addAnalogInputChannel('cDAQ1Mod1','ai0','Accelerometer');
s.addAnalogInputChannel('cDAQ1Mod1','ai1','Accelerometer');
s.addAnalogInputChannel('cDAQ1Mod1','ai2','Accelerometer');
s.addAnalogInputChannel('cDAQ1Mod1','ai3','Accelerometer');
%s.addAnalogInputChannel('cDAQ1Mod2','ai0','Accelerometer');
s.Rate = samplerate;
s.NumberOfScans = samplerate*time;
%%% Defining the Sensitivities in V/g
s.Channels(1).Sensitivity = 0.09478; %31965, top outer
s.Channels(2).Sensitivity = 0.09531; %31966, back outer
s.Channels(3).Sensitivity = 0.09275; %31964, top inner
s.Channels(4).Sensitivity = 0.09363; %31963, back inner
data = s.startForeground(); %Acquiring the data
save(filename, 'data');
More info:
When I open the file using a simple text editor I can see a lot of characters that do not make sense, but also this first line:
MATLAB 5.0 MAT-FILE, Platform: PCWIN64, Created on: Thu Apr 30 16:29:07 2015
More info:
The file itself: https://www.dropbox.com/s/r7mavil79j47xa2/GR_UB50n_1_2.mat?dl=0
It is 921MB.
EDIT:
How can I recover my data?
I've tried this, but got memory errors.
I've also tried this, but it did not work.
I fear I can't add much good news to what you already know, but one thing hasn't been mentioned yet.
The reason the .mat file can't be loaded is that the data is corrupted. What makes it 'unrecoverable' is the way it is stored internally. The exact format is specified in the MAT-File Format documentation. So I decided to manually construct a simple reader to specifically read your .mat file.
It makes sense that splitmat.m can't recover anything: it basically splits the data into chunks, one stored variable per chunk, but in this case there is only one variable stored and thus only one chunk, which happens to be the corrupted one.
In this case, the data is stored as miCOMPRESSED, which is a normal MATLAB array compressed using gzip. (As a side note, that doesn't seem like a good fit for 'random' vibration data.) This might explain earlier comments about the file being smaller than the full data, as the file size matches exactly the internally stored value.
I extracted the compressed archive and tried to decompress it in a variety of ways. Basically, it is a '.gz' without the header, and the header can be prepended manually. Unfortunately, there seems to be a corrupted block near the start of the data set. I am by no means an expert on gzip, but as far as I know the dictionary is built up adaptively during decompression, which makes all data useless from the point where a block is corrupted. If you are really eager, there seems to be a way to recover data even beyond the corrupted point, but that method is massively time-consuming. Also, the only way to validate the data of those sections is manual inspection, which in your case might prove very difficult.
Below is the code that I used to extract the .gz file, so if you want to give it a try, this might get you started. If you manage to decompress the data, you can read it as described in the MAT-File Format documentation, p. 13f.
corrupted_file_id = fopen('corrupt.mat','r');

%% some header data
% can be skipped by replacing this block with
% fread(corrupted_file_id,132);

% header of the .mat file
header_text = char(fread(corrupted_file_id,116,'char')');
subsystem_data_offset = fread(corrupted_file_id,8,'uint8');
version = fread(corrupted_file_id,1,'int16');
endian_indicator = char(fread(corrupted_file_id,2,'int8')');
data_type = fread(corrupted_file_id,4,'uint8');
% data_type is 15, so it is a compressed matlab array (miCOMPRESSED)

%% save the content
data_size = fread(corrupted_file_id,1,'uint32');
gz_file_id = fopen('compressed_array.gz','w');
% first write a valid gzip header: magic 1f8b, deflate, no flags, mtime 0 ...
fwrite(gz_file_id,hex2dec('1f8b080000000000'),'uint64',0,'b');
% ... plus XFL and OS to complete the 10-byte header (RFC 1952)
fwrite(gz_file_id,[0 3],'uint8');
% then copy the payload in 1 kB chunks
n_chunks = floor(data_size/1e3);
for idx = 1:n_chunks
    fwrite(gz_file_id,fread(corrupted_file_id,1e3,'uint8'));
end
% and finally the remaining bytes
fwrite(gz_file_id,fread(corrupted_file_id,mod(data_size,1e3),'uint8'));
fclose(gz_file_id);
fclose(corrupted_file_id);
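To attempt the decompression itself, MATLAB's built-in gunzip can be tried on the reconstructed archive (a sketch; with the corrupted block it will most likely error out, but an intact file would yield the raw miMATRIX data described in the format documentation):

% Try to decompress the reconstructed archive.
try
    gunzip('compressed_array.gz');
catch err
    fprintf('Decompression failed: %s\n', err.message);
end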
To answer the question literally, my suggestion would be to first make sure that the file is okay. This tool on File Exchange apparently knows how to diagnose corrupted .MAT files starting with version V5 (R8):
http://www.mathworks.com/matlabcentral/fileexchange/6893-matcat-mat-file-corruption-analysis-tool
The file's size (indices going out of range) seems to be a problem. Octave, which should be able to read .mat files, gives the error
memory exhausted or requested size too large for range of Octave's index type
To find out what is wrong you may need to write a test program outside MATLAB, where you have more control over memory management. Examples are here, including instructions on how to build them on your own platform. These stand-alone programs may not have the same memory issues. The program matdgns.c is specifically made to check .mat files for errors.
I have thousands of PDFs generated from emails containing .png images (I am not the owner of the generator). For some reason, those PDFs are very, very slow to render with the imaging system I am using (I am not the developer of that system and may not change it).
If I use iTextSharp and implement an IRenderListener to count the images to be rendered, there are thousands per page (99% of them being only 1 or 2 pixels). But if I count the images in the resources of the PDF, there are only a few (tens at most).
I am counting the images in the resources, per page, with the following code:
var dict = pdfReader.GetPageN(currentPage);
PdfDictionary res = (PdfDictionary)PdfReader.GetPdfObject(dict.Get(PdfName.RESOURCES));
PdfDictionary xobj = (PdfDictionary)PdfReader.GetPdfObject(res.Get(PdfName.XOBJECT));
if (xobj != null)
{
    foreach (PdfName name in xobj.Keys)
    {
        PdfObject obj = xobj.Get(name);
        if (obj.IsIndirect())
        {
            PdfDictionary tg = (PdfDictionary)PdfReader.GetPdfObject(obj);
            PdfName subtype = (PdfName)PdfReader.GetPdfObject(tg.Get(PdfName.SUBTYPE));
            if (PdfName.IMAGE.Equals(subtype))
            {
                Count++;
            }
        }
    }
}
And my IRenderListener looks like this:
class ImageRenderListener : IRenderListener
{
    public void RenderImage(iTextSharp.text.pdf.parser.ImageRenderInfo renderInfo)
    {
        PdfImageObject image = renderInfo.GetImage();
        if (image == null) return;
        var refObj = renderInfo.GetRef();
        if (refObj == null)
            Count++; // but why no ref??
        else
            Count++;
    }
}
I just started to learn about the PDF specification and iTextSharp this evening, to analyze my PDF and understand what could be wrong. If I am correct, I see that many of the images to be rendered do not reference a resource (refObj == null) and that they are .png images (image.streamContentType.FileExtension == "png"). So I think those are the images making the rendering so slow.
For testing purposes, I would like to delete those images from the PDF, but I can't find out how to proceed.
I have only found code samples that remove images stored in the resources... but the images I want to delete are not there :/
Is there any code sample somewhere that could help me? I googled "iTextSharp remove object" and the like, but found nothing similar to my case :(
Let me start with the blunt observation that you have a shitty PDF.
The image you see when opening the PDF in a PDF viewer seems to be composed of many small 1- or 2-pixel images. Drawing these pixels one by one is suboptimal, no matter which imaging system you use: you are faced with a bad PDF.
In your first snippet, I see that you loop over all of the indirect objects stored in the XObject resources of each page in search of images. You count these images, resulting in the number of Image XObjects stored in the PDF. If you add up the Count values for all the pages, the total can be higher than the actual number of Image XObjects stored in the PDF, because you don't take into account that some images can be reused on different pages.
You do not count the inline images that are stored in the content streams. I'm biased: in the ISO committees for PDF, I'm on the side of the group saying that "inline images are evil" and "inline images should die". So far we haven't succeeded in getting rid of inline images, but we introduced some substantial limitations that should reduce the (ab)use of inline images in PDFs that conform to ISO-32000-2 (the PDF 2.0 spec that is due in 2016).
You've already discovered that your PDF has inline images. Those are the images where refObj == null. They are not stored as indirect objects; they are stored inline, in the content stream of the page. As you can imagine, given my feelings towards inline images, I consider your PDF a bad PDF for this reason too (although it does conform to ISO-32000-1).
The presence of inline images is a first explanation of why you get different image counts: when you loop over the indirect objects, you only find part of the images; when you parse the document for rendered images, you also find the inline ones.
A second explanation could be the fact that Image XObjects are used more than once. That's the whole point of not using inline images. For instance, suppose you have a logo that needs to be repeated on every page. One could use inline images, but that would be a bad idea: the same image bytes would be present in the PDF as many times as there are pages. One should use an Image XObject instead: the image bytes of the logo are stored only once in an indirect object, and every page references that object, so the image bytes are stored in the document only once. In a 10-page document, you can see 10 identical images on 10 pages, but when looking inside the document, you'll find only one image that is referenced from every page.
If you remove Image XObjects by removing the indirect objects containing the image stream objects, you have to be very careful: are you sure you're not corrupting your document? There's a reference to the Image XObject in the content stream of your page; this reference points to an entry in the /XObject entry of the page's /Resources, and that entry references the stream object with the image bytes. If you remove that indirect object without removing the references (e.g. from the content stream), you break your PDF. Some viewers will ignore those errors, but at some point in time some tool (or somebody) is going to complain that your PDF is corrupt.
If you want to remove inline images, you have to parse all the content streams in your PDF: page content streams as well as Form XObject content streams. You have to rewrite all these streams and make sure all inline images are removed, that is: all objects that start with the BI operator (Begin Image) and end with the EI operator (End Image).
That's a task for a PDF specialist who knows both iTextSharp and ISO-32000-1 inside-out. The solution to your problem probably doesn't fit into an answering window on StackOverflow.
I'm the original author of iText. From a certain point of view, iText is like a sharp knife. A sharp knife is a very good tool that can be used for many good things. However, you can also seriously cut your fingers when you're not using the knife in a correct way. I hope you'll be careful and that you're not going to create a whole series of damaged PDF files.
For instance: you assume that some of the images in the PDF are PNGs because iText suggests storing them as PNGs. However, PNG is not supported by ISO-32000-1, so your assumption that your PDF contains PNGs is wrong. I honestly worry when I see questions like yours.
I wrote a macro in Fiji to perform a set of operations on all the images in a particular folder, but I ran into trouble and can't get past one problem. I get an error message that says 'There are no images open' when I run the macro. What does it mean? (The images in the input folder are of .tif type.)
Here's the macro:
input = "C:"+File.separator+"Winter Quarter slides"+File.separator+"CTIA"+File.separator+"Project"+File.separator+"Original Image data"+File.separator+"Input Images"+File.separator;
output = "C:"+File.separator+"Winter Quarter slides"+File.separator+"CTIA"+File.separator+"Project"+File.separator+"Original Image data"+File.separator+"Output Images"+File.separator;
setBatchMode(true);
list = getFileList(input);
for (i = 0; i < list.length; i++)
    action(input, output, list[i]);
setBatchMode(false);

function action(input, output, filename) {
    open(input + filename);
    run("16-bit");
    run("Gaussian Blur...", "sigma=3");
    setAutoThreshold("Otsu");
    //run("Threshold...");
    setAutoThreshold("Otsu");
    setOption("BlackBackground", false);
    run("Convert to Mask");
    run("Close");
    run("Watershed");
    saveAs("Tiff", output + filename);
    close();
}
close();
Can someone please help me out with it asap?
Thanks!
Another thing that would cause this error is non-image files in the input directory. You loop through all the content of the folder and treat every entry as an image. If there is, for example, a text file in the folder, the result of open(input+filename) will not be an open image.
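A defensive fix is to skip anything that does not look like a TIFF before calling the function (a sketch against the question's macro; adjust the extensions to your data):

for (i = 0; i < list.length; i++) {
    // only process TIFF files; skip text files, subfolders, etc.
    if (endsWith(list[i], ".tif") || endsWith(list[i], ".tiff"))
        action(input, output, list[i]);
}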
When several windows are open, macro commands need to be told which window to work on.
In my macros I use selectWindow("imagename"); before the command. This should hopefully solve the problem.
I've not used the macro language, but I have seen that error when developing in Java: some plugins require that an image is showing.
If the image isn't showing after open(input+filename); then you need to run a show function to display the image.
You do
run("Close");
run("Watershed");
saveAs("Tiff", output+filename);
So you close the image and then try to do things to it, which would produce that error.
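If a binary morphological closing was intended there (a plausible guess given the Convert to Mask / Watershed pipeline), note that ImageJ calls that command "Close-", with a trailing hyphen, precisely to distinguish it from the window-closing Close. A sketch of the corrected tail of the function:

run("Convert to Mask");
run("Close-");    // binary closing; plain "Close" closes the active window
run("Watershed");
saveAs("Tiff", output + filename);
close();          // close the image only after it has been saved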