How does one read multiple DICOM and PNG files at once using pydicom.read_file() and cv2.imread()?

I am currently working on a fully convolutional network (FCN) for renal segmentation in MR images. I have 40 images and their ground-truth labels, and I am trying to load all of them for pre-processing.
I am using Google Colab with the latest versions of pydicom and pip installed. My Google Drive is mounted in Colab, and the code below shows the paths to the images and their masks in the pydicom.read_file() and cv2.imread() calls, respectively.
However, when I use a wildcard path such as "/../IMG*.dcm" or "/../IMG*.png" (which should be legal?), I receive the FileNotFoundError listed below. When I specify a single .dcm or .png file, the pydicom.read_file() and cv2.imread() calls work normally.
Any suggestions on how to resolve this issue? I am struggling a lot with loading the data and pre-processing but have the model architecture ready to go once these preliminary hurdles are overcome.
#import data as data
import cv2
import pydicom
import numpy as np
images = pydicom.read_file("/content/drive/My Drive/CHOAS_Kidney_Labels/Training_Images/T1DUAL/IMG*.dcm")
numpyArray = images.pixel_array
masks = cv2.imread("/content/drive/My Drive/CHOAS_Kidney_Labels/Ground_Truth_Training/T1DUAL/IMG*.png")
-----> FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/CHOAS_Kidney_Labels/Training_Images/T1DUAL/IMG*.dcm'

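Neither pydicom.read_file() nor cv2.imread() expands wildcards; each call reads exactly one file. (cv2.imread() returns None instead of raising an error when it cannot read a path, so only the pydicom call fails loudly.) You have to iterate over the files yourself, something like this (untested):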
import glob
import pydicom
pixel_data = []
paths = glob.glob("/content/drive/My Drive/CHOAS_Kidney_Labels/Training_Images/T1DUAL/IMG*.dcm")
for path in paths:
    dataset = pydicom.dcmread(path)
    pixel_data.append(dataset.pixel_array)
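The same idea can be extended to the PNG masks. The following is only a sketch: the helper names paired_paths and load_pairs are made up for illustration, and it assumes that each mask has the same file stem as its DICOM slice (for example IMG-1.dcm and IMG-1.png), which may not hold for your folders.

```python
from pathlib import Path


def paired_paths(image_dir, mask_dir, image_glob="IMG*.dcm", mask_ext=".png"):
    """Return sorted (image_path, mask_path) pairs that share a file stem.

    Assumption: a mask is named like its image, only with another extension.
    Images without a matching mask are skipped.
    """
    pairs = []
    for image_path in sorted(Path(image_dir).glob(image_glob)):
        mask_path = Path(mask_dir) / (image_path.stem + mask_ext)
        if mask_path.exists():
            pairs.append((image_path, mask_path))
    return pairs


def load_pairs(pairs):
    """Read every DICOM slice and its PNG mask into NumPy arrays."""
    # Imported here so that paired_paths() works without these packages.
    import cv2
    import pydicom

    images, masks = [], []
    for image_path, mask_path in pairs:
        images.append(pydicom.dcmread(str(image_path)).pixel_array)
        mask = cv2.imread(str(mask_path), cv2.IMREAD_GRAYSCALE)
        if mask is None:  # cv2.imread signals failure by returning None
            raise FileNotFoundError(str(mask_path))
        masks.append(mask)
    return images, masks
```

Calling load_pairs(paired_paths(image_folder, mask_folder)) with your two Drive folders would then give two lists of arrays in matching order. Sorting the paths matters because glob does not guarantee any particular order.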

Related

Using .traineddata with passportEye Python for MRZ

I am trying to improve the accuracy of passport MRZ reading with Tesseract OCR and PassportEye. I have found a few GitHub repositories containing "*.traineddata" files; they say to move the file into the Tesseract tessdata folder, which I did. Nowhere in the READMEs of these repos does it say how to use it. I believe it is something trivial, but I am very new to Tesseract.
How do I use it with PassportEye in Python? I have searched a lot and am completely lost. Here is the current code.
import os
from passporteye import read_mrz
pr_path = os.getcwd()
file_path = os.path.join(pr_path,'my_app', 'data')
mrz = read_mrz(file_path + '/test1.jpg')
print(mrz)
This is the .traineddata file I want to test for better accuracy: https://github.com/DoubangoTelecom/tesseractMRZ/blob/master/tessdata_best/mrz.traineddata
I do not want to use the bulky OpenCV. Please help.
From looking at the source code, I would say you can't, at least not without changing the PassportEye codebase.
Normally you would pass the language to tesseract via the -l parameter, in your case:
-l mrz
But the PassportEye implementation does not give you that option:
https://github.com/konstantint/PassportEye/blob/929c186c4dfa80a1ac975b5f2b95002ca12889d0/passporteye/util/ocr.py#L48
It passes lang=None; you would need to change that part to lang='mrz':
pytesseract.run_tesseract(input_file_name,
                          output_file_name_base,
                          'txt',
                          lang='mrz',
                          config=config)
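Before patching PassportEye, it can help to check that Tesseract actually picks up the new language file. The sketch below calls the tesseract command line directly with the -l flag mentioned above. The helper names are made up for illustration, and it assumes that the tesseract executable is on the PATH and that mrz.traineddata sits in its tessdata folder. It runs plain OCR on the whole image and does none of PassportEye's MRZ detection.

```python
import subprocess


def tesseract_command(image_path, output_base, lang="mrz"):
    """Build the command line: tesseract <image> <output base> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def run_tesseract(image_path, output_base, lang="mrz"):
    """Run tesseract and return the recognised text.

    Tesseract writes its result to <output_base>.txt.
    """
    subprocess.run(tesseract_command(image_path, output_base, lang), check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

If run_tesseract("test1.jpg", "out") fails with an error about loading the language, the traineddata file is probably not in the folder Tesseract is reading.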

How to download a file (csv.gz) from a url using Python 3.7

As with others who have posted in the past, I cannot figure out how to download a csv.gz file from a URL in Python 3.7. I have seen earlier posts, but the approaches in them only give me a 2 KB file.
I am a complete newbie with Python. What follows is the code for one file that I am trying to obtain, and I can't even get that to work. The final goal is to request all files that start with 2019* using Python. Please try the code below to save the file. As others have stated, the saved file has the right name but not the true content (ref: Downloading a csv.gz file from url in Python).
import requests
url = 'https://public.bitmex.com/?prefix=data/trade/20191026.csv.gz'
r = requests.get(url, allow_redirects=True)
open('20191026.csv.gz', 'wb').write(r.content)
Yields:
Out[40]:
1245
I've also tried "wget" and urllib.request along with "urlretrieve".
I wish I could add a screenshot or attach a file. The file created is 2 KB and is not even a csv.gz file, but the true file that I can download from a web browser is 78 MB. The file is 20191026.csv.gz, not that it matters, as they all behave the same way. The location is https://public.bitmex.com/?prefix=data/trade/
Again, a way to obtain all the files matching a filter such as 2019*csv.gz would be fantastic.
You are trying to download the files listed at https://public.bitmex.com/?prefix=data/trade/. The ?prefix= URL appears to return the listing page rather than the file itself, which would explain why only 1245 bytes were written.
To achieve your final goal of downloading all the files starting with 2019, work in three steps:
1) Read the content of https://public.bitmex.com/?prefix=data/trade/.
2) Convert the content into a list and keep only the file names that start with 2019.
3) Download each file in the resulting list using the example you are referring to.
Hope this approach will help you
Happy coding.
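The three steps above could be sketched roughly as follows. The helper names are made up for illustration. The sketch assumes that the listing text you fetch contains the file names literally (if the page builds its list with JavaScript, the raw HTML may not contain them), and it leaves the direct download address of each file as a parameter, because that address is not given here.

```python
import re


def names_for_year(listing_text, year="2019"):
    """Step 2: pull file names such as 20191026.csv.gz out of the listing."""
    pattern = re.compile(re.escape(year) + r"\d{4}\.csv\.gz")
    return sorted(set(pattern.findall(listing_text)))


def download(url, target_path, chunk_size=1 << 20):
    """Step 3: stream one file to disk instead of holding it in memory."""
    import requests  # third-party package, imported only when downloading

    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(target_path, "wb") as handle:
            for chunk in response.iter_content(chunk_size=chunk_size):
                handle.write(chunk)
```

Step 1 is then requests.get(listing_url).text, and the loop is download(base_url + name, name) for each name returned by names_for_year, where listing_url and base_url are placeholders that you have to fill in. A quick sanity check is to compare the size of one downloaded file with the roughly 78 MB the browser reports.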

How do I find more comprehensive Google Documentation for using its APIs

A lot of the time, the Google documentation is incomplete and is missing things like the library imports that a snippet requires. How do I view a more comprehensive example?
Example: https://cloud.google.com/vision/docs/detecting-faces#vision-face-detection-python
def highlight_faces(image, faces, output_filename):
    """Draws a polygon around the faces, then saves to output_filename.
    Args:
      image: a file containing the image with the faces.
      faces: a list of faces found in the file. This should be in the format
          returned by the Vision API.
      output_filename: the name of the image file to be created, where the
          faces have polygons drawn around them.
    """
    im = Image.open(image)
    draw = ImageDraw.Draw(im)
    for face in faces:
        box = [(vertex.x, vertex.y)
               for vertex in face.bounding_poly.vertices]
        draw.line(box + [box[0]], width=5, fill='#00ff00')
    im.save(output_filename)
This snippet is missing the PIL import.
On many of Google's code examples, there will be a VIEW ON GITHUB button that will take you to a complete working example rather than a snippet. Very useful for finding necessary library imports or just going straight to more code.
When that button is missing, there is sometimes a link to the source file instead, as in a Firebase example that links to index.js.

Caffe web demo error when running a model trained on Digits

I trained a neural network model on Digits and it seemed to run fine there.
Then I exported the trained model files and copied them to a different system running the standard Caffe web demo.
I hoped to just be able to plug those files in and have them run in Caffe, but I am getting an error.
Specifically, I copied my model into bvlc_reference_caffenet.caffemodel, the deploy.prototxt into deploy.prototxt, and the mean.binaryproto into the ilsvrc_2012_mean.npy file.
However, when I try to run it, it appears not to like the format of the mean.binaryproto file, as indicated by the error message:
IOError: Failed to interpret file '/home/vagrant/caffe/python/caffe/imagenet/ilsvrc_2012_mean.npy' as a pickle
What am I doing wrong here? Do I need to process the mean.binaryproto file from Digits somehow before I use it with Caffe?
You need to convert the .binaryproto file to a numpy file.
There is a nice example here using caffe.io and caffe.proto.
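A conversion along those lines might look like the sketch below. The function name and file paths are made up for illustration, and it assumes that pycaffe and NumPy are installed on the machine that runs it. The shape the web demo expects for its mean file is not checked here.

```python
def binaryproto_to_npy(binaryproto_path, npy_path):
    """Convert a Caffe mean .binaryproto file into a NumPy .npy file."""
    # Imported lazily: both packages must be installed where this runs.
    import numpy as np
    import caffe.io
    from caffe.proto import caffe_pb2

    # Parse the serialized BlobProto message written by the training tool.
    blob = caffe_pb2.BlobProto()
    with open(binaryproto_path, "rb") as handle:
        blob.ParseFromString(handle.read())

    # blobproto_to_array returns shape (num, channels, height, width);
    # a mean file holds a single image, so keep index 0.
    mean = np.array(caffe.io.blobproto_to_array(blob))[0]
    np.save(npy_path, mean)
    return mean.shape
```

For example, binaryproto_to_npy("mean.binaryproto", "ilsvrc_2012_mean.npy") would write a file that np.load can read, which is what the error message above is complaining about.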

How to combine several PNG images as layers in a single XCF image?

I have several PNG images, which I need to combine as individual layers in a new GIMP XCF image. I need to perform this task many times, so a script based solution would be best.
So far i tried to play around with the batch mode of GIMP, but failed miserably.
Instead of Script-Fu, which uses Scheme, I'd recommend using the GIMP-Python binding for this, since it makes manipulating files and listings far easier.
If you open Filters -> Python-Fu -> Console, you will be dropped into an interactive mode. At the bottom of it there is a "Browse" button, which lets you select any of the procedures in GIMP's API and paste it directly into the console.
There is an API call to load a file as a layer, pdb.gimp_file_load_layer.
This, however, brings the layer into memory but does not add it to the image; you have to call
pdb.gimp_image_insert_layer afterwards.
You can type this directly in the interactive console, or check one of my other GIMP-related answers, or some resource on GIMP-Python on the web, to convert it to a plug-in, which won't require pasting this code each time you want to perform the task:
def open_images_as_layers(img, image_file_list):
    for image_name in image_file_list:
        # gimp_file_load_layer needs the destination image as well as the file name
        layer = pdb.gimp_file_load_layer(img, image_name)
        pdb.gimp_image_insert_layer(img, layer, None, 0)
img = gimp.image_list()[0]
image_list = "temp1.png temp2.png temp3.png"
open_images_as_layers(img, image_list.split())
The line img = ... picks up the needed reference to an image that is open
in GIMP; you could also create a new image programmatically if you'd prefer
(example below).
The file list is a hardcoded space-separated string in the snippet above,
but you can create the file list in any of the ways allowed by Python.
For example, to get all the ".png" file names in a
c:\Documents and Settings\My user\Pictures folder, you could do:
from glob import glob
image_list = glob("c:/Documents and Settings/My user/Pictures/*.png")
To create an image programmatically:
img = gimp.Image(1024, 768)
pdb.gimp_display_new(img)
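Since the goal is an XCF file, the pieces above could be combined with a save step, roughly as in the sketch below. The function name build_xcf is made up for illustration. The sketch assumes GIMP 2.x with Python-Fu, and it only runs inside GIMP (for example, pasted into the Python-Fu console), because the gimpfu module is not available to a regular Python interpreter. The canvas size is taken from the example above and is not adapted to the PNG files.

```python
def build_xcf(png_paths, xcf_path, width=1024, height=768):
    """Stack the given PNG files as layers of a new image and save it as XCF."""
    # Only importable inside GIMP's own Python interpreter.
    from gimpfu import gimp, pdb

    img = gimp.Image(width, height)
    for png_path in png_paths:
        # Load the file as a layer belonging to img, then attach it on top.
        layer = pdb.gimp_file_load_layer(img, png_path)
        pdb.gimp_image_insert_layer(img, layer, None, 0)

    # gimp_xcf_save takes a dummy first argument, then image, drawable,
    # file name and raw file name.
    pdb.gimp_xcf_save(0, img, img.active_layer, xcf_path, xcf_path)
    pdb.gimp_image_delete(img)
```

Combined with the glob call shown earlier, build_xcf(sorted(image_list), "combined.xcf") would produce one XCF per call, which makes it straightforward to repeat the task for many folders.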