I'm trying to read numbers from a picture with pytesseract.
This is my code:
import cv2
import matplotlib.pyplot as plt
import pytesseract

img = cv2.imread(r'bild4.jpg', 2)  # flag 2 = cv2.IMREAD_ANYDEPTH
ret, bw_img = cv2.threshold(img, 200, 255, cv2.THRESH_BINARY)

plt.imshow(bw_img)
plt.show()

text = pytesseract.image_to_string(bw_img, config='digits')
print("Text: " + text)
I tried many ways to preprocess the image; the best I got is:
"673
504 .
5 552"
Only the last line is correct.
Without the config='digits' I get:
"673 ost
Fir 504 .
5 552
ii"
I tried with black and white only, which is really easy to read for me as a human, but it doesn't recognize the numbers at all...
You could use inRange thresholding.

Read the image and convert it to the HSV color space:
bgr = cv2.imread("JSe5v.jpg")
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
Initialize the mask and the kernel:
mask = cv2.inRange(hsv, np.array([0, 0, 244]), np.array([179, 35, 255]))
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
Remove the background using the mask and the kernel:
dilated = cv2.dilate(mask, kernel, iterations=1)
thresh = cv2.bitwise_and(dilated, mask)
The result will be:

If you read it with --psm 6, the output is:

675 OS!
312504
5 552

The second part of the first line is not correct: "51" is recognized as "S!". You could look into improving the Tesseract accuracy.
Code:
import cv2
import pytesseract
import numpy as np

bgr = cv2.imread("JSe5v.jpg")
hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)

# keep only bright, low-saturation (near-white) pixels
mask = cv2.inRange(hsv, np.array([0, 0, 244]), np.array([179, 35, 255]))
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))

dilated = cv2.dilate(mask, kernel, iterations=1)
thresh = cv2.bitwise_and(dilated, mask)

text = pytesseract.image_to_string(thresh, config="--psm 6")
print(text)
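Since you only expect digits, it may also be worth restricting Tesseract's character set with a whitelist. A minimal sketch, reusing thresh from above (note that the whitelist may be ignored by the LSTM engine in some Tesseract 4.x builds):

text = pytesseract.image_to_string(
    thresh,
    config="--psm 6 -c tessedit_char_whitelist=0123456789"
)
print(text)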
Related
This is my first time trying to train a network and use PyTorch, so please forgive me if this is considered simple.
I have a pretrained AlexNet that was modified to classify 3 classes, which I've already trained on MNIST digits mapped to 3 different labels.
import torch
import torch.nn as nn
from torchvision import models

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model = models.alexnet(pretrained=True)
        # changed in_channels from 3 to 1 because the images are black and white
        self.model.features[0] = nn.Conv2d(1, 64, kernel_size=11, stride=4, padding=2)
        # 3-class classifier -> 3 out_features
        self.model.classifier[4] = nn.Linear(4096, 1024)
        self.model.classifier[6] = nn.Linear(1024, 3)

    def forward(self, x):
        return self.model(x)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = Net().to(device)
I want to test this on a single .png image that I drew, which is already 255x255 and in black and white. I would like the predicted label. This is the code I have so far for preprocessing the image:
import torch
from PIL import Image
import matplotlib.pyplot as plt
import cv2
image_8 = Image.open( "eight.png" ).convert('L')
image_8 = list( image_8.getdata())
normalized_8 = [(255 - x) * 1.0 / 255.0 for x in image_8 ]
tensor_8 = torch.FloatTensor( normalized_8 )
pred = model( tensor_8 )
from which I got the following error: Expected 4-dimensional input for 4-dimensional weight [64, 1, 11, 11], but got 1-dimensional input of size [50176] instead. So this is clearly the wrong way to do things, but I'm not sure how to proceed.
Change your inference code to the following. Images are not intended to be flattened into 1d.
import cv2
import torch

image_8 = cv2.imread("eight.png")
# the following line may or may not be necessary
image_8 = cv2.cvtColor(image_8, cv2.COLOR_BGR2GRAY)

# dividing by a constant promotes the uint8 array to float
# (an in-place /= would fail on an integer array)
image_8 = image_8 / 255.

# unsqueeze adds a leading dimension, turning the 2d grayscale
# array into a 3d tensor of shape [1, height, width]
image_8 = torch.Tensor(image_8).unsqueeze(0)
pred = model(image_8)
If the input is still 3d (shape of [1, height, width]), add a second .unsqueeze(0) so it becomes the 4d [1, 1, height, width] batch that the Conv2d layer expects.
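To turn the logits into a predicted label, you could take the argmax over the class dimension. A minimal sketch, assuming model and image_8 from above:

import torch

model.eval()
with torch.no_grad():
    pred = model(image_8)             # shape [1, 3]: one logit per class
label = pred.argmax(dim=1).item()     # index of the highest-scoring class
print(label)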
This h5 file contains the values of an analytical function on a regular 3D grid. For interpolation purposes, I got very poor results using the RegularGridInterpolator here. Now I want to test the scipy.interpolate.Rbf interpolator on my data set. Can anyone help me do that? I had a look at the documentation of this interpolator but didn't understand it properly.
I have created a h5 file like this:
import numpy as np
from numpy import gradient
import h5py
from scipy.interpolate import Rbf
def f(x, y, z):
    return -1 / np.sqrt(x**2 + y**2 + z**2)

# grid
x = np.linspace(0, 100, 32)  # since the boxsize is 320 Mpc/h
y = np.linspace(0, 100, 32)
z = np.linspace(0, 100, 32)

mesh_data = f(*np.meshgrid(x, y, z, indexing='ij', sparse=True))
#create h5 file
h5file = h5py.File('analytic.h5', 'w')
h5file.create_dataset('/x', data=x)
h5file.create_dataset('/y', data=y)
h5file.create_dataset('/z', data=z)
h5file.create_dataset('/mesh_data', data=mesh_data)
h5file.close()
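I'm not sure this is exactly what you're after, but a minimal sketch of feeding this grid to scipy.interpolate.Rbf could look like the following. Note that Rbf builds a dense N x N matrix over all sample points, so the full 32^3 grid (32768 points) is impractical and I subsample it here; the step value is an arbitrary choice:

import numpy as np
import h5py
from scipy.interpolate import Rbf

with h5py.File('analytic.h5', 'r') as h5f:
    x = h5f['/x'][:]
    y = h5f['/y'][:]
    z = h5f['/z'][:]
    mesh_data = h5f['/mesh_data'][:]

# Rbf wants flat coordinate and value arrays
X, Y, Z = np.meshgrid(x, y, z, indexing='ij')
step = 4
xs = X[::step, ::step, ::step].ravel()
ys = Y[::step, ::step, ::step].ravel()
zs = Z[::step, ::step, ::step].ravel()
vals = mesh_data[::step, ::step, ::step].ravel()

# f diverges at the origin, so drop any non-finite samples
good = np.isfinite(vals)
rbf = Rbf(xs[good], ys[good], zs[good], vals[good])

# evaluate at an arbitrary point inside the box
print(rbf(50.0, 50.0, 50.0))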
I have an h5 file containing regular-grid data. I have used a code by which I can easily get the interpolated value for three given values, using the RegularGridInterpolator function for the interpolation. Now I want to make a plot to check whether the interpolation is correct or not, but I don't understand how I can do that. Can anyone help me with that please? Here is my code:
import numpy as np
import h5py
from scipy.interpolate import RegularGridInterpolator
f = h5py.File('file.h5', 'r')
print(list(f.keys()))
dset = f[u'data']
print(dset.shape)

data = dset[:]  # read the whole array (dset.value was removed in h5py >= 3.0)

x = np.linspace(-10, 320, 64)
y = np.linspace(-10, 320, 64)
z = np.linspace(-10, 320, 64)

my_interpolating_function = RegularGridInterpolator((x, y, z), data)

pts = np.array([7.36970468e-09, -4.54271563e-09, 1.51802701e-09])
print(my_interpolating_function(pts))
The output of the interpolation is array([5.45534467e-10])
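One way to check the interpolation is to evaluate the interpolator along a grid line and plot it on top of the stored grid values; the two should agree at the grid points. A sketch, assuming x, data, and my_interpolating_function from the code above:

import numpy as np
import matplotlib.pyplot as plt

# evaluate along the x-axis, holding y and z at the first grid point (-10)
xs = np.linspace(-10, 320, 500)
pts = np.column_stack([xs, np.full_like(xs, x[0]), np.full_like(xs, x[0])])
interp_vals = my_interpolating_function(pts)

plt.plot(xs, interp_vals, label='interpolated')
plt.plot(x, data[:, 0, 0], 'o', label='grid values')
plt.xlabel('x')
plt.legend()
plt.show()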
I'm trying to unblur the blurred segments of the following picture.
The original PSF was not given, so I proceeded to analyze the blurred part and see whether there was a word I could roughly make out. I found that I could make out "of" in the blurred section. I cropped out both the blurred "of" and its counterpart in the clear section, as seen below.
I then recalled from lectures on the FFT that you divide the blurred image (frequency domain) by a particular blurring function (frequency domain) to recreate the original image.
I thought that if I could compute Unblurred (frequency domain) \ Blurred (frequency domain), the original PSF could be retrieved. Please advise on how I could do this.
Below is my code:
img = im2double(imread('C:\Users\adhil\Desktop\matlab pics\image1.JPG'));
Blurred = imcrop(img,[205 541 13 12]);
Unblurred = imcrop(img,[39 140 13 12]);

UB = fftshift(Unblurred);
UB = fft2(UB);
UB = ifftshift(UB);

F_1a = zeros(size(Blurred));
for idx = 1 : size(Blurred, 3)
    B = fftshift(Blurred(:,:,idx));
    B = fft2(B);
    B = ifftshift(B);

    UBa = UB(:,:,idx);
    tmp = UBa ./ B;
    tmp = ifftshift(tmp);
    tmp = ifft2(tmp);
    tmp = fftshift(tmp);

    [J, P] = deconvblind(Blurred, tmp);
end

subplot(1,3,1); imshow(Blurred); title('Blurred');
subplot(1,3,2); imshow(Unblurred); title('Original Unblurred');
subplot(1,3,3); imshow(J); title('Attempt at unblurring');
This code, however, does not work, and I'm getting the following error:
Error using deconvblind
Expected input number 2, INITPSF, to be real.
Error in deconvblind>parse_inputs (line 258)
validateattributes(P{1},{'uint8' 'uint16' 'double' 'int16' 'single'},...
Error in deconvblind (line 122)
[J,P,NUMIT,DAMPAR,READOUT,WEIGHT,sizeI,classI,sizePSF,FunFcn,FunArg] = ...
Error in test2 (line 20)
[J, P] = deconvblind(Blurred,tmp);
Is this a good way to recreate the original PSF?
I'm not an expert in this area, but I have played around with deconvolution a little bit and have written a program to compute the point spread function given a clear image and a blurred image. Once I got the PSF using this program, I verified it was correct by using it to deconvolve the blurry image, and it worked fine. The code is below. I know this post is extremely old, but hopefully it will still be of use to someone.
import numpy as np
import matplotlib.pyplot as plt
import cv2
def deconvolve(normal, blur):
    # dividing the blurred spectrum by the clear-image spectrum
    # leaves the spectrum of the point spread function
    blur_fft = np.fft.rfft2(blur)
    normal_fft = np.fft.rfft2(normal)
    return np.fft.irfft2(blur_fft / normal_fft)

# use one channel of each image; both images must be the same size
img = cv2.imread('Blurred_Image.jpg')
blur = img[:, :, 0]
img2 = cv2.imread('Original_Image.jpg')
normal = img2[:, :, 0]

psf_real = deconvolve(normal, blur)
fig = plt.figure(figsize=(10,4))
ax1 = plt.subplot(131)
ax1.imshow(blur)
ax2 = plt.subplot(132)
ax2.imshow(normal)
ax3 = plt.subplot(133)
ax3.imshow(psf_real)
plt.gray()
plt.show()
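One caveat with the straight division: wherever normal_fft is zero or very small, the ratio blows up and the recovered PSF is dominated by noise. A common tweak (not part of the original code; deconvolve_reg is just an illustrative name) is to regularize the denominator with a small epsilon:

def deconvolve_reg(normal, blur, eps=1e-6):
    blur_fft = np.fft.rfft2(blur)
    normal_fft = np.fft.rfft2(normal)
    # the epsilon keeps near-zero frequencies from dominating the result
    return np.fft.irfft2(blur_fft / (normal_fft + eps))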
What is the equivalent of blockproc in opencv?
http://www.mathworks.com/help/images/ref/blockproc.html?refresh=true
I want to break the image into 3x3 blocks and apply an average to each block.
No, there is no direct equivalent; you need to do it manually:
http://answers.opencv.org/question/33258/how-to-cut-an-image-in-small-images-with-opencv/
When using the Python bindings, we can do the following:
import cv2
import numpy as np

mat = cv2.imread('x.jpg')
rows, cols, channels = mat.shape

# crop so both dimensions are divisible by 3 (use // for integer division)
rows, cols = 3 * (rows // 3), 3 * (cols // 3)

# split into 3x3 blocks: shape (rows//3, 3, cols//3, 3, channels)
reshaped = mat[:rows, :cols].reshape(rows // 3, 3, cols // 3, 3, 3)

# average each 3x3 block over the two block axes
mat1 = reshaped.sum(axis=(1, 3)) / 9
cv2.imwrite('y.jpg', mat1.astype(np.uint8))
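As a side note, for a plain 3x3 block mean, cv2.resize with INTER_AREA should produce the same result, since area interpolation averages the source pixels covered by each output pixel when downscaling by an integer factor:

import cv2

mat = cv2.imread('x.jpg')
rows, cols = mat.shape[0] // 3 * 3, mat.shape[1] // 3 * 3

# dsize is (width, height); INTER_AREA averages each 3x3 block
small = cv2.resize(mat[:rows, :cols], (cols // 3, rows // 3),
                   interpolation=cv2.INTER_AREA)
cv2.imwrite('y.jpg', small)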