Open only the top left corner of a JPEG? [closed] - python-imaging-library

I am wondering if there is a software library (such as Pillow) that can open a JPEG and convert only a subset of it to an uncompressed array. This would be in contrast to the more usual method, which is to open the file, decompress it fully to a pixel array, and then take a subset of that array.
In my case the subset I have in mind is the upper left corner. The files decompress to 2544 × 4200 pixels, but I am only interested in the top left 150 × 900 pixels.
I am hopeful that the JPEG format stores the image as a string of compressed blocks in scan order, so that a decoder could stop once it had processed enough blocks to cover the required subset of the image.
I have been searching for a while but have not found any mention of such an algorithm, which is, admittedly, a special case.
Background
I use pyzbar to capture a barcode from the top left corner of a JPEG image produced by a high-speed scanner. This generally takes about 250 ms per image. The actual pyzbar time is about 2.5 ms, while the other 99% is spent reading the image from a file, decompressing it with Pillow, and extracting the upper left corner.
The non-profit where I do this work as a volunteer cannot really afford to replace the $25K scanner, and the channel that this clunker occupies is the overall bottleneck. Telling the scanner to send uncompressed images would slow the whole process down by at least 90%.
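For reference, the current pipeline looks roughly like this (a sketch; the file name is a placeholder):

from PIL import Image
from pyzbar.pyzbar import decode

im = Image.open("scan.jpg")          # cheap: reads only the JPEG header
corner = im.crop((0, 0, 150, 900))   # lazy, but...
print(decode(corner))                # ...this forces Pillow to decode the whole
                                     # 2544 x 4200 frame first (~99% of the 250 ms)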

I don't know of an existing library that can do this, but it is possible to modify jpegtran.c and jdtrans.c from the IJG C library to read only as many MCU rows as necessary when cropping, and in your specific case the decode time should be reduced by ~75% (only 900 of the 4200 lines are needed).
Since you are using Python, you could obtain a suitably cropped JPEG with:
os.popen("jpegtran -crop 152x904+0+0 %s" % inputfile).read()
(The crop dimensions are rounded up from 150 × 900 to multiples of the JPEG block size.) This assumes the input JPEG is not progressive.
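For completeness, a sketch of how the cropped stream could feed the existing pipeline without modifying the C sources (this assumes jpegtran is on the PATH; subprocess is used rather than os.popen so the JPEG bytes come back in binary form, and whether this saves enough time depends on how much of the file the stock jpegtran still has to parse):

import io
import subprocess
from PIL import Image
from pyzbar.pyzbar import decode

# Lossless crop in the compressed domain; only the top-left blocks survive.
jpeg_bytes = subprocess.run(
    ["jpegtran", "-crop", "152x904+0+0", "scan.jpg"],
    check=True, capture_output=True).stdout

corner = Image.open(io.BytesIO(jpeg_bytes))   # small image, cheap to decode
print(decode(corner))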

Related

Split HDMI Image to 3D Projector System [closed]

I am quite stuck on finding a board, or anything else, that could fit my needs.
I made a dual-projector 3D system at home, like this: http://www.cinema3dglass.com/Dual_projector_3D_polarization_system.php
The HDMI signal can carry the 3D image in eight different formats (left/right, above/below, etc.), all listed here: https://www.tridef.com/user-guide/3d-file-formats
So the images on the incoming HDMI port can be arranged as above, and I need to split them onto two separate HDMI outputs according to the format, so I can plug the outputs into the projectors.
Basically I need a device like the one in the first image at the first link above, titled "HDMI Distribution Amplifier & EDID Emulator".
I know an Arduino can't handle this amount of processing, because I have overloaded one with far simpler tasks.
Can anyone help me with where to start? I found the Panda development board, but that's too expensive.
Or, if a not overly expensive device already exists for this task, I could buy that.
I managed to use the TriDef 3D system, but it's hard to get working.
I'd like my device to take its input from a Chromecast 2.0, but if that's not possible, a normal player will do.
I found some devices called HDMI demultiplexers that simply cut away one half of the input, but they are quite expensive at $260 apiece, and two would be needed.
Thanks in advance.
Page 56 of the HDMI specification shows how the transfer/interconnection is wired (the diagram is not reproduced here).
I would start with the interleaved left/right format, where even pixels are left and odd pixels are right, because there is a high chance that it needs no FIFO at all. If you want the standard left/right format you need a single-line FIFO per channel, and for up/down a full-image FIFO. If your hardware supports a variable clock, then this simplified design should work:
You need to add an H/V sync decoder on Channel0 to reset the binary counter. The binary counter tracks which data address is currently being processed. The single data line feeding the AND gates should be D1 at half the input clock, but not always: you have to toggle between D0 and D1 depending on the timing of the data being processed (D1 for pixels, D0 for other data); that is the variable clock I mentioned above. The comparator simply compares the address against predefined constants (half a line for the non-interleaved left/right format, or an even/odd test for the interleaved format; both must account for the offsets of the other data). Beware that the transfer is in bits, not bytes, so the address is multiplied by the number of bits per data chunk. The gates just steer the clock between the left and right parts, and the latches make sure the output signals do not mix while also boosting them.
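Before touching hardware, the counter/comparator/gating logic can be sanity-checked in software (a sketch in Python; a real implementation would of course live in logic or an FPGA):

def demux_interleaved(line_pixels):
    # Binary counter: 'address' counts pixels as they arrive.
    # Comparator: the even/odd test decides which output the gates route to.
    left, right = [], []
    for address, pixel in enumerate(line_pixels):
        if address % 2 == 0:
            left.append(pixel)     # gate the clock to the left latch
        else:
            right.append(pixel)    # gate the clock to the right latch
    return left, right

# one scanline of interleaved left/right pixels
l, r = demux_interleaved(["L0", "R0", "L1", "R1", "L2", "R2"])
print(l, r)   # ['L0', 'L1', 'L2'] ['R0', 'R1', 'R2']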
I would start with oscilloscope measurements of the channels so you can see how the data is transferred, and then experiment. If you use an FPGA you do not need to make any changes to the board while experimenting with configurations, as the circuit lives entirely inside the FPGA.
If a variable clock is not supported, then you need a FIFO and/or RAM to store a full line/image and then send the appropriate parts to their connectors. For that you most likely need full decoding capability, so use the SIL9134 + SIL9135 pair. Halving the resolution will introduce timing problems, because you need more time to send a half frame at half speed than a full frame at full speed (the auxiliary and sync data is copied, not halved). If the transmission has big enough gaps you could fit the missing time there, but not all hardware supports that without losing sync, flickering, etc. In such a case you could switch to a slightly smaller resolution (after halving) to fit within the send time, or enlarge the full-resolution input (in the x axis).
Good luck with your quest.

Why is bandwidth measured in bits per second? [closed]

There is another question with this same title, but it asks something different from what's troubling me, and the answer is not sufficient.
The most prominent analogies I hear to explain bandwidth are the highway example and the pipe example. In the highway example, bandwidth is the number of cars that can drive along the highway in a given amount of time; in the pipe example, it's the amount of water that can flow through.
My question is: by measuring in cars per second, or liters per second, does that mean that a longer highway, pipe, or copper wire has a higher bandwidth than a shorter one? That seems strange to me.
Wouldn't it make more sense to give the highway's bandwidth as the number of lanes it has, irrespective of a unit of time? It seems simpler to say that the pipe is "1 foot in diameter" rather than that "it carries 100 liters per second".
Why do we measure bandwidth in bits per second and not just in bits?
"My question is - by measuring by cars per second, or liters per second, does that mean that a longer highway, pipe or copper wire has a higher bandwidth than a shorter one?"
No!
Bandwidth is not about how many cars can fit on the road. It's about how many cars can pass a point on the road during a certain time: how many cars per second can pass under a bridge, for example.
No, it wouldn't. You quote a highway in terms of lanes because it's more understandable, and it's a reasonable approximation that 4 lanes carry 4x as much traffic. But even then, you might have a traffic jam, in which case 4 lanes are 'transmitting' fewer cars per minute than they otherwise would.
With a hose pipe, the width of the pipe determines the rate of flow, if you assume the same water pressure.
These assumptions don't apply to communications: when I transmit 'a bit', nothing physical moves*. A 'bit' is the smallest piece into which 'information' can be broken down, and in order to transmit it, something needs to change.
If I turn on my torch and shine it at you, I've sent one 'message' (my torch is on). To send you anything more detailed, I need to turn it off and on again; Morse code is an example of doing this. The pattern of switching it off and on gives you letters. How fast I can switch it off and on again is how fast I can send a message.
So it is with bandwidth. I need to change things to communicate; if I can change things faster, I can communicate faster.
"Bits" would be a measure of the number of torches I own. Bits per second is how fast I can flick them on and off to send a message.
* Electrons and photons do move, as does the air that carries sound. But the signal isn't the thing that moves: I don't have to move an atom of air from my mouth to your ear to 'talk' to you; the wave propagates through the medium.

Morphology operations using Matlab [closed]

Here is the problem:
A camera takes an image I of a penny, a dime, and a quarter lying on a white background, and the coins do not overlap. Suppose that thresholding successfully creates a binary image B with 1 for the coin regions and 0 for the background.
You are given the known diameters of the coins, d_p, d_d, and d_q, in pixels (note that d_d < d_p < d_q). How do I use morphology operations (dilation, erosion, opening, and closing) and logical and set operations (AND, OR, NOT, and set difference) to produce three binary output images P, D, and Q, where P contains just the penny, D just the dime, and Q just the quarter?
Can anyone give some code or hints? Thanks in advance!
This obviously looks like homework, so I won't write any code for you, but I'll give you some hints to push you in the right direction. The situation you describe is highly idealized and not reflective of real-world situations... which is actually great, as it makes the coding a lot simpler. I'm going to assume that the picture was taken from directly above the coins and not at an angle.
You already know the diameters of the coins, and because the diameters are given in pixels, the problem becomes a whole lot easier. Specify three circular structuring elements, one matching the diameter of each coin.
First do a morphological opening on B using the largest structuring element, the quarter-sized one. Opening is an erosion followed by a dilation. One thing you should know about erosion is that any object smaller than the structuring element disappears, while larger objects retain some of their pixels. So by doing the opening, you remove the penny and the dime, while the quarter is fully reconstructed. A good property of opening is that if the structuring element is no larger than the object itself, the opening keeps the object intact, provided the structuring element and the object share more or less the same shape. Because your structuring element is circular and so are the coins, we're good to go. This is your first image, Q.
Next, use the second largest structuring element, the penny-sized one, and do an opening on the original image B. Now the dime disappears while the quarter and the penny remain. Take the set difference between this result and Q; what is left is just the penny, and so this is P.
Finally, for the dime you don't need any morphology at all. Do a logical OR to combine the quarter Q and the penny P into one image, then take the set difference between the original image B and this combination. That isolates the dime, which is D.
This should be enough to get you started. Good luck!
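For checking a finished solution, the same three steps can be expressed in a few lines (a sketch in Python with scikit-image rather than MATLAB; B, d_p, d_d, and d_q are assumed to be defined as in the problem statement):

import numpy as np
from skimage.morphology import binary_opening, disk

# B: boolean image; d_p, d_d, d_q: coin diameters in pixels (d_d < d_p < d_q).
# Radii are trimmed slightly so each coin just survives its own opening.
Q = binary_opening(B, disk(d_q // 2 - 1))        # only the quarter survives
P = binary_opening(B, disk(d_p // 2 - 1)) & ~Q   # quarter + penny survive; remove Q
D = B & ~(P | Q)                                 # whatever is left is the dime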

OpenGL double precision [closed]

In the meantime, is there a way to tell MATLAB or ParaView or any other application that uses OpenGL to do its rendering in double precision? I could use a workaround for my problems, but I would prefer not to :) Thanks!
EDIT:
Let me try to be more specific about the problem. I compared two renderings (the images are not reproduced here): the first was rendered using OpenGL; the second (the clean one) was rendered after typing "opengl neverselect", which switches MATLAB to another renderer. Since I experience quite similar rendering problems in ParaView as well, I am fairly sure that this is OpenGL-specific and not the "fault" of MATLAB or ParaView. When I shift the values as mentioned in the comment below, I also get smoothly rendered images. I assume that is because my data range has a huge offset from zero, so the precision in the rendering routine is not sufficient and serious rounding errors accumulate in the rendering calculations.
Thus, I would like to know if there is some way (in MATLAB, in ParaView, or in the OS settings) to raise the rendering precision (I read that GPUs/OpenGL usually calculate in single-precision float).
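To illustrate what I suspect is happening, here is a quick numerical sketch (using Python/numpy; the magnitudes are invented):

import numpy as np

detail = np.linspace(0, 1e-3, 5)         # fine structure, ~1e-3 in size
offset = 1e7                             # huge constant offset, as in my data

as_float32 = (detail + offset).astype(np.float32)
print(as_float32 - np.float32(offset))   # the detail is gone: float32 has only
                                         # ~7 significant digits, so 1e-3 riding
                                         # on 1e7 is below the rounding step
print(detail)                            # intact in float64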
First off, this has nothing to do with OpenGL. The part of MATLAB actually doing the plotting is written in some compiled language, and relies on OpenGL just for displaying stuff to the screen.
The precision used (double/float) is hard coded into the program. You can't have the OS or something force the program to use different data types. In certain cases you might be able to make the relevant changes to the source code of a program and then recompile, but this doesn't sound like it is applicable in your case.
This doesn't mean that there isn't a way to do what you want in MATLAB. In fact, since the program is specifically designed to do numeric computation there almost certainly is a way to specify the precision. You would need to provide more detailed information on your issue (screenshot?) if you want to get further guidance.

Separation of singing voice from music [closed]

I want to know how to perform "spectral change detection" for classifying the vocal and non-vocal segments of a song. We need to find the spectral changes from a spectrogram. Any elaborate information about this, particularly involving MATLAB, would be welcome.
Separating distinct signals out of audio is a very active area of research, and it is a very hard problem. In the literature this is often called Blind Signal Separation. (There is some MATLAB demo code at the previous link.)
Of course, if you know that there are vocals in the music, you can use one of the many vocal separation algorithms.
As others have noted, solving this problem using only raw spectrum analysis is dauntingly hard, and you're unlikely to find a good solution. At best, you might be able to extract some of the vocals and a few extra crossover frequencies from the mix.
However, if you can be more specific about the nature of the audio material you are working with, you might be able to get a little further.
In the worst case, your material consists of normal MP3s of regular songs, i.e. a full band plus vocalist. I have a feeling that this is the case you are looking at, given the nature of your question.
In the best case, you have access to the multitrack studio recordings and have at least a full mixdown and an instrumental track, in which case you could extract the vocal frequencies from the mix. You would do this by generating an impulse response from one of the tracks and applying it to the other.
In the middle case, you are dealing with simple music to which you can apply an algorithm tuned to the parameters of that music. For instance, with electronic music you can use the stereo width of the track to your advantage: eliminate all mono elements (i.e. basslines and kicks) to extract the vocals and other panned instruments, and then apply some type of filtering and spectrum analysis from there, as sketched below.
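A sketch of that mono-elimination trick (assuming Python with numpy and the soundfile package, and a track where the vocals are not dead-center):

import numpy as np
import soundfile as sf

stereo, rate = sf.read("track.wav")     # shape: (n_samples, 2)
left, right = stereo[:, 0], stereo[:, 1]

side = (left - right) / 2               # identical (mono/center) content cancels here
sf.write("side_only.wav", side, rate)   # what survives is the panned material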
In short, if you are planning to build an all-purpose algorithm that generates clean a cappella cuts from arbitrary source material, you're probably biting off more than you can chew. If you can limit your source material, then you have a number of algorithms at your disposal, depending on the nature of those sources.
This is hard. If you can do this reliably, you will be an accomplished computer scientist. The most promising method I have read about used the lyrics to generate a voice-only track for comparison. Again, if you can do this and write a paper about it, you will be famous (among computer scientists). Plus, you could make a lot of money by automatically generating timings for karaoke.
If you just want to decide whether a block of music is clean a cappella or has an instrumental background, you could probably do that by comparing the bandwidth of the signal to that of a typical human singer. You could also check the fundamental frequency, which lies in a fairly limited range for human voices.
Still, it probably won't be easy. However, hearing aids do this all the time, so it is clearly doable (though they typically look for speech, not singing).
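A sketch of such a check (assuming Python with numpy and the soundfile package; the 85-1100 Hz range for singing fundamentals is a rough rule of thumb, not a hard spec):

import numpy as np
import soundfile as sf

block, rate = sf.read("segment.wav")
if block.ndim > 1:
    block = block.mean(axis=1)            # mix down to mono

spectrum = np.abs(np.fft.rfft(block))
spectrum[0] = 0                           # ignore the DC component
freqs = np.fft.rfftfreq(len(block), d=1 / rate)

peak = freqs[np.argmax(spectrum)]         # dominant frequency of the block
print("voice-like" if 85 <= peak <= 1100 else "probably instruments present")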
First sync the instrumental with the original: make sure they are the same length and bitrate, start and end at exactly the same time, and convert both to .wav.
Then do something like:
[I, fs] = wavread('instrumental.wav');  % instrumental-only mix
N = wavread('normal.wav');              % full mix (vocals + instruments)
A = N - I;                              % the instrumental cancels, leaving the vocals
wavwrite(A, fs, 'acapella.wav');
% If the result sounds wrong, the tracks are probably misaligned; even a
% one-sample offset will ruin the cancellation.
That should do it... a little sample-wise arithmetic goes a long way.