Which coding trees you know about? - encoding

I have read and understood huffman coding to some extent.
I did some googling but was unable to find other coding TREES.
I need to make comparision of HuffMan coding TREE with other coding TREES.
All I need are names of coding trees and if u can provide there small description it would be really helpful..
regards,
Aqif

Shannon-Fano was a predecessor of Huffman coding (Fano was Huffman's professor).
Hu-Tucker is another method.
Context tree weighting method (CTW) is a lossless compression and prediction algorithm.
WBTC (wavelet block tree coder) algorithm is used for image coding.
Block-zero tree coding (BZTC) algorithm is also used for image coding.
Some of these are in Wikipedia, and some from plain old Google searching.

Related

Is adding Huffman tree or frequency table needed?

I am learning about encoding and decoding and my instructor have a question for my homework?
That is: Are we still able to decode the Huffman compressed files without adding Huffman tree or the
frequency table to the files or not?
You need some sort of description of the Huffman code. However that description does not have to be the frequencies or the tree. Typically a canonical Huffman code is used, and all that is needed to describe the code is the bit lengths of the code for each symbol.

Difference between cvPOSIT and cvFindExtrinsicCameraParams2

Another OpenCV question;
Without me having to implement 2 versions - can anyone enlighten me to what the differences are between cvPOSTIT and cvFindExtrinsicCameraParams2 and maybe the advantages of each.
The inputs and outputs appear to be the same.
From my experience, cvFindExtrinsicCameraParams2() works for coplanar points (so it is probably an implementation of http://dl.acm.org/citation.cfm?id=228149), while cvPOSIT() doesn't. But I am not 100% sure.
It appears that cvPOSIT() only exists in OpenCV's old C API and not in the new C++ API. Conversely, cvFindExtrinsicCameraParams2() is in both. While not a perfect indicator, my best guess is that they both implement the POSIT algorithm with minor modifications and the former exists only for legacy reasons.
Beyond that, your guess is good as mine. If you want a definitive answer, I suggest asking on the OpenCV mailing list.
I've used cvPOSIT already. It only works on 3D non-coplanar points on the object. Because it bases on the algorithm from "DAVIS, D. F. D. A. L. S. 1995. Model-Based Object Pose in 25 Lines of Code". So you will have to find a way around for coplanar features
With cvFindExtrinsicCameraParams2(), it also works on planar features, solve the transformation using cvFindHomography and then refine the result by levenberg-marquardt approximation. For non-coplanar points, the preprocessing is done by a different method DLT (Direct Linear Transformation) (not ".. 25 lines of Code" article anymore)
I'm not pretty sure about thier performance, which one is faster. As I know, ".. 25 lines of code" is very fast, and suitable for realtime vision up to now.

Audio File Matching Program

I'm trying to write a program in iPhone than can take two audio files (e.g. WAV) as inputs, compare them, and spit out a number that tells you how similar the audio files are.
If someone has done something like this, know how to go about doing it, or just have some ideas, please let me know. Anything will be greatly appreciated.
Specific questions: What language is suitable? How hard is it to do (how many
hours, roughly)? Where can I find a good source of audio library/tools?
Thanks!
I'd say it's pretty hard, not so much the implementation, but coming up with a reasonable definition of 'similar'.
That said, you're probably looking at techniques like autocorrelation and FFT, both of which are CPU-intensive tasks, so I'd say a fully-compiled language (C, C++, don't know about Objective-C) would be most suitable at least for the actual calculations. Also, you're facing a somewhat underpowered platform for such tasks (if only because uncompressed audio files are pretty large), so you're in for quite some optimization.
This book: http://www.dspguide.com/ is quite concise reading for all things DSP-related.
Sounds similar to what 'Shazam' does - awesome iPhone app by the way, check it out if you haven't already (it's free too).
A while ago there was an article on how Shazam works, read it here. It takes an acoustic fingerprint and compares it to other songs' fingerprints, returning the closest match.
I would say there is a lot of math, probably some matrices and maybe Fourier transforms involved in fingerprinting and then trying to compare the audio.
-
Probably would take a good while to program. If your math skills are up to it though, sounds like a good challenge :-)
-
EDIT: turns out there was some source code on the site I linked. It's in Java but would be well worth a look through before you start writing your own. Source code here
I am working on something similar in Java on a speech recognition app.
I would recommend using MFCC (requires calculating FFT) for feature extraction and Neural Networks or some other sort of machine learning technique for training and recognition. You train the NN with the features extracted from the reference wav file, more precisely from consecutive equal lenght slices/windows of that audio file. Then you use the NN to detect if another file, also split into slices, has the same features.
This is the basic idea upon which you can elaborate to further your own specifications, or exactly what you want your app to do.
In terms of libraries in Objective C I think you can find a few for the signal processing part (FFT and such) as for the machine learning part I have no idea about what you could find.
As for programming time it's hard to estimate because it depends on a lot of details. I would say somewhere about a week, but that's just a fair estimation.
ps: MFCC stands for Mel-Frequency Coeficients: http://en.wikipedia.org/wiki/Mel-frequency_cepstrum

Your experiences with Matlab/F#/R for data analysis and modeling algorithms

I've been using F# for a while now to model algorithms before coding them in C++, and also using it afterwards to check the results of the C++ code, and also against real-world recorded data.
For the modeling side of things, it's very handy, but for the 'data mashup' kind of stuff, pulling in data from CSV and other sources, generating statistics, drawing charts etc., my colleague teases me no end ("why are you coding that yourself? It's built in to MatLab").
And I have another colleague who swears by R, which also has charting stuff 'built-in'.
I know that MatLab, R and F# are not strictly comparable, so I'm not asking for a 'feature comparison shoot out'. I just wondered what other people are using for these kind of pre- and post-analysis scenarios, and how happy they are with it.
(If there's anyone out there working on wrapping Microsoft Charts into something F#-friendly, let me know, I'd be happy to participate...)
(Note: answers to this question will be subjective, but based on experience, please)
I have very little experience with F#, but regarding C++/Matlab/R: If the speed of your program's execution is the most important, use C++. If speed of implementation is the most important, use Matlab or R. This is true for a number of reasons, not the least of which is their massive libraries of math/stats packages.
Both Matlab and R can be sped up through parallelism: so generally, I think that speed and quality of implementation should be a bigger concern. That's where the real "value" of programming is taking place, in the design of the application. It's not a minor proposition if you can write 3 or 4 good R programs in the same time it takes you to write 1 good C++ program.
Regarding F#: so far as it is part of Microsoft's framework, it must have a lot to offer. If you're developing in Visual Studio or working on a big .Net project (for instance), it might make sense to use F#. On the other hand, you can call both Matlab and R from .Net applications, so I would probably argue that their libraries should be a bigger concern. For instance, see this article as an example for R and the Matlab Builder.
Long story short: comparing F# and Matlab/R isn't a good comparison. F# is a general purpose programming language, while Matlab/R can be viewed as massive mathematical/data analysis toolkits. Some people call Matlab or R from F# in order to take advantage of each language's benefits (e.g. see this discussion, this article on Matlab/F#, or this article on R/F#).
So far as charting is concerned: R is extremely strong on this front. Have a look at the graphics view on CRAN and this series of posts on the LearnR blog about Lattice and ggplot2.
I've worked a bit with matlab and python/pylab for these purposes. What these tools have 'built-in' is a programming environment, a shell, and gui tools designed for quickly looking at data from a variety of sources.
In a few commands, you can go from having a csv file to interactive plots on the screen, then to an image export in just about any format. It takes a minute or two to go from data to visualization once you have the hang of it. I would imagine this is uncommon in the C++ world (although I have seen some professors with pretty impressive work-flows).
I've tried R, but I can't say much useful about it. It seems to offer about the same set of features, but it may be troublesome to Google for support.
If you are spending more than a couple minutes getting from data to plot using your current method, it's definitely worth learning one of these environments. The best choice depends on your colleagues, your work environment, experience, and your budget.
This is a reasonable close double to the previous question on suitable functional language for scientific/statistical computing so you may want to peruse the long and detailed answers there.
Answers depends, as so often, on your experience and prior language training. I very much prefer R for data munging / modeling / visualization.
I use R because on the one hand it has everything built in and on the other hand you can still manipulate almost everything or start from scratch. Nevertheless, R is rather slow for heavy calculations (although I do all my Monte Carlo simulations in it).
I would say that Matlab is best for the availability of mathematical functionalities in general, R is best for data input/manipulation/visualisation/analysis/etc., and C++ for high-speed subroutines. You can by the way easily integrate C++ (or C, fortran, ...) code in R. Why not read and manipulate input data in R, apply the models in C++, and analyse/visualize output back in R?
I always prototype my models in MATLAB. If my prototype is fast enough, I refactor and it's done. If not, I go back and implement certain functions in C to be called by MATLAB. This requires knowledge of a low level language, which I think is always going to be the case if you are doing anything that is technically challenging.
I'm intrigued with this Lisp flavor if it ever gets off the ground.

Efficient way to fingerprint an image (jpg, png, etc)?

Is there an efficient way to get a fingerprint of an image for duplicate detection?
That is, given an image file, say a jpg or png, I'd like to be able to quickly calculate a value that identifies the image content and is fairly resilient to other aspects of the image (eg. the image metadata) changing. If it deals with resizing that's even better.
[Update] Regarding the meta-data in jpg files, does anyone know if it's stored in a specific part of the file? I'm looking for an easy way to ignore it - eg. can I skip the first x bytes of the file or take x bytes from the end of the file to ensure I'm not getting meta-data?
Stab in the dark, if you are looking to circumvent meta-data and size related things:
Edge Detection and scale-independent comparison
Sampling and statistical analysis of grayscale/RGB values (average lum, averaged color map)
FFT and other transforms (Good article Classification of Fingerprints using FFT)
And numerous others.
Basically:
Convert JPG/PNG/GIF whatever into an RGB byte array which is independent of encoding
Use a fuzzy pattern classification method to generate a 'hash of the pattern' in the image ... not a hash of the RGB array as some suggest
Then you want a distributed method of fast hash comparison based on matching threshold on the encapsulated hash or encoding of the pattern. Erlang would be good for this :)
Advantages are:
Will, if you use any AI/Training, spot duplicates regardless of encoding, size, aspect, hue and lum modification, dynamic range/subsampling differences and in some cases perspective
Disadvantages:
Can be hard to code .. something like OpenCV might help
Probabilistic ... false positives are likely but can be reduced with neural networks and other AI
Slow unless you can encapsulate pattern qualities and distribute the search (MapReduce style)
Checkout image analysis books such as:
Pattern Classification 2ed
Image Processing Fundamentals
Image Processing - Principles and Applications
And others
If you are scaling the image, then things are simpler. If not, then you have to contend with the fact that scaling is lossy in more ways than sample reduction.
Using the byte size of the image for comparison would be suitable for many applications. Another way would be to:
Strip out the metadata.
Calculate the MD5 (or other suitable hashing algorithm) for the
image.
Compare that to the MD5 (or whatever) of the potential dupe
image (provided you've stripped out
the metadata for that one too)
You could use an algorithm like SIFT (Scale Invariant Feature Transform) to determine key points in the pictures and match these.
See http://en.wikipedia.org/wiki/Scale-invariant_feature_transform
It is used e.g. when stitching images in a panorama to detect matching points in different images.
You want to perform an image hash. Since you didn't specify a particular language I'm guessing you don't have a preference. At the very least there's a Matlab toolbox (beta) that can do it: http://users.ece.utexas.edu/~bevans/projects/hashing/toolbox/index.html. Most of the google results on this are research results rather than actual libraries or tools.
The problem with MD5ing it is that MD5 is very sensitive to small changes in the input, and it sounds like you want to do something a bit "smarter."
Pretty interesting question. Fastest and easiest would be to calculate crc32 of content byte array but that would work only on 100% identical images. For more intelligent compare you would probably need some kind of fuzy logic analyzis...
I've implemented at least a trivial version of this. I transform and resize all images to a very small (fixed size) black and white thumbnail. I then compare those. It detects exact, resized, and duplicates transformed to black and white. It gets a lot of duplicates without a lot of cost.
The easiest thing to do is to do a hash (like MD5) of the image data, ignoring all other metadata. You can find many open source libraries that can decode common image formats so it's quite easy to strip metadata.
But that doesn't work when image itself is manipulated in anyway, including scaling, rotating.
To do exactly what you want, you have to use Image Watermarking but it's patented and can be expensive.
This is just an idea: Possibly low frequency components present in the DCT of the jpeg could be used as a size invariant identifier.