Trying to recreate cocktail party algorithm in Matlab, Wrong results? - matlab

Using this sound file: http://www.ism.ac.jp/~shiro/research/sounds/RSM/X_rsm2.wav
I'm trying to recreate Andrew Ng's Machine Learning presentation (https://class.coursera.org/ml-005/lecture) from Coursera in MATLAB.
First I read a .wav file (16 kHz, 7 s, 2 channels):
[x,xfs] = wavread('track.wav')
Now I transpose x
x = x'
Now I proceed to use x on the cocktail party algorithm
[W,s,v] = svd((repmat(sum(x.*x,1),size(x,1),1).*x)*x')
MATLAB returns:
W =
-0.9233 -0.3841
-0.3841 0.9233
s =
265.4832 0
0 13.0768
v =
-0.9233 -0.3841
-0.3841 0.9233
Where is the separated audio?
EDIT: From further research, I found out that W is only the unmixing matrix. Meaning this algo is incomplete if my goal is to get the two output separated sound sources. What do I do with this unmixing matrix?

I believe you want to apply the unmixing matrix W you found through SVD to the mixed signals x. This can be done simply as follows:
sigs = W*x;
Now sigs(1,:) will be one of the separated signals and sigs(2,:) will be the other.
Good luck.
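A minimal sketch of that last step, assuming the 2 x N layout from the question. The synthetic sources and the mixing matrix A below are hypothetical stand-ins for the real recording, and whether W*x gives a clean separation for a given file is not something this sketch establishes; it only shows how the unmixing matrix is applied and how the result is scaled for playback.

```matlab
% Synthetic stand-in for the two-channel recording (assumption: 2 x N layout)
N = 1000;
t = (0:N-1) / 16000;
src = [sin(2*pi*440*t); sign(sin(2*pi*97*t))];  % two hypothetical sources
A = [1 0.5; 0.3 1];                             % hypothetical mixing matrix
x = A * src;                                    % 2 x N mixed signals

% One-liner from the question: W is a 2 x 2 matrix
[W, s, v] = svd((repmat(sum(x.*x, 1), size(x, 1), 1) .* x) * x');

% Apply the unmixing matrix: one row per estimated source
sigs = W * x;

% Scale each row to [-1, 1] so it can be played or saved without clipping
sigs = bsxfun(@rdivide, sigs, max(abs(sigs), [], 2));
```

With real data, each row can then be auditioned separately, for example with sound(sigs(1,:), xfs).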

I believe you're running out of memory because you're trying to repmat across the wrong dimension (or possibly your x variable needs to be transposed). Loading x as you have gives you a variable of size:
>> size(x) = [110000, 2]
Of course, if you try to repmat this as you have, you're essentially telling MATLAB to:
repmat(x,110000,1);
If you do the math, you're trying to create a variable of size [12100000000, 2]. That's 12 billion if you can't be bothered counting the zeros. A single double value in MATLAB is 8 bytes in size, so you're trying to create a variable that would use 12100000000*8*2 bytes = ~200 GB. Chances are you don't have this much memory, hence why MATLAB isn't letting you.
Long story short, try transposing x before repmatting it.
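As a rough check on why the orientation matters, the size of the final product in the question's expression can be tallied for both layouts. This is a sketch assuming 110000 samples and 2 channels, as stated above; the exact total depends on which intermediate array is counted.

```matlab
N = 110000;          % samples (figure taken from the answer above)
C = 2;               % channels
bytesPerDouble = 8;

% x as loaded (N x C): (repmat(...) .* x) is N x C, and multiplying by
% x' (C x N) yields an N x N matrix.
bytesUntransposed = N * N * bytesPerDouble;

% x transposed (C x N): (repmat(...) .* x) is C x N, and multiplying by
% x' (N x C) yields a C x C matrix.
bytesTransposed = C * C * bytesPerDouble;

fprintf('untransposed: %.1f GB, transposed: %d bytes\n', ...
        bytesUntransposed / 1e9, bytesTransposed);
```

Either way the untransposed layout asks for tens of gigabytes, while the transposed one needs a 2 x 2 matrix.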

Related

Limiting large array to 1D in MATLAB

I'm working with XShooter data and for galactic corrections, I'm using ccm_unred in MATLAB. The problem is
funred = flux*10.^(0.4*A_lambda);
this line of code generates a 29686 X 29686 double array. I want only one side of it, I can do it by reassigning funred as funred = funred(:,1) but this piece of code also takes 57 seconds to be executed and uses up my CPU and RAM too much for my laptop to stay stable. Is there any method by which I can limit the generation of funred to only (:,1) from the beginning?
You say that your code generates a 29686 x 29686 matrix, yet you are doing element-wise operations in your equation. That means that either flux or A_lambda must be 29686 x 29686. Just slice whichever one is that size!
Assuming one of them is 29686 x 29686:
funred = flux(:,1)*10.^(0.4*A_lambda(:,1));
Just remove the (:,1) from the one that is not a matrix.
If both of them are matrices, then you cannot do it, as flux*... would need the whole matrix to operate.
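Another reading is worth checking with size(flux) and size(A_lambda): in MATLAB, * is matrix multiplication, so a 29686 x 1 column times a 1 x 29686 row is an outer product and yields the full 29686 x 29686 array even though both inputs are vectors. The sketch below assumes that layout and assumes the intent is a per-wavelength correction; the small n and the sample values are hypothetical.

```matlab
n = 5;                                  % small stand-in for 29686
flux = (1:n)';                          % hypothetical n x 1 column
A_lambda = linspace(0.1, 0.5, n);       % hypothetical 1 x n row

% Matrix product of a column and a row: n x n outer product
outer = flux * 10.^(0.4 * A_lambda);

% Element-wise product with both operands forced to columns: n x 1
funred = flux(:) .* 10.^(0.4 * A_lambda(:));
```

Under these assumptions the element-wise result equals the diagonal of the outer product, not its first column, and the large intermediate array is never created.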

Better dataformat than .mat v7.3 for import from Matlab 2016 to Mathematica 11?

I am trying to import data from Matlab 2016a/2016b-prerelease to Mathematica 11 with local storage
preferable complete documentation
expansible to big data so binary; now, 1-100 GBs, but TBs needed later
to convert from Mathematica 11 to Matlab 2016 - is also a plus if the data format can be a bijection; so understanding to read the data format in Matlab 2016a/b and Mathematica 11
...
Data and Quality Assurance
Full data is x-y plane which has 650 000 x 650 000 points so dimensions are about a square matrix.
Take any subset of the data etc 6500 x 6500 for your examples.
Save datafile and .tiff image.
Steps
Specify any test -data (presenting a 2D-image), and Specify filenames of .tif and datafile files
Quality assurance that you get a figure in .tif
Store data in your chosen dataformat
Quality assurance that you can reload datafile back into Matlab
Import datafile in Mathematica
Open/process datafile in Mathematica
Differential solutions
Challenges with Matlab's v7.3
follows HDF5 unconventionally without complete documentation; so you cannot do directly Attempt 1 below
I do not understand how unconventional the data format really is, so please also propose a stable method to work with v7.3
not sure about binary and expansibility to big data
Related: How to get Matlab data imported with the same dimensions?, How does import data from matlab into mathematica? or how to import matlab code that can be run in mathematica?, MAT-File Level 5 File Format, ...
Challenges of automatic approaches
I think not suitable because I want local storage, etc here.
Matlab's v7 .mat deprecated against v7.3
Matlab's v5 and v4 .mat - I think deprecated against v7.3
My attempts of Work flow for Diff Condition (1)
Some specification: specify datafile as .mat file of v7.3
Attempt 1
Matlab
% Specify test data here
time = 0:0.001:1; potential = sin(time);
C = spectrogram(potential); C = reshape(C, 1, []); C = nthroot(abs(C(1,1:1001)), 1);
hFig = figure(); hax = axes(hFig); imagesc(time, potential, C);
filename=fullfile('/home/masi/Images/test');
filenameMat=fullfile('/home/masi/Images/test.mat');
export_fig(filename, '-tif', '-q101', '-a1', '-m1', '-RGB', '-nofontswap', '-nocrop', '-transparent', '-dpng', hax);
save(filenameMat,'time', 'potential', 'C', '-v7.3');
Mathematica where #1-2 both succeed with import of many variables
(* http://mathgis.blogspot.fi/2010/09/tips-import-matlab-mat-files.html *)
(* https://mathematica.stackexchange.com/a/10589/9815 *)
(* #1 Succeeds; select specific data sets *)
mma = Import["~/Images/test.mat", {"HDF5", "Datasets", "/time"}];
(* #2 Succeeds: Out {"/C", "/potential", "/time"} *)
mma = Import["~/Images/test.mat", {"HDF5", "Datasets"}];
(* Output: {{1.}, {1.5}, {2.}} *)
Output: steps (1-4) succeeds but import of datafile (step 5) fails in Mathematica 11, see the error message above.
Reading the data in Mathematica where Flatten is used to remove one set of braces because one set too much
(* https://stackoverflow.com/a/16834090/54964 *)
SetDirectory["Desktop"]
a = Import["m.mat"] ;
(* https://mathematica.stackexchange.com/a/97252/9815 *)
a=Partition[Flatten[a], 5000]
(* Output fails: {} *)
Studying agentp's answer
He is using simply a square matrix.
I have the data in three variables: time, potential and C, fitting imagesc()'s parameters.
Do square matrix of the vectors time m x 1 and potential n x 1. How can you apply the vector C in the square matrix A? I do not understand the mathematics here sufficiently to answer the question myself.
% time's dimensions: m x 1
% (potential')'s dimensions: 1 x n
time = 0:0.001:1; potential = sin(time); A = time' * potential;
% Output: A is an m x n matrix, as expected.
% C is a 1 x m vector here.
C = spectrogram(potential); C = reshape(C, 1, []); C = nthroot(abs(C(1,1:1001)), 1);
How can you convert the square matrix A(C) back to those three variables? - - A(C) is about the square matrix where the vector C has been applied on the square matrix A. I do not understand the mathematics behind it to create the result.
How can you keep those pieces of data separate as one binary? - - This may not possible but I want to understand the standards currently.
Matlab: 2016a, 2016b prerelease
Mathematica: 11
OS: Debian 8.5
Related: Is there a way to import the results or data from Matlab to Mathematica automatically?
an example of raw binary file exchange from matlab to mathematica:
matlab:
mat = [ pi 2*pi 3*pi ; 1 sqrt(2) sqrt(3) ]
f=fopen('out.bin','w')
fwrite(f,size(mat))
fwrite(f,mat,'double')
% ... repeat for however many matrices we need to write
fwrite(f,size(mat2))
fwrite(f,mat2,'double')
...
fclose(f)
mathematica:
f = OpenRead["out.bin", BinaryFormat -> True];
size = BinaryReadList[f, "Integer8", 2];
mat = Transpose@ArrayReshape[
  BinaryReadList[f, "Real64", Times @@ size],
  Reverse@size];
(* repeat as needed to read multiple matrices *)
Close[f];
MatrixForm@mat
Note that the Reverse and Transpose are needed because MATLAB writes the data in column-major order. You could alternatively do fwrite(f,transpose(mat),'double') when you write.
Note also that this assumes a two-dimensional array. If you wanted to handle multidimensional arrays you'd also need to write length(size) to the file and so on.
for completeness, go back like this:
f = OpenWrite["out.bin", BinaryFormat -> True];
BinaryWrite[f, Dimensions[mat], "Integer8"];
BinaryWrite[f, Transpose[mat], "Real64"];
Close[f]
..
f=fopen('out.bin','r')
sz=transpose(fread(f,2))
mat=fread(f,sz,'double')
fclose(f)
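One caveat with the header: fwrite without a precision argument writes uint8, so a dimension above 255 would not survive. A hedged variant of the MATLAB side, assuming a 32-bit integer header is acceptable, is sketched below; the scratch file name comes from tempname and is only for illustration.

```matlab
mat = [pi 2*pi 3*pi; 1 sqrt(2) sqrt(3)];
fname = [tempname '.bin'];          % hypothetical scratch file

% Write: dimensions as 32-bit integers, then the data as doubles
f = fopen(fname, 'w');
fwrite(f, size(mat), 'int32');
fwrite(f, mat, 'double');           % stored in column-major order
fclose(f);

% Read back: the header first, then reshape the payload with it
f = fopen(fname, 'r');
sz = fread(f, [1 2], 'int32');      % row vector [rows cols]
mat2 = fread(f, sz, 'double');
fclose(f);
delete(fname);
```

On the Mathematica side the matching change would presumably be to read the header with "Integer32" instead of "Integer8", assuming both machines use the same byte order.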

basic - Trying to add noise to an audio file and trying to reduce errors using basic coding such as a repetition code

We were recently taught the concepts of error control coding: basic codes such as the Hamming code, the repetition code, etc.
I thought of trying out these concepts in MATLAB. My goal was to compare how an audio file plays when corrupted by noise and in the case when the file is protected by basic codes and then corrupted by noise.
So I opened a small audio clip of 20-30 seconds in MATLAB using audioread function. I used 16 bit encoded PCM wave file.
If opened in 'native' format it is in int16 format . If not it opens in a double format.
I then added two types of noises to it : - AWGN noise (using double format) and Binary Symmetric Channel noise (by converting the int16 to uint16 and then by converting that to binary using dec2bin function). Reconverting back to the original int16 format does add a lot of noise to it.
Now my goal is to try out a basic repetition code. So what I did was to convert the 2-D audio file matrix, which consists of binary data, into a 3-D matrix by adding redundancy. I used the following command:
cat(3,x,x,x,x,x) ;
It created a 3-D matrix such that it had 5 versions of x along the 3rd dimension.
Now I wish to add noise to it using bsc function.
Then I wish to do the decoding of the redundant data by removing the repetition bits using a mode() function on the vector which contains the redundant bits.
My whole problem in this task is that MATLAB is taking too long to do the computation. I guess a 30-second file creates quite a big matrix, so maybe that is why it is taking so long. Moreover, I suspect what I am doing is not the most efficient way to do it with regard to the various data types.
Can you suggest a way in which I may improve the computation times? Are there some functions which can help do this basic task in a better way?
Thanks.
(This is my first post on this site about MATLAB, so bear with me if the posting format is not up to the mark.)
Edit - posting the code here :-
[x,Fs] = audioread('sample.wav','native'); % native loads it in int16 format , Fs of sample is 44 khz , size of x is 1796365x1
x1 = x - min(x); % to make all values non negative
s = dec2bin(x); % this makes s as a 1796365x15 matrix the binary stream stored as character string of length 15. BSC channel needs double as the data type
s1 = double(s) - 48; % to get 0s and 1s in double format
%% Now I wish to compare how noise affects s1 itself or to a matrix which is error control coded.
s2 = bsc(s,0.15); % this adds errors with probability of 0.15
s3 = cat(3,s,s,s,s,s) ; % the goal here is to add repetition redundancy. I will try out other efficient codes such as Hamming Code later.
s4 = bsc(s3,0.15);% this step is taking forever and my PC is unresponsive because of this one.
s5 = mode(s4(,,:)) ; % i wish to know if this is a proper syntax, what I want to do is calculate mode along the 3rd dimension just to remove redundancy and thereby reduce error.
%% i will show what I did after s was corrupted by bsc errors in s2,
d = char(s2 + 48);
d1 = bin2dec(d) + min(x);
sound(d1,Fs); % this plays the noisy file. I wish to do the same with error control coded matrix but as I said in a previous step it is highly unresponsive.
I suppose what is mostly wrong with my task is that I took a large sampling rate and hence the vector was very big.
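A minimal sketch of the repetition-code experiment on a small random bit matrix, so it runs quickly. It swaps in rand and xor for the bsc function (which belongs to the Communications Toolbox) and uses mode(..., 3) for the majority vote along the third dimension; the matrix size and variable names are assumptions for illustration. For scale, a 1796365 x 15 x 5 array of doubles is roughly 1 GB, which may explain the unresponsive PC.

```matlab
bits = double(rand(1000, 15) > 0.5);   % stand-in for the 0/1 matrix s1
p = 0.15;                              % bit-flip probability of the channel
R = 5;                                 % repetitions (odd, so no ties)

% Encode: R copies of the bit matrix stacked along the third dimension
coded = repmat(bits, [1 1 R]);

% Binary symmetric channel: flip each bit independently with probability p
flips = rand(size(coded)) < p;
received = double(xor(coded, flips));

% Decode: majority vote along the third dimension
decoded = mode(received, 3);

% Equivalent vote by counting ones, often cheaper than mode
decodedFast = double(sum(received, 3) > R/2);

rawErrorRate = mean(flips(:));
codedErrorRate = mean(decoded(:) ~= bits(:));
```

For p = 0.15 and five repetitions, the residual bit error probability works out to about 2.7%. Processing the audio in chunks, or keeping the bits as logical instead of double, are possible ways to cut memory further.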

Forcing a specific size when using spconvert in Matlab

I am loading a sparse matrix in MATLAB using the command:
A = spconvert(load('mymatrix.txt'));
I know that the dimension of my matrix is 1222 x 1222, but the matrix is loaded as 1220 x 1221. I know that it is impossible for MATLAB to infer the real size of my matrix, when it is saved sparse.
A possible solution for making A the right size, is to include a line in mymatrix.txt with the contents "1222 1222 0". But I have hundreds of matrices, and I do not want to do this in all of them.
How can I make MATLAB change the size of the matrix to a 1222 x 1222?
I found the following solution to the problem, which is simple and short, but not as elegant as I hoped:
A = spconvert(load('mymatrix.txt'));
if size(A,1) ~= pSize || size(A,2) ~= pSize
A(pSize,pSize) = 0;
end
where pSize is the preferred size of the matrix. So I load the matrix, and if the dimensions are not as I wanted, I insert a 0-element in the lower right corner.
Sorry, this post is more a pair of clarifying questions than it is an answer.
First, is the issue with the 'load' command or with 'spconvert'? As in, if you do
B = load('mymatrix.txt')
is B the size you expect? If not, then you can use 'textread' or 'fread' to write a function that creates the matrix of the right size before inputting into 'spconvert'.
Second, you say that you are loading several matrices. Is the issue consistent among all the matrices you are loading. As in, does the matrix always end up being two rows less and one column less than you expect?
I had the same problem, and this is the solution I came across:
nRows = 1222;
nCols = 1222;
A = spconvert(load('mymatrix.txt'));
[i,j,s] = find(A);
A = sparse(i,j,s,nRows,nCols);
It's an adaptation of one of the examples here.
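A small self-contained check of this approach, using a hypothetical triplet list in place of the file contents. It also shows that the "n n 0" row from the question can be appended in memory, so the files themselves stay untouched.

```matlab
T = [1 1 5; 3 2 7];           % hypothetical [row col value] triplets
n = 4;                        % desired size, stand-in for 1222

A = spconvert(T);             % size inferred from the largest indices: 3 x 2

% Rebuild with explicit dimensions
[i, j, s] = find(A);
B = sparse(i, j, s, n, n);

% Alternative: append the [n n 0] row before converting
C = spconvert([T; n n 0]);
```

With real data the second form would read A = spconvert([load('mymatrix.txt'); 1222 1222 0]).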

4 dimensional matrix

I need to use a 4-dimensional matrix as an accumulator for voting on 4 parameters. Each parameter varies in the range 1 to 300. For that, I define Acc = zeros(300,300,300,300) in MATLAB, and somewhere I use, for example:
Acc(4,10,120,78)=Acc(4,10,120,78)+1
However, MATLAB reports an error because of the memory limitation:
??? Error using ==> zeros
Out of memory. Type HELP MEMORY for your options.
Below you can see part of my code:
I = imread('image.bmp'); %I is logical 300x300 image.
Acc = zeros(100,100,100,100);
for i = 1:300
for j = 1:300
if I(i,j)==1
for x0 = 3:3:300
for y0 = 3:3:300
for a = 3:3:300
b = abs(j-y0)/sqrt(1-((i-x0)^2) / (a^2));
b1=floor(b/3);
if b1==0
b1=1;
end
a1=ceil(a/3);
Acc(x0/3,y0/3,a1,b1) = Acc(x0/3,y0/3,a1,b1)+1;
end
end
end
end
end
end
As @Rasman mentioned, you probably want to use a sparse representation of the matrix Acc.
Unfortunately, the sparse function is geared toward 2D matrices, not arbitrary n-D.
But that's ok, because we can take advantage of sub2ind and linear indexing to go back and forth to 4D.
Dims = [300, 300, 300, 300]; % it will be a 300 by 300 by 300 by 300 matrix
Acc = sparse([], [], [], prod(Dims), 1, ExpectedNumElts);
Here ExpectedNumElts should be some number like 30 or 9000 or however many non-zero elements you expect for the matrix Acc to have. We notionally think of Acc as a matrix, but actually it will be a vector. But that's okay, we can use sub2ind to convert 4D coordinates into linear indices into the vector:
ind = sub2ind(Dims, 4, 10, 120, 78);
Acc(ind) = Acc(ind) + 1;
You may also find the functions find, nnz, spy, and spfun helpful.
edit: see lambdageek for the exact same answer with a bit more elegance.
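The sub2ind approach can be sketched end to end as follows. The dimensions are reduced to 100 per axis to match the loop in the question, and the vote list is hypothetical.

```matlab
Dims = [100 100 100 100];
Acc = sparse(prod(Dims), 1);             % all-zero sparse column vector

% Hypothetical votes, one 4-D coordinate per row
votes = [4 10 50 78;
         4 10 50 78;
         1  1  1  1];

for r = 1:size(votes, 1)
    ind = sub2ind(Dims, votes(r,1), votes(r,2), votes(r,3), votes(r,4));
    Acc(ind) = Acc(ind) + 1;
end

% Recover the 4-D coordinates and counts of the non-zero cells
[lin, ~, counts] = find(Acc);
[h, i, j, k] = ind2sub(Dims, lin);
```

Only the cells that received votes occupy memory, and ind2sub maps them back to parameter space when looking for the peak.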
The other answers are guiding you toward a sparse matrix instead of your current dense solution. This is a little more difficult because MATLAB does not currently support N-dimensional sparse arrays. One way to implement it is to
replace
zeros(100,100,100,100)
with
sparse(100*100*100*100,1)
This will store all your counts in a sparse array; as long as most remain zero, you will be OK for memory.
Then to access this data, instead of:
Acc(h,i,j,k)=Acc(h,i,j,k)+1
use (subtracting 1 from every index but the first, because MATLAB indices start at 1):
index = h+100*(i-1)+100*100*(j-1)+100*100*100*(k-1)
Acc(index,1)=Acc(index,1)+1
See Avoiding 'Out of Memory' Errors
Your statement would require far more than 4 GB of RAM (300^4 doubles at 8 bytes each is about 65 GB).
Solutions to 'Out of Memory' problems fall into two main categories:
Maximizing the memory available to MATLAB (i.e., removing or increasing limits) on your system via operating system selection and system configuration. These usually have the greatest overall applicability but are potentially the most disruptive (e.g. using a different operating system). These techniques are covered in the first two sections of this document.
Minimizing the memory used by MATLAB by making your code more memory efficient. These are all algorithm and application specific and therefore are less broadly applicable. These techniques are covered in later sections of this document.
In your case the latter seems to be the solution: try reducing the amount of memory used or required.