For functions with more than 4 variables, the usual solution is to use a number of 4-input K-map tables in parallel. I wonder why we can't use one table with more than 2 variables per row or column, as long as the set of variables is coded in Gray code. The cells will still be adjacent.
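As a small MATLAB sketch of the property this idea relies on along each axis step (the variable names are mine), the following builds a 3-variable binary-reflected Gray code for one axis and checks that consecutive labels, including the wraparound, differ in exactly one bit:

n = 3;
idx = 0:2^n - 1;
g = bitxor(idx, bitshift(idx, -1));      % binary-reflected Gray code
disp(dec2bin(g, n));                     % the axis labels, in order
% Each neighboring pair (cyclically) differs in exactly one bit:
neighbors = bitxor(g, circshift(g, [0 -1]));
assert(all(sum(dec2bin(neighbors, n) == '1', 2) == 1));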
I am sorry for this complicated problem, but I will try my best to explain myself.
This is basically a Hidden Markov Model question. I have two columns of data. The data in these two columns are independent of each other; however, together they represent a specific movement which can be character-coded. I assign a character in a 3rd column by putting conditions on the column1 and column2 entries. Note: the characters are finite in number (~10-15).
For example:
if (column1(i) > 0.5) && (column2(i) < 15)
    column3(i) = 'D';
end
I end up with a string something like this:
AAAAADDDDDCCCCCFFFFAAAACCCCCFFFFFFDDD
So each character gets repeated, but not for a constant length (e.g., the first time the A's appear 5 times, while the second time they appear only 4 times).
Now, let us take the first chunk of A's (AAAAA), each A containing an ordered pair of column1 and column2 values. Comparing with the second chunk of A's (AAAA), the values of column1 and column2 should be similar to those of the first chunk. Usually, the values in each column are either increasing, decreasing, or constant throughout a chunk, and the values of the columns in both chunks should be similar. For example, column1 goes from -1 to -5 in 5 unequal steps in the first chunk, but in the second chunk it goes from -1.2 to -5.1 in 4 unequal steps.
What I want is a fitting of a probability distribution over the column1 and column2 values (independently) for each set of repeated characters (e.g., for the A's, then the D's, then the C's, then the F's, and then again the A's).
The final goal is the following:
Given n elements in column1, column2, and column3, I want to predict what the (n+1)th element in column3 is going to be and how many times it is going to repeat itself (with probability, e.g., a 70% chance it repeats 4 times and a 20% chance it repeats 5 times). Also, what the probability distribution of column1 and column2 is going to be for the predicted character.
Please feel free to ask questions if I fail to explain it well.
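As a starting point for the run-length statistics, here is a hedged MATLAB sketch (the variable names are mine, and seq stands in for your column3 string) that run-length encodes the character sequence and builds an empirical run-length distribution for one character:

seq = 'AAAAADDDDDCCCCCFFFFAAAACCCCCFFFFFFDDD';
runStarts  = find([true, diff(double(seq)) ~= 0]);  % where each run begins
runLengths = diff([runStarts, numel(seq) + 1]);     % length of each run
runChars   = seq(runStarts);                        % character of each run
% Empirical run-length distribution for the A's:
aLens = runLengths(runChars == 'A');
[lens, ~, idx] = unique(aLens);
probs = accumarray(idx(:), 1) / numel(aLens);
disp([lens(:), probs]);                             % run length vs. probability

The per-chunk (column1, column2) values could then be collected the same way, by slicing the columns with runStarts and runLengths, and a distribution fitted to each slice.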
I am fairly new to MATLAB and I am trying to figure out when it is best to use cells, tables, or matrices to store sets of data and then work with the data.
What I want is to store data that has multiple lines that include strings and numbers, and then work with the numbers.
For example, a line would look like:
'string 1', time, number1, number2
I know a matrix works best if all elements are numbers, but when I use a cell I keep having to convert the numbers or strings to a matrix in order to work with them. I am running MATLAB 2012, so maybe that is part of the problem. Any help is appreciated. Thanks!
Use a matrix when:
the tabular data has a uniform type (all are floating point like double, or integers like int32);
& either the amount of data is small, or it is big and has a static (predefined) size;
& you care about the speed of accessing the data, or you need matrix operations performed on the data, or some function requires the data organized as such.
Use a cell array when:
the tabular data has heterogeneous types (mixed element types, "jagged" arrays, etc.);
| there's a lot of data and it has a dynamic size;
| you only need to index into the data numerically (no algebraic operations);
| a function requires the data as such.
The same argument goes for structs, only the indexing is by name, not by number.
Not sure about tables; I don't think they are offered by the language itself, so they might be a UDT that I don't know of...
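To make the contrast concrete, here is a small hedged sketch (the data is invented for illustration):

M = [1 2 3; 4 5 6];                   % matrix: homogeneous, supports algebra
colSums = sum(M, 1);
C = {'string 1', 737791, 3.14, 42};   % cell array: heterogeneous row
num1 = C{3} + C{4};                   % brace indexing extracts the contents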
Later edit
These three types may be combined, in the sense that cell arrays and structs may have matrices, cell arrays, and structs as elements (because they're heterogeneous containers). In your case, you might take 2 approaches, depending on how you need to access the data:
if you access the data mostly by row, then an array of N structs (one struct per row) with 4 fields (one field per column) would be the most effective in terms of performance;
if you access the data mostly by column, then a single struct with 4 fields (one field per column) would do; the first field would be a cell array of strings for the first column, the second field would be a cell array of strings or a 1D matrix of doubles depending on how you want to store your dates, and the rest of the fields would be 1D matrices of doubles. Both layouts are sketched below.
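A hedged sketch of the two layouts, assuming the four columns from the question (the sample values and field names are invented):

% Row-oriented: an array of structs, one struct per row.
rows(1) = struct('name', 'string 1', 'time', datenum(2014,1,1), ...
                 'number1', 3.14, 'number2', 42);
rows(2) = struct('name', 'string 2', 'time', datenum(2014,1,2), ...
                 'number1', 2.72, 'number2', 7);
firstName = rows(1).name;                 % row-wise access is direct

% Column-oriented: one struct whose fields hold whole columns.
cols.name    = {'string 1'; 'string 2'};  % cell array of strings
cols.time    = [datenum(2014,1,1); datenum(2014,1,2)];
cols.number1 = [3.14; 2.72];
cols.number2 = [42; 7];
meanNumber1  = mean(cols.number1);        % column-wise math is direct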
Concerning tables: I always used matrices or cell arrays until I had to do database-related things such as joining datasets by a unique key; the only way I found to do this was by using tables. It takes a while to get used to them, and it's a bit annoying that some functions that work on cell arrays don't work on tables, and vice versa. MATLAB could have done a better job explaining when to use one or the other, because it's not super clear from the documentation.
The situation that you describe seems to be as follows: you have several columns, each entire column consists of 1 datatype, and all columns have an equal number of rows. This seems to match exactly the recommended situation for using a table:
T = table(var1,...,varN) creates a table from the input variables, var1,...,varN. Variables can be of different sizes and data types, but all variables must have the same number of rows.
Actually, I don't have much experience with tables, but if you can't figure them out you can always switch to using 1 cell array for the first column and a matrix for all the others (in your example).
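For what it's worth, a hedged sketch of the table route (note that table was introduced in R2013b, so it is not available on MATLAB 2012; the sample data is invented):

name    = {'string 1'; 'string 2'};
time    = [10.5; 11.0];
number1 = [1; 2];
number2 = [3; 4];
T = table(name, time, number1, number2);   % one variable per column
total = T.number1 + T.number2;             % numeric columns work directly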
Suppose I have 121 elements and want to get all combinations of 4 elements taken at a time, i.e. 121c4.
Since combnk(1:121, 4) takes a lot of time, I want to go for only about 2% of those combinations by providing:
z = 1:50:length(121c4(:, 1))
For example: the 1st row, the 5th row, the 100th row, and so on, up to 121c4, picking only those rows from the 121c4 matrix without generating the complete set of combinations (it consumes too much time and memory for large numbers like 625c4).
If you haven't defined an ordering on the combinations, why not just use
randi(121,p,4)
where p is the number of combinations you want in your set? With this approach you may, or may not, want to replace duplicates (rows in which an element repeats, since randi draws with replacement).
If you have defined an ordering on the combinations, tell us what it is.
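Building on that suggestion, here is a hedged MATLAB sketch (the parameter names are mine) that redraws any row containing a repeated element, so every kept row is a valid 4-element combination:

n = 121; k = 4; p = 1000;         % population, combination size, sample count
combos = zeros(p, k);
i = 1;
while i <= p
    row = randi(n, 1, k);
    if numel(unique(row)) == k    % reject rows with repeated elements
        combos(i, :) = sort(row); % sort for a canonical ordering
        i = i + 1;
    end
end

Repeated rows (the same combination drawn twice) can still occur; unique(combos, 'rows') would remove them if that matters.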
Can I do this with standard SQL, or do I need to create a function for the following problem?
I have 14 columns, which represent 2 properties of 7 consecutive objects (the order from 1 to 7 is important), so:
table.object1prop1, ..., table.object7prop1, table.object1prop2, ..., table.object7prop2.
I need to compute the minimum value of property 2 over those of the 7 objects whose property 1 value is smaller than a specific threshold.
The values of property 1 of the 7 objects lie on an ascending arithmetic scale, so property 1 of object 1 will always be smaller than property 1 of object 2.
Thanks in advance for any clue!
This would be easier if the data were normalized. (Hint: any time you find a column name with a number in it, you are looking at a big red flag that the schema is not in 3rd normal form.) With the table as you describe it, this will take a fair amount of code, but the greatest() and least() functions might be your best friends.
http://www.postgresql.org/docs/current/interactive/functions-conditional.html#FUNCTIONS-GREATEST-LEAST
If I had to write code for this, I would probably feed the values into a CTE and work from there.
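To make the hint concrete, here is a hedged PostgreSQL sketch (the table name t, the threshold 10.0, and the regularized column names are assumptions): since least() ignores NULL arguments, a CASE per object filters on property 1, and least() then picks the minimum property 2 among the objects that pass:

SELECT least(
    CASE WHEN object1prop1 < 10.0 THEN object1prop2 END,
    CASE WHEN object2prop1 < 10.0 THEN object2prop2 END,
    CASE WHEN object3prop1 < 10.0 THEN object3prop2 END,
    CASE WHEN object4prop1 < 10.0 THEN object4prop2 END,
    CASE WHEN object5prop1 < 10.0 THEN object5prop2 END,
    CASE WHEN object6prop1 < 10.0 THEN object6prop2 END,
    CASE WHEN object7prop1 < 10.0 THEN object7prop2 END
) AS min_prop2
FROM t;

The result is NULL when no object passes the threshold.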
The mode function in MATLAB returns the value that occurs most frequently in a dataset. But "when there are multiple values occurring equally frequently, mode returns the smallest of those values."
This is not very useful for what I am using it for; I would rather have it return a median or an arithmetic mean in the absence of a modal value (as they are at least somewhat in the middle of the distribution). Otherwise the results of using mode are far too much on the low side of the scale (I have a lot of unique values in my distribution).
Is there an elegant way to make mode favor more central values in a dataset (in the absence of a true modal value)?
BTW: I know I could use [M,F] = mode(X, ...) to manually check for the most frequent value (and calculate a median or mean when necessary). But that seems like a bit of an awkward solution, since I would be almost entirely rewriting everything that mode is supposed to be doing. I'm hoping that there's a more elegant solution.
Looks like you want the third output argument from mode. E.g.:
x = [1 1 1 2 2 2 3 3 3 4 4 4 5 6 7 8];
[m,f,c] = mode(x);
valueYouWant = median(c{1});
Or (since median takes the average of the two middle values when there is an even number of entries), in cases where an even number of values may share the maximum number of occurrences, maybe do something like this:
valueYouWant = c{1}(ceil(length(c{1})/2))
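If this comes up often, a hedged way to package it (the function name centralMode is mine) is a small wrapper that returns the central tied value instead of the smallest:

function v = centralMode(x)
% Like mode(x), but when several values tie for most frequent,
% return the central tied value instead of the smallest one.
    [~, ~, c] = mode(x);
    ties = c{1};                      % sorted values that tie for max frequency
    v = ties(ceil(numel(ties) / 2));  % middle element (also fine for one tie)
end

For the x above, centralMode(x) returns 2, while mode(x) returns 1.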