I have no idea how to use [hist] in Pure Data.
The three arguments of [hist] are:
the value of first class,
the value of last class,
the number of classes.
I cannot figure out what the first and second arguments mean. And how do I pass the output of [hist] to [tabwrite] to display it as an array graph in Pure Data?
It seems you are using the [hist] object from smlib.
The histogram will contain <number of classes> bins of equal size, with the first bin being equivalent to the <value of first class> and the last bin being equivalent to <value of last class>-1 (the offset is arguably a bug).
So, the value of first class is the minimum expected input value (x>=min), and the value of last class is the maximum expected input value (x<max).
Any input value exceeding those boundaries will be clipped.
Examples:
[3, absolute(
|
[hist 2 5 3]
|
[print]
This will create a 3-bin histogram, with the bins 2±0.5 (with clipping this means x<2.5), 3±0.5 and 4±0.5 (with clipping that is 3.5<x).
The input 3 will be filed into the second bin, so the absolute histogram is 0 1 0.
Similarly:
[3, absolute(
|
[hist 3 6 3]
|
[print]
This will create a 3-bin histogram, with the bins 3±0.5, 4±0.5 and 5±0.5.
The input 3 will be now filed into the first bin, so the absolute histogram is 1 0 0.
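The binning rule described above can be checked outside Pd with a small Python sketch. It only mirrors the behaviour explained in this answer (equal-width bins, rounding to the nearest bin centre, clipping at both ends); it is not taken from the smlib source, and the function name hist_counts is made up for illustration.

```python
import math


def hist_counts(values, first, last, n_bins):
    """Count values into n_bins equal-width bins, clipping out-of-range input.

    Assumption: bin centres sit at first, first + width, ... as described
    in the answer above; this is not smlib's actual implementation.
    """
    width = (last - first) / n_bins
    counts = [0] * n_bins
    for x in values:
        # Round to the nearest bin centre.
        index = math.floor((x - first) / width + 0.5)
        # Clip so that out-of-range input lands in the first or last bin.
        index = max(0, min(n_bins - 1, index))
        counts[index] += 1
    return counts


print(hist_counts([3], 2, 5, 3))  # [0, 1, 0], like [hist 2 5 3]
print(hist_counts([3], 3, 6, 3))  # [1, 0, 0], like [hist 3 6 3]
```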
Displaying the histogram:
You can set the table values by sending a list of numbers to the table, prefixed with the starting index:
[relative(
|
[hist 0 100 100]
|
[list prepend 0]
|
[s $0-histo]
[table $0-histo 100]
Alternatively check the [array] object (which also can be accessed via [tabread] and the like)
I have a 29736 x 6 table, which is referred to as table_fault_test_data. It has 6 columns, with names wind_direction, wind_speed, air_temperature, air_pressure, density_hubheight and Fault_Condition respectively. What I want to do is to label the data in Fault_Condition (the last table column) with either a 1 or a 0 value, depending on the values in the other columns.
I would like to do the following checks (For eg.)
If the wind_direction value (column 1) is below 0.0040 or above 359.9940, label the 6th-column entry of that row as 1, else label it as 0.
Do this for the entire table. Similarly, do this check for others
like air_temperature, air_pressure and so on. I know that if-else
will be used for these checks. But, I am really confused as to how I
can do this for the whole table and add the corresponding value to
the 6 th column (Maybe using a loop or something).
Any help in this
regard would be highly appreciated. Many Thanks!
EDIT:
Further clarification: I have a 29736 x 6 table named table_fault_test_data . I want to add values to the 6 th column of table based on conditions as below:-
for i = 1:29736 % Iterating over the whole table row by row
    if (1st column value < x | 1st column value > y)
        % Add 0 to the corresponding element of the 6th column, i.e. table_fault_test_data(i,6)
    elseif (2nd column value < x | 2nd column value > y)
        % Add 0 to the corresponding element of the 6th column, i.e. table_fault_test_data(i,6)
    elseif ... do this for other cases as well
    else
        % Add 1 to the corresponding element of the 6th column, i.e. table_fault_test_data(i,6)
    end
end
This is the essence of my requirements. I hope this helps in understanding the question better.
You can use logical indexing, which is supported also for tables (for loops should be avoided, if possible). For example, suppose you want to implement the first condition, and also suppose your x and y are known; also, let us assume your table is called t
logicalIndecesFirstCondition = t{:,1} < x | t{:,1} > y;
and then you could refer to the rows that satisfy this condition using logical indexing (please refer to the documentation on logical indexing).
E.g.:
t{logicalIndecesFirstCondition , 6} = t{logicalIndecesFirstCondition , 6} + 1.0;
This would add 1.0 to the 6th column, for the rows for which the logical condition is true
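To show how several range checks combine into one label per row without nested if/elseif branches, here is a rough language-neutral sketch in Python. It follows the pseudocode in the question (0 when any checked column is out of range, 1 otherwise). The function name label_rows, the sample rows and the limits are all made up for illustration.

```python
def label_rows(rows, limits):
    """Return one 0/1 label per row: 0 if any checked column is out of range.

    limits maps a column index to an (lower, upper) pair.
    """
    labels = []
    for row in rows:
        out_of_range = any(
            row[col] < lower or row[col] > upper
            for col, (lower, upper) in limits.items()
        )
        labels.append(0 if out_of_range else 1)
    return labels


# Hypothetical data: (wind_direction, wind_speed); the limits are invented.
rows = [(10.0, 5.0), (400.0, 5.0), (10.0, 99.0)]
limits = {0: (0.0, 360.0), 1: (0.0, 30.0)}
print(label_rows(rows, limits))  # [1, 0, 0]
```

In MATLAB the same effect comes from OR-ing the per-column logical vectors together and assigning through the combined mask.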
I have created a cell array called Items of size (10,5). It contains the following headers as the first row of the cell array:
Item name | Price (in $) | Shipping (in $) | Total price (in $) | Total price (in €)
I have it all filled up, but what I need to do is to sort the cell array according to the total price in € from the smallest to the largest, but I can't seem to find a way to do it. I tried sort(Items,5); in order to sort according to the values in € but it returns an error. It could be useful to find a way to make the sorting automatic, so if I wanted to add more items it would still sort them in the global list.
sortrows will likely do exactly what you want to do. It will sort based on a specific column, assuming the datatype is constant in the entire column.
>> a ={'a',8,9;'b',5,6;'c',2,3};
>> a_sorted = sortrows(a,3)
a_sorted =
'c' [2] [3]
'b' [5] [6]
'a' [8] [9]
Edit
From your comments below, you can easily just sort the array first and then add a row to the cell array the same way you would combine regular arrays. Documentation
>> a = {7,8,9;4,5,6;1,2,3};
>> a_sorted = sortrows(a,3);
>> headers = {'col1','col2','col3'};
>> a_header = [headers;a_sorted]
a_header =
'col1' 'col2' 'col3'
[ 1] [ 2] [ 3]
[ 4] [ 5] [ 6]
[ 7] [ 8] [ 9]
EDIT #2
You can round the values that you are presenting using the second argument of the round function. After you round it, you can change the format of how things are displayed. Normally it is set as short, which is 4 decimal places. If you set it to shortg it will show as few decimals as possible, up to 4.
>> a = [1.23456789;2.3456789;3.456789]
a =
1.2346
2.3457
3.4568
>> a_rounded = round(a,2)
a_rounded =
1.2300
2.3500
3.4600
>> format shortg
>> a_rounded
a_rounded =
1.23
2.35
3.46
If changing the format is not an option, you could always just convert the number into a string and then display that. That gets a little more complicated, but a quick google will help you there.
EDIT #3
I did not know this existed before, but you can apparently use the format style called bank. This will display all numbers with two decimal places even if they are 0.
First place all of the prices in a separate array, sort on this array individually then use the indices of sorting to rearrange the rows of your cell array.
Try something like this:
price = [Items{:,5}];
[~,ind] = sort(price);
ItemsSorted = Items(ind,:);
Alternatively you can use the sortrows function that MZimmerman6 mentioned and operate along the fifth column of your cell array. I wasn't aware it worked on cell arrays, so I learned something new!
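For comparison, the "sort the body, keep the header row on top" idea looks like this as a short Python sketch. The item names and prices are invented, and sort_table is a hypothetical helper, not something from the question.

```python
def sort_table(table, column):
    """Sort all rows except the first (header) row by the given column."""
    header, *body = table
    return [header] + sorted(body, key=lambda row: row[column])


# Made-up data: header row followed by (item name, total price in EUR).
items = [
    ["Item name", "Total price (EUR)"],
    ["lamp", 31.5],
    ["cable", 4.2],
    ["desk", 120.0],
]
sorted_items = sort_table(items, 1)
for row in sorted_items:
    print(row)
```

Because the sort is re-run on the whole body each time, adding more items and calling the function again keeps the list ordered.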
I have a set containing 0 and 8 other elements, 9 elements in total. I want to choose 3 random values from this set and create a 3x1 column matrix, and repeat this for all possible choices from the set. How can I do that?
As @Picket said in a comment,
The way RandomSample works will ensure it will not output the same choice twice in a single call
If your list is small, you can generate all subsets and sample it.
Example
RandomSample[Subsets[{a, b, c, d, e, f}, {3}], 7]
will generate all (20) subsets with 3 (distinct) elements and then pick 7 different ones uniformly (there are options to weight each member differently, choose the random generator, etc.).
RandomSample[Flatten[Permutations /@ Subsets[{a, b, c, d, e, f}, {3}], 1], 13]
will generate all (120) possible ordered selections of 3 distinct elements among a set of 6 elements and give a sample of 13 distinct elements of this list.
If what you want is a random ordering of all possible subsets of size 3, or of all ordered selections without duplicate of size 3 just ask the same way but with the exact number of such sets.
myset = { foo, foo2, foo3, foo5 };
RandomSample[Subsets[myset, {3}], Binomial[Length[myset],3 ]]
RandomSample[Flatten[Permutations /@ Subsets[myset, {3}], 1], 3!*Binomial[Length[myset],3 ] ]
(if you ask more than the exact number of possibilities, RandomSample will complain)
Now if your initial set is so large that generating the set of subsets is impractical in time and memory, take advantage of representing set composition by numbers, even if it is not perfect in terms of uniform distribution. Say that your initial set has 20 distinct elements. A three-digit number in base 20 can represent any selection of 3. To account for the need to filter out the few with a digit appearing more than once, compare the number of candidates with the number of valid selections:
20^3/(3!*Binomial[20, 3]) // N
1.16959
You are probably safe by generating 25% more numbers than what you need and filtering the ones with repetition:
Cases[IntegerDigits[RandomSample[0 ;; 20^3-1, Ceiling[31*(1 + 1/4)] ], 20, 3], _?(Length[Union[#]] == 3 &), 1, 31]
This generates a random sample of 39 distinct 3-digit numbers in base 20 and selects the first 31 with no repeated digit, in the form of a list of 3-coordinate vectors.
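The small-set approach (enumerate every selection of 3, then sample without repetition) can also be sketched in Python with the standard library. This is only an illustration of the same idea, not a translation of the Mathematica code; random_triples and its parameters are hypothetical names.

```python
import itertools
import random


def random_triples(elements, count, ordered=False, seed=None):
    """Pick `count` distinct selections of 3 distinct elements at random.

    ordered=False samples subsets; ordered=True samples ordered selections.
    Raises ValueError if count exceeds the number of possible selections.
    """
    rng = random.Random(seed)
    if ordered:
        pool = itertools.permutations(elements, 3)
    else:
        pool = itertools.combinations(elements, 3)
    return rng.sample(list(pool), count)


# 7 of the 20 possible 3-element subsets of a 6-element set.
print(random_triples("abcdef", 7))
# All 20 subsets in random order.
print(random_triples("abcdef", 20))
```

As with RandomSample, asking for more selections than exist is an error here too (random.sample raises ValueError).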
I have the following dataset containing information about countries
5,1,648,16,10,2,0,3,5,1,1,0,1,1,1,0,0,0,0,0,1,0,0,1,0,0,
3,1,29,3,6,6,0,0,3,1,0,0,1,0,1,0,0,0,0,0,1,0,0,0,1,0,
4,1,2388,20,8,2,2,0,3,1,1,0,0,1,0,0,0,0,0,0,1,1,0,0,0,0,
...
The sixth column in each row indicates the main religion of the country: 0 is catholic, 1 is other christian, 2 is muslim, etc. Some of the other data describes whether different colors are present in the country's flag, which symbols it contains, and so on.
The description of the data can be found here. I have removed the string data columns though so it doesn't fit exactly like the information shown.
My problem is that I want to use co-variance matrices and Pearson correlation to see if, for example, the fact that a flag has the color red in it will tell anything about if the religion of that country has a bigger chance of being something than something else. But since the religion is enumerated, I am a bit lost on how to progress with this problem.
Your problem is that, despite the fact that your data is ordered, this order is arbitrary. The "distance" between "muslim" (enum val=2) and "hindu" (enum val=4) is not 2.
The most straight-forward way of tackling this issue is to convert enum values to binary indicator vectors:
Suppose you have
enum {
Catholic = 0,
Protestant,
Muslim,
Jewish,
Hindu,
...
NumOfRel };
You replace the single entry of enum val with a binary vector of length NumOfRel with zeros everywhere except for a single 1 at the appropriate place:
For a Protestant entry, you'll have the following binary vector:
[ 0 1 0 0 ... ]
For a Jewish:
[ 0 0 0 1 0 ... ]
And so on...
This way, the "distance" between any two different religions is always the same.
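A minimal Python sketch of the indicator-vector conversion, assuming five categories numbered 0 to 4 as in the enum above (the function name one_hot is made up). It also checks that any two different categories end up equally far apart, which is the point of the encoding.

```python
import math


def one_hot(value, n_classes):
    """Return a list of n_classes zeros with a single 1 at index `value`."""
    vector = [0] * n_classes
    vector[value] = 1
    return vector


protestant = one_hot(1, 5)
jewish = one_hot(3, 5)
hindu = one_hot(4, 5)
print(protestant)  # [0, 1, 0, 0, 0]
print(jewish)      # [0, 0, 0, 1, 0]

# Every pair of different categories is the same distance apart.
print(math.dist(protestant, jewish) == math.dist(protestant, hindu))  # True
```

Each indicator column can then be correlated with, say, the "flag contains red" column on its own, one religion at a time.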
So, presume a matrix like so:
20 2
20 2
30 2
30 1
40 1
40 1
I want to count the number of times 1 occurs for each unique value of column 1. I could do this the long way by [sum(x(1:2,2)==1)] for each value, but I think this would be the perfect use for the UNIQUE function. How could I fix it so that I could get an output like this:
20 0
30 1
40 2
Sorry if the solution seems obvious, my grasp of loops is very poor.
Indeed unique is a good option:
u=unique(x(:,1))
res=arrayfun(@(y)length(x(x(:,1)==y & x(:,2)==1)),u)
Taking apart that last line:
arrayfun(fun,array) applies fun to each element in the array, and puts it in a new array, which it returns.
The function here is @(y)length(x(x(:,1)==y & x(:,2)==1)), which finds the length of the portion of x where the condition x(:,1)==y & x(:,2)==1 holds (this is called logical indexing). So for each of the unique elements, it counts the rows in x where the first column equals that element and the second column is one.
Try this (as specified in this answer):
>> [c,~,d] = unique(a(a(:,2)==1))
c =
30
40
d =
1
2
2
>> counts = accumarray(d(:),1,[],@sum)
counts =
1
2
>> res = [c,counts]
Suppose you have an array of various integers in 'array'.
The tabulate function will sort the unique values and count the occurrences.
tbl = tabulate(array)
Look for your counts in column 2 of tbl.
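For reference, here is the requested result (one row per unique value of column 1, including values whose count is 0) as a plain Python sketch. It uses the matrix from the question; count_matches is a hypothetical helper name.

```python
def count_matches(rows, target=1):
    """For each unique first-column value, count rows whose second column == target."""
    counts = {}
    for key, value in rows:
        # Register the key even when it never matches, so zero counts show up.
        counts.setdefault(key, 0)
        if value == target:
            counts[key] += 1
    return sorted(counts.items())


x = [(20, 2), (20, 2), (30, 2), (30, 1), (40, 1), (40, 1)]
print(count_matches(x))  # [(20, 0), (30, 1), (40, 2)]
```

Registering every key before testing the condition is what keeps the 20 → 0 row, which an approach that filters on column 2 first would drop.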