How can I select certain rows in a dataset? Mathematica - select

My question is probably really easy, but I am a Mathematica beginner.
I have a dataset, let's say:
Column: Numbers from 1 to 10
Column Signs
Column Other signs.
{{1,2,3,4,5,6,7,8,9,10},{d,t,4,/,g,t,w,o,p,m},{g,h,j,k,l,s,d,e,w,q}}
Now I want to extract all rows for which column 1 provides an odd number. In other words I want to create a new dataset.
I tried to work with Select and OddQ, as well as with the If function, but I have absolutely no clue how to put these commands together correctly!

Taking a stab at what you might be asking..
(table = {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} ,
Characters["abcdefghij"],
Characters["ABCDEFGHIJ"]}) // MatrixForm
table[[All, 1 ;; -1 ;; 2]] // MatrixForm
or perhaps this:
Select[table, OddQ[#[[1]]] &]
{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}

The convention in Mathematica is the reverse of the one you use in your description:
rows are the first-level sublists.
Let's take your original data
mytable = {{1,2,3,4,5,6,7,8,9,10},{d,t,4,"/",g,t,w,o,p,m},{g,h,j,k,l,s,d,e,w,q}}
Just as you suggested, Select and OddQ can do what you want, but on your table, transposed. So we transpose first and back:
Transpose[Select[Transpose[mytable], OddQ[First[#]]& ]]
Another way:
Mathematica's functional command MapThread applies a function to the corresponding elements of several lists in parallel.
DeleteCases[MapThread[If[OddQ[#1], {##}] &, mytable], Null]
The inner function of MapThread gets all elements of what you call a 'row' as arguments (#1, #2, etc.). So it tests the first column and returns all columns, or Null if the test fails. The enclosing DeleteCases removes the non-matching "rows".

Dynamic Json Keys in Scala

I'm new to Scala (from Python) and I'm trying to create a JSON object that has dynamic keys. I would like to use some starting number as the top-level key and then combinations involving that number as second-level keys.
From reading the play-json docs/examples, I've seen how to build these nested structures. While that will work for the top-level keys (there are only 17 of them), this is a combinatorial problem and the power set contains ~130k combinations that would be the second-level keys so it isn't feasible to list that structure out. I also saw the use of a case class for structures, however the parameter name becomes the key in those instances which is not what I'm looking for.
Currently, I'm considering using HashMaps with the MultiMap trait so that I can map multiple combinations to the same original starting number and then second-level keys would be the combinations themselves.
I have Python code that does this, but it takes 3-4 days to work through up-to-9-number combinations for all 17 starting numbers. The ideal final format would look something like the example below.
Perhaps it isn't possible to do in Scala given the goal of using immutable structures. I suppose using regex on a string of the output might be an option as well. I'm open to any solutions regarding data structures to hold the info and how to approach the problem. Thanks!
{
"2": {
"(2, 3, 4, 5, 6)": {
"best_permutation": "(2, 4, 3, 5, 6)",
"amount": 26.0
},
"(2, 4, 5, 6)": {
"best_permutation": "(2, 5, 4, 6)",
"amount": 21.0
}
},
"3": {
"(3, 2, 4, 5, 6)": {
"best_permutation": "(3, 4, 2, 5, 6)",
"amount": 26.0
},
"(3, 4, 5, 6)": {
"best_permutation": "(3, 5, 4, 6)",
"amount": 21.0
}
}
}
EDIT:
There is no real data source other than the matrix I'm using as my lookup table. I've posted the links to the lookup table I'm using and the program if it might help, but essentially, I'm generating the content myself within the code.
For a given combination, I have a function that basically takes the first value of the combination (which is to be the starting point) and then uses the tail of that combination to generate a permutation.
After that I prepend the starting location to the front of each permutation and then use sliding(2) to work my way through the permutation looking up the amount which is in a breeze.linalg.DenseMatrix by using the two values to index the matrix I've provided below and summing the amounts gathered by indexing the matrix with the two sliding values (subtracting 1 from each value to account for the 0-based indexing).
At this point, it is just a matter of gathering the information (starting_location, combination, best_permutation and the amount) and constructing the nested HashMap. I'm using Scala 2.11.8, if it makes any difference.
MATRIX: see here.
PROGRAM:see here.
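A minimal sketch of the procedure described above, written in Python (the language the asker is coming from) rather than Scala. It only illustrates that the nested keys can be computed at run time instead of being listed out. The 4x4 COSTS table and the function names are hypothetical stand-ins for the linked matrix and program, and the sketch assumes that "best" means the smallest total amount, which the question does not state.

```python
import json
from itertools import permutations

# Hypothetical 4x4 lookup table; the question's real matrix is not reproduced here.
COSTS = [
    [0, 5, 9, 4],
    [5, 0, 3, 7],
    [9, 3, 0, 2],
    [4, 7, 2, 0],
]

def path_amount(path):
    """Sum the lookup values over consecutive pairs (the sliding(2) step)."""
    # Locations are 1-based, so subtract 1 when indexing the matrix.
    return float(sum(COSTS[a - 1][b - 1] for a, b in zip(path, path[1:])))

def best_for(combination):
    """Keep the first value fixed as the start and permute the tail."""
    start, tail = combination[0], combination[1:]
    candidates = ((start,) + p for p in permutations(tail))
    # Assumption: "best" means the smallest total amount.
    best = min(candidates, key=path_amount)
    return {"best_permutation": str(best), "amount": path_amount(best)}

def build(combinations):
    """Nest results as {start: {combination: {...}}} using computed string keys."""
    result = {}
    for combo in combinations:
        result.setdefault(str(combo[0]), {})[str(combo)] = best_for(combo)
    return result

nested = build([(2, 3, 4), (2, 4), (3, 2, 4)])
print(json.dumps(nested, indent=2))
```

The keys are ordinary strings built from the data, so the same shape could be held in a map of maps in any language; nothing here depends on mutable state beyond filling the result once.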

How to put numbers into an array and sorted by most frequent number in java

I was given this question on programming in java and was wondering what would be the best way of doing it.
The question was on the lines of:
From the numbers provided, how would you display the most frequent number in Java? The numbers were: 0, 3, 4, 1, 1, 3, 7, 9, 1
At first I thought they should be put in an array and sorted, and then maybe I would have to go through them with a for loop. Am I on the right lines? Some examples would help greatly.
If the numbers are all fairly small, you can quickly get the most frequent value by creating an array to keep track of the count for each number. The algorithm would be:
1. Find the maximum value in your list
2. Create an integer array of size max + 1 (assuming all non-negative values) to store the counts for each value in your list
3. Loop through your list and increment the count at the index of each value
4. Scan through the count array and find the index with the highest value
This algorithm runs in time proportional to the list length plus the maximum value, which should be faster than sorting the list and then finding the longest run of duplicate values. The tradeoff is that it takes up more memory if the values in your list are very large.
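The counting steps above can be sketched as follows. This is an illustration in Python rather than Java, and the function name most_frequent is made up for the example. It assumes a non-empty list of non-negative integers.

```python
def most_frequent(values):
    """Return the most frequent value in a list of small non-negative integers."""
    # Step 1: the largest value determines the size of the count array.
    largest = max(values)
    # Step 2: one counter per possible value 0..largest.
    counts = [0] * (largest + 1)
    # Step 3: tally each value at its own index.
    for v in values:
        counts[v] += 1
    # Step 4: the index holding the highest count is the answer
    # (ties resolve to the smallest value).
    return max(range(len(counts)), key=counts.__getitem__)

numbers = [0, 3, 4, 1, 1, 3, 7, 9, 1]
print(most_frequent(numbers))  # 1
```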
With Java 8, this can be implemented rather smoothly. If you're willing to use a third-party library like jOOλ, it could be done like this:
List<Integer> list = Arrays.asList(0, 3, 4, 1, 1, 3, 7, 9, 1);
System.out.println(
    Seq.seq(list)
       .grouped(i -> i, Agg.count())
       .sorted(Comparator.comparing(t -> -t.v2))
       .map(t -> t.v1)
       .toList());
(disclaimer, I work for the company behind jOOλ)
If you want to stick with the JDK 8 dependency, the following code would be equivalent to the above:
System.out.println(
    list.stream()
        .collect(Collectors.groupingBy(i -> i, Collectors.counting()))
        .entrySet()
        .stream()
        .sorted(Comparator.comparing(e -> -e.getValue()))
        .map(e -> e.getKey())
        .collect(Collectors.toList()));
Both solutions yield:
[1, 3, 0, 4, 7, 9]

Matlab -- Finding missing number in a list

I have a relatively large data set, and I'm looking for the missing numbers using MATLAB.
For example, I have a list of numbers that might look like:
1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5, 6, 6, 7, 7, 7, 7, 9, 10, 10.....
You can see the 8 is missing here. The list is in the thousands, and there are maybe just a couple missing numbers. How can I find out which ones are missing? My search only turned up useful results without randomly repeating numbers. Seems simple but I can't figure it out.
Thanks for help!
Use unique, like this:
B=unique(A); % A is your data
C=setdiff(1:max(A),B)
and C is your desired missing numbers.
EDIT (after seeing claj's answer):
If your data starts from another value (not "1"), the second line should be:
C=setdiff(min(A):max(A),B)
EDIT2: (according to Eitan's comment)
C=setdiff(min(A):max(A),A);
This line replaces the two lines from the original answer.
You could do something like this:
% Your data:
data = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5, 6, 6, 7, 7, 7, 7, 9, 10, 10];
for i = 1:data(end)
    if (isempty(find(data==i)))
        disp(['i = ',num2str(i)]);
    end
end
Which will print out the values of the missing elements.
Or, even simpler, you could use the ismember() function to build
the set difference in a single line, as below.
% First enter your data and construct 'set':
data = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5, 6, 6, 7, 7, 7, 7, 9, 10, 10];
set = data(1):data(end);
Then to determine which elements of 'set' are also in 'data':
ismember(set, data)
The output then shows the locations in 'set' where the data is missing (marked with a 0):
ans =
1 1 1 1 1 1 1 0 1 1
Use the ismember() function to check if a number is member of the data array
% set your data array
maximum = max(data);
minimum = min(data);
for i = minimum:maximum
    if ~ismember(i,data)
        disp([num2str(i) , ' is missing']);
    end
end
1. Create a unique list of the values in the array.
2. Find the min and max of this unique set (they are the same numbers as in the full array, but quicker to find).
3. Create a range from min to max, like [min:max].
4. Take the set difference of the range and the uniqued array.
This gives you the missing numbers in a reasonably quick way.
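As a rough illustration of those steps outside MATLAB, here is a small Python sketch. The function name missing_numbers is hypothetical, and the sketch assumes the data are integers and the list is non-empty.

```python
def missing_numbers(data):
    """Return the integers between min(data) and max(data) that are absent from data."""
    present = set(data)                        # unique values
    low, high = min(present), max(present)     # bounds taken from the unique set
    full_range = range(low, high + 1)          # every value that could appear
    # Set difference: values in the range that never occur in the data.
    return [n for n in full_range if n not in present]

data = [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 5, 6, 6, 7, 7, 7, 7, 9, 10, 10]
print(missing_numbers(data))  # [8]
```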
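This is similar to a few of the answers above, but the simplest I've found is
find(~ismember(set,data))
which returns the indices of the members of set that are not in data. These indices equal the missing values only when set starts at 1; set(~ismember(set,data)) returns the values themselves.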

Select One Element in Each Row of a Numpy Array by Column Indices [duplicate]

Is there a better way to get the "output_array" from the "input_array" and "select_id" ?
Can we get rid of range( input_array.shape[0] ) ?
>>> input_array = numpy.array( [ [3,14], [12, 5], [75, 50] ] )
>>> select_id = [0, 1, 1]
>>> print input_array
[[ 3 14]
[12 5]
[75 50]]
>>> output_array = input_array[ range( input_array.shape[0] ), select_id ]
>>> print output_array
[ 3 5 50]
You can use numpy.choose, which constructs an array from an index array (in your case select_id) and a set of arrays (in your case input_array) to choose from. However, you may first need to transpose input_array to match dimensions. The following shows a small example:
In [101]: input_array
Out[101]:
array([[ 3, 14],
[12, 5],
[75, 50]])
In [102]: input_array.shape
Out[102]: (3, 2)
In [103]: select_id
Out[103]: [0, 1, 1]
In [104]: output_array = np.choose(select_id, input_array.T)
In [105]: output_array
Out[105]: array([ 3, 5, 50])
(because I can't post this as a comment on the accepted answer)
Note that numpy.choose only works if you have 32 or fewer choices (in this case, the dimension of your array along which you're indexing must be of size 32 or smaller). Additionally, the documentation for numpy.choose says
To reduce the chance of misinterpretation, even though the following "abuse" is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.
The OP asks:
Is there a better way to get the output_array from the input_array and select_id?
I would say, the way you originally suggested seems the best out of those presented here. It is easy to understand, scales to large arrays, and is efficient.
Can we get rid of range(input_array.shape[0])?
Yes, as shown by other answers, but the accepted one does not work as well in general as what the OP already suggests doing.
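For comparison, here is a short sketch of two ways to write the selection that are not limited by the number of columns. The first is the question's own approach with np.arange in place of range. The second uses np.take_along_axis, which is not mentioned in the thread and needs NumPy 1.15 or later; the variable names by_fancy_index and by_take are made up for the example.

```python
import numpy as np

input_array = np.array([[3, 14], [12, 5], [75, 50]])
select_id = np.array([0, 1, 1])

# The question's approach, with np.arange in place of range().
by_fancy_index = input_array[np.arange(input_array.shape[0]), select_id]

# Alternative without an explicit row index: take_along_axis expects the
# indices to have the same number of dimensions as the array, hence [:, None].
by_take = np.take_along_axis(input_array, select_id[:, None], axis=1)[:, 0]

print(by_fancy_index.tolist())  # [3, 5, 50]
print(by_take.tolist())         # [3, 5, 50]
```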
I think enumerate is handy.
[input_array[enum, item] for enum, item in enumerate(select_id)]
How about:
[input_array[x,y] for x,y in zip(range(len(input_array[:,0])),select_id)]

MATLAB, how to evaluate multiple indices in one line?

I don't know how to explain this better than by giving you an example.
Suppose I have the following array:
a = magic(6)
And then I take a 'slice' of that like this:
a(:,1)
It will print:
35
3
31
8
30
4
Now I want the first number, so I want to write:
a(:,1)(1)
Instead of:
b = a(:,1)
b(1)
Also, is there a way to do something like this (assignment and comparison, i.e. set b, then evaluate against it):
(b = a(:,1))(1)
Ok, here's an update with a function where it isn't trivial to use a(1, 1)
come_on = sprintf('%i, ', magic(3));
come_on(1:end-2)
8, 3, 4, 1, 5, 9, 6, 7, 2
Also, what if I only want the first 4 numbers on magic(3)?
It would be better to write
sprintf('%i, ', magic(3)(1:4))(1:end-2)
instead of tens of lines, IMHO.
You cannot chain indexing as in foo(1)(2)(3). However, you can index multiple dimensions at once, so in this case a(1,1) will give you what you want.