Filling a matrix with the Scala library Breeze

I'm new to Scala and I'm having a mental block on a seemingly easy problem. I'm using the Scala library Breeze and need to take a mutable ArrayBuffer and put its contents into a matrix. This should be simple, but Scala is so strictly typed that Breeze seems really picky about which data types it will accept when making a DenseVector. This is just some prototype code, but can anyone help me come up with a solution?
Right now I have something like...
// 9 elements that need to go into a 3x3 matrix (1-3 as top row, 4-6 as middle row, etc.)
val numbersForMatrix: ArrayBuffer[Double] = ArrayBuffer(1, 2, 3, 4, 5, 6, 7, 8, 9)
//the empty 3x3 matrix
var M: breeze.linalg.DenseMatrix[Double] = DenseMatrix.zeros(3,3)
In Breeze you can do things like
M(0, 0) = 100
to set the first element to 100. You can also do things like:
M(0, 0 to 2) := DenseVector(1, 2, 3)
which sets the first row to 1, 2, 3
But I cannot get it to do something like...
var dummyList: List[Double] = List(1, 2, 3) // this works
var dummyVec = DenseVector(dummyList: _*) // this works
M(0, 0 to 2) := dummyVec //this does not work
and successfully change the first row to 1, 2, 3.
And that's with a List, not even an ArrayBuffer.
I'm willing to change data types away from ArrayBuffer, but I'm just not sure how to approach this at all... I could update the matrix values one by one, but that seems like it would be VERY hacky to code up.
Note: I'm a Python programmer who is used to using numpy and just giving it arrays. The Breeze documentation doesn't provide enough examples with other data types for me to have been able to figure this out yet.
Thanks!

Breeze is, in addition to pickiness over types, pretty picky about vector shape: DenseVectors are column vectors, but you are trying to assign to a subset of a row, which expects a transposed DenseVector:
M(0, 0 to 2) := dummyVec.t
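For reference, here is a minimal end-to-end sketch of filling the matrix from the buffer, row by row (assuming Breeze's DenseMatrix/DenseVector and the 9-element buffer from the question; the loop is just one straightforward way to do it, not the only one):
import scala.collection.mutable.ArrayBuffer
import breeze.linalg.{DenseMatrix, DenseVector}

val numbersForMatrix = ArrayBuffer[Double](1, 2, 3, 4, 5, 6, 7, 8, 9)
val M = DenseMatrix.zeros[Double](3, 3)

// copy three values per row; .t turns the column vector into the
// transposed (row) vector that the row slice expects
for (row <- 0 until 3) {
  val rowValues = numbersForMatrix.slice(row * 3, row * 3 + 3).toArray
  M(row, 0 to 2) := DenseVector(rowValues).t
}
// M is now:  1 2 3 / 4 5 6 / 7 8 9
Since Breeze stores DenseMatrix data column-major, another option (if memory serves) is new DenseMatrix(3, 3, numbersForMatrix.toArray).t, which lays the buffer out column by column and then transposes it into the row order you want.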

Related

Get networkx filtered nodes from subgraph_view of filtered edges

I've created a subgraph_view by applying a filter to edges. When I call nodes() on the subgraph it still shows me all nodes, even if none of the edges use them. I need to get a list of only nodes that are still part of the subgraph.
G = nx.path_graph(6)
G[2][3]["cross_me"] = False
G[3][4]["cross_me"] = False
def filter_edge(n1, n2):
    return G[n1][n2].get("cross_me", True)
view = nx.subgraph_view(G, filter_edge=filter_edge)
# node 3 is no longer used by any edges in the subgraph
view.edges()
This produces
EdgeView([(0, 1), (1, 2), (4, 5)])
as expected. However, when I run view.nodes() I get
NodeView((0, 1, 2, 3, 4, 5))
What I expect to see is
NodeView((0, 1, 2, 4, 5))
This seems odd. Is there some way to extract only the nodes used by the subgraph?
The confusion stems from the definition of 'graph.' A disconnected node is still a part of a graph. In fact, you could have a graph with no edges at all. So the behavior of subgraph_view() is counterintuitive but correct.
If, however, you still want to achieve what you're describing, there are lots of potential ways, depending on your tolerance for modifying the original graph. I'll mention two that attempt to stay as close to your current method as possible and avoid deleting edges or nodes from G.
Method 1
The easiest way, using your existing view object, is to call edge_subgraph() on it (edge_subgraph() takes only edges as input), like this:
final_view = view.edge_subgraph(view.edges())
final_view.nodes()
gives
NodeView((0, 1, 2, 4, 5))
Method 2
To me, Method 1 seems clunky and confusing because it defines an intermediate view. If instead we go back up a little and start from G, we can define a filter_node function that checks the edge attributes of each node and filters that node out if
all of its edges are flagged for removal, or
the node has no edges in the first place.
You could also do this by manually flagging the nodes themselves, as you've done with the edges (a short sketch of that variant follows the example below).
G = nx.path_graph(6)
G[2][3]["cross_me"] = False
G[3][4]["cross_me"] = False
def filter_edge(n1, n2):
    return G[n1][n2].get("cross_me", True)

def filter_node(n):
    return sum([i[2].get("cross_me", True) for i in G.edges(n, data=True)])
view = nx.subgraph_view(G, filter_node=filter_node, filter_edge=filter_edge)
view.nodes()
also gives the expected
NodeView((0, 1, 2, 4, 5))
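For completeness, here is a minimal sketch of the node-flagging variant mentioned above. It assumes you are willing to add a boolean attribute to the node you want dropped; the attribute name "keep" is just illustrative:
G = nx.path_graph(6)
G[2][3]["cross_me"] = False
G[3][4]["cross_me"] = False
G.nodes[3]["keep"] = False  # flag the node directly

def filter_edge(n1, n2):
    return G[n1][n2].get("cross_me", True)

def filter_node(n):
    # keep a node unless it has been explicitly flagged
    return G.nodes[n].get("keep", True)

view = nx.subgraph_view(G, filter_node=filter_node, filter_edge=filter_edge)
view.nodes()
# NodeView((0, 1, 2, 4, 5))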

Scala vector splicing (maintaining sorted sequence)

I need to maintain a sorted sequence (mutable or immutable — I don't care), dynamically inserting elements into the middle of it (to keep it sorted) and removing them likewise (so, random access by index is crucial).
The best thing I came onto is using a Vector and scala.collections.Searching from 2.11, and then:
var vector: Vector[Ordered]
...
val ip = vector.search(element)
Inserting:
vector = (vector.take(ip.insertionPoint) :+ element) ++ vector.drop(ip.insertionPoint)
Deleting:
vector = vector.patch(from = ip.insertionPoint, patch = Nil, replaced = 1)
Doesn't look elegant to me, and I suspect performance issues. Is there a better way? Splicing sequences seems like a very basic operation to me, but I can't find an elegant solution.
You should use SortedSet. The default implementation of SortedSet is an immutable red-black tree; there is also a mutable implementation.
SortedSet[Int]() + 5 + 3 + 4 + 7 + 1
// SortedSet[Int] = TreeSet(1, 3, 4, 5, 7)
A Set contains no duplicate elements. If you want to keep counts of duplicate elements, you could use a SortedMap[Key, Int] with elements as keys and counts as values. See this answer for MultiSet emulation using a Map.
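A minimal sketch of that SortedMap-as-multiset idea (the helper names add and remove here are just illustrative, not from the standard library):
import scala.collection.immutable.SortedMap

def add(m: SortedMap[Int, Int], x: Int): SortedMap[Int, Int] =
  m.updated(x, m.getOrElse(x, 0) + 1)

def remove(m: SortedMap[Int, Int], x: Int): SortedMap[Int, Int] =
  m.get(x) match {
    case Some(1) => m - x                // last copy: drop the key
    case Some(n) => m.updated(x, n - 1)  // decrement the count
    case None    => m                    // not present: no-op
  }

val counts = List(5, 3, 4, 3, 7).foldLeft(SortedMap.empty[Int, Int])(add)
// counts == TreeMap(3 -> 2, 4 -> 1, 5 -> 1, 7 -> 1)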

How to put numbers into an array and sort them by most frequent number in Java

I was given this question on programming in java and was wondering what would be the best way of doing it.
The question was along the lines of:
From the numbers provided, how would you, in Java, display the most frequent number? The numbers were: 0, 3, 4, 1, 1, 3, 7, 9, 1
At first I was thinking they should be put in an array and sorted first, then maybe run through a for loop. Am I on the right lines? Some examples would help greatly.
If the numbers are all fairly small, you can quickly get the most frequent value by creating an array to keep track of the count for each number. The algorithm would be:
Find the maximum value in your list
Create an integer array of size max + 1 (assuming all non-negative values) to store the counts for each value in your list
Loop through your list and increment the count at the index of each value
Scan through the count array and find the index with the highest value
The run-time of this algorithm should be faster than sorting the list and finding the longest string of duplicate values. The tradeoff is that it takes up more memory if the values in your list are very large.
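A minimal sketch of that counting approach (assuming non-negative ints and the numbers from the question):
int[] numbers = {0, 3, 4, 1, 1, 3, 7, 9, 1};

// find the maximum value to size the count array
int max = 0;
for (int n : numbers) {
    max = Math.max(max, n);
}

// count occurrences of each value
int[] counts = new int[max + 1];
for (int n : numbers) {
    counts[n]++;
}

// scan for the value with the highest count
int mostFrequent = 0;
for (int v = 1; v <= max; v++) {
    if (counts[v] > counts[mostFrequent]) {
        mostFrequent = v;
    }
}
System.out.println(mostFrequent); // prints 1 (it appears three times)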
With Java 8, this can be implemented rather smoothly. If you're willing to use a third-party library like jOOλ, it could be done like this:
List<Integer> list = Arrays.asList(0, 3, 4, 1, 1, 3, 7, 9, 1);
System.out.println(
    Seq.seq(list)
       .grouped(i -> i, Agg.count())
       .sorted(Comparator.comparing(t -> -t.v2))
       .map(t -> t.v1)
       .toList());
(disclaimer, I work for the company behind jOOλ)
If you want to stick with the JDK 8 dependency, the following code would be equivalent to the above:
System.out.println(
    list.stream()
        .collect(Collectors.groupingBy(i -> i, Collectors.counting()))
        .entrySet()
        .stream()
        .sorted(Comparator.comparing(e -> -e.getValue()))
        .map(e -> e.getKey())
        .collect(Collectors.toList()));
Both solutions yield:
[1, 3, 0, 4, 7, 9]

Easy way to create and initialise a 3D table of structs in Matlab

I would like to be able to initialise a big table in matlab easily.
Say I have the bounds x, y, z = 5, 4, 3. I want to be able to make a 5x4x3 table where each element is a struct that stores count and sum. Count and sum in this struct should be 0 when initialised.
I thought it would be enough to do this:
table = []
table(5,4,3) = struct('sum', 0, 'count', 0)
This would work for a double, but evidently not with a struct.
Any ideas?
EDIT:
As another question (bonus, if you will): is there a way to force MATLAB to store the struct, but when you access an element (i.e., table(1, 2, 3)) have it return the average (i.e., table(1,2,3).sum/table(1,2,3).count)?
It's not vital to the question, but it would certainly be cool.
You just need to replace the line table = [] to avoid the error; that is,
clear table;
table(5,4,3) = struct('sum', 0, 'count', 0)
works fine. Note, however, that this command only initializes one element of your array, i.e., the memory allocation is incomplete. To initialize all elements of your array, you can use
table2(1:5,1:4,1:3) = struct('sum', 0, 'count', 0)
To visualize the difference, use whos, which returns
>> whos
  Name        Size      Bytes  Class     Attributes

  table       5x4x3       736  struct
  table2      5x4x3      8288  struct
Your second question can be solved, for instance, by using anonymous functions
myMean = @(a) a.sum./a.count;  % define the function
myMean(table2(2,2,2))          % access the mean for element (2,2,2)

For each element A[i] of array A, find the closest j such that A[j] > A[i]

Given: An array A[1..n] of real numbers.
Goal: An array D[1..n] such that
D[i] = min{ distance(i,j) : A[j] > A[i] }
or some default value (like 0) when there is no higher-valued element. I would really like to use Euclidean distance here.
Example :
A = [-1.35, 3.03, 0.73, -0.06, 0.71, -0.21, -0.12, 1.49, 1.41, 1.42]
D = [1, 0, 1, 1, 2, 1, 1, 6, 1, 2]
Is there any way to beat the obvious O(n^2) solution? The only progress I've made so far is that D[i] = 1 whenever A[i] is not a local maximum. I've been thinking a lot and have come up with NOTHING. I hope to eventually extend this to 2D (so A and D would be matrices).
So I've puzzled on this a bit but I haven't come up with anything better that works. A few ideas:
Augment the array with extra information that can be gained in O(n) time or better, e.g., indices, differences between neighbors, etc.
Would sorting (O(n log n)) help in any way?
Seems like dynamic programming could be helpful here, if you can figure out a way to solve for each element based on the solution for its neighbors (augmenting the answers with information like the j for each A[i] instead of just the distance maybe).
Sort the array from highest to lowest element. If I understand your problem correctly, this gives you the answer immediately, since the closest bigger element to any element in the original list is the one before it. This way you don't even need to create the D[] array, since its contents can be computed using the array A[] exclusively. The first element in the sorted A[] array does not have a bigger friend, so the answer for it would be your default value (0, perhaps?). Extending the algorithm to matrices might be easy (depends on how you "look" at the matrix) - just use a mapping function which sort of transforms the matrix into a 1D array.