find unique matrices from a larger matrix - scala

I'm fairly new to functional programming, so I'm going through some practice exercises. I want to write a function that, given a matrix of unique naturals (say 5x5), returns a collection of unique matrices of a smaller size (say 3x3), where the matrices must be intact, i.e. created from values that are adjacent in the original.
01 02 03 04 05
06 07 08 09 10
11 12 13 14 15
16 17 18 19 20
21 22 23 24 25
Simple. Just slide across, then down, one by one in groups of 3, to get something that looks like:
01 02 03 | 02 03 04 | 03 04 05 | 06 07 08
06 07 08 | 07 08 09 | 08 09 10 | 11 12 13
11 12 13 | 12 13 14 | 13 14 15 | 16 17 18
or, in Scala,
List(List(1, 2, 3), List(6, 7, 8), List(11, 12, 13))
List(List(2, 3, 4), List(7, 8, 9), List(12, 13, 14))
List(List(3, 4, 5), List(8, 9, 10), List(13, 14, 15))
List(List(6, 7, 8), List(11, 12, 13), List(16, 17, 18))
and so on and so on...
So I venture out with Scala (my language of choice because it allows me to evolve from imperative to functional, and I've spent the last few years in Java).
val array2D = "01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25".grouped(3).map(_.trim.toInt).grouped(5)
val sliced = array2D.map(row => row.sliding(3, 1).toList).sliding(3, 1).toList
Now I have a data structure I can work with, but I don't see a functional way. Sure I can traverse each piece of sliced, create a var matrix = new ListBuffer[Seq[Int]]() and imperatively create a bag of those and I'm done.
I want to find a functional, ideally point-free approach using Scala, but I'm stumped. There's got to be a way to zip with 3 or something like that... I've searched the ScalaDocs and can't seem to figure it out.

You got halfway there. In fact, I was having trouble figuring out how to do what you had done already. I broke up your code a bit to make it easier to follow. Also, I made array2D a List, so I could play with the code more easily. :-)
val input = "01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25"
val intArray = input.split(" ").map(_.toInt).toList
val array2D = intArray.grouped(5).toList
val sliced = array2D.map(row => row.sliding(3, 1).toList).sliding(3, 1).toList
Ok, so you have a bunch of lists, each one a bit like this:
List(List(List( 1, 2, 3), List( 2, 3, 4), List( 3, 4, 5)),
List(List( 6, 7, 8), List( 7, 8, 9), List( 8, 9, 10)),
List(List(11, 12, 13), List(12, 13, 14), List(13, 14, 15)))
And you want them like this:
List(List(List(1, 2, 3), List(6, 7, 8), List(11, 12, 13)),
List(List(2, 3, 4), List(7, 8, 9), List(12, 13, 14)),
List(List(3, 4, 5), List(8, 9, 10), List(13, 14, 15)))
Does that feel right to you? Each of the three sublists is a matrix on its own:
List(List(1, 2, 3), List(6, 7, 8), List(11, 12, 13))
is
01 02 03
06 07 08
11 12 13
So, basically, you want to transpose them. The next step, then, is:
val subMatrices = sliced map (_.transpose)
The type of that thing is List[List[List[Seq[Int]]]]. Let's consider that a bit... The 2D matrix is represented by a sequence of a sequence, so List[Seq[Int]] corresponds to a matrix. Let's say:
type Matrix = Seq[Seq[Int]]
val subMatrices: List[List[Matrix]] = sliced map (_.transpose)
But you want one list of matrices, so you can flatten that:
type Matrix = Seq[Seq[Int]]
val subMatrices: List[Matrix] = sliced.map(_.transpose).flatten
But, alas, a map plus a flatten is a flatMap:
type Matrix = Seq[Seq[Int]]
val subMatrices: List[Matrix] = sliced flatMap (_.transpose)
Now, you want the unique submatrices. That's simple enough: it's a set.
val uniqueSubMatrices = subMatrices.toSet
Or, if you wish to keep the result as a sequence,
val uniqueSubMatrices = subMatrices.distinct
And that's it. Full code just to illustrate:
type Matrix = Seq[Seq[Int]]
val input = "01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25"
val intArray = input.split(" ").map(_.toInt).toList
val array2D: List[List[Int]] = intArray.grouped(5).toList
val sliced: List[List[Matrix]] = array2D.map(row => row.sliding(3).toList).sliding(3).toList
val subMatrices: List[Matrix] = sliced flatMap (_.transpose)
val uniqueSubMatrices: Set[Matrix] = subMatrices.toSet
It could be written as a single expression, but unless you break it up into functions, it's going to be horrible to read. And you'd either have to use the forward pipe (|>, not in the standard library), or add these functions implicitly to the types they act on, or it will be difficult to read anyway.
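As a rough sketch of the "break it up into functions" idea, the sliding and transposing steps can be wrapped in one helper. The name `windows` and its `size` parameter are made up for this illustration and are not part of the code above; method-call syntax is used instead of postfix operators so that it compiles on recent Scala versions.

```scala
type Matrix = Seq[Seq[Int]]

// Hypothetical helper: every contiguous size x size window of m,
// listed band by band (top to bottom), left to right within a band.
def windows(m: Matrix, size: Int): List[Matrix] =
  m.sliding(size).toList.flatMap { band =>
    // Each row of the band produces its own horizontal windows.
    val perRow: Seq[List[Seq[Int]]] = band.map(_.sliding(size).toList)
    // Transposing regroups the pieces so that pieces sharing a column
    // offset end up together, which is exactly one sub-matrix.
    perRow.transpose
  }

val grid: Matrix = (1 to 25).grouped(5).map(_.toList).toList
val result: List[Matrix] = windows(grid, 3)
```

For the 5x5 example this gives nine 3x3 matrices; calling `result.distinct` or `result.toSet` afterwards removes duplicates when the input values are not unique.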

Edit: Okay, I think I finally understand what you want. I'm going to show a way that works, not a way that is high-performance. (That's generally the mutable Java-like solution, but you already know how to do that.)
First, you really, really ought to do this with your own collections that work in 2D sensibly. Using a bunch of 1D collections to emulate 2D collections is going to lead to unnecessary confusion and complication. Don't do it. Really. It's a bad idea.
But, okay, let's do it anyway.
val big = (1 to 25).grouped(5).map(_.toList).toList
This is the whole matrix that you want. Next,
val smaller = (for (r <- big.sliding(3)) yield r.toList).toList
are the groups of rows that you want. Now, you should have been using a 2D data structure, because you want to do something that doesn't map well onto 1D operations. But:
val small = smaller.map(xss =>
  Iterator.iterate(xss.map(_.sliding(3)))(identity).
    takeWhile(_.forall(_.hasNext)).
    map(_.map(_.next())).
    toList
).toList
If you carefully pull this apart, you see that you're creating a bunch of iterators (xss.map(_.sliding(3))) and then iterating through them all in lock step by keeping hold of those same iterators step after step, stopping when at least one of them is empty, and mapping them onto their next values (which is how you walk forward with them).
Now that you've got the matrices you can store them however you want. Personally, I'd flatten the list:
val flattened = small.flatten
You wrote a structure that has the matrices side by side, which you can also do with some effort (again, because creating 2D operations out of 1D operations is not always straightforward):
val sidebyside = flattened.reduceRight((l,r) => (l,r).zipped.map(_ ::: _))
(note reduceRight to make this an O(n) operation instead of O(n^2)--joining to the end of long accumulating lists is a bad idea--but note also that with too many matrices this will probably overflow the stack).


Decoding a JPEG Huffman Table

I am looking for a way to retrieve the minCode, maxCode and valPtr from an arbitrary Huffman table.
For instance, the following is a Huffman DC table generated by JpegSnoop:
Destination ID = 0
Class = 0 (DC / Lossless Table)
Codes of length 01 bits (000 total):
Codes of length 02 bits (001 total): 00
Codes of length 03 bits (005 total): 01 02 03 04 05
Codes of length 04 bits (001 total): 06
Codes of length 05 bits (001 total): 07
Codes of length 06 bits (001 total): 08
Codes of length 07 bits (001 total): 09
Codes of length 08 bits (001 total): 0A
Codes of length 09 bits (001 total): 0B
Codes of length 10 bits (000 total):
Codes of length 11 bits (000 total):
Codes of length 12 bits (000 total):
Codes of length 13 bits (000 total):
Codes of length 14 bits (000 total):
Codes of length 15 bits (000 total):
Codes of length 16 bits (000 total):
Total number of codes: 012
And the following are its MinCode, MaxCode and valPtr vectors, respectively:
{ 0, 0, 2, 14, 30, 62, 126, 254, 510, 0, 0, 0, 0, 0, 0, 0 },//YDC
{ -1, 0, 6, 14, 30, 62, 126, 254, 510, -1, -1, -1, -1, -1, -1, -1 },//YDC
{ 0, 0, 1, 6, 7, 8, 9, 10, 11, 0, 0, 0, 0, 0, 0, 0 },//YDC
Now I'm really confused about how these values were derived.
I checked the ITU-T T.81 specification, but it was not very clear.
To generate the code bits, you start with all zero bits. Within each code length, increment the code like an integer for each symbol. When stepping up a code length, increment and then add a zero bit to the end.
So for your example code, we have each length, followed by the corresponding codes in binary:
2: 00
3: 010, 011, 100, 101, 110
4: 1110
5: 11110
6: 111110
7: 1111110
8: 11111110
9: 111111110
Converting those to the corresponding integer ranges for each bit length, we have:
2: 0..0
3: 2..6
4: 14..14
5: 30..30
6: 62..62
7: 126..126
8: 254..254
9: 510..510
You can see exactly those ranges in your MinCode and MaxCode vectors.
You also have a list of symbols that correspond to the codes. In this example, that list is simply:
00 01 02 03 04 05 06 07 08 09 0A 0B
(The particular values of the symbols are not relevant to the valPtr vector. Those could be anything.)
The codes are assigned to the symbols from shortest to longest, and within each length, in integer order. The valPtr vector is simply the index of the first symbol in that vector that corresponds to each bit length. To generate the vector, start at zero, and add the number of symbols of each code length to get the starting index for the next code length.
1: 0, 0 symbols
2: 0 + 0 = 0, 1 2-bit symbol
3: 0 + 1 = 1, 5 3-bit symbols
4: 1 + 5 = 6, 1 4-bit symbol
5: 6 + 1 = 7, 1 5-bit symbol
6: 7 + 1 = 8, 1 6-bit symbol
7: 8 + 1 = 9, 1 7-bit symbol
8: 9 + 1 = 10, 1 8-bit symbol
9: 10 + 1 = 11, 1 9-bit symbol
The example valPtr vector consists of the numbers after the equals signs above.
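The steps above can be sketched in Scala as follows. This is only an illustration: the names `HuffTables` and `deriveTables` are invented here, and the values used for unused code lengths (0 for MinCode, -1 for MaxCode, 0 for valPtr) are an assumption taken from the example vectors in the question rather than from the specification.

```scala
// counts(i) is the number of codes that are i + 1 bits long (16 entries).
val counts = Vector(0, 1, 5, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0)

final case class HuffTables(minCode: Vector[Int], maxCode: Vector[Int], valPtr: Vector[Int])

def deriveTables(counts: Seq[Int]): HuffTables = {
  val empty = HuffTables(Vector.empty, Vector.empty, Vector.empty)
  // State: tables built so far, next free code, index of the next symbol.
  val (tables, _, _) = counts.foldLeft((empty, 0, 0)) {
    case ((t, code, index), n) =>
      // Unused lengths get the placeholder values seen in the question.
      val (min, max, ptr) =
        if (n == 0) (0, -1, 0) else (code, code + n - 1, index)
      val next = HuffTables(t.minCode :+ min, t.maxCode :+ max, t.valPtr :+ ptr)
      // Skip the n codes used at this length, then append a zero bit.
      (next, (code + n) << 1, index + n)
  }
  tables
}

val tables = deriveTables(counts)
```

With the counts from the JpegSnoop dump, `tables.minCode`, `tables.maxCode` and `tables.valPtr` reproduce the three vectors listed in the question.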
Thanks, I have written code that decodes the tables and returns the desired values. The code can be found on my GitHub.

textfile parser: calculate start position with scala

I am a beginner in Scala and I am trying to implement the following algorithm.
I have the following input :
11 DFI1-MONT_TT_13 9(18) 14 IntegerType
11 SERI1-SENS_13 X(01) 06 StringType
11 DDRI1-MONT_TT_14 9(18) 12 IntegerType
11 SQRI1-SENS_14 X(01) 14 StringType
11 XCRI1-MONT_TT_15 9(18) 10 IntegerType
11 QSRI1-SENS_15 X(01) 08 StringType
11 WQRI1-DEVISE X(03) 07 StringType
and I want to calculate the start position for each field so my output shall look like :
11 DFI1-MONT_TT_13 9(18) 0 14 IntegerType
11 SERI1-SENS_13 X(01) 14 06 StringType
11 DDRI1-MONT_TT_14 9(18) 20 12 IntegerType
11 SQRI1-SENS_14 X(01) 32 14 StringType
11 XCRI1-MONT_TT_15 9(18) 46 10 IntegerType
11 QSRI1-SENS_15 X(01) 56 08 StringType
11 WQRI1-DEVISE X(03) 64 07 StringType
The start position can be calculated as follows :
startposition_line_n= startposition_line_n-1 + length_line_n-1
We are assuming that the first line start position is equal to 0
I already know that I can use scanLeft or foldLeft, but as I am a beginner I don't know how to do this recursively. I took a sample from the input dataset; the real one includes many more lines.
Here's a tail-recursive method that takes a List[String] as input and produces a new, modified, List[String] as output.
def setPos(input: List[String],
           pos: Int = 0,
           acc: List[String] = List()): List[String] =
  if (input.isEmpty) acc.reverse
  else {
    val line = input.head.split("\\s+")
    setPos(input.tail,
           pos + line(3).toInt,
           line.patch(3, Seq(pos.toString), 0).mkString(" ") :: acc)
  }
This assumes that the fourth whitespace-delimited field (index 3) is always the length integer. It will throw an error if that's not the case.
Something like this; leave a comment if you need any clarification.
val input = List(
  "10 DFI1-MONT_TT_13 9(18) 14 IntegerType",
  "10 SERI1-SENS_13 X(01) 06 StringType",
  "10 DDRI1-MONT_TT_14 9(18) 12 IntegerType",
  "10 SQRI1-SENS_14 X(01) 14 StringType",
  "10 XCRI1-MONT_TT_15 9(18) 10 IntegerType",
  "10 QSRI1-SENS_15 X(01) 08 StringType",
  "10 WQRI1-DEVISE X(03) 07 StringType"
)
input.foldLeft((0, List[String]())) {
  case ((sum, acc), line) =>
    val sp = line.split(" ")
    val si = 3 // index of the length field
    (sum + sp(si).toInt, acc :+ ((sp.take(si) :+ sum.toString) ++ sp.drop(si)).mkString(" "))
}._2
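Since the question mentions scanLeft, here is a minimal sketch of that variant for comparison. The helper name `withStartPositions` is made up, and it assumes, as in the sample data, that the length is always the fourth whitespace-separated field.

```scala
// Hypothetical helper: inserts the running start position before the length field.
def withStartPositions(lines: List[String]): List[String] = {
  val fields = lines.map(_.trim.split("\\s+").toList)
  // Assumption: the length is the fourth field (index 3).
  val lengths = fields.map(f => f(3).toInt)
  // scanLeft yields 0, l0, l0 + l1, ...; it has one more element than
  // `lengths`, and zip below simply drops that trailing total.
  val starts = lengths.scanLeft(0)(_ + _)
  fields.zip(starts).map { case (f, start) =>
    (f.take(3) ++ (start.toString :: f.drop(3))).mkString(" ")
  }
}

val sample = List(
  "11 DFI1-MONT_TT_13 9(18) 14 IntegerType",
  "11 SERI1-SENS_13 X(01) 06 StringType",
  "11 DDRI1-MONT_TT_14 9(18) 12 IntegerType"
)
val positioned = withStartPositions(sample)
```

The running sum is computed once up front, so no accumulator has to be threaded through the traversal by hand.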

How to use the 'if' statement in matlab?

I have a cell array of size 5x5 as below
B= 00 10 11 10 11
01 01 01 01 11
10 00 01 00 01
10 10 01 01 11
10 10 10 00 10
And two column vectors
S1= 21
23
28
25
43
S2= 96
85
78
65
76
I want to create a new cell array of the same size as B say 5x5 such that it satisfies the following condition
Final={S1 if B{i}=11
S1 if B{i}=10
S2 if B{i}=01
S2 if B{i}=00
So the resulting output would be something like this
Z = s2 s1 s1 s1 s1
s2 s2 s2 s2 s1
s1 s2 s2 s2 s2
s1 s1 s2 s2 s1
s1 s1 s1 s2 s1
ie Z= 96 21 21 21 21
85 85 85 85 23
28 78 78 78 78
25 25 65 65 25
43 43 43 76 43
I tried using an if condition, but I get an error saying
'Error: The expression to the left of the equals sign is not a valid target for an assignment.'
for i=1:1:128
for j=1:1:16
if fs{i,j}=00
Z{i,j}=S1{i,j}
elseif fs{i,j}= 01
Z{i,j}=S2{i,j}
elseif fs{i,j}= 10
Z{i,j}=S1{i,j}
elseif fs{i,j}= 11
Z{i,j}=S2{i,j}
end
end
I think I'm making a mistake in the if statement as well as in the expressions I'm using. Where am I going wrong? Please help. Thanks in advance.
Use == for comparison and = for assignment. So if fs{i,j}==00, etc.
Edit: Matlab is really designed for highly vectorized operations. Nested loops are slow compared to native functions, and typically can be replaced with vectorized versions. Is there any particular reason why you are using cell arrays instead of matrices, especially when you only have numeric data?
If B, S1, and S2 were matrices your code could be written in one highly efficient line that will run much much faster:
Z = bsxfun(@times, S1, B == 11 | B == 10) + bsxfun(@times, S2, B == 01 | B == 0)
Since B is a cell array you will want to convert it to a matrix using cell2mat unless you'd like to use cellfun.
Instead, you can just call B_mat = cell2mat(B), followed by (B_mat>=10).*repmat(S1,1,5) + (B_mat<10).*repmat(S2,1,5).
It's possible that your cell array actually contains binary values, possibly represented as strings, in which case the conditions used above would need to be changed. Then using cellfun may be necessary.

Encrypted timestamp 448 bit

I am trying to reverse engineer a GWT API of a local public transport company (MVG in Munich). They don't offer a public REST API or anything similar. Unfortunately, they use some sort of encrypted timestamps which consist of 7 letters. The alphabet is A-Za-z0-9$_ (in this order), which makes 64 different letters. One would need 6 bits to represent these 64 different letters.
So 7 letters * 6 bits/letter makes 42 bits.
I'm pretty sure that it is no bit field.
You can see it yourself on http://www.mvg-live.de/MvgLive/MvgLive.jsp#haltestelle=Am%20M%C3%BCnchner%20Tor&gehweg=0&zeilen=7&ubahn=true&bus=true&tram=true. Look out for (POST) requests to clockservice (http://www.mvg-live.de/MvgLive/mvglive/rpc/clockService, not working without using POST) which gives you the current server time.
Here are a few examples, with the date of the http-response:
UeEcvQB: Tue, 29 Jul 2014 23:27:15 GMT
UeGbS0O: Wed, 30 Jul 2014 08:40:13 GMT
UeGbhiJ: Wed, 30 Jul 2014 08:41:13 GMT
UeGozGI: Wed, 30 Jul 2014 09:39:13 GMT
UeGpBv$: Wed, 30 Jul 2014 09:40:13 GMT
Any help is appreciated. Thanks.
Looks to be the number of milliseconds after the Unix epoch (01/01/1970 00:00:00) converted to base-64 using that alphabet.
E.g.: UeGozGI can be converted back to decimal using:
U = 20
e = 30
G = 6
o = 40
z = 51
G = 6
I = 8
To decimal:
= (((((20 * 64 + 30) * 64 + 6) * 64 + 40) * 64 + 51) * 64 + 6) * 64 + 8
= 1406713147784
= 07/30/2014 09:39:07am
Which is (pretty close to) the time you indicate it encodes.
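The same conversion can be sketched in Scala. The names `alphabet` and `decodeTimestamp` are invented for this example, and it assumes the interpretation above (milliseconds since the Unix epoch, most significant character first); characters outside the alphabet are not handled.

```scala
import java.time.Instant

// The 64-character alphabet in the order given in the question: A-Z, a-z, 0-9, $, _
val alphabet: String =
  (('A' to 'Z') ++ ('a' to 'z') ++ ('0' to '9')).mkString + "$_"

// Treat each character as one base-64 digit, most significant first.
def decodeTimestamp(s: String): Long =
  s.foldLeft(0L)((acc, c) => acc * 64 + alphabet.indexOf(c))

val millis = decodeTimestamp("UeGozGI")     // 1406713147784
val instant = Instant.ofEpochMilli(millis)  // 2014-07-30T09:39:07.784Z
```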

Matlab - data import with fixed numbers of columns

I have a (maybe simple) question about Matlab data import. I want to import a huge dataset (~1GB) which has a comma-separated format like this:
08:05, 12, 33, 124, 13, 08:06, 22, 84, 12, 35, ..
Every 5th value is a timestamp. I want to import it with a fixed number of columns (5 columns), but there is no delimiter for the end of a row. It should look like this in the end:
08:05 12 33 124 13
08:06 22 14 1 35
08:07 22 124 12 34
08:08 22 12 12 0
I thought about replacing every 5th comma with a subroutine, but that is too time-consuming. Do you know a better solution? I'm hoping for a nice built-in function.
You can use fscanf and C-type format strings to accomplish this. For example:
fid=fopen('filename.txt');
A=reshape(fscanf(fid,'%d:%d, %d, %d, %d, %d, '),6,[])';
fclose(fid);
This stores your answer in a matrix A which will contain
A =
8 5 12 33 124 13
8 6 22 84 12 35
If you want to format this into a string or output file as you listed, you could use:
fprintf('%02d:%02d %-3d %-3d %-3d %-3d\n',A')