Lempel-Ziv encoding - split string - encoding

I have given string 1100011110000011111000000. I have to encode it by Basic Lempel-Ziv algorithm. I know how to do it. The problem is two first numbers are 1. I parse it into anordered dictionary of substrings. So result is:
1, 10, 0, 01, 11, 100, 00, 011, 111, 000
I want to number new substring, but then 1=No.1, 10=No.2 and 0=No.3., that means, I cant encode substring 10, because here is not value of 0.
Another problem is - at the end here are three 0 more, but I have one substring 000. What I have to do with last 000?
So, is there any different way to parse the sting in substrings?

Related

PDF417 decode and generate the same barcode using Swift

I have the following example of PDF417 barcode:
which can be decoded with online tool like zxing
as the following result: 5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e�¼ô���������¬‚C`Ìe%�æ‹�ÀsõbÿG)=‡x‚�qÀ1ß–[FzùŽûVû�É�üæ±RNI�Y[.H»Eàó¼åñüì²�tØ¿ªWp…Ã�{�Õ*
or online-qrcode-generator
as 5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e~|~~~~~~~~~~d~C`~e%~~~~;To~B~{~dj9v~~Z[Xm~~"HP3~~LH~~~O~"S~~,~~~~~~~k1~~~u~Iw}SQ~fqX4~mbc_ (I don't know which encoding is used to encode this)
The first part of the encoded key that contains barcode is always known and it is 5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e
The second part of it can be decoded from the base64string and it always contains 88 bytes. In my case it is:
Frz0DAAAAAAAAAAArIJDYMxlJQDmiwHAc/Vi/0cpPYd4ghlxwDHflltGevmO+1b7GckT/OZ/sVJOSRpZWy5Iu0Xg87zl8fzssg502L+qV3CFwxZ/ewjVKg==
I'm using Swift on iOS device to generate this PDF417 barcode by decoding the provided base64 string like this:
let base64Str = "Frz0DAAAAAAAAAAArIJDYMxlJQDmiwHAc/Vi/0cpPYd4ghlxwDHflltGevmO+1b7GckT/OZ/sVJOSRpZWy5Iu0Xg87zl8fzssg502L+qV3CFwxZ/ewjVKg=="
let knownKey = "5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e"
let decodedData = Data(base64Encoded: base64Str.replacingOccurrences(of: "-", with: "+")
.replacingOccurrences(of: "_", with: "/"))
var codeData=knownKey.data(using: String.Encoding.ascii)
codeData?.append(decodedData)
let image = generatePDF417Barcode(from: codeData!)
let imageView = UIImageView(image: image!)
//the function to generate PDF417 UIMAGE from parsed Data
func generatePDF417Barcode(from codeData: Data) -> UIImage? {
if let filter = CIFilter(name: "CIPDF417BarcodeGenerator") {
filter.setValue(codeData, forKey: "inputMessage")
let transform = CGAffineTransform(scaleX: 3, y: 3)
if let output = filter.outputImage?.transformed(by: transform) {
return UIImage(ciImage: output)
}
}
return nil
}
But I always get the wrong barcodes generated. It can be seen visually.
Please help me correct the code to get the same result as the first barcode image.
I also have the another example of barcode:
The first part of key is the same but it's second part is known as int8 byte array and I also don't have an idea how to generate the PDF417 barcode from it (with prepended key) correctly.
Here's how I try:
let knownKey = "5wwwwwxwww0app5p3pewi0edpeapifxe0ixiwwdfxxi0xf5e"
let secretArray: [Int8] = [22, 124, 24, 12, 0, 0, 0, 0, 0, 0, 0, 0, 100, 127, 67, 96, -52, 101, 37, 0, -85, -123, 1, -64, 111, -28, 66, -27, 123, -25, 100, 106, 57, 118, -4, 16, 90, 91, 88, 109, -105, 126, 34, 72, 80, 51, -116, 28, 76, 72, -37, -24, -93, 79, -115, 34, 83, 18, -61, 44, -12, -13, -8, -59, -107, -9, -128, 107, 49, -50, 126, 13, -59, 50, -24, -43, 127, 81, -85, 102, 113, 88, 52, -60, 109, 98, 99, 95]
let secretUInt8 = secretArray.map { UInt8(bitPattern: $0) }
let secretData = Data(secretUInt8)
let keyArray: [UInt8] = Array(knownKey.utf8)
var keyData = Data(keyArray)
keyData.append(secretData)
let image = generatePDF417Barcode(from: keyData!)
let imageView = UIImageView(image: image!)
There are a lot of things going on here. Gereon is correct that there are a lot of parameters. Choosing different parameters can lead to very different bar codes that decode identically. Your current barcode is "correct" (though a bit messy due to an Apple bug). It's just different.
I'll start with the short answer of how to make your data match the barcode you have. Then I'll walk through what you should probably actually do, and finally I'll get to the details of why.
First, here's the code you're looking for (but probably not the code you want, unless you have to match this barcode):
filter.setValue(codeData, forKey: "inputMessage")
filter.setValue(3, forKey: "inputCompactionMode") // This is good (and the big difference)
filter.setValue(5, forKey: "inputDataColumns") // This is fine, but probably unneeded
filter.setValue(0, forKey: "inputCorrectionLevel") // This is bad
PDF 417 defines several "compaction modes" to let it pack a truly impressive amount of information into a very small space while still offering excellent error detection and correction, and handling a lot of real-world scanning concerns. The default compaction mode only supports Latin text and basic punctuation. (It compacts even more if you only use uppercase Latin letters and space.) The first part of your string can be stored with text compaction, but the rest can't, so it has to switch to byte compaction.
Core Image actually does this switch shockingly badly by default (I opened FB9032718 to track). Rather than encoding in text and then switching to bytes, or just doing it all in bytes, it switches to bytes over and over again unnecessarily.
There's no way for you to configure multiple compaction methods, but you can just set it to byte, which is what value 3 is. And that's also how your source is doing it.
The second difference is the number of data columns, which drive how wide the output is. Your source is using 5, but Core Image is choosing 6 based on its default rules (which aren't fully documented).
Finally, your source has set the error correction level to 0, which is not recommended. For a message of this size, the minimum recommended error correction level is 3, which is what Core Image chooses by default.
If you just want a good barcode, and don't have to match this input, my recommendation would be to set inputCompactionMode to 3, and leave the rest as defaults. If you want a different aspect ratio, I'd use inputPreferredAspectRatio rather than modifying the number of data columns directly.
You may want to stop reading now. This was a very enjoyable puzzle to spend the morning on, so I'm going to dump a lot of details here.
If you want a deep dive into how this format works, I don't know anything currently available other than the ISO 15438 Spec, which will cost you around US$200. But there used to be some pages at GeoCities that explained a lot of this, and they're still available through the Wayback Machine.
There also aren't a lot of tools for decoding this stuff on the command line, but pdf417decode does a reasonable job. I'll use output from it to explain how I knew all the values.
The last tool you need is a way to turn jpeg output into black-and-white pbm files so that pdf417decode can read them. For that, I use the following (after installing netpbm):
cat /tmp/barcode.jpeg | jpegtopnm | ppmtopgm | pamthreshold | pamtopnm > new.pbm && ./pdf417decode -c -e new.pbm
With that, let's decode the first three rows of your existing barcode (with my commentary to the side). Everywhere you see "function output," that means this value is the output of some function that takes the other thing as the input:
0 7f54 0x02030000 (0) // Left marker
0 6a38 0x00000007 (7) // Number of rows function output
0 218c 0x00000076 (118) // Total number of non-error correcting codewords
0 0211 0x00000385 (901) // Latch to Byte Compaction mode
0 68cf 0x00000059 (89) // Data
0 18ec 0x0000021c (540)
0 02e7 0x00000330 (816)
0 753c 0x00000004 (4) // Number of columns function output
0 7e8a 0x00030001 (1) // Right marker
1 7f54 0x02030000 (0) // Left marker
1 7520 0x00010002 (2) // Security Level function output
1 704a 0x00010334 (820) // Data
1 31f2 0x000101a7 (423)
1 507b 0x000100c9 (201)
1 5e5f 0x00010319 (793)
1 6cf3 0x00010176 (374)
1 7d47 0x00010007 (7) // Number of rows function output
1 7e8a 0x00030001 (1) // Right marker
2 7f54 0x02030000 (0) // Left marker
2 6a7e 0x00020004 (4) // Number of columns function output
2 0fb2 0x0002037a (890) // Data
2 6dfa 0x000200d9 (217)
2 5b3e 0x000200bc (188)
2 3bbc 0x00020180 (384)
2 5e0b 0x00020268 (616)
2 29e0 0x00020002 (2) // Security Level function output
2 7e8a 0x00030001 (1) // Right marker
The next 3 lines will continue this pattern of function outputs. Note that the same information is encoded on the left and right, but in a different order. The system has a lot of redundancy, and can detect that it's seeing a mirror image of the barcode.
We don't care about the number of rows for this purpose, but given a current row of n and a total number of rows of N, the function is:
30 * (n/3) + ((N-1)/3)
Where / always means "integer, truncating division." Given there are 24 rows, on row 0, this is 0 + (24-1)/3 = 7.
The security level function's output is 2. Given a security level of e, the function is:
30 * (n/3) + 3*e + (N-1) % 3
=> 0 + 3*e + (23%3) = 2
=> 3*e + 2 = 2
=> 3*e = 0
=> e = 0
Finally, the number of columns can just be counted off in the output. For completeness, given a number of columns c, the function is:
30 * (n/3) + (c - 1)
=> 0 + c - 1 = 4
=> c = 5
If you look at the Data lines, you'll notice that they don't match your input data at all. That's because they have a complex encoding that I won't detail here. But for Byte compaction, you can think of it as similar to Base64 encoding, but instead of 64, it's Base900. Where Base64 encodes 3 bytes of data into 4 characters, Base900 encodes 6 bytes of data into 5 codewords.
In the end, all these codewords get converted to symbols (actual lines and spaces). Which symbol is used depends on the line. Lines divisible by 3 use one symbol set, the lines after use a second, and the lines after that use a third. So the same codewords will look completely different on line 7 than on line 8.
Taken together, all these things make it very difficult to look at a barcode and decide how "different" it is from another barcode in terms of content. You just have to decode them and see what's going on.
CIPDF417BarcodeGenerator has a few more input parameters besides inputMessage that can have an influence on how the generated barcode looks - see the documentation. Visual inspection/comparison of two codes only makes sense when you know that all these parameters, most importantly inputCorrectionLevel were equal for both generators.
So, instead of a visual comparison, simply try decoding the barcodes using one of the many scanner apps out there, and compare the decoded bytes.
For your second example, try this:
// ...
var keyData = knownKey.data(using: .isoLatin1)!
keyData.append(secretData)
let image = generatePDF417Barcode(from: keyData)

TSQL int/int overflow

I have a column called Odo that contains the number of meters in a trip. I would normally divide that by 1000 to display the number of Km's.
The line of code in question:
convert(varchar(10), startPos.Distance / 1000)
causes the following error
Msg 8115, Level 16, State 8, Procedure sp_report_select_trip_start_and_stop, Line 7 [Batch Start Line 2]
Arithmetic overflow error converting numeric to data type numeric.
Msg 232, Level 16, State 2, Procedure sp_report_select_trip_start_and_stop, Line 9 [Batch Start Line 2]
Arithmetic overflow error for type varchar, value = 931.785156.
That number is clearly longer than my varchar. How do I divide, truncate to 1 decimal place and then convert?
Edit: SQL Server 2008 R2, so format() is not available
You can use format
format(startPos.Distance/1000, '#,###,###.#')
Or,
convert(varchar(10),cast(startPos.Distance/1000 as decimal(9,1)))
You may wish to introduce round() into these as well. e.g.
format(round(startPos.Distance/1000,1), '#,###,###.#')
or
convert(varchar(10),cast(round(startPos.Distance/1000,1) as decimal(9,1)))

How to convert a 64bit hex number to binary in Scala?

I want to convert the following hash "d41d8cd98f00b204e9800998ecf8427e" to binary string. However, I can't seem to find a way to do that? Can someone teach me how I can do it in Scala? Thanks
Use BigInt:
scala> BigInt("d41d8cd98f00b204e9800998ecf8427e", 16).toString(2)
res0: String = 11010100000111011000110011011001100011110000000010110010000001001110100110000000000010011001100011101100111110000100001001111110
The 16 above means the string should be parsed in hex (base 16), and the 2 means the output string should be in binary (base 2).
If you want to dump the raw binary out to a file, you can convert the BigInt to a byte array and dump that:
scala> BigInt("d41d8cd98f00b204e9800998ecf8427e", 16).toByteArray
res1: Array[Byte] = Array(0, -44, 29, -116, -39, -113, 0, -78, 4, -23, -128, 9, -104, -20, -8, 66, 126)
Note that this gives you back 17 bytes instead of the 16 you'd expect for a 128-bit hash. That's because BigInt is a signed value, so it pads the byte array with an extra 0 in the most-significant-byte place to keep the value from being interpreted as negative. You could use res1.takeRight(16) to grab only the 16 bytes you're probably interested in.

How can I select certain rows in a dataset? Mathematica

My question is probably really easy, but I am a mathematica beginner.
I have a dataset, lets say:
Column: Numbers from 1 to 10
Column Signs
Column Other signs.
{{1,2,3,4,5,6,7,8,9,10},{d,t,4,/,g,t,w,o,p,m},{g,h,j,k,l,s,d,e,w,q}}
Now I want to extract all rows for which column 1 provides an odd number. In other words I want to create a new dataset.
I tried to work with Select and OddQ as well as with the IF function, but I have absolutely no clue how to put this orders in the right way!
Taking a stab at what you might be asking..
(table = {{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} ,
Characters["abcdefghij"],
Characters["ABCDEFGHIJ"]}) // MatrixForm
table[[All, 1 ;; -1 ;; 2]] // MatrixForm
or perhaps this:
Select[table, OddQ[#[[1]]] &]
{{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}}
The convention in Mathematica is the reverse of what you use in your description.
Rows are first level sublists.
Let's take your original data
mytable = {{1,2,3,4,5,6,7,8,9,10},{d,t,4,"/",g,t,w,o,p,m},{g,h,j,k,l,s,d,e,w,q}}
Just as you suggested, Select and OddQ can do what you want, but on your table, transposed. So we transpose first and back:
Transpose[Select[Transpose[mytable], OddQ[First[#]]& ]]
Another way:
Mathematica functional command MapThread can work on synchronous lists.
DeleteCases[MapThread[If[OddQ[#1], {##}] &, mytable], Null]
The inner function of MapThread gets all elements of what you call a 'row' as variables (#1, #2, etc.). So it test the first column and outputs all columns or a Null if the test fails. The enclosing DeleteCases suppresses the unmatching "rows".

How to put numbers into an array and sorted by most frequent number in java

I was given this question on programming in java and was wondering what would be the best way of doing it.
The question was on the lines of:
From the numbers provided, how would you in java display the most frequent number. The numbers was: 0, 3, 4, 1, 1, 3, 7, 9, 1
At first I am thinking well they should be in an array and sorted first then maybe have to go through a for loop. Am I on the right lines. Some examples will help greatly
If the numbers are all fairly small, you can quickly get the most frequent value by creating an array to keep track of the count for each number. The algorithm would be:
Find the maximum value in your list
Create an integer array of size max + 1 (assuming all non-negative values) to store the counts for each value in your list
Loop through your list and increment the count at the index of each value
Scan through the count array and find the index with the highest value
The run-time of this algorithm should be faster than sorting the list and finding the longest string of duplicate values. The tradeoff is that it takes up more memory if the values in your list are very large.
With Java 8, this can be implemented rather smoothly. If you're willing to use a third-party library like jOOλ, it could be done like this:
List<Integer> list = Arrays.asList(0, 3, 4, 1, 1, 3, 7, 9, 1);
System.out.println(
Seq.seq(list)
.grouped(i -> i, Agg.count())
.sorted(Comparator.comparing(t -> -t.v2))
.map(t -> t.v1)
.toList());
(disclaimer, I work for the company behind jOOλ)
If you want to stick with the JDK 8 dependency, the following code would be equivalent to the above:
System.out.println(
list.stream()
.collect(Collectors.groupingBy(i -> i, Collectors.counting()))
.entrySet()
.stream()
.sorted(Comparator.comparing(e -> -e.getValue()))
.map(e -> e.getKey())
.collect(Collectors.toList()));
Both solutions yield:
[1, 3, 0, 4, 7, 9]