Reverse Engineering combined CRC/Command Byte - hash

I have an unknown hash/CRC byte in which 4 bits are random/CRC/unknown and 4 bits are control flags. Each command byte can appear in a total of 16 distinct forms, and there should be 16 different groups. There are no collisions; each byte value is only ever seen once. I'm reverse engineering a Moshiboard/MS10105 laser control board, so some of the control flags can be guessed. For example, I can logically assume that the difference between turning the laser on and turning it off should be a single bit flip.
It seems like a solid puzzle but I can't really figure it out. There's a lot of information to go by, but also a lot of unknowns. Some of the control groups never show up and consequently I don't have them.
1. 0A 0E 1A 1E 4A 4E 51 53 59 5A 5B 5E 71 74 79 7B
2. 00 01 03 04 09 0C 10 14 21 23 29 2B 40 44 50 54
3. 55 57 5D 5F 75 77 7D 7F 8A 8E 9A 9E CA CE DA DE
4. 05 07 0D 0F 25 27 2D 2F 80 84 90 94 C0 C4 D0 D4
5. AA AE BA BE D5 D7 DD DF EA EE F5 F7 FA FD FE FF
6. 15 17 1D 1F 35 37 3D 3F 88 8C 98 9C C8 CC D8 DC
7. 45 47 4D 4F 65 67 6D 6F 82 86 92 96 C2 C6 D2 D6
8. A2 A6 B2 B6 C5 C7 CD CF E2 E5 E6 E7 ED EF F2 F6
Note: These are sorted numerically, since their actual order is unknown.
There is also a partial version of another command code range:
08 0C 13 19 1B 1C 31 3B 48 58 5C ?? ?? ?? ?? ?? which I could complete with a larger sample size, though I could likely also work it out just from the patterns.
The eight groups above correspond to the following commands:
1. Command header control byte. Followed by 1 int16_le. (Speed, Unk)
2. Position byte. Followed by 3 int16_le
3. Laser Off, X value, Y value. Followed by 2 int16_le
4. Termination byte. 0 int16_le
5. Laser On, X value, Y value. Followed by 2 int16_le
6. Laser Off, Y value. Followed by 1 int16_le
7. Laser Off, X value. Followed by 1 int16_le
8. Laser On, X value. Followed by 1 int16_le.
Now, I can reasonably assume that 3 and 5 differ by one bit, and that 7 and 8 differ by one bit. There are a lot of patterns in the byte codes, and 4 of the bits are randomish. They might be a CRC or purely random; the bytes are evenly distributed within each group, so the 4 non-control bits are likely something evenly distributed.
Given that I do not know the hash, or even where any of these bits are located within the byte, is this solvable? I think the bit positions would be solvable if there were a very easy method to do the hash. I don't know anything about the randomish bits, and I can only make educated guesses about the other bits. But groups 3, 4, 6 and 7 are highly similar: group 3 is group 4 with 0x50 added to each of the first 8 bytes and 0x0A added to each of the last 8 bytes. Likewise, group 6 is group 4 with 0x10 added to each of the first 8 bytes and 0x08 added to each of the last 8 bytes.
I can't solve it, however. The groups I'd expect to be one bit flip apart because of their contextual meaning (3 and 5, 7 and 8) seem less similar. Parts of #5 are clearly 2x parts of #3 (for example, 0x55 * 2 = 0xAA). Some seem off by a bit shift; others seem to differ by a static amount in the high nibble and the low nibble.
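The relationships between groups 3, 4, 6 and 7 described above can be checked mechanically. The following is a minimal sketch in Python; the `GROUPS` table is copied from the sorted lists in the question, and the helper names `half_offsets` and `doubles_into` are made up for this illustration.

```python
# Sorted byte groups copied from the numbered lists in the question (hex).
GROUPS = {
    3: bytes.fromhex("55575D5F75777D7F8A8E9A9ECACEDADE"),
    4: bytes.fromhex("05070D0F25272D2F80849094C0C4D0D4"),
    5: bytes.fromhex("AAAEBABED5D7DDDFEAEEF5F7FAFDFEFF"),
    6: bytes.fromhex("15171D1F35373D3F888C989CC8CCD8DC"),
    7: bytes.fromhex("45474D4F65676D6F82869296C2C6D2D6"),
}


def half_offsets(base, other):
    """Return the sets of byte differences (other - base, mod 256)
    for the first 8 and the last 8 entries of two sorted groups."""
    diffs = [(o - b) & 0xFF for b, o in zip(base, other)]
    return set(diffs[:8]), set(diffs[8:])


def doubles_into(src, dst):
    """True if every byte of src, doubled (mod 256), appears in dst."""
    return all(((2 * x) & 0xFF) in dst for x in src)


if __name__ == "__main__":
    for n in (3, 6, 7):
        first, second = half_offsets(GROUPS[4], GROUPS[n])
        print(n, [hex(d) for d in sorted(first)], [hex(d) for d in sorted(second)])
    print(doubles_into(GROUPS[3][:8], GROUPS[5]))
```

A single constant per half is what one would expect if each group differs from group 4 only in a few fixed bits, with those bits sitting in different positions for odd and even bytes.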
There's a lot of pattern here, and it's a 2013 series laserboard so it's not going to do something highly processor intensive.
Additional seemingly not highly relevant information available: https://github.com/meerk40t/moshi for the project, and the wiki there as well.
All numbers are in Hex.
What are these lines of hex?
These are all different forms of the same command as interpreted by the laser cutter control board.
Are they pulled from network packets?
These were intercepted with Wireshark while running over a USB channel. The messages were then sent to a CH341 serial chip.
What are the "x" and "y" values you used to generate this traffic?
The X and Y values are positions for the laser cutter to go to. These are given as 16-bit little-endian values. The command issued has some kind of flag indicating what type of value is being sent: X, Y, or both X and Y. These are different commands, but there are 16 forms of each command.
What do you mean by "followed by 1 int16_le"?
I mean that in the command structure, that particular command, in any one of its 16 forms, is followed by a single 16-bit integer in little-endian form.
The problem is that I would expect these command bytes to be a single value, typically with certain bits flagged on or off to tell the laser cutting board what the laser cutting software wants it to do. But rather than 1 command that means 1 thing, I have 16 commands that mean (I think) 1 thing, which seems weird. There's a lot of patterning, but I can't really figure out what the pattern is.

# Move bit i of b to bit position p_i (p0 is the new position of bit 0, and so on).
def swizzle(b, p7, p6, p5, p4, p3, p2, p1, p0):
    return ((b >> 0) & 1) << p0 | ((b >> 1) & 1) << p1 | \
           ((b >> 2) & 1) << p2 | ((b >> 3) & 1) << p3 | \
           ((b >> 4) & 1) << p4 | ((b >> 5) & 1) << p5 | \
           ((b >> 6) & 1) << p6 | ((b >> 7) & 1) << p7

# Odd and even bytes use different bit permutations.
def convert(q):
    if q & 1:
        return swizzle(q, 7, 6, 2, 4, 3, 5, 1, 0)
    else:
        return swizzle(q, 5, 1, 7, 2, 4, 3, 6, 0)
Applying convert to each byte of the code lines above, and sorting the results, turns them into:
['50', '51', '52', '53', '54', '55', '56', '58', '59', '5a', '5b', '5c', '5d', '5e', '5f', '8e']
['00', '01', '02', '03', '04', '05', '06', '07', '08', '09', '0a', '0c', '0d', '0e', '0f', '18']
['70', '71', '72', '73', '74', '75', '76', '77', '78', '79', '7a', '7b', '7c', '7d', '7e', '7f']
['20', '21', '22', '23', '24', '25', '26', '27', '28', '29', '2a', '2b', '2c', '2d', '2e', '2f']
['f0', 'f1', 'f2', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9', 'fa', 'fb', 'fc', 'fd', 'fe', 'ff']
['30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '3a', '3b', '3c', '3d', '3e', '3f']
['60', '61', '62', '63', '64', '65', '66', '67', '68', '69', '6a', '6b', '6c', '6d', '6e', '6f']
['e0', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6', 'e7', 'e8', 'e9', 'ea', 'eb', 'ec', 'ed', 'ee', 'ef']
The bit permutation used differs slightly depending on whether the byte is odd or even (its lowest bit). There is one errant number in each of the first and second rows that does not fit the pattern. In both cases the observed byte is 1 greater than the byte one would predict: 0x0C was seen where 0x0B was expected, and 0x74 where 0x73 was expected. This is at least usable. There might be a different algorithm behind the swizzling, with some kind of carry error accounting for these two bytes.
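With the bytes converted, the high nibble identifies the command group and the low nibble is the per-command variant. In the converted rows above, the laser-on groups (0xF and 0xE) differ from their laser-off counterparts (0x7 and 0x6) only in bit 7, which would be consistent with the single-bit laser flag guessed in the question, though that reading is an assumption. The sketch below classifies a raw command byte and unpacks the int16_le values that follow it. The `COMMANDS` table and the `decode` helper are hypothetical names, the command descriptions come from the numbered list in the question, and treating the values as signed is an assumption.

```python
import struct


def swizzle(b, p7, p6, p5, p4, p3, p2, p1, p0):
    """Move bit i of b to bit position p_i."""
    positions = (p0, p1, p2, p3, p4, p5, p6, p7)
    return sum(((b >> i) & 1) << p for i, p in enumerate(positions))


def convert(q):
    """Apply the odd or even bit permutation from the answer above."""
    if q & 1:
        return swizzle(q, 7, 6, 2, 4, 3, 5, 1, 0)
    return swizzle(q, 5, 1, 7, 2, 4, 3, 6, 0)


# Converted high nibble -> (description, number of int16_le values that follow).
# Only the eight groups observed in the question are listed.
COMMANDS = {
    0x5: ("Command header", 1),
    0x0: ("Position", 3),
    0x7: ("Laser Off, X value, Y value", 2),
    0x2: ("Termination", 0),
    0xF: ("Laser On, X value, Y value", 2),
    0x3: ("Laser Off, Y value", 1),
    0x6: ("Laser Off, X value", 1),
    0xE: ("Laser On, X value", 1),
}


def decode(packet):
    """Return (description, variant, values) for a packet that starts with a
    command byte, or None if the converted high nibble is not a known group."""
    code = convert(packet[0])
    entry = COMMANDS.get(code >> 4)
    if entry is None:
        return None
    name, count = entry
    # Assumption: the payload values are signed 16-bit little-endian integers.
    values = list(struct.unpack_from("<%dh" % count, packet, 1))
    return name, code & 0x0F, values


if __name__ == "__main__":
    print(decode(bytes([0x55, 0x10, 0x00, 0x20, 0x00])))
```

The errant byte 0x74 converts to 0x8E and is therefore not recognized by this table, so a real decoder would need to special-case it or use a corrected transform.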
This also predicts the full set of 16 groups, including the ones not yet seen:
['00', '01', '40', '03', '10', '21', '50', '23', '04', '09', '44', '0b', '14', '29', '54', '2b']
['08', '11', '48', '13', '18', '31', '58', '33', '0c', '19', '4c', '1b', '1c', '39', '5c', '3b']
['80', '05', 'c0', '07', '90', '25', 'd0', '27', '84', '0d', 'c4', '0f', '94', '2d', 'd4', '2f']
['88', '15', 'c8', '17', '98', '35', 'd8', '37', '8c', '1d', 'cc', '1f', '9c', '3d', 'dc', '3f']
['02', '41', '42', '43', '12', '61', '52', '63', '06', '49', '46', '4b', '16', '69', '56', '6b']
['0a', '51', '4a', '53', '1a', '71', '5a', '73', '0e', '59', '4e', '5b', '1e', '79', '5e', '7b']
['82', '45', 'c2', '47', '92', '65', 'd2', '67', '86', '4d', 'c6', '4f', '96', '6d', 'd6', '6f']
['8a', '55', 'ca', '57', '9a', '75', 'da', '77', '8e', '5d', 'ce', '5f', '9e', '7d', 'de', '7f']
['20', '81', '60', '83', '30', 'a1', '70', 'a3', '24', '89', '64', '8b', '34', 'a9', '74', 'ab']
['28', '91', '68', '93', '38', 'b1', '78', 'b3', '2c', '99', '6c', '9b', '3c', 'b9', '7c', 'bb']
['a0', '85', 'e0', '87', 'b0', 'a5', 'f0', 'a7', 'a4', '8d', 'e4', '8f', 'b4', 'ad', 'f4', 'af']
['a8', '95', 'e8', '97', 'b8', 'b5', 'f8', 'b7', 'ac', '9d', 'ec', '9f', 'bc', 'bd', 'fc', 'bf']
['22', 'c1', '62', 'c3', '32', 'e1', '72', 'e3', '26', 'c9', '66', 'cb', '36', 'e9', '76', 'eb']
['2a', 'd1', '6a', 'd3', '3a', 'f1', '7a', 'f3', '2e', 'd9', '6e', 'db', '3e', 'f9', '7e', 'fb']
['a2', 'c5', 'e2', 'c7', 'b2', 'e5', 'f2', 'e7', 'a6', 'cd', 'e6', 'cf', 'b6', 'ed', 'f6', 'ef']
['aa', 'd5', 'ea', 'd7', 'ba', 'f5', 'fa', 'f7', 'ae', 'dd', 'ee', 'df', 'be', 'fd', 'fe', 'ff']
Each row is given in order of the converted byte's low nibble (0 to F) rather than sorted numerically.
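The predicted rows above can be regenerated by inverting convert. The sketch below does this by brute force over all 256 byte values, which works because convert maps each byte to a distinct result. The names `INVERSE` and `predicted_group` are made up for this illustration.

```python
def swizzle(b, p7, p6, p5, p4, p3, p2, p1, p0):
    """Move bit i of b to bit position p_i."""
    positions = (p0, p1, p2, p3, p4, p5, p6, p7)
    return sum(((b >> i) & 1) << p for i, p in enumerate(positions))


def convert(q):
    """Apply the odd or even bit permutation from the answer above."""
    if q & 1:
        return swizzle(q, 7, 6, 2, 4, 3, 5, 1, 0)
    return swizzle(q, 5, 1, 7, 2, 4, 3, 6, 0)


# Brute-force inverse: converted byte -> raw byte as seen on the wire.
INVERSE = {convert(q): q for q in range(256)}


def predicted_group(high_nibble):
    """Raw bytes whose converted form is high_nibble followed by 0x0..0xF."""
    return [INVERSE[(high_nibble << 4) | low] for low in range(16)]


if __name__ == "__main__":
    for nibble in range(16):
        print(["%02x" % b for b in predicted_group(nibble)])
```

Calling `predicted_group(0x1)` gives the row that contains the partial command range from the question (08 0C 13 19 1B 1C 31 3B 48 58 5C).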

Boolean expression to determine if 8-bit input is within range

Given the following in 8-bit 2s complement numbers:
11000011 = -61 (decimal)
00011111 = +31 (decimal)
I am required to obtain a boolean expression of a logic circuit whose output out goes high when its 8-bit input in (also in 2s complement representation) is in the following range:
-61 < in < 31
Number line for 8 bit numbers (2s complement):
10000000 (most negative) ..... 11000011 (-61) ..... 00000000 ..... 00011111 (31) ..... 01111111 (most positive)
Is there any way of solving this problem besides brute-forcing and comparing bit-by-bit?
Edit: The following statement is not allowed
out = ((in < 11000011 && in > 10000000) || (in > 00011111 && in < 01111111)) ? 1'b0 : 1'b1;
I'm not sure if there is a faster way to do this. But what I did was to list the numbers out in 2s complement format before trying to find a pattern. The following chunks of numbers are sorted in numerical order (from 00000000 to 11111111 so that the pattern can be more clearly seen).
Let the MSB be A and LSB be H. The equation is: A B C + A B D + A B E + A B F + A' B' C' D' + A' B' C' E' + A' B' C' F' + A' B' C' G' + A' B' C' H'
A' B' C' D' (easiest to observe):
00000000 (<- min)
00000001
00000010
00000011
00000100
00000101
00000110
00000111
00001000
00001001
00001010
00001011
00001100
00001101
00001110
00001111
A' B' C' E' + A' B' C' F' + A' B' C' G' + A' B' C' H':
00010000
00010001
00010010
00010011
00010100
00010101
00010110
00010111
00011000
00011001
00011010
00011011
00011100
00011101
00011110
A B D + A B E + A B F:
11000100
11000101
11000110
11000111
11001000
11001001
11001010
11001011
11001100
11001101
11001110
11001111
11010000
11010001
11010010
11010011
11010100
11010101
11010110
11010111
11011000
11011001
11011010
11011011
11011100
11011101
11011110
11011111
A B C (easiest to observe):
11100000
11100001
11100010
11100011
11100100
11100101
11100110
11100111
11101000
11101001
11101010
11101011
11101100
11101101
11101110
11101111
11110000
11110001
11110010
11110011
11110100
11110101
11110110
11110111
11111000
11111001
11111010
11111011
11111100
11111101
11111110
11111111 (<-max)
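The expression given above can be verified exhaustively, since there are only 256 possible inputs. The sketch below evaluates the nine product terms for every 8-bit pattern and compares the result with the range test on the two's complement value. It is written in Python purely as a checking aid; the function names `to_signed` and `sop` are made up for this illustration.

```python
def to_signed(value):
    """Interpret an 8-bit pattern as a two's complement integer."""
    return value - 256 if value & 0x80 else value


def sop(value):
    """Evaluate A B C + A B D + A B E + A B F + A' B' C' (D' + E' + F' + G' + H'),
    expanded into its nine product terms, with A the MSB and H the LSB."""
    a, b, c, d, e, f, g, h = ((value >> shift) & 1 for shift in range(7, -1, -1))
    na, nb, nc = 1 - a, 1 - b, 1 - c
    terms = [
        a & b & c,
        a & b & d,
        a & b & e,
        a & b & f,
        na & nb & nc & (1 - d),
        na & nb & nc & (1 - e),
        na & nb & nc & (1 - f),
        na & nb & nc & (1 - g),
        na & nb & nc & (1 - h),
    ]
    return int(any(terms))


if __name__ == "__main__":
    mismatches = [v for v in range(256) if sop(v) != int(-61 < to_signed(v) < 31)]
    print("mismatches:", mismatches)
```

An empty mismatch list means the sum of products matches the range -61 < in < 31 for every input.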

Remove specific rows from a structure

I have a 1x1 structure (EEG) with 42 fields. One of these fields is called event and is a 1x180 structure, with 13 different fields, some of which are strings and some numeric values.
The 4th field of EEG.event is type and it contains strings (i.e. 'preo', 'pred', 'to', 'td', 'po', 'pd').
I would like to keep only those rows of the structure that contain 'preo' in the column EEG.event.type.
My ultimate aim is to create a matrix with all the columns from the structure EEG.event and only the rows with 'preo' in EEG.event.type, plus other columns from other variables.
So far I tried:
S = struct2table(EEG.event);
and it correctly returns a 180x13 table.
However I was not able to select only the rows with 'preo' in type. I tried:
A= S(S.type=='preo', :);
and it gives me an error:
Undefined operator '==' for input arguments of type 'cell'.
I also tried:
array(strcmp(S(:, 4), 'preo'), :) = [];
and it gives me this error:
Deletion requires an existing variable.
Then I thought that maybe I should have converted the table into matrix, to directly delete rows from the matrix. I tried:
B = cell2mat(S);
but it returns this error:
Error using cell2mat (line 42)
You cannot subscript a table using only one subscript. Table subscripting requires both row and variable subscripts.
Any suggestion or tip is welcome, because I don't know how to continue.
Example list that I have (only 18 rows here):
13 1 201011 'preo' 2502 201 1 1 'y' 'h' 13 13.9230000000000 13
14 1 201011 'pred' 2684 201 1 1 'y' 'h' 14 14.1049999960000 14
15 1 201012 'to' 2707 201 1 2 'y' 'h' 15 14.1280000000000 15
16 1 201012 'td' 2993 201 1 2 'y' 'h' 16 14.4140000000000 16
17 1 201013 'po' 3019 201 1 3 'y' 'h' 17 14.4400000000000 17
18 1 201013 'pd' 3383 201 1 3 'y' 'h' 18 14.8040000000000 18
55 2 61011 'preo' 8213 61 1 1 'y' 'h' 55 53.9000000000000 55
56 2 61011 'pred' 8522 61 1 1 'y' 'h' 56 54.2089999850000 56
57 2 61012 'to' 8547 61 1 2 'y' 'h' 57 54.2340000000000 57
58 2 61012 'td' 8834 61 1 2 'y' 'h' 58 54.5210000000000 58
59 2 61013 'po' 8858 61 1 3 'y' 'h' 59 54.5450000000000 59
60 2 61013 'pd' 9091 61 1 3 'y' 'h' 60 54.7780000000000 60
85 3 124011 'preo' 13924 124 1 1 'y' 'h' 85 82.4550000000000 85
86 3 124011 'pred' 14159 124 1 1 'y' 'h' 86 82.6899999990000 86
87 3 124012 'to' 14181 124 1 2 'y' 'h' 87 82.7120000000000 87
88 3 124012 'td' 14448 124 1 2 'y' 'h' 88 82.9790000000000 88
89 3 124013 'po' 14470 124 1 3 'y' 'h' 89 83.0010000000000 89
90 3 124013 'pd' 14713 124 1 3 'y' 'h' 90 83.2440000000000 90
Example list that I would like to have (from the 18 rows above):
13 1 201011 'preo' 2502 201 1 1 'y' 'h' 13 13.9230000000000 13
55 2 61011 'preo' 8213 61 1 1 'y' 'h' 55 53.9000000000000 55
85 3 124011 'preo' 13924 124 1 1 'y' 'h' 85 82.4550000000000 85
I found a solution, and I'm posting it here for others with the same issue.
I first create a cell array, and then I keep only the wanted rows of that cell array. At the moment this is the best I can think of.
myCell = struct2cell(EEG.event);
% This results in a 3-D cell array, with the fields as the first dimension (13), a singleton second dimension, and the rows as the third dimension (180).
new_Cell = permute(myCell, [3, 1, 2]);
% This moves the singleton dimension to the end and swaps the other two dimensions, giving a 180x13 cell array.
[r, c] = find(strcmp(new_Cell, 'preo'));
% Row (r) and column (c) indices of the cells that contain the string 'preo'.
y = new_Cell(r, :);
% This keeps only the wanted rows of the permuted cell array 'new_Cell'.

Matlab : reshaping matrix from a vector

I've been building a multi-channel streaming DAQ system in LabVIEW.
I bring the saved binary file into MATLAB for post-processing.
I need to sort the file data by channel.
An example is below,
with 3-channel acquisition and a 5 Hz sampling rate:
First channel voltage : 1V(constant)
Second channel voltage : 2V(constant)
Third channel voltage : 3V(constant)
If I acquire signals for 4 seconds under these conditions, the saved data will look like the following, because the system buffers the signal and writes it to a single file once per second.
ch1=[1 1 1 1 1];
ch2=[2 2 2 2 2];
ch3=[3 3 3 3 3];
B=[ch1 ch2 ch3 ch1 ch2 ch3 ch1 ch2 ch3 ch1 ch2 ch3];
I want to rearrange the data like below.
desiredB=[ch1 ch1 ch1 ch1; ch2 ch2 ch2 ch2; ch3 ch3 ch3 ch3];
To rearrange B, I wrote the code below with two for loops.
fs=5; %sampling frequency
nCh=3; %number of channels
nB=length(B);
C=zeros(nB/fs,fs);
% Split B into one row per channel per one-second buffer
for i=1:nB/fs
    temp=B((i-1)*fs+1:fs*i);
    C(i,1:fs)=temp;
end
sizeC=size(C);
T=sizeC(1)/nCh;
D=zeros(nCh,fs*T);
% Place each group of nCh rows side by side, one block of columns per second
for j=1:T
    temp2=C(nCh*(j-1)+1:nCh*j,:);
    D(:,(j-1)*fs+1:j*fs)=temp2;
end
t_axis=0:1/fs:T-1/fs;
plot(t_axis,D','linewidth',2),grid on
axis([0 3.8 0 5])
xlabel('time(sec)')
ylabel('voltage(V)')
legend('first channel','second channel','third channel')
It works, but it is slow when I read a large data file.
Is there a better way to reshape this kind of data?
I think this does what you want:
fs=5; %sampling frequency
nCh=3; %number of channels
ch1=[11 12 13 14 15];
ch2=[21 22 23 24 25];
ch3=[31 32 33 34 35];
B=[ch1 ch2 ch3 ch1 ch2 ch3 ch1 ch2 ch3 ch1 ch2 ch3];
C = reshape(B, fs, nCh, []);
D = permute(C, [1, 3, 2]);
E = reshape(D, [], nCh).'
E =
11 12 13 14 15 11 12 13 14 15 11 12 13 14 15 11 12 13 14 15
21 22 23 24 25 21 22 23 24 25 21 22 23 24 25 21 22 23 24 25
31 32 33 34 35 31 32 33 34 35 31 32 33 34 35 31 32 33 34 35

How to use the 'if' statement in matlab?

I have a cell array of size 5x5 as below
B= 00 10 11 10 11
01 01 01 01 11
10 00 01 00 01
10 10 01 01 11
10 10 10 00 10
And two column vectors
S1= 21
23
28
25
43
S2= 96
85
78
65
76
I want to create a new cell array of the same size as B say 5x5 such that it satisfies the following condition
Final={S1 if B{i}=11
S1 if B{i}=10
S2 if B{i}=01
S2 if B{i}=00
So the resulting output would be something like this
Z = s2 s1 s1 s1 s1
s2 s2 s2 s2 s1
s1 s2 s2 s2 s2
s1 s1 s2 s2 s1
s1 s1 s1 s2 s1
ie Z= 96 21 21 21 21
85 85 85 85 23
28 78 78 78 78
25 25 65 65 25
43 43 43 76 43
I tried using an if condition, but I get an error saying
'Error: The expression to the left of the equals sign is not a valid target for an assignment.'
for i=1:1:128
for j=1:1:16
if fs{i,j}=00
Z{i,j}=S1{i,j}
elseif fs{i,j}= 01
Z{i,j}=S2{i,j}
elseif fs{i,j}= 10
Z{i,j}=S1{i,j}
elseif fs{i,j}= 11
Z{i,j}=S2{i,j}
end
end
I think I'm making a mistake in the if statement as well as in the expressions I'm using. Where am I going wrong? Please help. Thanks in advance.
Use == for comparison and = for assignment. So if fs{i,j}==00, etc.
Edit: Matlab is really designed for highly vectorized operations. Nested loops are slow compared to native functions, and typically can be replaced with vectorized versions. Is there any particular reason why you are using cell arrays instead of matrices, especially when you only have numeric data?
If B, S1, and S2 were matrices your code could be written in one highly efficient line that will run much much faster:
Z = bsxfun(@times, S1, B == 11 | B == 10) + bsxfun(@times, S2, B == 01 | B == 0)
Since B is a cell array you will want to convert it to a matrix using cell2mat unless you'd like to use cellfun.
Alternatively, you can just call B_mat = cell2mat(B), followed by (B_mat>=10).*repmat(S1,1,5) + (B_mat<10).*repmat(S2,1,5).
It's possible that your cell array actually contains binary values, possibly represented as strings, in which case the conditions used above would need to be changed. Then using cellfun may be necessary.

Matlab beginner median , mode and binning

I am a beginner with MATLAB and I am struggling with this assignment. Can anyone guide me through it?
Consider the data given below:
x = [ 1 , 48 , 81 , 2 , 10 , 25 , ,14 , 18 , 53 , 41, 56, 89,0, 1000, , ...
34, 47, 455, 21, , 22, 100 ];
Once the data is loaded, see if you can find any:
Outliers or
Missing data in the data file
Correct the missing values using median, mode and noisy data using median binning, mean binning and bin boundaries.
This isn't so bad. First off, take a look at the distribution of your data. You can see that the majority of your data has double digits. The outliers are those with single digits, or those that are way larger than double digits. Mind you, this is totally subjective so someone else may tell you that the single digits are part of your data too. Also, the missing data are those numbers that are spaces in between the commas. Let's write some MATLAB code and change these to NaN (or not-a-number), because if you try copying and pasting this code directly into MATLAB, it will give you a syntax error because if you are explicitly defining numbers this way, you have to be sure all of them are there.
To do this, use regexprep to insert a NaN wherever the string contains a comma, a space, and then another comma. For that to work, the statement first has to be stored as a string. We then use eval to convert the modified string into an actual MATLAB statement:
x = '[ 1 , 48 , 81 , 2 , 10 , 25 , ,14 , 18 , 53 , 41, 56, 89,0, 1000, , 34, 47, 455, 21, , 22, 100 ];'
y = eval(regexprep(x, ', ,', ', NaN, '));
If we display this data, we get:
y =
Columns 1 through 6
1 48 81 2 10 25
Columns 7 through 12
NaN 14 18 53 41 56
Columns 13 through 18
89 0 1000 NaN 34 47
Columns 19 through 23
455 21 NaN 22 100
As such, to answer our first question, any values that are missing are denoted as NaN and those numbers that are bigger than double digits are outliers.
For the next question, we simply extract those values that are not missing, calculate the mean and median of what is not missing, and fill in those NaN values with the mean and median. For the bin boundaries, this is the same thing as using the values to the left (or right... depends on your definition, but let's use left) of the missing value and fill those in. As such:
yMissing = isnan(y); %// Which values are missing?
y_noNaN = y(~yMissing); %// Extract the non-missing values
meanY = mean(y_noNaN); %// Get the mean
medianY = median(y_noNaN); %// Get the median
%// Output - Fill in missing values with median
yMedian = y;
yMedian(yMissing) = medianY;
%// Same for mean
yMean = y;
yMean(yMissing) = meanY;
%// Bin boundaries
yBinBound = y;
yBinBound(yMissing) = y(find(yMissing)-1);
The mean and median of the non-missing values are:
meanY =
105.8500
medianY =
37.5000
The outputs for each of these, in addition to the original data with the missing values, look like:
format bank; %// Do this to show just the first two decimal places for compact output
format compact;
y =
Columns 1 through 5
1 48 81 2 10
Columns 6 through 10
25 NaN 14 18 53
Columns 11 through 15
41 56 89 0 1000
Columns 16 through 20
NaN 34 47 455 21
Columns 21 through 23
NaN 22 100
yMean =
Columns 1 through 5
1.00 48.00 81.00 2.00 10.00
Columns 6 through 10
25.00 105.85 14.00 18.00 53.00
Columns 11 through 15
41.00 56.00 89.00 0 1000.00
Columns 16 through 20
105.85 34.00 47.00 455.00 21.00
Columns 21 through 23
105.85 22.00 100.00
yMedian =
Columns 1 through 5
1.00 48.00 81.00 2.00 10.00
Columns 6 through 10
25.00 37.50 14.00 18.00 53.00
Columns 11 through 15
41.00 56.00 89.00 0 1000.00
Columns 16 through 20
37.50 34.00 47.00 455.00 21.00
Columns 21 through 23
37.50 22.00 100.00
yBinBound =
Columns 1 through 5
1.00 48.00 81.00 2.00 10.00
Columns 6 through 10
25.00 25.00 14.00 18.00 53.00
Columns 11 through 15
41.00 56.00 89.00 0 1000.00
Columns 16 through 20
1000.00 34.00 47.00 455.00 21.00
Columns 21 through 23
21.00 22.00 100.00
If you take a look at each of the output values, this fills in our data with the mean, median and also the bin boundaries as per the question.