How do I print a string in one line in MARIE? - marie

I want to print a set of letters in one line in MARIE. I modified the code to print Hello World and came up with:
ORG 0 / implemented using "do while" loop
WHILE, LOAD STR_BASE / load str_base into ac
ADD ITR / add index to str_base
STORE INDEX / store (str_base + index) into ac
CLEAR / set ac to zero
ADDI INDEX / get the value at ADDR
SKIPCOND 400 / SKIP if ADDR = 0 (or null char)
JUMP DO / jump to DO
JUMP PRINT / JUMP to END
DO, STORE TEMP / output value at ADDR
LOAD ITR / load iterator into ac
ADD ONE / increment iterator by one
STORE ITR / store ac in iterator
JUMP WHILE / jump to while
PRINT, SUBT ONE
SKIPCOND 000
JUMP PR
HALT
PR, OUTPUT
JUMP WHILE
ONE, DEC 1
ITR, DEC 0
INDEX, HEX 0
STR_BASE, HEX 12 / memory location of str
STR, HEX 48 / H
HEX 65 / E
HEX 6C / L
HEX 6C / L
HEX 6F / O
HEX 0 / carriage return
HEX 57 / W
HEX 6F / O
HEX 72 / R
HEX 6C / L
HEX 64 / D
HEX 0 / NULL char
My program ends up halting past two iterations. I can't seem to figure out how to print a set of characters in one line. Thanks.

Your value of STR_BASE is almost certainly incorrect. Based on what is here, I would say it needs to be 18 instead of 12. You would also want to deal with the null char that currently sits between "HELLO" and "WORLD": either replace it with a space or simply remove that line, depending on your intended output.
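A quick way to sanity-check a hard-coded address such as STR_BASE is to count the words that precede the label. The sketch below is only an illustration in Python (not MARIE), assuming ORG 0 and one word per instruction or data line. `label_addresses` is a hypothetical helper, and the listing is transcribed from the question as posted, where TEMP is used but never declared; declaring TEMP anywhere ahead of STR would move STR up by one word.

```python
# Listing transcribed from the question (comments dropped, TEMP not declared).
LISTING = """\
WHILE, LOAD STR_BASE
ADD ITR
STORE INDEX
CLEAR
ADDI INDEX
SKIPCOND 400
JUMP DO
JUMP PRINT
DO, STORE TEMP
LOAD ITR
ADD ONE
STORE ITR
JUMP WHILE
PRINT, SUBT ONE
SKIPCOND 000
JUMP PR
HALT
PR, OUTPUT
JUMP WHILE
ONE, DEC 1
ITR, DEC 0
INDEX, HEX 0
STR_BASE, HEX 12
STR, HEX 48
"""

def label_addresses(listing, origin=0):
    """Map each label to its word address, counting one word per line."""
    addresses = {}
    address = origin
    for line in listing.splitlines():
        code = line.split("/")[0].strip()      # drop MARIE comments
        if not code:
            continue
        if "," in code:                        # "LABEL, INSTRUCTION"
            label = code.split(",", 1)[0].strip()
            addresses[label] = address
        address += 1
    return addresses

addresses = label_addresses(LISTING)
print(hex(addresses["STR_BASE"]), hex(addresses["STR"]))  # 0x16 0x17
```

Whatever address the assembler ends up assigning to STR is the value that STR_BASE has to hold.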

Calculating jmp's from one segment to another in windows PE files

Assume I have a binary on my disk that I load into memory using VirtualAlloc and ReadFile.
If I want to follow a jmp instruction from one section to another, what do I need to add or subtract to get the destination address?
In other words, I want to know how IDA calculates the loc_140845BB8 from jmp loc_140845BB8
Example:
.text:000000014005D74E jmp loc_140845BB8
Jumps to the section seg007
seg007:0000000140845BB8 ; seg007:0000000140845BC4↓j
seg007:0000000140845BB8 and rbx, r14
PE info (seg007 is the section named "")
Segments are arbitrary; the jump goes where it goes, without regard for segments. The jump target is calculated as the signed 32-bit displacement that follows the 0xE9 JMP opcode, added to the address of the next instruction (i.e. the address of the JMP + 5 bytes).
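import ida_ida
import ida_ua
import idc

def GetInsnLen(ea):
    # decode_insn returns the instruction length in bytes, or 0 on failure
    insn = ida_ua.insn_t()
    return ida_ua.decode_insn(insn, ea)

def MakeSigned(number, size):
    # Interpret the low 'size' bits of number as a two's complement value
    number = number & ((1 << size) - 1)
    return number if number < (1 << (size - 1)) else number - (1 << size)

def GetRawJumpTarget(ea):
    if ea is None:
        return None
    insnlen = GetInsnLen(ea)
    if not insnlen:
        return None
    # The rel32 displacement occupies the last 4 bytes of the instruction
    result = MakeSigned(idc.get_wide_dword(ea + insnlen - 4), 32) + ea + insnlen
    if ida_ida.cvar.inf.min_ea <= result < ida_ida.cvar.inf.max_ea:
        return result
    return None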
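Outside IDA, the same arithmetic can be sketched with the standard library alone. This is a minimal illustration, assuming a 5-byte E9 rel32 JMP and virtual addresses as IDA displays them; `rel32_jump_target` is a hypothetical helper, and the instruction bytes are not taken from the binary but chosen so that they reproduce the jump in the question.

```python
import struct

def rel32_jump_target(insn_address, insn_bytes):
    """Return the target of a 5-byte E9 rel32 JMP located at insn_address."""
    if len(insn_bytes) < 5 or insn_bytes[0] != 0xE9:
        raise ValueError("not an E9 rel32 JMP")
    # Signed 32-bit little-endian displacement right after the opcode
    (displacement,) = struct.unpack_from("<i", insn_bytes, 1)
    # The displacement is relative to the address of the next instruction
    return insn_address + 5 + displacement

# Assumed encoding for illustration: E9 followed by 0x007E8465 little-endian
jmp_bytes = bytes.fromhex("E965847E00")
print(hex(rel32_jump_target(0x14005D74E, jmp_bytes)))  # 0x140845bb8
```

Because the displacement is signed, a backward jump simply carries a negative value; nothing about the source or destination section enters the calculation.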

Encoding Spotify URI to Spotify Codes

Spotify Codes are little barcodes that allow you to share songs, artists, users, playlists, etc.
They encode information in the different heights of the "bars". There are 8 discrete heights that the 23 bars can be, which means 8^23 different possible barcodes.
Spotify generates barcodes based on their URI schema. This URI spotify:playlist:37i9dQZF1DXcBWIGoYBM5M gets mapped to this barcode:
The URI has a lot more information (62^22) in it than the code. How would you map the URI to the barcode? It seems like you can't simply encode the URI directly. For more background, see my "answer" to this question: https://stackoverflow.com/a/62120952/10703868
The patent explains the general process, this is what I have found.
This is a more recent patent
When using the Spotify code generator the website makes a request to https://scannables.scdn.co/uri/plain/[format]/[background-color-in-hex]/[code-color-in-text]/[size]/[spotify-URI].
Using Burp Suite, I found that when scanning a code through Spotify the app sends a request to Spotify's API: https://spclient.wg.spotify.com/scannable-id/id/[CODE]?format=json where [CODE] is the media reference that you were looking for. This request can be made through Python, but only with the [TOKEN] that was generated through the app, as this is the only way to get the correct scope. The app token expires in about half an hour.
import requests
head={
"X-Client-Id": "58bd3c95768941ea9eb4350aaa033eb3",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"App-Platform": "iOS",
"Accept": "*/*",
"User-Agent": "Spotify/8.5.68 iOS/13.4 (iPhone9,3)",
"Accept-Language": "en",
"Authorization": "Bearer [TOKEN]",
"Spotify-App-Version": "8.5.68"}
response = requests.get('https://spclient.wg.spotify.com:443/scannable-id/id/26560102031?format=json', headers=head)
print(response)
print(response.json())
Which returns:
<Response [200]>
{'target': 'spotify:playlist:37i9dQZF1DXcBWIGoYBM5M'}
So 26560102031 is the media reference for your playlist.
The patent states that the code is first detected and then possibly converted into 63 bits using a Gray table. For example 361354354471425226605 is encoded into 010 101 001 010 111 110 010 111 110 110 100 001 110 011 111 011 011 101 101 000 111.
However, the code sent to the API is 6875667268. I'm unsure how the media reference is generated, but this is the number used in the lookup table.
The reference contains the digits 0-9, compared to the Gray table's 0-7, implying that an algorithm using normal binary has been used. The patent talks about using a convolutional code and then the Viterbi algorithm for error correction, so this may be the output from that, which I believe is impossible to recreate without the states. However, I'd be interested if you can interpret the patent any better.
This media reference is 10 digits; however, others have 11 or 12.
Here are two more examples of the raw distances, the gray table binary and then the media reference:
1.
022673352171662032460
000 011 011 101 100 010 010 111 011 001 100 001 101 101 011 000 010 011 110 101 000
67775490487
2.
574146602473467556050
111 100 110 001 110 101 101 000 011 110 100 010 110 101 100 111 111 101 000 111 000
57639171874
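The step from raw distances to bits in these examples can be reproduced with a small lookup. This is only a sketch, assuming the Gray table (000, 001, 011, 010, 110, 111, 101, 100) for heights 0 to 7; `levels_to_bits` is a name made up here.

```python
# Gray table: index is the bar height (0 to 7), value is its 3-bit code
GRAY_BITS = ["000", "001", "011", "010", "110", "111", "101", "100"]

def levels_to_bits(levels):
    """Convert a string of bar heights into space-separated 3-bit groups."""
    return " ".join(GRAY_BITS[int(ch)] for ch in levels)

print(levels_to_bits("022673352171662032460"))
```

Adjacent heights differ in exactly one bit, so misreading a bar by one level corrupts only a single bit before error correction.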
edit:
Some extra info:
There are some posts online describing how you can encode any text such as spotify:playlist:HelloWorld into a code however this no longer works.
I also discovered through the proxy that you can use the domain to fetch the album art of a track above the code. This suggests a closer integration of Spotify's API and this scannables URL than previously thought, as it not only stores the URIs and their codes but can also validate URIs and return updated album art.
https://scannables.scdn.co/uri/800/spotify%3Atrack%3A0J8oh5MAMyUPRIgflnjwmB
Your suspicion was correct - they're using a lookup table. For all of the fun technical details, the relevant patent is available here: https://data.epo.org/publication-server/rest/v1.0/publication-dates/20190220/patents/EP3444755NWA1/document.pdf
Very interesting discussion. Always been attracted to barcodes so I had to take a look. I did some analysis of the barcodes alone (didn't access the API for the media refs) and think I have the basic encoding process figured out. However, based on the two examples above, I'm not convinced I have the mapping from media ref to 37-bit vector correct (i.e. it works in case 2 but not case 1). At any rate, if you have a few more pairs, that last part should be simple to work out. Let me know.
For those who want to figure this out, don't read the spoilers below!
It turns out that the basic process outlined in the patent is correct, but lacking in details. I'll summarize below using the example above. I actually analyzed this in reverse, which is why I think the code description is basically correct except for step (1), i.e. I generated 45 barcodes and all of them matched this code.
1. Map the media reference as integer to 37 bit vector.
Something like: write the number in base 2, with the least significant bit
on the left and zero-padding on the right if necessary.
57639171874 -> 0100010011101111111100011101011010110
2. Calculate CRC-8-CCITT, i.e. generator x^8 + x^2 + x + 1
The following steps are needed to calculate the 8 CRC bits:
Pad with 3 bits on the right:
01000100 11101111 11110001 11010110 10110000
Reverse bytes:
00100010 11110111 10001111 01101011 00001101
Calculate CRC as normal (highest order degree on the left):
-> 11001100
Reverse CRC:
-> 00110011
Invert check:
-> 11001100
Finally append to step 1 result:
01000100 11101111 11110001 11010110 10110110 01100
3. Convolutionally encode the 45 bits using the common generator
polynomials (1011011, 1111001) in binary with puncture pattern
110110 (or 101, 110 on each stream). The result of step 2 is
encoded using tail-biting, meaning we begin the shift register
in the state of the last 6 bits of the 45 long input vector.
Prepend stream with last 6 bits of data:
001100 01000100 11101111 11110001 11010110 10110110 01100
Encode using first generator:
(a) 100011100111110100110011110100000010001001011
Encode using 2nd generator:
(b) 110011100010110110110100101101011100110011011
Interleave bits (abab...):
11010000111111000010111011110011010011110001...
1010111001110001000101011000010110000111001111
Puncture every third bit:
111000111100101111101110111001011100110000100100011100110011
4. Permute data by choosing indices 0, 7, 14, 21, 28, 35, 42, 49,
56, 3, 10..., i.e. incrementing 7 modulo 60. (Note: unpermute by
incrementing 43 mod 60).
The encoded sequence after permuting is
111100110001110101101000011110010110101100111111101000111000
5. The final step is to map back to bar lengths 0 to 7 using the
gray map (000,001,011,010,110,111,101,100). This gives the 20 bar
encoding. As noted before, add three bars: short one on each end
and a long one in the middle.
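The five steps can be checked end to end with a dependency-free sketch. It is only an illustration of the description above, not Spotify's implementation; all function names are made up here, and step 1 uses the mapping that was only confirmed for media reference 57639171874.

```python
# Gray table: index is the bar height (0 to 7), value is its 3-bit code
GRAY_BITS = ["000", "001", "011", "010", "110", "111", "101", "100"]

def ref_to_bits(ref, length=37):
    # Step 1: binary digits with the least significant bit first
    return [(ref >> i) & 1 for i in range(length)]

def crc_bits(bits):
    # Step 2: CRC-8 (x^8 + x^2 + x + 1) over bit-reversed bytes,
    # then reverse and invert the 8 check bits
    padded = bits + [0] * (-len(bits) % 8)
    crc = 0
    for start in range(0, len(padded), 8):
        byte = padded[start:start + 8]
        crc ^= int("".join(map(str, reversed(byte))), 2)
        for _ in range(8):
            crc = ((crc << 1) ^ 0x07) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    msb_first = [(crc >> i) & 1 for i in range(7, -1, -1)]
    return [1 - b for b in reversed(msb_first)]

def conv_encode(bits):
    # Step 3: tail-biting convolutional code with two generators,
    # then drop every third bit of the interleaved stream
    generators = ([1, 0, 1, 1, 0, 1, 1], [1, 1, 1, 1, 0, 0, 1])
    padded = bits[-6:] + bits          # preload register with last 6 bits
    out = []
    for k in range(len(bits)):
        window = padded[k:k + 7][::-1]  # newest input bit first
        for gen in generators:
            out.append(sum(g & w for g, w in zip(gen, window)) % 2)
    return [b for i, b in enumerate(out) if i % 3 != 2]

def bits_to_levels(bits):
    # Steps 4 and 5: permute with stride 7 mod 60, then Gray-map triplets
    n = len(bits)
    perm = [bits[7 * i % n] for i in range(n)]
    return [GRAY_BITS.index("".join(map(str, perm[i:i + 3])))
            for i in range(0, n, 3)]

def encode(ref):
    data = ref_to_bits(ref)
    return bits_to_levels(conv_encode(data + crc_bits(data)))

print(encode(57639171874))
# [5, 7, 4, 1, 4, 6, 6, 0, 2, 4, 3, 4, 6, 7, 5, 5, 6, 0, 5, 0]
```

The printed list holds the 20 data bars only; the two short end bars and the long middle bar still have to be added to get the 23-bar code.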
UPDATE: I've added a barcode (levels) decoder (assuming no errors) and an alternate encoder that follows the description above rather than the equivalent linear algebra method. Hopefully that is a bit more clear.
UPDATE 2: Got rid of most of the hard-coded arrays to illustrate how they are generated.
The linear algebra method defines the linear transformation (spotify_generator) and mask to map the 37 bit input into the 60 bit convolutionally encoded data. The mask is the result of the 8-bit inverted CRC being convolutionally encoded. The spotify_generator is a 37x60 matrix that implements the product of the generators for the CRC (a 37x45 matrix) and the convolutional code (a 45x60 matrix). You can create the generator matrix from an encoding function by applying the function to each row of an identity matrix of the appropriate size. For example, a CRC function that adds 8 bits to each 37 bit data vector is applied to each row of a 37x37 identity matrix.
import numpy as np
import crccheck
# Utils for conversion between int, array of binary
# and array of bytes (as ints)
def int_to_bin(num, length, endian):
    if endian == 'l':
        return [num >> i & 1 for i in range(0, length)]
    elif endian == 'b':
        return [num >> i & 1 for i in range(length-1, -1, -1)]

def bin_to_int(bin,length):
    return int("".join([str(bin[i]) for i in range(length-1,-1,-1)]),2)

def bin_to_bytes(bin, length):
    b = bin[0:length] + [0] * (-length % 8)
    return [(b[i]<<7) + (b[i+1]<<6) + (b[i+2]<<5) + (b[i+3]<<4) +
        (b[i+4]<<3) + (b[i+5]<<2) + (b[i+6]<<1) + b[i+7] for i in range(0,len(b),8)]

# Return the circular right shift of an array by 'n' positions
def shift_right(arr, n):
    return arr[-n % len(arr):len(arr):] + arr[0:-n % len(arr)]
gray_code = [0,1,3,2,7,6,4,5]
gray_code_inv = [[0,0,0],[0,0,1],[0,1,1],[0,1,0],
[1,1,0],[1,1,1],[1,0,1],[1,0,0]]
# CRC using Rocksoft model:
# NOTE: this is not quite any of their predefined CRC's
# 8: number of check bits (degree of poly)
# 0x7: representation of poly without high term (x^8+x^2+x+1)
# 0x0: initial fill of register
# True: byte reverse data
# True: byte reverse check
# 0xff: Mask check (i.e. invert)
spotify_crc = crccheck.crc.Crc(8, 0x7, 0x0, True, True, 0xff)
def calc_spotify_crc(bin37):
    bytes = bin_to_bytes(bin37, 37)
    return int_to_bin(spotify_crc.calc(bytes), 8, 'b')

def check_spotify_crc(bin45):
    data = bin_to_bytes(bin45,37)
    return spotify_crc.calc(data) == bin_to_bytes(bin45[37:], 8)[0]
# Simple convolutional encoder
def encode_cc(dat):
    gen1 = [1,0,1,1,0,1,1]
    gen2 = [1,1,1,1,0,0,1]
    punct = [1,1,0]
    dat_pad = dat[-6:] + dat  # 6 bits are needed to initialize
                              # register for tail-biting
    stream1 = np.convolve(dat_pad, gen1, mode='valid') % 2
    stream2 = np.convolve(dat_pad, gen2, mode='valid') % 2
    enc = [val for pair in zip(stream1, stream2) for val in pair]
    return [enc[i] for i in range(len(enc)) if punct[i % 3]]
# To create a generator matrix for a code, we encode each row
# of the identity matrix. Note that the CRC is not quite linear
# because of the check mask, so we apply the lambda function to
# invert it. Given a 37 bit media reference we can encode by
#     ref * spotify_generator + spotify_mask (mod 2)
_i37 = np.identity(37, dtype=bool)
crc_generator = [_i37[r].tolist() +
                 list(map(lambda x : 1-x, calc_spotify_crc(_i37[r].tolist())))
                 for r in range(37)]
spotify_generator = 1*np.array([encode_cc(crc_generator[r]) for r in range(37)], dtype=bool)
del _i37
spotify_mask = 1*np.array(encode_cc(37*[0] + 8*[1]), dtype=bool)
# The following matrix is used to "invert" the convolutional code.
# In particular, we choose a 45 vector basis for the columns of the
# generator matrix (by deleting those in positions equal to 2 mod 4)
# and then inverting the matrix. By selecting the corresponding 45
# elements of the convolutionally encoded vector and multiplying
# on the right by this matrix, we get back to the unencoded data,
# assuming there are no errors.
# Note: numpy does not invert binary matrices, i.e. GF(2), so we
# hard code the following 3 row vectors to generate the matrix.
conv_gen = [[0,1,0,1,1,1,1,0,1,1,0,0,0,1]+31*[0],
[1,0,1,0,1,0,1,0,0,0,1,1,1] + 32*[0],
[0,0,1,0,1,1,1,1,1,1,0,0,1] + 32*[0] ]
conv_generator_inv = 1*np.array([shift_right(conv_gen[(s-27) % 3],s) for s in range(27,72)], dtype=bool)
# Given an integer media reference, returns list of 20 barcode levels
def spotify_bar_code(ref):
    bin37 = np.array([int_to_bin(ref, 37, 'l')], dtype=bool)
    enc = (np.add(1*np.dot(bin37, spotify_generator), spotify_mask) % 2).flatten()
    perm = [enc[7*i % 60] for i in range(60)]
    return [gray_code[4*perm[i]+2*perm[i+1]+perm[i+2]] for i in range(0,len(perm),3)]

# Equivalent function but using CRC and CC encoders.
def spotify_bar_code2(ref):
    bin37 = int_to_bin(ref, 37, 'l')
    enc_crc = bin37 + calc_spotify_crc(bin37)
    enc_cc = encode_cc(enc_crc)
    perm = [enc_cc[7*i % 60] for i in range(60)]
    return [gray_code[4*perm[i]+2*perm[i+1]+perm[i+2]] for i in range(0,len(perm),3)]

# Given 20 (clean) barcode levels, returns media reference
def spotify_bar_decode(levels):
    level_bits = np.array([gray_code_inv[levels[i]] for i in range(20)], dtype=bool).flatten()
    conv_bits = [level_bits[43*i % 60] for i in range(60)]
    cols = [i for i in range(60) if i % 4 != 2]  # columns to invert
    conv_bits45 = np.array([conv_bits[c] for c in cols], dtype=bool)
    bin45 = (1*np.dot(conv_bits45, conv_generator_inv) % 2).tolist()
    if check_spotify_crc(bin45):
        return bin_to_int(bin45, 37)
    else:
        print('Error in levels; Use real decoder!!!')
        return -1
An example:
>>> levels = [5,7,4,1,4,6,6,0,2,4,3,4,6,7,5,5,6,0,5,0]
>>> spotify_bar_decode(levels)
57639171874
>>> spotify_bar_code(57639171874)
[5, 7, 4, 1, 4, 6, 6, 0, 2, 4, 3, 4, 6, 7, 5, 5, 6, 0, 5, 0]
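The idea of building a generator matrix by encoding the rows of an identity matrix can be tried on a much smaller code. The sketch below is only a toy illustration, assuming a hypothetical encoder that appends one even-parity bit; none of these names come from the answer's code.

```python
def encode_parity(bits):
    """Toy linear encoder (hypothetical): append one even-parity bit."""
    return bits + [sum(bits) % 2]

def generator_matrix(encode, k):
    """Build a k-row generator matrix by encoding each identity row."""
    return [encode([int(i == j) for j in range(k)]) for i in range(k)]

def encode_with_matrix(bits, matrix):
    """Vector-matrix product over GF(2)."""
    return [sum(b & row[c] for b, row in zip(bits, matrix)) % 2
            for c in range(len(matrix[0]))]

G = generator_matrix(encode_parity, 4)
message = [1, 0, 1, 1]
print(encode_with_matrix(message, G) == encode_parity(message))  # True
```

This only works directly for linear encoders. An encoder with a constant inversion, like the CRC with its check mask, needs that constant handled separately, which is the role of spotify_mask in the code above.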

windbg script causes memory access violation

I am using the following windbg script to break when a certain value is encountered in the buffer when reading a file
bp ReadFile
.while(1)
{
    g
    $$ Get parameters of ReadFile()
    r $t0 = dwo(esp+4)
    r $t1 = dwo(esp+8)
    r $t2 = dwo(esp+0x0c)
    $$ Execute until return is reached
    pt
    $$ Read magic value in the buffer
    $$ CHANGE position in buffer here
    r $t5 = dwo(@$t1+0x00)
    $$ Check if magic value matches
    $$ CHANGE constant here
    .if(@$t5 == 0x70170000)
    {
        $$db @$t1
        $$ break
        .break
    }
}
$$ Clear BP for ReadFile (assume it is the 0th one)
bc 0
I get the following memory access violation when I run this script.
Memory access error at ');; $$ Check if magic value matches; $$ CHANGE constant here; .if(@$t5 == 0x70170000); {; $$db @$t1;; $$ break; .break; };'
Why is this the case?
If you need to read the buffer contents at kernel32!ReadFile, you need to save the buffer address and then step out of the function using gu (go up, i.e. step out).
When the breakpoint on ReadFile hits, the dword at esp+8 is the buffer pointer, so save it and step out:
r $t1 = poi(@esp+8);gu
The first dword of the buffer is poi(@$t1). Compare it with the required dword and take the necessary action with .if and .else:
.if( poi(@$t1) != 636c6163 ) {gc} .else {db @$t1 l10;gc}
Putting this all together on one line, the script should be:
bp k*32!ReadFile "r $t1 =poi(@esp+8);gu;.if((poi(@$t1))!=636c6163){gc}.else{db @$t1 l10;gc}"
Here 636c6163 is 'clac' ('calc' reversed, because the dword is read little-endian); use the dword you want instead.
A sample run on calc.exe (XP SP3, 32-bit):
bl
bp k*32!ReadFile "r $t1=poi(@esp+8);gu;.if((poi(@$t1))!=636c6163){gc}.else{db @$t1 l10;gc}"
.bpcmds
bp0 0x7c801812 "r $t1 = poi(@esp+8);gu;.if( (poi(@$t1))!=636c6163){gc}.else{db @$t1 l10;gc}"
0:002> g
00b865b0 63 61 6c 63 5f 77 68 61-74 69 73 5f 69 6e 74 72 calc_whatis_intr
00374df0 63 61 6c 63 00 ab ab ab-ab ab ab ab ab fe ee fe calc............
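The constant to compare against follows from little-endian byte order, which is why 'calc' in the buffer shows up as 636c6163. A small Python sketch (an illustration outside the debugger, not WinDbg syntax) converts in both directions:

```python
import struct

# The four bytes as they appear in the buffer dump
needle = b"calc"
(dword,) = struct.unpack("<I", needle)      # x86 loads them little-endian
print(f"{dword:08x}")                       # 636c6163

# The other direction: the bytes a given dword occupies in memory
print(struct.pack("<I", 0x70170000).hex())  # 00001770
```

So the magic value 0x70170000 from the question matches a buffer that starts with the bytes 00 00 17 70.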

Remove leading zeroes binary

I basically want to remove leading zeroes. When I print a number, for example 17, I get 0000 0000 0000 0000 0000 0000 0001 0001, but I want to remove those leading zeroes. That is what gets printed on the SPARC machine, and I need to do this using some sort of loop, logic, or shift operation.
This is my pseudocode for printing the binary:
store input, %l1 ! store my decimal number in l1
move 1,%l2 !move 1 into l2 register
shift logical left l2,31,l2 !shift my mask 1 31 times to the left
loop:
and l2,l1,l3 ! do and logic between l1 and l2 and put this in l3
compare l3,0 compare l3 zero
bne print 1 !branch not equal to zero, to print 1
if equal to 0
print zero
print 1:
print a 1
go: increment counter
compare counter 32
if counter less than 32 return to loop
shift l2 to the right to continue comparison
So this is what happens when my input in l1 is 17:
0000 0000 0000 0000 0000 0000 0001 0001
1000 0000 0000 0000 0000 0000 0000 0000 is my mask, 1 shifted left 31 times
This pseudocode prints my decimal input in binary. But how can I make it remove the leading zeroes?
Inside the SPARC machine, the input 17 is
0000 0000 0000 0000 0000 0000 0001 0001
You create the labels, like go and print 1 (more commonly done in all caps and without spaces, FYI). So, starting with bne you should always be printing 1, or falling through to see if it needs to print the 0:
! same initialization
    mov 0, %l4              ! Initialize a flag to avoid printing
LOOP:
    and %l2, %l1, %l3       ! do and logic between l1 and l2 and put this in l3
    cmp %l3, 0              ! Is this a 0 digit?
    bne ALWAYS_PRINT        ! If it's not 0, then it must be 1 (was "bne print 1")
    cmp %l4, 1              ! Should we be printing the 0?
    be PRINT_VALUE          ! Yes, we need to print the 0 because we have seen a 1
    ba INCREMENT            ! We should not be printing the 0, so check the next
                            ! digit (ba is "branch always")
ALWAYS_PRINT:
    mov 1, %l4              ! Note that we want to always print for the
                            ! rest of the loop
PRINT_VALUE:                ! Do whatever you're doing to print values
    print value in l3       ! Always print the value
INCREMENT:                  ! Formerly your "go:" label
    ! same logic
! AFTER LOOP IS DONE LOGIC
    cmp %l4, 0              ! If the flag was never set, then the value is 0
                            ! Alternatively, you could just compare the value to 0
                            ! and skip the loop entirely, only printing 0 (faster)
    bne DO_NOT_PRINT        ! If it was set (1), then do nothing
    print zero              ! If it wasn't set, then print the 0
DO_NOT_PRINT:
To walk through it a little: you still need to initialize your values and shift the mask to figure out what the current digit is on each iteration. Because you need an extra flag, you need another register that is initialized to a known value (I chose 0, which commonly represents false).
Get current digit into l3 (0 or 1)
See if it is 0
If it's not 0, then it must be 1. So go remember that we found a 1, for later, then print the value and increment/loop.
If it's 0, then see if we have found a 1 before. If so, then print the value and increment/loop. If not, then increment/loop.
For actually printing, I have no idea what you are actually doing. However, you can avoid a second comparison by using the labels. For example, ALWAYS_PRINT will always be used when the value is 1, so you can just set the flag and immediately print 1, then jump to INCREMENT. If you did that, then PRINT_VALUE would only be used to print 0, which could then fall through to INCREMENT.
From a high level language's perspective, you want:
int l2 = // value...
bool seenOneAlready = false;
if (l2 != 0)
{
    // MSB first
    for (int i = 31; i > -1; --i)
    {
        int l3 = (l2 >> i) & 1;
        if (l3 == 1)
        {
            seenOneAlready = true;
            printf("1");
        }
        else if (seenOneAlready)
        {
            printf("0");
        }
    }
}
else
{
    printf("0");
}
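The same flag-based logic, using a mask that walks from the most significant bit down as in the question's pseudocode, can be sketched in a few lines of Python to check the expected output before writing the assembly. `to_binary` is a name made up for this illustration, and a 32-bit unsigned input is assumed.

```python
def to_binary(value, width=32):
    """Binary digits of an unsigned value without leading zeroes."""
    digits = []
    seen_one = False
    mask = 1 << (width - 1)           # start at the most significant bit
    while mask:
        bit = 1 if value & mask else 0
        if bit:
            seen_one = True           # from here on, every digit is printed
        if seen_one:
            digits.append(str(bit))
        mask >>= 1                    # move to the next lower bit
    return "".join(digits) or "0"     # the value 0 still prints one digit

print(to_binary(17))  # 10001
```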

Need help identifying and computing a number representation

I need help identifying the following number format.
For example, the following number format in MIB:
0x94 0x78 = 2680
0x94 0x78 in binary: [1001 0100] [0111 1000]
It seems that if the MSB is 1, it means another character follows it. And if it is 0, it is the end of the number.
So the value 2680 is [001 0100] [111 1000], formatted properly is [0000 1010] [0111 1000]
What is this number format called and what's a good way for computing this besides bit manipulation and shifting to a larger unsigned integer?
I have seen this called either 7bhm (7-bit has-more) or VLQ (variable length quantity); see http://en.wikipedia.org/wiki/Variable-length_quantity
This is stored big-endian (most significant byte first), as opposed to the C# BinaryReader.Read7BitEncodedInt method described at Encoding an integer in 7-bit format of C# BinaryReader.ReadString
I am not aware of any method of decoding other than bit manipulation.
Sample PHP code can be found at
http://php.net/manual/en/function.intval.php#62613
or in Python 3 I would do something like
def encode_7bhm(i):
    # The last byte holds the low 7 bits with the has-more bit clear
    o = [i & 0x7f]
    i >>= 7
    while i > 0:
        # Earlier bytes are prepended with the has-more bit set
        o.insert(0, 0x80 | (i & 0x7f))
        i >>= 7
    return bytes(o)

def decode_7bhm(s):
    o = 0
    for v in s:
        o = 128*o + (v & 0x7f)
        if v & 0x80 == 0:
            # found end of encoded value
            break
    else:
        # out of string, and end not found - error!
        raise TypeError
    return o
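When the values arrive in a stream rather than in a separate buffer per number, the same decoding can be done byte by byte. A minimal sketch, assuming a binary file-like object; `read_vlq` is a hypothetical helper, and the trailing 0x05 byte is made-up test data.

```python
import io

def read_vlq(stream):
    """Read one big-endian 7-bit has-more value from a binary stream."""
    value = 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise ValueError("stream ended inside a VLQ value")
        byte = chunk[0]
        value = (value << 7) | (byte & 0x7F)   # append the next 7 bits
        if (byte & 0x80) == 0:                 # has-more bit clear: done
            return value

# 0x94 0x78 is the example from the question
stream = io.BytesIO(bytes([0x94, 0x78, 0x05]))
print(read_vlq(stream), read_vlq(stream))  # 2680 5
```

Python integers grow as needed, so there is no need to pick a larger unsigned type up front; in a fixed-width language the accumulator has to be wide enough for the longest value expected.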