How do I parse a captured packet in python? - sockets

I have captured a raw packet using Python's sockets:
import socket

s = socket.socket(socket.AF_PACKET, socket.SOCK_RAW, socket.ntohs(0x0003))
while True:
    message = s.recv(4096)
    test = []
    print(len(message))
    print(repr(message))
I assumed that the packet returned would be in hex string format; however, the printout of print(repr(message)) gives me something like this:
b'\x00\x1b\xac\x00Gd\x00\x14\xd1+\x1f\x19\x05\n\x124VxC!UUUU\x00\x00\x00\x00\xcd\xcc\xcc=\xcd\xccL>\x9a\x99\x99>\xcd\xcc\xcc>\x00\x00\x00?\x9a\x......'
which has weird non-hex characters like !UUUU or =. What encoding is this, and how do I decode the packet?
I know what the packet looks like ahead of time for now, since I'm the one generating the packets using winpcapy:
from ctypes import *
from winpcapy import *
import zlib
import binascii
import sys
import time
from ChanPackets import base, FrMessage, FrTodSync, FrChanConfig, FlChan, RlChan

while (1):
    now = time.time()
    errbuf = create_string_buffer(PCAP_ERRBUF_SIZE)
    fp = pcap_t
    deviceName = b'\\Device\\NPF_{8F5BD2E9-253F-4659-8256-B3BCD882AFBC}'
    fp = pcap_open_live(deviceName, 65536, 1, 1000, errbuf)
    if not bool(fp):
        print("\nUnable to open the adapter. %s is not supported by WinPcap\n" % deviceName)
        sys.exit(2)
    # FrMessage is a custom class that creates the packet
    test = FrMessage('00:1b:ac:00:47:64', '00:14:d1:2b:1f:19', 0x12345678, 0x4321, 0x55555555, list(i/10 for i in range(320)))
    # test.get_Raw_Packet() returns a c_bytes array needed for winpcap to send the packet
    if (pcap_sendpacket(fp, test.get_Raw_Packet(), test.packet_size) != 0):
        print("\nError sending the packet: %s\n" % pcap_geterr(fp))
        sys.exit(3)
    elapsed = time.time() - now
    if elapsed < 0.02 and elapsed > 0:
        time.sleep(0.02 - elapsed)
    pcap_close(fp)
Note: I would like to get an array of hex values representing each byte

What encoding is this, and how do I decode the packet?
What you see is the representation of a bytes object in Python. As you might have guessed, \xab represents the byte 0xab (171).
which has weird non-hex characters like !UUUU or =
Printable ASCII characters represent themselves, i.e., instead of \x55 the representation contains just U.
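For instance, a quick interactive check shows how the same bytes appear:
>>> b'\x55\x55\x55\x55'
b'UUUU'
>>> b'\x21'
b'!'
>>> list(b'\x00\x1b\xac\x00Gd')
[0, 27, 172, 0, 71, 100]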
What you have is a sequence of bytes. How to decode them depends on your application. For example, to decode a data packet that contains an Ethernet frame, you could use scapy (Python 2):
>>> b = '\x00\x02\x157\xa2D\x00\xae\xf3R\xaa\xd1\x08\x00E\x00\x00C\x00\x01\x00\x00#\x06x<\xc0\xa8\x05\x15B#\xfa\x97\x00\x14\x00P\x00\x00\x00\x00\x00\x00\x00\x00P\x02 \x00\xbb9\x00\x00GET /index.html HTTP/1.0 \n\n'
>>> c = Ether(b)
>>> c.hide_defaults()
>>> c
<Ether dst=00:02:15:37:a2:44 src=00:ae:f3:52:aa:d1 type=0x800 |
<IP ihl=5L len=67 frag=0 proto=tcp chksum=0x783c src=192.168.5.21 dst=66.35.250.151 |
<TCP dataofs=5L chksum=0xbb39 options=[] |
<Raw load='GET /index.html HTTP/1.0 \n\n' |>>>>
I would like to get an array of hex values representing each byte
You could use binascii.hexlify():
>>> pkt = b'\x00\x1b\xac\x00Gd\x00'
>>> import binascii
>>> binascii.hexlify(pkt)
b'001bac00476400'
or, if you want a list of hex strings:
>>> hexvalue = binascii.hexlify(pkt).decode()
>>> [hexvalue[i:i+2] for i in range(0, len(hexvalue), 2)]
['00', '1b', 'ac', '00', '47', '64', '00']
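On Python 3.5+ the built-in bytes.hex() method gives the same result without binascii, and from 3.8 it accepts a separator:
>>> pkt.hex()
'001bac00476400'
>>> pkt.hex(' ').split()   # Python 3.8+
['00', '1b', 'ac', '00', '47', '64', '00']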

In Python, raw packet decoding can be done with scapy layer classes such as IP(), TCP(), and UDP():
import sys
import socket
from scapy.all import *

s = socket.socket(socket.AF_INET, socket.SOCK_RAW, socket.IPPROTO_TCP)
while 1:
    packet = s.recvfrom(2000)
    packet = packet[0]
    ip = IP(packet)
    ip.show()
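Once a packet has been dissected you can also test for and index into layers; a small sketch using the ip object from the loop above (src, dst and dport are standard scapy field accessors):
if ip.haslayer(TCP):
    print("%s -> %s dport %s" % (ip[IP].src, ip[IP].dst, ip[TCP].dport))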

Related

Reading/Writing to LSM6DSOX via SPI from Raspberry Pi

I'm having trouble reading and writing to my Adafruit LSM6DSOX IMU from my Raspberry Pi 4 running Ubuntu 20.04. I need to do it via SPI since I require the bandwidth, but I can only seem to read the WHO_AM_I register successfully. Reading/writing to any other register only returns 0x00. I have verified that I can read data off the IMU from an Arduino via SPI, but on the Pi, if I try to read a register other than 0x0F (the IMU_ID) I get 0x0 as a response. Any insight or ideas about what could be causing this would be greatly appreciated!
EDIT: It turns out I can read the following registers:
0x0f : 0x6c
0x13 : 0x1c
0x33 : 0x1c
0x53 : 0x1c
0x73 : 0x1c
These seem to be random registers, however, and the value 0x1C doesn't appear to correspond to anything.
This is my main.py:
import LSM6DSOX

def main():
    imu = LSM6DSOX.LSM6DSOX()
    imu.initSPI()
    whoamI = imu.read_reg(0x0F)
    while (whoamI != imu.LSM6DSOX_ID):
        imu.ms_sleep(200)
        print('searching for IMU')
        whoamI = imu.get_id()
        print(hex(whoamI))
    print('found lsm6dsox IMU')
    imu.spi.close()
    imu.spi = None

if __name__ == "__main__":
    main()
This is an excerpt from my LSM6DSOX.py:
def initSPI(self):
    # Set up SPI communication
    self.spi = spidev.SpiDev()
    self.spi.open(0, 0)
    self.spi.mode = 0b11            # mode 3 (mode 0 is also fine)
    self.spi.max_speed_hz = 500000
    return self.spi

def read_reg(self, reg, len=1):
    # Set up message
    buf = bytearray(len + 1)
    buf[0] = 0b10000000 | reg       # MSB must be 1 to indicate a read; OR'd with the register address to read
    resp = self.spi.xfer2(buf)      # send to (and receive from) the IMU
    if len == 1:
        return resp[1]
    else:
        return resp[1:]             # return the received data

def write_reg(self, reg, data, len=1):
    # Set up message
    buf = bytearray(len + 1)
    buf[0] = 0b00000000 | reg       # MSB must be 0 to indicate a write; OR'd with the register address to write
    buf[1:] = bytes(data)
    resp = self.spi.xfer2(buf)      # send to (and receive from) the IMU
    return resp[1:]                 # return the received data
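For reference, a minimal standalone sketch of the read framing described in the comments above, assuming spidev on bus 0, device 0, and the WHO_AM_I register/ID values quoted in the question:
import spidev

spi = spidev.SpiDev()
spi.open(0, 0)                          # bus 0, chip select 0
spi.mode = 0b11                         # SPI mode 3
spi.max_speed_hz = 500000

READ = 0x80                             # MSB set marks a read transaction
resp = spi.xfer2([READ | 0x0F, 0x00])   # address byte, then one dummy byte to clock the data out
print(hex(resp[1]))                     # the LSM6DSOX should answer 0x6c
spi.close()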

Sending Email From Databricks Notebooks

I want to send email from Databricks notebooks, based on this article: https://docs.databricks.com/user-guide/faq/send-email.html
I am following the steps; however, I get an error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
I think the reason is that inside the makeCompatibleImage function we have this snippet: val = "<img src='data:image/png;base64,%s'>" % base64.standard_b64encode(png.read()), and probably something is going wrong around base64.standard_b64encode.
import numpy as np
import matplotlib.pyplot as plt

# Compute pie slices
N = 20
theta = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
radii = 10 * np.random.rand(N)
width = np.pi / 4 * np.random.rand(N)

ax = plt.subplot(111, projection='polar')
bars = ax.bar(theta, radii, width=width, bottom=0.0)

# Use custom colors and opacity
for r, bar in zip(radii, bars):
    bar.set_facecolor(plt.cm.viridis(r / 10.))
    bar.set_alpha(0.5)

# Convert image and append to html array
html.append(makeCompatibleImage(ax))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<command-890455078841631> in <module>()
16 bar.set_alpha(0.5)
17 # Convert image add append to html array
---> 18 html.append(makeCompatibleImage(ax))
<command-890455078841625> in makeCompatibleImage(image, withLabel)
11 val = None
12 with open(imageName) as png:
---> 13 val = "<img src='data:image/png;base64,%s'>" % base64.standard_b64encode(png.read())
14
15 displayHTML(val)
/databricks/python/lib/python3.6/codecs.py in decode(self, input, final)
319 # decode input (taking the buffer into account)
320 data = self.buffer + input
--> 321 (result, consumed) = self._buffer_decode(data, self.errors, final)
322 # keep undecoded input until the next call
323 self.buffer = data[consumed:]
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
I want to know how I can replicate this article.
Changing the following line in the makeCompatibleImage function, so that the file is read in binary mode, worked for me:
with open(imageName, 'rb') as png:
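For context, the relevant part of makeCompatibleImage would then look roughly like this; the .decode() call is my addition, since base64.standard_b64encode returns bytes on Python 3 and the HTML string needs plain text:
with open(imageName, 'rb') as png:      # binary mode avoids the UTF-8 decode error
    encoded = base64.standard_b64encode(png.read()).decode('ascii')
    val = "<img src='data:image/png;base64,%s'>" % encoded
displayHTML(val)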

Using pytorch cuda for RNNs on google colaboratory

I have a piece of code (code we saw in a class) for a recurrent neural network that reads a given text and tries to produce its own text similar to the example. The code is written in Python and uses the PyTorch library. I wanted to modify it to see whether I could increase its speed by using the GPU instead of the CPU, and I ran some tests on Google Colaboratory. The GPU version of the code runs fine but is about three times slower than the CPU version. I do not know the details of GPU architecture, so I cannot really understand why it is slower. I know that GPUs can do more arithmetic operations per cycle but have more limited memory, so I am curious whether I am having a memory issue. I also tried using CUDA with a generative adversarial network, and in that case it was almost ten times faster. Any tips on this would be welcome.
The code (CUDA version) is below. I am new at this stuff so sorry if some of the terminology is not correct.
The architecture is input -> encoder -> recurrent network -> decoder -> output.
import torch
import time
import numpy as np
from torch.autograd import Variable
import matplotlib.pyplot as plt
from google.colab import files

# uploading the text file on Google Colab
uploaded = files.upload()
for fn in uploaded.keys():
    print('User uploaded file "{name}" with length {length} bytes'.format(
        name=fn, length=len(uploaded[fn])))

# data preprocessing
with open('text.txt', 'r') as file:
    # with open closes the file after we are done with it
    rawtxt = file.read()
rawtxt = rawtxt.lower()

# a function that assigns a number to each unique character in the text
def create_map(rawtxt):
    letters = list(set(rawtxt))
    lettermap = dict(enumerate(letters))  # gives each letter in the list a number
    return lettermap

num_to_let = create_map(rawtxt)
# inverse of num_to_let
let_to_num = dict(zip(num_to_let.values(), num_to_let.keys()))
print(num_to_let)

# turns a text of characters into text of numbers using the mapping
# given by the input mapdict
def maparray(txt, mapdict):
    txt = list(txt)
    for k, letter in enumerate(txt):
        txt[k] = mapdict[letter]
    txt = np.array(txt)
    return txt

X = maparray(rawtxt, let_to_num)  # the data text in numeric format
Y = np.roll(X, -1, axis=0)        # shifted data text in numeric format
X = torch.LongTensor(X)
Y = torch.LongTensor(Y)
# up to here we are done with data preprocessing

# return a random batch for training
# this reads a random piece inside the data text
# with the size chunk_size
def random_chunk(chunk_size):
    k = np.random.randint(0, len(X) - chunk_size)
    return X[k:k + chunk_size], Y[k:k + chunk_size]

nchars = len(num_to_let)

# define the recurrent neural network class
class rnn(torch.nn.Module):
    def __init__(self, input_size, hidden_size, output_size, n_layers=1):
        super().__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.n_layers = n_layers
        self.encoder = torch.nn.Embedding(input_size, hidden_size)
        self.rnn = torch.nn.RNN(hidden_size, hidden_size, n_layers, batch_first=True)
        self.decoder = torch.nn.Linear(hidden_size, output_size)

    def forward(self, x, hidden):
        x = self.encoder(x.view(1, -1))
        output, hidden = self.rnn(x.view(1, 1, -1), hidden)
        output = self.decoder(output.view(1, -1))
        return output, hidden

    def init_hidden(self):
        return Variable(torch.zeros(self.n_layers, 1, self.hidden_size)).cuda()

# hyper-params
lr = 0.009
no_epochs = 50
chunk_size = 150

myrnn = rnn(nchars, 150, nchars, 1)
myrnn.cuda()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(myrnn.parameters(), lr=lr)

t0 = time.time()
for epoch in range(no_epochs):
    totcost = 0
    generated = ''
    for _ in range(len(X) // chunk_size):
        h = myrnn.init_hidden()
        cost = 0
        x, y = random_chunk(chunk_size)
        x, y = Variable(x).cuda(), Variable(y).cuda()
        for i in range(chunk_size):
            out, h = myrnn.forward(x[i], h)
            _, outl = out.data.max(1)
            letter = num_to_let[outl[0]]
            generated += letter
            cost += criterion(out, y[i])
        optimizer.zero_grad()
        cost.backward()
        optimizer.step()
        totcost += cost
    totcost /= len(X) // chunk_size
    print('Epoch', epoch, 'Avg cost/chunk: ', totcost)
    print(generated[0:750], '\n\n\n')
t1 = time.time()
total = t1 - t0
print('total', total)
# we encode each character into a vector of fixed size (via the Embedding layer)
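One measurement caveat, separate from whatever causes the slowdown itself: CUDA launches are asynchronous, so time.time() can be read before the GPU has finished its queued work. A minimal sketch of a safer way to bracket the timed region:
torch.cuda.synchronize()   # wait for queued GPU work before starting the clock
t0 = time.time()
# ... training loop ...
torch.cuda.synchronize()   # make sure the GPU has finished before stopping the clock
t1 = time.time()
print('total', t1 - t0)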

Incorrect throughput value in python

I am writing Python code to measure the throughput between a server and a client. It is based on speedtest.net functionality, where I send a dummy file to calculate the speed. The problem I am facing is unreliable throughput output. I would appreciate your suggestions. Here is the code.
server.py
import socket
import os

port = 60000
s = socket.socket()
host = socket.gethostname()
s.bind((host, port))
s.listen(5)
print 'Server listening....'
while True:
    conn, addr = s.accept()
    print 'Got connection from', addr
    data = conn.recv(1024)
    print('Server received', repr(data))
    filename = 'akki.txt'
    b = os.path.getsize(filename)
    f = open(filename, 'rb')
    l = f.read(b)
    while (l):
        conn.send(l)
        l = f.read(b)
    f.close()
    print('Done sending')
    conn.send('Thank you for connecting')
    conn.close()
Client.py
import socket
import time
import os

s = socket.socket()
host = socket.gethostname()
port = 60000
t1 = time.time()
s.connect((host, port))
s.send("Hello server!")

with open('received_file', 'wb') as f:
    print 'file opened'
    t2 = time.time()
    while True:
        data = s.recv(1024)
        if not data:
            break
        f.write(data)
    t3 = time.time()
    print data

print 'Total:', t3 - t1
print 'Throughput:', round((1024.0 * 0.001) / (t3 - t1), 3),
print 'K/sec.'
f.close()
print('Successfully received the file')
s.close()
print('connection closed')
Output when sending akki.txt
Server Output
Server listening....
Got connection from ('10.143.47.165', 60902)
('Server received', "'Hello server!'")
Done sending
Client output
file opened
Raw timers: 1503350568.11 1503350568.11 1503350568.11
Total: 0.00499987602234
**Throughput: 204.805 K/sec.**
Successfully received the file
connection closed
Output for ak.zip ( which is bigger file)
Client output
file opened
Total: 0.0499999523163
**Throughput: 20.48 K/sec.**
Successfully received the file
connection closed
Short Answer: you need to take the file size into consideration.
More Details:
Throughput is data/time. Your calculation:
round((1024.0 * 0.001) / (t3 - t1), 3)
doesn't take the file size into account. Since sending a larger file takes more time, t3 - t1 is bigger, so your computed throughput is smaller (same numerator with a larger denominator). Use the number of bytes actually transferred in the formula and you should get much more consistent results.
Hope this helps.
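For example, a sketch of the client-side change: count the bytes as they arrive and use that count in the formula (total_bytes is a name I'm introducing here):
total_bytes = 0
t1 = time.time()
while True:
    data = s.recv(1024)
    if not data:
        break
    f.write(data)
    total_bytes += len(data)        # count what actually arrived
t3 = time.time()
print('Throughput: %.3f KB/sec.' % ((total_bytes / 1024.0) / (t3 - t1)))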

Need help identifying and computing a number representation

I need help identifying the following number format.
For example, the following number format in MIB:
0x94 0x78 = 2680
0x94 0x78 in binary: [1001 0100] [0111 1000]
It seems that if the MSB is 1, another byte follows, and if it is 0, it is the last byte of the number.
So the value 2680 is [001 0100] [111 1000], which formatted properly is [0000 1010] [0111 1000].
What is this number format called, and what's a good way to compute it besides bit manipulation and shifting into a larger unsigned integer?
I have seen this called either 7bhm (7-bit has-more) or VLQ (variable length quantity); see http://en.wikipedia.org/wiki/Variable-length_quantity
This is stored big-endian (most significant byte first), as opposed to the C# BinaryReader.Read7BitEncodedInt method described at Encoding an integer in 7-bit format of C# BinaryReader.ReadString
I am not aware of any method of decoding other than bit manipulation.
Sample PHP code can be found at
http://php.net/manual/en/function.intval.php#62613
or in Python I would do something like
def encode_7bhm(i):
    o = [chr(i & 0x7f)]
    i //= 128                       # integer division, so this also works on Python 3
    while i > 0:
        o.insert(0, chr(0x80 | (i & 0x7f)))
        i //= 128
    return ''.join(o)

def decode_7bhm(s):
    o = 0
    for i in range(len(s)):
        v = ord(s[i])
        o = 128 * o + (v & 0x7f)
        if v & 0x80 == 0:
            # found end of encoded value
            break
    else:
        # ran out of string and end not found - error!
        raise TypeError
    return o
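For example, round-tripping the 0x94 0x78 value from the question:
>>> decode_7bhm('\x94\x78')
2680
>>> encode_7bhm(2680)
'\x94x'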