tshark export FIX messages - pcap

The Objective
I'm trying to achieve the following:
capture network traffic containing a conversation in the FIX protocol
extract the individual FIX messages from the network traffic into a "nice" format, e.g. CSV
do some data analysis on the exported "nice" format data
I have achieved this by:
using pcap to capture the network traffic
using tshark to print the relevant data as a CSV
using Python (pandas) to analyse the data
The Problem
The problem is that some of the captured TCP packets contain more than one FIX message, which means that when I do the export to CSV using tshark I don't get a FIX message per line. This makes consuming the CSV difficult.
This is the tshark command line I'm using to extract the relevant FIX fields as CSV:
tshark -r dump.pcap \
-R '(fix.MsgType[0]=="G" or fix.MsgType[0]=="D" or fix.MsgType[0]=="8" or fix.MsgType[0]=="F") and fix.ClOrdID != "0"' \
-Tfields -Eseparator=, -Eoccurrence=l -e frame.time_relative \
-e fix.MsgType -e fix.SenderCompID \
-e fix.SenderSubID -e fix.Symbol -e fix.Side \
-e fix.Price -e fix.OrderQty -e fix.ClOrdID \
-e fix.OrderID -e fix.OrdStatus
Note that I'm currently using "-Eoccurrence=l" to get just the last occurrence of a named field in the case where there is more than one occurrence of a field in the packet. This is not an acceptable solution as information will get thrown away when there are multiple FIX messages in a packet.
This is what I expect to see per line in the exported CSV file (fields from one FIX message):
16.508949000,D,XXX,XXX,YTZ2,2,97480,34,646427,,
This is what I see when there is more than one FIX message (three in this case) in a TCP packet and the command-line flag "-Eoccurrence=a" is used:
16.515886000,F,F,G,XXX,XXX,XXX,XXX,XXX,XXX,XTZ2,2,97015,22,646429,646430,646431,323180,323175,301151,
The Question
Is there a way (not necessarily using tshark) to extract each individual, protocol specific message from a pcap file?

Better Solution
Using tcpflow allows this to be done properly without leaving the commandline.
My current approach is to use something like:
tshark -nr <input_file> -Y'fix' -w- | tcpdump -r- -l -w- | tcpflow -r- -C -B
tcpflow ensures that the TCP stream is followed, so no FIX messages are missed (in the case where a single TCP packet contains more than 1 FIX message). -C writes to the console and -B ensures binary output. This approach is not unlike following a TCP stream in Wireshark.
The FIX delimiters are preserved which means that I can do some handy grepping on the output, e.g.
... | tcpflow -r- -C -B | grep -P "\x0135=8\x01"
to extract all the execution reports. Note the -P argument to grep, which enables the very powerful Perl-compatible regular expressions.
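The same per-message splitting can be sketched in Python for the CSV export. This is a minimal sketch, assuming the reassembled stream bytes (for example, the saved output of the tcpflow pipeline) are already available. The function names and the column selection are illustrative and not part of any library; the tag numbers are the standard FIX ones for the fields used in the tshark command in the question.

```python
import csv
import io
import re

SOH = b"\x01"
# A FIX message starts with tag 8 (BeginString) and ends with tag 10 (CheckSum).
FIX_MESSAGE = re.compile(rb"8=FIX[^\x01]*\x01.*?\x0110=\d{3}\x01", re.DOTALL)

# Tags exported as CSV columns (illustrative selection).
COLUMNS = {35: "MsgType", 49: "SenderCompID", 50: "SenderSubID", 55: "Symbol",
           54: "Side", 44: "Price", 38: "OrderQty", 11: "ClOrdID",
           37: "OrderID", 39: "OrdStatus"}


def split_fix_messages(stream: bytes):
    """Yield each complete FIX message found in a reassembled byte stream."""
    for match in FIX_MESSAGE.finditer(stream):
        yield match.group(0)


def parse_fix(message: bytes) -> dict:
    """Return {tag number: value} for one message (last occurrence wins)."""
    fields = {}
    for pair in message.split(SOH):
        tag, sep, value = pair.partition(b"=")
        if sep and tag.isdigit():
            fields[int(tag)] = value.decode("latin-1")
    return fields


def fix_stream_to_csv(stream: bytes) -> str:
    """Render one CSV row per FIX message, with a header row first."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS.values())
    for message in split_fix_messages(stream):
        fields = parse_fix(message)
        writer.writerow([fields.get(tag, "") for tag in COLUMNS])
    return out.getvalue()
```

Feeding it the bytes read from a file or from `sys.stdin.buffer` gives one row per FIX message regardless of how the messages were packed into TCP segments. Frame timestamps are not part of the stream, so they are not included here.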
A (Previous) Solution
I'm using Scapy (see also Scapy Documentation, The Very Unofficial Dummies Guide to Scapy) to read in a pcap file and extract each individual FIX message from the packets.
Below is the basis of the code I'm using:
import re
from scapy.all import rdpcap

def ExtractFIX(pcap):
    """A generator that iterates over the packets in a scapy pcap iterable
    and extracts the FIX messages.
    In the case where there are multiple FIX messages in one packet, yield each
    FIX message individually."""
    for packet in pcap:
        # Only consider TCP packets which contain raw data.
        if not packet.haslayer('Raw'):
            continue
        # The payload is bytes; latin-1 maps every byte to one character.
        load = packet.getlayer('Raw').load.decode('latin-1')
        # Ignore raw data that doesn't contain FIX.
        if 'FIX' not in load:
            continue
        # Replace \x01 with '|'.
        load = re.sub(r'\x01', '|', load)
        # Split out each individual FIX message in the packet by putting a
        # ';' between them and then using split(';').
        for subMessage in re.sub(r'\|8=FIX', '|;8=FIX', load).split(';'):
            # Yield each sub message. More often than not, there will only be one.
            assert subMessage[-1:] == '|'
            yield subMessage

pcap = rdpcap('dump.pcap')
for fixMessage in ExtractFIX(pcap):
    print(fixMessage)
I would still like to be able to get other information from the "frame" layer of the network packet, in particular the relative (or reference) time. Unfortunately, this doesn't seem to be available from the Scapy packet object: its topmost layer is the Ether layer, as shown below.
In [229]: pcap[0]
Out[229]: <Ether dst=00:0f:53:08:14:81 src=24:b6:fd:cd:d5:f7 type=0x800 |<IP version=4L ihl=5L tos=0x0 len=215 id=16214 flags=DF frag=0L ttl=128 proto=tcp chksum=0xa53d src=10.129.0.25 dst=10.129.0.115 options=[] |<TCP sport=2634 dport=54611 seq=3296969378 ack=2383325407 dataofs=8L reserved=0L flags=PA window=65319 chksum=0x4b73 urgptr=0 options=[('NOP', None), ('NOP', None), ('Timestamp', (581177, 2013197542))] |<Raw load='8=FIX.4.0\x019=0139\x0135=U\x0149=XXX\x0134=110169\x015006=20\x0150=XXX\x0143=N\x0152=20121210-00:12:13\x01122=20121210-00:12:13\x015001=6\x01100=SFE\x0155=AP\x015009=F3\x015022=45810\x015023=3\x015057=2\x0110=232\x01' |>>>>
In [245]: pcap[0].summary()
Out[245]: 'Ether / IP / TCP 10.129.0.25:2634 > 10.129.0.115:54611 PA / Raw'
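Although there is no "frame" layer, Scapy packets do expose the capture timestamp as a `time` attribute (epoch seconds), so a relative time can be derived by subtracting the first packet's timestamp. A minimal sketch, using stand-in objects instead of a real capture so that it runs without Scapy; `with_relative_time` is a hypothetical helper name:

```python
from types import SimpleNamespace


def with_relative_time(packets):
    """Yield (seconds since the first packet, packet) for each packet.

    Assumes every packet has a numeric `time` attribute holding the
    capture timestamp in epoch seconds, as Scapy packets do.
    """
    first = None
    for packet in packets:
        if first is None:
            first = packet.time
        yield float(packet.time - first), packet


# Stand-in packets (assumption: only the `time` attribute matters here).
sample = [SimpleNamespace(time=1355098333.25),
          SimpleNamespace(time=1355098349.75)]
relative = [seconds for seconds, _ in with_relative_time(sample)]
```

With a real capture, the list returned by `rdpcap('dump.pcap')` could be passed in place of `sample`.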

Related

wireshark capture sip traffic and save an XML file with a specific format

I want to be able to capture SIP traffic and save the trace as an XML file with a specific format. Is there any way to do this with just Wireshark / tshark commands? FYI, this will run on a CentOS server.
The only way I have found to create the specific format is by running a Perl script to format the XML file, but it would be much better if I could do it all with just Wireshark.
Thanks
Tshark has options to convert a PCAP file to text. Alternatively, you could export it in PDML or PSML format and then write a converter routine to produce your own XML format. And since Wireshark is modifiable, you could tweak a plugin to do this differently. But I am more inclined to build a converter process that parses the native pcap into whatever format you like. That way your raw data retains all the information and you can freely keep changing the converter routine.
To convert an existing capture file to PDML:
tshark -V -r "17c4d2c0-69cd-11e4-ae3e-9d5dee8b7eac.pcap" -T pdml > capture.xml
To capture live traffic as PDML / PSML:
tshark -T pdml
The same -T option also accepts psml, so a PCAP can be converted directly to PSML in the same way.
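The converter routine mentioned above could look roughly like the following. This is a minimal sketch, assuming the PDML produced by `tshark -T pdml` (a `pdml` root containing `packet` elements with nested `field` elements carrying `name` and `show` attributes). The output element names (`sip_trace`, `message`, and so on) are made up for illustration; the real target format would replace them.

```python
import xml.etree.ElementTree as ET

# Wireshark field names to copy, mapped to hypothetical output element names.
WANTED = {
    "sip.Method": "method",
    "sip.Call-ID": "call_id",
    "sip.Status-Code": "status",
}


def pdml_to_custom_xml(pdml_text: str) -> str:
    """Build a small custom XML document from the SIP fields in a PDML dump."""
    root = ET.fromstring(pdml_text)
    trace = ET.Element("sip_trace")
    for packet in root.iter("packet"):
        message = ET.Element("message")
        for field in packet.iter("field"):
            name = WANTED.get(field.get("name"))
            if name is not None:
                ET.SubElement(message, name).text = field.get("show", "")
        # Skip packets that contained none of the wanted fields.
        if len(message):
            trace.append(message)
    return ET.tostring(trace, encoding="unicode")


# Tiny hand-written PDML fragment standing in for real tshark output.
SAMPLE = (
    '<pdml><packet><proto name="sip">'
    '<field name="sip.Method" show="INVITE"/>'
    '<field name="sip.Call-ID" show="abc@host"/>'
    '</proto></packet></pdml>'
)
converted = pdml_to_custom_xml(SAMPLE)
```

Keeping the raw pcap and regenerating the XML this way means the output format can change without recapturing anything.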

tshark stopping criteria

I need to stop tshark (the command-line equivalent of Wireshark) after a certain condition is met.
From the tshark man pages, I found that stopping conditions can be applied with respect to duration, files, file size and multiple files mode.
Is there any stopping condition I can apply through a capture filter so that tshark stops capturing?
ex: Upon receiving a TCP SYN packet from a particular port number (condition applied in capture filter), tshark stops capturing.
Please answer this riddle.
You can pipe the output to head and pick the first frame that matches your query, but you also need to disable output buffering (stdbuf is part of coreutils).
E.g. (Linux):
stdbuf -i0 -o0 -e0 tshark -r file.pcap -Y 'sctp.verification_tag == 0x2552' | head -1
Mac:
gstdbuf -i0 -o0 -e0 tshark -r file.pcap -Y 'tcp.flags.syn == 1 && tcp.port == 80' | head -1
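The same stop-at-first-match idea can be sketched in Python with `subprocess`: read the command's output line by line and terminate it as soon as a line matches. This is a minimal sketch; `first_matching_line` is a hypothetical helper, and when used with tshark the command would need tshark's `-l` flag (flush after each packet) so that lines arrive promptly.

```python
import subprocess
import sys


def first_matching_line(command, predicate):
    """Run command and return the first stdout line accepted by predicate.

    The process is terminated as soon as a match is seen. Returns None if
    the command finishes without producing a matching line.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True, bufsize=1)
    try:
        for line in proc.stdout:
            if predicate(line):
                return line.rstrip("\n")
        return None
    finally:
        # Stop the producer (a no-op if it already exited) and clean up.
        proc.terminate()
        proc.stdout.close()
        proc.wait()
```

For example, `first_matching_line(["tshark", "-l", "-r", "file.pcap", "-Y", "tcp.flags.syn == 1 && tcp.port == 80"], lambda line: True)` would return the first packet summary that passes the display filter and then stop tshark.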
When Wireshark 4.0.0 was released about 1 month ago, they changed how "-a" behaves in comparison to how "-c" behaves, and now "-a packets:1" does exactly what you want (5 years after your original question 😂).
From their documentation:
-a|--autostop
Specify a criterion that specifies when TShark is to stop writing to a capture file. The criterion is of the form test:value, where test is one of:
- *duration:value* ...
- *files:value* ...
- *filesize:value* ...
- *packets:value* switch to the next file after it contains value packets. **This does not include any packets that do not pass the display filter**, so it may differ from -c.
Although this fix was made ~8 months ago (see their commit), it seems that they intended it for the 4.0 branch only, since none of the 3.6 branch releases have received this fix (including version 3.6.10, which is still being developed).

How can I view output of tshark -V via Wireshark or similar?

I recently updated Wireshark on a server and lost the ability to use -R and -w together from the CLI. Since I'm tracing SIP and RTP calls, I need to use -R and not -f.
I found out that using -V is very useful (it shows the packet tree on screen), and I can then redirect the output to a file. Unfortunately, I'm not able to open that file in Wireshark to view it properly (it contains too much text to scroll through easily).
I tried using -x to add the hex dump (and removed -V), but the resulting text file still cannot be opened in Wireshark after copying it to my PC.
Any ideas how I can trace using -R (with or without -V), copy the file to my PC and still be able to read it in Wireshark? I don't mind converting the file to a readable format. I just need some way to view the files and share them :)
Thanks all,
//M

Pcap capture merge problem

I have two pcap files
$ capinfos cap1_stego0.pcap
File name: cap1_stego0.pcap
File type: Wireshark/tcpdump/... - libpcap
File encapsulation: Raw IP
Number of packets: 713
and
$ capinfos cap1_wlan0.pcap
File name: cap1_wlan0.pcap
File type: Wireshark/tcpdump/... - libpcap
File encapsulation: Ethernet
I want to merge them, but the encapsulation is different. If I use
mergecap -v -w asd.pcap cap1_stego0.pcap cap1_wlan0.pcap -T rawip
or
mergecap -v -w asd.pcap cap1_wlan0.pcap cap1_stego0.pcap -T rawip
Wireshark doesn't recognize the second file passed and shows the packets of cap1_wlan0.pcap or of cap1_stego0.pcap, respectively, as raw packet data. Using "tcpslice" to remove the Ethernet layer of cap1_wlan0.pcap (so that both files have rawip encapsulation) also gives me unrecognized packet data.
What can I do? Is there a way to merge pcap files with different encapsulations, or to convert eth->rawip or rawip->eth? Thank you.
One way to convert a RAW_IP file to an ethernet encapsulated file (which can then be merged with other ethernet-encapsulated files):
Use tshark to get a hex dump of the packets from the RAW_IP file:
tshark -nxr pcap-file-name | grep -vP "^ +\d" > foo.txt
(grep is used to remove the "summary" lines from the tshark output.)
Use text2pcap to convert back to a pcap file while adding dummy ethernet headers:
text2pcap -e 0x0800 foo.txt foo.pcap
If you want to keep the timestamps, you'll have to play around a bit with the tshark output to get a text file which contains the timestamps in a format which text2pcap will accept and also contains the hex packet info.
[[
Does tcpslice have an option to remove ethernet headers?
(Looking at the man page, it appears that tcpslice is used to extract time ranges from a pcap file.)
If you do have a way to remove ethernet headers from a capture file, you must make sure the resulting pcap file has an encapsulation type of RAW_IP before trying to read it with wireshark, mergecap, etc.
Also note that the -T switch to mergecap just forces the encapsulation type specified in the file; the actual encapsulation isn't altered (i.e., no bytes are added/changed/deleted).
]]
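As an alternative to the tshark/text2pcap round trip, the RAW_IP to ethernet conversion can be sketched directly on the pcap bytes, which keeps the original timestamps. This is a minimal sketch, assuming the classic pcap file format (not pcapng) and link type 101 (LINKTYPE_RAW) in the input. The function name and the dummy MAC addresses are made up for illustration.

```python
import struct

LINKTYPE_ETHERNET = 1
LINKTYPE_RAW = 101  # assumption: the input file declares raw IP this way
# Dummy addresses (hypothetical); any 6-byte values would do.
DUMMY_DST = bytes.fromhex("020000000001")
DUMMY_SRC = bytes.fromhex("020000000002")


def rawip_to_ethernet(data: bytes) -> bytes:
    """Return pcap bytes in which every raw IP packet gets a dummy Ethernet header."""
    magic = data[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # little-endian file (microsecond or nanosecond stamps)
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"
    else:
        raise ValueError("not a classic pcap file")
    # Global header after the magic: version, thiszone, sigfigs, snaplen, linktype.
    major, minor, zone, sigfigs, snaplen, linktype = struct.unpack(
        endian + "HHiIII", data[4:24])
    if linktype != LINKTYPE_RAW:
        raise ValueError("input is not raw IP encapsulated")
    out = [magic + struct.pack(endian + "HHiIII", major, minor, zone, sigfigs,
                               min(snaplen + 14, 0xFFFFFFFF), LINKTYPE_ETHERNET)]
    offset = 24
    while offset + 16 <= len(data):
        # Record header: timestamp seconds, fraction, captured and original length.
        ts_sec, ts_frac, incl_len, orig_len = struct.unpack(
            endian + "IIII", data[offset:offset + 16])
        packet = data[offset + 16:offset + 16 + incl_len]
        offset += 16 + incl_len
        # Pick the EtherType from the IP version nibble (IPv4 unless IPv6).
        is_ipv6 = bool(packet) and packet[0] >> 4 == 6
        ethertype = 0x86DD if is_ipv6 else 0x0800
        header = DUMMY_DST + DUMMY_SRC + struct.pack(">H", ethertype)
        out.append(struct.pack(endian + "IIII", ts_sec, ts_frac,
                               incl_len + 14, orig_len + 14))
        out.append(header + packet)
    return b"".join(out)
```

The result can be written to a new file and merged with other ethernet-encapsulated captures. Unlike the -T switch to mergecap, this actually adds the 14 header bytes to every packet.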
To merge pcap files, try an alternative utility: tcpmerge
sample merge command:
./tcpmerge asd.pcap cap1_wlan0.pcap cap1_stego0.pcap OUTFILEMERGED.pcap

How to read multiple pcap files >2GB?

I am trying to parse large pcap files with libpcap, but there is a file size limitation, so my files are split at 2 GB. I have 10 files of 2 GB each and I want to parse them in one go. Is it possible to feed this data to an interface sequentially (each file separately) so that libpcap can parse them all in the same run?
I am not aware of any tools that will allow you to replay more than one file at a time.
However, if you have the disk space, you can use mergecap to merge the ten files into a single file and then replay that.
Mergecap supports merging the packets either:
in chronological order of each packet's timestamp across the files (the default), or
ignoring the timestamps (the -a option) and performing what amounts to a packet version of 'cat': write the contents of the first file to the output, then the next input file, then the next.
Mergecap is part of the Wireshark distribution.
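The difference between the two merge modes can be illustrated with a small sketch. It does not read pcap files; each capture is represented by a list of (timestamp, payload) tuples, which is an assumption made purely for illustration, and the function names are hypothetical.

```python
import heapq
import itertools


def merge_chronological(*captures):
    """Interleave records by timestamp; each capture must already be time-sorted."""
    return list(heapq.merge(*captures, key=lambda record: record[0]))


def merge_concatenate(*captures):
    """Ignore timestamps: all records of the first capture, then the next, and so on."""
    return list(itertools.chain(*captures))


# Stand-in captures: (timestamp in seconds, payload label).
first = [(1.0, "a1"), (3.0, "a2")]
second = [(2.0, "b1"), (4.0, "b2")]

by_time = merge_chronological(first, second)
by_file = merge_concatenate(first, second)
```

For files that were split sequentially from one long capture, both modes give the same order, so the concatenating mode is enough.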
I had multiple 2 GB pcap files. I used the following one-liner to go through each pcap file sequentially, applying a filter. This worked without merging the pcap files (which avoided using more disk space and CPU):
for i in /mnt/tmp1/tmp1-pcap-ens1f1-tcpdump* ; do tcpdump -nn -r "$i" host 8.8.8.8 and tcp ; done
**Explanation:**
for loop
/mnt/tmp1/tmp1-pcap-ens1f1-tcpdump* # path to files with * for wildcard
do tcpdump -nn -r "$i" host 8.8.8.8 and tcp # tcpdump without resolving IP addresses or port numbers, reading each file in sequence
done #
Note: Please remember to adjust the file path and filter expression according to your needs.