Extract over 250 fields from a pcap using tshark

I have captured wireless traffic using Wireshark and the captured pcap file is approximately 500MB. I'd like to extract more than 250 fields from that capture file. How can I do that with tshark?

You can use one of the following tshark commands to extract all fields from your capture file:
tshark -r input.pcap -T pdml
tshark -r input.pcap -T json
How many fields you get in the output depends on the structure of your packets: several encapsulation layers can produce a large number of fields, while a packet with no recognized application layer produces very few.
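As a sketch of both options (the specific field names below are illustrative, not part of the answer; list all available names with tshark -G fields):

# Dump every dissected field for every packet as JSON
tshark -r input.pcap -T json > fields.json

# Or extract a fixed set of named fields as tab-separated values
tshark -r input.pcap -T fields \
    -e frame.number -e ip.src -e ip.dst -e tcp.port \
    -E header=y -E separator=/t > fields.tsv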

Related

Extract packets from a large pcap file to create a small pcap file

I have a large pcap file (30G).
I also have a CSV file that contains a large number of flow IDs (more than 50000) in the format of
"source_address-destination_address-source_port-destination_port-protocol".
I want to extract packets from the pcap file according to the flow IDs in the CSV file, and create another pcap file.
I know Wireshark can filter packets based on IP address or port, but it can only filter one flow at a time.
I have to filter a large number of flows (more than 50000) according to the 5-tuples in the CSV file, which Wireshark apparently cannot do.
Is there an efficient way to filter the packets according to the flow ID stored in a CSV file and create a new pcap file?
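One possible starting point, sketched under the assumption that each CSV line looks like 10.0.0.1-10.0.0.2-1234-80-tcp (protocol spelled out as tcp or udp, file names made up): generate a tshark display filter from a chunk of the CSV and write the matching packets to a new capture. A single filter built from all 50000 flows would be unmanageably slow, so in practice you would split the CSV into chunks or move to a programmatic library for the full job.

# Build a display filter from the first few flows (sketch only)
FILTER=$(head -n 20 flows.csv | awk -F- '{
    printf "%s(ip.addr == %s && ip.addr == %s && %s.port == %s && %s.port == %s)",
           (NR > 1 ? " || " : ""), $1, $2, $5, $3, $5, $4
}')

# Write the matching packets to a new, smaller capture
tshark -r large.pcap -Y "$FILTER" -w subset.pcap

Note that ip.addr and tcp.port match either direction, so this sketch is direction-insensitive and can pass a few extra packets; tighten it with ip.src/ip.dst and distinct port fields if exact 5-tuples matter.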

How to transform Kafka logs stored on the file system into CSV

I have some logs that were generated with Kafka and are currently stored on my computer in .log format, looking like this:
I would like to convert those files into CSV records, with messages and timestamps.
I know the question might seem too vague or unclear, sorry, but I'm really looking for a starting point to achieve this.
Note: this is linked to the ISOBlue project and datasets here.
Those files are encrypted.
Wouldn't it be easier to just write a consumer for those topics and write out a CSV file?
You're looking for the DumpLogSegments command. However, it does not output CSV, so you'd have to parse its output yourself:
https://cwiki.apache.org/confluence/display/KAFKA/System+Tools#SystemTools-DumpLogSegment
Dump Log Segment
This can print the messages directly from the log files or just verify
that the indexes are correct for the logs
bin/kafka-run-class.sh kafka.tools.DumpLogSegments
required argument "[files]"
Option Description
------ -----------
--deep-iteration if set, uses deep instead of shallow iteration
--files <file1, file2, ...> REQUIRED: The comma separated list of data and index log files to be dumped
--max-message-size <Integer: size> Size of largest message. (default: 5242880)
--print-data-log if set, printing the messages content when dumping data logs
--verify-index-only if set, just verify the index log without printing its content
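As a rough sketch of the two steps (the segment path is made up, and the exact dump format varies across Kafka versions, so the labels matched below are assumptions to adapt):

# Dump the message payloads from one log segment
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
    --deep-iteration --print-data-log \
    --files /var/lib/kafka/my-topic-0/00000000000000000000.log > dump.txt

# Lines in dump.txt look roughly like
#   offset: 42 ... CreateTime: 1500000000000 ... payload: hello
# Pull the timestamp and payload out into CSV
grep 'payload:' dump.txt \
    | sed -E 's/.*CreateTime: ([0-9]+).*payload: (.*)/\1,"\2"/' > messages.csv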

Nfcapd to pcap conversion?

I've got a few NetFlow dumps captured by the nfcapd daemon. Is there any way to convert them to .pcap format so I can analyse them with my software?
Basically no; most of the information from the packets is lost, including the entire payloads. NetFlow summarizes the header information from all the packets in a given session: it could be a dozen or thousands. The NetFlow dumps do not (to my recollection) include partial updates either. So, you can go one way (convert from pcap to NetFlow) but not the other way.
That said, if all you need for your analysis are the IP headers of the first packets, you might be able to fake something. But I don't know of any tool that does it.
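If your software can be pointed at flow records instead of packets, nfdump can export the nfcapd files as text or CSV; a one-line sketch with an illustrative file name:

# Print the flow records from one capture file as CSV
nfdump -r nfcapd.202301011200 -o csv > flows.csv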

How to encode multiple videos in parallel (Debian)

I'd like to encode some video files to either MP4 or x264 format on Debian Linux.
It is very important that I encode multiple files in parallel.
E.g. I want to encode two videos in parallel on a dual-core machine and put the other videos in a queue. When a video is finished, I want the free core to encode the next video in the queue. Also, even if this would work with x264, I don't know about MP4.
What is the best approach here?
x264 supports parallel encoding, but I don't know whether that means parallel encoding of multiple files or parallel encoding of different versions of a single video.
Is there a way I can assign one encoding process to core1 and another to core2?
Sincerely,
wolfen
Do you really need to encode multiple videos in parallel (are they racing?), or just not leave extra processor cores idle?
In either case, FFmpeg should work for your needs.
By default FFmpeg will use all available CPUs for any processing, allowing faster processing of single videos. However, you can also explicitly specify the number of cores to use via the -threads parameter, e.g., ffmpeg -i input.mov -threads 1 output.mov will only use one core.
It doesn't have any built-in queueing, though; you'll still have to script that part yourself.
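A minimal queue sketch for that part, assuming the inputs are .mov files without spaces in their names (the file pattern, codec, and output naming are all assumptions):

# Run at most two ffmpeg processes at a time; as soon as one finishes,
# xargs starts the next file in the list, which gives the queue behaviour
ls *.mov | xargs -P 2 -I {} \
    ffmpeg -i {} -c:v libx264 -threads 1 {}.mp4

Pinning -threads 1 keeps each job on roughly one core so the two jobs don't fight over the whole CPU; drop it to let each ffmpeg use whatever the scheduler gives it.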

How to use a binary executable that takes filenames as arguments in Hadoop streaming?

Say I have a binary executable which takes filenames as arguments, like 'myprog file1 file2', where it reads from file1 and writes to file2. The binary executable does not read from stdin and does not write to stdout. How can I use this binary executable as a mapper or reducer in Hadoop streaming? Thanks!
You would first have to save your data to a temporary file on local disk, run your program against it, and then read the results back from the output file.
However, this defeats the purpose of using Hadoop to process your data. The overhead of copying data to local disk and reading the results back into Hadoop-land would kill performance.
I would recommend changing your binary executable to allow I/O via stdin and stdout.
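If you do take the temp-file route anyway, the mapper can be wrapped in a small shell script. This is a sketch: myprog and its file1/file2 calling convention come from the question, while the wrapper name and streaming-jar path are assumptions.

#!/bin/sh
# wrapper.sh: buffer the mapper's stdin into a local file, run the binary,
# then emit its output file back onto stdout for Hadoop to collect
IN=$(mktemp)
OUT=$(mktemp)
cat > "$IN"
./myprog "$IN" "$OUT"
cat "$OUT"
rm -f "$IN" "$OUT"

Ship both files with the job and point the mapper at the wrapper, e.g.:

hadoop jar hadoop-streaming.jar \
    -files wrapper.sh,myprog \
    -mapper wrapper.sh \
    -input /user/me/input -output /user/me/output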