How to read a particular line from a log file using Logstash - elastic-stack

I have to read 3 different lines from log files based on some text and then output the fields to a CSV file.
Sample log data:
20110607 095826 [.] !! Begin test. Script filename/text.txt
20110607 095826 [.] Full path: filename/test/text.txt
20110607 095828 [.] FAILED: Test Failed()..
I have to read the file name after "!! Begin test. Script". This is my conf file:
filter {
  grok {
    match => { "message" => "%{BASE10NUM:Date}%{SPACE:pat}%{BASE10NUM:Number}%{SPACE:pat}[.]%{SPACE:pat}%{SPACE:pat}!! Begin test. Script%{SPACE:pat}%{GREEDYDATA:file}" }
    overwrite => ["message"]
  }
  if "_grokparsefailure" in [tags] {
    drop {}
  }
}
but it's not giving me a single record; it's outputting the full log file in JSON format with no parsed fields.
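A minimal sketch of a filter that keeps only the "!! Begin test. Script" lines and extracts the trailing file name could look like the following; the field names date, time and script_file are illustrative, not taken from the question:
filter {
  # Drop every line that is not a "Begin test. Script" line.
  if "!! Begin test. Script" not in [message] {
    drop {}
  }
  grok {
    # Literal dots and brackets are escaped; script_file is an illustrative field name.
    match => { "message" => "%{BASE10NUM:date}%{SPACE}%{BASE10NUM:time}%{SPACE}\[\.\]%{SPACE}!! Begin test\. Script%{SPACE}%{GREEDYDATA:script_file}" }
  }
  # Drop anything the pattern still fails to parse (the standard tag is _grokparsefailure).
  if "_grokparsefailure" in [tags] {
    drop {}
  }
}
The extracted script_file field could then be written out with the csv output plugin.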

Related

What file format is this? (Used for a Perl script status)

I am trying to find out what format the status file in Nagios uses. It has a .dat extension but is not the standard .dat (at least not the Windows .dat).
Here is an example of the format:
contactstatus {
contact_name=noc
modified_attributes=0
modified_host_attributes=0
modified_service_attributes=0
host_notification_period=24x7
service_notification_period=24x7
last_host_notification=0
last_service_notification=1545078717
host_notifications_enabled=1
service_notifications_enabled=1
}
contactstatus {
contact_name=slack
modified_attributes=0
modified_host_attributes=0
modified_service_attributes=0
host_notification_period=24x7
service_notification_period=24x7
last_host_notification=0
last_service_notification=1545078717
host_notifications_enabled=1
service_notifications_enabled=1
}
Found that it is not a standard format, but a custom Nagios one.
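For illustration, here is a rough sketch of how blocks in this shape could be parsed; the function name and file path are made up for the example, and it only handles the simple "type { key=value ... }" layout shown above:
import os
from collections import defaultdict

def parse_status_blocks(path):
    # Parse "type { key=value ... }" blocks into a dict of lists of dicts.
    blocks = defaultdict(list)
    current_type, current = None, None
    with open(path) as fh:
        for raw in fh:
            line = raw.strip()
            if line.endswith("{"):                      # start of a block, e.g. "contactstatus {"
                current_type, current = line[:-1].strip(), {}
            elif line == "}" and current is not None:   # end of the current block
                blocks[current_type].append(current)
                current_type, current = None, None
            elif "=" in line and current is not None:   # key=value pair inside a block
                key, _, value = line.partition("=")
                current[key] = value
    return blocks

# Example usage (path is hypothetical):
# status = parse_status_blocks("/var/lib/nagios/status.dat")
# print(status["contactstatus"][0]["contact_name"])   # -> "noc"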

How to execute spark scala script saved in a text file

I have written a word-count Scala script in a text file and saved it in my home directory.
How can I call and execute the script file "wordcount.txt"?
If I try the command spark-submit wordcount.txt, it does not work.
Content of the "wordcount.txt" file:
val text = sc.textFile("/data/mr/wordcount/big.txt")
val counts = text.flatMap(line => line.split(" ")).map(word => (word.toLowerCase(), 1)).reduceByKey(_ + _)
counts.sortBy(_._2, false).saveAsTextFile("count_output")
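As an aside, spark-submit expects a packaged application (a jar, or a .py/.R file), so a plain Scala snippet like this is usually fed to the interactive shell instead; a minimal sketch, assuming spark-shell is on the PATH:
# Run the statements in wordcount.txt inside the Spark REPL
spark-shell -i wordcount.txt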

Cake Task output log to file

I have a set of Tasks inside a build.cake file and I would like to capture the log output from the console into a log file. I know it's possible to use the OnError() function to output errors to a file, but I would like to output everything to a log file, not just errors.
Below is an example of the build.cake file:
#load "SomeTask.cake"
#load "SomeOtherTask.cake"
var target = Argument("target", "Default");
var someTask = Task("SomeTask")
.Does(() =>
{
SomeMethodInsideSomeTask();
});
var someOtherTask = Task("SomeOtherTask")
.Does(() =>
{
SomeOtherMethodInsideSomeOtherTask();
});
Task("Default")
.IsDependentOn(someTask)
.IsDependentOn(someOtherTask);
RunTarget(target);
N.B. The Tasks are not running any sort of MSBuild commands, so it's not possible to use MSBuildFileLogger.
How about piping stdout to a file, i.e.:
./build.ps1 > log.txt
Have you heard about tee?
It reads standard input and writes it to both standard output and one or more files.
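A minimal sketch of the tee idea, assuming the standard build.ps1 / build.sh Cake bootstrappers; log.txt is an arbitrary file name:
# PowerShell: show the build output on the console and also write it to log.txt
./build.ps1 | Tee-Object -FilePath log.txt

# Bash equivalent, using the shell bootstrapper
./build.sh | tee log.txt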

Annotating a corpus using Syntaxnet

I am trying to annotate a corpus using Syntaxnet. I added the following lines at the end of the /models/syntaxnet/syntaxnet/models/parsey_mcparseface/context.pbtxt file:
input {
  name: 'input_file'
  record_format: 'english-text'
  Part {
    file_pattern: '/home/melvyn/text.txt'
  }
}
output {
  name: 'output_file'
  record_format: 'english-text'
  Part {
    file_pattern: '/home/melvyn/text-tagged.txt'
  }
}
When I run the command:
./demo.sh --input=input_file --output=output_file
I am getting:
./demo.sh: line 31: bazel-bin/syntaxnet/parser_eval: No such file or directory
./demo.sh: line 43: bazel-bin/syntaxnet/parser_eval: No such file or directory
./demo.sh: line 55: bazel-bin/syntaxnet/conll2tree: No such file or directory
According to the answer given here, I changed my demo.sh file and now I get some errors which say:
[libprotobuf ERROR external/tf/google/protobuf/src/google/protobuf/text_format.cc:291] Error parsing text-format syntaxnet.TaskSpec: 200:8: Message type "syntaxnet.TaskOutput" has no field named "Part".
E external/tf/tensorflow/core/framework/op_segment.cc:53] Create kernel failed: Invalid argument: Could not parse task context at syntaxnet/models/parsey_mcparseface/context.pbtxt
E external/tf/tensorflow/core/common_runtime/executor.cc:333] Executor failed to create kernel. Invalid argument: Could not parse task context at syntaxnet/models/parsey_mcparseface/context.pbtxt
[[Node: DocumentSource = DocumentSource[batch_size=32, corpus_name="stdin-conll", task_context="syntaxnet/models/parsey_mcparseface/context.pbtxt", _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
What could be a possible solution?
Though it's not certain, I think you are not running the shell script from the root directory. Please try running it as per the instructions mentioned here.
I hope it helps.
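Regarding the protobuf parse error itself ("no field named "Part""): field names in context.pbtxt are case-sensitive, and the error message suggests the repeated field on TaskInput/TaskOutput is lowercase. A sketch of the input entry under that assumption (the output entry would change the same way; paths as in the question):
input {
  name: 'input_file'
  record_format: 'english-text'
  part {
    file_pattern: '/home/melvyn/text.txt'
  }
}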

logstash (2.3.2) gzip codec does not work

I'm using Logstash (2.3.2) to read a gz file using the gzip_lines codec.
The example log file (sample.log) is:
127.0.0.2 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
The command I used to append to a gz file is:
cat sample.log | gzip -c >> s.gz
The logstash.conf is
input {
  file {
    path => "./logstash-2.3.2/bin/s.gz"
    codec => gzip_lines { charset => "ISO-8859-1" }
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
    #match => { "message" => "message: %{GREEDYDATA}" }
  }
  #date {
  #  match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  #}
}
output {
  stdout { codec => rubydebug }
}
I have installed the gzip_lines plugin with bin/logstash-plugin install logstash-codec-gzip_lines
and start Logstash with ./logstash -f logstash.conf.
When I feed s.gz with
cat sample.log | gzip -c >> s.gz
I expect the console to print the data, but nothing is printed out.
I have tried it on Mac and Ubuntu and get the same result.
Is anything wrong with my code?
I checked the code for gzip_lines and it seemed obvious to me that this plugin is not working, at least for version 2.3.2. Maybe it is outdated, because it does not implement the methods specified here:
https://www.elastic.co/guide/en/logstash/2.3/_how_to_write_a_logstash_codec_plugin.html
So the current internal working is like this:
The file input plugin reads the file line by line and sends each line to the codec.
The gzip_lines codec tries to create a new GzipReader object with GzipReader.new(io).
It then goes through the reader line by line to create events.
Because you specify a gzip file, the file input plugin tries to read the gzip file as a regular file and sends its lines to the codec. The codec tries to create a GzipReader from that string and it fails.
You can modify it to work like this:
Create a file that contains a list of gzip files:
-- list.txt
/path/to/gzip/s.gz
Give it to the file input plugin:
file {
  path => "/path/to/list/list.txt"
  codec => gzip_lines { charset => "ISO-8859-1" }
}
The changes are:
Open the vendor/bundle/jruby/1.9/gems/logstash-codec-gzip_lines-2.0.4/lib/logstash/codecs/gzip_lines.rb file and add a register method:
public
def register
  @converter = LogStash::Util::Charset.new(@charset)
  @converter.logger = @logger
end
And in the decode method, change:
@decoder = Zlib::GzipReader.new(data)
to
@decoder = Zlib::GzipReader.open(data)
The disadvantage of this approach is that it won't tail your gzip file, only the list file, so you will need to create a new gzip file and append its path to the list.
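Putting the two edits together, the relevant part of gzip_lines.rb would look roughly like this. Treat it as a sketch: the body of decode is paraphrased (the real method also wraps the reader in error handling), and only the lines touched by the change are meant to match the plugin source.
public
def register
  @converter = LogStash::Util::Charset.new(@charset)
  @converter.logger = @logger
end

public
def decode(data)
  # open() treats data as a path (a line read from list.txt),
  # whereas new() expects an already-open IO object.
  @decoder = Zlib::GzipReader.open(data)
  @decoder.each_line do |line|
    yield LogStash::Event.new("message" => @converter.convert(line))
  end
end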
I had a variant of this problem where I needed to decode bytes in a file to an intermediate string, to prepare it for a process input that only accepts strings.
The fact that encoding/decoding issues were ignored in Python 2 is actually really bad IMHO; you may end up with various corrupt-data problems, especially if you need to re-encode the string back into data.
Using ISO-8859-1 works for both gz and text files alike, while UTF-8 only worked for text files. I haven't tried it for PNGs yet.
Here's an example of what worked for me:
import codecs
import os

data = os.read(src, bytes_needed)            # src is an already-open file descriptor
chunk += codecs.decode(data, 'ISO-8859-1')   # raw bytes -> str, one code point per byte
# do the needful with the chunk....
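A self-contained sketch of why ISO-8859-1 behaves this way for arbitrary binary data such as a gz file: every byte value 0-255 maps to exactly one code point, so decoding and re-encoding is lossless. The file name reuses s.gz from the question above:
import codecs
import os

# Read a chunk of raw bytes from the gzip file created earlier (s.gz).
fd = os.open("s.gz", os.O_RDONLY)
data = os.read(fd, 4096)
os.close(fd)

text = codecs.decode(data, "ISO-8859-1")       # bytes -> str, never fails
restored = codecs.encode(text, "ISO-8859-1")   # str -> bytes, byte-for-byte identical
assert restored == data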