I am looking for tips on how to pinpoint such errors as:
NoSuchElementException: Source was empty
while using Project Reactor. This indicates that the Mono/Flux did not emit any result, but how am I supposed to find out the reason for that?
I currently use Hooks.onOperatorDebug(), which is very helpful, but I am looking for other ways of spotting the cause of such errors.
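For context, this is roughly how I enable it, together with a made-up pipeline (not my real code) that reproduces the error via single() on an empty source:

import reactor.core.publisher.Flux;
import reactor.core.publisher.Hooks;

public class EmptySourceDemo {
    public static void main(String[] args) {
        // Assembly-time debugging: failures are enriched with information about
        // the operator chain where the error originated.
        Hooks.onOperatorDebug();

        Flux.range(0, 5)
            .filter(x -> x > 10)  // nothing passes this filter, so the source becomes empty
            .single()             // single() on an empty source -> NoSuchElementException: Source was empty
            .subscribe(
                System.out::println,
                Throwable::printStackTrace); // the printed trace now includes the assembly information
    }
}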
Any advice or recommendation on best practices welcome.
P.S. I have opened another question related to this one: Issue with use of project reactor's flatMap and switchIfEmpty operators.
Just using log() can go a long way here.
Take something like the following:
Flux.range(0, 5)
    .log("Initial")
    .filter(x -> x % 2 == 0)
    .log("Even only")
    .filter(x -> x < 3)
    .log("Less than 3 only")
    .subscribe(System.out::println);
Which, taking the relevant bits out of the log, will show:
21:08:28.809 [main] INFO Initial - | onNext(0)
21:08:28.809 [main] INFO Even only - | onNext(0)
21:08:28.809 [main] INFO Less than 3 only - | onNext(0)
0
21:08:28.809 [main] INFO Initial - | onNext(1)
21:08:28.809 [main] INFO Initial - | onNext(2)
21:08:28.810 [main] INFO Even only - | onNext(2)
21:08:28.810 [main] INFO Less than 3 only - | onNext(2)
2
21:08:28.810 [main] INFO Initial - | onNext(3)
21:08:28.810 [main] INFO Initial - | onNext(4)
21:08:28.810 [main] INFO Even only - | onNext(4)
This shows where each element is being filtered out, or whether it was even emitted in the first place. From the above we can deduce:
0 was emitted, made it past the "Even only" filter, then past the "Less than 3 only" filter;
1 was emitted, but didn't get past the first filter
2 and 3 have the same patterns as 0 and 1 respectively
4 made it past the "Even only" filter, but went no further (so failed at the second filter)
5 and above, or any other elements, weren't ever emitted from the initial Flux.range() call.
With something like the above approach on your previous question, you may have noticed that userMono was never emitting anything the second time it was called, which may have helped narrow down the problem.
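The same technique works on a Mono: the tell-tale sign of an empty source in the log output is an onComplete() with no preceding onNext(). A minimal sketch (userMono here is just a stand-in for the Mono from the linked question, not its real implementation):

import reactor.core.publisher.Mono;

public class EmptyMonoLog {
    public static void main(String[] args) {
        // Stand-in for a lookup that finds nothing; in your case this would be userMono.
        Mono<String> userMono = Mono.<String>empty()
            .log("userMono");

        // The log shows onSubscribe and request, then onComplete() with no onNext(...),
        // which is exactly the signature of "the source was empty".
        userMono.subscribe(System.out::println);
    }
}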
Related
I am seeing intermittently dropped records (only for error messages, not for success ones). We have a test case that intermittently fails/passes because of a lost record. We are using org.apache.beam.sdk.testing.TestPipeline in the test case. This is the relevant setup code where I have tracked the dropped record to ....
PCollectionTuple processed = records
    .apply("Process RosterRecord", ParDo.of(new ProcessRosterRecordFn(factory))
        .withOutputTags(TupleTags.OUTPUT_INTEGER, TupleTagList.of(TupleTags.FAILURE))
    );
errors = errors.and(processed.get(TupleTags.FAILURE));
PCollection<OrderlyBeamDto<Integer>> validCounts = processed.get(TupleTags.OUTPUT_INTEGER);
PCollection<OrderlyBeamDto<Integer>> errorCounts = errors
    .apply("Flatten Roster File Error Count", Flatten.pCollections())
    .apply("Publish Errors", ParDo.of(new ErrorPublisherFn(factory)));
The relevant code in ProcessRosterRecordFn.java is this
if (dto.hasValidationErrors()) {
    RosterIngestError error = new RosterIngestError(record.getRowNumber(), record.toTitleValue());
    error.getValidationErrors().addAll(dto.getValidationErrors());
    error.getOldValidationErrors().addAll(dto.getOldValidationErrors());
    log.info("Tagging record row number=" + record.getRowNumber());
    c.output(TupleTags.FAILURE, new OrderlyBeamDto<>(error));
    return;
}
I do see the "Tagging record row number" log for both of the 2 rows that fail. After that, the first line of ErrorPublisherFn.java logs immediately on receiving each message, but SOMETIMES we only receive 1 of the 2 rows there. When we receive both, the test passes. The test is very flaky in this regard.
Apache Beam is really annoying in its naming of threads (they all have the same name), so I added the thread hashcode to the logback pattern to get more insight, but I don't see anything useful there, and the ErrorPublisherFn could publish #4 on any thread anyway.
Ok, so now the big question: how can I insert more instrumentation to figure out why this record is being dropped INTERMITTENTLY?
Do I have to debug Apache Beam itself? Can I insert other functions or make changes to figure out why this error record is 'sometimes' lost on some test runs and not others?
EDIT: Thankfully, this set of tests does not cover errors from upstream, so the line errors = errors.and(processed.get(TupleTags.FAILURE)); can be removed, which in turn forces me to remove .apply("Flatten Roster File Error Count", Flatten.pCollections()). After removing those two lines the issue went away for 10 test runs in a row (though with something this flaky I can't say for certain that it is gone). Are we doing something wrong in the join and flattening? I checked the error structure: rowNumber is part of equals and hashCode, so there should be no duplicates, and I am not sure why duplicate objects would cause an intermittent failure anyway.
What more can be done to debug here and figure out why this join is not working in the TestPipeline?
How can I get insight into the flatten and join so I can debug why we are losing an event, and why we only lose it 'sometimes'?
Is this a windowing issue, even though our job starts from a file that we read in and want to process? We wanted a constantly available Dataflow stream, as Google kept running into limits, but perhaps that was the wrong decision?
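One thing I am considering, rather than debugging Beam itself, is dropping a pass-through logging DoFn between the suspect stages to see exactly where the element disappears (the class below is only a sketch, not part of our pipeline):

import org.apache.beam.sdk.transforms.DoFn;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Pass-through step that logs every element it sees, tagged with a stage name.
public class LogEveryElementFn<T> extends DoFn<T, T> {
    private static final Logger log = LoggerFactory.getLogger(LogEveryElementFn.class);
    private final String stage;

    public LogEveryElementFn(String stage) {
        this.stage = stage;
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
        log.info("[{}] element: {}", stage, c.element());
        c.output(c.element()); // forward the element unchanged
    }
}

Applied once right after the Flatten and once right before "Publish Errors" (e.g. .apply("Log after flatten", ParDo.of(new LogEveryElementFn<OrderlyBeamDto<Integer>>("after-flatten")))), it would show whether the missing row is lost before or inside the flatten.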
I am getting a 35=d (SecurityDefinition) message from ICE in my logs, with all the details, when requesting it from a Java application that uses QuickFIX/J.
In my onMessage implementation I am trying to get the group data and the values of the individual fields, but my code fails at getGroup() with a "field not found" error.
quickfix.fix44.SecurityDefinition.NoUnderlyings group = new quickfix.fix44.SecurityDefinition.NoUnderlyings();
message.getGroup(count, group);
This getGroup method internally calls QuickFIX/J's getGroups function, which fails at the line below:
this.getGroups(group.getFieldTag()); // group.getFieldTag() is 711, i.e. NoUnderlyings
Is there anything that I am missing here? I have tried different ways to get the fields but no luck; help would be much appreciated.
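For reference, the usual pattern for reading the repeating group looks roughly like the sketch below (illustrative only; it assumes the session's data dictionary actually defines tag 711 for 35=d):

import quickfix.FieldNotFound;
import quickfix.fix44.SecurityDefinition;

public class UnderlyingReader {
    // Illustrative only: iterate the NoUnderlyings (711) group of a 35=d message.
    void readUnderlyings(SecurityDefinition message) throws FieldNotFound {
        int count = message.getGroupCount(quickfix.field.NoUnderlyings.FIELD); // tag 711
        SecurityDefinition.NoUnderlyings group = new SecurityDefinition.NoUnderlyings();
        for (int i = 1; i <= count; i++) {       // group indices are 1-based
            message.getGroup(i, group);          // throws FieldNotFound if the group was not parsed
            System.out.println(group.getString(311)); // 311 = UnderlyingSecurityID, e.g. 5094981 in the log above
        }
    }
}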
Just an observation: in the fromApp/onMessage methods I am not seeing the full message when I do message.toString(). I only see the first part; the second part, which contains the actual securities (many groups), is not displayed. Not sure if there is any other way (other than toString()) to see the full message inside these methods.
Message that I am getting in fromApp/onMessage from message.toString():
<20190828-12:14:47, FIX.4.4:XXXX/1/0->ICE, incoming> (8=FIX.4.4 9=XXXXX 35=d 49=ICE 34=5 52=20190828-12:14:47.695 56=XXXX 57=1 322=10342 323=4 320=1566994457340_0 393=91 82=1 67=1 9064=0 711=91
Message that I am getting in logs :
<20190828-12:14:47, FIX.4.4:XXXX/1/0->ICE, incoming> (8=FIX.4.4 9=XXXXX 35=d 49=ICE 34=5 52=20190828-12:14:47.695 56=XXXX 57=1 322=10342 323=4 320=1566994457340_0 393=91 82=1 67=1 9064=0 711=91
311=5094981 309=UPL FQF0021.H0021 305=8 463=FXXXXX 307=UK Peak Electricity Futures (Gregorian) - UK Grid - Q1 21 308=IFEU 542=20201230 436=1.0 9013=0.01 9014=1.0 9040=0.01 9041=1.0 9017=1.0 9022=768 9024=1.0 9025=Y 916=20210101 917=20210331 9201=1900 9200=15 9202=Q1 21 9300=8148 9301=IPE UK Grid, UK (Peak) 9302=UK Grid 9303=UPL 998=MWh 9100=GBP 9101=GBP / MWh 9085=hourly 9083=2 9084=0 9061=4639 9062=UK Peak Electricity Futures (Gregorian) 9063=UK Power Peakload Futures (Gregorian) 9032=0.01 9215=1 9216=0 763=800
311=5094980 309=UPL FMH0021! 305=8 463=FXXXXX 307=UK Peak Electricity Futures (Gregorian) - UK Grid - Mar21 308=IFEU 542=20210225 436=1.0 9013=0.01 9014=1.0 9040=0.01 9041=1.0 9017=1.0 9022=276 9024=1.0 9025=Y 916=20210301 917=20210331 9201=1875 9200=12 9202=Mar21 9300=8148 9301=IPE UK Grid, UK (Peak) 9302=UK Grid 9303=UPL 998=MWh 9100=GBP 9101=GBP / MWh 9085=hourly 9083=2 9084=0 9061=4639 9062=UK Peak Electricity Futures (Gregorian) 9063=UK Power Peakload Futures (Gregorian) 9032=0.01 9215=1 9216=0 457=1 458=GB00H1RK4P63 459=U4 763=0
Which version of QuickFIX/J are you using? In some older versions the message got truncated when there were unknown fields.
That brings me to the question of whether the data dictionary you are using really contains all the fields that ICE is sending. Do you have all those 9000+ tags in your dictionary? Please double-check that.
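If you want to verify that quickly, something like the following check could help (the FIX44.xml path is an assumption; point it at whatever DataDictionary file your session is actually configured with):

import quickfix.ConfigError;
import quickfix.DataDictionary;

public class DictionaryCheck {
    public static void main(String[] args) throws ConfigError {
        DataDictionary dd = new DataDictionary("FIX44.xml"); // the dictionary your session uses
        System.out.println(dd.isGroup("d", 711)); // is NoUnderlyings defined as a group for 35=d?
        System.out.println(dd.isField(9062));     // spot-check one of the custom 9000+ tags
    }
}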
Recently one of our applications hit a strange issue. It happens from time to time (but not regularly) that some GWT RPC calls are duplicated.
Sample from the server logs:
2017-04-07 17:11:29,548 DEBUG AuthenticationChecker [ 67] - For: SearchServiceImpl.getDocSearchResults.......................................
2017-04-07 17:11:29,548 DEBUG AuthenticationChecker [ 67] - For: SearchServiceImpl.getDocSearchResults.......................................
AuthenticationChecker is an aspect that logs above information before every service method call.
As you can see two calls are done exactly in the same millisecond.
Do you have any idea what could cause duplicated server calls in a GWT/GXT application?
I would appreciate any help.
Here I have added this, but my tomcat7 console still shows all this...
I flipped back and forth between SEVERE and ALL and noticed no difference in the console output.
What step am I missing?
Set the level to OFF in your example:
level=OFF
Levels are used for identifying the severity of an event. They are organized from most specific to least:
OFF (most specific, no logging)
SEVERE (highest value)
WARNING
INFO
CONFIG
FINE
FINER
FINEST (lowest value)
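For illustration, here is a tiny standalone java.util.logging example (not your Tomcat configuration) showing how that ordering filters messages:

import java.util.logging.Level;
import java.util.logging.Logger;

public class LevelDemo {
    public static void main(String[] args) {
        Logger log = Logger.getLogger(LevelDemo.class.getName());

        log.setLevel(Level.SEVERE);   // only SEVERE passes this logger now
        log.severe("shown");
        log.warning("dropped: WARNING is below SEVERE");
        log.info("dropped: INFO is below SEVERE");

        log.setLevel(Level.OFF);      // OFF suppresses everything
        log.severe("not shown either");
    }
}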
I have a strange issue (hope you can help): I am working on a GWT web application where there are times when 4-5 GWT RPC calls are made at (as far as timing goes) the same time.
Every once in a while - maybe once every 15 calls? - the return object from one call gets 'assigned' to another. I have proof of this by using the gwt-log library on the client side.
Here the return object of the HistoryChangesCount call got assigned to the modelingGetTemplates call as well.
Thus resulting in a ClassCastException in the client file that made the call, on the same line as the onSuccess method.
Do you have any tips on how I can avoid this?
PS - I log every response object.toString() at error level. I know it's not best practice; it's just for troubleshooting.
[14:38:01.026] "(-:-) 2014-04-03 14:38:01,025 [ERROR] getHistoryChangesCount - HistoryPreviewFacet - SUCCESS RETURNED: HistoryChangesCount{dateToNumberOfChangesMap={Mon Mar 31 03:00:00 GMT+300 2014=3}, lastUpdatedOn=Mon Mar 31 11:11:02 GMT+300 2014}
"
[14:38:01.163] "(-:-) 2014-04-03 14:38:01,162 [ERROR] modelingGetTemplates - ModelingTemplatesDropdown - SUCCESS RETURNED: HistoryChangesCount{dateToNumberOfChangesMap={Mon Mar 31 03:00:00 GMT+300 2014=3}, lastUpdatedOn=Mon Mar 31 11:11:02 GMT+300 2014}
"
[14:38:01.175] "(-:-) 2014-04-03 14:38:01,174 [ERROR] Browser: null
java.lang.ClassCastException
at Unknown.iCb(StackTraceCreator.java:174)
at Unknown.sd(StackTraceCreator.java:508)
at Unknown.Txn(Throwable.java:46)
at Unknown.kIc(Cast.java:46)
at Unknown.rff(ModelingTemplatesDropdown.java:79)
at Unknown.bXi(AsyncWrapperForRPCManager.java:38)
at Unknown.Loe(RequestCallbackAdapter.java:232)
at Unknown.MWb(Request.java:258)
at Unknown.qXb(RequestBuilder.java:412)
at Unknown.anonymous(XMLHttpRequest.java:351)
at Unknown.eBb(Impl.java:189)
at Unknown.hBb(Impl.java:242)
at Unknown.anonymous(Impl.java:70)
"
Here is what a successful call to modelingGetTemplates looks like:
[14:37:24.933] "(-:-) 2014-04-03 14:37:24,932 [ERROR] modelingGetTemplates - ModelingTemplatesDropdown - SUCCESS RETURNED: [Advanced Business Application, Advanced Business Transaction, TestTemplate]
"
I am using vanilla GWT-RPC. I only have a class that extends AsyncWrapper for logging. I also created a client-side queue that limits the number of parallel calls to 4, but even so it still happens.
Versions:
GWT: 2.5.1
and I also use Sencha GXT, not sure if relevant.
Here is a video of the issue reproducing - at 0:30 - this time another call gets the object from modelingGetTemplates.
The end result is that my widget is stuck on loading waiting for data forever. And of course angry users :)
Documenting the way I got around this, rather than fixing it :) (because I couldn't find a fix)
I created a client side GWT RPC call queue.
Any RPC call made by the UI was registered with the queue, and the queue would manage (during high load, read: delay) the actual execution of the calls.
It acted similarly to a thread pool. I had a constant for how many parallel calls I could have at one time, and also a minimum time interval between two calls; I believe it was eventually set to 200 milliseconds.
So by doing the above I (almost) never got that issue. The frequency was so low that nobody noticed it anymore.
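For reference, a minimal sketch of what that queue looked like conceptually (class, names and numbers are illustrative, not my original code):

import java.util.LinkedList;
import com.google.gwt.user.client.Timer;

// Illustrative client-side RPC queue; limits concurrency and spaces out dispatches.
public class RpcCallQueue {
    private static final int MAX_PARALLEL = 4;       // max concurrent RPCs
    private static final int MIN_INTERVAL_MS = 200;  // delay before each dispatch

    private final LinkedList<Runnable> pending = new LinkedList<Runnable>();
    private int inFlight = 0;

    /** UI code registers the call here instead of invoking the async service directly. */
    public void enqueue(Runnable rpcCall) {
        pending.add(rpcCall);
        drain();
    }

    /** Must be called from the onSuccess/onFailure of every wrapped callback. */
    public void callFinished() {
        inFlight--;
        drain();
    }

    private void drain() {
        if (inFlight >= MAX_PARALLEL || pending.isEmpty()) {
            return;
        }
        inFlight++;
        final Runnable next = pending.removeFirst();
        // Delay each dispatch slightly; a simplification of the original
        // "minimum time interval between two calls" logic.
        new Timer() {
            @Override
            public void run() {
                next.run();
            }
        }.schedule(MIN_INTERVAL_MS);
    }
}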
My guess at the cause is below:
I believe the GWT framework has some maps keyed by something derived from the timestamp of the calls, and if two calls happen at the same time, the map could swap the calls, messing up the mapping of results to calls.