FIWARE Orion Runtime Error - fiware-orion

I am using FIWARE Orion (in a Docker image) and I am facing the possibility of losing some records. I looked in the log and found a number of errors like the following:
time=Sunday 17 Dec 21:03:13 2017.743Z | lvl=ERROR | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=safeMongo.cpp[287]:setStringVector | msg=Runtime Error (element 0 in array was supposed to be an string but type=3 from caller mongoSubCacheItemInsert:225)
According to http://fiware-orion.readthedocs.io/en/0.26.1/admin/logs/ this kind of error (Runtime) "may cause the Context Broker to fail" and "should be reported to the Orion development team using the appropriate channel", and this is exactly what I am doing.
Any help will be highly appreciated.
Thank you very much in advance.
EDIT: Orion version is 1.5.0-next
EDIT: It has been upgraded to 1.10.0
EDIT: After executing ps ax | grep contextBroker I receive the following results:
23470 ? Ssl 4:24 /usr/bin/contextBroker -fg -multiservice -dbhost mongodb
EDIT: The problem occurs periodically. Actually, it takes place exactly every minute:
time=Wednesday 20 Dec 20:50:27 2017.235Z
time=Wednesday 20 Dec 20:51:27 2017.237Z
etc.

Orion 1.5.0-next means some version between 1.5.0 (released in October 2016) and 1.6.0 (released in December 2016). In the best case, your version is one year old, which is quite a long time.
Thus, I recommend upgrading to the newest available Orion version (at the moment of writing this, that version is 1.10.0, released in December 2017). We have solved some "overlogging" problems in the delta of changes between 1.6.0 and 1.10.0, and the one you mention could be one of them.
If the problem persists after upgrading, mention it in a comment to this answer and we'll keep debugging.

Diagnosis
The 60-second periodicity is exactly the subscription cache refresh interval with the default configuration (your CLI confirms you are not using a different setting for the subscription cache).
Looking in detail at the line referred to by the log trace in the Orion 1.10.0 source code:
setStringVectorF(sub, CSUB_CONDITIONS, &(cSubP->notifyConditionV));
The log error means that Orion expects an array of strings for the CSUB_CONDITIONS field in a document of the subscriptions collection in the database, but some (or all) of the elements in the array aren't strings but objects (type 3 means object, as the BSON specification details).
The CSUB_CONDITIONS constant corresponds to the conditions field in the DB. Note this field changed in Orion 1.3.0. Before 1.3.0 (for instance, in 1.2.0), it was an array of objects:
"conditions" : [
{
"type" : "ONCHANGE",
"value" : [ "temperature " ]
}
]
From 1.3.0 on, it was simplified to an array of strings:
"conditions" : [ "temperature" ]
So my hypothesis is that at some moment in the past that Orion instance was upgraded across the 1.3.0 boundary without applying the procedure to migrate data (or the procedure was applied but failed in some way).
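If you want to check this hypothesis, a query against the database should show whether any csubs documents still carry the pre-1.3.0 object format. Below is a minimal sketch using pymongo; it assumes the default orion database and csubs collection names and the -dbhost value shown above, so adjust them to your deployment:
# Minimal sketch (assumptions: default "orion" database, "csubs" collection,
# MongoDB reachable at the host given with -dbhost). Lists subscriptions whose
# "conditions" array still contains objects (pre-1.3.0 format) instead of strings.
from pymongo import MongoClient

client = MongoClient("mongodb://mongodb:27017")
csubs = client["orion"]["csubs"]

for doc in csubs.find({"conditions": {"$exists": True}}):
    bad = [c for c in doc.get("conditions", []) if not isinstance(c, str)]
    if bad:
        print(doc["_id"], "still has non-string conditions elements:", bad)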
Solution
Given that you are in a situation in which your data in the Orion database is probably inconsistent, the cleanest solution would be to remove your database, or at least the csubs collection.
However, this is possible only if you can easily regenerate the data to be deleted. If that is not feasible, you can try the procedure to migrate data. In particular, the csub_merge_condvalues.py script should fix the problem, although I'd recommend applying the full procedure in order to fix other potential inconsistencies.
Take into account that the migration procedure was designed to be applied before starting to use the new Orion version. It seems you have been using post-1.3.0 Orion with pre-1.3.0 data for some time, so your data may have evolved in some unexpected way that the procedure can't fix. Anyway, even in this case the procedure is better than nothing :)
Note that if you are using multiple services (it seems so, given the -multiservice CLI parameter), you have to apply the clean/migration procedure to every per-service database, as sketched below.
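As a sketch of what "every per-service database" means in practice: with the default -db prefix, Orion keeps the default data in the orion database and per-service data in databases named orion-<service>, so you can list the databases the clean/migration procedure has to cover like this (adjust the prefix if you start Orion with a custom -db value):
# Sketch: list the Orion databases (default "orion" plus per-service
# "orion-<service>") that the clean/migration procedure must be applied to.
from pymongo import MongoClient

client = MongoClient("mongodb://mongodb:27017")
orion_dbs = [name for name in client.list_database_names()
             if name == "orion" or name.startswith("orion-")]
print("Databases to clean/migrate:", orion_dbs)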

Related

Vertx Form Login Handler with Postgresql Failure

I am trying to authenticate a user using FormLoginHandler and a PostgreSQL database with SqlAuthentication.
But I get the following error:
Jun 15, 2022 1:14:34 PM io.vertx.ext.web.RoutingContext
SEVERE: Unhandled exception in router
io.vertx.ext.web.handler.HttpException: Unauthorized
Caused by: io.vertx.core.impl.NoStackTraceThrowable: Invalid username/password
I am providing the right credentials.
The code snippet is:
SqlAuthenticationOptions sauthopts = new SqlAuthenticationOptions();
sauthopts.setAuthenticationQuery(AUTHENTICATE_QUERY);
SqlAuthentication authenticationProvider = SqlAuthentication.create(sqlClient, sauthopts);
router.route("/secure/*").handler(RedirectAuthHandler.create(authenticationProvider, "/login.html"));
FormLoginHandler formLoginHandler = FormLoginHandler.create(authenticationProvider);
router.route("/loginhandler").handler(formLoginHandler);
Please let me know if I am missing something here, or point me to a sample example.
Thanks in Advance.
Your setup doesn't show anything abnormal at first sight. For security reasons, we cannot "just" log the authentication data, as it would be a critical OWASP bug and security vulnerability.
My best guess is that there is probably something not totally correct with the query, so you now have 2 options:
debug the application and see the query that is being sent + the arguments
prepare a small complete example that shows the bug and open an issue in vert.x so we can debug it further.
If you're upgrading from an older version, be aware that in vert.x 4.2.0 some changes were made to the base64 encoding to keep it consistent across modules. This could be a reason why authentication fails, as the encoded hashes may be slightly different. If you're starting directly on 4.3.0, then this would not be a problem.
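As a generic illustration of why an encoding change can break stored credentials (plain Python, not the actual vert.x code): the same digest encoded with and without base64 padding yields different strings, so a hash stored under one convention will not string-compare equal to one produced under the other, and verification fails.
# Illustration only (not vert.x code): padded vs. unpadded base64 encodings
# of the same digest differ, so stored hashes from before an encoding change
# won't match hashes computed after it.
import base64
import hashlib

digest = hashlib.sha256(b"s3cr3t").digest()
padded = base64.b64encode(digest).decode()   # ends with "="
unpadded = padded.rstrip("=")                # padding stripped

print(padded == unpadded)  # False: a plain string comparison fails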

Why is the generation of my TYPO3 documentation failing without a proper error?

I have a new laptop and I am trying to render the changelogs of TYPO3 locally based on the steps on https://docs.typo3.org/m/typo3/docs-how-to-document/master/en-us/RenderingDocs/Quickstart.html#render-documenation-with-docker. It continues until the end but shows some non-zero exit codes at the end.
project : 0.0.0 : Makedir
makedir /ALL/Makedir
2021-02-16 10:32:50 654198, took: 173.34 seconds, toolchain: RenderDocumentation
REBUILD_NEEDED because of change, age 448186.6 of 168.0 hours, 18674.4 of 7.0 days
OK:
------------------------------------------------
FINAL STATUS is: FAILURE (exitcode 255)
because HTML builder failed
------------------------------------------------
exitcode: 0 39 ms
When I run the command in another documentation project, it renders just fine.
I found the issue with this. It seemed the Docker container did not have enough memory allocated. I changed the available memory from 2 GB to 4 GB in Docker Desktop and that solved the issue.
You already solved the problem. But in case of similar errors: To get more information on a failure, you can also use this trick:
Create a directory tmp-GENERATED-temp before rendering. Usually, this directory is automatically created and then removed after rendering. If you create it before rendering, you will find log files with more details in it.
See the Troubleshooting page.
I had some errors where I found the output in the console insufficient and this helped me to narrow down the problem.
In case of other problems, I would file an issue in the GitHub repo: https://github.com/t3docs/docker-render-documentation
Note: This is specific to TYPO3 docs rendering and may change in the future.

Upgrading MongoDB from 4.2.9 to 4.4.0: Location13111: field not found, expected type date

I'm running a sharded MongoDB instance and as per the instructions, the config servers are a replica set. I'm unable to upgrade from v4.2.9 to 4.4.0. Per the upgrade instructions, I need to upgrade the config servers first, starting with a secondary. It already failed there. I shut down the secondary's instance, replaced the binaries, and restarted it. But it didn't start up again. The logs say the following (I removed the timestamps for clarity):
"msg":"The size storer reports that the oplog contains","attr":{"numRecords":53890848,"dataSize":13618131721}}
"msg":"Sampling the oplog to determine where to place markers for truncation"}
"msg":"Sampling from the oplog to determine where to place markers for truncation","attr":{"from":{"$timestamp":{"t":1494750837,"i":1}},"to":{"$timestamp":{"t":1598687615,"i":1}}}}
"msg":"Taking samples and assuming each oplog section contains","attr":{"numSamples":253,"containsNumRecords":2124552,"containsNumBytes":536870917}}
"msg":"User assertion","attr":{"error":"Location13111: field not found, expected type date","file":"src/mongo/bson/bsonelement.h","line":810}}
"msg":"WiredTiger record store oplog processing finished","attr":{"durationMillis":21}}
"msg":"~WiredTigerRecordStore for: {ns}","attr":{"ns":"local.oplog.rs"}}
"msg":"Invariant failure","attr":{"expr":"_oplogManagerCount > 0","file":"src/mongo/db/storage/wiredtiger/wiredtiger_kv_engine.cpp","line":2467}}
"msg":"\n\n***aborting after invariant() failure\n\n"}
"msg":"Writing fatal message","attr":{"message":"Got signal: 6 (Aborted).\n"}}
"msg":"BACKTRACE: {bt}","attr":{"bt":{"backtrace":[{"a":"55C91A79E621","b":"55C917AE3000","o":"2CBB621","s":"_ZN5mongo18stack_trace_detail12_GLOBAL__N_119printStackTraceImplERKNS1_7OptionsEPNS_14StackTraceSinkE.constprop.606","s+":"1E1"},{"a":"55C91A79FCC9","b":"55C917AE3000","o":"2CBCCC9","s":"_ZN5mongo15printStackTraceEv","s+":"29"},{"a":"55C91A79D4B6","b":"55C917AE3000","o":"2CBA4B6","s":"_ZN5mongo12_GLOBAL__N_116abruptQuitActionEiP9siginfo_tPv","s+":"66"},{"a":"7FAF200070E0","b":"7FAF1FFF6000","o":"110E0","s":"funlockfile","s+":"50"},{"a":"7FAF1FC89FFF","b":"7FAF1FC57000","o":"32FFF","s":"gsignal","s+":"CF"},{"a":"7FAF1FC8B42A","b":"7FAF1FC57000","o":"3442A","s":"abort","s+":"16A"},{"a":"55C9189E6C5F","b":"55C917AE3000","o":"F03C5F","s":"_ZN5mongo15invariantFailedEPKcS1_j","s+":"12C"},{"a":"55C9186CE4B6","b":"55C917AE3000","o":"BEB4B6","s":"_ZN5mongo18WiredTigerKVEngine16haltOplogManagerEv.cold.1904","s+":"18"},{"a":"55C918B0711C","b":"55C917AE3000","o":"102411C","s":"_ZN5mongo21WiredTigerRecordStoreD1Ev","s+":"2FC"},{"a":"55C918B0D68B","b":"55C917AE3000","o":"102A68B","s":"_ZN5mongo29StandardWiredTigerRecordStoreD0Ev","s+":"1B"},{"a":"55C9186CEC5B","b":"55C917AE3000","o":"BEBC5B","s":"_ZN5mongo18WiredTigerKVEngine21getGroupedRecordStoreEPNS_16OperationContextENS_10StringDataES3_RKNS_17CollectionOptionsENS_8KVPrefixE.cold.1921","s+":"57"},{"a":"55C919378A76","b":"55C917AE3000","o":"1895A76","s":"_ZN5mongo17StorageEngineImpl15_initCollectionEPNS_16OperationContextENS_8RecordIdERKNS_15NamespaceStringEb","s+":"316"},{"a":"55C91937A7BD","b":"55C917AE3000","o":"18977BD","s":"_ZN5mongo17StorageEngineImpl11loadCatalogEPNS_16OperationContextE","s+":"90D"},{"a":"55C91937E3D0","b":"55C917AE3000","o":"189B3D0","s":"_ZN5mongo17StorageEngineImplC1EPNS_8KVEngineENS_20StorageEngineOptionsE","s+":"270"},{"a":"55C918AC8005","b":"55C917AE3000","o":"FE5005","s":"_ZNK5mongo12_GLOBAL__N_117WiredTigerFactory6createERKNS_19StorageGlobalParamsEPKNS_21StorageEngineLockFileE","s+":"1A5"},{"a":"55C9193889EE","b":"55C917AE3000","o":"18A59EE","s":"_ZN5mongo23initializeStorageEngineEPNS_14ServiceContextENS_22StorageEngineInitFlagsE","s+":"4CE"},{"a":"55C918A84587","b":"55C917AE3000","o":"FA1587","s":"_ZN5mongo12_GLOBAL__N_114_initAndListenEPNS_14ServiceContextEi.isra.1409","s+":"3F7"},{"a":"55C918A88610","b":"55C917AE3000","o":"FA5610","s":"_ZN5mongo12_GLOBAL__N_111mongoDbMainEiPPcS2_","s+":"650"},{"a":"55C9189F7849","b":"55C917AE3000","o":"F14849","s":"main","s+":"9"},{"a":"7FAF1FC772E1","b":"7FAF1FC57000","o":"202E1","s":"__libc_start_main","s+":"F1"},{"a":"55C918A83A3A","b":"55C917AE3000","o":"FA0A3A","s":"_start","s+":"2A"}],"processInfo":{"mongodbVersion":"4.4.0","gitVersion":"563487e100c4215e2dce98d0af2a6a5a2d67c5cf","compiledModules":[],"uname":{"sysname":"Linux","release":"4.9.0-7-amd64","version":"#1 SMP Debian 4.9.110-3+deb9u2 (2018-08-13)","machine":"x86_64"},"somap":[{"b":"55C917AE3000","elfType":3,"buildId":"D7866CAA7FFAC402345915854064CD98A5B60C27"},{"b":"7FAF1FFF6000","path":"/lib/x86_64-linux-gnu/libpthread.so.0","elfType":3,"buildId":"16D609487BCC4ACBAC29A4EAA2DDA0D2F56211EC"},{"b":"7FAF1FC57000","path":"/lib/x86_64-linux-gnu/libc.so.6","elfType":3,"buildId":"775143E680FF0CD4CD51CCE1CE8CA216E635A1D6"}]}}}}
It appears to boil down to the following error message:
Location13111: field not found, expected type date.
src/mongo/bson/bsonelement.h:810
Googling didn't turn up anything useful. I didn't proceed after that but had to revert to v4.2.9. (I wanted to keep the damage limited to the config secondary and not hit the same issue with the shards.)
I'm on Debian 9.13 and I tried both installing MongoDB 4.4.0 via apt and installing the Debian 9.2 binaries directly. The error was the same both times.
Any ideas what to do about this one?

kernel - postgres segfault error 15 in libc-2.19.so

Yesterday we had a crash of PostgreSQL 9.5.14 running on Debian 8 (Linux xxxxxx 3.16.0-7-amd64 #1 SMP Debian 3.16.59-1 (2018-10-03) x86_64 GNU/Linux): a segmentation fault. The database closed all connections and reinitialized itself, staying ~1 minute in recovery mode.
PostgreSQL log:
2018-10-xx xx:xx:xx UTC [580-2] LOG: server process (PID 16461) was
terminated by signal 11: Segmentation fault
kern.log:
Oct xx xx:xx:xx xxxxxxxx kernel: [117977.301353] postgres[16461]:
segfault at 7efd3237db90 ip 00007efd3237db90 sp 00007ffd26826678 error
15 in libc-2.19.so[7efd322a2000+1a1000]
According to the libc documentation (https://support.novell.com/docs/Tids/Solutions/10100304.html), error code 15 means:
NX_EDEADLK 15 resource deadlock would occur - which does not tell me much.
Could you please tell me if we can do something to avoid this problem in the future? This server is, of course, a production one.
All packages are currently up to date. Upgrading PG is unfortunately not an option. The server runs on Google Compute Engine.
error code 15 means: NX_EDEADLK 15
No, it doesn't mean that. This answer explains how to interpret 15 here.
It's bits 0, 1, 2, 3 set => protection fault, write access, user mode, use of reserved bit. Most likely your postgres process attempted to write to some wild pointer.
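For reference, the error value in the kern.log segfault line is a bitmask of x86 page-fault flags, so it can be decoded mechanically; a small sketch:
# Sketch: decode the "error N" bitmask from a kernel segfault line
# (x86 page-fault error code bits).
FLAGS = [
    (0x1, "protection fault (page was present)"),
    (0x2, "write access"),
    (0x4, "user mode"),
    (0x8, "use of reserved bit"),
    (0x10, "instruction fetch"),
]

def decode(error_code):
    return [desc for bit, desc in FLAGS if error_code & bit]

print(decode(15))
# ['protection fault (page was present)', 'write access', 'user mode', 'use of reserved bit']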
if we can do something to avoid this problem in the future?
The only thing you can do is find the bug and fix it, or upgrade to a release of postgres where that bug is already fixed (and hope that no new ones were introduced).
To understand where the bug might be, you should check whether a core dump was produced (if not, enable them). If you have the core, use gdb /path/to/postgres /path/to/core, and then the where GDB command. That will give you the crash stack trace, which may allow you to find similar bug reports.

Adobe CQ6 - Runaway ContentFinder

Starting on 1/13, our Adobe CQ6.0 SP1 error logs started filling up with:
GET /bin/wcm/contentfinder/product/view.json/etc/commerce/products HTTP/1.1] org.apache.jackrabbit.oak.plugins.index.property.strategy.ContentMirrorStoreStrategy Traversed 1041307000 nodes using index jcr:lastModified with filter Filter(query=select [jcr:path], [jcr:score], * from [nt:base] as a where isdescendantnode(a, '/etc/commerce/products') order by [jcr:lastModified] desc /* xpath: /jcr:root/etc/commerce/products//* order by #jcr:lastModified descending /, path=/etc/commerce/products//)
The error logs are huge and AEM 6.0 ran out of disk space:
error.log.2015-01-13: 30295763555 bytes
error.log.2015-01-14: 52886323200 bytes
We are able to reproduce the problem by issuing the following HTTP request against AEM Author:
GET /bin/wcm/contentfinder/product/view.json/etc/commerce/products
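For completeness, a minimal sketch of how we reproduce it from a script (host, port and credentials are placeholders for our author instance, not values from the logs):
# Sketch: reproduce the runaway ContentFinder request against an AEM author
# instance. Host, port and credentials below are placeholders.
import requests

resp = requests.get(
    "http://localhost:4502/bin/wcm/contentfinder/product/view.json/etc/commerce/products",
    auth=("admin", "admin"),
)
print(resp.status_code)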
This issue started suddenly on 1/13/2015 at 9:47 a.m., when a co-worker was loading a site in AEM 6.0 and ContentFinder never loaded, so she removed cf# from the URL and was then able to proceed with authoring the content itself.
We are interested in knowing if others have had similar issues with ContentFinder in AEM6.0.
AEM 6.0 has a bug in the Querybuilder related to Oak 1.0.5. We need Oak to be upgraded to v1.0.9. The following URI has more information:
http://helpx.adobe.com/experience-manager/kb/aem6-available-hotfixes.html
SP1 needs to be installed first, and then the hotfixes need to be installed in the given order over SP1. The two sample index packages (damLucene.zip and productsIndex.zip) need to be installed as well. These add the following indices:
/oak:index/damLucene
/etc/commerce/products/ntbaseProductsLucene