Entities disappear when the platform reboots - fiware-orion

We have a problem with entities in OrionCB. Each time the platform testbed is out of service, the entities we created before disappear.
[root@orioncb ~]# curl localhost:1026/ngsi10/contextEntities/finesce_meteo -s -S --header 'Content-Type: application/xml' | xmllint --format -
curl: (7) couldn't connect to host
-:1: parser error : Document is empty
^
-:1: parser error : Start tag expected, '<' not found
This is an example of the output we get when trying to list the "finesce_meteo" entity.
Regards,
Ismael

With the information in your question I cannot be sure whether or not the entities have disappeared. In fact, the information in your question points to a different problem.
Note the curl: (7) couldn't connect to host message. That means that the client cannot reach port 1026 on the host. The most probable causes of this problem are:
Orion Context Broker is not started on that host
Orion Context Broker is started on that host, but on a port different from 1026
Something in the host (e.g. a firewall or security group) is blocking the incoming connection
Something in the client (e.g. a firewall) is blocking the outgoing connection
Some other network issue is causing the connection problem
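A quick way to narrow this down is to check, on the host itself, whether the broker is running and listening; a minimal sketch using standard Linux tools (nothing Orion-specific is assumed beyond the default port):
# is the broker process running?
ps aux | grep contextBroker
# is anything listening on port 1026?
ss -tlnp | grep 1026
# does the broker answer locally?
curl localhost:1026/version
If the local curl works but a remote one does not, the problem is in the network path (firewall, security group), not in Orion itself.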

Related

Force Kafka to connect to brokers through IPs, not via hostnames

We have the following Kafka-over-SSH tunneling setup:
ssh -N $JUMPHOST -L 2181:w.x.y.z:2181 -L 9092:a.b.c.d:9092 -L 9091:e.e.f.f:9092
The broker IP is a.b.c.d; there is a local lo0 device alias with the same IP address.
The ZooKeeper IP is w.x.y.z; there is a local device alias with the same IP address.
The Kafka "entry" host is e.e.f.f.
Our planned use case is kafkacat -C -b localhost:9091 -t <topic>
Problem:
Connecting to the Kafka host/ZooKeepers works fine; however,
Kafka clients (e.g. kafkacat) access the brokers by their hostname, ip-a.b.c.d.eu-central-1.compute.internal, not by their IPs.
To counteract this, I've added an entry to /etc/hosts:
a.b.c.d ip-a.b.c.d.eu-central-1.compute.internal
It still doesn't work, although pinging that hostname succeeds.
nslookup gives:
Non-authoritative answer:
Name: ip-a.b.c.d.eu-central-1.compute.internal
Address: a.b.c.d
** server can't find ip-a.b.c.d.eu-central-1.compute.internal: NXDOMAIN
Question:
Is there a way to tell Kafka clients to connect to the brokers through IPs and not via hostnames?
If not, might starting a local DNS server resolve the issue?
What is happening here is:
The broker receives a request from the client and returns ip-a.b.c.d.eu-central-1.compute.internal, as this is the host name of the broker and the default value for listeners. (This is the key.)
Your client tries to send data to the broker using the metadata it was given. But since it can't resolve ip-a.b.c.d.eu-central-1.compute.internal, it fails without even reaching the cluster, caused by a networking issue out of Kafka's scope.
If you set the value in /etc/hosts, you fix the address resolution problem; the client will now be able to reach the cluster, solving the previous networking issue.
The next step involves Kafka replying with an error (the exact error code may differ). Your request fails again, now on the cluster side: your client is asking for a broker called/referenced a.b.c.d, but there is no registered listener with that name, as its identifier is still ip-a.b.c.d.eu-central-1.compute.internal.
The key here is the advertised.listeners property, located in the server.properties configuration file.
For your clients to be able to connect, modify that property, setting either the IP directly or a resolvable DNS name (using the IP in this example):
advertised.listeners=PLAINTEXT://a.b.c.d:9092
Now, on the client side, just use the IP to connect to the broker:
bootstrap.servers = a.b.c.d:9092
When the request from the client is received, Kafka will recognize the content of bootstrap.servers as one of its registered listeners, and hence accept the connection.
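To verify what the brokers actually advertise after the change, you can dump the cluster metadata; a quick check with kafkacat, reusing the example IP (an assumption, substitute your own):
kafkacat -L -b a.b.c.d:9092
The broker list in the output should now show the IP rather than the internal hostname.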
Found a workaround. Posting it here in case anyone faces the same problem.
The steps are as follows:
Create dummy aliases for all hosts you are planning to use: sudo ip addr add $ip dev lo
These aliases should NOT use the real broker/ZooKeeper IPs, but addresses in the 127.0.j.k format
Add an ip-<>.<>.<>.<>.eu-central-1.compute.internal <--> 127.0.[].[] mapping to /etc/hosts
Create the tunnel via SSH, taking into account the relation between the brokers'/ZooKeeper's IPs and your local (aliased) IPs
ssh -N $JUMPHOST -L <localIP>:2181:<remoteIP>:2181 -L <localIP>:9092:<remoteIP>:9092 ...
then you can consume messages via
kafkacat -C -b 127.0.[].[]:9092 -t <topic>
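Putting the pieces together, a minimal sketch with a made-up local alias (127.0.0.2 standing in for broker a.b.c.d; adjust the addresses to your setup):
# loopback alias for the broker (on Linux 127.0.0.0/8 is often already routed to lo; on macOS use: sudo ifconfig lo0 alias 127.0.0.2)
sudo ip addr add 127.0.0.2/8 dev lo
# map the advertised hostname to the alias
echo '127.0.0.2 ip-a.b.c.d.eu-central-1.compute.internal' | sudo tee -a /etc/hosts
# tunnel the broker port to the alias (keep this running)
ssh -N $JUMPHOST -L 127.0.0.2:9092:a.b.c.d:9092
# in another shell, consume
kafkacat -C -b 127.0.0.2:9092 -t <topic>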

Looking for debugging advice on SSL errors from EKS using Varnish

I know this is a somewhat specific question, but I'm having a problem I can't seem to track down. I have a single pod deployed to EKS; the pod contains a Python app and a Varnish reverse caching proxy. I'm serving chunked JSON (that is, streaming lines of JSON, a la http://jsonlines.org/), and it can be multiple GB of data.
The first time I make a request and it hits the Python server, everything works correctly. It takes (much) longer than the cached version, but the entire set of JSON lines is downloaded. However, now that it's cached in Varnish, if I use curl, I get:
curl: (56) GnuTLS recv error (-110): The TLS connection was non-properly terminated.
or
curl: (56) GnuTLS recv error (-9): A TLS packet with unexpected length was received.
The SSL is terminated at the ELB, and when I use curl from the proxy container itself (using curl http://localhost?....), there is no problem.
The hard part of this is that the problem is somewhat intermittent.
If there is any advice in terms of clever varnishlog usage, or anything of the same ilk on AWS, I'd be much obliged.
Thanks!
Because TLS is terminated on your ELB load balancers, the connection between the ELB and Varnish should be plain HTTP.
The error is probably not coming from Varnish, because Varnish currently doesn't handle TLS natively. I'm not sure whether varnishlog can give you better insight into what is actually happening.
Checklist
The only checklist I can give you is the following:
Make sure the certificate you're using is valid
Make sure you're connecting to your target group over HTTP, not HTTPS
If you enable the PROXY protocol on your ELB, make sure Varnish has a -a listener that accepts PROXY protocol requests on top of regular HTTP requests, as in the sketch below.
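A sketch of such a dual-listener varnishd invocation (ports and VCL path are assumptions; adjust to your deployment):
varnishd -a :6081 -a :6086,PROXY -f /etc/varnish/default.vcl
Here :6081 accepts plain HTTP and :6086 accepts PROXY protocol connections from the ELB.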
Debugging
Perform top-down debugging:
Increase the verbosity of your cURL calls and try to get more information about the error (see the example after this list)
Try accessing the logs of your ELB and get more details there
Get more information from your EKS logs
And finally, run varnishlog -g request -q "ReqUrl eq '/your-url'" to get the full Varnish log for a specific URL
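For the first step, a minimal sketch of a more verbose cURL call (the URL is a placeholder):
curl -v https://your-host/your-url -o /dev/null
The -v output shows the TLS handshake and the response headers, which usually reveals whether the connection is being cut mid-transfer.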

OpenStack instance is not reachable due to metadata issue in Liberty

I'm getting this error in the instance log. I could not find any errors in the nova or neutron logs.
I checked all the configuration and everything looks fine.
url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [50/120s]: request error [(, 'Connection to 169.254.169.254 timed out. (connect timeout=50.0)')]
Could anyone help with what the actual error is and how to solve it?
Probable Reason 1:
I guess you are running instances with a GUI installed on them. When you install a GUI on an instance (Ubuntu/CentOS or whatever), it brings in a bunch of different services. In particular, on Ubuntu, a service named "avahi" gets added and started, which adds a 169.254/16 route on the instance. This causes the issue, as the instance now thinks it can reach 169.254.169.254 directly rather than sending the packets to the gateway.
More details on why this happens and how you can stop it can be found in this blog post:
https://rahulait.wordpress.com/2016/04/02/metadata-failure-with-ubuntu-desktop-on-openstack/
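A quick way to check for (and remove) such a link-local route on the instance, assuming a Linux guest with iproute2:
# a 169.254.0.0/16 entry pointing at a local interface is the symptom
ip route show | grep 169.254
# remove it so metadata traffic goes via the gateway again
sudo ip route del 169.254.0.0/16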
Probable Reason 2:
If you have a private network and it is not connected to any router, the gateway interface for that private network will be down. To communicate with the metadata service, the packets need to be sent to the network's gateway, which would not be reachable in this case, and hence you would see these logs.
I hope it helps.
In my case this error was raised because the L3 agent was down due to some corruption in the ini file. Check whether there is an agent down in neutron:
openstack network agent list
Fix the issue (check the logs at /var/log/neutron) and restart the service:
service neutron-l3-agent restart
This happened to me on a node that was still running nova-network from a previous configuration.
The effect on the faulty node was this (bad):
# curl -v http://169.254.169.254/openstack
* Hostname was NOT found in DNS cache
* Trying 169.254.169.254...
* connect to 169.254.169.254 port 80 failed: Connection refused
* Failed to connect to 169.254.169.254 port 80: Connection refused
* Closing connection 0
curl: (7) Failed to connect to 169.254.169.254 port 80: Connection refused
...instead of this (good):
# curl -v http://169.254.169.254/openstack
* Hostname was NOT found in DNS cache
* Trying 169.254.169.254...
* Immediate connect fail for 169.254.169.254: Network is unreachable
* Closing connection 0
curl: (7) Couldn't connect to server
If this is the case, get rid of the legacy service on your node (sketched below) and enjoy.
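A sketch of removing the legacy service (the unit name openstack-nova-network is the RDO/CentOS packaging; adjust for your distro):
sudo systemctl stop openstack-nova-network
sudo systemctl disable openstack-nova-network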

Connecting Orion Context Broker from another machine

I can't connect to the Context Broker from another machine, not even a machine on the same LAN.
Accessing via SSH works without any problem:
ssh geezar@192.168.1.115
and then
curl localhost:1026/statistics
the terminal shows the statistics correctly:
<orion>
<xmlRequests>3</xmlRequests>
<jsonRequests>1</jsonRequests>
<updates>1</updates>
<versionRequests>1</versionRequests>
<statisticsRequests>2</statisticsRequests>
<uptime_in_secs>84973</uptime_in_secs>
<measuring_interval_in_secs>84973</measuring_interval_in_secs>
</orion>
But when I try without the SSH connection...
curl 192.168.1.115:1026/statistics
curl: (7) Failed to connect to 192.168.1.115 port 1026: No route to host
I even forwarded port 1026 to that machine (192.168.1.115) in the router configuration and tried to access it from my public IP; the result is the same: failed to connect.
I think I am missing something, but... what is it?
The most probable causes of this problem are:
Something in the host (e.g. a firewall or security group) is blocking the incoming connection
Something in the client (e.g. a firewall) is blocking the outgoing connection
Some other network issue is causing the connection problem
EDIT: on GNU/Linux systems, iptables is usually used as the firewall. Its rules can typically be flushed by running iptables -F.
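To confirm where the connection is being dropped, a minimal sketch using standard Linux tools (the address comes from the question):
# on the server: is the broker listening on all interfaces (0.0.0.0) or only on loopback?
ss -tlnp | grep 1026
# on the client: try the port directly
curl -v 192.168.1.115:1026/version
If the listener is bound to 127.0.0.1 only, or if the remote curl fails while the local one works, the binding or the firewall is the culprit.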

RhodeCode - What is blocking my connection?

All connection attempts on RhodeCode on CentOS 6.3 are refused except from localhost.
Note that iptables is not running, and I am only trying to visit the web interface.
I have googled the exact error message below and looked around SO. I have yet to find a solution.
abort: error: No connection could be made because the target machine actively refused it
If the firewall is down, and I am not trying to modify any repository, what else is preventing me from connecting? EDIT: See #5 below. Not sure how to address it yet.
Things tried and other info
Using localhost, 127.0.0.1 and hostname in production.ini
service iptables stop
Connected over HTTP successfully. In other words, connections are accepted outside RhodeCode.
Made sure no authentication methods were enabled or configured in production.ini
Although the server accepts connections on localhost, netstat -l does not show that port 5000 is listening. Port 5000 is set in production.ini and ps uax | grep paster confirms the server is running. No other software tries to grab port 5000.
OK, apparently I had been misunderstanding the host configuration. I was running on the assumption that host should be set to 127.0.0.1 or localhost in production.ini so that RhodeCode would know which host to look at for another service. This was a faulty presumption on my part, since I am used to pointing web applications at local systems to look for databases.
It turns out that host binds the application itself to a specific address, meaning that RhodeCode was responding only to local requests, regardless of what other system policies say. The setup docs did not make this clear, because they did not specify that external connections would be refused. All they said was:
This command [paster serve] runs the RhodeCode server. The web app should be available at the 127.0.0.1:5000. This ip and port is configurable via the production.ini file created in previous step
The problem was fixed by binding RhodeCode to 0.0.0.0, which opened it to outside connections. Kudos to Łukasz Balcerzak for pointing this out in the RC support Google group.
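For reference, a minimal sketch of the relevant production.ini setting (section and key names follow the standard Paste config shipped with RhodeCode; verify against your own file):
[server:main]
host = 0.0.0.0
port = 5000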