Divolte-collector with MapR, Storm, Kafka and Cassandra - sockets

I am not sure if I can get help for this on here, but I thought it was worth a try.
I have a 3-node cluster on AWS running MapR M3, and I have installed Storm, Kafka, Divolte Collector and Cassandra. I would like to try some of the clickstream examples, but I am running into an issue with the tcp-consumer example. Being quite new to Java and distributed processing, I also have some clarification questions. I am not quite sure where to post this because it feels divolte-collector specific, and I also have some gaps in my understanding of the Javadoc concept and of building and running jar files; perhaps someone could point me to some resources or help with some clarifications. The problem: I can't get the JSON string to appear in the console running the netcat socket listening for clicks:
Divolte tcp-kafka-consumer example
Everything works until the netcat part (step 7), and my knowledge gap is with step 6.
Step 1: install and configure Divolte Collector
The install works and the hello-world click collection is promising :-)
Step 2: download, unpack and run Kafka
# In one terminal session
cd kafka_2.10-0.8.1.1/bin
./zookeeper-server-start.sh ../config/zookeeper.properties
# Leave Zookeeper running and in another terminal session, do:
cd kafka_2.10-0.8.1.1/bin
./kafka-server-start.sh ../config/server.properties
No errors, plus I tested the Kafka examples, so this seems to be working as well.
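For reference, the sanity check I ran was roughly the standard console producer/consumer round trip (Kafka 0.8.1.1 syntax; the topic name is just a placeholder):
cd kafka_2.10-0.8.1.1/bin
./kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sanity-check
./kafka-console-producer.sh --broker-list localhost:9092 --topic sanity-check
# in another terminal: messages typed into the producer should show up here
./kafka-console-consumer.sh --zookeeper localhost:2181 --topic sanity-check --from-beginning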
Step 3: start Divolte Collector
Go into the bin directory of your installation and run:
cd divolte-collector-0.2/bin
./divolte-collector
Step 3 goes without a hitch; I can load the default divolte-collector test page.
Step 4: host your Javadoc files
Set up an HTTP server that serves the Javadoc files you generated or downloaded for the examples. If you have Python installed, you can use this:
cd <your-javadoc-directory>
python -m SimpleHTTPServer
OK, so I can reach the Javadoc pages.
Step 5: listen on TCP port 1234
nc -kl 1234
Note: when using netcat (nc) as TCP server, make sure that you configure the Kafka consumer to use only 1 thread, because nc won't handle multiple incoming connections.
I tested netcat by opening the port and sending messages, so I figured I don't have any port issues on AWS.
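Roughly how I tested it (the hostname is the placeholder for my instance):
# on the AWS instance
nc -kl 1234
# from my laptop; the text typed here showed up in the listener
echo "hello" | nc ec2-x-x-x-x.us-west-x.compute.amazonaws.com 1234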
Step 6: run the example
cd divolte-examples/tcp-kafka-consumer
mvn clean package
java -jar target/tcp-kafka-consumer-*-jar-with-dependencies.jar
Note: for this to work, you need to have the avro-schema project installed into your local Maven repository.
I installed the avro-schema project with mvn clean install in the avro-schema project that comes with the examples, as per the instructions here.
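That is, roughly:
cd divolte-examples/avro-schema
mvn clean install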
Step 7: click around and check that you see events being flushed to the console where you run netcat
When you click around the Javadoc pages, your console should show events in JSON format similar to this:
I don't see the clicks in my netcat window :(
Investigating the issue, I viewed the console and network tabs using Chrome developer tools; it seems Divolte is running, but I am not sure how to dig further. This is the console view. Any ideas or pointers?
Thanks anyway.
Initializing Divolte.
divolte.js:140 Divolte base URL detected http://ec2-x-x-x-x.us-west-x.compute.amazonaws.com:8290/
divolte.js:280 Divolte party/session/pageview identifiers ["0:i6i3g0jy:nxGMDVdU9~f1wF3RGqwmCKKICn4d1Sb9", "0:i6qx4rmi:IXc1i6Qcr17pespL5lIlQZql956XOqzk", "0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh"]
divolte.js:307 Module initialized. Object {partyId: "0:i6i3g0jy:nxGMDVdU9~f1wF3RGqwmCKKICn4d1Sb9", sessionId: "0:i6qx4rmi:IXc1i6Qcr17pespL5lIlQZql956XOqzk", pageViewId: "0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh", isNewPartyId: false, isFirstInSession: false…}
divolte.js:21 Signalling event: pageView 0:6ZIHf9BHzVt_vVNj76KFjKmknXJixquh0
allclasses-frame.html:9 GET http://ec2-x-x-x-x.us-west-x.compute.amazonaws.com:8000/resources/fonts/dejavu.css
overview-summary.html:200 GET http://localhost:8290/divolte.js net::ERR_CONNECTION_REFUSED

(Intro: I work on Divolte Collector)
It seems that you are running the example on an AWS instance somewhere. If you are using the pre-packaged JavaDoc files that come with the examples, they have hard-coded the divolte location as http://localhost:8290/divolte.js. So if you are running somewhere other than localhost, you should probably create your own JavaDoc for the example, using the correct hostname for the Divolte Collector server.
You can do so using the following command. Be sure to run it from the directory where your source tree is rooted, and of course change localhost to the hostname where you are running the collector.
javadoc -d YOUR_OUTPUT_DIRECTORY \
-bottom '<script src="//localhost:8290/divolte.js" defer async></script>' \
-subpackages .
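A quick way to verify that the tag is reachable from wherever the browser runs (the hostname below is a placeholder for your Collector host):
# should return an HTTP 200 response for the Divolte tag
curl -I http://ec2-x-x-x-x.us-west-x.compute.amazonaws.com:8290/divolte.js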
As an alternative, you could also just try to run the examples locally first (possibly in a virtual machine, if you are on a Windows machine).
There doesn't seem to be anything MapR-specific about the issue you are seeing so far. The Kafka-based examples and pipeline should work in any environment that has the required components installed; this doesn't touch MapR-FS or anything else MapR-specific. Writing to the distributed filesystem is another story.
We don't compile Divolte Collector against MapR Hadoop currently, but incidentally I have given it a run on the MapR sandbox VM. When installing from the RPM distribution, create a /etc/divolte/divolte-env.sh with the following env var setting:
HADOOP_CONF_DIR=/usr/share/divolte/lib/guava-18.0.jar:/usr/share/divolte/lib/avro-1.7.7.jar:$(hadoop classpath)
Obviously this is a bit of a hack to get around classpath peculiarities and we hope to provide a distribution compiled against MapR that works out of the box in the future.
Also, you need Java 8 to run Divolte. If you install this from the Oracle RPM, add the proper JAVA_HOME to divolte-env.sh as well, e.g.:
JAVA_HOME=/usr/java/jdk1.8.0_31
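Putting both settings together, /etc/divolte/divolte-env.sh ends up looking roughly like this (paths and JDK version will differ per installation):
# /etc/divolte/divolte-env.sh (sketch; adjust paths to your install)
JAVA_HOME=/usr/java/jdk1.8.0_31
HADOOP_CONF_DIR=/usr/share/divolte/lib/guava-18.0.jar:/usr/share/divolte/lib/avro-1.7.7.jar:$(hadoop classpath)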
With these settings I'm able to run the server and collect Avro files on MapR-FS, create an external Hive table on those files and run a query.
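For reference, the Hive side looks roughly like this (a sketch only: the table name, MapR-FS paths and Avro schema location are placeholders and must match your Divolte sink configuration and schema):
hive -e "
CREATE EXTERNAL TABLE divolte_clicks
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/divolte/published'
TBLPROPERTIES ('avro.schema.url'='maprfs:///divolte/DefaultEventRecord.avsc');

SELECT count(*) FROM divolte_clicks;
"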

Related

Bootstrapping a Staging Clojure App via REPL incl. fetching dependencies

Deploying Clojure/Java apps is hard, so I had an idea yesterday that I want to understand better. If I spin up a machine that has Clojure and boot-clj installed and run boot wait repl -s -H 0.0.0.0 on the machine (let's ignore auth for now), I should be able to connect to it from my dev box and trigger the retrieval of dependencies over the wire (which will then be cached on the machine), then wire over all the source code and eval until I hit a snag, right?
Let's pretend this is a good idea. Is it possible to do this, and what are the hurdles involved? Right now I'm waiting 5 minutes for CircleCI to package up an uberjar, then fail because some Heroku token expired but all I want to do is see my code running on a staging environment so I can wire some more code and re-eval it.
The first thing that comes to mind is nREPL auth, which I see is not mentioned in any of the nREPL libraries. So let's say that's a higher-level networking concern and I'll do ACL via VPC.
Has anyone done this? Why is it a bad idea? Can you show your recipe for bootstrapping a Clojure app on a remote machine without the use of git or SSH (aside from initial REPL start)?
I should be able to connect to it from my dev box and trigger the retrieval of dependencies over the wire (which will then be cached on the machine), then wire over all the source code and eval until I hit a snag, right?
Is it possible to do this, and what are the hurdles involved?
Yes, but you need to specify -b (address server listens on) instead of -H (host to connect client to):
$ boot wait repl -s -b 0.0.0.0 -p 3000
nREPL server started on port 3000 on host 0.0.0.0 - nrepl://0.0.0.0:3000
Then connect to it however you like, for example with lein repl:
$ lein repl :connect 127.0.0.1:3000
Now you can add a dependency in the REPL and it'll be downloaded on the server/host. In the client REPL:
boot.user=> (set-env! :dependencies #(into % '[[clj-time "0.14.0"]]))
And if you're watching the server console you'll see it downloading dependencies:
Retrieving clj-time-0.14.0.pom from https://repo.clojars.org/ (3k)
Retrieving joda-time-2.9.7.pom from https://repo1.maven.org/maven2/ (32k)
Retrieving clj-time-0.14.0.jar from https://repo.clojars.org/ (22k)
Retrieving joda-time-2.9.7.jar from https://repo1.maven.org/maven2/ (618k)
And then back on the client side:
boot.user=> (require '[clj-time.core :refer [now]])
nil
boot.user=> (now)
#object[org.joda.time.DateTime 0x1f68b743 "2018-03-15T12:16:29.342Z"]
Has anyone done this?
Yes, I've seen people host nREPLs from remote servers and connect to them to tinker with a running system.
Why is it a bad idea?
Generally speaking, we want reproducible builds and stable artifacts to give some degree of certainty about what code is being released. Doing this type of development on-the-fly on-the-server works against those goals, making it harder to determine what code is running where. I'd try to structure the system (and its testing) such that this degree of remote dynamism isn't required for normal development.
It sounds like your primary problem is a cumbersome link (CI/CD) in your dev/test/run feedback loop. I'd explore other options for optimizing that feedback loop before going to dynamic dependency-hot-loading nREPL, if you can avoid it. Of course, it's there if you need it!
Can you show your recipe for bootstrapping a Clojure app on a remote machine
Personally, I only ever deploy JARs to remote machines, and usually in a container. By that time I've already exercised/tested the system locally and have some confidence it'll behave as expected. If most of your system is untestable without deploying, that may be a sign you should break it into smaller, more testable pieces.
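For what it's worth, my usual flow is roughly this (a sketch assuming a Leiningen project; hostnames and paths are placeholders):
# build a standalone JAR locally, copy it over, and run it on the target machine
lein uberjar
scp target/myapp-0.1.0-standalone.jar deploy@staging-host:/opt/myapp/app.jar
ssh deploy@staging-host 'java -jar /opt/myapp/app.jar'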

Apache CloudStack: No templates showing when adding instance

I have set up Apache CloudStack on a CentOS 6.8 machine following the quick installation guide. The management server and KVM are set up on the same machine. The management server is running without problems. I was able to add a zone, a pod, a cluster, and primary and secondary storage from the web interface. But when I try to add an instance, it does not show any templates in the second stage, as you can see in the screenshot.
However, I am able to see two templates under the Templates link in the web UI.
But when I select the template and navigate to the Zone tab, I see "Timeout waiting for response from storage host" and the Ready field shows "no".
When I check the management server logs, it seems there is an error when CloudStack tries to mount the secondary storage for use. The segment below from the cloudstack-management.log file shows this error.
2017-03-09 23:26:43,207 DEBUG [c.c.a.t.Request] (AgentManager-Handler-14:null) (logid:) Seq 2-7686800138991304712: Processing: { Ans: , MgmtId: 279278805450918, via: 2, Ver: v1, Flags: 10, [{"com.cloud.agent.api.Answer":
{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: GetRootDir for nfs://172.16.10.2/export/secondary failed due to com.cloud.utils.exception.CloudRuntimeException: Unable to mount 172.16.10.2:/export/secondary at /mnt/SecStorage/6e26529d-c659-3053-8acb-817a77b6cfc6 due to mount.nfs: Connection timed out
    at org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.getRootDir(NfsSecondaryStorageResource.java:2080)
    at org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.execute(NfsSecondaryStorageResource.java:1829)
    at org.apache.cloudstack.storage.resource.NfsSecondaryStorageResource.executeRequest(NfsSecondaryStorageResource.java:265)
    at com.cloud.agent.Agent.processRequest(Agent.java:525)
    at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:833)
    at com.cloud.utils.nio.Task.call(Task.java:83)
    at com.cloud.utils.nio.Task.call(Task.java:29)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
","wait":0}}] }
Can anyone please guide me how to resolve this issue? I have been trying to figure it out for some hours now and don't know how to proceed further.
Edit 1: Please note that my LAN address was 10.103.72.50, which I assume is not a /24 address. I tried to give CentOS a static IP by making the following settings in the ifcfg-eth0 file:
DEVICE=eth0
HWADDR=52:54:00:B9:A6:C0
NM_CONTROLLED=no
ONBOOT=yes
BOOTPROTO=none
IPADDR=172.16.10.2
NETMASK=255.255.255.0
GATEWAY=172.16.10.1
DNS1=8.8.8.8
DNS2=8.8.4.4
But doing this would stop my internet access. As a workaround, I reverted these changes and installed all the packages first. Then I changed the IP to static using the same configuration settings as above and ran the CloudStack management server. Everything worked fine until I bumped into this template issue. Please help me figure out what might have gone wrong.
I know I'm late, but for people trying this out in the future, here goes:
I hope you successfully added a host as described in the Quick Install Guide before you changed your IP to static, as that step auto-configures VLANs for the different traffic types and creates two bridges, generally named 'cloud' or 'cloudbr'. CloudStack uses the Secondary Storage System VM (SSVM) for all storage-related operations in each zone and cluster. What seems to be the problem is that the SSVM is not able to communicate with the management server on port 8250. If that is not it, try manually mounting the NFS server's mount points in the SSVM shell. You can ssh into the SSVM using the command below:
ssh -i /var/cloudstack/management/.ssh/id_rsa -p 3922 root@<Private or link-local IP address of SSVM>
I suggest you run /usr/local/cloud/systemvm/ssvm-check.sh after sshing into the secondary storage system VM (assuming it is running and has its private, public and link-local IP addresses). If that doesn't help you much, take a look at the secondary storage troubleshooting docs for CloudStack.
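Inside the SSVM, a rough sequence of checks looks like this (the NFS server IP and export path are the ones from your question; adjust as needed):
# run CloudStack's built-in health check
/usr/local/cloud/systemvm/ssvm-check.sh
# verify the secondary storage export is visible and mountable from the SSVM
showmount -e 172.16.10.2
mkdir -p /tmp/secstore-test
mount -t nfs 172.16.10.2:/export/secondary /tmp/secstore-test && ls /tmp/secstore-test
umount /tmp/secstore-test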
If anyone runs into similar issues in the future: first check that the SSVM is running and is in the "Up" state in the System VMs section of the Infrastructure tab, and that you can open a console session to it from the browser. If that works, go on to run the ssvm-check.sh script mentioned above, which systematically checks every point of operation the SSVM depends on. Even if a console session cannot be opened, you can still ssh in using the SSVM's link-local IP address (shown in the SSVM's details view) and then execute the script.

If the script says it cannot communicate with the management server on port 8250, check the iptables rules on the management server and make sure traffic is allowed on port 8250. A simple way to check is nc -v <mngmnt-server-ip> 8250; a quick search will show you how to add port 8250 to your iptables rules if it is not open. Next, you mentioned you are using CentOS 6.8, which probably ships an older NFS version, so run exportfs -a on your NFS server to make sure all the NFS shares are properly exported without errors. A rough sequence of these checks is sketched at the end of this answer.

I would also recommend waiting for the built-in "CentOS 5.5 (no GUI) KVM" template to finish downloading and show a Ready status of "Yes" before you start importing your own templates and ISOs to run on VMs. Finally, if ssvm-check.sh reports everything is good and the download still does not start, run service cloud restart inside the SSVM and check that the service actually got a PID with service cloud status; older system VM templates sometimes need a manual service cloud start even after the restart command. Restarting the cloud service in the SSVM triggers the download of all remaining templates and ISOs to resume. Side note: the system VMs use a Debian kernel if you want to do more troubleshooting. Hope this helps.
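The port-8250 and NFS checks from above, as one rough sequence (the management server IP is a placeholder; the iptables rule is just one way to open the port on CentOS 6):
# on the management server: is port 8250 reachable, and if not, open it
nc -v <mngmnt-server-ip> 8250
iptables -I INPUT -p tcp --dport 8250 -j ACCEPT
service iptables save

# on the NFS server: re-export the shares and confirm they are listed
exportfs -a
exportfs -v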

JBOSS 7 Monitoring Tools

Any good suggestions for monitoring JBoss 7 in production? I would also like to configure alerts based on certain conditions. Of course, it has to be open source.
Thanks.
You can use the standard JConsole with the JBoss dependencies added. It lets you monitor your server's state and MBeans, and it's very useful.
To test it on localhost, start your server, run JConsole from your server's bin directory, and select JBoss in the Local Process selection.
To use it with a "remote" server, start your server on REMOTE_HOST, run JConsole from a JBoss bin directory, and connect with the following string (substituting the port you use), entering the username and password:
service:jmx:jmx-remoting://REMOTE_HOST_NAME:9999
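For JBoss AS 7, a plain JConsole needs the JBoss remoting classes on its classpath to understand that URL; the distribution ships a small wrapper script for this (a sketch, assuming a standard AS 7.1 layout):
# launches JConsole with the JBoss client libraries on the classpath;
# then paste the service:jmx:... URL above and a management user's credentials
$JBOSS_HOME/bin/jconsole.sh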
Secondly, for more detailed information on object creation, memory leaks and CPU usage (profiling), there is also:
http://jbossprofiler.jboss.org/
You can also try a free, open-source APM like Scouter.
It shows very useful real-time performance information for every request.
You can also set resource thresholds and write plugins that send alerts to external systems.
https://github.com/scouter-project/scouter
JBoss 7 needs the module option set (append scouter to whatever packages are already listed):
-Djboss.modules.system.pkgs=~~~,scouter
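A sketch of how that typically ends up in the JVM options (e.g. appended to JAVA_OPTS in bin/standalone.conf; the Scouter install paths are placeholders, and keep whatever packages are already listed in jboss.modules.system.pkgs):
# attach the Scouter Java agent and point it at its config file
JAVA_OPTS="$JAVA_OPTS -javaagent:/opt/scouter/agent.java/scouter.agent.jar"
JAVA_OPTS="$JAVA_OPTS -Dscouter.config=/opt/scouter/agent.java/conf/scouter.conf"
# make the scouter packages visible to JBoss Modules
JAVA_OPTS="$JAVA_OPTS -Djboss.modules.system.pkgs=org.jboss.byteman,scouter"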

Proxy setting in gsutil tool

I use the gsutil tool to download archives from Google Storage.
I use the following CMD command:
python c:\gsutil\gsutil cp gs://pubsite_prod_rev_XXXXXXXXXXXXX/YYYYY/*.zip C:\Tmp\gs
Everything works fine, but if I try to run that command from behind the corporate proxy, I receive this error:
Caught socket error, retrying: [Errno 10051] A socket operation was attempted to an unreachable network
I tried several times to set the proxy settings in the .boto file, but to no avail.
Has anyone faced such a problem?
Thanks!
Please see the section "I'm connecting through a proxy server, what do I need to do?" at https://developers.google.com/storage/docs/faq#troubleshooting
Basically, you need to configure the proxy settings in your .boto file, and you need to ensure that your proxy allows traffic to accounts.google.com as well as to *.storage.googleapis.com.
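For reference, the relevant section of the .boto file looks roughly like this (host, port and credentials are placeholders; the user/pass lines are only needed if your proxy requires authentication):
[Boto]
proxy = proxy.example.com
proxy_port = 8080
proxy_user = your_user
proxy_pass = your_password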
A change was just merged into github yesterday that fixes some of the proxy support. Please try it out, or specifically, overwrite this file with your current copy:
https://github.com/GoogleCloudPlatform/gsutil/blob/master/gslib/util.py
I believe I am having the same problem with the proxy settings being ignored under Linux (Ubuntu 12.04.4 LTS) and gsutil 4.2 (downloaded today).
I've been watching tcpdump on the host to confirm that gsutils is attempting to directly route to Google IPs instead of to my proxy server.
It seems that on the first execution of a simple command like "gsutil -d ls" it will use the proxy settings specified in .boto for the first POST, and then switch back to attempting to route directly to Google instead of my proxy server.
Then if I CTRL-C and re-run the exact same command, the proxy setting is no longer used at all. This difference in behaviour baffles me. If I wait long enough, I think it will work for the initial request again, which suggests some form of caching taking place. I'm not 100% sure of this behaviour yet because I haven't been able to predict when it occurs.
I also noticed that it always first tries to connect to 169.254.169.254 on port 80 regardless of proxy settings. A grep shows that it's hardcoded into oauth2_client.py, test_utils.py, layer1.py, and utils.py (under different subdirectories of the gsutil root).
I've tried setting the http_proxy environment variable but it appears that there is code that unsets this.

starting warden after zookeeper of MapR

I am installing MapR and I am stuck at starting warden after starting ZooKeeper on a single node.
# service mapr-warden start
Error: warden can not be started. See /opt/mapr/logs/warden.log for details
There is no detail in that file. Does anybody have a hint? Thanks =)
If you aren't getting anything in warden.log, then it's likely that the warden JVM is never even being started by the mapr-warden init script.
In some MapR versions, the mapr-warden init script will log some details into /opt/mapr/logs/wardeninit.log. You can try checking there.
However, I will also caution that currently the logging done by the init script is sparse and not necessarily user friendly to read. If you can't discern the cause from the contents of the wardeninit.log you can post them here and maybe I can help.
Another thing you can do is edit /etc/init.d/mapr-warden and add "set -x" towards the top of the file, right before the "BASEMAPR=" line, then try starting warden again and you'll get a bunch of shell debugging output on your screen. If you copy and paste that output here that should be enough to tell the root cause of the problem.
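For example (a sketch using GNU sed; you can just as well add the line by hand in an editor):
# insert "set -x" just before the BASEMAPR= line, then start warden to capture the trace
sed -i '/^BASEMAPR=/i set -x' /etc/init.d/mapr-warden
service mapr-warden start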
One more thing to mention, you may be better off using the http://answers.mapr.com forum as that is MapR specific and I think there may be more users there that could help.
Was configure.sh (/opt/mapr/server/configure.sh -C nodeA -Z nodeA) run on the node? Did ZooKeeper come up successfully?
service mapr-zookeeper status
Even when using MapR on a single node, configure.sh is still required. In fact, without configure.sh, warden, ZooKeeper, CLDB and other MapR components will lack their configuration and in many cases will fail to start.
You must run configure.sh after installing the software packages (deb or rpm).
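In other words, on a single node the sequence is roughly (nodeA stands for the node's own hostname):
/opt/mapr/server/configure.sh -C nodeA -Z nodeA
service mapr-zookeeper start
service mapr-zookeeper qstatus   # confirm ZooKeeper is up before starting warden
service mapr-warden start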