I followed this great tutorial, and everything worked great except for one thing.
In step 11, all the emotion scores in the table are 0!
It seems that Tone Analyzer is not connected.
I am sure that I entered the correct credentials (username and password).
After a lot of searching, I found that one month ago IBM changed the Tone Analyzer plan from experimental to Beta.
I don't know what I should change in the code to make this example work with the new Tone Analyzer plan.
I recently updated the tutorial to deal with API changes in Tone Analyzer, which transitioned from experimental to Beta. Are you using the latest version of the tutorial?
There are several possible reasons why you are not getting any tweets, such as wrong Twitter or Tone Analyzer credentials. Please double-check these against the tutorial instructions. To help diagnose errors, I've also added a StreamingListener in the latest tutorial version that should give you more information. You should see messages like the following:
Twitter stream started
Tweets are collected real-time and analyzed
To stop the streaming and start interacting with the data use: StreamingTwitter.stopTwitterStreaming
Receiver Started: TwitterReceiver-0
Batch started with 139 records
Batch completed with 139 records
Batch started with 270 records
Stopping Twitter stream. Please wait this may take a while
Receiver Stopped: TwitterReceiver-0
Reason: : Stopped by driver
Batch completed with 270 records
Twitter stream stopped
You can now create a sqlContext and DataFrame with 38 Tweets created. Sample usage:
val (sqlContext, df) = com.ibm.cds.spark.samples.StreamingTwitter.createTwitterDataFrames(sc)
df.printSchema
sqlContext.sql("select author, text from tweets").show
Finally, if you are using the pre-built jar file I posted on GitHub, make sure that you are using Spark 1.6 and not an earlier version.
One of our AWS bots is not logging detected and missed utterances, whereas all the new bots created in the same account are logging missed utterances in the Monitoring -> Utterances section. I have checked the configuration of all the bots and it is the same everywhere.
In Monitoring -> Monitoring Graphs, I can see the graph showing missed utterances. I don't understand why the utterances (both missed and detected) are not appearing in the Monitoring -> Utterances section. I know we need to wait 24 hours for them to appear, but they have not appeared at all even after 2 days. If you can suggest some reasons for this, I will look into them.
I have made the aliases point to the latest version, so there is no chance of utterances going to the wrong version. Thanks in advance.
Utterance statistics are not generated under the following conditions:
The childDirected field was set to true when the bot was created.
You are using slot obfuscation with one or more slots.
You opted out of participating in improving Amazon Lex.
And as you mentioned, you need to wait about 24 hours for the data to be processed:
https://docs.aws.amazon.com/lex/latest/dg/ex-utterances.html
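The three conditions above can be turned into a quick checklist. This is only a minimal sketch: the dict layout, the `obfuscated` and `optedOutOfImprovement` keys, and the example bot are all hypothetical and are not the shape of any real Amazon Lex API response. Only the `childDirected` name comes from the list above.

```python
def utterance_stats_blockers(bot):
    """Return the reasons (if any) why utterance statistics would not be generated.

    `bot` is a plain dict with hypothetical keys; it is not the shape of any
    real Amazon Lex API response.
    """
    reasons = []
    # Condition 1: the bot was created with childDirected set to true.
    if bot.get("childDirected", False):
        reasons.append("childDirected was set to true when the bot was created")
    # Condition 2: at least one slot uses obfuscation.
    obfuscated = [s["name"] for s in bot.get("slots", []) if s.get("obfuscated", False)]
    if obfuscated:
        reasons.append("slot obfuscation is enabled for: " + ", ".join(obfuscated))
    # Condition 3: the account opted out of improving Amazon Lex.
    if bot.get("optedOutOfImprovement", False):
        reasons.append("the account opted out of improving Amazon Lex")
    return reasons


# Made-up example bot, for illustration only.
example_bot = {
    "childDirected": False,
    "slots": [{"name": "CardNumber", "obfuscated": True}],
    "optedOutOfImprovement": False,
}
print(utterance_stats_blockers(example_bot))
```

An empty list means none of the three documented conditions applies, so the missing utterances would have to be explained by something else.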
I am totally new to Kubernetes and am reading the book Getting Started with Kubernetes by Jonathan Baier. After going through the Google billing process, I was able to set up my project both in GCP and on my system, but then the book says that I need to execute:
kubernetes/cluster/kube-up.sh
The first time, it reached the point from the following picture:
I had to cancel it because it took too much time. The second time it got past that message, but then three error messages appeared:
I saw a similar issue in another post, where someone said that gcloud needed to be downgraded to version 167, but I am not sure whether that also applies to this issue.
Regards
Well, all I had to do was open the link that the error message gave me, enable the Compute Engine API, wait a few minutes, and then execute the kube-up script again.
Hope it can help someone later.
I've built a Kafka Streams application. It's my first one, so I'm moving out of a proof-of-concept mindset into a "how can I productionalize this?" mindset.
The tl;dr version: I'm looking for Kafka Streams deployment recommendations and tips, specifically related to updating your application code.
I've been able to find lots of documentation about how Kafka and the Streams API work, but I couldn't find anything on actually deploying a Streams app.
The initial deployment seems to be fairly easy - there is good documentation for configuring your Kafka cluster, then you must create topics for your application, and then you're pretty much fine to start it up and publish data for it to process.
But what if you want to upgrade your application later, specifically when the update contains a change to the topology? My application does a decent amount of data enrichment and aggregation into windows, so it's likely that the processing will need to be tweaked in the future.
My understanding is that changing the order of processing or inserting additional steps into the topology will cause the internal ids for each processing step to shift, meaning at best new state stores will be created with the previous state being lost, and at worst, processing steps reading from an incorrect state store topic when starting up. This implies that you either have to reset the application, or give the new version a new application id. But there are some problems with that:
If you reset the application or give a new id, processing will start from the beginning of source and intermediate topics. I really don't want to publish the output to the output topics twice.
Currently "in-flight" data would be lost when you stop your application for an upgrade (since that application would never start again to resume processing).
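To make the shifting-id concern concrete, here is a toy model in Python. It is not Kafka Streams code: `assign_names`, the step lists, and the `my-app` application id are made up for illustration. The only parts taken from how Kafka Streams behaves are that auto-generated names end in a positional counter and that a changelog topic is named `<application.id>-<store name>-changelog`.

```python
def assign_names(steps):
    """Give every step a positional name, mimicking counter-based auto-naming."""
    names = {}
    for index, (kind, label) in enumerate(steps):
        names[label] = f"{kind}-{index:010d}"
    return names


def changelog_topic(app_id, store_name):
    """Build the changelog topic name for a state store."""
    return f"{app_id}-{store_name}-changelog"


# Version 1 of a made-up topology.
v1 = [
    ("SOURCE", "read events"),
    ("AGGREGATE-STATE-STORE", "count per window"),
    ("SINK", "write output"),
]

# Version 2 inserts an enrichment step before the aggregation.
v2 = [
    ("SOURCE", "read events"),
    ("MAPVALUES", "enrich"),
    ("AGGREGATE-STATE-STORE", "count per window"),
    ("SINK", "write output"),
]

old_store = assign_names(v1)["count per window"]
new_store = assign_names(v2)["count per window"]

print(changelog_topic("my-app", old_store))
print(changelog_topic("my-app", new_store))
```

The same logical store ends up with a different name, and therefore a different changelog topic, once a step is inserted ahead of it. That is the "previous state being lost" case described above.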
The only way I can think to mitigate this is to:
Stop data from being published to source topics. Let the application process all messages, then shut it off.
Truncate all source and intermediate topics.
Start new version of application with a new app id.
Start publishers.
This is "okay" for now since my application is the only one reading from the source topics, and intermediate topics are not currently used beyond feeding to the next processor in the same application. But, I can see this getting pretty messy.
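For the first step, "let the application process all messages" needs some test for when it is safe to shut down. A minimal sketch of such a check, assuming you can obtain the end offset and the committed offset of each source partition by some means (the dicts and partition names below are invented, and no Kafka client is used here):

```python
def remaining_lag(end_offsets, committed_offsets):
    """Per-partition number of messages that have not been processed yet."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in end_offsets.items()
    }


def is_drained(end_offsets, committed_offsets):
    """True when every partition has been consumed up to its end offset."""
    return all(lag <= 0 for lag in remaining_lag(end_offsets, committed_offsets).values())


# Invented offsets for two partitions of a source topic.
end_offsets = {"source-0": 120, "source-1": 95}
still_busy = {"source-0": 120, "source-1": 90}
caught_up = {"source-0": 120, "source-1": 95}

print(remaining_lag(end_offsets, still_busy))
print(is_drained(end_offsets, still_busy))
print(is_drained(end_offsets, caught_up))
```

The check only makes sense after the publishers have been stopped; otherwise the end offsets keep moving and the application never counts as drained.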
Is there a better way to handle application updates? Or are my steps generally along the lines of what most developers do?
I think you have a full picture of the problem here and your solution seems to be what most people do in this case.
At the latest Kafka Summit, this question was asked after the talk by Gwen Shapira and Matthias J. Sax about Kubernetes deployment. The answer was the same: if your upgrade contains topology modifications, rolling upgrades can't be done.
It looks like there is no KIP about this for now.
While trying to run the SAM topology using HDF 3.0.0 sandbox, I am getting the below exception. I have only 2 components in the canvas.
1) Get input from Kafka Topic
2) Write the contents from the topic to HDFS Sink.
java.lang.InstantiationException: org.apache.storm.kafka.bolt.selector.DefaultTopicSelector
The engine behind the scenes is Storm. The error above occurs when I try to execute the flow. I have been trying to get more information on this specific error message, but I can't find much help on the internet for Hortonworks Streaming Analytics Manager.
Screenshot will make the issue clear. Upon execution of the flow, the exception occurs.
Do you see any errors in streamline.log? Can you paste the stack trace?
It may be due to some missing classes. You might want to delete and recreate the App. If that doesn't help, raise an issue at https://github.com/hortonworks/streamline/issues with the relevant information, and someone will take a look.
Various errors started occurring in Google Cloud SQL. The system says it is temporarily unavailable, but it has been quite a while. It looks like 1 in 10 queries now gives a 500/502 error. Here is an example stack trace: http://pastebin.com/MNk06PT4
This is a follow-up to "Severe delays in cloud SQL responses". It could be the same issue. Same conditions: google cloud engine connected to a Cloud SQL instance, no zone preference. Hope that sheds more light on the issue.
Between 11:00 PST and 11:30 PST there was an issue that interrupted many Cloud SQL instances. The problem should now be resolved.
We apologize for the inconvenience and thank you for your patience and continued support. Please rest assured that system reliability is a top priority for the Google Cloud Platform, and we are making continuous improvements to make our systems better.
To be kept informed of other Google Cloud SQL issues and launches, please join google-cloud-sql-announce@googlegroups.com
https://groups.google.com/forum/#!forum/google-cloud-sql-announce
Seems all Google Cloud SQL is down, any expected recovery time?
https://cloud.google.com/console
Error: Server Error
The server encountered an error and could not complete your request.
Please try again in 30 seconds.
Best regards
Sergio
Also I noticed that you are using the deprecated JDBC driver that has much lower performance than using the MySQL wire protocol natively. See https://developers.google.com/cloud-sql/docs/external for information on connecting using the standard drivers. That will help latency as well as consistency of performance.
I also got this error using CodeIgniter v2.1 and v3 on App Engine.
It happens when using $autoload['libraries'] = array('database');
Then after a few random page refreshes this error pops up.
After changing the following in database.php:
'pconnect' => TRUE,
into
'pconnect' => FALSE,
the error is gone in my application. Now both version 2.1 and version 3 are working for me.
Maybe there is a similar setting in the framework or code you're using.