Wait for EC2 instance to start - scala

I have a custom AMI which runs my service. Using the AWS Java SDK, I create an EC2 instance using RunInstancesRequest from the AMI. Now before I begin to use my service, I must ensure that the newly created instance is up and running. I poll the instance using:
var transitionCompleted = false
while (!transitionCompleted) {
val currentState = instance.getState.getName
if (currentState == desiredState) {
transitionCompleted = true
}
if(!transitionCompleted) {
try {
Thread.sleep(TRANSITION_INTERVAL)
} catch {
case e: InterruptedException => e.printStackTrace()
}
}
}
So when the currentState of the instance turns into desiredState(which is running), I get the status that the instance is ready. However any newly created instance, despite being in running state, is not available for immediate use as it is still initializing.
How do I ensure that I return only when I'm able to access the instance and its services? Are there any specific status checks to make?
PS: I use Scala

You are checking instance state, while what you are actually interested in are the instance status checks. You could use describeInstanceStatus method from the Amazon Java SDK, but instead of implementing your own polling (in a non-idiomatic Scala) it's better to use a ready solution from the SDK: the EC2 waiters.
import com.amazonaws.services.ec2._, model._, waiters._
val ec2client: AmazonEC2 = ...
val request = new DescribeInstanceStatusRequest().withInstanceIds(instanceID)
ec2client.waiters.instanceStatusOk.run(
new WaiterParameters()
.withRequest(request)
// Optionally, you can tune the PollingStrategy:
// .withPollingStrategy(...)
)
)
To customize polling delay and retry strategies of the waiter check the PollingStrategy documentation.

Related

How would I use multiple database clients in an Actix Web Application

I'm on a project where I need to manage connections to both a MongoDB Instance and a PostgreSQL instance.
My current idea is to make a custom type that will contain an Arc<Mutex<pgConnection>> and an Arc<Mutex<MongoConnection>> in a struct that itself is within an Arc which would be passed to the Actix Web app_data initialization function.
e.g.
// this is pseudo-code, kinda
type DbPoolPG = r2d2::Pool<ConnectionManager<PostgreSQL>>;
// wont be an r2d2 pool, MongoDB official drivers handle pooling automatically
type DbPoolMongo = r2d2::Pool<ConnectionManager<MongoDB>>;
struct DatabseConnections {
pg: Arc<Mutex<DbPoolPG>>;
mongo: Arc<Mutex<DbPoolMongo>>;
}
#[actix_web::main]
async fn main() -> io::Result<()> {
// Create connection pools
let PostGresPool = r2d2::Pool::builder()
.build(manager)
.expect("Failed to create pool.");
let MongoPool = mongo.create_connection()
let connections = DatabaseConnections {
pg: Arc::new(Mutex::new(PostGresPool))
mongo: Arc::new(Mutex::new(MongoPool))
}
// Start HTTP server
HttpServer::new(move || {
App::new().app_data(web::Data::new(Arc::new(connections)))
.resource("/{name}", web::get().to(index))
})
.bind(("127.0.0.1", 8080))?
.run()
.await
}
The idea seems a bit too simple to actually work though, Does anyone else have any ideas?
nvm, you can just call app_data twice as long as the types are different. then access them by calling the types

Mongo change-Stream with Spring resumeAt vs startAfter and fault tolerance in case of connection loss

Can't find an answer on stackOverflow, nor in any documentation,
I have the following change stream code(listen to a DB not a specific collection)
Mongo Version is 4.2
#Configuration
public class DatabaseChangeStreamListener {
//Constructor, fields etc...
#PostConstruct
public void initialize() {
MessageListenerContainer container = new DefaultMessageListenerContainer(mongoTemplate, new SimpleAsyncTaskExecutor(), this::onException);
ChangeStreamRequest.ChangeStreamRequestOptions options =
new ChangeStreamRequest.ChangeStreamRequestOptions(mongoTemplate.getDb().getName(), null, buildChangeStreamOptions());
container.register(new ChangeStreamRequest<>(this::onDatabaseChangedEvent, options), Document.class);
container.start();
}
private ChangeStreamOptions buildChangeStreamOptions() {
return ChangeStreamOptions.builder()
.returnFullDocumentOnUpdate()
.filter(newAggregation(match(where(OPERATION_TYPE).in(INSERT.getValue(), UPDATE.getValue(), REPLACE.getValue(), DELETE.getValue()))))
.resumeAt(Instant.now().minusSeconds(1))
.build();
}
//more code
}
I want the stream to start listening from system initiation time only, without taking anything prior in the op-log, will .resumeAt(Instant.now().minusSeconds(1)) work?
do I need to use starAfter method if so how can I found the latest resumeToken in the db?
or is it ready out of the box and I don't need to add any resume/start lines?
second question, I never stop the container(it should always live while app is running), In case of disconnection from the mongoDB and reconnection will the listener in current configuration continue to consume messages? (I am having a hard time simulation DB disconnection)
If it will not resume handling events, what do I need to change in the configuration so that the change stream will continue and will take all the event from the last received resumeToken prior to the disconnection?
I have read this great article on medium change stream in prodcution,
but it uses the cursor directly, and I want to use the spring DefaultMessageListenerContainer, as it is much more elegant.
So I will answer my own(some more dumb, some less :)...) questions:
when no resumeAt timestamp provided the change stream will start from current time, and will not draw any previous events.
resumeAfter event vs timestamp difference can be found here: stackOverflow answer
but keep in mind, that for timestamp it is inclusive of the event, so if you want to start from next event(in java) do:
private BsonTimestamp getNextEventTimestamp(BsonTimestamp timestamp) {
return new BsonTimestamp(timestamp.getValue() + 1);
}
In case of internet disconnection the change stream will not resume,
as such I recommend to take following approach in case of error:
private void onException() {
ScheduledExecutorService executorService = newSingleThreadScheduledExecutor();
executorService.scheduleAtFixedRate(() -> recreateChangeStream(executorService), 0, 1, TimeUnit.SECONDS);
}
private void recreateChangeStream(ScheduledExecutorService executorService) {
try {
mongoTemplate.getDb().runCommand(new BasicDBObject("ping", "1"));
container.stop();
startNewContainer();
executorService.shutdown();
} catch (Exception ignored) {
}
}
First I am creating a runnable scheduled task that always runs(but only 1 at a time newSingleThreadScheduledExecutor()), I am trying to ping the DB, after a successful ping I am stopping the old container and starting a new one, you can also pass the last timestamp you took so that you can get all events you might have missed
timestamp retrieval from event:
BsonTimestamp resumeAtTimestamp = changeStreamDocument.getClusterTime();
then I am shutting down the task.
also make sure the resumeAtTimestamp exist in oplog...

Why am I occasionally getting a InvalidStateStoreException PARTITIONS_REVOKED, not RUNNING when retrieving a store to query it?

I am accessing a state store to query it and have had to wrap the store() statement with a try/catch block to retry it because sometimes I am getting this exception:
org.apache.kafka.streams.errors.InvalidStateStoreException: Cannot get state store customers-store because the stream thread is PARTITIONS_REVOKED, not RUNNING
at org.apache.kafka.streams.state.internals.StreamThreadStateStoreProvider.stores(StreamThreadStateStoreProvider.java:49)
at org.apache.kafka.streams.state.internals.QueryableStoreProvider.getStore(QueryableStoreProvider.java:57)
at org.apache.kafka.streams.KafkaStreams.store(KafkaStreams.java:1053)
at com.codependent.kafkastreams.customer.service.CustomerService.getCustomer(CustomerService.kt:75)
at com.codependent.kafkastreams.customer.service.CustomerServiceKt.main(CustomerService.kt:108)
This is the code used to retrieve the store (the full code is on a github repo):
fun getCustomer(id: String): Customer? {
var keyValueStore: ReadOnlyKeyValueStore<String, Customer>? = null
while(keyValueStore == null) {
try {
keyValueStore = streams.store(CUSTOMERS_STORE, QueryableStoreTypes.keyValueStore<String, Customer>())
} catch (ex: InvalidStateStoreException) {
ex.printStackTrace()
}
}
val customer = keyValueStore.get(id)
return customer
}
And this is the main program:
fun main(args: Array<String>) {
val customerService = CustomerService("main", "localhost:9092")
customerService.initializeStreams()
customerService.createCustomer(Customer("53", "Joey"))
val customer = customerService.getCustomer("53")
println(customer)
customerService.stopStreams()
}
The exception happens randomly running the program several times, after the previous executions finish. Note: I don't do anything to the executing Kafka cluster and use its default config.
At the time you are accessing the store, the Kafka Streams application is going through a rebalance, and state stores aren't accessible at that time. You want to make sure you only query the stores when the application state is RUNNING and not REBALANCING.
What you could do is check the state of the application before attempting to read from the store like this:
if(streams.state() == State.RUNNING) {
keyValueStore = streams.store(...);
val customer = keyValueStore.get(id);
return customer;
}
There is also a KafkaStreams.setStateListener method you can use to register a KafkStreams.StateListener implementation. The StateListener.onChange method is called each time the application changes its state.

Play framework Scala run job in background

Is there any way I can trigger a job from the controller (to not to wait for its completion) and display the message to the user that job will be running in the background?
I have one controller method which takes quite long time to run. So I want to make that run offline and display the message to the user that it will be running in the background.
I tried Action.async as shown below. But the processing of the Future object is still taking more time and getting timed out.
def submit(id: Int) = Action.async(parse.multipartFormData) { implicit request =>
val result = Future {
//process the data
}
result map {
res =>
Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be ruuning in background."))
}
}
You can also return a result without waiting for the result of the future in a "fire and forget" way
def submit(id: Int) = Action(parse.multipartFormData) { implicit request =>
Future {
//process the data
}
Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be running in background."))
}
The docs state:
By giving a Future[Result] instead of a normal Result, we are able to quickly generate the result without blocking. Play will then serve the result as soon as the promise is redeemed.
The web client will be blocked while waiting for the response, but nothing will be blocked on the server, and server resources can be used to serve other clients.
You can configure your client code to use ajax request and display a Waiting for data message for some part of the page without blocking the rest of the web page from loading.
I also tried the "Futures.timeout" option. It seems to work fine. But I'm not sure its correct way to do it or not.
result.withTimeout(20.seconds)(futures).map { res =>
Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be updated in background."))
}.recover {
case e: scala.concurrent.TimeoutException =>
Redirect(routes.testController.list()).flashing(("success", s"Job(s) will be updated in background."))
}

Play framework 2.0 long running tasks

My app has a long running task (anywhere from 5 minutes to 2 hours) that I can start through an admin panel.
I need a good way to know whether the task is currently running or not, so I can display the status on the admin panel and to prevent the task from being started twice.
Currently my code looks like this (simplified):
object TaskMonitor extends Controller {
var isRunning = false
// Displays the task status on the admin panel
def status = Action {
Ok.chunked(running &> Comet(callback = "parent.running"))
}
// Check task status every 100 ms
lazy val running: Enumerator[String] = {
Enumerator.generateM {
Promise.timeout(Some(isRunning.toString), 100 milliseconds)
}
}
// start the task, but only if it's not already running
def startTask = Action {
if (!isRunning) {
isRunning = true
val f = scala.concurrent.Future { Task.start }
f onComplete {
case _ => isRunning = false
}
}
Ok
}
}
Obviously this is has all kinds of issues, mainly the fact that I have unsynchronized mutable state (isRunning variable) in my controller.
What would be the correct way to go about what I want to achieve ?
You're right, you have unsynchronized mutable state. Is it really a problem? I mean this is your admin right? How many concurrent 'startTask' are you gonna send?
The recommended way to handle this in Play! is to use an Akka actor - you don't need any extra dependency, Play! includes Akka already.
Create an actor to hold your isRunning value; in startTask you can send a message to the actor to start the task (the actor will check if the task is not already running). To send the current status of the task you can query the actor for the value of isRunning.
This will give you a protected mutable state. Your choice to decide if it's really worth the trouble.