We are using guava 16.0.1
On RateLimiter.create(MaxRequestsPerSecond), I cannot burst right away. We would like to allow this as customer rate limiters are only created on their first request and are cached right now(too many customers to hold all of them in).
Ideally, I would just set storedPermits to some number but I can't seem to do that. Also, the warming up only allows 2x or 3x the request rate or something so they can't just do 3 requests at the same time right out of the gate.
Is there a way too allow for a burst right away on creation of the RateLimiter?
thanks,
Dean
Yes, the RateLimiter is package private so you can extend it yourself.
Create a class in the same package in your code to access the underlying storedPermits. See an implementation below and adjust based on which Guava version you are using as the internal implementation changed.
package com.google.common.util.concurrent;
import com.google.common.util.concurrent.RateLimiter.SleepingStopwatch;
public class RateLimiters {
public static RateLimiter createCharged(double permitsPerSecond, double maxBurstSeconds) {
SmoothRateLimiter.SmoothBursty rateLimiter = new SmoothRateLimiter.SmoothBursty(SleepingStopwatch.createFromSystemTimer(), maxBurstSeconds);
rateLimiter.setRate(permitsPerSecond);
rateLimiter.storedPermits = maxBurstSeconds * permitsPerSecond;
return rateLimiter;
}
}
Our work around was to pre-create RateLimiters since there is no way to initialize StoredPermits. We grab one (and add a new one) hoping the new one has enough time to build up permits for the next customer as long as our queue size is large enough for bursts of customers coming in.
Related
Hi Axon Framework community,
I'd like to have your opinion on how to solve the following problem properly.
My Axon Test Setup
Two instances of the same Spring Boot application (using axon-spring-boot-starter 4.4 without Axon Server)
Every instance publishes the same events on a regular interval
Both instances are connected to the same EventSource (single SQL Server instance using JpaEventStorageEngine)
Every instance is configured to use TrackingEventProcessors
Every instances has the same event handlers registered
What I want to achieve
I'd like that events published by one instance are only handled by the very same instance
If instance1 publishes eventX then only instance1 should handle eventX
What I've tried so far
I can achieve the above scenario using SubscribingEventProcessor. Unfortunately this is not an option in my case, since we'd like to have the option to replay events for rebuilding / adding new query models.
I could assign the event handlers of every instance to differed processing groups. Unfortunately this didn't worked. Maybe because every TrackingEventProcessors instance processes the same EventStream ? - not so sure about this though.
I could implement a MessageHandlerInterceptor which only proceeds in case the event origin is from the same instance. This is what I implemented so far and which works properly:
MessageHandlerInterceptor
class StackEventInterceptor(private val stackProperties: StackProperties) : MessageHandlerInterceptor<EventMessage<*>> {
override fun handle(unitOfWork: UnitOfWork<out EventMessage<*>>?, interceptorChain: InterceptorChain?): Any? {
val stackId = (unitOfWork?.message?.payload as SomeEvent).stackId
if(stackId == stackProperties.id){
interceptorChain?.proceed()
}
return null
}
}
#Configuration
class AxonConfiguration {
#Autowired
fun configure(eventProcessingConfigurer: EventProcessingConfigurer, stackProperties: StackProperties) {
val processingGroup = "processing-group-stack-${stackProperties.id}"
eventProcessingConfigurer.byDefaultAssignTo(processingGroup)
eventProcessingConfigurer.registerHandlerInterceptor(processingGroup) { StackEventInterceptor(stackProperties) }
}
}
Is there a better solution ?
I have the impression that my current solution is properly not the best one, since ideally I'd like that only the event handlers which belongs to a certain instance are triggered by the TrackingEventProcessor instances.
How would you solve that ?
Interesting scenario you're having here #thowimmer.
My first hunch would be to say "use the SubscribingEventProcessor instead".
However, you pointed out that that's not an option in your setup.
I'd argue it's very valuable for others who're in the same scenario to know why that's not an option. So, maybe you can elaborate on that (to be honest, I am curious about that too).
Now for your problem case to ensure events are only handled within the same JVM.
Adding the origin to the events is definitely a step you can take, as this allows for a logical way to filter. "Does this event originate from my.origin()?" If not, you'd just ignore the event and be done with it, simple as that. There is another way to achieve this though, to which I'll come to in a bit.
The place to filter is however what you're looking for mostly I think. But first, I'd like to specify why you need to filter in the first place. As you've noticed, the TrackingEventProcessor (TEP) streams events from a so called StreamableMessageSource. The EventStore is an implementation of such a StreamableMessageSource. As you are storing all events in the same store, well, it'll just stream everything to your TEPs. As your events are part of a single Event Stream, you are required to filter them at some stage. Using a MessageHandlerInterceptor would work, you could even go and write a HandlerEnhacnerDefinition allowing you to add additional behaviour to your Event Handling functions. However you put it though, with the current setup, filtering needs to be done somewhere. The MessageHandlerInterceptor is arguably the simplest place to do this at.
However, there is a different way of dealing with this. Why not segregate your Event Store, into two distinct instances for both applications? Apparently they do not have the need to read from one another, so why share the same Event Store at all? Without knowing further background of your domain, I'd guess you are essentially dealing with applications residing in distinct bounded contexts. Very shortly put, there is zero interest to share everything with both applications/contexts, you just share specifics portions of your domain language very consciously with one another.
Note that support for multiple contexts, using a single communication hub in the middle, is exactly what Axon Server can achieve for you. I am not here to say you cant configure this yourself though, I have done this in the past. But leaving that work to somebody or something else, freeing you from the need to configure infrastructure, that would be a massive timesaver.
Hope this helps you set the context a little of my thoughts on the matter #thowimmer.
Sumup:
Using the same EventStore for both instances is probably no an ideal setup in case we want to use the capabilities of the TrackingEventProcessor.
Options to solve it:
Dedicated (not mirrored) DB instance for each application instance.
Using multiple contexts using AxonServer.
If we decide to solve the problem on application level filtering using MessageHandlerInterceptor is the most simplest solution.
Thanks #Steven for exchanging ideas.
EDIT:
Solution on application level using CorrelationDataProvider & MessageHandlerInterceptor by filtering out events not originated in same process.
AxonConfiguration.kt
const val METADATA_KEY_PROCESS_ID = "pid"
const val PROCESSING_GROUP_PREFIX = "processing-group-pid"
#Configuration
class AxonConfiguration {
#Bean
fun processIdCorrelationDataProvider() = ProcessIdCorrelationDataProvider()
#Autowired
fun configureProcessIdEventHandlerInterceptor(eventProcessingConfigurer: EventProcessingConfigurer) {
val processingGroup = "$PROCESSING_GROUP_PREFIX-${ApplicationPid()}"
eventProcessingConfigurer.byDefaultAssignTo(processingGroup)
eventProcessingConfigurer.registerHandlerInterceptor(processingGroup) { ProcessIdEventHandlerInterceptor() }
}
}
class ProcessIdCorrelationDataProvider() : CorrelationDataProvider {
override fun correlationDataFor(message: Message<*>?): MutableMap<String, *> {
return mutableMapOf(METADATA_KEY_PROCESS_ID to ApplicationPid().toString())
}
}
class ProcessIdEventHandlerInterceptor : MessageHandlerInterceptor<EventMessage<*>> {
override fun handle(unitOfWork: UnitOfWork<out EventMessage<*>>?, interceptorChain: InterceptorChain?) {
val currentPid = ApplicationPid().toString()
val originPid = unitOfWork?.message?.metaData?.get(METADATA_KEY_PROCESS_ID)
if(currentPid == originPid){
interceptorChain?.proceed()
}
}
}
See full demo project on GitHub
Suppose I send objects of the following type from GWT client to server through RPC. The objects get stored to a database.
public class MyData2Server implements Serializable
{
private String myDataStr;
public String getMyDataStr() { return myDataStr; }
public void setMyDataStr(String newVal) { myDataStr = newVal; }
}
On the client side, I constrain the field myDataStr to be say 20 character max.
I have been reading on web-application security. If I learned something it is client data should not be trusted. Server should then check the data. So I feel like I ought to check on the server that my field is indeed not longer than 20 characters otherwise I would abort the request since I know it must be an attack attempt (assuming no bug on the client side of course).
So my questions are:
How important is it to actually check on the server side my field is not longer than 20 characters? I mean what are the chances/risks of an attack and how bad could the consequences be? From what I have read, it looks like it could go as far as bringing the server down through overflow and denial of service, but not being a security expert, I could be mis-interpreting.
Assuming I would not be wasting my time doing the field-size check on the server, how should one accomplish it? I seem to recall reading (sorry I no longer have the reference) that a naive check like
if (myData2ServerObject.getMyDataStr().length() > 20) throw new MyException();
is not the right way. Instead one would need to define (or override?) the method readObject(), something like in here. If so, again how should one do it within the context of an RPC call?
Thank you in advance.
How important is it to actually check on the server side my field is not longer than 20 characters?
It's 100% important, except maybe if you can trust the end-user 100% (e. g. some internal apps).
I mean what are the chances
Generally: Increasing. The exact proability can only be answered for your concrete scenario individually (i. e. no one here will be able to tell you, though I would also be interested in general statistics). What I can say is, that tampering is trivially easy. It can be done in the JavaScript code (e. g. using Chrome's built-in dev tools debugger) or by editing the clearly visible HTTP request data.
/risks of an attack and how bad could the consequences be?
The risks can vary. The most direct risk can be evaluated by thinking: "What could you store and do, if you can set any field of any GWT-serializable object to any value?" This is not only about exceeding the size, but maybe tampering with the user ID etc.
From what I have read, it looks like it could go as far as bringing the server down through overflow and denial of service, but not being a security expert, I could be mis-interpreting.
This is yet another level to deal with, and cannot be addressed with server side validation within the GWT RPC method implementation.
Instead one would need to define (or override?) the method readObject(), something like in here.
I don't think that's a good approach. It tries to accomplish two things, but can do neither of them very well. There are two kinds of checks on the server side that must be done:
On a low level, when the bytes come in (before they are converted by RemoteServiceServlet to a Java Object). This needs to be dealt with on every server, not only with GWT, and would need to be answered in a separate question (the answer could simply be a server setting for the maximum request size).
On a logical level, after you have the data in the Java Object. For this, I would recommend a validation/authorization layer. One of the awesome features of GWT is, that you can use JSR 303 validation both on the server and client side now. It doesn't cover every aspect (you would still have to test for user permissions), but it can cover your "#Size(max = 20)" use case.
I am observing really bad performance when using GWT requestfactory. For example, a request that takes my service layer 2 seconds to fullfil is taking GWT 20 seconds to serialize. My service is returning ~100 what will be EntityProxies. Each of these objects has what will become 4 ValueProxies and 2 more EntityProxies (100 root level EntityProxies, 400 ValueProxies and 200 additional EntityProxies). However, I see the same 10x performance degradation on much smaller datasets.
Example of log snippet:
D 2012-10-18 22:42:39.546 ServiceLayerDecorator invoke: Inoking service layer took 2265 ms
D 2012-10-18 22:42:58.957 RequestFactoryServlet doPost: Entire request took 22870 ms
I have added some profiling code to the ServiceLayerDecorator#invoke method and wrapped the entire servlet in a timer. I have profiled the service by itself, and it is indeed returning results in ~2s.
I am using GWT 2.4, but have tested this on GWT 2.5rc1 and GWT 2.5rc2. My backend is on GAE, but I dont think that is playing a role here.
I found this bug filed against 2.4, which seems to be very related. I have manually applied the patch from this google group without any luck.
My domain models look like:
class Trip {
protected Address origin; // becomes ValueProxy
protected Address destination; becomes ValueProxy
protected Set<TripPassenger> tripPassengers; // Set of ValueProxies
}
class TripPassenger {
protected Passenger passenger;
}
class Passenger {
protected Account account;
}
My question is:
Have I profiled the code correctly and isolated the problem to the GWT serialization?
Could I be doing something wrong that would cause this behavior?
How can I better profile the GWT serialization code to try and figure out the cause?
Have I profiled the code correctly and isolated the problem to the GWT serialization?
RequestFactory uses reflection a whole lot (much more than GWT-RPC for instance), so I'm not really surprised that it causes some perf issues in some cases. And GAE could play a role here.
I believe RequestFactory (the AutoBean part actually) could greatly benefit from code generation at build-time.
Could I be doing something wrong that would cause this behavior?
Check your locators' find and/or isLive methods.
How can I better profile the GWT serialization code to try and figure out the cause?
It would also be interesting to know the time spent on deserialization of the request, applying changes, and then serialization of the response. And don't forget to substract from those the time spent in find and isLive.
Consider testing the project you've just implemented. If it's using the system's clock in anyway, testing it would be an issue. The first solution that comes to mind is simulation; manually manipulate system's clock to fool all the components of your software to believe the time is ticking the way you want it to. How do you implement such a solution?
My solution is:
Using a virtual environment (e.g. VMWare Player) and installing a Linux (I leave the distribution to you) and manipulating virtual system's clock to create the illusion of time passing. The only problem is, clock is ticking as your code is running. Me, myself, am looking for a solution that time will actually stop and it won't change unless I tell it to.
Constraints:
You can't confine the list of components used in project, as they might be anything. For instance I used MySQL date/time functions and I want to fool them without amending MySQL's code in anyway (it's too costy since you might end up compiling every single component of your project).
Write a small program that changes the system clock when you want it, and how much you want it. For example, each second, change the clock an extra 59 seconds.
The small program should
Either keep track of what it did, so it can undo it
Use the Network Time Protocol to get the clock back to its old value (reference before, remember difference, ask afterwards, apply difference).
From your additional explanation in the comments (maybe you cold add them to your question?), my thoughts are:
You may already have solved 1 & 2, but they relate to the problem, if not the question.
1) This is a web application, so you only need to concern yourself with your server's clock. Don't trust any clock that is controlled by the client.
2) You only seem to need elapsed time as opposed to absolute time. Therefore why not keep track of the time at which the server request starts and ends, then add the elapsed server time back on to the remaining 'time-bank' (or whatever the constraint is)?
3) As far as testing goes, you don't need to concern yourself with any actual 'clock' at all. As Gilbert Le Blanc suggests, write a wrapper around your system calls that you can then use to return dummy test data. So if you had a method getTime() which returned the current system time, you could wrap it in another method or overload it with a parameter that returns an arbitrary offset.
Encapsulate your system calls in their own methods, and you can replace the system calls with simulation calls for testing.
Edited to show an example.
I write Java games. Here's a simple Java Font class that puts the font for the game in one place, in case I decide to change the font later.
package xxx.xxx.minesweeper.view;
import java.awt.Font;
public class MinesweeperFont {
protected static final String FONT_NAME = "Comic Sans MS";
public static Font getBoldFont(int pointSize) {
return new Font(FONT_NAME, Font.BOLD, pointSize);
}
}
Again, using Java, here's a simple method of encapsulating a System call.
public static void printConsole(String text) {
System.out.println(text);
}
Replace every instance of System.out.println in your code with printConsole, and your system call exists in only one place.
By overriding or modifying the encapsulated methods, you can test them.
Another solution would be to debug and manipulate values returned by time functions to set them to anything you want
I've set up mvc-mini-profiler against my Entity Framework-powered MVC 3 site. Everything is duly configured; Starting profiling in Application_Start, ending it in Application_End and so on. The profiling part works just fine.
However, when I try to swap my data model object generation to providing profilable versions, performance slows to a grind. Not every SQL query, but some queries take about 5x the entire page load. (The very first page load after firing up IIS Express takes a bit longer, but this is sustained.)
Negligible time (~2ms tops) is spent querying, executing and "data reading" the SQL, while this line:
var person = dataContext.People.FirstOrDefault(p => p.PersonID == id);
...when wrapped in using(profiler.Step()) is recorded as taking 300-400 ms. I profiled with dotTrace, which confirmed that the time is actually spent in EF as usual (the profilable components do make very brief appearances), only it is taking much longer.
This leads me to believe that the connection or some of its constituent parts are missing sufficient data, making EF perform far worse.
This is what I'm using to make the context object (my edmx model's class is called DataContext):
var conn = ProfiledDbConnection.Get(
/* returns an SqlConnection */CreateConnection());
return CreateObjectContext<DataContext>(conn);
I originally used the mvc-mini-profiler provided ObjectContextUtils.CreateObjectContext method. I dove into it and noticed that it set a wildcard metadata workspace path string. Since I have the database layer isolated to one project and several MVC sites as other projects using the code, those paths have changed and I'd rather be more specific. Also, I thought this was the cause of the performance issue. I duplicated the CreateObjectContext functionality into my own project to provide this, as such:
public static T CreateObjectContext<T>(DbConnection connection) where T : System.Data.Objects.ObjectContext {
var workspace = new System.Data.Metadata.Edm.MetadataWorkspace(
GetMetadataPathsString().Split('|'),
// ^-- returns
// "res://*/Redacted.csdl|res://*/Redacted.ssdl|res://*/Redacted.msl"
new Assembly[] { typeof(T).Assembly });
// The remainder of the method is copied straight from the original,
// and I carried over a duplicate CtorCache too to make this work.
var factory = DbProviderServices.GetProviderFactory(connection);
var itemCollection = workspace.GetItemCollection(System.Data.Metadata.Edm.DataSpace.SSpace);
itemCollection.GetType().GetField("_providerFactory", // <==== big fat ugly hack
BindingFlags.NonPublic | BindingFlags.Instance).SetValue(itemCollection, factory);
var ec = new System.Data.EntityClient.EntityConnection(workspace, connection);
return CtorCache<T, System.Data.EntityClient.EntityConnection>.Ctor(ec);
}
...but it doesn't seem to make much of a difference. The problem still exists whether I use the above hacked version that's more specific with metadata workspace paths or the mvc-mini-profiler provided version. I just thought I'd mention that I've tried this too.
Having exhausted all this, I'm at my wits' end. Once again: when I just provide my data context as usual, no performance is lost. When I provide a "profilable" data context, performance plummets for certain queries (I don't know what influences this either). What could mvc-mini-profiler do that's wrong? Am I still feeding it the wrong data?
I think this is the same problem as this person ran into.
I just resolved this issue today.
see: http://code.google.com/p/mvc-mini-profiler/issues/detail?id=43
It happened cause some of our fancy hacks were not cached well enough. In particular:
var workspace = new System.Data.Metadata.Edm.MetadataWorkspace(
new string[] { "res://*/" },
new Assembly[] { typeof(T).Assembly });
Is a very expensive call, so we need to cache the workspace.
Profiling, by definition, will effect performance of the application being profiled. The profiler needs to insert it's own method calls throughout the application, intercept low level system calls, and record all that data someplace (meaning writes to disk). All of those tasks take up precious CPU cycles, memory, and disk access.