Why does specifying Xss in Maven Surefire make ScalaTest run twice as fast?

I have one FunSuite of unit tests exercising a highly recursive (non-tail-recursive) Scala function. If I add the line below to the Surefire <configuration> in my pom.xml, the suite runs twice as fast.
<argLine>-Xss1024k</argLine>
It doesn't matter what value I specify, except that if I specify a really low value like -Xss256k I get the expected StackOverflowError. Otherwise, I can set it anywhere from 512k up to 512m and the execution time is the same. But if I delete the line from pom.xml entirely, the execution time doubles.
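For reference, that argLine sits inside the Surefire plugin configuration, roughly like this (the plugin version shown is illustrative, not taken from my actual pom.xml):
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-surefire-plugin</artifactId>
  <version>2.22.2</version>
  <configuration>
    <argLine>-Xss1024k</argLine>
  </configuration>
</plugin>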
Why could that be?

The JVM specification states:
Because the Java Virtual Machine stack is never manipulated directly
except to push and pop frames, frames may be heap allocated. The
memory for a Java Virtual Machine stack does not need to be
contiguous.
This specification permits Java Virtual Machine stacks either to be of
a fixed size or to dynamically expand and contract as required by the
computation. If the Java Virtual Machine stacks are of a fixed size,
the size of each Java Virtual Machine stack may be chosen
independently when that stack is created.
I think the JVM you are using supports dynamically expanding stacks. When the chain of calls grows beyond the default initial stack size, further stack frames have to be allocated from the heap, and that extra allocation slows execution down.
When you specify the stack size with the parameter, the stack may work in fixed-size mode. In that case all of the space is allocated up front and no new allocations are required. The fact that a low stack size does not expand but instead throws StackOverflowError also suggests it is working in fixed-size mode.
Since you didn't mention which JVM implementation you are using, this is just my assumption based on the JVM specification and your use case.
https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-2.html#jvms-2.5.2
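A quick way to see the effect of -Xss for yourself is a throwaway depth counter like the sketch below (my own illustration, not code from the question); run it with different stack sizes, e.g. java -Xss512k StackDepth versus java -Xss8m StackDepth:

public class StackDepth {
    static int depth = 0;

    static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The counter approximates the maximum call depth the stack allowed.
            System.out.println("Max depth reached: " + depth);
        }
    }
}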

Error java.lang.OutOfMemoryError: GC overhead limit exceeded

I am getting this error in a program that creates a large number (hundreds of thousands) of HashMap objects with a few (15-20) text entries each. These strings all have to be collected (without breaking them up into smaller batches) before being submitted to a database.
According to Sun, the error happens "if too much time is being spent in garbage collection: if more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, an OutOfMemoryError will be thrown".
Apparently, one could use the command line to pass arguments to the JVM for
Increasing the heap size, via "-Xmx1024m" (or more), or
Disabling the error check altogether, via "-XX:-UseGCOverheadLimit".
The first approach works fine; the second ends up with another java.lang.OutOfMemoryError, this time about the heap.
So, question: is there any programmatic alternative to this, for the particular use case (i.e., several small HashMap objects)? If I use the HashMap clear() method, for instance, the problem goes away, but so do the data stored in the HashMap! :-)
The issue is also discussed in a related topic in StackOverflow.
You're essentially running out of memory to run the process smoothly. Options that come to mind:
Specify more memory like you mentioned, try something in between like -Xmx512m first
Work with smaller batches of HashMap objects to process at once if possible
If you have a lot of duplicate strings, use String.intern() on them before putting them into the HashMap
Use the HashMap(int initialCapacity, float loadFactor) constructor to tune for your case; the last two points are sketched below
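To make those last two points concrete, here is a minimal sketch (the field names and the ~20-entry estimate are assumptions based on the question, not code from it):

import java.util.HashMap;
import java.util.Map;

class RowBuilder {
    // Pre-size each map so it never rehashes for ~20 entries, and intern
    // values that repeat across many maps so duplicates share one String.
    static Map<String, String> buildRow(String rawStatus, String rawType) {
        // capacity 32 at load factor 0.75 holds 24 entries before resizing
        Map<String, String> row = new HashMap<>(32, 0.75f);
        row.put("status", rawStatus.intern());
        row.put("type", rawType.intern());
        return row;
    }
}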
The following worked for me. Just add the following snippet:
dexOptions {
    javaMaxHeapSize "4g"
}
to your build.gradle:
android {
    compileSdkVersion 23
    buildToolsVersion '23.0.1'

    defaultConfig {
        applicationId "yourpackage"
        minSdkVersion 14
        targetSdkVersion 23
        versionCode 1
        versionName "1.0"
        multiDexEnabled true
    }

    buildTypes {
        release {
            minifyEnabled false
            proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
        }
    }

    packagingOptions {
    }

    dexOptions {
        javaMaxHeapSize "4g"
    }
}
For the record, we had the same problem today. We fixed it by using this option:
-XX:-UseConcMarkSweepGC
Apparently, this modified the strategy used for garbage collection, which made the issue disappear.
#takrl: The default setting for this option is:
java -XX:-UseConcMarkSweepGC
which means this option is not active by default. So when you say you used the option
"+XX:UseConcMarkSweepGC"
I assume you were using this syntax:
java -XX:+UseConcMarkSweepGC
which means you were explicitly activating this option.
For the correct syntax and default settings of Java HotSpot VM options, see this document.
Ummm... you'll either need to:
Completely rethink your algorithm & data-structures, such that it doesn't need all these little HashMaps.
Create a facade which allows you to page those HashMaps in and out of memory as required. A simple LRU cache might be just the ticket (see the sketch at the end of this answer).
Up the memory available to the JVM. If necessary, even purchasing more RAM might be the quickest, CHEAPEST solution, if you manage the machine that hosts this beast. Having said that: I'm generally not a fan of "throw more hardware at it" solutions, especially if an alternative algorithmic solution can be thought up within a reasonable timeframe. If you keep throwing more hardware at every one of these problems, you soon run into the law of diminishing returns.
What are you actually trying to do anyway? I suspect there's a better approach to your actual problem.
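Since I mentioned an LRU cache: here is a minimal sketch built on java.util.LinkedHashMap in access-order mode (the class name and eviction bound are my own illustration):

import java.util.LinkedHashMap;
import java.util.Map;

class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, which gives LRU semantics
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // In a real paging facade, this is where you would write the eldest
        // entry out to disk or the database before letting it be evicted.
        return size() > maxEntries;
    }
}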
Use an alternative HashMap implementation (Trove). The standard Java HashMap has a >12x memory overhead.
One can read the details here.
Don't store the whole structure in memory while waiting to get to the end.
Write intermediate results to a temporary table in the database instead of hashmaps - functionally, a database table is the equivalent of a hashmap, i.e. both support keyed access to data, but the table is not memory-bound, so use an indexed table here rather than the hashmaps.
If done correctly, your algorithm should not even notice the change - "correctly" here means using a class to represent the table, even giving it a put(key, value) and a get(key) method just like a hashmap (a minimal sketch follows below).
When the intermediate table is complete, generate the required sql statement(s) from it instead of from memory.
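A minimal sketch of such a table-backed class, assuming a JDBC Connection and a table created as CREATE TABLE kv_store (k VARCHAR(255) PRIMARY KEY, v VARCHAR(255)) - the table and column names are made up for illustration:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class DbBackedMap {
    private final Connection conn;

    DbBackedMap(Connection conn) { this.conn = conn; }

    // Behaves like HashMap.put: delete-then-insert is used instead of a
    // vendor-specific upsert so the sketch stays portable across databases.
    void put(String key, String value) throws SQLException {
        try (PreparedStatement del = conn.prepareStatement(
                 "DELETE FROM kv_store WHERE k = ?");
             PreparedStatement ins = conn.prepareStatement(
                 "INSERT INTO kv_store (k, v) VALUES (?, ?)")) {
            del.setString(1, key);
            del.executeUpdate();
            ins.setString(1, key);
            ins.setString(2, value);
            ins.executeUpdate();
        }
    }

    // Behaves like HashMap.get, returning null when the key is absent.
    String get(String key) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(
                 "SELECT v FROM kv_store WHERE k = ?")) {
            ps.setString(1, key);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}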
The parallel collector will throw an OutOfMemoryError if too much time is being spent in garbage collection. In particular, if more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, OutOfMemoryError will be thrown. This feature is designed to prevent applications from running for an extended period of time while making little or no progress because the heap is too small. If necessary, this feature can be disabled by adding the option -XX:-UseGCOverheadLimit to the command line.
If you're creating hundreds of thousands of hash maps, you're probably using far more than you actually need; unless you're working with large files or graphics, storing simple data shouldn't blow past the Java memory limit.
You should try to rethink your algorithm. I would offer more help on that subject, but I can't until you provide more context about the problem.
If you have java8, and you can use the G1 Garbage Collector, then run your application with:
-XX:+UseG1GC -XX:+UseStringDeduplication
This tells G1 to find duplicate Strings and keep only one copy of the underlying character data in memory; the other Strings become pointers to that copy.
This is useful when you have a lot of repeated strings. The solution may or may not work, depending on the application.
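For example, on a HotSpot JDK 8u20 or later (the jar name here is a placeholder):
java -XX:+UseG1GC -XX:+UseStringDeduplication -jar app.jar
If you want to verify it is actually helping, adding -XX:+PrintStringDeduplicationStatistics makes the JVM print deduplication statistics at each GC cycle.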
More info on:
https://blog.codecentric.de/en/2014/08/string-deduplication-new-feature-java-8-update-20-2/
http://java-performance.info/java-string-deduplication/
Fix memory leaks in your application with the help of profiling tools like Eclipse MAT or VisualVM.
With JDK 1.7.x or later versions, use G1GC, which spends 10% on garbage collection, unlike the 2% in other GC algorithms. Apart from setting heap memory with -Xms1g -Xmx2g, try:
-XX:+UseG1GC
-XX:G1HeapRegionSize=n
-XX:MaxGCPauseMillis=m
-XX:ParallelGCThreads=n
-XX:ConcGCThreads=n
Have a look at the Oracle article for fine-tuning these parameters.
Some questions related to G1GC on Stack Exchange:
Java 7 (JDK 7) garbage collection and documentation on G1
Java G1 garbage collection in production
Aggressive garbage collector strategy
For this, use the code below in your app's build.gradle file, inside the android closure.
dexOptions {
    javaMaxHeapSize "4g"
}
In case of the error:
"Internal compiler error: java.lang.OutOfMemoryError: GC overhead limit exceeded at java.lang.AbstractStringBuilder"
increase the Java heap space to 2 GB, i.e., -Xmx2g.
You need to increase the memory size in JDeveloper. Go to setDomainEnv.cmd and set:
set WLS_HOME=%WL_HOME%\server
set XMS_SUN_64BIT=256
set XMS_SUN_32BIT=256
set XMX_SUN_64BIT=3072
set XMX_SUN_32BIT=3072
set XMS_JROCKIT_64BIT=256
set XMS_JROCKIT_32BIT=256
set XMX_JROCKIT_64BIT=1024
set XMX_JROCKIT_32BIT=1024
if "%JAVA_VENDOR%"=="Sun" (
set WLS_MEM_ARGS_64BIT=-Xms256m -Xmx512m
set WLS_MEM_ARGS_32BIT=-Xms256m -Xmx512m
) else (
set WLS_MEM_ARGS_64BIT=-Xms512m -Xmx512m
set WLS_MEM_ARGS_32BIT=-Xms512m -Xmx512m
)
and
set MEM_PERM_SIZE_64BIT=-XX:PermSize=256m
set MEM_PERM_SIZE_32BIT=-XX:PermSize=256m
if "%JAVA_USE_64BIT%"=="true" (
set MEM_PERM_SIZE=%MEM_PERM_SIZE_64BIT%
) else (
set MEM_PERM_SIZE=%MEM_PERM_SIZE_32BIT%
)
set MEM_MAX_PERM_SIZE_64BIT=-XX:MaxPermSize=1024m
set MEM_MAX_PERM_SIZE_32BIT=-XX:MaxPermSize=1024m
In my case, increasing the memory using the -Xmx option was the solution.
I had a 10 GB file to read in Java, and each time I got the same error. This happened when the value in the RES column of the top command reached the value set in the -Xmx option. Then, by increasing the memory using the -Xmx option, everything went fine.
There was another point as well. When I set JAVA_OPTS or CATALINA_OPTS in my user account and increased the amount of memory, I got the same error again. Then I printed the values of those environment variables in my code, which gave me different values from what I had set. The reason was that Tomcat was the root for that process and, as I was not a sudoer, I asked the admin to increase the memory in catalina.sh in Tomcat.
This helped me to get rid of the error. The following option disables explicit GC calls (i.e. calls to System.gc()):
-XX:+DisableExplicitGC

How to reduce the size of javacard applet

I wrote an applet which is 19 KB on disk. It has three classes. The first one extends Applet, the second one has static functions, and the third one is a class that I create an instance of in my applet.
I have three questions:
Is there any way to find out how much size is taken by my applet instance in my javacard?
Is there any tool to reduce the size of a javacard applet (.cap file)?
Can you explain rules that help me to reduce my applet size?
Is there any way to find out how much size is taken by my applet instance in my javacard?
(AFAIK) There is no official way to do that (in GlobalPlatform / Java Card).
You can estimate the real memory usage from the difference in free memory before applet loading and after installation (and most likely after personalization -- as you probably will create some objects during the personalization). Some ways to find out free memory information are:
Using JCSystem.getAvailableMemory() (see here), which gives information for all memory types (if implemented); a minimal sketch of this one follows below.
Using Extended Card Resources Information tag retrievable with GET DATA (see TS 102 226) (if implemented).
Using a proprietary command (ask your vendor).
You can have a look inside your .cap file and see the sizes of the parts that are loaded into the card -- this one is surely VERY INACCURATE, as the card OS is free to deal with the content at its own discretion.
I remember JCOP Tools have some special eclipse view which shows various statistics for the current applet -- might be informative as well.
The Reference Implementation offers an option to get some resource consumption statistics -- might be useful as well (I have never used this, though).
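As a sketch of the JCSystem.getAvailableMemory() route (my own illustration, assuming Java Card 2.2+; a real applet would gate this behind a specific CLA/INS rather than answering every APDU):

import javacard.framework.APDU;
import javacard.framework.Applet;
import javacard.framework.JCSystem;
import javacard.framework.Util;

public class MemInfoApplet extends Applet {
    public static void install(byte[] buf, short off, byte len) {
        new MemInfoApplet().register();
    }

    public void process(APDU apdu) {
        if (selectingApplet()) {
            return;
        }
        byte[] buf = apdu.getBuffer();
        // Each value is capped at 32767 because the API returns a short.
        Util.setShort(buf, (short) 0,
                JCSystem.getAvailableMemory(JCSystem.MEMORY_TYPE_PERSISTENT));
        Util.setShort(buf, (short) 2,
                JCSystem.getAvailableMemory(JCSystem.MEMORY_TYPE_TRANSIENT_RESET));
        Util.setShort(buf, (short) 4,
                JCSystem.getAvailableMemory(JCSystem.MEMORY_TYPE_TRANSIENT_DESELECT));
        apdu.setOutgoingAndSend((short) 0, (short) 6);
    }
}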
Is there any tool to reduce the size of a javacard applet (.cap file)?
I used ProGuard in the past to improve applet performance (which in fact increased applet size as I used it mostly for method inlining) -- but it should work to reduce the applet size as well (e.g. eliminate dead code -- see shrinking options). There are many different optimizations as well -- just have a look, but do not expect miracles.
Can you explain rules that help me to reduce my applet size?
I would emphasize good design and proper code re-use, but there are definitely many resources regarding generic optimization techniques -- I don't know any Java Card specific ones -- can't help here :(
If you have more applets loaded into a single card you might place common code into a shared library.
Some additional (random) notes:
It might be more practical to just get a card with a larger memory.
Free memory information given by the card might be inaccurate.
I am surprised you have problems with your applet size, as usually the problems are with transient memory size (AFAIK).
Your applet might be simply leaking memory and thus use more and more memory.
Do not sacrifice security for lesser applet size!
Good luck!
To answer your 3rd question:
3. Can you explain rules that help me to reduce my applet size?
Some basic guidelines are:
Keep the number of methods to a minimum: as you know, we have very limited resources on smart cards, and method calling is an overhead, so with fewer method calls the performance of the card will increase. Avoid get/set methods. Recursive calls should also be avoided, as the stack size in most cards is around 200 bytes.
Avoid using more than 3 parameters for virtual methods and 4 for static methods. This way, the compiler will use a number of bytecode shortcuts, which reduces code size.
To work on temporary data, one can use the APDU buffer or transient arrays, as writing to EEPROM is about 1,000 times slower than writing to RAM.
Inheritance is also an overhead in Java Card, particularly when the hierarchy is complex.
Accessing array elements is also an overhead for the card. So, in situations where an array element is accessed repeatedly, store the element in a local variable and use that.
Instead of doing this:
if (arr[index1] == 1) doThis();
else if (arr[index1] == 2) doThat();
else if (arr[index1] == 3) doOther();
Do this:
byte temp = arr[index1];
if (temp == 1) doThis();
else if (temp == 2) doThat();
else if (temp == 3) doOther();
Replace nested if-else statements with equivalent switch statements as switch executes faster and takes less memory.
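Following that advice, the if-else chain above could also be written as a switch (a sketch; the handler names are placeholders):
switch (arr[index1]) {
    case 1: doThis(); break;
    case 2: doThat(); break;
    case 3: doOther(); break;
}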

Recursive call stack depth

I have a recursive function that works for input where the call stack depth is up to 1000, but fails for bigger inputs. I converted the function to be tail recursive and that allowed it to get to about 1350.
What are the limits and is there any way to increase that limit?
I am working with pure functions and would like to avoid having to use operations. I have a solution that breaks up the problem into a composition of steps, each of which has a smaller stack depth, but it is rather contrived since its only purpose is to avoid the issue and it is more complex.
Overture does not impose a stack limit over the underlying Java stack limit, so it will simply respect the -Xms JVM argument. I think the regular execution stack for the interpreter comes from the Overture.ini file (top level), where you see the -Xmx argument to set the maximum heap. Can you try adding (say) -Xms128m, or a size of your choice, and see whether that gets you further?
This is my mistake again... the setting for the Java stack is -Xss (the -Xms setting is the starting heap size), sorry. So if you use the JVM Arguments section in the Debugger tab of the launcher and set something like -Xss5m, you should get further.
In a simple experiment with a recursive function, the default stack allowed me a depth of 227 calls. Using -Xss5m gave me 4020 calls, and -Xss10m gave me 8050 calls. Note that these stack sizes are somewhat less than the GB sizes you were trying - 5 MB of stack is a lot of calls!
It sounds like you are asking how to increase the Java stack limit in the Overture debugger, not in the Overture IDE (overture.ini).
To pass additional arguments to the Overture debugger you need to add them to the launch configuration:
Open the launch configuration
Select the "Debugger" tab
Then add your arguments to the box shown next to "Arguments:" at the top
(screenshot: Overture launch configuration)
I have tried with both -Xms and -Xmx set up to 2048m, but without any impact. I have also tried Overture 2.3.0 on both Mac OS X and Windows 10, with the same result.
To take my project out of the loop, I created a new project with one very simple function:
countdown(n:nat) res:nat
== if n=0 then n else countdown(n-1)
On both Windows and Mac I can call this with the value 807 and succeed, while with 808 it fails with the error:
internal error
Main 206: Error evaluating code
Detailed Message: internal error

Memory in Eclipse

I'm getting the java.lang.OutOfMemoryError exception in Eclipse. I know that Eclipse
by default uses heap size of 256M. I'm trying to increase it but nothing happens.
For example:
eclipse -vmargs -Xmx16g -XX:PermSize=2g -XX:MaxPermSize=2g
I also tried different settings, using only the -Xmx option, using different cases of g, G, m, M, and different memory sizes, but nothing helps. I also tried specifying the values in the eclipse.ini file. No matter which params I specify, the heap exception is thrown at the same point, so I assume I'm doing something wrong and Eclipse is ignoring the -Xmx parameter. I'm using a 32 GB RAM machine and trying to execute something very simple such as:
double[][] a = new double[15000][15000];
It only works when I reduce the array size to something around 10000 by 10000.
I'm working on Linux and, using the top command, I can see how much memory the Java process is consuming; it's less than 2%.
Thanks!
Okay, I found a solution after reading
Why does heap space run out only when running JUnit tests?
When I specify the -Xmx inside Eclipse by going to Run -> Run Configurations -> Arguments -> VM arguments and set the -Xmx there, everything works fine :) The -vmargs passed on the command line apply to the JVM running the IDE itself, while a program launched from Eclipse runs in a separate JVM configured by its launch configuration.

The stack size used in kernel development

I'm developing an operating system, and rather than programming the kernel, I'm designing the kernel. This operating system is targeted at the x86 architecture, and my target is modern computers. The estimated amount of required RAM is 256 MB or more.
What is a good size to make the stack for each thread run on the system? Should I try to design the system in such a way that the stack can be extended automatically if the maximum length is reached?
I think, if I remember correctly, a page in RAM is 4k or 4096 bytes, and that just doesn't seem like a lot to me. I can definitely see times, especially when using lots of recursion, when I would want to have more than 1000 integers in RAM at once. Now, the real solution would be for the program to manage its own memory resources using malloc, but really I would like to know the user opinion on this.
Is 4k big enough for a stack with modern computer programs? Should the stack be bigger than that? Should the stack auto-expand to accommodate any size? I'm interested in this both from a practical developer's standpoint and a security standpoint.
Is 4k too big for a stack? Considering normal program execution, especially from the point of view of classes in C++, I notice that good source code tends to malloc/new the data it needs when classes are created, to minimize the data being thrown around in a function call.
What I haven't even gotten into is the size of the processor's cache memory. Ideally, I think the stack would reside in the cache to speed things up and I'm not sure if I need to achieve this, or if the processor can handle it for me. I was just planning on using regular boring old RAM for testing purposes. I can't decide. What are the options?
Stack size depends on what your threads are doing. My advice:
make the stack size a parameter at thread creation time (different threads will do different things, and hence will need different stack sizes)
provide a reasonable default for those who don't want to be bothered with specifying a stack size (4K appeals to the control freak in me, as it will cause the stack-profligate to, er, get the signal pretty quickly)
consider how you will detect and deal with stack overflow. Detection can be tricky. You can put guard pages--empty--at the ends of your stack, and that will generally work. But you are relying on the behavior of the Bad Thread not to leap over that moat and start polluting what lies beyond. Generally that won't happen... but then, that's what makes the really tough bugs tough. An airtight mechanism involves hacking your compiler to generate stack-checking code. As for dealing with a stack overflow, you will need a dedicated stack somewhere else on which the offending thread (or its guardian angel, whoever you decide that is--you're the OS designer, after all) will run.
I would strongly recommend marking the ends of your stack with a distinctive pattern, so that when your threads run over the ends (and they always do), you can at least go in post-mortem and see that something did in fact run off its stack. A page of 0xDEADBEEF or something like that is handy.
By the way, x86 page sizes are generally 4k, but they do not have to be. You can go with a 64k size or even larger. The usual reason for larger pages is to avoid TLB misses. Again, I would make it a kernel configuration or run-time parameter.
Search for KERNEL_STACK_SIZE in the Linux kernel source code and you will find that it is very much architecture dependent - PAGE_SIZE, or 2*PAGE_SIZE, etc. (below are just some results - much intermediate output has been deleted):
./arch/cris/include/asm/processor.h:
#define KERNEL_STACK_SIZE PAGE_SIZE
./arch/ia64/include/asm/ptrace.h:
# define KERNEL_STACK_SIZE_ORDER 3
# define KERNEL_STACK_SIZE_ORDER 2
# define KERNEL_STACK_SIZE_ORDER 1
# define KERNEL_STACK_SIZE_ORDER 0
#define IA64_STK_OFFSET ((1 << KERNEL_STACK_SIZE_ORDER)*PAGE_SIZE)
#define KERNEL_STACK_SIZE IA64_STK_OFFSET
./arch/ia64/include/asm/mca.h:
u64 mca_stack[KERNEL_STACK_SIZE/8];
u64 init_stack[KERNEL_STACK_SIZE/8];
./arch/ia64/include/asm/thread_info.h:
#define THREAD_SIZE KERNEL_STACK_SIZE
./arch/ia64/include/asm/mca_asm.h:
#define MCA_PT_REGS_OFFSET ALIGN16(KERNEL_STACK_SIZE-IA64_PT_REGS_SIZE)
./arch/parisc/include/asm/processor.h:
#define KERNEL_STACK_SIZE (4*PAGE_SIZE)
./arch/xtensa/include/asm/ptrace.h:
#define KERNEL_STACK_SIZE (2 * PAGE_SIZE)
./arch/microblaze/include/asm/processor.h:
# define KERNEL_STACK_SIZE 0x2000
I'll throw my two cents in to get the ball rolling:
I'm not sure what a "typical" stack size would be. I would guess maybe 8 KB per thread, and if a thread exceeds this amount, just throw an exception. However, according to this, Windows has a default reserved stack size of 1MB per thread, but it isn't committed all at once (pages are committed as they are needed). Additionally, you can request a different stack size for a given EXE at compile-time with a compiler directive. Not sure what Linux does, but I've seen references to 4 KB stacks (although I think this can be changed when you compile the kernel and I'm not sure what the default stack size is...)
This ties in with the first point. You probably want a fixed limit on how much stack each thread can get. Thus, you probably don't want to automatically allocate more stack space every time a thread exceeds its current stack space, because a buggy program that gets stuck in an infinite recursion is going to eat up all available memory.
If you are using virtual memory, you do want to make the stack growable. Forcing static allocation of stack size, as is common in user-level threading like Qthreads and Windows Fibers, is a mess: hard to use, easy to crash. All modern OSes do grow the stack dynamically, I think usually by having a write-protected guard page or two below the current stack pointer. Writes there tell the OS that the stack has stepped below its allocated space, so you allocate a new guard page below that and make the page that got hit writable. As long as no single function allocates more than a page of data, this works fine. Or you can use two or four guard pages to allow larger stack frames.
If you want a way to control stack size and your goal is a really controlled and efficient environment, but do not care about programming in the same style as Linux etc., go for a single-shot execution model where a task is started each time a relevant event is detected, runs to completion, and then stores any persistent data in its task data structure. In this way, all threads can share a single stack. Used in many slim real-time operating systems for automotive control and similar.
Why not make the stack size a configurable item, either stored with the program or specified when a process creates another process?
There are any number of ways you can make this configurable.
There's a guideline that states "0, 1 or n", meaning you should allow zero, one or any number (limited by other constraints such as memory) of an object - this applies to sizes of objects as well.