Guava Bloom Filter does not support large insertions?


I was using BloomFilter in Guava v11.0.1, and I get an exception when the number of expected insertions is large. I tried 10 million insertions with an fpp of 0.001, and it failed.
java.lang.IllegalArgumentException: Number of bits must be positive
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at com.google.common.hash.BloomFilterStrategies.checkPositiveAndMakeMultipleOf64(BloomFilterStrategies.java:72)
    at com.google.common.hash.BloomFilterStrategies.access$000(BloomFilterStrategies.java:18)
    at com.google.common.hash.BloomFilterStrategies$From128ToN.withBits(BloomFilterStrategies.java:37)
    at com.google.common.hash.BloomFilter.create(BloomFilter.java:192)
    at com.ipg.collection.BloomFilterWritable.impl(BloomFilterWritable.java:43)
    at com.ipg.collection.BloomFilterWritable.put(BloomFilterWritable.java:62)
    at com.ipg.prophet.twitter.twitflow.archive.UnzipTweetsProcessAndUpload$ProcessorConsumer.process(UnzipTweetsProcessAndUpload.java:107)
    at com.ipg.prophet.twitter.twitflow.archive.UnzipTweetsProcessAndUpload$ProcessorConsumer.run(UnzipTweetsProcessAndUpload.java:84)
    at java.lang.Thread.run(Thread.java:662)
I think at least it should support that many insertions with such a high fpp, shouldn't it?
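As a rough sanity check on that expectation, here is a minimal sketch of the textbook Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The class and method names are made up for illustration, and this is not necessarily how Guava 11.0.1 computes its sizes internally. For 10 million insertions at an fpp of 0.001 it comes out at roughly 144 million bits, which is far below Integer.MAX_VALUE.

```java
public class BloomSizing {
    // Textbook optimal number of bits: m = -n * ln(p) / (ln 2)^2.
    // Hypothetical helper; not Guava's own code.
    static long optimalNumOfBits(long n, double p) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-n * Math.log(p) / (ln2 * ln2));
    }

    // Textbook optimal number of hash functions: k = (m / n) * ln 2, at least 1.
    static int optimalNumOfHashFunctions(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        long n = 10_000_000L;   // expected insertions from the question
        double p = 0.001;       // desired false positive probability
        long m = optimalNumOfBits(n, p);
        int k = optimalNumOfHashFunctions(n, m);
        System.out.println("bits = " + m);
        System.out.println("hash functions = " + k);
        System.out.println("fits in a signed 32-bit int: " + (m <= Integer.MAX_VALUE));
    }
}
```

So the requested size itself is modest, which is consistent with the failure being a bug in that release rather than an inherent limit.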

Sorry about this, I'm the culprit :)
Hopefully we will be able to push the next version soon. This may not be the time to mention it, but there is an upside to this accident: it means we can definitely kill the current serial form of BF and its related supporting code (which was an accident itself), which I've been trying to fix for a month now. Incidentally, the fix for that also fixes this problem.
Edit: more information here (and in Louis' filed issue)

This should probably be filed as an issue on Guava, not on StackOverflow. (I confirm it, by the way; and I've mostly figured out what's going on.)
UPDATE: I've filed an issue and started a patch.


How to seed rand() in IBM Swift Sandbox?

I am new to StackOverflow so please correct me if there is a better way to post a question which is a specific case of an existing question.
Alberto Barrera answered
How does one seed the random number generator in Swift?
with
let time = UInt32(NSDate().timeIntervalSinceReferenceDate)
srand(time)
print("Random number: \(rand()%10)")
which is perfect generally, but when I try it in The IBM Swift Sandbox it gives the same number sequence every run, at least in the space of a half hour.
import Foundation
import CoreFoundation
let time = UInt32(NSDate().timeIntervalSinceReferenceDate)
srand(time)
print("Random number: \(rand()%10)")
At the moment, every Run prints 5.
Has anyone found a way to do this in the IBM Sandbox? I have found that random() and srandom() produce a different number sequence but similarly are the same each Run. I haven't found arc4random() in Foundation, CoreFoundation, Darwin, or Glibc.
As an aside, I humbly suggest someone with reputation above 1500 creates a tag IBM-Swift-Sandbox.

This was an issue with the way we implemented server-side caching in the Sandbox; non-deterministic code would continually return the same answer even though it should not have. We've disabled it for now, and you should be getting different results with each run. We're currently working on better mechanisms to ensure the scalability of the Sandbox.
I'll see about that tag, as well!

srand is working as expected. If you replace NSDate().timeIntervalSinceReferenceDate in let time = UInt32(NSDate().timeIntervalSinceReferenceDate) with a different number on each run, it will output different random numbers.
Maybe this is a caching issue: the Sandbox doesn't see any change in the code, so it doesn't send it for recompilation :)

I don't know what is going on but today it is totally working. So I guess the question is answered:
srand(UInt32(NSDate().timeIntervalSinceReferenceDate))
works fine.
(I think something must have changed. It was behaving the same way (generating the same number with repeated attempts) on two different computers for about 10 days... Bizarre.)

QGIS committedAttributeValuesChanges signal never emitted

I'm trying to use the QgsVectorLayer committedAttributeValuesChanges signal (on a non-geometry layer), but it is never emitted. The committedFeaturesAdded and committedFeaturesRemoved signals are emitted, but committedAttributeValuesChanges never is. Has anybody faced the same issue, and/or does anybody know how to fix it?
I'm using QGIS 2.8
Thanks

The same issue existed for the committedGeometriesChanges signal, which was discussed in detail on the developer mailing list here. The two signals do almost the same thing, so it is probably the same bug.
TL;DR: upgrade to 2.8.2 or higher.

EOFException - Server/Client readUTF problems

I've recently become interested in trying to adapt my Rock-Paper-Scissors game into a multiplayer-friendly program, so today I decided I'd look up a tutorial on servers. It seems I'm following it precisely (aside from using a different IDE). However, something is going wrong and I'm not sure exactly what it is, and it works fine for the tutorial maker. I've looked up EOFException but it didn't exactly help me out.
The tutorial on Youtube
My screenshot of the issue.
[Documentation on EOFException](I had a link here, but I need at least 10 reputation to post more than two links)
"Signals that an end of file or end of stream has been reached unexpectedly during input.
This exception is mainly used by data input streams to signal end of stream. Note that many other input operations return a special value on end of stream rather than throwing an exception."
[A similar but apparently unresolved question asked here](I had a link here too)
By the way, if you look up exactly what I've posted here, you'll also find that I've asked it at DaniWeb. I'm just posting in multiple places in case it isn't resolved at one or the other. If it is, well... the more knowledge, the better.

EOFException during readUTF() just means it's reached the end of the stream, like it says on the tin. Note that this method doesn't return null at end of stream, unlike readLine() (but like all other readXXX() methods).
It can also mean that your sending and receiving is out of sync, e.g. you are trying to read some ridiculously large number of bytes because you left out a readInt() or similar, or you wrote something extra at the peer that you shouldn't have, so you're trying to read the next bytes in the stream as the result of writeUTF() when it isn't. This is an application protocol error.
How this happened in the code you posted is another question, but your code doesn't close the sockets, which doesn't help. Add an out.close() to your server code, and in.close() to the client code. However I cannot reproduce your problem with or without these closes. Are you sure this is the real code?
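To illustrate the first point, here is a minimal sketch that uses in-memory streams instead of sockets (the class and helper names are made up for illustration, and this is not the asker's code). readUTF() has no "end of stream" return value, so the reader loops until EOFException and treats it as the normal end-of-data signal. It also closes both streams with try-with-resources.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ReadUtfEofDemo {
    // Writes each message with writeUTF(), as the sending peer would.
    static byte[] writeAll(String... messages) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (String message : messages) {
                out.writeUTF(message);
            }
        }
        return buffer.toByteArray();
    }

    // Reads messages until the stream ends; readUTF() signals that with EOFException.
    static List<String> readAll(byte[] data) throws IOException {
        List<String> messages = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            while (true) {
                try {
                    messages.add(in.readUTF());
                } catch (EOFException endOfStream) {
                    break; // the peer has no more data: normal termination
                }
            }
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = writeAll("rock", "paper", "scissors");
        System.out.println(readAll(wire));
    }
}
```

With real sockets the same loop applies: the EOFException arrives once the peer closes its end, so an exception on the very first readUTF() suggests the peer closed or exited before writing anything.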

cdsbootstrap matlab function error

I am very new to MATLAB; in fact, I only got the program in order to run one piece of code. A good friend of mine wrote the code for me, but it doesn't seem to work. I have provided a link to the code and the data files it requires. Can anyone tell me what's wrong, or fix it for me? Many thanks!
The error I am getting is:
Warning: No such number of days in the future exist for a 30/360 day-count basis.
> In daysadd at 141
  In CDS_PD at 35
https://www.dropbox.com/s/jtjk187rpu0fhv5/cds.zip

It appears that it is only a warning and is an inherent limitation of the "30/360 day-count basis" as used by the daycount function. As far as I can tell it just means that the calculated day is not unique in this basis. So it doesn't necessarily mean that the program is functioning incorrectly.
See:
http://www.mathworks.com.au/help/finance/daysadd.html
http://www.mathworks.com.au/help/finance/_f0-6010.html#bsuk32t-1
http://www.google.com.au/search?q=30%2F360+day-count+basis

How to handle MEMCACHED_SERVER_MARKED_DEAD?

I have a cluster of 10 memcached servers, using consistent hashing. When the key passed to memcached_get() maps to an unavailable server, I just get a MEMCACHED_SERVER_MARKED_DEAD response (return value).
I would expect the key to be redistributed to the next available server in this case, so that the next memcached_get() call returns NOTFOUND. However, I still get MEMCACHED_SERVER_MARKED_DEAD, so I'm unable to set a new value.
I discovered I can call memcached_behavior_set(..., MEMCACHED_BEHAVIOR_DISTRIBUTION). This causes hash redistribution and it works as I wish then. However, I do not think it is a good approach. Is it?

Generally you want to enable MEMCACHED_BEHAVIOR_DISTRIBUTION from the start if you are dealing with multiple memcached pools. So yes, that solution will work.
If you have further problems, take a look at MEMCACHED_BEHAVIOR_REMOVE_FAILED_SERVERS, which automatically purges failed servers from the pool after a given number of failures.

I found the answer myself.
https://bugs.launchpad.net/libmemcached/+bug/777672
Applying the patch solved all my problems. Note: I'm surprised that it has been broken since 0.39 and nobody has cared.