Using 'incr' with spymemcached client

Using 'incr' with spymemcached client - memcached

I am trying set up a very basic hit rate monitor in memcached using the spymemcached-2.8.4 client however the value stored in memcached never actually seems to increment... Is this a bug or am I missing something?
public static void checkHitRate(String clientId) throws FatalException, ForbiddenException {
MemcachedClient memcachedClient;
try {
memcachedClient = new MemcachedClient(new InetSocketAddress("localhost", 11211));
Integer hitRate = (Integer) memcachedClient.get(clientId);
if (hitRate == null) {
memcachedClient.set(clientId, 1800, 1);
} else if (hitRate > 500) {
throw new ForbiddenException("Hit rate too high. Try again later");
} else if (hitRate <= 500) {
memcachedClient.incr(clientId, 1);
}
} catch (IOException e) {
throw new FatalException("Could not read hit rate from Memcached.");
}
}
I know its possible in memcached:
(TELENT output)
set clientId_1 0 200 1
1
STORED
incr clientId_1 1
2
get clientId_1
VALUE clientId_1 0 1
2
END

For incr/decr memcached server considers stored values as string representation of integer,
Adding an item like
memcachedClient.set(clientId, 1800, 1);
adds value as an integer that's why memcached is unable to increment it.
Use
memcachedClient.set(clientId, 1800, "1");
instead. It will work as you need.

You can use another incr signature:
System.out.println("Incr result: " + memcachedClient.incr("testIncrKey", 1, 1));
That works for me with spymemcached-2.8.4 client.
Also, looks like there was an issue with incr spymemcached.
You should not use Integer since it will be converted back to String and stored as String on memcached server.
If you open using xmemcached then you can use specific Counter construct, e.g.
Counter counter=client.getCounter("counter",0);
counter.incrementAndGet();
counter.decrementAndGet();
counter.addAndGet(-10);
read more on it in xmemcached documentation:

Related

Graph processing increasingly gets slower on titan + dynamoDB (local) as more vertices/edges are added

I am working with titan 1.0 using AWS dynamoDB local implementation as storage backend on a 16GB machine. My use case involves generating graphs periodically containing vertices & edges in the order of 120K. Every time I generate a new graph in-memory, I check the graph stored in DB and either (i) add vertices/edges that do not exist, or (ii) update properties if they already exist (existence is determined by 'Label' and a 'Value' attribute). Note that the 'Value' property is indexed. Transactions are committed in batches of 500 vertices.
Problem: I find that this process gets slower each time I process a new graph (1st graph finished in 45 mins with empty db initially, 2nd took 2.5 hours, 3rd in 3.5 hours, 4th in 6 hours, 5th in 10 hours and so on). In fact, when processing a given graph, it is fairly quick at start time but progressively gets slower (initial batches take 2-4 secs and later on it increases to 100s of seconds for same batch size of 500 nodes; I also see sometimes it takes 1000-2000 secs for a batch). This is the processing time alone (see approach below); commit takes between 8-10 secs always. I configured the jvm heap size to 10G, and I notice that when the app is running it is eventually using up all of it.
Question: Is this behavior to be expected? It seems to me something is wrong here (either in my config / approach?). Any help or suggestions would be greatly appreciated.
Approach:
Starting from the root node of the in-memory graph, I retrieve all child nodes and maintain a queue
For each child node, I check to see if it exists in DB, else create new node, and update some properties
Vertex dbVertex = dbgraph.traversal().V()
.has(currentVertexInMem.label(), "Value",
(String) currentVertexInMem.value("Value"))
.tryNext()
.orElseGet(() -> createVertex(dbgraph, currentVertexInMem));
if (dbVertex != null) {
// Update Properties
updateVertexProperties(dbgraph, currentVertexInMem, dbVertex);
}
// Add edge if necessary
if (parentDBVertex != null) {
GraphTraversal<Vertex, Edge> edgeIt = graph.traversal().V(parentDBVertex).outE()
.has("EdgeProperty1", eProperty1) // eProperty1 is String input parameter
.has("EdgeProperty2", eProperty2); // eProperty2 is Long input parameter
Boolean doCreateEdge = true;
Edge e = null;
while (edgeIt.hasNext()) {
e = edgeIt.next();
if (e.inVertex().equals(dbVertex)) {
doCreateEdge = false;
break;
}
if (doCreateEdge) {
e = parentDBVertex.addEdge("EdgeLabel", dbVertex, "EdgeProperty1", eProperty1, "EdgeProperty2", eProperty2);
}
e = null;
it = null;
}
...
if ((processedVertexCount.get() % 500 == 0)
|| processedVertexCount.get() == verticesToProcess.get()) {
graph.tx().commit();
}
Create function:
public static Vertex createVertex(Graph graph, Vertex clientVertex) {
Vertex newVertex = null;
switch (clientVertex.label()) {
case "Label 1":
newVertex = graph.addVertex(T.label, clientVertex.label(), "Value",
clientVertex.value("Value"),
"Property1-1", clientVertex.value("Property1-1"),
"Property1-2", clientVertex.value("Property1-2"));
break;
case "Label 2":
newVertex = graph.addVertex(T.label, clientVertex.label(), "Value",
clientVertex.value("Value"), "Property2-1",
clientVertex.value("Property2-1"),
"Property2-2", clientVertex.value("Property2-2"));
break;
default:
newVertex = graph.addVertex(T.label, clientVertex.label(), "Value",
clientVertex.value("Value"));
break;
}
return newVertex;
}
Schema Def: (Showing some of the indexes)
Note:
"EdgeLabel" = Constants.EdgeLabels.Uses
"EdgeProperty1" = Constants.EdgePropertyKeys.EndpointId
"EdgeProperty2" = Constants.EdgePropertyKeys.Timestamp
public void createSchema() {
// Create Schema
TitanManagement mgmt = dbgraph.openManagement();
mgmt.set("cache.db-cache",true);
// Vertex Properties
PropertyKey value = mgmt.getPropertyKey(Constants.VertexPropertyKeys.Value);
if (value == null) {
value = mgmt.makePropertyKey(Constants.VertexPropertyKeys.Value).dataType(String.class).make();
mgmt.buildIndex(Constants.GraphIndexes.ByValue, Vertex.class).addKey(value).buildCompositeIndex(); // INDEX
}
PropertyKey shapeSet = mgmt.getPropertyKey(Constants.VertexPropertyKeys.ShapeSet);
if (shapeSet == null) {
shapeSet = mgmt.makePropertyKey(Constants.VertexPropertyKeys.ShapeSet).dataType(String.class).cardinality(Cardinality.SET).make();
mgmt.buildIndex(Constants.GraphIndexes.ByShape, Vertex.class).addKey(shapeSet).buildCompositeIndex();
}
...
// Edge Labels and Properties
EdgeLabel uses = mgmt.getEdgeLabel(Constants.EdgeLabels.Uses);
if (uses == null) {
uses = mgmt.makeEdgeLabel(Constants.EdgeLabels.Uses).multiplicity(Multiplicity.MULTI).make();
PropertyKey timestampE = mgmt.getPropertyKey(Constants.EdgePropertyKeys.Timestamp);
if (timestampE == null) {
timestampE = mgmt.makePropertyKey(Constants.EdgePropertyKeys.Timestamp).dataType(Long.class).make();
}
PropertyKey endpointIDE = mgmt.getPropertyKey(Constants.EdgePropertyKeys.EndpointId);
if (endpointIDE == null) {
endpointIDE = mgmt.makePropertyKey(Constants.EdgePropertyKeys.EndpointId).dataType(String.class).make();
}
// Indexes
mgmt.buildEdgeIndex(uses, Constants.EdgeIndexes.ByEndpointIDAndTimestamp, Direction.BOTH, endpointIDE,
timestampE);
}
mgmt.commit();
}

The behavior you experience is expected. Today, DynamoDB Local is a testing tool built on SQLite. If you need to support high TPS for large and periodic data loads, I recommend you use the DynamoDB service.

Java_java_net_PlainSocketImpl_socketSetOption

in open-jdk-8 :
this jin function : Java_java_net_PlainSocketImpl_socketSetOption:
/*
* SO_TIMEOUT is a no-op on Solaris/Linux
*/
if (cmd == java_net_SocketOptions_SO_TIMEOUT) {
return;
}
file: openjdk7/jdk/src/solaris/native/java/net/PlainSocketImpl.c
does this mean , on linux setOption of SO_TIMEOUT will be ignored ?
I am can't found the jin for linux. but the solaris's code seems also works for linux .

No, it just means it isn't implemented as a socket option. Some platforms don't support it. On those platforms select() or friends are used.

The source inside solaris folder is also used for Linux.
SO_TIMEOUT is ignored in Java_java_net_PlainSocketImpl_socketSetOption0. But timeout is kept as a field when AbstractPlainSocketImpl.setOption is called:
case SO_TIMEOUT:
if (val == null || (!(val instanceof Integer)))
throw new SocketException("Bad parameter for SO_TIMEOUT");
int tmp = ((Integer) val).intValue();
if (tmp < 0)
throw new IllegalArgumentException("timeout < 0");
// Saved for later use
timeout = tmp;
break;
And timeout is used when doing read in SocketInputStream:
public int read(byte b[], int off, int length) throws IOException {
return read(b, off, length, impl.getTimeout());
}

Unable to send a big list over TCP Sockets - Blackberry

I'm trying to send a list of 600 records over TCP/IP sockets using a java server and a Blackberry client. But every time it reaches the 63th record it stops, the odd thing about this is that if I only send 200 records they are sent ok.
I haven't been able to understand why it happens, only that 63 records equals aprox to 4kB, basically it sends:
an integer with the total number of records to be sent
And for every record
an integer with the length of the string
the string
a string terminator "$$$"
Since i need to send the whole 600 i have tried to close the InputStreamReader and reopen it, also reset it but without any result.
Does anybody else have experienced this behaviour? thanks in advanced.
EDIT
Here the code that receives:
private String readfromserver() throws IOException {
int len=_in.read(); // receives the string length
if (len==0) // if len=0 then the string was empty
return "";
else {
char[] input = new char[len+1];
for (int i = 0; i < len; ++i)
input[i] = (char)_in.read();
StringBuffer s = new StringBuffer();
s.append(input);
return s.toString();
}
}
private void startRec(String data) throws IOException
{
boolean mustcontinue=true;
int len=_in.read(); // read how many records is about to receive
if (len==0) {
scr.writelog("There is no data to receive");
}
else {
for(int i=0; i<len; i++)
if (mustcontinue) {
mustcontinue=mustcontinue && showdata(readfromserver());
}
else {
scr.writelog("Inconsistency error #19");
}
}
}
the function showdata only shows the received string in a LabelField.
The code in the server:
try {
_out.write(smultiple.size()); // send the number of records
_out.flush();
for (int x=0; x<smultiple.size(); x++)
{
int l=smultiple.elementAt(x).length();
_out.write(l); // send string length
if (l>0)
_out.write(smultiple.elementAt(x)); // send string
}
_out.flush();
} catch (Exception e) {
principal.dblog(e.toString());
}
smultiple is a vector containing the strings and everyone already have the terminator $$$.
Thanks.

I think 200 goes fine and 600 -not, because the latter number is bigger than 255 :-) Your code
int len=_in.read();
probably reads a byte and not an integer (4 bytes)

Microsoft Robotics and Sql

I have an issue implementing CCR with SQL. It seems that when I step through my code the updates and inserts I am trying to execute work great. But when I run through my interface without any breakpoints, it seems to be working and it shows the inserts, updates, but at the end of the run, nothing got updated to the database.
I proceeded to add a pause to my code every time I pull anew thread from my pool and it works... but that defeats the purpose of async coding right? I want my interface to be faster, not slow it down...
Any suggestions... here is part of my code:
I use two helper classes to set my ports and get a response back...
/// <summary>
/// Gets the Reader, requires connection to be managed
/// </summary>
public static PortSet<Int32, Exception> GetReader(SqlCommand sqlCommand)
{
Port<Int32> portResponse = null;
Port<Exception> portException = null;
GetReaderResponse(sqlCommand, ref portResponse, ref portException);
return new PortSet<Int32, Exception>(portResponse, portException);
}
// Wrapper for SqlCommand's GetResponse
public static void GetReaderResponse(SqlCommand sqlCom,
ref Port<Int32> portResponse, ref Port<Exception> portException)
{
EnsurePortsExist(ref portResponse, ref portException);
sqlCom.BeginExecuteNonQuery(ApmResultToCcrResultFactory.Create(
portResponse, portException,
delegate(IAsyncResult ar) { return sqlCom.EndExecuteNonQuery(ar); }), null);
}
then I do something like this to queue up my calls...
DispatcherQueue queue = CreateDispatcher();
String[] commands = new String[2];
Int32 result = 0;
commands[0] = "exec someupdateStoredProcedure";
commands[1] = "exec someInsertStoredProcedure '" + Settings.Default.RunDate.ToString() + "'";
for (Int32 i = 0; i < commands.Length; i++)
{
using (SqlConnection connSP = new SqlConnection(Settings.Default.nbfConn + ";MultipleActiveResultSets=true;Async=true"))
using (SqlCommand cmdSP = new SqlCommand())
{
connSP.Open();
cmdSP.Connection = connSP;
cmdSP.CommandTimeout = 150;
cmdSP.CommandText = "set arithabort on; " + commands[i];
Arbiter.Activate(queue, Arbiter.Choice(ApmToCcrAdapters.GetReader(cmdSP),
delegate(Int32 reader) { result = reader; },
delegate(Exception e) { result = 0; throw new Exception(e.Message); }));
}
}
where ApmToCcrAdapters is the class name where my helper methods are...
The problem is when I pause my code right after the call to Arbiter.Activate and I check my database, everything looks fine... if I get rid of the pause ad run my code through, nothing happens to the database, and no exceptions are thrown either...

The problem here is that you are calling Arbiter.Activate in the scope of your two using blocks. Don't forget that the CCR task you create is queued and the current thread continues... right past the scope of the using blocks. You've created a race condition, because the Choice must execute before connSP and cmdSP are disposed and that's only going to happen when you're interfering with the thread timings, as you have observed when debugging.
If instead you were to deal with disposal manually in the handler delegates for the Choice, this problem would no longer occur, however this makes for brittle code where it's easy to overlook disposal.
I'd recommend implementing the CCR iterator pattern and collecting results with a MulitpleItemReceive so that you can keep your using statements. It makes for cleaner code. Off the top of my head it would look something like this:
private IEnumerator<ITask> QueryIterator(
string command,
PortSet<Int32,Exception> resultPort)
{
using (SqlConnection connSP =
new SqlConnection(Settings.Default.nbfConn
+ ";MultipleActiveResultSets=true;Async=true"))
using (SqlCommand cmdSP = new SqlCommand())
{
Int32 result = 0;
connSP.Open();
cmdSP.Connection = connSP;
cmdSP.CommandTimeout = 150;
cmdSP.CommandText = "set arithabort on; " + commands[i];
yield return Arbiter.Choice(ApmToCcrAdapters.GetReader(cmdSP),
delegate(Int32 reader) { resultPort.Post(reader); },
delegate(Exception e) { resultPort.Post(e); });
}
}
and you could use it something like this:
var resultPort=new PortSet<Int32,Exception>();
foreach(var command in commands)
{
Arbiter.Activate(queue,
Arbiter.FromIteratorHandler(()=>QueryIterator(command,resultPort))
);
}
Arbiter.Activate(queue,
Arbiter.MultipleItemReceive(
resultPort,
commands.Count(),
(results,exceptions)=>{
//everything is done and you've got 2
//collections here, results and exceptions
//to process as you want
}
)
);

When should I call SaveChanges() when creating 1000's of Entity Framework objects? (like during an import)

I am running an import that will have 1000's of records on each run. Just looking for some confirmation on my assumptions:
Which of these makes the most sense:
Run SaveChanges() every AddToClassName() call.
Run SaveChanges() every n number of AddToClassName() calls.
Run SaveChanges() after all of the AddToClassName() calls.
The first option is probably slow right? Since it will need to analyze the EF objects in memory, generate SQL, etc.
I assume that the second option is the best of both worlds, since we can wrap a try catch around that SaveChanges() call, and only lose n number of records at a time, if one of them fails. Maybe store each batch in an List<>. If the SaveChanges() call succeeds, get rid of the list. If it fails, log the items.
The last option would probably end up being very slow as well, since every single EF object would have to be in memory until SaveChanges() is called. And if the save failed nothing would be committed, right?

I would test it first to be sure. Performance doesn't have to be that bad.
If you need to enter all rows in one transaction, call it after all of AddToClassName class. If rows can be entered independently, save changes after every row. Database consistence is important.
Second option I don't like. It would be confusing for me (from final user perspective) if I made import to system and it would decline 10 rows out of 1000, just because 1 is bad. You can try to import 10 and if it fails, try one by one and then log.
Test if it takes long time. Don't write 'propably'. You don't know it yet. Only when it is actually a problem, think about other solution (marc_s).
EDIT
I've done some tests (time in miliseconds):
10000 rows:
SaveChanges() after 1 row:18510,534SaveChanges() after 100 rows:4350,3075SaveChanges() after 10000 rows:5233,0635
50000 rows:
SaveChanges() after 1 row:78496,929
SaveChanges() after 500 rows:22302,2835
SaveChanges() after 50000 rows:24022,8765
So it is actually faster to commit after n rows than after all.
My recommendation is to:
SaveChanges() after n rows.
If one commit fails, try it one by one to find faulty row.
Test classes:
TABLE:
CREATE TABLE [dbo].[TestTable](
[ID] [int] IDENTITY(1,1) NOT NULL,
[SomeInt] [int] NOT NULL,
[SomeVarchar] [varchar](100) NOT NULL,
[SomeOtherVarchar] [varchar](50) NOT NULL,
[SomeOtherInt] [int] NULL,
CONSTRAINT [PkTestTable] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Class:
public class TestController : Controller
{
//
// GET: /Test/
private readonly Random _rng = new Random();
private const string _chars = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
private string RandomString(int size)
{
var randomSize = _rng.Next(size);
char[] buffer = new char[randomSize];
for (int i = 0; i < randomSize; i++)
{
buffer[i] = _chars[_rng.Next(_chars.Length)];
}
return new string(buffer);
}
public ActionResult EFPerformance()
{
string result = "";
TruncateTable();
result = result + "SaveChanges() after 1 row:" + EFPerformanceTest(10000, 1).TotalMilliseconds + "<br/>";
TruncateTable();
result = result + "SaveChanges() after 100 rows:" + EFPerformanceTest(10000, 100).TotalMilliseconds + "<br/>";
TruncateTable();
result = result + "SaveChanges() after 10000 rows:" + EFPerformanceTest(10000, 10000).TotalMilliseconds + "<br/>";
TruncateTable();
result = result + "SaveChanges() after 1 row:" + EFPerformanceTest(50000, 1).TotalMilliseconds + "<br/>";
TruncateTable();
result = result + "SaveChanges() after 500 rows:" + EFPerformanceTest(50000, 500).TotalMilliseconds + "<br/>";
TruncateTable();
result = result + "SaveChanges() after 50000 rows:" + EFPerformanceTest(50000, 50000).TotalMilliseconds + "<br/>";
TruncateTable();
return Content(result);
}
private void TruncateTable()
{
using (var context = new CamelTrapEntities())
{
var connection = ((EntityConnection)context.Connection).StoreConnection;
connection.Open();
var command = connection.CreateCommand();
command.CommandText = #"TRUNCATE TABLE TestTable";
command.ExecuteNonQuery();
}
}
private TimeSpan EFPerformanceTest(int noOfRows, int commitAfterRows)
{
var startDate = DateTime.Now;
using (var context = new CamelTrapEntities())
{
for (int i = 1; i <= noOfRows; ++i)
{
var testItem = new TestTable();
testItem.SomeVarchar = RandomString(100);
testItem.SomeOtherVarchar = RandomString(50);
testItem.SomeInt = _rng.Next(10000);
testItem.SomeOtherInt = _rng.Next(200000);
context.AddToTestTable(testItem);
if (i % commitAfterRows == 0) context.SaveChanges();
}
}
var endDate = DateTime.Now;
return endDate.Subtract(startDate);
}
}

I just optimized a very similar problem in my own code and would like to point out an optimization that worked for me.
I found that much of the time in processing SaveChanges, whether processing 100 or 1000 records at once, is CPU bound. So, by processing the contexts with a producer/consumer pattern (implemented with BlockingCollection), I was able to make much better use of CPU cores and got from a total of 4000 changes/second (as reported by the return value of SaveChanges) to over 14,000 changes/second. CPU utilization moved from about 13 % (I have 8 cores) to about 60%. Even using multiple consumer threads, I barely taxed the (very fast) disk IO system and CPU utilization of SQL Server was no higher than 15%.
By offloading the saving to multiple threads, you have the ability to tune both the number of records prior to commit and the number of threads performing the commit operations.
I found that creating 1 producer thread and (# of CPU Cores)-1 consumer threads allowed me to tune the number of records committed per batch such that the count of items in the BlockingCollection fluctuated between 0 and 1 (after a consumer thread took one item). That way, there was just enough work for the consuming threads to work optimally.
This scenario of course requires creating a new context for every batch, which I find to be faster even in a single-threaded scenario for my use case.

If you need to import thousands of records, I'd use something like SqlBulkCopy, and not the Entity Framework for that.
MSDN docs on SqlBulkCopy
Use SqlBulkCopy to Quickly Load Data from your Client to SQL Server
Transferring Data Using SqlBulkCopy

Use a stored procedure.
Create a User-Defined Data Type in Sql Server.
Create and populate an array of this type in your code (very fast).
Pass the array to your stored procedure with one call (very fast).
I believe this would be the easiest and fastest way to do this.

Sorry, I know this thread is old, but I think this could help other people with this problem.
I had the same problem, but there is a possibility to validate the changes before you commit them. My code looks like this and it is working fine. With the chUser.LastUpdated I check if it is a new entry or only a change. Because it is not possible to reload an Entry that is not in the database yet.
// Validate Changes
var invalidChanges = _userDatabase.GetValidationErrors();
foreach (var ch in invalidChanges)
{
// Delete invalid User or Change
var chUser = (db_User) ch.Entry.Entity;
if (chUser.LastUpdated == null)
{
// Invalid, new User
_userDatabase.db_User.Remove(chUser);
Console.WriteLine("!Failed to create User: " + chUser.ContactUniqKey);
}
else
{
// Invalid Change of an Entry
_userDatabase.Entry(chUser).Reload();
Console.WriteLine("!Failed to update User: " + chUser.ContactUniqKey);
}
}
_userDatabase.SaveChanges();

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Using 'incr' with spymemcached client - memcached

Related

Graph processing increasingly gets slower on titan + dynamoDB (local) as more vertices/edges are added

Java_java_net_PlainSocketImpl_socketSetOption

Unable to send a big list over TCP Sockets - Blackberry

Microsoft Robotics and Sql

When should I call SaveChanges() when creating 1000's of Entity Framework objects? (like during an import)

Categories

Resources