Hazelcast as drop-in replacement for Memcached

Currently, we have an application that works with Memcached using the spy-memcached (Java) client, and we want to introduce Hazelcast to it. At this very early stage of the integration, I want to use Hazelcast via the Memcached protocol.
I start a standalone Hazelcast server with ./server.sh and try to connect from the application.
I get the following error on the Hazelcast console:
INFO: [172.17.42.1]:5701 [dev] [3.2.4] Connection [/127.0.0.1:44028] lost. Reason: java.nio.BufferOverflowException[null]
Error from spy-memcached
2014-07-28 15:51:23.986 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to exception on {QA sa=localhost/127.0.0.1:5701, #Rops=1, #Wops=0, #iq=0, topRop=Cmd: 1 Opaque: 1 Key: Report-1281-NULL-NULL Cas: 0 Exp: 0 Flags: 1 Data Length: 2796, topWop=null, toWrite=0, interested=1}
java.io.IOException: Disconnected unexpected, will reconnect.
at net.spy.memcached.MemcachedConnection.handleReads(MemcachedConnection.java:452)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:380)
at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:242)
at net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:836)
2014-07-28 15:51:23.987 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=localhost/127.0.0.1:5701, #Rops=1, #Wops=0, #iq=0, topRop=Cmd: 1 Opaque: 1 Key: MarginRequirementsReport-1281-NULL-NULL Cas: 0 Exp: 0 Flags: 1 Data Length: 2796, topWop=null, toWrite=0, interested=1}, attempt 0.
2014-07-28 15:51:23.987 WARN net.spy.memcached.protocol.binary.BinaryMemcachedNodeImpl: Discarding partially completed op: Cmd: 1 Opaque: 1 Key: Report-1281-NULL-NULL Cas: 0 Exp: 0 Flags: 1 Data Length: 2796
Apparently, the hazelcast.socket.receive.buffer.size and hazelcast.socket.send.buffer.size properties have nothing to do with this.
Please advise.
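For reference, Hazelcast's memcached endpoint (at least in 3.x) implements only the memcached text/ASCII protocol, while the BinaryMemcachedNodeImpl entries in the log above suggest the client was built for the binary protocol; that mismatch is one plausible cause of the immediate disconnect. A minimal sketch (not from the original post; address and key are illustrative) of forcing spy-memcached onto the text protocol:
import java.io.IOException;
import net.spy.memcached.AddrUtil;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.MemcachedClient;

public class HazelcastMemcachedSmokeTest {
    public static void main(String[] args) throws IOException {
        // Protocol.TEXT instead of Protocol.BINARY: Hazelcast's memcached
        // endpoint does not understand the binary protocol.
        MemcachedClient client = new MemcachedClient(
                new ConnectionFactoryBuilder()
                        .setProtocol(ConnectionFactoryBuilder.Protocol.TEXT)
                        .build(),
                AddrUtil.getAddresses("127.0.0.1:5701")); // Hazelcast member address
        client.set("hello", 0, "world"); // exp=0: no expiry
        System.out.println(client.get("hello"));
        client.shutdown();
    }
}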

Related

Data lost in MongoDB replica set mode

My replica set has two nodes:
1: the master node
2: a slave node with priority:0, votes:0
The oplog size is 5000MB.
Run this for loop in the master's shell:
for (i = 0; i < 1000000; i++) {
    db.getSiblingDB("ff").c.insert({
        a: i,
        d: i+".#234"+(++i)+".234546"+(++i)+".568679"+(++i)+"31234."+(++i)+".12342354"+(++i)+"5346457."+(++i)+"33543465456."+(++i)+".6346456"+(++i)+"123235434."+(++i)+".2345345345"+(++i)
    })
}
Kill the slave node while the for loop is running: kill -9 $(pidof slave_node)
Stop the for loop after a second; then restart the slave node.
Then run db.getSiblingDB("ff").c.count() to check data in both slave and master nodes, with the results:
master: 20w (i.e. 200,000)
slave: 15w (i.e. 150,000)
The slave node catches up with the primary, but a lot of data is missing from the slave.
Why is this?
Here is the slave node's log as it restarts after being killed:
2017-11-27T05:53:53.873+0000 I NETWORK [thread1] waiting for connections on port 28006
2017-11-27T05:53:53.876+0000 I REPL [replExecDBWorker-0] New replica set config in use: { _id: "cpconfig2", version: 2, protocolVersion: 1, members: [ { _id: 0, host: "127.0.0.1:28007", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 3.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "127.0.0.1:28006", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 0.0, tags: {}, slaveDelay: 0, votes: 0 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, catchUpTimeoutMillis: 60000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5a1ba5bbb0a652502a5f002a') } }
2017-11-27T05:53:53.876+0000 I REPL [replExecDBWorker-0] This node is 127.0.0.1:28006 in the config
2017-11-27T05:53:53.876+0000 I REPL [replExecDBWorker-0] transition to STARTUP2
2017-11-27T05:53:53.876+0000 I REPL [replExecDBWorker-0] Starting replication storage threads
2017-11-27T05:53:53.877+0000 I REPL [replExecDBWorker-0] Starting replication fetcher thread
2017-11-27T05:53:53.877+0000 I REPL [replExecDBWorker-0] Starting replication applier thread
2017-11-27T05:53:53.877+0000 I REPL [replExecDBWorker-0] Starting replication reporter thread
2017-11-27T05:53:53.877+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Connecting to 127.0.0.1:28007
2017-11-27T05:53:53.877+0000 I REPL [rsSync] transition to RECOVERING
2017-11-27T05:53:53.878+0000 I REPL [rsSync] transition to SECONDARY
2017-11-27T05:53:53.879+0000 I ASIO [NetworkInterfaceASIO-Replication-0] Successfully connected to 127.0.0.1:28007, took 2ms (1 connections now open to 127.0.0.1:28007)
2017-11-27T05:53:53.879+0000 I REPL [ReplicationExecutor] Member 127.0.0.1:28007 is now in state PRIMARY
2017-11-27T05:53:54.011+0000 I FTDC [ftdc] Unclean full-time diagnostic data capture shutdown detected, found interim file, some metrics may have been lost. OK
2017-11-27T05:53:54.645+0000 I NETWORK [thread1] connection accepted from 127.0.0.1:52404 #1 (1 connection now open)
2017-11-27T05:53:54.645+0000 I NETWORK [conn1] received client metadata from 127.0.0.1:52404 conn1: { driver: { name: "NetworkInterfaceASIO-Replication", version: "3.4.9" }, os: { type: "Linux", name: "PRETTY_NAME="Debian GNU/Linux 8 (jessie)"", architecture: "x86_64", version: "Kernel 3.10.0" } }
2017-11-27T05:53:59.878+0000 I REPL [rsBackgroundSync] sync source candidate: 127.0.0.1:28007
See the page Accuracy after Unexpected Shutdown for details and information on how to recover from this situation.
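A related check (a sketch, not from the original thread): after a kill -9, the collection metadata that db.collection.count() reads can be stale, which is exactly what the Accuracy after Unexpected Shutdown page describes. An aggregation forces an actual scan of the documents; for example, with the MongoDB Java driver 3.x, using the slave's port from the log above:
import com.mongodb.MongoClient;
import com.mongodb.ReadPreference;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import static java.util.Collections.singletonList;

public class AccurateCount {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("127.0.0.1", 28006); // the restarted slave
        try {
            MongoCollection<Document> c = client.getDatabase("ff")
                    .getCollection("c")
                    .withReadPreference(ReadPreference.secondaryPreferred());
            // $group with $sum: 1 counts by scanning documents, bypassing the
            // possibly-stale size-storer metadata that count() trusts.
            Document result = c.aggregate(singletonList(
                    new Document("$group",
                            new Document("_id", null).append("n", new Document("$sum", 1)))
            )).first();
            System.out.println(result == null ? 0 : result.getInteger("n"));
        } finally {
            client.close();
        }
    }
}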

MongoDB replication doesn't start

We are trying to move from MongoDB 2.4.9 to 3.4. We have a lot of data, so we planned to set up replication, wait for the data to sync, and then swap the primary.
The configuration is done, but when replication is initiated the new server can't complete the initial sync:
2017-07-07T12:07:22.492+0000 I REPL [replication-1] Starting initial sync (attempt 10 of 10)
2017-07-07T12:07:22.501+0000 I REPL [replication-1] sync source candidate: mongo-2.blabla.com:27017
2017-07-07T12:07:22.501+0000 I STORAGE [replication-1] dropAllDatabasesExceptLocal 1
2017-07-07T12:07:22.501+0000 I REPL [replication-1] ******
2017-07-07T12:07:22.501+0000 I REPL [replication-1] creating replication oplog of size: 6548MB...
2017-07-07T12:07:22.504+0000 I STORAGE [replication-1] WiredTigerRecordStoreThread local.oplog.rs already started
2017-07-07T12:07:22.505+0000 I STORAGE [replication-1] The size storer reports that the oplog contains 0 records totaling to 0 bytes
2017-07-07T12:07:22.505+0000 I STORAGE [replication-1] Scanning the oplog to determine where to place markers for truncation
2017-07-07T12:07:22.519+0000 I REPL [replication-1] ******
2017-07-07T12:07:22.521+0000 I REPL [replication-1] Initial sync attempt finishing up.
2017-07-07T12:07:22.521+0000 I REPL [replication-1] Initial Sync Attempt Statistics: { failedInitialSyncAttempts: 9, maxFailedInitialSyncAttempts: 10, initialSyncStart: new Date(1499429233163), initialSyncAttempts: [ { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" }, { durationMillis: 0, status: "CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find", syncSource: "mongo-2.blabla.com:27017" } ] }
2017-07-07T12:07:22.521+0000 E REPL [replication-1] Initial sync attempt failed -- attempts left: 0 cause: CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find
2017-07-07T12:07:22.521+0000 F REPL [replication-1] The maximum number of retries have been exhausted for initial sync.
2017-07-07T12:07:22.522+0000 E REPL [replication-0] Initial sync failed, shutting down now. Restart the server to attempt a new initial sync.
2017-07-07T12:07:22.522+0000 I - [replication-0] Fatal assertion 40088 CommandNotFound: error while getting last oplog entry for begin timestamp: no such cmd: find at src/mongo/db/repl/replication_coordinator_impl.cpp 632
Please assist; we have more than 100 GB of data, so a dump and restore would mean a lot of downtime.
Configurations:
3.4.5 new machine:
storage:
  dbPath: /mnt/dbpath
  journal:
    enabled: true
  engine: wiredTiger
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log
net:
  port: 27017
replication:
  replSetName: prodTest
2.4.9 old machine with data:
dbpath=/var/lib/mongodb
logpath=/var/log/mongodb/mongodb.log
logappend=true
port = 27017
The task was eventually solved in the following way:
- create a replica set: master v2.4, 3 slaves v2.6
- stop the app, step down the master (see the step-down sketch below)
- stop the new master and upgrade it to v3.0; start it, then upgrade the slaves sequentially to 3.2 (the slave db files were removed, and the new version started on the WiredTiger engine)
- step down the master, upgrade all slaves to 3.4
This process turned out to be quite fast: replica slave recovery of a 40 GB db takes around 30 minutes.
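For the two step-down steps above, the usual tool is rs.stepDown() in the mongo shell; what follows is a sketch of issuing the same command through the Java driver (assuming driver 3.x; the host name is taken from the logs above and may differ in your setup):
import com.mongodb.MongoClient;
import org.bson.Document;

public class StepDownPrimary {
    public static void main(String[] args) {
        MongoClient client = new MongoClient("mongo-2.blabla.com", 27017);
        try {
            // Ask the primary to step down for 60 seconds, the equivalent
            // of rs.stepDown(60) in the shell.
            client.getDatabase("admin").runCommand(new Document("replSetStepDown", 60));
        } catch (Exception e) {
            // The primary closes all its connections when it steps down, so
            // a network error here usually means the command took effect.
            System.out.println("step-down triggered: " + e.getMessage());
        } finally {
            client.close();
        }
    }
}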

Why can't I connect to Gremlin-Server?

Abstract
I'm trying to set up a Titan/Cassandra/Gremlin-Server stack in Docker (v1.13.0). The problem I'm facing is that applications trying to connect to Gremlin-Server on the default port 8182 are reporting errors (details below).
First, here is some relevant version information:
Cassandra v2.2.8
Titan v1.0.0 (Hadoop 1)
Gremlin 3.2.3
Setup
Setup takes place in a Dockerfile in order to be reproducible. It assumes that a Cassandra container already exists, running a cassandra.yaml in which start_rpc has been set to true.
The Dockerfile is as follows:
FROM openjdk:alpine
ENV TITAN 'titan-1.0.0-hadoop1'
RUN apk update && apk add bash unzip && rm -rf /var/cache/apk/* \
&& adduser -S -s /bin/bash -D srg \
&& wget -O /tmp/$TITAN.zip http://s3.thinkaurelius.com/downloads/titan/$TITAN.zip \
&& unzip /tmp/$TITAN.zip -d /opt && ln -s /opt/$TITAN /opt/titan \
&& rm /tmp/*.zip \
&& chown -R srg /opt/$TITAN/ \
&& /opt/titan/bin/gremlin-server.sh -i org.apache.tinkerpop gremlin-python 3.2.3
COPY conf/gremlin-server/* /opt/$TITAN/conf/gremlin-server/
USER srg
WORKDIR /opt/titan
EXPOSE 8182
CMD ["bin/gremlin-server.sh", "conf/gremlin-server/srg.yaml"]
The astute reader will note that I am copying custom configuration files into the container, namely a Gremlin-Server configuration file (srg.yaml) and a titan graph properties file (srg.properties).
srg.yaml
host: localhost
port: 8182
threadPoolWorker: 1
gremlinPool: 8
scriptEvaluationTimeout: 30000
serializedResponseTimeout: 30000
channelizer: org.apache.tinkerpop.gremlin.server.channel.WebSocketChannelizer
graphs: {
  graph: conf/gremlin-server/srg.properties
}
plugins:
  - aurelius.titan
scriptEngines: {
  gremlin-groovy: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI],
    scripts: [scripts/empty-sample.groovy]},
  gremlin-jython: {},
  gremlin-python: {},
  nashorn: {
    imports: [java.lang.Math],
    staticImports: [java.lang.Math.PI]}}
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { serializeResultToString: true }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0, config: { useMapperFromGraph: graph }}
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { useMapperFromGraph: graph }}
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { sessionTimeout: 28800000 }}
metrics: {
  consoleReporter: {enabled: true, interval: 180000},
  csvReporter: {enabled: true, interval: 180000, fileName: /tmp/gremlin-server-metrics.csv},
  jmxReporter: {enabled: true},
  slf4jReporter: {enabled: true, interval: 180000},
  gangliaReporter: {enabled: false, interval: 180000, addressingMode: MULTICAST},
  graphiteReporter: {enabled: false, interval: 180000}}
threadPoolBoss: 1
maxInitialLineLength: 4096
maxHeaderSize: 8192
maxChunkSize: 8192
maxContentLength: 65536
maxAccumulationBufferComponents: 1024
resultIterationBatchSize: 64
writeBufferLowWaterMark: 32768
writeBufferHighWaterMark: 65536
ssl: {
  enabled: false}
srg.properties
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.backend=cassandrathrift
storage.hostname=cassandra # refers to the linked container
cache.db-cache = true
cache.db-cache-clean-wait = 20
cache.db-cache-time = 180000
cache.db-cache-size = 0.25
# Start elasticsearch inside the Titan JVM
index.search.backend=elasticsearch
index.search.directory=db/es
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true
Execution
The container is run with the following command: docker run -ti --rm=true --link test.cassandra:cassandra -p 8182:8182 titan.
Here is the log output for Gremlin-Server:
0 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer -
\,,,/
(o o)
-----oOOo-(3)-oOOo-----
297 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Configuring Gremlin Server from conf/gremlin-server/srg.yaml
439 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics ConsoleReporter configured with report interval=180000ms
448 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics CsvReporter configured with report interval=180000ms to fileName=/tmp/gremlin-server-metrics.csv
557 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics JmxReporter configured with domain= and agentId=
561 [main] INFO org.apache.tinkerpop.gremlin.server.util.MetricManager - Configured Metrics Slf4jReporter configured with interval=180000ms and loggerName=org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics
1750 [main] INFO com.thinkaurelius.titan.core.util.ReflectiveConfigOptionLoader - Loaded and initialized config classes: 12 OK out of 12 attempts in PT0.148S
1972 [main] INFO com.thinkaurelius.titan.diskstorage.cassandra.thrift.CassandraThriftStoreManager - Closed Thrift connection pooler.
1990 [main] INFO com.thinkaurelius.titan.graphdb.configuration.GraphDatabaseConfiguration - Generated unique-instance-id=ac1100031-ad2d5ffa52e81
2026 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Configuring index [search]
2386 [main] INFO org.elasticsearch.node - [Lunatik] version[1.5.1], pid[1], build[5e38401/2015-04-09T13:41:35Z]
2387 [main] INFO org.elasticsearch.node - [Lunatik] initializing ...
2399 [main] INFO org.elasticsearch.plugins - [Lunatik] loaded [], sites []
6471 [main] INFO org.elasticsearch.node - [Lunatik] initialized
6472 [main] INFO org.elasticsearch.node - [Lunatik] starting ...
6477 [main] INFO org.elasticsearch.transport - [Lunatik] bound_address {local[1]}, publish_address {local[1]}
6507 [main] INFO org.elasticsearch.discovery - [Lunatik] elasticsearch/u2StmRW1RsyEHw561yoNFw
6519 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO org.elasticsearch.cluster.service - [Lunatik] master {new [Lunatik][u2StmRW1RsyEHw561yoNFw][ad2d5ffa52e8][local[1]]{local=true}}, removed {[Lunatik][kKyL9UE-R123LLZTTrsVCw][ad2d5ffa52e8][local[1]]{local=true},}, reason: local-disco-initial_connect(master)
6908 [main] INFO org.elasticsearch.http - [Lunatik] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.3:9200]}
6909 [main] INFO org.elasticsearch.node - [Lunatik] started
6923 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO org.elasticsearch.gateway - [Lunatik] recovered [0] indices into cluster_state
7486 [elasticsearch[Lunatik][clusterService#updateTask][T#1]] INFO org.elasticsearch.cluster.metadata - [Lunatik] [titan] creating index, cause [api], templates [], shards [5]/[1], mappings []
8075 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Initiated backend operations thread pool of size 4
8241 [main] INFO com.thinkaurelius.titan.diskstorage.Backend - Configuring total store cache size: 94787290
8641 [main] INFO com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog - Loaded unidentified ReadMarker start time 2017-01-21T16:31:28.750Z into com.thinkaurelius.titan.diskstorage.log.kcvs.KCVSLog$MessagePuller#3520958b
8642 [main] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Graph [graph] was successfully configured via [conf/gremlin-server/srg.properties].
8643 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized Gremlin thread pool. Threads in pool named with pattern gremlin-*
14187 [main] INFO com.jcabi.manifests.Manifests - 108 attributes loaded from 264 stream(s) in 185ms, 108 saved, 3371 ignored: ["Agent-Class", "Ant-Version", "Archiver-Version", "Bnd-LastModified", "Boot-Class-Path", "Build-Date", "Build-Host", "Build-Id", "Build-Java-Version", "Build-Jdk", "Build-Job", "Build-Number", "Build-Time", "Build-Timestamp", "Build-Version", "Built-At", "Built-By", "Built-OS", "Built-On", "Built-Status", "Bundle-ActivationPolicy", "Bundle-Activator", "Bundle-BuddyPolicy", "Bundle-Category", "Bundle-ClassPath", "Bundle-Classpath", "Bundle-Copyright", "Bundle-Description", "Bundle-DocURL", "Bundle-License", "Bundle-Localization", "Bundle-ManifestVersion", "Bundle-Name", "Bundle-NativeCode", "Bundle-RequiredExecutionEnvironment", "Bundle-SymbolicName", "Bundle-Vendor", "Bundle-Version", "Can-Redefine-Classes", "Change", "Class-Path", "Created-By", "DynamicImport-Package", "Eclipse-AutoStart", "Eclipse-BuddyPolicy", "Eclipse-SourceReferences", "Embed-Dependency", "Embedded-Artifacts", "Export-Package", "Extension-Name", "Extension-name", "Fragment-Host", "Git-Commit-Branch", "Git-Commit-Date", "Git-Commit-Hash", "Git-Committer-Email", "Git-Committer-Name", "Gradle-Version", "Gremlin-Lib-Paths", "Gremlin-Plugin-Dependencies", "Gremlin-Plugin-Paths", "Ignore-Package", "Implementation-Build", "Implementation-Build-Date", "Implementation-Title", "Implementation-URL", "Implementation-Vendor", "Implementation-Vendor-Id", "Implementation-Version", "Import-Package", "Include-Resource", "JCabi-Build", "JCabi-Date", "JCabi-Version", "Java-Vendor", "Java-Version", "Main-Class", "Main-class", "Manifest-Version", "Maven-Version", "Module-Email", "Module-Origin", "Module-Owner", "Module-Source", "Originally-Created-By", "Os-Arch", "Os-Name", "Os-Version", "Package", "Premain-Class", "Private-Package", "Require-Bundle", "Require-Capability", "Scm-Connection", "Scm-Revision", "Scm-Url", "Specification-Title", "Specification-Vendor", "Specification-Version", "Tool", "X-Compile-Source-JDK", "X-Compile-Target-JDK", "hash", "implementation-version", "mode", "package", "url", "version"]
14842 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-jython ScriptEngine
15540 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded nashorn ScriptEngine
16076 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-python ScriptEngine
16553 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines - Loaded gremlin-groovy ScriptEngine
17410 [main] INFO org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor - Initialized gremlin-groovy ScriptEngine with scripts/empty-sample.groovy
17410 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - Initialized GremlinExecutor and configured ScriptEngines.
17419 [main] INFO org.apache.tinkerpop.gremlin.server.util.ServerGremlinExecutor - A GraphTraversalSource is now bound to [g] with graphtraversalsource[standardtitangraph[cassandrathrift:[cassandra]], standard]
17565 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17566 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+gryo-stringd with org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0
17808 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/vnd.gremlin-v1.0+json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerGremlinV1d0
17811 [main] INFO org.apache.tinkerpop.gremlin.server.AbstractChannelizer - Configured application/json with org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0
17958 [gremlin-server-boss-1] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Gremlin Server configured with worker thread pool of 1, gremlin pool of 8 and boss thread pool of 1.
17959 [gremlin-server-boss-1] INFO org.apache.tinkerpop.gremlin.server.GremlinServer - Channel started at port 8182.
1/21/17 4:34:20 PM =============================================================
-- Meters ----------------------------------------------------------------------
org.apache.tinkerpop.gremlin.server.GremlinServer.errors
count = 0
mean rate = 0.00 events/second
1-minute rate = 0.00 events/second
5-minute rate = 0.00 events/second
15-minute rate = 0.00 events/second
180564 [metrics-logger-reporter-thread-1] INFO org.apache.tinkerpop.gremlin.server.Settings$Slf4jReporterMetrics - type=METER, name=org.apache.tinkerpop.gremlin.server.GremlinServer.errors, count=0, mean_rate=0.0, m1=0.0, m5=0.0, m15=0.0, rate_unit=events/second
Symptoms
So far, everything appears to be working as intended. The logs indicate that I am able to load srg.properties and bind the data structure to a variable called graph.
The problem appears when I try to connect to the Gremlin-Server instance over the exported port 8182, for example using gremlin-python:
# executed via python 3.6.0 on the host machine, i.e. not inside of Docker
from gremlin_python import statics
from gremlin_python.structure.graph import Graph
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.strategies import *
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
graph = Graph()
g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/gremlin','graph'))
produces the following exception ...
---------------------------------------------------------------------------
HTTPError Traceback (most recent call last)
<ipython-input-10-59ad504f29b4> in <module>()
----> 1 g = graph.traversal().withRemote(DriverRemoteConnection('ws://localhost:8182/','g'))
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/gremlin_python/driver/driver_remote_connection.py in __init__(self, url, traversal_source, username, password, loop, graphson_reader, graphson_writer)
41 self._password = password
42 if loop is None: self._loop = ioloop.IOLoop.current()
---> 43 self._websocket = self._loop.run_sync(lambda: websocket.websocket_connect(self.url))
44 self._graphson_reader = graphson_reader or GraphSONReader()
45 self._graphson_writer = graphson_writer or GraphSONWriter()
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/ioloop.py in run_sync(self, func, timeout)
455 if not future_cell[0].done():
456 raise TimeoutError('Operation timed out after %s seconds' % timeout)
--> 457 return future_cell[0].result()
458
459 def time(self):
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/concurrent.py in result(self, timeout)
235 return self._result
236 if self._exc_info is not None:
--> 237 raise_exc_info(self._exc_info)
238 self._check_done()
239 return self._result
/Users/lthibault/.pyenv/versions/3.6.0/lib/python3.6/site-packages/tornado/util.py in raise_exc_info(exc_info)
HTTPError: HTTP 599: Stream closed
Suspecting a problem specific to this library, I tried two things:
1) attempt to connect to the websocket port with nc:
$ nc -z -v localhost 8182
found 0 associations
found 1 connections:
1: flags=82<CONNECTED,PREFERRED>
outif lo0
src ::1 port 58627
dst ::1 port 8182
rank info not available
TCP aux info available
Connection to localhost port 8182 [tcp/*] succeeded!
2) attempt to connect to Gremlin-Server using a different client library, namely go-gremlin
Test case:
package main

import (
    "fmt"
    "log"

    "github.com/go-gremlin/gremlin"
)

func main() {
    if err := gremlin.NewCluster("ws://localhost:8182/gremlin"); err != nil {
        log.Fatal(err)
    }
    data, err := gremlin.Query(`graph.V()`).Exec()
    if err != nil {
        log.Fatalf("Query error: %s", err)
    }
    fmt.Println(string(data))
}
Output:
$ go run cmd/test/main.go
2017/01/21 14:47:42 Query error: unexpected EOF
exit status 1
Current Conclusions & Questions
From the previous tests, I conclude that this is an application-level problem (i.e. a problem at the websocket/ws protocol level, not in the host or container networking stack). Indeed, nc reports that the socket connection succeeds, but both the Python and Go client libraries complain of an inappropriate (empty) response from the server.
I have tried removing the /gremlin path from the websocket URL in both gremlin-python and go-gremlin, to no avail.
My question is: where do I go from here? Any suggestions or diagnostic paths would be most appreciated!
The main problem is that the host in your Gremlin Server configuration is set to the default, which is localhost. This only allows connections from the server itself. You need to change the value to an external IP of the server or 0.0.0.0.
The other issue is that the gremlin-python server plugin was made available with Apache TinkerPop 3.2.2. Titan 1.0.0 uses TinkerPop 3.0.1. I doubt that the gremlin-python 3.2.3 plugin will work with Titan 1.0.0.
Update: Consider using JanusGraph 0.1.1 which uses TinkerPop 3.2.3. JanusGraph was forked from Titan, so the code is basically the same with updated dependencies.
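Once the host is fixed, a quick way to verify the websocket endpoint independently of gremlin-python is the TinkerPop Java driver; a sketch (assuming a driver version that matches the server's TinkerPop version, i.e. 3.0.x for Titan 1.0.0):
import org.apache.tinkerpop.gremlin.driver.Client;
import org.apache.tinkerpop.gremlin.driver.Cluster;

public class GremlinSmokeTest {
    public static void main(String[] args) throws Exception {
        // Connect over the websocket channelizer configured in srg.yaml.
        Cluster cluster = Cluster.build("localhost").port(8182).create();
        Client client = cluster.connect();
        try {
            // Submit a trivial script to the gremlin-groovy engine; g is
            // bound by the server (see the "bound to [g]" log line above).
            long count = client.submit("g.V().count()").one().getLong();
            System.out.println("vertex count: " + count);
        } finally {
            client.close();
            cluster.close();
        }
    }
}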

DataNucleus JPA named query returns the deleted entity

I am using DataNucleus to perform CRUD. I delete an entity and then run a named query; why is the already-deleted entity still in the result list?
First, I delete the entity:
MyEntity e = manager.find(MyEntity.class, id);
manager.remove(e);
Then, query:
@NamedQueries({
    @NamedQuery(name = MyEntity.FIND_ALL, query = "SELECT a FROM MyEntity a ORDER BY a.updated DESC")
})
public static final String FIND_ALL = "MyEntity.findAll";
TypedQuery<MyEntity> query = manager.createNamedQuery(FIND_ALL, MyEntity.class);
return query.getResultList();
datanucleus.Optimistic is configured in persistence.xml:
<property name="datanucleus.Optimistic" value="true" />
The named query unexpectedly returns a result list that still contains the deleted entity.
However, if datanucleus.Optimistic=false, the result is correct. Why doesn't datanucleus.Optimistic=true work?
More details about this case:
Below is the CRUD-related log:
1. Log of the Save operation:
DEBUG: DataNucleus.Transaction - Transaction begun for ExecutionContext org.datanucleus.ExecutionContextThreadedImpl#6bc3bf (optimistic=true)
INFO : org.springframework.test.context.transaction.TransactionalTestExecutionListener - Began transaction (1): transaction manager [org.springframework.orm.jpa.JpaTransactionManager#7dfefcef]; rollback [true]
DEBUG: DataNucleus.Persistence - Making object persistent : "com.demo.MyEntity#30a7803e"
DEBUG: DataNucleus.Cache - Object with id "com.demo.MyEntity:07cad778-d1c3-4834-ace7-ac2e4ecacc24" not found in Level 1 cache [cache size = 0]
DEBUG: DataNucleus.Cache - Object with id "com.demo.MyEntity:07cad778-d1c3-4834-ace7-ac2e4ecacc24" not found in Level 2 cache
DEBUG: DataNucleus.Persistence - Managing Persistence of Class : com.demo.MyEntity [Table : (none), InheritanceStrategy : superclass-table]
DEBUG: DataNucleus.Cache - Object "com.demo.MyEntity#96da65f" (id="com.demo.MyEntity:07cad778-d1c3-4834-ace7-ac2e4ecacc24") added to Level 1 cache (loadedFlags="[YNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN]")
DEBUG: DataNucleus.Lifecycle - Object "com.demo.MyEntity#96da65f" (id="com.demo.MyEntity:07cad778-d1c3-4834-ace7-ac2e4ecacc24") has a lifecycle change : "HOLLOW"->"P_NONTRANS"
DEBUG: DataNucleus.Persistence - Fetching object "com.demo.MyEntity#96da65f" (id=07cad778-d1c3-4834-ace7-ac2e4ecacc24) fields [entityId,extensions,objectType,openSocial,published,updated,url,actor,appId,bcc,bto,cc,content,context,dc,endTime,generator,geojson,groupId,icon,inReplyTo,ld,links,location,mood,object,odata,opengraph,priority,provider,rating,result,schema_org,source,startTime,tags,target,title,to,userId,verb]
DEBUG: DataNucleus.Datastore.Retrieve - Object "com.demo.MyEntity#96da65f" (id="07cad778-d1c3-4834-ace7-ac2e4ecacc24") being retrieved from HBase
DEBUG: org.apache.hadoop.hbase.zookeeper.ZKUtil - hconnection opening connection to ZooKeeper with ensemble (master.hbase.com:2181)
....
DEBUG: org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at row=MyEntity,,00000000000000 for max=10 rows using org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation#25c7f5b0
...
DEBUG: DataNucleus.Cache - Object with id="com.demo.MyEntity:07cad778-d1c3-4834-ace7-ac2e4ecacc24" being removed from Level 1 cache [current cache size = 1]
DEBUG: DataNucleus.ValueGeneration - Creating ValueGenerator instance of "org.datanucleus.store.valuegenerator.UUIDGenerator" for "uuid"
DEBUG: DataNucleus.ValueGeneration - Reserved a block of 1 values
DEBUG: DataNucleus.ValueGeneration - Generated value for field "com.demo.BaseEntity.entityId" using strategy="custom" (Generator="org.datanucleus.store.valuegenerator.UUIDGenerator") : value=4aa3c4a8-b450-473e-aeba-943dc6ef30ce
DEBUG: DataNucleus.Cache - Object "com.demo.MyEntity#30a7803e" (id="com.demo.MyEntity:4aa3c4a8-b450-473e-aeba-943dc6ef30ce") added to Level 1 cache (loadedFlags="[YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY]")
DEBUG: DataNucleus.Transaction - Object "com.demo.MyEntity#30a7803e" (id="4aa3c4a8-b450-473e-aeba-943dc6ef30ce") enlisted in transactional cache
DEBUG: DataNucleus.Persistence - Object "com.demo.MyEntity#30a7803e" has been marked for persistence but its actual persistence to the datastore will be delayed due to use of optimistic transactions or "datanucleus.flush.mode" setting
2. Log of the DELETE operation:
DEBUG: DataNucleus.Cache - Object "com.demo.MyEntity#30a7803e" (id="com.demo.MyEntity:4aa3c4a8-b450-473e-aeba-943dc6ef30ce") taken from Level 1 cache (loadedFlags="[YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY]") [cache size = 1]
DEBUG: DataNucleus.Persistence - Deleting object from persistence : "com.demo.MyEntity#30a7803e"
DEBUG: DataNucleus.Lifecycle - Object "com.demo.MyEntity#30a7803e" (id="com.demo.MyEntity:4aa3c4a8-b450-473e-aeba-943dc6ef30ce") has a lifecycle change : "P_NEW"->"P_NEW_DELETED"
3. Log of the named QUERY operation:
DEBUG: DataNucleus.Cache - Query Cache of type "org.datanucleus.query.cache.SoftQueryCompilationCache" initialised
DEBUG: DataNucleus.Cache - Query Cache of type "org.datanucleus.store.query.cache.SoftQueryDatastoreCompilationCache" initialised
DEBUG: DataNucleus.Cache - Query Cache of type "org.datanucleus.store.query.cache.SoftQueryResultsCache" initialised
DEBUG: DataNucleus.Query - JPQL Single-String with "SELECT a FROM MyEntity a ORDER BY a.updated DESC"
DEBUG: DataNucleus.Persistence - ExecutionContext.internalFlush() process started using optimised flush - 0 to delete, 1 to insert and 0 to update
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 sending #7
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 got value #7
DEBUG: org.apache.hadoop.ipc.RPCEngine - Call: exists 0
DEBUG: DataNucleus.Datastore.Persist - Object "com.demo.MyEntity#30a7803e" being inserted into HBase with all reachable objects
DEBUG: DataNucleus.Datastore.Native - Object "com.demo.MyEntity#30a7803e" PUT into HBase table "MyEntity" as {"totalColumns":3,"families":{"MyEntity":[{"timestamp":9223372036854775807,"qualifier":"DTYPE","vlen":8},{"timestamp":9223372036854775807,"qualifier":"userId","vlen":5},{"timestamp":9223372036854775807,"qualifier":"entityId","vlen":36}]},"row":"4aa3c4a8-b450-473e-aeba-943dc6ef30ce"}
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 sending #8
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 got value #8
DEBUG: org.apache.hadoop.ipc.RPCEngine - Call: multi 2
DEBUG: DataNucleus.Datastore.Persist - Execution Time = 123 ms
DEBUG: DataNucleus.Persistence - ExecutionContext.internalFlush() process finished
DEBUG: DataNucleus.Query - JPQL Query : Compiling "SELECT a FROM MyEntity a ORDER BY a.updated DESC"
DEBUG: DataNucleus.Query - JPQL Query : Compile Time = 13 ms
DEBUG: DataNucleus.Query - QueryCompilation:
[from:ClassExpression(alias=a)]
[ordering:OrderExpression{PrimaryExpression{a.updated} descending}]
[symbols: a type=com.demo.MyEntity]
DEBUG: DataNucleus.Query - JPQL Query : Compiling "SELECT a FROM MyEntity a ORDER BY a.updated DESC" for datastore
DEBUG: DataNucleus.Query - JPQL Query : Compile Time for datastore = 2 ms
DEBUG: DataNucleus.Query - JPQL Query : Executing "SELECT a FROM MyEntity a ORDER BY a.updated DESC" ...
DEBUG: DataNucleus.Datastore.Native - Retrieving objects for candidate=com.demo.MyEntity and subclasses
DEBUG: org.apache.hadoop.hbase.client.ClientScanner - Creating scanner over MyEntity starting at key ''
DEBUG: org.apache.hadoop.hbase.client.ClientScanner - Advancing internal scanner to startKey at ''
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 sending #9
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 got value #9
DEBUG: org.apache.hadoop.ipc.RPCEngine - Call: openScanner 1
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 sending #10
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 got value #10
DEBUG: org.apache.hadoop.ipc.RPCEngine - Call: next 0
DEBUG: DataNucleus.Cache - Object "com.demo.MyEntity#30a7803e" (id="com.demo.MyEntity:4aa3c4a8-b450-473e-aeba-943dc6ef30ce") taken from Level 1 cache (loadedFlags="[YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY]") [cache size = 1]
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 sending #11
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 got value #11
DEBUG: org.apache.hadoop.ipc.RPCEngine - Call: next 0
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 sending #12
DEBUG: org.apache.hadoop.ipc.HBaseClient - IPC Client (47) connection to namenode.hbase.com/192.168.1.99:60020 from user1 got value #12
DEBUG: org.apache.hadoop.ipc.RPCEngine - Call: close 1
DEBUG: org.apache.hadoop.hbase.client.ClientScanner - Finished with scanning at {NAME => 'MyEntity,,1457106265917.c6437b9afd33cd225c33e0ed52ff50d4.', STARTKEY => '', ENDKEY => '', ENCODED => c6437b9afd33cd225c33e0ed52ff50d4,}
DEBUG: DataNucleus.Query - JPQL Query : Processing the "ordering" clause using in-memory evaluation (clause = "[OrderExpression{PrimaryExpression{a.updated} descending}]")
DEBUG: DataNucleus.Query - JPQL Query : Processing the "resultClass" clause using in-memory evaluation (clause = "com.demo.MyEntity")
DEBUG: DataNucleus.Query - JPQL Query : Execution Time = 14 ms
Why do the following log lines (a PUT of the entity with lifecycle "P_NEW_DELETED" into the datastore) appear during the QUERY operation, and how can I avoid this behavior?
DEBUG: DataNucleus.Datastore.Persist - Object "com.demo.MyEntity#30a7803e" being inserted into HBase with all reachable objects
DEBUG: DataNucleus.Datastore.Native - Object "com.demo.MyEntity#30a7803e" PUT into HBase table "MyEntity" as {"totalColumns":3,"families":{"MyEntity":[{"timestamp":9223372036854775807,"qualifier":"DTYPE","vlen":8},{"timestamp":9223372036854775807,"qualifier":"userId","vlen":5},{"timestamp":9223372036854775807,"qualifier":"entityId","vlen":36}]},"row":"4aa3c4a8-b450-473e-aeba-943dc6ef30ce"}
You turned on optimistic transactions, so all data write operations only happen at commit. You executed a query before that happened (and didn't set the flush mode for the query), so your delete is not yet in the datastore when the query runs.
Call
em.flush()
before executing the query, or set
query.setFlushMode(FlushModeType.AUTO);
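Putting those two fragments together, a sketch of the full sequence (standard JPA API; the entity and query name come from the question):
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.FlushModeType;
import javax.persistence.TypedQuery;

public class FindAllAfterDelete {
    public static List<MyEntity> deleteThenFindAll(EntityManager manager, Object id) {
        MyEntity e = manager.find(MyEntity.class, id);
        manager.remove(e);
        manager.flush(); // push the queued DELETE to the datastore now

        TypedQuery<MyEntity> query =
                manager.createNamedQuery(MyEntity.FIND_ALL, MyEntity.class);
        // Alternatively, skip manager.flush() above and let the query
        // trigger the flush itself:
        query.setFlushMode(FlushModeType.AUTO);
        return query.getResultList();
    }
}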

mongods in replica set won't start after trying out some MapReduce

I was practicing some MapReduce inside my primary's mongo shell when it suddenly became a secondary. I SSHed into the two other VMs with the secondaries and discovered that their mongods had been rendered inoperable. I killed them, issued mongod --config /etc/mongod.conf to start them again, and entered the mongo shell. After a few seconds the shells were interrupted with:
2014-09-14T22:29:54.142-0500 DBClientCursor::init call() failed
2014-09-14T22:29:54.143-0500 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-09-14T22:29:54.143-0500 warning: Failed to connect to 127.0.0.1:27017, reason: errno:111 Connection refused
2014-09-14T22:29:54.143-0500 reconnect 127.0.0.1:27017 (127.0.0.1) failed failed couldn't connect to server 127.0.0.1:27017 (127.0.0.1), connection attempt failed
>
This is from the logs of the two original secondaries in the replica set:
2014-09-14T22:09:21.879-0500 [rsBackgroundSync] replSet syncing to: vm-billing-001:27017
2014-09-14T22:09:21.880-0500 [rsSync] replSet still syncing, not yet to minValid optime 54165090:1
2014-09-14T22:09:21.882-0500 [rsBackgroundSync] replset setting syncSourceFeedback to vm-billing-001:27017
2014-09-14T22:09:21.886-0500 [rsSync] replSet SECONDARY
2014-09-14T22:09:21.886-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1_inc properties: { v: 1, key: { 0: 1 }, name: "_temp_0", ns: "test.tmp.mr.CCS.nonconforming_1_inc" }
2014-09-14T22:09:21.887-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.887-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1 properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.tmp.mr.CCS.nonconforming_1" }
2014-09-14T22:09:21.887-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.888-0500 [repl writer worker 1] build index on: test.tmp.mr.CCS.nonconforming_1 properties: { v: 1, unique: true, key: { id: 1.0 }, name: "id_1", ns: "test.tmp.mr.CCS.nonconforming_1" }
2014-09-14T22:09:21.888-0500 [repl writer worker 1] added index to empty collection
2014-09-14T22:09:21.891-0500 [repl writer worker 2] ERROR: writer worker caught exception: :: caused by :: 11000 insertDocument :: caused by :: 11000 E11000 duplicate key error index: cisco.tmp.mr.CCS.nonconforming_1.$id_1 dup key: { : null } on: { ts: Timestamp 1410748561000|46, h: 9014687153249982311, v: 2, op: "i", ns: "cisco.tmp.mr.CCS.nonconforming_1", o: { _id: 14, value: 1.0 } }
2014-09-14T22:09:21.891-0500 [repl writer worker 2] Fatal Assertion 16360
2014-09-14T22:09:21.891-0500 [repl writer worker 2]
From both of the VMs whose mongods won't start, I can connect with mongo --host ... --port ... to the original primary, but I do see some connection-refused notes above in the error log.
My original primary mongod can still be connected to from the mongo shell, and it is still primary. If I kill and restart it, it starts up as a secondary.
How can I roll back to the last known state and restart my replica set?