How can I move a Sphinx real-time index to another server?

What is the best method to transfer a Sphinx real-time index from one machine to another? If it were a disk index, I could just move the database and re-index it on the other machine, but the index is RT. Thanks in advance!

Stop searchd on the source gracefully (i.e. searchd --stopwait, rather than forcibly killing it, crashing it, etc.).
Copy /var/folder/indexname* to the destination machine (using the path prefix noted in the index definition).
Copy the index definition to the destination.
Start up searchd on destination.
This is most likely to work successfully if both machines have the same version of Sphinx installed.
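A rough shell sketch of those steps, assuming the index files live under /var/folder/indexname (per the path prefix above) and that the configuration sits at /etc/sphinx/sphinx.conf, which is an assumption:

# on the source machine: flush the RT index to disk and stop cleanly
searchd --config /etc/sphinx/sphinx.conf --stopwait

# copy the index files named by the path prefix in the index definition
rsync -av /var/folder/indexname* user@destination:/var/folder/

# copy the index definition (sphinx.conf) as well
rsync -av /etc/sphinx/sphinx.conf user@destination:/etc/sphinx/

# on the destination machine: start searchd against the copied files
searchd --config /etc/sphinx/sphinx.conf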

Related

How is sync between replica sets in MongoDB achieved? Is automatic or manual triggering needed?

Going through the MongoDB documentation, I could not find a clear answer to the above question. Please provide the commands, if any, that are used to manually trigger a sync among replica sets.
Replica sets are always synced automatically, but if you need to do a manual re-sync you have a couple of options, as explained here: https://docs.mongodb.org/manual/tutorial/resync-replica-set-member/
So basically you can stop the member you want to re-sync and empty its data directory. When you restart it, Mongo will automatically start the sync process:
Stop the member's mongod instance. To ensure a clean shutdown, use the db.shutdownServer() method from the mongo shell or, on Linux systems, the mongod --shutdown option.
Delete all data and sub-directories from the member's data directory. By removing the contents of dbPath, you force MongoDB to perform a complete resync. Consider making a backup first.
Another way MongoDB suggests is to copy the data files from another member; once that is done, MongoDB will sync the remaining data from the master. This is similar to the first solution but faster, because you already have most of the data and don't need to start from scratch.
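A minimal shell sketch of the first approach (the dbPath and config locations below are assumptions; check your own configuration, and keep the backup, before deleting anything):

# 1. stop the member cleanly
mongo admin --eval "db.shutdownServer()"
# or, on Linux:
mongod --shutdown --dbpath /var/lib/mongodb

# 2. back up and then empty the data directory (dbPath assumed here)
cp -a /var/lib/mongodb /var/lib/mongodb.bak
rm -rf /var/lib/mongodb/*

# 3. restart the member; it will perform a full initial sync on its own
mongod --config /etc/mongod.conf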

Mongodb interrupted while reindexing

I have a collection with about 3,000,000 entries that I need to reindex. This whole thing began when I tried to add a 2d index. To do this, I created an ssh tunnel, opened the mongo shell and tried to use ensureIndex. I'm in a place with a somewhat unreliable internet connection, and an hour in it ended up breaking the pipe. I then tunneled back in, opened the mongo shell and tried to look at the number of indexes using getIndexes; the new index I created showed up, but I wasn't confident it had finished, so I decided to use reIndex. In retrospect, this was stupid. The pipe broke again. Now when I open the shell and try to issue getIndexes, the shell doesn't respond.
So what should I do? Do I need to repair my database? Can I issue reIndex when I have a more reliable internet connection? Is there a way to issue reIndex without keeping the shell open, but without doing it in the background and having it take eons? (I'll check the mongo shell options to see if I can find anything, then check the node.js mongo API so I can try running something as a service on the server.)
And also, if I end up running reIndex as a service on the server, is there any way to check if it's working? The most frustrating part of this right now is I have no idea if my database is ok, if reIndex is still running, etc. Any help would be much appreciated. Thanks.
You don't have a problem. Mongo runs commands server-side and only stops them if you explicitly kill the operation (db.killOp()).
You do not need to wait for the index operation to finish!
Regarding the connection problems, try using the screen command.
It lets you create a "persistent" session: not persistent in the sense of surviving on disk, but in the sense of surviving a lost connection.
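For example, you could run the long operation inside screen and check on the index build from a separate connection with db.currentOp(); the session name below is just an illustration:

# start a detachable terminal session and run the mongo shell inside it;
# detach with Ctrl-a d, reattach later with: screen -r mongo-index
screen -S mongo-index
mongo

# from any other connection, list the currently running operations
# (an in-progress index build shows up here with a progress message)
mongo --eval "printjson(db.currentOp())"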

Postgres 9.2 pg_largeobject tablespace

I am currently moving some data around and I am running into an interesting issue.
I have a CentOS server (6.3) up and running with Postgres 9.2 on a server with limited built in disk space; however, I do have a large amount of extremely reliable external network disk space available.
I have set the tablespace to a directory on this storage device for my database and everything seems to be working well, until...
I realized that I have a large amount of BLOB data that needs to be stored in pg_largeobject.
I have been googling how to set the tablespace of pg_largeobject and I did find some results, but they are horribly outdated.
I did find one article that looks promising, but I'm hesitant because the thread also references that things will/should have changed.
I have two questions...
In an ideal world, I would like to move all of postgres (including pg_largeobject) onto this external storage for ease of maintenance. Is this possible?
If not, how can I get pg_largeobject to use my network storage?
As you alluded to, your best bet is to move the entirety of PostgreSQL onto the remote storage, assuming that storage is exposed as a reliable network block device like iSCSI, ATAoE or NBD. I wouldn't recommend running Pg on NFS, and running it on CIFS/SMBFS just won't work.
Just:
Make a backup
Take a note of the output of SHOW data_directory; in psql
Shut PostgreSQL down
Move the data directory (the folder containing pg_xlog, pg_clog, etc) to the remote storage
Adjust the permissions on the parent directories of the datadir's new location so that the postgres user (via the user, group, or others permission bits) has at least execute permission on each parent directory and can traverse the tree.
Adjust your system startup scripts to set the new location as the PostgreSQL datadir or symlink the old datadir location (output by SHOW data_directory) to the new location.
Start PostgreSQL
Unfortunately, different systems and packages locate the datadir in different ways. Debian/Ubuntu use pg_wrapper, for example.
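A hedged sketch of those steps on a CentOS-style install (the service name, the old datadir and the network mount point are all assumptions; adjust them to your system):

# 1. note the current data directory
psql -U postgres -c "SHOW data_directory;"

# 2. stop PostgreSQL and move the datadir onto the network storage
service postgresql-9.2 stop
mkdir -p /mnt/netstore/pgsql/9.2
mv /var/lib/pgsql/9.2/data /mnt/netstore/pgsql/9.2/data

# 3. the postgres user needs at least execute on every parent directory
chown -R postgres:postgres /mnt/netstore/pgsql
chmod o+x /mnt /mnt/netstore

# 4. symlink the old location to the new one (or adjust PGDATA in the
#    startup script instead), then start PostgreSQL again
ln -s /mnt/netstore/pgsql/9.2/data /var/lib/pgsql/9.2/data
service postgresql-9.2 start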

Where are the db files? At /var/lib/mongodb I can't find any increase in size. I ran a very big loop to create lakhs of objects

I am using Ubuntu; from /etc/mongod.conf I found that /var/lib/mongodb is the path for data.
I have found some files like collectionname.0, .1 and .ns in that directory, but when I run a very big loop (lakhs of inserts) I am able to get the objects back using the mongo shell while the directory size does not increase, so there must be some other place where this data is being stored.
What is that place?
There is no other place. As indicated by @itsbruce, on Ubuntu it's /var/lib/mongodb.
On a non-packaged installation (on Linux), i.e. without a /etc/mongodb.conf file, the default is /data/db/ (unless otherwise specified).
You can use the --dbpath switch on the command line to designate a different directory for the mongod instance to store its data. Typical locations include /srv/mongodb, /var/lib/mongodb and /opt/mongodb.
(Windows systems use the \data\db directory by default.) If you installed using a package management system, the path is set in the configuration file instead (as in the /etc/mongod.conf example above).
I recommend using the db.collection.stats command, as outlined in the MongoDB documentation, to monitor the size of your collection as you insert data. The documentation also explains what each field in the output means.
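For instance (the database and collection names below are placeholders):

# storage statistics for one collection: size, storageSize, index sizes
mongo test --eval "printjson(db.mycollection.stats())"

# overall file sizes for the database
mongo test --eval "printjson(db.stats())"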
That is the correct data location for MongoDB on Ubuntu. MongoDB pre-allocates file space, so are you sure you have generated more data than would fit into the initial pre-allocated files? Try blowing away any existing data files and restarting Mongo with the --noprealloc flag, then add data.
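A quick way to test that, assuming the default Ubuntu dbpath (back up anything you care about before removing files):

# remove the pre-allocated test data files and restart without pre-allocation
rm -rf /var/lib/mongodb/*
mongod --dbpath /var/lib/mongodb --noprealloc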

Incremental backups from server to local machine

My live site is using mongodb to store user activities on the site.
I have a single server running mongodb. I can't afford a second server for master-slave replication.
My problem is that I want to take a dump of the server's mongodb database every day and restore it to my local machine so that I can query it locally. I know how to dump and restore, but the issue is that every day I have to dump the entire database from the server and restore it from scratch on my local machine, which takes a lot of time.
So my question is: is there any way to have incremental backups in mongodb, so that I only have to dump and restore a single day's data and it takes less time?
I do not know much about mongodb, but I have an idea.
I think you can introduce your local mongodb instance as a slave of the master production db and, if possible, keep the slave hidden from clients, to prevent the live system from running selects against your local machine.
This can work because slaves keep track of the master's writes and deletes and try to keep themselves an exact copy of the master.
Another good reason to do this is that a slave does not have to be online all the time; when it comes back online, the slave checks the master's operation list (the length of this list, e.g. one hour or one day, is configurable on the master) and copies the missing data from the master as quickly as possible.
Once you have dumped the master to your local machine, you can then keep your data backed up, for example twice a day, with this method, I think.
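A hedged sketch of that idea using a replica set, with the local machine added as a hidden, priority-0 member so that it can never become primary and the live site never reads from it (hostnames, ports and paths are placeholders; test this outside production first):

# on the production server: start mongod as a replica set member
mongod --replSet rs0 --dbpath /var/lib/mongodb

# still on the production server: initiate the set, then add the local
# machine as a hidden member
mongo --eval 'rs.initiate()'
mongo --eval 'rs.add({_id: 1, host: "my-local-box:27017", priority: 0, hidden: true})'

# on the local machine: run mongod with the same replica set name; it
# performs an initial sync from the master and then stays up to date
mongod --replSet rs0 --dbpath /data/db

Because the member is hidden with priority 0, clients of the live site will not route queries to it, while your local copy stays queryable through its own mongod (reads on a secondary require rs.slaveOk() in the shell).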