Sphinx - NFS index

Sphinx - NFS index - sphinx

We are running sphinx 1.10 version. We are having multiple sphinx servers under Loadbalancer where searchd is running. We want to share the same index file across multiple servers via NFS. We do not want to do rsync as it would have different servers getting updated with indexes at different time and hence would create inconsistency in the search output.
Due to the .lock file creation, currently via NFS we are unable to start searchd in multiple servers. Any solution would be of great help!

You can use rsync, and then rotate all the servers in unison. basically you can do the reindexing, syncing. And then control when the servers actully rotate in the new index.
Works well. A couple of mentions of it here
http://sphinxsearch.com/forum/search.html?q=rsync+sighup&f=1

I can say it is impossible to share indexes between two or more searchd instances.
You have to implement something similar to rsync, see how we are doing Sphinx replication.

Related

can multiple server access the same mongodb?

I am going to create a load balancer in Azure. I have a VM that already running and going to take a backup of the existing VM and will create another VM using that backup. So two servers will have the same configuration and will use the same credentials.
In the already existing server, I have MongoDB configured, and if I create the same VM that will also have the same configuration as the old VM. Now what I want to know is can I use the same MongoDB which will be accessed by two servers that have the same configurations?
Will it create any mess or any give any error?
can I use like above mentioned?
Do I need to configure another MongoDB for the second server?
can anyone please clarify my questions? it would be great to have some clear explanation. thank you

MongoDB has build in support for horizontal scalability and high availability meaning that you dont need to create a 3th party load balancer , the mongos service part of mongoDB sharding cluster is the load balancer itself. Check the official documentation for mongoDB replication and sharding ...
On your questions:
Will it create any mess or any give any error?
If you just copy data to another VM it will be fine , as far as you dont write to one of the VMs you can loadbalance reads between this independent VMs , but this is strange approch when you have build in mongoDB replication mechanism and you can just add the second VM as a SECONDARY member from replicaSet.
can I use like above mentioned?
Sure , you can use also this approach but why you will need to do it?
Do I need to configure another MongoDB for the second server?
Depends on the use case , but in general you would prefer to create 3x members replicaSet or if your database is large and write performance is strong requirement you may need to distribute the database between multiple servers ( shards ) so you will need more then just 3x servers ...

What is upconfig and downconfig in zookeeper?

I am a noob in Solr and zookeeper and trying to learn by myself. I understood that zookeeper is a file structure that manages solr cluster and prevents race condition using locks. I didn’t understand what is upconfig and downconfig and when we do that. It would be of great help if someone can give me a clear picture on it. Thanks in advance!

A better and more general description of Zookeeper is an application that provides centralised configuration for distributed systems. So in Solr Cloud, you can have multiple Solr instances across multiple servers acting together as a single cloud. However, if you want to update a collection's configuration, you don't want to have to go to each server and update them all individually. You want only one version of the config which is then used by any collection that needs it. Hence the conf commands.
upconfig uploads a configuration to ZooKeeper, which then ensures that all collections using that configuration (throughout the Cloud, on all the servers) have that specific config. So you only need to upload it once, on one server.
downconfig lets you fetch a configuration from Zookeeper.

Which way to run PostgreSQL in Docker?

Which of these methods is correct?
One db container for each app
One db container for all apps
Install db without docker
I tried to find information, but nothing. Or I badly searched?

It is immature, but that doesn't seem to be stopping a lot of people using Docker for persistence.
The official Postgres image has 4.5 million pulls - Ok, this doesn't mean that all those images/containers are being used but it does suggest that it is a popular solution.
If you have already decided that you would like to use Docker, because of what containers can offer your architecture, then I don't think you will have trouble using it for persistence - assuming you are happy learning Docker.
I'm using Postgres and MySql in several projects quite successfully on Docker.
In choosing option 1 or 2, I would say that unless your apps are related to the same problem domain/company/project I would go with option 1. Of course, running costs will possibly factor in as well.
I generally go with option 1.

All 3 options could be valid but it depends on the use you have to perform.
In my server I've 1 container for every main postgresql releases currently I use.
I run all of them on different ports (use not random number for ports but some easy to remember because a problem with docker is remember all port numbers and other for every container).
pg84(port 8432), pg93(port 9332), pg94 (port 9432)
I link the pgXX to the container I need and that's perfect for me.
For my experience I prefer option 2 so.

postgreSQL with different data directories

Today I installed postgreSQL to work with. When I was reading documents about postgreSQL, I found there can be more than one data directory present. Is it more than one data directory for single installation ? Or I understood wrongly ?
In my installation, data directory is in
C:\Program Files\PostgreSQL\8.3\data
if it can be more than one data directory for single installation, how will be the directory structure ? Please help me in understanding.

I think you mean a tablespace, please check the manual: CREATE TABLESPACE

There can be only one data directory for each PostgreSQL cluster. A cluster is a postmaster listening on a port, managing several databases from a single data directory. You can have multiple clusters by starting multiple postmasters with pg_ctl or via a system service, each listening on a different port and using a different data dir.
If you have multiple clusters on a machine you have multiple data directories. It's unusual to need to do this, but it is possible.
It would be immensely helpful if you'd link to the documents you're talking about when asking questions about them.

How to scale MongoDB?

I know that MongoDB can scale vertically. What about if I am running out of disk?
I am currently using EC2 with EBS. As you know, I have to assign EBS for a fixed size.
What if the MongoDB growth bigger than the EBS size? Do I have to create a larger EBS and Copy & Paste the files?
Or shall we start more MongoDB instance and each connect to different EBS disk? In such case, I could connect to a different instance for different databases.

If you're running out of disk, you obviously need to get a bigger disk.
There are several ways to migrate your data, it really depends on the type of up-time you need. First steps of course involve bundling the machine and creating the new volume.
These tips go from easiest to hardest.
Can you take the database completely off-line for several minutes?
If so, do this (migration by copy):
Mount new EBS on the server.
Stop your app from connecting to Mongo.
Shut down mongod and wait for everything to write (check the logs)
Copy all of the data files (and probably the logs) to the new EBS volume.
While the copy is happening, update your mongod start script (or config file) to point to the new volume.
Start mongod and check connection
Restart your app.
Can you take the database off-line for just a few minutes?
If so, do this (slaving and switch):
Start up a new instance and mount the new EBS on that server.
Install / start mongod as a --slave pointing at the current database. (you may need to re-start the current as --master)
The slave will do a fresh synchronization. Once the slave is up-to-date, you'll do a "switch" (next steps).
Turn off writes from the system.
Shut down the original mongod process.
Re-start the "new" mongod as a master instead of the slave.
Re-activate system writes pointing at the new master.
Done correctly those last three steps can happen in minutes or even seconds.
Can you not afford any down-time?
If so, do this (master-master):
Start up a new instance and mount the new EBS on that server.
Install / start mongod as a master and a slave against the current database. (may need to re-start current as master, minimal down-time?)
The new computer should do a fresh synchronization.
Once the new computer is up-to-date, switch the system to point at the new server.
I know it seems like this last version is actually the best, but it can be a little dicey (as of this writing). The reason is simply that I've honestly had a lot of issues with "Master-Master" replication, especially if you don't start with both active.
If you plan on using this method, I highly suggest a smaller practice run first. If something bombs here, Mongo might simply wipe all of your data files which will have the effect of taking more stuff down.
If you get a good version of this please post the commands, I'd like to see it in action.

Doesn't the E in EBS stand for elastic meaning something like resizing on the fly?
Currently the MongoDB team is working on finishining sharding which will allow you horizontal scaling by partitioning data separately on different servers. Give it a month or two and it will work fine. The developers are quite good at keeping their promises.
http://api.mongodb.org/wiki/current/Sharding%20Introduction.html
http://api.mongodb.org/wiki/current/Sharding%20Limits.html

You could slave the bigger disk off the smaller until it's caught up
or
fsync+lock and take a file system snapshot and copy it onto the bigger disk.

well, I am using Mongo DB now. I am pretty amazed the performance it generated, especially on some simple sorting.
I believe it's a good tool for simple web application logic. The remaining concern for is how to scale and backup. I will continue to explore.
The only disadvantage I have is that I didn't have any good tools to reveal the data stored inside. For example, I want to put my logging from MYSQL into Mongo as well. However, it's pretty difficult for me to view the log. Previously, i can use MYSQL query to fetch what I want easily.
Anyway, it's a good tool and I will continue to use it.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Sphinx - NFS index - sphinx

You can use rsync, and then rotate all the servers in unison. basically you can do the reindexing, syncing. And then control when the servers actully rotate in the new index. Works well. A couple of mentions of it here http://sphinxsearch.com/forum/search.html?q=rsync+sighup&f=1

I can say it is impossible to share indexes between two or more searchd instances. You have to implement something similar to rsync, see how we are doing Sphinx replication.