I run MATLAB2014b on an Ubuntu client and have ssh-based access (with passwordless ssh enabled) to multiple other computer clusters that have both data in the form of huge .mat files and MATLAB code on them. Because my local machine is very limited in terms of space, I cannot easily copy the files locally and work on them. Furthermore, those computer clusters have shared resources, so opening up a MATLAB GUI on the clusters (by doing ssh -X, for example) is going to cause issues to other researchers connected to those machines. Opening up a MATLAB session without a display is totally possible, but it lowers my productivity since the GUI is very helpful to me.
What I would ideally want is to use my local MATLAB GUI but access all the files and my code in the remote server. Please note that this is not the same as creating a cluster profile to submit parallel jobs to; for this particular - more complicated - piece of information, multiple tutorials exist.
Related
I want to use Postgresql in Windows Server 2012 R2 for one our project where it can be 24/7 uptime.
I would like to ask the community if I can have 2 master instances in 2 different servers A&B and they will 'work' on the same DB located in a shared file storage in lan. Always one master instance on server A will be online and when it goes offline for some reason (I suppose) a powershell script will recognize that the postgresql service stopped and will start the service in server B. The same script will continuous check that only one service in servers A & B is working to avoid conflicts.
I'd like to ask if this is possible or a better approach for my configuration.
(I can't use replication because when server A shuts down the server B is in read-only mode thing that I don't want)
If you manage to start two instances of PostgreSQL on the same data directory, serious data corruption will happen.
Normally there is a postmaster.pid file that prevents that, but a PostgreSQL server process on a different machine that accesses the same file system will happily unlink that after spewing some log messages, thinking it was left behind from a crash.
So you are really walking on thin ice with a solution like that.
One other issue that you didn't think of is that script that is supposed to check if the server is still running. What if that script fails, because for example the network connection between the two servers is down, but the server is still up an running happily? Such a “split brain” scenario will cause data corruption with your setup.
Another word of caution: since you seem to be using Windows (Powershell?), you probably envision a CIFS file system when you are talking of shared storage. A Windows “network share” is not a reliable file system — last time I checked, it did not honor _commit.
Creating a reliable failover cluster is harder than you think, and I'd recommend that you check existing solutions before you try to roll your own.
I'm trying to implement a distributed grep. How can I access the log files from different systems? I know I need to use the network but I don't know whether you use ssh, telnet, or anything else? What information do I need to know about the machines I am going to connect to from my machine? I want to be able to connect to different Linux machines and read their log files and pipe it back to my machine.
Your system contains a number of Linux machine which produce log data(SERVERs), and one machine which you operate(CLIENT). Right?
Issue 1) file to be accessed.
In general, log file is locked by a software which produce log data, because the software has to be able to write data into log file at any time.
To access the log file from other software, you need to prepare unlocked log data file.
Some modification of the software's setup ane/or the software(program) itself.
Issue 2) program to serve log files.
To get log data from SERVER, each SERVERs have to run some server program.
For remote shell access, rshd (remote shell deamon) is needed. (ssh is combination of rsh and secure communication).
For FTP access, ftpd (file transfer protocol deamon) is needed.
The software to be needed is depend how CLIENT accesses SERVERs.
Issue 3) distribued grep.
You use words 'distribued grep'. What do you mean by the words?
What are distribued in your 'distributed grep'?
Many senarios came in my mind.
a) Log files are distribued in SERVERs. All log data are collected to CLIENT, and grep program works for collected log data at CLIENT.
b) Log files are distribued in SERVERs. Grep function are implemented on each SERVERs also. CLIENT request to each SERVERs for getting the resule of grep applied to log data, and results are collected to CLIENT.
etc.
What is your plan?
Issue 4) access to SERVERs.
Necessity of secure communication is depend on locations of machines and networks among them.
If all machines are in a room/house, and networks among machines are not connected the Internet, secure communication is not necessary.
If the data of log is top secret, you may need encript the data before send the data on the network.
How is your log data important?
At very early stage of development, you should determing things described above.
This is my advice.
I am looking to collect data snapshot on a random interval from various machines in our network that we don't own, but may get access to install an agent to collect these data.
These machines are either in a domain or work-group and kind of data i get are based on the role they play and information they have. The machines are "Windows Server 2003" and above and I do not want to install anything on those machines before i get started, so thought I can use the PowerShell scripts that I can remote invoke form my server and pass the script it has to run to return the data.
I was wondering if this is possible to do that with the PowerShell scripts and as this is supposed to run in a secure environment, is there any major security implications with this approach. i.e. do I need to do anything on the client machines that can make them vulnerable to security threats.
BTW these machines are not exposed to internet and are behind a firewall.
I would appreciate if you point me to any other alternatives that can be useful for my analysis.
Regards
Kiran
Is there a way that I can copy updated files only from one network to another? The networks aren't connected in anyway, so the transfer will need to go via CD (or USB, etc.)
I've had a look at things like rsync, but that seems to require the two networks to be connected.
I am copying from a Windows machine, but it's going onto a network with both Windows and Linux machines (although Windows is preferable due to the way the network is set up).
you can rsync from source to the use-drive, use the usb-drive as a buffer, and then rsync from the usb-drive to the target. to benefit from the rsync algorithm and reduce the amount of copied data you need to keep the data on the usb-drive between runs.
I am having a conflict of ideas with a script I am working on. The conflict is I have to read a bunch of lines of code from a VMware file. As of now I just use SSH to probe every file for each virtual machine while the file stays on the server. The reason I am now thinking this is a problem is because I have 10 virtual machines and about 4 files that I probe for filepaths and such. This opens a new SSH channel every time I refer to the ssh object I have created using Net::OpenSSH. When all is said and done I have probably opened about 16-20 ssh objects. Would it just be easier in a lot of ways if I SCP'd the files over to the machine that needs to process them and then have most of the work done on the local side. The script I am making is a backup script for ESXi and it will end up storing the files anyway, the ones that I need to read from.
Any opinion would be most helpful.
If the VM's do the work locally, it's probably better in the long run.
In the short term, the ~equal amount of resources will be used, but if you were to migrate these instances to other hardware, then of course you'd see gains from the processing distribution.
Also from a maintenance perspective, it's probably more convenient for each VM to host the local process, since I'd imagine that if you need to tweak it for a specific box, it would make more sense to keep it there.
Aside from the scalability benefits, there isn't really any other pros/cons.