I see many deployments where IT groups run effectively nothing but a JVM application stack inside a VM (VMware, etc.) instance.
I guess I consider the JVM to be a VM in the formal sense: what real benefit is there to running your Java application stack inside another VM?
Two JVM instances on the same (real or virtualized) machine wouldn't be completely isolated from each other: they couldn't both have sockets listening on the same well-known port, they might interfere with each other if they both wrote to the same filesystem, and so on.
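To make the port example concrete, here is a minimal Java sketch (port 8080 and the class name are just illustrative): on one host, the second attempt to bind an already-bound port fails, whereas two OS-level VMs each have their own network stack and could both bind it.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortCollisionDemo {
    public static void main(String[] args) throws IOException {
        // The first "application" grabs the well-known port.
        ServerSocket first = new ServerSocket(8080);
        System.out.println("First listener bound to port " + first.getLocalPort());

        // A second JVM on the same host (simulated here in the same process)
        // cannot bind the same port -- it is a host-wide resource.
        try (ServerSocket second = new ServerSocket(8080)) {
            System.out.println("Never reached");
        } catch (IOException alreadyInUse) {
            System.out.println("Second bind failed: " + alreadyInUse.getMessage());
        }

        first.close();
    }
}
```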
Using OS-level VMs (VMware or whatever) does guarantee you as much isolation as you would have on physically separate systems, which is quite a different proposition.
It's an unfortunate terminology collision.
These are really two different concepts that unfortunately share the same English words but have only a rather abstract connection.
IBM used the term "virtual machine" first, so I guess we can't rename that one to "virtual server" or something.
Too bad "software framework" doesn't have VM in its initials. If you think of the JVM that way it will be obvious that you are really just running a framework in a VM, not a thing inside the same kind of thing...
So a real VM can casually give away superuser shell accounts, SSH access, software installation privileges, and so on.
What real benefit is there to running your Java application stack inside another VM?
By doing this, your JVM will run on virtualized hardware that you can modify and run in parallel with other virtual machines. This is a nice way to slice a big server into "shares" that you can allocate on demand.
(EDIT: I'm answering a comment from the OP directly in this answer)
I get what you're saying, but why would one not be able to do the very same thing as separate processes on the host OS?
I could mention that a guest can run a different OS, but that is not the most important part. As pointed out in another answer, the biggest difference is that a virtual machine is isolated from other VMs; it is a real, dedicated environment. The port issue was a good example, but I prefer to illustrate it this way: another process won't eat "your" CPU cycles. This is a very important difference, especially for IT teams that usually don't like to share resources. Instead, you can size a virtual machine exactly as needed, possibly dynamically, and bill IT teams for what they are actually using. This is, in my opinion, what makes pooling of resources possible in practice (and thus cuts costs).
Please excuse my ignorance. I don't know anything and am just trying to learn.
I've been thinking, and it appears to me that spinning up more virtual machines on a single computer so you can run a separate service on each VM instance only creates isolation between those services, so that if one VM instance were to crash, it wouldn't affect the others.
If the same service were installed on each of the VMs, then it also provides for availability.
But it does not help the scalability of your services because each VM on the same hardware must share the same limited hardware resources such as disk, memory, CPU, and network interface cards.
Am I thinking right? Sorry if this isn't the right forum to ask this sort of a question. If someplace else is more appropriate, please feel free to move it there.
Virtualization is a thing of the past; now it's all about containerization. I would suggest you look into Docker, and specifically docker-compose.
Try visiting https://www.docker.com, and also https://docs.docker.com/compose/gettingstarted/
Hope it helps.
Virtual machines have the advantage that you can adjust the parameters of usage: you can change the virtual hardware, memory, disk space, and CPU allocation.
This makes the VM scalable according to your needs.
And if the host machine becomes too weak to serve the requirements, you can move the VM to another host machine.
If you want to run a server with the maximum available resources, it's better to do it without a VM and with the best hardware you can get / pay for, because the VM adds an additional layer of computing and uses resources itself.
I have a production environment which runs on one server, but I need to run two instances of one piece of software, each on a "different" server.
Is it possible to imitate multiple servers on one real server, for free, without losing computing power or network throughput in and out of the real server?
EDIT:
In other words: I want to run two instances of the same software on one machine.
I then need to use a function that transports a sub-instance from instance1 to instance2. But this function can only be used when instance1 is on a different server than instance2. So I need to make it look as if one of the two locally running instances is on a different server.
I'm making an assumption that you are using Windows, in which case you could use a hypervisor like Hyper-V. However, if you have only purchased one license of Windows, you may be fairly limited in what you can run in a production capacity.
If you mean that the software you need to run only has one license, you typically are not allowed to virtualize it either, so it seems that, legally, you are not going to be able to do much with just one license. However, my assumptions may be all wrong; your question wasn't clear enough.
You're probably familiar with virtualization, which takes a single host and is able to "emulate" many instances by sharing the resources among them all. You've probably heard of Xen.
Is it completely insane to imagine the "opposite" of Xen: a layer that would abstract several hosts into a single running instance? I believe this would allow building apps that wouldn't need to care much about a "clustering" layer themselves.
I wonder what are the technical limits to this, because I'm pretty sure some people are already working on it somewhere :)
The goal is NOT to achieve any kind of failure recovery. I believe this can (and should?) be handled at a higher level. For example, if someone is able to run a MySQL server on a gigantic instance (made of say 50 hosts), then one can easily use MySQL's replication features to replicate the database over a similar virtual instance.
Good question. Microsoft Azure is attempting to address this by allowing you to put applications "in the cloud" and not have to be as concerned with scalability up/down, redundancy, data storage, etc. But this is not accomplished at the hypervisor level.
http://www.microsoft.com/windowsazure/
Hardware-wise, there are some downsides to having everything be one big VM rather than many smaller ones. For one thing, software doesn't always know how to handle all the resources. For example, some applications still can't take advantage of multiple processor cores. I've seen informal benchmarks showing that IIS performs better spreading the same resources over multiple instances rather than one giant instance.
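As a small, hedged illustration of making software aware of the resources it is actually given (the work loop below is a placeholder, not IIS-specific), here is a Java sketch that sizes its worker pool to the cores visible to the JVM instead of hard-coding a count:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class CoreAwareWorkers {
    public static void main(String[] args) throws Exception {
        // Inside a VM this reports the virtual CPUs the hypervisor exposes,
        // not the physical cores of the underlying host.
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);

        List<Future<Long>> results = new ArrayList<>();
        for (int i = 0; i < cores; i++) {
            results.add(pool.submit(() -> {
                long sum = 0;                     // placeholder CPU-bound work
                for (long n = 0; n < 50_000_000L; n++) sum += n;
                return sum;
            }));
        }
        for (Future<Long> r : results) r.get(); // wait for all workers
        pool.shutdown();
        System.out.println("Ran " + cores + " workers in parallel");
    }
}
```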
From a management perspective, it is probably better to have multiple VMs in certain cases. Imagine that a bad deployment corrupts a node. If that were your one and only (albeit giant) node, now your whole application is down.
You're probably thinking of the Single System Image (SSI) concept.
There used to be a Linux implementation, openMosix, which has since shut down; I don't know of any replacements. openMosix made it very easy to create and use an SSI on a standard Linux kernel; too bad it got overtaken by events.
I do not know enough about Xen to know if it is possible but with VMware you can create pools of resources which come from many physical hosts. Then you can assign the resources to your VMs. That could be many VMs or just one VM.
Aggregation: Transform Isolated Resources into Shared Pools
Simulating a single core over multiple physical cores is very inefficient. You can do it, but it will be slower than a cluster. Two physical cores in the same machine can talk to each other in near real time; if they are on separate machines, having those two cores (and their RAM) communicate even over a fibre-optic network is something like clocking your motherboard down by a factor of 10 or more.
Dual cores can communicate faster than two distinct CPUs on the same motherboard; if they are on separate machines, that's slower again; and if there are multiple machines, slower still.
Basically, you can do it, but there is a net performance loss compared to the net performance gain you would be hoping to achieve.
Real-life example: I had a bunch of VMs on a dual quad-core server (~2.5 GHz/core) performing way, way below what they should have been. On closer inspection, it turned out that the hypervisor was emulating a single 3.5-4 GHz core whenever the load on an individual VM was more than 2.5 GHz; after limiting each VM to 2.5 GHz, performance went back to what was expected.
I agree with saidimu, you are talking about the Single System Image concept. In addition to the OpenMosix project, there have been several commercial implementations of the same idea (one contemporary example is ScaleMP). It's not a new idea.
I just wanted to elaborate on some of the technical points of SSI.
Basically, the reason it's not done is because the performance is generally absolutely unpredictable or terrible. There is a concept in computer systems known as NUMA (non-uniform memory access), which basically means that the cost of accessing different pieces of memory is not uniform. This can apply to huge systems where a CPU's memory accesses may be routed to different chips, or to cases where memory is accessed remotely over a network (as in an SSI). Typically, the operating system will attempt to compensate for this by laying out programs and data in memory in such a way that a program can run as quickly as possible, i.e., the code and data will all be placed in the same NUMA "region" and be scheduled on the closest possible CPU.
However, in cases where you are running big applications (attempting to use all the memory in your SSI), there is little the operating system can do to reduce the impact of remote memory fetches. MySQL is not aware that accessing page 0x1f3c will cost 8 nanoseconds, while accessing page 0x7f46 will stall it for hundreds of microseconds, possibly milliseconds, while the memory is fetched over the network. This means that non-NUMA-aware applications will run like crap (seriously, very badly) in this kind of environment. As far as I know, most contemporary SSI products rely on the fastest possible interconnects (such as InfiniBand) between machines to achieve even passable performance.
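As a rough illustration of why access patterns matter when memory cost is non-uniform, here is a hedged Java sketch that compares sequential and randomized walks over a large array. On a single machine it only shows ordinary cache/DRAM effects; the point is that on an SSI the same kind of "far" access is slower by several more orders of magnitude.

```java
import java.util.Random;

public class AccessCostSketch {
    private static final int SIZE = 1 << 25; // ~32M ints, ~128 MB per array

    public static void main(String[] args) {
        int[] data = new int[SIZE];
        int[] order = new int[SIZE];
        Random rnd = new Random(42);
        for (int i = 0; i < SIZE; i++) { data[i] = i; order[i] = i; }
        // Shuffle the visit order so the random walk defeats prefetching/caching.
        for (int i = SIZE - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = order[i]; order[i] = order[j]; order[j] = tmp;
        }

        long t0 = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < SIZE; i++) sum += data[i];          // sequential walk
        long seq = System.nanoTime() - t0;

        t0 = System.nanoTime();
        for (int i = 0; i < SIZE; i++) sum += data[order[i]];   // randomized walk
        long rand = System.nanoTime() - t0;

        System.out.printf("sequential: %d ms, random: %d ms (sum=%d)%n",
                seq / 1_000_000, rand / 1_000_000, sum);
    }
}
```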
This is also why frameworks that expose the true cost of accessing data to the programmer (such as MPI, the Message Passing Interface) have gained more traction than SSI or DSM (distributed shared memory) approaches. In fact, there is basically no way for a programmer to optimize an application to run in an SSI environment, which just sucks.
I work for a small company with a .NET product that was acquired by a medium sized company with "big iron" products. Recently, the medium-sized part of the company acquired another small company with a similar .NET product and management went to have a look at their technology. They make heavy use of virtualization in their production environment and it's been decided that we will too.
Our product was not designed to be run in a virtual environment, but some accommodations can be made. For instance, there are times when we're resource-bound due to customer-initiated processes. This initiation is "bursty" by nature, but the processing can be made asynchronous and throttled. This is something that would need to be done for scalability anyway.
But there is other processing that we do that isn't so easily modified, because we're resource-bound for extended periods of time.
How do I convince management that heavy use of virtualization is probably not appropriate for us?
If I were your manager and heard your argument (above), I'd assume that you're just resistant to change, and I'd challenge you to show me the data. You haven't really made a case against virtualization. You say that your product "was not designed to be run in a virtual environment"; you're in good company, as very few apps ARE designed that way. It usually just works, and if it's too slow, they just throw more resources at it. If they need to move it, make it fault tolerant, or expand or contract it, it's all transparent. Poorly behaved apps can be firewalled off from other environments without needing dedicated hardware, and so on. What's not to like about that?
You should prepare a better argument, backed up with data from testing. Or you should prepare to be steamrolled by an organization with a lot of time, $$$, and momentum invested in (insert favorite technology here).
It sounds like you're confused about how virtualization works.
You still need to provide enough resources for your virtual machines; the real benefit of virtualization is consolidating five machines that each run at only 10-15% CPU onto a single machine that will run at 50-75% CPU, which still leaves you 25-50% headroom for those "bursty" times.
If your "bursty" application is slowing down other VMs, then you need to put resource limits in place (e.g. VM #1 can't use more than 3 GHz of CPU) and ensure that there are enough resources overall.
I've seen this in a production environment where 20 machines were virtualized but each was using as much CPU as it could. This caused problems, because a machine would try to use more GHz than a single core could provide while the VM only exposed a single core. Once we throttled the CPU usage of each VM to the maximum available from any single core, performance skyrocketed. I've seen the same with over-allocation of RAM, where the hypervisor keeps paging to disk and killing performance.
Virtualization works, given sufficient resources.
Don't fight the methods, specify requirements.
Do some benchmarks on different-sized platforms and establish a rough requirement guideline. If possible, don't say "this is the minimum needed"; it's better to say "with X resources, we do Y work units per hour; with X', we do Y'. A host that costs $Z can hold W virtual machines with X' resources." Then the bean counters will have beans to count. If after all that they decide that virtualization is cost-effective, they might be right.
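A minimal sketch of the kind of measurement suggested above, in Java; `doOneWorkUnit` is a stand-in you would replace with a representative transaction from your own application:

```java
public class ThroughputBenchmark {
    public static void main(String[] args) {
        final long runMillis = 60_000;          // measure for one minute
        long deadline = System.currentTimeMillis() + runMillis;
        long workUnits = 0;

        while (System.currentTimeMillis() < deadline) {
            doOneWorkUnit();
            workUnits++;
        }

        // Extrapolate to the "work units per hour" figure for the report.
        long perHour = workUnits * 3_600_000 / runMillis;
        System.out.println("Completed " + workUnits + " work units in "
                + runMillis / 1000 + " s (~" + perHour + " per hour)");
    }

    // Stand-in for one representative unit of real application work.
    private static void doOneWorkUnit() {
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) acc += i * 31L;
        if (acc == 42) System.out.print("");    // keep the JIT from removing the loop
    }
}
```

Run the same class on each candidate VM size and record the numbers; the comparison, not the absolute figure, is what the "bean counters" need.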
Why does it help to know about virtualization from a programmer's perspective? Apart from testing and developing on several different platforms without needing to switch between operating systems, is there a particular reason why virtualization is important for a programmer? Are there any details that must be kept in mind before developing on virtual instances?
I use it for testing our installer, because it is important to check whether the application will work on a clean installation of the operating system.
I used to do these tests by keeping a hard drive with a fresh operating system installation and making a copy of that disk for (almost) every new test run. This was very time consuming, and the virtual machine solution has saved me a lot of time. Note that this even allows you to do remote debugging as easily as when using two non-virtual machines.
Note: If you're interested, I'm using VirtualBox, which is a very good and free virtualization tool.
If you develop a driver or something else very close to the hardware, with a high risk of crashing the machine, you will be glad to be working in a virtual machine.
Reverting to an old state is easier than repairing a damaged OS.
One of the main advantages is having your entire development environment as a single image file. I have a perfectly configured version of Windows Server, Visual Studio, ReSharper, etc. I can easily try a new version of something on a copy of this virtual machine without worrying about it causing problems.
I can also back up my entire dev environment to transfer it to another physical machine very easily. I've been through 3 machines in this office alone so that was a lifesaver in itself.
The only real trade-off I see is performance. You generally have to use fewer physical CPU cores than you actually have, and less memory. With a sufficiently powerful machine this is not much of a problem, though.
Edit: As nader said, I/O is obviously important for most projects as well. Although developing on a virtual machine does mean a fairly large I/O penalty compared with a native OS install, in practice I rarely find it to be a problem. The superior random access capabilities of SSDs are helping to mitigate this drawback as well.
Being able to completely reset the state of the system is very useful for debugging applications that modify their environment: if the actions are repeated after a reset, and they're constrained to the sandbox environment of the VM, you are pretty much guaranteed to get the same result.
We have a large number of different versions and customer customisations of our software, and it's not possible for two installs of our software to coexist on the same machine. Virtualisation allows us to replace the 50-60 physical machines that we would need to maintain for testing and problem reproduction with 2-3 virtual servers. It takes around 10 minutes to make a copy of a VHD template we have and create a new virtual machine, and as long as you allocate 1-2 GB of RAM, the performance is comparable to that of a (slow) physical machine.
Virtual machines are also great for build machines.
Personally, I do all of my development on my desktop machine for best performance and remote-debug into VMs. I don't run virtual machines on my desktop because they use up too much RAM; we have dedicated virtual servers for that.
Good for development, because you can have the same server configuration in the virtual machine as on the production server.
https://stackoverflow.com/questions/905926/developer-software-setup
For a user-space application there should be no difference between developing for a virtualised OS and a normal OS. There may be some gotchas if your code makes explicit assumptions about the machine's memory size and number of processors and believes what the hypervisor tells it.
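A tiny Java sketch of checking at startup what the (possibly virtualized) machine actually exposes, rather than assuming it:

```java
public class EnvironmentReport {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();

        // These report what the OS (and, underneath it, the hypervisor)
        // presents to this JVM, which may differ from the physical host.
        int cpus = rt.availableProcessors();
        long maxHeapMb = rt.maxMemory() / (1024 * 1024);
        long currentHeapMb = rt.totalMemory() / (1024 * 1024);

        System.out.println("Visible processors : " + cpus);
        System.out.println("Max JVM heap (MB)  : " + maxHeapMb);
        System.out.println("Current heap (MB)  : " + currentHeapMb);

        // Size internal pools and caches from these values rather than constants.
    }
}
```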
I'm surprised no one has mentioned the ease of deployment. All you need to do is get the build onto the virtual OS, and then you can copy the image to as many new servers (running some kind of virtualization solution, like VMware) as you want, easily scaling your application.
Record the state of a bug in a program, and send it to the developer (along with the entire "machine").
Testing your code on various OSes, some of which you don't have.
Working in a more protected environment, making sure that the code doesn't harm your system. This is useful for understanding dangerous programs such as viruses and developing defences against them, for writing potentially destructive hard-drive programs, and for anything that can have catastrophic effects on your system.
Easily write your own OS without the need to write to 'real' boot sectors, a potentially harmful act (hope this is not news...).
Quickly use tools and programs not found on your own O.S.
Demonstrate a program at various points in time by restoring a virtual machine; this is quicker and less prone to failure than trying to recreate the state in the minutes before the demo.
Less directly connected to programming, but browsing via a virtual machine (for example, to view documentation) has the added value that your own important system (and code) is less likely to be harmed by malicious programs.
From my experience, in most cases the answer is typically "no" (once testing and targeting multiple platforms are set aside); both of those are huge reasons to be familiar with "desktop" VM solutions. Others have done an excellent job of listing rarer exceptions, like debugging kernel code.
There are some quirks one must be aware of when running on a virtual machine. This is hardly an exhaustive list:
Loss of precision, or even time reversal, in high-resolution timers due to emulation of hardware resources (this depends somewhat on the VM platform and the guest operating system); see the sketch after this list.
Virtual network interfaces are usually bridged. In one of the leading VM solutions, we've seen some extremely odd behavior in the host system caused by an application that sets up its own bridge between virtual interfaces, behavior which logically should not affect the host.
Usage models: if your product has Orwellian licensing checks, or records state-dependent behavior when interacting with remote systems, you should account for what would happen if a system were "paused" and "restarted", or restarted from an earlier "state". Normally this kind of thing would be taken into account anyway in a robust implementation.
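For the timer item in the list above, here is a hedged Java sketch of a sanity check you could run inside a guest to spot backwards jumps or coarse granularity in the high-resolution timer; whether it ever trips depends on the hypervisor and guest OS:

```java
public class TimerSanityCheck {
    public static void main(String[] args) {
        long prev = System.nanoTime();
        long minPositiveStep = Long.MAX_VALUE;
        long backwardsJumps = 0;

        for (int i = 0; i < 10_000_000; i++) {
            long now = System.nanoTime();
            long delta = now - prev;
            if (delta < 0) {
                backwardsJumps++;               // time appeared to move backwards
            } else if (delta > 0 && delta < minPositiveStep) {
                minPositiveStep = delta;        // rough estimate of timer granularity
            }
            prev = now;
        }

        System.out.println("Backwards jumps    : " + backwardsJumps);
        System.out.println("Smallest step (ns) : " + minPositiveStep);
    }
}
```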
If you are developing in a virtual environment, you will want to make sure you know what specifications were used to create the environment. If you have, say, a 4 GB machine and create a virtual environment with 1 GB, you will want to make sure things in your development do not grow to the point where they overrun the memory. This will cause performance problems, and I personally ran into this: it was a pretty tricky thing to track down. The scenario was that I was fixing a bug and testing it in a virtual environment (I did not set up the virtual environment, by the way), and the application took a performance hit because of all the memory swapping that was taking place.
A very good use for a virtual environment is when you are developing applications that mess with the Windows GINA. It's much easier to reinstall a virtual environment than an entire PC... (been there, done that too).
I do all of my development on a virtual XP instance under VMWare Fusion so that I can use a Mac for everything and still write .NET code ;-)
Sometimes they are necessary, because the platform you are programming for doesn't support the standard developer environment. One such example is SharePoint. As of SharePoint 2007, you still need a server OS to install SharePoint 2007, WSS, and the Visual Studio SharePoint extensions (VSeWSS).
Thus, for SharePoint I have to use a Windows Server VM to do my development work. SharePoint 2010 will support installation on Vista and 7 x64, but I will still use a VM, because I don't want SharePoint on my main machine slowing everything down; rather, I want it in a VM where the services are on when needed and off when they're not, without having to manually turn each service on and off. This is in addition to the many great answers posted above.