What is called a Node in a WebSpere Network Deployment - deployment

In a installation of WebSphere Application Server with Network Deployment, a node is:
a physical machine
an instance of operative system
a logical set of WAS instances that is independent of physical machine or OS instance

Basically,
A server is a runtime environment, a process of execution.
A node is a grouping of servers that share common configuration. It is a physical machine.
A cell is a grouping of nodes into a sigle administrative domain. For websphere, it mean that if you group several servers within a cell, then you can administer them with one Websphere admin console
Hope this helps!

#ggasp Here is what I got off IBM's Information Center
A node is a logical grouping of managed servers.
A node usually corresponds to a logical or physical computer system with a distinct IP host address. Nodes cannot span multiple computers.
http://publib.boulder.ibm.com/infocenter/wasinfo/v6r1/index.jsp?topic=/com.ibm.websphere.nd.multiplatform.doc/info/ae/ae/cagt_node.html

Keep in mind that usually <> always.
Since WAS 6.0 and up you usually want to setup more than one node in each physical computer, given the usual power of the server you use the node to separate logical business entities.
Like for example have 6 nodes, 3 in each of 2 machines and have 1 pair of nodes you could define 3 different clusters one for each stage (dev, qa, staging) and making each cluster be invisible to the other.

A Cell is a virtual unit that is built of a Deployment Manager and one or more Nodes. A Node is another virtual unit that is built of a Node Agent and one or more Server instances.
Here you can find more details including a diagram.

Related

Consumen multiple network interfaces of single machine for kafka cluster

I have a Linux machine with 3 network interfaces,
let's say IPs are 192.168.1.101,192.168.1.102,192.168.1.103
I want to consume all 3 IPs of this single node to create a Kafka cluster with other nodes, Should all 3 IPs have their separate brokers?
Also using nic bonding is not recommended, all IPs need to be utilized
Overall, I'm not sure why you'd want to do this... If you are using separate volumes (log.dirs) for each address, then maybe you'd want separate Java processes, sure, but you'd still be sharing the same memory, and having that machine be a single point of failure.
In any case, you can set one process to have advertised.listeners list out each of those addresses for clients to communicate with, however, you'd still have to deal with port allocations in the OS, so you might need to set listeners like so
listeners=PLAINTEXT_1://0.0.0.0:9092,PLAINTEXT_2://0.0.0.0:9093,PLAINTEXT_3://0.0.0.0:9094
And make sure you have listener.security.protocol.map setup as well using those names
Note that clients will only communicate with the leader topic-partition at any time, so if you have one broker JVM process and 3 addresses for it, then really only one address is going to be utilized. One optimization for that could be your intra-cluster replication can use a separate NIC.

MongoDB sharding: mongos and configuration servers together?

We want to create a MongoDB shard (v. 2.4). The official documentation recommends to have 3 config servers.
However, the policies of our company won't allow us to get 3 extra servers for this purpose. Since we have already 3 application servers (1 web node, 2 process nodes) we are considering to put the configuration servers in the same application servers, with the mongos. Availability is not critical for us.
What do you think about this configuration? Can we face some problem or is it discouraged for some reason?
Given that Availability is not critical for your use case, I would say it should be fine to place the config servers in the same application servers and mongos.
If one of the process nodes is down, you will lose: 1 x mongos, 1 application server and 1 config server. During this down time, the other two config servers will be read-only , which means there won't be balancing of shards, modification to cluster config etc. Although your other two mongos should still be operational (CRUD wise). If your web-node is down, then you have a bigger problem to deal with.
If two of the nodes are down (2 process nodes, or 1 web server and process node), again, you would have bigger problem to deal with. i.e. Your applications are probably not going to work anyway.
Having said that, please consider the capacity of these nodes to be able to handle a mongos, an application server and a config server. i.e. CPU, RAM, network connections, etc.
I would recommend to test the deployment architecture in a development/staging cluster first under your typical workload and use case.
Also see Sharded Cluster High Availability for more info.
Lastly, I would recommend to check out MongoDB v3.2 which is the current stable release. The config servers in v3.2 are modelled as a replica set, see Sharded Cluster config servers for more info.

Messaging between torque jobs in a cluster

So, I need to submit computation intensive jobs (deep neural network training) to a torque cluster that lease computation time on, and I need to exchange a few megs of large float arrays every few minutes between the active nodes, as the nodes need to be working on the most recent version of the neural network in order to train it well.
I was wondering if there were any good communication options, at least to tell each active job its sisters jobs' ips so it can connect to them by tcp. The nodes don't have access to the internet, and we can't have daemons working on the job submitting server.
The only options that I see would be:
some message passing option on Torque (I'm am fairly noob at torque)
the very error prone option of using files to communicate, which I hate.
a way to query the ips of the active nodes from the server.
There are a variety of ways to exchange information between nodes on a cluster, depending on the architecture of the cluster. Torque is a resource manager, so if the job is being submitted to the cluster using a batch script there are a few environment variables that should be able to give you the hostnames or IP addresses of the nodes being used on a job.
The exact syntax for finding the IP addresses and/or hostnames will depend on the scheduler/workload manager being used with Torque on your cluster. This link has documentation for the PBS Works workload manager.
Parallel communication between nodes can be achieved in a variety of ways and will be partly dependent on the hardware available in the cluster. Using MPI is one of the most common ways to parallelize code for use on a cluster and many implementations support multiple high-performance fabrics/interconnect systems like Infiniband. Some useful introductions to the different types of parallelism can be found here.
As an alternative to MPI Remote Direct Memory Access(RDMA) can be used to pass and access information between nodes. If the cluster has Infiniband network adapters looking into the IB-Verbs API from the vendor would be an additional option for passing data between nodes.

Running Kubernetes locally

I am planning to test Kubernetes locally, but would like to ask some theoretic questions before.
I created a pipeline in python that takes as input a whole bunch of files from a directory, and created a docker image out of it (this is my Pod)
What I understood from the documentation is that the Kubernetes scheduler will choose automatically the minion to deploy for a given task, my question is, using an 8G memory laptop, is there a 'rule' to follow before creating the minion (specifying the number of minions to deploy) based on the amount of memory available in a machine (regardless if it is a laptop or a cluster) ?
Thanks
You would typically only ever have one minion/host. So if you are deploying your minions on physical hardware, there is a 1:1 mapping between minions and physical hosts.
If you are deploying into a virtual cluster on your laptop, you will want to make sure that each virtual minion has enough memory to run at least a single instance of whatever containers you plan on deploying. "How much is enough?" is a question that only you can answer.

OpenStack API Implementations

I have spent the last 6 hours reading through buzzword-riddled, lofty, high-level documents/blogs/articles/slideshares, trying to wrap my head around what OpenStack is, exactly. I understand that:
OpenStack is a free and open-source cloud computing software platform. Users primarily deploy it as an infrastructure as a service (IaaS) solution.
But again, that's a very lofty, high-level, gloss-over-the-details summary that doesn't really have meaning to me as an engineer.
I think I get the basic concept, but would like to bounce my understanding off of SO, and additionally I am having a tough time seeing the "forest through the trees" on the subject of OpenStack's componentry.
My understanding is that OpenStack:
Installs as an executable application on 1+ virtual machines (guest VMs); and
Somehow, all instances of your OpenStack cluster know about each other (that is, all instances running on all VMs you just installed them on) and form a collective pool of resources; and
Each OpenStack instance (again, running inside its own VM) houses the dashboard app ("Horizon") as well as 10 or so other components/modules (Nova, Cinder, Glance, etc.); and
Nova, is the OpenStack component/module that CRUDs VMs/nodes for your tenants, is somehow capable of turning the guest VM that it is running inside of into its own hypervisor, and spin up 1+ VMs inside of it (hence you have a VM inside of a VM) for any particular tenant
So please, if anything I have stated about OpenStack so far is incorrect, please begin by correcting me!
Assuming I am more or less correct, my understanding of the various OpenStack components is that they are really just APIs and require the open source community to provide concrete implementations:
Nova (VM manager)
Keystone (auth provider)
Neutron (networking manager)
Cinder (block storage manager)
etc...
Above, I believe all components are APIs. But these APIs have to have implementations that make sense for the OpenStack deployer/maintainer. So I would imagine that there are, say, multiple Neutron API providers, multipe Nova API providers, etc. However, after reviewing all of the official documentation this morning, I can find no such providers for these APIs. This leaves a sick feeling in my stomach like I am fundamentally mis-understanding OpenStack's componentry. Can someone help connect the dots for me?
Not quite.
Installs as an executable application on 1+ virtual machines (guest VMs); and
OpenStack isn't a single executable, there are many different modules, some required and some optional. You can install OpenStack on a VM (see DevStack, a distro that is friendly to VMs) but that is not the intended usage for production, you would only do that for testing or evaluation purposes.
When you are doing it for real, you install OpenStack on a cluster of physical machines. The OpenStack Install Guide recommends the following minimal structure for your cloud:
A controller node, running the core services
A network node, running the networking service
One or more compute nodes, where instances are created
Zero or more object and/or block storage nodes
But note that this is a minimal structure. For a more robust install you would have more than one controller and network nodes.
Somehow, all instances of your OpenStack cluster know about each other (that is, all instances running on all VMs you just installed them on) and form a collective pool of resources;
The OpenStack nodes (be them VMs or physical machines, it does not make a difference at this point) talk among themselves. Through configuration they all know how to reach the others.
Each OpenStack instance (again, running inside its own VM) houses the dashboard app ("Horizon") as well as 10 or so other components/modules (Nova, Cinder, Glance, etc.); and
No. In OpenStack jargon, the term "instance" is associated with the virtual machines that are created in the compute nodes. Here you meant "controller node", which does include the core services and the dashboard. And once again, these do not necessarily run on VMs.
Nova, is the OpenStack component/module that CRUDs VMs/nodes for your tenants, is somehow capable of turning the guest VM that it is running inside of into its own hypervisor, and spin up 1+ VMs inside of it (hence you have a VM inside of a VM) for any particular tenant
I think this is easier to understand if you forget about the "guest VM". In a production environment OpenStack would be installed on physical machines. The compute nodes are beefy machines that can host many VMs. The nova-compute service runs on these nodes and interfaces to a hypervisor, such as KVM, to allocate virtual machines, which OpenStack calls "instances".
If your compute nodes are hosted on VMs instead of on physical machines things work pretty much in the same way. In this setup typically the hypervisor is QEMU, which can be installed in a VM, and then can create VMs inside the VM just fine, though there is a big performance hit when compared to running the compute nodes on physical hardware.
Assuming I am more or less correct, my understanding of the various OpenStack components is that they are really just APIs
No. These services expose themselves as APIs, but that is not all they are. The APIs are also implemented.
and require the open source community to provide concrete implementations
Most services need to interface with an external service. Nova needs to talk to a hypervisor, neutron to interfaces, bridges, gateways, etc., cinder and swift to storage providers, and so on. This is really a small part of what an OpenStack service does, there is a lot more built on top that is independent of the low level external service. The OpenStack services include the support for the most common external services, and of course anybody who is interested can implement more of these.
Above, I believe all components are APIs. But these APIs have to have implementations that make sense for the OpenStack deployer/maintainer. So I would imagine that there are, say, multiple Neutron API providers, multipe Nova API providers, etc.
No. There is one Nova API implementation, and one Neutron API implementation. Based on configuration you tell each of these services how to interface with lower level services such as the hypervisor the networking stack, etc. And as I said above, support for a range of these is already implemented, so if you are using with ordinary x86 hardware for your nodes, then you should be fine.