How do others collect server names into a file like C:\Servers.csv - PowerShell

Nearly all PowerShell code leans heavily on a list of computer names, and maintaining that list is crucial for an up-to-date data center.
I used to gather computer names from my WSUS boxes. We have now switched to an inferior product which does not give up its secrets easily! How are others gathering accurate lists of computer names with which to fuel your code?
Grabbing names from Active Directory works well; however, not ALL servers are in the domain. All servers need updates, though, so let's get a list from the update server. Ah, there's the rub: inferior updating software! How is everyone gathering computer names for further processing?

For our data centers, I always query each of the domain controllers to get a list of systems instead of relying on CSV files (we no longer keep anything unjoined).
We used to have stand-alone boxes; back when we did, I would simply get their IPs or NetBIOS names by using VMware PowerCLI to query that info from vCenter.
Alternatively, if you only have VMs, you can just go that route directly.
If you have stand-alone physical machines, then you would want to scan your whole IP range: convert the items you got from the previous two steps into IPs first, so that you can exclude every address that is already accounted for, then check the NetBIOS replies (and the OS they report) to make sure the remaining hosts are Windows systems.
Alternatively, if all of the servers are in DNS, then just query the DNS entries from your domain controller, run a parallel computer-exists check on the results, select the unique names where the check succeeds, and that is your list.
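A rough sketch of the AD route, assuming the RSAT ActiveDirectory module is installed; the DNS variant is the same idea, just sourced from your zone records. The Export-Csv at the end is only there if you still want the C:\Servers.csv file from the question:

# Pull server names from AD, keep only the ones that still resolve and answer a ping
Import-Module ActiveDirectory
$names = Get-ADComputer -Filter 'OperatingSystem -like "*Server*"' -Properties OperatingSystem |
    Select-Object -ExpandProperty Name
$alive = $names | Where-Object {
    (Resolve-DnsName -Name $_ -ErrorAction SilentlyContinue) -and
    (Test-Connection -ComputerName $_ -Count 1 -Quiet)
}
$alive | Sort-Object -Unique |
    ForEach-Object { [pscustomobject]@{ ComputerName = $_ } } |
    Export-Csv -Path 'C:\Servers.csv' -NoTypeInformation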

I'm not sure where you're getting this idea that people are composing CSV lists of machines for data centers. Most data centers I'm aware of will have virtualization stacks and have complex monitoring software such as the various flavors of SolarWinds products. Usually these will allow you to export a customized CSV of the machines they monitor. Of course, like anything else, there are quite a few of these monitoring services out there.
For small and mid-sized companies, admins may or may not maintain an inventory of company assets. I've been contracted to come out to companies that have let their asset management system slip and have no idea where everything is, and I usually use a combination of command-line tools such as nmap, fping, and the PSADUC module to do some network discovery over several site days.
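For the discovery piece, even without nmap or fping, a quick-and-dirty sweep of a subnet in plain PowerShell looks something like this (the 10.0.0.x range is just a placeholder):

# Ping every host in a /24 and try to resolve the ones that answer
$subnet = '10.0.0'
1..254 | ForEach-Object {
    $ip = "$subnet.$_"
    if (Test-Connection -ComputerName $ip -Count 1 -Quiet) {
        $name = $ip
        try { $name = [System.Net.Dns]::GetHostEntry($ip).HostName } catch { }
        [pscustomobject]@{ IPAddress = $ip; Name = $name }
    }
}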
PowerShell isn't a great tool for data center use anyway, since most infrastructure uses some sort of Linux to host its virtual machines. Increasingly, the industry has moved towards containerization and microservices as well, which makes maintaining individual boxes less and less relevant.


What does "shared nothing" mode really mean?

We call Greenplum and Redshift MPP, or "shared nothing", but I don't really understand why.
Does it mean that during a multi-level join query each host computes on its own the whole time, with no hosts exchanging data with each other and no shuffle? Or is it some other situation?
What is the crux of "shared nothing"?
Shared nothing means that no two servers share the same data (aside from mirrors for high availability). A simple example would be a two node cluster where the data is distributed by gender_code. Node1 would have all of the males and node2 would have all of the females.
In the real world, you have way more nodes than just two so you distribute the data by something like an ID column. This gives you an even distribution of data across all of the nodes.
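To make "distribute by an ID column" concrete: the placement rule is essentially "hash (or modulo) the distribution key, and that picks the node". A toy PowerShell illustration of why that spreads rows evenly (not Greenplum's actual hash function):

# Toy placement: each row goes to node (Id mod node count), giving a roughly even spread
$nodeCount = 4
1..20 | ForEach-Object { [pscustomobject]@{ Id = $_; Node = $_ % $nodeCount } } |
    Group-Object Node | Select-Object Name, Count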
As you can probably guess, the optimizer has to be pretty smart to reduce the amount of data movement needed to execute a query. It also needs to slice the query into multiple parts so that it can execute the multiple slices of the query at once. Greenplum has been around for over 10 years and has a mature optimizer which can execute a wide variety of queries pretty well.
"Shared Nothing" is a description of what resources are shared between the processes running in parallel. So you may have shared memory approaches running on a single host, shared storage between multiple hosts or self contained systems with their own processing, RAM and storage. A deployment based a a few of these self contained systems would be described as "shared nothing".
In a shared-nothing system each node will store a subset of the data. Query planners in these systems try to do as much work as possible on the same host the data is stored on and move or shuffle as little data as possible (on Greenplum systems these steps in the query plan are called motions).
We call MPP 'shared nothing' as a way to compare Greenplum to something with a 'shared everything' architecture like Oracle RAC, which also has multiple servers in a cluster, but they all connect back to the same set of data files.

Enumerating lots of iSCSI volumes. Very slow. Having issues with programmatically getting disk number

I am currently writing some orchestration software for Hyper-V 2012 R2.
The orchestration platform as a whole also talks to other hypervisors, like Xen.
I am in the process of introducing new SAN storage and, due to some desirable features that exist at the storage level, I want to use a LUN-to-VM mapping on all hypervisors.
I am having real issues with managing this volume of iSCSI connections on Windows. But not in the way I thought I would...
I had heard that there were scaling issues with Windows and ‘lots of LUNs’, but I wanted to check for myself. I am seeing none of the issues other people have mentioned.
For instance, I can enumerate 500+ LUNs via diskpart in a second, and I can list all connected disks with Get-Disk in under a second. The issue comes from iSCSI scaling itself.
If anyone has a moment to read on, perhaps they could shed some light on why...
I have no issue programmatically connecting to the iSCSI targets, but I seem to have real issues when I start trying to get session information (which I need to get other information).
i.e. there seems to be no way to specify which disk number/address an iSCSI target receives at the point it is connected (unless I'm mistaken). I can work backwards from the IQN via WMI, via a call to
MSiSCSIInitiator_SessionClass
When you start talking about 100+ connected volumes, a call to this class can sometimes take over 10 minutes to return. If I test it via PowerShell with something like:
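# WQL query against the iSCSI initiator provider in root\WMI; $iqn already holds the target's IQN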
$query = "Select * from MSiSCSIInitiator_SessionClass Where TargetName='$iqn'"
Get-WmiObject -Namespace "root\WMI" -Query $query
...you can see it get stuck midway through enumerating the volumes. It will pause.
I haven’t run the exact numbers, but every additional volume seems to put about 3-4 (or more) seconds on the total time a query takes to return.
It gets a bit weirder. Windows 2012 has some built-in iSCSI cmdlets. I can get a connected iSCSI target object in under a second with
Get-IscsiTarget -NodeAddress blah
I can get an iSCSI connection object using
$iscsi_target_object | Get-IscsiConnection
... all in under a second. These must be related to iSCSI session information in some way.
A call, no matter how I package it, to Get-IscsiSession takes about 10 minutes to return.
The Hyper-V Manager GUI is also terribly slow when opening the settings page for a VM, presumably because it is enumerating possible pass-through disks via their iSCSI session. This also takes around ten minutes.
A query to Msvm_DiskDrive in root\virtualization also takes an age to return.
Again, diskpart, Get-Disk etc all return in seconds. I can refresh all iSCSI targets on the system in about a minute with 500+ targets. I thought that was going to be the hard bit.
So, I have two questions.
First of all, does this sound right? Is there anything at all I might be doing that would
impact the speed at which WMI calls are returned? Can I speed WMI up at all?
Secondly, can anyone think of any other way, other than the MSiSCSIInitiator_SessionClass, that I could derive a disk number from an IQN? That might solve the bulk of my problems. Perhaps there are other routes to this information I might have missed.
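One idea I have been toying with, in case it is useful to anyone answering: pay the cost of the session class once, with no per-IQN filter, and build an in-memory lookup from it rather than querying per target. I have not verified that the Devices property reliably exposes DeviceNumber on every build, so treat this as a sketch only:

# One bulk query, then a hashtable mapping IQN -> disk number
$sessions = Get-WmiObject -Namespace 'root\WMI' -Class MSiSCSIInitiator_SessionClass
$iqnToDisk = @{}
foreach ($s in $sessions) {
    foreach ($d in $s.Devices) {
        # assumes each device entry carries a DeviceNumber; not verified on every OS build
        $iqnToDisk[$s.TargetName] = $d.DeviceNumber
    }
}
$iqnToDisk[$iqn]

(If the Storage module's Get-Disk can bind directly to iSCSI target or session objects, that might be another route, but I have not tested whether it is any faster.)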
Cheers.

MongoDB: Can different databases be placed on separate drives?

I am working on an application in which there is a pretty dramatic difference in usage patterns between "hot" data and other data. We have selected MongoDB as our data repository, and in most ways it seems to be a fantastic match for the kind of application we're building.
Here's the problem. There will be a central document repository, which must be searched and accessed fairly often: its size is about 2 GB now, and will grow to 4 GB in the next couple of years. To increase performance, we will be placing that DB on a server-class mirrored SSD array, and given the total size of the data, we don't imagine that memory will become a problem.
The system will also be keeping record versions, audit trails, customer interactions, notification records, and the like, which will be referenced only rarely and could grow quite large. We would like to place this on more traditional spinning disks, as it would be accessed rarely (we're guessing that a typical record might be accessed four or five times per year, and will be needed only to satisfy research and customer service inquiries).
I haven't found any reference material that indicates whether MongoDB would allow us to place different databases on different disks (we're running mongod under Windows, but that doesn't have to be the case when we go into production).
Sorry about all the detail here, but these are primary factors we have to think about as we plan for deployment. Given Mongo's proclivity to grab all available memory, and that it'll be running on a machine that maxes out at 24GB memory, we're trying to work out the best production configuration for our database(s).
So here are what our options seem to be:
Single instance of Mongo with multiple databases. This seems to have the advantage of simplicity, but I still haven't found any definitive answer on how to split databases onto different physical drives on the machine.
Two instances of Mongo, one for the "hot" data, and the other for the archival stuff. I'm not sure how well Mongo will handle two instances of mongod contending for resources, but we were thinking that, since the 32-bit version of the server is limited to 2GB of memory, we could use that for the archival stuff without having it overwhelm the resources of the machine. For the "hot" data, we could then easily configure a 64-bit instance of the database engine to use an SSD array, and given the relatively small size of our data, the whole DB and indexes could be directly memory mapped without page faults.
Two instances of Mongo in two separate virtual machines. We could use VMware, or something similar, to create two Linux machines which could host Mongo separately. While it might up the administrative burden a bit, this seems to me to provide the most fine-grained control of system resource usage, while still leaving the Windows Server host enough memory to run IIS and its own processes.
But all this is speculation, as none of us have ever done significant MongoDB deployments before, so we don't have a great experience base to draw upon.
My actual question is whether there are options to have two databases in the same mongod server instance utilize entirely separate drives. But any insight into the advantages and drawbacks of our three identified deployment options would be welcome as well.
That's actually a pretty easy thing to do when using Linux:
Activate the directoryPerDB config option
Create the databases you need.
Shut down the instance.
Copy the data from the individual database directories to the different block devices (disks, RAID arrays, logical volumes, iSCSI targets and the like).
Mount the respective block devices at their corresponding positions below the dbpath directory (don't forget to add the corresponding lines to /etc/fstab!).
Restart mongod.
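Since the question mentions running mongod under Windows for now, the rough Windows equivalent of the steps above would be to mount the slow volume as a folder below the dbpath and run mongod with one directory per database. This is an untested sketch; the disk number, paths, and the "archive" database name are placeholders:

# Mount the spinning-disk volume (assumed here to be disk 2, partition 1) under the dbpath
New-Item -ItemType Directory -Path 'C:\data\db\archive' -Force | Out-Null
Add-PartitionAccessPath -DiskNumber 2 -PartitionNumber 1 -AccessPath 'C:\data\db\archive'
# With --directoryperdb, the database named "archive" keeps its files in that mounted folder
mongod --dbpath 'C:\data\db' --directoryperdb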
Edit: As a side note, I would like to add that you should not use Windows as the OS for a production MongoDB deployment. The available filesystems, NTFS and ReFS, perform horribly compared to ext4 or XFS (the latter being the suggested filesystem for production; see the MongoDB production notes for details). For this reason alone, I would suggest Linux. Another reason is the RAM used by rather unnecessary subsystems of Windows, like the GUI.

NoSql option for MMORPGs

I was wondering what sort of benefits a nosql option might have either in lieu of, or in conjunction with a rdbms for a typical MMORPG. In particular, I'm curious as to how data integrity is preserved under nosql.
I'll give a few examples where I'm kind of confused as to how server logic would operate:
Scenario 1: Let's say player A and B attack player C at the same time. A player_stats.kill_count is updated for whichever player kills player C. Normally, one might handle this AttackPlayerRequest in the following way:
Begin Transaction
{
attacker.health = newAttackerHealth
defender.health = newDefenderHealth
if defender.health == 0
{
attacker.stats.kills += 1
}
}
Scenario 2: Player A sells an item to an NPC vendor, and quickly attempts to drop the same item from their inventory to the ground. (Even if your UI disallows this, they can certainly just put the events on the wire.)
This list obviously goes on and affects just about every player-player and player-world interaction. One last piece of information is that the server is threaded, as we don't want any particularly lengthy request to block other players from being able to do simple things in the game.
I guess my big question here is, am I misunderstanding something about nosql wherein this a trivial problem? If not, is this beyond the domain of what nosql options are meant to solve? Under a typical MMO, where might one inject nosql options to increase server performance?
This all depends on how your game operates.
NoSQL
Farmville on Facebook uses NoSQL. Farmville acts like one enormous server that everyone is on. Let's say Farmville uses 3 computers in its database cluster: Computer 1 in San Diego, Computer 2 in New York, Computer 3 in Moscow. I log in to Facebook from San Diego, so I connect to the database here. I level up and log out. Eventually Computer 1 will tell Computers 2 and 3 about how I leveled, but it could be up to an hour before someone in Russia would see those changes to my account.
NoSQL scaling is as simple as adding another computer to the cluster and telling the website in that region to use that computer.
SQL
There are several methods of making SQL work, but I will explain a way to make it similar to NoSQL.
Each data center will have 10 game servers, each with its own SQL database, plus 1 Sub Master database. The Sub Master database shares the information between all 10 servers, so if you log in to Server 1 and make the character John Doe, then log out, Server 10 will have your character if you log in to that server next.
The Sub Master then shares its information with the Master server at your HQ. The Master server will then share John Doe with every other Sub Master at all the other data centers, and those Sub Masters will update their game servers. This configuration would allow you to log in to server Weed in San Francisco, play your character, log out, log in on server Vodka in Moscow, and still see your character.
Games like World of Warcraft use SQL, but only certain parts of the database are shared. This allows for a reduced database size on each computer and also reduces hardware requirements.
In a real-life situation each Sub Master would have a backup Sub Master, and your Master server would have a few backup servers, so that if one server goes down your entire network would not lock up.
I think MMO servers would do a lot of the stuff you've listed in memory. I don't think they flush all of those out into the database. I think more hard-set events like receiving new gear or changing your gear layout would be saved.
It's a lot of data, but not nearly as much as every single combat interaction.
MMORPGs that are worried about latency typically run on a single server and store everything except object model information in RAM. Anything with battle environments is worried about latency.
Anything that uses an HDD, be it an RDBMS or NoSQL, is a bad idea in battle environments. Updating and sharing object customization, position, movement vector, orientation vector, health, power, etc. between 10,000 users on a single server via any mechanism that touches the HDD is just silly if you are worried about latency. Even if you have 2 people fighting in a simple arena, latency is still an issue and you should run on a single server and keep everything in memory. Everything that will be rendered should be on your PC or in the memory of your PC as well.
MMORPGs that are not worried about latency can run on multiple servers and any database you wish (NoSQL, MySQL, etc...)
If you're making a game like Farmville or Neopets, then yes, NoSQL or MySQL become valid options.
To my understanding, NoSQL technology is something you would use in a batch-job context, not in a real-time context. As a matter of fact, the open-source HBase implementation sits on top of the Hadoop Distributed File System (HDFS), which is primarily used for read-only situations.
The idea is that you 'seek' a record by iterating over the whole data set you are keeping 'on disk' (on a distributed file system actually). For petascale datasets it is infeasible to keep things in memory, so you utilize massive amounts of disk reads to be able to 'seek' the data at all. The distributed nature makes sure (or facilitates) that this occurs in a timely fashion.
The principle is that you bring the processing to the distributed data, instead of pulling data from a very large database, sending it over the network and then processing it on a local node. Pig (http://pig.apache.org/) and Hive (http://hive.apache.org/) are interfaces on top of HDFS which allow for SQL-like queries, but in the background they use MapReduce jobs. Interaction is slow, but that is not the point. The point is that the immense data set can be processed at all.
For gaming, this is obviously not the way to go. So you would expect data to be fetched from the distributed file system, cached on a server, and used during a session. When the game has finished, all data would be committed back to the data set. (At least, that's how I imagine it would go.)
HDFS data sets are mostly processed at a certain point in time, after which the results are published in one batch. For example, on a social network site, you would compile all connection data on a daily basis, search for all new connections between people, and when the operation has finished for all people in the data set, the results would be published in the typical 'X is now connected to Y' messages. This does not happen in real time.

What is a cluster in a RDBMS?

Please explain what a cluster is in an RDBMS.
In SQL, a cluster can also refer to a specific physical ordering of rows.
For example, consider a database with two tables: INVOICES and INVOICE_ITEMS. If many INVOICE_ITEMS are inserted concurrently, chances are that items of the same invoice end up in multiple physical blocks of the underlying storage. When reading such an invoice, unneeded data will be read together with the interesting rows. Clustering INVOICE_ITEMS on the foreign key to INVOICES groups the item rows of the same invoice together in the same block, thus reducing the number of read operations needed when accessing the invoice.
Read about clustered index on wikipedia.
In system administration, a "cluster" is a number of servers configured to provide the same service, but look like one server to the users.
This can be done for performance reasons (two servers can answer more requests than a single one) or redundancy (if one server crashes, the others still work).
Such configurations often need special software or setup to work. Some services, like serving static web content, can be clustered very easily. Others, like RDBMS, need complicated replication schemes to coordinate.
Read about computer clusters on wikipedia.
In statistics, a cluster is a "group of items so that objects from the same cluster are more similar to each other than objects from different clusters."
Read about Cluster analysis on wikipedia.
From here:
High-availability clusters (also known as HA Clusters or Failover Clusters) are computer clusters that are implemented primarily for the purpose of providing high availability of services which the cluster provides. They operate by having redundant computers or nodes which are then used to provide service when system components fail. Normally, if a server with a particular application crashes, the application will be unavailable until someone fixes the crashed server. HA clustering remedies this situation by detecting hardware/software faults, and immediately restarting the application on another system without requiring administrative intervention, a process known as Failover.
In a database context it can have two completely different meanings:
it may either mean data clustering or index clustering, which is the grouping of similar rows. This is useful for data mining; some databases (e.g. Oracle) also use it to optimize physical data organization;
or a cluster as a database running on many closely linked servers.
Clustering, in the context of databases, refers to the ability of several servers or instances to connect to a single database.
An instance is the collection of memory and processes that interacts with a database, which is the set of physical files that actually store data.