In PowerShell, is it good practice to write multiple functions in a single script file?

I was asked to script a number of actions in PowerShell, such as app pool creation, SQL updates, file editing, and so on.
This is the first time I am writing something of this size as a script.
So I would like to know the best practice before writing it.
Is it good practice to write all the functions in a single file?
I expect to need at least 10 functions, each of roughly 10 lines of code.

Consider modules: the simplest format is a manifest (.psd1) and a single script file (.psm1) containing all the functions, aliases, ... the module exports (plus any internal helpers).
In this case you are clearly putting multiple connected functions in one file. Even if much of the code is only dot-sourced into the script module, it is still logically one entity.
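As a rough sketch of that shape (the function names and bodies below are made up for illustration), the .psm1 might look like this:

function New-MyAppPool {
    param([string]$Name)
    # ... IIS app pool creation would go here ...
    Write-Verbose "Creating application pool '$Name'"
}

function Update-MyDatabase {
    param([string]$ConnectionString, [string]$ScriptPath)
    # ... SQL update logic would go here ...
}

# Only the public functions are exported; internal helpers stay private.
Export-ModuleMember -Function New-MyAppPool, Update-MyDatabase

A matching manifest can then be generated with New-ModuleManifest -Path MyTools\MyTools.psd1 -RootModule MyTools.psm1 (module name assumed here).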
On the other hand, if you use scripts on your path so they can be executed without being loaded beforehand (as per Adriano's comment on the question), that tends to support one function per file, written at script scope rather than as a function statement.
Therefore: there is no one "good practice": it all depends on the details of the circumstance.

Be pragmatic; truth comes from action, not from words ;O)
So begin at the beginning:
1) Check whether the thing you want to do already exists somewhere on the internet, e.g. on PoshCode (if so, you can adapt it).
2) Think about your functions (but not too much); the goal is to reuse code (write your algorithm in pseudo-code).
3) Search the internet for functions that already exist.
4) Write all functions in the same file as the main code so you can test them. During this phase you will discover new functions to add and parameters to add to or remove from existing ones.
5) Once you have tested your code, put the reusable functions (and the ones they depend on) into one or more modules.

My solution would be to create a custom module to which functions can be added later.
You can save your single file with all the functions as mymodule.psm1 in a mymodule folder under one of the paths listed in $env:PSModulePath.
Then run Import-Module mymodule (or, better, call it in your $profile so the module is ready when the console comes up).
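A rough sketch of those steps (the module name mymodule comes from this answer; the Documents-based folder is just one of the locations listed in $env:PSModulePath and may differ on your machine):

# Create a folder for the module under a path that PowerShell already searches
$target = Join-Path ([Environment]::GetFolderPath('MyDocuments')) 'WindowsPowerShell\Modules\mymodule'
New-Item -ItemType Directory -Path $target -Force | Out-Null

# Save the single file containing all of the functions as mymodule.psm1
Copy-Item .\mymodule.psm1 -Destination $target

# Load it in the current session (or put this line in $PROFILE)
Import-Module mymodule
Get-Command -Module mymodule    # check which functions the module exposes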

Related

PowerShell Remove-Variable cmdlet: do I need to call it at the end of each function/scriptblock?

This is a generic question, no code.
I'm not sure whether I need to remove local variables, as I thought that should be handled by the PowerShell engine.
I had a script that gathered info from WMI and used a lot of local variables. The output was messed up when running it multiple times, but it got fixed after I cleaned up all the local variables at the end of each function/scriptblock.
Any thoughts/idea would be appreciated.
The trouble does not come from the fact that you do not remove your variables, but from at least two beginner's mistakes (also made by lazy developers like me, supposing I am a developer):
we forget to initialize our variables before using them;
we do not test every return value of our function or cmdlet calls.
Once these two things are done (the code will at least double in size), you can rerun your script without cleaning up anything except the processed data.
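As a small illustration of both points (the function, WMI class, and variable names are just examples), initializing collections inside the function keeps a second run from picking up leftovers from the first:

function Get-DiskReport {
    # Initialize before use so repeated runs start from a clean state
    $report = @()

    $disks = Get-CimInstance -ClassName Win32_LogicalDisk
    if (-not $disks) {
        # Test the return value of the call instead of assuming it worked
        Write-Warning 'No disk information was returned'
        return
    }

    foreach ($disk in $disks) {
        $report += [pscustomobject]@{
            Drive  = $disk.DeviceID
            FreeGB = [math]::Round($disk.FreeSpace / 1GB, 2)
        }
    }
    $report
}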
For me, scripting is most of the time done on the corner of a table, not even pushed to a source repository.
So start scripting and ask yourself fewer questions.

Using variables defined in R environment within knitr

I'd like to use knitr to produce a document which includes numbers, tables, and figures defined separately in R. I have a separate .R script within which I define the variables of interest, and I have executed this script and verified the variables are in memory. Then, within a .Rmd file I have the R markup code, within which I attempt to display the variables I've defined in R. I keep getting error messages whenever I attempt to knit, stating something like:
Error in unique(c("AsIs", oldClass(x))) : object "lamdaQ6" not found...
Clearly the knitr process initiates a new environment which excludes already defined variables. I have rather extensive R code to define the variables I want to include in the document, which I want to keep separate from the R markup code (both for clarity, and because R markup is not a good development environment for R).
Is there some means of preserving awareness of existing variables in R memory within knitr? I have searched extensively and not found a solution, probably because I don't know the correct term.
The solution offered by Paul Roub worked -- I thought it didn't, but that turned out to be due to a typo of my own.
A cleaner solution for my needs, rather than sourcing the whole .R file, was to use save() to store just the objects I needed to create the document. I added a line to save the objects at the end of the .R file, then used load() to load them at the top of the .Rmd file.
So, at the end of the .R file, to save objects XX and YY, I have:
save(XX,YY,file="Data4Rmd.RData")
Then, near the beginning of the .Rmd file, to import these objects, I have:
```{r, echo=FALSE}
setwd("c:/CurrentProjectDirectory")  # use forward slashes (or doubled backslashes) in R paths
load(file = "Data4Rmd.RData")
```
This allows me to later use statements like:
Decay rate is lambda = `r XX`, and mean lifetime is `r YY`.
Yet another alternative might have been to save and load the whole workspace, but this seemed excessive.
Thanks for the assistance Paul, much appreciated :)

PowerShell: Creating Classes

Is it possible to create classes in PowerShell, so that all the form objects can live in one class, the button clicks in a second class, and the functions in a third?
It's just an idea: I've got a script (+GUI) that's just under 900 lines of code and it's getting a little unmanageable. Even though I've divided the code into three blocks separated by comments (the three sections are functions, forms and click_events), I still have to scroll from the top of the script to the bottom whenever I want to add a function to a click_event.
It sounds like you are looking for the concept of a library.
You can dot-source .ps1 files as shown here.
You can create a module (a .psm1 file); this is not mandatory, but it is the better approach, as shown here.
Dot-sourcing is the old-fashioned way, dating from PowerShell 1.0. PowerShell 2.0 brought modules, which are more manageable and really deliver the concept of a library (script or binary).
Now just think about reusing your libraries between different scripts.
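A quick sketch of both options (the file and module names ClickHandlers.ps1 and GuiHelpers are hypothetical):

# Dot-sourcing: runs the file in the current scope, so its functions
# become available to the rest of the script
. "$PSScriptRoot\ClickHandlers.ps1"

# Module: put the functions in GuiHelpers\GuiHelpers.psm1 under a folder
# listed in $env:PSModulePath, then load the module by name
Import-Module GuiHelpers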
I handle this problem by breaking my script into several scripts, split by function, and then using the Import-Module cmdlet. I even have a couple of modules that import other modules of lower-level functions.

Argument passing strategy - environment variables vs. command line

Most of the applications we developers write need to be externally parametrized at startup. We pass file paths, pipe names, TCP/IP addresses, etc. So far I've been using the command line to pass these to the application being launched. I had to parse the command line in main and direct the arguments to where they're needed, which is of course good design, but it is hard to maintain for a large number of arguments. Recently I decided to use the environment variable mechanism. Environment variables are global and accessible from anywhere, which is less elegant from an architectural point of view, but it limits the amount of code.
These are my first (and possibly quite shallow) impressions on both strategies but I'd like to hear opinions of more experienced developers -- What are the ups and downs of using environment variables and command line arguments to pass arguments to a process? I'd like to take into account the following matters:
design quality (flexibility/maintainability),
memory constraints,
solution portability.
Remarks:
Ad. 1. This is the main aspect I'm interested in.
Ad. 2. This is a bit pragmatic. I know of the limits on Windows, which are currently huge (over 32 kB for both the command line and the environment block). I guess this is not an issue though, since you should just use a file to pass tons of arguments if you need to.
Ad. 3. I know almost nothing about Unix, so I'm not sure whether both strategies are as usable there as on Windows. Elaborate on this if you please.
1) I would recommend avoiding environmental variables as much as possible.
Pros of environmental variables
easy to use because they're visible from anywhere. If lots of independent programs need a piece of information, this approach is a whole lot more convenient.
Cons of environmental variables
hard to use correctly because they're visible (delete-able, set-able) from anywhere. If I install a new program that relies on environmental variables, are they going to stomp on my existing ones? Did I inadvertently screw up my environmental variables when I was monkeying around yesterday?
My opinion
use command-line arguments for those arguments which are most likely to be different for each individual invocation of the program (e.g. n for a program which calculates n!)
use config files for arguments which a user might reasonably want to change, but not very often (e.g. the display size when the window pops up)
use environmental variables sparingly -- preferably only for arguments which are expected not to change (e.g. the location of the Python interpreter); a sketch of this split follows below
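A tiny PowerShell sketch of that split (the parameter, file, and variable names are all invented for illustration):

param(
    # changes on every run: pass it on the command line
    [Parameter(Mandatory)]
    [int]$N
)

# changes occasionally: keep it in a config file next to the script
$config = Get-Content -Raw -Path "$PSScriptRoot\settings.json" | ConvertFrom-Json

# expected to (almost) never change: read it from the environment
$interpreter = $env:PYTHON_HOME

Write-Host "Computing $N! with window size $($config.WindowSize); interpreter at $interpreter"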
your point They are global and accessible from anywhere, which is less elegant from architectural point of view, but limits the amount of code reminds me of justifications for the use of global variables ;)
My scars from experiencing first-hand the horrors of environmental variable overuse
two programs we need at work, which can't run on the same computer at the same time due to environmental clashes
multiple versions of programs with the same name but different bugs -- brought an entire workshop to its knees for hours because the location of the program was pulled from the environment, and was (silently, subtly) wrong.
2) Limits
If I were pushing the limits of either what the command line can hold, or what the environment can handle, I would refactor immediately.
I've used JSON in the past for a command-line application which needed a lot of parameters. It was very convenient to be able to use dictionaries and lists, along with strings and numbers. The application only took a couple of command line args, one of which was the location of the JSON file.
Advantages of this approach
didn't have to write a lot of (painful) code to interact with a CLI library -- it can be a pain to get many of the common libraries to enforce complicated constraints (by 'complicated' I mean more complex than checking for a specific key or alternation between a set of keys)
don't have to worry about the CLI library's requirements for the order of arguments -- just use a JSON object!
easy to represent complicated data (answering What won't fit into command line parameters?) such as lists
easy to use the data from other applications -- both to create and to parse programmatically
easy to accommodate future extensions
Note: I want to distinguish this from the .config-file approach -- this is not for storing user configuration. Maybe I should call this the 'command-line parameter-file' approach, because I use it for a program that needs lots of values that don't fit well on the command line.
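For example, a parameter-file version of this idea in PowerShell could look like the sketch below (the file layout and key names such as Server and Jobs are made up; only the path to the JSON file travels on the command line):

param(
    [Parameter(Mandatory)]
    [string]$ParameterFile    # the only real command-line argument
)

# Everything else comes out of the JSON document
$settings = Get-Content -Raw -Path $ParameterFile | ConvertFrom-Json

Write-Host "Target server: $($settings.Server)"
foreach ($job in $settings.Jobs) {
    Write-Host "Would run job '$($job.Name)' with timeout $($job.TimeoutSeconds)s"
}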
3) Solution portability: I don't know a whole lot about the differences between Mac, PC, and Linux with regard to environmental variables and command line arguments, but I can tell you:
all three have support for environmental variables
they all support command line arguments
Yes, I know -- it wasn't very helpful. I'm sorry. But the key point is that you can expect a reasonable solution to be portable, although you would definitely want to verify this for your programs (for example, are command line args case sensitive on any platforms? on all platforms? I don't know).
One last point:
As Tomasz mentioned, it shouldn't matter to most of the application where the parameters came from.
You should abstract reading parameters using the Strategy pattern. Create an abstraction named ConfigurationSource with a readConfig(key) -> value method (or one returning some Configuration object/structure), with the following implementations:
CommandLineConfigurationSource
EnvironmentVariableConfigurationSource
WindowsFileConfigurationSource - loading from a configuration file from C:/Document and settings...
WindowsRegistryConfigurationSource
NetworkConfigurationSource
UnixFileConfigurationSource - loading from a configuration file from /home/user/...
DefaultConfigurationSource - defaults
...
You can also use the Chain of Responsibility pattern to chain sources in various configurations, for example: if the command line argument is not supplied, try the environment variable, and if everything else fails, return the defaults. A sketch of both patterns follows below.
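A rough sketch of both ideas, rendered here in PowerShell 5+ classes (two of the sources from the list above; the LogLevel key and its default value are invented):

class ConfigurationSource {
    [string] ReadConfig([string]$key) { return $null }    # base: no value
}

class EnvironmentVariableConfigurationSource : ConfigurationSource {
    [string] ReadConfig([string]$key) {
        return [Environment]::GetEnvironmentVariable($key)
    }
}

class DefaultConfigurationSource : ConfigurationSource {
    hidden [hashtable] $Defaults = @{ LogLevel = 'Info' }
    [string] ReadConfig([string]$key) { return $this.Defaults[$key] }
}

# Chain of responsibility: the first source that yields a value wins
function Get-ConfigValue {
    param([string]$Key, [ConfigurationSource[]]$Sources)
    foreach ($source in $Sources) {
        $value = $source.ReadConfig($Key)
        if ($value) { return $value }
    }
    return $null
}

$sources = @(
    [EnvironmentVariableConfigurationSource]::new(),
    [DefaultConfigurationSource]::new()
)
Get-ConfigValue -Key 'LogLevel' -Sources $sources   # env var if set, otherwise the default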
Ad 1. This approach not only allows you to abstract reading configuration, it also lets you easily change the underlying mechanism without any effect on client code. You can also use several sources at once, falling back or gathering configuration from different sources.
Ad 2. Just choose whichever implementation is suitable. Of course some configuration entries won't fit for instance into command line arguments.
Ad 3. If some implementations aren't portable, have two, one silently ignored/skipped when not suitable for a given system.
I think this question has been answered rather well already, but I feel it deserves a 2018 update. An unmentioned benefit of environment variables is that they generally require less boilerplate code to work with, which makes for cleaner, more readable code. However, a major disadvantage is that they remove a layer of isolation between different applications running on the same machine. I think this is where Docker really shines. My favorite design pattern is to exclusively use environment variables and run the application inside a Docker container, which removes the isolation issue.
I generally agree with previous answers, but there is another important aspect: usability.
For example, in git you can create a repository with the .git directory kept outside of the working tree. To specify that, you can use the command line argument --git-dir or the environmental variable GIT_DIR.
Of course, if you change the current directory to another repository or inherit environmental variables in scripts, you can make mistakes. But if you need to type several git commands against a detached repository in one terminal session, this is extremely handy: you don't need to repeat the --git-dir argument.
Another example is GIT_AUTHOR_NAME. It seems it doesn't even have a command line counterpart (however, git commit has an --author argument). GIT_AUTHOR_NAME overrides the user.name and author.name configuration settings.
In general, using command line arguments or environmental variables is equally simple on UNIX: one can use a command line argument
$ command --arg=myarg
or an environmental variable in one line:
$ ARG=myarg command
It is also easy to capture command line arguments in an alias:
alias cfg='git --git-dir=$HOME/.cfg/ --work-tree=$HOME' # for dotfiles
alias grep='grep --color=auto'
In general most arguments are passed through the command line. I agree with the previous answers that this is more functional and direct, and that environmental variables in scripts are like global variables in programs.
GNU libc says this:
The argv mechanism is typically used to pass command-line arguments specific to the particular program being invoked. The environment, on the other hand, keeps track of information that is shared by many programs, changes infrequently, and that is less frequently used.
Apart from what was said about the dangers of environmental variables, there are good use cases for them. GNU make has very flexible handling of environment variables (and is thus tightly integrated with the shell):
Every environment variable that make sees when it starts up is transformed into a make variable with the same name and value. However, an explicit assignment in the makefile, or with a command argument, overrides the environment. (-- and there is an option to change this behaviour) ...
Thus, by setting the variable CFLAGS in your environment, you can cause all C compilations in most makefiles to use the compiler switches you prefer. This is safe for variables with standard or conventional meanings because you know that no makefile will use them for other things.
Finally, I would stress that what matters most for a program is not the programmer's experience but the user experience. Maybe you included that in the design aspect, but internal and external design are pretty different entities.
And a few words about programming aspects. You didn't write what language you use, but let's imagine your tools allow you the best possible argument parsing. In Python I use argparse, which is very flexible and rich. To get the parsed arguments, one can use a command like
args = parser.parse_args()
args can be further split into parsed arguments (say args.my_option), but I can also pass them as a whole to my function. This solution is absolutely not "hard to maintain for a large number of arguments" (if your language allows that). Indeed, if you have many parameters and they are not used during argument parsing, pass them in a container to their final destination and avoid code duplication (which leads to inflexibility).
And the very final comment is that it's much easier to parse environmental variables than command line arguments. An environmental variable is simply a pair, VARIABLE=value. Command line arguments can be much more complicated: they can be positional or keyword arguments, or subcommands (like git push). They can capture zero or several values (recall the command echo and flags like -vvv). See argparse for more examples.
And one more thing. Your worrying about memory is a bit concerning. Don't write overly general programs. A library should be flexible, but a good program is useful without any arguments. If you need to pass a lot, that is probably data, not arguments. How to read data into a program is a much more general question with no single solution for all cases.

How to split long Perl code into several files without too much manual editing?

How do I split a long Perl script into two or more different files that can all access the same variables - without having to rename all shared variables from e.g. $count to $::count (or $main::count which is the same)?
In other words, what's the best and simplest way to split the Perl script into several files without having to import a lot of variables/functions and/or do a lot of manual editing.
I assume it has something to do with making the code part of the same package/scope/namespace, but my experiments so far have failed.
I am not sure it makes a difference, but the script is used for web/CGI purposes and will be running under mod_perl.
EDIT - Background:
I kind of knew I would get that response. The reason I want to split up the file is the following:
Currently I have a single very old and very long Perl file. I know it is not following Perl best practices but it works.
The problem is, I need to distribute the data files it uses between different web servers, first of all for performance reasons. There will be one "master" server and one or several "slaves".
About 20% of the mentioned Perl file contains shared functions, 40% is code that needs to run on the master server, and 40% is code for the slave servers. Therefore, I would like to split the code into three files: 1. shared, 2. master-only, 3. slave-only. On the master server, files 1 and 2 will be loaded; on the slaves, files 1 and 3.
I assume this approach would use less process RAM and, more importantly, would minimize the risk of splitting the code incorrectly (e.g. a slave process calling a master data file). I don't see a great need for modularization, as the system works and the code does not need a lot of changes or exchanges with other projects.
EDIT 2 - Solution:
Found the solution I was looking for here:
http://www.perlmonks.org/?node_id=95813
In cases where the main package is in ownership of the variable, the actual word 'main' can be omitted to yield something like: $::var
It is possible to get around having to fully qualify variable names when strict is in use. Applying a simple use vars to your script, with the variable names as its arguments, will get around explicit package names.
Actually, I ended up repeating the our ($count, etc...) statement for the needed variables instead of use vars ();
Do let me know if I am missing something vital - apart from not going with modules! :)
@Axeman, thanks, I will accept your answer, both for your effort and for sending me in the right direction.
Unless you put different package statements in their files, they will all be treated as if they had package main; at the top. So assuming that the scripts use package variables, you shouldn't have to do anything. If you have declared them with my (that is, if they are lexically scoped variables) then you would have to make sure that all references to the variables are in the same file.
But splitting scripts up by length is a rotten substitute for modularization. Yes, modularization helps keep code length down, but modularization is the proper way to keep code length down -- for all the reasons you would want to keep code length down, modularization does it best.
If chopping the files by length could really work for you, then you could create a script like this:
do '/path/to/bin/part1.pl';
do '/path/to/bin/part2.pl';
do '/path/to/bin/part3.pl';
...
But I kind of suspect that if the organization of this code is as bad as you're -- sort of -- indicating, it might suffer from some of the same reinventing of the wheel that I've seen in Perl-ignorant scripts. Just offhand (I might be wrong), I'm thinking you would be surprised how much could be trimmed from the length simply by substituting better-tested Perl library idioms for the for-looping and while-ing of everything.