Verifying Automation

If you’re anything like me, you’ve been bitten by the PowerShell bug and are using it, among other automation tools, to make your life in IT much more enjoyable. If this is not the case…you need to get started!  There’s no time like the present, and a PowerShell New Year’s resolution should be something to consider.

For those of you who are with me in the PowerShell camp, I have something I’d like to discuss.  You probably have hundreds (dozens?) of scripts scheduled on multiple servers, possibly in multiple domains or geographical locations, to perform tasks like these:

  • Gather information about servers
  • Generate reports about application usage
  • Copy information from one place to another
  • Validate security setup
  • Start and stop processes
  • Scan log files for error conditions
  • Lots of other things (you get the point)

How do you know that the scripts you so carefully wrote and scheduled are actually running successfully?  At first, this seems like a silly question.  When you deployed the script, surely you ran it once to make sure it worked.  What could have gone wrong?

Here are some examples that come to mind (a quick sanity check covering a few of them follows the list):

  • A policy was pushed which set the execution policy to Restricted
  • The credentials you scheduled the script with have been revoked
  • A file share that the scripts depend on is unavailable
  • Firewall rules change and now WMI queries aren’t working
  • The Task Scheduler service is stopped
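
None of these failure modes is hard to test for on its own.  As a taste of what a scheduled environment sanity check might look like, here’s a minimal sketch; the server name, share path, and email details are placeholders you’d replace with your own:

    # A few sanity checks for the failure modes above (server, share, and
    # email details are placeholders for illustration).
    $problems = @()

    # Has a policy push set the execution policy to Restricted?
    if ((Get-ExecutionPolicy) -eq 'Restricted') {
        $problems += 'Execution policy is Restricted'
    }

    # Is the file share the scripts depend on reachable?
    if (-not (Test-Path '\\fileserver\scripts$')) {
        $problems += 'Script share \\fileserver\scripts$ is unavailable'
    }

    # Are WMI queries still working against a known-good server?
    try {
        Get-WmiObject Win32_OperatingSystem -ComputerName 'server01' -ErrorAction Stop | Out-Null
    } catch {
        $problems += "WMI query to server01 failed: $($_.Exception.Message)"
    }

    # Is the Task Scheduler service running?
    if ((Get-Service -Name Schedule).Status -ne 'Running') {
        $problems += 'Task Scheduler service is not running'
    }

    if ($problems) {
        Send-MailMessage -To 'admin@contoso.com' -From 'automation@contoso.com' `
            -SmtpServer 'smtp.contoso.com' -Subject 'Automation environment check failed' `
            -Body ($problems -join "`r`n")
    }

Of course, if the Task Scheduler service is the thing that’s stopped, a check scheduled on the same box will never fire, which is an argument for running checks like this from a different machine.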

You can probably think of a lot more examples of things that would keep scripts from working, but you get the idea.  I’ve given some thought to how to solve this, but haven’t come to any real conclusions.  Obviously, having your scripts log results is helpful, but only if you monitor the logs for success/failure.  Logging alone also doesn’t address frequency: if you have a script which is supposed to run every 10 minutes, it doesn’t help if you don’t get alerted when it only runs once in a day, even if that one run succeeds.  And if there is more than one person writing scripts, how do you make sure that everyone is using the same techniques to log progress?
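
One low-tech answer to that last question is a tiny shared helper that every script dot-sources and calls at start and end.  Here’s a rough sketch; the central log path is an assumption, and a database table would serve the same purpose:

    # Shared logging helper; dot-source this from every scheduled script.
    # The central log path below is a placeholder for illustration.
    $script:RunLog = '\\fileserver\logs$\script-runs.csv'

    function Write-RunEvent {
        param(
            [Parameter(Mandatory=$true)][string]$ScriptName,
            [Parameter(Mandatory=$true)][ValidateSet('Start','End','Error')][string]$EventType,
            [string]$Detail = ''
        )
        # One comma-separated row per event makes matching Start/End pairs easy.
        # Naive CSV: keep $Detail free of commas, or swap in Export-Csv.
        $line = '{0:yyyy-MM-dd HH:mm:ss},{1},{2},{3},{4}' -f `
            (Get-Date), $env:COMPUTERNAME, $ScriptName, $EventType, $Detail
        Add-Content -Path $script:RunLog -Value $line
    }

    # Usage at the top and bottom of each scheduled script:
    # Write-RunEvent -ScriptName 'Get-ServerInventory' -EventType Start
    # ... real work ...
    # Write-RunEvent -ScriptName 'Get-ServerInventory' -EventType End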

Here are some of my thoughts:

  • Use a “launcher” to run scripts (see below)
  • Keep a database of processes with an expected # of runs per day (a rough monitoring sketch follows this list)
  • Monitor matching start/end of scripts
  • Log all output streams (example)
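
The monitoring piece doesn’t need to be fancy.  Here’s a sketch that reads the run log written by the helper above and compares today’s completed-run counts against a simple expectations file (ScriptName,RunsPerDay); both file paths are placeholders:

    # Compare today's actual run counts against an expectations table.
    # Both paths are placeholders; the run log format matches the
    # Write-RunEvent helper sketched earlier.
    $expected = Import-Csv '\\fileserver\logs$\expected-runs.csv'
    $runs     = Import-Csv '\\fileserver\logs$\script-runs.csv' `
                    -Header TimeStamp,Computer,ScriptName,EventType,Detail
    $today    = Get-Date -Format 'yyyy-MM-dd'

    foreach ($job in $expected) {
        # Count completed runs (End events) for today only
        $count = @($runs | Where-Object {
            $_.ScriptName -eq $job.ScriptName -and
            $_.EventType -eq 'End' -and
            $_.TimeStamp -like "$today*"
        }).Count

        if ($count -lt [int]$job.RunsPerDay) {
            # A real monitor would use Send-MailMessage here
            Write-Warning ('{0} completed {1} time(s) today; expected {2}' -f $job.ScriptName, $count, $job.RunsPerDay)
        }
    }

Counting End events rather than Start events has a nice side effect: it also catches the script that starts but never finishes.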

The first item in the list (the launcher) has been something I’ve been considering because it’s not trivial to run a PowerShell script in a scheduled task.  Even with the -File parameter, which was added in PowerShell 2.0, it can involve a fairly long command line.  Add in the difficulty of trying to capture output streams (most of which are not exposed to the command shell), and it becomes a process that is surprisingly hard to get right every time.  Some features I’m planning for the launcher (sketched after the list) are:

  • Load appropriate profiles
  • Log all output streams (with timestamps) to file or database
  • Log application start/end times
  • Email owner of script if there are unhandled exceptions
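
To make that concrete, here’s a stripped-down sketch of the shape such a launcher might take.  It’s a starting point rather than the finished tool: the file name, log folder, email addresses, and SMTP server are all placeholders, and it only captures the output and error streams, since those are the ones PowerShell 2.0 makes easy to redirect:

    # launcher.ps1 -- minimal launcher sketch (names/paths are placeholders).
    # Schedule this instead of the script itself, e.g.:
    #   powershell.exe -File launcher.ps1 -ScriptPath C:\scripts\Get-ServerInventory.ps1
    # Profiles load normally because we don't pass -NoProfile to powershell.exe.
    param(
        [Parameter(Mandatory=$true)][string]$ScriptPath,
        [string]$LogFolder  = '\\fileserver\logs$',
        [string]$OwnerEmail = 'owner@contoso.com'
    )

    $name = [IO.Path]::GetFileNameWithoutExtension($ScriptPath)
    $log  = Join-Path $LogFolder ('{0}-{1:yyyyMMdd-HHmmss}.log' -f $name, (Get-Date))

    function Write-Log([string]$Message) {
        # Timestamp every line so output can be correlated with start/end times
        Add-Content -Path $log -Value ('{0:yyyy-MM-dd HH:mm:ss} {1}' -f (Get-Date), $Message)
    }

    Write-Log "START $ScriptPath"
    try {
        # Run the script in-process; 2>&1 folds the error stream into the
        # output stream so both land in the log (v3+ can redirect more streams)
        & $ScriptPath 2>&1 | ForEach-Object { Write-Log (($_ | Out-String).TrimEnd()) }
        Write-Log "END $ScriptPath (success)"
    }
    catch {
        # Unhandled terminating exception: log it and email the script's owner
        Write-Log ('ERROR {0}' -f $_.Exception.Message)
        Send-MailMessage -To $OwnerEmail -From 'automation@contoso.com' `
            -SmtpServer 'smtp.contoso.com' -Subject "Unhandled exception in $name" `
            -Body ($_ | Out-String)
        Write-Log "END $ScriptPath (failed)"
    }

Running the target script in-process (with the call operator) is a deliberate choice here: it means the launcher’s try/catch actually sees unhandled terminating exceptions, at the cost of the script sharing the launcher’s session state.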

I know this topic is not specific to PowerShell, but as Windows administrators get more comfortable scripting solutions to their automation problems with PowerShell (which I am confident they are doing), it’s something that every organization will need to consider.  I’ll try to follow up with some posts that have some actual code to address some of these points.

Mike

P.S. I’m specifically not discussing “enterprise job scheduling” solutions like JAMS because of the high cost involved. I’d like to see the community come up with something a little more budget-friendly.