Aggregation In PowerShell (and another pointless function)

I’ve been doing a lot of thinking about “idiomatic PowerShell” since my last post, and that thinking led me to an idea that I haven’t actually used, but that seems like the kind of thing people would do in PowerShell.

If I were writing a script that needed to get a “bunch of things” from somewhere (perhaps several different sources) and return all of them, I might be tempted to do something like this. Please forgive my PowerShell pseudocode:

function get-stuff {
    param($parm1)
    $results = @()
    foreach ($source in $sources) {
        $results += ($source | where { $_ -and "Some condition exists" })
    }
    foreach ($source in $someothersources) {
        $results += ($source | where { $_ -and "Some condition exists" })
    }
    return $results
}

I’ve used several permutations of that kind of code using arrays of some sort to collect the results as I go along and eventually return the collection from the function. I’m not sure that there’s anything wrong with doing it this way. That is, I’m not sure that you’re likely to have issues with doing it this way.
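
For anyone who wants to run the accumulating version, here is a minimal sketch. The function name Get-StuffAccumulated, its -First and -Second parameters, the sample arrays, and the "even numbers only" filter are all stand-ins I made up for the pseudocode's $sources, $someothersources, and "Some condition exists".

```powershell
# Hypothetical sample data standing in for $sources and $someothersources.
$sources          = @(1, 2, 3), @(4, 5, 6)
$someothersources = @(7, 8), @(9, 10)

function Get-StuffAccumulated {
    param($First, $Second)
    $results = @()
    foreach ($source in $First) {
        # "Keep the even numbers" stands in for "Some condition exists".
        $results += ($source | Where-Object { $_ % 2 -eq 0 })
    }
    foreach ($source in $Second) {
        $results += ($source | Where-Object { $_ % 2 -eq 0 })
    }
    # Nothing reaches the caller until both loops have finished.
    return $results
}

$all = Get-StuffAccumulated -First $sources -Second $someothersources
$all -join ','   # 2,4,6,8,10
```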

On the other hand, it’s more idiomatic (i.e., more in the style of the PowerShell language) to do something like this (again, pardon the pseudocode):

function get-stuff {
    param($parm1)
    foreach ($source in $sources) {
        $source | where { $_ -and "Some condition exists" }
    }
    foreach ($source in $someothersources) {
        $source | where { $_ -and "Some condition exists" }
    }
}

All I’m doing here is sending the output of the inner statements (which are pipelines) to the output stream of the function. Note that there is no need for a variable to accumulate the results. Using the output stream makes this function work more like the built-in cmdlets in PowerShell, because each result is passed down the pipeline as soon as it is produced instead of blocking until everything has been collected.
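
One way to see the streaming behavior is to count how many items the function looks at when the caller only wants the first result. This is a sketch, not a benchmark: Get-StuffStreamed, its parameters, the $script:visited counter, and the sample data are all hypothetical, and it assumes PowerShell 3.0 or later, where Select-Object -First stops the upstream commands once it has enough.

```powershell
# Hypothetical sample data and a counter used only for this demonstration.
$sources          = @(1, 2, 3), @(4, 5, 6)
$someothersources = @(7, 8), @(9, 10)
$script:visited   = 0

function Get-StuffStreamed {
    param($First, $Second)
    foreach ($source in $First) {
        # Count every item examined; "even numbers" stands in for the real condition.
        $source | Where-Object { $script:visited++; $_ % 2 -eq 0 }
    }
    foreach ($source in $Second) {
        $source | Where-Object { $script:visited++; $_ % 2 -eq 0 }
    }
}

# Ask for only the first match; the function should not need to visit all 10 items.
$firstHit = Get-StuffStreamed -First $sources -Second $someothersources |
    Select-Object -First 1

"First hit: $firstHit, items visited: $script:visited"
```

With the accumulating version, every item would have to be examined before the caller saw anything at all.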

The only thing that I have against this code is that it goes against rule #2 that I wrote last time about writing values to the output stream. I said there that if you were going to write to the output stream, you should explicitly use write-output. We could modify the code above to use write-output, but that would involve using parentheses (around the pipelines), messing up the flow of the code, and even blocking the pipeline while the expressions in the parentheses were collected (as an argument to write-output).
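
Here is roughly what I mean by the write-output variant, as a sketch with a made-up function name (Get-StuffExplicit), a made-up -Sources parameter, and the same hypothetical "even numbers" filter standing in for the real condition.

```powershell
# Hypothetical sample data.
$sources = @(1, 2, 3), @(4, 5, 6)

function Get-StuffExplicit {
    param($Sources)
    foreach ($source in $Sources) {
        # The parentheses force the whole inner pipeline to finish first;
        # its collected results are then passed to Write-Output as one argument.
        Write-Output ($source | Where-Object { $_ % 2 -eq 0 })
    }
}

$explicit = Get-StuffExplicit -Sources $sources
$explicit -join ','   # 2,4,6
```

It works, but each source is fully filtered before any of its results move on, and the extra punctuation gets in the way of reading the pipeline left to right.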

That brings me to what I was saying about “another pointless function”. About a year ago I wrote a post about the identity function, which doesn’t really do anything except return its input. It is a really useful function for creating lists and such, allowing you to skip a bunch of punctuation. It’s not a pointless function, but it’s not one that is getting much press, either. I was thinking about how to make the “pipeline” version of the code work nicely without making it ugly, and thought of the following function.

function out-output {
    process { $_ }
}

Like the identity function (or ql, as I’ve seen it referred to), out-output doesn’t do anything but emit values that are provided. Out-output, however, gets its values from the pipeline rather than the argument list. This function allows us to be explicit about our intent to use the output stream.
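
To make the difference concrete, here is a small side-by-side sketch. The one-line ql definition is my assumption of the usual form of the identity function (the earlier post's exact code may differ), and the color names are just sample values.

```powershell
# Assumed definition of the identity function: it returns its argument list.
function ql { $args }

# out-output as defined above: it emits each pipeline item unchanged.
function out-output {
    process { $_ }
}

# ql takes its values as arguments, so no quotes or commas are needed.
$fromArgs = ql red green blue

# out-output takes its values from the pipeline.
$fromPipe = 'red', 'green', 'blue' | out-output

$fromArgs -join ','   # red,green,blue
$fromPipe -join ','   # red,green,blue
```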

function get-stuff {
    param($parm1)
    foreach ($source in $sources) {
        $source | where { $_ -and "Some condition exists" } | out-output
    }
    foreach ($source in $someothersources) {
        $source | where { $_ -and "Some condition exists" } | out-output
    }
}

I’m not sure if this is a good idea, and I know that it’s just adding a tiny bit of processing to the script. My thought is that making the operation of the script explicit is worth it in the long run.

What do you think?

-mike