When preparing a PowerShell training class for a group of DBAs, I realized that there were some parallels between basic SQL and basic PowerShell commands.
A (very) basic SQL statement has the form:
SELECT <COLUMNS> FROM <TABLE> WHERE <CONDITION> ORDER BY <EXPRESSION>
I noticed that a very common idiom for PowerShell pipelines* was:
<data source cmdlet> | select-object <properties> | where-object <CONDITION> | sort-object <EXPRESSION>
By “<data source cmdlet>”, I mean some cmdlet that puts a bunch of objects in the pipeline, like get-childitem, get-process, get-task, etc.
Part of the power of SQL is that it doesn’t matter what kind of data is in the tables, the same form of SQL statement works the same way (predictability). This is one of the things I love about PowerShell. It doesn’t matter what kinds of data is returned by a cmdlet. The same form of PowerShell pipeline* will perform the same kind of predictable operations on it. I know that this is often mentioned in tutorials and videos about PowerShell, but this was when it really struck me.
A few other SQL/PowerShell comparisons might be:
|SUM(), AVG(), etc.||measure-object|
|SELECT DISTINCT||select-object –unique|
|SELECT TOP n||select-object –first n|
Obviously, this comparison breaks down pretty quickly. There isn’t really a parallel that I can find to JOIN statements, which make SQL so powerful, and clearly there’s a lot of powershell scripts that don’t fit the pattern I’m describing. I think, though, that it’s a useful comparison and can help get people “over the hump” in their quest to master PowerShell.
Let me know what you think.
* A Pipeline in PowerShell is a sequence of cmdlets where each takes the output of the previous cmdlet as its input.