In our environment, we run the Task Scheduler as a cluster resource, so that scheduled tasks always run on the active node. (In ancient history, our scheduled tasks weren't cluster-aware: after a failover, we had to manually disable Task Scheduler on the now-passive node and manually enable it on the active node.)
Unfortunately, when you run in this mode, a service pack or hotfix can't patch both nodes, because setup patches the passive node by remotely creating a scheduled task on it. When it tries to create the task on the passive node(s), the log records this error:
Task Scheduler: Error, cannot create new scheduled task for product instance target \\<PSVNodeName>
Task Scheduler: Error, cannot create scheduled task for product instance target \\<PSVNodeName>
Task Scheduler: Removed remote folder for product instance target \\<PSVNodeName>
Error, remote process failed for product instance target
Exit code for passive node: <PSVNodeName> = 1603
The following exception occurred: No passive nodes were successfully patched Date: <date/time> File: \depot\sqlvault\stable\setupmainl1\setup\sqlse\sqlsedll\instance.cpp Line: 3510
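If you want to check a setup log for this failure without reading it by eye, something like the following Python sketch could help. It only assumes the "Exit code for passive node" line format shown in the excerpt above; the function name is my own, and you would need to supply the log text yourself, since the log's location isn't covered here.

```python
import re

# Pattern modelled on the "Exit code for passive node: <name> = <code>" line
# from the log excerpt; real logs may differ, so treat this as an assumption.
EXIT_CODE_RE = re.compile(r"Exit code for passive node:\s*(\S+)\s*=\s*(\d+)")


def passive_node_exit_codes(log_text):
    """Return {node_name: exit_code} for every passive-node line in the log text."""
    return {node: int(code) for node, code in EXIT_CODE_RE.findall(log_text)}


if __name__ == "__main__":
    sample = "Exit code for passive node: <PSVNodeName> = 1603"
    for node, code in passive_node_exit_codes(sample).items():
        # 1603 is the exit code reported in the excerpt above.
        print(f"{node}: exit code {code}")
```

Any node that shows up with 1603 is one that setup failed to patch remotely.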
The workaround Grasshopper posted in this SQLServerCentral thread might work, but it seems like a lot of hassle to me.
What I did to get around the problem was as follows (and I'll use a 2-node, single instance cluster as the example):
- establish remote desktop sessions on the individual nodes (not through the cluster)
- in Cluster Administrator on the active node (say, node1), pause the passive node (say, node2)
- install the hotfix on node1
- un-pause node2
- fail over to node2 (since you can't install a patch from a passive node)
- reboot node1
- from node2, wait for node1 to come back online, and then pause node1
- install the hotfix on node2
- un-pause node1
- reboot node2, failing back over to node1
[Yes, this requires two brief outages, so you should do this during a maintenance window.]
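If you have more than one cluster to patch, it can help to print the steps above as a checklist with the real node names filled in. Here is a minimal Python sketch of that idea; the function name and defaults are hypothetical, and it only prints the steps. It does not pause nodes, install anything, or fail anything over.

```python
def patch_plan(active="node1", passive="node2"):
    """Return the ordered checklist for a 2-node, single-instance cluster.

    'active' is the node that owns the instance when you start.
    """
    return [
        f"Open remote desktop sessions to {active} and {passive} directly (not through the cluster)",
        f"On {active}: pause {passive} in Cluster Administrator",
        f"On {active}: install the hotfix",
        f"Un-pause {passive}",
        f"Fail over to {passive}",
        f"Reboot {active}",
        f"On {passive}: wait for {active} to come back online, then pause {active}",
        f"On {passive}: install the hotfix",
        f"Un-pause {active}",
        f"Reboot {passive}, failing back over to {active}",
    ]


if __name__ == "__main__":
    # Print a numbered checklist using the example node names from the post.
    for number, step in enumerate(patch_plan(), start=1):
        print(f"{number}. {step}")
```

Calling `patch_plan("sqlA", "sqlB")` gives the same ten steps with your own node names substituted.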
So, is this more or less hassle than Grasshopper's solution? I think the answer is very subjective. His also requires restarting nodes, so the service downtime during failover is unavoidable at least with these two approaches. I am just offering an alternative "solution" so you can pick your own poison.