SCOM R2: CU3 Agent Update and Windows 2008

December 3, 2010 mengotto 4 comments

I applied the CU3 update to our infrastructure yesterday.  Later, in the afternoon, I started to approve and process agent updates.  In the evening I got pinged on OCS by our OCS and Group Chat engineer.  He asked if I was doing an install on OCS because “SCOM” was restarting all of the OCS and GroupChat services.  I told him that this wasn’t possible, that the agent install shouldn’t bounce application services.  After looking at one of the boxes, it was apparent that RestartManager was bouncing several services after the SCOM agent update took place.  I had patched other Windows 2008 servers earlier that day without any issue.  I am still uncertain what caused this to happen on our OCS and GroupChat servers; however, if it happens to you, here is what you need to look for and what you need to do to resolve it.

Despite the push showing as “Successful”, you will find that some of the updates did not actually succeed.  The quick way to find them is through an alert view and/or this view in the console:

Unhealthy Agents

All of the above Critical states are agents that experienced problems during install.  Pick one and log onto that box.  Checking the SCOM Agent service you will find it in a “Starting State”:

SCOM Agent starting....

After you verify that the SCOM service is “Starting”, open Task Manager and you should find MOMAgentInstaller.exe still running:

What the hell?

Kill this and the HealthService.exe process:

Now start the SCOM agent service and verify that your DLLs have been updated to the .49 version.  If we look at the Application and SCOM event logs we can see what potentially happened.  The Application log shows that after the SCOM agent install started, RestartManager began cycling several services, and the SCOM agent had been hung since the incident started:
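The manual recovery above (kill the two hung processes, then start the agent service) can be sketched as a small Python helper that only builds the command lines.  This is a rough sketch, not an official procedure: it assumes the agent's service short name is `HealthService`, and the function name `recovery_commands` is made up for illustration.  `taskkill /F /IM` and `net start` are standard Windows commands.

```python
def recovery_commands(agent_service="HealthService"):
    """Return the Windows commands for the manual recovery steps, in order.

    agent_service is an assumption: the short service name of the SCOM agent.
    """
    hung_processes = ["MOMAgentInstaller.exe", "HealthService.exe"]
    # taskkill /F forces termination, /IM selects the process by image name.
    commands = [["taskkill", "/F", "/IM", name] for name in hung_processes]
    # Once both processes are gone, start the agent service again.
    commands.append(["net", "start", agent_service])
    return commands


# Print the commands so they can be reviewed before running them by hand.
for command in recovery_commands():
    print(" ".join(command))
```

Keeping this as a list of commands rather than executing them directly makes it easy to review what would run on a box that hosts production services.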

So be careful about pushing agent updates to Windows 2008 servers if the Restart Manager service is running and is allowed to run, as it may cause some application outages for you.

Categories: Operations Manager

SCOM R2: CU3 Agent Install Issue and Fix – Error 80070641

December 2, 2010 mengotto Leave a comment

So I had to roll CU3 to production today and one of my agents was throwing an odd error:

The Agent Management Operation Agent Install failed for remote computer servername.domain.com.
Install account: myaccount
Error Code: 80070641
Error Description: The Windows Installer Service could not be accessed. This can occur if you are running Windows in safe mode, or if the Windows Installer is not correctly installed. Contact your support personnel for assistance.
Microsoft Installer Error Description:
For more information, see Windows Installer log file “(null)” on the Management Server.

I thought this was odd and had never seen it before.  Did a little “google” search for this issue and found this KB that mentioned the windows installer service could be unregistered or corrupt.  After I followed the steps in the article, I tried to install the update to the agent and it was successful.  Very nice!
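As a side note, the error code itself already points at the Windows Installer service.  Here is a minimal Python sketch that splits the HRESULT from the alert into its standard fields; the function name `decode_hresult` is just for illustration.  Facility 7 is the Win32 facility, and Win32 error 1601 is ERROR_INSTALL_SERVICE_FAILURE, which matches the error description above.

```python
def decode_hresult(value):
    """Split a 32-bit HRESULT into (failed, facility, code)."""
    failed = bool((value >> 31) & 0x1)   # top bit set means failure
    facility = (value >> 16) & 0x7FF     # 11-bit facility field
    code = value & 0xFFFF                # low 16 bits: the underlying error code
    return failed, facility, code


# The error code from the failed agent install, as shown in the alert text.
failed, facility, code = decode_hresult(int("80070641", 16))
print(failed, facility, code)  # True 7 1601
```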

Categories: Operations Manager

SCOM R2: One of my CU3 pleasures – agent updates and the patch list

October 14, 2010 mengotto 2 comments

So I rolled CU3 into our labs in the past week.   Doing this on virtual machines and terminal servers with very low free space and a great distance between themselves on the network was not that fun.  Regardless, I was able to patch my SCOM infrastructure.  Our labs are a shared environment so I moved on to agent updates and ran into a few problems.  Two agents, that I have noticed so far, could not update.  The agent logs referred to the CU1 update bits and how the agent was unable to locate them.  I thought that was odd.  So I had to jump on the box and see what was going on.  I tried to do a remote uninstall, but that failed.  I tried to do an uninstall of the agent from the actual server itself, but that failed as well with a pop-up box asking for the location of the momagent.msi file.  I suspect something got corrupted in the registry and will now have to follow Jonathan’s blog post on how to brute force uninstall the agent from the few servers that are behaving like this.

Then, a few days after patching several agents, I checked the patch list and saw the following:

 

Before deploying CU3 I read Kevin’s post for guidance.  I noticed he had made a comment that the patch list may not appear correctly on some patched agents:

“Note: experienced 100% success rate on the agent updates…. however, some of my agents are still reporting both the CU2 and CU3 in patchlist.  I am investigating this as it should not be reporting this way.”

At the bottom of his post he did address this:

“4.  Agentpatchlist information incomplete.  The agent Patchlist is showing CU3, but also CU2 or CU1.  The localization ENU update is not showing in patchlist.  This appears to be related to the agents needing a reboot.  Once they are rebooted, and a repair initiated, the patchlist column looks correct.”

I have quite a few like this, and didn’t want to have to do all of this in order to get this fixed.  I verified that the DLLs on the agent were updated and then I looked for the value that the discovery is pulling from the registry of an agent that displayed a mixed-up Patch List:

The key where this information is stored: HKLM\Software\Microsoft\Windows\CurrentVersion\Installer\UserData\S-1-5-18\Products7779052F1B26F94B\Patches

The reg keys represent the different patches applied and dictate the order they appear in the patch list.  If we look at the values in the key we will notice something different between those that list the correct CU3 patch and those that list the CU3 and older patches:

If the State value is 1, then this patch display name will be listed in your patch list.  If the State value is 2, then it will not be listed in the patch list view.
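To make the rule concrete, here is a small Python sketch of the behavior I observed: only patches whose State is 1 end up in the patch list, in key order.  The sample entries are made up, and the value name `DisplayName` is my assumption for where the patch display name lives; this is not how the discovery is actually implemented, just a model of what I saw.

```python
def listed_patches(patches):
    """Return display names of patches whose State is 1, keeping key order."""
    return [patch["DisplayName"] for patch in patches if patch["State"] == 1]


# Hypothetical values as they might be read from the Patches subkeys of one agent.
sample_patches = [
    {"DisplayName": "CU2 (sample entry)", "State": 2},  # State 2: not listed
    {"DisplayName": "CU3 (sample entry)", "State": 1},  # State 1: listed
]

print(listed_patches(sample_patches))  # ['CU3 (sample entry)']
```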

When I followed Kevin’s advice, it did resolve the issue, but that meant that I had several servers that I would have to first reboot, then repair (basically a reinstallation of the agent).  In the lab that might be OK, but production may pose a bigger issue, especially if my lab patching is any indication of the percentage I will see in production.  Furthermore, if the DLL files are updated on the agent, then I would rather just use PSEXEC to batch a registry change on the State value and then bounce the health service on that agent.  This would save a lot of time for me, and a lot of outages for our mission-critical applications.  In the screenshot I say a repair is not necessary.  This is not the “official” word from MSFT, just my observations from my lab.  I will fix the remainder of my agents by modifying the registry and bouncing the health service, then let it cook for a while before I decide whether to use this method should this problem appear in production when we patch.  I recommend you test this in your own environment before coming to any conclusions as to whether this is a viable workaround.  If you feel comfortable with this solution, and have ensured all your workflows and monitoring are working as expected (also ensuring the DLLs are updated), then you may have saved yourself a lot of time otherwise wasted rebooting servers and doing repairs on agents.  ;-)
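For anyone who wants to batch this, here is a rough Python sketch that builds the PSEXEC command lines for one server: set State to 2 on the stale patch key, then stop and start the health service.  It only builds the commands, it does not run them.  The assumptions are mine: the State value is treated as a REG_DWORD, the service short name is taken to be `HealthService`, and the patch key path is a placeholder that you must replace with the real key from your own agent.  `reg add` with `/v`, `/t`, `/d` and `/f` is the standard reg.exe syntax.

```python
def state_fix_commands(server, patch_key, hidden_state=2, service="HealthService"):
    """Build the PSEXEC commands to hide one stale patch and bounce the agent.

    patch_key is the full registry path of the patch subkey to change.
    The REG_DWORD type and the service name are assumptions, not confirmed facts.
    """
    target = "\\\\" + server  # PsExec expects the computer name as \\server
    return [
        ["psexec", target, "reg", "add", patch_key,
         "/v", "State", "/t", "REG_DWORD", "/d", str(hidden_state), "/f"],
        ["psexec", target, "net", "stop", service],
        ["psexec", target, "net", "start", service],
    ]


# Hypothetical server name and placeholder key path, for illustration only.
for command in state_fix_commands("server01", r"HKLM\PLACEHOLDER\Patches\PATCHKEY"):
    print(" ".join(command))
```

Printing the commands first and reviewing them fits the "test it in your lab" advice above; feeding the same list to a runner for a list of servers is a small step from here.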

The only caveat I have seen so far with this is that you may have an inconsistent patch list even after this, because on the agent I repaired the patch list showed two CU3 patches (the one with and without the ENU Components) and the ones that I repaired now show just the CU3 patch without the ENU addition.  If you are worried about that, just pick the one you want displayed and disable the rest.

Blog has moved to WordPress

October 13, 2010 mengotto Leave a comment

As most people with Windows Live Spaces know, the blog feature has been moved to WordPress.  I am still getting used to this, so please hang in there with me.

Categories: Operations Manager

SCOM: OpsMgr 2007 R2 MP version 6.1.7672.0 is released

July 1, 2010 mengotto Leave a comment

MSFT has released an updated version of the SCOM MP.  I believe this is for R2 environments only.  Either way it’s a must have and should be deployed ASAP!  Cory has done a fantastic job (maybe he and his team?) of delivering updates to management packs on a quarterly basis (which I love and respect).

Read all about it here and download it at once.

Categories: Operations Manager

Management Packs

June 30, 2010 mengotto Leave a comment
I have been busy working on a few mp’s.  I also have some updates for the minor ones I posted here a while ago.  When I get time from home I will compile them and upload them to my spaces.

I created a minor MP for Good Technologies (product used for wireless devices and Exchange 2007).  Very minor, does a discovery of two classes and monitors two services.  I will be building on it going forward.

I have updates for my Communication Suite management packs.

Other than that, I have been extremely busy getting ready for Exchange 2007 deployment here at work… But I will be taking vacation soon!

Categories: Operations Manager

Simple Custom Management Packs for SCOM

May 20, 2010 mengotto 1 comment

The past few days I have had to create a few lightweight custom management packs for our ops teams.  They gave me the requirements (pretty light) and I set out to build the management packs for them.  One MP is for OCS; it discovers the OCS installation, then defines two roles (Communicator Web Access and Front End role).  I have a few state views, and the health rolls up to the application level.  Initial discovery is a bit lame, as it is only looking for the RTC key and Version (so any place OCS is installed will get detected; I plan on fixing this).

We use FaceTime Vantage for IM auditing (this was called IMAuditor).  The ops teams want to monitor the dedicated IMAuditor servers as well as the Front End Process Controller service on select OCS front ends.  So I have created my FaceTime Vantage management pack.  Again, lightweight, but you can extend or modify it as you see fit.

Finally we needed to monitor OCS Group Chat.  The requirement for this was just to monitor the two services on Group Chat.  We did get a document from MSFT about counters to evaluate, but I haven’t included those in the MP yet.

All discoveries are set to run once a day.  Health rolls up from the lowest application component classes (for service monitoring only) to the application level.  I put the sealed versions in one zip file and the unsealed in another.

If you use these, then great; if you don’t, then I can’t blame ya!  ;-)   Remember, if you do decide to use these management packs, make sure you test them before putting any of them into production.  The primary reason for releasing these was just to give some first-time authors a sense of how you might go about creating a basic application management pack.  You can download them from my SkyDrive here.

Enjoy and happy authoring!

Views for each of the management packs if imported:

comsuite

Categories: Uncategorized

OCS R2 Classes cause grief for Green Machine V 1.03

We use Tim Helton’s Green Machine to reset the health of our various applications.  I wanted to reset the health of OCS, but ran into some issues.  Green Machine should be run against non-abstract classes (OCS R2 Enterprise Edition is a non-abstract class, Microsoft.OCS.ServerRole is abstract).  Furthermore, the best practice for class names is to use a ‘.’ as a separator, not a ‘_’.  The OCS MP does achieve this for the QoE portion of the MP (the only view with a STATE view), but not for the remainder of the classes.

The results of my attempts to run Green Machine against a few OCS classes:

image

image

I was able to get in contact with Tim to see if he could modify his tool to work around this issue and he added the ability to run it against a group (he knocked out the change in a day, thanks Tim!).   I have not tried his most recent updated tool to see if this will work, but I will let you know that v 1.03 will not work with the OCS R2 classes.  Keep in mind this is not because of a shortcoming in Tim’s tool; the culprit here is the OCS management pack itself.  So if you are monitoring OCS R2 with SCOM and/or testing a custom monitor, be prepared to do a lot of manual resets or create a group containing your OCS servers and try Tim’s new v1.04 of Green Machine R2 against it.

~~Updated so as to clarify that this was not a problem with Tim’s tool.  Original title might have led people to believe that.  This is not the case.~~

Categories: Operations Manager

OCS R2 Management Pack for SCOM Snafu (No display name for the classes)

Where I work we have OCS R2 deployed.  The OCS ops team is not happy with the level of alerting despite several tuning efforts for this management pack.  We finally decided to just keep it installed for “information” and develop our own custom management pack that will include classes for OCS, GroupChat, and FaceTime IM Auditor.  I could break these out by application, but since this MP will be pretty light I will probably gather the requirements and create one management pack that discovers all of these applications.  The downside to this is, of course, that any changes will be pushed to the other applications even if the monitors were not targeting them.  I will deal with it.

Anyway, if you are not fortunate enough to have a SCOM administrator who can create a new mp for you, then you might be unhappy with a few issues in this R2 MP such as the LACK of a display name for the classes.  This is problematic when creating custom views targeting the OCS classes, as well as setting up subscriptions.  Whenever you set these up, you will not know what your target is.  That might be an issue for you or your organization.  This is what you will experience if you do not modify the management pack:

Custom Alert View:

Go to your “Workspace” and create a new Alert View:

image

Now change the “Show data related to:” box so that you have OCS R2 Enterprise Edition selected:

image

Now look at what you have (What class is THAT?):

image 

Now if you want to set up a subscription for this class this will be your experience:

Create a new subscription and select “raised by instance of a specific class”:

image

Filter by ‘Enterprise’ and select the OCS R2 EE class:

image

Add it:

image

Look at the results:

image

Your SCOM admin quits, you hire a new guy, something is wrong with the subscription, how is he supposed to know?  Of course you could name it and give it a description, but how do you know that the subscription hasn’t been modified to another OCS R2 class that also has no display name?  Well you won’t know unless maybe you dive into the notifications management pack.  Who wants to do that?

Now, I have a feeling there will be no updates to the current release of the OCS R2 management pack.  Therefore, to fix this, I will export the MP via PowerShell and modify the unsealed MP in the Authoring Console (give the classes display names).  Then I will seal it, delete the original and any dependencies in my SCOM configuration group, then import this modified sealed vendor MP and the dependency management packs.

For the OCS dependent custom management packs you have (override, custom, etc) you will have to update the references section of the MP to now use the new OCS MP.  If you don’t do this, then you will not be able to import them.  It’s pretty easy to update references for unsealed custom management packs (you can just edit the XML).  If they are sealed custom MP’s then you will have to do this via the authoring console with the unsealed custom MP, then seal it, and re-import it.
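Editing the references by hand is fine for one or two MPs; for more, something like the following Python sketch could do it.  It assumes the usual unsealed MP layout where each `Reference` element holds `ID`, `Version` and `PublicKeyToken` children.  The MP ID, versions and tokens in the sample are made up, and `retarget_reference` is just a name I picked.  Since the modified OCS MP is sealed with your own certificate, its public key token will differ from the original, so both values are updated.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment of an unsealed custom MP that references the OCS MP.
SAMPLE_MP_REFS = """<ManagementPack>
  <Manifest>
    <References>
      <Reference Alias="OCS">
        <ID>Example.OCS.ManagementPack</ID>
        <Version>1.0.0.21</Version>
        <PublicKeyToken>0000000000000000</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>
</ManagementPack>"""


def retarget_reference(mp_xml, mp_id, new_version, new_token):
    """Point every reference to mp_id at a new version and public key token."""
    root = ET.fromstring(mp_xml)
    changed = 0
    for ref in root.iter("Reference"):
        if ref.findtext("ID") == mp_id:
            ref.find("Version").text = new_version
            ref.find("PublicKeyToken").text = new_token
            changed += 1
    # Return the updated XML plus how many references were touched.
    return ET.tostring(root, encoding="unicode"), changed


updated_xml, count = retarget_reference(
    SAMPLE_MP_REFS, "Example.OCS.ManagementPack", "1.0.0.22", "1111111111111111"
)
print(count)  # 1
```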

*Keep in mind that you should do this in a lab first, test it, and then understand that this is NOT supported by MSFT.  This means, if you ever have issues with the OCS MP or one of its workflows, etc., then you would be in an unsupportable situation.  Also, if the OCS team does release an updated MP that is upgrade compatible, then you will probably have to delete this MP first.  Therefore do these steps at your own risk!*

Open up the SCOM PowerShell and export the OCS MP from your configuration group:

image

Now that it is unsealed, open it up in the Authoring Console and look at the various classes.  You will notice most are missing a name for the display string.  Naughty, naughty!:

image
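If you would rather get a list than eyeball the classes in the Authoring Console, here is a minimal Python sketch that reports which classes in an unsealed MP have no display name.  It assumes the usual MP XML layout (`ClassType` elements with an `ID`, and `DisplayString` elements keyed by `ElementID` with a `Name` child).  The class IDs in the sample are made up and the function name is only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical unsealed MP fragment: two classes, only one with a display name.
SAMPLE_MP_CLASSES = """<ManagementPack>
  <TypeDefinitions><EntityTypes><ClassTypes>
    <ClassType ID="Example.Class.Named" />
    <ClassType ID="Example.Class.Unnamed" />
  </ClassTypes></EntityTypes></TypeDefinitions>
  <LanguagePacks><LanguagePack ID="ENU"><DisplayStrings>
    <DisplayString ElementID="Example.Class.Named"><Name>Named class</Name></DisplayString>
  </DisplayStrings></LanguagePack></LanguagePacks>
</ManagementPack>"""


def classes_missing_display_name(mp_xml):
    """Return the IDs of classes that have no non-empty display name."""
    root = ET.fromstring(mp_xml)
    # Collect every element ID that has a display string with a real name.
    named = {
        ds.get("ElementID")
        for ds in root.iter("DisplayString")
        if (ds.findtext("Name") or "").strip()
    }
    return [cls.get("ID") for cls in root.iter("ClassType") if cls.get("ID") not in named]


print(classes_missing_display_name(SAMPLE_MP_CLASSES))  # ['Example.Class.Unnamed']
```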

Double click on any of those classes that are missing a display name and add it yourself (warning may require a lot of typing):

image

Now view your mastery at work in the Authoring console and save that updated MP:

image

Now seal this mp using mpseal and your friendly little certificate:

image

Before you can import this modified sealed MP, you will have to delete your management packs that are referencing/dependent on the original OCS MP and then delete it as well.  Once you do this, you can import this updated MP, which has the incremented version number of .22 instead of .21.

image

After the import verify it is there and the correct version:

image

Now step through creating a view or subscription again (for this example we will use a subscription):

image

Voilà (that was easy, but time consuming and really not our job)!

image

Happy authoring/modification and don’t forget to update the MP’s that were dependent on the original OCS MP so that you can use them.

Categories: Operations Manager

Add a class filter to make performance reporting easier

At my job I have to create performance reports and schedule them to be delivered on a weekly basis to a group of people.  In the past, finding the right object from the "Add Object" window was painful.  I could never seem to find everything I was trying to find.  So the other day, I clicked on the "Options" button and saw I could filter by Class.  This helped me to find all the objects I was looking for without trying to enter a search phrase for the object.

Below is the outline on how to do this.

1) Go to the reporting pane of the operations console and open the Generic Performance report.

image

2) Create a new chart and new series and click add object.

image

3) At the Add Object window click ‘Options’

image

4) In the Options window click ‘Add…’

image

5) In the Add Class window type the name of the class and pick the class (or classes) you want to filter against.

image

image

image

6) Review your selections.

image

7) Now if you search without typing a search phrase, only Windows Computer and Exchange 2003 Physical Installations objects are returned.  Notice it tells you that a Filter Option has been applied.

image

This list is easier to manage compared to a list of all objects, IMHO.  Happy reporting!

Categories: Operations Manager