Showing posts with label deadlock. Show all posts

Tuesday, March 8, 2011

Loading and modifying the registry of a dead Amazon EC2 instance

In a recent post, I had to troubleshoot an Amazon EC2 instance that was not accessible via RDP after a Windows Update and reboot. Back then, I didn't realize that I could have edited the registry of the unresponsive instance. Here is how to do it (thanks to Nick Greising at Amazon for providing the steps). You will first need a repair instance in the same availability zone.
  1. Note down instance information such as the instance ID, attached block devices (volumes), private IP address, and associated Elastic IP address
  2. Stop the instance
  3. Detach the root volume
  4. Attach the volume to the repair instance
  5. Log in to the repair instance
  6. Bring the disk online (eg: drive E)
  7. Run regedit
  8. Go to HKLM
  9. Select File->Load Hive
  10. Browse to E:\Windows\System32\config
  11. Open the hive you want (eg: SYSTEM)
  12. Pick a Key Name (eg: System_old)
  13. Make whatever changes you need
  14. Select the root of the hive you just loaded and modified (eg: HKLM\System_old)
  15. Select File->Unload Hive
  16. [Optional: if you are running SharePoint, you may need to set Ec2SetComputerName to Disabled so the machine does not change its name on restart]
  17. Take the disk offline
  18. You can now log off or close the connection to the repair instance
  19. Detach the volume from the repair instance
  20. Attach the volume to the original instance
  21. Start the instance
  22. You will also need to reconfigure your security groups, as the internal IP address will have changed, and reassociate the Elastic IP address.
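The volume shuffle in steps 2-4 and 19-21 can also be scripted. Below is a minimal sketch, assuming the boto3 library (not part of the original steps) and an EC2 client passed in as `ec2`. The function names and the device names `xvdf` and `/dev/sda1` are assumptions for illustration; check them against the block device information noted in step 1.

```python
def move_volume(ec2, volume_id, from_instance, to_instance, device):
    """Detach a volume from one instance and attach it to another."""
    ec2.detach_volume(VolumeId=volume_id, InstanceId=from_instance)
    # Wait until the volume is free before attaching it elsewhere.
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=to_instance, Device=device)
    ec2.get_waiter("volume_in_use").wait(VolumeIds=[volume_id])


def begin_repair(ec2, broken_id, repair_id, volume_id, device="xvdf"):
    """Steps 2-4: stop the broken instance and hand its root volume to the repair instance."""
    ec2.stop_instances(InstanceIds=[broken_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[broken_id])
    # "xvdf" is an assumed secondary device name on the repair instance.
    move_volume(ec2, volume_id, broken_id, repair_id, device)


def finish_repair(ec2, broken_id, repair_id, volume_id, root_device="/dev/sda1"):
    """Steps 19-21: give the volume back as the root device and start the instance."""
    # "/dev/sda1" is an assumed root device name; use the one noted in step 1.
    move_volume(ec2, volume_id, repair_id, broken_id, root_device)
    ec2.start_instances(InstanceIds=[broken_id])
```

With boto3 installed, `boto3.client("ec2")` should give a suitable client. Call `begin_repair` before logging in to the repair instance and `finish_repair` after taking the disk offline; the registry editing in between still happens by hand.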
Now I can just plug the steps from Amazon EC2 instance not accessible via RDP after Windows Update and reboot into step 13 and repair those unresponsive instances. Note that when the hive is loaded, there won't be a CurrentControlSet. However, you can look at the value of HKLM\System_old\Select\Current to determine which ControlSet to use (for example, a value of 1 means ControlSet001). See the knowledgebase article What are Control Sets? What is CurrentControlSet? for details.
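To illustrate the Select\Current lookup: the value is a number, and the control set it points to is named ControlSet followed by that number padded to three digits. Here is a minimal Python sketch, assuming the hive has already been loaded as HKLM\System_old as in the steps above. The function names are made up for this example, and `winreg` is the Windows-only registry module in Python's standard library.

```python
def control_set_name(current):
    """Map the Select\\Current number to a key name, e.g. 1 -> ControlSet001."""
    return "ControlSet{:03d}".format(current)


def read_current_control_set(mount_key="System_old"):
    """Read Select\\Current from a hive loaded under HKLM (Windows only)."""
    import winreg  # imported here so control_set_name works on any platform

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, mount_key + "\\Select") as key:
        current, _value_type = winreg.QueryValueEx(key, "Current")
    return control_set_name(current)
```

On the repair instance, `read_current_control_set()` would return the key name to use under HKLM\System_old in place of CurrentControlSet.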


Wednesday, February 16, 2011

Amazon EC2 instance not accessible via RDP after Windows Update and reboot

For the last couple of days, I have been running into a problem where my Amazon EC2 instance is no longer accessible via remote desktop after a Windows Update and a reboot.

My process for setting up these servers is pretty straightforward: install SQL 2008, install the SharePoint 2010 prerequisites, install SharePoint 2010, run Windows Update. I have done this a couple of dozen times now without any problems. Then, a couple of days ago, I started finding that after the Windows Update and ensuing reboot, the server comes up and its status is active, but it is not accessible via RDP or HTTP (connections just time out).

My first thought was that one of the more recent Windows Updates is incompatible with Amazon EC2. Comparing my last successful installation with the latest Windows Update packages, I found that one of the following might be the culprit:


  • Cumulative Security Update for Internet Explorer 8 for Windows Server 2008 x64 Edition (KB2482017)
  • Platform Update Supplement for Windows Server 2008 x64 Edition (KB2117917)
  • Security Update for Windows Server 2008 x64 Edition (KB2393802)
  • Security Update for Windows Server 2008 x64 Edition (KB2479628)
  • Security Update for Windows Server 2008 x64 Edition (KB2483185)
  • Security Update for Windows Server 2008 x64 Edition (KB2485376)
  • Update for Windows Server 2008 x64 Edition (KB971029)
  • Windows Malicious Software Removal Tool x64 - February 2011 (KB890830)

So I figured I would try again and leave out these specific updates, but upon reboot, my new instance was also inaccessible via RDP or HTTP.

Perplexed, and after a lot of searching, swearing, and hair pulling, I came across this post: Avoiding RDP connectivity issues when running SharePoint 2010 on Amazon EC2. In particular, it mentions Microsoft article KB2379016 (A computer that is running Windows Vista or Windows Server 2008 stops responding at the "Applying User Settings" stage of the logon process), which describes the problem as a deadlock in the Service Control Manager database. Breaking the deadlock is just a matter of forcing HTTP.sys to depend on CryptSvc. This can be accomplished as follows:
  1. Run regedit
  2. Locate and select the registry subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\HTTP
  3. Create a new Multi-String Value named DependOnService
  4. Set its data to a single entry: CRYPTSVC
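The same registry change can be scripted. This is a sketch rather than a tested fix: it uses Python's Windows-only `winreg` module, the helper names are hypothetical, and it has to run elevated. Unlike the manual steps, it keeps any entries already present in DependOnService instead of overwriting them. Passing a loaded hive name and control set (for example System_old and ControlSet001) would point it at an offline registry instead.

```python
def http_service_key(root="SYSTEM", control_set="CurrentControlSet"):
    """Build the subkey path (relative to HKLM) for the HTTP service."""
    return "\\".join([root, control_set, "Services", "HTTP"])


def merged_dependencies(existing, required="CRYPTSVC"):
    """Return the dependency list with `required` added if it is missing."""
    if any(name.lower() == required.lower() for name in existing):
        return list(existing)
    return list(existing) + [required]


def add_cryptsvc_dependency(root="SYSTEM", control_set="CurrentControlSet"):
    """Make the HTTP service depend on CryptSvc (Windows only, run elevated)."""
    import winreg  # imported here so the helpers above work on any platform

    access = winreg.KEY_QUERY_VALUE | winreg.KEY_SET_VALUE
    path = http_service_key(root, control_set)
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path, 0, access) as key:
        try:
            existing, _value_type = winreg.QueryValueEx(key, "DependOnService")
        except FileNotFoundError:
            existing = []  # the value does not exist yet
        winreg.SetValueEx(key, "DependOnService", 0, winreg.REG_MULTI_SZ,
                          merged_dependencies(existing))
```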
So far so good. Maybe I'll get some sleep tonight.