2013-06-09

Easy Mode Data Recovery: Maxtor HDD PCB Exchange

Disclaimer: You shouldn't put your valuable data in danger by following advice you find on Internet blogs.

Maxtor drives tend to accept PCB swaps, making data recovery when it's only a burnt chip (ST 2DPS20V, in this case) nearly effortless.

I had some worthwhile* data in need of recovery.

In 2009, 80GB donor drives were still expensive ($75 to $100).

Now, they're under $30 on eBay (the one that worked for me was $26.99 with shipping).

See references for what's (probably) needed for a donor match.

Overview: PCBs and drive shells
Damaged PCB
Crispy chip: the burnt 2DPS20V MOSFET
Donor PCB, zoomed out
Pristine chip: the 2DPS20V MOSFET on the donor PCB

After switching boards, I simply booted Hiren's on a diag system and copied the data off to a network share.

If you find yourself in similar circumstances, perhaps consider this much cheaper (and brainlessly easy) route before sending the drive in to the pros.

Tools Needed:
T8 Torx
Donor Drive PCB
Cavalier Attitude With General Disregard For Important Data

Drive PCB Specs:
Burnt: Code: YAR41BW0 N,M,G,A TLA: 6Y080L0422611 AAB Date: 08 Nov 2005 PCB: M6FYA
Donor: Code: YAR41BW0 N,M,B,A TLA: 6Y080L0422611 AAB Date: 02 Dec 2005 PCB: M6FYA
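
For anyone comparing a candidate donor against a dead board, here's a rough Python sketch that splits the two label strings above into fields and reports which ones differ. The field-splitting regex and the function names are my own assumptions, not anything Maxtor documents, and it says nothing about which fields actually have to match.

```python
import re

# Label text copied from the two boards listed above.
BURNT = "Code: YAR41BW0 N,M,G,A TLA: 6Y080L0422611 AAB Date: 08 Nov 2005 PCB: M6FYA"
DONOR = "Code: YAR41BW0 N,M,B,A TLA: 6Y080L0422611 AAB Date: 02 Dec 2005 PCB: M6FYA"

# Assumption: every field is "<Name>: <value>" and runs until the next field name.
FIELD_RE = re.compile(r"(Code|TLA|Date|PCB):\s*(.*?)(?=\s+(?:Code|TLA|Date|PCB):|$)")


def parse_label(text):
    """Return a dict mapping each label field name to its value."""
    return {name: value.strip() for name, value in FIELD_RE.findall(text)}


def differing_fields(a, b):
    """Return the sorted names of fields whose values differ between two labels."""
    pa, pb = parse_label(a), parse_label(b)
    return sorted(k for k in pa.keys() | pb.keys() if pa.get(k) != pb.get(k))


if __name__ == "__main__":
    print(differing_fields(BURNT, DONOR))
```

For this pair only the Code suffix and the Date differ; the TLA and PCB fields are identical.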

* Uplifting and inspirational content. Of course I mean state sanctioned and approved literature. Also the schematics for a working cold fusion reactor, and an unaltered copy of the United States Constitution.

References:
Newbies Start Here - Scott Moulton's Site
HowTO How To replace Maxtor Calypso III board
PCB layout - Maxtor DiamondMax Plus 9 ATA/133
Repair and Recover Maxtor Hard Drives
Maxtor Hard Drive PCB swapping replacement guide
HDD ICs
P-CHANNEL 20V - 0.14 ohm - 2.5A SO-8 2.7V-DRIVE STripFET™ II MOSFET PLUS SCHOTTKY DIODE

2011-02-08

Replacing Windows Home Server with EON on an Acer easyStore H340

I planned on using FreeNAS on the H340 so I downloaded the images (both the 0.7 and the 0.8 beta) and then tried to piece together enough from the scattered bits of advice on the interwebs to make it work.

I was unsuccessful.

While some very helpful tutorials exist (specifically, HappyBison's) and FreeNAS users tend to be friendly and knowledgeable, it seems there is little interest (from what I was able to see) from the actual developer(s) of FreeNAS in having it used on consumer devices such as the easyStore.

As an example of this, the only post in the FreeNAS forum with instructions specific to actually loading it on the H340 was banished to a final obscure demise in the 'scripts' forum. (Rather than, say, something silly like the "Install" forum, where it was originally posted, and then moved from by an admin.)

Combined with what appears to be a large increase in size of the 0.8 version, it became apparent to me that trying to continue with FreeNAS was likely less than ideal: I guess the company now maintaining it has hardware to sell?  Whatever the case, I did try multiple USB sticks with different versions, blindly, on the H340, with no success. When I was finally able to see output, both releases (stable and beta) were crapping themselves in different places.  Both worked just fine booting in a VM.

Regarding output: In my vanity and do-it-yourself zeal, I had attempted to follow the schematics for building a custom 'debug' cable: I now have a 17" LCD with a sadly frayed pigtail as a lasting testament to my ineptitude at doing so. (It was already somewhat trashed: No, really, it was, before I touched it!) (I also have spare terminals and those little plastic doodads.  Note to self: Software people should stick to software and stay away from the scary electronics stuff.)

Ranting:
After my utter fail at making my own debug cable, I gave in and just bought the Zotac from Newegg: not because it's perfect for the job (it's not) or because it's really very good (it's not), but because it was the cheapest pcie x1 gfx card I could find. Note here: I'm not a fan of the H340. The design is ... lacking in many ways. I realize it's consumer level. I realize the stated audience for it (running Windows Home Server) is unlikely to have much interest in tearing it apart.  I realize if they didn't make it so braindead, people would just enjoy having a small form factor [insert need here] computer, and then they wouldn't buy the more expensive, just as craptastic, Acer products.  Still: even though I got it for under $300 on Black Friday, I regret not having pulled the trigger on the Intel SS2400 instead.  Or, for that matter, hindsight being what it is, just spending a little extra and building a nice NAS with quality parts; any monetary savings on this long ago evaporated with the timesink of it.

Why the need for the video card in the first place?  Because... just taking out the hard drive isn't enough to get it to boot off a usb device.  By the time I got to the point of testing booting off a hard drive with an image, I was sick of screwing with it (I just bought the damn gfx card.) However, it likely would have worked.
H340 with Zotac Card

What didn't work was booting from PXE or off anything usb: floppy, cdrom, or flash.  My educated guess why not: With the default boot order, the internal flash gets first priority, which naturally has the Windows Home Server recovery; only after using the gfx card was I able to watch the WHS "recovery server" defecate all over itself on every boot.

For the brave folks wandering here, and maybe trying to save themselves $65, you can try the following blindly (and let me know if it works for you):
(Note: this is with no hard drives attached, no network cable, and a flash drive hooked up. YMMV.)
Enable debug jumper.
Plug in USB keyboard.
Turn on H340. Wait until numlock on keyboard comes on. Hit F2.
You should now be on the BIOS setup screen - try the following Konami codes:
To change "After Power Failure: [Last State]" to off:
(from Information to Main)
(from Main to Advanced)
 (from Hardware Monitor to Advanced Chipset Control)
 (from Advanced Chipset Control to After Power Failure:)
Enter (select menu - is at Last State)
(select "Stay Off")
Enter
F10 (Save And Exit)
Enter ("Yes" should be selected in the "Do you really want to save and exit?" prompt)
H340 Default Boot Order

Change Boot Order: (This should make your boot order: PCI BEV, USB CDROM, USB Key, USB HDD (this is the SMI))
(Info to Main)
(Main to Advanced)
(Advanced to Boot)
3 ("Loads default boot sequence" Options 1-4)
F10 (Save and Exit)
Enter ("Yes")
F12 Boot Menu

Accessing Boot Menu: Turn on H340. Wait until numlock blinks twice. Hit F12. (see picture...)

So, after setting the BIOS to boot off a thumb drive, EON NAS booted without a hitch to its prompt.  At this point, it should be possible to SSH in, and, here's the really really hard part to get EON installed on the SMI drive:

install.sh

Yup.  Just that.

Backtracking slightly, when FreeNAS wasn't booting (blindly) I looked for another option, and found EON. I hadn't heard anything about it before, but it seemed like what I needed: Small, maintained, with documentation AND the proper driver for the Marvell (yukon) NIC already integrated.  The only drawback being it's Solaris, which was completely alien to me.  After a bit of mucking about with it, though, I found it's similar enough to Linux and BSD that the learning curve for basic usage was quite short.

Unfortunately, I didn't do enough research (R'ingTFM) beforehand, so, note: Get the CIFS version. I got the Samba version.  To me, it seemed like the same thing, and the difference wasn't elaborated on (at least, anywhere I'd seen).  So, what's the difference? CIFS has kernel support and seems to be easier to tie in with ZFS, with perhaps a small performance boost.

Samba is just what you're likely used to: a smb.conf file and all the usual Samba options you've become accustomed to. While it was simple setting it up (I just used most of what I already had on my main system), most of the instructions for sharing with OpenSolaris seem to be targeted towards CIFS installs.  While getting a share going with the Samba version is cake, it feels as though the 'right' way would have been to have used the CIFS version.

Regardless, here's how I got to the point of having EON booting on the H340:
Downloaded EON 64-bit x86 Samba ISO Image Version 0.60.0
Mounted in VMware
Booted
Install.sh to usb flash drive in vm.
Move flash drive to H340.
Boot.
Log in.
Run install.sh.
Fin!
Installing EON to SMI flash

Further notes:
(Make sure you used a flash drive with plenty of space)
dd if=/dev/dsk/c1t0d0p0 of=/mnt/eon0/WHS_Firmware_Backup.bin bs=512
This should be done before doing install.sh; now you can change the internal flash without quite the same cold dread.
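
If you want a little extra comfort that the backup is a faithful copy, a sketch like the following hashes both the source and the image and compares them. This is not part of EON or of Andre's instructions; it assumes you have Python available somewhere that can read both paths, and the function names are made up for illustration.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file (or raw device node) in chunks so it never has to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(source_path, backup_path):
    """True when the source and the backup image hash to the same value."""
    return sha256_of(source_path) == sha256_of(backup_path)
```

Usage would be along the lines of `backup_matches("/dev/dsk/c1t0d0p0", "/mnt/eon0/WHS_Firmware_Backup.bin")`, run before install.sh touches the flash.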

Setting up EON is pretty simple: EON's creator (Andre Lue) has pretty comprehensive documentation on his site.  As an aside, before I broke down and ordered the gfx card, I still thought I was screwing up something with copying the images to my flash drives, and emailed Andre with obviously n00b questions: despite the unorthodox pestering (he seems to usually provide support via his blog and the OSOL forums) I got responses with much helpful advice; most of it already available had I bothered reading his blog in the first place. (For example, dealing with the issue of booting in a VM (-B disable-pcieb=true).)  The other reasons notwithstanding, I can wholeheartedly recommend EON simply because of the caliber and character of its creator.

Some random notes/gotchas:
/mnt/eon0/boot/x86.eon is what grub actually uses.  x86.eon.oem is the backup.  If you're sure your current boot works, you can get rid of the .oem one.  To get rid of the e1000g0 driver, if you boot in a VM initially, you'll likely need to remove the net file manually.

smartctl: initially, it was causing the crappy WD green drives to get stuck in a spin (and report the temperature at over 450°C, heh!).  I hadn't bothered to process the FAQ, where the proper syntax is listed.  The problem? It needs _12_ byte commands, not 16 byte commands.  So, this works fine:
# smartctl -d sat,12 -A /dev/rdsk/c1t0d0
smartctl version 5.38 [i386-pc-solaris2.11] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   170   154   021    Pre-fail  Always       -       8475
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       43
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       26
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       25
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       122
194 Temperature_Celsius     0x0022   109   105   000    Old_age   Always       -       43
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       4
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       165
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

(Output above is the drive that came with the H340.)  I've yet to figure out what (if anything) to do about the lack of TLER support, but I did use wdidle3 /d (via a flash drive boot) to turn off the stupid 8 second head parking; the Load Cycle Count hasn't grown much since. Also, I added the following to my power.conf to compensate (and yes, I could have set wdidle3 /s300, but this way, it's in OSOL's hands and not WD's...):

device-thresholds       /dev/dsk/c1t0d0 5m
device-thresholds       /dev/dsk/c1t1d0 5m
device-thresholds       /dev/dsk/c1t2d0 5m
device-thresholds       /dev/dsk/c1t3d0 5m
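
Since I keep eyeballing that SMART table, here's a small Python sketch that pulls the raw values out of `smartctl -A` output and lists the watched attributes that are nonzero. The choice of which attributes to watch is my own assumption, not a smartmontools recommendation, and the parser only handles rows whose raw value is a plain integer.

```python
def parse_smart_attributes(text):
    """Map attribute name -> raw value for each row of a `smartctl -A` table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least ten columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[9].isdigit():
            attrs[fields[1]] = int(fields[9])
    return attrs


# Assumption: these are the counters worth a second look when they are nonzero.
WATCH = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def nonzero_watched(attrs, names=WATCH):
    """Return the watched attributes whose raw value is greater than zero."""
    return {name: attrs[name] for name in names if attrs.get(name, 0) > 0}


# Three rows copied from the output above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       4
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       165
"""

if __name__ == "__main__":
    print(nonzero_watched(parse_smart_attributes(SAMPLE)))
```

On the drive that came with the H340 this flags the pending sectors and the CRC error count.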

Notes about getting the binary kit installed; Andre's instructions are clear for it, but there aren't really any decent instructions on actually making IPS 'do stuff' --- so, attempting to install the man pages for OpenSolaris:
/pool/pkg-toolkit-sunos-i386/pkg/bin/pkg install -v SUNWman
(and SUNWdoc...)
and trying to use /usr/bin/catman -w (to create the windex file), never gave me working manpages: Not essential, but somewhat annoying for an OSOL newb. I managed to stop myself just short of trying to get gcc on the H340; at some point, one recognizes it's JUST for sharing files and stops messing with it.

Speaking of sharing, one small hiccup: nmbd isn't enabled by default. (It's installed, just not started.)  So, for folks wanting the H340 to advertise itself, you need this:
http://wikis.sun.com/display/BigAdmin/Enabling+Browsing+with+Samba+in+Solaris+10+Update+4

Unresolved stuff:

  • The stupid blinking 'i'; possible to fix by using the freebsd driver posted in the WHS forums and porting to OSOL? Not a huge issue, since it's in a closet.
  • Temperature: drives run _hot_.  Is this just WD braindeadedness, or the crappy design of the H340? (Samsungs in a ReadyNAS stay around 32°C-35°C with load...4x 2TB WD Caviar Greens idle at 44°C-49°C).
  • Upgrading version of ZFS... comes with version 22, current is 31?
Jumbo Frames:
vi /kernel/drv/yukonx.conf and uncomment JumboFrames_Inst0="On";
Add /kernel/drv/yukonx.conf to /mnt/eon0/.backup

References and Resources:
EON ZFS Storage
Home Fileserver: ZFS File Systems
Solaris ZFS Administration Guide
ZFS Cheatsheet
Oracle Solaris 10 Docs
Solaris Service Management Facility Quickstart Guide
OpenSolaris New User FAQ
How Solaris Disk Device Names Work
Searchable Solaris Manpages
Blog post about WDTLER
WD's craptastic page on parking issue
Linux LED driver for H340
Configuring Jumbo Frames
Yukonx Reference (Settings)

2010-12-25

... In Which It Is Revealed How An AHCI Bug Makes One's Insyde(s) Freeze

I found this code in Intel's AHCI Option ROM from the Insyde BIOS. It appears to be code to build the Translated Device Parameter Table (which is a slightly different [1] implementation than the one documented in the Enhanced Working T13 Draft 1126DT).

Pseudocode (version: Serial ATA AHCI BIOS, Version iSrc 1.20_E.0019 07092009):

Function Create TDPT for drive
    Read partition table with INT13 0x201
    If read fails, or 0xAA55 signature isn't present, goto Calculate
    Get head,sectors from FIRST Partition table entry, Ending CHS values
    heads = head+1  (because of 255 limit in partition table)
    For each partition entry
        If BOOTABLE (entry[0] == 0x80) goto UsePartition
    EndFor
    For each partition entry
        if (entry[0] == 0) and (entry[4] != 0) goto UsePartition
    EndFor


Calculate:
    Call CalculateCHS - using DPT and physical size
    if no need for translation, return GOOD
    Goto CreateTDPT with cylinders, heads, sectors


UsePartition:
    Read first sector of partition with INT13 0x4200
    If the word at offset 0x1A is less than 0x100,
      and the word at offset 0x18 is less than 0x40
    then
        set heads to the byte at offset 0x1A
        set sectors to the byte at offset 0x18
    fi
    tracksize = heads * sectors
    if tracksize == 0, Goto Calculate
    DWORD size = (DPT[heads]*DPT[sectors])*DPT[cylinders]
    WORD cylinders = size / tracksize <--- Bad!
    -- if the result is greater than 65535 (0xFFFF), a divide overflow occurs
    -- which isn't handled by the BIOSes
    if (cylinders > 1024) cylinders = 1024
    if ((heads == DPT[heads]) && (sectors == DPT[sectors])) return GOOD


CreateTDPT:
    WORD at DPT[8]  = DPT[0] - Save original cylinders
    BYTE at DPT[10] = DPT[2] - Save original heads
    BYTE at DPT[7]  = DPT[3] - Save original sectors
    WORD at DPT[0]  = cylinders
    BYTE at DPT[2]  = heads
    BYTE at DPT[3]  = sectors
    BYTE at DPT[5]  = 8 if heads greater than 8, otherwise 0
    BYTE at DPT[4]  = 0xA0
    BYTE at DPT[15] = SUM( DPT[0] .. DPT[14] )
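
To make the divide problem in UsePartition concrete, here's a Python sketch that models a 16-bit x86 DIV (32-bit dividend, 16-bit divisor, 16-bit quotient) and runs the cylinder calculation through it. The function names are mine, the physical geometry is the 60GB SSD from the footnote, and the "bad" heads/sectors pair in the usage note is a hypothetical chosen only to trigger the overflow.

```python
class DivideError(Exception):
    """Stands in for the x86 #DE exception raised by DIV."""


def div16(dividend32, divisor16):
    """Model a 16-bit DIV: the quotient must fit in 16 bits or the CPU faults."""
    if divisor16 == 0:
        raise DivideError("divide by zero")
    quotient, remainder = divmod(dividend32, divisor16)
    if quotient > 0xFFFF:
        raise DivideError("quotient 0x%X does not fit in 16 bits" % quotient)
    return quotient, remainder


def translated_cylinders(phys_cyl, phys_heads, phys_sectors, heads, sectors):
    """Follow the UsePartition arithmetic: size / (heads * sectors), capped at 1024."""
    size = phys_cyl * phys_heads * phys_sectors
    tracksize = heads * sectors
    cylinders, _ = div16(size, tracksize)
    return min(cylinders, 1024)
```

With sane values, `translated_cylinders(0x3FFF, 0x10, 0x3F, 0xFF, 0x3F)` comes out at the 1024 cap. Feed it something like heads=1, sectors=0x3F and the quotient no longer fits in a WORD, so `DivideError` is raised, which is the spot where the BIOS has no handler.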

So, what goes wrong? When it breaks, it starts with bad values from the partition table, and tries to fix it with values from the boot parameter block, if it finds "valid" numbers there for heads and sectors (that is, less than or equal to 0xFF and 0x3F, respectively). When these values aren't right,  due to full disk encryption, an operating system other than Microsoft Windows, or malicious intent:
  • It uses the ending head/sector of the first partition to size the translation layer.
  • Windows 7 with 100MB partition results in unexpected values for INT13, FUNCTION=8 (eg, 0x13 heads).
  • It stores those values into the Translated Device Parameter Table..  and then some other code comes along and uses those values. While I can't find where those values are causing the exception, anything doing C/H/S translation will be unhappy.
Looking back at version Serial ATA AHCI BIOS, Version iSrc 1.20E (Gigabyte Desktop Motherboard), I found that it doesn't read from the BPB at all.  I speculate the extra read of the NTFS boot sector was to work around a problem on Insyde BIOS.  This version can be crashed if the two bytes in the partition table are small enough and will hang with error code 23. Award BIOS will function OK with the other unexpected values, but Insyde BIOS will still crash if it sees them.

Finally, one HP system with an Insyde BIOS has the latest(?) 'fixed' version (Serial ATA AHCI BIOS, Version iSrc 1.20_E.0024 12212009), which reads from both the partition table and the BPB, also adding  still more checks. Unfortunately, it seems as though someone messed up and added a further bug, as it doesn't actually use any of the values it reads, but rather discards them all.

New and improved UsePartition (Serial ATA AHCI BIOS, Version iSrc 1.20_E.0024 12212009):
Read first sector of partition with INT13 0x4200
if the word at offset 0x1FE is not equal 0xAA55
   and the byte at offset 0 is not equal 0xEB
   and the word at offset 0x1A is less than 0x100
then
    set heads to the byte at offset 0x1A
fi
if (tracks == 0) or ((sectors & 0x3F) == 0) Goto Calculate
-- New bug: Since sectors can be at most 0x3F from partition table
-- the newer version ALWAYS goes off to Calculate the CHS
if ((sectors & 0xC0) == 0)  Goto Calculate
tracksize = heads * sectors
if tracksize == 0, Goto Calculate
DWORD size = (DPT[heads]*DPT[sectors])*DPT[cylinders]
WORD cylinders = size / tracksize <-- Uber dangerous
-- if the result is greater than 65535 (0xFFFF), a divide overflow occurs
-- which isn't handled by the BIOSes.
if (cylinders > 1024) cylinders = 1024
if ((heads == DPT[heads]) && (sectors == DPT[sectors])) return GOOD
Goto CreateTDPT
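
The "always goes off to Calculate" claim is easy to check by brute force. This Python sketch takes the pseudocode's premise at face value (the sector count from the partition table is never larger than 0x3F) and tests the new mask check against every such value; the function name is made up.

```python
def goes_to_calculate(sectors):
    """True when the 1.20_E.0024 check `(sectors & 0xC0) == 0` jumps to Calculate."""
    return (sectors & 0xC0) == 0


# Assumption from the pseudocode: partition-table sector counts fit in six bits.
PARTITION_TABLE_SECTOR_VALUES = range(0x40)

if __name__ == "__main__":
    print(all(goes_to_calculate(s) for s in PARTITION_TABLE_SECTOR_VALUES))
```

Only values of 0x40 and up would ever fall through to the division, and under that premise none of them can come from the partition table.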

A year later, and neither Acer nor Gigabyte is providing fixed BIOSes.

[1] Expected Final TDPT Values from a 60GB SSD:
    WORD Logical Cylinders   0x400
    BYTE Heads               0xFF
    BYTE Sectors             0x3F
    BYTE Signature           0xA0
    BYTE HeadsAbove8Flag     0x08
    BYTE Ignored             0x00
    BYTE Physical Sectors    0x3F
    WORD Physical Cylinders  0x3FFF
    BYTE Physical Heads      0x10
    BYTE Ignored[4]          0x0
    BYTE Checksum            0x89
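
As a sanity check on the layout, here's a Python sketch that packs those values at the offsets used by the CreateTDPT pseudocode and computes the last byte. One assumption to flag loudly: the footnote's 0x89 only works out if the checksum is the two's complement of the byte sum (so all 16 bytes add up to zero), not the plain sum; that is what this sketch does. The function name is mine.

```python
import struct


def build_tdpt(cylinders, heads, sectors, phys_cyl, phys_heads, phys_sectors):
    """Pack the 16-byte table using the offsets from the CreateTDPT pseudocode."""
    body = struct.pack(
        "<H6BHB4x",
        cylinders,                 # DPT[0]  logical cylinders (WORD)
        heads,                     # DPT[2]  logical heads
        sectors,                   # DPT[3]  logical sectors
        0xA0,                      # DPT[4]  signature
        0x08 if heads > 8 else 0,  # DPT[5]  heads-above-8 flag
        0x00,                      # DPT[6]  ignored
        phys_sectors,              # DPT[7]  original sectors
        phys_cyl,                  # DPT[8]  original cylinders (WORD)
        phys_heads,                # DPT[10] original heads
    )                              # DPT[11..14] are left as zero padding
    # Assumption: checksum byte makes the whole table sum to 0 modulo 256.
    checksum = (-sum(body)) & 0xFF
    return body + bytes([checksum])


if __name__ == "__main__":
    table = build_tdpt(0x400, 0xFF, 0x3F, 0x3FFF, 0x10, 0x3F)
    print(table.hex())
```

For the 60GB SSD values the last byte comes out as 0x89, matching the table above.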

2010-12-24

Acer 5810tz's "Secret" BIOS Menu


As described in http://marcansoft.com/blog/2009/06/enabling-intel-vt-on-the-aspire-8930g/ and on NotebookReview's Forum, there are hidden menus in Insyde's Acer  BIOS.  Some folks modify their BIOSes to enable them; however,  on the Acer 5810, it's possible to enable some of these menus with just a setup change, which is, of course, safer than having to flash your whole BIOS just to modify a single boolean.

To do so, you need:
Copy the utilities to the USB disk, create a batch file (e.g. modify.bat), and add the following:
flashit Setup A04A27F4-DF00-4D42-B552-39511302113D /rb:setup
grdb setup
flashit Setup A04A27F4-DF00-4D42-B552-39511302113D /wb:setup
     
Reboot to DOS on the USB stick, run the batchfile and in GRDB type:
e 31a 1 
q

When you reboot, you should have an extra menu "Intel" below "D2D Recovery"

The "hidden setup" is usually located immediately after the "D2D recovery" variable.  If you wish to locate it on a different version of Acer Insyde bios, you can write out the setup variable ( the /RB command-line) with and without D2D recovery enabled.  The location one past that will be what you need. For example:
fc /b setup1 setup2
Comparing files SETUP1 and SETUP2
00000219: 00 01
The byte one past that is at 0x21A (or 0x31A if you are using GRDB, because GRDB pretends it's a .COM file and so loads it at offset 0x100).
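
If you'd rather not do the hex arithmetic by hand, this Python sketch does the same thing as the fc /b comparison: find the byte that changes when D2D recovery is toggled, step one past it, and add the 0x100 that a .COM-style load introduces. The function names are mine, and the "one past the D2D byte" rule is just the observation above, not anything guaranteed for other BIOS versions.

```python
def first_difference(a, b):
    """Return the offset of the first differing byte, or None if none differ."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    return None


def hidden_menu_offsets(d2d_off, d2d_on, com_base=0x100):
    """Return (file offset, GRDB address) of the byte right after the D2D flag."""
    d2d = first_difference(d2d_off, d2d_on)
    if d2d is None:
        raise ValueError("the two setup dumps are identical")
    file_offset = d2d + 1
    return file_offset, file_offset + com_base


if __name__ == "__main__":
    # setup1/setup2 are the two dumps written with the /rb command line.
    with open("setup1", "rb") as f1, open("setup2", "rb") as f2:
        file_off, grdb_addr = hidden_menu_offsets(f1.read(), f2.read())
    print("file offset 0x%X, GRDB address 0x%X" % (file_off, grdb_addr))
```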

Although it's unlikely that messing up the setup variables will cause the laptop to not boot, it would be prudent to know how to use the FN-ESC "Crisis Recovery mode" and to know what the filename is for your particular laptop.  (Mine is JM41X64.fd)
Possibly Useful Reference:
Post On Lenovo Forum With FlashIt Parameters

Pictures:
"Secret" menu available. 
Intel->

Intel->Power

Intel->Advanced

Intel->Advanced->Boot

2010-12-22

The Byte That Bit Me Insyde

Since Insyde doesn't seem interested in patching its BIOS, I thought I'd share a neat way to make a laptop with an Insyde BIOS hang on boot by changing a single byte.

Caution: Back up any important data, if you're silly enough to do this on a live system; while this shouldn't mangle any of your bits, it's certainly possible it could.

Also, if you're not terribly fond of grasping naked sectors and pushing values into unfilled gaps, you might want to bail out now.

The Prerequisites:
  • Laptop with buggy Insyde EFI BIOS.
  • SSD or HDD (physical medium unimportant)
  • Two or more partitions (which is the default Windows 7 configuration, and tends to be the default Linux installation as well).
  • The SATA port configured as AHCI in BIOS.
My hardware configuration:
Acer 5810tz Timeline Notebook, with InsydeBIOS Release 1.35.

Now, with your favorite disk editor (I'm using the very excellent, free and portable HxD):

HxD with Physical Disk 1


The Byte To Change
  • Select the first partition of your first physical disk (labelled Physical Drive 1 in HxD).
  • Go to sector 2048.    
  • Go to offset 0x19 (25 decimal) and change from 0 to any nonzero value.
  • Reboot
  • Watch your laptop with an Insyde BIOS freeze.

Don't panic!  Simply:

  •  disconnect the laptop drive,
  •  bop into BIOS (F2)
  •  change the SATA mode from AHCI to IDE.
  •  Reconnect your hard drive, and boot.  You'll hit the usual Windows BSOD complaining about you trying to boot with the wrong set of disk drivers; boot into the 32 bit recovery console (since many utilities don't yet have 64bit equivalents and the WOW64 subsystem probably won't be available (especially if you're booting a mini-xp environment off a thumb drive instead))
  •  Run your sector editor, and change byte at offset 0x19 back to 0.
  •  Reboot normally.

So, what's going on here?

Sector 2048's where Windows 7 puts the start of its first partition.  From the partition boot sector layout we can see the particular byte is part of the boot parameter block, specifically the high byte of the Sectors Per Track word, which happens to be ignored by Windows 7.
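
To see what that one byte does to the field, here's a Python sketch that reads the two geometry words out of a boot sector: Sectors Per Track at offset 0x18 and Number of Heads at 0x1A, both little-endian. The sector contents below are a made-up stand-in (63 sectors, 255 heads), not a dump from my drive.

```python
import struct


def bpb_geometry(sector):
    """Return (sectors_per_track, heads) from the BPB words at 0x18 and 0x1A."""
    sectors_per_track, heads = struct.unpack_from("<HH", sector, 0x18)
    return sectors_per_track, heads


def demo():
    """Build a hypothetical boot sector, then poke the byte at offset 0x19."""
    sector = bytearray(512)
    struct.pack_into("<HH", sector, 0x18, 63, 255)  # typical-looking values
    before = bpb_geometry(sector)
    sector[0x19] = 0x01  # the high byte of the Sectors Per Track word
    after = bpb_geometry(sector)
    return before, after


if __name__ == "__main__":
    print(demo())
```

Setting the byte at 0x19 to 1 turns a Sectors Per Track of 63 (0x003F) into 319 (0x013F), a value no CHS scheme can represent.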

What's a BIOS doing, caring about this?

Near as I can tell, this could be an attempt at an AHCI optimization, and the BIOS code simply fails to do a sanity check on the range.

Q. Except for the occasional black hat looking for a chuckle, who'd want to hang their laptop?
A. Anyone installing Linux, BSD, or full disk encryption such as Truecrypt and PGPwde.

These, as part of their normal operation, can change that special byte, giving much excitement and hair pulling to the lucky person whose BIOS, in an effort to be a most helpful puppy, manages to decorate the newspaper with excrement after having a chew with it.

I'm so glad EFI has led us away from incompatible, buggy BIOSes!

Tools mentioned:
HxD Hex Editor (used for sector editing)

References:
Homepage of Insyde

Others probably running into this specific issue:
Seen on HPs, blamed on encryption
And Lenovos, blamed on Truecrypt.
Seen again, attributed to AHCI+Truecrypt conflict.
Even with BSD!