Hi. Teresa here, still mournful. It looks like I’ve lost all my files since early September, including all the notes, drafts, and research for several writing projects. I’ve also lost the text of various books authors had sent me, and one large graphics project.
It’s funny. Losing data doesn’t feel like the loss of a possession. It feels like you’ve lost a part of yourself.
If any of you have copies of material (including e-mail) which you sent me or I sent you, or of anything else that would have been in my lost files, and you can see your way clear to re-sending it, I’ll be most grateful.
I’ve temporarily put up a tip jar at the top of the left-hand column. If you feel like helping underwrite the data recovery process, the new hard drive, and the backup system, that would be just wonderful. No guilt or anything if you can’t or don’t, but book publishing is notoriously a shoestring operation, with all that that implies.
And I promise to back up my files oftener in the future. I’ve had weird luck with that. I’ve had two other major data-loss disasters in my computer-using life, and they were both freak accidents that happened while I was backing up.
All best —
t.
Addendum: Some items from the comment thread
From Danny O’Brien:
I hope you get enough money to try data recovery. Good data recovery places are marvellous - they have many many tricks to abstract data. Probability is on their side, too. A hardware problem generally only damages a small part of the recorded area, while the rest of the data is preserved in aspic. In many ways, data recovery is a job that attracts the very best in geekery. It requires attention to detail, a forensic spirit, and (because the best stay in contact with the frantic owner every minute of the process) a keen understanding and sympathy for human nature at its most vulnerable. Also, the rewards mental and monetary are fantastic. I bet a lot of people fall in love with their data recoverer. I bet data recoverers have groupies.
So my advice is to keep heart, keep your drive safe until you can afford to fix it, reassure yourself that you have probably not lost anything, but merely gained a little early personal archaeology. And do try not to run off with the tall dark handsome stranger with the neat set of CD-Rs I see in your future.
From Tom Whitmore:
One silly idea suggested to me when mine died (and the people who are actually technically competent can tell me if this makes any sense at all, because it sounds like magic to me) is to put the drive in the freezer for about 10 minutes and then try to boot it again. I did not do this, and I have no idea if it would work or exactly how….
From Jordin Kare:
Tom,
It’s not silly; I’ve heard of it being done. I’m not sure how it works -- I can think of three or four possible effects -- but it sometimes does. It’s worth a try on Teresa’s disk, but only because it doesn’t cost anything and won’t hurt if it doesn’t work. (Teresa, if you try it, just be sure no water has condensed on the drive before you apply power to it.) But I’d be very surprised if it helped.
The driver board swap is also a reasonable thing to try, but has some risk of doing more damage if you break cables or bend pins. I’d try it if I had a duplicate drive (I’ve done it in the past) but it’s not something to try casually. (I was going to say it was really unlikely to work on a modern drive, but thinking about the noise this particular drive made, it’s just possible the problem is a blown transistor in the head actuator drive, which a board swap would fix.) (Teresa, if you want someone to try it, I can probably find a duplicate drive, and do something useful with it afterward if the swap doesn’t help.)
From Erik V. Olson:
Drive freezing works if you have weak connections (the cold shrinks everything -- including the space in between the connectors.) It can also break loose a head that’s stuck to the platter.
It also gets the greatest looks when the accounting folks see you walk into the break room and take a drive out of the freezer. “Well, the magnetic field on the platter weakens with heat -- you hit the curie point, and the data’s gone -- so if you store your hard drives in the freezer, they’ll last longer. Why do you think superconducting magnets have to be kept so cold?” Try to keep a straight face. See how many floppies show up in the freezer.
In the waving a dead chicken file, this is what I do with a drive that is not going to be sent for recovery.
1) Move it to another computer -- if you are dealing with a marginal power supply, this might light it.
2) Whack it at startup. See the infamous Quantum 40MB and 80MB drives that lived in the Mac SE and SE30 that loved to develop stiction. The fun was pulling apart an SE, whacking the drive, which would then spin, then putting the SE back together with the power on -- without hitting the yoke of the built in monitor, with Dire Consequence.
3) Freeze it for 15 minutes, then power it up. As Jordin says, condensation bad. Do this on a dry, low humidity day.
4) 2+3 equals 4, in this particular case. This is the “kick in the pants with a frozen boot” technique.
5) If you can, swap controller boards. This requires another identical drive -- one you don’t care about, since it is easy to mung the controller board, thus killing another drive.
6) This is hardcore -- but it’s worked for me. Once. You really shouldn’t even try.
Seriously. Don’t go here. Honest. This is commitment, and there is no going back -- once you do this, you either get the data, or nobody does.
Okay. You take the lid off and spin the platters with your finger, TOUCHING ONLY THE HUB, then flip on the power. If it spins, you DUPE THE DISK RIGHT THEN AND THERE. The drive’s lifetime is now measured in minutes, and every speck of dust that hits that platter is one more chance for the heads to hit something and come crashing down into the discs themselves. This is known as “a head crash” to the boring, and “Gone Farming” to the rest of us. If you’ve seen the resulting furrows, you’ll understand.
Of course, there is no recovery of data that the heads have scraped off the disc. So don’t even think about pulling out screw one until you have another hard drive tested, formatted and online. If your duplication command is complex, write it out as a batch file. Seconds can count -- the one time this worked for me, when the copy ended, I turned to report to the boss that I’d managed to get the data off, whereupon, SCREEECLUNK, and it was dead, and there was a lovely amount of metallic powder in the air.
Magnetic powder, no less. Not much magnetism, mind you -- but powder doesn’t need a whole lot of pull.
Once you’ve cracked the lid, you either get the data off, or you don’t. If you don’t, then, well, go ahead and pull the platters out, they’re kind of neat and shiny, until you get fingerprints all over them, and boy, do they take fingerprints. Some ring like cymbals, but most are fairly atonal. They are, until you mess with them, very, very flat indeed, so if you need a small reference surface, there you go.
Needless to say, if there’s any chance you’ll be sending the drives to the pros, don’t do this. If you do, the chances of you getting the data back drop -- and the cost of not getting that data back rises.
Finally, well, a quarter pound of black powder sends dead drives a very long way indeed. Not very good at data recovery, but soothing in some ways none the less.
Way bummer. Best of luck dealing with Data Retrieval.
While I haven't set up such a regime myself, I understand that a CD-RW drive w/ a CD-RW disc can be set up to do a scheduled incremental backup. Put in a disk, set up a schedule, and tape the drive shut so you don't unmindfully swap disks.
Having just gone through this myself, I've been astonished to discover how much Windows had actually saved on the drive that I'd been trying to move stuff from. Yes, I've been sloppy with backups too; in this case, Nanny Bill actually seems to have worked in my favor. I'm not a big Windows fan, but I've ended up not as behind as I might be because of being on a Windows box.
Generally, my response has been "What! That's still there? I thought I'd moved that to the dead drive!"
Condolences anyway.
Tom
Bizarrely, I also suffered hard drive death last week. I figure I've lost all my posts and correspondence from the past 18 months or so. Turns out it wasn't just the drive - the awful clicking noise pretty much confirmed that was dead even before we tested it - but also the motherboard. Dead computer, in other words. I've bought a new one, which gets delivered tomorrow, but what a pain. All that software to reload and reconfigure. Believe me, Teresa, I feel your pain.
I remember Patrick once saying 'There are only two types of computer: those that have suffered catastrophic hard drive failure, and those that will eventually'. Guess I'll be more careful about back-up in future.
Ack. My sympathies. I lost my external a day after I upgraded to 10.2 (odd that people are now reporting similar problems with 10.3).
As almost goes without saying, I had backed up everything from my internal to my external, reformatted the internal, and hadn't moved all the stuff back onto the internal when the external failed. I was getting those ominous clicking noises too.
Although the disk wouldn't mount in the Finder, amazingly enough, Unix could see it, and I managed to navigate around in the terminal and copy back everything that was irreplaceable before the drive truly gave up the ghost. Using the "cp" command meant that I lost creator codes and resource forks, which was an annoyance, but you can use "ditto -v -rsrcFork" (do a "man ditto" to read up on the usage of ditto) to make copies that preserve this stuff.
Anyhow, give it a shot. It worked for me.
I'm not any help at all with the tech stuff, but I do want to offer my sympathy.
I hope you get enough money to try data recovery. Good data recovery places are marvellous - they have many many tricks to abstract data. Probability is on their side, too. A hardware problem generally only damages a small part of the recorded area, while the rest of the data is preserved in aspic. In many ways, data recovery is a job that attracts the very best in geekery. It requires attention to detail, a forensic spirit, and (because the best stay in contact with the frantic owner every minute of the process) a keen understanding and sympathy for human nature at its most vulnerable. Also, the rewards mental and monetary are fantastic. I bet a lot of people fall in love with their data recoverer. I bet data recoverers have groupies.
So my advice is to keep heart, keep your drive safe until you can afford to fix it, reassure yourself that you have probably not lost anything, but merely gained a little early personal archaeology. And do try not to run off with the tall dark handsome stranger with the neat set of CD-Rs I see in your future.
Before I upgraded to Panther, I backed up my G4 to a Firewire drive using Apple's Backup program from .mac.
I didn't have any problems with Panther, so I haven't needed the backup.
Which is lucky, since I've seen reports that Apple's Backup program *might not work with Panther*.
You'd think they'd make sure they didn't break their own Backup application.
Ach, I know what you're going through - last year, my hard drive died and took five years worth of un-backed up writing with it. Worse, when I tried to recover some of the files that my girlfriend had copies of, she told me that just a few days earlier she'd reformatted her hard drive, and that the back-up CD she'd made had turned out to be no good. The best you can do is curse the cruel gods responsible, try to save what and where you can, and move forward.
As others have said, my condolences. I understand how it feels. I STILL get upset about losing 3 weeks worth of online storage from nearly six years ago, due to a combo of bad hard drive failure and a corrupted set of backups (or possibly a non-existent set, but it wasn't under my control).
My best hopes that you can get the data recovered.
All this talk of dying drives makes me nervous, as I long to upgrade my computer, but have to wait until I find a job. In the mean time, my hard drive is nearly maxed out, and occasionally makes scary spitting noises.
I fear the phantom pain that comes from losing my data. I still remember the pile of sketches I lost at a Republican convention I attended the year I was in grade eight. (Karma, you say?) I remember the perfect beautiful photo of Arches National Park that got destroyed when my brother opened my camera and exposed the film. If I lost this hard drive, I'd lose five years worth of digital art work, plus scanned photos that I don't have copies of, plus two unfinished novels and their notes, and all incarnations of my website. I can imagine your pain. Hopefully some restoration of data will be possible.
Teresa, that's horrible. I second what Danny said. Real data recovery pros can get almost everything back. But the one advantage of Windows is that it's easy to have multiple machines synchronising everything you care about. I would be vulnerable to three hard disks failing on three different machines. But otherwise (touch wood) everything gets backed up every day. Surely there is a Mac equivalent to the Windows "Second copy"?
I feel your pain. When I was a child, at least twice we moved with only the clothes on our backs. I hope everything turns out better than you expect.
Oh, no... I'm so sorry. What a horrible feeling, and one I too know only too well. Losing 6 chapters of draft on one project and Maude only knows how many pages of notes and bibliography on another back at the beginning of the summer has kept me religiously doing backups to CDR on the 15th of every month... and storing them in a safety deposit box at my bank!
Here's hoping you get a crack at data recovery. I was lucky enough to have some very able geeks at my disposal who helped me comb what was left of my HD's wreckage and thus let me save at least some of what I otherwise would've lost entirely. It made a huge difference.
PiscusFiche, you might want to think about getting a CDROM burner. They've gotten pretty cheap, and the blanks are quite cheap. A few hours with one would enable you to back up hundreds of megabytes of documents and images. You might even be able to borrow one, or get someone who has one to bring their computer over and network it to yours and make you a bunch of CDROMs. It's not a reasonable way to do day-to-day backups, but five years' work deserves a little effort to ensure it outlasts your hard disk.
Do not throw that drive out. One possible suggestion, if you're up for something that will cost you a few dollars and that might not work: if it is a controller failure, and not the less-frequent drive-mechanics failure, you might be able to try the following:
It doesn't always work, but I've been able to recover some critical files like this. Best wishes, if you're willing to give it a try.
One silly idea suggested to me when mine died (and the people who are actually technically competent can tell me if this makes any sense at all, because it sounds like magic to me) is to put the drive in the freezer for about 10 minutes and then try to boot it again. I did not do this, and I have no idea if it would work or exactly how....
Cheers,
Tom
I was installing OS X 10.3 betas on a weekly basis for a while, and got into the habit of using ditto to back up my home directory before each install, and I also had a complete pre-beta disk image made with Carbon Copy Cloner.
When the beta-test OS finally burned me, I had everything I needed to quickly rebuild my machine. The actual recovery took less than two hours; the other ten hours were spent fiddling with a flaky Firewire drive that I eventually gave up on.
...and before I typed this, I made sure to use CCC to back up my PowerBook onto my desktop machine. No sense begging Murphy for a visit today.
On that note, the simplest way to make portable backups under OS X is to create a sparse disk image and leave it on the desktop. Mount the image, copy the files you care about, unmount it, and you're left with a single file that can be copied to a Windows or Unix box (or one of those little USB keychains), burned to a CDR, whatever.
You can create the image from Disk Copy/Disk Utility in whatever size you want, or do it from the command line as part of a regular backup script: "hdiutil create -fs HFS+ -type SPARSE -size 100m myfiles". A simple shell script can combine hdiutil, ditto, and cron to keep your most important files safely duplicated.
As for "the one that got away", when I moved to California, I carefully made two sets of 8mm tapes containing all of my files, and discovered that both of the tape drives I'd used had misaligned heads. They could read each other's tapes, but no other drive in the world could. Fortunately, I'd made a third copy on a file server back in Ohio, and I was able to copy them across the Internet. Eventually.
-j
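For anyone who wants to script the sparse-image routine described above, here is a minimal sketch rather than a finished tool. It only builds the command lines and prints them; nothing is run. The image name, volume name, and source folder are placeholders, the "-volname" option and the "/Volumes/<volname>" mount point are assumptions about how hdiutil behaves on your system, and the create step should be skipped once the image exists.

```python
import shlex


def backup_commands(image_name, sources, size="100m", volname="Backup"):
    """Return the argv lists for one backup run. Nothing is executed here."""
    # hdiutil adds the .sparseimage suffix when creating a SPARSE image.
    image_path = image_name + ".sparseimage"
    # Assumption: the attached image mounts under /Volumes/<volname>.
    mount_point = "/Volumes/" + volname
    cmds = [
        # First run only: create the sparse image (same flags as in the comment above,
        # plus -volname so the mount point is predictable).
        ["hdiutil", "create", "-fs", "HFS+", "-type", "SPARSE",
         "-size", size, "-volname", volname, image_name],
        # Mount the image.
        ["hdiutil", "attach", image_path],
    ]
    for src in sources:
        # Copy each source folder into a folder of the same name on the image.
        dest = mount_point + "/" + src.rstrip("/").split("/")[-1]
        cmds.append(["ditto", src, dest])
    # Unmount, leaving a single file that can be copied or burned anywhere.
    cmds.append(["hdiutil", "detach", mount_point])
    return cmds


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for argv in backup_commands("myfiles", ["/Users/me/Documents"]):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Once the printed commands look right for your machine, the same list could be handed to subprocess.run one command at a time, and the whole script scheduled from cron.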
Just wanted to add one thought: if there's something relatively small but irreplaceable such as, say, your precious Novel in Progress, it's worth making use of free public web-based storage such as, for instance, Yahoo's Briefcase. Not so good for massive items or huge piles of notes, of course, but even a very long novel bloated up into a WinWord doc is still likely to be less than a meg, which is a tiny amount of space these days. Kind of a safety deposit box.
One silly idea suggested to me when mine died (and the people who are actually technically competent can tell me if this makes any sense at all, because it sounds like magic to me) is to put the drive in the freezer for about 10 minutes and then try to boot it again.
Tom,
It's not silly; I've heard of it being done. I'm not sure how it works -- I can think of three or four possible effects -- but it sometimes does. It's worth a try on Teresa's disk, but only because it doesn't cost anything and won't hurt if it doesn't work. (Teresa, if you try it, just be sure no water has condensed on the drive before you apply power to it.) But I'd be very surprised if it helped.
The driver board swap is also a reasonable thing to try, but has some risk of doing more damage if you break cables or bend pins. I'd try it if I had a duplicate drive (I've done it in the past) but it's not something to try casually. (I was going to say it was really unlikely to work on a modern drive, but thinking about the noise this particular drive made, it's just possible the problem is a blown transistor in the head actuator drive, which a board swap would fix.) (Teresa, if you want someone to try it, I can probably find a duplicate drive, and do something useful with it afterward if the swap doesn't help.)
Drive freezing works if you have weak connections (the cold shrinks everything -- including the space in between the connectors.) It can also break loose a head that's stuck to the platter.
It also gets the greatest looks when the accounting folks see you walk into the break room and take a drive out of the freezer. "Well, the magnetic field on the platter weakens with heat -- you hit the curie point, and the data's *gone* -- so if you store your hard drives in the freezer, they'll last longer. Why do you think superconducting magnets have to be kept so cold?" Try to keep a straight face. See how many floppies show up in the freezer.
In the waving a dead chicken file, this is what I do with a drive that is *not* going to be sent for recovery.
1) Move it to another computer -- if you are dealing with a marginal power supply, this might light it.
2) Whack it at startup. See the infamous Quantum 40MB and 80MB drives that lived in the Mac SE and SE30 that loved to develop stiction. The fun was pulling apart an SE, whacking the drive, which would then spin, then putting the SE back together with the power on -- without hitting the yoke of the built in monitor, with Dire Consequence.
3) Freeze it for 15 minutes, then power it up. As Jordin says, condensation bad. Do this on a dry, low humidity day.
4) 2+3 equals 4, in this particular case. This is the "kick in the pants with a frozen boot" technique.
5) If you can, swap controller boards. This requires another identical drive -- one you *don't* care about, since it is easy to mung the controller board, thus killing another drive.
6) This is hardcore -- but it's worked for me. Once. You really shouldn't even try.
Seriously. Don't go here. Honest. This is commitment, and there is no going back -- once you do this, you either get the data, or nobody does.
Okay. You take the lid off and spin the platters with your finger, TOUCHING ONLY THE HUB, then flip on the power. If it spins, you DUPE THE DISK RIGHT THEN AND THERE. The drive's lifetime is now measured in minutes, and every speck of dust that hits that platter is one more chance for the heads to hit something and come crashing down into the discs themselves. This is known as "a head crash" to the boring, and "Gone Farming" to the rest of us. If you've seen the resulting furrows, you'll understand.
Of course, there is no recovery of data that the heads have scraped off the disc. So don't even think about pulling out screw one until you have another hard drive tested, formatted and online. If your duplication command is complex, write it out as a batch file. Seconds can count -- the one time this worked for me, when the copy ended, I turned to report to the boss that I'd managed to get the data off, whereupon, SCREEECLUNK, and it was dead, and there was a lovely amount of metallic powder in the air.
Magnetic powder, no less. Not much magnetism, mind you -- but powder doesn't need a whole lot of pull.
Once you've cracked the lid, you either get the data off, or you don't. If you don't, then, well, go ahead and pull the platters out, they're kind of neat and shiny, until you get fingerprints all over them, and boy, do they take fingerprints. Some ring like cymbals, but most are fairly atonal. They are, until you mess with them, very, very flat indeed, so if you need a small reference surface, there you go.
Needless to say, if there's any chance you'll be sending the drives to the pros, don't do this. If you do, the chances of you getting the data back drop -- and the cost of not getting that data back rises.
Finally, well, a quarter pound of black powder sends dead drives a very long way indeed. Not very good at data recovery, but soothing in some ways none the less.
Condolences.
I've been in the position of having lost multiple months of work myself - a huge animation project, back in the days when I only had floppies to back up onto. It feels like you'll never have enough energy to get back to where you started...
If the drive ends up with the Kares in Seattle I'd be happy to furnish guns and ammunition for some values of trouble shooting.
Thank you all (she said, feebly). I got hit with a nasty virus and an attack of Lolitas on top of the data loss, so things are just getting ridiculous around here.
(Woe! Misery! Why me? --though as Tom Doherty says, it's better to suffer unjustly.)
The news on the Lolitas is that the site has been shut down for spam abuse. Apparently we're one of a handful of sites that got hit with Lolitas after the site was shut down. There's some suspicion that the spam was vindictive.
"I got hit with a nasty virus and an attack of Lolitas"
This sentence gave me a very odd mental picture of you. Probably because I just finished watching Hand Maid May.
-j
Jeremy: Thanks for the suggestion.
Argh! My very heartfelt condolences.
And I feel your pain: I lost the internal hard drive of my powerbook this past Thursday... yup, due to a freak accident during backup. I think it's karma's way of saying, "You can't get out of this by backing up just when you think you have a problem, buddy. Do it right."
On a positive note, the 10.3 Firewire data-loss bug has been fixed... at least by the drive manufacturers: they're offering firmware updates. The problem extends to some Firewire 400 drives, so check it out: I just had to update my LaCie d2 120.
I second Carbon Copy Cloner as a backup solution, as well. That, plus a second firewire drive (or if you're feeling racy, two in a RAID using Apple's Disk Utility) should minimize the chances of lightning striking you twice.
Erik Olson writes:
"Finally, well, a quarter pound of black powder sends dead drives a very long way indeed. Not very good at data recovery, but soothing in some ways none the less."
Or simply embed the drive in a pumpkin.
Just one additional note -- if you do plan to get professional help, don't try the do-it-yourself options. It's tempting to try the cheap stuff before paying a pro, but all the freezing, booting, etc. has a non-zero chance of making things worse.
The good news is, the data is probably still there, no matter what you've done. The bad news, it can be pretty hard to get to the data, and can cost quite a few $. There is a Swedish or Norwegian firm that specializes in retrieving data, and I've only heard about one case where they weren't at least partially successful. I'm sure there is an equivalent US firm.
The one case where the firm wasn't successful was rather interesting. A company had a computer that maintained all their (electronic) locks, and allowed people to use their mag-keys to get in. They got a new sys-op, who wisely enough decided to take a back-up of the program and all its data, as no-one had done that in all the time the system had existed. He turned off the computer, booted it, and tried to take a backup, but with no luck. Suddenly the computer wouldn't start up - he inspected the machine, and found out that it was a hard-disk problem. Since it was critical data, they sent the harddisk to the company that specializes in data retrieval, so they could get the data back.
No luck. The harddisk was so old that the magnetic dust on the disks had fallen off, and the system had only run because it was running from the RAM. That's why no-one found out, until the system was actually turned off.
devastating news. i hope you folks are following the recovery discussions for this panther/firewire bug on www.macintouch.com and www.xlr8yourmac.com. a utility by prosoft has had some success for some people.
Bill, Erik, clark, no way am I blowing up my old drives. I've been collecting platters for some years now, hoping to someday string them up into a chiming mobile. A good platter will ring with a pure, clear note that holds longer than a tuning fork. I prize a stack of five platters I got out of a boxy old hard drive because they're so much smaller than the usual sort, and so have a different pitch. I've heard rumors that iPods have a little bitty platter in them. If so -- well, I wouldn't wish a major hardware failure on anyone, but since the world is full of hazards, they're bound to happen; and if that happens to an iPod, I want a look at its innards.
I'm now trying to remember who it was who told me about a friend of his who used dead platters to make a techno version of a sort of ancient Greek noisemaker. Maybe it was Singer.
I also collect the brass doughnuts off the ends of guitar and bass strings. I have a lot more guitar-gauge doughnuts, for obvious reasons, though I continue to hope that John Sobel will remember me the next time he re-strings.
Alan, I'm not going to try the DIY options. I've taken apart automobile engines and old radios. I can cheerfully watch my own innards being worked on ("Hey, look! It really does bleed in spurts!"). But the minute you take the cover plates off my computer, I start feeling faint and apprehensive, and have to look away. Needless to say, I score a strong positive on the one-shot geek test.
Jazz, what are the odds of a failure happening while you're backing up? I've had it happen to me twice, but in the Legion I'm known as Low Probability Woman, so I figured it was just another coincidence. But it happened to you, too -- and now that I think about it, it also happened to Adam. Is this simple weirdness, or is there some explanation for it?
Kristjan, that's an amazing story. I've never heard of the magnetic dust falling off the disks, though that may be because they fail before the dust can fall off. Any idea how long it had been running?
I have to wonder about that previous sysop. I'm bereft of a bunch of data, and badly hampered thereby, but except for the submissions and the graphics project -- oops, and all of Patrick's recordings, damn damn damn -- it was all my data. It wasn't my job to maintain a whole company's worth of data.
Back in my days as an office temp, I'd sometimes get hired to do filing in the aftermath of some major piece of dereliction. Say Miss Grimsby, the long-tenured and long-suffering secretary at some firm, moves away or retires. Weeks later, the company discovers that she'd stopped doing any filing during her last year with them. Instead, she'd stacked the "to be filed" documents into the back corner between the last file cabinet and the wall. When the stack got too tall, she'd started another stack next to it. On occasions when a particular document had been called for, she'd been able to access it because she remembered where everything was in the stacks.
Temps are forever getting glimpses of the psychodramas of small offices. I've always wondered whether that stack of unfiled documents was her reply to years and years of "Miss Grimsby is the only one who knows [how things work] [where to find anything] around here," which in clerical work is often code for "I've avoided learning anything about our office procedures, because that way they'll always be Miss Grimsby's problem, not mine."
To crown their folly, the firm would then hire a temp to do the back filing. When you're a temp, it's easiest to just file stuff any which way. It's another kind of data loss. Remarkably common, too.
Teresa, there is also what looks like a good article on this issue at MacNightOwl.
"I have to wonder about that previous sysop."
I no longer wonder. I once had to clean up after an entire group of them, who not only failed to do the job, but lied to cover up their failure. It's a little tale I call "Backups? What backups?".
-j
Claude, I don't think I can properly explain why I'm making these odd noises, but you might want to feed the title of the author's book plus my name into Google Groups. You'll find a lot of people there whom you already know.
The Ur-thread starts here.
"what are the odds of a failure happening while you're backing up? I've had it happen to me twice, but in the Legion I'm known as Low Probability Woman, so I figured it was just another coincidence."
Well, I've been called the "weirdness magnet," so it could be the same thing. (Have you noticed that there really are certain people who just draw The Oddness around and upon themselves like... er, um, magnetic dust to a platter? I know some people...)
But seriously, I do think that backup software might just aggravate a flaky hard drive to the point of collapse. Consider:
Here we have your hard drive, which, unbeknownst to you, is about thirty pageouts away from a serious nervous breakdown. Were it a person, it would be that nice, quiet guy at the end of the block, who never bothers the neighbors.
Here we have backup software, whose job is to sort through that hard drive from top to bottom, read every bit on it, and write it to something else. This can be considered similar to a team of polite but ruthless forensics detectives. They walk into this guy's bedroom, start in the left-hand corner closest to the door, and carefully rifle through and catalog the contents of that room, cubic inch by cubic inch. They are scrupulously careful to put everything back into precisely the same place.
I think it could be just the thing to put a borderline case right over the edge. Out comes the machete, furniture and body parts start flying around, and you read about it the next day on the Evening News. Or Fox, if the guy was a registered Democrat.
To deanthropomorphize for a second, I mean that when the backup software reaches the point where the hard drive is seriously flaky, and starts doing intensive reads off of it, I can see that crashing the system and fragging the data to hell. Which is what happened to me.
The upshot of the story is, regular incremental backups tend to avoid this problem by not waiting until the hard drive is a basket case with a sock drawer full of carefully catalogued body parts, at which point it's generally too late. And as an added bonus, if you do crash in the midst of data backup, hopefully you've only lost a day's (or week's, depending) work.
I lost a month. I'm shifting to weekly backups, me...
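For anyone wondering what "incremental" means in practice, here is a minimal Python sketch: copy only the files changed since the last run, so the backup never has to read the whole disk at once. The function name and the cutoff-timestamp approach are illustrative assumptions, not how any particular backup product works.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, dest: Path, since: float) -> list[Path]:
    """Copy files under `source` modified after `since` (epoch seconds) into `dest`.

    Returns the relative paths that were copied. Hypothetical helper for
    illustration; real backup tools also track deletions and verify copies.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        # Only regular files newer than the cutoff are touched.
        if path.is_file() and path.stat().st_mtime > since:
            relative = path.relative_to(source)
            target = dest / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 keeps the modification time
            copied.append(relative)
    return copied
```

Run it daily with `since` set to the time of the previous run, and a crash mid-backup costs at most the work done since then.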
Not that this advice does much good now, but if your computer contains something expensive in time or money to replace, you *really* should be mirroring your hard drive. That'll protect you from crashes; accidental deletions and fires are another matter.
Kristjan, that's an amazing story. I've never heard of the magnetic dust falling off the disks, though that may be because they fail before the dust can fall off. Any idea how long it had been running?
Teresa, a conservative estimate is at least 10 years, but it's really unknown, as none of the people who installed the system in the first place is still around in the company.
The retrieving company apparently hadn't tried anything similar before, so it must be pretty rare.
Oh, and about the former sysop, and his lack of backups. Let's just say that there were good reasons why there was a new sysop.
As an aside, I can mention that while I like the idea of a wind-chime, I'm using my harddisks differently - I'm building a nice bedside table out of them (so far I've got about 15 harddisks, but then I am studying computer science...).
Shades of racing drive arrays across the floor - lots of uses for bigger platters - I suspect the key to failures while backing up is not merely hardware stress but rather that the operative word is backup - there have been many miracle backup applications out there that promise incremental disk spanning compressed on the fly back ups to your medium of choice most of which are discovered after the fact to support only specific O/S hardware combinations - when simple copies would have been more robust.
When I came onto my last position, as the netadmin, I was assured the backup system was working.
I said, of course, "Really?"
"Yep. Tested it myself."
"Okay," I said. Then I reached down to the development server, and pulled two drives out of the seven drive RAID-5 array. Suffice to say, the system was Not Happy. Neither, for that matter, was the lame developer turned lame sysadmin. I pulled two coldswap drives out of storage, put them into the system, and said. "This is another test. Restore this box."
Later that day, after explaining to the CIO exactly why the dev server was still offline, I got the authority to start investing in a real backup solution.
I then put the old drives back in. Lucky, was I -- he hadn't mauled the other five trying to restore, and I didn't maul the filesystem by yanking drives. Sometimes, you just have to force yourself to be lucky. L'audace, l'audace, toujours l'audace!
"Jazz, what are the odds of a failure happening while you're backing up?"
Standard practice is to keep at least two backups and alternate between them daily. That way, if one fails, you've only lost a day's work. In addition, every month or so, do a complete archival backup in addition, and file that off-site. Of course, all this implies removable media drives or internet backup. DVDs are about the best inexpensive backup solution available--their long-term properties are unknown, though. Professional shops use tape, but that is très expensive.
Even at best, disk drives are subject to mechanical wear--they will fail eventually, even with the best care. In addition, no magnetic medium is truly archival; all it takes is exposure to a strong magnetic field, and goodbye data. An automated solution to the backup problem for general computer users is long overdue.
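The rotation described above can be sketched in a few lines of Python. The set names and the choice of the first of the month for the archival run are assumptions made for illustration; the only idea taken from the comment is "alternate two sets daily, plus a monthly off-site archive."

```python
from datetime import date


def backup_plan(day: date) -> list[str]:
    """Return which backups to run on `day` under a two-set daily rotation."""
    sets = ["set A", "set B"]  # hypothetical labels for the two alternating backups
    # Consecutive days have consecutive ordinals, so the parity alternates daily.
    plan = [sets[day.toordinal() % 2]]
    if day.day == 1:
        # Assumed schedule: the monthly archival copy runs on the 1st.
        plan.append("monthly archive (off-site)")
    return plan
```

If one set turns out to be bad, the other is at most a day older.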
oh, and sympathies on the loss and the plague. Computer problems make me feel like I've lost part of my life, too.
"Standard practice is to keep at least two backups and alternate between them daily. That way, if one fails, you've only lost a day's work. In addition, every month or so, do a complete archival backup in addition, and file that off-site."
Randolph, thank you for this. It's the strategy that (via trial and error) my shop ended up with, and I'm currently trying to explain to the People In Charge why it's a good idea to add SSH-over-internet backup to an offsite server. Backup strategies seem so paranoid, until they save your butt.
Unfortunately, tape drives are expensively subject to mechanical wear, as are the tapes. There is no solution that doesn't require constant vigilance, and we really need to figure out a consumer-level product that keeps Teresa's plight from repeating itself.
Jazz, the proverbial method for reckoning the budget for backup systems goes like this: Assume that you've just lost all your data. How much would you pay to get it back? That figure is the budget for your backup system.
On another subject: Anent my earlier exchange with Claude Muncey, I can only assume that Jo, Yog, Xopher, Graydon, and Brenda Clough haven't been through here lately.
Erik:
Who puts RAID across seven drives? 6 bits data to 1 bit parity? I've never seen it across more than five. I mean, that's just asking for trouble. Was this somebody's idea of saving money on drives?
I don't think five-and-two can even catch a double error, and I know it can't correct it--I suppose I should sit down and do the math to be certain--what were these people thinking?
Teresa -
I've been through.
It's more that I try to avoid saying things when I'm pretty sure they're not going to be comprehensible without more context than I have time to type, and that one was scarcely comprehensible at the time.
For my money, and I mean that literally, the jury has come in on home recorded CDs/DVDs and the verdict is they *don't* last - for pointers see e.g. the extensive discussion on Daynotes especially Dr. Pournelle and Robert Thompson - whose PC Hardware in a Nutshell from O'Reilly and associated web site has a good discussion with details (some makes - only loosely correlated with brands - and dyes are better than others and so on and so forth). The I in RAID after all means _Inexpensive_ with all that sad to say implies. Thousands of hours MTBF sounds like forever to a single user; to an enterprise it's one head count running around replacing failed drives.
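The MTBF point can be made concrete with a rough calculation in Python. It assumes a constant failure rate and drives running around the clock; the 500,000-hour MTBF and the drive counts in the comments are made-up illustration values, not figures from this thread.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def expected_failures_per_year(drive_count: int, mtbf_hours: float) -> float:
    """Rough expected drive failures per year, assuming a constant failure rate."""
    return drive_count * HOURS_PER_YEAR / mtbf_hours


# Hypothetical numbers: one drive at 500,000 hours MTBF fails about 0.0175
# times a year; a thousand of the same drives fail about 17.5 times a year.
```

Same drive, same spec sheet; the only thing that changes is how many of them you own.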
Wise not to confuse backup with archival storage - the only archival storage I know of for home users is print it out black on white acid free paper and store it safely - that means perhaps MIME encoding baby pictures rather than being caught by things like Epson's - now resolved - archival colors and paper aren't.
Adamsj -- RAID-5 works just fine on 7 drives. The problem with RAID-5 on increasingly large arrays is that your chances of losing two drives increase.
You are right that the chances of misparity increase as you extend the array -- but 6-1 isn't that dangerous, and not that much more dangerous than 5-1.
There is "RAID-6", which uses two drives for parity, but that's fairly rare -- if you need more redundancy, you start mirroring the entire array structure -- two controllers, two cages, two RAID-5 arrays. It then takes three failed drives (minimum) to drop your array. At that point, unless you are doing no maintenance whatsoever, the system board is much more likely to be the weak point.
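For the curious, the single-parity idea is easy to show with XOR in Python. This is a simplified sketch of one stripe across six data blocks and one parity block; a real RAID-5 controller rotates parity across the drives and works on much larger blocks, and the block contents here are invented.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Six hypothetical data blocks, one per data drive, plus one parity block.
data = [b"drive0", b"drive1", b"drive2", b"drive3", b"drive4", b"drive5"]
parity = xor_blocks(data)

# Lose any one drive: XOR of the survivors and the parity rebuilds it.
survivors = data[:2] + data[3:] + [parity]
rebuilt = xor_blocks(survivors)
assert rebuilt == data[2]

# Lose two drives: the survivors only yield the XOR of the two missing
# blocks, which is not enough to recover either one on its own.
two_lost = xor_blocks(data[:2] + data[4:] + [parity])
assert two_lost == xor_blocks([data[2], data[3]])
```

The width of the array does not change the math; it changes how likely a second drive is to fail before the first one is replaced.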
That particular system was very reliable. The only two failures it had were a system update that installed a broken driver for the RAID controller, and some idiot yanking two drives out at once. :-)
And, as a devel server, it really didn't need much more than two nines.
What bought me more nines, drive wise, wasn't the arrays. The whole point of the array was to protect me against the most common failure mode on a small server -- a single drive going offline. What bought me more nines was spares. If one failed, it was replaced -- the array kept the machine running, and rebuilt the array once the replacement drive was in place. So, the time frame I was running without redundancy was kept very, very small.
When one flipped out (hell, when one started throwing warnings via SMART or some other monitor technology) I'd slap in a cold spare and retire the possibly flaky drive to unimportant roles (as in "payload," though I did use one to hold a bunch of mp3s on the desktop.)
If I needed shorter vulnerability windows, I'd have configured hot spares; for this role, I didn't. Hot-spares cost you another space in the cage you aren't using for storage, plus power and thermal loads.
This is not to say that hot-spares are a bad idea. They're a very good one in many situations -- but not this particular one.
Good gear costs, and while it is tempting to throw more and more redundancy at the problem, you don't often gain anything from it. My target for the local machines at PEI was somewhere around 99.999% -- due to cost constraints, I couldn't make the electrical power more reliable than that.
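To put the "nines" in concrete terms, here is the usual back-of-envelope conversion as a small Python sketch. It assumes a 365-day year and treats availability as a simple fraction of the year; the function name is just for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year allowed by a given availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR
```

Two nines (99%) allows 5,256 minutes a year, about 3.65 days. Five nines (99.999%) allows a little over five minutes.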
Anybody can give you the correct answer, given no other constraints, to a given solvable problem. Good sysadmins and such give you workable answers even in cases where the constraints are very tight -- or, in the very worst cases, tell you exactly why the answer isn't workable.
TNH would have been fine with two mirrored RAID-5 arrays. But, she couldn't afford it, and she couldn't have carried it around. TNH's constraints are mainly cost and portability. Any answer must take those into account.
Esprit de l'escalier - another consideration is that home CD/DVD is much much more likely to start with small errors than tape - certain to for any practical use (but just try to read any useful size array exactly the same twice for exact values of exact - that's one reason we have error correction even for RAM)- the tape process has much better compare and error correct routines - the moving laser writes and having writ moves on - but for home users music, pictures and text information is highly redundant and robust against such small errors - e.g. removing all the vowels from several months emails would not be totally destructive - imagine equivalent dropouts from machine language or even the highest level programming languages or time series data or what have you. Just another complication - I still suggest the Plextor mentioned above and have my doubts about which hard drives are ruggedized for travel - then again I don't know any tapes or homemade optical discs robust for long term exposure in a car.
At work we rotate through two weeks' worth of tape backups. We've had to use a backup once, and the previous day's tape was corrupted, but the day before that wasn't.
Hmm . . . this seems to have become the 'I'll show you mine if you'll show me yours' thread on backups. Well, both as a sometimes sysop and a DBA I spend a lot of my time with this. Here are a few of the schemes used at the manufacturing and ecommerce operations I have supported. The principles are really the same for home though.
The schemes depend on the kind of information that is being backed up. (Yeah, I know -- D'oh.) We have one server that acts as a big (170GB) shared hard disk for our 120 or so users. While some programs and some files are held locally, we encourage (and in some cases require) files to be on this central device. We also have an email server, an old ERP (accounting/sales/inventory/manufacturing) package, a new ERP package, data marts, and specialized software packages to support stuff like plant maintenance and facility management. A total of six racks of servers not counting routers, telecom, or specialized plant hardware (it has its own area in each plant).
We use 40/80 GB DLT tapes (with compression) on roughly the following scheme:
* shared desktop storage - 1 tape
* old ERP - 1 tape
* data marts, new ERP, other systems (this will change) - 1 tape
* email - 1 tape
It comes out to about a quarter terabyte (TB) every night.
This runs from early evening into the middle of the next morning, as it also involves dodging other overnight processes. In a number of cases, what we do is run a database backup to another server, and those files are then swept to tape at a later time -- this reduces the hit on the database server. The tapes are picked up by a courier a couple of times a week, working on a careful rotation system that leaves some of the recent tapes here, and takes the rest offsite to vault storage. Now this is data only, and some of it is a differential backup - only updated objects. We also do bigger and more complete weekly and monthly sets that also go offsite. We can generally get any tape on 24 hours notice easily, but can have one hand carried (with the resulting cost) in 4 to 6 hours. I also as a matter of policy pull a complete backup of the system databases (and in some cases the user databases) on any database server before making any major structural change or upgrade to production systems. These don't immediately go to tape -- they are held on another server, available for a fast restore if things blow up. Doesn't happen too often if you are careful, but I have had a service pack install blow up on me and force a complete rebuild of a SQL Server v6.5 system. While I have all the media for the software ready to go in a file box with copies of the associated documentation just for this purpose, a fresh backup makes everything so much easier.
Our new software is running on a Win2K Enterprise cluster with data on a multi-volume RAID storage area network. (Some RAID-5, some RAID 1+0, depending on the application. RAID-5 has too high a write penalty for high speed transactional systems, in my experience.) The backups are a bit crude, but effective, right now. Next year (it's in the budget, at least) are the new drives, cards, software upgrade, and server to allow us to have a dedicated gigabit network just for backups. We'll be running incremental backups all day long which will be swept straight off the SAN and other production systems. Drool. We should be able to completely restore our ERP systems in less than an hour to a specific point in time. Maximum loss should be less than 5 minutes worth of data.
It's not much different for individuals, just a different scale. You need backups, both close copies for fast restore and safe copies for disasters. And you need to have a speed and granularity of backup that does not get in your way but will stay current. Automation is also a *good thing*.
The niftiest little thing I have been using for "last ditch" backups of my workstations and laptop (the official backup is done with CD-RW) uses a 256 MB Cyclone Flash Key USB pen drive. I just plug it in the back of my laptop and run a script that sweeps the key source code and document directories off the different machines under my desk and copies them to the drive. It does not take too long (10 minutes currently) and when done I just unplug it, replace the plug cover, hang it on a lanyard around my neck, and go home. If nothing else works . . .
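A sweep script of that kind might look something like the Python sketch below. This is not the script described above, just a guess at its general shape; the directory names and the drive path in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path


def sweep(sources: list[Path], key_drive: Path) -> list[Path]:
    """Copy each source directory onto the key drive, under its own name.

    Re-running overwrites files already on the drive (dirs_exist_ok needs
    Python 3.8 or later). Returns the destination directories.
    """
    destinations = []
    for src in sources:
        dest = key_drive / src.name
        shutil.copytree(src, dest, dirs_exist_ok=True)
        destinations.append(dest)
    return destinations


# Example call with made-up paths:
# sweep([Path("/home/me/src"), Path("/home/me/docs")], Path("/mnt/usbkey"))
```

Note that this only adds and overwrites; files deleted from the source stay on the drive until you clear them off by hand.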
Erik:
I guess I'm a provincial.
The only two setups I've ever used were RAID 5 in a 4-and-1 configuration, and (now that drives are cheap) RAID 1 mirroring, either with two drives or three drives partitioned with mostly RAID 1 and some RAID 0 for things like swap and dump. I have made many a midnight run to slap in drives and controllers. Drives were no big deal, but a failed controller always scared me silly.
I am a fan of SMART technology, having seen our failed drive replacement rates plummet (and my midnight rides reduced) once we started using it, and being hardcore about replacing the flaky ones.
When I got trained on RAID (remember the last millennium? When companies invested in training?), the instructor claimed there were highly secure installations using twenty-one drives, 16-and-5, both for extreme reliability and extreme security. I never saw one, so I don't know for certain. (For kicks, I just grabbed the manual--I thought I'd made notes on that page, but no such luck.)
I've never heard of RAID 6, except people selling that term for RAID 1+S--I think I'd just as soon put a hot spare in a slot rather than run two parity drives, but I'm guessing, not figuring.
Claude:
I'm so glad I'm strictly doing DBA work these days. The last system I worked on was too big to back up for any purpose other than replatforming or upgrading. This meant RAID, and fallback, and some proprietary innards to the system which gave that team a sterling record of never, ever having lost or corrupted any data. Still, that was a very nervous job.
The system I work with now backs up to mainframe, and is incredibly slow to restore, but at least they have backups. Shortly before I came here, a bad drive firmware upgrade lost the entire DW. (I was in a meeting today to discuss system work. People other than myself were commenting, not entirely favorably, on how cautious the guy running that system is. Not me.)
It's difficult, because we have a system that does not naturally do differential or incremental backups. I've been thinking about a couple of databases which are huge and are backed up every day, and schemes for creating user-controlled non-full backup for them. Those are areas of the warehouse where users do their own thing, so they are both important and difficult to protect.
We have some large production databases which appear to need daily backups, but there I've drawn the line where I can. Better to back up weekly, 'cause backups are expensive, and if you lose something, apply the daily loads. The DW is on one set of drives, the daily loads are on another, the backups for each are on separate systems, and the changes are quick to apply.
A consultant working with us has one of those neat little memory sticks. I lust after one, but I wonder how long they'll be allowed to exist--as he points out, they are a security nightmare.
One great frustration of moving away from working with hardware (which, granted, I wasn't good at--I hold a world's record from the one really sad mistake I made) is that I move away from things I have to think about. It's much easier being a DBA than a sysadmin, and I can feel my brain cells dying off. Perhaps it's time to buy a lot of cheap old PCs and create a home heating system.
"A consultant working with us has one of those neat little memory sticks. I lust after one, but I wonder how long they'll be allowed to exist--as he points out, they are a security nightmare."
[the 256MB models are reliably under $100 now; I've seen decent ones for under $60.]
A friend of mine tells a charming story about his days at IBM's TJ Watson research lab. Every day, as they were leaving the building, the security guard would interrogate them: "got any floppies in there? are you carrying any floppies?"
One day, my friend was walking out with a 300MB hard drive under his arm, and when the guard asked the standard question, he just said "no, sir, no floppies in here."
As for offsite tape storage, in a previous life I was part of a high-availability database project, and someone asked a question that no one had ever considered before: "where exactly is our offsite tape vault?". The answer "downtown San Francisco" was not reassuring.
Oh, and whenever we tested the failover system, the DBAs fought for the privilege of yanking the plug out of the wall to crash the primary server. :-)
-j
Turns out that Erik is not necessarily correct about the lifetime of the drive after you open the case -- a buddy of mine at Apple had one on his desk that way for at least a year and a half. You just don't breathe on it (or turn it off if you can avoid doing so), and you don't let anyone smoke in the room. The rotating platters generate enough wind to keep the dust away, and anything that actually reaches them probably gets swept off by the heads.
Mind you, this was with the drive technology of a decade ago, and things might be different now. OTOH, higher RPM probably means more airflow, so my guess is that it's still true.
One other possibility, btw, is that the power supply in the drive box has a problem. You can test that by taking the drive out of the box and putting it into an IDE Mac that can handle an extra drive (not the Beige, unless you have the special ROM).
Good luck, one way or t'other --
jon
PS: If you do put the drive box into the freezer, put it into a ziploc bag first. You can leave the cable ends out, but you may want to put them into another ziploc while the thing is actually inside the freezer.
Anent my earlier exchange with Claude Muncey, I can only assume that Jo, Yog, Xopher, Graydon, and Brenda Clough haven't been through here lately.
Well, I just got here, and I remember that whole thrash. (I remember commenting that it was a bit like swatting a fly with a laser cannon -- extreme overkill, but quite entertaining.)
I had a quick look at the linked article, and up at the top of the web page there's a banner ad for the thing. Sigh.
Notice that slashdot.org currently (8 November) has a thread on CD-R life also quoting Fred Langa's newsletter (archived on his site) on surprisingly short life early failure and all that.
I had a hard disk crash just before tax season last year. I sent my disk to Drivesavers, and for just a little more than the cost of my entire computer system, they dismantled the drive in a clean room and recovered everything.
Depending on how bad a crash it is, it could be less expensive. It's unlikely to be more expensive...
I mean, huh?
Goalies' Heads? Sounds ominous.
Here.
(Is it just me, or would "Line-Sharing Devices" make a really good band name?)
L.N.Hammer muses:
(Is it just me, or would "Line-Sharing Devices" make a really good band name?)
Well, uh... somehow my mind's dropped straight into the gutter, and I'm thinking of rather a different sort of sharing...
Incompetent comment spam, that can't even manage to post valid links.
Great. If they're so inept that they couldn't post valid links, but they still managed to get past the "Preview" requirement, the Preview dodge must have become a pass-around no-brains-required exploit. No wonder comment spam has been creeping back in.
I hate them. I will kill their spam whenever and wherever I find it. I'll do all I can to help others fight it. If the opportunity arises to do the spammers a bad turn, I'll do it, and I'll publicize the opportunity so that others can join in.