reliability
We inadvertently provisioned a few database machines with MyISAM instead of InnoDB, and it has been a nightmare. I strongly advise against using MyISAM -- ever.
With MyISAM, we periodically get incorrect "duplicate key" errors that don't go away until you run the lengthy "repair table" command, which somehow fixes everything.
Quoted: Use MyISAM when: The data isn't too critical ( "unreliable and slow, related to table size, table repair process" )
...
Use InnoDB for tables when: The table will be big (100Mb+ - "For reliability and performance, we use InnoDB for almost everything at Wikipedia - we just can't afford the downtime implied by MyISAM use and check table for 400GB of data when we get a crash." )
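For anyone in the same spot, here is a rough sketch of how you might find the stray MyISAM tables and generate the statements to convert them. The `information_schema.tables` query and `ALTER TABLE ... ENGINE=InnoDB` are standard MySQL; the helper name `innodb_migration_statements` and the sample rows are made up for illustration, and how you actually run the query (which client library, which credentials) is left out on purpose.

```python
# Query to list every table with its storage engine (standard MySQL metadata).
# Run it with whatever MySQL client you already use; this sketch does not connect.
FIND_ENGINES_SQL = (
    "SELECT table_schema, table_name, engine "
    "FROM information_schema.tables "
    "WHERE table_schema NOT IN "
    "('mysql', 'information_schema', 'performance_schema', 'sys')"
)


def innodb_migration_statements(rows):
    """Build ALTER TABLE statements for MyISAM tables.

    rows: iterable of (schema, table, engine) tuples, e.g. the result of
    FIND_ENGINES_SQL. Views report a NULL engine and are skipped.
    (Hypothetical helper, not part of any library.)
    """
    statements = []
    for schema, table, engine in rows:
        if engine is not None and engine.lower() == "myisam":
            statements.append(f"ALTER TABLE `{schema}`.`{table}` ENGINE=InnoDB;")
    return statements


if __name__ == "__main__":
    # Illustrative rows only; real rows would come from the query above.
    sample = [
        ("shop", "orders", "MyISAM"),
        ("shop", "users", "InnoDB"),
        ("shop", "order_summary", None),
    ]
    for stmt in innodb_migration_statements(sample):
        print(stmt)
```

Review the generated statements before running them: converting a large table rebuilds it, so it can take a long time and is best done in a maintenance window.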
Ouch. Gmail down. "Don't worry, be happy".
Quoted: We’re sorry, but your Gmail account is currently experiencing errors. You won’t be able to use your account while these errors last, but don’t worry, your account data and messages are safe. Our engineers are working to resolve this issue.
...
Please try accessing your account again in a few minutes.
Google AdSense is down - they couldn't even manage to serve their AdSense logo on their error page!
http://www.google.com/adsense/images/google_sm.gif
Quoted: The Google AdSense website is temporarily unavailable. Please try back later.
We apologize for any inconvenience.
Is S3 a single point of failure for Web 2.0 companies? One of the three S3 data centers went down for 2 hours on Friday morning. Given that people noticed a complete outage, requests seem NOT to have failed over to the other centers.
Amazon seems serious about responding to this, but it looks like they have a fundamental system problem.
Quoted: Bits is a blog about technology, innovation and society from The New York Times.
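To make "failing over" concrete, here is a minimal client-side sketch: try each data center in order and return the first answer that works. This is only an illustration of the idea, not how S3 or any Amazon client actually routes requests; `fetch_with_failover`, the `fetch` callable, and the endpoint names are all hypothetical.

```python
def fetch_with_failover(endpoints, fetch):
    """Try each endpoint in order and return the first successful result.

    endpoints: ordered list of endpoint identifiers (hypothetical names).
    fetch: callable taking one endpoint; it should raise on failure.
    Raises RuntimeError if every endpoint fails.
    """
    errors = []
    for endpoint in endpoints:
        try:
            return fetch(endpoint)
        except Exception as exc:
            # Remember the failure and move on to the next data center.
            errors.append((endpoint, exc))
    raise RuntimeError(f"all endpoints failed: {errors}")


if __name__ == "__main__":
    def fake_fetch(endpoint):
        # Simulate one data center being down.
        if endpoint == "dc-a":
            raise ConnectionError("dc-a is down")
        return f"ok from {endpoint}"

    print(fetch_with_failover(["dc-a", "dc-b", "dc-c"], fake_fetch))
```

A complete outage, like the one people noticed, is what you would see if nothing played this role on either the client or the server side.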
Shelfari is crashing on me when I try to do simple operations, like viewing a book.
I hadn't heard of the Scaled Composites accident this summer that killed three people. Here's an article comparing the reliability and response of private vs. government safety procedures.
I had a conference call scheduled today and had to resort to *GASP* the land-line!
Google Analytics finally fesses up to the problem. They claim no data will be lost - they are having a problem with reporting, however. At least it makes me feel better to know they are working on it, and are not going to lose data.