RIM has released a preliminary statement on the BlackBerry outage (the emphasis is mine):
"RIM has determined that the incident was triggered by the introduction of a new, non-critical system routine that was designed to provide better optimization of the system’s cache. The system routine was expected to be non-impacting with respect to the real-time operation of the BlackBerry infrastructure, but the pre-testing of the system routine proved to be insufficient.
The new system routine produced an unexpected impact and triggered a compounding series of interaction errors between the system’s operational database and cache. After isolating the resulting database problem and unsuccessfully attempting to correct it, RIM began its failover process to a backup system.
Although the backup system and failover process had been repeatedly and successfully tested previously, the failover process did not fully perform to RIM’s expectations in this situation and therefore caused further delay in restoring service and processing the resulting message queue.
RIM apologizes to customers for inconvenience resulting from the service interruption. RIM’s root cause analysis and system enhancement process with respect to this incident is ongoing and RIM has already identified certain aspects of its testing, monitoring and recovery processes that will be enhanced as a result of the incident and in order to prevent recurrence."