Should I back up my Office 365 email, or is Microsoft's cloud reliable?
To answer questions about data availability, Microsoft published an article on their Office 365 blog titled "Cloud services you can trust: Office 365 availability". In the article, Microsoft details their uptime, the infrastructure they architected with reliability in mind, and the processes that back up and protect your email, calendar, and other data in Office 365. Here are some of the highlights:
Redundancy. Redundancy at every layer–physical, data and functional:
- We build physical redundancy at the disk/card level within servers, the server level within a datacenter and the service level across geographically separate data centers to protect against failures. Each data center has facilities and power redundancy. We have multiple datacenters serving every region.
- To build redundancy at the data level, we constantly replicate data across geographically separate datacenters. Our design goal is to maintain multiple copies of data whether in transit or at rest and failover capabilities to enable rapid recovery.
- In addition to the physical and data redundancy, as one of our core strengths we build Office clients to provide functional redundancy to enable you to be productive using offline functionality when there is no network connectivity.
Resiliency. Active load balancing and constant recovery testing across failure domains:
- We actively balance load to provide end users the best possible experiences in an automated manner. These mechanisms also dynamically prioritize, performing low priority tasks during low activity periods and deferring them during high load.
- We have both automated and manual failover to healthy resources during hardware or software failures and monitoring alerts.
- We routinely perform recovery across failure domains to ensure readiness for circumstances that require failovers.
Distributed Services. Functionally distributed component services:
- The component services in Office 365 like Exchange, SharePoint, Lync and Office Web Apps are functionally distributed, ensuring that the scope and impact of a failure in one area is limited to that area alone and does not impact others.
- We replicate directory data across these component services so that if one service is experiencing an issue, users are able to login and use other services seamlessly.
- Our operations and deployment teams benefit from the distributed nature of our service, simplifying all aspects of maintenance and deployment, diagnostics, repair and recovery.
Monitoring. Extensive monitoring, recovery and diagnostic tools:
- Our internal monitoring systems continuously monitor the service for any failure and are built to drive automated recovery of the service.
- Our systems analyze any deviations in service behavior to alert on-call engineers to take proactive measures.
- We also have Outside-In monitoring constantly executing from multiple locations around the world both from trusted third party services (for independent SLA verification) and our own worldwide datacenters to raise alerts.
- For diagnostics, we have extensive logging, auditing, and tracing. Granular tracing and monitoring helps us isolate issues to root cause.
Simplification. Reduced complexity drives predictability:
- We use standardized components wherever possible. This leads to fewer deployment and issue isolation complexities as well as predictable failures and recovery.
- We use standardized process wherever possible. The focus is not only on automation but making sure that critical processes are repeated and repeatable.
- We have architected the software components to be loosely coupled so that their deployment and ongoing health don’t require complex orchestration.
- Our change management goes through progressive, staged, instrumented rings of scope and validation before being deployed worldwide.
Human back-up. 24/7 on-call support:
- While we have automated recovery actions where possible, we also have a team of on-call professionals standing by 24×7 to support you. This team includes support engineers, product developers, program managers, product managers and senior leadership.
- With an entire team on call, we have the ability to provide rapid response and information collection towards problem resolution.
- Our on-call professionals, while providing back-up, also improve the automated systems every time they are called to help.
Microsoft also discusses continuous learning, since they use their own services, and transparent communication with customers through the Service Health Dashboard. Your local mail administrators have access to this dashboard, so we can stay aware of any ongoing issues.
Accidentally Deleted Emails
But what happens if users accidentally delete data from their mailboxes? Exchange Online provides a number of features that can help you restore that data:
- Deleted Item Recovery. Users can restore items that have been deleted from any email folder. Here’s how the process works: When an item is deleted, it is kept in the user's Deleted Items folder. Items remain in this folder until manually removed by the user or automatically removed by retention policies. (The default retention policy removes items from the Deleted Items folder after 30 days, but organizations can customize this setting.) After an item has been removed from the Deleted Items folder, it is kept in a Recoverable Items folder, where it can be restored by an administrator, for an additional 14 days before being permanently removed. (Please note that if an administrator has placed a user’s mailbox on legal hold, purged items are retained indefinitely and the 14-day window does not apply.)
- Deleted Mailbox Recovery. When an Exchange Online mailbox is deleted, its contents are recoverable for 30 days using the Exchange Control Panel. The mailbox contains all of the data stored in it at the time it was deleted. After 30 days, it is not recoverable.
- Single Item Recovery (Rolling Legal Hold). Sometimes an organization wants to preserve users’ mailbox contents for archiving and eDiscovery purposes, but only for a specific amount of time. The Single Item Recovery feature is a great way to accomplish this. It uses the same mechanism that legal hold uses to permanently preserve original copies of items that have been modified or deleted, but provides “rolling legal hold” capabilities that allow you to define a set period of time. By default, Single Item Recovery is enabled on all mailboxes in Exchange Online with a 14-day retention period, to enable recovery of deleted items. However, administrators can extend the Single Item Recovery retention period to any length of time by contacting the Office 365 help desk. Please note that if the desired period is longer than 30 days, the mailbox must have an Exchange Online (Plan 2) subscription.
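To make the default recovery timeline for a deleted item concrete, here is a minimal Python sketch of the date arithmetic. It is only an illustration, not anything Exchange Online itself runs: the names `recovery_window` and `where_is_item` are hypothetical, and it assumes the defaults described above (30 days in Deleted Items, then 14 days in Recoverable Items), that the user never empties the Deleted Items folder manually, and that each window ends exactly on its last day.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Defaults taken from the text above; organizations can customize them.
DELETED_ITEMS_DAYS = 30      # default retention policy for the Deleted Items folder
RECOVERABLE_ITEMS_DAYS = 14  # additional window in the Recoverable Items folder


@dataclass
class RecoveryWindow:
    user_recoverable_until: date             # last day the item is in Deleted Items
    admin_recoverable_until: Optional[date]  # None = retained indefinitely (legal hold)


def recovery_window(deleted_on: date, legal_hold: bool = False) -> RecoveryWindow:
    """Hypothetical helper: compute both recovery deadlines for one deleted item."""
    user_until = deleted_on + timedelta(days=DELETED_ITEMS_DAYS)
    if legal_hold:
        # On legal hold, purged items are kept and the 14-day window does not apply.
        return RecoveryWindow(user_until, None)
    admin_until = user_until + timedelta(days=RECOVERABLE_ITEMS_DAYS)
    return RecoveryWindow(user_until, admin_until)


def where_is_item(deleted_on: date, today: date, legal_hold: bool = False) -> str:
    """Hypothetical helper: report where the item can be found on a given day."""
    window = recovery_window(deleted_on, legal_hold)
    if today <= window.user_recoverable_until:
        return "Deleted Items"
    if window.admin_recoverable_until is None or today <= window.admin_recoverable_until:
        return "Recoverable Items"
    return "permanently removed"
```

Under these assumptions, an item deleted on 1 January 2024 could be restored by the user through 31 January and by an administrator through 14 February. If your organization has customized its retention policies, the two constants would need to change to match.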