DapperHomeUserBackup

Summary

This specification discusses implementing a simple and concise backup solution for non-technical users. The application will ship with Dapper and be ready for use after installation. It will make sure that a user either backs up his data or is aware of the consequences that may arise if he does not.

Rationale

Providing an easy-to-use backup solution that's suitable for non-expert users is important. Expert users should install and/or use a more sophisticated backup system. Part of the rationale is to tell the user exactly what he has to do, rather than leave it for him to think up a backup schema.

Use Cases

  • John is a new Ubuntu user. He has been using his system for a week and has sorted everything out: his favorite theme is set up and the desktop behaves the way he likes. He already has quite a few important email messages and other bits of information stored on his Desktop. As a newcomer to Ubuntu, however, John is not aware that he has to do periodic backups. After he has used his machine for a week, a pop-up dialog appears telling him "It has been a week since you installed your computer. In order to be able to restore it to the current state if data loss occurs, it's recommended that you do a backup. Would you like to do that now?". Upon confirmation, he is asked to insert blank backup media and a backup is carried out.
  • Rob wants to refresh the backup set he had previously created. He opens the backup program and is prompted to insert his old backup media if it's rewriteable (multi-session CD), or blank media (CD-R) if not. Then the backup program scans for changes and additions to Rob's home directory and backs up only the files that have changed.
  • Marilize is a concerned Ubuntu user. Having used her machine for 3 days now, she wishes to back up her data in order to be able to restore it in case something goes wrong. She goes to "System" --> "Administration" --> "Backup Now". She is then instructed to insert CD media for storing the data backup. After she confirms that she has inserted a CD into the drive, all her personal data is backed up to it. When the backup finishes, a pop-up dialog instructs her "Please take out the CD, and label it 'Ubuntu Personal Backup data, dated 10-10-2006, 06:00am'".

Scope

This specification covers only backing up all the home directories on a given machine (i.e. data and settings files). Audiovisual content will be also backed up, unless it exceeds the backup medium's capacity.

These are out of the scope of this specification:

  • Data mirroring.
  • Backing up to a non-local medium (e.g. network, NFS, SFTP).
  • Multi volume backups. (could be considered for the next version of this spec)
  • Backup scheduling algorithm. (possibly for the next version)
  • Encryption. (ditto)
  • Autorun capabilities (ditto).
  • Backing up the list of installed packages. (next version?)

Design

NOTE: All of this solution should be made with localization as a top priority and a ground level consideration, so it would be possible to translate and localize it.

One week after someone first logs in, a notification bubble appears in the corner of their screen, advertising the backup service.

notification.jpg

The main program shall be assisted by a gnome-panel applet that is responsible for delivering backup alerts and allows the user to respond to them by launching the main backup program. The main program should not assume too much about the user's knowledge or understanding of the backup process; instead it gives precise instructions and assumes the user follows them, leaving nothing to chance.

There will be an option to disable the notification and the feature altogether, to cater for large installations in which a sysadmin would like to take care of backing up machines' data by himself and not use the system. This will manifest in a debconf question at install time, such that it's also possible to preseed the question for mass installs:

  • "Would you like to have Ubuntu Personal Backup system enabled?"

attachment: backup3.png

Errata:

  • Change "Remind me to back up every: Month" to "Remind me to back up: Monthly", to allow "Never" as a grammatical option.

  • Move size to underneath menu, so that it can update visibly when options like "Include audio and video files" are changed.

  • Change first drop down to be check marks for each user on the system (if run as root) or just the current user's home directory, if run as them. When displaying multiple users, it should display as [X] Joe Shmoe's home (/home/jshmoe).

As suggested by the use cases above, the main program will follow a very focused and limited backup policy:

  • We are going to back up only home directories.
  • Only users that are allowed to sudo on the system will be allowed to carry out backups.
  • Audiovisual content will be excluded, unless the user can provide a storage medium big enough.

We will suggest that the user use one of his available media devices as the backup target.

Please note, that since this is essentially a user interface problem, most of my design will be around the UI elements and workflow we need to have for integrating such a solution into Ubuntu desktop.

User Interface

  • The user interface shall follow these guidelines:
    • Will be based on a Wizard form.
    • Will have integrated help documents, for each option.
    • Should provide consistent and accurate progress indication for each processing job.
    • Will allow the user to cancel a given operation, at any stage of the workflow.

The first trigger for launching the backup wizard defaults to one week after the system has been installed. The notification is delivered by the panel applet; clicking the applet's notification balloon opens a dialog that lets the user choose among the following options:

(dialog 1)

  +----------------------------------------------------------+
  | [X] Perform backup now. (Recommended)                    |
  | [ ] Remind me later.                                     |
  | [ ] Don't bother me again, I do not care about backups*  |
  |----------------------------------------------------------|
  |* This will leave you in a non recoverable state should   |
  |  your data be lost.                                      |
  |  It is highly recommended that you do continue with the  |
  |  backup.                                                 |
  +----------------------------------------------------------+

If the user chooses not to care about backups, the program exits. If he chooses to perform the backup, the Ubuntu Personal Backup wizard is started to continue the backup process in detail. If he chooses to be reminded later, the panel applet will remind him again after a default interval of 3 days.

In the event of retriggering after a chosen interval has passed, the third option in dialog 1 will be replaced by:

|                                                |
| [ ] Open configuration and maintenance wizard  |
|                                                |

This will enable the user to choose a different notification interval, in case an inconvenient interval was the reason he did not carry out the backup when notified.
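The reminder rules described in this section can be sketched as follows. This is only an illustration, not the applet's actual code: the choice identifiers ("backup_now", "later", "never"), the function names, and the approximation of "monthly" as 30 days are assumptions made for the example. Only the one-week and three-day defaults come from the text above.

```python
from datetime import datetime, timedelta

# Defaults stated in this spec.
FIRST_REMINDER_DELAY = timedelta(days=7)   # one week after installation
REMIND_LATER_DELAY = timedelta(days=3)     # "Remind me later"

# Assumed mapping of the configurable intervals; "monthly" is approximated.
INTERVALS = {
    "daily": timedelta(days=1),
    "weekly": timedelta(days=7),
    "monthly": timedelta(days=30),
    "never": None,
}

def first_reminder(installed_at):
    """Time of the very first notification."""
    return installed_at + FIRST_REMINDER_DELAY

def next_reminder(now, choice, interval="never"):
    """Return when to notify next, or None if no reminder is scheduled.

    `choice` is a hypothetical identifier for the option picked in dialog 1.
    """
    if choice == "later":
        return now + REMIND_LATER_DELAY
    if choice == "never":
        return None
    if choice == "backup_now":
        # After a backup, the configured interval decides the next reminder.
        delay = INTERVALS[interval]
        return None if delay is None else now + delay
    raise ValueError("unknown choice: %r" % (choice,))
```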

Main Workflow

When invoked (either in response to an applet notification or by the user himself), the program will detect available media devices. If no media devices are detected, an appropriate message is shown:

(window 1)

  +-----------------------------------------------------------------------------------------------------------------------+
  | No suitable backup devices were found.                                                                                |
  |                                                                                                                       |
  | I couldn't detect any suitable media devices that can be used for backups.                                            |
  | In order to be able to do backups, please consider installing a rewrite CD drive, or better get an external USB       |
  | hard drive. Do note that although CDRW is relatively cheap, an external hard drive will hold much more data and will  |
  | enable you to backup your media files if you have any.  After installing, rerun Ubuntu Personal Backup.               |
  |                                               [Quit and get some media devices]                                       |
  +-----------------------------------------------------------------------------------------------------------------------+

If at least one backup device was detected, ask the user if he just wants to do a complete dump of his current data, or possibly backup only files that have changed since the last backup:

+--------------------------------------------------------------------+
|   Please indicate if you would like to either:                     |
|                                                                    |
|   [X] Backup all current data to backup medium (erasing previous   |
|       snapshot if existent.)                                       |
|   [ ] Backup only what has changed since the last backup.          |
|                                                                    |
|                      [Continue]                                    |
+--------------------------------------------------------------------+

This will result in the backend program assembling the backup file according to the chosen criterion: gathering only the changes if the second option was selected, or dumping all data into the backup file if the first option was chosen.
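As a rough illustration of the difference between the two options, the selection of files could look like the sketch below. It is an assumption that changes are detected by modification time; the real backend may delegate this to the archiver. The function name is made up for the example. Note that a scan like this does not notice deleted files.

```python
import os

def files_to_back_up(home, last_backup_time=None):
    """Yield the paths under `home` to include in this run.

    With last_backup_time=None every file is yielded (full dump).
    Otherwise only files modified after that Unix timestamp are yielded.
    """
    for dirpath, _dirnames, filenames in os.walk(home):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if last_backup_time is None or mtime > last_backup_time:
                yield path
```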

Moving on, we will suggest that the user use the backup medium that is likely to hold the largest volume of data:

(window 2)

  +--------------------------------------------------------------+
  | Where would you like backup data to be stored?               |
  |                                                              |
  | [X] USB External Hard Drive. (Recommended)                   |
  | [ ] CD rewriteable drive.                                    |
  | [ ] USB Key.                                                 |
  |                                                              |
  +--------------------------------------------------------------+

After the backup medium is chosen, the program will scan the backup data and estimate its size, then check whether the chosen medium is big enough to hold it. If not, the program will scan the backup data again, excluding audiovisual content. If the data without the audiovisual content fits on the medium, the program will alert the user accordingly. In addition, it will offer to list the files that are considered audiovisual content, so he will know which files are not being backed up in this run:

+---------------------------------------------------------------------+
|                                                                     |
| Your backup data is too big to fit on the backup medium. However,   |
| excluding audio visual content, it will fit on the medium.          |
|                                                                     |
| How would you like to proceed ?                                     |
|                                                                     |
| [ ] Remind me later after I've purchased a bigger backup medium.(3  |
|     days from now)                                                  |
| [X] Continue backup process, without backing up audiovisual content.|
| [ ] Cancel. I will think about it and run backup myself.            |
|                                                                     |
+---------------------------------------------------------------------+
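The size check that leads to this dialog can be sketched as below. The extension list is an assumption, since this spec does not define what counts as audiovisual content, and the function names and return values are invented for the example. The spec also does not say what happens when the data is too big even without audiovisual files, so the sketch only reports that case.

```python
import os

# Assumed set of extensions treated as audiovisual content.
AV_EXTENSIONS = {".mp3", ".ogg", ".wav", ".flac",
                 ".avi", ".mpg", ".mpeg", ".mov", ".wmv"}

def is_audiovisual(path):
    """Classify a file by its extension, case-insensitively."""
    return os.path.splitext(path)[1].lower() in AV_EXTENSIONS

def plan_backup(file_sizes, capacity):
    """Decide what fits on the chosen medium.

    file_sizes: dict mapping path -> size in bytes
    capacity:   free bytes on the backup medium
    Returns (decision, excluded_paths), where decision is one of
    "everything", "without_av" or "too_big".
    """
    total = sum(file_sizes.values())
    if total <= capacity:
        return "everything", []
    av_files = sorted(p for p in file_sizes if is_audiovisual(p))
    without_av = total - sum(file_sizes[p] for p in av_files)
    if without_av <= capacity:
        # These are the files the user can ask to have listed.
        return "without_av", av_files
    return "too_big", []
```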

If the user chooses the first option, we will remind him again in 3 days to redo the backup. If he chooses to cancel, we will remind him to do a backup only at the next daily/weekly/monthly interval set in the configuration wizard. Other than that, he will have to run the backup himself if he wants it to happen earlier.

If the user chose "Continue backup...", an instructional window shall open asking him to close currently open programs and save any pending files that he is working on. This helps keep the backed up data consistent. After his confirmation, we start the backup process, presenting a progress window that also allows him to interrupt the process at any time:

(window 3)

  +-----------------------------------------------------------------------------------------------------------------------+
  | Backup in progress. Depending on the media chosen, and the size of data that needs to be backed up this can take      |
  |  from a couple of minutes to several hours. You can also choose to stop the backup process if you wish.               |
  |                                                                                                                       |
  |                                                                                                                       |
  |                  +--------------------------------------------------------------------------+                         |
  |                  |                                                                          |                         |
  |                  |                                                                          |                         |
  |                  |      [DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD=====================]    |                         |
  |                  |                                  66% Done.                               |                         |
  |                  |                                                                          |                         |
  |                  |           Processing file: /home/sivan/devel/hello_world.py              |                         |
  |                  |                                [ Interrupt ]                             |                         |
  |                  +--------------------------------------------------------------------------+                         |
  |                                                                                                                       |
  +-----------------------------------------------------------------------------------------------------------------------+

In the case of a rewriteable medium, the following dialog will pop up:

(window 4)

+---------------------------[Please follow instruction below]--------------------------+
| Backup process has finished successfully. Please take the backup medium and put it   |
| in a safe place. Make sure you can provide this exact same medium when restoring, so |
| it might be good to keep it in a different place than your other media.              |
|                             [OK - I've done that, let's finish]                      |
+--------------------------------------------------------------------------------------+

If the backup was made to a non-rewriteable medium, a numbered label that includes the date and time of the backup will be suggested to the user. In any event, labeling the media is only an additional assistive step offered to the user: there will be sufficient metadata on the backup medium itself for the restoration mechanism to figure out the right thing to do.
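A suggested label in the style of the example from the use cases could be generated as in this sketch. The exact wording and the day-month-year ordering are assumptions, because the example date in the use case (10-10-2006) does not show which order is intended. The function name is made up for the example.

```python
from datetime import datetime

def suggested_label(when):
    """Build a disc label containing the date and time of the backup."""
    hour12 = when.hour % 12 or 12              # 0 -> 12, 13 -> 1, ...
    suffix = "am" if when.hour < 12 else "pm"
    return "Ubuntu Personal Backup data, dated %s, %02d:%02d%s" % (
        when.strftime("%d-%m-%Y"), hour12, when.minute, suffix)
```

For a localized release the label text would of course have to go through the translation machinery like every other user-visible string.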

NOTE: Windows 1 through 4 also represent a restoration workflow, given these changes:

  • Window 1; In case a restoration action was requested but no media devices were detected.
  • Window 2; Heading should be: "Please choose the media you saved your backup data on:"
  • Window 3; Instructional text and header would be: "I am currently restoring your data. Depending on the media chosen, and the size of data that needs to be restored, this can take from several minutes to several hours. You can also choose to stop the restoration process if you wish, but do note that this will leave your data in an inconsistent state and can cause problems afterwards."
  • Window 4; Window content will be: "I have successfully restored your data, please take the CD out of the drive, and put it back with your backup set, so it can be easily found next time you need it."
  • We will have another screen between window 2 and window 3 that allows a user to restore his data into an alternative location. The use case and rationale behind providing such an option is a user who loses one file (and knows the file name). To get that file back, he doesn't want to overwrite the current state of his personal data, but rather cherry-pick whatever he sees fit from the alternative location to which the data was restored.

At this point, if the user hasn't yet set up a backup interval, the program will ask him to do so:

+---------------------------------------------------------------------------------+
| You haven't set a time to be reminded to do periodic backups. Please specify    |
| your preferred interval:                                                        |
|                                                                                 |
| [X] Daily. (provides the best recoverability)                                   |
| [ ] Weekly                                                                      |
| [ ] Monthly                                                                     |
| [ ] Do not remind me; I will use the System / Administration / Backup now       |
|     menu item.                                                                  |
|                                                                                 |
|                             [OK]       [Cancel]                                 |
+---------------------------------------------------------------------------------+

A similar dialog, as well as the target device selection dialog and an additional verification function, will be available from the "System -> Administration -> Backup Configuration and Maintenance" menu item. This option will fire up the backup system's configuration and maintenance utility, allowing the user to do several things:

+-----------------------------------------------------------------------+
|       Welcome to Ubuntu Personal Backup wizard                        |
|                                                                       |
|   Please choose one of the following:                                 |
|                                                                       |
|   [ ] Configure notification interval for doing backups.              |
|   [ ] Choose a media device for backup and restoration procedures.    |
|   [ ] Verify existing backup data.                                    |
|   [ ] Restore my data back to the hard drive (overwrites current data)|
|   [ ] Restore my data to an alternative location. (useful for         |
|       cherry-picking specific lost files and for verifying that your  |
|       backup is restorable)                                           |
|                                                                       |
|                          [Next]     [Quit]                            |
|                                                                       |
+-----------------------------------------------------------------------+

In addition, we will provide shorter paths to reach the 2 most important functionalities of the Backup System, through the familiar GNOME panel "System" Menu:

  • System / Administration / Backup now
  • System / Administration / Restore from a backup

When the backup is finished, a note alert is displayed. Normally these are annoying, but in this case we want to provide extra reassurance that the backup was successful. In the case of backing up to CD or DVD, the alert also displays the suggested label for the disc (or for the last disc, if there was more than one.)

finished.jpg

Implementation

The basic idea here is to provide a GUI wrapper around an already existing command line tool that provides incremental backup functionality. The tool of choice is DAR, currently available from universe. DAR is released under the GPL and already exposes a programmable API that we can use. DAR is also feature-rich, and Ubuntu Personal Backup could later be extended to include more of the functionality it exposes. DAR also seems to be in wide use, and it is expected to receive sufficient bug fixes and upstream maintenance.

  • Why not use rsync, which is cross-platform, mature, secure, well-established, powerful and works locally as well as over the net? As long as DAR does not provide any significant advantages over rsync I see no need to use YAA (yet another archiver) ;-) [SaschaBrossmann]

Considerations:

  • Moving DAR to main:
    • Security review.
    • Inclusion report.
  • Using DAR's native API vs. using the actual command line tool, executed from a PyGTK frontend.
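If the command line route is taken, the frontend would mostly need to build argument lists for dar and run them. The following is a minimal sketch under that assumption. It uses only dar's basic options (-c to create an archive, -x to extract, -R for the root directory, -A for the reference archive of an incremental backup). The function names and the paths in the usage note are hypothetical, and nothing here has been checked against the dar version in universe.

```python
def dar_create_args(archive_base, home_dir, reference_base=None):
    """Arguments for creating a full or incremental archive.

    archive_base:   archive basename (dar appends the slice number and .dar)
    home_dir:       directory tree to back up
    reference_base: basename of the previous archive; if given, only
                    changes relative to it are saved
    """
    args = ["dar", "-c", archive_base, "-R", home_dir]
    if reference_base is not None:
        args += ["-A", reference_base]
    return args

def dar_restore_args(archive_base, target_dir):
    """Arguments for restoring an archive into target_dir.

    Passing a directory other than the home directory gives the
    "restore to an alternative location" behaviour.
    """
    return ["dar", "-x", archive_base, "-R", target_dir]
```

A frontend could pass the resulting list to subprocess.call(), for example dar_create_args("/media/usbdisk/home-full", "/home/john"), and inspect the exit status to drive window 4.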

Package Dependencies:

  • libnotify-bin
  • dar
  • python
  • python2.4-gtk2
  • python2.4-glade2

Meta data to be stored on the backup medium:

  • For incremental backups, an ordering number that the restoration algorithm uses to know in what order to apply the restore data.
  • Timestamp of the performed backup.
  • Flag indicating whether audiovisual content was backed up or not.
  • Hostname from which this backup snapshot was taken.
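One possible shape for this metadata is sketched below. The field names, the JSON encoding and the ISO 8601 timestamp format are assumptions made for illustration; the spec only lists which pieces of information are stored.

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class BackupMetadata:
    sequence: int               # ordering number for incremental backups
    timestamp: str              # when the backup was performed (ISO 8601)
    includes_audiovisual: bool  # False if audiovisual content was skipped
    hostname: str               # machine the snapshot was taken from

def to_json(meta):
    """Serialize the metadata for writing onto the backup medium."""
    return json.dumps(asdict(meta), sort_keys=True)

def from_json(text):
    """Read the metadata back during restoration."""
    return BackupMetadata(**json.loads(text))

def restore_order(snapshots):
    """Order snapshots the way the restoration step should apply them."""
    return sorted(snapshots, key=lambda m: m.sequence)
```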

Community Feedback

ManuLopezIbanez: Mandriva/Mandrake has already a similar tool called drakbackup (and surely it is GPL).

FranciscoColaco: We have a problem with the sets for backup. Among the dot files cluttering the home directory, there are two kinds:

  • configuration: .signature, .bash_profile, as an example
  • application state and cache: .thumbnails/*

We should back up the first, but not the second. Unfortunately, there are dot subdirectories that hold both kinds of files (.gnome2/* comes to mind).

Maybe we should broaden the problem and think that configuration files should reside in their own directory, with symlinks to the home directory. Thus, $HOME/.signature -> $HOME/Configuration/signature. Then, backing up configuration would be surprisingly easy.

This is unfortunately the result of historical errors, but can be dealt with. Unfortunately, the number of applications that use those files (and mix them in subdirs) is huge.

So, in the home directory, there should be a clear separation between Documents, Configuration and Application State, maybe with their own subdirectories. Also, should media files be under $HOME/Documents or should they be set up in a group home page and backed up on their own? This should be, I think, a new wiki page in itself.

NicholasWheeler: I think if the "System Administrator" wants to back up his entire system, so that it would be easily restorable by putting in a set of CDs or DVDs, or by pointing to NFS or a tape drive, it should be possible. The use cases don't cover a full system backup, but a full system backup would be pretty easy to implement with MondoMindi. It might take a little more work, but as far as I know this isn't going to be in dapper main at this stage anyway. MondoMindi also has the advantage that you can restore the backups on entirely different hardware, including a different drive (it can even go from hda->sdb). An installation of about 1-1.5 GB altogether backs up to one CD-ROM.

Code

References

Outstanding issues

Why is backup limited to sudoers?

JohnCBarstow: Non-sudoers should be able to back up their own home directory. Sudoers should have the option to back up all home directories and possibly the configuration files in the /etc directory.

  • -- That's correct, this will be possible, minus the /etc directory - see Scope. (sivan)

Tiny suggestions

Some CD writers are able to burn text onto CDs. Ubuntu could detect whether the CD burner supports this and, if so, ask the user whether he wants a label written on the CD (he might not if he intends to reuse the CD).

  • -- Nice idea. Will be considered for the next version. (sivan)

JohnCBarstow: Perhaps an option to back up/restore over the network would be useful?

  • -- See scope section. (sivan)

VenkatRaghavan: I use sbackup (http://packages.ubuntu.com/dapper/admin/sbackup), which is available in universe. It already has a nice interface with regex support to exclude thumbnails, and IIRC it also performs weekly backups.

  • -- Ian and I already explored and evaluated sbackup. Seeing it as it is, we decided that a much simpler and easier to operate solution should be available, with emphasis on backing up to removable media. We don't think the two overlap much. (sivan)


CategorySpec

DapperHomeUserBackup (last edited 2008-08-06 16:37:51 by localhost)