Snapshots and Project Filesystems

Since August 2007, all project filesystems have been hosted on filesystems that support, among other things, point-in-time snapshots. Because these snapshots affect how you use the filesystem, this page explains the implications that come with them.

Note that ALL project filesystems have snapshots enabled by default. If you believe that snapshots on your project filesystem are causing a problem, please email csstaff@cs.princeton.edu and describe the problem so that we can discuss options to mitigate it.


What do snapshots do for me?

A snapshot, as the name implies, is a point-in-time copy of the filesystem as it existed at the moment the snapshot was taken. Snapshots are useful when data is lost or changed accidentally, because they avoid having to go back to offline backups. They also have other uses, such as tracking the changes made by some operation.

You can find the snapshots for your filesystem in the .snapshot directory at any level of your project filesystem (for example: /n/fs/island/.snapshot). Note that the .snapshot directory does not appear in the output of ls, but it is there: you can cd into it by typing its name explicitly. More details on restoring a file using snapshots can be found on the CS Guide Backup page.
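As an illustration of browsing the .snapshot directory, here is a minimal Python sketch. It assumes that each entry under .snapshot is a directory named after one snapshot and that it mirrors the live directory tree; that layout, the function names, and the ".restored" suffix are assumptions made for this example, not something this page specifies. The CS Guide Backup page remains the reference for restoring files.

```python
import shutil
from pathlib import Path


def list_snapshots(fs_root):
    """Return the names of the snapshot directories under <fs_root>/.snapshot, sorted."""
    snap_dir = Path(fs_root) / ".snapshot"
    # .snapshot is not shown when listing fs_root, but it can be opened by name.
    return sorted(p.name for p in snap_dir.iterdir() if p.is_dir())


def restore_file(fs_root, snapshot_name, relative_path, suffix=".restored"):
    """Copy one file out of a snapshot and place it next to the live file.

    The copy gets an extra suffix so that the current file is never overwritten.
    Assumed layout: <fs_root>/.snapshot/<snapshot_name>/<relative_path>.
    """
    fs_root = Path(fs_root)
    source = fs_root / ".snapshot" / snapshot_name / relative_path
    target = fs_root / f"{relative_path}{suffix}"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)  # copy2 also preserves timestamps where possible
    return target
```

For example, list_snapshots("/n/fs/island") would return whatever snapshot names exist there, and one of those names could then be passed to restore_file together with a path relative to /n/fs/island.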

What don't snapshots do for me?

It is important to note that snapshots do NOT protect your data against hardware failure of the underlying disk(s), logical failure due to some yet-unknown bug in the filesystem, or a limited set of admin (that's CS Staff) errors. For these cases, we must rely on caution, redundancy, and offline backups (in roughly that order).

Besides providing snapshots, all project filesystems are stored on enterprise-grade, highly redundant storage. This should make data loss from hardware failure very unlikely.

How do snapshots work?

Without getting into too much detail (very detailed information is available in many other places), our snapshots are provided by a copy-on-write filesystem, meaning that no write operation ever destroys old data until the write of the new data has completed. A side effect of this design is that the filesystem can set a flag that preserves its entire state at that moment, while subsequent changes are tracked in separate blocks.
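The idea can be sketched with a toy model in Python. This is not how the real filesystem is implemented; the class, its methods, and the one-block-per-file simplification are all hypothetical and exist only to show why a snapshot costs nothing when taken and why old data survives later writes.

```python
class ToyCowFilesystem:
    """Toy copy-on-write model: blocks are never modified in place."""

    def __init__(self):
        self.blocks = {}     # block id -> bytes, immutable once written
        self.live = {}       # file name -> block id (the current filesystem)
        self.snapshots = {}  # snapshot name -> frozen copy of the file table
        self._next_id = 0

    def write(self, name, data):
        # The new data goes into a fresh block first...
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = bytes(data)
        # ...and only then does the live table point at it.
        self.live[name] = block_id
        self._free_unreferenced()

    def delete(self, name):
        del self.live[name]
        self._free_unreferenced()

    def snapshot(self, snap_name):
        # A snapshot copies only the table of references, not the data.
        self.snapshots[snap_name] = dict(self.live)

    def expire(self, snap_name):
        del self.snapshots[snap_name]
        self._free_unreferenced()

    def read(self, name, snap_name=None):
        table = self.live if snap_name is None else self.snapshots[snap_name]
        return self.blocks[table[name]]

    def _free_unreferenced(self):
        # A block is reclaimed only when neither the live table
        # nor any snapshot refers to it.
        referenced = set(self.live.values())
        for table in self.snapshots.values():
            referenced.update(table.values())
        for block_id in list(self.blocks):
            if block_id not in referenced:
                del self.blocks[block_id]
```

In this model, overwriting a file after a snapshot leaves two blocks on disk, one reachable from the live table and one reachable only from the snapshot; the older block is freed when the snapshot expires.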

How often are snapshots taken, and how long do they last?

Snapshots are taken every four hours. Older snapshots are deleted on a regular schedule, so that several snapshots remain available from various points in time going back as far as one year.

In some cases, however, CS Staff may remove snapshots before their normal expiration if disk space gets tight or some other issue arises that demands their removal. You should, therefore, never use snapshots as a place to store data that doesn't exist elsewhere.

What do snapshots cost?

As with anything worth having, snapshots involve a trade-off. In this case, the trade-off is disk space: snapshots consume extra space beyond that used by the current copy of the filesystem. At this time, the department absorbs the extra disk space taken up by snapshots for home directories. For project space, the extra disk space is charged to the owner of the project space.

For the curious, here is an example of snapshot space usage. Suppose your new filesystem is 10G in size and you create a 5G file in it; you now have 5G of space left. If a snapshot of the filesystem is then taken, you still have 5G of available space, and the snapshot takes up no extra space. However, if you then delete the 5G file and create a 7G file, the snapshot takes up 5G of space alongside the 7G for the new data, for a total of 12G. Even though you deleted the 5G file, it still exists in the snapshot that was taken. Once all snapshots containing the 5G file expire (see the previous question), that 5G of space is reclaimed. Note that, because snapshot space is absorbed into the available space in the pool (unlike the original implementation of this policy), you can still write another 3G into your filesystem before you reach your 10G quota.
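The arithmetic in this example can be written out as a small Python sketch. The function name and dictionary keys are made up for illustration; the only rule it encodes is the one stated above, namely that space held solely by snapshots adds to the total on disk but does not count against the quota.

```python
def space_report(quota_g, live_g, snapshot_only_g):
    """Summarise space for one filesystem, in G as in the example.

    live_g:          data in the current copy of the filesystem
    snapshot_only_g: data that now exists only in snapshots
    """
    return {
        "live": live_g,
        "held_by_snapshots": snapshot_only_g,
        "total_on_disk": live_g + snapshot_only_g,
        # Only live data counts against the quota in this example.
        "left_under_quota": quota_g - live_g,
    }


# 1. Create a 5G file in a 10G filesystem.
after_create = space_report(quota_g=10, live_g=5, snapshot_only_g=0)
# 2. A snapshot is taken: nothing changes yet.
after_snapshot = space_report(quota_g=10, live_g=5, snapshot_only_g=0)
# 3. Delete the 5G file and create a 7G file.
after_replace = space_report(quota_g=10, live_g=7, snapshot_only_g=5)
# 4. All snapshots containing the 5G file expire.
after_expiry = space_report(quota_g=10, live_g=7, snapshot_only_g=0)

for step in (after_create, after_snapshot, after_replace, after_expiry):
    print(step)
```

Step 3 reports 12G on disk with 3G left under the quota, and step 4 drops back to 7G on disk once the snapshots expire.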

It is worth noting that the amount of space taken up by snapshots can vary greatly depending on how you use a filesystem. Because snapshots take up space only for data that has been changed or deleted, very little space is required for snapshots of filesystems with little change or strictly additive use. For a filesystem in which files are frequently deleted or changed, snapshots will require more space.
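To make the contrast concrete, here is a rough Python estimate comparing an additive workload with a high-churn one. It is a deliberately crude model: it assumes every deleted gigabyte was captured by at least one snapshot and stays held for a fixed number of days after deletion. The function name and the retention_days parameter are hypothetical and do not describe the actual expiration schedule.

```python
def snapshot_overhead(daily_deleted_g, retention_days):
    """Estimate the space (in G) held only by snapshots at the end of a period.

    daily_deleted_g: amount of data deleted or overwritten on each day
    retention_days:  assumed number of days deleted data stays in snapshots
    """
    if retention_days <= 0:
        return 0
    # Only data deleted within the retention window is still held.
    return sum(daily_deleted_g[-retention_days:])


# Strictly additive use: files are added for two weeks, nothing is deleted.
additive = [0] * 14
# High churn: 1G of data is replaced every day for two weeks.
churn = [1] * 14

print(snapshot_overhead(additive, retention_days=14))
print(snapshot_overhead(churn, retention_days=14))
```

Under these assumptions the additive workload holds no extra space in snapshots, while the churning workload holds every gigabyte it replaced during the retention window.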

For that reason, transient data that is impacted by snapshots (generally, data with a lifespan of less than two weeks) may be better placed in Scratch Space, which does not use snapshots and is not backed up, but IS still stored on the same redundant storage as all project space.