Online vs. Offline Backup/Restore of DS Using dsbackup

tusharchoudhury · September 18, 2024, 4:09pm

Hi Team,

In the dsbackup command for Directory Server, we have two approaches for both backup and restore: an online approach (when the Directory Server is running) and an offline approach (when the Directory Server is not running).

I would like to understand which approach is better for creating a backup and restoring from it, given that the dataset contains approximately 25 million users.

Is offline restore right for this case, or should we use the online approach? If offline is preferable, what are the specific advantages and disadvantages of each? In our environment, bringing the instance down while restoring is not a concern. We have only been doing offline restores so far, but would like to know which approach is the most appropriate and what its advantages are. Any input on this would be helpful.

Thanks.

mwtech · September 18, 2024, 8:55pm

Hi @tusharchoudhury

One point of clarification I would like to make: offline does not necessarily mean that the server is not running. Offline simply indicates that the process runs separately from the DS process, and it can be run even if the server is not running (although the server must be stopped for an offline restore). This distinction is important, as running offline backups while the server is running can cause issues if you are backing up an LDIF backend and a change is made during the backup.

Which approach is better? I think it really depends on your needs. An offline job has to be run directly on the server itself, whereas an online job can target a remote server. Offline jobs have the benefit of not requiring a privileged DS user to run them, but they do require that the user executing the task has the appropriate permissions on the file systems. You can use the scheduler functionality for either online or offline jobs and get the benefit of notifications and task sequencing; however, DS must be running for the scheduled job to execute.
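To make that difference concrete, here is a minimal Python sketch that builds the argument lists for an offline and an online `dsbackup create` call. It is illustrative only: the helper `dsbackup_create_args` is hypothetical, and the backend name, backup location, host, port, and bind DN are placeholder assumptions. The option names should be checked against `dsbackup create --help` for your DS version, and the password and TLS trust store options that an online call typically needs are left out.

```python
from typing import List, Optional


def dsbackup_create_args(
    backend: str,
    backup_location: str,
    offline: bool,
    hostname: Optional[str] = None,
    admin_port: Optional[int] = None,
    bind_dn: Optional[str] = None,
) -> List[str]:
    """Build an argument list for `dsbackup create` (hypothetical helper)."""
    args = [
        "dsbackup", "create",
        "--backendName", backend,
        "--backupLocation", backup_location,
    ]
    if offline:
        # Offline: runs as a separate process on the server host itself,
        # so no connection or bind options are added.
        args.append("--offline")
        return args
    # Online: sent to a running (possibly remote) server, so connection
    # details and a privileged bind DN are required.
    if hostname is None or admin_port is None or bind_dn is None:
        raise ValueError("online mode needs hostname, admin_port and bind_dn")
    args += ["--hostname", hostname, "--port", str(admin_port), "--bindDN", bind_dn]
    # Password and TLS trust store options are deliberately omitted here.
    return args


# Placeholder values, not taken from any real deployment.
offline_args = dsbackup_create_args("userData", "/path/to/bak", offline=True)
online_args = dsbackup_create_args(
    "userData", "/path/to/bak", offline=False,
    hostname="ds.example.com", admin_port=4444, bind_dn="uid=admin",
)
print(" ".join(offline_args))
print(" ".join(online_args))
```

The offline list carries no host or credentials, which reflects the trade-off described above: file system permissions on the server host instead of a privileged DS account.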

I know this doesn’t directly answer your question, but hopefully it provides you with some additional insight into the overall process.

tusharchoudhury · September 19, 2024, 12:03pm

Thanks, @mwtech, for the note.

Yes, the --offline option can be used to take a backup while the server is running, whereas the server must be stopped for a restore.

Given that we have access to the server and can run commands with the right privileges, is it fair to say that offline should be preferred over online?

I wanted to know the impact on server performance of carrying out an online restore, and whether the restore will complete without affecting data or processes. I have not been able to find any documentation stating that --offline has less impact on server performance than online, nor any documented observation of online restores being affected by large data sets.

The reason for asking is that we are preparing an SOP for our immediate backup and restore process, and the questions “why offline?” and “why not online?” came up, so I wanted to gather as much evidence as I can. Any statistics on the restore process would be helpful.

mwtech · September 19, 2024, 2:29pm

Hi @tusharchoudhury

I’d strongly recommend running benchmarks in your non-production environments to get the specific details. A number of factors can come into play, including OS, resource allocations, total data set size, disk IO, DS usage patterns, etc., and while rough statistics may be out there, they are heavily dependent on the environment in which they were run. While others may have hard statistics available for their particular setup, those may not reflect what you’d observe in your own environment.
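As a starting point for such a benchmark, the following Python sketch times repeated runs of an arbitrary command and reports the median wall-clock duration. The function name `time_command` and the placeholder command are assumptions made for illustration; in practice you would substitute your own backup or restore invocation and run it only in non-production. It measures elapsed time only, so server-side effects such as CPU load, disk IO, or LDAP response times would need separate monitoring.

```python
import statistics
import subprocess
import sys
import time
from typing import List


def time_command(args: List[str], runs: int = 3) -> List[float]:
    """Run a command `runs` times and return wall-clock seconds per run."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(args, check=True)  # raises if the command fails
        durations.append(time.perf_counter() - start)
    return durations


# Placeholder command so the sketch runs anywhere; swap in your own
# backup or restore invocation when benchmarking.
placeholder = [sys.executable, "-c", "pass"]
timings = time_command(placeholder, runs=3)
print(f"median {statistics.median(timings):.3f}s over {len(timings)} runs")
```

Repeating each run several times and comparing medians for the online and offline variants on the same hardware and data set gives numbers that reflect your own environment.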

Sorry I can’t give you a more definitive answer, but in my experience there are too many variables involved to offer up any more specific guidance.

tusharchoudhury · September 25, 2024, 1:06pm

Thanks @mwtech for the note, this helps.