Recursive folder backup python
12/15/2023

PyBackup is a recursive/incremental backup utility package written purely in Python. The main feature of the package is its incremental backup, which copies only the files that were added or the parts of a file that changed.

I have accidentally deleted part of a folder (before stopping the rm command). However, the backup I restored was around 2 weeks old, and unfortunately I had renamed and restructured the directories between the point in time of the backup and the deletion. I have manually restored what I know was missing, but I'm not sure I managed to catch everything.

Is there a fast method of showing file differences that does not include their parent directories, only file name and modification or creation date? For example, I have the file data/output/test1/file1.mha, which I might have moved/renamed to data/results/mhas/first_test/file1.mha. Using diff -rq did not work for this and is also rather slow. The directory has a size of around 2TB and a fairly large number of files, so checking the MD5 of every file is barely an option.

To clarify a little bit, after restoring the backup, I have:

    /data_backup_restore/output/test1/file1.mha

and

    /data/results/mhas/first_test/file1.mha

since the restored backup still uses the 'old' directory structure. I've changed it because it was a mess, but I haven't written down all changes/renames I've done, since there were a lot of them. I would consider both of the above the same if filesize, modification date and filename match.

If I understand correctly you want to compare the two directories recursively, but ignoring the directory structure. So basically, if you find two files in the two trees having the same filename, creation/modification time and size (you don't mention size, but I guess it will also be useful), then treat them as the same, even if they are at different positions in the two directory trees.

If this is correct, you can create a list of files with size, time and filename like this:

    ls -lR --time-style=long-iso /data/output/ | grep ^- | tr -s ' ' | cut -d' ' -f5- | sort -k 4 >files_output.txt
    ls -lR --time-style=long-iso /data/results/ | grep ^- | tr -s ' ' | cut -d' ' -f5- | sort -k 4 >files_results.txt

And then compare the two lists, either with diff or some GUI like meld.

- Using --time-style=long-iso to avoid locale-specific peculiarities that might break the following pipes.
- grep ^- to only select the actual files, ignoring directories and possibly other special files. Depending on your use case, you might want to filter further here.
- tr -s ' ' will squeeze multiple consecutive spaces for the following cut to work correctly in all cases.
- cut columns beginning from column 5 (the file size).
- sort to make the comparison later work. -k 4 is not really necessary, as long as you are consistent in the two commands. -k 4 will sort by filename, which may be useful.

After you compare the two files and find differences, you will of course have to locate the file in the original directory tree; you can use find for this.

Based on your comments, if you want to find the full paths for filenames that appear many times, you can do the following.

First get the list of files that are missing in your second directory, i.e. the lines that appear only in the first list, e.g. like this:

    comm -2 -3 files_output.txt files_results.txt >missing_files.txt

Then, for each missing file use find to find the full path of the specific file:

    cat missing_files.txt | while read size date time name
    do
        find . -name "$name" -size "${size}c" -newermt "$date $time" ! -newermt "$date $time +0000 +1 minutes"
    done

Now notice that this is just a simple example and not optimal: it calls find once per missing file, which can be slow if the directories are as big as you indicated. In that case you should try to optimize it somehow (e.g. make a list of all files similar to the ls -lR output but containing the full paths, and try to match that list with the list you found in the missing_files.txt file).
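Since the post is nominally about Python, here is a minimal sketch of the same idea in that language: walk each tree once, key every regular file by (filename, size, modification time truncated to the minute), and report the full paths of files in the restored backup that have no counterpart in the current tree. The function names index_tree and find_missing are made up for this illustration, the minute truncation is an assumption chosen to mirror the precision of the ls long-iso output, and the two paths at the bottom are simply the example directories from the question.

```python
import os
import stat
from collections import defaultdict


def index_tree(root):
    """Map (filename, size, mtime in whole minutes) -> list of full paths."""
    index = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # ignore symlinks and other special files
            # Assumption: minute precision is enough, as with ls long-iso.
            key = (name, st.st_size, int(st.st_mtime) // 60)
            index[key].append(path)
    return index


def find_missing(backup_root, current_root):
    """Return sorted full paths under backup_root with no match under current_root."""
    backup = index_tree(backup_root)
    current = index_tree(current_root)
    missing = []
    for key, paths in backup.items():
        if key not in current:
            missing.extend(paths)
    return sorted(missing)


if __name__ == "__main__":
    # Example paths taken from the question; adjust to your own layout.
    for path in find_missing("/data_backup_restore/output", "/data/results"):
        print(path)
```

Each tree is traversed only once and the matching is a dictionary lookup, so the cost does not grow with the number of missing files the way repeated find calls do. It only reads file metadata, never file contents, so it stays far cheaper than hashing 2TB of data.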