07_interactive_deletion_manager_.md - it255ru/duplo GitHub Wiki
In the previous chapters, especially Duplicate Discovery Engine and Identical Directory Finder, duplo has done an incredible job. It's meticulously scanned your system, identified every single duplicate file, and even spotted entire folders that are exact copies of each other. You now have a comprehensive report telling you exactly where the redundant data is.
But finding duplicates is only half the battle! The real challenge comes next: deciding which copies to keep and which to delete. You wouldn't want to delete the only copy of an important file by mistake, would you?
This is where the Interactive Deletion Manager comes into play. It's your personal control panel, designed to give you full confidence and control over the cleanup process.
After duplo identifies a pile of duplicate files or identical directories, you need a safe and intuitive way to make decisions. Simply deleting everything automatically can be risky. You need:
- Clear Presentation: To see all copies of a duplicate group side-by-side.
- Granular Control: To choose exactly which files/directories to keep and which to delete.
- Safety First: A way to simulate the deletion process before making any permanent changes.
Use Case: duplo has reported that you have photo_1.jpg in /home/user/pictures/ and an identical copy old_photo.jpg in /home/user/backup/. You want to delete old_photo.jpg from the backup folder, but you want to be absolutely sure that's what will happen, and you want to choose this specific action, not just a general rule. You also have an identical Project_Files_V1 folder that you want to delete, keeping only one specific copy.
To solve this, duplo needs to:
- Display the duplicate groups clearly.
- Offer interactive choices for each group.
- Provide a "dry-run" mode to show the impact before committing.
Think of the Interactive Deletion Manager as a smart assistant who presents you with all your duplicate items and then asks, "What would you like to do with these?" It empowers you to make informed decisions without fear of accidental data loss.
Here are its core features:
- Easy-to-Understand Display: It lists each group of duplicates (either files or directories) and shows you their full paths.
- Flexible Options: For each group, you can:
  - Skip: Do nothing with this particular group.
  - Keep First/Last: Automatically mark all copies for deletion except for the first or last one listed.
  - Manual Selection: Pick specific files/directories to keep or delete.
  - Apply Rule to All: If you find a common cleanup pattern, you can apply a rule (like "keep the first copy") to all remaining duplicate groups with one command.
- "Dry-Run" Mode: This is duplo's built-in safety net. It allows duplo to go through the entire deletion process and show you exactly which files and directories would be deleted, and how much space would be freed, without actually removing anything from your computer. It's like a rehearsal before the actual performance.
To use the Interactive Deletion Manager, you typically run duplo with the --interactive flag. You can also combine it with --dry-run for a simulation or --auto-first for an automatic approach.
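As a rough illustration of how these flags could be read, here is a minimal argparse sketch. Only the flag names come from this chapter; the parser layout, the positional name directory, and the help strings are assumptions, not duplo's actual main.py.

```python
import argparse

def build_parser():
    """Hypothetical parser; only the flag names are taken from this chapter."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("directory", help="folder to scan (assumed positional name)")
    parser.add_argument("--interactive", action="store_true",
                        help="review each duplicate group by hand")
    parser.add_argument("--auto-first", action="store_true",
                        help="keep the first copy in every group")
    parser.add_argument("--dry-run", action="store_true",
                        help="show what would be deleted without deleting")
    parser.add_argument("--find-identical-dirs", action="store_true",
                        help="also look for identical directories")
    return parser

# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["my_messy_folder", "--auto-first", "--dry-run"])
print(args.auto_first, args.dry_run, args.interactive)
```

argparse turns dashes in flag names into underscores, so --dry-run becomes args.dry_run.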
1. Interactive Mode (--interactive)
python main.py my_messy_folder --interactive --find-identical-dirs
After duplo finishes scanning and finding duplicates, you'll enter the interactive mode.
Example for File Duplicates:
============================================================
ИНТЕРАКТИВНЫЙ РЕЖИМ УПРАВЛЕНИЯ ДУБЛИКАТАМИ
============================================================
[+] ОБРАБОТКА ДУБЛИКАТОВ ФАЙЛОВ
Группа 1 (Хеш: 1a2b3c4d...), Размер: 2.50 MB, Категория: images
[1] /home/user/my_messy_folder/holiday/sunset.jpg
[2] /home/user/my_messy_folder/backup/sunset_copy.jpg
[3] /home/user/my_messy_folder/archive/old_sunset.jpg
[s] Пропустить эту группу
[a] Удалить все копии, кроме первой
[b] Удалить все копии, кроме последней
[m] Выбрать вручную
[A] Применить 'удалить все копии, кроме первой' для всех оставшихся групп
Ваш выбор: a
Добавлено для удаления: 2 файлов
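In this example, for Group 1, the user chose a ("Удалить все копии, кроме первой", that is, "delete all copies except the first"), meaning /home/user/my_messy_folder/holiday/sunset.jpg will be kept, and the other two copies will be marked for deletion. The other menu entries are s (skip this group), b (delete all copies except the last), m (select manually), and A (apply "delete all copies except the first" to all remaining groups).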
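The keep-first and keep-last rules come down to list slicing. The sketch below applies them to the sample group; the helper name mark_for_deletion is made up for illustration and is not part of duplo.

```python
def mark_for_deletion(paths, rule):
    """Return the paths to delete for one group.

    rule: 'a' keeps the first copy, 'b' keeps the last, 's' skips the group.
    """
    if rule == "a":
        return list(paths[1:])     # everything after the first copy
    if rule == "b":
        return list(paths[:-1])    # everything before the last copy
    if rule == "s":
        return []                  # nothing is marked
    raise ValueError(f"unknown rule: {rule!r}")

group = [
    "/home/user/my_messy_folder/holiday/sunset.jpg",
    "/home/user/my_messy_folder/backup/sunset_copy.jpg",
    "/home/user/my_messy_folder/archive/old_sunset.jpg",
]
to_delete = mark_for_deletion(group, "a")
print(len(to_delete))
```

Which copy counts as "first" depends entirely on the order in which the group lists its paths.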
Example for Identical Directories:
[+] ОБРАБОТКА ИДЕНТИЧНЫХ КАТАЛОГОВ
Группа идентичных каталогов #1:
[1] /home/user/my_messy_folder/Photos_Backup (100 файлов, 500.00 MB)
[2] /home/user/archive/OldPhotos (100 файлов, 500.00 MB)
[s] Пропустить эту группу
[a] Удалить все каталоги, кроме первого
[b] Удалить все каталоги, кроме последнего
[m] Выбрать вручную
Ваш выбор: m
Введите номера каталогов, которые нужно сохранить (через пробел): 1
Добавлено для удаления: 1 каталогов
Here, for the identical directories, the user chose m for manual selection. The prompt asks for the numbers of the directories to keep, separated by spaces. The user entered 1, so the first directory is kept and the second one is marked for deletion.
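Manual selection depends on parsing the numbers the user types. The following is a sketch of one way to do that defensively. The helper names and the "delete nothing on invalid input" guard are assumptions added for illustration; the simplified listing later in this chapter does not include such a guard.

```python
def parse_keep_numbers(raw, group_size):
    """Turn '1 3' into zero-based indices {0, 2}; ignore anything invalid."""
    keep = set()
    for token in raw.split():
        # isdecimal() guarantees int() will succeed on the token.
        if token.isdecimal() and 1 <= int(token) <= group_size:
            keep.add(int(token) - 1)
    return keep

def manual_selection(paths, raw):
    """Return the paths to delete, or [] if no valid number was entered."""
    keep = parse_keep_numbers(raw, len(paths))
    if not keep:
        # Assumed safety guard: never delete a whole group on bad input.
        return []
    return [p for i, p in enumerate(paths) if i not in keep]

dirs = [
    "/home/user/my_messy_folder/Photos_Backup",
    "/home/user/archive/OldPhotos",
]
print(manual_selection(dirs, "1"))
```

Without a guard like this, an empty or mistyped answer would leave the "keep" set empty and every copy in the group would be marked for deletion.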
2. Dry-Run Mode (--dry-run)
You can combine --interactive with --dry-run or just use --dry-run with --auto-first. This is a fantastic way to review your choices without any risk.
python main.py my_messy_folder --auto-first --dry-run
After all selections (either manual or automatic) are made, duplo will present a "ПРЕДВАРИТЕЛЬНЫЙ ПРОСМОТР УДАЛЕНИЯ" (PREVIEW OF DELETION):
============================================================
ПРЕДВАРИТЕЛЬНЫЙ ПРОСМОТР УДАЛЕНИЯ
============================================================
Файлы для удаления (5):
- /home/user/my_messy_folder/backup/sunset_copy.jpg
- /home/user/my_messy_folder/archive/old_sunset.jpg
- /home/user/my_messy_folder/videos/party_copy.mp4
- ... и еще 2 файлов
Общий объем файлов для удаления: 30.00 MB
Каталоги для удаления (1):
- /home/user/archive/OldPhotos (100 файлов, 500.00 MB)
Общий объем каталогов для удаления: 500.00 MB
Общий объем, который будет освобожден: 530.00 MB
[ТЕСТ] Будет удален файл: /home/user/my_messy_folder/backup/sunset_copy.jpg (2.50 MB)
[ТЕСТ] Будет удален файл: /home/user/my_messy_folder/archive/old_sunset.jpg (2.50 MB)
...
[ТЕСТ] Будет удален каталог: /home/user/archive/OldPhotos (500.00 MB)
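This output lists what would be deleted and how much space would be freed, without touching any actual files. In the output, "Файлы для удаления" means "files to delete", "Каталоги для удаления" means "directories to delete", and each "[ТЕСТ] Будет удален ..." line reads "[TEST] Will be deleted ...". Reviewing this preview gives you the final confidence to proceed.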
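The totals in the preview are plain sums of byte counts. The sketch below reproduces the arithmetic from the sample output. The code later in this chapter calls format_size but does not show its body, so the version here (binary units, two decimals) is an assumption chosen to match the sample output.

```python
def format_size(num_bytes):
    """Hypothetical formatter: binary units, two decimal places."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB"):
        if size < 1024:
            return f"{size:.2f} {unit}"
        size /= 1024
    return f"{size:.2f} TB"

MB = 1024 * 1024
files_total = 30 * MB     # "Общий объем файлов для удаления"
dirs_total = 500 * MB     # "Общий объем каталогов для удаления"
print(format_size(files_total + dirs_total))
```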
Let's peek behind the curtain to understand how this interactive control panel operates.
- Collect Duplicates: The Interactive Deletion Manager receives the lists of duplicate files (from the Duplicate Discovery Engine) and identical directories (from the Identical Directory Finder).
- Loop Through File Groups: It takes the first group of duplicate files and displays all copies to the user.
- Prompt for Choice: It presents the user with a set of options (keep first, keep last, manual, skip, etc.).
- Process Choice: Based on the user's input, it identifies which files in that group should be deleted. These files are added to a temporary "files to delete" list.
- Repeat for Files: This process repeats for every group of duplicate files until all have been reviewed or an "apply to all" rule is chosen.
- Loop Through Directory Groups: If --find-identical-dirs was used, it then proceeds to do the same for each group of identical directories.
- Prompt for Choice (Dirs): Presents options similar to files.
- Process Choice (Dirs): Identifies directories to delete and adds them to a temporary "directories to delete" list.
- Review (Dry-Run): If --dry-run is active, duplo presents a detailed summary of all files and directories in the "to delete" lists, showing their paths and the total space that would be recovered. It explicitly states that this is a "test run."
- Confirm Deletion: If not in dry-run mode, duplo asks for a final confirmation from the user.
- Execute Deletion: If confirmed, duplo then iterates through the "to delete" lists and actually removes the specified files and directories from the file system.
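The last three steps (review, confirm, execute) can be sketched as a small driver. Everything here is illustrative: run_cleanup, its return values, and the English confirmation prompt are assumptions, and the confirm parameter exists only so the example can run without a keyboard.

```python
import os
import shutil
import tempfile

def run_cleanup(files_to_delete, dirs_to_delete, dry_run, confirm=input):
    """Hypothetical driver: preview in dry-run mode, otherwise confirm and delete."""
    if dry_run:
        # Dry run: report only, never touch the file system.
        return [("would delete", p) for p in files_to_delete + dirs_to_delete]
    if confirm("Delete the selected items? (y/n): ").strip().lower() != "y":
        return []  # cancelled by the user
    done = []
    for path in files_to_delete:
        os.remove(path)
        done.append(("deleted", path))
    for path in dirs_to_delete:
        shutil.rmtree(path)
        done.append(("deleted", path))
    return done

# Demonstration on a throwaway directory.
workdir = tempfile.mkdtemp()
victim = os.path.join(workdir, "copy.txt")
with open(victim, "w") as handle:
    handle.write("duplicate")

preview = run_cleanup([victim], [], dry_run=True)
still_there = os.path.exists(victim)   # the dry run leaves the file alone
result = run_cleanup([victim], [], dry_run=False, confirm=lambda prompt: "y")
gone = not os.path.exists(victim)
shutil.rmtree(workdir)
```

Passing the confirmation function as a parameter keeps the destructive path testable without real user input.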
Here's a simplified diagram of the interactive decision-making and optional dry-run:
sequenceDiagram
participant User
participant DuploApp as "Duplo Application"
participant DeletionMgr as "Interactive Deletion Manager"
participant FileSystem as "File System"
User->>DuploApp: "Run with --interactive [--dry-run]"
DuploApp->>DeletionMgr: "Start interactive selection (duplicates, identical_dirs)"
Note over DeletionMgr: Loops through each duplicate file group
DeletionMgr->>User: Display File Group 1, ask for choice
User->>DeletionMgr: Select "Keep first" (e.g., 'a')
DeletionMgr->>DeletionMgr: Add selected files to 'files_to_delete' list
DeletionMgr->>User: Display File Group 2, ask for choice
User->>DeletionMgr: Select "Skip" (e.g., 's')
DeletionMgr->>DeletionMgr: No files added for this group
Note over DeletionMgr: Loops through each identical directory group
DeletionMgr->>User: Display Dir Group 1, ask for choice
User->>DeletionMgr: Select "Manual: Keep [1]" (e.g., 'm', then '1')
DeletionMgr->>DeletionMgr: Add selected dirs to 'dirs_to_delete' list
alt If --dry-run is active
DeletionMgr->>User: Show "PREVIEW OF DELETION" (lists files/dirs to remove)
DeletionMgr->>DeletionMgr: Simulate deletion actions, print "[ТЕСТ] Будет удален..."
else If not --dry-run
DeletionMgr->>User: Ask for final confirmation (y/n)
User->>DeletionMgr: Confirm 'y'
DeletionMgr->>DeletionMgr: Iterates 'files_to_delete' and 'dirs_to_delete'
DeletionMgr->>FileSystem: Delete file A
FileSystem-->>DeletionMgr: File A removed
DeletionMgr->>FileSystem: Delete dir X
FileSystem-->>DeletionMgr: Dir X removed
end
DuploApp-->>User: Deletion complete or cancelled.
Let's look at the actual Python code from main.py that implements the Interactive Deletion Manager.
1. The interactive_selection Function:
This function handles the main loop for prompting the user for choices for both file and directory duplicates.
# File: main.py (simplified)
def interactive_selection(duplicates, identical_dirs, statistics, auto_select_first=False):
    files_to_delete = []
    dirs_to_delete = []
    # ... (print section headers) ...
    if duplicates:
        print("\n[+] ОБРАБОТКА ДУБЛИКАТОВ ФАЙЛОВ")
        for i, (file_hash, file_paths) in enumerate(duplicates.items(), 1):
            # ... (display group info: hash, size, category) ...
            for j, path in enumerate(file_paths, 1):
                print(f" [{j}] {path}")
            if auto_select_first:
                files_to_delete.extend(file_paths[1:])
                # ... (print auto-selected message) ...
                continue
            # ... (print options: s, a, b, m, A) ...
            choice = input("\nВаш выбор: ").strip()
            if choice == 'A':  # Apply 'keep first' to all remaining (check before lowercasing)
                files_to_delete.extend(file_paths[1:])
                auto_select_first = True  # Set flag to apply automatically
                continue
            choice = choice.lower()
            if choice == 's':  # Skip
                continue
            elif choice == 'a':  # Keep first
                files_to_delete.extend(file_paths[1:])
            elif choice == 'b':  # Keep last
                files_to_delete.extend(file_paths[:-1])
            elif choice == 'm':  # Manual selection
                keep = input("Введите номера файлов, которые нужно сохранить (через пробел): ").split()
                keep_indices = [int(idx) - 1 for idx in keep if idx.isdigit()]
                for idx, path in enumerate(file_paths):
                    if idx not in keep_indices:
                        files_to_delete.append(path)
            # ... (error handling for invalid choice) ...
    # ... (similar logic for identical_dirs) ...
    return files_to_delete, dirs_to_delete
- This function takes the duplicates (file groups) and identical_dirs as input.
- It iterates through each group, displays the options, and takes user input.
- Based on the choice, it adds the paths of files/directories destined for deletion to either the files_to_delete or dirs_to_delete list.
- The auto_select_first flag allows users to choose a default action for all subsequent groups. Because the menu distinguishes a from A, the A option has to be checked before the input is lowercased.
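Since the menu offers both a (this group only) and A (all remaining groups), the raw input has to be compared case-sensitively before any lowercasing. Here is a minimal sketch of that dispatch; the helper name resolve_choice and the action labels are made up for illustration.

```python
def resolve_choice(raw):
    """Map raw menu input to (action, apply_to_all).

    Actions: 'skip', 'keep_first', 'keep_last', 'manual', or None if invalid.
    """
    choice = raw.strip()
    if choice == "A":  # must be tested before lowercasing
        return "keep_first", True
    actions = {"s": "skip", "a": "keep_first", "b": "keep_last", "m": "manual"}
    return actions.get(choice.lower()), False

print(resolve_choice("A"))
print(resolve_choice("a"))
```

Returning None for unknown input lets the caller decide whether to repeat the prompt or skip the group.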
2. The auto_select_first_copy Function:
This is a simpler function for applying a common rule automatically, often used with --auto-first.
# File: main.py (simplified)
def auto_select_first_copy(duplicates, identical_dirs):
    files_to_delete = []
    dirs_to_delete = []
    for file_hash, file_paths in duplicates.items():
        files_to_delete.extend(file_paths[1:])  # Keep first, delete others
    for dir_group in identical_dirs:
        dirs_to_delete.extend(dir_group[1:])  # Keep first, delete others
    return files_to_delete, dirs_to_delete
- This function bypasses user interaction and automatically populates the files_to_delete and dirs_to_delete lists by always keeping the first item and marking all others for deletion.
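A short usage example may help here. The function body repeats the logic of the listing above so the snippet runs on its own. The input shapes (a dict mapping a hash to a list of paths, and a list of directory groups) are inferred from that listing, and the key "hash_1" is a placeholder, not a real hash.

```python
def auto_select_first_copy(duplicates, identical_dirs):
    """Keep the first entry of every group and mark the rest for deletion."""
    files_to_delete = []
    dirs_to_delete = []
    for file_paths in duplicates.values():
        files_to_delete.extend(file_paths[1:])
    for dir_group in identical_dirs:
        dirs_to_delete.extend(dir_group[1:])
    return files_to_delete, dirs_to_delete

duplicates = {
    "hash_1": [  # placeholder key
        "/home/user/my_messy_folder/holiday/sunset.jpg",
        "/home/user/my_messy_folder/backup/sunset_copy.jpg",
        "/home/user/my_messy_folder/archive/old_sunset.jpg",
    ],
}
identical_dirs = [
    ["/home/user/my_messy_folder/Photos_Backup", "/home/user/archive/OldPhotos"],
]
files, dirs = auto_select_first_copy(duplicates, identical_dirs)
print(files)
print(dirs)
```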
3. The preview_deletion Function (Dry-Run Preview):
This function takes the collected lists of files and directories marked for deletion and presents them in a user-friendly summary.
# File: main.py (simplified)
import os

def preview_deletion(files_to_delete, dirs_to_delete, statistics):
    print("\n" + "="*60)
    print("ПРЕДВАРИТЕЛЬНЫЙ ПРОСМОТР УДАЛЕНИЯ")
    print("="*60)
    total_size_saved = 0
    if files_to_delete:
        print(f"\nФайлы для удаления ({len(files_to_delete)}):")
        total_file_size = sum(os.path.getsize(f) for f in files_to_delete)
        total_size_saved += total_file_size
        # ... (print first 10 files and total size) ...
    if dirs_to_delete:
        print(f"\nКаталоги для удаления ({len(dirs_to_delete)}):")
        total_dir_size = 0
        for dir_path in dirs_to_delete:
            if dir_path in statistics['by_directory']:
                dir_stats = statistics['by_directory'][dir_path]
                total_dir_size += dir_stats['size']
                # ... (print dir path and its stats) ...
        total_size_saved += total_dir_size
        # ... (print total dir size) ...
    # format_size is a helper that is not shown in this chapter
    print(f"\nОбщий объем, который будет освобожден: {format_size(total_size_saved)}")
    return True  # Indicates there are items to delete
- It calculates and displays the total number of items and the total amount of space that would be freed.
- It uses os.path.getsize to get accurate sizes for individual files and iterates through directory statistics for folders.
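One detail worth knowing: os.path.getsize raises OSError when a path no longer exists, so a preview that sums sizes directly can fail if a file disappears between scanning and previewing. The following sketch shows a more tolerant variant; total_size is a hypothetical helper, not a function from duplo.

```python
import os
import tempfile

def total_size(paths):
    """Sum file sizes, collecting paths that can no longer be read."""
    total, missing = 0, []
    for path in paths:
        try:
            total += os.path.getsize(path)
        except OSError:
            missing.append(path)
    return total, missing

# Demonstration: one real 5-byte file and one path that does not exist.
with tempfile.TemporaryDirectory() as tmp:
    real = os.path.join(tmp, "a.bin")
    with open(real, "wb") as handle:
        handle.write(b"12345")
    size, missing = total_size([real, os.path.join(tmp, "gone.bin")])

print(size, len(missing))
```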
4. The execute_deletion Function (Actual Deletion or Simulation):
This is the function that actually performs the deletions, or simulates them if dry_run is true.
# File: main.py (simplified)
import os
import shutil  # For deleting directories

def execute_deletion(files_to_delete, dirs_to_delete, dry_run=False):
    print("\n" + "="*60)
    print("ВЫПОЛНЕНИЕ УДАЛЕНИЯ" if not dry_run else "ТЕСТОВЫЙ РЕЖИМ УДАЛЕНИЯ")
    print("="*60)
    freed_space = 0
    for file_path in files_to_delete:
        try:
            file_size = os.path.getsize(file_path)
            if not dry_run:  # If NOT dry-run, actually remove
                os.remove(file_path)
                print(f"Удален файл: {file_path}")
            else:  # If dry-run, just print what WOULD happen
                print(f"[ТЕСТ] Будет удален файл: {file_path}")
            freed_space += file_size
        except Exception as e:
            print(f"Ошибка при удалении файла {file_path}: {e}")
    for dir_path in dirs_to_delete:
        try:
            dir_size = 0  # Calculate size of directory to be deleted
            # ... (code to calculate dir_size) ...
            if not dry_run:  # If NOT dry-run, actually remove
                shutil.rmtree(dir_path)  # Deletes directory and all its contents
                print(f"Удален каталог: {dir_path}")
            else:  # If dry-run, just print what WOULD happen
                print(f"[ТЕСТ] Будет удален каталог: {dir_path}")
            freed_space += dir_size
        except Exception as e:
            print(f"Ошибка при удалении каталога {dir_path}: {e}")
    print(f"\nОсвобождено места: {format_size(freed_space)}")
- The dry_run flag is crucial here. If it's True, the os.remove() and shutil.rmtree() commands (which delete files and directories, respectively) are skipped, and duplo only prints a message indicating what would be deleted.
- If dry_run is False, the actual deletion commands are executed.
- Error handling (try...except) ensures that duplo doesn't crash if it encounters permissions issues or files that no longer exist.
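The listing above elides how the directory size is calculated. One common way to do it is to walk the tree with os.walk and add up the file sizes. The sketch below is an assumption about how such a step could look, not the code duplo actually uses, and directory_size is a made-up name.

```python
import os
import tempfile

def directory_size(dir_path):
    """Total size in bytes of all regular files under dir_path."""
    total = 0
    for root, _dirs, files in os.walk(dir_path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # do not count symlink targets
                total += os.path.getsize(full)
    return total

# Demonstration: 10 bytes at the top level plus 20 bytes in a subfolder.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub"))
    with open(os.path.join(tmp, "a.bin"), "wb") as handle:
        handle.write(b"x" * 10)
    with open(os.path.join(tmp, "sub", "b.bin"), "wb") as handle:
        handle.write(b"y" * 20)
    measured = directory_size(tmp)

print(measured)
```

The size has to be measured before shutil.rmtree runs, because nothing is left to measure afterwards.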
The Interactive Deletion Manager is the final and arguably most critical component of duplo. It gives you, the user, control over the cleanup process. By providing clear information, flexible decision-making options, and the "dry-run" safety feature, duplo lets you reclaim disk space by removing duplicates and redundant directories while greatly reducing the risk of deleting important data by mistake. It transforms duplo from a powerful analysis tool into a complete cleanup solution.
Generated by AI Codebase Knowledge Builder. References: [1]