Disable bulk delete for files that break XML (#151)
* Add fallback to non-bulk delete if table name breaks XML

* Simplify fallback

* Execute query
kirillgarbar authored May 13, 2024
1 parent 13f7d82 commit 3f5b55f
Showing 2 changed files with 55 additions and 6 deletions.
20 changes: 15 additions & 5 deletions ch_backup/storage/engine/s3/s3_engine.py
```diff
@@ -87,17 +87,27 @@ def delete_files(self, remote_paths: Sequence[str]) -> None:
         """
         Delete multiple files from S3
         """
+
+        def delete_by_one(remote_paths: Sequence[str]) -> None:
+            for remote_path in remote_paths:
+                self.delete_file(remote_path)
+
+        if not self._bulk_delete_enabled:
+            delete_by_one(remote_paths)
+            return
+
         # https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.delete_objects
-        if self._bulk_delete_enabled:
-            objects_to_delete: Sequence = [
+        try:
+            objects_to_delete: list = [
                 {"Key": path.lstrip("/")} for path in remote_paths
             ]
             self._s3_client.delete_objects(
                 Bucket=self._s3_bucket_name, Delete={"Objects": objects_to_delete}
             )
-        else:
-            for remote_path in remote_paths:
-                self.delete_file(remote_path)
+        except ClientError as e:
+            if "MalformedXML" not in repr(e):
+                raise
+            delete_by_one(remote_paths)

     def list_dir(
         self, remote_path: str, recursive: bool = False, absolute: bool = False
```
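The control flow of the change can be sketched in isolation. Below is a minimal, hypothetical model of the pattern — `FakeS3Client` and the local `ClientError` are stand-ins for the boto3 client and `botocore.exceptions.ClientError`, not the project's real classes: a bulk delete is attempted first, a `MalformedXML` failure falls back to per-object deletion, and any other error still propagates.

```python
from typing import List, Sequence


class ClientError(Exception):
    """Stand-in for botocore.exceptions.ClientError."""


class FakeS3Client:
    """Hypothetical client that rejects bulk bodies with XML-invalid key names."""

    def __init__(self) -> None:
        self.deleted: List[str] = []

    def delete_objects(self, keys: Sequence[str]) -> None:
        # XML 1.0 forbids control characters other than tab/LF/CR, so S3
        # answers MalformedXML when such a key appears in a DeleteObjects body.
        if any(ord(c) < 0x20 and c not in "\t\n\r" for k in keys for c in k):
            raise ClientError("MalformedXML")
        self.deleted.extend(keys)

    def delete_object(self, key: str) -> None:
        self.deleted.append(key)


def delete_files(client: FakeS3Client, keys: Sequence[str]) -> None:
    try:
        client.delete_objects(keys)
    except ClientError as e:
        if "MalformedXML" not in repr(e):
            raise  # unrelated errors still propagate
        for key in keys:  # fallback: one request per object
            client.delete_object(key)


client = FakeS3Client()
delete_files(client, ["data/a.bin", "data/tab\x07le.bin", "data/b.bin"])
print(client.deleted)  # all three keys removed via the fallback path
```

Keys without control characters take the single bulk request; the per-object path is only paid when the bulk body is actually rejected.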
41 changes: 40 additions & 1 deletion tests/integration/features/backup_delete.feature
@@ -110,6 +110,45 @@ Feature: Backup & Clean & Restore
| 5 | created | 4 | 4 | shared+links |
| 6 | created | 4 | 0 | shared |

Scenario: Create "unusual_char" backup
When we execute query on clickhouse01
"""
CREATE DATABASE unusual_char;
"""
And we execute query on clickhouse01
"""
CREATE TABLE unusual_char."tab\x07l./-_e" (n UInt64) ENGINE = MergeTree() ORDER BY n;
"""
And we execute query on clickhouse01
"""
INSERT INTO unusual_char."tab\x07l./-_e" VALUES (1234);
"""
When we create clickhouse01 clickhouse backup
Then we got the following backups on clickhouse01
| num | state | data_count | link_count | title |
| 0 | created | 14 | 0 | unusual_char |
| 1 | created | 0 | 0 | schema-only |
| 2 | created | 13 | 0 | data |
| 3 | created | 4 | 9 | links+data |
| 4 | created | 12 | 0 | shared+data |
| 5 | created | 0 | 8 | links |
| 6 | created | 4 | 4 | shared+links |
| 7 | created | 4 | 0 | shared |
And s3 contains 117 objects

Scenario: Attempt to delete "unusual_char" backup succeeds
When we delete clickhouse01 clickhouse backup #0
Then we got the following backups on clickhouse01
| num | state | data_count | link_count | title |
| 0 | created | 0 | 0 | schema-only |
| 1 | created | 13 | 0 | data |
| 2 | created | 4 | 9 | links+data |
| 3 | created | 12 | 0 | shared+data |
| 4 | created | 0 | 8 | links |
| 5 | created | 4 | 4 | shared+links |
| 6 | created | 4 | 0 | shared |
And s3 contains 93 objects

Scenario: Attempt to delete "shared" backup deletes no data
When we delete clickhouse01 clickhouse backup #6
Then we got the following backups on clickhouse01
@@ -153,7 +192,7 @@ Feature: Backup & Clean & Restore
| 2 | created | 12 | 0 | shared+data |
| 3 | partially_deleted | 4 | 0 | shared+links |
| 4 | partially_deleted | 4 | 0 | shared |
And s3 contains 77 objects

Scenario: Attempt to delete "shared + data" backup deletes non-shared data only
When we delete clickhouse01 clickhouse backup #2
Expand Down
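The `tab\x07l./-_e` name exercised by the new scenario is exactly the kind of key that breaks bulk deletion: the `DeleteObjects` request body is XML, and XML 1.0 forbids most control characters, `\x07` (BEL) included. A small standard-library sketch of the failure mode (no S3 involved — the `<Key>` element here just mimics how a key name travels in the request body):

```python
import xml.etree.ElementTree as ET

# Build a <Key> element the way a DeleteObjects body would carry it.
key = ET.Element("Key")
key.text = "tab\x07l./-_e"  # the table name from the scenario above

# ElementTree serializes without validating characters...
body = ET.tostring(key)

# ...but no conforming XML parser accepts the result: \x07 is illegal
# in XML 1.0, which is why S3 rejects the request as MalformedXML.
try:
    ET.fromstring(body)
    well_formed = True
except ET.ParseError:
    well_formed = False

print(well_formed)  # False
```

Since the key cannot be represented in a well-formed request body at all, retrying the bulk call is pointless; falling back to one plain `DELETE` per object (where the key lives in the URL, not in XML) is what makes the delete succeed.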
