Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delete old print history and time-lapses #762

Open
lettore opened this issue Feb 17, 2023 · 9 comments
Open

Delete old print history and time-lapses #762

lettore opened this issue Feb 17, 2023 · 9 comments

Comments

@lettore
Copy link

lettore commented Feb 17, 2023

Hello,
Is there a way to delete old time-lapses and print history after x days?
Selecting hundreds of prints in the UI is really a pain, and I want to use a script to remove the old print history automatically.
I already tried to remove the timelapse files but it does not remove the print history and it just leads to a file missing when you try to view or download the file.

@dipuzyrev
Copy link
Contributor

Hey, @lettore! Thank you for the feedback.
We plan to to introduce bulk delete for the print history in web app very soon, so you will be able easily delete selected or all the prints at once. Hope, it will solve your problem.

@kennethjiang
Copy link
Contributor

@lettore Any specific reason why you want to remove items from the history? Please note that once removed, your print history stats won't be accurate any more.

@MallocArray
Copy link
Contributor

I personally don't use Obico for print stats, in fact, I didn't even know that was a thing. For self-hosted, I wipe my Docker host regularly and redeploy from code often and I don't backup the timelapses, so bulk cleaning up the references to videos that don't exist anymore is a win.

@lettore
Copy link
Author

lettore commented Feb 21, 2023

As I print many things everyday my data folder was more than 100GB so it's obvious I need to clean up sometimes the old things I don't need.

@kennethjiang
Copy link
Contributor

It sounds like the best way to do it is to implement a django admin function to clean up timelapse videos.

If you want to get it implemented more quickly and learn something new about django, you can take a stab at it. :) @MallocArray @lettore . An example is https://github.com/TheSpaghettiDetective/obico-server/blob/release/backend/app/management/commands/extract_prints_from_hist.py

@puterboy
Copy link
Contributor

Hey, @lettore! Thank you for the feedback. We plan to to introduce bulk delete for the print history in web app very soon, so you will be able easily delete selected or all the prints at once. Hope, it will solve your problem.

Note that the delete function (whether bulk or individual) and whether in the web or mobile interfaces only deletes the entries from the db.sqlite3 database -- however, the actual jpg, mp4, and json files which can account for tens of gigabytes are NOT deleted.

I wrote some bash-sql hacks that address this (and I posted them to the discord group), but here they are:

Delete jpg/mp4/json files associated with deleted print history elements

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd static_build/media
echo "select video_url from app_print where video_url is NOT NULL and deleted is NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p" | xargs \rm
echo "select tagged_video_url from app_print where tagged_video_url is NOT NULL and deleted is NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p" | xargs \rm
echo "select poster_url from app_print where poster_url is NOT NULL and deleted is NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/tsd-timelapses|tsd-timelapses|p" | xargs \rm
echo "select prediction_json_url from app_print where prediction_json_url is NOT NULL and deleted is NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p" | xargs \rm

for ID in $(echo "select id from app_print where deleted is NOT NULL" | sqlite3 $OBICODB); do
    FILES="$(echo "select image_url from app_printshotfeedback where print_id = \"$ID\" and image_url is NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")"
    if [ -n "$FILES" ]; then
       \rm $FILES
       DIR=$(echo $FILES | sed "s|\(tsd-pics/ff_printshots/[0-9]*/[0-9]*/\)[0-9.]*\.jpg *|\1\n|g" | sort | uniq )
       rmdir --ignore-fail-on-non-empty $DIR
    fi
    FILES="$(echo "select image_url from app_printerevent where print_id = \"$ID\" and image_url is not NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")"
    [ -n "$FILES" ] && \rm $FILES
done;

It should be noted that independent of the above there seem to be some "orphan" files that are left in the media/tsd-pics hierarchy that are not referenced in the db.sqlite3 database. These can be cleaned up using the following:
Remove orphan media files

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd obico-server/backend/static_build/media
comm -2 -3 <(find * -type f | sort) <((
echo "select video_url from app_print where video_url is NOT NULL" | sqlite3 $OBICODB;
echo "select tagged_video_url from app_print where tagged_video_url is NOT NULL" | sqlite3 $OBICODB;
echo "select poster_url from app_print where poster_url is NOT NULL" | sqlite3 $OBICODB;
echo "select prediction_json_url from app_print where prediction_json_url is NOT NULL" | sqlite3 $OBICODB;
echo "select image_url from app_printshotfeedback where image_url is NOT NULL" | sqlite3 $OBICODB;
echo "select image_url from app_printerevent where image_url is not NULL" | sqlite3 $OBICODB;
)| sed -n "s|.*/media/||p" | sort | uniq) | xargs \rm

Alternatively, the following also works:
Remove orphan media files (alternative version)

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd obico-server/backend/static_build/media
comm -2 -3 <(find * -type f | sort) <(echo .dump | sqlite3 $OBICODB | tr ',' '\n' | sed -n "s|.*http:.*/media/\([^',]*\).*|\1|p" | sort | uniq) | xargs \rm

Finally, you can remove empty directories using:
Remove empty media directories

cd obico-server/backend/static_build/media
find * -mindepth 2 -type d -empty -delete

@puterboy
Copy link
Contributor

puterboy commented Aug 28, 2023

Here is some improved code that is FASTER/CLEANER/MORE RELIABLE:

Removing images (jpg), video (mp4), and json files for DELETED print history items
(Note: this method removes resulting empty directories cleanly)

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd static_build/media
DELIDS=$(echo "SELECT id FROM app_print WHERE deleted IS NOT NULL" | sqlite3 $OBICODB | tr '\n', ',')
DELIDS=${DELIDS::-1}
FILES=($(echo "SELECT video_url FROM app_print WHERE video_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT tagged_video_url FROM app_print WHERE tagged_video_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT poster_url FROM app_print WHERE poster_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT prediction_json_url FROM app_print WHERE prediction_json_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT image_url FROM app_printshotfeedback WHERE print_id IN ( ${DELIDS} ) AND image_url IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT image_url FROM app_printerevent WHERE print_id IN ( ${DELIDS} ) AND image_url IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p"))

(IFS=$'\n'; echo "${FILES[*]}") | sort -u | xargs -n 50 \rm -f

DIRS=("${FILES[@]%/*}")
(IFS=$'\n'; echo "${DIRS[*]}") | sort -u | grep -E "/[0-9]+/[0-9]+" | xargs -n 50 \rmdir --ignore-fail-on-non-empty

Removing ORPHANED files in 'media' hierarchy that are not in the db.sqlite3 database
(Note: this method removes resulting empty directories cleanly)

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd obico-server/backend/static_build/media
ORPHANS=($(comm -2 -3 <(find * -type f | sort) <((
echo "SELECT video_url FROM app_print WHERE video_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT tagged_video_url FROM app_print WHERE tagged_video_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT poster_url FROM app_print WHERE poster_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT prediction_json_url FROM app_print WHERE prediction_json_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT image_url FROM app_printshotfeedback WHERE image_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT image_url FROM app_printerevent WHERE image_url IS NOT NULL" | sqlite3 $OBICODB;
)| sed -n "s|.*/media/||p" | sort -u) ))

(IFS=$'\n'; echo "${ORPHANS[*]}") | sort -u | xargs -n 50 \rm
DIRS=("${ORPHANS[@]%/*}")
(IFS=$'\n'; echo "${DIRS[*]}") | sort -u | grep -E "/[0-9]+/[0-9]+" | xargs -n 50 \rmdir --ignore-fail-on-non-empty

Alternatively, use the following to calculate ORPHANS

ORPHANS=($(comm -2 -3 <(find * -type f | sort)<(echo ".dump" | sqlite3 $OBICODB | tr ',' '\n' | sed -n "s|.*http:.*/media/\([^',]*\).*|\1|p" | sort -u) ))

Removing EMPTY directories
(This shouldn't in general be necessary if you use the above 2 routines)

cd obico-server/backend/static_build/media
find * -mindepth 2 -type d -empty | grep -E "/[0-9]+/[0-9]+" | xargs -n 50 \rmdir

@puterboy
Copy link
Contributor

puterboy commented Nov 22, 2023

Here is an updated version now that files are signed with a digest (using DJANGO_SECRET_KEY)

Removing images (jpg), video (mp4), and json files for DELETED print history items
(Note: this method removes resulting empty directories cleanly)

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd static_build/media
DELIDS=$(echo "SELECT id FROM app_print WHERE deleted IS NOT NULL" | sqlite3 $OBICODB | tr '\n', ',')
DELIDS=${DELIDS::-1} #Remove trailing newline element
FILES=($(echo "SELECT video_url FROM app_print WHERE video_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT tagged_video_url FROM app_print WHERE tagged_video_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT poster_url FROM app_print WHERE poster_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT prediction_json_url FROM app_print WHERE prediction_json_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT image_url FROM app_printshotfeedback WHERE print_id IN ( ${DELIDS} ) AND image_url IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT image_url FROM app_printerevent WHERE print_id IN ( ${DELIDS} ) AND image_url IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p"))
FILES=("${FILES[@]%?digest=*}")

(IFS=$'\n'; echo "${FILES[*]}") | sort -u | xargs -n 50 \rm

FDIRS=("${FILES[@]%/*}")
(IFS=$'\n'; echo "${FDIRS[*]}") | sort -u | \grep -E "/[0-9]+/[0-9]+" | xargs -n 50 \rmdir --ignore-fail-on-non-empty

Removing ORPHANED files in 'media' hierarchy that are not in the db.sqlite3 database
(Note: this method removes resulting empty directories cleanly)

cd obico-server/backend
OBICODB="$PWD/db.sqlite3"
cd static_build/media
ORPHANS=($(comm -2 -3 <(find * -type f | sort) <((
echo "SELECT video_url FROM app_print WHERE video_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT tagged_video_url FROM app_print WHERE tagged_video_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT poster_url FROM app_print WHERE poster_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT prediction_json_url FROM app_print WHERE prediction_json_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT image_url FROM app_printshotfeedback WHERE image_url IS NOT NULL" | sqlite3 $OBICODB;
echo "SELECT image_url FROM app_printerevent WHERE image_url IS NOT NULL" | sqlite3 $OBICODB;
)| sed -e "s|.*/media/||" -e "s|\?digest=.*||" | sort -u) ))

(IFS=$'\n'; echo "${ORPHANS[*]}") | sort -u | xargs -n 50 \rm

ODIRS=("${ORPHANS[@]%/*}")
(IFS=$'\n'; echo "${ODIRS[*]}") | sort -u | \grep -E "/[0-9]+/[0-9]+" | xargs -n 50 \rmdir --ignore-fail-on-non-empty

Alternatively, use the following to calculate ORPHANS:

ORPHANS=($(comm -2 -3 <(find * -type f | sort) <(echo ".dump" | sqlite3 $OBICODB | tr ',' '\n' | sed -n "s|.*http:.*/media/([^',]*\).*|\1|p" | sed "s|?digest=.*||" | sort -u) ))

Removing EMPTY directories
(This shouldn't in general be necessary if you use the above 2 routines)

cd obico-server/backend/static_build/media
find * -mindepth 2 -type d -empty | \grep -E "/[0-9]+/[0-9]+" | xargs -n 50 \rmdir

NOTE: Use at your own risk! If you are unsure, you may want to do a test run first by prepending the \rm and \rmdir statements with echo to make sure you are ok with what will be deleted.

@puterboy
Copy link
Contributor

puterboy commented Dec 3, 2023

Here is a single bash script that combines all the above scripts along (with confirmations before deleting files)

################################################################################
# Clean up Obico print history
#    - Delete image files (snapshot, timelapses) corresponding to deleted prints
#    - Delete any resulting empty directories
#
#    - Delete orphan media files not mentioned in db.sqlite3
#    - Delete any resulting empty directories
#
# Usage:
#    jobico_cleanup.sh
#
#    May need to run as *root* if media files not user-writable
#
# Author:
#    Jeff Kosowsky
#    Copyright December 2023
#
################################################################################
OBICODB="db.sqlite3"
BASE=/usr/local/downloads/tarfiles/obico/obico-server
cd $BASE

cd backend
OBICODB="$PWD/$OBICODB"
cd static_build/media

#### Find any remaining stored image files (snapshot, video) corresponding to deleted prints
DELIDS=$(echo "SELECT id FROM app_print WHERE deleted IS NOT NULL" | sqlite3 $OBICODB | tr '\n', ',')
DELIDS=${DELIDS::-1} #Remove trailing newline element

FILES=($(echo "SELECT video_url FROM app_print WHERE video_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT tagged_video_url FROM app_print WHERE tagged_video_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT poster_url FROM app_print WHERE poster_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT prediction_json_url FROM app_print WHERE prediction_json_url IS NOT NULL AND deleted IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT image_url FROM app_printshotfeedback WHERE print_id IN ( ${DELIDS} ) AND image_url IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p")
       $(echo "SELECT image_url FROM app_printerevent WHERE print_id IN ( ${DELIDS} ) AND image_url IS NOT NULL" | sqlite3 $OBICODB | sed -n "s|.*/media/||p"))
FILES=("${FILES[@]%?digest=*}")

## Delete image files corresponding to deleted prints
DELFILES=($(for file in $(IFS=$'\n'; echo "${FILES[*]}" | \sort -u); do [ -e "$file" ] && echo $file; done))
if [ ${#DELFILES[@]} -ne 0 ]; then
    (IFS=$'\n'; echo "${DELFILES[*]}")
    echo -en "---> Delete ALL above *DELETED* image files? [yes/no]"
    read -e ANSWER
    [ "$ANSWER" = "yes" ] || exit 1
    echo "${DELFILES[*]}" | xargs -n 50 \rm
    if [ $? -eq 0 ]; then
	echo "File deletion SUCCEEDED!"
    else
	echo "File deletion FAILED!"
    fi
    echo
fi

## Delete resulting Empty directories
FDIRS=("${FILES[@]%/*}")
DELFDIRS=($(for dir in $(IFS=$'\n'; echo "${FDIRS[*]}" | \sort -u | \grep -E "/[0-9]+/[0-9]+"); do [ -d "$dir" ] && [ -z "$(\ls -A $dir 2>/dev/null)" ] && echo $dir; done))
if [ ${#DELFDIRS[@]} -ne 0 ]; then
    (IFS=$'\n'; echo "${DELFDIRS[*]}")
    echo -en "---> Delete ALL above *EMPTY* directories corresponding to deleted image files? [yes/no]"
    read -e ANSWER
    [ "$ANSWER" = "yes" ] || exit 1    
    echo "${DELFDIRS[*]}" | xargs -n 50 \rmdir
    if [ $? -eq 0 ]; then
	echo "Directory deletion SUCCEEDED!"
    else
	echo "Directory deletion FAILED!"
    fi
    echo
fi

#### Find orphan files that are not mentioned in db.sqlite3
ORPHANS=($(comm -2 -3 \
		<(find * -type f | sort) \
		<((
		     echo "SELECT video_url FROM app_print WHERE video_url IS NOT NULL" | sqlite3 $OBICODB;
		     echo "SELECT tagged_video_url FROM app_print WHERE tagged_video_url IS NOT NULL" | sqlite3 $OBICODB;
		     echo "SELECT poster_url FROM app_print WHERE poster_url IS NOT NULL" | sqlite3 $OBICODB;
		     echo "SELECT prediction_json_url FROM app_print WHERE prediction_json_url IS NOT NULL" | sqlite3 $OBICODB;
		     echo "SELECT image_url FROM app_printshotfeedback WHERE image_url IS NOT NULL" | sqlite3 $OBICODB;
		     echo "SELECT image_url FROM app_printerevent WHERE image_url IS NOT NULL" | sqlite3 $OBICODB;
		 )| sed -e "s|.*/media/||" -e "s|\?digest=.*||" | sort -u) ))

## Delete orphan files
DELORPHANS=($(for file in $(IFS=$'\n'; echo "${ORPHANS[*]}" | sort -u); do [ -e "$file" ] && echo $file; done))
if [ ${#DELORPHANS[@]} -ne 0 ]; then
    (IFS=$'\n'; echo "${DELORPHANS[*]}")
    echo -en "---> Delete ALL above *ORPHANED* image files? [yes/no]"
    read -e ANSWER
    [ "$ANSWER" = "yes" ] || exit 1
    echo "${DELORPHANS[*]}" | xargs -n 50 \rm
    if [ $? -eq 0 ]; then
	echo "File deletion SUCCEEDED!"
    else
	echo "File deletion FAILED!"
    fi
    echo
fi

## Delete resulting Empty directories
ODIRS=("${ORPHANS[@]%/*}")
DELODIRS=($(for dir in $(IFS=$'\n'; echo "${ODIRS[*]}" | \sort -u | \grep -E "/[0-9]+/[0-9]+"); do [ -d "$dir" ] && [ -z "$(\ls -A $dir 2>/dev/null)" ] && echo $dir; done))
if [ ${#DELODIRS[@]} -ne 0 ]; then
    (IFS=$'\n'; echo "${DELODIRS[*]}")
    echo -en "---> Delete ALL above *EMPTY* directories corresponding to orphan files? [yes/no]"
    read -e ANSWER
    [ "$ANSWER" = "yes" ] || exit 1    
    echo "${DELODIRS[*]}" | xargs -n 50 \rmdir
    if [ $? -eq 0 ]; then
	echo "Directory deletion SUCCEEDED!"
    else
	echo "Directory deletion FAILED!"
    fi        
fi
echo "COMPLETED!"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants