remove unnecessary database queries #1324

Merged (3 commits) on Feb 14, 2024
Changes from 1 commit
1 change: 1 addition & 0 deletions pyiron_base/database/jobtable.py
@@ -77,6 +77,7 @@ def get_job_ids(database, sql_query, user, project_path, recursive=True):
             user=user,
             project_path=project_path,
             recursive=recursive,
+            columns=["id"]
         )["id"]
     else:
         return database.get_job_ids(project=project_path, recursive=recursive)
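
The added `columns=["id"]` argument asks the database for the only column that the trailing `["id"]` lookup actually uses. The sketch below illustrates that idea with the standard-library `sqlite3` module. The `jobs` table, its schema, and the `job_table` helper are hypothetical stand-ins for illustration and are not pyiron_base's implementation.

```python
import sqlite3

# Hypothetical schema, used only for this illustration.
ALLOWED_COLUMNS = ("id", "job", "project", "status")


def make_demo_db():
    """Create an in-memory database with three example jobs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs "
        "(id INTEGER PRIMARY KEY, job TEXT, project TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO jobs (job, project, status) VALUES (?, ?, ?)",
        [
            ("a", "demo/", "finished"),
            ("b", "demo/sub/", "finished"),
            ("c", "other/", "running"),
        ],
    )
    return conn


def job_table(conn, project_path, columns=None):
    """Return matching rows as dicts, selecting only the requested columns."""
    cols = list(columns) if columns else list(ALLOWED_COLUMNS)
    unknown = set(cols) - set(ALLOWED_COLUMNS)
    if unknown:
        # Column names cannot be bound as parameters, so validate them first.
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    query = (
        f"SELECT {', '.join(cols)} FROM jobs "
        "WHERE project LIKE ? ORDER BY id"
    )
    rows = conn.execute(query, (project_path + "%",)).fetchall()
    return [dict(zip(cols, row)) for row in rows]


def get_job_ids(conn, project_path):
    """Fetch only the id column instead of every column of every row."""
    return [row["id"] for row in job_table(conn, project_path, columns=["id"])]
```

Calling `get_job_ids(make_demo_db(), "demo/")` transfers one value per matching row, whereas `job_table(...)` without `columns` would fetch all four columns only to discard three of them.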
68 changes: 32 additions & 36 deletions pyiron_base/project/generic.py
@@ -1185,27 +1185,27 @@ def remove_job(self, job_specifier, _unprotect=False):
         if isinstance(job_specifier, (list, np.ndarray)):
             for job_id in job_specifier:
                 self.remove_job(job_specifier=job_id, _unprotect=_unprotect)
-        else:
-            if not self.db.view_mode:
-                try:
-                    job = self.inspect(job_specifier=job_specifier)
-                    if job is None:
-                        state.logger.warning(
-                            "Job '%s' does not exist and could not be removed",
-                            str(job_specifier),
-                        )
-                    elif _unprotect:
-                        job.remove_child()
-                    else:
-                        job.remove()
-                except IOError as _:
-                    state.logger.debug(
-                        "hdf file does not exist. Removal from database will be attempted."
-                    )
-                    job_id = self.get_job_id(job_specifier)
-                    self.db.delete_item(job_id)
+            return
+        if self.db.view_mode:
+            raise EnvironmentError("copy_to: is not available in Viewermode !")
+        job = self.inspect(job_specifier=job_specifier)
+        if job is None:
+            state.logger.warning(
+                "Job '%s' does not exist and could not be removed",
+                str(job_specifier),
+            )
+            return
+        try:
+            if _unprotect:
+                job.remove_child()
             else:
-                raise EnvironmentError("copy_to: is not available in Viewermode !")
+                job.remove()
+        except IOError as _:
+            state.logger.debug(
+                "hdf file does not exist. Removal from database will be attempted."
+            )
+            self.db.delete_item(job.id)
Contributor Author commented:

@jan-janssen You missed this line. It might not be called often, but it is another redundant database query.

Member commented:

But this is dangerous. For example, if you create a job object and then remove the HDF5 file, loading the job object with inspect() fails and raises an IOError. In the except branch the job object is then unavailable, so we cannot use job.id.

Member commented:

OK, I realised that the inspect() function also works when the file is corrupted (#1327).
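
The thread above turns on whether `job` is already bound when the `except IOError` branch runs. The following is a minimal sketch of both shapes of `remove_job`, assuming (as the last comment suggests) that `inspect()` builds the job from database metadata and does not need a readable HDF5 file. `FakeDB`, `FakeJob`, `lookup`, and the module-level `inspect` are hypothetical stand-ins, not pyiron_base classes or functions.

```python
class FakeJob:
    """Hypothetical job object: knows its id and whether its HDF5 file exists."""

    def __init__(self, job_id, hdf_exists):
        self.id = job_id
        self._hdf_exists = hdf_exists

    def remove(self):
        if not self._hdf_exists:
            raise IOError("hdf file does not exist")


class FakeDB:
    """Hypothetical database that counts read queries."""

    def __init__(self, hdf_exists_by_id):
        self.jobs = dict(hdf_exists_by_id)
        self.queries = 0

    def lookup(self, job_specifier):
        self.queries += 1
        return job_specifier if job_specifier in self.jobs else None

    def delete_item(self, job_id):
        self.jobs.pop(job_id, None)


def inspect(db, job_specifier):
    """Assumed behaviour: uses database metadata only, never opens the file."""
    job_id = db.lookup(job_specifier)
    if job_id is None:
        return None
    return FakeJob(job_id, db.jobs[job_id])


def remove_job_old(db, job_specifier):
    # inspect() sits inside the try, so the except branch cannot rely on
    # `job` being bound and resolves the id with a second query.
    try:
        job = inspect(db, job_specifier)
        if job is None:
            return
        job.remove()
    except IOError:
        job_id = db.lookup(job_specifier)
        db.delete_item(job_id)


def remove_job_new(db, job_specifier):
    # inspect() runs before the try, so `job` is always bound in the
    # except branch and job.id can be reused without another query.
    job = inspect(db, job_specifier)
    if job is None:
        return
    try:
        job.remove()
    except IOError:
        db.delete_item(job.id)
```

For a job whose HDF5 file is missing, the old shape issues two read queries and the new shape one. If `inspect()` itself could raise `IOError`, the new shape would let that exception propagate instead of falling back to database removal, which is the concern raised in the comment above.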



     def remove_jobs(self, recursive=False, progress=True, silently=False):
         """
@@ -1700,23 +1700,19 @@ def _remove_jobs_helper(self, recursive=False, progress=True):
         """
         if not isinstance(recursive, bool):
             raise ValueError("recursive must be a boolean")
-        if not self.db.view_mode:
-            job_id_lst = self.get_job_ids(recursive=recursive)
-            if progress and len(job_id_lst) > 0:
-                job_id_lst = tqdm(job_id_lst)
-            for job_id in job_id_lst:
-                if job_id not in self.get_job_ids(recursive=recursive):
-                    continue
Member commented:

If I understand this pull request correctly, the primary change is removing this database query, correct? Maybe we should split this pull request into two parts: one containing just the removal of this query and the other the refactoring. Having both mixed makes it a bit hard to read.

Member commented:

To clarify these changes a bit more, I created two separate pull requests, #1326 and #1325. They contain the changes to the database queries but not the refactoring done in this pull request.

-                else:
-                    try:
-                        self.remove_job(job_specifier=job_id)
-                        state.logger.debug("Remove job with ID {0} ".format(job_id))
-                    except (IndexError, Exception):
-                        state.logger.debug(
-                            "Could not remove job with ID {0} ".format(job_id)
-                        )
-        else:
+        if self.db.view_mode:
             raise EnvironmentError("copy_to: is not available in Viewermode !")
+        job_id_lst = self.get_job_ids(recursive=recursive)
+        job_id_progress = tqdm(job_id_lst) if progress else job_id_lst
+        for job_id in job_id_progress:
+            try:
+                self.remove_job(job_specifier=job_id)
+                state.logger.debug("Remove job with ID {0} ".format(job_id))
+            except (IndexError, Exception):
+                state.logger.debug(
+                    "Could not remove job with ID {0} ".format(job_id)
+                )


     def _remove_files(self, pattern="*"):
         """
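
The membership check removed from `_remove_jobs_helper` re-ran `get_job_ids` on every iteration, so removing N jobs cost N + 1 listing queries. Here is a rough sketch of that cost, using a hypothetical `CountingDB` that only counts how often the id list is fetched. It is not pyiron_base code, and the helper names are made up for the comparison.

```python
class CountingDB:
    """Hypothetical job database that counts how often the id list is fetched."""

    def __init__(self, job_ids):
        self._ids = list(job_ids)
        self.queries = 0

    def get_job_ids(self):
        self.queries += 1
        return list(self._ids)

    def remove(self, job_id):
        # list.remove raises ValueError if the id is already gone.
        self._ids.remove(job_id)


def remove_all_rechecking(db):
    """Old shape: re-query the id list before every single removal."""
    for job_id in db.get_job_ids():
        if job_id not in db.get_job_ids():
            continue
        db.remove(job_id)


def remove_all_single_query(db):
    """New shape: query once and tolerate ids that have already disappeared."""
    for job_id in db.get_job_ids():
        try:
            db.remove(job_id)
        except ValueError:
            # Already removed; skip instead of asking the database again.
            pass
```

With 100 jobs, the first helper fetches the id list 101 times and the second once. In the diff itself, a job that has already disappeared appears to be handled inside `remove_job`, which logs a warning and returns when `inspect()` yields `None`.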