Date: August 15, 2024
Contributors: Abuzar Mahmood, abuzarmahmood
PR: https://github.com/katzlabbrandeis/blech_clust/pull/204
The workflow management system was significantly improved in the PR released on August 15, 2024 titled, “45 workflow management – only run next step if dependencies have been fulfilled.” This PR aimed at enhancing the efficiency and reliability of the Blech_Clust pipeline.
A total of 21 files were changed with 413 additions and 28 deletions, predominantly in Python and JSON. The key files affected were blech_clust.py
, blech_common_avg_reference.py
, blech_exp_info.py
, blech_make_arrays.py
, and blech_make_psth.py
.
The core change was the introduction of a pipeline graph check in the workflow management system. This new feature ensures that the next step in the pipeline is only run if its dependencies have been fulfilled. Here is a snapshot of the code diff for blech_clust.py
:
diff --git a/blech_clust.py b/blech_clust.py
@@ -13,7 +13,7 @@
# Necessary blech_clust modules
from utils import read_file
from utils.qa_utils import channel_corr
-from utils.blech_utils import entry_checker, imp_metadata
+from utils.blech_utils import entry_checker, imp_metadata, pipeline_graph_check
The pipeline graph check is performed by importing pipeline_graph_check
from utils.blech_utils
. The function check_previous(script_path)
checks if the previous scripts in the pipeline have been successfully executed and write_to_log(script_path, 'attempted')
logs the attempt of the current script. After successful completion of the script, write_to_log(script_path, 'completed')
writes the success to the log.
Similar changes were introduced in the other key files.
These changes bring about a significant improvement in pipeline efficiency by preventing the pipeline from executing steps whose dependencies are not fulfilled. This minimizes the risk of errors and maximizes the use of computational resources.
In addition, the detailed logging of each step in the pipeline aids in debugging and tracking the progress of data processing. It allows users to easily identify any issues within the pipeline and fix them promptly.
This PR brings about a marked enhancement in the workflow management of the Blech_Clust pipeline. It significantly improves the efficiency and robustness of the pipeline, making it more user-friendly and reliable. As the Blech_Clust project continues to evolve, these changes provide a strong foundation for future improvements and additions.