patches

Implements functions that modify the behavior of Snakemake.

This functionality is pretty hacky and is almost certainly not future-proof, so if we change the Snakemake version (currently pinned at 6.15.5), we may have to update the code in this file.

class showyourwork.patches.SnakemakeFormatter(fmt=None, datefmt=None, style='%', validate=True)

Bases: logging.Formatter

Format Snakemake errors before displaying them on stdout.

Sometimes, Snakemake fails with suggestions for commands to fix certain issues. We intercept those suggestions here, replacing them with the corresponding showyourwork syntax for convenience.

Initialize the formatter with specified format strings.

Initialize the formatter either with the specified format string or, if fmt is omitted, with a default format that contains only the log message. Allow for specialized date formatting with the optional datefmt argument. If datefmt is omitted, you get an ISO8601-like (or RFC 3339-like) format.

Use a style parameter of ‘%’, ‘{’ or ‘$’ to specify that you want to use one of %-formatting, str.format() ({}) formatting or string.Template formatting in your format string.

Changed in version 3.2: Added the style parameter.

format(record)

Format the specified record as text.

The record’s attribute dictionary is used as the operand to a string formatting operation which yields the returned string. Before formatting the dictionary, a couple of preparatory steps are carried out. The message attribute of the record is computed using LogRecord.getMessage(). If the formatting string uses the time (as determined by a call to usesTime()), formatTime() is called to format the event time. If there is exception information, it is formatted using formatException() and appended to the message.

replacements = {'snakemake --cleanup-metadata': 'showyourwork --cleanup-metadata'}
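
The interception described above can be illustrated with a minimal sketch. It is not the showyourwork implementation: the class name ReplacingFormatter and the sample log message are made up for illustration, and only the replacements mapping is taken from this page.

```python
import logging


class ReplacingFormatter(logging.Formatter):
    """Hypothetical stand-in for SnakemakeFormatter (not the real code)."""

    # Same mapping as the documented ``replacements`` attribute.
    replacements = {
        "snakemake --cleanup-metadata": "showyourwork --cleanup-metadata"
    }

    def format(self, record):
        # Let logging.Formatter build the message, then rewrite suggestions.
        message = super().format(record)
        for old, new in self.replacements.items():
            message = message.replace(old, new)
        return message


# Build a record by hand so the example does not depend on logger setup.
record = logging.LogRecord(
    "demo", logging.ERROR, "demo.py", 1,
    "Run snakemake --cleanup-metadata %s to fix this.", ("out.txt",), None,
)
formatted = ReplacingFormatter().format(record)
print(formatted)
```

Doing the substitution after super().format() means the rewrite also applies to text that only appears once the message arguments have been interpolated.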

showyourwork.patches.get_skippable_jobs(dag)

Search the DAG and return jobs we can safely skip due to downstream cache hits.

showyourwork.patches.get_snakemake_variable(name, default=None)

Infer the value of a variable within Snakemake.

This is extremely hacky, as it inspects local variables across various frames in the call stack. This function should be used for debugging/development, but not in production.
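
To show what inspecting local variables across frames can look like, here is a small sketch built on the standard inspect module. The helper find_in_call_stack and the functions outer and inner are hypothetical; the actual function may search the stack differently.

```python
import inspect


def find_in_call_stack(name, default=None):
    """Return the first local variable called ``name`` found in a caller's frame."""
    # Skip frame 0 (this function) and walk outward through the callers.
    for frame_info in inspect.stack()[1:]:
        frame_locals = frame_info.frame.f_locals
        if name in frame_locals:
            return frame_locals[name]
    return default


def outer():
    workflow_name = "demo-workflow"  # a local that is never passed down explicitly
    return inner()


def inner():
    # ``workflow_name`` is not local here; it is found in outer()'s frame.
    return find_in_call_stack("workflow_name")


print(outer())
```

The lookup depends entirely on what happens to be on the call stack at the time, which is why this kind of helper is fragile outside of debugging.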

showyourwork.patches.job_is_cached(job)

Return True if a job’s outputs will be restored from cache.

showyourwork.patches.patch_snakemake_cache(zenodo_doi, sandbox_doi)

Patches the Snakemake cache functionality to

  • Add custom logging messages

  • Attempt to download the cache file from Zenodo/Zenodo Sandbox on fetch()

  • Upload the cache file to Zenodo Sandbox on store()

Parameters
  • zenodo_doi (str) – The Zenodo DOI for the cache. Can be None.

  • sandbox_doi (str) – The Zenodo Sandbox DOI for the development cache. Can be None.
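
The general pattern of wrapping fetch() and store() can be sketched with a toy class. Everything here is an assumption made for illustration: ToyCache stands in for Snakemake's cache class, a plain dict stands in for Zenodo, and patch_toy_cache is not the real function.

```python
import logging

logger = logging.getLogger("cache-demo")


class ToyCache:
    """Hypothetical local cache; stands in for Snakemake's cache class."""

    def __init__(self):
        self.files = {}

    def fetch(self, key):
        return self.files[key]  # raises KeyError on a local cache miss

    def store(self, key, value):
        self.files[key] = value


def patch_toy_cache(remote):
    """Wrap ToyCache.fetch/store; ``remote`` is a dict posing as Zenodo."""
    original_fetch = ToyCache.fetch
    original_store = ToyCache.store

    def fetch(self, key):
        # On a local miss, try the remote copy before falling back.
        if key not in self.files and key in remote:
            logger.info("Downloading %s from the remote cache", key)
            self.files[key] = remote[key]
        return original_fetch(self, key)

    def store(self, key, value):
        original_store(self, key, value)
        logger.info("Uploading %s to the remote cache", key)
        remote[key] = value

    ToyCache.fetch = fetch
    ToyCache.store = store


remote = {"figure.pdf": b"cached bytes"}
patch_toy_cache(remote)
cache = ToyCache()
fetched = cache.fetch("figure.pdf")  # local miss, served from ``remote``
cache.store("data.csv", b"1,2,3")    # stored locally and mirrored remotely
```

Keeping references to the original methods lets the wrappers add behavior around them instead of reimplementing the cache logic.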

showyourwork.patches.patch_snakemake_cache_optimization(dag)

Remove unnecessary jobs upstream of those with cache hits.

See the full discussion about this feature here.
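
The idea of dropping jobs that sit upstream of a cache hit can be shown on a toy graph. This sketch assumes a plain dict of job names rather than Snakemake's DAG object, and the function jobs_to_run is hypothetical.

```python
def jobs_to_run(dependencies, targets, cached):
    """Return the jobs that still have to run.

    dependencies: dict mapping each job to the jobs it depends on (upstream).
    targets: jobs whose outputs are ultimately requested.
    cached: jobs whose outputs can be restored from the cache.
    """
    needed = set()
    stack = list(targets)
    while stack:
        job = stack.pop()
        if job in needed or job in cached:
            # A cache hit is restored, so the walk never goes past it.
            continue
        needed.add(job)
        stack.extend(dependencies.get(job, []))
    return needed


dependencies = {
    "plot": ["analyze"],
    "analyze": ["simulate", "download"],
    "simulate": [],
    "download": [],
    "table": ["download"],
}
cached = {"analyze"}
needed = jobs_to_run(dependencies, targets=["plot", "table"], cached=cached)
skippable = set(dependencies) - needed - cached
print(sorted(needed), sorted(skippable))
```

In this example "simulate" is skippable because only the cached "analyze" job consumes it, while "download" still has to run because the uncached "table" job depends on it.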

showyourwork.patches.patch_snakemake_logging()

Hacks the Snakemake logger to suppress most of its terminal output, and redirects the rest to a custom log file.
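
As a rough illustration of this kind of redirection, the sketch below swaps a logger's terminal handler for a file handler using only the standard logging module. The logger name, log format, and helper redirect_logger_to_file are assumptions; the real patch operates on Snakemake's own logger.

```python
import logging
import os
import tempfile


def redirect_logger_to_file(logger, path):
    """Drop the logger's existing handlers and log to ``path`` instead."""
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    file_handler = logging.FileHandler(path)
    file_handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(file_handler)
    # Keep records away from any terminal handlers on the root logger.
    logger.propagate = False
    return file_handler


demo_logger = logging.getLogger("noisy-demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(logging.StreamHandler())  # the "terminal" output

log_path = os.path.join(tempfile.mkdtemp(), "demo.log")
handler = redirect_logger_to_file(demo_logger, log_path)
demo_logger.info("Building DAG of jobs...")
handler.close()

with open(log_path) as f:
    log_text = f.read()
```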

showyourwork.patches.patch_snakemake_missing_input_leniency()

Patches Snakemake to raise an error if there are no producers for any of the output files in the DAG.

showyourwork.patches.patch_snakemake_wait_for_files()

Replace Snakemake’s wait_for_files method with a custom version that prints a different error message.

If an expected output of a rule is not found after running it, Snakemake prints an error recommending that the user increase the filesystem latency tolerance. This is almost never a latency issue; the far more likely scenario is that the user did not code up the rule properly, or that the output file was saved to the wrong path.
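
The replacement can be pictured as a simple function swap. The sketch below uses a throwaway namespace in place of the Snakemake module that owns wait_for_files, and both error messages are invented for illustration; neither is the text Snakemake or showyourwork actually prints.

```python
import os
import types

# Hypothetical stand-in for the module that owns the function being replaced.
toy_io = types.SimpleNamespace()


def original_wait_for_files(files):
    missing = [f for f in files if not os.path.exists(f)]
    if missing:
        raise IOError("Missing files after waiting; try increasing the latency wait.")


toy_io.wait_for_files = original_wait_for_files


def patched_wait_for_files(files):
    missing = [f for f in files if not os.path.exists(f)]
    if missing:
        # Point at the likelier cause: the rule never wrote these paths.
        raise IOError(
            "Missing output files: " + ", ".join(missing)
            + ". Check that the rule writes them to the expected paths."
        )


# The patch itself is a single attribute assignment.
toy_io.wait_for_files = patched_wait_for_files

try:
    toy_io.wait_for_files(["no/such/output.txt"])
except IOError as error:
    message = str(error)
```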