
Make it possible to resume LSDB pipeline #760

@hombit

Feature request

Let's say I'm running a pipeline, but the job crashes. It would be nice to resume and continue from the same point!

One option would be to add a resume argument, Catalog.to_hats(..., resume: bool = False), which, if set to True, would check the target folder and skip all partitions that are already present. It should probably also verify that the existing Parquet files are valid, since a crash can leave a partially written file behind.
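
To make the idea concrete, here is a rough sketch of the skip logic. It is not LSDB code: `looks_like_complete_parquet` and `partitions_to_write` are hypothetical helpers, and the list of expected partition files is assumed to be known before writing starts. The validity check is deliberately cheap. It only looks for the `PAR1` magic bytes that the Parquet format places at the start and end of a file, which catches truncated writes but not corrupted contents. A real implementation would probably want to read the footer metadata as well.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"


def looks_like_complete_parquet(path: Path) -> bool:
    """Cheap check: the file starts and ends with the Parquet magic bytes."""
    try:
        # Anything shorter than magic + 4-byte footer length + magic
        # cannot be a finished Parquet file.
        if path.stat().st_size < 12:
            return False
        with path.open("rb") as f:
            head = f.read(4)
            f.seek(-4, 2)  # 2 == os.SEEK_END
            tail = f.read(4)
    except OSError:
        # Missing or unreadable files count as "not written yet".
        return False
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def partitions_to_write(target_dir, expected_files, resume=False):
    """Return the relative paths that still need to be written.

    Hypothetical helper: with resume=False everything is (re)written,
    with resume=True files that already look complete are skipped.
    """
    if not resume:
        return list(expected_files)
    target = Path(target_dir)
    return [
        rel for rel in expected_files
        if not looks_like_complete_parquet(target / rel)
    ]
```

With something like this, a partition file that is missing or was cut off mid-write ends up back in the to-do list, while finished files are left alone.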

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Metadata

Labels: enhancement (New feature or request)

Projects

Status: Done
