Support URL files with up to millions of lines #457

@adutra

This came up while reviewing #399: some users are using giant urlfiles with millions of URLs inside.

Urlfiles were not designed for files of this size: currently, when a urlfile is parsed, every parsed URL instance is stored in memory at once. See AbstractFileBasedConnector#loadURLs.

We should modify that method to return a Flux instead, and merge it with other fluxes.
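
To illustrate the idea, here is a minimal sketch of reading a urlfile lazily, one line at a time, so that memory use no longer grows with the number of URLs. It is not the actual `loadURLs` code: the class name `LazyUrlFile`, the method signature, and the skipping of blank lines and `#` comment lines are all assumptions made for illustration. It also uses a plain `java.util.stream.Stream` as a stand-in so that the example has no dependencies; with Reactor on the classpath, the same stream could presumably be wrapped with `Flux.using(() -> loadURLs(path), Flux::fromStream, Stream::close)` and then merged with the other fluxes.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the real connector code.
class LazyUrlFile {

  // Returns a lazy stream of URLs: lines are read and parsed only as the
  // stream is consumed, so the full list is never held in memory.
  // The caller must close the stream to release the underlying file.
  static Stream<URL> loadURLs(Path urlfile) throws IOException {
    return Files.lines(urlfile, StandardCharsets.UTF_8)
        .map(String::trim)
        // Assumption: blank lines and lines starting with '#' are ignored.
        .filter(line -> !line.isEmpty() && !line.startsWith("#"))
        .map(LazyUrlFile::toURL);
  }

  private static URL toURL(String line) {
    try {
      return URI.create(line).toURL();
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException("Invalid URL in urlfile: " + line, e);
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a small throwaway urlfile for demonstration purposes.
    Path urlfile = Files.createTempFile("urlfile", ".txt");
    Files.write(
        urlfile,
        Arrays.asList("# sample urlfile", "file:///tmp/a.csv", "https://example.com/b.csv"),
        StandardCharsets.UTF_8);

    // try-with-resources closes the file once the stream has been consumed.
    try (Stream<URL> urls = loadURLs(urlfile)) {
      urls.forEach(System.out::println);
    } finally {
      Files.deleteIfExists(urlfile);
    }
  }
}
```

The important design point is that the caller, not `loadURLs`, decides when the file is closed; a `Flux`-based version would need the same guarantee, which is what `Flux.using` provides.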

┆Issue is synchronized with this Jira Task by Unito
