This repository was archived by the owner on Jul 21, 2021. It is now read-only.

Prevent duplicates by using URL as unique identifier #2

@schliflo

Description

Entries that are updated after their first crawl sometimes generate duplicates on re-crawl. This could be prevented by using the URL as a unique identifier: on re-crawl, update the existing entry for that URL instead of creating a new one.
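
A minimal sketch of the idea, not this project's actual code: the `Entry` and `Index` names and the in-memory dictionary are assumptions made for illustration. It shows entries keyed by URL, so a re-crawl of a known URL overwrites the stored entry rather than adding a second one.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Entry:
    """Hypothetical crawled entry; the field names are illustrative."""
    url: str
    title: str
    content: str


class Index:
    """Hypothetical in-memory index that uses the URL as the unique key."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def upsert(self, entry: Entry) -> bool:
        """Store the entry under its URL.

        Returns True if the URL was new, False if an existing
        entry for that URL was replaced.
        """
        is_new = entry.url not in self._entries
        self._entries[entry.url] = entry
        return is_new

    def get(self, url: str) -> Entry:
        """Return the stored entry for the given URL."""
        return self._entries[url]

    def __len__(self) -> int:
        return len(self._entries)
```

Crawling the same URL twice with changed content would then leave a single entry holding the newer data. In a database-backed index, the equivalent would presumably be a unique constraint on the URL column combined with an insert-or-update operation.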

Metadata

Labels

bug: Something isn't working
