This repository was archived by the owner on Oct 30, 2021. It is now read-only.

Upstream: undo feature id sharding #98

@yhahn

Transferring braindump notes to here. This issue affects both carmen-cache and carmen, but the problem originates mostly from the grid cache outward so capturing here for now.

Why the problem exists

Feature storage is sharded right now not just for historical reasons, but because the grid storage in our PBF format currently limits us to 20-bit feature IDs.
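
To make the 20-bit limit concrete, here is a minimal sketch of what happens when a larger ID has to fit into 20 bits. The masking scheme and the names `ID_BITS`, `ID_MASK`, and `truncate_id` are assumptions for illustration only, not the actual carmen-cache encoding.

```python
ID_BITS = 20                   # ID width mentioned above
ID_MASK = (1 << ID_BITS) - 1   # largest value that fits in 20 bits


def truncate_id(feature_id: int) -> int:
    """Keep only the low ID_BITS bits (assumed truncation scheme)."""
    return feature_id & ID_MASK


# Two distinct full IDs that share the same low 20 bits.
full_a = 120951912
full_b = full_a + (1 << ID_BITS)

# Both land on the same 20-bit value, so something else has to tell them apart.
same_slot = truncate_id(full_a) == truncate_id(full_b)
```

Only 2^20 (1,048,576) distinct values exist at this width, so any index with more features than that is guaranteed to see IDs share a slot.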

Even with a larger feature ID space (e.g. if we went from 53 bits => 64 bits) we would probably still encounter feature ID collisions. At that point we'd need to store the colliding features together in a nested fashion, with some kind of secondary factor (like cover zxy) to disambiguate where possible.

Example:

feature A 120951912 \
                     +----> store together, distinguish by id + zxy
feature B 120951912 /
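
A rough sketch of the nested storage in the example above, assuming features are grouped by ID and then keyed by their cover zxy. `FeatureStore`, `put`, `get`, and `get_all` are hypothetical names, not existing carmen or carmen-cache APIs.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

ZXY = Tuple[int, int, int]  # (zoom, x, y) of a cover tile


class FeatureStore:
    """Hypothetical store: colliding IDs live together, zxy tells them apart."""

    def __init__(self) -> None:
        self._by_id: Dict[int, Dict[ZXY, dict]] = defaultdict(dict)

    def put(self, feature_id: int, zxy: ZXY, feature: dict) -> None:
        self._by_id[feature_id][zxy] = feature

    def get(self, feature_id: int, zxy: ZXY) -> Optional[dict]:
        # Disambiguate by id + zxy; returns None if nothing matches.
        return self._by_id.get(feature_id, {}).get(zxy)

    def get_all(self, feature_id: int) -> List[dict]:
        # Every feature stored under this ID, when no zxy is available.
        return list(self._by_id.get(feature_id, {}).values())


store = FeatureStore()
store.put(120951912, (6, 32, 21), {"name": "feature A"})
store.put(120951912, (6, 10, 24), {"name": "feature B"})
```

The "where possible" caveat still applies: two features with the same ID and the same cover tile would overwrite each other in this sketch.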

Rough braindump of how to untangle

  • We'd want unlimited feature ID storage space
  • We'd want enforcement of feature ID uniqueness (per index) at indexing time
  • We'd want to undo "sharded" feature storage
  • We'd want to stop using uint64 values as a way to transfer cover data into and out of carmen-cache. We'd probably define a new PBF message format (variable length, allowing for any size feature IDs) for encoding this information
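
As a sketch of the last point, the base-128 varint scheme that protobuf uses can in principle carry an ID of any size. The cover layout below (`feature_id`, `x`, `y`) and the function names are assumptions for illustration, not a proposed message definition. Note that standard protobuf varint fields stop at 64 bits, so truly unbounded IDs in a real PBF message would need something like a `bytes` field.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Base-128 varint: 7 payload bits per byte, high bit set means 'more'."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_cover(feature_id: int, x: int, y: int) -> bytes:
    # Hypothetical cover record: three varints back to back.
    return encode_varint(feature_id) + encode_varint(x) + encode_varint(y)


def decode_cover(buf: bytes) -> Tuple[int, int, int]:
    feature_id, pos = decode_varint(buf)
    x, pos = decode_varint(buf, pos)
    y, pos = decode_varint(buf, pos)
    return feature_id, x, y


big_id = 2**80 + 5  # far beyond what a uint64 could hold
roundtrip = decode_cover(encode_cover(big_id, 32, 21))
```

Small IDs stay cheap (one byte for anything under 128), while large IDs simply take more bytes instead of overflowing a fixed-width integer.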

I think these would all be very positive changes, just a pretty solid, invasive refactor.

cc @mapbox/geocoding-gang
