Skip to content

Restrict and delete keys from a Map using intervals#49

Merged
Bodigrim merged 2 commits intomsakai:masterfrom
AliceRixte:master
Jan 20, 2026
Merged

Restrict and delete keys from a Map using intervals#49
Bodigrim merged 2 commits intomsakai:masterfrom
AliceRixte:master

Conversation

@AliceRixte
Copy link
Contributor

Hi there !

This PR adds the possibility restrict or delete all the keys within an interval from a Map. If you like the idea, I can add the same functions for strict maps.

Cheers !

Alice

@Bodigrim
Copy link
Collaborator

Bodigrim commented Jan 2, 2026

Could you possibly elaborate how new functions differ from Data.IntervalMap.Lazy.delete or, say, Data.IntervalMap.Lazy.difference? (I never personally used Data.IntervalMap, so out of my depth here)

@AliceRixte
Copy link
Contributor Author

AliceRixte commented Jan 2, 2026

Could you possibly elaborate how new functions differ from Data.IntervalMap.Lazy.delete or, say, Data.IntervalMap.Lazy.difference? (I never personally used Data.IntervalMap, so out of my depth here)

I'm not sure I understand the question but I can try. This is about Map from the containers library and not about IntervalMap . What this does is it reexports Data.Map (from containers) with additional functions to delete (or restrict to) all the keys of a map that belongs to some interval (or interval set).

I don't use IntervalMap either but my understanding is that it associates a value to an interval, and delete will remove such a pairing, which is very different from the above. Notice I did not modify Data.IntervalMap at all.

@AliceRixte
Copy link
Contributor Author

AliceRixte commented Jan 2, 2026

In fact, restrictInterval m i could be implemented as

restrictInterval m i = Map.filterKeys (`Interval.member` i) m

but the implementation of this PR is much faster : the implementation in terms of filterKeysis linear in the size of the map, but the implementation in this PR (using Map.splitand Map.splitLookup) is logarithmic.

To be able to get a logarithmic complexity, one needs pattern match on Interval to be able to use Map.splitLookup instead of Map.filter, which means I have to import Data.Interval.Internal.

@AliceRixte
Copy link
Contributor Author

@Bodigrim I added better documentation and a comparison between the performance of filterKeys versus restrictInterval.

I also added a benchmark. On my computer, restrictInterval is about 40 times faster than filterKeys.

@Bodigrim
Copy link
Collaborator

Bodigrim commented Jan 2, 2026

This is about Map from the containers library and not about IntervalMap .

Ah, thanks, now I see.

I don't use IntervalMap either but my understanding is that it associates a value to an interval, and delete will remove such a pairing, which is very different from the above.

Not quite, here is an example of what delete on IntervalMap does:

> import Data.Interval as I
> import Data.IntervalMap.Lazy as IML
> IML.delete (8 I.<..< 18) $ IML.insert (0 I.<..< 10) 42 $ IML.insert (15 I.<..< 25) 24 mempty
fromList [(Finite 0 <..<= Finite 8,42),(Finite 18 <=..< Finite 25,24)]

I think this is actually quite similar to what you are implementing here.

Say, if there was a function convertEachKeyToSingletonInterval :: Map k v -> IntervalMap k v, then Data.IntervalMap.Lazy.delete seems to work on the result similar to the proposed new delete function. Maybe we can get away just implementing Map k v -> IntervalMap k v bit?..

@AliceRixte
Copy link
Contributor Author

AliceRixte commented Jan 3, 2026

Maybe we can get away just implementing Map k v -> IntervalMap k v bit?..

This makes a lot of sense, and indeed you're right, delete from IntervalMap does correspond to deleteInterval, and what you propose probably meets the specs.

However, in my use case, I need to work with Map and not with IntervalMap and performance is critical (if it was not, I would simply use filterKeys with the member predicate on interval).

If I understand well your suggestion, this would mean that, to get deleteInterval on Map, one would first need to convert to IntervalMap via convertEachKeyToSingletonInterval and then convert back to Map. I don't see how this second conversion could be done.

Moreover, the function convertEachKeyToSingletonInterval :: Map k v -> IntervalMap k v (which might indeed be useful for people who use IntervalMap) you propose to implement will be in linear complexity O(n), whereas restrictInterval in this PR is O(log n), so I do not see how we can recover the efficiency of the implementation of this PR via a conversion.

@AliceRixte
Copy link
Contributor Author

AliceRixte commented Jan 3, 2026

Before I forget, there's also a question I haven't figured out yet :

If you decide to add restrictInterval and deleteInterval, I still don't know which order to choose for the arguments of restrictInterval.

Pros for restrictInterval :: Ord k => Map k a -> Interval k -> Map k a

  • This is the same argument order as restrictKeys from Data.Map.
  • I suspect the reason the devs from containers chose this order is that this corresponds to the restriction notation of a function $f|_X$ to some set $X$.

Pros for restrictInterval :: Ord k => Interval k -> Map k a -> Map k a

  • This is coherent with deleteInterval and all other functions (insert, delete, ...) from Data.Map

To be fair I don't like the order of arguments of restrictKeys, in Data.Map, but it might be confusing to have restrictKeysin one order and restrictInterval in the other. But it is also confusing to have deleteInterval in one order and restrictinterval in the other.

@Bodigrim
Copy link
Collaborator

Bodigrim commented Jan 4, 2026

Big picture is that I don't really feel like an entirely new module re-exporting entire Data.Map is a good fit. (I'm happy to be overruled by @msakai though.) Could we fold the new function into Data.Interval?

@AliceRixte
Copy link
Contributor Author

AliceRixte commented Jan 5, 2026

Big picture is that I don't really feel like an entirely new module re-exporting entire Data.Map is a good fit. (I'm happy to be overruled by @msakai though.) Could we fold the new function into Data.Interval?

That's fine by me. I did it this way because I intended to copy paste it to get the strict version for Data.Map.Strict. But we can also add a "strict" suffix which I don't find very pretty even though it's fine.

More concretely, we could move deleteInterval and restrictInterval (and maybe deleteIntervalStrict, restrictIntervalStrict) inside Data.Interval and deleteIntervalsand restrictIntervals inside Data.IntervalSet.

@AliceRixte
Copy link
Contributor Author

If I may add, the same functions can also be implemented for Set, IntSet and IntMap (both Strict and Lazy), so I don't think adding suffixes is a good idea, in case someone needs them in the future.

@Bodigrim
Copy link
Collaborator

Bodigrim commented Jan 6, 2026

That's fine by me. I did it this way because I intended to copy paste it to get the strict version for Data.Map.Strict. But we can also add a "strict" suffix which I don't find very pretty even though it's fine.

I don't quite see why you need a dedicated strict version: it seems the implementation only ever inserts values which were already present in the map, so if it was a lazy Map it remains one and if it was a strict one it remains strict as well.

@AliceRixte
Copy link
Contributor Author

AliceRixte commented Jan 6, 2026

I don't quite see why you need a dedicated strict version: it seems the implementation only ever inserts values which were already present in the map, so if it was a lazy Map it remains one and if it was a strict one it remains strict as well.

I don't think I understand.

Data.Map.Lazy and Data.Map.Strict are two different data structures, with the same interface. The functions splitLookup, lookup and insert are not polymorphic on the container type, so restrictInterval has to be implemented for each specific container (here Data.Map.Lazy and Data.Map.Strict, but this also can be done with Data.Set.Lazy and Data.Set.Strict) we want to use it with.

Ideally, there would be a IsMap class in containers that allows to have such polymorphic functions but unfortunately it does not exists. There is a IsMap class in the mono-traversable package, but it does not contain splitLookup, meaning that all the performance benefits of this proposition would be gone if we use mono-traversable.

Except if we go and ask the mono-traversable devs to add splitLookup to IsMap, which I can do if this is what you prefer. In an ideal Haskell world, this is how I would want it to be : an IsMap class with splitLookup in it, an IsInterval class with the Interval API, and then restrictInterval would be polymorphic both on the interval and the container. But this looks a bit overkill 😆 .

@Bodigrim
Copy link
Collaborator

Bodigrim commented Jan 6, 2026

Data.Map.Lazy and Data.Map.Strict are two different data structures, with the same interface.
The functions splitLookup, lookup and insert are not polymorphic on the container type,

They are the same data structure and Data.Map.Lazy.splitLookup is the same function as Data.Map.Strict.splitLookup and the same for lookup. It's only insert which is different (for lazy Map it inserts the value as is and for strict Map it forces the value first), but given that your code never invents new values and takes them from the existing Map you can use Data.Map.Lazy.insert and it will uphold the invariant of a strict Map as well.

@AliceRixte AliceRixte force-pushed the master branch 2 times, most recently from 2a8ac86 to 4ee3163 Compare January 7, 2026 12:35
@AliceRixte
Copy link
Contributor Author

They are the same data structure and Data.Map.Lazy.splitLookup is the same function as Data.Map.Strict.splitLookup and the same for lookup. It's only insert which is different (for lazy Map it inserts the value as is and for strict Map it forces the value first), but given that your code never invents new values and takes them from the existing Map you can use Data.Map.Lazy.insert and it will uphold the invariant of a strict Map as well.

Oh now I understand, thank you for this explanation, I don't know why, I was convinced of the contrary, sorry. This is great !

I deleted the Data.Map.Lazy.Interval file.

I still believe my argument makes sense for other containers : the same function can be implemented for Set, IntSet and IntMap. I'm willing to do so, especially as I will probably need them in the future. To show you what I mean, I added an implementation of these functions for Set.

@AliceRixte AliceRixte force-pushed the master branch 3 times, most recently from 15264e7 to 1a61254 Compare January 8, 2026 22:53
Copy link
Collaborator

@Bodigrim Bodigrim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll take another look over the weekend, just a few notes for now.

@AliceRixte AliceRixte force-pushed the master branch 5 times, most recently from 773b4f3 to 953b22d Compare January 10, 2026 13:37
@AliceRixte
Copy link
Contributor Author

I just implemented benchmarking and tests for the Set versions.

I'm now satisfied with the current state of this PR, except maybe for :

  • the naming of the various functions
  • the order of the arguments, for which I'm still confused about

I'll try to think of better names over the week-end.

@AliceRixte AliceRixte force-pushed the master branch 2 times, most recently from 1d3a4fa to 258dc2a Compare January 12, 2026 22:05
@AliceRixte
Copy link
Contributor Author

I just added another optimization : using takeWhileAntitone and dropWhileAntitone is faster than using splitLookupand then insert.

-- * Operations on the keys of 'Data.Map'
, restrictKeysToInterval
, withoutKeysFromInterval
, splitInterval
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As usual, naming is the hardest problem :)

I wonder if we really need splitInterval. The name is very unfortunate (one would expect it to split an interval, not split a map by keys in and out of an interval) and I cannot come up with a better one. Maybe in rare cases when this function is needed, one can satisfactory use restrict... and without...?..

How do you feel about restrictMapKeysToInterval instead of restrictKeysToInterval?

Copy link
Contributor Author

@AliceRixte AliceRixte Jan 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As usual, naming is the hardest problem :)

Haha exactly !

I wonder if we really need splitInterval. The name is very unfortunate

What about splitMapAtBoundariesOf and splitSetAtBoundariesOf ?

Maybe in rare cases when this function is needed, one can satisfactory use restrict... and without...?...

Maybe but it would be slower than the current implementation if all three intervals are needed, and performance is the whole reason to be of these functions : otherwise why not just use filterKeys ?

How do you feel about restrictMapKeysToInterval instead of restrictKeysToInterval?

It makes a lot of sense. withoutMapKeysFomInterval makes a bit less sense but is fine by me.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe but it would be slower than the current implementation if all three intervals are needed, and performance is the whole reason to be of these functions : otherwise why not just use filterKeys?

Even restrictKeysToInterval + withoutKeysFromInterval will still be faster than filterKeys and with a better asymptotic behaviour.

Also containers themselves do not offer a combined restrictKeys and withoutKeys. Let's leave it out for now; we can keep it in the Git history so that it can be restored back if needed.

What about splitMapAtBoundariesOf and splitSetAtBoundariesOf?

Not quite, to be honest. Judging by the name I'd expect such function to split Map into three parts: before, inside and after the interval.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not quite, to be honest. Judging by the name I'd expect such function to split Map into three parts: before, inside and after the interval.

Maybe I'm not understanding something this is precisely what the function does. The documentation I wrote says

-- | Split a 'Map' into three 'Map's, such that, respectively,
--
-- 1. the keys are less than the interval
-- 2. the keys are contained in the interval
-- 3. the keys are greater than the interval

Copy link
Contributor Author

@AliceRixte AliceRixte Jan 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you say so, I will remove splitMapAtBoundariesOfand splitSetAtBoundariesOf.

[Edit] Seeing your comment below, I removed them.

@AliceRixte AliceRixte force-pushed the master branch 2 times, most recently from 8eaeb20 to ee82b69 Compare January 19, 2026 13:13
@Bodigrim Bodigrim merged commit 4e57c1b into msakai:master Jan 20, 2026
11 checks passed
@Bodigrim
Copy link
Collaborator

Thank you, awesome!

@AliceRixte
Copy link
Contributor Author

Thank you, awesome!

Thanks a lot for your time !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants