Skip to content

Hash function improvements#145

Closed
vouillon wants to merge 1 commit intoocaml:masterfrom
vouillon:hash
Closed

Hash function improvements#145
vouillon wants to merge 1 commit intoocaml:masterfrom
vouillon:hash

Conversation

@vouillon
Copy link
Member

@vouillon vouillon commented Aug 2, 2023

  • make sure that the hash of zero is not 0
  • finalize the hash and fold the result to a positive 31-bit integer

- make sure that the hash of zero is not 0
- finalize the hash and fold the result to a positive 31-bit integer
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Aug 3, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Sep 15, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Sep 30, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
@xavierleroy
Copy link
Contributor

I could use more context:

  • What's wrong with zero hashing to 0 ?
  • Is it intended that Z.hash x should be equal to Hashtbl.hash x? I'm not sure this is the case even with the proposed patch, as an extra mixing with 0 seems to be missing.

Clarifications welcome!

@vouillon
Copy link
Member Author

vouillon commented Oct 25, 2023

My initial motivation was to harmonize hash results over 32-bits architectures (between zarith_stubs_js and zarith C stubs).

Beside that:

  • I don't really care about zero hashing to 0. I can special-case zero in zarith_stubs_js instead.
  • It probably makes sense that Z.hash returns a positive integer, since this is less error-prone, and that it returns the same result on all architectures. For this, anding with 0x3FFFFFFF should be sufficient.
  • It might be nice that Hashtbl.hash and Z.hash coincide. But I got this wrong...

@xavierleroy
Copy link
Contributor

My initial motivation was to harmonize hash results over 32-bits architectures

Agreed. Let's do that.

I don't really care about zero hashing to 0. I can special-case zero in zarith_stubs_js instead.

I still don't understand why a special case is needed for zero...

It might be nice that Hashtbl.hash and Z.hash coincide. But I got this wrong...

If we really want this identity, we can just define Z.hash as Hashtbl.hash + a type constraint, like we do in the OCaml stdlib for String.hash, Int32.hash, etc. While we're at it, we could also define Z.seeded_hash. I'm tempted!

vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 23, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 27, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 28, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
vouillon added a commit to ocaml-wasm/zarith_stubs_js that referenced this pull request Nov 30, 2023
- Hash chunks in the same order as the C implementation.
  (submitted a change to the C stubs to get the same result on all platforms:
   ocaml/Zarith#145)
- Call caml_hash_mix_final to compute the final hash value and restrict
  the result to a positive 31-bit integer.

Signed-off-by: Jérôme Vouillon <jerome.vouillon@gmail.com>
xavierleroy added a commit to xavierleroy/Zarith that referenced this pull request Dec 28, 2023
For consistency with other integer types in the standard library
(modules Int, Int32, Int64, Nativeint), let's use the standard
hash function (`Hashtbl.hash`) for `Z.hash` instead of our variant.

This is a bit slower but has several benefits (see ocaml#145):
- 32/64 bit compatibility
- better mixing of the bits of the result.

While we're at it, add a `Z.seeded_hash` function, defined as
`Hashtbl.seeded_hash`, so that the `Z` module can be used as the argument
to the `Hashtbl.MakeSeeded` functor.
@xavierleroy
Copy link
Contributor

After thinking some more about it, I believe Z.hash should be an alias for the standard hash function Hashtbl.hash, like we do for other standard library modules that work with integers (Int, Int32, Int64, Nativeint). See #150 for more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants