Skip to content

Conversation

@blyxxyz
Copy link
Collaborator

@blyxxyz blyxxyz commented Aug 6, 2020

  • Memoize sentence properties. This makes the explicit decomposable and such arguments obsolete. Sentence properties are memoized using weakref, which limits the amount of extra memory we need per node (compared to an attribute for each property).
    A .mark_deterministic() method can be used to declare that a sentence is deterministic.
    Sentence properties are marked in advance when possible. This does screw with the tests a little, you have to make sure to bypass the memoization when appropriate.
  • Memoize .vars() in the same way (resolves Improve the vars() computation #12), and replace a lot of the internal use of .vars() by a smarter system with a temporary cache. This hugely speeds up those methods (including .decomposable()) on large sentences, because .vars() does redundant work if you run it on many nodes in the sentence. I think it takes the complexity down from quadratic to something more manageable.
    The memoization does mean that running .vars() on individual nodes is inadvisable, because it associates a whole new frozenset with each. It's something you should try to only run on full sentences. That goes for memoized methods in general.
  • Change the is_CNF() and is_DNF() semantics (resolves is_CNF semantics #14).
  • Change a few methods to very simple implementations that use .satisfiable(), now that it uses a SAT solver as a sane baseline. This slows them down for some cases and speeds them up for others, but I think it massively improves the worst-case performance for each.
  • A bunch of smaller fixes and changes.

@haz Can you check if you see anything that's questionable or badly explained or needs to make better use of these changes?

Copy link
Collaborator

@haz haz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the weakref strategy is a bit more involved than I have time to fully grok, but generally things make sense. Quite nice to see some of the major reductions in lieu of faster SAT calls & caching!

The general high-level points that came to mind:

  1. Best practices for when/how to use the memoization would be a helpful addition to CONTRIBUTING.md. It's pretty sparse at the moment, but worth adding to prior to having more students brought in on the project.
  2. It strikes me as useful to run the full test suit with/without memoization. I realize this clashes with some of the memoization mods in the tests (clearing and such), but would it possibly surface corner case bugs?
  3. Similarly, all of the forced deterministic settings in the testing seem like a prime place to run both with/without that optimization.

@haz haz mentioned this pull request Aug 7, 2020
@blyxxyz
Copy link
Collaborator Author

blyxxyz commented Aug 7, 2020

  1. I expanded CONTRIBUTING.md. Does it still look approachable for newcomers?

  2. That's a good point. The explicit marking of sentence properties (besides determinism) is now disabled when the tests run. That's much easier to manage than manually clearing the caches.
    I don't think we'd gain much from running the tests both with and without, as long as we test that the marked properties all hold. The results would be the same except in very crazy cases that would probably cause tests to fail regardless.

  3. I changed the places where the test uses .mark_deterministic() to run the assertions both before and after.

blyxxyz added 3 commits August 8, 2020 17:48
- No need to specify the environment
- MyPy has its own config file
- MyPy runs for Python 3.9
@blyxxyz blyxxyz mentioned this pull request Aug 8, 2020
Copy link
Collaborator

@haz haz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving so you can merge when set, but just noticed one minor issue in an otherwise CONTRIBUTING.md doc -- thanks!

@blyxxyz blyxxyz merged commit 2d163df into master Aug 10, 2020
@blyxxyz blyxxyz deleted the memoization branch August 10, 2020 18:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

is_CNF semantics Improve the vars() computation

3 participants