Skip to content

Dataframes and rules in tsv; OpenIE; Rule, Ruleset formats; READMEs; Unit tests#40

Merged
adaamko merged 18 commits intomainfrom
dev
Mar 9, 2022
Merged

Dataframes and rules in tsv; OpenIE; Rule, Ruleset formats; READMEs; Unit tests#40
adaamko merged 18 commits intomainfrom
dev

Conversation

@adaamko
Copy link
Copy Markdown
Owner

@adaamko adaamko commented Mar 8, 2022

close #39
close #26

@adaamko adaamko requested review from Eszti, GKingA and recski March 8, 2022 13:35
@adaamko adaamko changed the title Dataframes and rules in tsv; OpenIE; Rule, Ruleset formats; READMEs Dataframes and rules in tsv; OpenIE; Rule, Ruleset formats; READMEs; Unit tests Mar 8, 2022
@Eszti
Copy link
Copy Markdown
Collaborator

Eszti commented Mar 9, 2022

Just a suggestion on how to write good commit messages: https://cbea.ms/git-commit/
It took me no time to adjust, and I do see a significant improvement in my git history. For me, it's also a good entry point for a PR I did not work on to first look at the commit messages to get the grasp of it. 👼

@Eszti
Copy link
Copy Markdown
Collaborator

Eszti commented Mar 9, 2022

Do you still support json or is it only tsv now?

rule_set.to_tsv(path)
else:
raise ValueError("Unknown file extension, currently only .json and .tsv are supported")
raise ValueError(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add black as commit hook? :P

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, will do that, thanks :)

["(u_0 / find :obl (u_1 / .*) :obj (u_2 / .*))"],
[],
"FIND",
[{"ARG1": 1, "ARG2": 2}],
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are these?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Eszti New arguments for OpenIE, you can find examples in https://github.com/adaamko/POTATO/blob/dev/notebooks/openie.ipynb

self._dataset = self.read_dataset(path=path, binary=binary)
else:
self._dataset = self.read_dataset(examples=examples)
self.extractor = GraphExtractor(lang=lang, cache_fn=f"{lang}_nlp_cache")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest you leave an option for the cache_fn to be configured from outside

]
if len(g) > 1:
print(
"WARNING: graph has multiple connected components, taking the largest"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about Kinga's solution for representing multiple connected components in penman? Is it too soon to use that?

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think currently we don't have a good solution for the issue, but correct me if I'm wrong @GKingA

else:
raise ValueError("No examples or path provided")

# ADAM: THIS WILL NEED TO BE ADDRESSED
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this about? Maybe open an issue and reference it in the comment

Copy link
Copy Markdown
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did in #41

@adaamko
Copy link
Copy Markdown
Owner Author

adaamko commented Mar 9, 2022

Just a suggestion on how to write good commit messages: https://cbea.ms/git-commit/ It took me no time to adjust, and I do see a significant improvement in my git history. For me, it's also a good entry point for a PR I did not work on to first look at the commit messages to get the grasp of it. 👼

@Eszti Thank you for the suggestion, I will also check it :)

@adaamko adaamko merged commit a05ddf9 into main Mar 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Resolving pip backtracking takes too long Instead of pickle, store Dataframes in TSV with penman formatted graphs

4 participants