feat: add draft implementation for canonical raft#5933
feat: add draft implementation for canonical raft#5933niebayes wants to merge 2 commits intojina-ai:masterfrom niebayes:gsoc_raft
Conversation
JoanFM
left a comment
There was a problem hiding this comment.
should we consider consistency mlde as part of the general raft_configuration?
| nargs='+', | ||
| ) | ||
|
|
||
| gp.add_argument( |
There was a problem hiding this comment.
this should be at Deployment level, not Pod level
There was a problem hiding this comment.
actually, there is a specific section whrre u will find raft related args
There was a problem hiding this comment.
| repeated string endpoints = 1; | ||
| repeated string write_endpoints = 2; | ||
| // TODO(niebayes): regenerate proto files. | ||
| repeated string read_endpoints = 2; |
There was a problem hiding this comment.
no need to add an extra list, the difference between all the endpoints and write endpoints is already the read endpoints
There was a problem hiding this comment.
Are there any endpoints not for read nor for write? For e.g. endpoints about shutting down an executor, or about upgrade an executor?
On the other hand, some endpoints may not want to involve Raft at all, i.e. we simply forward them to the executor directly rather than letting raft worker forward them.
There was a problem hiding this comment.
There are only write and read.
| else: | ||
| return fn | ||
|
|
||
| def _unwrap_read_decorator(self, fn): |
There was a problem hiding this comment.
forget about read decorator, everything that js not write, is read
|
Are we sure this is what we want? I think it does not make sense to have Read requests go fully into the Raft Layer. This would make the log grow insanely large for no reason. Isn't there a better way to achieve strong consensus? |
|
@JoanFM You are right, if letting read requests go into the Raft layer, the log will grow insanely if the workload is read skewed. However, if not do so, the strong consistency mode is not guaranteed.
This optimization only involves an extra round of RPC interaction and read requests do not need to go fully into the Raft layer and hence log would not grow I've investigated that Hashicorp Raft supports querying about committed index and applied index. But I'm not sure if it supports invoking a checkQuorum-like RPC. If not, I have to slightly modify Hashicorp Raft manually. Don't worry, I could handle it. |
|
@JoanFM About whether or not wrapping consistency mode into the raft general configuration: |
|
@JoanFM About the RpcInterface: Why we create an RpcInterface instance for each server registration? Is there any chance we could pass the reference for a single RpcInterface instance? I want to implement the RpcInterface as a coordinator which manages the communication between the raft layer and the executor FSM. Can you provide any advices? |
For this I am not sure, maybe u are right and we could go with only an instance |
Manually editing Hashicorl Raft is not a good option as then we would need to mantain the 2 codebases and make sure that updates on hashicorp are compatible, to me this is only good if Hashicorp takes a PR. |
this can be a useful link: |
|
I've made a PR hashicorp/raft#560 about adding a CommitIndex for Hashicorp Raft. |
JoanFM
left a comment
There was a problem hiding this comment.
Changes requested, I like the refactoring on the RpcInterface.
Highlights:
- Remove the logic about
ReadEndpointas it is unnecessary. - Processing
ReadRequestsinStrongconsistency mode has to be done more optimally but without changingHashicorp Raftlibrary.
Good progress and understanding of the library and codebase I am seeing, thanks a lot and congrats
| gp.add_argument( | ||
| '--consistency-mode', | ||
| type=str, | ||
| default='Strong', |
There was a problem hiding this comment.
Make Eventual the default. Plus get the string from the Enum
| nargs='+', | ||
| ) | ||
|
|
||
| gp.add_argument( |
There was a problem hiding this comment.
| } | ||
|
|
||
| // TODO(niebayes): regenerate proto files. | ||
| read_endpoints := response.ReadEndpoints |
There was a problem hiding this comment.
no need for this, the logic of write and read was there, there is no other set of endpoints
| } | ||
| } | ||
|
|
||
| if !found { |
| threading.Lock() | ||
| ) # watch because this makes it no serializable | ||
|
|
||
| overlap_endpoints = set(self.read_endpoints).intersection(self.write_endpoints) |
| return self._requests | ||
|
|
||
| @property | ||
| def read_endpoints(self): |
| _inherit_from_parent_class_inner(cls) | ||
|
|
||
|
|
||
| def read( |
| endpoints_proto = jina_pb2.EndpointsProto() | ||
| endpoints_proto.endpoints.extend(list(self._executor.requests.keys())) | ||
| # TODO(niebayes): add read_endpoints to the proto. | ||
| endpoints_proto.read_endpoints.extend(list(self._executor.read_endpoints)) |
This is a known issue. Indeed, if there is a connection errror to the Executor, it may be problematic to the FSM consistency. This is why I would like to have the The |







Goals:
Add draft implementation for canonical Raft, i.e. all requests, whether read or write requests, go to the Raft layer.