Skip to content

Added documentation for core peer forwarder#1797

Merged
asifsmohammed merged 3 commits intoopensearch-project:mainfrom
asifsmohammed:cpf-documentation
Sep 26, 2022
Merged

Added documentation for core peer forwarder#1797
asifsmohammed merged 3 commits intoopensearch-project:mainfrom
asifsmohammed:cpf-documentation

Conversation

@asifsmohammed
Copy link
Copy Markdown
Collaborator

Signed-off-by: Asif Sohail Mohammed nsifmoh@amazon.com

Description

Adds core peer forwarder documentation.

Issues Resolved

resolves #1772

Check List

  • New functionality includes testing.
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@asifsmohammed asifsmohammed requested a review from a team as a code owner September 22, 2022 00:38

Presently peer discovery is provided by either a static list or by a DNS record lookup or AWS Cloudmap.

### Static discovery mode
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to elaborate on what this static list is.

For example:

When configured for static discovery mode, each Data Prepper node discovers nodes using a list of IP addresses or domain names.

mutual_tls:
```

### <a name="DNS_Discovery"></a>Operating a Cluster with DNS Discovery
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be good to expand on what this should be. I believe this is typically an A record which is populated with a list of IP addresses.

### <a name="DNS_Discovery"></a>Operating a Cluster with DNS Discovery
DNS discovery is recommended when scaling out a Data Prepper cluster. The core concept is to configure a DNS provider to return a list of Data Prepper hosts when given a single domain name.

#### With Kubernetes (Recommended)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this the recommended deployment? I'm not quite sure where this recommendation came from or why. Maybe we should just leave out recommendations for now.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This I just copied from the existing peer forwarder plugin documentation. I can remove all 3 recommendations for now.

discovery_mode: aws_cloud_map
awsCloudMapNamespaceName: "my-namespace"
awsCloudMapServiceName: "data-prepper-cluster"
awsCloudMapQueryParameters:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This parameter is optional. We should be clear about this.

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Sep 23, 2022

Codecov Report

Merging #1797 (2d2e951) into main (504e8b8) will decrease coverage by 0.04%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##               main    #1797      +/-   ##
============================================
- Coverage     93.96%   93.91%   -0.05%     
  Complexity     1457     1457              
============================================
  Files           188      188              
  Lines          4275     4275              
  Branches        347      347              
============================================
- Hits           4017     4015       -2     
- Misses          177      179       +2     
  Partials         81       81              
Impacted Files Coverage Δ
...rwarder/discovery/AwsCloudMapPeerListProvider.java 92.59% <0.00%> (-2.47%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

Copy link
Copy Markdown
Collaborator

@oeyh oeyh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice doc with lots of examples. Just a few small comments:


### DNS lookup discovery mode
DNS discovery is recommended when scaling out a Data Prepper cluster. The core concept is to configure a DNS provider to return a list of Data Prepper hosts when given a single domain name.
This is a [DNS A Record](https://www.cloudflare.com/learning/dns/dns-records/dns-a-record/) which indicates a list list of IP addresses of a given domain.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This is a [DNS A Record](https://www.cloudflare.com/learning/dns/dns-records/dns-a-record/) which indicates a list list of IP addresses of a given domain.
This is a [DNS A Record](https://www.cloudflare.com/learning/dns/dns-records/dns-a-record/) which indicates a list of IP addresses of a given domain.

namespace configured for API instance discovery. You can create a new one following the instructions
provided by the [CloudMap documentation](https://docs.aws.amazon.com/cloud-map/latest/dg/working-with-namespaces.html).

Your pipeline configuration needs to include:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should it be in data prepper server configuration instead?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

* `aws_region` - The AWS region where your namespace exists.
* `discovery_mode` - Set to `aws_cloud_map`

Your pipeline configuration can optionally include:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, data prepper config?

## Configuration

* `port`(Optional): An `int` between 0 and 65535 represents the port peer forwarder server is running on. Default value is `21892`.
* `request_timeout`(Optional): Duration - A int representing the request timeout in milliseconds for Peer Forwarder HTTP server. Default value is `10000`.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
* `request_timeout`(Optional): Duration - A int representing the request timeout in milliseconds for Peer Forwarder HTTP server. Default value is `10000`.
* `request_timeout`(Optional): Duration - An `int` representing the request timeout in milliseconds for Peer Forwarder HTTP server. Default value is `10000`.

* `ssl_insecure_disable_verification`(Optional) : A `boolean` that disables the verification of server's TLS certificate chain. Default value is `false`.
* `use_acm_certificate_for_ssl`(Optional) : A `boolean` that enables TLS/SSL using certificate and private key from AWS Certificate Manager (ACM). Default is `false`.
* `acm_certificate_arn`(Optional) : A `String` represents the ACM certificate ARN. ACM certificate take preference over S3 or local file system certificate. Required if `use_acm_certificate_for_ssl` is set to `true`.
* `acm_certificate_timeout_millis`(Optional) : An 'int' representing the timeout in milliseconds for ACM to get certificates. Default value is `120000`.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
* `acm_certificate_timeout_millis`(Optional) : An 'int' representing the timeout in milliseconds for ACM to get certificates. Default value is `120000`.
* `acm_certificate_timeout_millis`(Optional) : An `int` representing the timeout in milliseconds for ACM to get certificates. Default value is `120000`.

dlvenable
dlvenable previously approved these changes Sep 23, 2022
Copy link
Copy Markdown
Member

@dlvenable dlvenable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a suggestion, but I'm also happy to merge it as it is now.

```

### DNS lookup discovery mode
DNS discovery is recommended when scaling out a Data Prepper cluster. The core concept is to configure a DNS provider to return a list of Data Prepper hosts when given a single domain name.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DNS discovery is recommended when scaling out a Data Prepper cluster.

I suggest the following instead:

We recommend using DNS discovery over static discovery when scaling out a Data Prepper cluster.

We don't recommend DNS discovery exclusively. Cloud Map is a great solution for some AWS customers. But, it is good to recommend the DNS over static list.

Signed-off-by: Asif Sohail Mohammed <nsifmoh@amazon.com>
Signed-off-by: Asif Sohail Mohammed <nsifmoh@amazon.com>
Signed-off-by: Asif Sohail Mohammed <nsifmoh@amazon.com>
### AWS Cloud Map discovery mode

[AWS CloudMap](https://docs.aws.amazon.com/cloud-map/latest/dg/what-is-cloud-map.html) provides API-based service discovery as well as DNS-based service discovery.
[AWS Cloud Map](https://docs.aws.amazon.com/cloud-map/latest/dg/what-is-cloud-map.html) provides API-based service discovery as well as DNS-based service discovery.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! Thanks!

@asifsmohammed asifsmohammed merged commit 63307d9 into opensearch-project:main Sep 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Create a guide to using Peer Forwarding (documentation task) in Data Prepper repo

4 participants