Add support for CloudWatch metric streams through firehose endpoint#6380

Merged
kaiyan-sheng merged 23 commits into elastic:master from kaiyan-sheng:firehose_metrics on Oct 28, 2021

@kaiyan-sheng kaiyan-sheng commented Oct 19, 2021

Motivation/summary

This PR adds support for ingesting CloudWatch metric streams through Firehose, using the same /firehose endpoint.

Checklist

For functional changes, consider:

  • Is it observable through the addition of either logging or metrics?
  • Is its use being published in telemetry to enable product improvement?
  • Have system tests been added to avoid regression?

How to test these changes

These changes can be tested either on an EC2 instance or locally. To test locally:

  1. Build APM server using make
  2. Generate an API key for authentication:
./apm-server apikey create -E cloud.id=xxx -E cloud.auth=elastic:yyy
  3. Run apm-server locally:
./apm-server -e -E apm-server.data_streams.enabled=true -E apm-server.auth.api_key.enabled=true -E cloud.id=xxx -E cloud.auth=elastic:yyy
  4. Use Postman or curl to POST to localhost:8200/firehose with body:
{   "requestId": "bb389cba-95be-469f-8e50-95ecc1afefcd",
    "timestamp": 1634755956128,
    "records":[
        {
            "data": "eyJtZXRyaWNfc3RyZWFtX25hbWUiOiJjbG91ZHdhdGNoLW1ldHJpYy1zdHJlYW0tdXMtZWFzdC0xIiwiYWNjb3VudF9pZCI6IjQyODE1MjUwMjQ2NyIsInJlZ2lvbiI6InVzLWVhc3QtMSIsIm5hbWVzcGFjZSI6IkFXUy9FQzIiLCJtZXRyaWNfbmFtZSI6IkNQVVV0aWxpemF0aW9uIiwiZGltZW5zaW9ucyI6eyJJbnN0YW5jZUlEIjoiaS10ZXN0LTEyMzQifSwidGltZXN0YW1wIjoxNjM0NzU1OTU2MTI4LCJ2YWx1ZSI6eyJjb3VudCI6Mi4wfSwidW5pdCI6IlBlcmNlbnQifQ=="
        }
    ]
}

Also make sure the API key is passed in the request using the X-Amz-Firehose-Access-Key header.
  5. You should see a 200 response code and an EC2 CPU utilization metric indexed in Elasticsearch.

Related issues

closes elastic/integrations#956

Sample Output Event

{
  "_index": ".ds-metrics-apm.firehose-aws.ec2-default-2021.10.27-000001",
  "_id": "v4Fcw3wB8wxb-ZWOdj2b",
  "_version": 1,
  "_score": 1,
  "_source": {
    "@timestamp": "2021-10-27T19:58:00.000Z",
    "data_stream.type": "metrics",
    "data_stream.namespace": "default",
    "metricset.name": "aws.ec2",
    "ecs": {
      "version": "1.12.0"
    },
    "service": {
      "origin": {
        "name": "deliverystream/test-cloudwatch-metric-streams",
        "id": "arn:aws:firehose:us-east-1:428152502467:deliverystream/test-cloudwatch-metric-streams"
      }
    },
    "observer": {
      "hostname": "ip-172-31-84-43.ec2.internal",
      "id": "1df86227-5ac9-4127-84a2-655ab45b3d76",
      "type": "apm-server",
      "version": "8.0.0",
      "version_major": 8,
      "ephemeral_id": "4f3ea0a9-5415-4dfb-8b94-053e455b9f5e"
    },
    "cloud": {
      "origin": {
        "region": "us-east-1",
        "account.id": "428152502467"
      },
      "account": {
        "id": "428152502467"
      },
      "service": {
        "name": "AWS/EC2"
      },
      "region": "us-east-1"
    },
    "CPUUtilization.min": 0.163934426230105,
    "CPUUtilization.count": 5,
    "CPUUtilization.sum": 2.0862508104103137,
    "data_stream.dataset": "apm.firehose-aws.ec2",
    "labels": {
      "InstanceId": "i-0646c0435554cc5ed"
    },
    "CPUUtilization.max": 1.08333333333273,
    "processor": {
      "name": "metric",
      "event": "metric"
    }
  },
  "fields": {
    "CPUUtilization.max": [
      1.0833334
    ],
    "cloud.origin.region": [
      "us-east-1"
    ],
    "cloud.origin.account.id": [
      "428152502467"
    ],
    "labels.InstanceId": [
      "i-0646c0435554cc5ed"
    ],
    "processor.event": [
      "metric"
    ],
    "service.origin.name": [
      "deliverystream/test-cloudwatch-metric-streams"
    ],
    "cloud.region": [
      "us-east-1"
    ],
    "CPUUtilization.min": [
      0.16393442
    ],
    "data_stream.namespace": [
      "default"
    ],
    "processor.name": [
      "metric"
    ],
    "observer.version_major": [
      8
    ],
    "service.origin.id": [
      "arn:aws:firehose:us-east-1:428152502467:deliverystream/test-cloudwatch-metric-streams"
    ],
    "CPUUtilization.sum": [
      2.0862508
    ],
    "observer.hostname": [
      "ip-172-31-84-43.ec2.internal"
    ],
    "data_stream.type": [
      "metrics"
    ],
    "CPUUtilization.count": [
      5
    ],
    "metricset.name": [
      "aws.ec2"
    ],
    "observer.id": [
      "1df86227-5ac9-4127-84a2-655ab45b3d76"
    ],
    "@timestamp": [
      "2021-10-27T19:58:00.000Z"
    ],
    "cloud.service.name": [
      "AWS/EC2"
    ],
    "observer.ephemeral_id": [
      "4f3ea0a9-5415-4dfb-8b94-053e455b9f5e"
    ],
    "observer.version": [
      "8.0.0"
    ],
    "cloud.account.id": [
      "428152502467"
    ],
    "ecs.version": [
      "1.12.0"
    ],
    "observer.type": [
      "apm-server"
    ],
    "data_stream.dataset": [
      "apm.firehose-aws.ec2"
    ]
  }
}
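To make the naming in the sample event easier to follow, here is a rough Go sketch of how the fields appear to relate. It is inferred from the sample output only (cloud.service.name "AWS/EC2", metricset.name "aws.ec2", data_stream.dataset "apm.firehose-aws.ec2", and the CPUUtilization.min/max/sum/count fields). The functions metricsetName, datasetName, and flattenStats are hypothetical names, and the actual logic in this PR may handle other namespaces differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// metricsetName (hypothetical) lowercases the CloudWatch namespace and
// replaces "/" with ".", e.g. "AWS/EC2" -> "aws.ec2". This is an assumption
// based on the sample event, not the server's confirmed behavior.
func metricsetName(namespace string) string {
	return strings.ReplaceAll(strings.ToLower(namespace), "/", ".")
}

// datasetName (hypothetical) prefixes the metricset name the way the sample
// data_stream.dataset value suggests.
func datasetName(namespace string) string {
	return "apm.firehose-" + metricsetName(namespace)
}

// flattenStats (hypothetical) turns the statistics of one metric into
// "<metric_name>.<stat>" fields, as seen in the sample event.
func flattenStats(metricName string, value map[string]float64) map[string]float64 {
	out := make(map[string]float64, len(value))
	for stat, v := range value {
		out[metricName+"."+stat] = v
	}
	return out
}

func main() {
	fields := flattenStats("CPUUtilization", map[string]float64{
		"min":   0.163934426230105,
		"max":   1.08333333333273,
		"sum":   2.0862508104103137,
		"count": 5,
	})

	// Sort keys so the printed order is stable.
	keys := make([]string, 0, len(fields))
	for k := range fields {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	fmt.Println(datasetName("AWS/EC2"), metricsetName("AWS/EC2"))
	for _, k := range keys {
		fmt.Println(k, fields[k])
	}
}
```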

mergify bot commented Oct 19, 2021

This pull request does not have a backport label. Could you fix it @kaiyan-sheng? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-7.x is the label to automatically backport to the 7.x branch.
  • backport-7./d is the label to automatically backport to the 7./d branch. /d is the digit

NOTE: backport-skip has been added to this pull request.

@mergify mergify bot added the backport-skip Skip notification from the automated backport with mergify label Oct 19, 2021
@kaiyan-sheng kaiyan-sheng self-assigned this Oct 19, 2021
@kaiyan-sheng kaiyan-sheng changed the title Firehose metrics Add support for CloudWatch metric streams through firehose endpoint Oct 19, 2021

ghost commented Oct 19, 2021

💚 Build Succeeded

Build stats

  • Start Time: 2021-10-28T04:13:09.716+0000

  • Duration: 43 min 22 sec

  • Commit: de3d69f

Test stats 🧪

Test Results
Failed 0
Passed 6313
Skipped 18
Total 6331

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /hey-apm : Run the hey-apm benchmark.

  • /package : Generate and publish the docker images.

axw commented Oct 19, 2021

BTW: I think the dataset value you're setting looks appropriate. I'd suggest we copy that to metricset.name.

@kaiyan-sheng kaiyan-sheng requested a review from axw October 20, 2021 21:23
@kaiyan-sheng kaiyan-sheng marked this pull request as ready for review October 20, 2021 21:40

@axw axw left a comment

Looks good, thanks for adding summary metric support! Can you please extend this test to include a summary metric type?

{
    Metricset: &Metricset{
        Samples: map[string]MetricsetSample{
            "latency_histogram": {
                Type: "histogram",
                Unit: "s",
                Histogram: Histogram{
                    Counts: []int64{1, 2, 3},
                    Values: []float64{1.1, 2.2, 3.3},
                },
            },
            "just_type": {
                Type:  "counter",
                Value: 123,
            },
            "just_unit": {
                Unit:  "percent",
                Value: 0.99,
            },
        },
    },
    Output: common.MapStr{
        "latency_histogram": common.MapStr{
            "counts": []int64{1, 2, 3},
            "values": []float64{1.1, 2.2, 3.3},
        },
        "just_type": 123.0,
        "just_unit": 0.99,
        "_metric_descriptions": common.MapStr{
            "latency_histogram": common.MapStr{
                "type": "histogram",
                "unit": "s",
            },
            "just_type": common.MapStr{
                "type": "counter",
            },
            "just_unit": common.MapStr{
                "unit": "percent",
            },
        },
    },
    Msg: "Payload with metric type and unit.",
},

Also, the ingest pipeline still needs to be updated:

if (metric_type == "histogram") {
    dynamic_templates[name] = "histogram";
}

@kaiyan-sheng
Author

Thanks for the review @axw ! I see we also have histogram in ingest pipeline here: https://github.com/elastic/apm-server/blob/master/ingest/pipeline/definition.yml#L130. Do we need to add summary also here?

@axw axw left a comment

> Thanks for the review @axw ! I see we also have histogram in ingest pipeline here: https://github.com/elastic/apm-server/blob/master/ingest/pipeline/definition.yml#L130. Do we need to add summary also here?

No. That file will be deleted soon, and we'll rely 100% on Fleet for managing the ingest pipeline.

@kaiyan-sheng kaiyan-sheng requested a review from axw October 27, 2021 20:13

@axw axw left a comment

LGTM! Sorry about the wild goose chase.

@kaiyan-sheng kaiyan-sheng enabled auto-merge (squash) October 28, 2021 04:13
@kaiyan-sheng kaiyan-sheng merged commit 85be4e0 into elastic:master Oct 28, 2021
@axw axw added backport-8.0 Automated backport with mergify and removed backport-skip Skip notification from the automated backport with mergify labels Oct 28, 2021
mergify bot pushed a commit that referenced this pull request Oct 28, 2021
…6380)

* Process CloudWatch metric streams using firehose endpoint

* Add data_stream.dataset separately for logs and metrics

* remove unit from metricset

* Add summary metric type

* update unit test for cloudwatch metric

* update go.sum

* make update

* add test for summary metric type

* fix manifest.yml

* fix manifest.yml

* Fix comments in summary fields

* remove summary metric type

* undo some changes

* add changelog

* remove extra line

(cherry picked from commit 85be4e0)
axw pushed a commit that referenced this pull request Oct 28, 2021
…6380) (#6448)

Co-authored-by: kaiyan-sheng <kaiyan.sheng@elastic.co>
@kaiyan-sheng kaiyan-sheng deleted the firehose_metrics branch October 28, 2021 10:42

Labels

backport-8.0 Automated backport with mergify

Development

Successfully merging this pull request may close these issues.

Add support for Cloudwatch metrics streams

2 participants