Use SSM targets instead of doing our own lookups#26

Merged
sendqueery merged 12 commits into v1 from use-ssm-targets
Aug 4, 2020

@sendqueery
Collaborator

sendqueery commented Jul 16, 2020

The original design of ssm-run involved starting an invocation per instance targeted, allowing for what I'd call a "shotgun" approach. Unfortunately, at scale, you start to run into API rate limit issues due to the sheer volume of calls. For any given instance, we had to make an API call each time we:

  • got our target list (1 per 50 instances)
  • started an invocation (1 per instance)
  • checked the status of an invocation (1 base + 1 per 2 seconds, per instance)

This might not sound like a ton, but the relevant SSM APIs have relatively low rate limits, resulting in our invocations behaving inconsistently at best. After talking with the AWS SSM team, we were able to validate the expected behavior for things like instances running OSes incompatible with a given document and concurrency settings.

Going forward, we're offloading much of the work to the SSM side of things. We're letting SSM validate targets on its own, which means lots of conditionals that we don't have to check. We're also going to check the status of the top-level invocation itself instead of querying on a per invocation ID + instance ID basis.

The downside of this approach is primarily reduced visibility into the status of an invocation. SSM doesn't return information on how many instances an invocation targets unless you specify their instance IDs. With the new method, we instead look at the status of the parent invocation, and only fetch the status/output of individual invocations once the parent invocation is complete.

EDIT: Yes, this is missing tests. I'm workin' on it.

@rothgar
Collaborator

rothgar commented Jul 16, 2020

Is there no way to do a dry run with this new method? Dry run was originally supported in mco via --noop, and its absence was a blocker for some users adopting ssm-run.

If you can't get a list of instances that would execute a command (based on account and region), maybe it's possible to execute a noop command (e.g. pwd) and read the list of instances from the resulting output.

@sendqueery
Collaborator Author

Yeah, the only way to do a dry run would be via running a noop of some sort. Since we can't guarantee what noop commands will be available on a target instance, I'd rather just document this approach instead of hard-coding it in.

@rothgar
Collaborator

rothgar commented Jul 17, 2020

So long as --log-level X still shows the instances which ran the task, I think that would be fine for most users.

sendqueery merged commit 18f50ee into v1 Aug 4, 2020
sendqueery deleted the use-ssm-targets branch August 4, 2020 20:11
sendqueery added a commit that referenced this pull request Sep 3, 2020
* Update ssm-run to rely on SSM for targeting

* Update gomod

* Update RunInvocations() to use mockable SSM client

* Clean up logging

* Clean up flag validation

* Change instances of "ctx" to "client"

* Fix profile validity check

* Temporary change: remove --dry-run flag from run command

* Temporary change: update session command to make things build