chore: Update and improve jump host Terraform definition #93

Open
Willem-Barkhuizen wants to merge 1 commit into juspay:main from
Willem-Barkhuizen:chore/update_jump_host_infra

@Willem-Barkhuizen Willem-Barkhuizen commented Feb 9, 2026

Summary

While deploying this module to our environment, I noticed that when
logged in via Session Manager, I could not use the shortcut to reach
the internal jump host printed in the output. On digging around, I
realised that the SSH configuration to enable the ssh internal-jump
shortcut was being added to the ec2-user home directory but I was
logged in as ssm-user, which is the default for Session Manager.

I also noticed that the instances were still using AL2, which is slated
for EOL in less than 6 months (2026/06/30), and that cloud-init ended
up overwriting the custom /etc/motd file that was being created by
the user data script.

This commit addresses the above by:

  • Changing the default AMI from AL2 to AL2023
  • Adding a new module variable specifying which username to use for SSM
    access, defaulting to the SSM default of 'ssm-user'
  • Modifying the user data template to get the jump host username from a
    template parameter (the new variable for the external or 'ec2-user'
    for the internal)
  • Fixing up the script in the user data template to:
    • Create the templated SSM username if it doesn't exist yet, along
      with its group and home directory, and setting up its sudo
      permissions the same as SSM Agent would have
    • Add the SSH configuration for the 'ssh internal-jump' shortcut to
      the home directory of the templated user instead, along with some
      minor fixes and improvements to the related script logic
    • Changing how the custom MOTD message is added to avoid it being
      overwritten
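
As a rough illustration of the shortcut step described above, here is a minimal sketch of writing the `internal-jump` entry into the templated user's home directory. The function name, its arguments, and the key path are hypothetical and not taken from this PR; only the `internal-jump` host alias and the `ec2-user` login on the internal host come from the description.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): add a "Host internal-jump" entry to
# the SSH config under the given home directory.
write_jump_shortcut() {
    local home_dir="$1" internal_ip="$2" key_path="$3"
    local ssh_dir="$home_dir/.ssh"
    local config="$ssh_dir/config"

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"

    # Skip if the entry already exists, so re-running does not duplicate it.
    if [ -f "$config" ] && grep -q '^Host internal-jump$' "$config"; then
        return 0
    fi

    cat >>"$config" <<EOF
Host internal-jump
    HostName $internal_ip
    User ec2-user
    IdentityFile $key_path
EOF
    chmod 600 "$config"
}
```

In a user data script this would presumably be followed by a `chown -R` of the `.ssh` directory to the templated user, since the script itself runs as root.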

It also makes some minor formatting improvements to the instructions
printed out as part of the module output.

Testing

Planned and applied against our environment, confirming the formatting
of the instructions, that the shortcut now works out of the box, that
sudo still works, and that the MOTD behavior is as expected.

@Willem-Barkhuizen Willem-Barkhuizen force-pushed the chore/update_jump_host_infra branch from ad51a9f to 5e1bfcf Compare February 10, 2026 10:01
Comment on lines 64 to 65
USER_ID=1001
GROUP_ID=1001
Contributor

This will probably work on most AL2023 instances because:

default ec2-user → 1000
next available UID → 1001
SSM later creates ssm-user → gets 1001

But this is not guaranteed in the case of:

  • Custom AMI
  • Additional users
  • Future AMI changes
  • Package updates
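
To make the concern concrete, a script could check whether the hardcoded UID is already taken before relying on it. This is only a sketch: `uid_in_use` is a hypothetical helper, and it inspects a passwd-format file (by default `/etc/passwd`), so it only sees local accounts.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): succeed if the given UID already
# appears in a passwd-format file. Defaults to /etc/passwd.
uid_in_use() {
    local uid="$1" passwd_file="${2:-/etc/passwd}"
    # Field 3 of each passwd entry is the numeric UID.
    awk -F: -v uid="$uid" '$3 == uid { found = 1 } END { exit !found }' "$passwd_file"
}

# Example: warn before relying on the hardcoded value.
# if uid_in_use 1001; then echo "UID 1001 is already taken" >&2; fi
```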

Contributor

A more reliable way, which also works if ssm-user already exists, would be:

if [ "$OS_USER_NAME" = "ssm-user" ] && ! id ssm-user &>/dev/null; then
    useradd -m -s /bin/bash ssm-user
fi
chown -R "$OS_USER_NAME:$OS_USER_NAME" "/home/$OS_USER_NAME"


You have a fair point, which did cross my mind. The unfortunate bit is that we have no control over which ID the ssm-user user will be assigned. Our only recourse, if we want a modicum of control, would be to enable and configure the Run As feature in Session Manager, which neither this nor any other module touches and which, in my opinion, falls outside the scope of this package.

Do you think it would be a fair compromise if I update the user data template so the ID can be set as a Terraform variable, defaulting to 1001 and having a comment explaining the situation?


Alternatively, I guess we could also simply update the instructions in the output instead, although that does feel a bit clunky to me.


A more reliable way, which also works if ssm-user already exists, would be:

if [ "$OS_USER_NAME" = "ssm-user" ] && ! id ssm-user &>/dev/null; then
    useradd -m -s /bin/bash ssm-user
fi
chown -R "$OS_USER_NAME:$OS_USER_NAME" "/home/$OS_USER_NAME"

Unfortunately, at the time the user data script is run, the ssm-user does not exist yet, as it is only created by the SSM Agent the first time a session is started on the node; hence my assumption.

Contributor

Good point, but I don't think we need to control the UID explicitly.

If we create ssm-user ourselves during user data (if it doesn't exist), Linux will assign the next available UID, and SSM will reuse that user. That removes the need to assume 1001 or expose the UID as a Terraform variable.
It would work like this:

  1. If ssm-user doesn't exist yet, create it ourselves (useradd will assign the next available UID).
  2. Then run chown. Whether we created the user or it already existed, it now has a valid UID.
  3. SSM will reuse that user.
  4. This way we fix ownership without making any UID assumptions.
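
The steps above could be sketched roughly as follows. The `run` wrapper, the `DRY_RUN` switch, and the `ensure_os_user` name are hypothetical additions for illustration; the sketch also assumes the distribution creates a same-named private group on `useradd`, which is the usual default on Fedora-based systems.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: print the command instead of running it
# when DRY_RUN=1, so the logic can be inspected without root.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: create the user only if it is missing, letting
# useradd pick the next available UID, then fix ownership of its home.
ensure_os_user() {
    local name="$1"
    if ! id "$name" >/dev/null 2>&1; then
        run useradd --create-home --shell /bin/bash "$name"
    fi
    # Assumes a same-named group exists (user private group default).
    run chown -R "$name:$name" "/home/$name"
}
```

Running `DRY_RUN=1 ensure_os_user ssm-user` on a fresh node would print both commands, without assuming any particular UID.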

Contributor Author

Commenting with the correct account this time 🤦‍♂️

That sounds like an idea. My only concern with it is that we lose the automatic configuration applied to the user when it is created by the SSM Agent. I don't, however, see that changing frequently, so as long as we mirror the config applied to it as of this moment, we should be good, I think.
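
For illustration, mirroring the sudo configuration might look something like the sketch below. It assumes, to the best of my understanding, that SSM Agent grants the user passwordless sudo through a drop-in file under `/etc/sudoers.d` (commonly `ssm-agent-users`); the exact file name and contents should be checked against a node where the agent created the user. `write_sudoers_rule` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the PR): write a passwordless-sudo rule for
# the given user to the given file, with the restrictive mode sudo expects.
write_sudoers_rule() {
    local user="$1" target="$2"
    # Assumed to match what SSM Agent writes for ssm-user; verify on a real node.
    printf '# User rules for %s\n%s ALL=(ALL) NOPASSWD:ALL\n' "$user" "$user" >"$target"
    chmod 0440 "$target"
}

# Example (as root): write_sudoers_rule ssm-user /etc/sudoers.d/ssm-agent-users
```

Validating the result with `visudo -cf <file>` before relying on it is a sensible extra step, since a malformed drop-in can break sudo for every user.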

@Willem-Barkhuizen Willem-Barkhuizen force-pushed the chore/update_jump_host_infra branch from 5e1bfcf to 6709f6c Compare February 13, 2026 13:41
@Willem-Barkhuizen Willem-Barkhuizen changed the title from "chore: Update and fix up jump host Terraform definition" to "chore: Update and improve jump host Terraform definition" Feb 13, 2026
@Willem-Barkhuizen Willem-Barkhuizen force-pushed the chore/update_jump_host_infra branch from 6709f6c to d97b3db Compare February 13, 2026 16:52
3 participants