Running DHCP Server Within AWS VPC

Problem:

DHCP server hosted in VPC wasn’t issuing addresses to clients.

Scenario:

DHCP server resides in a VPC which tunnels to our office LAN. Because of the particular solutions I’m implementing, the DHCP server must be in AWS, not on-prem.

The test client resides on VLAN X and the router is properly configured. There’s also another device on the same VLAN that acts as a DHCP relay agent, forwarding DHCP broadcasts to the DHCP server in the VPC.
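For anyone unfamiliar with what the relay agent actually does to the packet, here is a minimal Python sketch, assuming the standard BOOTP header layout from RFC 2131. The relay takes the client's broadcast, bumps the hop count, stamps its own address into the giaddr field, and forwards the result to the server as unicast. The function names, MAC address, and IP addresses below are made up for illustration; they are not from my actual setup.

```python
import socket
import struct

# Fixed BOOTP header: op, htype, hlen, hops, xid, secs, flags,
# ciaddr, yiaddr, siaddr, giaddr, chaddr(16), sname(64), file(128)
BOOTP_FORMAT = "!BBBBIHH4s4s4s4s16s64s128s"
BOOTP_SIZE = struct.calcsize(BOOTP_FORMAT)  # 236 bytes
MAGIC_COOKIE = b"\x63\x82\x53\x63"
ZERO_IP = b"\x00" * 4


def build_discover(mac: bytes, xid: int) -> bytes:
    """Build a bare-bones DHCPDISCOVER as a client would broadcast it."""
    header = struct.pack(
        BOOTP_FORMAT,
        1, 1, 6, 0,            # op=BOOTREQUEST, htype=Ethernet, hlen=6, hops=0
        xid, 0, 0x8000,        # transaction id, secs, flags (broadcast bit)
        ZERO_IP, ZERO_IP, ZERO_IP, ZERO_IP,  # ciaddr, yiaddr, siaddr, giaddr
        mac, b"", b"",         # struct pads the "s" fields with zero bytes
    )
    options = b"\x35\x01\x01\xff"  # option 53 = DISCOVER, then end marker
    return header + MAGIC_COOKIE + options


def relay(packet: bytes, relay_ip: str) -> bytes:
    """Do what a relay agent does before forwarding to the server."""
    fields = list(struct.unpack(BOOTP_FORMAT, packet[:BOOTP_SIZE]))
    fields[3] += 1                        # increment hops
    if fields[10] == ZERO_IP:             # only the first relay sets giaddr
        fields[10] = socket.inet_aton(relay_ip)
    return struct.pack(BOOTP_FORMAT, *fields) + packet[BOOTP_SIZE:]


def giaddr_of(packet: bytes) -> str:
    """Read the relay (gateway) address back out of a packet."""
    return socket.inet_ntoa(struct.unpack(BOOTP_FORMAT, packet[:BOOTP_SIZE])[10])


if __name__ == "__main__":
    discover = build_discover(bytes.fromhex("02005e100001"), xid=0x1234)
    forwarded = relay(discover, "10.20.30.2")  # hypothetical relay address
    print(giaddr_of(discover), "->", giaddr_of(forwarded))
```

The giaddr value matters later on, because it is what the server uses to work out which subnet the request came from.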

Troubleshooting:

  • If statically addressed, can the test client ping the VLAN’s gateway?
  • Yup!
  • Can test client ping out to 8.8.8.8?
  • Yup!
  • Can test client ping the DHCP server in AWS?
  • Yup!
  • Can the DHCP relay agent ping the VLAN’s gateway?
  • Yup!
  • Can the DHCP relay agent ping the DHCP server in AWS?
  • Yup!
  • When performing a packet capture, is the DHCP relay receiving the DHCP request from the test client?
  • Yup!
  • When performing a packet capture, is the DHCP relay forwarding requests to the DHCP server in AWS?
  • Yup!
  • When performing a packet capture, is the edge router forwarding the DHCP relay traffic through the tunnel to AWS?
  • Yup!
  • When performing a packet capture, is the DHCP server receiving the DHCP request from the DHCP relay agent?
  • Yup!
  • Is the DHCP server responding to both DHCP and bootp requests?
  • Yup!
  • Is the DHCP scope properly configured in the DHCP server?
  • Yup!

And that’s where I got stuck. Watching the DHCP server’s network traffic, I could see it receive requests from the DHCP relay and then do absolutely nothing with them. I tried it with the firewalls on and off. No luck.

Solution – Part 1

Until I ran the Best Practices Analyzer (BPA), it hadn’t dawned on me that the DHCP server service would crap out if the server’s NIC wasn’t configured with a static IP. As a test, I configured the DHCP server with a static address and it worked!

However…

Solution – Part 2

DO. NOT. CONFIGURE. EC2 INSTANCES. WITH. STATIC. ADDRESSES!!!! Period. No exceptions. No way around it.

I did this once before when I was just starting out with AWS/VPC, and it absolutely WRECKS your chances of getting into (or recovering) an instance that you lose connectivity to. All of the AWS recovery tools rely on the default networking, and configuring anything that’s not default within an instance’s OS breaks the tools’ ability to function.

So, I added a second NIC to the server and gave that one the static configuration, leaving the default NIC alone so the recovery tools can still work.
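If you would rather script the second NIC than click through the console, here is a rough sketch using boto3's create_network_interface and attach_network_interface calls. The function name, IDs, and address are placeholders I made up, and it assumes the new interface lives in a subnet in the same Availability Zone as the instance. The address you set statically inside the OS should be the same private IP that the VPC assigned to this interface.

```python
def add_secondary_nic(ec2, instance_id, subnet_id, private_ip, security_group_ids):
    """Create a network interface and attach it as the instance's second NIC.

    ec2 is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    Returns the ID of the new network interface.
    """
    created = ec2.create_network_interface(
        SubnetId=subnet_id,
        PrivateIpAddress=private_ip,     # the address the DHCP service will use
        Groups=security_group_ids,
        Description="DHCP service interface",
    )
    eni_id = created["NetworkInterface"]["NetworkInterfaceId"]

    ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=1,                   # index 0 is the default NIC; leave it alone
    )
    return eni_id


# Example usage (placeholder IDs, requires boto3 and AWS credentials):
#
#   import boto3
#   ec2 = boto3.client("ec2")
#   add_secondary_nic(ec2, "i-EXAMPLE", "subnet-EXAMPLE", "10.0.1.50", ["sg-EXAMPLE"])
```

Passing the client in as a parameter keeps the function easy to dry-run against a stub before pointing it at a real account.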