A large number of Internet-of-Things (IoT) devices will soon populate our physical environments. Yet IoT devices' reliance on mobile applications and voice-only assistants as their primary interfaces limits their scalability and expressiveness. Building on the classic 'Put-That-There' system, we contribute an exploration of the design space of voice + gesture interaction with spatially distributed IoT devices. Our design space decomposes a user's IoT command into two components: selection and interaction. We articulate how the permutations of voice and freehand gesture across these two components complement one another, affording interaction possibilities that go beyond current approaches. We instantiate this design space as a proof-of-concept sensing platform and demonstrate a series of novel IoT interaction scenarios, such as making 'dumb' objects smart, commanding robotic appliances, and resolving ambiguous pointing at cluttered devices.