One who walks in another's tracks leaves no footprints.
~Proverb
I obtained my Ph.D. from Columbia University in 2023. I was advised by Prof. Henning Schulzrinne and affiliated with the Internet Real-Time Laboratory. My research interests include internet services and protocols, cyber-physical systems (internet of things), distributed computing, operating systems, and mission-critical communications systems and protocols. I received a Master's degree in Computer Science from the Czech Technical University in Prague and worked at the Fraunhofer Institute for Open Communication Systems (FOKUS) in Berlin, Germany. Previously, I was a co-founder of Iptel.org GmbH, a FOKUS spin-off developing software for internet real-time multimedia services (acquired by Tekelec, now part of Oracle).
Outside work, I enjoy reading, hiking, running, and cross-country skiing. I thru-hiked the John Muir Trail in 2019.
My main research area is network services and protocols for cyber-physical systems, the Internet of Things (IoT), and multimedia network services. My research aims to enable self-managing IoT systems that automatically adapt to a changing environment and make automated decisions based on high-level programs and policies. I am also interested in developing programming abstractions for IoT applications that follow data, i.e., applications deployable to the network edge.
At Columbia, I supervise student projects at the undergraduate and graduate levels. I also occasionally mentor high-school students interested in computer science. Our paper on the design of a wireless networking lab won the Best Educational Paper Award at the Second GENI Educational and Research Workshop in 2013. I helped design the homework assignments for the Advanced Programming course.
Networked systems integrating software with the physical world are known as cyber-physical systems (CPSs). CPSs have been used in diverse sectors, including power generation and distribution, transportation, industrial systems, and building management. The diversity of applications and interdisciplinary nature make CPSs exciting to design and build but challenging to manage once deployed. Deployed CPSs must adapt to changes in the operating environment or the system's architecture, e.g., when outdated or malfunctioning components need to be replaced. Skilled human operators have traditionally performed such adaptations using centralized management protocols. As the CPS grows, management tasks become more complex, tedious, and error-prone.
This dissertation studies management challenges in deployed CPSs. It is based on practical research with CPSs of various sizes and diverse application domains, from the large geographically dispersed electrical grid to small-scale consumer Internet of Things (IoT) systems. We study the management challenges unique to each system and propose network services and protocols specifically designed to reduce the amount of management overhead, drawing inspiration from autonomic systems and networking research.
We first introduce PhoenixSEN, a self-managing ad hoc network designed to restore connectivity in the electrical grid after a large-scale outage. The electrical grid is a large, heterogeneous, geographically dispersed CPS. We analyze the U.S. electrical grid network subsystem, propose an ad hoc network to temporarily replace the network subsystem during a blackout, and discuss the experimental evaluation of the network on a one-of-a-kind physical electrical grid testbed. The novel aspects of PhoenixSEN lie in a combination of existing and new network technologies and manageability by power distribution industry operators.
Motivated by the challenges of running unmodified third-party applications in an ad hoc network like PhoenixSEN, we propose a geographic resource discovery and query processing service for federated CPSs called SenSQL. The service combines a resource discovery protocol inspired by the LoST protocol with a standard SQL-based query interface. SenSQL aims to simplify the development of applications for federated or administratively decoupled autonomous cyber-physical systems without a single administrative or technological point of failure. The SenSQL framework balances control over autonomous cyber-physical devices and their data with service federation, limiting the application's reliance on centralized infrastructures or services.
We conclude the first part of the dissertation by presenting the design and implementation of a testbed for usability experiments with mission-critical voice, a vital communication modality in PhoenixSEN and in emergency scenarios in general. The testbed can be used to conduct human-subject studies under emulated network conditions to assess the influence of various network parameters on the end-user's quality of experience.
The second part of the dissertation focuses on network enrollment of IoT devices, a management process that is often complicated, frustrating, and error-prone, particularly in consumer-oriented systems. We motivate the work by reverse-engineering and analyzing Amazon Echo's network enrollment protocol. The Echo is one of the most widely deployed IoT devices and, thus, an excellent case study. We learn that the process is rather complicated and cumbersome.
We then present a systematic study of IoT network enrollment with a focus on consumer IoT devices in advanced deployment scenarios, e.g., third-party installations, shared physical spaces, or evolving IoT systems. We evaluate existing frameworks and their shortcomings and propose WIDE, a network-independent enrollment framework designed to minimize user interactions and enable advanced deployment scenarios. WIDE is designed for large-scale or heterogeneous IoT systems where multiple independent entities cooperate to set the system up. We also discuss the design of a human-subject study to compare and contrast the usability of network enrollment frameworks.
A secure network must authenticate a new device before it can be enrolled. The authentication step usually requires physical device access, which may be impossible in many advanced deployment scenarios, e.g., when IoT devices are installed by a specialist in physically unreachable locations. We propose Lighthouse, a visible-light authentication protocol for physically inaccessible IoT devices. We discuss the protocol's design, develop transmitter and receiver prototypes, and evaluate the system. Our measurements with off-the-shelf components over realistic distances indicate authentication times shorter than or comparable to those of existing methods that require physical access to the device. We also illustrate how the visible-light authentication protocol could be used as an additional authentication method in other network enrollment frameworks.
With over 20 million units sold since 2015, Amazon Echo, the Alexa-enabled smart speaker developed by Amazon, is probably one of the most widely deployed Internet of Things consumer devices. Despite the very large installed base, surprisingly little is known about the device's network behavior. We modify a first-generation Echo device, decrypt its communication with the Amazon cloud, and analyze the device pairing, Alexa Voice Service, and drop-in calling protocols. We also describe our methodology and the experimental setup. We find a minor shortcoming in the device pairing protocol and learn that drop-in calls are end-to-end encrypted and based on modern open standards. Overall, we find the Echo to be a well-designed device from the network communication perspective.
When the electrical grid in a region suffers a major outage, e.g., after a catastrophic cyber attack, a "black start" may be required, where the grid is slowly restarted, carefully and incrementally adding generating capacity and demand. To ensure a safe and effective black start, the grid control center has to be able to communicate with field personnel and with supervisory control and data acquisition (SCADA) systems. Voice and text communication are particularly critical. As part of the Defense Advanced Research Projects Agency (DARPA) Rapid Attack Detection, Isolation, and Characterization Systems (RADICS) program, we designed, tested, and evaluated a self-configuring mesh network prototype called the Phoenix Secure Emergency Network (PhoenixSEN). PhoenixSEN provides a secure drop-in replacement for the grid's primary communication networks during black start recovery. The network combines existing and new technologies, can work with a variety of link-layer protocols, emphasizes manageability and auto-configuration, and provides services and applications for the coordination of people and devices, including voice, text, and SCADA communication. We discuss the architecture of PhoenixSEN and evaluate a prototype on realistic grid infrastructure through a series of DARPA-led exercises.
The COVID-19 pandemic and related restrictions forced many to work, learn, and socialize from home over the internet. There appears to be consensus that internet infrastructure in the developed world handled the resulting traffic surge well. In this paper, we study network measurement data collected by the Federal Communications Commission's Measuring Broadband America program before and during the pandemic in the United States (US). We analyze the data to understand the impact of lockdown orders on the performance of fixed broadband internet infrastructure across the US and attempt to correlate internet usage patterns with the changing behavior of users during lockdown. We find key metrics, such as the change in data usage, to be generally consistent with the literature. Through additional analysis, we find differences between metro and rural areas, changes in weekday, weekend, and hourly internet usage patterns, and indications of network congestion for some users.
The serverless and functions as a service (FaaS) paradigms are currently trending among cloud providers and are increasingly being applied to the network edge and to Internet of Things (IoT) devices. The benefits include reduced communication latency, less network traffic, and increased privacy for data processing. However, there are challenges: IoT devices have limited resources for running multiple simultaneous containerized functions, and FaaS does not typically support long-running functions. Our implementation uses Docker and CRIU to checkpoint and suspend long-running blocking functions. The results show that checkpointing is slightly slower than a regular Docker pause, but it saves memory and allows more long-running functions to run on an IoT device. Furthermore, the resulting checkpoint files are small, which makes them suitable for live migration and for backing up stateful functions, improving the availability and reliability of the system.
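As a rough illustration of the suspend-and-resume idea, the following minimal sketch drives Docker's experimental, CRIU-backed checkpoint commands from Python. It is not the implementation described in the paper: the function names, the container name, and the checkpoint name are hypothetical, and it assumes a Docker daemon with experimental features enabled and CRIU installed on the host.

```python
import subprocess
from typing import Callable, List

# A runner executes one command line; injectable so the logic can be
# exercised without a Docker daemon (an assumption of this sketch).
Runner = Callable[[List[str]], None]


def _run(cmd: List[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)


def checkpoint_cmd(container: str, name: str,
                   leave_running: bool = False) -> List[str]:
    """Build `docker checkpoint create` (experimental, requires CRIU).

    By default the container stops after the checkpoint is written,
    which releases the memory held by the function.
    """
    cmd = ["docker", "checkpoint", "create"]
    if leave_running:
        cmd.append("--leave-running")
    return cmd + [container, name]


def restore_cmd(container: str, name: str) -> List[str]:
    """Build `docker start --checkpoint` to resume from a checkpoint."""
    return ["docker", "start", "--checkpoint", name, container]


def suspend(container: str, name: str, run: Runner = _run) -> None:
    """Checkpoint a blocked long-running function and stop its container."""
    run(checkpoint_cmd(container, name))


def resume(container: str, name: str, run: Runner = _run) -> None:
    """Restart the container from the saved checkpoint."""
    run(restore_cmd(container, name))
```

Calling `suspend("fn-1", "cp0")` followed later by `resume("fn-1", "cp0")` would checkpoint and restore a hypothetical container named `fn-1`. Unlike `docker pause`, which keeps the process resident in memory, a checkpoint writes the process state to disk, which is the trade-off the abstract describes.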