Session A-7

Attacks

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

Connectivity Maintenance in Uncertain Networks under Adversarial Attack

Jianzhi Tang, Luoyi Fu and Jiaxin Ding (Shanghai Jiao Tong University, China); Xinbing Wang (Shanghai Jiaotong University, China); Guihai Chen (Shanghai Jiao Tong University, China)

This paper studies the problem of connectivity maintenance in adversarial uncertain networks, where a defender prevents the largest connected component from being decomposed by an attacker. In contrast with its deterministic counterpart, connectivity maintenance in an uncertain network involves additional testing of edges to determine their existence. To this end, by modeling a general uncertain network as a random graph with each edge associated with an existence probability and a testing cost, our goal is to design a general adaptive defensive strategy that maximizes the expected size of the largest remaining connected component with minimum expected testing cost; moreover, the strategy should be independent of the attacking patterns. We establish the computational complexity of the connectivity maintenance problem by proving its NP-hardness. To tackle the problem exactly, we first propose a dynamic-programming-based optimal defensive strategy for a specific class of uncertain networks with uniform testing costs. Thereafter, multi-objective optimization is adopted to generalize the optimal strategy to general uncertain networks through a weighted sum of normalized size and cost. Due to the prohibitive computational cost of an optimal strategy, two approximate defensive strategies are further designed to pursue decent performance with logarithmic complexity.

FeCo: Boosting Intrusion Detection Capability in IoT Networks via Contrastive Learning

Ning Wang (Virginia Tech, USA); Yimin Chen (University of Massachusetts Lowell, USA); Yang Hu (Virginia Tech, USA); Wenjing Lou and Thomas Hou (Virginia Tech, USA)

Over the last decade, the Internet of Things (IoT) has permeated our daily life with a broad range of applications. However, a lack of sufficient security features in IoT devices renders IoT ecosystems vulnerable to various network intrusion attacks, potentially causing severe damage. Previous works have explored using machine learning to build anomaly detection models for defending against such attacks. In this paper, we propose FeCo, a federated-contrastive-learning framework that coordinates in-network IoT devices to jointly learn intrusion detection models. FeCo utilizes federated learning to alleviate users' privacy concerns, as participating devices only submit their model parameters rather than local data. Compared to previous works, we develop a novel representation learning method based on contrastive learning that learns a more accurate model for the benign class and significantly improves intrusion detection accuracy. In addition, we implement a two-step feature selection scheme to avoid overfitting and reduce computation time. Through extensive experiments on the NSL-KDD dataset, we demonstrate that FeCo achieves up to an 8% accuracy improvement over the state of the art and is robust to non-IID data. Evaluations of convergence, computation overhead, and scalability further confirm the suitability of FeCo for IoT intrusion detection.

PhoneyTalker: An Out-of-the-Box Toolkit for Adversarial Example Attack on Speaker Recognition

Meng Chen, Li Lu, Zhongjie Ba and Kui Ren (Zhejiang University, China)

Voice has become a fundamental method for human-computer interaction and person identification. Benefiting from the rapid development of deep learning, speaker recognition exploiting voice biometrics has achieved great success in various applications. However, the shadow of adversarial example attacks on deep neural network-based speaker recognition has recently raised extensive public concern and enormous research interest. Although existing studies propose to generate adversarial examples by iterative optimization to deceive speaker recognition, these methods require multiple iterations to construct specific perturbations for a single voice, which is input-specific, time-consuming, and non-transferable, hindering deployment and application by non-professional adversaries. In this paper, we propose PhoneyTalker, an out-of-the-box toolkit for any adversary to generate universal and transferable adversarial examples with low complexity, removing the requirement for professional background and specialized equipment. PhoneyTalker decomposes an arbitrary voice into phone combinations and generates phone-level perturbations using a generative model; these perturbations are reusable for voices from different persons with various texts. By training the generative model with diversified datasets, PhoneyTalker can generalize across different models. Experiments on mainstream speaker recognition systems with a large-scale corpus show that PhoneyTalker outperforms state-of-the-art methods with overall attack success rates of 99.9% and 84.0% under white-box and black-box settings, respectively.

TrojanFlow: A Neural Backdoor Attack to Deep Learning-based Network Traffic Classifiers

Rui Ning, ChunSheng Xin and Hongyi Wu (Old Dominion University, USA)

This paper reports TrojanFlow, a new and practical neural backdoor attack on DL-based network traffic classifiers. In contrast to traditional neural backdoor attacks, where a designated, sample-agnostic trigger is used to plant the backdoor, TrojanFlow poisons a model using dynamic, sample-specific triggers that are optimized to efficiently hijack the model. It features a unique design that jointly optimizes the trigger generator with the target classifier during training. The trigger generator can thus craft optimized triggers based on the input sample to efficiently manipulate the model's prediction. A well-engineered prototype is developed using PyTorch to demonstrate TrojanFlow attacking multiple practical DL-based network traffic classifiers. A thorough analysis is conducted to gain insights into why TrojanFlow is effective and what it does to efficiently hijack the model. Extensive experiments are carried out on the well-known ISCXVPN2016 dataset with three widely adopted DL network traffic classifier architectures. TrojanFlow is compared with two other backdoor attacks under five state-of-the-art backdoor defenses. The results show that the TrojanFlow attack is stealthy, efficient, and highly robust against existing neural backdoor mitigation schemes.

Session Chair

Xiaolong Zheng (Beijing University of Posts and Telecommunications)

Session B-7

Federated Learning 1

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

A Profit-Maximizing Model Marketplace with Differentially Private Federated Learning

Peng Sun (The Chinese University of Hong Kong, Shenzhen, China); Xu Chen (Sun Yat-sen University, China); Guocheng Liao (Sun Yat-Sen University, China); Jianwei Huang (The Chinese University of Hong Kong, Shenzhen, China)

Existing machine learning (ML) model marketplaces generally require data owners to share their raw data, leading to serious privacy concerns. Federated learning (FL) can partially alleviate this issue by enabling model training without raw data exchange. However, data owners are still susceptible to privacy leakage from gradient exposure in FL, which discourages their participation. In this work, we advocate a novel differentially private FL (DPFL)-based ML model marketplace and focus on a broker-centric design. Specifically, the broker first incentivizes data owners to participate in model training via DPFL by offering privacy protection as per their privacy budgets and explicitly accounting for their privacy costs. Then, it conducts optimal model versioning and pricing to sell the obtained model versions to model buyers. In particular, we focus on the broker's profit maximization, which is challenging due to the significant difficulties in characterizing the revenue of model trading and estimating the cost of DPFL model training. We propose a two-layer optimization framework to address it, i.e., revenue maximization and cost minimization under model quality constraints. The latter is still challenging due to its non-convexity and integer constraints. We hence propose efficient algorithms, whose performance is both theoretically guaranteed and empirically validated.

Communication-Efficient Device Scheduling for Federated Learning Using Stochastic Optimization

Jake Perazzone (US Army Research Lab, USA); Shiqiang Wang (IBM T. J. Watson Research Center, USA); Mingyue Ji (University of Utah, USA); Kevin S Chan (US Army Research Laboratory, USA)

Federated learning (FL) is a useful tool in distributed machine learning that utilizes users' local datasets in a privacy-preserving manner. When deploying FL in a constrained wireless environment, however, training models in a time-efficient manner can be challenging due to intermittent connectivity of devices, heterogeneous connection quality, and non-i.i.d. data. In this paper, we provide a novel convergence analysis of non-convex loss functions using FL on both i.i.d. and non-i.i.d. datasets with arbitrary device selection probabilities for each round. Then, using the derived convergence bound, we use stochastic optimization to develop a new client selection and power allocation algorithm that minimizes a function of the convergence bound and the average communication time under a transmit power constraint, and we find the closed-form solution to the minimization problem. One key feature of the algorithm is that knowledge of the channel statistics is not required; only the instantaneous channel state information needs to be known. Using the FEMNIST and CIFAR-10 datasets, we show through simulations that communication time can be significantly decreased using our algorithm compared to uniformly random participation.

Optimal Rate Adaption in Federated Learning with Compressed Communications

Laizhong Cui and Xiaoxin Su (Shenzhen University, China); Yipeng Zhou (Macquarie University, Australia); Jiangchuan Liu (Simon Fraser University, Canada)

Federated Learning (FL) incurs high communication overhead, which can be greatly alleviated by compressing model updates. Yet the tradeoff between compression and model accuracy in the networked environment remains unclear and, for simplicity, most implementations adopt only a fixed compression rate. In this paper, we systematically examine this tradeoff for the first time, identifying the influence of the compression error on the final model accuracy with respect to the learning rate. Specifically, we factor the compression error of each global iteration into the convergence rate analysis under both strongly convex and non-convex loss functions. We then present an adaptation framework to maximize the final model accuracy by strategically adjusting the compression rate in each iteration. We discuss the key implementation issues of our framework in practical networks with representative compression algorithms. Experiments over the popular MNIST and CIFAR-10 datasets confirm that our solution effectively reduces network traffic yet maintains high model accuracy in FL.

Towards Optimal Multi-modal Federated Learning on Non-IID Data with Hierarchical Gradient Blending

Sijia Chen and Baochun Li (University of Toronto, Canada)

Recent advances in federated learning (FL) have made it feasible to train a machine learning model across multiple clients, even with non-IID data distributions. In contrast to uni-modal models, which have been studied extensively in the literature, there are few in-depth studies on how multi-modal models can be trained effectively with federated learning. Unfortunately, we empirically observe a counter-intuitive phenomenon: compared with its uni-modal counterpart, multi-modal FL leads to a significant degradation in performance.

Our in-depth analysis of this phenomenon shows that modality sub-networks and local models can overfit and generalize at different rates. To alleviate these inconsistencies in collaborative learning, we propose hierarchical gradient blending (HGB), which simultaneously computes the optimal blending of modalities and the optimal weighting of local models by adaptively measuring their overfitting and generalization behaviors. We present several important theoretical insights and convergence guarantees for convex and smooth functions when HGB is applied, and evaluate its performance in multi-modal FL. Our experimental results on an extensive array of non-IID multi-modal data demonstrate that HGB not only outperforms the best uni-modal baselines but also achieves superior accuracy and convergence speed compared to state-of-the-art frameworks.

Session Chair

Xiaofei Wang (Tianjin University)

Session C-7

Crowdsensing

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

A Comparative Approach to Resurrecting the Market of MOD Vehicular Crowdsensing

Chaocan Xiang (Chongqing University, China); Yaoyu Li (Chongqing University, China); Yanlin Zhou (Chongqing University, China); Suining He (The University of Connecticut, USA); Yuben Qu (Nanjing University of Aeronautics and Astronautics, China); Zhenhua Li (Tsinghua University, China); Liangyi Gong (Computer Network Information Center, Chinese Academy of Sciences, China); Chao Chen (Chongqing University, China)

With the popularity of Mobility-on-Demand (MOD) vehicles, a new market called MOD-Vehicular-Crowdsensing (MOVE-CS) was introduced for drivers to earn more by collecting road data. Unfortunately, MOVE-CS failed after two years of operation. To identify the root cause, we survey 581 drivers and reveal its simple operation model based on blindly competitive rewards. This model yields little return for most drivers, resulting in their withdrawal. In contrast, a similar market termed MOD-Human-Crowdsensing (MOMAN-CS) remains successful thanks to a complex model based on exclusively customized rewards. Hence, we wonder whether MOVE-CS can be resurrected by learning from MOMAN-CS. Despite considerable similarity, we can hardly apply the operation model of MOMAN-CS to MOVE-CS, since drivers are also concerned with passenger missions that dominate their earnings. To this end, we analyze a large-scale dataset of 12,493 MOD vehicles and find that drivers have an explicit preference for short-term, immediate gains as well as an implicit rationality in pursuit of long-term, stable profits. Therefore, we design a novel operation model for MOVE-CS, at the heart of which lies a spatial-temporal differentiation-aware task recommendation scheme empowered by submodular optimization. Applied to the dataset, our design would substantially benefit both the drivers and the platform, and thus has the potential to resurrect MOVE-CS.

Real-Time Execution of Trigger-Action Connection for Home Internet-of-Things

Kai Dong, Yakun Zhang, Yuchen Zhao, Daoming Li, Zhen Ling and Wenjia Wu (Southeast University, China); Xiaorui Zhu (Nanjing Xiaozhuang University, China)

IFTTT is a programming framework for Applets (i.e., user-customized policies with a "trigger-action" syntax) and is the most popular Home Internet-of-Things (H-IoT) platform. The execution of an Applet prompted by a device operation suffers from a long delay, since IFTTT has to periodically read the states of the device to determine whether the trigger is satisfied, with a polling interval of up to 5 min for professional users and 60 min for normal users. Although IFTTT sets a flexible polling interval based on the past several times an Applet has run, the delay is still around 2 min even for frequently executed Applets. This paper proposes a novel trigger notification mechanism, "RTX-IFTTT", to achieve real-time execution of Applets. The mechanism does not require any changes to the current IFTTT framework or the H-IoT devices; it only requires an H-IoT edge node (e.g., a router) to identify device events (e.g., turning on/off) and notify IFTTT to perform the action of an Applet when an identified event is the trigger of that Applet. The experimental results show that the average Applet execution delay of RTX-IFTTT is only about 2 seconds.

Spatiotemporal Fracture Data Inference in Sparse Urban CrowdSensing

En Wang, Mijia Zhang and Yuanbo Xu (Jilin University, China); Haoyi Xiong (Baidu, USA); Yongjian Yang (Jilin University, China)

While Mobile CrowdSensing (MCS) has become a popular paradigm that recruits mobile users to carry out various sensing tasks collaboratively, the performance of MCS is frequently degraded by the limited spatiotemporal coverage of data collection. A possible remedy is to incorporate sparse MCS with data inference, where unsensed data are completed through prediction. However, spatiotemporal data inference is usually "fractured" and performs poorly because of the following challenges: 1) the sparsity of the sensed data, 2) the unpredictability of a spatiotemporal fracture, and 3) the complex spatiotemporal relations. To resolve such fractured-data issues, we develop a generative data model for spatiotemporal fracture data inference in sparse MCS. Specifically, an algorithm named Generative High-Fidelity Matrix Completion (GHFMC) is proposed, which combines traditional Deep Matrix Factorization (DMF) and Generative Adversarial Networks (GAN) to generate spatiotemporal fracture data. Along this line, GHFMC learns to extract the features of spatiotemporal data and further efficiently completes and predicts the unsensed data using a Binary Cross Entropy (BCE) loss. Finally, we conduct experiments on three popular datasets. The experimental results show that our approach outperforms the state-of-the-art (SOTA) baselines in both data inference accuracy and fidelity.

Worker Selection Towards Data Completion for Online Sparse Crowdsensing

Wenbin Liu, En Wang and Yongjian Yang (Jilin University, China); Jie Wu (Temple University, USA)

As a cost-effective paradigm, Sparse Crowdsensing (SC) aims to recruit workers to perform a part of the sensing tasks and infer the rest. In most cases, workers participate in real time, and thus their sensing data arrive dynamically. Taking full advantage of these dynamically arriving data to complete the full map is an important problem. However, for data completion, the importance of data collected from different spatio-temporal areas varies and changes over time. Moreover, the area importance also influences the subsequent worker selection, i.e., selecting suitable workers to actively sense important areas (instead of passively waiting for a given set of data) to improve completion accuracy. To this end, we propose a framework for ONline Sparse Crowdsensing, called ON-SC, which consists of three parts: data completion, importance estimation, and worker selection. We start from the dynamically arriving data and propose an online matrix completion algorithm with spatio-temporal constraints. Based on that, we estimate the spatio-temporal area importance with an up-to-date reinforcement learning-based model. Finally, we leverage the prophet secretary problem to select suitable workers to sense important areas for accurate completion in an online manner. Extensive experiments on real-world datasets show the effectiveness of our proposals.

Session Chair

Hongbo Jiang (Hunan University)

Session D-7

Network Functions and Tasking

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

An Efficient Two-Layer Task Offloading Scheme for MEC Networks with Multiple Services Providers

Ju Ren and Jiani Liu (Tsinghua University, China); Yongmin Zhang and Zhaohui Li (Central South University, China); Zhibo Wang (Zhejiang University, China); Feng Lyu (Central South University, China); Yaoxue Zhang (Tsinghua University, China)

With the explosive growth of mobile and IoT applications, a growing number of Mobile Edge Computing (MEC) systems have been developed by diverse edge service providers (ESPs), opening a new computing market with stiff competition. Considering the spatiotemporally varying features of computation tasks, taking over all the received tasks alone may greatly degrade the service performance of an MEC system and lead to poor economic benefit. To this end, this paper proposes a two-layer collaboration model for ESPs, in which each ESP can not only balance the computation workload among its internal edge nodes, but also offload part of the computation tasks to external edge servers of other ESPs. For internal load balancing, we propose an ADMM-based task balancing scheme to manage the computation tasks within the edge nodes of an ESP, such that the computation delay is minimized. For external task offloading, we formulate a game-based pricing and task allocation scheme to derive the best game strategy, aiming at maximizing the total revenue of each ESP. Extensive simulation results demonstrate that the proposed schemes achieve improved performance in terms of system revenue and stability, as well as computation delay.

Dyssect: Dynamic Scaling of Stateful Network Functions

Fabricio Carvalho (Federal University of Mato Grosso do Sul, Brazil); Ronaldo A. Ferreira (UFMS, Brazil); Italo Cunha (Universidade Federal de Minas Gerais, Brazil); Marcos A. M. Vieira (Federal University of Minas Gerais, Brazil); Murali K Ramanathan (Uber Technologies Inc, USA)

Network Function Virtualization promises better utilization of computational resources by dynamically scaling resources on demand. However, most network functions (NFs) are stateful and require state updates on a per-packet basis. During a scaling operation, cores need to synchronize access to shared state to avoid race conditions and to guarantee that NFs process packets in arrival order. Unfortunately, the classic approach of controlling concurrent access to shared state with locks does not scale to today's throughput and latency requirements. Moreover, network traffic is highly skewed, leading to load imbalances in systems that use only sharding to partition NF state. To address these challenges, we present Dyssect, a system that enables dynamic scaling of stateful NFs by disaggregating the states of network functions. By carefully coordinating actions between cores and a central controller, Dyssect migrates shards and flows between cores for load balancing or traffic prioritization without resorting to locks or reordering packets. Our experimental evaluation shows that Dyssect reduces tail latency by up to 32% and increases throughput by up to 19.36% compared to state-of-the-art competing solutions.

Network Synthesis under Delay Constraints: The Power of Network Calculus Differentiability

Fabien Geyer (Airbus, Germany); Steffen Bondorf (Ruhr University Bochum, Germany)

With the advent of standards for deterministic network behavior, synthesizing network designs under delay constraints becomes the natural next task to tackle. Network Calculus (NC) has become a key method for validating industrial networks, as it computes formally verified end-to-end delay bounds. However, analyses from the NC framework were thus far designed to bound one flow's delay at a time. Attempts to use classical analyses to derive a network configuration revealed this approach to be poorly suited for practical use cases. Take finding a delay-optimal routing configuration: one model for each routing alternative had to be created, then each flow delay had to be bounded, and then the bounds were compared to the given constraints. To overcome this three-step procedure, we introduce Differential Network Calculus. We extend NC to allow for differentiation of delay bounds w.r.t. a wide range of network parameters, such as flow routes. This opens up NC to a class of efficient non-linear optimization techniques that take advantage of the delay bound computation's gradient. Our numerical evaluation on the routing problem shows that our novel method can synthesize flow paths in a matter of seconds, outperforming existing methods by multiple orders of magnitude.

User Experience Oriented Task Computation for UAV-Assisted MEC System

Lutian Shen (Yunnan University, China)

Unmanned aerial vehicle (UAV)-assisted computation paradigms have been regarded as the new norm in the integrated space-air-ground networks of the B5G and 6G era. However, providing services solely from the perspective of the UAVs cannot ensure a high Quality of Experience (QoE) at the user devices (UDs). Therefore, this paper presents a user experience-oriented service provision model for UAV-aided mobile edge computing (MEC) systems. First, a novel metric called the shrinkage ratio is defined to reflect the experience at the UDs, and the promotion of QoE is formulated as a shrinkage ratio minimization problem. Second, a three-step strategy is adopted to tackle the non-convex problem: 1) a posterior method is adopted to eliminate the unknown upper bounds; 2) the non-convex constraints are approximately converted to convex ones by applying first-order Taylor expansion and are handled by the SCA technique; 3) an algorithm for deriving an accurate estimate in step 1) is put forward. Third, an initialization scheme for the UAV trajectory is proposed by applying a TSP-based technique. Finally, numerical results are given to validate our proposed design.

Session Chair

Ana C Aguiar (University of Porto)

Session E-7

Optimization

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

Energy-Efficient Trajectory Optimization for Aerial Video Surveillance under QoS Constraints

Cheng Zhan (Southwest University, China); Han Hu (Beijing Institute of Technology, China); Shiwen Mao (Auburn University, USA); Jing Wang (Renmin University of China, China)

In this paper, we propose a novel design framework for aerial video surveillance in urban areas, where a cellular-connected UAV captures and transmits videos to the cellular network that services users. Fundamental challenges arise due to the limited onboard energy and quality of service (QoS) requirements over environment-dependent air-to-ground cellular links, where UAVs are usually served by the sidelobes of base stations (BSs). We aim to minimize the energy consumption of the UAV by jointly optimizing the mission completion time and UAV trajectory as well as transmission scheduling and association, subject to QoS constraints. The problem is formulated as a mixed-integer nonlinear programming (MINLP) problem by taking into account building blockage and BS antenna patterns. We first consider the average performance for uncertain local environments, and obtain an efficient sub-optimal solution by employing graph theory and convex optimization techniques. Next, we investigate the site-specific performance for specific urban local environments. By reformulating the problem as a Markov decision process (MDP), a deep reinforcement learning (DRL) algorithm is proposed by employing a dueling deep Q-network (DQN) neural network model with only local observations of sampled rate measurements. Simulation results show that the proposed solutions achieve significant performance gains over baseline schemes.

GADGET: Online Resource Optimization for Scheduling Ring-All-Reduce Learning Jobs

Menglu Yu and Ye Tian (Iowa State University, USA); Bo Ji (Virginia Tech, USA); Chuan Wu (The University of Hong Kong, Hong Kong); Hridesh Rajan (Iowa State University, USA); Jia Liu (The Ohio State University, USA)

Fueled by advances in distributed deep learning (DDL), recent years have witnessed a rapidly growing demand for resource-intensive distributed/parallel computing to process DDL computing jobs. To resolve network communication bottleneck and load balancing issues in distributed computing, the so-called "ring-all-reduce" decentralized architecture has been increasingly adopted to remove the need for dedicated parameter servers. To date, however, there remains a lack of theoretical understanding on how to design resource optimization algorithms for efficiently scheduling ring-all-reduce DDL jobs in computing clusters. This motivates us to fill this gap by proposing a series of new resource scheduling designs for ring-all-reduce DDL jobs. Our contributions in this paper are three-fold: i) We propose a new resource scheduling analytical model for ring-all-reduce deep learning, which covers a wide range of objectives in DDL performance optimization (e.g., excessive training avoidance, energy efficiency, fairness); ii) Based on the proposed performance analytical model, we develop an efficient resource scheduling algorithm called GADGET (greedy ring-all-reduce distributed graph embedding technique), which enjoys a provable strong performance guarantee; iii) We conduct extensive trace-driven experiments to demonstrate the effectiveness of the GADGET approach and its superiority over the state of the art.

Midpoint Optimization for Segment Routing

Alexander Brundiers (Osnabrück University, Germany); Timmy Schüller (Deutsche Telekom Technik GmbH & Osnabrück University, Germany); Nils Aschenbruck (Osnabrück University, Germany)

In this paper, we discuss the concept of Midpoint Optimization (MO) for Segment Routing (SR). It is based on the idea of integrating SR policies into the Interior Gateway Protocol (IGP) to allow various demands to be steered into them. We discuss the benefits of this approach compared to end-to-end SR and potential challenges that might arise in deployment. We further develop an LP-based optimization algorithm to assess the Traffic Engineering capabilities of MO for SR. Based on traffic and topology data from a Tier-1 Internet Service Provider as well as other, publicly available data, we show that this algorithm achieves virtually optimal results with regard to the maximum link utilization, on par with state-of-the-art end-to-end SR approaches. However, our MO approach requires substantially fewer policies to do so; for some instances, the reduction exceeds 99%.

On Designing Secure Cross-user Redundancy Elimination for WAN Optimization

Yuan Zhang, Ziwei Zhang, Minze Xu, Chen Tian and Sheng Zhong (Nanjing University, China)

Redundancy elimination (RE) systems allow network users to remove duplicate parts of their messages by introducing caches at both the message senders' and receivers' sides. While RE systems have been successfully deployed for handling unencrypted traffic, making them work over encrypted links remains an open problem. A few solutions have been proposed recently; however, they either completely violate end-to-end security or focus on the single-user setting. In this paper, we present a highly secure RE solution that supports cross-user redundancy elimination on encrypted traffic. Our solution not only preserves end-to-end security against outside adversaries, but also protects users' privacy against semi-honest RE agents. Furthermore, our solution can defend against malicious users' poisoning attacks, which is crucial for cross-user RE systems but has never been studied before. In cross-user RE systems, since all users inside a LAN write into a shared, global cache and use it to recover their original messages from deduplicated ones, a poisoning attack is prone to happen and can cause systemic damage to all users even when only one user is malicious and injects poisoned data into the cache. We rigorously prove our solution's security properties and demonstrate its promising performance by testing the proof-of-concept implementation with real-world Internet traffic data.

Session Chair

Zhenzhe Zheng (Shanghai Jiao Tong University)

Session F-7

Vehicular Systems

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

ANTIGONE: Accurate Navigation Path Caching in Dynamic Road Networks leveraging Route APIs

Xiaojing Yu and Xiang-Yang Li (University of Science and Technology of China, China); Jing Zhao (Illinois Institute of Technology, USA); Guobin Shen (Joveai Inc, USA); Nikolaos M. Freris and Lan Zhang (University of Science and Technology of China, China)

Navigation paths and corresponding travel times play a key role in location-based services (LBS) of which large-scale navigation path caching constitutes a fundamental component. In view of the highly dynamic real-time traffic changes in road networks, the main challenge amounts to updating paths in the cache in a fashion that incurs minimal costs due to querying external map service providers and cache maintenance. In this paper, we propose a hybrid graph approach in which an LBS provider maintains a dynamic graph with edge weights representing travel times, and queries the external map server so as to ascertain high fidelity of the cached paths subject to stringent limitations on query costs. We further deploy our method in one of the biggest on-demand food delivery platforms and evaluate the performance against state-of-the-art methods. Our experimental results demonstrate the efficacy of our approach in terms of both substantial savings in the number of required queries and superior fidelity of the cached paths.

Cutting Through the Noise to Infer Autonomous System Topology

Kirtus G Leyba and Joshua J. Daymude (Arizona State University, USA); Jean-Gabriel Young (University of Vermont, USA); Mark Newman (University of Michigan, USA); Jennifer Rexford (Princeton University, USA); Stephanie Forrest (Arizona State University, USA)

The Border Gateway Protocol (BGP) is a distributed protocol that manages interdomain routing without requiring a centralized record of which autonomous systems (ASes) connect to which others. Many methods have been devised to infer the AS topology from publicly available BGP data, but none provide a general way to handle the fact that the data are notoriously incomplete and subject to error. This paper describes a method for reliably inferring AS-level connectivity in the presence of measurement error using Bayesian statistical inference acting on BGP routing tables from multiple vantage points. We employ a novel approach for counting AS adjacency observations in the AS-PATH attribute data from public route collectors, along with a Bayesian algorithm to generate a statistical estimate of the AS-level network. Our approach also gives us a way to evaluate the accuracy of existing reconstruction methods and to identify advantageous locations for new route collectors or vantage points.

Joint Order Dispatch and Charging for Electric Self-Driving Taxi Systems

Guiyun Fan, Haiming Jin and Yiran Zhao (Shanghai Jiao Tong University, China); Yiwen Song (Carnegie Mellon University, USA); Xiaoying Gan and Jiaxin Ding (Shanghai Jiao Tong University, China); Lu Su (Purdue University, USA); Xinbing Wang (Shanghai Jiaotong University, China)

Nowadays, the rapid development of self-driving technology and its fusion with the ongoing vehicle electrification process have given rise to electric self-driving taxis (es-taxis). Foreseeably, es-taxis will become a major force serving massive urban mobility demands in the near future. Though promising, it remains a fundamental unsolved problem to effectively decide when and where a city-scale fleet of es-taxis should be charged, so that enough es-taxis are available whenever and wherever ride requests are submitted. Furthermore, charging decisions are far from isolated, but tightly coupled with the order dispatch process that matches orders with es-taxis. Therefore, in this paper, we investigate the problem of joint order dispatch and charging in es-taxi systems, with the objective of maximizing the ride-hailing platform's long-term cumulative profit. Technically, such a problem is challenging in a myriad of aspects, such as long-term profit maximization and partial statistical information on future orders. We address the various arising challenges by meticulously integrating a series of methods, including distributionally robust optimization, primal-dual transformation, and second-order conic programming, to yield far-sighted decisions. Finally, we validate the effectiveness of our proposed methods through extensive experiments based on two large-scale real-world online ride-hailing order datasets.

Vehicle-to-Nothing? Securing C-V2X Against Protocol-Aware DoS Attacks

Geoff Twardokus and Hanif Rahbari (Rochester Institute of Technology, USA)

Vehicle-to-vehicle (V2V) communication allows vehicles to directly exchange messages, increasing their situational awareness and offering the potential to prevent hundreds of thousands of vehicular crashes annually. Cellular Vehicle-to-Everything (C-V2X), with its LTE-V2X and New Radio (NR)-V2X variants in 4G/LTE- and 5G-based C-V2X, is emerging as the main V2V technology. However, despite security protocols and standards for C-V2X, we expose in this paper that its physical (PHY) and MAC layers are not resilient against intelligent, protocol-aware attacks, due to the very predictable PHY-layer structure and the vulnerable scheduling algorithm used in both LTE-V2X and NR-V2X. We devise two stealthy denial-of-service (DoS) exploits that dramatically degrade C-V2X availability, thereby increasing the chances of fatal vehicle collisions. We experimentally evaluate our attacks on an integrated, hybrid testbed with USRPs and state-of-the-art LTE-V2X devices, as well as through extensive simulations, showing that within seconds our attacks can reduce a target's packet delivery ratio by 90% or degrade C-V2X channel throughput by 50%. We propose, analyze, and evaluate detection approaches as well as mitigation techniques to address the vulnerabilities we expose in the C-V2X PHY/MAC layers, providing direction towards better-secured, resilient 5G C-V2X.

Session Chair

Janise McNair (University of Florida)

Session G-7

Data and Datacenters

Conference
10:00 AM — 11:30 AM EDT
Local
May 5 Thu, 10:00 AM — 11:30 AM EDT

Constrained In-network Computing with Low Congestion in Datacenter Networks

Raz Segal, Chen Avin and Gabriel Scalosub (Ben-Gurion University of the Negev, Israel)

Distributed computing has become common practice nowadays, with recent focus on the use of smart networking devices that offer in-network computing capabilities. State-of-the-art switches with near-line-rate computing and aggregation capabilities enable acceleration and improved performance for various modern applications such as big data analytics and large-scale distributed and federated machine learning.

In this paper, we formulate and study the theoretical algorithmic foundations of such approaches, focusing on how to deploy and use constrained in-network computing capabilities within the data center. We concentrate on reducing network congestion, i.e., the load on the most congested link in the network, while supporting the given workload(s). We present an efficient optimal algorithm for tree-like network topologies and show that our solution provides as much as a 13x improvement over common alternative approaches. In particular, our results show that having merely a small fraction of network devices that support in-network aggregation can significantly reduce network congestion, both for single and multiple workloads.

Fast and Heavy Disjoint Weighted Matchings for Demand-Aware Datacenter Topologies

Kathrin Hanauer, Monika Henzinger, Stefan Schmid and Jonathan Trummer (University of Vienna, Austria)

Reconfigurable optical topologies promise to improve the performance in datacenters by dynamically optimizing the physical network in a demand-aware manner. State-of-the-art optical technologies allow direct connectivity between top-of-rack switches (in the form of edge-disjoint matchings) to be established and updated within microseconds or less. However, to fully exploit temporal structure in the demand, such fine-grained reconfigurations also require fast algorithms for optimizing the interconnecting matchings.
Motivated by the desire to offload a maximum amount of demand to the reconfigurable network, this paper initiates the study of fast algorithms to find k disjoint heavy matchings in graphs. We present and analyze six algorithms, based on iterative matchings, b-matching, edge coloring, and node rankings. We show that the problem is generally NP-hard and study the achievable approximation ratios.
An extensive empirical evaluation of our algorithms on both real-world and synthetic traces (88 in total), including traces collected in Facebook datacenters and in HPC clusters, reveals that all our algorithms provide high-quality matchings, and that even the very fast ones come within 95% or more of the best solution. However, the running times differ significantly, and the best algorithm depends on k and the acceptable runtime-quality tradeoff.

Jingwei: An Efficient and Adaptable Data Migration Strategy for Deduplicated Storage Systems

Geyao Cheng, Deke Guo, Lailong Luo, Junxu Xia and Yuchen Sun (National University of Defense Technology, China)

Traditional migration methods are confronted with formidable challenges when data deduplication technologies are incorporated. Firstly, deduplication creates data-sharing dependencies among the stored files, and breaking such dependencies during migration incurs extra space overhead. Secondly, the redundancy elimination heightens the risk of data unavailability during server crashes. Existing methods fail to tackle both issues at once. To this end, we propose Jingwei, an efficient and adaptable data migration strategy for deduplicated storage systems. Specifically, Jingwei minimizes the extra space cost of migration for space efficiency. Meanwhile, Jingwei achieves service adaptability by encouraging replicas of hot data to spread out and share the data access load. We first model this problem as an integer linear program (ILP) and solve it with a commercial solver when only one empty migration target server is allowed. We then extend the problem to a scenario in which multiple non-empty target servers are available for migration, and solve it with an effective heuristic algorithm based on Bloom Filter-based data sketches. Trace-driven experiments show that Jingwei fortifies the file replicas by 66.7%, while occupying only 5.7% extra storage space compared with the latest "Goseed" method.

Optimal Data Placement for Stripe Merging in Locally Repairable Codes

Si Wu and Qingpeng Du (University of Science and Technology of China, China); Patrick Pak-Ching Lee (The Chinese University of Hong Kong, Hong Kong); Yongkun Li and Yinlong Xu (University of Science and Technology of China, China)

Erasure coding is a storage-efficient redundancy scheme for modern clustered storage systems that stores stripes of data and parity blocks across the nodes of multiple clusters; in particular, locally repairable codes (LRC) remain one popular family of practical erasure codes that achieve high repair efficiency. To efficiently adapt to dynamic requirements on access efficiency and reliability, storage systems often perform redundancy transitioning by tuning erasure coding parameters. In this paper, we apply a stripe merging approach to redundancy transitioning of LRC in clustered storage systems, merging multiple LRC stripes to form a large LRC stripe with low storage redundancy. We show that random placement of the LRC stripes being merged can lead to high cross-cluster transitioning bandwidth. To this end, we design an optimal data placement scheme that provably minimizes the cross-cluster traffic for stripe merging, by judiciously placing the blocks to be merged in the same cluster while maintaining the repair efficiency of LRC. We prototype and implement our optimal data placement scheme on a local cluster. Our evaluation shows that it significantly reduces the transitioning time, by up to 43.2% compared to the baseline.

Session Chair

Qiao Xiang (Xiamen University)

Session Break-1-May5

TII Virtual Booth

Conference
11:30 AM — 12:00 PM EDT
Local
May 5 Thu, 11:30 AM — 12:00 PM EDT

Session A-8

Attacks and Security

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

6Forest: An Ensemble Learning-based Approach to Target Generation for Internet-wide IPv6 Scanning

Tao Yang, Bingnan Hou, Tongqing Zhou and Zhiping Cai (National University of Defense Technology, China)

IPv6 target generation is the critical step in fast IPv6 scanning for Internet-wide surveys. Existing techniques, however, commonly suffer from low hit rates due to inappropriate space partitioning caused by outlier addresses and short-sighted splitting indicators. To address this problem, we propose 6Forest, an ensemble learning-based approach to IPv6 target generation that takes a global perspective and is resilient to outlier addresses. Given a set of known addresses, 6Forest first treats it as an initial address region and then iteratively divides the IPv6 address space into smaller regions using a maximum-covering splitting indicator. Before each round of space partitioning, it builds a forest structure for each region and exploits an enhanced isolation forest algorithm to remove the outlier addresses. Finally, it pre-scans samples from the divided address regions and generates IPv6 addresses based on the results. Experiments on eight large-scale candidate datasets indicate that, compared with the state-of-the-art methods for Internet-wide IPv6 scanning, 6Forest achieves up to a 116.5% improvement for low-budget scanning and a 15× improvement for high-budget scanning.

Auter: Automatically Tuning Multi-layer Network Buffers in Long-Distance Shadowsocks Networks

Xu He (George Mason University, USA); Jiahao Cao (Tsinghua University, China); Shu Wang and Kun Sun (George Mason University, USA); Lisong Xu (University of Nebraska-Lincoln, USA); Qi Li (Tsinghua University, China)

To bypass network censorship, Shadowsocks is often deployed on long-distance transnational networks; however, such proxy networks are usually plagued by high latency, high packet loss rates, and unstable bandwidth. Most existing tuning solutions rely on hand-tuned heuristics, which cannot work well in volatile Shadowsocks networks because they are labor-intensive and time-consuming. In this paper, we propose Auter, which automatically tunes multi-layer buffer parameters with reinforcement learning (RL) to improve the performance of Shadowsocks in long-distance networks. The key insight behind Auter is that different network environments require different buffer sizes to achieve sufficiently good performance. Hence, Auter continuously learns a tuning policy from volatile network states and dynamically alters the sizes of multiple buffers for high network performance. We prototype Auter and evaluate its effectiveness in various real networks. Our experimental results show that Auter can effectively improve network performance, with up to a 40.5% throughput increase in real networks. Besides, we demonstrate that Auter outperforms all existing tuning schemes.

FUME: Fuzzing Message Queuing Telemetry Transport Brokers

Bryan Pearson (University of Central Florida, USA); Yue Zhang (Jinan University, China); Cliff Zou (University of Central Florida, USA); Xinwen Fu (University of Massachusetts Lowell, USA)

Message Queuing Telemetry Transport (MQTT) is a popular communication protocol used to interconnect devices with considerable network constraints, such as those found in the Internet of Things (IoT). MQTT directly impacts thousands of devices, but the software security of its server ("broker") implementations is not well studied. In this paper, we design, implement, and evaluate a novel fuzz testing model for MQTT. The fuzzer combines aspects of mutation-guided fuzzing and generation-guided fuzzing to rigorously exhaust the MQTT protocol and identify vulnerabilities in servers. We introduce Markov chains for mutation-guided and generation-guided fuzzing that model the fuzzing engine according to a finite Bernoulli process. We implement "response feedback", a novel technique that monitors network and console activity to learn which inputs trigger new responses from the broker. In total, we found 7 major vulnerabilities across 9 different MQTT implementations, including 6 zero-day vulnerabilities and 2 CVEs. We show that when fuzzing these popular MQTT targets, our fuzzer compares favorably with other state-of-the-art fuzzing frameworks, such as BooFuzz and AFLNet.

Large-scale Evaluation of Malicious Tor Hidden Service Directory Discovery

Chunmian Wang, Zhen Ling, Wenjia Wu, Qi Chen and Ming Yang (Southeast University, China); Xinwen Fu (University of Massachusetts Lowell, USA)

Tor is the largest anonymous communication system, providing anonymous communication services to approximately 2.8 million users and 170,000 hidden services per day. The Tor hidden service mechanism can protect a server from exposing its real identity during communication. However, due to a design flaw in the Tor hidden service mechanism, adversaries can deploy malicious Tor hidden service directories (HSDirs) to covertly collect all onion addresses of hidden services and further probe those hidden services. To mitigate this issue, we design customized honeypot hidden services based on one-to-one and many-to-one HSDir monitoring approaches to lure and identify the malicious HSDirs conducting rapid and delayed probing attacks, respectively. By analyzing the probing behaviors and payloads, we investigate a novel semantic-based probing pattern clustering approach to classify the adversaries and shed light on the purposes of the malicious HSDirs. Moreover, we perform a theoretical analysis of the capability and accuracy of our approaches. Large-scale experiments are conducted in the real-world Tor network by deploying hundreds of thousands of honeypots during a monitoring period of more than three months. Finally, we identify 8 groups of 32 malicious HSDirs, discover 25 probing pattern clusters, and reveal 3 major probing purposes.

Session Chair

WenZhan Song (University of Georgia)

Session B-8

Federated Learning 2

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

FLASH: Federated Learning for Automated Selection of High-band mmWave Sectors

Batool Salehihikouei, Jerry Z Gu, Debashri Roy and Kaushik Chowdhury (Northeastern University, USA)

Fast sector steering in the mmWave band for vehicular mobility scenarios remains an open challenge, because the standard-defined exhaustive search over predefined antenna sectors cannot be assuredly completed within short contact times. This paper proposes machine learning to speed up sector selection using data from multiple non-RF sensors, such as LiDAR, GPS, and camera images. The contributions of this paper are threefold: First, a multimodal deep learning architecture is proposed that fuses the inputs from these data sources and locally predicts the sectors for best alignment at a vehicle. Second, it studies the impact of missing data (e.g., missing LiDAR/images) during inference, which is possible due to unreliable control channels or hardware malfunction. Third, it describes a first-of-its-kind multimodal federated learning framework that combines model weights from multiple vehicles and then disseminates the final fusion architecture back to them, thus incorporating private sharing of information and reducing their individual training times. We validate the proposed architectures on a live dataset collected from an autonomous car equipped with multiple sensors (GPS, LiDAR, and camera) and roof-mounted Talon AD7200 60 GHz mmWave radios. We observe a 52.75% decrease in sector selection time compared to the 802.11ad standard, while maintaining 89.32% of the throughput of the globally optimal solution.

Joint Superposition Coding and Training for Federated Learning over Multi-Width Neural Networks

Hankyul Baek, Won Joon Yun and Yunseok Kwak (Korea University, Korea (South)); Soyi Jung (Hallym University, Korea (South)); Mingyue Ji (University of Utah, USA); Mehdi Bennis (Centre of Wireless Communications, University of Oulu, Finland); Jihong Park (Deakin University, Australia); Joongheon Kim (Korea University, Korea (South))

This paper aims to integrate two synergetic technologies, federated learning (FL) and width-adjustable slimmable neural network (SNN) architectures. FL preserves data privacy by exchanging the locally trained models of mobile devices. By adopting SNNs as local models, FL can flexibly cope with the time-varying energy capacities of mobile devices. Combining FL and SNNs is, however, non-trivial, particularly under wireless connections with time-varying channel conditions. Furthermore, existing multi-width SNN training algorithms are sensitive to the data distributions across devices, and are thus ill-suited to FL. Motivated by this, we propose a communication- and energy-efficient SNN-based FL framework (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models. By applying SC, SlimFL exchanges the superposition of multiple width configurations, of which as many as possible are decoded for a given communication throughput. Leveraging ST, SlimFL aligns the forward propagation of different width configurations while avoiding inter-width interference during back propagation. We formally prove the convergence of SlimFL. The result reveals that SlimFL is not only communication-efficient but can also counteract non-IID data distributions and poor channel conditions, which is also corroborated by simulations.

Tackling System and Statistical Heterogeneity for Federated Learning with Adaptive Client Sampling

Bing Luo (Shenzhen Institute of Artificial Intelligence and Robotics for Society & The Chinese University of Hong Kong, Shenzhen, China); Wenli Xiao (The Chinese University of Hong Kong, Shenzhen, China); Shiqiang Wang (IBM T. J. Watson Research Center, USA); Jianwei Huang (The Chinese University of Hong Kong, Shenzhen, China); Leandros Tassiulas (Yale University, USA)

Federated learning (FL) algorithms usually sample a fraction of clients in each round (partial participation) when the number of participants is large and the server's communication bandwidth is limited. Recent works on the convergence analysis of FL have focused on unbiased client sampling, e.g., sampling uniformly at random, which suffers from slow wall-clock convergence due to high degrees of system heterogeneity and statistical heterogeneity. This paper aims to design an adaptive client sampling algorithm that tackles both system and statistical heterogeneity to minimize the wall-clock convergence time. We obtain a new tractable convergence bound for FL algorithms with arbitrary client sampling probabilities. Based on the bound, we analytically establish the relationship between the total learning time and the sampling probabilities, which results in a non-convex optimization problem. We design an efficient algorithm for learning the unknown parameters in the convergence bound and develop a low-complexity algorithm to approximately solve the non-convex problem. Experimental results from both a hardware prototype and simulations demonstrate that our proposed sampling scheme significantly reduces the convergence time compared to several baseline sampling schemes. Notably, our scheme on the hardware prototype spends 73% less time than baseline uniform sampling to reach the same target loss.

The Right to be Forgotten in Federated Learning: An Efficient Realization with Rapid Retraining

Yi Liu (City University of Hong Kong, China); Lei Xu (Nanjing University of Science and Technology, China); Xingliang Yuan (Monash University, Australia); Cong Wang (City University of Hong Kong, Hong Kong); Bo Li (Hong Kong University of Science and Technology, Hong Kong)

In Machine Learning (ML), the emergence of the right to be forgotten gave birth to a paradigm named machine unlearning, which enables data holders to proactively erase their data from a trained model. Existing machine unlearning techniques largely focus on centralized training, where access to all holders' training data is a must for the server to conduct the unlearning process. It remains largely underexplored how to achieve unlearning when full access to all training data is unavailable. One noteworthy example is Federated Learning (FL), where each participating data holder trains locally without sharing their training data with the central server. In this paper, we investigate the problem of machine unlearning in FL systems. We start with a formal definition of the unlearning problem in FL and propose a rapid retraining approach to fully erase data samples from a trained FL model. The resulting design allows data holders to jointly conduct the unlearning process efficiently while keeping their training data local. Our formal convergence and complexity analyses demonstrate that our design can preserve model utility with high efficiency. Extensive evaluations on four real-world datasets illustrate the effectiveness and performance of our proposed realization.

Session Chair

Christopher Brinton (Purdue University)

Session C-8

Mobile Sensing

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

Can We Obtain Fine-grained Heartbeat Waveform via Contact-free RF-sensing?

Shujie Zhang and Tianyue Zheng (Nanyang Technological University, Singapore); Zhe Chen (School of Computer Science and Engineering, Nanyang Technological University, Singapore); Jun Luo (Nanyang Technological University, Singapore)

1
Contact-free vital-signs monitoring enabled by radio frequency (RF) sensing is gaining increasing attention, thanks to its non-intrusiveness, noise-resistance, and low cost. Whereas most of these systems only perform respiration monitoring or retrieve heart rate, few can recover the fine-grained heartbeat waveform. The major reason is that, though both respiration and heartbeat cause detectable micro-motions on human bodies, the former is so strong that it overwhelms the latter. In this paper, we aim to answer the question in the paper title by demystifying how the heartbeat waveform can be extracted from RF-sensing signals. We apply several mainstream methods to recover the heartbeat waveform from raw RF signals, and our results reveal that these methods fail to achieve what they have claimed, mainly because they assume linear signal mixing whereas the composition between respiration and heartbeat can be highly nonlinear. To overcome the difficulty of decomposing nonlinear signal mixing, we leverage the power of a novel deep generative model termed variational encoder-decoder (VED). Exploiting the universal approximation ability of deep neural networks and the generative potential of variational inference, VED demonstrates a promising capability in recovering fine-grained heartbeat waveforms from RF-sensing signals; this is firmly validated by our experiments with 12 subjects and 48 hours of data.

DroneSense: Leveraging Drones for Sustainable Urban-scale Sensing of Open Parking Spaces

Dong Zhao (Beijing University of Posts and Telecommunications, China); Mingzhe Cao (Beijing University of Posts and Telecommunications, China); Lige Ding, Qiaoyue Han, Yunhao Xing and Huadong Ma (Beijing University of Posts and Telecommunications, China)

0
Energy and cost are two primary concerns when leveraging drones for urban sensing. With the advances of wireless charging technologies and the inspiration from the sparse crowdsensing paradigm, this paper proposes a novel drone-based collaborative sparse-sensing framework DroneSense, demonstrating its feasibility for sustainable urban-scale sensing. We focus on a typical use case, i.e., leveraging DroneSense to sense open parking spaces. DroneSense selects a minimum number of Points of Interest (POIs) to schedule drones for physical data sensing and then infers the parking occupancy of the remaining POIs to meet the overall quality requirement. However, drone-based sensing is different from human-centric crowdsensing, resulting in a series of new problems, including which POIs are visited first, when and where to charge drones, which drones to charge first, how much to charge, and when to stop the scheduling. To this end, we design a holistic solution, including context-aware matrix factorization for parking occupancy data inference, progressive determination of task quantity, deep reinforcement learning (DRL) based task selection, energy-aware DRL-based task scheduling, and adaptive charger scheduling. Extensive experiments with a real-world on-street parking dataset from Shenzhen, China demonstrate the obvious advantages of DroneSense.
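
The parking-occupancy inference step rests on completing a sparsely sensed matrix of POIs by time slots. The sketch below is a minimal, plain matrix-factorization stand-in for the paper's context-aware factorization; the rank, learning rate, and synthetic occupancy matrix are assumptions for illustration.

    import numpy as np

    def factorize(M, mask, rank=3, lr=0.01, reg=0.1, iters=2000, seed=0):
        """Infer missing entries of a POI x time-slot occupancy matrix M.

        mask[i, j] == 1 where drones actually sensed POI i at slot j.
        Missing entries are reconstructed as U @ V.T after fitting the
        observed ones, i.e., the sparse-sensing inference step.
        """
        rng = np.random.default_rng(seed)
        n, m = M.shape
        U = 0.1 * rng.normal(size=(n, rank))
        V = 0.1 * rng.normal(size=(m, rank))
        for _ in range(iters):
            E = mask * (U @ V.T - M)           # error only on observed entries
            U -= lr * (E @ V + reg * U)
            V -= lr * (E.T @ U + reg * V)
        return U @ V.T

    # Toy example: 5 POIs, 6 slots, ~40% of entries observed.
    rng = np.random.default_rng(1)
    truth = rng.random((5, 6))
    mask = (rng.random((5, 6)) < 0.4).astype(float)
    est = factorize(truth * mask, mask)
    print(np.abs((est - truth) * (1 - mask)).mean())   # error on unobserved cells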

RF-Wise: Pushing the Limit of RFID-based Sensing

Cui Zhao (Xi'an Jiaotong University, China); Zhenjiang Li (City University of Hong Kong, Hong Kong); Han Ding (Xi'an Jiaotong University, China); Ge Wang (Xi'an Jiaotong University, China); Wei Xi and Jizhong Zhao (Xi'an Jiaotong University, China)

0
RFID shows great potential for building useful sensing applications. However, current RFID sensing obtains mainly a single-dimensional sensing measure from each query (e.g., phase or RSS). This suffices for designs tied to the tag's own movement (e.g., tag localization), but it introduces inevitable uncertainty for the broader spectrum of sensing tasks that rely on features extracted from RFID signals, fundamentally limiting the fidelity of RFID sensing and preventing its use in more sophisticated scenarios. This paper presents RF-Wise to push the limit of RFID sensing, motivated by an insightful observation that RFID signals can be customized. RF-Wise enriches the prior single-dimensional measure to a channel state information (CSI)-like measure with up to 150-dimensional samples across frequencies concurrently. More importantly, RF-Wise is a software solution atop standard RFID without using any extra hardware, requires only one tag for sensing, works within the ISM band and is compatible with the EPC Gen2 protocol. To the best of our knowledge, RF-Wise is the first system of its kind. We develop an RF-Wise prototype. Extensive experiments show that RF-Wise does not impact underlying RFID communications, while by using RF-Wise's features, applications' sensing performance can be improved remarkably.

TeethPass: Dental Occlusion-based User Authentication via In-ear Acoustic Sensing

Yadong Xie and Fan Li (Beijing Institute of Technology, China); Yue Wu (Tsinghua University, China); Huijie Chen (Beijing University of Technology, China); Zhiyuan Zhao (Beijing Institute of Technology, China); Yu Wang (Temple University, USA)

0
With the rapid development of mobile devices and the fast increase of sensitive data, secure and convenient mobile authentication technologies are desired. Beyond traditional passwords, many mobile devices offer biometric-based authentication methods (e.g., fingerprint, voiceprint, and face recognition), but they are vulnerable to spoofing attacks. To solve this problem, we study new biometric features based on dental occlusion and find that the bone-conducted sound of dental occlusion collected in the binaural canals contains unique features of an individual's bones and teeth. Motivated by this, we propose a novel authentication system, TeethPass, which uses earbuds to collect occlusal sounds in the binaural canals to achieve authentication. We design an event detection method based on spectrum variance and double thresholds to detect bone-conducted sounds. Then, we analyze the time-frequency domain of the sounds to filter out motion noises and extract unique features of users from three aspects: bone structure, occlusal location, and occlusal sound. Finally, we design an incremental learning-based Siamese network to construct the classifier. Through extensive experiments involving 22 participants, the performance of TeethPass in different environments is verified. TeethPass achieves an accuracy of 96.8% and resists nearly 99% of spoofing attacks.

Session Chair

Gang Zhou (William & Mary)

Session D-8

Online Learning

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

Online Learning-Based Rate Selection for Wireless Interactive Panoramic Scene Delivery

Harsh Gupta (University of Illinois at Urbana-Champaign, USA); Jiangong Chen and Bin Li (The Pennsylvania State University, USA); R. Srikant (University of Illinois at Urbana-Champaign, USA)

0
Interactive panoramic scene delivery not only consumes 4~6× more bandwidth than traditional video streaming of the same resolution but also requires timely displaying the delivered content to ensure smooth interaction. Since users can only see roughly 20% of the entire scene at a time (called the viewport), it is sufficient to deliver the relevant portion of the panoramic scene if we can accurately predict the user's motion. It is customary to deliver a portion larger than the viewport to tolerate inaccurate predictions. Intuitively, the larger the delivered portion, the higher the prediction accuracy and the lower the wireless transmission success probability. The goal is to select an appropriate delivery portion to maximize system throughput. We formulate this problem as a multi-armed bandit problem and use the classical Kullback-Leibler Upper Confidence Bound (KL-UCB) algorithm for the portion selection. We further develop a novel variant of the KL-UCB algorithm that effectively leverages two-level feedback (i.e., both prediction and transmission outcomes) after each decision on the selected portion and show its asymptotic optimality, which may be of independent interest by itself. We demonstrate the superior performance of our proposed algorithms over existing heuristic methods using both synthetic simulations and real experimental evaluations.
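
For readers unfamiliar with KL-UCB, the index used for portion selection can be computed by a short bisection, as in the sketch below; treating each candidate portion as a Bernoulli arm and the particular exploration constant are assumptions made for the example, not details taken from the paper.

    import math

    def kl_bernoulli(p, q, eps=1e-12):
        p = min(max(p, eps), 1 - eps)
        q = min(max(q, eps), 1 - eps)
        return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

    def kl_ucb_index(mean, pulls, t, c=0.0):
        """Largest q >= mean with pulls * KL(mean, q) <= log(t) + c*log(log(t))."""
        if pulls == 0:
            return 1.0
        bound = (math.log(t) + c * math.log(max(math.log(t), 1.0))) / pulls
        lo, hi = mean, 1.0
        for _ in range(50):                      # bisection on the KL constraint
            mid = (lo + hi) / 2
            if kl_bernoulli(mean, mid) <= bound:
                lo = mid
            else:
                hi = mid
        return lo

    # Pick the delivery portion (arm) with the highest KL-UCB index.
    means, pulls, t = [0.6, 0.8, 0.7], [20, 5, 12], 37
    best = max(range(3), key=lambda a: kl_ucb_index(means[a], pulls[a], t))
    print("select portion", best)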

Schedule or Wait: Age-Minimization for IoT Big Data Processing in MEC via Online Learning

Zichuan Xu and Wenhao Ren (Dalian University of Technology, China); Weifa Liang (City University of Hong Kong, Hong Kong); Wenzheng Xu (Sichuan University, China); Qiufen Xia (Dalian University of Technology, China); Pan Zhou (School of CSE, Huazhong University of Science and Technology, China); Mingchu Li (School of Software, Dalian University of Technology, China)

0
The age of data (AoD) is identified as one of the most novel and important metrics to measure the quality of big data analytics for Internet-of-Things (IoT) applications. Meanwhile, mobile edge computing (MEC) is envisioned as an enabling technology to minimize the AoD of IoT applications by processing the data in edge servers close to IoT devices. In this paper, we study the AoD minimization problem for IoT big data processing in MEC networks. We first propose an exact solution for the problem by formulating it as an Integer Linear Program (ILP). We then propose an efficient heuristic for the offline AoD minimization problem. We further devise an approximation algorithm with a provable approximation ratio for a special case of the problem, by leveraging the parametric rounding technique. We also develop an online learning algorithm with a bounded regret for the online AoD minimization problem under dynamic arrivals of IoT requests and uncertain network delay assumptions, by adopting the Multi-Armed Bandit (MAB) technique. We finally evaluate the performance of the proposed algorithms by extensive simulations and implementations in a real test-bed. Results show that the proposed algorithms outperform existing approaches by reducing the AoD by around 10%.

Sending Timely Status Updates through Channel with Random Delay via Online Learning

Haoyue Tang, Yuchao Chen, Jintao Wang and Jingzhou Sun (Tsinghua University, China); Jian Song (Tsinghua University & Beijing National Research Center for Information Science and Technology & Key Lab of DTV System of Guangdong & Shenzhen, Research Institute of Tsinghua University in Shenzhen, China)

0
In this work, we study a status update system with a source node sending timely information to the destination through a channel with random delay. We measure the timeliness of the information stored at the receiver via the Age of Information (AoI), the time elapsed since the freshest sample stored at the receiver was generated. The goal is to design sampling strategies that minimize the total cost of the expected time-average AoI and the average sampling cost in the absence of transmission delay statistics. We reformulate the total cost minimization problem as the optimization of a renewal-reward process, and propose an online sampling strategy based on the Robbins-Monro algorithm. We prove that, when the transmission delay is bounded, the expected time-average total cost converges to the minimum cost and the optimality gap decays with rate $\mathcal{O}\left(\log K/K\right)$, where $K$ is the number of samples we have taken. Simulation results validate the performance of our proposed algorithm.
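
The Robbins-Monro idea behind such an online strategy is classic stochastic approximation: iterate toward the root of an expectation using noisy one-sample observations and a decaying step size. The sketch below shows the generic update on a toy target; the 1/k step size, the non-negativity projection, and the toy function are assumptions for illustration rather than the paper's exact update.

    import numpy as np

    def robbins_monro(sample_noisy_g, theta0=0.0, K=5000, seed=0):
        """Stochastic approximation: find theta with E[g(theta)] = 0.

        sample_noisy_g(theta, rng) returns one noisy observation of g(theta).
        The 1/k step size is the classic Robbins-Monro schedule.
        """
        rng = np.random.default_rng(seed)
        theta = theta0
        for k in range(1, K + 1):
            theta -= (1.0 / k) * sample_noisy_g(theta, rng)
            theta = max(theta, 0.0)        # e.g., a waiting threshold must be >= 0
        return theta

    # Toy target: E[g(theta)] = theta - 2 with unit-variance noise -> root at 2.
    g = lambda theta, rng: (theta - 2.0) + rng.normal()
    print(robbins_monro(g))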

Socially-Optimal Mechanism Design for Incentivized Online Learning

Zhiyuan Wang (Beihang University, China); Lin Gao (Harbin Institute of Technology (Shenzhen), China); Jianwei Huang (The Chinese University of Hong Kong, Shenzhen, China)

0
Multi-armed bandit (MAB) is a classic online learning framework that studies sequential decision-making in an uncertain environment. The MAB framework, however, overlooks the scenario where the decision-maker cannot take actions (e.g., pulling arms) directly. It is a practically important scenario in many applications such as spectrum sharing, crowdsensing, and edge computing. In these applications, the decision-maker would incentivize other selfish agents to carry out desired actions (i.e., pulling arms on the decision-maker's behalf). This paper establishes the incentivized online learning (IOL) framework for this scenario. The key challenge in designing the IOL framework lies in the tight coupling of unknown environment learning and asymmetric information revelation. To address this, we construct a special Lagrangian function based on which we propose a socially-optimal mechanism for the IOL framework. Our mechanism satisfies various desirable properties such as agent fairness, incentive compatibility, and voluntary participation. It achieves the same asymptotic performance as the state-of-the-art benchmark that requires extra information. Our analysis also unveils the power of the crowd in the IOL framework: a larger agent crowd enables our mechanism to approach more closely the theoretical upper bound of social performance. Numerical results demonstrate the advantages of our mechanism in large-scale edge computing.

Session Chair

Bo Ji (Virginia Tech)

Session E-8

Resource Management

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

Energy Saving in Heterogeneous Wireless Rechargeable Sensor Networks

Riheng Jia, Jinhao Wu, Jianfeng Lu, Minglu Li, Feilong Lin and Zhonglong Zheng (Zhejiang Normal University, China)

0
Mobile chargers (MCs) are usually dispatched to deliver energy to sensors in wireless rechargeable sensor networks (WRSNs) due to their flexibility and easy maintenance. This paper concerns the fundamental issue of charging path DEsign with the Minimized energy cOst (DEMO), i.e., given a set of rechargeable sensors, we appropriately design the MC's charging path to minimize the energy cost incurred by wireless charging and the MC's movement, such that the different charging demand of each sensor is satisfied. Solving DEMO is NP-hard and involves handling the tradeoff between the charging efficiency and the moving cost. To address DEMO, we first develop a computational geometry-based algorithm to deploy multiple charging positions where the MC stays to charge nearby sensors. We prove that the designed algorithm has an approximation ratio of O(ln N), where N is the number of sensors. Then we construct the charging path by calculating the shortest Hamiltonian cycle passing through all the deployed charging positions within the network. Extensive evaluations validate the effectiveness of our path design in terms of minimizing the MC's energy cost.
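
The two-stage structure (deploy charging positions covering nearby sensors, then tour them) can be pictured with the sketch below; the greedy set-cover placement and the nearest-neighbour tour are simple stand-ins for the paper's computational-geometry deployment and shortest-Hamiltonian-cycle computation, and the coverage radius and coordinates are illustrative assumptions.

    import math

    def greedy_positions(sensors, radius):
        """Pick charging positions (here: at sensor locations) until all sensors are covered."""
        uncovered, positions = set(range(len(sensors))), []
        while uncovered:
            best = max(range(len(sensors)), key=lambda c: sum(
                1 for i in uncovered
                if math.dist(sensors[c], sensors[i]) <= radius))
            positions.append(sensors[best])
            uncovered -= {i for i in uncovered
                          if math.dist(sensors[best], sensors[i]) <= radius}
        return positions

    def nearest_neighbour_tour(points, start=(0.0, 0.0)):
        """Cheap Hamiltonian-cycle heuristic for the MC's charging path."""
        tour, rest, cur = [start], list(points), start
        while rest:
            cur = min(rest, key=lambda p: math.dist(cur, p))
            rest.remove(cur)
            tour.append(cur)
        return tour + [start]

    sensors = [(1, 1), (1.5, 1.2), (5, 4), (5.2, 4.1), (9, 0.5)]
    positions = greedy_positions(sensors, radius=1.0)
    print(nearest_neighbour_tour(positions))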

Escala: Timely Elastic Scaling of Control Channels in Network Measurement

Hongyan Liu (Zhejiang University, China); Xiang Chen (Zhejiang University, Peking University, and Fuzhou University, China); Qun Huang (Peking University, China); Dezhang Kong (Zhejiang University, China); Sun Jinbo (Institute of Computing Technology, Chinese Academy of Sciences, China); Dong Zhang (Fuzhou University, China); Haifeng Zhou and Chunming Wu (Zhejiang University, China)

0
In network measurement, data plane switches measure traffic and report events (e.g., heavy hitters) to the control plane via control channels. The control plane makes decisions to process events. However, current network measurement suffers from two problems. First, when traffic bursts occur, massive events are reported in a short time, so the control channels may be overloaded due to limited bandwidth capacity. Second, only a few events are reported in normal cases, making control channels underloaded and wasting network resources. In this paper, we propose Escala to provide elastic scaling of control channels at runtime. The key idea is to dynamically migrate event streams among control channels to regulate the loads of these channels. Escala comprises two components: an Escala monitor that detects scaling situations based on real-time network statistics, and an optimization framework that makes scaling decisions to eliminate overload and underload situations. We have implemented a prototype of Escala on Tofino-based switches. Extensive experiments show that Escala achieves timely elastic scaling while preserving high application-level accuracy.

LSAB: Enhancing Spatio-Temporal Efficiency of AoA Tracking Systems

Qingrui Pan, Zhenlin An and Qiongzheng Lin (The Hong Kong Polytechnic University, Hong Kong); Lei Yang (The Hong Kong Polytechnic University, China)

0
Estimating the angle-of-arrival (AoA) of an RF source by using a large-sized antenna array is a classical topic in wireless systems. However, AoA tracking systems are not yet used for the Internet of Things (IoT) in the real world due to their unaffordable cost. Many efforts, such as time-sharing arrays, emulated arrays and sparse arrays, were recently made to cut the cost. This work introduces the log-spiral antenna belt (LSAB), a novel sparse "planar array" that can estimate the AoA of an IoT device in 3D space by using a few antennas connected to a single time-shared channel. Unlike conventional arrays, LSAB deploys antennas on a log-spiral-shaped belt in a non-linear manner, following the theory of minimum resolution redundancy newly discovered in this work. One physical 8×8 uniform planar array (UPA) and four logical sparse arrays, including LSAB, were prototyped to validate the theory and evaluate the performance of sparse arrays. The extensive benchmark demonstrates that the performance of LSAB is comparable to that of a UPA, with a similar degree of resolution, and that LSAB provides over 40% performance improvement over existing sparse arrays.
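
For intuition, placing antennas along a logarithmic spiral takes only a few lines; the spiral parameters a and b and the angular spacing below are illustrative assumptions, not the redundancy-minimizing values derived in the paper.

    import math

    def log_spiral_positions(n_antennas, a=0.05, b=0.15, step_deg=40.0):
        """Place antennas along r = a * exp(b * theta), a logarithmic spiral.

        Returns (x, y) coordinates in metres for a belt of n_antennas elements.
        """
        pts = []
        for k in range(n_antennas):
            theta = math.radians(k * step_deg)
            r = a * math.exp(b * theta)
            pts.append((r * math.cos(theta), r * math.sin(theta)))
        return pts

    for x, y in log_spiral_positions(8):
        print(f"({x:.3f}, {y:.3f})")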

StepConf: SLO-Aware Dynamic Resource Configuration for Serverless Function Workflows

Zhaojie Wen, Yishuo Wang and Fangming Liu (Huazhong University of Science and Technology, China)

1
Function-as-a-Service (FaaS) offers a fine-grained resource provision model, enabling developers to build highly elastic cloud applications. User requests are handled by a series of serverless functions step by step, which forms a function-based workflow. The developers need an optimal resource configuration strategy for all functions in the workflow to meet their service level objectives (SLOs) while saving cost. However, developing such a resource configuration strategy is nontrivial. It is mainly because cloud functions often suffer from unpredictable cold start and performance degradation, which requires an online configuration strategy to guarantee the SLOs.

In this paper, we present StepConf, a framework that automates the resource configuration for functions as the workflow runs. StepConf optimizes memory size for each function step in the workflow and takes inter-function parallelism into consideration, which is a crucial factor that influences workflow performance.
We evaluate StepConf using a video processing workflow on AWS Lambda. Compared with static strategies, the experimental results show that StepConf can further reduce the cost by 10.37% on average.

Session Chair

Zhi Sun (Tsinghua University)

Session F-8

Video Analytics

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

ArmSpy: Video-assisted PIN Inference Leveraging Keystroke-induced Arm Posture Changes

Yuefeng Chen, YiCong Du, Chunlong Xu, Yanghai Yu and Hongbo Liu (University of Electronic Science and Technology of China, China); Huan Dai (Suzhou University of Science and Technology, China); Yanzhi Ren (University of Electronic Science and Technology of China, China); Jiadi Yu (Shanghai Jiao Tong University, China)

0
PIN inference attacks leveraging keystroke-induced side-channel information pose a substantial threat to the security of people's privacy and property. Among various PIN inference attacks, video-assisted methods provide more intuitive and robust side-channel information for inferring PINs. However, they usually require an unobstructed view of the victims or their hand gestures, which either makes the attackers easy to expose or renders the attacks inapplicable to scenarios such as ATM or POS terminals. In this paper, we present a novel and practical video-assisted PIN inference system, ArmSpy, which infers a victim's PIN by observing from behind the victim in a stealthy way. Specifically, ArmSpy explores the subtle keystroke-induced arm posture changes, including elbow bending angle changes and the spatial relationship between different arm joints, to infer the PIN entries. We develop a keystroke inference model to detect keystroke events and pinpoint keystroke positions, and then accurately infer the PINs with the proposed inferred-PIN coordination mechanism. Extensive experimental results demonstrate that ArmSpy can achieve over 67% average accuracy in inferring the PIN within 3 attempts, and even over 80% for some victims, indicating the severity of the threat posed by ArmSpy.

DNN-Driven Compressive Offloading for Edge-Assisted Semantic Video Segmentation

Xuedou Xiao, Juecheng Zhang and Wei Wang (Huazhong University of Science and Technology, China); Jianhua He (Essex University, United Kingdom (Great Britain)); Qian Zhang (Hong Kong University of Science and Technology, Hong Kong)

0
Deep learning has shown impressive performance in semantic segmentation, but it is still unaffordable for resource-constrained mobile devices. While offloading computation tasks is promising, the high traffic demands overwhelm the limited bandwidth. Existing compression algorithms are not fit for semantic segmentation, as the lack of obvious and concentrated regions of interest (RoIs) forces the adoption of uniform compression strategies, leading to low compression ratios or accuracy. This paper introduces STAC, a DNN-driven compression scheme tailored for edge-assisted semantic video segmentation. STAC is the first to exploit DNN's gradients as spatial sensitivity metrics for spatial adaptive compression and achieves superior compression ratio and accuracy. Yet, it is challenging to adapt this content-customized compression to videos. Practical issues include varying spatial sensitivity and huge bandwidth consumption for compression strategy feedback and offloading. We tackle these issues through a spatiotemporal adaptive scheme, which (1) takes partial strategy generation operations offline to reduce communication load, and (2) propagates compression strategies and segmentation results across frames through dense optical flow, and adaptively offloads keyframes to accommodate video content. We implement STAC on a commodity mobile device. Experiments show that STAC can save up to 20.95% of bandwidth without losing accuracy, compared to the state-of-the-art algorithm.

FlexPatch: Fast and Accurate Object Detection for On-device High-Resolution Live Video Analytics

Kichang Yang, Juheon Yi and Kyungjin Lee (Seoul National University, Korea (South)); Youngki Lee (Seoul National University, Singapore)

1
We present FlexPatch, a novel mobile system to enable accurate and real-time object detection over high-resolution video streams. A widely-used approach for real-time video analysis is detection-based tracking (DBT), i.e., running the heavy-but-accurate detector every few frames and applying a lightweight tracker for in-between frames. However, the approach is limited for real-time processing of high-resolution videos in that i) a lightweight tracker fails to handle occlusion, object appearance changes, and occurrences of new objects, and ii) the detection results do not effectively offset tracking errors due to the high detection latency. We propose tracking-aware patching technique to address such limitations of the DBT frameworks. It effectively identifies a set of subareas where the tracker likely fails and tightly packs them into a small-sized rectangular area where the detection can be efficiently performed at low latency. This prevents the accumulation of tracking errors and offsets the tracking errors with frequent fresh detection results. Our extensive evaluation shows that FlexPatch not only enables real-time and power-efficient analysis of high-resolution frames on mobile devices but also improves the overall accuracy by 146% compared to baseline DBT frameworks.

Learning for Crowdsourcing: Online Dispatch for Video Analytics with Guarantee

Yu Chen, Sheng Zhang, Yibo Jin and Zhuzhong Qian (Nanjing University, China); Mingjun Xiao (University of Science and Technology of China, China); Ning Chen and Zhi Ma (Nanjing University, China)

1
Crowdsourcing enables a paradigm in which recruited workers conduct manual annotation and analytics, with their rewards tied to the quality of the results. Existing dispatchers fail to capture the resource-quality trade-off for video analytics, since the configurations supported by various workers are different, and the workers' availability is essentially dynamic. To determine the most suitable configurations as well as workers for video analytics, we formulate a non-linear mixed program over a long-term scope, maximizing the profit for the crowdsourcing platform. Based on previous results under various configurations and workers, we design an algorithm via a series of subproblems to decide the configurations adaptively upon the prediction of the worker rewards. Such prediction is based on a volatile multi-armed bandit to capture the workers' availability and stochastic changes in resource use. Via rigorous proof, a regret bound is ensured upon the Lyapunov optimization and the bandit, measuring the gap between the online decisions and the offline optimum. Extensive trace-driven experiments show that our algorithm improves the platform profit by 37%, compared with other algorithms.

Session Chair

Zhenjiang Li (City University of Hong Kong)

Session G-8

Networks Protocols 1

Conference
12:00 PM — 1:30 PM EDT
Local
May 5 Thu, 12:00 PM — 1:30 PM EDT

Add/Drop Flexibility and System Complexity Tradeoff in ROADM Designs

Lexin Pan (Shanghai Jiao Tong University, China); Tong Ye (Shanghai JiaoTong University, China)

0
As a key component in dynamic wavelength-routing optical networks (WRON), the contention performance at add/drop (A/D) ports of ROADMs has attracted a lot of attention in recent years. For the first time, this paper derives closed-form solutions of the blocking probability (BP) of A/D requests to characterize the fundamental tradeoff between A/D flexibility and system complexity in the ROADM. We show that the ROADM with fiber cross-connect (FXC) can decide whether a request can be satisfied based on global transceiver-usage information, and thus an exact expression of the BP can be obtained. In comparison, the ROADM without FXC needs detailed transceiver-usage information to make decisions, and is thus hard to analyze. To circumvent the difficulty in analysis, we describe the evolution of the number of busy transceivers using a one-dimensional Markov chain, where the state transition rates are estimated from global transceiver-usage information. Based on this model, we obtain an approximate BP whose accuracy is high and increases with the number of drop ports. Our analytical results provide an interesting insight into the tradeoff between A/D flexibility and system complexity, based on which we give some suggestions for system design.
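
A one-dimensional Markov chain over the number of busy transceivers is a birth-death chain, so its stationary distribution, and hence an approximate blocking probability, follows from detailed balance. The sketch below shows the computation for constant arrival and per-transceiver service rates, an Erlang-B-style simplification of the state-dependent rates estimated from global transceiver-usage information in the paper.

    def blocking_probability(arrival, service, n_transceivers):
        """Stationary blocking probability of a birth-death chain.

        State k = number of busy transceivers. A request arriving in state
        n_transceivers is blocked (Erlang-B-like approximation).
        """
        # Unnormalized stationary probabilities via detailed balance:
        # pi_k = pi_0 * prod_{j=1..k} arrival / (j * service)
        pi = [1.0]
        for k in range(1, n_transceivers + 1):
            pi.append(pi[-1] * arrival / (k * service))
        return pi[-1] / sum(pi)

    print(blocking_probability(arrival=4.0, service=1.0, n_transceivers=6))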

Detecting and Resolving PFC Deadlocks with ITSY Entirely in the Data Plane

Xinyu Crystal Wu and T. S. Eugene Ng (Rice University, USA)

0
The Priority-based Flow Control (PFC) protocol is adopted to guarantee zero packet loss in many high-performance data centers. PFC, however, can induce deadlocks and in severe cases cause the entire network to be blocked. Existing solutions have focused on deadlock avoidance; unfortunately, they are not foolproof. Therefore, deadlock detection is a necessity. We propose ITSY, a novel system that correctly detects and resolves deadlocks entirely in the data plane. It works with any network topologies and routing algorithms. Unique to ITSY is the use of deadlock initial triggers, which contributes to efficient deadlock detection, mitigation, and recurrence prevention. ITSY provides three deadlock resolution mechanisms with different trade-off options. We implement ITSY for programmable switches in the P4 language. Experiments show that ITSY detects and resolves deadlocks rapidly with minimal overheads.

Mousika: Enable General In-Network Intelligence in Programmable Switches by Knowledge Distillation

Guorui Xie (Tsinghua University, China); Qing Li (Peng Cheng Laboratory, China); Yutao Dong and Guanglin Duan (Tsinghua University, China); Yong Jiang (Graduate School at Shenzhen, Tsinghua University, China); Jingpu Duan (Southern University of Science and Technology, China)

0
Given the power efficiency and Tbps throughput of packet processing, several works have proposed to offload the decision tree (DT) to programmable switches, i.e., in-network intelligence. Though the DT is suitable for the switches' match-action paradigm, it has several limitations. For example, its range match rules may not be well supported due to hardware diversity, and its implementation consumes substantial switch resources (e.g., stages and memory). Moreover, as learning algorithms (particularly deep learning) have shown their superior performance, more complicated learning models are emerging for networking. However, their high computational complexity and large storage requirements pose challenges for deployment on switches. Therefore, we propose Mousika, an in-network intelligence framework that addresses these drawbacks successfully. First, we modify the DT into the Binary Decision Tree (BDT). Compared with the DT, our BDT supports faster training, generates fewer rules, and satisfies switch constraints better. Second, we introduce teacher-student knowledge distillation in Mousika, which enables the general translation from other learning models to the BDT. Through the translation, we can not only utilize the superior learning capabilities of complicated models, but also avoid the computation/memory constraints when deploying them on switches directly for line-speed processing.
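
The teacher-student translation described here, relabeling traffic with a strong model and fitting a compact tree to those labels, can be sketched with scikit-learn as below; the random-forest teacher, the depth limit, and the synthetic flow features are assumptions standing in for the paper's BDT and switch constraints.

    # pip install scikit-learn
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)
    X = rng.random((2000, 8))                       # toy flow features
    y = (X[:, 0] + X[:, 3] > 1.0).astype(int)       # toy "malicious" label

    teacher = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

    # Distillation: relabel (possibly unlabeled) traffic with the teacher,
    # then fit a small tree that a switch pipeline could hold as match rules.
    X_unlabeled = rng.random((5000, 8))
    soft_labels = teacher.predict(X_unlabeled)
    student = DecisionTreeClassifier(max_depth=5, random_state=0)
    student.fit(X_unlabeled, soft_labels)

    print("agreement with teacher:",
          (student.predict(X_unlabeled) == soft_labels).mean())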

Persistent Items Tracking in Large Data Streams Based on Adaptive Sampling

Lin Chen (Sun Yat-sen University, China); Raphael C.-W. Phan (Monash University, Malaysia); Zhili Chen (East China Normal University, China); Dan Huang (University of Central Florida, USA)

0
We address the problem of persistent item tracking in large-scale data streams. A persistent item is one that keeps occurring in the stream over a long timespan. Tracking persistent items is an important and pivotal functionality for many networking and computing applications because persistent items, though not necessarily contributing significantly to the data volume, may convey valuable information about the data pattern of the stream. State-of-the-art solutions for tracking persistent items require knowing the monitoring time horizon in order to set the sampling rate. This limitation is further accentuated when we need to track the persistent items in the recent w slots, where w can be any value between 0 and T, to support different monitoring granularities.
Motivated by this limitation, we develop a persistent item tracking algorithm that can function without knowing the monitoring time horizon beforehand, and can thus track persistent items up to the current time t or within a certain time window at any moment. Our central technique is to adaptively reduce the sampling rate such that the total memory overhead is limited while the target tracking accuracy is still met. Through both theoretical and empirical analysis, we fully characterize the performance of our proposition.
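
The core idea, decaying the sampling rate over time while applying an inverse-probability correction so persistence estimates stay comparable, can be sketched as below; the 1/sqrt(t) decay and the toy stream are illustrative choices, not the paper's schedule or accuracy guarantee.

    import random
    from collections import defaultdict

    def track_persistent(stream, base_rate=1.0):
        """stream: iterable of (slot, item). Returns estimated persistence per item.

        An item observed in slot t is recorded with probability p_t that decays
        over time; dividing each hit by p_t keeps the estimate roughly unbiased
        while limiting memory.
        """
        est = defaultdict(float)
        seen_this_slot = set()
        for slot, item in stream:
            p_t = base_rate / (1 + slot) ** 0.5     # adaptive (decaying) rate
            if (slot, item) not in seen_this_slot and random.random() < p_t:
                est[item] += 1.0 / p_t              # inverse-probability correction
                seen_this_slot.add((slot, item))
        return est

    random.seed(0)
    stream = [(t, "A") for t in range(100)] + [(t, "B") for t in range(0, 100, 10)]
    print(sorted(track_persistent(stream).items()))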

Session Chair

Damla Turgut (University of Central Florida)

Session Break-2-May5

Virtual Lunch Break

Conference
1:30 PM — 2:30 PM EDT
Local
May 5 Thu, 1:30 PM — 2:30 PM EDT

Session A-9

Blockchain

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

Blockchain Based Non-repudiable IoT Data Trading: Simpler, Faster, and Cheaper

Fei Chen, Jiahao Wang and Changkun Jiang (Shenzhen University, China); Tao Xiang (Chongqing University, China); Yuanyuan Yang (Stony Brook University, USA)

1
Next-generation wireless technology and machine-to-machine technology can provide the ability to connect and share data at any time among IoT smart devices. However, the traditional centralized data sharing/trading mechanism lacks a trust guarantee and cannot satisfy real-time requirements. Distributed systems, especially blockchains, provide promising solutions. In this paper, we propose a blockchain-based non-repudiation scheme for IoT data trading to address the credibility and real-time limitations. The proposed scheme has two parts, i.e., a trading scheme and an arbitration scheme. The trading scheme employs a divide-and-conquer method and two commitment methods to support efficient IoT data trading, and runs in a two-round manner. The arbitration scheme first leverages a smart contract to resolve disputes on-chain in real time. In case of dissatisfaction with on-chain arbitration, the scheme also employs off-line arbitration to reach a final resolution. Short-term and long-term analyses show that the proposed scheme enforces non-repudiation among the data trading parties and runs efficiently for rational data owners and buyers. We implemented the proposed scheme. Experimental results confirm that the proposed scheme achieves an orders-of-magnitude performance speedup over the state-of-the-art scheme.
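
The commitment ingredient of a two-round trading scheme can be illustrated with a plain hash commitment: the seller publishes a digest first and reveals the preimage (e.g., a decryption key) after payment, so an arbitration contract can later check the reveal. The commit/reveal flow below is a generic simplification for illustration, not the paper's exact on-chain protocol.

    import hashlib, secrets

    def commit(data: bytes):
        """Seller commits to the data (or a decryption key) before payment."""
        nonce = secrets.token_bytes(16)
        digest = hashlib.sha256(nonce + data).hexdigest()
        return digest, nonce        # digest is published; nonce is kept until reveal

    def verify(digest: str, nonce: bytes, data: bytes) -> bool:
        """Buyer (or an arbitration contract) checks the reveal against the commitment."""
        return hashlib.sha256(nonce + data).hexdigest() == digest

    key = b"aes-key-used-to-encrypt-the-iot-data"
    c, n = commit(key)              # round 1: commitment goes on-chain
    print(verify(c, n, key))        # round 2: reveal; a contract can arbitrate disputes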

BrokerChain: A Cross-Shard Blockchain Protocol for Account/Balance-based State Sharding

Huawei Huang, Xiaowen Peng, Jianzhou Zhan, Shenyang Zhang and Yue Lin (Sun Yat-Sen University, China); Zibin Zheng (School of Data and Computer Science, Sun Yat-sen University, China); Song Guo (The Hong Kong Polytechnic University, Hong Kong)

1
State-of-the-art blockchain sharding solutions, such as Monoxide, can induce imbalanced transaction (TX) distributions among all blockchain shards due to their account deployment mechanisms. Imbalanced TX distributions can then cause hot shards, in which cross-shard TXs may experience unbounded confirmation latency. Thus, how to address the hot-shard issue and how to reduce cross-shard TXs become significant challenges in the context of blockchain state sharding. Through reviewing the related studies, we find that a cross-shard TX protocol that can achieve workload balance among all blockchain shards and simultaneously reduce the number of cross-shard TXs is still absent from the literature. To this end, we propose BrokerChain, a cross-shard blockchain protocol devised for account/balance-based state sharding. Essentially, BrokerChain exploits fine-grained state partition and account segmentation. We also elaborate on how BrokerChain handles cross-shard TXs through broker accounts. The security issues and other properties of BrokerChain are analyzed in depth. Finally, we conduct comprehensive evaluations using both a real cloud-based prototype and a transaction-driven simulator. The evaluation results show that BrokerChain outperforms the state-of-the-art solutions in terms of system throughput, transaction confirmation latency, the queue size of the transaction pool, and workload balance.

S-Store: A Scalable Data Store towards Permissioned Blockchain Sharding

Xiaodong Qi (East China Normal University, China)

1
The sharding technique, which divides the whole network into multiple disjoint groups or committees, has been recognized as a revolutionary solution to enhance the scalability of blockchains. For the account-based model, state data are partitioned over all committees and organized as Merkle trees to ensure data consistency and immutability. However, existing techniques on Merkle tree-based state storage fail to scale out due to the large network and compute overheads incurred by data migration and Merkle tree reconstruction, respectively. In this paper, we propose S-Store, a scalable data storage technique towards permissioned blockchain sharding based on the Aggregate Merkle B+ tree (AMB-tree). S-Store utilizes consistent hashing to reduce data migration among committees and uses split and merge operations on the AMB-tree to decrease Merkle tree reconstruction overheads. S-Store also employs a novel committee addition protocol that guarantees the system's service availability during data migration. Extensive experiments show that S-Store outperforms existing techniques by one order of magnitude in terms of transaction execution, data transmission, and committee addition.
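
Consistent hashing is what limits data migration when a committee is added: only the accounts that map to the new committee move. The hash-ring sketch below illustrates this generically; the virtual-node count, SHA-256 placement, and toy account set are assumptions, and S-Store's actual placement over AMB-trees is more involved.

    import bisect, hashlib

    class ConsistentHashRing:
        def __init__(self, committees, vnodes=100):
            self.ring = sorted(
                (self._h(f"{c}#{v}"), c) for c in committees for v in range(vnodes))
            self.keys = [h for h, _ in self.ring]

        @staticmethod
        def _h(s):
            return int(hashlib.sha256(s.encode()).hexdigest(), 16)

        def committee_of(self, account):
            i = bisect.bisect(self.keys, self._h(account)) % len(self.ring)
            return self.ring[i][1]

    accounts = [f"0xacc{i}" for i in range(10000)]
    before = ConsistentHashRing(["C0", "C1", "C2"])
    after = ConsistentHashRing(["C0", "C1", "C2", "C3"])   # one committee added
    moved = sum(before.committee_of(a) != after.committee_of(a) for a in accounts)
    print(f"accounts migrated: {moved / len(accounts):.1%}")   # roughly 1/4, not 100%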

Optimal Oblivious Routing for Structured Networks

Sucha Supittayapornpong (Vidyasirimedhi Institute of Science and Technology, Thailand); Pooria Namyar (University of Southern California, USA); Mingyang Zhang (University of Science and Technology of China, China); Minlan Yu (Harvard University, USA); Ramesh Govindan (University of Southern California, USA)

1
Oblivious routing distributes traffic from sources to destinations following predefined routes with rules independent of traffic demands. While finding optimal oblivious routing is intractable for general topologies, we show it is tractable for structured topologies often used in datacenter networks. In this work, we apply graph automorphism and prove the existence of an optimal automorphism-invariant solution. This result reduces the search space to the optimal automorphism-invariant solution. We design an iterative algorithm to obtain such a solution by alternating between two linear programs. The first program finds an automorphism-invariant solution based on representative variables and constraints, making the problem tractable. The second program generates adversarial demands to ensure the final result satisfies all possible traffic demands. The constructions of the representative variables and constraints are combinatorial problems, so we design polynomial-time algorithms for them. The proposed iterative algorithm is evaluated in terms of throughput performance, scalability, and generality over three potential applications. The algorithm i) improves throughput by up to 87.5% over a heuristic algorithm for a partially deployed FatTree, ii) scales to FatClique with a thousand switches, and iii) is applicable to a general structured topology with non-uniform link capacity and server distribution.

Session Chair

Guiling Wang (New Jersey Institute of Technology)

Session B-9

Graph Machine Learning

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

MalGraph: Hierarchical Graph Neural Networks for Robust Windows Malware Detection

Xiang Ling (Institute of Software, Chinese Academy of Sciences & Zhejiang University, China); Lingfei Wu (JD.COM Silicon Valley Research Center, USA); Wei Deng, Zhenqing Qu, Jiangyu Zhang and Sheng Zhang (Zhejiang University, China); Tengfei Ma (IBM T. J. Watson Research Center, USA); Bin Wang (Hangzhou Hikvision Digital Technology Co., Ltd, China); Chunming Wu (College of Computer Science, Zhejiang University, China); Shouling Ji (Zhejiang University, China & Georgia Institute of Technology, USA)

0
With ever-increasing malware threats, malware detection plays an indispensable role in protecting information systems. Although tremendous research efforts have been made, two key challenges still hinder existing methods from accurately and robustly detecting malware. First, most of them represent executable programs with shallow features but ignore their semantic and structural information. Second, these methods are based on representations that can be easily modified by attackers and thus cannot provide robustness against adversarial attacks. To tackle these challenges, we present MalGraph, which first represents executable programs with hierarchical graphs and then uses an end-to-end learning framework based on graph neural networks for malware detection. In particular, the hierarchical graph consists of a function call graph that captures the interaction semantics among different functions (at the inter-function level) and corresponding control-flow graphs for learning the structural semantics of each function (at the intra-function level). We argue that the abstraction and hierarchical nature of such graphs makes them not only effective at capturing rich structural information of executables, but also resilient to adversarial attacks. Experimental results demonstrate that MalGraph not only outperforms state-of-the-art malware detection methods, but also exhibits stronger robustness against adversarial attacks by a large margin.

Nadege: When Graph Kernels meet Network Anomaly Detection

Hicham Lesfari (Université Côte d'Azur, France); Frederic Giroire (CNRS, France)

0
With the continuous growing level of dynamicity, heterogeneity and complexity of traffic data, anomaly detection remains one of the most critical tasks to ensure an efficient and flexible management of a network. Recently, driven by their empirical success in many domains, especially bioinformatics and computer vision, graph kernels have attracted increasing attention. Our work aims at investigating their discrimination power for detecting vulnerabilities and distilling traffic in the field of networking. In this paper, we propose Nadege, a new graph-based learning framework which aims at preventing anomalies from disrupting the network while providing assistance for traffic monitoring. Specifically, we design a graph kernel tailored for network profiling by leveraging propagation schemes which regularly adapt to contextual patterns. Moreover, we provide provably efficient algorithms and consider both offline and online detection policies. Finally, we demonstrate the potential of kernel-based models by conducting extensive experiments on a wide variety of network environments. Under different usage scenarios, Nadege significantly outperforms all baseline approaches.

RouteNet-Erlang: A Graph Neural Network for Network Performance Evaluation

Miquel Ferriol-Galmés (Universitat Politècnica de Catalunya, Spain); Krzysztof Rusek (AGH University of Science and Technology, Poland); Jose Suarez-Varela (Universitat Politècnica de Catalunya, Spain); Shihan Xiao and Xiang Shi (Huawei Technologies, China); Xiangle Cheng (University of Exeter, United Kingdom (Great Britain)); Bo Wu (Huawei Technologies, China); Pere Barlet-Ros and Albert Cabellos-Aparicio (Universitat Politècnica de Catalunya, Spain)

1
Network modeling is a fundamental tool in network research, design, and operation. Arguably the most popular method for modeling is Queuing Theory (QT). Its main limitation is that it imposes strong assumptions on the packet arrival process, which typically do not hold in real networks. In the field of Deep Learning, Graph Neural Networks (GNN) have emerged as a new technique to build data-driven models that can learn complex and non-linear behavior. In this paper, we present RouteNet-Erlang, a pioneering GNN architecture designed to model computer networks. RouteNet-Erlang supports complex traffic models, multi-queue scheduling policies, and routing policies, and can provide accurate estimates in networks not seen in the training phase. We benchmark RouteNet-Erlang against a state-of-the-art QT model, and our results show that it outperforms QT in all the network scenarios.

xNet: Improving Expressiveness and Granularity for Network Modeling with Graph Neural Networks

Mowei Wang, Linbo Hui and Yong Cui (Tsinghua University, China); Ru Liang (Huawei Technologies Co., Ltd., China); Zhenhua Liu (Huawei Technologies, China)

0
Today's networks are notorious for their complexity and uncertainty. Network operators often rely on network models to achieve efficient network planning, operation, and optimization. The network model is tasked with understanding the complex relationships between network performance metrics (e.g., latency) and network characteristics (e.g., traffic). However, we still lack a systematic approach to building accurate and lightweight network models that can model the influence of network configurations (i.e., expressiveness) and support fine-grained flow-level temporal predictions (i.e., granularity).

In this paper, we propose xNet, a data-driven network modeling framework based on graph neural networks (GNN). Unlike previous proposals, xNet is not a dedicated network model designed for specific network scenarios with particular constraint considerations. On the contrary, xNet provides a general approach to model the network characteristics of concern with graph representations and configurable GNN blocks. xNet learns the state transition function between time steps and rolls it out to obtain the full fine-grained prediction trajectory. We implement and instantiate xNet with three use cases. The experimental results show that xNet can accurately predict different performance metrics while achieving up to 100x speedup compared with a conventional packet-level simulator.

Session Chair

Qinghua Li (University of Arkansas)

Session C-9

Machine Learning

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

ABS: Adaptive Buffer Sizing via Augmented Programmability with Machine Learning

Jiaxin Tang, Sen Liu and Yang Xu (Fudan University, China); Zehua Guo (Beijing Institute of Technology, China); Junjie Zhang (Fortinet, Inc., USA); Peixuan Gao (Fudan University, USA & New York University, USA); Yang Chen and Xin Wang (Fudan University, China); H. Jonathan Chao (NYU Tandon School of Engineering, USA)

0
Programmable switches have been proposed in today's networks to enable flexible reconfiguration of devices and reduce time-to-deployment. Buffer sizing, an important factor for network performance, however, has not received enough attention in programmable networks. State-of-the-art buffer sizing solutions usually either employ a fixed buffer size or adjust the buffer size heuristically. Without programmability, they suffer from either massive packet drops or large queueing delay in dynamic environments. In this paper, we propose Adaptive Buffer Sizing (ABS), a low-cost and deployment-friendly framework compatible with programmable networks. By decoupling the data plane and control plane, ABS-capable switches only need to react to the actions from the controller, optimizing network performance at run-time under dynamic traffic. Meanwhile, actions can be programmed by particular Machine Learning (ML) models in the controller to meet different network requirements. We address two specific ML models: a reinforcement learning model for relatively stable networks with user-specific quality requirements, and a supervised learning model for highly dynamic network scenarios. We implement the ABS framework by integrating the prevalent network simulator NS-2 with an ML module. Experiments show that ABS outperforms state-of-the-art buffer sizing solutions by up to 38.23x under various network environments.

Network Link Weight Setting: A Machine Learning Based Approach

Murali Kodialam (Nokia Bell Labs, USA); T. V Lakshman (Bell Labs, Nokia, USA)

0
Several key Internet routing protocols like OSPF and IS-IS use shortest path routing to transport traffic from the ingress node to the egress node. These shortest paths are computed with respect to the weights on the links in the underlying connectivity graph. Since the routed paths depend on the link weights, a fundamental problem in network routing is to determine the set of weights that minimizes congestion in the network. This is an NP-hard combinatorial optimization problem, and several heuristics have been developed to determine the set of link weights that minimizes congestion. In this paper, we develop a smoothed version of the weight setting problem and use gradient descent in the PyTorch framework to derive approximate solutions to this problem. We show the improvement in performance compared to traditional approaches on several benchmark networks.
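
A tiny version of the smoothed weight-setting idea, making link utilization differentiable in the link weights and descending on it with PyTorch, is sketched below on a three-node triangle; the softmin path splitting, the temperature, and the toy demand are assumptions for illustration, not the paper's formulation.

    # pip install torch
    import torch

    # Toy triangle network: demand of 10 units from A to C, two candidate paths.
    # Path 0: A-C direct;  Path 1: A-B-C (two hops).
    capacity = torch.tensor([8.0, 12.0, 12.0])      # links: A-C, A-B, B-C
    demand = 10.0
    w = torch.ones(3, requires_grad=True)           # link weights to optimize

    opt = torch.optim.Adam([w], lr=0.05)
    for step in range(300):
        path_cost = torch.stack([w[0], w[1] + w[2]])
        # Smoothed shortest-path routing: softmin split instead of all-or-nothing.
        split = torch.softmax(-path_cost / 0.1, dim=0)
        load = torch.stack([split[0] * demand,      # A-C carries path-0 traffic
                            split[1] * demand,      # A-B carries path-1 traffic
                            split[1] * demand])     # B-C carries path-1 traffic
        util = load / capacity
        loss = torch.logsumexp(util * 20.0, dim=0) / 20.0   # smooth max utilization
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            w.clamp_(min=0.1)                       # keep weights positive

    print(w.detach(), (load / capacity).detach())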

NeuroMessenger: Towards Error Tolerant Distributed Machine Learning Over Edge Networks

Song Wang (University of California San Diego, USA); Xinyu Zhang (University of California San Diego & University of Wisconsin-Madison, USA)

1
Despite the evolution of distributed machine learning (ML) systems in recent years, the communication overhead induced by their data transfers remains a major issue that hampers the efficiency of such systems, especially in edge networks with poor wireless link conditions. In this paper, we propose to explore a new paradigm of error-tolerant distributed ML to mitigate the communication overhead. Unlike generic network traffic, ML data exhibits an intrinsic error tolerance which helps the model yield fair performance even with errors in the data transfers. We first characterize the error tolerance capability of state-of-the-art distributed ML frameworks. Based on the observations, we propose NeuroMessenger, a lightweight mechanism that can be built into the cellular network stack, which can enhance and utilize the error tolerance in ML data to reduce communication overhead. NeuroMessenger does not require per-model profiling and is transparent to the application layer, which simplifies development and deployment. Our experiments on a 5G simulation framework demonstrate that NeuroMessenger reduces the end-to-end latency by up to 99% while maintaining low accuracy loss under various link conditions.

Real-time Machine Learning for Symbol Detection in MIMO-OFDM Systems

Yibin Liang, Lianjun Li, Yang (Cindy) Yi and Lingjia Liu (Virginia Tech, USA)

0
Recently, there has been renewed interest in applying machine learning (ML) techniques to wireless systems. However, ML-based approaches often require a large amount of training data, and prior ML-based symbol detectors usually adopt off-line learning approaches that are not applicable to real-time processing. In this paper, echo state network (ESN), a prominent type of reservoir computing (RC), is applied to the symbol detection task in MIMO-OFDM systems. Our RC-based approach avoids the model-mismatch problem in traditional model-based receivers and consistently achieves better performance. Furthermore, two new ESN training methods, namely recursive least squares (RLS) and generalized adaptive weighted recursive least squares (GAW-RLS), are introduced, together with a decision feedback mechanism to further improve the training procedure. A simulation study shows that the proposed methods can achieve better performance than previous conventional and ML-based symbol detectors. Finally, the effectiveness of our RC-based approach is validated with a software-defined radio (SDR) transceiver and extensive field tests in various real-world scenarios. To the best of our knowledge, this is the first report of a real-time SDR implementation of an ML-based MIMO-OFDM symbol detector. Our work provides a strong indication that ML-based signal processing could be a promising and key approach for future wireless networks.
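
The recursive least squares training mentioned here updates the readout weights sample by sample, which is what enables real-time operation. Below is a generic RLS readout update on synthetic data; the forgetting factor and feature dimension are illustrative assumptions, and the ESN reservoir that would generate the features is omitted.

    import numpy as np

    class RLSReadout:
        """Recursive least squares for a linear readout y ~ w @ x."""
        def __init__(self, dim, lam=0.999, delta=100.0):
            self.w = np.zeros(dim)
            self.P = delta * np.eye(dim)     # inverse correlation matrix estimate
            self.lam = lam                   # forgetting factor

        def update(self, x, d):
            """One online step with feature vector x and desired output d."""
            Px = self.P @ x
            k = Px / (self.lam + x @ Px)             # gain vector
            err = d - self.w @ x
            self.w += k * err
            self.P = (self.P - np.outer(k, Px)) / self.lam
            return err

    # Toy run: learn a fixed linear map from streaming samples.
    rng = np.random.default_rng(0)
    true_w = rng.normal(size=8)
    rls = RLSReadout(dim=8)
    for _ in range(500):
        x = rng.normal(size=8)
        rls.update(x, true_w @ x + 0.01 * rng.normal())
    print(np.abs(rls.w - true_w).max())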

Session Chair

Tony T. Luo (Missouri University of Science and Technology)

Session D-9

Reinforcement Learning

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

Cost Effective MLaaS Federation: A Combinatorial Reinforcement Learning Approach

Shuzhao Xie and Yuan Xue (Tsinghua University, China); Yifei Zhu (Shanghai Jiao Tong University, China); Zhi Wang (Tsinghua University, China)

0
With the advancement of deep learning techniques, major cloud providers and niche machine learning service providers have started to offer their cloud-based machine learning tools, also known as machine learning as a service (MLaaS), to the public. According to our measurement, for the same task, these MLaaS offerings from different providers have varying performance due to proprietary datasets, models, etc. Federating different MLaaS providers together allows us to improve the analytic performance further. However, naively aggregating results from different MLaaS providers not only incurs significant monetary cost but also may lead to sub-optimal performance gain due to the introduction of possible false positive results. In this paper, we propose Armol, a framework to federate the right selection of MLaaS providers to achieve the best possible analytic performance. We first design a word grouping algorithm to unify the output labels across different providers. We then present a deep combinatorial reinforcement learning based approach to maximize the accuracy while minimizing the cost. The predictions from the selected providers are then aggregated together using carefully chosen ensemble strategies. The real-world trace-driven evaluation further demonstrates that Armol achieves the same accuracy with 67% less inference cost.

Landing Reinforcement Learning onto Smart Scanning of The Internet of Things

Jian Qu and Xiaobo Ma (Xi'an Jiaotong University, China); Wenmao Liu and Hongqing Sang (NSFOCUS Inc., China); Jianfeng Li (Xi'an Jiaotong University, China); Lei Xue and Xiapu Luo (The Hong Kong Polytechnic University, Hong Kong); Zhenhua Li (Tsinghua University, China); Li Feng (Center of Dependable and Secure Computing (CDSC), China); Xiaohong Guan (Xi'an Jiaotong University & Tsinghua University, China)

0
Cyber search engines, such as Shodan and Censys, have gained popularity due to their strong capability of indexing the Internet of Things (IoT). They actively scan and fingerprint IoT devices for unearthing IP-device mapping. Because of the large address space of the Internet and the mapping's mutative nature, efficiently tracking the evolution of IP-device mapping with a limited budget of scans is essential for building timely cyber search engines. An intuitive solution is to use reinforcement learning to schedule more scans to networks with high churn-rates of IP-device mapping, which can be learned from historical scanning records. However, such an intuitive solution has never been systematically studied. In this paper, we take the first step toward demystifying this problem based on our experiences in maintaining a global IoT scanning platform. Inspired by the measurement study of large-scale real-world IoT scan records, we land reinforcement learning onto a system capable of smartly scanning IoT devices in a principled way. Through extensive experiments, we disclose key parameters affecting the effectiveness of different scanning strategies, and find that our system would achieve growing advantages with the proliferation of IoT devices.

Multi-Agent Distributed Reinforcement Learning for Making Decentralized Offloading Decisions

Jing Tan (Paderborn University, Germany); Ramin Khalili (Huawei Technologies, Germany); Holger Karl (Hasso Plattner Institute, Germany); Artur Hecker (Huawei, Germany)

0
We formulate computation offloading as a decentralized decision-making problem with selfish agents. We design an interaction mechanism that incentivizes agents to align selfish and system goals by balancing between competition and cooperation. The mechanism provably has Nash equilibria with optimal resource allocation in the static case. For a dynamic environment, we propose a novel multi-agent online learning algorithm that learns with partial, delayed and noisy state information, and a reward signal that reduces information need to a great extent. Empirical results confirm that through learning, agents significantly improve both system and individual performance, e.g., 40% offloading failure rate reduction, 32% communication overhead reduction, up to 38% computation resource savings in low contention, 18% utilization increase with reduced load variation in high contention, and improvement in fairness. Results also confirm the algorithm's good convergence and generalization property in significantly different environments.

Reinforcement Learning for Dynamic Dimensioning of Cloud Caches: A Restless Bandit Approach

Guojun Xiong and Shufan Wang (Binghamton University, USA); Gang Yan (Binghamton University-SUNY, USA); Jian Li (Binghamton University, USA)

0
We study the dynamic cache dimensioning problem, where the objective is to decide how much storage to place in the cache to minimize the total costs with respect to storage and content delivery latency. We formulate this problem as a Markov decision process, which turns out to be a restless multi-armed bandit problem and is provably hard to solve. For given dimensioning decisions, it is possible to develop solutions based on the celebrated Whittle index policy. However, the Whittle index policy has not been studied for dynamic cache dimensioning, mainly because cache dimensioning needs to be repeatedly solved and jointly optimized with content caching. To overcome this difficulty, we propose a low-complexity fluid Whittle index policy, which jointly determines dimensioning and content caching. We show that this policy is asymptotically optimal. We further develop a lightweight reinforcement learning augmented algorithm dubbed fW-UCB for the case when the content request and delivery rates are unavailable. fW-UCB is shown to achieve a sub-linear regret as it fully exploits the structure of the near-optimal fluid Whittle index policy, and hence can be easily implemented. Extensive simulations using real traces support our theoretical results.

Session Chair

Xi Chen (Samsung Electronics)

Session E-9

Networks and Monitoring

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

Accelerating Deep Learning classification with error-controlled approximate-key caching

Alessandro Finamore (HUAWEI France, France); Massimo Gallo (Huawei, France); James Roberts (Telecom ParisTech, France); Dario Rossi (Huawei Technologies, France)

1
While Deep Learning (DL) technologies are a promising tool for solving networking problems that map to classification tasks, their computational complexity is still too high with respect to real-time traffic measurement requirements. To reduce the DL inference cost, we propose a novel caching paradigm that we name approximate-key caching, which returns approximate results for lookups of selected inputs based on cached DL inference results. While approximate cache hits alleviate the DL inference workload and increase system throughput, they also introduce an approximation error. As such, we couple approximate-key caching with a principled error-correction algorithm that we name auto-refresh. We analytically model our caching system's performance for classic LRU and ideal caches, perform a trace-driven evaluation of the expected performance, and compare the benefits of our proposed approach with state-of-the-art similarity caching -- demonstrating the practical interest of our proposal.
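
A minimal sketch of the approximate-key caching idea, assuming a simple quantization of the input features as the approximate key and a random re-check of cached hits standing in for error control; the class, quantization step, and refresh probability are illustrative assumptions, not the paper's design or its auto-refresh algorithm.

```python
from collections import OrderedDict
import random

# Sketch of approximate-key caching: lookups are matched on a coarsened key,
# and a fraction of approximate hits are re-checked against the full model
# to keep the approximation error under control.
class ApproxKeyCache:
    def __init__(self, model, capacity=1024, refresh_prob=0.05):
        self.model = model                  # expensive DL classifier
        self.cache = OrderedDict()          # quantized key -> cached label
        self.capacity = capacity
        self.refresh_prob = refresh_prob

    def _quantize(self, features, step=10):
        return tuple(int(x // step) for x in features)

    def classify(self, features):
        key = self._quantize(features)
        if key in self.cache:
            self.cache.move_to_end(key)                  # approximate hit
            if random.random() < self.refresh_prob:      # occasional re-check
                self.cache[key] = self.model(features)
            return self.cache[key]
        label = self.model(features)                     # miss: full inference
        self.cache[key] = label
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)               # LRU eviction
        return label

cache = ApproxKeyCache(model=lambda f: int(sum(f)) % 2)
print(cache.classify([23.0, 57.0]), cache.classify([21.0, 59.0]))  # 2nd lookup hits
```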

Lightweight Trilinear Pooling based Tensor Completion for Network Traffic Monitoring

Yudian Ouyang and Kun Xie (Hunan University, China); Xin Wang (Stony Brook University, USA); Jigang Wen (Chinese Academy of Science & Institute of Computing Technology, China); Guangxing Zhang (Institute of Computing Technology Chinese Academy of Sciences, China)

0
Network traffic engineering and anomaly detection rely heavily on network traffic measurement. Due to the lack of infrastructure to measure all points of interest, the high measurement cost, and unavoidable transmission loss, network monitoring systems suffer from incomplete traffic data, with only a subset of paths or time slots measured. Recent studies show that tensor completion can be applied to infer the missing traffic data from partial measurements. Although promising, the interaction model adopted in current tensor completion algorithms can only capture linear and simple correlations in the traffic data, which compromises recovery performance. To solve this problem, we propose a new tensor completion scheme based on Lightweight Trilinear Pooling, which designs (1) Trilinear Pooling, a new multi-modal fusion method that models the interaction function to capture complex correlations, (2) a low-rank-decomposition-based neural network compression method that reduces storage and computation complexity, and (3) an attention-enhanced LSTM that encodes and incorporates temporal patterns into the tensor completion scheme. Extensive experiments on three real-world network traffic datasets demonstrate that our scheme can significantly reduce the error in missing data recovery, at high speed and with small storage.
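
To illustrate what a trilinear interaction over a traffic tensor looks like, the toy sketch below uses a plain CP-style element-wise product of source, destination, and time-slot embeddings fitted by SGD; it is not the paper's Lightweight Trilinear Pooling, compression, or LSTM components, and all dimensions and data are made up.

```python
import numpy as np

# Toy trilinear interaction for traffic tensor completion: each source,
# destination, and time slot gets an embedding, and a missing entry is
# predicted from the element-wise product of the three embeddings.
rng = np.random.default_rng(0)
n_src, n_dst, n_time, dim = 20, 20, 48, 8
U = rng.normal(size=(n_src, dim))    # source embeddings
V = rng.normal(size=(n_dst, dim))    # destination embeddings
W = rng.normal(size=(n_time, dim))   # time-slot embeddings

def predict(i, j, t):
    return float(np.sum(U[i] * V[j] * W[t]))   # trilinear interaction

def sgd_step(i, j, t, observed, lr=0.01):
    err = predict(i, j, t) - observed
    gU, gV, gW = err * V[j] * W[t], err * U[i] * W[t], err * U[i] * V[j]
    U[i] -= lr * gU; V[j] -= lr * gV; W[t] -= lr * gW
    return err ** 2

# Fit on a handful of observed (source, destination, time, volume) samples.
observations = [(1, 2, 3, 5.0), (4, 5, 6, 1.2), (1, 2, 4, 4.8)]
for _ in range(200):
    for i, j, t, y in observations:
        sgd_step(i, j, t, y)
print(round(predict(1, 2, 3), 2))   # approaches the observed value 5.0
```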

LossLeaP: Learning to Predict for Intent-Based Networking

Alan Collet (IMDEA Networks Institute, Spain); Albert Banchs (Universidad Carlos III de Madrid, Spain); Marco Fiore (IMDEA Networks Institute, Spain)

0
Intent-Based Networking mandates that high-level human-understandable intents are automatically interpreted and implemented by network management entities. As a key part in this process, it is required that network orchestrators activate the correct automated decision model to meet the intent objective. In anticipatory networking tasks, this requirement maps to identifying and deploying a tailored prediction model that can produce a forecast aligned with the specific -and typically complex- network management goal expressed by the original intent. Current forecasting models for network demands or network management optimize generic, non-flexible, and manually designed objectives, hence do not fulfil the needs of anticipatory Intent-Based Networking. To close this gap, we propose LossLeaP, a novel forecasting model that can autonomously learn the relationship between the prediction and the target management objective, steering the former to minimize the latter. To this end, LossLeaP adopts an original deep learning architecture that advances current efforts in automated machine learning, towards a spontaneous design of loss functions for regression tasks. Extensive experiments in controlled environments and in practical application case studies prove that LossLeaP outperforms a wide range of benchmarks, including state-of-the-art solutions for network capacity forecasting.
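
A much-simplified sketch of the loss-learning idea, assuming PyTorch and a hypothetical asymmetric management cost: a small "loss network" is fitted to the cost observed on past forecasts, and the forecaster is then trained through it rather than through a hand-written loss. This is our own illustration, not LossLeaP's architecture.

```python
import torch
import torch.nn as nn

# Rough sketch of loss-learning for objective-aware forecasting: g learns to
# approximate the management cost of a forecast, and f is trained through g.
torch.manual_seed(0)
f = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 1))   # forecaster
g = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))   # learned loss
opt_f = torch.optim.Adam(f.parameters(), lr=1e-2)
opt_g = torch.optim.Adam(g.parameters(), lr=1e-2)

def management_cost(pred, actual):
    # Hypothetical asymmetric objective: under-provisioning is 5x costlier.
    return torch.where(pred < actual, 5.0 * (actual - pred), pred - actual)

x = torch.randn(256, 4)
y = x.sum(dim=1, keepdim=True) + 0.1 * torch.randn(256, 1)

for step in range(500):
    pred = f(x)
    # 1) fit the loss network to the observed cost of the current forecasts
    cost = management_cost(pred.detach(), y)
    g_loss = ((g(torch.cat([pred.detach(), y], dim=1)) - cost) ** 2).mean()
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    # 2) update the forecaster by descending the learned loss surface
    f_loss = g(torch.cat([f(x), y], dim=1)).mean()
    opt_f.zero_grad(); f_loss.backward(); opt_f.step()
```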

Network Tomography based on Adaptive Measurements in Probabilistic Routing

Hiroki Ikeuchi (NTT Corporation, Japan); Hiroshi Saito (University of Tokyo & Mathematics and Informatics Center, Japan); Kotaro Matsuda (NTT, Japan)

0
We discuss Boolean network tomography in a probabilistic routing environment. Although the stochastic behavior of routing can be found in load balancing mechanisms and normal routing protocols, it has not been discussed much in network tomography so far. In probabilistic routing, because monitoring paths are not uniquely determined, a huge number of measurements are generally required to identify the network state. To overcome this difficulty, we propose a network tomography method for efficiently narrowing down the states with a limited number of measurements by iteratively updating the posterior of the states. In this method, we introduce mutual information as a measure of the effectiveness of the probabilistic monitoring path. This enables us to prioritize measurements that are critically effective in identifying the state, thus significantly reducing the number of required measurements. We show that our method has a theoretical guarantee of the approximation ratio (1 − 1/e) on the basis of submodularity analysis. Numerical evaluations show that our method can identify the network states with far fewer measurements than existing methods.
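
The following toy sketch shows how a probe can be chosen by mutual information against a posterior over candidate network states; the states, probe outcome probabilities, and numbers are invented for illustration and do not reproduce the paper's Boolean tomography formulation or its (1 − 1/e) guarantee.

```python
import math

# Pick the next measurement by mutual information: maintain a posterior over
# candidate network states and choose the probe whose outcome is expected to
# reduce the most uncertainty about which state holds.
def entropy(dist):
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def mutual_information(posterior, outcome_prob):
    # outcome_prob[state] = P(probe observes a failure | state)
    p_pos = sum(posterior[s] * outcome_prob[s] for s in posterior)
    mi = entropy(posterior)
    for label, p_evt in (("pos", p_pos), ("neg", 1 - p_pos)):
        if p_evt == 0:
            continue
        cond = {s: posterior[s] * (outcome_prob[s] if label == "pos"
                                   else 1 - outcome_prob[s]) / p_evt
                for s in posterior}
        mi -= p_evt * entropy(cond)
    return mi

posterior = {"link1_down": 0.4, "link2_down": 0.4, "all_up": 0.2}
candidate_probes = {
    "path_A": {"link1_down": 0.9, "link2_down": 0.1, "all_up": 0.0},
    "path_B": {"link1_down": 0.5, "link2_down": 0.5, "all_up": 0.0},
}
best = max(candidate_probes,
           key=lambda m: mutual_information(posterior, candidate_probes[m]))
print(best)   # path_A separates the two failure hypotheses better
```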

Session Chair

Hao Wang (Louisiana State University)

Session F-9

Video Streaming

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

Batch Adaptative Streaming for Video Analytics

Lei Zhang (Shenzhen University, China); Yuqing Zhang (ShenZhen University, China); Ximing Wu (Shenzhen University, China); Fangxin Wang (The Chinese University of Hong Kong, Shenzhen, China); Laizhong Cui (Shenzhen University, China); Zhi Wang (Tsinghua University, China); Jiangchuan Liu (Simon Fraser University, Canada)

0
Video streaming plays a critical role in the video analytics pipeline, and thus its adaptation scheme has been a focus of optimization. As machine learning algorithms have become the main consumers of video content, the streaming adaptation decision should be made to optimize their inference performance. Existing video streaming adaptation schemes for video analytics are usually designed to adapt to bandwidth and content variations separately, and fail to consider the coordination between transmission and computation. Given the batch nature of transmission in video streaming and of processing in deep learning-based inference, we observe that the choice of batch size directly affects the bandwidth efficiency, the response delay, and the accuracy of deep learning inference in video analytics. In this work, we investigate the effect of the batch size on transmission and processing, formulate the optimal batch size adaptation problem, and develop a deep reinforcement learning-based solution. Practical implementation issues are further addressed. Extensive simulations are conducted for performance evaluation, and the results demonstrate the superiority of our proposed batch adaptive streaming approach over the baseline streaming approaches.
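
As a back-of-the-envelope illustration of why the batch size matters, the sketch below scores candidate batch sizes with a weighted combination of response delay and an assumed per-batch-size accuracy; the numbers and weights are hypothetical, and the paper's actual solution is a deep reinforcement learning agent rather than this static cost sweep.

```python
# Score candidate batch sizes by a weighted trade-off between the response
# delay of a batch (transmission plus one inference pass) and an assumed
# accuracy for that batch size.
def batch_cost(batch_size, frame_kb, bandwidth_kbps, infer_ms_per_batch,
               accuracy, delay_weight=1.0, accuracy_weight=100.0):
    transmit_ms = batch_size * frame_kb * 8 / bandwidth_kbps * 1000
    response_ms = transmit_ms + infer_ms_per_batch   # delay seen by the last frame
    return delay_weight * response_ms - accuracy_weight * accuracy

# Hypothetical accuracies for each candidate batch size.
candidates = {1: 0.70, 4: 0.74, 8: 0.76, 16: 0.77}
best = min(candidates,
           key=lambda b: batch_cost(b, frame_kb=30, bandwidth_kbps=4000,
                                    infer_ms_per_batch=50,
                                    accuracy=candidates[b]))
print(best)
```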

CASVA: Configuration-Adaptive Streaming for Live Video Analytics

Miao Zhang (Simon Fraser University, Canada); Fangxin Wang (The Chinese University of Hong Kong, Shenzhen, China); Jiangchuan Liu (Simon Fraser University, Canada)

0
The advent of high-accuracy and resource-intensive deep neural networks (DNNs) has fueled the development of live video analytics, where camera videos need to be streamed over the network to edge or cloud servers with sufficient computational resources. Although it is promising to strike a balance between available bandwidth and server-side DNN inference accuracy by adjusting video encoding configurations, the influence of fine-grained network and video content dynamics on configuration performance must be addressed. In this paper, we propose CASVA, a Configuration-Adaptive Streaming framework designed for live Video Analytics. The design of CASVA is motivated by our extensive measurements of how video configuration affects bandwidth requirements and inference accuracy. To handle the complicated dynamics in live analytics streaming, CASVA trains a deep reinforcement learning model that makes no assumptions about the environment but learns to make configuration choices through its experience. A variety of real-world network traces are used to drive the evaluation of CASVA. The results on a multitude of video types and video analytics tasks show the advantages of CASVA over state-of-the-art solutions.

Deadline-aware Multipath Transmission for Streaming Blocks

Xutong Zuo and Yong Cui (Tsinghua University, China); Xin Wang (Stony Brook University, USA); Jiayu Yang (Beijing University of Posts and Telecommunications, China)

0
Interactive applications such as video conferencing and online gaming have deadline requirements. Compared with a single path, which may be unstable or lack sufficient bandwidth, using multiple network paths simultaneously (e.g., WiFi and cellular) makes it possible to meet such deadlines. However, existing multipath schedulers usually ignore the deadline and the influence of subsequent blocks on the current scheduling decision when multiple blocks exist at the sender. In this paper, we propose DAMS, a Deadline-Aware Multipath Scheduler that aims to deliver more blocks with heterogeneous attributes before their deadlines. DAMS carefully schedules the sending order of blocks and balances their allocation across multiple paths to reduce wasted bandwidth while taking each block's deadline into account. We implement DAMS in user space, drawing inspiration from MPQUIC. Extensive experimental results show that DAMS brings a 41%-63% performance improvement on average compared with existing multipath solutions.
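
A simplified sketch of deadline-aware multipath scheduling, using an earliest-deadline-first heuristic of our own rather than DAMS itself: each block is placed on the path that can finish it earliest, and blocks that cannot meet their deadline on any path are skipped. Block sizes, deadlines, and path rates are illustrative.

```python
# EDF-style deadline-aware multipath scheduling sketch: send the most urgent
# block first, on whichever path can finish it earliest, and skip blocks
# that can no longer meet their deadline on any path.
def schedule(blocks, path_rates_kbps):
    finish = [0.0 for _ in path_rates_kbps]        # per-path busy-until time (s)
    plan, dropped = [], []
    for name, size_kb, deadline_s in sorted(blocks, key=lambda b: b[2]):
        completions = [finish[p] + size_kb * 8 / path_rates_kbps[p]
                       for p in range(len(path_rates_kbps))]
        p = min(range(len(completions)), key=completions.__getitem__)
        if completions[p] <= deadline_s:
            finish[p] = completions[p]
            plan.append((name, p, round(completions[p], 3)))
        else:
            dropped.append(name)                   # would miss its deadline
    return plan, dropped

blocks = [("video_frame", 40, 0.10), ("audio_frame", 8, 0.06), ("fec", 20, 0.20)]
print(schedule(blocks, path_rates_kbps=[5000, 2000]))
```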

LSync: A Universal Event-synchronizing Solution for Live Streaming

Yifan Xu, Fan Dang, Rongwu Xu and Xinlei Chen (Tsinghua University, China); Yunhao Liu (Tsinghua University & The Hong Kong University of Science and Technology, China)

0
The widespread adoption of smart devices and the development of mobile networks have made live streaming services increasingly popular worldwide. In addition to video and audio, much more media content is sent to audiences as well, including player statistics for sports streams, subtitles for live news, etc. However, because live streams and other media content travel through different transmission processes, synchronizing them has become a great challenge. Unfortunately, existing commercial solutions are not universal: they require specific cloud services or CDNs and limit users' free choice of web infrastructure. To address this issue, we propose LSync, a lightweight universal event-synchronizing solution for live streaming that inserts a series of audio signals containing metadata into the original audio stream. It requires no modification to the original live broadcast process and thus fits prevalent live broadcast infrastructure. Evaluations on a real system show that the proposed solution reduces the signal processing delay to at most 5.62% of an audio buffer length on mobile phones and ensures real-time signal processing. It also achieves a data rate of 156.25 bps in a specific configuration, greatly outperforming recent works.

Session Chair

Imad Jawhar (Al Maaref University)

Session G-9

Networks Protocols 2

Conference
2:30 PM — 4:00 PM EDT
Local
May 5 Thu, 2:30 PM — 4:00 PM EDT

AoDNN: An Auto-Offloading Approach to Optimize Deep Inference for Fostering Mobile Web

Yakun Huang and Xiuquan Qiao (Beijing University of Posts and Telecommunications, China); Schahram Dustdar (Vienna University of Technology, Austria); Yan Li (Shanxi Transportation Planning Survey and Design Institute, China)

0
Employing today's deep neural networks (DNNs) in the cross-platform web through offloading has become a promising means of alleviating the tension between compute-intensive inference and limited computing resources. However, directly bringing distributed DNN execution into web apps remains challenging for three reasons: (1) it is unclear how specialized computing tasks such as DNN inference can be offloaded in a fine-grained and efficient manner within the inefficient JavaScript-based environment; (2) existing approaches lack the ability to balance latency and mobile energy when partitioning the inference to meet various web applications' requirements; and (3) they ignore that DNN inference is sensitive to the operating environment and the mobile device's computing capability, especially in dedicated web apps. This paper designs AoDNN, an automatic offloading framework that orchestrates DNN inference across the mobile web and the edge server, with three main contributions. First, we design the DNN offloading around a snapshot mechanism and use multiple threads to monitor dynamic contexts, make partition decisions, trigger offloading, and so on. Second, we provide a learning-based latency and mobile energy prediction framework that supports various web browsers and platforms. Third, we establish a multi-objective optimization that determines the optimal partition by balancing latency and mobile energy.
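
To make the partition trade-off concrete, here is a simplified sketch that sweeps candidate split points and scores each with a weighted sum of latency (device, transfer, server) and device energy; the per-layer numbers, the weighting, and the helper name are illustrative assumptions, not AoDNN's learned predictors or optimizer.

```python
# Sweep candidate split layers and score each with a weighted sum of
# end-to-end latency and on-device energy. out_kb is the data that must
# cross the network at the split point.
def choose_partition(layers, bandwidth_mbps, beta=0.5):
    # layers: list of (device_ms, device_mj, server_ms, output_kb) per layer
    best, best_cost = None, float("inf")
    for split in range(len(layers) + 1):            # split = 0 means full offload
        device_ms = sum(l[0] for l in layers[:split])
        device_mj = sum(l[1] for l in layers[:split])
        server_ms = sum(l[2] for l in layers[split:])
        out_kb = layers[split - 1][3] if split > 0 else 80.0  # assumed input size
        transfer_ms = out_kb * 8 / (bandwidth_mbps * 1000) * 1000
        cost = beta * (device_ms + transfer_ms + server_ms) + (1 - beta) * device_mj
        if cost < best_cost:
            best, best_cost = split, cost
    return best, best_cost

layers = [(12, 30, 2, 400), (20, 45, 3, 150), (25, 60, 4, 40), (18, 40, 3, 10)]
print(choose_partition(layers, bandwidth_mbps=20))
```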

Muses: Enabling Lightweight Learning-Based Congestion Control for Mobile Devices

Zhiren Zhong (University of Chinese Academy of Sciences, China & Huawei, China); Wei Wang and Yiyang Shao (Huawei, China); Zhenyu Li, Heng Pan and Hongtao Guan (Institute of Computing Technology, Chinese Academy of Sciences, China); Gareth Tyson (Queen Mary, University of London, United Kingdom (Great Britain)); Gaogang Xie (CNIC Chinese Academy of Sciences & University of Chinese Academy of Sciences, China); Kai Zheng (Huawei Technologies, China)

1
Various congestion control (CC) algorithms have been designed to target specific scenarios. To automate this process, researchers have begun to use machine learning to automatically control the congestion window. These, however, often rely on heavyweight learning models (e.g., neural networks). This can make them unsuitable for resource-constrained mobile devices. On the other hand, lightweight models (e.g., decision trees) are often incapable of reflecting the complexity of diverse mobile wireless environments. To address this, we present Muses, a learning-based approach for generating lightweight congestion control algorithms. Muses relies on imitation learning to train a universal (heavy) LSTM model, which is then used to extract (lightweight) decision tree models that are each targeted at an individual environment. Muses then dynamically selects the most appropriate decision tree on a per-flow basis. We show that Muses can generate high throughput policies across a diverse set of environments, and it is sufficiently light to operate on mobile devices.
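
A toy distillation sketch in the spirit of the teacher-to-tree step, assuming scikit-learn and a hand-written stand-in teacher policy instead of Muses' imitation-learned LSTM: network states are labelled with the teacher's congestion-window actions and a shallow decision tree is fitted to mimic them.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Label network states with a heavyweight teacher policy's cwnd actions,
# then fit a small decision tree that can run cheaply on a mobile device.
rng = np.random.default_rng(0)
states = rng.uniform(size=(5000, 3))   # [normalized RTT, loss rate, delivery rate]

def teacher_action(s):
    rtt, loss, rate = s
    if loss > 0.5 or rtt > 0.8:
        return 0        # decrease cwnd
    if rate > 0.6 and rtt < 0.4:
        return 2        # increase cwnd
    return 1            # hold

actions = np.array([teacher_action(s) for s in states])
student = DecisionTreeClassifier(max_depth=4).fit(states, actions)
print(student.score(states, actions))   # agreement with the teacher policy
```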

NMMF-Stream: A Fast and Accurate Stream-Processing Scheme for Network Monitoring Data Recovery

Kun Xie and Ruotian Xie (Hunan University, China); Xin Wang (Stony Brook University, USA); Gaogang Xie (CNIC Chinese Academy of Sciences & University of Chinese Academy of Sciences, China); Dafang Zhang (Hunan University, China); Jigang Wen (Chinese Academy of Science & Institute of Computing Technology, China)

0
Recovery of missing network monitoring data is of great significance for network operation and maintenance tasks such as anomaly detection and traffic prediction. To exploit historical data for more accurate missing data recovery, some recent studies combine the data into a tensor to learn more features. However, the need to perform high-cost data decomposition compromises their speed and accuracy, making it difficult to track dynamic features in streaming monitoring data. To ensure fast and accurate recovery of network monitoring data, this paper proposes NMMF-Stream, a stream-processing scheme with a context extraction module and a generation module. To achieve fast feature extraction and missing data filling with a low sampling rate, we propose several novel techniques, including context extraction based on both positive and negative monitoring data, context validation by measuring pointwise mutual information, GRU-based temporal feature learning and memorization, and a new composite loss function to guide fast and accurate data filling. We conduct extensive experiments using two real network traffic monitoring datasets and one network latency dataset. The experimental results demonstrate that, compared with three baselines, NMMF-Stream can fill in newly arrived monitoring data very quickly and with much higher accuracy.

PACC: Proactive and Accurate Congestion Feedback for RDMA Congestion Control

Xiaolong Zhong and Jiao Zhang (Beijing University of Posts and Telecommunications, China); Yali Zhang and Zixuan Guan (Huawei, China); Zirui Wan (Beijing University of Posts and Telecommunications, China)

0
The rapid upgrade of link speed and the prosperity of new applications in data centers create a rigorous demand for ultra-low latency and high throughput in end-to-end communication. Consequently, RDMA is adopted to mitigate the overhead caused by traditional software-based packet processing at end hosts, and congestion control mechanisms designed for RDMA have attracted much attention as a way to avoid performance deterioration when packets are lost. However, through comprehensive analysis, we find that existing RDMA congestion control schemes respond sluggishly to congestion and are unaware of tiny microbursts because of the long end-to-end control loop. In this paper, we propose PACC, a switch-driven RDMA congestion control algorithm that is easy to deploy. Benefiting from PI-controller-based computation, threshold-based flow discrimination, and weight-based allocation at the switch, PACC leverages real-time queue length to proactively generate accurate congestion feedback and piggybacks it to the corresponding source without modification to end hosts. We theoretically analyze the stability and key parameter settings of PACC. Then, we conduct both micro-benchmark and large-scale simulations to evaluate the performance of PACC. The results show that PACC achieves fairness, fast reaction, high throughput, and 6∼69% lower FCT (Flow Completion Time) than DCQCN, TIMELY and HPCC.
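
For intuition, here is a minimal PI-controller sketch of switch-side feedback computation: the feedback grows with the queue's deviation from a target occupancy and with its rate of change. The gains, target, and feedback semantics are illustrative assumptions, not PACC's analyzed parameter settings.

```python
# PI-controller feedback from instantaneous queue length: the feedback is
# updated from the queue's offset to a target occupancy and from the change
# of that offset between samples.
class PIFeedback:
    def __init__(self, target_pkts=50, kp=0.002, ki=0.0005):
        self.target = target_pkts
        self.kp, self.ki = kp, ki
        self.prev_error = 0.0
        self.feedback = 0.0            # e.g., rate-reduction factor in [0, 1]

    def update(self, queue_len_pkts):
        error = queue_len_pkts - self.target
        self.feedback += self.kp * (error - self.prev_error) + self.ki * error
        self.feedback = min(max(self.feedback, 0.0), 1.0)
        self.prev_error = error
        return self.feedback           # piggybacked back to the sender

ctrl = PIFeedback()
for q in [10, 60, 120, 200, 150, 80, 40]:   # sampled queue lengths
    print(round(ctrl.update(q), 4))
```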

Session Chair

Aaron D Striegel (University of Notre Dame)
