Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
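To make the edge-cost idea concrete, here is a minimal Python sketch of a curvilinear-space cost that trades off edge length against lateral deviation from the taught path. The state representation, the trapezoidal deviation term, and the weight `alpha` are our own illustrative assumptions, not the paper's exact metric.

```python
import math

# Hypothetical sketch of a curvilinear edge cost: states are (s, l) pairs,
# s = arc length along the taught path, l = signed lateral offset from it.
# The paper pairs such a metric with a sampling-based planner; the weight
# `alpha` trading edge length against lateral deviation is our assumption.
def edge_cost(a, b, alpha=2.0):
    s1, l1 = a
    s2, l2 = b
    length = math.hypot(s2 - s1, l2 - l1)  # Euclidean length in (s, l) space
    deviation = 0.5 * (abs(l1) + abs(l2)) * abs(s2 - s1)  # integrated |l| (trapezoid rule)
    return length + alpha * deviation

# A planner using this cost prefers edges that hug the taught path (l = 0)
# unless an obstacle forces a detour.
print(edge_cost((0.0, 0.0), (1.0, 0.0)))  # on-path edge: cost 1.0
print(edge_cost((0.0, 0.5), (1.0, 0.5)))  # offset edge: cost 2.0
```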
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and call for more
attention to benefit-risk consideration.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that phase 2-3
dose-optimization designs are able to control the familywise type I error rate
and yield appropriate statistical power with a substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
Authors: Liyun Jiang, Ying Yuan.
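To illustrate the kind of operating-characteristic simulation described, here is a toy Monte Carlo sketch of a seamless phase 2-3 design with a binary endpoint and a concurrent control. The sample sizes, response rates, selection rule, and pooled Fisher test are illustrative assumptions, not the authors' designs.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy sketch of a seamless phase 2-3 design (binary endpoint, concurrent
# control in stage 2). All sample sizes and response rates are assumptions.
def one_trial(p_doses, p_control, n1=30, n2=100, alpha=0.025):
    # Stage 1: randomize n1 patients to each dose; select the dose with the
    # highest observed response rate ("optimal" dose, toxicity ignored here).
    stage1 = [rng.binomial(n1, p) for p in p_doses]
    best = int(np.argmax(stage1))
    # Stage 2: randomized comparison of the selected dose vs. concurrent
    # control. Stage-1 data of the selected dose are pooled into the final
    # test, which is what makes familywise type I error control non-trivial.
    x_trt = stage1[best] + rng.binomial(n2, p_doses[best])
    n_trt = n1 + n2
    x_ctl = rng.binomial(n2, p_control)
    _, pval = stats.fisher_exact([[x_trt, n_trt - x_trt],
                                  [x_ctl, n2 - x_ctl]], alternative="greater")
    return pval < alpha

# Under the global null (all doses = control), the rejection rate estimates
# the familywise type I error of this naive pooling.
null_rate = np.mean([one_trial([0.2, 0.2, 0.2], 0.2) for _ in range(2000)])
print(f"empirical type I error ~ {null_rate:.3f}")
```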
We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and it is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
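As a rough illustration of the surrogate-plus-active-learning loop (omitting GPry's SVM classifier and using a generic mean-plus-uncertainty acquisition rather than GPry's specific rule), the following sketch fits a GP to a toy 1-D log-posterior and picks new evaluation points where the acquisition is largest:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(1)

# Stand-in for an expensive log-posterior; in practice each call might take
# seconds to minutes, which is where the surrogate pays off.
def log_post(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2

X = rng.uniform(-5, 5, size=(5, 1))   # small initial design
y = log_post(X).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(1.0),
                              normalize_y=True)
for _ in range(15):
    gp.fit(X, y)
    cand = np.linspace(-5, 5, 400).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    x_new = cand[np.argmax(mu + 2.0 * sd)]   # favor high / uncertain regions
    X = np.vstack([X, x_new])
    y = np.append(y, log_post(x_new)[0])

print(f"surrogate built from {len(X)} evaluations; "
      f"posterior mode ~ {X[np.argmax(y)][0]:.2f} (true 2.0)")
```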
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting. However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
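The deterministic $b$-scaling strategy can be illustrated with a small single-machine simulation. The round structure below (probe every unfinished job for $b^k$ units in round $k$, losing the work of killed jobs) is our simplified reading of the strategy for jobs without release dates.

```python
# Toy single-machine simulation of the deterministic b-scaling kill-and-restart
# rule: in round k every unfinished job is run for a budget of b**k time units;
# jobs that do not finish are killed (all progress lost) and retried in the
# next round. Processing times are unknown to the scheduler (non-clairvoyant).
def b_scaling_total_completion(proc_times, b=2.0):
    remaining = dict(enumerate(proc_times))
    t, total, k = 0.0, 0.0, 0
    while remaining:
        budget = b ** k
        for j, p in list(remaining.items()):
            if p <= budget:          # job finishes within this round's budget
                t += p
                total += t
                del remaining[j]
            else:                    # killed after `budget` units; work is lost
                t += budget
        k += 1
    return total

jobs = [1, 2, 4, 8, 16]
alg = b_scaling_total_completion(jobs)
# Clairvoyant optimum: shortest processing time first (SPT).
opt = sum((len(jobs) - i) * p for i, p in enumerate(sorted(jobs)))
print(f"b-scaling: {alg:.0f}, SPT optimum: {opt}, ratio: {alg / opt:.2f}")
```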
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
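A minimal sketch of the frozen setting, using a torchvision ResNet as a stand-in for the (much larger) Swin backbones studied in the paper; the backbone stays fixed and only a small task head is trained:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Frozen-backbone transfer: keep a pretrained image model fixed and train
# only a small task head on top.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()          # expose 2048-d features
for p in backbone.parameters():
    p.requires_grad = False          # the backbone stays frozen
backbone.eval()

head = nn.Linear(2048, 10)           # tiny task head (e.g., 10 classes)
opt = torch.optim.AdamW(head.parameters(), lr=1e-3)

x = torch.randn(4, 3, 224, 224)      # dummy batch
labels = torch.randint(0, 10, (4,))
with torch.no_grad():                # no gradients through the frozen model
    feats = backbone(x)
loss = nn.functional.cross_entropy(head(feats), labels)
loss.backward()
opt.step()
print(f"trainable params: {sum(p.numel() for p in head.parameters())}")
```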
Distributing machine learning predictors enables the collection of
large-scale datasets while leaving sensitive raw data at trustworthy sites. For a large number of participants, communication cost is one
of the main challenges. We achieve a low communication cost by requiring only a
single invocation of an efficient secure multiparty summation protocol. More generally, we prove learnability
properties for the average of such locally trained models: convergence and
uniform stability.
Distributing machine learning predictors enables the collection of
large-scale datasets while leaving sensitive raw data at trustworthy sites. We
show that locally training support vector machines (SVMs) and computing their
averages leads to a learning technique that is scalable to a large number of
users, satisfies differential privacy, and is applicable to non-trivial tasks,
such as CIFAR-10. For a large number of participants, communication cost is one
of the main challenges. We achieve a low communication cost by requiring only a
single invocation of an efficient secure multiparty summation protocol. By
relying on state-of-the-art feature extractors (SimCLR), we are able to utilize
differentially private convex learners for non-trivial tasks such as CIFAR-10.
Our experimental results illustrate that for $1{,}000$ users with $50$ data
points each, our scheme outperforms state-of-the-art scalable distributed
learning methods (differentially private federated learning, short DP-FL) while
requiring around $500$ times less communication: for CIFAR-10, we
achieve a classification accuracy of $79.7\,\%$ for an $\varepsilon = 0.59$
while DP-FL achieves $57.6\,\%$. More generally, we prove learnability
properties for the average of such locally trained models: convergence and
uniform stability. By only requiring strongly convex, smooth, and
Lipschitz-continuous objective functions, locally trained via stochastic
gradient descent (SGD), we achieve a strong utility-privacy tradeoff.
Authors: Moritz Kirschte, Sebastian Meiser, Saman Ardalan, Esfandiar Mohammadi.
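Conceptually, the scheme reduces to "train locally, average once, add noise". The sketch below illustrates that pipeline on synthetic data; plain addition stands in for the secure multiparty summation protocol, and the noise scale is an arbitrary placeholder rather than a calibrated $(\varepsilon, \delta)$ guarantee, which requires the paper's analysis.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

rng = np.random.default_rng(0)

# Each user trains a linear SVM locally; the weight vectors are averaged
# (in the paper via one secure multiparty summation) and noised for DP.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
n_users, per_user = 100, 50

weights = []
for u in range(n_users):
    sl = slice(u * per_user, (u + 1) * per_user)
    clf = LinearSVC(C=1.0, max_iter=5000).fit(X[sl], y[sl])
    weights.append(np.hstack([clf.coef_.ravel(), clf.intercept_]))

avg = np.mean(weights, axis=0)
noisy = avg + rng.normal(scale=0.05, size=avg.shape)  # DP noise (scale assumed)

w, b = noisy[:-1], noisy[-1]
acc = np.mean((X @ w + b > 0) == y)
print(f"accuracy of noisy averaged SVM: {acc:.3f}")
```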
In the past decades, tremendous efforts have been made towards understanding the exotic physics emerging from competition between various ordering tendencies in strongly correlated systems. For small values of $N$, namely $N=2,3$, the ground state is an antiferromagnetic (AFM) phase, whereas in the large-$N$ limit, valence bond solid (VBS) order is dominant. For intermediate values of $N$ such as $N=4$, remarkably, our study reveals that distinct VBS orders appear in the weak and strong coupling regimes.
In the past decades, tremendous efforts have been made towards understanding
the exotic physics emerging from competition between various ordering
tendencies in strongly correlated systems. Employing state-of-the-art quantum
Monte Carlo simulations, we investigate an interacting SU($N$) fermionic model
with varying interaction strength and value of $N$, and unveil the ground-state
phase diagram of the model, which exhibits a plethora of exotic phases. For
small values of $N$, namely $N=2,3$, the ground state is an antiferromagnetic
(AFM) phase, whereas in the large-$N$ limit, valence bond solid (VBS) order is
dominant. For intermediate values of $N$ such as $N=4$, remarkably, our study
reveals that distinct VBS orders appear in the weak and strong coupling
regimes. More strikingly, the competition between staggered and columnar VBS
ordering tendencies gives rise to a Mott insulating phase without spontaneous
symmetry breaking (SSB), existing over a large interaction parameter regime,
which is consistent with a gapped quantum spin liquid. Our study not only
provides a platform to investigate the fundamental physics of quantum many-body
systems, but also offers a novel route towards searching for exotic states of
matter such as quantum spin liquids in realistic quantum materials.
Authors: Xue-Jia Yu, Shao-Hang Shi, Limei Xu, Zi-Xiang Li.
In the present article, we aim to quantify the carbon footprint
of BLOOM, a 176-billion parameter language model, across its life cycle. We also study the energy requirements and
carbon emissions of its deployment for inference via an API endpoint receiving
user queries in real-time.
Progress in machine learning (ML) comes with a cost to the environment, given
that training ML models requires significant computational resources, energy
and materials. In the present article, we aim to quantify the carbon footprint
of BLOOM, a 176-billion parameter language model, across its life cycle. We
estimate that BLOOM's final training emitted approximately 24.7 tonnes of
CO$_2$eq if we consider only the dynamic power consumption, and 50.5 tonnes
if we account for all processes ranging from equipment manufacturing to
energy-based operational consumption. We also study the energy requirements and
carbon emissions of its deployment for inference via an API endpoint receiving
user queries in real-time. We conclude with a discussion regarding the
difficulty of precisely estimating the carbon footprint of ML models and future
research directions that can contribute towards improving carbon emissions
reporting.
Authors: Alexandra Sasha Luccioni, Sylvain Viguier, Anne-Laure Ligozat.
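The dynamic-consumption part of such an estimate is simple arithmetic: energy consumed multiplied by the grid's carbon intensity. The inputs below are illustrative assumptions chosen to be consistent with the abstract's 24.7-tonne figure (a low-carbon grid similar to France's), not the paper's exact accounting.

```python
# Back-of-the-envelope sketch: emissions = energy * grid carbon intensity.
energy_kwh = 433_000          # assumed dynamic energy of the final training run
intensity_kg_per_kwh = 0.057  # assumed grid carbon intensity (kg CO2eq / kWh)

dynamic_t = energy_kwh * intensity_kg_per_kwh / 1000   # tonnes CO2eq
print(f"dynamic emissions ~ {dynamic_t:.1f} t CO2eq")  # ~ 24.7 t

# The abstract's 50.5 t additionally folds in equipment manufacturing and
# idle/infrastructure consumption, which simple arithmetic like this omits.
```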
The attention weights on the kernels are further distilled by channel attention and multi-layer feature aggregation to learn global features from speech. This approach provides an efficient solution for improving representation capacity with fewer data resources, because the structure of the model parameters self-adapts to the input.
State-of-the-art speaker verification frameworks have typically focused on
speech enhancement techniques with increasingly deeper (more layers) and wider
(more channels) models to improve their verification performance. Instead,
this paper proposes an approach to increase the model resolution capability
using attention-based dynamic kernels in a convolutional neural network to
adapt the model parameters to be feature-conditioned. The attention weights on
the kernels are further distilled by channel attention and multi-layer feature
aggregation to learn global features from speech. This approach provides an
efficient solution for improving representation capacity with fewer data
resources, because the structure of the model parameters self-adapts to the
input. The proposed dynamic convolutional model achieved 1.62\% EER and 0.18
miniDCF on the VoxCeleb1 test set, a 17\% relative improvement over
ECAPA-TDNN.
Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain.
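For intuition, here is a generic attention-based dynamic convolution in PyTorch: $K$ candidate kernels are mixed with input-dependent attention weights, so the effective filter is feature-conditioned. This follows the general dynamic-convolution recipe, not the paper's exact architecture (which adds channel attention and multi-layer feature aggregation).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv1d(nn.Module):
    def __init__(self, c_in, c_out, k=3, n_kernels=4):
        super().__init__()
        # K candidate kernels, mixed per input by attention weights.
        self.weight = nn.Parameter(torch.randn(n_kernels, c_out, c_in, k) * 0.02)
        self.attn = nn.Sequential(              # attention over the K kernels
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(c_in, n_kernels))

    def forward(self, x):                       # x: (batch, c_in, time)
        a = F.softmax(self.attn(x), dim=-1)     # (batch, n_kernels)
        # Mix kernels per sample, then run one grouped conv over the batch.
        w = torch.einsum("bk,koit->boit", a, self.weight)
        b, _, t = x.shape
        out = F.conv1d(x.reshape(1, -1, t), w.reshape(-1, *w.shape[2:]),
                       padding="same", groups=b)
        return out.reshape(b, -1, t)

x = torch.randn(8, 64, 200)                     # batch of 8 feature sequences
print(DynamicConv1d(64, 32)(x).shape)           # torch.Size([8, 32, 200])
```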
Deep learning models for semantic segmentation are prone to poor performance
in real-world applications due to the highly challenging nature of the task. This leads to a significantly more
accurate view of model uncertainty than conventional Bayesian methods. We demonstrate these advantages through experimental evaluations
of our framework implemented over four different state-of-the-art model
architectures that are trained and evaluated on two benchmark road-scene
segmentation datasets (CamVid and Cityscapes).
Deep learning models for semantic segmentation are prone to poor performance
in real-world applications due to the highly challenging nature of the task.
Model uncertainty quantification (UQ) is one way to address this issue of lack
of model trustworthiness by enabling the practitioner to know how much to trust
a segmentation output. Current UQ methods in this application domain are mainly
restricted to Bayesian based methods which are computationally expensive and
are only able to extract central moments of uncertainty thereby limiting the
quality of their uncertainty estimates. We present a simple framework for
high-resolution predictive uncertainty quantification of semantic segmentation
models that leverages a multi-moment functional definition of uncertainty
associated with the model's feature space in the reproducing kernel Hilbert
space (RKHS). The multiple uncertainty functionals extracted from this
framework are defined by the local density dynamics of the model's feature
space and hence automatically align themselves at the tail-regions of the
intrinsic probability density function of the feature space (where uncertainty
is the highest) in such a way that the successively higher order moments
quantify the more uncertain regions. This leads to a significantly more
accurate view of model uncertainty than conventional Bayesian methods.
Moreover, the extraction of such moments is done in a single-shot computation
making it much faster than Bayesian and ensemble approaches (that involve a
high number of forward stochastic passes of the model to quantify its
uncertainty). We demonstrate these advantages through experimental evaluations
of our framework implemented over four different state-of-the-art model
architectures that are trained and evaluated on two benchmark road-scene
segmentation datasets (CamVid and Cityscapes).
Authors: Rishabh Singh, Jose C. Principe.
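A loose sketch of the underlying intuition (not the authors' framework): treat a test point's uncertainty as a function of successive moments of its kernel similarities to stored training features, so tail-region points are flagged in a single-shot computation. The RBF kernel, bandwidth, and moment definition below are our simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

# Successive moments of RBF-kernel similarities between a test point and the
# training features: points in low-density (tail) regions receive little
# kernel mass, and higher moments sharpen the distinction.
def rkhs_moments(train_feats, x, gamma=0.03, n_moments=4):
    sims = np.exp(-gamma * np.sum((train_feats - x) ** 2, axis=1))
    return np.array([np.mean(sims ** m) for m in range(1, n_moments + 1)])

feats = rng.normal(size=(1000, 16))           # stand-in for a model's features
inlier = rng.normal(size=16)                  # lies inside the training density
outlier = inlier + 6.0                        # far in the tail region

print("inlier  moments:", np.round(rkhs_moments(feats, inlier), 4))
print("outlier moments:", np.round(rkhs_moments(feats, outlier), 4))
# All extracted in one shot, with no stochastic forward passes of the model.
```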
The Covid pandemic is a clarion call for increased sensitivity to the interconnected nature of social problems facing our world today. The children learnt the Engineering Design Thinking process and worked in online groups of two or three, from concept to completion. Despite the constraints posed by the pandemic, they explored creative ways to think about design and innovation. They completed a variety of tasks by making, tinkering, engineering, assembling, and programming to grasp the intricate relationship between software and hardware. Subsequently, the children showcased their creative abilities through video storytelling to a panel of domain experts.
The Covid pandemic is a clarion call for increased sensitivity to the
interconnected nature of social problems facing our world today. A
future-oriented education on critical issues, such as those outlined in the
United Nations Sustainable Development Goals (UN SDGs) and designing potential
solutions for such problems is an imperative skill that must be imparted to
children to help them navigate their future in today's unpredictable world.
Towards this goal, we have been conducting 3.5-month-long mentoring programs
for pre-university students in India to participate in a STEAM for Social Good
innovation challenge conducted annually by the Government of India. Using
digital and physical computing skills, we helped children explore creative
solutions for social problems through a constructionist approach to learning,
wherein they ideated and reflected upon the problems in their communities. The
children learnt the Engineering Design Thinking process and worked in online
groups of two or three, from concept to completion. Despite the constraints
posed by the pandemic, they explored creative ways to think about design and
innovation. They completed a variety of tasks by making, tinkering,
engineering, assembling, and programming to grasp the intricate relationship
between software and hardware. Subsequently, the children showcased their
creative abilities through video storytelling to a panel of domain experts. In
this paper, we present the children's perspective of their experiences through
this journey, the evaluation metrics based on IEEE design principles, and our
learnings from conducting this initiative as a university-school partnership
model for 84 middle and high school students. The aspirational intent of this
initiative is to make the children better social problem solvers and help them
perceive social problems as opportunities to enhance life for themselves and
their communities.
Authors: Gayathri Manikutty, Sreejith Sasidharan, Bhavani Rao.
We report a critical narrowing of resonances of a driven potential well when
their eigenfrequencies approach the edge of the continuum. The resonances also
acquire unusual sharp-peak shapes at the continuum boundary. The narrow, sharp-peaked resonances can be used for efficient narrow-band
frequency and spatial filtering of light.
We report a critical narrowing of resonances of a driven potential well when
their eigenfrequencies approach the edge of the continuum. The resonances also
acquire unusual sharp-peak shapes at the continuum boundary. The situation can
be realized for electromagnetic waves propagating across dielectric thin
films with periodically modulated interfaces. We demonstrate the general
phenomenon semi-analytically on a simplified model of a driven quantum
potential well, and also by rigorous numerical analysis of the Maxwell
equations for wave propagation across a thin film with modulated interfaces.
We confirm the phenomenon experimentally through measurements of light
reflection from a dielectric thin film deposited on a periodically modulated
surface. The narrow, sharp-peaked resonances can be used for efficient
narrow-band frequency and spatial filtering of light.
Authors: Ignas Lukosiunas, Lina Grineviciute, Julianija Nikitina, Darius Gailevicius, Kestutis Staliunas.
Let $A$ be a Hopf algebra equipped with a projection onto the coordinate Hopf algebra $\mathcal{O}(G)$ of a semisimple algebraic group $G$.
Let $A$ be a Hopf algebra equipped with a projection onto the coordinate Hopf
algebra $\mathcal{O}(G)$ of a semisimple algebraic group $G$. It is shown that
if $A$ admits a suitably non-degenerate comodule $V$ and the induced $G$-module
structure of $V$ is non-trivial, then the third Hochschild homology group of
$A$ is non-trivial.
Authors: Tomasz Brzeziński, Ulrich Krähmer, Réamonn Ó Buachalla, Karen R. Strung.
These are the so-called
zero-group-velocity (ZGV) points. These
applications rely on the correct prediction of the ZGV points. The resulting governing equation is
interpreted as a two-parameter eigenvalue problem.
Dispersion curves of elastic waveguides exhibit points where the group
velocity vanishes while the wavenumber remains finite. These are the so-called
zero-group-velocity (ZGV) points. As the elastodynamic energy at these points
remains confined close to the source, they are of practical interest for
nondestructive testing and quantitative characterization of structures. These
applications rely on the correct prediction of the ZGV points. In this
contribution, we first model the ZGV resonances in anisotropic plates based on
the appearance of an exceptional mode. The resulting governing equation is
interpreted as a two-parameter eigenvalue problem. We then present three
complementary numerical procedures capable of computing ZGV points in arbitrary
nondissipative elastic waveguides. The first method is globally convergent and
guaranteed to find all ZGV points, but it can only be used for small problems.
The second is a very fast, generally applicable, Newton-type iteration that is
locally convergent and requires initial guesses. The third combines both kinds
of approaches, yielding a procedure that is applicable to large problems, does
not require initial guesses, and is likely to find all ZGV points.
Authors: Daniel A. Kiefer, Bor Plestenjak, Hauke Gravenkamp, Claire Prada.
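The Newton-type idea can be demonstrated on a synthetic dispersion relation: a ZGV point satisfies both $D(k,\omega)=0$ and $\partial D/\partial k=0$ (vanishing group velocity at finite wavenumber), a 2x2 root-finding problem. The dispersion relation below is invented for illustration; real elastic waveguides require the discretized operators discussed in the paper.

```python
import numpy as np

# Synthetic dispersion relation D(k, w) with a ZGV-like point at (k, w) = (1, 1).
def F(k, w):
    D = w**2 - ((k**2 - 1.0)**2 + 1.0)
    Dk = -4.0 * k * (k**2 - 1.0)          # dD/dk
    return np.array([D, Dk])

def J(k, w):                              # analytic Jacobian of (D, Dk) wrt (k, w)
    return np.array([[-4.0 * k * (k**2 - 1.0), 2.0 * w],
                     [-12.0 * k**2 + 4.0,      0.0]])

x = np.array([1.3, 1.4])                  # initial guess (k, w); Newton needs one
for _ in range(20):
    step = np.linalg.solve(J(*x), -F(*x))
    x = x + step
    if np.linalg.norm(step) < 1e-12:      # locally convergent, quadratic rate
        break

print(f"ZGV point: k = {x[0]:.6f}, w = {x[1]:.6f}")   # -> k = 1, w = 1
```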
We experiment with lilGym with different models and learning regimes. lilGym is available at https://lil.nlp.cornell.edu/lilgym/.
We present lilGym, a new benchmark for language-conditioned reinforcement
learning in visual environments. lilGym is based on 2,661 highly-compositional
human-written natural language statements grounded in an interactive visual
environment. We annotate all statements with executable Python programs
representing their meaning to enable exact reward computation in every possible
world state. Each statement is paired with multiple start states and reward
functions to form thousands of distinct Markov Decision Processes of varying
difficulty. We experiment with lilGym with different models and learning
regimes. Our results and analysis show that while existing methods are able to
achieve non-trivial performance, lilGym forms a challenging open problem.
lilGym is available at https://lil.nlp.cornell.edu/lilgym/.
Authors: Anne Wu, Kianté Brantley, Noriyuki Kojima, Yoav Artzi.
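To illustrate what "executable Python programs representing meaning" buys, here is a hypothetical sketch (not lilGym's actual API): a statement's annotation is a predicate over the world state, so an exact reward is computable in every possible state.

```python
from dataclasses import dataclass

# Hypothetical world-state representation for illustration only.
@dataclass(frozen=True)
class Item:
    color: str
    shape: str
    box: int

# Statement: "there is exactly one yellow square in the second box"
# (boxes indexed from 0, so the second box is index 1).
def meaning(state):
    return sum(1 for it in state
               if it.color == "yellow" and it.shape == "square"
               and it.box == 1) == 1

def reward(state, target=True):
    # Exact reward: 1 if the statement's truth value matches the target label.
    return float(meaning(state) == target)

state = [Item("yellow", "square", 1), Item("black", "circle", 0)]
print(reward(state))   # 1.0 -- the statement holds in this world state
```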