Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
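To make the curvilinear-planning idea concrete, the sketch below (not the authors' implementation) scores planner edges in a path-aligned frame, where s is arc length along the taught path and l is lateral offset: each edge pays its length plus a penalty proportional to how far it strays laterally, so a sample-based planner naturally prefers staying near the taught path. The node layout and the weight `lateral_weight` are illustrative assumptions.

```python
import math

# Hypothetical sketch of a teach-and-repeat edge cost in a curvilinear
# (path-aligned) space: s = arc length along the taught path, l = lateral
# offset from it.  Not the authors' implementation.

def edge_cost(node_a, node_b, lateral_weight=2.0):
    """Cost of an edge between two (s, l) nodes.

    Combines the Euclidean edge length in the curvilinear frame with a
    penalty on how far the edge strays laterally from the taught path
    (l = 0), so the planner minimizes lateral path deviation.
    """
    s_a, l_a = node_a
    s_b, l_b = node_b
    length = math.hypot(s_b - s_a, l_b - l_a)
    mean_lateral_offset = 0.5 * (abs(l_a) + abs(l_b))
    return length + lateral_weight * mean_lateral_offset * length

# Example: an edge that hugs the taught path is cheaper than one covering a
# similar arc length while detouring laterally.
print(edge_cost((0.0, 0.0), (1.0, 0.0)))   # 1.0
print(edge_cost((0.0, 1.0), (1.0, 1.5)))   # > 1.0 due to the lateral penalty
```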
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and call for more
attention to benefit-risk consideration.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that phase 2-3
dose-optimization designs are able to control the familywise type I error rates and yield
appropriate statistical power with substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
Authors: Liyun Jiang, Ying Yuan.
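As a rough illustration of the seamless structure described above (not the authors' exact designs), the Python sketch below simulates a two-stage trial with a binary endpoint: stage 1 randomizes patients to two hypothetical doses, the dose with the higher observed response rate is carried forward, and stage 2 compares it against a concurrent control with a one-sided two-proportion z-test. The sample sizes, response rates, and selection rule are assumptions chosen only to show the mechanics.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_trial(p_doses=(0.35, 0.40), p_control=0.25,
                   n_stage1=40, n_stage2=150, alpha=0.025):
    """One hypothetical seamless phase 2-3 trial with a binary endpoint.

    Stage 1: randomize n_stage1 patients per dose and pick the dose with the
    higher observed response rate (a stand-in for 'optimal dose' selection).
    Stage 2: randomize n_stage2 per arm to the selected dose vs. control and
    test with a one-sided two-proportion z-test (stage-1 data not reused).
    """
    stage1 = [rng.binomial(n_stage1, p) for p in p_doses]
    selected = int(np.argmax(stage1))

    x_trt = rng.binomial(n_stage2, p_doses[selected])
    x_ctl = rng.binomial(n_stage2, p_control)
    p_trt, p_ctl = x_trt / n_stage2, x_ctl / n_stage2
    p_pool = (x_trt + x_ctl) / (2 * n_stage2)
    se = np.sqrt(p_pool * (1 - p_pool) * 2 / n_stage2)
    z = (p_trt - p_ctl) / se if se > 0 else 0.0
    return z > stats.norm.ppf(1 - alpha)

# Empirical power under the assumed effect sizes; the type I error rate can be
# checked the same way by setting all response rates equal to p_control.
power = np.mean([simulate_trial() for _ in range(2000)])
print(f"empirical power: {power:.2f}")
```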
We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
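The core loop described above, a Gaussian Process surrogate of the log-posterior refined by active learning, can be mimicked with off-the-shelf tools. The toy sketch below uses scikit-learn rather than GPry itself, a simple upper-confidence acquisition rule instead of GPry's criterion, and omits the SVM classifier and parallel proposals, so it is a conceptual illustration only; the toy log-posterior and prior box are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(1)

def log_posterior(x):
    """Toy 2D non-Gaussian log-posterior (a curved 'banana' shape)."""
    return -0.5 * ((x[0] ** 2) / 4.0 + (x[1] - 0.5 * x[0] ** 2) ** 2)

# Start from a handful of random evaluations within a prior box.
bounds = np.array([[-4.0, 4.0], [-2.0, 6.0]])
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))
y = np.array([log_posterior(x) for x in X])

kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])
for _ in range(30):  # active-learning loop
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    # Acquisition: evaluate where the surrogate is promising AND uncertain
    # (a simple upper-confidence rule, not GPry's actual criterion).
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2000, 2))
    mean, std = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(mean + 2.0 * std)]
    X = np.vstack([X, x_next])
    y = np.append(y, log_posterior(x_next))

# The fitted surrogate can now stand in for the true log-posterior, e.g.
# inside an MCMC sampler, at negligible cost per evaluation.
print("surrogate maximum found near:", X[np.argmax(y)])
```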
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting. However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
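To make the kill-and-restart idea concrete, here is a small, non-authoritative simulation of one plausible reading of the deterministic $b$-scaling strategy for jobs without release dates: phase k grants every unfinished job a budget of b^k time units from scratch, and jobs that overrun their budget are killed and retried in the next phase. The job set, the choice b = 3, and the unweighted objective are illustrative assumptions; the variant analyzed in the paper may differ in details.

```python
def b_scaling_total_completion_time(processing_times, b=3.0):
    """Simulate a deterministic b-scaling kill-and-restart schedule.

    Phase k grants every unfinished job a budget of b**k time units, run from
    scratch; jobs that do not finish within the budget are killed and retried
    in the next phase.  Returns the sum of completion times.  (Illustrative
    reading of the strategy; the paper's exact variant may differ.)
    """
    clock = 0.0
    remaining = list(enumerate(processing_times))
    total_completion = 0.0
    k = 0
    while remaining:
        budget = b ** k
        still_remaining = []
        for job_id, p in remaining:
            if p <= budget:          # job finishes within this attempt
                clock += p
                total_completion += clock
            else:                    # kill after the budget is used up
                clock += budget
                still_remaining.append((job_id, p))
        remaining = still_remaining
        k += 1
    return total_completion


def opt_total_completion_time(processing_times):
    """Optimal clairvoyant schedule: shortest processing time first."""
    clock, total = 0.0, 0.0
    for p in sorted(processing_times):
        clock += p
        total += clock
    return total


jobs = [1.0, 2.0, 4.0, 8.0, 16.0]
alg = b_scaling_total_completion_time(jobs)
opt = opt_total_completion_time(jobs)
print(f"b-scaling: {alg:.1f}, OPT: {opt:.1f}, ratio: {alg / opt:.2f}")
```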
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
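The frozen-backbone setting above is easy to reproduce at a small scale: freeze every parameter of a pretrained image model and train only a lightweight task head on top. The sketch below uses a torchvision ResNet-50 purely as a stand-in for the Swin backbones studied in the paper, with a linear classification head as the simplest possible adapter.

```python
import torch
import torch.nn as nn
from torchvision import models

# Stand-in backbone: a pretrained ResNet-50 (the paper studies Swin
# Transformers, up to the 3B-parameter SwinV2-G; this is only a sketch).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()            # expose 2048-d pooled features

# Freeze the backbone: no gradients, and keep BatchNorm statistics fixed.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

# Only the task-specific head is trainable (here a linear classifier;
# detection/segmentation heads would replace it in the full setting).
head = nn.Linear(2048, 100)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():              # frozen features
        feats = backbone(images)
    logits = head(feats)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch just to show the call signature.
print(train_step(torch.randn(4, 3, 224, 224), torch.randint(0, 100, (4,))))
```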
Hyperparameter tuning is a common practice in the application of machine
learning but is a typically ignored aspect in the literature on
privacy-preserving machine learning due to its negative effect on the overall
privacy parameter. In this paper, we aim to tackle this fundamental yet
challenging problem by providing an effective hyperparameter tuning framework
with differential privacy. Interestingly, it instead correlates with the
utility gained from hyperparameter searching, revealing an explicit and
mandatory trade-off between privacy and utility.
Hyperparameter tuning is a common practice in the application of machine
learning but is a typically ignored aspect in the literature on
privacy-preserving machine learning due to its negative effect on the overall
privacy parameter. In this paper, we aim to tackle this fundamental yet
challenging problem by providing an effective hyperparameter tuning framework
with differential privacy. The proposed method allows us to adopt a broader
hyperparameter search space and even to perform a grid search over the whole
space, since its privacy loss parameter is independent of the number of
hyperparameter candidates. Interestingly, it instead correlates with the
utility gained from hyperparameter searching, revealing an explicit and
mandatory trade-off between privacy and utility. Theoretically, we show that
its additional privacy loss bound incurred by hyperparameter tuning is
upper-bounded by the square root of the gained utility. However, we note that
the additional privacy loss bound would empirically scale like the square root
of the logarithm of the utility term, benefiting from the design of the
doubling step.
Authors: Youlong Ding, Xueyang Wu.
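For context on why a privacy bound that is independent of the number of candidates matters (this is background, not the authors' method): under naive sequential composition, running one (ε, δ)-DP training job per hyperparameter candidate multiplies the privacy cost by the number of candidates, and even advanced composition still grows with it. A minimal illustration:

```python
# Background illustration only: composition of per-candidate DP costs when
# each hyperparameter candidate is trained with its own (eps, delta)-DP run.
import math

def basic_composition(eps, delta, k):
    """Basic sequential composition: k runs are jointly (k*eps, k*delta)-DP."""
    return k * eps, k * delta

def advanced_composition(eps, delta, k, delta_prime=1e-6):
    """Advanced composition theorem: roughly sqrt(k)-scaling in epsilon."""
    eps_total = math.sqrt(2 * k * math.log(1 / delta_prime)) * eps \
                + k * eps * (math.exp(eps) - 1)
    return eps_total, k * delta + delta_prime

for k in (1, 4, 16, 64):          # number of hyperparameter candidates tried
    print(k, basic_composition(1.0, 1e-5, k), advanced_composition(1.0, 1e-5, k))
```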
We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure. In doing so, we are able to compute an $\epsilon$-stationary point with $\tilde{O}\left(n + \sqrt{n}/\epsilon^2\right)$ oracle-calls, which matches the respective lower bound up to logarithmic factors.
We propose an adaptive variance-reduction method, called AdaSpider, for
minimization of $L$-smooth, non-convex functions with a finite-sum structure.
In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011, McMahan
& Streeter, 2010], yet fairly distinct, adaptive step-size schedule with the
recursive stochastic path-integrated estimator proposed in [Fang et al., 2018].
To our knowledge, AdaSpider is the first parameter-free non-convex
variance-reduction method in the sense that it does not require the knowledge
of problem-dependent parameters, such as smoothness constant $L$, target
accuracy $\epsilon$ or any bound on gradient norms. In doing so, we are able to
compute an $\epsilon$-stationary point with $\tilde{O}\left(n +
\sqrt{n}/\epsilon^2\right)$ oracle-calls, which matches the respective lower
bound up to logarithmic factors.
Authors: Ali Kavis, Stratis Skoulakis, Kimon Antonakopoulos, Leello Tadesse Dadi, Volkan Cevher.
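To give a feel for what a SPIDER-type recursive gradient estimator combined with an AdaGrad-style step size looks like, here is a toy finite-sum least-squares example; the epoch length, batch size, and the exact step-size normalization are illustrative guesses, and the precise AdaSpider schedule analyzed in the paper differs, so consult the paper for the real algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum problem: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)^2
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_batch(x, idx):
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def spider_adaptive(epochs=20, inner=20, batch=10):
    """SPIDER estimator with an AdaGrad-style step size (illustrative only;
    AdaSpider's actual step-size schedule has a different exact form)."""
    x = np.zeros(d)
    x_prev = x.copy()
    sum_sq = 0.0                               # running sum of ||v_t||^2
    for _ in range(epochs):
        v = grad_batch(x, np.arange(n))        # full gradient at epoch start
        for _ in range(inner):
            sum_sq += float(v @ v)
            step = 1.0 / np.sqrt(1.0 + sum_sq)  # adaptive: no L or eps needed
            x_prev, x = x, x - step * v
            idx = rng.choice(n, size=batch, replace=False)
            # Recursive path-integrated update of the gradient estimate.
            v = v + grad_batch(x, idx) - grad_batch(x_prev, idx)
    return x

x_hat = spider_adaptive()
print("final gradient norm:", np.linalg.norm(grad_batch(x_hat, np.arange(n))))
```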
The electromagnetic fields of a long dipole working without dispersive and
dissipative losses are analyzed in the frequency domain. The dipole produces
radiation in bursts of duration T/2 where T is the period of oscillation. We have studied how U varies as a function of the
charge associated with the current in the dipole and the ratio of the length of
the dipole and its radius. We have observed a remarkable result when this ratio
is equal to the ratio of the radius of the universe to the Bohr radius. The importance of this finding is
discussed.
The electromagnetic fields of a long dipole working without dispersive and
dissipative losses are analyzed in the frequency domain. The dipole produces
radiation in bursts of duration T/2 where T is the period of oscillation. The
parameter studied in this paper is the energy, U, dissipated in a single burst
of radiation of duration T/2. We have studied how U varies as a function of the
charge associated with the current in the dipole and the ratio of the length of
the dipole and its radius. We have observed a remarkable result when this ratio
is equal to the ratio of the radius of the universe to the Bohr radius. Our
results, based purely on classical electrodynamics and general relativity,
show that, as the magnitude of the oscillating charge (as defined by the root
mean square) reduces to the electronic charge, the energy dissipated in a
single burst of radiation reduces to hν, where ν is the frequency of
oscillation and h is the Planck constant. The importance of this finding is
discussed. In particular, the results show that the existence of a minimum free
charge in nature, i.e., electronic charge, is a direct consequence of the
photonic nature of the electromagnetic fields. Furthermore, the presented
findings allow us to derive, for the first time, an expression for the vacuum energy
density of the universe in terms of the other fundamental constants in nature,
the prediction of which is consistent with experimental observations. This
equation, which combines the vacuum energy, electronic charge and mass, speed
of light, gravitational constant and Planck constant, creates a link between
classical field theories (i.e., classical electrodynamics and general
relativity) and quantum mechanics.
Authors: Vernon Cooray, Gerald Cooray, Marcos Rubinstein, Farhad Rachidi.
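For orientation only, the block below recalls the textbook calculation of U for an ideal (electrically short) oscillating dipole, which fixes what kind of quantity is being studied; the paper's analysis concerns a long dipole and its length-to-radius ratio, so these expressions are standard background rather than the authors' result.

```latex
% Standard classical result for an ideal short dipole p(t) = p_0 cos(wt),
% shown only to fix ideas about the quantity U; the paper's long-dipole
% analysis differs in its details.
\[
  \langle P \rangle = \frac{p_0^{2}\,\omega^{4}}{12\pi\varepsilon_0 c^{3}},
  \qquad
  U = \langle P \rangle\,\frac{T}{2}
    = \frac{p_0^{2}\,\omega^{4}}{12\pi\varepsilon_0 c^{3}}\cdot\frac{\pi}{\omega}
    = \frac{p_0^{2}\,\omega^{3}}{12\,\varepsilon_0 c^{3}}.
\]
% Setting U = h\nu (= \hbar\omega) then relates the oscillating charge, through
% the dipole moment p_0, to fundamental constants; this is the kind of relation
% the paper explores for a long dipole carrying the electronic charge.
```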
Relying only on unlabeled training data, we show in our analysis that we can outperform existing unsupervised machine learning methods and classical methods. Our numerical simulations show that the performance of the presented approach is not affected by correlated signals but rather improves slightly. This is because we propose to estimate the correlation parameters simultaneously with the DoAs.
In this work, we consider the use of a model-based decoder in combination
with an unsupervised learning strategy for direction-of-arrival (DoA)
estimation. Relying only on unlabeled training data, we show in our analysis
that we can outperform existing unsupervised machine learning methods and
classical methods. This is done by introducing a model-based decoder in an
autoencoder architecture, which leads to a meaningful representation of the
statistical model in the latent space. Our numerical simulations show that the
performance of the presented approach is not affected by correlated signals but
rather improves slightly. This is because we propose to estimate the
correlation parameters simultaneously with the DoAs.
Authors: Franz Weißer, Michael Baur, Wolfgang Utschick.
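One way to picture the model-based decoder above: the decoder is not a neural network but the physical array model itself, mapping candidate DoAs (and, in the paper, correlation parameters) back to a covariance matrix that should match the observed sample covariance. The sketch below shows only that decoder step for an assumed uniform linear array with uncorrelated sources, with a grid search standing in for the learned encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 8                      # sensors in a half-wavelength uniform linear array
def steering(theta_deg):
    """Array steering vector for a ULA with half-wavelength spacing."""
    theta = np.deg2rad(theta_deg)
    return np.exp(1j * np.pi * np.arange(M) * np.sin(theta))

def model_covariance(thetas, powers, noise_var):
    """'Model-based decoder': map DoAs (and source powers) to a covariance."""
    A = np.stack([steering(t) for t in thetas], axis=1)        # M x K
    return A @ np.diag(powers) @ A.conj().T + noise_var * np.eye(M)

# Simulate snapshots from two uncorrelated sources at -10 and 20 degrees.
true_thetas, snapshots, noise_var = [-10.0, 20.0], 200, 0.1
A = np.stack([steering(t) for t in true_thetas], axis=1)
S = (rng.normal(size=(2, snapshots)) + 1j * rng.normal(size=(2, snapshots))) / np.sqrt(2)
N = np.sqrt(noise_var / 2) * (rng.normal(size=(M, snapshots)) + 1j * rng.normal(size=(M, snapshots)))
X = A @ S + N
R_sample = X @ X.conj().T / snapshots

# Grid search over angle pairs as a stand-in for the learned encoder: pick the
# DoAs whose decoded covariance best matches the sample covariance.
grid = np.arange(-60.0, 60.0, 1.0)
best = min(((t1, t2) for i, t1 in enumerate(grid) for t2 in grid[i + 1:]),
           key=lambda p: np.linalg.norm(
               model_covariance(p, [1.0, 1.0], noise_var) - R_sample))
print("estimated DoAs:", best)        # should be close to (-10, 20)
```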
In the process, we establish a new state of
the art for language modelling on small datasets and on enwik8 with dynamic
evaluation.
Just because some purely recurrent models suffer from being hard to optimize
and inefficient on today's hardware, they are not necessarily bad models of
language. We demonstrate this by the extent to which these models can still be
improved by a combination of a slightly better recurrent cell, architecture,
objective, as well as optimization. In the process, we establish a new state of
the art for language modelling on small datasets and on enwik8 with dynamic
evaluation.
Authors: Gábor Melis.
From face recognition in smartphones to automatic routing in self-driving cars, machine vision algorithms lie at the core of these features. These systems solve image-based tasks by identifying and understanding objects, subsequently making decisions from this information.
From face recognition in smartphones to automatic routing in self-driving
cars, machine vision algorithms lie at the core of these features. These
systems solve image-based tasks by identifying and understanding objects,
subsequently making decisions from this information. However, errors in
datasets are usually induced or even magnified in algorithms, at times
resulting in issues such as recognising black people as gorillas and
misrepresenting ethnicities in search results. This paper tracks the errors in
datasets and their impacts, revealing that a flawed dataset could be a result
of limited categories, incomprehensive sourcing and poor classification.
Authors: Hongrui Jin.
Searching for exotic multiquark states and elucidating their nature remains a
central topic in understanding quantum chromodynamics--the underlying theory
of the strong interaction. Two of the most studied such states are the
charmed-strange states $D_{s0}^*(2317)$ and $D_{s1}(2460)$.
Searching for exotic multiquark states and elucidating their nature remains a
central topic in understanding quantum chromodynamics--the underlying theory
of the strong interaction. Two of the most studied such states are the
charmed-strange states $D_{s0}^*(2317)$ and $D_{s1}(2460)$. In this work, we
show for the first time that their production yields in inclusive $e^+e^-\to
c\bar{c}$ production near $\sqrt{s}=10.6$ GeV measured by the BaBar
Collaboration can be well explained in the molecular picture, which provides a
highly nontrivial verification of their nature as $DK/D^*K$ molecules. In
addition, we predict the production yield of the $D\bar{D}K$ three-body bound
state, $K_{c\bar{c}}(4180)$, in $e^+e^-\to c\bar{c}$ collisions and find that
it is within the reach of the ongoing Belle II experiment.
Authors: Tian-Chen Wu, Li-Sheng Geng.
We finally illustrate the performance of the resulting approximations in numerical examples.
Hybrid stochastic differential equations are a useful tool to model
continuously varying stochastic systems which are modulated by a random
environment that may depend on the system state itself. In this paper, we
establish the pathwise convergence of the solutions to hybrid stochastic
differential equations via space-grid discretizations. While time-grid
discretizations are a classical approach for simulation purposes, our
space-grid discretization provides a link with multi-regime Markov modulated
Brownian motions, leading to computational tractability. We exploit our
convergence result to obtain efficient approximations to first passage
probabilities and expected occupation times of the solutions of hybrid stochastic
differential equations, results which are the first of their kind for such a
robust framework. We finally illustrate the performance of the resulting
approximations in numerical examples.
Authors: Hansjoerg Albrecher, Oscar Peralta.
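As background on the model class above (not the paper's space-grid discretization, which is the actual contribution), the sketch below simulates a hybrid SDE whose drift and volatility are modulated by a two-state Markov chain using plain Euler time-stepping, and estimates a first-passage probability by Monte Carlo; the regime parameters, barrier, and horizon are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two regimes with different drift/volatility; the environment switches
# between them with a constant intensity per regime (a Markov chain).
mu    = np.array([ 0.5, -1.0])
sigma = np.array([ 0.3,  0.8])
rates = np.array([ 1.0,  2.0])       # switching intensity out of each regime

def first_passage(x0=0.0, barrier=1.0, horizon=5.0, dt=1e-2):
    """Euler simulation of one path of the hybrid SDE; returns True if the
    barrier is hit before the horizon.  (Background illustration only; the
    paper's space-grid discretization is a different, tractable construction.)"""
    x, regime, t = x0, 0, 0.0
    while t < horizon:
        if x >= barrier:
            return True
        # Regime switch with probability rate*dt over this small step.
        if rng.random() < rates[regime] * dt:
            regime = 1 - regime
        x += mu[regime] * dt + sigma[regime] * np.sqrt(dt) * rng.normal()
        t += dt
    return False

paths = 2000
prob = np.mean([first_passage() for _ in range(paths)])
print(f"estimated first-passage probability: {prob:.3f}")
```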
In this article we define the semigroup associated to a substitution.
In this article we define the semigroup associated to a substitution. We use
it to construct a minimal automaton which generates a substitution sequence u
in reverse reading. We show, in the case where the substitution has a
coincidence, that this automaton completely describes the semicocycle
discontinuities of u.
Authors: Gandhar Joshi, Reem Yassawi.
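For readers unfamiliar with the objects above: a substitution sends each letter to a finite word, and iterating it from a seed letter produces a substitution sequence. The sketch below generates a prefix of the Thue-Morse sequence as a standard example; it is only meant to fix the terminology and does not reproduce the authors' automaton or semigroup construction.

```python
# A substitution sends each letter to a finite word; iterating it from a seed
# letter produces a substitution sequence.  Thue-Morse is a standard example
# (chosen here for illustration; the paper's setting is more general).
thue_morse = {"0": "01", "1": "10"}

def iterate_substitution(sub, seed="0", steps=5):
    word = seed
    for _ in range(steps):
        word = "".join(sub[letter] for letter in word)
    return word

print(iterate_substitution(thue_morse))   # '01101001100101101001011001101001'
```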
The discovery of neural architectures from scratch is the long-standing goal of Neural Architecture Search (NAS). Searching over a wide spectrum of neural architectures can facilitate the discovery of previously unconsidered but well-performing architectures. We open source our algebraic NAS approach and provide APIs for PyTorch and TensorFlow.
The discovery of neural architectures from scratch is the long-standing goal
of Neural Architecture Search (NAS). Searching over a wide spectrum of neural
architectures can facilitate the discovery of previously unconsidered but
well-performing architectures. In this work, we take a large step towards
discovering neural architectures from scratch by expressing architectures
algebraically. This algebraic view leads to a more general method for designing
search spaces, which allows us to compactly represent search spaces that are
100s of orders of magnitude larger than common spaces from the literature.
Further, we propose a Bayesian Optimization strategy to efficiently search over
such huge spaces, and demonstrate empirically that both our search space design
and our search strategy can be superior to existing baselines. We open source
our algebraic NAS approach and provide APIs for PyTorch and TensorFlow.
Authors: Simon Schrodi, Danny Stoll, Binxin Ru, Rhea Sukthanker, Thomas Brox, Frank Hutter.
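The algebraic view above essentially treats an architecture as a term generated by a grammar over operators. As a loose, hypothetical illustration of how such a search space can be written down compactly (this is not the authors' grammar, search space, or Bayesian Optimization strategy), the sketch below samples random architecture terms from a tiny made-up context-free grammar.

```python
import random

random.seed(0)

# A tiny, made-up context-free grammar over architecture-building operators.
# Nonterminals expand into (operator, child-nonterminal...) productions;
# terminals are primitive layers.  Even a few rules define a huge term space.
GRAMMAR = {
    "ARCH":  [("sequential", "BLOCK", "BLOCK"), ("residual", "BLOCK")],
    "BLOCK": [("sequential", "OP", "OP"), ("residual", "OP"), ("OP",)],
    "OP":    [("conv3x3",), ("conv1x1",), ("maxpool",), ("identity",)],
}

def sample_term(symbol="ARCH", max_depth=6):
    """Sample a random architecture term (an algebraic expression) from GRAMMAR."""
    productions = GRAMMAR[symbol]
    if max_depth <= 0:                       # prefer terminating productions
        productions = [p for p in productions if len(p) == 1] or productions
    head, *children = random.choice(productions)
    if head in GRAMMAR:                      # production is a bare nonterminal
        return sample_term(head, max_depth - 1)
    args = [sample_term(c, max_depth - 1) for c in children]
    return f"{head}({', '.join(args)})" if args else head

for _ in range(3):
    print(sample_term())   # each line is one randomly sampled architecture term
```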