Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
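To make the curvilinear-planning idea concrete, the sketch below (not the authors' implementation) scores planner edges in a path-aligned frame, where s is arc length along the taught path and l is lateral offset: each edge pays its length plus a penalty proportional to how far it strays laterally, so a sample-based planner naturally prefers staying near the taught path. The node layout and the weight `lateral_weight` are illustrative assumptions.

```python
import math

# Hypothetical sketch of a teach-and-repeat edge cost in a curvilinear
# (path-aligned) space: s = arc length along the taught path, l = lateral
# offset from it.  Not the authors' implementation.

def edge_cost(node_a, node_b, lateral_weight=2.0):
    """Cost of an edge between two (s, l) nodes.

    Combines the Euclidean edge length in the curvilinear frame with a
    penalty on how far the edge strays laterally from the taught path
    (l = 0), so the planner minimizes lateral path deviation.
    """
    s_a, l_a = node_a
    s_b, l_b = node_b
    length = math.hypot(s_b - s_a, l_b - l_a)
    mean_lateral_offset = 0.5 * (abs(l_a) + abs(l_b))
    return length + lateral_weight * mean_lateral_offset * length

# Example: an edge that hugs the taught path is cheaper than one covering a
# similar arc length while detouring laterally.
print(edge_cost((0.0, 0.0), (1.0, 0.0)))   # 1.0
print(edge_cost((0.0, 1.0), (1.0, 1.5)))   # > 1.0 due to the lateral penalty
```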
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and call for more
attention to benefit-risk consideration.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that phase 2-3
dose-optimization designs are able to control the familywise type I error rates and yield
appropriate statistical power with substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
Authors: Liyun Jiang, Ying Yuan.
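As a rough illustration of the seamless structure described above (not the authors' exact designs), the Python sketch below simulates a two-stage trial with a binary endpoint: stage 1 randomizes patients to two hypothetical doses, the dose with the higher observed response rate is carried forward, and stage 2 compares it against a concurrent control with a one-sided two-proportion z-test. The sample sizes, response rates, and selection rule are assumptions chosen only to show the mechanics.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_trial(p_doses=(0.35, 0.40), p_control=0.25,
                   n_stage1=40, n_stage2=150, alpha=0.025):
    """One hypothetical seamless phase 2-3 trial with a binary endpoint.

    Stage 1: randomize n_stage1 patients per dose and pick the dose with the
    higher observed response rate (a stand-in for 'optimal dose' selection).
    Stage 2: randomize n_stage2 per arm to the selected dose vs. control and
    test with a one-sided two-proportion z-test (stage-1 data not reused).
    """
    stage1 = [rng.binomial(n_stage1, p) for p in p_doses]
    selected = int(np.argmax(stage1))

    x_trt = rng.binomial(n_stage2, p_doses[selected])
    x_ctl = rng.binomial(n_stage2, p_control)
    p_trt, p_ctl = x_trt / n_stage2, x_ctl / n_stage2
    p_pool = (x_trt + x_ctl) / (2 * n_stage2)
    se = np.sqrt(p_pool * (1 - p_pool) * 2 / n_stage2)
    z = (p_trt - p_ctl) / se if se > 0 else 0.0
    return z > stats.norm.ppf(1 - alpha)

# Empirical power under the assumed effect sizes; the type I error rate can be
# checked the same way by setting all response rates equal to p_control.
power = np.mean([simulate_trial() for _ in range(2000)])
print(f"empirical power: {power:.2f}")
```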
We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
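The core loop described above, a Gaussian Process surrogate of the log-posterior refined by active learning, can be mimicked with off-the-shelf tools. The toy sketch below uses scikit-learn rather than GPry itself, a simple upper-confidence acquisition rule instead of GPry's criterion, and omits the SVM classifier and parallel proposals, so it is a conceptual illustration only; the toy log-posterior and prior box are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(1)

def log_posterior(x):
    """Toy 2D non-Gaussian log-posterior (a curved 'banana' shape)."""
    return -0.5 * ((x[0] ** 2) / 4.0 + (x[1] - 0.5 * x[0] ** 2) ** 2)

# Start from a handful of random evaluations within a prior box.
bounds = np.array([[-4.0, 4.0], [-2.0, 6.0]])
X = rng.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))
y = np.array([log_posterior(x) for x in X])

kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])
for _ in range(30):  # active-learning loop
    gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)
    # Acquisition: evaluate where the surrogate is promising AND uncertain
    # (a simple upper-confidence rule, not GPry's actual criterion).
    cand = rng.uniform(bounds[:, 0], bounds[:, 1], size=(2000, 2))
    mean, std = gp.predict(cand, return_std=True)
    x_next = cand[np.argmax(mean + 2.0 * std)]
    X = np.vstack([X, x_next])
    y = np.append(y, log_posterior(x_next))

# The fitted surrogate can now stand in for the true log-posterior, e.g.
# inside an MCMC sampler, at negligible cost per evaluation.
print("surrogate maximum found near:", X[np.argmax(y)])
```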
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting. However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
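To make the kill-and-restart idea concrete, here is a small, non-authoritative simulation of one plausible reading of the deterministic $b$-scaling strategy for jobs without release dates: phase k grants every unfinished job a budget of b^k time units from scratch, and jobs that overrun their budget are killed and retried in the next phase. The job set, the choice b = 3, and the unweighted objective are illustrative assumptions; the variant analyzed in the paper may differ in details.

```python
def b_scaling_total_completion_time(processing_times, b=3.0):
    """Simulate a deterministic b-scaling kill-and-restart schedule.

    Phase k grants every unfinished job a budget of b**k time units, run from
    scratch; jobs that do not finish within the budget are killed and retried
    in the next phase.  Returns the sum of completion times.  (Illustrative
    reading of the strategy; the paper's exact variant may differ.)
    """
    clock = 0.0
    remaining = list(enumerate(processing_times))
    total_completion = 0.0
    k = 0
    while remaining:
        budget = b ** k
        still_remaining = []
        for job_id, p in remaining:
            if p <= budget:          # job finishes within this attempt
                clock += p
                total_completion += clock
            else:                    # kill after the budget is used up
                clock += budget
                still_remaining.append((job_id, p))
        remaining = still_remaining
        k += 1
    return total_completion


def opt_total_completion_time(processing_times):
    """Optimal clairvoyant schedule: shortest processing time first."""
    clock, total = 0.0, 0.0
    for p in sorted(processing_times):
        clock += p
        total += clock
    return total


jobs = [1.0, 2.0, 4.0, 8.0, 16.0]
alg = b_scaling_total_completion_time(jobs)
opt = opt_total_completion_time(jobs)
print(f"b-scaling: {alg:.1f}, OPT: {opt:.1f}, ratio: {alg / opt:.2f}")
```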
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
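The frozen-backbone setting above is easy to reproduce at a small scale: freeze every parameter of a pretrained image model and train only a lightweight task head on top. The sketch below uses a torchvision ResNet-50 purely as a stand-in for the Swin backbones studied in the paper, with a linear classification head as the simplest possible adapter.

```python
import torch
import torch.nn as nn
from torchvision import models

# Stand-in backbone: a pretrained ResNet-50 (the paper studies Swin
# Transformers, up to the 3B-parameter SwinV2-G; this is only a sketch).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()            # expose 2048-d pooled features

# Freeze the backbone: no gradients, and keep BatchNorm statistics fixed.
for p in backbone.parameters():
    p.requires_grad = False
backbone.eval()

# Only the task-specific head is trainable (here a linear classifier;
# detection/segmentation heads would replace it in the full setting).
head = nn.Linear(2048, 100)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():              # frozen features
        feats = backbone(images)
    logits = head(feats)
    loss = criterion(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch just to show the call signature.
print(train_step(torch.randn(4, 3, 224, 224), torch.randint(0, 100, (4,))))
```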
Hyperparameter tuning is a common practice in the application of machine
learning but is a typically ignored aspect in the literature on
privacy-preserving machine learning due to its negative effect on the overall
privacy parameter. In this paper, we aim to tackle this fundamental yet
challenging problem by providing an effective hyperparameter tuning framework
with differential privacy. Interestingly, it instead correlates with the
utility gained from hyperparameter searching, revealing an explicit and
mandatory trade-off between privacy and utility.
Hyperparameter tuning is a common practice in the application of machine
learning but is a typically ignored aspect in the literature on
privacy-preserving machine learning due to its negative effect on the overall
privacy parameter. In this paper, we aim to tackle this fundamental yet
challenging problem by providing an effective hyperparameter tuning framework
with differential privacy. The proposed method allows us to adopt a broader
hyperparameter search space and even to perform a grid search over the whole
space, since its privacy loss parameter is independent of the number of
hyperparameter candidates. Interestingly, it instead correlates with the
utility gained from hyperparameter searching, revealing an explicit and
mandatory trade-off between privacy and utility. Theoretically, we show that
its additional privacy loss bound incurred by hyperparameter tuning is
upper-bounded by the square root of the gained utility. However, we note that
the additional privacy loss bound would empirically scale like the square root
of the logarithm of the utility term, benefiting from the design of the
doubling step.
Authors: Youlong Ding, Xueyang Wu.
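For context on why a privacy bound that is independent of the number of candidates matters (this is background, not the authors' method): under naive sequential composition, running one (ε, δ)-DP training job per hyperparameter candidate multiplies the privacy cost by the number of candidates, and even advanced composition still grows with it. A minimal illustration:

```python
# Background illustration only: composition of per-candidate DP costs when
# each hyperparameter candidate is trained with its own (eps, delta)-DP run.
import math

def basic_composition(eps, delta, k):
    """Basic sequential composition: k runs are jointly (k*eps, k*delta)-DP."""
    return k * eps, k * delta

def advanced_composition(eps, delta, k, delta_prime=1e-6):
    """Advanced composition theorem: roughly sqrt(k)-scaling in epsilon."""
    eps_total = math.sqrt(2 * k * math.log(1 / delta_prime)) * eps \
                + k * eps * (math.exp(eps) - 1)
    return eps_total, k * delta + delta_prime

for k in (1, 4, 16, 64):          # number of hyperparameter candidates tried
    print(k, basic_composition(1.0, 1e-5, k), advanced_composition(1.0, 1e-5, k))
```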
We propose an adaptive variance-reduction method, called AdaSpider, for minimization of $L$-smooth, non-convex functions with a finite-sum structure. In doing so, we are able to compute an $\epsilon$-stationary point with $\tilde{O}\left(n + \sqrt{n}/\epsilon^2\right)$ oracle-calls, which matches the respective lower bound up to logarithmic factors.
We propose an adaptive variance-reduction method, called AdaSpider, for
minimization of $L$-smooth, non-convex functions with a finite-sum structure.
In essence, AdaSpider combines an AdaGrad-inspired [Duchi et al., 2011, McMahan
& Streeter, 2010], yet fairly distinct, adaptive step-size schedule with the
recursive stochastic path-integrated estimator proposed in [Fang et al., 2018].
To our knowledge, AdaSpider is the first parameter-free non-convex
variance-reduction method in the sense that it does not require the knowledge
of problem-dependent parameters, such as smoothness constant $L$, target
accuracy $\epsilon$ or any bound on gradient norms. In doing so, we are able to
compute an $\epsilon$-stationary point with $\tilde{O}\left(n +
\sqrt{n}/\epsilon^2\right)$ oracle-calls, which matches the respective lower
bound up to logarithmic factors.
Authors: Ali Kavis, Stratis Skoulakis, Kimon Antonakopoulos, Leello Tadesse Dadi, Volkan Cevher.
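To give a feel for what a SPIDER-type recursive gradient estimator combined with an AdaGrad-style step size looks like, here is a toy finite-sum least-squares example; the epoch length, batch size, and the exact step-size normalization are illustrative guesses, and the precise AdaSpider schedule analyzed in the paper differs, so consult the paper for the real algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy finite-sum problem: f(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)^2
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_batch(x, idx):
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def spider_adaptive(epochs=20, inner=20, batch=10):
    """SPIDER estimator with an AdaGrad-style step size (illustrative only;
    AdaSpider's actual step-size schedule has a different exact form)."""
    x = np.zeros(d)
    x_prev = x.copy()
    sum_sq = 0.0                               # running sum of ||v_t||^2
    for _ in range(epochs):
        v = grad_batch(x, np.arange(n))        # full gradient at epoch start
        for _ in range(inner):
            sum_sq += float(v @ v)
            step = 1.0 / np.sqrt(1.0 + sum_sq)  # adaptive: no L or eps needed
            x_prev, x = x, x - step * v
            idx = rng.choice(n, size=batch, replace=False)
            # Recursive path-integrated update of the gradient estimate.
            v = v + grad_batch(x, idx) - grad_batch(x_prev, idx)
    return x

x_hat = spider_adaptive()
print("final gradient norm:", np.linalg.norm(grad_batch(x_hat, np.arange(n))))
```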
The electromagnetic fields of a long dipole working without dispersive and
dissipative losses are analyzed in the frequency domain. The dipole produces
radiation in bursts of duration T/2 where T is the period of oscillation. We have studied how U varies as a function of the
charge associated with the current in the dipole and the ratio of the length of
the dipole and its radius. We have observed a remarkable result when this ratio
is equal to the ratio of the radius of the universe to the Bohr radius. The importance of this finding is
discussed.
The electromagnetic fields of a long dipole working without dispersive and
dissipative losses are analyzed in the frequency domain. The dipole produces
radiation in bursts of duration T/2 where T is the period of oscillation. The
parameter studied in this paper is the energy, U, dissipated in a single burst
of radiation of duration T/2. We have studied how U varies as a function of the
charge associated with the current in the dipole and the ratio of the length of
the dipole and its radius. We have observed a remarkable result when this ratio
is equal to the ratio of the radius of the universe to the Bohr radius. Our
results, based purely on classical electrodynamics and general relativity,
show that, as the magnitude of the oscillating charge (as defined by the root
mean square) reduces to the electronic charge, the energy dissipated in a
single burst of radiation reduces to hν, where ν is the frequency of
oscillation and h is the Planck constant. The importance of this finding is
discussed. In particular, the results show that the existence of a minimum free
charge in nature, i.e., electronic charge, is a direct consequence of the
photonic nature of the electromagnetic fields. Furthermore, the presented
findings allow us to derive, for the first time, an expression for the vacuum energy
density of the universe in terms of the other fundamental constants in nature,
the prediction of which is consistent with experimental observations. This
equation, which combines the vacuum energy, electronic charge and mass, speed
of light, gravitational constant and Planck constant, creates a link between
classical field theories (i.e., classical electrodynamics and general
relativity) and quantum mechanics.
Authors: Vernon Cooray, Gerald Cooray, Marcos Rubinstein, Farhad Rachidi.
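For orientation only, the block below recalls the textbook calculation of U for an ideal (electrically short) oscillating dipole, which fixes what kind of quantity is being studied; the paper's analysis concerns a long dipole and its length-to-radius ratio, so these expressions are standard background rather than the authors' result.

```latex
% Standard classical result for an ideal short dipole p(t) = p_0 cos(wt),
% shown only to fix ideas about the quantity U; the paper's long-dipole
% analysis differs in its details.
\[
  \langle P \rangle = \frac{p_0^{2}\,\omega^{4}}{12\pi\varepsilon_0 c^{3}},
  \qquad
  U = \langle P \rangle\,\frac{T}{2}
    = \frac{p_0^{2}\,\omega^{4}}{12\pi\varepsilon_0 c^{3}}\cdot\frac{\pi}{\omega}
    = \frac{p_0^{2}\,\omega^{3}}{12\,\varepsilon_0 c^{3}}.
\]
% Setting U = h\nu (= \hbar\omega) then relates the oscillating charge, through
% the dipole moment p_0, to fundamental constants; this is the kind of relation
% the paper explores for a long dipole carrying the electronic charge.
```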
Relying only on unlabeled training data, we show in our analysis that we can outperform existing unsupervised machine learning methods and classical methods. Our numerical simulations show that the performance of the presented approach is not affected by correlated signals but rather improves slightly. This is because we propose to estimate the correlation parameters simultaneously with the DoAs.
In this work, we consider the use of a model-based decoder in combination
with an unsupervised learning strategy for direction-of-arrival (DoA)
estimation. Relying only on unlabeled training data, we show in our analysis
that we can outperform existing unsupervised machine learning methods and
classical methods. This is done by introducing a model-based decoder in an
autoencoder architecture, which leads to a meaningful representation of the
statistical model in the latent space. Our numerical simulations show that the
performance of the presented approach is not affected by correlated signals but
rather improves slightly. This is because we propose to estimate the
correlation parameters simultaneously with the DoAs.
Authors: Franz Weißer, Michael Baur, Wolfgang Utschick.
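One way to picture the model-based decoder above: the decoder is not a neural network but the physical array model itself, mapping candidate DoAs (and, in the paper, correlation parameters) back to a covariance matrix that should match the observed sample covariance. The sketch below shows only that decoder step for an assumed uniform linear array with uncorrelated sources, with a grid search standing in for the learned encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

M = 8                      # sensors in a half-wavelength uniform linear array
def steering(theta_deg):
    """Array steering vector for a ULA with half-wavelength spacing."""
    theta = np.deg2rad(theta_deg)
    return np.exp(1j * np.pi * np.arange(M) * np.sin(theta))

def model_covariance(thetas, powers, noise_var):
    """'Model-based decoder': map DoAs (and source powers) to a covariance."""
    A = np.stack([steering(t) for t in thetas], axis=1)        # M x K
    return A @ np.diag(powers) @ A.conj().T + noise_var * np.eye(M)

# Simulate snapshots from two uncorrelated sources at -10 and 20 degrees.
true_thetas, snapshots, noise_var = [-10.0, 20.0], 200, 0.1
A = np.stack([steering(t) for t in true_thetas], axis=1)
S = (rng.normal(size=(2, snapshots)) + 1j * rng.normal(size=(2, snapshots))) / np.sqrt(2)
N = np.sqrt(noise_var / 2) * (rng.normal(size=(M, snapshots)) + 1j * rng.normal(size=(M, snapshots)))
X = A @ S + N
R_sample = X @ X.conj().T / snapshots

# Grid search over angle pairs as a stand-in for the learned encoder: pick the
# DoAs whose decoded covariance best matches the sample covariance.
grid = np.arange(-60.0, 60.0, 1.0)
best = min(((t1, t2) for i, t1 in enumerate(grid) for t2 in grid[i + 1:]),
           key=lambda p: np.linalg.norm(
               model_covariance(p, [1.0, 1.0], noise_var) - R_sample))
print("estimated DoAs:", best)        # should be close to (-10, 20)
```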
In the process, we establish a new state of
the art for language modelling on small datasets and on enwik8 with dynamic
evaluation.
Just because some purely recurrent models suffer from being hard to optimize
and inefficient on today's hardware, they are not necessarily bad models of
language. We demonstrate this by the extent to which these models can still be
improved by a combination of a slightly better recurrent cell, architecture,
objective, as well as optimization. In the process, we establish a new state of
the art for language modelling on small datasets and on enwik8 with dynamic
evaluation.
Authors: Gábor Melis.
From face recognition in smartphones to automatic routing in self-driving cars, machine vision algorithms lie at the core of these features. These systems solve image-based tasks by identifying and understanding objects, subsequently making decisions from this information.
From face recognition in smartphones to automatic routing in self-driving
cars, machine vision algorithms lie at the core of these features. These
systems solve image-based tasks by identifying and understanding objects,
subsequently making decisions from this information. However, errors in
datasets are usually induced or even magnified in algorithms, at times
resulting in issues such as recognising black people as gorillas and
misrepresenting ethnicities in search results. This paper tracks the errors in
datasets and their impacts, revealing that a flawed dataset could be a result
of limited categories, incomprehensive sourcing and poor classification.
Authors: Hongrui Jin.
Searching for exotic multiquark states and elucidating their nature remains a
central topic in understanding quantum chromodynamics--the underlying theory
of the strong interaction. Two of the most studied such states are the
charmed-strange states $D_{s0}^*(2317)$ and $D_{s1}(2460)$.
Searching for exotic multiquark states and elucidating their nature remains a
central topic in understanding quantum chromodynamics--the underlying theory
of the strong interaction. Two of the most studied such states are the
charmed-strange states $D_{s0}^*(2317)$ and $D_{s1}(2460)$. In this work, we
show for the first time that their production yields in inclusive $e^+e^-\to
c\bar{c}$ production near $\sqrt{s}=10.6$ GeV measured by the BaBar
Collaboration can be well explained in the molecular picture, which provides a
highly nontrivial verification of their nature as $DK/D^*K$ molecules. In
addition, we predict the production yield of the $D\bar{D}K$ three-body bound
state, $K_{c\bar{c}}(4180)$, in $e^+e^-\to c\bar{c}$ collisions and find that
it is within the reach of the ongoing Belle II experiment.
Authors: Tian-Chen Wu, Li-Sheng Geng.
We finally illustrate the performance of the resulting approximations in numerical examples.
Hybrid stochastic differential equations are a useful tool to model
continuously varying stochastic systems which are modulated by a random
environment that may depend on the system state itself. In this paper, we
establish the pathwise convergence of the solutions to hybrid stochastic
differential equations via space-grid discretizations. While time-grid
discretizations are a classical approach for simulation purposes, our
space-grid discretization provides a link with multi-regime Markov modulated
Brownian motions, leading to computational tractability. We exploit our
convergence result to obtain efficient approximations to first passage
probabilities and expected occupation times of the solutions of hybrid stochastic
differential equations, results which are the first of their kind for such a
robust framework. We finally illustrate the performance of the resulting
approximations in numerical examples.
Authors: Hansjoerg Albrecher, Oscar Peralta.
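As background on the model class above (not the paper's space-grid discretization, which is the actual contribution), the sketch below simulates a hybrid SDE whose drift and volatility are modulated by a two-state Markov chain using plain Euler time-stepping, and estimates a first-passage probability by Monte Carlo; the regime parameters, barrier, and horizon are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two regimes with different drift/volatility; the environment switches
# between them with a constant intensity per regime (a Markov chain).
mu    = np.array([ 0.5, -1.0])
sigma = np.array([ 0.3,  0.8])
rates = np.array([ 1.0,  2.0])       # switching intensity out of each regime

def first_passage(x0=0.0, barrier=1.0, horizon=5.0, dt=1e-2):
    """Euler simulation of one path of the hybrid SDE; returns True if the
    barrier is hit before the horizon.  (Background illustration only; the
    paper's space-grid discretization is a different, tractable construction.)"""
    x, regime, t = x0, 0, 0.0
    while t < horizon:
        if x >= barrier:
            return True
        # Regime switch with probability rate*dt over this small step.
        if rng.random() < rates[regime] * dt:
            regime = 1 - regime
        x += mu[regime] * dt + sigma[regime] * np.sqrt(dt) * rng.normal()
        t += dt
    return False

paths = 2000
prob = np.mean([first_passage() for _ in range(paths)])
print(f"estimated first-passage probability: {prob:.3f}")
```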
In this article we define the semigroup associated to a substitution.
In this article we define the semigroup associated to a substitution. We use
it to construct a minimal automaton which generates a substitution sequence u
in reverse reading. We show, in the case where the substitution has a
coincidence, that this automaton completely describes the semicocycle
discontinuities of u.
Authors: Gandhar Joshi, Reem Yassawi.
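For readers unfamiliar with the objects above: a substitution sends each letter to a finite word, and iterating it from a seed letter produces a substitution sequence. The sketch below generates a prefix of the Thue-Morse sequence as a standard example; it is only meant to fix the terminology and does not reproduce the authors' automaton or semigroup construction.

```python
# A substitution sends each letter to a finite word; iterating it from a seed
# letter produces a substitution sequence.  Thue-Morse is a standard example
# (chosen here for illustration; the paper's setting is more general).
thue_morse = {"0": "01", "1": "10"}

def iterate_substitution(sub, seed="0", steps=5):
    word = seed
    for _ in range(steps):
        word = "".join(sub[letter] for letter in word)
    return word

print(iterate_substitution(thue_morse))   # '01101001100101101001011001101001'
```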
The discovery of neural architectures from scratch is the long-standing goal of Neural Architecture Search (NAS). Searching over a wide spectrum of neural architectures can facilitate the discovery of previously unconsidered but well-performing architectures. We open source our algebraic NAS approach and provide APIs for PyTorch and TensorFlow.
The discovery of neural architectures from scratch is the long-standing goal
of Neural Architecture Search (NAS). Searching over a wide spectrum of neural
architectures can facilitate the discovery of previously unconsidered but
well-performing architectures. In this work, we take a large step towards
discovering neural architectures from scratch by expressing architectures
algebraically. This algebraic view leads to a more general method for designing
search spaces, which allows us to compactly represent search spaces that are
100s of orders of magnitude larger than common spaces from the literature.
Further, we propose a Bayesian Optimization strategy to efficiently search over
such huge spaces, and demonstrate empirically that both our search space design
and our search strategy can be superior to existing baselines. We open source
our algebraic NAS approach and provide APIs for PyTorch and TensorFlow.
Authors: Simon Schrodi, Danny Stoll, Binxin Ru, Rhea Sukthanker, Thomas Brox, Frank Hutter.
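The algebraic view above essentially treats an architecture as a term generated by a grammar over operators. As a loose, hypothetical illustration of how such a search space can be written down compactly (this is not the authors' grammar, search space, or Bayesian Optimization strategy), the sketch below samples random architecture terms from a tiny made-up context-free grammar.

```python
import random

random.seed(0)

# A tiny, made-up context-free grammar over architecture-building operators.
# Nonterminals expand into (operator, child-nonterminal...) productions;
# terminals are primitive layers.  Even a few rules define a huge term space.
GRAMMAR = {
    "ARCH":  [("sequential", "BLOCK", "BLOCK"), ("residual", "BLOCK")],
    "BLOCK": [("sequential", "OP", "OP"), ("residual", "OP"), ("OP",)],
    "OP":    [("conv3x3",), ("conv1x1",), ("maxpool",), ("identity",)],
}

def sample_term(symbol="ARCH", max_depth=6):
    """Sample a random architecture term (an algebraic expression) from GRAMMAR."""
    productions = GRAMMAR[symbol]
    if max_depth <= 0:                       # prefer terminating productions
        productions = [p for p in productions if len(p) == 1] or productions
    head, *children = random.choice(productions)
    if head in GRAMMAR:                      # production is a bare nonterminal
        return sample_term(head, max_depth - 1)
    args = [sample_term(c, max_depth - 1) for c in children]
    return f"{head}({', '.join(args)})" if args else head

for _ in range(3):
    print(sample_term())   # each line is one randomly sampled architecture term
```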