Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
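To make the edge-cost idea concrete, here is a minimal Python sketch of a curvilinear-space cost that trades off edge length against lateral deviation from the taught path. The state representation, the trapezoidal deviation term, and the weight `alpha` are our own illustrative assumptions, not the paper's exact metric.

```python
import math

# Hypothetical sketch of a curvilinear edge cost: states are (s, l) pairs,
# s = arc length along the taught path, l = signed lateral offset from it.
# The paper pairs such a metric with a sampling-based planner; the weight
# `alpha` trading edge length against lateral deviation is our assumption.
def edge_cost(a, b, alpha=2.0):
    s1, l1 = a
    s2, l2 = b
    length = math.hypot(s2 - s1, l2 - l1)  # Euclidean length in (s, l) space
    deviation = 0.5 * (abs(l1) + abs(l2)) * abs(s2 - s1)  # integrated |l| (trapezoid rule)
    return length + alpha * deviation

# A planner using this cost prefers edges that hug the taught path (l = 0)
# unless an obstacle forces a detour.
print(edge_cost((0.0, 0.0), (1.0, 0.0)))  # on-path edge: cost 1.0
print(edge_cost((0.0, 0.5), (1.0, 0.5)))  # offset edge: cost 2.0
```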
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and call for more
attention to benefit-risk consideration.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that phase 2-3
dose-optimization designs are able to control the familywise type I error rate
and yield appropriate statistical power with a substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
Authors: Liyun Jiang, Ying Yuan.
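To illustrate the kind of operating-characteristic simulation described, here is a toy Monte Carlo sketch of a seamless phase 2-3 design with a binary endpoint and a concurrent control. The sample sizes, response rates, selection rule, and pooled Fisher test are illustrative assumptions, not the authors' designs.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Toy sketch of a seamless phase 2-3 design (binary endpoint, concurrent
# control in stage 2). All sample sizes and response rates are assumptions.
def one_trial(p_doses, p_control, n1=30, n2=100, alpha=0.025):
    # Stage 1: randomize n1 patients to each dose; select the dose with the
    # highest observed response rate ("optimal" dose, toxicity ignored here).
    stage1 = [rng.binomial(n1, p) for p in p_doses]
    best = int(np.argmax(stage1))
    # Stage 2: randomized comparison of the selected dose vs. concurrent
    # control. Stage-1 data of the selected dose are pooled into the final
    # test, which is what makes familywise type I error control non-trivial.
    x_trt = stage1[best] + rng.binomial(n2, p_doses[best])
    n_trt = n1 + n2
    x_ctl = rng.binomial(n2, p_control)
    _, pval = stats.fisher_exact([[x_trt, n_trt - x_trt],
                                  [x_ctl, n2 - x_ctl]], alternative="greater")
    return pval < alpha

# Under the global null (all doses = control), the rejection rate estimates
# the familywise type I error of this naive pooling.
null_rate = np.mean([one_trial([0.2, 0.2, 0.2], 0.2) for _ in range(2000)])
print(f"empirical type I error ~ {null_rate:.3f}")
```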
We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and it is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
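As a rough illustration of the surrogate-plus-active-learning loop (omitting GPry's SVM classifier and using a generic mean-plus-uncertainty acquisition rather than GPry's specific rule), the following sketch fits a GP to a toy 1-D log-posterior and picks new evaluation points where the acquisition is largest:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(1)

# Stand-in for an expensive log-posterior; in practice each call might take
# seconds to minutes, which is where the surrogate pays off.
def log_post(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2

X = rng.uniform(-5, 5, size=(5, 1))   # small initial design
y = log_post(X).ravel()

gp = GaussianProcessRegressor(kernel=ConstantKernel(1.0) * RBF(1.0),
                              normalize_y=True)
for _ in range(15):
    gp.fit(X, y)
    cand = np.linspace(-5, 5, 400).reshape(-1, 1)
    mu, sd = gp.predict(cand, return_std=True)
    x_new = cand[np.argmax(mu + 2.0 * sd)]   # favor high / uncertain regions
    X = np.vstack([X, x_new])
    y = np.append(y, log_post(x_new)[0])

print(f"surrogate built from {len(X)} evaluations; "
      f"posterior mode ~ {X[np.argmax(y)][0]:.2f} (true 2.0)")
```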
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting. However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
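The deterministic $b$-scaling strategy can be illustrated with a small single-machine simulation. The round structure below (probe every unfinished job for $b^k$ units in round $k$, losing the work of killed jobs) is our simplified reading of the strategy for jobs without release dates.

```python
# Toy single-machine simulation of the deterministic b-scaling kill-and-restart
# rule: in round k every unfinished job is run for a budget of b**k time units;
# jobs that do not finish are killed (all progress lost) and retried in the
# next round. Processing times are unknown to the scheduler (non-clairvoyant).
def b_scaling_total_completion(proc_times, b=2.0):
    remaining = dict(enumerate(proc_times))
    t, total, k = 0.0, 0.0, 0
    while remaining:
        budget = b ** k
        for j, p in list(remaining.items()):
            if p <= budget:          # job finishes within this round's budget
                t += p
                total += t
                del remaining[j]
            else:                    # killed after `budget` units; work is lost
                t += budget
        k += 1
    return total

jobs = [1, 2, 4, 8, 16]
alg = b_scaling_total_completion(jobs)
# Clairvoyant optimum: shortest processing time first (SPT).
opt = sum((len(jobs) - i) * p for i, p in enumerate(sorted(jobs)))
print(f"b-scaling: {alg:.0f}, SPT optimum: {opt}, ratio: {alg / opt:.2f}")
```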
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
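A minimal sketch of the frozen setting, using a torchvision ResNet as a stand-in for the (much larger) Swin backbones studied in the paper; the backbone stays fixed and only a small task head is trained:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Frozen-backbone transfer: keep a pretrained image model fixed and train
# only a small task head on top.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()          # expose 2048-d features
for p in backbone.parameters():
    p.requires_grad = False          # the backbone stays frozen
backbone.eval()

head = nn.Linear(2048, 10)           # tiny task head (e.g., 10 classes)
opt = torch.optim.AdamW(head.parameters(), lr=1e-3)

x = torch.randn(4, 3, 224, 224)      # dummy batch
labels = torch.randint(0, 10, (4,))
with torch.no_grad():                # no gradients through the frozen model
    feats = backbone(x)
loss = nn.functional.cross_entropy(head(feats), labels)
loss.backward()
opt.step()
print(f"trainable params: {sum(p.numel() for p in head.parameters())}")
```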
Distributing machine learning predictors enables the collection of
large-scale datasets while leaving sensitive raw data at trustworthy sites. For a large number of participants, communication cost is one
of the main challenges. We achieve a low communication cost by requiring only a
single invocation of an efficient secure multiparty summation protocol. More generally, we prove learnability
properties for the average of such locally trained models: convergence and
uniform stability.
Distributing machine learning predictors enables the collection of
large-scale datasets while leaving sensitive raw data at trustworthy sites. We
show that locally training support vector machines (SVMs) and computing their
averages leads to a learning technique that is scalable to a large number of
users, satisfies differential privacy, and is applicable to non-trivial tasks,
such as CIFAR-10. For a large number of participants, communication cost is one
of the main challenges. We achieve a low communication cost by requiring only a
single invocation of an efficient secure multiparty summation protocol. By
relying on state-of-the-art feature extractors (SimCLR), we are able to utilize
differentially private convex learners for non-trivial tasks such as CIFAR-10.
Our experimental results illustrate that for $1{,}000$ users with $50$ data
points each, our scheme outperforms state-of-the-art scalable distributed
learning methods (differentially private federated learning, short DP-FL) while
requiring around $500$ times less communication: for CIFAR-10, we
achieve a classification accuracy of $79.7\,\%$ for an $\varepsilon = 0.59$
while DP-FL achieves $57.6\,\%$. More generally, we prove learnability
properties for the average of such locally trained models: convergence and
uniform stability. By only requiring strongly convex, smooth, and
Lipschitz-continuous objective functions, locally trained via stochastic
gradient descent (SGD), we achieve a strong utility-privacy tradeoff.
Authors: Moritz Kirschte, Sebastian Meiser, Saman Ardalan, Esfandiar Mohammadi.
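Conceptually, the scheme reduces to "train locally, average once, add noise". The sketch below illustrates that pipeline on synthetic data; plain addition stands in for the secure multiparty summation protocol, and the noise scale is an arbitrary placeholder rather than a calibrated $(\varepsilon, \delta)$ guarantee, which requires the paper's analysis.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification

rng = np.random.default_rng(0)

# Each user trains a linear SVM locally; the weight vectors are averaged
# (in the paper via one secure multiparty summation) and noised for DP.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
n_users, per_user = 100, 50

weights = []
for u in range(n_users):
    sl = slice(u * per_user, (u + 1) * per_user)
    clf = LinearSVC(C=1.0, max_iter=5000).fit(X[sl], y[sl])
    weights.append(np.hstack([clf.coef_.ravel(), clf.intercept_]))

avg = np.mean(weights, axis=0)
noisy = avg + rng.normal(scale=0.05, size=avg.shape)  # DP noise (scale assumed)

w, b = noisy[:-1], noisy[-1]
acc = np.mean((X @ w + b > 0) == y)
print(f"accuracy of noisy averaged SVM: {acc:.3f}")
```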
In the past decades, tremendous efforts have been made towards understanding the exotic physics emerging from competition between various ordering tendencies in strongly correlated systems. For small values of $N$, namely $N=2,3$, the ground state is an antiferromagnetic (AFM) phase, whereas in the large-$N$ limit, valence bond solid (VBS) order is dominant. For intermediate values of $N$ such as $N=4$, remarkably, our study reveals that distinct VBS orders appear in the weak and strong coupling regimes.
In the past decades, tremendous efforts have been made towards understanding
the exotic physics emerging from competition between various ordering
tendencies in strongly correlated systems. Employing state-of-the-art quantum
Monte Carlo simulations, we investigate an interacting SU($N$) fermionic model
with varying interaction strength and value of $N$, and unveil the ground-state
phase diagram of the model, which exhibits a plethora of exotic phases. For
small values of $N$, namely $N=2,3$, the ground state is an antiferromagnetic
(AFM) phase, whereas in the large-$N$ limit, valence bond solid (VBS) order is
dominant. For intermediate values of $N$ such as $N=4$, remarkably, our study
reveals that distinct VBS orders appear in the weak and strong coupling
regimes. More strikingly, the competition between staggered and columnar VBS
ordering tendencies gives rise to a Mott insulating phase without spontaneous
symmetry breaking (SSB), existing over a large interaction parameter regime,
which is consistent with a gapped quantum spin liquid. Our study not only
provides a platform to investigate the fundamental physics of quantum many-body
systems, but also offers a novel route towards searching for exotic states of
matter such as quantum spin liquids in realistic quantum materials.
Authors: Xue-Jia Yu, Shao-Hang Shi, Limei Xu, Zi-Xiang Li.
In the present article, we aim to quantify the carbon footprint
of BLOOM, a 176-billion parameter language model, across its life cycle. We also study the energy requirements and
carbon emissions of its deployment for inference via an API endpoint receiving
user queries in real-time.
Progress in machine learning (ML) comes with a cost to the environment, given
that training ML models requires significant computational resources, energy
and materials. In the present article, we aim to quantify the carbon footprint
of BLOOM, a 176-billion parameter language model, across its life cycle. We
estimate that BLOOM's final training emitted approximately 24.7 tonnes of
CO$_2$eq if we consider only the dynamic power consumption, and 50.5 tonnes
if we account for all processes ranging from equipment manufacturing to
energy-based operational consumption. We also study the energy requirements and
carbon emissions of its deployment for inference via an API endpoint receiving
user queries in real-time. We conclude with a discussion regarding the
difficulty of precisely estimating the carbon footprint of ML models and future
research directions that can contribute towards improving carbon emissions
reporting.
Authors: Alexandra Sasha Luccioni, Sylvain Viguier, Anne-Laure Ligozat.
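The dynamic-consumption part of such an estimate is simple arithmetic: energy consumed multiplied by the grid's carbon intensity. The inputs below are illustrative assumptions chosen to be consistent with the abstract's 24.7-tonne figure (a low-carbon grid similar to France's), not the paper's exact accounting.

```python
# Back-of-the-envelope sketch: emissions = energy * grid carbon intensity.
energy_kwh = 433_000          # assumed dynamic energy of the final training run
intensity_kg_per_kwh = 0.057  # assumed grid carbon intensity (kg CO2eq / kWh)

dynamic_t = energy_kwh * intensity_kg_per_kwh / 1000   # tonnes CO2eq
print(f"dynamic emissions ~ {dynamic_t:.1f} t CO2eq")  # ~ 24.7 t

# The abstract's 50.5 t additionally folds in equipment manufacturing and
# idle/infrastructure consumption, which simple arithmetic like this omits.
```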
The attention weights on the kernels are further distilled by channel attention and multi-layer feature aggregation to learn global features from speech. This approach provides an efficient solution for improving representation capacity with fewer data resources, because the structure of the model parameters self-adapts to the input.
State-of-the-art speaker verification frameworks have typically focused on
speech enhancement techniques with increasingly deeper (more layers) and wider
(more channels) models to improve their verification performance. Instead,
this paper proposes an approach to increase the model resolution capability
using attention-based dynamic kernels in a convolutional neural network to
adapt the model parameters to be feature-conditioned. The attention weights on
the kernels are further distilled by channel attention and multi-layer feature
aggregation to learn global features from speech. This approach provides an
efficient solution for improving representation capacity with fewer data
resources, because the structure of the model parameters self-adapts to the
input. The proposed dynamic convolutional model achieved 1.62\% EER and 0.18
miniDCF on the VoxCeleb1 test set, a 17\% relative improvement over
ECAPA-TDNN.
Authors: Anna Ollerenshaw, Md Asif Jalal, Thomas Hain.
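For intuition, here is a generic attention-based dynamic convolution in PyTorch: $K$ candidate kernels are mixed with input-dependent attention weights, so the effective filter is feature-conditioned. This follows the general dynamic-convolution recipe, not the paper's exact architecture (which adds channel attention and multi-layer feature aggregation).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv1d(nn.Module):
    def __init__(self, c_in, c_out, k=3, n_kernels=4):
        super().__init__()
        # K candidate kernels, mixed per input by attention weights.
        self.weight = nn.Parameter(torch.randn(n_kernels, c_out, c_in, k) * 0.02)
        self.attn = nn.Sequential(              # attention over the K kernels
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(c_in, n_kernels))

    def forward(self, x):                       # x: (batch, c_in, time)
        a = F.softmax(self.attn(x), dim=-1)     # (batch, n_kernels)
        # Mix kernels per sample, then run one grouped conv over the batch.
        w = torch.einsum("bk,koit->boit", a, self.weight)
        b, _, t = x.shape
        out = F.conv1d(x.reshape(1, -1, t), w.reshape(-1, *w.shape[2:]),
                       padding="same", groups=b)
        return out.reshape(b, -1, t)

x = torch.randn(8, 64, 200)                     # batch of 8 feature sequences
print(DynamicConv1d(64, 32)(x).shape)           # torch.Size([8, 32, 200])
```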
Deep learning models for semantic segmentation are prone to poor performance
in real-world applications due to the highly challenging nature of the task. This leads to a significantly more
accurate view of model uncertainty than conventional Bayesian methods. We demonstrate these advantages through experimental evaluations
of our framework implemented over four different state-of-the-art model
architectures that are trained and evaluated on two benchmark road-scene
segmentation datasets (CamVid and Cityscapes).
Deep learning models for semantic segmentation are prone to poor performance
in real-world applications due to the highly challenging nature of the task.
Model uncertainty quantification (UQ) is one way to address this issue of lack
of model trustworthiness by enabling the practitioner to know how much to trust
a segmentation output. Current UQ methods in this application domain are mainly
restricted to Bayesian based methods which are computationally expensive and
are only able to extract central moments of uncertainty thereby limiting the
quality of their uncertainty estimates. We present a simple framework for
high-resolution predictive uncertainty quantification of semantic segmentation
models that leverages a multi-moment functional definition of uncertainty
associated with the model's feature space in the reproducing kernel Hilbert
space (RKHS). The multiple uncertainty functionals extracted from this
framework are defined by the local density dynamics of the model's feature
space and hence automatically align themselves at the tail-regions of the
intrinsic probability density function of the feature space (where uncertainty
is the highest) in such a way that the successively higher order moments
quantify the more uncertain regions. This leads to a significantly more
accurate view of model uncertainty than conventional Bayesian methods.
Moreover, the extraction of such moments is done in a single-shot computation
making it much faster than Bayesian and ensemble approaches (that involve a
high number of forward stochastic passes of the model to quantify its
uncertainty). We demonstrate these advantages through experimental evaluations
of our framework implemented over four different state-of-the-art model
architectures that are trained and evaluated on two benchmark road-scene
segmentation datasets (CamVid and Cityscapes).
Authors: Rishabh Singh, Jose C. Principe.
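A loose sketch of the underlying intuition (not the authors' framework): treat a test point's uncertainty as a function of successive moments of its kernel similarities to stored training features, so tail-region points are flagged in a single-shot computation. The RBF kernel, bandwidth, and moment definition below are our simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)

# Successive moments of RBF-kernel similarities between a test point and the
# training features: points in low-density (tail) regions receive little
# kernel mass, and higher moments sharpen the distinction.
def rkhs_moments(train_feats, x, gamma=0.03, n_moments=4):
    sims = np.exp(-gamma * np.sum((train_feats - x) ** 2, axis=1))
    return np.array([np.mean(sims ** m) for m in range(1, n_moments + 1)])

feats = rng.normal(size=(1000, 16))           # stand-in for a model's features
inlier = rng.normal(size=16)                  # lies inside the training density
outlier = inlier + 6.0                        # far in the tail region

print("inlier  moments:", np.round(rkhs_moments(feats, inlier), 4))
print("outlier moments:", np.round(rkhs_moments(feats, outlier), 4))
# All extracted in one shot, with no stochastic forward passes of the model.
```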
The Covid pandemic is a clarion call for increased sensitivity to the interconnected nature of social problems facing our world today. The children learnt the Engineering Design Thinking process and worked in online groups of two or three, from concept to completion. Despite the constraints posed by the pandemic, they explored creative ways to think about design and innovation. They completed a variety of tasks by making, tinkering, engineering, assembling, and programming to grasp the intricate relationship between software and hardware. Subsequently, the children showcased their creative abilities through video storytelling to a panel of domain experts.
The Covid pandemic is a clarion call for increased sensitivity to the
interconnected nature of social problems facing our world today. A
future-oriented education on critical issues, such as those outlined in the
United Nations Sustainable Development Goals (UN SDGs) and designing potential
solutions for such problems is an imperative skill that must be imparted to
children to help them navigate their future in today's unpredictable world.
Towards this goal, we have been conducting 3.5-month-long mentoring programs
for pre-university students in India to participate in a STEAM for Social Good
innovation challenge conducted annually by the Government of India. Using
digital and physical computing skills, we helped children explore creative
solutions for social problems through a constructionist approach to learning,
wherein they ideated and reflected upon the problems in their communities. The
children learnt the Engineering Design Thinking process and worked in online
groups of two or three, from concept to completion. Despite the constraints
posed by the pandemic, they explored creative ways to think about design and
innovation. They completed a variety of tasks by making, tinkering,
engineering, assembling, and programming to grasp the intricate relationship
between software and hardware. Subsequently, the children showcased their
creative abilities through video storytelling to a panel of domain experts. In
this paper, we present the children's perspective of their experiences through
this journey, the evaluation metrics based on IEEE design principles, and our
learnings from conducting this initiative as a university-school partnership
model for 84 middle and high school students. The aspirational intent of this
initiative is to make the children better social problem solvers and help them
perceive social problems as opportunities to enhance life for themselves and
their communities.
Authors: Gayathri Manikutty, Sreejith Sasidharan, Bhavani Rao.
We report a critical narrowing of resonances of a driven potential well when
their eigenfrequencies approach the edge of the continuum. The resonances also
acquire unusual sharp-peak shapes at the continuum boundary. The narrow, sharp-peaked resonances can be used for efficient narrow-band
frequency and spatial filtering of light.
We report a critical narrowing of resonances of a driven potential well when
their eigenfrequencies approach the edge of the continuum. The resonances also
acquire unusual sharp-peak shapes at the continuum boundary. The situation can
be realized for electromagnetic waves propagating across dielectric thin
films with periodically modulated interfaces. We demonstrate the general
phenomenon semi-analytically on a simplified model of a driven quantum
potential well, and also by rigorous numerical analysis of the Maxwell
equations for wave propagation across a thin film with modulated interfaces.
We confirm the phenomenon experimentally through measurements of light
reflection from a dielectric thin film deposited on a periodically modulated
surface. The narrow, sharp-peaked resonances can be used for efficient
narrow-band frequency and spatial filtering of light.
Authors: Ignas Lukosiunas, Lina Grineviciute, Julianija Nikitina, Darius Gailevicius, Kestutis Staliunas.
Let $A$ be a Hopf algebra equipped with a projection onto the coordinate Hopf algebra $\mathcal{O}(G)$ of a semisimple algebraic group $G$.
Let $A$ be a Hopf algebra equipped with a projection onto the coordinate Hopf
algebra $\mathcal{O}(G)$ of a semisimple algebraic group $G$. It is shown that
if $A$ admits a suitably non-degenerate comodule $V$ and the induced $G$-module
structure of $V$ is non-trivial, then the third Hochschild homology group of
$A$ is non-trivial.
Authors: Tomasz Brzeziński, Ulrich Krähmer, Réamonn Ó Buachalla, Karen R. Strung.
These are the so-called
zero-group-velocity (ZGV) points. These
applications rely on the correct prediction of the ZGV points. The resulting governing equation is
interpreted as a two-parameter eigenvalue problem.
Dispersion curves of elastic waveguides exhibit points where the group
velocity vanishes while the wavenumber remains finite. These are the so-called
zero-group-velocity (ZGV) points. As the elastodynamic energy at these points
remains confined close to the source, they are of practical interest for
nondestructive testing and quantitative characterization of structures. These
applications rely on the correct prediction of the ZGV points. In this
contribution, we first model the ZGV resonances in anisotropic plates based on
the appearance of an exceptional mode. The resulting governing equation is
interpreted as a two-parameter eigenvalue problem. We then present three
complementary numerical procedures capable of computing ZGV points in arbitrary
nondissipative elastic waveguides. The first method is globally convergent and
guaranteed to find all ZGV points, but it can only be used for small problems.
The second is a very fast, generally applicable, Newton-type iteration that is
locally convergent and requires initial guesses. The third combines both kinds
of approaches, yielding a procedure that is applicable to large problems, does
not require initial guesses, and is likely to find all ZGV points.
Authors: Daniel A. Kiefer, Bor Plestenjak, Hauke Gravenkamp, Claire Prada.
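The Newton-type idea can be demonstrated on a synthetic dispersion relation: a ZGV point satisfies both $D(k,\omega)=0$ and $\partial D/\partial k=0$ (vanishing group velocity at finite wavenumber), a 2x2 root-finding problem. The dispersion relation below is invented for illustration; real elastic waveguides require the discretized operators discussed in the paper.

```python
import numpy as np

# Synthetic dispersion relation D(k, w) with a ZGV-like point at (k, w) = (1, 1).
def F(k, w):
    D = w**2 - ((k**2 - 1.0)**2 + 1.0)
    Dk = -4.0 * k * (k**2 - 1.0)          # dD/dk
    return np.array([D, Dk])

def J(k, w):                              # analytic Jacobian of (D, Dk) wrt (k, w)
    return np.array([[-4.0 * k * (k**2 - 1.0), 2.0 * w],
                     [-12.0 * k**2 + 4.0,      0.0]])

x = np.array([1.3, 1.4])                  # initial guess (k, w); Newton needs one
for _ in range(20):
    step = np.linalg.solve(J(*x), -F(*x))
    x = x + step
    if np.linalg.norm(step) < 1e-12:      # locally convergent, quadratic rate
        break

print(f"ZGV point: k = {x[0]:.6f}, w = {x[1]:.6f}")   # -> k = 1, w = 1
```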
We experiment with lilGym with different models and learning regimes. lilGym is available at https://lil.nlp.cornell.edu/lilgym/.
We present lilGym, a new benchmark for language-conditioned reinforcement
learning in visual environments. lilGym is based on 2,661 highly-compositional
human-written natural language statements grounded in an interactive visual
environment. We annotate all statements with executable Python programs
representing their meaning to enable exact reward computation in every possible
world state. Each statement is paired with multiple start states and reward
functions to form thousands of distinct Markov Decision Processes of varying
difficulty. We experiment with lilGym with different models and learning
regimes. Our results and analysis show that while existing methods are able to
achieve non-trivial performance, lilGym forms a challenging open problem.
lilGym is available at https://lil.nlp.cornell.edu/lilgym/.
Authors: Anne Wu, Kianté Brantley, Noriyuki Kojima, Yoav Artzi.
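To illustrate what "executable Python programs representing meaning" buys, here is a hypothetical sketch (not lilGym's actual API): a statement's annotation is a predicate over the world state, so an exact reward is computable in every possible state.

```python
from dataclasses import dataclass

# Hypothetical world-state representation for illustration only.
@dataclass(frozen=True)
class Item:
    color: str
    shape: str
    box: int

# Statement: "there is exactly one yellow square in the second box"
# (boxes indexed from 0, so the second box is index 1).
def meaning(state):
    return sum(1 for it in state
               if it.color == "yellow" and it.shape == "square"
               and it.box == 1) == 1

def reward(state, target=True):
    # Exact reward: 1 if the statement's truth value matches the target label.
    return float(meaning(state) == target)

state = [Item("yellow", "square", 1), Item("black", "circle", 0)]
print(reward(state))   # 1.0 -- the statement holds in this world state
```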