Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
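A minimal Python sketch of the kind of edge-cost metric described above: in a curvilinear frame aligned with the taught path, an edge's cost combines its length with a penalty on lateral deviation, so low-cost paths hug the taught route while still bending around obstacles. The state representation and weighting below are illustrative assumptions, not the authors' implementation.

```python
import math

def edge_cost(p_from, p_to, lateral_weight=5.0):
    """Cost of an edge between two states in a curvilinear frame.

    Each state is (s, l): s = longitudinal position along the taught path,
    l = signed lateral offset from it. The cost adds the edge length to a
    penalty on lateral deviation, so the planner prefers staying near the
    taught path. The weighting is an illustrative assumption.
    """
    ds = p_to[0] - p_from[0]
    dl = p_to[1] - p_from[1]
    length = math.hypot(ds, dl)
    # Penalize the average lateral offset along the edge.
    lateral_penalty = lateral_weight * 0.5 * (abs(p_from[1]) + abs(p_to[1])) * length
    return length + lateral_penalty

# An edge that stays on the taught path is cheaper than one that swerves.
print(edge_cost((0.0, 0.0), (1.0, 0.0)))   # on-path edge
print(edge_cost((0.0, 0.0), (1.0, 0.5)))   # laterally deviating edge
```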
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and to call for more
attention to benefit-risk considerations.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that the seamless phase 2-3
dose-optimization designs are able to control the familywise type I error rates
and yield appropriate statistical power with a substantially smaller sample size
than the conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
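A toy Monte Carlo sketch of one design variant described above, assuming binary endpoints, two doses, stage-1 selection of the dose with the higher observed response rate, and a stage-2 comparison against a concurrent control using stage-2 data only; all sample sizes and response rates are made up for illustration, and the paper's designs and error-control arguments are considerably more involved.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def simulate_trial(p_doses, p_control, n1=30, n2=100, alpha=0.025):
    """One seamless phase 2-3 trial with interim dose selection.

    Stage 1: n1 patients per dose; select the dose with the highest response rate.
    Stage 2: n2 patients on the selected dose vs n2 on a concurrent control,
    compared with a one-sided two-proportion z-test on stage-2 data only.
    """
    stage1 = [rng.binomial(n1, p) for p in p_doses]
    best = int(np.argmax(stage1))

    x_trt = rng.binomial(n2, p_doses[best])
    x_ctl = rng.binomial(n2, p_control)
    p_pool = (x_trt + x_ctl) / (2 * n2)
    se = np.sqrt(p_pool * (1 - p_pool) * 2 / n2)
    z = (x_trt / n2 - x_ctl / n2) / se if se > 0 else 0.0
    return z > stats.norm.ppf(1 - alpha)

# Estimated type I error under the global null (all doses equal to control).
rejections = sum(simulate_trial([0.3, 0.3], 0.3) for _ in range(5000))
print("estimated type I error:", rejections / 5000)
```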
Authors: Liyun Jiang, Ying Yuan.
We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
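For readers who want a feel for the surrogate-plus-active-learning idea, here is a schematic Python sketch built from scikit-learn components; it is not the GPry API (for that, see the repository above). A GP regressor models the log-posterior, an SVM classifier screens out candidate points predicted to lie in extremely low-probability regions, and new evaluations are placed by maximizing an upper-confidence-bound acquisition over random candidates. The toy posterior, threshold, and acquisition rule are assumptions for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.svm import SVC

rng = np.random.default_rng(1)

def log_post(x):                      # toy 2-d log-posterior (banana-shaped)
    return -0.5 * (x[0] ** 2 + 10.0 * (x[1] - x[0] ** 2) ** 2)

X = rng.uniform(-3, 3, size=(8, 2))                  # initial design
y = np.array([log_post(x) for x in X])

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF([1.0, 1.0]),
                              normalize_y=True)
clf = SVC(kernel="rbf")                              # flags "hopeless" regions

for it in range(40):                                 # active-learning loop
    gp.fit(X, y)
    relevant = y > y.max() - 20.0                    # crude "relevant region" label
    cand = rng.uniform(-3, 3, size=(500, 2))
    if relevant.sum() < len(y):                      # need two classes to train SVC
        clf.fit(X, relevant)
        cand = cand[clf.predict(cand)]               # drop excluded candidates
    if len(cand) == 0:
        continue
    mu, sigma = gp.predict(cand, return_std=True)
    acq = mu + 2.0 * sigma                           # upper-confidence-bound style
    x_new = cand[int(np.argmax(acq))]
    X = np.vstack([X, x_new])
    y = np.append(y, log_post(x_new))

print("best log-posterior found:", y.max())
```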
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting. However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
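An illustrative Python simulation of the deterministic $b$-scaling kill-and-restart idea for jobs without release dates (a sketch of the concept, not the paper's exact strategy or analysis): in phase $k$ every unfinished job is granted a budget $b^k$; jobs that fit the budget run to completion, the rest are killed after the budget elapses, losing all progress, and are retried in the next phase.

```python
def b_scaling_total_completion_time(proc_times, b=2.0):
    """Sum of completion times under a simple b-scaling kill-and-restart schedule.

    Non-clairvoyant: a processing time is only 'revealed' when the job finishes
    within its budget. Killed jobs lose all progress (kill-and-restart).
    """
    remaining = list(enumerate(proc_times))
    t = 0.0
    total = 0.0
    k = 0
    while remaining:
        budget = b ** k
        still_unfinished = []
        for j, p in remaining:
            if p <= budget:          # job finishes within this phase's budget
                t += p
                total += t
            else:                    # job is killed after the full budget
                t += budget
                still_unfinished.append((j, p))
        remaining = still_unfinished
        k += 1
    return total

jobs = [0.5, 1.0, 3.0, 7.0]
print("b-scaling:", b_scaling_total_completion_time(jobs))
# Clairvoyant optimum for comparison: shortest processing time first.
print("optimal (SPT):", sum((len(jobs) - i) * p for i, p in enumerate(sorted(jobs))))
```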
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
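A minimal PyTorch sketch of the frozen-backbone setting studied above, using a torchvision ResNet-50 as a stand-in for the pretrained image model; the paper works with Swin/SwinV2 backbones and full task-specific heads, so this only illustrates freezing the backbone and training a lightweight head on top.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

backbone = resnet50(weights=ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()               # expose 2048-d features
for p in backbone.parameters():
    p.requires_grad = False               # freeze: no backbone gradients
backbone.eval()                           # keep BN statistics fixed

head = nn.Linear(2048, 10)                # only the task head is trained
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    with torch.no_grad():                 # backbone acts as a fixed feature extractor
        feats = backbone(images)
    loss = criterion(head(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Dummy batch just to show the call signature.
print(train_step(torch.randn(4, 3, 224, 224), torch.randint(0, 10, (4,))))
```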
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
Sufficient conditions for the uniqueness of
the fixpoint are established. Examples include known and new fixpoint theorems
for metric spaces, fuzzy metric spaces, and probabilistic metric spaces.
We prove a fixpoint theorem for contractions on Cauchy-complete
quantale-enriched categories. It holds for any quantale whose underlying
lattice is continuous, and applies to contractions whose control function is
sequentially lower-semicontinuous. Sufficient conditions for the uniqueness of
the fixpoint are established. Examples include known and new fixpoint theorems
for metric spaces, fuzzy metric spaces, and probabilistic metric spaces.
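For concreteness, the most familiar metric-space instance of such results is the classical Banach contraction principle, recalled below; the paper's theorem is considerably more general, covering other quantales and contractions governed by a control function.

```latex
% Classical special case recalled for illustration; not the paper's general statement.
\begin{theorem}[Banach contraction principle]
Let $(X,d)$ be a complete metric space and $f\colon X\to X$ a map with
\[
  d\big(f(x),f(y)\big)\;\le\; c\, d(x,y) \qquad \text{for all } x,y\in X
\]
for some constant $0\le c<1$. Then $f$ has a unique fixpoint $x^\ast$, and for
every $x_0\in X$ the iterates $x_{n+1}=f(x_n)$ converge to $x^\ast$.
\end{theorem}
```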
Authors: Arij Benkhadra, Isar Stubbe.
Video event extraction aims to detect salient events from a video and identify the arguments for each event as well as their semantic roles. Existing methods focus on capturing the overall visual scene of each frame, ignoring fine-grained argument-level information. We further propose Object State Embedding, Object Motion-aware Embedding and Argument Interaction Embedding to encode and track these changes respectively. Experiments on various video event extraction tasks demonstrate significant improvements compared to state-of-the-art models.
Video event extraction aims to detect salient events from a video and
identify the arguments for each event as well as their semantic roles. Existing
methods focus on capturing the overall visual scene of each frame, ignoring
fine-grained argument-level information. Inspired by the definition of events
as changes of states, we propose a novel framework to detect video events by
tracking the changes in the visual states of all involved arguments, which are
expected to provide the most informative evidence for the extraction of video
events. In order to capture the visual state changes of arguments, we decompose
them into changes in pixels within objects, displacements of objects, and
interactions among multiple arguments. We further propose Object State
Embedding, Object Motion-aware Embedding and Argument Interaction Embedding to
encode and track these changes respectively. Experiments on various video event
extraction tasks demonstrate significant improvements compared to
state-of-the-art models. In particular, on verb classification, we achieve
3.49% absolute gains (19.53% relative gains) in F1@5 on Video Situation
Recognition.
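A highly simplified PyTorch sketch of the decomposition described above, combining an appearance-change cue, a displacement cue, and a pairwise interaction cue per argument; the module shapes and the fusion are hypothetical illustrations, not the paper's architecture.

```python
import torch
import torch.nn as nn

class ArgumentStateEncoder(nn.Module):
    """Toy encoder combining three cues per argument across two frames:
    appearance change (object state), displacement (motion), and pairwise
    interaction with another argument. Dimensions are illustrative assumptions."""
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.state = nn.Linear(feat_dim, hidden)         # object state embedding
        self.motion = nn.Linear(4, hidden)               # object motion-aware embedding
        self.interact = nn.Linear(2 * feat_dim, hidden)  # argument interaction embedding
        self.out = nn.Linear(3 * hidden, hidden)

    def forward(self, feat_t, feat_t1, box_t, box_t1, other_feat):
        s = self.state(feat_t1 - feat_t)                 # pixel/appearance change
        m = self.motion(box_t1 - box_t)                  # bounding-box displacement
        i = self.interact(torch.cat([feat_t1, other_feat], dim=-1))
        return self.out(torch.cat([s, m, i], dim=-1))

enc = ArgumentStateEncoder()
x = enc(torch.randn(2, 256), torch.randn(2, 256),
        torch.randn(2, 4), torch.randn(2, 4), torch.randn(2, 256))
print(x.shape)   # torch.Size([2, 128])
```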
Authors: Guang Yang, Manling Li, Xudong Lin, Jiajie Zhang, Shih-Fu Chang, Heng Ji.
We use angle-resolved photoemission
spectroscopy to investigate changes in the Fermi surface of this material under
surface doping with potassium.
We report on the interplay between a van Hove singularity and a charge
density wave state in 2H-TaSe$_{2}$. We use angle-resolved photoemission
spectroscopy to investigate changes in the Fermi surface of this material under
surface doping with potassium. At high doping, we observe modifications which
imply the disappearance of the $(3\times 3)$ charge density wave and formation
of a different correlated state. Using a tight-binding-based approach as well
as an effective model, we explain our observations as a consequence of coupling
between the single-particle Lifshitz transition during which the Fermi level
passes a van Hove singularity and the charge density order. The high electronic
density of states associated with the van Hove singularity induces a change in
the periodicity of the charge density wave from the known $(3\times 3)$ to a
new $(2\times 2)$ superlattice.
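As a rough illustration of why a van Hove singularity matters here, the numpy sketch below computes the density of states of a simple nearest-neighbour tight-binding band on a triangular lattice (the lattice geometry relevant to 2H-TaSe$_2$) and locates the van Hove peak; the hopping, grid, and single-band model are arbitrary assumptions, not the paper's tight-binding fit.

```python
import numpy as np

# Nearest-neighbour tight-binding band on a triangular lattice (t, a arbitrary).
t, a = 1.0, 1.0
kx = np.linspace(-2 * np.pi, 2 * np.pi, 800, endpoint=False) / a
ky = np.linspace(-2 * np.pi / np.sqrt(3), 2 * np.pi / np.sqrt(3), 800,
                 endpoint=False) / a
KX, KY = np.meshgrid(kx, ky)
E = -2 * t * (np.cos(KX * a)
              + 2 * np.cos(KX * a / 2) * np.cos(np.sqrt(3) * KY * a / 2))

# A histogram of band energies over a full reciprocal-space period approximates
# the density of states; the sharp peak marks the van Hove singularity at the
# saddle points of the dispersion.
dos, edges = np.histogram(E.ravel(), bins=200, density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
print("van Hove peak near E =", centers[np.argmax(dos)])
```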
Authors: William R. B. Luckin, Yiwei Li, Juan Jiang, Surani M. Gunasekera, Dharmalingam Prabhakaran, Felix Flicker, Yulin Chen, Marcin Mucha-Kruczynski.
The idea has been applied in two situations. The validity has been demonstrated on the MNIST and CIFAR-10 datasets. Further improvements and some related open questions are also discussed.
We propose to explicitly employ a hierarchical coarse-grained structure in
artificial neural networks to improve interpretability without
degrading performance. The idea has been applied in two situations. One is a
neural network called TaylorNet, which aims to approximate the general mapping
from input data to output result in terms of Taylor series directly, without
resorting to any magic nonlinear activations. The other is a new setup for data
distillation, which can perform multi-level abstraction of the input dataset
and generate new data that possesses the relevant features of the original
dataset and can be used as references for classification. In both cases, the
coarse-grained structure plays an important role in simplifying the network and
improving both the interpretability and efficiency. The validity has been
demonstrated on the MNIST and CIFAR-10 datasets. Further improvements and some
related open questions are also discussed.
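A toy PyTorch sketch of the TaylorNet idea as described above: approximate the input-output mapping by a truncated Taylor (polynomial) expansion with learnable coefficients and no nonlinear activation functions. The expansion order and sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TaylorLayer(nn.Module):
    """y = W0 + W1·x + W2·(x ⊗ x): a second-order truncated Taylor expansion
    with learnable coefficients and no nonlinear activation."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)                          # W0 + W1·x
        self.quadratic = nn.Linear(in_dim * in_dim, out_dim, bias=False)  # W2·(x ⊗ x)

    def forward(self, x):
        outer = torch.einsum('bi,bj->bij', x, x).flatten(1)   # second-order terms
        return self.linear(x) + self.quadratic(outer)

model = TaylorLayer(in_dim=8, out_dim=3)
print(model(torch.randn(4, 8)).shape)   # torch.Size([4, 3])
```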
Authors: Xi-Ci Yang, Z. Y. Xie, Xiao-Tao Yang.
In PTL, accurately quantifying the domain gap is critical. To do that, we
theoretically demonstrate that the feature representation space of a given
object detector can be modeled as a multivariate Gaussian distribution from
which the Mahalanobis distance between a virtual object and the Gaussian
distribution of each object category in the representation space can be readily
computed. Experiments show that PTL results in a substantial performance
increase over the baseline, especially in the small data and the cross-domain
regime.
To effectively interrogate UAV-based images for detecting objects of
interest, such as humans, it is essential to acquire large-scale UAV-based
datasets that include human instances with various poses captured from widely
varying viewing angles. As a viable alternative to laborious and costly data
curation, we introduce Progressive Transformation Learning (PTL), which
gradually augments a training dataset by adding transformed virtual images with
enhanced realism. Generally, a virtual2real transformation generator in the
conditional GAN framework suffers from quality degradation when a large domain
gap exists between real and virtual images. To deal with the domain gap, PTL
takes a novel approach that progressively iterates the following three steps:
1) select a subset from a pool of virtual images according to the domain gap,
2) transform the selected virtual images to enhance realism, and 3) add the
transformed virtual images to the training set while removing them from the
pool. In PTL, accurately quantifying the domain gap is critical. To do that, we
theoretically demonstrate that the feature representation space of a given
object detector can be modeled as a multivariate Gaussian distribution from
which the Mahalanobis distance between a virtual object and the Gaussian
distribution of each object category in the representation space can be readily
computed. Experiments show that PTL results in a substantial performance
increase over the baseline, especially in the small data and the cross-domain
regime.
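A small numpy sketch of the domain-gap measure described above: fit a Gaussian (mean and covariance) to a detector's features for each object category of real images, then score a virtual object's feature by its Mahalanobis distance to the category Gaussians. The feature dimension, toy data, and nearest-category rule are illustrative assumptions; how PTL turns these distances into a selection criterion is described in the paper.

```python
import numpy as np

def fit_category_gaussians(features_by_class):
    """features_by_class: dict class_id -> (N_c, D) array of real-image features."""
    stats = {}
    for c, F in features_by_class.items():
        mu = F.mean(axis=0)
        cov = np.cov(F, rowvar=False) + 1e-6 * np.eye(F.shape[1])  # regularize
        stats[c] = (mu, np.linalg.inv(cov))
    return stats

def mahalanobis_gap(feature, stats):
    """Domain gap of one virtual-object feature = distance to the closest category."""
    dists = []
    for mu, cov_inv in stats.values():
        d = feature - mu
        dists.append(float(np.sqrt(d @ cov_inv @ d)))
    return min(dists)

rng = np.random.default_rng(0)
real = {0: rng.normal(0, 1, (200, 16)), 1: rng.normal(3, 1, (200, 16))}
stats = fit_category_gaussians(real)
virtual_feat = rng.normal(1.5, 1, 16)
print("domain gap:", mahalanobis_gap(virtual_feat, stats))
```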
Authors: Yi-Ting Shen, Hyungtae Lee, Heesung Kwon, Shuvra Shikhar Bhattacharyya.
The results show differences depending on the test concepts considered and problems with very specific concepts. These evaluations were performed using a vision transformer model for image classification.
We generate synthetic images with the "Stable Diffusion" image generation
model using the Wordnet taxonomy and the definitions of concepts it contains.
This synthetic image database can be used as training data for data
augmentation in machine learning applications, and it is used to investigate
the capabilities of the Stable Diffusion model.
Analyses show that Stable Diffusion can produce correct images for a large
number of concepts, but also a large variety of different representations. The
results show differences depending on the test concepts considered and problems
with very specific concepts. These evaluations were performed using a vision
transformer model for image classification.
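A minimal sketch of the generation pipeline described above using the Hugging Face diffusers library and NLTK's WordNet interface; the model checkpoint, prompt template, and example concept are assumptions for illustration, not necessarily the authors' exact setup.

```python
# pip install diffusers transformers nltk torch
import nltk
from nltk.corpus import wordnet as wn
from diffusers import StableDiffusionPipeline
import torch

nltk.download("wordnet")

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Build a prompt from a WordNet concept name and its gloss (definition).
synset = wn.synset("tiger.n.02")
prompt = (f"a photo of a {synset.lemmas()[0].name().replace('_', ' ')}, "
          f"{synset.definition()}")

image = pipe(prompt, num_inference_steps=30).images[0]
image.save("tiger.png")
```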
Authors: Andreas Stöckl.
The goal is to find $s$.
Simon's problem is a standard example of a problem that is exponential in
classical sense, while it admits a polynomial solution in quantum computing. It
is about a function $f$ for which it is given that a unique non-zero vector $s$
exists for which $f(x) = f(x \oplus s)$ for all $x$, where $\oplus$ is the
exclusive or operator. The goal is to find $s$. The exponential lower bound for
the classical sense assumes that $f$ only admits black box access. In this
paper we investigate classical complexity when $f$ is given by a standard
representation like a circuit. We focus on finding the vector space of all
vectors $s$ for which $f(x) = f(x \oplus s)$ for all $x$, for any given $f$.
Two main results are: (1) if $f$ is given by any circuit, then checking whether
this vector space contains a non-zero element is NP-hard, and (2) if $f$ is
given by any ordered BDD, then a basis of this vector space can be computed in
polynomial time.
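A brute-force illustration (exponential in $n$, so for small $n$ only) of the object studied above: given any concrete $f$ on $n$-bit inputs, compute the set of all $s$ with $f(x) = f(x \oplus s)$ for every $x$, which is always a GF(2) vector space. The example $f$ is an arbitrary toy function, not from the paper.

```python
def invariance_space(f, n):
    """All s in {0,1}^n (as integers) with f(x) = f(x ^ s) for every x.
    The result is closed under XOR, i.e., a GF(2) vector space."""
    return [s for s in range(2 ** n)
            if all(f(x) == f(x ^ s) for x in range(2 ** n))]

# Toy f with hidden period s = 0b101 on 3 bits: f(x) = min(x, x ^ 0b101).
f = lambda x: min(x, x ^ 0b101)
print([format(s, "03b") for s in invariance_space(f, 3)])   # ['000', '101']
```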
Authors: Hans Zantema.
Along the way, we construct an existentially decidable field of positive characteristic with an existentially undecidable finite extension, modifying a construction due to Kesavan Thanagopal.
We construct an existentially undecidable complete discretely valued field of
mixed characteristic with existentially decidable residue field and decidable
algebraic part, answering a question by Anscombe-Fehm in a strong way. Along
the way, we construct an existentially decidable field of positive
characteristic with an existentially undecidable finite extension, modifying a
construction due to Kesavan Thanagopal.
Authors: Philip Dittmann.
The competition between antiferromagnetism and superconductivity is one of
the central questions in the research of strongly correlated systems. Our results show that superconductivity emerges
when the antiferromagnetism is suppressed by tuning the filling or the
anisotropy of the interlayer Heisenberg interaction. This model can be seen as
an analogue of unconventional superconductors and may help us to understand the
transition from an antiferromagnetic insulator to a superconductor.
The competition between antiferromagnetism and superconductivity is one of
the central questions in the research of strongly correlated systems. In this
work, we utilize a double-layer model containing a Hubbard interaction and an
interlayer Heisenberg interaction to reveal their competition. This model is
free of the sign problem under certain conditions, and we perform projector quantum
Monte Carlo simulations to extract the ground-state correlations of magnetism
and superconductivity. Our results show that superconductivity emerges
when the antiferromagnetism is suppressed by tuning the filling or the
anisotropy of the interlayer Heisenberg interaction. This model can be seen as
an analogue of unconventional superconductors and may help us to understand the
transition from an antiferromagnetic insulator to a superconductor.
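For readers who want the model written out, a generic bilayer Hubbard model with an interlayer Heisenberg coupling takes the form below; this is a standard form consistent with the description above, and the paper's exact parameterization (including the anisotropy of the interlayer term) may differ.

```latex
% Illustrative bilayer Hubbard + interlayer Heisenberg Hamiltonian.
\begin{equation}
H = -t \sum_{\ell=1,2}\sum_{\langle i,j\rangle,\sigma}
      \left( c^{\dagger}_{\ell i\sigma} c_{\ell j\sigma} + \mathrm{h.c.} \right)
    + U \sum_{\ell,i} n_{\ell i\uparrow} n_{\ell i\downarrow}
    + J \sum_{i} \mathbf{S}_{1 i}\cdot \mathbf{S}_{2 i},
\end{equation}
```

Here $\ell$ labels the two layers and $\mathbf{S}_{\ell i}$ is the electron spin operator on site $i$ of layer $\ell$; replacing the isotropic $\mathbf{S}_{1i}\cdot\mathbf{S}_{2i}$ term by an XXZ-type coupling is one way to introduce the anisotropy of the interlayer interaction mentioned above.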
Authors: Runyu Ma, Tianxing Ma.
We assess the qualitative and quantitative performance of these techniques on three different datasets and describe our findings. The results shed fresh light on the notion of explainability in GNNs, particularly GATs.
With the growing use of deep learning methods, particularly graph neural
networks, which encode intricate interconnectedness information, for a variety
of real tasks, there is a necessity for explainability in such settings. In
this paper, we demonstrate the applicability of popular explainability
approaches on Graph Attention Networks (GAT) for a graph-based super-pixel
image classification task. We assess the qualitative and quantitative
performance of these techniques on three different datasets and describe our
findings. The results shed fresh light on the notion of explainability in
GNNs, particularly GATs.
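A compact PyTorch Geometric sketch of the kind of setup examined above: a small GAT on graph-structured (e.g. super-pixel) data, with a simple gradient-saliency explanation for one prediction. The two-layer architecture and the saliency choice are illustrative assumptions, not the paper's exact models or explainers.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv, global_mean_pool

class GAT(torch.nn.Module):
    def __init__(self, in_dim, hidden, num_classes, heads=4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden, heads=heads)
        self.conv2 = GATConv(hidden * heads, hidden, heads=1)
        self.lin = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.elu(self.conv1(x, edge_index))
        x = F.elu(self.conv2(x, edge_index))
        return self.lin(global_mean_pool(x, batch))

# Toy graph standing in for the super-pixel graph of one image.
x = torch.randn(6, 3, requires_grad=True)                  # 6 nodes, 3 features
edge_index = torch.tensor([[0, 1, 2, 3, 4, 5], [1, 2, 3, 4, 5, 0]])
batch = torch.zeros(6, dtype=torch.long)

model = GAT(in_dim=3, hidden=8, num_classes=4)
logits = model(x, edge_index, batch)
logits[0, logits.argmax()].backward()

# Gradient saliency: nodes whose features most influence the predicted class.
print(x.grad.abs().sum(dim=1))
```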
Authors: Harsh Patel, Shivam Sahni.