Papers made digestible
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
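To make the planning idea concrete, here is a minimal sketch of an edge cost in a curvilinear planning space, assuming states expressed as arc length along the taught path and signed lateral offset from it; the cost form and weight are illustrative assumptions, not the paper's metric.

```python
# Minimal sketch of an edge-cost metric in a curvilinear planning space
# (illustrative only; the cost form and weight are assumptions, not the
# paper's metric). A state (s, d) is the arc length s along the taught
# path and the signed lateral offset d from it.
import math

def edge_cost(p, q, w_lateral=2.0):
    """Cost of an edge between curvilinear states p = (s1, d1) and q = (s2, d2)."""
    s1, d1 = p
    s2, d2 = q
    length = math.hypot(s2 - s1, d2 - d1)          # approximate edge length
    lateral = 0.5 * (abs(d1) + abs(d2)) * length   # penalize staying far from the taught path
    return length + w_lateral * lateral

# A sampling-based planner (e.g. an RRT*- or BIT*-style planner) would use this
# cost when connecting sampled (s, d) states, so the repeat path only deviates
# laterally when needed to avoid obstacles.
print(edge_cost((0.0, 0.0), (1.0, 0.4)))
```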
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and to call for more
attention to benefit-risk considerations.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulation shows that dose optimization phase
2-3 designs are able to control the familywise type I error rates and yield
appropriate statistical power with substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
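As a hedged illustration of why careful planning is needed, the Monte Carlo sketch below shows how naively pooling stage 1 and stage 2 data after an interim dose selection inflates the one-sided type I error above the nominal 2.5% level; the binary endpoint, sample sizes, and selection rule are illustrative assumptions, and this is not one of the four designs studied in the paper.

```python
# Hedged simulation sketch: naive pooling after interim dose selection
# inflates the type I error (illustrative assumptions throughout; this is
# not one of the paper's four seamless phase 2-3 designs).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def one_trial(p_null=0.3, K=3, n1=30, n2=60):
    # Stage 1 (phase 2): n1 patients per arm on K doses and a control,
    # all responding at the same null rate.
    stage1_doses = rng.binomial(n1, p_null, size=K)
    stage1_ctrl = rng.binomial(n1, p_null)
    best = int(np.argmax(stage1_doses))            # interim dose selection
    # Stage 2 (phase 3): n2 more patients on the selected dose and control.
    x_dose = stage1_doses[best] + rng.binomial(n2, p_null)
    x_ctrl = stage1_ctrl + rng.binomial(n2, p_null)
    n = n1 + n2                                    # naive pooling of both stages
    p_pool = (x_dose + x_ctrl) / (2 * n)
    se = np.sqrt(p_pool * (1 - p_pool) * 2 / n)
    z = (x_dose / n - x_ctrl / n) / se
    return z > stats.norm.ppf(0.975)               # one-sided 2.5% test

rate = np.mean([one_trial() for _ in range(20000)])
print(f"Type I error with naive pooling: {rate:.3f}")  # noticeably above 0.025
```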
Authors: Liyun Jiang, Ying Yuan.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance by exploiting properties of the
posterior both in our active learning scheme and in the definition of the GP
prior. In particular, we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
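The loop below is a minimal sketch of the general surrogate-plus-active-learning idea behind such methods, written with scikit-learn rather than GPry's own interface; the toy target, acquisition rule, and hyperparameters are illustrative assumptions, not GPry's algorithm.

```python
# Minimal sketch of GP-surrogate inference with active learning
# (scikit-learn illustration of the general idea, not GPry's API).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def log_posterior(x):                       # toy 2D Gaussian target
    return multivariate_normal.logpdf(x, mean=[0.3, -0.2], cov=[[1.0, 0.4], [0.4, 0.8]])

bounds = np.array([[-5.0, 5.0], [-5.0, 5.0]])
X = np.random.uniform(bounds[:, 0], bounds[:, 1], size=(5, 2))   # initial design
y = np.array([log_posterior(x) for x in X])
gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF([1.0, 1.0]), normalize_y=True)

for _ in range(40):                         # active-learning loop
    gp.fit(X, y)

    def neg_acquisition(x):
        mu, sigma = gp.predict(x.reshape(1, -1), return_std=True)
        return -(mu[0] + 2.0 * sigma[0])    # favor high posterior and high uncertainty

    res = minimize(neg_acquisition, np.random.uniform(bounds[:, 0], bounds[:, 1]), bounds=bounds)
    X = np.vstack([X, res.x])
    y = np.append(y, log_posterior(res.x))  # one true (expensive) evaluation per step

# The fitted gp now stands in for the log-posterior and can be sampled cheaply.
```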
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
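To make the kill-and-restart idea concrete, here is a hedged sketch of a b-scaling schedule on a single machine without release dates; it illustrates the general mechanism (geometrically growing probing budgets, with killed jobs losing all progress) rather than the paper's exact strategy or its analysis.

```python
# Hedged sketch of a b-scaling kill-and-restart schedule (illustration of
# the mechanism, not the paper's exact algorithm). Processing times are
# unknown to the scheduler; in round k every unfinished job is run for a
# budget of b**k time units and killed if it does not complete.
def b_scaling_schedule(processing_times, b=2.0):
    remaining = list(range(len(processing_times)))   # job indices still open
    t, k, completion = 0.0, 0, {}
    while remaining:
        budget = b ** k
        still_open = []
        for j in remaining:
            if processing_times[j] <= budget:        # job finishes within its budget
                t += processing_times[j]
                completion[j] = t
            else:                                    # kill; restart from scratch in a later round
                t += budget
                still_open.append(j)
        remaining, k = still_open, k + 1
    return completion

jobs = [3.0, 1.0, 7.0, 2.0]
C = b_scaling_schedule(jobs)
print(sum(C.values()))   # total completion time of the non-clairvoyant schedule
```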
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
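A minimal PyTorch sketch of the frozen-backbone setting is shown below; the ResNet-50 backbone, linear head, and hyperparameters are stand-ins chosen for brevity, not the paper's SwinV2 models or task heads.

```python
# Minimal sketch of the frozen-backbone setting (illustrative stand-ins,
# not the paper's SwinV2-G setup): the pretrained image backbone is frozen
# and only a small task-specific head is trained.
import torch
import torchvision

backbone = torchvision.models.resnet50(weights="DEFAULT")
backbone.fc = torch.nn.Identity()            # expose 2048-d features
for p in backbone.parameters():
    p.requires_grad = False                  # freeze every backbone parameter
backbone.eval()                              # keep normalization statistics fixed

head = torch.nn.Linear(2048, 10)             # e.g. a 10-class downstream task
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)

images = torch.randn(8, 3, 224, 224)         # dummy batch
labels = torch.randint(0, 10, (8,))
with torch.no_grad():                        # no gradients through the frozen backbone
    feats = backbone(images)
loss = torch.nn.functional.cross_entropy(head(feats), labels)
loss.backward()
optimizer.step()
```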
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
Context: Generic programming, as defined by Stepanov, is a methodology for
writing efficient and reusable algorithms by considering only the required
properties of their underlying data types and operations. Generic programming
has proven to be an effective means of constructing libraries of reusable
software components in languages that support it. Generics-related language
design choices play a major role in how conducive generic programming is in
practice.
Inquiry: Several mainstream programming languages (e.g. Java and C++) were
first created without generics; features to support generic programming were
added later, gradually. Much of the existing literature on supporting generic
programming focuses thus on retrofitting generic programming into existing
languages and identifying related implementation challenges. Is the programming
experience significantly better, or different when programming with a language
designed for generic programming without limitations from prior language design
choices?
Approach: We examine Magnolia, a language designed to embody generic
programming. Magnolia is representative of an approach to language design
rooted in algebraic specifications. We repeat a well-known experiment, where we
put Magnolia's generic programming facilities under scrutiny by implementing a
subset of the Boost Graph Library, and reflect on our development experience.
Knowledge: We discover that the idioms identified as key features for
supporting Stepanov-style generic programming in previous studies and work on
the topic do not tell the full story. We clarify which of them are more of a
means to an end, rather than fundamental features for supporting generic
programming. Based on the development experience with Magnolia, we identify
variadics as an additional key feature for generic programming and point out
limitations and challenges of genericity by property.
Grounding: Our work uses a well-known framework for evaluating the generic
programming facilities of a language from the literature to evaluate the
algebraic approach through Magnolia, and we draw comparisons with well-known
programming languages.
Importance: This work gives a fresh perspective on generic programming, and
clarifies which language properties are fundamental, and what their trade-offs
are, when considering support for Stepanov-style generic programming. The understanding of
how to set the ground for generic programming will inform future language
design.
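For readers unfamiliar with Stepanov's definition quoted above, here is a small Python illustration of writing an algorithm against required properties rather than concrete types; it is a generic-programming example in Python, not Magnolia code.

```python
# Small illustration of Stepanov-style generic programming in Python (not
# Magnolia): the algorithm depends only on the property it requires of its
# inputs -- an ordering exposed through "<" -- not on any concrete type.
from typing import Iterable, Protocol, TypeVar

class Comparable(Protocol):
    def __lt__(self, other, /) -> bool: ...

T = TypeVar("T", bound=Comparable)

def min_element(items: Iterable[T]) -> T:
    """Return the smallest item, requiring only that '<' is defined."""
    it = iter(items)
    best = next(it)                  # raises StopIteration on empty input
    for x in it:
        if x < best:
            best = x
    return best

print(min_element([3, 1, 2]))                  # works for ints...
print(min_element(["pear", "apple", "fig"]))   # ...and for strings
```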
Authors: Benjamin Chetioui, Jaakko Järvi, Magne Haveraaen.
Context: Linear Temporal Logic (LTL) has been used widely in verification.
Its importance and popularity have only grown with the revival of temporal
logic synthesis, and with new uses of LTL in robotics and planning activities.
All these uses demand that the user have a clear understanding of what an LTL
specification means.
Inquiry: Despite the growing use of LTL, no studies have investigated the
misconceptions users actually have in understanding LTL formulas. This paper
addresses the gap with a first study of LTL misconceptions.
Approach: We study
researchers' and learners' understanding of LTL in four rounds (three written
surveys, one talk-aloud) spread across a two-year timeframe. Concretely, we
decompose "understanding LTL" into three questions. A person reading a spec
needs to understand what it is saying, so we study the mapping from LTL to
English. A person writing a spec needs to go in the other direction, so we
study English to LTL. However, misconceptions could arise from two sources: a
misunderstanding of LTL's syntax or of its underlying semantics. Therefore, we
also study the relationship between formulas and specific traces.
Knowledge: We find several misconceptions that have consequences for
learners, tool builders, and designers of new property languages. These
findings are already resulting in changes to the Alloy modeling language. We
also find that the English to LTL direction was the most common source of
errors; unfortunately, this is the critical "authoring" direction in which a
subtle mistake can lead to a faulty system. We contribute study instruments
that are useful for training learners (whether academic or industrial) who are
getting acquainted with LTL, and we provide a code book to assist in the
analysis of responses to similar-style questions.
Grounding: Our findings are grounded in the responses to our survey rounds.
Round 1 used Quizius to identify misconceptions among learners in a way that
reduces the threat of expert blind spots. Rounds 2 and 3 confirm that both
additional learners and researchers (who work in formal methods, robotics, and
related fields) make similar errors. Round 4 adds deep support for our
misconceptions via talk-aloud surveys.
Importance: This work provides useful answers to two critical but unexplored
questions: in what ways is LTL tricky and what can be done about it? Our survey
instruments can serve as a starting point for other studies.
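To make the formula-to-trace direction concrete, the sketch below evaluates a small LTL fragment over finite traces; treating traces as finite is a simplifying assumption made here for brevity, not the semantics used in the surveys.

```python
# Tiny LTL evaluator over *finite* traces (a simplifying assumption for
# illustration; standard LTL is defined over infinite traces). A trace is
# a list of sets of atomic propositions holding at each step.
def holds(formula, trace, i=0):
    op = formula[0]
    if op == "atom":
        return formula[1] in trace[i]
    if op == "not":
        return not holds(formula[1], trace, i)
    if op == "and":
        return holds(formula[1], trace, i) and holds(formula[2], trace, i)
    if op == "X":      # next
        return i + 1 < len(trace) and holds(formula[1], trace, i + 1)
    if op == "F":      # eventually
        return any(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "G":      # globally
        return all(holds(formula[1], trace, j) for j in range(i, len(trace)))
    if op == "U":      # until
        return any(holds(formula[2], trace, j)
                   and all(holds(formula[1], trace, k) for k in range(i, j))
                   for j in range(i, len(trace)))
    raise ValueError(op)

# "Every request is eventually granted": G(req -> F grant), with -> rewritten via not/and.
spec = ("G", ("not", ("and", ("atom", "req"), ("not", ("F", ("atom", "grant"))))))
trace = [{"req"}, set(), {"grant"}, {"req", "grant"}]
print(holds(spec, trace))   # True on this trace
```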
Authors: Ben Greenman, Sam Saarinen, Tim Nelson, Shriram Krishnamurthi.
Building on Dempster-Shafer evidence theory (DST), the random permutation set
(RPS) was proposed by replacing the combinatorial number with the permutation
number, thereby incorporating order information. Moreover, RPS includes DST as a
special case when all items occur in the same order. However, the repetition of
items is not allowed in RPS. To address this issue, we propose the repeatable
random permutation set (R2PS), which takes the repetition of items into
consideration. The right and left junctional sum combination rules are proposed,
and their properties, including consistency, the pseudo-Matthew effect, and
associativity, are studied. Based on these properties, a decision support
system application is simulated to show the effectiveness of R2PS.
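For orientation, the counting objects behind the three frameworks can be written side by side; reading R2PS as ordered selections with repetition is our illustrative interpretation, not a formula quoted from the paper:
$$\binom{n}{k}=\frac{n!}{k!\,(n-k)!}\ \ (\text{unordered, as in DST}),\qquad P(n,k)=\frac{n!}{(n-k)!}\ \ (\text{ordered, as in RPS}),\qquad n^{k}\ \ (\text{ordered with repetition allowed}).$$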
Authors: Wenran Yang, Yong Deng.
A robust and reliable system for detecting spam reviews is urgently needed so
that people can purchase products from online sites without being cheated.
Many online sites allow users to post reviews, which creates scope for fake
paid reviews or otherwise untruthful reviews. These concocted reviews can
mislead the general public and leave them uncertain whether to believe a review
or not. Prominent machine learning techniques have been introduced to solve the
problem of spam review detection. The majority of current research has
concentrated on supervised learning methods, which require labeled data, a
shortcoming when it comes to online reviews. Our focus in this article is to
detect deceptive text reviews. To achieve this, we work with both labeled and
unlabeled data and propose deep learning methods for spam review detection,
including the Multi-Layer Perceptron (MLP), the Convolutional Neural Network
(CNN), and a variant of the Recurrent Neural Network (RNN), the Long Short-Term
Memory (LSTM) network. We also apply some traditional machine learning
classifiers, such as Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support
Vector Machine (SVM), to detect spam reviews, and finally we compare the
performance of both the traditional and deep learning classifiers.
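As a hedged sketch of the deep-learning side, the PyTorch model below classifies an integer-encoded review with an LSTM; the architecture and hyperparameters are illustrative choices, not those reported in the paper.

```python
# Hedged sketch of an LSTM-based spam-review classifier (illustrative
# architecture and hyperparameters; not taken from the paper).
import torch
import torch.nn as nn

class SpamLSTM(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=128, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, token_ids):               # (batch, seq_len) integer tokens
        x = self.embed(token_ids)
        _, (h_n, _) = self.lstm(x)               # final hidden state summarizes the review
        return self.head(h_n[-1]).squeeze(-1)    # raw logit: spam vs. genuine

model = SpamLSTM()
logits = model(torch.randint(1, 20000, (8, 200)))   # dummy batch of 8 reviews
loss = nn.functional.binary_cross_entropy_with_logits(logits, torch.ones(8))
loss.backward()
```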
Authors: G. M. Shahariar, Swapnil Biswas, Faiza Omar, Faisal Muhammad Shah, Samiha Binte Hassan.
The black hole photon ring is a prime target for upcoming space-based VLBI
missions seeking to image the fine structure of astrophysical black holes. The
classical Lyapunov exponents of the corresponding nearly bound null geodesics
control the quasinormal ringing of a perturbed black hole as it settles back
down to equilibrium, and they admit a holographic interpretation in terms of
quantum Ruelle resonances of the microstate dual to the Kerr black hole. Recent
work has identified a number of emergent symmetries related to the intricate
self-similar structure of the photon ring. Here, we explore this web of
interrelated phenomena in an exactly soluble example that arises as an
approximation to the near-extremal Kerr black hole. The self-dual warped
AdS$_3$ geometry has a photon ring as well as $\mathsf{SL}(2,\mathbb{R})$
isometries and an exactly calculable quasinormal mode (QNM) spectrum. We show
explicitly that the geometric optics approximation reproduces the eikonal limit
of the exact QNM spectrum, as well as the approximate "near-ring"
wavefunctions. The $\mathsf{SL}(2,\mathbb{R})$ isometries are directly related
to the emergent conformal symmetry of the photon ring in black hole images but
are distinct from a recently discussed conformal symmetry of the eikonal QNM
spectrum. The equivalence of the classical QNM spectrum -- and thus the photon
ring -- to the quantum Ruelle resonances in the context of a spacetime with a
putative holographic dual suggests that the photon ring of a warped black hole
is indeed part of the black hole hologram.
Authors: Daniel Kapec, Alexandru Lupsasca, Andrew Strominger.
We revisit the local well-posedness theory of nonlinear Schr\"odinger and
wave equations in Sobolev spaces $H^s$ and $\dot{H}^s$, $0< s\leq 1$. The
theory has been well established over the past few decades under Sobolev
initial data regular with respect to all spatial variables. Here, however, we
reveal that the initial data need not be regular in every spatial variable;
partial regularity with respect to only some of the variables is sufficient. To
develop such a new theory, we suggest a refined Strichartz
estimate which has a different norm for each spatial variable. This makes it
possible to extract a different integrability/regularity of the data from each
variable.
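For context, the classical Strichartz estimate for the Schrödinger flow on $\mathbb{R}^d$ uses a single spatial norm,
$$\big\|e^{it\Delta}f\big\|_{L^{q}_{t}L^{r}_{x}(\mathbb{R}\times\mathbb{R}^{d})}\lesssim\|f\|_{L^{2}(\mathbb{R}^{d})},\qquad \frac{2}{q}+\frac{d}{r}=\frac{d}{2},\quad q,r\ge 2,\quad (q,r,d)\neq(2,\infty,2).$$
Schematically, the refinement described above replaces the single spatial norm by a mixed norm taken variable by variable, $L^{q}_{t}L^{r}_{x}\;\longrightarrow\;L^{q}_{t}L^{r_{1}}_{x_{1}}\cdots L^{r_{d}}_{x_{d}}$, so that a different integrability exponent, and hence a different regularity requirement, can be attached to each spatial variable; the precise admissibility conditions and right-hand side of the refined estimate are given in the paper and are not reproduced here.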
Authors: Youngwoo Koh, Yoonjung Lee, Ihyeok Seo.
Although Deep Neural Networks (DNNs) have been widely applied in various
real-world scenarios, they are vulnerable to adversarial examples. The current
adversarial attacks in computer vision can be divided into digital attacks and
physical attacks according to their different attack forms. Compared with
digital attacks, which generate perturbations in the digital pixels, physical
attacks are more practical in the real world. Owing to the serious security
problems caused by physically adversarial examples, many works have been
proposed in recent years to evaluate the physically adversarial robustness of
DNNs. In this paper, we present a survey of the current physically adversarial
attacks and physically adversarial defenses in computer vision. To establish a
taxonomy, we organize the current physical attacks by attack task, attack form,
and attack method, so that readers can gain systematic knowledge of this topic
from different aspects. For the physical defenses, we establish the taxonomy
across pre-processing, in-processing, and post-processing of the DNN models to
achieve full coverage of the adversarial defenses. Based on the above survey,
we finally discuss the challenges of this research field and give an outlook on
future directions.
Authors: Xingxing Wei, Bangzheng Pu, Jiefan Lu, Baoyuan Wu.
Computed tomography (CT) is a widely-used imaging technology that assists
clinical decision-making with high-quality human body representations. To
reduce the radiation dose delivered by CT, sparse-view and limited-angle CT have
been developed to preserve image quality. However, these methods are still tied
to a fixed or uniform sampling strategy, which precludes the possibility of
acquiring a better image at an even lower dose. In this paper, we explore
this possibility via learning an active sampling policy that optimizes the
sampling positions for patient-specific, high-quality reconstruction. To this
end, we design an \textit{intelligent agent} for active recommendation of
sampling positions based on on-the-fly reconstruction with obtained sinograms
in a progressive fashion. With such a design, we achieve better performance
than popular uniform sampling on the NIH-AAPM dataset, especially when the
number of views is small. Finally, such a design also enables RoI-aware reconstruction
with improved reconstruction quality within regions of interest (RoI's) that
are clinically important. Experiments on the VerSe dataset demonstrate this
ability of our sampling policy, which is difficult to achieve based on uniform
sampling.
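The loop below is a hedged sketch of the active view-selection idea, with the acquisition, reconstruction, and agent components left as placeholders; it illustrates the progressive recommend-acquire-reconstruct cycle, not the paper's actual models.

```python
# Hedged sketch of the active view-sampling loop (the acquire/reconstruct/
# agent components are placeholders, not the paper's models).
import numpy as np

def active_ct_scan(acquire_view, reconstruct, agent, n_views=20, n_angles=360):
    """Progressively pick projection angles based on the on-the-fly reconstruction."""
    acquired, sinogram = [], {}              # angles measured so far -> projections
    image = np.zeros((256, 256))             # running reconstruction
    for _ in range(n_views):
        candidates = [a for a in range(n_angles) if a not in sinogram]
        scores = agent(image, acquired, candidates)   # score each remaining angle
        nxt = candidates[int(np.argmax(scores))]
        sinogram[nxt] = acquire_view(nxt)             # measure one more projection
        acquired.append(nxt)
        image = reconstruct(sinogram)                 # update the reconstruction
    return image, acquired

# Dummy components so the sketch runs end to end.
acquire = lambda angle: np.random.rand(256)                 # fake projection
recon = lambda sino: np.random.rand(256, 256)               # fake reconstruction
agent = lambda img, got, cand: np.random.rand(len(cand))    # fake policy scores
_, angles = active_ct_scan(acquire, recon, agent, n_views=5)
print(angles)
```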
Authors: Ce Wang, Kun Shang, Haimiao Zhang, Shang Zhao, Dong Liang, S. Kevin Zhou.
This paper proposes a novel technique to obtain better downstream ASR
performance from a joint encoder-decoder self-supervised model when trained
with speech pooled from two different channels (narrow and wide band). The
joint encoder-decoder self-supervised model extends the HuBERT model with a
Transformer decoder. HuBERT performs clustering of features and predicts the
class of every input frame. In simple pooling, which is our baseline, there is
no way to identify the channel information. To incorporate channel information,
we have proposed non-overlapping cluster IDs for speech from different
channels. Our method gives a relative improvement of about 5% over the joint
encoder-decoder self-supervised model built with simple pooling of data, which
serves as our baseline.
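One natural way to realize non-overlapping cluster IDs is to cluster each channel separately and offset one channel's labels by the other's codebook size; the sketch below shows this labeling idea with placeholder features and an assumed codebook size, not the authors' training pipeline.

```python
# Sketch of channel-disambiguated pseudo-labels via non-overlapping cluster
# IDs (illustrative labeling scheme; features and codebook size are assumed).
import numpy as np
from sklearn.cluster import MiniBatchKMeans

K = 100                                          # clusters per channel (assumed)
narrowband_feats = np.random.randn(5000, 39)     # placeholder frame features
wideband_feats = np.random.randn(5000, 39)

km_nb = MiniBatchKMeans(n_clusters=K, random_state=0).fit(narrowband_feats)
km_wb = MiniBatchKMeans(n_clusters=K, random_state=0).fit(wideband_feats)

def frame_targets(feats, channel):
    """Pseudo-labels for HuBERT-style masked prediction, disambiguated by channel."""
    if channel == "narrowband":
        return km_nb.predict(feats)              # IDs in [0, K)
    return km_wb.predict(feats) + K              # IDs in [K, 2K): never overlap

print(frame_targets(wideband_feats[:3], "wideband"))
```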
Authors: Vrunda N. Sukhadia, A. Arunkumar, S. Umesh.
The task of testing whether two uncharacterized devices behave in the same
way, known as cross-platform verification, is crucial for benchmarking quantum
simulators and near-term quantum computers. Cross-platform verification becomes
increasingly challenging as the system's dimensionality increases, and has so
far remained intractable for continuous variable quantum systems. In this
Letter, we develop a data-driven approach, working with limited noisy data and
suitable for continuous variable quantum states. Our approach is based on a
convolutional neural network that assesses the similarity of quantum states
based on a lower-dimensional state representation built from measurement data.
The network can be trained offline with classically simulated data, and is
demonstrated here on non-Gaussian quantum states for which cross-platform
verification could not be achieved with previous techniques. It can also be
applied to cross-platform verification of quantum dynamics and to the problem
of experimentally testing whether two quantum states are equivalent up to
Gaussian unitary transformations.
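As a rough structural sketch of such a network (illustrative only; the paper's state representation, architecture, and training procedure are not reproduced), two measurement-derived inputs are mapped to lower-dimensional representations whose similarity is then compared.

```python
# Rough sketch of a convolutional similarity network (illustrative; not the
# paper's architecture). Each input is assumed to be a 2D histogram built
# from measurement data collected on one device.
import torch
import torch.nn as nn

class StateEncoder(nn.Module):
    def __init__(self, embed_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(), nn.LazyLinear(embed_dim),
        )

    def forward(self, x):                  # x: (batch, 1, H, W) measurement histograms
        return self.net(x)

encoder = StateEncoder()
hist_a = torch.rand(4, 1, 32, 32)          # placeholder data from device A
hist_b = torch.rand(4, 1, 32, 32)          # placeholder data from device B
similarity = nn.functional.cosine_similarity(encoder(hist_a), encoder(hist_b))
print(similarity)
```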
Authors: Ya-Dong Wu, Yan Zhu, Ge Bai, Yuexuan Wang, Giulio Chiribella.