Papers made digestible
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
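The curvilinear planning space can be made concrete with a small sketch (illustrative only, not VT&R3 code; the path representation and function name are my own): a robot position is mapped to curvilinear coordinates by projecting it onto a piecewise-linear teach path, yielding the arc length along the path and the signed lateral offset that the planner's edge-cost metric would penalize.

```python
import math

def to_curvilinear(path, p):
    """Project point p onto a piecewise-linear teach path and return
    (s, d): arc length along the path and signed lateral offset."""
    best = None
    s_acc = 0.0
    for (ax, ay), (bx, by) in zip(path, path[1:]):
        vx, vy = bx - ax, by - ay
        seg = math.hypot(vx, vy)
        # clamp the projection parameter to the segment
        t = max(0.0, min(1.0, ((p[0] - ax) * vx + (p[1] - ay) * vy) / seg ** 2))
        cx, cy = ax + t * vx, ay + t * vy
        d = math.hypot(p[0] - cx, p[1] - cy)
        # sign of the lateral offset from the 2-D cross product
        sign = 1.0 if vx * (p[1] - ay) - vy * (p[0] - ax) >= 0 else -1.0
        cand = (d, s_acc + t * seg, sign * d)
        if best is None or cand[0] < best[0]:
            best = cand
        s_acc += seg
    return best[1], best[2]
```

A planner working in these coordinates can then trade off progress along `s` against deviation `|d|` from the taught path.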
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and calls for more
attention to benefit-risk considerations.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that phase 2-3
dose-optimization designs are able to control the familywise type I error rates and yield
appropriate statistical power with substantially smaller sample size than the
conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
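One statistical consideration behind such designs can be seen in a toy Monte Carlo sketch (my simplification, not one of the paper's four designs): if stage-1 data from the interim-selected dose are naively pooled into the final comparison against a concurrent control, the selection biases the pooled response rate upward and inflates the one-sided type I error above its nominal 0.05 level.

```python
import random

def simulate_phase23(n1=50, n2=100, p_doses=(0.3, 0.3, 0.3, 0.3), p_ctrl=0.3,
                     alpha_z=1.645, reps=2000, seed=0):
    """Toy seamless phase 2-3 simulation under the null (all rates equal).
    Stage 1: n1 patients per dose, pick the highest observed response rate.
    Stage 2: n2 more patients on the selected dose; compare the pooled
    stage-1 + stage-2 data against an equal-sized concurrent control with a
    one-sided two-proportion z-test. Pooling the *selected* dose's stage-1
    data is exactly what biases the test upward."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        stage1 = [sum(rng.random() < p for _ in range(n1)) for p in p_doses]
        best = max(range(len(p_doses)), key=stage1.__getitem__)
        n_t = n1 + n2
        x_t = stage1[best] + sum(rng.random() < p_doses[best] for _ in range(n2))
        x_c = sum(rng.random() < p_ctrl for _ in range(n_t))
        pbar = (x_t + x_c) / (2 * n_t)
        se = (2 * pbar * (1 - pbar) / n_t) ** 0.5
        if se > 0 and (x_t - x_c) / n_t / se > alpha_z:
            rejections += 1
    return rejections / reps
```

Proper seamless designs adjust for this selection effect; the sketch only illustrates why the adjustment is needed.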
Authors: Liyun Jiang, Ying Yuan.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance using properties of the posterior
in our active learning scheme and for the definition of the GP prior. In
particular, we account for the expected dynamical range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
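The core loop can be sketched in a few lines (an illustrative toy, not GPry's actual API: the RBF kernel, UCB-style acquisition rule, and 1-D Gaussian toy posterior are my simplifications). A GP surrogate is fit to the evaluated log-posterior values, and the next evaluation point is chosen where the surrogate is promising but uncertain.

```python
import numpy as np

def rbf(a, b, scale=1.0):
    # Squared-exponential kernel on 1-D inputs.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / scale) ** 2)

def gp_posterior(X, y, Xs, jitter=1e-8):
    # Zero-mean GP regression: posterior mean and variance at the grid Xs.
    K = rbf(X, X) + jitter * np.eye(len(X))
    Ks = rbf(X, Xs)
    mu = Ks.T @ np.linalg.solve(K, y)
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks), axis=0)
    return mu, np.maximum(var, 0.0)

def log_post(x):
    return -0.5 * x ** 2          # toy log-posterior: standard normal

X = np.array([-3.0, 0.5, 2.0])    # initial evaluations
y = log_post(X)
grid = np.linspace(-4.0, 4.0, 201)
for _ in range(10):               # active-learning loop
    mu, var = gp_posterior(X, y, grid)
    acq = mu + 2.0 * np.sqrt(var)             # explore high-value, uncertain regions
    for xn in grid[np.argsort(-acq)]:         # best not-yet-evaluated point
        if not np.any(np.isclose(X, xn)):
            break
    X = np.append(X, xn)
    y = np.append(y, log_post(xn))
mu, _ = gp_posterior(X, y, grid)
x_map = grid[np.argmax(mu)]       # surrogate's estimate of the posterior mode
```

After a handful of evaluations the surrogate already localizes the mode, which is what makes the approach attractive for expensive likelihoods.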
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
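The $b$-scaling kill-and-restart strategy can be illustrated with a small simulation (a sketch under my reading of the strategy, not the paper's exact formulation: in round $i$ every unfinished job may run for at most $b^i$ time units and is killed, losing all progress, if it does not finish).

```python
def b_scaling_completions(jobs, b=2.0):
    """Non-clairvoyant kill-and-restart schedule on one machine.
    Round i grants each unfinished job a budget of b**i time units;
    a killed job loses its progress and restarts from scratch later."""
    t, done, remaining, i = 0.0, {}, list(enumerate(jobs)), 0
    while remaining:
        budget, still = b ** i, []
        for j, p in remaining:
            if p <= budget:
                t += p                 # job completes within its budget
                done[j] = t
            else:
                t += budget            # budget exhausted: kill, work is lost
                still.append((j, p))
        remaining, i = still, i + 1
    return done

jobs = [1.0, 3.0, 7.0, 2.0]            # true (hidden) processing times
alg = sum(b_scaling_completions(jobs).values())
# clairvoyant optimum: shortest processing time first
opt = sum((len(jobs) - k) * p for k, p in enumerate(sorted(jobs)))
```

On this instance the ratio `alg / opt` is far below the worst-case guarantee of $1+3\sqrt{3}\approx 6.197$; the bound is only attained on adversarial instances.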
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
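The frozen-backbone pattern itself is easy to sketch (a toy in which a fixed random feature map stands in for a frozen pretrained encoder such as SwinV2; the task, sizes, and names are my own, and only the linear head is trained):

```python
import numpy as np

rng = np.random.default_rng(0)

# "Pretrained" weights of the backbone: fixed, never updated.
W_frozen = rng.normal(size=(2, 64))
b_frozen = rng.normal(size=64)

def backbone(x):
    return np.tanh(x @ W_frozen + b_frozen)   # frozen feature extractor

X = rng.normal(size=(200, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)     # toy nonlinear downstream task
F = backbone(X)
head, *_ = np.linalg.lstsq(F, y, rcond=None)  # only the head is fit
acc = float(((F @ head > 0.5) == (y > 0.5)).mean())
```

Because the backbone is shared and fixed, many task heads can be attached to the same features, which is the appeal of the frozen setting studied in the paper.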
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
In this paper, we provide a Hilbert-style axiomatisation for the crisp
bi-G\"{o}del modal logic $\KbiG$. We prove its completeness w.r.t.\ crisp
Kripke models where formulas at each state are evaluated over the standard
bi-G\"{o}del algebra on $[0,1]$. We also consider a paraconsistent expansion of
$\KbiG$ with a De Morgan negation $\neg$ which we dub $\KGsquare$. We devise a
Hilbert-style calculus for this logic and, as a~con\-se\-quence of
a~conservative translation from $\KbiG$ to $\KGsquare$, prove its completeness
w.r.t.\ crisp Kripke models with two valuations over $[0,1]$ connected via
$\neg$.
For these two logics, we establish decidability and show that validity is
$\mathsf{PSPACE}$-complete.
We also study the semantical properties of $\KbiG$ and $\KGsquare$. In
particular, we show that the Glivenko theorem holds only in finitely branching
frames. We also explore the classes of formulas that define the same classes of
frames both in $\mathbf{K}$ (the classical modal logic) and the crisp G\"{o}del
modal logic $\KG^c$. We show that, among others, all Sahlqvist formulas and all
formulas $\phi\rightarrow\chi$ where $\phi$ and $\chi$ are monotone, define the
same classes of frames in $\mathbf{K}$ and $\KG^c$.
Authors: Marta Bilkova, Sabine Frittella, Daniil Kozhemiachenko.
Bouchet conjectured in 1983 that every flow-admissible signed graph admits a
nowhere-zero 6-flow; the conjecture is equivalent to its restriction to cubic
signed graphs. In this paper, we prove that every flow-admissible
$3$-edge-colorable cubic signed graph admits a nowhere-zero $10$-flow. Together
with the 4-color theorem, this implies that every flow-admissible bridgeless
planar signed graph admits a nowhere-zero $10$-flow. As a byproduct, we also
show that every flow-admissible hamiltonian signed graph admits a nowhere-zero
$8$-flow.
Authors: Liangchen Li, Chong Li, Rong Luo, C. -Q Zhang, Hailing Zhang.
Since the first detection of gravitational waves by the LIGO/VIRGO team, the
related research field has attracted more attention. The spinning compact
binaries system, as one of the gravitational-wave sources for broadband laser
interferometers, has been widely studied by related researchers. In order to
analyze the gravitational wave signals using matched filtering techniques,
reliable numerical algorithms are needed. Spinning compact binaries systems in
Post-Newtonian (PN) celestial mechanics have an inseparable Hamiltonian. The
extended phase-space algorithm is an effective solution for the problem of this
system. We have developed correction maps for the extended phase-space method
in our previous work, which significantly improves the accuracy and stability
of the method with only a momentum scale factor. In this paper, we will add
more scale factors to modify the numerical solution in order to minimize the
errors in the constants of motion. However, we find that these correction maps
can produce a large energy bias in the subterms of the Hamiltonian for chaotic
orbits, in which quantities such as the potential and kinetic energy are
calculated inaccurately. We therefore develop new correction maps that reduce
the energy bias of the subterms of the Hamiltonian, improve the accuracy of the
numerical solution, and also provide a new idea for the application of manifold
correction in other algorithms.
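The idea of a momentum scale factor can be shown on a toy Hamiltonian (a 1-D harmonic oscillator, not the post-Newtonian spinning-binary Hamiltonian; the integrator and correction rule below are my simplifications): after each step of a drifting explicit integrator, the momentum is rescaled so that the total energy returns to its initial value.

```python
import math

def H(q, p):
    return 0.5 * p * p + 0.5 * q * q       # toy Hamiltonian (oscillator)

def euler_step(q, p, dt=1e-2):
    return q + dt * p, p - dt * q           # explicit Euler: energy drifts

q, p = 1.0, 0.0
E0 = H(q, p)
for _ in range(5000):
    q, p = euler_step(q, p)
    kin = E0 - 0.5 * q * q                  # kinetic energy demanded by E0
    if kin > 0 and p != 0:
        # momentum scale factor: choose s with H(q, s*p) = E0
        p = math.copysign(math.sqrt(2 * kin), p)
drift = abs(H(q, p) - E0)
```

Without the correction the same Euler integration drifts by order one over this horizon; the single scale factor pins the total energy to the manifold $H = E_0$, which is the spirit of the manifold-correction maps discussed above.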
Authors: Junjie Luo, Jie Feng, Hong-Hao Zhang, Weipeng Lin.
Fission at low excitation energy is an ideal playground to probe the impact
of nuclear structure on nuclear dynamics. While the importance of structural
effects in the nascent fragments is well-established in the (trans-)actinide
region, the observation of asymmetric fission in several neutron-deficient
pre-actinides can be explained by various mechanisms. To deepen our insight
into that puzzle, an innovative approach based on inverse kinematics and an
enhanced version of the VAMOS++ heavy-ion spectrometer was implemented at the
GANIL facility, Caen. Fission of $^{178}$Hg was induced by fusion of $^{124}$Xe
and $^{54}$Fe. The two fragments were detected in coincidence using VAMOS++
supplemented with a new SEcond Detection arm. For the first time in the
pre-actinide region, access to the pre-neutron mass and total kinetic energy
distributions, together with the simultaneous isotopic identification of one of
the fission fragments, was achieved. The present work describes the
experimental approach, and discusses the pre-neutron observables in the context
of an extended asymmetric-fission island located south-west of $^{208}$Pb. A
comparison with
different models is performed, demonstrating the importance of this "new"
asymmetric-fission island for elaborating on driving effects in fission.
Authors: A. Jhingan, C. Schmitt, A. Lemasson, S. Biswas, Y. H. Kim, D. Ramos, A. N. Andreyev, D. Curien, M. Ciemala, E. Clément, O. Dorvaux, B. De Canditiis, F. Didierjean, G. Duchêne, J. Dudouet, J. Frankland, G. Frémont, J. Goupil, B. Jacquot, C. Raison, D. Ralet, B. -M. Retailleau, L. Stuttgé, I. Tsekhanovich, A. V. Andreev, S. Goriely, S. Hilaire, J-F. Lemaître, P. Möller, K. -H. Schmidt.
The properties of the isoscalar giant monopole resonance (ISGMR) for the
double magic $^{48}$Ca are analyzed in the framework of a microscopic model
based on Skyrme-type interactions. A method for simultaneously taking into
account the coupling between one-, two-, and three-phonon terms in the wave
functions of $0^{+}$ states has been developed. The inclusion of three-phonon
configurations leads to a substantial redistribution of the ISGMR strength to
lower-energy $0^{+}$ states and also to a higher-energy tail. Our results
demonstrate that the developed approach enables us to describe the gross
structure of the ISGMR spreading width.
Authors: N. N. Arsenyev, A. P. Severyukhin.
Convex clustering is a modern method with both hierarchical and $k$-means
clustering characteristics. Although convex clustering can capture the complex
clustering structure hidden in data, the existing convex clustering algorithms
are not scalable to large data sets with sample sizes greater than ten
thousand. Moreover, it is known that convex clustering sometimes fails to
produce hierarchical clustering structures. This undesirable phenomenon is
called cluster split and makes it difficult to interpret clustering results. In
this paper, we propose convex clustering through majorization-minimization
(CCMM) -- an iterative algorithm that uses cluster fusions and sparsity to
enforce a complete cluster hierarchy with reduced memory usage. In the CCMM
algorithm, the diagonal majorization technique makes a highly efficient update
for each iteration. With a current desktop computer, the CCMM algorithm can
solve a single clustering problem featuring over one million objects in
seven-dimensional space within 70 seconds.
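The fusion behaviour at the heart of convex clustering is visible already in the two-point, one-dimensional problem, which has a standard closed-form solution (an illustration of the objective being minimized, not the CCMM algorithm itself):

```python
def convex_cluster_pair(x1, x2, lam):
    """Closed-form minimizer of the two-point 1-D convex clustering problem
        min_u 0.5*(x1 - u1)**2 + 0.5*(x2 - u2)**2 + lam * |u1 - u2|.
    For small lam the centroids shrink toward each other; once
    lam >= |x1 - x2| / 2 they fuse into a single cluster."""
    lo, hi = min(x1, x2), max(x1, x2)
    if hi - lo <= 2 * lam:                 # penalty strong enough: fusion
        m = (x1 + x2) / 2
        return m, m
    u_lo, u_hi = lo + lam, hi - lam        # partial shrinkage, no fusion
    return (u_lo, u_hi) if x1 <= x2 else (u_hi, u_lo)
```

Sweeping `lam` from 0 upward traces out the cluster hierarchy; the cluster splits mentioned above are cases where, in higher dimensions, this monotone fusion picture can break down.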
Authors: Daniel J. W. Touw, Patrick J. F. Groenen, Yoshikazu Terada.
Artificial spin ices (ASIs) are designable arrays of interacting nanomagnets
that span a wide range of magnetic phases associated with a number of spin
lattice models. Here, we demonstrate that the phase of an artificial kagome
spin ice can be determined from its initial magnetization curve. As a proof of
concept, micromagnetic simulations of these curves were performed starting from
representative microstates of different phases of the system. We show that the
curves are characterized by phase-specific features in such a way that a
pattern recognition algorithm predicts the phase of the initial microstate with
good reliability. This achievement represents a new strategy to identify phases
in ASIs, easier and more accessible than magnetic imaging techniques normally
used for this task.
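The pattern-recognition step can be mimicked on synthetic data (the curve shapes below are entirely artificial stand-ins, not simulated micromagnetics): two "phases" produce differently shaped initial magnetization curves, and a simple nearest-neighbour rule recovers the phase of unseen curves.

```python
import numpy as np

rng = np.random.default_rng(1)

h = np.linspace(0.0, 1.0, 50)            # applied-field axis

def curve(phase):
    # Phase-specific curve shape plus measurement noise (toy model).
    shape = np.tanh(5 * h) if phase == 0 else h ** 2
    return shape + 0.05 * rng.normal(size=h.size)

train = [(curve(p), p) for p in [0, 1] * 10]
unseen = [(curve(p), p) for p in [0, 1] * 5]

def predict(c):
    # 1-nearest-neighbour classification by Euclidean curve distance.
    return min(train, key=lambda t: float(np.linalg.norm(t[0] - c)))[1]

acc = float(np.mean([predict(c) == p for c, p in unseen]))
```

Because each phase leaves a characteristic signature on the whole curve, even this minimal classifier separates them reliably, which is the effect the paper exploits with a proper pattern-recognition algorithm.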
Authors: Breno Cecchi, Nathan Cruz, Marcelo Knobel, Kleber Roberto Pirota.
Recent studies show that deep neural networks (DNNs) are vulnerable to
backdoor attacks. A backdoor DNN model behaves normally with clean inputs,
whereas it outputs the attacker's expected behaviors when the inputs contain a
pre-defined pattern called a trigger. However, in some tasks, the attacker
cannot know the exact target that shows his/her expected behavior, because the
task may contain a large number of classes and the attacker does not have full
access to know the semantic details of these classes. Thus, the attacker is
willing to attack multiple suspected targets to achieve his/her purpose. In
light of this, in this paper, we propose the M-to-N backdoor attack, a new
attack paradigm that allows an attacker to launch a fuzzy attack by
simultaneously attacking N suspected targets, and each of the N targets can be
activated by any one of its M triggers. To achieve a better stealthiness, we
randomly select M clean images from the training dataset as our triggers for
each target. Since the triggers used in our attack have the same distribution
as the clean images, the inputs poisoned by the triggers are difficult for
input-based defenses to detect, and the backdoor models trained on the
poisoned training dataset are likewise difficult for model-based defenses to
detect. Thus, our attack is stealthier and has a higher probability of
achieving the attack purpose by attacking multiple suspected targets
simultaneously in contrast to prior backdoor attacks. Extensive experiments
show that our attack is effective against different datasets with various
models and achieves high attack success rates (e.g., 99.43% for attacking 2
targets and 98.23% for attacking 4 targets on the CIFAR-10 dataset) when
poisoning only an extremely small portion of the training dataset (e.g., less
than 2%). Besides, it is robust to pre-processing operations and can resist
state-of-the-art defenses.
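The data-poisoning setup can be sketched on feature vectors (a simplified illustration of the M-to-N idea: the function name, blending rule, and trigger-selection details are my assumptions, not the paper's exact procedure):

```python
import random

rng = random.Random(0)

def build_poisoned_set(data, n_targets=2, m_triggers=2, blend=0.5, rate=0.02):
    """data: list of (feature_vector, label). Each of the first n_targets
    labels is assigned m_triggers clean examples reused as its triggers; a
    poisoned sample blends a clean vector with one of the target's triggers
    and is relabelled to that target."""
    labels = sorted({y for _, y in data})
    targets = labels[:n_targets]
    triggers = {t: [x for x, y in data if y != t][:m_triggers]
                for t in targets}
    poisoned = list(data)
    for _ in range(max(1, int(rate * len(data)))):   # poison a small fraction
        x, _ = rng.choice(data)
        t = rng.choice(targets)
        trig = rng.choice(triggers[t])
        poisoned.append(([(1 - blend) * a + blend * b
                          for a, b in zip(x, trig)], t))
    return poisoned, targets
```

Because the triggers are themselves clean training examples, the poisoned inputs stay close to the clean data distribution, which is the stealthiness argument made above.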
Authors: Linshan Hou, Zhongyun Hua, Yuhong Li, Leo Yu Zhang.
Stance detection deals with the identification of an author's stance towards
a target and is applied on various text domains like social media and news. In
many cases, inferring the stance is challenging due to insufficient access to
contextual information. Complementary context can be found in knowledge bases
but integrating the context into pretrained language models is non-trivial due
to their graph structure. In contrast, we explore an approach to integrate
contextual information as text which aligns better with transformer
architectures. Specifically, we train a model consisting of dual encoders which
exchange information via cross-attention. This architecture allows for
integrating contextual information from heterogeneous sources. We evaluate
context extracted from structured knowledge sources and from prompting large
language models. Our approach is able to outperform competitive baselines
(1.9pp on average) on a large and diverse stance detection benchmark, both (1)
in-domain, i.e. for seen targets, and (2) out-of-domain, i.e. for targets
unseen during training. Our analysis shows that it is able to regularize for
spurious label correlations with target-specific cue words.
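The dual-encoder exchange can be sketched as a single cross-attention head (a toy numpy version; the shapes, names, and single-head setup are illustrative, not the paper's architecture): queries come from one encoder's token states and keys/values from the other's, so each stream can read the other's context.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q_states, kv_states, Wq, Wk, Wv):
    # Scaled dot-product cross-attention: each query token attends over
    # the other stream's tokens and returns a mixture of their values.
    Q, K, V = q_states @ Wq, kv_states @ Wk, kv_states @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores, axis=-1) @ V

rng = np.random.default_rng(0)
text = rng.normal(size=(3, 8))       # 3 input-text tokens, hidden size 8
context = rng.normal(size=(5, 8))    # 5 tokens of retrieved context
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out = cross_attention(text, context, Wq, Wk, Wv)
```

Feeding the context as plain text through a second encoder and exchanging information this way is what lets the approach avoid graph-specific machinery.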
Authors: Tilman Beck, Andreas Waldis, Iryna Gurevych.
We develop inductive biases for the machine learning of complex physical
systems based on the port-Hamiltonian formalism. To satisfy by construction the
principles of thermodynamics in the learned physics (conservation of energy,
non-negative entropy production), we modify accordingly the port-Hamiltonian
formalism so as to achieve a port-metriplectic one. We show that the
constructed networks are able to learn the physics of complex systems by parts,
thus alleviating the burden associated with the experimental characterization
and subsequent learning process for such systems. Predictions can nevertheless
be made at the scale of the complete system. Examples demonstrate the
performance of the proposed technique.
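The structural guarantee being exploited can be sketched for a plain port-Hamiltonian system (a toy damped oscillator, not the port-metriplectic networks of the paper): with the interconnection matrix $J$ skew-symmetric and the dissipation matrix $R$ symmetric positive semi-definite, the energy $H$ is non-increasing by construction.

```python
import numpy as np

# dx/dt = (J - R) * dH/dx, with J skew-symmetric (lossless interconnection)
# and R symmetric PSD (dissipation), so dH/dt = -dH/dx . (R dH/dx) <= 0.
J = np.array([[0.0, 1.0], [-1.0, 0.0]])
R = np.array([[0.0, 0.0], [0.0, 0.1]])

def grad_H(x):
    return x            # H(q, p) = 0.5*(q**2 + p**2): damped oscillator

x = np.array([1.0, 0.0])
H0 = 0.5 * float(x @ x)
dt = 1e-3
for _ in range(10000):  # explicit Euler integration of the dynamics
    x = x + dt * (J - R) @ grad_H(x)
H1 = 0.5 * float(x @ x)
```

Baking this $(J, R)$ split into the learned model is what lets the networks satisfy the first and second laws regardless of the fitted parameters.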
Authors: Quercus Hernández, Alberto Badías, Francisco Chinesta, Elías Cueto.