Papers made digestible
Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications.
Visual Teach and Repeat 3 (VT&R3), a generalization of stereo VT&R, achieves
long-term autonomous path-following using topometric mapping and localization
from a single rich sensor stream. In this paper, we improve the capabilities of
a LiDAR implementation of VT&R3 to reliably detect and avoid obstacles in
changing environments. Our architecture simplifies the obstacle-perception
problem to that of place-dependent change detection. We then extend the
behaviour of generic sample-based motion planners to better suit the
teach-and-repeat problem structure by introducing a new edge-cost metric paired
with a curvilinear planning space. The resulting planner generates naturally
smooth paths that avoid local obstacles while minimizing lateral path deviation
to best exploit prior terrain knowledge. While we use the method with VT&R, it
can be generalized to suit arbitrary path-following applications. Experimental
results from online run-time analysis, unit testing, and qualitative
experiments on a differential drive robot show the promise of the technique for
reliable long-term autonomous operation in complex unstructured environments.
Authors: Jordy Sehn, Yuchen Wu, Timothy D. Barfoot.
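As a concrete illustration of the curvilinear planning idea, below is a toy edge-cost sketch over states expressed in a path-aligned frame $(s, l)$, with $s$ the arc length along the taught path and $l$ the lateral offset. The function and weight are illustrative assumptions, not the paper's actual metric.

```python
import math

# Toy edge cost in a curvilinear (path-aligned) frame: a state is (s, l),
# with s the arc length along the taught path and l the lateral offset.
# The weight w_lat (an assumption) trades progress against deviation from
# previously traversed, known-safe terrain.
def edge_cost(a, b, w_lat=5.0):
    s0, l0 = a
    s1, l1 = b
    length = math.hypot(s1 - s0, l1 - l0)  # edge length in (s, l) coordinates
    lateral = 0.5 * (abs(l0) + abs(l1))    # mean lateral deviation on the edge
    return length * (1.0 + w_lat * lateral)

# A sampling-based planner (e.g., RRT*) using this cost prefers paths that
# hug the taught path (l = 0) unless an obstacle forces a detour.
print(edge_cost((0.0, 0.0), (1.0, 0.0)))   # on-path edge: cost 1.0
print(edge_cost((0.0, 0.0), (1.0, 0.5)))   # off-path edge: penalized
```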
The statistical and design considerations that pertain to
dose optimization are discussed. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%.
The traditional more-is-better dose selection paradigm, developed based on
cytotoxic chemotherapeutics, is often problematic when applied to the
development of novel molecularly targeted agents (e.g., kinase inhibitors,
monoclonal antibodies, and antibody-drug conjugates). The US Food and Drug
Administration (FDA) initiated Project Optimus to reform the dose optimization
and dose selection paradigm in oncology drug development and to call for more
attention to benefit-risk considerations.
We systematically investigated the operating characteristics of the seamless
phase 2-3 design as a strategy for dose optimization, where in stage 1
(corresponding to phase 2) patients are randomized to multiple doses, with or
without a control; and in stage 2 (corresponding to phase 3) the efficacy of
the selected optimal dose is evaluated with a randomized concurrent control or
historical control. Depending on whether the concurrent control is included and
the type of endpoints used in stages 1 and 2, we describe four types of
seamless phase 2-3 dose-optimization designs, which are suitable for different
clinical settings. The statistical and design considerations that pertain to
dose optimization are discussed. Simulations show that phase 2-3
dose-optimization designs are able to control the familywise type I error
rates and yield appropriate statistical power with a substantially smaller
sample size than the conventional approach. The sample size savings range from 16.6% to 27.3%,
depending on the design and scenario, with a mean savings of 22.1%. Due to the
interim dose selection, the phase 2-3 dose-optimization design is logistically
and operationally more challenging, and should be carefully planned and
implemented to ensure trial integrity.
Authors: Liyun Jiang, Ying Yuan.
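To make the design concrete, here is a heavily simplified Monte Carlo sketch of one seamless phase 2-3 variant (binary endpoint, concurrent control, stage 2 analyzed on its own data). The selection rule, sample sizes, and test are illustrative assumptions; the paper's designs and their error control are more refined.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stage 1: randomize patients across doses and pick the dose with the
# highest observed response rate. Stage 2: compare the selected dose with a
# concurrent control via a one-sided z-test on stage-2 data only.
def one_trial(p_doses, p_control, n1=30, n2=100):
    stage1 = [rng.binomial(n1, p) / n1 for p in p_doses]
    best = int(np.argmax(stage1))          # interim dose selection
    x_t = rng.binomial(n2, p_doses[best])  # stage-2 treatment arm
    x_c = rng.binomial(n2, p_control)      # stage-2 control arm
    p_t, p_c = x_t / n2, x_c / n2
    se = max(np.sqrt(p_t*(1-p_t)/n2 + p_c*(1-p_c)/n2), 1e-9)
    return (p_t - p_c) / se > 1.96         # reject H0 at one-sided ~2.5%

# Under the global null (all doses equal control), estimate the type I error.
null_rate = np.mean([one_trial([0.2, 0.2, 0.2], 0.2) for _ in range(5000)])
print(f"empirical type I error: {null_rate:.3f}")
```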
We significantly improve performance by using properties of the posterior
in our active learning scheme and in the definition of the GP prior. In
particular, we account for the expected dynamic range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples.
We present the GPry algorithm for fast Bayesian inference of general
(non-Gaussian) posteriors with a moderate number of parameters. GPry does not
need any pre-training or special hardware such as GPUs, and is intended as a
drop-in replacement for traditional Monte Carlo methods for Bayesian inference.
Our algorithm is based on generating a Gaussian Process surrogate model of the
log-posterior, aided by a Support Vector Machine classifier that excludes
extreme or non-finite values. An active learning scheme allows us to reduce the
number of required posterior evaluations by two orders of magnitude compared to
traditional Monte Carlo inference. Our algorithm allows for parallel
evaluations of the posterior at optimal locations, further reducing wall-clock
times. We significantly improve performance by using properties of the posterior
in our active learning scheme and in the definition of the GP prior. In
particular, we account for the expected dynamic range of the posterior in
different dimensionalities. We test our model against a number of synthetic and
cosmological examples. GPry outperforms traditional Monte Carlo methods when
the evaluation time of the likelihood (or the calculation of theoretical
observables) is of the order of seconds; for evaluation times of over a minute
it can perform inference in days that would take months using traditional
methods. GPry is distributed as an open source Python package (pip install
gpry) and can also be found at https://github.com/jonaselgammal/GPry.
Authors: Jonas El Gammal, Nils Schöneberg, Jesús Torrado, Christian Fidler.
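For intuition, the following is a generic sketch of the GP-surrogate-plus-active-learning loop using scikit-learn; it is not GPry's actual API (for that, see the repository above), and the toy log-posterior and acquisition rule are assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

def log_post(x):                        # toy 1-D log-posterior
    return -0.5 * ((x - 1.0) / 0.3) ** 2

X = np.array([[-3.0], [0.0], [3.0]])    # initial evaluations
y = np.array([log_post(x[0]) for x in X])

gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(),
                              normalize_y=True, alpha=1e-8)
for _ in range(10):
    gp.fit(X, y)                        # refit surrogate of the log-posterior
    cand = np.linspace(-4, 4, 400).reshape(-1, 1)
    mu, sigma = gp.predict(cand, return_std=True)
    acq = mu + 2.0 * sigma              # favour high posterior AND uncertainty
    x_new = cand[int(np.argmax(acq))]   # next point to evaluate exactly
    X = np.vstack([X, x_new])
    y = np.append(y, log_post(x_new[0]))

print(f"{len(y)} evaluations; surrogate peak near x = {X[np.argmax(y)][0]:.2f}")
```

In GPry itself, a Support Vector Machine classifier additionally screens out extreme or non-finite log-posterior values before they can corrupt the GP fit.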
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting. However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version.
We consider the fundamental scheduling problem of minimizing the sum of
weighted completion times on a single machine in the non-clairvoyant setting.
While no non-preemptive algorithm is constant competitive, Motwani, Phillips,
and Torng (SODA '93) proved that the simple preemptive round robin procedure is
$2$-competitive and that no better competitive ratio is possible, initiating a
long line of research focused on preemptive algorithms for generalized variants
of the problem. As an alternative model, Shmoys, Wein, and Williamson (FOCS
'91) introduced kill-and-restart schedules, where running jobs may be killed
and restarted from scratch later, and analyzed them for the makespan objective.
However, to the best of our knowledge, this concept has never been considered
for the total completion time objective in the non-clairvoyant model.
We contribute to both models: First we give for any $b > 1$ a tight analysis
for the natural $b$-scaling kill-and-restart strategy for scheduling jobs
without release dates, as well as for a randomized variant of it. This implies
a performance guarantee of $(1+3\sqrt{3})\approx 6.197$ for the deterministic
algorithm and of $\approx 3.032$ for the randomized version. Second, we show
that the preemptive Weighted Shortest Elapsed Time First (WSETF) rule is
$2$-competitive for jobs released in an online fashion over time, matching the
lower bound by Motwani et al. Using this result as well as the competitiveness
of round robin for multiple machines, we prove performance guarantees of
adaptations of the $b$-scaling algorithm to online release dates and unweighted
jobs on identical parallel machines.
Authors: Sven Jäger, Guillaume Sagnol, Daniel Schmidt genannt Waldschmidt, Philipp Warode.
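For intuition, a minimal sketch of a deterministic $b$-scaling kill-and-restart schedule for unweighted jobs without release dates: in round $k$, every unfinished job is granted a budget $b^k$ and is killed, losing all progress, if it does not finish. The round structure and job ordering here are simplified assumptions relative to the strategy analyzed in the paper.

```python
# Non-clairvoyant b-scaling kill-and-restart on a single machine.
def b_scaling(processing_times, b=2.0):
    remaining = dict(enumerate(processing_times))  # job -> true (unknown) size
    t, k, completion = 0.0, 0, {}
    while remaining:
        budget = b ** k
        for j, p in list(remaining.items()):
            if p <= budget:            # job finishes within this round's budget
                t += p
                completion[j] = t
                del remaining[j]
            else:                      # killed; restarted from scratch later
                t += budget
        k += 1
    return completion

comp = b_scaling([1.0, 3.0, 7.0])
print(sum(comp.values()))              # total (unweighted) completion time
```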
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Frozen pretrained models have become a viable alternative to the
pretraining-then-finetuning paradigm for transfer learning. However, with
frozen models there are relatively few parameters available for adapting to
downstream tasks, which is problematic in computer vision where tasks vary
significantly in input/output format and the type of information that is of
value. In this paper, we present a study of frozen pretrained models when
applied to diverse and representative computer vision tasks, including object
detection, semantic segmentation and video action recognition. From this
empirical analysis, our work answers the questions of what pretraining task
fits best with this frozen setting, how to make the frozen setting more
flexible to various downstream tasks, and the effect of larger model sizes. We
additionally examine the upper bound of performance using a giant frozen
pretrained model with 3 billion parameters (SwinV2-G) and find that it reaches
competitive performance on a varied set of major benchmarks with only one
shared frozen base network: 60.0 box mAP and 52.2 mask mAP on COCO object
detection test-dev, 57.6 val mIoU on ADE20K semantic segmentation, and 81.7
top-1 accuracy on Kinetics-400 action recognition. With this work, we hope to
bring greater attention to this promising path of freezing pretrained image
models.
Authors: Yutong Lin, Ze Liu, Zheng Zhang, Han Hu, Nanning Zheng, Stephen Lin, Yue Cao.
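A minimal PyTorch sketch of the frozen setting: all pretrained weights stay fixed and only a small task head is trained. The paper studies frozen Swin transformers on detection, segmentation, and video; the ResNet-50 classification head below is an assumption made purely for brevity.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50(weights="IMAGENET1K_V2")
backbone.fc = nn.Identity()            # expose 2048-d features
for p in backbone.parameters():
    p.requires_grad = False            # freeze: no gradients, no updates
backbone.eval()                        # also freeze batch-norm statistics

head = nn.Linear(2048, 10)             # the only trainable parameters
opt = torch.optim.AdamW(head.parameters(), lr=1e-3)

x = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 10, (4,))
with torch.no_grad():                  # backbone is a fixed feature extractor
    feats = backbone(x)
loss = nn.functional.cross_entropy(head(feats), labels)
loss.backward()
opt.step()
print(float(loss))
```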
We further develop the string $1/c^2$ expansion of closed bosonic string
theory, where $c$ is the speed of light. The expansion will be performed up to
and including the next-to-next-to-leading order (NNLO). Finally, we expand the phase space action, which allows us to perform
the Dirac procedure and pass to the quantum theory.
We further develop the string $1/c^2$ expansion of closed bosonic string
theory, where $c$ is the speed of light. The expansion will be performed up to
and including the next-to-next-to-leading order (NNLO). We show that the
next-to-leading order (NLO) theory is equal to the Gomis--Ooguri string,
generalised to a curved target space, provided the target space geometry admits
a certain class of co-dimension-2 foliations. We compute the energy of the
string up to NNLO for a flat target space with a circle that must be wound by
the string, and we show that it agrees with the $1/c^2$ expansion of the
relativistic energy. We also compute the algebra of Noether charges for a flat
target space and show that this matches order-by-order with an appropriate
expansion of the Poincar\'e algebra, which at NLO gives the string Bargmann
algebra. Finally, we expand the phase space action, which allows us to perform
the Dirac procedure and pass to the quantum theory. It turns out that the
Poisson brackets change at each order, and we show that the normal ordering
constant of the relativistic theory, which does not depend on $c$, can be
reproduced by the NLO and NNLO theories.
Authors: Jelle Hartong, Emil Have.
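Schematically, a $1/c^2$ expansion of this kind organizes the embedding fields and the action in even powers of $1/c$, so that NNLO means keeping terms two orders beyond the leading one. The normalizations below are illustrative assumptions, not necessarily the paper's conventions.

```latex
% Illustrative structure of a $1/c^2$ expansion (assumed normalizations):
\begin{align}
  X^\mu &= X^\mu_{(0)} + c^{-2} X^\mu_{(2)} + c^{-4} X^\mu_{(4)} + \mathcal{O}(c^{-6}), \\
  S &= c^{2} S_{\mathrm{LO}} + S_{\mathrm{NLO}} + c^{-2} S_{\mathrm{NNLO}} + \mathcal{O}(c^{-4}).
\end{align}
```

In this schematic, the NLO piece would correspond to the (curved-target) Gomis--Ooguri action described above.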
We also calculate the differential probability with respect to the final meson momenta and the probability that one or two of the final mesons recoil back towards the source. In the ultrarelativistic limit of the initial meson, the total probability tends to a constant, which we calculate analytically in the $\phi^4$ model. At this order the meson sector conserves energy on its own, while the incoming meson applies a positive pressure to the kink.
In a (1+1)-dimensional scalar quantum field theory, we calculate the
leading-order probability of meson multiplication, which is the inelastic
scattering process: kink + meson $\rightarrow$ kink + 2 mesons. We also
calculate the differential probability with respect to the final meson momenta
and the probability that one or two of the final mesons recoil back towards
the source. In the ultrarelativistic limit of the initial meson, the total
probability tends to a constant, which we calculate analytically in the
$\phi^4$ model. At this order the meson sector conserves energy on its own,
while the incoming meson applies a positive pressure to the kink. This is in
contrast with the situation in classical field theory, where Romanczukiewicz
and collaborators have shown that, in the presence of a reflectionless kink,
only meson fusion is allowed, resulting in a negative radiation pressure on the
kink.
Authors: Jarah Evslin, Hui Liu, Baiyang Zhang.
By sampling
random $\ell$-step trajectories of an unknown system, we build an abstraction
based on the notion of $\ell$-completeness. Our method is then tested on several numerical
benchmarks.
A common technique to verify complex logic specifications for dynamical
systems is the construction of symbolic abstractions: simpler, finite-state
models whose behaviour mimics that of the system of interest. Typically,
abstractions are constructed by exploiting accurate knowledge of the underlying
model: in real-life applications, this may be a costly assumption. By sampling
random $\ell$-step trajectories of an unknown system, we build an abstraction
based on the notion of $\ell$-completeness. We introduce the notion of
probabilistic behavioural inclusion and provide probably approximately correct
(PAC) guarantees that this abstraction includes all behaviours of the concrete
system, over finite and infinite time horizons, leveraging scenario theory
for non-convex problems. Our method is then tested on several numerical
benchmarks.
Authors: Rudi Coppola, Andrea Peruffo, Manuel Mazo Jr.
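A toy sketch of the sampling-based construction: discretize the outputs, record every observed window of $\ell$ consecutive symbols as an abstract state, and connect windows that overlap in $\ell-1$ symbols. The dynamics, discretization, and sample count below are assumptions, and the PAC/scenario-theory layer that certifies behavioural inclusion is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(x0, steps):               # stand-in for the unknown system
    xs = [x0]
    for _ in range(steps):
        xs.append(0.8 * xs[-1] + rng.normal(0.0, 0.05))
    return xs

def symbol(x, grid=0.25):              # output discretization
    return int(np.floor(x / grid))

ell, n_traj = 2, 500
states, edges = set(), set()
for _ in range(n_traj):
    traj = [symbol(x) for x in simulate(rng.uniform(-1, 1), ell + 3)]
    windows = [tuple(traj[i:i + ell]) for i in range(len(traj) - ell + 1)]
    states.update(windows)                    # abstract states = l-windows
    edges.update(zip(windows, windows[1:]))   # transitions share l-1 symbols

print(f"abstraction: {len(states)} states, {len(edges)} transitions")
```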
Our conclusions are in accordance with predictions based on the strong-field approximation.
We present experimental data on the strong field tunnel ionization of argon
in a counter-rotating two-color (CRTC) laser field. We find that the initial
momentum component along the tunneling direction changes sign between the
rising and the falling edges of the CRTC field. If the initial momentum at the
tunnel exit points in the direction of the ion at the instant of tunneling,
this manifests as an enhanced Coulomb interaction of the outgoing electron with
its parent ion. Our conclusions are in accordance with predictions based on
the strong-field approximation.
Authors: A. Geyer, D. Trabert, M. Hofmann, N. Anders, M. S. Schöffler, L. Ph. H. Schmidt, T. Jahnke, M. Kunitski, R. Dörner, S. Eckart.
The study of the polarization direction is crucial for reconstructing
the spatial structure of the magnetic field in active-galaxy parsec-scale
jets. Moreover, the local axis of a jet component may not coincide with its
direction of motion, which affects the observed polarization direction.
The study of the polarization direction is crucial for reconstructing the
spatial structure of the magnetic field in active-galaxy parsec-scale jets.
However, due to relativistic effects, the magnetic field projected onto the
celestial sphere in the source reference frame cannot be assumed to be
orthogonal to the observed direction of the electric vector in the wave.
Moreover, the local axis of a jet component may not coincide with its
direction of motion, which affects the observed polarization direction. In
this article, we analyze transverse-to-jet distributions of the electric
vector in the wave, obtained by modeling with different jet kinematic and
geometrical parameters, for a helical magnetic field with different twist
angles and for a toroidal magnetic field in the center, surrounded by a
sheath of varying thickness penetrated by a poloidal field. We find that:
1) the shape of the transverse distribution of the electric vector depends
in a complex way on the angles that the jet axis and the velocity vector
make with the line of sight; 2) the twist direction of the helical magnetic
field cannot be determined unambiguously from the distributions of the
electric vector in the wave alone; 3) both considered magnetic-field
topologies can reproduce both the ``spine-sheath'' polarization structure
and individual bright features whose polarization direction is longitudinal
to the jet axis.
Authors: Marina S. Butuzova.
The model describes not only ground-state scalar diquarks and pseudo-scalar mesons but also the excited pseudo-scalar diquarks and scalar mesons; each ground-state diquark (meson) has the corresponding excited diquark (hadron) with opposite parity as a chiral partner. Effects of chiral symmetry breaking and diquark condensates are incorporated by a mean-field treatment.
We investigate modifications of hadron masses at finite quark chemical
potential in two-flavor and two-color QCD, for which data are available from
lattice simulations, within a linear sigma model based on approximate
Pauli-Gursey $SU(4)$ symmetry. The model describes not only ground-state scalar
diquarks and pseudo-scalar mesons but also the excited pseudo-scalar diquarks
and scalar mesons; each ground-state diquark (meson) has the corresponding
excited diquark (hadron) with opposite parity as a chiral partner. Effects of
chiral symmetry breaking and diquark condensates are incorporated by a
mean-field treatment. We show that various mixings among the hadrons, which are
triggered by the breakdown of baryon number conservation in the superfluid
phase, lead to a rich hadron mass spectrum. We discuss the influence of the
$U(1)_A$ anomaly on the density dependence of the mass spectrum and also
manifestations of the chiral partner structures as density increases in the
superfluid phase. The predicted hadron masses are expected to provide future
lattice simulations with useful information on such symmetry properties in
dense two-color QCD.
Authors: Daiki Suenaga, Kotaro Murakami, Etsuko Itou, Kei Iida.
We find
training on these machine-translated prompts leads to better performance on
human-written prompts in the respective languages. We conjecture that the models are learning higher-level
capabilities that are both task- and language-agnostic. Our code, datasets and models are publicly
available at https://github.com/bigscience-workshop/xmtf.
Multitask prompted finetuning (MTF) has been shown to help large language
models generalize to new tasks in a zero-shot setting, but so far explorations
of MTF have focused on English data and models. We apply MTF to the pretrained
multilingual BLOOM and mT5 model families to produce finetuned variants called
BLOOMZ and mT0. We find finetuning large multilingual language models on
English tasks with English prompts allows for task generalization to
non-English languages that appear only in the pretraining corpus. Finetuning on
multilingual tasks with English prompts further improves performance on English
and non-English tasks leading to various state-of-the-art zero-shot results. We
also investigate finetuning on multilingual tasks with prompts that have been
machine-translated from English to match the language of each dataset. We find
training on these machine-translated prompts leads to better performance on
human-written prompts in the respective languages. Surprisingly, we find models
are capable of zero-shot generalization to tasks in languages they have never
intentionally seen. We conjecture that the models are learning higher-level
capabilities that are both task- and language-agnostic. In addition, we
introduce xP3, a composite of supervised datasets in 46 languages with English
and machine-translated prompts. Our code, datasets and models are publicly
available at https://github.com/bigscience-workshop/xmtf.
Authors: Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng-Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel.
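A short usage sketch with Hugging Face transformers, assuming the public bigscience/bloomz-560m checkpoint (the smallest BLOOMZ variant); the prompt is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "bigscience/bloomz-560m"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Zero-shot prompting: BLOOMZ follows instructions without task-specific tuning.
prompt = "Translate to English: Je t'aime."
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=10)
print(tok.decode(out[0], skip_special_tokens=True))
```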
We transform the plain ViT into a hierarchical one with minimal changes. The code and models will be released at https://github.com/ViTAE-Transformer/HPViT.
Self-supervised pre-training of vision transformers (ViTs) via masked image
modeling (MIM) has proven very effective. However, customized algorithms
(e.g., GreenMIM) must be carefully designed for hierarchical ViTs, instead
of using the vanilla and simple MAE for the plain ViT. More importantly, since
these hierarchical ViTs cannot reuse the off-the-shelf pre-trained weights of
the plain ViTs, the requirement of pre-training them leads to a massive amount
of computational cost, thereby incurring both algorithmic and computational
complexity. In this paper, we address this problem by proposing a novel idea of
disentangling the hierarchical architecture design from the self-supervised
pre-training. We transform the plain ViT into a hierarchical one with minimal
changes. Technically, we change the stride of the linear embedding layer from 16 to
4 and add convolution (or simple average) pooling layers between the
transformer blocks, thereby reducing the feature size from 1/4 to 1/32
sequentially. Despite its simplicity, it outperforms the plain ViT baseline in
classification, detection, and segmentation tasks on ImageNet, MS COCO,
Cityscapes, and ADE20K benchmarks, respectively. We hope this preliminary study
could draw more attention from the community on developing effective
(hierarchical) ViTs while avoiding the pre-training cost by leveraging the
off-the-shelf checkpoints. The code and models will be released at
https://github.com/ViTAE-Transformer/HPViT.
Authors: Yufei Xu, Jing Zhang, Qiming Zhang, Dacheng Tao.
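A minimal sketch of the two architectural changes described above, with the transformer blocks stubbed out as identities for brevity; channel widths and stage sizes are illustrative assumptions (the released code keeps full block internals).

```python
import torch
import torch.nn as nn

class HierarchicalViTSketch(nn.Module):
    def __init__(self, dim=96, blocks_per_stage=2, num_stages=4):
        super().__init__()
        # Change 1: patch embedding with stride 4 instead of 16 (1/4 scale).
        self.embed = nn.Conv2d(3, dim, kernel_size=4, stride=4)
        self.stages = nn.ModuleList(
            nn.Sequential(*[nn.Identity()      # plain ViT blocks would go here
                            for _ in range(blocks_per_stage)])
            for _ in range(num_stages))
        # Change 2: pooling between stages: 1/4 -> 1/8 -> 1/16 -> 1/32.
        self.pool = nn.AvgPool2d(2)

    def forward(self, x):
        x = self.embed(x)
        feats = []
        for i, stage in enumerate(self.stages):
            x = stage(x)
            feats.append(x)                    # multi-scale feature maps
            if i < len(self.stages) - 1:
                x = self.pool(x)
        return feats

feats = HierarchicalViTSketch()(torch.randn(1, 3, 224, 224))
print([f.shape[-1] for f in feats])            # [56, 28, 14, 7]
```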
However, such complex models are ill-suited to actual
clinical environments with limited computing resources. Code is available at
https://github.com/JCruan519/MALUNet.
Recently, some pioneering works have preferred to apply more complex modules
to improve segmentation performance. However, such models are ill-suited to
actual clinical environments with limited computing resources. To address this
challenge, we propose a light-weight model that achieves competitive performance
for skin lesion segmentation at the lowest cost of parameters and computational
complexity so far. Briefly, we propose four modules: (1) DGA consists of
dilated convolution and gated attention mechanisms to extract global and local
feature information; (2) IEA, which is based on external attention to
characterize the overall datasets and enhance the connection between samples;
(3) CAB is composed of 1D convolution and fully connected layers to perform a
global and local fusion of multi-stage features to generate attention maps at
channel axis; (4) SAB, which operates on multi-stage features by a shared 2D
convolution to generate attention maps at spatial axis. We combine four modules
with our U-shape architecture and obtain a light-weight medical image
segmentation model dubbed MALUNet. Compared with UNet, our model improves
the mIoU and DSC metrics by 2.39% and 1.49%, respectively, with a 44x and 166x
reduction in the number of parameters and computational complexity. In
addition, we conduct comparison experiments on two skin lesion segmentation
datasets (ISIC2017 and ISIC2018). Experimental results show that our model
achieves a state-of-the-art balance between the number of parameters,
computational complexity, and segmentation performance. Code is available at
https://github.com/JCruan519/MALUNet.
Authors: Jiacheng Ruan, Suncheng Xiang, Mingye Xie, Ting Liu, Yuzhuo Fu.
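One hypothetical reading of the DGA idea (dilated convolution plus gated attention) is sketched below; the actual MALUNet modules are in the repository above, so treat this purely as an illustration.

```python
import torch
import torch.nn as nn

class DGASketch(nn.Module):
    """Dilated branch gathers wider context; a 1x1 sigmoid branch gates it."""
    def __init__(self, ch):
        super().__init__()
        self.context = nn.Conv2d(ch, ch, 3, padding=2, dilation=2)
        self.gate = nn.Sequential(nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def forward(self, x):
        return self.context(x) * self.gate(x)  # gated attention

x = torch.randn(2, 16, 64, 64)
print(DGASketch(16)(x).shape)                  # torch.Size([2, 16, 64, 64])
```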
There is limited understanding of the information captured by deep spatiotemporal models in their intermediate representations. (iv) Most models converge to their culminating biases in the first half of training. We then explore how these biases affect performance on dynamically biased datasets. For AVOS, we design a better combination of fusion and cross connection layers compared with previous architectures.
There is limited understanding of the information captured by deep
spatiotemporal models in their intermediate representations. For example, while
evidence suggests that action recognition algorithms are heavily influenced by
visual appearance in single frames, no quantitative methodology exists for
evaluating such static bias in the latent representation compared to bias
toward dynamics. We tackle this challenge by proposing an approach for
quantifying the static and dynamic biases of any spatiotemporal model, and
apply our approach to three tasks, action recognition, automatic video object
segmentation (AVOS) and video instance segmentation (VIS). Our key findings
are: (i) Most examined models are biased toward static information. (ii) Some
datasets that are assumed to be biased toward dynamics are actually biased
toward static information. (iii) Individual channels in an architecture can be
biased toward static, dynamic or a combination of the two. (iv) Most models
converge to their culminating biases in the first half of training. We then
explore how these biases affect performance on dynamically biased datasets. For
action recognition, we propose StaticDropout, a semantically guided dropout
that debiases a model from static information toward dynamics. For AVOS, we
design a better combination of fusion and cross connection layers compared with
previous architectures.
Authors: Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis.
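As a hypothetical illustration (not the paper's metric): one crude way to probe static bias is to compare a model's features on a real clip against a frozen clip that repeats a single frame, which removes all dynamics while keeping appearance. The stand-in network below is an assumption.

```python
import torch
import torch.nn as nn

def static_bias_score(model, clip):              # clip: (T, C, H, W)
    frozen = clip[:1].repeat(len(clip), 1, 1, 1) # kill motion, keep appearance
    with torch.no_grad():
        f_real = model(clip).mean(0)             # temporally pooled features
        f_frozen = model(frozen).mean(0)
    # Cosine similarity near 1 suggests the representation is mostly static.
    return torch.cosine_similarity(f_real, f_frozen, dim=0).item()

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))  # stand-in
clip = torch.randn(8, 3, 32, 32)
print(static_bias_score(model, clip))
```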