OSDI 2021 accepted papers
The file system performance of the proposed ZNS+ storage system was 1.33 to 2.91 times better than that of the normal ZNS-based storage system. OSDI'21 accepted 31 papers and 26 papers participated in the AE, a significant increase in the participation ratio: 84%, compared to OSDI'20 (70%) and SOSP'19 (61%). USENIX NSDI, 2021 Acceptance Rate: 15.99% Fluid: Resource-Aware Hyperparameter Tuning Engine P. Yu*, J. Liu*, M. Chowdhury (*Equal contribution) MLSys, 2021 Acceptance Rate: 23.53% NetLock: Fast, Centralized Lock Management Using Programmable Switches Z. Yu, Y. Zhang, V. Braverman, M. Chowdhury, X. Jin ACM SIGCOMM, 2020 Acceptance Rate: 21.6% Software Systems Laboratory Wins Best Paper Award at OSDI 2022 A glance at this year's OSDI program shows that Operating Systems are a small niche topic for this conference, not even meriting their own full session. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh. Collaboration: You have a collaboration on a project, publication, grant proposal, program co-chairship, or editorship within the past two years (December 2018 through March 2021). We demonstrate that KEVIN reduces the amount of I/O traffic between the host and the device, and remains particularly robust as the system ages and the data become fragmented. The biennial ACM Symposium on Operating Systems Principles is the world's premier forum for researchers, developers, programmers, and teachers of computer systems technology. Cores can safely and concurrently read from their local kernel replica, eliminating remote NUMA accesses. This year, there were only 2 accepted papers from UK institutes. We demonstrate the above using design, implementation and evaluation of blk-switch, a new Linux kernel storage stack architecture. 
Nico Lehmann and Rose Kunkel, UC San Diego; Jordan Brown, Independent; Jean Yang, Akita Software; Niki Vazou, IMDEA Software Institute; Nadia Polikarpova, Deian Stefan, and Ranjit Jhala, UC San Diego. PLDI seeks outstanding research that extends and/or applies programming-language concepts to advance the field of computing. Second, GNNAdvisor implements a novel and highly-efficient 2D workload management tailored for GNN computation to improve GPU utilization and performance under different application settings. OSDI '21 Call for Papers | USENIX The chairs may reject abstracts or papers on the basis of egregious missing or extraneous conflicts. Our evaluation shows that NrOS scales to 96 cores with performance that nearly always dominates Linux at scale, in some cases by orders of magnitude, while retaining much of the simplicity of a sequential kernel. We conclude with a discussion of additional techniques for improving the allocator development process and potential optimization strategies for future memory allocators. Submissions violating the detailed formatting and anonymization rules will not be considered for review. It then feeds those invariants and the desired safety properties to an SMT solver to check if the conjunction of the invariants and the safety properties is inductive. The OSDI '21 program co-chairs have agreed not to submit their work to OSDI '21. Calibrated interrupts increase throughput by up to 35%, reduce CPU consumption by as much as 30%, and achieve up to 37% lower latency when interrupts are coalesced. This kernel is scaled across NUMA nodes using node replication, a scheme inspired by state machine replication in distributed systems. Professor Veloso has been recognized with multiple honors, including being a Fellow of the ACM, IEEE, AAAS, and AAAI. Submissions may include as many additional pages as needed for references but not for appendices. 
Academic and industrial participants present research and experience papers that cover the full range of theory and practice of computer . DMon's targeted optimizations provide 16.83% speedup on average (up to 53.14%), compared to a baseline that uses the highest level of compiler optimization. Attaching supplementary material is optional; if your paper says that you have source code or formal proofs, you need not attach them to convince the PC of their existence. We present the results of a 1% experiment at fleet scale as well as the longitudinal rollout in Google's warehouse-scale computers. A graph neural network (GNN) enables deep learning on structured graph data. Sijie Shen, Rong Chen, Haibo Chen, and Binyu Zang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai Artificial Intelligence Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. We demonstrate that the hardware thread scheduler is able to lower RPC tail response time by about 5× while enabling the system to sustain 20% higher load, relative to traditional thread scheduling techniques. This distinction forces a re-design of the scheduler. Four months after we reported the bugs to Geth developers, one of the bugs was triggered on the mainnet, and caused nodes using a stale version of Geth to hard fork the Ethereum blockchain. In this paper, we propose a software-hardware co-design to support dynamic, fine-grained, large-scale secure memory as well as fast-initialization. These are hard deadlines, and no extensions will be given. EuroSys 2021 With her students, she has led research in AI, with a focus on robotics and machine learning, having concretely researched and developed a variety of autonomous robots, including teams of soccer robots, and mobile service robots. When uploading your OSDI 2021 reviews for your submission to SOSP, you can optionally append a note about how you addressed the reviews and comments. 
The ZNS+ also allows each zone to be overwritten with sparse sequential write requests, which enables the LFS to use threaded logging-based block reclamation instead of segment compaction. Under different configurations of TPC-C and TPC-E, Polyjuice can achieve throughput numbers higher than the best of existing algorithms by 15% to 56%. For general conference information, see https://www.usenix.org/conference/osdi22. However, your OSDI submission must use an anonymized name for your project or system that differs from any used in such contexts. Call for Papers. We present TEMERAIRE, a hugepage-aware enhancement of TCMALLOC to reduce CPU overheads in the application's code. In the Ethereum network, decentralized Ethereum clients reach consensus through transitioning to the same blockchain states according to the Ethereum specification. Many application domains can benefit from hybrid transaction/analytical processing (HTAP) by executing queries on real-time datasets produced by concurrent transactions. His work has included the Barrelfish multikernel research OS, as well as work on distributed stream processors, and using formal specifications to describe the hardware/software interfaces of modern computer systems. Session Chairs: Moshe Gabel, University of Toronto, and Joseph Gonzalez, University of California, Berkeley. John Thorpe, Yifan Qiao, Jonathan Eyolfson, and Shen Teng, UCLA; Guanzhou Hu, UCLA and University of Wisconsin, Madison; Zhihao Jia, CMU; Jinliang Wei, Google Brain; Keval Vora, Simon Fraser; Ravi Netravali, Princeton University; Miryung Kim and Guoqing Harry Xu, UCLA. Furthermore, to enable automatic runtime optimization, GNNAdvisor incorporates a lightweight analytical model for an effective design parameter search. 
The main contribution of this paper is GoJournal, a verified, concurrent journaling system that provides atomicity for storage applications, together with Perennial 2.0, a framework for formally specifying and verifying concurrent crash-safe systems. Her robot soccer teams have been RoboCup world champions several times, and the CoBot mobile robots have autonomously navigated for more than 1,000 km in university buildings. 64 papers accepted out of 341 submitted. Please identify yourself as a presenter and include your mailing address in your email. We prove that DistAI is guaranteed to find the ∃-free inductive invariant that proves the desired safety properties in finite time, if one exists. In this paper, we propose Oort to improve the performance of federated training and testing with guided participant selection. Just using Lambdas on top of CPU servers offers up to 2.75× more performance-per-dollar than training only with CPU servers. KEVIN combines a fast, lightweight, and POSIX compliant file system with a key-value storage device that performs in-storage indexing. SOSP ACM Symposium on Operating Systems Principles. Kirk Rodrigues, Yu Luo, and Ding Yuan, University of Toronto and YScope Inc. The NVMe zoned namespace (ZNS) is emerging as a new storage interface, where the logical address space is divided into fixed-sized zones, and each zone must be written sequentially for flash-memory-friendly access. We identify that current systems for learning the embeddings of large-scale graphs are bottlenecked by data movement, which results in poor resource utilization and inefficient training. SOSP 2021 - Symposium on Operating Systems Principles To evaluate the security guarantees of Storm, we build a formally verified reference implementation using the Labeled IO (LIO) IFC framework. We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on Amazon Reviews data. 
She has also made contributions in network security, including scalable data expiration, distributed algorithms despite malicious participants, and DDOS prevention techniques. Welcome to the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI '21) submissions site. Welcome to the 2021 USENIX Annual Technical Conference (ATC '21) submissions site! If your paper is accepted and you need an invitation letter to apply for a visa to attend the conference, please contact [email protected] as soon as possible. My paper has been accepted to appear at EuroSys 2020; I will give a talk at HotStorage '19; the paper about GCMA has been accepted to TC. If you have any questions about conflicts, please contact the program co-chairs. Owing to the sequential write-only zone scheme of the ZNS, the log-structured file system (LFS) is required to access ZNS solid-state drives (SSDs). We also show that Marius can scale training to datasets an order of magnitude beyond a single machine's GPU and CPU memory capacity, enabling training of configurations with more than a billion edges and 550 GB of total parameters on a single machine with 16 GB of GPU memory and 64 GB of CPU memory. The NAL eliminates remote PM accesses to hot items without inducing extra local PM accesses. Our evaluation shows that PET outperforms existing systems by up to 2.5× by unlocking previously missed opportunities from partially equivalent transformations. Papers must be in PDF format and must be submitted via the submission form. This change is receiving considerable attention in the architecture and security communities, for example, but in contrast, so-called OS researchers are mostly in denial. The 20th ACM Workshop on Hot Topics in Networks (HotNets 2021) will bring together researchers in computer networks and systems to engage in a lively debate on the theory and practice of computer networking. 
Our approach effectively eliminates high communication and partitioning overheads, and couples it with a new pipelined push-pull parallelism based execution strategy for fast model training. A graph embedding is a fixed length vector representation for each node (and/or edge-type) in a graph and has emerged as the de-facto approach to apply modern machine learning on graphs. 23 artifacts received the Artifacts Functional badge (88%). She is the recipient of several best paper awards, the Einstein Chair of the Chinese Academy of Science, the ACM/SIGART Autonomous Agents Research Award, an NSF Career Award, and the Allen Newell Medal for Excellence in Research. Professor Veloso earned Bachelor and Master of Science degrees in Electrical and Computer Engineering from Instituto Superior Tecnico in Lisbon, Portugal, a Master of Arts in Computer Science from Boston University, and a Master of Science and PhD in Computer Science from Carnegie Mellon University. Editor in charge: Daniel Petrolia. Log search and log archiving, despite being critical problems, are mutually exclusive. We have made Fluffy publicly available at https://github.com/snuspl/fluffy to contribute to the security of Ethereum. Prior or concurrent workshop publication does not preclude publishing a related paper in OSDI. OSDI is "a premier forum for discussing the design, implementation, and implications of systems software." A total of six research papers from the department were accepted to the . To adapt to different workloads, prior works mix or switch between a few known algorithms using manual insights or simple heuristics. This is the first OSDI in an odd year as OSDI moves to a yearly cadence. Furthermore, such performance can be achieved without any modification in applications, network hardware, kernel CPU schedulers and/or kernel network stack. 
As increasingly more sensitive data is being collected to gain valuable insights, the need to natively integrate privacy controls in data analytics frameworks is growing in importance. We propose Marius, a system for efficient training of graph embeddings that leverages partition caching and buffer-aware data orderings to minimize disk access and interleaves data movement with computation to maximize utilization. Papers not meeting these criteria will be rejected without review, and no deadline extensions will be granted for reformatting. GoJournal's goal is to bring the advantages of journaling for code to specs and proofs. This is unfortunate because good OS design has always been driven by the underlying hardware, and right now that hardware is almost unrecognizable from ten years ago, let alone from the 1960s when Unix was written. Pollux promotes fairness among DL jobs competing for resources based on a more meaningful measure of useful job progress, and reveals a new opportunity for reducing DL cost in cloud environments. Thanks to selective profiling, DMon's profiling overhead is 1.36% on average, making it feasible for production use. We argue that a key-value interface between a file system and an SSD is superior to the legacy block interface by presenting KEVIN. Penglai also reduces the latency of secure memory initialization by three orders of magnitude and gains 3.6x speedup for real-world applications (e.g., MapReduce). Commonly used log archival and compression tools like Gzip provide high compression ratio, yet searching archived logs is a slow and painful process as it first requires decompressing the logs. USENIX discourages program co-chairs from submitting papers to the conferences they organize, although they are allowed to do so. 
We implement DeSearch for two existing decentralized services that handle over 80 million records and 240 GBs of data, and show that DeSearch can scale horizontally with the number of workers and can process 128 million search queries per day. Sanitizers detect unsafe actions such as invalid memory accesses by inserting checks that are validated during a program's execution. We present case studies and end-to-end applications that show how Storm lets developers specify diverse policies while centralizing the trusted code to under 1% of the application, and statically enforces security with modest type annotation overhead, and no run-time cost. OSDI 2021 papers summary | hacklog To help more profitably utilize sanitizers, we introduce SanRazor, a practical tool aiming to effectively detect and remove redundant sanitizer checks. Zeph enforces privacy policies cryptographically and ensures that data available to third-party applications complies with users' privacy policies. Existing decentralized systems like Steemit, OpenBazaar, and the growing number of blockchain apps provide alternatives to existing services. These scripts often make pages slow to load, partly due to a fundamental inefficiency in how browsers process JavaScript content: browsers make it easy for web developers to reason about page state by serially executing all scripts on any frame in a page, but as a result, fail to leverage the multiple CPU cores that are readily available even on low-end phones. Submission of a response is optional. We present selective profiling, a technique that locates data locality problems with low-enough overhead that is suitable for production use. Welcome to the SOSP 2021 Website. She developed the technology for making network routing self-stabilizing, largely self-managing, and scalable. Some recent schedulers choose job resources for users, but do so without awareness of how DL training can be re-optimized to better utilize the provided resources. 
Paper abstracts and proceedings front matter are available to everyone now. Alas, existing profiling techniques incur high overhead when used to identify data locality problems and cannot be deployed in production, where programs may exhibit previously-unseen performance problems. SC is being increasingly adopted by industry for a variety of applications. Consensus bugs are bugs that make Ethereum clients transition to incorrect blockchain states and fail to reach consensus with other clients. This formulation of memory management, which we call memory programming, is a generalization of paging that allows MAGE to provide a highly efficient virtual memory abstraction for SC. The overhead of GPT is 5% for memory-intensive workloads (e.g., Redis) and negligible for CPU-intensive workloads (e.g., RV8 and Coremarks). Metadata from voice calls, such as the knowledge of who is communicating with whom, contains rich information about people's lives. He joined Intel Research at Berkeley in April 2002 as a principal architect of PlanetLab, an open, shared platform for developing and deploying planetary-scale services. This yielded 6% fewer TLB miss stalls, and 26% reduction in memory wasted due to fragmentation. Session Chairs: Nadav Amit, VMware Research Group, and Ada Gavrilovska, Georgia Institute of Technology. Stephen Ibanez, Alex Mallery, Serhat Arslan, and Theo Jepsen, Stanford University; Muhammad Shahbaz, Purdue University; Changhoon Kim and Nick McKeown, Stanford University. Grand Rapids, Michigan, United States. Our approach outperforms existing file systems on a block SSD by a wide margin, 6.2× on average for metadata-intensive benchmarks. Yet, existing efforts randomly select FL participants, which leads to poor model and system efficiency. We introduce a hybrid cryptographic protocol for privacy-adhering transformations of encrypted data. 
Amy Tai, VMware Research; Igor Smolyar, Technion Israel Institute of Technology; Michael Wei, VMware Research; Dan Tsafrir, Technion Israel Institute of Technology and VMware Research. Software Systems Laboratory Wins Best Paper Awards at the OSDI and A scientific paper consists of a constellation of artifacts that extend beyond the document itself: software, hardware, evaluation data and documentation, raw survey results, mechanized proofs, models, test suites, benchmarks, and so on. Additionally, there is no assurance that data processing and handling comply with the claimed privacy policies. If your accepted paper should not be published prior to the event, please notify [email protected]. Radia Perlman is a Fellow at Dell Technologies. We built an FPGA prototype of the nanoPU fast path by modifying an open-source RISC-V CPU, and evaluated its performance using cycle-accurate simulations on AWS FPGAs. First, it enables a caller to push a message to a callee in two hops, using a new way of assigning mailboxes to users that resembles how a post office assigns PO boxes to its customers. In 2023 I started another two-year term on the . OSDI '22 - HotCRP.com J.P. Morgan AI Research partners with applied data analytics teams across the firm as well as with leading academic institutions globally. Devices employ adaptive interrupt coalescing heuristics that try to balance between these opposing goals. Erhu Feng, Xu Lu, Dong Du, Bicheng Yang, and Xueqiang Jiang, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China; Yubin Xia, Binyu Zang, and Haibo Chen, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University; Shanghai AI Laboratory; Engineering Research Center for Domain-specific Operating Systems, Ministry of Education, China. 
Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, and Roxana Geambasu, Columbia University; Mathias Lécuyer, Microsoft Research. Poor data locality hurts an application's performance. We discuss the design and implementation of TEMERAIRE including strategies for hugepage-aware memory layouts to maximize hugepage coverage and to minimize fragmentation overheads. Conference site 49 papers accepted out of 251 submitted. Our evaluation shows that DistAI successfully verifies 13 common distributed protocols automatically and outperforms alternative methods both in the number of protocols it verifies and the speed at which it does so, in some cases by more than two orders of magnitude. To remedy this, we introduce DeSearch, the first decentralized search engine that guarantees the integrity and privacy of search results for decentralized services and blockchain apps. Conference Dates: Apr 12, 2021 - Apr 14, 2021. Compared to existing baselines, DPF allows training more models under the same global privacy guarantee. Even the little publishable OS work that is not based on Linux still assumes the same simplistic hardware model (essentially a multiprocessor VAX) that bears little resemblance to modern reality. Widely used log-search tools like Elasticsearch and Splunk Enterprise index the logs to provide fast search performance, yet the size of the index is within the same order of magnitude as the raw log size. However, existing enclave designs fail to meet the requirements of scalability demanded by new scenarios like serverless computing, mainly due to the limitations in their secure memory protection mechanisms, including static allocation, restricted capacity and high-cost initialization. Haojie Wang, Jidong Zhai, Mingyu Gao, Zixuan Ma, Shizhi Tang, and Liyan Zheng, Tsinghua University; Yuanzhi Li, Carnegie Mellon University; Kaiyuan Rong and Yuanyong Chen, Tsinghua University; Zhihao Jia, Carnegie Mellon University and Facebook. 
Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that enables in-situ model training and testing on edge data. Simultaneous submission of the same work to multiple venues, submission of previously published work, or plagiarism constitutes dishonesty or fraud. We convert five state-of-the-art PM indexes using Nap. Based on the observation that invariants are often concise in practice, DistAI starts with small invariant formulas and enumerates all strongest possible invariants that hold for all samples. There are two major GNN training obstacles: 1) it relies on high-end servers with many GPUs which are expensive to purchase and maintain, and 2) limited memory on GPUs cannot scale to today's billion-edge graphs. We develop rigorous theoretical foundations to simplify equivalence examination and correction for partially equivalent transformations, and design an efficient search algorithm to quickly discover highly optimized programs by combining fully and partially equivalent optimizations at the tensor, operator, and graph levels. Last year, 70% of accepted OSDI papers participated in the . We implement a variant of a log-structured merge tree in the storage device that not only indexes file objects, but also supports transactions and manages physical storage space. We propose a learning-based framework that instead explicitly optimizes concurrency control via offline training to maximize performance. Ethereum is the second-largest blockchain platform next to Bitcoin. The key insight in blk-switch is that Linux's multi-queue storage design, along with multi-queue network and storage hardware, makes the storage stack conceptually similar to a network switch. Publications | Mosharaf Chowdhury OSDI - Guide Proceedings This fast path contains programmable hardware support for low latency transport and congestion control as well as hardware support for efficient load balancing of RPCs to cores. 
She also invented the spanning tree algorithm, which transformed Ethernet from a technology that supported a few hundred nodes, to something that can support large networks. And yet, they continue to rely on centralized search engines and indexers to help users access the content they seek and navigate the apps. This budget is a scarce resource that must be carefully managed to maximize the number of successfully trained models. Authors are also encouraged to contact the program co-chairs, [email protected], if needed to relate their OSDI submissions to relevant submissions of their own that are simultaneously under review or awaiting publication at other venues.