Panasas' Len Rosenthal Speaks On the Plans to Open Source DirectFLOW

By Chris O'Neal – Today Panasas announced plans to open source a key component of its PanFS parallel file system, specifically the Panasas DirectFLOW client software. The goal is to advance the development and encourage the adoption of an emerging standard known as parallel NFS, or pNFS. We sat down with Chief Marketing Officer Len Rosenthal to talk about the standard and why Panasas is making the move now to open source some of its "family jewels."
SC Online: What is pNFS and why should our readers care about it?
Rosenthal: For the last several years, high-performance data centers have been moving quickly to parallel technologies such as clustered computing and multi-core processors as the best way to improve the performance of complex simulation and modeling applications and to drive down costs. While the use of parallelism is solving a number of computational bottlenecks, it has had little to no impact on the performance bottlenecks in storage I/O systems. As mainstream computing continues to grow, storage systems must be optimized for parallelism using a standard approach if users are to reap the inherent cost and performance benefits. Until the industry delivers a parallel storage standard, customer adoption will continue to be hampered by a reluctance to deploy one of the many incompatible and proprietary parallel storage implementations.
The emerging pNFS standard is aimed at solving these problems. Later this year, the Internet Engineering Task Force (IETF) NFSv4 subcommittee is expected to conclude its work on the pNFS protocol as part of the release of the NFS version 4.1 RFC (Request for Comments). The pNFS protocol will enable direct parallel data transfer between clients and storage devices without the need for expensive NFS filer heads. Support is expected for Linux, leading UNIX versions such as Solaris and AIX, and Microsoft Windows operating systems.
SC Online: Specifically, how does pNFS work?
Rosenthal: When fully released, pNFS will provide significant performance improvements over today's NFS and clustered NFS implementations, which are not viable solutions for most high-bandwidth HPC applications. Basically, pNFS separates the data path of an NFS file system from the metadata or control path. By doing so, pNFS enables direct data transfer from pNFS clients (i.e., nodes in a cluster) to the storage devices. pNFS eliminates vendor lock-in by allowing customers to upgrade or swap their storage infrastructure to match the high-performance, parallel-ready applications of their choice without associated changes to their clients, network or switching infrastructures. In addition, pNFS will eliminate the need for developers to support multiple proprietary storage systems by ensuring that applications are capable of using parallel I/O without changes to the underlying operating system or programming interfaces.
SC Online: How widespread is industry support for this emerging protocol?
Rosenthal: Panasas CTO Garth Gibson authored the original pNFS problem statement, and the pNFS architecture is directly leveraged from the concepts of our DirectFLOW protocol. Clearly, Panasas has been intimately involved with pNFS for many years. But now all of the leading storage vendors have joined the initiative. In addition to Panasas, IBM, EMC, Network Appliance, Sun Microsystems, the University of Michigan's Center for Information Technology Integration (CITI), and others are all actively developing and contributing to the new pNFS standard.
SC Online: How does Panasas intend to drive the standard?
Rosenthal: Today we are announcing that key components of the Panasas DirectFLOW client for Linux will be released to the open source community. Specifically, Panasas is making available the object layout driver, the iSCSI driver and other parts of the Panasas libraries. These components will be available to the storage community on the Panasas website and www.pnfs.com this summer.
As a precursor to the new standard, the DirectFLOW protocol provides nearly all of the functionality that is expected to be in the final release of pNFS. What this means is that both users and application developers can begin to reap the benefits of pNFS today with the expectation that as the standard evolves, Panasas will evolve its parallel storage solutions to be fully compliant with the emerging standard. In addition, Panasas has opened an R&D center in Tel Aviv that is dedicated to advancing the development of solutions based on the pNFS standard.
SC Online: Why is Panasas uniquely qualified to lead the industry to adoption of parallel storage?
Rosenthal: Panasas has been developing parallel file systems for more than eight years now. We have been shipping production-level parallel storage systems for Linux cluster users for about four years. We have more than 100 customer deployments enjoying the real benefits of parallel storage today. No other storage vendor can make these statements. Our leadership in IP around parallel file systems, our ability to develop production-proven parallel storage systems for Fortune 500 companies and our commitment to the pNFS standard development process set us apart from the other storage industry players.
SC Online: How does the standard play into your corporate strategy?
Rosenthal: From the outset, Panasas set out to accomplish three major goals. First, we wanted to build the best parallel file system and parallel storage systems in the industry, and we believe we have achieved this with our ActiveStor Parallel Storage Clusters and the ActiveScale 3.0 operating environment. Second, we realized early on that industry standards were needed to make parallel storage an implementation that customers would accept and adopt. Our work on the object storage standard was completed and ratified two years ago, and now we're working on the file system side with pNFS.
And finally, we wanted to ensure that application developers, OS vendors, and customers had a standard way of maximizing the value of their clustered computing infrastructure by removing one-lane I/O bottlenecks and taking advantage of the freeway-style lanes of parallel storage. The benefits to end users are faster time to market, accelerated time to results, and ultimately a better bottom line.
Supercomputing Online wishes to thank Len Rosenthal for his time and insights.
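Editor's sketch: the split Rosenthal describes between the metadata (control) path and the data path can be illustrated with a toy in-memory model. This is not the pNFS protocol or the DirectFLOW client. The names MetadataServer, StorageDevice, write_file and read_file, the 4-byte stripe unit and the round-robin layout are all assumptions made up for illustration. The point is only that the client asks the metadata server where the data lives and then moves the bytes to and from the storage devices directly and in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

STRIPE_SIZE = 4  # bytes per stripe unit; tiny so the example is easy to follow


class StorageDevice:
    """Hypothetical storage device holding stripe units keyed by (file, index)."""

    def __init__(self):
        self.units = {}

    def write_unit(self, name, index, chunk):
        self.units[(name, index)] = chunk

    def read_unit(self, name, index):
        return self.units[(name, index)]


class MetadataServer:
    """Hypothetical control path: hands out layouts, never touches file data."""

    def __init__(self, device_count):
        self.device_count = device_count
        self.unit_counts = {}

    def create_layout(self, name, size):
        # Round up to the number of stripe units needed for `size` bytes.
        self.unit_counts[name] = (size + STRIPE_SIZE - 1) // STRIPE_SIZE
        return self.get_layout(name)

    def get_layout(self, name):
        # Assumed round-robin striping: unit i lives on device i % device_count.
        return [(i, i % self.device_count) for i in range(self.unit_counts[name])]


def write_file(mds, devices, name, data):
    """Get a layout from the metadata server, then write units to devices in parallel."""
    layout = mds.create_layout(name, len(data))
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        futures = [
            pool.submit(
                devices[dev].write_unit,
                name,
                i,
                data[i * STRIPE_SIZE:(i + 1) * STRIPE_SIZE],
            )
            for i, dev in layout
        ]
        for future in futures:
            future.result()  # surface any error raised in a worker thread


def read_file(mds, devices, name):
    """Get the layout, then read every unit directly from its device in parallel."""
    layout = mds.get_layout(name)
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        chunks = pool.map(
            lambda unit: devices[unit[1]].read_unit(name, unit[0]), layout
        )
        return b"".join(chunks)  # pool.map yields results in layout order


if __name__ == "__main__":
    mds = MetadataServer(device_count=3)
    devices = [StorageDevice() for _ in range(3)]
    write_file(mds, devices, "demo.dat", b"parallel NFS demo")
    print(read_file(mds, devices, "demo.dat"))
```

In this sketch the metadata server only ever sees a file name and a size, while the file contents travel between the client and the devices. A real deployment would replace the in-memory dictionaries with network storage and the layout list with whatever layout type the final standard defines.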