Sivan Toledo with Aviad Zuck, Spring 2013
The seminar will focus on storage technologies, primarily flash-based storage, but we will also discuss papers on mobile storage and general topics. We will read and discuss research papers from recent top-tier systems conferences. If you don’t find a paper that interests you in the list below and would like to present papers on other aspects of computer systems (especially storage), let us know which papers you would like to present and we’ll consider the request.
- March 3: Aviad Zuck will lead the discussion
- March 10: Oren Kishon on “When Poll Is Better than Interrupt”
- March 17:
- March 24, 31: holiday, no classes
- April 7: Ofir Hermesh on “DFS: a file system for virtualized flash storage”
- April 14: Jenia Gorokhovsky on “FlashVM: virtual memory management on flash” (date may change)
- April 21: Pavel Anissimov on “To zip or not to zip”
- April 28: Eliad Maimon on “A study of Linux file system evolution”
- May 5: Yogev Vaknin on “Revisiting storage for smartphones”
- May 12: Lior Zilpa on “Autonomous storage management for personal devices with PodBase”
- May 19: Adam Polyak on “Eyo: Device-Transparent Personal Storage”
- May 26: A guest lecture on OpenStack.
- June 2: Ron Bigman on “Moneta: A high performance storage array architecture for next generation non-volatile memories”
- June 9: Elad Ivanir on “Reliably erasing data from flash-based solid state drives”
- June 16: No lecture
- June 24 (Monday): John Davis (from Microsoft Research) on “HW/SW Co-Design for Flash, its Management Software and the Applications that Use it”
List of suggested papers for the seminar.
Revisiting storage for smartphones
Hyojun Kim, Nitin Agrawal, Cristian Ungureanu
To Zip or Not to Zip: Effective Resource Usage for Real-Time Compression
Danny Harnik, Ronen Kat, Oded Margalit, Dmitry Sotnikov, and Avishay Traeger
Autonomous storage management for personal devices with PodBase
Ansley Post, Juan Navarro, Petr Kuznetsov, Peter Druschel
A Study of Linux File System Evolution (Best paper)
Lanyue Lu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, and Shan Lu
Extending the Lifetime of Flash-based Storage through Reducing Write Amplification from File Systems
Mai Zheng, Joseph Tucek, Feng Qin, Mark Lillibridge
When Poll Is Better than Interrupt
Jisoo Yang, Dave B. Minturn, Frank Hady
Emulating Goliath storage systems with David
Nitin Agrawal, Leo Arulraj, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Lifetime Management of Flash-Based SSDs Using Recovery-Aware Dynamic Throttling
Sungjin Lee, Taejin Kim, Kyungho Kim, Jihong Kim
Reliably Erasing Data from Flash-Based Solid State Drives
Michael Wei, Laura M. Grupp, Frederick E. Spada, Steven Swanson
Eyo: Device-Transparent Personal Storage
Jacob Strauss, Justin Mazzola Paluska, Chris Lesniewski-Laas, Bryan Ford, Robert Morris, Frans Kaashoek
USENIX ATC 2011
DFS: A File System for Virtualized Flash Storage
William K. Josephson, Lars A. Bongo, Kai Li, David Flynn
quFiles: The Right File at the Right Time
Kaushik Veeraraghavan, Jason Flinn, Edmund B. Nightingale, Brian Noble
FlashVM: virtual memory management on flash
Mohit Saxena, Michael M. Swift
USENIX ATC 2010
Moneta: A High-Performance Storage Array Architecture for Next-Generation, Non-volatile Memories
Adrian M. Caulfield, Arup De, Joel Coburn, Todor I. Mollov, Rajesh K. Gupta, Steven Swanson
MICRO ’43 (2010)
SFS: Random Write Considered Harmful in Solid State Drives
Changwoo Min, Kangnyeon Kim, Hyunjin Cho, Sang-Won Lee, Young Ik Eom
The Bleak Future of NAND Flash Memory
Laura M. Grupp, John D. Davis, Steven Swanson
Students are expected to present one or two papers and to actively participate in the discussions about all the papers. More specifically:
To participate actively in the discussions and to learn the most, you need to read the papers that we will discuss before each meeting. Most systems papers are not difficult to read and understand, so reading all the papers over the semester is not a huge task. Plan to spend between one and two hours reading the papers before each meeting.
Each discussion will be led by one of the students. The time frame for each discussion will be one or two hours, depending on how many participants we have (most likely 2 hours per discussion). Plan to spend much more than two hours to prepare for the discussion that you will lead.
The discussion may cover one or two papers. One paper is definitely enough for one hour, and sometimes even for two. One paper must always be from the list; the other papers or material that you discuss can be related material from other sources.
Take into consideration the background material required to understand the paper; explaining it might take a while, and can often be as interesting (if not more) as the paper itself.
PowerPoint-style presentations are very effective for short lectures (say 20 or 30 minutes), but they are not so effective when you have an hour or two. You do not have to use a PowerPoint-style presentation; you can use the board, you can distribute handouts, and you can mix different kinds of presentation methods.
If you do use a PowerPoint-style presentation, remember that slides with lots of text are particularly ineffective. Sparse text slides are better, but not by much. Visuals are good; they often help people understand. But visuals (diagrams, graphs, animation) are hard to prepare; they are easier to draw on the board. If you draw on the board, make sure you have a drawing in your notes; you can’t invent a good visual on the spot. Visuals must be relevant; don’t use them just for entertainment.
Think hard about how to engage the other participants in the discussion. People learn more when they are active than when they listen passively. It’s not easy to engage students, but try.
What to Think About When Presenting a Paper
When you are reading a research paper in computer systems, whether you are presenting it or not, think about the following questions. Being able to answer them shows that you understand the paper. These are also appropriate topics for the discussion.
What is the paper claiming or proposing?
How does the proposed system work? The details are important. Research builds on prior research. If you don’t understand a paper well enough to duplicate the system that the paper proposes, you can’t build on top of it. Papers are written to enable experts to duplicate the system. If you are not yet an expert, you may need to consult earlier papers, or books, to understand all the details. If you are leading the discussion, consider covering earlier ideas and techniques that must be understood but are not well known. We must understand the technical ideas in detail. This is the most important part of the discussion.
Are the ideas novel and innovative? Computer systems have been around for more than 60 years, so many good ideas have already been proposed in some context. Try to figure out which ideas and techniques are new, which have been adapted from other contexts, and which are simply tools of the trade.
How did the authors substantiate the claims? In computer systems research, claims are usually substantiated using experiments, simulation, and/or some analysis; there are usually no mathematical proofs that a technique or a system works. This kind of evidence can be convincing, or not so convincing. Much of the effort in systems research often lies in producing convincing evidence that the ideas work. Make sure you understand the evidence, not just the system idea.
Do you believe the claims? Do you think that the system works as claimed, or is the evidence weak? If so, why?
Even when the technical claims are valid, the ideas will not necessarily be rolled into production systems. There are many barriers to incorporating new ideas in systems: reliability concerns, cost issues, commercial interests, backward compatibility, etc. Do you believe that the ideas we discuss will be used in production? What might be the barriers and obstacles? The paper may mention some, but perhaps not all, and it might not judge these barriers correctly.