In the Shadow of the Leviathan: Filesystem in Userspace


Saturday, 24 October 2015

#LDROMA15 http://lug.uniroma2.it/ld15/


In the Shadow of the Leviathan


https://robertoreale.me/linux-day-2015


Files and filesystems


Die Dateien hat der liebe Gott gemacht, alles andere ist Menschenwerk. (The good Lord made the files; everything else is the work of man.) Source: Leopold Kronecker (apocryphal)


file is the new byte


All files are created equal. Source: Anonymous


Everything is a file. Source: Anonymous


All file systems are not created equal. Source: 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI 14)


Foundations


For most users, the filesystem is the most visible aspect of an operating system.

Source: Silberschatz & Galvin, Operating System Concepts, 7th ed.


The filesystem consists of two distinct parts: a collection of files, each storing related data, and a directory structure, which organizes and provides information about all the files in the system.

Source: Silberschatz & Galvin, Operating System Concepts, 7th ed.


The most important job of UNIX is to provide a filesystem. Source: Ritchie & Thompson, The UNIX Time-Sharing System


A file contains whatever information the user places on it, for example symbolic or binary (object) programs. No particular structuring is expected by the system. Source: Ritchie & Thompson, The UNIX Time-Sharing System


A file does not exist within a particular directory; the directory entry for a file consists merely of its name and a pointer to the information actually describing the file. Source: Ritchie & Thompson, The UNIX Time-Sharing System
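The "name plus pointer" idea can be observed from userspace through hard links. The following is a minimal sketch, assuming a POSIX system and a filesystem that supports link(); the helper names same_file and demo_two_names are illustrative and do not come from the paper.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 when both paths name the same inode on the same device,
 * 0 when they do not, and -1 if either path cannot be stat()ed. */
int same_file(const char *path_a, const char *path_b)
{
    struct stat a, b;

    if (stat(path_a, &a) != 0 || stat(path_b, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/* Creates `path`, gives the same file a second name `alias` with link(),
 * and returns the link count seen through the alias (2 on success),
 * or -1 on error. Both names are removed before returning. */
int demo_two_names(const char *path, const char *alias)
{
    struct stat st;
    int result = -1;
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return -1;
    close(fd);

    if (link(path, alias) == 0) {
        /* Two directory entries, one underlying file. */
        if (same_file(path, alias) == 1 && stat(alias, &st) == 0)
            result = (int) st.st_nlink;
        unlink(alias);
    }
    unlink(path);
    return result;
}
```

Neither name is "the" file: removing one entry leaves the other fully usable, and the data goes away only when the last name is unlinked and no process still holds the file open.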


There is a threefold advantage in treating I/O devices this way: file and device I/O are as similar as possible; file and device names have the same syntax and meaning, so that a program expecting a file name as a parameter can be passed a device name; finally, special files are subject to the same protection mechanism as regular files. Source: Ritchie & Thompson, The UNIX Time-Sharing System


Perhaps paradoxically, the success of UNIX is largely due to the fact that it was not designed to meet any predefined objectives. Source: Ritchie & Thompson, The UNIX Time-Sharing System


Clarifications


The whole point with "everything is a file" is not that you have some random filename, but the fact that you can use common tools to operate on different things. Source: Linus Torvalds, 8 June 2002


The UNIX philosophy is often quoted as "everything is a file", but that really means "everything is a stream of bytes". Source: Linus Torvalds, 8 March 2007


It should be just a "read()", and then people can use general libraries and treat all sources the same. Source: Linus Torvalds, 8 March 2007
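A minimal sketch of what "just a read()" buys in practice, assuming a POSIX system: the same loop drains a regular file, a pipe, or a device node without knowing which one it was handed. The names count_bytes and count_bytes_at are illustrative.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Counts the bytes delivered by any file descriptor until end-of-file.
 * Returns the total, or -1 on a read error. */
long count_bytes(int fd)
{
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0)
            total += n;
        else if (n == 0)
            return total;       /* end of file or stream */
        else if (errno != EINTR)
            return -1;          /* a real error, not an interrupted call */
    }
}

/* Same count, starting from a path: regular file, FIFO or device alike.
 * Returns -1 if the path cannot be opened. */
long count_bytes_at(const char *path)
{
    long total;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    total = count_bytes(fd);
    close(fd);
    return total;
}
```

Nothing in count_bytes depends on what sits behind the descriptor; that is the property a userspace filesystem inherits for free once it is mounted.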


The paradox


everything is a file

but the boundary of what counts as a file cannot be stretched at will


the boundary of what counts as a file is fixed by the kernel


the syntax and semantics of the filesystem are fixed by the kernel


The Leviathan


VFS: over 65,000 lines of code as early as 2008. Source: Galloway et al., Model-Checking the Linux Virtual File System


conservative, kernel-centric approach


hard to debug


a non-administrator user simply cannot


ecce spes eius frustrabitur eum et videntibus cunctis praecipitabitur (Behold, his hope shall fail him, and in the sight of all he shall be cast down.) Source: Job 41:1


VFS


Abstraction
File: common model
Data structures: superblock, inode, file, dentry
Operations
Object-oriented
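The "object-oriented" flavour of the VFS comes from pairing each object with a table of function pointers that the concrete filesystem fills in. The sketch below is a deliberately tiny userspace toy, not kernel code: every name in it (toy_file, toy_file_ops, mem_ops, zero_ops, toy_read) is hypothetical and only mimics the shape of the idea.

```c
#include <stddef.h>
#include <string.h>

struct toy_file;

/* Table of operations: each "filesystem" supplies its own implementation. */
struct toy_file_ops {
    long (*read)(struct toy_file *f, char *buf, size_t size);
};

/* Common file object: generic state plus a pointer to its methods. */
struct toy_file {
    const struct toy_file_ops *ops;
    size_t pos;             /* current offset */
    const char *fs_data;    /* implementation-specific data */
};

/* Implementation 1: a file backed by a C string in memory. */
static long mem_read(struct toy_file *f, char *buf, size_t size)
{
    size_t len = strlen(f->fs_data);
    size_t left = f->pos < len ? len - f->pos : 0;

    if (left == 0)
        return 0;           /* end of file */
    if (size > left)
        size = left;
    memcpy(buf, f->fs_data + f->pos, size);
    f->pos += size;
    return (long) size;
}

/* Implementation 2: an endless stream of zero bytes. */
static long zero_read(struct toy_file *f, char *buf, size_t size)
{
    memset(buf, 0, size);
    f->pos += size;
    return (long) size;
}

const struct toy_file_ops mem_ops = { .read = mem_read };
const struct toy_file_ops zero_ops = { .read = zero_read };

/* Generic entry point: callers never see which implementation runs. */
long toy_read(struct toy_file *f, char *buf, size_t size)
{
    return f->ops->read(f, buf, size);
}
```

A caller builds a struct toy_file with either table and only ever calls toy_read; swapping the table swaps the behaviour, which is the dispatch pattern the slide refers to.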


Implementation
Disk data structures
Memory data structures
Disk space management


Precursors
Earlier VFS implementations include Sun's VFS (in SunOS version 2.0, circa 1985) and IBM and Microsoft's "Installable File System" for IBM OS/2. Source: M. Tim Jones, Anatomy of the Linux virtual filesystem switch


Other paths


Synthetic Files


9P: Plan 9 Filesystem Protocol


puffs: Pass-to-Userspace Framework File System on NetBSD


A filesystem is a protocol translator: it interprets incoming requests and transforms them into a form suitable to store and retrieve data. Source: Antti Kantee, Send and Receive of File System Protocols


Hurd translators


A translator is simply a normal program acting as an object server and participating in the Hurd's distributed virtual filesystem. Source: https://www.gnu.org/software/hurd/hurd/translator.html


It is so-called because it typically exports a filesystem (although need not: cf. auth, proc and pfinet) and thus translates object invocations into calls appropriate for the backing store (e.g., ext2 filesystem, nfs server, etc.). Source: https://www.gnu.org/software/hurd/hurd/translator.html


Another way of putting it is that it translates from one representation of a data structure into another representation, for example from the on-disk ext2 data layout to a traditional filesystem hierarchy, or from a XML file to a virtual hierarchical manifestation. Source: https://www.gnu.org/software/hurd/hurd/translator.html


A translator is usually registered with a specific filesystem node by using the settrans command. Source: https://www.gnu.org/software/hurd/hurd/translator.html


Translators do not require any special privilege to run. The privilege they require is simply that to access the individual resources they use. Source: https://www.gnu.org/software/hurd/hurd/translator.html


FUSE: Filesystem in Userspace



With FUSE it is possible to implement a fully functional filesystem in a userspace program. Source: http://fuse.sourceforge.net/


Author: Miklos Szeredi
Licenses: GPL + LGPL


Features include...


simple library API


simple installation (no need to patch or recompile the kernel)


secure implementation


userspace-kernel interface is very efficient


usable by non privileged users Source: http://fuse.sourceforge.net/


Interaction happens through a file (again!): /dev/fuse.


FUSE is a userspace filesystem framework. It consists of a kernel module (fuse.ko), a userspace library (libfuse.*) and a mount utility (fusermount). Source: http://fuse.sourceforge.net/doxygen/index.html


One of the most important features of FUSE is allowing secure, non-privileged mounts. This opens up new possibilities for the use of filesystems. A good example is sshfs: a secure network filesystem using the sftp protocol. Source: http://fuse.sourceforge.net/doxygen/index.html


Since the mount() system call is a privileged operation, a helper program (fusermount) is needed, which is installed setuid root. Source: http://fuse.sourceforge.net/doxygen/index.html


Vocabulary


Userspace filesystem: A filesystem in which data and metadata are provided by an ordinary userspace process. The filesystem can be accessed normally through the kernel interface. Source: http://fuse.sourceforge.net/doxygen/index.html


Filesystem daemon: The process(es) providing the data and metadata of the filesystem. Source: http://fuse.sourceforge.net/doxygen/index.html


Non-privileged mount (or user mount): A userspace filesystem mounted by a non-privileged (non-root) user. The filesystem daemon is running with the privileges of the mounting user. Source: http://fuse.sourceforge.net/doxygen/index.html


Filesystem connection: A connection between the filesystem daemon and the kernel. The connection exists until either the daemon dies, or the filesystem is unmounted. Source: http://fuse.sourceforge.net/doxygen/index.html


Mount owner: The user who does the mounting. Source: http://fuse.sourceforge.net/doxygen/index.html


User: The user who is performing filesystem operations. Source: http://fuse.sourceforge.net/doxygen/index.html


hello.c


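/*
  FUSE: Filesystem in Userspace
  Copyright (C) 2001-2007  Miklos Szeredi <miklos@szeredi.hu>

  This program can be distributed under the terms of the GNU GPL.
  See the file COPYING.
*/

/* Note: the callback signatures in this example are those of the
   libfuse 2.x-era high-level API; released libfuse 3 changed several. */
#define FUSE_USE_VERSION 30

#include <fuse.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>

/* The whole filesystem: one file, "/hello", with fixed content. */
static const char *hello_str = "Hello World!\n";
static const char *hello_path = "/hello";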


/* Report attributes: "/" is a directory, "/hello" a read-only regular file. */
static int hello_getattr(const char *path, struct stat *stbuf)
{
    int res = 0;

    memset(stbuf, 0, sizeof(struct stat));
    if (strcmp(path, "/") == 0) {
        stbuf->st_mode = S_IFDIR | 0755;
        stbuf->st_nlink = 2;
    } else if (strcmp(path, hello_path) == 0) {
        stbuf->st_mode = S_IFREG | 0444;
        stbuf->st_nlink = 1;
        stbuf->st_size = strlen(hello_str);
    } else
        res = -ENOENT;

    return res;
}


/* List the root directory: ".", ".." and "hello". */
static int hello_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                         off_t offset, struct fuse_file_info *fi)
{
    (void) offset;
    (void) fi;

    if (strcmp(path, "/") != 0)
        return -ENOENT;

    filler(buf, ".", NULL, 0);
    filler(buf, "..", NULL, 0);
    filler(buf, hello_path + 1, NULL, 0);   /* skip the leading '/' */

    return 0;
}


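/* Allow opening "/hello" only, and only for reading. */
static int hello_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, hello_path) != 0)
        return -ENOENT;

    /* O_ACCMODE masks the access-mode bits (the original used the literal 3). */
    if ((fi->flags & O_ACCMODE) != O_RDONLY)
        return -EACCES;

    return 0;
}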


/* Copy the requested slice of hello_str; returning 0 signals end-of-file. */
static int hello_read(const char *path, char *buf, size_t size, off_t offset,
                      struct fuse_file_info *fi)
{
    size_t len;
    (void) fi;

    if (strcmp(path, hello_path) != 0)
        return -ENOENT;

    len = strlen(hello_str);
    if (offset >= 0 && (size_t) offset < len) {
        if (offset + size > len)
            size = len - offset;    /* clamp to the end of the content */
        memcpy(buf, hello_str + offset, size);
    } else
        size = 0;

    return size;
}


/* Operations table: only the four callbacks this filesystem needs. */
static struct fuse_operations hello_oper = {
    .getattr = hello_getattr,
    .readdir = hello_readdir,
    .open    = hello_open,
    .read    = hello_read,
};


/* fuse_main() parses the mount options, mounts, and runs the event loop. */
int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &hello_oper, NULL);
}
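The only subtle part of hello.c is the offset and size arithmetic in hello_read, because a reader may ask for any slice, including one past the end. As a hedged illustration, the same logic is isolated below in a plain function that needs no FUSE headers; slice_read is a hypothetical name, not part of libfuse.

```c
#include <string.h>
#include <sys/types.h>

/* Copies up to `size` bytes of `content`, starting at `offset`, into `buf`.
 * Returns the number of bytes copied; 0 means end-of-file (or a bad offset). */
int slice_read(const char *content, char *buf, size_t size, off_t offset)
{
    size_t len = strlen(content);

    if (offset < 0 || (size_t) offset >= len)
        return 0;                           /* nothing left to read */
    if (size > len - (size_t) offset)
        size = len - (size_t) offset;       /* clamp to the end of the content */
    memcpy(buf, content + offset, size);
    return (int) size;
}
```

With the 13-byte string "Hello World!\n", a request at offset 0 for a large buffer yields all 13 bytes, a request for 5 bytes at offset 6 yields "World", and a request at offset 13 yields 0, which is how a reading tool learns that the file has ended.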


Bestiary


CephFS
FUSE as a nursery, a breeding ground for young filesystems. CephFS has been in the kernel since version 2.6.34.


Couchfuse
Couchfuse is a FUSE filesystem that exposes CouchDB databases as filesystem folders. Source: http://narkisr.github.io/couch-fuse/


elfs
A simple (FUSE) filesystem on top of ELF objects. Author: Guillaume Leconte. Source: https://github.com/pozdnychev/elfs


$ elfs `which fdup` /tmp/elf
$ ls -l /tmp/elf/
total 0
drw-r--r-- 1 root root 0 Jan  1  1970 header
drw-r--r-- 1 root root 0 Jan  1  1970 libs
drw-r--r-- 1 root root 0 Jan  1  1970 sections


extension to other binary formats
abstraction from the format
interface to exec()


etcd-fs
A replicated filesystem on top of etcd. Author: Jonathan Leibiusky. Source: https://github.com/xetorthio/etcd-fs


fusepy
Simple ctypes bindings for FUSE. Author: Terence Honles. Source: https://github.com/fusepy/fusepy


GlusterFS
GlusterFS is a scalable network filesystem. Using common off-the-shelf hardware, you can create large, distributed storage solutions for media streaming, data analysis, and other data- and bandwidth-intensive tasks. Source: http://www.gluster.org/


PNGdrive
PNG meets Steganography meets Fuse: the easiest way to have plausible deniability. Source: https://code.google.com/p/pngdrive/


WikipediaFS
WikipediaFS is a virtual filesystem which allows users to view and edit Wikipedia articles as if they were real files on a local disk drive. Source: https://en.wikipedia.org/wiki/WikipediaFS


Colophon
Presentation composed with vim and Hovercraft! on Ubuntu Saucy. Featuring Google Fonts: Libre Baskerville, Racing Sans One, Satisfy.


exit()


Roberto Reale https://robertoreale.me/linux-day-2015

