From 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 Mon Sep 17 00:00:00 2001 From: Linus Torvalds Date: Sat, 16 Apr 2005 15:20:36 -0700 Subject: Linux-2.6.12-rc2 Initial git repository build. I'm not bothering with the full history, even though we have it. We can create a separate "historical" git archive of that later if we want to, and in the meantime it's about 3.2GB when imported into git - space that would just make the early git days unnecessarily complicated, when we don't have a lot of good infrastructure for it. Let it rip! --- Documentation/filesystems/00-INDEX | 50 + Documentation/filesystems/Exporting | 176 ++ Documentation/filesystems/Locking | 515 ++++++ Documentation/filesystems/adfs.txt | 57 + Documentation/filesystems/affs.txt | 219 +++ Documentation/filesystems/afs.txt | 155 ++ Documentation/filesystems/automount-support.txt | 118 ++ Documentation/filesystems/befs.txt | 117 ++ Documentation/filesystems/bfs.txt | 57 + Documentation/filesystems/cifs.txt | 51 + Documentation/filesystems/coda.txt | 1673 +++++++++++++++++++ Documentation/filesystems/cramfs.txt | 76 + Documentation/filesystems/devfs/ChangeLog | 1977 +++++++++++++++++++++++ Documentation/filesystems/devfs/README | 1964 ++++++++++++++++++++++ Documentation/filesystems/devfs/ToDo | 40 + Documentation/filesystems/devfs/boot-options | 65 + Documentation/filesystems/directory-locking | 113 ++ Documentation/filesystems/ext2.txt | 383 +++++ Documentation/filesystems/ext3.txt | 183 +++ Documentation/filesystems/hfs.txt | 83 + Documentation/filesystems/hpfs.txt | 296 ++++ Documentation/filesystems/isofs.txt | 38 + Documentation/filesystems/jfs.txt | 35 + Documentation/filesystems/ncpfs.txt | 12 + Documentation/filesystems/ntfs.txt | 630 ++++++++ Documentation/filesystems/porting | 266 +++ Documentation/filesystems/proc.txt | 1940 ++++++++++++++++++++++ Documentation/filesystems/romfs.txt | 187 +++ Documentation/filesystems/smbfs.txt | 8 + Documentation/filesystems/sysfs-pci.txt | 88 + 
Documentation/filesystems/sysfs.txt | 341 ++++ Documentation/filesystems/sysv-fs.txt | 38 + Documentation/filesystems/tmpfs.txt | 100 ++ Documentation/filesystems/udf.txt | 57 + Documentation/filesystems/ufs.txt | 61 + Documentation/filesystems/vfat.txt | 231 +++ Documentation/filesystems/vfs.txt | 671 ++++++++ Documentation/filesystems/xfs.txt | 188 +++ 38 files changed, 13259 insertions(+) create mode 100644 Documentation/filesystems/00-INDEX create mode 100644 Documentation/filesystems/Exporting create mode 100644 Documentation/filesystems/Locking create mode 100644 Documentation/filesystems/adfs.txt create mode 100644 Documentation/filesystems/affs.txt create mode 100644 Documentation/filesystems/afs.txt create mode 100644 Documentation/filesystems/automount-support.txt create mode 100644 Documentation/filesystems/befs.txt create mode 100644 Documentation/filesystems/bfs.txt create mode 100644 Documentation/filesystems/cifs.txt create mode 100644 Documentation/filesystems/coda.txt create mode 100644 Documentation/filesystems/cramfs.txt create mode 100644 Documentation/filesystems/devfs/ChangeLog create mode 100644 Documentation/filesystems/devfs/README create mode 100644 Documentation/filesystems/devfs/ToDo create mode 100644 Documentation/filesystems/devfs/boot-options create mode 100644 Documentation/filesystems/directory-locking create mode 100644 Documentation/filesystems/ext2.txt create mode 100644 Documentation/filesystems/ext3.txt create mode 100644 Documentation/filesystems/hfs.txt create mode 100644 Documentation/filesystems/hpfs.txt create mode 100644 Documentation/filesystems/isofs.txt create mode 100644 Documentation/filesystems/jfs.txt create mode 100644 Documentation/filesystems/ncpfs.txt create mode 100644 Documentation/filesystems/ntfs.txt create mode 100644 Documentation/filesystems/porting create mode 100644 Documentation/filesystems/proc.txt create mode 100644 Documentation/filesystems/romfs.txt create mode 100644 
Documentation/filesystems/smbfs.txt create mode 100644 Documentation/filesystems/sysfs-pci.txt create mode 100644 Documentation/filesystems/sysfs.txt create mode 100644 Documentation/filesystems/sysv-fs.txt create mode 100644 Documentation/filesystems/tmpfs.txt create mode 100644 Documentation/filesystems/udf.txt create mode 100644 Documentation/filesystems/ufs.txt create mode 100644 Documentation/filesystems/vfat.txt create mode 100644 Documentation/filesystems/vfs.txt create mode 100644 Documentation/filesystems/xfs.txt (limited to 'Documentation/filesystems') diff --git a/Documentation/filesystems/00-INDEX b/Documentation/filesystems/00-INDEX new file mode 100644 index 00000000000..bcfbab899b3 --- /dev/null +++ b/Documentation/filesystems/00-INDEX @@ -0,0 +1,50 @@ +00-INDEX + - this file (info on some of the filesystems supported by linux). +Locking + - info on locking rules as they pertain to Linux VFS. +adfs.txt + - info and mount options for the Acorn Advanced Disc Filing System. +affs.txt + - info and mount options for the Amiga Fast File System. +bfs.txt + - info for the SCO UnixWare Boot Filesystem (BFS). +cifs.txt + - description of the CIFS filesystem +coda.txt + - description of the CODA filesystem. +cramfs.txt + - info on the cram filesystem for small storage (ROMs etc) +devfs/ + - directory containing devfs documentation. +ext2.txt + - info, mount options and specifications for the Ext2 filesystem. +fat_cvf.txt + - info on the Compressed Volume Files extension to the FAT filesystem +hpfs.txt + - info and mount options for the OS/2 HPFS. +isofs.txt + - info and mount options for the ISO 9660 (CDROM) filesystem. +jfs.txt + - info and mount options for the JFS filesystem. +ncpfs.txt + - info on Novell Netware(tm) filesystem using NCP protocol. +ntfs.txt + - info and mount options for the NTFS filesystem (Windows NT). +proc.txt + - info on Linux's /proc filesystem. +romfs.txt + - Description of the ROMFS filesystem. 
+smbfs.txt
+	- info on using filesystems with the SMB protocol (Windows 3.11 and NT)
+sysv-fs.txt
+	- info on the SystemV/V7/Xenix/Coherent filesystem.
+udf.txt
+	- info and mount options for the UDF filesystem.
+ufs.txt
+	- info on the ufs filesystem.
+vfat.txt
+	- info on using the VFAT filesystem used in Windows NT and Windows 95
+vfs.txt
+	- Overview of the Virtual File System
+xfs.txt
+	- info and mount options for the XFS filesystem.
diff --git a/Documentation/filesystems/Exporting b/Documentation/filesystems/Exporting
new file mode 100644
index 00000000000..31047e0fe14
--- /dev/null
+++ b/Documentation/filesystems/Exporting
@@ -0,0 +1,176 @@
+
+Making Filesystems Exportable
+=============================
+
+Most filesystem operations require a dentry (or two) as a starting
+point. Local applications have a reference-counted hold on suitable
+dentries via open file descriptors or cwd/root. However remote
+applications that access a filesystem via a remote filesystem protocol
+such as NFS may not be able to hold such a reference, and so need a
+different way to refer to a particular dentry. As the alternative
+form of reference needs to be stable across renames, truncates, and
+server-reboot (among other things, though these tend to be the most
+problematic), there is no simple answer like 'filename'.
+
+The mechanism discussed here allows each filesystem implementation to
+specify how to generate an opaque (outside of the filesystem) byte
+string for any dentry, and how to find an appropriate dentry for any
+given opaque byte string.
+This byte string will be called a "filehandle fragment" as it
+corresponds to part of an NFS filehandle.
+
+A filesystem which supports the mapping between filehandle fragments
+and dentries will be termed "exportable".
+
+
+
+Dcache Issues
+-------------
+
+The dcache normally contains a proper prefix of any given filesystem
+tree.
This means that if any filesystem object is in the dcache, then
+all of the ancestors of that filesystem object are also in the dcache.
+As normal access is by filename this prefix is created naturally and
+maintained easily (by each object maintaining a reference count on
+its parent).
+
+However when objects are included into the dcache by interpreting a
+filehandle fragment, there is no automatic creation of a path prefix
+for the object. This leads to two related but distinct features of
+the dcache that are not needed for normal filesystem access.
+
+1/ The dcache must sometimes contain objects that are not part of the
+   proper prefix, i.e. that are not connected to the root.
+2/ The dcache must be prepared for a newly found (via ->lookup) directory
+   to already have a (non-connected) dentry, and must be able to move
+   that dentry into place (based on the parent and name in the
+   ->lookup). This is particularly needed for directories as
+   it is a dcache invariant that directories only have one dentry.
+
+To implement these features, the dcache has:
+
+a/ A dentry flag DCACHE_DISCONNECTED which is set on
+   any dentry that might not be part of the proper prefix.
+   This is set when anonymous dentries are created, and cleared when a
+   dentry is noticed to be a child of a dentry which is in the proper
+   prefix.
+
+b/ A per-superblock list "s_anon" of dentries which are the roots of
+   subtrees that are not in the proper prefix. These dentries, as
+   well as the proper prefix, need to be released at unmount time. As
+   these dentries will not be hashed, they are linked together on the
+   d_hash list_head.
+
+c/ Helper routines to allocate anonymous dentries, and to help attach
+   loose directory dentries at lookup time. They are:
+    d_alloc_anon(inode) will return a dentry for the given inode.
+      If the inode already has a dentry, one of those is returned.
+      If it doesn't, a new anonymous (IS_ROOT and
+        DCACHE_DISCONNECTED) dentry is allocated and attached.
+      In the case of a directory, care is taken that only one dentry
+      can ever be attached.
+    d_splice_alias(inode, dentry) will make sure that there is a
+      dentry with the same name and parent as the given dentry, and
+      which refers to the given inode.
+      If the inode is a directory and already has a dentry, then that
+      dentry is d_moved over the given dentry.
+      If the passed dentry gets attached, care is taken that this is
+      mutually exclusive to a d_alloc_anon operation.
+      If the passed dentry is used, NULL is returned, else the used
+      dentry is returned. This corresponds to the calling pattern of
+      ->lookup.
+
+
+Filesystem Issues
+-----------------
+
+For a filesystem to be exportable it must:
+
+   1/ provide the filehandle fragment routines described below.
+   2/ make sure that d_splice_alias is used rather than d_add
+      when ->lookup finds an inode for a given parent and name.
+      Typically the ->lookup routine will end:
+		if (inode)
+			return d_splice_alias(inode, dentry);
+		d_add(dentry, inode);
+		return NULL;
+	}
+
+
+
+  A file system implementation declares that instances of the filesystem
+are exportable by setting the s_export_op field in the struct
+super_block. This field must point to a "struct export_operations"
+struct which could potentially be full of NULLs, though normally at
+least get_parent will be set.
+
+ The primary operations are decode_fh and encode_fh.
+decode_fh takes a filehandle fragment and tries to find or create a
+dentry for the object referred to by the filehandle.
+encode_fh takes a dentry and creates a filehandle fragment which can
+later be used to find/create a dentry for the same object.
+
+decode_fh will probably make use of "find_exported_dentry".
+This function lives in the "exportfs" module which a filesystem does
+not need unless it is being exported. So rather than calling
+find_exported_dentry directly, each filesystem should call it through
+the find_exported_dentry pointer in its export_operations table.
+This field is set correctly by the exporting agent (e.g. nfsd) when a
+filesystem is exported, and before any export operations are called.
+
+find_exported_dentry needs three support functions from the
+filesystem:
+  get_name. When given a parent dentry and a child dentry, this
+    should find a name in the directory identified by the parent
+    dentry, which leads to the object identified by the child dentry.
+    If no get_name function is supplied, a default implementation is
+    provided which uses vfs_readdir to find potential names, and
+    matches inode numbers to find the correct match.
+
+  get_parent. When given a dentry for a directory, this should return
+    a dentry for the parent. Quite possibly the parent dentry will
+    have been allocated by d_alloc_anon.
+    The default get_parent function just returns an error so any
+    filehandle lookup that requires finding a parent will fail.
+    ->lookup("..") is *not* used as a default as it can leave ".."
+    entries in the dcache which are too messy to work with.
+
+  get_dentry. When given an opaque datum, this should find the
+    implied object and create a dentry for it (possibly with
+    d_alloc_anon).
+    The opaque datum is whatever is passed down by the decode_fh
+    function, and is often simply a fragment of the filehandle
+    fragment.
+    decode_fh passes two datums through find_exported_dentry. One that
+    should be used to identify the target object, and one that can be
+    used to identify the object's parent, should that be necessary.
+    The default get_dentry function assumes that the datum contains an
+    inode number and a generation number, and it attempts to get the
+    inode using "iget" and check its validity by matching the
+    generation number. A filesystem should only depend on the default
+    if iget can safely be used this way.
+
+If decode_fh and/or encode_fh are left as NULL, then default
+implementations are used. These defaults are suitable for ext2 and
+extremely similar filesystems (like ext3).
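Reviewer's note, not part of the patch: the default get_dentry check described above (look the object up by inode number, then reject it if the generation number differs) can be sketched in user space as follows. This is only an illustration under stated assumptions. The names toy_inode, toy_datum, toy_iget and toy_get_object are hypothetical stand-ins and are not kernel interfaces; the flat array replaces the real inode cache.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an inode; NOT a kernel structure. */
struct toy_inode {
	uint32_t ino;        /* inode number */
	uint32_t generation; /* assumed to change when the number is reused */
	int      in_use;     /* 0 if this slot holds no live object */
};

/* The opaque datum assumed by the default scheme: {ino, generation}. */
struct toy_datum {
	uint32_t ino;
	uint32_t generation;
};

/* Toy replacement for iget(): find a live inode by number, or NULL. */
static struct toy_inode *toy_iget(struct toy_inode *table, size_t n,
				  uint32_t ino)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].in_use && table[i].ino == ino)
			return &table[i];
	return NULL;
}

/*
 * Mirrors the described default: fetch by inode number, then validate by
 * generation.  Returns NULL when the handle is stale, i.e. the object is
 * gone or the inode number now names a different object.
 */
static struct toy_inode *toy_get_object(struct toy_inode *table, size_t n,
					const struct toy_datum *d)
{
	struct toy_inode *inode = toy_iget(table, n, d->ino);

	if (inode == NULL)
		return NULL;
	if (inode->generation != d->generation)
		return NULL;
	return inode;
}
```

The generation comparison is what lets a handle survive renames (the inode number does not change) while still being rejected once the inode has been freed and reused.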
+
+The default encode_fh creates a filehandle fragment from the inode
+number and generation number of the target together with the inode
+number and generation number of the parent (if the parent is
+required).
+
+The default decode_fh extracts the target and parent datums from the
+filehandle assuming the format used by the default encode_fh and
+passes them to find_exported_dentry.
+
+
+A filehandle fragment consists of an array of 1 or more 4-byte words,
+together with a one-byte "type".
+The decode_fh routine should not depend on the stated size that is
+passed to it. This size may be larger than the original filehandle
+generated by encode_fh, in which case it will have been padded with
+nuls. Rather, the encode_fh routine should choose a "type" which
+indicates to decode_fh how much of the filehandle is valid, and how
+it should be interpreted.
+
+
diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
new file mode 100644
index 00000000000..a934baeeb33
--- /dev/null
+++ b/Documentation/filesystems/Locking
@@ -0,0 +1,515 @@
+	The text below describes the locking rules for VFS-related methods.
+It is (believed to be) up-to-date. *Please*, if you change anything in
+prototypes or locking protocols - update this file. And update the relevant
+instances in the tree, don't leave that to maintainers of filesystems/devices/
+etc. At the very least, put the list of dubious cases at the end of this file.
+Don't turn it into a log - maintainers of out-of-the-tree code are supposed to
+be able to use diff(1).
+	Things currently missing here: socket operations. Alexey?
+
+--------------------------- dentry_operations --------------------------
+prototypes:
+	int (*d_revalidate)(struct dentry *, int);
+	int (*d_hash) (struct dentry *, struct qstr *);
+	int (*d_compare) (struct dentry *, struct qstr *, struct qstr *);
+	int (*d_delete)(struct dentry *);
+	void (*d_release)(struct dentry *);
+	void (*d_iput)(struct dentry *, struct inode *);
+
+locking rules:
+	none have BKL
+		dcache_lock	rename_lock	->d_lock	may block
+d_revalidate:	no		no		no		yes
+d_hash:		no		no		no		yes
+d_compare:	no		yes		no		no
+d_delete:	yes		no		yes		no
+d_release:	no		no		no		yes
+d_iput:		no		no		no		yes
+
+--------------------------- inode_operations ---------------------------
+prototypes:
+	int (*create) (struct inode *,struct dentry *,int, struct nameidata *);
+	struct dentry * (*lookup) (struct inode *,struct dentry *,
+			struct nameidata *);
+	int (*link) (struct dentry *,struct inode *,struct dentry *);
+	int (*unlink) (struct inode *,struct dentry *);
+	int (*symlink) (struct inode *,struct dentry *,const char *);
+	int (*mkdir) (struct inode *,struct dentry *,int);
+	int (*rmdir) (struct inode *,struct dentry *);
+	int (*mknod) (struct inode *,struct dentry *,int,dev_t);
+	int (*rename) (struct inode *, struct dentry *,
+			struct inode *, struct dentry *);
+	int (*readlink) (struct dentry *, char __user *,int);
+	int (*follow_link) (struct dentry *, struct nameidata *);
+	void (*truncate) (struct inode *);
+	int (*permission) (struct inode *, int, struct nameidata *);
+	int (*setattr) (struct dentry *, struct iattr *);
+	int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *);
+	int (*setxattr) (struct dentry *, const char *,const void *,size_t,int);
+	ssize_t (*getxattr) (struct dentry *, const char *, void *, size_t);
+	ssize_t (*listxattr) (struct dentry *, char *, size_t);
+	int (*removexattr) (struct dentry *, const char *);
+
+locking rules:
+	all may block, none have BKL
+		i_sem(inode)
+lookup:		yes
+create:		yes
+link:		yes (both)
+mknod:		yes
+symlink: yes +mkdir: yes +unlink: yes (both) +rmdir: yes (both) (see below) +rename: yes (all) (see below) +readlink: no +follow_link: no +truncate: yes (see below) +setattr: yes +permission: no +getattr: no +setxattr: yes +getxattr: no +listxattr: no +removexattr: yes + Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_sem on +victim. + cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem. + ->truncate() is never called directly - it's a callback, not a +method. It's called by vmtruncate() - library function normally used by +->setattr(). Locking information above applies to that call (i.e. is +inherited from ->setattr() - vmtruncate() is used when ATTR_SIZE had been +passed). + +See Documentation/filesystems/directory-locking for more detailed discussion +of the locking scheme for directory operations. + +--------------------------- super_operations --------------------------- +prototypes: + struct inode *(*alloc_inode)(struct super_block *sb); + void (*destroy_inode)(struct inode *); + void (*read_inode) (struct inode *); + void (*dirty_inode) (struct inode *); + int (*write_inode) (struct inode *, int); + void (*put_inode) (struct inode *); + void (*drop_inode) (struct inode *); + void (*delete_inode) (struct inode *); + void (*put_super) (struct super_block *); + void (*write_super) (struct super_block *); + int (*sync_fs)(struct super_block *sb, int wait); + void (*write_super_lockfs) (struct super_block *); + void (*unlockfs) (struct super_block *); + int (*statfs) (struct super_block *, struct kstatfs *); + int (*remount_fs) (struct super_block *, int *, char *); + void (*clear_inode) (struct inode *); + void (*umount_begin) (struct super_block *); + int (*show_options)(struct seq_file *, struct vfsmount *); + ssize_t (*quota_read)(struct super_block *, int, char *, size_t, loff_t); + ssize_t (*quota_write)(struct super_block *, int, const char *, size_t, loff_t); + +locking rules: + All may block. 
+ BKL s_lock s_umount +alloc_inode: no no no +destroy_inode: no +read_inode: no (see below) +dirty_inode: no (must not sleep) +write_inode: no +put_inode: no +drop_inode: no !!!inode_lock!!! +delete_inode: no +put_super: yes yes no +write_super: no yes read +sync_fs: no no read +write_super_lockfs: ? +unlockfs: ? +statfs: no no no +remount_fs: no yes maybe (see below) +clear_inode: no +umount_begin: yes no no +show_options: no (vfsmount->sem) +quota_read: no no no (see below) +quota_write: no no no (see below) + +->read_inode() is not a method - it's a callback used in iget(). +->remount_fs() will have the s_umount lock if it's already mounted. +When called from get_sb_single, it does NOT have the s_umount lock. +->quota_read() and ->quota_write() functions are both guaranteed to +be the only ones operating on the quota file by the quota code (via +dqio_sem) (unless an admin really wants to screw up something and +writes to quota files with quotas on). For other details about locking +see also dquot_operations section. + +--------------------------- file_system_type --------------------------- +prototypes: + struct super_block *(*get_sb) (struct file_system_type *, int, + const char *, void *); + void (*kill_sb) (struct super_block *); +locking rules: + may block BKL +get_sb yes yes +kill_sb yes yes + +->get_sb() returns error or a locked superblock (exclusive on ->s_umount). +->kill_sb() takes a write-locked superblock, does all shutdown work on it, +unlocks and drops the reference. 
+ +--------------------------- address_space_operations -------------------------- +prototypes: + int (*writepage)(struct page *page, struct writeback_control *wbc); + int (*readpage)(struct file *, struct page *); + int (*sync_page)(struct page *); + int (*writepages)(struct address_space *, struct writeback_control *); + int (*set_page_dirty)(struct page *page); + int (*readpages)(struct file *filp, struct address_space *mapping, + struct list_head *pages, unsigned nr_pages); + int (*prepare_write)(struct file *, struct page *, unsigned, unsigned); + int (*commit_write)(struct file *, struct page *, unsigned, unsigned); + sector_t (*bmap)(struct address_space *, sector_t); + int (*invalidatepage) (struct page *, unsigned long); + int (*releasepage) (struct page *, int); + int (*direct_IO)(int, struct kiocb *, const struct iovec *iov, + loff_t offset, unsigned long nr_segs); + +locking rules: + All except set_page_dirty may block + + BKL PageLocked(page) +writepage: no yes, unlocks (see below) +readpage: no yes, unlocks +sync_page: no maybe +writepages: no +set_page_dirty no no +readpages: no +prepare_write: no yes +commit_write: no yes +bmap: yes +invalidatepage: no yes +releasepage: no yes +direct_IO: no + + ->prepare_write(), ->commit_write(), ->sync_page() and ->readpage() +may be called from the request handler (/dev/loop). + + ->readpage() unlocks the page, either synchronously or via I/O +completion. + + ->readpages() populates the pagecache with the passed pages and starts +I/O against them. They come unlocked upon I/O completion. + + ->writepage() is used for two purposes: for "memory cleansing" and for +"sync". These are quite different operations and the behaviour may differ +depending upon the mode. + +If writepage is called for sync (wbc->sync_mode != WBC_SYNC_NONE) then +it *must* start I/O against the page, even if that would involve +blocking on in-progress I/O. 
+
+If writepage is called for memory cleansing (sync_mode ==
+WBC_SYNC_NONE) then its role is to get as much writeout underway as
+possible. So writepage should try to avoid blocking against
+currently-in-progress I/O.
+
+If the filesystem is not called for "sync" and it determines that it
+would need to block against in-progress I/O to be able to start new I/O
+against the page the filesystem should redirty the page with
+redirty_page_for_writepage(), then unlock the page and return zero.
+This may also be done to avoid internal deadlocks, but rarely.
+
+If the filesystem is called for sync then it must wait on any
+in-progress I/O and then start new I/O.
+
+The filesystem should unlock the page synchronously, before returning
+to the caller.
+
+Unless the filesystem is going to redirty_page_for_writepage(), unlock the page
+and return zero, writepage *must* run set_page_writeback() against the page,
+followed by unlocking it. Once set_page_writeback() has been run against the
+page, write I/O can be submitted and the write I/O completion handler must run
+end_page_writeback() once the I/O is complete. If no I/O is submitted, the
+filesystem must run end_page_writeback() against the page before returning from
+writepage.
+
+That is: after 2.5.12, pages which are under writeout are *not* locked. Note,
+if the filesystem needs the page to be locked during writeout, that is ok, too,
+the page is allowed to be unlocked at any point in time between the calls to
+set_page_writeback() and end_page_writeback().
+
+Note, failure to run either redirty_page_for_writepage() or the combination of
+set_page_writeback()/end_page_writeback() on a page submitted to writepage
+will leave the page itself marked clean but it will be tagged as dirty in the
+radix tree. This incoherency can lead to all sorts of hard-to-debug problems
+in the filesystem like having dirty inodes at umount and losing written data.
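Reviewer's note, not part of the patch: the ->writepage() contract described above can be modelled as a small user-space state machine. This is a sketch under stated assumptions, not kernel code. The toy_* names are hypothetical; the boolean fields stand in for the page lock, the dirty bit, the writeback bit and "earlier I/O still in flight"; and the model assumes the page arrives locked with its dirty bit already cleared, as the last paragraph above implies.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for WBC_SYNC_NONE and the "sync" modes. */
enum toy_sync_mode { TOY_SYNC_NONE, TOY_SYNC_ALL };

/* Toy model of the page state that the rules above talk about. */
struct toy_page {
	bool locked;    /* page lock, held by the caller on entry */
	bool dirty;     /* assumed already cleared by the caller */
	bool writeback; /* set between set/end_page_writeback() */
	bool busy;      /* earlier I/O still in progress on this page */
};

/* Stand-in for blocking until the in-progress I/O has finished. */
static void toy_wait_for_io(struct toy_page *p)
{
	p->busy = false;
}

/* Stand-in for the I/O completion handler: end_page_writeback(). */
static void toy_end_io(struct toy_page *p)
{
	p->writeback = false;
}

static int toy_writepage(struct toy_page *p, enum toy_sync_mode mode)
{
	if (p->busy) {
		if (mode == TOY_SYNC_NONE) {
			/* Memory cleansing: do not block.  Redirty, unlock, return 0. */
			p->dirty = true;   /* models redirty_page_for_writepage() */
			p->locked = false;
			return 0;
		}
		/* Sync: must wait for the old I/O, then start new I/O. */
		toy_wait_for_io(p);
	}
	p->writeback = true;   /* models set_page_writeback() */
	p->locked = false;     /* unlock synchronously, before returning */
	/* A real filesystem would submit the write here; its completion
	 * handler would then do what toy_end_io() does. */
	return 0;
}
```

Every exit path leaves the page either redirtied or under writeback, which is the invariant whose violation the text warns leaves the page clean while still tagged dirty in the radix tree.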
+ + ->sync_page() locking rules are not well-defined - usually it is called +with lock on page, but that is not guaranteed. Considering the currently +existing instances of this method ->sync_page() itself doesn't look +well-defined... + + ->writepages() is used for periodic writeback and for syscall-initiated +sync operations. The address_space should start I/O against at least +*nr_to_write pages. *nr_to_write must be decremented for each page which is +written. The address_space implementation may write more (or less) pages +than *nr_to_write asks for, but it should try to be reasonably close. If +nr_to_write is NULL, all dirty pages must be written. + +writepages should _only_ write pages which are present on +mapping->io_pages. + + ->set_page_dirty() is called from various places in the kernel +when the target page is marked as needing writeback. It may be called +under spinlock (it cannot block) and is sometimes called with the page +not locked. + + ->bmap() is currently used by legacy ioctl() (FIBMAP) provided by some +filesystems and by the swapper. The latter will eventually go away. All +instances do not actually need the BKL. Please, keep it that way and don't +breed new callers. + + ->invalidatepage() is called when the filesystem must attempt to drop +some or all of the buffers from the page when it is being truncated. It +returns zero on success. If ->invalidatepage is zero, the kernel uses +block_invalidatepage() instead. + + ->releasepage() is called when the kernel is about to try to drop the +buffers from the page in preparation for freeing it. It returns zero to +indicate that the buffers are (or may be) freeable. If ->releasepage is zero, +the kernel assumes that the fs has no private interest in the buffers. + + Note: currently almost all instances of address_space methods are +using BKL for internal serialization and that's one of the worst sources +of contention. 
Normally they call library functions (in fs/buffer.c)
+and pass foo_get_block() as a callback (on local block-based filesystems,
+indeed). BKL is not needed for library stuff and is usually taken by
+foo_get_block(). It's overkill, since block bitmaps can be protected by
+internal fs locking and real critical areas are much smaller than the areas
+filesystems protect now.
+
+----------------------- file_lock_operations ------------------------------
+prototypes:
+	void (*fl_insert)(struct file_lock *);	/* lock insertion callback */
+	void (*fl_remove)(struct file_lock *);	/* lock removal callback */
+	void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
+	void (*fl_release_private)(struct file_lock *);
+
+
+locking rules:
+			BKL	may block
+fl_insert:		yes	no
+fl_remove:		yes	no
+fl_copy_lock:		yes	no
+fl_release_private:	yes	yes
+
+----------------------- lock_manager_operations ---------------------------
+prototypes:
+	int (*fl_compare_owner)(struct file_lock *, struct file_lock *);
+	void (*fl_notify)(struct file_lock *);	/* unblock callback */
+	void (*fl_copy_lock)(struct file_lock *, struct file_lock *);
+	void (*fl_release_private)(struct file_lock *);
+	void (*fl_break)(struct file_lock *);	/* break_lease callback */
+
+locking rules:
+			BKL	may block
+fl_compare_owner:	yes	no
+fl_notify:		yes	no
+fl_copy_lock:		yes	no
+fl_release_private:	yes	yes
+fl_break:		yes	no
+
+	Currently only NFSD and NLM provide instances of this class. None of
+them block. If you have out-of-tree instances - please, show up. Locking
+in that area will change.
+--------------------------- buffer_head -----------------------------------
+prototypes:
+	void (*b_end_io)(struct buffer_head *bh, int uptodate);
+
+locking rules:
+	called from interrupts. In other words, extreme care is needed here.
+bh is locked, but that is the only guarantee we have here. Currently only RAID1,
+highmem, fs/buffer.c, and fs/ntfs/aops.c are providing these.
Block devices +call this method upon the IO completion. + +--------------------------- block_device_operations ----------------------- +prototypes: + int (*open) (struct inode *, struct file *); + int (*release) (struct inode *, struct file *); + int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long); + int (*media_changed) (struct gendisk *); + int (*revalidate_disk) (struct gendisk *); + +locking rules: + BKL bd_sem +open: yes yes +release: yes yes +ioctl: yes no +media_changed: no no +revalidate_disk: no no + +The last two are called only from check_disk_change(). + +--------------------------- file_operations ------------------------------- +prototypes: + loff_t (*llseek) (struct file *, loff_t, int); + ssize_t (*read) (struct file *, char __user *, size_t, loff_t *); + ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t); + ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *); + ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t, + loff_t); + int (*readdir) (struct file *, void *, filldir_t); + unsigned int (*poll) (struct file *, struct poll_table_struct *); + int (*ioctl) (struct inode *, struct file *, unsigned int, + unsigned long); + long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long); + long (*compat_ioctl) (struct file *, unsigned int, unsigned long); + int (*mmap) (struct file *, struct vm_area_struct *); + int (*open) (struct inode *, struct file *); + int (*flush) (struct file *); + int (*release) (struct inode *, struct file *); + int (*fsync) (struct file *, struct dentry *, int datasync); + int (*aio_fsync) (struct kiocb *, int datasync); + int (*fasync) (int, struct file *, int); + int (*lock) (struct file *, int, struct file_lock *); + ssize_t (*readv) (struct file *, const struct iovec *, unsigned long, + loff_t *); + ssize_t (*writev) (struct file *, const struct iovec *, unsigned long, + loff_t *); + ssize_t (*sendfile) (struct file *, loff_t *, size_t, 
read_actor_t,
+			void __user *);
+	ssize_t (*sendpage) (struct file *, struct page *, int, size_t,
+			loff_t *, int);
+	unsigned long (*get_unmapped_area)(struct file *, unsigned long,
+			unsigned long, unsigned long, unsigned long);
+	int (*check_flags)(int);
+	int (*dir_notify)(struct file *, unsigned long);
+};
+
+locking rules:
+	All except ->poll() may block.
+			BKL
+llseek:			no	(see below)
+read:			no
+aio_read:		no
+write:			no
+aio_write:		no
+readdir:		no
+poll:			no
+ioctl:			yes	(see below)
+unlocked_ioctl:		no	(see below)
+compat_ioctl:		no
+mmap:			no
+open:			maybe	(see below)
+flush:			no
+release:		no
+fsync:			no	(see below)
+aio_fsync:		no
+fasync:			yes	(see below)
+lock:			yes
+readv:			no
+writev:			no
+sendfile:		no
+sendpage:		no
+get_unmapped_area:	no
+check_flags:		no
+dir_notify:		no
+
+->llseek() locking has moved from llseek to the individual llseek
+implementations. If your fs is not using generic_file_llseek, you
+need to acquire and release the appropriate locks in your ->llseek().
+For many filesystems, it is probably safe to acquire the inode
+semaphore. Note some filesystems (i.e. remote ones) provide no
+protection for i_size so you will need to use the BKL.
+
+->open() locking is in-transit: big lock partially moved into the methods.
+The only exception is ->open() in the instances of file_operations that never
+end up in ->i_fop/->proc_fops, i.e. ones that belong to character devices
+(chrdev_open() takes lock before replacing ->f_op and calling the secondary
+method). As soon as we fix the handling of module reference counters all
+instances of ->open() will be called without the BKL.
+
+Note: ext2_release() was *the* source of contention on fs-intensive
+loads and dropping BKL on ->release() helps to get rid of that (we still
+grab BKL for cases when we close a file that had been opened r/w, but that
+can and should be done using the internal locking with smaller critical areas).
+Current worst offender is ext2_get_block()...
+
+->fasync() is a mess.
This area needs a big cleanup and that will probably +affect locking. + +->readdir() and ->ioctl() on directories must be changed. Ideally we would +move ->readdir() to inode_operations and use a separate method for directory +->ioctl() or kill the latter completely. One of the problems is that for +anything that resembles union-mount we won't have a struct file for all +components. And there are other reasons why the current interface is a mess... + +->ioctl() on regular files is superseded by the ->unlocked_ioctl() that +doesn't take the BKL. + +->read on directories probably must go away - we should just enforce -EISDIR +in sys_read() and friends. + +->fsync() has i_sem on inode. + +--------------------------- dquot_operations ------------------------------- +prototypes: + int (*initialize) (struct inode *, int); + int (*drop) (struct inode *); + int (*alloc_space) (struct inode *, qsize_t, int); + int (*alloc_inode) (const struct inode *, unsigned long); + int (*free_space) (struct inode *, qsize_t); + int (*free_inode) (const struct inode *, unsigned long); + int (*transfer) (struct inode *, struct iattr *); + int (*write_dquot) (struct dquot *); + int (*acquire_dquot) (struct dquot *); + int (*release_dquot) (struct dquot *); + int (*mark_dirty) (struct dquot *); + int (*write_info) (struct super_block *, int); + +These operations are intended to be more or less wrapping functions that ensure +proper locking wrt the filesystem and call the generic quota operations. 
+ +What a filesystem should expect from the generic quota functions: + + FS recursion Held locks when called +initialize: yes maybe dqonoff_sem +drop: yes - +alloc_space: ->mark_dirty() - +alloc_inode: ->mark_dirty() - +free_space: ->mark_dirty() - +free_inode: ->mark_dirty() - +transfer: yes - +write_dquot: yes dqonoff_sem or dqptr_sem +acquire_dquot: yes dqonoff_sem or dqptr_sem +release_dquot: yes dqonoff_sem or dqptr_sem +mark_dirty: no - +write_info: yes dqonoff_sem + +FS recursion means calling ->quota_read() and ->quota_write() from superblock +operations. + +->alloc_space(), ->alloc_inode(), ->free_space(), ->free_inode() are called +only directly by the filesystem and do not call any fs functions except +the ->mark_dirty() operation. + +More details about quota locking can be found in fs/dquot.c. + +--------------------------- vm_operations_struct ----------------------------- +prototypes: + void (*open)(struct vm_area_struct*); + void (*close)(struct vm_area_struct*); + struct page *(*nopage)(struct vm_area_struct*, unsigned long, int *); + +locking rules: + BKL mmap_sem +open: no yes +close: no yes +nopage: no yes + +================================================================================ + Dubious stuff + +(if you break something or notice that it is broken and do not fix it yourself +- at least put it here) + +ipc/shm.c::shm_delete() - may need BKL. +->read() and ->write() in many drivers are (probably) missing BKL. +drivers/sgi/char/graphics.c::sgi_graphics_nopage() - may need BKL. diff --git a/Documentation/filesystems/adfs.txt b/Documentation/filesystems/adfs.txt new file mode 100644 index 00000000000..060abb0c700 --- /dev/null +++ b/Documentation/filesystems/adfs.txt @@ -0,0 +1,57 @@ +Mount options for ADFS +---------------------- + + uid=nnn All files in the partition will be owned by + user id nnn. Default 0 (root). + gid=nnn All files in the partition will be in group + nnn. Default 0 (root). 
+ ownmask=nnn The permission mask for ADFS 'owner' permissions + will be nnn. Default 0700. + othmask=nnn The permission mask for ADFS 'other' permissions + will be nnn. Default 0077. + +Mapping of ADFS permissions to Linux permissions +------------------------------------------------ + + ADFS permissions consist of the following: + + Owner read + Owner write + Other read + Other write + + (In older versions, an 'execute' permission did exist, but this + does not hold the same meaning as the Linux 'execute' permission + and is now obsolete). + + The mapping is performed as follows: + + Owner read -> -r--r--r-- + Owner write -> --w--w--w- + Owner read and filetype UnixExec -> ---x--x--x + These are then masked by ownmask, eg 700 -> -rwx------ + Possible owner mode permissions -> -rwx------ + + Other read -> -r--r--r-- + Other write -> --w--w--w- + Other read and filetype UnixExec -> ---x--x--x + These are then masked by othmask, eg 077 -> ----rwxrwx + Possible other mode permissions -> ----rwxrwx + + Hence, with the default masks, if a file is owner read/write, and + not a UnixExec filetype, then the permissions will be: + + -rw------- + + However, if the masks were ownmask=0770,othmask=0007, then this would + be modified to: + -rw-rw---- + + There is no restriction on what you can do with these masks. You may + wish that either read bits give read access to the file for all, but + keep the default write protection (ownmask=0755,othmask=0577): + + -rw-r--r-- + + You can therefore tailor the permission translation to whatever you + desire the permissions should be under Linux. diff --git a/Documentation/filesystems/affs.txt b/Documentation/filesystems/affs.txt new file mode 100644 index 00000000000..30c9738590f --- /dev/null +++ b/Documentation/filesystems/affs.txt @@ -0,0 +1,219 @@ +Overview of Amiga Filesystems +============================= + +Not all varieties of the Amiga filesystems are supported for reading and +writing. 
The Amiga currently knows six different filesystems: + +DOS\0 The old or original filesystem, not really suited for + hard disks and normally not used on them, either. + Supported read/write. + +DOS\1 The original Fast File System. Supported read/write. + +DOS\2 The old "international" filesystem. International means that + a bug has been fixed so that accented ("international") letters + in file names are case-insensitive, as they ought to be. + Supported read/write. + +DOS\3 The "international" Fast File System. Supported read/write. + +DOS\4 The original filesystem with directory cache. The directory + cache speeds up directory accesses on floppies considerably, + but slows down file creation/deletion. Doesn't make much + sense on hard disks. Supported read only. + +DOS\5 The Fast File System with directory cache. Supported read only. + +All of the above filesystems allow block sizes from 512 to 32K bytes. +Supported block sizes are: 512, 1024, 2048 and 4096 bytes. Larger blocks +speed up almost everything at the expense of wasted disk space. The speed +gain above 4K seems not really worth the price, so you don't lose too +much here, either. + +The muFS (multi user File System) equivalents of the above file systems +are supported, too. + +Mount options for the AFFS +========================== + +protect If this option is set, the protection bits cannot be altered. + +setuid[=uid] This sets the owner of all files and directories in the file + system to uid or the uid of the current user, respectively. + +setgid[=gid] Same as above, but for gid. + +mode=mode Sets the mode flags to the given (octal) value, regardless + of the original permissions. Directories will get an x + permission if the corresponding r bit is set. + This is useful since most of the plain AmigaOS files + will map to 600. + +reserved=num Sets the number of reserved blocks at the start of the + partition to num. You should never need this option. + Default is 2. 
+ +root=block Sets the block number of the root block. This should never + be necessary. + +bs=blksize Sets the blocksize to blksize. Valid block sizes are 512, + 1024, 2048 and 4096. Like the root option, this should + never be necessary, as the affs can figure it out itself. + +quiet The file system will not return an error for disallowed + mode changes. + +verbose The volume name, file system type and block size will + be written to the syslog when the filesystem is mounted. + +mufs The filesystem is really a muFS, also it doesn't + identify itself as one. This option is necessary if + the filesystem wasn't formatted as muFS, but is used + as one. + +prefix=path Path will be prefixed to every absolute path name of + symbolic links on an AFFS partition. Default = "/". + (See below.) + +volume=name When symbolic links with an absolute path are created + on an AFFS partition, name will be prepended as the + volume name. Default = "" (empty string). + (See below.) + +Handling of the Users/Groups and protection flags +================================================= + +Amiga -> Linux: + +The Amiga protection flags RWEDRWEDHSPARWED are handled as follows: + + - R maps to r for user, group and others. On directories, R implies x. + + - If both W and D are allowed, w will be set. + + - E maps to x. + + - H and P are always retained and ignored under Linux. + + - A is always reset when a file is written to. + +User id and group id will be used unless set[gu]id are given as mount +options. Since most of the Amiga file systems are single user systems +they will be owned by root. The root directory (the mount point) of the +Amiga filesystem will be owned by the user who actually mounts the +filesystem (the root directory doesn't have uid/gid fields). + +Linux -> Amiga: + +The Linux rwxrwxrwx file mode is handled as follows: + + - r permission will set R for user, group and others. + + - w permission will set W and D for user, group and others. 
+ + - x permission of the user will set E for plain files. + + - All other flags (suid, sgid, ...) are ignored and will + not be retained. + +Newly created files and directories will get the user and group ID +of the current user and a mode according to the umask. + +Symbolic links +============== + +Although the Amiga and Linux file systems resemble each other, there +are some, not always subtle, differences. One of them becomes apparent +with symbolic links. While Linux has a file system with exactly one +root directory, the Amiga has a separate root directory for each +file system (for example, partition, floppy disk, ...). With the Amiga, +these entities are called "volumes". They have symbolic names which +can be used to access them. Thus, symbolic links can point to a +different volume. AFFS turns the volume name into a directory name +and prepends the prefix path (see prefix option) to it. + +Example: +You mount all your Amiga partitions under /amiga/<volume> (where +<volume> is the name of the volume), and you give the option +"prefix=/amiga/" when mounting all your AFFS partitions. (They +might be "User", "WB" and "Graphics", the mount points /amiga/User, +/amiga/WB and /amiga/Graphics). A symbolic link referring to +"User:sc/include/dos/dos.h" will be followed to +"/amiga/User/sc/include/dos/dos.h". + +Examples +======== + +Command line: + mount Archive/Amiga/Workbench3.1.adf /mnt -t affs -o loop,verbose + mount /dev/sda3 /Amiga -t affs + +/etc/fstab entry: + /dev/sdb5 /amiga/Workbench affs noauto,user,exec,verbose 0 0 + +IMPORTANT NOTE +============== + +If you boot Windows 95 (don't know about 3.x, 98 and NT) while you +have an Amiga harddisk connected to your PC, it will overwrite +the bytes 0x00dc..0x00df of block 0 with garbage, thus invalidating +the Rigid Disk Block. Sheer luck has it that this is an unused +area of the RDB, so only the checksum doesn't match anymore. 
+Linux will ignore this garbage and recognize the RDB anyway, but +before you connect that drive to your Amiga again, you must +restore or repair your RDB. So please do make a backup copy of it +before booting Windows! + +If the damage is already done, the following should fix the RDB +(where <disk> is the device name). +DO AT YOUR OWN RISK: + + dd if=/dev/<disk> of=rdb.tmp count=1 + cp rdb.tmp rdb.fixed + dd if=/dev/zero of=rdb.fixed bs=1 seek=220 count=4 + dd if=rdb.fixed of=/dev/<disk> + +Bugs, Restrictions, Caveats +=========================== + +Quite a few things may not work as advertised. Not everything is +tested, though several hundred MB have been read and written using +this fs. For the most up-to-date list of bugs please consult +fs/affs/Changes. + +Filenames are truncated to 30 characters without warning (this +can be changed by setting the compile-time option AFFS_NO_TRUNCATE +in include/linux/amigaffs.h). + +Case is ignored by the affs in filename matching, but Linux shells +do care about the case. Example (with /wb being an affs mounted fs): + rm /wb/WRONGCASE +will remove /wb/wrongcase, but + rm /wb/WR* +will not since the names are matched by the shell. + +The block allocation is designed for hard disk partitions. If more +than 1 process writes to a (small) diskette, the blocks are allocated +in an ugly way (but the real AFFS doesn't do much better). This +is also true when space gets tight. + +You cannot execute programs on an OFS (Old File System), since the +program files cannot be memory mapped due to the 488 byte blocks. +For the same reason you cannot mount an image on such a filesystem +via the loopback device. + +The bitmap valid flag in the root block may not be accurate when the +system crashes while an affs partition is mounted. There's currently +no way to fix a garbled filesystem without an Amiga (disk validator) +or manually (who would do this?). Maybe later. 
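The case-matching caveat above can be illustrated with a small userspace sketch. It is only a model under stated assumptions: plain str.lower() stands in for the filesystem's own case folding (the real rules, including the "international" variants, live in the kernel), fnmatch.fnmatchcase() stands in for the shell's case-sensitive wildcard expansion, and the directory contents and helper names are made up for illustration.

```python
import fnmatch

# Hypothetical contents of an affs volume mounted on /wb.
entries = ["wrongcase", "Readme", "S"]


def affs_lookup(name, names):
    """Model of a case-insensitive lookup: return the stored name or None."""
    wanted = name.lower()  # assumption: simple folding, not the kernel's table
    for stored in names:
        if stored.lower() == wanted:
            return stored
    return None


def shell_expand(pattern, names):
    """Model of shell globbing: case-sensitive match against the listing."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


# "rm /wb/WRONGCASE": the literal name reaches the filesystem, which ignores case.
print(affs_lookup("WRONGCASE", entries))

# "rm /wb/WR*": the shell expands the pattern against the listing first
# and finds nothing, so the filesystem never sees a usable name.
print(shell_expand("WR*", entries))
```

In this model the first call finds "wrongcase" while the second returns an empty list, which is why the wildcard form of the command removes nothing.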
+ +If you mount affs partitions on system startup, you may want to tell +fsck that the fs should not be checked (place a '0' in the sixth field +of /etc/fstab). + +It's not possible to read floppy disks with a normal PC or workstation +due to an incompatibility with the Amiga floppy controller. + +If you are interested in an Amiga Emulator for Linux, look at + +http://www-users.informatik.rwth-aachen.de/~crux/uae.html diff --git a/Documentation/filesystems/afs.txt b/Documentation/filesystems/afs.txt new file mode 100644 index 00000000000..2f4237dfb8c --- /dev/null +++ b/Documentation/filesystems/afs.txt @@ -0,0 +1,155 @@ + kAFS: AFS FILESYSTEM + ==================== + +ABOUT +===== + +This filesystem provides a fairly simple AFS filesystem driver. It is under +development and only provides very basic facilities. It does not yet support +the following AFS features: + + (*) Write support. + (*) Communications security. + (*) Local caching. + (*) pioctl() system call. + (*) Automatic mounting of embedded mountpoints. + + +USAGE +===== + +When inserting the driver modules the root cell must be specified along with a +list of volume location server IP addresses: + + insmod rxrpc.o + insmod kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 + +The first module is a driver for the RxRPC remote operation protocol, and the +second is the actual filesystem driver for the AFS filesystem. + +Once the module has been loaded, more cells can be added by the following +procedure: + + echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells + +Where the parameters to the "add" command are the name of a cell and a list of +volume location servers within that cell. + +Filesystems can be mounted anywhere by commands similar to the following: + + mount -t afs "%cambridge.redhat.com:root.afs." /afs + mount -t afs "#cambridge.redhat.com:root.cell." /afs/cambridge + mount -t afs "#root.afs." /afs + mount -t afs "#root.cell." 
/afs/cambridge + + NB: When using this on Linux 2.4, the mount command has to be different, + since the filesystem doesn't have access to the device name argument: + + mount -t afs none /afs -ovol="#root.afs." + +Where the initial character is either a hash or a percent symbol depending on +whether you definitely want a R/W volume (hash) or whether you'd prefer a R/O +volume, but are willing to use a R/W volume instead (percent). + +The name of the volume can be suffixed with ".backup" or ".readonly" to +specify connection to only volumes of those types. + +The name of the cell is optional, and if not given during a mount, then the +named volume will be looked up in the cell specified during insmod. + +Additional cells can be added through /proc (see later section). + + +MOUNTPOINTS +=========== + +AFS has a concept of mountpoints. These are specially formatted symbolic links +(of the same form as the "device name" passed to mount). kAFS presents these +to the user as directories that have special properties: + + (*) They cannot be listed. Running a program like "ls" on them will incur an + EREMOTE error (Object is remote). + + (*) Other objects can't be looked up inside of them. This also incurs an + EREMOTE error. + + (*) They can be queried with the readlink() system call, which will return + the name of the mountpoint to which they point. The "readlink" program + will also work. + + (*) They can be mounted on (which symbolic links can't). + + +PROC FILESYSTEM +=============== + +The rxrpc module creates a number of files in various places in the /proc +filesystem: + + (*) Firstly, some information files are made available in a directory called + "/proc/net/rxrpc/". These list the extant transport endpoint, peer, + connection and call records. + + (*) Secondly, some control files are made available in a directory called + "/proc/sys/rxrpc/". Currently, all these files can be used for is to + turn on various levels of tracing. 
+ +The AFS module creates a "/proc/fs/afs/" directory and populates it: + + (*) A "cells" file that lists cells currently known to the afs module. + + (*) A directory per cell that contains files that list volume location + servers, volumes, and active servers known within that cell. + + +THE CELL DATABASE +================= + +The filesystem maintains an internal database of all the cells it knows and +the IP addresses of the volume location servers for those cells. The cell to +which the computer belongs is added to the database when insmod is performed +by the "rootcell=" argument. + +Further cells can be added by commands similar to the following: + + echo add CELLNAME VLADDR[:VLADDR][:VLADDR]... >/proc/fs/afs/cells + echo add grand.central.org 18.7.14.88:128.2.191.224 >/proc/fs/afs/cells + +No other cell database operations are available at this time. + + +EXAMPLES +======== + +Here's what I use to test this. Some of the names and IP addresses are local +to my internal DNS. My "root.afs" partition has a mount point within it for +some public volumes. + +insmod -S /tmp/rxrpc.o +insmod -S /tmp/kafs.o rootcell=cambridge.redhat.com:172.16.18.73:172.16.18.91 + +mount -t afs \%root.afs. /afs +mount -t afs \%cambridge.redhat.com:root.cell. /afs/cambridge.redhat.com/ + +echo add grand.central.org 18.7.14.88:128.2.191.224 > /proc/fs/afs/cells +mount -t afs "#grand.central.org:root.cell." /afs/grand.central.org/ +mount -t afs "#grand.central.org:root.archive." /afs/grand.central.org/archive +mount -t afs "#grand.central.org:root.contrib." /afs/grand.central.org/contrib +mount -t afs "#grand.central.org:root.doc." /afs/grand.central.org/doc +mount -t afs "#grand.central.org:root.project." /afs/grand.central.org/project +mount -t afs "#grand.central.org:root.service." /afs/grand.central.org/service +mount -t afs "#grand.central.org:root.software." /afs/grand.central.org/software +mount -t afs "#grand.central.org:root.user." 
/afs/grand.central.org/user + +umount /afs/grand.central.org/user +umount /afs/grand.central.org/software +umount /afs/grand.central.org/service +umount /afs/grand.central.org/project +umount /afs/grand.central.org/doc +umount /afs/grand.central.org/contrib +umount /afs/grand.central.org/archive +umount /afs/grand.central.org +umount /afs/cambridge.redhat.com +umount /afs +rmmod kafs +rmmod rxrpc diff --git a/Documentation/filesystems/automount-support.txt b/Documentation/filesystems/automount-support.txt new file mode 100644 index 00000000000..58c65a1713e --- /dev/null +++ b/Documentation/filesystems/automount-support.txt @@ -0,0 +1,118 @@ +Support is available for filesystems that wish to support automounting (such +as kAFS which can be found in fs/afs/). This facility includes allowing +in-kernel mounts to be performed and mountpoint degradation to be +requested. The latter can also be requested by userspace. + + +====================== +IN-KERNEL AUTOMOUNTING +====================== + +A filesystem can now mount another filesystem on one of its directories by the +following procedure: + + (1) Give the directory a follow_link() operation. + + When the directory is accessed, the follow_link op will be called, and + it will be provided with the location of the mountpoint in the nameidata + structure (vfsmount and dentry). + + (2) Have the follow_link() op do the following steps: + + (a) Call do_kern_mount() to call the appropriate filesystem to set up a + superblock and gain a vfsmount structure representing it. + + (b) Copy the nameidata provided as an argument and substitute the dentry + argument into the copy. + + (c) Call do_add_mount() to install the new vfsmount into the namespace's + mountpoint tree, thus making it accessible to userspace. Use the + nameidata set up in (b) as the destination. + + If the mountpoint will be automatically expired, then do_add_mount() + should also be given the location of an expiration list (see further + down). 
+ + (d) Release the path in the nameidata argument and substitute in the new + vfsmount and its root dentry. The ref counts on these will need + incrementing. + +Then from userspace, you can just do something like: + + [root@andromeda root]# mount -t afs \#root.afs. /afs + [root@andromeda root]# ls /afs + asd cambridge cambridge.redhat.com grand.central.org + [root@andromeda root]# ls /afs/cambridge + afsdoc + [root@andromeda root]# ls /afs/cambridge/afsdoc/ + ChangeLog html LICENSE pdf RELNOTES-1.2.2 + +And then if you look in the mountpoint catalogue, you'll see something like: + + [root@andromeda root]# cat /proc/mounts + ... + #root.afs. /afs afs rw 0 0 + #root.cell. /afs/cambridge.redhat.com afs rw 0 0 + #afsdoc. /afs/cambridge.redhat.com/afsdoc afs rw 0 0 + + +=========================== +AUTOMATIC MOUNTPOINT EXPIRY +=========================== + +Automatic expiration of mountpoints is easy, provided you've mounted the +mountpoint to be expired in the automounting procedure outlined above. + +To do expiration, you need to follow these steps: + + (3) Create at least one list off which the vfsmounts to be expired can be + hung. Access to this list will be governed by the vfsmount_lock. + + (4) In step (2c) above, the call to do_add_mount() should be provided with a + pointer to this list. It will hang the vfsmount off of it if it succeeds. + + (5) When you want mountpoints to be expired, call mark_mounts_for_expiry() + with a pointer to this list. This will process the list, marking every + vfsmount thereon for potential expiry on the next call. + + If a vfsmount was already flagged for expiry, and if its usage count is 1 + (it's only referenced by its parent vfsmount), then it will be deleted + from the namespace and thrown away (effectively unmounted). + + It may prove simplest to simply call this at regular intervals, using + some sort of timed event to drive it. + +The expiration flag is cleared by calls to mntput. 
This means that expiration +will only happen on the second expiration request after the last time the +mountpoint was accessed. + +If a mountpoint is moved, it gets removed from the expiration list. If a bind +mount is made on an expirable mount, the new vfsmount will not be on the +expiration list and will not expire. + +If a namespace is copied, all mountpoints contained therein will be copied, +and the copies of those that are on an expiration list will be added to the +same expiration list. + + +======================= +USERSPACE DRIVEN EXPIRY +======================= + +As an alternative, it is possible for userspace to request expiry of any +mountpoint (though some will be rejected - the current process's idea of the +rootfs for example). It does this by passing the MNT_EXPIRE flag to +umount(). This flag is considered incompatible with MNT_FORCE and MNT_DETACH. + +If the mountpoint in question is referenced by something other than +umount() or its parent mountpoint, an EBUSY error will be returned and the +mountpoint will not be marked for expiration or unmounted. + +If the mountpoint was not already marked for expiry at that time, an EAGAIN +error will be given and it won't be unmounted. + +Otherwise if it was already marked and it wasn't referenced, unmounting will +take place as usual. + +Again, the expiration flag is cleared every time anything other than umount() +looks at a mountpoint. diff --git a/Documentation/filesystems/befs.txt b/Documentation/filesystems/befs.txt new file mode 100644 index 00000000000..877a7b1d46e --- /dev/null +++ b/Documentation/filesystems/befs.txt @@ -0,0 +1,117 @@ +BeOS filesystem for Linux + +Document last updated: Dec 6, 2001 + +WARNING +======= +Make sure you understand that this is alpha software. This means that the +implementation is neither complete nor well-tested. + +I DISCLAIM ALL RESPONSIBILITY FOR ANY POSSIBLE BAD EFFECTS OF THIS CODE! 
+ +LICENSE +===== +This software is covered by the GNU General Public License. +See the file COPYING for the complete text of the license. +Or the GNU website: + +AUTHOR +===== +The largest part of the code was written by Will Dyson +He has been working on the code since Aug 13, 2001. See the changelog for +details. + +Original Author: Makoto Kato +His original code can still be found at: + +Does anyone know of a more current email address for Makoto? He doesn't +respond to the address given above... + +Current maintainer: Sergey S. Kostyliov + +WHAT IS THIS DRIVER? +================== +This module implements the native filesystem of BeOS +for the linux 2.4.1 and later kernels. Currently it is a read-only +implementation. + +Which is it, BFS or BEFS? +================ +Be, Inc said, "BeOS Filesystem is officially called BFS, not BeFS". +But Unixware Boot Filesystem is called bfs, too. And it is already in +the kernel. Because of this naming conflict, on Linux the BeOS +filesystem is called befs. + +HOW TO INSTALL +============== +step 1. Install the BeFS patch into the source code tree of linux. + +Apply the patchfile to your kernel source tree. +Assuming that your kernel source is in /foo/bar/linux and the patchfile +is called patch-befs-xxx, you would do the following: + + cd /foo/bar/linux + patch -p1 < /path/to/patch-befs-xxx + +if the patching step fails (i.e. there are rejected hunks), you can try to +figure it out yourself (it shouldn't be hard), or mail the maintainer +(Will Dyson) for help. + +step 2. Configuration & make kernel + +The linux kernel has many compile-time options. Most of them are beyond the +scope of this document. I suggest the Kernel-HOWTO document as a good general +reference on this topic. + +However, to use the BeFS module, you must enable it at configure time. 
+ + cd /foo/bar/linux + make menuconfig (or xconfig) + +The BeFS module is not a standard part of the linux kernel, so you must first +enable support for experimental code under the "Code maturity level" menu. + +Then, under the "Filesystems" menu will be an option called "BeFS +filesystem (experimental)", or something like that. Enable that option +(it is fine to make it a module). + +Save your kernel configuration and then build your kernel. + +step 3. Install + +See the kernel howto for +instructions on this critical step. + +USING BFS +========= +To use the BeOS filesystem, use filesystem type 'befs'. + +ex) + mount -t befs /dev/fd0 /beos + +MOUNT OPTIONS +============= +uid=nnn All files in the partition will be owned by user id nnn. +gid=nnn All files in the partition will be in group nnn. +iocharset=xxx Use xxx as the name of the NLS translation table. +debug The driver will output debugging information to the syslog. + +HOW TO GET LATEST VERSION +========================== + +The latest version is currently available at: + + +ANY KNOWN BUGS? +=========== +As of Jan 20, 2002: + + None + +SPECIAL THANKS +============== +Dominic Giampaolo ... Writing "Practical file system design with Be filesystem" +Hiroyuki Yamada ... Testing LinuxPPC. + + + diff --git a/Documentation/filesystems/bfs.txt b/Documentation/filesystems/bfs.txt new file mode 100644 index 00000000000..d2841e0bcf0 --- /dev/null +++ b/Documentation/filesystems/bfs.txt @@ -0,0 +1,57 @@ +BFS FILESYSTEM FOR LINUX +======================== + +The BFS filesystem is used by SCO UnixWare OS for the /stand slice, which +usually contains the kernel image and a few other files required for the +boot process. + +In order to access /stand partition under Linux you obviously need to +know the partition number and the kernel must support UnixWare disk slices +(CONFIG_UNIXWARE_DISKLABEL config option). 
However BFS support does not +depend on having UnixWare disklabel support because one can also mount +BFS filesystem via loopback: + +# losetup /dev/loop0 stand.img +# mount -t bfs /dev/loop0 /mnt/stand + +where stand.img is a file containing the image of BFS filesystem. +When you have finished using it and umounted you need to also deallocate +/dev/loop0 device by: + +# losetup -d /dev/loop0 + +You can simplify mounting by just typing: + +# mount -t bfs -o loop stand.img /mnt/stand + +this will allocate the first available loopback device (and load loop.o +kernel module if necessary) automatically. If the loopback driver is not +loaded automatically, make sure that your kernel is compiled with kmod +support (CONFIG_KMOD) enabled. Beware that umount will not +deallocate /dev/loopN device if /etc/mtab file on your system is a +symbolic link to /proc/mounts. You will need to do it manually using +"-d" switch of losetup(8). Read losetup(8) manpage for more info. + +To create the BFS image under UnixWare you need to find out first which +slice contains it. The command prtvtoc(1M) is your friend: + +# prtvtoc /dev/rdsk/c0b0t0d0s0 + +(assuming your root disk is on target=0, lun=0, bus=0, controller=0). Then you +look for the slice with tag "STAND", which is usually slice 10. With this +information you can use dd(1) to create the BFS image: + +# umount /stand +# dd if=/dev/rdsk/c0b0t0d0sa of=stand.img bs=512 + +Just in case, you can verify that you have done the right thing by checking +the magic number: + +# od -Ad -tx4 stand.img | more + +The first 4 bytes should be 0x1badface. + +If you have any patches, questions or suggestions regarding this BFS +implementation please contact the author: + +Tigran A. 
Aivazian diff --git a/Documentation/filesystems/cifs.txt b/Documentation/filesystems/cifs.txt new file mode 100644 index 00000000000..49cc923a93e --- /dev/null +++ b/Documentation/filesystems/cifs.txt @@ -0,0 +1,51 @@ + This is the client VFS module for the Common Internet File System + (CIFS) protocol which is the successor to the Server Message Block + (SMB) protocol, the native file sharing mechanism for most early + PC operating systems. CIFS is fully supported by current network + file servers such as Windows 2000, Windows 2003 (including + Windows XP) as well as by Samba (which provides excellent CIFS + server support for Linux and many other operating systems), so + this network filesystem client can mount to a wide variety of + servers. The smbfs module should be used instead of this cifs module + for mounting to older SMB servers such as OS/2. The smbfs and cifs + modules can coexist and do not conflict. The CIFS VFS filesystem + module is designed to work well with servers that implement the + newer versions (dialects) of the SMB/CIFS protocol such as Samba, + the program written by Andrew Tridgell that turns any Unix host + into a SMB/CIFS file server. + + The intent of this module is to provide the most advanced network + file system function for CIFS compliant servers, including better + POSIX compliance, secure per-user session establishment, high + performance safe distributed caching (oplock), optional packet + signing, large files, Unicode support and other internationalization + improvements. Since both Samba server and this filesystem client support + the CIFS Unix extensions, the combination can provide a reasonable + alternative to NFSv4 for fileserving in some Linux to Linux environments, + not just in Linux to Windows environments. 
+ + This filesystem has an optional mount utility (mount.cifs) that can + be obtained from the project page and installed in the path in the same + directory with the other mount helpers (such as mount.smbfs). 
+ Mounting using the cifs filesystem without installing the mount helper + requires specifying the server's IP address. + + For Linux 2.4: + mount //anything/here /mnt_target -o + user=username,pass=password,unc=//ip_address_of_server/sharename + + For Linux 2.5: + mount //ip_address_of_server/sharename /mnt_target -o user=username,pass=password + + + For more information on the module see the project page at + + http://us1.samba.org/samba/Linux_CIFS_client.html + + For more information on CIFS see: + + http://www.snia.org/tech_activities/CIFS + + or the Samba site: + + http://www.samba.org diff --git a/Documentation/filesystems/coda.txt b/Documentation/filesystems/coda.txt new file mode 100644 index 00000000000..61311356025 --- /dev/null +++ b/Documentation/filesystems/coda.txt @@ -0,0 +1,1673 @@ +NOTE: +This is one of the technical documents describing a component of +Coda -- this document describes the client kernel-Venus interface. + +For more information: + http://www.coda.cs.cmu.edu +For user level software needed to run Coda: + ftp://ftp.coda.cs.cmu.edu + +To run Coda you need to get a user level cache manager for the client, +named Venus, as well as tools to manipulate ACLs, to log in, etc. The +client needs to have the Coda filesystem selected in the kernel +configuration. + +The server needs a user level server and at present does not depend on +kernel support. + + + + + + + + The Venus kernel interface + Peter J. Braam + v1.0, Nov 9, 1997 + + This document describes the communication between Venus and kernel + level filesystem code needed for the operation of the Coda file + system. This document version is meant to describe the current interface + (version 1.0) as well as improvements we envisage. + ______________________________________________________________________ + + Table of Contents + + + 1. Introduction + + 2.
Servicing Coda filesystem calls + + 3. The message layer + + 3.1 Implementation details + + 4. The interface at the call level + + 4.1 Data structures shared by the kernel and Venus + 4.2 The pioctl interface + 4.3 root + 4.4 lookup + 4.5 getattr + 4.6 setattr + 4.7 access + 4.8 create + 4.9 mkdir + 4.10 link + 4.11 symlink + 4.12 remove + 4.13 rmdir + 4.14 readlink + 4.15 open + 4.16 close + 4.17 ioctl + 4.18 rename + 4.19 readdir + 4.20 vget + 4.21 fsync + 4.22 inactive + 4.23 rdwr + 4.24 odymount + 4.25 ody_lookup + 4.26 ody_expand + 4.27 prefetch + 4.28 signal + + 5. The minicache and downcalls + + 5.1 INVALIDATE + 5.2 FLUSH + 5.3 PURGEUSER + 5.4 ZAPFILE + 5.5 ZAPDIR + 5.6 ZAPVNODE + 5.7 PURGEFID + 5.8 REPLACE + + 6. Initialization and cleanup + + 6.1 Requirements + + + ______________________________________________________________________ + + + 1. Introduction + + + + A key component in the Coda Distributed File System is the cache + manager, Venus. + + + When processes on a Coda enabled system access files in the Coda + filesystem, requests are directed at the filesystem layer in the + operating system. The operating system will communicate with Venus to + service the request for the process. Venus manages a persistent + client cache and makes remote procedure calls to Coda file servers and + related servers (such as authentication servers) to service these + requests it receives from the operating system. When Venus has + serviced a request it replies to the operating system with appropriate + return codes, and other data related to the request. Optionally the + kernel support for Coda may maintain a minicache of recently processed + requests to limit the number of interactions with Venus. Venus + possesses the facility to inform the kernel when elements from its + minicache are no longer valid. + + This document describes precisely this communication between the + kernel and Venus.
The definitions of so-called upcalls and downcalls + will be given with the format of the data they handle. We shall also + describe the semantic invariants resulting from the calls. + + Historically Coda was implemented in a BSD file system in Mach 2.6. + The interface between the kernel and Venus is very similar to the BSD + VFS interface. Similar functionality is provided, and the format of + the parameters and returned data is very similar to the BSD VFS. This + leads to an almost natural environment for implementing a kernel-level + filesystem driver for Coda in a BSD system. However, other operating + systems such as Linux and Windows 95 and NT have virtual filesystems + with different interfaces. + + To implement Coda on these systems some reverse engineering of the + Venus/Kernel protocol is necessary. Also it came to light that other + systems could profit significantly from certain small optimizations + and modifications to the protocol. To facilitate this work as well as + to make future ports easier, communication between Venus and the + kernel should be documented in great detail. This is the aim of this + document. + + + + 2. Servicing Coda filesystem calls + + The service of a request for a Coda file system service originates in + a process P which is accessing a Coda file. It makes a system call which + traps to the OS kernel. Examples of such calls trapping to the kernel + are read, write, open, close, create, mkdir, rmdir, chmod in a Unix + context. Similar calls exist in the Win32 environment, and are named + CreateFile, . + + Generally the operating system handles the request in a virtual + filesystem (VFS) layer, which is named I/O Manager in NT and IFS + manager in Windows 95. The VFS is responsible for partial processing + of the request and for locating the specific filesystem(s) which will + service parts of the request.
Usually the information in the path + assists in locating the correct FS drivers. Sometimes after extensive + pre-processing, the VFS starts invoking exported routines in the FS + driver. This is the point where the FS specific processing of the + request starts, and here the Coda specific kernel code comes into + play. + + The FS layer for Coda must expose and implement several interfaces. + First and foremost the VFS must be able to make all necessary calls to + the Coda FS layer, so the Coda FS driver must expose the VFS interface + as applicable in the operating system. These differ very significantly + among operating systems, but share features such as facilities to + read/write and create and remove objects. The Coda FS layer services + such VFS requests by invoking one or more well defined services + offered by the cache manager Venus. When the replies from Venus have + come back to the FS driver, servicing of the VFS call continues and + finishes with a reply to the kernel's VFS. Finally the VFS layer + returns to the process. + + As a result of this design a basic interface exposed by the FS driver + must allow Venus to manage message traffic. In particular Venus must + be able to retrieve and place messages and to be notified of the + arrival of a new message. The notification must be through a mechanism + which does not block Venus since Venus must attend to other tasks even + when no messages are waiting or being processed. + + + + + + + Interfaces of the Coda FS Driver + + Furthermore the FS layer provides for a special path of communication + between a user process and Venus, called the pioctl interface. The + pioctl interface is used for Coda specific services, such as + requesting detailed information about the persistent cache managed by + Venus. Here the involvement of the kernel is minimal. It identifies + the calling process and passes the information on to Venus. 
When + Venus replies the response is passed back to the caller in unmodified + form. + + Finally Venus allows the kernel FS driver to cache the results from + certain services. This is done to avoid excessive context switches + and results in an efficient system. However, Venus may acquire + information, for example from the network, which implies that cached + information must be flushed or replaced. Venus then makes a downcall + to the Coda FS layer to request flushes or updates in the cache. The + kernel FS driver handles such requests synchronously. + + Among these interfaces the VFS interface and the facility to place, + receive and be notified of messages are platform specific. We will + not go into the calls exported to the VFS layer but we will state the + requirements of the message exchange mechanism. + + + + 3. The message layer + + + + At the lowest level the communication between Venus and the FS driver + proceeds through messages. The synchronization between processes + requesting Coda file service and Venus relies on blocking and waking + up processes. The Coda FS driver processes VFS- and pioctl-requests + on behalf of a process P, creates messages for Venus, awaits replies + and finally returns to the caller. The implementation of the exchange + of messages is platform specific, but the semantics have (so far) + appeared to be generally applicable. Data buffers are created by the + FS Driver in kernel memory on behalf of P and copied to user memory in + Venus. + + The FS Driver while servicing P makes upcalls to Venus. Such an + upcall is dispatched to Venus by creating a message structure. The + structure contains the identification of P, the message sequence + number, the size of the request and a pointer to the data in kernel + memory for the request. Since the data buffer is re-used to hold the + reply from Venus, there is a field for the size of the reply.
A flags + field is used in the message to precisely record the status of the + message. Additional platform dependent structures involve pointers to + determine the position of the message on queues and pointers to + synchronization objects. In the upcall routine the message structure + is filled in, flags are set to 0, and it is placed on the pending + queue. The routine calling upcall is responsible for allocating the + data buffer; its structure will be described in the next section. + + A facility must exist to notify Venus that the message has been + created; it is implemented using available synchronization objects in + the OS. This notification is done in the upcall context of the process + P. When the message is on the pending queue, process P cannot proceed + in upcall. The (kernel mode) processing of P in the filesystem + request routine must be suspended until Venus has replied. Therefore + the calling thread in P is blocked in upcall. A pointer in the + message structure will locate the synchronization object on which P is + sleeping. + + Venus detects the notification that a message has arrived, and the FS + driver allows Venus to retrieve the message with a getmsg_from_kernel + call. This action finishes in the kernel by putting the message on the + queue of processing messages and setting flags to READ. Venus is + passed the contents of the data buffer. The getmsg_from_kernel call + now returns and Venus processes the request. + + At some later point the FS driver receives a message from Venus, + namely when Venus calls sendmsg_to_kernel. At this moment the Coda FS + driver looks at the contents of the message and decides if: + + + o the message is a reply for a suspended thread P. If so it removes + the message from the processing queue and marks the message as + WRITTEN. Finally, the FS driver unblocks P (still in the kernel + mode context of Venus) and the sendmsg_to_kernel call returns to + Venus.
The process P will be scheduled at some point and continues + processing its upcall with the data buffer replaced with the reply + from Venus. + + o the message is a downcall. A downcall is a request from Venus to + the FS Driver. The FS driver processes the request immediately + (usually a cache eviction or replacement) and when it finishes + sendmsg_to_kernel returns. + + Now P awakes and continues processing upcall. There are some + subtleties to take account of. First P will determine if it was woken + up in upcall by a signal from some other source (for example an + attempt to terminate P) or as is normally the case by Venus in its + sendmsg_to_kernel call. In the normal case, the upcall routine will + deallocate the message structure and return. The FS routine can proceed + with its processing. + + + + + + + + Sleeping and IPC arrangements + + In case P is woken up by a signal and not by Venus, it will first look + at the flags field. If the message is not yet READ, the process P can + handle its signal without notifying Venus. If Venus has READ the + message and the request should not be processed, P can send Venus a + signal message to indicate that it should disregard the previous message. Such + signals are put in the queue at the head, and read first by Venus. If + the message is already marked as WRITTEN it is too late to stop the + processing. The VFS routine will now continue. (If a VFS request + involves more than one upcall, this can lead to complicated state; an + extra field "handle_signals" could be added in the message structure + to indicate points of no return have been passed.) + + + + 3.1. Implementation details + + The Unix implementation of this mechanism has been through the + implementation of a character device associated with Coda.
Venus + retrieves messages by doing a read on the device; replies are sent + with a write and notification is through the select system call on the + file descriptor for the device. The process P is kept waiting on an + interruptible wait queue object. + + In Windows NT and the DPMI Windows 95 implementation a DeviceIoControl + call is used. The DeviceIoControl call is designed to copy buffers + from user memory to kernel memory with OPCODES. The sendmsg_to_kernel + is issued as a synchronous call, while the getmsg_from_kernel call is + asynchronous. Windows EventObjects are used for notification of + message arrival. The process P is kept waiting on a KernelEvent + object in NT and a semaphore in Windows 95. + + + + 4. The interface at the call level + + + This section describes the upcalls a Coda FS driver can make to Venus. + Each of these upcalls makes use of two structures: inputArgs and + outputArgs. In pseudo BNF form the structures take the following + form: + + + struct inputArgs { + u_long opcode; + u_long unique; /* Keep multiple outstanding msgs distinct */ + u_short pid; /* Common to all */ + u_short pgid; /* Common to all */ + struct CodaCred cred; /* Common to all */ + + + }; + + struct outputArgs { + u_long opcode; + u_long unique; /* Keep multiple outstanding msgs distinct */ + u_long result; + + + }; + + + + Before going on let us elucidate the role of the various fields. The + inputArgs start with the opcode which defines the type of service + requested from Venus. There are approximately 30 upcalls at present + which we will discuss. The unique field labels the inputArg with a + unique number which will identify the message uniquely. A process and + process group id are passed. Finally the credentials of the caller + are included. + + Before delving into the specific calls we need to discuss a variety of + data structures shared by the kernel and Venus. + + + + + 4.1.
Data structures shared by the kernel and Venus + + + The CodaCred structure defines a variety of user and group ids as + they are set for the calling process. The vuid_t and vgid_t are 32 bit + unsigned integers. It also defines group membership in an array. On + Unix the CodaCred has proven sufficient to implement good security + semantics for Coda but the structure may have to undergo modification + for the Windows environments when these mature. + + struct CodaCred { + vuid_t cr_uid, cr_euid, cr_suid, cr_fsuid; /* Real, effective, set, fs uid*/ + vgid_t cr_gid, cr_egid, cr_sgid, cr_fsgid; /* same for groups */ + vgid_t cr_groups[NGROUPS]; /* Group membership for caller */ + }; + + + + NOTE It is questionable if we need CodaCreds in Venus. Finally Venus + doesn't know about groups, although it does create files with the + default uid/gid. Perhaps the list of group membership is superfluous. + + + The next item is the fundamental identifier used to identify Coda + files, the ViceFid. A fid of a file uniquely defines a file or + directory in the Coda filesystem within a cell. (A cell is a + group of Coda servers acting under the aegis of a single system + control machine or SCM. See the Coda Administration manual for a + detailed description of the role of the SCM.) + + + typedef struct ViceFid { + VolumeId Volume; + VnodeId Vnode; + Unique_t Unique; + } ViceFid; + + + + The constituent fields VolumeId, VnodeId and Unique_t are + unsigned 32 bit integers. We envisage that a further field will need + to be prefixed to identify the Coda cell; this will probably take the + form of an IPv6-size IP address naming the Coda cell through DNS. + + The next important structure shared between Venus and the kernel is + the attributes of the file. The following structure is used to + exchange information. It has room for future extensions such as + support for device files (currently not present in Coda).
+ + + struct coda_vattr { + enum coda_vtype va_type; /* vnode type (for create) */ + u_short va_mode; /* file's access mode and type */ + short va_nlink; /* number of references to file */ + vuid_t va_uid; /* owner user id */ + vgid_t va_gid; /* owner group id */ + long va_fsid; /* file system id (dev for now) */ + long va_fileid; /* file id */ + u_quad_t va_size; /* file size in bytes */ + long va_blocksize; /* blocksize preferred for i/o */ + struct timespec va_atime; /* time of last access */ + struct timespec va_mtime; /* time of last modification */ + struct timespec va_ctime; /* time file changed */ + u_long va_gen; /* generation number of file */ + u_long va_flags; /* flags defined for file */ + dev_t va_rdev; /* device special file represents */ + u_quad_t va_bytes; /* bytes of disk space held by file */ + u_quad_t va_filerev; /* file modification number */ + u_int va_vaflags; /* operations flags, see below */ + long va_spare; /* remain quad aligned */ + }; + + + + + 4.2. The pioctl interface + + + Coda specific requests can be made by applications through the pioctl + interface. The pioctl is implemented as an ordinary ioctl on a + fictitious file /coda/.CONTROL. The pioctl call opens this file, gets + a file handle and makes the ioctl call. Finally it closes the file. + + The kernel involvement in this is limited to providing the facility to + open and close and pass the ioctl message and to verify that a path in + the pioctl data buffers is a file in a Coda filesystem. + + The kernel is handed a data packet of the form: + + struct { + const char *path; + struct ViceIoctl vidata; + int follow; + } data; + + + + where + + + struct ViceIoctl { + caddr_t in, out; /* Data to be transferred in, or out */ + short in_size; /* Size of input buffer <= 2K */ + short out_size; /* Maximum size of output buffer, <= 2K */ + }; + + + + The path must be a Coda file, otherwise the ioctl upcall will not be + made.
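The open/ioctl/close sequence described for pioctl can be sketched in user space roughly as follows. This is a minimal illustration under stated assumptions, not the Coda library's actual wrapper: the names pioctl_sketch, vice_ioctl_sizes_ok, PioctlData and PIOCTL_MAX_BUF are hypothetical, the control file path and the ioctl command number are left to the caller, and caddr_t is spelled char * so the sketch compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical constant for the "<= 2K" limit in the ViceIoctl comments. */
#define PIOCTL_MAX_BUF 2048

/* Same fields as ViceIoctl above; caddr_t is written as char * here. */
struct ViceIoctl {
    char *in, *out;     /* Data to be transferred in, or out */
    short in_size;      /* Size of input buffer <= 2K */
    short out_size;     /* Maximum size of output buffer, <= 2K */
};

/* The data packet handed to the kernel; the tag PioctlData is hypothetical. */
struct PioctlData {
    const char *path;
    struct ViceIoctl vidata;
    int follow;
};

/* Returns 1 if both buffer sizes respect the 2K limit, 0 otherwise. */
int vice_ioctl_sizes_ok(const struct ViceIoctl *v)
{
    assert(v != NULL);
    return v->in_size >= 0 && v->in_size <= PIOCTL_MAX_BUF &&
           v->out_size >= 0 && v->out_size <= PIOCTL_MAX_BUF;
}

/*
 * Open the control file, issue the ioctl, close the file again.
 * Returns the ioctl result, or -1 with errno set if a step fails.
 */
int pioctl_sketch(const char *control_file, unsigned long cmd,
                  struct PioctlData *data)
{
    assert(control_file != NULL && data != NULL);

    int fd = open(control_file, O_RDONLY);
    if (fd < 0)
        return -1;              /* e.g. no Coda filesystem is mounted */

    int rc = ioctl(fd, cmd, data);
    int saved_errno = errno;    /* close() must not clobber the ioctl error */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

A caller would pass the control file named in the text (/coda/.CONTROL) together with a command number defined by the Coda headers; checking the sizes before the call avoids handing the kernel a buffer larger than the documented limit.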
+ + NOTE The data structures and code are a mess. We need to clean this + up. + + We now proceed to document the individual calls: + + + + 4.3. root + + + Arguments + + in empty + + out + + struct cfs_root_out { + ViceFid VFid; + } cfs_root; + + + + Description This call is made to Venus during the initialization of + the Coda filesystem. If the result is zero, the cfs_root structure + contains the ViceFid of the root of the Coda filesystem. If a non-zero + result is generated, its value is a platform dependent error code + indicating the difficulty Venus encountered in locating the root of + the Coda filesystem. + + + + 4.4. lookup + + + Summary Find the ViceFid and type of an object in a directory if it + exists. + + Arguments + + in + + struct cfs_lookup_in { + ViceFid VFid; + char *name; /* Place holder for data. */ + } cfs_lookup; + + + + out + + struct cfs_lookup_out { + ViceFid VFid; + int vtype; + } cfs_lookup; + + + + Description This call is made to determine the ViceFid and filetype of + a directory entry. The directory entry requested carries the name name + and Venus will search the directory identified by cfs_lookup_in.VFid. + The result may indicate that the name does not exist, or that + difficulty was encountered in finding it (e.g. due to disconnection). + If the result is zero, the field cfs_lookup_out.VFid contains the + target's ViceFid and cfs_lookup_out.vtype the coda_vtype giving the + type of object the name designates. + + The name of the object is an 8 bit character string of maximum length + CFS_MAXNAMLEN, currently set to 256 (including a 0 terminator). + + It is extremely important to realize that Venus bitwise ORs the field + cfs_lookup.vtype with CFS_NOCACHE to indicate that the object should + not be put in the kernel name cache. + + NOTE The type of the vtype is currently wrong. It should be + coda_vtype.
Linux does not take note of CFS_NOCACHE. It should. + + + + 4.5. getattr + + + Summary Get the attributes of a file. + + Arguments + + in + + struct cfs_getattr_in { + ViceFid VFid; + struct coda_vattr attr; /* XXXXX */ + } cfs_getattr; + + + + out + + struct cfs_getattr_out { + struct coda_vattr attr; + } cfs_getattr; + + + + Description This call returns the attributes of the file identified by + fid. + + Errors Errors can occur if the object with fid does not exist, is + inaccessible or if the caller does not have permission to fetch + attributes. + + Note Many kernel FS drivers (Linux, NT and Windows 95) need to acquire + the attributes as well as the Fid for the instantiation of an internal + "inode" or "FileHandle". A significant improvement in performance on + such systems could be made by combining the lookup and getattr calls + both at the Venus/kernel interaction level and at the RPC level. + + The vattr structure included in the input arguments is superfluous and + should be removed. + + + + 4.6. setattr + + + Summary Set the attributes of a file. + + Arguments + + in + + struct cfs_setattr_in { + ViceFid VFid; + struct coda_vattr attr; + } cfs_setattr; + + + + + out + empty + + Description The structure attr is filled with attributes to be changed + in BSD style. Attributes not to be changed are set to -1, apart from + vtype which is set to VNON. Others are set to the value to be assigned. + The only attributes which the FS driver may request to change are the + mode, owner, groupid, atime, mtime and ctime. The return value + indicates success or failure. + + Errors A variety of errors can occur. The object may not exist, may + be inaccessible, or permission may not be granted by Venus. + + + + 4.7.
access + + + Summary + + Arguments + + in + + struct cfs_access_in { + ViceFid VFid; + int flags; + } cfs_access; + + + + out + empty + + Description Verify if access to the object identified by VFid for + operations described by flags is permitted. The result indicates if + access will be granted. It is important to remember that Coda uses + ACLs to enforce protection and that ultimately the servers, not the + clients, enforce the security of the system. The result of this call + will depend on whether a token is held by the user. + + Errors The object may not exist, or the ACL describing the protection + may not be accessible. + + + + 4.8. create + + + Summary Invoked to create a file. + + Arguments + + in + + struct cfs_create_in { + ViceFid VFid; + struct coda_vattr attr; + int excl; + int mode; + char *name; /* Place holder for data. */ + } cfs_create; + + + + + out + + struct cfs_create_out { + ViceFid VFid; + struct coda_vattr attr; + } cfs_create; + + + + Description This upcall is invoked to request creation of a file. + The file will be created in the directory identified by VFid, its name + will be name, and the mode will be mode. If excl is set an error will + be returned if the file already exists. If the size field in attr is + set to zero the file will be truncated. The uid and gid of the file + are set by converting the CodaCred to a uid using a macro CRTOUID + (this macro is platform dependent). Upon success the VFid and + attributes of the file are returned. The Coda FS Driver will normally + instantiate a vnode, inode or file handle at kernel level for the new + object. + + + Errors A variety of errors can occur. Permissions may be insufficient. + If the object exists and is not a file the error EISDIR is returned + under Unix.
+ + NOTE The packing of parameters is very inefficient and appears to + indicate confusion between the system call creat and the VFS operation + create. The VFS operation create is only called to create new objects. + This create call differs from the Unix one in that it is not invoked + to return a file descriptor. The truncate and exclusive options, + together with the mode, could simply be part of the mode as it is + under Unix. There should be no flags argument; this is used in + open(2) to return a file descriptor for READ or WRITE mode. + + The attributes of the directory should be returned too, since the size + and mtime change. + + + + 4.9. mkdir + + + Summary Create a new directory. + + Arguments + + in + + struct cfs_mkdir_in { + ViceFid VFid; + struct coda_vattr attr; + char *name; /* Place holder for data. */ + } cfs_mkdir; + + + + out + + struct cfs_mkdir_out { + ViceFid VFid; + struct coda_vattr attr; + } cfs_mkdir; + + + + + Description This call is similar to create but creates a directory. + Only the mode field in the input parameters is used for creation. + Upon successful creation, the attr returned contains the attributes of + the new directory. + + Errors As for create. + + NOTE The input parameter should be changed to mode instead of + attributes. + + The attributes of the parent should be returned since the size and + mtime change. + + + + 4.10. link + + + Summary Create a link to an existing file. + + Arguments + + in + + struct cfs_link_in { + ViceFid sourceFid; /* cnode to link *to* */ + ViceFid destFid; /* Directory in which to place link */ + char *tname; /* Place holder for data. */ + } cfs_link; + + + + out + empty + + Description This call creates a link to the sourceFid in the directory + identified by destFid with name tname. The source must reside in the + target's parent, i.e.
the source must have parent destFid; Coda + does not support cross directory hard links. Only the return value is + relevant. It indicates success or the type of failure. + + Errors The usual errors can occur. + + 4.11. symlink + + + Summary Create a symbolic link. + + Arguments + + in + + struct cfs_symlink_in { + ViceFid VFid; /* Directory to put symlink in */ + char *srcname; + struct coda_vattr attr; + char *tname; + } cfs_symlink; + + + + out + none + + Description Create a symbolic link. The link is to be placed in the + directory identified by VFid and named tname. It should point to the + pathname srcname. The attributes of the newly created object are to + be set to attr. + + Errors + + NOTE The attributes of the target directory should be returned since + its size changed. + + + + 4.12. remove + + + Summary Remove a file. + + Arguments + + in + + struct cfs_remove_in { + ViceFid VFid; + char *name; /* Place holder for data. */ + } cfs_remove; + + + + out + none + + Description Remove the file named cfs_remove_in.name in the directory + identified by VFid. + + Errors + + NOTE The attributes of the directory should be returned since its + mtime and size may change. + + + + 4.13. rmdir + + + Summary Remove a directory. + + Arguments + + in + + struct cfs_rmdir_in { + ViceFid VFid; + char *name; /* Place holder for data. */ + } cfs_rmdir; + + + + out + none + + Description Remove the directory with name name from the directory + identified by VFid. + + Errors + + NOTE The attributes of the parent directory should be returned since + its mtime and size may change. + + + + 4.14. readlink + + + Summary Read the value of a symbolic link.
+ + Arguments + + in + + struct cfs_readlink_in { + ViceFid VFid; + } cfs_readlink; + + + + out + + struct cfs_readlink_out { + int count; + caddr_t data; /* Place holder for data. */ + } cfs_readlink; + + + + Description This routine reads the contents of the symbolic link + identified by VFid into the buffer data. The buffer data must be able + to hold any name up to CFS_MAXNAMLEN (PATH or NAM??). + + Errors No unusual errors. + + + + 4.15. open + + + Summary Open a file. + + Arguments + + in + + struct cfs_open_in { + ViceFid VFid; + int flags; + } cfs_open; + + + + out + + struct cfs_open_out { + dev_t dev; + ino_t inode; + } cfs_open; + + + + Description This request asks Venus to place the file identified by + VFid in its cache and to note that the calling process wishes to open + it with flags as in open(2). The return value to the kernel differs + for Unix and Windows systems. For Unix systems the Coda FS Driver is + informed of the device and inode number of the container file in the + fields dev and inode. For Windows the path of the container file is + returned to the kernel. + Errors + + NOTE Currently the cfs_open_out structure is not properly adapted to + deal with the Windows case. It might be best to implement two + upcalls, one to open aiming at a container file name, the other at a + container file inode. + + + + 4.16. close + + + Summary Close a file, update it on the servers. + + Arguments + + in + + struct cfs_close_in { + ViceFid VFid; + int flags; + } cfs_close; + + + + out + none + + Description Close the file identified by VFid. + + Errors + + NOTE The flags argument is bogus and not used. However, Venus' code + has room to deal with an execp input field; probably this field should + be used to inform Venus that the file was closed but is still memory + mapped for execution.
There are comments about fetching versus not + fetching the data in Venus vproc_vfscalls. This seems silly. If a + file is being closed, the data in the container file is to be the new + data. Here again the execp flag might be in play to create confusion: + currently Venus might think a file can be flushed from the cache when + it is still memory mapped. This needs to be understood. + + + + 4.17. ioctl + + + Summary Do an ioctl on a file. This includes the pioctl interface. + + Arguments + + in + + struct cfs_ioctl_in { + ViceFid VFid; + int cmd; + int len; + int rwflag; + char *data; /* Place holder for data. */ + } cfs_ioctl; + + + + out + + + struct cfs_ioctl_out { + int len; + caddr_t data; /* Place holder for data. */ + } cfs_ioctl; + + + + Description Do an ioctl operation on a file. The command, len and + data arguments are filled as usual. flags is not used by Venus. + + Errors + + NOTE Another bogus parameter. flags is not used. What is the + business about PREFETCHING in the Venus code? + + + + + 4.18. rename + + + Summary Rename a fid. + + Arguments + + in + + struct cfs_rename_in { + ViceFid sourceFid; + char *srcname; + ViceFid destFid; + char *destname; + } cfs_rename; + + + + out + none + + Description Rename the object with name srcname in directory + sourceFid to destname in destFid. It is important that the names + srcname and destname are 0 terminated strings. Strings in Unix + kernels are not always null terminated. + + Errors + + + + 4.19. readdir + + + Summary Read directory entries. + + Arguments + + in + + struct cfs_readdir_in { + ViceFid VFid; + int count; + int offset; + } cfs_readdir; + + + + + out + + struct cfs_readdir_out { + int size; + caddr_t data; /* Place holder for data.
*/ + } cfs_readdir; + + + + Description Read directory entries from VFid starting at offset and + read at most count bytes. Returns the data in data and returns + the size in size. + + Errors + + NOTE This call is not used. Readdir operations exploit container + files. We will re-evaluate this during the directory revamp which is + about to take place. + + 0wpage + + 4.20. vget + + + Summary instructs Venus to do an FSDB->Get. + + Arguments + + in + + struct cfs_vget_in { + ViceFid VFid; + } cfs_vget; + + + + out + + struct cfs_vget_out { + ViceFid VFid; + int vtype; + } cfs_vget; + + + + Description This upcall asks Venus to do a get operation on an fsobj + labelled by VFid. + + Errors + + NOTE This operation is not used. However, it is extremely useful + since it can be used to deal with read/write memory mapped files. + These can be "pinned" in the Venus cache using vget and released with + inactive. + + 0wpage + + 4.21. fsync + + + Summary Tell Venus to update the RVM attributes of a file. + + Arguments + + in + + struct cfs_fsync_in { + ViceFid VFid; + } cfs_fsync; + + + + out + none + + Description Ask Venus to update RVM attributes of object VFid. This + should be called as part of kernel level fsync type calls. The + result indicates if the syncing was successful. + + Errors + + NOTE Linux does not implement this call. It should. + + 0wpage + + 4.22. inactive + + + Summary Tell Venus a vnode is no longer in use. + + Arguments + + in + + struct cfs_inactive_in { + ViceFid VFid; + } cfs_inactive; + + + + out + none + + Description This operation returns EOPNOTSUPP. + + Errors + + NOTE This should perhaps be removed. + + 0wpage + + 4.23. 
rdwr + + + Summary Read or write from a file + + Arguments + + in + + struct cfs_rdwr_in { + ViceFid VFid; + int rwflag; + int count; + int offset; + int ioflag; + caddr_t data; /* Place holder for data. */ + } cfs_rdwr; + + + + + out + + struct cfs_rdwr_out { + int rwflag; + int count; + caddr_t data; /* Place holder for data. */ + } cfs_rdwr; + + + + Description This upcall asks Venus to read or write from a file. + + Errors + + NOTE It should be removed since it is against the Coda philosophy that + read/write operations never reach Venus. I have been told the + operation does not work. It is not currently used. + + + 0wpage + + 4.24. odymount + + + Summary Allows mounting multiple Coda "filesystems" on one Unix mount + point. + + Arguments + + in + + struct ody_mount_in { + char *name; /* Place holder for data. */ + } ody_mount; + + + + out + + struct ody_mount_out { + ViceFid VFid; + } ody_mount; + + + + Description Asks Venus to return the rootfid of a Coda system named + name. The fid is returned in VFid. + + Errors + + NOTE This call was used by David for dynamic sets. It should be + removed since it causes a jungle of pointers in the VFS mounting area. + It is not used by Coda proper. Call is not implemented by Venus. + + 0wpage + + 4.25. ody_lookup + + + Summary Looks up something. + + Arguments + + in irrelevant + + + out + irrelevant + + Description + + Errors + + NOTE Gut it. Call is not implemented by Venus. + + 0wpage + + 4.26. ody_expand + + + Summary expands something in a dynamic set. + + Arguments + + in irrelevant + + out + irrelevant + + Description + + Errors + + NOTE Gut it. Call is not implemented by Venus. + + 0wpage + + 4.27. prefetch + + + Summary Prefetch a dynamic set.
+ + Arguments + + in Not documented. + + out + Not documented. + + Description Venus worker.cc has support for this call, although it is + noted that it doesn't work. Not surprising, since the kernel does not + have support for it. (ODY_PREFETCH is not a defined operation). + + Errors + + NOTE Gut it. It isn't working and isn't used by Coda. + + + 0wpage + + 4.28. signal + + + Summary Send Venus a signal about an upcall. + + Arguments + + in none + + out + not applicable. + + Description This is an out-of-band upcall to Venus to inform Venus + that the calling process received a signal after Venus read the + message from the input queue. Venus is supposed to clean up the + operation. + + Errors No reply is given. + + NOTE We need to better understand what Venus needs to clean up and if + it is doing this correctly. Also we need to handle multiple upcall + per system call situations correctly. It would be important to know + what state changes in Venus take place after an upcall for which the + kernel is responsible for notifying Venus to clean up (e.g. open + definitely is such a state change, but many others are maybe not). + + 0wpage + + 5. The minicache and downcalls + + + The Coda FS Driver can cache results of lookup and access upcalls, to + limit the frequency of upcalls. Upcalls carry a price since a process + context switch needs to take place. The counterpart of caching the + information is that Venus will notify the FS Driver that cached + entries must be flushed or renamed. + + The kernel code generally has to maintain a structure which links the + internal file handles (called vnodes in BSD, inodes in Linux and + FileHandles in Windows) with the ViceFid's which Venus maintains. The + reason is that frequent translations back and forth are needed in + order to make upcalls and use the results of upcalls.
Such linking + objects are called cnodes. + + The current minicache implementations have cache entries which record + the following: + + 1. the name of the file + + 2. the cnode of the directory containing the object + + 3. a list of CodaCred's for which the lookup is permitted. + + 4. the cnode of the object + + The lookup call in the Coda FS Driver may request the cnode of the + desired object from the cache, by passing its name, directory and the + CodaCred's of the caller. The cache will return the cnode or indicate + that it cannot be found. The Coda FS Driver must be careful to + invalidate cache entries when it modifies or removes objects. + + When Venus obtains information that indicates that cache entries are + no longer valid, it will make a downcall to the kernel. Downcalls are + intercepted by the Coda FS Driver and lead to cache invalidations of + the kind described below. The Coda FS Driver does not return an error + unless the downcall data could not be read into kernel memory. + + + 5.1. INVALIDATE + + + No information is available on this call. + + + 5.2. FLUSH + + + + Arguments None + + Summary Flush the name cache entirely. + + Description Venus issues this call upon startup and when it dies. This + is to prevent stale cache information being held. Some operating + systems allow the kernel name cache to be switched off dynamically. + When this is done, this downcall is made. + + + 5.3. PURGEUSER + + + Arguments + + struct cfs_purgeuser_out {/* CFS_PURGEUSER is a venus->kernel call */ + struct CodaCred cred; + } cfs_purgeuser; + + + + Description Remove all entries in the cache carrying the Cred. This + call is issued when tokens for a user expire or are flushed. + + + 5.4. 
ZAPFILE + + + Arguments + + struct cfs_zapfile_out { /* CFS_ZAPFILE is a venus->kernel call */ + ViceFid CodaFid; + } cfs_zapfile; + + + + Description Remove all entries which have the (dir vnode, name) pair. + This is issued as a result of an invalidation of cached attributes of + a vnode. + + NOTE Call is not named correctly in NetBSD and Mach. The minicache + zapfile routine takes different arguments. Linux does not implement + the invalidation of attributes correctly. + + + + 5.5. ZAPDIR + + + Arguments + + struct cfs_zapdir_out { /* CFS_ZAPDIR is a venus->kernel call */ + ViceFid CodaFid; + } cfs_zapdir; + + + + Description Remove all entries in the cache lying in a directory + CodaFid, and all children of this directory. This call is issued when + Venus receives a callback on the directory. + + + 5.6. ZAPVNODE + + + + Arguments + + struct cfs_zapvnode_out { /* CFS_ZAPVNODE is a venus->kernel call */ + struct CodaCred cred; + ViceFid VFid; + } cfs_zapvnode; + + + + Description Remove all entries in the cache carrying the cred and VFid + as in the arguments. This downcall is probably never issued. + + + 5.7. PURGEFID + + + Summary + + Arguments + + struct cfs_purgefid_out { /* CFS_PURGEFID is a venus->kernel call */ + ViceFid CodaFid; + } cfs_purgefid; + + + + Description Flush the attribute for the file. If it is a dir (odd + vnode), purge its children from the namecache and remove the file from the + namecache. + + + + 5.8. REPLACE + + + Summary Replace the Fid's for a collection of names. + + Arguments + + struct cfs_replace_out { /* cfs_replace is a venus->kernel call */ + ViceFid NewFid; + ViceFid OldFid; + } cfs_replace; + + + + Description This routine replaces a ViceFid in the name cache with + another.
It is added to allow Venus during reintegration to replace + locally allocated temp fids while disconnected with global fids even + when the reference counts on those fids are not zero. + + 0wpage + + 6. Initialization and cleanup + + + This section gives brief hints as to desirable features for the Coda + FS Driver at startup and upon shutdown or Venus failures. Before + entering the discussion it is useful to repeat that the Coda FS Driver + maintains the following data: + + + 1. message queues + + 2. cnodes + + 3. name cache entries + + The name cache entries are entirely private to the driver, so they + can easily be manipulated. The message queues will generally have + clear points of initialization and destruction. The cnodes are + much more delicate. User processes hold reference counts in Coda + filesystems and it can be difficult to clean up the cnodes. + + It can expect requests through: + + 1. the message subsystem + + 2. the VFS layer + + 3. pioctl interface + + Currently the pioctl passes through the VFS for Coda so we can + treat these similarly. + + + 6.1. Requirements + + + The following requirements should be accommodated: + + 1. The message queues should have open and close routines. On Unix + the opening of the character devices are such routines. + + o Before opening, no messages can be placed. + + o Opening will remove any old messages still pending. + + o Close will notify any sleeping processes that their upcall cannot + be completed. + + o Close will free all memory allocated by the message queues. + + + 2. At open the namecache shall be initialized to empty state. + + 3. Before the message queues are open, all VFS operations will fail. + Fortunately this can be achieved by making sure that mounting the + Coda filesystem cannot succeed before opening. + + 4. After closing of the queues, no VFS operations can succeed.
Here + one needs to be careful, since a few operations (lookup, + read/write, readdir) can proceed without upcalls. These must be + explicitly blocked. + + 5. Upon closing the namecache shall be flushed and disabled. + + 6. All memory held by cnodes can be freed without relying on upcalls. + + 7. Unmounting the file system can be done without relying on upcalls. + + 8. Mounting the Coda filesystem should fail gracefully if Venus cannot + get the rootfid or the attributes of the rootfid. The latter is + best implemented by Venus fetching these objects before attempting + to mount. + + NOTE NetBSD in particular but also Linux have not implemented the + above requirements fully. For smooth operation this needs to be + corrected. + + + diff --git a/Documentation/filesystems/cramfs.txt b/Documentation/filesystems/cramfs.txt new file mode 100644 index 00000000000..31f53f0ab95 --- /dev/null +++ b/Documentation/filesystems/cramfs.txt @@ -0,0 +1,76 @@ + + Cramfs - cram a filesystem onto a small ROM + +cramfs is designed to be simple and small, and to compress things well. + +It uses the zlib routines to compress a file one page at a time, and +allows random page access. The meta-data is not compressed, but is +expressed in a very terse representation to make it use much less +diskspace than traditional filesystems. + +You can't write to a cramfs filesystem (making it compressible and +compact also makes it _very_ hard to update on-the-fly), so you have to +create the disk image with the "mkcramfs" utility. + + +Usage Notes +----------- + +File sizes are limited to less than 16MB. + +Maximum filesystem size is a little over 256MB. (The last file on the +filesystem is allowed to extend past 256MB.) + +Only the low 8 bits of gid are stored. The current version of +mkcramfs simply truncates to 8 bits, which is a potential security +issue. + +Hard links are supported, but hard linked files +will still have a link count of 1 in the cramfs image.
+ +Cramfs directories have no `.' or `..' entries. Directories (like +every other file on cramfs) always have a link count of 1. (There's +no need to use -noleaf in `find', btw.) + +No timestamps are stored in a cramfs, so these default to the epoch +(1970 GMT). Recently-accessed files may have updated timestamps, but +the update lasts only as long as the inode is cached in memory, after +which the timestamp reverts to 1970, i.e. moves backwards in time. + +Currently, cramfs must be written and read with architectures of the +same endianness, and can be read only by kernels with PAGE_CACHE_SIZE +== 4096. At least the latter of these is a bug, but it hasn't been +decided what the best fix is. For the moment if you have larger pages +you can just change the #define in mkcramfs.c, so long as you don't +mind the filesystem becoming unreadable to future kernels. + + +For /usr/share/magic +-------------------- + +0 ulelong 0x28cd3d45 Linux cramfs offset 0 +>4 ulelong x size %d +>8 ulelong x flags 0x%x +>12 ulelong x future 0x%x +>16 string >\0 signature "%.16s" +>32 ulelong x fsid.crc 0x%x +>36 ulelong x fsid.edition %d +>40 ulelong x fsid.blocks %d +>44 ulelong x fsid.files %d +>48 string >\0 name "%.16s" +512 ulelong 0x28cd3d45 Linux cramfs offset 512 +>516 ulelong x size %d +>520 ulelong x flags 0x%x +>524 ulelong x future 0x%x +>528 string >\0 signature "%.16s" +>544 ulelong x fsid.crc 0x%x +>548 ulelong x fsid.edition %d +>552 ulelong x fsid.blocks %d +>556 ulelong x fsid.files %d +>560 string >\0 name "%.16s" + + +Hacker Notes +------------ + +See fs/cramfs/README for filesystem layout and implementation notes. 
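The /usr/share/magic entries in cramfs.txt above amount to a small superblock layout, and reading it by hand can make the offsets easier to follow. What follows is a minimal Python sketch, not the kernel's or mkcramfs's code: it probes offsets 0 and 512 for the magic 0x28cd3d45 and unpacks the fields named in those entries. The names parse_cramfs_super, SUPER and PROBE_OFFSETS and the dictionary keys are hypothetical; only the offsets, field names and magic value come from the entries. Like the ulelong entries, it recognises little-endian images only.

```python
import struct

CRAMFS_MAGIC = 0x28CD3D45        # magic value from the entries above
PROBE_OFFSETS = (0, 512)         # the two offsets those entries test

# Little-endian, unpadded layout implied by the entries:
#   +0 magic, +4 size, +8 flags, +12 future      (four 32-bit words)
#   +16 signature                                (16 bytes)
#   +32 crc, +36 edition, +40 blocks, +44 files  (four 32-bit words)
#   +48 name                                     (16 bytes)
SUPER = struct.Struct("<4I16s4I16s")


def _text(raw: bytes) -> str:
    """Cut a fixed-width field at its first NUL and decode it leniently."""
    return raw.split(b"\0", 1)[0].decode("ascii", "replace")


def parse_cramfs_super(image: bytes):
    """Return the fields of the first superblock found, or None.

    Hypothetical helper for illustration only.
    """
    for base in PROBE_OFFSETS:
        chunk = image[base:base + SUPER.size]
        if len(chunk) < SUPER.size:
            continue  # image too short to hold a superblock here
        (magic, size, flags, future, signature,
         crc, edition, blocks, files, name) = SUPER.unpack(chunk)
        if magic != CRAMFS_MAGIC:
            continue
        return {
            "offset": base, "size": size, "flags": flags, "future": future,
            "signature": _text(signature),
            "fsid": {"crc": crc, "edition": edition,
                     "blocks": blocks, "files": files},
            "name": _text(name),
        }
    return None
```

Passing the first 576 bytes of an image is enough, since the second probe ends at byte 512 + 64. An image written on a big-endian machine would not match here, which is consistent with the endianness restriction noted in cramfs.txt.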
diff --git a/Documentation/filesystems/devfs/ChangeLog b/Documentation/filesystems/devfs/ChangeLog new file mode 100644 index 00000000000..e5aba5246d7 --- /dev/null +++ b/Documentation/filesystems/devfs/ChangeLog @@ -0,0 +1,1977 @@ +/* -*- auto-fill -*- */ +=============================================================================== +Changes for patch v1 + +- creation of devfs + +- modified miscellaneous character devices to support devfs +=============================================================================== +Changes for patch v2 + +- bug fix with manual inode creation +=============================================================================== +Changes for patch v3 + +- bugfixes + +- documentation improvements + +- created a couple of scripts (one to save&restore a devfs and the + other to set up compatibility symlinks) + +- devfs support for SCSI discs. New name format is: sd_hHcCiIlL +=============================================================================== +Changes for patch v4 + +- bugfix for the directory reading code + +- bugfix for compilation with kerneld + +- devfs support for generic hard discs + +- rationalisation of the various watchdog drivers +=============================================================================== +Changes for patch v5 + +- support for mounting directly from entries in the devfs (it doesn't + need to be mounted to do this), including the root filesystem. + Mounting of swap partitions also works. Hence, now if you set + CONFIG_DEVFS_ONLY to 'Y' then you won't be able to access your discs + via ordinary device nodes. Naturally, the default is 'N' so that you + can still use your old device nodes. If you want to mount from devfs + entries, make sure you use: append = "root=/dev/sd_..." in your + lilo.conf. It seems LILO looks for the device number (major&minor) + and writes that into the kernel image :-( + +- support for character memory devices (/dev/null, /dev/zero, /dev/full + and so on). Thanks to C. 
Scott Ananian +=============================================================================== +Changes for patch v6 + +- support for subdirectories + +- support for symbolic links (created by devfs_mk_symlink(), no + support yet for creation via symlink(2)) + +- SCSI disc naming now cast in stone, with the format: + /dev/sd/c0b1t2u3 controller=0, bus=1, ID=2, LUN=3, whole disc + /dev/sd/c0b1t2u3p4 controller=0, bus=1, ID=2, LUN=3, 4th partition + +- loop devices now appear in devfs + +- tty devices, console, serial ports, etc. now appear in devfs + Thanks to C. Scott Ananian + +- bugs with mounting devfs-only devices now fixed +=============================================================================== +Changes for patch v7 + +- SCSI CD-ROMS, tapes and generic devices now appear in devfs +=============================================================================== +Changes for patch v8 + +- bugfix with no-rewind SCSI tapes + +- RAMDISCs now appear in devfs + +- better cleaning up of devfs entries created by various modules + +- interface change to +=============================================================================== +Changes for patch v9 + +- the v8 patch was corrupted somehow, which would affect the patch for + linux/fs/filesystems.c + I've also fixed the v8 patch file on the WWW + +- MetaDevices (/dev/md*) should now appear in devfs +=============================================================================== +Changes for patch v10 + +- bugfix in meta device support for devfs + +- created this ChangeLog file + +- added devfs support to the floppy driver + +- added support for creating sockets in a devfs +=============================================================================== +Changes for patch v11 + +- added DEVFS_FL_HIDE_UNREG flag + +- incorporated better patch for ttyname() in libc 5.4.43 from H.J. Lu. 
+ +- interface change to + +- support for creating symlinks with symlink(2) + +- parallel port printer (/dev/lp*) now appears in devfs +=============================================================================== +Changes for patch v12 + +- added inode check to function + +- improved devfs support when mounting from devfs + +- added call to <> operation when removing swap areas on + devfs devices + +- increased NR_SUPER to 128 to support large numbers of devfs mounts + (for chroot(2) gaols) + +- fixed bug in SCSI disc support: was generating incorrect minors if + SCSI ID's did not start at 0 and increase by 1 + +- support symlink traversal when mounting root +=============================================================================== +Changes for patch v13 + +- added devfs support to soundcard driver + Thanks to Eric Dumas and + C. Scott Ananian + +- added devfs support to the joystick driver + +- loop driver now has its own subdirectory "/dev/loop/" + +- created and functions + +- fix problem with SCSI disc compatibility names (sd{a,b,c,d,e,f}) + which assumes ID's start at 0 and increase by 1. Also only create + devfs entries for SCSI disc partitions which actually exist + Show new names in partition check + Thanks to Jakub Jelinek +=============================================================================== +Changes for patch v14 + +- bug fix in floppy driver: would not compile without + CONFIG_DEVFS_FS='Y' + Thanks to Jurgen Botz + +- bug fix in loop driver + Thanks to C. Scott Ananian + +- do not create devfs entries for printers not configured + Thanks to C. Scott Ananian + +- do not create devfs entries for serial ports not present + Thanks to C. Scott Ananian + +- ensure is exported from tty_io.c + Thanks to C.
Scott Ananian + +- allow unregistering of devfs symlink entries + +- fixed bug in SCSI disc naming introduced in last patch version +=============================================================================== +Changes for patch v15 + +- ported to kernel 2.1.81 +=============================================================================== +Changes for patch v16 + +- created function + +- moved DEVFS_SUPER_MAGIC into header file + +- added DEVFS_FL_HIDE flag + +- created + +- created + +- fixed bugs in searching by major&minor + +- changed interface to , and + + +- fixed inode times when symlink created with symlink(2) + +- change tty driver to do auto-creation of devfs entries + Thanks to C. Scott Ananian + +- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to + devfs + +- updated libc 5.4.43 patch for ttyname() +=============================================================================== +Changes for patch v17 + +- added CONFIG_DEVFS_TTY_COMPAT + Thanks to C. Scott Ananian + +- bugfix in devfs support for drivers/char/lp.c + Thanks to C. Scott Ananian + +- clean up serial driver so that PCMCIA devices unregister correctly + Thanks to C. Scott Ananian + +- fixed bug in genhd.c: whole disc (non-SCSI) was not registered to + devfs [was missing in patch v16] + +- updated libc 5.4.43 patch for ttyname() [was missing in patch v16] + +- all SCSI devices now registered in /dev/sg + +- support removal of devfs entries via unlink(2) +=============================================================================== +Changes for patch v18 + +- added floppy/?u720 floppy entry + +- fixed kerneld support for entries in devfs subdirectories + +- incorporated latest patch for ttyname() in libc 5.4.43 from H.J. Lu. 
+=============================================================================== +Changes for patch v19 + +- bug fix when looking up unregistered entries: kerneld was not called + +- fixes for kernel 2.1.86 (now requires 2.1.86) +=============================================================================== +Changes for patch v20 + +- only create available floppy entries + Thanks to Andrzej Krzysztofowicz + +- new IDE naming scheme following SCSI format (i.e. /dev/id/c0b0t0u0p1 + instead of /dev/hda1) + Thanks to Andrzej Krzysztofowicz + +- new XT disc naming scheme following SCSI format (i.e. /dev/xd/c0t0p1 + instead of /dev/xda1) + Thanks to Andrzej Krzysztofowicz + +- new non-standard CD-ROM names (i.e. /dev/sbp/c#t#) + Thanks to Andrzej Krzysztofowicz + +- allow symlink traversal when mounting the root filesystem + +- Create entries for MD devices at MD init + Thanks to Christophe Leroy +=============================================================================== +Changes for patch v21 + +- ported to kernel 2.1.91 +=============================================================================== +Changes for patch v22 + +- SCSI host number patch ("scsihosts=" kernel option) + Thanks to Andrzej Krzysztofowicz +=============================================================================== +Changes for patch v23 + +- Fixed persistence bug with device numbers for manually created + device files + +- Fixed problem with recreating symlinks with different content + +- Added CONFIG_DEVFS_MOUNT (mount devfs on /dev at boot time) +=============================================================================== +Changes for patch v24 + +- Switched from CONFIG_KERNELD to CONFIG_KMOD: module autoloading + should now work again + +- Hide entries which are manually unlinked + +- Always invalidate devfs dentry cache when registering entries + +- Support removal of devfs directories via rmdir(2) + +- Ensure directories created by are visible + +- Default no access for "other" 
for floppy device +=============================================================================== +Changes for patch v25 + +- Updates to CREDITS file and minor IDE numbering change + Thanks to Andrzej Krzysztofowicz + +- Invalidate devfs dentry cache when making directories + +- Invalidate devfs dentry cache when removing entries + +- More informative message if root FS mount fails when devfs + configured + +- Fixed persistence bug with fifos +=============================================================================== +Changes for patch v26 + +- ported to kernel 2.1.97 + +- Changed serial directory from "/dev/serial" to "/dev/tts" and + "/dev/consoles" to "/dev/vc" to be more friendly to new procps +=============================================================================== +Changes for patch v27 + +- Added support for IDE4 and IDE5 + Thanks to Andrzej Krzysztofowicz + +- Documented "scsihosts=" boot parameter + +- Print process command when debugging kerneld/kmod + +- Added debugging for register/unregister/change operations + +- Added "devfs=" boot options + +- Hide unregistered entries by default +=============================================================================== +Changes for patch v28 + +- No longer lock/unlock superblock in (cope with + recent VFS interface change) + +- Do not automatically change ownership/protection of /dev/tty + +- Drop negative dentries when they are released + +- Manage dcache more efficiently +=============================================================================== +Changes for patch v29 + +- Added DEVFS_FL_AUTO_DEVNUM flag +=============================================================================== +Changes for patch v30 + +- No longer set unnecessary methods + +- Ported to kernel 2.1.99-pre3 +=============================================================================== +Changes for patch v31 + +- Added PID display to debugging message + +- Added "diread" and "diwrite" options + +- Ported to kernel 
2.1.102 + +- Fixed persistence problem with permissions +=============================================================================== +Changes for patch v32 + +- Fixed devfs support in drivers/block/md.c +=============================================================================== +Changes for patch v33 + +- Support legacy device nodes + +- Fixed bug where recreated inodes were hidden + +- New IDE naming scheme: everything is under /dev/ide +=============================================================================== +Changes for patch v34 + +- Improved debugging in + +- Prevent duplicate calls to in SCSI layer + +- No longer free old dentries in + +- Free all dentries for a given entry when deleting inodes +=============================================================================== +Changes for patch v35 + +- Ported to kernel 2.1.105 (sound driver changes) +=============================================================================== +Changes for patch v36 + +- Fixed sound driver port +=============================================================================== +Changes for patch v37 + +- Minor documentation tweaks +=============================================================================== +Changes for patch v38 + +- More documentation tweaks + +- Fix for sound driver port + +- Removed ttyname-patch (grab libc 5.4.44 instead) + +- Ported to kernel 2.1.107-pre2 (loop driver fix) +=============================================================================== +Changes for patch v39 + +- Ported to kernel 2.1.107 (hd.c hunk broke due to spelling "fixes"). 
Sigh + +- Removed many #ifdef's, replaced with trickery in include/devfs_fs.h +=============================================================================== +Changes for patch v40 + +- Fix for sound driver port + +- Limit auto-device numbering to majors 128 to 239 +=============================================================================== +Changes for patch v41 + +- Fixed inode times persistence problem +=============================================================================== +Changes for patch v42 + +- Ported to kernel 2.1.108 (drivers/scsi/hosts.c hunk broke) +=============================================================================== +Changes for patch v43 + +- Fixed spelling in debug + +- Fixed bug in parsing "dilookup" + +- More #ifdef's removed + +- Supported Sparc keyboard (/dev/kbd) + +- Supported DSP56001 digital signal processor (/dev/dsp56k) + +- Supported Apple Desktop Bus (/dev/adb) + +- Supported Coda network file system (/dev/cfs*) +=============================================================================== +Changes for patch v44 + +- Fixed devfs inode leak when manually recreating inodes + +- Fixed permission persistence problem when recreating inodes +=============================================================================== +Changes for patch v45 + +- Ported to kernel 2.1.110 +=============================================================================== +Changes for patch v46 + +- Ported to kernel 2.1.112-pre1 + +- Removed harmless "unused variable" compiler warning + +- Fixed modes for manually recreated device nodes +=============================================================================== +Changes for patch v47 + +- Added NULL devfs inode warning in + +- Force all inode nlink values to 1 +=============================================================================== +Changes for patch v48 + +- Added "dimknod" option + +- Set inode nlink to 0 when freeing dentries + +- Added support for virtual console capture 
devices (/dev/vcs*) + Thanks to Dennis Hou + +- Fixed modes for manually recreated symlinks +=============================================================================== +Changes for patch v49 + +- Ported to kernel 2.1.113 +=============================================================================== +Changes for patch v50 + +- Fixed bugs in recreated directories and symlinks +=============================================================================== +Changes for patch v51 + +- Improved robustness of rc.devfs script + Thanks to Roderich Schupp + +- Fixed bugs in recreated device nodes + +- Fixed bug in currently unused + +- Defined new type + +- Improved debugging when getting entries + +- Fixed bug where directories could be emptied + +- Ported to kernel 2.1.115 +=============================================================================== +Changes for patch v52 + +- Replaced dummy .epoch inode with .devfsd character device + +- Modified rc.devfs to take account of above change + +- Removed spurious driver warning messages when CONFIG_DEVFS_FS=n + +- Implemented devfsd protocol revision 0 +=============================================================================== +Changes for patch v53 + +- Ported to kernel 2.1.116 (kmod change broke hunk) + +- Updated Documentation/Configure.help + +- Test and tty pattern patch for rc.devfs script + Thanks to Roderich Schupp + +- Added soothing message to warning in +=============================================================================== +Changes for patch v54 + +- Ported to kernel 2.1.117 + +- Fixed default permissions in sound driver + +- Added support for frame buffer devices (/dev/fb*) +=============================================================================== +Changes for patch v55 + +- Ported to kernel 2.1.119 + +- Use GCC extensions for structure initialisations + +- Implemented async open notification + +- Incremented devfsd protocol revision to 1 
+=============================================================================== +Changes for patch v56 + +- Ported to kernel 2.1.120-pre3 + +- Moved async open notification to end of +=============================================================================== +Changes for patch v57 + +- Ported to kernel 2.1.121 + +- Prepended "/dev/" to module load request + +- Renamed to + +- Created sample modules.conf file +=============================================================================== +Changes for patch v58 + +- Fixed typo "AYSNC" -> "ASYNC" +=============================================================================== +Changes for patch v59 + +- Added open flag for files +=============================================================================== +Changes for patch v60 + +- Ported to kernel 2.1.123-pre2 +=============================================================================== +Changes for patch v61 + +- Set i_blocks=0 and i_blksize=1024 in +=============================================================================== +Changes for patch v62 + +- Ported to kernel 2.1.123 +=============================================================================== +Changes for patch v63 + +- Ported to kernel 2.1.124-pre2 +=============================================================================== +Changes for patch v64 + +- Fixed Unix98 pty support + +- Increased buffer size in to avoid crash and + burn +=============================================================================== +Changes for patch v65 + +- More Unix98 pty support fixes + +- Added test for empty <> in + +- Renamed to and published + +- Created /dev/root symlink + Thanks to Roderich Schupp + with further modifications by me +=============================================================================== +Changes for patch v66 + +- Yet more Unix98 pty support fixes (now tested) + +- Created + +- Support media change checks when CONFIG_DEVFS_ONLY=y + +- Abolished Unix98-style PTY names 
for old PTY devices +=============================================================================== +Changes for patch v67 + +- Added inline declaration for dummy + +- Removed spurious "unable to register... in devfs" messages when + CONFIG_DEVFS_FS=n + +- Fixed misc. devices when CONFIG_DEVFS_FS=n + +- Limit auto-device numbering to majors 144 to 239 +=============================================================================== +Changes for patch v68 + +- Hide unopened virtual consoles from directory listings + +- Added support for video capture devices + +- Ported to kernel 2.1.125 +=============================================================================== +Changes for patch v69 + +- Fix for CONFIG_VT=n +=============================================================================== +Changes for patch v70 + +- Added support for non-OSS/Free sound cards +=============================================================================== +Changes for patch v71 + +- Ported to kernel 2.1.126-pre2 +=============================================================================== +Changes for patch v72 + +- #ifdef's for CONFIG_DEVFS_DISABLE_OLD_NAMES removed +=============================================================================== +Changes for patch v73 + +- CONFIG_DEVFS_DISABLE_OLD_NAMES replaced with "nocompat" boot option + +- CONFIG_DEVFS_BOOT_OPTIONS removed: boot options always available +=============================================================================== +Changes for patch v74 + +- Removed CONFIG_DEVFS_MOUNT and "mount" boot option and replaced with + "nomount" boot option + +- Documentation updates + +- Updated sample modules.conf +=============================================================================== +Changes for patch v75 + +- Updated sample modules.conf + +- Remount devfs after initrd finishes + +- Ported to kernel 2.1.127 + +- Added support for ISDN + Thanks to Christophe Leroy 
+=============================================================================== +Changes for patch v76 + +- Updated an email address in ChangeLog + +- CONFIG_DEVFS_ONLY replaced with "only" boot option +=============================================================================== +Changes for patch v77 + +- Added DEVFS_FL_REMOVABLE flag + +- Check for disc change when listing directories with removable media + devices + +- Use DEVFS_FL_REMOVABLE in sd.c + +- Ported to kernel 2.1.128 +=============================================================================== +Changes for patch v78 + +- Only call on first call to + +- Ported to kernel 2.1.129-pre5 + +- ISDN support improvements + Thanks to Christophe Leroy +=============================================================================== +Changes for patch v79 + +- Ported to kernel 2.1.130 + +- Renamed miscdevice "apm" to "apm_bios" to be consistent with + devices.txt +=============================================================================== +Changes for patch v80 + +- Ported to kernel 2.1.131 + +- Updated for VFS change in 2.1.131 +=============================================================================== +Changes for patch v81 + +- Fixed permissions on /dev/ptmx +=============================================================================== +Changes for patch v82 + +- Ported to kernel 2.1.132-pre4 + +- Changed initial permissions on /dev/pts/* + +- Created + +- Added "symlinks" boot option + +- Changed devfs_register_blkdev() back to register_blkdev() for IDE + +- Check for partitions on removable media in +=============================================================================== +Changes for patch v83 + +- Fixed support for ramdisc when using string-based root FS name + +- Ported to kernel 2.2.0-pre1 +=============================================================================== +Changes for patch v84 + +- Ported to kernel 2.2.0-pre7 
+=============================================================================== +Changes for patch v85 + +- Compile fixes for driver/sound/sound_common.c (non-module) and + drivers/isdn/isdn_common.c + Thanks to Christophe Leroy + +- Added support for registering regular files + +- Created + +- Added /dev/cpu/mtrr as an alternative interface to /proc/mtrr + +- Update devfs inodes from entries if not changed through FS +=============================================================================== +Changes for patch v86 + +- Ported to kernel 2.2.0-pre9 +=============================================================================== +Changes for patch v87 + +- Fixed bug when mounting non-devfs devices in a devfs +=============================================================================== +Changes for patch v88 + +- Fixed to only initialise temporary inodes + +- Trap for NULL fops in + +- Return -ENODEV in for non-driver inodes + +- Fixed bug when unswapping non-devfs devices in a devfs +=============================================================================== +Changes for patch v89 + +- Switched to C data types in include/linux/devfs_fs.h + +- Switched from PATH_MAX to DEVFS_PATHLEN + +- Updated Documentation/filesystems/devfs/modules.conf to take account + of reverse scanning (!) by modprobe + +- Ported to kernel 2.2.0 +=============================================================================== +Changes for patch v90 + +- CONFIG_DEVFS_DISABLE_OLD_TTY_NAMES replaced with "nottycompat" boot + option + +- CONFIG_DEVFS_TTY_COMPAT removed: existing "symlinks" boot option now + controls this. 
This means you must have libc 5.4.44 or later, or a + recent version of libc 6 if you use the "symlinks" option +=============================================================================== +Changes for patch v91 + +- Switch from to in + drivers/char/vc_screen.c to fix problems with Midnight Commander +=============================================================================== +Changes for patch v92 + +- Ported to kernel 2.2.2-pre5 +=============================================================================== +Changes for patch v93 + +- Modified in drivers/scsi/sd.c to cope with devices that + don't exist (which happens with new RAID autostart code printk()s) +=============================================================================== +Changes for patch v94 + +- Fixed bug in joystick driver: only first joystick was registered +=============================================================================== +Changes for patch v95 + +- Fixed another bug in joystick driver + +- Fixed to not overrun event buffer +=============================================================================== +Changes for patch v96 + +- Ported to kernel 2.2.5-2 + +- Created + +- Fixed bugs: compatibility entries were not unregistered for: + loop driver + floppy driver + RAMDISC driver + IDE tape driver + SCSI CD-ROM driver + SCSI HDD driver +=============================================================================== +Changes for patch v97 + +- Fixed bugs: compatibility entries were not unregistered for: + ALSA sound driver + partitions in generic disc driver + +- Don't return unregistered entries in + +- Panic in if entry unregistered + +- Don't panic in for duplicates +=============================================================================== +Changes for patch v98 + +- Don't unregister already unregistered entries in + +- Register entry in + +- Unregister entry in + +- Changed to in drivers/char/tty_io.c + +- Ported to kernel 2.2.7 
+=============================================================================== +Changes for patch v99 + +- Ported to kernel 2.2.8 + +- Fixed bug in drivers/scsi/sd.c when >16 SCSI discs + +- Disable warning messages when unable to read partition table for + removable media +=============================================================================== +Changes for patch v100 + +- Ported to kernel 2.3.1-pre5 + +- Added "oops-on-panic" boot option + +- Improved debugging in and + +- Register entry in + +- Unregister entry in + +- Register entry in + +- Unregister entry in + +- Added support for ALSA drivers +=============================================================================== +Changes for patch v101 + +- Ported to kernel 2.3.2 +=============================================================================== +Changes for patch v102 + +- Update serial driver to register PCMCIA entries + Thanks to Roch-Alexandre Nomine-Beguin + +- Updated an email address in ChangeLog + +- Hide virtual console capture entries from directory listings when + corresponding console device is not open +=============================================================================== +Changes for patch v103 + +- Ported to kernel 2.3.3 +=============================================================================== +Changes for patch v104 + +- Added documentation for some functions + +- Added "doc" target to fs/devfs/Makefile + +- Added "v4l" directory for video4linux devices + +- Replaced call to in with call to + + +- Moved registration for sr and sg drivers from detect() to attach() + methods + +- Register entries in and unregister in + +- Work around IDE driver treating CD-ROM as gendisk + +- Use instead of in rc.devfs + +- Updated ToDo list + +- Removed "oops-on-panic" boot option: now always Oops +=============================================================================== +Changes for patch v105 + +- Unregister SCSI host from in + Thanks to Zoltán Böszörményi + +- Don't 
save /dev/log in rc.devfs + +- Ported to kernel 2.3.4-pre1 +=============================================================================== +Changes for patch v106 + +- Fixed silly typo in drivers/scsi/st.c + +- Improved debugging in +=============================================================================== +Changes for patch v107 + +- Added "diunlink" and "nokmod" boot options + +- Removed superfluous warning message in +=============================================================================== +Changes for patch v108 + +- Remove entries when unloading sound module +=============================================================================== +Changes for patch v109 + +- Ported to kernel 2.3.6-pre2 +=============================================================================== +Changes for patch v110 + +- Took account of change to +=============================================================================== +Changes for patch v111 + +- Created separate event queue for each mounted devfs + +- Removed + +- Created new ioctl()s for devfsd + +- Incremented devfsd protocol revision to 3 + +- Fixed bug when re-creating directories: contents were lost + +- Block access to inodes until devfsd updates permissions +=============================================================================== +Changes for patch v112 + +- Modified patch so it applies against 2.3.5 and 2.3.6 + +- Updated an email address in ChangeLog + +- Do not automatically change ownership/protection of /dev/tty + +- Updated sample modules.conf + +- Switched to sending process uid/gid to devfsd + +- Renamed to + +- Added DEVFSD_NOTIFY_LOOKUP event + +- Added DEVFSD_NOTIFY_CHANGE event + +- Added DEVFSD_NOTIFY_CREATE event + +- Incremented devfsd protocol revision to 4 + +- Moved kernel-specific stuff to include/linux/devfs_fs_kernel.h +=============================================================================== +Changes for patch v113 + +- Ported to kernel 2.3.9 + +- Restricted 
permissions on some block devices +=============================================================================== +Changes for patch v114 + +- Added support for /dev/netlink + Thanks to Dennis Hou + +- Return EISDIR rather than EINVAL for read(2) on directories + +- Ported to kernel 2.3.10 +=============================================================================== +Changes for patch v115 + +- Added support for all remaining character devices + Thanks to Dennis Hou + +- Cleaned up netlink support +=============================================================================== +Changes for patch v116 + +- Added support for /dev/parport%d + Thanks to Tim Waugh + +- Fixed parallel port ATAPI tape driver + +- Fixed Atari SLM laser printer driver +=============================================================================== +Changes for patch v117 + +- Added support for COSA card + Thanks to Dennis Hou + +- Fixed drivers/char/ppdev.c: missing #include + +- Fixed drivers/char/ftape/zftape/zftape-init.c + Thanks to Vladimir Popov +=============================================================================== +Changes for patch v118 + +- Ported to kernel 2.3.15-pre3 + +- Fixed bug in loop driver + +- Unregister /dev/lp%d entries in drivers/char/lp.c + Thanks to Maciej W. 
Rozycki +=============================================================================== +Changes for patch v119 + +- Ported to kernel 2.3.16 +=============================================================================== +Changes for patch v120 + +- Fixed bug in drivers/scsi/scsi.c + +- Added /dev/ppp + Thanks to Dennis Hou + +- Ported to kernel 2.3.17 +=============================================================================== +Changes for patch v121 + +- Fixed bug in drivers/block/loop.c + +- Ported to kernel 2.3.18 +=============================================================================== +Changes for patch v122 + +- Ported to kernel 2.3.19 +=============================================================================== +Changes for patch v123 + +- Ported to kernel 2.3.20 +=============================================================================== +Changes for patch v124 + +- Ported to kernel 2.3.21 +=============================================================================== +Changes for patch v125 + +- Created , , + and + Added <> parameter to , , + and + Work sponsored by SGI + +- Fixed apparent bug in COSA driver + +- Re-instated "scsihosts=" boot option +=============================================================================== +Changes for patch v126 + +- Always create /dev/pts if CONFIG_UNIX98_PTYS=y + +- Fixed call to in drivers/block/ide-disk.c + Thanks to Dennis Hou + +- Allow multiple unregistrations + +- Created /dev/scsi hierarchy + Work sponsored by SGI +=============================================================================== +Changes for patch v127 + +Work sponsored by SGI + +- No longer disable devpts if devfs enabled (caveat emptor) + +- Added flags array to struct gendisk and removed code from + drivers/scsi/sd.c + +- Created /dev/discs hierarchy +=============================================================================== +Changes for patch v128 + +Work sponsored by SGI + +- Created /dev/cdroms hierarchy 
+=============================================================================== +Changes for patch v129 + +Work sponsored by SGI + +- Removed compatibility entries for sound devices + +- Removed compatibility entries for printer devices + +- Removed compatibility entries for video4linux devices + +- Removed compatibility entries for parallel port devices + +- Removed compatibility entries for frame buffer devices +=============================================================================== +Changes for patch v130 + +Work sponsored by SGI + +- Added major and minor number to devfsd protocol + +- Incremented devfsd protocol revision to 5 + +- Removed compatibility entries for SoundBlaster CD-ROMs + +- Removed compatibility entries for netlink devices + +- Removed compatibility entries for SCSI generic devices + +- Removed compatibility entries for SCSI tape devices +=============================================================================== +Changes for patch v131 + +Work sponsored by SGI + +- Support info pointer for all devfs entry types + +- Added <> parameter to and + +- Removed /dev/st hierarchy + +- Removed /dev/sg hierarchy + +- Removed compatibility entries for loop devices + +- Removed compatibility entries for IDE tape devices + +- Removed compatibility entries for SCSI CD-ROMs + +- Removed /dev/sr hierarchy +=============================================================================== +Changes for patch v132 + +Work sponsored by SGI + +- Removed compatibility entries for floppy devices + +- Removed compatibility entries for RAMDISCs + +- Removed compatibility entries for meta-devices + +- Removed compatibility entries for SCSI discs + +- Created + +- Removed /dev/sd hierarchy + +- Support "../" when searching devfs namespace + +- Created /dev/ide/host* hierarchy + +- Supported IDE hard discs in /dev/ide/host* hierarchy + +- Removed compatibility entries for IDE discs + +- Removed /dev/ide/hd hierarchy + +- Supported IDE CD-ROMs in /dev/ide/host* 
hierarchy + +- Removed compatibility entries for IDE CD-ROMs + +- Removed /dev/ide/cd hierarchy +=============================================================================== +Changes for patch v133 + +Work sponsored by SGI + +- Created + +- Fixed bug in fs/partitions/check.c when rescanning +=============================================================================== +Changes for patch v134 + +Work sponsored by SGI + +- Removed /dev/sd, /dev/sr, /dev/st and /dev/sg directories + +- Removed /dev/ide/hd directory + +- Exported + +- Created and /dev/tapes hierarchy + +- Removed /dev/ide/mt hierarchy + +- Removed /dev/ide/fd hierarchy + +- Ported to kernel 2.3.25 +=============================================================================== +Changes for patch v135 + +Work sponsored by SGI + +- Removed compatibility entries for virtual console capture devices + +- Removed unused + +- Removed compatibility entries for serial devices + +- Removed compatibility entries for console devices + +- Do not hide entries from devfsd or children + +- Removed DEVFS_FL_TTY_COMPAT flag + +- Removed "nottycompat" boot option + +- Removed +=============================================================================== +Changes for patch v136 + +Work sponsored by SGI + +- Moved BSD pty devices to /dev/pty + +- Added DEVFS_FL_WAIT flag +=============================================================================== +Changes for patch v137 + +Work sponsored by SGI + +- Really fixed bug in fs/partitions/check.c when rescanning + +- Support new "disc" naming scheme in + +- Allow NULL fops in + +- Removed redundant name functions in SCSI disc and IDE drivers +=============================================================================== +Changes for patch v138 + +Work sponsored by SGI + +- Fixed old bugs in drivers/block/paride/pt.c, drivers/char/tpqic02.c, + drivers/net/wan/cosa.c and drivers/scsi/scsi.c + Thanks to Sergey Kubushin + +- Fall back to major table if NULL fops given to 
+=============================================================================== +Changes for patch v139 + +Work sponsored by SGI + +- Corrected and moved and declarations + from arch/alpha/kernel/osf_sys.c to include/linux/fs.h + +- Removed name function from struct gendisk + +- Updated devfs FAQ +=============================================================================== +Changes for patch v140 + +Work sponsored by SGI + +- Ported to kernel 2.3.27 +=============================================================================== +Changes for patch v141 + +Work sponsored by SGI + +- Bug fix in arch/m68k/atari/joystick.c + +- Moved ISDN and capi devices to /dev/isdn +=============================================================================== +Changes for patch v142 + +Work sponsored by SGI + +- Bug fix in drivers/block/ide-probe.c (patch confusion) +=============================================================================== +Changes for patch v143 + +Work sponsored by SGI + +- Bug fix in drivers/block/blkpg.c:partition_name() +=============================================================================== +Changes for patch v144 + +Work sponsored by SGI + +- Ported to kernel 2.3.29 + +- Removed calls to from cdu31a, cm206, mcd and mcdx + CD-ROM drivers: generic driver handles this now + +- Moved joystick devices to /dev/joysticks +=============================================================================== +Changes for patch v145 + +Work sponsored by SGI + +- Ported to kernel 2.3.30-pre3 + +- Register whole-disc entry even for invalid partition tables + +- Fixed bug in mounting root FS when initrd enabled + +- Fixed device entry leak with IDE CD-ROMs + +- Fixed compile problem with drivers/isdn/isdn_common.c + +- Moved COSA devices to /dev/cosa + +- Support fifos when unregistering + +- Created and used in many drivers + +- Moved Coda devices to /dev/coda + +- Moved parallel port IDE tapes to /dev/pt + +- Moved parallel port IDE generic devices to 
/dev/pg +=============================================================================== +Changes for patch v146 + +Work sponsored by SGI + +- Removed obsolete DEVFS_FL_COMPAT and DEVFS_FL_TOLERANT flags + +- Fixed compile problem with fs/coda/psdev.c + +- Reinstate change to in + drivers/block/ide-probe.c now that fs/isofs/inode.c is fixed + +- Switched to in drivers/block/floppy.c, + drivers/scsi/sr.c and drivers/block/md.c + +- Moved DAC960 devices to /dev/dac960 +=============================================================================== +Changes for patch v147 + +Work sponsored by SGI + +- Ported to kernel 2.3.32-pre4 +=============================================================================== +Changes for patch v148 + +Work sponsored by SGI + +- Removed kmod support: use devfsd instead + +- Moved miscellaneous character devices to /dev/misc +=============================================================================== +Changes for patch v149 + +Work sponsored by SGI + +- Ensure include/linux/joystick.h is OK for user-space + +- Improved debugging in + +- Ensure dentries created by devfsd will be cleaned up +=============================================================================== +Changes for patch v150 + +Work sponsored by SGI + +- Ported to kernel 2.3.34 +=============================================================================== +Changes for patch v151 + +Work sponsored by SGI + +- Ported to kernel 2.3.35-pre1 + +- Created +=============================================================================== +Changes for patch v152 + +Work sponsored by SGI + +- Updated sample modules.conf + +- Ported to kernel 2.3.36-pre1 +=============================================================================== +Changes for patch v153 + +Work sponsored by SGI + +- Ported to kernel 2.3.42 + +- Removed +=============================================================================== +Changes for patch v154 + +Work sponsored by SGI + +- Took account of 
device number changes for /dev/fb* +=============================================================================== +Changes for patch v155 + +Work sponsored by SGI + +- Ported to kernel 2.3.43-pre8 + +- Moved /dev/tty0 to /dev/vc/0 + +- Moved sequence number formatting from <_tty_make_name> to drivers +=============================================================================== +Changes for patch v156 + +Work sponsored by SGI + +- Fixed breakage in drivers/scsi/sd.c due to recent SCSI changes +=============================================================================== +Changes for patch v157 + +Work sponsored by SGI + +- Ported to kernel 2.3.45 +=============================================================================== +Changes for patch v158 + +Work sponsored by SGI + +- Ported to kernel 2.3.46-pre2 +=============================================================================== +Changes for patch v159 + +Work sponsored by SGI + +- Fixed drivers/block/md.c + Thanks to Mike Galbraith + +- Documentation fixes + +- Moved device registration from to + Thanks to Tim Waugh +=============================================================================== +Changes for patch v160 + +Work sponsored by SGI + +- Fixed drivers/char/joystick/joystick.c + Thanks to Vojtech Pavlik + +- Documentation updates + +- Fixed arch/i386/kernel/mtrr.c if procfs and devfs not enabled + +- Fixed drivers/char/stallion.c +=============================================================================== +Changes for patch v161 + +Work sponsored by SGI + +- Remove /dev/ide when ide-mod is unloaded + +- Fixed bug in drivers/block/ide-probe.c when secondary but no primary + +- Added DEVFS_FL_NO_PERSISTENCE flag + +- Used new DEVFS_FL_NO_PERSISTENCE flag for Unix98 pty slaves + +- Removed unnecessary call to in + + +- Only set auto-ownership for /dev/pty/s* +=============================================================================== +Changes for patch v162 + +Work sponsored by SGI + 
+- Set inode->i_size to correct size for symlinks + Thanks to Jeremy Fitzhardinge + +- Only give lookup() method to directories to comply with new VFS + assumptions + +- Remove unnecessary tests in symlink methods + +- Don't kill existing block ops in + +- Restore auto-ownership for /dev/pty/m* +=============================================================================== +Changes for patch v163 + +Work sponsored by SGI + +- Don't create missing directories in + +- Removed Documentation/filesystems/devfs/mk-devlinks + +- Updated Documentation/filesystems/devfs/README +=============================================================================== +Changes for patch v164 + +Work sponsored by SGI + +- Fixed CONFIG_DEVFS breakage in drivers/char/serial.c introduced in + linux-2.3.99-pre6-7 +=============================================================================== +Changes for patch v165 + +Work sponsored by SGI + +- Ported to kernel 2.3.99-pre6 +=============================================================================== +Changes for patch v166 + +Work sponsored by SGI + +- Added CONFIG_DEVFS_MOUNT +=============================================================================== +Changes for patch v167 + +Work sponsored by SGI + +- Updated Documentation/filesystems/devfs/README + +- Updated sample modules.conf +=============================================================================== +Changes for patch v168 + +Work sponsored by SGI + +- Disabled multi-mount capability (use VFS bindings instead) + +- Updated README from master HTML file +=============================================================================== +Changes for patch v169 + +Work sponsored by SGI + +- Removed multi-mount code + +- Removed compatibility macros: VFS has changed too much +=============================================================================== +Changes for patch v170 + +Work sponsored by SGI + +- Updated README from master HTML file + +- Merged devfs inode into 
devfs entry +=============================================================================== +Changes for patch v171 + +Work sponsored by SGI + +- Updated sample modules.conf + +- Removed dead code in which used to call + + +- Ported to kernel 2.4.0-test2-pre3 +=============================================================================== +Changes for patch v172 + +Work sponsored by SGI + +- Changed interface to + +- Changed interface to +=============================================================================== +Changes for patch v173 + +Work sponsored by SGI + +- Simplified interface to + +- Simplified interface to + +- Simplified interface to +=============================================================================== +Changes for patch v174 + +Work sponsored by SGI + +- Updated README from master HTML file +=============================================================================== +Changes for patch v175 + +Work sponsored by SGI + +- DocBook update for fs/devfs/base.c + Thanks to Tim Waugh + +- Removed stale fs/tunnel.c (was never used or completed) +=============================================================================== +Changes for patch v176 + +Work sponsored by SGI + +- Updated ToDo list + +- Removed sample modules.conf: now distributed with devfsd + +- Updated README from master HTML file + +- Ported to kernel 2.4.0-test3-pre4 (which had devfs-patch-v174) +=============================================================================== +Changes for patch v177 + +- Updated README from master HTML file + +- Documentation cleanups + +- Ensure terminates string for root entry + Thanks to Tim Jansen + +- Exported to modules + +- Make send events to devfsd + +- Cleaned up option processing in + +- Fixed bugs in handling symlinks: could leak or cause Oops + +- Cleaned up directory handling by separating fops + Thanks to Alexander Viro +=============================================================================== +Changes for patch v178 + 
+- Fixed handling of inverted options in +=============================================================================== +Changes for patch v179 + +- Adjusted to account for fix +=============================================================================== +Changes for patch v180 + +- Fixed !CONFIG_DEVFS_FS stub declaration of +=============================================================================== +Changes for patch v181 + +- Answered question posed by Al Viro and removed his comments from + +- Moved setting of registered flag after other fields are changed + +- Fixed race between and + +- Global VFS changes added bogus BKL to devfsd_close(): removed + +- Widened locking in and + +- Replaced stack usage with kmalloc + +- Simplified locking in and fixed memory leak +=============================================================================== +Changes for patch v182 + +- Created and + +- Removed broken devnum allocation and use + +- Fixed old devnum leak by calling new + +- Created + +- Fixed number leak for /dev/cdroms/cdrom%d + +- Fixed number leak for /dev/discs/disc%d +=============================================================================== +Changes for patch v183 + +- Fixed bug in which could hang boot process +=============================================================================== +Changes for patch v184 + +- Documentation typo fix for fs/devfs/util.c + +- Fixed drivers/char/stallion.c for devfs + +- Added DEVFSD_NOTIFY_DELETE event + +- Updated README from master HTML file + +- Removed #include from fs/devfs/base.c +=============================================================================== +Changes for patch v185 + +- Made and in fs/devfs/util.c + private + +- Fixed inode table races by removing it and using inode->u.generic_ip + instead + +- Moved into + +- Moved into +=============================================================================== +Changes for patch v186 + +- Fixed race in for uni-processor + +- Updated 
README from master HTML file +=============================================================================== +Changes for patch v187 + +- Fixed drivers/char/stallion.c for devfs + +- Fixed drivers/char/rocket.c for devfs + +- Fixed bug in : limited to 128 numbers +=============================================================================== +Changes for patch v188 + +- Updated major masks in fs/devfs/util.c up to Linus' "no new majors" + proclamation. Block: were 126 now 122 free, char: were 26 now 19 free + +- Updated README from master HTML file + +- Removed remnant of multi-mount support in + +- Removed unused DEVFS_FL_SHOW_UNREG flag +=============================================================================== +Changes for patch v189 + +- Removed nlink field from struct devfs_inode + +- Removed auto-ownership for /dev/pty/* (BSD ptys) and used + DEVFS_FL_CURRENT_OWNER|DEVFS_FL_NO_PERSISTENCE for /dev/pty/s* (just + like Unix98 pty slaves) and made /dev/pty/m* rw-rw-rw- access +=============================================================================== +Changes for patch v190 + +- Updated README from master HTML file + +- Replaced BKL with global rwsem to protect symlink data (quick and + dirty hack) +=============================================================================== +Changes for patch v191 + +- Replaced global rwsem for symlink with per-link refcount +=============================================================================== +Changes for patch v192 + +- Removed unnecessary #ifdef CONFIG_DEVFS_FS from arch/i386/kernel/mtrr.c + +- Ported to kernel 2.4.10-pre11 + +- Set inode->i_mapping->a_ops for block nodes in +=============================================================================== +Changes for patch v193 + +- Went back to global rwsem for symlinks (refcount scheme no good) +=============================================================================== +Changes for patch v194 + +- Fixed overrun in by removing function (not 
needed) + +- Updated README from master HTML file +=============================================================================== +Changes for patch v195 + +- Fixed buffer underrun in + +- Moved down_read() from to +=============================================================================== +Changes for patch v196 + +- Fixed race in when setting event mask + Thanks to Kari Hurtta + +- Avoid deadlock in by using temporary buffer +=============================================================================== +Changes for patch v197 + +- First release of new locking code for devfs core (v1.0) + +- Fixed bug in drivers/cdrom/cdrom.c +=============================================================================== +Changes for patch v198 + +- Discard temporary buffer, now use "%s" for dentry names + +- Don't generate path in : use fake entry instead + +- Use "existing" directory in <_devfs_make_parent_for_leaf> + +- Use slab cache rather than fixed buffer for devfsd events +=============================================================================== +Changes for patch v199 + +- Removed obsolete usage of DEVFS_FL_NO_PERSISTENCE + +- Send DEVFSD_NOTIFY_REGISTERED events in + +- Fixed locking bug in due to typo + +- Do not send CREATE, CHANGE, ASYNC_OPEN or DELETE events from devfsd + or children +=============================================================================== +Changes for patch v200 + +- Ported to kernel 2.5.1-pre2 +=============================================================================== +Changes for patch v201 + +- Fixed bug in : was dereferencing freed pointer +=============================================================================== +Changes for patch v202 + +- Fixed bug in : was dereferencing freed pointer + +- Added process group check for devfsd privileges +=============================================================================== +Changes for patch v203 + +- Use SLAB_ATOMIC in from 
+=============================================================================== +Changes for patch v204 + +- Removed long obsolete rc.devfs + +- Return old entry in for 2.4.x kernels + +- Updated README from master HTML file + +- Increment refcount on module in + +- Created and exported + +- Increment refcount on module in + +- Created and used where needed to fix races + +- Added clarifying comments in response to preliminary EMC code review + +- Added poisoning to + +- Improved debugging messages + +- Fixed unregister bugs in drivers/md/lvm-fs.c +=============================================================================== +Changes for patch v205 + +- Corrected (made useful) debugging message in + +- Moved in to + +- Fixed drivers/md/lvm-fs.c to create "lvm" entry + +- Added magic number to guard against scribbling drivers + +- Only return old entry in if a directory + +- Defined macros for error and debug messages + +- Updated README from master HTML file +=============================================================================== +Changes for patch v206 + +- Added support for multiple Compaq cpqarray controllers + +- Fixed (rare, old) race in +=============================================================================== +Changes for patch v207 + +- Fixed deadlock bug in + +- Tag VFS deletable in if handle ignored + +- Updated README from master HTML file +=============================================================================== +Changes for patch v208 + +- Added KERN_* to remaining messages + +- Cleaned up declaration of + +- Updated README from master HTML file +=============================================================================== +Changes for patch v209 + +- Updated README from master HTML file + +- Removed silently introduced calls to lock_kernel() and + unlock_kernel() due to recent VFS locking changes. 
BKL isn't + required in devfs + +- Changed to allow later additions if not yet empty + +- Added calls to in drivers/block/blkpc.c + and + +- Fixed bug in : was clearing beyond + bitfield + +- Fixed bitfield data type for + +- Made major bitfield type and initialiser 64 bit safe +=============================================================================== +Changes for patch v210 + +- Updated fs/devfs/util.c to fix shift warning on 64 bit machines + Thanks to Anton Blanchard + +- Updated README from master HTML file +=============================================================================== +Changes for patch v211 + +- Do not put miscellaneous character devices in /dev/misc if they + specify their own directory (i.e. contain a '/' character) + +- Copied macro for error messages from fs/devfs/base.c to + fs/devfs/util.c and made use of this macro + +- Removed 2.4.x compatibility code from fs/devfs/base.c +=============================================================================== +Changes for patch v212 + +- Added BKL to because drivers still need it +=============================================================================== +Changes for patch v213 + +- Protected and + from changing directory contents +=============================================================================== +Changes for patch v214 + +- Switched to ISO C structure field initialisers + +- Switch to set_current_state() and move before add_wait_queue() + +- Updated README from master HTML file + +- Fixed devfs entry leak in when *readdir fails +=============================================================================== +Changes for patch v215 + +- Created + +- Switched many functions from to + + +- Switched many functions from to +=============================================================================== +Changes for patch v216 + +- Switched arch/ia64/sn/io/hcl.c from to + + +- Removed deprecated 
+=============================================================================== +Changes for patch v217 + +- Exported and to modules + +- Updated README from master HTML file + +- Fixed module unload race in +=============================================================================== +Changes for patch v218 + +- Removed DEVFS_FL_AUTO_OWNER flag + +- Switched lingering structure field initialiser to ISO C + +- Added locking when setting/clearing flags + +- Documentation fix in fs/devfs/util.c diff --git a/Documentation/filesystems/devfs/README b/Documentation/filesystems/devfs/README new file mode 100644 index 00000000000..54366ecc241 --- /dev/null +++ b/Documentation/filesystems/devfs/README @@ -0,0 +1,1964 @@ +Devfs (Device File System) FAQ + + +Linux Devfs (Device File System) FAQ +Richard Gooch +20-AUG-2002 + + +Document languages: + + + + + + + +----------------------------------------------------------------------------- + +NOTE: the master copy of this document is available online at: + +http://www.atnf.csiro.au/~rgooch/linux/docs/devfs.html +and looks much better than the text version distributed with the +kernel sources. A mirror site is available at: + +http://www.ras.ucalgary.ca/~rgooch/linux/docs/devfs.html + +There is also an optional daemon that may be used with devfs. You can +find out more about it at: + +http://www.atnf.csiro.au/~rgooch/linux/ + +A mailing list is available which you may subscribe to. Send +email +to majordomo@oss.sgi.com with the following line in the +body of the message: +subscribe devfs +To unsubscribe, send the message body: +unsubscribe devfs +instead. The list is archived at + +http://oss.sgi.com/projects/devfs/archive/. + +----------------------------------------------------------------------------- + +Contents + + +What is it? + +Why do it? + +Who else does it? 
+ +How it works + +Operational issues (essential reading) + +Instructions for the impatient +Permissions persistence across reboots +Dealing with drivers without devfs support +All the way with Devfs +Other Issues +Kernel Naming Scheme +Devfsd Naming Scheme +Old Compatibility Names +SCSI Host Probing Issues + + + +Device drivers currently ported + +Allocation of Device Numbers + +Questions and Answers + +Making things work +Alternatives to devfs +What I don't like about devfs +How to report bugs +Strange kernel messages +Compilation problems with devfsd + + +Other resources + +Translations of this document + + +----------------------------------------------------------------------------- + + +What is it? + +Devfs is an alternative to "real" character and block special devices +on your root filesystem. Kernel device drivers can register devices by +name rather than major and minor numbers. These devices will appear in +devfs automatically, with whatever default ownership and +protection the driver specified. A daemon (devfsd) can be used to +override these defaults. Devfs has been in the kernel since 2.3.46. + +NOTE that devfs is entirely optional. If you prefer the old +disc-based device nodes, then simply leave CONFIG_DEVFS_FS=n (the +default). In this case, nothing will change. ALSO NOTE that if you do +enable devfs, the defaults are such that full compatibility is +maintained with the old devices names. + +There are two aspects to devfs: one is the underlying device +namespace, which is a namespace just like any mounted filesystem. The +other aspect is the filesystem code which provides a view of the +device namespace. The reason I make a distinction is because devfs +can be mounted many times, with each mount showing the same device +namespace. Changes made are global to all mounted devfs filesystems. +Also, because the devfs namespace exists without any devfs mounts, you +can easily mount the root filesystem by referring to an entry in the +devfs namespace. 
+ + +The cost of devfs is a small increase in kernel code size and memory +usage. About 7 pages of code (some of that in __init sections) and 72 +bytes for each entry in the namespace. A modest system has only a +couple of hundred device entries, so this costs a few more +pages. Compare this with the suggestion to put /dev on a ramdisc. + +On a typical machine, the cost is under 0.2 percent. On a modest +system with 64 MBytes of RAM, the cost is under 0.1 percent. The +accusations of "bloatware" levelled at devfs are not justified. + +----------------------------------------------------------------------------- + + +Why do it? + +There are several problems that devfs addresses. Some of these +problems are more serious than others (depending on your point of +view), and some can be solved without devfs. However, the totality of +these problems really calls out for devfs. + +The choice is a patchwork of inefficient user space solutions, which +are complex and likely to be fragile, or to use a simple and efficient +devfs which is robust. + +There have been many counter-proposals to devfs, all seeking to +provide some of the benefits without actually implementing devfs. So +far there has been an absence of code and no proposed alternative has +been able to provide all the features that devfs does. Further, +alternative proposals require far more complexity in user-space (and +still deliver less functionality than devfs). Some people have the +mantra of reducing "kernel bloat", but don't consider the effects on +user-space. + +A good solution limits the total complexity of kernel-space and +user-space. + + +Major&minor allocation + +The existing scheme requires the allocation of major and minor device +numbers for each and every device. This means that a central +co-ordinating authority is required to issue these device numbers +(unless you're developing a "private" device driver), in order to +preserve uniqueness. Devfs shifts the burden to a namespace. 
This may +not seem like a huge benefit, but actually it is. Since driver authors +will naturally choose a device name which reflects the functionality +of the device, there is far less potential for namespace conflict. +Solving this requires a kernel change. + +/dev management + +Because you currently access devices through device nodes, these must +be created by the system administrator. For standard devices you can +usually find a MAKEDEV programme which creates all these (hundreds!) +of nodes. This means that changes in the kernel must be reflected by +changes in the MAKEDEV programme, or else the system administrator +creates device nodes by hand. + +The basic problem is that there are two separate databases of +major and minor numbers. One is in the kernel and one is in /dev (or +in a MAKEDEV programme, if you want to look at it that way). This is +duplication of information, which is not good practice. +Solving this requires a kernel change. + +/dev growth + +A typical /dev has over 1200 nodes! Most of these devices simply don't +exist because the hardware is not available. A huge /dev increases the +time to access devices (I'm just referring to the dentry lookup times +and the time taken to read inodes off disc: the next subsection shows +some more horrors). + +An example of how big /dev can grow is if we consider SCSI devices: + +host 6 bits (say up to 64 hosts on a really big machine) +channel 4 bits (say up to 16 SCSI buses per host) +id 4 bits +lun 3 bits +partition 6 bits +TOTAL 23 bits + + +This requires 8 Mega (1024*1024) inodes if we want to store all +possible device nodes. Even if we scrap everything but id,partition +and assume a single host adapter with a single SCSI bus and only one +logical unit per SCSI target (id), that's still 10 bits or 1024 +inodes. Each VFS inode takes around 256 bytes (kernel 2.1.78), so +that's 256 kBytes of inode storage on disc (assuming real inodes take +a similar amount of space as VFS inodes). 
This is actually not so bad, +because disc is cheap these days. Embedded systems would care about +256 kBytes of /dev inodes, but you could argue that embedded systems +would have hand-tuned /dev directories. I've had to do just that on my +embedded systems, but I would rather just leave it to devfs. + +Another issue is the time taken to lookup an inode when first +referenced. Not only does this take time in scanning through a list in +memory, but also the seek times to read the inodes off disc. +This could be solved in user-space using a clever programme which +scanned the kernel logs and deleted /dev entries which are not +available and created them when they were available. This programme +would need to be run every time a new module was loaded, which would +slow things down a lot. + +There is an existing programme called scsidev which will automatically +create device nodes for SCSI devices. It can do this by scanning files +in /proc/scsi. Unfortunately, to extend this idea to other device +nodes would require significant modifications to existing drivers (so +they too would provide information in /proc). This is a non-trivial +change (I should know: devfs has had to do something similar). Once +you go to this much effort, you may as well use devfs itself (which +also provides this information). Furthermore, such a system would +likely be implemented in an ad-hoc fashion, as different drivers will +provide their information in different ways. + +Devfs is much cleaner, because it (naturally) has a uniform mechanism +to provide this information: the device nodes themselves! + + +Node to driver file_operations translation + +There is an important difference between the way disc-based character +and block nodes and devfs entries make the connection between an entry +in /dev and the actual device driver. 
+ +With the current 8 bit major and minor numbers the connection between +disc-based c&b nodes and per-major drivers is done through a +fixed-length table of 128 entries. The various filesystem types set +the inode operations for c&b nodes to {chr,blk}dev_inode_operations, +so when a device is opened a few quick levels of indirection bring us +to the driver file_operations. + +For miscellaneous character devices a second step is required: there +is a scan for the driver entry with the same minor number as the file +that was opened, and the appropriate minor open method is called. This +scanning is done *every time* you open a device node. Potentially, you +may be searching through dozens of misc. entries before you find your +open method. While not an enormous performance overhead, this does +seem pointless. + +Linux *must* move beyond the 8 bit major and minor barrier, +somehow. If we simply increase each to 16 bits, then the indexing +scheme used for major driver lookup becomes untenable, because the +major tables (one each for character and block devices) would need to +be 64 k entries long (512 kBytes on x86, 1 MByte for 64 bit +systems). So we would have to use a scheme like that used for +miscellaneous character devices, which means the search time goes up +linearly with the average number of major device drivers on your +system. Not all "devices" are hardware: some are higher-level drivers +like KGI, so you can get more "devices" without adding hardware. +You can improve this by creating an ordered (balanced:-) +binary tree, in which case your search time becomes log(N). +Alternatively, you can use hashing to speed up the search. +But why do that search at all if you don't have to? Once again, it +seems pointless. + +Note that devfs doesn't use the major&minor system. For devfs +entries, the connection is done when you lookup the /dev entry. When +devfs_register() is called, an internal table is appended which has +the entry name and the file_operations. 
If the dentry cache doesn't +have the /dev entry already, this internal table is scanned to get the +file_operations, and an inode is created. If the dentry cache already +has the entry, there is *no lookup time* (other than the dentry scan +itself, but we can't avoid that anyway, and besides Linux dentries +cream other OS's which don't have them:-). Furthermore, the number of +node entries in a devfs is only the number of available device +entries, not the number of *conceivable* entries. Even if you remove +unnecessary entries in a disc-based /dev, the number of conceivable +entries remains the same: you just limit yourself in order to save +space. + +Devfs provides a fast connection between a VFS node and the device +driver, in a scalable way. + +/dev as a system administration tool + +Right now /dev contains a list of conceivable devices, most of which I +don't have. Devfs only shows those devices available on my +system. This means that listing /dev is a handy way of checking what +devices are available. + +Major&minor size + +Existing major and minor numbers are limited to 8 bits each. This is +now a limiting factor for some drivers, particularly the SCSI disc +driver, which consumes a single major number. Only 16 discs are +supported, and each disc may have only 15 partitions. Maybe this isn't +a problem for you, but some of us are building huge Linux systems with +disc arrays. With devfs an arbitrary pointer can be associated with +each device entry, which can be used to give an effective 32 bit +device identifier (i.e. that's like having a 32 bit minor +number). Since this is private to the kernel, there are no C library +compatibility issues which you would have with increasing major and +minor number sizes. See the section on "Allocation of Device Numbers" +for details on maintaining compatibility with userspace. + +Solving this requires a kernel change. 
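The limits quoted above follow directly from the 8 bit minor number. The following is a minimal sketch in C of that arithmetic, assuming the traditional SCSI disc layout in which each disc takes 16 consecutive minors (one for the whole disc plus 15 partitions). The names MINOR_BITS, SLOTS_PER_DISC, MAX_DISCS and sd_minor() are made up for illustration and are not kernel identifiers.

```c
#include <assert.h>

/* Illustrative constants, not kernel identifiers. */
#define MINOR_BITS       8                    /* width of a classic minor number */
#define MINORS_PER_MAJOR (1 << MINOR_BITS)    /* 256 minors under one major */
#define SLOTS_PER_DISC   16                   /* whole disc + 15 partitions */
#define MAX_DISCS        (MINORS_PER_MAJOR / SLOTS_PER_DISC)

/* The minor space must divide evenly into per-disc slots. */
static_assert(MINORS_PER_MAJOR % SLOTS_PER_DISC == 0,
              "minor space must be a whole number of disc slots");

/*
 * Map (disc, partition) to a minor number; partition 0 means the whole
 * disc. Returns -1 if the pair does not fit under a single major.
 */
static int sd_minor(int disc, int partition)
{
    if (disc < 0 || disc >= MAX_DISCS)
        return -1;
    if (partition < 0 || partition >= SLOTS_PER_DISC)
        return -1;
    return disc * SLOTS_PER_DISC + partition;
}
```

With these values MAX_DISCS is 16 and sd_minor(15, 15) is 255, the last minor available under one major. A seventeenth disc (index 16) does not fit, which is the limit the text describes.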
+ +Since writing this, the kernel has been modified so that the SCSI disc +driver has more major numbers allocated to it and now supports up to +128 discs. Since these major numbers are non-contiguous (a result of +unplanned expansion), the implementation is a little more cumbersome +than originally. + +Just like the changes to IPv4 to fix impending limitations in the +address space, people find ways around the limitations. In the long +run, however, solutions like IPv6 or devfs can't be put off forever. + +Read-only root filesystem + +Having your device nodes on the root filesystem means that you can't +operate properly with a read-only root filesystem. This is because you +want to change ownerships and protections of tty devices. Existing +practice prevents you using a CD-ROM as your root filesystem for a +*real* system. Sure, you can boot off a CD-ROM, but you can't change +tty ownerships, so it's only good for installing. + +Also, you can't use a shared NFS root filesystem for a cluster of +discless Linux machines (having tty ownerships changed on a common +/dev is not good). Nor can you embed your root filesystem in a +ROM-FS. + +You can get around this by creating a RAMDISC at boot time, making +an ext2 filesystem in it, mounting it somewhere and copying the +contents of /dev into it, then unmounting it and mounting it over +/dev. + +A devfs is a cleaner way of solving this. + +Non-Unix root filesystem + +Non-Unix filesystems (such as NTFS) can't be used for a root +filesystem because they variously don't support character and block +special files or symbolic links. You can't have a separate disc-based +or RAMDISC-based filesystem mounted on /dev because you need device +nodes before you can mount these. Devfs can be mounted without any +device nodes. Devlinks won't work because symlinks aren't supported. 
+An alternative solution is to use initrd to mount a RAMDISC initial +root filesystem (which is populated with a minimal set of device +nodes), and then construct a new /dev in another RAMDISC, and finally +switch to your non-Unix root filesystem. This requires clever boot +scripts and a fragile and conceptually complex boot procedure. + +Devfs solves this in a robust and conceptually simple way. + +PTY security + +Current pseudo-tty (pty) devices are owned by root and read-writable +by everyone. The user of a pty-pair cannot change +ownership/protections without being suid-root. + +This could be solved with a secure user-space daemon which runs as +root and does the actual creation of pty-pairs. Such a daemon would +require modification to *every* programme that wants to use this new +mechanism. It also slows down creation of pty-pairs. + +An alternative is to create a new open_pty() syscall which does much +the same thing as the user-space daemon. Once again, this requires +modifications to pty-handling programmes. + +The devfs solution allows a device driver to "tag" certain device +files so that when an unopened device is opened, the ownerships are +changed to the current euid and egid of the opening process, and the +protections are changed to the default registered by the driver. When +the device is closed ownership is set back to root and protections are +set back to read-write for everybody. No programme need be changed. +The devpts filesystem provides this auto-ownership feature for Unix98 +ptys. It doesn't support old-style pty devices, nor does it have all +the other features of devfs. + +Intelligent device management + +Devfs implements a simple yet powerful protocol for communication with +a device management daemon (devfsd) which runs in user space. 
It is +possible to send a message (either synchronously or asynchronously) to +devfsd on any event, such as registration/unregistration of device +entries, opening and closing devices, looking up inodes, scanning +directories and more. This has many possibilities. Some of these are +already implemented. See: + + +http://www.atnf.csiro.au/~rgooch/linux/ + +Device entry registration events can be used by devfsd to change +permissions of newly-created device nodes. This is one mechanism to +control device permissions. + +Device entry registration/unregistration events can be used to run +programmes or scripts. This can be used to provide automatic mounting +of filesystems when a new block device media is inserted into the +drive. + +Asynchronous device open and close events can be used to implement +clever permissions management. For example, the default permissions on +/dev/dsp do not allow everybody to read from the device. This is +sensible, as you don't want some remote user recording what you say at +your console. However, the console user is also prevented from +recording. This behaviour is not desirable. With asynchronous device +open and close events, you can have devfsd run a programme or script +when console devices are opened to change the ownerships for *other* +device nodes (such as /dev/dsp). On closure, you can run a different +script to restore permissions. An advantage of this scheme over +modifying the C library tty handling is that this works even if your +programme crashes (how many times have you seen the utmp database with +lingering entries for non-existent logins?). + +Synchronous device open events can be used to perform intelligent +device access protections. Before the device driver open() method is +called, the daemon must first validate the open attempt, by running an +external programme or script. 
This is far more flexible than access +control lists, as access can be determined on the basis of other +system conditions instead of just the UID and GID. + +Inode lookup events can be used to authenticate module autoload +requests. Instead of using kmod directly, the event is sent to +devfsd which can implement an arbitrary authentication before loading +the module itself. + +Inode lookup events can also be used to construct arbitrary +namespaces, without having to resort to populating devfs with symlinks +to devices that don't exist. + +Speculative Device Scanning + +Consider an application (like cdparanoia) that wants to find all +CD-ROM devices on the system (SCSI, IDE and other types), whether or +not their respective modules are loaded. The application must +speculatively open certain device nodes (such as /dev/sr0 for the SCSI +CD-ROMs) in order to make sure the module is loaded. This requires +that all Linux distributions follow the standard device naming scheme +(last time I looked RedHat did things differently). Devfs solves the +naming problem. + +The same application also wants to see which devices are actually +available on the system. With the existing system it needs to read the +/dev directory and speculatively open each /dev/sr* device to +determine if the device exists or not. With a large /dev this is an +inefficient operation, especially if there are many /dev/sr* nodes. A +solution like scsidev could reduce the number of /dev/sr* entries (but +of course that also requires all that inefficient directory scanning). + +With devfs, the application can open the /dev/sr directory +(which triggers the module autoloading if required), and proceed to +read /dev/sr. Since only the available devices will have +entries, there are no inefficiencies in directory scanning or device +openings. + +----------------------------------------------------------------------------- + +Who else does it? + +FreeBSD has a devfs implementation. 
Solaris and AIX each have a +pseudo-devfs (something akin to scsidev but for all devices, with some +unspecified kernel support). BeOS, Plan9 and QNX also have it. SGI's +IRIX 6.4 and above also have a device filesystem. + +While we shouldn't just automatically do something because others do +it, we should not ignore the work of others either. FreeBSD has a lot +of competent people working on it, so their opinion should not be +blithely ignored. + +----------------------------------------------------------------------------- + + +How it works + +Registering device entries + +For every entry (device node) in a devfs-based /dev a driver must call +devfs_register(). This adds the name of the device entry, the +file_operations structure pointer and a few other things to an +internal table. Device entries may be added and removed at any +time. When a device entry is registered, it automagically appears in +any mounted devfs'. + +Inode lookup + +When a lookup operation on an entry is performed and if there is no +driver information for that entry devfs will attempt to call +devfsd. If still no driver information can be found then a negative +dentry is yielded and the next stage operation will be called by the +VFS (such as create() or mknod() inode methods). If driver information +can be found, an inode is created (if one does not exist already) and +all is well. + +Manually creating device nodes + +The mknod() method allows you to create an ordinary named pipe in the +devfs, or you can create a character or block special inode if one +does not already exist. You may wish to create a character or block +special inode so that you can set permissions and ownership. Later, if +a device driver registers an entry with the same name, the +permissions, ownership and times are retained. This is how you can set +the protections on a device even before the driver is loaded. Once you +create an inode it appears in the directory listing. 
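From user space, the pre-creation step described above comes down to a mknod(2) call followed by setting the permissions. The following is a minimal sketch, assuming a POSIX C library. precreate_node() is a hypothetical helper name, and any path or device number passed to it is the caller's assumption rather than something devfs prescribes.

```c
#define _XOPEN_SOURCE 700   /* expose mknod() and the S_IF* constants */

#include <assert.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a node of the given type (S_IFCHR, S_IFBLK
 * or S_IFIFO) at 'path', then force the permission bits so that the
 * process umask cannot narrow them.
 * Returns 0 on success, -1 with errno set by the failing call.
 */
static int precreate_node(const char *path, mode_t type, mode_t perms,
                          dev_t dev)
{
    assert(path != NULL);

    if (mknod(path, type | perms, dev) != 0)
        return -1;

    if (chmod(path, perms) != 0) {
        int saved = errno;   /* unlink() may overwrite errno */
        unlink(path);        /* do not leave a half-configured node behind */
        errno = saved;
        return -1;
    }
    return 0;
}
```

A named pipe can be created this way without privileges, for example precreate_node("/tmp/example_fifo", S_IFIFO, 0640, 0). Creating S_IFCHR or S_IFBLK nodes normally requires root.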
+ +Unregistering device entries + +A device driver calls devfs_unregister() to unregister an entry. + +Chroot() gaols + +2.2.x kernels + +The semantics of inode creation are different when devfs is mounted +with the "explicit" option. Now, when a device entry is registered, it +will not appear until you use mknod() to create the device. It doesn't +matter if you mknod() before or after the device is registered with +devfs_register(). The purpose of this behaviour is to support +chroot(2) gaols, where you want to mount a minimal devfs inside the +gaol. Only the devices you specifically want to be available (through +your mknod() setup) will be accessible. + +2.4.x kernels + +As of kernel 2.3.99, the VFS has had the ability to rebind parts of +the global filesystem namespace into another part of the namespace. +This now works even at the leaf-node level, which means that +individual files and device nodes may be bound into other parts of the +namespace. This is like making links, but better, because it works +across filesystems (unlike hard links) and works through chroot() +gaols (unlike symbolic links). + +Because of these improvements to the VFS, the multi-mount capability +in devfs is no longer needed. The administrator may create a minimal +device tree inside a chroot(2) gaol by using VFS bindings. As this +provides most of the features of the devfs multi-mount capability, I +removed the multi-mount support code (after issuing an RFC). This +yielded code size reductions and simplifications. + +If you want to construct a minimal chroot() gaol, the following +command should suffice: + +mount --bind /dev/null /gaol/dev/null + + +Repeat for other device nodes you want to expose. Simple! + +----------------------------------------------------------------------------- + + +Operational issues + + +Instructions for the impatient + +Nobody likes reading documentation. People just want to get in there +and play. 
So this section tells you quickly the steps you need to take +to run with devfs mounted over /dev. Skip these steps and you will end +up with a nearly unbootable system. Subsequent sections describe the +issues in more detail, and discuss non-essential configuration +options. + +Devfsd +OK, if you're reading this, I assume you want to play with +devfs. First you should ensure that /usr/src/linux contains a +recent kernel source tree. Then you need to compile devfsd, the device +management daemon, available at + +http://www.atnf.csiro.au/~rgooch/linux/. +Because the kernel has a naming scheme +which is quite different from the old naming scheme, you need to +install devfsd so that software and configuration files that use the +old naming scheme will not break. + +Compile and install devfsd. You will be provided with a default +configuration file /etc/devfsd.conf which will provide +compatibility symlinks for the old naming scheme. Don't change this +config file unless you know what you're doing. Even if you think you +do know what you're doing, don't change it until you've followed all +the steps below and booted a devfs-enabled system and verified that it +works. + +Now edit your main system boot script so that devfsd is started at the +very beginning (before any filesystem +checks). /etc/rc.d/rc.sysinit is often the main boot script +on systems with SysV-style boot scripts. On systems with BSD-style +boot scripts it is often /etc/rc. Also check +/sbin/rc. + +NOTE that the line you put into the boot +script should be exactly: + +/sbin/devfsd /dev + +DO NOT use some special daemon-launching +programme, otherwise the boot script may not wait for devfsd to finish +initialising. + +System Libraries +There may still be some problems because of broken software making +assumptions about device names. In particular, some software does not +handle devices which are symbolic links. 
If you are running a libc 5 +based system, install libc 5.4.44 (if you have libc 5.4.46, go back to +libc 5.4.44, which is actually correct). If you are running a glibc +based system, make sure you have glibc 2.1.3 or later. + +/etc/securetty +PAM (Pluggable Authentication Modules) is supposed to be a flexible +mechanism for providing better user authentication and access to +services. Unfortunately, it's also fragile, complex and undocumented +(check out RedHat 6.1, and probably other distributions as well). PAM +has problems with symbolic links. Append the following lines to your +/etc/securetty file: + +vc/1 +vc/2 +vc/3 +vc/4 +vc/5 +vc/6 +vc/7 +vc/8 + +This will not weaken security. If you have a version of util-linux +earlier than 2.10.h, please upgrade to 2.10.h or later. If you +absolutely cannot upgrade, then also append the following lines to +your /etc/securetty file: + +1 +2 +3 +4 +5 +6 +7 +8 + +This may potentially weaken security by allowing root logins over the +network (a password is still required, though). However, since there +are problems with dealing with symlinks, I'm suspicious of the level +of security offered in any case. + +XFree86 +While not essential, it's probably a good idea to upgrade to XFree86 +4.0, as patches went in to make it more devfs-friendly. If you don't, +you'll probably need to apply the following patch to +/etc/security/console.perms so that ordinary users can run +startx. Note that not all distributions have this file (e.g. Debian), +so if it's not present, don't worry about it. 
+ +--- /etc/security/console.perms.orig Sat Apr 17 16:26:47 1999 ++++ /etc/security/console.perms Fri Feb 25 23:53:55 2000 +@@ -14,7 +14,7 @@ + # man 5 console.perms + + # file classes -- these are regular expressions +-<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] ++<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] + + # device classes -- these are shell-style globs + <floppy>=/dev/fd[0-1]* + +If the patch does not apply, then replace the line: + +<console>=tty[0-9][0-9]* :[0-9]\.[0-9] :[0-9] + +with: + +<console>=tty[0-9][0-9]* vc/[0-9][0-9]* :[0-9]\.[0-9] :[0-9] + + +Disable devpts +I've had a report of devpts mounted on /dev/pts not working +correctly. Since devfs will also manage /dev/pts, there is no +need to mount devpts as well. You should either edit your +/etc/fstab so devpts is not mounted, or disable devpts from +your kernel configuration. + +Unsupported drivers +Not all drivers have devfs support. If you depend on one of these +drivers, you will need to create a script or tarfile that you can use +at boot time to create device nodes as appropriate. There is a +section which describes this. Another +section lists the drivers which have +devfs support. + +/dev/mouse + +Many distributions configure /dev/mouse to be the mouse device +for XFree86 and GPM. I actually think this is a bad idea, because it +adds another level of indirection. When looking at a config file, if +you see /dev/mouse you're left wondering which mouse +is being referred to. Hence I recommend putting the actual mouse +device (for example /dev/psaux) into your +/etc/X11/XF86Config file (and similarly for the GPM +configuration file). + +Alternatively, use the same technique used for unsupported drivers +described above. + +The Kernel +Finally, you need to make sure devfs is compiled into your kernel. Set +CONFIG_EXPERIMENTAL=y, CONFIG_DEVFS_FS=y and CONFIG_DEVFS_MOUNT=y by +using your favourite configuration tool (e.g. make config or +make xconfig), then run make clean and recompile your kernel and +modules.
At boot, devfs will be mounted onto /dev. + +If you encounter problems booting (for example if you forgot a +configuration step), you can pass devfs=nomount at the kernel +boot command line. This will prevent the kernel from mounting devfs at +boot time onto /dev. + +In general, a kernel built with CONFIG_DEVFS_FS=y but without mounting +devfs onto /dev is completely safe, and requires no +configuration changes. One exception to take note of is when +LABEL= directives are used in /etc/fstab. In this +case you will be unable to boot properly. This is because the +mount(8) programme uses /proc/partitions as part of +the volume label search process, and the device names it finds are not +available, because setting CONFIG_DEVFS_FS=y changes the names in +/proc/partitions, irrespective of whether devfs is mounted. + +Now you've finished all the steps required. You're now ready to boot +your shiny new kernel. Enjoy. + +Changing the configuration + +OK, you've now booted a devfs-enabled system, and everything works. +Now you may feel like changing the configuration (common targets are +/etc/fstab and /etc/devfsd.conf). Since you have a +system that works, if you make any changes and it doesn't work, you +now know that you only have to restore your configuration files to the +default and it will work again. + + +Permissions persistence across reboots + +If you don't use mknod(2) to create a device file, nor use chmod(2) or +chown(2) to change the ownerships/permissions, the inode ctime will +remain at 0 (the epoch, 12 am, 1-JAN-1970, GMT). Anything with a ctime +later than this has had its ownership/permissions changed. Hence, a +simple script or programme may be used to tar up all changed inodes, +prior to shutdown. Although effective, many consider this approach a +kludge. + +A much better approach is to use devfsd to save and restore +permissions.
It may be configured to record changes in permissions and +will save them in a database (in fact a directory tree), and restore +these upon boot. This is an efficient method and results in immediate +saving of current permissions (unlike the tar approach, which saves +permissions at some unspecified future time). + +The default configuration file supplied with devfsd has config entries +which you may uncomment to enable persistence management. + +If you decide to use the tar approach anyway, be aware that tar will +first unlink(2) an inode before creating a new device node. The +unlink(2) has the effect of breaking the connection between a devfs +entry and the device driver. If you use the "devfs=only" boot option, +you lose access to the device driver, requiring you to reload the +module. I consider this a bug in tar (there is no real need to +unlink(2) the inode first). + +Alternatively, you can use devfsd to provide more sophisticated +management of device permissions. You can use devfsd to store +permissions for whole groups of devices with a single configuration +entry, rather than the conventional single entry per device entry. + +Permissions database stored in mounted-over /dev + +If you wish to save and restore your device permissions into the +disc-based /dev while still mounting devfs onto /dev +you may do so. This requires a 2.4.x kernel (in fact, 2.3.99 or +later), which has the VFS binding facility. 
You need to do the +following to set this up: + + + +make sure the kernel does not mount devfs at boot time + + +make sure you have a correct /dev/console entry in your +root file-system (where your disc-based /dev lives) + +create the /dev-state directory + + +add the following lines near the very beginning of your boot +scripts: + +mount --bind /dev /dev-state +mount -t devfs none /dev +devfsd /dev + + + + +add the following lines to your /etc/devfsd.conf file: + +REGISTER ^pt[sy] IGNORE +CREATE ^pt[sy] IGNORE +CHANGE ^pt[sy] IGNORE +DELETE ^pt[sy] IGNORE +REGISTER .* COPY /dev-state/$devname $devpath +CREATE .* COPY $devpath /dev-state/$devname +CHANGE .* COPY $devpath /dev-state/$devname +DELETE .* CFUNCTION GLOBAL unlink /dev-state/$devname +RESTORE /dev-state + +Note that the sample devfsd.conf file contains these lines, +as well as other sample configurations you may find useful. See the +devfsd distribution + + +reboot. + + + + +Permissions database stored in normal directory + +If you are using an older kernel which doesn't support VFS binding, +then you won't be able to have the permissions database in a +mounted-over /dev. However, you can still use a regular +directory to store the database. The sample /etc/devfsd.conf +file above may still be used. You will need to create the +/dev-state directory prior to installing devfsd. If you have +old permissions in /dev, then just copy (or move) the device +nodes over to the new directory. + +Which method is better? + +The best method is to have the permissions database stored in the +mounted-over /dev. This is because you will not need to copy +device nodes over to /dev-state, and because it allows you to +switch between devfs and non-devfs kernels, without requiring you to +copy permissions between /dev-state (for devfs) and +/dev (for non-devfs). + + +Dealing with drivers without devfs support + +Currently, not all device drivers in the kernel have been modified to +use devfs. 
Device drivers which do not yet have devfs support will not +automagically appear in devfs. The simplest way to create device nodes +for these drivers is to unpack a tarfile containing the required +device nodes. You can do this in your boot scripts. All your drivers +will now work as before. + +Hopefully for most people devfs will have enough support so that they +can mount devfs directly over /dev without losing most functionality +(i.e. losing access to various devices). As of 22-JAN-1998 (devfs +patch version 10) I am now running this way. All the devices I have +are available in devfs, so I don't lose anything. + +WARNING: if your configuration requires the old-style device names +(i.e. /dev/hda1 or /dev/sda1), you must install devfsd and configure +it to maintain compatibility entries. It is almost certain that you +will require this. Note that the kernel creates a compatibility entry +for the root device, so you don't need initrd. + +Note that you no longer need to mount devpts if you use Unix98 PTYs, +as devfs can manage /dev/pts itself. This saves you some RAM, as you +don't need to compile and install devpts. Note that some versions of +glibc have a bug with Unix98 pty handling on devfs systems. Contact +the glibc maintainers for a fix. Glibc 2.1.3 has the fix. + +Note also that apart from editing /etc/fstab, other things will need +to be changed if you *don't* install devfsd. Some software (like the X +server) hard-wire device names in their source. It really is much +easier to install devfsd so that compatibility entries are created. +You can then slowly migrate your system to using the new device names +(for example, by starting with /etc/fstab), and then limiting the +compatibility entries that devfsd creates. + +IF YOU CONFIGURE TO MOUNT DEVFS AT BOOT, MAKE SURE YOU INSTALL DEVFSD +BEFORE YOU BOOT A DEVFS-ENABLED KERNEL! + +Now that devfs has gone into the 2.3.46 kernel, I'm getting a lot of +reports back. 
Many of these are because people are trying to run +without devfsd, and hence some things break. Please just run devfsd if +things break. I want to concentrate on real bugs rather than +misconfiguration problems at the moment. If people are willing to fix +bugs/false assumptions in other code (e.g. glibc, X server) and submit +that to the respective maintainers, that would be great. + + +All the way with Devfs + +The devfs kernel patch creates a rationalised device tree. As stated +above, if you want to keep using the old /dev naming scheme, +you just need to configure devfsd appropriately (see the man +page). People who prefer the old names can ignore this section. For +those of us who like the rationalised names and an uncluttered +/dev, read on. + +If you don't run devfsd, or don't enable compatibility entry +management, then you will have to configure your system to use the new +names. For example, you will then need to edit your +/etc/fstab to use the new disc naming scheme. If you want to +be able to boot non-devfs kernels, you will need compatibility +symlinks in the underlying disc-based /dev pointing back to +the old-style names for when you boot a kernel without devfs. + +You can selectively decide which devices you want compatibility +entries for. For example, you may only want compatibility entries for +BSD pseudo-terminal devices (otherwise you'll have to patch your C +library or use Unix98 ptys instead). It's just a matter of putting +the correct regular expression into /etc/devfsd.conf. + +There are other choices of naming schemes that you may prefer. For +example, I don't use the kernel-supplied +names, because they are too verbose. A common misconception is +that the kernel-supplied names are meant to be used directly in +configuration files. This is not the case. They are designed to +reflect the layout of the devices attached and to provide easy +classification. + +If you like the kernel-supplied names, that's fine.
If you don't, then +you should be using devfsd to construct a namespace more to your +liking. Devfsd has built-in code to construct a +namespace that is both logical and easy to +manage. In essence, it creates a convenient abbreviation of the +kernel-supplied namespace. + +You are of course free to build your own namespace. Devfsd has all the +infrastructure required to make this easy for you. All you need do is +write a script. You can even write some C code and devfsd can load the +shared object as a callable extension. + + +Other Issues + +The init programme +Another thing to take note of is whether your init programme +creates a Unix socket /dev/telinit. Some versions of init +create /dev/telinit so that the telinit programme can +communicate with the init process. If you have such a system you need +to make sure that devfs is mounted over /dev *before* init +starts. In other words, you can't leave the mounting of devfs to +/etc/rc, since this is executed after init. Other +versions of init require a named pipe /dev/initctl +which must exist *before* init starts. Once again, you need to +mount devfs and then create the named pipe *before* init +starts. + +The default behaviour now is not to mount devfs onto /dev at +boot time for 2.3.x and later kernels. You can correct this with the +"devfs=mount" boot option. This solves any problems with init, +and also prevents the dreaded: + +Cannot open initial console + +message. For 2.2.x kernels where you need to apply the devfs patch, +the default is to mount. + +If you have automatic mounting of devfs onto /dev then you +may need to create /dev/initctl in your boot scripts. The +following lines should suffice: + +mknod /dev/initctl p +kill -SIGUSR1 1 # tell init that /dev/initctl now exists + +Alternatively, if you don't want the kernel to mount devfs onto +/dev then you could use the following procedure as a +guideline for how to get around /dev/initctl problems: + +# cd /sbin +# mv init init.real +# cat > init +#!
/bin/sh +mount -n -t devfs none /dev +mknod /dev/initctl p +exec /sbin/init.real "$@" +[control-D] +# chmod a+x init + +Note that newer versions of init create /dev/initctl +automatically, so you don't have to worry about this. + +Module autoloading +You will need to configure devfsd to enable module +autoloading. The following line should be placed in your +/etc/devfsd.conf file: + +LOOKUP .* MODLOAD + + +As of devfsd-v1.3.10, a generic /etc/modules.devfs +configuration file is installed, which is used by the MODLOAD +action. This should be sufficient for most configurations. If you +require further configuration, edit your /etc/modules.conf +file. The way module autoloading works with devfs is: + + +a process attempts to look up a device node (e.g. /dev/fred) + + +if that device node does not exist, the full pathname is passed to +devfsd as a string + + +devfsd will pass the string to the modprobe programme (provided the +configuration line shown above is present), and specifies that +/etc/modules.devfs is the configuration file + + +/etc/modules.devfs includes /etc/modules.conf to +access local configurations + +modprobe will search its configuration files, looking for an alias +that translates the pathname into a module name + + +the translated pathname is then used to load the module. + + +If you wanted a lookup of /dev/fred to load the +mymod module, you would require the following configuration +line in /etc/modules.conf: + +alias /dev/fred mymod + +The /etc/modules.devfs configuration file provides many such +aliases for standard device names. If you look closely at this file, +you will note that some modules require multiple alias configuration +lines. This is required to support module autoloading for old and new +device names. + +Mounting root off a devfs device +If you wish to mount root off a devfs device when you pass the +"devfs=only" boot option, then you need to pass in the +"root=" option to the kernel when booting.
If you use +LILO, then you must have this in lilo.conf: + +append = "root=<device>" + +Surprised? Yep, so was I. It turns out if you have (as most people +do): + +root = <device> + + +then LILO will determine the device number of <device> and will +write that device number into a special place in the kernel image +before starting the kernel, and the kernel will use that device number +to mount the root filesystem. So, using the "append" variety ensures +that LILO passes the root filesystem device as a string, which devfs +can then use. + +Note that this isn't an issue if you don't pass "devfs=only". + +TTY issues +The ttyname(3) function in some versions of the C library makes +false assumptions about device entries which are symbolic links. The +tty(1) programme is one that depends on this function. I've +written a patch to libc 5.4.43 which fixes this. This has been +included in libc 5.4.44 and a similar fix is in glibc 2.1.3. + + +Kernel Naming Scheme + +The kernel provides a default naming scheme. This scheme is designed +to make it easy to search for specific devices or device types, and to +view the available devices. Some device types (such as hard discs) +have a directory of entries, making it easy to see what devices of +that class are available. Often, the entries are symbolic links into a +directory tree that reflects the topology of available devices. The +topological tree is useful for finding how your devices are arranged. + +Below is a list of the naming schemes for the most common drivers. A +list of reserved device names is +available for reference. Please send email to +rgooch@atnf.csiro.au to obtain an allocation. Please be +patient (the maintainer is busy). An alternative name may be allocated +instead of the requested name, at the discretion of the maintainer.
+ +Disc Devices + +All discs, whether SCSI, IDE or whatever, are placed under the +/dev/discs hierarchy: + + /dev/discs/disc0 first disc + /dev/discs/disc1 second disc + + +Each of these entries is a symbolic link to the directory for that +device. The device directory contains: + + disc for the whole disc + part* for individual partitions + + +CD-ROM Devices + +All CD-ROMs, whether SCSI, IDE or whatever, are placed under the +/dev/cdroms hierarchy: + + /dev/cdroms/cdrom0 first CD-ROM + /dev/cdroms/cdrom1 second CD-ROM + + +Each of these entries is a symbolic link to the real device entry for +that device. + +Tape Devices + +All tapes, whether SCSI, IDE or whatever, are placed under the +/dev/tapes hierarchy: + + /dev/tapes/tape0 first tape + /dev/tapes/tape1 second tape + + +Each of these entries is a symbolic link to the directory for that +device. The device directory contains: + + mt for mode 0 + mtl for mode 1 + mtm for mode 2 + mta for mode 3 + mtn for mode 0, no rewind + mtln for mode 1, no rewind + mtmn for mode 2, no rewind + mtan for mode 3, no rewind + + +SCSI Devices + +To uniquely identify any SCSI device requires the following +information: + + controller (host adapter) + bus (SCSI channel) + target (SCSI ID) + unit (Logical Unit Number) + + +All SCSI devices are placed under /dev/scsi (assuming devfs +is mounted on /dev). Hence, a SCSI device with the following +parameters: c=1,b=2,t=3,u=4 would appear as: + + /dev/scsi/host1/bus2/target3/lun4 device directory + + +Inside this directory, a number of device entries may be created, +depending on which SCSI device-type drivers were installed. + +See the section on the disc naming scheme to see what entries the SCSI +disc driver creates. + +See the section on the tape naming scheme to see what entries the SCSI +tape driver creates. 
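As a minimal sketch of the SCSI layout described above, the shell function below assembles the device directory from the four identifying parameters. The function name scsi_devfs_dir and its optional mount-point argument are illustrative assumptions; they are not part of devfs or devfsd.

```shell
#!/bin/sh
# Hypothetical helper (not part of devfs): build the devfs device
# directory for a SCSI device.
#   $1 = controller (host adapter)   $2 = bus (SCSI channel)
#   $3 = target (SCSI ID)            $4 = unit (Logical Unit Number)
#   $5 = devfs mount point, defaults to /dev
scsi_devfs_dir() {
    printf '%s/scsi/host%s/bus%s/target%s/lun%s\n' \
        "${5:-/dev}" "$1" "$2" "$3" "$4"
}

# The example from the text: c=1,b=2,t=3,u=4
scsi_devfs_dir 1 2 3 4
```

The entries inside that directory (disc, part*, mt, cd, generic and so on) depend on which SCSI device-type drivers are present, so the sketch stops at the directory itself.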
+ +The SCSI CD-ROM driver creates: + + cd + + +The SCSI generic driver creates: + + generic + + +IDE Devices + +To uniquely identify any IDE device requires the following +information: + + controller + bus (aka. primary/secondary) + target (aka. master/slave) + unit + + +All IDE devices are placed under /dev/ide, and use a similar +naming scheme to the SCSI subsystem. + +XT Hard Discs + +All XT discs are placed under /dev/xd. The first XT disc has +the directory /dev/xd/disc0. + +TTY devices + +The tty devices now appear as: + + New name Old-name Device Type + -------- -------- ----------- + /dev/tts/{0,1,...} /dev/ttyS{0,1,...} Serial ports + /dev/cua/{0,1,...} /dev/cua{0,1,...} Call out devices + /dev/vc/0 /dev/tty Current virtual console + /dev/vc/{1,2,...} /dev/tty{1...63} Virtual consoles + /dev/vcc/{0,1,...} /dev/vcs{1...63} Virtual console capture devices + /dev/pty/m{0,1,...} /dev/ptyp?? PTY masters + /dev/pty/s{0,1,...} /dev/ttyp?? PTY slaves + + +RAMDISCS + +The RAMDISCS are placed in their own directory, and are named thus: + + /dev/rd/{0,1,2,...} + + +Meta Devices + +The meta devices are placed in their own directory, and are named +thus: + + /dev/md/{0,1,2,...} + + +Floppy discs + +Floppy discs are placed in the /dev/floppy directory. + +Loop devices + +Loop devices are placed in the /dev/loop directory. + +Sound devices + +Sound devices are placed in the /dev/sound directory +(audio, sequencer, ...). + + +Devfsd Naming Scheme + +Devfsd provides a naming scheme which is a convenient abbreviation of +the kernel-supplied namespace. In some +cases, the kernel-supplied naming scheme is quite convenient, so +devfsd does not provide another naming scheme. The convenience names +that devfsd creates are in fact the same names as the original devfs +kernel patch created (before Linus mandated the Big Name +Change). These are referred to as "new compatibility entries".
+ +In order to configure devfsd to create these convenience names, the +following lines should be placed in your /etc/devfsd.conf: + +REGISTER .* MKNEWCOMPAT +UNREGISTER .* RMNEWCOMPAT + +This will cause devfsd to create (and destroy) symbolic links which +point to the kernel-supplied names. + +SCSI Hard Discs + +All SCSI discs are placed under /dev/sd (assuming devfs is +mounted on /dev). Hence, a SCSI disc with the following +parameters: c=1,b=2,t=3,u=4 would appear as: + + /dev/sd/c1b2t3u4 for the whole disc + /dev/sd/c1b2t3u4p5 for the 5th partition + /dev/sd/c1b2t3u4p5s6 for the 6th slice in the 5th partition + + +SCSI Tapes + +All SCSI tapes are placed under /dev/st. A similar naming +scheme is used as for SCSI discs. A SCSI tape with the +parameters: c=1,b=2,t=3,u=4 would appear as: + + /dev/st/c1b2t3u4m0 for mode 0 + /dev/st/c1b2t3u4m1 for mode 1 + /dev/st/c1b2t3u4m2 for mode 2 + /dev/st/c1b2t3u4m3 for mode 3 + /dev/st/c1b2t3u4m0n for mode 0, no rewind + /dev/st/c1b2t3u4m1n for mode 1, no rewind + /dev/st/c1b2t3u4m2n for mode 2, no rewind + /dev/st/c1b2t3u4m3n for mode 3, no rewind + + +SCSI CD-ROMs + +All SCSI CD-ROMs are placed under /dev/sr. A similar naming +scheme is used as for SCSI discs. A SCSI CD-ROM with the +parameters: c=1,b=2,t=3,u=4 would appear as: + + /dev/sr/c1b2t3u4 + + +SCSI Generic Devices + +The generic (aka. raw) interface for all SCSI devices is placed under +/dev/sg. A similar naming scheme is used as for SCSI discs. A +SCSI generic device with the parameters: c=1,b=2,t=3,u=4 would appear +as: + + /dev/sg/c1b2t3u4 + + +IDE Hard Discs + +All IDE discs are placed under /dev/ide/hd, using a similar +convention to SCSI discs. The following mappings exist between the old +and the new names: + + /dev/hda /dev/ide/hd/c0b0t0u0 + /dev/hdb /dev/ide/hd/c0b0t1u0 + /dev/hdc /dev/ide/hd/c0b1t0u0 + /dev/hdd /dev/ide/hd/c0b1t1u0 + + +IDE Tapes + +A similar naming scheme is used as for IDE discs. The entries will +appear in the /dev/ide/mt directory.
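The four IDE hard disc mappings listed above follow a simple pattern: the bus digit selects primary or secondary and the target digit selects master or slave. The sketch below reproduces only those four documented mappings. The function name ide_new_name is hypothetical, and names beyond hdd are deliberately rejected because the text does not say how they map.

```shell
#!/bin/sh
# Hypothetical helper: translate an old IDE disc name (hda..hdd) into
# the devfsd convenience name, covering only the mappings listed above.
ide_new_name() {
    case "$1" in
        hda) idx=0 ;;
        hdb) idx=1 ;;
        hdc) idx=2 ;;
        hdd) idx=3 ;;
        *)   echo "no mapping listed for $1" >&2; return 1 ;;
    esac
    bus=$((idx / 2))      # 0 = primary, 1 = secondary
    target=$((idx % 2))   # 0 = master,  1 = slave
    echo "/dev/ide/hd/c0b${bus}t${target}u0"
}

ide_new_name hdc
```

A plain lookup table would work equally well; the arithmetic form is used here only to make the bus/target structure of the names visible.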
+ +IDE CD-ROM + +A similar naming scheme is used as for IDE discs. The entries will +appear in the /dev/ide/cd directory. + +IDE Floppies + +A similar naming scheme is used as for IDE discs. The entries will +appear in the /dev/ide/fd directory. + +XT Hard Discs + +All XT discs are placed under /dev/xd. The first XT disc +would appear as /dev/xd/c0t0. + + +Old Compatibility Names + +The old compatibility names are the legacy device names, such as +/dev/hda, /dev/sda, /dev/rtc and so on. +Devfsd can be configured to create compatibility symlinks so that you +may continue to use the old names in your configuration files and so +that old applications will continue to function correctly. + +In order to configure devfsd to create these legacy names, the +following lines should be placed in your /etc/devfsd.conf: + +REGISTER .* MKOLDCOMPAT +UNREGISTER .* RMOLDCOMPAT + +This will cause devfsd to create (and destroy) symbolic links which +point to the kernel-supplied names. + + +----------------------------------------------------------------------------- + + +Device drivers currently ported + +- All miscellaneous character devices support devfs (this is done + transparently through misc_register()) + +- SCSI discs and generic hard discs + +- Character memory devices (null, zero, full and so on) + Thanks to C. Scott Ananian + +- Loop devices (/dev/loop?) + +- TTY devices (console, serial ports, terminals and pseudo-terminals) + Thanks to C. Scott Ananian + +- SCSI tapes (/dev/scsi and /dev/tapes) + +- SCSI CD-ROMs (/dev/scsi and /dev/cdroms) + +- SCSI generic devices (/dev/scsi) + +- RAMDISCS (/dev/ram?) + +- Meta Devices (/dev/md*) + +- Floppy discs (/dev/floppy) + +- Parallel port printers (/dev/printers) + +- Sound devices (/dev/sound) + Thanks to Eric Dumas and + C. 
Scott Ananian + +- Joysticks (/dev/joysticks) + +- Sparc keyboard (/dev/kbd) + +- DSP56001 digital signal processor (/dev/dsp56k) + +- Apple Desktop Bus (/dev/adb) + +- Coda network file system (/dev/cfs*) + +- Virtual console capture devices (/dev/vcc) + Thanks to Dennis Hou + +- Frame buffer devices (/dev/fb) + +- Video capture devices (/dev/v4l) + + +----------------------------------------------------------------------------- + + +Allocation of Device Numbers + +Devfs allows you to write a driver which doesn't need to allocate a +device number (major&minor numbers) for the internal operation of the +kernel. However, there are a number of userspace programmes that use +the device number as a unique handle for a device. An example is the +find programme, which uses device numbers to determine whether +an inode is on a different filesystem than another inode. The device +number used is the one for the block device which a filesystem is +using. To preserve compatibility with userspace programmes, block +devices using devfs need to have unique device numbers allocated to +them. Furthermore, POSIX specifies device numbers, so some kind of +device number needs to be presented to userspace. + +The simplest option (especially when porting drivers to devfs) is to +keep using the old major and minor numbers. Devfs will take whatever +values are given for major&minor and pass them onto userspace. + +This device number is a 16 bit number, so this leaves plenty of space +for large numbers of discs and partitions. This scheme can also be +used for character devices, in particular the tty devices, which are +currently limited to 256 pseudo-ttys (this limits the total number of +simultaneous xterms and remote logins). Note that the device number +is limited to the range 36864-61439 (majors 144-239), in order to +avoid any possible conflicts with existing official allocations. 
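The range quoted above can be checked with a little arithmetic, assuming the traditional 16-bit encoding in which the major number occupies the high 8 bits and the minor number the low 8 bits. The function name devnum below is illustrative only.

```shell
#!/bin/sh
# Illustrative helper: pack major ($1) and minor ($2) into a 16-bit
# device number, major in the high 8 bits and minor in the low 8 bits.
devnum() {
    echo $(( ($1 << 8) | $2 ))
}

devnum 144 0     # first device number of major 144
devnum 239 255   # last device number of major 239
```

Under that assumption the two calls give 36864 and 61439, which matches the stated range for majors 144-239.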
+ +Please note that using dynamically allocated block device numbers may +break the NFS daemons (both user and kernel mode), which expect dev_t +for a given device to be constant over the lifetime of remote mounts. + +A final note on this scheme: since it doesn't increase the size of +device numbers, there are no compatibility issues with userspace. + +----------------------------------------------------------------------------- + + +Questions and Answers + + +Making things work +Alternatives to devfs +What I don't like about devfs +How to report bugs +Strange kernel messages +Compilation problems with devfsd + + + +Making things work + +Here are some common questions and answers. + + + +Devfsd doesn't start + +Make sure you have compiled and installed devfsd +Make sure devfsd is being started from your boot +scripts +Make sure you have configured your kernel to enable devfs (see +below) +Make sure devfs is mounted (see below) + + +Devfsd is not managing all my permissions + +Make sure you are capturing the appropriate events. For example, +device entries created by the kernel generate REGISTER events, +but those created by devfsd generate CREATE events. + + +Devfsd is not capturing all REGISTER events + +See the previous entry: you may need to capture CREATE events. + + +X will not start + +Make sure you followed the steps +outlined above. + + +Why don't my network devices appear in devfs? + +This is not a bug. Network devices have their own, completely separate +namespace. They are accessed via socket(2) and +setsockopt(2) calls, and thus require no device nodes. I have +raised the possibility of moving network devices into the device +namespace, but have had no response. + + +How can I test if I have devfs compiled into my kernel? + +All filesystems built-in or currently loaded are listed in +/proc/filesystems. If you see a devfs entry, then +you know that devfs was compiled into your kernel.
If you have +correctly configured and rebuilt your kernel, then devfs will be +built-in. If you think you've configured it in, but +/proc/filesystems doesn't show it, you've made a mistake. +Common mistakes include: + +Using a 2.2.x kernel without applying the devfs patch (if you +don't know how to patch your kernel, use 2.4.x instead, don't bother +asking me how to patch) +Forgetting to set CONFIG_EXPERIMENTAL=y +Forgetting to set CONFIG_DEVFS_FS=y +Forgetting to set CONFIG_DEVFS_MOUNT=y (if you want devfs +to be automatically mounted at boot) +Editing your .config manually, instead of using make +config or make xconfig +Forgetting to run make dep; make clean after changing the +configuration and before compiling +Forgetting to compile your kernel and modules +Forgetting to install your kernel +Forgetting to install your modules + +Please check twice that you've done all these steps before sending in +a bug report. + + + +How can I test if devfs is mounted on /dev? + +The device filesystem will always create an entry called +".devfsd", which is used to communicate with the daemon. Even +if the daemon is not running, this entry will exist. Testing for the +existence of this entry is the approved method of determining if devfs +is mounted or not. Note that the type of entry (e.g. regular file, +character device, named pipe, etc.) may change without notice. Only +the existence of the entry should be relied upon. + + +When I start devfsd, I see the error: +Error opening file: ".devfsd" No such file or directory? + +This means that devfs is not mounted. Make sure you have devfs mounted. + + +How do I mount devfs? + +First make sure you have devfs compiled into your kernel (see +above). Then you will need to do one of the following: + +set CONFIG_DEVFS_MOUNT=y in your kernel config +pass devfs=mount to your boot loader +mount devfs manually in your boot scripts with: +mount -t devfs none /dev + + + +Mount by volume LABEL=