Age | Commit message (Collapse) | Author |
|
Use schedule_timeout_{,un}interruptible() instead of
set_current_state()/schedule_timeout() to reduce kernel size. Also use helper
functions to convert between human time units and jiffies rather than constant
HZ division to avoid rounding errors.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
attached patch should fix the following race:
Proc 1 Proc 2
__flush_batch()
ll_rw_block()
do_get_write_access()
lock_buffer
jh is only waiting for checkpoint
-> b_transaction == NULL ->
do nothing
unlock_buffer
test_set_buffer_locked()
test_clear_buffer_dirty()
__journal_file_buffer()
change the data
submit_bh()
and we have sent wrong data to disk... We now clean the dirty buffer flag
under buffer lock in all cases and hence we know that whenever a buffer is
starting to be journaled we either finish the pending write-out before
attaching a buffer to a transaction or we won't write the buffer until the
transaction is going to be committed.
The test in jbd_unexpected_dirty_buffer() is redundant - remove it.
Furthermore we have to clear the buffer dirty bit under the buffer lock to
prevent races with buffer write-out (and hence prevent returning a buffer with
IO happening).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
We must be sure that the current data in buffer are sent to disk. Hence we
have to call ll_rw_block() with SWRITE.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Fix race between journal_commit_transaction() and other places as
journal_unmap_buffer() that are adding buffers to transaction's t_forget list.
We have to protect against such places by holding j_list_lock even when
traversing the t_forget list. The fact that other places can only add buffers
to the list makes the locking easier. OTOH the lock ranking complicates the
stuff...
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
It seems that kjournald() may be missing a check of the JFS_UNMOUNT flag
before calling schedule(). This showed up in testing of OCFS2 recovery
where our recovery thread would hang in journal_kill_thread() called from
journal_destroy() because kjournald never got a chance to read the flag to
shut down before the schedule().
Zach pointed out the missing check which led me to hack up this trivial
patch. It's been tested many times now and I have yet to reproduce the
hang, which was happening very regularly before.
<mild rant>
I'm guessing that we could really use some wait_event() calls with helper
functions in, well, most of jbd these days which would make a ton of the
wait code there vastly cleaner.
</mild rant>
As for why this doesn't happen in ext3 (or OCFS2 during normal
mount/unmount of the local nodes journal), I think it may that the specific
timing of events in the ocfs2 recovery thread exposes a race there.
Because ocfs2_replay_journal() is only interested in playing back the
journal, initialization and shutdown happen very quicky with no other
metadata put into that specific journal.
Acked-by: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This patch contains the following cleanups:
- make needlessly global functions static
- journal.c: remove the unused global function __journal_internal_check
and move the check to journal_init
- remove the following write-only global variable:
- journal.c: current_journal
- remove the following unneeded EXPORT_SYMBOL:
- journal.c: journal_recover
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Andreas Dilger <adilger@clusterfs.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
1. Establish a simple API for process freezing defined in linux/include/sched.h:
frozen(process) Check for frozen process
freezing(process) Check if a process is being frozen
freeze(process) Tell a process to freeze (go to refrigerator)
thaw_process(process) Restart process
frozen_process(process) Process is frozen now
2. Remove all references to PF_FREEZE and PF_FROZEN from all
kernel sources except sched.h
3. Fix numerous locations where try_to_freeze is manually done by a driver
4. Remove the argument that is no longer necessary from two function calls.
5. Some whitespace cleanup
6. Clear potential race in refrigerator (provides an open window of PF_FREEZE
cleared before setting PF_FROZEN, recalc_sigpending does not check
PF_FROZEN).
This patch does not address the problem of freeze_processes() violating the rule
that a task may only modify its own flags by setting PF_FREEZE. This is not clean
in an SMP environment. freeze(process) is therefore not SMP safe!
Signed-off-by: Christoph Lameter <christoph@lameter.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Fix a bug in list scanning that can cause us to skip the last buffer on the
checkpoint list (and hence fail to do any progress under some rather
unfavorable conditions).
The problem is we first do jh=next_jh and then test
} while (jh!=last_jh);
Hence we skip the last buffer on the list (if it was not the only buffer on
the list). As we already do jh=next_jh; in the beginning of the loop we
are safe to just remove the assignment in the end. It can happen that 'jh'
will be freed at the point we test jh != last_jh but that does not matter
as we never *dereference* the pointer.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Fix possible false assertion failure in log_do_checkpoint(). We might fail
to detect that we actually made a progress when cleaning up the checkpoint
lists if we don't retry after writing something to disk. The patch was
confirmed to fix observed assertion failures for several users.
When we flushed some buffers we need to retry scanning the list.
Otherwise we can fail to detect our progress.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This fixes the lots-of-fsx-linux-instances-cause-a-slow-leak bug.
It's been there since 2.6.6, caused by:
ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.5/2.6.5-mm4/broken-out/jbd-move-locked-buffers.patch
That patch moves under-writeout ordered-data buffers onto a separate journal
list during commit. It took out the old code which was based on a single
list.
The old code (necessarily) had logic which would restart I/O against buffers
which had been redirtied while they were on the committing transaction's
t_sync_datalist list. The new code only writes buffers once, ignoring
redirtyings by a later transaction, which is good.
But over on the truncate side of things, in journal_unmap_buffer(), we're
treating buffers on the t_locked_list as inviolable things which belong to the
committing transaction, and we just leave them alone during concurrent
truncate-vs-commit.
The net effect is that when truncate tries to invalidate a page whose buffers
are on t_locked_list and have been redirtied, journal_unmap_buffer() just
leaves those buffers alone. truncate will remove the page from its mapping
and we end up with an anonymous clean page with dirty buffers, which is an
illegal state for a page. The JBD commit will not clean those buffers as they
are removed from t_locked_list. The VM (try_to_free_buffers) cannot reclaim
these pages.
The patch teaches journal_unmap_buffer() about buffers which are on the
committing transaction's t_locked_list. These buffers have been written and
I/O has completed. We can take them off the transaction and undirty them
within the context of journal_invalidatepage()->journal_unmap_buffer().
Acked-by: "Stephen C. Tweedie" <sct@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Initial git repository build. I'm not bothering with the full history,
even though we have it. We can create a separate "historical" git
archive of that later if we want to, and in the meantime it's about
3.2GB when imported into git - space that would just make the early
git days unnecessarily complicated, when we don't have a lot of good
infrastructure for it.
Let it rip!
|