Jens Axboe [Sat, 22 Jul 2006 23:42:19 +0000 (01:42 +0200)]
[PATCH] cfq-iosched: use metadata read flag
Give meta data reads preference over regular reads, as the process
often needs to get that out of the way to do the io it was actually
interested in.
Signed-off-by: Jens Axboe <axboe@suse.de>
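A minimal sketch of the preference described in the commit above, not the CFQ code itself. The struct, its is_meta field and pick_next() are hypothetical names; only the idea (a metadata read wins over a regular read) comes from the commit message, and the sector tie-breaker is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified read request: only what the sketch needs. */
struct sketch_rq {
	bool is_meta;         /* assumed marker for a metadata read */
	unsigned long sector; /* start sector, used only as a tie-breaker */
};

/*
 * Choose which of two pending reads to service first.
 * A metadata read wins over a regular read; if both are the same kind,
 * fall back to the lower start sector (an assumption of this sketch).
 */
static const struct sketch_rq *pick_next(const struct sketch_rq *a,
					 const struct sketch_rq *b)
{
	assert(a && b);
	if (a->is_meta != b->is_meta)
		return a->is_meta ? a : b;
	return a->sector <= b->sector ? a : b;
}
```

With a regular read at sector 10 and a metadata read at sector 500, pick_next() returns the metadata read regardless of argument order.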
Jens Axboe [Sat, 22 Jul 2006 23:41:26 +0000 (01:41 +0200)]
[PATCH] ext3: make meta data reads use READ_META
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 10 Aug 2006 07:01:02 +0000 (09:01 +0200)]
[PATCH] Allow file systems to differentiate between data and meta reads
We can use this information for making more intelligent priority
decisions, and it will also be useful for blktrace.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 21 Jul 2006 18:30:28 +0000 (20:30 +0200)]
[PATCH] ll_rw_blk: allow more flexibility for read_ahead_kb store
It can make sense to set read-ahead larger than a single request.
We should not be enforcing such policy on the user. Additionally,
using the BLKRASET ioctl doesn't impose such a restriction, so we
now expose identical behaviour through both interfaces.
Issue also reported by Anton <cbou@mail.ru>
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Wed, 19 Jul 2006 18:29:12 +0000 (20:29 +0200)]
[PATCH] cfq-iosched: improve queue preemption
Don't touch the current queues, just make sure that the wanted queue
is selected next. Simplifies the logic.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 20 Jul 2006 12:54:05 +0000 (14:54 +0200)]
[PATCH] Add blk_start_queueing() helper
CFQ implements this on its own now, but it's really block layer
knowledge. Tells a device queue to start dispatching requests to
the driver, taking care to unplug if needed. Also fixes the issue
where as/cfq will invoke a stopped queue, which we really don't
want.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Wed, 19 Jul 2006 12:56:28 +0000 (14:56 +0200)]
[PATCH] cfq-iosched: kill the empty_list
No point in having a place holder list just for empty queues, so remove
it. It's not used for anything other than to keep ->cfq_list busy.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 28 Jul 2006 07:48:51 +0000 (09:48 +0200)]
[PATCH] cfq-iosched: Kill O(N) runtime of cfq_resort_rr_list()
Currently it scales with number of processes in that priority group,
which is potentially not very nice as it's called quite often.
Basically we always need to do tail inserts, except for the case of a
new process. So just mark/detect a queue as such.
Signed-off-by: Jens Axboe <axboe@suse.de>
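The constant-time placement described in the commit above could look roughly like this. It is a userspace sketch with made-up names (struct node, struct sketch_queue, resort()), and putting a new queue at the front of the list is an assumption; the commit only says that new processes are the one case that is not a tail insert.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal circular doubly linked list, in the style of a kernel list head. */
struct node {
	struct node *prev, *next;
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

/* Insert n between two known neighbours: O(1), no scanning. */
static void insert_between(struct node *n, struct node *prev, struct node *next)
{
	n->prev = prev;
	n->next = next;
	prev->next = n;
	next->prev = n;
}

/* Hypothetical queue carrying a "new process" marker. */
struct sketch_queue {
	struct node link;
	bool is_new;
};

/*
 * Place a queue on the round-robin list without walking it:
 * a queue marked new goes to the front once (assumed position),
 * everything else is appended at the tail.
 */
static void resort(struct node *rr_list, struct sketch_queue *q)
{
	assert(rr_list && q);
	if (q->is_new) {
		q->is_new = false;
		insert_between(&q->link, rr_list, rr_list->next);
	} else {
		insert_between(&q->link, rr_list->prev, rr_list);
	}
}
```

Neither branch depends on how many queues are already on the list, which is the point of the change.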
Jens Axboe [Wed, 19 Jul 2006 21:39:40 +0000 (23:39 +0200)]
[PATCH] Make sure all block/io scheduler setups are node aware
Some were kmalloc_node(), some were still kmalloc(). Change them all to
kmalloc_node().
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 28 Jul 2006 07:36:46 +0000 (09:36 +0200)]
[PATCH] Kill various deprecated/unused block layer defines/functions
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Tue, 18 Jul 2006 20:24:11 +0000 (22:24 +0200)]
[PATCH] Audit block layer inlines
Kill a few inlines that bring in too much code to more than one location.
Shrinks kernel text by about 300 bytes on 32-bit x86.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Wed, 19 Jul 2006 03:07:12 +0000 (05:07 +0200)]
[PATCH] cfq-iosched: use new io context counting mechanism
It's ok if the read path is a lot more costly, as long as inc/dec is
really cheap. The inc/dec will happen for each created/freed io context,
while the reading only happens when a disk queue exits.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Wed, 19 Jul 2006 03:10:01 +0000 (05:10 +0200)]
[PATCH] as-iosched: use new io context counting mechanism
It's ok if the read path is a lot more costly, as long as inc/dec is
really cheap. The inc/dec will happen for each created/freed io context,
while the reading only happens when a disk queue exits.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Sat, 22 Jul 2006 13:37:43 +0000 (15:37 +0200)]
[PATCH] elevator: define ioc counting mechanism
None of the in-kernel primitives for handling "atomic" counting seem
to be a good fit. We need something that is essentially free for
incrementing/decrementing, while the read side may be more expensive
as we only ever need to do that when a device is removed from the
kernel.
Use a per-cpu variable for maintaining a per-cpu ioc count and define
a reading mechanism that just sums up the values.
Signed-off-by: Jens Axboe <axboe@suse.de>
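The counting scheme in the commit above can be illustrated in plain C. This is a sketch under stated assumptions: a fixed-size array stands in for a real per-cpu variable, the CPU number is passed in explicitly, and all names (SKETCH_NR_CPUS, ioc_count_inc() and so on) are invented for the example.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4   /* assumed fixed CPU count for the sketch */

/* One counter slot per CPU; a plain array stands in for a per-cpu variable. */
static long ioc_count[SKETCH_NR_CPUS];

/* Cheap write side: each CPU only touches its own slot, no atomics. */
static void ioc_count_inc(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	ioc_count[cpu]++;
}

static void ioc_count_dec(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	ioc_count[cpu]--;
}

/*
 * Expensive read side: walk every slot and sum. A single slot may be
 * negative (allocated on one CPU, freed on another); only the sum
 * is meaningful.
 */
static long ioc_count_read(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sum += ioc_count[cpu];
	return sum;
}
```

The trade-off matches the commit text: increments and decrements are trivial, while the rare reader pays for a loop over all CPUs.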
Jens Axboe [Tue, 29 Aug 2006 07:05:44 +0000 (09:05 +0200)]
[PATCH] cfq-iosched: kill cfq_exit_lock
cfq_exit_lock is protecting two things now:
- The per-ioc rbtree of cfq_io_contexts
- The per-cfqd linked list of cfq_io_contexts
The per-cfqd linked list can be protected by the queue lock, since both
the list and the queue lock are (by definition) per cfqd.
The per-ioc rbtree is mainly used and updated by the process itself only.
The only outside use is the io priority changing. If we move the
priority changing to not browsing the rbtree, we can remove any locking
from the rbtree updates and lookup completely. Let the sys_ioprio syscall
just mark processes as having the iopriority changed and lazily update
the private cfq io contexts the next time io is queued, and we can
remove this locking as well.
Signed-off-by: Jens Axboe <axboe@suse.de>
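A rough sketch of the "mark now, update lazily" idea from the commit above. Everything here is hypothetical (the struct layout, the two-entry array of private copies, the function names); it only shows why the syscall side no longer needs to walk the per-process structures.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CICS 2   /* assumed number of per-device private contexts */

struct sketch_io_context {
	int ioprio;                    /* process-wide priority */
	bool ioprio_changed;           /* set by the syscall, cleared lazily */
	int cic_prio[SKETCH_NR_CICS];  /* private per-device copies */
};

/* Syscall side: only records the new value and raises the flag. */
static void sketch_set_ioprio(struct sketch_io_context *ioc, int prio)
{
	ioc->ioprio = prio;
	ioc->ioprio_changed = true;
}

/*
 * IO submission side, run by the owning process itself: if the flag is
 * set, refresh every private copy before using one of them.
 * Returns the priority used for the given device.
 */
static int sketch_queue_io(struct sketch_io_context *ioc, int dev)
{
	assert(dev >= 0 && dev < SKETCH_NR_CICS);
	if (ioc->ioprio_changed) {
		for (int i = 0; i < SKETCH_NR_CICS; i++)
			ioc->cic_prio[i] = ioc->ioprio;
		ioc->ioprio_changed = false;
	}
	return ioc->cic_prio[dev];
}
```

Between the syscall and the next queued IO the private copies are stale, which is harmless because nothing consults them until IO is queued.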
Jens Axboe [Sat, 22 Jul 2006 14:48:31 +0000 (16:48 +0200)]
[PATCH] cfq-iosched: cleanups, fixes, dead code removal
A collection of little fixes and cleanups:
- We don't use the 'queued' sysfs exported attribute, since the
may_queue() logic was rewritten. So kill it.
- Remove dead defines.
- cfq_set_active_queue() can be rewritten cleaner with else if conditions.
- Several places had cfq_exit_cfqq() like logic, abstract that out and
use that.
- Annotate the cfqq kmem_cache_alloc() so the allocator knows that this
is a repeat allocation if it fails with __GFP_WAIT set. Allows the
allocator to start freeing some memory, if needed. CFQ already loops for
this condition, so might as well pass the hint down.
- Remove cfqd->rq_starved logic. It's not needed anymore after we dropped
the crq allocation in cfq_set_request().
- Remove unneeded parameter passing.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 10 Aug 2006 07:00:21 +0000 (09:00 +0200)]
[PATCH] struct request: shrink and optimize some more
Move some members around and unionize completion_data and rb_node since
they cannot ever be used at the same time.
Signed-off-by: Jens Axboe <axboe@suse.de>
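The space saving from overlaying two members that are never live at the same time can be shown with a small sketch. The structs below are stand-ins, not the kernel's struct request or struct rb_node, and the union is named only to keep the example valid in older C dialects.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a red-black tree node: three pointer-sized words. */
struct sketch_rb_node {
	unsigned long parent_color;
	struct sketch_rb_node *left, *right;
};

/* Before: both members occupy space for the whole life of the request. */
struct rq_separate {
	struct sketch_rb_node rb_node;
	void *completion_data;
};

/* After: the members share storage, since only one is in use at a time. */
struct rq_shared {
	union {
		struct sketch_rb_node rb_node; /* used in one phase */
		void *completion_data;         /* used in another phase */
	} u;
};

/* Switch the shared storage over to its completion role. */
static void rq_set_completion_data(struct rq_shared *rq, void *data)
{
	assert(rq != NULL);
	rq->u.completion_data = data;
}
```

In this sketch the shared layout is smaller than the separate one by the size of the overlaid pointer; the real saving in struct request depends on its other members and padding.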
Jens Axboe [Tue, 18 Jul 2006 02:14:45 +0000 (04:14 +0200)]
[PATCH] ll_rw_blk: cleanup __make_request()
- Don't assign variables that are only used once.
- Kill spin_lock() prefetching, it's opportunistic at best.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 28 Jul 2006 07:32:57 +0000 (09:32 +0200)]
[PATCH] Drop useless bio passing in may_queue/set_request API
It's not needed for anything, so kill the bio passing.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 28 Jul 2006 07:32:07 +0000 (09:32 +0200)]
[PATCH] Remove ->rq_status from struct request
After Christoph's SCSI change, the only usage left is RQ_ACTIVE
and RQ_INACTIVE. The block layer sets RQ_INACTIVE right before freeing
the request, so any check for RQ_INACTIVE in a driver is a bug and
indicates use-after-free.
So kill/clean the remaining users, straightforward.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 10 Aug 2006 06:59:11 +0000 (08:59 +0200)]
[PATCH] Remove struct request_list from struct request
It is always identical to &q->rq, and we only use it for detecting
whether this request came out of our mempool or not. So replace it
with an additional ->flags bit flag.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Sat, 30 Sep 2006 18:29:12 +0000 (20:29 +0200)]
[PATCH] Remove ->waiting member from struct request
As the comment in blkdev.h indicates, we can fold it into ->end_io_data
usage as that is really what ->waiting is. Fixup the users of
blk_end_sync_rq().
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jens Axboe [Tue, 18 Jul 2006 19:07:29 +0000 (21:07 +0200)]
[PATCH] as-iosched: kill arq
Get rid of the as_rq request type. With the added elevator_private2, we
have enough room in struct request to get rid of any arq allocation/free
for each request.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Jens Axboe [Thu, 13 Jul 2006 10:39:25 +0000 (12:39 +0200)]
[PATCH] cfq-iosched: kill crq
Get rid of the cfq_rq request type. With the added elevator_private2, we
have enough room in struct request to get rid of any crq allocation/free
for each request.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Wed, 12 Jul 2006 12:04:37 +0000 (14:04 +0200)]
[PATCH] Add one more pointer to struct request for IO scheduler usage
Then we have enough room in the request to get rid of the dynamic
allocations in CFQ/AS.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 13 Jul 2006 10:37:56 +0000 (12:37 +0200)]
[PATCH] cfq-iosched: remove the crq flag functions/variable
There's just one flag currently (SYNC), and that one can be grabbed from
the request.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 13 Jul 2006 10:36:41 +0000 (12:36 +0200)]
[PATCH] deadline-iosched: remove elevator private drq request type
A big win, we now save an allocation/free on each request! With the
previous rb/hash abstractions, we can just reuse queuelist/donelist
for the FIFO data and be done with it.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 28 Jul 2006 07:26:13 +0000 (09:26 +0200)]
[PATCH] as-iosched: remove arq->is_sync member
We can track this in struct request.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Jens Axboe [Thu, 13 Jul 2006 07:12:14 +0000 (09:12 +0200)]
[PATCH] as-iosched: reuse rq for fifo
Saves some space in arq.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Jens Axboe [Tue, 11 Jul 2006 19:30:31 +0000 (21:30 +0200)]
[PATCH] cfq-iosched: convert to using the FIFO elevator defines
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Tue, 11 Jul 2006 19:49:15 +0000 (21:49 +0200)]
[PATCH] elevator: introduce a way to reuse rq for internal FIFO handling
The io schedulers can use this instead of having to allocate space for
it themselves.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 13 Jul 2006 10:34:24 +0000 (12:34 +0200)]
[PATCH] deadline-iosched: migrate to using the elevator rb functions
This removes the rbtree handling from deadline.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Thu, 13 Jul 2006 10:33:14 +0000 (12:33 +0200)]
[PATCH] cfq-iosched: migrate to using the elevator rb functions
This removes the rbtree handling from CFQ.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Tue, 18 Jul 2006 19:06:01 +0000 (21:06 +0200)]
[PATCH] as-iosched: migrate to using the elevator rb functions
This removes the rbtree handling from AS.
Signed-off-by: Jens Axboe <axboe@suse.de>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Jens Axboe [Thu, 13 Jul 2006 09:55:04 +0000 (11:55 +0200)]
[PATCH] elevator: abstract out the rbtree sort handling
The rbtree sort/lookup/reposition logic is mostly duplicated in
cfq/deadline/as, so move it to the elevator core. The io schedulers
still provide the actual rb root, as we don't want to impose any sort
of specific handling on the schedulers.
Introduce the helpers and rb_node in struct request to help migrate the
IO schedulers.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Tue, 11 Jul 2006 19:15:52 +0000 (21:15 +0200)]
[PATCH] rbtree: fixed reversed RB_EMPTY_NODE and rb_next/prev
The conditions got reversed. Also make rb_next() and rb_prev() check
for the empty condition.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jens Axboe [Fri, 28 Jul 2006 07:23:08 +0000 (09:23 +0200)]
[PATCH] elevator: move the backmerging logic into the elevator core
Right now, every IO scheduler implements its own backmerging (except for
noop, which does no merging). That results in duplicated code for
essentially the same operation, which is never a good thing. This patch
moves the backmerging out of the io schedulers and into the elevator
core. We save 1.6kb of text and as a bonus get backmerging for noop as
well. Win-win!
Signed-off-by: Jens Axboe <axboe@suse.de>
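For readers unfamiliar with the term, a back merge appends new IO to a request that ends exactly where the new IO begins. The sketch below shows that check done in one shared place; the types and function names are invented, and the linear scan is only for illustration (how the elevator core actually indexes requests is not shown here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request and bio: a start sector and a length. */
struct sketch_rq  { unsigned long sector; unsigned long nr_sectors; };
struct sketch_bio { unsigned long sector; unsigned long nr_sectors; };

/* A bio can be back-merged if it starts right where the request ends. */
static bool can_back_merge(const struct sketch_rq *rq,
			   const struct sketch_bio *bio)
{
	return rq->sector + rq->nr_sectors == bio->sector;
}

/* Shared lookup: return a request the bio can be appended to, or NULL. */
static struct sketch_rq *find_back_merge(struct sketch_rq *rqs, size_t n,
					 const struct sketch_bio *bio)
{
	assert(bio != NULL);
	for (size_t i = 0; i < n; i++)
		if (can_back_merge(&rqs[i], bio))
			return &rqs[i];
	return NULL;
}

/* Perform the merge by growing the request to cover the bio. */
static void back_merge(struct sketch_rq *rq, const struct sketch_bio *bio)
{
	assert(can_back_merge(rq, bio));
	rq->nr_sectors += bio->nr_sectors;
}
```

Because none of this depends on scheduler policy, a scheduler that does no sorting of its own can still benefit from it, which is how noop gains back merging in the commit above.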
Jens Axboe [Thu, 10 Aug 2006 06:44:47 +0000 (08:44 +0200)]
[PATCH] Split struct request ->flags into two parts
Right now ->flags is a bit of a mess: some are request types, and
others are just modifiers. Clean this up by splitting it into
->cmd_type and ->cmd_flags. This allows introduction of generic
Linux block message types, useful for sending generic Linux commands
to block devices.
Signed-off-by: Jens Axboe <axboe@suse.de>
Jean Delvare [Sat, 30 Sep 2006 15:18:59 +0000 (17:18 +0200)]
[PATCH] i2c: Prevent deadlock on i2c client registration
Delay the call to adapter->client_register() until after we are
certain that the client registration is a success. At this point the
client is fully initialized and we no longer hold the adapter->clist
mutex, so this should prevent the deadlocks if the client_register()
callback needs to take that mutex too, as is the case for the bttv
driver.
This fixes bug #7234.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Sat, 30 Sep 2006 16:39:15 +0000 (09:39 -0700)]
Merge /pub/scm/linux/kernel/git/mchehab/v4l-dvb
* master.kernel.org:/pub/scm/linux/kernel/git/mchehab/v4l-dvb: (180 commits)
V4L/DVB (4641): Trivial: use lowercase letters in hex subsystem ids
V4L/DVB (4639): Cx88: add autodetection for alternate revision of Leadtek PVR
V4L/DVB (4638): Basic DVB-T and analog TV support for the HVR1300.
V4L/DVB (4637): Add a default method for VIDIOC_G_PARM
V4L/DVB (4635): Extend bttv and saa7134 to check for both AGP and PCI PCI failure case
V4L/DVB (4634): Zr36120: implement pcipci checks
V4L/DVB (4632): Zoran: Implement pcipci failure check
V4L/DVB (4631): Av7110: remove V4L2_CAP_VBI_CAPTURE flag
V4L/DVB (4630): Av7110: FW_LOADER dependency fixed
V4L/DVB (4629): Saa7134: add card support for Proteus Pro 2309
V4L/DVB (4628): Fix VIDIOC_ENUMSTD ioctl in videodev.c
V4L/DVB (4627): Vivi crashes with mplayer
V4L/DVB (4626): On saa7111/7113, LUMA_CTRL need a different value
V4L/DVB (4624): Tvaudio: Replaced kernel_thread() with kthread_run()
V4L/DVB (4622): Copy-paste bug in videodev.c
V4L/DVB (4620): Fix AGC configuration for MOD3000P-based boards
V4L/DVB (4619): Fixes some I2C dependencies on V4L devices
V4L/DVB (4617): Problem with dibusb-mb.c USB IDs
V4L/DVB (4616): [PATCH] Nebula DigiTV USB RC support
V4L/DVB (4614): Export symbol saa7134_tvaudio_setmute from saa7134 for saa7134-alsa
...
Linus Torvalds [Sat, 30 Sep 2006 16:38:19 +0000 (09:38 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6: (48 commits)
ieee1394: raw1394: arm functions slept in atomic context
ieee1394: sbp2: enable auto spin-up for all SBP-2 devices
MAINTAINERS: updates to IEEE 1394 subsystem maintainership
ieee1394: ohci1394: check for errors in suspend or resume
set power state of firewire host during suspend
ieee1394: ohci1394: more obvious endianness handling
ieee1394: ohci1394: fix endianness bug in debug message
ieee1394: sbp2: don't prefer MODE SENSE 10
ieee1394: nodemgr: grab class.subsys.rwsem in nodemgr_resume_ne
ieee1394: nodemgr: fix rwsem recursion
ieee1394: sbp2: more help in Kconfig
ieee1394: sbp2: prevent rare deadlock in shutdown
ieee1394: sbp2: update includes
ieee1394: sbp2: better handling of transport errors
ieee1394: sbp2: recheck node generation in sbp2_update
ieee1394: sbp2: safer agent reset in error handlers
ieee1394: sbp2: handle "sbp2util_node_write_no_wait failed"
CONFIG_PM=n slim: drivers/ieee1394/ohci1394.c
ieee1394: safer definition of empty macros
video1394: add poll file operation support
...
Linus Torvalds [Sat, 30 Sep 2006 16:36:56 +0000 (09:36 -0700)]
Merge branch 'intelfb-patches' of /linux/kernel/git/airlied/intelfb-2.6
* 'intelfb-patches' of master.kernel.org:/pub/scm/linux/kernel/git/airlied/intelfb-2.6:
intelfbhw.c: intelfbhw_get_p1p2 defined but not used
intelfb: fix mtrr_reg signedness
intelfb: update doc and Kconfig (supported devices)
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add preliminary i2c support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
intelfb: add vsync interrupt support
Linus Torvalds [Sat, 30 Sep 2006 15:37:55 +0000 (08:37 -0700)]
Merge branch 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6
* 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6:
[PATCH] Use early clobber in semaphores
[PATCH] Define vsyscall cache as blob to make clearer that user space shouldn't use it
[PATCH] Re-positioning the bss segment
[PATCH] Use ARRAY_SIZE in setup.c
[PATCH] i386: replace intermediate array-size definitions with ARRAY_SIZE()
[PATCH] x86: Clean up x86 NMI sysctls
[PATCH] Refactor some duplicated code in mpparse.c
[PATCH] Document iommu=panic
[PATCH] Fix broken indentation in iommu_setup
[PATCH] Allow disabling DAC using command line options
[PATCH] Add proper sparse __user casts to __copy_to_user_inatomic
[PATCH] i386: Update defconfig
[PATCH] Update defconfig
Linus Torvalds [Sat, 30 Sep 2006 01:54:48 +0000 (18:54 -0700)]
Merge /pub/scm/linux/kernel/git/davem/net-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
[ATM]: [lec] use refcnt to protect lec_arp_entries outside lock
[ATM]: [lec] add reference counting to lec_arp entries
[ATM]: [lec] use work queue instead of timer for lec arp expiry
[ATM]: [lec] old_close is no longer used
[ATM]: [lec] convert lec_arp_table to hlist
[ATM]: [lec] header indent, comment and whitespace cleanup
[ATM]: [lec] indent, comment and whitespace cleanup [continued]
[ATM]: [lec] indent, comment and whitespace cleanup
[SCTP]: Do not timestamp every SCTP packet.
[SCTP]: Use correct mask when disabling PMTUD.
[SCTP]: Include sk_buff overhead while updating the peer's receive window.
[SCTP]: Enable Nagle algorithm by default.
[BNX2]: Disable MSI on 5706 if AMD 8132 bridge is present.
[NetLabel]: audit fixups due to delayed feedback
Chas Williams [Sat, 30 Sep 2006 00:17:17 +0000 (17:17 -0700)]
[ATM]: [lec] use refcnt to protect lec_arp_entries outside lock
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:16:48 +0000 (17:16 -0700)]
[ATM]: [lec] add reference counting to lec_arp entries
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:15:59 +0000 (17:15 -0700)]
[ATM]: [lec] use work queue instead of timer for lec arp expiry
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:15:15 +0000 (17:15 -0700)]
[ATM]: [lec] old_close is no longer used
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:14:27 +0000 (17:14 -0700)]
[ATM]: [lec] convert lec_arp_table to hlist
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:13:24 +0000 (17:13 -0700)]
[ATM]: [lec] header indent, comment and whitespace cleanup
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:11:47 +0000 (17:11 -0700)]
[ATM]: [lec] indent, comment and whitespace cleanup [continued]
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chas Williams [Sat, 30 Sep 2006 00:11:14 +0000 (17:11 -0700)]
[ATM]: [lec] indent, comment and whitespace cleanup
Signed-off-by: Chas Williams <chas@cmf.nrl.navy.mil>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 30 Sep 2006 00:10:03 +0000 (17:10 -0700)]
[SCTP]: Do not timestamp every SCTP packet.
We only need the timestamp on COOKIE-ECHO chunks, so instead of always
timestamping every SCTP packet, let common code timestamp if the socket
option is set. For COOKIE-ECHO, simply get the time of day if we don't
have a timestamp. This introduces a small possibility that the cookie
may be considered expired, but it will be renegotiated.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Yasevich [Sat, 30 Sep 2006 00:09:34 +0000 (17:09 -0700)]
[SCTP]: Use correct mask when disabling PMTUD.
Signed-off-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sridhar Samudrala [Sat, 30 Sep 2006 00:09:05 +0000 (17:09 -0700)]
[SCTP]: Include sk_buff overhead while updating the peer's receive window.
Currently if the sender is sending small messages, it can cause a receiver
to run out of receive buffer space even when the advertised receive window
is still open and results in packet drops and retransmissions. Including
an overhead while updating the sender's view of peer receive window will
reduce the chances of receive buffer space overshooting the receive window.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
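The arithmetic behind the commit above can be sketched as follows. The overhead constant is an assumed placeholder, not the size of any real structure, and rwnd_after_send() is a hypothetical helper rather than SCTP code.

```c
#include <assert.h>

/* Assumed per-message buffer overhead on the receiver, for illustration. */
#define SKETCH_SKB_OVERHEAD 200u

/*
 * Sender's view of the peer's receive window after sending one message.
 * Charging the overhead as well as the payload makes the window close
 * sooner when messages are small. The result saturates at zero.
 */
static unsigned int rwnd_after_send(unsigned int rwnd, unsigned int data_len,
				    unsigned int overhead)
{
	unsigned int cost = data_len + overhead;

	return cost >= rwnd ? 0 : rwnd - cost;
}

/* Ten 10-byte messages against a 4000-byte window, with and without overhead. */
static void rwnd_example(void)
{
	unsigned int naive = 4000, adjusted = 4000;

	for (int i = 0; i < 10; i++) {
		naive = rwnd_after_send(naive, 10, 0);
		adjusted = rwnd_after_send(adjusted, 10, SKETCH_SKB_OVERHEAD);
	}
	assert(naive == 3900);     /* 4000 - 10 * 10 */
	assert(adjusted == 1900);  /* 4000 - 10 * (10 + 200) */
}
```

In the naive accounting the sender believes almost the whole window is still free, even though the receiver has spent far more buffer space than 100 bytes on those ten messages.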
Sridhar Samudrala [Sat, 30 Sep 2006 00:08:01 +0000 (17:08 -0700)]
[SCTP]: Enable Nagle algorithm by default.
This allows more aggressive bundling of chunks when sending small
messages.
Signed-off-by: Sridhar Samudrala <sri@us.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael Chan [Sat, 30 Sep 2006 00:06:23 +0000 (17:06 -0700)]
[BNX2]: Disable MSI on 5706 if AMD 8132 bridge is present.
MSI is defined to be 32-bit write. The 5706 does 64-bit MSI writes
with byte enables disabled on the unused 32-bit word. This is legal
but causes problems on the AMD 8132, which will eventually stop
responding.
Without this patch, the MSI test done by the driver during open will
pass, but MSI will eventually stop working after a few MSIs are
written by the device.
AMD believes this incompatibility is unique to the 5706, and
prefers to locally disable MSI rather than globally disabling it
using pci_msi_quirk.
Update version to 1.4.45.
Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Moore [Sat, 30 Sep 2006 00:05:05 +0000 (17:05 -0700)]
[NetLabel]: audit fixups due to delayed feedback
Fix some issues Steve Grubb had with the way NetLabel was using the audit
subsystem. This should make NetLabel more consistent with other kernel
generated audit messages specifying configuration changes.
Signed-off-by: Paul Moore <paul.moore@hp.com>
Acked-by: Steve Grubb <sgrubb@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Use early clobber in semaphores
The new code always clobbers the result early, so tell gcc about it.
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Define vsyscall cache as blob to make clearer that user space shouldn't use it
Signed-off-by: Andi Kleen <ak@suse.de>
Vivek Goyal [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Re-positioning the bss segment
[AK: This apparently broke some systems, but we need it to fix
a compile problem with old binutils and in theory the patch
is correct. So let's try reenabling it again.]
o Currently bss segment is being placed somewhere in the middle (after .data)
section and after bss lots of init section and data sections are coming.
Is it intentional?
o One side effect of placing bss in the middle is that objcopy keeps the
bss in raw binary image (vmlinux.bin) hence unnecessarily increasing
the size of raw binary image. (In my case ~600K). It also increases
the size of generated bzImage, though the increase is very small
(896 bytes), probably a very high compression ratio for stream
of zeros.
o This patch moves the bss at the end hence reducing the size of
bzImage by 896 bytes and size of vmlinux.bin by 600K.
o This change benefits in the context of relocatable kernel patches. If
kernel bss is not part of compressed data (vmlinux.bin) then it does
not have to be decompressed and this area can be used by the decompressor
for its execution hence keeping the memory requirements bounded and
decompressor code does not stomp over any other data loaded beyond
kernel image (As might be the case with bootloaders like kexec).
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Use ARRAY_SIZE in setup.c
Based on i386 patch from Bjorn.
Signed-off-by: Andi Kleen <ak@suse.de>
Bjorn Helgaas [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] i386: replace intermediate array-size definitions with ARRAY_SIZE()
Code is easier to validate if array sizes aren't hidden behind extra
#defines.
Signed-off-by: Bjorn Helgaas <bjorn.helgaas@hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
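As an illustration of the commit above, here is the classic form of the macro applied directly to a table, with no intermediate size #define. The table contents and the helper total_span() are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Classic definition: element count of a true array (not a pointer). */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical table of address ranges. */
struct sketch_resource {
	const char *name;
	unsigned long start, end;
};

static const struct sketch_resource sketch_resources[] = {
	{ "a", 0x00, 0x1f },
	{ "b", 0x20, 0x21 },
	{ "c", 0x40, 0x5f },
};

/* Sum of the inclusive range lengths, looping with ARRAY_SIZE directly. */
static size_t total_span(void)
{
	size_t total = 0;

	for (size_t i = 0; i < ARRAY_SIZE(sketch_resources); i++)
		total += sketch_resources[i].end - sketch_resources[i].start + 1;
	assert(total > 0);
	return total;
}
```

The loop bound is visibly tied to the array it iterates over, so adding a table entry cannot leave a stale count behind.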
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] x86: Clean up x86 NMI sysctls
Use prototypes in headers
Don't define panic_on_unrecovered_nmi for all architectures
Cc: dzickus@redhat.com
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Refactor some duplicated code in mpparse.c
No logic changes
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Document iommu=panic
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Fix broken indentation in iommu_setup
No functional changes; only white space.
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Allow disabling DAC using command line options
Might or might not work around some reported bugs on VIA systems.
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] Add proper sparse __user casts to __copy_to_user_inatomic
Noticed by Al Viro
Cc: viro@ftp.linux.org.uk
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:55 +0000 (01:47 +0200)]
[PATCH] i386: Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
Andi Kleen [Fri, 29 Sep 2006 23:47:54 +0000 (01:47 +0200)]
[PATCH] Update defconfig
Signed-off-by: Andi Kleen <ak@suse.de>
David S. Miller [Thu, 28 Sep 2006 02:52:35 +0000 (19:52 -0700)]
[SERIAL] sunzilog: Mark sunzilog_init_hw as __devinit.
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 28 Sep 2006 02:43:02 +0000 (19:43 -0700)]
[SPARC]: Don't zero out tail during copy_from_user_inatomic().
Actually, since we use the same code for all the copying
types in and out of userspace, we check at runtime whether
preemption is disabled.
Signed-off-by: David S. Miller <davem@davemloft.net>
Ollie Wild [Fri, 29 Sep 2006 22:50:28 +0000 (15:50 -0700)]
[PATCH] uml build fix
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
David Woodhouse [Fri, 29 Sep 2006 22:50:25 +0000 (15:50 -0700)]
[PATCH] MLSXFRM: fix mis-labelling of child sockets
Accepted connections of types other than AF_INET, AF_INET6, AF_UNIX won't
have an appropriate label derived from the peer, so don't use it.
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Acked-by: Paul Moore <paul.moore@hp.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Linus Torvalds [Fri, 29 Sep 2006 22:18:22 +0000 (15:18 -0700)]
Merge branch 'for-linus' of /linux/kernel/git/roland/infiniband
* 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: (33 commits)
IB/ipath: Fix lockdep error upon "ifconfig ibN down"
IB/ipath: Fix races with ib_resize_cq()
IB/ipath: Support new PCIE device, QLE7142
IB/ipath: Set CPU affinity early
IB/ipath: Fix EEPROM read when driver is compiled with -Os
IB/ipath: Fix and recover TXE piobuf and PBC parity errors
IB/ipath: Change HT CRC message to indicate how to resolve problem
IB/ipath: Clean up module exit code
IB/ipath: Call mtrr_del with correct arguments
IB/ipath: Flush RWQEs if access error or invalid error seen
IB/ipath: Improved support for PowerPC
IB/ipath: Drop unnecessary "(void *)" casts
IB/ipath: Support multiple simultaneous devices of different types
IB/ipath: Fix mismatch in shifts and masks for printing debug info
IB/ipath: Fix compiler warnings and errors on non-x86_64 systems
IB/ipath: Print more informative parity error messages
IB/ipath: Ensure that PD of MR matches PD of QP checking the Rkey
IB/ipath: RC and UC should validate SLID and DLID
IB/ipath: Only allow complete writes to flash
IB/ipath: Count SRQs properly
...
Linus Torvalds [Fri, 29 Sep 2006 16:36:55 +0000 (09:36 -0700)]
Merge git://oss.sgi.com:8090/xfs/xfs-2.6
* git://oss.sgi.com:8090/xfs/xfs-2.6: (49 commits)
[XFS] Remove v1 dir trace macro - missed in a past commit.
[XFS] 955947: Infinite loop in xfs_bulkstat() on formatter() error
[XFS] pv 956241, author: nathans, rv: vapo - make ino validation checks
[XFS] pv 956240, author: nathans, rv: vapo - Minor fixes in
[XFS] Really fix use after free in xfs_iunpin.
[XFS] Collapse sv_init and init_sv into just the one interface.
[XFS] standardize on one sema init macro
[XFS] Reduce endian flipping in alloc_btree, same as was done for
[XFS] Minor cleanup from dio locking fix, remove an extra conditional.
[XFS] Fix kmem_zalloc_greedy warnings on 64 bit platforms.
[XFS] pv 955157, rv bnaujok - break the loop on EFAULT formatter() error
[XFS] pv 955157, rv bnaujok - break the loop on formatter() error
[XFS] Fixes the leak in reservation space because we weren't ungranting
[XFS] Add lock annotations to xfs_trans_update_ail and
[XFS] Fix a porting botch on the realtime subvol growfs code path.
[XFS] Minor code rearranging and cleanup to prevent some coverity false
[XFS] Remove a no-longer-correct debug assert from dio completion
[XFS] Add a greedy allocation interface, allocating within a min/max size
[XFS] Improve error handling for the zero-fsblock extent detection code.
[XFS] Be more defensive with page flags (error/private) for metadata
...
Yoichi Yuasa [Fri, 29 Sep 2006 06:42:38 +0000 (08:42 +0200)]
[PATCH] i2c-sibyte: Fix modular build breakage
Fix undefined reference in i2c_sibyte_exit().
drivers/built-in.o: In function `i2c_sibyte_exit':
i2c-sibyte.c:(.exit.text+0x368): undefined reference to `i2c_del_bus'
i2c-sibyte.c:(.exit.text+0x368): relocation truncated to fit: R_MIPS_26 against `i2c_del_bus'
i2c-sibyte.c:(.exit.text+0x38c): undefined reference to `i2c_del_bus'
i2c-sibyte.c:(.exit.text+0x38c): relocation truncated to fit: R_MIPS_26 against `i2c_del_bus'
Signed-off-by: Yoichi Yuasa <yoichi_yuasa@tripeaks.co.jp>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Paul Jackson [Fri, 29 Sep 2006 09:01:48 +0000 (02:01 -0700)]
[PATCH] cpuset: fix obscure attach_task vs exiting race
Fix obscure race condition in kernel/cpuset.c attach_task() code.
There is basically zero chance of anyone accidentally being harmed by this
race.
It requires a special 'micro-stress' load and a special timing loop hacks
in the kernel to hit in less than an hour, and even then you'd have to hit
it hundreds or thousands of times, followed by some unusual and senseless
cpuset configuration requests, including removing the top cpuset, to cause
any visibly harmful effects.
One could, with perhaps a few days or weeks of such effort, get the
reference count on the top cpuset below zero, and manage to crash the
kernel by asking to remove the top cpuset.
I found it by code inspection.
The race was introduced when 'the_top_cpuset_hack' was introduced, and one
piece of code was not updated. An old check for a possibly null task
cpuset pointer needed to be changed to a check for a task marked
PF_EXITING. The pointer can't be null anymore, thanks to
the_top_cpuset_hack (documented in kernel/cpuset.c). But the task could
have gone into PF_EXITING state after it was found in the task_list scan.
If a task is PF_EXITING in this code, it is possible that its task->cpuset
pointer is pointing to the top cpuset due to the_top_cpuset_hack, rather
than because the top_cpuset was that task's last valid cpuset. In that
case, the wrong cpuset reference counter would be decremented.
The fix is trivial. Instead of failing the system call if the task's cpuset
pointer is null here, fail it if the task is in PF_EXITING state.
The code for 'the_top_cpuset_hack' that changes an exiting task's cpuset to
the top_cpuset is done without locking, so could happen at any time. But it
is done during the exit handling, after the PF_EXITING flag is set. So if
we verify that a task is still not PF_EXITING after we copy out its cpuset
pointer (into 'oldcs', below), we know that 'oldcs' is not one of these
hack references to the top_cpuset.
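The check described above can be sketched in userspace roughly as follows. This is only a model under stated assumptions: struct cpuset_model, struct task_model and attach_task_model() are made-up names, plain ints stand in for the atomic reference counts, and the locking in the real kernel/cpuset.c is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same name as the kernel task flag; the value is only illustrative here. */
#define PF_EXITING 0x00000004

/* Hypothetical stand-ins for the kernel structures. */
struct cpuset_model {
	int count;			/* reference count (atomic in the kernel) */
};

struct task_model {
	unsigned long flags;
	struct cpuset_model *cpuset;	/* never NULL, per the_top_cpuset_hack */
};

/* Move 'tsk' into 'cs'; returns 0 on success or -ESRCH for an exiting task. */
static int attach_task_model(struct cpuset_model *cs, struct task_model *tsk)
{
	struct cpuset_model *oldcs;

	assert(cs != NULL && tsk != NULL);

	/* Copy out the old cpuset pointer first ... */
	oldcs = tsk->cpuset;

	/*
	 * ... then verify the task is still not exiting. The old test was
	 * "oldcs == NULL", which can no longer be true. If the task is
	 * exiting, 'oldcs' may be a hack reference to the top cpuset, so
	 * its count must not be touched.
	 */
	if (tsk->flags & PF_EXITING)
		return -ESRCH;

	cs->count++;
	tsk->cpuset = cs;
	oldcs->count--;
	return 0;
}
```

In this model an exiting task leaves every reference count untouched, which is the property the fix is after.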
Signed-off-by: Paul Jackson <pj@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Kirill Korotaev [Fri, 29 Sep 2006 09:01:47 +0000 (02:01 -0700)]
[PATCH] SubmittingPatches: add a note about "format=flowed" when sending patches
Add a note about "format=flowed" when sending patches and explain how to
fix Mozilla. Thunderbird has similar options.
Signed-off-by: Kirill Korotaev <dev@openvz.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ingo Molnar [Fri, 29 Sep 2006 09:01:46 +0000 (02:01 -0700)]
[PATCH] lockdep core: improve the lock-chain-hash
With CONFIG_DEBUG_LOCK_ALLOC turned off I was getting sporadic failures in
the locking self-test:
------------>
| Locking API testsuite:
----------------------------------------------------------------------------
                                 | spin |wlock |rlock |mutex | wsem | rsem |
  --------------------------------------------------------------------------
                     A-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
                 A-B-B-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-B-C-C-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
             A-B-C-A-B-C deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-B-C-C-D-D-A deadlock:  ok  |FAILED|  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-D-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |  ok  |
         A-B-C-D-B-C-D-A deadlock:  ok  |  ok  |  ok  |  ok  |  ok  |FAILED|
After much debugging it turned out to be caused by accidental chain-hash
key collisions. The current hash is:
    #define iterate_chain_key(key1, key2) \
        (((key1) << MAX_LOCKDEP_KEYS_BITS/2) ^ \
        ((key1) >> (64-MAX_LOCKDEP_KEYS_BITS/2)) ^ \
        (key2))
where MAX_LOCKDEP_KEYS_BITS is 11. This hash is pretty good as it will
shift by 5 bits in every iteration, where every new ID 'mixed' into the
hash would have up to 11 bits. But because there was a 6-bit overlap
between subsequent IDs and their high bits tended to be similar, there was
a chance for accidental chain-hash collision for a low number of locks
held.
The solution is to shift by 11 bits:
    #define iterate_chain_key(key1, key2) \
        (((key1) << MAX_LOCKDEP_KEYS_BITS) ^ \
        ((key1) >> (64-MAX_LOCKDEP_KEYS_BITS)) ^ \
        (key2))
This keeps the hash perfect up to 5 locks held, but even above that the
hash is still good because 11 is relatively prime to the total 64
bits, so a complete match will only occur after 64 held locks (which doesn't
happen in Linux). Even after 5 locks held, entropy of the 5 IDs mixed into
the hash is already good enough so that overlap doesn't generate a colliding
hash ID.
With this change the false positives went away.
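The difference between the two variants can be tried out in userspace with a sketch like the one below. The two step functions mirror the macros quoted in this message; chain_key() and the chain_step_fn type are made up for illustration, and the starting key of 0 is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_LOCKDEP_KEYS_BITS 11

/* Old variant: rotate left by 11/2 = 5 bits, then mix in the new ID. */
static uint64_t iterate_chain_key_old(uint64_t key1, uint64_t key2)
{
	return (key1 << (MAX_LOCKDEP_KEYS_BITS / 2)) ^
	       (key1 >> (64 - MAX_LOCKDEP_KEYS_BITS / 2)) ^
	       key2;
}

/* New variant: rotate left by the full 11 bits, then mix in the new ID. */
static uint64_t iterate_chain_key_new(uint64_t key1, uint64_t key2)
{
	return (key1 << MAX_LOCKDEP_KEYS_BITS) ^
	       (key1 >> (64 - MAX_LOCKDEP_KEYS_BITS)) ^
	       key2;
}

typedef uint64_t (*chain_step_fn)(uint64_t, uint64_t);

/* Hypothetical helper: fold a chain of lock class IDs into one key. */
static uint64_t chain_key(chain_step_fn step, const unsigned int *ids, int n)
{
	uint64_t key = 0;	/* starting value assumed for this sketch */
	int i;

	for (i = 0; i < n; i++) {
		/* every ID must fit into MAX_LOCKDEP_KEYS_BITS bits */
		assert(ids[i] < (1u << MAX_LOCKDEP_KEYS_BITS));
		key = step(key, ids[i]);
	}
	return key;
}
```

With the 5-bit shift, the two-lock chains {1, 65} and {3, 1} fold to the same key, because the low bits of the first ID overlap the high bits of the second. With the 11-bit shift each ID gets its own bit range and the keys differ.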
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Vivek Goyal [Fri, 29 Sep 2006 09:01:45 +0000 (02:01 -0700)]
[PATCH] Kcore elf note namesz field fix
o As per the ELF specification, the ELF note "namesz" field contains the
length of "name" including the terminating null character. Currently
we fill in "namesz" without taking the size of the null character into
account.
o Kexec-tools performs this check diligently, hence I ran into the issue
while trying to open /proc/kcore in kexec-tools for some info.
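The size rule can be illustrated with a small userspace sketch. The helper names below are made up and are not the fs/proc/kcore.c functions. The sketch assumes the usual note layout of three 32-bit header words followed by the name and descriptor, each padded to a 4-byte boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value for the note's namesz field: the name plus its terminating NUL. */
static size_t note_namesz(const char *name)
{
	assert(name != NULL);
	return strlen(name) + 1;
}

/* Round up to the 4-byte alignment assumed for name and descriptor. */
static size_t note_pad4(size_t n)
{
	return (n + 3) & ~(size_t)3;
}

/* Total bytes for one note: header words, padded name, padded descriptor. */
static size_t note_size(const char *name, size_t descsz)
{
	return 3 * 4 + note_pad4(note_namesz(name)) + note_pad4(descsz);
}
```

For the name "CORE" this gives a namesz of 5 rather than 4. A reader that validates the field, as kexec-tools does, can tell the two apart.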
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andrew Morton [Fri, 29 Sep 2006 09:01:44 +0000 (02:01 -0700)]
[PATCH] expand_fdtable(): remove pointless unlock+lock
This unlock/lock on a super-unlikely path isn't worth the kernel text.
Cc: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Vadim Lobanov [Fri, 29 Sep 2006 09:01:43 +0000 (02:01 -0700)]
[PATCH] Clean up expand_fdtable() and expand_files()
Perform a code cleanup against the expand_fdtable() and expand_files()
functions inside fs/file.c. It aims to make the flow of code within these
functions simpler and easier to understand, via added comments and modest
refactoring.
Signed-off-by: Vadim Lobanov <vlobanov@speakeasy.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alexey Dobriyan [Fri, 29 Sep 2006 09:01:43 +0000 (02:01 -0700)]
[PATCH] Documentation/SubmittingDrivers: minor update
* fix copright typo
* remove trailing whitespace
* remove Kernel Traffic from Resources. Zack, it was great reading!
* Name Arjan by name and fix URL of "How to NOT" paper.
* Remove "Last updated" tag.
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Cox [Fri, 29 Sep 2006 09:01:41 +0000 (02:01 -0700)]
[PATCH] audit/accounting: tty locking
Add tty locking around the audit and accounting code.
The whole current->signal-> locking is all deeply strange but it's for
someone else to sort out. Add rather than replace the lock for acct.c
Signed-off-by: Alan Cox <alan@redhat.com>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Cox [Fri, 29 Sep 2006 09:01:40 +0000 (02:01 -0700)]
[PATCH] Fix locking for tty drivers when doing urgent characters
If you send a priority character (as is done for flow control) then the tty
driver can either have its own method for "jumping the queue" or the character
can be queued normally. In the latter case we call the write method but
without the atomic_write_lock taken elsewhere.
Make this consistent. Note that the send_xchar method if implemented remains
outside of the lock as it can jump ahead of a current write so must not be
locked out by it.
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Cox [Fri, 29 Sep 2006 09:01:39 +0000 (02:01 -0700)]
[PATCH] specialix - remove private speed decoding
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Cox [Fri, 29 Sep 2006 09:01:38 +0000 (02:01 -0700)]
[PATCH] istallion: Remove private baud rate decoding, which is also broken in this case on some platforms
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Alan Cox [Fri, 29 Sep 2006 09:01:38 +0000 (02:01 -0700)]
[PATCH] generic_serial: remove private decoding of baud rate bits
The driver has no business doing this work itself any more and hasn't for some
years. When the new speed stuff goes in this will break entirely so fix it up
ready.
Also remove a #if 0 around a comment....
Signed-off-by: Alan Cox <alan@redhat.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Atsushi Nemoto [Fri, 29 Sep 2006 09:01:37 +0000 (02:01 -0700)]
[PATCH] RTC: more XSTP/VDET support for rtc-rs5c348 driver
If the chip detected "oscillator stop" condition, show a warning message.
And initialize it with the Epoch time instead of leaving it with unknown
date/time.
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Adrian Bunk [Fri, 29 Sep 2006 09:01:36 +0000 (02:01 -0700)]
[PATCH] build sound/sound_firmware.c only for OSS
All sound/sound_firmware.c contains is mod_firmware_load() that is a legacy
API only used by some OSS drivers.
This patch builds it into its own sound_firmware module, which is built only
when CONFIG_SOUND_PRIME is set, making the kernel slightly smaller for ALSA
users.
[alan@lxorguk.ukuu.org.uk: comment fix]
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Acked-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rusty Russell [Fri, 29 Sep 2006 09:01:35 +0000 (02:01 -0700)]
[PATCH] stop_machine.c copyright
I had to look back: this code was extracted from the module.c code in 2005.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andreas Gruenbacher [Fri, 29 Sep 2006 09:01:35 +0000 (02:01 -0700)]
[PATCH] Access Control Lists for tmpfs
Add access control lists for tmpfs.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Andreas Gruenbacher [Fri, 29 Sep 2006 09:01:34 +0000 (02:01 -0700)]
[PATCH] Generic infrastructure for acls
The patches solve the following problem: We want to grant access to devices
based on who is logged in from where, etc. This includes switching back and
forth between multiple user sessions, etc.
Using ACLs to define device access for logged-in users gives us all the
flexibility we need in order to fully solve the problem.
Device special files nowadays usually live on tmpfs, hence tmpfs ACLs.
Different distros have come up with solutions that solve the problem to
different degrees: SUSE uses a resource manager which tracks login sessions
and sets ACLs on device inodes as appropriate. RedHat uses pam_console, which
changes the primary file ownership to the logged-in user. Others use a set of
groups that users must be in in order to be granted the appropriate accesses.
The freedesktop.org project plans to implement a combination of a
console-tracker and a HAL-device-list based solution to grant access to
devices to users, and more distros will likely follow this approach.
These patches were first posted here on 2 February 2005, and again
on 8 January 2006. We have been shipping them in SLES9 and SLES10 with
no problems reported. The previous submission is archived here:
http://lkml.org/lkml/2006/1/8/229
http://lkml.org/lkml/2006/1/8/230
http://lkml.org/lkml/2006/1/8/231
This patch:
Add some infrastructure for access control lists on in-memory
filesystems such as tmpfs.
Signed-off-by: Andreas Gruenbacher <agruen@suse.de>
Cc: Hugh Dickins <hugh@veritas.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Chris Snook [Fri, 29 Sep 2006 09:01:33 +0000 (02:01 -0700)]
[PATCH] enforce RLIMIT_NOFILE in poll()
POSIX states that poll() shall fail with EINVAL if nfds > OPEN_MAX. In
this context, POSIX is referring to sysconf(OPEN_MAX), which is the value
of current->signal->rlim[RLIMIT_NOFILE].rlim_cur in the linux kernel, not
the compile-time constant which happens to also be named OPEN_MAX. In the
current code, an application may poll up to max_fdset file descriptors,
even if this exceeds RLIMIT_NOFILE. The current code also breaks
applications which poll more than max_fdset descriptors, which worked circa
2.4.18 when the check was against NR_OPEN, which is 1024*1024. This patch
enforces the limit precisely as POSIX defines, even if RLIMIT_NOFILE has
been changed at run time with ulimit -n.
To elaborate on the rationale for this, there are three cases:
1) RLIMIT_NOFILE is at the default value of 1024
In this (default) case, the patch changes nothing. Calls with nfds > 1024
fail with EINVAL both before and after the patch, and calls with nfds <=
1024 pass the check both before and after the patch, since 1024 is the
initial value of max_fdset.
2) RLIMIT_NOFILE has been raised above the default
In this case, poll() becomes more permissive, allowing polling up to
RLIMIT_NOFILE file descriptors even if fewer than 1024 have been opened.
The patch won't introduce new errors here. If an application somehow
depends on poll() failing when it polls with duplicate or invalid file
descriptors, it's already broken, since this is already allowed below 1024,
and will also work above 1024 if enough file descriptors have been open at
some point to cause max_fdset to have been increased above nfds.
3) RLIMIT_NOFILE has been lowered below the default
In this case, the system administrator or the user has gone out of their
way to protect the system from inefficient (or malicious) applications
wasting kernel memory. The current code allows polling up to 1024 file
descriptors even if RLIMIT_NOFILE is much lower, which is not what the user
or administrator intended. Well-written applications which only poll
valid, unique file descriptors will never notice the difference, because
they'll hit the limit on open() first. If an application gets broken
because of the patch in this case, then it was already poorly/maliciously
designed, and allowing it to work in the past was a violation of POSIX and
a DoS risk on low-resource systems.
With this patch, poll() will permit exactly what POSIX suggests, no more,
no less, and for any run-time value set with ulimit -n, not just 256 or
1024. There are existing apps which poll a large number of file
descriptors, some of which may be invalid, and if those numbers straddle
1024, they currently fail with or without the patch in -mm, though they
worked fine under 2.4.18.
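The resulting behaviour can be observed from userspace with a sketch along these lines. It assumes a Linux kernel that includes this check and a hard RLIMIT_NOFILE of at least the requested value. poll_under_limit() is a made-up helper, not part of any library.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/resource.h>

/*
 * Set the soft RLIMIT_NOFILE to 'limit', then poll 'nfds' ignored
 * descriptors (fd = -1) with a zero timeout.
 * Returns poll()'s return value on success, or -errno on failure.
 */
static int poll_under_limit(rlim_t limit, nfds_t nfds)
{
	struct rlimit rl;
	struct pollfd *fds;
	nfds_t i;
	int ret;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -errno;
	assert(limit <= rl.rlim_max);	/* only move the soft limit */
	rl.rlim_cur = limit;
	if (setrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -errno;

	fds = calloc(nfds ? nfds : 1, sizeof(*fds));
	if (!fds)
		return -ENOMEM;
	for (i = 0; i < nfds; i++)
		fds[i].fd = -1;		/* negative fds are skipped by poll() */

	ret = poll(fds, nfds, 0);
	if (ret < 0)
		ret = -errno;
	free(fds);
	return ret;
}
```

With the soft limit set to 64, polling 64 entries should return 0, while polling 65 entries should fail with EINVAL, even though no descriptor in the array is valid.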
Signed-off-by: Chris Snook <csnook@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Eric Sesterhenn [Fri, 29 Sep 2006 09:01:32 +0000 (02:01 -0700)]
[PATCH] Uninitialized variable in drivers/net/wan/syncppp.c
For len equal to 4, we never call sppp_lcp_conf_parse_options(),
therefore rmagic does not get initialized.
Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de>
Acked-by: Paul Fulghum <paulkf@microgate.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ian S. Nelson [Fri, 29 Sep 2006 09:01:31 +0000 (02:01 -0700)]
[PATCH] /sys/modules: allow full length section names
I've been using systemtap for some debugging and I noticed that it can't
probe a lot of modules. Turns out it's kind of silly, the sections section
of /sys/module is limited to 32byte filenames and many of the actual
sections are a bit longer than that.
[akpm@osdl.org: rewrite to use dynamic allocation]
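The two approaches can be contrasted with a userspace model like the one below. It only models the idea and is not the kernel/module.c code: the helper names and OLD_SECT_NAME_LEN are made up, and treating the old behaviour as silent truncation is an assumption of the sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define OLD_SECT_NAME_LEN 32	/* the 32-byte limit mentioned above */

/* Modelled old behaviour: copy into a fixed buffer, truncating long names. */
static void sect_name_fixed(char dst[OLD_SECT_NAME_LEN], const char *name)
{
	assert(name != NULL);
	strncpy(dst, name, OLD_SECT_NAME_LEN - 1);
	dst[OLD_SECT_NAME_LEN - 1] = '\0';
}

/* Modelled new behaviour: allocate exactly as much as the name needs. */
static char *sect_name_dynamic(const char *name)
{
	size_t len;
	char *copy;

	assert(name != NULL);
	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, name, len);
	return copy;		/* caller frees; NULL on allocation failure */
}
```

A 44-character section name keeps only its first 31 characters in the fixed buffer, while the dynamically allocated copy preserves it in full.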
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Matthew Wilcox [Fri, 29 Sep 2006 09:01:30 +0000 (02:01 -0700)]
[PATCH] SuperH list is moderated
I just got a bounce telling me my contributions aren't welcome.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Pavel Machek [Fri, 29 Sep 2006 09:01:29 +0000 (02:01 -0700)]
[PATCH] network block device is mostly known as "NBD"
People search maintainers for NBD and then decide it is not
maintained.
(akpm: ditto LVM. And other things, but I forget what they were)
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>