Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix support for external clock source #2

Closed
koalo opened this issue May 4, 2013 · 0 comments
Closed

Fix support for external clock source #2

koalo opened this issue May 4, 2013 · 0 comments
Assignees
Labels

Comments

@koalo
Copy link
Owner

koalo commented May 4, 2013

The Raspberry Pi assumes 40fs bit clock even for external clocks. Change that to 64fs, 48fs... It should be switchable.

@ghost ghost assigned koalo May 4, 2013
@koalo koalo closed this as completed in 609b3b8 May 6, 2013
koalo pushed a commit that referenced this issue Aug 27, 2013
Since commit 89c8d91 ("tty: localise the lock") I see a dead lock
in one of my dummy_hcd + g_nokia test cases. The first run was usually
okay, the second often resulted in a splat by lockdep and the third was
usually a dead lock.
Lockdep complained about tty->hangup_work and tty->legacy_mutex taken
both ways:
| ======================================================
| [ INFO: possible circular locking dependency detected ]
| 3.7.0-rc6+ raspberrypi#204 Not tainted
| -------------------------------------------------------
| kworker/2:1/35 is trying to acquire lock:
|  (&tty->legacy_mutex){+.+.+.}, at: [<c14051e6>] tty_lock_nested+0x36/0x80
|
| but task is already holding lock:
|  ((&tty->hangup_work)){+.+...}, at: [<c104f6e4>] process_one_work+0x124/0x5e0
|
| which lock already depends on the new lock.
|
| the existing dependency chain (in reverse order) is:
|
| -> #2 ((&tty->hangup_work)){+.+...}:
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c104d82d>] flush_work+0x3d/0x240
|        [<c12e6986>] tty_ldisc_flush_works+0x16/0x30
|        [<c12e7861>] tty_ldisc_release+0x21/0x70
|        [<c12e0dfc>] tty_release+0x35c/0x470
|        [<c1105e28>] __fput+0xd8/0x270
|        [<c1105fcd>] ____fput+0xd/0x10
|        [<c1051dd9>] task_work_run+0xb9/0xf0
|        [<c1002a51>] do_notify_resume+0x51/0x80
|        [<c140550a>] work_notifysig+0x35/0x3b
|
| -> #1 (&tty->legacy_mutex/1){+.+...}:
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c140276c>] mutex_lock_nested+0x6c/0x2f0
|        [<c14051e6>] tty_lock_nested+0x36/0x80
|        [<c1405279>] tty_lock_pair+0x29/0x70
|        [<c12e0bb8>] tty_release+0x118/0x470
|        [<c1105e28>] __fput+0xd8/0x270
|        [<c1105fcd>] ____fput+0xd/0x10
|        [<c1051dd9>] task_work_run+0xb9/0xf0
|        [<c1002a51>] do_notify_resume+0x51/0x80
|        [<c140550a>] work_notifysig+0x35/0x3b
|
| -> #0 (&tty->legacy_mutex){+.+.+.}:
|        [<c107f3c9>] __lock_acquire+0x1189/0x16a0
|        [<c107fe74>] lock_acquire+0x84/0x190
|        [<c140276c>] mutex_lock_nested+0x6c/0x2f0
|        [<c14051e6>] tty_lock_nested+0x36/0x80
|        [<c140523f>] tty_lock+0xf/0x20
|        [<c12df8e4>] __tty_hangup+0x54/0x410
|        [<c12dfcb2>] do_tty_hangup+0x12/0x20
|        [<c104f763>] process_one_work+0x1a3/0x5e0
|        [<c104fec9>] worker_thread+0x119/0x3a0
|        [<c1055084>] kthread+0x94/0xa0
|        [<c140ca37>] ret_from_kernel_thread+0x1b/0x28
|
|other info that might help us debug this:
|
|Chain exists of:
|  &tty->legacy_mutex --> &tty->legacy_mutex/1 --> (&tty->hangup_work)
|
| Possible unsafe locking scenario:
|
|       CPU0                    CPU1
|       ----                    ----
|  lock((&tty->hangup_work));
|                               lock(&tty->legacy_mutex/1);
|                               lock((&tty->hangup_work));
|  lock(&tty->legacy_mutex);
|
| *** DEADLOCK ***

Before the path mentioned tty_ldisc_release() look like this:

|	tty_ldisc_halt(tty);
|	tty_ldisc_flush_works(tty);
|	tty_lock();

As it can be seen, it first flushes the workqueue and then grabs the
tty_lock. Now we grab the lock first:

|	tty_lock_pair(tty, o_tty);
|	tty_ldisc_halt(tty);
|	tty_ldisc_flush_works(tty);

so lockdep's complaint seems valid.

The earlier version of this patch took the ldisc_mutex since the other
user of tty_ldisc_flush_works() (tty_set_ldisc()) did this.
Peter Hurley then said that it is should not be requried. Since it
wasn't done earlier, I dropped this part.
The code under tty_ldisc_kill() was executed earlier with the tty lock
taken so it is taken again.

I was able to reproduce the deadlock on v3.8-rc1, this patch fixes the
problem in my testcase. I didn't notice any problems so far.

Cc: Alan Cox <[email protected]>
Cc: Peter Hurley <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
…x/kernel/git/paulg/linux

Paul Gortmaker says:

====================
The Ethernet-HowTo was maintained for roughly 10 years, from 1993 to 2003.
Fortunately sane hardware probing and auto detection (via PCI and ISA/PnP)
largely made the document a relic of the past, hence it being abandoned
a decade ago.

However, there is one last useful thing that we can extract from the
effort made in maintaining that document.  We can use it to guide us
with respect to what rare, experimental and/or super ancient 10Mbit
ISA drivers don't make sense to maintain in-tree anymore.

Nobody will argue that ISA is obsolete.  Availability went away at about
the time Pentium3 motherboards moved from 500MHz Slot1/SECC processors
to the green 500MHz Socket 370 Pentium3 chips, at the turn of the century.

In theory, it is possible that someone could still be running one of these
12+ year old P3 machines and want 3.9+ bleeding edge kernels (but unlikely).
In light of the above (remote) possibility, we can defer the removal of some
ISA network drivers that were highly popular and well tested.  Typically
that means the stuff more from the mid to late '90s, some with ISA PnP
support, like the 3c509, the wd/SMC 8390 based stuff, PCnet/lance etc.

But a lot of other drivers, typically from the early 1990s were for rare
hardware, and experimental (to the point of requiring a cron job that would
do a test ping, and then ifconfig down/up and/or a rmmod/insmod!).  And
some of these drivers (znet, and lp486e to name two) are physically tied
to platforms with on motherboard ethernet -- of 486 machines that date
from the early 1990s and can only have single digit amounts of memory.

What I'd like to achieve here with this series, is to get rid of those old
drivers that are no longer being used.  In an earlier discussion where
I'd proposed deleting a single driver, Alan suggested we instead dump
all the historical stuff in one go, to make it "...immediately obvious
where the break point is..."[1] and that it was "perfectly reasonable it
(and a pile of other ISA cards) ought to be shown the door"[2].  So that
is the goal here - make a clear line in the sand where the really ancient
stuff finally gets kicked to the curb.

Two old parallel port drivers are considered for removal here as well,
since in early 386/486 ISA machines, the parallel port was typically found
with the UARTS on the multi-I/O ISA controller card.  These drivers also date
from the early 1990's; parallel ports are no longer found on modern boards,
and their performance was not even capable of 10% of 10Mbit bandwidth.

Allow me a preemptive justification against the inevitable comments from
well meaning bystanders who suggest "why not just leave all this alone?".
Dead drivers cost us all if they are left in tree.  If you think that
is false, then please first consider:

-every time you type "git status", you are checking to see if modifications
 have been made by you to all that dead code.

-every time you type "git grep <regex>" you are searching through files
 which contain that dead code that simply does not interest you.

-every time you build a "allyesconfig" and an "allmodconfig" (don't tell
 me you skip this step before submitting your changes to a maintainer),
 you waste CPU cycles building this dead code.

-every time there is a tree wide API change, or cleanup, or file relocation,
 we pay the cost of updating dead code, or moving dead code.

-daily regression tests (take linux-next as the most transparent
 example) spend time building (and possibly running) this dead code.

-hard working people who regularly run auditing tools looking for lurking
 bugs (sparse/coverity/smatch/coccinelle) are wasting time checking for,
 and fixing bugs in this dead code.

This last one is key.  Please take a look at the git history for the
files that are proposed for removal here.  Look at the git history for
any one of them ("git whatchanged --follow drivers/net/.../driver.c")
Mentally sort the changes into two bins -- (1) the robotic tree-wide
changes, and (2) the "look I found a real run-time bug while using this"
category.  You will see that category #2 is essentially empty.

Further to that, realize that drivers don't simply disappear.  We are
not operating in the binary-only distribution space like other OS.  All
these drivers remain in the git history forever.  If a person is an
enthusiast for extreme legacy hardware, they are probably already
customizing their kernel source and building it themselves to support
such systems.  Also keep in mind that they could still build the 3.8
kernel exactly as-is, and run it (or a 3.8.x stable variant of it) for
several more years if they were really determined to cling to these old
experimental ISA drivers for some reason.

In summary, I hope that folks can be pragmatic about this, and not
get swept up in nostalgia.  Ask yourself whether it is realistic to
expect a person would have a genuine use case where they would
need to build a 3.9+ modern kernel and install it on some legacy hardware
that has no option but to absolutely _require_ one of the drivers
that are deleted here.

The following series was created with --irreversible-delete for
ease of review (it skips showing the content of files that are
deleted); however the complete patches can be pulled as per below.
====================

Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Allow multiple listener sockets to bind to the same port.

Motivation for soresuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have it's own listener socket.  This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets.  We have seen the  disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reusport the distribution is
uniform.

Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Motivation for soreuseport would be something like a web server
binding to port 80 running with multiple threads, where each thread
might have it's own listener socket.  This could be done as an
alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets.  We have seen the  disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reusport the distribution is
uniform.

Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Tom Herbert says:

====================
This series implements so_reuseport (SO_REUSEPORT socket option) for
TCP and UDP.  For TCP, so_reuseport allows multiple listener sockets
to be bound to the same port.  In the case of UDP, so_reuseport allows
multiple sockets to bind to the same port.  To prevent port hijacking
all sockets bound to the same port using so_reuseport must have the
same uid.  Received packets are distributed to multiple sockets bound
to the same port using a 4-tuple hash.

The motivating case for so_resuseport in TCP would be something like
a web server binding to port 80 running with multiple threads, where
each thread might have it's own listener socket.  This could be done
as an alternative to other models: 1) have one listener thread which
dispatches completed connections to workers. 2) accept on a single
listener socket from multiple threads.  In case #1 the listener thread
can easily become the bottleneck with high connection turn-over rate.
In case #2, the proportion of connections accepted per thread tends
to be uneven under high connection load (assuming simple event loop:
while (1) { accept(); process() }, wakeup does not promote fairness
among the sockets.  We have seen the  disproportion to be as high
as 3:1 ratio between thread accepting most connections and the one
accepting the fewest.  With so_reusport the distribution is
uniform.

The TCP implementation has a problem in that the request sockets for a
listener are attached to a listener socket.  If a SYN is received, a
listener socket is chosen and request structure is created (SYN-RECV
state).  If the subsequent ack in 3WHS does not match the same port
by so_reusport, the connection state is not found (reset) and the
request structure is orphaned.  This scenario would occur when the
number of listener sockets bound to a port changes (new ones are
added, or old ones closed).  We are looking for a solution to this,
maybe allow multiple sockets to share the same request table...

The motivating case for so_reuseport in UDP would be something like a
DNS server.  An alternative would be to recv on the same socket from
multiple threads.  As in the case of TCP, the load across these threads
tends to be disproportionate and we also see a lot of contection on
the socket lock.  Note that SO_REUSEADDR already allows multiple UDP
sockets to bind to the same port, however there is no provision to
prevent hijacking and nothing to distribute packets across all the
sockets sharing the same bound port.  This patch does not change the
semantics of SO_REUSEADDR, but provides usable functionality of it
for unicast.
====================

Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
… ISP83XX

Issue:
Mailbox command timed out after switching from polling mode to interrupt mode.

Events:-
 1. Mailbox interrupts are disabled
 2. FW generates AEN and at same time driver enables Mailbox Interrupt
 3. Driver issues new mailbox to Firmware

In above case driver will not get AEN interrupts generated by FW in step #2 as
FW generated this AEN when interrupts are disabled. During the same time
driver enabled the mailbox interrupt, so driver will not poll for interrupt.
Driver will never process AENs generated in step #2 and issues new mailbox to
FW, but now FW is not able to post mailbox completion as AENs generated before
are not processed by driver.

Fix:
Enable Mailbox / AEN interrupts before initializing FW in case of ISP83XX.
This will make sure we process all Mailbox and AENs in interrupt mode.

Signed-off-by: Vikas Chaudhary <[email protected]>
Signed-off-by: James Bottomley <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
This patch reduces the critical section protected by sco_conn_lock in
sco_conn_ready function. The lock is acquired only when it is really
needed.

This patch fixes the following lockdep warning which is generated
when the host terminates a SCO connection.

Today, this warning is a false positive. There is no way those
two threads reported by lockdep are running at the same time since
hdev->workqueue (where rx_work is queued) is single-thread. However,
if somehow this behavior is changed in future, we will have a
potential deadlock.

======================================================
[ INFO: possible circular locking dependency detected ]
3.8.0-rc1+ #7 Not tainted
-------------------------------------------------------
kworker/u:1H/1018 is trying to acquire lock:
 (&(&conn->lock)->rlock){+.+...}, at: [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]

but task is already holding lock:
 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}:
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa003436e>] sco_connect_cfm+0xbe/0x350 [bluetooth]
       [<ffffffffa0015d6c>] hci_event_packet+0xd3c/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

-> #0 (&(&conn->lock)->rlock){+.+...}:
       [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
       [<ffffffff81083011>] lock_acquire+0xb1/0xe0
       [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
       [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
       [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
       [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
       [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
       [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
       [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
       [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
       [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
       [<ffffffff81056021>] kthread+0xd1/0xe0
       [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
                               lock(&(&conn->lock)->rlock);
                               lock(slock-AF_BLUETOOTH-BTPROTO_SCO);
  lock(&(&conn->lock)->rlock);

 *** DEADLOCK ***

4 locks held by kworker/u:1H/1018:
 #0:  (hdev->name#2){.+.+.+}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #1:  ((&hdev->rx_work)){+.+.+.}, at: [<ffffffff8104d5f8>] process_one_work+0x258/0x4f0
 #2:  (&hdev->lock){+.+.+.}, at: [<ffffffffa000fbe9>] hci_disconn_complete_evt.isra.54+0x59/0x3c0 [bluetooth]
 #3:  (slock-AF_BLUETOOTH-BTPROTO_SCO){+.+...}, at: [<ffffffffa0033d5a>] sco_conn_del+0x8a/0xe0 [bluetooth]

stack backtrace:
Pid: 1018, comm: kworker/u:1H Not tainted 3.8.0-rc1+ #7
Call Trace:
 [<ffffffff813e92f9>] print_circular_bug+0x1fb/0x20c
 [<ffffffff81082215>] __lock_acquire+0x1465/0x1c70
 [<ffffffff81083011>] lock_acquire+0xb1/0xe0
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffff813efd01>] _raw_spin_lock+0x41/0x80
 [<ffffffffa0033ba6>] ? sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033ba6>] sco_chan_del+0x66/0x190 [bluetooth]
 [<ffffffffa0033d6d>] sco_conn_del+0x9d/0xe0 [bluetooth]
 [<ffffffffa0034653>] sco_disconn_cfm+0x53/0x60 [bluetooth]
 [<ffffffffa000fef3>] hci_disconn_complete_evt.isra.54+0x363/0x3c0 [bluetooth]
 [<ffffffffa000fbd0>] ? hci_disconn_complete_evt.isra.54+0x40/0x3c0 [bluetooth]
 [<ffffffffa00150f7>] hci_event_packet+0xc7/0x29b0 [bluetooth]
 [<ffffffff81202e90>] ? __dynamic_pr_debug+0x80/0x90
 [<ffffffff8133ff7d>] ? kfree_skb+0x2d/0x40
 [<ffffffffa0021644>] ? hci_send_to_monitor+0x1a4/0x1c0 [bluetooth]
 [<ffffffffa0004583>] hci_rx_work+0x133/0x870 [bluetooth]
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104d65f>] process_one_work+0x2bf/0x4f0
 [<ffffffff8104d5f8>] ? process_one_work+0x258/0x4f0
 [<ffffffff8104fdc1>] ? worker_thread+0x51/0x3e0
 [<ffffffffa0004450>] ? hci_tx_work+0x800/0x800 [bluetooth]
 [<ffffffff81050022>] worker_thread+0x2b2/0x3e0
 [<ffffffff8104fd70>] ? busy_worker_rebind_fn+0x100/0x100
 [<ffffffff81056021>] kthread+0xd1/0xe0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0
 [<ffffffff813f14bc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81055f50>] ? flush_kthread_worker+0xc0/0xc0

Signed-off-by: Andre Guedes <[email protected]>
Signed-off-by: Gustavo Padovan <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
…/kernel/git/paulg/linux

Paul Gortmaker says:

====================
The removal of wanrouter code was originally listed in the (now
gone) feature removal file since May 2012, and an RFC of the
deletion was posted[1] in late 2012.  The overall concept was given
an OK, but defconfig contamination, build failures, etc. meant that
it didn't quite make it into mainline for 3.8.

Since that time, Dan discovered (via code audit) a runtime bug that
proves nobody has been using this for over four years[2].  With that
new information, I think it makes sense for someone to follow through
on Joe's original RFC and get this done for the 3.9 release.

In addition to resolving the build failures of the RFC by keeping
stub headers, this also splits the change into two parts, just like
the token ring removal did.  Part #1 decouples the mainline kernel
from the expired subsystem, and part #2 does the large scale
deletion of the subsystem content.

The advantage of the above, is that a "git blame" will never lead
you to a 4000+ line deletion commit.  The large scale deletion will
never show up in a "git blame" and hence the same advantages that we
get from the "--irreversible-delete" in the review stage of "git
format-patch" are also embedded into the git history itself.  This
may seem like a moot point to some, but for those who spend a
considerable amount of time data mining in the git history, this is
probably worth doing.

I have done build tests of all[mod/yes]config for both the stage 1
(Makefile and Kconfig) and stage 2 (full driver delete) as a sanity
check, and the issues with the previously posted RFC should be gone.

Speaking of "--irreversible-delete" -- these patches were created
with that option, so if you want to use them locally, you are going
to have to pull (location below) the content instead of doing a
"git am" of the mailed out content.

[1] http://patchwork.ozlabs.org/patch/198794/
[2] http://www.spinics.net/lists/netdev/msg218670.html
====================

Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
If pmtx_open() fails to get a slave inode or fails the pty_open(),
the tty is released as part of the error cleanup. As evidenced by the
first BUG stacktrace below, pty_close() assumes that the linked pty has
a valid, initialized inode* stored in driver_data.

Also, as evidenced by the second BUG stacktrace below, pty_unix98_shutdown()
assumes that the master pty's driver_data has been initialized.

1) Fix the invalid assumption in pty_close().
2) Initialize driver_data immediately so proper devpts fs cleanup occurs.

Fixes this BUG:

[  815.868844] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  815.869018] IP: [<ffffffff81207bcc>] devpts_pty_kill+0x1c/0xa0
[  815.869190] PGD 7c775067 PUD 79deb067 PMD 0
[  815.869315] Oops: 0000 [#1] PREEMPT SMP
[  815.869443] Modules linked in: kvm_intel kvm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi microcode snd_rawmidi psmouse serio_raw snd_seq_midi_event snd_seq snd_timer$
[  815.870025] CPU 0
[  815.870143] Pid: 27819, comm: stress_test_tty Tainted: G        W    3.8.0-next-20130125+ttypatch-2-xeon #2 Bochs Bochs
[  815.870386] RIP: 0010:[<ffffffff81207bcc>]  [<ffffffff81207bcc>] devpts_pty_kill+0x1c/0xa0
[  815.870540] RSP: 0018:ffff88007d3e1ac8  EFLAGS: 00010282
[  815.870661] RAX: ffff880079c20800 RBX: 0000000000000000 RCX: 0000000000000000
[  815.870804] RDX: ffff880079c209a8 RSI: 0000000000000286 RDI: 0000000000000000
[  815.870933] RBP: ffff88007d3e1ae8 R08: 0000000000000000 R09: 0000000000000000
[  815.871078] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88007bfb7e00
[  815.871209] R13: 0000000000000005 R14: ffff880079c20c00 R15: ffff880079c20c00
[  815.871343] FS:  00007f2e86206700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
[  815.871495] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  815.871617] CR2: 0000000000000028 CR3: 000000007ae56000 CR4: 00000000000006f0
[  815.871752] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  815.871902] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  815.872012] Process stress_test_tty (pid: 27819, threadinfo ffff88007d3e0000, task ffff88007c874530)
[  815.872012] Stack:
[  815.872012]  ffff88007bfb7e00 ffff880079c20c00 ffff88007bfb7e00 0000000000000005
[  815.872012]  ffff88007d3e1b08 ffffffff81417be7 ffff88007caa9bd8 ffff880079c20800
[  815.872012]  ffff88007d3e1bc8 ffffffff8140e5f8 0000000000000000 0000000000000000
[  815.872012] Call Trace:
[  815.872012]  [<ffffffff81417be7>] pty_close+0x157/0x170
[  815.872012]  [<ffffffff8140e5f8>] tty_release+0x138/0x580
[  815.872012]  [<ffffffff816d29f3>] ? _raw_spin_lock+0x23/0x30
[  815.872012]  [<ffffffff816d267a>] ? _raw_spin_unlock+0x1a/0x40
[  815.872012]  [<ffffffff816d0178>] ? __mutex_unlock_slowpath+0x48/0x60
[  815.872012]  [<ffffffff81417dff>] ptmx_open+0x11f/0x180
[  815.872012]  [<ffffffff8119394b>] chrdev_open+0x9b/0x1c0
[  815.872012]  [<ffffffff8118d643>] do_dentry_open+0x203/0x290
[  815.872012]  [<ffffffff811938b0>] ? cdev_put+0x30/0x30
[  815.872012]  [<ffffffff8118d705>] finish_open+0x35/0x50
[  815.872012]  [<ffffffff8119dcce>] do_last+0x6fe/0xe90
[  815.872012]  [<ffffffff8119a7af>] ? link_path_walk+0x7f/0x880
[  815.872012]  [<ffffffff810909d5>] ? cpuacct_charge+0x75/0x80
[  815.872012]  [<ffffffff8119e51c>] path_openat+0xbc/0x4e0
[  815.872012]  [<ffffffff816d0fd0>] ? __schedule+0x400/0x7f0
[  815.872012]  [<ffffffff8140e956>] ? tty_release+0x496/0x580
[  815.872012]  [<ffffffff8119ec11>] do_filp_open+0x41/0xa0
[  815.872012]  [<ffffffff816d267a>] ? _raw_spin_unlock+0x1a/0x40
[  815.872012]  [<ffffffff811abe39>] ? __alloc_fd+0xe9/0x140
[  815.872012]  [<ffffffff8118ea44>] do_sys_open+0xf4/0x1e0
[  815.872012]  [<ffffffff8118eb51>] sys_open+0x21/0x30
[  815.872012]  [<ffffffff816da499>] system_call_fastpath+0x16/0x1b
[  815.872012] Code: 0f 1f 80 00 00 00 00 45 31 e4 eb d7 0f 0b 90 0f 1f 44 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 48 89 fb 4c 89 65 f0 4c 89 6d f8 <48> 8b 47 28 48 81 78 58 d1 1c 0$
[  815.872012] RIP  [<ffffffff81207bcc>] devpts_pty_kill+0x1c/0xa0
[  815.872012]  RSP <ffff88007d3e1ac8>
[  815.872012] CR2: 0000000000000028
[  815.897036] ---[ end trace eadf50b7f34e47d5 ]---

Fixes this BUG also:

[  608.366836] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  608.366948] IP: [<ffffffff812078d8>] devpts_kill_index+0x18/0x70
[  608.367050] PGD 7c75b067 PUD 7b919067 PMD 0
[  608.367135] Oops: 0000 [#1] PREEMPT SMP
[  608.367201] Modules linked in: kvm_intel kvm snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event microcode snd_seq psmouse snd_timer snd_seq_device serio_raw snd mac_hid soundcore snd_page_alloc rfcomm virtio_balloon parport_pc bnep bluetooth ppdev i2c_piix4 lp parport floppy
[  608.367617] CPU 2
[  608.367669] Pid: 1918, comm: stress_test_tty Tainted: G        W    3.8.0-next-20130125+ttypatch-2-xeon #2 Bochs Bochs
[  608.367796] RIP: 0010:[<ffffffff812078d8>]  [<ffffffff812078d8>] devpts_kill_index+0x18/0x70
[  608.367885] RSP: 0018:ffff88007ae41a88  EFLAGS: 00010286
[  608.367951] RAX: ffffffff81417e80 RBX: ffff880036472400 RCX: 0000000180400028
[  608.368010] RDX: ffff880036470004 RSI: 0000000000000004 RDI: 0000000000000000
[  608.368010] RBP: ffff88007ae41a98 R08: 0000000000000000 R09: 0000000000000001
[  608.368010] R10: ffffea0001f22e40 R11: ffffffff814151d5 R12: 0000000000000004
[  608.368010] R13: ffff880036470000 R14: 0000000000000004 R15: ffff880036472400
[  608.368010] FS:  00007ff7a5268700(0000) GS:ffff88007fd00000(0000) knlGS:0000000000000000
[  608.368010] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  608.368010] CR2: 0000000000000028 CR3: 000000007a0fd000 CR4: 00000000000006e0
[  608.368010] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  608.368010] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  608.368010] Process stress_test_tty (pid: 1918, threadinfo ffff88007ae40000, task ffff88003688dc40)
[  608.368010] Stack:
[  608.368010]  ffff880036472400 0000000000000001 ffff88007ae41aa8 ffffffff81417e98
[  608.368010]  ffff88007ae41ac8 ffffffff8140c42b ffff88007ac73100 ffff88007ac73100
[  608.368010]  ffff88007ae41b98 ffffffff8140ead5 ffff88007ae41b38 ffff88007ca40e40
[  608.368010] Call Trace:
[  608.368010]  [<ffffffff81417e98>] pty_unix98_shutdown+0x18/0x20
[  608.368010]  [<ffffffff8140c42b>] release_tty+0x3b/0xe0
[  608.368010]  [<ffffffff8140ead5>] __tty_release+0x575/0x5d0
[  608.368010]  [<ffffffff816d2c63>] ? _raw_spin_lock+0x23/0x30
[  608.368010]  [<ffffffff816d28ea>] ? _raw_spin_unlock+0x1a/0x40
[  608.368010]  [<ffffffff816d03e8>] ? __mutex_unlock_slowpath+0x48/0x60
[  608.368010]  [<ffffffff8140ef79>] tty_open+0x449/0x5f0
[  608.368010]  [<ffffffff8119394b>] chrdev_open+0x9b/0x1c0
[  608.368010]  [<ffffffff8118d643>] do_dentry_open+0x203/0x290
[  608.368010]  [<ffffffff811938b0>] ? cdev_put+0x30/0x30
[  608.368010]  [<ffffffff8118d705>] finish_open+0x35/0x50
[  608.368010]  [<ffffffff8119dcce>] do_last+0x6fe/0xe90
[  608.368010]  [<ffffffff8119a7af>] ? link_path_walk+0x7f/0x880
[  608.368010]  [<ffffffff8119e51c>] path_openat+0xbc/0x4e0
[  608.368010]  [<ffffffff8119ec11>] do_filp_open+0x41/0xa0
[  608.368010]  [<ffffffff816d28ea>] ? _raw_spin_unlock+0x1a/0x40
[  608.368010]  [<ffffffff811abe39>] ? __alloc_fd+0xe9/0x140
[  608.368010]  [<ffffffff8118ea44>] do_sys_open+0xf4/0x1e0
[  608.368010]  [<ffffffff816d2c63>] ? _raw_spin_lock+0x23/0x30
[  608.368010]  [<ffffffff8118eb51>] sys_open+0x21/0x30
[  608.368010]  [<ffffffff816da719>] system_call_fastpath+0x16/0x1b
[  608.368010] Code: ec 48 83 c4 10 5b 41 5c 5d c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 48 83 ec 10 4c 89 65 f8 41 89 f4 48 89 5d f0 <48> 8b 47 28 48 81 78 58 d1 1c 00 00 74 0b 48 8b 05 4b 66 cf 00
[  608.368010] RIP  [<ffffffff812078d8>] devpts_kill_index+0x18/0x70
[  608.368010]  RSP <ffff88007ae41a88>
[  608.368010] CR2: 0000000000000028
[  608.394153] ---[ end trace afe83b0fb5fbda93 ]---

Reported-by: Ilya Zykov <[email protected]>
Signed-off-by: Peter Hurley <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
I am getting segfaults *after* the time sorting of perf samples where
the event type is off the charts:

(gdb) bt
\#0  0x0807b1b2 in hists__inc_nr_events (hists=0x80a99c4, type=1163281902) at util/hist.c:1225
\#1  0x08070795 in perf_session_deliver_event (session=0x80a9b90, event=0xf7a6aff8, sample=0xffffc318, tool=0xffffc520,
    file_offset=0) at util/session.c:884
\#2  0x0806f9b9 in flush_sample_queue (s=0x80a9b90, tool=0xffffc520) at util/session.c:555
\#3  0x0806fc53 in process_finished_round (tool=0xffffc520, event=0x0, session=0x80a9b90) at util/session.c:645

This is bizarre because the event has already been processed once --
before it was added to the samples queue -- and the event was found to
be sane at that time.

There seem to be 2 causes:

1. perf_evlist__mmap_read updates the read location even though there
are outstanding references to events sitting in the mmap buffers via the
ordered samples queue.

2. There is a single evlist->event_copy for all evlist entries.
event_copy is used to handle an event wrapping at the mmap buffer
boundary.

This patch addresses the second problem - making event_copy local to
each perf_mmap. With this change my highly repeatable use case no longer
fails.

The first problem is much more complicated and will be the subject of a
future patch.

Signed-off-by: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
 This patch supports basic common driver code for LP5521, LP5523/55231 devices.

 ( Driver Structure Data )

 lp55xx_led and lp55xx_chip
 In lp55xx common driver, two different data structure is used.
 o lp55xx_led
   control multi output LED channels such as led current, channel index.
 o lp55xx_chip
   general chip control such like the I2C and platform data.

 For example, LP5521 has maximum 3 LED channels.
 LP5523/55231 has 9 output channels.

 lp55xx_chip for LP5521 ... lp55xx_led #1
                            lp55xx_led #2
                            lp55xx_led #3

 lp55xx_chip for LP5523 ... lp55xx_led #1
                            lp55xx_led #2
                            .
                            .
                            lp55xx_led #9

 ( Platform Data )

 LP5521 and LP5523/55231 have own specific platform data.
 However, this data can be handled with just one platform data structure.
 The lp55xx platform data is declared in the header.
 This structure is derived from leds-lp5521.h and leds-lp5523.h

Signed-off-by: Milo(Woogyom) Kim <[email protected]>
Signed-off-by: Bryan Wu <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
This patch makes clearer the ambiguous f2fs_gc flow as follows.

1. Remove intermediate checkpoint condition during f2fs_gc
 (i.e., should_do_checkpoint() and GC_BLOCKED)

2. Remove unnecessary return values of f2fs_gc because of #1.
 (i.e., GC_NODE, GC_OK, etc)

3. Simplify write_checkpoint() because of #2.

4. Clarify the main f2fs_gc flow.
 o monitor how many freed sections during one iteration of do_garbage_collect().
 o do GC more without checkpoints if we can't get enough free sections.
 o do checkpoint once we've got enough free sections through forground GCs.

5. Adopt thread-logging (Slack-Space-Recycle) scheme more aggressively on data
  log types. See. get_ssr_segement()

Signed-off-by: Jaegeuk Kim <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Per Al Viro's "signals for dummies" https://lkml.org/lkml/2012/12/6/366
there are 3 golden rules for (not) restarting syscalls:

"	What we need to guarantee is
* restarts do not happen on signals caught in interrupts or exceptions
* restarts do not happen on signals caught in sigreturn()
* restart should happen only once, even if we get through do_signal()
  many times."

ARC Port already handled #1, this patch fixes #2 and #3.

We use the additional state in pt_regs->orig_r8 to ckh if restarting
has already been done once.

Thanks to Al Viro for spotting this.

Signed-off-by: Vineet Gupta <[email protected]>
Cc: Al Viro <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
koalo pushed a commit that referenced this issue Aug 27, 2013
The orig platform code orgnaization was singleton design pattern - only
one platform (and board thereof) would build at a time.

Thus any platform/board specific code (e.g. irq init, early init ...)
expected by ARC common code was exported as well defined set of APIs,
with only ONE instance building ever.

Now with multiple-platform build requirement, that design of code no
longer holds - multiple board specific calls need to build at the same
time - so ARC common code can't use the API approach, it needs a
callback based design where each board registers it's specific set of
functions, and at runtime, depending on board detection, the callbacks
are used from the registry.

This commit adds all the infrastructure, where board specific callbacks
are specified as a "maThine description".

All the hooks are placed in right spots, no board callbacks registered
yet (with MACHINE_STARt/END constructs) so the hooks will not run.

Next commit will actually convert the platform to this infrastructure.

Signed-off-by: Vineet Gupta <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Acked-by: Arnd Bergmann <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
layoutget's prepare hook can call rpc_exit with status = NFS4_OK (0).
Because of this, nfs4_proc_layoutget can't depend on a 0 status to mean
that the RPC was successfully sent, received and parsed.

To fix this, use the result's len member to see if parsing took place.

This fixes the following OOPS -- calling xdr_init_decode() with a buffer length
0 doesn't set the stream's 'p' member and ends up using uninitialized memory
in filelayout_decode_layout.

BUG: unable to handle kernel paging request at 0000000000008050
IP: [<ffffffff81282e78>] memcpy+0x18/0x120
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:11.0/0000:02:01.0/irq
CPU 1
Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache auth_rpcgss nfs_acl autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 dm_mirror dm_region_hash dm_log dm_mod ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon i2c_piix4 i2c_core sg shpchp ext4 mbcache jbd2 sr_mod cdrom sd_mod crc_t10dif pata_acpi ata_generic ata_piix mptspi mptscsih mptbase scsi_transport_spi [last unloaded: speedstep_lib]

Pid: 1665, comm: flush-0:22 Not tainted 2.6.32-356-test-2 #2 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffff81282e78>]  [<ffffffff81282e78>] memcpy+0x18/0x120
RSP: 0018:ffff88003dfab588  EFLAGS: 00010206
RAX: ffff88003dc42000 RBX: ffff88003dfab610 RCX: 0000000000000009
RDX: 000000003f807ff0 RSI: 0000000000008050 RDI: ffff88003dc42000
RBP: ffff88003dfab5b0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000080 R12: 0000000000000024
R13: ffff88003dc42000 R14: ffff88003f808030 R15: ffff88003dfab6a0
FS:  0000000000000000(0000) GS:ffff880003420000(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000008050 CR3: 000000003bc92000 CR4: 00000000001407e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process flush-0:22 (pid: 1665, threadinfo ffff88003dfaa000, task ffff880037f77540)
Stack:
ffffffffa0398ac1 ffff8800397c5940 ffff88003dfab610 ffff88003dfab6a0
<d> ffff88003dfab5d0 ffff88003dfab680 ffffffffa01c150b ffffea0000d82e70
<d> 000000508116713b 0000000000000000 0000000000000000 0000000000000000
Call Trace:
[<ffffffffa0398ac1>] ? xdr_inline_decode+0xb1/0x120 [sunrpc]
[<ffffffffa01c150b>] filelayout_decode_layout+0xeb/0x350 [nfs_layout_nfsv41_files]
[<ffffffffa01c17fc>] filelayout_alloc_lseg+0x8c/0x3c0 [nfs_layout_nfsv41_files]
[<ffffffff8150e6ce>] ? __wait_on_bit+0x7e/0x90

Signed-off-by: Weston Andros Adamson <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
Cc: [email protected]
koalo pushed a commit that referenced this issue Aug 27, 2013
bd_mutex and lo_ctl_mutex can be held in different order.

Path #1:

blkdev_open
 blkdev_get
  __blkdev_get (hold bd_mutex)
   lo_open (hold lo_ctl_mutex)

Path #2:

blkdev_ioctl
 lo_ioctl (hold lo_ctl_mutex)
  lo_set_capacity (hold bd_mutex)

Lockdep does not report it, because path #2 actually holds a subclass of
lo_ctl_mutex.  This subclass seems creep into the code by mistake.  The
patch author actually just mentioned it in the changelog, see commit
f028f3b ("loop: fix circular locking in loop_clr_fd()"), also see:

	http://marc.info/?l=linux-kernel&m=123806169129727&w=2

Path #2 hold bd_mutex to call bd_set_size(), I've protected it
with i_mutex in a previous patch, so drop bd_mutex at this site.

Signed-off-by: Guo Chao <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: Guo Chao <[email protected]>
Cc: M. Hindess <[email protected]>
Cc: Nikanth Karthikesan <[email protected]>
Cc: Jens Axboe <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Pass the directio request on pageio_init to clean up the API.

Percolate pg_dreq from original nfs_pageio_descriptor to the
pnfs_{read,write}_done_resend_to_mds and use it on respective
call to nfs_pageio_init_{read,write} on the newly created
nfs_pageio_descriptor.

Reproduced by command:
 mount -o vers=4.1 server:/ /mnt
 dd bs=128k count=8 if=/dev/zero of=/mnt/dd.out oflag=direct

BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
IP: [<ffffffffa021a3a8>] atomic_inc+0x4/0x9 [nfs]
PGD 34786067 PUD 34794067 PMD 0
Oops: 0002 [#1] SMP
Modules linked in: nfs_layout_nfsv41_files nfsv4 nfs nfsd lockd nfs_acl auth_rpcgss exportfs sunrpc btrfs zlib_deflate libcrc32c ipv6 autofs4
CPU 1
Pid: 259, comm: kworker/1:2 Not tainted 3.8.0-rc6 #2 Bochs Bochs
RIP: 0010:[<ffffffffa021a3a8>]  [<ffffffffa021a3a8>] atomic_inc+0x4/0x9 [nfs]
RSP: 0018:ffff880038f8fa68  EFLAGS: 00010206
RAX: ffffffffa021a6a9 RBX: ffff880038f8fb48 RCX: 00000000000a0000
RDX: ffffffffa021e616 RSI: ffff8800385e9a40 RDI: 0000000000000028
RBP: ffff880038f8fa68 R08: ffffffff81ad6720 R09: ffff8800385e9510
R10: ffffffffa0228450 R11: ffff880038e87418 R12: ffff8800385e9a40
R13: ffff8800385e9a70 R14: ffff880038f8fb38 R15: ffffffffa0148878
FS:  0000000000000000(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000028 CR3: 0000000034789000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process kworker/1:2 (pid: 259, threadinfo ffff880038f8e000, task ffff880038302480)
Stack:
 ffff880038f8fa78 ffffffffa021a6bf ffff880038f8fa88 ffffffffa021bb82
 ffff880038f8fae8 ffffffffa021f454 ffff880038f8fae8 ffffffff8109689d
 ffff880038f8fab8 ffffffff00000006 0000000000000000 ffff880038f8fb48
Call Trace:
 [<ffffffffa021a6bf>] nfs_direct_pgio_init+0x16/0x18 [nfs]
 [<ffffffffa021bb82>] nfs_pgheader_init+0x6a/0x6c [nfs]
 [<ffffffffa021f454>] nfs_generic_pg_writepages+0x51/0xf8 [nfs]
 [<ffffffff8109689d>] ? mark_held_locks+0x71/0x99
 [<ffffffffa0148878>] ? rpc_release_resources_task+0x37/0x37 [sunrpc]
 [<ffffffffa021bc25>] nfs_pageio_doio+0x1a/0x43 [nfs]
 [<ffffffffa021be7c>] nfs_pageio_complete+0x16/0x2c [nfs]
 [<ffffffffa02608be>] pnfs_write_done_resend_to_mds+0x95/0xc5 [nfsv4]
 [<ffffffffa0148878>] ? rpc_release_resources_task+0x37/0x37 [sunrpc]
 [<ffffffffa028e27f>] filelayout_reset_write+0x8c/0x99 [nfs_layout_nfsv41_files]
 [<ffffffffa028e5f9>] filelayout_write_done_cb+0x4d/0xc1 [nfs_layout_nfsv41_files]
 [<ffffffffa024587a>] nfs4_write_done+0x36/0x49 [nfsv4]
 [<ffffffffa021f996>] nfs_writeback_done+0x53/0x1cc [nfs]
 [<ffffffffa021fb1d>] nfs_writeback_done_common+0xe/0x10 [nfs]
 [<ffffffffa028e03d>] filelayout_write_call_done+0x28/0x2a [nfs_layout_nfsv41_files]
 [<ffffffffa01488a1>] rpc_exit_task+0x29/0x87 [sunrpc]
 [<ffffffffa014a0c9>] __rpc_execute+0x11d/0x3cc [sunrpc]
 [<ffffffff810969dc>] ? trace_hardirqs_on_caller+0x117/0x173
 [<ffffffffa014a39f>] rpc_async_schedule+0x27/0x32 [sunrpc]
 [<ffffffffa014a378>] ? __rpc_execute+0x3cc/0x3cc [sunrpc]
 [<ffffffff8105f8c1>] process_one_work+0x226/0x422
 [<ffffffff8105f7f4>] ? process_one_work+0x159/0x422
 [<ffffffff81094757>] ? lock_acquired+0x210/0x249
 [<ffffffffa014a378>] ? __rpc_execute+0x3cc/0x3cc [sunrpc]
 [<ffffffff810600d8>] worker_thread+0x126/0x1c4
 [<ffffffff8105ffb2>] ? manage_workers+0x240/0x240
 [<ffffffff81064ef8>] kthread+0xb1/0xb9
 [<ffffffff81064e47>] ? __kthread_parkme+0x65/0x65
 [<ffffffff815206ec>] ret_from_fork+0x7c/0xb0
 [<ffffffff81064e47>] ? __kthread_parkme+0x65/0x65
Code: 00 83 38 02 74 12 48 81 4b 50 00 00 01 00 c7 83 60 07 00 00 01 00 00 00 48 89 df e8 55 fe ff ff 5b 41 5c 5d c3 66 90 55 48 89 e5 <f0> ff 07 5d c3 55 48 89 e5 f0 ff 0f 0f 94 c0 84 c0 0f 95 c0 0f
RIP  [<ffffffffa021a3a8>] atomic_inc+0x4/0x9 [nfs]
 RSP <ffff880038f8fa68>
CR2: 0000000000000028

Signed-off-by: Benny Halevy <[email protected]>
Cc: [email protected] [>= 3.6]
Signed-off-by: Trond Myklebust <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Lee A. Roberts says:

====================
This series of patches resolves several SCTP association hangs observed during
SCTP stress testing.  Observable symptoms include communications hangs with
data being held in the association reassembly and/or lobby (ordering) queues.
Close examination of reassembly/ordering queues may show either duplicated
or missing packets.

In version #2, corrected build failure in initial version of patch series
due to wrong calling sequence for sctp_ulpq_partial_delivery() being inserted
in sctp_ulpq_renege().

In version #3, adjusted patch documentation to be less repetitive.
====================

Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
This is needed because the omap_mux_get_by_name()
function calls the _omap_mux_get_by_name subfunction
for each mux partition until needed mux is not found.
As a result, we get messages like
"Could not find signal XXX" for each partition
where this mux name does not exist.

This patch fixes wrong error message in
the _omap_mux_get_by_name() function moving it
to the omap_mux_get_by_name() one and as result
reduces noise in the kernel log.

My kernel log without this patch:
[...]
[    0.221801] omap_mux_init: Add partition: #2: wkup, flags: 3
[    0.222045] _omap_mux_get_by_name: Could not find signal fref_clk0_out.sys_drm_msecure
[    0.222137] _omap_mux_get_by_name: Could not find signal sys_nirq
[    0.222167] _omap_mux_get_by_name: Could not find signal sys_nirq
[    0.225006] _omap_mux_get_by_name: Could not find signal uart1_rx.uart1_rx
[    0.225006] _omap_mux_get_by_name: Could not find signal uart1_rx.uart1_rx
[    0.270111] _omap_mux_get_by_name: Could not find signal fref_clk4_out.fref_clk4_out
[    0.273406] twl: not initialized

[...]

My kernel log with this patch:
[...]
[    0.221771] omap_mux_init: Add partition: #2: wkup, flags: 3
[    0.222106] omap_mux_get_by_name: Could not find signal sys_nirq
[    0.224945] omap_mux_get_by_name: Could not find signal uart1_rx.uart1_rx
[    0.274536] twl: not initialized
[...]

Signed-off-by: Ruslan Bilovol <[email protected]>
Signed-off-by: Tony Lindgren <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
The following script will produce a kernel oops:

    sudo ip netns add v
    sudo ip netns exec v ip ad add 127.0.0.1/8 dev lo
    sudo ip netns exec v ip link set lo up
    sudo ip netns exec v ip ro add 224.0.0.0/4 dev lo
    sudo ip netns exec v ip li add vxlan0 type vxlan id 42 group 239.1.1.1 dev lo
    sudo ip netns exec v ip link set vxlan0 up
    sudo ip netns del v

where inspect by gdb:

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 107]
    0xffffffffa0289e33 in ?? ()
    (gdb) bt
    #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
    #1  vxlan_stop (dev=0xffff88001bafa000) at drivers/net/vxlan.c:1087
    #2  0xffffffff812cc498 in __dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1299
    #3  0xffffffff812cd920 in dev_close_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:1335
    #4  0xffffffff812cef31 in rollback_registered_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:4851
    #5  0xffffffff812cf040 in unregister_netdevice_many (head=head@entry=0xffff88001f2e7dc8) at net/core/dev.c:5752
    #6  0xffffffff812cf1ba in default_device_exit_batch (net_list=0xffff88001f2e7e18) at net/core/dev.c:6170
    #7  0xffffffff812cab27 in cleanup_net (work=<optimized out>) at net/core/net_namespace.c:302
    #8  0xffffffff810540ef in process_one_work (worker=0xffff88001ba9ed40, work=0xffffffff8167d020) at kernel/workqueue.c:2157
    #9  0xffffffff810549d0 in worker_thread (__worker=__worker@entry=0xffff88001ba9ed40) at kernel/workqueue.c:2276
    #10 0xffffffff8105870c in kthread (_create=0xffff88001f2e5d68) at kernel/kthread.c:168
    #11 <signal handler called>
    #12 0x0000000000000000 in ?? ()
    #13 0x0000000000000000 in ?? ()
    (gdb) fr 0
    #0  vxlan_leave_group (dev=0xffff88001bafa000) at drivers/net/vxlan.c:533
    533		struct sock *sk = vn->sock->sk;
    (gdb) l
    528	static int vxlan_leave_group(struct net_device *dev)
    529	{
    530		struct vxlan_dev *vxlan = netdev_priv(dev);
    531		struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id);
    532		int err = 0;
    533		struct sock *sk = vn->sock->sk;
    534		struct ip_mreqn mreq = {
    535			.imr_multiaddr.s_addr	= vxlan->gaddr,
    536			.imr_ifindex		= vxlan->link,
    537		};
    (gdb) p vn->sock
    $4 = (struct socket *) 0x0

The kernel calls `vxlan_exit_net` when deleting the netns before shutting down
vxlan interfaces. Later the removal of all vxlan interfaces, where `vn->sock`
is already gone causes the oops. so we should manually shutdown all interfaces
before deleting `vn->sock` as the patch does.

Signed-off-by: Zang MingJie <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
ca22e56 (driver-core: implement 'sysdev' functionality for regular
devices and buses) has introduced bus_register macro with a static
key to distinguish different subsys mutex classes.

This however doesn't work for different subsys which use a common
registering function. One example is subsys_system_register (and
mce_device and cpu_device).

In the end this leads to the following lockdep splat:
[  207.271924] ======================================================
[  207.271932] [ INFO: possible circular locking dependency detected ]
[  207.271942] 3.9.0-rc1-0.7-default+ #34 Not tainted
[  207.271948] -------------------------------------------------------
[  207.271957] bash/10493 is trying to acquire lock:
[  207.271963]  (subsys mutex){+.+.+.}, at: [<ffffffff8134af27>] bus_remove_device+0x37/0x1c0
[  207.271987]
[  207.271987] but task is already holding lock:
[  207.271995]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81046ccf>] cpu_hotplug_begin+0x2f/0x60
[  207.272012]
[  207.272012] which lock already depends on the new lock.
[  207.272012]
[  207.272023]
[  207.272023] the existing dependency chain (in reverse order) is:
[  207.272033]
[  207.272033] -> #4 (cpu_hotplug.lock){+.+.+.}:
[  207.272044]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272056]        [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.272069]        [<ffffffff81046ba9>] get_online_cpus+0x29/0x40
[  207.272082]        [<ffffffff81185210>] drain_all_stock+0x30/0x150
[  207.272094]        [<ffffffff811853da>] mem_cgroup_reclaim+0xaa/0xe0
[  207.272104]        [<ffffffff8118775e>] __mem_cgroup_try_charge+0x51e/0xcf0
[  207.272114]        [<ffffffff81188486>] mem_cgroup_charge_common+0x36/0x60
[  207.272125]        [<ffffffff811884da>] mem_cgroup_newpage_charge+0x2a/0x30
[  207.272135]        [<ffffffff81150531>] do_wp_page+0x231/0x830
[  207.272147]        [<ffffffff8115151e>] handle_pte_fault+0x19e/0x8d0
[  207.272157]        [<ffffffff81151da8>] handle_mm_fault+0x158/0x1e0
[  207.272166]        [<ffffffff814b6153>] do_page_fault+0x2a3/0x4e0
[  207.272178]        [<ffffffff814b2578>] page_fault+0x28/0x30
[  207.272189]
[  207.272189] -> #3 (&mm->mmap_sem){++++++}:
[  207.272199]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272208]        [<ffffffff8114c5ad>] might_fault+0x6d/0x90
[  207.272218]        [<ffffffff811a11e3>] filldir64+0xb3/0x120
[  207.272229]        [<ffffffffa013fc19>] call_filldir+0x89/0x130 [ext3]
[  207.272248]        [<ffffffffa0140377>] ext3_readdir+0x6b7/0x7e0 [ext3]
[  207.272263]        [<ffffffff811a1519>] vfs_readdir+0xa9/0xc0
[  207.272273]        [<ffffffff811a15cb>] sys_getdents64+0x9b/0x110
[  207.272284]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272296]
[  207.272296] -> #2 (&type->i_mutex_dir_key#3){+.+.+.}:
[  207.272309]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272319]        [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.272329]        [<ffffffff8119c254>] link_path_walk+0x6f4/0x9a0
[  207.272339]        [<ffffffff8119e7fa>] path_openat+0xba/0x470
[  207.272349]        [<ffffffff8119ecf8>] do_filp_open+0x48/0xa0
[  207.272358]        [<ffffffff8118d81c>] file_open_name+0xdc/0x110
[  207.272369]        [<ffffffff8118d885>] filp_open+0x35/0x40
[  207.272378]        [<ffffffff8135c76e>] _request_firmware+0x52e/0xb20
[  207.272389]        [<ffffffff8135cdd6>] request_firmware+0x16/0x20
[  207.272399]        [<ffffffffa03bdb91>] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272416]        [<ffffffffa03bd554>] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272431]        [<ffffffffa03bd61c>] mc_device_add+0x7c/0xb0 [microcode]
[  207.272444]        [<ffffffff8134a419>] subsys_interface_register+0xc9/0x100
[  207.272457]        [<ffffffffa04fc0f4>] 0xffffffffa04fc0f4
[  207.272472]        [<ffffffff81000202>] do_one_initcall+0x42/0x180
[  207.272485]        [<ffffffff810bbeff>] load_module+0x19df/0x1b70
[  207.272499]        [<ffffffff810bc376>] sys_init_module+0xe6/0x130
[  207.272511]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272523]
[  207.272523] -> #1 (umhelper_sem){++++.+}:
[  207.272537]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272548]        [<ffffffff814ae9c4>] down_read+0x34/0x50
[  207.272559]        [<ffffffff81062bff>] usermodehelper_read_trylock+0x4f/0x100
[  207.272575]        [<ffffffff8135c7dd>] _request_firmware+0x59d/0xb20
[  207.272587]        [<ffffffff8135cdd6>] request_firmware+0x16/0x20
[  207.272599]        [<ffffffffa03bdb91>] request_microcode_fw+0x61/0xd0 [microcode]
[  207.272613]        [<ffffffffa03bd554>] microcode_init_cpu+0x104/0x150 [microcode]
[  207.272627]        [<ffffffffa03bd61c>] mc_device_add+0x7c/0xb0 [microcode]
[  207.272641]        [<ffffffff8134a419>] subsys_interface_register+0xc9/0x100
[  207.272654]        [<ffffffffa04fc0f4>] 0xffffffffa04fc0f4
[  207.272666]        [<ffffffff81000202>] do_one_initcall+0x42/0x180
[  207.272678]        [<ffffffff810bbeff>] load_module+0x19df/0x1b70
[  207.272690]        [<ffffffff810bc376>] sys_init_module+0xe6/0x130
[  207.272702]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272715]
[  207.272715] -> #0 (subsys mutex){+.+.+.}:
[  207.272729]        [<ffffffff810ae002>] __lock_acquire+0x13b2/0x15f0
[  207.272740]        [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.272751]        [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.272763]        [<ffffffff8134af27>] bus_remove_device+0x37/0x1c0
[  207.272775]        [<ffffffff81349114>] device_del+0x134/0x1f0
[  207.272786]        [<ffffffff813491f2>] device_unregister+0x22/0x60
[  207.272798]        [<ffffffff814a24ea>] mce_cpu_callback+0x15e/0x1ad
[  207.272812]        [<ffffffff814b6402>] notifier_call_chain+0x72/0x130
[  207.272824]        [<ffffffff81073d6e>] __raw_notifier_call_chain+0xe/0x10
[  207.272839]        [<ffffffff81498f76>] _cpu_down+0x1d6/0x350
[  207.272851]        [<ffffffff81499130>] cpu_down+0x40/0x60
[  207.272862]        [<ffffffff8149cc55>] store_online+0x75/0xe0
[  207.272874]        [<ffffffff813474a0>] dev_attr_store+0x20/0x30
[  207.272886]        [<ffffffff812090d9>] sysfs_write_file+0xd9/0x150
[  207.272900]        [<ffffffff8118e10b>] vfs_write+0xcb/0x130
[  207.272911]        [<ffffffff8118e924>] sys_write+0x64/0xa0
[  207.272923]        [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b
[  207.272936]
[  207.272936] other info that might help us debug this:
[  207.272936]
[  207.272952] Chain exists of:
[  207.272952]   subsys mutex --> &mm->mmap_sem --> cpu_hotplug.lock
[  207.272952]
[  207.272973]  Possible unsafe locking scenario:
[  207.272973]
[  207.272984]        CPU0                    CPU1
[  207.272992]        ----                    ----
[  207.273000]   lock(cpu_hotplug.lock);
[  207.273009]                                lock(&mm->mmap_sem);
[  207.273020]                                lock(cpu_hotplug.lock);
[  207.273031]   lock(subsys mutex);
[  207.273040]
[  207.273040]  *** DEADLOCK ***
[  207.273040]
[  207.273055] 5 locks held by bash/10493:
[  207.273062]  #0:  (&buffer->mutex){+.+.+.}, at: [<ffffffff81209049>] sysfs_write_file+0x49/0x150
[  207.273080]  #1:  (s_active#150){.+.+.+}, at: [<ffffffff812090c2>] sysfs_write_file+0xc2/0x150
[  207.273099]  #2:  (x86_cpu_hotplug_driver_mutex){+.+.+.}, at: [<ffffffff81027557>] cpu_hotplug_driver_lock+0x17/0x20
[  207.273121]  #3:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff8149911c>] cpu_down+0x2c/0x60
[  207.273140]  #4:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81046ccf>] cpu_hotplug_begin+0x2f/0x60
[  207.273158]
[  207.273158] stack backtrace:
[  207.273170] Pid: 10493, comm: bash Not tainted 3.9.0-rc1-0.7-default+ #34
[  207.273180] Call Trace:
[  207.273192]  [<ffffffff810ab373>] print_circular_bug+0x223/0x310
[  207.273204]  [<ffffffff810ae002>] __lock_acquire+0x13b2/0x15f0
[  207.273216]  [<ffffffff812086b0>] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273227]  [<ffffffff810ae329>] lock_acquire+0xe9/0x120
[  207.273239]  [<ffffffff8134af27>] ? bus_remove_device+0x37/0x1c0
[  207.273251]  [<ffffffff814ad807>] mutex_lock_nested+0x37/0x360
[  207.273263]  [<ffffffff8134af27>] ? bus_remove_device+0x37/0x1c0
[  207.273274]  [<ffffffff812086b0>] ? sysfs_hash_and_remove+0x60/0xc0
[  207.273286]  [<ffffffff8134af27>] bus_remove_device+0x37/0x1c0
[  207.273298]  [<ffffffff81349114>] device_del+0x134/0x1f0
[  207.273309]  [<ffffffff813491f2>] device_unregister+0x22/0x60
[  207.273321]  [<ffffffff814a24ea>] mce_cpu_callback+0x15e/0x1ad
[  207.273332]  [<ffffffff814b6402>] notifier_call_chain+0x72/0x130
[  207.273344]  [<ffffffff81073d6e>] __raw_notifier_call_chain+0xe/0x10
[  207.273356]  [<ffffffff81498f76>] _cpu_down+0x1d6/0x350
[  207.273368]  [<ffffffff81027557>] ? cpu_hotplug_driver_lock+0x17/0x20
[  207.273380]  [<ffffffff81499130>] cpu_down+0x40/0x60
[  207.273391]  [<ffffffff8149cc55>] store_online+0x75/0xe0
[  207.273402]  [<ffffffff813474a0>] dev_attr_store+0x20/0x30
[  207.273413]  [<ffffffff812090d9>] sysfs_write_file+0xd9/0x150
[  207.273425]  [<ffffffff8118e10b>] vfs_write+0xcb/0x130
[  207.273436]  [<ffffffff8118e924>] sys_write+0x64/0xa0
[  207.273447]  [<ffffffff814bb599>] system_call_fastpath+0x16/0x1b

Which reports a false possitive deadlock because it sees:
1) load_module -> subsys_interface_register -> mc_deveice_add (*) -> subsys->p->mutex -> link_path_walk -> lookup_slow -> i_mutex
2) sys_write -> _cpu_down -> cpu_hotplug_begin -> cpu_hotplug.lock -> mce_cpu_callback -> mce_device_remove(**) -> device_unregister -> bus_remove_device -> subsys mutex
3) vfs_readdir -> i_mutex -> filldir64 -> might_fault -> might_lock_read(mmap_sem) -> page_fault -> mmap_sem -> drain_all_stock -> cpu_hotplug.lock

but
1) takes cpu_subsys subsys (*) but 2) takes mce_device subsys (**) so
the deadlock is not possible AFAICS.

The fix is quite simple. We can pull the key inside bus_type structure
because they are defined per device so the pointer will be unique as
well. bus_register doesn't need to be a macro anymore so change it
to the inline. We could get rid of __bus_register as there is no other
caller but maybe somebody will want to use a different key so keep it
around for now.

Reported-by: Li Zefan <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
Signed-off-by: Jiri Kosina <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
The get_node_page_ra tries to:
1. grab or read a target node page for the given nid,
2. then, call ra_node_page to read other adjacent node pages in advance.

So, when we try to read a target node page by #1, we should submit bio with
READ_SYNC instead of READA.
And, in #2, READA should be used.

Signed-off-by: Jaegeuk Kim <[email protected]>
Reviewed-by: Namjae Jeon <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Commit 84c1754 (ext4: move work from io_end to inode) triggered a
regression when running xfstest raspberrypi#270 when the file system is mounted
with dioread_nolock.

The problem is that after ext4_evict_inode() calls ext4_ioend_wait(),
this guarantees that last io_end structure has been freed, but it does
not guarantee that the workqueue structure, which was moved into the
inode by commit 84c1754, is actually finished.  Once
ext4_flush_completed_IO() calls ext4_free_io_end() on CPU #1, this
will allow ext4_ioend_wait() to return on CPU #2, at which point the
evict_inode() codepath can race against the workqueue code on CPU #1
accessing EXT4_I(inode)->i_unwritten_work to find the next item of
work to do.

Fix this by calling cancel_work_sync() in ext4_ioend_wait(), which
will be renamed ext4_ioend_shutdown(), since it is only used by
ext4_evict_inode().  Also, move the call to ext4_ioend_shutdown()
until after truncate_inode_pages() and filemap_write_and_wait() are
called, to make sure all dirty pages have been written back and
flushed from the page cache first.

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e
*pdpt = 0000000030bc3001 *pde = 0000000000000000 
Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
Modules linked in:
Pid: 6, comm: kworker/u:0 Not tainted 3.8.0-rc3-00013-g84c1754-dirty raspberrypi#91 Bochs Bochs
EIP: 0060:[<c01dda6a>] EFLAGS: 00010046 CPU: 0
EIP is at cwq_activate_delayed_work+0x3b/0x7e
EAX: 00000000 EBX: 00000000 ECX: f505fe54 EDX: 00000000
ESI: ed5b697c EDI: 00000006 EBP: f64b7e8c ESP: f64b7e84
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 30bc2000 CR4: 000006f0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff0ff0 DR7: 00000400
Process kworker/u:0 (pid: 6, ti=f64b6000 task=f64b4160 task.ti=f64b6000)
Stack:
 f505fe00 00000006 f64b7e9c c01de3d7 f6435540 00000003 f64b7efc c01def1d
 f6435540 00000002 00000000 0000008a c16d0808 c040a10b c16d07d8 c16d08b0
 f505fe00 c16d0780 00000000 00000000 ee153df4 c1ce4a30 c17d0e30 00000000
Call Trace:
 [<c01de3d7>] cwq_dec_nr_in_flight+0x71/0xfb
 [<c01def1d>] process_one_work+0x5d8/0x637
 [<c040a10b>] ? ext4_end_bio+0x300/0x300
 [<c01e3105>] worker_thread+0x249/0x3ef
 [<c01ea317>] kthread+0xd8/0xeb
 [<c01e2ebc>] ? manage_workers+0x4bb/0x4bb
 [<c023a370>] ? trace_hardirqs_on+0x27/0x37
 [<c0f1b4b7>] ret_from_kernel_thread+0x1b/0x28
 [<c01ea23f>] ? __init_kthread_worker+0x71/0x71
Code: 01 83 15 ac ff 6c c1 00 31 db 89 c6 8b 00 a8 04 74 12 89 c3 30 db 83 05 b0 ff 6c c1 01 83 15 b4 ff 6c c1 00 89 f0 e8 42 ff ff ff <8b> 13 89 f0 83 05 b8 ff 6c c1
 6c c1 00 31 c9 83
EIP: [<c01dda6a>] cwq_activate_delayed_work+0x3b/0x7e SS:ESP 0068:f64b7e84
CR2: 0000000000000000
---[ end trace a1923229da53d8a4 ]---

Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: Jan Kara <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
We can deadlock (s_active and fcoe_config_mutex) if a
port is being destroyed at the same time one is being created.

[ 4200.503113] ======================================================
[ 4200.503114] [ INFO: possible circular locking dependency detected ]
[ 4200.503116] 3.8.0-rc5+ #8 Not tainted
[ 4200.503117] -------------------------------------------------------
[ 4200.503118] kworker/3:2/2492 is trying to acquire lock:
[ 4200.503119]  (s_active#292){++++.+}, at: [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
[ 4200.503127]
but task is already holding lock:
[ 4200.503128]  (fcoe_config_mutex){+.+.+.}, at: [<ffffffffa02f3338>] fcoe_destroy_work+0xe8/0x120 [fcoe]
[ 4200.503133]
which lock already depends on the new lock.

[ 4200.503135]
the existing dependency chain (in reverse order) is:
[ 4200.503136]
-> #1 (fcoe_config_mutex){+.+.+.}:
[ 4200.503139]        [<ffffffff810c7711>] lock_acquire+0xa1/0x140
[ 4200.503143]        [<ffffffff816ca7be>] mutex_lock_nested+0x6e/0x360
[ 4200.503146]        [<ffffffffa02f11bd>] fcoe_enable+0x1d/0xb0 [fcoe]
[ 4200.503148]        [<ffffffffa02f127d>] fcoe_ctlr_enabled+0x2d/0x50 [fcoe]
[ 4200.503151]        [<ffffffffa02ffbe8>] store_ctlr_enabled+0x38/0x90 [libfcoe]
[ 4200.503154]        [<ffffffff81424878>] dev_attr_store+0x18/0x30
[ 4200.503157]        [<ffffffff8122b750>] sysfs_write_file+0xe0/0x150
[ 4200.503160]        [<ffffffff811b334c>] vfs_write+0xac/0x180
[ 4200.503162]        [<ffffffff811b3692>] sys_write+0x52/0xa0
[ 4200.503164]        [<ffffffff816d7159>] system_call_fastpath+0x16/0x1b
[ 4200.503167]
-> #0 (s_active#292){++++.+}:
[ 4200.503170]        [<ffffffff810c680f>] __lock_acquire+0x135f/0x1c90
[ 4200.503172]        [<ffffffff810c7711>] lock_acquire+0xa1/0x140
[ 4200.503174]        [<ffffffff8122c626>] sysfs_deactivate+0x116/0x160
[ 4200.503176]        [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
[ 4200.503178]        [<ffffffff8122b2eb>] sysfs_hash_and_remove+0x5b/0xb0
[ 4200.503180]        [<ffffffff8122f3d1>] sysfs_remove_group+0x61/0x100
[ 4200.503183]        [<ffffffff814251eb>] device_remove_groups+0x3b/0x60
[ 4200.503185]        [<ffffffff81425534>] device_remove_attrs+0x44/0x80
[ 4200.503187]        [<ffffffff81425e97>] device_del+0x127/0x1c0
[ 4200.503189]        [<ffffffff81425f52>] device_unregister+0x22/0x60
[ 4200.503191]        [<ffffffffa0300970>] fcoe_ctlr_device_delete+0xe0/0xf0 [libfcoe]
[ 4200.503194]        [<ffffffffa02f1b5c>] fcoe_interface_cleanup+0x6c/0xa0 [fcoe]
[ 4200.503196]        [<ffffffffa02f3355>] fcoe_destroy_work+0x105/0x120 [fcoe]
[ 4200.503198]        [<ffffffff8107ee91>] process_one_work+0x1a1/0x580
[ 4200.503203]        [<ffffffff81080c6e>] worker_thread+0x15e/0x440
[ 4200.503205]        [<ffffffff8108715a>] kthread+0xea/0xf0
[ 4200.503207]        [<ffffffff816d70ac>] ret_from_fork+0x7c/0xb0

[ 4200.503209]
other info that might help us debug this:

[ 4200.503211]  Possible unsafe locking scenario:

[ 4200.503212]        CPU0                    CPU1
[ 4200.503213]        ----                    ----
[ 4200.503214]   lock(fcoe_config_mutex);
[ 4200.503215]                                lock(s_active#292);
[ 4200.503218]                                lock(fcoe_config_mutex);
[ 4200.503219]   lock(s_active#292);
[ 4200.503221]
 *** DEADLOCK ***

[ 4200.503223] 3 locks held by kworker/3:2/2492:
[ 4200.503224]  #0:  (fcoe){.+.+.+}, at: [<ffffffff8107ee2b>] process_one_work+0x13b/0x580
[ 4200.503228]  #1:  ((&port->destroy_work)){+.+.+.}, at: [<ffffffff8107ee2b>] process_one_work+0x13b/0x580
[ 4200.503232]  #2:  (fcoe_config_mutex){+.+.+.}, at: [<ffffffffa02f3338>] fcoe_destroy_work+0xe8/0x120 [fcoe]
[ 4200.503236]
stack backtrace:
[ 4200.503238] Pid: 2492, comm: kworker/3:2 Not tainted 3.8.0-rc5+ #8
[ 4200.503240] Call Trace:
[ 4200.503243]  [<ffffffff816c2f09>] print_circular_bug+0x1fb/0x20c
[ 4200.503246]  [<ffffffff810c680f>] __lock_acquire+0x135f/0x1c90
[ 4200.503248]  [<ffffffff810c463a>] ? debug_check_no_locks_freed+0x9a/0x180
[ 4200.503250]  [<ffffffff810c7711>] lock_acquire+0xa1/0x140
[ 4200.503253]  [<ffffffff8122d20b>] ? sysfs_addrm_finish+0x3b/0x70
[ 4200.503255]  [<ffffffff8122c626>] sysfs_deactivate+0x116/0x160
[ 4200.503258]  [<ffffffff8122d20b>] ? sysfs_addrm_finish+0x3b/0x70
[ 4200.503260]  [<ffffffff8122d20b>] sysfs_addrm_finish+0x3b/0x70
[ 4200.503262]  [<ffffffff8122b2eb>] sysfs_hash_and_remove+0x5b/0xb0
[ 4200.503265]  [<ffffffff8122f3d1>] sysfs_remove_group+0x61/0x100
[ 4200.503273]  [<ffffffff814251eb>] device_remove_groups+0x3b/0x60
[ 4200.503275]  [<ffffffff81425534>] device_remove_attrs+0x44/0x80
[ 4200.503277]  [<ffffffff81425e97>] device_del+0x127/0x1c0
[ 4200.503279]  [<ffffffff81425f52>] device_unregister+0x22/0x60
[ 4200.503282]  [<ffffffffa0300970>] fcoe_ctlr_device_delete+0xe0/0xf0 [libfcoe]
[ 4200.503285]  [<ffffffffa02f1b5c>] fcoe_interface_cleanup+0x6c/0xa0 [fcoe]
[ 4200.503287]  [<ffffffffa02f3355>] fcoe_destroy_work+0x105/0x120 [fcoe]
[ 4200.503290]  [<ffffffff8107ee91>] process_one_work+0x1a1/0x580
[ 4200.503292]  [<ffffffff8107ee2b>] ? process_one_work+0x13b/0x580
[ 4200.503295]  [<ffffffffa02f3250>] ? fcoe_if_destroy+0x230/0x230 [fcoe]
[ 4200.503297]  [<ffffffff81080c6e>] worker_thread+0x15e/0x440
[ 4200.503299]  [<ffffffff81080b10>] ? busy_worker_rebind_fn+0x100/0x100
[ 4200.503301]  [<ffffffff8108715a>] kthread+0xea/0xf0
[ 4200.503304]  [<ffffffff81087070>] ? kthread_create_on_node+0x160/0x160
[ 4200.503306]  [<ffffffff816d70ac>] ret_from_fork+0x7c/0xb0
[ 4200.503308]  [<ffffffff81087070>] ? kthread_create_on_node+0x160/0x160

Signed-off-by: Robert Love <[email protected]>
Tested-by: Jack Morgan <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
v20 HREFs require non-standard configuration of EXT_SUPPLY2 to
function correctly (specific information is commented). Here we make
use of the recently added mechanism to adapt initialisation values
for such use-cases.

Signed-off-by: Lee Jones <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
Dave Jones reports that offlining a CPU leads to this trace:

numa_remove_cpu cpu 1 node 0: mask now 0,2-3
smpboot: CPU 1 is now offline
BUG: using smp_processor_id() in preemptible [00000000] code:
cpu-offline.sh/10591
caller is cmci_rediscover+0x6a/0xe0
Pid: 10591, comm: cpu-offline.sh Not tainted 3.9.0-rc3+ #2
Call Trace:
 [<ffffffff81333bbd>] debug_smp_processor_id+0xdd/0x100
 [<ffffffff8101edba>] cmci_rediscover+0x6a/0xe0
 [<ffffffff815f5b9f>] mce_cpu_callback+0x19d/0x1ae
 [<ffffffff8160ea66>] notifier_call_chain+0x66/0x150
 [<ffffffff8107ad7e>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff8104c2e3>] cpu_notify+0x23/0x50
 [<ffffffff8104c31e>] cpu_notify_nofail+0xe/0x20
 [<ffffffff815ef082>] _cpu_down+0x302/0x350
 [<ffffffff815ef106>] cpu_down+0x36/0x50
 [<ffffffff815f1c9d>] store_online+0x8d/0xd0
 [<ffffffff813edc48>] dev_attr_store+0x18/0x30
 [<ffffffff81226eeb>] sysfs_write_file+0xdb/0x150
 [<ffffffff811adfb2>] vfs_write+0xa2/0x170
 [<ffffffff811ae16c>] sys_write+0x4c/0xa0
 [<ffffffff81613019>] system_call_fastpath+0x16/0x1b

However, a look at cmci_rediscover shows that it can be simplified quite
a bit, apart from solving the above issue. It invokes functions that
take spin locks with interrupts disabled, and hence it can run in atomic
context. Also, it is run in the CPU_POST_DEAD phase, so the dying CPU
is already dead and out of the cpu_online_mask. So take these points into
account and simplify the code, and thereby also fix the above issue.

Reported-by: Dave Jones <[email protected]>
Signed-off-by: Srivatsa S. Bhat <[email protected]>
Signed-off-by: Tony Luck <[email protected]>
koalo pushed a commit that referenced this issue Aug 27, 2013
The sched_clock_remote() implementation has the following inatomicity
problem on 32bit systems when accessing the remote scd->clock, which
is a 64bit value.

CPU0			CPU1

sched_clock_local()	sched_clock_remote(CPU0)
...
			remote_clock = scd[CPU0]->clock
			    read_low32bit(scd[CPU0]->clock)
cmpxchg64(scd->clock,...)
			    read_high32bit(scd[CPU0]->clock)

While the update of scd->clock is using an atomic64 mechanism, the
readout on the remote cpu is not, which can cause completely bogus
readouts.

It is a quite rare problem, because it requires the update to hit the
narrow race window between the low/high readout and the update must go
across the 32bit boundary.

The resulting misbehaviour is, that CPU1 will see the sched_clock on
CPU1 ~4 seconds ahead of it's own and update CPU1s sched_clock value
to this bogus timestamp. This stays that way due to the clamping
implementation for about 4 seconds until the synchronization with
CLOCK_MONOTONIC undoes the problem.

The issue is hard to observe, because it might only result in a less
accurate SCHED_OTHER timeslicing behaviour. To create observable
damage on realtime scheduling classes, it is necessary that the bogus
update of CPU1 sched_clock happens in the context of an realtime
thread, which then gets charged 4 seconds of RT runtime, which results
in the RT throttler mechanism to trigger and prevent scheduling of RT
tasks for a little less than 4 seconds. So this is quite unlikely as
well.

The issue was quite hard to decode as the reproduction time is between
2 days and 3 weeks and intrusive tracing makes it less likely, but the
following trace recorded with trace_clock=global, which uses
sched_clock_local(), gave the final hint:

  <idle>-0   0d..30 400269.477150: hrtimer_cancel: hrtimer=0xf7061e80
  <idle>-0   0d..30 400269.477151: hrtimer_start:  hrtimer=0xf7061e80 ...
irq/20-S-587 1d..32 400273.772118: sched_wakeup:   comm= ... target_cpu=0
  <idle>-0   0dN.30 400273.772118: hrtimer_cancel: hrtimer=0xf7061e80

What happens is that CPU0 goes idle and invokes
sched_clock_idle_sleep_event() which invokes sched_clock_local() and
CPU1 runs a remote wakeup for CPU0 at the same time, which invokes
sched_remote_clock(). The time jump gets propagated to CPU0 via
sched_remote_clock() and stays stale on both cores for ~4 seconds.

There are only two other possibilities, which could cause a stale
sched clock:

1) ktime_get() which reads out CLOCK_MONOTONIC returns a sporadic
   wrong value.

2) sched_clock() which reads the TSC returns a sporadic wrong value.

#1 can be excluded because sched_clock would continue to increase for
   one jiffy and then go stale.

#2 can be excluded because it would not make the clock jump
   forward. It would just result in a stale sched_clock for one jiffy.

After quite some brain twisting and finding the same pattern on other
traces, sched_clock_remote() remained the only place which could cause
such a problem and as explained above it's indeed racy on 32bit
systems.

So while on 64bit systems the readout is atomic, we need to verify the
remote readout on 32bit machines. We need to protect the local->clock
readout in sched_clock_remote() on 32bit as well because an NMI could
hit between the low and the high readout, call sched_clock_local() and
modify local->clock.

Thanks to Siegfried Wulsch for bearing with my debug requests and
going through the tedious tasks of running a bunch of reproducer
systems to generate the debug information which let me decode the
issue.

Reported-by: Siegfried Wulsch <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1304051544160.21884@ionos
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: [email protected]
koalo pushed a commit that referenced this issue Aug 4, 2014
…r_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…buffer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…ffer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…fer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…ffer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…er_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Cc: Stephen Warren <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…ffer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…fer_push()

The current driver triggers a lockdep warning for if tty_flip_buffer_push() is
called with uart_port->lock locked. This never shows up on UP kernels and comes
up only on SMP kernels.

Crash looks like this (produced with samsung.c driver):

-----
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8)
[<c01b59ac>] (do_raw_spin_unlock+0xc4/0xd8) from [<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0)
[<c03627e4>] (_raw_spin_unlock_irqrestore+0xc/0x38) from [<c020a1a8>] (s3c24xx_serial_rx_chars+0)
[<c020a1a8>] (s3c24xx_serial_rx_chars+0x12c/0x260) from [<c020aae8>] (s3c64xx_serial_handle_irq+)
[<c020aae8>] (s3c64xx_serial_handle_irq+0x48/0x60) from [<c006aaa0>] (handle_irq_event_percpu+0x)
[<c006aaa0>] (handle_irq_event_percpu+0x50/0x194) from [<c006ac20>] (handle_irq_event+0x3c/0x5c)
[<c006ac20>] (handle_irq_event+0x3c/0x5c) from [<c006d864>] (handle_fasteoi_irq+0x80/0x13c)
[<c006d864>] (handle_fasteoi_irq+0x80/0x13c) from [<c006a4a4>] (generic_handle_irq+0x20/0x30)
[<c006a4a4>] (generic_handle_irq+0x20/0x30) from [<c000f454>] (handle_IRQ+0x38/0x94)
[<c000f454>] (handle_IRQ+0x38/0x94) from [<c0008538>] (gic_handle_irq+0x34/0x68)
[<c0008538>] (gic_handle_irq+0x34/0x68) from [<c00123c0>] (__irq_svc+0x40/0x70)
Exception stack(0xc04cdf70 to 0xc04cdfb8)
df60:                                     00000000 00000000 0000166e 00000000
df80: c04cc000 c050278f c050278f 00000001 c04d444c 410fc0f4 c03649b0 00000000
dfa0: 00000001 c04cdfb8 c000f758 c000f75c 60070013 ffffffff
[<c00123c0>] (__irq_svc+0x40/0x70) from [<c000f75c>] (arch_cpu_idle+0x28/0x30)
[<c000f75c>] (arch_cpu_idle+0x28/0x30) from [<c0054888>] (cpu_startup_entry+0x5c/0x148)
[<c0054888>] (cpu_startup_entry+0x5c/0x148) from [<c0497aa4>] (start_kernel+0x334/0x38c)
BUG: spinlock lockup suspected on CPU#0, kworker/0:1/360
 lock: s3c24xx_serial_ports+0x1d8/0x370, .magic: dead4ead, .owner: <none>/-1, .owner_cpu: -1
CPU: 0 PID: 360 Comm: kworker/0:1 Not tainted 3.11.0-rc6-next-20130819-00003-g75485f1 #2
Workqueue: events flush_to_ldisc
[<c0014d58>] (unwind_backtrace+0x0/0xf8) from [<c0011908>] (show_stack+0x10/0x14)
[<c0011908>] (show_stack+0x10/0x14) from [<c035da34>] (dump_stack+0x6c/0xac)
[<c035da34>] (dump_stack+0x6c/0xac) from [<c01b581c>] (do_raw_spin_lock+0x100/0x17c)
[<c01b581c>] (do_raw_spin_lock+0x100/0x17c) from [<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28)
[<c03628a0>] (_raw_spin_lock_irqsave+0x20/0x28) from [<c0203224>] (uart_start+0x18/0x34)
[<c0203224>] (uart_start+0x18/0x34) from [<c01ef890>] (__receive_buf+0x4b4/0x738)
[<c01ef890>] (__receive_buf+0x4b4/0x738) from [<c01efb44>] (n_tty_receive_buf2+0x30/0x98)
[<c01efb44>] (n_tty_receive_buf2+0x30/0x98) from [<c01f2ba8>] (flush_to_ldisc+0xec/0x138)
[<c01f2ba8>] (flush_to_ldisc+0xec/0x138) from [<c0031af0>] (process_one_work+0xfc/0x348)
[<c0031af0>] (process_one_work+0xfc/0x348) from [<c0032138>] (worker_thread+0x138/0x37c)
[<c0032138>] (worker_thread+0x138/0x37c) from [<c0037a7c>] (kthread+0xa4/0xb0)
[<c0037a7c>] (kthread+0xa4/0xb0) from [<c000e5f8>] (ret_from_fork+0x14/0x3c)
-----

Release the port lock before calling tty_flip_buffer_push() and reacquire it
after the call.

Similar stuff was already done for few other drivers in the past, like:

commit 2389b27
Author: Thomas Gleixner <[email protected]>
Date:   Tue May 29 21:53:50 2007 +0100

    [ARM] 4417/1: Serial: Fix AMBA drivers locking

Cc: Tony Prisk <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
When picolcd is switched into bootloader mode (for FW flashing) make
sure not to try to dereference NULL-pointers of feature-devices during
unplug/unbind.

This fixes following BUG:
  BUG: unable to handle kernel NULL pointer dereference at 00000298
  IP: [<f811f56b>] picolcd_exit_framebuffer+0x1b/0x80 [hid_picolcd]
  *pde = 00000000
  Oops: 0000 [#1]
  Modules linked in: hid_picolcd syscopyarea sysfillrect sysimgblt fb_sys_fops
  CPU: 0 PID: 15 Comm: khubd Not tainted 3.11.0-rc7-00002-g50d62d4 #2
  EIP: 0060:[<f811f56b>] EFLAGS: 00010292 CPU: 0
  EIP is at picolcd_exit_framebuffer+0x1b/0x80 [hid_picolcd]
  Call Trace:
   [<f811d1ab>] picolcd_remove+0xcb/0x120 [hid_picolcd]
   [<c1469b09>] hid_device_remove+0x59/0xc0
   [<c13464ca>] __device_release_driver+0x5a/0xb0
   [<c134653f>] device_release_driver+0x1f/0x30
   [<c134603d>] bus_remove_device+0x9d/0xd0
   [<c13439a5>] device_del+0xd5/0x150
   [<c14696a4>] hid_destroy_device+0x24/0x60
   [<c1474cbb>] usbhid_disconnect+0x1b/0x40
   ...

Signed-off-by: Bruno Prémont <[email protected]>
Cc: [email protected]
Signed-off-by: Jiri Kosina <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
Since we've started to clean up pending flips when the gpu hangs in

commit 96a0291
Author: Ville Syrjälä <[email protected]>
Date:   Mon Feb 18 19:08:49 2013 +0200

    drm/i915: Finish page flips and update primary planes after a GPU reset

the gpu reset work now also grabs modeset locks. But since work items
on our private work queue are not allowed to do that due to the
flush_workqueue from the pageflip code this results in a neat
deadlock:

INFO: task kms_flip:14676 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kms_flip        D ffff88019283a5c0     0 14676  13344 0x00000004
 ffff88018e62dbf8 0000000000000046 ffff88013bdb12e0 ffff88018e62dfd8
 ffff88018e62dfd8 00000000001d3b00 ffff88019283a5c0 ffff88018ec21000
 ffff88018f693f00 ffff88018eece000 ffff88018e62dd60 ffff88018eece898
Call Trace:
 [<ffffffff8138ee7b>] schedule+0x60/0x62
 [<ffffffffa046c0dd>] intel_crtc_wait_for_pending_flips+0xb2/0x114 [i915]
 [<ffffffff81050ff4>] ? finish_wait+0x60/0x60
 [<ffffffffa0478041>] intel_crtc_set_config+0x7f3/0x81e [i915]
 [<ffffffffa031780a>] drm_mode_set_config_internal+0x4f/0xc6 [drm]
 [<ffffffffa0319cf3>] drm_mode_setcrtc+0x44d/0x4f9 [drm]
 [<ffffffff810e44da>] ? might_fault+0x38/0x86
 [<ffffffffa030d51f>] drm_ioctl+0x2f9/0x447 [drm]
 [<ffffffff8107a722>] ? trace_hardirqs_off+0xd/0xf
 [<ffffffffa03198a6>] ? drm_mode_setplane+0x343/0x343 [drm]
 [<ffffffff8112222f>] ? mntput_no_expire+0x3e/0x13d
 [<ffffffff81117f33>] vfs_ioctl+0x18/0x34
 [<ffffffff81118776>] do_vfs_ioctl+0x396/0x454
 [<ffffffff81396b37>] ? sysret_check+0x1b/0x56
 [<ffffffff81118886>] SyS_ioctl+0x52/0x7d
 [<ffffffff81396b12>] system_call_fastpath+0x16/0x1b
2 locks held by kms_flip/14676:
 #0:  (&dev->mode_config.mutex){+.+.+.}, at: [<ffffffffa0316545>] drm_modeset_lock_all+0x22/0x59 [drm]
 #1:  (&crtc->mutex){+.+.+.}, at: [<ffffffffa031656b>] drm_modeset_lock_all+0x48/0x59 [drm]
INFO: task kworker/u8:4:175 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:4    D ffff88018de9a5c0     0   175      2 0x00000000
Workqueue: i915 i915_error_work_func [i915]
 ffff88018e37dc30 0000000000000046 ffff8801938ab8a0 ffff88018e37dfd8
 ffff88018e37dfd8 00000000001d3b00 ffff88018de9a5c0 ffff88018ec21018
 0000000000000246 ffff88018e37dca0 000000005a865a86 ffff88018de9a5c0
Call Trace:
 [<ffffffff8138ee7b>] schedule+0x60/0x62
 [<ffffffff8138f23d>] schedule_preempt_disabled+0x9/0xb
 [<ffffffff8138d0cd>] mutex_lock_nested+0x205/0x3b1
 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa0477094>] ? intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915]
 [<ffffffffa044e0a2>] i915_error_work_func+0x128/0x147 [i915]
 [<ffffffff8104a89a>] process_one_work+0x1d4/0x35a
 [<ffffffff8104a821>] ? process_one_work+0x15b/0x35a
 [<ffffffff8104b4a5>] worker_thread+0x144/0x1f0
 [<ffffffff8104b361>] ? rescuer_thread+0x275/0x275
 [<ffffffff8105076d>] kthread+0xac/0xb4
 [<ffffffff81059d30>] ? finish_task_switch+0x3b/0xc0
 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60
 [<ffffffff81396a6c>] ret_from_fork+0x7c/0xb0
 [<ffffffff810506c1>] ? __kthread_parkme+0x60/0x60
3 locks held by kworker/u8:4/175:
 #0:  (i915){.+.+.+}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a
 #1:  ((&dev_priv->gpu_error.work)){+.+.+.}, at: [<ffffffff8104a821>] process_one_work+0x15b/0x35a
 #2:  (&crtc->mutex){+.+.+.}, at: [<ffffffffa0477094>] intel_display_handle_reset+0x7e/0xbd [i915]

This blew up while running kms_flip/flip-vs-panning-vs-hang-interruptible
on one of my older machines.

Unfortunately (despite the proper lockdep annotations for
flush_workqueue) lockdep still doesn't detect this correctly, so we
need to rely on chance to discover these bugs.

Apply the usual bugfix and schedule the reset work on the system
workqueue to keep our own driver workqueue free of any modeset lock
grabbing.

Note that this is not a terribly serious regression since before the
offending commit we'd simply have stalled userspace forever due to
failing to abort all outstanding pageflips.

v2: Add a comment as requested by Chris.

Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Ville Syrjälä <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
When booting secondary CPUs, announce_cpu() is called to show which cpu has
been brought up. For example:

[    0.402751] smpboot: Booting Node   0, Processors  #1 #2 #3 #4 #5 OK
[    0.525667] smpboot: Booting Node   1, Processors  #6 #7 #8 #9 #10 #11 OK
[    0.755592] smpboot: Booting Node   0, Processors  #12 #13 #14 #15 #16 #17 OK
[    0.890495] smpboot: Booting Node   1, Processors  #18 #19 #20 #21 #22 #23

But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum
possible cpu id. It should use the maximum present cpu id in case not all
CPUs booted up.

Signed-off-by: Libin <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ]
Signed-off-by: Ingo Molnar <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
Pull xfs update #2 from Ben Myers:
 "Here we have defrag support for v5 superblock, a number of bugfixes
  and a cleanup or two.

   - defrag support for CRC filesystems
   - fix endian worning in xlog_recover_get_buf_lsn
   - fixes for sparse warnings
   - fix for assert in xfs_dir3_leaf_hdr_from_disk
   - fix for log recovery of remote symlinks
   - fix for log recovery of btree root splits
   - fixes formemory allocation failures with ACLs
   - fix for assert in xfs_buf_item_relse
   - fix for assert in xfs_inode_buf_verify
   - fix an assignment in an assert that should be a test in
     xfs_bmbt_change_owner
   - remove dead code in xlog_recover_inode_pass2"

* tag 'xfs-for-linus-v3.12-rc1-2' of git://oss.sgi.com/xfs/xfs:
  xfs: remove dead code from xlog_recover_inode_pass2
  xfs: = vs == typo in ASSERT()
  xfs: don't assert fail on bad inode numbers
  xfs: aborted buf items can be in the AIL.
  xfs: factor all the kmalloc-or-vmalloc fallback allocations
  xfs: fix memory allocation failures with ACLs
  xfs: ensure we copy buffer type in da btree root splits
  xfs: set remote symlink buffer type for recovery
  xfs: recovery of swap extents operations for CRC filesystems
  xfs: swap extents operations for CRC filesystems
  xfs: check magic numbers in dir3 leaf verifier first
  xfs: fix some minor sparse warnings
  xfs: fix endian warning in xlog_recover_get_buf_lsn()
koalo pushed a commit that referenced this issue Aug 4, 2014
Not all I/O ASIC versions have the free-running counter implemented, an
early revision used in the 5000/1xx models aka 3MIN and 4MIN did not have
it.  Therefore we cannot unconditionally use it as a clock source.
Fortunately if not implemented its register slot has a fixed value so it
is enough if we check for the value at the end of the calibration period
being the same as at the beginning.

This also means we need to look for another high-precision clock source on
the systems affected.  The 5000/1xx can have an R4000SC processor
installed where the CP0 Count register can be used as a clock source.
Unfortunately all the R4k DECstations suffer from the missed timer
interrupt on CP0 Count reads erratum, so we cannot use the CP0 timer as a
clock source and a clock event both at a time.  However we never need an
R4k clock event device because all DECstations have a DS1287A RTC chip
whose periodic interrupt can be used as a clock source.

This gives us the following four configuration possibilities for I/O ASIC
DECstations:

1. No I/O ASIC counter and no CP0 timer, e.g. R3k 5000/1xx (3MIN).

2. No I/O ASIC counter but the CP0 timer, i.e. R4k 5000/150 (4MIN).

3. The I/O ASIC counter but no CP0 timer, e.g. R3k 5000/240 (3MAX+).

4. The I/O ASIC counter and the CP0 timer, e.g. R4k 5000/260 (4MAX+).

For #1 and #2 this change stops the I/O ASIC free-running counter from
being installed as a clock source of a 0Hz frequency.  For #2 it also
arranges for the CP0 timer to be used as a clock source rather than a
clock event device, because having an accurate wall clock is more
important than a high-precision interval timer.  For #3 there is no
change.  For #4 the change makes the I/O ASIC free-running counter
installed as a clock source so that the CP0 timer can be used as a clock
event device.

Unfortunately the use of the CP0 timer as a clock event device relies on a
succesful completion of c0_compare_interrupt.  That never happens, because
while waiting for a CP0 Compare interrupt to happen the function spins in
a loop reading the CP0 Count register.  This makes the CP0 Count erratum
trigger reliably causing the interrupt waited for to be lost in all cases.
As a result #4 resorts to using the CP0 timer as a clock source as well,
just as #2.  However we want to keep this separate arrangement in case
(hope) c0_compare_interrupt is eventually rewritten such that it avoids
the erratum.

Signed-off-by: Maciej W. Rozycki <[email protected]>
Cc: [email protected]
Patchwork: https://patchwork.linux-mips.org/patch/5825/
Signed-off-by: Ralf Baechle <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
When parsing lines from objdump a line containing source code starting
with a numeric label is mistaken for a line of disassembly starting with
a memory address.

Current validation fails to recognise that the "memory address" is out
of range and calculates an invalid offset which later causes this
segfault:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000457315 in disasm__calc_percent (notes=0xc98970, evidx=0, offset=143705, end=2127526177, path=0x7fffffffbf50)
    at util/annotate.c:631
631				hits += h->addr[offset++];
(gdb) bt
 #0  0x0000000000457315 in disasm__calc_percent (notes=0xc98970, evidx=0, offset=143705, end=2127526177, path=0x7fffffffbf50)
    at util/annotate.c:631
 #1  0x00000000004d65e3 in annotate_browser__calc_percent (browser=0x7fffffffd130, evsel=0xa01da0) at ui/browsers/annotate.c:364
 #2  0x00000000004d7433 in annotate_browser__run (browser=0x7fffffffd130, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:672
 #3  0x00000000004d80c9 in symbol__tui_annotate (sym=0xc989a0, map=0xa02660, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:962
 #4  0x00000000004d7aa0 in hist_entry__tui_annotate (he=0xdf73f0, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:823
 #5  0x00000000004dd648 in perf_evsel__hists_browse (evsel=0xa01da0, nr_events=1, helpline=
    0x58b768 "For a higher level overview, try: perf report --sort comm,dso", ev_name=0xa02cd0 "cycles", left_exits=false, hbt=
    0x0, min_pcnt=0, env=0xa011e0) at ui/browsers/hists.c:1659
 #6  0x00000000004de372 in perf_evlist__tui_browse_hists (evlist=0xa01520, help=
    0x58b768 "For a higher level overview, try: perf report --sort comm,dso", hbt=0x0, min_pcnt=0, env=0xa011e0)
    at ui/browsers/hists.c:1950
 #7  0x000000000042cf6b in __cmd_report (rep=0x7fffffffd6c0) at builtin-report.c:581
 #8  0x000000000042e25d in cmd_report (argc=0, argv=0x7fffffffe4b0, prefix=0x0) at builtin-report.c:965
 #9  0x000000000041a0e1 in run_builtin (p=0x801548, argc=1, argv=0x7fffffffe4b0) at perf.c:319
 #10 0x000000000041a319 in handle_internal_command (argc=1, argv=0x7fffffffe4b0) at perf.c:376
 #11 0x000000000041a465 in run_argv (argcp=0x7fffffffe38c, argv=0x7fffffffe380) at perf.c:420
 #12 0x000000000041a707 in main (argc=1, argv=0x7fffffffe4b0) at perf.c:521

After the fix is applied the symbol can be annotated showing the
problematic line "1:      rep"

copy_user_generic_string  /usr/lib/debug/lib/modules/3.9.10-100.fc17.x86_64/vmlinux
             */
            ENTRY(copy_user_generic_string)
                    CFI_STARTPROC
                    ASM_STAC
                    andl %edx,%edx
              and    %edx,%edx
                    jz 4f
              je     37
                    cmpl $8,%edx
              cmp    $0x8,%edx
                    jb 2f           /* less than 8 bytes, go to byte copy loop */
              jb     33
                    ALIGN_DESTINATION
              mov    %edi,%ecx
              and    $0x7,%ecx
              je     28
              sub    $0x8,%ecx
              neg    %ecx
              sub    %ecx,%edx
        1a:   mov    (%rsi),%al
              mov    %al,(%rdi)
              inc    %rsi
              inc    %rdi
              dec    %ecx
              jne    1a
                    movl %edx,%ecx
        28:   mov    %edx,%ecx
                    shrl $3,%ecx
              shr    $0x3,%ecx
                    andl $7,%edx
              and    $0x7,%edx
            1:      rep
100.00        rep    movsq %ds:(%rsi),%es:(%rdi)
                    movsq
            2:      movl %edx,%ecx
        33:   mov    %edx,%ecx
            3:      rep
              rep    movsb %ds:(%rsi),%es:(%rdi)
                    movsb
            4:      xorl %eax,%eax
        37:   xor    %eax,%eax
              data32 xchg %ax,%ax
                    ASM_CLAC
                    ret
              retq

Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
This patch fixes the issues indicated by the test results that
ipmi_msg_handler() is invoked in atomic context.

BUG: scheduling while atomic: kipmi0/18933/0x10000100
Modules linked in: ipmi_si acpi_ipmi ...
CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G       AW    3.10.0-rc7+ #2
Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010
 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8
 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4
 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8
Call Trace:
 <IRQ>  [<ffffffff814c4a1e>] dump_stack+0x19/0x1b
 [<ffffffff814bfbab>] __schedule_bug+0x46/0x54
 [<ffffffff814c73e0>] __schedule+0x83/0x59c
 [<ffffffff81058853>] __cond_resched+0x22/0x2d
 [<ffffffff814c794b>] _cond_resched+0x14/0x1d
 [<ffffffff814c6d82>] mutex_lock+0x11/0x32
 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58
 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si]
 [<ffffffff812bf6e4>] deliver_response+0x55/0x5a
 [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65
 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19
 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc
 [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si]
 ...

Also Tony Camuso says:

 We were getting occasional "Scheduling while atomic" call traces
 during boot on some systems. Problem was first seen on a Cisco C210
 but we were able to reproduce it on a Cisco c220m3. Setting
 CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around
 tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device.

 =================================
 [ INFO: inconsistent lock state ]
 2.6.32-415.el6.x86_64-debug-splck #1
 ---------------------------------
 inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
 ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes:
  (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126
 {SOFTIRQ-ON-W} state was registered at:
   [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570
   [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120
   [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400
   [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60
   [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234
   [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be

The fix implemented by this change has been tested by Tony:

 Tested the patch in a boot loop with lockdep debug enabled and never
 saw the problem in over 400 reboots.

Reported-and-tested-by: Tony Camuso <[email protected]>
Signed-off-by: Lv Zheng <[email protected]>
Reviewed-by: Huang Ying <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
Michael Semon reported that xfs/299 generated this lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ #2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

but task is already holding lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&xfs_dquot_other_class);
  lock(&xfs_dquot_other_class);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

7 locks held by touch/21072:
 #0:  (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
 #1:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
 #2:  (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
 #3:  (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
 #4:  (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
 #5:  (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
 #6:  (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and quota
dquots. Fix the annotations to match the locking heirarchy we now
have.

Reported-by: Michael L. Semon <[email protected]>
Signed-off-by: Dave Chinner <[email protected]>
Reviewed-by: Ben Myers <[email protected]>
Signed-off-by: Ben Myers <[email protected]>

(cherry picked from commit f112a04)
koalo pushed a commit that referenced this issue Aug 4, 2014
When btrfs creates a bioset, we must also allocate the integrity data pool.
Otherwise btrfs will crash when it tries to submit a bio to a checksumming
disk:

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
 IP: [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
 PGD 2305e4067 PUD 23063d067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP
 Modules linked in: btrfs scsi_debug xfs ext4 jbd2 ext3 jbd mbcache
sch_fq_codel eeprom lpc_ich mfd_core nfsd exportfs auth_rpcgss af_packet
raid6_pq xor zlib_deflate libcrc32c [last unloaded: scsi_debug]
 CPU: 1 PID: 4486 Comm: mount Not tainted 3.12.0-rc1-mcsum #2
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 task: ffff8802451c9720 ti: ffff880230698000 task.ti: ffff880230698000
 RIP: 0010:[<ffffffff8111e28a>]  [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
 RSP: 0018:ffff880230699688  EFLAGS: 00010286
 RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000005f8445
 RDX: 0000000000000001 RSI: 0000000000000010 RDI: 0000000000000000
 RBP: ffff8802306996f8 R08: 0000000000011200 R09: 0000000000000008
 R10: 0000000000000020 R11: ffff88009d6e8000 R12: 0000000000011210
 R13: 0000000000000030 R14: ffff8802306996b8 R15: ffff8802451c9720
 FS:  00007f25b8a16800(0000) GS:ffff88024fc80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000018 CR3: 0000000230576000 CR4: 00000000000007e0
 Stack:
  ffff8802451c9720 0000000000000002 ffffffff81a97100 0000000000281250
  ffffffff81a96480 ffff88024fc99150 ffff880228d18200 0000000000000000
  0000000000000000 0000000000000040 ffff880230e8c2e8 ffff8802459dc900
 Call Trace:
  [<ffffffff811b2208>] bio_integrity_alloc+0x48/0x1b0
  [<ffffffff811b26fc>] bio_integrity_prep+0xac/0x360
  [<ffffffff8111e298>] ? mempool_alloc+0x58/0x150
  [<ffffffffa03e8041>] ? alloc_extent_state+0x31/0x110 [btrfs]
  [<ffffffff81241579>] blk_queue_bio+0x1c9/0x460
  [<ffffffff8123e58a>] generic_make_request+0xca/0x100
  [<ffffffff8123e639>] submit_bio+0x79/0x160
  [<ffffffffa03f865e>] btrfs_map_bio+0x48e/0x5b0 [btrfs]
  [<ffffffffa03c821a>] btree_submit_bio_hook+0xda/0x110 [btrfs]
  [<ffffffffa03e7eba>] submit_one_bio+0x6a/0xa0 [btrfs]
  [<ffffffffa03ef450>] read_extent_buffer_pages+0x250/0x310 [btrfs]
  [<ffffffff8125eef6>] ? __radix_tree_preload+0x66/0xf0
  [<ffffffff8125f1c5>] ? radix_tree_insert+0x95/0x260
  [<ffffffffa03c66f6>] btree_read_extent_buffer_pages.constprop.128+0xb6/0x120
[btrfs]
  [<ffffffffa03c8c1a>] read_tree_block+0x3a/0x60 [btrfs]
  [<ffffffffa03caefd>] open_ctree+0x139d/0x2030 [btrfs]
  [<ffffffffa03a282a>] btrfs_mount+0x53a/0x7d0 [btrfs]
  [<ffffffff8113ab0b>] ? pcpu_alloc+0x8eb/0x9f0
  [<ffffffff81167305>] ? __kmalloc_track_caller+0x35/0x1e0
  [<ffffffff81176ba0>] mount_fs+0x20/0xd0
  [<ffffffff81191096>] vfs_kern_mount+0x76/0x120
  [<ffffffff81193320>] do_mount+0x200/0xa40
  [<ffffffff81135cdb>] ? strndup_user+0x5b/0x80
  [<ffffffff81193bf0>] SyS_mount+0x90/0xe0
  [<ffffffff8156d31d>] system_call_fastpath+0x1a/0x1f
 Code: 4c 8d 75 a8 4c 89 6d e8 45 89 e0 4c 8d 6f 30 48 89 5d d8 41 83 e0 af 48
89 fb 49 83 c6 18 4c 89 7d f8 65 4c 8b 3c 25 c0 b8 00 00 <48> 8b 73 18 44 89 c7
44 89 45 98 ff 53 20 48 85 c0 48 89 c2 74
 RIP  [<ffffffff8111e28a>] mempool_alloc+0x4a/0x150
  RSP <ffff880230699688>
 CR2: 0000000000000018
 ---[ end trace 7a96042017ed21e2 ]---

Signed-off-by: Darrick J. Wong <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: Chris Mason <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
If EM Transmit bit is busy during init ata_msleep() is called.  It is
wrong - msleep() should be used instead of ata_msleep(), because if EM
Transmit bit is busy for one port, it will be busy for all other ports
too, so using ata_msleep() causes wasting tries for another ports.

The most common scenario looks like that now
(six ports try to transmit a LED meaasege):
- port #0 tries for the 1st time and succeeds
- ports #1-5 try for the 1st time and sleeps
- port #1 tries for the 2nd time and succeeds
- ports #2-5 try for the 2nd time and sleeps
- port #2 tries for the 3rd time and succeeds
- ports #3-5 try for the 3rd time and sleeps
- port #3 tries for the 4th time and succeeds
- ports #4-5 try for the 4th time and sleeps
- port #4 tries for the 5th time and succeeds
- port #5 tries for the 5th time and sleeps

At this moment port #5 wasted all its five tries and failed to
initialize.  Because there are only 5 (EM_MAX_RETRY) tries available
usually only five ports succeed to initialize. The sixth port and next
ones usually will fail.

If msleep() is used instead of ata_msleep() the first port succeeds to
initialize in the first try and next ones usually succeed to
initialize in the second try.

tj: updated comment

Signed-off-by: Lukasz Dorau <[email protected]>
Signed-off-by: Tejun Heo <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 057db84 upstream.

Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3)
Accessed by thread T13003:
  #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
  #1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
  #2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
  #3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
  #4 ffffffff812a1065 (__fput+0x155/0x360)
  #5 ffffffff812a12de (____fput+0x1e/0x30)
  #6 ffffffff8111708d (task_work_run+0x10d/0x140)
  #7 ffffffff810ea043 (do_exit+0x433/0x11f0)
  #8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
  #9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
  #10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
  #1 ffffffff8128337c (__kmalloc+0xbc/0x500)
  #2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
  #3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  #4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
  #5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
  #6 ffffffff8129b668 (finish_open+0x68/0xa0)
  #7 ffffffff812b66ac (do_last+0xb8c/0x1710)
  #8 ffffffff812b7350 (path_openat+0x120/0xb50)
  #9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
  #10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
  #11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
  #12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:          fa
  Heap kmalloc redzone:  fb
  Freed heap region:     fd
  Shadow gap:            fe

The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;'

Although the crash happened in ftrace_regex_open() the real bug
occurred in trace_get_user() where there's an incrementation to
parser->idx without a check against the size. The way it is triggered
is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
that reads the last character stores it and then breaks out because
there is no more characters. Then the last character is read to determine
what to do next, and the index is incremented without checking size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/[email protected]

Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 4e58e54 upstream.

If an TRACE_EVENT() uses __assign_str() or __get_str on a NULL pointer
then the following oops will happen:

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<c127a17b>] strlen+0x10/0x1a
*pde = 00000000 ^M
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-rc1-test+ #2
Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006^M
task: f5cde9f0 ti: f5e5e000 task.ti: f5e5e000
EIP: 0060:[<c127a17b>] EFLAGS: 00210046 CPU: 1
EIP is at strlen+0x10/0x1a
EAX: 00000000 EBX: c2472da8 ECX: ffffffff EDX: c2472da8
ESI: c1c5e5fc EDI: 00000000 EBP: f5e5fe84 ESP: f5e5fe80
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
CR0: 8005003b CR2: 00000000 CR3: 01f32000 CR4: 000007d0
Stack:
 f5f18b90 f5e5feb8 c10687a8 0759004f 00000005 00000005 00000005 00200046
 00000002 00000000 c1082a93 f56c7e28 c2472da8 c1082a93 f5e5fee4 c106bc61^M
 00000000 c1082a93 00000000 00000000 00000001 00200046 00200082 00000000
Call Trace:
 [<c10687a8>] ftrace_raw_event_lock+0x39/0xc0
 [<c1082a93>] ? ktime_get+0x29/0x69
 [<c1082a93>] ? ktime_get+0x29/0x69
 [<c106bc61>] lock_release+0x57/0x1a5
 [<c1082a93>] ? ktime_get+0x29/0x69
 [<c10824dd>] read_seqcount_begin.constprop.7+0x4d/0x75
 [<c1082a93>] ? ktime_get+0x29/0x69^M
 [<c1082a93>] ktime_get+0x29/0x69
 [<c108a46a>] __tick_nohz_idle_enter+0x1e/0x426
 [<c10690e8>] ? lock_release_holdtime.part.19+0x48/0x4d
 [<c10bc184>] ? time_hardirqs_off+0xe/0x28
 [<c1068c82>] ? trace_hardirqs_off_caller+0x3f/0xaf
 [<c108a8cb>] tick_nohz_idle_enter+0x59/0x62
 [<c1079242>] cpu_startup_entry+0x64/0x192
 [<c102299c>] start_secondary+0x277/0x27c
Code: 90 89 c6 89 d0 88 c4 ac 38 e0 74 09 84 c0 75 f7 be 01 00 00 00 89 f0 48 5e 5d c3 55 89 e5 57 66 66 66 66 90 83 c9 ff 89 c7 31 c0 <f2> ae f7 d1 8d 41 ff 5f 5d c3 55 89 e5 57 66 66 66 66 90 31 ff
EIP: [<c127a17b>] strlen+0x10/0x1a SS:ESP 0068:f5e5fe80
CR2: 0000000000000000
---[ end trace 01bc47bf519ec1b2 ]---

New tracepoints have been added that have allowed for NULL pointers
being assigned to strings. To fix this, change the TRACE_EVENT() code
to check for NULL and if it is, it will assign "(null)" to it instead
(similar to what glibc printf does).

Reported-by: Shuah Khan <[email protected]>
Reported-by: Jovi Zhangwei <[email protected]>
Link: http://lkml.kernel.org/r/CAGdX0WFeEuy+DtpsJzyzn0343qEEjLX97+o1VREFkUEhndC+5Q@mail.gmail.com
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 9cbf117 ("tracing/events: provide string with undefined size support")
Signed-off-by: Steven Rostedt <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
…is completed

commit e0d5973 upstream.

Currently, when mounting pstore file system, a read callback of
efi_pstore driver runs mutiple times as below.

- In the first read callback, scan efivar_sysfs_list from head and pass
  a kmsg buffer of a entry to an upper pstore layer.
- In the second read callback, rescan efivar_sysfs_list from the entry
  and pass another kmsg buffer to it.
- Repeat the scan and pass until the end of efivar_sysfs_list.

In this process, an entry is read across the multiple read function
calls. To avoid race between the read and erasion, the whole process
above is protected by a spinlock, holding in open() and releasing in
close().

At the same time, kmemdup() is called to pass the buffer to pstore
filesystem during it. And then, it causes a following lockdep warning.

To make the dynamic memory allocation runnable without taking spinlock,
holding off a deletion of sysfs entry if it happens while scanning it
via efi_pstore, and deleting it after the scan is completed.

To implement it, this patch introduces two flags, scanning and deleting,
to efivar_entry.

On the code basis, it seems that all the scanning and deleting logic is
not needed because __efivars->lock are not dropped when reading from the
EFI variable store.

But, the scanning and deleting logic is still needed because an
efi-pstore and a pstore filesystem works as follows.

In case an entry(A) is found, the pointer is saved to psi->data.  And
efi_pstore_read() passes the entry(A) to a pstore filesystem by
releasing  __efivars->lock.

And then, the pstore filesystem calls efi_pstore_read() again and the
same entry(A), which is saved to psi->data, is used for resuming to scan
a sysfs-list.

So, to protect the entry(A), the logic is needed.

[    1.143710] ------------[ cut here ]------------
[    1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110()
[    1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags))
[    1.144058] Modules linked in:
[    1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2
[    1.144058]  0000000000000009 ffff8800797e9ae0 ffffffff816614a5 ffff8800797e9b28
[    1.144058]  ffff8800797e9b18 ffffffff8105510d 0000000000000080 0000000000000046
[    1.144058]  00000000000000d0 00000000000003af ffffffff81ccd0c0 ffff8800797e9b78
[    1.144058] Call Trace:
[    1.144058]  [<ffffffff816614a5>] dump_stack+0x54/0x74
[    1.144058]  [<ffffffff8105510d>] warn_slowpath_common+0x7d/0xa0
[    1.144058]  [<ffffffff8105517c>] warn_slowpath_fmt+0x4c/0x50
[    1.144058]  [<ffffffff8131290f>] ? vsscanf+0x57f/0x7b0
[    1.144058]  [<ffffffff810bbd74>] lockdep_trace_alloc+0x104/0x110
[    1.144058]  [<ffffffff81192da0>] __kmalloc_track_caller+0x50/0x280
[    1.144058]  [<ffffffff815147bb>] ? efi_pstore_read_func.part.1+0x12b/0x170
[    1.144058]  [<ffffffff8115b260>] kmemdup+0x20/0x50
[    1.144058]  [<ffffffff815147bb>] efi_pstore_read_func.part.1+0x12b/0x170
[    1.144058]  [<ffffffff81514800>] ? efi_pstore_read_func.part.1+0x170/0x170
[    1.144058]  [<ffffffff815148b4>] efi_pstore_read_func+0xb4/0xe0
[    1.144058]  [<ffffffff81512b7b>] __efivar_entry_iter+0xfb/0x120
[    1.144058]  [<ffffffff8151428f>] efi_pstore_read+0x3f/0x50
[    1.144058]  [<ffffffff8128d7ba>] pstore_get_records+0x9a/0x150
[    1.158207]  [<ffffffff812af25c>] ? selinux_d_instantiate+0x1c/0x20
[    1.158207]  [<ffffffff8128ce30>] ? parse_options+0x80/0x80
[    1.158207]  [<ffffffff8128ced5>] pstore_fill_super+0xa5/0xc0
[    1.158207]  [<ffffffff811ae7d2>] mount_single+0xa2/0xd0
[    1.158207]  [<ffffffff8128ccf8>] pstore_mount+0x18/0x20
[    1.158207]  [<ffffffff811ae8b9>] mount_fs+0x39/0x1b0
[    1.158207]  [<ffffffff81160550>] ? __alloc_percpu+0x10/0x20
[    1.158207]  [<ffffffff811c9493>] vfs_kern_mount+0x63/0xf0
[    1.158207]  [<ffffffff811cbb0e>] do_mount+0x23e/0xa20
[    1.158207]  [<ffffffff8115b51b>] ? strndup_user+0x4b/0xf0
[    1.158207]  [<ffffffff811cc373>] SyS_mount+0x83/0xc0
[    1.158207]  [<ffffffff81673cc2>] system_call_fastpath+0x16/0x1b
[    1.158207] ---[ end trace 61981bc62de9f6f4 ]---

Signed-off-by: Seiji Aguchi <[email protected]>
Tested-by: Madper Xie <[email protected]>
Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 6fdda9a upstream.

As part of normal operaions, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -> timekeeping locks

clock_was_set_delayed() was suppoed to allow us to avoid deadlocks
between the timekeeping the hrtimer subsystem, so that we could
notify the hrtimer subsytem the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeing code.

But unfortunately the lock chains are complex enoguh that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abrieviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty raspberrypi#4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [<ffffffff81160e96>] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-> #5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-> #4 (&rt_b->rt_runtime_lock){-.-...}:
[snipped]
-> #3 (&rq->lock){-.-.-.}:
[snipped]
-> #2 (&p->pi_lock){-.-.-.}:
[snipped]
-> #1 (&(&pool->lock)->rlock){-.-...}:
[  251.101967]        [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
[  251.101967]        [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
[  251.101967]        [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
[  251.101967]        [<ffffffff84398500>] _raw_spin_lock+0x40/0x80
[  251.101967]        [<ffffffff81153e69>] __queue_work+0x1a9/0x3f0
[  251.101967]        [<ffffffff81154168>] queue_work_on+0x98/0x120
[  251.101967]        [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
[  251.101967]        [<ffffffff811c4bd1>] do_adjtimex+0x111/0x160
[  251.101967]        [<ffffffff811e2711>] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [<ffffffff843a4b49>] ia32_sysret+0x0/0x5
[  251.101967]
-> #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&rt_b->rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #1:  (hrtimer_work){+.+...}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  #2:  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty raspberrypi#4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where the do_adjtimex() was the deadlock
trigger point. Unfortuantely, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner <[email protected]>
Cc: Prarit Bhargava <[email protected]>
Cc: Richard Cochran <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Sasha Levin <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Tested-by: Sasha Levin <[email protected]>
Signed-off-by: John Stultz <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit ae10b2b upstream.

Nathan Zimmer found that once we get over 10+ cpus, the scalability of
SPECjbb falls over due to the contention on the global 'epmutex', which is
taken in on EPOLL_CTL_ADD and EPOLL_CTL_DEL operations.

Patch #1 removes the 'epmutex' lock completely from the EPOLL_CTL_DEL path
by using rcu to guard against any concurrent traversals.

Patch #2 remove the 'epmutex' lock from EPOLL_CTL_ADD operations for
simple topologies.  IE when adding a link from an epoll file descriptor to
a wakeup source, where the epoll file descriptor is not nested.

This patch (of 2):

Optimize EPOLL_CTL_DEL such that it does not require the 'epmutex' by
converting the file->f_ep_links list into an rcu one.  In this way, we can
traverse the epoll network on the add path in parallel with deletes.
Since deletes can't create loops or worse wakeup paths, this is safe.

This patch in combination with the patch "epoll: Do not take global 'epmutex'
for simple topologies", shows a dramatic performance improvement in
scalability for SPECjbb.

Signed-off-by: Jason Baron <[email protected]>
Tested-by: Nathan Zimmer <[email protected]>
Cc: Eric Wong <[email protected]>
Cc: Nelson Elhage <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Davide Libenzi <[email protected]>
Cc: "Paul E. McKenney" <[email protected]>
CC: Wu Fengguang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit d25f06e upstream.

vmxnet3's netpoll driver is incorrectly coded.  It directly calls
vmxnet3_do_poll, which is the driver internal napi poll routine.  As the netpoll
controller method doesn't block real napi polls in any way, there is a potential
for race conditions in which the netpoll controller method and the napi poll
method run concurrently.  The result is data corruption causing panics such as this
one recently observed:
PID: 1371   TASK: ffff88023762caa0  CPU: 1   COMMAND: "rs:main Q:Reg"
 #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
 #1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
 #2 [ffff88023abd58b0] oops_end at ffffffff8152b570
 #3 [ffff88023abd58e0] die at ffffffff81010e0b
 #4 [ffff88023abd5910] do_trap at ffffffff8152add4
 #5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
 #6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
    [exception RIP: vmxnet3_rq_rx_complete+1968]
    RIP: ffffffffa00f1e80  RSP: ffff88023abd5ac8  RFLAGS: 00010086
    RAX: 0000000000000000  RBX: ffff88023b5dcee0  RCX: 00000000000000c0
    RDX: 0000000000000000  RSI: 00000000000005f2  RDI: ffff88023b5dcee0
    RBP: ffff88023abd5b48   R8: 0000000000000000   R9: ffff88023a3b6048
    R10: 0000000000000000  R11: 0000000000000002  R12: ffff8802398d4cd8
    R13: ffff88023af35140  R14: ffff88023b60c890  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
 #8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
 #9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7

The fix is to do as other drivers do, and have the poll controller call the top
half interrupt handler, which schedules a napi poll properly to recieve frames

Tested by myself, successfully.

Signed-off-by: Neil Horman <[email protected]>
CC: Shreyas Bhatewara <[email protected]>
CC: "VMware, Inc." <[email protected]>
CC: "David S. Miller" <[email protected]>
Reviewed-by: Shreyas N Bhatewara <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 2b90563 upstream.

When stopping nfsd, I got BUG messages, and soft lockup messages,
The problem is cuased by double rb_erase() in nfs4_state_destroy_net()
and destroy_client().

This patch just let nfsd traversing unconfirmed client through
hash-table instead of rbtree.

[ 2325.021995] BUG: unable to handle kernel NULL pointer dereference at
          (null)
[ 2325.022809] IP: [<ffffffff8133c18c>] rb_erase+0x14c/0x390
[ 2325.022982] PGD 7a91b067 PUD 7a33d067 PMD 0
[ 2325.022982] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[ 2325.022982] Modules linked in: nfsd(OF) cfg80211 rfkill bridge stp
llc snd_intel8x0 snd_ac97_codec ac97_bus auth_rpcgss nfs_acl serio_raw
e1000 i2c_piix4 ppdev snd_pcm snd_timer lockd pcspkr joydev parport_pc
snd parport i2c_core soundcore microcode sunrpc ata_generic pata_acpi
[last unloaded: nfsd]
[ 2325.022982] CPU: 1 PID: 2123 Comm: nfsd Tainted: GF          O
3.14.0-rc8+ #2
[ 2325.022982] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[ 2325.022982] task: ffff88007b384800 ti: ffff8800797f6000 task.ti:
ffff8800797f6000
[ 2325.022982] RIP: 0010:[<ffffffff8133c18c>]  [<ffffffff8133c18c>]
rb_erase+0x14c/0x390
[ 2325.022982] RSP: 0018:ffff8800797f7d98  EFLAGS: 00010246
[ 2325.022982] RAX: ffff880079c1f010 RBX: ffff880079f4c828 RCX:
0000000000000000
[ 2325.022982] RDX: 0000000000000000 RSI: ffff880079bcb070 RDI:
ffff880079f4c810
[ 2325.022982] RBP: ffff8800797f7d98 R08: 0000000000000000 R09:
ffff88007964fc70
[ 2325.022982] R10: 0000000000000000 R11: 0000000000000400 R12:
ffff880079f4c800
[ 2325.022982] R13: ffff880079bcb000 R14: ffff8800797f7da8 R15:
ffff880079f4c860
[ 2325.022982] FS:  0000000000000000(0000) GS:ffff88007f900000(0000)
knlGS:0000000000000000
[ 2325.022982] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2325.022982] CR2: 0000000000000000 CR3: 000000007a3ef000 CR4:
00000000000006e0
[ 2325.022982] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[ 2325.022982] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[ 2325.022982] Stack:
[ 2325.022982]  ffff8800797f7de0 ffffffffa0191c6e ffff8800797f7da8
ffff8800797f7da8
[ 2325.022982]  ffff880079f4c810 ffff880079bcb000 ffffffff81cc26c0
ffff880079c1f010
[ 2325.022982]  ffff880079bcb070 ffff8800797f7e28 ffffffffa01977f2
ffff8800797f7df0
[ 2325.022982] Call Trace:
[ 2325.022982]  [<ffffffffa0191c6e>] destroy_client+0x32e/0x3b0 [nfsd]
[ 2325.022982]  [<ffffffffa01977f2>] nfs4_state_shutdown_net+0x1a2/0x220
[nfsd]
[ 2325.022982]  [<ffffffffa01700b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
[ 2325.022982]  [<ffffffffa017013e>] nfsd_last_thread+0x4e/0x80 [nfsd]
[ 2325.022982]  [<ffffffffa001f1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
[ 2325.022982]  [<ffffffffa017064b>] nfsd_destroy+0x5b/0x80 [nfsd]
[ 2325.022982]  [<ffffffffa0170773>] nfsd+0x103/0x130 [nfsd]
[ 2325.022982]  [<ffffffffa0170670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[ 2325.022982]  [<ffffffff810a8232>] kthread+0xd2/0xf0
[ 2325.022982]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 2325.022982]  [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
[ 2325.022982]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[ 2325.022982] Code: 48 83 e1 fc 48 89 10 0f 84 02 01 00 00 48 3b 41 10
0f 84 08 01 00 00 48 89 51 08 48 89 fa e9 74 ff ff ff 0f 1f 40 00 48 8b
50 10 <f6> 02 01 0f 84 93 00 00 00 48 8b 7a 10 48 85 ff 74 05 f6 07 01
[ 2325.022982] RIP  [<ffffffff8133c18c>] rb_erase+0x14c/0x390
[ 2325.022982]  RSP <ffff8800797f7d98>
[ 2325.022982] CR2: 0000000000000000
[ 2325.022982] ---[ end trace 28c27ed011655e57 ]---

[  228.064071] BUG: soft lockup - CPU#0 stuck for 22s! [nfsd:558]
[  228.064428] Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211
xt_conntrack rfkill ebtable_nat ebtable_broute bridge stp llc
ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6
nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw
ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security
iptable_raw nfsd(OF) auth_rpcgss nfs_acl lockd snd_intel8x0
snd_ac97_codec ac97_bus joydev snd_pcm snd_timer e1000 sunrpc snd ppdev
parport_pc serio_raw pcspkr i2c_piix4 microcode parport soundcore
i2c_core ata_generic pata_acpi
[  228.064539] CPU: 0 PID: 558 Comm: nfsd Tainted: GF          O
3.14.0-rc8+ #2
[  228.064539] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS
VirtualBox 12/01/2006
[  228.064539] task: ffff880076adec00 ti: ffff880074616000 task.ti:
ffff880074616000
[  228.064539] RIP: 0010:[<ffffffff8133ba17>]  [<ffffffff8133ba17>]
rb_next+0x27/0x50
[  228.064539] RSP: 0018:ffff880074617de0  EFLAGS: 00000282
[  228.064539] RAX: ffff880074478010 RBX: ffff88007446f860 RCX:
0000000000000014
[  228.064539] RDX: ffff880074478010 RSI: 0000000000000000 RDI:
ffff880074478010
[  228.064539] RBP: ffff880074617de0 R08: 0000000000000000 R09:
0000000000000012
[  228.064539] R10: 0000000000000001 R11: ffffffffffffffec R12:
ffffea0001d11a00
[  228.064539] R13: ffff88007f401400 R14: ffff88007446f800 R15:
ffff880074617d50
[  228.064539] FS:  0000000000000000(0000) GS:ffff88007f800000(0000)
knlGS:0000000000000000
[  228.064539] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  228.064539] CR2: 00007fe9ac6ec000 CR3: 000000007a5d6000 CR4:
00000000000006f0
[  228.064539] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  228.064539] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  228.064539] Stack:
[  228.064539]  ffff880074617e28 ffffffffa01ab7db ffff880074617df0
ffff880074617df0
[  228.064539]  ffff880079273000 ffffffff81cc26c0 ffffffff81cc26c0
0000000000000000
[  228.064539]  0000000000000000 ffff880074617e48 ffffffffa01840b8
ffffffff81cc26c0
[  228.064539] Call Trace:
[  228.064539]  [<ffffffffa01ab7db>] nfs4_state_shutdown_net+0x18b/0x220
[nfsd]
[  228.064539]  [<ffffffffa01840b8>] nfsd_shutdown_net+0x38/0x70 [nfsd]
[  228.064539]  [<ffffffffa018413e>] nfsd_last_thread+0x4e/0x80 [nfsd]
[  228.064539]  [<ffffffffa00aa1eb>] svc_shutdown_net+0x2b/0x30 [sunrpc]
[  228.064539]  [<ffffffffa018464b>] nfsd_destroy+0x5b/0x80 [nfsd]
[  228.064539]  [<ffffffffa0184773>] nfsd+0x103/0x130 [nfsd]
[  228.064539]  [<ffffffffa0184670>] ? nfsd_destroy+0x80/0x80 [nfsd]
[  228.064539]  [<ffffffff810a8232>] kthread+0xd2/0xf0
[  228.064539]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[  228.064539]  [<ffffffff816c493c>] ret_from_fork+0x7c/0xb0
[  228.064539]  [<ffffffff810a8160>] ? insert_kthread_work+0x40/0x40
[  228.064539] Code: 1f 44 00 00 55 48 8b 17 48 89 e5 48 39 d7 74 3b 48
8b 47 08 48 85 c0 75 0e eb 25 66 0f 1f 84 00 00 00 00 00 48 89 d0 48 8b
50 10 <48> 85 d2 75 f4 5d c3 66 90 48 3b 78 08 75 f6 48 8b 10 48 89 c7

Fixes: ac55fdc (nfsd: move the confirmed and unconfirmed hlists...)
Signed-off-by: Kinglong Mee <[email protected]>
Reviewed-by: Jeff Layton <[email protected]>
Signed-off-by: J. Bruce Fields <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 622cad1 upstream.

The function ext4_update_i_disksize() is used in only one place, in
the function mpage_map_and_submit_extent().  Move its code to simplify
the code paths, and also move the call to ext4_mark_inode_dirty() into
the i_data_sem's critical region, to be consistent with all of the
other places where we update i_disksize.  That way, we also keep the
raw_inode's i_disksize protected, to avoid the following race:

      CPU #1                                 CPU #2

   down_write(&i_data_sem)
   Modify i_disk_size
   up_write(&i_data_sem)
                                        down_write(&i_data_sem)
                                        Modify i_disk_size
                                        Copy i_disk_size to on-disk inode
                                        up_write(&i_data_sem)
   Copy i_disk_size to on-disk inode

Signed-off-by: "Theodore Ts'o" <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit a585f87 upstream.

The scenario here is that someone calls enable_irq_wake() from somewhere
in the code. This will result in the lockdep producing a backtrace as can
be seen below. In my case, this problem is triggered when using the wl1271
(TI WlCore) driver found in drivers/net/wireless/ti/ .

The problem cause is rather obvious from the backtrace, but let's outline
the dependency. enable_irq_wake() grabs the IRQ buslock in irq_set_irq_wake(),
which in turns calls mxs_gpio_set_wake_irq() . But mxs_gpio_set_wake_irq()
calls enable_irq_wake() again on the one-level-higher IRQ , thus it tries to
grab the IRQ buslock again in irq_set_irq_wake() . Because the spinlock in
irq_set_irq_wake()->irq_get_desc_buslock()->__irq_get_desc_lock() is not
marked as recursive, lockdep will spew the stuff below.

We know we can safely re-enter the lock, so use IRQ_GC_INIT_NESTED_LOCK to
fix the spew.

 =============================================
 [ INFO: possible recursive locking detected ]
 3.10.33-00012-gf06b763-dirty raspberrypi#61 Not tainted
 ---------------------------------------------
 kworker/0:1/18 is trying to acquire lock:
  (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88

 but task is already holding lock:
  (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&irq_desc_lock_class);
   lock(&irq_desc_lock_class);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 3 locks held by kworker/0:1/18:
  #0:  (events){.+.+.+}, at: [<c0036308>] process_one_work+0x134/0x4a4
  #1:  ((&fw_work->work)){+.+.+.}, at: [<c0036308>] process_one_work+0x134/0x4a4
  #2:  (&irq_desc_lock_class){-.-...}, at: [<c00685f0>] __irq_get_desc_lock+0x48/0x88

 stack backtrace:
 CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 3.10.33-00012-gf06b763-dirty raspberrypi#61
 Workqueue: events request_firmware_work_func
 [<c0013eb4>] (unwind_backtrace+0x0/0xf0) from [<c0011c74>] (show_stack+0x10/0x14)
 [<c0011c74>] (show_stack+0x10/0x14) from [<c005bb08>] (__lock_acquire+0x140c/0x1a64)
 [<c005bb08>] (__lock_acquire+0x140c/0x1a64) from [<c005c6a8>] (lock_acquire+0x9c/0x104)
 [<c005c6a8>] (lock_acquire+0x9c/0x104) from [<c051d5a4>] (_raw_spin_lock_irqsave+0x44/0x58)
 [<c051d5a4>] (_raw_spin_lock_irqsave+0x44/0x58) from [<c00685f0>] (__irq_get_desc_lock+0x48/0x88)
 [<c00685f0>] (__irq_get_desc_lock+0x48/0x88) from [<c0068e78>] (irq_set_irq_wake+0x20/0xf4)
 [<c0068e78>] (irq_set_irq_wake+0x20/0xf4) from [<c027260c>] (mxs_gpio_set_wake_irq+0x1c/0x24)
 [<c027260c>] (mxs_gpio_set_wake_irq+0x1c/0x24) from [<c0068cf4>] (set_irq_wake_real+0x30/0x44)
 [<c0068cf4>] (set_irq_wake_real+0x30/0x44) from [<c0068ee4>] (irq_set_irq_wake+0x8c/0xf4)
 [<c0068ee4>] (irq_set_irq_wake+0x8c/0xf4) from [<c0310748>] (wlcore_nvs_cb+0x10c/0x97c)
 [<c0310748>] (wlcore_nvs_cb+0x10c/0x97c) from [<c02be5e8>] (request_firmware_work_func+0x38/0x58)
 [<c02be5e8>] (request_firmware_work_func+0x38/0x58) from [<c0036394>] (process_one_work+0x1c0/0x4a4)
 [<c0036394>] (process_one_work+0x1c0/0x4a4) from [<c0036a4c>] (worker_thread+0x138/0x394)
 [<c0036a4c>] (worker_thread+0x138/0x394) from [<c003cb74>] (kthread+0xa4/0xb0)
 [<c003cb74>] (kthread+0xa4/0xb0) from [<c000ee00>] (ret_from_fork+0x14/0x34)
 wlcore: loaded

Signed-off-by: Marek Vasut <[email protected]>
Acked-by: Shawn Guo <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 3f1f9b8 upstream.

This fixes the following lockdep complaint:

[ INFO: possible circular locking dependency detected ]
3.16.0-rc2-mm1+ #7 Tainted: G           O
-------------------------------------------------------
kworker/u24:0/4356 is trying to acquire lock:
 (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0

but task is already holding lock:
 (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

which lock already depends on the new lock.

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_es_lock);
                               lock(&(&sbi->s_es_lru_lock)->rlock);
                               lock(&ei->i_es_lock);
  lock(&(&sbi->s_es_lru_lock)->rlock);

 *** DEADLOCK ***

6 locks held by kworker/u24:0/4356:
 #0:  ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 #1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 #2:  (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
 #3:  (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
 #4:  (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
 #5:  (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

stack backtrace:
CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G           O   3.16.0-rc2-mm1+ #7
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: writeback bdi_writeback_workfn (flush-253:0)
 ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
 ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
 ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
Call Trace:
 [<ffffffff815df0bb>] dump_stack+0x4e/0x68
 [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
 [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
 [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
 [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff810a8707>] lock_acquire+0x87/0x120
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
 [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
 [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
 [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
 [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
	...

Reported-by: Minchan Kim <[email protected]>
Signed-off-by: Theodore Ts'o <[email protected]>
Reported-by: Minchan Kim <[email protected]>
Cc: Zheng Liu <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
koalo pushed a commit that referenced this issue Aug 4, 2014
commit 0b462c8 upstream.

While a queue is being destroyed, all the blkgs are destroyed and its
->root_blkg pointer is set to NULL.  If someone else starts to drain
while the queue is in this state, the following oops happens.

  NULL pointer dereference at 0000000000000028
  IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  PGD e4a1067 PUD b773067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
  CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
  RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
  R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
  R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
  FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
  Stack:
   ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
   ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
   ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
  Call Trace:
   [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
   [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
   [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
   [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
   [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
   [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
   [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff81427505>] blk_put_queue+0x15/0x20
   [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
   [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
   [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
   [<ffffffff817930e2>] device_release+0x32/0xa0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff817934d7>] put_device+0x17/0x20
   [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
   [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
   [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
   [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
   [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
   [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
   [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
   [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
   [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b

776687b ("block, blk-mq: draining can't be skipped even if
bypass_depth was non-zero") made it easier to trigger this bug by
making blk_queue_bypass_start() drain even when it loses the first
bypass test to blk_cleanup_queue(); however, the bug has always been
there even before the commit as blk_queue_bypass_start() could race
against queue destruction, win the initial bypass test but perform the
actual draining after blk_cleanup_queue() already destroyed all blkgs.

Fix it by skippping calling into policy draining if all the blkgs are
already gone.

Signed-off-by: Tejun Heo <[email protected]>
Reported-by: Shirish Pargaonkar <[email protected]>
Reported-by: Sasha Levin <[email protected]>
Reported-by: Jet Chen <[email protected]>
Tested-by: Shirish Pargaonkar <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant