forked from torvalds/linux
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
drm/i915: RC6 WA BB #18
Open
matt-auld
wants to merge
20
commits into
rib:wip/rib/oa-next
Choose a base branch
from
matt-auld:wip/matt-auld/rc6-wa-bb-playground-v2
base: wip/rib/oa-next
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
drm/i915: RC6 WA BB #18
matt-auld
wants to merge
20
commits into
rib:wip/rib/oa-next
from
matt-auld:wip/matt-auld/rc6-wa-bb-playground-v2
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This adds a DRM_IOCTL_I915_PERF_OPEN ioctl comparable to perf_event_open that opens a file descriptor for an event source. Based on our initial experience aiming to use the core perf infrastructure, this interface is inspired by perf, but focused on exposing metrics about work running on Gen graphics instead a CPU. One notable difference is that it doesn't support mmaping a circular buffer of samples into userspace. The currently planned use cases require an internal buffering that forces at least one copy of data which can be neatly hidden in a read() based interface. No specific event types are supported yet so perf_event_open can currently only get as far as returning EINVAL for an unknown event type. Signed-off-by: Robert Bragg <[email protected]>
OACONTROL changes quite a bit for gen8, with some bits split out into a per-context OACTXCONTROL register Signed-off-by: Robert Bragg <[email protected]>
Adds a static OA unit, MUX + B Counter configuration for basic render metrics on Haswell. This is autogenerated from an internal XML description of metric sets. Signed-off-by: Robert Bragg <[email protected]>
Gen graphics hardware can be set up to periodically write snapshots of performance counters into a circular buffer via its Observation Architecture and this patch exposes that capability to userspace via the i915 perf interface. Cc: Chris Wilson <[email protected]> Signed-off-by: Robert Bragg <[email protected]> Signed-off-by: Zhenyu Wang <[email protected]>
Each metric set is given a sysfs entry like: /sys/class/drm/card0/metrics/<guid>/id This allows userspace to enumerate the specific sets that are available for the current system. The 'id' file contains an unsigned integer that can be used to open the associated metric set via DRM_IOCTL_I915_PERF_OPEN. The <guid> is a globally unique ID for a specific OA unit configuration that can be reliably used as a key to lookup corresponding counter meta data and normalization equations. Signed-off-by: Robert Bragg <[email protected]>
Consistent with the kernel.perf_event_paranoid sysctl option that can allow non-root users to access system wide cpu metrics, this can optionally allow non-root users to access system wide OA counter metrics from Gen graphics hardware. Signed-off-by: Robert Bragg <[email protected]>
The minimal sampling period is now configurable via a dev.i915.oa_min_timer_exponent sysctl parameter. Following the precedent set by perf, the default is the minimum that won't (on its own) exceed the default kernel.perf_event_max_sample_rate default of 100000 samples/s. Signed-off-by: Robert Bragg <[email protected]>
This adds 'compute', 'compute extended', 'memory reads', 'memory writes' and 'sampler balance' metric sets for Haswell. Signed-off-by: Robert Bragg <[email protected]>
The mask of slices enabled sometimes affects what Observability counters are exposed for a particular metric set configuration and so userspace needs the mask to be able to correctly normalize the raw data. Signed-off-by: Robert Bragg <[email protected]>
Enables userspace to determine the number of slices enabled and also know what specific slices are enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW slice configuration. Signed-off-by: Robert Bragg <[email protected]>
Not assuming the subslice mask is uniform across all slices this mask covers all slices with N bits per slice, where N is the maximum number of slices for a particular Gen. Notably this matches the definition of the $SubsliceMask variable that is currently found in out OA metrics XML files. Notably this updates the ss_max constant (max number of sub slices) from 4 to 3 because although the FUSE2 register provides a 4 bit mask, gen 9 configurations really only go up to a maximum of 3 slices, with bit 3 in the FUSE2 subslice disabled mask always set. Cc: Zhenyu Wang <[email protected]> Signed-off-by: Robert Bragg <[email protected]>
Assuming a uniform mask across all slices, this enables userspace to determine the specific sub slices enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW sub slice configuration. Signed-off-by: Robert Bragg <[email protected]>
The current context user handles are specific to drm file instance. There are some usecases, which may require a global id for the contexts. For e.g. a system level GPU profiler tool may lean upon the global context ids to associate the performance snapshots with individual contexts. This global id may also be used further in order to provide a unique context id to hw. In this patch, the global ids are allocated from a separate cyclic idr and can be further utilized for any usecase described above. v2: According to Chris' suggestion, implemented a separate idr for holding global ids for contexts, as opposed to overloading the file specific ctx->user_handle for this purpose. This global id can also further be used wherever hw has to be programmed with ctx unique id, though this patch just introduces the hw global id as such. Signed-off-by: Sourab Gupta <[email protected]>
This will allow the ID to be given to the HW as the unique context identifier that's written, for example, to the context status buffer on preemption and included in reports written by the OA unit. Cc: Sourab Gupta <[email protected]> Signed-off-by: Robert Bragg <[email protected]>
The newly added intel_context::global_id is suitable (a globally unique 20 bit ID) for giving to the hardware as a unique context identifier. Compared to using the pinned address of a logical ring context these IDs are constant for the lifetime of a context whereas a context could be repinned at different addresses during its lifetime. Having a stable ID is useful when we need to buffer information associated with a context based on this ID so the association can't be lost. For example the OA unit writes out counter reports to a circular buffer tagged with this ID and we want to be able to accurately filter reports for a specific context, ideally without the added complexity of tracking context re-pinning while the OA buffer may contain reports with older IDs. Cc: Sourab Gupta <[email protected]> Signed-off-by: Robert Bragg <[email protected]>
Since the exponent for periodic OA counter sampling is maintained in a per-context register while we want to treat it as if it were global state we need to be able to safely issue an mmio write to a per-context register and affect any currently running context. We have to take some extra care in this case and this adds a utility api to encapsulate what's required. Signed-off-by: Robert Bragg <[email protected]>
Adds a static OA unit, MUX, B Counter + Flex EU configurations for basic render metrics on Broadwell, Cherryview and Skylake. These are autogenerated from an internal XML description of metric sets. Signed-off-by: Robert Bragg <[email protected]>
Enables access to OA unit metrics for BDW, CHV and SKL which all share the same OA unit design. Signed-off-by: Robert Bragg <[email protected]>
Signed-off-by: Robert Bragg <[email protected]>
matt-auld
pushed a commit
to matt-auld/linux
that referenced
this pull request
Jul 18, 2016
This bug leads to: [ 1.906411] Unable to handle kernel NULL pointer dereference at virtual address 0000000c [ 1.914878] pgd = c0004000 [ 1.917786] [0000000c] *pgd=00000000 [ 1.921536] Internal error: Oops: 5 [rib#1] SMP ARM [ 1.926357] Modules linked in: [ 1.929556] CPU: 0 PID: 14 Comm: kworker/0:1 Not tainted 4.4.5 rib#18 [ 1.936006] Hardware name: Generic AM33XX (Flattened Device Tree) [ 1.942383] Workqueue: events power_supply_changed_work [ 1.947842] task: de2c41c0 ti: de2c8000 task.ti: de2c8000 [ 1.953483] PC is at tps65217_ac_get_property+0x14/0x28 [ 1.958937] LR is at tps65217_ac_get_property+0x10/0x28 Driver was trying to use drv_data in property get handler. However drv_data was not set, so it caused NULL pointer dereference. This patch properly sets drv_data during probe by power_supply_config parameter, so the property get handler works as desired. Signed-off-by: Marcin Niestroj <[email protected]> Fixes: 3636859 ("power_supply: Add support for tps65217-charger") Signed-off-by: Sebastian Reichel <[email protected]>
rib
force-pushed
the
wip/rib/oa-next
branch
from
September 12, 2016 20:38
4121b50
to
44bfda2
Compare
rib
force-pushed
the
wip/rib/oa-next
branch
2 times, most recently
from
January 20, 2017 20:18
f95436e
to
cca83f7
Compare
rib
force-pushed
the
wip/rib/oa-next
branch
4 times, most recently
from
February 13, 2017 15:18
3a43942
to
13de484
Compare
rib
force-pushed
the
wip/rib/oa-next
branch
2 times, most recently
from
March 14, 2017 23:37
88ccf7a
to
6709fd7
Compare
rib
force-pushed
the
wip/rib/oa-next
branch
2 times, most recently
from
March 23, 2017 20:20
e387c0b
to
bba4f4b
Compare
rib
pushed a commit
that referenced
this pull request
Mar 23, 2017
As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: #8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . #9 [] tcp_rcv_established at ffffffff81580b64 #10 [] tcp_v4_do_rcv at ffffffff8158b54a #11 [] tcp_v4_rcv at ffffffff8158cd02 #12 [] ip_local_deliver_finish at ffffffff815668f4 #13 [] ip_local_deliver at ffffffff81566bd9 #14 [] ip_rcv_finish at ffffffff8156656d #15 [] ip_rcv at ffffffff81566f06 #16 [] __netif_receive_skb_core at ffffffff8152b3a2 #17 [] __netif_receive_skb at ffffffff8152b608 #18 [] netif_receive_skb at ffffffff8152b690 #19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] #20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] #21 [] net_rx_action at ffffffff8152bac2 #22 [] __do_softirq at ffffffff81084b4f #23 [] call_softirq at ffffffff8164845c #24 [] do_softirq at ffffffff81016fc5 #25 [] irq_exit at ffffffff81084ee5 torvalds#26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
rib
force-pushed
the
wip/rib/oa-next
branch
3 times, most recently
from
April 12, 2017 15:54
1e01ca9
to
5539acc
Compare
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Apr 26, 2017
commit 4dfce57 upstream. There have been several reports over the years of NULL pointer dereferences in xfs_trans_log_inode during xfs_fsr processes, when the process is doing an fput and tearing down extents on the temporary inode, something like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr" [exception RIP: xfs_trans_log_inode+0x10] rib#9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs] rib#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs] rib#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs] rib#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs] rib#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs] rib#14 [ffff8800a57bbe00] evict at ffffffff811e1b67 rib#15 [ffff8800a57bbe28] iput at ffffffff811e23a5 rib#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8 rib#17 [ffff8800a57bbe88] dput at ffffffff811dd06c rib#18 [ffff8800a57bbea8] __fput at ffffffff811c823b rib#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e rib#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27 rib#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c rib#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d As it turns out, this is because the i_itemp pointer, along with the d_ops pointer, has been overwritten with zeros when we tear down the extents during truncate. When the in-core inode fork on the temporary inode used by xfs_fsr was originally set up during the extent swap, we mistakenly looked at di_nextents to determine whether all extents fit inline, but this misses extents generated by speculative preallocation; we should be using if_bytes instead. This mistake corrupts the in-memory inode, and code in xfs_iext_remove_inline eventually gets bad inputs, causing it to memmove and memset incorrect ranges; this became apparent because the two values in ifp->if_u2.if_inline_ext[1] contained what should have been in d_ops and i_itemp; they were memmoved due to incorrect array indexing and then the original locations were zeroed with memset, again due to an array overrun. Fix this by properly using i_df.if_bytes to determine the number of extents, not di_nextents. Thanks to dchinner for looking at this with me and spotting the root cause. Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Apr 26, 2017
[ Upstream commit 45caeaa ] As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: rib#8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . rib#9 [] tcp_rcv_established at ffffffff81580b64 rib#10 [] tcp_v4_do_rcv at ffffffff8158b54a rib#11 [] tcp_v4_rcv at ffffffff8158cd02 rib#12 [] ip_local_deliver_finish at ffffffff815668f4 rib#13 [] ip_local_deliver at ffffffff81566bd9 rib#14 [] ip_rcv_finish at ffffffff8156656d rib#15 [] ip_rcv at ffffffff81566f06 rib#16 [] __netif_receive_skb_core at ffffffff8152b3a2 rib#17 [] __netif_receive_skb at ffffffff8152b608 rib#18 [] netif_receive_skb at ffffffff8152b690 rib#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] rib#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] rib#21 [] net_rx_action at ffffffff8152bac2 rib#22 [] __do_softirq at ffffffff81084b4f rib#23 [] call_softirq at ffffffff8164845c rib#24 [] do_softirq at ffffffff81016fc5 rib#25 [] irq_exit at ffffffff81084ee5 torvalds#26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
May 25, 2017
[ Upstream commit 45caeaa ] As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: rib#8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . rib#9 [] tcp_rcv_established at ffffffff81580b64 rib#10 [] tcp_v4_do_rcv at ffffffff8158b54a rib#11 [] tcp_v4_rcv at ffffffff8158cd02 rib#12 [] ip_local_deliver_finish at ffffffff815668f4 rib#13 [] ip_local_deliver at ffffffff81566bd9 rib#14 [] ip_rcv_finish at ffffffff8156656d rib#15 [] ip_rcv at ffffffff81566f06 rib#16 [] __netif_receive_skb_core at ffffffff8152b3a2 rib#17 [] __netif_receive_skb at ffffffff8152b608 rib#18 [] netif_receive_skb at ffffffff8152b690 rib#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] rib#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] rib#21 [] net_rx_action at ffffffff8152bac2 rib#22 [] __do_softirq at ffffffff81084b4f rib#23 [] call_softirq at ffffffff8164845c rib#24 [] do_softirq at ffffffff81016fc5 rib#25 [] irq_exit at ffffffff81084ee5 torvalds#26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
matt-auld
pushed a commit
to matt-auld/linux
that referenced
this pull request
Aug 11, 2017
When PCCT is not available, kernel crashes as below when requests PCC channel 0. This patch fixes this issue. [ 0.920454] PCCT header not found. ... [ 8.031309] Unable to handle kernel NULL pointer dereference at virtual address 00000010 [ 8.031310] [0000000000000010] user address but active_mm is swapper [ 8.031312] Internal error: Oops: 96000004 [rib#1] PREEMPT SMP [ 8.031313] Modules linked in: [ 8.031316] CPU: 31 PID: 1 Comm: swapper/0 Tainted: G W 4.13.0-rc1 rib#18 [ 8.031317] Hardware name: AppliedMicro(R) 07/20/2017 [ 8.031318] task: ffff809ef3b08000 task.stack: ffff809ef3b10000 [ 8.031322] PC is at pcc_mbox_request_channel+0x8c/0x160 [ 8.031325] LR is at xgene_slimpro_i2c_probe+0x1c0/0x378 [ 8.031326] pc : [<ffff000008899450>] lr : [<ffff000008819dac>] pstate: 00000045 [ 8.031327] sp : ffff809ef3b13bd0 [ 8.031327] x29: ffff809ef3b13bd0 x28: ffff000008ed90a0 [ 8.031329] x27: ffff000009091000 x26: ffff000008e50470 [ 8.031330] x25: ffff000008ed9100 x24: ffff809eefd9ac30 [ 8.031332] x23: 0000000000000000 x22: ffff0000090e3e10 [ 8.031333] x21: ffff0000090e3000 x20: 0000000000000000 [ 8.031335] x19: 0000000000000000 x18: 0000000000087ffc [ 8.031336] x17: 2fe48d76a78303f0 x16: 0000000000087ffc [ 8.031337] x15: ffff000000000000 x14: 0000000000000000 [ 8.031339] x13: 0000000000000000 x12: 0000000000000018 [ 8.031340] x11: 0000000000000018 x10: 0101010101010101 [ 8.031342] x9 : 0000000000000000 x8 : 7f7f7f7f7f7f7f7f [ 8.031343] x7 : fefefefeff6b646d x6 : 0000008080808080 [ 8.031345] x5 : 0000000000000000 x4 : 0000000000000001 [ 8.031346] x3 : 0000000000000000 x2 : ffff000008819b64 [ 8.031348] x1 : 0000000000000000 x0 : 0000000000000000 ... [ 8.031393] Call trace: [ 8.031394] Exception stack(0xffff809ef3b13a00 to 0xffff809ef3b13b30) [ 8.031395] 3a00: 0000000000000000 0001000000000000 ffff809ef3b13bd0 ffff000008899450 [ 8.031397] 3a20: ffff809f7e1f9a10 ffff000008f60be0 0000000000000001 ffff809ef3b13b7c [ 8.031398] 3a40: ffff809f7e1f9a10 0000000000000000 ffff000009091000 0000000000000003 [ 8.031399] 3a60: ffff000009091000 0000000000000003 ffff809ef3b13a80 ffff0000084e0794 [ 8.031400] 3a80: ffff809ef3b13a90 ffff00000850bb64 ffff809ef3b13ad0 ffff00000850bf34 [ 8.031402] 3aa0: 0000000000000000 0000000000000000 ffff000008819b64 0000000000000000 [ 8.031403] 3ac0: 0000000000000001 0000000000000000 0000008080808080 fefefefeff6b646d [ 8.031404] 3ae0: 7f7f7f7f7f7f7f7f 0000000000000000 0101010101010101 0000000000000018 [ 8.031405] 3b00: 0000000000000018 0000000000000000 0000000000000000 ffff000000000000 [ 8.031406] 3b20: 0000000000087ffc 2fe48d76a78303f0 [ 8.031409] [<ffff000008899450>] pcc_mbox_request_channel+0x8c/0x160 [ 8.031410] [<ffff000008819dac>] xgene_slimpro_i2c_probe+0x1c0/0x378 [ 8.031413] [<ffff0000085e84dc>] platform_drv_probe+0x50/0xbc [ 8.031414] [<ffff0000085e68a4>] driver_probe_device+0x21c/0x2d0 [ 8.031416] [<ffff0000085e6a04>] __driver_attach+0xac/0xb0 [ 8.031417] [<ffff0000085e4a78>] bus_for_each_dev+0x58/0x98 [ 8.031418] [<ffff0000085e61e4>] driver_attach+0x20/0x28 [ 8.031419] [<ffff0000085e5e0c>] bus_add_driver+0x1c8/0x22c [ 8.031421] [<ffff0000085e7324>] driver_register+0x60/0xf4 [ 8.031422] [<ffff0000085e8420>] __platform_driver_register+0x4c/0x54 [ 8.031425] [<ffff000008e96dd0>] xgene_slimpro_i2c_driver_init+0x18/0x20 [ 8.031426] [<ffff000008083144>] do_one_initcall+0x38/0x124 [ 8.031429] [<ffff000008e50d0c>] kernel_init_freeable+0x190/0x22c [ 8.031431] [<ffff0000089eac30>] kernel_init+0x10/0xfc [ 8.031432] [<ffff000008082ec0>] ret_from_fork+0x10/0x50 [ 8.031434] Code: cb030e63 8b030013 b140067f 54fffda8 (f9400a61) [ 8.031448] ---[ end trace 14eb48a4e1e1f9fb ]--- Signed-off-by: Hoan Tran <[email protected]> Acked-by: Prashanth Prakash <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Jan 11, 2018
Reported by syzkaller: BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm] Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298 CPU: 6 PID: 32298 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ rib#18 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: dump_stack+0xab/0xe1 print_address_description+0x6b/0x290 kasan_report+0x28a/0x370 write_mmio+0x11e/0x270 [kvm] emulator_read_write_onepage+0x311/0x600 [kvm] emulator_read_write+0xef/0x240 [kvm] emulator_fix_hypercall+0x105/0x150 [kvm] em_hypercall+0x2b/0x80 [kvm] x86_emulate_insn+0x2b1/0x1640 [kvm] x86_emulate_instruction+0x39a/0xb90 [kvm] handle_exception+0x1b4/0x4d0 [kvm_intel] vcpu_enter_guest+0x15a0/0x2640 [kvm] kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall) to the guest memory, however, write_mmio tracepoint always prints 8 bytes through *(u64 *)val since kvm splits the mmio access into 8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This patch fixes it by just accessing the bytes which we operate on. Before patch: syz-executor-5567 [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f After patch: syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f Reported-by: Dmitry Vyukov <[email protected]> Reviewed-by: Darren Kenny <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Tested-by: Marc Zyngier <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krčmář <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Christoffer Dall <[email protected]> Signed-off-by: Wanpeng Li <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Mar 13, 2018
when sock_create_kern(..., a) returns an error, 'a' might not be a valid pointer, so it shouldn't be dereferenced to read a->sk->sk_sndbuf and and a->sk->sk_rcvbuf; not doing that caused the following crash: general protection fault: 0000 [rib#1] SMP KASAN Dumping ftrace buffer: (ftrace buffer empty) Modules linked in: CPU: 0 PID: 4254 Comm: syzkaller919713 Not tainted 4.16.0-rc1+ rib#18 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: 0018:ffff8801b06afbc8 EFLAGS: 00010202 RAX: dffffc0000000000 RBX: ffff8801b63457c0 RCX: ffffffff85a3e746 RDX: 0000000000000004 RSI: 00000000ffffffff RDI: 0000000000000020 RBP: ffff8801b06afbf0 R08: 00000000000007c0 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 R13: ffff8801b6345c08 R14: 00000000ffffffe9 R15: ffffffff8695ced0 FS: 0000000001afb880(0000) GS:ffff8801db200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020000040 CR3: 00000001b0721004 CR4: 00000000001606f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __sock_create+0x4d4/0x850 net/socket.c:1285 sock_create net/socket.c:1325 [inline] SYSC_socketpair net/socket.c:1409 [inline] SyS_socketpair+0x1c0/0x6f0 net/socket.c:1366 do_syscall_64+0x282/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x26/0x9b RIP: 0033:0x4404b9 RSP: 002b:00007fff44ab6908 EFLAGS: 00000246 ORIG_RAX: 0000000000000035 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00000000004404b9 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 000000000000002b RBP: 00007fff44ab6910 R08: 0000000000000002 R09: 00007fff44003031 R10: 0000000020000040 R11: 0000000000000246 R12: ffffffffffffffff R13: 0000000000000006 R14: 0000000000000000 R15: 0000000000000000 Code: 48 c1 ea 03 80 3c 02 00 0f 85 b3 01 00 00 4c 8b a3 48 04 00 00 48 b8 00 00 00 00 00 fc ff df 49 8d 7c 24 20 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 82 01 00 00 4d 8b 7c 24 20 48 b8 00 00 00 00 RIP: smc_create+0x14e/0x300 net/smc/af_smc.c:1410 RSP: ffff8801b06afbc8 Fixes: cd6851f smc: remote memory buffers (RMBs) Reported-and-tested-by: [email protected] Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: Ursula Braun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Mar 13, 2018
dev_get_by_index is being called in addr_resolve function which returns NULL and NULL pointer access leads to kernel crash. Following call trace is observed while running rdma_lat test application [ 146.173149] BUG: unable to handle kernel NULL pointer dereference at 00000000000004a0 [ 146.173198] IP: addr_resolve+0x9e/0x3e0 [ib_core] [ 146.173221] PGD 0 P4D 0 [ 146.173869] Oops: 0000 [rib#1] SMP PTI [ 146.182859] CPU: 8 PID: 127 Comm: kworker/8:1 Tainted: G O 4.15.0-rc6+ rib#18 [ 146.183758] Hardware name: LENOVO System x3650 M5: -[8871AC1]-/01KN179, BIOS-[TCE132H-2.50]- 10/11/2017 [ 146.184691] Workqueue: ib_cm cm_work_handler [ib_cm] [ 146.185632] RIP: 0010:addr_resolve+0x9e/0x3e0 [ib_core] [ 146.186584] RSP: 0018:ffffc9000362faa0 EFLAGS: 00010246 [ 146.187521] RAX: 000000000000001b RBX: ffffc9000362fc08 RCX: 0000000000000006 [ 146.188472] RDX: 0000000000000000 RSI: 0000000000000096 RDI : ffff88087fc16990 [ 146.189427] RBP: ffffc9000362fb18 R08: 00000000ffffff9d R09: 00000000000004ac [ 146.190392] R10: 00000000000001e7 R11: 0000000000000001 R12: ffff88086af2e090 [ 146.191361] R13: 0000000000000000 R14: 0000000000000001 R15: 00000000ffffff9d [ 146.192327] FS: 0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000 [ 146.193301] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 146.194274] CR2: 00000000000004a0 CR3: 000000000220a002 CR4: 00000000003606e0 [ 146.195258] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 146.196256] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 146.197231] Call Trace: [ 146.198209] ? rdma_addr_register_client+0x30/0x30 [ib_core] [ 146.199199] rdma_resolve_ip+0x1af/0x280 [ib_core] [ 146.200196] rdma_addr_find_l2_eth_by_grh+0x154/0x2b0 [ib_core] The below patch adds the missing NULL pointer check returned by dev_get_by_index before accessing the netdev to avoid kernel crash. We observed the below crash when we try to do the below test. server client --------- --------- |1.1.1.1|<----rxe-channel--->|1.1.1.2| --------- --------- On server: rdma_lat -c -n 2 -s 1024 On client:rdma_lat 1.1.1.1 -c -n 2 -s 1024 Fixes: 2002983 ("IB/core: Validate route when we init ah") Signed-off-by: Muneendra <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Feb 26, 2019
Enabling SLUB_DEBUG's SLAB_CONSISTENCY_CHECKS with KASAN_SW_TAGS triggers endless false positives during boot below due to check_valid_pointer() checks tagged pointers which have no addresses that is valid within slab pages: BUG radix_tree_node (Tainted: G B ): Freelist Pointer check fails ----------------------------------------------------------------------------- INFO: Slab objects=69 used=69 fp=0x (null) flags=0x7ffffffc000200 INFO: Object @offset=15060037153926966016 fp=0x Redzone: bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb bb ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 18 6b 06 00 08 80 ff d0 .........k...... Object : 18 6b 06 00 08 80 ff d0 00 00 00 00 00 00 00 00 .k.............. Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Object : 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ Redzone: bb bb bb bb bb bb bb bb ........ Padding: 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a 5a ZZZZZZZZZZZZZZZZ CPU: 0 PID: 0 Comm: swapper/0 Tainted: G B 5.0.0-rc5+ rib#18 Call trace: dump_backtrace+0x0/0x450 show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0xa0/0xfc print_trailer+0x1bc/0x1d0 object_err+0x40/0x50 alloc_debug_processing+0xf0/0x19c ___slab_alloc+0x554/0x704 kmem_cache_alloc+0x2f8/0x440 radix_tree_node_alloc+0x90/0x2fc idr_get_free+0x1e8/0x6d0 idr_alloc_u32+0x11c/0x2a4 idr_alloc+0x74/0xe0 worker_pool_assign_id+0x5c/0xbc workqueue_init_early+0x49c/0xd50 start_kernel+0x52c/0xac4 FIX radix_tree_node: Marking all objects used Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Qian Cai <[email protected]> Reviewed-by: Andrey Konovalov <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
matt-auld
pushed a commit
to matt-auld/linux
that referenced
this pull request
Mar 29, 2019
…r-free issue The evlist should be destroyed before the perf session. Detected with gcc's ASan: ================================================================= ==27350==ERROR: AddressSanitizer: heap-use-after-free on address 0x62b000002e38 at pc 0x5611da276999 bp 0x7ffce8f1d1a0 sp 0x7ffce8f1d190 WRITE of size 8 at 0x62b000002e38 thread T0 #0 0x5611da276998 in __list_del /home/work/linux/tools/include/linux/list.h:89 rib#1 0x5611da276d4a in __list_del_entry /home/work/linux/tools/include/linux/list.h:102 rib#2 0x5611da276e77 in list_del_init /home/work/linux/tools/include/linux/list.h:145 rib#3 0x5611da2781cd in thread__put util/thread.c:130 rib#4 0x5611da2cc0a8 in __thread__zput util/thread.h:68 rib#5 0x5611da2d2dcb in hist_entry__delete util/hist.c:1148 rib#6 0x5611da2cdf91 in hists__delete_entry util/hist.c:337 rib#7 0x5611da2ce19e in hists__delete_entries util/hist.c:365 rib#8 0x5611da2db2ab in hists__delete_all_entries util/hist.c:2639 rib#9 0x5611da2db325 in hists_evsel__exit util/hist.c:2651 rib#10 0x5611da1c5352 in perf_evsel__exit util/evsel.c:1304 rib#11 0x5611da1c5390 in perf_evsel__delete util/evsel.c:1309 rib#12 0x5611da1b35f0 in perf_evlist__purge util/evlist.c:124 rib#13 0x5611da1b38e2 in perf_evlist__delete util/evlist.c:148 rib#14 0x5611da069781 in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1645 rib#15 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 rib#16 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 rib#17 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 rib#18 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520 rib#19 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) rib#20 0x5611d9ff35c9 in _start (/home/work/linux/tools/perf/perf+0x3e95c9) 0x62b000002e38 is located 11320 bytes inside of 27448-byte region [0x62b000000200,0x62b000006d38) freed by thread T0 here: #0 0x7fdccb04ab70 in free (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedb70) rib#1 0x5611da260df4 in perf_session__delete util/session.c:201 rib#2 0x5611da063de5 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1300 rib#3 0x5611da06973c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642 rib#4 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 rib#5 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 rib#6 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 rib#7 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520 rib#8 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) previously allocated by thread T0 here: #0 0x7fdccb04b138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) rib#1 0x5611da26010c in zalloc util/util.h:23 rib#2 0x5611da260824 in perf_session__new util/session.c:118 rib#3 0x5611da0633a6 in __cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1192 rib#4 0x5611da06973c in cmd_top /home/changbin/work/linux/tools/perf/builtin-top.c:1642 rib#5 0x5611da17d038 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 rib#6 0x5611da17d577 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 rib#7 0x5611da17d97b in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 rib#8 0x5611da17e0e9 in main /home/changbin/work/linux/tools/perf/perf.c:520 rib#9 0x7fdcc970f09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) SUMMARY: AddressSanitizer: heap-use-after-free /home/work/linux/tools/include/linux/list.h:89 in __list_del Shadow bytes around the buggy address: 0x0c567fff8570: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff8580: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff8590: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff85a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff85b0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd =>0x0c567fff85c0: fd fd fd fd fd fd fd[fd]fd fd fd fd fd fd fd fd 0x0c567fff85d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff85e0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff85f0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff8600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd 0x0c567fff8610: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==27350==ABORTING Signed-off-by: Changbin Du <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
matt-auld
pushed a commit
to matt-auld/linux
that referenced
this pull request
Apr 2, 2019
…mory When halting a guest, QEMU flushes the virtual ITS caches, which amounts to writing to the various tables that the guest has allocated. When doing this, we fail to take the srcu lock, and the kernel shouts loudly if running a lockdep kernel: [ 69.680416] ============================= [ 69.680819] WARNING: suspicious RCU usage [ 69.681526] 5.1.0-rc1-00008-g600025238f51-dirty rib#18 Not tainted [ 69.682096] ----------------------------- [ 69.682501] ./include/linux/kvm_host.h:605 suspicious rcu_dereference_check() usage! [ 69.683225] [ 69.683225] other info that might help us debug this: [ 69.683225] [ 69.683975] [ 69.683975] rcu_scheduler_active = 2, debug_locks = 1 [ 69.684598] 6 locks held by qemu-system-aar/4097: [ 69.685059] #0: 0000000034196013 (&kvm->lock){+.+.}, at: vgic_its_set_attr+0x244/0x3a0 [ 69.686087] rib#1: 00000000f2ed935e (&its->its_lock){+.+.}, at: vgic_its_set_attr+0x250/0x3a0 [ 69.686919] rib#2: 000000005e71ea54 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.687698] rib#3: 00000000c17e548d (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.688475] rib#4: 00000000ba386017 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.689978] rib#5: 00000000c2c3c335 (&vcpu->mutex){+.+.}, at: lock_all_vcpus+0x64/0xd0 [ 69.690729] [ 69.690729] stack backtrace: [ 69.691151] CPU: 2 PID: 4097 Comm: qemu-system-aar Not tainted 5.1.0-rc1-00008-g600025238f51-dirty rib#18 [ 69.691984] Hardware name: rockchip evb_rk3399/evb_rk3399, BIOS 2019.04-rc3-00124-g2feec69fb1 03/15/2019 [ 69.692831] Call trace: [ 69.694072] lockdep_rcu_suspicious+0xcc/0x110 [ 69.694490] gfn_to_memslot+0x174/0x190 [ 69.694853] kvm_write_guest+0x50/0xb0 [ 69.695209] vgic_its_save_tables_v0+0x248/0x330 [ 69.695639] vgic_its_set_attr+0x298/0x3a0 [ 69.696024] kvm_device_ioctl_attr+0x9c/0xd8 [ 69.696424] kvm_device_ioctl+0x8c/0xf8 [ 69.696788] do_vfs_ioctl+0xc8/0x960 [ 69.697128] ksys_ioctl+0x8c/0xa0 [ 69.697445] __arm64_sys_ioctl+0x28/0x38 [ 69.697817] el0_svc_common+0xd8/0x138 [ 69.698173] el0_svc_handler+0x38/0x78 [ 69.698528] el0_svc+0x8/0xc The fix is to obviously take the srcu lock, just like we do on the read side of things since bf30824. One wonders why this wasn't fixed at the same time, but hey... Fixes: bf30824 ("KVM: arm/arm64: VGIC/ITS: protect kvm_read_guest() calls with SRCU lock") Signed-off-by: Marc Zyngier <[email protected]>
djdeath
pushed a commit
to djdeath/linux
that referenced
this pull request
Jul 9, 2019
The lcdc device is missing the dma_coherent_mask definition causing the following warning on da850-evm: da8xx_lcdc da8xx_lcdc.0: found Sharp_LK043T1DG01 panel ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/dma/mapping.c:247 dma_alloc_attrs+0xc8/0x110 Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 5.2.0-rc3-00077-g16d72dd4891f rib#18 Hardware name: DaVinci DA850/OMAP-L138/AM18x EVM [<c000fce8>] (unwind_backtrace) from [<c000d900>] (show_stack+0x10/0x14) [<c000d900>] (show_stack) from [<c001a4f8>] (__warn+0xec/0x114) [<c001a4f8>] (__warn) from [<c001a634>] (warn_slowpath_null+0x3c/0x48) [<c001a634>] (warn_slowpath_null) from [<c0065860>] (dma_alloc_attrs+0xc8/0x110) [<c0065860>] (dma_alloc_attrs) from [<c02820f8>] (fb_probe+0x228/0x5a8) [<c02820f8>] (fb_probe) from [<c02d3e9c>] (platform_drv_probe+0x48/0x9c) [<c02d3e9c>] (platform_drv_probe) from [<c02d221c>] (really_probe+0x1d8/0x2d4) [<c02d221c>] (really_probe) from [<c02d2474>] (driver_probe_device+0x5c/0x168) [<c02d2474>] (driver_probe_device) from [<c02d2728>] (device_driver_attach+0x58/0x60) [<c02d2728>] (device_driver_attach) from [<c02d27b0>] (__driver_attach+0x80/0xbc) [<c02d27b0>] (__driver_attach) from [<c02d047c>] (bus_for_each_dev+0x64/0xb4) [<c02d047c>] (bus_for_each_dev) from [<c02d1590>] (bus_add_driver+0xe4/0x1d8) [<c02d1590>] (bus_add_driver) from [<c02d301c>] (driver_register+0x78/0x10c) [<c02d301c>] (driver_register) from [<c000a5c0>] (do_one_initcall+0x48/0x1bc) [<c000a5c0>] (do_one_initcall) from [<c05cae6c>] (kernel_init_freeable+0x10c/0x1d8) [<c05cae6c>] (kernel_init_freeable) from [<c048a000>] (kernel_init+0x8/0xf4) [<c048a000>] (kernel_init) from [<c00090e0>] (ret_from_fork+0x14/0x34) Exception stack(0xc6837fb0 to 0xc6837ff8) 7fa0: 00000000 00000000 00000000 00000000 7fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 8a8073511be81dd2 ]--- Add a 32-bit mask to the platform device's definition. Signed-off-by: Bartosz Golaszewski <[email protected]> Signed-off-by: Sekhar Nori <[email protected]>
matt-auld
pushed a commit
to matt-auld/linux
that referenced
this pull request
Oct 9, 2019
The LCD timing definitions between Linux DRM vs Allwinner are different, below diagram shows this clear differences. Active Front Sync Back Region Porch Porch <-----------------------><----------------><--------------><--------------> //////////////////////| ////////////////////// | ////////////////////// |.................. ................ ________________ <----- [hv]display -----> <------------- [hv]sync_start ------------> <--------------------- [hv]sync_end ----------------------> <-------------------------------- [hv]total ------------------------------> <----- lcd_[xy] --------> <- lcd_[hv]spw -> <---------- lcd_[hv]bp ---------> <-------------------------------- lcd_[hv]t ------------------------------> The DSI driver misinterpreted the vbp term from the BSP code to refer only to the backporch, when in fact it was backporch + sync. Thus the driver incorrectly used the vertical front porch plus sync in its calculation of the DRQ set bit value, when it should not have included the sync timing. Including additional sync timings leads to flip_done timed out as: WARNING: CPU: 0 PID: 31 at drivers/gpu/drm/drm_atomic_helper.c:1429 drm_atomic_helper_wait_for_vblanks.part.1+0x298/0x2a0 [CRTC:46:crtc-0] vblank wait timed out Modules linked in: CPU: 0 PID: 31 Comm: kworker/0:1 Not tainted 5.1.0-next-20190514-00029-g09e5b0ed0a58 rib#18 Hardware name: Allwinner sun8i Family Workqueue: events deferred_probe_work_func [<c010ed54>] (unwind_backtrace) from [<c010b76c>] (show_stack+0x10/0x14) [<c010b76c>] (show_stack) from [<c0688c70>] (dump_stack+0x84/0x98) [<c0688c70>] (dump_stack) from [<c011d9e4>] (__warn+0xfc/0x114) [<c011d9e4>] (__warn) from [<c011da40>] (warn_slowpath_fmt+0x44/0x68) [<c011da40>] (warn_slowpath_fmt) from [<c040cd50>] (drm_atomic_helper_wait_for_vblanks.part.1+0x298/0x2a0) [<c040cd50>] (drm_atomic_helper_wait_for_vblanks.part.1) from [<c040e694>] (drm_atomic_helper_commit_tail_rpm+0x5c/0x6c) [<c040e694>] (drm_atomic_helper_commit_tail_rpm) from [<c040e4dc>] (commit_tail+0x40/0x6c) [<c040e4dc>] (commit_tail) from [<c040e5cc>] (drm_atomic_helper_commit+0xbc/0x128) [<c040e5cc>] (drm_atomic_helper_commit) from [<c0411b64>] (restore_fbdev_mode_atomic+0x1cc/0x1dc) [<c0411b64>] (restore_fbdev_mode_atomic) from [<c04156f8>] (drm_fb_helper_restore_fbdev_mode_unlocked+0x54/0xa0) [<c04156f8>] (drm_fb_helper_restore_fbdev_mode_unlocked) from [<c0415774>] (drm_fb_helper_set_par+0x30/0x54) [<c0415774>] (drm_fb_helper_set_par) from [<c03ad450>] (fbcon_init+0x560/0x5ac) [<c03ad450>] (fbcon_init) from [<c03eb8a0>] (visual_init+0xbc/0x104) [<c03eb8a0>] (visual_init) from [<c03ed1b8>] (do_bind_con_driver+0x1b0/0x390) [<c03ed1b8>] (do_bind_con_driver) from [<c03ed780>] (do_take_over_console+0x13c/0x1c4) [<c03ed780>] (do_take_over_console) from [<c03ad800>] (do_fbcon_takeover+0x74/0xcc) [<c03ad800>] (do_fbcon_takeover) from [<c013c9c8>] (notifier_call_chain+0x44/0x84) [<c013c9c8>] (notifier_call_chain) from [<c013cd20>] (__blocking_notifier_call_chain+0x48/0x60) [<c013cd20>] (__blocking_notifier_call_chain) from [<c013cd50>] (blocking_notifier_call_chain+0x18/0x20) [<c013cd50>] (blocking_notifier_call_chain) from [<c03a6e44>] (register_framebuffer+0x1e0/0x2f8) [<c03a6e44>] (register_framebuffer) from [<c04153c0>] (__drm_fb_helper_initial_config_and_unlock+0x2fc/0x50c) [<c04153c0>] (__drm_fb_helper_initial_config_and_unlock) from [<c04158c8>] (drm_fbdev_client_hotplug+0xe8/0x1b8) [<c04158c8>] (drm_fbdev_client_hotplug) from [<c0415a20>] (drm_fbdev_generic_setup+0x88/0x118) [<c0415a20>] (drm_fbdev_generic_setup) from [<c043f060>] (sun4i_drv_bind+0x128/0x160) [<c043f060>] (sun4i_drv_bind) from [<c044b598>] (try_to_bring_up_master+0x164/0x1a0) [<c044b598>] (try_to_bring_up_master) from [<c044b668>] (__component_add+0x94/0x140) [<c044b668>] (__component_add) from [<c0445e1c>] (sun6i_dsi_probe+0x144/0x234) [<c0445e1c>] (sun6i_dsi_probe) from [<c0452ef4>] (platform_drv_probe+0x48/0x9c) [<c0452ef4>] (platform_drv_probe) from [<c04512cc>] (really_probe+0x1dc/0x2c8) [<c04512cc>] (really_probe) from [<c0451518>] (driver_probe_device+0x60/0x160) [<c0451518>] (driver_probe_device) from [<c044f7a4>] (bus_for_each_drv+0x74/0xb8) [<c044f7a4>] (bus_for_each_drv) from [<c045107c>] (__device_attach+0xd0/0x13c) [<c045107c>] (__device_attach) from [<c0450474>] (bus_probe_device+0x84/0x8c) [<c0450474>] (bus_probe_device) from [<c0450900>] (deferred_probe_work_func+0x64/0x90) [<c0450900>] (deferred_probe_work_func) from [<c0135970>] (process_one_work+0x204/0x420) [<c0135970>] (process_one_work) from [<c013690c>] (worker_thread+0x274/0x5a0) [<c013690c>] (worker_thread) from [<c013b3d8>] (kthread+0x11c/0x14c) [<c013b3d8>] (kthread) from [<c01010e8>] (ret_from_fork+0x14/0x2c) Exception stack(0xde539fb0 to 0xde539ff8) 9fa0: 00000000 00000000 00000000 00000000 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 ---[ end trace 495200a78b24980e ]--- random: fast init done [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CRTC:46:crtc-0] flip_done timed out [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [CONNECTOR:48:DSI-1] flip_done timed out [drm:drm_atomic_helper_wait_for_dependencies] *ERROR* [PLANE:30:plane-0] flip_done timed out With the terms(as described in above diagram) fixed, the panel displays correctly without any timeouts. Tested-by: Merlijn Wajer <[email protected]> Signed-off-by: Jagan Teki <[email protected]> Signed-off-by: Icenowy Zheng <[email protected]> Signed-off-by: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.