Merge pull request #13 from vilterp/pv-alloc-profile-docs
alloc profiler: add docs and news
vilterp authored Jan 7, 2022
2 parents 3aaa9d9 + 5dc2beb commit 830e2ad
Showing 4 changed files with 62 additions and 1 deletion.
3 changes: 3 additions & 0 deletions NEWS.md
@@ -135,6 +135,9 @@ Standard library changes
Further, percent utilization is now reported as a total or per-thread, based on whether the thread is idle or not at
each sample. `Profile.fetch()` by default strips out the new metadata to ensure backwards compatibility with external
profiling data consumers, but can be included with the `include_meta` kwarg. ([#41742])
* The new `Profile.Allocs` module allows memory allocations to be profiled. The stack trace, type, and size of each
allocation are recorded, and a `sample_rate` argument allows a tunable fraction of allocations to be skipped,
reducing performance overhead. ([#42768])

#### Random

17 changes: 17 additions & 0 deletions doc/src/manual/profile.md
@@ -336,6 +336,23 @@ and how much garbage it collects each time. This can be enabled with
[`GC.enable_logging(true)`](@ref), which causes Julia to log to stderr every time
a garbage collection happens.

### Allocation Profiler

The allocation profiler records the stack trace, type, and size of each
allocation while it is running. It can be invoked with
[`Profile.Allocs.@profile`](@ref).

The recorded allocations are returned as an array of `Alloc`
objects, wrapped in an `AllocResults` object. The best way to visualize
these is currently with the [PProf.jl](https://github.com/JuliaPerf/PProf.jl)
library, which can show the call stacks that are responsible for the most
allocations.

The allocation profiler does have significant overhead, so a `sample_rate`
argument can be passed to speed it up by making it skip some allocations.
Passing `sample_rate=1.0` will make it record everything (which is slow);
`sample_rate=0.1` will record only 10% of the allocations (faster), etc.
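
As a rough illustration of the workflow described above, the sketch below profiles a small workload and then totals the recorded bytes per type. It assumes a Julia version in which `Profile.Allocs` is available, that results are retrieved with `Profile.Allocs.fetch()` after the macro has run, and that each `Alloc` exposes `type` and `size` fields and `AllocResults` exposes an `allocs` field; `make_vectors` is a hypothetical workload invented for the example.

```julia
using Profile

# Hypothetical workload, used only for illustration.
function make_vectors(n)
    return [zeros(Float64, 16) for _ in 1:n]
end

make_vectors(10)          # warm up so that compilation is not profiled
Profile.Allocs.clear()    # drop any samples left over from earlier runs

# sample_rate=1.0 records every allocation (slow but exhaustive);
# a smaller value such as 0.1 would record roughly 10% of them.
Profile.Allocs.@profile sample_rate=1.0 make_vectors(1_000)

# Assumption: the recorded data is retrieved with fetch().
results = Profile.Allocs.fetch()

# Sum the recorded bytes per type (assumes `type` and `size` fields).
bytes_by_type = Dict{Any,Int}()
for a in results.allocs
    bytes_by_type[a.type] = get(bytes_by_type, a.type, 0) + a.size
end

println("recorded allocations: ", length(results.allocs))
```

With a sample rate below 1.0 the totals are only estimates, since skipped allocations do not appear in `results.allocs`.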

## External Profiling

Currently Julia supports `Intel VTune`, `OProfile` and `perf` as external profiling tools.
17 changes: 17 additions & 0 deletions stdlib/Profile/docs/src/index.md
@@ -1,5 +1,7 @@
# [Profiling](@id lib-profiling)

## CPU Profiling

```@docs
Profile.@profile
```
@@ -15,3 +17,18 @@ Profile.retrieve
Profile.callers
Profile.clear_malloc_data
```

## Memory Profiling

```@docs
Profile.Allocs.@profile
```

The methods in `Profile.Allocs` are not exported and need to be called e.g. as `Profile.Allocs.fetch()`.

```@docs
Profile.Allocs.clear
Profile.Allocs.fetch
Profile.Allocs.start
Profile.Allocs.stop
```
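
The functions listed above can also be driven by hand instead of through the macro. The following is a minimal sketch under the assumption that `start` takes `sample_rate` as a keyword argument (as in the `function start(; sample_rate::Real)` definition) and that `fetch()` returns an `AllocResults` with an `allocs` field; the buffer-allocating loop is a hypothetical workload.

```julia
using Profile

Profile.Allocs.clear()                    # start from an empty profile
Profile.Allocs.start(sample_rate=1.0)     # record every allocation
try
    # Hypothetical workload: allocate 100 byte buffers.
    buffers = [Vector{UInt8}(undef, 1024) for _ in 1:100]
finally
    Profile.Allocs.stop()                 # always stop, even if the workload throws
end

results = Profile.Allocs.fetch()          # decode the samples into Julia objects
println(length(results.allocs), " allocations sampled")

Profile.Allocs.clear()                    # free the profiler's internal buffers
```

Wrapping the workload in `try`/`finally` keeps the profiler from being left running if the profiled code fails.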
26 changes: 25 additions & 1 deletion stdlib/Profile/src/Allocs.jl
@@ -33,6 +33,8 @@ end
Profile allocations that happen during `expr`, returning
both the result and an `AllocResults` struct.
A sample rate of 1.0 will record everything; 0.0 will record nothing.
```julia
julia> Profile.Allocs.@profile sample_rate=0.01 peakflops()
1.03733270279065e11
@@ -59,18 +61,40 @@ function _prof_expr(expr, opts)
end
end

-function start(; sample_rate::Number)
"""
Profile.Allocs.start(; sample_rate::Real)
Begin recording allocations with the given sample rate.
A sample rate of 1.0 will record everything; 0.0 will record nothing.
"""
function start(; sample_rate::Real)
ccall(:jl_start_alloc_profile, Cvoid, (Cdouble,), Float64(sample_rate))
end

"""
Profile.Allocs.stop()
Stop recording allocations.
"""
function stop()
ccall(:jl_stop_alloc_profile, Cvoid, ())
end

"""
Profile.Allocs.clear()
Clear all previously profiled allocation information from memory.
"""
function clear()
ccall(:jl_free_alloc_profile, Cvoid, ())
end

"""
Profile.Allocs.fetch()
Retrieve the recorded allocations, and decode them into Julia
objects which can be analyzed.
"""
function fetch()
raw_results = ccall(:jl_fetch_alloc_profile, RawAllocResults, ())
decoded_results = decode(raw_results)