Allocation profiler #42768

Merged
merged 112 commits into from
Jan 19, 2022
112 commits
b49dd24
alloc profiler
vilterp Oct 7, 2021
2fe1678
fix key
vilterp Oct 22, 2021
c370f5a
don't skip stdlib
vilterp Oct 22, 2021
b8cffd3
rename
vilterp Oct 23, 2021
c733cd6
record freed values
vilterp Oct 23, 2021
964a8a4
print frees by type
vilterp Oct 23, 2021
aed5f5d
fix recording & printing of frees
vilterp Oct 23, 2021
665cacd
Return the raw data from the C++ allocations vector into julia!! :)
NHDaly Dec 6, 2021
7ed832e
get started on decoding on Julia side
vilterp Dec 7, 2021
371bb96
delete more serialization code
vilterp Dec 7, 2021
c8452ce
delete destructors and constructors
vilterp Dec 7, 2021
2d9ec5b
use unique_ptr
vilterp Dec 7, 2021
5cd096a
separate free function; move Julia decoding to stdlib
vilterp Dec 7, 2021
04e2b17
move more to the stdlib
vilterp Dec 7, 2021
77ce036
start hacking on _reformat_bt
vilterp Dec 7, 2021
33dde11
fix decoding
vilterp Dec 7, 2021
07d1009
get alloc size and type address
vilterp Dec 7, 2021
6ec2fd0
just keep type type address for now
vilterp Dec 7, 2021
f5f1be2
send type names over to the Julia side
vilterp Dec 8, 2021
9fe1ad2
package up frees as well
vilterp Dec 8, 2021
f768234
try to fix
vilterp Dec 8, 2021
b11a0fe
malloc strings so they're not junk by the time we look at them
vilterp Dec 8, 2021
c1faa44
implement caching of stack trace lookups
vilterp Dec 8, 2021
671b88e
add GC.@preserve, magically fixing segfaults (???)
vilterp Dec 9, 2021
6932831
delete code to stringify type names on C++ side
vilterp Dec 9, 2021
e164cc8
Fix hang in GC: use `jl_safe_printf` not `jl_printf`
NHDaly Dec 10, 2021
c736dbc
Mark the AllocProfile C code as `JL_NOTSAFEPOINT`
NHDaly Dec 10, 2021
9407a91
print incr. vs full
vilterp Dec 14, 2021
f297d94
put test back in; get it passing
vilterp Dec 18, 2021
50dcb63
simpler Base.show for AllocResults
vilterp Dec 14, 2021
976d2bc
remove gc.preserve
vilterp Dec 18, 2021
e1b2221
add top level `profile` function
vilterp Dec 18, 2021
03a0102
clear after decoding
vilterp Dec 18, 2021
7975c77
free memory allocated for results
vilterp Dec 18, 2021
2c84bb9
first hack at multithreaded profiler
vilterp Dec 18, 2021
b49606d
tweaks
vilterp Dec 18, 2021
84745bf
hold onto vectors
vilterp Dec 20, 2021
a03cfe3
fix some compile errors
vilterp Dec 20, 2021
79427ca
Remove file println debugging
NHDaly Dec 21, 2021
9f63529
Add precompile statements to AllocProfile package
NHDaly Dec 21, 2021
7d12971
add comment about precompilation
NHDaly Dec 21, 2021
a26f375
Malloc right-sized buffers for backtraces.
NHDaly Dec 21, 2021
b5ab611
test that we get a free
vilterp Dec 21, 2021
d53afa2
remove GC logging. splitting off into another PR
vilterp Dec 21, 2021
9de01c6
remove a couple debugging things
vilterp Dec 21, 2021
cdabd0e
add macro interface; use in test
vilterp Dec 21, 2021
ba3c0c8
fix whitespace
vilterp Dec 22, 2021
ddf945c
make skip_every a named optional argument to the macro
vilterp Dec 22, 2021
e2a8ba3
remove debug logs
vilterp Dec 22, 2021
ad1b0e0
add multithreaded test
vilterp Dec 22, 2021
a712267
skip another frame to avoid record_allocated_value
vilterp Jan 4, 2022
ec6cc6f
move AllocProfile into Profile package
vilterp Dec 22, 2021
83ef9ea
rename AllocProfile => Profile.Allocs
vilterp Dec 22, 2021
44073ef
load types: stop using threshold hack; just check for jl_buff_tag
vilterp Jan 6, 2022
b3e5102
remove unused struct
vilterp Jan 6, 2022
0fd988b
remove unused `profile` function
vilterp Jan 6, 2022
0119ce3
first hack at fetch API
vilterp Jan 6, 2022
9d06356
switch to uniform sampling with rand()
vilterp Jan 6, 2022
f0d7da4
small default sample rate
vilterp Jan 6, 2022
cecf3a1
update comment
vilterp Jan 6, 2022
545d06c
Comments + formatting cleanup for C files
NHDaly Jan 6, 2022
e1f8684
Cleanups & comments & docstring
NHDaly Jan 6, 2022
7de1d5f
Split out and update Profile.Allocs tests
NHDaly Jan 6, 2022
7c862da
Move the extern "C" around the impl of external funcs for Allocs
NHDaly Jan 6, 2022
2c271ab
remove pritnln
NHDaly Jan 6, 2022
2cb0a59
move structs to header
NHDaly Jan 6, 2022
535c2fc
Tried to fix compile error still failed...
NHDaly Jan 6, 2022
af8931c
Revert move struct defs - use forward decl instead
NHDaly Jan 6, 2022
f47253c
Rename BTElementT and add docstring example
NHDaly Jan 6, 2022
423830d
remove garbage profile stuff
vilterp Jan 6, 2022
4a86866
remove unordered_map
vilterp Jan 6, 2022
d02a5b6
Fix typo after merge
NHDaly Jan 6, 2022
bfd6210
Finish comment
NHDaly Jan 6, 2022
ff00317
fix typo
NHDaly Jan 6, 2022
e07e67d
Stop clearing the profiled Allocations on `start()`
NHDaly Jan 6, 2022
ed4cb44
Fix tests after remove Garbage Profiler (frees)
NHDaly Jan 6, 2022
c27971b
rename to to `maybe_record_alloc_to_profile`
vilterp Jan 6, 2022
48261b5
small test cleanups
vilterp Jan 6, 2022
43276f4
Fix double-free in the stop,fetch,clear logic
NHDaly Jan 7, 2022
8062b72
Add test for start stop fetch clear
NHDaly Jan 7, 2022
aeec3b6
Ah, fix tricky C++ism: missing a `&` caused accidental double-free
NHDaly Jan 7, 2022
00bb947
print warning message
vilterp Jan 6, 2022
d3d868f
Update stdlib/Profile/src/Allocs.jl
NHDaly Jan 7, 2022
d352a80
a couple cleanups; remove multithread assertion
vilterp Jan 7, 2022
3aaa9d9
remove unordered_map import
vilterp Jan 7, 2022
db64f63
first crack at docs
vilterp Jan 7, 2022
5c25bcd
add news
vilterp Jan 7, 2022
4fec8dd
Apply suggestions from code review
NHDaly Jan 7, 2022
9a4b7fb
PR feedback
vilterp Jan 7, 2022
5dc2beb
Merge branch 'pv-alloc-profile-docs-feedback' into pv-alloc-profile-docs
vilterp Jan 7, 2022
830e2ad
Merge pull request #13 from vilterp/pv-alloc-profile-docs
vilterp Jan 7, 2022
cd14357
rename structs to match Julia style
vilterp Jan 11, 2022
0f39d48
Merge pull request #16 from vilterp/pv-alloc-profile-style-renames
vilterp Jan 11, 2022
e98589b
print out percentage of allocs missed
vilterp Jan 10, 2022
107ece0
remove some debug printlns
vilterp Jan 10, 2022
c4bebea
PR feedback
vilterp Jan 11, 2022
f4f1a62
more PR feedback: `_` before global names
vilterp Jan 11, 2022
452c7cd
rename jl_raw_alloc_results_t => jl_profile_allocs_raw_results_t
vilterp Jan 11, 2022
3f219c7
remove `@show`s
vilterp Jan 11, 2022
b25005a
Merge pull request #15 from vilterp/pv-alloc-profile-allocs-missed
vilterp Jan 11, 2022
bf08801
record string allocs
vilterp Jan 12, 2022
8ec194b
add string test
vilterp Jan 12, 2022
41f169c
Merge pull request #18 from vilterp/pv-alloc-profile-record-strings
vilterp Jan 12, 2022
3fc80ac
alloc profiler: avoid divide-by-zero errors (#19)
vilterp Jan 12, 2022
9465507
add paragraph about missed allocations
vilterp Jan 13, 2022
76ad6f0
use !!! note style in comment
vilterp Jan 18, 2022
dd693cc
only print warning when missed_percentage > 0
vilterp Jan 18, 2022
4deaf6c
Attempt to test the warning logs, but struggling to capture stderr
NHDaly Jan 18, 2022
13a85de
Remove the tests, but at least exercise both cases...
NHDaly Jan 18, 2022
68e1a15
improve warning wording to indicate sampling
NHDaly Jan 18, 2022
b5f636a
Add type assertion for alloc profiler unit tests
NHDaly Jan 18, 2022
b16b610
remove superfluous whitespace
vilterp Jan 19, 2022
3 changes: 3 additions & 0 deletions NEWS.md
@@ -135,6 +135,9 @@ Standard library changes
Further, percent utilization is now reported as a total or per-thread, based on whether the thread is idle or not at
each sample. `Profile.fetch()` by default strips out the new metadata to ensure backwards compatibility with external
profiling data consumers, but can be included with the `include_meta` kwarg. ([#41742])
* The new `Profile.Allocs` module allows memory allocations to be profiled. The stack trace, type, and size of each
allocation are recorded, and a `sample_rate` argument allows a tunable fraction of allocations to be skipped,
reducing performance overhead. ([#42768])

#### Random

25 changes: 25 additions & 0 deletions doc/src/manual/profile.md
@@ -336,6 +336,31 @@ and how much garbage it collects each time. This can be enabled with
[`GC.enable_logging(true)`](@ref), which causes Julia to log to stderr every time
a garbage collection happens.

### Allocation Profiler

The allocation profiler records the stack trace, type, and size of each
allocation while it is running. It can be invoked with
[`Profile.Allocs.@profile`](@ref).

This information about the allocations is returned as an array of `Alloc`
objects, wrapped in an `AllocResults` object. The best way to visualize
these is currently with the [PProf.jl](https://github.com/JuliaPerf/PProf.jl)
library, which can visualize the call stacks which are making the most
allocations.

The allocation profiler does have significant overhead, so a `sample_rate`
argument can be passed to speed it up by making it skip some allocations.
Passing `sample_rate=1.0` will make it record everything (which is slow);
`sample_rate=0.1` will record only 10% of the allocations (faster), etc.

!!! note

    The current implementation of the Allocations Profiler _does not
    capture all allocations._ You can read more about the missing allocations
    and the plan to improve this, here: https://github.com/JuliaLang/julia/issues/43688.
    Calling `Profile.Allocs.fetch()` will print a log line reporting the percentage
    of missed allocations, so you can understand the accuracy of your profile.

## External Profiling

Currently Julia supports `Intel VTune`, `OProfile` and `perf` as external profiling tools.
4 changes: 2 additions & 2 deletions src/Makefile
@@ -45,7 +45,7 @@ RUNTIME_SRCS := \
jltypes gf typemap smallintset ast builtins module interpreter symbol \
dlload sys init task array dump staticdata toplevel jl_uv datatype \
simplevector runtime_intrinsics precompile \
-threading partr stackwalk gc gc-debug gc-pages gc-stacks method \
+threading partr stackwalk gc gc-debug gc-pages gc-stacks gc-alloc-profiler method \
jlapi signal-handling safepoint timing subtype \
crc32c APInt-C processor ircode opaque_closure codegen-stubs coverage
SRCS := jloptions runtime_ccall rtutils
@@ -288,7 +288,7 @@ $(BUILDDIR)/disasm.o $(BUILDDIR)/disasm.dbg.obj: $(SRCDIR)/debuginfo.h $(SRCDIR)
$(BUILDDIR)/dump.o $(BUILDDIR)/dump.dbg.obj: $(addprefix $(SRCDIR)/,common_symbols1.inc common_symbols2.inc builtin_proto.h serialize.h)
$(BUILDDIR)/gc-debug.o $(BUILDDIR)/gc-debug.dbg.obj: $(SRCDIR)/gc.h
$(BUILDDIR)/gc-pages.o $(BUILDDIR)/gc-pages.dbg.obj: $(SRCDIR)/gc.h
-$(BUILDDIR)/gc.o $(BUILDDIR)/gc.dbg.obj: $(SRCDIR)/gc.h
+$(BUILDDIR)/gc.o $(BUILDDIR)/gc.dbg.obj: $(SRCDIR)/gc.h $(SRCDIR)/gc-alloc-profiler.h
$(BUILDDIR)/init.o $(BUILDDIR)/init.dbg.obj: $(SRCDIR)/builtin_proto.h
$(BUILDDIR)/interpreter.o $(BUILDDIR)/interpreter.dbg.obj: $(SRCDIR)/builtin_proto.h
$(BUILDDIR)/jitlayers.o $(BUILDDIR)/jitlayers.dbg.obj: $(SRCDIR)/jitlayers.h $(SRCDIR)/codegen_shared.h
1 change: 1 addition & 0 deletions src/array.c
@@ -508,6 +508,7 @@ JL_DLLEXPORT jl_value_t *jl_alloc_string(size_t len)
s = jl_gc_big_alloc(ptls, allocsz);
}
jl_set_typeof(s, jl_string_type);
maybe_record_alloc_to_profile(s, len);
*(size_t*)s = len;
jl_string_data(s)[len] = 0;
return s;
139 changes: 139 additions & 0 deletions src/gc-alloc-profiler.cpp
@@ -0,0 +1,139 @@
// This file is a part of Julia. License is MIT: https://julialang.org/license

#include "gc-alloc-profiler.h"

#include "julia_internal.h"
#include "gc.h"

#include <string>
#include <vector>

using std::string;
using std::vector;

struct jl_raw_backtrace_t {
    jl_bt_element_t *data;
    size_t size;
};

struct jl_raw_alloc_t {
    jl_datatype_t *type_address;
    jl_raw_backtrace_t backtrace;
    size_t size;
};

// == These structs define the global singleton profile buffer that will be used by
// callbacks to store profile results. ==
struct jl_per_thread_alloc_profile_t {
    vector<jl_raw_alloc_t> allocs;
};

struct jl_alloc_profile_t {
    double sample_rate;

    vector<jl_per_thread_alloc_profile_t> per_thread_profiles;
};

struct jl_combined_results {
    vector<jl_raw_alloc_t> combined_allocs;
};

// == Global variables manipulated by callbacks ==

jl_alloc_profile_t g_alloc_profile;
int g_alloc_profile_enabled = false;
jl_combined_results g_combined_results; // Will live forever.

// === stack stuff ===

jl_raw_backtrace_t get_raw_backtrace() {
    // A single large buffer to record backtraces onto
    static jl_bt_element_t static_bt_data[JL_MAX_BT_SIZE];
Member commented:

This is not thread safe. You should be (re)using the existing buffer in the ptls to be safe here.

Contributor Author commented:

Whoops. Yeah, we need a buffer per thread…

Did you mean to (re)use this buffer? https://github.com/vilterp/julia/blob/c12aca890a8ae387d11db0b54351f8b61305c00b/src/julia_threads.h#L247

It seems like we should add our own (here or as a global) to avoid colliding with that… We've shied away from adding stuff to ptls to avoid breaking stuff or adding overhead, but maybe it's better than globals?

Funny our multithreaded test didn't catch this; maybe the stack traces were just garbage.

Member commented:

As long as you avoid re-using this buffer in the interval between an error filling this buffer, and then allocating the exception stack, I think you should be okay here.

Contributor Author commented:

Makes me a little nervous to reuse the buffer for two things, but it is space-efficient! I guess it should be fine like you say…

filed #44099 to track this; we'll write up a PR soon

Member commented:

The backtrace would be essentially the same, just a bit longer perhaps

Member commented:

Thank you @vtjnash for pointing this out! 😅 haha it feels pretty silly that we missed that. Thanks for the help!!

The suggestion to use a buffer in ptls makes sense. I agree with @vilterp that sharing that buffer makes me slightly nervous, but reading through your description that sounds totally fine 👍 cool, thank you!

Member commented:

Does the CPU profiler use that buffer at all? If someone is running CPU profiling and Allocs profiling at the same time, is there a chance it'll interrupt our thread while we were in the middle of using the buffer and we'll get any problems?

Member commented:

Profiling uses a different buffer entirely

Member commented:

perfect, thanks.

Member commented:

One more question: is it okay to write over this buffer, since it seems like it's used to scan for roots?:

julia/src/task.c

Lines 344 to 345 in 7ccc83e

// storing bt_size in ptls ensures roots in bt_data will be found
ptls->bt_size = rec_backtrace(ptls->bt_data, JL_MAX_BT_SIZE, skip + 1);

julia/src/task.c

Lines 618 to 623 in 7ccc83e

// The temporary ptls->bt_data is rooted by special purpose code in the
// GC. This exists only for the purpose of preserving bt_data until we
// set ptls->bt_size=0 below.
jl_push_excstack(&ct->excstack, exception,
ptls->bt_data, ptls->bt_size);
ptls->bt_size = 0;

I think that it's fine, from what i can read, but just want to double check one more time.

Also, i've opened a PR for this, here: #44114


    size_t bt_size = rec_backtrace(static_bt_data, JL_MAX_BT_SIZE, 2);

    // Then we copy only the needed bytes out of the buffer into our profile.
    size_t bt_bytes = bt_size * sizeof(jl_bt_element_t);
    jl_bt_element_t *bt_data = (jl_bt_element_t*) malloc(bt_bytes);
    memcpy(bt_data, static_bt_data, bt_bytes);

    return jl_raw_backtrace_t{
        bt_data,
        bt_size
    };
}
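
The thread-safety concern raised in the review thread on `get_raw_backtrace` (one `static` scratch buffer shared by every thread) can be illustrated with a small standalone sketch. This is not the fix discussed in the thread, which points to reusing the buffer in ptls (tracked in #44099 and #44114); it only shows the general idea of giving each thread its own scratch space with `thread_local`. `Frame`, `kMaxFrames`, `fake_rec_backtrace`, and `capture_backtrace` are hypothetical stand-ins for `jl_bt_element_t`, `JL_MAX_BT_SIZE`, `rec_backtrace`, and `get_raw_backtrace`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for jl_bt_element_t and JL_MAX_BT_SIZE.
using Frame = std::uintptr_t;
constexpr std::size_t kMaxFrames = 64;

// Hypothetical stand-in for rec_backtrace(): fills `buf` with up to `max`
// fake frames derived from `seed` and returns how many were written.
std::size_t fake_rec_backtrace(Frame *buf, std::size_t max, Frame seed) {
    std::size_t n = max < 8 ? max : 8;
    for (std::size_t i = 0; i < n; i++) {
        buf[i] = seed + i;
    }
    return n;
}

// Each thread that calls this function gets its own `scratch` array, so a
// concurrent caller cannot overwrite the frames before they are copied out.
std::vector<Frame> capture_backtrace(Frame seed) {
    thread_local Frame scratch[kMaxFrames];
    std::size_t n = fake_rec_backtrace(scratch, kMaxFrames, seed);
    // Copy only the frames actually recorded out of the scratch buffer.
    return std::vector<Frame>(scratch, scratch + n);
}
```

With a plain `static` array, two threads calling `capture_backtrace` at the same time could interleave their writes and copies; `thread_local` removes that sharing at the cost of one buffer per thread.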

// == exported interface ==

extern "C" {  // Needed since these functions don't take any arguments.

JL_DLLEXPORT void jl_start_alloc_profile(double sample_rate) {
    // We only need to do this once, the first time this is called.
    while (g_alloc_profile.per_thread_profiles.size() < jl_n_threads) {
        g_alloc_profile.per_thread_profiles.push_back(jl_per_thread_alloc_profile_t{});
    }

    g_alloc_profile.sample_rate = sample_rate;
    g_alloc_profile_enabled = true;
}

JL_DLLEXPORT jl_profile_allocs_raw_results_t jl_fetch_alloc_profile() {
    // combine allocs
    // TODO: interleave to preserve ordering
    for (auto& profile : g_alloc_profile.per_thread_profiles) {
        for (const auto& alloc : profile.allocs) {
            g_combined_results.combined_allocs.push_back(alloc);
        }

        profile.allocs.clear();
    }

    return jl_profile_allocs_raw_results_t{
        g_combined_results.combined_allocs.data(),
        g_combined_results.combined_allocs.size(),
    };
}

JL_DLLEXPORT void jl_stop_alloc_profile() {
    g_alloc_profile_enabled = false;
}

JL_DLLEXPORT void jl_free_alloc_profile() {
    // Free any allocs that remain in the per-thread profiles, that haven't
    // been combined yet (which happens in jl_fetch_alloc_profile()).
    for (auto& profile : g_alloc_profile.per_thread_profiles) {
        for (auto alloc : profile.allocs) {
            free(alloc.backtrace.data);
        }
        profile.allocs.clear();
    }

    // Free the allocs that have been already combined into the combined results object.
    for (auto alloc : g_combined_results.combined_allocs) {
        free(alloc.backtrace.data);
    }

    g_combined_results.combined_allocs.clear();
}
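
The fetch and free functions above rely on each malloc'd backtrace buffer being reachable from exactly one container at a time. The sketch below shows that ownership rule in isolation. `RawAlloc`, `Profile`, `record`, `fetch`, and `free_all` are hypothetical simplifications, not the Julia runtime's types, and the `frees` counter exists only so the behaviour can be checked. The commit list mentions a double free caused by a missing `&`; one plausible way such a bug arises (an assumption, the source does not say where it was) is iterating the per-thread profiles by value, so that `clear()` empties a copy while the original still holds the already-combined pointers.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical, simplified stand-in for jl_raw_alloc_t: the struct owns a
// malloc'd buffer only by convention, so copies share the same pointer.
struct RawAlloc {
    void *backtrace_data;
    std::size_t size;
};

struct Profile {
    std::vector<std::vector<RawAlloc>> per_thread; // one vector per thread
    std::vector<RawAlloc> combined;                // filled by fetch()
    std::size_t frees = 0;                         // counts free() calls, for illustration
};

void record(Profile &p, std::size_t thread_id, std::size_t size) {
    p.per_thread[thread_id].push_back(RawAlloc{std::malloc(16), size});
}

// Moves every per-thread record into `combined`. The per-thread vectors are
// iterated by reference and cleared, so each buffer pointer ends up in
// exactly one place.
std::size_t fetch(Profile &p) {
    for (auto &thread_allocs : p.per_thread) {
        for (const auto &alloc : thread_allocs) {
            p.combined.push_back(alloc);
        }
        thread_allocs.clear();
    }
    return p.combined.size();
}

// Frees every buffer still reachable from either container, exactly once.
void free_all(Profile &p) {
    for (auto &thread_allocs : p.per_thread) {
        for (const auto &alloc : thread_allocs) {
            std::free(alloc.backtrace_data);
            p.frees++;
        }
        thread_allocs.clear();
    }
    for (const auto &alloc : p.combined) {
        std::free(alloc.backtrace_data);
        p.frees++;
    }
    p.combined.clear();
}
```

Recording two allocations, fetching, recording a third, and then calling `free_all` should release three buffers in total: two from `combined` and one still sitting in a per-thread vector.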

// == callback called into by the outside ==

void _maybe_record_alloc_to_profile(jl_value_t *val, size_t size) JL_NOTSAFEPOINT {
    auto& global_profile = g_alloc_profile;
    auto& profile = global_profile.per_thread_profiles[jl_threadid()];

    auto sample_val = double(rand()) / double(RAND_MAX);
    auto should_record = sample_val <= global_profile.sample_rate;
    if (!should_record) {
        return;
    }

    auto type = (jl_datatype_t*)jl_typeof(val);
    profile.allocs.emplace_back(jl_raw_alloc_t{
        type,
        get_raw_backtrace(),
        size
    });
}

} // extern "C"
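
The sampling decision in `_maybe_record_alloc_to_profile` draws a uniform value and compares it with `sample_rate`. The sketch below shows the same idea with the C++ `<random>` facilities instead of `rand()`; this is a swapped-in technique for illustration, not what the PR does. `should_record` and `count_recorded` are hypothetical names. One difference worth noting: the PR compares with `<=` against `rand() / RAND_MAX`, which can be exactly 0, so a rate of 0.0 can still occasionally record; the sketch uses `<` on a value in `[0, 1)`, so 0.0 never records.

```cpp
#include <cstddef>
#include <random>

// Returns true with probability `sample_rate`, drawing from `rng`.
// uniform_real_distribution yields values in [0, 1), so a rate of 0.0
// never records and a rate of 1.0 always records.
template <typename Rng>
bool should_record(double sample_rate, Rng &rng) {
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    return unit(rng) < sample_rate;
}

// Counts how many of `n` simulated allocations would be recorded.
std::size_t count_recorded(double sample_rate, std::size_t n, unsigned seed) {
    std::mt19937 rng(seed);
    std::size_t recorded = 0;
    for (std::size_t i = 0; i < n; i++) {
        if (should_record(sample_rate, rng)) {
            recorded++;
        }
    }
    return recorded;
}
```

With `sample_rate = 0.1`, roughly one in ten simulated allocations is recorded, which is the trade-off the documentation describes: less overhead in exchange for a statistical rather than exhaustive profile.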
49 changes: 49 additions & 0 deletions src/gc-alloc-profiler.h
@@ -0,0 +1,49 @@
// This file is a part of Julia. License is MIT: https://julialang.org/license

#ifndef JL_GC_ALLOC_PROFILER_H
#define JL_GC_ALLOC_PROFILER_H

#include "julia.h"
#include "ios.h"

#ifdef __cplusplus
extern "C" {
#endif

// ---------------------------------------------------------------------
// The public interface to call from Julia for allocations profiling
// ---------------------------------------------------------------------

// Forward-declaration to avoid dependency in header file.
struct jl_raw_alloc_t; // Defined in gc-alloc-profiler.cpp

typedef struct {
    struct jl_raw_alloc_t *allocs;
    size_t num_allocs;
} jl_profile_allocs_raw_results_t;

JL_DLLEXPORT void jl_start_alloc_profile(double sample_rate);
JL_DLLEXPORT jl_profile_allocs_raw_results_t jl_fetch_alloc_profile(void);
JL_DLLEXPORT void jl_stop_alloc_profile(void);
JL_DLLEXPORT void jl_free_alloc_profile(void);

// ---------------------------------------------------------------------
// Functions to call from GC when alloc profiling is enabled
// ---------------------------------------------------------------------

void _maybe_record_alloc_to_profile(jl_value_t *val, size_t size) JL_NOTSAFEPOINT;

extern int g_alloc_profile_enabled;

static inline void maybe_record_alloc_to_profile(jl_value_t *val, size_t size) JL_NOTSAFEPOINT {
    if (__unlikely(g_alloc_profile_enabled)) {
        _maybe_record_alloc_to_profile(val, size);
    }
}

#ifdef __cplusplus
}
#endif


#endif // JL_GC_ALLOC_PROFILER_H
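
The header splits recording into an inline guard (`maybe_record_alloc_to_profile`) and an out-of-line function that does the real work, so the allocation hot path pays only for one flag check while profiling is off. A minimal standalone sketch of that pattern follows. `SKETCH_UNLIKELY` is a hypothetical macro assumed to behave like Julia's `__unlikely`, and the globals and function names are illustrative only.

```cpp
#include <cstddef>

// Hypothetical stand-in for a branch-prediction hint; __builtin_expect is a
// GCC/Clang builtin, so fall back to a plain test on other compilers.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define SKETCH_UNLIKELY(x) (x)
#endif

int g_profile_enabled = 0;          // toggled by start/stop
std::size_t g_slow_path_calls = 0;  // how often the slow path ran
std::size_t g_bytes_recorded = 0;   // total size seen by the slow path

// Out-of-line slow path: only reached while profiling is on.
void record_alloc_slow(std::size_t size) {
    g_slow_path_calls++;
    g_bytes_recorded += size;
}

// Inline fast path: a single, rarely taken branch when profiling is off.
static inline void maybe_record_alloc(std::size_t size) {
    if (SKETCH_UNLIKELY(g_profile_enabled)) {
        record_alloc_slow(size);
    }
}
```

Calls made while `g_profile_enabled` is 0 never reach `record_alloc_slow`; after the flag is set, every call does.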
1 change: 1 addition & 0 deletions src/gc.h
@@ -24,6 +24,7 @@
#endif
#endif
#include "julia_assert.h"
#include "gc-alloc-profiler.h"

#ifdef __cplusplus
extern "C" {
2 changes: 2 additions & 0 deletions src/julia_internal.h
@@ -9,6 +9,7 @@
#include "support/hashing.h"
#include "support/ptrhash.h"
#include "support/strtod.h"
#include "gc-alloc-profiler.h"
#include <uv.h>
#if !defined(_WIN32)
#include <unistd.h>
@@ -364,6 +365,7 @@ STATIC_INLINE jl_value_t *jl_gc_alloc_(jl_ptls_t ptls, size_t sz, void *ty)
v = jl_gc_big_alloc(ptls, allocsz);
}
jl_set_typeof(v, ty);
maybe_record_alloc_to_profile(v, sz);
return v;
}

17 changes: 17 additions & 0 deletions stdlib/Profile/docs/src/index.md
@@ -1,5 +1,7 @@
# [Profiling](@id lib-profiling)

## CPU Profiling

```@docs
Profile.@profile
```
@@ -15,3 +17,18 @@ Profile.retrieve
Profile.callers
Profile.clear_malloc_data
```

## Memory profiling

```@docs
Profile.Allocs.@profile
```

The methods in `Profile.Allocs` are not exported and need to be called e.g. as `Profile.Allocs.fetch()`.

```@docs
Profile.Allocs.clear
Profile.Allocs.fetch
Profile.Allocs.start
Profile.Allocs.stop
```