
Update #1

Closed

wants to merge 38 commits into from

Conversation

BenPope
Member

@BenPope BenPope commented Nov 12, 2020

Most important fix for us is: "Handle CPUs not attached to any NUMA nodes"

elazarl and others added 30 commits September 19, 2020 19:32
In order to differentiate between seastar allocations and system
allocations, we'll add a method to test whether a specific address is
seastar allocated. Easy, since we know the virtual address space
boundaries.

Signed-off-by: Elazar Leibovich <[email protected]>
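For illustration, a minimal sketch of such an address-range test, assuming the allocator reserves one contiguous virtual address range (`seastar_mem_base` and `seastar_mem_size` are placeholder names, not seastar's actual symbols):

```
// Sketch only: a pointer is "seastar allocated" iff it falls inside the
// reserved virtual address range. Names and the range size are illustrative.
#include <cstddef>
#include <cstdint>

static char* seastar_mem_base = nullptr;                      // start of the reserved range
static constexpr size_t seastar_mem_size = size_t(1) << 44;   // hypothetical range size

bool is_seastar_allocated(void* ptr) {
    auto p = reinterpret_cast<uintptr_t>(ptr);
    auto base = reinterpret_cast<uintptr_t>(seastar_mem_base);
    return p >= base && p < base + seastar_mem_size;
}
```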
We save the original overridden memory allocation functions in global
variables for use in the next commits.

Signed-off-by: Elazar Leibovich <[email protected]>
We mark all reactor threads with an is_reactor_thread TLS boolean
variable, to select a suitable allocator.

Signed-off-by: Elazar Leibovich <[email protected]>
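A minimal sketch of that flag (the helper name is an assumption; in seastar the flag is set during reactor/shard start-up):

```
// Sketch only: thread-local flag distinguishing reactor (shard) threads from
// alien threads; the allocation entry points consult it to pick an allocator.
static thread_local bool is_reactor_thread = false;

void mark_current_thread_as_reactor() {   // hypothetical helper, called when a shard starts
    is_reactor_thread = true;
}
```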
cpu_pages is a thread-local variable. This means its constructor and
destructor can run on any thread, reactor thread or alien thread.

The cpu_pages constructor is a no-op, but the destructor segfaults if it
runs without cpu_pages::initialize having been called.

We now run the destructor only if cpu_pages::initialize was run.

Signed-off-by: Elazar Leibovich <[email protected]>
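A simplified illustration of the guard, with assumed member names (not the real cpu_pages layout):

```
// Sketch: the destructor tears things down only if initialize() ever ran on
// this thread; on alien threads it stays a no-op.
struct cpu_pages_sketch {
    bool initialized = false;

    void initialize() {
        // ... set up the per-shard arenas ...
        initialized = true;
    }

    ~cpu_pages_sketch() {
        if (!initialized) {
            return;               // destructor running on a thread that never initialized
        }
        // ... tear down the arenas ...
    }
};
```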
We use the system allocator if the TLS variable is_reactor_thread is not set.

Note: if for any reason dlsym failed to find the original functions, we
silently fall back to the seastar allocator.

Likewise, before main and before the static constructors have run, the
seastar allocator is used.

Signed-off-by: Elazar Leibovich <[email protected]>
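A hedged sketch of this dispatch (function names other than the libc ones are illustrative; the real code covers more entry points such as calloc, realloc and aligned_alloc):

```
// Sketch: resolve libc's malloc via dlsym(RTLD_NEXT, ...) and fall back to
// the seastar allocator on reactor threads, or whenever the lookup has not
// succeeded (e.g. very early, before static constructors ran).
#include <dlfcn.h>
#include <cstddef>

extern thread_local bool is_reactor_thread;   // see the TLS flag above
extern void* seastar_allocate(size_t);        // assumed seastar-side entry point

using malloc_fn = void* (*)(size_t);
static malloc_fn original_malloc = reinterpret_cast<malloc_fn>(dlsym(RTLD_NEXT, "malloc"));

extern "C" void* malloc(size_t n) {
    if (is_reactor_thread || !original_malloc) {
        return seastar_allocate(n);           // silent fallback, as described above
    }
    return original_malloc(n);                // alien thread: use the system allocator
}
```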
Since most free calls free memory belonging to the local CPU, we wish to
find out quickly whether this is the case.

For that, we keep a CPU-local cpu_id_mask and a CPU-local expected cpu_id
value. We can then quickly mask the given pointer and see whether it
matches the expected cpu_id.

On foreign threads, the expected cpu_id value is all bits set, so the
condition can never be satisfied there.

[avi: also include mem_base() in expected_cpu_id, and expand the mask,
      to avoid false positive on addresses that happen to have 0
      in the cpu_id bits (which are most foreign addresses)]
Signed-off-by: Elazar Leibovich <[email protected]>

include mem_base in expected_cpu_id
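A sketch of that fast-path test (names are illustrative; in the real code the mask also covers the mem_base bits, as noted above):

```
// Sketch: each shard stores the bits a locally-owned pointer must have
// (expected_cpu_id) and a mask isolating those bits. On alien threads
// expected_cpu_id is all-ones, which no masked pointer can ever match.
#include <cstdint>

static thread_local uintptr_t cpu_id_mask = 0;                   // set up at shard init
static thread_local uintptr_t expected_cpu_id = ~uintptr_t(0);   // unmatchable on alien threads

bool is_local_pointer(const void* ptr) {
    return (reinterpret_cast<uintptr_t>(ptr) & cpu_id_mask) == expected_cpu_id;
}
```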
Since try_foreign_free has a fast path that doesn't need cpu_pages, it
can be a static function, and touch cpu_pages only when it actually
succeeds in freeing.

Signed-off-by: Elazar Leibovich <[email protected]>
In rare cases, one can call memory::configure without doing any memory
allocations at all, and hence call cpu_pages::resize before
cpu_pages::initialize has been called.

I chose not to add anything to cpu_pages::resize, since on all other
paths to cpu_pages::resize, cpu_pages::initialize is called first.

When cpu_pages::resize is called before cpu_pages::initialize, it loops
indefinitely, since cpu_pages::nr_pages is 0.

Signed-off-by: Elazar Leibovich <[email protected]>
This commit maintains counters of foreign mallocs and frees,
differentiating local foreign frees from reactor foreign frees.

Signed-off-by: Elazar Leibovich <[email protected]>
We run allocation + free on an alien thread, and make sure that malloc
statistics are changed.

We test the following matrix:

                 | malloc | realloc | malloc+realloc | aligned_alloc|
                 +--------------------------------------------------+
malloc+free on   |        |         |                |              |
alien            |        |         |                |              |
-----------------+--------------------------------------------------+
malloc on alien  |        |         |                |              |
free on reactor  |        |         |                |              |
-----------------+--------------------------------------------------+
malloc on reactor|        |         |                |              |
free on alien    |        |         |                |              |

Signed-off-by: Elazar Leibovich <[email protected]>
When using alien threads, seastar's regular thread_local statistics
won't be collected.

A different approach, as discussed in [0], is to keep a record of the
allocation statistics in an array. Each alien thread records its
statistics in bucket "hashed_thread_id % total_num_buckets".

When collecting statistics, one can go over the buckets and get an
estimate of the foreign threads' behavior, hoping there are few collisions.

This ensures that statistics keep working even when many foreign threads
are created and destroyed.

[0] https://groups.google.com/d/msg/seastar-dev/vpmi31RNu0M/cUhsdFMVBgAJ

[avi: add #include <thread> for std::thread::id]
[avi: remove now-unused g_allocs and friends]
Signed-off-by: Elazar Leibovich <[email protected]>
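A sketch of the bucketed counters described above (bucket count and names are illustrative):

```
// Sketch: alien threads hash their thread id into a fixed array of atomic
// counters, so the statistics survive threads being created and destroyed
// and never need unbounded per-thread state.
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <thread>

constexpr size_t n_buckets = 64;

struct alien_stats {
    std::atomic<uint64_t> mallocs{0};
    std::atomic<uint64_t> frees{0};
};

static std::array<alien_stats, n_buckets> alien_buckets;

alien_stats& current_alien_bucket() {
    auto h = std::hash<std::thread::id>{}(std::this_thread::get_id());
    return alien_buckets[h % n_buckets];
}

// e.g. on an alien allocation:
//   current_alien_bucket().mallocs.fetch_add(1, std::memory_order_relaxed);
```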
semaphore methods accessing trivial types are noexcept.

In addition, with 9ae33e6
expiring_fifo methods are noexcept and we can
mark methods using them as noexcept too.

Signed-off-by: Benny Halevy <[email protected]>
promise is nothrow move constructible.
Also, promise::set_value is noexcept so we can
mark basic_semaphore::signal noexcept as well.

Signed-off-by: Benny Halevy <[email protected]>
This is on the error handling path and there's no reason
to keep it defined in the header file(s).

Signed-off-by: Benny Halevy <[email protected]>
broken_semaphore and semaphore_timed_out default constructors do not throw
so semaphore_default_exception_factory's respective methods can
be marked noexcept as well.

Signed-off-by: Benny Halevy <[email protected]>
Catch exceptions when formatting named_semaphore_timed_out
or broken_named_semaphore exceptions and convert them
into a static message.

Signed-off-by: Benny Halevy <[email protected]>
This allows us to instantiate a single ExceptionFactory -
the semaphore's base-class, rather than two of them,
one for the semaphore and one for the expiry_handler.

With that, constructing the wait list becomes noexcept
(based on std::function being nothrow move constructible)

Note that theoretically, exception_factory::timeout() may throw.
To simplify error handling in the expiry handler function,
catch this exception and turn the error into a generic, non-throwing one
and with that mark the returned function noexcept.

Signed-off-by: Benny Halevy <[email protected]>
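A sketch of the "catch and degrade to a generic error" pattern used here (the surrounding helper is illustrative; semaphore_timed_out is seastar's generic timeout type):

```
// Sketch: if the user-supplied exception factory itself throws while building
// the timeout exception, fall back to a generic, non-throwing error so the
// enclosing expiry handler can stay noexcept.
#include <exception>
#include <seastar/core/semaphore.hh>

template <typename ExceptionFactory>
std::exception_ptr make_timeout_exception(ExceptionFactory& factory) noexcept {
    try {
        return std::make_exception_ptr(factory.timeout());
    } catch (...) {
        return std::make_exception_ptr(seastar::semaphore_timed_out());
    }
}
```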
Theoretically, exception_factory::broken() may throw.
To simplify error handling, catch this exception
and turn the error into a generic, non-throwing one
and with that mark the broken() method noexcept.

Signed-off-by: Benny Halevy <[email protected]>
Error paths need not be implemented in the header file.
Also mark them noexcept as broken_condition_variable
and condition_variable_timed_out are nothrow default constructible.

Signed-off-by: Benny Halevy <[email protected]>
Now with the respective semaphore interface marked noexcept
and given that condition_variable_exception_factory is
nothrow constructible, condition_variable constructor
and methods can all be marked noexcept.

Signed-off-by: Benny Halevy <[email protected]>
Merged patch set by Elazar Leibovich and changes by Avi Kivity:

Currently, the seastar allocator and seastar::alien are mutually
exclusive. The reason is that the Seastar allocator does not support
allocating from an alien thread; only allocations from Seastar shards
are allowed. Since the Seastar allocator is not thread-safe, alien
threads cannot touch its memory arenas.

This patchset solves the problem by falling back to the libc allocator
when called from a non-Seastar thread, and by calling libc free for
allocations that were originally allocated by the libc allocator.

  memory: hashed statistics for alien threads
  memory: test for using glibc allocator on alien threads
  memory: track statistics of allocations and frees
  memory: cpu_pages::resize must not be called before initialize
  memory: try_foreign_free should be a static function
  memory: fast path for local free
  memory: use system allocator for alien threads
  memory: alien thread shouldn't call cpu_pages dtor
  memory: identify reactor threads by a TLS variable
  memory: search overrided memory functions with dlsym
  memory: enable testing whether address is seastar allocated
Seastar httpd currently does not support some HTTP methods such as
HEAD and OPTIONS, so applications cannot handle them.
Add the unsupported methods (HEAD, OPTIONS, TRACE, CONNECT) and let apps
handle them.

Closes scylladb#820
To handle the 'match all' rule correctly, we need to return a handler
unconditionally when no URL is added to the rule.

Fixes scylladb#821

Closes scylladb#822
Debug mode requires a large amount of stack, since the sanitizer
disables stack slot reuse, inserts padding between stack variables,
and does not inline. This was already recognized in 7d55992
("Allocate more stack when we have both asan and optimizations"),
but only for sanitize mode. It was observed recently on debug mode
too (on aarch64).

To prevent stack overflow, increase stack size in debug mode too.
Message-Id: <[email protected]>
"
This series makes more semaphore-related methods and constructors noexcept.

In particular, the exception factories defined in seastar were made noexcept
and semaphore::broken() was hardened to handle exception from the
ExceptionFactory::broken() method, in case it throws.

Test: unit(dev), semaphore(debug)
"

* tag 'semaphore-noexcept-v3' of github.com:bhalevy/seastar:
  condition_variable: mark functions noexcept
  condition_variable: move exception factory functions out of line
  semaphore: broken: make noexcept
  semaphore: use functor as wait list expiry_handler
  semaphore: make named_semaphore_exception_factory noexcept
  semaphore: semaphore_default_exception_factory: mark noexcept
  semaphore: move exception factories code out of line
  semaphore: mark wait list entry noexcept
  semaphore: mark trivial methods noexcept
malloc_usable_size() checks whether it is running in a reactor
thread in order to see if it needs to dispatch to the libc
malloc_usable_size() or our own. But in one case this fails.

The case is when a static constructor performs a memory allocation.
At this point, the reactor and memory system have not been
initialized yet, so is_reactor_thread == false. The original
libc memory functions have not been looked up yet, since these
operations also happen in a static constructor, which can
be scheduled after the one triggering the problem.

Fix by using is_seastar_memory() instead. This is similar to
what realloc() does, for the same reasons.
Message-Id: <[email protected]>
When distributing memory and assigning IO queues, seastar needs
to know the CPU:NUMA-node mapping. Each time this value is needed,
it is recalculated from hwloc.

It's better to compute this mapping once in advance and use it later.

Having it is crucial for further patches.

Signed-off-by: Pavel Emelyanov <[email protected]>
Sometimes a processing unit (PU in hwloc terms) may be attached
to a NUMA node with no memory on board (e.g. the respective
RAM banks are empty). In this case newer (2.2.0+) hwloc does NOT
report any NODE and seastar hits the respective assertion. Older
hwloc reported the empty node.

To fix this we need to distribute node-less PUs (called
orphans in this patch) among existing nodes, and for this we need
to collect them all when mapping PUs to nodes.

The next patch will do the distribution itself.

Signed-off-by: Pavel Emelyanov <[email protected]>
When a PU is reported by hwloc not to have any local NUMA node,
it is better to assign it to some "random" node than to crash.
To do this assignment, group all the orphan PUs, then
scatter the groups across nodes evenly.

The grouping can be done in many ways; the goal here is to have
an even distribution. On the other hand, it seems worth keeping the
PUs sharing an Lx cache in the same node. So the grouping
logic is: try to group PUs by L3, L2 and L1 caches sequentially,
until we cover all the available nodes.

If even L1 is not enough, scatter PUs one-by-one.

When this situation happens, warn the user in the logs that the hardware
is not configured/assembled in an optimal way.

Signed-off-by: Pavel Emelyanov <[email protected]>
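A rough sketch of the grouping idea (this is not the hwloc-based implementation; the pu struct and cache ids are stand-ins):

```
// Sketch: split orphan PUs into groups sharing an L3, then L2, then L1 cache,
// stopping at the first level that yields at least as many groups as there
// are NUMA nodes; otherwise fall back to one PU per group. The caller then
// scatters the groups across nodes evenly.
#include <cstddef>
#include <map>
#include <vector>

struct pu { unsigned id; unsigned l3, l2, l1; };   // per-PU cache ids (illustrative)

std::vector<std::vector<pu>> group_orphans(const std::vector<pu>& orphans, size_t n_nodes) {
    for (auto cache_id : {&pu::l3, &pu::l2, &pu::l1}) {
        std::map<unsigned, std::vector<pu>> by_cache;
        for (const auto& p : orphans) {
            by_cache[p.*cache_id].push_back(p);
        }
        if (by_cache.size() >= n_nodes) {
            std::vector<std::vector<pu>> groups;
            for (auto& kv : by_cache) {
                groups.push_back(std::move(kv.second));
            }
            return groups;                         // enough groups to cover all nodes
        }
    }
    std::vector<std::vector<pu>> groups;           // even L1 was too coarse:
    for (const auto& p : orphans) {                // scatter PUs one-by-one
        groups.push_back({p});
    }
    return groups;
}
```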
xemul and others added 8 commits November 6, 2020 14:32
Log at debug level so as not to spoil the logs, but still have the
chance to find out what's going on.

Signed-off-by: Pavel Emelyanov <[email protected]>
It may not be nice to assign CPUs to remote NUMA nodes behind the curtains.
The user should have the ability to disable this behavior.

Signed-off-by: Pavel Emelyanov <[email protected]>
"
In hwloc-2.2.0 and newer there's a feature -- empty NUMA nodes are no
longer reported. "Empty" are those with 0 available memory (older hwloc
reported those nodes, though).

If any seastar app is started on hardware that's configured/assembled
to have empty nodes, seastar crashes on assertion "PU not inside any
NUMA node". I have such a box (offtopic -- probably because those nodes
are somehow 1:1 mapped to empty ram banks) and propose to have this
fixed in software :)

more info on this matter: open-mpi/hwloc#430
tests: iotune, scylla unit(dev)
"

* 'br-empty-numa-nodes-fix-2' of https://github.com/xemul/seastar:
  cli: Make orphan CPUs auto-assignment configurable
  resource: Log PUs to NUMA assignments
  resource: Distribute orphan PUs among nodes
  resource: Collect orphan PUs to be assigned later
  resource: Map cpus to nodes in advance
Sometimes one wants to dump a detailed diagnostics report on an error to
help diagnose the underlying problem. Just logging these with >= info
is problematic because these reports can be big and the errors can come
in bursts (or simply happen too often), completely flooding the logs.
Currently the standard workaround for this is to log these with debug
level and have the user set the log level to debug when they are
interested in the diagnostics. This is also problematic: it doesn't
help with diagnosing problems that already happened, reproducing them
might be hard or impossible, and it is prone to log flooding just the
same.

This patch presents a solution to this problem: a flexible way to
rate-limit log messages. The rate-limiter can be shared by multiple
messages, across multiple loggers, and it ensures that in the given
interval at most one of the messages sharing the rate-limiter will be
logged. This might seem overly restrictive, but for the diagnostics-dump
use case it was developed for it is perfectly reasonable, as there will
typically be just one message.

Signed-off-by: Botond Dénes <[email protected]>
Message-Id: <[email protected]>
Fixes scylladb#823

gnutls 3.6/fedora33 does not allow rsa-sha1 certs.
As a quick fix, update the certs to rsa-sha256.

A longer-term solution should generate these certs
at build time (cmake) instead of keeping binary blobs.

Message-Id: <[email protected]>
So it can be controlled by the application.
Can be ON, OFF or DEFAULT (which enables it for Debug and Sanitize)

Signed-off-by: Benny Halevy <[email protected]>
Message-Id: <[email protected]>
Fixes clang 11.0.0 warning:
    ../../include/seastar/core/semaphore.hh:121:17: error: lambda capture 'this' is not used [-Werror,-Wunused-lambda-capture]

Signed-off-by: Benny Halevy <[email protected]>
Message-Id: <[email protected]>
@CLAassistant

CLAassistant commented Nov 12, 2020

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 7 committers have signed the CLA.

❌ nyh
❌ elazarl
❌ bhalevy
❌ syuu1228
❌ avikivity
❌ xemul
❌ denesb

@dotnwat
Member

dotnwat commented Nov 12, 2020

Perhaps we should manually push so it is a fast forward w.r.t. upstream seastar.

@BenPope BenPope closed this Nov 12, 2020
mmaslankaprv pushed a commit to mmaslankaprv/seastar that referenced this pull request Mar 31, 2021
This reverts commit 33406cf. It
introduces memory leaks:

Direct leak of 24 byte(s) in 1 object(s) allocated from:
    #0 0x7fb773b389d7 in operator new(unsigned long) (/lib64/libasan.so.5+0x10f9d7)
    #1 0x108f0d4 in seastar::reactor::poller::~poller() ../src/core/reactor.cc:2879
    #2 0x11c1e59 in std::experimental::fundamentals_v1::_Optional_base<seastar::reactor::poller, true>::~_Optional_base() /usr/include/c++/9/experimental/optional:288
    #3 0x118f2d7 in std::experimental::fundamentals_v1::optional<seastar::reactor::poller>::~optional() /usr/include/c++/9/experimental/optional:491
    #4 0x108c5a5 in seastar::reactor::run() ../src/core/reactor.cc:2587
    #5 0xf1a822 in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) ../src/core/app-template.cc:199
    #6 0xf1885d in seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) ../src/core/app-template.cc:115
    #7 0xeb2735 in operator() ../src/testing/test_runner.cc:72
    #8 0xebb342 in _M_invoke /usr/include/c++/9/bits/std_function.h:300
    #9 0xf3d8b0 in std::function<void ()>::operator()() const /usr/include/c++/9/bits/std_function.h:690
    #10 0x1034c72 in seastar::posix_thread::start_routine(void*) ../src/core/posix.cc:52
    #11 0x7fb7738804e1 in start_thread /usr/src/debug/glibc-2.30-13-g919af705ee/nptl/pthread_create.c:479

Reported-by: Rafael Avila de Espindola <[email protected]>
mmaslankaprv pushed a commit to mmaslankaprv/seastar that referenced this pull request Mar 31, 2021
"
This series enhances seastar backtraces to contain information not only about
current task's call stack, but also information about tasks which are blocked
on the current task.

For example, when a thread is waiting for I/O operation to complete, the
continuation chain will contain the continuation which handles ready I/O, then
a continuation which wakes the thread, then the continuation which waits for
the thread to finish. This gives much more context in the backtrace, similar
to the one we would get in a synchronous programming model.

Presenting only current backtrace of the reactor thread is in many
cases not enough.

This is how extended backtraces will be logged:

INFO  2020-05-06 11:39:32,362 [shard 0] seastar - backtrace:    0x5b7ed9
   0x5b80e2
   0x5b8599
   0x437d76
   0x43e88b
   0x4c9617
   0x4c990d
   0x4f65c5
   0x4a962c
   0x4aa245
   0x49e785
   0x4c288d
   /lib64/libpthread.so.0+0x94e1
   /lib64/libc.so.6+0x1016a2
   --------
   seastar::continuation<seastar::internal::promise_base_with_type<>, seastar::future<>::then_impl_nrvo<func4()::{lambda()#1}, seastar::future<> >(func4()::{lambda()#1}&&)::{lambda()#1}::operator()() const::{lambda(seastar::internal::promise_base_with_type<>&, seastar::future_state<>&&)#1}>
   --------
   seastar::future<>::thread_wake_task

The --- separator separates backtraces of different tasks in the chain.

A side benefit of the series is that all backtraces will now have some
meaningful information in them (current continuation name) without the need to resolve addresses.

Another use case is heap profiles. Currently measuring memory
allocations for one large operation is hard if it starts many
allocating continuations. Tasks for those continuations will
be spread in the profile and not correlated with the grand
operation. Backtracing across tasks solves that because different
paths are joined by a common base.

The task class is extended with a virtual method:

   virtual task* waiting_task() noexcept = 0;

which allows one to walk the continuation chain. It returns the next task
blocked on the current one.

When backtrace is collected, for each task its type_info is pushed,
which allows us to obtain the name of the continuation.

Optionally, when the build is configured with --enable-task-backtrace, full backtrace
is attached to continuation tasks when they're created (at deferring points)
so that we get more context.

No regression shown by perf_simple_query (@ 120k tps).
"

* tag 'backtracing-across-tasks-v5.1' of github.com:tgrabiec/seastar:
  util/backtrace: Cache hash of the backtrace
  core/task, util/backtrace: Allow capturing backtrace at preemption points
  core/memory: Move disable_backtrace_temporarily declaration to the header
  tests: future: Add backtracing test
  addr2line: Ignore separator lines
  core/task, util/backtrace: Collect backtraces across continuation chains
  util/backtrace: Extract operator<<(std::ostream&, frame&)
  core/make_task: Implement lambda_task::waiting_task()
  core/task: Store promise inside lambda_task
  core/task: Extrack make_task() to a separate header
  core/reactor: Expose a pointer to currently running task
  core/future: Add ability to walk continuation chains
  util/backtrace: Capture current scheduling group in the backtrace
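For illustration, a sketch of how a collector could walk the chain exposed by the new task::waiting_task() hook (the helper function is illustrative; task and waiting_task() are the interfaces described above):

```
// Sketch: walk from the currently running task towards the tasks blocked on
// it, recording each continuation's type_info so its name can be printed
// without resolving addresses.
#include <typeinfo>
#include <vector>
#include <seastar/core/task.hh>

std::vector<const std::type_info*> collect_task_chain(seastar::task* current) {
    std::vector<const std::type_info*> chain;
    for (seastar::task* t = current; t != nullptr; t = t->waiting_task()) {
        chain.push_back(&typeid(*t));
    }
    return chain;
}
```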
mmaslankaprv pushed a commit to mmaslankaprv/seastar that referenced this pull request Mar 31, 2021
The iotune tool measures disk throughput and IOPS by doing four
sequential measurements:

1. sequentially writes into a big file
2. sequentially reads from the same file
3. randomly writes into this file again
4. randomly reads from, you know, the File

It's important that measurement #1 comes first. On start
the test file is created and truncated to its size and this
first measurement fills it with data which is then read by steps
2 and 4. Respectively, after the 1st measurement the size of
the file should be updated to reflect the real amount of data
written into it.

The latter is done by taking the number of bytes written into
file. But in reality the first test may wrap around the initial
file size and re-write some data into it. After this the file
size can be seen bigger than it actually is, even times bigger.

Subsequently, the next tests will go and read from/write to
random holes in this area. For reading tests this becomes quite
problematic as the kernel will not submit real IO requests for
reads from missing (due to holes) blocks. As a result, the shown
bandwidth and IOPS will be some average value of disk IOPS and
kernel "reads-from-holes-per-second".

Fix this by getting the maximum position at which the first test
writes and limiting the next tests with this value, instead of
the amount of (over-)written bytes.

Signed-off-by: Pavel Emelyanov <[email protected]>
Message-Id: <[email protected]>
dotnwat pushed a commit that referenced this pull request May 12, 2021
…o_with

Fixes failures in debug mode:
```
$ build/debug/tests/unit/closeable_test -l all -t deferred_close_test
WARNING: debug mode. Not for benchmarking or production
random-seed=3064133628
Running 1 test case...
Entering test module "../../tests/unit/closeable_test.cc"
../../tests/unit/closeable_test.cc(0): Entering test case "deferred_close_test"
../../src/testing/seastar_test.cc(43): info: check true has passed
==9449==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
terminate called after throwing an instance of 'seastar::broken_promise'
  what():  broken promise
==9449==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fbf1f49f000; bottom 0x7fbf40971000; size: 0xffffffffdeb2e000 (-558702592)
False positive error reports may follow
For details see google/sanitizers#189
=================================================================
==9449==AddressSanitizer CHECK failed: ../../../../libsanitizer/asan/asan_thread.cpp:356 "((ptr[0] == kCurrentStackFrameMagic)) != (0)" (0x0, 0x0)
    #0 0x7fbf45f39d0b  (/lib64/libasan.so.6+0xb3d0b)
    #1 0x7fbf45f57d4e  (/lib64/libasan.so.6+0xd1d4e)
    #2 0x7fbf45f3e724  (/lib64/libasan.so.6+0xb8724)
    #3 0x7fbf45eb3e5b  (/lib64/libasan.so.6+0x2de5b)
    #4 0x7fbf45eb51e8  (/lib64/libasan.so.6+0x2f1e8)
    #5 0x7fbf45eb7694  (/lib64/libasan.so.6+0x31694)
    #6 0x7fbf45f39398  (/lib64/libasan.so.6+0xb3398)
    #7 0x7fbf45f3a00b in __asan_report_load8 (/lib64/libasan.so.6+0xb400b)
    #8 0xfe6d52 in bool __gnu_cxx::operator!=<dl_phdr_info*, std::vector<dl_phdr_info, std::allocator<dl_phdr_info> > >(__gnu_cxx::__normal_iterator<dl_phdr_info*, std::vector<dl_phdr_info, std::allocator<dl_phdr_info> > > const&, __gnu_cxx::__normal_iterator<dl_phdr_info*, std::vector<dl_phdr_info, std::allocator<dl_phdr_info> > > const&) /usr/include/c++/10/bits/stl_iterator.h:1116
    #9 0xfe615c in dl_iterate_phdr ../../src/core/exception_hacks.cc:121
    #10 0x7fbf44bd1810 in _Unwind_Find_FDE (/lib64/libgcc_s.so.1+0x13810)
    #11 0x7fbf44bcd897  (/lib64/libgcc_s.so.1+0xf897)
    #12 0x7fbf44bcea5f  (/lib64/libgcc_s.so.1+0x10a5f)
    #13 0x7fbf44bcefd8 in _Unwind_RaiseException (/lib64/libgcc_s.so.1+0x10fd8)
    #14 0xfe6281 in _Unwind_RaiseException ../../src/core/exception_hacks.cc:148
    #15 0x7fbf457364bb in __cxa_throw (/lib64/libstdc++.so.6+0xaa4bb)
    #16 0x7fbf45e10a21  (/lib64/libboost_unit_test_framework.so.1.73.0+0x1aa21)
    #17 0x7fbf45e20fe0 in boost::execution_monitor::execute(boost::function<int ()> const&) (/lib64/libboost_unit_test_framework.so.1.73.0+0x2afe0)
    #18 0x7fbf45e21094 in boost::execution_monitor::vexecute(boost::function<void ()> const&) (/lib64/libboost_unit_test_framework.so.1.73.0+0x2b094)
    #19 0x7fbf45e43921 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::function<void ()> const&, unsigned long) (/lib64/libboost_unit_test_framework.so.1.73.0+0x4d921)
    #20 0x7fbf45e5eae1  (/lib64/libboost_unit_test_framework.so.1.73.0+0x68ae1)
    #21 0x7fbf45e5ed31  (/lib64/libboost_unit_test_framework.so.1.73.0+0x68d31)
    #22 0x7fbf45e2e547 in boost::unit_test::framework::run(unsigned long, bool) (/lib64/libboost_unit_test_framework.so.1.73.0+0x38547)
    #23 0x7fbf45e43618 in boost::unit_test::unit_test_main(bool (*)(), int, char**) (/lib64/libboost_unit_test_framework.so.1.73.0+0x4d618)
    #24 0x44798d in seastar::testing::entry_point(int, char**) ../../src/testing/entry_point.cc:77
    #25 0x4134b5 in main ../../include/seastar/testing/seastar_test.hh:65
    #26 0x7fbf44a1b1e1 in __libc_start_main (/lib64/libc.so.6+0x281e1)
    #27 0x4133dd in _start (/home/bhalevy/dev/seastar/build/debug/tests/unit/closeable_test+0x4133dd)
```

Signed-off-by: Benny Halevy <[email protected]>
Message-Id: <[email protected]>
dotnwat pushed a commit that referenced this pull request Aug 9, 2021
This is step #1 in moving all class-manipulation APIs from
io_queue to io_priority_class.

Signed-off-by: Pavel Emelyanov <[email protected]>
dotnwat pushed a commit that referenced this pull request Aug 9, 2021
The io_queue_topology describes "numerical" topology -- the
numbers of queues and groups and their mappings to each other.
Later this info is materialized into device_io_topology by
creating vectors of queues and groups.

There's no real need for this split. The io_queue_topology can
just create the needed vectors with no objects in them, so
that queues and groups are later put into _it_ instead of into
the device_io_topology.

This is step #1 -- move vector of queues into io_queue_topology.

Signed-off-by: Pavel Emelyanov <[email protected]>
dotnwat pushed a commit that referenced this pull request Aug 9, 2021
This is a tiny leak fix -- if queue initialization fails in the
middle, some allocated but not yet assigned queues will not be
deleted.

(side note: this patch is the beneficiary of the explicit outline
constructors for io_queue_topology from patch #1 of this set)

Signed-off-by: Pavel Emelyanov <[email protected]>
travisdowns pushed a commit to travisdowns/seastar that referenced this pull request Apr 11, 2022
… enqueue_op throws

If enqueue_op fails to allocate, op_func ends up setting `fut`
error to broken promise and we get the following warning:
```
WARN  2021-09-20 11:02:33,437 [shard 0] seastar - Exceptional future ignored: seastar::broken_promise (broken promise), backtrace: 0x57a609 0x57a932 0x57ace9 0x488b57 0x488c7c 0x44a970 0x41418e 0x4645e9 0x47f99a 0x484dfb 0x485acc 0x44f6ef 0x44fcfb 0x45342d 0x4464f8 0x42a33c
```

Decoded:
```
seastar::append_challenged_posix_file_impl::write_dma(unsigned long, void const*, unsigned long, seastar::io_priority_class const&, seastar::io_intent*) [clone .cold] at /home/bhalevy/dev/seastar/build/release/../../src/core/file-impl.hh:258
 (inlined by) seastar::append_challenged_posix_file_impl::write_dma(unsigned long, void const*, unsigned long, seastar::io_priority_class const&, seastar::io_intent*) at /home/bhalevy/dev/seastar/build/release/../../src/core/file.cc:811
seastar::file::dma_write_impl(unsigned long, unsigned char const*, unsigned long, seastar::io_priority_class const&, seastar::io_intent*) at /home/bhalevy/dev/seastar/build/release/../../src/core/file.cc:1182
seastar::future<unsigned long> seastar::file::dma_write<char>(unsigned long, char const*, unsigned long, seastar::io_priority_class const&, seastar::io_intent*) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/file.hh:351
 (inlined by) seastar::file_data_sink_impl::do_put(unsigned long, seastar::temporary_buffer<char>) at /home/bhalevy/dev/seastar/build/release/../../src/core/fstream.cc:431
seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}::operator()() at /home/bhalevy/dev/seastar/build/release/../../src/core/fstream.cc:387
seastar::future<void> seastar::futurize<seastar::future<void> >::invoke<seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}>(seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}&&) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/future.hh:2135
 (inlined by) seastar::future<void> seastar::futurize<seastar::future<void> >::invoke<seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}>(seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}&&, seastar::internal::monostate) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/future.hh:1979
 (inlined by) seastar::future<void> seastar::future<void>::then_impl<seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}, seastar::future<void> >(seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}&&) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/future.hh:1601
 (inlined by) seastar::internal::future_result<seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}, void>::future_type seastar::internal::call_then_impl<seastar::future<void> >::run<seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}>(seastar::future<void>&, seastar::internal::future_result&&) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/future.hh:1234
 (inlined by) seastar::future<void> seastar::future<void>::then<seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}, seastar::future<void> >(seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>)::{lambda()#1}&&) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/future.hh:1520
 (inlined by) seastar::file_data_sink_impl::put(seastar::temporary_buffer<char>) at /home/bhalevy/dev/seastar/build/release/../../src/core/fstream.cc:406
seastar::data_sink::put(seastar::temporary_buffer<char>) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/iostream.hh:142
 (inlined by) seastar::output_stream<char>::put(seastar::temporary_buffer<char>) at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/iostream-impl.hh:454
seastar::output_stream<char>::flush() at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/iostream-impl.hh:420
seastar::output_stream<char>::close() at /home/bhalevy/dev/seastar/build/release/../../include/seastar/core/iostream-impl.hh:512
```

To prevent this, catch the error around enqueue_op and handle
it by fut.ignore_ready_future().

Signed-off-by: Benny Halevy <[email protected]>
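A sketch of the pattern described above (enqueue_op and the future come from append_challenged_posix_file_impl; this stand-alone helper is illustrative):

```
// Sketch: if arming the operation throws, the future tied to its promise
// becomes a broken promise; consume it so the "Exceptional future ignored"
// warning is not emitted, and report the failure explicitly instead.
#include <seastar/core/future.hh>

template <typename Op, typename Enqueue>
seastar::future<> submit_guarded(Op op, Enqueue enqueue_op) noexcept {
    auto fut = op.get_future();          // future tied to the op's promise
    try {
        enqueue_op(std::move(op));       // may fail to allocate and throw
    } catch (...) {
        fut.ignore_ready_future();       // swallow the broken-promise failure
        return seastar::current_exception_as_future<>();
    }
    return fut;
}
```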
BenPope pushed a commit that referenced this pull request Sep 14, 2022
Some may contain space characters around the '+' operator, e.g.:
```
    #1  0x00007fd2dab4f950 abort (libc.so.6 + 0x26950)
```

Fixes scylladb#1206

Signed-off-by: Benny Halevy <[email protected]>
BenPope pushed a commit that referenced this pull request Sep 14, 2022
…es' from Benny Halevy

Some may contain space characters around the '+' operator, e.g.:
```
    #1  0x00007fd2dab4f950 abort (libc.so.6 + 0x26950)
```

Fixes scylladb#1206

Signed-off-by: Benny Halevy <[email protected]>

Closes scylladb#1207

* https://github.com/scylladb/seastar:
  seastar-addr2line: strip input lines
  seastar-addr2line: support more flexible syslog-style backtraces
travisdowns pushed a commit to travisdowns/seastar that referenced this pull request Jan 10, 2023
When we enable the sanitizer, we get the following error while running
iotune:

==86505==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 4096 byte(s) in 1 object(s) allocated from:
    #0 0x5701b8 in aligned_alloc (/home/syuu/seastar.2/build/sanitize/apps/iotune/iotune+0x5701b8) (BuildId: 411f9852d64ed8982d5b33d02489b5932d92b8b7)
    #1 0x6d0813 in seastar::filesystem_has_good_aio_support(seastar::basic_sstring<char, unsigned int, 15u, true>, bool) /home/syuu/seastar.2/src/core/fsqual.cc:74:16
    #2 0x5bcd0d in main::$_0::operator()() const::'lambda'()::operator()() const /home/syuu/seastar.2/apps/iotune/iotune.cc:742:21
    #3 0x5bb1f1 in seastar::future<int> seastar::futurize<int>::apply<main::$_0::operator()() const::'lambda'()>(main::$_0::operator()() const::'lambda'()&&, std::tuple<>&&) /home/syuu/seastar.2/include/seastar/core/future.hh:2118:28
    #4 0x5bb039 in seastar::futurize<std::invoke_result<main::$_0::operator()() const::'lambda'()>::type>::type seastar::async<main::$_0::operator()() const::'lambda'()>(seastar::thread_attributes, main::$_0::operator()() const::'lambda'()&&)::'lambda'()::operator()() const /home/syuu/seastar.2/include/seastar/core/thread.hh:258:13
    #5 0x5bb039 in seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::invoke_result<main::$_0::operator()() const::'lambda'()>::type>::type seastar::async<main::$_0::operator()() const::'lambda'()>(seastar::thread_attributes, main::$_0::operator()() const::'lambda'()&&)::'lambda'()>::call(seastar::noncopyable_function<void ()> const*) /home/syuu/seastar.2/include/seastar/util/noncopyable_function.hh:124:20
    #6 0x8e0a77 in seastar::thread_context::main() /home/syuu/seastar.2/src/core/thread.cc:299:9
    #7 0x7f30ff8547bf  (/lib64/libc.so.6+0x547bf) (BuildId: 85c438f4ff93e21675ff174371c9c583dca00b2c)

SUMMARY: AddressSanitizer: 4096 byte(s) leaked in 1 allocation(s).

This is because we don't free the buffer allocated in
filesystem_has_good_aio_support(); we should free it to avoid this error.

This is also needed to test the Scylla machine image with a debug-mode
binary, since it tries to run iotune with the sanitizer and fails.

Closes scylladb#1284
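A minimal stand-in for the class of fix (the actual function is seastar::filesystem_has_good_aio_support in src/core/fsqual.cc; this sketch only shows the ownership pattern):

```
// Sketch: the probe buffer obtained from aligned_alloc must be freed on every
// exit path, otherwise LeakSanitizer reports it when iotune exits.
#include <cstdlib>

bool probe_with_aligned_buffer() {
    void* buf = aligned_alloc(4096, 4096);   // buffer used for the AIO probe
    if (!buf) {
        return false;
    }
    bool ok = true;                           // ... issue the probe I/O using buf ...
    free(buf);                                // the fix: release the probe buffer
    return ok;
}
```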
travisdowns pushed a commit to travisdowns/seastar that referenced this pull request Jan 10, 2023
this change is a follow-up of 338ba97.

before this change, ${CMAKE_CURRENT_BINARY_DIR} is used for
Seastar_BINARY_DIR. if Seastar is a top-level project, the values
of ${CMAKE_CURRENT_BINARY_DIR} and ${CMAKE_BINARY_DIR} are
identical. but if Seastar is embedded in a parent project,
${CMAKE_CURRENT_BINARY_DIR} would be somewhere like "build/seastar" where
"build" is the build directory of the parent project.
but we are still referencing the build directory with
${Seastar_BINARY_DIR} and issuing commands like

cmake --build ${Seastar_BINARY_DIR} --target ${target}

this would fail as the build directory is not ${Seastar_BINARY_DIR}
anymore. if the cmake generator is make, the failure would look like:

> gmake: *** No rule to make target 'test_unit_abort_source_run'.  Stop.

if the cmake generator is ninja, the failure would look like:

> 1/95 Test  #1: Seastar.unit.abort_source .....................***Failed    0.02 sec
> ninja: error: loading 'build.ninja': No such file or directory

after this change, all occurrences of

cmake --build ${Seastar_BINARY_DIR}

are changed to

cmake --build ${CMAKE_BINARY_DIR}

this ensures that these commands can find the build directory
at the top level of the build tree.

Signed-off-by: Kefu Chai <[email protected]>
nvartolomei pushed a commit to nvartolomei/rp-seastar that referenced this pull request Jul 9, 2024
in main(), we create an instance of `http_server_control` using
new, but we never destroy it. this is identified by ASan

```
==2190125==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 8 byte(s) in 1 object(s) allocated from:
    #0 0x55e21cf487bd in operator new(unsigned long) /home/kefu/dev/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:86:3
    #1 0x55e21cf6cf31 in main::$_0::operator()() const::'lambda'()::operator()() const /home/kefu/dev/seastar/apps/httpd/main.cc:121:27
    #2 0x55e21cf6b4cc in int std::__invoke_impl<int, main::$_0::operator()() const::'lambda'()>(std::__invoke_other, main::$_0::operator()() const::'lambda'()&&) /usr/lib/gcc/x86_64-redhat-linux/14/../../../../incl
ude/c++/14/bits/invoke.h:61:14
    #3 0x55e21cf6b46c in std::__invoke_result<main::$_0::operator()() const::'lambda'()>::type std::__invoke<main::$_0::operator()() const::'lambda'()>(main::$_0::operator()() const::'lambda'()&&) /usr/lib/gcc/x86_
64-redhat-linux/14/../../../../include/c++/14/bits/invoke.h:96:14
    #4 0x55e21cf6b410 in decltype(auto) std::__apply_impl<main::$_0::operator()() const::'lambda'(), std::tuple<>>(main::$_0::operator()() const::'lambda'()&&, std::tuple<>&&, std::integer_sequence<unsigned long, .
..>) /usr/lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/tuple:2921:14
    #5 0x55e21cf6b3b2 in decltype(auto) std::apply<main::$_0::operator()() const::'lambda'(), std::tuple<>>(main::$_0::operator()() const::'lambda'()&&, std::tuple<>&&) /usr/lib/gcc/x86_64-redhat-linux/14/../../../
../include/c++/14/tuple:2936:14
    #6 0x55e21cf6b283 in seastar::future<int> seastar::futurize<int>::apply<main::$_0::operator()() const::'lambda'()>(main::$_0::operator()() const::'lambda'()&&, std::tuple<>&&) /home/kefu/dev/seastar/include/sea
star/core/future.hh:2005:28
    #7 0x55e21cf6b043 in seastar::futurize<std::invoke_result<main::$_0::operator()() const::'lambda'()>::type>::type seastar::async<main::$_0::operator()() const::'lambda'()>(seastar::thread_attributes, main::$_0:
:operator()() const::'lambda'()&&)::'lambda'()::operator()() const /home/kefu/dev/seastar/include/seastar/core/thread.hh:260:13
    #8 0x55e21cf6ae74 in seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::invoke_result<main::$_0::operator()() const::'lambda'()>::type>::type seastar::async<main::$_0::operator()()
 const::'lambda'()>(seastar::thread_attributes, main::$_0::operator()() const::'lambda'()&&)::'lambda'()>::call(seastar::noncopyable_function<void ()> const*) /home/kefu/dev/seastar/include/seastar/util/noncopyable
_function.hh:129:20
    #9 0x7f5d757a0fb3 in seastar::noncopyable_function<void ()>::operator()() const /home/kefu/dev/seastar/include/seastar/util/noncopyable_function.hh:215:16
    #10 0x7f5d75ef5611 in seastar::thread_context::main() /home/kefu/dev/seastar/src/core/thread.cc:311:9
    #11 0x7f5d75ef50eb in seastar::thread_context::s_main(int, int) /home/kefu/dev/seastar/src/core/thread.cc:287:43
    #12 0x7f5d72f8a18f  (/lib64/libc.so.6+0x5a18f) (BuildId: b098f1c75a76548bb230d8f551eae07a2aeccf06)
```

so, in this change, let's hold it using a smart pointer, so we
can destroy it when it leaves the lexical scope.

Signed-off-by: Kefu Chai <[email protected]>

Closes scylladb#2224
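A sketch of the ownership change (httpd::http_server_control is seastar's type; the surrounding structure is simplified from apps/httpd/main.cc):

```
// Sketch: hold the server control block in a unique_ptr so it is destroyed
// when it leaves scope, instead of leaking a bare `new` allocation.
#include <memory>
#include <seastar/http/httpd.hh>

void run_server_sketch() {
    // before: auto* server = new seastar::httpd::http_server_control();   // never deleted
    auto server = std::make_unique<seastar::httpd::http_server_control>();
    // ... server->start(), set_routes(), listen(), stop() ...
}   // server destroyed here; ASan no longer reports a leak
```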