MPI Standalone Implementation - First Bits #46

Merged Jan 21, 2021 (27 commits; changes shown from 17 commits)

Commits
7a0a305
adding helloworld example code
csegarragonz Dec 15, 2020
cd2c581
skeleton of native mpi compilation
csegarragonz Dec 17, 2020
a4f6867
adding a task to run mpi-specific applications
csegarragonz Dec 17, 2020
feb7e8e
adding executor script
csegarragonz Dec 18, 2020
6e97ea2
adding new faabric mpi main macro
csegarragonz Jan 4, 2021
0d17cd6
adding scale-up and scale-down options through docker-compose and exp…
csegarragonz Jan 8, 2021
732afe4
adding native mpi_executor in lieu of the overly complex macro
csegarragonz Jan 11, 2021
5ae234d
adding a different target for the native mpi build, and renamed for d…
csegarragonz Jan 11, 2021
0e3bd08
updates to the executor to support being linked as a dynamic library
csegarragonz Jan 13, 2021
b70c9d8
added option to run endpoint thread in background
csegarragonz Jan 13, 2021
37919c0
helloworld working fine with one process
csegarragonz Jan 13, 2021
b2e2b71
helloworld working with >1 processes
csegarragonz Jan 14, 2021
c22aa51
finished automation for deployment and cleanup of MPI tasks
csegarragonz Jan 15, 2021
6f965c9
updating the docs
csegarragonz Jan 15, 2021
7534a0d
fixes
csegarragonz Jan 15, 2021
448c1f3
gracefully exit
csegarragonz Jan 15, 2021
38cd510
adding build in test file
csegarragonz Jan 15, 2021
8622705
fixes
csegarragonz Jan 15, 2021
a7aa748
adding checks in dev commands
csegarragonz Jan 15, 2021
6f0afcd
adding link directory so cmake can find pistache
csegarragonz Jan 15, 2021
1f6e8a3
changes suggested in pr comments
csegarragonz Jan 18, 2021
4efc512
switched deployment method to use custom docker compose file and dock…
csegarragonz Jan 18, 2021
35f3447
reverting changes in endpoint, and update the docs
csegarragonz Jan 19, 2021
2bd711f
moving to top-level directory to encapsulate things
csegarragonz Jan 19, 2021
1ecd8f4
modified cmake and added new check in actions
csegarragonz Jan 20, 2021
ca25dad
building container each new release
csegarragonz Jan 20, 2021
027af25
tiny typo
csegarragonz Jan 21, 2021
4 changes: 4 additions & 0 deletions .github/workflows/tests.yml
@@ -105,6 +105,10 @@ jobs:
run: inv dev.cc faabric --shared
- name: "Install Faabric shared library"
run: inv dev.install faabric --shared
- name: "Build MPI native library"
run: inv dev.cc faabricmpi_native --shared
- name: "Install MPI native library"
run: inv dev.install faabricmpi_native --shared
- name: "Build examples"
run: inv examples
- name: "Run example to check"
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -56,6 +56,7 @@ endfunction()
add_subdirectory(src/endpoint)
add_subdirectory(src/executor)
add_subdirectory(src/mpi)
add_subdirectory(src/mpi-native)
add_subdirectory(src/proto)
add_subdirectory(src/redis)
add_subdirectory(src/scheduler)
6 changes: 6 additions & 0 deletions README.md
@@ -110,3 +110,9 @@ inv container.push
inv container.build --push
```

## Additional documentation

More detail on some key features and implementations can be found below:

- [Native MPI builds](docs/native_mpi.md) - run native applications against
Faabric's MPI library.
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -9,6 +9,7 @@ services:
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- /usr/bin/docker:/usr/bin/docker
- /usr/local/bin/docker-compose:/usr/bin/docker-compose
- .:/code/faabric
- ./build:/build/faabric
working_dir: /code/faabric
51 changes: 51 additions & 0 deletions docs/native_mpi.md
@@ -0,0 +1,51 @@
# Native MPI execution in Faabric

Faabric supports linking MPI binaries against our custom MPI implementation
used in [Faasm](https://github.com/faasm/faasm). This way, you can test the
compliance of your MPI application with our API (a subset of the standard)
without the burden of cross-compiling to WebAssembly.

To run native MPI applications you need to compile the dynamic library and
slightly modify the original source code.

## Compiling the library

Compilation should be smooth if you are running our recommended containerised
[development environment](../README.md). You may access the container by
running `./bin/cli.sh`.

Then, to compile the library:
```bash
inv dev.cmake --shared
inv dev.cc faabricmpi_native --shared
inv dev.install faabricmpi_native --shared
```

## Adapting the source code
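This section has no body in the commits shown, but `examples/mpi_helloworld.cpp` later in this PR suggests the pattern: move the MPI program body out of `main()` and into `faabric::executor::mpiFunc()`, then have `main()` pre-load a message and start a `SingletonPool`. The sketch below is faabric-free and hypothetical — `startPool()` stands in for the real pool/executor dispatch declared in `faabric/mpi-native/MpiExecutor.h`:

```cpp
#include <cstdio>

namespace executor {

// After adaptation, the body that used to live in main() moves here.
// The real declaration is faabric::executor::mpiFunc (a weak symbol).
int mpiFunc()
{
    // The real example calls MPI_Init, MPI_Comm_rank, MPI_Comm_size, etc.
    printf("Hello world from Faabric MPI Main!\n");
    return 0;
}

// Hypothetical stand-in for SingletonPool::startPool: in faabric, the
// pool's MpiExecutor ends up invoking the user-supplied mpiFunc()
int startPool()
{
    return mpiFunc();
}

} // namespace executor
```

A real `main()` would additionally set the MPI world size on a pre-loaded `faabric::Message` and call the scheduler, as `mpi_helloworld.cpp` does.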

## Running the binary

To run an example, run this command _outside_ the container:
```bash
# The --clean flag re-creates _all_ containers
inv mpi.execute mpi_helloworld --clean --np 5
```

To tear down the MPI cluster and restore the development one:
```bash
inv mpi.clean
```

Using the `--force` flag will recreate _all_ containers, thereby terminating
any sessions you may have open:
```bash
inv mpi.clean --force
```

## Debugging

If at some point the cluster reaches an unstable state, stop it completely
using:
```bash
docker-compose down
```
2 changes: 2 additions & 0 deletions examples/CMakeLists.txt
@@ -17,6 +17,7 @@ function(add_example example_name)
target_link_libraries(${example_name}
faabric
faabricmpi
faabricmpi_native
protobuf
pthread
pistache
@@ -31,6 +32,7 @@ function(add_example example_name)
endfunction()

add_example(check)
add_example(mpi_helloworld)
add_example(server)

add_custom_target(all_examples DEPENDS ${ALL_EXAMPLES})
62 changes: 62 additions & 0 deletions examples/mpi_helloworld.cpp
@@ -0,0 +1,62 @@
#include <faabric/util/logging.h>

#include <faabric/mpi-native/MpiExecutor.h>
#include <faabric/mpi/mpi.h>

int main(int argc, char** argv)
{
auto logger = faabric::util::getLogger();
auto& scheduler = faabric::scheduler::getScheduler();
auto& conf = faabric::util::getSystemConfig();

// Global configuration
conf.maxNodes = 1;
conf.maxNodesPerFunction = 1;

bool __isRoot;
int __worldSize;
if (argc < 2) {
logger->debug("Non-root process started");
__isRoot = false;
} else if (argc < 3) {
logger->error("Root process started without specifying world size!");
return 1;
} else {
logger->debug("Root process started");
__worldSize = std::stoi(argv[2]);
__isRoot = true;
logger->debug("MPI World Size: {}", __worldSize);
}

// Pre-load message to bootstrap execution
if (__isRoot) {
faabric::Message msg = faabric::util::messageFactory("mpi", "exec");
msg.set_mpiworldsize(__worldSize);
scheduler.callFunction(msg);
}

{
faabric::executor::SingletonPool p;
p.startPool(false);
}

return 0;
}

int faabric::executor::mpiFunc()
{
auto logger = faabric::util::getLogger();
logger->info("Hello world from Faabric MPI Main!");

MPI_Init(NULL, NULL);

int rank, worldSize;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &worldSize);

logger->info("Hello faabric from process {} of {}", rank + 1, worldSize);

MPI_Finalize();

return 0;
}
12 changes: 11 additions & 1 deletion include/faabric/endpoint/Endpoint.h
@@ -5,6 +5,9 @@
#include <pistache/endpoint.h>
#include <pistache/http.h>

#include <memory>
#include <thread>

namespace faabric::endpoint {
class Endpoint
{
@@ -13,12 +16,19 @@ class Endpoint

Endpoint(int port, int threadCount);

void start();
void start(bool background = true);

void doStart();

void stop();

virtual std::shared_ptr<Pistache::Http::Handler> getHandler() = 0;

private:
bool isBackground;
int port = faabric::util::getSystemConfig().endpointPort;
int threadCount = faabric::util::getSystemConfig().endpointNumThreads;
std::thread servingThread;
std::unique_ptr<Pistache::Http::Endpoint> httpEndpoint;
Review discussion on this change:

Reviewer (Collaborator): Why do we need the modifications to the endpoint here? I would have thought we'd run the endpoint in the foreground or not run it at all. Is it needed for this native MPI stuff?

csegarragonz (Author, Jan 18, 2021): The endpoint is currently used to bootstrap execution. That is, before instantiating the SingletonPool object that starts the thread pool, state server, and HTTP endpoint, we make a call to the scheduler in examples/mpi_helloworld.cpp:35; this in turn loads the function defined in faabric::executor::mpiFunc, and at that point we don't need the endpoint anymore. If running the HTTP endpoint in the background is something we definitely won't need down the road, then these changes may add complexity for a single call, but as previously mentioned they are needed for initialisation. Alternatively, we could come up with a different bootstrapping scheme offline (one that does not rely on the handling of HTTP requests).

Shillaker (Collaborator, Jan 18, 2021): I can't see where the call to the endpoint is actually made. I can see that the scheduler is called, but I don't think the endpoint needs to be running for the scheduler to work. IIRC the endpoint is basically just an HTTP wrapper around the scheduler that doesn't do much else. What does the endpoint add in this case, and where in mpi_helloworld.cpp is the endpoint actually used?

csegarragonz (Author): You are completely right about this. I misunderstood some of the endpoint functionality; I really don't need it at all. I will revert to the implementation in master.

};
}
2 changes: 2 additions & 0 deletions include/faabric/executor/FaabricExecutor.h
@@ -40,6 +40,8 @@ class FaabricExecutor
bool success,
const std::string& errorMsg);

virtual void postFinishCall();

virtual void postFinish();

bool _isBound = false;
2 changes: 1 addition & 1 deletion include/faabric/executor/FaabricPool.h
@@ -14,7 +14,7 @@ class FaabricPool

void startFunctionCallServer();

void startThreadPool();
void startThreadPool(bool background = true);

void startStateServer();

44 changes: 44 additions & 0 deletions include/faabric/mpi-native/MpiExecutor.h
@@ -0,0 +1,44 @@
#pragma once

#include <faabric/endpoint/FaabricEndpoint.h>
#include <faabric/executor/FaabricExecutor.h>
#include <faabric/executor/FaabricPool.h>
#include <faabric/scheduler/Scheduler.h>
#include <faabric/util/logging.h>

namespace faabric::executor {
class MpiExecutor final : public faabric::executor::FaabricExecutor
{
public:
explicit MpiExecutor();

bool doExecute(faabric::Message& msg) override;

void postFinishCall() override;

void postFinish() override;
};

class SingletonPool : public faabric::executor::FaabricPool
{
public:
SingletonPool();

~SingletonPool();

void startPool(bool background);

protected:
std::unique_ptr<FaabricExecutor> createExecutor(int threadIdx) override
{
return std::make_unique<MpiExecutor>();
}

private:
faabric::endpoint::FaabricEndpoint endpoint;
faabric::scheduler::Scheduler& scheduler;
};

extern faabric::Message* executingCall;
extern int __attribute__((weak)) mpiFunc();
}
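The header above declares `extern int __attribute__((weak)) mpiFunc();`. A small self-contained demonstration of why a weak declaration is useful: a weakly-declared function with no definition anywhere still links, and its address resolves to null, so library code can detect whether the application provided an implementation. `userMpiFunc` below is a hypothetical name with deliberately no definition (GCC/Clang on ELF platforms assumed):

```cpp
#include <cstdio>

// Weak declaration, and intentionally no definition in any object file
extern "C" int __attribute__((weak)) userMpiFunc();

// True only if some translation unit supplied a definition
bool hookProvided()
{
    return userMpiFunc != nullptr;
}

int runHookOrDefault()
{
    if (hookProvided()) {
        return userMpiFunc();
    }
    printf("no user hook linked, running default\n");
    return -1;
}
```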
64 changes: 47 additions & 17 deletions src/endpoint/Endpoint.cpp
@@ -11,11 +11,23 @@ Endpoint::Endpoint(int portIn, int threadCountIn)
, threadCount(threadCountIn)
{}

void Endpoint::start()
void Endpoint::start(bool background)
{
const std::shared_ptr<spdlog::logger>& logger = faabric::util::getLogger();
auto logger = faabric::util::getLogger();
this->isBackground = background;

logger->info("Starting HTTP endpoint");
if (background) {
logger->debug("Starting HTTP endpoint in background thread");
servingThread = std::thread([this] { doStart(); });
} else {
logger->debug("Starting HTTP endpoint in this thread");
doStart();
}
}

void Endpoint::doStart()
{
auto logger = faabric::util::getLogger();

// Set up signal handler
sigset_t signals;
@@ -35,23 +47,41 @@ void Endpoint::start()
.backlog(256)
.flags(Pistache::Tcp::Options::ReuseAddr);

Pistache::Http::Endpoint httpEndpoint(addr);
httpEndpoint.init(opts);
httpEndpoint = std::make_unique<Pistache::Http::Endpoint>(addr);
httpEndpoint->init(opts);

// Configure and start endpoint
httpEndpoint.setHandler(this->getHandler());
httpEndpoint.serveThreaded();

// Wait for a signal
logger->info("Awaiting signal");
int signal = 0;
int status = sigwait(&signals, &signal);
if (status == 0) {
logger->info("Received signal: {}", signal);
} else {
logger->info("Sigwait return value: {}", signal);
httpEndpoint->setHandler(this->getHandler());
httpEndpoint->serveThreaded();

// Wait for a signal if running in foreground (this blocks execution)
if (!isBackground) {
logger->info("Awaiting signal");
int signal = 0;
int status = sigwait(&signals, &signal);
if (status == 0) {
logger->info("Received signal: {}", signal);
} else {
logger->info("Sigwait return value: {}", signal);
}

stop();
}
}

void Endpoint::stop()
{
// Shut down the endpoint
auto logger = faabric::util::getLogger();

logger->debug("Shutting down HTTP endpoint");
this->httpEndpoint->shutdown();

httpEndpoint.shutdown();
if (isBackground) {
logger->debug("Waiting for HTTP endpoint background thread");
if (servingThread.joinable()) {
servingThread.join();
}
}
}
}
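The foreground branch above blocks a signal set and then parks in `sigwait()` until a shutdown signal arrives. A self-contained sketch of that pattern (single-threaded, so `sigprocmask` suffices; the demo queues `SIGUSR1` to itself, whereas the real endpoint waits for an external signal):

```cpp
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int awaitShutdownSignal()
{
    // Block the signal before waiting on it, as Endpoint::doStart() does
    sigset_t signals;
    sigemptyset(&signals);
    sigaddset(&signals, SIGUSR1);
    sigprocmask(SIG_BLOCK, &signals, nullptr);

    // Demo only: queue the signal to ourselves. Blocked signals stay
    // pending, so the sigwait() below returns immediately.
    kill(getpid(), SIGUSR1);

    int signal = 0;
    int status = sigwait(&signals, &signal);
    if (status == 0) {
        printf("Received signal: %d\n", signal);
    } else {
        printf("Sigwait return value: %d\n", status);
    }
    return signal;
}
```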
5 changes: 5 additions & 0 deletions src/executor/FaabricExecutor.cpp
@@ -91,6 +91,9 @@ void FaabricExecutor::finishCall(faabric::Message& msg,

// Increment the execution counter
executionCount++;

// Hook
this->postFinishCall();
}

void FaabricExecutor::run()
@@ -202,6 +205,8 @@ void FaabricExecutor::preFinishCall(faabric::Message& call,
const std::string& errorMsg)
{}

void FaabricExecutor::postFinishCall() {}

void FaabricExecutor::postFinish() {}

void FaabricExecutor::flush() {}
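`postFinishCall()` is a template-method hook: the base class runs it unconditionally at the end of `finishCall()`, and subclasses (such as `MpiExecutor`) override it. A minimal self-contained sketch of the pattern (types simplified; `MpiLikeExecutor` is illustrative, not faabric's class):

```cpp
#include <cstdio>

struct Executor {
    virtual ~Executor() = default;

    int executionCount = 0;

    // Mirrors FaabricExecutor::finishCall: do the work, count it, then
    // give subclasses a chance to react
    void finishCall()
    {
        executionCount++;
        postFinishCall(); // hook added in this PR
    }

  protected:
    // Default is a no-op, as in FaabricExecutor
    virtual void postFinishCall() {}
};

struct MpiLikeExecutor : Executor {
  protected:
    void postFinishCall() override { printf("post-finish hook ran\n"); }
};
```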
13 changes: 11 additions & 2 deletions src/executor/FaabricPool.cpp
@@ -39,7 +39,7 @@ void FaabricPool::startStateServer()
stateServer.start();
}

void FaabricPool::startThreadPool()
void FaabricPool::startThreadPool(bool background)
{
const std::shared_ptr<spdlog::logger>& logger = faabric::util::getLogger();
logger->info("Starting worker thread pool");
@@ -65,7 +65,11 @@ void FaabricPool::startThreadPool()
createExecutor(threadIdx);

// Worker will now run for a long time
executor.get()->run();
try {
executor.get()->run();
} catch (faabric::util::ExecutorFinishedException& e) {
this->_shutdown = true;
}

// Handle thread finishing
threadTokenPool.releaseToken(executor.get()->threadIdx);
@@ -82,6 +86,11 @@

// Will die gracefully at this point
});

// Make the call to startThreadPool blocking
if (!background) {
poolThread.join();
}
}

void FaabricPool::reset()
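The `background` flag turns `startThreadPool()` from fire-and-forget into a blocking call simply by joining the pool thread. A minimal sketch of that idiom (names illustrative; the real pool also drains a thread-token pool and catches `ExecutorFinishedException`):

```cpp
#include <cstdio>
#include <thread>

// When background is false, return only after the worker finishes,
// mirroring the `if (!background) { poolThread.join(); }` in this PR
int startThreadPool(bool background, std::thread& poolThread)
{
    poolThread = std::thread([] { printf("worker thread pool running\n"); });
    if (!background) {
        poolThread.join();
    }
    return 0;
}
```

In background mode the caller owns `poolThread` and must join it during shutdown.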