WebServ is a simple HTTP server written in C++ that adheres to HTTP/1.1 standards. It is designed to be non-blocking, supporting multiple clients simultaneously using efficient I/O multiplexing. The server configuration is provided through a configuration file, allowing users to customize various aspects of the server's behavior.
This project was developed in collaboration with jscaetano.
- Host static websites: Host static websites with HTML, CSS, and JavaScript files.
- Host multiple websites: Host multiple websites simultaneously, each with its own configuration.
- Non-blocking I/O: Uses efficient I/O multiplexing to handle multiple clients simultaneously without blocking.
- Customizable configuration: The server configuration is provided through a configuration file, allowing users to customize various aspects of the server's behavior.
- HTTP/1.1 compliant: Adheres to the HTTP/1.1 standard, supporting common HTTP methods such as GET, POST, and DELETE.
- Error handling: Returns appropriate error codes and messages for various client requests.
- File uploads: Supports file uploads using the POST method.
- Directory listing: The server can automatically generate a directory listing for a requested directory.
- Redirects: The server can redirect clients to a different URL based on the request.
- Error pages: The server can return custom error pages for various error codes.
- CGI scripts: Supports CGI scripts for dynamic content generation.
Keyword | Description | Example | Must have |
---|---|---|---|
server | Server block. | server { ... } | Yes |
listen | Port number to listen on. | 8080 | Yes |
host | Host name. | localhost | Yes |
root | Root directory for the server. | /path/to/root/on/your/machine | Yes |
index | Default index file. | index.html | Yes |
client_max_body_size | Maximum size of the client request body. | 1M | Yes |
server_name | Server name. | my_server | Optional |
error_page | Error page to return for 404 Not Found errors. | 404.html | Optional |
Keyword | Description | Example | Must have |
---|---|---|---|
location | URL path to match. | /upload { ... } | Yes |
allow_methods | Allowed HTTP methods. | GET POST DELETE | Optional |
try_file | File to try when a directory is requested. | file.html | Optional |
return | URL to redirect to. | /index.html | Optional |
autoindex | Enable directory listing. | on | Optional |
root | Root directory for the location. | /path/to/root/on/your/machine | Optional |
upload_to | Directory to upload files to. | /path/to/upload/to | Optional |
cgi_path | Path to the CGI scripts. | /path/to/cgi/scripts | Optional |
cgi_extension | File extension for CGI scripts. | .py | Optional |
To compile the webserv program, use the provided Makefile by running make in the project directory. This will generate an executable named webserv.
make
To run the webserv program, simply execute the generated webserv binary. This will start the server using the default configuration file ./conf/default.conf.
./webserv
If you want to use a different configuration file, you can provide the path to the configuration file as an argument to the program.
./webserv /path/to/your/config/file.conf
Once running, you can interact with the server by sending requests with curl or a web browser. The server will respond to requests based on the configuration file provided.
- 1.1. How it Works
- 1.2. Which I/O Multiplexing
- 1.3. How poll Works
- 1.4. How Client Reads and Writes Are Checked
- 1.5. Polling Requests Error Handling
An HTTP server is essentially an input and output reader that uses file descriptors to read from and write to sockets. The server runs an infinite loop waiting for client requests; when it receives one, it parses the request and sends the response to the client.
The multiplexing function chosen was poll, which is a more modern version of what the select function does.
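As a rough illustration of how poll reports readiness, the sketch below waits on a single descriptor. The helper name waitReadable is an assumption made for this example and is not part of WebServ.

```cpp
#include <poll.h>
#include <unistd.h>

// Hypothetical helper (not part of WebServ): returns true if `fd` has data
// to read within `timeoutMs` milliseconds, false on timeout or error.
bool waitReadable(int fd, int timeoutMs)
{
    pollfd request;
    request.fd = fd;
    request.events = POLLIN; // only interested in "data available to read"
    request.revents = 0;

    int ready = poll(&request, 1, timeoutMs);
    if (ready <= 0) // 0 means timeout, -1 means error
        return false;
    return (request.revents & POLLIN) != 0;
}
```

With a pipe, for instance, waitReadable on the read end would stay false until something is written to the write end.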
A single poll vector stores the information for all servers and clients. It is declared in the class Service:
class Service
{
    private:
        pollfdVector _pollingRequests; // one pollfd entry per server or client socket
        // ...
};
The poll() function is called on every iteration of the main loop in Service::launch, always receiving the same vector _pollingRequests, and then every socket is checked. It is called on each pass of the while loop because clients can be added or removed at any time. The call itself lives in the function _initPollingRequests:
void Service::_initPollingRequests()
{
    if (poll(this->_pollingRequests.data(), this->_pollingRequests.size(), POLL_TIME_OUT) < 0 && g_shutdown == false)
        throw std::runtime_error(ERR_POLL_FAIL + std::string(std::strerror(errno)));
}
The first entries of the poll vector are filled with the server sockets in the function Service::Setup: at the end of each server's setup, _addSocketInPollingRequests is called to add the server socket to the poll vector.
void Service::_addSocketInPollingRequests()
{
    pollfd request;

    if (this->_tmp.launch == true)
    {
        // client socket: watch for both incoming data and room to write
        request.fd = this->_tmp.connectionSocket;
        request.events = POLLIN | POLLOUT;
    }
    else
    {
        // server socket: only incoming connections matter
        request.fd = this->_tmp.socket;
        request.events = POLLIN;
    }
    request.revents = 0;
    this->_pollingRequests.push_back(request);
}
To accept clients, once all servers are set up, Service::launch calls _acceptConnection when a client connection is ready to be read on a server socket. The function accepts the connection and adds the client's socket to the poll vector.
void Service::_acceptConnection()
{
    this->_tmp.connectionSocket = accept(this->_tmp.socket, NULL, NULL);
    if (this->_tmp.connectionSocket < 0)
        throw std::runtime_error(ERR_ACCEPT_SOCKET);
    fcntl(this->_tmp.connectionSocket, F_SETFL, O_NONBLOCK); // set socket to non-blocking
    this->_clients.push_back(Client(this->_servers.at(this->_tmp.id), this->_tmp.connectionSocket));
    this->_addSocketInPollingRequests();
}
On each iteration of the loop in Service::launch, the function _pollingManager is called to check the type of each polling request.
void Service::_pollingManager()
{
    for (size_t i = 0; i < this->_pollingRequests.size(); i++)
    {
        this->_getLaunchInfo(i);
        if (this->_hasDataToRead())
            continue;
        if (this->_hasBadRequest())
            continue;
        if (this->_isServerRequest())
            continue;
        if (this->_hasDataToSend())
            continue;
        this->_resetInfo();
    }
}
The helper functions called by _pollingManager handle the errors for each type of polling request and then call _closeConnection, which closes the connection and removes that client.
void Service::_closeConnection(std::string const &msg)
{
    close(this->_tmp.socket);                                                      // close the client's socket
    this->_pollingRequests.erase(this->_pollingRequests.begin() + this->_tmp.id);  // drop its pollfd entry
    this->_clients.erase(this->_clients.begin() + this->_tmp.clientID);            // drop the Client object
    printInfo(msg, RED);
}
The function _closeConnection is called in the functions _readData, _hasBadRequest and _hasDataToSend. In _readData specifically, the connection is closed when recv returns a value less than 1: both 0 (the peer closed the connection) and -1 (an error occurred) are treated as reasons to close it.
void Service::_readData()
{
    char buffer[BUFFER_SIZE] = {0};
    int  bytes = recv(this->_tmp.socket, buffer, BUFFER_SIZE, 0);

    if (bytes > 0)
        this->_clients.at(this->_tmp.clientID).appendRequest(buffer, bytes);
    else
        this->_closeConnection(CLOSE_MSG);
}
To define more hosts besides localhost, add the host IP to the operating system's /etc/hosts file, which requires sudo.
To check the server response, use the following command:
curl --resolve localhost:8080:127.0.0.1 localhost:8080
To set up an error page, add the error_page directive to the server block of the configuration file, for example:
server {
server_name my_server;
listen 8080;
host localhost;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/main_site;
index index.html;
client_max_body_size 1M;
error_page 404.html;
}
This page is returned when the server responds with the error 404 Not Found. If no error page is set up, the server returns a default error page.
To check that the client body size limit is working, set client_max_body_size to 9 in the .conf file, allow the POST and GET methods in location /, and run the following command:
curl -X POST -H "Content-Type: text/plain" --data "1234567890" localhost:8080/curl_post.py
In this case, the server must return the error 413 Payload Too Large. If you remove the trailing 0 from the body, it is only 9 bytes long and the server no longer returns error 413.
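The boundary described here (10 bytes rejected, 9 bytes accepted with a limit of 9) corresponds to a strict "greater than" comparison. A minimal sketch of such a check follows. The function name statusForBody is hypothetical, and returning 200 for an accepted body is a simplification.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the HTTP status for a request body given the
// configured limit in bytes. A body exactly at the limit is accepted.
int statusForBody(const std::string &body, std::size_t maxBodySize)
{
    if (body.size() > maxBodySize)
        return 413; // Payload Too Large
    return 200;
}
```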
To choose the file served when a directory is requested, add the try_file directive to the location block of the configuration file, for example:
location /upload {
allow_methods GET POST;
try_file file.html;
}
In this case, the server looks for the file file.html in the directory /upload and returns it if it exists; otherwise, it returns the error 404 Not Found.
To check the GET request, use the following command:
curl localhost:8080/index.html
To check the POST request, use the following command:
curl -X POST -H "Content-Type: text/plain" --data "1234567890" localhost:8080/curl_post.py
This command will upload a file called upload.txt to the server with the content 1234567890.
To check the DELETE request, use the following command:
curl -X DELETE localhost:8080/upload/upload.txt
This command will delete the file upload.txt from the server.
To check the UNKNOWN request, use the following command:
curl -X UNKNOWN localhost:8080
This command will return the error 501 Not Implemented.
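A check along these lines could produce the 501 response. The sketch assumes that only GET, POST and DELETE are implemented, as listed in the features. The function name statusForMethod is hypothetical.

```cpp
#include <string>

// Hypothetical helper: maps a request method to 501 when the server does not
// implement it; 0 means "implemented, keep processing the request".
int statusForMethod(const std::string &method)
{
    if (method == "GET" || method == "POST" || method == "DELETE")
        return 0;
    return 501; // Not Implemented
}
```

HTTP method names are case-sensitive, so the comparison is done on the exact token received.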
To download the uploaded file, after running the upload command, use the following command:
curl -O localhost:8080/upload/upload.txt
This command will download the file upload.txt from the server to the current directory.
To check the network with firefox, follow these steps:
- Open the firefox browser
- Right-click on the page and select inspect element
- Click on the network tab
- In the navigation bar, type the URL localhost:8080
- On the network tab, click on the request to see more details
To see a directory content, you need to type the URL with the directory name at the end, for example:
localhost:8080/upload/
If autoindex is enabled, the server returns the directory content; otherwise, it returns the error 403 Forbidden.
To test URL redirection, set up a location in the configuration file that redirects to another URL, for example:
location /redirect {
allow_methods GET;
return /index.html;
}
In this case, the server will redirect the URL localhost:8080/redirect to localhost:8080/index.html.
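On the wire, a redirect is a 3xx response carrying a Location header. The sketch below builds one such response. The 301 status code is an assumption, since the text does not say which 3xx code WebServ sends, and the function name buildRedirect is hypothetical.

```cpp
#include <string>

// Hypothetical helper: builds a redirect response pointing at `target`.
// The 301 status is an assumption; the source does not say which 3xx code
// WebServ sends.
std::string buildRedirect(const std::string &target)
{
    std::string response = "HTTP/1.1 301 Moved Permanently\r\n";
    response += "Location: " + target + "\r\n";
    response += "Content-Length: 0\r\n";
    response += "\r\n"; // blank line ends the header section
    return response;
}
```

Running curl -i localhost:8080/redirect shows the status line and the Location header the server actually returns.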
To check the server with different ports and sites, set up the configuration file with different ports and sites, for example:
server {
server_name main_site;
listen 8080;
host localhost;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/main_site;
index index.html;
client_max_body_size 1M;
}
server {
server_name blue;
listen 8081;
host blue.42.fr;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/blue;
index index.html;
client_max_body_size 1M;
}
In this case, the servers will be listening on ports 8080 and 8081, and the sites will be main_site and blue respectively.
To check the server with the same port, set up the configuration file with two servers on the same port, for example:
server {
server_name main_site;
listen 8080;
host localhost;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/main_site;
index index.html;
client_max_body_size 1M;
}
server {
server_name blue;
listen 8080;
host localhost;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/blue;
index index.html;
client_max_body_size 1M;
}
In this case, only the main_site server will be listening on port 8080, because the blue server cannot listen on the same host and port.
To check the server with the same port and different hosts, set up the configuration file with the same port and different hosts, for example:
server {
server_name main_site;
listen 8080;
host localhost;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/main_site;
index index.html;
client_max_body_size 1M;
}
server {
server_name blue;
listen 8080;
host blue.42.fr;
root /home/wcorrea-/workplace/common_core/webserv/webserv/websites/blue;
index index.html;
client_max_body_size 1M;
}
In this case, both servers will be listening on port 8080, and the sites will be main_site and blue respectively.
The server can't listen twice on the same host and port, because bind works with only one socket per host and port pair. Two servers on the same port with different hosts are both set up, because the hosts differ. If a server's host and port are already being listened on by an earlier server, that server is ignored.
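The "first one wins" rule described above can be sketched as a simple deduplication over (host, port) pairs. The names Endpoint and uniqueEndpoints are hypothetical. The sketch compares host strings literally, so two different names that resolve to the same IP address would not be detected as duplicates.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> Endpoint; // (host, port)

// Hypothetical helper: keeps the first server for each (host, port) pair and
// drops later duplicates, mirroring the "ignored" behavior described above.
std::vector<Endpoint> uniqueEndpoints(const std::vector<Endpoint> &configured)
{
    std::set<Endpoint>    seen;
    std::vector<Endpoint> result;

    for (std::size_t i = 0; i < configured.size(); i++)
    {
        // insert() reports in .second whether the pair was new
        if (seen.insert(configured[i]).second)
            result.push_back(configured[i]);
    }
    return result;
}
```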
If a request names a server_name whose host is in the configuration file but which is not the default server, the client's server is temporarily replaced by that server in the function _checkRequestedServer:
void Service::_checkRequestedServer()
{
    std::string request = this->_clients.at(this->_tmp.clientID).getRequest();
    std::string requestedServer;
    size_t      pos;

    // find() returns std::string::npos when nothing matches, so compare explicitly
    if ((pos = request.find(REQUEST_HOST)) == std::string::npos)
        return;
    requestedServer = request.substr(pos + std::strlen(REQUEST_HOST));
    // keep only the rest of the Host line
    if ((pos = requestedServer.find(NEWLINE)) != std::string::npos)
        requestedServer = requestedServer.substr(0, pos);
    // drop the ":port" suffix, if any
    if ((pos = requestedServer.find(":")) != std::string::npos)
        requestedServer = requestedServer.substr(0, pos);

    Server defaultServer = this->_clients.at(this->_tmp.clientID).getServer();
    serverVector::iterator server = this->_servers.begin();
    for (; server != this->_servers.end(); server++)
    {
        // switch to the server whose name matches and that shares the default's host
        if (requestedServer == server->getServerName() && server->getHost() == defaultServer.getHost())
            this->_clients.at(this->_tmp.clientID).changeServer(*server);
    }
}
To test this, use the same_host_port.conf file and the following command:
curl --resolve localhost:8080:127.0.0.1 localhost:8080
This command will return the default server page.
curl --resolve black:8080:127.0.0.1 black:8080
This command will return the non-default server page.
Because of the school's security policy, students can't use the sudo command. This means that students can't:
- Run the server on ports below 1024
- Set more custom hosts in the /etc/hosts file
- Install the siege program for the stress test
To work around this, you can use a virtual machine:
- Create a VM, or reuse one from a previous project such as born2beroot or inception.
- Share your project folder with the VM using the VirtualBox Shared Folder feature.
- Access your VM through the terminal with the ssh command.
Tip: We recommend using the terminal called Terminator for these tests. With it, you can open multiple terminals in the same window.
Terminator Shortcuts:
- Open Terminator: Ctrl+Alt+T
- Split terminal horizontally: Ctrl+Shift+O
- Split terminal vertically: Ctrl+Shift+E
- Close terminal: Ctrl+Shift+W
- Change terminal: Ctrl+Shift+Tab
And to set more hosts in the /etc/hosts file, we need to use sudo too, for example:
sudo nano /etc/hosts
To check the server with the stress test, you need to install the siege program:
sudo apt-get install siege
Now you need to add a new host to the VM's /etc/hosts file, for example:
127.0.0.1 testing
After this, create an empty file called empty.html in your server root directory, then add this configuration to the configuration file:
server {
server_name my_server;
listen 8080;
host testing;
root /your/vm/shared/folder/with/your/website;
index empty.html;
client_max_body_size 1M;
}
Now, run the following command:
siege -b http://testing:8080/empty.html -t 1m
-b runs the test as a benchmark, with no delay between requests.
-t defines the duration of the test. The value takes a suffix: s for seconds, m for minutes or h for hours.
If the availability is above 99.5% for a simple request on an empty page, the server is working fine.
To check the server for memory leaks, you can use the top program, which shows the memory usage of each process. The webserv program needs to be running. Then, run the following command:
pgrep webserv
This command will return the process ID of the webserv program. Then run the following command:
top -p <process ID>
Now you can see the memory usage of your webserv. The RES column shows the physical memory used by the process. If the memory usage increases indefinitely, the server has a memory leak.
For a better test, you can check the memory usage of the webserv program while the siege program is running.
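To check whether the server leaves connections hanging, you can use the netstat program. While the siege test is running, run the following command:
watch -n 1 netstat -tuan
The -a flag lists sockets in every state; with -l, only listening sockets would be shown. If a connection stays in a state other than ESTABLISHED or LISTEN (for example CLOSE_WAIT) for a long time, the connection is hanging. Short-lived TIME_WAIT entries are normal after a connection closes.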