
webserver

Epoll HTTP Web Server with C++98

Introduction

ARPANET: Advanced Research Projects Agency Network
HTTP: Hypertext Transfer Protocol
HTML: Hyper Text Markup Language
CSS: Cascading Style Sheets
RFC: Request for Comments

"Development of HTTP was initiated by Tim Berners-Lee at CERN in 1989 and summarized in a simple document describing the behavior of a client and a server using the first HTTP protocol version that was named 0.9" (Wikipedia)
The RFC system was invented by Steve Crocker in 1969 to help record unofficial notes on the development of ARPANET. It was later used to define, among other things, how HTTP should work.
The RFCs define the semantics used to create HTTP Requests and Responses.

What is a Client and a Server?
A Server is a computer program which stores, manages, sends and receives data. In particular, an HTTP Server receives an HTTP Request and creates an HTTP Response.
An HTTP server is useful for navigating through resources in a given environment. For example, a website is a set of files which together make a functional webpage.
Websites usually use HTML to define their content, a CSS stylesheet to modify the properties of the content (color, position, size, etc.) and a script, usually JavaScript, which can dynamically modify the website.
A Client is a program which can create an HTTP Request and receive an HTTP Response.
Web Browsers are the most common clients; they can also display the requested resources in a readable way, using an HTML interpreter.
Browsers interpret the Response content, identify all the resources it needs (images, videos, embedded links, etc.) and create a Request for each resource needed by the webpage.
What is a Socket
.... explain sockets, host: ports ips etc etc

For example
The HTML document below requests a CSS stylesheet on its third line, a JS script on its fourth line and an image near the end.
This HTML document needs 3 resources and each will be requested individually; it is up to the Client to decide whether and how to request them.

<!DOCTYPE html>
<html lang="en">
<link rel="stylesheet" href="index.css">
<script src="index.js"></script>
<body>
 <h1>Hello, this is Home</h1>
 <h3>Click on the screen.</h3>
 <p>Coding is fun!</p>
 <img src="/random/content/mexico.jpg" width="500px" height="auto">
</body>
</html>

Assume that this HTML document is named index.html and that the following configuration is used:
The Route (location) "/" accepts the GET method, has the indexes "index.html" and "index.htm", and its root is "www/".
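As a rough illustration of the configuration just described, the sketch below lists the file paths a server might try for a request target, given a root and a list of indexes. The function names `joinPath` and `candidatePaths` are hypothetical and not taken from this project; the real lookup may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: join a root such as "www/" with a request target
// such as "/index.css" without producing a double slash.
std::string joinPath(const std::string& root, const std::string& target) {
    std::string path = root;
    while (!path.empty() && path[path.size() - 1] == '/')
        path.erase(path.size() - 1);
    if (target.empty() || target[0] != '/')
        path += '/';
    return path + target;
}

// Hypothetical sketch: a target ending in '/' is treated as a directory,
// so every configured index is tried in order; any other target maps to
// exactly one file.
std::vector<std::string> candidatePaths(const std::string& root,
                                        const std::string& target,
                                        const std::vector<std::string>& indexes) {
    std::vector<std::string> out;
    std::string base = joinPath(root, target);
    if (base[base.size() - 1] != '/') {
        out.push_back(base);
        return out;
    }
    for (std::vector<std::string>::size_type i = 0; i < indexes.size(); ++i)
        out.push_back(base + indexes[i]);
    return out;
}
```

With root "www/" and the two indexes above, a request for "/" would yield "www/index.html" followed by "www/index.htm", and a request for "/index.css" would yield only "www/index.css".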

Configuration images

Requests and Responses Walkthrough

The following image shows the whole process of retrieving a webpage:

You get this as a result:

Configuration

The webserver can be configured with directives. Each directive serves one purpose, although some directives become redundant when used in certain contexts.
This webserver uses a configuration file based on the NGINX configuration file.
All featured directives are listed in the table below.
The configuration file has 2 contexts: server and location. A location context must belong to a server context. CGI is supported.

Directives

Example:

server {
  # INSIDE SERVER CONTEXT
  # (SP) Space
  DIRECTIVE (SP) VALUE
  location / {
    #INSIDE LOCATION CONTEXT
    DIRECTIVE (SP) VALUE
  }
}
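The "DIRECTIVE (SP) VALUE" layout shown above can be read with a simple whitespace split. The following is a minimal sketch, not this project's parser: `parseDirective` is a hypothetical name, and treating everything after '#' as a comment is an assumption based on the comment lines in the example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: split one configuration line into tokens.
// tokens[0] is the directive name, the rest are its values.
// Assumption: '#' starts a comment that runs to the end of the line.
std::vector<std::string> parseDirective(const std::string& line) {
    // find() returns npos when there is no '#', so substr keeps the whole line
    std::string content = line.substr(0, line.find('#'));
    std::istringstream in(content);
    std::vector<std::string> tokens;
    std::string token;
    while (in >> token)
        tokens.push_back(token);
    return tokens;
}
```

A line such as "  root www/" would produce the tokens "root" and "www/", while a comment-only line produces an empty vector that the caller can skip.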

Directive	Description	Example
autoindex	directory listing on & off	autoindex off \| on
cgi	Sets up CGI support for a file extension. Each cgi directive adds an executable to the location. The CGI URI should always request the executable, including the extension. When the URI target is an executable which is not defined, the first executable for the extension is used.	cgi <extension> <path_to_executable>
cgi-bin	Sets the path where the CGI executable is run; the path is also set as an environment variable.	cgi-bin <path_to_bin>
client_max_body_size	Limits the client body size. If the request body exceeds the given number of bytes, error 413 is returned.	client_max_body_size #bytes
error_page	Sets up default error pages. You can define one or more error codes. The path where the server searches for the error pages must be the last parameter of the directive. Webserv only supports custom HTML error pages and looks for "error path + error code + .html". Error pages are inherited from the server context by the location context.	error_page code1 [code2] [codeN] ... <path>
limit_methods	Defines the list of accepted HTTP methods for the route (inside the location scope). If none is defined, only GET is accepted.	limit_methods METHODS
listen	Sets the host and port of each 'server'; localhost and "*" are accepted.	listen host:port
location	Setup routes with one or multiple rules/configuration	location route { ... }
root	Sets the root path for the server or location.	root <path>
redirect	Redirect to another location.	redirect [full URI or relative URI]
server_name	Sets one or more server names (optional).	server_name name1 [name2] ...
upload	upload	upload
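The error_page rule in the table ("error path + error code + .html", with the path as the last parameter) can be sketched as follows. The names `errorPagePath` and `parseErrorPage` and the directory "www/errors/" are hypothetical; the plain concatenation follows the table's wording and is not confirmed against the actual source.

```cpp
#include <cstdlib>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: build "error path + error code + .html".
std::string errorPagePath(const std::string& errorPath, int code) {
    std::ostringstream out;
    out << errorPath << code << ".html";
    return out.str();
}

// Hypothetical sketch: args are the values after the directive name,
// e.g. {"404", "500", "www/errors/"}. The last value is the path,
// every value before it is an error code.
std::map<int, std::string> parseErrorPage(const std::vector<std::string>& args) {
    std::map<int, std::string> pages;
    if (args.size() < 2)
        return pages; // need at least one code and the path
    const std::string& errorPath = args[args.size() - 1];
    for (std::vector<std::string>::size_type i = 0; i + 1 < args.size(); ++i) {
        int code = std::atoi(args[i].c_str());
        pages[code] = errorPagePath(errorPath, code);
    }
    return pages;
}
```

For the values "404 500 www/errors/" this maps 404 to "www/errors/404.html" and 500 to "www/errors/500.html".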

TODO

File Upload
Limit Body Size
PUT & DELETE actions
Cookies

BUGS

When autoindex is on and the location root doesn't exist, the server should return 404 Not Found, but it returns 403 Forbidden.

RFC's

Guides

Simple Webserver
- Simple-webserver-from-scratch
NGINX (Directives)
- NGINX webserver configuration file tutorial
GET / POST / DELETE
- https://www.youtube.com/watch?v=nMGlaiNBbNU&list=PL6n9fhu94yhW7yoUOGNOfHurUE6bpOO2b&index=4
MORE...
- https://www.youtube.com/watch?v=eesqK59rhGA
- https://www.youtube.com/watch?v=FBkZ2TJZZUY
epoll

Tony Extra Info

SEARCH FOR A FILE:
if found -> return a file in a body message
else -> return a file with error code (what code? 404 - file not found)
find response codes and messages corresponding to them, create std::map<errorCode, response>
BOTH CASES:
content length = size of the buffer of the file that we look for / or response html
content type = CHECK HOW TO GET FILE METADATA; can also set encoding
DATE: TODAY, NOW (when the request was made or when we sent response?)
- ! Last-Modified - last modified date(only for encountered file)
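On the Date question above: the Date header carries the time at which the response message was generated, while Last-Modified comes from the file itself. Both use the same fixed GMT format. The sketch below shows one way to produce them with the C time functions and POSIX `stat`; `httpDate` and `lastModified` are hypothetical names.

```cpp
#include <ctime>
#include <string>
#include <sys/stat.h>

// Hypothetical sketch: format a timestamp the way HTTP expects,
// e.g. "Thu, 01 Jan 1970 00:00:00 GMT". Assumes the default "C" locale.
std::string httpDate(std::time_t t) {
    char buf[64];
    std::tm* gmt = std::gmtime(&t);
    if (gmt == NULL)
        return "";
    std::size_t n = std::strftime(buf, sizeof(buf),
                                  "%a, %d %b %Y %H:%M:%S GMT", gmt);
    return std::string(buf, n);
}

// Hypothetical sketch: Last-Modified value for a file on disk,
// or an empty string when the file cannot be inspected.
std::string lastModified(const std::string& path) {
    struct stat info;
    if (stat(path.c_str(), &info) != 0)
        return "";
    return httpDate(info.st_mtime);
}
```

A response would then contain a line such as "Date: " followed by `httpDate(std::time(NULL))`, and "Last-Modified: " followed by `lastModified(path)` only when a file was found.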

// PSEUDO CODE:

std::map<int, std::string> statusMessages; // status code -> reason phrase

buffer = read(expected_file)
if (read failed)
    return 404.html;

if (expected_file was read without error)
    create the "response"; // it can be a std::stringstream where we save the lines below

    response << "HTTP/1.1 " << status_code << " " << statusMessages[status_code] << "\r\n"; // e.g. HTTP/1.1 404 Not Found
    response << "Server: PulgaBrenoPrzemek/1.0 (Ubuntu 20.04 LTS)" << "\r\n";
    response << "Content-Type: " << split the file name, take the part after the last ".", e.g. castle.jpg -> jpg -> image/jpeg
        * Content-Type: text/plain; charset=us-ascii <- default (when nothing is specified). Format: type/subtype; parameter // parameter := attribute (charset) = value (us-ascii)
    We have to check that the content type we send is accepted by the browser (the Accept header of the request has to be parsed). If it is not, respond with 406 Not Acceptable.

if (request has a "Connection" header)
    if (value == keep-alive) // the default in HTTP/1.1

        response << "Connection: keep-alive" << "\r\n";

    else
        response << "Connection: close" << "\r\n";
        // Do not close the socket right away; the client closes the connection. We can set a timeout after which we close it, but we should not depend on it.
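The Connection decision above can be written as a small function. This is a sketch under the usual HTTP rules (HTTP/1.1 connections are persistent unless "close" is sent, HTTP/1.0 connections only with an explicit "keep-alive"); `keepAlive` and `connectionHeaderLine` are hypothetical names.

```cpp
#include <cctype>
#include <string>

// Lower-case a copy of the header value so the comparison ignores case.
static std::string toLower(std::string s) {
    for (std::string::size_type i = 0; i < s.size(); ++i)
        s[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(s[i])));
    return s;
}

// Hypothetical sketch: decide whether the connection stays open.
// connectionHeader is the value of the request's Connection header,
// or an empty string when the header is absent.
bool keepAlive(const std::string& version, const std::string& connectionHeader) {
    std::string value = toLower(connectionHeader);
    if (value.find("close") != std::string::npos)
        return false;
    if (value.find("keep-alive") != std::string::npos)
        return true;
    return version == "HTTP/1.1"; // persistent by default in HTTP/1.1 only
}

// Header line to add to the response for the decision taken above.
std::string connectionHeaderLine(bool keep) {
    return keep ? "Connection: keep-alive\r\n" : "Connection: close\r\n";
}
```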


Client Content Negotiation:
If a resource has multiple representations, the server may send back a list of the available representations (code 300); the client then sends another GET request with the
link it wants, according to its headers (the languages it accepts, encoding, etc.)


{
  // Content-Type:

std::string mimetype(std::string const & file_name) {
  std::string ext = part of file_name after the last '.', e.g. jpg/html/css etc;


//TEXT
  if (txt)
    return "text/plain"
  else if (html)
    return "text/html";
  else if (css)
    return "text/css";
  else if (js) //javascript // "; parameter" - charset=/anything/ makes it invalid!
    return "text/javascript"
  else if (json)
    return "application/json"
  else if (jsonld)
    return "application/ld+json";
  else if (xml)
    return "application/xml";
  else if (pdf)
    return "application/pdf"

//DOCUMENTS
  else if(doc)
    return "application/msword";
  else if(docx)
    return "application/vnd.openxmlformats-officedocument.wordprocessingml.document"
  else if (ppt)
    return "application/vnd.ms-powerpoint"
  else if (pptx)
    return "application/vnd.openxmlformats-officedocument.presentationml.presentation"
  else if (odt)
    return "application/vnd.oasis.opendocument.text"
  else if (xls)
    return "application/vnd.ms-excel"
  else if (xlsx)
    return "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet"
  else if (odp)
    return "application/vnd.oasis.opendocument.presentation"
  else if (ods)
    return "application/vnd.oasis.opendocument.spreadsheet"
  
  
//IMAGE
  else if (jpg || jpeg)
    return "image/jpeg"
  else if (png)
    return "image/png"
  else if (apng)
    return "image/apng"
  else if (avif)
    return "image/avif"
  else if (gif)
    return "image/gif"
  else if (svg)
    return "image/svg+xml"
  else if (webp)
    return "image/webp"
  else if (bmp)
    return "image/bmp"
  else if (ico || cur)
    return "image/x-icon"
  else if (tif || tiff)
    return "image/tiff"

//SOUND
  else if (mp3)
    return "audio/mpeg"
  else if (aac)
    return "audio/aac"
  else if (wav)
    return "audio/wave"
  else if (flac)
    return "audio/flac"

//VIDEO
  else if (mpeg)
    return "video/mpeg"
  else if (mp4)
    return "video/mp4"
  else if (avi)
    return "video/x-msvideo"
  
//AUDIO-VIDEO
  else if (3gp)
    return "video/3gpp" // or "audio/3gpp" if the file does not contain video
  
//ARCHIVES
  else if (bz)
    return "application/x-bzip"
  else if (bz2)
    return "application/x-bzip2"
  else if (gz)
    return "application/gzip"
  else if (zip)
    return "application/zip"
  else if (7z)
    return "application/x-7z-compressed"
  else if (tar)
    return "application/x-tar"

//DEFAULT
  else // else if (bin) - any kind of data
    return "application/octet-stream" // default for binary files. It means unknown binary file
}

} //CONTENT TYPE
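The if/else chain above can be collapsed into a lookup table. The following is a minimal C++98 sketch covering only a subset of the extensions listed; `buildMimeTable` is a hypothetical helper, and the "jpg" entry is added here because of the castle.jpg example in the notes.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical helper: extension -> media type, a subset of the list above.
static std::map<std::string, std::string> buildMimeTable() {
    std::map<std::string, std::string> m;
    m["txt"] = "text/plain";
    m["html"] = "text/html";
    m["css"] = "text/css";
    m["js"] = "text/javascript";
    m["json"] = "application/json";
    m["pdf"] = "application/pdf";
    m["jpg"] = "image/jpeg";
    m["jpeg"] = "image/jpeg";
    m["png"] = "image/png";
    m["gif"] = "image/gif";
    m["svg"] = "image/svg+xml";
    m["mp3"] = "audio/mpeg";
    m["mp4"] = "video/mp4";
    m["gz"] = "application/gzip";
    m["zip"] = "application/zip";
    return m;
}

// Look up the media type from the text after the last '.'.
// Unknown or missing extensions fall back to application/octet-stream.
std::string mimetype(const std::string& fileName) {
    static const std::map<std::string, std::string> table = buildMimeTable();
    const std::string fallback = "application/octet-stream";
    std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos || dot + 1 == fileName.size())
        return fallback;
    std::string ext = fileName.substr(dot + 1);
    for (std::string::size_type i = 0; i < ext.size(); ++i)
        ext[i] = static_cast<char>(std::tolower(static_cast<unsigned char>(ext[i])));
    std::map<std::string, std::string>::const_iterator it = table.find(ext);
    return it == table.end() ? fallback : it->second;
}
```

Adding a new type then means adding one line to the table instead of another branch.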


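autoindex on:
  fork
    child: redirect stdout to a pipe, then execv("/bin/ls", {"ls", "-l", directory}) // execv does not return on success, so its result cannot be printed
    parent: read the listing from the pipe and use it as the response body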
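As an alternative to forking `ls`, the listing can be built in-process with the POSIX directory functions. This is a sketch of that swapped-in technique, not what the notes above describe; `listDirectory` and `autoindexHtml` are hypothetical names, and the HTML is deliberately minimal (names are not escaped).

```cpp
#include <algorithm>
#include <cstddef>
#include <dirent.h>
#include <string>
#include <vector>

// Hypothetical sketch: collect the entry names of a directory, sorted.
// Returns false when the directory cannot be opened.
bool listDirectory(const std::string& path, std::vector<std::string>& names) {
    DIR* dir = opendir(path.c_str());
    if (dir == NULL)
        return false;
    struct dirent* entry;
    while ((entry = readdir(dir)) != NULL) {
        std::string name = entry->d_name;
        if (name != ".") // keep ".." so the client can navigate upwards
            names.push_back(name);
    }
    closedir(dir);
    std::sort(names.begin(), names.end());
    return true;
}

// Hypothetical sketch: turn the names into a small HTML page.
// Assumption: names need no HTML or URL escaping.
std::string autoindexHtml(const std::string& uri,
                          const std::vector<std::string>& names) {
    std::string html = "<html><body><h1>Index of " + uri + "</h1><ul>\n";
    for (std::vector<std::string>::size_type i = 0; i < names.size(); ++i)
        html += "<li><a href=\"" + names[i] + "\">" + names[i] + "</a></li>\n";
    html += "</ul></body></html>\n";
    return html;
}
```

When `listDirectory` returns false, the server can answer with an error page instead of a listing.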



RESPONSE(string path)
std::map<int, std::string> _codeMessage;
split host from direction
split file from direction

//CHECK IF METHOD IS SUPPORTED
if (method is not in server._methods) {
      std::string buffer = read(WHATSERVER?/error/405.html);
      std::stringstream error;

      error << "HTTP/1.1 " << 405 << " " << _codeMessage[405] << "\r\n"; // 405 Method Not Allowed
      error << "Content-Type: text/html\r\n";
      error << "Content-Length: " << buffer.size() << "\r\n";
      error << "\r\n";
      error << buffer;

      write(client_fd, error.str().c_str(), error.str().size());
}

std::vector<std::string>::iterator it = _location.begin();
for (; it != _location.end(); it++) {
  if (direction == *it) { // ============= IF LOCATION IN REQUEST FOUND
    if (file_open(_l_root/file) == 0) { // ==== IF REQUESTED FILE NOT FOUND
      std::string buffer = read(WHATSERVER?/error/404.html);
      std::stringstream error;

      error << "HTTP/1.1 " << 404 << " " << _codeMessage[404] << "\r\n";
      error << "Content-Type: text/html\r\n";
      error << "Content-Length: " << buffer.size() << "\r\n";
      error << "\r\n";
      error << buffer;

      write(client_fd, error.str().c_str(), error.str().size());
    }
    else if (file_not_specified) { // ======== IF FILE NOT SPECIFIED, check WHAT HAPPENS?
      // same check as the if above, but with index.html
    }
    else { // ================================= IF REQUESTED FILE FOUND
      std::string buffer = read(WHATSERVER?/content/split_file.extension);
      std::stringstream response;

      response << "HTTP/1.1 " << 200 << " " << _codeMessage[200] << "\r\n";
      response << "Content-Type: " << mimetype(file_name) << "\r\n";
      response << "Content-Length: " << buffer.size() << "\r\n";
      response << "\r\n";
      response << buffer;
    }
  }
}
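The three branches above build the same kind of message, so the formatting can live in one function. The sketch below is a minimal, self-contained version; `buildReasons` and `buildResponse` are hypothetical names, and the reason-phrase table only lists a few codes.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: status code -> reason phrase (partial list).
static std::map<int, std::string> buildReasons() {
    std::map<int, std::string> m;
    m[200] = "OK";
    m[403] = "Forbidden";
    m[404] = "Not Found";
    m[405] = "Method Not Allowed";
    m[500] = "Internal Server Error";
    return m;
}

// Hypothetical sketch: status line, two headers, a blank line, then the body.
// Content-Length is taken from the body itself so the two cannot disagree.
std::string buildResponse(int code, const std::string& contentType,
                          const std::string& body) {
    static const std::map<int, std::string> reasons = buildReasons();
    std::map<int, std::string>::const_iterator it = reasons.find(code);
    std::string reason = (it != reasons.end()) ? it->second : std::string("Unknown");

    std::ostringstream out;
    out << "HTTP/1.1 " << code << ' ' << reason << "\r\n";
    out << "Content-Type: " << contentType << "\r\n";
    out << "Content-Length: " << body.size() << "\r\n";
    out << "\r\n";
    out << body;
    return out.str();
}
```

The returned string can be sent with `write(client_fd, text.c_str(), text.size())`, taking care to handle partial writes on a non-blocking socket.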
