From 61ca86b05493f82bcbddd76911fee64dc636c885 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Lo=C3=AFc=20Hoguin?= Date: Thu, 27 Jun 2013 00:02:12 +0200 Subject: Greatly improve the guide introduction --- guide/erlang_beginners.md | 43 +++++++++ guide/erlang_web.md | 181 +++++++++++++++++++++++++++++++++++++ guide/getting_started.md | 80 +++++++++++++++++ guide/introduction.md | 93 ++----------------- guide/modern_web.md | 224 ++++++++++++++++++++++++++++++++++++++++++++++ guide/toc.md | 30 ++++++- 6 files changed, 563 insertions(+), 88 deletions(-) create mode 100644 guide/erlang_beginners.md create mode 100644 guide/erlang_web.md create mode 100644 guide/getting_started.md create mode 100644 guide/modern_web.md (limited to 'guide') diff --git a/guide/erlang_beginners.md b/guide/erlang_beginners.md new file mode 100644 index 0000000..7778dee --- /dev/null +++ b/guide/erlang_beginners.md @@ -0,0 +1,43 @@ +Erlang for beginners +==================== + +Chances are you are interested in using Cowboy, but have +no idea how to write an Erlang program. Fear not! This +chapter will help you get started. + +We recommend two books for beginners. You should read them +both at some point, as they cover Erlang from two entirely +different perspectives. + +Learn You Some Erlang for Great Good! +------------------------------------- + +The quickest way to get started with Erlang is by reading +a book with the funny name of [LYSE](http://learnyousomeerlang.com), +as we affectionately call it. + +It will get right into the syntax and quickly answer the questions +a beginner would ask themselves, all the while showing funny +pictures and making insightful jokes. + +You can read an early version of the book online for free, +but you really should buy the much more refined paper and +ebook versions. + +Programming Erlang +------------------ + +After writing some code, you will probably want to understand +the very concepts that make Erlang what it is today. 
These +are best explained by Joe Armstrong, the godfather of Erlang, +in his book [Programming Erlang](http://pragprog.com/book/jaerlang2/programming-erlang). + +Instead of going into every single detail of the language, +Joe focuses on the central concepts behind Erlang, and shows +you how they can be used to write a variety of different +applications. + +At the time of writing, the 2nd edition of the book is in beta, +and includes a few details about upcoming Erlang features that +cannot be used today. Choose the edition you want, then get +reading! diff --git a/guide/erlang_web.md b/guide/erlang_web.md new file mode 100644 index 0000000..d665ffe --- /dev/null +++ b/guide/erlang_web.md @@ -0,0 +1,181 @@ +Erlang and the Web +================== + +The Web is concurrent +--------------------- + +When you access a website there is little concurrency +involved. A few connections are opened and requests +are sent through these connections. Then the web page +is displayed on your screen. Your browser will only +open up to 4 or 8 connections to the server, depending +on your settings. This isn't much. + +But think about it. You are not the only one accessing +the server at the same time. There can be hundreds, if +not thousands, if not millions of connections to the +same server at the same time. + +Even today a lot of systems used in production haven't +solved the C10K problem (ten thousand concurrent connections). +And the ones that did are trying hard to get to the next +step, C100K, and are pretty far from it. + +Erlang meanwhile has no problem handling millions of +connections. At the time of writing there are application +servers written in Erlang that can handle more than two +million connections on a single server in a real production +application, with spare memory and CPU! + +The Web is concurrent, and Erlang is a language designed +for concurrency, so it is a perfect match. + +Of course, various platforms need to scale beyond a few +million connections. 
This is where Erlang's built-in +distribution mechanisms come in. If one server isn't +enough, add more! Erlang allows you to use the same code +for talking to local processes or to processes in other +parts of your cluster, which means you can scale very +quickly if the need arises. + +The Web has large userbases, and the Erlang platform was +designed to work in a distributed setting, so it is a +perfect match. + +Or is it? Surely you can find solutions to handle that many +concurrent connections with my favorite language... But all +these solutions will break down in the next few years. Why? +Firstly because servers don't get any more powerful, they +instead get a lot more cores and memory. This is only useful +if your application can use them properly, and Erlang is +light-years away from anything else in that area. Secondly, +today your computer and your phone are online, tomorrow your +watch, goggles, bike, car, fridge and tons of other devices +will also connect to various applications on the Internet. + +Only Erlang is prepared to deal with what's coming. + +The Web is soft real time +------------------------- + +What does soft real time mean, you ask? It means we want the +operations done as quickly as possible, and in the case of +web applications, it means we want the data propagated fast. + +In comparison, hard real time has a similar meaning, but also +has a hard time constraint, for example an operation needs to +be done in under N milliseconds otherwise the system fails +entirely. + +Users aren't that needy yet, they just want to get access +to their content in a reasonable delay, and they want the +actions they make to register at most a few seconds after +they submitted them, otherwise they'll start worrying about +whether it successfully went through. + +The Web is soft real time because taking longer to perform an +operation would be seen as bad quality of service. + +Erlang is a soft real time system. 
It will always run +processes fairly, a little at a time, switching to another +process after a while and preventing a single process from +stealing resources from all others. This means that Erlang +can guarantee stable low latency of operations. + +Erlang provides the guarantees that the soft real time Web +requires. + +The Web is asynchronous +----------------------- + +Long ago, the Web was synchronous because HTTP was synchronous. +You fired a request, and then waited for a response. Not anymore. +It all started when XmlHttpRequest started being used. It allowed +the client to perform asynchronous calls to the server. + +Then Websocket appeared and allowed both the server and the client +to send data to the other endpoint completely asynchronously. The +data is contained within frames and no response is necessary. + +Erlang processes work the same. They send each other data contained +within messages and then continue running without needing a response. +They tend to spend most of their time inactive, waiting for a new +message, and the Erlang VM happily activates them when one is received. + +It is therefore quite easy to imagine Erlang being good at receiving +Websocket frames, which may come in at unpredictable times, passing the +data to the responsible processes which are always ready waiting for +new messages, and performing the operations required by only activating +the required parts of the system. + +The more recent Web technologies, like Websocket of course, but also +SPDY and HTTP/2.0, are all fully asynchronous protocols. The concept +of requests and responses is retained of course, but anything could +be sent in between, by either the client or the server, and the +responses could also be received in a completely different order. + +Erlang is by nature asynchronous and really good at it thanks to the +great engineering that has been done in the VM over the years. It's +only natural that it's so good at dealing with the asynchronous Web. 
+ +The Web is omnipresent +---------------------- + +The Web has taken a very important place in our lives. We're +connected at all times, when we're on our phone, using our computer, +passing time using a tablet while in the bathroom... And this +isn't going to slow down, every single device at home or on us +will be connected. + +All these devices are always connected. And with the number of +alternatives to give you access to the content you seek, users +tend to not stick around when problems arise. Users today want +their applications to be always available and if one is having +too many issues they just move on. + +Despite this, when developers choose a product to use for building +web applications, their only concern seems to be "Is it fast?", +and they look around for synthetic benchmarks showing which one +is the fastest at sending "Hello world" with only a handful of +concurrent connections. Web benchmarks haven't been representative +of reality for a long time, and are drifting further away as +time goes on. + +What developers should really ask themselves is "Can I service +all my users with no interruption?" and they'd find that they have +two choices. They can either hope for the best, or they can use +Erlang. + +Erlang is built for fault tolerance. When writing code in any other +language, you have to check all the return values and act accordingly +to avoid any unforeseen issues. If you're lucky, you won't miss +anything important. When writing Erlang code, you can just check +the success condition and ignore all errors. If an error happens, +the Erlang process crashes and is then restarted by a special +process called a supervisor. + +The Erlang developer thus has no need to fear unhandled +errors, and can focus on handling only the errors that should +give some feedback to the user and let the system take care of +the rest. This also has the advantage of allowing him to write +a lot less code, and letting him sleep at night. 
+ +Erlang's fault tolerance oriented design is the first piece of +what makes it the best choice for the omnipresent, always available +Web. + +The second piece is Erlang's built-in distribution. Distribution +is a key part of building a fault tolerant system, because it +allows you to handle bigger failures, like a whole server going +down, or even a data center entirely. + +Fault tolerance and distribution are important today, and will be +vital in the future of the Web. Erlang is ready. + +Erlang is the ideal platform for the Web +---------------------------------------- + +Erlang provides all the important features that the Web requires +or will require in the near future. Erlang is a perfect match +for the Web, and it only makes sense to use it to build web +applications. diff --git a/guide/getting_started.md b/guide/getting_started.md new file mode 100644 index 0000000..812b1e0 --- /dev/null +++ b/guide/getting_started.md @@ -0,0 +1,80 @@ +Getting started +=============== + +Cowboy does nothing by default. + +Cowboy requires the `crypto` and `ranch` applications to be started. + +``` erlang +ok = application:start(crypto). +ok = application:start(ranch). +ok = application:start(cowboy). +``` + +Cowboy uses Ranch for handling the connections and provides convenience +functions to start Ranch listeners. + +The `cowboy:start_http/4` function starts a listener for HTTP connections +using the TCP transport. The `cowboy:start_https/4` function starts a +listener for HTTPS connections using the SSL transport. + +Listeners are a group of processes that are used to accept and manage +connections. The processes used specifically for accepting connections +are called acceptors. The number of acceptor processes is unrelated to +the maximum number of connections Cowboy can handle. Please refer to +the [Ranch guide](http://ninenines.eu/docs/en/ranch/HEAD/guide/toc) +for in-depth information. + +Listeners are named. 
They spawn a given number of acceptors, listen for +connections using the given transport options and pass along the protocol +options to the connection processes. The protocol options must include +the dispatch list for routing requests to handlers. + +The dispatch list is explained in greater detail in the +[Routing](routing.md) chapter. + +``` erlang +Dispatch = cowboy_router:compile([ + %% {URIHost, list({URIPath, Handler, Opts})} + {'_', [{'_', my_handler, []}]} +]), +%% Name, NbAcceptors, TransOpts, ProtoOpts +cowboy:start_http(my_http_listener, 100, + [{port, 8080}], + [{env, [{dispatch, Dispatch}]}] +). +``` + +Cowboy features many kinds of handlers. For this simple example, +we will just use the plain HTTP handler, which has three callback +functions: init/3, handle/2 and terminate/3. You can find more information +about the arguments and possible return values of these callbacks in the +[cowboy_http_handler function reference](http://ninenines.eu/docs/en/cowboy/HEAD/manual/cowboy_http_handler). +Following is an example of a simple HTTP handler module. + +``` erlang +-module(my_handler). +-behaviour(cowboy_http_handler). + +-export([init/3]). +-export([handle/2]). +-export([terminate/3]). + +init({tcp, http}, Req, _Opts) -> + {ok, Req, undefined_state}. + +handle(Req, State) -> + {ok, Req2} = cowboy_req:reply(200, [], <<"Hello World!">>, Req), + {ok, Req2, State}. + +terminate(_Reason, _Req, _State) -> + ok. +``` + +The `Req` variable above is the Req object, which allows the developer +to obtain information about the request and to perform a reply. Its usage +is explained in the [cowboy_req function reference](http://ninenines.eu/docs/en/cowboy/HEAD/manual/cowboy_req). + +You can find many examples in the `examples/` directory of the +Cowboy repository. A more complete "Hello world" example can be +found in the `examples/hello_world/` directory. 
diff --git a/guide/introduction.md b/guide/introduction.md index fb338ac..8c936a5 100644 --- a/guide/introduction.md +++ b/guide/introduction.md @@ -21,16 +21,12 @@ features both a Function Reference and a User Guide. Prerequisites ------------- -It is assumed the developer already knows Erlang and has basic knowledge -about the HTTP protocol. +No Erlang knowledge is required for reading this guide. The reader will +be introduced to Erlang concepts and redirected to reference material +whenever necessary. -In order to run the examples available in this user guide, you will need -Erlang and rebar installed and in your $PATH. - -Please see the [rebar repository](https://github.com/basho/rebar) for -downloading and building instructions. Please look up the environment -variables documentation of your system for details on how to update the -$PATH information. +Knowledge of the HTTP protocol is recommended but not required, as it +will be detailed throughout the guide. Supported platforms ------------------- @@ -57,81 +53,4 @@ Header names are case insensitive. Cowboy converts all the request header names to lowercase, and expects your application to provide lowercase header names in the response. -Getting started ---------------- - -Cowboy does nothing by default. - -Cowboy requires the `crypto` and `ranch` applications to be started. - -``` erlang -ok = application:start(crypto). -ok = application:start(ranch). -ok = application:start(cowboy). -``` - -Cowboy uses Ranch for handling the connections and provides convenience -functions to start Ranch listeners. - -The `cowboy:start_http/4` function starts a listener for HTTP connections -using the TCP transport. The `cowboy:start_https/4` function starts a -listener for HTTPS connections using the SSL transport. - -Listeners are a group of processes that are used to accept and manage -connections. The processes used specifically for accepting connections -are called acceptors. 
The number of acceptor processes is unrelated to -the maximum number of connections Cowboy can handle. Please refer to -the Ranch guide for in-depth information. - -Listeners are named. They spawn a given number of acceptors, listen for -connections using the given transport options and pass along the protocol -options to the connection processes. The protocol options must include -the dispatch list for routing requests to handlers. - -The dispatch list is explained in greater details in the Routing section -of the guide. - -``` erlang -Dispatch = cowboy_router:compile([ - %% {URIHost, list({URIPath, Handler, Opts})} - {'_', [{'_', my_handler, []}]} -]), -%% Name, NbAcceptors, TransOpts, ProtoOpts -cowboy:start_http(my_http_listener, 100, - [{port, 8080}], - [{env, [{dispatch, Dispatch}]}] -). -``` - -Cowboy features many kinds of handlers. It has plain HTTP handlers, loop -handlers, Websocket handlers, REST handlers and static handlers. Their -usage is documented in the respective sections of the guide. - -Most applications use the plain HTTP handler, which has three callback -functions: init/3, handle/2 and terminate/3. You can find more information -about the arguments and possible return values of these callbacks in the -HTTP handlers section of this guide. Following is an example of a simple -HTTP handler module. - -``` erlang --module(my_handler). --behaviour(cowboy_http_handler). - --export([init/3]). --export([handle/2]). --export([terminate/3]). - -init({tcp, http}, Req, Opts) -> - {ok, Req, undefined_state}. - -handle(Req, State) -> - {ok, Req2} = cowboy_req:reply(200, [], <<"Hello World!">>, Req), - {ok, Req2, State}. - -terminate(Reason, Req, State) -> - ok. -``` - -The `Req` variable above is the Req object, which allows the developer -to obtain information about the request and to perform a reply. Its usage -is explained in its respective section of the guide. +The same applies to any other case insensitive value. 
diff --git a/guide/modern_web.md b/guide/modern_web.md new file mode 100644 index 0000000..6c668b3 --- /dev/null +++ b/guide/modern_web.md @@ -0,0 +1,224 @@ +The modern Web +============== + +Let's take a look at various technologies from the beginnings +of the Web up to this day, and get a preview of what's +coming next. + +Cowboy is compatible with all the technology cited in this +chapter except of course HTTP/2.0 which has no implementation +in the wild at the time of writing. + +The prehistoric Web +------------------- + +HTTP was initially created to serve HTML pages and only +had the GET method for retrieving them. This initial +version is documented and is sometimes called HTTP/0.9. +HTTP/1.0 defined the GET, HEAD and POST methods, and +was able to send data with POST requests. + +HTTP/1.0 works in a very simple way. A TCP connection +is first established to the server. Then a request is +sent. Then the server sends a response back and closes +the connection. + +Suffice to say, HTTP/1.0 is not very efficient. Opening +a TCP connection takes some time, and pages containing +many assets load much slower than they could because of +this. + +Most improvements done in recent years focused on reducing +this load time and reducing the latency of the requests. + +HTTP/1.1 +-------- + +HTTP/1.1 quickly followed and added a keep-alive mechanism +to allow using the same connection for many requests, as +well as streaming capabilities, allowing an endpoint to send +a body in well defined chunks. + +HTTP/1.1 defines the OPTIONS, GET, HEAD, POST, PUT, DELETE, +TRACE and CONNECT methods. The PATCH method was added in more +recent years. It also improves the caching capabilities with +the introduction of many headers. + +HTTP/1.1 still works like HTTP/1.0 does, except the connection +can be kept alive for subsequent requests. 
This however allows +clients to perform what is called pipelining: sending many +requests in a row, and then processing the responses which will +be received in the same order as the requests. + +REST +---- + +The design of HTTP/1.1 was influenced by the REST architectural +style. REST, or REpresentational State Transfer, is a style of +architecture for loosely connected distributed systems. + +REST defines constraints that systems must obey in order to +be RESTful. A system which doesn't follow all the constraints +cannot be considered RESTful. + +REST is a client-server architecture with a clean separation +of concerns between the client and the server. They communicate +by referencing resources. Resources can be identified, but +also manipulated. A resource representation has a media type +and information about whether it can be cached and how. Hypermedia +determines how resources are related and how they can be used. +REST is also stateless. All requests contain the complete +information necessary to perform the action. + +HTTP/1.1 defines all the methods, headers and semantics required +to implement RESTful systems. + +REST is most often used when designing web application APIs +which are generally meant to be used by executable code directly. + +XmlHttpRequest +-------------- + +Also known as AJAX, this technology allows Javascript code running +on a web page to perform asynchronous requests to the server. +This is what started the move from static websites to dynamic +web applications. + +XmlHttpRequest still performs HTTP requests under the hood, +and then waits for a response, but the Javascript code can +continue to run until the response arrives. It will then receive +the response through a callback previously defined. + +These are of course still requests initiated by the client; +the server still had no way of pushing data to the client +on its own, so new technology appeared to allow that. 
+ +Long-polling +------------ + +Polling was a technique used to overcome the fact that the server +cannot push data directly to the client. Therefore the client had +to repeatedly create a connection, make a request, get a response, +then try again a few seconds later. This is overly expensive and +adds an additional delay before the client receives the data. + +Polling was necessary to implement message queues and other +similar mechanisms, where a user must be informed of something +when it happens, rather than when he refreshes the page next. +A typical example would be a chat application. + +Long-polling was created to reduce the server load by creating +fewer connections, but also to improve latency by getting the +response back to the client as soon as it becomes available +on the server. + +Long-polling works in a similar manner to polling, except the +request will not get a response immediately. Instead the server +leaves it open until it has a response to send. After getting +the response, the client creates a new request and gets back +to waiting. + +You probably guessed by now that long-polling is a hack, and +like most hacks it can suffer from unforeseen issues; in this +case it doesn't always play well with proxies. + +HTML5 +----- + +HTML5 is, of course, the HTML version after HTML4. But HTML5 +emerged to solve a specific problem: dynamic web applications. + +HTML was initially created to write web pages which compose +a website. But soon people and companies wanted to use HTML +to write more and more complex websites, eventually known as +web applications. They are for example your news reader, your +email client in the browser, or your video streaming website. + +Because HTML wasn't enough, they started using proprietary +solutions, often implemented using plug-ins. This wasn't +perfect of course, but worked well enough for most people. + +However, the need for a standard solution eventually became +apparent. 
The browser needed to be able to play media natively. +It needed to be able to draw anything. It needed an efficient +way of streaming events to the server, but also receiving +events from the server. + +The solution went on to become HTML5. At the time of writing +it is being standardized. + +EventSource +----------- + +EventSource, sometimes also called Server-Sent Events, is a +technology allowing servers to push data to HTML5 applications. + +EventSource is a one-way communication channel from the server +to the client. The client has no means to talk to the server +other than by using HTTP requests. + +It consists of a Javascript object used to set up an +EventSource connection to the server, and a very small protocol +for sending events to the client on top of the HTTP/1.1 +connection. + +EventSource is a lightweight solution that only works for +UTF-8 encoded text data. Binary data and text data encoded +differently are not allowed by the protocol. A heavier but +more generic approach can be found in Websocket. + +Websocket +--------- + +Websocket is a protocol built on top of HTTP/1.1 that provides +a two-way communication channel between the client and the +server. Communication is asynchronous and can occur concurrently. + +It consists of a Javascript object used to set up a +Websocket connection to the server, and a binary-based +protocol for sending data to the server or the client. + +Websocket connections can transfer either UTF-8 encoded text +data or binary data. The protocol also includes support for +implementing a ping/pong mechanism, allowing the server and +the client to have more confidence that the connection is still +alive. + +A Websocket connection can be used to transfer any kind of data, +small or big, text or binary. Because of this Websocket is +sometimes used for communication between systems. 
+ +SPDY +---- + +SPDY is an attempt to reduce page loading time by opening a +single connection per server, keeping it open for subsequent +requests, and also by compressing the HTTP headers to reduce +the size of requests. + +SPDY is compatible with HTTP/1.1 semantics, and is actually +just a different way of performing HTTP requests and responses, +by using binary frames instead of a text-based protocol. +SPDY also allows the server to send responses without needing +a request to exist, essentially enabling server push. + +SPDY is an experiment that has proven successful and is used +as the basis for the HTTP/2.0 standard. + +Browsers make use of TLS Next Protocol Negotiation to upgrade +to a SPDY connection seamlessly if the server supports it. + +The protocol itself has a few shortcomings which are being +fixed in HTTP/2.0. + +HTTP/2.0 +-------- + +HTTP/2.0 is the long-awaited update to the HTTP/1.1 protocol. +It is based on SPDY although a lot has been improved at the +time of writing. + +HTTP/2.0 is an asynchronous two-way communication channel +between two endpoints. + +It is planned to be ready in late 2014. diff --git a/guide/toc.md b/guide/toc.md index f8eeb18..f30a5bd 100644 --- a/guide/toc.md +++ b/guide/toc.md @@ -1,11 +1,39 @@ Cowboy User Guide ================= +The Cowboy User Guide explores the modern Web and how to make +best use of Cowboy for writing powerful web applications. 
+ +Introducing Cowboy +------------------ + * [Introduction](introduction.md) * Purpose * Prerequisites + * Supported platforms * Conventions - * Getting started + * [The modern Web](modern_web.md) + * The prehistoric Web + * HTTP/1.1 + * REST + * Long-polling + * HTML5 + * EventSource + * Websocket + * SPDY + * HTTP/2.0 + * [Erlang and the Web](erlang_web.md) + * The Web is concurrent + * The Web is soft real time + * The Web is asynchronous + * The Web is omnipresent + * Erlang is the ideal platform for the Web + * [Erlang for beginners](erlang_beginners.md) + * [Getting started](getting_started.md) + +Using Cowboy +------------ + * [Routing](routing.md) * Purpose * Structure -- cgit v1.2.3