diff options
Diffstat (limited to 'docs/en/ranch/1.4/guide/parsers/index.html')
-rw-r--r-- | docs/en/ranch/1.4/guide/parsers/index.html | 266 |
1 files changed, 266 insertions, 0 deletions
diff --git a/docs/en/ranch/1.4/guide/parsers/index.html b/docs/en/ranch/1.4/guide/parsers/index.html new file mode 100644 index 00000000..c663bad3 --- /dev/null +++ b/docs/en/ranch/1.4/guide/parsers/index.html @@ -0,0 +1,266 @@ +<!DOCTYPE html> +<html lang="en"> + +<head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1.0"> + <meta name="description" content=""> + <meta name="author" content="Loïc Hoguin based on a design from (Soft10) Pol Cámara"> + + <meta name="generator" content="Hugo 0.26" /> + + <title>Nine Nines: Writing parsers</title> + + <link href='https://fonts.googleapis.com/css?family=Open+Sans:400,700,400italic' rel='stylesheet' type='text/css'> + <link href="/css/99s.css?r=1" rel="stylesheet"> + + <link rel="shortcut icon" href="/img/ico/favicon.ico"> + <link rel="apple-touch-icon-precomposed" sizes="114x114" href="/img/ico/apple-touch-icon-114.png"> + <link rel="apple-touch-icon-precomposed" sizes="72x72" href="/img/ico/apple-touch-icon-72.png"> + <link rel="apple-touch-icon-precomposed" href="/img/ico/apple-touch-icon-57.png"> + + +</head> + + +<body class=""> + <header id="page-head"> + <div id="topbar" class="container"> + <div class="row"> + <div class="span2"> + <h1 id="logo"><a href="/" title="99s">99s</a></h1> + </div> + <div class="span10"> + + <div id="side-header"> + <nav> + <ul> + <li><a title="Hear my thoughts" href="/articles">Articles</a></li> + <li><a title="Watch my talks" href="/talks">Talks</a></li> + <li class="active"><a title="Read the docs" href="/docs">Documentation</a></li> + <li><a title="Request my services" href="/services">Consulting & Training</a></li> + </ul> + </nav> + <ul id="social"> + <li> + <a href="https://github.com/ninenines" title="Check my Github repositories"><img src="/img/ico_github.png" data-hover="/img/ico_github_alt.png" alt="Github"></a> + </li> + <li> + <a title="Keep in touch!" href="http://twitter.com/lhoguin"><img src="/img/ico_microblog.png" data-hover="/img/ico_microblog_alt.png"></a> + </li> + <li> + <a title="Contact me" href="mailto:[email protected]"><img src="/img/ico_mail.png" data-hover="/img/ico_mail_alt.png"></a> + </li> + </ul> + </div> + </div> + </div> + </div> + + +</header> + +<div id="contents" class="two_col"> +<div class="container"> +<div class="row"> +<div id="docs" class="span9 maincol"> + +<h1 class="lined-header"><span>Writing parsers</span></h1> + +<div class="paragraph"><p>There are three kinds of protocols:</p></div> +<div class="ulist"><ul> +<li> +<p> +Text protocols +</p> +</li> +<li> +<p> +Schema-less binary protocols +</p> +</li> +<li> +<p> +Schema-based binary protocols +</p> +</li> +</ul></div> +<div class="paragraph"><p>This chapter introduces the first two kinds. It will not cover +more advanced topics such as continuations or parser generators.</p></div> +<div class="paragraph"><p>This chapter isn’t specifically about Ranch, we assume here that +you know how to read data from the socket. The data you read and +the data that hasn’t been parsed is saved in a buffer. Every +time you read from the socket, the data read is appended to the +buffer. What happens next depends on the kind of protocol. We +will only cover the first two.</p></div> +<div class="sect1"> +<h2 id="_parsing_text">Parsing text</h2> +<div class="sectionbody"> +<div class="paragraph"><p>Text protocols are generally line based. This means that we can’t +do anything with them until we receive the full line.</p></div> +<div class="paragraph"><p>A simple way to get a full line is to use <code>binary:split/{2,3}</code>.</p></div> +<div class="listingblock"> +<div class="title">Using binary:split/2 to get a line of input</div> +<div class="content"><!-- Generator: GNU source-highlight 3.1.8 +by Lorenzo Bettini +http://www.lorenzobettini.it +http://www.gnu.org/software/src-highlite --> +<pre><tt><span style="font-weight: bold"><span style="color: #0000FF">case</span></span> <span style="font-weight: bold"><span style="color: #000000">binary:split</span></span>(<span style="color: #009900">Buffer</span>, <span style="color: #990000"><<</span><span style="color: #FF0000">"\n"</span><span style="color: #990000">>></span>) <span style="font-weight: bold"><span style="color: #0000FF">of</span></span> + [<span style="color: #990000">_</span>] <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">get_more_data</span></span>(<span style="color: #009900">Buffer</span>); + [<span style="color: #009900">Line</span>, <span style="color: #009900">Rest</span>] <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">handle_line</span></span>(<span style="color: #009900">Line</span>, <span style="color: #009900">Rest</span>) +<span style="font-weight: bold"><span style="color: #0000FF">end</span></span><span style="color: #990000">.</span></tt></pre></div></div> +<div class="paragraph"><p>In the above example, we can have two results. Either there was +a line break in the buffer and we get it split into two parts, +the line and the rest of the buffer; or there was no line break +in the buffer and we need to get more data from the socket.</p></div> +<div class="paragraph"><p>Next, we need to parse the line. The simplest way is to again +split, here on space. The difference is that we want to split +on all spaces character, as we want to tokenize the whole string.</p></div> +<div class="listingblock"> +<div class="title">Using binary:split/3 to split text</div> +<div class="content"><!-- Generator: GNU source-highlight 3.1.8 +by Lorenzo Bettini +http://www.lorenzobettini.it +http://www.gnu.org/software/src-highlite --> +<pre><tt><span style="font-weight: bold"><span style="color: #0000FF">case</span></span> <span style="font-weight: bold"><span style="color: #000000">binary:split</span></span>(<span style="color: #009900">Line</span>, <span style="color: #990000"><<</span><span style="color: #FF0000">" "</span><span style="color: #990000">>></span>, [<span style="color: #FF6600">global</span>]) <span style="font-weight: bold"><span style="color: #0000FF">of</span></span> + [<span style="color: #990000"><<</span><span style="color: #FF0000">"HELLO"</span><span style="color: #990000">>></span>] <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">be_polite</span></span>(); + [<span style="color: #990000"><<</span><span style="color: #FF0000">"AUTH"</span><span style="color: #990000">>></span>, <span style="color: #009900">User</span>, <span style="color: #009900">Password</span>] <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">authenticate_user</span></span>(<span style="color: #009900">User</span>, <span style="color: #009900">Password</span>); + [<span style="color: #990000"><<</span><span style="color: #FF0000">"QUIT"</span><span style="color: #990000">>></span>, <span style="color: #009900">Reason</span>] <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">quit</span></span>(<span style="color: #009900">Reason</span>) + <span style="font-style: italic"><span style="color: #9A1900">%% ...</span></span> +<span style="font-weight: bold"><span style="color: #0000FF">end</span></span><span style="color: #990000">.</span></tt></pre></div></div> +<div class="paragraph"><p>Pretty simple, right? Match on the command name, get the rest +of the tokens in variables and call the respective functions.</p></div> +<div class="paragraph"><p>After doing this, you will want to check if there is another +line in the buffer, and handle it immediately if any. +Otherwise wait for more data.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_parsing_binary">Parsing binary</h2> +<div class="sectionbody"> +<div class="paragraph"><p>Binary protocols can be more varied, although most of them are +pretty similar. The first four bytes of a frame tend to be +the size of the frame, which is followed by a certain number +of bytes for the type of frame and then various parameters.</p></div> +<div class="paragraph"><p>Sometimes the size of the frame includes the first four bytes, +sometimes not. Other times this size is encoded over two bytes. +And even other times little-endian is used instead of big-endian.</p></div> +<div class="paragraph"><p>The general idea stays the same though.</p></div> +<div class="listingblock"> +<div class="title">Using binary pattern matching to split frames</div> +<div class="content"><!-- Generator: GNU source-highlight 3.1.8 +by Lorenzo Bettini +http://www.lorenzobettini.it +http://www.gnu.org/software/src-highlite --> +<pre><tt><span style="color: #990000"><<</span> <span style="color: #009900">Size</span><span style="color: #990000">:</span><span style="color: #993399">32</span>, <span style="color: #990000">_/</span><span style="color: #FF6600">bits</span> <span style="color: #990000">>></span> <span style="color: #990000">=</span> <span style="color: #009900">Buffer</span>, +<span style="font-weight: bold"><span style="color: #0000FF">case</span></span> <span style="color: #009900">Buffer</span> <span style="font-weight: bold"><span style="color: #0000FF">of</span></span> + <span style="color: #990000"><<</span> <span style="color: #009900">Frame</span><span style="color: #990000">:</span><span style="color: #009900">Size</span><span style="color: #990000">/</span><span style="font-weight: bold"><span style="color: #000080">binary</span></span>, <span style="color: #009900">Rest</span><span style="color: #990000">/</span><span style="color: #FF6600">bits</span> <span style="color: #990000">>></span> <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">handle_frame</span></span>(<span style="color: #009900">Frame</span>, <span style="color: #009900">Rest</span>); + <span style="color: #990000">_</span> <span style="color: #990000">-></span> + <span style="font-weight: bold"><span style="color: #000000">get_more_data</span></span>(<span style="color: #009900">Buffer</span>) +<span style="font-weight: bold"><span style="color: #0000FF">end</span></span><span style="color: #990000">.</span></tt></pre></div></div> +<div class="paragraph"><p>You will then need to parse this frame using binary pattern +matching, and handle it. Then you will want to check if there +is another frame fully received in the buffer, and handle it +immediately if any. Otherwise wait for more data.</p></div> +</div> +</div> + + + + + + + + + + + <nav style="margin:1em 0"> + + <a style="float:left" href="https://ninenines.eu/docs/en/ranch/1.4/guide/embedded/"> + Embedded mode + </a> + + + + <a style="float:right" href="https://ninenines.eu/docs/en/ranch/1.4/guide/ssl_auth/"> + SSL client authentication + </a> + + </nav> + + + + +</div> + +<div class="span3 sidecol"> + + +<h3> + Ranch + 1.4 + + User Guide +</h3> + +<ul> + + <li><a href="/docs/en/ranch/1.4/guide">User Guide</a></li> + + + <li><a href="/docs/en/ranch/1.4/manual">Function Reference</a></li> + + +</ul> + +<h4 id="docs-nav">Navigation</h4> + +<h4>Version select</h4> +<ul> + + + + <li><a href="/docs/en/ranch/1.4/guide">1.4</a></li> + + <li><a href="/docs/en/ranch/1.3/guide">1.3</a></li> + + <li><a href="/docs/en/ranch/1.2/guide">1.2</a></li> + +</ul> + +</div> +</div> +</div> +</div> + + <footer> + <div class="container"> + <div class="row"> + <div class="span6"> + <p id="scroll-top"><a href="#">↑ Scroll to top</a></p> + <nav> + <ul> + <li><a href="mailto:[email protected]" title="Contact us">Contact us</a></li><li><a href="https://github.com/ninenines/ninenines.github.io" title="Github repository">Contribute to this site</a></li> + </ul> + </nav> + </div> + <div class="span6 credits"> + <p><img src="/img/footer_logo.png"></p> + <p>Copyright © Loïc Hoguin 2012-2016</p> + </div> + </div> + </div> + </footer> + + + <script src="/js/custom.js"></script> + </body> +</html> + + |