diff options
Diffstat (limited to 'articles/erlang-scalability/index.html')
-rw-r--r-- | articles/erlang-scalability/index.html | 372 |
1 files changed, 212 insertions, 160 deletions
diff --git a/articles/erlang-scalability/index.html b/articles/erlang-scalability/index.html index 213881da..a9e0be42 100644 --- a/articles/erlang-scalability/index.html +++ b/articles/erlang-scalability/index.html @@ -7,7 +7,7 @@ <meta name="description" content=""> <meta name="author" content="Loïc Hoguin based on a design from (Soft10) Pol Cámara"> - <meta name="generator" content="Hugo 0.17" /> + <meta name="generator" content="Hugo 0.26" /> <title>Nine Nines: Erlang Scalability</title> @@ -74,140 +74,140 @@ </p> </header> -<div class="paragraph"><p>I would like to share some experience and theories on
-Erlang scalability.</p></div>
-<div class="paragraph"><p>This will be in the form of a series of hints, which
-may or may not be accompanied with explanations as to why
-things are this way, or how they improve or reduce the scalability
-of a system. I will try to do my best to avoid giving falsehoods,
-even if that means a few things won’t be explained.</p></div>
-<div class="sect1">
-<h2 id="_nifs">NIFs</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>NIFs are considered harmful. I don’t know any single NIF-based
-library that I would recommend. That doesn’t mean they should
-all be avoided, just that if you’re going to want your system to
-scale, you probably shouldn’t use a NIF.</p></div>
-<div class="paragraph"><p>A common case for using NIFs is JSON processing. The problem
-is that JSON is a highly inefficient data structure (similar
-in inefficiency to XML, although perhaps not as bad). If you can
-avoid using JSON, you probably should. MessagePack can replace
-it in many situations.</p></div>
-<div class="paragraph"><p>Long-running NIFs will take over a scheduler and prevent Erlang
-from efficiently handling many processes.</p></div>
-<div class="paragraph"><p>Short-running NIFs will still confuse the scheduler if they
-take more than a few microseconds to run.</p></div>
-<div class="paragraph"><p>Threaded NIFs, or the use of the <code>enif_consume_timeslice</code>
-might help allievate this problem, but they’re not a silver bullet.</p></div>
-<div class="paragraph"><p>And as you already know, a crashing NIF will take down your VM,
-destroying any claims you may have at being scalable.</p></div>
-<div class="paragraph"><p>Never use a NIF because "C is fast". This is only true in
-single-threaded programs.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_bifs">BIFs</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>BIFs can also be harmful. While they are generally better than
-NIFs, they are not perfect and some of them might have harmful
-effects on the scheduler.</p></div>
-<div class="paragraph"><p>A great example of this is the <code>erlang:decode_packet/3</code>
-BIF, when used for HTTP request or response decoding. Avoiding
-its use in <em>Cowboy</em> allowed us to see a big increase in
-the number of requests production systems were able to handle,
-up to two times the original amount. Incidentally this is something
-that is impossible to detect using synthetic benchmarks.</p></div>
-<div class="paragraph"><p>BIFs that return immediately are perfectly fine though.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_binary_strings">Binary strings</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>Binary strings use less memory, which means you spend less time
-allocating memory compared to list-based strings. They are also
-more natural for strings manipulation because they are optimized
-for appending (as opposed to prepending for lists).</p></div>
-<div class="paragraph"><p>If you can process a binary string using a single match context,
-then the code will run incredibly fast. The effects will be much
-increased if the code was compiled using HiPE, even if your Erlang
-system isn’t compiled natively.</p></div>
-<div class="paragraph"><p>Avoid using <code>binary:split</code> or <code>binary:replace</code>
-if you can avoid it. They have a certain overhead due to supporting
-many options that you probably don’t need for most operations.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_buffering_and_streaming">Buffering and streaming</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>Use binaries. They are great for appending, and it’s a direct copy
-from what you receive from a stream (usually a socket or a file).</p></div>
-<div class="paragraph"><p>Be careful to not indefinitely receive data, as you might end up
-having a single binary taking up huge amounts of memory.</p></div>
-<div class="paragraph"><p>If you stream from a socket and know how much data you expect,
-then fetch that data in a single <code>recv</code> call.</p></div>
-<div class="paragraph"><p>Similarly, if you can use a single <code>send</code> call, then
-you should do so, to avoid going back and forth unnecessarily between
-your Erlang process and the Erlang port for your socket.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_list_and_binary_comprehensions">List and binary comprehensions</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>Prefer list comprehensions over <code>lists:map/2</code>. The
-compiler will be able to optimize your code greatly, for example
-not creating the result if you don’t need it. As time goes on,
-more optimizations will be added to the compiler and you will
-automatically benefit from them.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_gen_server_is_no_silver_bullet">gen_server is no silver bullet</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>It’s a bad idea to use <code>gen_server</code> for everything.
-For two reasons.</p></div>
-<div class="paragraph"><p>There is an overhead everytime the <code>gen_server</code> receives
-a call, a cast or a simple message. It can be a problem if your
-<code>gen_server</code> is in a critical code path where speed
-is all that matters. Do not hesitate to create other kinds of
-processes where it makes sense. And depending on the kind of process,
-you might want to consider making them special processes, which
-would essentially behave the same as a <code>gen_server</code>.</p></div>
-<div class="paragraph"><p>A common mistake is to have a unique <code>gen_server</code> to
-handle queries from many processes. This generally becomes the
-biggest bottleneck you’ll want to fix. You should try to avoid
-relying on a single process, using a pool if you can.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_supervisor_and_monitoring">Supervisor and monitoring</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>A <code>supervisor</code> is also a <code>gen_server</code>,
-so the previous points also apply to them.</p></div>
-<div class="paragraph"><p>Sometimes you’re in a situation where you have supervised
-processes but also want to monitor them in one or more other
-processes, effectively duplicating the work. The supervisor
-already knows when processes die, why not use this to our
-advantage?</p></div>
-<div class="paragraph"><p>You can create a custom supervisor process that will perform
-both the supervision and handle exit and other events, allowing
-to avoid the combination of supervising and monitoring which
-can prove harmful when many processes die at once, or when you
-have many short lived processes.</p></div>
-</div>
-</div>
-<div class="sect1">
-<h2 id="_ets_for_lolspeed_tm">ets for LOLSPEED(tm)</h2>
-<div class="sectionbody">
-<div class="paragraph"><p>If you have data queried or modified by many processes, then
-<code>ets</code> public or protected tables will give you the
-performance boost you require. Do not forget to set the
-<code>read_concurrency</code> or <code>write_concurrency</code>
-options though.</p></div>
-<div class="paragraph"><p>You might also be thrilled to know that Erlang R16B will feature
-a big performance improvement for accessing <code>ets</code> tables
-concurrently.</p></div>
-</div>
-</div>
+<div class="paragraph"><p>I would like to share some experience and theories on +Erlang scalability.</p></div> +<div class="paragraph"><p>This will be in the form of a series of hints, which +may or may not be accompanied with explanations as to why +things are this way, or how they improve or reduce the scalability +of a system. I will try to do my best to avoid giving falsehoods, +even if that means a few things won’t be explained.</p></div> +<div class="sect1"> +<h2 id="_nifs">NIFs</h2> +<div class="sectionbody"> +<div class="paragraph"><p>NIFs are considered harmful. I don’t know any single NIF-based +library that I would recommend. That doesn’t mean they should +all be avoided, just that if you’re going to want your system to +scale, you probably shouldn’t use a NIF.</p></div> +<div class="paragraph"><p>A common case for using NIFs is JSON processing. The problem +is that JSON is a highly inefficient data structure (similar +in inefficiency to XML, although perhaps not as bad). If you can +avoid using JSON, you probably should. MessagePack can replace +it in many situations.</p></div> +<div class="paragraph"><p>Long-running NIFs will take over a scheduler and prevent Erlang +from efficiently handling many processes.</p></div> +<div class="paragraph"><p>Short-running NIFs will still confuse the scheduler if they +take more than a few microseconds to run.</p></div> +<div class="paragraph"><p>Threaded NIFs, or the use of the <code>enif_consume_timeslice</code> +might help allievate this problem, but they’re not a silver bullet.</p></div> +<div class="paragraph"><p>And as you already know, a crashing NIF will take down your VM, +destroying any claims you may have at being scalable.</p></div> +<div class="paragraph"><p>Never use a NIF because "C is fast". This is only true in +single-threaded programs.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_bifs">BIFs</h2> +<div class="sectionbody"> +<div class="paragraph"><p>BIFs can also be harmful. While they are generally better than +NIFs, they are not perfect and some of them might have harmful +effects on the scheduler.</p></div> +<div class="paragraph"><p>A great example of this is the <code>erlang:decode_packet/3</code> +BIF, when used for HTTP request or response decoding. Avoiding +its use in <em>Cowboy</em> allowed us to see a big increase in +the number of requests production systems were able to handle, +up to two times the original amount. Incidentally this is something +that is impossible to detect using synthetic benchmarks.</p></div> +<div class="paragraph"><p>BIFs that return immediately are perfectly fine though.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_binary_strings">Binary strings</h2> +<div class="sectionbody"> +<div class="paragraph"><p>Binary strings use less memory, which means you spend less time +allocating memory compared to list-based strings. They are also +more natural for strings manipulation because they are optimized +for appending (as opposed to prepending for lists).</p></div> +<div class="paragraph"><p>If you can process a binary string using a single match context, +then the code will run incredibly fast. The effects will be much +increased if the code was compiled using HiPE, even if your Erlang +system isn’t compiled natively.</p></div> +<div class="paragraph"><p>Avoid using <code>binary:split</code> or <code>binary:replace</code> +if you can avoid it. They have a certain overhead due to supporting +many options that you probably don’t need for most operations.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_buffering_and_streaming">Buffering and streaming</h2> +<div class="sectionbody"> +<div class="paragraph"><p>Use binaries. They are great for appending, and it’s a direct copy +from what you receive from a stream (usually a socket or a file).</p></div> +<div class="paragraph"><p>Be careful to not indefinitely receive data, as you might end up +having a single binary taking up huge amounts of memory.</p></div> +<div class="paragraph"><p>If you stream from a socket and know how much data you expect, +then fetch that data in a single <code>recv</code> call.</p></div> +<div class="paragraph"><p>Similarly, if you can use a single <code>send</code> call, then +you should do so, to avoid going back and forth unnecessarily between +your Erlang process and the Erlang port for your socket.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_list_and_binary_comprehensions">List and binary comprehensions</h2> +<div class="sectionbody"> +<div class="paragraph"><p>Prefer list comprehensions over <code>lists:map/2</code>. The +compiler will be able to optimize your code greatly, for example +not creating the result if you don’t need it. As time goes on, +more optimizations will be added to the compiler and you will +automatically benefit from them.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_gen_server_is_no_silver_bullet">gen_server is no silver bullet</h2> +<div class="sectionbody"> +<div class="paragraph"><p>It’s a bad idea to use <code>gen_server</code> for everything. +For two reasons.</p></div> +<div class="paragraph"><p>There is an overhead everytime the <code>gen_server</code> receives +a call, a cast or a simple message. It can be a problem if your +<code>gen_server</code> is in a critical code path where speed +is all that matters. Do not hesitate to create other kinds of +processes where it makes sense. And depending on the kind of process, +you might want to consider making them special processes, which +would essentially behave the same as a <code>gen_server</code>.</p></div> +<div class="paragraph"><p>A common mistake is to have a unique <code>gen_server</code> to +handle queries from many processes. This generally becomes the +biggest bottleneck you’ll want to fix. You should try to avoid +relying on a single process, using a pool if you can.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_supervisor_and_monitoring">Supervisor and monitoring</h2> +<div class="sectionbody"> +<div class="paragraph"><p>A <code>supervisor</code> is also a <code>gen_server</code>, +so the previous points also apply to them.</p></div> +<div class="paragraph"><p>Sometimes you’re in a situation where you have supervised +processes but also want to monitor them in one or more other +processes, effectively duplicating the work. The supervisor +already knows when processes die, why not use this to our +advantage?</p></div> +<div class="paragraph"><p>You can create a custom supervisor process that will perform +both the supervision and handle exit and other events, allowing +to avoid the combination of supervising and monitoring which +can prove harmful when many processes die at once, or when you +have many short lived processes.</p></div> +</div> +</div> +<div class="sect1"> +<h2 id="_ets_for_lolspeed_tm">ets for LOLSPEED(tm)</h2> +<div class="sectionbody"> +<div class="paragraph"><p>If you have data queried or modified by many processes, then +<code>ets</code> public or protected tables will give you the +performance boost you require. Do not forget to set the +<code>read_concurrency</code> or <code>write_concurrency</code> +options though.</p></div> +<div class="paragraph"><p>You might also be thrilled to know that Erlang R16B will feature +a big performance improvement for accessing <code>ets</code> tables +concurrently.</p></div> +</div> +</div> </article> </div> @@ -216,55 +216,107 @@ concurrently.</p></div> <h3>More articles</h3> <ul id="articles-nav" class="extra_margin"> - <li><a href="https://ninenines.eu/articles/cowboy-2.0.0-rc.2/">Cowboy 2.0 release candidate 2</a></li> + + <li><a href="https://ninenines.eu/articles/cowboy-2.0.0-rc.2/">Cowboy 2.0 release candidate 2</a></li> + + + + <li><a href="https://ninenines.eu/articles/cowboy-2.0.0-rc.1/">Cowboy 2.0 release candidate 1</a></li> + - <li><a href="https://ninenines.eu/articles/cowboy-2.0.0-rc.1/">Cowboy 2.0 release candidate 1</a></li> + + <li><a href="https://ninenines.eu/articles/the-elephant-in-the-room/">The elephant in the room</a></li> + - <li><a href="https://ninenines.eu/articles/the-elephant-in-the-room/">The elephant in the room</a></li> + + <li><a href="https://ninenines.eu/articles/dont-let-it-crash/">Don't let it crash</a></li> + - <li><a href="https://ninenines.eu/articles/dont-let-it-crash/">Don't let it crash</a></li> + + <li><a href="https://ninenines.eu/articles/cowboy-2.0.0-pre.4/">Cowboy 2.0 pre-release 4</a></li> + - <li><a href="https://ninenines.eu/articles/cowboy-2.0.0-pre.4/">Cowboy 2.0 pre-release 4</a></li> + + <li><a href="https://ninenines.eu/articles/ranch-1.3/">Ranch 1.3</a></li> + - <li><a href="https://ninenines.eu/articles/ranch-1.3/">Ranch 1.3</a></li> + + <li><a href="https://ninenines.eu/articles/ml-archives/">Mailing list archived</a></li> + - <li><a href="https://ninenines.eu/articles/ml-archives/">Mailing list archived</a></li> + + <li><a href="https://ninenines.eu/articles/website-update/">Website update</a></li> + - <li><a href="https://ninenines.eu/articles/website-update/">Website update</a></li> + + <li><a href="https://ninenines.eu/articles/erlanger-playbook-september-2015-update/">The Erlanger Playbook September 2015 Update</a></li> + - <li><a href="https://ninenines.eu/articles/erlanger-playbook-september-2015-update/">The Erlanger Playbook September 2015 Update</a></li> + + <li><a href="https://ninenines.eu/articles/erlanger-playbook/">The Erlanger Playbook</a></li> + - <li><a href="https://ninenines.eu/articles/erlanger-playbook/">The Erlanger Playbook</a></li> + + <li><a href="https://ninenines.eu/articles/erlang-validate-utf8/">Validating UTF-8 binaries with Erlang</a></li> + - <li><a href="https://ninenines.eu/articles/erlang-validate-utf8/">Validating UTF-8 binaries with Erlang</a></li> + + <li><a href="https://ninenines.eu/articles/on-open-source/">On open source</a></li> + - <li><a href="https://ninenines.eu/articles/on-open-source/">On open source</a></li> + + <li><a href="https://ninenines.eu/articles/the-story-so-far/">The story so far</a></li> + - <li><a href="https://ninenines.eu/articles/the-story-so-far/">The story so far</a></li> + + <li><a href="https://ninenines.eu/articles/cowboy2-qs/">Cowboy 2.0 and query strings</a></li> + - <li><a href="https://ninenines.eu/articles/cowboy2-qs/">Cowboy 2.0 and query strings</a></li> + + <li><a href="https://ninenines.eu/articles/january-2014-status/">January 2014 status</a></li> + - <li><a href="https://ninenines.eu/articles/january-2014-status/">January 2014 status</a></li> + + <li><a href="https://ninenines.eu/articles/farwest-funded/">Farwest got funded!</a></li> + - <li><a href="https://ninenines.eu/articles/farwest-funded/">Farwest got funded!</a></li> + + <li><a href="https://ninenines.eu/articles/erlang.mk-and-relx/">Build Erlang releases with Erlang.mk and Relx</a></li> + - <li><a href="https://ninenines.eu/articles/erlang.mk-and-relx/">Build Erlang releases with Erlang.mk and Relx</a></li> + + <li><a href="https://ninenines.eu/articles/xerl-0.5-intermediate-module/">Xerl: intermediate module</a></li> + - <li><a href="https://ninenines.eu/articles/xerl-0.5-intermediate-module/">Xerl: intermediate module</a></li> + + <li><a href="https://ninenines.eu/articles/xerl-0.4-expression-separator/">Xerl: expression separator</a></li> + - <li><a href="https://ninenines.eu/articles/xerl-0.4-expression-separator/">Xerl: expression separator</a></li> + + <li><a href="https://ninenines.eu/articles/erlang-scalability/">Erlang Scalability</a></li> + - <li><a href="https://ninenines.eu/articles/erlang-scalability/">Erlang Scalability</a></li> + + <li><a href="https://ninenines.eu/articles/xerl-0.3-atomic-expressions/">Xerl: atomic expressions</a></li> + - <li><a href="https://ninenines.eu/articles/xerl-0.3-atomic-expressions/">Xerl: atomic expressions</a></li> + + <li><a href="https://ninenines.eu/articles/xerl-0.2-two-modules/">Xerl: two modules</a></li> + - <li><a href="https://ninenines.eu/articles/xerl-0.2-two-modules/">Xerl: two modules</a></li> + + <li><a href="https://ninenines.eu/articles/xerl-0.1-empty-modules/">Xerl: empty modules</a></li> + - <li><a href="https://ninenines.eu/articles/xerl-0.1-empty-modules/">Xerl: empty modules</a></li> + + <li><a href="https://ninenines.eu/articles/ranch-ftp/">Build an FTP Server with Ranch in 30 Minutes</a></li> + - <li><a href="https://ninenines.eu/articles/ranch-ftp/">Build an FTP Server with Ranch in 30 Minutes</a></li> + + <li><a href="https://ninenines.eu/articles/tictactoe/">Erlang Tic Tac Toe</a></li> + - <li><a href="https://ninenines.eu/articles/tictactoe/">Erlang Tic Tac Toe</a></li> + </ul> |