From eb9ee88f4cc640065f4902e270d834bfb596d5fc Mon Sep 17 00:00:00 2001
From: John Högberg
Date: Mon, 15 Oct 2018 18:17:12 +0200
Subject: Optimize operator '--' and yield on large inputs
The removal set now uses a red-black tree instead of an array on
large inputs, decreasing runtime complexity from `n*n` to
`n*log(n)`. It will also exit early when there are no more items
left in the removal set, drastically improving performance and
memory use when the items to be removed are present near the head
of the list.
This got a lot more complicated than before, as the overhead of
always using a red-black tree was unacceptable when either of the
inputs was small, but this compromise has okay-to-decent
performance regardless of input size.
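For reference, a minimal sketch of the behaviour that any optimization
of '--' has to preserve, next to the ordered-set workaround. The module
and function names (`subtract_demo`, `demo/0`) are hypothetical and
exist only for this illustration; the sketch shows the operator's
semantics, not the new implementation.

```erlang
-module(subtract_demo).
-export([demo/0]).

%% Hypothetical demo module; only '--' and the ordsets calls are real APIs.
demo() ->
    %% '--' removes only the first occurrence of each right-hand element
    %% and keeps the remaining elements in their original order.
    [1, 3, 2] = [1, 2, 3, 2] -- [2],
    %% Converting to ordered sets sorts and deduplicates both lists first,
    %% so the result can differ from plain '--'.
    [1, 3] = ordsets:subtract(ordsets:from_list([1, 2, 3, 2]),
                              ordsets:from_list([2])),
    ok.
```

Calling `subtract_demo:demo()` returns `ok` if both matches succeed.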
Co-authored-by: Dmytro Lytovchenko
---
system/doc/efficiency_guide/retired_myths.xml | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/system/doc/efficiency_guide/retired_myths.xml b/system/doc/efficiency_guide/retired_myths.xml
index 9b914a3b6e..144c942c2b 100644
--- a/system/doc/efficiency_guide/retired_myths.xml
+++ b/system/doc/efficiency_guide/retired_myths.xml
@@ -60,4 +60,18 @@
That leads us to the myth that tail-recursive functions are faster
than body-recursive functions.
+
+
+ Myth: List subtraction ("--" operator) is slow
+
+ List subtraction used to have a run-time complexity proportional to the
+ product of the lengths of its operands, so it was extremely slow when both
+ lists were long.
+
+ As of OTP 22 the run-time complexity is "n log n" and the operation will
+ complete quickly even when both lists are very long. In fact, it is
+ faster and uses less memory than the commonly used workaround to convert
+ both lists to ordered sets before subtracting them with
+ ordsets:subtract/2.
+
--
cgit v1.2.3