From 437555fd6c495915773b0f9ade7aad3fd0a73a1b Mon Sep 17 00:00:00 2001
From: Raimo Niskanen <raimo@erlang.org>
Date: Tue, 21 Mar 2017 16:36:33 +0100
Subject: Implement Xoroshiro116+ and improve statisticals

Implement Xoroshiro116+ as 'exrop' with fixes.

Deprecate all old algorithms but reincarnate 'exs1024' as 'exs1024s'
and 'exsplus' as 'exsp' with fixes.

Fixes:
* Avoid skew for uniform integers caused by using a simple 'rem'
  operation for range confinement.  Correctness requires retry
  with new random value for an unfortunate first value.
* Implement a correct algorithm that collects enough random
  bits for ranges larger than the generator's precision.
* Fix uniform density for floats by acquiring 53 bits
  then multiplying with 2.0^(-53) which produces floats
  on the form N * 2.0^(-53).
---
 lib/stdlib/doc/src/rand.xml    | 130 ++++++++--
 lib/stdlib/src/rand.erl        | 455 +++++++++++++++++++++++++++-------
 lib/stdlib/test/rand_SUITE.erl | 544 ++++++++++++++++++++++++++++++++---------
 3 files changed, 900 insertions(+), 229 deletions(-)

(limited to 'lib/stdlib')
diff --git a/lib/stdlib/doc/src/rand.xml b/lib/stdlib/doc/src/rand.xml
index 2ddf3021ac..470dc4da97 100644
--- a/lib/stdlib/doc/src/rand.xml
+++ b/lib/stdlib/doc/src/rand.xml
@@ -50,26 +50,73 @@
     <p>The following algorithms are provided:</p>
 
     <taglist>
-      <tag><c>exsplus</c></tag>
+      <tag><c>exrop</c></tag>
       <item>
-        <p>Xorshift116+, 58 bits precision and period of 2^116-1</p>
+        <p>Xoroshiro116+, 58 bits precision and period of 2^116-1</p>
         <p>Jump function: equivalent to 2^64 calls</p>
       </item>
-      <tag><c>exs64</c></tag>
-      <item>
-        <p>Xorshift64*, 64 bits precision and a period of 2^64-1</p>
-        <p>Jump function: not available</p>
-      </item>
-      <tag><c>exs1024</c></tag>
+      <tag><c>exs1024s</c></tag>
       <item>
         <p>Xorshift1024*, 64 bits precision and a period of 2^1024-1</p>
         <p>Jump function: equivalent to 2^512 calls</p>
       </item>
+      <tag><c>exsp</c></tag>
+      <item>
+        <p>Xorshift116+, 58 bits precision and period of 2^116-1</p>
+        <p>Jump function: equivalent to 2^64 calls</p>
+	<p>
+	  This is a corrected version of the previous default algorithm,
+	  that now has been superseeded by Xoroshiro116+ (<c>exrop</c>).
+	  Since there is no native 58 bit rotate instruction this
+	  algorithm executes a little (say &lt; 15%) faster than <c>exrop</c>.
+	  See the 
+	  <url href="http://xorshift.di.unimi.it">algorithms' homepage</url>.
+	</p>
+      </item>
     </taglist>
 
-    <p>The default algorithm is <c>exsplus</c>. If a specific algorithm is
+    <p>
+      The default algorithm is <c>exrop</c> (Xoroshiro116+).
+      If a specific algorithm is
       required, ensure to always use <seealso marker="#seed-1">
-      <c>seed/1</c></seealso> to initialize the state.</p>
+      <c>seed/1</c></seealso> to initialize the state.
+    </p>
+
+    <p>
+      Undocumented (old) algorithms are deprecated but still implemented
+      so old code relying on them will produce
+      the same pseudo random sequences as before.
+    </p>
+
+    <note>
+      <p>
+	There were a number of problems in the implementation
+	of the now undocumented algorithms, which is why
+	they are deprecated.  The new algorithms are a bit slower
+	but do not have these problems:
+      </p>
+      <p>
+	Uniform integer ranges had a skew in the probability distribution
+	that was not noticable for small ranges but for large ranges
+	less than the generator's precision the probability to produce
+	a low number could be twice the probability for a high.
+      </p>
+      <p>
+	Uniform integer ranges larger than or equal to the generator's
+	precision used a floating point fallback that only calculated
+	with 52 bits which is smaller than the requested range
+	and therefore were not all numbers in the requested range
+	even possible to produce.
+      </p>
+      <p>
+	Uniform floats had a non-uniform density so small values
+	i.e less than 0.5 had got smaller intervals decreasing
+	as the generated value approached 0.0 although still uniformly
+	distributed for sufficiently large subranges.  The new algorithms
+	produces uniformly distributed floats on the form N * 2.0^(-53)
+	hence equally spaced.
+      </p>
+    </note>
 
     <p>Every time a random number is requested, a state is used to
       calculate it and a new state is produced. The state can either be
@@ -99,19 +146,19 @@ R1 = rand:uniform(),</pre>
     <p>Use a specified algorithm:</p>
 
     <pre>
-_ = rand:seed(exs1024),
+_ = rand:seed(exs1024s),
 R2 = rand:uniform(),</pre>
 
     <p>Use a specified algorithm with a constant seed:</p>
 
     <pre>
-_ = rand:seed(exs1024, {123, 123534, 345345}),
+_ = rand:seed(exs1024s, {123, 123534, 345345}),
 R3 = rand:uniform(),</pre>
 
    <p>Use the functional API with a non-constant seed:</p>
 
    <pre>
-S0 = rand:seed_s(exsplus),
+S0 = rand:seed_s(exrop),
 {R4, S1} = rand:uniform_s(S0),</pre>
 
    <p>Create a standard normal deviate:</p>
@@ -127,6 +174,39 @@ S0 = rand:seed_s(exsplus),
       </p>
     </note>
 
+    <p>
+      For all these generators the lowest bit(s) has got
+      a slightly less random behaviour than all other bits.
+      1 bit for <c>exrop</c> (and <c>exsp</c>),
+      and 3 bits for <c>exs1024s</c>.
+      See for example the explanation in the
+      <url href="http://xoroshiro.di.unimi.it/xoroshiro128plus.c">
+	Xoroshiro128+
+      </url>
+      generator source code:
+    </p>
+    <pre>
+Beside passing BigCrush, this generator passes the PractRand test suite
+up to (and included) 16TB, with the exception of binary rank tests,
+which fail due to the lowest bit being an LFSR; all other bits pass all
+tests. We suggest to use a sign test to extract a random Boolean value.</pre>
+    <p>
+      If this is a problem; to generate a boolean
+      use something like this:
+    </p>
+    <pre>(rand:uniform(16) > 8)</pre>
+    <p>
+      And for a general range, with <c>N = 1</c> for <c>exrop</c>,
+      and <c>N = 3</c> for <c>exs1024s</c>:
+    </p>
+    <pre>(((rand:uniform(Range bsl N) - 1) bsr N) + 1)</pre>
+    <p>
+      The floating point generating functions in this module
+      waste the lowest bits when converting from an integer
+      so they avoid this snag.
+    </p>
+
+
   </description>
   <datatypes>
     <datatype>
@@ -141,6 +221,18 @@ S0 = rand:seed_s(exsplus),
     <datatype>
       <name name="alg_state"/>
     </datatype>
+    <datatype>
+      <name name="state"/>
+      <desc><p>Algorithm-dependent state.</p></desc>
+    </datatype>
+    <datatype>
+      <name name="export_state"/>
+      <desc>
+	<p>
+	  Algorithm-dependent state that can be printed or saved to file.
+	</p>
+      </desc>
+    </datatype>
     <datatype>
       <name name="exs64_state"/>
       <desc><p>Algorithm specific internal state</p></desc>
@@ -154,16 +246,8 @@ S0 = rand:seed_s(exsplus),
       <desc><p>Algorithm specific internal state</p></desc>
     </datatype>
     <datatype>
-      <name name="state"/>
-      <desc><p>Algorithm-dependent state.</p></desc>
-    </datatype>
-    <datatype>
-      <name name="export_state"/>
-      <desc>
-	<p>
-	  Algorithm-dependent state that can be printed or saved to file.
-	</p>
-      </desc>
+      <name name="exrop_state"/>
+      <desc><p>Algorithm specific internal state</p></desc>
     </datatype>
   </datatypes>
 
diff --git a/lib/stdlib/src/rand.erl b/lib/stdlib/src/rand.erl
index dfd102f9ef..2ee0ddb036 100644
--- a/lib/stdlib/src/rand.erl
+++ b/lib/stdlib/src/rand.erl
@@ -1,7 +1,7 @@
 %%
 %% %CopyrightBegin%
 %%
-%% Copyright Ericsson AB 2015-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2015-2017. All Rights Reserved.
 %%
 %% Licensed under the Apache License, Version 2.0 (the "License");
 %% you may not use this file except in compliance with the License.
@@ -20,6 +20,9 @@
 %% =====================================================================
 %% Multiple PRNG module for Erlang/OTP
 %% Copyright (c) 2015-2016 Kenji Rikitake
+%%
+%% exrop (xoroshiro116+) added and statistical distribution
+%% improvements by the Erlang/OTP team 2017
 %% =====================================================================
 
 -module(rand).
@@ -32,44 +35,175 @@
 	]).
 
 -compile({inline, [exs64_next/1, exsplus_next/1,
-		   exsplus_jump/1,
 		   exs1024_next/1, exs1024_calc/2,
-		   exs1024_jump/1,
+                   exrop_next/1, exrop_next_s/2,
 		   get_52/1, normal_kiwi/1]}).
 
--define(DEFAULT_ALG_HANDLER, exsplus).
+-define(DEFAULT_ALG_HANDLER, exrop).
 -define(SEED_DICT, rand_seed).
 
+%% =====================================================================
+%% Bit fiddling macros
+%% =====================================================================
+
+-define(BIT(Bits), (1 bsl (Bits))).
+-define(MASK(Bits), (?BIT(Bits) - 1)).
+-define(MASK(Bits, X), ((X) band ?MASK(Bits))).
+-define(
+   BSL(Bits, X, N),
+   %% N is evaluated 2 times
+   (?MASK((Bits)-(N), (X)) bsl (N))).
+-define(
+   ROTL(Bits, X, N),
+   %% Bits is evaluated 2 times
+   %% X is evaluated 2 times
+   %% N i evaluated 3 times
+   (?BSL((Bits), (X), (N)) bor ((X) bsr ((Bits)-(N))))).
+
+%%-define(TWO_POW_MINUS53, (math:pow(2, -53))).
+-define(TWO_POW_MINUS53, 1.11022302462515657e-16).
+
 %% =====================================================================
 %% Types
 %% =====================================================================
 
+-type uint64() :: 0..?MASK(64).
+-type uint58() :: 0..?MASK(58).
+
 %% This depends on the algorithm handler function
 -type alg_state() ::
-        exs64_state() | exsplus_state() | exs1024_state() | term().
+        exs64_state() | exsplus_state() | exs1024_state() |
+        exrop_state() | term().
 
-%% This is the algorithm handler function within this module
+%% This is the algorithm handling definition within this module,
+%% and the type to use for plugins.
+%%
+%% The 'type' field must be recognized by the module that implements
+%% the algorithm, to interpret an exported state.
+%%
+%% The 'bits' field indicates how many bits the integer
+%% returned from 'next' has got, i.e 'next' shall return
+%% an random integer in the range 0..(2^Bits - 1).
+%% At least 53 bits is required for the floating point
+%% producing fallbacks.  This field is only used when
+%% the 'uniform' or 'uniform_n' fields are not defined.
+%%
+%% The fields 'next', 'uniform' and 'uniform_n'
+%% implement the algorithm.  If 'uniform' or 'uinform_n'
+%% is not present there is a fallback using 'next' and either
+%% 'bits' or the deprecated 'max'.
+%%
 -type alg_handler() ::
         #{type := alg(),
-          max  := integer() | infinity,
+          bits => non_neg_integer(),
+          weak_low_bits => non_neg_integer(),
+          max => non_neg_integer(), % Deprecated
           next :=
-              fun((alg_state()) -> {non_neg_integer(), alg_state()}),
-          uniform :=
-              fun((state()) -> {float(), state()}),
-          uniform_n :=
-              fun((pos_integer(), state()) -> {pos_integer(), state()}),
-          jump :=
-              fun((state()) -> state())}.
+              fun ((alg_state()) -> {non_neg_integer(), alg_state()}),
+          uniform =>
+              fun ((state()) -> {float(), state()}),
+          uniform_n =>
+              fun ((pos_integer(), state()) -> {pos_integer(), state()}),
+          jump =>
+              fun ((state()) -> state())}.
 
 %% Algorithm state
 -type state() :: {alg_handler(), alg_state()}.
--type builtin_alg() :: exs64 | exsplus | exs1024.
+-type builtin_alg() :: exs64 | exsplus | exsp | exs1024 | exs1024s | exrop.
 -type alg() :: builtin_alg() | atom().
 -type export_state() :: {alg(), alg_state()}.
 -export_type(
    [builtin_alg/0, alg/0, alg_handler/0, alg_state/0,
     state/0, export_state/0]).
--export_type([exs64_state/0, exsplus_state/0, exs1024_state/0]).
+-export_type(
+   [exs64_state/0, exsplus_state/0, exs1024_state/0, exrop_state/0]).
+
+%% =====================================================================
+%% Range macro and helper
+%% =====================================================================
+
+-define(
+   uniform_range(Range, Alg, R, V, MaxMinusRange, I),
+   if
+       0 =< (MaxMinusRange) ->
+           if
+               %% Really work saving in odd cases;
+               %% large ranges in particular
+               (V) < (Range) ->
+                   {(V) + 1, {(Alg), (R)}};
+               true ->
+                   (I) = (V) rem (Range),
+                   if
+                       (V) - (I) =< (MaxMinusRange) ->
+                           {(I) + 1, {(Alg), (R)}};
+                       true ->
+                           %% V in the truncated top range
+                           %% - try again
+                           ?FUNCTION_NAME((Range), {(Alg), (R)})
+                   end
+           end;
+       true ->
+           uniform_range((Range), (Alg), (R), (V))
+   end).
+
+%% For ranges larger than the algorithm bit size
+uniform_range(Range, #{next:=Next, bits:=Bits} = Alg, R, V) ->
+    WeakLowBits =
+        case Alg of
+            #{weak_low_bits:=WLB} -> WLB;
+            #{} -> 0
+        end,
+    %% Maybe waste the lowest bit(s) when shifting in new bits
+    Shift = Bits - WeakLowBits,
+    ShiftMask = bnot ?MASK(WeakLowBits),
+    RangeMinus1 = Range - 1,
+    if
+        (Range band RangeMinus1) =:= 0 -> % Power of 2
+            %% Generate at least the number of bits for the range
+            {V1, R1, _} =
+                uniform_range(
+                  Range bsr Bits, Next, R, V, ShiftMask, Shift, Bits),
+            {(V1 band RangeMinus1) + 1, {Alg, R1}};
+        true ->
+            %% Generate a value with at least two bits more than the range
+            %% and try that for a fit, otherwise recurse
+            %%
+            %% Just one bit more should ensure that the generated
+            %% number range is at least twice the size of the requested
+            %% range, which would make the probability to draw a good
+            %% number better than 0.5.  And repeating that until
+            %% success i guess would take 2 times statistically amortized.
+            %% But since the probability for fairly many attemtpts
+            %% is not that low, use two bits more than the range which 
+            %% should make the probability to draw a bad number under 0.25,
+            %% which decreases the bad case probability a lot.
+            {V1, R1, B} =
+                uniform_range(
+                  Range bsr (Bits - 2), Next, R, V, ShiftMask, Shift, Bits),
+            I = V1 rem Range,
+            if
+                (V1 - I) =< (1 bsl B) - Range ->
+                    {I + 1, {Alg, R1}};
+                true ->
+                    %% V1 drawn from the truncated top range
+                    %% - try again
+                    {V2, R2} = Next(R1),
+                    uniform_range(Range, Alg, R2, V2)
+            end
+    end.
+%%
+uniform_range(Range, Next, R, V, ShiftMask, Shift, B) ->
+    if 
+        Range =< 1 ->
+            {V, R, B};
+        true ->
+            {V1, R1} = Next(R),
+            %% Waste the lowest bit(s) when shifting in new bits
+            uniform_range(
+              Range bsr Shift, Next, R1,
+              ((V band ShiftMask) bsl Shift) bor V1,
+              ShiftMask, Shift, B + Shift)
+    end.
 
 %% =====================================================================
 %% API
@@ -156,7 +290,16 @@ uniform(N) ->
 
 -spec uniform_s(State :: state()) -> {X :: float(), NewState :: state()}.
 uniform_s(State = {#{uniform:=Uniform}, _}) ->
-    Uniform(State).
+    Uniform(State);
+uniform_s({#{bits:=Bits, next:=Next} = Alg, R0}) ->
+    {V, R1} = Next(R0),
+    %% Produce floats on the form N * 2^(-53)
+    {(V bsr (Bits - 53)) * ?TWO_POW_MINUS53, {Alg, R1}};
+uniform_s({#{max:=Max, next:=Next} = Alg, R0}) ->
+    {V, R1} = Next(R0),
+    %% Old broken algorithm with non-uniform density
+    {V / (Max + 1), {Alg, R1}}.
+
 
 %% uniform_s/2: given an integer N >= 1 and a state, uniform_s/2
 %% uniform_s/2 returns a random integer X where 1 =< X =< N,
@@ -164,13 +307,26 @@ uniform_s(State = {#{uniform:=Uniform}, _}) ->
 
 -spec uniform_s(N :: pos_integer(), State :: state()) ->
                        {X :: pos_integer(), NewState :: state()}.
-uniform_s(N, State = {#{uniform_n:=Uniform, max:=Max}, _})
-  when 0 < N, N =< Max ->
-    Uniform(N, State);
-uniform_s(N, State0 = {#{uniform:=Uniform}, _})
-  when is_integer(N), 0 < N ->
-    {F, State} = Uniform(State0),
-    {trunc(F * N) + 1, State}.
+uniform_s(N, State = {#{uniform_n:=UniformN}, _})
+  when is_integer(N), 1 =< N ->
+    UniformN(N, State);
+uniform_s(N, {#{bits:=Bits, next:=Next} = Alg, R0})
+  when is_integer(N), 1 =< N ->
+    {V, R1} = Next(R0),
+    MaxMinusN = ?BIT(Bits) - N,
+    ?uniform_range(N, Alg, R1, V, MaxMinusN, I);
+uniform_s(N, {#{max:=Max, next:=Next} = Alg, R0})
+  when is_integer(N), 1 =< N ->
+    %% Old broken algorithm with skewed probability
+    %% and gap in ranges > Max
+    {V, R1} = Next(R0),  
+    if
+        N =< Max ->
+            {(V rem N) + 1, {Alg, R1}};
+        true ->
+            F = V / (Max + 1),
+            {trunc(F * N) + 1, {Alg, R1}}
+    end.
 
 %% jump/1: given a state, jump/1
 %% returns a new state which is equivalent to that
@@ -179,7 +335,10 @@ uniform_s(N, State0 = {#{uniform:=Uniform}, _})
 
 -spec jump(state()) -> NewState :: state().
 jump(State = {#{jump:=Jump}, _}) ->
-    Jump(State).
+    Jump(State);
+jump({#{}, _}) ->
+    erlang:error(not_implemented).
+
 
 %% jump/0: read the internal state and
 %% apply the jump function for the state as in jump/1
@@ -187,7 +346,6 @@ jump(State = {#{jump:=Jump}, _}) ->
 %% then returns the new value.
 
 -spec jump() -> NewState :: state().
-
 jump() ->
     seed_put(jump(seed_get())).
 
@@ -207,7 +365,7 @@ normal() ->
 -spec normal_s(State :: state()) -> {float(), NewState :: state()}.
 normal_s(State0) ->
     {Sign, R, State} = get_52(State0),
-    Idx = R band 16#FF,
+    Idx = ?MASK(8, R),
     Idx1 = Idx+1,
     {Ki, Wi} = normal_kiwi(Idx1),
     X = R * Wi,
@@ -223,16 +381,6 @@ normal_s(State0) ->
 %% =====================================================================
 %% Internal functions
 
--define(UINT21MASK, 16#00000000001fffff).
--define(UINT32MASK, 16#00000000ffffffff).
--define(UINT33MASK, 16#00000001ffffffff).
--define(UINT39MASK, 16#0000007fffffffff).
--define(UINT58MASK, 16#03ffffffffffffff).
--define(UINT64MASK, 16#ffffffffffffffff).
-
--type uint64() :: 0..16#ffffffffffffffff.
--type uint58() :: 0..16#03ffffffffffffff.
-
 -spec seed_put(state()) -> state().
 seed_put(Seed) ->
     put(?SEED_DICT, Seed),
@@ -246,20 +394,30 @@ seed_get() ->
 
 %% Setup alg record
 mk_alg(exs64) ->
-    {#{type=>exs64, max=>?UINT64MASK, next=>fun exs64_next/1,
-       uniform=>fun exs64_uniform/1, uniform_n=>fun exs64_uniform/2,
-       jump=>fun exs64_jump/1},
+    {#{type=>exs64, max=>?MASK(64), next=>fun exs64_next/1},
      fun exs64_seed/1};
 mk_alg(exsplus) ->
-    {#{type=>exsplus, max=>?UINT58MASK, next=>fun exsplus_next/1,
-       uniform=>fun exsplus_uniform/1, uniform_n=>fun exsplus_uniform/2,
+    {#{type=>exsplus, max=>?MASK(58), next=>fun exsplus_next/1,
+       jump=>fun exsplus_jump/1},
+     fun exsplus_seed/1};
+mk_alg(exsp) ->
+    {#{type=>exsp, bits=>58, weak_low_bits=>1, next=>fun exsplus_next/1,
+       uniform=>fun exsp_uniform/1, uniform_n=>fun exsp_uniform/2,
        jump=>fun exsplus_jump/1},
      fun exsplus_seed/1};
 mk_alg(exs1024) ->
-    {#{type=>exs1024, max=>?UINT64MASK, next=>fun exs1024_next/1,
-       uniform=>fun exs1024_uniform/1, uniform_n=>fun exs1024_uniform/2,
+    {#{type=>exs1024, max=>?MASK(64), next=>fun exs1024_next/1,
+       jump=>fun exs1024_jump/1},
+     fun exs1024_seed/1};
+mk_alg(exs1024s) ->
+    {#{type=>exs1024s, bits=>64, weak_low_bits=>3, next=>fun exs1024_next/1,
        jump=>fun exs1024_jump/1},
-     fun exs1024_seed/1}.
+     fun exs1024_seed/1};
+mk_alg(exrop) ->
+    {#{type=>exrop, bits=>58, weak_low_bits=>1, next=>fun exrop_next/1,
+       uniform=>fun exrop_uniform/1, uniform_n=>fun exrop_uniform/2,
+       jump=>fun exrop_jump/1},
+     fun exrop_seed/1}.
 
 %% =====================================================================
 %% exs64 PRNG: Xorshift64*
@@ -270,29 +428,18 @@ mk_alg(exs1024) ->
 -opaque exs64_state() :: uint64().
 
 exs64_seed({A1, A2, A3}) ->
-    {V1, _} = exs64_next(((A1 band ?UINT32MASK) * 4294967197 + 1)),
-    {V2, _} = exs64_next(((A2 band ?UINT32MASK) * 4294967231 + 1)),
-    {V3, _} = exs64_next(((A3 band ?UINT32MASK) * 4294967279 + 1)),
-    ((V1 * V2 * V3) rem (?UINT64MASK - 1)) + 1.
+    {V1, _} = exs64_next((?MASK(32, A1) * 4294967197 + 1)),
+    {V2, _} = exs64_next((?MASK(32, A2) * 4294967231 + 1)),
+    {V3, _} = exs64_next((?MASK(32, A3) * 4294967279 + 1)),
+    ((V1 * V2 * V3) rem (?MASK(64) - 1)) + 1.
 
 %% Advance xorshift64* state for one step and generate 64bit unsigned integer
 -spec exs64_next(exs64_state()) -> {uint64(), exs64_state()}.
 exs64_next(R) ->
     R1 = R bxor (R bsr 12),
-    R2 = R1 bxor ((R1 band ?UINT39MASK) bsl 25),
+    R2 = R1 bxor ?BSL(64, R1, 25),
     R3 = R2 bxor (R2 bsr 27),
-    {(R3 * 2685821657736338717) band ?UINT64MASK, R3}.
-
-exs64_uniform({Alg, R0}) ->
-    {V, R1} = exs64_next(R0),
-    {V / 18446744073709551616, {Alg, R1}}.
-
-exs64_uniform(Max, {Alg, R}) ->
-    {V, R1} = exs64_next(R),
-    {(V rem Max) + 1, {Alg, R1}}.
-
-exs64_jump(_) ->
-    erlang:error(not_implemented).
+    {?MASK(64, R3 * 2685821657736338717), R3}.
 
 %% =====================================================================
 %% exsplus PRNG: Xorshift116+
@@ -307,10 +454,12 @@ exs64_jump(_) ->
 -dialyzer({no_improper_lists, exsplus_seed/1}).
 
 exsplus_seed({A1, A2, A3}) ->
-    {_, R1} = exsplus_next([(((A1 * 4294967197) + 1) band ?UINT58MASK)|
-			    (((A2 * 4294967231) + 1) band ?UINT58MASK)]),
-    {_, R2} = exsplus_next([(((A3 * 4294967279) + 1) band ?UINT58MASK)|
-			    tl(R1)]),
+    {_, R1} = exsplus_next(
+                [?MASK(58, (A1 * 4294967197) + 1)|
+                 ?MASK(58, (A2 * 4294967231) + 1)]),
+    {_, R2} = exsplus_next(
+                [?MASK(58, (A3 * 4294967279) + 1)|
+                 tl(R1)]),
     R2.
 
 -dialyzer({no_improper_lists, exsplus_next/1}).
@@ -319,17 +468,22 @@ exsplus_seed({A1, A2, A3}) ->
 -spec exsplus_next(exsplus_state()) -> {uint58(), exsplus_state()}.
 exsplus_next([S1|S0]) ->
     %% Note: members s0 and s1 are swapped here
-    S11 = (S1 bxor (S1 bsl 24)) band ?UINT58MASK,
+    S11 = S1 bxor ?BSL(58, S1, 24),
     S12 = S11 bxor S0 bxor (S11 bsr 11) bxor (S0 bsr 41),
-    {(S0 + S12) band ?UINT58MASK, [S0|S12]}.
+    {?MASK(58, S0 + S12), [S0|S12]}.
+
 
-exsplus_uniform({Alg, R0}) ->
+exsp_uniform({Alg, R0}) ->
     {I, R1} = exsplus_next(R0),
-    {I / (?UINT58MASK+1), {Alg, R1}}.
+    %% Waste the lowest bit since it is of lower
+    %% randomness quality than the others
+    {(I bsr (58-53)) * ?TWO_POW_MINUS53, {Alg, R1}}.
 
-exsplus_uniform(Max, {Alg, R}) ->
+exsp_uniform(Range, {Alg, R}) ->
     {V, R1} = exsplus_next(R),
-    {(V rem Max) + 1, {Alg, R1}}.
+    MaxMinusRange = ?BIT(58) - Range,
+    ?uniform_range(Range, Alg, R1, V, MaxMinusRange, I).
+
 
 %% This is the jump function for the exsplus generator, equivalent
 %% to 2^64 calls to next/1; it can be used to generate 2^52
@@ -357,7 +511,7 @@ exsplus_jump(S, AS, _, 0) ->
     {S, AS};
 exsplus_jump(S, [AS0|AS1], J, N) ->
     {_, NS} = exsplus_next(S),
-    case (J band 1) of
+    case ?MASK(1, J) of
         1 ->
             [S0|S1] = S,
             exsplus_jump(NS, [(AS0 bxor S0)|(AS1 bxor S1)], J bsr 1, N-1);
@@ -374,9 +528,9 @@ exsplus_jump(S, [AS0|AS1], J, N) ->
 -opaque exs1024_state() :: {list(uint64()), list(uint64())}.
 
 exs1024_seed({A1, A2, A3}) ->
-    B1 = (((A1 band ?UINT21MASK) + 1) * 2097131) band ?UINT21MASK,
-    B2 = (((A2 band ?UINT21MASK) + 1) * 2097133) band ?UINT21MASK,
-    B3 = (((A3 band ?UINT21MASK) + 1) * 2097143) band ?UINT21MASK,
+    B1 = ?MASK(21, (?MASK(21, A1) + 1) * 2097131),
+    B2 = ?MASK(21, (?MASK(21, A2) + 1) * 2097133),
+    B3 = ?MASK(21, (?MASK(21, A3) + 1) * 2097143),
     {exs1024_gen1024((B1 bsl 43) bor (B2 bsl 22) bor (B3 bsl 1) bor 1),
      []}.
 
@@ -399,11 +553,11 @@ exs1024_gen1024(N, R, L) ->
 %% X: random number output
 -spec exs1024_calc(uint64(), uint64()) -> {uint64(), uint64()}.
 exs1024_calc(S0, S1) ->
-    S11 = S1 bxor ((S1 band ?UINT33MASK) bsl 31),
+    S11 = S1 bxor ?BSL(64, S1, 31),
     S12 = S11 bxor (S11 bsr 11),
     S01 = S0 bxor (S0 bsr 30),
     NS1 = S01 bxor S12,
-    {(NS1 * 1181783497276652981) band ?UINT64MASK, NS1}.
+    {?MASK(64, NS1 * 1181783497276652981), NS1}.
 
 %% Advance xorshift1024* state for one step and generate 64bit unsigned integer
 -spec exs1024_next(exs1024_state()) -> {uint64(), exs1024_state()}.
@@ -414,13 +568,6 @@ exs1024_next({[H], RL}) ->
     NL = [H|lists:reverse(RL)],
     exs1024_next({NL, []}).
 
-exs1024_uniform({Alg, R0}) ->
-    {V, R1} = exs1024_next(R0),
-    {V / 18446744073709551616, {Alg, R1}}.
-
-exs1024_uniform(Max, {Alg, R}) ->
-    {V, R1} = exs1024_next(R),
-    {(V rem Max) + 1, {Alg, R1}}.
 
 %% This is the jump function for the exs1024 generator, equivalent
 %% to 2^512 calls to next(); it can be used to generate 2^512
@@ -467,7 +614,7 @@ exs1024_jump(S, AS, [H|T], _, 0, TN) ->
     exs1024_jump(S, AS, T, H, ?JUMPELEMLEN, TN);
 exs1024_jump({L, RL}, AS, JL, J, N, TN) ->
     {_, NS} = exs1024_next({L, RL}),
-    case (J band 1) of
+    case ?MASK(1, J) of
         1 ->
             AS2 = lists:zipwith(fun(X, Y) -> X bxor Y end,
                         AS, L ++ lists:reverse(RL)),
@@ -476,6 +623,136 @@ exs1024_jump({L, RL}, AS, JL, J, N, TN) ->
             exs1024_jump(NS, AS, JL, J bsr 1, N-1, TN-1)
     end.
 
+%% =====================================================================
+%% exrop PRNG: Xoroshiro116+
+%%
+%% Reference URL: http://xorshift.di.unimi.it/
+%%
+%% 58 bits fits into an immediate on 64bits Erlang and is thus much faster.
+%% In fact, an immediate number is 60 bits signed in Erlang so you can
+%% add two positive 58 bit numbers and get a 59 bit number that still is
+%% a positive immediate, which is a property we utilize here...
+%%
+%% Modification of the original Xororhiro128+ algorithm to 116 bits
+%% by Sebastiano Vigna.  A lot of thanks for his help and work.
+%% =====================================================================
+%% (a, b, c) = (24, 2, 35)
+%% JUMP Polynomial = 0x9863200f83fcd4a11293241fcb12a (116 bit)
+%%
+%% From http://xoroshiro.di.unimi.it/xoroshiro116plus.c:
+%% ---------------------------------------------------------------------
+%% /* Written in 2017 by Sebastiano Vigna (vigna@acm.org).
+%%
+%% To the extent possible under law, the author has dedicated all copyright
+%% and related and neighboring rights to this software to the public domain
+%% worldwide. This software is distributed without any warranty.
+%%
+%% See <http://creativecommons.org/publicdomain/zero/1.0/>. */
+%%
+%% #include <stdint.h>
+%%
+%% #define UINT58MASK (uint64_t)((UINT64_C(1) << 58) - 1)
+%%
+%% uint64_t s[2];
+%%
+%% static inline uint64_t rotl58(const uint64_t x, int k) {
+%%     return (x << k) & UINT58MASK | (x >> (58 - k));
+%% }
+%% 
+%% uint64_t next(void) {
+%%     uint64_t s1 = s[1];
+%%     const uint64_t s0 = s[0];
+%%     const uint64_t result = (s0 + s1) & UINT58MASK;
+%%
+%%     s1 ^= s0;
+%%     s[0] = rotl58(s0, 24) ^ s1 ^ ((s1 << 2) & UINT58MASK); // a, b
+%%     s[1] = rotl58(s1, 35); // c
+%%     return result;
+%% }
+%%
+%% void jump(void) {
+%%     static const uint64_t JUMP[] =
+%%         { 0x4a11293241fcb12a, 0x0009863200f83fcd };
+%%
+%%     uint64_t s0 = 0;
+%%     uint64_t s1 = 0;
+%%     for(int i = 0; i < sizeof JUMP / sizeof *JUMP; i++)
+%%         for(int b = 0; b < 64; b++) {
+%%             if (JUMP[i] & UINT64_C(1) << b) {
+%%                 s0 ^= s[0];
+%% 	           s1 ^= s[1];
+%% 	       }
+%% 	       next();
+%% 	   }
+%%     s[0] = s0;
+%%     s[1] = s1;
+%% }
+
+-opaque exrop_state() :: nonempty_improper_list(uint58(), uint58()).
+
+-dialyzer({no_improper_lists, exrop_seed/1}).
+exrop_seed({A1, A2, A3}) ->
+    [_|S1] =
+        exrop_next_s(
+          ?MASK(58, (A1 * 4294967197) + 1),
+          ?MASK(58, (A2 * 4294967231) + 1)),
+    exrop_next_s(?MASK(58, (A3 * 4294967279) + 1), S1).
+
+-dialyzer({no_improper_lists, exrop_next_s/2}).
+%% Advance xoroshiro116+ state one step
+%% [a, b, c] = [24, 2, 35]
+-define(
+   exrop_next_s(S0, S1, S1_a),
+   begin
+       S1_a = S1 bxor S0,
+       [?ROTL(58, S0, 24) bxor S1_a bxor ?BSL(58, S1_a, 2)| % a, b
+        ?ROTL(58, S1_a, 35)] % c
+   end).
+exrop_next_s(S0, S1) ->
+    ?exrop_next_s(S0, S1, S1_a).
+
+-dialyzer({no_improper_lists, exrop_next/1}).
+%% Advance xoroshiro116+ state one step, generate 58 bit unsigned integer,
+%% and waste the lowest bit since it is of lower randomness quality
+exrop_next([S0|S1]) ->
+    {?MASK(58, S0 + S1), ?exrop_next_s(S0, S1, S1_a)}.
+
+exrop_uniform({Alg, R}) ->
+    {V, R1} = exrop_next(R),
+    %% Waste the lowest bit since it is of lower
+    %% randomness quality than the others
+    {(V bsr (58-53)) * ?TWO_POW_MINUS53, {Alg, R1}}.
+
+exrop_uniform(Range, {Alg, R}) ->
+    {V, R1} = exrop_next(R),
+    MaxMinusRange = ?BIT(58) - Range,
+    ?uniform_range(Range, Alg, R1, V, MaxMinusRange, I).
+
+%% Split a 116 bit constant into two '1'++58 bit words,
+%% the top '1' marks the top of the word
+-define(
+   JUMP_116(Jump),
+   [?BIT(58) bor ?MASK(58, (Jump)),?BIT(58) bor ((Jump) bsr 58)]).
+%%
+exrop_jump({Alg,S}) ->
+    [J|Js] = ?JUMP_116(16#9863200f83fcd4a11293241fcb12a),
+    {Alg, exrop_jump(S, 0, 0, J, Js)}.
+%%
+-dialyzer({no_improper_lists, exrop_jump/5}).
+exrop_jump(_S, S0, S1, 1, []) -> % End of jump constant
+    [S0|S1];
+exrop_jump(S, S0, S1, 1, [J|Js]) -> % End of the word
+    exrop_jump(S, S0, S1, J, Js);
+exrop_jump([S__0|S__1] = _S, S0, S1, J, Js) ->
+    case ?MASK(1, J) of
+        1 ->
+            NewS = exrop_next_s(S__0, S__1),
+            exrop_jump(NewS, S0 bxor S__0, S1 bxor S__1, J bsr 1, Js);
+        0 ->
+            NewS = exrop_next_s(S__0, S__1),
+            exrop_jump(NewS, S0, S1, J bsr 1, Js)
+    end.
+
 %% =====================================================================
 %% Ziggurat cont
 %% =====================================================================
@@ -483,9 +760,13 @@ exs1024_jump({L, RL}, AS, JL, J, N, TN) ->
 -define(NOR_INV_R, 1/?NOR_R).
 
 %% return a {sign, Random51bits, State}
+get_52({Alg=#{bits:=Bits, next:=Next}, S0}) ->
+    %% Use the high bits
+    {Int,S1} = Next(S0),
+    {?BIT(Bits - 51 - 1) band Int, Int bsr (Bits - 51), {Alg, S1}};
 get_52({Alg=#{next:=Next}, S0}) ->
     {Int,S1} = Next(S0),
-    {((1 bsl 51) band Int), Int band ((1 bsl 51)-1), {Alg, S1}}.
+    {?BIT(51) band Int, ?MASK(51, Int), {Alg, S1}}.
 
 %% Slow path
 normal_s(0, Sign, X0, State0) ->
diff --git a/lib/stdlib/test/rand_SUITE.erl b/lib/stdlib/test/rand_SUITE.erl
index 098eefeb61..51bb03f572 100644
--- a/lib/stdlib/test/rand_SUITE.erl
+++ b/lib/stdlib/test/rand_SUITE.erl
@@ -1,7 +1,7 @@
 %%
 %% %CopyrightBegin%
 %%
-%% Copyright Ericsson AB 2000-2016. All Rights Reserved.
+%% Copyright Ericsson AB 2000-2017. All Rights Reserved.
 %%
 %% Licensed under the Apache License, Version 2.0 (the "License");
 %% you may not use this file except in compliance with the License.
@@ -66,18 +66,19 @@ group(reference_jump) ->
 %% A simple helper to test without test_server during dev
 test() ->
     Tests = all(),
-    lists:foreach(fun(Test) ->
-                          try
-                              ok = ?MODULE:Test([]),
-                              io:format("~p: ok~n", [Test])
-                          catch _:Reason ->
-                                    io:format("Failed: ~p: ~p ~p~n",
-                                              [Test, Reason, erlang:get_stacktrace()])
-                          end
-                  end, Tests).
+    lists:foreach(
+      fun (Test) ->
+              try
+                  ok = ?MODULE:Test([]),
+                  io:format("~p: ok~n", [Test])
+              catch _:Reason ->
+                      io:format("Failed: ~p: ~p ~p~n",
+                                [Test, Reason, erlang:get_stacktrace()])
+              end
+      end, Tests).
 
 algs() ->
-    [exs64, exsplus, exs1024].
+    [exs64, exsplus, exsp, exrop, exs1024, exs1024s].
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
@@ -226,10 +227,10 @@ interval_float_1(0) -> ok;
 interval_float_1(N) ->
     X = rand:uniform(),
     if
-	0.0 < X, X < 1.0 ->
+	0.0 =< X, X < 1.0 ->
 	    ok;
 	true ->
-	    io:format("X=~p 0<~p<1.0~n", [X,X]),
+	    io:format("X=~p 0=<~p<1.0~n", [X,X]),
 	    exit({X, rand:export_seed()})
     end,
     interval_float_1(N-1).
@@ -246,6 +247,8 @@ reference_1(Alg) ->
     Testval = gen(Alg),
     case Refval =:= Testval of
         true -> ok;
+        false when Refval =:= not_implemented ->
+            exit({not_implemented,Alg});
         false ->
 	    io:format("Failed: ~p~n",[Alg]),
 	    io:format("Length ~p ~p~n",[length(Refval), length(Testval)]),
@@ -254,25 +257,29 @@ reference_1(Alg) ->
     end.
 
 gen(Algo) ->
-    Seed = case Algo of
-	       exsplus -> %% Printed with orig 'C' code and this seed
-		   rand:seed_s({exsplus, [12345678|12345678]});
-	       exs64 -> %% Printed with orig 'C' code and this seed
-		   rand:seed_s({exs64, 12345678});
-	       exs1024 -> %% Printed with orig 'C' code and this seed
-		   rand:seed_s({exs1024, {lists:duplicate(16, 12345678), []}});
-	       _ ->
-		   rand:seed(Algo, {100, 200, 300})
-	   end,
-    gen(?LOOP, Seed, []).
-
-gen(N, State0 = {#{max:=Max}, _}, Acc) when N > 0 ->
+    State =
+        case Algo of
+            exs64 -> %% Printed with orig 'C' code and this seed
+                rand:seed_s({exs64, 12345678});
+            _ when Algo =:= exsplus; Algo =:= exsp; Algo =:= exrop ->
+                %% Printed with orig 'C' code and this seed
+                rand:seed_s({Algo, [12345678|12345678]});
+            _ when Algo =:= exs1024; Algo =:= exs1024s ->
+                %% Printed with orig 'C' code and this seed
+                rand:seed_s({Algo, {lists:duplicate(16, 12345678), []}});
+            _ ->
+                rand:seed(Algo, {100, 200, 300})
+        end,
+    Max = range(State),
+    gen(?LOOP, State, Max, []).
+
+gen(N, State0, Max, Acc) when N > 0 ->
     {Random, State} = rand:uniform_s(Max, State0),
     case N rem (?LOOP div 100) of
-	0 -> gen(N-1, State, [Random|Acc]);
-	_ -> gen(N-1, State, Acc)
+	0 -> gen(N-1, State, Max, [Random|Acc]);
+	_ -> gen(N-1, State, Max, Acc)
     end;
-gen(_, _, Acc) -> lists:reverse(Acc).
+gen(_, _, _, Acc) -> lists:reverse(Acc).
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %% This just tests the basics so we have not made any serious errors
@@ -307,11 +314,11 @@ basic_uniform_1(N, S0, Sum, A0) when N > 0 ->
     basic_uniform_1(N-1, S, Sum+X, A);
 basic_uniform_1(0, {#{type:=Alg}, _}, Sum, A) ->
     AverN = Sum / ?LOOP,
-    io:format("~.10w: Average: ~.4f~n", [Alg, AverN]),
+    io:format("~.12w: Average: ~.4f~n", [Alg, AverN]),
     Counters = array:to_list(A),
     Min = lists:min(Counters),
     Max = lists:max(Counters),
-    io:format("~.10w: Min: ~p Max: ~p~n", [Alg, Min, Max]),
+    io:format("~.12w: Min: ~p Max: ~p~n", [Alg, Min, Max]),
 
     %% Verify that the basic statistics are ok
     %% be gentle we don't want to see to many failing tests
@@ -326,11 +333,11 @@ basic_uniform_2(N, S0, Sum, A0) when N > 0 ->
     basic_uniform_2(N-1, S, Sum+X, A);
 basic_uniform_2(0, {#{type:=Alg}, _}, Sum, A) ->
     AverN = Sum / ?LOOP,
-    io:format("~.10w: Average: ~.4f~n", [Alg, AverN]),
+    io:format("~.12w: Average: ~.4f~n", [Alg, AverN]),
     Counters = tl(array:to_list(A)),
     Min = lists:min(Counters),
     Max = lists:max(Counters),
-    io:format("~.10w: Min: ~p Max: ~p~n", [Alg, Min, Max]),
+    io:format("~.12w: Min: ~p Max: ~p~n", [Alg, Min, Max]),
 
     %% Verify that the basic statistics are ok
     %% be gentle we don't want to see to many failing tests
@@ -345,7 +352,7 @@ basic_normal_1(N, S0, Sum, Sq) when N > 0 ->
 basic_normal_1(0, {#{type:=Alg}, _}, Sum, SumSq) ->
     Mean = Sum / ?LOOP,
     StdDev =  math:sqrt((SumSq - (Sum*Sum/?LOOP))/(?LOOP - 1)),
-    io:format("~.10w: Average: ~7.4f StdDev ~6.4f~n", [Alg, Mean, StdDev]),
+    io:format("~.12w: Average: ~7.4f StdDev ~6.4f~n", [Alg, Mean, StdDev]),
     %% Verify that the basic statistics are ok
     %% be gentle we don't want to see to many failing tests
     abs(Mean) < 0.005 orelse ct:fail({average, Alg, Mean}),
@@ -365,7 +372,7 @@ plugin(Config) when is_list(Config) ->
                           {V2, S2} = rand:uniform_s(S1),
                           true = is_float(V2),
                           S2
-                  end, crypto_seed(), lists:seq(1, 200)),
+                  end, crypto64_seed(), lists:seq(1, 200)),
             ok
     catch
         error:low_entropy ->
@@ -375,86 +382,220 @@ plugin(Config) when is_list(Config) ->
     end.
 
 %% Test implementation
-crypto_seed() ->
-    {#{type=>crypto,
-       max=>(1 bsl 64)-1,
-       next=>fun crypto_next/1,
-       uniform=>fun crypto_uniform/1,
-       uniform_n=>fun crypto_uniform_n/2},
+crypto64_seed() ->
+    {#{type=>crypto64,
+       bits=>64,
+       next=>fun crypto64_next/1,
+       uniform=>fun crypto64_uniform/1,
+       uniform_n=>fun crypto64_uniform_n/2},
      <<>>}.
 
 %% Be fair and create bignums i.e. 64bits otherwise use 58bits
-crypto_next(<<Num:64, Bin/binary>>) ->
+crypto64_next(<<Num:64, Bin/binary>>) ->
     {Num, Bin};
-crypto_next(_) ->
-    crypto_next(crypto:strong_rand_bytes((64 div 8)*100)).
+crypto64_next(_) ->
+    crypto64_next(crypto:strong_rand_bytes((64 div 8)*100)).
 
-crypto_uniform({Api, Data0}) ->
-    {Int, Data} = crypto_next(Data0),
+crypto64_uniform({Api, Data0}) ->
+    {Int, Data} = crypto64_next(Data0),
     {Int / (1 bsl 64), {Api, Data}}.
 
-crypto_uniform_n(N, {Api, Data0}) when N < (1 bsl 64) ->
-    {Int, Data} = crypto_next(Data0),
+crypto64_uniform_n(N, {Api, Data0}) when N < (1 bsl 64) ->
+    {Int, Data} = crypto64_next(Data0),
     {(Int rem N)+1, {Api, Data}};
-crypto_uniform_n(N, State0) ->
-    {F,State} = crypto_uniform(State0),
+crypto64_uniform_n(N, State0) ->
+    {F,State} = crypto64_uniform(State0),
     {trunc(F * N) + 1, State}.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
 %% Not a test but measures the time characteristics of the different algorithms
-measure(Suite) when is_atom(Suite) -> [];
-measure(_Config) ->
-    ct:timetrap({minutes,15}), %% valgrind needs a lot of time
+measure(Config) ->
+    ct:timetrap({minutes,30}), %% valgrind needs a lot of time
+    case ct:get_timetrap_info() of
+        {_,{_,1}} -> % No scaling
+            do_measure(Config);
+        {_,{_,Scale}} ->
+            {skip,{will_not_run_in_scaled_time,Scale}}
+    end.
+
+do_measure(_Config) ->
     Algos =
         try crypto:strong_rand_bytes(1) of
-            <<_>> -> [crypto64]
+            <<_>> -> [crypto64, crypto]
         catch
             error:low_entropy -> [];
             error:undef -> []
         end ++ algs(),
-    io:format("RNG uniform integer performance~n",[]),
-    _ = measure_1(random, fun(State) -> {int, random:uniform_s(10000, State)} end),
-    _ = [measure_1(Algo, fun(State) -> {int, rand:uniform_s(10000, State)} end) || Algo <- Algos],
-    io:format("RNG uniform float performance~n",[]),
-    _ = measure_1(random, fun(State) -> {uniform, random:uniform_s(State)} end),
-    _ = [measure_1(Algo, fun(State) -> {uniform, rand:uniform_s(State)} end) || Algo <- Algos],
-    io:format("RNG normal float performance~n",[]),
-    io:format("~.10w: not implemented (too few bits)~n", [random]),
-    _ = [measure_1(Algo, fun(State) -> {normal, rand:normal_s(State)} end) || Algo <- Algos],
+    %%
+    ct:pal("RNG uniform integer performance~n",[]),
+    TMark1 =
+        measure_1(
+          random,
+          fun (_) -> 10000 end,
+          undefined,
+          fun (Range, State) ->
+                  {int, random:uniform_s(Range, State)}
+          end),
+    _ =
+        [measure_1(
+           Algo,
+           fun (_) -> 10000 end,
+           TMark1,
+           fun (Range, State) ->
+                   {int, rand:uniform_s(Range, State)}
+           end) || Algo <- Algos],
+    %%
+    ct:pal("~nRNG uniform integer 2^(N-1)  performance~n",[]),
+    RangeTwoPowFun = fun (State) -> quart_range(State) bsl 1 end,
+    TMark2 =
+        measure_1(
+          random,
+          RangeTwoPowFun,
+          undefined,
+          fun (Range, State) ->
+                  {int, random:uniform_s(Range, State)}
+          end),
+    _ =
+        [measure_1(
+           Algo,
+           RangeTwoPowFun,
+           TMark2,
+           fun (Range, State) ->
+                   {int, rand:uniform_s(Range, State)}
+           end) || Algo <- Algos],
+    %%
+    ct:pal("~nRNG uniform integer 3*2^(N-2)+1  performance~n",[]),
+    RangeLargeFun = fun (State) -> 3 * quart_range(State) + 1 end,
+    TMark3 =
+        measure_1(
+          random,
+          RangeLargeFun,
+          undefined,
+          fun (Range, State) ->
+                  {int, random:uniform_s(Range, State)}
+          end),
+    _ =
+        [measure_1(
+           Algo,
+           RangeLargeFun,
+           TMark3,
+           fun (Range, State) ->
+                   {int, rand:uniform_s(Range, State)}
+           end) || Algo <- Algos],
+    %%
+    ct:pal("~nRNG uniform integer 2^128  performance~n",[]),
+    TMark4 =
+        measure_1(
+          random,
+          fun (_) -> 1 bsl 128 end,
+          undefined,
+          fun (Range, State) ->
+                  {int, random:uniform_s(Range, State)}
+          end),
+    _ =
+        [measure_1(
+           Algo,
+           fun (_) -> 1 bsl 128 end,
+           TMark4,
+           fun (Range, State) ->
+                   {int, rand:uniform_s(Range, State)}
+           end) || Algo <- Algos],
+    %%
+    ct:pal("~nRNG uniform integer 2^128 + 1  performance~n",[]),
+    TMark5 =
+        measure_1(
+          random,
+          fun (_) -> (1 bsl 128) + 1 end,
+          undefined,
+          fun (Range, State) ->
+                  {int, random:uniform_s(Range, State)}
+          end),
+    _ =
+        [measure_1(
+           Algo,
+           fun (_) -> (1 bsl 128) + 1 end,
+           TMark5,
+           fun (Range, State) ->
+                   {int, rand:uniform_s(Range, State)}
+           end) || Algo <- Algos],
+    %%
+    ct:pal("~nRNG uniform float performance~n",[]),
+    TMark6 =
+        measure_1(
+          random,
+          fun (_) -> 0 end,
+          undefined,
+          fun (_, State) ->
+                  {uniform, random:uniform_s(State)}
+          end),
+    _ =
+        [measure_1(
+           Algo, 
+           fun (_) -> 0 end,
+           TMark6,
+           fun (_, State) ->
+                   {uniform, rand:uniform_s(State)}
+           end) || Algo <- Algos],
+    %%
+    ct:pal("~nRNG normal float performance~n",[]),
+    io:format("~.12w: not implemented (too few bits)~n", [random]),
+    _ = [measure_1(
+           Algo,
+           fun (_) -> 0 end,
+           TMark6,
+           fun (_, State) ->
+                   {normal, rand:normal_s(State)}
+           end) || Algo <- Algos],
     ok.
 
-measure_1(Algo, Gen) ->
+measure_1(Algo, RangeFun, TMark, Gen) ->
     Parent = self(),
-    Seed = fun(crypto64) -> crypto_seed();
-	      (random) -> random:seed(os:timestamp()), get(random_seed);
-	      (Alg) -> rand:seed_s(Alg)
-	   end,
-
-    Pid = spawn_link(fun() ->
-			     Fun = fun() -> measure_2(?LOOP, Seed(Algo), Gen) end,
-			     {Time, ok} = timer:tc(Fun),
-			     io:format("~.10w: ~pµs~n", [Algo, Time]),
-			     Parent ! {self(), ok},
-			     normal
-		     end),
+    Seed =
+        case Algo of
+            crypto64 ->
+                crypto64_seed();
+            crypto ->
+                crypto:rand_seed_s();
+            random ->
+                random:seed(os:timestamp()), get(random_seed);
+            _ ->
+                rand:seed_s(Algo)
+        end,
+    Range = RangeFun(Seed),
+    Pid = spawn_link(
+            fun() ->
+                    Fun = fun() -> measure_2(?LOOP, Range, Seed, Gen) end,
+                    {Time, ok} = timer:tc(Fun),
+                    Percent =
+                        case TMark of
+                            undefined -> 100;
+                            _ -> (Time * 100 + 50) div TMark
+                        end,
+                    io:format(
+                      "~.12w: ~p ns ~p% [16#~.16b]~n",
+                      [Algo, (Time * 1000 + 500) div ?LOOP, Percent, Range]),
+                    Parent ! {self(), Time},
+                    normal
+            end),
     receive
 	{Pid, Msg} -> Msg
     end.
 
-measure_2(N, State0, Fun) when N > 0 ->
-    case Fun(State0) of
+measure_2(N, Range, State0, Fun) when N > 0 ->
+    case Fun(Range, State0) of
 	{int, {Random, State}}
-	  when is_integer(Random), Random >= 1, Random =< 100000 ->
-	    measure_2(N-1, State, Fun);
-	{uniform, {Random, State}} when is_float(Random), Random > 0, Random < 1 ->
-	    measure_2(N-1, State, Fun);
+	  when is_integer(Random), Random >= 1, Random =< Range ->
+	    measure_2(N-1, Range, State, Fun);
+	{uniform, {Random, State}}
+          when is_float(Random), 0.0 =< Random, Random < 1.0 ->
+	    measure_2(N-1, Range, State, Fun);
 	{normal, {Random, State}} when is_float(Random) ->
-	    measure_2(N-1, State, Fun);
+	    measure_2(N-1, Range, State, Fun);
 	Res ->
 	    exit({error, Res, State0})
     end;
-measure_2(0, _, _) -> ok.
+measure_2(0, _, _, _) -> ok.
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 %% The jump sequence tests has two parts
@@ -479,36 +620,43 @@ reference_jump_1(Alg) ->
 	    io:format("Failed: ~p~n",[Alg]),
 	    io:format("Length ~p ~p~n",[length(Refval), length(Testval)]),
 	    io:format("Head ~p ~p~n",[hd(Refval), hd(Testval)]),
+	    io:format("Vals ~p ~p~n",[Refval, Testval]),
 	    exit(wrong_value)
     end.
 
 gen_jump_1(Algo) ->
-    Seed = case Algo of
-	       exsplus -> %% Printed with orig 'C' code and this seed
-		   rand:seed_s({exsplus, [12345678|12345678]});
-	       exs1024 -> %% Printed with orig 'C' code and this seed
-		   rand:seed_s({exs1024, {lists:duplicate(16, 12345678), []}});
-	       exs64 -> %% Test exception of not_implemented notice
-	       try rand:jump(rand:seed_s(exs64))
-	       catch
-	            error:not_implemented -> not_implemented
-	       end;
-	       _ -> % unimplemented
-		   not_implemented
-	   end,
-    case Seed of
+    State =
+        case Algo of
+            exs64 -> %% Test exception of not_implemented notice
+                try rand:jump(rand:seed_s(exs64))
+                catch
+                    error:not_implemented -> not_implemented
+                end;
+            _ when Algo =:= exsplus; Algo =:= exsp; Algo =:= exrop ->
+                %% Printed with orig 'C' code and this seed
+                rand:seed_s({Algo, [12345678|12345678]});
+            _ when Algo =:= exs1024; Algo =:= exs1024s ->
+                %% Printed with orig 'C' code and this seed
+                rand:seed_s({Algo, {lists:duplicate(16, 12345678), []}});
+            _ -> % unimplemented
+                not_implemented
+        end,
+    case State of
         not_implemented -> [not_implemented];
-        S -> gen_jump_1(?LOOP_JUMP, S, [])
+        _ ->
+            Max = range(State),
+            gen_jump_1(?LOOP_JUMP, State, Max, [])
     end.
 
-gen_jump_1(N, State0 = {#{max:=Max}, _}, Acc) when N > 0 ->
+gen_jump_1(N, State0, Max, Acc) when N > 0 ->
     {_, State1} = rand:uniform_s(Max, State0),
     {Random, State2} = rand:uniform_s(Max, rand:jump(State1)),
     case N rem (?LOOP_JUMP div 100) of
-	0 -> gen_jump_1(N-1, State2, [Random|Acc]);
-	_ -> gen_jump_1(N-1, State2, Acc)
+	0 -> gen_jump_1(N-1, State2, Max, [Random|Acc]);
+	_ -> gen_jump_1(N-1, State2, Max, Acc)
     end;
-gen_jump_1(_, _, Acc) -> lists:reverse(Acc).
+gen_jump_1(_, _, _, Acc) -> lists:reverse(Acc).
+
 
 %% Check if each algorithm generates the proper jump sequence
 %% with the internal state in the process dictionary.
@@ -530,25 +678,26 @@ reference_jump_0(Alg) ->
 
 gen_jump_0(Algo) ->
     Seed = case Algo of
-	       exsplus -> %% Printed with orig 'C' code and this seed
-		   rand:seed({exsplus, [12345678|12345678]});
-	       exs1024 -> %% Printed with orig 'C' code and this seed
-		   rand:seed({exs1024, {lists:duplicate(16, 12345678), []}});
 	       exs64 -> %% Test exception of not_implemented notice
-	       try
-               _ = rand:seed(exs64),
-               rand:jump()
-	       catch
-	            error:not_implemented -> not_implemented
-	       end;
+                   try
+                       _ = rand:seed(exs64),
+                       rand:jump()
+                   catch
+                       error:not_implemented -> not_implemented
+                   end;
+	       _ when Algo =:= exsplus; Algo =:= exsp; Algo =:= exrop ->
+                   %% Printed with orig 'C' code and this seed
+		   rand:seed({Algo, [12345678|12345678]});
+	       _ when Algo =:= exs1024; Algo =:= exs1024s ->
+                   %% Printed with orig 'C' code and this seed
+		   rand:seed({Algo, {lists:duplicate(16, 12345678), []}});
 	       _ -> % unimplemented
 		   not_implemented
 	   end,
     case Seed of
         not_implemented -> [not_implemented];
-        S ->
-            {Seedmap=#{}, _} = S,
-            Max = maps:get(max, Seedmap),
+        _ ->
+            Max = range(Seed),
             gen_jump_0(?LOOP_JUMP, Max, [])
     end.
 
@@ -643,9 +792,77 @@ reference_val(exsplus) ->
      16#6c6145ffa1169d,16#18ec2c393d45359,16#1f1a5f256e7130c,16#131cc2f49b8004f,
      16#36f715a249f4ec2,16#1c27629826c50d3,16#914d9a6648726a,16#27f5bf5ce2301e8,
      16#3dd493b8012970f,16#be13bed1e00e5c,16#ceef033b74ae10,16#3da38c6a50abe03,
-     16#15cbd1a421c7a8c,16#22794e3ec6ef3b1,16#26154d26e7ea99f,16#3a66681359a6ab6].
+     16#15cbd1a421c7a8c,16#22794e3ec6ef3b1,16#26154d26e7ea99f,16#3a66681359a6ab6];
+
+reference_val(exsp) ->
+    reference_val(exsplus);
+reference_val(exs1024s) ->
+    reference_val(exs1024);
+reference_val(exrop) ->
+%% #include <stdint.h>
+%% #include <stdio.h>
+%%
+%% uint64_t s[2];
+%% uint64_t next(void);
+%% /* Xoroshiro116+ PRNG here */
+%%
+%% int main(char *argv[]) {
+%%     int n;
+%%     uint64_t r;
+%%     s[0] = 12345678;
+%%     s[1] = 12345678;
+%%
+%%     for (n = 1000000;  n > 0;  n--) {
+%%         r = next();
+%%         if ((n % 10000) == 0) {
+%%             printf("%llu,", (unsigned long long) (r + 1));
+%%         }
+%%     }
+%%     printf("\n");
+%% }
+    [24691357,29089185972758626,135434857127264790,
+     277209758236304485,101045429972817342,
+     241950202080388093,283018380268425711,268233672110762489,
+     173241488791227202,245038518481669421,
+     253627577363613736,234979870724373477,115607127954560275,
+     96445882796968228,166106849348423677,
+     83614184550774836,109634510785746957,68415533259662436,
+     12078288820568786,246413981014863011,
+     96953486962147513,138629231038332640,206078430370986460,
+     11002780552565714,238837272913629203,
+     60272901610411077,148828243883348685,203140738399788939,
+     131001610760610046,30717739120305678,
+     262903815608472425,31891125663924935,107252017522511256,
+     241577109487224033,263801934853180827,
+     155517416581881714,223609336630639997,112175917931581716,
+     16523497284706825,201453767973653420,
+     35912153101632769,211525452750005043,96678037860996922,
+     70962216125870068,107383886372877124,
+     223441708670831233,247351119445661499,233235283318278995,
+     280646255087307741,232948506631162445,
+     %%
+     117394974124526779,55395923845250321,274512622756597759,
+     31754154862553492,222645458401498438,
+     161643932692872858,11771755227312868,93933211280589745,
+     92242631276348831,197206910466548143,
+     150370169849735808,229903773212075765,264650708561842793,
+     30318996509793571,158249985447105184,
+     220423733894955738,62892844479829080,112941952955911674,
+     203157000073363030,54175707830615686,
+     50121351829191185,115891831802446962,62298417197154985,
+     6569598473421167,69822368618978464,
+     176271134892968134,160793729023716344,271997399244980560,
+     59100661824817999,150500611720118722,
+     23707133151561128,25156834940231911,257788052162304719,
+     176517852966055005,247173855600850875,
+     83440973524473396,94711136045581604,154881198769946042,
+     236537934330658377,152283781345006019,
+     250789092615679985,78848633178610658,72059442721196128,
+     98223942961505519,191144652663779840,
+     102425686803727694,89058927716079076,80721467542933080,
+     8462479817391645,2774921106204163].
 
-%%%
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 
 reference_jump_val(exsplus) ->
     [82445318862816932, 145810727464480743, 16514517716894509, 247642377064868650,
@@ -701,4 +918,93 @@ reference_jump_val(exs1024) ->
      17936751184378118743, 4224632875737239207, 15888641556987476199, 9586888813112229805,
      9476861567287505094, 14909536929239540332, 17996844556292992842, 2699310519182298856];
 
-reference_jump_val(exs64) -> [not_implemented].
+reference_jump_val(exsp) ->
+    reference_jump_val(exsplus);
+reference_jump_val(exs1024s) ->
+    reference_jump_val(exs1024);
+reference_jump_val(exs64) -> [not_implemented];
+reference_jump_val(exrop) ->
+%% #include <stdint.h>
+%% #include <stdio.h>
+%%
+%% uint64_t s[2];
+%% uint64_t next(void);
+%% /* Xoroshiro116+ PRNG here */
+%%
+%% int main(char *argv[]) {
+%%     int n;
+%%     uint64_t r;
+%%     s[0] = 12345678;
+%%     s[1] = 12345678;
+
+%%     for (n = 1000;  n > 0;  n--) {
+%%         next();
+%%         jump();
+%%         r = next();
+%%         if ((n % 10) == 0) {
+%%             printf("%llu,", (unsigned long long) (r + 1));
+%%         }
+%%     }
+%%     printf("\n");
+%% }
+    [60301713907476001,135397949584721850,4148159712710727,
+     110297784509908316,18753463199438866,
+     106699913259182846,2414728156662676,237591345910610406,
+     48519427605486503,38071665570452612,
+     235484041375354592,45428997361037927,112352324717959775,
+     226084403445232507,270797890380258829,
+     160587966336947922,80453153271416820,222758573634013699,
+     195715386237881435,240975253876429810,
+     93387593470886224,23845439014202236,235376123357642262,
+     22286175195310374,239068556844083490,
+     120126027410954482,250690865061862527,113265144383673111,
+     57986825640269127,206087920253971490,
+     265971029949338955,40654558754415167,185972161822891882,
+     72224917962819036,116613804322063968,
+     129103518989198416,236110607653724474,98446977363728314,
+     122264213760984600,55635665885245081,
+     42625530794327559,288031254029912894,81654312180555835,
+     261800844953573559,144734008151358432,
+     77095621402920587,286730580569820386,274596992060316466,
+     97977034409404188,5517946553518132,
+     %%
+     56460292644964432,252118572460428657,38694442746260303,
+     165653145330192194,136968555571402812,
+     64905200201714082,257386366768713186,22702362175273017,
+     208480936480037395,152926769756967697,
+     256751159334239189,130982960476845557,21613531985982870,
+     87016962652282927,130446710536726404,
+     188769410109327420,282891129440391928,251807515151187951,
+     262029034126352975,30694713572208714,
+     46430187445005589,176983177204884508,144190360369444480,
+     14245137612606100,126045457407279122,
+     169277107135012393,42599413368851184,130940158341360014,
+     113412693367677211,119353175256553456,
+     96339829771832349,17378172025472134,110141940813943768,
+     253735613682893347,234964721082540068,
+     85668779779185140,164542570671430062,18205512302089755,
+     282380693509970845,190996054681051049,
+     250227633882474729,171181147785250210,55437891969696407,
+     241227318715885854,77323084015890802,
+     1663590009695191,234064400749487599,222983191707424780,
+     254956809144783896,203898972156838252].
+
+%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+
+%% The old algorithms used a range 2^N - 1 for their reference val
+%% tests, which was incorrect but works as long as you do not draw
+%% the value 2^N, which is very unlikely.  It was not possible
+%% to simply correct the range to 2^N due to another incorrectness
+%% in that the old algorithms changed to using the broken
+%% (multiply a float approach with too few bits) approach for
+%% ranges >= 2^N.  This function digs out the range to use
+%% for the reference tests for old and new algorithms.
+range({#{bits:=Bits}, _}) -> 1 bsl Bits;
+range({#{max:=Max}, _}) -> Max; %% Old incorrect range
+range({_, _, _}) -> 51. % random
+
+
+quart_range({#{bits:=Bits}, _}) -> 1 bsl (Bits - 2);
+quart_range({#{max:=Max}, _}) -> (Max bsr 2) + 1;
+quart_range({#{}, _}) -> 1 bsl 62; % crypto
+quart_range({_, _, _}) -> 1 bsl 49. % random
-- 
cgit v1.2.3