Python: Repeat the same random numbers using seed
Python has a module called random that can provide pseudo random numbers.
In a nutshell that means that the numbers seem to be random and can be used for various applications as if they were indeed random, but in fact they are just a really strange series of fixed numbers.
That means you should not use them for certain types of applications, e.g. encryption, and it also means that they are repeatable.
If you start from the same place in the series twice, then you get the exact same «random» numbers.
The way to set this starting point in Python's random module is to call the random.seed() function and give it an arbitrary number, e.g. 42 would be perfect.
In this simple script we just load the random module and call the random.random() method.
examples/python/fixed_seed/single_random.py
import random

print(random.random())
Every time we run this script we get a different number.
0.511318181959
0.771417342337
0.565304847619
This happens because when Python loads the random module it seeds the generator automatically, using the current time or, when available, randomness provided by the operating system. As that starting point changes on every run, the casual viewer sees random numbers.
examples/python/fixed_seed/single_random_fixed_seed.py
import random

random.seed(42)
print(random.random())
If we run this script several times we’ll always get back the same «random» number.
0.639426798458
0.639426798458
0.639426798458
Why are repeatable «random» numbers a good thing?
You might ask. Well, they are good if you would like to make sure you can rerun the same sequence of events even though the values are (almost) randomly created. For example when you write some test code. With tests you usually want the process to be repeatable.
So if you have a function that uses random numbers to calculate something and it returns a different result on every call, then it is hard to check whether it works properly. If you fix the random numbers, you will be able to observe the same result twice.
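For instance, a test can seed the generator, call the code under test, and then repeat the exact same sequence. This is only a minimal sketch; the roll_dice function is made up for illustration and is not part of the article:

import random
import unittest

def roll_dice():
    # The "function under test": something that relies on random numbers.
    return random.randint(1, 6)

class TestRollDice(unittest.TestCase):
    def test_rolls_are_repeatable(self):
        random.seed(42)
        first = [roll_dice() for _ in range(5)]

        random.seed(42)
        second = [roll_dice() for _ in range(5)]

        # With the same seed we get the same «random» rolls.
        self.assertEqual(first, second)

if __name__ == '__main__':
    unittest.main()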
Anyway, what happens if your code loads other modules that also use random numbers?
examples/python/fixed_seed/main.py
import random
import other

#random.seed(42)

def f():
    print(random.random())

f()
other.g()
examples/python/fixed_seed/other.py
import random

def g():
    print(random.random())
If we run python main.py several times we’ll get different number-pairs on every run:
0.864327113674 0.675706432586
0.221254773857 0.0473047970533
0.415061037659 0.718553482388
If we enable the call to random.seed(42) we get the same two numbers on every run:
0.639426798458 0.0250107552227
0.639426798458 0.0250107552227
0.639426798458 0.0250107552227
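The reason a single random.seed(42) call in main.py also pins down the number printed by other.py is that the random module keeps one shared, module-level generator. As a side note (this is a small sketch, not part of the original example), a module that needs its own independent stream can create a private random.Random instance instead:

import random

# A private generator with its own seed, independent of random.seed()
# calls made anywhere else in the program.
rng = random.Random(42)

def g():
    print(rng.random())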
[PYTHON] [TensorFlow 2.x (tf.keras)] Fixing the random number seed to improve reproducibility
I will show you how to fix the random number seed with TensorFlow 2.x (tf.keras).
Execution environment
The code used for the test can be found here (https://github.com/tokusumi/tf-keras-random-seed).
Background
In machine-learning development there are common requests such as «I want to make training reproducible» and «I want to fix the model’s initial values for testing». Since differences in the initial weight values affect the training result, fixing the initial values helps with both of these problems.
Random numbers are used to generate the initial weight values, and those random numbers are generated from a random number seed. By default, TensorFlow uses a varying seed, so a model with different initial values is generated each time. This article therefore aims to improve reproducibility by fixing the random number seed.
Fixing the random seed
In addition to TensorFlow, we also fix the seeds for NumPy and the built-in random module. In summary, the seed-fixing function can be implemented as follows.
import tensorflow as tf
import numpy as np
import random
import os

def set_seed(seed=200):
    tf.random.set_seed(seed)

    # optional
    # for numpy.random
    np.random.seed(seed)
    # for built-in random
    random.seed(seed)
    # for hash seed
    os.environ["PYTHONHASHSEED"] = str(seed)
It is used as follows. If fixing only TensorFlow’s random seed is sufficient, you can replace set_seed with tf.random.set_seed.
set_seed(0)
toy_model = tf.keras.Sequential(
    [tf.keras.layers.Dense(2, input_shape=(10,))]
)

# Some processing.

# Reproduce the model
set_seed(0)
reproduced_toy_model = tf.keras.Sequential(
    [tf.keras.layers.Dense(2, input_shape=(10,))]
)
reproduced_toy_model has the same initial value (weight) as the previously generated model toy_model . In other words, it has been reproduced.
If you do not use set_seed, reproduced_toy_model and toy_model will have completely different initial values, and the result is not reproducible.
In addition to tf.keras.Sequential, the same approach works with the Functional API and model subclassing.
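As a rough sketch of the Functional API case (reusing the set_seed helper defined above; the layer sizes here are arbitrary):

import tensorflow as tf

set_seed(0)
inputs = tf.keras.Input(shape=(10,))
outputs = tf.keras.layers.Dense(2)(inputs)
functional_model = tf.keras.Model(inputs, outputs)

# Calling set_seed again before rebuilding gives the same initial weights.
set_seed(0)
inputs = tf.keras.Input(shape=(10,))
outputs = tf.keras.layers.Dense(2)(inputs)
reproduced_functional_model = tf.keras.Model(inputs, outputs)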
Let’s sort out the method of fixing the random number seed ( set_seed ) a little more.
About tf.random.set_seed
The behavior of tf.random.set_seed needs a little attention.
First, after using tf.random.set_seed , try using a function that uses random numbers ( tf.random.uniform : sampling values randomly from a uniform distribution) several times.
tf.random.set_seed(0)
tf.random.uniform([1])  # => [0.29197514]
tf.random.uniform([1])  # => [0.5554141]  (different value!)
tf.random.uniform([1])  # => [0.1952138]  (different value!!)
tf.random.uniform([1])  # => [0.17513537] (different value...)
Each call produced a different value, so at first glance it looks like this alone does not give reproducibility. However, call tf.random.set_seed again, as follows.
tf.random.set_seed(0)
tf.random.uniform([1])  # => [0.29197514] (A)
tf.random.uniform([1])  # => [0.5554141]  (B)

tf.random.set_seed(0)
tf.random.uniform([1])  # => [0.29197514] (reproduces A)
tf.random.uniform([1])  # => [0.5554141]  (reproduces B)
In this way, the sequence of outputs is reproduced from the point where tf.random.set_seed is called (even though tf.random.uniform is a function that returns random values).
So, for example, if you call tf.random.set_seed just before creating a model instance (with Sequential, the Functional API or subclassing), the generated model will have the same initial values every time.
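One way to convince yourself of this (an illustrative check, assuming the toy_model and reproduced_toy_model built earlier) is to compare the weight arrays directly:

import numpy as np

# Both models were created right after set_seed(0), so their weight
# arrays should match element for element.
for w1, w2 in zip(toy_model.get_weights(), reproduced_toy_model.get_weights()):
    np.testing.assert_allclose(w1, w2)

print("initial weights are identical")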
Supplement
TensorFlow has layers and functions that allow you to pass seed as an argument.
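For reference, passing a seed explicitly looks roughly like this (the concrete layer and initializer choices are just examples, not taken from the article):

import tensorflow as tf

# A per-initializer seed for the kernel of a Dense layer...
dense = tf.keras.layers.Dense(
    2,
    kernel_initializer=tf.keras.initializers.GlorotUniform(seed=0),
)

# ...and a per-layer seed for Dropout.
dropout = tf.keras.layers.Dropout(0.5, seed=1)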
However, explicitly specifying a seed or an initializer for every individual layer does not scale well as the model grows.
In addition, some of these arguments do not give reproducible results unless tf.random.set_seed is also used.
So, even if there are only a few places you would need to fix, try tf.random.set_seed first.
Summary
In TensorFlow 2.x (tf.keras) you can use tf.random.set_seed to fix the random seed.
In particular, it becomes possible to generate a model with the same initial weight values each time, so you can expect improved reproducibility.
Python fix random seed
The «random» module with the same seed produces a different sequence of numbers in Python 2 vs 3. If reproducibility is important to you, use the «numpy.random» module instead.
With the random module
Python 2.7
import random

random.seed(42)

print(random.random())
print(random.random())
print(random.random())
print(random.random())
print(random.random())
0.025010755222666936
0.22321073814882275
0.6766994874229113
0.08693883262941615
0.029797219438070344
Python 3.5
import random

random.seed(42)

print(random.random())
print(random.random())
print(random.random())
print(random.random())
print(random.random())
0.025010755222666936
0.24489185380347622
0.7364712141640124
0.5904925124490397
0.029797219438070344
With the numpy.random module
Python 2.7
import numpy as np

np.random.seed(42)

print(np.random.random())
print(np.random.random())
print(np.random.random())
print(np.random.random())
print(np.random.random())
0.3745401188473625
0.9507143064099162
0.7319939418114051
0.5986584841970366
0.15601864044243652
Python 3.5
import numpy as np

np.random.seed(42)

print(np.random.random())
print(np.random.random())
print(np.random.random())
print(np.random.random())
print(np.random.random())
0.3745401188473625
0.9507143064099162
0.7319939418114051
0.5986584841970366
0.15601864044243652