Normalizing Observations in Machine Learning: A Comprehensive Guide

Sep 10, 2024



In machine learning and reinforcement learning (RL), data normalization is a crucial preprocessing step that keeps inputs on comparable scales, so that no input dominates training simply because of its magnitude. Normalizing observations can improve the efficiency and stability of training algorithms, particularly those sensitive to the scale of input data, such as gradient-based methods.

This comprehensive guide explores various techniques for normalizing observations, starting with basic methods using scikit-learn’s Normalizer and advancing to sophisticated strategies in RL environments using Maze. We'll cover:

  • The importance of observation normalization
  • Basic normalization using scikit-learn
  • Advanced normalization strategies in RL with Maze
  • Practical examples and code snippets
  • Custom normalization strategies

Part 1: Basic Observation Normalization with Scikit-Learn

Problem: Rescaling Features to Unit Norm

You have a set of feature values for different observations, and you want to rescale them so that the norm (or length) of each observation becomes 1. This removes differences in overall magnitude between observations, which matters for distance- and similarity-based methods such as k-nearest neighbors or clustering, for example when comparing documents of different lengths.

Solution: Using Normalizer with a Norm Argument

Scikit-learn provides a built-in tool, Normalizer, which scales individual samples to have unit norm. Here's an example:

# Load libraries
import numpy as np
from sklearn.preprocessing import Normalizer

# Create feature matrix
features = np.array([
    [0.5, 0.5],
    [1.1, 3.4],
    [1.5, 20.2],
    [1.63, 34.4],
    [10.9, 3.3],
])

# Create normalizer
normalizer = Normalizer(norm="l2")

# Transform feature matrix
normalized_features = normalizer.transform(features)

# Show the normalized feature matrix
print(normalized_features)


[[0.70710678 0.70710678]
[0.30782029 0.95144452]
[0.07405353 0.99725427]
[0.04733062 0.99887928]
[0.95709822 0.28976368]]

Discussion: Understanding Normalization

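Normalization ensures that each observation (a row in your dataset) has a unit norm. Unlike standardization or min-max scaling, which rescale features (columns), Normalizer rescales each observation (row) independently. This is particularly useful when observations differ widely in overall magnitude but only the relative proportions of their features matter, or when you intend to use methods that compare the direction of feature vectors.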
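To make the row-wise behavior concrete, here is a minimal sketch that reproduces L2 normalization by hand with NumPy and checks that every row ends up with length 1. It reuses the feature matrix from the example above; the variable names `row_norms`, `manual_l2`, and `unit_lengths` are illustrative and not part of scikit-learn.

```python
import numpy as np

# Same feature matrix as in the scikit-learn example
features = np.array([
    [0.5, 0.5],
    [1.1, 3.4],
    [1.5, 20.2],
    [1.63, 34.4],
    [10.9, 3.3],
])

# Euclidean (L2) length of each row, kept as a column so it broadcasts row-wise
row_norms = np.linalg.norm(features, ord=2, axis=1, keepdims=True)

# Divide every row by its own length
manual_l2 = features / row_norms

# Length of each row after rescaling (1.0 up to floating-point error)
unit_lengths = np.linalg.norm(manual_l2, axis=1)
print(manual_l2)
print(unit_lengths)
```

This sketch assumes no row is all zeros; dividing such a row by its norm would produce NaN values.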

Example: L2 Normalization

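# Transform feature matrix using L2 norm
features_l2_norm = Normalizer(norm="l2").transform(features)

# Show the normalized feature matrix
print(features_l2_norm)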


[[0.70710678 0.70710678]
[0.30782029 0.95144452]
[0.07405353 0.99725427]
[0.04733062 0.99887928]
[0.95709822 0.28976368]]

Example: L1 Normalization (Manhattan Norm)

Alternatively, you can use the L1 norm (Manhattan norm), where the norm is the sum of the absolute values of the features:

# Transform feature matrix using L1 norm
features_l1_norm = Normalizer(norm="l1").transform(features)

# Show the normalized feature matrix
print(features_l1_norm)


[[0.5        0.5       ]
[0.24444444 0.75555556]
[0.06912442 0.93087558]
[0.04524008 0.95475992]
[0.76760563 0.23239437]]

Key Differences Between L1 and L2 Norms

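  • L1 Norm: Rescales each observation so that the absolute values of its features sum to 1. For non-negative data such as word counts, each value becomes a proportion of the row total, which makes the result easy to interpret.
  • L2 Norm: Rescales each observation so that the square root of the sum of its squared feature values (its Euclidean length) equals 1. After L2 normalization, the dot product of two observations equals their cosine similarity. Because values are squared, a single large feature influences the L2 norm more than it influences the L1 norm.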
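The two definitions can be checked on a single small vector. The sketch below is illustrative only: the vector `v` is made up, and it includes a negative entry to show that the L1 norm sums absolute values rather than raw values.

```python
import numpy as np

# Hypothetical observation with one negative feature
v = np.array([3.0, -4.0])

l1_norm = np.sum(np.abs(v))        # |3| + |-4| = 7
l2_norm = np.sqrt(np.sum(v ** 2))  # sqrt(9 + 16) = 5

v_l1 = v / l1_norm  # absolute values now sum to 1
v_l2 = v / l2_norm  # Euclidean length is now 1

print("L1-normalized:", v_l1)
print("L2-normalized:", v_l2)
print("Sum of absolute values after L1:", np.sum(np.abs(v_l1)))
print("Euclidean length after L2:", np.sqrt(np.sum(v_l2 ** 2)))
```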

In the L1 normalization example, all feature values are non-negative, so the sum of the values in each observation equals 1:

# Sum of the first observation's values after L1 normalization
print("Sum of the first observation's values:", np.sum(features_l1_norm[0]))


Sum of the first observation's values: 1.0

Practical Application of Normalization

Normalization is essential in:

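  • Distance-based algorithms: Removes differences in overall magnitude between observations, so comparisons depend on the direction of the feature vectors (as in cosine similarity) rather than on their length.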
  • Text classification: In natural language processing, normalizing word frequency vectors prevents bias towards longer documents.
  • Neural networks: Helps in faster convergence during training.
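The text-classification point can be illustrated with a small sketch. The word counts below are hypothetical: the long document repeats the short one ten times, so both have the same word proportions. Before normalization the two look far apart; after L1 normalization (done by hand with NumPy here) they coincide.

```python
import numpy as np

# Hypothetical word counts over a three-word vocabulary
short_doc = np.array([2.0, 1.0, 1.0])
long_doc = short_doc * 10  # same proportions, ten times the length
counts = np.vstack([short_doc, long_doc])

# Distance between the raw count vectors is driven by document length
raw_distance = np.linalg.norm(counts[0] - counts[1])

# L1 normalization: divide each row by the sum of its absolute values
normalized = counts / np.abs(counts).sum(axis=1, keepdims=True)
normalized_distance = np.linalg.norm(normalized[0] - normalized[1])

print("Distance before normalization:", raw_distance)
print("Distance after normalization:", normalized_distance)
```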

Part 2: Advanced Observation Normalization in Reinforcement Learning with Maze

In reinforcement learning, especially when dealing with complex environments and high-dimensional observations, normalizing inputs to the models (policy and value networks) is crucial for efficient training. Maze provides a flexible and customizable way to normalize observations via the ObservationNormalizationWrapper.


The ObservationNormalizationWrapper in Maze allows you to:

  • Apply different normalization strategies (e.g., mean-zero-std-one, range [0, 1]).
  • Estimate normalization statistics from observations collected by interacting with the environment.
  • Specify an action sampling policy for collecting these statistics.
  • Manually specify normalization statistics if known beforehand.
  • Exclude certain observations from normalization (e.g., action masks).
  • Preserve normalization statistics for continuing training runs or deploying an agent.
  • Support gym dictionary observation spaces.
  • Extend with custom observation normalization strategies.

Getting Started

To use observation normalization in Maze:

  1. Add the ObservationNormalizationWrapper to your environment's wrapper stack in your Hydra configuration.
  2. Configure the normalization settings according to your requirements.

List of Features

  • Normalization Strategies: Choose from built-in strategies like mean-zero-std-one or range [0, 1].
  • Statistics Estimation: Collect observations by interacting with the environment to estimate normalization statistics.
  • Sampling Policy: Define a policy (e.g., random policy) for collecting samples.
  • Manual Specification: Manually set normalization statistics if they are known.
  • Exclusion: Exclude specific observations from normalization.
  • Persistence: Save and load normalization statistics for future use.
  • Custom Strategies: Implement and integrate custom normalization strategies.


Example 1: Normalization with Estimated Statistics

This example applies default observation normalization to all observations with statistics automatically estimated via sampling.

Hydra Configuration:

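# @package wrappers
maze.core.wrappers.observation_normalization.observation_normalization_wrapper.ObservationNormalizationWrapper:
    default_strategy: maze.normalization_strategies.MeanZeroStdOneObservationNormalizationStrategy
    default_strategy_config:
        clip_range: [~, ~]
        axis: ~
    default_statistics: ~
    statistics_dump: statistics.pkl
    sampling_policy:
        _target_: maze.core.agent.random_policy.RandomPolicy
    exclude: ~
    manual_config: ~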


  • Applies mean-zero, standard deviation-one normalization to all observations.
  • Does not clip observations after normalization.
  • Does not compute individual normalization statistics along different axes.
  • Dumps normalization statistics to statistics.pkl.
  • Estimates statistics from observations collected via random sampling.
  • Does not exclude any observations.
  • No manual statistics are provided.
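The difference between shared and per-axis statistics can be sketched in plain NumPy. This is not Maze code: the sampled observations are synthetic stand-ins for what a random policy would collect, the helper names `estimate_stats` and `normalize` are hypothetical, and mapping `axis: ~` to NumPy's `axis=None` is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Stand-in for observations collected by a random policy:
# 1000 samples of a 4-dimensional observation with very different scales
observations = rng.normal(
    loc=[0.0, 5.0, -2.0, 100.0],
    scale=[1.0, 0.1, 3.0, 50.0],
    size=(1000, 4),
)

def estimate_stats(obs, axis):
    """Return (mean, std): one shared pair for axis=None, per element for axis=0."""
    return obs.mean(axis=axis), obs.std(axis=axis)

def normalize(obs, mean, std, eps=1e-8):
    """Mean-zero, std-one normalization; eps guards against division by zero."""
    return (obs - mean) / (std + eps)

# Assumed analogue of `axis: ~`: a single mean and std for the whole observation
global_mean, global_std = estimate_stats(observations, axis=None)
global_normalized = normalize(observations, global_mean, global_std)

# Assumed analogue of `axis: [0]`: one mean and std per observation element
elem_mean, elem_std = estimate_stats(observations, axis=0)
elem_normalized = normalize(observations, elem_mean, elem_std)

print("Per-element std after shared statistics:", global_normalized.std(axis=0))
print("Per-element std after per-element statistics:", elem_normalized.std(axis=0))
```

With shared statistics, the element with the largest raw scale still dominates after normalization; per-element statistics bring every element to roughly unit scale.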

Example 2: Normalization with Manual Statistics

Manually specify both the default normalization strategy and its corresponding statistics.

Hydra Configuration:

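# @package wrappers
maze.core.wrappers.observation_normalization.observation_normalization_wrapper.ObservationNormalizationWrapper:
    default_strategy: maze.normalization_strategies.RangeZeroOneObservationNormalizationStrategy
    default_strategy_config:
        clip_range: [0, 1]
        axis: ~
    default_statistics:
        min: 0
        max: 255
    statistics_dump: statistics.pkl
    sampling_policy:
        _target_: maze.core.agent.random_policy.RandomPolicy
    exclude: ~
    manual_config: ~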


  • Applies range-zero-one normalization with manually set statistics to all observations.
  • Clips normalized observations to the range [0, 1].
  • Subtracts 0 and divides by 255 (useful for RGB pixel observations).
  • Remaining settings do not have an effect here since statistics are manually specified.
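The arithmetic behind range-zero-one normalization with manual statistics is simple enough to sketch directly. The function below is a hypothetical stand-in written in NumPy, not Maze's implementation; it subtracts the minimum, divides by the range, and then clips.

```python
import numpy as np

def range_zero_one(obs, min_value, max_value, clip_range=(0.0, 1.0)):
    """Scale obs from [min_value, max_value] to [0, 1], then clip."""
    scaled = (obs - min_value) / (max_value - min_value)
    low, high = clip_range
    return np.clip(scaled, low, high)

# Hypothetical pixel values, including two that fall outside [0, 255]
pixels = np.array([[0, 64, 128], [255, 300, -5]], dtype=np.float32)

normalized_pixels = range_zero_one(pixels, min_value=0, max_value=255)
print(normalized_pixels)
```

Clipping only matters for values outside the specified range, such as the 300 and -5 above, which can occur when the manual statistics do not match the data exactly.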

Example 3: Custom Normalization and Excluding Observations

Utilize the full feature set of observation normalization, including custom strategies and exclusion.

Hydra Configuration:

# @package wrappers
maze.core.wrappers.observation_normalization.observation_normalization_wrapper.ObservationNormalizationWrapper:
    default_strategy: maze.normalization_strategies.MeanZeroStdOneObservationNormalizationStrategy
    default_strategy_config:
        clip_range: [~, ~]
        axis: ~
    default_statistics: ~
    statistics_dump: statistics.pkl
    sampling_policy:
        _target_: maze.core.agent.random_policy.RandomPolicy
    exclude: [action_mask]
    manual_config:
        pixel_image:
            strategy: maze.normalization_strategies.RangeZeroOneObservationNormalizationStrategy
            strategy_config:
                clip_range: [0, 1]
                axis: ~
            statistics:
                min: 0
                max: 255
        feature_vector:
            strategy: maze.normalization_strategies.MeanZeroStdOneObservationNormalizationStrategy
            strategy_config:
                clip_range: [-3, 3]
                axis: [0]


  • Default Behavior: Identical to Example 1 for observations without manual configuration.

Observation pixel_image:

  • Uses range-zero-one normalization with manually specified statistics.
  • Clips values to [0, 1].

Observation feature_vector:

  • Normalizes each element using element-wise mean and standard deviation.
  • Computes statistics along axis [0].
  • Clips each element to [-3, 3].


  • Excludes action_mask from normalization.
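The per-key behavior described above can be sketched for a dictionary observation in plain NumPy. This is an illustration, not Maze's implementation: the sample data, the shapes, and the `normalize_observation` helper are all hypothetical, while the observation keys match the example.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical batch of 500 dictionary observations, stacked per key
samples = {
    "pixel_image": rng.integers(0, 256, size=(500, 8, 8)).astype(np.float32),
    "feature_vector": rng.normal(loc=10.0, scale=4.0, size=(500, 3)),
    "action_mask": rng.integers(0, 2, size=(500, 5)),
}

# Keys that bypass normalization entirely
exclude = {"action_mask"}

# Element-wise statistics for feature_vector, estimated along the sample axis
fv_mean = samples["feature_vector"].mean(axis=0)
fv_std = samples["feature_vector"].std(axis=0)

# One normalization function per configured observation key
normalizers = {
    # Range [0, 1] with manual statistics min=0, max=255, clipped to [0, 1]
    "pixel_image": lambda x: np.clip((x - 0.0) / (255.0 - 0.0), 0.0, 1.0),
    # Mean-zero, std-one per element, clipped to [-3, 3]
    "feature_vector": lambda x: np.clip((x - fv_mean) / fv_std, -3.0, 3.0),
}

def normalize_observation(obs):
    """Normalize each key with its own function; excluded keys pass through."""
    return {
        key: value if key in exclude else normalizers[key](value)
        for key, value in obs.items()
    }

single_obs = {key: value[0] for key, value in samples.items()}
normalized_obs = normalize_observation(single_obs)

for key, value in normalized_obs.items():
    print(key, value.min(), value.max())
```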

Example 4: Using Custom Normalization Strategies

Implement and add your own normalization strategies if built-in ones are insufficient.

Hydra Configuration:

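# @package wrappers
maze.core.wrappers.observation_normalization.observation_normalization_wrapper.ObservationNormalizationWrapper:
    default_strategy: my_project.normalization_strategies.custom.CustomObservationNormalizationStrategy
    default_strategy_config:
        clip_range: [~, ~]
        axis: ~
    default_statistics: ~
    statistics_dump: statistics.pkl
    sampling_policy:
        _target_: maze.core.agent.random_policy.RandomPolicy
    exclude: ~
    manual_config: ~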


  • Implement the ObservationNormalizationStrategy interface for your custom strategy.
  • Ensure the strategy is accessible within your Python path.
  • Reference the custom strategy in your configuration.
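As a rough idea of what a custom strategy might compute, the sketch below implements a robust median/interquartile-range normalization as a standalone class. It deliberately does not subclass Maze's ObservationNormalizationStrategy, and the method names `estimate_stats` and `normalize` are assumptions chosen for illustration; consult the Maze documentation for the actual interface before adapting it.

```python
import numpy as np

class MedianIQRNormalizationStrategy:
    """Hypothetical strategy: subtract the median, divide by the interquartile range.

    Standalone illustration only; it does not implement Maze's real interface.
    """

    def __init__(self, clip_range=(None, None), axis=0):
        self.clip_range = clip_range
        self.axis = axis
        self.statistics = None

    def estimate_stats(self, observations):
        """Estimate median and IQR from a list of sampled observations."""
        stacked = np.stack(observations)
        q25, median, q75 = np.percentile(stacked, [25, 50, 75], axis=self.axis)
        self.statistics = {"median": median, "iqr": q75 - q25}
        return self.statistics

    def normalize(self, value, eps=1e-8):
        """Apply the estimated statistics, then clip if a bound is configured."""
        if self.statistics is None:
            raise RuntimeError("Call estimate_stats() before normalize().")
        scaled = (value - self.statistics["median"]) / (self.statistics["iqr"] + eps)
        low, high = self.clip_range
        if low is not None or high is not None:
            scaled = np.clip(scaled, low, high)
        return scaled

# Usage with synthetic observations standing in for sampled environment data
rng = np.random.default_rng(seed=2)
collected = [rng.normal(loc=50.0, scale=10.0, size=3) for _ in range(1000)]

strategy = MedianIQRNormalizationStrategy(clip_range=(-5, 5), axis=0)
stats = strategy.estimate_stats(collected)
normalized_batch = strategy.normalize(np.stack(collected))

print("Estimated median:", stats["median"])
print("Median after normalization:", np.median(normalized_batch, axis=0))
```

Median and interquartile range are less affected by extreme samples than mean and standard deviation, which is one plausible reason to write a custom strategy.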

Example 5: Plain Python Configuration

Use observation normalization directly within Python without Hydra configuration.

"""Example showing how to use observation normalization directly from Python."""

from maze.core.agent.random_policy import RandomPolicy
from maze.core.wrappers.maze_gym_env_wrapper import GymMazeEnv
from maze.core.wrappers.observation_normalization.observation_normalization_wrapper import ObservationNormalizationWrapper
from maze.core.wrappers.observation_normalization.observation_normalization_utils import obtain_normalization_statistics

# Instantiate a Maze environment
env = GymMazeEnv("CartPole-v0")

# Normalization configuration as a Python dictionary
normalization_config = {
    "default_strategy": "maze.normalization_strategies.MeanZeroStdOneObservationNormalizationStrategy",
    "default_strategy_config": {"clip_range": (None, None), "axis": 0},
    "default_statistics": None,
    "statistics_dump": "statistics.pkl",
    "sampling_policy": RandomPolicy(env.action_spaces_dict),
    "exclude": None,
    "manual_config": None,
}

# 1. PREPARATION: Estimate normalization statistics
# ------------------------------------------------
# Wrap the environment for observation normalization
env = ObservationNormalizationWrapper.wrap(env, **normalization_config)

# Estimate the normalization statistics
normalization_statistics = obtain_normalization_statistics(env, n_samples=1000)

# 2. APPLICATION (Training, Rollout, Deployment)
# ----------------------------------------------
# Instantiate the training environment
training_env = GymMazeEnv("CartPole-v0")

# Wrap the training environment for observation normalization
training_env = ObservationNormalizationWrapper.wrap(training_env, **normalization_config)

# Reuse the estimated statistics in the training environment
training_env.set_normalization_statistics(normalization_statistics)

# Now, the training environment yields normalized observations
normalized_obs = training_env.reset()

Built-in Normalization Strategies

Normalization strategies define how input observations are normalized. Maze provides several built-in strategies:

  • MeanZeroStdOneObservationNormalizationStrategy: Normalizes observations to have zero mean and unit standard deviation.
  • RangeZeroOneObservationNormalizationStrategy: Scales observations to be within the range [0, 1].

Refer to the Maze documentation for more details on each strategy.

The Bigger Picture: Observation Normalization in the Interaction Loop

Observation normalization is embedded within the overall interaction loop between the agent and the environment. It operates between the ObservationConversionInterface (which converts environment states into machine-readable observations) and the agent.

The workflow has four stages:

  1. Sampling: Collect observations using a specified policy (e.g., random policy).
  2. Statistics Estimation: Compute normalization statistics based on collected observations.
  3. Normalization: Apply the normalization strategies to observations before they are fed to the agent.
  4. Persistence: Save the normalization statistics for future use (e.g., deployment).
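The four stages can be walked through end to end in a few lines of NumPy and the standard library. This is a sketch under stated assumptions: the sampled observations are synthetic, the `normalize` helper is hypothetical, and the pickle layout used here is made up and is not claimed to match the contents of Maze's statistics.pkl.

```python
import os
import pickle
import tempfile

import numpy as np

rng = np.random.default_rng(seed=3)

# 1. Sampling: stand-in for observations gathered by a random policy
sampled = rng.uniform(low=-10.0, high=10.0, size=(1000, 4))

# 2. Statistics estimation: per-element mean and standard deviation
statistics = {"mean": sampled.mean(axis=0), "std": sampled.std(axis=0)}

# 3. Normalization: applied to every observation before it reaches the agent
def normalize(obs, stats, eps=1e-8):
    return (obs - stats["mean"]) / (stats["std"] + eps)

# 4. Persistence: dump the statistics, then restore them as a later run would
dump_path = os.path.join(tempfile.mkdtemp(), "statistics.pkl")
with open(dump_path, "wb") as handle:
    pickle.dump(statistics, handle)

with open(dump_path, "rb") as handle:
    restored = pickle.load(handle)

# A new observation is normalized identically with the restored statistics
new_obs = rng.uniform(low=-10.0, high=10.0, size=4)
same_result = np.allclose(normalize(new_obs, statistics), normalize(new_obs, restored))
print("Restored statistics give the same result:", same_result)
```

Reusing the stored statistics at deployment matters because an agent trained on one normalization would otherwise see differently scaled inputs.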

Visual Representation

The loop can be pictured as a flowchart in which:

  • Environment generates raw observations.
  • Observations pass through the ObservationNormalizationWrapper.
  • Normalized observations are then fed to the Agent.
  • The Agent produces actions that are sent back to the Environment.
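The flow described above can be mimicked with a toy environment and a thin wrapper. Everything in this sketch is hypothetical: `ToyEnv` and `NormalizingWrapper` are illustrative classes, the statistics are hard-coded, and the constant action stands in for a real agent. It is not how Maze's ObservationNormalizationWrapper is implemented.

```python
import numpy as np

class ToyEnv:
    """Hypothetical environment emitting a raw 2-dimensional observation."""

    def __init__(self):
        self._rng = np.random.default_rng(seed=4)
        self._steps = 0

    def _observe(self):
        return self._rng.normal(loc=[100.0, -3.0], scale=[20.0, 0.5])

    def reset(self):
        self._steps = 0
        return self._observe()

    def step(self, action):
        self._steps += 1
        done = self._steps >= 5  # episodes last five steps
        return self._observe(), 0.0, done, {}

class NormalizingWrapper:
    """Sits between environment and agent and normalizes every observation."""

    def __init__(self, env, mean, std):
        self.env = env
        self.mean = np.asarray(mean, dtype=float)
        self.std = np.asarray(std, dtype=float)

    def _normalize(self, obs):
        return (obs - self.mean) / self.std

    def reset(self):
        return self._normalize(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._normalize(obs), reward, done, info

# Interaction loop: the agent only ever sees normalized observations
env = NormalizingWrapper(ToyEnv(), mean=[100.0, -3.0], std=[20.0, 0.5])
obs = env.reset()
trajectory = [obs]
done = False
while not done:
    action = 0  # placeholder for the agent's decision
    obs, reward, done, info = env.step(action)
    trajectory.append(obs)

print(np.round(np.stack(trajectory), 3))
```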


Normalization is an essential preprocessing technique in both machine learning and reinforcement learning. It ensures that input features contribute equally to the learning process, prevents numerical instability, and can significantly improve model performance.

Whether using basic normalization methods with scikit-learn or advanced strategies in RL environments with Maze, understanding and correctly implementing observation normalization is key to building efficient and robust models.

By leveraging tools like scikit-learn’s Normalizer and Maze's ObservationNormalizationWrapper, you can:

  • Easily apply normalization techniques suited to your data.
  • Customize normalization strategies to fit specific needs.
  • Improve training efficiency and model performance.

Where to Go Next

  • Preprocessing Observations: Before normalizing, you might want to preprocess observations using wrappers like PreProcessingWrapper.
  • Learn More About Maze: Explore Maze’s documentation to understand more about environment wrappers and other features.
  • Custom Strategies: Experiment with creating custom normalization strategies tailored to your specific use case.




Written by Scaibu
