Python YAML configuration with environment variables parsing

Load a YAML configuration file and resolve any environment variables

If you’ve worked with Python projects, you’ve probably stumbled across the many ways to provide configuration. I am not going to go through all of them here, but a few are:

  • using .ini files

  • using a python class

  • using .env files

  • using JSON or XML files

  • using a yaml file

And so on. I’ve put some useful links about the different ways below, in case you are interested in digging deeper.

My preference is working with yaml configuration because I usually find it very handy and easy to use. I also like that yaml files are used elsewhere, e.g. in docker-compose configuration, so it is a format most people are familiar with.

For yaml parsing I use the PyYAML Python library.

In this article we’ll talk about the yaml file case and, more specifically, what you can do to avoid keeping your secrets, e.g. passwords, hosts, usernames etc., directly in it.

Let’s say we have a very simple example of a yaml file configuration:

database:
 name: database_name
 user: me
 password: very_secret_and_complex
 host: localhost
 port: 5432

ws:
 user: username
 password: very_secret_and_complex_too
 host: localhost

When you reach the point of deploying your project, it is not really safe to have passwords and sensitive data in a plain text configuration file lying around on your production server. That’s where **environment variables** come in handy. So the goal here is to be able to easily replace the very_secret_and_complex password with input from an environment variable, e.g. DB_PASS, so that this variable only exists when you set it and run your program instead of it being hardcoded somewhere.

For PyYAML to be able to resolve environment variables, we need three main things:

  • A regex pattern for the environment variable identification, e.g. pattern = re.compile(r'.*?\${(\w+)}.*?')

  • A tag that will signify that there’s an environment variable (or more) to be parsed, e.g. !ENV.

  • And a function that the loader will use to resolve the environment variables

def constructor_env_variables(loader, node):
    """
    Extracts the environment variable from the node's value
    :param yaml.Loader loader: the yaml loader
    :param node: the current node in the yaml
    :return: the parsed string that contains the value of the environment
    variable
    """
    value = loader.construct_scalar(node)
    match = pattern.findall(value)
    if match:
        full_value = value
        for g in match:
            full_value = full_value.replace(
                f'${{{g}}}', os.environ.get(g, g)
            )
        return full_value
    return value
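To make the first ingredient more concrete, here is a minimal sketch of what the pattern extracts from a few sample strings. The sample values are only illustrations; the pattern is the one from the list above, written as a raw string.

```python
import re

# Same pattern as above, as a raw string so the backslashes reach the regex engine
pattern = re.compile(r'.*?\${(\w+)}.*?')

# findall returns the captured names, one per ${...} occurrence
single = pattern.findall('${DB_PASS}')
several = pattern.findall('https://${CURR_ENV}.ws.com.${MODE}')
plain = pattern.findall('localhost')

print(single)   # ['DB_PASS']
print(several)  # ['CURR_ENV', 'MODE']
print(plain)    # []
```

These names are what the constructor function then looks up in os.environ.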

Here’s a complete example:

import os
import re
import yaml


def parse_config(path=None, data=None, tag='!ENV'):
    """
    Load a yaml configuration file and resolve any environment variables
    The environment variables must have !ENV before them and be in this format
    to be parsed: ${VAR_NAME}.
    E.g.:

    database:
        host: !ENV ${HOST}
        port: !ENV ${PORT}
    app:
        log_path: !ENV '/var/${LOG_PATH}'
        something_else: !ENV '${AWESOME_ENV_VAR}/var/${A_SECOND_AWESOME_VAR}'

    :param str path: the path to the yaml file
    :param str data: the yaml data itself as a stream
    :param str tag: the tag to look for
    :return: the dict configuration
    :rtype: dict[str, T]
    """
    # pattern for global vars: look for ${word}
    pattern = re.compile(r'.*?\${(\w+)}.*?')
    loader = yaml.SafeLoader

    # the tag will be used to mark where to start searching for the pattern
    # e.g. somekey: !ENV somestring${MYENVVAR}blah blah blah
    loader.add_implicit_resolver(tag, pattern, None)

    def constructor_env_variables(loader, node):
        """
        Extracts the environment variable from the node's value
        :param yaml.Loader loader: the yaml loader
        :param node: the current node in the yaml
        :return: the parsed string that contains the value of the environment
        variable
        """
        value = loader.construct_scalar(node)
        match = pattern.findall(value)  # to find all env variables in line
        if match:
            full_value = value
            for g in match:
                full_value = full_value.replace(
                    f'${{{g}}}', os.environ.get(g, g)
                )
            return full_value
        return value

    loader.add_constructor(tag, constructor_env_variables)

    if path:
        with open(path) as conf_data:
            return yaml.load(conf_data, Loader=loader)
    elif data:
        return yaml.load(data, Loader=loader)
    else:
        raise ValueError('Either a path or data should be defined as input')
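One thing to be aware of: the function above registers the resolver and the constructor on yaml.SafeLoader itself, so any other code in the same process that uses SafeLoader will also see the !ENV tag. If that bothers you, a possible variation is to register them on a subclass instead. The sketch below assumes a hypothetical subclass name, EnvLoader, and sets example environment values in the script just for the demonstration; it is a variation on the approach above, not a replacement for it.

```python
import os
import re

import yaml

pattern = re.compile(r'.*?\${(\w+)}.*?')


class EnvLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass so yaml.SafeLoader stays untouched."""


def constructor_env_variables(loader, node):
    # Replace every ${NAME} in the scalar with the value of that variable
    value = loader.construct_scalar(node)
    for name in pattern.findall(value):
        value = value.replace(f'${{{name}}}', os.environ.get(name, name))
    return value


# Registering on the subclass copies the registries, leaving SafeLoader as is
EnvLoader.add_implicit_resolver('!ENV', pattern, None)
EnvLoader.add_constructor('!ENV', constructor_env_variables)

# Example values, set here only for the demonstration
os.environ['DB_PASS'] = 'very_secret_and_complex'
os.environ['CURR_ENV'] = 'staging'

document = """
database:
  password: !ENV ${DB_PASS}
  port: 5432
ws:
  host: !ENV 'https://${CURR_ENV}.ws.com.local'
"""

conf = yaml.load(document, Loader=EnvLoader)
print(conf['database']['password'])
print(conf['ws']['host'])
```

Values without the tag, such as the port, are parsed as usual.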

Example of a YAML configuration with environment variables:

database:
 name: database_name
 user: !ENV ${DB_USER}
 password: !ENV ${DB_PASS}
 host: !ENV ${DB_HOST}
 port: 5432

ws:
 user: !ENV ${WS_USER}
 password: !ENV ${WS_PASS}
 host: !ENV 'https://${CURR_ENV}.ws.com.local'

This also works with more than one environment variable declared in the same line for the same configuration parameter, like this:

ws:
 user: !ENV ${WS_USER}
 password: !ENV ${WS_PASS}
 host: !ENV 'https://${CURR_ENV}.ws.com.${MODE}'  # multiple env var
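To see why several variables in one value work, here is a small sketch of just the substitution step, outside of PyYAML. The values for CURR_ENV and MODE are made up for the demonstration.

```python
import os
import re

pattern = re.compile(r'.*?\${(\w+)}.*?')

# Example values, set here only for the demonstration
os.environ['CURR_ENV'] = 'staging'
os.environ['MODE'] = 'local'

value = 'https://${CURR_ENV}.ws.com.${MODE}'
resolved = value
# Each name found by the pattern is replaced in turn
for name in pattern.findall(value):
    resolved = resolved.replace(f'${{{name}}}', os.environ.get(name, name))

print(resolved)  # https://staging.ws.com.local
```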

And how to use this:

First set the environment variables. For example, for DB_PASS:

export DB_PASS=very_secret_and_complex

Or even better, so that the password is not echoed in the terminal:

read -s -p 'Database password: ' db_pass
export DB_PASS=$db_pass

Then call parse_config from your script, here use_env_variables_in_config_example.py:

import argparse

# To run this:
# export DB_PASS=very_secret_and_complex
# python use_env_variables_in_config_example.py -c /path/to/yaml

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='My awesome script')
    parser.add_argument(
        "-c", "--conf", action="store", dest="conf_file",
        help="Path to config file"
    )
    args = parser.parse_args()
    # parse_config is the function from the complete example above
    conf = parse_config(path=args.conf_file)
    # do stuff with conf, e.g. access the database password like this:
    # conf['database']['password']
Then you can run the above script:

python use_env_variables_in_config_example.py -c /path/to/yaml

And in your code, do stuff with conf, e.g. access the database password like this: conf['database']['password']
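Note that the constructor uses os.environ.get(g, g), so if a variable is not set, the value silently becomes the variable name itself (e.g. the password would be the string DB_PASS). If you would rather catch that early, one option is to check the yaml text before loading it. The helper below, find_unset_variables, is a hypothetical name of my own and only a sketch; it takes an optional mapping so it can be tried without touching the real environment.

```python
import os
import re

pattern = re.compile(r'\${(\w+)}')


def find_unset_variables(yaml_text, environ=None):
    """Return the names referenced as ${NAME} in yaml_text that are not set."""
    environ = os.environ if environ is None else environ
    names = pattern.findall(yaml_text)
    # dict.fromkeys drops duplicates while keeping the original order
    return [name for name in dict.fromkeys(names) if name not in environ]


yaml_text = """
database:
  user: !ENV ${DB_USER}
  password: !ENV ${DB_PASS}
"""

missing = find_unset_variables(yaml_text, environ={'DB_USER': 'me'})
print(missing)  # ['DB_PASS']
```

You could then raise an error when the list is not empty, instead of continuing with a placeholder value.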

I hope this was helpful. Any thoughts, questions, corrections and suggestions are very welcome :)

Useful links about the different ways to provide configuration:

  • The Many Faces and Files of Python Configs (hackersandslackers.com)

  • 4 Ways to manage the configuration in Python (hackernoon.com)

  • Python configuration files (devdungeon.com)

  • Configuration files in Python (martin-thoma.com)