Load a YAML configuration file and resolve any environment variables
If you’ve worked with Python projects, you’ve probably stumbled across the many ways to provide configuration. I am not going to go through all of them here, but a few are:
using .ini files
using a python class
using .env files
using JSON or XML files
using a yaml file
And so on. I’ve put some useful links about the different ways below, in case you are interested in digging deeper.
My preference is working with yaml configuration because I usually find it very handy and easy to use. I also like that yaml files are used in e.g. docker-compose configuration, so it is a format most people are familiar with.
For yaml parsing I use the PyYAML Python library.
In this article we’ll talk about the yaml file case, and more specifically what you can do to avoid keeping your secrets (e.g. passwords, hosts, usernames) directly in it.
Let’s say we have a very simple example of a yaml file configuration:
database:
  name: database_name
  user: me
  password: very_secret_and_complex
  host: localhost
  port: 5432
ws:
  user: username
  password: very_secret_and_complex_too
  host: localhost
When you come to a point where you need to deploy your project, it is not really safe to have passwords and sensitive data in a plain text configuration file lying around on your production server. That’s where **environment variables** come in handy. So the goal here is to be able to easily replace the very_secret_and_complex password with input from an environment variable, e.g. DB_PASS, so that this variable only exists when you set it and run your program instead of it being hardcoded somewhere.
For PyYAML to be able to resolve environment variables, we need three main things:
A regex pattern for the environment variable identification, e.g. pattern = re.compile(r'.*?\${(\w+)}.*?')
A tag that will signify that there’s an environment variable (or more) to be parsed, e.g. !ENV
A function that the loader will use to resolve the environment variables:
def constructor_env_variables(loader, node):
    """
    Extracts the environment variable from the node's value
    :param yaml.Loader loader: the yaml loader
    :param node: the current node in the yaml
    :return: the parsed string that contains the value of the environment
    variable
    """
    value = loader.construct_scalar(node)
    match = pattern.findall(value)
    if match:
        full_value = value
        for g in match:
            full_value = full_value.replace(
                f'${{{g}}}', os.environ.get(g, g)
            )
        return full_value
    return value
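To make the pattern and the replacement string less mysterious, here is a small standalone sketch of what they produce. It uses only the standard library; the input strings are taken from the docstring example in this article, and the raw-string prefix on the pattern is my own choice to avoid escape-sequence warnings.

```python
import re

# Same pattern as above, written as a raw string
pattern = re.compile(r'.*?\${(\w+)}.*?')

# findall returns only the captured group, i.e. the variable names
names = pattern.findall('${AWESOME_ENV_VAR}/var/${A_SECOND_AWESOME_VAR}')
print(names)  # ['AWESOME_ENV_VAR', 'A_SECOND_AWESOME_VAR']

# A string without any ${...} yields an empty list
no_names = pattern.findall('plain value')
print(no_names)  # []

# In an f-string, '{{' and '}}' are literal braces,
# so f'${{{g}}}' rebuilds the original '${NAME}' placeholder
placeholder = f'${{{names[0]}}}'
print(placeholder)  # ${AWESOME_ENV_VAR}
```

So the constructor first collects the variable names, then rebuilds each placeholder and swaps it for the value found in the environment.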
Here’s a complete example:
import os
import re

import yaml


def parse_config(path=None, data=None, tag='!ENV'):
    """
    Load a yaml configuration file and resolve any environment variables
    The environment variables must have !ENV before them and be in this format
    to be parsed: ${VAR_NAME}.
    E.g.:

    database:
        host: !ENV ${HOST}
        port: !ENV ${PORT}
    app:
        log_path: !ENV '/var/${LOG_PATH}'
        something_else: !ENV '${AWESOME_ENV_VAR}/var/${A_SECOND_AWESOME_VAR}'

    :param str path: the path to the yaml file
    :param str data: the yaml data itself as a stream
    :param str tag: the tag to look for
    :return: the dict configuration
    :rtype: dict[str, T]
    """
    # pattern for global vars: look for ${word}
    # (raw string, so that \$ and \w are not treated as string escapes)
    pattern = re.compile(r'.*?\${(\w+)}.*?')
    loader = yaml.SafeLoader

    # the tag will be used to mark where to start searching for the pattern
    # e.g. somekey: !ENV somestring${MYENVVAR}blah blah blah
    loader.add_implicit_resolver(tag, pattern, None)

    def constructor_env_variables(loader, node):
        """
        Extracts the environment variable from the node's value
        :param yaml.Loader loader: the yaml loader
        :param node: the current node in the yaml
        :return: the parsed string that contains the value of the environment
        variable
        """
        value = loader.construct_scalar(node)
        match = pattern.findall(value)  # to find all env variables in line
        if match:
            full_value = value
            for g in match:
                # an unset variable falls back to its own name
                full_value = full_value.replace(
                    f'${{{g}}}', os.environ.get(g, g)
                )
            return full_value
        return value

    loader.add_constructor(tag, constructor_env_variables)

    if path:
        with open(path) as conf_data:
            return yaml.load(conf_data, Loader=loader)
    elif data:
        return yaml.load(data, Loader=loader)
    else:
        raise ValueError('Either a path or data should be defined as input')
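One thing to be aware of: because loader is yaml.SafeLoader itself, the resolver and constructor are registered on that shared class. If you would rather keep the !ENV handling local, a possible variation is to register them on a subclass instead. The sketch below is only an illustration of that idea, not part of the function above; EnvLoader and construct_env are names I made up, and the yaml text is passed in as a string.

```python
import os
import re

import yaml

ENV_PATTERN = re.compile(r'.*?\${(\w+)}.*?')


class EnvLoader(yaml.SafeLoader):
    """Hypothetical SafeLoader subclass that carries the !ENV handling."""


def construct_env(loader, node):
    # Same replace loop as constructor_env_variables in the article
    value = loader.construct_scalar(node)
    for name in ENV_PATTERN.findall(value):
        value = value.replace(f'${{{name}}}', os.environ.get(name, name))
    return value


# Registered on the subclass, so yaml.SafeLoader itself is left untouched
EnvLoader.add_implicit_resolver('!ENV', ENV_PATTERN, None)
EnvLoader.add_constructor('!ENV', construct_env)

# Set here only for the demo; normally this comes from the shell
os.environ['DB_PASS'] = 'very_secret_and_complex'

yaml_text = "database:\n  password: !ENV ${DB_PASS}\n  port: 5432\n"
conf = yaml.load(yaml_text, Loader=EnvLoader)
print(conf['database']['password'])  # very_secret_and_complex
```

Values without the tag or the ${...} pattern, like port here, are parsed as usual.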
Example of a YAML configuration with environment variables:
database:
  name: database_name
  user: !ENV ${DB_USER}
  password: !ENV ${DB_PASS}
  host: !ENV ${DB_HOST}
  port: 5432
ws:
  user: !ENV ${WS_USER}
  password: !ENV ${WS_PASS}
  host: !ENV 'https://${CURR_ENV}.ws.com.local'
This also works with more than one environment variable declared in the same line for the same configuration parameter, like this:
ws:
  user: !ENV ${WS_USER}
  password: !ENV ${WS_PASS}
  host: !ENV 'https://${CURR_ENV}.ws.com.${MODE}'  # multiple env var
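As a quick illustration of what happens to such a line, here is a rough sketch of the same replace loop applied to the host value. To keep it standalone it reads from a plain dict instead of os.environ; the resolve helper and the values staging and local are made up for the example.

```python
import re

pattern = re.compile(r'.*?\${(\w+)}.*?')


def resolve(value, env):
    """Hypothetical helper: the constructor's replace loop, reading from a dict."""
    for name in pattern.findall(value):
        # an unset variable falls back to its own name, as in the article's code
        value = value.replace(f'${{{name}}}', env.get(name, name))
    return value


template = 'https://${CURR_ENV}.ws.com.${MODE}'

host = resolve(template, {'CURR_ENV': 'staging', 'MODE': 'local'})
print(host)  # https://staging.ws.com.local

# MODE is not set here, so its name is used in place of a value
partial = resolve(template, {'CURR_ENV': 'staging'})
print(partial)  # https://staging.ws.com.MODE
```

Note the second case: a variable that is not set does not raise an error, it is silently replaced by its name, so it is worth checking that everything you need is exported before running.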
And how to use this:
First set the environment variables. For example, for DB_PASS:
export DB_PASS=very_secret_and_complex
Or even better, so that the password is not echoed in the terminal (bash syntax):
read -s -p 'Database password: ' db_pass
export DB_PASS="$db_pass"
# To run this:
# export DB_PASS=very_secret_and_complex
# python use_env_variables_in_config_example.py -c /path/to/yaml
import argparse

# parse_config is the function from the complete example above


if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='My awesome script')
    parser.add_argument(
        "-c", "--conf", action="store", dest="conf_file",
        help="Path to config file"
    )
    args = parser.parse_args()
    conf = parse_config(path=args.conf_file)
    # do stuff with conf, e.g. access the database password like this:
    # conf['database']['password']
Then you can run the above script:
python use_env_variables_in_config_example.py -c /path/to/yaml
And in your code, do stuff with conf, e.g. access the database password like this: conf['database']['password']
I hope this was helpful. Any thoughts, questions, corrections and suggestions are very welcome :)
Useful links
The Many Faces and Files of Python Configs (hackersandslackers.com)
4 Ways to manage the configuration in Python (hackernoon.com)
Python configuration files (devdungeon.com)
Configuration files in Python (martin-thoma.com)