Add `data_preparation/generate_customers.py`, a script that takes the
`base_data.json` file generated by `get_base_data.py` and randomly
samples a given number of customers.
To simplify things, each customer is assigned exactly one gas and one
electricity meter and each of them is read between 1 and 10 times.
The full data including meters, meter readings and dates as well as
customers and addresses is stored in a final JSON file named
`customers.json`.
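The sampling described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of `generate_customers.py`: the function name `generate_customers`, the `names` and `zip_codes` keys assumed in `base_data.json`, the 2023 date range, and the reading value range are all hypothetical.

```python
import json
import random
from datetime import date, timedelta


def generate_customers(base_data, n_customers, seed=None):
    """Sample n_customers records, each with one gas and one electricity meter.

    Assumes (hypothetically) that base_data has "names" and "zip_codes" lists.
    """
    rng = random.Random(seed)
    customers = []
    for customer_id in range(n_customers):
        meters = []
        for meter_type in ("gas", "electricity"):
            n_readings = rng.randint(1, 10)  # inclusive on both ends
            # Draw distinct day offsets and sort them so readings are chronological.
            offsets = sorted(rng.sample(range(365), n_readings))
            value = 0.0
            readings = []
            for offset in offsets:
                value += rng.uniform(10.0, 500.0)  # meter values only increase
                reading_date = date(2023, 1, 1) + timedelta(days=offset)
                readings.append(
                    {"date": reading_date.isoformat(), "value": round(value, 1)}
                )
            meters.append({"type": meter_type, "readings": readings})
        customers.append(
            {
                "id": customer_id,
                "name": rng.choice(base_data["names"]),
                "zip_code": rng.choice(base_data["zip_codes"]),
                "meters": meters,
            }
        )
    return customers
```

The returned list contains only plain dicts, lists, strings and numbers, so it could be written to `customers.json` with `json.dump(customers, f, indent=2)`.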
The script `get_base_data.py` takes the raw data files (such as `names.txt`)
and formats them into a common JSON file, which can later be used to
randomly generate customer and meter reading data.
Additionally, the script filters for all eligible zip codes in an approximate
Avacon Netz service area and provides some additional information for
them.
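As a rough sketch of this step, assuming the service area is approximated by a set of zip code prefixes (the commit does not say how the area is actually determined): the function name `build_base_data`, the `zip_code` field, the output keys, and the prefix values are hypothetical placeholders.

```python
import json


def build_base_data(names, zip_rows, service_prefixes):
    """Combine cleaned names and eligible zip code rows into one dict.

    names: raw lines, e.g. read from names.txt
    zip_rows: dicts that each have at least a "zip_code" string
    service_prefixes: hypothetical zip prefixes approximating the service area
    """
    # Strip whitespace, drop empty lines, and remove duplicate names.
    cleaned_names = sorted({line.strip() for line in names if line.strip()})
    prefixes = tuple(service_prefixes)
    # Keep only rows whose zip code starts with one of the prefixes.
    eligible = [row for row in zip_rows if row["zip_code"].startswith(prefixes)]
    return {"names": cleaned_names, "zip_codes": eligible}


# Placeholder inputs for illustration only, not real service area data.
base_data = build_base_data(
    names=["Anna\n", "\n", "Ben \n", "Anna\n"],
    zip_rows=[
        {"zip_code": "11111", "city": "Exampletown"},
        {"zip_code": "22222", "city": "Sampleville"},
    ],
    service_prefixes=["11"],
)
base_data_json = json.dumps(base_data, indent=2)
```

Keeping the full row for each eligible zip code, rather than only the code itself, leaves room for the additional information mentioned above.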
An example output file, `base_data.json`, has been added to the repo in
a previous commit.
Usually, one would not check in the actual data files, but store them
elsewhere (such as in Azure Blob Storage). In this case, it is still
convenient for external reviewers to get an idea of the structure of
the data.
Since the files total less than 2 MB, this is acceptable for this
specific case.
Set up Python project files using Poetry. A basic environment
is installed, including Dash for the app that will be implemented
later.
Also contains several dev tools, including pre-commit hooks.
Co-authored-by: Tobias Quadfasel <tobias.loesche@studium.uni-hamburg.de>
Reviewed-on: #1