38 Commits

Author SHA1 Message Date
701919a83a Merge pull request 'fix(sql-formatting): Fix SQL code formatting' (#10) from fix/sql-formatting into main
Reviewed-on: #10
2024-10-06 10:06:26 +00:00
5cec810947 fix(sql-formatting): Fix SQL code formatting
Fixed SQL code formatting errors by:
- catching both single and double backslashes in the formatting
- explicitly telling LLM how to format linebreaks

Also did some changes to the UI and allowed general questions
about the database content to be asked.
2024-10-06 11:59:45 +02:00
76a7168aa3 Merge pull request 'docs/finish-readmeÖ Update documentation' (#9) from docs/finish-readme into main
Reviewed-on: #9
2024-09-04 06:21:35 +00:00
Tobias Quadfasel
724c04d5f2 Exchange cloud structure SVG with PNG 2024-09-04 08:19:34 +02:00
Tobias Quadfasel
fb53ddcc8e Update README, include cloud structure schematic 2024-09-04 08:15:12 +02:00
Tobias Quadfasel
0344da8191 Minor docstring improvement 2024-09-04 01:32:58 +02:00
Tobias Quadfasel
583444261e Removing unneccessary print call 2024-09-04 00:00:32 +02:00
65aa4800fc Merge pull request 'feat/azure: Add components for azure deployment' (#8) from feat/azure into main
Reviewed-on: #8
2024-09-03 19:54:26 +00:00
Tobias Quadfasel
02f1b41cb9 feat(azure): Added necessary azure components to app
Using respective credentials for both local development as well as
deployment. When deployed on azure, the app authenticates with the SQL
database via Entra ID (formerly active directory) and accesses other
credentials via key vault as a system managed identity.
2024-09-03 21:51:12 +02:00
Tobias Quadfasel
22000f1b0e feat(azure): Add azure-related python packages for deployment 2024-09-03 20:44:17 +02:00
2302c2014a Merge pull request 'fix/docker' (#7) from fix/docker into main
Reviewed-on: #7
2024-09-03 15:29:04 +00:00
Tobias Quadfasel
a583f07751 fix(docker): Added minor change to app notification for users 2024-09-03 17:27:24 +02:00
Tobias Quadfasel
e9adb3c588 fix(docker): Fix docker file
The docker file was updated to align with the software
requirements of the app. An additional script `install_odbc.sh` is added
to install the required microsoft ODBC driver.
2024-09-03 17:23:41 +02:00
d9e570c75a Merge pull request 'feat/auth' (#6) from feat/auth into main
Reviewed-on: #6
2024-09-03 14:07:39 +00:00
Tobias Quadfasel
60206f48ef feat(auth): Add basic authentication to app 2024-09-03 16:05:18 +02:00
Tobias Quadfasel
0b7a8218e7 feat(auth): Add dash-auth package
In order to be able to authenticate with the app, a simple approach
using dash-auth with environment-based secrets is used for now.
2024-09-03 16:03:35 +02:00
9eab00f4bc Merge pull request 'feat/ai-chat
Add core components for Database chat' (#5) from feat/ai-chat into main
Reviewed-on: #5
2024-09-03 13:38:23 +00:00
Tobias Quadfasel
2c83ad2ee5 feat(ai-chat): Add text field to describe manual SQL input field
Before, there was no direct description about the usage of the manual
SQL query input field. A plotly dash component was added with a more
precise description for the user.
2024-09-03 15:06:32 +02:00
Tobias Quadfasel
94b5545173 feat(ai-chat): Add code logic for AI-based data chat
Add the first working code logic both in terms of backend and
frontend-related tasks. Add a detailled system message for improved
results. Add several UI improvements for result display and user
information. Add text input field for direct SQL code comparison.

The implementation of the openAI backend had to be changed due to strict
rate limits of azure OpenAI free tier and was replaced with a regular
openai API key.
2024-09-03 14:48:38 +02:00
Tobias Quadfasel
4b9fa0579e feat(ai-chat): Add SQL query field for comparison
In order to compare the (not yet implemented) SQL query generated by
the LLM with an actual query, another text field was added that parses
the query to `pyodbc`, which connects to our database, stores the
resulting rows in a `pandas` dataframe and then visualizes it as a table
in plotly dash.

The SQL functionalities are implemented in the `sql_utils.py` module.

Additionally, some minor updates to the overall behavior and layout of
the app were implemented.
2024-09-02 20:43:48 +02:00
Tobias Quadfasel
923dc3b439 feat(ai-chat): Add first version of ai chat as well as frontend
Includes the first version of a rudimentary chat app, still without the
SQL capabilities that we want later. For now, we can connect to the
Azure OpenAI source and then have the response displayed in a plotly
dash webapp.

Some styling and UI elements were also added, such as logos. UI
components are designed that the user cannot enter the same query twice
and cannot click the submit button as long as the query is running.
2024-08-31 23:38:14 +02:00
Tobias Quadfasel
c61c355ee6 feat(ai-chat): Add openai python SDK to project 2024-08-31 17:03:58 +02:00
1b0ef07ae5 Merge pull request 'docs(dataprep): Add documentation about data sources and structure' (#4) from docs/dataprep into main
Reviewed-on: #4
2024-08-31 13:23:47 +00:00
Tobias Quadfasel
d8f922e1d2 docs(dataprep): Add documentation about data sources and structure
Added detailled information about data sources used for customer data
generation as well as the structure of the SQL database. Also included
an Entity Relationship Diagram (ERD) for effective visualization of the
database structure.
2024-08-31 15:21:08 +02:00
6a1a5c11f0 Merge pull request 'feat/add-data' (#3) from feat/add-data into main
Reviewed-on: #3
2024-08-31 12:43:11 +00:00
Tobias Quadfasel
61b238e87b feat(add-data): Add script to insert prepped data into database
The script `insert_sql.py` uses `pyodbc` to connect to the Azure SQL
database, loads the data from the preprocessed `customers.json` file,
formats them and then inserts them into the created table schema.
2024-08-31 14:39:21 +02:00
Tobias Quadfasel
66d1eec875 feat(add-data): Add script for SQL schema initialization
Since we are working with an Azure SQL database, we need to fill the
generated customer data in a fitting schema. The schema will be
described in more detail in an updated README file later.

The added script uses `pyodbc` to connect to the database and create the
tables. This requires a connection string, which will not be checked out
to this repo for security reasons and must be obtained separately.

Additionally, a script `test_sql_connection.py` is added with this
commit, which is a simple utility to test the `pyodbc` connection.
2024-08-31 14:28:44 +02:00
Tobias Quadfasel
3a4bc53dcc feat(add-data): Add script to generate costumers from base data
Add `data_preparation/generate_customers.py`, a script that takes the
`base_data.json` file generated by `get_base_data.py` and randomly
samples a given number of customers.

To simplify things, each customer is assigned exactly one gas and one
electricity meter and each of them is read between 1 and 10 times.

The full data including meters, meter readings and dates as well as
customers and addresses is stored in a final JSON file named
`customers.json`.
2024-08-31 14:23:50 +02:00
Tobias Quadfasel
596893edf3 feat(add-data): Add general config and utilities for data prep 2024-08-31 14:15:53 +02:00
Tobias Quadfasel
49ff1bbfec feat(add-data): Add script for acquiring base data files
The script `get_base_data` takes the raw datafiles (such as `names.txt`)
and formats them in a common JSON file, which can be later used to
randomly generate customer and meter readings data.

Additionally, the script filters all eligible zip codes an approximate
avacon netz service area and provides some additional information for
them.

An example output file, `base_data.json` has been added to the repo in
a previous commit.
2024-08-31 14:10:56 +02:00
Tobias Quadfasel
22c05e827c feat(add-data): Add actual data used for data generation to repo
Usually, one would not check out the actual data files, but store them
elsewhere (such as in Azure Blob Storage). In this case, it is still
convenient for external reviewers to get an idea of the structure of
the data.

Since the files in total are less than 2MB, this is acceptable for this
specific case.
2024-08-31 14:00:29 +02:00
Tobias Quadfasel
fb65fc202b feat(add-data): Update pyproject.toml and poetry.lock files 2024-08-31 13:55:18 +02:00
c31a57f537 feat/base-app: Merge into main (#2) 2024-08-29 15:43:34 +00:00
Tobias Quadfasel
5a5f41246d feat(base-app): Add docker setup
Adding a dockerfile as well as an `entrypoint.sh` file for deployment
using `gunicorn`.
2024-08-29 17:35:14 +02:00
Tobias Quadfasel
25aeb62951 feat(base-app): Add gunicorn to dependencies 2024-08-29 17:34:04 +02:00
Tobias Quadfasel
065c831cde feat(base-app): Add base dash app file
This commit adds a file that contains a standard 'Hello World'
example of a plotly dash app.
2024-08-29 16:28:21 +02:00
b307b8327a feat(project_setup): Initialize project (#1)
Setting up python project files using poetry. A basic environment
is installed including dash for the app that will be implemented
later.

Also contains several dev tools, including pre-commit hooks.

Co-authored-by: Tobias Quadfasel <tobias.loesche@studium.uni-hamburg.de>
Reviewed-on: #1
2024-08-29 14:08:09 +00:00
9377c07044 docs: Add README.md
This is the initial commit containing a rudimentary one-liner as README file.
2024-08-23 22:53:05 +00:00