`docs/SETUP/BACK_END.md`

## Setup

### Fork and Clone

1. Navigate to [our GitHub repository](https://github.com/CodeForPhilly/clean-and-green-philly).
2. Create a fork of the repository by clicking the "Fork" button in the top right corner of the page.
3. Clone your fork of the repository to your local machine using `git clone`.

**Note:** Keep your fork up to date with the original repository by following the instructions [here](https://docs.github.com/en/get-started/quickstart/fork-a-repo#keep-your-fork-synced).

### Install Dependencies and Pre-commit Hooks

1. Navigate to the root directory and install the virtual environment and dependencies:

```sh
uv sync
```

2. Install pre-commit hooks for code quality and commit message validation:

```sh
uv run pre-commit install
```

3. Copy the commit message hook file to ensure conventional commit format validation:

**Windows Command Prompt:**

```cmd
copy .github\hooks\commit-msg .git\hooks\
```

**Mac/Linux/Git Bash:**

```bash
cp .github/hooks/commit-msg .git/hooks/
chmod +x .git/hooks/commit-msg
```
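
With the hooks installed, you can optionally run them against the entire codebase as a quick sanity check. This sketch assumes the standard `pre-commit` CLI installed above:

```sh
# run every configured hook against all files, not just staged changes
uv run pre-commit run --all-files
```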

> **Note:** All commits must follow the Conventional Commits format: `<type>[optional scope]: <description>`
>
> - `feat(FilterView): add new method for conditional filtering`
> - `docs: update the pull request template`
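
For example, a commit using one of the example messages above would look like this; substitute your own type, optional scope, and description:

```sh
# the commit-msg hook installed earlier should reject messages that do not follow this format
git commit -m "docs: update the pull request template"
```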

### Environment Variables

Copy the `data/.env.example` file to `data/.env` and fill in your actual values:

```sh
cp data/.env.example data/.env
```

Then edit `data/.env` with your specific credentials for Google Cloud Platform and Slack integration.

All environment variables will be automatically passed through to Docker containers.
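
As a rough sketch, the file uses `KEY=value` lines. `CAGP_SLACK_API_TOKEN` appears later in this document; the other variable name here is illustrative, so treat `data/.env.example` as the authoritative list:

```sh
# data/.env -- illustrative values only; copy the real keys from data/.env.example
CAGP_SLACK_API_TOKEN=xoxb-your-slack-bot-token        # Slack app token used for diff reports
GOOGLE_APPLICATION_CREDENTIALS=/path/to/gcp-key.json  # standard GCP variable name; this project may use a different one
```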

### Docker Setup

We use [docker compose](https://docs.docker.com/compose/) to manage the backend Docker services, which are defined in `data/docker-compose.yaml`.

1. Install [Docker Desktop](https://www.docker.com/products/docker-desktop/) for your operating system.

2. Build the Docker services (run from the `data/` directory):

```sh
cd data
docker compose build
```

3. Run the main pipeline:

```sh
docker compose run vacant-lots-proj
```

4. When finished, shut down the containers:

```sh
docker compose down
```

**Note:** For first-time runs, set `FORCE_RELOAD=True` in `config.py` and optionally `log_level: int = logging.DEBUG` for verbose output.

**Note:** The backend also works on WSL Ubuntu running Docker for Linux on Windows 10.
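
While the pipeline is running, you can inspect the services from another terminal using standard docker compose commands (nothing project-specific is assumed here):

```sh
docker compose ps                        # list the project's containers and their status
docker compose logs -f vacant-lots-proj  # follow the pipeline's log output
```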

## Python Development

You can develop and run the backend `script.py` and unit tests outside of Docker using your local Python environment (a minimal sketch follows this list):

1. Install the same Python version as defined in the `Dockerfile` (3.11.4)
2. Use `uv` to create a virtual environment and install dependencies from `pyproject.toml`
3. Run unit tests with `pytest`
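
As noted above, here is a minimal sketch of that local setup, assuming `uv` is installed and you are in the repository root where `pyproject.toml` lives:

```sh
uv sync          # create the virtual environment and install the pyproject.toml dependencies
uv run pytest    # run the unit tests inside that environment
```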

For testing individual services:

```sh
cd data
uv run test_service.py [name_of_service]
# Example: uv run test_service.py opa_properties
```

The `config.py` file defines several log levels for testing the pipeline, including profiling, geometry debugging, and verbose output.

## Configuration

Configuration variables are defined in `data/src/config/config.py`. See the documentation in that file for each variable.

### Required Secrets

The following secrets may be shared by project leads:

- **Google Cloud credentials** for accessing the cloud platform
- **Slack API key** for posting diff reports to the project Slack

For development, you can set up your own GCP account and Slack bot for testing.

### Code Changes and Formatting

- Changes should address an [issue](https://github.com/CodeForPhilly/vacant-lots-proj/issues)
- Submit pull requests for review by the team lead or tech lead
- Format Python files (a Docker-free alternative is sketched after this code block):

```sh
docker compose run --rm vacant-lots-proj sh -c "ruff format"
```
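
If `ruff` is also installed in your local `uv` environment (an assumption; it is only guaranteed inside the Docker image), the same formatting can be run without Docker:

```sh
uv run ruff format
```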

## Google Cloud Platform (GCP)

The map data is converted to the [pmtiles](https://docs.protomaps.com/pmtiles/) format and served from Google Cloud. For access to production credentials, contact the project lead.

The `CAGP_SLACK_API_TOKEN` environmental variable must be set with the API key for the Slack app that can write messages to the channel as configured in the config.py `report_to_slack_channel` variable.

The report will also be emailed to any emails configured in the config.py `report_to_email` variable.

# Production script execution

The job to reload the backend data has been scheduled in the Google Cloud to run on a weekly basis.

A virtual machine running Debian Linux named `backend` is set up in the compute engine of the CAGP GCP account. The staging branch of the git project has been cloned into the home directory of the `cleanandgreenphl` user. All required software, such as docker and git, has been installed on this vm.

To access the Linux terminal of this vm instance via SSH, you can use the 'SSH-in-browser' GCP tool on the web. Go to Compute Engine -> VM instances, select SSH next to the `backend` instance, then select 'Open in browser window'.

You can also connect to the vm with the terminal ssh client on your pc. This is recommended for more advanced use cases, as the web UI is limited. To set this up, follow the steps below:

- In GCP, go to IAM and Admin -> Service Accounts -> Keys and click on the `[email protected]` account.
- Click 'Add key'. You can only download the service account JSON key file when you create a key, so you will have to create a new key. Select 'JSON' and save the .json file to your local machine.
- Download and install the [Google Cloud Command Line Interface (CLI)](https://cloud.google.com/sdk/docs/install) for your OS.
- In your terminal, navigate to the folder with your saved .json file. Run the command (a hedged sketch follows this list).
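
The exact command is not reproduced here. Assuming the JSON key is used to authenticate the gcloud CLI and then open an SSH session to the `backend` instance, the sequence would look roughly like this (the key file name, project ID, and zone are placeholders):

```sh
# authenticate gcloud with the downloaded service-account key
gcloud auth activate-service-account --key-file=service-account-key.json
# open an SSH session to the backend VM
gcloud compute ssh cleanandgreenphl@backend --project=<gcp-project-id> --zone=<vm-zone>
```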

You will land in the home directory of the `cleanandgreenphl` user. The project has been cloned to this directory.

The job to regenerate and upload the tiles file and street images to the GCP bucket has been scheduled in `cron` to run weekly on Wednesday at 5 AM. You can run `crontab -l` to see the job. Currently it looks like this:

`0 5 * * 3 . /home/cleanandgreenphl/.cagp_env && cd clean-and-green-philly/data && docker compose run vacant-lots-proj && docker compose run streetview`

The specific production environmental variables are stored in `/home/cleanandgreenphl/.cagp_env`. Some variables in the `data/src/config/config.py` project file have been edited locally for the scheduled run. Be careful when running this job in this environment, because the production web site could be affected.

The message with the diff report will be sent to the `clean-and-green-philly-back-end` Slack channel.

To troubleshoot any errors, you can look at the docker logs of the last run container, e.g.:
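
The original example is truncated in this diff; a generic way to do this with the standard Docker CLI is:

```sh
docker ps -a                  # find the container from the last pipeline run
docker logs <container-id>    # print that container's logs
```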