Distributed System Performance Demonstration

This project demonstrates the improved performance of a distributed system compared to a centralized system. The code compares the time it takes to process documents with both systems and also includes a CLI to interact with the data.

Getting Started

Before running the code, ensure that Python is installed on your system. This project has been tested with Python 3.8 and above. If you encounter any missing dependencies, install them via pip.

The movie-review-data folder contains the article information to be processed. Add more documents to this folder to enlarge the data set as required.

Running The Code

To run the code, open a terminal, navigate to the folder containing main.py, and execute:

python main.py

How The Code Works

The main.py script performs the following operations:

1. Measures the runtime performance of processing 8 documents with 1 worker (a simulation of a centralized system).
2. Measures the runtime performance of processing the same 8 documents with multiple workers (a simulation of a distributed system).
3. Launches a command-line interface (CLI) to interact with the data after processing.
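
The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in main.py: process_document, timed_run, the 0.05 second delay, and the choice of 4 workers are all assumptions made for the example, and a thread pool stands in for the project's worker nodes.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def process_document(doc_id: int, delay: float = 0.05) -> str:
    """Hypothetical stand-in for real document processing; sleeps to simulate work."""
    time.sleep(delay)
    return f"doc-{doc_id}-processed"


def timed_run(doc_ids, workers: int):
    """Process every document with the given number of workers and time the run."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as doc_ids
        results = list(pool.map(process_document, doc_ids))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    docs = list(range(8))  # 8 documents, as in the description above
    _, single = timed_run(docs, workers=1)  # simulated centralized system
    _, multi = timed_run(docs, workers=4)   # simulated distributed system
    print(f"1 worker:  {single:.2f}s")
    print(f"4 workers: {multi:.2f}s")
```

Threads show a speedup here only because the simulated work is a sleep, which releases Python's global interpreter lock. CPU-bound processing would likely need separate processes or machines to show a similar gain.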

CLI Usage

Once main.py has completed its performance measurements, it will present a CLI with the following options:

1. Get Campaign Details
2. Get Campaign ID List
3. Get Article IDs for Campaign
4. Get Date List
5. Get By Date
6. Get By Article ID
7. Exit

To use the CLI:

Enter the number corresponding to the action you want to perform.
Follow the prompts to enter any additional information required (e.g., Campaign ID, Article ID, or Date).
The output will be presented directly in the terminal.
To exit the CLI, enter 7 when prompted for a choice.
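
A menu loop of the kind described above might look like the sketch below. The menu labels are taken from the list above; everything else (run_cli, HANDLERS, SAMPLE_CAMPAIGNS, the prompt text) is hypothetical and not taken from main.py. Only options 2 and 3 are wired up, to keep the example short.

```python
MENU = [
    "Get Campaign Details",
    "Get Campaign ID List",
    "Get Article IDs for Campaign",
    "Get Date List",
    "Get By Date",
    "Get By Article ID",
    "Exit",
]


def run_cli(handlers, read=input, write=print):
    """Show the menu repeatedly until the user picks the Exit option."""
    exit_choice = str(len(MENU))  # "7", the last menu entry
    while True:
        for number, label in enumerate(MENU, start=1):
            write(f"{number}. {label}")
        choice = read("Enter your choice: ").strip()
        if choice == exit_choice:
            write("Exiting.")
            return
        handler = handlers.get(choice)
        if handler is None:
            # Unknown input, or an option this sketch does not implement
            write("Invalid choice, please try again.")
            continue
        # Each handler may prompt for extra input (e.g. a Campaign ID)
        write(handler(read))


# Made-up in-memory data, purely for illustration
SAMPLE_CAMPAIGNS = {"c1": ["a1", "a2"]}

# Hypothetical handlers for options 2 and 3 only
HANDLERS = {
    "2": lambda read: sorted(SAMPLE_CAMPAIGNS),
    "3": lambda read: SAMPLE_CAMPAIGNS.get(read("Campaign ID: ").strip(), []),
}

if __name__ == "__main__":
    run_cli(HANDLERS)
```

Passing read and write as parameters is a design choice for this sketch: it lets the loop be driven by scripted input in tests without touching the terminal.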

Project Design

The project design document can be found in project-design-document.md. Although we haven't implemented the complete system, the prototype demonstrates the performance improvements of a distributed system over a centralized one.

Some Items That Were Not Implemented

GPT connection (to save on cost)
Paxos consensus (currently there is a shared KV Store)
Read-Repair (currently there is a shared KV Store)
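
To illustrate what a shared KV store means in this context, here is a minimal sketch, assuming a single in-memory dictionary guarded by a lock. SharedKVStore and its methods are hypothetical names for the example and do not describe the project's kvstore module.

```python
import threading


class SharedKVStore:
    """One in-memory store shared by all workers (hypothetical sketch)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put(self, key, value):
        # The lock serializes writers so concurrent workers cannot interleave updates
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)
```

Because every worker reads and writes the same single copy in a design like this, there are no replicas that can disagree, which is presumably why consensus and read-repair could be left out of the prototype.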
