Software Architecture
Application Diagram
Frontend (Ruby on Rails - MVC Architecture):
Model-View-Controller (MVC) Pattern:
Model: Manages the business logic and database interactions using Active Record. It stores user particulars and other data extracted from documents in the corresponding database tables.
View: Responsible for presenting data to the user and collecting input. The views are designed using HTML, CSS, and embedded Ruby (ERB) to render dynamic content.
Controller: Acts as an intermediary between the View and the Model. It handles incoming HTTP requests, interacts with the backend services, and manages the flow of data to and from the Model and View.
Backend (Flask Microservice - RESTful API):
Microservices Architecture:
The backend is built as a microservice using Flask, providing RESTful APIs to handle specific functionalities.
Document Processing: The Flask backend uses Google’s Document AI to extract data from uploaded documents. This service is responsible for parsing and analyzing the document content to extract relevant information.
Chatbot and Verification: Vertex AI is integrated with the Flask backend to manage the chatbot functionalities and to verify the extracted document data. The chatbot can interact with users, providing guidance and collecting additional information if necessary.
Communication Flow:
The Controller in the Ruby on Rails frontend sends HTTP requests to the Flask backend via RESTful APIs, requesting document processing and chatbot interactions.
The Flask backend processes the requests, utilizing Document AI for data extraction and Vertex AI for chatbot interactions and verification.
The backend then sends a response back to the Rails frontend. This response might include extracted data, verification results, or chatbot responses.
The Controller receives this response, processes it, and performs further actions:
Google Cloud Storage: The Controller interacts with Google Cloud Storage to store the original and processed documents in designated buckets.
Active Record Database Interaction: The Controller then uses Active Record to save the extracted and verified particulars into the relevant database tables.
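The flow above can be sketched end to end in a few lines. This is a minimal, framework-agnostic illustration and not the project's actual code: `handle_process_document` stands in for a Flask route handler, `controller_flow` stands in for the Rails controller (shown in Python only to keep the example in one language), and `fake_extract` / `fake_verify` are hypothetical placeholders for the Document AI and Vertex AI calls. The dictionary and list used for storage, the object name, and the choice to save only verified data are assumptions as well.

```python
import json
from typing import Callable, Dict


# Hypothetical stand-ins for the external services. The real calls to
# Document AI and Vertex AI are not shown in this document.
def fake_extract(document_bytes: bytes) -> Dict[str, str]:
    return {"name": "Jane Doe", "date_of_birth": "1990-01-31"}


def fake_verify(fields: Dict[str, str]) -> bool:
    # Treat the data as verified when every extracted field is non-empty.
    return all(value.strip() for value in fields.values())


def handle_process_document(
    request_body: str,
    extract: Callable[[bytes], Dict[str, str]] = fake_extract,
    verify: Callable[[Dict[str, str]], bool] = fake_verify,
) -> str:
    """Backend side: parse the JSON request, extract, verify, respond."""
    payload = json.loads(request_body)
    fields = extract(payload["document"].encode("utf-8"))
    return json.dumps({"extracted": fields, "verified": verify(fields)})


def controller_flow(document_text: str, bucket: dict, table: list) -> dict:
    """Frontend side: call the backend, then store the file and the fields."""
    request_body = json.dumps({"document": document_text})
    response = json.loads(handle_process_document(request_body))
    # A dict stands in for a Cloud Storage bucket (object name is made up).
    bucket["originals/doc-1.txt"] = document_text
    # A list stands in for an Active Record table; saving only verified
    # data is an assumption of this sketch.
    if response["verified"]:
        table.append(response["extracted"])
    return response


bucket, table = {}, []
result = controller_flow("passport scan contents", bucket, table)
```

Passing the extractor and verifier in as parameters keeps the handler testable without network access; in the real service they would wrap the Google Cloud client calls.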
Data Storage and Persistence:
Google Cloud Storage: Stores the documents securely in cloud buckets, ensuring they are accessible for future reference and processing.
Active Record Database: Handles the persistence of user details, document metadata, and other critical data, ensuring that the application maintains a structured and queryable data store.
Database Diagram
Database:
The database schema includes several key tables, each corresponding to our main controllers:
Foreign Applicant: Stores user information such as name, email, and date of birth.
Document: Stores each document's data and a reference to the user it belongs to.
Key relationships:
One-to-Many: A user (foreign applicant) can have many documents.
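The one-to-many relationship can be illustrated with a small in-memory SQLite schema. This is only a sketch: the table names `foreign_applicants` and `documents`, the `foreign_applicant_id` foreign key, and the `document_type` and `extracted_data` columns are assumptions that follow common Rails naming conventions. Only name, email, and date of birth come from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Assumed schema: one applicant row, many document rows pointing back to it.
conn.executescript(
    """
    CREATE TABLE foreign_applicants (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        email         TEXT NOT NULL UNIQUE,
        date_of_birth TEXT
    );
    CREATE TABLE documents (
        id                   INTEGER PRIMARY KEY,
        foreign_applicant_id INTEGER NOT NULL REFERENCES foreign_applicants(id),
        document_type        TEXT,
        extracted_data       TEXT
    );
    """
)

applicant_id = conn.execute(
    "INSERT INTO foreign_applicants (name, email, date_of_birth) VALUES (?, ?, ?)",
    ("Jane Doe", "jane@example.com", "1990-01-31"),
).lastrowid

# Two documents that belong to the same applicant.
conn.executemany(
    "INSERT INTO documents (foreign_applicant_id, document_type, extracted_data) "
    "VALUES (?, ?, ?)",
    [(applicant_id, "passport", "{}"), (applicant_id, "visa", "{}")],
)

document_count = conn.execute(
    "SELECT COUNT(*) FROM documents WHERE foreign_applicant_id = ?",
    (applicant_id,),
).fetchone()[0]
```

In the Rails models this relationship would typically be declared with `has_many :documents` on the applicant and `belongs_to` on the document.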
Google Cloud Platform Services
Vertex AI:
Purpose: Vertex AI is leveraged for its machine learning and AI capabilities, specifically for the chatbot interactions and document verification processes within the application.
Chatbot Functionality:
Vertex AI powers the AI-driven chatbot that interacts with users, providing real-time assistance, answering queries, and guiding them through the document submission process.
The chatbot is trained to understand user inputs, provide contextually relevant responses, and gather additional information if needed.
Document Verification:
Vertex AI also plays a crucial role in the verification of data extracted from documents. After Document AI processes the documents, Vertex AI can further analyze and cross-verify the information against predefined criteria or external data sources.
This ensures that the data is accurate and meets the necessary validation rules before being stored or further processed.
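To make "predefined criteria" concrete, the sketch below shows the kind of rule checks that could run on extracted fields before they are stored. It does not call Vertex AI and is not the project's verification logic. The function name `verify_fields`, the field names, and the specific rules are all assumptions chosen for illustration.

```python
import re
from datetime import date


def verify_fields(fields: dict) -> list:
    """Return a list of problems found; an empty list means the data passed."""
    problems = []

    # Rule 1: a name must be present.
    if not str(fields.get("name") or "").strip():
        problems.append("name is missing")

    # Rule 2: the email must look like local@domain.tld (a loose check).
    email = str(fields.get("email") or "")
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        problems.append("email is not well formed")

    # Rule 3: the date of birth must be an ISO date in the past.
    try:
        born = date.fromisoformat(str(fields.get("date_of_birth") or ""))
        if born >= date.today():
            problems.append("date_of_birth is not in the past")
    except ValueError:
        problems.append("date_of_birth is not an ISO date")

    return problems


good = verify_fields(
    {"name": "Jane Doe", "email": "jane@example.com", "date_of_birth": "1990-01-31"}
)
bad = verify_fields({"name": "", "email": "nope", "date_of_birth": "31/01/1990"})
```

Returning a list of problems instead of a single boolean lets the chatbot tell the user exactly which particulars need to be corrected.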
Document AI:
Purpose: Document AI is utilized for its advanced document understanding and data extraction capabilities.
Data Extraction:
Document AI automatically extracts structured data from a wide range of document types, such as invoices, forms, and identity documents.
The service processes the uploaded documents, identifying key fields and extracting relevant information such as names, dates, amounts, and other important details.
This extracted data is then sent back to the Flask backend for further processing, including verification and storage.
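Once Document AI returns its result, the backend needs to turn the list of detected entities into plain fields. The sketch below assumes the JSON form of a Document AI document, where each entity carries `type`, `mentionText`, and `confidence`. The entity type names in the sample, the helper name `entities_to_fields`, and the 0.8 confidence threshold are assumptions, since they depend on the processor that is configured.

```python
def entities_to_fields(document: dict, min_confidence: float = 0.8) -> dict:
    """Flatten Document AI style entities into a {type: text} dictionary.

    Entities below min_confidence are dropped so that low-quality reads
    can be collected again, for example through the chatbot.
    """
    fields = {}
    for entity in document.get("entities", []):
        if entity.get("confidence", 0.0) >= min_confidence:
            fields[entity["type"]] = entity.get("mentionText", "").strip()
    return fields


# Hand-written sample in the assumed response shape (not real output).
sample_document = {
    "entities": [
        {"type": "name", "mentionText": "Jane Doe ", "confidence": 0.97},
        {"type": "date_of_birth", "mentionText": "1990-01-31", "confidence": 0.91},
        {"type": "email", "mentionText": "jane@exarnple.com", "confidence": 0.42},
    ]
}

fields = entities_to_fields(sample_document)
```

Here the misread email falls below the threshold and is left out, which is the case where verification or a follow-up question to the user would take over.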
Integration with Flask Backend:
The Flask microservice communicates with Document AI via RESTful APIs, sending document data and receiving structured information in return.
This seamless integration allows for automated and scalable document processing, reducing the need for manual data entry and minimizing errors.
Google Cloud Storage:
Purpose: Google Cloud Storage is used to securely store and manage the documents and other data files generated or uploaded by the users.
Storage Buckets:
Documents are stored in GCP Storage Buckets, which act as scalable and durable repositories for large amounts of unstructured data.
The architecture typically involves creating separate buckets for different types of data, such as original documents, processed documents, and backups.
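One way to keep the separate buckets consistent is to build every object location through a single helper. The sketch below is illustrative only: the bucket names, the `applicants/<id>/` prefix, and the `storage_uri` helper are hypothetical, and no Cloud Storage API call is made.

```python
from pathlib import PurePosixPath

# Hypothetical bucket names, one per kind of data.
BUCKETS = {
    "original": "example-original-documents",
    "processed": "example-processed-documents",
    "backup": "example-document-backups",
}


def storage_uri(kind: str, applicant_id: int, filename: str) -> str:
    """Build the gs:// URI under which a document would be stored."""
    if kind not in BUCKETS:
        raise ValueError(f"unknown document kind: {kind!r}")
    # Keep only the final path component so user-supplied names cannot
    # escape the applicant's prefix.
    safe_name = PurePosixPath(filename).name
    return f"gs://{BUCKETS[kind]}/applicants/{applicant_id}/{safe_name}"


original_uri = storage_uri("original", 7, "passport.pdf")
processed_uri = storage_uri("processed", 7, "../secret.pdf")
```

Grouping objects under a per-applicant prefix makes it straightforward to list or remove everything that belongs to one user.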
Secure Access and Management:
Access to these buckets is controlled via GCP’s Identity and Access Management (IAM), ensuring that only authorized services and users can interact with the stored data.
Versioning and lifecycle management policies can be applied to these buckets to manage data retention and storage costs effectively.
Integration with Frontend:
The Rails controller interacts with Google Cloud Storage to upload the original and processed documents, ensuring they are available for future reference or further processing.
The stored documents can be retrieved by other components of the application as needed, supporting features like re-verification, user access, or compliance audits.
Automating CI/CD with GitHub Actions
GitHub Actions is used to automate the deployment process, ensuring that every change to the code is deployed automatically without manual intervention. This is achieved through a workflow configuration file, located at .github/workflows/deploy.yml, which outlines the steps necessary to deploy our application.
Postman API Requests/Response
We used Postman to test the API endpoints of the Flask backend to ensure they work as expected.