Technologies & Libraries
Cloudflare for SSL, DDoS protection, and CDN
Nginx as reverse proxy and static file server
Gunicorn for Python WSGI app deployment
PostgreSQL with PgBouncer for DB optimization
systemd for service control and zero-downtime deployment
Ubuntu Server in production; shared DB for dev on Windows
Key Implementations & Functionalities
Multi-layer stack: Cloudflare → Nginx → Gunicorn → PostgreSQL
High connection efficiency: PgBouncer pools ~1,000 Django connections down to 12 PostgreSQL backend connections (see the settings sketch after this list)
Secure static content delivery with headers and gzip
9 domains hosted under a unified stack
Handles over 100M media files with optimized caching
Local development shares production database seamlessly
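A minimal sketch of the Django side of this pooling setup, assuming PgBouncer runs locally in transaction mode; the database name, credentials, and port are illustrative, not the production values:

```python
# settings.py (sketch) -- route Django through a local PgBouncer instance.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "artsite",            # illustrative database name
        "USER": "artsite_app",
        "PASSWORD": "change-me",
        "HOST": "127.0.0.1",          # PgBouncer, not PostgreSQL directly
        "PORT": "6432",               # PgBouncer's default listen port
        # With transaction pooling, server-side cursors must be disabled and
        # connection persistence is left to the pooler rather than Django.
        "DISABLE_SERVER_SIDE_CURSORS": True,
        "CONN_MAX_AGE": 0,
    }
}
```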
Technologies & Libraries
Django 5.2.3 with built-in ORM and admin
django-parler for multilingual support (20 languages)
Django Q for async tasks without Redis
PostgreSQL FTS with GIN indexes for search
django-cms and django-grappelli for content and UI admin
Key Implementations & Functionalities
i18n-ready models for Artists, Artworks, Museums, Articles
SEO-friendly multilingual URLs auto-detect language
Caching layer built on Django's database cache backend
20 million indexed records for fast multilingual search
Background task queue with admin monitoring
Optimized queries for large-scale read performance
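A rough sketch of how a django-parler translatable model and a PostgreSQL full-text query could be wired together; the Artist field names and the search helper are illustrative assumptions, not the production schema:

```python
# models.py / search.py (sketch)
from django.contrib.postgres.search import SearchQuery, SearchVector
from django.db import models
from parler.models import TranslatableModel, TranslatedFields


class Artist(TranslatableModel):
    # Translated fields live in a separate table, one row per language.
    translations = TranslatedFields(
        name=models.CharField(max_length=255),
        biography=models.TextField(blank=True),
    )

    def __str__(self):
        return self.safe_translation_getter("name", any_language=True)


def search_artists(text: str, language_code: str = "en"):
    # Full-text search over the translation table for one language.
    return (
        Artist.objects.language(language_code)
        .filter(translations__language_code=language_code)
        .annotate(search=SearchVector("translations__name", "translations__biography"))
        .filter(search=SearchQuery(text))
    )
```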
Technologies & Libraries
LangChain and OpenAI/Gemini for content generation
json_repair for structured output recovery
Custom prompt engineering for content field accuracy
Multi-source enrichment (Wikipedia, DB, etc.)
Key Implementations & Functionalities
4 models updated via AI: Artist, Artwork, Museum, Article
20 million records updated in 20 languages
Batch scripts manage content regeneration workflows
Fallbacks and retries keep batch runs going after transient failures
Logs, cost tracking, and validation post-processing
Generates field-level computed metadata (e.g., tags, styles)
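A minimal sketch of one enrichment step, combining a LangChain prompt with json_repair to recover structured output; the model name, prompt wording, and field list are assumptions for illustration:

```python
# enrich.py (sketch)
import json_repair
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # illustrative model choice

prompt = ChatPromptTemplate.from_template(
    "Return a JSON object with keys 'tags', 'styles', and 'summary' "
    "describing the artist below.\n\nArtist: {name}\nKnown facts: {facts}"
)

def enrich_artist(name: str, facts: str) -> dict:
    raw = (prompt | llm).invoke({"name": name, "facts": facts}).content
    # json_repair tolerates truncated or slightly malformed JSON from the model.
    return json_repair.loads(raw)
```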
Technologies & Libraries
Django templates for structured articles
AI-assisted content synthesis and SEO tuning
Image logic for selecting article visuals
Scheduled generation via Django Q
Key Implementations & Functionalities
Menu-based and "Top 25" articles auto-generated per category
Links artworks, artists, museums into cohesive narratives
Multi-language generation in 20 languages
Uses metadata for historical context and keyword optimization
Handles bulk runs with queueing system
Enforces uniqueness and readability in all content
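One way the bulk and scheduled runs might be queued with Django Q; the dotted task paths and the daily cadence are illustrative assumptions:

```python
# scheduling.py (sketch)
from django_q.models import Schedule
from django_q.tasks import async_task

# One-off: queue a "Top 25" article for a single category.
async_task("articles.generation.build_top25", category_id=42, language="en")

# Recurring: rebuild menu-based articles once a day.
Schedule.objects.get_or_create(
    func="articles.generation.rebuild_menu_articles",
    name="daily-menu-articles",
    defaults={"schedule_type": Schedule.DAILY},
)
```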
Technologies & Libraries
GPT for image-prompt generation
PIL/Pillow for image resizing and conversion
Support for multiple image APIs
Django Q for batch image tasks
Key Implementations & Functionalities
AI generates article cover images and artwork alternatives
User photos restyled into artistic formats (20+ styles)
Automatic thumbnails in multiple resolutions
Image optimization with WebP/JPEG and CDN delivery
Storage auto-organized and cache-controlled
Face detection and enhancement on portraits
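A compact sketch of the thumbnail/WebP step with Pillow; the widths, quality setting, and paths are illustrative assumptions:

```python
# images.py (sketch)
from pathlib import Path
from PIL import Image

THUMB_WIDTHS = (320, 640, 1280)  # illustrative responsive breakpoints

def make_webp_variants(src: str, out_dir: str) -> list[Path]:
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    variants = []
    with Image.open(src) as im:
        im = im.convert("RGB")
        for width in THUMB_WIDTHS:
            height = round(im.height * width / im.width)  # keep aspect ratio
            thumb = im.resize((width, height), Image.Resampling.LANCZOS)
            path = out / f"{Path(src).stem}_{width}.webp"
            thumb.save(path, "WEBP", quality=82, method=6)
            variants.append(path)
    return variants
```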
Technologies & Libraries
Django templates with Bootstrap 5 and HTMX
JavaScript (ES6+) for UI logic; CSS Grid and Flexbox for layout
Key Implementations & Functionalities
Fully multilingual responsive interface with SEO optimization
HTMX enables filter/search without full page reload
Image delivery optimized for device and DPI
OAuth-enabled auth flows, shopping cart, and payment integration
Structured data and robots/sitemap management
Full accessibility and performance optimization (PWA-ready)
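A minimal sketch of the HTMX filter/search pattern on the Django side; the template names, query parameter, field names, and Artwork import path are illustrative assumptions:

```python
# views.py (sketch)
from django.shortcuts import render

from catalog.models import Artwork  # hypothetical app path

def artwork_list(request):
    qs = Artwork.objects.all()
    if q := request.GET.get("q"):
        qs = qs.filter(translations__title__icontains=q)
    # HTMX sends an HX-Request header, so the view can return only the
    # fragment that gets swapped into the page instead of the full document.
    template = (
        "artworks/_results.html"
        if request.headers.get("HX-Request")
        else "artworks/list.html"
    )
    return render(request, template, {"artworks": qs[:50]})
```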
Technologies & Libraries
Languages & Core: Python, sqlite3 (Caching), pickle (Serialization), hashlib, re (Regular Expressions).
AI Orchestration: LangChain (Chains, Retrievers, Prompt Engineering).
LLM Providers: OpenAI API (GPT-3.5, GPT-4, GPT-4 Turbo), Replicate API (Llama 2-70b), Together AI (Llama 2-70b).
Vector Search & Embeddings: ChromaDB (Vector Store), HuggingFace BGE Embeddings (BAAI/bge-large-en).
External Search APIs: SerpAPI (Bing Search Engine integration), Wikipedia API (via WikipediaRetriever).
Data Acquisition: BeautifulSoup4 (bs4) for HTML parsing/cleaning, requests for HTTP handling.
Key Implementations & Functionalities
Hybrid RAG Architecture: Engineered a Retrieval-Augmented Generation system that dynamically aggregates context from three distinct layers: internal unstructured documentation (Vector Search), internal structured databases (SQL-like queries for Artworks/Orders), and external real-time web data (SerpAPI/Wikipedia).
Multi-LLM Abstraction Layer: Developed a unified LLM_Engine class capable of seamlessly switching between proprietary (GPT-4) and open-source (Llama 2) models at runtime, optimizing for cost and performance based on task complexity.
Intelligent Intent Analysis: Implemented a two-stage reasoning pipeline where an initial LLM call analyzes user input to generate a structured JSON plan, extracting specific search keywords and categorizing intent (e.g., "Visual_Arts", "Customer_Previous_Order") before executing retrieval tools.
Context Window Optimization: Designed a specialized scraping pipeline that cleans, tokenizes, and truncates external web content to ensure relevant information fits effectively within the LLM's context window limits.
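A minimal sketch of the first-stage intent analysis feeding the retrieval layer; the prompt wording, model choice, and helper signatures are illustrative assumptions rather than the production pipeline:

```python
# intent.py (sketch)
import json_repair
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)  # illustrative model choice

plan_prompt = ChatPromptTemplate.from_template(
    "Classify the request and extract search keywords. Answer as JSON:\n"
    '{{"intent": "Visual_Arts" or "Customer_Previous_Order", "keywords": [...]}}\n\n'
    "Request: {question}"
)

def make_plan(question: str) -> dict:
    raw = (plan_prompt | llm).invoke({"question": question}).content
    return json_repair.loads(raw)

def retrieve_context(question: str, vector_store, order_lookup) -> str:
    plan = make_plan(question)
    if plan.get("intent") == "Customer_Previous_Order":
        return order_lookup(plan["keywords"])            # structured DB layer
    docs = vector_store.similarity_search(" ".join(plan["keywords"]), k=4)
    return "\n".join(d.page_content for d in docs)       # vector-search layer
```

The second-stage generation call would then consume the returned context alongside the original question.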
Technologies & Libraries
Python, Argos Translate (OpenNMT), Custom Parsing Algorithms, Standard Library (Global State Management).
Key Implementations & Functionalities
Enterprise-Scale Translation: Designed and deployed an automated system to translate 370,000 artworks into 9 languages (including French, German, Chinese, and Russian), significantly expanding international accessibility.
HTML-Aware Parsing: Developed a character-stream parser to isolate and preserve HTML tags (e.g., <H1>, <A>) while selectively translating only the text content.
Dynamic Resource Management: Built a system to automatically detect, download, and install missing neural language packages at runtime based on the required language pair.
Performance Optimization: Implemented global in-memory caching to track installed language pairs, preventing redundant initialization and ensuring efficient high-volume processing.
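A minimal sketch of the on-demand package install plus in-memory pair cache with Argos Translate; error handling and the HTML-aware parsing layer are omitted:

```python
# argos_translate.py (sketch)
import argostranslate.package
import argostranslate.translate

_INSTALLED_PAIRS: set[tuple[str, str]] = set()  # global cache of ready pairs

def ensure_pair(src: str, dst: str) -> None:
    if (src, dst) in _INSTALLED_PAIRS:
        return
    argostranslate.package.update_package_index()
    pkg = next(
        p for p in argostranslate.package.get_available_packages()
        if p.from_code == src and p.to_code == dst
    )
    argostranslate.package.install_from_path(pkg.download())
    _INSTALLED_PAIRS.add((src, dst))

def translate_text(text: str, src: str, dst: str) -> str:
    ensure_pair(src, dst)
    return argostranslate.translate.translate(text, src, dst)
```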
Technologies & Libraries
Programming Language: Python
AI & LLM Integration: OpenAI API (GPT-3.5/4), Replicate API (Mistral/Mixtral), Local LLM execution (ctransformers, AutoModelForCausalLM for Llama/Orca models).
Frameworks: LangChain (Prompt management, caching), ChromaDB (Vector Database for RAG).
Computer Vision: OpenCV (cv2) for image comparison (ORB feature matching), Multimodal LLM integration (LLaVA).
Natural Language Processing: SpaCy (en_core_web_md) for semantic similarity and text parsing.
Web & Networking: requests (Session management, Retries, Proxies), BeautifulSoup4 (HTML parsing), urllib, SerpApi (Bing Search engine integration).
Data Management: SQLite (Custom caching implementation), Pandas, Pickle, JSON (demjson3, json_repair).
Key Implementations & Functionalities
Retrieval-Augmented Generation (RAG): Designed a custom pipeline that retrieves context from three sources before generation: internal WahooArt/ArtsDot databases (Lotus Domino .nsf endpoints), external web searches (Bing/Google via SerpApi), and Wikipedia (WikipediaRetriever).
Multilingual Content Automation: Implemented a batch processing system to generate academic-style art articles across 6 languages (English, French, German, Italian, Spanish, Portuguese) targeting specific topics like Art Styles, Media, and Museums.
Automated Publishing Pipeline: Created scripts to sanitize generated HTML, handle character encoding for European languages, and automatically POST content to the CMS (Content Management System).
Data Scraping & Indexing: Built a system to scrape internal documentation URLs, vectorize content using HuggingFace embeddings, and store the embeddings in ChromaDB for semantic search retrieval.
Image Analysis: Integrated functions for comparing artwork images using complementary computer vision techniques (OpenCV structural comparison vs. AI-based visual description).
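A rough sketch of an ORB-based comparison along the lines of the OpenCV path mentioned above; the feature count, distance threshold, and similarity heuristic are illustrative assumptions:

```python
# compare.py (sketch)
import cv2

def orb_similarity(path_a: str, path_b: str) -> float:
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create(nfeatures=1000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0.0
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_a, des_b)
    good = [m for m in matches if m.distance < 60]   # rough match-quality cut
    return len(good) / max(len(kp_a), len(kp_b), 1)  # 0.0 (unrelated) .. 1.0
```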
Technologies & Libraries
Python 3.9 | Django 3.2 | MySQL | Nginx | Redis | Celery | Gunicorn
• REST APIs with Django REST Framework
• Secure deployments with HTTPS, RBAC, CSRF protection, and environment-based settings
Key Implementations & Functionalities
• Designed horizontally scalable server stack
• Integrated Celery + Redis for asynchronous ML job scheduling
• Built robust schema-migration and zero-downtime deployment pipeline
• Implemented rate-limiting, session/token authentication, and full audit logging
• Optimized database pooling and ACID transactions for concurrent ML workloads
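A minimal sketch of the Celery/Redis job pattern; the task name, backoff policy, and run_training helper are illustrative assumptions:

```python
# tasks.py (sketch)
from celery import shared_task

from ml.training import run_training  # hypothetical long-running ML routine

@shared_task(bind=True, max_retries=3, acks_late=True)
def train_model(self, model_id: int) -> None:
    try:
        run_training(model_id)
    except Exception as exc:
        # Exponential backoff so transient failures (DB, GPU contention) retry.
        raise self.retry(exc=exc, countdown=60 * 2 ** self.request.retries)
```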
Technologies & Libraries
Bootstrap 5 | jQuery | Chart.js | AJAX | Django Templates (Jinja2) | Django-allauth
• Responsive design with mobile-first approach
• Static asset optimization with compression and caching
Key Implementations & Functionalities
• Built real-time dashboards showing model metrics, accuracy curves, and system status
• Developed drag-and-drop data upload (CSV/Excel) with automatic schema validation
• Created dynamic prediction and billing interfaces with exportable reports
• Implemented dual-layer validation (client + server) for secure, seamless UX
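A minimal sketch of the server-side half of the upload validation; the required columns and response shape are illustrative assumptions:

```python
# views.py (sketch)
import pandas as pd
from django.http import JsonResponse
from django.views.decorators.http import require_POST

REQUIRED_COLUMNS = {"feature_1", "feature_2", "target"}  # illustrative schema

@require_POST
def upload_dataset(request):
    try:
        df = pd.read_csv(request.FILES["file"])
    except Exception:
        return JsonResponse({"ok": False, "error": "Unreadable CSV"}, status=400)
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        return JsonResponse(
            {"ok": False, "error": f"Missing columns: {sorted(missing)}"},
            status=400,
        )
    return JsonResponse({"ok": True, "rows": len(df), "columns": list(df.columns)})
```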
Technologies & Libraries
Celery | Redis Pub/Sub | Python Multiprocessing | Celery Beat Scheduler
• Fault-tolerant distributed job execution
• Task persistence and monitoring
Key Implementations & Functionalities
• Built WorkDispatcher / Processor / Monitor trio for distributed ML job orchestration
• Enabled non-blocking training sessions (72+ hrs) with live progress updates
• Implemented priority queues, retry logic, and GPU/CPU resource management
• Added real-time performance metrics and dead-letter reprocessing
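One way the queue routing and periodic monitoring could be configured in Celery; the queue names, task paths, and five-minute cadence are illustrative assumptions:

```python
# celery_app.py (sketch)
from celery import Celery
from celery.schedules import crontab

app = Celery("ml_platform", broker="redis://localhost:6379/0")

# Route heavy training to a GPU worker pool, light inference to CPU workers.
app.conf.task_routes = {
    "jobs.tasks.train_model": {"queue": "gpu"},
    "jobs.tasks.predict_batch": {"queue": "cpu"},
}
app.conf.task_default_priority = 5  # mid priority unless a job overrides it

# Periodic monitor task (the Monitor role in the Dispatcher/Processor/Monitor trio).
app.conf.beat_schedule = {
    "monitor-running-jobs": {
        "task": "jobs.tasks.monitor_jobs",
        "schedule": crontab(minute="*/5"),
    },
}
```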
Technologies & Libraries
Pandas | NumPy | NLTK | Sentence-Transformers | Scikit-Learn Encoders
• Automatic data-type detection, encoding, and scaling
• Persistent encoder caching for consistent inference
Key Implementations & Functionalities
• Built bidirectional EncDec for consistent encode/decode transformations
• Developed unified pipeline for numeric, categorical, and text data
• Implemented semantic text embeddings and multi-modal feature fusion
• Optimized preprocessing (100× speedup via NumPy vectorization)
• Intelligent imputation, outlier detection, and JSON flattening for mixed data
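A minimal sketch of a reversible EncDec-style wrapper around scikit-learn label encoders with a pickled cache, so inference reuses the exact fitted transforms; the column handling and persistence path are illustrative assumptions:

```python
# encdec.py (sketch)
import pickle
from pathlib import Path

import pandas as pd
from sklearn.preprocessing import LabelEncoder

class EncDec:
    """Encode categorical columns and decode them back with cached encoders."""

    def __init__(self, cache_path: str = "encoders.pkl"):
        self.cache = Path(cache_path)
        self.encoders = (
            pickle.loads(self.cache.read_bytes()) if self.cache.exists() else {}
        )

    def encode(self, df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
        out = df.copy()
        for col in columns:
            enc = self.encoders.setdefault(col, LabelEncoder().fit(df[col]))
            out[col] = enc.transform(df[col])
        self.cache.write_bytes(pickle.dumps(self.encoders))  # persist for inference
        return out

    def decode(self, df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
        out = df.copy()
        for col in columns:
            out[col] = self.encoders[col].inverse_transform(df[col])
        return out
```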
Technologies & Libraries
TensorFlow 2.17 | PyTorch 2.8 | Keras API | CUDA | Scikit-Learn Metrics
• Model serialization (.h5 + metadata)
• GPU-accelerated training and hyperparameter optimization
Key Implementations & Functionalities
• Designed NNEngine for full lifecycle: architecture → training → evaluation → inference
• Auto-architecture search testing 50+ network variants via evolutionary algorithms
• Implemented early-stopping, learning-rate scheduling, and checkpointing
• Supported regression, multi-class, and multi-output prediction tasks
• Provided feature importance and confidence interval outputs for interpretability
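A minimal sketch of the training lifecycle with early stopping, learning-rate scheduling, and .h5 checkpointing in Keras; the layer sizes and hyperparameters are illustrative assumptions, not the searched architectures:

```python
# nn_engine.py (sketch)
from tensorflow import keras

def build_and_train(x_train, y_train, n_outputs: int = 1) -> keras.Model:
    model = keras.Sequential([
        keras.layers.Input(shape=(x_train.shape[1],)),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(n_outputs),      # regression head; swap for softmax
    ])
    model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="mse")
    callbacks = [
        keras.callbacks.EarlyStopping(patience=10, restore_best_weights=True),
        keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=5),
        keras.callbacks.ModelCheckpoint("best_model.h5", save_best_only=True),
    ]
    model.fit(x_train, y_train, validation_split=0.2,
              epochs=200, batch_size=64, callbacks=callbacks, verbose=0)
    return model
```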
Technologies & Libraries
Django ORM | Pandas | OpenPyXL | Django REST Framework
• Modular pipeline: DataReader → FeatureEngineering → NNEngine → API
• Persistent model versioning and cost tracking
Key Implementations & Functionalities
• Developed Machine orchestrator managing model creation, training, and prediction
• Exposed /learn, /predict, /status REST endpoints for external integration
• Reduced ML workflows to < 5 lines of code via MachineEasyAutoML API
• Added rollback and version control for A/B testing and reproducibility
• Supported batch inference (10K+ rows in < 10 s) and transfer learning
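A sketch of how the /learn, /predict, and /status endpoints might be consumed from outside; the base URL, payload shapes, and response fields are illustrative assumptions:

```python
# client.py (sketch)
import requests

BASE = "https://ml.example.com/api"  # hypothetical deployment URL

job = requests.post(f"{BASE}/learn",
                    json={"dataset_id": 7, "target": "price"}).json()
status = requests.get(f"{BASE}/status",
                      params={"job_id": job["job_id"]}).json()
preds = requests.post(f"{BASE}/predict",
                      json={"model_id": job["model_id"],
                            "rows": [{"feature_1": 3.2, "feature_2": "A"}]}).json()
print(status["state"], preds["predictions"])
```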