Chapter 1: Flask Architecture & WSGI Internals

Flask is a minimalist web framework built on the WSGI (Web Server Gateway Interface) standard (PEP 3333). Its architecture follows a "micro-kernel" principle: the core provides routing and HTTP request/response primitives and delegates higher-level concerns, such as database management or form validation, to specialized extensions. Because Flask is built on the Werkzeug WSGI toolkit, a Flask application can be served by any WSGI server, such as Gunicorn, usually behind a reverse proxy such as Nginx.

I. The WSGI Specification: The Universal Bridge

WSGI is the standard interface between web servers and Python web applications. It ensures that any WSGI-compliant server can run any WSGI-compliant application. Flask implements the application side of this contract: the server calls the application with a raw environ dictionary and a start_response callback, and the application calls start_response with the status and headers, then returns the response body as an iterable of bytes.
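The contract described above can be sketched without Flask at all. The example below is a minimal illustration using only the standard library; simple_app and call_app are hypothetical names chosen for this sketch, and call_app merely imitates what a real server such as Gunicorn does when it invokes an application.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A bare WSGI application: a callable taking (environ, start_response)."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # The server supplies start_response; the application calls it once
    # with the status line and headers before returning the body.
    start_response("200 OK", headers)
    return [body]  # an iterable of bytes


def call_app(app, path="/"):
    """Play the server's role: build an environ and capture the response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(response_headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

A Flask instance is itself such a callable, so a WSGI server invokes it in the same way that call_app invokes simple_app here.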

1. Thread-Local Execution Contexts

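To manage request-specific data without passing objects through every function call, Flask uses context-local objects (via werkzeug.local). These were historically implemented as thread locals; current Werkzeug releases build them on Python's contextvars, which gives the same isolation per thread and also per coroutine or greenlet. The result is global-like access to variables with strict isolation between concurrent requests.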

request: Encapsulates the current HTTP request state (args, form, files).
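g: A developer-controlled namespace for data that should live only for the duration of a single request (strictly, a single application context).
session: A dict-like object persisted in a cryptographically signed cookie, providing state across multiple requests. The cookie is signed, not encrypted: the client can read its contents but cannot alter them undetected.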
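The three objects listed above might be used together as in the following sketch. The route, the lang query parameter, and the secret key value are assumptions made for illustration; a real deployment would load the secret key from configuration rather than hard-code it.

```python
from flask import Flask, g, request, session

app = Flask(__name__)
# Hypothetical placeholder; a secret key is required to sign the session cookie.
app.secret_key = "dev-only-placeholder"


@app.before_request
def load_locale():
    # g holds scratch data for this request only; it starts empty next time.
    g.locale = request.args.get("lang", "en")


@app.route("/visit")
def visit():
    # session survives between requests because it travels in a signed cookie.
    session["visits"] = session.get("visits", 0) + 1
    return {"locale": g.locale, "visits": session["visits"]}
```

With Flask's test client, which keeps cookies between calls, two consecutive requests show the difference: g.locale is recomputed on every request, while the visits counter in session carries over.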

II. Application Factories & The Modular Kernel

Production applications should avoid a module-level global app instance. Instead, the Application Factory pattern creates and configures the Flask object inside a function, conventionally named create_app. This enables:

Dynamic Configuration: Easily swapping between Development, Testing, and Production settings.
Testing Isolation: Creating fresh application instances for every test case to prevent state leakage.
Multiple Instances: Running different versions or configurations of the app within the same process.
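A minimal sketch of the pattern follows. The configuration classes, the CONFIGS mapping, and the /health route are hypothetical names introduced for this example; only Flask, app.config.from_object, and the test client are Flask's own API.

```python
from flask import Flask


class DevelopmentConfig:
    DEBUG = True
    TESTING = False


class UnitTestConfig:
    DEBUG = False
    TESTING = True


# Hypothetical registry mapping a name to a configuration class.
CONFIGS = {"development": DevelopmentConfig, "testing": UnitTestConfig}


def create_app(config_name="development"):
    """Build and return a fresh, independently configured Flask instance."""
    app = Flask(__name__)
    # from_object copies the uppercase attributes of the class into app.config.
    app.config.from_object(CONFIGS[config_name])

    @app.route("/health")
    def health():
        return {"status": "ok", "testing": app.config["TESTING"]}

    return app
```

Each call returns a separate object, so a test suite can call create_app("testing") per test case while another instance in the same process runs with different settings.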

III. Production Anti-Patterns

Exposing Gunicorn Directly: Running the WSGI server without a reverse proxy (Nginx). Synchronous workers in particular are vulnerable to slow-client ("Slowloris") attacks, and the setup lacks the request buffering a proxy provides for stable performance.
Global State Modification: Storing user-specific data in module-level variables. Because each worker process (and each thread within it) serves many requests over its lifetime, this can corrupt data and leak one user's information to another.
Sync I/O in Handlers: Performing heavy database exports or slow external API calls synchronously. A blocked synchronous worker can serve no other request, so a handful of slow requests can occupy every worker and stall the server.
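The Global State Modification problem can be reproduced with the standard library alone. The sketch below is a simplified model, not Flask code: two threads stand in for two concurrent requests, and handle_request and simulate are hypothetical names. A barrier forces both "requests" to overlap so the outcome is repeatable.

```python
import threading

current_user = None                 # anti-pattern: one slot shared by all threads
request_state = threading.local()   # one independent slot per thread


def handle_request(user, barrier, seen_global, seen_local):
    global current_user
    current_user = user          # overwritten by whichever thread writes last
    request_state.user = user    # visible only to the current thread
    barrier.wait()               # make both "requests" overlap before reading
    seen_global[user] = current_user
    seen_local[user] = request_state.user


def simulate(users=("alice", "bob")):
    barrier = threading.Barrier(len(users))
    seen_global, seen_local = {}, {}
    threads = [
        threading.Thread(
            target=handle_request,
            args=(user, barrier, seen_global, seen_local),
        )
        for user in users
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return seen_global, seen_local
```

After simulate() returns, both entries of seen_global hold the same name, so one request has read the other user's identity, while seen_local maps each user to itself. In a Flask application, g and request play the role of request_state.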

IV. Performance Bottlenecks

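Worker Context Switching: Configuring far more synchronous workers than the machine has CPU cores (e.g., --workers 64 on a 4-core host). The OS then spends a growing share of CPU time on context switching rather than on application code. Gunicorn's documentation suggests (2 x number of cores) + 1 as a starting point.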
Thread-Local Stack Pressure: Each pushed request or application context holds its own set of objects, so deeply nested context pushes increase the memory overhead of the worker thread.
Unbuffered Logging: Writing large volumes of log output synchronously to a slow disk within the request cycle, so the worker blocks on I/O instead of serving requests.
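One common way to keep slow log writes out of the request cycle is the standard library's QueueHandler and QueueListener pair. The sketch below assumes that approach; build_queued_logger and the logger name are hypothetical, and an in-memory stream stands in for the slow disk.

```python
import io
import logging
import logging.handlers
import queue


def build_queued_logger(name, slow_handler):
    """Return (logger, listener); records are queued and written off-thread."""
    log_queue = queue.Queue(-1)  # unbounded queue
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    # The request thread only enqueues the record, which is cheap.
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    # A background thread drains the queue into the slow handler.
    listener = logging.handlers.QueueListener(log_queue, slow_handler)
    listener.start()
    return logger, listener


stream = io.StringIO()  # stands in for a slow disk or network sink
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger, listener = build_queued_logger("app.requests", handler)
logger.info("handled %s in %d ms", "/export", 12)
listener.stop()  # processes the remaining records; call this at shutdown
```

The trade-off is that records still in the queue are lost if the process is killed before listener.stop() runs, so this suits high-volume request logs better than audit trails.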