Think about your Facebook page. It has pictures, video and audio, text, friend information, and more. That is a lot of content types, isn't it?
Notice that the URL stays the same (www.facebook.com). Do you think a single server is handling all of these content types?
It would be better if these content types were served by different servers: one server that specializes in delivering video and audio in real time, another that is good at handling pictures, and so on. Why? If we segregate traffic this way, performance improves, the solution becomes modular, and it can be scaled easily by adding more servers of the same specialty whenever needed. So far so good. But the question is: who decides which server should handle a given request? Layer 7 load balancing meets this requirement. It inspects the application-layer traffic, so if an HTTP request is for video content, it passes that request to a server that specializes in handling video.
Layer 7 switching takes its name from the OSI model, indicating that the device switches requests based on layer 7 (application) data.
As shown in the diagram above, a layer 7 switch presents to the outside world a "virtual server" that accepts requests on behalf of a number of servers and distributes those requests based on policies that use application data to determine which server should service which request. This allows the application infrastructure to be tuned to serve specific types of content. For example, one server can be tuned to serve only images, another to execute server-side scripting languages like PHP and ASP, and another to serve static content such as HTML, CSS, and JavaScript.
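To make the image, script, and static split concrete, here is a minimal sketch in Python of a policy that picks a server pool from the file extension in the requested URL. The pool names, the extension list, and the `choose_pool` function are assumptions made up for illustration; a real layer 7 switch would express the same idea in its own configuration language.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical pool names and extensions, chosen only for illustration.
POOL_BY_EXTENSION = {
    ".jpg": "image_pool",
    ".png": "image_pool",
    ".gif": "image_pool",
    ".php": "script_pool",
    ".asp": "script_pool",
    ".html": "static_pool",
    ".css": "static_pool",
    ".js": "static_pool",
}
DEFAULT_POOL = "static_pool"  # assumed fallback when no extension matches


def choose_pool(url: str) -> str:
    """Pick a server pool from the path extension of the requested URL."""
    path = urlsplit(url).path  # drops the query string and fragment
    extension = PurePosixPath(path).suffix.lower()
    return POOL_BY_EXTENSION.get(extension, DEFAULT_POOL)
```

With this table, a request for `/photos/cat.PNG?size=large` would go to the image pool and a request for `/index.php` to the script pool, even though both arrive at the same virtual server.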
Unlike traditional load balancing, layer 7 switching does not require that all servers in the pool have the same content. In fact, layer 7 switching expects that servers will have different content, thus the need to more deeply inspect requests before determining where they should be directed. Layer 7 switches are capable of directing requests based on URI, host, HTTP headers, and anything in the application message.
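The idea of directing requests based on URI, host, or HTTP headers can be sketched as an ordered list of rules where the first match wins. This is only an illustration under stated assumptions: the `Request` and `Rule` types, the host name `api.example.com`, and the pool names are all hypothetical, and the sketch assumes header names have already been normalized to one spelling.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Request:
    host: str
    path: str
    headers: Dict[str, str]  # assumes header names are already normalized


@dataclass
class Rule:
    description: str
    matches: Callable[[Request], bool]
    pool: str


# Hypothetical policies: one on the URI, one on the host, one on a header.
RULES: List[Rule] = [
    Rule("video by URI prefix",
         lambda r: r.path.startswith("/video/"), "video_pool"),
    Rule("API by host name",
         lambda r: r.host == "api.example.com", "api_pool"),
    Rule("JSON clients by Accept header",
         lambda r: "application/json" in r.headers.get("Accept", ""), "api_pool"),
]


def route(request: Request, default_pool: str = "web_pool") -> str:
    """Return the pool of the first matching rule, or the default pool."""
    for rule in RULES:
        if rule.matches(request):
            return rule.pool
    return default_pool
```

Because rules are checked in order, a request that matches several of them is routed by whichever rule is listed first, so rule order is part of the policy.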
Layer 7 load balancing improves performance by executing only those policies that are applicable to the content. Moreover, it allows for increased efficiency of the application infrastructure. For example, only two highly tuned image servers may be required to meet application performance and user concurrency needs, while three or four optimized servers may be necessary to meet the same requirements for PHP or ASP scripting services. Being able to separate content based on type, URI, or data allows for better allocation of physical resources in the application infrastructure (see the accompanying diagram).
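Once a request has been assigned to a pool, the switch still has to pick one server inside it. The sketch below assumes simple round-robin rotation within each pool, which is one common choice and not something the text prescribes. The pool sizes follow the example above (two image servers, four script servers); the server names and the `next_server` function are hypothetical.

```python
from itertools import cycle

# Pool sizes follow the example in the text; server names are made up.
POOLS = {
    "image_pool": ["img-1", "img-2"],
    "script_pool": ["app-1", "app-2", "app-3", "app-4"],
}

# One independent round-robin iterator per pool.
_rotations = {name: cycle(servers) for name, servers in POOLS.items()}


def next_server(pool_name: str) -> str:
    """Return the next server in the named pool, rotating in list order."""
    return next(_rotations[pool_name])
```

Since each pool rotates independently, adding a fifth script server only means extending the `script_pool` list; the image pool is unaffected.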
https://devcentral.f5.com/articles/layer-7-switching-load-balancing-layer-7-load-balancing