Reverse HTTP or HTTPS Proxy

Setting up a reverse proxy isn't hard. Unfortunately, clear information on the subject is scarce, so if you're new to it you may find it frustrating to work out how to set one up. Here I'll show you how to set up a basic reverse proxy in Apache. Once you understand the basics, you can modify or extend the procedure to suit your needs.

A reverse proxy takes an http request from a browser, but instead of answering the request directly, it forwards the request to a web server, receives the reply from the web server, and forwards the reply back to the browser. It does all of this without revealing the existence of the web server. The browser sees only the proxy server and is unaware that the reverse proxy is just a "middleman."

Two typical uses of a reverse proxy are:

  • Handle public http requests for a web server that is on a private network. The reverse proxy can be given a public IP address while the web server remains protected behind a firewall.
  • Provide a means to customize the look and feel of a website without modifying the web server. A single web server can take on different appearances by applying different stylesheets at the reverse proxy. This can be useful to a service provider that offers a web portal to its customers and wants each customer to experience a different look and feel.

It's important to understand the following:

  • The client's browser may communicate with the proxy via http and/or https.
  • The proxy may communicate with the web server via http and/or https.
  • The above two choices are independent of each other.

In other words, the protocol used on the front end (between the proxy and the browser) does not have to match the protocol used on the back end (between the proxy and the web server). It's perfectly possible for the reverse proxy to communicate with the web server only through https while communicating with the browser only through http, or vice versa.
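
To make the independence of the two sides concrete, here is a minimal sketch of both mixed combinations. The host names proxy.example.com and backend.example.internal and the certificate paths are placeholders I made up for illustration, so substitute your own. You would use one variant or the other, not both.

```apache
# Variant A: http on the front end, https on the back end
<VirtualHost *:80>
    ServerName proxy.example.com        # hypothetical proxy host name
    ProxyRequests Off
    # needed because the proxy itself acts as an https client here
    SSLProxyEngine On
    ProxyPass        / https://backend.example.internal/
    ProxyPassReverse / https://backend.example.internal/
</VirtualHost>

# Variant B: https on the front end, http on the back end
<VirtualHost *:443>
    ServerName proxy.example.com        # hypothetical proxy host name
    SSLEngine On
    # hypothetical certificate locations
    SSLCertificateFile    /etc/pki/tls/certs/proxy.example.com.crt
    SSLCertificateKeyFile /etc/pki/tls/private/proxy.example.com.key
    ProxyRequests Off
    # the back end is plain http, so SSLProxyEngine is not needed
    ProxyPass        / http://backend.example.internal/
    ProxyPassReverse / http://backend.example.internal/
</VirtualHost>
```

In both variants the front-end protocol is set by the VirtualHost port and SSLEngine, while the back-end protocol is set only by the scheme in the ProxyPass URL.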

The first thing to decide is which protocols the proxy will support on the front and back ends. If the web server on the back end uses https for security (login pages, password prompts), it is highly recommended that you force an https connection on the front end as well, so that confidential information is not passed in plain text between the browser and the proxy server. Otherwise, even though the traffic between the proxy and the web server would be encrypted (https), the traffic between the proxy and the browser would not be (http), and anyone positioned between them could sniff the http traffic and capture passwords.

I will show you how to build a proxy that assumes the backend is always https and that forces the front end to always use https as well. This example assumes you are using some sort of Unix or Linux server.

  1. You'll need to install Apache and mod_ssl. On CentOS you can run: yum install httpd mod_ssl
  2. Make sure the right modules are loaded. Open httpd.conf and make sure the following lines are uncommented:
    LoadModule ssl_module modules/
    LoadModule rewrite_module modules/
    LoadModule proxy_module modules/
    LoadModule proxy_balancer_module modules/
    LoadModule proxy_ftp_module modules/
    LoadModule proxy_http_module modules/
    LoadModule proxy_ajp_module modules/
    LoadModule proxy_connect_module modules/
  3. Create a VirtualHost for port 80
    1. In httpd.conf add the following Virtual Host
      NameVirtualHost *:80
      <VirtualHost *:80>
        ServerName <hostname of proxy server>:80
        DocumentRoot /var/www/html

      # force https between proxy server and browser
        RewriteEngine On
        RewriteCond %{HTTPS} off
        RewriteRule (.*) https://%{HTTP_HOST}%{REQUEST_URI} [R,L]
      </VirtualHost>
    2. The above entry redirects all http browser requests to https
  4. Create a VirtualHost for port 443
    1. Since we're forcing the browser to use https, the work of the proxy itself will be configured here. Edit /etc/httpd/conf.d/ssl.conf as follows:
      LoadModule ssl_module modules/
      Listen 443
      NameVirtualHost *:443
      <VirtualHost *:443>
      DocumentRoot "/var/www/html"
      ServerName <hostname of proxy server>:443

      # turn off FORWARD proxying
      ProxyRequests Off

      # do not proxy the following images. useful for removing unwanted logos and verbiage
      # (the ! exclusion applies to ProxyPass only, so no matching ProxyPassReverse lines are needed)
      ProxyPass /images/logo1.gif !
      ProxyPass /images/logo2.png !
      ProxyPass /images/Company_Login.png !

      # Perform some replacements of headers, footers, styles (requires mod_substitute to be loaded). Useful for replacing Company1 stuff with Company2 stuff.
      AddOutputFilterByType SUBSTITUTE text/html
      Substitute s|2012,\sCompany1,\sInc.\sAll\srights\sreserved|Company2|f
      Substitute s|width="40%"\sborder="0"\salign="center"\scellpadding="0"\scellspacing="0"\sstyle="background-image[\w\W\t\r\n\f\a\e\v\d\s\D\S\A\Z]*|style="display:none;">|f
      # Proxy everything else
      # Force communication with back end over https

      ProxyPass / https://<back end server hostname>/
      ProxyPassReverse / https://<back end server hostname>/

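      # Force SSLv3 to the back end only if you need to (normally not necessary unless the back end is running an old version of Tomcat). SSLv3 is obsolete and insecure, so leave this directive out whenever the back end supports TLS.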

      SSLProxyProtocol SSLv3

      # Make sure the proxy can communicate with the web server using https

      SSLProxyEngine on

      RewriteEngine On

      <... keep other default directives>
      </VirtualHost>

  5. Restart Apache (on CentOS: service httpd restart)
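
As an aside on step 3, the http-to-https redirect does not strictly need mod_rewrite. Here is a sketch of an alternative using the Redirect directive from mod_alias, which is a different technique from the rewrite rule used in the procedure. The host name proxy.example.com is a placeholder, and I'm assuming mod_alias is loaded, as it usually is by default.

```apache
<VirtualHost *:80>
    ServerName proxy.example.com        # hypothetical proxy host name

    # mod_alias: send every http request to the same path on the https site.
    # "permanent" issues a 301; drop the keyword for a temporary (302) redirect.
    Redirect permanent / https://proxy.example.com/
</VirtualHost>
```

With this in place, a request for http://proxy.example.com/login would be redirected to https://proxy.example.com/login. The trade-off is that the target host name is fixed in the configuration rather than taken from the request.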

Now if you point your browser to http://<hostname of proxy server>, it should pull up an https website and the hostname of the back end server should never appear in the URL.
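
The second use case from the introduction (a different look and feel per customer) can be built from the same pieces. The following is only a sketch under assumptions: the host names, the DocumentRoot, and the stylesheet paths /css/default.css and /css/customer-a.css are all invented for illustration, and the RequestHeader line requires mod_headers. It asks the back end for uncompressed pages, because the substitution filter cannot rewrite a compressed response.

```apache
<VirtualHost *:443>
    ServerName customer-a.example.com   # hypothetical per-customer host name
    DocumentRoot /var/www/customer-a    # hypothetical; holds css/customer-a.css
    SSLEngine On
    # certificate directives as in the default ssl.conf

    ProxyRequests Off
    SSLProxyEngine On

    # ask the back end for uncompressed pages so the filter can read them
    RequestHeader unset Accept-Encoding

    # serve this customer's stylesheet from the proxy itself
    ProxyPass /css/customer-a.css !

    # point pages at that stylesheet instead of the back end's default one
    # (n = treat the pattern as a fixed string, not a regular expression)
    AddOutputFilterByType SUBSTITUTE text/html
    Substitute "s|/css/default.css|/css/customer-a.css|n"

    # proxy everything else
    ProxyPass        / https://backend.example.internal/
    ProxyPassReverse / https://backend.example.internal/
</VirtualHost>
```

Each additional customer would get its own VirtualHost like this one, with a different ServerName and stylesheet, all pointing at the same back end.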