Auto-Detecting Approval
Background
When OAuth is used with installed applications acting as consumers, there is a UI challenge in transitioning to/from a web browser.
Specifically, an installed application launches the approval page in a web browser, where the user must authenticate and authorize the OAuth request. After the request token is authorized, the application may exchange it for an access token. This creates a disconnect in the user experience: users can become confused when asked to switch back to their application after completing authorization in a stand-alone web browser. While it is relatively easy for an installed application to start the approval process, it is more difficult to detect its completion.
One option to improve this experience is to use an embedded web browser controlled by the installed application. This would give the application greater control over the browsing experience and keep the user on the task at hand. However, there are three downsides:
Not all web browsers support easy embedding. For example, on Windows it is very easy to embed Internet Explorer, but much harder to embed Chrome or Firefox.
The user is forced to authenticate again (unless the embedded browser shares cookies with the user's main browser). This is not only annoying; the password autofill feature may also not work, so many users are forced to look up their password.
An even bigger challenge arises for embedded browsers if users authenticate with something more complicated than a password. For example, some users may have to enter a one-time code that is sent to their mobile phone if there is no cookie in the device's browser to indicate that they have already performed this second authentication step. In some enterprise cases, the user may have been authenticated when they logged into the operating system of their computer, and it may be impossible to repeat that process from an embedded browser.
The only way around these problems is to launch a window in the user's default browser, where they are usually already logged in (or at least have a way to log in again if need be).
This document assumes that the default browser will be used to show the OAuth approval flow and considers some options for detecting the completion of the approval flow.
Options
1. Run a local web server.
The installed application operates a local web server, listening on a particular port. The OAuth service provider redirects to that server after approval is complete, via the callback URL.
Advantages: Reliable; allows the local client to control and fine-tune the UI the user sees after approval (e.g., the server does not have to guess the client configuration or scenario).
Disadvantages: A firewall may block or prompt for listening sockets; the local web server never receives a signal if the user abandons the flow; even a minimal server is a lot of code; this creates a new attack surface (e.g., even if the server only accepts loopback connections, other sites can try to attack it by directing the user's browser to send requests to it).
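To make option 1 concrete, here is a minimal sketch in Python, assuming the standard library's http.server is acceptable for the client. The path /callback, the function names, and the response text are illustrative assumptions, not part of any OAuth library. The server binds only to the loopback address, lets the operating system choose a free port, and gives up after a timeout, since it receives no signal if the user abandons the flow.

```python
import http.server
import time
import urllib.parse

CALLBACK_PATH = "/callback"  # hypothetical path used in the callback URL


class CallbackHandler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        parsed = urllib.parse.urlparse(self.path)
        if parsed.path != CALLBACK_PATH:
            # Ignore anything that is not the expected redirect.
            self.send_error(404)
            return
        # Record the query parameters so the application can inspect them.
        self.server.callback_params = urllib.parse.parse_qs(parsed.query)
        body = b"Approval complete. You can return to the application."
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; a real client might log here


def start_callback_server():
    # Loopback only; port 0 asks the OS for any free port.
    server = http.server.HTTPServer(("127.0.0.1", 0), CallbackHandler)
    server.callback_params = None
    return server


def wait_for_callback(server, timeout_seconds):
    """Serve requests until the callback arrives or the timeout expires.

    Returns the parsed query parameters, or None on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    try:
        while server.callback_params is None:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            server.timeout = remaining
            server.handle_request()  # handles one request or times out
    finally:
        server.server_close()
    return server.callback_params
```

A client would call start_callback_server(), build the callback URL from server.server_address[1] and CALLBACK_PATH, launch the approval page in the default browser, and then call wait_for_callback(server, timeout_seconds). This sketch does nothing to authenticate incoming requests, so the attack-surface concern listed above still applies.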
2. Monitor cookies.
The callback URL points to a website operated by the software vendor. When the approval completes and the service provider redirects to that page, the page writes a cookie with a particular name/value. The client application polls the persistent cookie jar of the web browser to check whether that cookie has been added.
Advantages: Simple for websites to write cookies; simple for installed applications to check the cookie jar.
Disadvantages: Requires a public interface to access the cookie jar, which does not exist for all browsers; subject to race conditions between multiple processes in Chrome and Firefox because the cookie store is not synchronized; requires persistent cookies to make cross-process signaling work.
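The polling side of option 2 might look like the following sketch. It assumes a Firefox-style cookie store: a SQLite file containing a moz_cookies table with host, name, and value columns. That layout is a browser implementation detail that can change, other browsers store cookies differently, and the function names and cookie values here are hypothetical.

```python
import sqlite3
import time
from pathlib import Path


def read_cookie_from_sqlite(db_path, host, name):
    """Return the cookie value, or None if it is absent or unreadable.

    Assumes a SQLite store with a moz_cookies(host, name, value, ...) table.
    """
    # Open read-only so the poller never writes to the browser's store.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
    except sqlite3.OperationalError:
        return None  # store missing or not openable
    try:
        row = conn.execute(
            "SELECT value FROM moz_cookies WHERE host = ? AND name = ?",
            (host, name),
        ).fetchone()
    except sqlite3.OperationalError:
        # Store locked or mid-write; treat the cookie as not visible yet.
        return None
    finally:
        conn.close()
    return row[0] if row else None


def wait_for_cookie(read_cookie, expected_value, timeout_seconds,
                    interval_seconds=0.5):
    """Poll read_cookie() until it returns expected_value or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        if read_cookie() == expected_value:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)
```

Passing the reader in as a callable keeps the browser-specific part separate from the polling loop, so a different browser only needs a different reader. Treating a locked store as "not yet visible" is one way to tolerate the race conditions noted above, at the cost of a delayed detection.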
3. Use a browser extension running inside the browser process.
There are many possible variations on this approach. For example, a browser extension could wait to be invoked by script running on the callback URL (passive) or could monitor for a particular URL being visited (active). Such a browser extension can be implemented using the extensibility mechanisms of browsers (ActiveX control or BHO for IE, NPAPI plugin for Firefox, etc.) or DLL injection straight into the browser process.
Advantages: Very reliable. This is the only solution that can detect when the user has wandered off, closed the approval page, or declined approval. (The web-based solutions only learn that the flow completed, not its result, until they attempt to fetch the access token.)
Disadvantages: Needs a locally installed binary (which could be distributed with each app); requires different solutions for each browser and OS; significant amount of code.
4. Monitor the web browser title bar.
As with option #2, the callback URL points to a website operated by the installed application vendor. Upon completion of the approval flow, the HTML page returned on the callback URL has a title containing a special string. The installed application periodically checks the title bar for all applications on the current user desktop to see if any window has a title bar with the expected string.
Advantages: Works for any web browser that displays the HTML page title in the window title bar or per-tab title.
Disadvantages: The user might notice strange characters or a serial number in the title bar; depends on the implementation details of web browsers (namely that the HTML <title> element becomes the window title text in the GUI); the user may close the tab, or the browser can crash, before navigation completes.
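The matching logic for option 4 can be sketched as follows. Enumerating window titles is platform-specific, so this sketch takes a caller-supplied list_window_titles function (hypothetical) rather than calling any operating system API. The marker prefix and function names are likewise assumptions for illustration. Because browsers typically append their own name to the page title, the check is a substring match rather than an exact comparison.

```python
import secrets
import time

MARKER_PREFIX = "oauth-approved-"  # hypothetical prefix for the special string


def new_marker():
    # A random suffix makes it unlikely that an unrelated window matches.
    return MARKER_PREFIX + secrets.token_hex(8)


def wait_for_title(list_window_titles, marker, timeout_seconds,
                   interval_seconds=0.5):
    """Poll window titles until one contains the marker or time runs out.

    list_window_titles is a caller-supplied, platform-specific function
    that returns the current window titles as strings.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if any(marker in title for title in list_window_titles()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)
```

The application would generate the marker before starting the flow and arrange for the vendor's callback page to place it in its title. How the marker reaches that page is outside the scope of this sketch.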
Cem Paya
Information Security Engineer, Google