Project Overview

The main objective is to develop the technology and sample implementations for web applications that include real-time voice communications. The project aims for web communication widgets to become as common on web pages as other components such as layout, buttons, text fields, images and multimedia players. Our relevant paper on SIP APIs for voice and video communications on the web (slides) was presented at IPTcomm 2011.

Most Internet applications use HTTP as their only application protocol. At the same time, global voice communications, both fixed and mobile, use VoIP based on SIP standards for interoperability, but this has not produced any significant web applications beyond emulations of legacy telephony services. In reality, only two protocols are required for web communications: HTTP for signaling and control, and UDP for real-time media transport. All other application-specific functionality can reside in the application itself, in the user client and/or in a web server.

Part 1 (week 1): Initial Reading
  1. Read Internet Draft on SIP APIs for Communications on the Web. Focus on specific steps to enable web communications.
  2. Read Report from RTC-Web Workshop focusing on Workshop Conclusions. Also skim through two papers: The Future of Web Applications and Architectural Framework for Browser-Based Real-Time Communications.
  3. Read potential platform issues and options for web communications.
  4. Read blog post on REST and SIP to get an idea of how to define a web API for communication and conferencing.

From a standardization perspective, there are two parts to web communications: (1) specific HTML/JavaScript extensions to enable new elements or components for devices, codecs, and communication, and (2) a protocol to enable end-to-end communication among browser instances. While standardization of these tasks in the W3C and IETF may take a few years, in this short semester project we will focus on pre-standard implementations of these parts to enable web communications.

Among the available architectural alternatives, using a separate application to facilitate the end-to-end media path appears to work well in light of existing tools and constraints. This architecture has three blocks: client, server, and separate application. The (web) server is extended to facilitate signaling for real-time communication using HTTP and asynchronous primitives. The client, or web browser, runs HTML, JavaScript, or plugins such as Flash Player to display the front end of the communication widget. The client also communicates with the separate application using a well-defined and authenticated API. The separate application runs on the user's host computer and enables the end-to-end media path as instructed by the web application running in the client.

Part 2 (week 2-3): Software Design
  1. Draw block diagram of software system and identify various interfaces.
  2. For the use case of a two-party voice communication, create a message flow among the different components in your system using concrete examples.
  3. For each interface, list all the communication messages for both successful and failure use cases, e.g., signaling messages between client and server, control commands from client to separate application, etc.
  4. From an application developer's point of view, list all the new programming elements and components that a web developer will use to build web communication applications.

As mentioned before, the web server facilitates signaling for real-time communication using HTTP and asynchronous primitives. Several options exist for asynchronous communication on the web, e.g., Comet, BOSH, and more recently the server-sent events and WebSocket extensions of HTML5. We will use existing technologies as much as possible. A major role of the web server in this project is signaling for communication. In the future, we will enable interworking between web browsers and traditional VoIP clients using such a web server as a gateway. Hence the client-server communication API should include sufficient information to allow translation to SIP call signaling and to enable the end-to-end media path.
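
To make the difference between polling and asynchronous delivery concrete, here is a minimal sketch of the server-side idea behind Comet-style long polling: a request for events blocks until an event arrives or a timeout expires. The class name EventMailbox, its methods, and the event fields are hypothetical and only illustrate the pattern. A real deployment would sit behind the web server and would more likely use one of the mechanisms listed above.

```python
import queue
import threading


class EventMailbox:
    """Per-user event queues for long-poll style delivery (illustrative only)."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def _queue_for(self, user):
        # Create the user's queue on first use; the lock guards the dict.
        with self._lock:
            return self._queues.setdefault(user, queue.Queue())

    def post(self, user, event):
        """Called by the signaling logic, e.g. when someone calls `user`."""
        self._queue_for(user).put(event)

    def wait(self, user, timeout=30.0):
        """Block until an event for `user` arrives; return None on timeout."""
        try:
            return self._queue_for(user).get(timeout=timeout)
        except queue.Empty:
            return None


mailbox = EventMailbox()

# Simulate a caller whose call-setup request reaches the server 0.1 s later.
incoming = {"type": "incoming_call", "from": "alice"}
threading.Timer(0.1, mailbox.post, args=("bob", incoming)).start()

# Bob's pending request returns as soon as the event is posted.
event = mailbox.wait("bob", timeout=2.0)
print(event)
```

The client issues one request and receives the incoming call indication as soon as it exists, instead of asking repeatedly whether anything has happened.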

The Apache web server is very popular and developer-friendly. We will use it with server-side programming as much as possible. Its modular design allows extending the server if needed. We expect that most of the server-side functions can be implemented in application code without having to write a server extension.

Part 3 (week 4-5): Signaling Service
  1. Using Apache web server side programming such as JSP, PHP, Servlet or Python, implement the server functions to facilitate signaling for real-time communication using HTTP.
  2. At minimum, it should include login/logout and call setup/termination API messages.
  3. Additionally, call setup should include (media and transport) capability negotiation similar to SIP/SDP, albeit using a modern XML- or JSON-based data format.
  4. The API should follow REST principles. Incoming call indication should use asynchronous signaling instead of polling.
  5. Implement a test client in a scripting language to test your web server functions.
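
As one way to picture the capability negotiation in item 3, the following sketch takes a JSON offer and builds an answer that keeps only the codecs both sides support, preserving the offerer's preference order as SDP offer/answer does. The message layout and every field name ("media", "codecs", "transport", and so on) are assumptions made for illustration. The project's design document is expected to define the real format.

```python
import json

# Hypothetical offer body a caller might send when setting up a call.
OFFER_JSON = """
{
  "from": "alice",
  "to": "bob",
  "media": [
    {"type": "audio",
     "transport": "udp",
     "codecs": [{"name": "speex", "rate": 16000},
                {"name": "speex", "rate": 8000},
                {"name": "pcmu", "rate": 8000}]}
  ]
}
"""


def negotiate(offer, supported):
    """Build an answer containing, per media stream, the common codecs.

    `supported` maps a media type to the set of (name, rate) pairs the
    callee can handle. The offerer's ordering is kept. A stream with no
    common codec ends up with an empty codec list, i.e. it is rejected.
    """
    answer = {"from": offer["to"], "to": offer["from"], "media": []}
    for stream in offer["media"]:
        ours = supported.get(stream["type"], set())
        common = [c for c in stream["codecs"] if (c["name"], c["rate"]) in ours]
        answer["media"].append({
            "type": stream["type"],
            "transport": stream["transport"],
            "codecs": common,
        })
    return answer


offer = json.loads(OFFER_JSON)
callee_caps = {"audio": {("speex", 8000), ("pcmu", 8000)}}
answer = negotiate(offer, callee_caps)
print(json.dumps(answer, indent=2))
```

Keeping the negotiation a pure function of the offer and the local capabilities makes it easy to exercise from the test client in item 5.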

The end-to-end media path is established using the separate application running on the user's host computer. The separate application receives commands from the client application running in the web browser and uses an ICE-like process to establish the end-to-end media path. Several existing SIP applications and libraries include ICE functions that can be reused in this project, e.g., pjsip includes pjnath. In the future, this will be replaced with an implementation of the Host Identity Protocol (HIP). This project focuses only on UDP for media transport; TCP transport and HTTP tunneling are left for future work.
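
To give a feel for what an ICE-like process does before any connectivity checks are sent, the sketch below ranks candidate pairs using the priority formulas from the ICE specification (RFC 5245) with its recommended type preferences. The function names, the tuple layout, and the sample addresses are illustrative assumptions. A library such as pjnath performs this step internally, so this is for understanding rather than reuse.

```python
from itertools import product

# Recommended type preferences from RFC 5245, section 4.1.2.2.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}


def candidate_priority(cand_type, local_pref=65535, component_id=1):
    """priority = 2^24 * type_pref + 2^8 * local_pref + (256 - component_id)."""
    return (TYPE_PREFERENCE[cand_type] << 24) + (local_pref << 8) + (256 - component_id)


def pair_priority(g, d):
    """g: controlling agent's candidate priority, d: controlled agent's.

    pair priority = 2^32 * min(g, d) + 2 * max(g, d) + (1 if g > d else 0)
    """
    return (min(g, d) << 32) + 2 * max(g, d) + (1 if g > d else 0)


def ordered_pairs(local, remote):
    """Return (priority, local_addr, remote_addr) tuples, best pair first.

    `local` and `remote` are lists of (candidate_type, "ip:port") tuples,
    and the local side is assumed to be the controlling agent.
    """
    pairs = []
    for (ltype, laddr), (rtype, raddr) in product(local, remote):
        prio = pair_priority(candidate_priority(ltype), candidate_priority(rtype))
        pairs.append((prio, laddr, raddr))
    return sorted(pairs, reverse=True)


# Sample candidates using documentation-only address ranges.
local = [("host", "192.0.2.10:5000"), ("srflx", "203.0.113.5:62000")]
remote = [("host", "198.51.100.7:6000"), ("srflx", "203.0.113.9:63000")]
pairs = ordered_pairs(local, remote)
for prio, laddr, raddr in pairs:
    print(prio, laddr, "->", raddr)
```

Connectivity checks would then be attempted in this order, so direct host-to-host paths are tried before paths through NAT mappings.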

While the separate application can be implemented in any programming language, in a post-standard future it will need to be merged with the browser implementation. Hence C/C++ is the preferred programming language for implementing the separate application.

A second constraint on the separate application is that its API should be accessible from web applications and hence should use HTTP as much as possible. It may use RTMP for media transport between the Flash Player and the separate application if a Flash application is used to facilitate media capture and display, but this is orthogonal to the core API.

The separate application should authenticate all client connections directly by asking the end user. It should also ask permission from the end user before initiating or accepting a media connection with another browser instance.
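
One possible shape for these two checks is sketched below: the user approves a connecting web page once and the page receives a session token, then each media connection needs a valid token plus a second, per-call confirmation. The class PermissionGate, its methods, the prompt wording, and the use of the page origin as the client identity are all assumptions for illustration. The actual application is expected to be written in C/C++ with a real user dialog.

```python
import hmac
import secrets


class PermissionGate:
    """Tracks which web origins the end user has approved (illustrative)."""

    def __init__(self, ask_user):
        # ask_user is any callable taking a question string and returning
        # True or False. A real application would show a dialog here.
        self._ask_user = ask_user
        self._tokens = {}  # origin -> session token

    def connect(self, origin):
        """Authenticate a new client connection by asking the end user."""
        if not self._ask_user(f"Allow {origin} to control calls on this computer?"):
            return None
        token = secrets.token_hex(16)
        self._tokens[origin] = token
        return token

    def is_authorized(self, origin, token):
        expected = self._tokens.get(origin)
        # compare_digest avoids leaking the token through timing differences.
        return expected is not None and hmac.compare_digest(expected, token)

    def allow_media(self, origin, token, peer):
        """Require a valid token and a per-call confirmation from the user."""
        if not self.is_authorized(origin, token):
            return False
        return self._ask_user(f"{origin} wants to start a call with {peer}. Allow?")


# Stand-in for a real dialog: this user approves everything.
gate = PermissionGate(ask_user=lambda question: True)
token = gate.connect("https://example.com")
print(gate.allow_media("https://example.com", token, "bob"))
print(gate.allow_media("https://example.com", "forged-token", "bob"))
```

Passing the prompt in as a callable keeps the policy testable with scripted answers, independent of any user interface.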

Part 4 (week 6-8): End-to-End Media Path
  1. Implement a separate portable application in C/C++. It should support the API commands from part 2 to establish an end-to-end media path using existing NAT traversal libraries such as pjnath.
  2. It should include audio capture and playback using portable audio libraries such as rtaudio and an audio codec such as speex.
  3. Optionally, it should allow a Flash application to send captured and encoded media.
  4. To test this component, use a test script that generates commands, and external STUN and TURN servers to enable connectivity across NATs and firewalls.
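
When testing against an external STUN server, it can help to know what the exchange looks like on the wire. The sketch below builds a STUN Binding Request and decodes the XOR-MAPPED-ADDRESS attribute of a success response, following the message layout in RFC 5389 (IPv4 only). The function names are illustrative, and fake_response is a hypothetical helper that stands in for a server so the example runs without network access.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """Return (message, transaction_id) for a 20-byte header-only request."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid, tid


def parse_binding_response(data, tid):
    """Return (ip, port) from XOR-MAPPED-ADDRESS, or None if not found."""
    if len(data) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", data[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE or data[8:20] != tid:
        return None
    pos, end = 20, min(len(data), 20 + length)
    while pos + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", data[pos:pos + 4])
        value = data[pos + 4:pos + 4 + attr_len]
        # Family 0x01 is IPv4; port and address are XORed with the cookie.
        if attr_type == XOR_MAPPED_ADDRESS and len(value) >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return addr, port
        pos += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None


def fake_response(tid, addr, port):
    """Hypothetical helper: build the response a server would send back."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(addr))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, xport, xaddr)
    return struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE) + tid + attr


request, tid = build_binding_request()
reply = fake_response(tid, "192.0.2.1", 54321)
print(len(request), parse_binding_response(reply, tid))
```

In a real test, the request would be sent over a UDP socket to the STUN server and the reply passed to parse_binding_response. The returned address is the server-reflexive candidate seen from outside the NAT.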

The client application running in the web browser uses widgets and controls the web communication process. It interacts with both the server for signaling and the separate application to establish the media path. This project will create widgets for common web communication tasks such as a two-party voice call. Each widget is a combination of HTML, JavaScript and/or plugin applications that enables a specific function. A web developer can use such widgets in her web pages to quickly add a web communication feature to the web site.

When a widget is loaded in the browser, it should first detect whether an appropriate version of the separate application is running on the host computer. If not, it should prompt the user to install the required version of the separate application.

The two-party communication widget should have a developer API to login and logout, and to initiate, receive, accept or reject calls. It should also have an API to control device parameters such as microphone mute state, speaker volume, etc.
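
The operations listed above imply a small call state machine behind the widget. The sketch below models one possible version as a transition table, along with the device parameters. The state names, event names, and the class CallWidgetModel are assumptions for illustration. The widget itself is to be delivered as a JavaScript library, so this only shows the logic in a language-neutral way.

```python
# (current state, event) -> next state. All names are hypothetical.
TRANSITIONS = {
    ("logged_out", "login"): "idle",
    ("idle", "logout"): "logged_out",
    ("idle", "call"): "calling",        # local user initiates a call
    ("idle", "incoming"): "ringing",    # remote user initiates a call
    ("calling", "accepted"): "active",  # remote user accepted
    ("calling", "rejected"): "idle",    # remote user rejected
    ("calling", "end"): "idle",         # local user cancelled
    ("ringing", "accept"): "active",
    ("ringing", "reject"): "idle",
    ("active", "end"): "idle",
}


class CallWidgetModel:
    """Call state and device parameters for a two-party widget (sketch)."""

    def __init__(self):
        self.state = "logged_out"
        self.muted = False
        self.volume = 1.0

    def fire(self, event):
        """Apply an event, or raise ValueError if it is invalid right now."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state

    def set_mute(self, muted):
        self.muted = bool(muted)

    def set_volume(self, level):
        # Clamp to the range 0.0 (silent) to 1.0 (full volume).
        self.volume = max(0.0, min(1.0, float(level)))


widget = CallWidgetModel()
for event in ["login", "incoming", "accept", "end", "logout"]:
    print(event, "->", widget.fire(event))
```

Rejecting invalid events explicitly, such as accepting a call when none is ringing, gives web developers a clear error instead of a silently inconsistent widget.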

Part 5 (week 9-11): Client Widget and Web Application
  1. Implement the two-party communication widget and an example web page to facilitate a two-party voice call. The widget should be available as a JavaScript library, and the web page should use only HTML and JavaScript to access the widget.
  2. Write a short developer tutorial on how to build web communication applications using your widget.
  3. List the specific tasks needed to extend your widget to support advanced communication such as multiparty conferencing, video, etc.

One factor that defines the success of any web application is how easy it is to use. Our project includes a separate application that must be installed by the end user. To provide a seamless and immersive web experience, the installation of this separate application should be as easy as possible. Secondly, for a third-party web site to use this project, some server side component needs to be installed as well.

Part 6 (week 12-13): Application Installer
  1. Create an installer for your separate application that can be easily run by the end user on various platforms. Optionally, allow automatic version upgrade. Your client widget should prompt the user to install if needed.
  2. Create an installer for your server-side application. You can assume external components such as the Apache web server and MySQL database are already installed.

Documentation and demonstration of this project are crucial grading components. Unlike homework assignments, this project requires high-quality software engineering practice because other students will work on or extend it in the future. For a successful project, you should provide the following deliverables.

Project Deliverables
  1. Design document as described in part 2.
  2. Source code with extensive code comments for all the software components and test scripts (parts 3-5).
  3. Application installers as described in part 6 and available to anyone on the Internet.
  4. Meeting minutes of all the project-related meetings highlighting important decisions, action items and accomplishments.
  5. A short tutorial describing how to use your software from a web application developer's point of view.
  6. A 2-page project report highlighting your goals, accomplishments, project experience, and future work.