The investigation of systematic patterns in the control of gaze and speech has been an intensively studied topic in dyadic human-human communication. Previous research covers various aspects of communication, such as the production of referring expressions, joint attention through the control of gaze direction, and gaze contact and gaze aversion. This talk will introduce the lab work that has been conducted on the analysis of gaze and speech interaction in physical and virtual reality environments. In particular, we focus on human-human and human-avatar interaction in dyadic communication settings. From a technical point of view, one of the major challenges for the analysis of gaze behavior in physical social settings is to dynamically detect and track objects in the visual environment. In dyadic communication settings, this challenge reduces to detecting and tracking faces in a video stream in synchronization with speech data. Overlaying gaze data on the video stream then allows gaze contact and gaze aversion to be identified from the detected faces. Speech-act annotation is incorporated to analyze the synchronization of gaze contact with speech utterances. In virtual reality environments, the challenge instead lies in how the gaze direction of communication partners, in this case avatars, is perceived by human observers. This talk will introduce our technical solutions for investigating these aspects of interaction and outline future research directions.
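The classification step described above, identifying gaze contact versus aversion from gaze data overlaid on detected faces, can be sketched as follows. This is a minimal illustration, not the lab's actual implementation: it assumes an upstream face detector (e.g. OpenCV) has already produced per-frame bounding boxes, and the names `FaceBox` and `classify_gaze` are hypothetical.

```python
# Sketch: label each gaze sample as "contact" if it falls inside any
# detected face bounding box, otherwise "aversion". Face detection itself
# is assumed to happen upstream (e.g. with an OpenCV cascade or DNN model).

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class FaceBox:
    x: int   # top-left corner, in pixels
    y: int
    w: int   # width and height of the box
    h: int

    def contains(self, gx: float, gy: float) -> bool:
        return self.x <= gx <= self.x + self.w and self.y <= gy <= self.y + self.h

def classify_gaze(gaze: Tuple[float, float], faces: List[FaceBox]) -> str:
    """Classify one gaze sample against the faces detected in the same frame."""
    gx, gy = gaze
    return "contact" if any(f.contains(gx, gy) for f in faces) else "aversion"

# One face detected in the frame; two gaze samples.
faces = [FaceBox(100, 80, 60, 60)]
print(classify_gaze((120, 100), faces))  # prints "contact": gaze lands on the face
print(classify_gaze((300, 200), faces))  # prints "aversion": gaze lands elsewhere
```

In practice, such per-frame labels would be smoothed over time and aligned with the speech-act annotation to study how gaze contact synchronizes with utterances.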