Communication

Somax has eyes and ears but no voice. It can observe but it cannot speak. That doesn't mean it does not communicate; it does, just in a more personal way, to ensure that personal information always remains personal.

Somax can:

  • Voice chat to a mobile device or PC with a microphone and earpiece, or directly to a wireless headset.
  • Text chat to a mobile device, to a PC with a screen and keyboard, or via the on-board display.
  • Playback of image, audio, and video sources on the on-board display, or on a mobile device or PC with supporting playback hardware.
  • Neck-up body language via camera, gimbal, and camera-mounted display.
  • Alerts via vibration and lighting on supported mobile devices, or by a long-range visual cue shown on the camera-mounted display.

Affect

Somax is designed to connect with its owner on as many levels as possible. It does this to provide a more useful and understandable AI.

  • AffectAI.
  • AffectEmulatorAI.

Augmented Reality

Somax AR is used to capture data and provide operational feedback.

  • Capture. A small device, similar in size to a Bluetooth earpiece and worn in a similar fashion, will transmit live attitude, heading, and reference information to Somax. Somax will use this augmentation to study, among other things, where and for how long you place attention or focus within a specific environment (see the dwell-time sketch after this list). It will also measure neck-up body language as input to the AffectAI and emulator.
  • Feedback. Somax will use generators attached to the user's personal devices to provide variable-frequency vibratory feedback. Somax will also use its camera motion controller and display controllers to augment the user's visual perception within their field of vision.
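
To make the attention idea concrete, here is a minimal sketch that turns a stream of heading readings into dwell times per viewing direction. The sample format, bin size, and field layout are illustrative assumptions, not the Somax wire format.

    import math
    from collections import defaultdict

    # Hypothetical AHRS sample: (timestamp_s, yaw_deg, pitch_deg).
    # Real Somax packets would carry full attitude/heading/reference data.
    samples = [
        (0.00, 10.2, -3.1),
        (0.05, 10.5, -3.0),
        (0.10, 85.0, 1.2),
        (0.15, 85.3, 1.0),
        (0.20, 85.1, 0.9),
    ]

    BIN_DEG = 15  # coarse angular bins; tune for the environment

    def gaze_bin(yaw, pitch):
        """Quantize a gaze direction into a coarse angular cell."""
        return (math.floor(yaw / BIN_DEG), math.floor(pitch / BIN_DEG))

    dwell = defaultdict(float)
    for (t0, y0, p0), (t1, _, _) in zip(samples, samples[1:]):
        dwell[gaze_bin(y0, p0)] += t1 - t0  # credit the interval to the cell being looked at

    for cell, seconds in sorted(dwell.items(), key=lambda kv: -kv[1]):
        print(f"gaze cell {cell}: {seconds:.2f} s")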

Virtual Reality

Somax VR is used primarily to recreate an environment that was previously recorded.

  • Capture. Somax has a color 3D depth camera, a 3D-sound microphone array, and dual attitude, heading, and reference systems to record the world in audio-visual 3D. Somax will use these sensors, along with others, to capture, display, and learn about humans and 3D worlds. Capture duration is limited by Somax's on-board storage capacity; at full detail, with all sensors capturing, it is further limited by the high-bandwidth data stream (see the back-of-the-envelope sketch after this list).
  • Re-Presentation. Somax will have the ability to reconstruct the context of a captured stream at a later time using an off-the-shelf VR headset. This might be initiated as part of refinement or in response to a Somax Application request. A Re-Presentation need not be the product of the user's AI and could also be sourced from the Somax Community.
  • Refinement. Somax will grow its usefulness and imprint by continually refining its AIs. Sometimes it is most effective to evaluate sensors in response to a Re-Presentation.
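
To put a number on that bandwidth limit, here is a back-of-the-envelope sketch. The resolution, frame rate, and 40 bits per pixel (color plus depth and transparency, as discussed under Environment Compression below) are assumed figures for illustration, not Somax specifications.

    # Rough capture-bandwidth estimate for an uncompressed RGB-D stream.
    # All figures are illustrative assumptions, not Somax specifications.
    width, height = 1280, 720      # assumed depth-camera resolution
    bits_per_pixel = 40            # color + depth + transparency (see compression section)
    fps = 30                       # assumed frame rate

    bytes_per_second = width * height * bits_per_pixel / 8 * fps
    print(f"video alone: {bytes_per_second / 1e6:.0f} MB/s")      # ~138 MB/s
    print(f"one minute of capture: {bytes_per_second * 60 / 1e9:.1f} GB")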

Innovations

Visual Remote Connection Configuration

The Somax platform will have the ability to connect to and interact with any device capable of standard wired or wireless communication. Somax will have electronic and audio-visual means of configuring and connecting to such devices. On the Windows 10, Windows 10 Mobile, and Android platforms, a Somax application will be available for download from each platform's store. The application will configure and initiate connections to a target Somax. Where a self-configuring electronic connection is possible (USB On-The-Go, for example), the connection will initiate without any further human action. Where a connection can only be configured by a human (e.g., Bluetooth pairing), the Somax application will initiate a two-way visual connection to a target Somax and exchange configuration information: the user faces the application device's screen and forward-facing camera at the target Somax, then issues the voice command "Somax, Configure Visually".
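
One plausible way to realize the visual exchange is for each side to display its pairing parameters as a QR code and read the other side's code through its camera. The sketch below shows that idea using the qrcode package and OpenCV's built-in QR detector; the payload fields are hypothetical, not the actual Somax configuration format.

    import json
    import cv2
    import qrcode

    # Hypothetical pairing payload; the real Somax configuration format is TBD.
    config = {"device": "somax-0001", "bt_mac": "00:11:22:33:44:55", "pin": "482913"}

    # Encode the payload as a QR code and show it on screen for the other camera.
    img = qrcode.make(json.dumps(config))
    img.save("pairing.png")

    # The other side reads the code from a camera frame (here, the saved image).
    frame = cv2.imread("pairing.png")
    data, points, _ = cv2.QRCodeDetector().detectAndDecode(frame)
    if data:
        received = json.loads(data)
        print("pair with", received["device"], "at", received["bt_mac"])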

VR Experience Reconstruction

Somax views the world through a special set of eyes. It sees color like you and me. Also like humans, it has depth perception and can understand how close or far away objects are.

Reconstruction. Unlike most humans, Somax can see, remember, and reproduce objects in three dimensions at a level of detail beyond comprehension. On command, Somax can, for example, capture a 3D model of a human with sub-millimeter accuracy! On command it can also capture a spherical 3D picture of its visually unobstructed environment with a radius of up to 10 meters. This picture can contain full-color image data as well as the distance to each pixel in the image. Somax will use its AI engine to transform this data into OpenGL instructions describing the location and estimated geometry of its visible surroundings. A separate viewing application, specific to a VR platform, will load the OpenGL instructions and generate the world as experienced by Somax at that moment. Better still, it will allow you to move to a different point of view and see what Somax imagines objects might look like.
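
The first step of such a reconstruction is back-projecting the depth image into 3D points a renderer can consume. Here is a minimal NumPy sketch of that step, assuming a simple pinhole camera model; the intrinsics are made-up values, not Somax's calibration.

    import numpy as np

    # Assumed pinhole intrinsics (focal lengths and principal point, in pixels).
    fx, fy, cx, cy = 600.0, 600.0, 640.0, 360.0

    def depth_to_points(depth):
        """Back-project a depth image (meters) into an N x 3 array of camera-space points."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        z = depth
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
        return pts[pts[:, 2] > 0]  # drop pixels with no depth reading

    # Toy 4x4 depth image; a real capture would be the full camera frame.
    depth = np.full((4, 4), 2.0)
    print(depth_to_points(depth)[:3])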

Environment Compression for Storage and Transmission

Somax has depth perception and, as such, can answer questions about the distance to any location within its visual field. This is quite useful for many things, but when combined with AI you can do something special.

AI can perform a task called semantic segmentation, which divides an image into regions labeled with their contents. For example, if an image of a typical kitchen is segmented, we would see regions carrying labels like sink, cabinet, apple, plate, fork, and so on.
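
As an illustration, an off-the-shelf pre-trained model can produce such a labeling in a few lines. This sketch uses torchvision's DeepLabV3 as a stand-in; which segmentation model Somax would actually use is an open question.

    import torch
    from torchvision import models
    from PIL import Image

    # Load a pre-trained semantic segmentation model (a stand-in for Somax's own AI).
    weights = models.segmentation.DeepLabV3_ResNet50_Weights.DEFAULT
    model = models.segmentation.deeplabv3_resnet50(weights=weights).eval()

    img = Image.open("kitchen.jpg").convert("RGB")       # hypothetical input image
    batch = weights.transforms()(img).unsqueeze(0)

    with torch.no_grad():
        out = model(batch)["out"][0]   # per-class scores for every pixel
    labels = out.argmax(0)             # class index for each pixel

    # Map class indices to names and report what the image contains.
    names = weights.meta["categories"]
    for idx in labels.unique():
        print(names[int(idx)])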

The shapes of most objects can be decomposed into a small set of general primitive shapes: cubes, pyramids, cones, and such.

A common technique in 3D modeling of virtual worlds is to build a basic, roughly shaped representation of an object, to which an actual picture of the object is then applied. This is called texture mapping.

The idea here is to apply the techniques above in reverse: take a 3D depth picture and convert it into primitive geometric shapes and textures.
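
A first approximation of this reverse step is fitting geometric primitives to the captured point cloud. The sketch below uses Open3D's RANSAC plane fitting to peel flat surfaces (walls, counters, tabletops) out of a cloud; curved primitives like cones and spheres would need additional fitting models, and the file name and parameters are guesses.

    import open3d as o3d

    # Load a captured RGB-D point cloud (file name is hypothetical).
    cloud = o3d.io.read_point_cloud("kitchen_scan.ply")

    planes = []
    remaining = cloud
    for _ in range(5):  # peel off up to five dominant flat surfaces
        if len(remaining.points) < 100:
            break
        # RANSAC fit: plane coefficients (a, b, c, d) and inlier point indices.
        model, inliers = remaining.segment_plane(distance_threshold=0.02,
                                                 ransac_n=3,
                                                 num_iterations=1000)
        planes.append((model, remaining.select_by_index(inliers)))
        remaining = remaining.select_by_index(inliers, invert=True)

    for coeffs, patch in planes:
        print(f"plane {coeffs} with {len(patch.points)} points")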

This model could then be transmitted or stored using far less space than a light-field image, which captures images with depth. This would have the greatest impact on sending streams of video. Once an initial world is segmented and compressed, only the things that change in the image need be sent again. If something in the scene moves, then only the new orientation of the affected primitives and any newly exposed texture patches would need to be transmitted. Compare continually transmitting 360-degree images at 40 bits per pixel (remember the depth and transparency bits) with sending a message as small as "cube19 changed to orientation x, y, z"!
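
Such a delta update could be only a few dozen bytes. This sketch shows one possible message layout using Python dataclasses and JSON; the field names and the surrounding scene-graph protocol are invented for illustration.

    import json
    from dataclasses import dataclass, asdict

    @dataclass
    class PrimitiveDelta:
        """One scene update: a primitive moved or rotated; textures unchanged."""
        primitive_id: str                         # e.g. "cube19"
        position: tuple[float, float, float]      # new origin, in meters
        orientation: tuple[float, float, float]   # new Euler angles, in radians

    delta = PrimitiveDelta("cube19", (1.2, 0.0, 3.4), (0.0, 1.57, 0.0))
    wire = json.dumps(asdict(delta))
    print(wire, f"({len(wire)} bytes)")

    # Versus one uncompressed 360-degree frame at 40 bits per pixel:
    print(4096 * 2048 * 40 // 8, "bytes")  # ~42 MB for a single frame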

The result would be loadable in VR, and a person would be able to see not only what the camera sees but also what the AI believes the unseen looks like.

Imagine taking a picture of the Moon and being able to see what is on its dark side. You can't see the Dark Side of the Moon; that is impossible. It is, however, possible and quite reasonable to imagine the Dark Side of the Moon based largely on what the lighted side looks like.

This idea defines a way for an AI to use an RGB depth image of the lit side of the Moon to build a complete 3D model based on what is known about the shape and texture of the Moon.

To take this idea and run with it, labeled data about primitive geometry and objects would need to be obtained. OpenGL content, VRML, or CAD data may be a good source for this.

Tap, Double Tap, Direction of Arrival

Somax has dual 9-axis inertial measurement units. One possible use of these sensors is detecting a tap or double tap.

One unit is fixed to the frame and the other to the camera, and the relative position of each unit with respect to the other is known at all times.

It may therefore be possible to create a 3-space vector from the strength of the tap measured at each sensor. The vector would indicate the direction the tap originated from.
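
A crude version of this estimate weights the known sensor positions by the tap amplitude each unit measures, on the assumption that the sensor nearer the tap feels it more strongly. The mounting positions, amplitudes, and weighting scheme below are all illustrative.

    import numpy as np

    # Known mounting positions of the two IMUs in the body frame (meters, assumed).
    p_frame = np.array([0.00, 0.00, 0.00])   # frame-mounted unit
    p_camera = np.array([0.00, 0.05, 0.20])  # camera-mounted unit

    def tap_direction(a_frame, a_camera):
        """Estimate tap origin direction from the peak accel magnitude at each IMU.

        Assumes the tap is felt more strongly by the nearer sensor, so the
        amplitude-weighted sensor position points roughly toward the source.
        """
        origin_estimate = (a_frame * p_frame + a_camera * p_camera) / (a_frame + a_camera)
        center = (p_frame + p_camera) / 2
        direction = origin_estimate - center
        norm = np.linalg.norm(direction)
        return direction / norm if norm > 0 else direction

    # Example: a tap felt three times more strongly at the camera unit.
    print(tap_direction(1.0, 3.0))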

Air Keys

Using Somax's depth camera with its motion feature, it should be possible to stand in front of Somax and simply trace out letters or numbers in the air. This could be used for input wherever a keyboard would normally be used.

This could be extended to shapes as well.

This could also be used in VR as a means of input, and it would be quite useful for refinement, where a human needs to show or trace the outline of an object in 3-space.
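
Once the fingertip has been tracked (by whatever hand-tracking method the depth camera's motion feature provides), recognizing a traced character reduces to matching the 3D trajectory against stroke templates. Here is a minimal sketch of that matching step, with the tracking stubbed out; the template set and resampling scheme are assumptions.

    import numpy as np

    def resample(points, n=32):
        """Resample a traced 3D path to n evenly spaced points, normalized to unit scale."""
        pts = np.asarray(points, dtype=float)
        seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
        t = np.concatenate([[0.0], np.cumsum(seg)])
        even = np.linspace(0, t[-1], n)
        out = np.stack([np.interp(even, t, pts[:, i]) for i in range(3)], axis=1)
        out -= out.mean(axis=0)
        scale = np.abs(out).max()
        return out / scale if scale > 0 else out

    def classify(trace, templates):
        """Return the template label whose resampled path is nearest the trace."""
        probe = resample(trace)
        return min(templates, key=lambda k: np.linalg.norm(probe - resample(templates[k])))

    # Toy templates; a real system would record these from example traces.
    templates = {
        "I": [(0, 0, 0), (0, 1, 0)],             # single vertical stroke
        "L": [(0, 1, 0), (0, 0, 0), (1, 0, 0)],  # down, then right
    }
    # A wobbly fingertip path from the depth camera (hypothetical tracker output).
    trace = [(0.02, 0.95, 0.0), (0.0, 0.5, 0.01), (0.01, 0.02, 0.0), (0.9, 0.0, 0.02)]
    print(classify(trace, templates))  # expected: "L"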