Target solution + plan:
Given that our visuals are very specific (e.g., an exposed back, a forearm, lotion, medicinal alcohol, a specifically colored cloth, cotton), we feel the best plan of action is to train our own CNN (convolutional neural network) model for most of our object detection needs. The deep perception network that comes with Stretch was nowhere near accurate enough for these targets.
Additionally, we plan on having two separate models: 1) one for detecting skin regions such as the user's back and forearm, and 2) another for detecting the different topical liquids, the cloth, and the cotton.
On Kaggle, we found the PASCAL-Part dataset (insert URL here), which has 300K+ images with the body segmented into parts such as forearms, torso/back, head, and legs (some split into 2-3 sub-parts). We'd split these images 80% for training, 10% for validation, and 10% for testing. This should suffice for training our custom model and tuning its hyperparameters. We'd likely write the custom CNN against this dataset in PyTorch.
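To make this concrete, here is a minimal sketch (not a final architecture) of the 80/10/10 split and a small CNN classifier in PyTorch. The folder path and the idea of exporting PASCAL-Part crops into class-labeled folders are our assumptions for illustration, not part of the dataset itself.

```python
import torch.nn as nn
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
])

# Assumes PASCAL-Part crops have been exported into class subfolders
# (hypothetical path/layout).
full_set = datasets.ImageFolder("data/pascal_part_crops", transform=transform)
n = len(full_set)
n_train, n_val = int(0.8 * n), int(0.1 * n)
train_set, val_set, test_set = random_split(
    full_set, [n_train, n_val, n - n_train - n_val]  # 80/10/10 split
)

class BodyPartCNN(nn.Module):
    """Tiny classifier: two conv blocks plus a linear head over body-part classes."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        # 128x128 input halved twice -> 32x32 spatial, 32 channels
        self.head = nn.Linear(32 * 32 * 32, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.head(x.flatten(1))
```

The real model would likely need more capacity and augmentation; the point of the sketch is the data split and the overall training setup.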
For the objects listed above, we'd have a second model: YOLOv8 on the computer vision side, fine-tuned on the Google Open Images dataset. Again, we need to detect a towel, cotton, and two liquids (alcohol + lotion).
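A hedged sketch of what the fine-tuning could look like with the ultralytics package; "supplies.yaml" and its four classes are placeholders we would build from the Open Images exports.

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")           # small pretrained checkpoint
model.train(data="supplies.yaml",    # hypothetical YAML listing our 4 classes
            epochs=50, imgsz=640)

# Run detection on one camera frame and inspect the boxes
results = model("frame.jpg")
for box in results[0].boxes:
    print(box.cls, box.conf, box.xyxy)
```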
Note: for obstacle detection (when the robot is heading toward the user), simple sensors would suffice, or the YOLOv8 model above. There isn't a strong need to identify WHAT the obstacles are, but rather to detect their presence and size so the robot can navigate around them.
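Since we only need presence and rough size, one option is to combine a detection box with a depth reading and apply the pinhole camera model. FOCAL_PX below is a hypothetical value, not a real calibration.

```python
FOCAL_PX = 600.0  # hypothetical camera focal length in pixels

def obstacle_width_m(bbox_px_width: float, depth_m: float) -> float:
    """Pinhole model: physical width = pixel width * depth / focal length."""
    return bbox_px_width * depth_m / FOCAL_PX

# e.g. a 150 px wide box at 1.2 m reads as ~0.3 m wide -> plan a path around it
```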
Fallback ideas to simplify target solution:
1) We found 1-2 open-source body segmentation models. We're considering BodyPix (url: insert URL here) as an alternative to a custom model for detecting the user's back/forearm.
2) If need be, we plan to make the objects/liquids a specific color so they're more easily identified as a common liquid/object in the Google Open Images dataset (see the color-thresholding sketch after this list).
3) For site detection, someone could draw on the user's back/forearm in erasable marker to indicate the boundary to apply liquid within. At least four corner marks would be needed for simple rectangular sites; the sketch below also shows how those marks could be found. Later, if time allows, we could aim for something more ambitious, like any closed shape (a line boundary drawn on the user's skin).
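One OpenCV sketch covering fallbacks 2 and 3: isolate a distinctly colored liquid or marker with an HSV threshold, then take contour centroids as corner marks. The HSV bounds are placeholders to be tuned for whatever color we actually choose.

```python
import cv2
import numpy as np

def find_marks(bgr_frame, lower_hsv=(100, 120, 60), upper_hsv=(130, 255, 255)):
    """Threshold a chosen color in HSV space and return up to 4 blob centroids."""
    hsv = cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower_hsv), np.array(upper_hsv))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # keep the four largest blobs and return their centroids as corner marks
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:4]
    corners = []
    for c in contours:
        m = cv2.moments(c)
        if m["m00"] > 0:
            corners.append((int(m["m10"] / m["m00"]), int(m["m01"] / m["m00"])))
    return corners  # order these (e.g. clockwise) before fitting the rectangle
```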
Target Solution, other variables to track:
1. surface area of application site (see the area sketch after this list)
- too much surface area is a problem because you're probably 1) wasting the liquid and 2) rubbing it on places that were not intended
2. any wounds or sub-areas to avoid on the scanned application site
- don't want to hurt user
3. flatness or 3D orientation of the scanned application site
- we support any combination of horizontal and vertical orientations (up to the user), although horizontal might be more comfortable
4. pressure applied to back (the monitoring sketch after this list covers items 4-7)
- too much pressure is a problem: don't hurt user
- too little pressure is a problem: not applying liquid well enough
5. track motion of user or application site
- too much movement poses a problem with applying the topical liquid
- erratic movement can indicate discomfort
6. the robot tracks its wiping speed, and how long it has been since it last coated the cloth with liquid
- want to do an effective job (right speed without causing discomfort, and need enough liquid on the cloth or else get a refill)
7. how much liquid is on the cloth/cotton (optional)
- might need to redip if not enough
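For item 1, a small sketch of turning a segmented-site pixel count into an approximate physical area via depth, assuming a roughly fronto-parallel patch; FOCAL_PX is the same hypothetical focal length used in the obstacle sketch above.

```python
import numpy as np

FOCAL_PX = 600.0  # hypothetical camera focal length in pixels

def site_area_m2(site_mask: np.ndarray, depth_m: float) -> float:
    """Each pixel covers ~(depth/focal)^2 square meters at distance depth_m."""
    pixel_area = (depth_m / FOCAL_PX) ** 2
    return float(site_mask.sum()) * pixel_area

# e.g. a 40,000-pixel mask at 0.5 m -> 40000 * (0.5/600)^2 ≈ 0.028 m^2
```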
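For items 4-7, a hedged sketch of the runtime threshold checks; every constant is a placeholder to tune on the real robot, and the pressure, motion, and timing inputs stand in for whatever sensor interfaces we end up with.

```python
import time

PRESSURE_MIN_N, PRESSURE_MAX_N = 2.0, 8.0  # hypothetical safe band (newtons)
MOTION_LIMIT = 25.0                        # hypothetical mean pixel delta per frame
REDIP_INTERVAL_S = 20.0                    # hypothetical time before re-coating

def check_state(pressure_n, motion_score, last_dip_time):
    """Return a list of corrective actions based on simple threshold checks."""
    actions = []
    if pressure_n > PRESSURE_MAX_N:
        actions.append("reduce pressure")    # item 4: don't hurt user
    elif pressure_n < PRESSURE_MIN_N:
        actions.append("increase pressure")  # item 4: apply liquid effectively
    if motion_score > MOTION_LIMIT:
        actions.append("pause wiping")       # item 5: user moving / discomfort
    if time.time() - last_dip_time > REDIP_INTERVAL_S:
        actions.append("re-coat cloth")      # items 6-7: refill the liquid
    return actions
```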