Students will learn how to install software on a UNIX computer system. To test the installation, they will hook up a camera and run an example face-tracking Python program and a target-recognition program using computer vision.
Learning outcome:
Prepare a portable computing platform and practice Unix shell skills. Observe how Unix open software is distributed through online repositories, both in binary and in source-code form.
Urs Utzinger, Updated 1/7/2025
Your Raspberry Pi
Network Connection
When 30 students attempt to download software packages, the network will slow down. I upgraded the lab wireless access point, but it is still a lot of data to download. One way to speed this up is to connect to the wired network on your desk, but we only have one connection per desk. You can also try the hotspot on your phone. You can measure network speed in a web browser by searching for speedtest.
sudo apt-get update
sudo apt-get upgrade
This is the standard approach in the Debian/Ubuntu branch of the Linux family to update system packages. macOS is part of the BSD Unix family and uses a different package manager. Besides apt there are other software distribution channels to install applications, such as Flatpak and Snap. If you install OpenCV (see below) with apt, you should not also install it with Snap or Flatpak. On Windows and on your Android or iOS phone this process is hidden, but a similar process executes.
We want to set up OpenCV (Computer Vision) and install the libraries it uses. You will need to answer questions with "Y". All commands below need to complete without errors. The first command will take substantial time.
apt first queries an online database for the subcomponents a package needs, then looks up which dependencies are not yet available on your computer, then downloads the packages, unpacks the compressed archives (packages are compressed to make them smaller for distribution), installs them, and updates the documentation and the database of installed programs on your computer.
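If you want to watch this process without changing anything, apt can list a package's dependencies and simulate an installation; both commands below only print information and are optional:
apt-cache depends libopencv-dev
sudo apt-get install --simulate libopencv-dev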
sudo apt-get install libopencv-dev
sudo apt-get install opencv-data
sudo apt-get install python3-opencv
Update the Python setup. The first 5 commands are commented out because they are already part of Raspberry Pi OS when you installed the image.
When you install Python modules with apt, they are installed system wide. Python also has the pip installer, which installs Python packages for an individual user without modifying system packages. Raspberry Pi OS, as well as Ubuntu and macOS, needs Python to work properly, and you should be careful modifying the system-installed Python packages. Here we only add a small number of packages with apt and use pip later.
smbus provides bindings to the I2C bus, which we need to read sensors.
picamera2 provides bindings to the Raspberry Pi camera subsystem.
# sudo apt-get install python3-dev
# sudo apt-get install python3-pip
# sudo apt-get install python3-numpy
# sudo apt-get install python3-setuptools
# sudo apt-get install python3-wheel
sudo apt-get install python3-smbus
sudo apt-get install python3-picamera2
It is common to create a separate user-defined Python environment that allows different versions of Python and Python packages to coexist. Raspberry Pi OS Bookworm expects you to work in a virtual environment. You can create several virtual environments. In BME225 you used Jupyter; we do not use Jupyter on the Raspberry Pi. Often you need one version of a Python package for one application but another version for another application. The virtual environment allows you to easily switch between these settings. We will make one for BME210:
# Make the folder
mkdir ~/pythonBME210
# Go to the folder
cd ~/pythonBME210
# Create the environment
python3 -m venv --system-site-packages env
To activate the Python virtual environment (each time after you start a shell; Thonny or VS Code keeps the setting):
cd ~/pythonBME210
source env/bin/activate
To deactivate the Python virtual environment:
deactivate
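To confirm which interpreter is active, you can ask the shell where python3 resolves; while the environment is active it should point into ~/pythonBME210/env/bin:
which python3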
You can run and debug your Python code in Thonny. We will not use Conda, as that works better on desktop computers. Thonny is lightweight and works for most of your code. However, in order to use Thonny with the virtual environment you need to change the Python interpreter in Thonny to the one in the virtual environment.
Switch Thonny to regular mode (top right).
Restart Thonny.
In Tools -> Options, select Interpreter.
Browse to the Python interpreter at ~/pythonBME210/env/bin/python3.
Visual Studio Code is a better programming editor than Thonny, but it needs more resources and takes longer to start. Usually the binaries are available from Microsoft: https://code.visualstudio.com/#alt-downloads ; however, for the Raspberry Pi you will need to use a different approach:
sudo apt install code
In the Raspberry Pi menu under Programming you should be able to find Visual Studio Code after you have installed it. You cannot easily use a web browser and Code at the same time on a computer with 2 GB of memory.
In this course we will need several Python packages. Python installs packages with its own program called pip. pip contacts pypi.org to find user-created Python packages.
There is a Python interpreter that runs on microcontrollers called CircuitPython. You learned how to program a microcontroller (ESP8266) in C in BME225. CircuitPython adds the ability to run Python programs on microcontrollers. The packages developed for CircuitPython will support our need to read sensors on the Raspberry Pi. Therefore we will install packages that provide compatibility:
adafruit-pureio (to access the I2C and SPI sensor interfaces)
adafruit-blinka (interface to CircuitPython packages)
adafruit-circuitpython-motorkit (for our motor HAT, which you will receive later)
For object detection we will use TensorFlow Lite, as it runs stably on the Raspberry Pi:
tflite-runtime (TensorFlow Lite runtime for object detection)
Start the virtual Python environment:
cd ~/pythonBME210
source env/bin/activate
Install the python packages:
pip3 install adafruit-pureio
pip3 install adafruit-blinka
pip3 install adafruit-circuitpython-motorkit
pip3 install tflite-runtime
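To get a feel for what Blinka provides, here is a minimal sketch (not required for this assignment) that opens the I2C bus through the CircuitPython board API and lists the addresses of any attached devices. It assumes I2C is enabled on your Raspberry Pi and that you run it inside the activated pythonBME210 environment:
# Minimal Blinka I2C scan (illustration only; assumes I2C is enabled)
import board    # CircuitPython-style pin names provided by adafruit-blinka
import busio    # CircuitPython-style bus access

i2c = busio.I2C(board.SCL, board.SDA)   # open the I2C bus on the default pins
while not i2c.try_lock():               # the bus must be locked before scanning
    pass
print("I2C addresses found:", [hex(addr) for addr in i2c.scan()])
i2c.unlock()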
We will also program a robotic arm called meArm. I maintain the code for that project on my GitHub page. If you do not have an account on GitHub you should consider getting one at some point; it is not needed for this class.
You can access all my code repositories at https://github.com/uutzinger and also at https://github.com/MediBrick . They show what I have been working on.
GitHub is a hosting service for code repositories managed with Git, a version control system developed by the creator of Linux. You can learn the commands to pull and push code, or use a desktop program to do it. It is notoriously difficult to remember how to use.
Execute the commands:
git clone https://github.com/uutzinger/meArmPi.git
git clone https://github.com/uutzinger/camera.git
This will create folders called meArmPi and camera. When there is a new version of the meArmPi library you can update it with git pull while you are in the meArmPi folder.
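For example, assuming you cloned the repositories in your home directory, a later update looks like this:
cd ~/meArmPi
git pull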
To install the camera package in the pythonBME210 environment, execute python3 setup.py install in the camera folder.
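Put together, the steps look like this (a sketch assuming you cloned the camera repository into your home directory):
cd ~/pythonBME210
source env/bin/activate
cd ~/camera
python3 setup.py install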
Open VS Code and click on the four squares (Extensions icon) on the left side. This allows you to install extensions.
Install Python, Pylance, Python Debugger by Microsoft.
When installing the extensions you will be given the option to Select or create a Python Environment. Select Interpreter, select custom path, browse to ~/pythonBME210/env/bin, and select python3.
You can open a folder, for example meArmPi, and it will display all the files in the folder on the left side. If you open meArm.py, the bottom right will display the Python interpreter it will use to run the program. You can click on it and verify that it is pointing to pythonBME210.
You can test the OpenCV installation:
In a shell/terminal, start python3
import cv2 (if this does not complete successfully, you did not complete the installation steps above)
cv2.__version__ should display the version number
exit() leaves the Python interpreter
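The same check as a short script, run inside the activated pythonBME210 environment; it also verifies the pip packages installed earlier (a sketch, not part of the assignment code):
# Quick sanity check of the installed packages
import cv2
import numpy as np
import tflite_runtime

print("OpenCV version:", cv2.__version__)
print("NumPy version:", np.__version__)
print("tflite-runtime imported OK")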
Obtain one of the CSI cameras from the course staff. You will also need the flat ribbon cable attached to it. These are typical cameras used for machine vision or autonomous systems.
Make sure the Raspberry Pi is turned off.
Insert the cable into the camera slot of the Raspberry Pi: Gently pull the release hooks towards you. Orient the cable with the pads towards the visible metal connector pins and slide the cable into the connector. Push the cable hook back in. Have the connection inspected by course staff.
Make sure no metal parts touch your camera and the exposed connections are covered up. Otherwise you may fry the camera or the Raspberry Pi.
Power on the Raspberry Pi.
To check whether the camera works, execute rpicam-hello -t 0 in a terminal. You can find documentation on the Raspberry Pi website. Raspberry Pi OS now uses libcamera to control the camera, along with its own Python wrapper and camera tools.
As a next step we want to test the camera with the camera package I made for Python.
Open Thonny, open the example program raspi_capture_display.py in the camera folder, and attempt to run it. You can save the program under another filename, because we will edit it below.
We will want to detect human posture in the video images. For that we will use TensorFlow Lite. There are many platforms to run convolutional neural networks (CNNs), such as TensorFlow (Google), PyTorch (Meta), MNN (Alibaba), etc. A CNN model is the basis of much of modern AI: it takes an input and creates an output by multiplying it with weights through successive layers. This is, in my opinion, the best explanation of how AI works: https://youtu.be/D8GOeCFFby4?si=Qiq8AY2DsSZlkoUV
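As a toy illustration of that idea, one layer is just a weight matrix multiplied with the input, plus a bias, followed by a nonlinearity (the numbers below are made up):
import numpy as np

x = np.array([0.5, -1.0, 2.0])           # input vector (e.g. pixel values or features)
W = np.array([[0.2, -0.4, 0.1],
              [0.7,  0.3, -0.2]])        # weights learned during training
b = np.array([0.1, -0.1])                # bias terms
layer_output = np.maximum(0, W @ x + b)  # multiply by weights, add bias, apply ReLU
print(layer_output)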
First we will need to obtain the pose model.
Model choices: Lightning (192×192) is faster; Thunder (256×256) is more accurate
Lightning model:
wget -O model.tflite "https://tfhub.dev/google/lite-model/movenet/singlepose/lightning/tflite/int8/4?lite-format=tflite"
Thunder model:
wget -O model.tflite "https://tfhub.dev/google/lite-model/movenet/singlepose/thunder/tflite/int8/4?lite-format=tflite"
Keypoints: SinglePose MoveNet outputs 17 COCO keypoints with (y, x, score) and shape [1, 1, 17, 3]. MoveNet is described on the TensorFlow website.
Where the model comes from: the TF Hub MoveNet tutorial shows how to download the .tflite file directly from TF Hub using wget.
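Once model.tflite is downloaded you can verify the expected input size and the [1, 1, 17, 3] output shape yourself; a short check, assuming the file is in your current folder:
from tflite_runtime.interpreter import Interpreter

interpreter = Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
print("input :", interpreter.get_input_details()[0]["shape"])   # e.g. [1 192 192 3] for Lightning
print("output:", interpreter.get_output_details()[0]["shape"])  # [1 1 17 3]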
You will need to add the image analysis to the camera program we tested above. The sections below will help you do this.
Functions to handle MoveNet in python:
Load the necessary Python packages at the beginning of your program. These imports are in addition to OpenCV and the camera imports and go at the beginning of the program:
import numpy as np
from tflite_runtime.interpreter import Interpreter
Some other helpers you should define. These are usually placed after the imports.
# COCO-17 keypoints used by MoveNet SinglePose
#
KEYPOINT_NAMES = [
"nose", "left_eye", "right_eye", "left_ear", "right_ear",
"left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
"left_wrist", "right_wrist", "left_hip", "right_hip",
"left_knee", "right_knee", "left_ankle", "right_ankle"
]
# Skeleton edges (index pairs) — matches common MoveNet demo conventions
#
EDGES = [
(0, 1), (0, 2), (1, 3), (2, 4),
(0, 5), (0, 6), (5, 7), (7, 9),
(6, 8), (8, 10), (5, 6),
(5, 11), (6, 12), (11, 12),
(11, 13), (13, 15), (12, 14), (14, 16)
]
These functions have documentation between the """ """ quotes.
def load_interpreter(model_path: str) -> Interpreter:
"""
Loads a TensorFlow Lite convolutional neural network (CNN) model
into memory and prepares it for inference.
Parameters
----------
model_path : str
File path to the .tflite MoveNet model.
Returns
-------
Interpreter
A TensorFlow Lite interpreter ready to run inference.
"""
interpreter = Interpreter(model_path=model_path)
interpreter.allocate_tensors()
return interpreter
def preprocess_bgr(frame_bgr: np.ndarray, input_h: int, input_w: int, input_dtype) -> np.ndarray:
"""
Preprocesses a camera frame so it matches the input requirements
of the MoveNet neural network.
This includes:
- Converting from BGR (OpenCV default) to RGB
- Resizing to the model's expected resolution
- Casting to the correct data type
- Adding a batch dimension
Parameters
----------
frame_bgr : np.ndarray
Input image frame from OpenCV in BGR format.
input_h : int
Height expected by the neural network.
input_w : int
Width expected by the neural network.
input_dtype :
Data type required by the model input tensor.
Returns
-------
np.ndarray
Preprocessed image tensor ready for neural network inference.
"""
# MoveNet examples use RGB input; many OpenCV cameras provide BGR
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
resized = cv2.resize(frame_rgb, (input_w, input_h), interpolation=cv2.INTER_AREA)
# Most MoveNet int8 models expect uint8; but we handle dtype generically
if input_dtype == np.uint8:
inp = resized.astype(np.uint8)
else:
# float models typically expect float32
inp = resized.astype(np.float32)
# Some float models expect [0,1] normalization; check your model if needed
inp = inp / 255.0
return np.expand_dims(inp, axis=0) # [1, H, W, 3]
def infer_keypoints(interpreter: Interpreter, input_tensor: np.ndarray) -> np.ndarray:
"""
Runs the MoveNet neural network on a preprocessed input image
and extracts the predicted human pose keypoints.
Parameters
----------
interpreter : Interpreter
Loaded TensorFlow Lite interpreter.
input_tensor : np.ndarray
Preprocessed input image tensor.
Returns
-------
np.ndarray
Raw model output containing normalized keypoint coordinates
and confidence scores.
"""
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
interpreter.set_tensor(input_details[0]["index"], input_tensor)
interpreter.invoke()
# [1, 1, 17, 3]
return interpreter.get_tensor(output_details[0]["index"])
def keypoints_to_pixels(kpts: np.ndarray, frame_h: int, frame_w: int):
"""
Converts normalized keypoint coordinates (0–1) produced by MoveNet
into pixel coordinates relative to the original camera frame.
Parameters
----------
kpts : np.ndarray
Raw keypoint output from the neural network.
frame_h : int
Height of the camera frame in pixels.
frame_w : int
Width of the camera frame in pixels.
Returns
-------
list of tuples
List of (x, y, confidence) keypoints in pixel coordinates.
"""
# kpts shape: [1, 1, 17, 3] -> [17, 3]
pts = kpts[0, 0, :, :] # (y, x, score)
out = []
for (y, x, s) in pts:
px = int(x * frame_w)
py = int(y * frame_h)
out.append((px, py, float(s)))
return out
def draw_pose(frame_bgr: np.ndarray, points, min_score=0.2):
"""
Draws detected human pose keypoints and skeletal connections
onto the camera frame using OpenCV drawing primitives.
Parameters
----------
frame_bgr : np.ndarray
Original camera frame.
points : list of tuples
List of (x, y, confidence) keypoints in pixel coordinates.
min_score : float
Minimum confidence threshold required to draw a keypoint
or skeletal connection.
"""
# points: list of (x,y,score)
for (x, y, s) in points:
if s >= min_score:
cv2.circle(frame_bgr, (x, y), 4, (0, 255, 0), -1)
for (a, b) in EDGES:
xa, ya, sa = points[a]
xb, yb, sb = points[b]
if sa >= min_score and sb >= min_score:
cv2.line(frame_bgr, (xa, ya), (xb, yb), (255, 0, 0), 2)
Code for the main program. This usually goes after the function definitions and before the main while loop:
# Load the model
model_path = "model.tflite"
interpreter = load_interpreter(model_path)
# Extract model details
input_details = interpreter.get_input_details()
_, in_h, in_w, _ = input_details[0]["shape"]
in_dtype = input_details[0]["dtype"]
Analysis code for the main loop. After we obtain the images, we want to analyze them and show the body parts we found:
# Convert frame to model input
h, w = frame.shape[:2]
input_tensor = preprocess_bgr(frame, in_h, in_w, in_dtype)
# Infer the keypoints
kpts = infer_keypoints(interpreter, input_tensor)
points = keypoints_to_pixels(kpts, h, w)
# Draw the keypoints
draw_pose(frame, points, min_score=0.2)
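To see how all of these pieces fit together, here is a minimal stand-alone sketch. It uses cv2.VideoCapture as a stand-in for the capture code in raspi_capture_display.py, so treat it as a reference to compare against rather than the program you hand in; it assumes the functions and constants above are defined in the same file:
import cv2
import numpy as np
from tflite_runtime.interpreter import Interpreter

# ... KEYPOINT_NAMES, EDGES, and the functions defined above go here ...

model_path = "model.tflite"
interpreter = load_interpreter(model_path)
input_details = interpreter.get_input_details()
_, in_h, in_w, _ = input_details[0]["shape"]
in_dtype = input_details[0]["dtype"]

cap = cv2.VideoCapture(0)                     # stand-in for the course camera code
while True:
    ok, frame = cap.read()                    # grab one BGR frame
    if not ok:
        break
    h, w = frame.shape[:2]
    input_tensor = preprocess_bgr(frame, in_h, in_w, in_dtype)
    kpts = infer_keypoints(interpreter, input_tensor)
    points = keypoints_to_pixels(kpts, h, w)
    draw_pose(frame, points, min_score=0.2)
    cv2.imshow("pose", frame)                 # show the annotated frame
    if cv2.waitKey(1) & 0xFF == ord("q"):     # press q to quit
        break
cap.release()
cv2.destroyAllWindows()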
In this assignment you used many Unix commands. You could enter them into your cheat sheet of Unix commands.