Drafted by Alfred, modified on 06/22/2023
Abstract:
This document summarizes the steps for running FPGA acceleration applications on Amazon Web Services (AWS). It covers the following topics:
1. How to create an AWS F1 instance;
2. How to initialize the server;
3. How to create a Vitis project and compile the host application and FPGA kernels;
4. How to create an AWS FPGA image from the generated FPGA kernels;
5. How to run acceleration applications;
6. How to work with Spot instances;
This document also contains basic information about how to create FPGA acceleration applications on
the ZCU104 evaluation board. Sections marked with '*' are not necessary for running acceleration applications
on AWS. Before starting, create an AWS account on the official website.
1 Create AWS F1 instance
1.1 Generate ssh keys for first-time access
The AWS instances do not have a password by default, so the 'password authentication' option in the ssh configuration is disabled initially. To log in with ssh, key-pair authentication is required, and therefore we have to create the key pair in advance. When the instance is created, the public key is installed on the instance and we log in using the matching private key. Here is how to create the ssh key pair:
Step 1: Log into the AWS management console as the root user
Step 2: On the top right, change the region to N. Virginia (us-east-1). AWS F1 instances are only available in a few regions, and this tutorial uses us-east-1.
Step 3: Click the Services on the top left and select EC2.
Step 4: Expand 'Network & Security' in the left panel and click 'Key Pairs'.
Step 5: Click the ’Create key pair’ on the top right. Enter the key name and select ’RSA’ and ’.pem’.
Step 6: Click 'Create Key Pair' at the bottom. Your browser should automatically download the private key named *.pem. Save the key properly; this is your only chance to obtain this key file. If your local machine is running Linux, it is suggested to save it under ~/.ssh/.
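If you prefer the command line, the same key pair can be created with the AWS CLI (a hedged sketch only; it assumes the CLI is already installed and configured with the access key from sections 1.2 and 2.3, and <key-name> is a placeholder):
# create the key pair in us-east-1 and save the private key locally
aws ec2 create-key-pair --key-name <key-name> --region us-east-1 \
    --query 'KeyMaterial' --output text > ~/.ssh/<key-name>.pem
chmod 400 ~/.ssh/<key-name>.pem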
1.2 Create Amazon cloud service access key
The ssh key pair is used for logging in to the instance. To use AWS services, such as the S3 bucket, from either the local machine or the remote instance, we need another access key. Here is how to generate the AWS services access key:
Step 1: Log into the AWS management console as the root user.
Step 2: Click your user name on the top right, select ’My security credentials’
Step 3: Expand ’Access keys (access key ID and secret access key)’, and click ’Create New Access Key’.
Step 4: Click 'Download Key File' to download a CSV file containing your AWS access key ID and AWS secret access key. Again, save the file properly; it is your only chance to download the access key.
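For reference, the 'aws configure' command used later (section 2.3) simply writes these two values into ~/.aws/credentials. A minimal sketch of what that file ends up looking like (the values are placeholders):
# ~/.aws/credentials
[default]
aws_access_key_id = <your-access-key-id>
aws_secret_access_key = <your-secret-access-key>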
1.3 Subscribe to the Xilinx FPGA Developer AMI
The AWS instance is basically a remote PC; we have to specify the software (including the operating system) to be installed on it. Xilinx has created a CentOS system image, named 'FPGA Developer AMI', with the Vitis development tool kit (Vitis, Vitis HLS, Vivado, PetaLinux, etc.) pre-installed; using the provided image directly saves a lot of time. To use it, we have to subscribe to it first, which is simple. It may take several minutes for Amazon (or Xilinx) to approve the subscription. To subscribe, go to (or just search for) AWS FPGA Developer AMI and click 'Continue to Subscribe'. You may need to confirm your subscription again.
1.4 Create AWS F1 instance
After the FPGA Developer AMI subscription is approved, we can start creating the AWS instance. The AWS instances come in several families denoted by different letters. We only need the 'C' series instances, which are compute optimized, and the 'F' series instances, which are equipped with Xilinx FPGA PCIe cards. The price is billed according to the uptime of the instance, and the hourly price of the 'F' series instances is much higher than that of the 'C' series. Since compiling a Vitis project can take hours, Xilinx recommends that customers use 'C' series instances to build the project and start 'F' series instances only after the FPGA images are ready. The flows for creating the two types of instances are almost identical; the steps are as follows.
Step 1: Log in to the AWS management console, click ’Services’, and select ’EC2’.
Step 2: Expand the ’Instances’ and click ’Instances’ under the tab.
Step 3: Click 'Launch instances' on the top right, which should be highlighted in orange.
Step 4: Choose an Amazon Machine Image (AMI). Search ’FPGA’ in the search bar, and then click ’AWS Marketplace’. If you have successfully subscribed to the FPGA Developer AMI (section 1.3), you shall see the FPGA Developer AMI on the right. Select it. The details are shown in Fig. 1.
Step 5: Choose an Instance Type. After selection, you may see a floating window listing the price of different instances; click continue if you saw this. As discussed before, you can choose either the ’C’ family or the ’F’ family. Choose ’c4.4xlarge’ for now, and click ’Next: Configure Instance Details’ in the bottom right.
Step 6: Configure Instance Details. Typically, you don’t need to change anything here. Click ’Next: Add Storage’.
Step 7: Add Storage. Usually, Amazon has already added an extra 5GB EBS volume for you, but I find it unnecessary. Delete it by clicking the × on the right. Click 'Next: Add Tags'.
Step 8: Add Tags. Typically, you don’t need to change anything here. Click ’Next: Configure Security Group’.
Step 9: Configure Security Group. Select ’Create a new security group’. Initially, only ssh port 22 is opened to the public network. We need to add some other rules to enable the function of the remote desktop. Click ’Add Rule’ and fill them as shown in Fig. 2. Click ’Review and Launch’.
Step 10: Review Instance Launch. Check everything, especially the security group. Click 'Launch', and you shall see a window asking you to select a key pair. Choose 'Choose an existing key pair', and select the key pair you created before (section 1.1). Check the acknowledgment, and click 'Launch Instances'.
*Step 11: Start instance. Go back to your 'EC2 Management' console and select 'Instances' (as in Step 2). You shall see the instance just created. Refresh the status, and once the 'Status Check' becomes '2/2 checks passed', the instance is ready to use. To start an existing instance, right-click the instance name and click 'Start instance' (not 'Launch instance').
*Step 12: You will be charged while the instance is running, so after finishing your work, never forget to stop it. Right-click the instance name and click 'Stop instance'. Don't click 'Terminate instance'; that removes the entire instance. When creating 'F' instances, the launch may be refused at Step 10 with an error stating that the number of vCPUs you applied for exceeds your limit. The reason is that the vCPU limit for 'F' instances is 0 by default, while the smallest 'F' instance requires 8 vCPUs, which effectively means customers are initially not allowed to use 'F' instances at all (the vCPU limit is counted per instance family, so it doesn't matter how many vCPUs you have already used on 'C' series instances). Therefore, we have to submit a request to Amazon asking for a limit of at least 8 vCPUs.
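If you prefer the command line, the 'F' instance vCPU quota can also be inspected and raised through the Service Quotas API (a hedged sketch only; it assumes the quota name contains "F instances", and <quota-code> is a placeholder taken from the first command's output):
# find the EC2 quota entry for F instances (shows its QuotaCode and current Value)
aws service-quotas list-service-quotas --service-code ec2 --region us-east-1 | grep -i -C 5 "F instances"
# request an increase to 8 vCPUs using the QuotaCode found above
aws service-quotas request-service-quota-increase --service-code ec2 \
    --quota-code <quota-code> --desired-value 8 --region us-east-1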
Figure 1: The location of FPGA Developer AMI
Figure 2: The rules for remote access
2 Initialize server
When the instance is first created, it contains only a bare operating system. We have to initialize it by installing the typically required software and simplifying the login process. Here are the three main tasks.
2.1 Setup Graphic User Interface (GUI) and remote desktop
Vitis IDE requires a desktop environment to run, and of course we can only access that desktop remotely. Therefore, we have to install the desktop environment first and then set up remote access. Here is a universal way to accomplish this: the 'Remote Desktop' client included in Windows 10 can be used to access the instance. Amazon provides another way called 'NICE DCV', whose installation instructions are given in the Appendix. Though I didn't find any performance difference between the two approaches, 'NICE DCV' requires standalone software to be installed. Therefore, I prefer the universal solution, which is described below:
Figure 3: Location of the public IP
Step 1: Start the instance first, and you shall see the public IP address assigned to the instance in the instance details, as shown in Fig. 3. Open a Linux terminal (or log in to our server if you don't have a Linux system available), and use the command:
# on Linux (e.g. Ubuntu) the key permissions may need to be restricted first
*sudo chmod 400 <name_of_your_key.pem> # only the owner keeps read permission
ssh -i <name_of_your_key.pem> centos@<ip_address>
to log in to the instance, where the .pem file is the key file you downloaded in section 1.1. Type yes if your system asks whether you want to remember the machine. If you want to use PuTTY on your Windows machine, follow the instructions here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/putty.html. Note that you have to use "centos" as the user name instead of ec2-user when using PuTTY.
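To avoid typing the key path and user name every time, you can add an entry to your local ~/.ssh/config (an optional sketch; 'aws-f1' is just a hypothetical alias, and the IP address changes when a non-Elastic-IP instance is restarted):
# ~/.ssh/config
Host aws-f1
    HostName <ip_address>
    User centos
    IdentityFile ~/.ssh/<name_of_your_key.pem>
After that, commands like 'ssh aws-f1' and 'scp aws-f1:file .' work directly.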
Step 2: Run the following commands. All steps come from https://devopscube.com/how-to-setup-gui-for-amazon-ec2-rhel-7-instance/.
# update the server, 'yum' in centos is just the 'apt' in ubuntu
sudo yum -y update
# install GUI
sudo yum groupinstall -y "Server with GUI"
# start GUI during boot
sudo systemctl set-default graphical.target
sudo systemctl default
# Add the xrdp repository to your instance
sudo rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-1.el7.nux.noarch.rpm
# Install XRDP and tigervnc-server
sudo yum install -y xrdp tigervnc-server
# Setup SELINUX security
sudo chcon --type=bin_t /usr/sbin/xrdp
sudo chcon --type=bin_t /usr/sbin/xrdp-sesman
# Start and Enable XRDP
sudo systemctl start xrdp
sudo systemctl enable xrdp
# Enable XRDP Port
sudo firewall-cmd --permanent --add-port=3389/tcp
sudo firewall-cmd --reload
# Setup Password (differs from the website: replace ec2-user with your user name, which is centos in this case)
sudo passwd centos
# Setup Password for root
sudo su
passwd
Step 3: Open the 'Remote Desktop Connection' software on your system (it should already be installed if you are using Windows) and type in the IP address of the instance. Expand the settings and change the color setting to 24-bit. Click Connect.
Step 4: Ignore all the warnings and trust the connection. A login screen then shows up. Type in the username 'centos' and the password you set in Step 2. Click 'OK'. Now you should be able to see the desktop.
2.2 Exchange ssh key if necessary
To simplify terminal ssh login and file transfers (scp), I recommend exchanging ssh keys with your local machine so that you don't have to type the password every time. This is also beneficial for automating software execution later, when file transfers happen frequently. On your local Linux machine, if you haven't created any key pairs before, use the command:
ssh-keygen -t rsa
It may ask you some questions; skip them by pressing Enter. After this, a key pair is created under the ~/.ssh/ folder. The private key is named id_rsa (no suffix) and the public key is id_rsa.pub. Copy the public key to the server and run the following command on the server:
cat id_rsa.pub >> .ssh/authorized_keys
Note that the .ssh folder may not exist on the server (or instance); create it first if the command fails. You can also create a key pair on the server and install it on your local machine with the same method. Fig. 4 shows a brief introduction to how ssh key-pair authentication works.
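The copy-and-append step can also be done in one go from the local machine (a small sketch assuming the default key name id_rsa and the centos user; ssh-copy-id achieves the same thing if it is installed):
# copy the local public key to the instance and append it to authorized_keys
scp -i ~/.ssh/<name_of_your_key.pem> ~/.ssh/id_rsa.pub centos@<ip_address>:~
ssh -i ~/.ssh/<name_of_your_key.pem> centos@<ip_address> \
    'mkdir -p ~/.ssh && cat ~/id_rsa.pub >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'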
Figure 4: The authentication process using public/private key pair
2.3 Setup S3 bucket
The S3 bucket is a cloud storage service provided by Amazon. The customized FPGA images must be saved in an S3 bucket so that they are accessible to all instances. Here are the steps:
Step 0*: Install the AWS client tools. The tools should already be installed on the FPGA Developer AMI, but if you want to access the AWS services from your local machine, run the command below. If you are using a Windows machine, follow the instructions here: https://aws.amazon.com/cli/
# use yum instead of apt-get on a CentOS operating system
sudo apt-get install awscli
Step 1: Run
aws configure
It will first ask you to input the AWS Access Key ID, which you can find in the file downloaded in section 1.2. After that, it asks for the AWS Secret Access Key, which is in the same file. Then it asks for the default region name; enter us-east-1. Finally, set the default output format to json. If you want to learn more about S3, here is the reference: https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3/index.html.
Step 2: Create an S3 bucket using:
# replace <bucket-name> with a name you like; bucket names cannot contain capital letters (or underscores)
aws s3 mb s3://<bucket-name> --region us-east-1
Step 3: Create the required folders to save temporary .dcp and log files, using the following commands:
# Create temporary file
touch DCP_FILES_GO_HERE.txt
# Copy the temporary file into it to let s3 bucket create the folder automatically
aws s3 cp DCP_FILES_GO_HERE.txt s3://<bucket-name>/<dcp-folder-name>/
touch LOG_FILES_GO_HERE.txt
aws s3 cp LOG_FILES_GO_HERE.txt s3://<bucket-name>/<log-folder-name>/
# remove local useless temporary files
rm DCP_FILES_GO_HERE.txt LOG_FILES_GO_HERE.txt
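To confirm that the bucket and folders were created, you can list the bucket contents (assuming the same bucket and folder names as above):
# list everything under the new bucket; the two placeholder files should appear
aws s3 ls s3://<bucket-name>/ --recursive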
3 Introduction to Xilinx Runtime Library
Xilinx Runtime Library (XRT) is a software interface connecting the host (the Linux user layer) to the FPGA device. The architecture of an XRT application is shown in Fig. 5. XRT receives the bit-container file named *.xclbin, which includes both the bitstream and the hardware description information (similar to *.hwh in PYNQ), and responds to API calls from the Linux user software. Therefore, to run an application on an XRT platform, at least two files are required: the host executable (software running on Linux) and the bit-container. XRT is further divided into two categories based on the host. Among the currently supported development boards, boards like the U200 and AWS F1 have PCIe interfaces to communicate directly with a PC; the host in this case is the PC itself. Boards equipped with a ZYNQ MPSoC typically cannot communicate with a PC directly; the PS works as the host, and the software becomes embedded software running on the aarch64 architecture. That is also why Xilinx gives these another name, "Embedded Platforms".
Figure 5: The architecture of XRT applications
4 Introduction to Vitis IDE*
Vitis has two main types of projects: Application Projects and Platform Projects. An Application Project is created on top of a compiled Platform Project; therefore, if an official platform project is not available, we have to create the platform ourselves. The platform for AWS F1 instances is provided by Xilinx (in the aws-fpga repository), so we can use it directly.
An application project has three sub-projects. Typically, after you enter the application project name (denoted name), Vitis first creates a folder named name_system in the working repository, which collects all the compiled files from the three sub-projects. The first sub-project is the host application project, saved in the folder name. This is just a normal Linux project which can be compiled directly on the host, even outside Vitis, as long as all required libraries are included in the compilation. The second sub-project is saved in name_kernels. This is a Vitis HLS (high-level synthesis) project, where the functions inside are synthesized and implemented on the FPGA. Therefore, it is forbidden to use any Linux system calls or any dynamically allocated memory inside this project (they are incompatible with hardware). The third sub-project is saved in name_system_hw_link. This project creates the bit-container, i.e. the *.xclbin file, and is a bridge between the IPs created from the kernel project and the host project. The bus connection and hardware address assignment (similar to the process of creating a functional block diagram from generated HLS IP cores) happen in this project, and thereby, as said before, the hardware description is included in the *.xclbin file. To summarize, if we create an application project named 'fir', the folder structure is shown below.
|------Working Repository
| |--fir
| |--fir_kernels
| |--fir_system_hw_link
| |--fir_system
Another thing to notice is that the kernel function name invoked in the host application must match the function name defined in the kernel project, while the name of the *.xclbin file is determined by the bit-container name specified in the hw_link project.
5 Create acceleration applications with Vitis
The AWS FPGA acceleration application creation generally has 5 steps:
Step 1: Create Vitis Project.
Step 2: Create Host and Kernel source files.
Step 3: Build and Test.
Step 4: Create AWS FPGA images.
Step 5: Run application.
5.1 Vitis Accel on AWS: Create Vitis Project
Step 1: Start a ’c’ series instance and log in to it using Remote Desktop Connection.
Step 2: Download AWS platform resources.
• Right click on the desktop, select Open Terminal.
• Download platform resources. Run:
cd src
git clone https://github.com/aws/aws-fpga.git
Step 3: Start Vitis. Run:
source /opt/Xilinx/Vitis/2021.1/settings64.sh
# It may take longer the first time. You should see "INFO: Vitis Setup PASSED" at the end if everything goes well. Repeat this step every time you open a new terminal.
source ~/src/aws-fpga/vitis_setup.sh
# Anywhere you like, but make sure it is an empty folder. It will be quite messy after compiling.
mkdir -p ~/src/project_data/Vitis_Demo
cd ~/src/project_data/Vitis_Demo/
vitis &
Step 4: Create Project:
• In the Vitis IDE Launcher dialog, set the Workspace to ~/src/project_data/Vitis_Demo
• Click Create Application Project. Click Next.
• In the Platform dialog, choose Select a platform from repository, and click Add.
• In the Specify Custom Platform Location dialog, go to ~/src/aws-fpga/Vitis, select the aws platform folder, and click Open. It takes a few seconds to load the platform information. Click Next when it is ready. The dialog should look like Fig. 6.
• In the Application Project Details, specify the Application project name. Use ’fir’ for now. Click Next.
• In the Templates dialog, select Empty Application (XRT Native API’s). Click Finish.
Figure 6: The Overview of AWS platform
5.2 Vitis Accel on AWS: Create Host and Kernel source files
Step 1: In the Explorer (the window on the top left), right-click the fir_system/fir/src folder. Click New -> File. In the Create New File dialog, type in the file name xcl2.hpp. The content is provided in xcl2.hpp; copy it into the created file.
Step 2: In the same way, create and add contents to the following files under fir_system/fir/src:
• xcl2.cpp
• host.cpp
Step 3: Similarly, create and add contents to the following files under fir_system/fir_kernels/src:
• fir_naive.cpp
• fir_shift_register.cpp
Step 4: Double-click fir_system/fir_kernels/fir_kernels.prj. Then in the edit panel, click the Add Function button (which is quite hard to see, see Fig. 7). Add both fir_naive and fir_shift_register.
Step 5: Double-click fir_system/fir_system_hw_link/fir_system_hw_link.prj. In the edit panel, you shall see that the two functions have already been added automatically to binary_container_1. Click binary_container_1 once and rename it to fir.
Step 6: Save all modified files.
Figure 7: Add hardware function to the kernel
5.3 Vitis Accel on AWS: Build and Test
Step 1: On the top right, click the drop-down arrow to the right of the hammer icon and select Emulation-SW (see Fig. 8). This builds the target for software emulation.
Step 2: Right-click fir_system in the Explorer and select Run As -> Launch SW Emulator. See Fig. 9 for details. You should see TEST PASSED in the output console if everything goes well.
Step 3: Similarly, build and run it on the Hardware Emulator.
Step 4: Build the Hardware target. It may take a long time. When finished, you can find the executable file fir and the bit-container fir.xclbin in fir_system/fir/Hardware/.
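For reference, roughly the same hardware build can also be driven from the command line with v++ (a hedged sketch only; the IDE adds more options, $AWS_PLATFORM is the platform variable exported by vitis_setup.sh, and the source paths assume the project layout from section 5.2):
# compile each kernel function into a .xo object (this is the slow step)
v++ -c -t hw --platform $AWS_PLATFORM -k fir_naive -o fir_naive.xo fir_kernels/src/fir_naive.cpp
v++ -c -t hw --platform $AWS_PLATFORM -k fir_shift_register -o fir_shift_register.xo fir_kernels/src/fir_shift_register.cpp
# link the objects into the fir.xclbin bit-container
v++ -l -t hw --platform $AWS_PLATFORM -o fir.xclbin fir_naive.xo fir_shift_register.xo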
Figure 8: Build on Software Emulator
Figure 9: Run on Software Emulator
5.4 Vitis Accel on AWS: Create AWS FPGA images
The executable file fir is ready to use. However, the bit-container cannot be loaded onto the AWS 'f' instance directly. For security, Amazon requires us to upload the bit-container to the S3 bucket, and they regenerate a verified bit-container, named fir.awsxclbin in this case, for us to use. Here are the steps.
Step 1: Open a terminal and cd to ~/src/project_data/Vitis_Demo. Run:
mkdir aws_image
cd aws_image
cp ../fir/Hardware/fir.xclbin ./
~/src/aws-fpga/Vitis/tools/create_vitis_afi.sh \
-xclbin=fir.xclbin \
-o=fir \
-s3_bucket=<bucket-name> \
-s3_dcp_key=<dcp-folder-name> \
-s3_logs_key=<log-folder-name>
The <bucket-name>, <dcp-folder-name> and <log-folder-name> are the ones you created in section 2.3. After running this command, you shall see "Successfully wrote (37921 bytes) to the output file: fir.awsxclbin" if everything works well. It takes about half an hour for Amazon to generate the verified bit-container. However,
you should already see fir.awsxclbin in the folder. Notice that it is just a link to the real image saved in the Amazon cloud, which may not be ready to be downloaded yet. To check the status, run the following in the aws_image folder:
aws ec2 describe-fpga-images --fpga-image-ids=$(cat *_afi_id.txt | awk 'NR==2{print$2}' | tr -cd 'a-z0-9-')
The output should be something like this:
{
"FpgaImages": [
{
"UpdateTime": "2021-09-29T14:06:38.000Z",
"Name": "fir",
"Tags": [],
"FpgaImageGlobalId": "agfi-07fa3a2a783db5e44",
"Public": false,
"State": {
"Code": "pending"
},
"OwnerId": "492975968878",
"FpgaImageId": "afi-01e77a2928e2ede26",
"CreateTime": "2021-09-29T14:06:38.000Z",
"Description": "fir"
}
]
}
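Since the regeneration takes a while, you can also let a small loop do the polling (an optional sketch that reuses the same command; adjust the sleep interval to taste):
# extract the AFI id once, then re-check every 5 minutes while the state is still "pending"
AFI_ID=$(cat *_afi_id.txt | awk 'NR==2{print $2}' | tr -cd 'a-z0-9-')
while aws ec2 describe-fpga-images --fpga-image-ids $AFI_ID | grep -q '"Code": "pending"'; do
    sleep 300
done
echo "AFI state is no longer pending"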
Check it regularly (manually or with a loop like the one above) until the State becomes available.
Step 2: If you ran the compilation on a 'c' series instance, you will need to copy the executable file fir and the verified bit-container fir.awsxclbin to the 'f' instance later. Therefore, copy the files to the S3 bucket using
aws s3 cp ./fir.awsxclbin s3://<bucket-name>/
aws s3 cp ../fir/Hardware/fir s3://<bucket-name>/
5.5 Vitis Accel on AWS: Run application
After the State becomes available, stop your 'c' instance and start the 'f' instance. You may not need the remote desktop in this case. Log in to the 'f' instance. First, download the executable file and the bit-container from the S3 bucket, then run the application. Run:
cd src
git clone https://github.com/aws/aws-fpga.git
# this is required, otherwise fir cannot find the necessary Xilinx libraries
source ~/src/aws-fpga/vitis_setup.sh
mkdir -p ~/app/fir
cd ~/app/fir
# download files
# configure aws first if you haven't; follow section 2.3 Setup S3 bucket
aws s3 cp s3://<bucket-name>/fir.awsxclbin ./
aws s3 cp s3://<bucket-name>/fir ./
*sudo systemctl restart mpd # (optional) restart the XRT management proxy daemon
sudo chmod +x fir
# run
./fir fir.awsxclbin
The output should be:
Found Platform
Platform Name: Xilinx
INFO: Reading fir.awsxclbin
Loading: 'fir.awsxclbin'
Trying to program device[0]: xilinx_aws-vu9p-f1_shell-v04261818_201920_2
Device[0]: program successful!
Example Testdata Signal_Length=1048576 for 100 iteration
|-------------------------+-------------------------|
| Kernel(100 iterations) | Wall-Clock Time (ns) |
|-------------------------+-------------------------|
| fir_naive | 45452725817 |
| fir_shift_register | 507174877 |
|-------------------------+-------------------------|
| Speedup: | 89.619435 |
|-------------------------+-------------------------|
Note: Wall Clock Time is meaningful for real hardware execution only, not for emulation. Please refer to profile summary for kernel execution time for hardware emulation.
TEST PASSED
This project compares two implementations of an FIR filter. The shift-register-based FIR is much faster than the one that is not hardware optimized. That concludes the tutorial.
6 Get started with PYNQ using AWS F1 Instances
Step 1: Install packages for PYNQ.
Step 2: Start Jupyter Notebook.
Step 3: Program the Device.
Step 4: Run Accelerators.
6.1 Get started with PYNQ using AWS F1 Instances
Step 1: Launch an 'f' series instance with 16 vCPUs or more and log in to it using Remote Desktop Connection.
Step 2: Download pynq packages.
• Right click on the desktop, select Open Terminal.
• Download pynq packages. Run:
source ~/src/aws-fpga/vitis_setup.sh
sudo systemctl restart mpd
sudo pip3 install pynq
sudo pip3 install Jupyter
Step 3: Start 'jupyter notebook'. Run:
source ~/src/aws-fpga/vitis_setup.sh
mkdir -p ~/Jupyter
cd ~/Jupyter
jupyter notebook
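If you start Jupyter over a plain ssh session rather than the remote desktop, one common way to reach it from your local browser is an ssh tunnel (a sketch assuming Jupyter's default port 8888; afterwards open http://localhost:8888 locally and paste the token printed by the jupyter notebook command):
# forward local port 8888 to port 8888 on the instance
ssh -i <name_of_your_key.pem> -L 8888:localhost:8888 centos@<ip_address>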
6.2 PYNQ on XRT Platforms
The reference is https://pynq.readthedocs.io/en/v2.6.1/pynq_alveo.html#multiple-cards.
Step 1: Import required packages
# import the top-level pynq module as well, since the code below uses pynq.Device and pynq.Overlay
import pynq
from pynq import Device
from pynq import Overlay
from pynq import allocate
# the numpy and matplotlib shall be installed in advance
import numpy as np
import matplotlib.pyplot as plt
Step 2: Multiple cards
• PYNQ supports multiple accelerator cards in one server. It provides a Device class to designate which card should be used for a given operation. The first step is to query the cards in the system.
for i in range(len(pynq.Device.devices)):
print("{}) {}".format(i, pynq.Device.devices[i].name))
# Note that the PYNQ framework doesn’t at present do any error checking to ensure that buffers have been allocated on the same card that a kernel is on. It is up to you to ensure that only the correct buffers are used with the correct cards.
overlay_1 = pynq.Overlay('fir.awsxclbin', device = pynq.Device.devices[0])
overlay_2 = pynq.Overlay('fir.awsxclbin', device = pynq.Device.devices[1])
• The output should be
0) xilinx_aws-vu9p-f1_shell-v04261818_201920_2
1) xilinx_aws-vu9p-f1_shell-v04261818_201920_2
Step 3: Allocating Memory
• One of the big changes with a PCIe FPGA is how memory is allocated. There are potentially multiple banks of DDR, PLRAM and HBM available, and buffers need to be placed into the appropriate memory. Fabric-accessible memory is still allocated using pynq.allocate, with the target keyword parameter used to select which bank the buffer should be allocated in.
# read the fir_kernels/src/fir_shift_register.cpp first
SIGNAL_SIZE = int(1024)
# memory banks are named based on the device's XRT shell that is in use and can be found through the overlay class and in the shell's documentation
sig_1 = allocate((SIGNAL_SIZE,),dtype = 'int32',target = overlay_1.bank0)
sig_2 = allocate((SIGNAL_SIZE,),dtype = 'int32',target = overlay_2.bank0)
out_1 = allocate((SIGNAL_SIZE,),dtype = 'int32',target = overlay_1.bank0)
out_2 = allocate((SIGNAL_SIZE,),dtype = 'int32',target = overlay_2.bank0)
coe_1 = allocate((11,),dtype = 'int32',target = overlay_1.bank0)
coe_2 = allocate((11,),dtype = 'int32',target = overlay_2.bank0)
coes = [53, 0, -91, 0, 313, 500, 313, 0, -91, 0, 53]
for i in range(11):
coe_1[i] = coes[i]
coe_2[i] = coes[i]
# sources are cos_wave with different frequency, and put them into different FPGA device
omega_1 = 0.15
omega_2 = 0.62
t_index = np.arange(0, SIGNAL_SIZE)
for i in range(SIGNAL_SIZE):
sig_1[i] = int(2048*np.cos(omega_1*np.pi*i))
sig_2[i] = int(2048*np.cos(omega_2*np.pi*i))
out_1[i] = 0
out_2[i] = 0
Step 4: Running Accelerators
• Buffers also need to be explicitly synchronized between the host and accelerator card memories. This is to keep buffers allocated through pynq.allocate generic, and also to enable more advanced uses like overlapping computation and data transfers. The buffer has sync_to_device and sync_from_device functions to manage this transfer of data.
• PYNQ for XRT platforms provides the same access to the registers of the kernels on the card as IP in a ZYNQ design; however, one of the advantages of the XRT environment is that we have more information on the types and argument names of the kernels. For this reason we have added the ability to call kernels directly without needing to explicitly read and write registers.
• For HLS C++ or OpenCL kernels, the signature of the call function will mirror the function in the original source. You can see how that has been interpreted in Python by looking at the .signature property of the kernel. .call will call the kernel synchronously, returning only when the execution has finished. For more complex sequences of kernel calls it may be desirable to start accelerators without waiting for them to complete before continuing. To support this there is also a .start function which takes the same arguments as .call but returns a handle that has a .wait() function that will block until the kernel has finished.
sig_1.sync_to_device()
sig_2.sync_to_device()
coe_1.sync_to_device()
coe_2.sync_to_device()
# complete task 1 and 2 on two boards separately
task_on_1 = overlay_1.fir_shift_register_1.start(out_1,sig_1,coe_1,SIGNAL_SIZE)
task_on_2 = overlay_2.fir_shift_register_1.start(out_2,sig_2,coe_2,SIGNAL_SIZE)
task_on_1.wait()
task_on_2.wait()
out_1.sync_from_device()
out_2.sync_from_device()
Result:
The normalized frequency of sig_1 (sent to FPGA 1) is 0.15; therefore, we should expect about a 60 dB gain in the result. The normalized frequency of sig_2 (sent to FPGA 2) is 0.62; therefore, we should expect a very small output signal. The results are plotted in Fig. 11 and Fig. 12.
Figure 10: The Bode plot of the filter
Figure 11: The output of fpga 1. Notice that the expected magnitude is 1000 (60dB) times 2048.
Figure 12: The output of fpga 2.
7 Working with Spot Instances
Step 1: Create a Spot Instance request.
Step 2: Launch instances.
7.1 The definition of the Spot Instance and its benefits compared to On-demand Instances
The user guide for Linux instances is https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-spot-instances.html. A Spot Instance uses spare EC2 capacity at a lower price. It has two advantages. One is that it can be launched immediately when the Spot Instance request is active and capacity is available, just like an On-Demand instance. The other is that the price of a Spot Instance is very low compared to the On-Demand price; the discount can be up to 90 percent. For example, a c4.xlarge Spot Instance only costs about 0.037 USD per hour, while the same On-Demand instance costs 0.199 USD per hour. However, it carries a risk of interruption when EC2 needs the capacity back. Amazon EC2 provides a Spot Instance interruption notice, which gives the instance a two-minute warning before it is interrupted. Therefore, Spot Instances are more suitable for stateless, fault-tolerant, flexible applications. In short, the key differences from On-Demand Instances are the lower price, the possibility that a request is not fulfilled immediately when capacity is unavailable, and the possibility of interruption.
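To see the current Spot price before requesting, you can query the price history from the command line (a hedged sketch; the instance type and the number of results shown are just examples):
# show recent Linux Spot prices for c4.4xlarge in us-east-1
aws ec2 describe-spot-price-history --region us-east-1 \
    --instance-types c4.4xlarge --product-descriptions "Linux/UNIX" \
    --max-items 5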
7.2 Create a Spot Instance Request
A Spot Instance request needs to include the desired number of instances, the instance type, and the Availability Zone. Figure 13 shows how Spot Instance requests work. A request has two types: one-time and persistent. A one-time Spot Instance request remains active until one of the following happens: the request fails, the instances are launched, or you cancel the request; it must be submitted again next time. A persistent request remains active until the user cancels or terminates it.
Step 1: Login to the AWS management console, click ’Services’ and select ’EC2’.
Step 2: Expand the ’Instances’ and click ’Instances’ under the tab.
Step 3: Click 'Launch instances' on the top right, which should be highlighted in orange.
Step 4: Enter the name and click Add additional tags. In the Resource types option, click the drop-down and choose Instances and Spot Instance requests.
Step 5: Choose an Amazon Machine Image (AMI). Search 'FPGA' in the search bar, then click 'AWS Marketplace'. If you have successfully subscribed to the FPGA Developer AMI (section 1.3), you shall see the FPGA Developer AMI on the right. Select it.
Step 6: Instance type. Choose the Instance type that you want.
Step 7: Key pair. Choose the .pem file that you created before.
Step 8: Network settings. Choose Create security group and check Allow SSH traffic from, which lets you connect to the instance; or choose Select existing security group and pick SSH-RDP-SMB.
Step 9: Configure storage. Typically, you don't need to change anything here.
Step 10: Advanced details. First, in the Purchasing option, choose Request Spot Instances and click Customize. Second, choose Set your maximum price and enter a number larger than the minimum required Spot request fulfillment price of 13.2. Third, in the Request type, choose one-time or persistent. Fourth, if you choose persistent, it is better to set an expiry date. Fifth, in the Interruption behavior, for persistent requests the valid values are stop and hibernate, and for one-time requests only terminate is valid. Sixth, in Instance auto-recovery, choose Default. Seventh, in Shutdown behavior, choose Stop. Eighth, in Termination protection, choose Disable. Ninth, in Tenancy, choose 'Dedicated - run a dedicated instance', which means running on single-tenant, dedicated hardware. The other settings don't need to change.
Step 11: Launch instance.
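The same request can also be made from the command line (a hedged sketch only; the AMI id, key-pair name and security group id are placeholders, and the launch-specification fields shown are the minimum I would expect to need, so check the request-spot-instances documentation before relying on it):
# request one one-time Spot instance; spec.json holds the launch specification
cat > spec.json <<'EOF'
{
  "ImageId": "<fpga-developer-ami-id>",
  "InstanceType": "f1.2xlarge",
  "KeyName": "<your-key-pair-name>",
  "SecurityGroupIds": ["<your-security-group-id>"]
}
EOF
aws ec2 request-spot-instances --instance-count 1 --type "one-time" \
    --launch-specification file://spec.json --region us-east-1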
Figure 13: Process of Spot Instance requesting work
Figure 14: Name and Tags
Figure 15: Network setting
Figure 16: Advanced details
Figure 17: Result