My main research interest is the analysis of power delivery networks in multicore digital circuits: their inefficiencies, the challenges they pose, and ways of mitigating noise, voltage drop, and heat dissipation. The shrinking of transistor sizes and the resulting increase in device count per unit area are causing an exponential increase in chip power density, so much so that in the most advanced chips the power per unit area is comparable to that of a nuclear reactor. Combined with the reduced thickness of the metal interconnects used for power delivery, this means that more power is wasted as heat in these interconnects and that the supply voltage seen by the cells drops, which creates timing violations.
For this reason, I am researching a technique called Voltage Stacking, which reduces power delivery network (PDN) losses by reusing charge between circuits with similar activity. However, its efficiency suffers from current mismatch between the stacked blocks, which causes voltage noise and reduces circuit robustness. The imbalance arises because the supply current of CMOS logic cells depends on the data being processed.
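To make this trade-off concrete, here is a minimal sketch rather than a model of any specific chip. It assumes a single lumped PDN resistance `r_pdn`, the same nominal voltage across every block, and ideal (lossless) regulators at the intermediate nodes that absorb any mismatch. The function names and example values are hypothetical.

```python
def flat_pdn_loss(block_currents, r_pdn):
    """I^2 * R loss when all blocks are fed in parallel from one rail at Vdd."""
    total_current = sum(block_currents)
    return total_current ** 2 * r_pdn


def stacked_pdn_loss(block_currents, r_pdn):
    """I^2 * R loss when the blocks are stacked in series across N * Vdd.

    Assumption: ideal intermediate-node regulators, so by power conservation
    the current drawn from the supply equals the mean block current.
    """
    supply_current = sum(block_currents) / len(block_currents)
    return supply_current ** 2 * r_pdn


def mismatch_currents(block_currents):
    """Current at each intermediate node, listed from the top of the stack.

    A positive value means the upper block delivers more current than the
    lower block consumes, so the regulator must sink the excess; a negative
    value means it must source the difference.
    """
    return [upper - lower
            for upper, lower in zip(block_currents, block_currents[1:])]


if __name__ == "__main__":
    currents = [1.2, 0.8]  # hypothetical block currents in amperes, top to bottom
    r_pdn = 0.1            # hypothetical lumped PDN resistance in ohms
    print("flat loss (W):", flat_pdn_loss(currents, r_pdn))
    print("stacked loss (W):", stacked_pdn_loss(currents, r_pdn))
    print("mid-node mismatch (A):", mismatch_currents(currents))
```

Under these assumptions, stacking two blocks cuts the resistive PDN loss to a quarter of the flat case, while any difference between the block currents appears as current that the mid-rail regulator has to handle. In a real design that regulator is neither ideal nor free, which is where the voltage noise and efficiency loss come from.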
Cell-level balancing is a promising way to address the activity mismatch and the resulting supply voltage noise in Voltage Stacking. By implementing the datapath and register file with cells that have very low supply current variability, it is possible to balance the charge consumed by the stacked circuits.
Recent advances in 3D integrated circuit (3D IC) technology have brought hope of keeping Moore's law alive by fitting even more devices into a package: multiple silicon dies are stacked on top of each other, with through-silicon vias (TSVs) providing the interconnections for signals and power. Unfortunately, these interconnections are difficult to manufacture, require a large area, and have parasitics that limit their efficient use, particularly for power transfer.
3D ICs seem like a good candidate for applying intra-die level Voltage Stacking, especially for multi-die memories and accelerators. Vertical integration naturally matches stacked voltage domains, and if applied correctly, it can help solve one of the main challenges in 3D ICs: power delivery through TSVs. In my work, I am trying to find an efficient way to partition the design so as to achieve the lowest voltage noise related to activity mismatch, and to characterize the benefits of stacked domains.
Vector processors first appeared in the 1970s and were originally meant to accelerate operations on large datasets in supercomputers, but they never reached the mainstream outside of this domain. With the emergence of powerful vector instruction sets, such as x86 AVX, ARM SVE, and the open-source RVV, vector processing can now be integrated into ultra-low-power microprocessors, offering a more flexible computing platform than a basic SIMD array thanks to the variable vector length.
My research interest is on-edge neural network acceleration with these versatile processors. By applying custom arithmetic systems, such as bfloat16 or posits, I seek to reduce the power and area cost of implementing a compact inference accelerator based on a custom RISC-V Vector extension.
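As a small illustration of why reduced-precision formats are attractive, the sketch below converts an IEEE 754 single-precision value to bfloat16, which keeps the sign bit and 8-bit exponent of float32 but only 7 mantissa bits. It is a plain software model for finite inputs only (NaN handling is omitted), and the function names are hypothetical rather than taken from any RISC-V specification.

```python
import struct


def float_to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x, rounding to nearest even.

    Assumption: x is finite and representable in float32 (NaN is not handled).
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest, ties to even: add 0x7FFF plus the LSB of the kept half.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    # bfloat16 is the upper half of a float32, so pad the low 16 bits with zeros.
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


if __name__ == "__main__":
    value = 3.14159
    encoded = float_to_bfloat16_bits(value)
    print(hex(encoded), bfloat16_bits_to_float(encoded))
```

With only 7 mantissa bits, 3.14159 is stored as 3.140625. The storage and multiplier width shrink accordingly, which is the source of the power and area savings, at the cost of precision that the network has to tolerate.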
According to The HALO Trust, up to 2 million landmines have been laid in Ukraine since the start of the full-scale war in 2022, heavily affecting many regions of the country, including Kharkiv and Chernihiv, where my family comes from. Unfortunately, this issue is very much global. In Cambodia, 18,800 lives have been lost to landmines since 1979. It is estimated that clearing Ukraine with currently available methods will take more than 700 years.
Since then, I have been looking for a way to contribute to the demining process by working on a drone-based demining system that uses on-device computer vision algorithms for explosive detection. The metal casing and the explosive material of a landmine placed in the ground have a different emissivity than their surroundings, making them visible in the infrared spectrum. For this reason, a multispectral imaging camera is a key component of the detection system. Such a multispectral image can be fed as input to an object detection network, which then points to the locations of suspected explosives.
Previously published works focused only on acquiring the image from the drone and left the processing to be performed on a GPU, long after the drone had landed. For real-time detection, this poses an unacceptable operational delay. Additionally, there is no guarantee that the drone will come back to the operator or be able to transmit a complete video feed over a wireless channel, especially where electronic warfare equipment and jamming may be present. Therefore, it is desirable to run the detection algorithm on the drone and send only single images with marked explosive locations. Acceleration on the edge is the solution that I am looking into: by adding a small accelerator that can run an object detection network, I hope to create an autonomous system that provides the operators with only the necessary data and keeps them safe from potential explosions.
Coming soon...