Somax Hardware Thoughts and Ideas

CAMD Motion Controller resolution

August 29, 2018

I just realized something about Somax Proto III/IV.

Proto III has 3 axes, and each axis has a resolution of 800 steps per revolution (.45 degrees per step). The Pan axis has 360 degrees of freedom, the Tilt axis has 180 degrees of freedom, and the Rotate axis has 360 degrees of freedom.

This means the R200/D400 camera gimbal has 800 x 400 x 800 = 256 million individually addressable positions in orientation!

I'm looking at a smaller, much cheaper motor that has about 1300 steps per revolution. This would give 1.2675 billion individually addressable positions for the camera. Put in a slightly neater way, the gimbal would have 30-bit motor resolution.

Proto IV would bump this number to 9.5 billion addresses or 33 bit resolution!
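
The arithmetic above can be checked with a short sketch. It assumes the position count is simply the product of each axis's reachable steps, and that "bits of resolution" means log2 of that count; the function names are made up for illustration.

```python
import math

def axis_positions(steps_per_rev, degrees_of_freedom):
    """Step positions an axis can reach within its range of travel."""
    return round(steps_per_rev * degrees_of_freedom / 360.0)

def gimbal_positions(steps_per_rev, ranges_deg=(360, 180, 360)):
    """Product of per-axis positions for pan, tilt and rotate."""
    total = 1
    for range_deg in ranges_deg:
        total *= axis_positions(steps_per_rev, range_deg)
    return total

def bits_of_resolution(positions):
    """log2 of the number of addressable positions."""
    return math.log2(positions)

proto3 = gimbal_positions(800)
print(proto3)                                # 256000000
print(round(bits_of_resolution(proto3), 2))  # 27.93

# The position counts quoted above, expressed in bits.
for count in (1.2675e9, 9.5e9):
    print(count, round(bits_of_resolution(count), 2))
```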

Tuning Dual AHRS Co-processors

August 29, 2018

Somax has dual Bosch BNO055 AHRS co-processors. One is attached to the frame. The other is on the back side of the R200/D400 camera gimbal. The camera gimbal is mechanized with 3 (Proto III) or 4 (Proto IV) stepper motors, each a 200-step motor geared 4:1, giving 800 fixed and known positions per axis. When the motors are not moving and the frame is stationary, the position of each Bosch sensor is known with at least 30 bits of resolution.

I wonder how this can be used to tune or improve the accuracy of the sensors when the frame is not stationary and/or the motors are moving.
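
As a simple, non-AI starting point for the stationary case, the known step position can be treated as ground truth and compared against what the sensor reports. This is only a sketch under assumptions: the sensor reading is taken to be a single-axis angle in degrees, the samples are invented, and none of the names below come from the BNO055 driver.

```python
STEPS_PER_REV = 800  # 200-step motor with 4:1 gearing

def step_to_degrees(step):
    """Angle of a known step position, in degrees from the zero step."""
    return (step % STEPS_PER_REV) * 360.0 / STEPS_PER_REV

def wrap_degrees(angle):
    """Wrap an angle difference into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def mean_offset(samples):
    """Average sensor error over (step, measured_degrees) pairs.

    Each pair is assumed to be captured while the frame and motors
    are stationary, so the step position is the reference angle.
    """
    errors = [wrap_degrees(measured - step_to_degrees(step))
              for step, measured in samples]
    return sum(errors) / len(errors)

# Hypothetical readings: the sensor reads about 1 degree high everywhere.
samples = [(0, 1.0), (200, 91.0), (400, 181.0), (799, 0.55)]
print(round(mean_offset(samples), 3))  # 1.0
```

An offset measured this way only describes the stationary case; whether it carries over to a moving frame is exactly the open question.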

Actually I think I know how to use AI to obtain the answer at 400 GFLOPS!

And 2 is greater than 10. More GFLOPS

August 29, 2018

I was thinking about 400 GFLOPS. That seems like a lot but honestly it is and it isn't. It's a lot for one network but we want to run many networks. If we had enough power it would really be helpful to be able to train small reinforcement networks.

Before anyone starts thinking Support Vector Machines, refer to the home page! That's yesterday's stuff. If it doesn't support neural networks, it is not worth the time, and I say this after three years of investigation.

If I want to train on Somax I need more FLOPS. A lot more FLOPS!

Well, how does 4000 GFLOPS sound? How does it sound for the same amount of power? That's right, the same 1 watt delivers 10 times the GFLOPS on the Myriad X. Now we can talk in numbers that can train neural networks, and those numbers are TFLOPS. When expressed in TFLOPS the equation shows a nice relationship, 4 TFLOPS = 4 Watts. And this number scales in the exact same ratio for as far as you want to go. However, there are diminishing efficiencies when training is spread across many devices, which can be somewhat offset with a high-speed bus like PCIe (PCI Express). It's supported by the Myriad X.

For Somax, once again it's Edison to the rescue! Here is a GitHub repository for a design by LGS Innovations that adds PCIe to the Edison. The BOM is only six parts, and most are power related, so it shouldn't be that expensive either.

How did it get faster without using more power? Simple when you think about it. Die shrink and the wonder of Moore's Law. A 28 nanometer Myriad 2 was die shrunk to a tight and trim 16 nanometer Myriad X. That means that, as the crow flies, an electron now has to travel about 43% less distance from gate to gate (16 is about 57% of 28)! And if you have the die, why not just go ahead and add a few more SHAVE processors to fill the newly vacated space. How about 33% more! The new total, 16, is up from the Myriad 2's 12. A lot has been said about the slowing rate of Moore's Law; however, it seems that between things that need to go fast and things that need to go 'fast enough', slowing is occurring for the former but in fact the opposite is occurring for the latter. This is good news for Somax.
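
The two percentages are easy to check. A minimal sketch, treating the process node as a simple linear distance (a simplification) and using only the figures quoted above: going from 28 to 16 nanometers leaves about 57% of the original distance, which is about 43% less, and going from 12 to 16 SHAVE processors is about 33% more.

```python
def percent_change(old, new):
    """Signed percentage change from old to new."""
    return (new - old) / old * 100.0

# Feature size: 28 nm Myriad 2 to 16 nm Myriad X.
print(round(percent_change(28, 16), 1))  # -42.9

# SHAVE processor count: 12 on the Myriad 2, 16 on the Myriad X.
print(round(percent_change(12, 16), 1))  # 33.3
```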

Intel, don't fret. What you may realize is that you are not going to sell many more thousand-dollar processors that go 2000 Megahertz. What you may not realize is that you are going to sell 2000 1000-Megahertz processors for every one of them instead! You are the leader in inference. If you hang on a bit longer and build your IOT line in the Edison class of hardware, you'll be able to connect your IOT and Edge Computing businesses. You already have the Edge, now let the Edge give you IOT. Projects like mechanizedAI are your incubators, and we need what you build because Arduino or Raspberry Pi is simply not going to cut it for grass-roots useful AI!

"It's called the Central Processing Unit, not the Everything Processing Unit. An IOT SOC needs to remember this for inference integration. It needs to go 'fast enough' and have as many integrated controllers as possible." - mechanizedAI

Just as the magnetic field is theorized to change direction one day, some day much sooner we will see Moore's Law scale in a different direction.

Electrical Sampling Internal vs External.

August 29, 2018

I was thinking it would be good to have a high-precision and a high-speed analog-to-digital converter, each with at least two channels.

After some thought, it might actually make more sense, with respect to the objectives of the Somax platform, to focus on monitoring internal electrical systems instead of external ones.

Instead of building oscilloscope functionality, which would help when adding a new device to Somax, add a device that can monitor existing devices in more detail. This would mean adding bus analyzers instead of analog probes. It would mean creating a specification for what needs to be monitored. Some starting points would be voltage, current, and temperature. For any sensor added, we would expect these data points to be collectible.
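
A first pass at such a specification could be as small as one record per device. The sketch below is only an illustration: the class name, field names, units and the example device are all assumptions, not part of any existing Somax code.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DeviceReading:
    """One monitoring sample for an internal device (hypothetical spec)."""
    device: str
    voltage_v: Optional[float] = None
    current_a: Optional[float] = None
    temperature_c: Optional[float] = None

    def missing(self):
        """Names of the required data points that were not collected."""
        required = ("voltage_v", "current_a", "temperature_c")
        return [name for name in required if getattr(self, name) is None]

    def power_w(self):
        """Electrical power in watts, if voltage and current are known."""
        if self.voltage_v is None or self.current_a is None:
            return None
        return self.voltage_v * self.current_a

# Example with made-up values: a sensor that does not report temperature.
reading = DeviceReading("gimbal_ahrs", voltage_v=3.3, current_a=0.012)
print(reading.missing())  # ['temperature_c']
print(reading.power_w())
```

Checking `missing()` when a sensor is added is one way to enforce the expectation that all three data points are collectible.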

Arm Compute Engine

September 5, 2018

I have been on the prowl for compute engines that are low power and capable of copious amounts of inference. So far I have spec'ed the Intel Myriad 2 and X. The 2 is available in the form of the Neural Compute Stick or the Up AI board with the same part. Up also has a version in the pipe that has two Myriad 2 processors, in effect doubling the GFLOPS. The Myriad X is not easy to get your hands on, and I've been looking. Other options, like the Laci compute engine, have yet to materialize as hardware.

Recently I said that I wasn't able to find a Raspberry Pi solution suitable for Somax needs. Well, that has changed. Kinda. I looked deeper and found that the Raspberry Pi still does not have the needed capacity, but (and whoa daddy, this is a really good but) when you broaden your horizons to anything with an ARM processor, you get a few really good options! Qualcomm has the Snapdragon line, Huawei has the Kirin line, Xilinx has the 96 line and Rockchip has the Rock960 Pro line. Specs looked good on all of them. I mean good as in "stop looking, you've found what you need!"

Some of these engines are topping out at 2.4 TOPS in 8-bit precision, with 16-bit scaling linearly to 1.2 TOPS. For perspective, the NVIDIA Pascal line of GPGPUs handles 8-bit precision (all of this is integer, mind you, so it's still apples to apples) at 47 TOPS. A portable version of Pascal is found in the Jetson TX2 platform. The TX2 uses on average 7.5 watts of power, but when it's really cranking on inference and getting that 47 TOPS, it will use 13 watts. The ARM engines deliver 2.4 TOPS for the exact same inference, the same in all respects other than power and throughput, at about 1 watt (I'm estimating 1.2 Watts until I can prove different), or .5 Watts per TOPS. Using the same math with respect to the TX2 finds the Pascal engine at about .28 Watts per TOPS. Supposing we only need 2.4 TOPS, and if power scales linearly, that would be about .66 Watts on the TX2 for the same work that costs 1.2 Watts, or about double, on ARM. On the basis of computation per Watt needed (not the maximum), the NVIDIA chip looks better.

The downside of the NVIDIA platform is the price. The processing module is about $300 and it needs a carrier for another $300. The ARM solution can be had for $99 with 2 gigs of DDR3-1866 and 16 gigs of eMMC, or $139 for a 4 / 32 gig combo of the same, with all of the above (module included) integrated into the carrier. No math to do here: if you only need 2.4 TOPS, then it will cost $99 for a bare-bones board that may do in most cases, or $139 for one that will probably do for the standard deviation of well-crafted algorithms. The TX2 comes at one price, which means $600 for hardware that can sustain the required 2.4 TOPS. While I expect both claims to naturally be best-case, I expect it from both, so that is a wash in my opinion.

Size is another difference that favors the ARM solutions. The TX2 with carrier is 170x171x15 millimeters (module 50x87x??) while the ARM board is 85x54 millimeters for everything. One fits in your pocket, the other is better left in the car!
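
The power and price comparison above can be reproduced in a few lines. This sketch uses only the figures quoted in this entry (1.2 Watts at 2.4 TOPS and $99 for the ARM board, 13 Watts at 47 TOPS and $600 for the TX2 with carrier) and assumes that power scales linearly with load, which is unproven; the variable names are made up for illustration.

```python
def watts_per_tops(watts, tops):
    """Power cost of one TOPS at the quoted operating point."""
    return watts / tops

REQUIRED_TOPS = 2.4

arm = watts_per_tops(1.2, 2.4)    # estimated ARM figure
tx2 = watts_per_tops(13.0, 47.0)  # TX2 at full inference load

print(round(arm, 3))  # 0.5
print(round(tx2, 3))  # 0.277

# Power the TX2 would need for the same 2.4 TOPS under linear scaling.
tx2_at_required = tx2 * REQUIRED_TOPS
print(round(tx2_at_required, 2))        # 0.66
print(round(1.2 / tx2_at_required, 2))  # 1.81, the "about double" on ARM

# Hardware dollars per required TOPS.
print(round(99 / REQUIRED_TOPS, 2))   # 41.25
print(round(600 / REQUIRED_TOPS, 2))  # 250.0
```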

I ordered a Rock960 from Seeed Studio, which should have the RK3399Pro SOC and the 2.4 TOPS per Watt capability. It will be here this week and I am anxious to qualify this board with some actual wild inference!
