@splinedrive This is what I used to use with my NexysVideo:
https://t.co/8zBhZJrXSe
The same repository has generated controllers from LiteX for: Genesys2, NexysA7, OrangeCrab.
@splinedrive Nice! A bare-metal RTOS is much better suited for embedded applications on FPGAs.
Take a look at the RTOS I am developing:
https://t.co/HWRz7rBECx
@GregDavill Is OrangeCrab still being manufactured? It has become unavailable for purchase.
Are you working on an upgraded version? If so, my feature request would be a way to turn off or control the really bright LED that shows it is powered on.
This release now includes the data-cache coherency feature, which greatly benefits multi-core performance, with CoreMark/MHz scaling almost linearly from 1 to 16 cores.
Check out the underLineOS API reference:
https://t.co/HWRz7rBECx
Software-Hardware Co-design example on OrangeCrab, where a simple hardware peripheral to drive an LED and sample a push-button state is connected using Wishbone4 and driven by software.
_underLineOS multi-threading is used to generate a PWM signal and smoothly blink an LED.
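For readers unfamiliar with software PWM, here is a rough Python sketch of the idea, not the code running on the OrangeCrab. It assumes each PWM period is split into equal time slots and that the duty cycle ramps up and then down to give the "breathing" effect; `pwm_period` and `breathe` are hypothetical names. On real hardware a thread would sleep between LED toggles, whereas here the time slots are just list entries.

```python
def pwm_period(duty, steps):
    """Return one PWM period as a list of LED states (1 = on, 0 = off).

    The LED is on for the first `duty` of `steps` time slots.
    """
    return [1 if slot < duty else 0 for slot in range(steps)]


def breathe(steps):
    """Ramp the duty cycle 0 -> steps -> 0 so the LED brightens, then dims."""
    ramp = list(range(steps + 1)) + list(range(steps - 1, -1, -1))
    return [pwm_period(duty, steps) for duty in ramp]


frames = breathe(4)
# Number of "on" slots per period, a stand-in for perceived brightness.
brightness = [sum(frame) for frame in frames]
```

A finer `steps` value gives a smoother ramp at the cost of more thread wake-ups per period.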
This release still lacks the data-cache coherency feature, which now passes all tests but needs a few loose ends tied up before it can be published.
This release is meant for publishing all the fixes and improvements accumulated while working on the data-cache coherency feature.
The implications are that atomic variables must always be manipulated by atomic instructions which do not create/update cache-entries, in order to maintain coherency.
Fighting cache coherency bugs involving atomic operations.
Atomic operations work using a lock signal that gives a core exclusive bus access until the lock is released. For performance, the lock is held for as short a period as possible.
Holding that lock while waiting for coherency traffic would hurt performance; hence atomic operations do not use coherency traffic, nor do they create or update cache entries.
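A toy Python model of why atomic variables should only be touched by atomic instructions. It is a sketch under my own simplifying assumptions, not the actual hardware: the `Core` class and its methods are hypothetical, and the bus lock is not modelled explicitly (each `atomic_add` call runs to completion, which stands in for exclusive bus access).

```python
class Core:
    """Toy core with a private data cache in front of shared memory."""

    def __init__(self, memory):
        self.memory = memory   # shared backing store (dict: address -> value)
        self.cache = {}        # private data cache (dict: address -> value)

    def cached_load(self, addr):
        """Ordinary load: fills the cache on a miss, then reads the cache."""
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def atomic_add(self, addr, delta):
        """Read-modify-write straight on memory; the cache is not touched."""
        old = self.memory.get(addr, 0)
        self.memory[addr] = old + delta
        return old


memory = {0x10: 0}
core0, core1 = Core(memory), Core(memory)

first_view = core0.cached_load(0x10)   # core0 now caches the value 0
core1.atomic_add(0x10, 1)              # bypasses every cache
core0.atomic_add(0x10, 1)              # bypasses core0's own cache too
stale_view = core0.cached_load(0x10)   # still the cached 0, memory holds 2
```

In this model the ordinary load keeps returning the old cached value while memory has moved on, which is the situation avoided by never mixing plain accesses and atomic instructions on the same variable.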
Fun fact: _OS (underLineOS) is tickless. Timer interrupts are scheduled dynamically: the next timer interrupt is programmed for the nearest upcoming event (thread wakeup, deadline, etc.). This reduces CPU wake-ups and improves power efficiency.
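A minimal Python sketch of the tickless idea, not _OS's actual code: pending events sit in a min-heap and the one-shot timer is always programmed for the earliest one. The `TicklessTimer` class and its method names are hypothetical, and time is an abstract integer.

```python
import heapq


class TicklessTimer:
    """Toy model of a tickless scheduler timer (hypothetical, not the _OS API)."""

    def __init__(self):
        self._events = []      # min-heap of (wakeup_time, label)
        self.interrupts = 0    # number of timer interrupts taken so far

    def schedule(self, wakeup_time, label):
        heapq.heappush(self._events, (wakeup_time, label))

    def next_deadline(self):
        """Time the one-shot timer would be programmed for, or None if idle."""
        return self._events[0][0] if self._events else None

    def fire(self):
        """Simulate the timer interrupt: pop every event due at the programmed time."""
        now = self.next_deadline()
        if now is None:
            return now, []
        self.interrupts += 1
        due = []
        while self._events and self._events[0][0] <= now:
            due.append(heapq.heappop(self._events)[1])
        return now, due


timer = TicklessTimer()
timer.schedule(50, "thread A wakeup")
timer.schedule(10, "thread B wakeup")
timer.schedule(50, "deadline C")

fired = []
while timer.next_deadline() is not None:
    fired.append(timer.fire())
```

In this run only two interrupts are taken (at times 10 and 50), whereas a fixed tick of one time unit would have woken the CPU 50 times to reach the same point.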
My implementation is an update-based cache coherence protocol resembling the Dragon Protocol.
The data caches of the cores are daisy-chained in a ring using valid-ready handshake connections, through which the coherency traffic circulates.
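A simplified Python model of update-based coherency over a ring, to make the topology concrete. It is only a sketch under my own assumptions: the valid-ready handshakes, Dragon-style line states, and write-back to memory are all left out, and the `Cache` class and `make_ring` helper are hypothetical names.

```python
class Cache:
    """Toy data cache that forwards write updates to its ring neighbour."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.lines = {}        # address -> value for lines held by this cache
        self.next = None       # downstream neighbour on the ring

    def write(self, addr, value):
        """Write locally, then pass an update message around the ring."""
        self.lines[addr] = value
        hops = 0
        node = self.next
        while node is not self:           # message stops once back at the writer
            hops += 1
            if addr in node.lines:        # update only caches holding the line
                node.lines[addr] = value
            node = node.next
        return hops


def make_ring(n):
    """Build n caches and daisy-chain them in a circle."""
    caches = [Cache(i) for i in range(n)]
    for i, cache in enumerate(caches):
        cache.next = caches[(i + 1) % n]
    return caches


ring = make_ring(4)
ring[1].lines[0x40] = 0        # cores 1 and 3 already hold the line
ring[3].lines[0x40] = 0
hops = ring[0].write(0x40, 7)  # core 0 writes; the update circulates
```

Copies held by other cores are updated in place rather than invalidated, which is what distinguishes an update-based protocol from an invalidate-based one; a cache that does not hold the line (core 2 here) simply forwards the message.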
You need cache coherency if you have more than one CPU, you want each CPU to use a data cache, and you do not want to do manual data-cache maintenance (i.e., invalidate, writeback) to keep data shared by more than one CPU coherent.