Gems and Cobblestones of the FPGA Oberon System

Synopsis. In this section I will try to catch the gems which occasionally flash by on the mailing list. The cobblestones, on the other hand, are useful pieces which do not look flashy, but can be handy in various situations.

Motivation. Mailing list messages come and go. Sometimes there is a gem which is worth keeping, but it quickly gets forgotten. I will try to collect them here, selected according to my subjective judgement.

Batch processing of Oberon commands

Oberon List [] on behalf of Andreas Pirklbauer [andreas_pirklbauer _at_]
Sent: Thursday, June 11, 2020 2:52 PM
To: Oberon List [oberon _at_]

Extended Oberon now contains a simple batch processing facility, which allows users to activate multiple commands as follows:

    ORP.Compile A.Mod B.Mod C.Mod ~
    System.Watch ~
    MyModule.MyCommand A B C ~
which executes the commands in sequence until a command sets a nonzero return code or a line does not have the syntax of a valid Oberon command.

Any command can, but does not have to, set a return code by calling the procedure Oberon.Return(res), for example Oberon.Return(3), within the command. This essentially sets a global variable Oberon.Par.res, which can be checked after a command has been executed. Oberon.Par.res is reset to zero before command activation.
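The stop-on-first-failure semantics described above can be sketched as follows (in Python, purely for illustration; the command names and the batch driver are hypothetical, but the return-code behaviour mirrors Oberon.Par.res as described in the post):

```python
# Sketch of the Extended Oberon batch semantics: Oberon.Par.res is reset
# to 0 before each command is activated; a command may set it via
# Oberon.Return(res); the batch aborts at the first nonzero code.

def run_batch(commands):
    """Run (name, command) pairs in order; stop at the first nonzero code."""
    for name, cmd in commands:
        res = cmd()
        res = 0 if res is None else res  # a command that never "returns" leaves res at 0
        if res != 0:
            return name, res             # abort the batch, report the failing command
    return None, 0                       # whole batch succeeded

# Example: the second command fails with code 3 (cf. Oberon.Return(3)),
# so the third command is never activated.
log = []
batch = [
    ("ORP.Compile",         lambda: log.append("compile") or 0),
    ("System.Watch",        lambda: log.append("watch") or 3),
    ("MyModule.MyCommand",  lambda: log.append("mycommand") or 0),
]
failed, code = run_batch(batch)
```

After running, `failed` names the aborting command and `log` shows that the command after it was never executed.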

KernelDebugger - Debug and inspect one RISC5 machine from another one via serial link

Compared with modern development environments, debugging RISC5 Oberon programs is limited. Due to memory constraints and the absence of multitasking, running the debugger on the same machine as the debuggee is tricky. Michael Schierl therefore took a different approach, familiar from low-level driver development: the debuggee runs on one RISC5 machine and the debugger on another, connected via a serial link. The author calls it a Kernel Debugger.
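The core idea of such a two-machine setup is a small request/reply protocol over the serial line. The framing below is a hypothetical illustration, NOT Michael Schierl's actual KernelDebugger wire format (which is not documented here); it only shows the general shape of a "read a word of the debuggee's memory" request:

```python
import struct

# Hypothetical framing: the debugger sends a 1-byte opcode followed by a
# 32-bit address; the debuggee would reply with the 32-bit word stored
# there. Little-endian packing is chosen here purely for illustration.

OP_READ_WORD = 0x01

def encode_read_request(addr: int) -> bytes:
    """Pack a 'read word at addr' request: opcode + 32-bit address."""
    return struct.pack("<BI", OP_READ_WORD, addr)

def decode_read_request(frame: bytes):
    """Unpack a request frame back into (opcode, address)."""
    return struct.unpack("<BI", frame)

frame = encode_read_request(0x0000E7F0)
op, addr = decode_read_request(frame)
```

On real hardware the frames would travel over the RS-232 link that both RISC5 boards already expose; the debugger side merely encodes requests and decodes replies.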

Project Oberon running from LPDDR memory on Pipistrello, now with 8 bpp and 16 bpp color display modes

There is a lot of valuable information in this post, which can be reused in other designs. For reference, keep in mind that the entire BRAM in the LX9 is 64 kB, which is barely sufficient for the cache alone. Using DRAM at all raises the class of FPGA required, unless you want to run without a cache and accept a tremendous performance hit. The following bullets are my own take-home points.
  • The mono display buffer in BRAM, 160 kB, big enough for a 1440x900 display.
    Note: A framebuffer in BRAM requires a rather large FPGA. It also hogs one of the most valuable resources.
  • The cache also in BRAM, either 64 or 128 kB.
    Note: It is hard to imagine the cache not being in BRAM. Note the size. The entire BRAM in LX9 is 64 kB.
  • Both the B/W and color framebuffers in the same design, like the original Ceres Oberon described in the vintage Project Oberon.
  • Coordination of the cache and the video buffer relocated to DRAM seems non-trivial.
    Note: The difficulty is that writing to the framebuffer proceeds via the cache.
  • Consider adding a fast non-DRAM memory for just the framebuffer.
    Design 1: Memory mapped framebuffer, using "cycle stealing" by NW and PR.
    Design 2: Memory wrapped inside the video controller, like in the latest books by P.P.Chu.
  • I am tempted to consider adding HyperRAM for the framebuffer and following the Design 2.
    Note: In this scenario Display.Mod may need substantial internal changes.
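The buffer sizes quoted above and in the post below are easy to verify. A small check (the 1024x768 color resolution is my assumption; the post does not state it, but it is the resolution consistent with the 0.75 MB and 1.5 MB figures):

```python
# Verify the framebuffer sizes quoted in this section.

def framebuffer_bytes(width, height, bpp):
    """Framebuffer size in bytes: pixels times bits per pixel, divided by 8."""
    return width * height * bpp // 8

mono    = framebuffer_bytes(1440, 900, 1)   # 1-bit mono, 1440x900
color8  = framebuffer_bytes(1024, 768, 8)   # 8 bpp (assumed 1024x768)
color16 = framebuffer_bytes(1024, 768, 16)  # 16 bpp (assumed 1024x768)

# mono is 162000 bytes, which just fits in the 160 kB BRAM buffer;
# color8 is 0.75 MB and color16 is 1.5 MB, matching the post.
```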

Oberon on behalf of Magnus Karlsson [magnus _at_]
Sent: Wednesday, May 27, 2020 7:05 PM

8 bpp needs 0.75 MB memory, 16 bpp needs 1.5 MB memory for display buffer

On 5/27/2020 3:50 PM, Magnus Karlsson wrote:
> I have made updates to the code on github with the following changes:
> * The code now has its own repository with a master branch and two versions with color display added
> * The mono display buffer in BRAM is now 160 KB, big enough for a 1440x900 display
> * The CPU clock is now coming from the memory controller PLL, leaving the video PLL free to implement any frequency needed by the video controller
> * The cache is now 64KB (2-way, 128-set, 256 bytes/cache line) to make room for the larger mono display buffer
> * The color version has 8 bpp or 16 bpp color video added in parallel with the mono display. 8 bpp needs 0.75 MB memory, 16 bpp needs 1.5 MB memory for display buffer
> * The color buffer can be at any place in the 15.75 MB lpddr memory space (aligned to 16 bytes) controlled by a color buffer address register
> * When using color display the cache needs to be flushed. There are several ways to flush the cache - on demand by software, on every vsync and using auto-flush where a cache line is flushed every 1024 CPU clocks.
> * The selection between mono display and color display is via a register bit.
> Jörg Straube has successfully shown the code working in both 1440x900 mono display mode and 8 bpp color display mode.
>
> Cheers,
> Magnus
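The cache geometries mentioned in these two posts are internally consistent; a quick check of ways × sets × line size:

```python
# Verify the two cache configurations quoted in the posts.

def cache_bytes(ways, sets, line_bytes):
    """Total data capacity of a set-associative cache."""
    return ways * sets * line_bytes

small = cache_bytes(2, 128, 256)  # the 64 kB configuration (2-way, 128-set, 256 B lines)
large = cache_bytes(2, 256, 256)  # the earlier 128 kB configuration (2-way, 256-set)
```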

> On 5/15/2020 7:53 AM, Magnus Karlsson wrote:
>> The project has been updated at
>> This version is based on the current RISC5 verilog code (dated 11.12.2018).
>> System info:
>> Cache size: 128 KB (2-way 256-set).
>> Cached area: lower 15.75 MB of the RISC5 memory map is mapped to cache/lpddr memory.
>> The non-cached area (top 256 KB of the RISC5 memory map) is used for I/O (top 4KB), boot prom (next 4KB), then a free 248 KB area where the video display buffer can be relocated.
>> Video buffer is in BRAM and is located at the default place at power up but can be relocated anywhere in the memory map by simply writing to a new 24-bit video base register.
>> Since the video frame buffer is in non-cached BRAM there is no need to flush the cache, which btw uses write-back policy.
>> The system will be slowed down by cache misses but there is about 11% gain by having the video buffer in BRAM (no cpu stalls due to video controller memory access).
>> Magnus

Updated June/15/2020.
© 2020 by