B2 system home page
Extensible framework for Nuclear Physics data analysis
What is B2?
B2 is an object-oriented extensible data analysis system implemented under
the Oberon System V4 environment. B2 was designed for efficient analysis
of data from multidetector Nuclear Physics experiments. Other application
areas, such as spectroscopy, are also possible. The system can be used
either as a histogramming and display subroutine library, or as a
component framework for developing complete acquisition and analysis
systems. Persistent object management, a histogramming library, routing of
experimental data packets, abstract data processing components, and
an interactive graphical display were all implemented in Oberon-2 in fewer
than 3000 executable statements. The software is portable between common
operating platforms for which Oberon System V4 is available:
Windows/Windows NT, MacOS, and several Unix flavors.
The reader may be familiar with other analysis systems such as
PAW, ROOT, LISA, Smaug/Xamine, or Python.
Compared with other software, B2 is tiny, easy to maintain
and to modify, modular, and run-time extensible.
Originally I wrote B2 in 1996-1998 because I always needed such a tool,
and I could not find it anywhere. Recently I dusted it off with
the plan to port it to the FPGA Oberon System and run it on the
FPGA, after necessary modifications.
Is B2 portable?
B2 has been designed and programmed under ETH Oberon System V4-0.9 for
Linux. The whole B2 software used to be portable by recompilation to any
Oberon System V4 (Windows, Mac, PowerMac, Amiga, and Unix systems including
AIX, DECStation, HP/UX, SiliconGraphics, and Sun). Most of these platforms
no longer exist as of 2020. B2 should still run under both the Linux and Windows
versions of V4. I plan to port it to the FPGA Oberon.
Physicist's view of B2
Analysing data from multidetector Nuclear Physics experiments always takes longer
than expected. Sometimes it may even take years. Modern detector devices
produce gigabytes of data per experiment.
- The devices are hierarchically structured into smaller units. For example,
the Superball was divided into sectors, and every sector was equipped with
a number of phototubes. The Miniball was likewise divided into rings; each ring
was further divided into detector units, and each unit provided three different
(but not independent) electronic signals.
- Other than the above characteristics, different devices seem to have almost
nothing in common with one another (e.g., signals coming from the Superball were vastly
different from those coming from the Miniball or from particle telescopes).
Every device imposes its own set of rules and its own interpretation
of its data.
- Experiments are performed in the "event-by-event" mode, which means signals from
every nuclear interaction "event" are digitized and processed separately from other
interaction "events".
- The devices are operated in coincidence with one another. (E.g., neutron
information from the Superball and charged-particle (CP) information from the Miniball
were processed together for every interaction event.)
The most valuable information is contained in correlations
among data from separate devices. (E.g., correlations of neutron and
CP multiplicities, or a correlation of neutron and CP
transverse energies.)
The experimental data stream cannot be treated as a simple uncorrelated union (i.e., Cartesian product)
of data coming from different detectors or from different subsystems. Quite the opposite,
the most important information is contained in correlations. Given the number of signals
(roughly proportional to the number of individual detectors) one has to explore
potential correlations in a multidimensional space, where "multi" may go up to hundreds.
It is therefore not surprising that data analysis often becomes a bottleneck.
Improvements of data analysis techniques are thus worth careful consideration.
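The point about correlations can be illustrated with a small sketch. It is written in Python rather than Oberon-2 and it is not B2 code. The event fields `n_mult` and `cp_mult` and the helpers `fill_correlation` and `project` are hypothetical names chosen for illustration, and the events are made up. Each event increments one cell of a two-dimensional histogram, so the neutron and CP multiplicities of the same event stay together.

```python
from collections import Counter

def fill_correlation(events):
    """Build a 2D histogram of (neutron, CP) multiplicity pairs.

    `events` is an iterable of dicts with the hypothetical keys
    'n_mult' and 'cp_mult'; one dict stands for one interaction event.
    """
    hist2d = Counter()
    for ev in events:                      # event-by-event processing
        hist2d[(ev["n_mult"], ev["cp_mult"])] += 1
    return hist2d

def project(hist2d, axis):
    """Project the 2D histogram onto one axis (0 = neutrons, 1 = CP)."""
    hist1d = Counter()
    for cell, count in hist2d.items():
        hist1d[cell[axis]] += count
    return hist1d

# Made-up events, for illustration only.
events = [
    {"n_mult": 10, "cp_mult": 2},
    {"n_mult": 10, "cp_mult": 2},
    {"n_mult": 25, "cp_mult": 7},
]
corr = fill_correlation(events)
neutrons = project(corr, 0)
charged = project(corr, 1)
```

The one-dimensional projections can always be recovered from the two-dimensional histogram, but the reverse is not true. Filling the two one-dimensional spectra separately would discard which neutron multiplicity occurred together with which CP multiplicity.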
This paragraph was originally written ca 1997. The situation has changed since then.
A quick look at software packages currently in use in various nuclear physics labs
reveals a curious fact: while experimental hardware is highly specialized
and sophisticated, data analysis software seems to be rather unspecialized.
The most common tools for reaction studies are one-dimensional and two-dimensional histograms,
combined with two-dimensional selection contours. The ability to plot one- and two-dimensional matrices
in different graphical representations belongs more to the domain of data presentation
than to data analysis. The lack of support for data structures other than
matrices was largely due to technical limitations of Fortran-77,
which is still the main programming language in experimental physics.
As of 2020, Fortran 77 has disappeared. Python and C++ are now commonly used
for data analysis, the latter often within ROOT.
Due to its object-oriented foundations, the B2 framework provides
means to better structure the data analysis. The B2 architecture is similar to
concepts known from modular electronics. A data analysis package hosted by the B2 framework
is programmed as a collection of user-supplied Soft Processor units
managed by a common data-transmission Bus. Processors can be connected to
and disconnected from the Bus. The B2 system specifies rules for interfacing Processor modules
to the Bus and among themselves. Conformance to this specification enables Processors
to exchange data over the Bus. It also promotes reusability of Processor modules.
Processors can be either simple Processors corresponding to individual detector
modules (such as particle telescopes), or composite Processors corresponding
to collections of individual detectors. Composite Processors are capable of containing
other Processors (either simple or composite) and of distributing packets of data
to their subordinate Processors. In such a way, arbitrary data packages can be routed
to arbitrary Processors, combined into a network of arbitrary topology. Highly segmented
experimental hardware can thus be rendered in software in an entirely natural fashion.
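The Bus and Processor arrangement described above might be sketched as follows. This is a Python illustration under stated assumptions, not the B2 interface, which is written in Oberon-2. The class names, the `accept`, `connect`, `disconnect`, and `dispatch` methods, the use of a tuple of names as a packet address, and the signal names in the payloads are all hypothetical.

```python
class Processor:
    """Simple Processor: stands for one individual detector module."""
    def __init__(self, name):
        self.name = name
        self.received = []

    def accept(self, path, payload):
        # Keep the packet only if the remaining address is exactly this unit.
        if path == (self.name,):
            self.received.append(payload)

class CompositeProcessor(Processor):
    """Composite Processor: contains other Processors and routes to them."""
    def __init__(self, name):
        super().__init__(name)
        self.children = []

    def connect(self, child):
        self.children.append(child)

    def disconnect(self, child):
        self.children.remove(child)

    def accept(self, path, payload):
        # If addressed to this unit, strip own name and pass the rest down.
        if path[:1] == (self.name,):
            for child in self.children:
                child.accept(path[1:], payload)

class Bus:
    """Common data-transmission Bus offering packets to connected Processors."""
    def __init__(self):
        self.processors = []

    def connect(self, processor):
        self.processors.append(processor)

    def disconnect(self, processor):
        self.processors.remove(processor)

    def dispatch(self, path, payload):
        for processor in self.processors:
            processor.accept(path, payload)

# A segmented device (rings containing detector units) next to a telescope.
bus = Bus()
miniball = CompositeProcessor("miniball")
ring1 = CompositeProcessor("ring1")
det3 = Processor("det3")
ring1.connect(det3)
miniball.connect(ring1)
telescope = Processor("telescope")
bus.connect(miniball)
bus.connect(telescope)

# Made-up payloads; the signal names are illustrative only.
bus.dispatch(("miniball", "ring1", "det3"), {"fast": 120, "slow": 340, "time": 15})
bus.dispatch(("telescope",), {"dE": 42, "E": 910})
```

Because a composite only needs its children to offer the same `accept` call, simple and composite Processors can be nested to any depth, which is how a hierarchy of sectors, rings, and detector units maps onto software.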
These two concepts (smart data structures and user-defined Processors)
can be combined together to yield software analogs of major experimental
hardware such as the Superball, the Miniball, or any other experimental
devices or subsystems.
Programmer's view of B2
Developing new "classes" is not necessary to use the B2 system for
data analysis. Thanks to the "traditional" procedural side of the Oberon language,
parts of the B2 system (such as a histogramming package) can be treated like a
traditional subroutine library. Existing numerical subroutines can be
translated from Fortran to Oberon with very little effort (almost automatically).
I developed my first Oberon program (a Monte Carlo simulation package)
in parallel in Fortran and in Oberon in order to investigate
whether or not both languages are equally suitable for numerical work.
B2 is hosted within the object-oriented, extensible Oberon System V4.
B2 is itself an extension of this environment,
and it can be extended further without any compiled-in limitations.
The package is implemented in the Oberon-2 programming language. It makes extensive use
of type-bound procedures (Oberon-2 methods). Installable instance-bound
procedures (Oberon-1 methods) are used to a limited degree, mostly in the
graphical part of the system.
B2 is programmed in a type-safe way, which in practice eliminates whole classes of bugs.
Safe programming is far from trivial in the case of software based
entirely on run-time dynamic memory management.
It is, however, natural under Oberon, where pointer variables are both strongly typed
and garbage collected. The entire B2 system was programmed with high-level constructs
without resorting to low-level features of the language
(i.e., module SYSTEM has not been used to relax security conditions). Assertions
have been extensively used to achieve run-time security of all B2 modules. Even though
B2 code is exclusively high-level, it is by no means inefficient. Benchmarks
showed that B2 histogramming ran faster than CERN HBOOK written in Fortran.
A unique feature of the Oberon System is its run-time extensibility.
Under the Oberon System all applications can be extended at run-time, while they
are active. A B2 user/programmer
can work on developing a particular B2 module, while other B2 modules are
loaded in memory and keep processing data. No data is lost from memory between
compile-link cycles. The programmer does not need to reload all spectra from disk just to
change a few lines in the data-processing code.
The B2 system is open-ended. Users can extend the base system in at least two
different ways: (1) by programming new "smart data structures"; and (2) by
programming whole new subsystems.
1. Programming new "smart data structures" is based on
inheritance. Possible new "smart structures" include new Processor types based on
existing (empty) templates, enhanced histograms with
experiment-specific behavior, etc. In OOP parlance smart data structures
are usually termed objects.
2. Adding whole new subsystems is based on the "orthogonal design" of B2.
Existing B2 subsystems depend on one another as little as possible. Adding new
subsystems will not disrupt existing ones.
Communication between such (future) subsystems is based on exchanging
"messages" over a run-time "message communication bus" provided as a standard
part of the operating environment. Only a few graphical modules make explicit
use of Oberon System graphics primitives and of Oberon GUI.
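Both extension routes can be sketched in a few lines. The example is in Python rather than Oberon-2 and does not reproduce B2 types. `Histogram1D`, `GatedHistogram`, `ClearMsg`, `ReportMsg`, and `broadcast` are hypothetical names. The sketch assumes that a receiver simply ignores any message type it does not understand, so new message types can be added without touching existing receivers.

```python
class ClearMsg:
    """Message asking every receiver to reset its contents."""

class ReportMsg:
    """Message collecting status lines from receivers that understand it."""
    def __init__(self):
        self.lines = []

class Histogram1D:
    """Plain fixed-bin 1D histogram (illustrative, not the B2 type)."""
    def __init__(self, nbins, lo, hi):
        self.nbins, self.lo, self.hi = nbins, lo, hi
        self.counts = [0] * nbins

    def fill(self, x):
        if self.lo <= x < self.hi:
            width = (self.hi - self.lo) / self.nbins
            index = min(int((x - self.lo) / width), self.nbins - 1)
            self.counts[index] += 1

    def handle(self, msg):
        # The base type understands only ClearMsg; anything else is ignored.
        if isinstance(msg, ClearMsg):
            self.counts = [0] * self.nbins

class GatedHistogram(Histogram1D):
    """Extension by inheritance: fills only for events that pass a gate."""
    def __init__(self, nbins, lo, hi, gate):
        super().__init__(nbins, lo, hi)
        self.gate = gate          # predicate evaluated on the whole event
        self.rejected = 0

    def fill_event(self, event, x):
        if self.gate(event):
            self.fill(x)
        else:
            self.rejected += 1

    def handle(self, msg):
        if isinstance(msg, ReportMsg):
            msg.lines.append(f"rejected={self.rejected}")
        else:
            super().handle(msg)   # fall back to inherited behavior

def broadcast(receivers, msg):
    """Deliver one message to every receiver on the (toy) message bus."""
    for receiver in receivers:
        receiver.handle(msg)

plain = Histogram1D(10, 0.0, 10.0)
gated = GatedHistogram(10, 0.0, 10.0, gate=lambda ev: ev["cp_mult"] >= 3)
plain.fill(2.5)
gated.fill_event({"cp_mult": 5}, 2.5)   # passes the gate
gated.fill_event({"cp_mult": 1}, 7.5)   # rejected by the gate

report = ReportMsg()
broadcast([plain, gated], report)       # only the extended type answers
broadcast([plain, gated], ClearMsg())   # both types understand this one
```

The sender never needs to know which receivers exist or what they can do, which is what keeps independently written subsystems from depending on one another.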
B2 is structured into a few isolated parts, namely:
- The object management library (based on "persistent object" concept).
- A fast and efficient histogramming package (both one and two-dimensional spectra).
- Graphical histogram display.
- The data dispatch bus. It serves as a communication medium among Processors.
- Run-time extensible user interface (inherited from the host Oberon System).
The principle of code reuse became the key to the small size of the
system (fewer than three thousand executable statements).
Following this principle,
basic B2 services were programmed only once and subsequently reused by
different parts of the system thanks to the polymorphism of the underlying objects.
The B2 object management subsystem can serve as an illustrative example of this approach.
Data-analysis objects such as histograms
or selection contours need to be managed in essentially similar ways. All such objects
need to be grouped into various lists, to be saved to disk and read back to memory, etc.
Under the object-oriented paradigm all management was programmed only once in terms
of an archetypal abstract Object defined in the base module B2Base.Mod.
The same code was used to manage all data analysis objects descended
from the archetypal abstract one. This approach yielded code which is not only small,
but also reliable, because it is tested by virtually every operation
of the B2 system. In such a way, code reuse leads to code reliability.
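The idea of writing the management code once against an abstract base type can be sketched as follows. This is a Python illustration, not the contents of B2Base.Mod. `BaseObject`, `Histogram`, `Contour`, `register`, `save_all`, and `load_all` are hypothetical names, and JSON text stands in for whatever file format B2 actually uses.

```python
import json

REGISTRY = {}   # type name -> class, so that objects can be re-created on load

def register(cls):
    """Class decorator that makes a type known to the loader."""
    REGISTRY[cls.__name__] = cls
    return cls

class BaseObject:
    """Abstract base: the only interface the management code relies on."""
    def __init__(self, name=""):
        self.name = name

    def store(self):
        """Return the object's state as a JSON-serializable dict."""
        return {"name": self.name}

    def load(self, state):
        self.name = state["name"]

@register
class Histogram(BaseObject):
    def __init__(self, name="", counts=()):
        super().__init__(name)
        self.counts = list(counts)

    def store(self):
        state = super().store()
        state["counts"] = self.counts
        return state

    def load(self, state):
        super().load(state)
        self.counts = list(state["counts"])

@register
class Contour(BaseObject):
    def __init__(self, name="", points=()):
        super().__init__(name)
        self.points = [list(p) for p in points]

    def store(self):
        state = super().store()
        state["points"] = self.points
        return state

    def load(self, state):
        super().load(state)
        self.points = [list(p) for p in state["points"]]

def save_all(objects):
    """Written once against BaseObject; works for every descendant."""
    records = [{"type": type(o).__name__, "state": o.store()} for o in objects]
    return json.dumps(records)

def load_all(text):
    """Re-create each object from its recorded type name and state."""
    objects = []
    for record in json.loads(text):
        obj = REGISTRY[record["type"]]()
        obj.load(record["state"])
        objects.append(obj)
    return objects

saved = save_all([
    Histogram("energy", [0, 3, 1]),
    Contour("gate", [[0, 0], [1, 0], [1, 1]]),
])
restored = load_all(saved)
```

`save_all` and `load_all` never mention histograms or contours. A new object type only has to supply its own `store` and `load`, and every save or load of any object exercises the same shared code path.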
The B2 code is thoroughly
commented and formatted to promote its readability.
Code that is both small and understandable can relieve the user
of the need for external support. To that end, an effort was made to
write code of publishable quality, similar to
the code published in the Oberon literature.
Authors and credits
Both the B2 code and the documentation were written by one person (myself)
in a relatively short total integrated time (about two months). The development
was spread over about two years due to other ongoing projects. It took me a few years to
learn all the OOP techniques necessary to implement such an extensible object-oriented
software framework.
This work would not have been possible without generous contributions from many people:
- Professors J. Gutknecht and N. Wirth of ETH Zurich designed the Oberon
language and the Oberon System for the Ceres computer.
- Their associates from ETH Zurich and from
Johannes Kepler Univ. Linz
ported their work to many computer platforms and made Oberon widely available.
- Stefan Ludwig (ETH Zurich) kindly contributed adjustable array code used in B2Base.Mod.
- The persistent object management code in B2Base.Mod was inspired by
OS.Mod by Professor H.P. Moessenboeck (Univ. Linz).
- Whitney DeVries contributed many helpful hints during initial stages of the project.
- B2 was developed within the context of my other research
at the University of Rochester Nuclear Chemistry group.
Oberon System books
- N. Wirth and M. Reiser
Programming in Oberon. Steps beyond Pascal and Modula.
Addison Wesley, 1992, ISBN 0-201-56543-9.
Tutorial for the Oberon and Oberon-2 programming languages
and a concise language reference. Covers most of what you will ever need
in computer programming: classical procedural style, structured programming,
OOP, data structures such as lists and trees, etc. This book is an instant classic
not just on Oberon programming, but on programming in general.
- H. Moessenboeck
Object-Oriented Programming in Oberon-2.
Springer, 1993, ISBN 3-540-56411-X.
Principles and applications of object-oriented programming
with examples in the language Oberon-2. Advanced concepts such as OOP,
frameworks, Model-View-Controller, and many more, are all explained
in a way which can be understood.
- N. Wirth and J. Gutknecht
Project Oberon. The Design of an Operating System and Compiler.
Addison Wesley, 1992, ISBN 0-201-54428-8.
Program listings with explanation for the whole system,
including the compiler for the NS32000 processor. In case you ever wondered
whether operating systems have to be large, slow and buggy,
here is the answer. This operating system consists of 12,000 lines
of documented code (including graphical windowing system, mail and network support,
and Oberon compiler).
- M. Reiser
The Oberon System. User Guide and Programmer's Manual.
Addison Wesley, 1991, ISBN 0-201-54422-9.
User manual for the programming environment and reference
for the standard module library. Whether you are using V4 or System-3,
you need this book. It contains a valuable example of a substantial
application PostIt, which can be studied as a tutorial.
B2_overview.htm, last updated Apr/27/98; edited on Nov/30/2020.
(C) Wojtek Skulski 1997-2020.