HighLAND
This page outlines the main concepts of the HighLAND framework.
The basic workflow for using the HighLAND framework is outlined in the sections below.
Internally, the HighLAND framework does not use the same representation of the data as the original analysis files. Instead, a subset of the information is extracted and converted into a set of data classes, which all start with the Ana* prefix. The purpose of this is twofold: to reduce the processing time, by using only the minimal amount of information necessary, and to make the analysis independent of the input format. The data classes are defined in three levels: CoreDataClasses, BaseDataClasses (in psycheEventModel) and DataClasses (in highlandEventModel).
There is an additional set of files, DataClassesIO.hxx/.cxx, in highlandIO, that contains the methods needed to dump into the FlatTree the information contained in the DataClasses (and hence in the BaseDataClasses and CoreDataClasses).
The top-level object is an AnaSpillB, which contains the information from one beam spill (in general formed of several bunches). The full structure of the AnaSpillB object can be found in the class documentation; in summary, each particle stores, among other things, a pointer to the original particle from which it comes (explained below).
This structure is the one defined in the BaseDataClasses (hence the B suffix in all classes, which stands for Base). The Original particle mentioned above is a pointer to another instance of the same particle, before corrections or systematics are applied. This defines three levels of AnaSpillB: raw, corrected and final. AnaParticleB's in the final AnaSpillB have a pointer to the corresponding particle in the corrected AnaSpillB, and AnaParticleB's in the corrected AnaSpillB have a pointer to the unmodified particle in the raw AnaSpillB.
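To illustrate how the three levels are linked, the following is a minimal sketch (not taken from the HighLAND sources) of walking back from a final particle to its raw counterpart through the Original pointer; the simplified AnaParticleB stand-in and the GetRawParticle helper are assumptions made for this example.

    // Minimal sketch (not from the HighLAND sources): navigating the three
    // AnaSpillB levels through the Original pointer. The simplified AnaParticleB
    // stand-in and the GetRawParticle helper are assumptions for this example.
    #include <iostream>

    struct AnaParticleB {
      double Momentum;
      const AnaParticleB* Original;  // particle before corrections/systematics (0 for the raw one)
    };

    // Follow the Original chain until the raw (unmodified) particle is reached.
    const AnaParticleB* GetRawParticle(const AnaParticleB* part) {
      while (part && part->Original) part = part->Original;
      return part;
    }

    int main() {
      AnaParticleB raw       = {100.0, 0};           // raw spill: as read from the input file
      AnaParticleB corrected = { 95.0, &raw};        // corrected spill: after corrections
      AnaParticleB final_    = { 94.2, &corrected};  // final spill: after systematic variations

      std::cout << "raw momentum: " << GetRawParticle(&final_)->Momentum << std::endl;
      return 0;
    }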
The various Ana* objects are filled by the InputConverter's in psycheIO and highlandIO. The appropriate converter is found automatically depending on the input file (for ROOT files each converter looks for a specific tree).
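As an illustration of that last point, the sketch below (not the actual InputConverter interface) shows how a converter could decide whether it can handle a given ROOT file by looking for the tree it expects; the HandlesFile helper, its arguments and the example tree name are assumptions.

    // Illustrative sketch (not the actual InputConverter interface): a converter can
    // claim an input file when the specific tree it expects is present in that file.
    #include <TFile.h>
    #include <TTree.h>
    #include <string>

    // Hypothetical helper: returns true if the file contains the tree this converter expects.
    bool HandlesFile(const std::string& fileName, const std::string& treeName) {
      TFile file(fileName.c_str(), "READ");
      if (file.IsZombie()) return false;
      TTree* tree = dynamic_cast<TTree*>(file.Get(treeName.c_str()));
      return tree != 0;
    }

    // Usage (hypothetical names): if (HandlesFile("input.root", "flattree")) { /* use this converter */ }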
As mentioned previously, the Ana* objects only contain a subset of the information found in the original analysis files. This means that they may not contain all the information you need to perform your analysis. It is possible to extend the existing data classes to contain the information you specifically need. For example, DataClasses.hxx (under highlandEventModel) contains extensions of BaseDataClasses.hxx (under psycheEventModel). In the same way you can build your own extended data classes from the ones in DataClasses.hxx.
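A minimal sketch of such an extension is shown below; the AnaTrack base class name and the extra member are assumptions made for the example, not part of the documented interface.

    // Hypothetical sketch: an analysis-specific extension of the event model, built on
    // top of DataClasses.hxx in the same way DataClasses.hxx extends BaseDataClasses.hxx.
    #include "DataClasses.hxx"

    class MyAnaTrack : public AnaTrack {
     public:
      MyAnaTrack() : AnaTrack(), MyExtraVariable(-999) {}
      virtual ~MyAnaTrack() {}

      /// Analysis-specific quantity not present in the standard data classes.
      Float_t MyExtraVariable;
    };

If the new member has to appear in the output FlatTree, the corresponding IO code (cf. DataClassesIO above) typically needs to be extended as well.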
After the information in the input file has been read in, it is possible to "manipulate" that information using Corrections.
Corrections change the input data to rectify a problem in the MC or in the real data. Imagine for example that the energy deposited by a particle in a given sub-detector is overestimated in the MC by about 5%, and that this effect depends on the particle type (6% for muons and 4% for electrons). We could introduce a correction for the MC, which would scale the deposited energy by 0.94 for true muons and by 0.96 for true electrons. In this way we make sure that any cut using the deposited energy will have the same effect in data and MC, avoiding the corresponding systematic.
Corrections are only applied once per spill.
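As an illustration of the deposited-energy example above, here is a minimal sketch of such a correction; the data structures and the function are hypothetical stand-ins, not the actual HighLAND correction interface.

    // Hypothetical sketch (not the real HighLAND correction interface): scale the
    // deposited energy of MC particles depending on the true particle type, as in
    // the example above (x0.94 for true muons, x0.96 for true electrons).
    #include <cstdlib>

    struct TrueParticle { int PDG; };                                        // illustrative stand-ins
    struct Particle     { double DepositedEnergy; TrueParticle* TrueObject; };

    void ApplyDepositedEnergyCorrection(Particle& part) {
      if (!part.TrueObject) return;                      // real data or no truth match: leave untouched
      int pdg = std::abs(part.TrueObject->PDG);
      if      (pdg == 13) part.DepositedEnergy *= 0.94;  // true muons: MC overestimates by ~6%
      else if (pdg == 11) part.DepositedEnergy *= 0.96;  // true electrons: MC overestimates by ~4%
    }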
After corrections have been applied it is possible to "manipulate" the information again, this time using Systematics. In this case the purpose is different: we want to test several values of a given detector property and check the impact on the number of selected events. This allows propagating systematic errors numerically.
There are two types of Systematics: variations and weights. Systematic variations are applied before the event selection and modify the event properties, as Corrections do. In fact, both corrections and systematics were represented by the same class, InputVariation, in the first version of the HighLAND framework. The other type are the systematic weights, which only vary the total weight of the event in our sample. Those systematics are applied after the event selection.
In the above example about the deposited energy, the correction introduced cannot be perfectly known: the 4% and 6% mentioned have an error (e.g. 0.5%). This error should be propagated as a systematic. A given number of toy experiments is run with different values of the scaling parameter for the deposited energy (e.g. for muons 0.93, 0.935, 0.95, ..., following a Gaussian distribution with mean 0.94 and sigma 0.005). If a cut on the deposited energy (or on a variable using it) is applied, the number of selected events can differ depending on the scaling applied. The RMS of the number of selected events over all toy experiments represents the systematic error induced by the deposited-energy scaling.
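The following self-contained sketch illustrates this throw-and-RMS logic; the event sample, the cut value and the number of toys are invented for the example, only the procedure is the point.

    // Illustrative sketch: propagating the deposited-energy scaling uncertainty with
    // toy experiments, as described above (muon scaling thrown with mean 0.94, sigma 0.005).
    #include <cmath>
    #include <iostream>
    #include <random>
    #include <vector>

    int main() {
      std::mt19937 rng(12345);                                   // fixed seed, as a configuration would set
      std::normal_distribution<double> scaleThrow(0.94, 0.005);  // one throw per toy experiment

      // Hypothetical sample of deposited energies (MC, true muons) and an arbitrary cut.
      std::vector<double> depositedEnergy = {48., 52., 55., 60., 45., 58., 51., 62.};
      const double cut = 50.;

      const int nToys = 1000;
      std::vector<double> nSelected(nToys, 0.);
      for (int itoy = 0; itoy < nToys; ++itoy) {
        double scale = scaleThrow(rng);                // value of the systematic parameter for this toy
        for (size_t i = 0; i < depositedEnergy.size(); ++i)
          if (depositedEnergy[i] * scale > cut) nSelected[itoy] += 1.;  // re-apply the cut with the varied energy
      }

      // The RMS of the selected-event count over the toys is the systematic error.
      double mean = 0., mean2 = 0.;
      for (int itoy = 0; itoy < nToys; ++itoy) { mean += nSelected[itoy]; mean2 += nSelected[itoy] * nSelected[itoy]; }
      mean /= nToys; mean2 /= nToys;
      std::cout << "systematic error = " << std::sqrt(mean2 - mean * mean) << std::endl;
      return 0;
    }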
Detailed info about systematics can be found on the systematics page.
The user can run several analyses in parallel, minimising the access to disk. These parallel analyses are called configurations. Although this might be extended in the future, currently configurations only allow you to specify which systematic errors will be propagated, the number of toy experiments and the random seed. Detailed info can be found below.
A tree for each configuration is produced in the output file. By default a single configuration (called "default") is run, producing a single tree (with name default). This tree does not have any systematics enabled and hence it represents the nominal selection.
As explained above, the effect of systematic error sources is propagated numerically to the final number of selected events by making multiple "throws" (toy experiments) of the systematic source parameter. The HighLAND framework provides a simple interface for specifying how many throws to make for each systematic variation. For example, imagine we want a configuration named "momresol_conf" with only one systematic enabled: the momentum resolution event variation.
The class baseToyMaker is defined in the baseAnalysisPackage. For more detailed information about systematics visit the systematics page.
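Conceptually, a configuration amounts to a named set of enabled systematics together with the number of toy experiments and the random seed used by the toy maker (e.g. baseToyMaker). The sketch below illustrates this with hypothetical types; it is not the real HighLAND interface.

    // Conceptual sketch with hypothetical types (not the real HighLAND interface).
    #include <string>
    #include <vector>

    struct Configuration {
      std::string name;                             // e.g. "default" or "momresol_conf"
      std::vector<std::string> enabledSystematics;  // systematics propagated in this configuration
      int nToys;                                    // number of toy experiments
      unsigned int randomSeed;                      // seed used to throw the toys
    };

    // The "momresol_conf" configuration of the example above: only the momentum
    // resolution event variation enabled (names and numbers are illustrative).
    Configuration momresol_conf = {"momresol_conf", {"momresol"}, 1000, 1};
    // The "default" configuration: no systematics enabled, a single nominal toy.
    Configuration default_conf  = {"default", {}, 1, 1};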
The framework handles configurations and toy experiments in an efficient manner, and only needs to read each event once. The general loop structure is sketched below (see also the figure on the front page).
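The skeleton below is a hypothetical rendering of that loop, not the real HighLAND code: all hooks are illustrative stubs, and only the nesting of events, corrections, configurations, toys, variations, selection and weights reflects the text above.

    // Hypothetical skeleton of the general loop structure: each event is read from
    // disk only once, and every configuration and toy experiment reuses that
    // in-memory copy. All hooks below are illustrative stubs.
    #include <vector>

    struct Event {};
    struct Configuration { int nToys; };

    bool   ReadNextEvent(Event&)                                           { return false; }  // stub
    void   ApplyCorrections(Event&)                                        {}                 // once per spill
    void   ApplyVariationSystematics(Event&, const Configuration&, int)    {}                 // before the selection
    bool   RunSelection(const Event&)                                      { return true; }
    double ApplyWeightSystematics(const Event&, const Configuration&, int) { return 1.; }     // after the selection
    void   FillTree(const Configuration&, const Event&, double)            {}

    void ProcessFile(const std::vector<Configuration>& configurations) {
      Event event;
      while (ReadNextEvent(event)) {                 // read each event from disk only once
        ApplyCorrections(event);                     // corrections: applied once per spill
        for (const Configuration& conf : configurations) {
          for (int itoy = 0; itoy < conf.nToys; ++itoy) {
            Event toyEvent = event;                            // work on a copy for this toy
            ApplyVariationSystematics(toyEvent, conf, itoy);   // variation systematics
            if (!RunSelection(toyEvent)) continue;             // event selection
            double weight = ApplyWeightSystematics(toyEvent, conf, itoy);  // weight systematics
            FillTree(conf, toyEvent, weight);                  // fill the configuration's output tree
          }
        }
      }
    }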