# TileCal ROD HW Requirements and LArg compatibility

A report document

J. Castelo

Jose.Castelo@ific.uv.es

IFIC, University of Valencia. SPAIN

# DRAFT

#### **Abstract**

This note describes the hardware requirements for the ATLAS Tile Calorimeter Read Out Drivers (RODs). It studies several options and their compatibility with the new Liquid Argon design, with a view to a common ROD for the ATLAS calorimetry. We summarize the possible choices that must be discussed and the advantages that each one would bring.

## **INDEX**

- 1 Introduction. Summarizing the Read Out System requirements
- 2 Dataflow
  - 2.1 The Liquid Argon new ROD motherboard design description
  - 2.2 Option 1: Redesign the input stage of the board
    - 2.2.1 Advantages
    - 2.2.2 Disadvantages
  - 2.3 Option 2: Using exactly the same PCB as LArg but clock select
    - 2.3.1 Advantages
    - 2.3.2 Disadvantages
  - 2.4 Option 3: Using ROD demonstrator board, and design a new transition module for Tilecal RODs
    - 2.4.1 Advantages
    - 2.4.2 Disadvantages
- 3 TTC Partitions
  - 3.1 Introduction
  - 3.2 Using 64 Modules
  - 3.3 Using 32 Modules
- 4 Cost estimation
  - 4.1 Current design. ROD demonstrator board using only 2 PUs + TM4Plus1
  - 4.2 Using new LArg ROD boards with only 2 processing units and two output mezzanine links
  - 4.3 Using ROD demonstrator board + TM8Plus1
- 5 Future upgrades
- 6 Conclusions. Preferred Solutions
  - 6.1 Dataflow option 2: Using exactly the same PCB as LArg but clock select
  - 6.2 TTC partition selection
- 7 Acknowledgements
- 8 Acronyms
- 9 References

NOTE: AFTER PAGE 18 THERE ARE 9 PAGES, IN LANDSCAPE FORMAT, FOR THE FIGURES ADDRESSED IN THIS DOCUMENT

## **Table Index**

- Table 1 Tilecal ROD characteristics
- Table 2 ROD budget using ROD demonstrator board (64 modules)
- Table 3 ROD budget using new integrated LArg board (32 modules)
- Table 4 ROD budget using ROD demonstrator and designing a new TM8Plus1 for Tilecal from scratch

## **Figure Index**

(At the end of the document in landscape format)

- Figure 1 LArg ROD module dataflow block diagram (the new design)
- Figure 2 Tilecal ROD module dataflow block diagram redesigning the input part
- Figure 3 Tilecal ROD module dataflow block diagram (using LArg ROD new design)
- Figure 4 The LArg input stage modifications needed
- Figure 5 Tilecal ROD module dataflow block diagram using ROD demonstrator board + custom transition module TM8Plus1
- Figure 6 Tilecal ROD dataflow scheme, 64 ROD modules
- Figure 7 Tilecal ROD partitions proposal, 64 RODs
- Figure 8 Tilecal ROD dataflow scheme, 32 ROD modules
- Figure 9 Tilecal ROD partitions proposal, 32 RODs
- Figure 10 Dataflow of the preferred solution using new LArg motherboard
- Figure 11 Preferred configuration of 4 TTC partitions in barrels

## 1 Introduction. Summarizing the Read Out System requirements

The TileCal ROD system has to read and process 9856 channels every 10 µs, and it must be able to deal with this huge amount of data in real time. The data gathered from these channels are digitized and transmitted to the RODs over high-speed optical links. Each ROD module must be able to process these data and send them through an output optical link to the next stage (ROBs) in the data acquisition chain. The ROD system must also provide communication for monitoring and control of all the ROD modules. This feature is driven by the ROD controller (SBC), which is the master CPU of the ROD crate and controls the ROD modules (slave devices). Another slave module must be built for this application: the TBM, which is responsible for receiving the TTC information at ROD crate level and distributing it to all ROD modules. In addition, it collects the BUSY signals of all the ROD modules and implements an OR function over them in order to stop the L1A generation. Bidirectional communication with the CTP is done through a TTC crate in the partition, which manages the BUSY and TTC signals.

Table 1 summarizes the read out driver performance requirements for the hadronic calorimeter.

## 2 Dataflow

The motivation of this document is to compare the Tilecal ROD requirements with the new LArg motherboard design, in order to take advantage of the new, more integrated board and reduce costs. In this section three possible architecture schemes are described, and a comparison between them should lead to a final hardware solution for the Tilecal ROD dataflow.

#### 2.1 The Liquid Argon new ROD motherboard design description

Figure 1 shows the dataflow architecture of the new ROD motherboard.

The RX part is based on up to eight integrated G-LINKs using the HDMP1024 chip running with an 80 MHz reception clock.

| Parameter                                                                                               | Value  |
|---------------------------------------------------------------------------------------------------------|--------|
| Number of channels                                                                                      | 9856   |
| Number of drawers (FEB)                                                                                 | 256    |
| Number of channels per drawer (EB)                                                                      | 32     |
| Number of channels per drawer (CB)                                                                      | 45     |
| Number of drawers (EB)                                                                                  | 128    |
| Number of drawers (CB)                                                                                  | 128    |
| Input event size per FEB (7 samples) in kbytes [2]                                                      | 0.57   |
| Total input event size (7 samples) in kbytes                                                            | 147.00 |
| Input data bandwidth @ 100 kHz LVL1 ATLAS rate in Gbytes/sec                                            | 14.02  |
| Number of drawers (FEB) per ROD                                                                         | 4      |
| Number of RODs                                                                                          | 64     |
| Typical output event size per ROD (Typical Size 1) in kbytes [2]                                        | 1.10   |
| Output data bandwidth @ 100 kHz LVL1 ATLAS rate in Gbytes/sec                                           | 6.70   |
| Number of PUs per ROD                                                                                   | 4      |
| Number of PU (DSP) instructions per channel (seven samples), applying optimal filtering (E, t and chi2) | 70     |
| Total processing power in MIPS                                                                          | 68992  |

Table 1 Tilecal ROD characteristics
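The derived rows of Table 1 can be cross-checked from the basic ones. The following Python sketch recomputes them. It assumes (this is not stated in the note) that the tabulated Gbytes/sec figures use binary prefixes, i.e. 1 Gbyte = 1024² kbytes, which is the convention that reproduces the quoted input bandwidth; the variable names are illustrative.

```python
# Cross-check of the derived rows of Table 1 (values copied from the table).
DRAWERS_EB, DRAWERS_CB = 128, 128
CH_PER_DRAWER_EB, CH_PER_DRAWER_CB = 32, 45
TOTAL_INPUT_EVENT_KB = 147.0      # total input event size, 7 samples
OUTPUT_EVENT_KB_PER_ROD = 1.10    # typical output event size per ROD
N_RODS = 64
LVL1_RATE_HZ = 100_000            # 100 kHz Level-1 rate
INSTR_PER_CHANNEL = 70            # DSP instructions per channel

# Assumption: "Gbytes" in the table means 1024**2 kbytes.
KB_PER_GB = 1024 ** 2

channels = DRAWERS_EB * CH_PER_DRAWER_EB + DRAWERS_CB * CH_PER_DRAWER_CB
input_bw_gb = TOTAL_INPUT_EVENT_KB * LVL1_RATE_HZ / KB_PER_GB
output_bw_gb = OUTPUT_EVENT_KB_PER_ROD * N_RODS * LVL1_RATE_HZ / KB_PER_GB
mips = channels * INSTR_PER_CHANNEL * LVL1_RATE_HZ / 1e6

print(f"channels          = {channels}")
print(f"input bandwidth   = {input_bw_gb:.2f} Gbytes/sec")
print(f"output bandwidth  = {output_bw_gb:.2f} Gbytes/sec")
print(f"processing power  = {mips:.0f} MIPS")
```

Because the per-ROD output size in the table is itself rounded, the recomputed output bandwidth agrees with the tabulated value only to within rounding.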

#### The main tasks for the Staging FPGA are:

- It multiplexes the data from the different FEB inputs and sends them to the connector of the PU concerned, depending on whether staging is enabled or not. This feature makes it possible to use only two processing units instead of four, routing the data to the right PU (the staging is configured through VMEbus).
- The G-link chips might need a configuration, which will be performed by the staging chips.
- It reads the temperature of the G-link chips and transmits it to the VME chip, because these chips usually have high power consumption.
- It transmits the G-link errors (parity and ready) to the PUs and to the VME.
- During the tests it will:
  - Read the G-link data and transfer them to the VME.
  - Transfer data from the VME to the PU, a function similar to the 'data distributor block' in the demonstrator board.
  - Transfer pattern data to the PU at high rate.
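The routing task in the first item can be pictured with a short Python sketch. The actual link-to-PU assignment is not given in this note, so the mapping below (consecutive links grouped onto consecutive PUs) and the function name `route_link` are purely illustrative assumptions.

```python
N_LINKS = 8   # up to eight G-link inputs per motherboard


def route_link(link: int, staging: bool) -> int:
    """Return the index of the PU that receives data from input `link`.

    Assumed mapping (illustrative only, not taken from the design):
      staging off -> 4 PUs, two consecutive links per PU
      staging on  -> 2 PUs, four consecutive links per PU
    """
    if not 0 <= link < N_LINKS:
        raise ValueError(f"link must be in 0..{N_LINKS - 1}")
    links_per_pu = 4 if staging else 2
    return link // links_per_pu


print([route_link(i, staging=False) for i in range(N_LINKS)])
print([route_link(i, staging=True) for i in range(N_LINKS)])
```

In the real board the equivalent selection would be a configuration bit written through VMEbus, as stated above.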

The Processing Units are responsible for the data and TTC reception, the implementation of the optimal filtering algorithms, the local histogramming and the sending of the processed data to the Output Controller FPGA. A FIFO is needed to provide some buffering at the output. Each Processing Unit is planned to contain two fixed-point Texas Instruments TMS320C6414 DSPs running at a 600 MHz clock rate.

The Output Controller FPGA has to treat the processed data and decide (according to a previous VME configuration) whether to send the events to the output links or to the SDRAM, where they are available to be read through the VME bus (low speed data taking). Another possibility is to send them to both, for spying on data for test purposes (≈5% of the data is spied by the VME crate controller).

A serializer/deserializer module is implemented because there were not enough pins in the P3 backplane to send the data of the four OCs. The trick used is to convert the unipolar signals to differential signals with high noise immunity, such as the LVDS standard.

This huge amount of data is sent to the ROBs through up to four S-LINK LSC mezzanines controlled by an FPGA in the transition module.

The reception of the TTC information is done by the TTCrx chip and a TTC\_FPGA, which distributes this information to each PU (to compare it with the one received from the front end data stream) and to the Output Controllers for building the data fragments with the DAQ-1 event format.

The VME FPGA is used for the communication with the ROD controller SBC. It is responsible for booting the PUs (DSP and FPGA code), for reading and writing the CSRs of the Staging FPGAs, OCs and TTC FPGA, and in general for access to the control and monitoring of the board.

The following sections compare and discuss several options that could be taken around the new design. We compare three options for the Tilecal ROD dataflow architecture, bearing in mind the advantages and disadvantages of each one.

#### 2.2 Option 1: Redesign the input stage of the board

Figure 2 shows the ideal Tilecal solution in case of redesigning the input stage of this board. The Tilecal front end interface links are implemented with 3.3 V G-LINK HDMP1032 chips running at 40 MHz, and a TX system of two fibers which send the same fragment so that data errors can be checked on the ROD reception side [4]. The proposed changes are to put a double optical receiver and two 3.3 V HDMP1034 RX chips with a 40 MHz clock for data de-serialization and for the staging FPGA machine clock. A FIFO for at least two input events is suggested because we need temporary storage for the redundant link data: while we check the CRC of one fragment, the other is held in the FIFO. In case of errors we read the other link's fragment from the FIFO and use it if it is OK. If not, an error flag must be reported.
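The redundant-link check described above can be sketched in Python as follows. This is only an illustration: the type and function names are hypothetical, and `zlib.crc32` stands in for the actual CRC of the Tilecal interface links, which this note does not specify.

```python
import zlib
from typing import NamedTuple, Optional, Tuple


class Fragment(NamedTuple):
    """One event fragment as received from one fiber (hypothetical layout)."""
    payload: bytes
    crc: int  # CRC word transmitted with the fragment


def crc_ok(frag: Fragment) -> bool:
    # Assumption: CRC-32 is used here only as a stand-in checksum.
    return zlib.crc32(frag.payload) == frag.crc


def select_fragment(link_a: Fragment, link_b: Fragment) -> Tuple[Optional[bytes], bool]:
    """Return (payload, error_flag) for one event.

    link_a is checked first; link_b plays the role of the copy held in
    the FIFO and is only used when link_a fails its CRC check.
    """
    if crc_ok(link_a):
        return link_a.payload, False
    if crc_ok(link_b):
        return link_b.payload, False
    return None, True  # both copies corrupted: report an error flag


good = Fragment(b"\x01\x02\x03", zlib.crc32(b"\x01\x02\x03"))
bad = Fragment(b"\x01\xff\x03", good.crc)   # corrupted payload, stale CRC
print(select_fragment(bad, good))   # falls back to the second fiber
print(select_fragment(bad, bad))    # error flag raised
```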

The rest of the board is kept as in the original design, but leaving out two Processing Units and two Output Controllers plus their SDRAM data storage. The motivation for this is, of course, to save costs. We must consider that LArg needs more processing power per link because that system processes 128 channels/link, while the Tilecal links are less saturated: it is not possible to send more than 48 channels/link due to drawer mechanical constraints. The real number of channels/link is 45 for CB and 32 for EB. Let us work with numbers:

- <u>Input bandwidth</u>: the maximum input BW of each link for a Tilecal physics event is **467.2 Mbit/sec**, and for four drawers it is **1.825 Gbits/sec**. Taking into account that the input bandwidth of the Processing Unit is **2.5 Gbits/sec** (64 bits @ 40 MHz), the input stage BW is sufficient.
- The processing unit: we need to process 154 channels (four drawers) in two TMS320C6414@600MHz DSPs, each of which can perform 4800 MIPS. This DSP has the same core as the DSP we have tested so far, the TMS320C6202@250MHz with up to 2000 MIPS [14], with some improvements in the number of registers and an enhanced DMA unit. With the tested unit we could process 45 channels in around 5.5 µs if programmed in assembler and 15.5 µs if programmed in C [3]. Scaling these figures with the number of channels and the available MIPS, we conclude that the new PU with TMS320C6414@600MHz [15] and 9600 MIPS (two DSPs) could process 154 channels in 3.92 µs with assembler programming and around 11 µs with C. Our limit is 10 µs at the 100 kHz LVL1 rate; thus, if we trust in improvements of the Texas Instruments C compiler, we could probably program the final system in more maintainable C code with only 2 Processing Unit mezzanines installed on the motherboard.
- <u>Output bandwidth</u>: the typical BW for 154 channels (four drawers) is **656 Mbits/sec** [2]. An Output Controller FPGA with **1.28 Gbits/sec** (32 bits @ 40 MHz) therefore has enough BW for the output of each Processing Unit (154 channels each). Two Output Controllers with two mezzanine links mounted on the transition module are enough for this configuration.
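The figures in the list above can be reproduced with a few lines of Python. The sketch assumes that processing time scales linearly with the number of channels and inversely with the available MIPS, and that 1 Gbit = 1024 Mbit (the convention that reproduces the quoted 1.825 and 2.5 Gbits/sec); both are assumptions of this illustration rather than measured results.

```python
# Values quoted in the text.
LINK_BW_MBIT = 467.2            # max input bandwidth per link (Mbit/s)
DRAWERS_PER_PU = 4
CHANNELS_PER_PU = 2 * 45 + 2 * 32   # two CB + two EB drawers = 154
LVL1_PERIOD_US = 10.0           # 100 kHz Level-1 rate

# Measured reference: TMS320C6202 @ 250 MHz, 2000 MIPS, 45 channels.
REF_MIPS, REF_CHANNELS = 2000, 45
REF_TIME_US = {"assembler": 5.5, "C": 15.5}
NEW_PU_MIPS = 2 * 4800          # two TMS320C6414 @ 600 MHz

MBIT_PER_GBIT = 1024            # assumption, see the paragraph above


def scaled_time_us(ref_time_us: float) -> float:
    """Scale a measured time to 154 channels on the new PU (linear model)."""
    return ref_time_us * (CHANNELS_PER_PU / REF_CHANNELS) * (REF_MIPS / NEW_PU_MIPS)


input_bw_gbit = LINK_BW_MBIT * DRAWERS_PER_PU / MBIT_PER_GBIT
pu_input_bw_gbit = 64 * 40 / MBIT_PER_GBIT      # 64 bits @ 40 MHz

print(f"input BW for four drawers: {input_bw_gbit:.3f} Gbit/s "
      f"(PU accepts {pu_input_bw_gbit:.1f} Gbit/s)")
for language, ref in REF_TIME_US.items():
    t = scaled_time_us(ref)
    verdict = "within" if t <= LVL1_PERIOD_US else "over"
    print(f"{language:9s}: {t:5.2f} us ({verdict} the {LVL1_PERIOD_US:.0f} us budget)")
```

Under this linear model the assembler version fits comfortably in the 10 µs budget, while the C version exceeds it by roughly 10%, which is the gap that compiler improvements would have to close.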

#### 2.2.1 Advantages

- ✓ With this solution we reduce the number of RODs from 64 to 32 while maintaining the readout in 64 links to ROBs. Therefore we do not spoil the granularity of the read out system.
- ✓ It is possible to receive the two redundant fibers, we use the 3.3 V HDMP1034 chips with no cooling problems, and we can choose whether or not to use the enhanced simplex mode (pin ESMPXENB=0) in the HDMP1032 transmitter interface link [8].

#### 2.2.2 Disadvantages

- Hardware design. The routing for the new PCB must be done specifically for Tilecal.
- The Tilecal PCB and the LArg one will be different, so a common order cannot be placed and we lose the cost reduction that comes with ordering a large number of units.

#### 2.3 Option 2: Using exactly the same PCB as LArg but clock select

This option is shown in Figure 3 and is my favorite one. It consists of using the same PCB as LArg, with the HDMP1024 [9] receiver, and with the enhanced simplex mode disabled (pin ESMPXENB=0) in the HDMP1032 transmitter interface link.

The only consideration is that we must add a **clock multiplexer** to select between the 80 MHz deserializer reception clock for LArg and 40 MHz for Tilecal. Because LArg is in the design phase this is possible to implement, and the cost of putting two crystal clocks with a MUX on the board is very low. Besides, the selection between the Tilecal mode and the LArg mode could be done with a simple jumper or with an FPGA active pin (then it could be configured from VME). Figure 4 marks the small changes to the initial LArg design concerning the clock and the control lines.

All the other blocks are the same as described in the previous sections.

#### 2.3.1 Advantages

- ✓ With this solution we reduce the number of RODs from **64** to **32** while maintaining the readout in 64 links to ROBs.
- ✓ The hardware development and routing are the same as for LArg, so a common order could be placed, and the ROD for both calorimeters will be the same, changing only a reception clock jumper. This will give better maintainability during the LHC lifetime.

#### 2.3.2 Disadvantages

- It is not possible to read the two redundant fibers of each interface link, but the TDR only specifies one as a ROD specification, and future improvements in HW could make it possible to read both.
- There is the problem of cooling the 5 V HDMP1024, but the LArg ROD community seems to have this problem under control with specific power fans.

#### 2.4 Option 3: Using ROD demonstrator board, and design a new transition module for Tilecal RODs

This option is shown in Figure 5. The idea is to use the ROD demonstrator board [19] we have now and to design a new transition module, TM8Plus1 (now we have the TM4Plus1 [18]), with 8 integrated dual G-link HDMP1034 receivers for the data redundancy check and one S-LINK output mounted on a mezzanine card.

#### 2.4.1 Advantages

- ✓ We could read the two fibers per interface link for the data fragment error check.

#### 2.4.2 Disadvantages

- We have to design a completely new transition module from scratch, with long development and test times.
- The ROD demonstrator board is not fully tested, and we would have to fix bugs in a board that we have not designed and that is therefore largely a black box for us (e.g. the TTCrx mezzanine is not mounted on the board).
- We will probably have problems with the availability of components for future production of the ROD demonstrator board.
- The output bandwidth of the Output Controller and the LSC S-Link to the ROB will overflow, because now the mapping is 32 RODs/ROBs and the data gathering mapping is 8:1 (8 inputs and one output). The available output BW is **1.25 Gbits/sec**, while the typical case needs **1.28 Gbits/sec** (four drawers need 656 Mbits/sec [2], so 8 drawers need 1.28 Gbps).
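The overflow argument in the last item can be checked numerically. This short Python sketch compares the typical output rate for eight drawers with the rate of a single output path. It assumes that the 1.25 Gbits/sec quoted above corresponds to a 32-bit path at 40 MHz (1280 Mbit/s); working in Mbit/s avoids any assumption about the Gbit convention.

```python
TYPICAL_OUT_MBIT_4_DRAWERS = 656.0   # typical output for four drawers [2]
DRAWERS_PER_ROD = 8                  # 8:1 gathering with 32 RODs
OUTPUT_PATH_MBIT = 32 * 40           # assumed: 32 bits @ 40 MHz = 1280 Mbit/s

required = TYPICAL_OUT_MBIT_4_DRAWERS * DRAWERS_PER_ROD / 4
margin = OUTPUT_PATH_MBIT - required

print(f"required: {required:.0f} Mbit/s, available: {OUTPUT_PATH_MBIT} Mbit/s")
print("overflow" if margin < 0 else "fits", f"(margin {margin:+.0f} Mbit/s)")
```

The deficit is small in the typical case, but it leaves no headroom for larger-than-typical events.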

## 3 TTC Partitions

#### 3.1 Introduction

The ROD must be able to interact with CTP [20] system for trigger management. Two kinds of signals are used for this purpose:

- The receipt of the L1A trigger and TTC indicating a valid event and its type plus other
- Generation of a BUSY signal when the buffer is nearly full.

A sub-detector has several partitions, which must be considered as subsystems. Partitions are required because the subsystems need to work independently. A partition could be a complete sub-detector or a subset of a sub-detector.

#### A partition requires:

- Independent Trigger and Timing system (TTC)
- Independent handling of the dead-time (BUSY)

All the TTC systems of sub-detectors in normal data taking use the trigger signal provided by the CTP and the normal timing signals. Their BUSY signals are transmitted to the CTP.

The sub-detector in calibration uses its own trigger generator and timing signals (if necessary). Its BUSY signal is routed to its trigger generator to handle dead-time. If the ROBs are used, the corresponding BUSY signals are used. If the ROBs receive the TTC, they must receive the private signals.

In the following sections we propose two kinds of partitioning: one for 32 ROD modules with the more integrated board, and one for 64 modules (ROD demonstrator board). This is only to illustrate several possibilities; I think this issue must be studied more carefully because it concerns the architecture of the data acquisition system for the calorimeter, and of course *comments* and ideas are welcome.

#### 3.2 Using 64 Modules

Figure 6 shows the architecture corresponding to four partitions of 16 ROD modules each, organized as four crates. Thus, we have one partition per crate.

Figure 7 shows this configuration in more detail. Each ROD crate partition interacts with one TTC 6U VME crate, receiving its own TTC signals and sending its own BUSY signals. The partitions are controlled by a partition master, which communicates through Ethernet with the ROD crate controllers (running local DAQ applications) and with the TTC controllers for configuring the TTCvi and the other modules inserted in the TTC crate. The partition information is loaded and stored in a common detector database. For more details on the function of each module see reference [1].

My questions>>> In principle the ROD baseline intends to organize four drawers (two CB and two EB) of each module to be processed in the same ROD. This segmentation corresponds to 16 modules per crate partition, spanning $\pi/2$ in $\phi$ and $\pi$ in $\eta$. Why not organize the partitions so that each spans the full $2\pi$ in $\phi$ (i.e. 4 partitions: 2 EB, 2 CB)?

#### 3.3 Using 32 Modules

Another TTC partitioning, with one partition per crate in mind, is shown in Figure 8 and in more detail in Figure 9. Basically it is the same as in the previous section but using only two TTC crates. The reader is therefore invited to analyze the scheme directly.

My questions>>> Could Tilecal be considered as a single partition? This option would allow us to save TTC crates and modules. Disadvantage: the detector would have only one ORed BUSY signal, so if a single ROD module generates a BUSY, this signal would be transmitted to the CTP and all the data acquisition would be stopped.
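The trade-off in the question above can be illustrated with a small Python sketch of the BUSY logic. The partition names and the data layout are hypothetical; the only behaviour taken from this note is that the BUSY signals of the ROD modules in a partition are ORed together.

```python
from typing import Dict, List


def partition_busy(rod_busy: List[bool]) -> bool:
    """OR of the BUSY signals of all ROD modules in one partition."""
    return any(rod_busy)


def stalled_partitions(partitions: Dict[str, List[bool]]) -> List[str]:
    """Names of the partitions whose ORed BUSY is asserted."""
    return [name for name, busy in partitions.items() if partition_busy(busy)]


# Hypothetical example: 32 RODs, exactly one of them asserting BUSY.
busy = [False] * 32
busy[5] = True

single = {"tilecal": busy}
four = {f"partition{i}": busy[8 * i: 8 * (i + 1)] for i in range(4)}

print(stalled_partitions(single))  # the whole detector is held
print(stalled_partitions(four))    # only the partition owning ROD 5
```

With a single partition every ROD sits behind the same BUSY line; with several partitions the BUSY can be attributed to, and handled by, one partition when the partitions run independently.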

## 4 Cost estimation

The next subsections give an estimate of the total cost of the Tilecal ROD system. Of course, the components have not been bought and we do not have a precise budget, so the figures represent approximately the current market prices. When looking at the prices, also consider the advantages and disadvantages of each option discussed in the previous sections (e.g. manpower, maintainability).

#### 4.1 Current design. ROD demonstrator board using only 2 PUs + TM4Plus1

| Description                              | No. | Price/unit (CHF) | Total price (CHF) |
|------------------------------------------|-----|------------------|-------------------|
| 9U VME ROD demonstrator motherboard      | 64  | 3000             | 192000            |
| Processing Unit mezzanines               | 128 | 1500             | 192000            |
| 9U VME Transition Module TM4Plus1 board  | 64  | 2000             | 128000            |
| S-Link ODIN LDC mezzanine                | 256 | 370              | 94720             |
| 9U Crates (e.g. Wiener VME 6000 series)  | 4   | 10000            | 40000             |
| 6U VME ROD Controller SBC                | 4   | 5000             | 20000             |
| VME Trigger&Busy module                  | 4   | 2000             | 8000              |
| P3 Backplane                             | 4   | 1000             | 4000              |
| 6U TTC Crate+module partitions           | 4   | 10000            | 40000             |
|                                          |     | TOTAL            | 718720            |

Table 2 ROD budget using ROD demonstrator board (64 modules)

#### 4.2 Using New LArg ROD boards with only 2 processing units and two output mezzanine links

| Description                                | No. | Price/unit (CHF)    | Total price (CHF)    |
|--------------------------------------------|-----|---------------------|----------------------|
| 9U VME ROD motherboard                     | 32  | 4000                | 128000               |
| Processing Units mezzanines                | 64  | 1500                | 96000                |
| 9U VME Transition Module with output links | 32  | 2500                | 80000                |
| S-Link HOLA 2.5Gbps LSC mezzanine          | 64  | 282                 | 18048                |
| 9U ROD Crates                              | 2   | 10000               | 20000                |
| 6U VME ROD Controller SBC                  | 2   | 5000                | 10000                |
| VME Trigger&Busy module                    | 2   | 2000                | 4000                 |
| P3 Backplane                               | 2   | 1000                | 2000                 |
| 6U TTC Crate+module partitions             | 2   | 10000               | 20000                |
|                                            |     | TOTAL               | 378048               |

Table 3 ROD budget using new integrated LArg board (32 modules)

<sup>\*</sup>Boxes without a price indicate that the price was not known to me, or that the cards have not been designed yet.

#### 4.3 Using ROD demonstrator board + TM8Plus1

| Description                              | No. | Price/unit (CHF) | Total price (CHF) |
|------------------------------------------|-----|------------------|-------------------|
| 9U VME ROD demonstrator motherboard      | 32  | 3000             | 96000             |
| Processing Unit mezzanines               | 128 | 1500             | 192000            |
| 9U VME Transition Module TM8Plus1 board  | 32  | 3400             | 108800            |
| S-Link ODIN LSC mezzanine                | 32  | 370              | 11840             |
| 9U Crates (e.g. Wiener VME 6000 series)  | 2   | 15000            | 30000             |
| 6U VME ROD Controller SBC                | 2   | 5000             | 10000             |
| VME Trigger&Busy module                  | 2   | 2000             | 4000              |
| P3 Backplane                             | 2   | 1000             | 2000              |
| 6U TTC Crate+module partitions           | 2   | 10000            | 20000             |
|                                          |     | TOTAL            | 474640            |

Table 4 ROD budget using ROD demonstrator and designing a new TM8Plus1 for Tilecal from scratch
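As a cross-check of Tables 2 to 4, the following Python sketch recomputes the totals from the quantities and unit prices listed in the tables. The option labels are shorthand introduced here for illustration.

```python
# (quantity, unit price in CHF) per line item, in table order.
BUDGETS = {
    "Table 2: demonstrator + TM4Plus1 (64 RODs)": [
        (64, 3000), (128, 1500), (64, 2000), (256, 370), (4, 10000),
        (4, 5000), (4, 2000), (4, 1000), (4, 10000),
    ],
    "Table 3: new LArg board (32 RODs)": [
        (32, 4000), (64, 1500), (32, 2500), (64, 282), (2, 10000),
        (2, 5000), (2, 2000), (2, 1000), (2, 10000),
    ],
    "Table 4: demonstrator + TM8Plus1 (32 RODs)": [
        (32, 3000), (128, 1500), (32, 3400), (32, 370), (2, 15000),
        (2, 5000), (2, 2000), (2, 1000), (2, 10000),
    ],
}

totals = {name: sum(n * price for n, price in items)
          for name, items in BUDGETS.items()}

cheapest = min(totals, key=totals.get)
for name, total in totals.items():
    print(f"{name}: {total} CHF")
print(f"lowest estimate: {cheapest}")
```

The recomputed totals agree with the TOTAL rows of the three tables, and the option based on the new LArg board gives the lowest estimate.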

## 5 Future upgrades

Due to cost, and to maintain hardware compatibility with the LArg ROD, it is not possible to read the two redundant fibres from the front-end links for data error checking. In principle this was not a TDR specification, but some experience indicates that link data errors have been encountered due to a noisy environment. LArg has done the same checks [17] and finally chose the option of sending only one fibre, I guess because of the cost too. Because the interface link system is in production as a two-fibre system, one link can be read for now and future efforts could be made to read both fibres; the system remains open and upgradeable.

## 6 Conclusions. Preferred Solutions

#### 6.1 Dataflow option 2: Using exactly the same PCB as LArg but clock select

This is the preferred solution because it offers <u>lower cost</u> and <u>higher hardware compatibility</u> with the LArg motherboards. Several detailed studies and/or tests must be done on this configuration to be sure of the compatibility:

- The optical link interface compatibility with the Tilecal interface links. In principle (according to the manufacturer datasheet) the RX chip HDMP1024 is compatible with the TX HDMP1032 with the enhanced mode disabled, but this must be tested in a real set-up. Besides, it is preferred not to use the enhanced mode because several tests done by the S-LINK people at CERN detected a bug in transmissions with this mode.
- The clock selector between 40 MHz for Tilecal and 80 MHz for LArg does not seem to be an implementation problem, since the staging FPGA receives a 40 MHz clock from the VME FPGA and this could be used for the receiver. The program inside the staging FPGA should be able to select the right reception frequency.
- The switching power consumption of silicon chips is proportional to the frequency. Because of this, LArg has a very complicated cooling system for the HDMP1024 chips, which must stay below 35°C to work properly at 80 MHz. This requires modifying the fan tray, adding turbines which send an air flow through a tube located on the motherboard, with the G-link chips inside this tube. We must investigate whether, working only at 40 MHz, it is possible to remove this cooling system and still run properly, because this would decrease the complexity of our system (reducing cost, too).
- The CRC check of the input data must be implemented. In principle it is preferred that the staging FPGA only routes data, and that the data rearrangement and check are done inside the input FPGA of the PU. Since the CRC check takes a lot of cells inside an FPGA, some simulations must be done in order to test whether the program can be synthesized inside one of these FPGAs. As far as I know the Staging FPGA will be an Altera ACEX 1k50 484 FineLineBGA (50,000 gates) and the Input FPGA of the PU is really two Altera ACEX1k100 (100,000 gates). FPGAs of this family are called High-Volume SRAM PLDs.
- This board could also be used to keep the possibility of reading the two redundant fibers from each drawer, in order to be more tolerant to radiation-induced faults at high luminosity. But if we read all the redundant fibers we must use twice as many boards (64), which would double the project cost. Thus, initially only one fiber is planned to be read, and in the future the channels more sensitive to radiation could be read out through two fibers.

Regarding the software, the parts that need to be programmed specifically for Tilecal are the *staging FPGA*, the *Processing Unit (input FPGA and DSP)*, and probably some adaptation of the VME libraries. The *Staging FPGA* must be able to manage the Tilecal input data and control words and send them to the PU. The input FPGA of the PU must reorganize the Tilecal data mapping and send the data to the DSP in order to run the optimal filtering algorithm, custom implemented for the number of samples, bits/sample and output data format [2] of the Tilecal detector. In principle the *Output Controller*, *TTC* and *VME* FPGA programs could be used as they are.

In Figure 10 a green line indicates the dataflow direction from the inputs to the outputs of the preferred solution for the RODs.

#### 6.2 TTC partition selection

Several discussions concluded that the preferred number of partitions for the Tilecal read-out is 4, and that they must be organized around $\phi \in [0, 2\pi]$. This means one partition for the left EB, one for the left CB, one for the right CB, and another for the right EB.

Figure 11 represents this configuration for partitioning the read-out by barrels. The scheme shown is for 64 RODs but it could be implemented for 32 RODs. This will allow us to work with only the central barrels if there are not enough RODs at the initial start of the system.
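The size of each of the four partitions follows from the drawer counts in Table 1. The Python sketch below assumes that the EB and CB drawers split evenly between the left and right sides, and that a ROD reads four drawers in the 64-ROD configuration and eight in the 32-ROD one; the partition labels and function name are illustrative.

```python
DRAWERS = {"EB": 128, "CB": 128}     # from Table 1
PARTITIONS = ["left EB", "left CB", "right CB", "right EB"]


def rods_per_partition(n_rods: int) -> dict:
    """RODs needed by each partition for a given total number of RODs."""
    drawers_per_rod = sum(DRAWERS.values()) // n_rods
    layout = {}
    for name in PARTITIONS:
        barrel = name.split()[1]
        drawers = DRAWERS[barrel] // 2       # assumption: even left/right split
        layout[name] = drawers // drawers_per_rod
    return layout


print(rods_per_partition(64))
print(rods_per_partition(32))

# Central barrels only, as for a staged start-up with few RODs.
cb_only = sum(n for name, n in rods_per_partition(32).items() if "CB" in name)
print(f"RODs needed for the central barrels alone (32-ROD layout): {cb_only}")
```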

# 7 Acknowledgements

My most sincere gratitude to all those who have helped me with the more technical details, and to those whose support facilitated the timely meetings that made this document possible.

# 8 Acronyms

**HW** : Hardware

**BW** : Bandwidth

**CB** : Central barrel

**CRC** : Cyclic redundancy check

**CTP** : Central Trigger Processor

**DAQ**: Data Acquisition (system)

**DCS** : Detector Control System

**DCTPI**: Detector-to-CTP Interface

**DIG** : Detector Interface Group

**DSP**: Digital signal Processor

**EB** : Extended barrel

**FEB**: Front end boards

**FIFO**: First In, First Out (memory)

**FPGA** : Field programmable gate array

**HOLA**: High-speed Optical Link for ATLAS

**L1A** : Level-1 Accept (Signal)

**LAN** : Local Area Network

**LArg** : Liquid Argon (calorimeter)

**LDC** : Link destination card

**LHC** : Large Hadron Collider (accelerator)

**LSC** : Link source card

**LTP** : Local trigger processor

**LVDS** : Low voltage differential signal

**MIPS** : Million Instructions per Second

**MUX** : Multiplexer (data)

**OC** : Output controller (FPGA)

**ODIN** : Optical Dual G-LINK

**PCB**: Printed circuit board

**PU**: Processing unit

**ROB** : Read-out Buffer

**RX** : Link Receiver

**SBC**: Single Board Computer

**SDRAM**: Synchronous Dynamic Random Access Memory

**TBM**: Trigger and Busy Module

**TDR** : Technical design report

**TTC** : Timing, Trigger and Control (System)

**TTCex** : TTC encoder/transmitter

**TTCrx** : TTC receiver (chip)

**TTCvi** : TTC-VMEbus interface

**TX** : Link Transmitter

**VMEbus**: Versa Module Eurocard bus

# 9 References

[1] RD12 - Timing, Trigger and Control (TTC) Systems for LHC Detectors.

Reference: http://ttc.web.cern.ch/TTC/intro.html

[2] The I/O Dataformat for the TileCal Readout System (Jose Castelo)

Reference: http://ific.uv.es/tical/rod/doc/rod\_data\_format%20proposal.pdf

[3] ROD Processing Unit Performance (Jose Castelo).

Reference: http://documents.cern.ch/archive/electronic/other/agenda/a02281/a02281s2t8/ROD\_DSP\_performance.pdf

[4] TileCal ROD pages.

Reference: http://ific.uv.es/tical/rod

[5] Interface link pages.

Reference: http://hep.uchicago.edu/atlas/electr/electronics.html

[6] Digitizers system pages.

Reference: http://www.physto.se/~ker/designreview/dr.html

[7] Tilecal Electronics (CERN)

Reference: http://atlasinfo.cern.ch/Atlas/SUB\_DETECTORS/TILE/elec/electronics.html

[8] Agilent HDMP-1032/1034 Transmitter/Receiver Chip Set

Reference: http://literature.agilent.com/litweb/pdf/5968-5909E.pdf

[9] Agilent HDMP-1022/1024 Transmitter/Receiver Chip Set

Reference: http://literature.agilent.com/litweb/pdf/5966-1183E.pdf

[10] Progress on the Motherboard (April 2002). Author: Daniel Lamarra, U. Geneva

Reference: http://documents.cern.ch/AGE/current/fullAgenda.php?ida=a02524

[11] ROD Algorithm Performances using the DSP TMS320C64x.

Reference: http://documents.cern.ch/cgi-bin/setlink?base=atlnot&categ=Note&id=larg-2001-020

[12] Avoiding rate limitations in the staging of the LArg ROD system.

Reference: http://documents.cern.ch/cgi-bin/setlink?base=atlnot&categ=Note&id=larg-2001-020

[13] Timing, Trigger and Control, and Dead-time handling. Author: Ph. Farthouat

Reference: http://mclaren.home.cern.ch/mclaren/atlas/conferences/ROD/ttc.pdf

[14] TMS320C6202, Fixed-Point Digital Signal Processor.

Reference: http://focus.ti.com/docs/prod/productfolder.jhtml?genericPartNumber=TMS320C6202

[15] TMS320C6414, Fixed-Point Digital Signal Processor.

Reference: http://focus.ti.com/docs/prod/productfolder.jhtml?genericPartNumber=TMS320C6414

[16] LArg Front-end Optical Links.

Reference: http://atlas.web.cern.ch/Atlas/GROUPS/FRONTEND/larg\_links/

[17] LAr Front-end Optical Link Selection (from Jingbo Ye's web page)

Dual G-LINK. Reference: http://www.physics.smu.edu/~yejb/atlas/documents/link\_selection\_docs/dual\_glink\_final.doc

Single G-LINK. Reference: http://www.physics.smu.edu/~yejb/atlas/documents/link\_selection\_docs/single\_glink\_final.doc

[18] TM4plus1: Active S-LINK to VME64x transition module.

Reference: http://hsi.web.cern.ch/HSI/s-link/devices/tm4plus1/

[19] LArg ROD Demonstrator Board (DPNC196)

Reference: http://dpnc.unige.ch/LaMarra/DPNC196/

[20] Use of the Central Trigger Processor (CTP) and of the Timing, Trigger & Control System (TTC) for Timing and Calibration (R. Spiwoks)

Reference: http://press.web.cern.ch/Atlas/GROUPS/DAQTRIG/LEVEL1/ctpttc/meet\_tdaq\_121101.pdf



Figure 1 LArg ROD Module DATAFLOW Block Diagram (the new design)




Figure 2 TileCal ROD Module DATAFLOW Block Diagram redesigning the input part



Figure 3 TileCal ROD Module DATAFLOW Block Diagram (using LArgROD new design)





Figure 5 TileCal ROD Module DATAFLOW Block Diagram using ROD demonstrator Board + Custom transition module TM8Plus1

Figure 6 TileCal ROD Dataflow scheme 64 ROD modules (diagram labels: TileCal detector, 64 modules; 4 drawers/module; 154 ch/module)



Figure 7 TileCal ROD Partitions proposal 64 RODs



Figure 8 TileCal ROD Dataflow scheme 32 ROD modules



Figure 9 TileCal ROD Partitions proposal 32 RODs



Figure 10 Dataflow of the preferred solution using the new LArg motherboard

Figure 11 Preferred configuration of 4 TTC partitions in barrels (diagram label: TileCal detector, 64 modules)