

## FoCal Readout and Trigger

K. Oyama

Nagasaki Institute of Applied Science



## FoCal PAD Readout Scheme (review)

■ Developments finished by Grenoble






- An aggregator FPGA processes data from four 5-pad boards (20 HGCROCs)
  - zero suppression & formatting
  - trigger sum
  - control



HGCROC data bandwidth (1.28 Gbps x 40 = 51.2 Gbps) is compressed and sent over 2 GBT links (3.2 Gbps x 2) by:

- trigger (up to 200-500 kHz)
  - $\rightarrow$  32bit x 72ch x 20 HGCROCs x 500 kHz = 23 Gbps
- zero suppression (1-12%, avg. < 10%)
  - $\rightarrow$  2.3 Gbps ··· ok for up to ~25% avg. occupancy
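The arithmetic above can be reproduced with a minimal sketch. All numbers come from the slide; the function name and the use of the raw link rate as capacity (no protocol overhead) are assumptions made for illustration.

```python
# Bandwidth budget for one aggregator, using the numbers quoted on the slide.
BITS_PER_CHANNEL = 32
CHANNELS_PER_HGCROC = 72
HGCROCS_PER_AGGREGATOR = 20
GBT_LINK_GBPS = 3.2   # per GBT link
N_GBT_LINKS = 2


def triggered_rate_gbps(trigger_rate_hz, occupancy=1.0):
    """Data rate in Gbps after triggering and (optionally) zero suppression."""
    bits_per_event = BITS_PER_CHANNEL * CHANNELS_PER_HGCROC * HGCROCS_PER_AGGREGATOR
    return bits_per_event * trigger_rate_hz * occupancy / 1e9


full_rate = triggered_rate_gbps(500e3)                        # no zero suppression
suppressed_rate = triggered_rate_gbps(500e3, occupancy=0.10)  # 10% avg. occupancy
link_capacity = GBT_LINK_GBPS * N_GBT_LINKS
max_occupancy = link_capacity / full_rate                     # raw ratio, no overhead

print(f"triggered, no ZS : {full_rate:.2f} Gbps")
print(f"with 10% ZS      : {suppressed_rate:.2f} Gbps")
print(f"max avg occupancy: {max_occupancy:.1%} of {link_capacity} Gbps")
```

The raw ratio comes out slightly above the ~25% quoted on the slide, which is a rounded and somewhat more conservative figure.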





## Pad Readout Scheme (questions)

put the aggregator away (~5m) from the detector and pull copper cables

#### **Concerns during prototype developments**

- Can the FPGA operate in the radiation area?
  - If not, there are several solutions
    - replace the aggregator FPGA with an ASIC (either an existing ASIC or one we develop ourselves)
    - choose a rad-hard FPGA and implement simpler logic

Table 1: Comparison of Xilinx Space-Grade FPGAs

|                        | Virtex-4QV<br>XQRV4QV | Virtex-5QV<br>XQRV5QV | RT Kintex UltraScale<br>XQRKU060 |
|------------------------|-----------------------|-----------------------|----------------------------------|
| Radiation Hardness     | Tolerant              | Hard                  | Tolerant                         |
| Process (nm)           | 90                    | 65                    | 20                               |
| Memory (Mb)            | 4.1 to 9.9            | 12.3                  | 38                               |
| System Logic Cells (K) | 55 to 200             | 131                   | 726                              |
| CLB Flip-Flops (K)     | 49.1 to 178.1         | 81.9                  | 663                              |
| CLB LUTs (K)           | 49.1 to 178.1         | 81.9                  | 331                              |
| Transceivers           | None                  | 18 at 3.125Gb/s       | 32 at 12.5Gb/s                   |
| User I/O               | 640 to 960            | 836                   | 620                              |
| DSP Slices             | 32 to 192             | 320                   | 2,760                            |

https://japan.xilinx.com/products/silicondevices/fpga/rt-kintex-ultrascale.html#radiation

- Can we read out at occupancies higher than 10% (such as pA) at 500 kHz? Similarly, can we read out pp at min. bias (1 MHz) for the pads?
  - Switching to lpGBT (10.24 Gbps) gives 3 times higher bandwidth
  - HGCROC data at 1 MHz pp → 32bit x 72ch x 20 HGCROCs x 1 MHz x 10% = 4.6 Gbps ··· OK!
  - With one lpGBT, the max. tolerated avg. occupancy is $10.24/46 \approx 22\%$
  - We don't need to stick to grouping by 4 layers $\rightarrow$ (example) one lpGBT per 3 layers $\rightarrow$ 30% is tolerated
  - There is no reason not to use lpGBT anyway, because it is the successor to GBT as the standard at CERN experiments
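As a cross-check of the occupancy limits quoted above, the following sketch computes the maximum tolerated average occupancy per lpGBT as a function of the number of layers grouped onto one link. The 5 HGCROCs per layer follow from the four 5-pad boards = 20 HGCROCs stated earlier; the function name is hypothetical and protocol overhead is ignored.

```python
# Maximum average occupancy one lpGBT can sustain, for a given layer grouping.
LPGBT_GBPS = 10.24
HGCROCS_PER_LAYER = 5          # four 5-pad boards = 20 HGCROCs
BITS_PER_CHANNEL = 32
CHANNELS_PER_HGCROC = 72


def max_avg_occupancy(layers_per_link, trigger_rate_hz=1e6, link_gbps=LPGBT_GBPS):
    """Ratio of link capacity to the unsuppressed data rate (capped at 1)."""
    bits_per_event = (BITS_PER_CHANNEL * CHANNELS_PER_HGCROC
                      * HGCROCS_PER_LAYER * layers_per_link)
    unsuppressed_gbps = bits_per_event * trigger_rate_hz / 1e9
    return min(1.0, link_gbps / unsuppressed_gbps)


for layers in (4, 3):
    print(f"{layers} layers per lpGBT: {max_avg_occupancy(layers):.0%} tolerated")
```

For 4 and 3 layers per link at 1 MHz this gives roughly 22% and 30%, matching the figures on the slide.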



## PAD readout concerns (summary)

- (a) Present development by Grenoble + switch to lpGBT
  - radiation concern
  - rad-hard FPGA (minimum function) + SCA → ok
- (b) Put the aggregator away
  - Signal integrity over 3960 long copper cables at 1.28 Gbps remains to be solved
- (c) Introduce an ASIC for data reduction and trigger sum
  - CMS ECON-D and ECON-T ASICs are candidates?
  - Original ASIC → long development time?
  - ECON-D has little advantage compared to (d) in terms of the number of lpGBTs
    - With our own ASICs, it can be similar to (a)
- (d) Neither FPGA nor ASIC for aggregation; use only lpGBT
  - Many optical fibers and many CRUs
- An additional FPGA for (c) and (d) can be considered to reduce the number of CRUs, but may not give an advantage in terms of cost





## Using ECON-D and ECON-T ASICs?

#### **Front-end Architecture**

**System Overview** 

#### **Common to silicon and SiPM parts**

**40 MHz trigger** data: ECON-T aggregates, selects/compresses, serializes, and transmits to lpGBT

**750 kHz DAQ** data: On L1 accept, ECON-D applies zero suppression, aggregates, serializes, and transmits to lpGBT

#### **HGCAL** specific:

Front-end ASICs HGCROC (Si, SiPM)

Concentrators: ECON (T, D)

Hexaboard, Motherboard (Engine and Wagon)

Tileboard, Motherboard, wingboard

#### Generic developments:

lpGBT, VTRX+, SCA

also FEAST, BPOL



- Basic use case by CMS: 12 eRX → 6 eTX → 1 lpGBT
- However, it is also possible to enable/disable individual eTX according to the data volume
  - Using two eTX is safe (keep one as option)
    - eLink reduction factor is 12→2 (6) or 10→2 (5) depending on our layout
  - Necessary number of lpGBTs is either 3960/5/6 = 132 (6 CRU) or 3960/6/6 = 110 (5 CRU)
- Data reduction by
  - zero suppression (auto-corrected threshold)
- All configuration through GBT-SCA
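The link counts above can be sketched as follows. The 3960 eLinks and the divisor of 6 eLinks per lpGBT are taken from the slide; the 24 links per CRU is an assumption inferred from the quoted CRU counts, and the function name is hypothetical.

```python
import math

N_ELINKS = 3960          # HGCROC data eLinks, from the slide
ELINKS_PER_LPGBT = 6     # divisor used on the slide
LINKS_PER_CRU = 24       # assumption: inferred from "132 -> 6 CRU, 110 -> 5 CRU"


def lpgbt_and_cru_count(reduction_factor):
    """Return (number of lpGBTs, number of CRUs) for an ECON-D eLink reduction factor."""
    econ_outputs = math.ceil(N_ELINKS / reduction_factor)
    n_lpgbt = math.ceil(econ_outputs / ELINKS_PER_LPGBT)
    n_cru = math.ceil(n_lpgbt / LINKS_PER_CRU)
    return n_lpgbt, n_cru


for label, factor in (("10 -> 2", 5), ("12 -> 2", 6)):
    n_lpgbt, n_cru = lpgbt_and_cru_count(factor)
    print(f"{label}: {n_lpgbt} lpGBTs, {n_cru} CRUs")
```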





## Possible readout scheme with ECON-D

- To use all channels, feeding 12 channels (6 HGCROCs) into one ECON-D is best
  - since we have 18 layers, 3 layers x 6 groups is best
  - but routing in the Z direction may not be trivial
- Maybe the most natural choice is one ECON-D per layer



- 3 full layers read out by a single lpGBT ··· maybe this is the best with enough safety margin
- the other option is to read out 6 layers with one lpGBT, if it is really safe in terms of occupancy

- HGCROC maximum trigger rate: 1 MHz (sustained average)
  - Its buffer is long enough → latency of the CTP trigger is no problem
- ALPIDE (Pixel) trigger rate is limited
  - It has only three event buffers inside, and no busy output (busy protection impossible)
    - Avoid busy violation by limiting the individual readout rate of each strip
    - A busy violation flag is available → the RU may detect it and put it in the data stream
    - Assumption is that we limit the readout rate to  $\sim \! 100 \ \text{kHz}$  (SystemC simulation by Max will give us a more precise and safe number)  $\cdots$  see Dieter's presentation
    - Timing of the trigger to ALPIDE is crucial (CTP LM)



## ALPIDE requirements for triggered mode

- ITS2 requires LM (Level minus one, designed for TRD) arrival at  $\sim 1.2 \mu s$ 
  - distribution via CRU doesn't fulfill the timing requirement (2.3  $\mu$ s)
  - ITS2 chose to send the CTP LM directly to the sensor through the RU
  - for details please see below (thanks Johan for providing this)
    - https://indico.cern.ch/event/580057/contributions/2382306/attachments/1390715/2135376/WP10\_EDR\_Timing\_v6.pdf
    - https://indico.cern.ch/event/580057/contributions/2382324/attachments/1394559/2131547/WP10\_EDR\_trigger\_distribution\_v3.pdf
- in the FoCal case, we need to think carefully about how to trigger PIXEL, and with which trigger detector
  - only FIT will contribute to LM (hadronic only)
  - UPC physics needs ZDC as trigger input, but ZDC can't do LM
  - Is a FoCal self-trigger the only solution?
  - What about other physics?





■ Event interval at 1 MHz min. bias operation can be below 1  $\mu$ s (40 BC)







For the second event, the first and second events overlap, and we have no way to distinguish them

we can flag or reject



We can associate and remove green signals from red events (better if we know offline that green is there)



## Past-future protection at pad level

■ At trigger level, we can reject events with large overlap of physics signals in the same pad





- ← Clean event without any problem
- → Event highly contaminated with past and future events
- At trigger level we can reject those events or put a flag (raw data stream)
- CTP "may" reject events. So, we need to handle this at trigger processor
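A minimal sketch of the flagging step, simplified to timing only: an event is flagged when another event lies within a protection window before or after it. The window length and the function name are assumptions for illustration, and the per-pad overlap check mentioned above is not modelled.

```python
def flag_past_future(event_bcs, window_bc):
    """Flag events that have a neighbour closer than window_bc bunch crossings.

    event_bcs must be sorted bunch-crossing numbers of the triggered events.
    Returns one boolean per event: True means contaminated (flag or reject).
    """
    flags = []
    for i, bc in enumerate(event_bcs):
        past_close = i > 0 and bc - event_bcs[i - 1] < window_bc
        future_close = i + 1 < len(event_bcs) and event_bcs[i + 1] - bc < window_bc
        flags.append(past_close or future_close)
    return flags


# Hypothetical example: 40 BC (1 us) window, four events.
events = [0, 100, 120, 400]
print(flag_past_future(events, window_bc=40))  # [False, True, True, False]
```

Whether a flagged event is rejected or only marked in the raw data stream is a policy choice for the trigger processor.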





## ALPIDE triggering idea

- GLOBAL trigger for all PAD, HCAL and PIXEL
  - PAD and HCAL are fully read out with this (up to 1 MHz depending on Aggregator design)
- ROI (region of interest) mask additionally for PIXEL
  - only strips with condition (GLOBAL && ROI) == 1 are read out (max rate: < 100 kHz)
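The readout condition can be sketched as a simple mask operation. The representation of the ROI as a list of booleans, one per pixel strip, and the function name are assumptions for illustration only.

```python
def strips_to_read(global_trigger, roi_mask):
    """Return indices of pixel strips that satisfy (GLOBAL && ROI) == 1.

    global_trigger: bool, the GLOBAL trigger decision for this event.
    roi_mask: sequence of bools, one entry per strip (hypothetical layout).
    """
    if not global_trigger:
        return []
    return [index for index, in_roi in enumerate(roi_mask) if in_roi]


# PAD and HCAL are read out on GLOBAL alone; PIXEL strips also need the ROI bit.
print(strips_to_read(True, [False, True, True, False]))   # [1, 2]
print(strips_to_read(False, [False, True, True, False]))  # []
```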





## Trigger very basic concept

■ FPGA-based trigger processor for analyzing the HGCROC trigger stream

- GLOBAL: to CTP as L0 contribution input (min.bias,  $\pi^0$ ,  $\gamma$ , jet, UPC, ...)
- pretrigger: GLOBAL && ROI mask information only for PIXEL
- Datastream for trigger and past-future information





- CMS trigger requirement: L1 at 750 kHz, 12.5 µs latency
- ALICE (ALPIDE) trigger requirement: pre-trigger at 1.2 µs latency
- ECON-T super trigger cell algorithm latency is 300 ns ··· good!
- Port use is selectable, depending on algorithm to use
- We need measurement and simulation

#### Note:

- HGCROC fast control comes directly from lpGBT, and slow control of everything uses SCA
  - no extra clock recovery (PLL)
- ALICE TPC (with SAMPA) also has the same concept (and working well)





■ Super trigger cell (STC) algorithm is similar to what we are thinking for FoCal

- STC16 (16 TC into one sum): 16x7 bits  $\rightarrow$  4E+3M+4A (11 bits)
  - reduction factor of 11/112 ~ 0.1
  - 12 eRX into 2 eTX might be possible (careful check needed)
  - assuming so:
    - we can reduce 3960 HGCROC trigger output eLinks to 660 eLinks
    - Since latency is low, maybe we can collect them using lpGBT
      - 660/6 = 110 lpGBTs
      - We may not connect "entire" pad layers; assuming only 4 layers (4/18), 110 can be reduced to 25 lpGBTs
        - → feasible to inject into one FPGA for the trigger processor
        - but we lose charge collection statistics and energy resolution if we want to trigger on showers or jets
        - for min. bias trigger, it might be ok
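The STC16 numbers above can be checked with a short sketch. It reads 4E+3M+4A as 4 exponent, 3 mantissa and 4 address bits, which is an interpretation of the slide's shorthand; the variable names are hypothetical.

```python
import math

# STC16: sixteen 7-bit trigger cells are replaced by one 11-bit word.
TC_BITS = 7
TCS_PER_STC = 16
STC_BITS = 4 + 3 + 4          # assumed: exponent + mantissa + address

reduction = STC_BITS / (TC_BITS * TCS_PER_STC)   # 11/112

# eLink and lpGBT counts, assuming 12 eRX -> 2 eTX works for ECON-T.
N_TRIGGER_ELINKS = 3960
elinks_after_econ = N_TRIGGER_ELINKS * 2 // 12   # 660
lpgbts_all_layers = elinks_after_econ // 6       # 110
lpgbts_4_of_18 = math.ceil(lpgbts_all_layers * 4 / 18)

print(f"bit reduction factor : {reduction:.3f}")
print(f"eLinks after ECON-T  : {elinks_after_econ}")
print(f"lpGBTs, all 18 layers: {lpgbts_all_layers}")
print(f"lpGBTs, 4 layers only: {lpgbts_4_of_18}")
```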



#### Layout consideration with ECON-T

- STC16 (16 trigger cells into one sum)
  - 1 HGCROC has 8 trigger cells
     → STC16 combines two HGCROCs into one sum
  - two HGCROCs in the same tower are better, also to keep position granularity in the trigger algorithm (pattern b instead of a)
  - a group has 2 layers, instead of 3 layers for ECON-D
- Need to check with pixel
  - In the IB configuration, 3 ALPIDEs corresponding to 1 HGCROC geometry can be triggered individually







### Aggregator ver.2 with FPGA?

- Replace ETH for control and monitoring with GBT-SCA
- Replace GBT with lpGBT
- Limit the FPGA functionality to reduce the radiation cross section
- Add a function to recover the FPGA configuration during a run





## Possible layout of Aggregator 2





## To do as summary and comparison

- Aggregator ver.2 design ··· we may go with keeping different solutions
- 1. ECONs
  - obtain and evaluate ECON-T first then ECON-D, probably with Saga setup?
    - 300 ECON-T packaged available ··· some can be given to ALICE
    - ECON-T more production in a year and ECON-D production submission done
       (for final production, cost for wafer: \$4600 for 200 chips → \$2.4 /chip (+ packaging))
  - detailed discussion with CMS people and technical information transfer needed
  - can't expect full support for board design, debugging, operation
- 2. (rad-hard) FPGA
  - investigation of rad-hard type or testing normal FPGA in radiation?
  - other than that, the easiest and quickest solution
- 3. own digital ASIC
  - VDEC framework of the University of Tokyo can be used
  - experienced people are there (Tsukuba, Saga, Nagasaki, and …)
  - problem is development time
  - not fully reconfigurable (except for parameters)
  - very good to have this experience for our future experiments
  - start prototyping by earning a grant (it's also an investment for the future)
- 4. no-aggregation
  - cost? ··· we need anyways more FPGAs for trigger data collection and more CRU for data reception

- Trigger processor
  - design fully depends on the readout solution for the pads
  - looking for an FPGA board with "many" inputs
  - test implementation of logic
  - simulation of trigger performance



backup



## Busy violation

- ALPIDE has a buffer depth of three events, w/o busy protection; data size depends on occupancy
- too high a trigger rate will cause busy violation (trigger received while the buffers are full)
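To illustrate how busy violations depend on trigger rate, here is a toy Monte Carlo. It is a sketch under strong assumptions: Poisson triggers, three buffers drained one at a time, and a fixed per-event readout time that is purely hypothetical. It does not replace the SystemC simulation and does not model the real ALPIDE readout.

```python
import random
from collections import deque


def busy_violation_fraction(trigger_rate_hz, readout_time_s,
                            n_triggers=100_000, n_buffers=3, seed=1):
    """Fraction of triggers that arrive while all event buffers are occupied."""
    rng = random.Random(seed)
    now = 0.0
    done_times = deque()   # readout completion times of buffered events (FIFO)
    violations = 0
    for _ in range(n_triggers):
        now += rng.expovariate(trigger_rate_hz)   # Poisson trigger arrivals
        # Release buffers whose readout has finished by now.
        while done_times and done_times[0] <= now:
            done_times.popleft()
        if len(done_times) >= n_buffers:
            violations += 1                        # buffer full but trigger received
            continue
        # Readout is sequential: start after the previous event is done.
        start = max(now, done_times[-1]) if done_times else now
        done_times.append(start + readout_time_s)
    return violations / n_triggers


# Hypothetical readout time of 5 us per event, scanned over trigger rate.
for rate_khz in (50, 100, 200, 500, 1000):
    fraction = busy_violation_fraction(rate_khz * 1e3, readout_time_s=5e-6)
    print(f"{rate_khz:5d} kHz: {fraction:.3%} busy violations")
```

Since the real data size depends on occupancy, a fixed readout time is the crudest possible choice; drawing it from an occupancy distribution would be a natural next step.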





## Detailed trigger scheme (still an idea)



**advantage:** full flexibility on choice of global trigger / ROI elements (matter of wiring and firmware)



## ALPIDE triggering idea

- SystemC simulation will soon tell us how often we can trigger ALPIDE safely
- assuming ~100 kHz for pp is a safe limit
  - for GLOBAL physics triggers, an aggregated rejection factor of 5-10 will be safe with ROI
    - assuming 3-5 different triggers, each individual trigger may need a rejection factor better than 10-20
- additionally
  - limit the readout region for ALPIDE using the HGCROC trigger stream → ROI: region of interest
  - with a sufficiently low threshold on charge





- supports triggered and continuous modes
- continuous mode is equivalent to periodic trigger with long gate (ITS uses only this for physics)
  - fake hits, noise and background (beam-gas) ··· needs study
  - at FoCal, impossible to distinguish pile-up (minimum event interval can be much below 1 μs)



S. V. Nesbo et al., "System simulations for the ALICE ITS detector upgrade", EPJ Web of Conferences 245, 02011 (2020), https://bora.uib.no/bora-xmlui/bitstream/handle/11250/2731420/epjconf\_chep2020\_02011.pdf








### Review: our present aggregator

- One HGCROC has two 1.28 Gbps eLinks for data output
- Four 5-pad layers are to be read out by the aggregator
  - 20 HGCROCs = 40 eLinks
  - concentrated into 1 GBT link of 3.2 Gbps
  - wire reduction factor 1/40
  - using lpGBT (10 Gbps), it can be better than 1/100
  - method: trigger, Z.S., and link speed
- Similarly for trigger
- Problem is radiation

