Difference: Summer2016 (9 vs. 10)

Revision 10: 2016-05-19 - dkcira


2016 CMS-Caltech-CERN Summer Students

  dannyweitekamp@gmail.com
Supervisors: Jean-Roch Vlimant, Caltech. Travel dates:

Project: LHC Event Classification with LSTM-RNN


The typical use of a classifier in high energy physics analysis is to discriminate between two classes, be it in object identification (signal versus fakes) or event categorisation (signal versus background). The typical implementation uses a fixed number of high-level features of the object or event to be discriminated. Decision trees, in their simplest implementation or with various types of ensembling, are seldom substituted with less fashionable feedforward neural nets. For a given search or measurement, the model output is used as a discriminating variable in a cut-and-count or template-fit analysis.

A given physics process at the hard-scatter level often has a wide range of final signatures in the particle detector, by virtue of the multiple ways elementary particles can decay into stable counterparts. The number of observable objects (electrons, muons, photons, jets, b-jets, …) therefore naturally varies. Because the trained model has a fixed input and output size, a solution often adopted is to perform the analysis in separate categories or channels and combine the results into one final measurement. Another solution is to use features that are independent of the number of observable objects in the event (sums of transverse momenta, invariant masses, or other combinations). Using such combined high-level features entails a potential information loss that is hard to estimate, while splitting into multiple categories essentially means duplicating the analysis, which increases the workload and complications.

Natural language processing is a field of data science that has seen great improvement over the last decades using deep learning, thanks to the increase in computing power available for training models with very large numbers of parameters, and to the advent of long short-term memory (LSTM) cells in recurrent neural network (RNN) models.
Recurrent neural nets are fixed-size models that are trained on sequences of inputs, one element at a time. This makes them well suited to variable-size inputs such as texts made of varying numbers of words. Such models are used to extract and learn the context and meaning of text. LSTM cells make it possible to correlate information from inputs far apart in the input sequence, and they outperform regular RNNs in text processing.
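The key point above — a fixed number of parameters processing a sequence of arbitrary length — can be illustrated with a deliberately minimal, pure-Python LSTM cell. This is an illustrative sketch, not the implementation used in the project; the scalar input/hidden state and the weight initialisation are simplifications chosen for readability.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class LSTMCell:
    """A single LSTM cell with scalar input and scalar hidden state,
    kept minimal to show that the parameter count is fixed while the
    length of the input sequence is not."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # One (input weight, recurrent weight, bias) triple per gate:
        # forget (f), input (i), output (o), and candidate (g).
        self.w = {g: (rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5), 0.0)
                  for g in "fiog"}

    def step(self, x, h, c):
        gate = {}
        for g in "fio":
            wx, wh, b = self.w[g]
            gate[g] = sigmoid(wx * x + wh * h + b)
        wx, wh, b = self.w["g"]
        cand = math.tanh(wx * x + wh * h + b)
        c_new = gate["f"] * c + gate["i"] * cand  # long-term memory update
        h_new = gate["o"] * math.tanh(c_new)      # exposed hidden state
        return h_new, c_new

    def run(self, sequence):
        """Consume a sequence of any length with the same fixed weights;
        the final hidden state summarises the whole sequence."""
        h = c = 0.0
        for x in sequence:
            h, c = self.step(x, h, c)
        return h

cell = LSTMCell()
summary_short = cell.run([0.1, 0.2])             # 2-step event
summary_long = cell.run([0.1, 0.2, 0.3, 0.4])    # 4-step event, same model
```

The cell-state update `c_new = f * c + i * cand` is what lets the model carry information across distant steps of the sequence, which is the property that lets LSTMs outperform plain RNNs on long inputs.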

Instead of establishing various channels or high-level features in a high energy physics analysis for the aforementioned reasons, this technique should allow the classification to be performed across all signatures. We propose in this project to classify signal and background events of a high energy physics detector using an RNN with LSTM cells. This could be used in several ways depending on what we want to classify and on the observables chosen for training the model. We detail below a few possible angles as starting points. The event description often used in analyses is in terms of leptons, missing energy (neutrinos) and jets (hadrons). Jets are aggregations of multiple particles, constructed as an attempt to collect all particles from the decay chain of the partons originating from the hard scatter and therefore to approximate their kinematics. Particle flow reconstruction is a method that aims at having an individual object for each stable particle traversing the detector, and is therefore a more granular representation of the event. De facto, in most CMS analyses jet objects are constructed from the aggregation of particle flow objects.
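One way to picture feeding such events to a recurrent classifier is to turn each event's variable-length list of reconstructed objects into a sequence of fixed-width feature vectors. The sketch below is hypothetical: the feature layout (pT, eta, phi, object-type id) and the pT ordering are assumptions for illustration, not the CMS data format or the project's chosen representation.

```python
# Hypothetical event-to-sequence preprocessing for a recurrent classifier.
OBJECT_TYPES = {"electron": 0, "muon": 1, "photon": 2, "jet": 3, "b-jet": 4}

def event_to_sequence(objects):
    """objects: list of (type_name, pt, eta, phi) tuples, any length.
    Returns a pT-ordered sequence of fixed-width feature vectors; the
    sequence length varies event by event, which is exactly the kind of
    input an RNN/LSTM can consume without per-channel models."""
    seq = [(pt, eta, phi, float(OBJECT_TYPES[t]))
           for t, pt, eta, phi in objects]
    seq.sort(key=lambda feat: -feat[0])  # hardest objects first
    return seq

# Two events with different object multiplicities map to sequences of
# different lengths but identical per-step width:
ev1 = [("muon", 45.0, 0.1, 1.2), ("jet", 80.0, -1.0, 2.5)]
ev2 = [("electron", 30.0, 0.5, -0.3)]
assert len(event_to_sequence(ev1)) == 2
assert len(event_to_sequence(ev2)) == 1
assert event_to_sequence(ev1)[0][0] == 80.0  # jet leads after pT sort
```

With particle-flow reconstruction, the same scheme applies at finer granularity: the sequence elements would be individual particle-flow objects rather than jets and leptons, at the cost of much longer sequences.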

 

Ben Bartlett

bartlett@caltech.edu

 