Recruitment
Welcome! Thank you for your interest in working with me.
I am generally interested in all kinds of machine learning problems, and I’d be thrilled to work with you if you reach out to me with your favorite research topics. To apply, please send your application to [email protected]. Your application should include:
- your up-to-date CV or resume;
- one to three contacts who could provide reference letters if needed;
- what you’d like to work on, and why;
- a short description of your background and how it has prepared you for your research interests.
Of course, you might be a technically strong student who has a genuine interest in research but doesn’t know where to start. In that case, I highly recommend thinking about the following projects that I am currently working on.
Event Sequence Modeling
Events are everywhere. They include:
- Medical events. Each patient has a sequence of doctor’s visits, tests, diagnoses, and medications.
- Consumer events. Each online consumer has a sequence of online views and purchases.
- Life events. Some people use smart devices to record their eating, traveling, walking, and sleeping.
- Social media events. Facebook and Twitter users generate posts, shares, comments, and messages.


Two examples of event sequences in the medical (left) and educational (right) domains.
I build neural probabilistic models for sequences of events, with which one could predict what events will happen in the future and when they will happen. For example, one may probabilistically predict a patient’s prognosis, eventual diagnosis, and treatment cost based on their symptoms and treatments so far.
My past work includes flexible models (neural Hawkes process & neural Datalog through time) and efficient algorithms (particle smoothing & noise contrastive estimation).
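For readers new to the area, the sketch below shows the classical (non-neural) multivariate Hawkes process with an exponential kernel, which is the textbook starting point that neural variants generalize by learning the intensity from data. It is an illustration only, not the models from the work cited above; the names `Event`, `intensity`, `mu`, `alpha`, and `beta` are assumptions made for this example.

```python
import math
from typing import List, Sequence, Tuple

# An event is (occurrence time, event type index). This encoding is an
# assumption of the sketch, not a format prescribed by the page.
Event = Tuple[float, int]


def intensity(
    t: float,
    history: Sequence[Event],
    mu: Sequence[float],
    alpha: Sequence[Sequence[float]],
    beta: float,
) -> List[float]:
    """Classical Hawkes intensity of every event type at time t.

    lambda_k(t) = mu[k] + sum over past events (t_i, k_i) with t_i < t of
                  alpha[k_i][k] * exp(-beta * (t - t_i))
    """
    rates = list(mu)  # base rate of each type
    for t_i, k_i in history:
        if t_i < t:  # only strictly earlier events excite the process
            decay = math.exp(-beta * (t - t_i))
            for k in range(len(rates)):
                rates[k] += alpha[k_i][k] * decay
    return rates


# Hypothetical two-type example: type 0 = "doctor's visit", type 1 = "test".
history = [(0.0, 0), (0.5, 1)]
mu = [0.1, 0.2]
alpha = [[1.0, 0.5], [0.0, 0.3]]
rates_now = intensity(1.0, history, mu, alpha, beta=1.0)
```

Each entry of `rates_now` is the instantaneous rate of one event type, so the model answers both "what" (which type has the highest rate) and "when" (how large the total rate is).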
Future research directions might be:
- breaking the limitation of autoregressive neural models (what are they?);
- exploring data augmentation techniques (what would be the challenges?);
- faster inference (why is the current rejection-sampling method slow?);
- …
Maybe you can even come up with your own research questions in this area?
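As a hint for the inference question above, here is a minimal sketch of thinning, a standard rejection-sampling scheme for point processes, assuming that is the kind of method the question refers to. It uses a classical exponential-kernel Hawkes intensity as a stand-in for a neural model; `hawkes_intensity`, `sample_next_event_time`, and the parameter values are hypothetical names and numbers chosen for illustration.

```python
import math
import random
from typing import Sequence, Tuple


def hawkes_intensity(
    t: float, event_times: Sequence[float],
    mu: float = 0.5, alpha: float = 0.8, beta: float = 1.0,
) -> float:
    """Intensity at time t given strictly earlier events."""
    return mu + sum(alpha * math.exp(-beta * (t - s)) for s in event_times if s < t)


def sample_next_event_time(
    t_now: float, event_times: Sequence[float], rng: random.Random,
    mu: float = 0.5, alpha: float = 0.8, beta: float = 1.0,
) -> Tuple[float, int]:
    """Draw the next event time by thinning; also return the proposal count."""
    t = t_now
    proposals = 0
    while True:
        # With an exponential kernel the intensity only decays between
        # events, so its value just after t upper-bounds all later values.
        upper = mu + sum(
            alpha * math.exp(-beta * (t - s)) for s in event_times if s <= t
        )
        t += rng.expovariate(upper)  # propose from a homogeneous process
        proposals += 1
        # Accept with probability intensity(t) / upper; otherwise try again.
        if rng.random() * upper <= hawkes_intensity(t, event_times, mu, alpha, beta):
            return t, proposals


rng = random.Random(0)
next_time, num_proposals = sample_next_event_time(1.0, [0.2, 1.0], rng)
```

Every proposal, accepted or not, costs one intensity evaluation. With a neural model that evaluation is a network computation, and a loose upper bound means many rejected proposals per accepted event.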
Event-Based Reinforcement Learning
Predictive models may help decision making. In an interactive environment, an intelligent agent may act more wisely if it can predict what will happen, and when, in response to each action that it may take.
In the medical scenario shown below, an intelligent assistant reads the previous electronic health records and predicts the patient’s future condition: in this case, severe illness and a high-risk surgery. It then suggests the right clinical measurements (to gather more information) and treatments (for improvement) at the right times, aiming to cure or alleviate the illness and make the high-risk surgery unnecessary.

Similar intelligent agents can also benefit other application domains such as education and social media.
I work on general techniques that can use reinforcement learning (RL) to enable such agents. What motivates novel RL methods is that we have to consider timings in those applications:
- when reading previous events, the agent needs to take their occurrence times into account;
- when predicting future events, the agent needs to predict when they will happen;
- when taking actions, the agent needs to know when to take them; e.g., clinical treatments that are provided too early or too late may not help and may only waste resources.
Novel Transformer Models
Transformer architectures have shown astonishing performance in a wide range of machine learning tasks such as machine translation, few-shot learning, and multi-modal learning. A large body of research has sought to improve their efficacy and efficiency.
I am generally interested in whether any of the following techniques would help a Transformer model learn and generalize better:
- look-ahead: how each possible word may (re-)shape the model distribution over future possible words;
- uncertainty quantification: how uncertain the model is about its predictions;
- …
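As one simple reading of the uncertainty-quantification item above, the sketch below computes the entropy of a model's next-word distribution from its output logits. This is only a basic proxy for uncertainty and is not claimed to be the technique the project would use; `softmax` and `predictive_entropy` are names introduced for this example.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def predictive_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of the predicted distribution.

    Zero means the model puts all mass on one word; log(vocabulary size)
    means it is maximally unsure.
    """
    probs = softmax(logits)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical logits over a four-word vocabulary.
unsure = predictive_entropy([0.0, 0.0, 0.0, 0.0])
confident = predictive_entropy([10.0, 0.0, 0.0, 0.0])
```

A uniform distribution gives the largest entropy and a sharply peaked one gives a value near zero, so `unsure` exceeds `confident` here.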