Excellence Scholarships – MSc

IVADO funding program for excellence scholarships – MSc

IVADO’s commitment to equity, diversity and inclusion and note to applicants
To ensure all members of society draw equal benefit from the advancement of knowledge and opportunities in data science, IVADO promotes equity, diversity and inclusion through each of its programs. IVADO aims to provide a recruitment process and research setting that are inclusive, non-discriminatory, open and transparent.

Program description

  • Field of study: The IVADO Scholarship Funding program supports research on the issues raised in the Canada First funding competition: data science in a broad sense, including methodological research in data science (machine learning, operations research, statistics) and its application in a range of sectors, including IVADO's priority sectors (health, transportation, logistics, energy, business and finance) as well as any other application sector (sociology, physics, linguistics, engineering, etc.).
  • Amount of award and grant period: $20,000 per year for a maximum of 6 sessions (2 years)
  • Opening of the application process: January 2020
  • Application deadline: February 2020
  • Expected results notification date: April 2020
  • Criteria: See the description tab
  • Submission: See the submission tab

Program objectives

The goal of the excellence scholarship program is to support promising students in their training as future highly qualified personnel (researchers, professors, professionals) and more generally, future actors in the field of data science, mainly in IVADO members’ areas of excellence: operations research, machine learning, decision sciences.

Eligibility

  • Scholarship applicants must:
    • have already earned their Bachelor's degree prior to the application date, or intend to earn it by the date on which the competition results are announced. IVADO will be flexible with applicants who provide an adequate explanation for a career interruption or particular circumstances (e.g. pregnancy/maternity or sick leave); this explanation must be included in the application;
    • intend to attend HEC Montréal, Polytechnique Montréal, Université de Montréal, McGill University or University of Alberta;
    • have a first-class minimum average grade (3.7/4.3 or 3.5/4.0) over the previous years of study.
  • Professor (supervisor) applicants must:
    • hold a faculty position as a professor at HEC Montréal, Polytechnique Montréal or Université de Montréal; professors at the University of Alberta and McGill University may act as supervisors provided they are full members of IVADO (Mila, CIRRELT, GERAD, CERC Data Science, CRM, Tech3Lab, AMII);
    • submit only one application to the competition.

Funding period

The funding period starts in April 2020.

Amounts and terms

The funds shall be transferred to the office of research of the supervisor's university, and the university shall pay the student according to its own compensation rules. For projects that require ethics approval, the funds shall only be paid out once the approval is granted. Some projects may require specific agreements (e.g. pertaining to intellectual property).

Funding may be cut, withheld, delayed or rescinded under the circumstances outlined in the letter of award.

Competitive process

Review and criteria

Applications shall first be reviewed for compliance with program rules; for example, applications that are incomplete, exceed the page limit or list an ineligible applicant or supervisor will be rejected. Only the applications that meet all criteria will be forwarded to the review committee.

The parity-based review committee shall be made up of university professors who are not listed as supervisors by any applicant. However, given the small size of the communities in certain areas, it may prove difficult to select expert reviewers who are not included in an application submitted to the competition. In such cases, a reviewer may be required to assess an application despite being listed in another application as a supervisor. An external reviewer may also join the committee. The committee shall ensure by all possible means that such a reviewer does not influence the ranking of the application in which he/she is included.

The review committee will first check the alignment between the research project and IVADO's scientific direction, then rank the applications based on excellence, as well as on the project's alignment with IVADO's overarching framework, which aims to promote multidisciplinary collaboration and diversity in data science.

In terms of excellence, the committee will specifically assess:

  • Research ability
  • Depth and breadth of experience: multidisciplinary and professional experiences, extra-academic activities, collaborations, contributions to the scientific community and society as a whole, etc.
  • Expected fit with the proposed project

Final step and commitments

The student shall:

  • be physically present at his/her supervisor’s university;
  • contribute to IVADO’s community and activities by, for example, taking part in:
    • presentations on his/her research;
    • training and knowledge dissemination activities;
    • consultations;
    • activities generally undertaken by career researchers (mentorship, assessment, co-organization of events, etc.);
  • recognize that he/she is a member of an academic community to which he/she shall contribute;
  • comply with the Tri-Agency Open Access Policy on Publications. Students are encouraged to publish their research findings (papers, recordings of presentations, source code, databases, etc.) in compliance with the intellectual property rules that apply to their own specific case;
  • acknowledge the financial support granted by IVADO and the CFREF or FRQ when disseminating research results and, more broadly, in all the activities in which he/she takes part.

The supervisor shall:

  • provide a work environment that is conducive to the completion of the project
  • oversee the work of the student

FAQ

  • Is there a particular format for preparing a CV?
    • No, there is no particular format that needs to be followed. However, each element of the application should help the reviewer form an opinion on the file. A CV that is too long or confusing may make evaluation more difficult.
  • Are there any specific rules for the recommendation letter?
    • No, there are no specific rules for the recommendation letter.
  • Can candidates send recommendation letters themselves?
    • No, recommendation letters must be uploaded to the platform directly by their authors.
  • Can I send my unofficial transcript?
    • No, you must upload your official transcript, including all your current results. Originals or certified copies must be scanned and uploaded to the application; for non-Canadian universities, you must specify the grading scale.

Didn’t find what you were looking for? Send us an e-mail.

Please apply through: https://ivado.smapply.io/

All applications will contain:

  • a questionnaire to be completed on the platform, including a plain-language description of the project (maximum length of one page);
  • a CV (free format) to be uploaded;
  • Master's transcripts (as well as information on the grading scale when the transcript is issued by a non-Canadian university);
  • two recommendation letters, including one uploaded directly by the proposed supervisor.

Results

Larry Dong (McGill University, Erica Moodie)

When making decisions, medical professionals often rely on past experience and their own judgment. However, an individual decision-maker often faces a situation that is unfamiliar to him or her. An adaptive treatment strategy (ATS) can help such biomedical experts in their decision-making: it is a statistical representation of a decision algorithm for a given treatment that optimizes patient outcomes. ATSs are estimated from large amounts of data, but such data sources may be subject to unmeasured confounding, whereby important variables needed to draw valid causal inferences are missing. The idea behind this research project is to develop a sensitivity analysis to better understand and quantify the impact of unmeasured confounding on decision rules in ATSs.
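
A minimal illustration of the kind of analysis described above, assuming a single-stage decision rule fit by Q-learning and a scalar bias term `delta` standing in for unmeasured confounding (both the model and the features are invented for this sketch, not taken from the project):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)                      # measured covariate
a = rng.binomial(1, 0.5, size=n)            # binary treatment
y = x + a * (0.5 - x) + rng.normal(size=n)  # outcome; true blip is 0.5 - x

# Q-learning: regress y on (1, x, a, a*x); the a-terms form the blip.
design = np.column_stack([np.ones(n), x, a, a * x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
psi0, psi1 = coef[2], coef[3]

def decision_rule(x_new, delta=0.0):
    # Treat when the estimated blip, shifted by a hypothetical
    # confounding bias delta, is positive.
    return (psi0 + psi1 * x_new + delta > 0).astype(int)

# Sensitivity analysis: sweep delta and see how many decisions flip.
x_grid = np.linspace(-2, 2, 201)
base = decision_rule(x_grid)
for delta in (-0.2, 0.2):
    flipped = np.mean(decision_rule(x_grid, delta) != base)
    print(f"delta={delta:+.1f}: {flipped:.1%} of decisions flip")
```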

Jonathan Pilault (Polytechnique Montréal, Christopher Pal)

Language understanding and generation is a unique capacity of humans. Automatic summarization is an important task in Natural (human) Language Processing. This task consists in reducing the size of discourse while preserving information content. Abstractive summarization sets itself apart from other types of summarization since it most closely relates to how humans would summarize a book, a movie, an article or a conversation. From a research standpoint, automatic abstractive summarization is interesting since it requires models to both understand and generate human language. In the past year, we have seen research that has improved the ability of neural networks to choose the most important parts of discourse while beginning to address key pain points (e.g. repeating sentences, nonsensical formulations) during summary text generation. Recent work on image generation tasks in computer vision has shown that image quality can be further improved using Generative Adversarial Networks (GANs). Our intuition is that the same holds for Natural Language Processing tasks. We propose to incorporate the newest GAN architectures into some of the most novel abstractive summarization models to validate our hypothesis. The objective is to create a state-of-the-art summarization system that most closely mimics human summarizers. This outcome will also bring us closer to understanding GANs analytically.
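
For reference, the standard GAN objective that such an approach would build on, with the summarizer in the generator's role; pairing it with summarization models is the project's hypothesis, and the notation below is ours:

```latex
\min_G \max_D \;
  \mathbb{E}_{s \sim p_{\text{human}}}\big[\log D(s)\big]
  + \mathbb{E}_{x \sim p_{\text{doc}}}\big[\log\big(1 - D(G(x))\big)\big]
```

Here G maps a document x to a summary and D tries to distinguish human-written summaries s from generated ones.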

Alice Wu (Polytechnique Montréal, François Soumis)

  • Combining AI and OR to optimize monthly airline crew schedules

Our recent work focuses on the development of two new algorithms, Improved Primal Simplex (IPS) and Integral Simplex Using Decomposition (ISUD), which exploit a priori information about the expected solutions to reduce the number of variables and constraints that must be handled simultaneously. Currently, this information comes from rules supplied by planners. The research objective is to develop a system that uses artificial intelligence (AI) to estimate the probability that the variable linking two rotations belongs to the solution of a monthly airline crew scheduling problem. Learning will be carried out on historical data covering several months, several aircraft types and several airlines. Probabilities must be estimated from the characteristics of the rotations, not from their names: a rotation does not recur from one airline to another or from one month to the next. The relevant features will have to be identified, and research on learning will be needed to exploit the constraints of the problem; there are constraints between the crew members finishing rotations and those subsequently starting them. Learning will be validated by feeding the optimizers with the estimated information and observing the quality of the solutions obtained and the computation times. Further research on the optimizers will be needed to make the best use of this new information.
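
A minimal sketch of the learning step described above, with invented rotation-pair features (the real features would come from historical crew data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 10_000
# Hypothetical pair features: connection time (h), shared-base flag,
# aircraft-type match. Real features would come from rotation data.
X = np.column_stack([
    rng.uniform(0, 24, n),      # connection time between rotations
    rng.binomial(1, 0.5, n),    # rotations share a crew base
    rng.binomial(1, 0.7, n),    # same aircraft type
])
# Synthetic labels: short connections at the same base are likelier
# to be linked in the optimal monthly schedule.
p = 1 / (1 + np.exp(-(2 - 0.3 * X[:, 0] + 1.5 * X[:, 1])))
y = rng.binomial(1, p)

clf = LogisticRegression().fit(X, y)
# These probabilities would then prune the variables fed to IPS/ISUD.
print(clf.predict_proba(X[:5])[:, 1])
```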

Tiphaine Bonniot de Ruisselet (Polytechnique Montréal, Dominique Orban)

  • Accelerating optimization methods for large-scale problems through inexact evaluations

We are interested in continuous, nonconvex, unconstrained optimization problems in which evaluating the objective and its gradient is the outcome of a costly process. We assume that approximations of the objective and of its gradient can be obtained at lower cost, at any desired level of accuracy. We will examine the impact of these assumptions on the convergence and complexity of classical optimization methods, as well as the savings that can be achieved in computation time and energy consumption. This study is motivated, among other things, by seismic inversion problems, whose size can approach hundreds of millions of variables and whose function and gradient can be approximated by solving a linear least-squares problem. Saving computation time and energy is a major challenge in the era of artificial intelligence and big data exploration, and this approach is new and promising in terms of economic and environmental benefits.
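
One common way to formalize the inexactness assumption (our notation; the project may use different conditions): at iterate x_k, the solver receives approximations of the objective and gradient satisfying

```latex
\lvert \tilde f(x_k) - f(x_k) \rvert \le \varepsilon_k,
\qquad
\lVert \tilde g(x_k) - \nabla f(x_k) \rVert \le \eta_k
```

and the analysis asks how fast the tolerances ε_k and η_k must shrink for classical methods to retain their convergence and complexity guarantees.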

Stephanie Cairns (McGill University, Adam Oberman)

  • Mathematical approaches to adversarial robustness and confidence in DNNs

Deep convolutional neural networks are highly effective at image classification tasks, achieving higher accuracy than conventional machine learning methods but lacking the performance guarantees associated with those methods. Without additional performance guarantees, such as error bounds, they cannot be safely used in applications where errors can be costly. There is a consensus amongst researchers that greater interpretability and robustness are needed. Robustness can mean robustness to differences between the data on which models are trained and the data on which they are deployed, or even robustness to adversarial examples: perturbations of the data designed deliberately by an adversary to cause a misclassification.

In this project, we will study reliability in two contexts: (i) developing improved confidence in the predictions of the neural network, using modified losses to improve confidence measures; and (ii) modified losses that result in better robustness to adversarial examples. The overall goal of the project is to produce more reliable deep learning models.
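
As one standard instance of a robustness-oriented training modification (our illustration; the project's actual losses may differ), a minimal FGSM adversarial-training step in PyTorch:

```python
import torch
import torch.nn.functional as F

def adversarial_step(model, x, y, optimizer, eps=0.03):
    # Build an FGSM perturbation of the inputs.
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).detach()

    # Train on the perturbed batch (could be mixed with the clean one).
    optimizer.zero_grad()
    robust_loss = F.cross_entropy(model(x_adv), y)
    robust_loss.backward()
    optimizer.step()
    return robust_loss.item()
```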

Enora Georgeault (HEC Montréal, Marie-Ève Rancourt)

  • Predictive models of Canadian Red Cross donation allocation in response to wildfires

In Canada, floods and wildfires are the natural disasters that cause the most damage. The efforts of the Canadian Red Cross (CRC) to mitigate the impact of wildfires depend heavily on organizations' ability to plan relief logistics operations in advance. The first objective of the project is to build models that predict how monetary donations are allocated to beneficiaries, based on the socio-demographic characteristics of the region and the beneficiary as well as the characteristics of the fires (severity and type). The second objective is to understand the factors that have a significant impact on the CRC's needs when responding to a wildfire, in order to facilitate the planning of logistics operations and funding appeals.
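
A sketch of the first prediction task with invented columns, assuming a tabular regression setup (the real CRC data and features are not described here):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
n = 2000
X = np.column_stack([
    rng.uniform(0, 1, n),    # fire severity index
    rng.integers(0, 3, n),   # fire type (encoded)
    rng.uniform(0, 1, n),    # regional socio-demographic index
    rng.integers(1, 6, n),   # household size
])
# Synthetic allocation amounts driven by severity and household size.
y = 500 + 3000 * X[:, 0] + 200 * X[:, 3] + rng.normal(0, 100, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print(f"R^2 on held-out fires: {model.score(X_te, y_te):.2f}")
```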

Bhargav Kanuparthi (Université de Montréal, Yoshua Bengio)

  • h-detach: Modifying the LSTM Gradient Towards Better Optimization

Recurrent neural networks are known for their notorious exploding and vanishing gradient problem (EVGP). This problem becomes more evident in tasks where the information needed to correctly solve them exists over long time scales, because EVGP prevents important gradient components from being back-propagated adequately over a large number of steps. We introduce a simple stochastic algorithm (h-detach) that is specific to LSTM optimization and targeted towards addressing this problem. Specifically, we show that when the LSTM weights are large, the gradient components through the linear path (cell state) in the LSTM computational graph get suppressed. Based on the hypothesis that these components carry information about long-term dependencies (which we show empirically), their suppression can prevent LSTMs from capturing them. Our algorithm (code available at https://github.com/bhargav104/h-detach) prevents gradients flowing through this path from getting suppressed, thus allowing the LSTM to capture such dependencies better. We show significant improvements over vanilla LSTM gradient-based training in terms of convergence speed, robustness to seed and learning rate, and generalization on various benchmark datasets.
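
A minimal PyTorch sketch of the idea as described above: with some probability, stop gradients through the hidden-state path so the cell-state path is not suppressed during backpropagation. The detach probability below is a placeholder, not the paper's tuned value; see the linked repository for the authors' implementation:

```python
import torch

def h_detach_step(cell: torch.nn.LSTMCell, x_t, h_prev, c_prev,
                  p_detach=0.25, training=True):
    # With probability p_detach, block the gradient through the
    # hidden-state path; the cell-state path is left untouched.
    if training and torch.rand(()) < p_detach:
        h_prev = h_prev.detach()
    return cell(x_t, (h_prev, c_prev))

# Usage over a sequence:
cell = torch.nn.LSTMCell(input_size=16, hidden_size=32)
x = torch.randn(10, 4, 16)                 # (time, batch, features)
h = torch.zeros(4, 32)
c = torch.zeros(4, 32)
for t in range(x.size(0)):
    h, c = h_detach_step(cell, x[t], h, c)
```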

Vincent Labonté (Polytechnique Montréal, Michel Gagnon)

  • Knowledge extraction from French text based on translation into English combined with the use of tools developed for English

Many government institutions make available on their websites a very large volume of documents that are written only in the country's official language. Increasingly, however, these institutions want to turn these documents into a knowledge base, deployed as a set of open data integrated into the Semantic Web. This is notably the case of Québec's Ministère de la Culture et des Communications, which makes available to the public a directory of Québec's cultural heritage that is very rich in textual information, but which is unfortunately difficult to integrate with the data of Québec's other cultural stakeholders, or to link to the heritage knowledge already present in the Linked Open Data (LOD) cloud.

Much work has already been proposed to support knowledge extraction from text: semantic annotators, which identify the entities mentioned in a document (people, organizations, etc.) and link them to their representation in a LOD knowledge base; relation extractors, capable of extracting from text the relations between two entities (for example, "X is the author of novel Y"); and extractors of events and temporal information. In the vast majority of cases, these tools were developed for English, or perform poorly when applied to French.

We therefore propose to explore an approach that consists of producing, from a corpus of French documents, an equivalent translated version to which the tools that already exist for English (Google's SyntaxNet service, for example) will be applied. This implies accounting for the errors and inaccuracies introduced by the translation step. To do so, paraphrasing and text simplification techniques will be explored, the hypothesis being that simple sentences are easier to translate and that this simplification will not have a major impact on the task as long as the semantics are preserved. Note also that certain aspects of language, such as anaphora, disrupt translation (the translation module will struggle to choose between the pronouns "it" and "he" to translate the French pronoun "il"). In these cases, their impact will have to be measured precisely and workarounds proposed.

In short, the proposed project will determine the extent to which currently available translation services preserve the meaning of a text well enough to exploit tools developed for another language. The hypothesis we wish to validate is that their shortcomings can be compensated by certain preprocessing steps applied to the original text, and that these preprocessing steps can be implemented at low cost (in time and resources).
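
The shape of the proposed pipeline, with stub functions standing in for the real services (the function names are placeholders, not committed tool choices):

```python
def simplify_fr(sentence: str) -> list[str]:
    """Split/paraphrase a French sentence into simpler ones (stub)."""
    return [sentence]

def translate_fr_en(sentence: str) -> str:
    """Stub for a machine-translation service."""
    return sentence

def annotate_en(sentence: str) -> list[dict]:
    """Stub for an English annotator (entities, relations, events)."""
    return []

def extract_knowledge(french_doc: list[str]) -> list[dict]:
    triples = []
    for sent in french_doc:
        # Simplify before translating, on the hypothesis that simple
        # sentences translate with fewer errors.
        for simple in simplify_fr(sent):
            triples.extend(annotate_en(translate_fr_en(simple)))
    return triples
```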

Thomas MacDougall (Université de Montréal, Sébastien Lemieux)

  • Use of Deep Learning Approaches in the Activity Prediction and Design of Therapeutic Molecules

The proposed research is to employ Deep Learning and Neural Networks, which are both fields of Machine Learning, to more accurately predict the effectiveness, or “activity”, of potential therapeutic molecules (potential drugs). We are primarily concerned with predicting a given molecule’s ability to inhibit the growth of primary patient cancer cells (cells taken directly from a patient). The Leucegene project at the Institut de Recherche en Immunologie et Cancérologie (IRIC) has tested the activity of a large number of compounds in inhibiting the growth of cancer cells from patients afflicted with acute myeloid leukemia. The proposed research will use this activity data, along with several other data sources, to build an algorithm that can better predict the effectiveness that a molecule will have in inhibiting cancer cell growth. This means that before a molecule is even synthesized in a chemistry lab, a good estimation of its effectiveness as a therapeutic compound can be made, almost instantly. The first approach is to use Neural Networks and “representation learning”, in which features of the molecules that are important to improving activity are identified automatically by the algorithm. This will be done by representing the molecules as graphs and networks. Another approach that will be taken is the use of “multi-task learning” in which the prediction accuracy of an algorithm can be improved if the same algorithm is trained for multiple tasks on multiple datasets. The “multiple tasks” that will be focused on are multiple, but related, drug targets that are essential to cancer cell growth. Moving beyond activity prediction alone, these machine learning architectures will be expanded to design new chemical structures for potential drug molecules, based on information that is learned from drug molecules with known activities. These approaches have the capacity to improve the predictions about whether molecules will make effective drugs, and to design new molecules that have even better effectiveness than known drugs. Research progress in this area will lower the cost, both in money and time, of the drug development process.
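
A minimal sketch of the multi-task idea described above: a shared molecular encoder with one activity head per related drug target. The sizes are arbitrary and the input is a placeholder for a learned graph representation:

```python
import torch
import torch.nn as nn

class MultiTaskActivityModel(nn.Module):
    def __init__(self, n_features=128, n_tasks=4):
        super().__init__()
        # Shared encoder: stands in for a learned molecular
        # representation (e.g. from a graph neural network).
        self.shared = nn.Sequential(
            nn.Linear(n_features, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
        )
        # One activity head per drug target; all heads share the encoder.
        self.heads = nn.ModuleList(
            [nn.Linear(128, 1) for _ in range(n_tasks)]
        )

    def forward(self, mol_repr):
        z = self.shared(mol_repr)
        return torch.cat([head(z) for head in self.heads], dim=-1)

model = MultiTaskActivityModel()
print(model(torch.randn(8, 128)).shape)  # (8, 4): one score per target
```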

Bhairav Mehta (Université de Montréal, Liam Paull)

  • Attacking the Reality Gap in Robotic Reinforcement Learning

As Reinforcement Learning (RL) becomes an increasingly popular avenue of research, one area that stands to be revolutionized is robotics. However, one prominent downside of applying RL in robotics scenarios is the amount of experience today’s RL algorithms require to learn. Since these data-intensive policies cannot be learned on real robots due to time constraints, researchers turn to fast, approximate simulators. Trading off accuracy for speed can cause problems at test time, and policies that fail to transfer to the real world fall prey to the reality gap: the differences between training simulation and the real-world robot. Our project focuses on theoretically analyzing this issue, and provides practical algorithms to improve safety and robustness when transferring robotic policies out of simulation. We propose algorithms that use expert-collected robot data to learn a simulator, allowing for better modeling of the testing distribution and minimizing the reality gap upon transfer. In addition, we study the transfer problem using analysis tools from dynamical systems and continual learning research, looking for indicators in neural network dynamics and optimization that signal when the reality gap is likely to pose an issue. Lastly, we use the analysis to synthesize an algorithm which optimizes for the metrics that signal good, “transferable” policies, allowing safer and more robust sim-to-real transfer.
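
For context, a minimal sketch of domain randomization, one common baseline for attacking the reality gap (the project goes further, learning the simulator from expert data; the `Simulator` API below is hypothetical):

```python
import random

def sample_sim_params():
    # Randomize physics each episode so the policy cannot overfit
    # a single simulator configuration.
    return {
        "mass": random.uniform(0.8, 1.2),
        "friction": random.uniform(0.5, 1.5),
        "motor_gain": random.uniform(0.9, 1.1),
    }

def train(policy, make_simulator, episodes=1000):
    for _ in range(episodes):
        sim = make_simulator(**sample_sim_params())
        rollout = sim.run_episode(policy)   # hypothetical API
        policy.update(rollout)              # hypothetical API
```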

Timothy Nest (Université de Montréal, Karim Jerbi)

  • Leveraging Machine Learning and Magnetoencephalography for the Study of Normal and Atypical States of Consciousness

Understanding the neural processes and network dynamics underlying conscious perception is a complex yet important challenge that lies at the intersection of cognitive brain imaging, mental health, and data science. Magnetoencephalography (MEG) is a brain imaging technique with many qualities favorable to investigating conscious perception, thanks to its high temporal resolution and high signal-to-noise ratio. However, MEG analysis across space, time and frequency is challenging due to the extremely high dimensionality of the variables of interest and the susceptibility to overfitting. Furthermore, high computational complexity limits the ease with which investigators can approach some cross-frequency coupling metrics believed to be important for conscious perception and integration across the whole brain. To mitigate these challenges, researchers frequently rely on a variety of multivariate feature extraction and compression algorithms. However, these techniques still require substantial tuning, and are limited in their application to the kinds of high-order tensor structures encountered in MEG. New methods for the study of conscious perception with MEG are thus needed.

In this project, we will leverage very recent advances in computer science and machine learning that extend algorithms currently used in neuroimaging research to extremely high-dimensional spaces. Taken together, the proposed research will apply state-of-the-art techniques in machine learning and electrophysiological signal processing to overcome current obstacles in the study of the brain processes that mediate conscious perception. This work will constitute an important contribution to neuroimaging methodology, neuropharmacology, and psychiatry. Beyond expanding our understanding of healthy cognition, this research may ultimately provide novel paths to the study of psychiatric disorders that involve altered conscious perception, such as schizophrenia.
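
As a stand-in for the kind of compression described above, a minimal example that unfolds a (sensors × time × frequency) MEG tensor and keeps the top spatial components via an SVD; real work would use proper high-order tensor decompositions:

```python
import numpy as np

rng = np.random.default_rng(3)
meg = rng.normal(size=(64, 200, 40))   # sensors x time x frequency

# Mode-1 unfolding: sensors on the rows, (time, frequency) flattened.
unfolded = meg.reshape(64, -1)
u, s, vt = np.linalg.svd(unfolded, full_matrices=False)

k = 10                                 # keep 10 spatial components
compressed = u[:, :k].T @ unfolded     # reduced feature matrix
print(compressed.shape)                # (10, 8000)
```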

Jacinthe Pilette (Université de Montréal, Jean-François Arguin)

  • Searching for new physics at the Large Hadron Collider (LHC) using deep learning

The Large Hadron Collider (LHC) lies at the heart of fundamental research in physics. With its 27 km circumference, it is the largest and most powerful particle accelerator in the world, which makes it the best tool for studying the infinitely small. It is also at the LHC that the Higgs boson was discovered, leading to the Nobel Prize in Physics in 2013.

However, the Standard Model, the reference framework that dictates the laws governing particles and their interactions, has several shortcomings that physicists have not yet managed to resolve. Several theories have been put forward, but none of them has been observed at the LHC. Faced with this challenge, the particle physics community will have to adopt a new approach.

The ATLAS group at Université de Montréal has therefore turned to artificial intelligence. The project developed through this collaboration, and the main objective of this research, is to develop a deep learning algorithm that can detect anomalies in the data. The algorithm will then be applied to data from the ATLAS detector in the hope of discovering signals of new physics and improving our understanding of the universe.
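
One plausible reading of "detect anomalies in the data" is reconstruction-based anomaly detection; the autoencoder below is our illustration, not the ATLAS group's published design:

```python
import torch
import torch.nn as nn

class EventAutoencoder(nn.Module):
    def __init__(self, n_features=20):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_features, 8), nn.ReLU(),
                                 nn.Linear(8, 3))
        self.dec = nn.Sequential(nn.Linear(3, 8), nn.ReLU(),
                                 nn.Linear(8, n_features))

    def forward(self, x):
        return self.dec(self.enc(x))

# Train on Standard Model events (omitted), then flag events whose
# reconstruction error is unusually high as anomaly candidates.
model = EventAutoencoder()
events = torch.randn(100, 20)          # placeholder event features
recon_error = ((model(events) - events) ** 2).mean(dim=1)
anomalies = recon_error > recon_error.quantile(0.99)
print(int(anomalies.sum()), "candidate anomalous events")
```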

Léa Ricard (Université de Montréal, Emma Frejinger)

  • Modeling the probability of route acceptance in a ridesharing context

Ridesharing relates to several well-studied problems: vehicle routing, pickup and delivery with time windows, and dynamic dial-a-ride. However, very few studies consider the setting in which drivers and passengers can reject a proposed route. While rejecting a proposed route is rare when drivers are professionals, it is the norm in a ridesharing context. Modeling the probability that a route is accepted is therefore a central problem in the development of a high-quality ridesharing mobile application.

The machine learning model to be developed will have to estimate, given the user's characteristics (notably whether they are a driver or a passenger) and the proposed alternative routes, the probability that a route is accepted. At first glance, this modeling raises two challenges:

(1) The way acceptances and rejections are collected poses a logged-bandit-type problem. Several offers may be presented at the same time, and a user may accept more than one; moreover, offers can be actively declined, simply ignored, or accepted. Since offers are displayed sequentially, those appearing first are more likely to attract the user's attention, so the order of the offers likely influences the acceptance probability (a minimal sketch of this challenge follows after this list).
(2) The behavior of new users, for whom very little information is available, will have to be inferred from similar long-standing customers. This is in itself a difficult problem.
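
A minimal sketch of challenge (1): a logistic model of route acceptance that includes the display position of the offer, so that position bias in the logged data is modeled rather than ignored (all feature names are invented):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)
n = 5000
detour_min = rng.uniform(0, 30, n)    # extra travel time offered
is_driver = rng.binomial(1, 0.5, n)   # user role
position = rng.integers(0, 5, n)      # rank of the offer on screen

# Synthetic ground truth: later positions and longer detours hurt.
logit = 1.0 - 0.1 * detour_min - 0.4 * position + 0.5 * is_driver
y = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X = np.column_stack([detour_min, is_driver, position])
model = LogisticRegression().fit(X, y)

# Score a candidate route as if shown in the top slot (position 0).
print(model.predict_proba([[10.0, 1, 0]])[0, 1])
```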

Alexandre Riviello (Polytechnique Montréal, Jean-Pierre David)

  • Hardware Acceleration of Speech Recognition Algorithms

Speech recognition has become prevalent in our lives in recent years; personal assistants such as Amazon's Alexa or Apple's Siri are two examples. With the rise of deep learning, speech recognition algorithms have gained a lot of precision, due mostly to the use of neural networks. These complex algorithms, used in the context of a classification task, can distinguish between different characters, phonemes or words. However, they require a large number of computations, limiting their use in power-constrained devices such as smartphones. In my research, I will attempt to find hardware-friendly implementations of these networks. Deep learning algorithms are usually written in high-level languages using frameworks such as Torch or TensorFlow. To generate hardware-friendly representations, models will be adapted using these frameworks. For example, recent findings have shown that basic networks can use weights and activations represented with 1 or 2 bits and retain their accuracy. The reduction of the precision of the network parameters is called quantization. This concept will be one of the many ways used to simplify the networks. Another aspect of this research will be to revisit methods of representing voice features. Traditionally, spoken utterances are converted to Mel Frequency Cepstrum Coefficients (MFCCs), which are essentially values representing signal power over a frequency axis. These coefficients are computed roughly every 10 ms and are then fed to the network. A lower-precision representation can greatly reduce the computational costs of the network. The overall goal of the research is to improve calculation speed and reduce the power consumption of speech recognition algorithms.
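
A minimal sketch of the quantization idea, using 1-bit weights with a straight-through estimator (one standard scheme; the project may use others):

```python
import torch

def binarize(w: torch.Tensor) -> torch.Tensor:
    # Forward: scaled sign of the weights (two values only).
    # Backward: the (... - w).detach() trick passes gradients through
    # unchanged (straight-through estimator).
    alpha = w.abs().mean()                  # per-tensor scale
    return w + (alpha * w.sign() - w).detach()

w = torch.randn(4, 4, requires_grad=True)
q = binarize(w)
q.sum().backward()
print(q.unique())       # two values: -alpha and +alpha
print(w.grad.unique())  # all ones: gradient passed straight through
```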