2023-05854 – PhD Position F/M Knowledge-based reinforcement learning and knowledge evolution [Campagne DOC MI-NF-GRE-2023]

Contract type:
Fixed-term contract

Level of qualifications required:
Graduate degree or equivalent

Function:
PhD Position

Level of experience:
Recently graduated

About the research centre or Inria department

The Inria Grenoble – Rhône-Alpes research center groups together almost 600 people in 22 research teams and 7 research support departments.

Staff are present on three campuses in Grenoble, working in close collaboration with other research and higher education institutions (Université Grenoble Alpes, CNRS, CEA, INRAE, …) as well as with key economic players in the area.

Inria Grenoble – Rhône-Alpes is active in the fields of high-performance computing, verification and embedded systems, modeling of the environment at multiple levels, and data science and artificial intelligence. The center is a top-level scientific institute with an extensive network of international collaborations in Europe and the rest of the world.




Doctoral school: MSTII, Université Grenoble Alpes.

Advisors: Jérôme Euzenat and Jérôme David.

Group: The work will be carried out in the mOeX team, common to Inria and LIG. It is related to the MIAI Knowledge communication and evolution chair. mOeX is dedicated to studying knowledge evolution through adaptation. It gathers researchers who have taken an active part over the past 15 years in the development of the semantic web, and more specifically of ontology matching and data interlinking.

Place of work: The position is located at Inria Grenoble Rhône-Alpes, Montbonnot, a leading computer science research lab, in a stimulating research environment.




Cultural knowledge evolution and multiagent reinforcement learning share some of their prominent features. Putting explicit knowledge at the heart of the reinforcement process may contribute to better explanation and transfer.

Cultural knowledge evolution deals with the evolution of knowledge representation in a group of agents. For that purpose, cooperating agents adapt their knowledge to the situations they are exposed to and the feedback they receive from others. This framework has been considered in the context of evolving natural languages [Steels, 2012]. We have applied it to ontology alignment repair, i.e. the improvement of incorrect alignments [Euzenat, 2017], and to ontology evolution [Bourahla et al., 2021]. We have shown that it converges towards successful communication while improving the intrinsic quality of the agents' knowledge.

Reinforcement learning is a learning mechanism that adapts the decision-making process so as to maximise the reward provided by the environment for the actions performed by agents [Sutton and Barto, 1998]. Many multi-agent versions of reinforcement learning have also been proposed, depending on the agent attitude (cooperative, competitive) and the task structure (homogeneous, heterogeneous) [Buşoniu et al., 2010].
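To make the mechanism concrete, here is a minimal tabular Q-learning sketch, purely illustrative and not part of the project description: the agent's action values are adapted from the reward returned by a toy environment (the corridor environment, the state and action names, and all parameter values are assumptions made for this example).

```python
import random
from collections import defaultdict

def q_learning(env_step, start_states, actions, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning: action values adapted from environment reward."""
    Q = defaultdict(float)  # (state, action) -> estimated value
    for _ in range(episodes):
        state = random.choice(start_states)
        for _ in range(100):  # step cap keeps every episode finite
            # epsilon-greedy: mostly exploit current estimates, sometimes explore
            if random.random() < epsilon:
                action = random.choice(actions)
            else:
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env_step(state, action)
            # move the estimate towards reward plus discounted future value
            best_next = max(Q[(next_state, a)] for a in actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            if done:
                break
            state = next_state
    return Q

# toy corridor: moving right from state 0 towards state 2 earns reward 1
def step(state, action):
    nxt = min(state + 1, 2) if action == "R" else max(state - 1, 0)
    return nxt, (1 if nxt == 2 else 0), nxt == 2

random.seed(0)  # for reproducibility of the sketch
Q = q_learning(step, [0, 1], ["L", "R"])
```

After training, the learned values prefer moving right in both non-terminal states, which is the optimal policy for this corridor.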

From an external perspective, the two approaches operate in a similar manner: agents perceive their environment, perform an action, receive reward or punishment, and adapt their behaviour accordingly. However, a look into the inner mechanisms reveals important differences: the emphasis on knowledge quality instead of reward maximisation, the lack of probabilistic or even gradual interpretation, and even the absence of explicit choice in action or adaptation. Hence these two knowledge acquisition techniques are close enough to suggest replacing one by the other, and different enough to cross-fertilise.

This thesis position aims at further exploring the commonalities and differences between experimental cultural knowledge evolution and reinforcement learning. In particular, its purpose is to study which features of one technique may be fruitful in the context of the other and which may not.

For that purpose, one research direction is the introduction of knowledge-based reinforcement learning. In knowledge-based reinforcement learning, the decision-making process (the choice of the action to be performed) is driven by accumulated explicit knowledge. Thus the adaptation performed after reward or punishment will have to directly affect this knowledge. This has the advantage of making the decisions made by agents explainable. It will also allow explicit knowledge exchange among them [Leno da Silva et al., 2018].
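One hypothetical way to picture this, as a sketch only and not the approach the thesis will necessarily take: decisions come from explicit condition–action rules, reward or punishment adapts the rules themselves rather than a value table, and the rule that fired serves directly as the explanation of the decision (the `RuleAgent` class and its confidence-update scheme are invented for this illustration).

```python
class RuleAgent:
    """Sketch of knowledge-based decision making: explicit rules decide,
    and feedback adapts the rules themselves, so every decision can be
    explained by the rule that produced it."""

    def __init__(self, rules):
        # rules: list of (condition, action) pairs; confidence starts neutral
        self.rules = [{"cond": c, "action": a, "conf": 0.5} for c, a in rules]

    def decide(self, state):
        # pick the most trusted applicable rule; the rule is the explanation
        applicable = [r for r in self.rules if r["cond"](state)]
        best = max(applicable, key=lambda r: r["conf"])
        return best["action"], best

    def feedback(self, rule, success, rate=0.2):
        # adaptation acts directly on the knowledge, not on a value table
        target = 1.0 if success else 0.0
        rule["conf"] += rate * (target - rule["conf"])

# usage: two competing rules; punishment shifts the decision, and the
# returned rule object explains why each action was chosen
agent = RuleAgent([(lambda s: True, "a"), (lambda s: True, "b")])
action, rule = agent.decide(0)       # "a": first among equal confidences
agent.feedback(rule, success=False)  # punishment lowers that rule's confidence
action_after, _ = agent.decide(0)    # now "b" is the more trusted rule
```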

This promotes a less utilitarian view of knowledge, in which the evaluation of the performance of the system has to be disconnected from reward maximisation and made to depend instead on the quality of the acquired knowledge. Of course, these two aspects need to remain related (the acquired knowledge must be relevant to the environment). This separation between knowledge and reward is useful when agents have to change environments or use their knowledge to perform various tasks.

Another use of reinforcement mechanisms relevant to cultural knowledge evolution is related to the motivation for agents to explore unknown knowledge territories [Colas et al., 2019]. By associating an intrinsic reward with newly acquired knowledge, agents are able to improve the coverage of their knowledge in a way not guided by the environment. Complementing cultural knowledge evolution with exploration motivation should make agents more active in their understanding of the environment and in their knowledge acquisition.
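A common way to realise such an intrinsic reward, shown here only as an assumed illustration (a count-based novelty bonus, one of several schemes in the intrinsic-motivation literature), is to reward states in inverse proportion to how often they have been visited, so that novelty itself pays off regardless of the environment's own reward:

```python
from collections import Counter

def intrinsic_reward(visit_counts, state, scale=1.0):
    """Count-based novelty bonus: decays as a state becomes familiar."""
    visit_counts[state] += 1
    return scale / visit_counts[state] ** 0.5

counts = Counter()
novel = intrinsic_reward(counts, "new")          # first visit: full bonus
for _ in range(99):
    intrinsic_reward(counts, "familiar")
familiar = intrinsic_reward(counts, "familiar")  # 100th visit: small bonus
```

The bonus for the first visit is ten times that of the hundredth, which is what pushes the agent towards unexplored territory.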

These problems may be treated both theoretically and experimentally.


[Bourahla et al., 2021] Yasser Bourahla, Manuel Atencia, Jérôme Euzenat, Knowledge improvement and diversity under interaction-driven adaptation of learned ontologies, Proc. 20th AAMAS, London (UK), pp. 242–250, 2021
[Buşoniu et al., 2010] Lucian Buşoniu, Robert Babuška, Bart De Schutter, Multi-agent reinforcement learning: an overview, Chapter 7 of D. Srinivasan and L.C. Jain, eds., Innovations in Multi-Agent Systems and Applications – 1, Springer, Berlin (DE), pp. 183–221, 2010
[Colas et al., 2019] Cédric Colas, Pierre-Yves Oudeyer, Olivier Sigaud, Pierre Fournier, Mohamed Chetouani, CURIOUS: intrinsically motivated modular multi-goal reinforcement learning, Proc. 36th ICML, Long Beach (CA US), pp. 1331–1340, 2019
[Euzenat, 2017] Jérôme Euzenat, Communication-driven ontology alignment repair and expansion, Proc. 26th IJCAI, Melbourne (AU), pp. 185–191, 2017
[Leno da Silva et al., 2018] Felipe Leno da Silva, Matthew Taylor, Anna Helena Reali Costa, Autonomously reusing knowledge in multiagent reinforcement learning, Proc. 27th IJCAI, pp. 5487–5493, 2018
[Steels, 2012] Luc Steels (ed.), Experiments in cultural language evolution, John Benjamins, Amsterdam (NL), 2012
[Sutton and Barto, 1998] Richard Sutton, Andrew Barto, Reinforcement learning: an introduction, The MIT Press, Cambridge (MA US), 1998 (2nd ed. 2018)


  • MIAI Knowledge communication and evolution:
  • mOeX web site:
  • Lazy lavender:


Main activities


  • Analyse the literature and different approaches
  • Put forward original solutions
  • Prove results or run experiments, and analyse the consequences
  • Presentations and paper writing
  • Dissertation writing and thesis defence
  • All other activities mandated by INRIA and the doctoral school


Skills sought:

  • Curiosity and openness
  • Ability to interact with other researchers
  • Research autonomy
  • Taste for experimentation
  • Knowledge of multi-agent simulation and/or reinforcement learning is not required but a plus
  • Innovativeness

Languages: English mandatory


Benefits package

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
  • Possibility of teleworking (90 days / year) and flexible organization of working hours
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage under certain conditions

To help us track our recruitment effort, please indicate in your cover/motivation letter where you saw this job posting.