Thursday, April 9, 2015

[beasiswa] [info] Ph.D. position in "Transfer in Multi-armed Bandit and Reinforcement Learning" @INRIA Lille, France

 

Applications are invited for a Ph.D. with Alessandro Lazaric at SequeL, INRIA-Lille.
 
Ph.D. Title: Transfer in multi-armed bandit and reinforcement learning
Keywords: reinforcement learning, multi-armed bandit, transfer learning, exploration-exploitation, representation learning, hierarchical learning.
 
Research Topic
The main objective of this Ph.D. research project is to advance the state of the art in multi-armed bandit (MAB) and reinforcement learning (RL) through the development of novel transfer learning algorithms. Multi-armed bandit and reinforcement learning formalize the problem of learning an optimal behavior policy from experience collected directly in an unknown environment. Such a general model already provides powerful tools for learning from data in a very diverse range of applications (e.g., see successful applications to computer games, recommendation systems, energy management, logistics, and autonomous robotics). Nonetheless, practical limitations of current algorithms have encouraged research into efficient ways of integrating expert prior knowledge into the learning process. Although this improves the performance of RL algorithms, it dramatically reduces their autonomy, since it requires constant supervision by a domain expert.
A solution to this problem is provided by transfer learning, which is directly motivated by the observation that one of the key features allowing humans to accomplish complicated tasks is their ability to build general knowledge from past experience and to transfer it when learning new tasks. We believe that bringing the capability of transfer learning to existing machine learning algorithms will enable them to solve series of tasks in complex and unknown environments. The objective is to develop algorithms that not only learn from experience but also extract knowledge and transfer it across different tasks, thus obtaining a dramatic speed-up in the learning process and a significant improvement in its overall performance. The general objective of this Ph.D. project is therefore to design RL algorithms able to incrementally discover, construct, and transfer "prior" knowledge in a fully automatic way.
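As background on the exploration-exploitation trade-off mentioned above, the following is a minimal sketch (not part of the call) of a classical MAB algorithm, UCB1, run on Bernoulli arms; the arm means and horizon are illustrative assumptions, not values from the project.

```python
# Minimal sketch of UCB1 on a stochastic multi-armed bandit with Bernoulli arms.
# The arm means and horizon below are illustrative assumptions.
import math
import random

def ucb1(arm_means, horizon=10_000):
    """Play `horizon` rounds; return the cumulative pseudo-regret w.r.t. the best arm."""
    n_arms = len(arm_means)
    counts = [0] * n_arms          # number of pulls per arm
    estimates = [0.0] * n_arms     # empirical mean reward per arm
    best_mean = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1            # initialisation: pull each arm once
        else:
            # optimism in the face of uncertainty: empirical mean + exploration bonus
            arm = max(range(n_arms),
                      key=lambda a: estimates[a] + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if random.random() < arm_means[arm] else 0.0   # Bernoulli reward
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]   # incremental mean
        regret += best_mean - arm_means[arm]                        # pseudo-regret
    return regret

if __name__ == "__main__":
    print(ucb1([0.5, 0.6, 0.7]))   # regret grows only logarithmically with the horizon
```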
 
Research Program
While the idea of transfer learning has been applied to a series of machine learning problems, its integration into MAB and RL is much more complicated. In fact, the number of scenarios that can be considered and the variety of knowledge that can be extracted and transferred are much larger than in simpler problems such as supervised learning. During the Ph.D. we will thus investigate a variety of approaches to transfer in MAB and RL, ranging from the transfer of samples to the transfer of representations. We will address some of the following questions:
 
(i)     Exploration. Which kinds of transferred knowledge can provably improve the exploration-exploitation performance of MAB and RL algorithms, in terms of sample complexity and regret? (A toy illustration of sample transfer is sketched after this list.)
(ii)    Representation. Which representation-learning techniques are best suited to support transfer in RL?
(iii)   Hierarchical structures. Is it possible to prove the advantage of hierarchical structures over flat structures in MAB (e.g., hierarchical clustering) and in RL (e.g., options)? Under which assumptions? How can we create such hierarchies automatically?
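Question (i) can be made concrete with a toy sketch. This is not the project's algorithm, and the helper names (`warm_start_ucb`, `prior_counts`, `prior_means`) are hypothetical: a UCB1 learner whose arm statistics are warm-started from pull counts and empirical means gathered in previously solved, related tasks, i.e., a naive form of "transfer of samples".

```python
# Toy illustration of sample transfer in bandits: warm-start UCB1 statistics
# from data collected in earlier, related tasks. All names are hypothetical.
import math
import random

def warm_start_ucb(arm_means, prior_counts, prior_means, horizon=10_000):
    """UCB1 on a new task, with statistics initialised from prior-task data."""
    n_arms = len(arm_means)
    counts = list(prior_counts)        # transferred pull counts from earlier tasks
    estimates = list(prior_means)      # transferred empirical mean rewards
    best_mean = max(arm_means)
    regret = 0.0
    for t in range(1, horizon + 1):
        # optimistic index; transferred counts shrink the exploration bonus
        arm = max(range(n_arms),
                  key=lambda a: estimates[a]
                  + math.sqrt(2 * math.log(t + sum(counts)) / max(counts[a], 1)))
        reward = 1.0 if random.random() < arm_means[arm] else 0.0   # Bernoulli reward
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        regret += best_mean - arm_means[arm]
    return regret

if __name__ == "__main__":
    # Prior tasks suggest the third arm is best; on a similar new task the
    # transferred statistics cut down early exploration, while on a dissimilar
    # task the bias could hurt.
    print(warm_start_ucb([0.5, 0.6, 0.7],
                         prior_counts=[50, 50, 50],
                         prior_means=[0.48, 0.61, 0.72]))
```

Characterising when such transferred statistics provably reduce regret, and when they introduce harmful bias, is exactly the kind of condition the theoretical part of the project aims to establish.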
 
The previous questions will require theoretical, algorithmic and empirical study. The Ph.D. will cover different learning scenarios (e.g., multi-armed bandit, linear bandit, contextual bandit, full reinforcement learning) and different validation environments (e.g., fully synthetic, off-line evaluation from logged data, online simulation in real-world applications). As such, we expect the Ph.D. to produce a variety of results: 
  • Theoretical study of the conditions under which transfer methods improve over standard no-transfer RL algorithms, and of the type of improvement they bring.
  • Empirical validation of the proposed algorithms and comparison with existing transfer and no-transfer methods.
  • Investigation of the application of transfer in RL to real-world problems such as recommendation systems, trading, and computer games.
Profile
The applicant must have a Master of Science in Computer Science, Statistics, or a related field, preferably with a background in reinforcement learning, bandits, or optimization. Candidates with a very strong background in either mathematics or computer science will be considered. The working language in the lab is English; good written and oral communication skills are mandatory.
 
Application
The application should include a brief description of research interests and past experience, a CV, degrees and grades, a copy of the Master's thesis (or a draft thereof), a motivation letter (short but pertinent to this call), relevant publications, and other relevant documents. Candidates are encouraged to provide letter(s) of recommendation and contact information for references. Please send your application as a single PDF to alessandro.lazaric-at-inria.fr.
  • Application closing date: May 15, 2015
  • Interviews: May/June 2015
  • Final decision: June/July 2015
  • Duration: 3 years (full-time position)
  • Starting date: October 15, 2015 (flexible)
  • Supervisor: Alessandro Lazaric
  • Place: SequeL, INRIA Lille - Nord Europe
Working environment
The Ph.D. candidate will work in the SequeL lab (https://sequel.lille.inria.fr/) at Inria Lille - Nord Europe, located in Lille. Inria (http://www.inria.fr/) is France's leading institution in computer science, employing over 2800 scientists, of whom around 250 are in Lille. Lille is the capital of the north of France, a metropolis of 1 million inhabitants with excellent train connections to Brussels (30 min), Paris (1h), and London (1h30). The research team SequeL (Sequential Learning) comprises about 20 members working in machine learning, notably reinforcement learning, multi-armed bandits, statistical learning, and sequence prediction. The Ph.D. program will be co-funded by the ANR ExTra-Learn project, which is entirely focused on the problem of transfer in RL.
 
Benefits
  • Salary: 1957.54 € for the first two years and 2058.84 € for the third year
  • Salary after taxes: around 1597.11 € for the first two years and 1679.76 € for the third year (benefits included)
  • Possibility of French courses
  • Assistance with housing
  • Contribution towards public transport costs
  • Scientific resident card and assistance with the spouse's visa
References
[1] D. Calandriello, A. Lazaric, and M. Restelli. "Sparse Multi-task Reinforcement Learning". In Proceedings of the Twenty-Eighth Annual Conference on Neural Information Processing Systems (NIPS'14), 2014.
[2] M. Gheshlaghi-Azar, A. Lazaric, and E. Brunskill. "Resource-efficient Stochastic Optimization of a Locally Smooth Function under Correlated Bandit Feedback". In Proceedings of the Thirty-First International Conference on Machine Learning (ICML'14), 2014.
[3] M. Azar, A. Lazaric, and E. Brunskill. "Sequential Transfer in Multi-armed Bandit with Finite Set of Models". In Proceedings of the Twenty-Seventh Annual Conference on Neural Information Processing Systems (NIPS'13), pp. 2220–2228, 2013.
[4] A. Lazaric and M. Restelli. "Transfer from Multiple MDPs". In Proceedings of the Twenty-Fifth Annual Conference on Neural Information Processing Systems (NIPS'11), 2011.
[5] A. Lazaric. "Transfer in Reinforcement Learning: a Framework and a Survey". In M. Wiering and M. van Otterlo, editors, Reinforcement Learning: State of the Art, Springer, 2011.
[6] M. E. Taylor and P. Stone. "Transfer Learning for Reinforcement Learning Domains: A Survey". Journal of Machine Learning Research, 10: 1633–1685, 2009.
[7] R. S. Sutton and A. Barto. Reinforcement Learning: An Introduction. MIT Press, Cambridge, MA, 1998.



