Why Chemistry Needs Python

This lecture is part of the OeAD-funded ASEA-UNINET project “CORE: Computational Organic Research Education.” It represents a collaborative effort between the Technical University of Vienna (TU Wien) and the University of Malaya to modernize chemistry education through computational and data-driven approaches. Within this framework, the lecture forms a special session of the Python programming course at the University of Malaya, introducing students to how chemical structures and data can be explored using Python.

Students learn how to represent molecules using SMILES notation, process them with RDKit, extract molecular features, and apply simple machine learning models, for example to predict solubility, in Python notebooks. All materials are freely available and designed for hands-on work in Google Colab.
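As a rough illustration of the first steps in that workflow, the sketch below parses a few SMILES strings with RDKit and reads off a handful of descriptors. It is not taken from the lecture notebooks: the helper name `describe`, the example molecules, and the choice of descriptors are assumptions made here. It also assumes RDKit is installed (for example with `pip install rdkit` in a Colab cell).

```python
# Minimal sketch: SMILES -> RDKit molecule -> a few descriptors.
# `describe` is a hypothetical helper, not part of RDKit or the course material.
try:
    from rdkit import Chem
    from rdkit.Chem import Descriptors
except ImportError:  # RDKit not installed: describe() will return None
    Chem = None


def describe(smiles):
    """Return a small dict of descriptors for a SMILES string, or None."""
    if Chem is None:
        return None
    mol = Chem.MolFromSmiles(smiles)  # gives None if the SMILES cannot be parsed
    if mol is None:
        return None
    return {
        "heavy_atoms": mol.GetNumHeavyAtoms(),    # non-hydrogen atoms
        "mol_wt": Descriptors.MolWt(mol),         # average molecular weight
        "logp": Descriptors.MolLogP(mol),         # Crippen logP estimate
        "h_donors": Descriptors.NumHDonors(mol),  # hydrogen-bond donors
    }


# Ethanol, benzene, acetic acid
for smi in ["CCO", "c1ccccc1", "CC(=O)O"]:
    print(smi, describe(smi))
```

Descriptor dictionaries like these can be collected into a table, one row per molecule, which is the usual input for a simple prediction model.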

What is this lecture about?

The lecture bridges chemistry and programming by demonstrating how digital molecular representations can be analyzed and visualized in Python. It introduces students to essential cheminformatics workflows and basic data-driven prediction techniques. The aim is to build intuition for how computational methods are shaping modern organic chemistry and chemical education.

Learning Goals

  • Understand how molecules can be represented digitally using SMILES notation.
  • Use RDKit to handle, visualize, and extract features from molecular structures.
  • Apply introductory machine learning workflows to chemical datasets (e.g., solubility prediction).
  • Develop confidence working with chemistry data in Python and Google Colab.
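The third goal follows a featurize, fit, predict pattern, which can be sketched in plain Python. No measured solubility values are given here, so this sketch swaps in a target that can be computed exactly: the molecular weight of linear alkanes. The function names, the single carbon-count feature, and the hand-written least-squares fit are all assumptions made for illustration, not the lecture's own model or dataset.

```python
# Minimal sketch of featurize -> fit -> predict on linear alkanes.
# The target (molecular weight) stands in for solubility, which would
# require a measured dataset that is not part of this page.

MASS = {"C": 12.011, "H": 1.008}  # standard atomic masses in g/mol


def carbon_count(smiles):
    """Feature: number of carbons in a linear alkane SMILES such as 'CCC'.

    Only valid for strings made of 'C' alone; it would miscount e.g. 'Cl'.
    """
    return smiles.count("C")


def alkane_mol_wt(n_c):
    """Target: molecular weight of CnH(2n+2) computed from atomic masses."""
    return n_c * MASS["C"] + (2 * n_c + 2) * MASS["H"]


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


train_smiles = ["C", "CC", "CCC", "CCCC"]  # methane to butane
X_train = [carbon_count(s) for s in train_smiles]
y_train = [alkane_mol_wt(n) for n in X_train]

slope, intercept = fit_line(X_train, y_train)

held_out = "CCCCCC"  # hexane, not seen during fitting
predicted = slope * carbon_count(held_out) + intercept
print(f"slope={slope:.3f} intercept={intercept:.3f} predicted={predicted:.3f}")
```

Because this target is exactly linear in the carbon count, the fit recovers it without error. Real solubility data is noisy and depends on many features, which is why such workflows typically hold out test molecules and report an error metric.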

Materials
