Python for Data Analysis
Overview
This course provides an in-depth exploration of Python for data analysis, focusing on essential libraries and tools such as NumPy, Pandas, Matplotlib, and Plotly. Additionally, it covers critical software development practices, including testing, virtual environments, and version control, to ensure code reproducibility and collaboration in research projects. By the end of the course, participants will be adept at performing data manipulation, analysis, and visualisation tasks, and will have a solid understanding of maintaining and sharing their code efficiently.
Course Objectives
Grasp the fundamentals of Python programming, including data types, control structures, and functions.
Learn how to load, clean, and manipulate data using Pandas for effective data analysis.
Learn to use NumPy for numerical operations and handling large datasets efficiently.
Understand how to use Pandas to work with datasets drawn from research problems.
Create a variety of static and interactive visualisations to represent data insights, covering Matplotlib and Plotly.
Apply machine learning techniques using scikit-learn for predictive modelling.
Implement a testing framework and manage dependencies with virtual environments.
Learn methods to ensure that research and analyses can be reproduced and validated by others.
Pre-requisite Knowledge
Attendees should have taken the Introduction to Python course.