Fundamentals of Data Science - Gdańsk

Data science is a hot topic right now, so together with Piotr Chlebek and the CODE:ME foundation we decided to organize a data science course. The course took place on 11-12.01.2020 in Gdańsk. The number of participants was limited, and the list was full long before the course date, which shows the huge interest in data science and machine learning. The participants came from different professions, so teaching them was a very interesting experience. There were many lively discussions during the course, and participants proposed interesting ideas for solving the tasks.

The scope of the course was very wide, and the participants had an opportunity to learn the fundamentals of data science and machine learning, as well as the tools used in those areas. My part of the course covered the basics of data science tools and algorithms:

  • Anaconda and Jupyter
  • NumPy and pandas
  • Matplotlib and Seaborn
  • Scikit-learn and Keras
  • Classification, Regression, Clustering
  • Time series analysis
  • Neural networks

The teaching materials were based on data gathered from https://dane.gov.pl/. The site provides great datasets that are very useful for teaching the basics of data preparation, visualization and so on. Moreover, working with these data is much more fun! For example, it is much more interesting to predict the number of public transport passengers in your home city than to analyze the iris dataset.

Another problem tackled by the participants was the analysis of football players from the FIFA game series. We trained a Support Vector Regression model to predict the players' value on the transfer market from their skill ratings, such as agility and strength. The results were quite good. The same problem was also tried with a neural network, which performed even better.
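
For readers who want to try something similar, here is a minimal sketch of the Support Vector Regression idea in scikit-learn. It is not the course notebook: the data is synthetic, the three skill columns (agility, strength, stamina) and the formula for the target are made up for illustration, and the SVR parameters are just a reasonable starting point.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for the player table (assumption, not the FIFA data):
# three hypothetical skill ratings per player: agility, strength, stamina.
rng = np.random.default_rng(0)
n_players = 400
skills = rng.uniform(40, 95, size=(n_players, 3))

# Made-up target: a weighted sum of the skills plus noise.
# These are NOT real transfer-market values.
value = (
    0.05 * skills[:, 0]
    + 0.03 * skills[:, 1]
    + 0.02 * skills[:, 2]
    + rng.normal(0, 0.2, n_players)
)

# Hold out a quarter of the players to check how well the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    skills, value, test_size=0.25, random_state=0
)

# SVR is sensitive to feature scale, so standardize the ratings first.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
model.fit(X_train, y_train)

# Coefficient of determination on the held-out players.
r2 = model.score(X_test, y_test)
print(f"R^2 on held-out players: {r2:.2f}")
```

With a real player table, the synthetic arrays would be replaced by the skill columns and the market value column, and parameters such as C and epsilon would normally be tuned, for example with cross-validation.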