RWTHx: Basics of Data Science

92,00 

“Basics of Data Science” gives a comprehensible overview of many fundamental concepts and tools of data science, including data quality and data preprocessing, supervised and unsupervised learning techniques including their evaluation, frequent itemsets and association rules, sequence mining, process mining, text mining, and responsible data science.

About this course

“Basics of Data Science” is designed to provide participants with a comprehensive overview of the fundamental challenges, concepts, and tools of data science. The content is organized into three main areas of data science:

First, the course gives a brief overview of data science infrastructure, which is concerned with volume and velocity. Topics include instrumentation, big data infrastructures and distributed systems, databases, and data management. The main challenge is to make things scalable and instant.

The main focus of the course is on data analysis concerned with extracting knowledge from data. Key topics covered are data exploration and visualization, data preprocessing, data quality issues and transformations, various supervised learning techniques with a focus on their evaluation, unsupervised learning, clustering, pattern mining, process mining and text mining. The main challenge of data analysis is to provide answers to known and unknown unknowns.

Finally, data science affects people, organizations, and society. The course concludes by discussing challenges and providing guidelines and techniques for applying data science responsibly, with a focus on confidentiality and fairness. Topics include ethics and privacy, IT law, human-technology interaction, operations management, business models, and entrepreneurship. The main challenge is to do all of the above in a responsible manner.

Throughout the course, the ideas and concepts conveyed in the videos are complemented by hands-on exercises using Python (Jupyter notebooks). Participants are guided in applying the presented techniques to artificial and real-life data sets, gaining practical experience along the way.

After the course, participants should have a good overview of the best practices, challenges, goals, and concepts of the broader data science field, providing a strong foundation for further study or professional development in this rapidly evolving area. Combined with hands-on experience with commonly used Python libraries, this enables participants to conceptualize and implement various basic data analysis techniques in their own projects and to accurately evaluate and interpret analysis results.

At a glance

  • Institution: RWTHx
  • Subject: Computer Science
  • Level: Intermediate
  • Prerequisites:

    Anyone from any discipline with an interest in data science can start this course. We expect this course to be useful for everyone. Prior knowledge of math (i.e., mathematical notation, linear algebra, stochastics, and statistics) is an advantage but not mandatory.

  • Language: English
  • Video Transcript: English

What you’ll learn

After taking this course, participants will have gained:

  • Understanding of the role of data science in today’s society and businesses, including challenges and opportunities
  • Good general overview of a broad range of data science techniques
  • Ability to conceptualize and implement basic data analysis and to accurately evaluate and interpret the outcomes
  • Understanding of the challenges of responsible data science (fairness, accuracy, confidentiality, transparency) and possible solutions
  • Understanding of the limitations of machine learning, data mining and AI techniques
  • Ability to write short Python programs and use mainstream Python libraries
  • In particular, understanding of and ability to apply the following data analysis concepts and techniques:
      • data visualization and exploration techniques
      • decision trees
      • linear and logistic regression (basic overview)
      • support vector machines (basic overview)
      • neural networks (basic overview)
      • naive Bayesian classification (basic overview)
      • evaluation and interpretation of the results obtained using supervised learning
      • clustering techniques
      • frequent item sets
      • association rules
      • sequence mining
      • process mining
      • text mining
      • data preprocessing, data transformation, spotting and handling of data quality problems
  • Application of data analysis techniques without violating confidentiality and fairness

Additional information

  • Weeks: 9
  • Language: English