
Learner Reviews & Feedback for How to Win a Data Science Competition: Learn from Top Kagglers by HSE University

1,113 ratings
274 reviews

About the Course

If you want to break into competitive data science, then this course is for you! Participating in predictive modelling competitions can help you gain practical experience and improve and harness your data modelling skills in various domains such as credit, insurance, marketing, natural language processing, sales forecasting and computer vision, to name a few. At the same time, you get to do it in a competitive context against thousands of participants, where each one tries to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Being able to achieve high ranks consistently can help you accelerate your career in data science. In this course, you will learn to analyse and solve such predictive modelling tasks competitively.

When you finish this class, you will:

- Understand how to solve predictive modelling competitions efficiently and learn which of the skills obtained can be applied to real-world tasks.
- Learn how to preprocess the data and generate new features from various sources such as text and images.
- Be taught advanced feature engineering techniques like generating mean-encodings, using aggregated statistical measures or finding nearest neighbors as a means to improve your predictions.
- Be able to form reliable cross validation methodologies that help you benchmark your solutions and avoid overfitting or underfitting when tested with unobserved (test) data.
- Gain experience of analysing and interpreting the data. You will become aware of inconsistencies, high noise levels, errors and other data-related issues such as leakages, and you will learn how to overcome them.
- Acquire knowledge of different algorithms and learn how to efficiently tune their hyperparameters and achieve top performance.
- Master the art of combining different machine learning models and learn how to ensemble.
- Get exposed to past (winning) solutions and codes and learn how to read them.

Disclaimer: This is not a machine learning online course in the general sense. This course will teach you how to get high-rank solutions against thousands of competitors, with a focus on practical usage of machine learning methods rather than the theoretical underpinnings behind them.

Prerequisites:

- Python: work with DataFrames in pandas, plot figures in matplotlib, import and train models from scikit-learn, XGBoost, LightGBM.
- Machine Learning: basic understanding of linear models, K-NN, random forest, gradient boosting and neural networks.

Do you have technical problems? Write to us:

Top reviews

Mar 28, 2018

Top Kagglers gently introduce one to Data Science Competitions. One will have a great chance to learn various tips and tricks and apply them in practice throughout the course. Highly recommended!

Nov 9, 2017

This course is fantastic. It's chock full of practical information that is presented clearly and concisely. I would like to thank the team for sharing their knowledge so generously.

1 - 25 of 273 Reviews for How to Win a Data Science Competition: Learn from Top Kagglers

By Kostyantyn B

Aug 10, 2018

I am very conflicted about this series, as well as this particular course (How to win a Data Science Competition). Let me try to summarize it.

Pros:

- This is not an introductory level course. You have a chance to learn some advanced, state-of-the-art Data Science techniques. There are not that many advanced DS/ML courses available, so I was very excited when I found this one.

- You get your foot in the door of Competitive Data Science, something you may not have the courage to do on your own (it was certainly the case with me).

Cons... Where do I begin?...

- The courses of this series have been available for quite some time now. Yet the learning materials still feel very raw: I can live with occasional typos, but I have seen some mistakes that I found unacceptable, including wrong math formulas, an improperly set up Docker environment, and incorrect "correct" answers (one can actually get credit for the last question in Programming Assignment 1 only if a wrong answer is submitted! This was pointed out on the Forum months before I took it, yet here we are).

- The course content is somewhat strange. It is a mix of introductory-level material and some pretty advanced tricks. Of course, it is the latter that is most appealing to many students like myself. But the problem is that too many of these topics are covered in a very superficial way, providing very little substance. I remember getting all excited when the instructors would start talking about the Kaggle competitions they personally participated in... only to be left disappointed with how little I learned from their experience. I am not finished with the course but this has already happened more than once...

- Finally, with all due respect, the instructors are not what you might call outstanding educators. I realize that not everybody can be like, say, Andrew Ng, and you certainly get to see a broad spectrum of teaching skills among the instructors at Coursera. Still, in my opinion, some (in fact, most) instructors who participated in creating this series have a long way to go...

I don't mean to be harsh. I certainly appreciate what they are trying to achieve here and it is a noble goal. But the execution is flawed and I felt that I had to say something. I know that they can do better (they are, without a doubt, a talented bunch of people) and I really hope they can learn from their mistakes.

By Nicholas C

Mar 10, 2018

Should have been labeled "How to Cheat a Data Science Competition". An entire week is dedicated to Data Leakage and how to exploit it rather than, in the spirit of the competition, how to create a model that actually solves the problem.

By Caio A A O

Nov 20, 2017

Some cool tips on the first week, but then on the second one we have a whole section about how to exploit data leaks on competitions, and that's worth 12% of the final grade. This sucks... If it wasn't part of the advanced machine learning specialization, I wouldn't care, but it is. This plus a peer-graded assignment with really broad criteria really got me thinking about whether it's worth doing it for a verified certificate.

By Steven A

Feb 25, 2019

I really enjoyed this course but it was probably 2-3 times more work than I anticipated. Most of that extra time comes from working on the final project, testing things out, etc.

By raghuveer n

Dec 26, 2017

Understanding the accent is a big challenge, and the quality of the recording is not great.

By Nick

Mar 4, 2018

Questions are unclear; the authors clearly do not understand what they're asking.

By Steffen R

Oct 16, 2018

Extremely badly supported.

By Anirudh m

Aug 5, 2018

Assignments are terribly written. Quizzes don't make sense at times.


Jun 24, 2020

I have several things to say about this course. The first is that it is a pretty advanced course, and you have to know that beforehand. However, even the most advanced course must have a structure so that you can link your previous knowledge with the new material, and this is where the course falls short. With all respect, you can see that the instructors certainly have a lot of experience and knowledge, but they don't convey it in the best way. The videos are very difficult to follow, and that is worsened by the rather poor pronunciation of many of the instructors (which of course is not an offence, but it's very important to speak clearly if you are trying to convey knowledge). Quizzes, reading material and notebooks are not well written. Code is not well commented or explained. Also, quizzes often have ambiguous, or even wrong, answers.

The course is focused on some skills that are very specific to fitting models for competitions and not for real-life performance (however, you can expect this just by reading the name of the course).

In summary, I certainly learnt a lot of interesting things, some of which can be applied in day-to-day work as a data scientist, but it was not a very pleasant learning experience. This course has a lot of potential, but IMHO it requires some structural changes.

By Fabrice L

Apr 11, 2019

A looooot of content!!!

I like the fact that it talks about broad data science topics and doesn't specialize in one specific domain. You gain some good tricks about pandas, EDA, modeling, feature engineering, etc. The skill coverage is very wide.

This is definitely advanced, and challenging sometimes, but you'll learn a lot.

By Marsh

Nov 10, 2017

This course is fantastic. It's chock full of practical information that is presented clearly and concisely. I would like to thank the team for sharing their knowledge so generously.

By Kamo k

Aug 29, 2020

I can say that I learned some tips for Kaggle competitions.

However, their peer review system for assignments is so ANNOYing!!!! Because it is really hard to get other people to review my projects!!!!!! Every time I needed to get a peer review, I had to share my assignment link on the FORUM and beg somebody to review my project. That is BULLSHIT!!!!

I submitted my final project 4 days ago, before my subscription charges me again, but nobody has reviewed my project. That is a horrible system and it is so unfair, so please fix it or provide an alternative system!!!!!

I definitely don't want to pay for the course again just to get a peer review.

By Andrey L

Dec 24, 2017

I really liked the course, though there were some problems:

- during the first weeks, not everything in the tests is covered in the lectures. I fully realize that this course/specialization is advanced. But there are a lot of things we have to self-learn, which usually isn't the point of Coursera courses;

- Coursera Hub is a mess - if you start doing an assignment and the authors change something, then the only way to get the newer version is to take it from GitHub. This is inconvenient;

- lastly, the final task: the dataset is too big; even a MacBook Pro with 16 GB of RAM takes a lot of time to complete it. Also, the authors wanted people to try different ways of doing this assignment, but that doesn't work well in the Coursera peer-graded format;

Otherwise the course is great!

+ I learned a lot of new things and got a deeper understanding of some things I knew;

+ after completing quizzes we have detailed explanations - this is really cool;

+ explanations of validation and metrics are very good;

+ and the section about metrics is very interesting by itself - I rarely thought long about these things previously;

+ lectures on mean-encoding were very good, and the programming assignment was excellent;

+ it was quite interesting to learn more about hyperparameter tuning;

+ feature engineering section is really useful;

+ I want to draw special attention to the additional task - we need to write an algorithm which uses KNN to create new features (distance based metrics and others). It had great practical value. Also, I learned to use the multiprocessing module, and used this knowledge at my job;

+ previously I had only heard about stacking and boosting techniques, but didn't have an opportunity to use them. Now I can do it thanks to the section.

So the course is really great, and its value isn't limited to Kaggle - the knowledge and skills acquired in this course can be used on the job and in other applications.

By Carlos V

Sep 30, 2018

This course is unique and highly recommended to anyone who wants to push their machine learning skills. The assignments are excellent and super challenging. After completing the final assignment, my understanding of how to improve an ML model was better; the course pushes you to understand how to build a machine learning model that is competitive on Kaggle.

All the techniques explained also can help you to create better ml models in general.

Thanks very much to all professors for putting together this fantastic course.

Looking forward to a more advanced version in the future.

By Yu Q

Dec 4, 2018

I completed this course in almost 3 months, far more time than I planned. Most of my time was spent creating new features via feature engineering and verifying the cross-validation method. This course was difficult, but very helpful and inspirational. Thanks to each teacher and tutor!

By Greg W

Feb 19, 2019

Really excellent. Very practical advice from top competitors. This specialization is much more information-dense than most machine learning MOOCs. You really get your money's worth.

By Leonardo S

Aug 18, 2019

Please improve either the spoken language or the subtitles. It is really tough to understand these Russians speaking English. Besides, the course could be more structured and also cover other types of competitions and hints.

But overall it was useful and I really learned a few things.

By Pavel S

Jun 22, 2019

Bad practice tasks: they are unrealistic to solve without an excellent mathematics background, you have to waste a lot of time to understand the variables in the code, there are no instructions, etc. I can understand that maybe the teachers wanted to create the same atmosphere as in a competition, but then what am I paying money for???? I pay money for teaching, not for guessing what to do. The first week is cool, the others are shit, and remove the lecturer in glasses.

By Alexander O

Feb 12, 2021

Overall it's a good course from which you can learn a lot. Keep in mind that it's pretty advanced. I had already had my first experience with Kaggle before it and I think it was an advantage.

Pros:

1. It covers a lot of material, tricks and methods, which can be useful not only for Kaggle.

2. It also contains a lot of Kaggle specific things.

3. Practical assignments are good, though sometimes they might be quite challenging.

Cons:

1. The instructors are experts, but some of them don't explain things very clearly.

2. The course is not self-contained, so you should be ready to look for some material elsewhere, for example, when you need to complete some of their quizzes.

By 李继杨霖

Oct 16, 2018

This is the first course I've finished on Coursera. At the beginning, my motivation for taking this course was only to get a better score in Kaggle competitions, because I major in statistics and am interested in data science. But during the process of learning, I found many important ideas and experiences for dealing with real problems and enjoyed communicating with other people from the forum and Kaggle. I also acquired some special experience such as peer review, which is not only very fun but also gives me different angles from which to look again at the problem I'm dealing with. Thanks Dmitry Ulyanov, Alexander Guschin, Mikhail Trofimov, Dmitry Altukhov, and Marios Michailidis for sharing your important knowledge and experience with us.

By Samuel Y

Feb 22, 2018

Just like course 1 of this specialization, the pace of course 2 on practical data science competition is very fast, so the quizzes and assignments are indeed necessary and very helpful (even for the final project). Some of the programming tasks and the final project are quite hard and time consuming. But it is worthwhile to grasp the practical knowledge/skills and work on a real-life programming solution.

P.S. Check the discussion forum for possible problems and bugs. It is better to start working on the final project on Kaggle in the first week and set up your own notebook with enough memory in an AWS/Google/DIY environment, just as the course suggests.

By martin j

Apr 28, 2018

One of the best data science courses on Coursera (incl. Andrew Ng's courses)! If I should mention one negative thing, it is that the allocated time for the assignment is not realistic at all. Maybe I am not advanced enough (I don't have a background in CS, and most of my programming skills were restricted to Matlab before I began). But if you are interested in machine learning, this will not be a problem. The course is packed with machine learning tips and tricks, and for me, this course is more advanced than an average course at my university.

By Eric A S

Jul 31, 2018

I am not terribly competitive so I thought this course wouldn't be very interesting. I was wrong. While some of the material covered is specific to competition, most of it is not, and is very useful for any data scientist. In addition, after finishing this course I am addicted to Kaggle and am currently in first and second place in some Kaggle competitions. Highly recommended, even for people who aren't interested in competitive data science.

By Tirth P

Apr 25, 2019

Five stars for the amount of hard work the authors have actually put in to make this course the best of all courses in the specialization. It is one of the best courses to succeed in the field of competitive data science. Has a lot of assignments and quizzes to go through in each week. I would highly recommend the course if you want to learn advanced feature engineering and EDA.

Thank You!

By Toghrul J

Apr 21, 2018

Very useful course, full of ideas to apply not only in Kaggle competitions but also in daily projects. If you are a Data Scientist and want to get to another level, this course is for you. What makes this course so special? Whatever they explain comes from experience and is very practical. So during classes do not forget to take notes, otherwise you can forget :)