Tag: sklearn

Experiment : Supervised learning model to classify a GitHub issue as enhancement or bug based purely on the issue title

Minhaz

Software Engineer II at Microsoft
I work as a Software Engineer on the Microsoft Azure Production & Infrastructure Engineering team. My day-to-day work revolves largely around distributed systems and machine learning. I am excited to explore areas like Natural Language Processing and Knowledge Bases and to see whether they can help solve a number of problems that have yet to be solved commercially.
Quick Summary: I mined data for more than 100,000 issues from open source repositories on GitHub. The mined data included {issue title: string}, {issue description: paragraph} and {labels: discrete strings}. Most of the issues were labelled enhancement or bug, so I started with a simple classifier that labels an issue as Enhancement or Bug based on its title.
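A title-only classifier of this kind could be sketched roughly as follows. This is a minimal illustration and not the author's actual code: the TF-IDF features, the logistic regression model and the six toy titles are all assumptions made for the example, since the excerpt does not say which features or estimator were used.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data standing in for the mined issue titles and labels.
titles = [
    "App crashes when saving a file",
    "Null pointer exception on startup",
    "Login fails with wrong error message",
    "Add dark mode support",
    "Support exporting reports to CSV",
    "Allow custom keyboard shortcuts",
]
labels = ["bug", "bug", "bug", "enhancement", "enhancement", "enhancement"]

# Turn each title into TF-IDF weighted word and bigram counts,
# then fit a linear classifier on those features.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(titles, labels)

# Predict the label of an unseen title.
prediction = model.predict(["Crash when opening settings"])
print(prediction[0])
```

With only a handful of titles the prediction is not meaningful; the point is the shape of the pipeline, which stays the same when the toy lists are replaced by the real mined data.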

Solution : This solver needs samples of at least 2 classes in the data, but the data contains only one class

Minhaz

So if you are trying to build a classifier using sklearn.linear_model and came across the error "ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0", this is because of a bug in the sklearn.linear_model module. Sparkit trains sklearn’s linear models in parallel, then
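The error itself is easy to reproduce with plain scikit-learn: it is raised whenever the label array passed to `fit` contains a single class. The sketch below shows that, along with a small guard that checks the labels before fitting. The helper `has_two_classes` is hypothetical and not part of scikit-learn or Sparkit, and the idea that a single chunk of data ends up with only one class during parallel training is an assumption about how the error might arise, not something the text above confirms.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])

# Every sample has label 0, so the solver has nothing to separate.
y_single = np.array([0, 0, 0, 0])

try:
    LogisticRegression().fit(X, y_single)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)


def has_two_classes(y):
    """Hypothetical guard: True if y contains at least two distinct labels."""
    return np.unique(y).size >= 2


# Only fit when the labels actually contain both classes.
y_mixed = np.array([0, 1, 0, 1])
fitted = LogisticRegression().fit(X, y_mixed) if has_two_classes(y_mixed) else None
```

Running a check like this on each chunk of training data before fitting makes it clear which chunk is missing a class, instead of failing deep inside the solver.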