Ideas for a "suspect changesets classifier"
Posted by naoliv on 17 May 2014 in English
Sometimes here in Brazil we find imported data from more than four months ago that nobody noticed until now. Usually, these imports are followed by other changesets that delete the old data, plus changesets that modify or adjust the imported data.
We also see some changesets where people deliberately or unknowingly delete a lot of data.
Could a Bayesian filter, SVM or something else be used to classify a suspect changeset? Could we use something smart for this task?
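To make the question concrete, here is a minimal sketch of what a Bayesian filter for changesets could look like. It is only an illustration under stated assumptions: the per-changeset summary (counts of `created`, `modified` and `deleted` objects), the three binary features, the thresholds and the tiny hand-labelled training set are all made up for this example and do not come from any existing tool.

```python
import math
from collections import defaultdict

def features(cs):
    """Turn a changeset summary (hypothetical dict of counts) into binary features."""
    total = cs["created"] + cs["modified"] + cs["deleted"]
    return {
        "large": total > 1000,  # arbitrary threshold, an assumption
        "mostly_deletes": total > 0 and cs["deleted"] / total > 0.5,
        "mostly_creates": total > 0 and cs["created"] / total > 0.9,
    }

class BernoulliNB:
    """Tiny Bernoulli naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, samples, labels):
        self.classes = sorted(set(labels))
        self.names = sorted(samples[0])
        self.n = len(labels)
        self.class_count = {c: 0 for c in self.classes}
        self.true_count = {c: defaultdict(int) for c in self.classes}
        for feats, label in zip(samples, labels):
            self.class_count[label] += 1
            for name in self.names:
                if feats[name]:
                    self.true_count[label][name] += 1
        return self

    def log_scores(self, feats):
        scores = {}
        for c in self.classes:
            # log prior of the class
            score = math.log(self.class_count[c] / self.n)
            for name in self.names:
                # smoothed probability that this feature is true in class c
                p = (self.true_count[c][name] + 1) / (self.class_count[c] + 2)
                score += math.log(p if feats[name] else 1 - p)
            scores[c] = score
        return scores

    def predict(self, feats):
        scores = self.log_scores(feats)
        return max(scores, key=scores.get)

# Made-up, hand-labelled examples; real training data would need manual review.
training = [
    ({"created": 20, "modified": 5, "deleted": 1}, "ok"),
    ({"created": 3, "modified": 40, "deleted": 2}, "ok"),
    ({"created": 0, "modified": 10, "deleted": 4}, "ok"),
    ({"created": 5000, "modified": 0, "deleted": 0}, "suspect"),    # large import
    ({"created": 10, "modified": 20, "deleted": 3000}, "suspect"),  # mass deletion
    ({"created": 0, "modified": 5, "deleted": 1500}, "suspect"),
]
model = BernoulliNB().fit(
    [features(cs) for cs, _ in training],
    [label for _, label in training],
)

print(model.predict(features({"created": 2, "modified": 1, "deleted": 2500})))
```

A classifier like this would only rank changesets for manual review; with so few features it cannot tell a legitimate cleanup from vandalism, so the labels and features would need a lot more thought.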
Discussion
Comment from cartinus on 18 May 2014 at 01:42
When using WhoDidIt you can see which changesets contain lots of deletions.
Comment from naoliv on 18 May 2014 at 02:18
The problem is that I can’t manually verify every changeset (and that’s why I want some kind of classifier).
Comment from Nakaner on 18 May 2014 at 13:04
The German user Oli-Wan (a very active member of the German forum) is developing a tool to detect vandalism and other bad changesets. He has written about his idea and his work in the German forum. You can contact him in German or English.
Comment from cartinus on 18 May 2014 at 19:02
That is why I mentioned WhoDidIt. Changesets with lots of deletions are specially marked. So you won’t have to check them all.