Quality over quantity of exercises? #39

Closed
Cohen-Carlisle opened this issue Jun 21, 2016 · 15 comments

Comments

@Cohen-Carlisle
Member

I've noticed a number of very similar exercises have been reported lately. exercism/problem-specifications/issues/276 shows 4 exercises that implement different numerical bases (2, 3, 8, 16). exercism/problem-specifications/issues/268 reports that the protein-translation and nucleotide-codons exercises are the same. There are still more examples, such as food-chain, beer-song, and twelve-days; nth-prime and sieve; or accumulate and list-ops. You get the idea.

I've also noticed that the activity per exercise in the Ruby track drops off pretty fast. Visual here:
[Image: ruby-activity]

@IanWhitney also caught my attention when he made PR exercism/rust/pull/137 to reorder the exercises to more uniformly increase difficulty and focus on not introducing a lot of new concepts at once.

Taken together, I think re-ordering and reduction of exercise duplication would really improve things for the exercist. The hard part, of course, is figuring out the right order and what's duplicated.

@kotp
Member

kotp commented Jun 21, 2016

What are the data points in your visual? I think this says there are nearly 2300 people that have done Hello World, and then almost 4000 people that have done the second in the list? And around 300 that have done around 33 of the exercises? Want to make sure that I am interpreting this correctly. It is perhaps not obvious, without some kind of legend.

Good discussion point, I think.

@tejasbubane
Member

Are we thinking of "completely" removing some exercises? If so, would that create problems for people who have already submitted them, or who have fetched them but not yet submitted?

@Cohen-Carlisle
Member Author

@kotp I'm not even sure exactly what that data is. It is derived from my view of http://exercism.io/tracks/ruby/exercises, which for me includes 32 exercises. I believe "activity" is the sum total of iterations and comments.

@tejasbubane We usually deprecate exercises. As far as I know, this means the API will stop serving them on exercism fetch, but they can still be submitted.

@kotp
Member

kotp commented Jun 21, 2016

@tejasbubane I don't believe we would "completely" remove it, just deprecate as we have done in the past.

@Insti

Insti commented Jun 21, 2016

Can we just re-order them based on solution frequency?

Simple, easy to measure, easy to implement.

(But maybe needs some normalisation by age?)

Edit: Or is there a way to get fetch stats and use those to normalise against?

@tejasbubane
Member

tejasbubane commented Jun 21, 2016

@kotp @Cohen-Carlisle Did not know we have deprecated exercises in the past. 👍

@kytrinyx
Member

@Cohen-Carlisle your view excludes archived solutions.

Try this one: http://exercism.io/stats/ruby

> I think this says there are nearly 2300 people that have done Hello World, and then almost 4000 people that have done the second in the list?

We added hello world later than the other one.

select slug, count(id), min(created_at) from submissions where language='ruby' and (slug='hello-world' or slug='hamming') group by slug;

    slug     | count |            min             
-------------+-------+----------------------------
 hello-world |  3111 | 2015-05-10 23:45:00.633782
 hamming     | 10107 | 2013-06-24 14:12:35.428

select slug, count(id) from user_exercises where skipped_at IS NOT NULL and language='ruby' GROUP BY slug;

         slug          | count 
-----------------------+-------
 sum-of-multiples      |     1
 phone-number          |     1
 word-count            |     1
 roman-numerals        |     3
 run-length-encoding   |     1
 gigasecond            |     1
 grains                |     1
 raindrops             |     1
 hello-world           |     2
 atbash-cipher         |     1
 ocr-numbers           |     1
 nth-prime             |     1
 bob                   |     4
 hamming               |     1
 sieve                 |     2
 rna-transcription     |     1
 grade-school          |     1
 food-chain            |     3
 series                |     1
 robot-name            |     2
 leap                  |     1
 accumulate            |     1
 pangram               |     1
 binary                |     2
 difference-of-squares |     1

> Can we just re-order them based on solution frequency?

Are you suggesting that we put skipped exercises farther towards the end? Or something else?

> normalisation by age

Age of what?

@Insti

Insti commented Jun 21, 2016

> Are you suggesting that we put skipped exercises farther towards the end? Or something else?

Put the exercises that people are getting stuck on further towards the end.
Edit: I suspect skipping means something domain specific.

> Age of what?

Age of problem, so that a problem that has been there since the beginning is compared fairly with one that has been there for a month.
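
As a rough sketch of what normalising by age could look like, the snippet below divides each exercise's submission count by the number of days since its first submission. The counts and first-submission dates come from the query output earlier in this thread. The `AS_OF` date (the day this issue was opened) and the `submissions_per_day` helper are assumptions made for illustration, and the counts are submissions rather than distinct solvers.

```python
from datetime import date

# Counts and first-submission dates are taken from the query output earlier
# in this thread; AS_OF is an assumption (the day this issue was opened).
AS_OF = date(2016, 6, 21)

exercises = {
    "hello-world": {"count": 3111, "first_seen": date(2015, 5, 10)},
    "hamming": {"count": 10107, "first_seen": date(2013, 6, 24)},
}


def submissions_per_day(count, first_seen, as_of=AS_OF):
    """Return submissions per day the exercise has existed (at least 1 day)."""
    age_days = max((as_of - first_seen).days, 1)
    return count / age_days


rates = {
    slug: submissions_per_day(info["count"], info["first_seen"])
    for slug, info in exercises.items()
}

# Print exercises from highest to lowest age-normalised rate.
for slug, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{slug}: {rate:.2f} submissions/day")
```

A per-day rate is only one possible normalisation. It ignores growth in the user base over time, so a newer exercise can still look busier than it would have been had it existed from the start.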

@kytrinyx
Member

> I suspect skipping means something domain specific.

Yeah, it means that they didn't want to solve it, so they skip it in order to get the next exercise when they say fetch.

exercism skip ruby hello-world

> Put the exercises that people are getting stuck on further towards the end.

I think this is a good goal. I don't know why people stop submitting. They might be stuck, but it could be other things. Maybe they didn't get feedback, and they lost interest. Maybe the previous exercise was boring and they lost interest. Maybe they were just trying it for a bit and they moved on.

We know which order people solved things in (we have timestamps for everything), so we could ask pretty specific questions about the data, if we wanted to. I think it would be worth finding out why people don't move on rather than assuming that they're getting stuck (though stuck is a perfectly good guess).
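
One question the timestamps could answer is where people make their final submission. The sketch below is a minimal illustration under stated assumptions: the `(user, slug, submitted_at)` records are made up, the real schema may look different, and both helper functions are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical records (user, slug, submitted_at); the real schema may differ.
submissions = [
    ("alice", "hello-world", datetime(2016, 1, 1)),
    ("alice", "hamming", datetime(2016, 1, 2)),
    ("alice", "bob", datetime(2016, 1, 5)),
    ("bea", "hello-world", datetime(2016, 2, 1)),
    ("bea", "hamming", datetime(2016, 2, 3)),
    ("cy", "hello-world", datetime(2016, 3, 1)),
    ("cy", "hamming", datetime(2016, 3, 4)),
]


def last_exercise_per_user(records):
    """Map each user to the slug of their most recent submission."""
    latest = {}
    for user, slug, submitted_at in records:
        if user not in latest or submitted_at > latest[user][1]:
            latest[user] = (slug, submitted_at)
    return {user: slug for user, (slug, _) in latest.items()}


def drop_off_counts(records):
    """Count how many users made their final submission on each exercise."""
    return Counter(last_exercise_per_user(records).values())


stopped_at = drop_off_counts(submissions)
for slug, users in stopped_at.most_common():
    print(f"{slug}: {users} user(s) stopped here")
```

This shows where people stop, not why. Separating "stuck" from "lost interest" or "no feedback" would need extra signals, such as comment activity on the last iteration.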

@Insti

Insti commented Jun 21, 2016

Is that data publicly available?

It might be possible* to calculate the background exponential decay rate and the hard spots are where the solve rate is sufficiently below that. (But maybe the sample size is too small for this to work.)

* for someone more familiar than me with statistical math.
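
One simple way to try this idea, offered as a sketch rather than a vetted statistical method: fit a straight line to the logarithm of the solver counts by ordinary least squares (which corresponds to an exponential decay in the counts themselves), then flag exercises whose count falls well below the fitted curve. The counts below are invented for illustration, and the `threshold` value is an arbitrary assumption.

```python
import math

# Hypothetical solver counts in track order; not real exercism data.
counts = [
    ("hello-world", 1000),
    ("hamming", 800),
    ("raindrops", 640),
    ("bob", 300),
    ("leap", 410),
    ("grains", 328),
]


def fit_log_linear(values):
    """Least-squares fit of log(value) = intercept + slope * position."""
    n = len(values)
    xs = range(n)
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def hard_spots(named_counts, threshold=0.2):
    """Return slugs whose log count is more than `threshold` below the fit."""
    intercept, slope = fit_log_linear([count for _, count in named_counts])
    flagged = []
    for position, (slug, count) in enumerate(named_counts):
        expected_log = intercept + slope * position
        if math.log(count) - expected_log < -threshold:
            flagged.append(slug)
    return flagged


print(hard_spots(counts))
```

Ordinary least squares is pulled towards outliers, so a very hard exercise drags the fitted curve down and partly hides itself. With only a few dozen exercises per track, the sample-size concern raised above still applies.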

@kytrinyx
Member

I think it is available via the API, but I would have to verify that. If it's not we could make it so.

@c1505

c1505 commented Jul 27, 2016

The API doesn't show that data currently. It just shows iterations and submissions for the last 30 days by language track. I'll work on changing it to expose additional stats.

As far as I can tell, this is the only API for Stats. I am new to poking around exercism so please correct me if there is more somewhere else.
https://github.com/exercism/exercism.io/blob/10591be6d5de31a789fab0e5e0cebe15716b766c/api/v1/routes/stats.rb

@kytrinyx
Member

> As far as I can tell, this is the only API for Stats.

Yeah, that's all we have in terms of stats. All the stats stuff is experimental, so if anyone has good ideas for what to expose and how, we're very open to trying things out.

@Cohen-Carlisle
Member Author

From what I've seen since creating this issue, duplication of exercises is being addressed, at least at the x-common level, and there are other discussions about deprecating exercises and ordering.
Should this issue be closed now?

@kytrinyx
Member

Yeah, I think so. Thanks for checking back in, @Cohen-Carlisle!

6 participants