Add exercise all-your-base. #280
@@ -0,0 +1,30 @@ | |||
Implement general base conversion. Given a number on base **a**, |
English question: do we say a number on base X or a number in base X?
I think it's in (that's what you use lower down, and it sounds more natural to me).
I have no idea. I'm not a native English speaker.
~~I will fix that later.~~ Done! Thanks! 😄
I'm still unsure about what is the correct behavior when dealing with an empty sequence of digits. I chose to treat it as an invalid input, but I want a good explanation for that. |
"" is not a number. "nothing" is not a number. 0 is a number. Does that work? |
"nothing" can be a number of base 24 or higher. For the empty sequence, I'd suggest to follow how most other sequences are
|
Well, the problem we have with numbers is that we usually accept distinct sequences as meaning the same number, e.g. I know we are used to representing zero as a single digit zero, but if dropping the zeroes at the left is allowed... Of course, we can define it in the traditional way, but when I started coding a solution, I was suddenly struck by the feeling that there is something deeply wrong with that. 😕 Taking what @NobbZ said as a principle, it would make sense to define the canonical representation of zero as It's against our social convention, but it is at least consistent. Interesting... Would that be too strange? |
My initial feeling would be to treat an empty sequence of digits as an invalid input. The number |
Would you consider Edit: I'm loving this discussion. Never imagined a simple problem like this one would be that interesting. |
The questions we have to answer are:
rebase 2 2 [0,1] = ?
rebase 2 2 [0,0] = ?
rebase 2 2 [0] = ?
rebase 2 2 [] = ?
Edit:
rebase 2 2 [7] = ? |
To me, the following would be the most natural:
The way I look at it, is that I first concatenate the digits, and then try to parse that number. That means that As a small check, I've created a .NET fiddle that tests this behavior when using .NET to parse the strings: https://dotnetfiddle.net/WPdeRW In that example, "01" parses to |
For me it would be:
rebase 2 2 [0,1] = [1]
rebase 2 2 [0,0] = []
rebase 2 2 [0] = []
rebase 2 2 [] = []
and...
rebase 2 2 [7] = exception
Edit: The package digits has the same behavior, except that it doesn't check for invalid digits.
digits 10 1234
> [1,2,3,4]
digits 10 0
> []
unDigits 10 [1,2,3,4]
> 1234
unDigits 10 []
> 0
But I will accept the majority decision, of course! What about the others? |
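The convention described in this comment, where zero corresponds to the empty digit list, can be sketched in Python. The names `digits` and `un_digits` only mirror the functions quoted above; this is an illustrative reimplementation under that assumption, not the package's actual code, and the argument check in `digits` is an addition of this sketch.

```python
def digits(base, n):
    """Split a non-negative integer into base-`base` digits, most significant first.

    Zero yields the empty list, matching the convention quoted above.
    """
    if base < 2 or n < 0:
        # Assumption of this sketch: reject bases below 2 and negative numbers.
        raise ValueError("base must be >= 2 and n must be non-negative")
    out = []
    while n > 0:
        n, d = divmod(n, base)  # peel off the least significant digit
        out.append(d)
    out.reverse()
    return out


def un_digits(base, ds):
    """Fold a digit list back into an integer; the empty list yields 0.

    Like the behavior described above, digits are not validated against the base.
    """
    n = 0
    for d in ds:
        n = n * base + d
    return n
```

With these definitions, leading zeros and the empty list both collapse to the same value, which is why this convention has exactly one representation for zero.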
Oh, right. Yeah, totally. |
Are we representing the answer as a sequence as well? |
OK, after I had some time to think about all this while waiting at a train crossing and hitting about twice as many red lights as usual, I realized some stuff.
There was actually much more going on in my head, but even as I am hiding in a cellar, I can't think due to heat… |
I think it's unavoidable. If we are converting from one base to another, the types have to be the same. So, it would be a function from two bases and a sequence of digits to a sequence of digits. In Haskell it would be:
rebase :: Integral a => a -> a -> [a] -> [a]
Did I get the idea the wrong way? |
It is just the way I understood this as well. But I'd restrict everything to positive numbers. Is there a typeclass for that, or does one have to use And right now I really got interested in bringing -- I'm a type signature written in Idris, all types used are in scope without importing
-- anything.
rebase : (from : Nat) -> (to : Nat) -> List (Fin from) -> List (Fin to) Or at least something along these lines ;) Perhaps even aliasing |
About the language-specific representation... @NobbZ I wrote the .json using arrays just to emphasize that this is a numerical problem that deals with sequences of digits. It's up to each language to decide if it's a list, a vector, an array or a string with numbers separated by commas, right? Edit: One thing to keep in mind is the conversion between trinary and hexadecimal in the tests. It returns Anyway, we still need to define the answers for those 5 cases up there. |
How about we represent each digit in the sequence in binary? |
Ok, that's terrible. But we need some JSON-friendly way to specify the value, the base, and then each language will need to figure out how to turn that into the correct type. |
For me the cases listed above are resolved as follows:
rebase 2 2 [0,1] = [1]
Obvious, isn't it?
rebase 2 2 [0,0] = [0]
rebase 2 2 [0] = [0]
Ignore leading zeros, but make the overall value of zero explicit.
rebase 2 2 [] = []
For me, converting an empty sequence means getting an empty sequence after conversion. And to be honest, I asked two profs about this. I got two answers :( One said that trying to convert an empty sequence is just a plain old error that has to be signaled in some way, and mumbled something about NULL pointers (he gives lectures about all the basic number stuff and the theoretical internals of a computer, and helps out in the C and ASM [RIP] exercises). The other one instantly returned an empty list ;) He teaches FP and explained that the algorithm is much easier to prove correct when it returns empty for empty.
rebase 2 2 [7] = Nothing -- Left "digit 7 not of base 2"; exception
It's rather obvious as well, isn't it? |
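The choices in this comment can be written down as a small Python sketch. The function name `rebase` follows the notation used in the thread; the parameter names, the use of `ValueError` for invalid input, and the check that both bases are at least 2 are assumptions of this sketch rather than anything the thread settled.

```python
def rebase(input_base, output_base, digits):
    """Convert a digit sequence between bases using the choices listed above."""
    if input_base < 2 or output_base < 2:
        # Assumption: the comment does not discuss invalid bases.
        raise ValueError("bases must be at least 2")
    for d in digits:
        if not 0 <= d < input_base:
            raise ValueError(f"digit {d} not of base {input_base}")
    if not digits:
        return []  # empty sequence in, empty sequence out

    # Horner's rule: leading zeros contribute nothing to the value.
    value = 0
    for d in digits:
        value = value * input_base + d
    if value == 0:
        return [0]  # make the overall value of zero explicit

    out = []
    while value > 0:
        value, d = divmod(value, output_base)
        out.append(d)
    return out[::-1]
```

Swapping the two early returns for `[]` or for an error is enough to express the other positions taken in the discussion, which shows how small the disputed surface actually is.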
Terrible good or terrible bad? 😁 I thought the JSON was good the way it is... 😢
{ "description": "trinary to hexadecimal"
, "base_one" : 3
, "base_two" : 16
, "digits" : [1, 1, 2, 0]
, "expectation": [2, 10]
} |
Is there a better name for the |
@wobh, I agree they are bad names....I just copied them. 😁 What about |
This is probably because he/she thinks that a null string signals the lack of a number, so it makes sense to return an error. I guess most of the C programmers are here.
He/she probably thinks that an empty list just signals the lack of digits, not the lack of a number. This one would maybe think that the correct return for any base conversion of zero is It's not by accident that digits implemented it that way. Seems to be the functional way... The discussion here seems to be just about the acceptable and the canonical representations of We could try a middle ground and leave the conversions from This way we avoid forcing an unidiomatic exercise in a language, because there is no single answer that could please an ASM and a Haskell programmer at the same time. 😄 Most of the over-specified exercises from other languages are artificial in Haskell. They are literally alien to us. 👽 In fact, #276 started just because of a discussion about unidiomatic exercise implementations in an issue raised by @petertseng (exercism/haskell#115). So, reviewing my former position, I'm completely against forcing any track collaborator, maintainer or user to accept artificial exercises. IMHO, we should drop those corner cases or remove the JSON file altogether. What do you think? Another question: Should we write...
...or is it OK to leave it to each person to decide what the challenge is? |
I think it would be incredibly useful to have a JSON file that defines what a good set of inputs and outputs are. If we don't do that, each implementor will have to make this up from scratch, and will likely miss interesting/important cases, and then we'll have more churn than necessary. I think it's probably fine to drop the corner cases. |
Ok, @kytrinyx. Rewriting the JSON explicitly saying what is left to the implementation. This way the corner cases will be easily identifiable. I still would like more feedback about the variable names in the JSON. ps: Now I regret the moment I wrote:
Sorry, people! 😔 I had no idea about the implications of that... 😁 |
@rbasso Maybe you've seen this SO thread and the answer I'm linking to, but if not, it seems like it would be of interest: http://stackoverflow.com/a/2488528. |
Hi, @wobh. Never saw that thread. Thanks! But I'm still trying to understand how it is related to this exercise. Edit: all-your-base in Haskell will probably be implemented without any use of Char or Text. No parsing at all, just pure numerical transformations. Changing subject, are you OK with I know it's hard, but it would be great to merge this having everyone at least satisfied with the outcome. Really! 😄 So...what are the problems you all see with the version that we have now, people ❓ |
I just thought it could be helpful in determining expectations Haskell programmers might have about numbers as sequences of digits, by analogy to how they're parsed when represented as a string of characters. If it's not a helpful analogy or example, thank you for at least indulging me and looking at it. It was something I became curious about reading this thread. |
These names seem reasonable to me. Much clearer than expected in this case. |
It's been a while, so... does anyone have anything to say? |
I'm missing a test case for a not so widely used base. Also, none of the test cases shows how digits greater than 9 should be entered/represented. Aside from that I like |
Thanks for the feedback, @NobbZ. About the digits greater than 9, there are two tests:
{ "description" : "trinary to hexadecimal"
, "input_base" : 3
, "input_digits" : [1, 1, 2, 0]
, "output_base" : 16
, "output_digits": [2, 10]
},
{ "description" : "hexadecimal to trinary"
, "input_base" : 16
, "input_digits" : [2, 10]
, "output_base" : 3
, "output_digits": [1, 1, 2, 0]
},
I'm just adding one more, to use a bigger number without risking overflow:
{ "description" : "15-bit prime"
, "input_base" : 97
, "input_digits" : [3,46,60]
, "output_base" : 73
, "output_digits": [6,10,45]
},
I understand it's inconvenient to have numbers greater than 9 in the sequence, but I don't think it's possible to avoid that if we want a generic base-conversion exercise. For small bases, we could encode the digits in characters, but that wouldn't work for bigger bases and I believe it was decided that this would be a strictly numeric problem, right ❓ If you or anyone wants to change how the numbers are represented in the .json or add new cases, just send a message and I'll change it immediately. 😄 |
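The three test cases quoted in this comment can be cross-checked without a full converter: both digit sequences in each case should denote the same integer. The sketch below does that in Python; `value_of` is a hypothetical helper name introduced only for this check.

```python
from functools import reduce


def value_of(base, digits):
    """Evaluate a digit sequence (most significant first) with Horner's rule."""
    return reduce(lambda acc, d: acc * base + d, digits, 0)


# (input_base, input_digits, output_base, output_digits), copied from the JSON above.
cases = [
    (3, [1, 1, 2, 0], 16, [2, 10]),
    (16, [2, 10], 3, [1, 1, 2, 0]),
    (97, [3, 46, 60], 73, [6, 10, 45]),
]

for in_base, in_digits, out_base, out_digits in cases:
    assert value_of(in_base, in_digits) == value_of(out_base, out_digits)

print(value_of(3, [1, 1, 2, 0]))     # 42
print(value_of(97, [3, 46, 60]))     # 32749
```

The last case evaluates to 32749, which fits in 15 bits, so intermediate values stay small even in languages with fixed-width integers.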
It seems as if I just missed the existing tests, sorry. And I like the one test with exotic bases. I don't like going back to string representation again. It would limit us to a max base of 36/62 (depending on case sensitivity). And as you said, it is about converting numbers, not strings. We need to be aware, though, that the current exercises are very straightforward to implement, since one can just spit out the result of some computation and the base is fixed. This one has a higher complexity and might need to appear later in the list of exercises in most languages. |
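The 36/62 limit mentioned here comes from the number of distinct characters available: ten decimal digits plus 26 letters, or 52 letters when case matters. A rough Python sketch of that limitation follows; `to_text` and the alphabet constants are hypothetical names used only for illustration.

```python
import string

# 10 digits + 26 lowercase letters = 36 symbols (case-insensitive encoding).
ALPHABET_36 = string.digits + string.ascii_lowercase
# Adding 26 uppercase letters gives 62 symbols (case-sensitive encoding).
ALPHABET_62 = ALPHABET_36 + string.ascii_uppercase


def to_text(base, digits, alphabet=ALPHABET_62):
    """Render a digit sequence as a string, if the alphabet has enough symbols."""
    if base > len(alphabet):
        raise ValueError(f"base {base} needs more than {len(alphabet)} symbols")
    return "".join(alphabet[d] for d in digits)
```

Here `to_text(16, [2, 10])` gives `"2a"`, while the base-97 test case cannot be rendered at all, which is the argument for keeping the data as numeric sequences.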
I think this is a really good first pass. Let's merge it as is, and we can revisit it if we discover issues as languages implement it. |
Katrina Owen writes:
Any idea when you will let blazon file the issues? |
I will try to get this filed tomorrow. If anyone wants to help draft the issue text for blazon (over in #276 I think), that would be a great help! |
Update: it's going to happen over in #279 |
all-your-base 1.2.0 In #280 we made the explicit choice to leave various cases `expected: null` where tracks can make their own decisions (leading zeroes, the representations of zero, whether empty sequence is acceptable as input). There are four affected cases. #473 results in a discussion that some post-processing is necessary, based on the decisions that a given track makes. One might even imagine that each individual generator may have to reimplement base translation to arrive at the proper answers for each case. To ease the process of translation, instead we make some canonical choices, explicitly show what the choices are, and offer that tracks may make a different choice if desired. A previous proposal did not receive support: #468 However, this proposal differs because it does change the comments at the top of the file. Closes #1002 by mutual exclusion
all-your-base 2.0.0 This uses the error object defined in #401 (comment) and thenceforth added to the README's **Test Data Format**. In #280 we made the explicit choice to leave various cases `expected: null` where tracks can make their own decisions (leading zeroes, the representations of zero, whether empty sequence is acceptable as input). However, since `expected: null` is also used for cases where there is always an error (and there is no decision being left to the tracks), it becomes unreasonably hard to tell cases in these two categories apart. To make this easier, all cases in the latter category (those that are always errors) can canonically be marked as such.
This is my first exercise submission, so try to be nice! 😁
Partial solution to #276.
I haven't carefully checked the cases to see if everything is right, so there are probably some errors.