Add exercise all-your-base. #280

Merged
merged 1 commit into from
Jun 29, 2016

Conversation

rbasso
Contributor

@rbasso rbasso commented Jun 23, 2016

This is my first exercise submission, so try to be nice! 😁

Partial solution to #276.

I haven't carefully checked the cases to see if everything is right, so there are probably some errors.

@@ -0,0 +1,30 @@
Implement general base conversion. Given a number on base **a**,
Member

English question: do we say a number on base X or a number in base X?

Member

I think it's in (that's what you use lower down, and it sounds more natural to me).

Contributor Author

@rbasso rbasso Jun 23, 2016

I have no idea. I'm not a native English speaker.
~~I will fix that later.~~ Done! Thanks! 😄

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

I'm still unsure about what is the correct behavior when dealing with an empty sequence of digits. I chose to treat it as an invalid input, but I want a good explanation for that.

@kytrinyx
Member

> I chose to treat it as an invalid input, but I want a good explanation for that.

"" is not a number. "nothing" is not a number. 0 is a number.

Does that work?

@NobbZ
Member

NobbZ commented Jun 23, 2016

"nothing" can be a number of base 24 or higher.

For the empty sequence, I'd suggest following how most other sequence conversions behave: if the input sequence is empty, then the output is empty as well.

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

Well, the problem we have with numbers is that we usually accept distinct sequences as meaning the same number, e.g. [1] == [0,1] == [0,0,1]... I was just wondering whether it would make sense to apply the same reasoning to zero: [0, 0] == [0] == []?

I know we are used to representing zero as a single digit zero, but if dropping the zeroes on the left is allowed... Of course, we can define it in the traditional way, but when I started coding a solution, I was suddenly struck by the feeling that there is something deeply wrong with that. 😕

Taking what @NobbZ said as a principle, it would make sense to define the canonical representation of zero as [], and that solves the problem and forces me to rewrite all the tests.

It's against our social convention, but it is at least consistent.

Interesting...

Would that be too strange?

@ErikSchierboom
Member

My initial feeling would be to treat an empty sequence of digits as an invalid input. The number 0 is not an empty sequence of digits, it is the sequence with a single 0 digit: [0].

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

Would you consider [0, 0] as zero or as invalid input, @ErikSchierboom ?

Edit: I'm loving this discussion. Never imagined a simple problem like this one would be that interesting.

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

The questions we have to answer are:

rebase 2 2 [0,1] = ?
rebase 2 2 [0,0] = ?
rebase 2 2   [0] = ?
rebase 2 2    [] = ?

Edit:

rebase 2 2   [7] = ?

@ErikSchierboom
Member

To me, the following would be the most natural:

rebase a a [0,1] = 1
rebase a a [0,0] = 0
rebase a a   [0] = 0
rebase a a    [] = exception

The way I look at it is that I first concatenate the digits, and then try to parse that number. That means that [0, 1] translates to "01", which is equal to 1 when parsed. The same goes for [0, 0], which translates to "00", which parses to 0.

As a small check, I've created a .NET fiddle that tests this behavior when using .NET to parse the strings: https://dotnetfiddle.net/WPdeRW

In that example, "01" parses to 1, "00" and "0" parse to 0 and "" throws an exception.
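
The parse-based reading described above can be sketched in Python as well. This is only an illustration under the assumption of decimal digits 0 to 9; `parse_digits` is a hypothetical helper, not part of the exercise, and Python's `int` stands in for the .NET parser used in the fiddle.

```python
def parse_digits(digits):
    """Concatenate decimal digits into a string, then parse it (hypothetical helper)."""
    text = "".join(str(d) for d in digits)
    # int("") raises ValueError, so an empty sequence is rejected by the parser itself.
    return int(text)


print(parse_digits([0, 1]))  # 1
print(parse_digits([0, 0]))  # 0
print(parse_digits([0]))     # 0

try:
    parse_digits([])
except ValueError as exc:
    print("error:", exc)
```

Under this reading the empty sequence is an error only because the string parser says so; a purely numeric definition has to make that choice explicitly.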

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

For me it would be:

rebase 2 2 [0,1] = [1]
rebase 2 2 [0,0] = []
rebase 2 2   [0] = []
rebase 2 2    [] = []
and...
rebase 2 2   [7] = exception

Edit: The package digits has the same behavior, except that it doesn't check for invalid digits.

digits 10 1234
> [1,2,3,4]
digits 10 0
> []
unDigits 10 [1,2,3,4]
> 1234
unDigits 10 []
> 0

But I will accept the majority decision, of course!

What about the others?
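
The behavior quoted from the `digits` package can be mirrored in a few lines of Python. This is a rough sketch, not that package's implementation: `digits` and `un_digits` below are hypothetical Python stand-ins, and like the quoted functions they do not check for invalid digits.

```python
def digits(base, n):
    """Digits of a non-negative n in the given base, most significant first; 0 gives []."""
    out = []
    while n > 0:
        n, d = divmod(n, base)
        out.append(d)
    return out[::-1]


def un_digits(base, ds):
    """Evaluate a digit list with Horner's rule; the empty list evaluates to 0."""
    value = 0
    for d in ds:
        value = value * base + d
    return value


print(digits(10, 1234))             # [1, 2, 3, 4]
print(digits(10, 0))                # []
print(un_digits(10, [1, 2, 3, 4]))  # 1234
print(un_digits(10, []))            # 0
```

With these definitions, leading zeroes and the empty list need no special cases: `[0, 0]`, `[0]` and `[]` all evaluate to 0, and 0 always converts back to `[]`.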

@kytrinyx
Member

> "nothing" can be a number of base 24 or higher.

Oh, right. Yeah, totally.

@kytrinyx
Member

> rebase a a [0,1] = 1

Are we representing the answer as a sequence as well? rebase a a [0,1] = [1]?

@NobbZ
Member

NobbZ commented Jun 23, 2016

OK, after I had some time to think about all this while waiting for a train crossing and also had about twice as many red lights as usual, I realized some stuff.

  • I mentioned above that "nothing" could be a number of base 24 or higher; this seems not to be true in our current representation as a list of single digits.
  • Do we really use a list of integers/array of integers? (Haskell: [Int]; C: int[<size>])
  • What is the alignment of such a sequence? Reading the literal [1, 2] in Haskell is pretty much different from the literal {1, 2} in C! And then Java comes along, whose linked lists seem to be the reverse of Haskell's (#add appends to the end of the list and there is no way to prepend a single element).

There was actually much more going on in my head, but even as I am hiding in a cellar, I can't think due to heat…

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

> Are we representing the answer as a sequence as well? rebase a a [0,1] = [1]?

I think it's unavoidable. If we are converting from one base to another, the types have to be the same.

So, it would be a function from two bases and a sequence of digits to a sequence of digits.

In Haskell it would be:

rebase :: Integral a => a -> a -> [a] -> [a]

Did I get the idea the wrong way?

@NobbZ
Member

NobbZ commented Jun 23, 2016

> rebase :: Integral a => a -> a -> [a] -> [a]
>
> Did I get the idea the wrong way?

It is just the way I understood this as well. But I'd restrict everything to positive numbers; is there a typeclass for that, or does one have to use Word types explicitly?

And right now I really got interested in bringing Idris or Agda into exercism…

-- I'm a type signature written in Idris; all types used are in scope without importing
-- anything.
rebase : (from : Nat) -> (to : Nat) -> List (Fin from) -> List (Fin to)

Or at least something along these lines ;) Perhaps even aliasing Fin away to Digit

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

About the language-specific representation... @NobbZ

I wrote the .json using arrays just to emphasize that this is a numerical problem that deals with sequences of digits.

It's up to each language to decide if it's a list, a vector, an array or a string with numbers separated by commas, right?

Edit: One thing to keep in mind is the conversion between trinary and hexadecimal in the tests. It returns [2,10], so the representation needs to deal with numbers possibly bigger than 9.

Anyway, we still need to define the answers for those 5 cases up there.

@kytrinyx
Member

> One thing to keep in mind is the conversion between trinary and hexadecimal in the tests. It returns [2,10], so the representation needs to deal with numbers possibly bigger than 9.

How about we represent each digit in the sequence in binary?

@kytrinyx
Member

Ok, that's terrible. But we need some JSON-friendly way to specify the value, the base, and then each language will need to figure out how to turn that into the correct type.

@NobbZ
Member

NobbZ commented Jun 23, 2016

For me, the cases listed above are resolved as follows:

rebase 2 2 [0,1] = [1]

obvious, isn't it?

rebase 2 2 [0,0] = [0]
rebase 2 2   [0] = [0]

Ignore leading zeros, but make the overall value of zero explicit

rebase 2 2    [] = []

For me, converting an empty sequence does mean to have an empty sequence after conversion.

And to be honest, I asked two profs about this. I got two answers :( One said trying to convert an empty sequence is just a plain old error and has to be signaled in some way, and mumbled something about NULL pointers (he gives lectures about all the basic number stuff and theoretical internals of a computer, and also helps out in the C and ASM [RIP] exercises). The other one instantly returned an empty list ;) He teaches FP and explained that the algorithm is much easier to prove correct when it returns empty for empty.

rebase 2 2   [7] = Nothing -- Left "digit 7 not of base 2"; exception

It's rather obvious as well, isn't it?
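
The five answers given in this comment can be put together in one small Python function. This is a sketch of this particular proposal only, not the behavior the exercise finally settled on. Raising `ValueError` as the error signal and rejecting bases below 2 are assumptions added for the illustration.

```python
def rebase(input_base, output_base, digits):
    """Convert a digit list between bases, following the choices listed above."""
    if input_base < 2 or output_base < 2:
        # Assumption: the comment above does not discuss invalid bases.
        raise ValueError("bases must be at least 2")
    if not digits:
        return []  # empty sequence in, empty sequence out

    value = 0
    for d in digits:
        if not 0 <= d < input_base:
            raise ValueError(f"digit {d} not of base {input_base}")
        value = value * input_base + d  # Horner's rule; leading zeroes add nothing

    if value == 0:
        return [0]  # make the overall value of zero explicit

    out = []
    while value > 0:
        value, d = divmod(value, output_base)
        out.append(d)
    return out[::-1]


print(rebase(2, 2, [0, 1]))  # [1]
print(rebase(2, 2, [0, 0]))  # [0]
print(rebase(2, 2, [0]))     # [0]
print(rebase(2, 2, []))      # []
```

Note the asymmetry this creates: `[0]` and `[]` are both accepted as input but map to different outputs, which is exactly the point the rest of the thread debates.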

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

> Ok, that's terrible. But we need some JSON-friendly way to specify the value, the base, and then each language will need to figure out how to turn that into the correct type.

Terrible good or terrible bad? 😁

I thought the JSON was good the way it is... 😢

{ "description": "trinary to hexadecimal"
, "base_one"   : 3
, "base_two"   : 16
, "digits"     : [1, 1, 2, 0]
, "expectation": [2, 10]
}

@wobh

wobh commented Jun 23, 2016

Is there a better name for the base_one, base_two keys? Myself, I'd rename all four keys to input_base, input_digits, expected_base, expected_digits.

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

@wobh, I agree they are bad names....I just copied them. 😁

What about inputBase, outputBase, input, output?
Does anyone here use camelCase?

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

> rebase 2 2 [] = []
> ...
> One would say, trying to convert an empty sequence is just a plain old error and has to be signaled in some way.

This is probably because he/she thinks that a null string signals the lack of a number, so it makes sense to return an error. I guess most C programmers are in this camp.

> ...the other one instantly returned an empty list

He/she probably thinks that an empty list just signals the lack of digits, not the lack of a number. This one would maybe think that the correct return for any base conversion of zero is [].

It's not by accident that digits implemented it that way. Seems to be the functional way...

The discussion here seems to be just about the acceptable and the canonical representations of zero, right?

We could try a middle ground and leave the conversions from [0], [], and invalid bases as unspecified in the JSON file. That part would be language-dependent, like the error signaling, so each language would implement its own representation and tests.

This way we avoid forcing an unidiomatic exercise in a language, because there is no single answer that could please an ASM and a Haskell programmer at the same time. 😄

Most of the over-specified exercises from other languages are artificial in Haskell. They are literally alien to us. 👽

In fact, #276 started just because of a discussion about unidiomatic exercise implementations in an issue raised by @petertseng (exercism/haskell#115). binary, trinary, octal and hexadecimal are most of the list, by coincidence. 😄

So, reviewing my former position, I'm completely against forcing any track collaborator, maintainer or user to accept artificial exercises. IMHO, we should drop those corner cases or remove the JSON file altogether.

What do you think?

Another question: Should we write...

Implement the conversion yourself. Do not use something else to perform the conversion for you.

...or is it OK to leave it to each person to decide what the challenge is?

@kytrinyx
Member

> IMHO, we should drop those corner cases or remove the JSON file altogether.

I think it would be incredibly useful to have a JSON file that defines what a good set of inputs and outputs are. If we don't do that, each implementor will have to make this up from scratch, and will likely miss interesting/important cases, and then we'll have more churn than necessary.

I think it's probably fine to drop the corner cases.

@rbasso
Contributor Author

rbasso commented Jun 23, 2016

Ok, @kytrinyx.

I'm rewriting the JSON to state explicitly what is left to the implementation. This way the corner cases will be easily identifiable.

I still would like more feedback about the variable names in the JSON.

ps: Now I regret the moment I wrote:

> I'm still unsure about what is the correct behavior when dealing with an empty sequence of digits. I chose to treat it as an invalid input, but I want a good explanation for that.

Sorry, people! 😔

I had no idea about the implications of that... 😁

@wobh

wobh commented Jun 24, 2016

@rbasso Maybe you've seen this SO thread and the answer I'm linking to, but if not, it seems like it would be of interest: http://stackoverflow.com/a/2488528.

@rbasso
Contributor Author

rbasso commented Jun 24, 2016

Hi, @wobh. Never saw that thread. Thanks! But I'm still trying to understand how it is related to this exercise.

Edit: all-your-base in Haskell will probably be implemented without any use of Char or Text. No parsing at all, just pure numerical transformations.

Changing subject, are you OK with input_base, input_digits, output_base, output_digits in the JSON file? I avoided expected_base and expected_output because the base is given, and not expected at all.

I know it's hard, but it would be great to merge this having everyone at least satisfied with the outcome. Really! 😄

So...what are the problems you all see with the version that we have now, people ❓

@wobh

wobh commented Jun 25, 2016

I just thought it could be helpful in determining expectations Haskell programmers might have about numbers as sequences of digits, by analogy to how they're parsed when represented as a string of characters. If it's not a helpful analogy or example, thank you for at least indulging me and looking at it. It was something I became curious about reading this thread.

@kytrinyx
Member

> Changing subject, are you OK with input_base, input_digits, output_base, output_digits in the JSON file?

These names seem reasonable to me. Much clearer than expected in this case.

@rbasso
Contributor Author

rbasso commented Jun 28, 2016

It's been a while, so... does anyone have anything to say?
Should we merge it, change it, or close it?

@NobbZ
Member

NobbZ commented Jun 28, 2016

I'm missing a test case for a not so widely used base. Also, none of the test cases shows how digits greater than 9 should be entered or represented. Aside from that, I like it.

@rbasso
Contributor Author

rbasso commented Jun 28, 2016

Thanks for the feedback, @NobbZ .

About the digits greater than 9, there are two tests:

        { "description"  : "trinary to hexadecimal"
        , "input_base"   : 3
        , "input_digits" : [1, 1, 2, 0]
        , "output_base"  : 16
        , "output_digits": [2, 10]
        },
        { "description"  : "hexadecimal to trinary"
        , "input_base"   : 16
        , "input_digits" : [2, 10]
        , "output_base"  : 3
        , "output_digits": [1, 1, 2, 0]
        },

I'm just adding one more, to use a bigger number without risking overflow (base <= 127 && number <= 32767):

        { "description"  : "15-bit prime"
        , "input_base"   : 97
        , "input_digits" : [3,46,60]
        , "output_base"  : 73
        , "output_digits": [6,10,45]
        },

I understand it's inconvenient to have numbers greater than 9 in the sequence, but I don't think it's possible to avoid that if we want a generic base-conversion exercise. For small bases, we could encode the digits in characters, but that wouldn't work for bigger bases and I believe it was decided that this would be a strictly numeric problem, right ❓

If you or anyone wants to change how the numbers are represented in the .json or add new cases, just send a message and I'll change it immediately. 😄
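
The "15-bit prime" case and its overflow bound can be checked by hand or with a short Python sketch. It assumes the bound quoted above refers to a signed 16-bit integer; `horner_prefixes`, `to_digits` and `INT16_MAX` are hypothetical names introduced only for this check.

```python
def horner_prefixes(base, digits):
    """Yield every intermediate value of Horner's rule, most significant digit first."""
    value = 0
    for d in digits:
        value = value * base + d
        yield value


def to_digits(base, value):
    """Digits of a positive value in the given base, most significant first."""
    out = []
    while value > 0:
        value, d = divmod(value, base)
        out.append(d)
    return out[::-1]


INT16_MAX = 32767  # assumed meaning of "number <= 32767"

prefixes = list(horner_prefixes(97, [3, 46, 60]))
print(prefixes)                               # [3, 337, 32749]
print(all(p <= INT16_MAX for p in prefixes))  # True
print(to_digits(73, prefixes[-1]))            # [6, 10, 45]
```

Because every intermediate value in Horner's rule is a prefix of the final number, keeping the final number within the bound keeps the whole input evaluation within it too.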

@NobbZ
Member

NobbZ commented Jun 29, 2016

It seems as if I just missed the existing tests, sorry.

And I like the one test with exotic bases.

I don't like going back to string-representation again. They'd limit us to a max base of 36/62 (depending on case sensitivity). And as you said, it is about converting numbers, not strings.

We need to be aware, though, that the current exercises are very straightforward to implement, since one can just spit out the result of some computation and has a fixed base. This one has a higher complexity and might need to appear later in the list of exercises in most languages.
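
The 36/62 limit on string representations mentioned above comes from the size of the character set. A small Python sketch makes the limit concrete; the alphabets and the `encode` helper are hypothetical and chosen only for illustration (digits, then lowercase, then uppercase).

```python
import string

ALPHABET_36 = string.digits + string.ascii_lowercase  # case-insensitive encoding
ALPHABET_62 = ALPHABET_36 + string.ascii_uppercase    # case-sensitive encoding


def encode(digits, alphabet):
    """Map each numeric digit to one character; fail if a digit has no character."""
    if any(not 0 <= d < len(alphabet) for d in digits):
        raise ValueError("digit has no single-character encoding")
    return "".join(alphabet[d] for d in digits)


print(len(ALPHABET_36), len(ALPHABET_62))  # 36 62
print(encode([2, 10], ALPHABET_36))        # 2a

try:
    encode([3, 46, 60], ALPHABET_36)       # base-97 digits from the test above
except ValueError as exc:
    print("error:", exc)
```

A base-97 number can contain digits up to 96, so even the 62-character alphabet cannot cover it in general, whereas a list of integers has no such ceiling.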

@kytrinyx
Member

I think this is a really good first pass. Let's merge it as is, and we can revisit it if we discover issues as languages implement it.

@kytrinyx kytrinyx merged commit 40fa791 into exercism:master Jun 29, 2016
@NobbZ
Member

NobbZ commented Jun 29, 2016

Katrina Owen writes:

> I think this is a really good first pass. Let's merge it as is, and we can revisit it if we discover issues as languages implement it.

Any idea when you will let blazon file the issues?

@kytrinyx
Member

I will try to get this filed tomorrow. If anyone wants to help draft the issue text for blazon (over in #276 I think), that would be a great help!

@kytrinyx
Member

Update: it's going to happen over in #279

@rbasso rbasso deleted the add-exercise-all-your-base branch July 4, 2016 14:24
petertseng added a commit that referenced this pull request Nov 11, 2017
all-your-base 1.2.0

In #280 we made
the explicit choice to leave various cases `expected: null` where tracks
can make their own decisions (leading zeroes, the representations of
zero, whether empty sequence is acceptable as input).

There are four affected cases.

#473 results in
a discussion that some post-processing is necessary, based on the
decisions that a given track makes. One might even imagine that each
individual generator may have to reimplement base translation to arrive
at the proper answers for each case.

To ease the process of translation, instead we make some canonical
choices, explicitly show what the choices are, and offer that tracks may
make a different choice if desired.

A previous proposal did not receive support:
#468

However, this proposal differs because it does change the comments at
the top of the file.

Closes #1002 by mutual exclusion
petertseng added a commit that referenced this pull request Nov 11, 2017
all-your-base 2.0.0

This uses the error object defined in
#401 (comment)
and thenceforth added to the README's **Test Data Format**.

In #280 we made
the explicit choice to leave various cases `expected: null` where tracks
can make their own decisions (leading zeroes, the representations of
zero, whether empty sequence is acceptable as input).

However, since `expected: null` is also used for cases where there is
always an error (and there is no decision being left to the tracks),
it becomes unreasonably hard to tell cases in these two categories
apart.

To make this easier, all cases in the latter category (always are
errors) can canonically be marked as such.