Suggestion for Dependent-Type-Like Functions: Conservative Narrowing of Generic Indexed Access Result Type #33014

Closed
5 tasks done
rubenpieters opened this issue Aug 21, 2019 · 31 comments · Fixed by #56941
Labels
In Discussion Not yet reached consensus Suggestion An idea for TypeScript

Comments

@rubenpieters

rubenpieters commented Aug 21, 2019

Search Terms

generic indexed access type dependent function
generic narrowing

Suggestion

TypeScript does not provide dependent types, but by combining various features such as unions, literal types and indexed access types we can obtain behaviour fairly similar to a dependent function from the caller's point of view. Unfortunately, the implementer of such a function either risks creating unsoundness (pre 3.5) or is forced to assert the return type with unsound features (post 3.5). There are some differences with typical dependent functions, which is why we call these dependent-type-like functions.

This suggestion proposes an addition to the typechecking part of TypeScript, which aids the programmer in writing dependent-type-like functions. The main attractive point of this suggestion is that it does not introduce any additional syntax, and is meant to be a conservative extension. Thus it is not a breaking change with regards to the changes introduced in 3.5, but it does enable certain sound scenarios to typecheck.

Use Case

The main use case is dependent-type-like functions. The depLikeFun example is a minimal scenario showcasing such a situation: the depLikeFun function returns number when its input is of type "t" and boolean when its input is of type "f".

interface F {
  "t": number,
  "f": boolean,
}

function depLikeFun<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return 1;
  } else {
    return true;
  }
}

depLikeFun("t"); // has type number
depLikeFun("f"); // has type boolean

This pattern occurs in various places, such as the TypeScript DOM bindings, in issues such as #31672 and #32698, in comments of related issues such as #13995, or in Stack Overflow questions. This extension could serve as a workaround for related issues such as #22609 and #23861.

Problem

The problem lies in the implementation of the function. The pre-3.5 behaviour made it possible to create unsoundness in these types of functions. TypeScript checked the return value using the constraint of the key type T, simplifying F[T] to number | boolean. However, this is unsound since the caller can provide a more specific type for T such as "t".

// pre-3.5
function depLikeFun<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return true; // should be error
  } else {
    return 1; // should be error
  }
}

depLikeFun("t"); // has type number, but is actually a boolean
depLikeFun("f"); // has type boolean, but is actually a number

The post-3.5 behaviour also isn't satisfactory for this use case. It prevents depLikeFun from being implemented without unsafe type assertions. Since #30769, assigning to type F[T] is interpreted as a write to F at key T. This means that the result type F[T] is checked against the intersection of its possibilities, which is number & boolean and thus never.

// post-3.5
function depLikeFun<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return 1; // unexpected error: '1' is not assignable to never
  } else {
    return true; // unexpected error: 'true' is not assignable to never
  }
}
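The write-side rule can be seen in isolation. The following is a minimal sketch, not code from the issue: the helper names setUnion and setValue and the sample record are hypothetical, and the snippet assumes a compiler at version 3.9 or later so that the @ts-expect-error directive is available.

```typescript
interface F {
  t: number;
  f: boolean;
}

// Writing through a union key: the value must satisfy every property the key
// could select, i.e. number & boolean, which reduces to never.
function setUnion(obj: F, key: "t" | "f"): void {
  // @ts-expect-error Type 'number' is not assignable to type 'never'.
  obj[key] = 1;
}

// Writing through a generic key is accepted when the value is typed as F[K],
// because source and target are the same indexed access type.
function setValue<K extends "t" | "f">(obj: F, key: K, value: F[K]): void {
  obj[key] = value;
}

const record: F = { t: 0, f: false };
setValue(record, "t", 42);
setValue(record, "f", true);
```

A return statement of declared type F[T] is checked with the same write rule, which is why both literals in the post-3.5 example are compared against never.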

Mistakes are more likely to occur in complex situations, and thus aiding the user with these types of functions seems in line with TypeScript's design goals.

Examples

In a dependently typed language depLikeFun would be modeled as function depFun(str: "t" | "f"): F[str]. There are meaningful differences between depending on the actual value of the input and depending on the type of an input. This distinction makes the issue trickier to solve than it appears at first sight. In this section we showcase the expected behaviour of the addition on certain representative examples.

The main idea behind the addition is as follows: in a dependent-type-like function we cannot narrow the type T of a variable when its value is checked. For example, if str has type T extends "t" | "f" and we check whether str === "t", then it is unsafe to narrow T to "t" in that branch, since T could also be instantiated with the wider type "t" | "f". Instead, we add knowledge about the type T within the branch which is more conservative, but makes it possible to allow the behaviour of dependent-type-like functions. In more traditional programming language theory, the knowledge added is very similar to adding a lower bound "t" <: T into the context.

Basic Use Case

First we look at the depLikeFun example. In the if-branch, returning 1 is allowed since after checking str === "t" the type of T can only be "t" or "t" | "f". The caller will therefore expect either a number or a number | boolean, so it is safe to return the value 1.

Note: The claim above that T can only be "t" or "t" | "f" is not quite true due to branded types. For example, T could also be the type Branded, defined as type Branded = "t" & {_branded?: any}. In this case the caller expects a value of type unknown, so returning 1 is also safe.

function depLikeFun<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return 1; // no error
  } else {
    return true; // no error
  }
}
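The two instantiations discussed above can be checked from the caller's side. This is a sketch only: because the proposed checking does not exist in 3.5, the stand-in implementation below uses type assertions to compile, and the variable names narrow and wide are illustrative.

```typescript
interface F {
  t: number;
  f: boolean;
}

// Stand-in implementation: it compiles today only because of the assertions.
function depLikeFun<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return 1 as F[T];
  }
  return true as F[T];
}

// T is inferred as "t", so the caller expects exactly number.
const narrow: number = depLikeFun("t");

// T is explicitly instantiated with the full union, so the caller expects
// number | boolean. Returning 1 satisfies this expectation as well.
const wide: number | boolean = depLikeFun<"t" | "f">("t");
```

In both calls the runtime argument is "t", and in both cases the value 1 inhabits the type the caller was promised.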

An example which does not occur for dependent functions, but does occur in dependent-type-like functions, is the following. The type T can be attributed to multiple inputs, and this possibility is what makes it unsafe to fully narrow the type parameter based on a check of the input str. However, the reasoning above still holds since the added knowledge is true regardless of how many inputs T is linked to.

function depLikeFun<T extends "t" | "f">(str: T, str2: T): F[T] {
  if (str === "t") {
    return 1; // no error
  } else {
    return true; // no error
  }
}

The extension is conservative, and behaviour of other cases should be the same as in 3.5. However, what might change is the error message users see when implementing dependent-type-like functions. In the situation below, instead of comparing true / 1 to never, they get compared against the simplified indexed access type within each branch. This is number in the then branch and boolean in the else branch.

function depLikeFun<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return true; // Type 'true' is not assignable to type 'number'
  } else {
    return 1; // Type '1' is not assignable to type 'boolean'
  }
}

Non-Return Indexed Access

The type F[T] can occur elsewhere in the function. This extension retains the normal TypeScript behaviour in those cases, and is only enabled when checking the return type of a function. In the following example we have a value of type F[T] as second input to the function. Within the then branch, this type should not be simplified to number, since it can actually be of type boolean. Checking the first input value does not give us any information regarding the second input value, and thus we cannot do any more simplification.

function depLikeFun<T extends "t" | "f">(str: T, ft: F[T]): F[T] {
  if (str === "t") {
    const n: number = ft; // Type 'F[T]' is not assignable to type 'number'.
    return 1;
  } else {
    return true;
  }
}

let x: boolean = false;
depLikeFun<"t" | "f">("t", x); // ft can be of type 'boolean'

Transitive Type Parameters

TypeScript allows an indexed access with a type parameter which has a direct transitive line to a correct bound. For example, it allows function depLikeFun<T extends "t" | "f", B extends T>(str: B): F[B], but disallows function depLikeFun<T extends "t", F extends "f", B extends T | F>(str: B): F[B].

In these situations it seems safe to let information flow through the transitive chain in both directions. For example, when checking an input of type B, we can add knowledge about T. And vice versa for checking an input of type T and an input of type B.

function depLikeFun<T extends "f" | "t", B extends T>(str: B): F[T] {
  if (str === "t") {
    return 1; // no error
  } else {
    return true; // no error
  }
}

function depLikeFun<T extends "f" | "t", B extends T>(str: T): F[B] {
  if (str === "t") {
    return 1; // no error
  } else {
    return true; // no error
  }
}

Consistency with Conditional Types

TypeScript provides alternative means of creating type-level functions by using conditional types. Instead of creating the record type F, we can create the type constructor F which provides a similar mapping.

type F<A extends "t" | "f"> =
  A extends "t" ? number :
  A extends "f" ? boolean :
  never;

Which can be used as seen below.

function depLikeFun<T extends "t" | "f">(str: T): F<T> {
  if (str === "t") {
    return 1;
  } else {
    return true;
  }
}

This raises the question whether typechecking for dependent-type-like functions should also be added for type-level functions created with conditional types, for consistency. Two points are worth noting: users were never able to implement this behaviour using conditional types (even pre-3.5), and type-level functions with conditional types are not restricted to a domain of string keys, which makes things more complex. Nevertheless, we feel it is a point worth bringing up for discussion.

Behaviour with any

The result of instantiating a dependent-type-like function with the any type gives a result of any. This occurs, for example, when disabling --strictNullChecks and calling the function with null. Any behaviour related to interaction with null/any and dependent-type-like functions is supposed to be unchanged compared to 3.5.

Compiler Changes

In this section we roughly outline the suspected changes to the compiler needed to implement this extension.

  • The main change is an addition to the checkReturnStatement (checker.ts) function. When the criteria for enabling dependent-type-like function checking are true, we instantiate the type parameter based on a lookup of checks with literals in the control flow graph. These criteria are currently: the index type is a type parameter, this type parameter is declared in the enclosing function, the index type can be used to index the object type and the object type is concrete.
  • To be able to check the control flow graph in return statements, the graph for return statements needs to be bound in the bindWorker function (binder.ts).

Workaround

The currently suggested workarounds seem to be type assertions or overloaded signatures, both of which are unsafe features. Neither is satisfactory, since the typechecker does not help prevent programmer mistakes when implementing dependent-type-like functions.
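To make the unsafety concrete, here is a small sketch in which both workarounds are implemented with the branches deliberately swapped. The function names viaAssertion and viaOverloads are hypothetical; the point is only that the compiler accepts both.

```typescript
interface F {
  t: number;
  f: boolean;
}

// Workaround 1: type assertions. The branches are swapped on purpose,
// and the function still typechecks.
function viaAssertion<T extends "t" | "f">(str: T): F[T] {
  if (str === "t") {
    return true as F[T]; // wrong, but no error
  }
  return 1 as F[T]; // wrong, but no error
}

// Workaround 2: overloads. The implementation signature is only loosely
// related to the overload signatures, so the same mistake goes unnoticed.
function viaOverloads(str: "t"): number;
function viaOverloads(str: "f"): boolean;
function viaOverloads(str: "t" | "f"): number | boolean {
  if (str === "t") {
    return true; // wrong, but no error
  }
  return 1; // wrong, but no error
}

const fromAssertion: number = viaAssertion("t");
const fromOverloads: number = viaOverloads("t");
console.log(typeof fromAssertion, typeof fromOverloads); // prints "boolean boolean"
```

Both variables are statically typed as number while holding a boolean at runtime, which is the pre-3.5 unsoundness reintroduced by hand.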

Related Suggestions

The following list gathers related suggestions.

  1. extending the TypeScript syntax, such as a oneof generic constraint: #25879, #27808 or #30284
  2. dependent types in TypeScript, which would have a huge impact on the compiler and is possibly out of scope for the TypeScript project
  3. this extension is a more refined idea based on previous ideas of narrowing types based on term-level checks such as #21879

Complementary to oneof constraints

This proposal has some overlapping use cases with proposals which propose to extend TypeScript with a form of oneof generic constraint, as mentioned in point 1. This proposal is not competing with these, but rather should be seen as complementary. The oneof proposals suggest a new generic constraint which is more restrictive in which instantiations it allows. However, for many of these use cases this is only part of the solution. In addition to adding this syntax, the typechecker must also take this restriction into account to typecheck the original use cases. This proposal kickstarts the implementation of this additional behaviour by focusing on a more constrained use case which does not need this new constraint. If a oneof generic constraint is added to TypeScript, the behaviour defined in this proposal can be extended to take this additional constraint into account.

Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.

NOTE: (Regarding bullet point 1) The goal of this proposal is to provide more complete typechecking than 3.5, and thus we want to avoid breaking any code compared to 3.5.

@RyanCavanaugh RyanCavanaugh added In Discussion Not yet reached consensus Suggestion An idea for TypeScript labels Aug 21, 2019
@RyanCavanaugh
Member

@ahejlsberg this is a co-effort with @jack-williams that looks very promising. Thoughts?

@AnyhowStep
Contributor

AnyhowStep commented Aug 21, 2019

Silly question but,

Will this also narrow x to "t" and "f"?

What about the more generalized narrowing of T from an arbitrary discriminated union type?


Taking @keithlayne 's example from Gitter,

interface Foo {
    a: number
    b: string
    c: boolean
}
declare const foo: 'a' | 'b' | 'c'
function f<T extends typeof foo>(x: T, y: Foo[T]) {
    if (x === 'a') {
        x // T extends "a" | "b" | "c"
    }
}

Would be nice if x is narrowed to "a".

Of course, we can't narrow y:Foo[T] here (unsound)

@rubenpieters
Author

@AnyhowStep

Narrowing the type parameter T to "t" or "f" is unsound.

Consider this situation:

function f<T extends "t" | "f">(x: T, y: T) {
    if (x === 't') {
        y // has type "t", but can possibly be inhabited by "f"
    }
}
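A runnable variant of this situation, as a sketch: the string return value and the explicit type argument are additions made here so that the effect can be observed at runtime.

```typescript
function f<T extends "t" | "f">(x: T, y: T): string {
  if (x === "t") {
    // If T itself were narrowed to "t" here, y would be typed "t" as well.
    return `x=${x}, y=${y}`;
  }
  return "x is not t";
}

// T is instantiated with the whole union, so x and y may hold different keys.
const result = f<"t" | "f">("t", "f");
console.log(result); // prints "x=t, y=f"
```

Inside the branch x is known to be "t", yet y holds "f", so the check on x says nothing about other values typed T.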

@fatcerberus

@rubenpieters

> In more traditional programming language theory, the knowledge added is very similar to adding a lower bound "t" <: T into the context.

According to what @jack-williams told me in another issue, it's not necessarily sound to narrow to T super "t" in this case either (and you seem to acknowledge as much by saying "very similar"): can you give me an example of when lower-bounding wouldn't be sound?

@AnyhowStep
Contributor

AnyhowStep commented Aug 21, 2019

My bad. I meant narrow x. Sorry!
Brain fart

@rubenpieters
Author

rubenpieters commented Aug 21, 2019

@fatcerberus

It is similar to lower bounds, in that the bound is applicable only in a very limited scope: the indexed access type of a function return type.

Adding a lower bound into the context can be problematic in scenarios similar to the one described in the Non-Return Indexed Access section. But this depends on the rules for how the simplification of indexed access works, so possibly there wouldn't be an issue for TypeScript.

Adding the general form of lower bounds seems more complicated without enabling interesting use cases, compared to keeping the bounds limited to checking the return statement. Plus, adding lower bounds in this way breaks parametricity.

@AnyhowStep

That seems ok at first sight. But I'm not sure if this should be part of this proposal.

> What about the more generalized narrowing of T from an arbitrary discriminated union type

That is something I would like to investigate as well.

@jack-williams
Collaborator

It's probably better not to think of these as 'true' lower-bounds on the type, rather a lower bound on the type's behaviour as a set of keys.

When we see str === "t", we don't know that T is lower-bounded by "t", but we do know that the set of keys denoted by T include "t". It's a weaker statement but sufficient for refining indexed access types because they only care about T as a set of keys.

@AnyhowStep

Narrowing x would be sound but comes with additional technical challenges and it doesn't help us assign a value to F[T] (unless we are returning something like f[x]).

@fatcerberus

> When we see str === "t", we don't know that T is lower-bounded by "t"

Don’t you, though? I mean, yes, you can manually instantiate the generic using an unrelated type and then cast, but that’s already unsound in so many other ways that I don’t think we need to account for it at that point.

If there’s a case I’m missing where full lower-bounding would make a valid instantiation unsound, please enlighten me, I’m all ears! 👂

@rubenpieters
Author

rubenpieters commented Aug 22, 2019

@fatcerberus

I don't think lower bounds by themselves create unsoundness. The problem arises when creating interaction between lower bounds and the indexed access simplification.

For example in the example below, imagine we add the lower bound "t" <: T in the if-branch. This is ok by itself, but does not let us typecheck the return-statement. For this to work we need to add a rule which allows us to do something like: given a lower bound "t" <: T, then F[T] simplifies to number. However, this is problematic as the type of the variable ft also gets simplified, which is unsound. So, having the interaction between a lower bound and indexed access simplification be applicable everywhere creates unsoundness.

function depLikeFun<T extends "t" | "f">(str: T, ft: F[T]): F[T] {
  if (str === "t") {
    const n: number = ft; // ft would be simplified to `number`, but can be `boolean`
    return 1;
  } else {
    return true;
  }
}

let x: boolean = false;
depLikeFun<"t" | "f">("t", x); // ft can be of type 'boolean'

@jack-williams
Collaborator

jack-williams commented Aug 22, 2019

I think for the initial proposal we would only look at invoking narrowing for specific forms of return type, but ideally we would be more liberal and handle cases like:

const enum Flags {
    On,
    Off
}

interface FlagLights {
    [Flags.On]: 'green';
    [Flags.Off]: 'red';
}

interface FlagText {
    [Flags.On]: 'on';
    [Flags.Off]: 'off';
}

function getToggleProps<T extends Flags>(flag: T): { colour: FlagLights[T]; text: FlagText[T] } {
  switch (flag) {
    case Flags.On:
      return { colour: 'green', text: 'on' };
    case Flags.Off: 
      return { colour: 'red', text: 'off' };
  }
}
(Though you can get around the limitation with helper functions.)

@rubenpieters
Author

rubenpieters commented Aug 23, 2019

Indeed. If we do want to handle narrowing of generic indexed access inside a structure, we do have to take care to only handle the positive positions. Otherwise we get the same problem as I mentioned above.

function depLikeFun<T extends "t" | "f">(str: T): (ft: F[T]) => F[T] {
  if (str === "t") {
    return (ft: number) => { // should be rejected
        return 1;
    };
  } else {
    return (ft: F[T]) => {
        return true;
    };
  }
}

depLikeFun<"t" | "f">("t")(true); // ft can be `boolean`

@rubenpieters
Author

rubenpieters commented Sep 20, 2019

After experimenting more with this, I feel that limiting the implementation to checkReturnStatement will be too brittle to work for more complex cases. One example is the kind of use case @jack-williams pointed out, where the indexed access is nested inside the return type.

Therefore, I suggest focusing the implementation inside the getSimplifiedIndexedAccessType function instead. At first sight this might bring along some problematic interactions, but I think there are proper solutions for them.

Let's focus on this function which gathers the essence of these interactions:

function f<T extends "t" | "f">(
    t: T,
    t2: T, // second key which we do not test
    ft: F[T], // a value of type F[T]
    f: F, // a record of type F
) {
    if (t === "t") {
        const n: number = ft; // a) should be rejected, ft can be bool
        f[t2] = 1; // b) should be rejected, f[t2] can be bool
        return 1; // c) should be accepted
    }
    throw "";
}

This use case shows three scenarios, two of which should be rejected and one of which should be accepted. Scenario A shows that variance is important when doing the simplification for this proposal. This information is already part of the getSimplifiedIndexedAccessType function, which I think we can reuse. Scenario B is a bit more problematic, but it is possible to resolve it using something like #33089, which assigns a specific write indexed access type to the lhs of the expression f[t2] = 1. Scenario C is the normal operation of the extension, and since the implementation is located in getSimplifiedIndexedAccessType it naturally extends to checking indexed access inside nested structures as well.

Any thoughts / comments are welcome.

@miginmrs

miginmrs commented Jan 2, 2020

Function overloading provides a functional workaround.
Since the return type of depLikeFun depends on its parameter type, the function type can be:

{
    (str: "t"): number;
    (str: "f"): boolean;
}

And it can be safely defined this way:

type K = string | number | symbol;
const keyIs = <k extends K, C, T extends { [i in k]: unknown; }, V extends { [i in k]: C; }>(
  args: T | V, k: k, c: C): args is V => args[k] === c;

function depLikeFun(str: "t"): F["t"];
function depLikeFun(str: "f"): F["f"];
function depLikeFun(...args: ["t"] | ["f"]) {
  if (keyIs(args, 0, 't' as const)) {
    return 1 as number;
  } else {
    return true as boolean;
  }
}
const myNumber = depLikeFun('t'), myBoolean = depLikeFun('f');

@rubenpieters
Author

rubenpieters commented Jan 2, 2020

@miginmrs This is covered in the Workaround section.

As far as I can see, your workaround is unsafe as mentioned. The compiler does not prevent incorrect implementations of depLikeFun, essentially reverting back to pre-3.5 behaviour. If we swap the 1 and true return values in your example, the compiler does not complain. Playground link.

@miginmrs

miginmrs commented Jan 2, 2020

@rubenpieters Thank you for the note, I didn't notice this before.
By the way, I am still looking for a workaround to use until the feature is implemented. It may be ugly, but this is the best I could come up with:

type depLikeFun<arg, T> = typeof depLikeFun extends { (x: arg): T; } ? T : never;
function depLikeFun(str: "t"): F["t"];
function depLikeFun(str: "f"): F["f"];
function depLikeFun(str: "t" | "f") {
  if (str === 't') {
    const ret: depLikeFun<typeof str, number> = 1;
    return ret;
  } else {
    const ret: depLikeFun<typeof str, boolean> = true as boolean;
    return ret;
  }
}
const myNumber = depLikeFun('t'), myBoolean = depLikeFun('f');

@rubenpieters
Author

@miginmrs Actually, I think you can simplify your depLikeFun type with just F[typeof str]. Then it becomes this:

function depLikeFun(str: "t"): F["t"];
function depLikeFun(str: "f"): F["f"];
function depLikeFun(str: "t" | "f") {
  if (str === "t") {
    const ret: F[typeof str] = 1;
    return ret;
  } else {
    const ret: F[typeof str] = true as boolean;
    return ret;
  }
}

It requires the discipline of annotating the return type with this expression to prevent mistakes, but is the neatest workaround I've seen so far.
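As a sketch of what the annotation buys, the variant below adds one deliberately wrong assignment guarded by @ts-expect-error; if the compiler did not reject that line, the directive itself would be reported as unused. The variable name swapped and the explicit return type on the implementation are additions for illustration, and the directive assumes TypeScript 3.9 or later.

```typescript
interface F {
  t: number;
  f: boolean;
}

function depLikeFun(str: "t"): F["t"];
function depLikeFun(str: "f"): F["f"];
function depLikeFun(str: "t" | "f"): number | boolean {
  if (str === "t") {
    // typeof str is narrowed to "t" here, so F[typeof str] is number.
    // @ts-expect-error Type 'boolean' is not assignable to type 'number'.
    const swapped: F[typeof str] = true;
    void swapped;
    const ret: F[typeof str] = 1;
    return ret;
  } else {
    // typeof str is narrowed to "f" here, so F[typeof str] is boolean.
    const ret: F[typeof str] = true;
    return ret;
  }
}
```

The protection only holds for values that pass through an annotated variable; a bare return true in the first branch would still be accepted against the implementation signature.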

@mmichaelis

If I get this right, even the handbook's own example cannot be implemented without this feature: TypeScript: Documentation - Conditional Types.

Thus, the given function createLabel cannot be implemented without error:

function createLabel<T extends number | string>(idOrName: T): NameOrId<T> {
  throw "unimplemented";
}

and this will fail (without ts-expect-error or explicit cast to any):

function createLabel<T extends number | string>(idOrName: T): NameOrId<T> {
  if (typeof idOrName === "number") {
    // @ts-expect-error
    return { id: idOrName };
  }
  // @ts-expect-error
  return { name: idOrName };
}

See also TypeScript Playground.
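For reference, a self-contained sketch of the same function using an assertion instead of @ts-expect-error. The IdLabel, NameLabel and NameOrId definitions are assumed to match the handbook's Conditional Types chapter, and the double assertion through unknown is an escape hatch chosen here, not a checked conversion.

```typescript
// Definitions assumed to follow the handbook's Conditional Types chapter.
interface IdLabel {
  id: number;
}
interface NameLabel {
  name: string;
}
type NameOrId<T extends number | string> = T extends number ? IdLabel : NameLabel;

function createLabel<T extends number | string>(idOrName: T): NameOrId<T> {
  if (typeof idOrName === "number") {
    // The compiler cannot resolve NameOrId<T> while T is still generic,
    // so the result is asserted rather than checked.
    return { id: idOrName } as unknown as NameOrId<T>;
  }
  return { name: idOrName } as unknown as NameOrId<T>;
}

// Callers get the precise label type for each argument.
const byId: IdLabel = createLabel(7);
const byName: NameLabel = createLabel("typescript");
```

As with the indexed access case, swapping the two return statements would still compile, so the assertion trades the error for an unchecked implementation.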

@darrylnoakes

I believe I just ran into a case where this would apply:

type DataNameMap = {
  string: string;
  number: number;
};

type DataTranslationMap = {
  string: DataNameMap["number"];
  number: DataNameMap["string"];
};

type DataTranslationObject = {
  [key in keyof DataNameMap]: {
    arg: {
      to: key;
      data: DataTranslationMap[key];
    };
    res: DataNameMap[key];
  };
};

type DataTranslationUnion = DataTranslationObject[keyof DataTranslationObject];

function dataTranslation<O extends keyof DataTranslationObject>(arg: {
  to: O;
  data: DataTranslationObject[O]["arg"]["data"];
}): DataTranslationObject[O]["res"] | undefined;
function dataTranslation<U extends DataTranslationUnion>(
  arg: U["arg"]
): U["res"] | undefined {
  if (arg.to === "number") {
    return Number.parseInt(arg.data);
  }

  if (arg.to === "string") {
    return arg.data.toString();
  }

  return undefined;
}

I believe the overloading and maybe more could be removed if this was implemented, because I only had to add it to link the return type to the argument type. Otherwise, just the union worked fine.


Coincidentally, even though all the possibilities are covered, the function still needs an ending return statement. Not sure if this is intended, and how to work around it if it is? I would prefer if that undefined could be removed, based on the fact that another return will always be reached.

@Ischca

Ischca commented Nov 22, 2023

What's the current status?

@craigphicks

craigphicks commented May 3, 2024

@darrylnoakes

> Coincidentally, even though all the possibilities are covered, the function still needs an ending return statement. Not sure if this is intended, and how to work around it if it is? I would prefer if that undefined could be removed, based on the fact that another return will always be reached.

TypeScript's control flow analysis doesn't check if statements for case exhaustion, but it does so for switch statements. So this has no error on the return:

function dataTranslation<O extends keyof DataTranslationObject>(arg: {
  to: O;
  data: DataTranslationObject[O]["arg"]["data"];
}): DataTranslationObject[O]["res"] | undefined;
function dataTranslation<U extends DataTranslationUnion>(
  arg: U["arg"]
): U["res"] {
  switch (arg.to){
    case "number": return Number.parseInt(arg.data);
    case "string": return arg.data.toString();
  }
}

@jindong-zhannng

@craigphicks I don't think there is anything different between if and switch statements. I guess your code works because there are function overloads which will bypass some checks.

@craigphicks

craigphicks commented May 21, 2024

@jindong-zhannng

These examples, not using overloads, are better:

declare const x: 1|2;
function f2(): 1|2|undefined { 
    switch (x){
        case 1: return x;
        case 2: return x;
    }
    x; // error - unreachable code detected (but type is displayed as "never")
}

function f3(): 1|2|undefined { // error - not all code paths return a value 
    if (x===1){
        return x;
    }
    if (x===2){
        return x;
    }
    x; // (not unreachable error) type is displayed as "never"
}

playground

So they are both exhaustive with respect to the type, because in both cases the type is displayed as never.

However, with respect to reachability, the switch statement is exhaustive and the if statements are not.

I didn't make that clear in my previous post so thank you for pointing it out.

@1EDExg0ffyXfTEqdIUAYNZGnCeajIxMWd2vaQeP

function f3(): 1|2 { // no error - every code path now returns
    if (x===1){
        return x;
    }
    if (x===2){
        return x;
    }
    return assertNever(x); // (not error) (type is shown as never)
}

function assertNever(value: never): never {
  return value;
}
