APL support in Monospace font? #191
Comments
APL does require additional characters: see http://www.math.uwaterloo.ca/~ljdickey/apl-rep/n1.html et seq. |
Thanks. Those appear to be gif image files, and poking around a bit I was unable to find machine-readable tables that listed all the Unicode codepoints here. Do you know of any sources? Otherwise I have to hand-transcribe these pages, which might lead to errors. |
You say set2-ucs.html is 'obsolete', but is it complete w.r.t. the list of Unicode codepoints used by APL? Since the original images order the codepoints differently it's not straightforward to compare. |
There are multiple implementations of APL, which use slightly different character repertoires, so I wouldn’t assume any of those links is complete on its own. The draft table in set2-ucs.html probably has the same characters as the images because they were all created as part of the same effort to encode APL symbols in Unicode. |
OK. I'll just add these to the set and then leave it to other folks to point out what's still missing. |
Heh. It doesn't help that there's confusion about the naming of Down Tack/Up Tack... It's just the names that disagree, though, it appears. Here's the list from set2-ucs.html with the APL names from that file in parens where they differ (ignoring the 'APL FUNCTIONAL SYMBOL' prefix Unicode uses for the 2300 range).
|
I've discussed APL symbols at adobe-fonts/source-code-pro#114, where I listed all the APL characters I've seen modern implementations support (there are not many modern compilers: Dyalog APL, APLX, GNU APL, a few others, and the one I'm working on). Please note that the Unicode reference glyphs for those might look unnatural to APL programmers; the APL385 font is a good reference for glyphs. |
The discussion on adobe-fonts/source-code-pro indicates that apl385 shows 'preferred' forms of the APL glyphs. Is there any disagreement with this statement? The discussion there (and examination of the font) indicates it renders circled latin capital letters (24b6-24cf) with underlines instead. This is not something we would do by default, since we are building a general-purpose symbol font and this rendering of these characters is not conforming. Are underlines required? Would putting alternate glyphs under a feature work? |
@paulotorrens, with particular reference to the APL characters in NotoSansSymbols, what in particular looks unnatural to APL programmers? If we were to modify these shapes, what would be the most essential changes? |
There's also this as a reference: https://web.archive.org/web/20150116040011/http://www.wickensonline.co.uk/apl/unicomp-apl-front-large.jpg |
https://www.math.uwaterloo.ca/~ljdickey/apl-rep/APL_Character_Repertoire.html What I gleaned from his running commentary on proper glyphs, it's most important to get good, consistent size and weight. The components of a traditionally overstruck character, in particular, should look like the result of its component parts. Diæresis Dot, for example. As he's quite picky about APL glyphs, I was a bit surprised that he was fairly bullish on APL385 after only a quick look. The only relevant nitpick was that Del Tilde et al use the less-preferred "Tilde above" rather than the forms with the Tilde across the middle. So yeah, it's probably a safe bet to start there. |
@dougfelt, if you could please wait until tomorrow, I'll switch to my workstation and check how my codebase looks with NotoSansSymbols. 😄 About the circled letters (24b6-24cf): if I recall correctly, the underlined letters were used to differentiate symbols, giving a wider range of possible identifier names, since lowercase letters were not available. That's not really a problem nowadays, I assume, but I've never dealt with legacy APL code, so I can't be sure. For that purpose, though APL compilers usually underline the symbols, I guess I could get used to them being circled instead. Also, I agree with @Wyatts's friend: the most important thing is to get good, consistent size and weight, preferably not too small or tight, because APL code can get messy easily. (The del tilde bothers me as well: the symbols should look overstruck, so tildes at different heights, though recognizable, trigger my OCD.) I personally learned to program APL with the APL385 font, as all modern pages on APL suggest using it, so that's what looks natural to me. I have an old book on APL here with me; I could compare its glyphs to APL385 and to NotoSansSymbols tomorrow if you want. |
@paulotorrens I'm Wyatt's friend from the above post. APL was my first programming language, back in 1973 in high school. I grew up in the shadow of IBM, and later worked for them. I learned APL2 as it came out (in the form of the APL2 IUP), and developed a 9-pin proprinter font for it (which the APL2 folk ultimately adopted). 9-pin is low-resolution, so there were some compromises in that font. Wyatt is spot on about my critique. Some things I care about more than others, but consistency of weight, style, baseline, ... is more important than other things. Get that right, and you're 90% there. From a post elsewhere related to this, the below symbols (if they survive the posting) are correct. The "~" should be "through" the symbol, not above it. This is still less important than the consistency issue, however. U+236B ⍫ APL FUNCTIONAL SYMBOL DEL TILDE The "~" above the character format was mostly used for low-resolution environments on CRTs attached to mainframes in the 1970s and 1980s. It was needed there because the resolution was simply insufficient to draw the "~" through the symbol. I can understand that this may have inadvertently become normative through carry-over to the early DOS days and similar low-resolution environments. That shouldn't be the case now. I would consider the characters from the 2741 typeball correct in shape, positioning, and consistency. Noteworthy in that category are epsilon and rho. Since some (many) of the "overstruck" (or composed) symbols didn't exist then (note: many did, too), using that typeball today to create quad jot, quad delta, quad up-arrow, ... may not yield aesthetically pleasing results. IMHO. And, yes, @paulotorrens, the "~" above characters get my OCD going as well....! ---Marty |
Thanks @paulotorrens and @martyb42. Marty, if you can look at NotoSansSymbols' APL character set that would help. It is consistent (among the APL chars added in the 2300 block) and generally follows the style in the images on the waterloo site. However, there are some differences, notably the quad chars are shorter and wider, and the alpha, epsilon, iota, omega characters don't have the characteristic slant but instead are symmetric. We're right now looking at extending the weight support of Noto Symbols and there's an opportunity to tweak the APL character set. |
@dougfelt In a general sense these are OK. The items that I see are these: I agree that the aspect ratio for quad and its composed characters is "off". It should be a bit taller and thinner. In general, the proportioning and shapes are better in APL385. IMHO: Noto gets the symbol right for Del-Tilde; Up/Down-Caret-Tilde. APL385 gets the symbol right for all composed quad characters. Note especially quad with up-caret and quad with down-caret (also quad with less than/greater than). Noto gets the carets/greater/less too big. I prefer the styling of the greek characters (epsilon, rho, iota, alpha, omega) in APL385. But this is not egregious. Also, circle-star is correct in APL385, and odd in Noto (yes it should be a 5 pointed thing, but more an asterisk than a true star). So, in general (exceptions noted above), APL385 does better than Noto (this is for the nature of the symbol, I like the font weight of Noto somewhat better). This, however, is comparative. Noto would still be a big improvement over what I get to use when I use gnu-apl (which is still ultimately whatever system font is being supplied). That has major issues. The APL385 vs. Noto issues are smaller by comparison. Before clicking the below link, consider it is a somewhat large PDF... A text of the period is: http://www.softwarepreservation.org/projects/apl/Books/GillmanAndRose This was scanned, and that results in some issues. The APL font of the time was fairly "light" resulting in more. But if you go to PDF page 280 (print page # 267) and following, you'll see what APL characters looked like. I would not necessarily copy the character weight. Part of this is if you look at rho as "shape/size" (monadic function) it looks lighter than rho as "reshape/restructure". This is due to the scanning process I'm sure. But this may give you some feel. Also, this was "APL1" (meaning - no nested arrays and similar extensions of that period), so there is no symbol for the depth function, etc... 
In the modern PDF: http://publibfp.boulder.ibm.com/epubs/pdf/h2110611.pdf See the bottom of pdf page 487 (print page # 471) and following for something that includes the modern APL2 characters. This won't cover extensions by Dyalog, IPSA, etc., but these characters flow from the original ones nicely. (Thank you IBM.) Look in the rest of this latter reference for size proportion and position with letters (although I would NOT make the letters of the font italic, that was a holdover from the typeball days that need not be copied). I hope that this helps! |
@dougfelt I'll add one more general plea... Please make "squad" (squished quad, or skinny quad - "[" composed with "]") distinct from quad (a unique character). Although it is an estimate on my part, squad should be 1/2 to 2/3 as wide as quad. If it is closer than that, things get weird when reading it in code. They should be the same height (perhaps squad a bit taller, but not much, and taller downward) and weight, so the width needs to be a bit more than a "just noticeable difference" for ease of reading. In the current Noto implementation they are a bit close (I'm reading only out of the Miscellaneous Technical Block). But since you may be readdressing the proportions anyway, I thought it worthwhile to make the comment when thinking about this. I hope I haven't overwhelmed you. |
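The width guideline in this comment can be written down as a quick check. The following is only a sketch: the function name, the default bounds (taken from the "1/2 to 2/3" estimate above), and the example widths are hypothetical and are not measured from any actual font.

```python
def squad_width_ok(quad_width: float, squad_width: float,
                   low: float = 0.5, high: float = 2 / 3) -> bool:
    """Return True if squad is between `low` and `high` times as wide as quad.

    The default bounds mirror the rough "1/2 to 2/3" estimate from the
    comment; they are a reading aid, not a specification.
    """
    if quad_width <= 0:
        raise ValueError("quad_width must be positive")
    ratio = squad_width / quad_width
    return low <= ratio <= high


# Hypothetical advance widths in font units (assumed values).
print(squad_width_ok(600, 360))  # ratio 0.6, inside the suggested band
print(squad_width_ok(600, 540))  # ratio 0.9, too close to quad
```

A designer would feed in the real outline widths of squad (U+2337) and quad (U+2395) rather than the made-up numbers above.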
@martyb42 Thank you very much. I'd already myself thought squish quad was too similar in width to quad, and I'm not an APL programmer. A few other specific questions:
One reason for the size is that the related characters outside of the APL set (greater than, less than, subset of, superset of) are all larger. NotoSansSymbols is a general-purpose font and different communities have different expectations for these characters. For the APL-specific set we can do what we want, but for example the lists on the waterloo site include a number of other math symbols (including the four listed above) so there would be a size inconsistency in this case. It is possible we could (as I mentioned for the underline issue above, which appears to not be necessary) use a stylistic alternate to allow a different APL rendering of some of these characters so that they would harmonize better with the APL-specific ones. I am not sure though how customizable people's APL environments are and whether, for example, they could define features to be used with the font in these contexts. My takeaway at this point is to recommend the designer:
|
@dougfelt Thanks for listening. To your specifics, then I'll add a few...
Basically, I agree with your comments in the above. This didn't strike me so hard in a font viewer, but in code I think it would have. I agree with your concern in the middle paragraph. It would be "better" to have size consistency within what is traditionally APL (which would mean slightly smaller versions of the characters you mention). How this would look with the larger sizes, I'm not sure. Again, I think that the aesthetics will work better with smaller versions, but I'm not sure that is truly vital (I'd want to see it in code). In that regard, you'd likely need to add all the shoes to the list. All of your "takeaway to the designer" comments are correct. Note, however, that "more asterisk-like" still does mean a 5-pointed asterisk! (Don't go six-pointed.) I'll attempt to stay fully engaged in this, as it is near and dear to me, and (in a somewhat weak sense) I went through it years ago. There the limitations were the technology (low-resolution proprinter fonts, which at the time enforced smoothing algorithms). Thanks for listening, and I'd be glad to look at proposed glyphs before they gel, if that is available. Many Thanks! |
I've taken a look at NotoSansSymbols, and I agree with @martyb42 on every point. One thing that bothers me is that, e.g., squared delta (U+234D) has a delta much smaller than the one without a square. I'd personally like a bigger character, with a delta similar to the others (but, obviously, squared); the same goes for the other squared versions (which you'll probably change anyway, since you'll change quad). I really like how APL385 does those. (I also like the serif on iota.) As for the shoes, Dyalog APL uses the APL385 font; I've programmed with it. The shoes should be consistent with the subset and superset glyphs (⊂, U+2282, and ⊃, U+2283), as those are used as left and right shoes there (as seen in U+2367, left shoe stile)! Down-shoe stile is not actually used in any APL implementation I've heard of (like most of the squared symbols), but I assume these were added to Unicode because they could be "typed" (overstruck) on APL keyboards. As such, it should look like union (∪, U+222A) with a stile, since union is used. Luckily, superset, subset, union and intersection (∩, U+2229) look consistent in NotoSansSymbols, so the lamp symbol (up shoe jot) should be based on intersection for consistency (I hope @martyb42 agrees with me). Btw, delta underbar (U+2359), as you can see in Dyalog APL's manual, on PDF page 67 (document page 45), is used for identifier names, as are the underlined letters. Since Unicode assigned circled letters (U+24B6 through U+24CF) for those, and APL fonts render them underlined, those should be consistent, so I'd suggest delta underbar should be circled as well, consistent with the circled letters and the Greek letter delta. I'd be really bothered to program and see delta with an underbar, but B circled. Quoting the document:
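Because shoe, union, and intersection names are easy to mix up, the official Unicode names for the code points mentioned in this comment can be looked up with Python's standard `unicodedata` module. This is only a convenience sketch; the `CODE_POINTS` list and the `official_names` helper are names made up for illustration.

```python
import unicodedata

# Code points mentioned in the comment above (list assembled for illustration).
CODE_POINTS = [
    0x2282, 0x2283,  # left/right shoe look-alikes
    0x2229, 0x222A,  # up/down shoe look-alikes
    0x2366, 0x2367,  # shoe stiles
    0x235D,          # lamp
    0x234D, 0x2359,  # squared delta, delta underbar
]


def official_names(code_points):
    """Map each code point to its official Unicode character name."""
    return {cp: unicodedata.name(chr(cp)) for cp in code_points}


NAMES = official_names(CODE_POINTS)
for cp, name in NAMES.items():
    print(f"U+{cp:04X} {chr(cp)} {name}")
```

The printed names make it easy to confirm which set operator each shoe corresponds to before comparing glyphs.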
Finally, a small remark: if you are intending to add those symbols to Noto Mono, I'd like to ask you to add the relational algebra symbols (⋈, U+22C8, ⋉, U+22C9, ⋊, U+22CA, ▷, U+25B7, ⟕, U+27D5, ⟖, U+27D6, and ⟗, U+27D7) there as well, as the APL research compiler I'm working on uses those. They look fine in NotoSansSymbols. I'll probably open an issue later to talk about some other symbols used in other languages, e.g., ligatures for Haskell (as can be seen in the screenshots here). |
I basically agree with the comments by @paulotorrens... (All save one...) I've included APL characters for reference, and I think I always included their names as well. Do remember, however, that we may not all see these "the same", depending on the fonts installed.
Quad overstrikes - generally, but including quad-delta:
Please make sure the quad [⎕] is large enough. It should be about the width of a stroke taller and shorter than a capital letter. This may be off if the stroke weight is light (so it may be more than the width of a stroke in that case, I'm not sure without visual feedback). Most (perhaps all) of the overstruck characters should fit fairly tightly within the quad in at least one direction. This then becomes a design issue for delta [∆]: it needs to be small enough to fit in the ("large enough version of") quad. --but just small enough. Touching an edge of the quad is fine. If it can be avoided, with only a tiny amount of whitespace, that might be better, I'm not sure without seeing it. But if the fonts were raster fonts (I know they aren't), you should be able to "or" the quad and delta together to get quad-delta [⍍]. The same concept applies for other overstrikes. They should look like the two characters of which they are composed. Make sure there is sufficient difference in width for easy recognition between squad [⌷] and quad [⎕]. Possible exception: If the character I see as "lower case psi" (multiset in NARS) is best represented as "lower case psi", then that is how it should be done. But if better as the appropriate shoe with stile, then do it that way. Visually/aesthetically, I prefer a lower case psi.
Shoes and shoe overstrikes
See http://unicode.org/charts/PDF/U2200.pdf (and http://unicode.org/charts/PDF/U2300.pdf) for the references below. I believe down shoe stile is the NARS (a kind of APL) "multiset" function. But I've seen it presented both as "down shoe stile" [⍦] and as a lowercase Greek psi [ψ].
Shoes - up and down shoe should be 90 degree rotations of U+2282 [⊂] (and/or U+2283 [⊃]); U+22C2 [⋂] and U+22C3 [⋃], as seen in the above document (U2200), are too big. And/Or - like U+2227 [∧] and U+2228 [∨]; not like U+22C0 [⋀] and U+22C1 [⋁] (the latter are too big, again as seen in the above document (U2200)). In all cases, stroke weights need to be consistent.
Circled characters
Here I respond from a font/Unicode perspective. I really can't consider a circled letter a variant of an underscored character. Or "vice versa". So, Unicode lacks underscored A-Z as individual characters. But an implementation has mapped the underscored characters to circled characters. And Marty groans a huge sigh on this one. I would rather the underscores have gone to lower case, or something similar (regulars go to lower, underscores to upper is another alternative). Do we need 3 alphabets? (OK, we allowed it in the past and there was the mistake...) That being said... Delta Underscore [⍙] is a "character", and circled delta isn't the same thing with a style variation (sorry, @paulotorrens, our only disagreement so far, that I see, anyway). Having said that, the "how much I care" factor is somewhat low. I never liked underscores in identifiers. Delta Underscore was perhaps a bit of a special case, but still generally avoided. But I'd really prefer to see delta underscore remain "as is", in the same way that a lot of unused quad-characters still "exist" ...and that delta underscore [⍙] be a "proper" overstrike in formation. I'm not opposed to a circled delta in a private block. I am only opposed to redefining the character.
Relational algebra symbols
No personal comment. But I'd prefer consistency, as @paulotorrens advocates for. |
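The raster-font thought experiment in this comment ("or" the quad and delta together to get quad-delta) can be illustrated with tiny bitmaps. This is a toy sketch under stated assumptions: the 7×7 bitmaps are invented for the example and do not come from any real font, and `overstrike` is a hypothetical helper name.

```python
# Toy 7x7 bitmaps, '#' = ink, '.' = blank (shapes invented for illustration).
QUAD = [
    "#######",
    "#.....#",
    "#.....#",
    "#.....#",
    "#.....#",
    "#.....#",
    "#######",
]
DELTA = [
    ".......",
    "...#...",
    "..#.#..",
    "..#.#..",
    ".#...#.",
    ".#####.",
    ".......",
]


def overstrike(a, b):
    """Combine two same-sized bitmaps with a pixel-wise OR."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have identical dimensions")
    return [
        "".join("#" if "#" in (pa, pb) else "." for pa, pb in zip(ra, rb))
        for ra, rb in zip(a, b)
    ]


QUAD_DELTA = overstrike(QUAD, DELTA)
print("\n".join(QUAD_DELTA))
```

In this toy grid the delta's base lands one pixel inside the quad's bottom edge, which is the "fits fairly tightly" case described above; an outline font would aim for the same visual result by design rather than by literal composition.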
Clarifications... @Wyatts pointed out that there were a few things that may not have been clear. This is my attempt (poor as it may be) to clarify...
Stroke width: the weight of a line that makes up the character
So the idea is that for a quad (as in ⎕AV), the top of the quad should be a bit higher than a capital letter, and the bottom of the quad a bit below the baseline. If your font has styled capital letters, make it a bit higher than the tallest capital letter. In the case of a "nicer" font that has strokes of varying weights, use your best judgement to achieve the above effect.
Upper-case, Lower-case, Underscore-case, oh my!
(Just a clarification on why all three cases exist.) Historically, (IBM) on the 2741 APL Typeball there were no lower-case characters. Underscore characters provided an alternate case to the upper-case characters, a kind of "super" upper-case. These were typed as "character, backspace, underscore" and didn't need any new positions on the typeball (which was limited to 88 characters, but could be extended by overstrikes). In the early years, identifiers were restricted to upper-case and underscore-case. Lower-case was allowed in strings but not as identifiers, initially. Getting them into the strings was generally a bit of a challenge, since they couldn't be typed directly, and the changing typeballs wouldn't be known by the system.... Then came the PC. Here my memory isn't quite so certain, but I believe it allowed identifiers to be upper- or lower-case, but not underscore-case. Some implementations later (mainframe?) allowed all three - and that broke the bank. Since people use features if they are available, that created issues in migration from systems with 3 cases to systems with 2 cases. I'm in favor of sticking with two, but since delta underscore [⍙] already exists as a character, I'm not in favor of replacing it. Just "sending it to the corner." Peace. |
@martyb42, I hadn't heard of NARS yet, thanks for the information! It seems to use § (U+00A7), π (U+03C0), √ (U+221A), ∞ (U+221E) and ⊙ (U+2299), which were not listed here yet; those should probably be added to Noto Mono as well (I've also used the root symbol in my implementation and didn't notice it was missing). About the shoes: intersection and union (⋂, U+22C2, and ⋃, U+22C3) are really too big in the PDF you linked, but they seem to be mirrored versions of the left and right shoes (subset and superset) in NotoSansSymbols, so I believe they'd fit nicely for the alternate shoes (like down shoe stile and the lamp). For other compositions, the underscore should probably sit at the same height as in subset/equal (⊆, U+2286) and superset/equal (⊇, U+2287), to look consistent. If I understood correctly, you agree with that. About the circled letters (and underlined delta), you're probably right. It was just a suggestion. NARS seems to use the SimPL Medium font, which assigns U+E036 through U+E04F for the underlined symbols. Maybe this could be an option. And, before I forget: sorry, @dougfelt, I don't believe stylistic alternates would help. I don't know of any APL environment customizable enough. |
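If the private-use range were adopted, text could be converted between the circled letters (U+24B6 through U+24CF) and the private-use code points with a simple translation table. The sketch below assumes, without confirmation from the font itself, that U+E036 through U+E04F hold underscored A through Z in alphabetical order; the helper names are hypothetical.

```python
CIRCLED_A = 0x24B6  # CIRCLED LATIN CAPITAL LETTER A
PUA_START = 0xE036  # ASSUMPTION: underscored A in the private-use range

# Both ranges span 26 code points, so map them one-to-one in order.
CIRCLED_TO_PUA = {CIRCLED_A + i: PUA_START + i for i in range(26)}
PUA_TO_CIRCLED = {pua: circled for circled, pua in CIRCLED_TO_PUA.items()}


def to_private_use(text: str) -> str:
    """Replace circled capitals with the assumed private-use code points."""
    return text.translate(CIRCLED_TO_PUA)


def to_circled(text: str) -> str:
    """Inverse of to_private_use."""
    return text.translate(PUA_TO_CIRCLED)


sample = "\u24b6\u24b7 \u2190 1"  # two circled letters, a left arrow, a digit
converted = to_private_use(sample)
print([f"U+{ord(ch):04X}" for ch in converted])
```

Private-use code points only render meaningfully with a font that defines them, so a mapping like this would matter mainly when exchanging source text between environments.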
After doing some checking, I believe that I was wrong earlier in:
This should be: Unicode Character 'APL FUNCTIONAL SYMBOL DOWN SHOE STILE' (U+2366). It seems the one which looked good, looked different. The appropriate appearance is with the smaller down shoe, and ultimately "looks more" like a lowercase greek psi. But isn't. Sigh. Again, the APL385 version looks very nice for this character. APL385 looks good apart from "del-tilde", "nand", and "nor". It provides a good "relative shape". Some styling could be done from there, but proportionality and consistency seem very right in that font. I mentioned "del-tilde", "nand", and "nor" in a different forum. But in general, the "~" should not be a supersign, but in the middle of the character cell. And the overstrikes should go through the other component of the character. U+236B ⍫ APL FUNCTIONAL SYMBOL DEL TILDE Note that how these render depends on the font you are using. But if the "~" is through the other symbol (rather than above it), that's the "right look". |
A question for this community. Noto Sans Symbols will be a proportional, multiweight font, and a proposed Noto Math font would be proportional and single weight. I can't really imagine APL being proportional, so I'm tempted to support APL only in Noto Mono and omit APL-specific characters from the symbols and math fonts. Any objections to this? |
@dougfelt, seems fine by me. After thinking about it, I've never seen "APL equations" mixed with regular math. Btw, any progress on this issue? I'd love to try programming with Noto Mono. 😄 |
@paulotorrens : we just started specifying the updates to Noto Sans Mono. I'm not sure how long it will take to develop it, but I hope we'll be done soon (where I don't know the value of "soon" :-() |
I would have a problem with that. There are two main issues:
|
As far as I can tell, all the APL functional symbols are now part of Noto Sans Mono. Overstrikes won't work, but I don't think that's how you program APL in Unicode anyway; overstrikes were there because you couldn't access particular glyphs, and now you can... I might be wrong, not being an APL programmer; if so, please feel free to reopen! |
APL users might want Monospace support like the Haskell folks do.
This is a list of characters marked as APL in the Unicode Miscellaneous Technical block. We should consider adding these to Mono as well. We'll need to find out if these are sufficient or if APL requires additional characters not listed here.
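A machine-readable version of that list can be produced from the Unicode character database that ships with Python. The following is a sketch: `apl_functional_symbols` and `PREFIX` are names chosen for illustration, and the result depends on the Unicode version bundled with the Python build. It only finds characters whose official name starts with "APL FUNCTIONAL SYMBOL", so it says nothing about additional characters that APL implementations use from other blocks.

```python
import unicodedata

PREFIX = "APL FUNCTIONAL SYMBOL "


def apl_functional_symbols(start=0x2300, end=0x23FF):
    """List (code point, short name) pairs in the Miscellaneous Technical block.

    Only characters whose official name begins with PREFIX are returned;
    the prefix is stripped from the short name.
    """
    found = []
    for cp in range(start, end + 1):
        name = unicodedata.name(chr(cp), "")  # "" for unassigned code points
        if name.startswith(PREFIX):
            found.append((cp, name[len(PREFIX):]))
    return found


APL_SYMBOLS = apl_functional_symbols()
for cp, short_name in APL_SYMBOLS:
    print(f"U+{cp:04X} {chr(cp)} {short_name}")
```

The output can be diffed against a font's character map to see which of these code points still lack glyphs.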