DOC: Generate the charset tables dynamically from codes #3409

Merged
merged 9 commits into from
Sep 1, 2024
3 changes: 3 additions & 0 deletions doc/conf.py
@@ -61,6 +61,9 @@
"requires": requirements,
}

# MyST-NB configurations: https://myst-nb.readthedocs.io/en/latest/configuration.html
nb_render_markdown_format = "myst" # The format to use for text/markdown rendering

Member Author

The default value is 'commonmark', which doesn't support tables.


# Make the list of returns arguments and attributes render the same as the
# parameters list
141 changes: 53 additions & 88 deletions doc/techref/encodings.md
@@ -1,3 +1,35 @@
---
file_format: mystnb
---

```{code-cell}
---
tags: [remove-input]
---
from IPython.display import display, Markdown
from pygmt.encodings import charset


def get_charset_mdtable(name):
    """
    Create a markdown table for a charset.
    """
    mappings = charset[name]

    undefined = "\ufffd"
    text = "| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |\n"
    text += "|---|---|---|---|---|---|---|---|---|\n"
    for i in range(0o00, 0o400, 8):
        chars = [mappings.get(j, undefined) for j in range(i, i + 8)]
        if chars == [undefined] * 8:
            continue
        chars = [f"&#x{ord(char):04x};" for char in chars]
        row = f"\\{i:03o}"[:-1] + "x"
        text += f"| **{row}** | {' | '.join(chars)} |\n"
    text += "\n"
    return Markdown(text)
```

Member Author

This function is modified from the original script at #3206 (comment)
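
The row-label and entity formatting used in `get_charset_mdtable` can be tried in isolation. The sketch below is not part of the PR: it swaps `pygmt.encodings.charset` for a hypothetical two-entry mapping (`toy_mappings`), and the helper names `row_label`, `html_entity`, and `build_rows` are introduced here for illustration only.

```python
UNDEFINED = "\ufffd"  # replacement character, used to mark undefined codes


def row_label(start):
    """Format the label of the row starting at octal code `start`, e.g. 0o040 -> \\04x."""
    # Format as three octal digits, drop the last digit, and append "x".
    return f"\\{start:03o}"[:-1] + "x"


def html_entity(char):
    """Encode a single character as a hexadecimal HTML entity."""
    return f"&#x{ord(char):04x};"


def build_rows(mappings):
    """Return (label, entities) pairs, skipping rows with no defined character."""
    rows = []
    for start in range(0o000, 0o400, 8):
        chars = [mappings.get(code, UNDEFINED) for code in range(start, start + 8)]
        if all(char == UNDEFINED for char in chars):
            continue
        rows.append((row_label(start), [html_entity(char) for char in chars]))
    return rows


# Hypothetical mapping for illustration; the PR reads pygmt.encodings.charset instead.
toy_mappings = {0o101: "A", 0o102: "B"}
rows = build_rows(toy_mappings)
for label, entities in rows:
    print(f"| **{label}** | {' | '.join(entities)} |")
```

Emitting HTML entities rather than raw characters means a cell never contains a literal `|` or `;`, so the generated table rows keep exactly eight character columns.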

# Supported Encodings and Non-ASCII Characters

GMT supports a number of encodings and each encoding contains a set of ASCII and
@@ -10,100 +42,33 @@ that the character is not defined in the encoding.

## Adobe ISOLatin1+ Encoding

| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **\03x** | � | • | … | ™ | — | – | fi | ž |
| **\04x** |   | ! | " | # | $ | % | & | ’ |
| **\05x** | ( | ) | * | + | , | - | . | / |
| **\06x** | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| **\07x** | 8 | 9 | : | &#x003b; | < | = | > | ? |
| **\10x** | @ | A | B | C | D | E | F | G |
| **\11x** | H | I | J | K | L | M | N | O |
| **\12x** | P | Q | R | S | T | U | V | W |
| **\13x** | X | Y | Z | [ | \ | ] | ^ | _ |
| **\14x** | ‘ | a | b | c | d | e | f | g |
| **\15x** | h | i | j | k | l | m | n | o |
| **\16x** | p | q | r | s | t | u | v | w |
| **\17x** | x | y | z | { | \| | } | ~ | š |
| **\20x** | Œ | † | ‡ | Ł | ⁄ | ‹ | Š | › |
| **\21x** | œ | Ÿ | Ž | ł | ‰ | „ | “ | ” |
| **\22x** | ı | ` | ´ | ^ | ˜ | ¯ | ˘ | ˙ |
| **\23x** | ¨ | ‚ | ˚ | ¸ | ' | ˝ | ˛ | ˇ |
| **\24x** | � | ¡ | ¢ | £ | ¤ | ¥ | ¦ | § |
| **\25x** | ¨ | © | ª | « | ¬ | ­ | ® | ¯ |
| **\26x** | ° | ± | ² | ³ | ´ | µ | ¶ | · |
| **\27x** | ¸ | ¹ | º | » | ¼ | ½ | ¾ | ¿ |
| **\30x** | À | Á | Â | Ã | Ä | Å | Æ | Ç |
| **\31x** | È | É | Ê | Ë | Ì | Í | Î | Ï |
| **\32x** | Ð | Ñ | Ò | Ó | Ô | Õ | Ö | × |
| **\33x** | Ø | Ù | Ú | Û | Ü | Ý | Þ | ß |
| **\34x** | à | á | â | ã | ä | å | æ | ç |
| **\35x** | è | é | ê | ë | ì | í | î | ï |
| **\36x** | ð | ñ | ò | ó | ô | õ | ö | ÷ |
| **\37x** | ø | ù | ú | û | ü | ý | þ | ÿ |
```{code-cell}
---
tags: [remove-input]
---
display(get_charset_mdtable("ISOLatin1+"))
```
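
To read the table, append the column digit to the row label: row `\26x`, column `0` gives the octal code `\260`. The following sketch is an illustration, not part of the PR; `octal_code` is a hypothetical helper, and decoding with Python's `latin-1` codec is an assumption that holds only for codes where ISOLatin1+ agrees with ISO 8859-1 (as `\260` does in the table in this section).

```python
def octal_code(row_label, column):
    """Combine a row label such as \\26x with a column digit (0-7) into an integer code."""
    digits = row_label.lstrip("\\").rstrip("x") + str(column)
    return int(digits, 8)


code = octal_code("\\26x", 0)
# Assumption: this code lies in the range where ISOLatin1+ matches ISO 8859-1.
char = bytes([code]).decode("latin-1")
print(f"\\{code:03o} -> {char}")
```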

## Adobe Symbol Encoding

| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **\04x** |   | ! | ∀ | # | ∃ | % | & | ∋ |
| **\05x** | ( | ) | ∗ | + | , | − | . | / |
| **\06x** | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| **\07x** | 8 | 9 | : | &#x003b; | < | = | > | ? |
| **\10x** | ≅ | Α | Β | Χ | ∆ | Ε | Φ | Γ |
| **\11x** | Η | Ι | ϑ | Κ | Λ | Μ | Ν | Ο |
| **\12x** | Π | Θ | Ρ | Σ | Τ | Υ | ς | Ω |
| **\13x** | Ξ | Ψ | Ζ | [ | ∴ | ] | ⊥ | _ |
| **\14x** |  | α | β | χ | δ | ε | φ | γ |
| **\15x** | η | ι | ϕ | κ | λ | μ | ν | ο |
| **\16x** | π | θ | ρ | σ | τ | υ | ϖ | ω |
| **\17x** | ξ | ψ | ζ | { | \| | } | ∼ | � |
| **\24x** | € | ϒ | ′ | ≤ | ∕ | ∞ | ƒ | ♣ |
| **\25x** | ♦ | ♥ | ♠ | ↔ | ← | ↑ | → | ↓ |
| **\26x** | ° | ± | ″ | ≥ | × | ∝ | ∂ | • |
| **\27x** | ÷ | ≠ | ≡ | ≈ | … | ⏐ | ⎯ | ↵ |
| **\30x** | ℵ | ℑ | ℜ | ℘ | ⊗ | ⊕ | ∅ | ∩ |
| **\31x** | ∪ | ⊃ | ⊇ | ⊄ | ⊂ | ⊆ | ∈ | ∉ |
| **\32x** | ∠ | ∇ | ® | © | ™ | ∏ | √ | ⋅ |
| **\33x** | ¬ | ∧ | ∨ | ⇔ | ⇐ | ⇑ | ⇒ | ⇓ |
| **\34x** | ◊ | 〈 | ® | © | ™ | ∑ | ⎛ | ⎜ |
| **\35x** | ⎝ | ⎡ | ⎢ | ⎣ | ⎧ | ⎨ | ⎩ | ⎪ |
| **\36x** | � | 〉 | ∫ | ⌠ | ⎮ | ⌡ | ⎞ | ⎟ |
| **\37x** | ⎠ | ⎤ | ⎥ | ⎦ | ⎫ | ⎬ | ⎭ | � |

**Note**: The octal code `\140` represents the RADICAL EXTENDER character, which is not available in
the Unicode character set.
```{code-cell}
---
tags: [remove-input]
---
display(get_charset_mdtable("Symbol"))
```

**Note**: The octal code `\140` represents the RADICAL EXTENDER character, which is not
available in the Unicode character set.

## Adobe ZapfDingbats Encoding

| octal | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| **\04x** |   | ✁ | ✂ | ✃ | ✄ | ☎ | ✆ | ✇ |
| **\05x** | ✈ | ✉ | ☛ | ☞ | ✌ | ✍ | ✎ | ✏ |
| **\06x** | ✐ | ✑ | ✒ | ✓ | ✔ | ✕ | ✖ | ✗ |
| **\07x** | ✘ | ✙ | ✚ | ✛ | ✜ | ✝ | ✞ | ✟ |
| **\10x** | ✠ | ✡ | ✢ | ✣ | ✤ | ✥ | ✦ | ✧ |
| **\11x** | ★ | ✩ | ✪ | ✫ | ✬ | ✭ | ✮ | ✯ |
| **\12x** | ✰ | ✱ | ✲ | ✳ | ✴ | ✵ | ✶ | ✷ |
| **\13x** | ✸ | ✹ | ✺ | ✻ | ✼ | ✽ | ✾ | ✿ |
| **\14x** | ❀ | ❁ | ❂ | ❃ | ❄ | ❅ | ❆ | ❇ |
| **\15x** | ❈ | ❉ | ❊ | ❋ | ● | ❍ | ■ | ❏ |
| **\16x** | ❐ | ❑ | ❒ | ▲ | ▼ | ◆ | ❖ | ◗ |
| **\17x** | ❘ | ❙ | ❚ | ❛ | ❜ | ❝ | ❞ | � |
| **\20x** | ❨ | ❩ | ❪ | ❫ | ❬ | ❭ | ❮ | ❯ |
| **\21x** | ❰ | ❱ | ❲ | ❳ | ❴ | ❵ | � | � |
| **\24x** | � | ❡ | ❢ | ❣ | ❤ | ❥ | ❦ | ❧ |
| **\25x** | ♣ | ♦ | ♥ | ♠ | ① | ② | ③ | ④ |
| **\26x** | ⑤ | ⑥ | ⑦ | ⑧ | ⑨ | ⑩ | ❶ | ❷ |
| **\27x** | ❸ | ❹ | ❺ | ❻ | ❼ | ❽ | ❾ | ❿ |
| **\30x** | ➀ | ➁ | ➂ | ➃ | ➄ | ➅ | ➆ | ➇ |
| **\31x** | ➈ | ➉ | ➊ | ➋ | ➌ | ➍ | ➎ | ➏ |
| **\32x** | ➐ | ➑ | ➒ | ➓ | ➔ | → | ↔ | ↕ |
| **\33x** | ➘ | ➙ | ➚ | ➛ | ➜ | ➝ | ➞ | ➟ |
| **\34x** | ➠ | ➡ | ➢ | ➣ | ➤ | ➥ | ➦ | ➧ |
| **\35x** | ➨ | ➩ | ➪ | ➫ | ➬ | ➭ | ➮ | ➯ |
| **\36x** | � | ➱ | ➲ | ➳ | ➴ | ➵ | ➶ | ➷ |
| **\37x** | ➸ | ➹ | ➺ | ➻ | ➼ | ➽ | ➾ | � |
```{code-cell}
---
tags: [remove-input]
---
display(get_charset_mdtable("ZapfDingbats"))
```

## ISO/IEC 8859
