<template>
    <main id="about" class="black-mode">

        <header class="flex-spread">
            <h1 class="page-header" id="title"><router-link to="/">Hyperglot</router-link></h1>
            <a class="page-header" id="credit" href="https://www.rosettatype.com">A project by Rosetta</a>
        </header>

        <div class="row stack-on-tablet">
            <div class="cols-6 margin-left-1 margin-right-1">
                <a v-on:click="history.state && history.state.back ? $router.back() : $router.push('/')" class="text-button"
                    id="close-about">Go back</a>

                <h1>About Hyperglot</h1>

                <p>Hyperglot helps answer a seemingly simple but deceptively complex question about language support
                    in fonts: "When can I use a font to set texts in a particular language?" or, more generally, "What
                    do I need to represent a language in writing in a digital environment?"</p>

                <p>Hyperglot is an open-source toolkit built around a database of language orthographies. The database
                    currently includes 774 languages, which together account for around 7.3 billion speakers<a href="#ref-1">1</a>.
                    It is important to note that the database is a work in progress, designed to grow, and does not
                    contain information for all the languages of the world.<a href="#ref-2">2</a></p>

                <p>This web app, now in version 2.0, is built on top of the Hyperglot toolkit to provide two ways of using
                    the data: a <em>Database view</em> that lets one browse through all the language data and a <em>Font
                        checker</em> that automatically determines language support in fonts.</p>

                <h2>How the database is organized</h2>

                <p>In order to understand how both the Database view and the Font checker
                    work, it is useful to peek under the hood a little bit. The relationship
                    that languages have with writing is complex and ever-changing, which
                    poses some challenges in organizing and obtaining the data. Here are a
                    few examples to illustrate this:</p>

                <ul class="text-bullets">
                    <li>the way languages are written can change over time: Azerbaijani in
                        Azerbaijan used to use the Arabic script, but currently it uses the
                        Latin script,</li>
                    <li>the way languages are written can change across geopolitical regions: in
                        Dagestan (Russia) Azerbaijani uses the Cyrillic script,</li>
                    <li>a single language can be written in multiple scripts separately at the
                        same time and place: Serbian is written in Latin and Cyrillic in Serbia,</li>
                    <li>a single language can combine multiple scripts to form sentences:
                        Japanese uses a combination of kanji, hiragana, and katakana,</li>
                    <li>orthographies may survive their replacement by official authorities:
                        after the German language reform in 1996, the previous orthography
                        remained widely in use; the most controversial changes from the 1996
                        reform were removed in 2006,</li>
                    <li>languages are often transliterated into scripts that are not common or
                        native to them: Pinyin transliteration of Chinese to the Latin script,</li>
                    <li>loan words, names, professional and academic terminology make it
                        difficult to automatically identify characters used by a language based
                        on digital corpora: consider the sentence "Vietnamese president Võ Văn
                        Thưởng met the Norwegian prime minister Jonas Gahr Støre during a
                        visit to the Öresund Bridge".</li>
                </ul>

                <p>In order to provide a straightforward solution that can be automated,
                    Hyperglot uses a pragmatic approach built on established technological
                    standards such as the <a href="https://iso639-3.sil.org/" target="_blank">ISO 639-3
                        standard</a><a href="#ref-3">3</a> for language codes, the <a href="https://unicode.org/"
                        target="_blank">Unicode
                        Standard</a> to encode characters, and the <a
                        href="https://learn.microsoft.com/en-gb/typography/opentype/spec/" target="_blank">OpenType font
                        format</a> to display characters and formulate design requirements.</p>

                <p>The basic organizing entity of the database is a <i>language</i>. That single
                    language can then have multiple orthographies. In Hyperglot, an
                    <i>orthography</i> corresponds to several sets of characters necessary to
                    represent a particular language in a particular <i>script</i> or several
                    <i>scripts</i>. Hyperglot currently uses these character sets:
                </p>

                <ul class="text-bullets">
                    <li>a <i>base set</i> that includes characters used to form common words in the
                        language,</li>
                    <li>an optional <i>auxiliary set</i> that includes characters from frequent loan
                        words or characters that provide backward compatibility,</li>
                    <li>additionally, a <i>set of marks</i> that can be used in combination with
                        characters from the previous two groups.</li>
                </ul>

                <p>In the future, Hyperglot will also provide sets of <i>numerals</i> and
                    <i>punctuation</i> symbols to cover the typographic basics for each language.
                    Characters are represented via their Unicode code points.
                </p>

                <p>Note, however, that for a font to support a language, the simple
                    inclusion of the relevant characters may not be sufficient. Firstly,
                    depending on the script used, the font might need to include additional
                    instructions to help form meaningful word shapes through the
                    text-shaping engine in the software used to display a text. Secondly,
                    design preferences may differ depending on the language or region. For
                    example: the preferred shape for the Cyrillic small letter "De" is
                    different between Bulgarian (<span lang="bg">д</span>) and Ukrainian (<span lang="uk">д</span>). Yet, they both
                    use the same Unicode code point.<a href="#ref-4">4</a> This kind of assessment cannot
                    currently be automated; therefore, Hyperglot provides a list of <i>Design
                        requirements</i> for each orthography for the users to review. These are
                    meant as brief pointers with as few styling recommendations and
                    personal preferences as possible.</p>

                <p>As orthographies evolve, they can have a different status with respect
                    to their language:</p>

                <ul class="text-bullets">
                    <li><i>primary</i> refers to a currently used, main orthography, typically
                        proposed by linguists and later standardized by governmental
                        institutions,</li>
                    <li><i>secondary</i> refers to a current, but less frequently used, orthography,</li>
                    <li><i>local</i> refers to a secondary orthography limited to a relatively small
                        geographic region,</li>
                    <li><i>historical</i> refers to an orthography that is no longer in active use by
                        the general population, but may still be used, for example, by
                        scholars,</li>
                    <li><i>transliteration</i> refers to an orthography used for transliterations in
                        a non-typical script, e.g. Latin-script transliteration of Standard
                        Arabic.</li>
                </ul>

                <p>To assess the validity of the individual entries, each orthography comes
                    with a label indicating the validity of its data (which may change as
                    the database grows):</p>

                <ul class="text-bullets">
                    <li><i>todo</i> — the entry is incomplete,</li>
                    <li><i>draft</i> — not yet verified with enough authoritative sources and may not be reliable,</li>
                    <li><i>preliminary</i> — verified with at least two online sources and likely to be accurate,</li>
                    <li><i>verified</i> — verified by a competent user of the language or a linguist and can be considered
                        accurate.</li>
                </ul>

                <p>The first two validity types are hidden by default. Use the "show
                    unverified data" filter in the settings menu (hidden under the cog wheel icon)
                    to view them.</p>

                <p>All of the above information is available in the Database view together
                    with language autonyms<a href="#ref-5">5</a> and first-language (L1) speaker counts (see
                    note 1). Each language has its own URL for future reference.</p>

                <p>The language-search feature works with languages' English names or
                    autonyms. Users can also search for a single character to see the list
                    of languages that use this character. Additionally, the settings menu
                    includes filters to include historical and constructed languages or
                    secondary and historical orthographies.</p>

                <p>The language data can be copied in different formats (plain text, YAML,
                    JSON) via the Copy data button.</p>

                <p>Because of its vast language support, the <a href="https://fonts.google.com/noto" target="_blank">Noto
                        family of
                        fonts</a> is used to display
                    characters in the Database view and as a fallback in the Font checker.
                    It is a great design for some languages, but lacking for others. This
                    choice of font does not serve as our design recommendation.</p>

                <h2>Determining language support in fonts</h2>

                <p>The Font checker runs in the browser and comes ready to try with
                    selected fonts from Rosetta's multilingual catalogue, but it also works
                    with OTF or TTF fonts provided by the users. Hyperglot analyses the
                    basic Unicode code points that the provided font supports right in the browser
                    and retrieves the supported languages from the database. The fonts are
                    not uploaded and no information about the fonts processed is collected.
                    In other words, it’s legal to use with any fonts.</p>

                <p>By default, the Font checker automatically compares the character sets
                    of primary orthographies of all languages from the database to provide a
                    list of supported languages. The default behaviour can be modified via
                    the cog-wheel settings. If a font contains glyphs for all characters in
                    the orthography, the language is considered supported. Users can also
                    view languages that are not fully supported by their font to see what
                    characters it is missing.</p>

                <p>Note that Hyperglot should only be used to detect whether a font can be
                    considered (!) for use with a particular language. It does not say
                    anything about the quality of the font’s design. The first
                    step towards assessing the design properly is to review the design
                    requirements reported for the relevant languages.</p>

                <p>In addition to the search and filter options that are similar to the
                    Database view, the user interface of the Font checker allows selecting
                    multiple languages in the left panel. Selecting multiple languages on
                    the left creates aggregated character sets (base, auxiliary, mark) and
                    design requirements in the right panel.</p>

                <p>Similarly to the Database view, the resulting list of languages,
                    character sets, and other data can be copied in various formats (plain
                    text, YAML, JSON) via the Copy data button.</p>

                <h2>Designed to grow</h2>

                <p>A free command-line tool and Python package are provided to help
                    integrate Hyperglot into font and software development workflows. Refer
                    to the project’s <a href="https://github.com/rosettatype/hyperglot/" target="_blank">GitHub
                        repository</a> for
                    more information.</p>

                <p>The Hyperglot database, tools, and the web app were originally developed
                    by Rosetta, world typography specialists, publishers, and makers of
                    original fonts addressing the needs of global typography with the goal
                    to enable people to read better in their native languages.</p>

                <p>The database and tools are provided AS IS without any guarantee.</p>

                <p>Rosetta has made Hyperglot open-source to allow it to grow and to enable
                    others to use it and share feedback easily. Mapping orthographies for
                    the world's languages is a big task. Many have already contributed their
                    expertise and feedback over the years helping to expand the data to
                    cover more and more languages. If you spot an issue or notice a language
                    that is altogether missing, please take the time to provide feedback
                    either via email or via <a href="https://github.com/rosettatype/hyperglot/issues">GitHub’s
                        issue-tracking system</a>.</p>

                <p>Hopefully, Hyperglot will help developers support more languages in their
                    software and fonts.</p>

                <h2>Contributors (as of 23 November 2023)</h2>

                <p>David Březina,<br>
                    Johannes Neumeier,<br>
                    Sérgio Martins,<br>
                    Toshi Omagari,<br>
                    Denis Moyogo Jacquerye,<br>
                    Hrant Papazian,<br>
                    Meir Sadan,<br>
                    Vincent W.J. van Gerven Oei,<br>
                    Michael Rießler,<br>
                    M. Mahali Syarifuddin,<br>
                    Fadhl Haqq,<br>
                    Fredrick Brennan,<br>
                    Stephan Kurz,<br>
                    Aadarsh Rajan,<br>
                    Rafael Dietzsch,<br>
                    Sunny Walker,<br>
                    Justin Penner,<br>
                    Bert Zhang,<br>
                    Ana Sanikidze,<br>
                    Gustavo Reis,<br>
                    Neil Suresh Patel,<br>
                    Daniel Yacob,<br>
                    Marianna Paszkowska,<br>
                    @berrymot,<br>
                    John Hudson,<br>
                    Claus Eggers Sørensen,<br>
                    and others.</p>

                <p>—</p>

                <p>David &amp; Johannes<br>
                    Rosetta HQ<br>
                    23 November 2023</p>

                <hr>
                <p class="small"><a id="ref-1">1</a> This is the total number of first-language (L1) speakers based on
                    various sources as reported on Wikipedia. This simple sum does not
                    account for multilingualism and literacy rates.</p>

                <p class="small"><a id="ref-2">2</a> In other words, users' fonts may support other languages not
                    included in the database.</p>

                <p class="small"><a id="ref-3">3</a> The standard presents a singular opinion. Some might find
                    languages missing or consider some languages to be dialects or vice
                    versa.</p>

                <p class="small"><a id="ref-4">4</a> The series <a
                        href="https://designregression.com/essay/elements-of-multi-script-typography-chapter-2">Elements of
                        multi-script
                        typography</a> in the Design Regression mini-journal discusses these topics in greater detail.</p>

                <p class="small"><a id="ref-5">5</a> An autonym is the name of a language in the language itself. We
                    provide it spelled out for each orthography.</p>

                <Imprint />
            </div>

            <aside id="sidebar" class="cols-4">
                <a class="button-filled" href="https://github.com/rosettatype/hyperglot/">Hyperglot on GitHub</a><br>
                <a class="button-outline" href="mailto:info@rosettatype.com?subject=Hyperglot%20website%20feedback">Send
                    feedback</a>

                <p class="italic">Tips:</p>
                <ul class="text-bullets">
                    <li>Study languages in full detail in the <em>Database view</em></li>
                    <li>Link directly to language entries in the <em>Database view</em></li>
                    <li>Analyse your fonts’ potential language support in the <em>Font checker</em></li>
                    <li>Aggregate character sets for multiple languages with the <em>Font checker</em></li>
                    <li>View languages that are not fully supported by your font to see what characters it is missing</li>
                    <li>View and assess reliability of the data</li>
                    <li>View additional design requirements</li>
                    <li>Copy language data in different formats</li>
                    <li>Search for language names, autonyms, or individual characters</li>
                </ul>
            </aside>
        </div>
    </main>
</template>

<script>
import Imprint from "../components/Imprint"

export default {
    name: "About",

    components: {
        Imprint
    },

    data: function () {
        return {
            history: window.history,
        };
    },
}
</script>

<style lang="scss" scoped>
header {
    margin-bottom: 3em;
}

#close-about {
    display: block;

    position: absolute;
    margin-left: calc(-100% / 6);

    @include tablet {
        position: relative;
        margin-left: initial;
        margin-bottom: 1em;
    }
}

h2 {
    margin: 3rem 0 1rem;
}

#sidebar {
    padding-top: 4em;
    max-width: 30rem;
    margin-right: auto;

    .button-filled,
    .button-outline {
        margin-top: 0;
    }

    p {
        margin-top: 5em;
        margin-bottom: 1em;
    }

    ol li:before {
        background: var(--text);
    }
}

p a,
ul a {
    color: var(--link);
    text-decoration: underline;
}

ul.text-bullets li:before {
    background-color: var(--link);
}

hr {
    max-width: none;
    margin: 6rem auto;
}

i {
    font-style: normal;
    background: #ffffff33;
    padding: 2px 3px 1px;
    border-radius: 2px;
}

em {
    font-weight: normal;
    @extend .italic;
}

a[href^="#ref"],
a[id^="ref"] {
    letter-spacing: 0;
    line-height: inherit !important;
    position: relative;
    display: inline-block;
    padding: 0.05em 0 0.05em;
    margin: 0 0.3em;
    text-decoration: none;
}

a[href^="#ref"] {
    top: -0.2em;
    font-feature-settings: "ss11" 1;

}

a[id^="ref"] {
    font-feature-settings: "ss12" 1;
    font-size: $font-size-l;
    top: 0.05em;
    margin: 0 0.1em;
}

.footer-content {
    margin-top: 6rem;
}
</style>