There's "fancypants" for CHICKEN (which is like "SmartyPants" elsewhere). It should handle the brunt of the conversion work for you.
I agree with you on the ligatures. I tend to lean more toward Python, where smartypants has a number of options to tweak all manner of things.
I don't think that Python's smartypants even supports ligatures, though.
I tend to think of ligatures as something that makes sense for some words but not all words with the same pattern -- so something that can't be automatically applied without an explicit dictionary.
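To make the "explicit dictionary" point concrete, here is a minimal Python sketch. The word list and the `ligate` helper are made up for illustration and come from no real library. Only words listed in the dictionary get a ligature form, so a word like "shelfful", where the f's straddle a word-part boundary, is left alone even though it contains the same letter pattern.

```python
import re

# Hypothetical hand-maintained dictionary: word -> ligated spelling.
# U+FB04 is the "ffl" ligature, U+FB03 is the "ffi" ligature.
LIGATED_WORDS = {
    "waffle": "wa\ufb04e",
    "office": "o\ufb03ce",
}

def ligate(text):
    """Replace only whole words that appear in the dictionary."""
    return re.sub(
        r"[A-Za-z]+",
        lambda m: LIGATED_WORDS.get(m.group(0), m.group(0)),
        text,
    )
```

A blind pattern replace of "ffl" would also hit "shelfful"; the dictionary lookup does not.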
I feel you on the juggling priorities. So much to do!
I would appreciate it if my browser displayed waffle as waﬄe but still let me interact with the word as if it were waffle. But I think it does do that already, doesn’t it? This is why we have font rendering.
Hard-coding it just screws over terminal- and braille-users, and scrapers and searchers and kitbashers.
These unicode chars have some use for logo-drawing purps in Inkscape and that’s basically it. Or internal representation for something like the rendering layer of a new TeX (the old one already has its own internal glyph set system that predates Unicode).
I totally agree with you. Yeah, I immediately thought of the ae ligature, but that's because all of the various f-ligatures (etc) aren't things any normal person ever needs to think about. I certainly never want any system to automatically convert to that. I mean, WTF.
In the places where it doesn't happen automatically -- like Terminal applications -- the experience of those Unicode ligature codepoints is far inferior to having the actual characters.
Oh, I just found out what sxml-apply-ruleset is; it’s just a wrapper for pre-post-order. OK, I can probably whip together something pretty quickly using this. Thanks sjamaan
@yam655 @kensanata
I started out by looking at sjamaan’s fancypants egg but I decided I’d have to change it so much that it might be easier just starting over.
First of all, I defo don’t want ligatures. Ligature characters are not something that normally should live in the file. Ligatures are great when you are hand-typesetting, like making a logo in Inkscape or something. Not the right call for serving html or gemtext. It’s gonna make selecting text and searching for text awkward.
Second, you can translate strings to unicode but I kinda sometimes want to translate strings to strings. For example, in French a “ should become a « followed by a space.
Third, I don’t want apostrophes between two vowels to be translated.
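A rough Python sketch of the second and third points, just to pin down what the rules would have to do. Everything here is an assumption for illustration: the rule table, the helper names, the use of a plain space next to the guillemets (a no-break space could be swapped in), and the ASCII-only vowel list. It is not how fancypants or smartypants works.

```python
import re

VOWELS = "aeiouAEIOU"  # assumption: ASCII vowels only

# String-to-string rules: a replacement may be more than one character.
FRENCH_RULES = [
    (re.compile(r'(?<!\S)"'), "« "),  # quote at start or after whitespace opens
    (re.compile(r'"'), " »"),         # any quote still left closes
]

def apply_rules(text, rules):
    """Apply each (pattern, replacement) pair in order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text

APOSTROPHE = re.compile(r"(?<=\w)'(?=\w)")

def smarten_apostrophes(text):
    """Curl apostrophes inside words, except between two vowels."""
    def repl(m):
        before = m.string[m.start() - 1]
        after = m.string[m.end()]
        if before in VOWELS and after in VOWELS:
            return "'"  # left untranslated on purpose
        return "\u2019"
    return APOSTROPHE.sub(repl, text)
```

With these rules, `apply_rules('"bonjour", dit-il', FRENCH_RULES)` gives `« bonjour », dit-il`, and `smarten_apostrophes` curls the apostrophe in "don't" but leaves "Hawai'i" alone.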
His general idea to write a ruleset for sxml-apply-rules is good. I can’t find that procedure anywhere, though; is it gone from sxml-transforms? I’m already using a 63-line-long pre-post-order-splice*, might as well do some quote smartening in there.

I think I remember now that the reason I backburnered this is that I started working on a string library (a wrapper / macro set for irregex basically), and then I backburnered that to work on cgit stuff, and then I backburnered that in turn to work on a thing for a friend.
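For the "do the quote smartening inside the tree walk" idea, here is the general shape as a small Python sketch. Nested lists stand in for SXML, and `map_text` is a made-up helper, not the pre-post-order API. The one design choice worth showing is that attribute lists (`@`) and code-like elements are skipped, so only prose text nodes get rewritten.

```python
def map_text(node, fn, skip=("@", "code", "pre")):
    """Apply fn to every text node, leaving skipped subtrees untouched."""
    if isinstance(node, str):
        return fn(node)
    if not node:
        return node
    tag, *children = node
    if tag in skip:
        return node  # attributes and code keep their straight quotes
    return [tag] + [map_text(child, fn, skip) for child in children]

def curl(text):
    """Stand-in text rule: curl every straight apostrophe."""
    return text.replace("'", "\u2019")

doc = ["p", ["@", ["class", "it's"]], "it's ", ["code", "'a'"], ["em", "don't"]]
smart = map_text(doc, curl)
```

Any string-to-string function can be passed as `fn`, so the same walk would work for per-language quote rules.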