Mid word spaces appearing in HTML display

complexpoint · August 23, 2022, 9:06pm

In Bike 1.4 Preview 69, I notice that if a formatting run starts or ends within a word, as in the partial italicisation of Epsilon below:

then there is a slight problem in the browser display of Bike HTML – whitespace gaps appear before and after the formatting run:

Screenshot 2022-08-23 at 21.57.27

Perhaps the newlines in the pretty-printed .bike HTML are interpreted by the browser as syntactically significant soft breaks?

          <li id="GD">
            <p>
              <span>Ep</span>
              <em>sil</em>
              <span>on</span>
            </p>
          </li>

The browser display drops the intruded spaces if we reformat to:

          <li id="GD">
            <p>
              <span>Ep</span><em>sil</em><span>on</span>
            </p>
          </li>

Screenshot 2022-08-23 at 22.05.00

(FWIW the Pandoc HTML parser also interprets those mid-word pretty-print line endings as semantic SoftBreak tokens)
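
For anyone who wants to reproduce the effect outside a browser, here is a rough sketch in Python. It only approximates what a browser does with inline content (every run of whitespace, including the pretty-printer's newlines and indentation, becomes a single space); `TextCollector` and `rendered_text` are names made up for this illustration and have nothing to do with Bike itself.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects all character data, including whitespace between tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def rendered_text(html: str) -> str:
    """Approximate inline whitespace collapsing: each whitespace run becomes one space."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return re.sub(r"\s+", " ", "".join(parser.chunks)).strip()


# Pretty-printed form: newlines and indentation sit between the inline elements.
pretty = """<p>
  <span>Ep</span>
  <em>sil</em>
  <span>on</span>
</p>"""

# Same content with the inline runs kept on one line.
compact = "<p><span>Ep</span><em>sil</em><span>on</span></p>"

print(rendered_text(pretty))   # Ep sil on
print(rendered_text(compact))  # Epsilon
```

The whitespace between `</span>` and `<em>` is ordinary text as far as an HTML parser is concerned, which would explain why both the browser and Pandoc see a break there.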

complexpoint · August 23, 2022, 9:14pm

(The Pandoc abstract syntax tree can be inspected by writing out to its native format)

pandoc -f html -t native sample.bike -o sample.ast

Pandoc AST:

[ BulletList
    [ [ Div
          ( "40" , [] , [] )
          [ Para [ Str "Alpha" ]
          , BulletList
              [ [ Div ( "Hh" , [] , [] ) [ Para [ Str "Beta" ] ] ]
              , [ Div ( "WM" , [] , [] ) [ Para [ Str "Gamma" ] ] ]
              , [ Div ( "f2" , [] , [] ) [ Para [ Str "Delta" ] ] ]
              , [ Div
                    ( "GD" , [] , [] )
                    [ Para
                        [ Span ( "" , [] , [] ) [ Str "Ep" ]
                        , SoftBreak
                        , Emph [ Str "sil" ]
                        , SoftBreak
                        , Span ( "" , [] , [] ) [ Str "on" ]
                        ]
                    ]
                ]
              ]
          ]
      ]
    ]
]

jessegrosjean · August 23, 2022, 10:01pm

This is just a bug that I don’t have a good solution for yet.

When reading Bike files I don’t follow all the HTML conventions… in particular I only read text from leaf nodes, and I treat that read text as pre-formatted… so I don’t do things like collapse multiple spaces into a single space.
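
To illustrate why this kind of reader is not affected by the pretty-print newlines, here is a small sketch in Python. It is only a guess at the behaviour described above, using `xml.etree.ElementTree` as a stand-in parser; `leaf_texts` is a hypothetical helper, not Bike's actual code.

```python
import xml.etree.ElementTree as ET


def leaf_texts(xml_source: str) -> list[str]:
    """Return the text of every leaf element (an element with no children), unmodified."""
    root = ET.fromstring(xml_source)
    # Whitespace between sibling elements lives in the parent's text and in
    # each child's tail, so reading only leaf text never sees it.
    return [el.text or "" for el in root.iter() if len(el) == 0]


pretty = """<li id="GD">
  <p>
    <span>Ep</span>
    <em>sil</em>
    <span>on</span>
  </p>
</li>"""

print(leaf_texts(pretty))           # ['Ep', 'sil', 'on']
print("".join(leaf_texts(pretty)))  # Epsilon
```

A reader like this reassembles "Epsilon" without gaps, whereas a browser also renders the whitespace-only text between the inline elements.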

I think “maybe” the solution is to stop using the framework provided pretty print and instead construct the XML document with my own pretty print implementation… which puts everything within the <p> element on the same line.
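
A minimal sketch of what such a custom pretty print could look like, written in Python with `xml.etree.ElementTree` purely for illustration (Bike does not use this code). The names `INLINE_PARENTS`, `attributes`, `inline` and `pretty` are hypothetical, and the sketch assumes the document is built in memory, so block-level elements carry no text of their own.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Assumption: only <p> holds inline formatting runs.
INLINE_PARENTS = {"p"}


def attributes(el) -> str:
    """Serialize an element's attributes, each with a leading space."""
    return "".join(f" {name}={quoteattr(value)}" for name, value in el.attrib.items())


def inline(el) -> str:
    """Serialize an element and all its descendants without adding any whitespace."""
    body = escape(el.text or "")
    for child in el:
        body += inline(child) + escape(child.tail or "")
    return f"<{el.tag}{attributes(el)}>{body}</{el.tag}>"


def pretty(el, depth: int = 0, indent: str = "  ") -> str:
    """Indent block-level structure, but keep each <p> and each leaf on a single line."""
    pad = indent * depth
    if el.tag in INLINE_PARENTS or len(el) == 0:
        return pad + inline(el)
    # Block element with children: its own text and tail are assumed to be
    # whitespace only and are dropped.
    lines = [f"{pad}<{el.tag}{attributes(el)}>"]
    lines += [pretty(child, depth + 1, indent) for child in el]
    lines.append(f"{pad}</{el.tag}>")
    return "\n".join(lines)


doc = ET.fromstring(
    '<ul><li id="GD"><p><span>Ep</span><em>sil</em><span>on</span></p></li></ul>'
)
print(pretty(doc))
```

With this approach the outline structure stays readable in a text editor, while no whitespace is introduced between the `<span>` and `<em>` runs inside a paragraph.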