Logseq Doctor

I’m not entirely sure, because I don’t know the details of Logseq Markdown, but it looks like a similar idea to Bike’s Markdown: ensure everything is in a nested list. Given that, I think this could be useful if you want to convert non-list Markdown into list form for Bike.

With the help of ChatGPT, I’ve written two Python scripts: one converts BikeMD-style Markdown into regular Markdown, and the other does the reverse.

Bike md to md script:

  • Removes YAML frontmatter.
  • Reads the document line by line and skips blank lines.
  • Calculates hierarchy from indentation using tabs or 4 spaces.
  • Removes inline IDs ({#...}) if present, and unwraps [text]{strong} and [text]{emphasis} spans to plain text.
  • Converts - # Heading nodes into Markdown headings based on indentation level.
  • Limits heading depth to H6; deeper BikeMD levels are still exported as H6, not H7 or beyond.
  • Converts - --- into an HTML page break.
  • Converts - normal text nodes into Markdown paragraphs.
  • Converts + item nodes into Markdown unordered list items.
  • Preserves numbered list lines such as 1. item.
  • Normalizes extra blank lines and outputs clean Markdown.
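
To make the indentation rule concrete, here is a minimal sketch of just the heading step. It assumes one tab or four spaces per outline level, as the script does. The name `to_markdown_heading` is made up for this illustration and is not part of the script.

```python
import re

MAX_HEADING_LEVEL = 6  # Markdown has no H7, so deeper nodes are clamped


def to_markdown_heading(line):
    """Turn a BikeMD heading node into a Markdown heading, or return None."""
    body = line.lstrip(" \t")
    prefix = line[:len(line) - len(body)]
    # One level per tab or per four spaces (assumed indentation convention).
    depth = prefix.count("\t") + prefix.count(" ") // 4
    m = re.match(r"^- #+\s+(.+?)\s*$", body)
    if not m:
        return None  # not a "- # Heading" node
    level = min(depth + 1, MAX_HEADING_LEVEL)
    return "#" * level + " " + m.group(1)


print(to_markdown_heading("- # Title"))             # -> # Title
print(to_markdown_heading("\t\t- # Section"))       # -> ### Section
print(to_markdown_heading("\t" * 9 + "- # Deep"))   # -> ###### Deep
```

The heading level comes from the node's depth in the outline, not from the number of # characters inside the node.
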
Bike > md
#!/usr/bin/env python3

import re
import sys

def remove_frontmatter(text):
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[i + 1:])
    return text

def clean_inline(text):
    text = re.sub(r"\s*\{#[^}]+\}", "", text)
    text = re.sub(r"\[([^\]]+)\]\{strong\}", r"\1", text)
    text = re.sub(r"\[([^\]]+)\]\{emphasis\}", r"\1", text)
    text = re.sub(r"[ \t]+$", "", text)
    return text

def indent_level(prefix):
    tabs = prefix.count("\t")
    spaces = prefix.replace("\t", "")
    return tabs + (len(spaces) // 4)

def convert_bikemd_to_md(text):
    text = remove_frontmatter(text)
    out = []

    for raw in text.splitlines():
        if not raw.strip():
            continue

        line = raw.rstrip()
        prefix = line[:len(line) - len(line.lstrip(" \t"))]
        level = indent_level(prefix)
        stripped = line.strip()

        # Standalone Bike/Pandoc ID line
        if re.fullmatch(r"\{#[^}]+\}", stripped):
            continue

        # Bike node
        if stripped.startswith("- "):
            content = stripped[2:].strip()
            content = clean_inline(content)

            if not content:
                continue

            # Horizontal rule / page break
            if content == "---":
                if out and out[-1] != "":
                    out.append("")
                out.append('<div style="page-break-after: always; break-after: page;"></div>')
                out.append("")
                continue

            # Heading node
            if content.startswith("#"):
                m = re.match(r"^(#+)\s+(.*)$", content)
                if m:
                    title = m.group(2).strip()
                    heading_level = min(level + 1, 6)
                    if out and out[-1] != "":
                        out.append("")
                    out.append("#" * heading_level + " " + title)
                    out.append("")
                    continue

            # Non-heading Bike node becomes paragraph
            if out and out[-1] != "":
                out.append("")
            out.append(content)
            continue

        # Unordered list node from +
        if stripped.startswith("+ "):
            content = clean_inline(stripped[2:].strip())
            out.append("- " + content)
            continue

        # Numbered list lines stay numbered
        if re.match(r"^\d+\.\s+", stripped):
            out.append(clean_inline(stripped))
            continue

        # Other lines
        content = clean_inline(stripped)
        if content:
            out.append(content)

    md = "\n".join(out)
    md = re.sub(r"\n{3,}", "\n\n", md)
    return md.strip() + "\n"

if __name__ == "__main__":
    if len(sys.argv) == 3:
        input_path = sys.argv[1]
        output_path = sys.argv[2]

        with open(input_path, "r", encoding="utf-8") as f:
            input_text = f.read()

        output_text = convert_bikemd_to_md(input_text)

        with open(output_path, "w", encoding="utf-8") as f:
            f.write(output_text)

    else:
        input_text = sys.stdin.read()
        print(convert_bikemd_to_md(input_text), end="")
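
One thing to be aware of: `clean_inline` in the script above drops [text]{strong} and [text]{emphasis} markup entirely, so bold and italics are lost in the output. If you would rather keep them, a small variant could map those spans to Markdown emphasis instead. This is only a sketch of a possible modification, not what the script does. It assumes the same span syntax the script's regexes target, and `clean_inline_keep_emphasis` is a hypothetical name.

```python
import re


def clean_inline_keep_emphasis(text):
    """Like clean_inline, but keep bold/italic as Markdown emphasis."""
    # Drop inline IDs such as {#ab12}, including the whitespace before them.
    text = re.sub(r"\s*\{#[^}]+\}", "", text)
    # Map attribute-style spans to Markdown emphasis instead of plain text.
    text = re.sub(r"\[([^\]]+)\]\{strong\}", r"**\1**", text)
    text = re.sub(r"\[([^\]]+)\]\{emphasis\}", r"*\1*", text)
    # Trim trailing spaces and tabs, as the original helper does.
    return text.rstrip(" \t")


print(clean_inline_keep_emphasis("A [big]{strong} deal {#ab12}"))
# -> A **big** deal
```
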

md to Bike md script:

  • Removes YAML frontmatter.
  • Reads the Markdown document line by line.
  • Converts ATX headings (# to ######) into BikeMD heading nodes using indentation-based hierarchy.
  • Converts normal paragraphs into BikeMD paragraph nodes prefixed with -.
  • Converts Markdown unordered lists (-, +, *) into BikeMD unordered list nodes prefixed with +.
  • Preserves Markdown ordered list numbering such as 1. item.
  • Detects paragraph-following lists, with or without blank lines, and indents those lists under the preceding paragraph.
  • Converts indented Markdown lines (indented code or continuation text) into inline code span nodes placed one level deeper.
  • Preserves fenced code blocks as literal BikeMD child lines.
  • Converts Markdown tables into BikeMD outline blocks marked with `Tablo` (Turkish for “table”).
  • In table conversion, bolds header names and separates header/value pairs with | for easier reverse conversion.
  • Converts Markdown horizontal rules into BikeMD - --- page-break markers.
  • Preserves blockquotes as normal BikeMD paragraph nodes.
  • Outputs clean BikeMD text with indentation defining the hierarchy.
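
The table rule is probably the least obvious item in this list, so here is a stripped-down sketch of the idea: each data row becomes a node for its first column, and the remaining columns hang underneath it as header | value pairs. It leaves out the duplicate-header and row-length handling that the full script has, and `split_row` and `table_to_outline` are illustrative names rather than the script's own functions.

```python
def split_row(line):
    """Split one Markdown table row into stripped cell strings."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def table_to_outline(table_lines, label="Tablo"):
    """Convert Markdown table lines into BikeMD-style outline lines."""
    rows = [split_row(line) for line in table_lines]
    headers, body = rows[0], rows[1:]
    # Skip the |---|---| alignment row if it is present.
    if body and all("-" in c and set(c) <= set(":-") for c in body[0]):
        body = body[1:]
    out = [f"- `{label}`"]
    for row in body:
        # The first column becomes the row node ...
        out.append(f"\t- **{headers[0]}** | {row[0]}")
        # ... and the remaining columns become its children.
        for header, value in zip(headers[1:], row[1:]):
            out.append(f"\t\t- **{header}** | {value}")
    return out


table = [
    "| Name | Role |",
    "| --- | --- |",
    "| Ada | Engineer |",
    "| Lin | Editor |",
]
print("\n".join(table_to_outline(table)))
```
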
md > Bike md
#!/usr/bin/env python3

import re
import sys

TAB = "\t"
MAX_HEADING_LEVEL = 6


def remove_frontmatter(text):
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                return "\n".join(lines[i + 1:])
    return text


def clean_inline(text):
    return re.sub(r"[ \t]+$", "", text)


def clean_cell(text):
    text = text.strip()
    text = re.sub(r"<br\s*/?>", " / ", text, flags=re.IGNORECASE)
    text = re.sub(r"\s+", " ", text)
    return text


def bold(text):
    return f"**{text}**"


def is_alignment_row(cells):
    if not cells:
        return False

    for cell in cells:
        c = cell.strip()
        if not re.fullmatch(r":?-{3,}:?", c):
            return False

    return True


def split_table_row(line):
    line = line.strip()

    if line.startswith("|"):
        line = line[1:]

    if line.endswith("|"):
        line = line[:-1]

    return [clean_cell(cell) for cell in line.split("|")]


def make_unique_headers(headers):
    result = []
    counts = {}

    for i, header in enumerate(headers, start=1):
        h = header.strip()

        if not h:
            h = f"Kolon {i}"

        if h in counts:
            counts[h] += 1
            h = f"{h} {counts[h]}"
        else:
            counts[h] = 1

        result.append(h)

    return result


def normalize_row_length(row, header_count):
    row = list(row)

    if len(row) < header_count:
        row.extend([""] * (header_count - len(row)))

    if len(row) > header_count:
        extra = row[header_count:]
        row = row[:header_count]
        row[-1] = " | ".join([row[-1]] + extra)

    return row


def is_table_line(line):
    return line.strip().startswith("|")


def collect_table_lines(lines, start_index):
    table_lines = []
    i = start_index

    while i < len(lines):
        line = lines[i]

        if not line.strip():
            break

        if not is_table_line(line):
            break

        table_lines.append(line.rstrip())
        i += 1

    return table_lines, i


def table_lines_to_bikemd(table_lines, base_indent):
    table_lines = [
        line.rstrip()
        for line in table_lines
        if line.strip()
    ]

    if len(table_lines) < 2:
        return []

    headers = split_table_row(table_lines[0])
    headers = make_unique_headers(headers)

    data_lines = table_lines[1:]

    if data_lines and is_alignment_row(split_table_row(data_lines[0])):
        data_lines = data_lines[1:]

    out = []
    out.append(f"{base_indent}- `Tablo`")

    for line in data_lines:
        row = split_table_row(line)
        row = normalize_row_length(row, len(headers))

        first_header = headers[0]
        first_value = row[0] if row else ""

        out.append(
            f"{base_indent}{TAB}- {bold(first_header)} | {first_value}"
        )

        for header, value in zip(headers[1:], row[1:]):
            out.append(
                f"{base_indent}{TAB * 2}- {bold(header)} | {value}"
            )

    return out


def heading_to_bikemd(line):
    m = re.match(r"^(#{1,6})\s+(.+?)\s*#*\s*$", line)
    if not m:
        return None

    level = min(len(m.group(1)), MAX_HEADING_LEVEL)
    title = clean_inline(m.group(2).strip())
    indent = TAB * (level - 1)

    return f"{indent}- # {title}", level


def current_content_indent(current_heading_level):
    if current_heading_level <= 0:
        return ""
    return TAB * current_heading_level


def md_indent_level(prefix):
    tabs = prefix.count("\t")
    spaces = len(prefix.replace("\t", ""))
    return tabs + (spaces // 4)


def is_indented_code_candidate(line):
    m = re.match(r"^([ \t]+)(.+)$", line)
    if not m:
        return False

    md_prefix, _ = m.groups()

    is_indented_code = (
        "\t" in md_prefix
        or len(md_prefix.replace("\t", "")) >= 4
    )

    is_list_item = (
        re.match(r"^[ \t]*[-+*]\s+", line)
        or re.match(r"^[ \t]*\d+\.\s+", line)
    )

    return is_indented_code and not is_list_item


def is_unordered_list_line(line):
    return re.match(r"^([ \t]*)([-+*])\s+(.+)$", line)


def is_ordered_list_line(line):
    return re.match(r"^([ \t]*)(\d+\.\s+.+)$", line)


def is_paragraph_like_before_list(line):
    stripped = line.strip()

    if not stripped:
        return False

    if heading_to_bikemd(stripped):
        return False

    if is_table_line(line):
        return False

    if is_indented_code_candidate(line):
        return False

    if is_unordered_list_line(line) or is_ordered_list_line(line):
        return False

    if re.fullmatch(r"[-*_]{3,}", stripped):
        return False

    if stripped.startswith(">"):
        return False

    if re.match(r"^(```+|~~~+)", stripped):
        return False

    return True


def next_nonblank_line_is_list(lines, start_index):
    i = start_index + 1

    while i < len(lines):
        line = lines[i]

        if not line.strip():
            i += 1
            continue

        return bool(
            is_unordered_list_line(line)
            or is_ordered_list_line(line)
        )

    return False


def convert_md_to_bikemd(text):
    text = remove_frontmatter(text)
    lines = text.splitlines()

    out = []
    in_fenced_code = False
    fence_marker = None
    current_heading_level = 0

    pending_paragraph_parent = False
    list_after_paragraph = False

    i = 0

    while i < len(lines):
        raw = lines[i]
        line = raw.rstrip("\n")

        if not line.strip():
            i += 1
            continue

        stripped = line.strip()

        fence = re.match(r"^(```+|~~~+)", stripped)
        if fence:
            marker = fence.group(1)[0]
            indent = current_content_indent(current_heading_level)
            out.append(f"{indent}- {clean_inline(stripped)}")

            if not in_fenced_code:
                in_fenced_code = True
                fence_marker = marker
            elif marker == fence_marker:
                in_fenced_code = False
                fence_marker = None

            pending_paragraph_parent = False
            list_after_paragraph = False
            i += 1
            continue

        if in_fenced_code:
            indent = current_content_indent(current_heading_level)
            out.append(f"{indent}- {clean_inline(line)}")
            i += 1
            continue

        converted_heading = heading_to_bikemd(stripped)
        if converted_heading:
            bike_line, current_heading_level = converted_heading
            out.append(bike_line)

            pending_paragraph_parent = False
            list_after_paragraph = False
            i += 1
            continue

        indent = current_content_indent(current_heading_level)

        if is_table_line(line):
            table_lines, next_i = collect_table_lines(lines, i)
            converted_table = table_lines_to_bikemd(table_lines, indent)

            if converted_table:
                out.extend(converted_table)
                pending_paragraph_parent = False
                list_after_paragraph = False
                i = next_i
                continue

        if is_indented_code_candidate(line):
            content = re.sub(r"^[ \t]+", "", line)
            content = clean_inline(content.strip()).replace("`", "\\`")
            out.append(f"{indent}{TAB}- `{content}`")

            pending_paragraph_parent = False
            list_after_paragraph = False
            i += 1
            continue

        m_ul = is_unordered_list_line(line)
        if m_ul:
            md_prefix, _, content = m_ul.groups()
            extra = md_indent_level(md_prefix)

            if pending_paragraph_parent:
                list_after_paragraph = True

            paragraph_extra = 1 if list_after_paragraph else 0

            out.append(
                f"{indent}{TAB * (paragraph_extra + extra)}+ "
                f"{clean_inline(content.strip())}"
            )

            pending_paragraph_parent = False
            i += 1
            continue

        m_ol = is_ordered_list_line(line)
        if m_ol:
            md_prefix, content = m_ol.groups()
            extra = md_indent_level(md_prefix)

            if pending_paragraph_parent:
                list_after_paragraph = True

            paragraph_extra = 1 if list_after_paragraph else 0

            out.append(
                f"{indent}{TAB * (paragraph_extra + extra)}"
                f"{clean_inline(content.strip())}"
            )

            pending_paragraph_parent = False
            i += 1
            continue

        if re.fullmatch(r"[-*_]{3,}", stripped):
            out.append(f"{indent}- ---")

            pending_paragraph_parent = False
            list_after_paragraph = False
            i += 1
            continue

        if stripped.startswith(">"):
            out.append(f"{indent}- {clean_inline(stripped)}")

            pending_paragraph_parent = False
            list_after_paragraph = False
            i += 1
            continue

        out.append(f"{indent}- {clean_inline(stripped)}")

        if (
            is_paragraph_like_before_list(line)
            and next_nonblank_line_is_list(lines, i)
        ):
            pending_paragraph_parent = True
        else:
            pending_paragraph_parent = False
            list_after_paragraph = False

        i += 1

    return "\n".join(out).strip() + "\n"


if __name__ == "__main__":
    if len(sys.argv) == 3:
        input_path = sys.argv[1]
        output_path = sys.argv[2]

        with open(input_path, "r", encoding="utf-8") as f:
            input_text = f.read()

        output_text = convert_md_to_bikemd(input_text)

        with open(output_path, "w", encoding="utf-8") as f:
            f.write(output_text)
    else:
        input_text = sys.stdin.read()
        print(convert_md_to_bikemd(input_text), end="")
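
The paragraph-following-list rule is the subtle part of the script above, so here is a reduced sketch of it on its own. It only knows about plain paragraphs and unordered lists (no headings, ordered lists, tables, or code), and `nest_lists_under_paragraphs` is a hypothetical name, not a function from the script.

```python
import re


def nest_lists_under_paragraphs(lines):
    """Indent list items one level under the paragraph that precedes them."""
    out = []
    under_paragraph = False
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not break the paragraph/list link
        m = re.match(r"^[-+*]\s+(.+)$", stripped)
        if m:
            # A list with no paragraph before it stays at the top level.
            indent = "\t" if under_paragraph else ""
            out.append(f"{indent}+ {m.group(1)}")
        else:
            out.append(f"- {stripped}")
            under_paragraph = True
    return out


print("\n".join(nest_lists_under_paragraphs(["Shopping:", "", "- milk", "- eggs"])))
```

With this input the two items end up as + nodes indented one tab under the - Shopping: node, whether or not a blank line separates them from the paragraph.
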

Usage

I use both scripts as clipboard filters from the terminal.

Markdown to Bike MD

pbpaste | python3 /path/to/md_to_bikemd.py | pbcopy
  1. Copy normal Markdown text to the clipboard.
  2. Run the command.
  3. The script reads the clipboard content with pbpaste.
  4. It converts Markdown into BikeMD outline format.
  5. The converted output is written back to the clipboard with pbcopy.
  6. Paste the result into Bike.

Bike MD to Markdown

pbpaste | python3 /path/to/bike_to_md.py | pbcopy
  1. Copy BikeMD outline text to the clipboard.
  2. Run the command.
  3. The script reads the clipboard content with pbpaste.
  4. It converts BikeMD outline format back into normal Markdown.
  5. The converted Markdown output is written back to the clipboard with pbcopy.
  6. Paste the result into any Markdown editor.
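
Both pipelines have the same shape (read the clipboard, transform, write it back), so the round trip could also be wrapped in Python. The following is a rough sketch, assuming macOS, where pbpaste and pbcopy are available. `filter_clipboard` is a hypothetical helper and is not part of either script.

```python
import subprocess


def filter_clipboard(convert, paste_cmd=("pbpaste",), copy_cmd=("pbcopy",)):
    """Read the clipboard, apply convert(text), and write the result back."""
    pasted = subprocess.run(
        list(paste_cmd), capture_output=True, encoding="utf-8", check=True
    )
    result = convert(pasted.stdout)
    subprocess.run(
        list(copy_cmd), input=result, encoding="utf-8",
        stdout=subprocess.DEVNULL, check=True,
    )
    return result


# Example call (assumes the converter function has been imported):
# filter_clipboard(convert_md_to_bikemd)
```

The paste and copy commands are parameters so that the helper can be tried on other systems with whatever clipboard tools are installed there.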